Dataface Tasks

Add histogram bins and date distributions to profiler

ID: ISSUE-281
Status: completed
Priority: p1
Milestone: m1-ft-analytics-analyst-pilot
Owner: sr-engineer-architect

Problem

The profiler's column profiles show basic statistics (min, max, mean, null count) but say nothing about how values are distributed. Analysts frequently need to know whether a numeric column is uniformly distributed, skewed, or bimodal, and whether a date column has gaps, seasonal patterns, or clustering. Today the only way to find out is to write ad-hoc exploratory SQL. Without histograms and temporal distribution buckets in the profile output, the inspector report gives an incomplete picture of data shape and forces analysts out of the profiling workflow and back into manual querying for every column they want to understand.

Context

  • Numeric columns include robust histogram bin summaries with sensible defaults.
  • Date/time columns expose distribution buckets at useful granularities.
  • Inspector surfaces these distributions with readable visual encoding and labels.
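As an illustration of what a numeric histogram summary with a default bin count could look like, here is a minimal Python sketch. It is not the profiler's actual implementation: the function name `histogram_bins`, the output shape, and the use of Sturges' rule as the default are all assumptions made for this example. Nulls are counted separately and excluded from the bins.

```python
import math
from typing import Optional, Sequence


def histogram_bins(values: Sequence[Optional[float]],
                   bin_count: Optional[int] = None) -> dict:
    """Equal-width histogram over the non-null values (hypothetical helper)."""
    present = [v for v in values if v is not None]
    null_count = len(values) - len(present)
    if not present:
        return {"null_count": null_count, "bins": []}

    lo, hi = min(present), max(present)
    if lo == hi:
        # Constant column: a single degenerate bin holds every value.
        return {"null_count": null_count,
                "bins": [{"lower": lo, "upper": hi, "count": len(present)}]}

    if bin_count is None:
        # Assumed default: Sturges' rule, ceil(log2(n)) + 1.
        bin_count = math.ceil(math.log2(len(present))) + 1

    width = (hi - lo) / bin_count
    counts = [0] * bin_count
    for v in present:
        # Clamp so the maximum value falls into the last bin.
        idx = min(int((v - lo) / width), bin_count - 1)
        counts[idx] += 1

    bins = [{"lower": lo + i * width,
             "upper": lo + (i + 1) * width,
             "count": c}
            for i, c in enumerate(counts)]
    return {"null_count": null_count, "bins": bins}
```

For example, `histogram_bins([1, 2, 2, 3, None, 10], bin_count=3)` reports one null and bin counts of 4, 0, and 1. The empty middle bin is what makes the outlier at 10 visible. This sketch does no outlier clipping, so a single extreme value can still stretch the bin range.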

Possible Solutions

Plan

  • Define profiling query strategy for numeric histograms and temporal buckets.
  • Implement summarization logic with null/outlier handling.
  • Update inspector rendering templates/components for new profile sections.
  • Add tests for correctness across data types and sparse/large datasets.
  • Document how distributions should be interpreted by internal analysts.
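The temporal-bucket and null-handling steps in the plan can be sketched as follows. This is a hedged Python illustration rather than the shipped logic: the names `pick_granularity`, `iter_bucket_keys`, and `date_distribution`, and the span thresholds used to choose a granularity, are assumptions for the example. It handles `datetime.date` values only and fills empty buckets with zero so that gaps show up in the output.

```python
from collections import Counter
from datetime import date, timedelta
from typing import Iterator, Optional, Sequence


def pick_granularity(start: date, end: date) -> str:
    """Choose a bucket size from the observed span (thresholds are assumed)."""
    span_days = (end - start).days
    if span_days <= 92:
        return "day"
    if span_days <= 366 * 5:
        return "month"
    return "year"


def bucket_key(d: date, granularity: str) -> str:
    if granularity == "day":
        return d.isoformat()
    if granularity == "month":
        return f"{d.year:04d}-{d.month:02d}"
    return f"{d.year:04d}"


def iter_bucket_keys(start: date, end: date, granularity: str) -> Iterator[str]:
    """Yield every bucket key from start to end, including empty ones."""
    if granularity == "day":
        for offset in range((end - start).days + 1):
            yield (start + timedelta(days=offset)).isoformat()
    elif granularity == "month":
        first = start.year * 12 + start.month - 1
        last = end.year * 12 + end.month - 1
        for m in range(first, last + 1):
            yield f"{m // 12:04d}-{m % 12 + 1:02d}"
    else:
        for year in range(start.year, end.year + 1):
            yield f"{year:04d}"


def date_distribution(values: Sequence[Optional[date]],
                      granularity: Optional[str] = None) -> dict:
    """Count non-null dates per bucket; nulls are reported separately."""
    present = [v for v in values if v is not None]
    null_count = len(values) - len(present)
    if not present:
        return {"granularity": granularity, "null_count": null_count, "buckets": {}}

    start, end = min(present), max(present)
    granularity = granularity or pick_granularity(start, end)
    counts = Counter(bucket_key(d, granularity) for d in present)
    buckets = {key: counts.get(key, 0)
               for key in iter_bucket_keys(start, end, granularity)}
    return {"granularity": granularity, "null_count": null_count, "buckets": buckets}
```

Calling `date_distribution([date(2024, 1, 5), date(2024, 1, 20), None, date(2024, 4, 2)], granularity="month")` yields four monthly buckets with counts 2, 0, 0, and 1 plus a null count of 1. Zero-filling trades a larger payload for making gaps explicit, which matters most at day granularity over long spans.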

Implementation Progress

Review Feedback

  • [ ] Review cleared