Add histogram bins and date distributions to profiler
Problem
The profiler's column profiles show basic statistics (min, max, mean, null count) but say nothing about how values are distributed. Analysts frequently need to know whether a numeric column is uniformly distributed, skewed, or bimodal, and whether a date column has gaps, seasonal patterns, or clustering. Today the only way to find out is to write ad-hoc exploratory SQL. Without histograms and temporal distribution buckets in the profile output, the inspector report gives an incomplete picture of data shape and pushes analysts out of the profiling workflow and back into manual querying for every column they want to understand.
Context
- Numeric columns include robust histogram bin summaries with sensible defaults.
- Date/time columns expose distribution buckets at useful granularities.
- The inspector surfaces these distributions with readable visual encoding and labels.
Possible Solutions
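One possible shape for the numeric side is sketched below, as an illustration rather than a settled design. It assumes equal-width bins, counts nulls (and NaN) separately, and reports values outside an optional quantile range as underflow/overflow instead of letting them stretch the bins. The function name `numeric_histogram`, its parameters, and the returned dictionary layout are all hypothetical and not part of the existing profiler.

```python
import math


def numeric_histogram(values, bins=10, lower_q=0.0, upper_q=1.0):
    """Hypothetical sketch: equal-width histogram with null and outlier counts.

    lower_q / upper_q are quantiles (0..1) bounding the binned range;
    values outside that range are counted as underflow / overflow.
    """
    # Drop nulls and NaN, then sort so quantiles can be read by index.
    clean = sorted(
        v for v in values
        if v is not None and not (isinstance(v, float) and math.isnan(v))
    )
    result = {
        "null_count": len(values) - len(clean),
        "underflow": 0,
        "overflow": 0,
        "bins": [],
    }
    if not clean:
        return result

    def quantile(p):
        # Nearest-rank style lookup on the sorted values.
        return clean[int(round(p * (len(clean) - 1)))]

    lo, hi = quantile(lower_q), quantile(upper_q)

    if lo == hi:
        # Degenerate range (for example a constant column): emit a single bin.
        result["underflow"] = sum(1 for v in clean if v < lo)
        result["overflow"] = sum(1 for v in clean if v > hi)
        result["bins"] = [
            {"lower": lo, "upper": hi, "count": sum(1 for v in clean if v == lo)}
        ]
        return result

    width = (hi - lo) / bins
    counts = [0] * bins
    for v in clean:
        if v < lo:
            result["underflow"] += 1
        elif v > hi:
            result["overflow"] += 1
        else:
            # The maximum value falls into the last bin rather than a new one.
            counts[min(int((v - lo) / width), bins - 1)] += 1

    result["bins"] = [
        {"lower": lo + i * width, "upper": lo + (i + 1) * width, "count": c}
        for i, c in enumerate(counts)
    ]
    return result
```

Calling `numeric_histogram(column_values, bins=10, upper_q=0.99)` would keep a handful of extreme values from flattening the rest of the distribution while still reporting how many were excluded. The same bin edges could also be computed in SQL by the profiler; this sketch only fixes the intended semantics.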
Plan
- Define profiling query strategy for numeric histograms and temporal buckets.
- Implement summarization logic with null/outlier handling.
- Update inspector rendering templates/components for new profile sections.
- Add tests for correctness across data types and sparse/large datasets.
- Document how distributions should be interpreted by internal analysts.
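For the temporal buckets in the plan above, a minimal sketch could look like the following. It assumes bucketing by calendar day, month, or year and fills empty buckets with a zero count so that gaps remain visible in the inspector. The function name `date_buckets`, the granularity names, and the output layout are hypothetical and only illustrate the idea.

```python
from collections import Counter
from datetime import date, datetime, timedelta


def date_buckets(values, granularity="month"):
    """Hypothetical sketch: count dates per bucket, keeping empty buckets."""
    if granularity not in ("day", "month", "year"):
        raise ValueError(f"unsupported granularity: {granularity}")

    def bucket_start(v):
        # datetime is a subclass of date, so normalise it to a plain date first.
        d = v.date() if isinstance(v, datetime) else v
        if granularity == "year":
            return date(d.year, 1, 1)
        if granularity == "month":
            return date(d.year, d.month, 1)
        return d

    def next_bucket(d):
        if granularity == "year":
            return date(d.year + 1, 1, 1)
        if granularity == "month":
            if d.month == 12:
                return date(d.year + 1, 1, 1)
            return date(d.year, d.month + 1, 1)
        return d + timedelta(days=1)

    non_null = [v for v in values if v is not None]
    counts = Counter(bucket_start(v) for v in non_null)
    result = {"null_count": len(values) - len(non_null), "buckets": []}
    if not counts:
        return result

    # Walk from the first to the last observed bucket so gaps show up as zeros.
    current, last = min(counts), max(counts)
    while current <= last:
        result["buckets"].append({"start": current, "count": counts.get(current, 0)})
        current = next_bucket(current)
    return result
```

A correctness test for sparse data could then assert that a column with values only in January and March still yields a February bucket with a count of zero. Note that day granularity over a long date range produces one bucket per day, so a real implementation would likely need a cap or an automatic choice of granularity.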
Implementation Progress
- GitHub issue: https://github.com/fivetran/dataface/issues/281
Review Feedback
- [ ] Review cleared