dataface/core/render/chart/DESIGN.md

Chart Rendering — Design Philosophy

Core Principle: Extended Vega-Lite Grammar

Dataface charts use an extended Vega-Lite grammar. For any chart type that Vega-Lite supports natively (bar, line, area, arc, point, etc.), our YAML should be a thin wrapper around Vega-Lite's own concepts — not a parallel chart language that happens to compile down to Vega-Lite.

Concretely:

The translation layer should be mechanical: read YAML fields, build a Vega-Lite spec dict, inject the query data, apply theme/config defaults. No hidden mutations, no data-driven rewrites, no surprise layers.
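
As a rough illustration of what "mechanical" means here, the sketch below maps a flat chart dict straight onto a Vega-Lite spec dict. The function name `build_spec`, its arguments, and the chart keys used are assumptions made for this example, not the actual Dataface API, and field type inference is left out to keep it short.

```python
from typing import Any


def build_spec(
    chart: dict[str, Any],
    rows: list[dict[str, Any]],
    vega_config: dict[str, Any],
) -> dict[str, Any]:
    """Hypothetical translator: authored YAML fields in, Vega-Lite spec dict out."""
    spec: dict[str, Any] = {
        # 1. Read YAML fields and map them one-to-one onto Vega-Lite concepts.
        "mark": chart["type"],
        "encoding": {
            "x": {"field": chart["x"]},
            "y": {"field": chart["y"]},
        },
        # 2. Inject the query result as-is; no reshaping happens here.
        "data": {"values": rows},
        # 3. Apply theme/config defaults supplied by the caller.
        "config": vega_config,
    }
    # Optional fields are passed through only when the author wrote them.
    if "title" in chart:
        spec["title"] = chart["title"]
    return spec


rows = [{"date": "2024-01-01", "revenue": 10}]
spec = build_spec({"type": "bar", "x": "date", "y": "revenue"}, rows, {"axis": {"grid": False}})
```

Nothing in the function looks at the contents of `rows`, which is the property the principle asks for: the output depends only on what the author wrote and on config.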

Data Belongs to Queries, Not Charts

The query layer owns dataset meaning and grain. The chart layer owns visual encoding and presentation.

That means chart rendering should consume already-shaped data:

This keeps authored charts predictable. A user reading chart YAML should be able to assume the chart is a presentation over query results, not a second transformation pipeline hidden inside render code.

Native-Option-First Fixes

When fixing a Vega-Lite chart bug, the default move is to expose or pass through a native Vega-Lite option, not to add renderer-side policy code.

Before adding renderer logic for a Vega-Lite-native chart, answer:

  1. Which Vega-Lite property already expresses this behavior?
  2. Can we expose that property through existing chart settings or config?
  3. If not, what exact Vega-Lite gap forces Python-side logic?

Config Is the Single Source of Truth — No Fallbacks in Code

All defaults live in YAML. Code must never hardcode fallbacks.

The config system works like this:

default_config.yml + chart_defaults.yml + built-in structures/themes + user dataface.yml overlays  →  selected structure reapplied  →  get_config() returns complete DotDict
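
The layering can be pictured as a left-to-right deep merge in which later layers win. The sketch below is an assumption about the general shape only: `deep_merge` and `merge_layers` are hypothetical helpers, the layer contents are illustrative, and the "selected structure reapplied" step is omitted.

```python
from functools import reduce
from typing import Any


def deep_merge(base: dict[str, Any], overlay: dict[str, Any]) -> dict[str, Any]:
    """Return a new dict in which overlay values win and nested dicts merge recursively."""
    merged = dict(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def merge_layers(*layers: dict[str, Any]) -> dict[str, Any]:
    """Fold config layers left to right; the last layer has the highest priority."""
    return reduce(deep_merge, layers, {})


# Illustrative layer contents, standing in for the parsed YAML files.
default_config = {"chart": {"height": 300}, "layout": {"rows": {"gap": 20}}}
chart_defaults = {"vega": {"config": {"axis": {"gridColor": "#DFE1E5"}}}}
user_overlay = {"chart": {"height": 400}}

config = merge_layers(default_config, chart_defaults, user_overlay)
```

Because every layer contributes to one merged result, rendering code can read any key without supplying its own fallback.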

By the time any rendering code runs, get_config() is guaranteed to return a complete config with every value populated. This means:

The Vega-Lite config object in chart_defaults.yml (under vega.config) is applied to every spec at render time, after the active structure has been merged into the runtime config. Structural defaults like y-axis orientation, legend placement, title anchoring, and whether grids/domains exist belong in built-in structures. Visual defaults like grid colors, font sizes, and mark styles live in chart/theme config. The Python translation layer should NOT re-specify these — they come from config.

That includes structure-owned heuristics we previously hardcoded in Python, such as the default bar-chart categorical y-axis side and the x-axis label posture choices used by the smart-axis helper.

A useful metaphor:

Optional explicit structure presets and theme-paired structure overlays merge before the final theme layer. This keeps mechanics like spacing/orientation separate from brand styling while preserving one mechanical config pipeline.

Semantic State vs Presentation Defaults

Not every unresolved chart field should be filled from config.

Dataface keeps these responsibilities separate:

This prevents semantic chart meaning from being silently filled by presentation config.

Why This Matters

AI agents and new contributors instinctively add defensive defaults:

# BAD — hardcoded default that duplicates/contradicts config
y_axis_config = {"orient": "right", "labelLimit": 200}

# GOOD — config is guaranteed complete, just read it
config = get_config()
# orient comes from the active structure via vega.config.axisY
# labelLimit comes from chart.axis_label_limit

Every hardcoded default is a bug waiting to happen: it can drift from the config value, it can't be overridden by user config, and it makes the code harder to reason about.

What Goes Where

| Default type | Where it lives | Example |
|---|---|---|
| Structural chart defaults | structures.<name>.* in chart_structures/*.yml | vega.config.axisY.orient: "right" |
| Vega-Lite axis/legend/mark config | vega.config in chart_defaults.yml | axis.gridColor: "#DFE1E5" |
| Chart dimension defaults | chart.* in default_config.yml | chart.height: 300 |
| Chart-type-specific settings | chart_types.<type>.* in default_config.yml | chart_types.line.stroke_width: 2 |
| Layout defaults | layout.* in default_config.yml | layout.rows.gap: 20 |
| Style defaults | style.* in default_config.yml | style.font_family: "Liberation Sans" |

Structures can inherit from other structures so we can keep a small set of canonical scaffolding presets instead of copying near-identical defaults into many theme-specific files. Legacy built-in themes can then point at one of those shared structures via theme_structures without re-bundling structure values inside the theme itself.
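
One way to picture structure inheritance is a recursive resolve in which a child's values override its parent's. The `extends` key, the structure names, and the `resolve_structure` helper below are all hypothetical; the real key name and loader may differ.

```python
from typing import Any


def deep_merge(base: dict[str, Any], overlay: dict[str, Any]) -> dict[str, Any]:
    """Return a new dict in which overlay values win and nested dicts merge recursively."""
    merged = dict(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def resolve_structure(
    name: str,
    structures: dict[str, dict[str, Any]],
    _seen: tuple[str, ...] = (),
) -> dict[str, Any]:
    """Flatten a structure by merging its own values over its (hypothetical) `extends` parent."""
    if name in _seen:
        raise ValueError(f"structure inheritance cycle: {' -> '.join(_seen + (name,))}")
    entry = structures[name]
    own = {k: v for k, v in entry.items() if k != "extends"}
    parent = entry.get("extends")
    if parent is None:
        return own
    inherited = resolve_structure(parent, structures, _seen + (name,))
    return deep_merge(inherited, own)


# Illustrative structures: the child only states what differs from the parent.
structures = {
    "base": {"vega": {"config": {"axisY": {"orient": "left"}, "axis": {"grid": True}}}},
    "right_axis": {"extends": "base", "vega": {"config": {"axisY": {"orient": "right"}}}},
}
resolved = resolve_structure("right_axis", structures)
```

The child preset carries a single override, which is the point of inheritance: shared scaffolding is written once and near-identical copies are avoided.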

Practical Rule

Use this test:

Examples of values that belong in structure:

Examples of values that belong in theme:

Internal Framing: Data, Encoding, Structure, Theme

For internal design discussions, it helps to separate chart concerns into four concepts without changing authored YAML:

Another way to say this:

Authored chart YAML remains flat. This vocabulary is for internal architecture and config organization, not for introducing new required YAML nesting.

Chart Reasoning Frameworks

Dataface uses a small set of internal spectra and framings to reason about chart choice, chart sophistication, and what kinds of enrichment are justified in a given situation.

These frameworks are meant to help with:

They complement the data / encoding / structure / theme framing:

Current Canonical Spectra

Text-to-Mark Spectrum

Another useful internal framing is the discrete text-to-mark spectrum:

table -> colored table -> spark table -> graphic table -> chart -> mini-chart

This spectrum describes gradual changes in how statistical meaning is carried: first mainly by text and tabular structure, then increasingly by graphical marks.

For structure design, this spectrum is a taxonomy above any individual preset. It helps us choose and name structure families, especially when deciding how much tabular scaffolding to preserve versus how much of the canvas should be given over to marks.

Context Spectrum

Another useful internal framing is the context spectrum: how much context we have available when deciding what chart to make and how much meaning it can responsibly carry.

data type -> field semantics -> observed values -> comparative context -> human context

The levels are:

This spectrum is useful for chart recommendation, AI-assisted design, and internal reasoning about progressive enrichment.

Safe-Enrichment Rule

Moving rightward on the context spectrum expands what we may reasonably infer, but it does not erase the data-shape boundary.

In short: more context allows richer charts, but extra meaning still needs an explicit source.

Interpretive-Commitment Spectrum

Another useful internal framing is the interpretive-commitment spectrum: how much analytical opinion the chart expresses through its form, defaults, and added guidance.

least committed -> lightly framed -> analytically guided -> interpretively emphasized -> narratively directed

This is different from the context spectrum.

Examples:

The important idea is not that more commitment is always better. Lower commitment can be better for open-ended exploration. Higher commitment can be better for monitoring, explanation, persuasion, or decision support.

This spectrum helps us reason about how far to go from neutral display toward explicit interpretation.

Control-and-Freedom Spectrum

Another useful internal framing is the control-and-freedom spectrum: where chart decisions should remain fixed and dependable, where Dataface should carry an opinion, where behavior may adapt to context, and where authored specificity should take over.

conventional and dependable -> opinionated defaults -> context-sensitive adaptation -> authored specificity

This spectrum is different from both the context spectrum and the interpretive-commitment spectrum.

The levels are:

This framing is useful because not every chart decision should be optimized for the same kind of freedom.

This spectrum also helps clarify where those decisions should live:

Chart Navigation

Dataface should be understood not as a single chart chooser, but as a multidimensional territory of chart and dashboard potential that a user can navigate.

The system's job is:

The first chart should be the lowest-regret starting point given current context. That means the safest high-value representation we can justify with the information currently available, not the most elaborate chart we can imagine.

Navigation then happens through degrees of freedom. A degree of freedom is any handle the user can change while exploring the chart space.

Examples of degrees of freedom:

Internally, Dataface may have many degrees of freedom. Product-wise, however, we should expose them through simple navigation affordances that let the user move along one dimension without accidentally changing several others at once.

In practice, that means:

The general principle is:

This is the core reason to think in spectra. They give us a way to group many low-level controls into a smaller number of meaningful traversal paths.

Candidate Spectra To Explore Later

These are promising directions, but they are not yet canonical and should not be treated as settled design language.

Pending Additions

This section is expected to grow. When we add a new spectrum, the bar for inclusion should be that it helps future chart choice, chart enrichment, or chart-authoring policy in a concrete way.

Where Dataface Adds Value

Dataface extends beyond vanilla Vega-Lite in specific, explicit areas:

| Feature | Why it exists |
|---|---|
| KPI charts (type: kpi) | Single-number display; not a Vega-Lite chart type |
| Tables (type: table) | Rendered as custom SVG, not Vega-Lite |
| Spark bars (type: spark_bar) | Inline sparkline bars, custom SVG |
| Geographic maps (type: map, choropleth, point_map) | Dataface provides built-in geo sources, join logic, and projection defaults on top of Vega-Lite's geoshape mark |
| Auto chart type (type: auto) | Detects best chart type from data shape; runs decisions.py to pick fields, formats, scales |
| Theming / Structure | Merges Dataface structure and theme config into Vega-Lite's config object |
| Data binding | Injects query result data into data.values |
| YAML shorthand | x: date, y: revenue instead of verbose Vega-Lite encoding objects |

What the Translation Layer Should NOT Do

Crosshairs — Removed

Line and area charts previously generated a multi-layer crosshair spec by default (_generate_line_chart_with_crosshair). This has been removed.

Why it existed: Static SVGs (from vl-convert) have no interactivity. The crosshair was a workaround to bake hover behavior into the Vega-Lite spec itself. It produced a layered spec with selection params and rule marks.

Why it was removed:

  - It turned a simple type: line into a complex multi-layer spec (~140 lines of Python for what should be ~5 lines of Vega-Lite).
  - It added interactivity that only works in browser contexts, which makes it useless for PDF, PNG, terminal, or MCP output.
  - The Cloud app already has its own JS interactivity layer (chart-interactivity.js) that handles tooltips and context menus.
  - It violated the "thin wrapper" principle by silently adding layers the user didn't ask for.

Line and area charts now render as simple single-mark specs with tooltip: true, like every other chart type.
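
For comparison, a single-mark line spec of the kind described above might look like the dict below. The field names and data rows are invented for illustration; only the overall shape (one mark, tooltip enabled, no `layer` or selection params) is taken from the text.

```python
import json

# A plain single-mark Vega-Lite spec: no layers, no selection params, no rule marks.
line_spec = {
    "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
    "data": {
        "values": [
            {"date": "2024-01-01", "revenue": 120},
            {"date": "2024-02-01", "revenue": 135},
        ]
    },
    "mark": {"type": "line", "tooltip": True},
    "encoding": {
        "x": {"field": "date", "type": "temporal"},
        "y": {"field": "revenue", "type": "quantitative"},
    },
}

print(json.dumps(line_spec, indent=2))
```

Hover behavior in browser contexts is then left to whichever layer hosts the chart, rather than being baked into every output format.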

File Layout

| File | Responsibility |
|---|---|
| vega_lite.py | Thin public entrypoints for chart rendering and Vega-Lite spec generation. |
| models.py | Internal chart pipeline models (ChartIntent, EnrichmentPatch, ResolvedChart, RenderArtifact). |
| pipeline.py | Normalize authored chart intent, run enrichment, and resolve render-ready chart semantics. |
| renderers.py | Renderer registry and artifact selection for standard, geo, and SVG-family charts. |
| profile.py | Chart profile mapping: single home for all Dataface/Vega-Lite divergence (type renames, channel encoding, orientation transforms, sort mapping, bar axis defaults). |
| standard_renderer.py | Thin mechanical Vega-Lite assembly. Consumes profile-mapped state; no Dataface profile logic. |
| presentation.py | Shared presentation helpers (axis config, tooltip assembly, config-driven presentation defaults). |
| serialization.py | Chart-domain JSON serialization helpers. |
| spec_builders.py | Shared Vega-Lite spec builder helpers (base spec construction, title setting). |
| type_inference.py | Vega-Lite type inference from query result data (temporal, quantitative, nominal). |
| vega_lite_types.py | Spec generators for chart types needing custom logic (arc, histogram, boxplot, layered multi-y). |
| geo.py | Geographic chart specs: built-in geo sources, data joins, projections. Uses Vega-Lite geoshape but adds Dataface product value. |
| decisions.py | Data-aware enrichment heuristics (auto format, scale, field detection) used by the pipeline when semantic fields are unresolved. |
| rendering.py | Orchestration: execute query → resolve chart → render. Not Vega-Lite specific. |
| kpi.py, table.py, spark.py, spark_bar.py | Non-Vega-Lite renderers (custom SVG). |
| ../converters/chart.py | Chart artifact conversion for SVG/PNG/PDF/JSON outputs. |
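
To make the type_inference.py responsibility more concrete, here is a minimal sketch of inferring a Vega-Lite type from a column of Python values. The function name `infer_vl_type` and its rules are assumptions for illustration; the real module may handle strings, mixed columns, and database types differently.

```python
from datetime import date, datetime
from numbers import Number
from typing import Any, Iterable


def infer_vl_type(values: Iterable[Any]) -> str:
    """Guess a Vega-Lite type for one column, ignoring None values."""
    present = [v for v in values if v is not None]
    if not present:
        # Nothing to inspect, so fall back to the least committal type.
        return "nominal"
    # datetime is a subclass of date, so one isinstance check covers both.
    if all(isinstance(v, date) for v in present):
        return "temporal"
    # bool is a subclass of int in Python; treat it as a category, not a number.
    if all(isinstance(v, Number) and not isinstance(v, bool) for v in present):
        return "quantitative"
    return "nominal"


column_types = {
    "revenue": infer_vl_type([10, 12.5, None]),
    "day": infer_vl_type([date(2024, 1, 1), datetime(2024, 1, 2, 9, 30)]),
    "region": infer_vl_type(["north", "south"]),
}
```

Keeping this inference in one place means the spec assembly code can stay mechanical and simply read the inferred type.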

Guiding Questions for New Code

Before adding chart logic, ask:

  1. Does Vega-Lite already support this? If yes, pass it through — don't wrap it.
  2. Am I inventing a name for something Vega-Lite already names? If yes, use Vega-Lite's name.
  3. Am I hardcoding a default? If yes, put it in default_config.yml instead and read it with get_config().
  4. Am I mutating the spec based on data? If yes, make it opt-in or limit it to type: auto.
  5. Would a user reading the YAML be surprised by the output? If yes, the translation is too thick.