tasks/workstreams/graph-library/initiatives/m3-chart-and-viz-extensibility/spec.md

Spec

Functional and technical specification derived from research. This is a recommended direction for implementation planning, not committed scope.


1. Goals


2. Non-goals (near term)


3. Capability tiers (lightweight, ordered)

Tier 0 — Built-ins, including type: inline

Tier 1 — Optional: named templates / registry (may be unnecessary for v1)

Tier 2 — Python extensions (build_vega_lite, build_vega, or render_svg) — niche

Recommendation: Ship type: inline (Tier 0) first; add Vega dispatch alongside VL for inline; treat Tier 1 registry and Tier 2 Python as follow-ons driven by real demand, not prerequisites.


4. “Plugin system” shape — optional after inline

Default story: no plugin install — authors use type: inline. That may be sufficient for a long time.

If we still need named templates and/or Python chart extensions, split the problem into two layers that are easy to confuse:

  1. Install plane (how code gets on disk) — must go through the Python environment (uv / pip / pyproject.toml) or explicit project-local modules on disk (§4.1b). YAML does not run pip for you. Hynek-style: invest in dft compile / dft doctor verification — actionable errors (“add this dep”, “sync”, “fix this import”) — not a parallel package manager.
  2. Policy plane (strict YAML lockdown) — not part of the near-term plan. The default stays permissive indefinitely unless we explicitly add an M5-era feature: YAML allow_types, expose_builtins, template allowlists for “only these extensions may load.” Treat that as a placeholder on the M5 milestone backlog; we may never implement it if demand never appears. Armin-style: avoid building it pre-emptively.

type: inline stays a built-in; it never lives in the extension registry. Later (M3+): reusable named template / registry work (templates + optional Python packs) if inline is not enough — separate from M5 policy placeholders.


4.1 Python chart extensions — install, verification, and M5 policy (may never)

How code is installed (recommended):

```toml
[project.entry-points."dataface.charts"]
acme_stream = "acme_charts.stream:register_chart"
```

Verification (Hynek-style — real, not “policy lockdown”):

Optional required_packages documents what the repo expects installed; dft compile / dft doctor fails with actionable messages if a listed distribution is missing. That is env truth, not “you may only use these chart types.”

```yaml
charts:
  extensions:
    required_packages:
      - { name: "dataface-charts-acme", specifier: ">=1.0,<2" }
```
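A sketch of what the doctor-style presence check could look like. This is presence-only for simplicity; real specifier matching would use the packaging library, and the remediation wording is illustrative:

```python
from importlib.metadata import PackageNotFoundError, version

def check_required_packages(required: list[dict]) -> list[str]:
    """Return actionable error strings, Hynek-style (sketch)."""
    errors = []
    for req in required:
        name = req["name"]
        try:
            version(name)  # raises if the distribution is not installed
        except PackageNotFoundError:
            errors.append(
                f"required package '{name}' ({req.get('specifier', 'any')}) "
                f"is not installed; run: uv add '{name}{req.get('specifier', '')}'"
            )
    return errors
```

The point is env truth with a next step, not policy: the check never blocks a chart type that is installed but unlisted.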

M5 placeholder — strict YAML policy (may never ship):

If one day we need enterprise-style lockdown (only these extension types, only these built-ins in Cloud editor / AI), we could add something like allow_types / expose_builtins / template allowlists. Park that on the M5 milestone as a maybe; assume we will not build it until a concrete customer forces the conversation. Until then: everything installed + everything in local_modules that resolves is usable.

```yaml
# M5 backlog sketch only — do not implement by default
charts:
  extensions:
    allow_types: [acme_stream, acme_funnel]
    expose_builtins: [line, bar, area, table, inline]
```

If this ever ships, apply Bruno-style diagnostics: collect all violations in one compile pass where practical.
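If it ever ships, the one-pass diagnostic style could look like this hypothetical sketch, which collects every violation before reporting instead of failing on the first (all names are illustrative):

```python
def collect_policy_violations(charts: dict, allow_types: list[str]) -> list[str]:
    """Bruno-style: validate every chart in one compile pass (sketch)."""
    allowed = set(allow_types)
    errors = []
    for chart_id, chart in charts.items():
        ctype = chart.get("type")
        if ctype not in allowed:
            errors.append(
                f"chart '{chart_id}': type '{ctype}' is not in "
                f"charts.extensions.allow_types ({sorted(allowed)})"
            )
    return errors  # report all at once; empty list means the policy passes
```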

Why not chart_deps: [github urls] alone?

Ensuring deps are installed:

| Approach | Role |
| --- | --- |
| uv.lock / pip freeze | Source of truth for CI and prod images |
| dft compile / dft doctor | Verify required_packages and entry points resolve; fail fast with actionable messages |
| Cloud deploy | Same image that runs the app includes the extras; no runtime pip install from dashboard YAML |

4.1b Project-local chart code (no separate PyPI package)

Teams often want one-off or repo-private chart logic without publishing my-corp-charts to PyPI. That is still valid; it does not require a second installable distribution if the dashboard repo is already a Python package or you accept a path-backed import.

Recommended (still “real Python”): editable monorepo package

Alternative: explicit local module(s) (no entry point — dumb and explicit)

```yaml
charts:
  extensions:
    local_modules:
      - my_dashboards.charts_custom   # must be importable (project on PYTHONPATH or editable install)
```

Not recommended: auto-import every *.py under ./plugins/ without an explicit list — too magical (fails Armin’s “dumb loader” test). A single local_modules list or one register.py path is enough.
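A sketch of that dumb, explicit loader, assuming each listed module exposes a register(registry) hook (the hook name is an assumption, mirroring the entry-point convention):

```python
import importlib

def load_local_modules(module_paths: list[str], registry: dict) -> None:
    """Import each explicitly listed module and call its register hook.

    No directory scanning, no magic: an unimportable module fails loudly
    at compile time instead of being silently skipped.
    """
    for path in module_paths:
        module = importlib.import_module(path)  # ImportError surfaces as-is
        module.register(registry)               # assumed convention
```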

Relation to inline: locals are for reused Python logic across many charts; inline stays the default for one-off VL/Vega in YAML.


4.2 Named templates (no Python) — different mechanism

Reusable Vega-Lite / Vega JSON templates (files in repo, optional placeholders) do not need pip:

```yaml
charts:
  templates:
    paths: ["./chart_templates/"]   # or explicit ids → file mapping
```

The compiler discovers templates from disk using a dumb rule: glob / explicit path list / id → file map — no dependency graph between templates. Version control is git. Template allowlists (only these template ids) belong in the same M5 “maybe never” bucket as allow_types. This pairs with type: inline: authors either paste a spec inline or reference template_id: org_streamgraph.
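A sketch of that dumb discovery rule, assuming id = file stem and treating duplicate ids as a hard error (the glob pattern and error wording are illustrative):

```python
from pathlib import Path

def discover_templates(paths: list[str]) -> dict[str, Path]:
    """Map template id -> file for every *.json under the listed dirs (sketch)."""
    templates: dict[str, Path] = {}
    for root in paths:
        for file in sorted(Path(root).glob("*.json")):
            tid = file.stem
            if tid in templates:
                raise ValueError(f"duplicate template id '{tid}': {file}")
            templates[tid] = file
    return templates
```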


4.3 What others do (short)

| Ecosystem | Pattern |
| --- | --- |
| Python plugins | Entry points (pytest, dbt adapters, Jupyter) — install via pip/uv; app discovers at import time |
| Grafana / Superset | Frontend bundles built into the host app or dropped in a plugins dir — not “YAML installs npm” |
| dbt | Packages in packages.yml + dbt deps — explicit second step (closest analog if we ever add dft deps for template packs only) |

Recommendation: For Python chart code, mirror entry points + pyproject/lockfile, plus optional required_packages verification. Do not plan YAML type allowlists for M2–M4 — M5 placeholder only, may never. For templates, git + paths (+ optional dft deps-style fetch later only for static assets, not arbitrary code execution). Do not promise “list GitHub URLs in default_config and charts auto-install” unless we build a deliberate, audited package story.


4.4 Relation to the hybrid task doc

The hybrid sketch in extensible-schema-with-custom-elements-and-chart-types.md maps cleanly:


4.5 Co-release with dbt — does it change deps / extensions?

Dataface may ship bundled with dbt (vendor CLI, Cloud/Fusion-style images, locked installer). That does change the default story for Python extensions, not the conceptual model.

| Topic | Standalone Dataface / open venv | Locked dbt environment |
| --- | --- | --- |
| type: inline | Same | More important — no extra packages; VL/Vega/svg/html in YAML is always available once Dataface runs |
| Templates on disk (§4.2) | Git repo paths | Same; often delivered via dbt deps as static files inside a dbt package (e.g. chart_templates/*.vl.json) — no Python dependency edge, only dbt deps |
| Entry-point chart packages | pip install / uv add anything | Often not allowed ad hoc. Extensions must be pre-installed in the bundle or exposed as blessed optional extras of the same distribution (pattern: dbt-\<dist\>[dataface-charts-acme]) — vendor-controlled, like adapters |
| local_modules / editable monorepo (§4.1b) | Fine for internal dev | OSS / self-managed projects only; Cloud images typically ignore or disallow arbitrary repo Python unless documented |
| required_packages verification | “Run uv add …” | Errors may need to read: “not included in this dbt release” / contact admin — same check, different remediation copy |

What we learn: Inline + templates are the portable extensibility layer across dbt-co-released and standalone installs. Python entry-point plugins remain valid but are second-class wherever the runtime is frozen; there, treat them like dbt adapters — versioned, tested, shipped with the product, not user-pip-installed at compile time.

dbt packages vs Python packages: Keep the distinction explicit: packages.yml is ideal for declarative template assets; Python chart code stays in the interpreter’s dependency graph (pyproject / bundle extras).


5. Minimal Python protocol (sketch)

```python
# Conceptual — not an implementation commitment
class ChartExtension:
    id: str
    def validate(self, chart: DotDict) -> None: ...
    def build_vega_lite(self, chart: DotDict, data: list[dict]) -> dict: ...
    # OR — when VL is not enough:
    def build_vega(self, chart: DotDict, data: list[dict]) -> dict: ...
    # OR — bypass Vega entirely:
    def render_svg(self, chart: DotDict, data: list[dict]) -> str: ...
```

Implementations pick one build path per extension (VL vs Vega vs raw SVG) depending on what the chart needs. Distinct from type: inline, which is built-in dispatch, not an installed plugin.

Registration via entry points (see EXTENSIONS.md) or explicit path in project config for monorepos.
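A sketch of the per-chart dispatch the compiler could run, assuming each concrete extension defines only the one build hook it uses (the preference order and return shape are illustrative):

```python
def compile_chart(ext, chart: dict, data: list[dict]):
    """Pick the single build path an extension implements (sketch)."""
    if hasattr(ext, "build_vega_lite"):
        return ("vega-lite", ext.build_vega_lite(chart, data))
    if hasattr(ext, "build_vega"):
        return ("vega", ext.build_vega(chart, data))
    if hasattr(ext, "render_svg"):
        return ("svg", ext.render_svg(chart, data))
    raise TypeError(f"extension {getattr(ext, 'id', ext)} implements no build path")
```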


6. type: inline — one built-in chart, optional payloads

Default built-in — same class as bar / line, not an extension. Prefer one chart type value (inline) instead of four separate “raw_” types. Authors set exactly one payload field; the compiler rejects zero or more than one (clear errors, no silent precedence).

Shape (illustrative)

```yaml
charts:
  inline_chart:
    type: inline
    vega_lite:        # mutually exclusive with vega / svg / html
      $schema: https://vega.github.io/schema/vega-lite/v5.json
      mark: bar
      encoding: { x: { field: a, type: nominal }, y: { field: b, type: quantitative } }
      data: { values: [] }   # or rely on explicit query binding per schema design
```

Same chart with another payload slot:
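For example, an illustrative sketch using the svg slot instead (payload content is a placeholder):

```yaml
charts:
  inline_chart:
    type: inline
    svg: |            # mutually exclusive with vega_lite / vega / html
      <svg xmlns="http://www.w3.org/2000/svg" width="120" height="40">
        <rect width="120" height="40" fill="#eee"/>
      </svg>
```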

Why one inline type

Validation and trust

Optional later: file indirection

Payload values could alternatively be strings interpreted as paths (e.g. ./specs/foo.vl.json) — same exclusivity rule, one key set.
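The exclusivity rule can be sketched directly, assuming the four payload keys named above (error wording is illustrative):

```python
PAYLOAD_KEYS = ("vega_lite", "vega", "svg", "html")

def validate_inline_payload(chart: dict) -> str:
    """Return the one payload key set, or fail loudly (sketch).

    Zero payloads and multiple payloads are both hard errors with a
    message naming the offenders: no silent precedence.
    """
    present = [k for k in PAYLOAD_KEYS if k in chart]
    if len(present) != 1:
        raise ValueError(
            f"type: inline requires exactly one of {PAYLOAD_KEYS}, "
            f"got {present or 'none'}"
        )
    return present[0]
```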

Templating (e.g. Jinja) — yes, if we define the contract

Authors can use Jinja (or the same template engine Dataface already uses elsewhere) to parameterize inline payloads, as long as we pin when templates run and what context is visible.

| Payload | Templating use | Caveats |
| --- | --- | --- |
| vega_lite / vega | Jinja is reasonable for scalars: titles, colors, limits, width/height, variable-driven field names — or for a string payload that is rendered then parsed as JSON using a tojson-style filter so the result is valid | Usually avoid putting raw query rows through Jinja to build JSON — easy to produce invalid JSON or sneak in logic that belongs in SQL. Prefer normal query → structured data → inject data in code (today’s pipeline) |
| svg / html | Natural fit: treat payload as a template string; render with a documented context (variables, maybe bounded row snippets if we ever allow it) | html: use autoescape defaults appropriate for the context |

Rule of thumb: Templating should not become a second data transformation layer that violates “data belongs to queries” (DESIGN.md). Use Jinja for authoring parameters and presentation strings; use queries for dataset shape.
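The tojson-style “render then parse” idea can be sketched without committing to a template engine: JSON-encode each scalar before substitution, then parse the result, so the payload is valid by construction. This stdlib str.format version stands in for Jinja’s tojson filter; all names are illustrative:

```python
import json

def render_payload(template: str, **params) -> dict:
    """Substitute JSON-encoded scalars into a string payload, then parse it."""
    rendered = template.format(**{k: json.dumps(v) for k, v in params.items()})
    return json.loads(rendered)  # raises if the rendered payload is not valid JSON

# Usage: str.format needs literal braces doubled in the template string.
spec = render_payload(
    '{{"title": {title}, "width": {width}}}',
    title='Revenue ("net")',   # quotes are escaped by json.dumps, not by hand
    width=640,
)
```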

Forms / write actions (still not our backend)

An html payload might contain <form action="…"> to a customer API — markup only; Dataface does not handle POST or auth. json-render / Lowdefy remain mental models, not something we replicate as a framework.


7. Impact on chart module organization


8. Milestone alignment

| Milestone | Work |
| --- | --- |
| M2 (or earlier if scoped small) | Ship type: inline as a documented default: validation, query/data binding rules, vega_lite + vega (vl-convert) dispatch, tests. Fold inline into schema version / migrations when the YAML version field lands so payload keys evolve cleanly |
| M2 | Schema version field, migrations, namespace rules — applies to inline and any future custom: types |
| M3+ | Optional: named template registry, Python extensions — per extensible-schema-with-custom-elements-and-chart-types, reduced scope if inline covers most cases |
| M5 | Placeholder (may never): YAML policy lockdown — allow_types, expose_builtins, template allowlists, etc. Do not schedule unless a concrete need appears; default assumption is permissive forever |

Research / design stays useful for external comparisons; implementation can lead with inline without waiting for the full catalog/registry design.


9. Acceptance criteria (initiative-level)