tasks/workstreams/graph-library/initiatives/m3-chart-and-viz-extensibility/research.md

Research

Evidence, references, and investigation notes for chart and visualization extensibility in Dataface: how comparable systems allow custom charts, how declarative JSON/YAML UI stacks extend themselves, and what fits a Python-centric, SVG-first server. Out of scope: a rich HTML component framework. An optional type: inline chart may carry an html payload (see spec §6) with explicit trust/export caveats, but that is an escape hatch, not a form builder.


1. Dataface context (already in-repo)

| Artifact | Relevance |
| --- | --- |
| dataface/core/render/chart/DESIGN.md | Extended Vega-Lite grammar; mechanical YAML → VL; no hidden data transforms in render. Extensions must respect “data belongs to queries.” |
| ai_notes/research/json-render-deep-dive.md | Catalog + registry: Zod-typed components, defineCatalog / defineRegistry, actions, JSON Pointer state; closest analog to “schema is the extension point.” Recommends aligning Dataface with a typed catalog once declarative schema exists. |
| ai_notes/core/EXTENSIONS.md | Prior design: ChartExtension + entry points, HTML/JS variable controls without Python, custom actions. Good straw-man for Python plugins vs template-only extensions. |
| tasks/workstreams/dft-core/tasks/extensible-schema-with-custom-elements-and-chart-types.md | Task already outlines (A) project plugin registry, (B) Vega-Lite templates, (C) hybrid. Milestone M3. |
| tasks/workstreams/dft-core/tasks/task-m2-yaml-version-migrations.md | M2: schema version + migrations. Any new extension types or registration keys should be versioned and migratable. |
| ai_notes/research/WRITE_BACK_LOWCODE_RESEARCH.md | Lowdefy-style blocks + connections + actions; relevant for forms that POST to customer APIs (we only serve markup; client/browser or edge handles execution). |
| ai_notes/features/MARKDOWN_SVG_LIBRARY.md | Planned markdown→SVG library mentions custom renderers/plugins as a future axis; same “escape hatch” idea as chart extensions. |

2. Vega and Vega-Lite

2.1 Mental model

2.2 Vega extensibility (runtime, JavaScript-centric)

Vega documents an extensibility API for registering runtime pieces (e.g. transforms, projections, scales, color schemes) into the Vega runtime. This is not a Python or server-side plugin system; it assumes JS execution when parsing/rendering.

Implication for Dataface: We already treat VL as an intermediate representation (compile chart YAML → VL dict → renderer). Extension paths that stay declarative (full VL/Vega JSON + data injection) align with ecosystem skills without requiring a JS plugin host on the server.
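A minimal sketch of the "chart YAML → VL dict" step described above, assuming the YAML has already been parsed into a Python dict. The function name `compile_chart` and the chart keys (`type`, `x`, `y`, `color`, `*_type`) are hypothetical, chosen only for illustration; they are not Dataface's actual schema. The point is that the mapping is mechanical and that rows pass through untouched.

```python
from __future__ import annotations

from typing import Any


def compile_chart(chart: dict[str, Any], rows: list[dict[str, Any]]) -> dict[str, Any]:
    """Map a parsed chart definition to a Vega-Lite dict without transforming data."""
    encoding = {
        channel: {
            "field": chart[channel],
            # Hypothetical convention: "<channel>_type" overrides the default type.
            "type": chart.get(f"{channel}_type", "nominal"),
        }
        for channel in ("x", "y", "color")
        if channel in chart
    }
    return {
        "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
        "mark": chart["type"],
        "data": {"values": rows},  # rows come from the query and are passed as-is
        "encoding": encoding,
    }


chart = {"type": "bar", "x": "region", "y": "revenue", "y_type": "quantitative"}
rows = [{"region": "EU", "revenue": 120}, {"region": "US", "revenue": 200}]
spec = compile_chart(chart, rows)
```

The resulting dict is plain Vega-Lite, so any converter that accepts a VL spec (for example vl-convert) could take it from here without a JS plugin host.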

2.3 “Custom chart” in practice for VL users

Common patterns in the wild:

  1. Author full Vega-Lite (or Vega) JSON — maximum flexibility; team maintains the spec; tool only binds data and theme.
  2. Parameterized templates — placeholders for encodings, filters, or mark subtrees; tool fills from YAML.
  3. Drop to Vega when VL is insufficient (e.g. some network layouts, custom interaction).
  4. Layer / facet / concat — composition instead of new mark types.
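Pattern 2 (parameterized templates) can be sketched as a recursive substitution over a spec. The `{"$param": name}` placeholder convention and the `fill_template` helper are assumptions made for this example; they are not a Vega-Lite feature (VL's own `params` mean something different) and not an existing Dataface API.

```python
from __future__ import annotations

from typing import Any


def fill_template(node: Any, params: dict[str, Any]) -> Any:
    """Return a copy of node with every {"$param": name} replaced by params[name]."""
    if isinstance(node, dict):
        if set(node) == {"$param"}:
            # A missing parameter raises KeyError instead of being silently skipped.
            return params[node["$param"]]
        return {key: fill_template(value, params) for key, value in node.items()}
    if isinstance(node, list):
        return [fill_template(item, params) for item in node]
    return node


template = {
    "mark": "line",
    "encoding": {
        "x": {"field": {"$param": "x_field"}, "type": "temporal"},
        "y": {"field": {"$param": "y_field"}, "type": "quantitative"},
    },
}
spec = fill_template(template, {"x_field": "day", "y_field": "signups"})
```

Because the function builds new containers rather than mutating the template, one template can be filled many times from different YAML charts.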

Streamgraph example: Often implemented as stacked areas with a wiggle (or similar) stack offset. Depending on VL version and transforms available, this may be expressible in VL, may require Vega-level spec, or may need custom SVG generation in Python. This is a good tiering test: template VL first; else Vega passthrough; else render_svg extension.
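As a sketch of the first tier only: Vega-Lite documents a `center` stack offset intended for streamgraphs, so a template along these lines may be enough before reaching for Vega passthrough or Python SVG. The helper name `streamgraph_spec` and its parameters are hypothetical, and `center` is not the same as a wiggle offset, so whether the result is acceptable is a design judgment.

```python
from __future__ import annotations

from typing import Any


def streamgraph_spec(
    rows: list[dict[str, Any]], x: str, y: str, series: str
) -> dict[str, Any]:
    """Build a Vega-Lite stacked-area spec with a centered baseline."""
    return {
        "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
        "data": {"values": rows},
        "mark": "area",
        "encoding": {
            "x": {"field": x, "type": "temporal"},
            "y": {
                "field": y,
                "type": "quantitative",
                "aggregate": "sum",
                "stack": "center",  # VL's centered baseline, not a wiggle offset
                "axis": None,  # a centered baseline makes absolute y ticks misleading
            },
            "color": {"field": series, "type": "nominal"},
        },
    }


rows = [
    {"date": "2024-01-01", "count": 3, "series": "a"},
    {"date": "2024-01-01", "count": 5, "series": "b"},
    {"date": "2024-02-01", "count": 4, "series": "a"},
    {"date": "2024-02-01", "count": 2, "series": "b"},
]
spec = streamgraph_spec(rows, x="date", y="count", series="series")
```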


3. Other charting libraries (extension mechanics)

3.1 Apache ECharts

3.2 Observable Plot

3.3 D3.js


4. BI and analytics products (headless / embedding / plugins)

4.1 Apache Superset (superset-ui)

4.2 Metabase

4.3 Grafana

4.4 Lightdash (dbt-adjacent)


5. Declarative JSON / YAML UI systems

5.1 json-render (Vercel Labs)

5.2 Lowdefy (YAML internal tools)


6. Cross-cutting comparison (short)

| System | Extension unit | Runtime | Fits SVG-first Python server? |
| --- | --- | --- | --- |
| Vega-Lite | Spec / template | VL→Vega→SVG (in JS or via vl-convert) | Strong if we keep extensions as spec fragments |
| Vega | Spec + extensibility hooks | JS typical | Medium (passthrough spec); full hooks need JS |
| ECharts / Plotly (JS) | JS modules / custom series | Browser | Weak without embedded JS runtime |
| Superset / Grafana | Frontend plugin packages | Host app | Weak for our core; OK as analogy |
| json-render / Lowdefy | Catalog / YAML blocks | Client or SSR | Strong for HTML-only extensions |

7. Versioning and chart organization (research conclusions)


8. Synthetic perspectives: three OSS developers on this plan

The following are not real quotes. They are informed, stylized takes — what these people might emphasize if you walked them through the spec (inline-first, two-layer install vs policy, entry points, YAML allowlists). Use them as a design stress test.

Hynek Schlawack (Python packaging, attrs, structlog)

Likely emphasis: Treat the lockfile and pyproject as the only honest source of “what is installed.” A YAML chart_deps list that does not map to resolved distributions is fiction until uv sync runs. He would push for hard failures in compile/doctor when required_packages are missing, with messages that name the exact uv add line — no silent skip, no “best effort.” He would be skeptical of any future dft deps that executes unverified code; if you add it, it should look like pinned, hashed artifacts or stay template-only. type: inline as default: good — fewer moving parts, less supply-chain surface than a plugin zoo.

Takeaway: Align messaging with how Python actually works; invest in verification UX, not a parallel package manager. We follow this: spec §4 treats pyproject/lockfile as install truth and prioritizes dft compile / dft doctor messages that tell you exactly what to add or sync — no YAML-driven installer.
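The verification idea can be sketched with the standard library alone. This assumes required_packages is a plain list of distribution names; the function name `check_required_packages` and the message wording are hypothetical, not what dft compile or dft doctor actually print. The check only reads installed metadata and never installs anything.

```python
from __future__ import annotations

from importlib.metadata import PackageNotFoundError, version


def check_required_packages(required: list[str]) -> list[str]:
    """Return one actionable message per distribution missing from the environment."""
    errors: list[str] = []
    for name in required:
        try:
            version(name)  # raises PackageNotFoundError if the dist is not installed
        except PackageNotFoundError:
            errors.append(
                f"required package {name!r} is not installed; run: uv add {name}"
            )
    return errors


# Hypothetical distribution name, used only to show the failure path.
for message in check_required_packages(["dataface-chart-does-not-exist"]):
    print(message)
```

A caller would treat a non-empty list as a hard failure (non-zero exit) rather than skipping the affected charts.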

Armin Ronacher (Flask, Werkzeug, long history of pragmatic Python APIs)

Likely emphasis: Extension systems rot when the core tries to be too clever. Entry points are boring and that is a feature. He would ask whether allow_types is strictly necessary for v1 or if it is premature enterprise — maybe “everything installed is available” is enough until abuse appears. For inline, he would warn against ten optional sub-keys that interact oddly; one payload discriminator (exactly one of four keys) is the right kind of blunt rule. He might nudge: if templates live on disk, keep the loader dumb — glob + explicit id, not a plugin graph.

What “dumb loader vs plugin graph” means (plain English):

We follow this: permissive by default (no required allow_types); strict YAML policy is an M5 placeholder that may never ship (spec §4.1). Template discovery stays path/glob + explicit ids; inline stays one blunt shape.

Takeaway: Minimize policy knobs until pain is real; keep discovery and loading boring.
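One possible reading of "glob + explicit id", sketched under assumptions: templates are JSON files matching a single pattern, and each file declares its own `id` next to its `spec`. The file layout, the `*.vl.json` pattern, and the `load_templates` name are all hypothetical. There is no dependency resolution, no import hooks, and no inference of ids from filenames.

```python
from __future__ import annotations

import json
from pathlib import Path
from typing import Any


def load_templates(root: Path, pattern: str = "*.vl.json") -> dict[str, dict[str, Any]]:
    """Load every matching file under root, keyed by the id each file declares."""
    templates: dict[str, dict[str, Any]] = {}
    for path in sorted(root.glob(pattern)):  # sorted for deterministic error messages
        doc = json.loads(path.read_text(encoding="utf-8"))
        template_id = doc["id"]  # explicit id, never derived from the filename
        if template_id in templates:
            raise ValueError(f"duplicate template id {template_id!r} in {path}")
        templates[template_id] = doc["spec"]
    return templates
```

Everything the loader knows is visible on disk, which is the property the "boring" advice is after.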

Bruno Oliveira (pytest core maintainer; plugin ecosystem at scale)

Likely emphasis: pytest proved that entry_points["pytest11"] + pip install scales to thousands of plugins, but version skew (pytest>=8 vs a plugin expecting 7) is where users bleed. He would ask what Dataface’s compatibility contract for dataface.charts is: semver on the host, capability flags, or an import-time version check. He would like required_packages in YAML as a second signal next to the lockfile for “this dashboard repo expects these dists.” He might suggest namespacing type ids (acme.stream vs stream) to avoid collisions when many entry points load. Collect all errors (every missing or broken plugin) rather than failing fast on the first one; pytest’s collection phase taught that.

Takeaway: Plan host–extension version story and collision-free ids early; aggregate diagnostic errors for missing/invalid extensions.
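A sketch of collect-all-errors discovery over the dataface.charts entry-point group, using only importlib.metadata (Python 3.10+ for the `group=` selection). The `load_chart_types` name, the rule that ids must contain a namespace dot, and the message wording are assumptions for illustration; the source only says namespacing is worth considering.

```python
from __future__ import annotations

from importlib.metadata import EntryPoint, entry_points
from typing import Any, Iterable

GROUP = "dataface.charts"


def load_chart_types(
    eps: Iterable[EntryPoint] | None = None,
) -> tuple[dict[str, Any], list[str]]:
    """Load every entry point, returning (registry, errors) instead of failing fast."""
    if eps is None:
        eps = entry_points(group=GROUP)
    registry: dict[str, Any] = {}
    errors: list[str] = []
    for ep in eps:
        if "." not in ep.name:
            # Hypothetical policy: require a namespace prefix such as "acme.".
            errors.append(f"{ep.name}: chart type id must be namespaced, e.g. acme.{ep.name}")
            continue
        if ep.name in registry:
            errors.append(f"{ep.name}: duplicate chart type id")
            continue
        try:
            registry[ep.name] = ep.load()
        except Exception as exc:  # keep going so the user sees every problem at once
            errors.append(f"{ep.name}: failed to load ({type(exc).__name__}: {exc})")
    return registry, errors
```

Passing `eps` explicitly keeps the function testable without installing real plugins; in normal use the default reads whatever the environment has installed.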

What we learn from the trio

| Theme | Action |
| --- | --- |
| Truth lives in the env | pyproject + lockfile define what is installed; YAML allowlists and required_packages are policy + verification, not installers. |
| Boring discovery | Entry points are enough; avoid smart auto-download from Git URLs in YAML. |
| Don’t over-build policy | Strict YAML allowlists (allow_types, expose_builtins, …) are an M5 placeholder that may never ship (spec §4.1). Default: permissive; any installed entry point and local_modules (spec §4.1b) stay usable without opting in. |
| Ecosystem pain | Document Dataface major version ↔ extension API expectations; consider namespaced chart type ids. |
| UX | One round of collect-all-errors during compile beats a mysterious first failure. |

9. References (external)


10. Open questions (for spec / decisions)

  1. Do we standardize on one escape hatch first: raw VL only, or VL + Vega passthrough, or also Python SVG?
  2. How do extension IDs appear in YAML (type: custom.streamgraph vs type: streamgraph with registry)?
  3. (N/A for core — no first-class HTML/form extension path.) If embedders use foreignObject or surrounding HTML, CSP and sanitization are their responsibility.