M2 RunCache integration contract and adoption plan

ID	INTEGRATIONS_PLATFORM-M2_RUNCACHE_INTEGRATION_CONTRACT_AND_ADOPTION_PLAN
Status	completed
Priority	p1
Milestone	m2-internal-adoption-design-partners
Owner	head-of-engineering
Completed by	dave
Completed	2026-03-23

Problem

Run Cache is now a real internal/external surface (runcache.com, production/staging infra, and active dbt-run-cache support in engineering repos), but Dataface has no explicit M2 contract for how it should integrate with Run Cache execution, decision visibility, and failure handling. Without this contract, design-partner adoption is risky: behavior may be inconsistent across local and hosted environments, and usage in /Users/dave.fowler/Fivetran/analytics may diverge from general dbt-native repo behavior.

We need an M2 task that defines what Dataface supports, what remains pass-through to dbt/Run Cache, and how users can debug cache decisions end-to-end.

Context

Internal references show Run Cache is actively deployed and has been moved onto runcache.com endpoints, with recent infra and Argo CD changes.
fivetran/engineering has recent dbt-run-cache support work in dbt agent images (POC closed after prod rollout).
Internal analytics guidance already assumes Run Cache usage in dbt workflows and documents key flags/config:
--favor-state
DBT_RUN_CACHE_DISABLED
DBT_RUN_CACHE_FRESHNESS_TOLERANCE
DBT_RUN_CACHE_TOLERATE_NONDETERMINISM
DBT_RUN_CACHE_CLONE_INCREMENTAL_IN_DEV
The internal analytics repo (/Users/dave.fowler/Fivetran/analytics) is a first-class Dataface proving ground for dbt-native workflows; Run Cache support must work there and in generic dbt repos bootstrapped by dft init.
Scope boundary: this task defines Dataface integration expectations and instrumentation; it does not replace Run Cache internals.
Codebase audit (2026-03-23): No Run Cache-specific code exists in Dataface today. The dbt execution path (dataface/core/execute/adapters/dbt_adapter.py) uses dbtRunner.invoke() which inherits the process environment — so Run Cache env vars already propagate if set, but there is zero observability, no structured logging of cache decisions, and no validation that the plugin is active.
The DbtAdapter resolves profiles via dbt_project.yml walk-up and supports configurable target_name, meaning analytics repo and generic repo paths both work through the same code path.
The existing QueryResult type could carry cache decision metadata without breaking schema changes (via optional fields or a diagnostics sidecar).
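
As a rough illustration of the audit findings above, the sketch below snapshots the Run Cache variables that an in-process dbt invocation would inherit and attaches them to a result through an optional field. The names snapshot_run_cache_env, RunCacheDiagnostics, and QueryResultSketch are hypothetical; QueryResultSketch is only a stand-in and does not reflect the real QueryResult fields. --favor-state is a CLI flag rather than an env var, so it is not captured here.

```python
import os
from dataclasses import dataclass, field
from typing import Any, Dict, List, Mapping, Optional

# Env vars named in the internal analytics guidance.
RUN_CACHE_ENV_VARS = (
    "DBT_RUN_CACHE_DISABLED",
    "DBT_RUN_CACHE_FRESHNESS_TOLERANCE",
    "DBT_RUN_CACHE_TOLERATE_NONDETERMINISM",
    "DBT_RUN_CACHE_CLONE_INCREMENTAL_IN_DEV",
)


def snapshot_run_cache_env(
    environ: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Return the Run Cache variables present in the given environment.

    Defaults to os.environ, which is what an in-process dbt run inherits.
    """
    source = os.environ if environ is None else environ
    return {name: source[name] for name in RUN_CACHE_ENV_VARS if name in source}


@dataclass
class RunCacheDiagnostics:
    """Hypothetical diagnostics sidecar; not an existing Dataface type."""

    env: Dict[str, str] = field(default_factory=dict)


@dataclass
class QueryResultSketch:
    """Stand-in for Dataface's QueryResult; the real fields are assumptions."""

    rows: List[Dict[str, Any]]
    # Optional with a None default, so callers that ignore it are unaffected.
    run_cache: Optional[RunCacheDiagnostics] = None
```

Because the new field defaults to None, code that constructs or reads results without Run Cache metadata would keep working unchanged under this design.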

Possible Solutions

Recommended: Dataface as a Run Cache-aware dbt orchestrator (contract + observability, minimal opinionated behavior)
  - Add a documented “Run Cache integration mode” that forwards Run Cache env/config into dbt execution, captures decision artifacts, and surfaces an explain view in logs/UI.
  - Pros: low-risk, fast to ship in M2, consistent with dbt-native workflows, portable to Fivetran/analytics.
  - Cons: still depends on external Run Cache service health and plugin behavior.
  - Selected. Aligns with Dataface's dbt-native philosophy: Dataface orchestrates, dbt owns execution, Run Cache owns caching. Codebase audit confirms env passthrough already works; the gap is observability and documentation.
Tight coupling: embed direct Run Cache API dependencies in Dataface runtime
  - Pros: richer native UX potential.
  - Cons: high coupling and larger blast radius; not M2-friendly.
  - Rejected. Adds a hard service dependency to Dataface's execute layer, breaking offline/local-first workflows.
No explicit integration: rely on user-managed dbt flags/env only
  - Pros: lowest implementation effort.
  - Cons: poor debuggability, inconsistent behavior, high support burden.
  - Rejected. Design partners need debuggability; “it works if you set the right env vars” is not an adoption-ready answer.

Plan

Define M2 support contract (this task's primary deliverable):
  - Supported execution modes (local CLI, hosted, analytics repo path).
  - Supported env vars/flags pass-through and defaults (no Dataface-side defaults; user-configured).
  - Explicit non-goals for M2 (no direct API calls, no cache invalidation UI, no auto --favor-state).
  - Deliverable: ai_notes/integrations/M2_RUNCACHE_CONTRACT.md, the canonical contract document.
Verify execution plumbing (validation, not new code):
  - Confirm the DbtAdapter execution path (dbtRunner.invoke()) inherits the process env (already true per the codebase audit).
  - Confirm profile/target resolution works for the analytics repo and dft init repos.
  - File follow-up tasks for structured logging and artifact capture (implementation work).
Spec decision introspection (design, not implementation):
  - Define the structured log format for cache decisions.
  - Define the optional --explain-cache CLI flag and /diagnostics/runcache endpoint shape.
  - The flag and endpoint are stretch goals for M2; the minimum viable surface is structured logging.
Validate in two real paths (manual verification):
  - /Users/dave.fowler/Fivetran/analytics/dbt_ft_prod: confirm dft render works with Run Cache env vars active.
  - A generic dbt repo initialized via dft init: confirm plugin discovery and passthrough.
Document operator runbook:
  - Failure modes, fallback (DBT_RUN_CACHE_DISABLED=1), and triage checklist.
  - Included in contract document (section 7).
Rollout guardrails:
  - Design-partner checklist (plugin installed, env configured, logs verified, fallback tested).
  - Success metrics: setup time < 30 min, triage time < 5 min, no Dataface-side code changes needed for basic cache usage.
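
To make the structured-logging and fallback steps in the plan more concrete, here is a minimal sketch of what a one-line JSON cache-decision record and a fallback environment could look like. The function names, the logger name, the event name, and the record fields (model, decision, reason) are illustrative assumptions, not the format defined in the contract document. The only detail taken from the plan is DBT_RUN_CACHE_DISABLED=1 as the fallback switch.

```python
import json
import logging
import os
from typing import Dict, Mapping, Optional

logger = logging.getLogger("dataface.runcache")  # hypothetical logger name


def cache_decision_record(model: str, decision: str, reason: str) -> str:
    """Serialize one cache decision as a single JSON log line.

    Field names and decision values are assumptions made for illustration.
    """
    return json.dumps(
        {
            "event": "runcache.decision",
            "model": model,
            "decision": decision,
            "reason": reason,
        },
        sort_keys=True,
    )


def fallback_env(environ: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with Run Cache disabled."""
    env = dict(os.environ if environ is None else environ)
    env["DBT_RUN_CACHE_DISABLED"] = "1"
    return env


def log_decision(model: str, decision: str, reason: str) -> None:
    """Emit the record at INFO so operators can filter on the event name."""
    logger.info(cache_decision_record(model, decision, reason))
```

A copied environment like this suits a subprocess-style invocation. For the in-process dbtRunner path described in the audit, the variable would instead need to be set in the process environment before the invoke() call, since that path inherits whatever the process already has.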

Implementation Progress

2026-03-23: Task created from Run Cache discovery pass (Slab + GitHub + Jira context) with explicit M2 scope and analytics-repo compatibility requirement.
2026-03-23: Recommended approach selected: Run Cache-aware orchestration with strong observability, not deep runtime coupling.
2026-03-23: Codebase audit completed. Confirmed no Run Cache code exists in Dataface. DbtAdapter uses dbtRunner.invoke() which inherits process env — basic passthrough already works. Gap is observability and documentation.
2026-03-23: Contract document created at ai_notes/integrations/M2_RUNCACHE_CONTRACT.md. Covers: integration model, supported env vars/flags, execution modes (CLI/Cloud/analytics), decision tracing spec, non-goals, failure modes + operator runbook, rollout plan with design-partner checklist, and success metrics.
2026-03-23: Task sections (Context, Possible Solutions, Plan) enriched with codebase audit findings and explicit deliverable references.

QA Exploration

[x] QA exploration completed (or N/A for non-UI tasks)
N/A: non-UI planning/integration task.

Review Feedback

[ ] Review cleared