Dataface Tasks

Ensure dbt generate_schema_name-safe BigQuery ref source resolution

ID: DFT_CORE-ENSURE_DBT_GENERATE_SCHEMA_NAME_SAFE_BIGQUERY_REF_SOURCE_RESOLUTION
Status: completed
Priority: p1
Milestone: m1-ft-analytics-analyst-pilot
Owner: sr-engineer-architect
Completed by: dave
Completed: 2026-03-23

Problem

Dataface currently resolves dbt ref() and source() calls via a lightweight manifest parser that reconstructs identifiers as schema.alias (and for sources schema.table) instead of using dbt's full relation identity (relation_name or adapter relation object). This is risky for BigQuery because project/dataset/table identity and quoting semantics can be lost.

For ~/Fivetran/analytics/dbt_ft_prod, which overrides generate_schema_name, Dataface query execution must resolve each ref() and source() to the same relation name that dbt itself would produce for BigQuery.
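As a rough illustration of the risk, the sketch below compares a two-part schema.alias reconstruction with the relation_name string stored in the manifest. The node values (project, dataset, table) and both helper function names are hypothetical; only the field names follow dbt's manifest.json.

```python
# Hypothetical manifest node; keys mirror dbt manifest.json node fields.
node = {
    "database": "analytics-prod",   # BigQuery project (made-up value)
    "schema": "marts_finance",      # dataset as written to the manifest
    "alias": "orders",
    "relation_name": "`analytics-prod`.`marts_finance`.`orders`",
}


def legacy_identifier(node: dict) -> str:
    """Two-part reconstruction: drops the project and the quoting."""
    return f"{node['schema']}.{node['alias']}"


def relation_identifier(node: dict) -> str:
    """Use the relation string that dbt already rendered for the adapter."""
    return node["relation_name"]


legacy = legacy_identifier(node)
full = relation_identifier(node)
print(legacy)
print(full)
```

The two-part form depends on whatever default project the BigQuery client is configured with, while the relation_name form carries the project and the backtick quoting that a hyphenated project ID needs.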

Context

  • dataface/core/execute/adapters/dbt_utils.py currently resolves ref/source using schema + alias/name, not relation_name.
  • dataface/core/serve/server.py registers adapters in order: DbtAdapter then SqlAdapter.
  • DbtAdapter._can_execute() declines SQL queries with inline source dicts or profile refs; those are routed to SqlAdapter.
  • In executor flow (dataface/core/execute/executor.py), named sources from dataface.yml are resolved to full source dicts before adapter execution, which means analytics-style type: bigquery source usage often bypasses DbtAdapter.
  • ~/Fivetran/analytics/dbt_ft_prod/macros/dbt_overrides/generate_schema_name.sql exists and is active in that repo.
  • Current consequence: Dataface may not always use the same adapter-level relation resolution semantics as dbt for BigQuery queries in analytics/dbt installs.
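The routing described above can be pictured with a minimal sketch. The classes below are simplified stand-ins, not the real DbtAdapter or SqlAdapter, and the query dict shape (sql, source, profile keys) is an assumption made for illustration.

```python
from typing import Optional


class SketchDbtAdapter:
    """Stand-in for DbtAdapter: declines inline source dicts and profile refs."""

    name = "dbt"

    def can_execute(self, query: dict) -> bool:
        if isinstance(query.get("source"), dict):
            return False  # inline source dict: leave it to the SQL adapter
        if query.get("profile"):
            return False  # profile reference: leave it to the SQL adapter
        return "sql" in query


class SketchSqlAdapter:
    """Stand-in for SqlAdapter: accepts any query that carries SQL."""

    name = "sql"

    def can_execute(self, query: dict) -> bool:
        return "sql" in query


# Registration order matters: the dbt adapter is asked first.
ADAPTERS = [SketchDbtAdapter(), SketchSqlAdapter()]


def pick_adapter(query: dict) -> Optional[str]:
    """Return the name of the first adapter that accepts the query."""
    for adapter in ADAPTERS:
        if adapter.can_execute(query):
            return adapter.name
    return None


plain = {"sql": "select * from {{ ref('orders') }}"}
# A named source that the executor already expanded to a full dict (assumed shape).
normalized = {
    "sql": "select * from {{ ref('orders') }}",
    "source": {"type": "bigquery", "project": "analytics-prod"},
}
print(pick_adapter(plain), pick_adapter(normalized))
```

Because the second query reaches the SQL adapter, ref/source resolution has to be correct on that path as well, not only inside the dbt adapter.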

Possible Solutions

  1. Recommended: Make manifest resolution relation-first (relation_name) and BigQuery-safe
     - Update resolve_dbt_refs() to prefer relation_name from manifest nodes/sources, with robust fallback only when absent.
     - Ensure quoting/identifier shape is preserved for BigQuery (project.dataset.table).
     - Pros: minimal architectural change, aligns with dbt artifact truth, works for both adapters.
     - Cons: still artifact-driven (depends on manifest freshness).
  2. Force all dbt SQL through dbt adapter execution path
     - Pros: closest runtime behavior to dbt internals.
     - Cons: larger change, not compatible with all current source-config override flows.
  3. Keep current behavior and document limitations
     - Pros: no implementation effort.
     - Cons: known correctness risk for BigQuery naming semantics and analytics parity.
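A minimal sketch of what the recommended relation-first resolution could look like follows. It is not the actual resolve_dbt_refs() code: the function name expand_refs, the regexes, and the manifest contents are illustrative assumptions, and the fallback simply joins whichever of database, schema, and alias/identifier are present.

```python
import re

REF_RE = re.compile(r"\{\{\s*ref\(\s*['\"]([^'\"]+)['\"]\s*\)\s*\}\}")
SOURCE_RE = re.compile(
    r"\{\{\s*source\(\s*['\"]([^'\"]+)['\"]\s*,\s*['\"]([^'\"]+)['\"]\s*\)\s*\}\}"
)


def _relation(entry: dict, leaf_key: str) -> str:
    """Prefer relation_name; otherwise join the identifier parts that exist."""
    relation_name = entry.get("relation_name")
    if relation_name:
        return relation_name
    leaf = entry.get(leaf_key) or entry["name"]
    parts = [entry.get("database"), entry.get("schema"), leaf]
    return ".".join(p for p in parts if p)


def expand_refs(sql: str, manifest: dict) -> str:
    """Replace ref()/source() calls with relations taken from the manifest."""
    models = {
        n["name"]: n
        for n in manifest.get("nodes", {}).values()
        if n.get("resource_type") == "model"
    }
    sources = {
        (s["source_name"], s["name"]): s
        for s in manifest.get("sources", {}).values()
    }

    def ref_sub(match: re.Match) -> str:
        node = models.get(match.group(1))
        if node is None:
            raise KeyError(f"unknown model: {match.group(1)}")
        return _relation(node, "alias")

    def source_sub(match: re.Match) -> str:
        key = (match.group(1), match.group(2))
        if key not in sources:
            raise KeyError(f"unknown source: {key}")
        return _relation(sources[key], "identifier")

    return SOURCE_RE.sub(source_sub, REF_RE.sub(ref_sub, sql))


# Hypothetical manifest: one node with relation_name, two entries without.
manifest = {
    "nodes": {
        "model.pkg.orders": {
            "resource_type": "model", "name": "orders",
            "database": "analytics-prod", "schema": "marts_finance",
            "alias": "orders",
            "relation_name": "`analytics-prod`.`marts_finance`.`orders`",
        },
        "model.pkg.legacy": {
            "resource_type": "model", "name": "legacy",
            "schema": "staging", "alias": "legacy_v2",
        },
    },
    "sources": {
        "source.pkg.raw.events": {
            "source_name": "raw", "name": "events",
            "schema": "raw_data", "identifier": "events_tbl",
        },
    },
}
print(expand_refs("select * from {{ ref('orders') }}", manifest))
```

Keeping the fallback in one helper means the ref and source paths cannot drift apart, and an unknown name raises instead of silently producing a wrong identifier.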

Plan

  1. Replace ref/source identifier construction in dataface/core/execute/adapters/dbt_utils.py:
     - Prefer manifest relation_name.
     - Use db/schema/alias fallback only when relation_name is unavailable.
  2. Add tests in adapter unit coverage:
     - BigQuery manifest fixture with project+dataset+table relation names.
     - Verify ref/source expansion preserves full relation identity.
     - Verify fallback path behavior remains stable.
  3. Add an analytics compatibility check:
     - Document and test expected behavior for dbt repos with generate_schema_name overrides.
  4. Validate both execution paths:
     - DbtAdapter path.
     - SqlAdapter path after source/profile normalization.
  5. Update docs/troubleshooting with manifest freshness and relation resolution notes.
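For the manifest freshness note in step 5, a troubleshooting check might look like the sketch below. It reads metadata.generated_at, which dbt writes into manifest.json; the helper names and the 24-hour threshold are assumptions for illustration, not Dataface behavior.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def manifest_age(manifest: dict, now: Optional[datetime] = None) -> timedelta:
    """Return how long ago the manifest was generated."""
    generated_at = manifest["metadata"]["generated_at"]
    # Normalize a trailing "Z" so older Python versions can parse the value.
    ts = datetime.fromisoformat(generated_at.replace("Z", "+00:00"))
    if ts.tzinfo is None:
        ts = ts.replace(tzinfo=timezone.utc)  # treat naive values as UTC
    current = now or datetime.now(timezone.utc)
    return current - ts


def is_stale(
    manifest: dict,
    max_age: timedelta = timedelta(hours=24),  # assumed threshold
    now: Optional[datetime] = None,
) -> bool:
    """True when the manifest is older than max_age."""
    return manifest_age(manifest, now) > max_age


# Hypothetical manifest metadata and a fixed "now" for a repeatable check.
sample = {"metadata": {"generated_at": "2026-03-22T00:00:00.000000Z"}}
fixed_now = datetime(2026, 3, 23, 12, 0, tzinfo=timezone.utc)
print(manifest_age(sample, fixed_now), is_stale(sample, now=fixed_now))
```

A stale manifest matters here because relation-first resolution is only as accurate as the last dbt parse or run that produced the artifact.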

Implementation Progress

  • 2026-03-23: Investigation complete.
     - Confirmed current resolution code uses schema.alias and can drop full relation identity details.
     - Confirmed analytics repo has generate_schema_name overrides in dbt_ft_prod.
     - Confirmed source-config queries can bypass DbtAdapter and run in SqlAdapter.
  • 2026-03-23: Implementation complete (Solution 1: relation-first resolution).
     - Updated resolve_dbt_refs() in dbt_utils.py: both resolve_ref and resolve_source now prefer relation_name from the manifest when present, with schema.alias/schema.table fallback.
     - Added TestResolveDbtRefs class (6 tests) in tests/core/test_adapters.py covering: relation_name preference for ref and source, fallback paths, BigQuery three-part names, and generate_schema_name override scenarios.
     - All 54 adapter tests pass, zero regressions.

QA Exploration

  • [x] QA exploration completed (or N/A for non-UI tasks)
  • N/A: non-UI task (execution adapter correctness + tests).

Review Feedback

  • [ ] Review cleared