Vendor faketran as a monorepo lib and replace mockusign/gruber datasets
Problem
The repo still relies on the lightweight mockusign and gruber example datasets for fixtures, dbt examples, and local demo flows. Those examples are simpler than the faketran datasets and do not reflect the richer schemas and cross-table realism available in https://github.com/fivetran/faketran. That leaves demos, dashboard-pack examples, and seed-based onboarding flows anchored to lower-fidelity data than the team now wants to showcase.
We need to pull faketran into this repository as a monorepo-owned library/module, not leave it as a separate external repo dependency. That imported library should hold the canonical upstream-derived assets, while the runnable Dataface demo projects should live in this repo's examples/ tree as first-class repo-owned examples. On top of that, we need to swap the current examples to stronger datasets, starting with dundersign and optionally a second faketran example to replace the current mockusign/gruber pair. The change needs to preserve working local examples, keep the data-shape boundary intact, and update any docs, tests, seed commands, and references that assume the old dataset names.
Context
- `faketran` is not currently present in this repo as a monorepo library, submodule, or checked-in example directory.
- Current example assets live in:
  - `examples/mockusign_dbt/`
  - `examples/gruber_dbt/`
  - `apps/cloud/fixtures/data/mockusign/`
  - `apps/cloud/fixtures/data/gruber/`
- Seeded demo/org references also appear in `apps/cloud/apps/projects/management/commands/seed_dev_data.py`.
- Integration tests explicitly mention the current examples in `tests/integration/test_examples.py`.
- There is already related work in `tasks/workstreams/ft-dash-packs/tasks/issue-306-transform-mockusign-dbt-into-realistic-dbt-project-with-staging-.md`, but that task upgrades `mockusign_dbt` in place rather than replacing the example source with vendored `faketran` datasets.
- Target structure:
  - `libs/faketran/` or equivalent monorepo lib path for imported canonical `faketran` assets.
  - `examples/` for promoted runnable Dataface examples such as `dundersign_dbt`.
  - Avoid long-term duplication where the same example lives both inside the vendored lib and as a separate repo example.
- Constraint: the repo should keep ownership of query semantics and model shape while using richer example data; the visualization layer should not take on data-meaning responsibilities.
- Constraint: any direct edits to task files must preserve frontmatter and validate cleanly through the task CLI.
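The duplication concern in the target structure above can be checked mechanically. The following is a minimal sketch, not part of the repo: the helper name `duplicated_files` is hypothetical, and it assumes that "the same example living in both places" shows up as byte-identical files under the vendored lib and under `examples/`.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def duplicated_files(vendored_root, examples_root):
    """Return (vendored_path, example_path) pairs with byte-identical contents."""
    # Index every non-empty vendored file by the SHA-256 of its contents.
    by_digest = defaultdict(list)
    for path in sorted(Path(vendored_root).rglob("*")):
        if path.is_file() and path.stat().st_size > 0:
            by_digest[hashlib.sha256(path.read_bytes()).hexdigest()].append(path)

    # Look up each example file in that index.
    pairs = []
    for path in sorted(Path(examples_root).rglob("*")):
        if path.is_file() and path.stat().st_size > 0:
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            pairs.extend((vendored, path) for vendored in by_digest.get(digest, []))
    return pairs
```

Calling `duplicated_files("libs/faketran", "examples")` from the repo root would list candidate duplicates for review. Identical files are not always a problem (for example, small config files), so the output is a prompt for inspection rather than a hard failure.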
Possible Solutions
- Recommended: Pull `faketran` into this monorepo as a repo-owned library/module with a clear internal location and provenance notes, then promote the chosen runnable examples into this repo's `examples/` tree while replacing `mockusign` and `gruber` references with `dundersign` plus one other selected `faketran` dataset.
  - Pros: keeps upstream-derived assets separate from Dataface-owned example projects, preserves `examples/` as the obvious place for demos, and makes schema/dbt/docs/test updates easier to coordinate in one change.
  - Cons: requires choosing a stable home for the imported library/module, defining how examples consume the canonical data, and doing a one-time migration sweep across example seeds, fixtures, tests, and docs.
- Add `faketran` as a git submodule or external checkout and point the repo at it.
  - Pros: simpler upstream syncing with the source repo.
  - Cons: makes local setup and CI more fragile, keeps examples split across repos, and complicates stable demo/test inputs.
- Keep `mockusign`/`gruber` but backfill them with richer schemas inspired by `faketran`.
  - Pros: smaller naming migration.
  - Cons: duplicates effort already embodied in `faketran`, keeps the weaker branding/examples, and still requires reconstructing better datasets manually.
Plan
- Decide where `faketran` should live inside the monorepo and import it there as a repo-owned library/module with clear provenance and update guidance.
- Inventory all references to `mockusign` and `gruber` across example projects, seed fixtures, seeded org/project metadata, tests, and docs.
- Review `faketran` and select the target replacement datasets, with `dundersign` as the default primary example and a second dataset chosen based on schema quality and demo usefulness.
- Promote the selected runnable examples into `examples/` as first-class Dataface examples instead of treating the vendored lib directory as the public demo surface.
- Wire the vendored `faketran` library/module into the repo's example/fixture workflows so dataset consumers use the in-repo source of truth without unnecessary duplication.
- Update dbt example projects, seed files, and any fixture-loading flows to use the new datasets and names.
- Update seeded app/demo metadata and project references so local environments expose the new examples instead of `mockusign`/`gruber`.
- Refresh docs and tests that reference the old examples; add or update checks that prove the new examples still build and render.
- Document the vendoring/update policy so future `faketran` syncs are intentional rather than ad hoc.
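The inventory step in the plan above can be scripted. This is a minimal sketch under stated assumptions: `find_references` and the `SKIP_DIRS` list are hypothetical and not part of the repo, and matching is a case-insensitive substring search over UTF-8 text files.

```python
import re
from pathlib import Path

# Old dataset names come from this task; the skip list is an assumption.
OLD_NAMES = ("mockusign", "gruber")
SKIP_DIRS = {".git", "node_modules", ".venv", "__pycache__"}


def find_references(root, names=OLD_NAMES):
    """Map each relative file path under root to its (line_no, text) hits."""
    root = Path(root)
    pattern = re.compile("|".join(re.escape(n) for n in names), re.IGNORECASE)
    hits = {}
    for path in sorted(root.rglob("*")):
        relative = path.relative_to(root)
        if not path.is_file() or SKIP_DIRS & set(relative.parts):
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        matches = [
            (number, line.strip())
            for number, line in enumerate(text.splitlines(), start=1)
            if pattern.search(line)
        ]
        # A path named after an old dataset counts even with no content hits.
        if matches or pattern.search(relative.as_posix()):
            hits[relative.as_posix()] = matches
    return hits
```

Running `find_references(".")` from the repo root gives a checklist for the migration sweep. Files that appear with an empty hit list are there only because of their path, which flags directories and fixtures that need renaming or removal.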
Implementation Progress
- Task created from user request to pull `faketran` into the monorepo and replace the current lower-fidelity example datasets.
- Initial repo audit found `mockusign` and `gruber` in checked-in fixtures, dbt example directories, seed-dev-data setup, docs, and integration test skips.
- Initial repo audit found no checked-in `faketran` library/submodule/example directory in this worktree.
- Vendored the upstream generator framework into `libs/faketran/`, preserved upstream README provenance, and added a Dataface-specific `faketran-export-examples` command for curated seed export.
- Promoted new repo-owned example projects under `examples/dundersign_dbt/` and `examples/pied_piper_dbt/` with dashboard YAML tailored to the curated faketran exports.
- Exported curated CSV seeds and matching cloud fixtures for both `dundersign` and `pied_piper` (`daily_metrics`, `users`, `documents`, `subscriptions`, `opportunities`, `tickets`, `workforce`).
- Updated `seed_dev_data` to seed `dundersign/signing-analytics` and `piedpiper/platform-analytics`, and removed the old `gruber`/`mockusign` example directories and fixtures.
- Fixed an upstream syntax defect in `libs/faketran/fake_companies/pied_piper/generate.py` that blocked the vendored Pied Piper generator import.
- Updated the example integration test harness so `_dbt` examples with local `seeds/` resolve relative assets correctly.
- Validated:
  - `uv run --extra dev pytest tests/integration/test_examples.py -q`
  - `PYTHONPATH=/Users/dave.fowler/.codex/worktrees/24ad/dataface uv run --extra cloud python -m apps.cloud.manage migrate`
  - `PYTHONPATH=/Users/dave.fowler/.codex/worktrees/24ad/dataface uv run --extra cloud python -m apps.cloud.manage seed_dev_data --reset`
  - `uv run --extra dev ruff check libs/faketran/faketran/export_dataface_examples.py apps/cloud/apps/projects/management/commands/seed_dev_data.py tests/integration/test_examples.py`
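A quick completeness check on the exported seeds could complement the validation commands above. This is a sketch under stated assumptions rather than the harness's actual logic: `missing_seeds` is a hypothetical helper, and the `seeds/<name>.csv` layout is assumed from the seed names listed in this task.

```python
from pathlib import Path

# Table names as listed in this task's export step.
EXPECTED_SEEDS = (
    "daily_metrics", "users", "documents", "subscriptions",
    "opportunities", "tickets", "workforce",
)


def missing_seeds(example_dir, expected=EXPECTED_SEEDS):
    """Return expected seed names with no non-empty CSV in example_dir/seeds/."""
    seeds_dir = Path(example_dir) / "seeds"  # assumed layout: seeds/<name>.csv
    missing = []
    for name in expected:
        csv_path = seeds_dir / f"{name}.csv"
        # Treat an absent or zero-byte file as missing.
        if not csv_path.is_file() or csv_path.stat().st_size == 0:
            missing.append(name)
    return missing
```

Under that assumed layout, `missing_seeds("examples/dundersign_dbt")` returning an empty list would indicate that all seven curated seeds were exported.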
Review Feedback
- Fixed the vendored Pied Piper generator syntax error before exporting curated seeds so the second promoted example could be generated/exported cleanly from the monorepo copy.
- Local validation passed for example compilation/rendering, targeted lint, Django migrations, and `seed_dev_data` against the renamed example projects.
- [x] Review cleared