Medallion Architecture: 5 Failure Modes
Most bronze/silver/gold lakehouse builds repeat the same five mistakes. A practitioner's breakdown of medallion architecture failure modes — and the fixes.
The medallion architecture — bronze for raw, silver for cleaned, gold for business-ready — is the most-copied diagram in enterprise data. It’s on every lakehouse vendor’s slide. It’s in every onboarding doc. And in most of the engagements we walk into, it’s implemented wrong in a way that quietly recreates the exact problems it was supposed to prevent.
The pattern itself isn’t the problem. The problem is that “bronze, silver, gold” is a naming convention being mistaken for an architecture. Three folders with the right names don’t give you a clean data platform any more than three folders called model, view, controller give you good software. What matters is what each layer is responsible for, who owns it, and what’s guaranteed at each boundary — and that’s exactly the part teams skip when they copy the diagram.
This post is the cut we use when we audit a data engineering stack and find a medallion build that looks right on the architecture page and behaves like a swamp in production. Five failure modes, why each one happens, and what “implemented right” actually looks like.
Failure mode 1: Bronze that isn’t actually raw
Bronze has one job: be a faithful, append-only landing zone for source data exactly as it arrived. No renames, no type coercion, no dedup, no “light cleaning.” The whole value of bronze is that when silver breaks — and it will — you can replay from a layer you know hasn’t been touched.
What we actually find:
- Column renames “to be consistent” applied on ingest. Now you can’t diff bronze against the source system.
- Type casting in the bronze write — strings parsed to timestamps, decimals coerced — which silently drops or mangles the rows that don’t parse.
- Dedup logic in bronze “to save space.” Congratulations, you’ve thrown away the duplicates that would have told you the source system double-fired.
- Bronze that's actually a `MERGE` target, so it's mutable. Now it's not a log, it's just another table, and replay is impossible.
The fix is boring and strict: bronze is an immutable append-only log of what landed, with ingest metadata (source, load timestamp, batch id, file name) and nothing else. Schema-on-read, not schema-on-write. If you want a “cleaned but not yet conformed” zone, that’s a silver concern — give it its own table, don’t smuggle it into bronze.
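The rule is easier to see in code than in prose. A minimal sketch, using plain Python lists and dicts to stand in for a real table write; the metadata column names (`_ingest_source`, `_ingest_ts`, and so on) are illustrative, not a standard:

```python
from datetime import datetime, timezone
from typing import Any


def to_bronze(raw_rows: list[dict[str, Any]], source: str, batch_id: str,
              file_name: str, bronze: list[dict[str, Any]]) -> None:
    """Append rows to bronze exactly as they arrived: no renames, no casts,
    no dedup. Only ingest metadata is added (column names are illustrative)."""
    load_ts = datetime.now(timezone.utc).isoformat()
    for row in raw_rows:
        bronze.append({
            **row,  # source fields untouched, even the messy ones
            "_ingest_source": source,
            "_ingest_ts": load_ts,
            "_ingest_batch_id": batch_id,
            "_ingest_file": file_name,
        })
    # Note what is absent: no MERGE, no schema enforcement, no filtering.
    # Bad timestamps and duplicate rows land as-is; silver deals with them.


bronze_log: list[dict[str, Any]] = []
to_bronze([{"id": "1", "ts": "not-a-date"}, {"id": "1", "ts": "not-a-date"}],
          source="shopify", batch_id="b-001", file_name="orders_001.json",
          bronze=bronze_log)
# Both rows survive, duplicate and unparseable timestamp included,
# so a diff against the source system is still possible.
```

The point of the sketch is what the function refuses to do: any cleaning you're tempted to put here belongs one layer up.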
A corollary most teams miss: bronze grows forever, so it needs a retention and `VACUUM` policy from day one. Bronze isn't free storage; it's cheap storage with a lifecycle. Decide whether old raw partitions go to cold storage or get dropped, and write that down before the table is 40 TB and nobody wants to touch it.
Failure mode 2: Silver that’s just “bronze with nicer column names”
This is the most common one, and the most damaging. Silver is supposed to be where raw data becomes trustworthy, conformed data — deduplicated, validated against expectations, type-enforced, with business keys resolved and entities modeled. It’s the layer the rest of the company is supposed to be able to build on without re-litigating data quality.
What we find instead is a silver layer that did `SELECT` with renamed columns, a `CAST` or two, and maybe a `WHERE deleted = false`. The hard parts — deduplication on a real business key, late-arriving data handling, slowly-changing-dimension logic, conforming `customer_id` across three source systems that all mint their own IDs, rejecting or quarantining rows that fail validation instead of silently letting them through — none of that happened. So every gold table and every analyst ends up redoing it, inconsistently, and you're back to "the numbers don't match" — the problem a layered architecture was supposed to kill.
Silver done right has opinions:
- One conformed entity per table, modeled deliberately — not one silver table per source table. If `orders` lives in Shopify and NetSuite, silver has one `orders` entity, with provenance columns, not `shopify_orders_silver` and `netsuite_orders_silver`.
- Validation is enforced, not aspirational. Rows that fail expectations go to a quarantine table with the failure reason. You find out about bad data because the quarantine table has rows in it, not because a dashboard looks weird three weeks later.
- Idempotent merges on real keys, with explicit handling for updates, deletes, and late arrivals. Re-running the silver job twice produces the same table.
- History is a decision, not an accident. Type-2 dimensions, append-only fact tables, or current-state-only — pick per entity, document it, don’t let it emerge from whatever the merge happened to do.
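Two of those opinions, quarantine-on-failure and idempotent merges, can be sketched in a few lines. This is plain Python standing in for a real table merge; the field names (`order_id`, `amount`, `_ingest_ts`) and the use of ingest timestamps as a version are illustrative assumptions:

```python
def validate(rows, required=("order_id", "amount")):
    """Split rows into (good, quarantined). Quarantined rows keep a
    failure reason instead of being silently dropped."""
    good, quarantine = [], []
    for row in rows:
        missing = [f for f in required if row.get(f) in (None, "")]
        if missing:
            quarantine.append({**row, "_reject_reason": f"missing: {missing}"})
        else:
            good.append(row)
    return good, quarantine


def merge_silver(silver: dict, rows):
    """Idempotent upsert keyed on the business key: re-running the same
    batch leaves the table unchanged, and later versions win on updates."""
    for row in rows:
        key = row["order_id"]
        current = silver.get(key)
        # Ordering by _ingest_ts stands in for real late-arrival handling.
        if current is None or row["_ingest_ts"] >= current["_ingest_ts"]:
            silver[key] = row
    return silver


batch = [
    {"order_id": "A1", "amount": 10.0, "_ingest_ts": "2024-01-01T00:00:00Z"},
    {"order_id": "A1", "amount": 12.5, "_ingest_ts": "2024-01-02T00:00:00Z"},
    {"order_id": None, "amount": 3.0, "_ingest_ts": "2024-01-02T00:00:00Z"},
]
good, quarantined = validate(batch)
silver_orders: dict = {}
merge_silver(silver_orders, good)
merge_silver(silver_orders, good)  # re-run the batch: same table, by design
```

The second `merge_silver` call is the test that matters: if re-running a batch changes the table, the merge isn't idempotent and backfills will corrupt silver.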
If silver is thin, the whole stack is thin. Everything downstream inherits whatever silver didn’t do.
Failure mode 3: Gold as a dumping ground
Gold is supposed to be the business-facing layer: aggregated, denormalized, shaped for how people actually consume data — a handful of well-modeled marts and metrics that everyone agrees on.
The failure pattern is one gold table per dashboard. Someone needs a new Power BI report, so they write a new gold table that joins the same silver tables in a slightly different way with a slightly different definition of “active customer.” Six months later you have 340 gold tables, no two of which agree on revenue, and gold has become a personal-workspace junk drawer with a fancy name. The medallion diagram promised “business-ready, single source of truth.” What got built is the opposite.
Gold done right is small and shared:
- Conformed dimensions and fact tables that multiple marts reuse — a real dimensional model, or a real semantic layer, not ad-hoc wide tables.
- Metric definitions live in one place — a metrics/semantic layer, dbt metrics, a Fabric semantic model, whatever — so “revenue” and “active user” are defined once and consumed everywhere. If two gold tables can disagree on a core metric, you don’t have a gold layer, you have a spreadsheet farm.
- Consumer-shaped, but not consumer-owned. Marts are designed with the BI team, governed by the data team. The moment analysts can write directly to gold under deadline pressure, the layer is gone.
A useful test: if you can’t list your gold tables on one page and say who owns each metric, gold has already failed and you just haven’t measured the damage yet. This is the difference between a working data architecture and three layers of cargo cult.
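The "defined once, consumed everywhere" point is mechanical: a metric is a function, and every mart calls the same function. A minimal sketch, with hypothetical field names and a 90-day window chosen purely for illustration:

```python
def active_customers(orders, window_days=90, as_of_day=100):
    """The single definition of 'active customer': placed an order within
    the window. Day numbers stand in for real timestamps; the 90-day
    window is an illustrative choice, not a recommendation."""
    cutoff = as_of_day - window_days
    return {o["customer_id"] for o in orders if o["order_day"] >= cutoff}


orders = [
    {"customer_id": "c1", "order_day": 95},
    {"customer_id": "c2", "order_day": 5},   # outside the window
]

# Both the exec dashboard and the churn mart consume this one definition,
# so they cannot disagree on what "active" means.
active = active_customers(orders)
```

In practice the "function" is a dbt metric, a semantic-model measure, or a shared view rather than Python, but the invariant is the same: one definition, many consumers, zero copies.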
Failure mode 4: Layers as folders, not contracts
Here’s the structural one. In a healthy medallion build, each layer boundary is a contract: a guarantee about schema, quality, freshness, and ownership that the consuming layer can rely on. Bronze → silver guarantees completeness and replayability. Silver → gold guarantees conformed, validated entities. Each transition is a checkpoint where data quality is asserted, not assumed.
In most builds, the “layers” are just three schemas in the catalog, with nothing enforced at the boundaries. No data quality tests between bronze and silver. No schema contract that fails the build when a source adds a column or changes a type. No freshness SLA. No documented owner per dataset. Schema drift in a source ripples straight through to gold, and the first person to notice is an executive looking at a broken dashboard.
And because there’s no contract, there’s no enforcement of direction either. Under delivery pressure, someone builds a gold table directly off bronze “just this once” to hit a deadline. Now you have a path that skips silver entirely — skips all the validation and conformance — and it’s load-bearing. The architecture diagram still shows three clean layers; the lineage graph shows spaghetti.
The fix is to treat boundaries as the actual deliverable:
- Quality gates between every layer. Bronze → silver: completeness, uniqueness, row-count sanity. Silver → gold: referential integrity, aggregation correctness, metric reconciliation. A failed gate stops the pipeline; it doesn’t log a warning nobody reads.
- Schema contracts at the source boundary so an upstream change breaks the build loudly, in CI, not silently in production three layers later.
- One owner per dataset, written down. Not “the data team.” A name.
- No layer-skipping. Ever. If gold needs something silver doesn’t have, the answer is “add it to silver,” not “reach into bronze.” The diagram’s value is the constraint.
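What "a failed gate stops the pipeline" means concretely: the check raises, and the orchestrator treats the task as failed. A minimal sketch of a bronze → silver gate with row-count and uniqueness checks; the checks and the `order_id` key are illustrative, and a real build would use a framework like Great Expectations or dbt tests:

```python
class GateFailure(Exception):
    """Raised when a boundary check fails. The pipeline stops here;
    nothing downstream runs on unverified data."""


def bronze_to_silver_gate(bronze_rows, silver_rows, key="order_id"):
    """Illustrative checks at the bronze -> silver boundary:
    row-count sanity and uniqueness on the business key."""
    if bronze_rows and not silver_rows:
        raise GateFailure("silver is empty but bronze is not")
    keys = [r[key] for r in silver_rows]
    if len(keys) != len(set(keys)):
        raise GateFailure(f"duplicate business keys in silver on '{key}'")
    # Silver may dedup bronze, but should never exceed it.
    if len(silver_rows) > len(bronze_rows):
        raise GateFailure("silver has more rows than bronze")


bronze_rows = [{"order_id": "A1"}, {"order_id": "A1"}, {"order_id": "B2"}]
silver_rows = [{"order_id": "A1"}, {"order_id": "B2"}]
bronze_to_silver_gate(bronze_rows, silver_rows)  # passes: dedup is fine

failure = ""
try:
    # A botched merge that duplicated a key is caught at the boundary.
    bronze_to_silver_gate(bronze_rows, silver_rows + [{"order_id": "B2"}])
except GateFailure as exc:
    failure = str(exc)
```

The design choice is the exception, not the checks: a gate that logs and continues is documentation, not a contract.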
This is where the medallion pattern earns its keep — or doesn’t. Folders are free. Contracts are the work.
Failure mode 5: Three layers whether the data needs them or not
The flip side of cargo-culting the pattern: applying it religiously to data that doesn’t need three hops. A reference table with 200 rows that updates twice a year does not need bronze, silver, and gold. A single source feeding a single mart with no other consumers does not need three layers of indirection. But because “we do medallion,” it gets the full treatment — three tables, three jobs, three sets of tests — for data where the overhead exceeds the value by an order of magnitude.
The result is a platform where 80% of the engineering effort goes into ceremonially moving small, simple datasets through layers they don’t need, while the genuinely complex, multi-source entities — the ones that actually justify silver’s conformance work — get rushed because the team is out of time.
Medallion is a default, not a mandate. The judgment call is per-dataset:
- Lots of sources, real conformance work, many downstream consumers → full medallion, and invest in silver heavily.
- One source, light cleaning, one or two consumers → bronze + a single curated table is plenty. Call it silver if it makes the diagram tidy; don’t build a gold layer for one report.
- Tiny, slow-changing reference data → just maintain the table. Version it in git if you want history. Skip the pipeline.
Spend the layers where the complexity is. Don’t tax the simple stuff to make the architecture diagram symmetrical.
What “implemented right” actually looks like
Strip away the colors and the medallion pattern is just three honest commitments:
- Bronze is an immutable record of what arrived — replayable, metadata-stamped, lifecycle-managed, never mutated.
- Silver is where data becomes trustworthy — conformed entities, enforced validation, idempotent merges, deliberate history. This is where most of the engineering goes. A thin silver layer is a tell that the build is in trouble.
- Gold is a small, shared, governed business layer — conformed dimensions, a single metrics definition, marts designed with consumers but owned by the data team.
…held together by contracts at every boundary — quality gates, schema enforcement, freshness SLAs, named owners — and applied proportionally, not religiously.
When we run a data engineering assessment on a struggling lakehouse, the diagnosis is almost never “you picked the wrong pattern.” It’s “you implemented the layer names without implementing the layer responsibilities.” The fix is rarely a re-platform; it’s pushing the conformance work back into silver where it belongs, putting real gates between layers, consolidating the gold sprawl behind a metrics layer, and deleting the ceremonial pipelines that never earned their keep. That’s the same discipline baked into the S.C.A.L.E. data foundation we deploy on enterprise engagements — the layers are real because the contracts between them are real, not because the schemas have the right names.
Quick self-audit
Run these five questions against your own build:
- Can you replay silver from bronze for an arbitrary date, and is bronze byte-faithful to the source? (If no — failure mode 1.)
- Does silver do real conformance and validation, or is it bronze with renamed columns and a `CAST`? (Failure mode 2.)
- Can you list your gold tables on one page, with one owner and one definition per core metric? (Failure mode 3.)
- Is there an automated quality gate between every layer, and is layer-skipping actually impossible? (Failure mode 4.)
- Does every dataset get three layers regardless of whether it needs them? (Failure mode 5.)
A “wrong” answer on two or more of these means the medallion architecture on your slide and the one in production have diverged — and the gap is where your data quality problems are coming from. Worth fixing before the next platform RFP, not after.
Data Engineer
Mukesh is a Data Engineer at Algoscale building the deep-plumbing pieces of enterprise data platforms across AWS and Azure — MDM ledgers, CDC pipelines, Lake Formation access controls, Fabric semantic models. Writes from the production side of the stack.