Research · The economics of benefits

Nobody holds the whole truth.

Employee benefits looks like a product. It behaves like a relay: at every handoff a fee is taken and a piece of the truth is dropped. Follow the money through the chain and a single law explains the last fifteen years of it. The margin migrates to wherever measurement stops.
Keel Labs · June 9, 2026 · No. 01 · Revised · 16 min read
Figure 1 · The relay
A premium dollar passes through a half-dozen hands before it becomes care. Each handoff takes a fee and drops a shard of information. Fee sizes here are illustrative; the verified ranges are in the body.

A dollar leaves an employer and passes through a half-dozen hands before it becomes care. At every handoff a fee is taken and a piece of information is dropped. This paper follows both losses through the chain, node by node, and finds that they obey one law.

The people in the middle are not villains. I want to say that clearly up front, because it is the part everyone gets wrong. Each of them is rational, each is paid for the slice they touch, and each is blind to the slices they do not. Assemble them and you get a system that profits from confusion in aggregate without any single party choosing to.

The fair objection is that every supply chain looks like this. Coffee passes through a dozen hands between the farm and the cup, and nobody writes papers about the tragedy of coffee. The difference is that when coffee changes hands, the buyer can taste it. When benefits data changes hands, nobody downstream can tell a clean record from a corrupted one until a person is standing at a pharmacy counter being told they do not exist.

We started Keel because we kept hitting the same wall from the inside, so this paper does the unglamorous thing and traces the money and the data through the network, one node at a time. It ends with two findings. First: no participant in this system holds end-to-end truth. Not one. Second, and this is the engine underneath everything else in our research: every time a regulator or an auditor has finally measured one segment of the chain, the margin in that segment has not shrunk. It has moved. The margin migrates to wherever measurement stops.

The pool

Start with the money everyone is paid out of

US health spending reached $5.3 trillion in 2024, growing 7.2 percent in a single year to reach 18 percent of GDP, or $15,474 for every person in the country. Private health insurance paid $1.64 trillion of that, and employer coverage is its biggest block: 53.8 percent of Americans, roughly 180 million people, were on an employment-based plan at some point in 2024.

The price of a seat in that pool is the number this paper keeps returning to. The average family premium in 2025 was $26,993, up 6 percent in a year. In 1999, the same survey measured $5,791. The premium has multiplied 4.7 times since, and the worker's direct share now runs $6,850 a year, about $263 out of every biweekly paycheck before a single claim is filed.
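The two derived figures in that paragraph are plain arithmetic on the survey numbers. A minimal sketch, assuming 26 pay periods in a biweekly payroll year (the variable names are ours, for illustration only):

```python
# Survey figures quoted in the text (KFF Employer Health Benefits Survey).
FAMILY_PREMIUM_2025 = 26_993
FAMILY_PREMIUM_1999 = 5_791
WORKER_SHARE_2025 = 6_850

# Assumption: a biweekly payroll has 26 pay periods in a year.
PAY_PERIODS_PER_YEAR = 26

# How many times the 1999 premium fits into the 2025 premium.
premium_multiple = FAMILY_PREMIUM_2025 / FAMILY_PREMIUM_1999

# The worker's annual share spread evenly across paychecks.
per_paycheck = WORKER_SHARE_2025 / PAY_PERIODS_PER_YEAR

print(f"Premium multiple since 1999: {premium_multiple:.1f}x")
print(f"Worker share per biweekly paycheck: ${per_paycheck:.0f}")
```

Rounded, these reproduce the 4.7 multiple and the roughly $263 per paycheck cited above.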

$26,993.
The average family premium in 2025, against $5,791 in 1999. A once-a-year decision, made mostly blind, on one of the largest line items in most households.

In fairness, the most recent stretch is the calmest in the series: from 2020 to 2025, wages grew 28.6 percent while family premiums grew 26 percent. The famous divergence is mostly a 1999-to-2019 story, with a fresh re-acceleration at the end: 6 percent premium growth against 2.7 percent inflation last year. The relay does not need runaway growth to matter. It needs only the size of the flow, and the flow is twenty-seven thousand dollars per family, every year. Hold that number. It is what the chain below divides.

The nodes

Everyone in the middle is paid for their slice

The employer funds the plan and carries the legal risk: under ERISA it is a fiduciary, personally on the hook for prudence. In 2025, 67 percent of covered workers were in self-funded plans, where the employer pays the claims directly and the carrier merely administers. The employee pays a premium share through payroll, then pays again at the point of care. The employee is the only node that bears cost on both sides, and it sees almost nothing of how the price was set.

The broker is the door the money walks in through, paid as a fraction of premium: roughly 3 to 6 percent on medical, and as much as 40 to 50 percent of first-year premium on voluntary products. The largest study of these payments, covering 11.7 million employees across 33,689 plans, found a median commission of $178 per enrollee per year, positively associated with the plan's premium. ProPublica documented carrier bonuses on top, $100,000 and more per employer group, plus trips. None of this requires bad faith. It requires only arithmetic: the structure rewards placing premium, not lowering it, and that tilt is the first place truth leaves the system.

The general agency sits above the broker and takes an override the employer never sees. The carrier underwrites the risk under the medical loss ratio cap: at least 80 to 85 cents of every premium dollar must go to care, leaving admin, marketing, and profit inside the visible remainder. The TPA runs the self-funded plan for a per-head fee, the lightly regulated mirror of the insured model. The pharmacy benefit manager runs the drug benefit, and three firms process about 80 percent of the 6.6 billion prescriptions Americans fill each year. Underneath all of them sit the benefits administration (benadmin) platforms that hold the elections, the enrollment firms paid out of the voluntary commission, and the data-exchange vendors whose entire existence is the proof of the problem. If benefits data flowed cleanly, there would be no market for the plumbing.

Every intermediary is paid in proportion to its position in the relay, not to any measurable improvement in the employee's outcome.
Figure 2 · The slices never reassemble
Each node holds a real piece of the truth. Watch a record try to cross between them. The pieces are never in one place at one time, so the picture never assembles.

Here is the part that actually keeps me up. The carrier holds claims and utilization. The PBM holds drug spend and rebate terms, opaque even to the plan sponsor paying for them. The broker holds the relationship. The benadmin holds the elections. The employer holds the budget. The employee holds nothing, and usually finds out the record was wrong when a claim is denied. Better email does not fix this. Every handoff in a relay is a chance to drop the baton, and this relay has a dozen handoffs and no runner who carries it the whole way.

The law

The margin migrates to wherever measurement stops

Here is where complaint becomes thesis. Four times in the past fifteen years, a meter has been bolted onto one segment of this relay: an audit, a risk model, a loss-ratio cap, an underwriting ban. Four times, the metered segment came back clean within a few years, and the margin surfaced in the segment next door. Once is an anecdote. Four times is a law, and you can watch it operate in public data.

First instance, the cleanest. For years the accusation against pharmacy benefit managers was that they pocketed manufacturer rebates. So the government installed a meter. GAO audited Medicare Part D and found that in 2016, PBMs passed through 99.6 percent of roughly $18 billion in rebates to plan sponsors, keeping $74.3 million. The industry has quoted that audit ever since, and the audit is correct. It is also the wrong address. In January 2025 the FTC published what it found at the next one: from 2017 to 2022, the three largest PBMs collected $7.3 billion above estimated drug acquisition cost dispensing specialty generic drugs through their own affiliated pharmacies, with markups passing 1,000 percent on some cancer and HIV drugs, plus another $1.4 billion from spread pricing. The affiliated pharmacies' share of specialty dispensing revenue climbed from 54 percent in 2016 to 68 percent in 2023, and the excess was compounding at 42 percent a year. The newest fee entities, the group purchasing arms, are chartered in Switzerland and Ireland: one more border between the margin and the meter. This is why a rebate guarantee in a PBM contract is so cheap to give. The guarantee covers the metered address, and the margin has moved.

$7.3B.
Above-cost dispensing revenue the three largest PBMs collected on specialty generics from 2017 to 2022 through their own pharmacies, during the same era they passed 99.6 percent of rebates through. Both numbers are true. That is the point.

Second instance, the controlled experiment. In March 2025, MedPAC put Medicare Advantage payments about 20 percent above what traditional Medicare would have spent on the same people: $84 billion in all. Roughly $44 billion of that was coding intensity (plans documenting diagnoses aggressively, because payment rises with recorded sickness) and about $40 billion was favorable selection. CMS then finished phasing in V28, a risk model built specifically to meter the coding. It worked where it pointed: by MedPAC's March 2026 report, the upcoding estimate had fallen to about $22 billion and the total overpayment to $76 billion, or 14 percent. The metered channel shrank by half. The unmetered one, favorable selection, the quiet physics of which seniors choose which plans, is now the larger share, and no risk model audits it. The meter moved, and the margin's center of gravity moved with it, within a single reporting cycle, in public data.
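The shift in the mix can be checked directly from the two MedPAC totals. A minimal sketch, using only the figures quoted above; the 2026 selection number is a remainder we derive by subtraction rather than a line item quoted in the text, and the function names are ours:

```python
# MedPAC estimates quoted in the text, in billions of dollars.
report_2025 = {"total": 84, "coding": 44}
report_2026 = {"total": 76, "coding": 22}

def selection_remainder(report):
    """Overpayment not attributed to coding intensity (derived by subtraction)."""
    return report["total"] - report["coding"]

def coding_share(report):
    """Fraction of the total overpayment attributed to coding intensity."""
    return report["coding"] / report["total"]

for label, report in (("March 2025", report_2025), ("March 2026", report_2026)):
    rest = selection_remainder(report)
    share = coding_share(report)
    print(f"{label}: coding ${report['coding']}B ({share:.0%}), "
          f"remainder ${rest}B ({1 - share:.0%})")
```

On these inputs the metered channel falls from about half of the overpayment to under a third, while the unmetered remainder grows from $40 billion to roughly $54 billion.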

Third instance, the original meter. The ACA's medical loss ratio was the first big clamp: at least 80 to 85 cents of the premium dollar must be spent on care, with the carrier's take confined to the visible remainder. The decade that followed is a story of carriers buying the far side of the meter. The largest insurers now own PBMs, specialty pharmacies, and physician groups, and when a carrier pays its own subsidiary for a drug or a visit, the payment lands on the care side of the ledger at a price the subsidiary helped set. No public dataset totals how much margin crossed the line this way, and the endnote flags this instance as structure rather than sum. But one wing of that structure has been measured, and it is the FTC finding above. An affiliated-pharmacy share does not climb from 54 to 68 percent by accident.

Fourth instance, the one that gets personal. The ACA banned medical underwriting in the small group market: a carrier may not price a 30-person company by the health of its 30 people. Underwriting did not die. It moved into stop-loss. A level-funded plan is technically self-funded, wrapped in stop-loss insurance that is individually underwritten, and 37 percent of covered workers at firms with 10 to 199 workers are now in one, up from roughly 7 percent in 2019. Inside those arrangements the measurement gets names attached: in the 2025 Aegis stop-loss survey, 49 percent of plans saw a claimant cross $1 million in the past two policy years, and 16 percent report at least one lasered claimant, a specific person the underwriter has excluded or re-priced individually at renewal. The law banned measuring the group, so the measurement moved one contract over, where the ban does not reach, and turned into a list of names.

Figure 3 · The margin migrates
The PBM case, animated. A meter clamps the rebate segment and the rebate comes back 99.6% clean. The margin slides to the next unmetered segment: $7.3B in above-cost dispensing. GAO-19-498; FTC Second Interim Staff Report, Jan 2025.
Audit the rebate and the rebate comes back clean. The margin no longer lives at that address.

Notice what the law implies. Transparency rules aimed at one segment do not recover the margin; they forward it. Every reform that meters a single node finances the move to the next one, which is why each of the four stories above ends with a clean audit and an intact margin. The only countermeasure the law leaves open is the one nobody has built: measurement with no edge to migrate past.

The data layer

The pipe only runs one way

The mechanics underneath confirm the thesis. Enrollment moves on a HIPAA transaction called the 834, standardized in 1996, and the 834 is a one-way pipe. The sender transmits a file describing who is covered; the carrier may acknowledge that the file parsed; whether the carrier applied each record correctly is confirmed never. There is no universal schema, because every carrier built its own dialect. So a birth date miskeyed at enrollment becomes a denied claim three nodes downstream, in January, at a pharmacy counter, discovered by the member.

How often does the pipe fail? Here is the honest answer, and it took longer to accept than anything else in this paper: nobody knows. We went looking for an audited, public error rate for the enrollment pipeline and found vendor marketing, conference folklore, and nothing that survives a citation trace. The widely repeated claims that most carrier bills contain enrollment errors all trace back to firms selling the audit. The denial machine, which we covered in an earlier paper, at least has public rates in two markets. The pipe that decides whether you exist to your carrier has none anywhere.

0.
Audited, public error-rate statistics for the 834 enrollment pipeline. Not a low number. No number. By the first law, that makes the pipe exactly where the margin should be hiding, and the first institution to measure it will own the fact.
Figure 4 · The 834, send-only
Records fly toward the carrier and nothing confirms they were applied correctly. The failure rate drawn here is invented, because no audited one exists. That absence is the finding.

What the market cannot count, it prices anyway. The plumbing vendors, Noyo, Ideon, and the rest, exist for exactly one reason: to sell back the read path the 834 never had. Their headline feature is reconciliation, a recurring check on whether the data you sent is the data the carrier holds. Sit with that for a second. An industry sprang up to answer a question the system should have been able to answer itself, and its revenue is the closest thing to a measurement of the error rate nobody publishes.
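Reconciliation in this sense is a diff between what the sender transmitted and what the carrier actually holds. A minimal sketch, assuming a hypothetical read path that returns the carrier's records keyed by member ID (the 834 itself provides no such thing); the field names are illustrative, not X12 segment names, and the sample records are invented:

```python
from dataclasses import dataclass, asdict

@dataclass(frozen=True)
class Enrollment:
    member_id: str
    birth_date: str      # ISO date string
    plan_code: str
    coverage_start: str  # ISO date string

def reconcile(sent, held):
    """Compare sent records with carrier-held records, keyed by member_id.

    Returns a list of (member_id, issue, detail) tuples describing drift.
    """
    sent_by_id = {r.member_id: r for r in sent}
    held_by_id = {r.member_id: r for r in held}
    drift = []
    for member_id, s in sent_by_id.items():
        h = held_by_id.get(member_id)
        if h is None:
            drift.append((member_id, "missing_at_carrier", {}))
            continue
        s_fields, h_fields = asdict(s), asdict(h)
        # Keep only the fields whose sent and held values disagree.
        diffs = {k: (s_fields[k], h_fields[k])
                 for k in s_fields if s_fields[k] != h_fields[k]}
        if diffs:
            drift.append((member_id, "field_mismatch", diffs))
    # Records the carrier holds that the sender never transmitted.
    for member_id in sorted(held_by_id.keys() - sent_by_id.keys()):
        drift.append((member_id, "unexpected_at_carrier", {}))
    return drift

# Invented sample data: one transposed birth date, one record never applied.
sent = [
    Enrollment("M001", "1988-04-12", "PPO-A", "2026-01-01"),
    Enrollment("M002", "1991-09-30", "HDHP-B", "2026-01-01"),
]
held = [
    Enrollment("M001", "1988-04-21", "PPO-A", "2026-01-01"),
]

drift = reconcile(sent, held)
for row in drift:
    print(row)
```

The point of the sketch is the shape of the check, not its sophistication: both failure modes it surfaces are ones the send-only pipe would leave for the member to discover.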

The toll

What it costs to run a relay nobody can see

None of this operates for free. A relay in which nobody trusts the previous runner re-verifies everything by hand, and the hand-verification shows up in the national accounts as administration. The United States spends $925 per person per year on health system administration, against $245 in comparable wealthy countries, and that figure counts only insurer and program overhead, not the armies on the provider side. Counted fully, administration runs to roughly a quarter of US health spending. On the billing-and-insurance slice alone, about $496 billion a year, David Cutler's estimate is that half to two-thirds is excess.

The texture of that excess is in one comparison. US physician practices spend $82,975 per physician per year interacting with payers. Ontario practices, dealing with one payer and one schema, spend $22,205 for the same function. The gap is not richer care. It is the price of reconciling, manually, what the relay corrupted in transit, and it is the one fee in this paper that every node pays.
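The size of both gaps follows from the quoted figures. A minimal sketch of the arithmetic; note that the practice figures are 2011 dollars, unadjusted, and the two comparisons come from different sources and years, so the ratios are directional only:

```python
# Per-physician annual cost of interacting with payers (Morra et al., 2011 dollars).
US_PRACTICE = 82_975
ONTARIO_PRACTICE = 22_205

# Per-capita health administration spending (Peterson-KFF, 2021 OECD data).
US_ADMIN = 925
PEER_ADMIN = 245

practice_gap = US_PRACTICE - ONTARIO_PRACTICE
practice_ratio = US_PRACTICE / ONTARIO_PRACTICE
admin_ratio = US_ADMIN / PEER_ADMIN

print(f"Payer-interaction gap per physician: ${practice_gap:,}")
print(f"US practice cost is {practice_ratio:.1f}x Ontario's")
print(f"US per-capita administration is {admin_ratio:.1f}x comparable countries'")
```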

$82,975.
What a US practice spends per physician, per year, interacting with payers, against $22,205 in Ontario. The difference is the relay's operating fee, paid in staff hours.
Regulation

Every major rule is a patch on opacity

Read the major rules in sequence and a pattern jumps out. The medical loss ratio caps the carrier's take and forces rebates on the overshoot. Section 202 of the CAA, in force since December 27, 2021, requires any broker or consultant expecting $1,000 or more in compensation to disclose it to the plan fiduciary before the contract is signed, and makes the arrangement a prohibited transaction if they do not, with the liability landing on the employer. The gag-clause provision bans contracts that hide provider price and quality data from the plans paying the bills, and requires employers to attest annually that they signed none.

Each rule is necessary. Each is also a patch, forcing one specific slice of withheld information into the light, and you do not legislate the revelation of something that was never hidden. Read together, the rulebook is a regulator's confession: the system's default state is opacity.

Whether anyone reads what the patches reveal is a separate question. Four and a half years after Section 202 took effect, we can find no public record of a Department of Labor enforcement action under it. The disclosures exist. The reader they assumed does not. That is the second law, and a later paper: disclosure regimes exist, but nobody is built to read them.

A note on what would change our mind, before we get to what we are building. If any node in this relay could produce, on demand, one employee's complete current picture, the elections made, the records each carrier actually holds, and the compensation paid at every hop, this paper collapses and we will publish the correction with the screenshots. We spent years inside this system and never saw it done. And the first law is falsifiable on its own terms: if V28's squeeze had simply returned the money instead of shifting the mix toward selection, or if rebate transparency had been followed by a flat affiliate dispensing share, the law would be dead and this paper with it. The record so far runs one way.

What Keel Labs is building

If the failure is the absence of end-to-end truth, then the thing worth building is the layer that finally holds it.

This is the whole thesis of the lab, so let me be precise about the machinery. The money problem and the data problem are the same problem wearing two coats: both come from the fact that no node can see the whole, and another slice would not help. So we build the visibility layer: one system that sits across every node, reconciles what each party believes against what is actually true, and is accountable for the outcome instead of the transaction. Here is how that gets assembled, in order.

01

A living ontology of the whole network

Every carrier, plan, tier, rider, employer, broker, member, and provider becomes a typed node. Every commission, eligibility rule, and relationship becomes a labeled edge. You cannot reason about a network you have not mapped, so first we map all of it.
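As an illustration of what "typed node" and "labeled edge" mean in practice, here is a minimal sketch. It is not Keel's schema; the class names, node types, and edge labels are all hypothetical:

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Node:
    node_id: str
    node_type: str  # e.g. "employer", "broker", "carrier", "plan", "member"

@dataclass(frozen=True)
class Edge:
    source: str
    target: str
    label: str      # e.g. "retains", "pays_commission", "funds_plan_with"

@dataclass
class Graph:
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)

    def add_node(self, node_id, node_type):
        self.nodes[node_id] = Node(node_id, node_type)

    def add_edge(self, source, target, label):
        # Refuse to link anything that has not been mapped first.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must be mapped before linking them")
        self.edges.append(Edge(source, target, label))

    def neighbors(self, node_id, label=None):
        """Targets of outgoing edges, optionally filtered by edge label."""
        return [e.target for e in self.edges
                if e.source == node_id and (label is None or e.label == label)]

g = Graph()
g.add_node("acme", "employer")
g.add_node("broker-1", "broker")
g.add_node("carrier-1", "carrier")
g.add_edge("acme", "broker-1", "retains")
g.add_edge("carrier-1", "broker-1", "pays_commission")
g.add_edge("acme", "carrier-1", "funds_plan_with")

print(g.neighbors("acme"))
print(g.neighbors("acme", label="retains"))
```

Rejecting edges between unmapped nodes is a deliberate choice in the sketch: it enforces the ordering in the text, where mapping comes before reasoning.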

02

Entity resolution, so there is one truth and not five

The same plan shows up under five names across five systems. We resolve every record to one canonical entity, so the graph carries a single source of truth instead of five plausible contradictions.
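A toy version of the idea: normalize each system's name for a plan and group the records that collapse to the same key. This sketch assumes name normalization alone is enough, which real entity resolution would not; the plan names, the system labels, and the filler-word list are invented for illustration:

```python
import re
from collections import defaultdict

# Assumption: these filler words carry no identifying information.
FILLER = {"plan", "the"}

def normalize(name):
    """Lowercase, turn punctuation into spaces, drop filler words, sort tokens."""
    tokens = re.sub(r"[^a-z0-9]+", " ", name.lower()).split()
    return " ".join(sorted(t for t in tokens if t not in FILLER))

def resolve(records):
    """Group (system, name) pairs under one canonical key per entity."""
    groups = defaultdict(list)
    for system, name in records:
        groups[normalize(name)].append((system, name))
    return dict(groups)

records = [
    ("carrier", "Gold PPO 500"),
    ("benadmin", "PPO Gold 500 Plan"),
    ("payroll", "gold-ppo-500"),
    ("broker", "The Gold PPO (500)"),
    ("carrier", "Silver HDHP 3000"),
]

entities = resolve(records)
for key, members in entities.items():
    print(key, "<-", [name for _, name in members])
```

Sorting the tokens makes the key insensitive to word order, which is what lets "Gold PPO 500" and "PPO Gold 500 Plan" land on the same entity.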

03

The read path the 834 never had

The legacy pipe is send-only. We add continuous reconciliation: comparing what every node believes against what is actually true, and surfacing the drift the moment it appears, before it becomes a denied claim. This is the runner that finally carries the baton the whole way.

04

Provenance on every edge

Every fact in the graph carries its source and its timestamp, so when an answer comes out, it traces back to the exact line, from the exact system, at the exact moment it came from. In a regulated environment, an answer you cannot trace is an answer you cannot use.
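One way to picture provenance on a fact is a record that cannot exist without its source and timestamp. A minimal sketch, with hypothetical field names, source references, and sample values:

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass(frozen=True)
class Fact:
    subject: str
    predicate: str
    value: str
    source_system: str     # which system asserted this
    source_ref: str        # where in that system it came from
    observed_at: datetime  # when it was observed

def latest(facts, subject, predicate):
    """Most recently observed fact for a subject and predicate, or None."""
    matches = [f for f in facts
               if f.subject == subject and f.predicate == predicate]
    return max(matches, key=lambda f: f.observed_at) if matches else None

# Invented sample data: two systems disagree about one member's plan.
facts = [
    Fact("M001", "plan_code", "PPO-A", "benadmin",
         "elections export, row 42",
         datetime(2025, 11, 14, tzinfo=timezone.utc)),
    Fact("M001", "plan_code", "HDHP-B", "carrier",
         "eligibility extract, record 7",
         datetime(2026, 1, 5, tzinfo=timezone.utc)),
]

current = latest(facts, "M001", "plan_code")
print(f"{current.value} (from {current.source_system}: {current.source_ref}, "
      f"observed {current.observed_at.date()})")
```

Because every field is required, an answer built from these records always carries the trail back to the system and moment it came from.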

05

Fathom reasons over it

Our benefits-native model sits on top of the resolved, reconciled, provenanced graph and answers the questions people actually ask. Because it reasons over truth instead of one carrier's PDF, it can prove every answer back to the page.

06

Amanda makes it human

The employee was the one node holding nothing. Now there is a single voice that can see the whole relay and explain it in plain language, at 11pm, on a phone, without judgment. The truth finally reaches the person it was always for.

The first law has a contrapositive, and the contrapositive is the company. If the margin migrates to wherever measurement stops, then a layer that measures everywhere leaves it no edge to slip past. The fee that only made sense because nobody could see beyond it, the error that became a denial because nobody reconciled it, the better plan the employee never knew existed: none of it survives contact with a system that holds the whole truth and is paid to be right about it.

We are not trying to abolish the relay. We are trying to give it, for the first time, a runner who carries the baton the whole way. That runner is the rest of our research, and it is the reason Keel exists.
What this paper does not claim

It does not claim that any individual broker, carrier, or PBM is overpaid, only that the structure pays for position rather than outcome and that the measured record behaves exactly as that structure predicts. Specific caveats: the fee magnitudes in Figure 1 are illustrative, with the verified ranges in the body. The FTC's $7.3 billion covers specialty generics, a sliver of total drug spend, and Bai and colleagues show that commissions rise with premiums by association, not by proven cause. The third instance of the law, margin moving inside vertically integrated carriers after the MLR cap, is structural reasoning from public filings rather than a measured aggregate, and should be read that way. Figure 4 invents its failure rate, because no audited, public error rate for the 834 pipeline exists and vendor claims do not clear our sourcing bar; the Morra payer-interaction figures are 2011 dollars, unadjusted; KFF cautions that small employers sometimes misreport level-funded status; the Aegis survey reflects its respondent base of self-funded plans, not a census. No agency counts the aggregate cost of the dropped batons. The thesis does not rest on any one number. It rests on four independent instances of the same migration, and on the shape of a relay that cannot see its own end.
Sources

CMS, National Health Expenditures 2024 (released Dec 2025: $5.3T, +7.2%, 18.0% of GDP, $15,474 per capita; private health insurance $1,644.6B) · U.S. Census Bureau, Health Insurance Coverage in the United States: 2024, P60-288 (employment-based coverage 53.8% of the population) · KFF, 2025 Employer Health Benefits Survey (family premium $26,993, +6%, worker share $6,850; family premium $5,791 in 1999; 67% of covered workers self-funded; 37% of covered workers at firms of 10–199 workers level-funded; 2020–2025 wages +28.6% vs family premiums +26%, inflation +23.5%) · GAO-19-498, Medicare Part D: Use of Pharmacy Benefit Managers (2019: 99.6% of ~$18B in 2016 rebates passed through; $74.3M retained) · FTC, Interim Staff Report on PBMs (Jul 2024: top three PBMs process ~80% of ~6.6B annual prescriptions) · FTC, Second Interim Staff Report on PBMs (Jan 2025: $7.3B dispensing revenue above estimated acquisition cost on specialty generics 2017–2022; $1.4B spread income; affiliated share of specialty dispensing revenue 54%→68%, 2016–2023; excess compounding 42%/yr 2017–2021; GPOs chartered in Switzerland and Ireland) · ProPublica, The Health Insurance Hustle (2019: commissions 3–6% of premium, 40–50% on supplemental products, carrier bonuses of $100,000+ per group) · Bai et al., Medical Care Research and Review (2022: 11.7M employees, 33,689 plans, 2017 data; median commission $178/enrollee, positively associated with premium) · MedPAC, March 2025 Report to the Congress (MA ~20% above FFS, $84B: ~$44B coding intensity, ~$40B favorable selection) · MedPAC, March 2026 Report to the Congress (14%, $76B; upcoding ~4%, ~$22B, with V28 fully phased in) · Aegis Risk Medical Stop-Loss Premium Survey 2025 (1,268 policies: 49% with a $1M+ claimant in the last two policy years; 16% report at least one lasered claimant) · Peterson-KFF Health System Tracker (2021 OECD data: US health administration $925 per capita vs $245 in comparable countries) · Morra et al., Health Affairs (2011: US practices $82,975/physician/yr interacting with payers vs $22,205 in Ontario) · Health Affairs policy brief, The Role of Administrative Waste in Excess US Health Spending (2022: administration ~a quarter of US health spending; ~$496B billing- and insurance-related; Cutler: half to two-thirds excess) · CAA 2021 §202 / ERISA §408(b)(2)(B), DOL Field Assistance Bulletin 2021-03 (disclosure at $1,000+, effective Dec 27, 2021; non-disclosure is a prohibited transaction) · CAA gag-clause prohibition and CMS attestation · X12 834 implementation guides; Noyo and Ideon product documentation (reconciliation as a product) · Derived figures (4.7x premium multiple since 1999, ~$263 per biweekly paycheck, ~$54B selection remainder in 2026) are computed from the counts above. Figures are point-in-time and directional.
