When the importer breaks, Claude opens a pull request
Roaming runs on data exchange, and not all of it arrives over a tidy API. A steady share of it turns up as spreadsheets: partners send us CDRs and invoices as CSV and Excel files, and every partner has their own idea of what the columns should be called, what order they go in, and how a date or an amount should look. Multiply that by a long tail of partners and you have a genuinely hard ingestion problem.
This is the story of how we tamed it, and of a pattern we ended up rather liking: when the importer hits a file it cannot parse, it does not just page a human. It asks Claude to investigate and open a pull request.
The baseline: header mapping
The first layer of the answer is unglamorous and effective: header mapping. Rather than hard-code one partner’s layout, we map each partner’s arbitrary columns onto our own canonical set of fields. “Session ID”, “cdrId”, “Charge Ref”, and a dozen other spellings all land on the same internal concept. Get the mapping right and the rest of the pipeline does not care where the file came from.
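A minimal sketch of what such a mapping layer could look like. The canonical field names (`cdr_id`, `energy_kwh`, `started_at`), the alias table, and the normalisation rule are illustrative assumptions, not our actual schema; only the three ID spellings come from the examples above.

```python
import re

# Hypothetical alias table: canonical field -> spellings seen in partner files.
HEADER_ALIASES = {
    "cdr_id": ["Session ID", "cdrId", "Charge Ref"],
    "energy_kwh": ["Energy", "kWh", "Energy (kWh)"],
    "started_at": ["Start", "Start Time", "startDateTime"],
}


def normalise(header: str) -> str:
    """Lower-case a header and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", header.lower())


# Reverse lookup, built once: normalised alias -> canonical field.
LOOKUP = {
    normalise(alias): field
    for field, aliases in HEADER_ALIASES.items()
    for alias in aliases
}


def map_headers(headers):
    """Return ({partner header: canonical field}, [headers we could not map])."""
    mapping, unmapped = {}, []
    for header in headers:
        field = LOOKUP.get(normalise(header))
        if field is None:
            unmapped.append(header)
        else:
            mapping[header] = field
    return mapping, unmapped
```

Normalising before the lookup means "SESSION_ID" and "Session ID" hit the same entry, and anything left in the unmapped list is exactly the part that still needs attention.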
The trouble is that maintaining those mappings by hand, across every partner and every time one of them quietly changes their export, is exactly the kind of toil that never ends.
Letting an LLM propose the mapping, but never trusting it
So we let an LLM do the first pass. It looks at a file and proposes how its columns map onto our fields. That saves the tedious part. What it does not get is our trust.
An LLM-generated plan is never applied as-is. It is a proposal that has to survive validation:
- Type checks. Do the values in a column actually look like what the plan claims? A column mapped to “energy” had better contain numbers, not timestamps.
- Probing the database. Several columns often look like they could be the CDR ID. So we take the candidates and try them against the database to see which one actually resolves to real records. The column whose values match existing records is the real ID.
The LLM is good at the fuzzy first guess. Our code is good at deciding whether the guess is right. Keeping those two responsibilities separate is most of what makes this reliable.
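The two validation steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not our production code: it uses SQLite, a hypothetical `cdrs` table with an `id` column, and a small sample of rows represented as dictionaries keyed by the partner's column names.

```python
import sqlite3


def looks_numeric(values):
    """Type check: every non-empty value must parse as a float."""
    non_empty = [v for v in values if v.strip()]
    if not non_empty:
        return False
    for value in non_empty:
        try:
            float(value)
        except ValueError:
            return False
    return True


def resolve_id_column(conn, rows, candidates):
    """Probe the database with each candidate column.

    Returns the column whose sample values match the most rows in the
    (hypothetical) cdrs table, or None if no candidate matches anything.
    """
    best, best_hits = None, 0
    for column in candidates:
        values = [row[column] for row in rows if row.get(column)]
        if not values:
            continue
        # One "?" placeholder per value keeps the query parameterised.
        placeholders = ",".join("?" * len(values))
        (hits,) = conn.execute(
            f"SELECT COUNT(*) FROM cdrs WHERE id IN ({placeholders})", values
        ).fetchone()
        if hits > best_hits:
            best, best_hits = column, hits
    return best
```

A proposed plan that maps a timestamp column to "energy" fails `looks_numeric`, and an ID candidate that resolves to nothing in the database is dropped, whatever the LLM suggested.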
The interesting bit: failure opens a pull request
Even with all that, formats keep surprising us. Eventually a file arrives that the importer genuinely cannot handle: a structure the code was never written for, an assumption that no longer holds. The old answer was a stack trace and an engineer starting from scratch.
The new answer is a loop. On a parsing failure, the platform triggers a GitHub Actions workflow. That workflow hands Claude the relevant code, the failure itself, and a sample of the offending data, and asks it to work out what went wrong. Claude analyses all three and opens a pull request with a proposed change.
And then it stops. The pull request is where a human takes over. An engineer reviews the diff, the normal CI runs against it, and it ships through exactly the same path as any other change. Nothing is applied automatically, nothing bypasses review, nothing deploys itself.
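The trigger at the start of that loop could be sketched like this. One way to start a GitHub Actions workflow from outside GitHub is a `repository_dispatch` event sent through the GitHub REST API; whether the platform uses that mechanism is an assumption here, and the event name, payload fields, and function name are all hypothetical. The sketch only builds the request, and it does not cover how the workflow itself invokes Claude.

```python
import json
import urllib.request


def build_dispatch_request(owner, repo, token, error, sample_rows, importer_path):
    """Build (but do not send) a repository_dispatch request for a parse failure."""
    payload = {
        "event_type": "import-parse-failure",  # hypothetical event name
        "client_payload": {
            "error": error,                  # the failure itself
            "importer_path": importer_path,  # where the relevant code lives
            "sample_rows": sample_rows[:5],  # a small sample of the offending data
        },
    }
    return urllib.request.Request(
        url=f"https://api.github.com/repos/{owner}/{repo}/dispatches",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

Passing the result to `urllib.request.urlopen` would send it, and a workflow declared with `on: repository_dispatch` would receive the three pieces of context under `github.event.client_payload`.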
Why a pull request, and not a fix
The temptation with this kind of thing is to close the loop entirely and let the system patch itself. We deliberately did not. A pull request is the perfect seam:
- It puts the investigation where the LLM is genuinely strong: reading unfamiliar code, correlating it with a concrete failure and real data, and drafting a plausible change.
- It keeps the judgement where it belongs, with humans: deciding whether the fix is actually right, whether it has side effects, and whether it should merge at all.
- It rides the guardrails you already have. Code review, tests, and deployment gates all apply, because the output is just a PR like any other.
That is the whole point of the talk this post came out of. The LLM is not a magic autonomous fixer. It is one more participant in a controlled loop, doing the legwork of diagnosis and a first draft, with a human firmly holding the merge button.
The future
The pattern generalises well beyond spreadsheet imports. Anywhere a failure is well defined, the relevant context can be gathered automatically, and a fix is reviewable, the same loop is a candidate: detect, gather, propose, review. We started with the messiest corner of roaming because that is where it hurt most. It is unlikely to be where it ends.