Case study 02 · Healthtech · regulated AI

Healthily — A symptom checker 5.5M people use

People described their symptoms in everyday words — "my chest feels tight," "I'm always tired" — that the AI couldn't recognise. I built the layer that let it understand them.

+23% symptom-description accuracy
5.5M users of the device it shipped on
Class I → IIa: design foundations retained through stricter recertification
Role: Design Lead, clinical & commercial product
Scope: Symptom checker, triage flow, clinician knowledge base
Regulatory: Class I medical device · EU MDR
Before
  1. Enter symptoms: no guidance
  2. System reads it: misreads plain words
  3. Consultation: less accurate
  4. Result: lower confidence

Emotions: mild frustration → frustration → dissatisfaction → uncertainty

Pain points — medical terminology, spelling errors misread, no guidance during entry.

After
  1. Smart entry: autocomplete after 3 characters
  2. Confirm terms: "also known as" terms
  3. Consultation: 23% more accurate
  4. Reliable result: higher confidence

Emotions: satisfaction → confidence → reassurance → trust

What changed — autocomplete, “also known as” terms, visual confirmation, editable symptoms.

Problem: People describe symptoms in plain words, such as "my chest feels tight." The AI's clinical knowledge base didn't recognise them, so searches failed and consultations were abandoned, on a Class I medical device used by 5.5 million people.
The call: A curated lay-term → clinical mapping, validated by clinicians in MediBase, the internal tool I designed. Recognition came first, without loosening clinical precision. Two faster options were rejected: one risked clinically wrong matches, the other broke the real-time promise.
Result: Symptom-description accuracy up 23%, the design foundations held through stricter Class IIa recertification, and the knowledge base I structured is still live in the product today.
Screenshot: the consultation flow and the report screen, "based on what you've told us".
The consultation, then the report — triage framed as triage, never diagnosis.
Screenshot: MediBase, where clinicians model each condition: symptoms keyed to UMLS concept IDs (CUIs), weighted, with inclusion and exclusion rules.
MediBase, the symptom-recognition and ratification system. The second product, built in tandem.

The insight

Users weren't failing to describe their symptoms — the product was listening in the wrong language.

The single biggest source of failed consultations was the gap between how people talk about their bodies and how clinical terminology classifies them. And because this was a certified medical device under EU MDR, the fix couldn't be looser language — clinical precision was non-negotiable. The design problem was building a bridge between the two vocabularies without compromising either.

The knowledge base behind it

The search field was the visible tip. Under it sat a knowledge base I structured: every condition, its symptoms, its red flags, and the relationships that decided what the checker could safely ask and show. The medical calls were never mine. I built the system clinicians used to map conditions and set red flags, because the front end is only ever as clear as the structure feeding it.

The translation layer

I designed a smart symptom autocomplete backed by a curated mapping between lay terms and the ontology's clinical vocabulary — type "hay fever" and the system knows you mean allergic rhinitis, with "also known as" terms surfaced in the dropdown. The clinical team and data scientists validated every mapping before it shipped.
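The lookup described above can be sketched in a few lines of Python. This is a minimal illustration, not the shipped implementation: the `Mapping` type, the `ratified` flag, the `suggest` function and the plain substring matching are all assumptions made for the example. Only the 3-character trigger, the "hay fever" → allergic rhinitis pair, the "head feels tight" → constricting headache pair, and the "also known as" label come from this case study.

```python
from dataclasses import dataclass

MIN_CHARS = 3  # autocomplete starts after 3 characters, as in the "After" flow


@dataclass(frozen=True)
class Mapping:
    lay_term: str       # what a person types
    clinical_term: str  # the ontology's term for the same thing
    ratified: bool      # hypothetical flag: True once a clinician has signed off


# Illustrative table. The third row is a made-up, unratified entry that
# exists only to show that unreviewed mappings are never suggested.
MAPPINGS = [
    Mapping("hay fever", "allergic rhinitis", ratified=True),
    Mapping("head feels tight", "constricting headache", ratified=True),
    Mapping("hay allergy", "allergic rhinitis", ratified=False),
]


def suggest(query: str, mappings=MAPPINGS) -> list[str]:
    """Return dropdown labels for ratified mappings whose lay term contains the query."""
    q = " ".join(query.lower().split())  # lowercase and collapse whitespace
    if len(q) < MIN_CHARS:
        return []
    hits = [m for m in mappings if m.ratified and q in m.lay_term.lower()]
    return [f"{m.lay_term} (also known as: {m.clinical_term})" for m in hits]


print(suggest("hay"))
```

The sketch deliberately returns nothing rather than guessing when no ratified entry matches. It makes no attempt at spelling tolerance, which a real system would need to handle separately.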

A/B tests against the existing flow showed completion held even as consultations lengthened — but drop-off spiked the moment a symptom went unrecognised. That set the priority: recognition first, speed second.

Recognising what patients type

  • The fast fix: Loosen the matching thresholds. But on a certified medical device, a clinically wrong match is worse than no match.
  • The workaround: Free-text now, clinical review later. That broke the real-time promise and pushed the ambiguity onto the clinical team.
  • What I built: A curated lay-term mapping, every entry signed off by clinicians. Recognition improved with clinical precision intact.

The clinical lead resisted this for two weeks — it meant sign-off on every single mapping. I held that precision was the product, not a feature of it. I'd make the same call again, but I'd have built the governance around it a quarter earlier.

The triage flow

I owned the journey end to end, not just the search field. The consultation was rebuilt around progressive disclosure, gathering symptom details at the right moments rather than front-loading them, with add and remove confirmation so people could see and correct what the system had understood. The report presented possible causes, a plain summary and clear next steps, framed as "based on what you've told us". The product triages. It does not diagnose.

"Based on what you've told us" wasn't copy — it was the line between triage and diagnosis, and therefore the regulatory position. Legal and I wrote that phrasing together; it changed what the report screen was allowed to claim.

Flow board (zoomed-out export): the whole branching flow, zoomed out and not meant to be read. Every triage path, red-flag interrupt, follow-up question and error state, mapped end to end.
The complete branching flow — every triage path, red-flag interrupt, and error state. I walk it through live; it sits here as proof of the depth behind the happy path.

Designing under a medical-device constraint

This was a Class I medical device under EU MDR, so a design decision could carry regulatory weight a normal product never feels. What the report was allowed to claim, how a symptom was worded, what counted as a red flag: all of it sat inside the certification. When the product was later taken through the stricter Class IIa recertification, the design foundations held. Patterns built for one risk class stood up under a tougher one.

Two products, hand in hand

This was two products in tandem. The symptom checker was the one 5.5 million people saw. Behind it, I led MediBase, where clinicians model and validate each condition, keyed to clinical concept IDs and weighted, behind a publish and red-flag gate. When an outcome came back wrong, this was where it was traced and corrected at source. The design problems were opposites: settle a worried patient at home, or give an expert exact control over how a condition is modelled.

User-facing · symptom checker

Head feels tight

also known as: constricting headache

MediBase · internal clinician tool

C0857278 · Constricting headache · 95

lay terms: tight band, pressing, squeezing

✓ Ratified · Dr. P. Okafor · 12 May 2024 · v3

MediBase, the symptom-recognition and ratification system — a condition's symptoms keyed to UMLS concept IDs with weights, plus inclusion and exclusion rules, behind red-flag and publish gates
  1. Publish + red-flag gates: Nothing reaches a patient un-reviewed.
  2. Symptoms keyed to UMLS CUIs, weighted: A probabilistic model, not a keyword list.
  3. Inclusion / exclusion rule engine: The clinical logic that decides each outcome.
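To make the three pieces above concrete, here is a small Python sketch of how a condition record with weighted CUIs, inclusion and exclusion rules, and the two gates might fit together. It is a simplified stand-in under stated assumptions, not MediBase's actual data model. Every class, field and function name is hypothetical, the `C-DEMO-*` identifiers are placeholders rather than real CUIs, reading the 95 on the example card as a weight on a 0 to 100 scale is an assumption, and the weight-share score is far cruder than the probabilistic model the list refers to.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Symptom:
    cui: str                # UMLS concept ID (CUI)
    name: str
    weight: int             # assumed 0-100 scale
    red_flag: bool = False  # hypothetical marker for red-flag symptoms


@dataclass
class Condition:
    name: str
    symptoms: list[Symptom]
    include: set[str] = field(default_factory=set)  # CUIs that must be reported
    exclude: set[str] = field(default_factory=set)  # CUIs that rule the condition out
    ratified_by: Optional[str] = None
    published: bool = False

    def publish(self) -> None:
        # Publish gate: nothing goes live without a named clinician sign-off.
        if not self.ratified_by:
            raise PermissionError(f"{self.name} has not been ratified")
        self.published = True


def red_flags(condition: Condition, reported: set[str]) -> list[str]:
    """Names of the condition's red-flag symptoms that the person reported."""
    return [s.name for s in condition.symptoms if s.red_flag and s.cui in reported]


def score(condition: Condition, reported: set[str]) -> Optional[float]:
    """Share of the condition's total weight covered by the reported CUIs."""
    if not condition.published:
        return None  # unpublished models never produce an outcome
    if not condition.include <= reported or condition.exclude & reported:
        return 0.0   # inclusion rule unmet, or an exclusion rule triggered
    total = sum(s.weight for s in condition.symptoms)
    matched = sum(s.weight for s in condition.symptoms if s.cui in reported)
    return matched / total if total else 0.0


# Illustrative record: only the first symptom's CUI, name and 95 come from the card above.
example = Condition(
    name="Example condition",
    symptoms=[
        Symptom("C0857278", "Constricting headache", 95),
        Symptom("C-DEMO-1", "Placeholder symptom", 40),
        Symptom("C-DEMO-2", "Placeholder red flag", 60, red_flag=True),
    ],
    include={"C0857278"},
    exclude={"C-DEMO-3"},
)
```

Until `example.ratified_by` is set and `example.publish()` succeeds, `score` returns `None`. That is the sketch's version of "nothing reaches a patient un-reviewed".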

What it changed

Symptom-description accuracy rose 23% after launch, counting only inputs that mapped to a clinically actionable classification. The gain held. When the product was later recertified to the stricter Class IIa, the design foundations carried over: patterns built for one risk class held under a tougher one.

MediBase, the symptom-recognition and ratification system I designed, where clinicians validate how everyday symptoms map to conditions, is still in active use in the Healthily product today.

Reflection

The hardest constraint was cross-disciplinary, not regulatory. One word could block certification and still had to make sense to a worried person at home. The skill this built was holding the clinical, legal and UX frames at once, and knowing which takes precedence when they conflict. I treated the clinical vocabulary as fixed for too long. The mapping table needed governance (who owns it and how it updates) as much as an interface, and I came to it a quarter later than I should have.
