I Built an AI Medical Records App and Shipped It on Kubernetes

How I built MediRecord, an AI medical-records app, and then shipped it like it actually had users.

There's a folder.

If someone in your family has been in treatment for a while, you know the exact one I mean. Prescriptions, lab reports, scan printouts, hospital bills, all shoved together, getting thicker with every visit. And nobody can really read it. The patient can't tell what's improving and what's getting worse. The doctor spends the first ten minutes of an appointment rebuilding a timeline from paper instead of actually treating the patient. And if the reports are in clinical English but the patient only reads Bengali or Tamil, then the most important information in their life is basically noise.

That folder is the whole reason MediRecord exists.

The pitch was easy to say and genuinely annoying to build: take the pile, let an AI read it, and hand it back as something a human can use. You upload a report, it gets parsed, sorted, summarized, and translated into your own language. The folder turns into a dashboard. A small assistant explains your latest result in plain words. The doctor gets a clean view instead of a paper avalanche.

The part that feels like magic

You drop a PDF or a phone photo into Smart Upload, and the app sends it to Google Gemini with a very opinionated prompt. The trick is that Gemini doesn't just summarize. It returns typed JSON:

{
  "category": "blood_test",
  "date": "2026-06-12",
  "source": "Apollo Diagnostics",
  "data_points": [
    { "test": "HbA1c", "value": "7.8", "unit": "%", "normal_range": "4.0-5.6", "flag": "high" }
  ],
  "summary_en": "Your HbA1c is 7.8%, above the normal range..."
}
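A response like this is only useful if it gets checked before anything downstream trusts it. Here is a minimal sketch of that check, assuming the field names from the sample above. The `ParsedReport` type, the `parseReport` helper, and the `low` / `normal` flag values are my own illustrative guesses, not lifted from the MediRecord code:

```typescript
// Hypothetical types mirroring the sample response. Only "high" appears in
// the sample, so the other two flag values are assumptions.
type Flag = "low" | "normal" | "high";

interface DataPoint {
  test: string;
  value: string; // the sample sends numbers as strings
  unit: string;
  normal_range: string;
  flag: Flag;
}

interface ParsedReport {
  category: string;
  date: string; // ISO date, e.g. "2026-06-12"
  source: string;
  data_points: DataPoint[];
  summary_en: string;
}

function isDataPoint(x: unknown): x is DataPoint {
  if (typeof x !== "object" || x === null) return false;
  const { test, value, unit, normal_range, flag } = x as Record<string, unknown>;
  return (
    typeof test === "string" &&
    typeof value === "string" &&
    typeof unit === "string" &&
    typeof normal_range === "string" &&
    (flag === "low" || flag === "normal" || flag === "high")
  );
}

// Returns the typed report, or null when the model output is malformed.
function parseReport(raw: string): ParsedReport | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null; // not JSON at all
  }
  if (typeof data !== "object" || data === null) return null;
  const { category, date, source, data_points, summary_en } =
    data as Record<string, unknown>;
  if (
    typeof category !== "string" ||
    typeof date !== "string" ||
    typeof source !== "string" ||
    typeof summary_en !== "string" ||
    !Array.isArray(data_points) ||
    !data_points.every(isDataPoint)
  ) {
    return null;
  }
  return { category, date, source, data_points: data_points as DataPoint[], summary_en };
}
```

Returning `null` instead of throwing keeps the upload path simple: a bad response becomes "could not read this report" rather than a crash.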

That shape is the whole game. Once a blood test stops being a PDF and becomes data, everything downstream gets easy. I can chart an HbA1c trend across four visits, flag the one that's climbing, and put it in front of the patient before they go looking for it. A single derivation engine (buildHealth) reads the raw records and computes vitals, trends, and insights, so there's exactly one source of truth feeding both the patient dashboard and the doctor's view.
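To make the "flag the one that's climbing" idea concrete, here is a rough sketch of a trend check over one test's readings. It is not the actual `buildHealth` code: the `Reading` type, the `trendOf` name, the first-versus-last comparison, and the `tolerance` threshold are all assumptions, and the sample values are made up.

```typescript
interface Reading {
  date: string; // ISO date, so string order matches time order
  value: number;
}

type Trend = "rising" | "falling" | "stable";

// Compares the earliest and latest reading. `tolerance` is an assumed
// threshold below which a change is treated as noise.
function trendOf(readings: Reading[], tolerance = 0.1): Trend | null {
  if (readings.length < 2) return null; // one point is not a trend
  const sorted = [...readings].sort((a, b) => a.date.localeCompare(b.date));
  const first = sorted[0]!;
  const last = sorted[sorted.length - 1]!;
  const delta = last.value - first.value;
  if (Math.abs(delta) <= tolerance) return "stable";
  return delta > 0 ? "rising" : "falling";
}

// Made-up HbA1c values across four visits, deliberately out of order.
const hba1c: Reading[] = [
  { date: "2026-06-12", value: 7.8 },
  { date: "2025-09-03", value: 6.9 },
  { date: "2026-03-21", value: 7.5 },
  { date: "2025-12-15", value: 7.2 },
];

console.log(trendOf(hba1c)); // "rising"
```

A real version would probably want something sturdier than first-versus-last, such as a fitted slope, but the point is that this only becomes a ten-line function once the report is data.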

The cache I'm a little too proud of

Translation is where the free-tier Gemini key and I had words.

Every summary gets auto-translated into the patient's language (twelve Indian languages plus English). Cool, except the Gemini free tier caps you at around 20 requests a day, and re-translating the same sentence on every page load is a fast way to burn through that and your patience.

So translations are cached in Postgres, keyed by an MD5 hash of the source text plus the target language. Translate a string once, serve it from the database forever. Same string, same language, never hits the API twice.
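A minimal sketch of that lookup, with some loud assumptions: an in-memory `Map` stands in for the Postgres table, the key format (`<md5>:<lang>`) is my guess at "hash plus language", and `cacheKey`, `cachedTranslator`, and the `Translate` type are hypothetical names. Only `createHash` from Node's `crypto` module is a real API here.

```typescript
import { createHash } from "node:crypto";

// Whatever actually calls the model; injected so it can be swapped or faked.
type Translate = (text: string, lang: string) => Promise<string>;

// MD5 is fine here: it is a cache key, not a security boundary.
function cacheKey(sourceText: string, targetLang: string): string {
  const digest = createHash("md5").update(sourceText, "utf8").digest("hex");
  return `${digest}:${targetLang}`;
}

// Wraps a translator so each (text, language) pair reaches the API once.
// `store` is a Map standing in for the database table.
function cachedTranslator(
  translate: Translate,
  store: Map<string, string> = new Map()
): Translate {
  return async (text, lang) => {
    const key = cacheKey(text, lang);
    const hit = store.get(key);
    if (hit !== undefined) return hit; // served from cache, no API call
    const translated = await translate(text, lang);
    store.set(key, translated);
    return translated;
  };
}
```

Hashing the text keeps the key a fixed length no matter how long the summary is, which matters once the key becomes an indexed column.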

The other small decision I like: translation fails open. If Gemini is down or the quota's gone, the patient sees clean English instead of a broken page. A medical app showing a 500 error to a worried person is the worst possible failure mode, so it just degrades quietly.
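Failing open can be as small as one try/catch around the translation call. A sketch, assuming a hypothetical `translateFailOpen` wrapper and an injected `translate` function; neither name comes from the real codebase:

```typescript
type Translate = (text: string, lang: string) => Promise<string>;

interface Rendered {
  text: string;
  lang: string; // the language actually served, which may not be the one asked for
}

// Tries to translate; on any failure, serves the English summary instead
// of letting the error reach the page.
async function translateFailOpen(
  summaryEn: string,
  lang: string,
  translate: Translate
): Promise<Rendered> {
  if (lang === "en") return { text: summaryEn, lang: "en" };
  try {
    return { text: await translate(summaryEn, lang), lang };
  } catch (err) {
    // Quota exhausted, network down, bad response: all end up here.
    console.warn("translation failed, serving English:", err);
    return { text: summaryEn, lang: "en" };
  }
}
```

Returning the language that was actually served lets the UI show a small "shown in English" note instead of silently pretending the translation happened.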

Then I decided to ship it like it was real

I could have stopped at "it runs on Vercel." It does, and that's the public demo. But the part I actually wanted to learn was everything a hosted platform hides from you.

So the repo carries a full delivery pipeline. A pull request runs GitHub Actions: lint, a full TypeScript type-check, and a production build. Nothing reaches main unless all three pass, and branch protection enforces it. On merge, the image gets built and pushed to GitHub Container Registry. From there it lands on a k3s Kubernetes cluster running on EC2, and Argo CD keeps the cluster in sync with the k8s/ folder in git, with auto-sync, self-heal, and prune turned on.

The nice consequence: I don't deploy by SSHing in and running commands. I change replicas in a YAML file, open a PR, merge it, and Argo CD reconciles the cluster to match git. Git is the source of truth. If I scale a Deployment by hand, Argo notices the drift and quietly puts it back. That felt like cheating the first time I watched it happen.
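The mental model that made this click for me is a diff between two states. The toy below is not how Argo CD is implemented (it compares full manifests, not replica counts), and `reconcile`, `State`, and `Action` are names I made up. It just shows why a manual change reads as drift and why prune removes things git no longer mentions:

```typescript
// Toy model: resource name -> replica count.
type State = Map<string, number>;

type Action =
  | { kind: "create"; name: string; replicas: number }
  | { kind: "scale"; name: string; from: number; to: number }
  | { kind: "prune"; name: string };

// Computes what must change in `live` so that it matches `desired` (git).
function reconcile(desired: State, live: State): Action[] {
  const actions: Action[] = [];
  for (const [name, replicas] of desired) {
    const current = live.get(name);
    if (current === undefined) {
      actions.push({ kind: "create", name, replicas });
    } else if (current !== replicas) {
      // Someone changed the cluster by hand: put it back.
      actions.push({ kind: "scale", name, from: current, to: replicas });
    }
  }
  for (const name of live.keys()) {
    // Present in the cluster but gone from git: prune it.
    if (!desired.has(name)) actions.push({ kind: "prune", name });
  }
  return actions;
}

// Git says 2 replicas; someone scaled to 5 and left an old deployment around.
const git: State = new Map([["medirecord-web", 2]]);
const cluster: State = new Map([["medirecord-web", 5], ["old-experiment", 1]]);
console.log(reconcile(git, cluster));
```

When the two states already match, the function returns an empty list, which is the toy equivalent of Argo reporting Synced.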

A few details I cared about along the way. The container image comes in around 150 MB instead of a gigabyte, because Next.js standalone output traces only the dependencies the app actually uses. Each pod idles at roughly 35 MiB of memory. The containers run as a non-root user, and secrets never get baked into the image: the public Supabase keys go in as build args, but the private ones are injected at runtime from a Kubernetes Secret that never touches git. And every Supabase table is locked down with Row-Level Security at the database layer, not just in app code, so a patient only ever sees their own records, and a doctor sees a patient only when there's a matching row in an access_grants table.
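The access rule itself is small enough to state as a function. To be clear, the real enforcement lives in Postgres policies, not in TypeScript, and the column names (`doctor_id`, `patient_id`) and the `canRead` helper are assumptions for illustration. This only spells out the logic the policies encode:

```typescript
interface Viewer {
  id: string;
  role: "patient" | "doctor";
}

// Hypothetical shape of a row in the access_grants table.
interface AccessGrant {
  doctor_id: string;
  patient_id: string;
}

// A patient reads only their own records; a doctor reads a patient's
// records only when a matching grant row exists.
function canRead(viewer: Viewer, patientId: string, grants: AccessGrant[]): boolean {
  if (viewer.role === "patient") return viewer.id === patientId;
  return grants.some(
    (g) => g.doctor_id === viewer.id && g.patient_id === patientId
  );
}

const grants: AccessGrant[] = [{ doctor_id: "dr-1", patient_id: "pt-1" }];

console.log(canRead({ id: "pt-1", role: "patient" }, "pt-1", grants)); // true
console.log(canRead({ id: "pt-2", role: "patient" }, "pt-1", grants)); // false
console.log(canRead({ id: "dr-1", role: "doctor" }, "pt-1", grants)); // true
console.log(canRead({ id: "dr-1", role: "doctor" }, "pt-2", grants)); // false
```

Putting the same rule in the database means a bug in app code can't leak another patient's records, because the query simply returns nothing.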

What broke, and what it taught me

A few things humbled me.

The first time I saw Argo CD report Synced while the cluster clearly wasn't what I expected, I assumed it was lying. It wasn't. Synced means the cluster matches git, not my mental history of kubectl commands. Everything I'd poked at by hand was drift. That reframed how I think about infrastructure: stop treating the live cluster as the thing you edit, start treating git as the thing you edit.

ImagePullBackOff taught me my GHCR package was private and the cluster had no credentials to pull it. The Gemini quota taught me to cache aggressively. And stopping and starting the EC2 host taught me that the instance comes back with a new public IP unless you pin an Elastic IP, while the k3s cluster, Argo, and pods all happily recover from disk.

None of that is in a tutorial in a way that sticks. You learn it by breaking your own thing at midnight.

What I'm building next

Right now the brain of the whole thing is Gemini, and that's fine for reading a document. But the direction I'm actually chasing is a model of my own. Not one that reads a single report in isolation, but one that reads as much of a person as it can and slowly learns the shape of them. Your baselines, your patterns, what "normal" looks like for your body specifically instead of the textbook average. The more of your history it reads, the more it understands you and not just your latest test. That's the part I'm quietly building toward.

The bigger goal is personalized diagnosis that works from both sides of the room. For the patient, an assistant that can look across years of records and say, in plain words, here's what's slowly trending and here's what's worth bringing up. For the doctor, a second pair of eyes that pulls the real signal out of a stack of PDFs before the appointment even starts. Same data, two lenses, both genuinely useful.

It's half-built right now, and honestly the harder half is still in front of me. But that's kind of the whole point of starting.

Where it lands

MediRecord ended up being two projects wearing one repo. A real product, an AI app that turns a messy folder into a readable, multilingual health record. And a platform-engineering showcase, a complete path from a pull request to a self-healing cluster.

It's a portfolio project, not a medical device, and I'm very clear about that in the README. But it answered the question I actually had when I started: can I take an idea this fuzzy all the way to something that builds itself, ships itself, and heals itself when I'm not looking?

Turns out, yes. One overstuffed folder at a time.

Live demo: medirecord.vercel.app · Code: github.com/wasimat404/medirecord
