I Built a Free AI Resume Builder: Here's Every Architecture Decision

This is part two of the Synapse build story. Part one covered the what and why. This one covers the how — every tool, every trade-off, and a few things I'd do differently.

If you haven't read part one: I built Synapse — a free, ATS-friendly resume builder — because my own resume was two years out of date and I couldn't find a tool I actually trusted. You upload your LinkedIn PDF, pick a target market (US, UAE, India, or Global), and get back a professionally typeset, ATS-optimized resume in seconds.

Since that post, two new features shipped:

Job description targeting. Paste a JD alongside your resume and Synapse tailors the output to match — reordering, reframing, and keyword-aligning your experience to what the role is actually looking for. Nothing fabricated. Your real experience, presented in the language that lands.

Resume match scoring. After generation, you get a score showing how well your resume aligns with the JD. Strengths listed. Gaps listed. So you can fix them before the recruiter sees it, not after.

Both features are covered in the architecture section below — they're a good example of how to add intelligence to a pipeline without slowing it down.

The first post was about the product. This one is about the engineering.

I'm going to walk through every layer of the architecture — what I chose, what I rejected, and why. Not because every decision was brilliant, but because the trade-offs are the interesting part.

The Full Stack at a Glance

Here's what's running:

DNS: GoDaddy
Frontend: Next.js (React)
Frontend hosting & CDN: Vercel
Bot protection: Cloudflare Turnstile
WAF & DDoS protection: Cloudflare WAF
Application server: AWS Lightsail
Backend framework: Directus (headless CMS + REST API)
Database: Neon (serverless Postgres)
File storage: Cloudflare R2
AI models: Claude Haiku + Claude Sonnet (Anthropic)
LaTeX compilation: Custom Node.js service + Linux TeX binary
Email delivery: Resend
Logging & uptime monitoring: BetterStack
Error tracking & analytics: PostHog

That's a lot. Let's go layer by layer.

The Request Path: From User to Server

A request to Synapse travels through three gates before it touches my application.

Gate 1 — Next.js on Vercel (UI + CDN)

The frontend is a Next.js app. I chose Next.js for its file-based routing, server components, and the fact that Vercel — its native host — handles the rest: global CDN, SSL termination, instant cache invalidation, and zero-config deployments on every push.

The UI isn't a thin wrapper. It owns the entire user flow across three meaningful steps:

Upload — PDF drag-and-drop, target market selection, optional JD input, Turnstile verification
Review — After AI extraction, the user sees their structured data and can edit it before generation fires. This is intentional. The AI extracts; the human confirms. That step keeps the user in control and dramatically reduces bad outputs.
Result — The generated resume renders inline with a download link, match score (if JD was provided), and the option to save via email.

Each step has its own loading state, error state, and retry path. The UI has to handle a pipeline that touches four different services — it can't just be a form.

Vercel sits at the edge in front of all of this. DNS lives on GoDaddy. Nothing interesting there — I'd move it to Cloudflare for unified control, but the migration isn't worth the friction yet.

Gate 2 — Cloudflare Turnstile

Before a user can submit their resume, they have to pass Cloudflare Turnstile — an invisible CAPTCHA that runs in the background without making users solve puzzles.

Why does this matter? My resume generation endpoint calls the Anthropic API, which costs money per request. Without bot protection, any automated script could drain my API credits in minutes. The Turnstile widget issues a token in the browser, and my server verifies that token with Cloudflare on every request. Bots don't pass. Humans don't notice.

Cost: $0.
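
The server-side check can be sketched roughly like this. It is a minimal sketch, not the production code: the siteverify URL and the secret/response/remoteip fields come from Cloudflare's Turnstile documentation, while the function name verifyTurnstile and the injectable fetchImpl parameter are assumptions made for illustration. It assumes Node 18+ for the global fetch.

```javascript
// Cloudflare's documented Turnstile verification endpoint.
const SITEVERIFY_URL = 'https://challenges.cloudflare.com/turnstile/v0/siteverify';

/**
 * Ask Cloudflare whether a Turnstile token is valid.
 * fetchImpl is injectable (hypothetical parameter) so the check can be
 * exercised without network access.
 */
async function verifyTurnstile(token, secret, remoteIp, fetchImpl = fetch) {
  // No token means the widget never ran: reject without calling Cloudflare.
  if (!token) return { ok: false, errors: ['missing-input-response'] };

  const body = new URLSearchParams({ secret, response: token });
  if (remoteIp) body.set('remoteip', remoteIp);

  const res = await fetchImpl(SITEVERIFY_URL, { method: 'POST', body });
  const data = await res.json();
  return { ok: data.success === true, errors: data['error-codes'] || [] };
}
```

In an endpoint handler, a check like this would run first, and the request would be rejected (for example with a 403) before any paid API call is made.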

Gate 3 — Cloudflare WAF

The WAF sits in front of my Directus instance on Lightsail. It handles rate limiting, blocks known malicious IP ranges, and absorbs DDoS traffic before it reaches my server. This is the layer that lets me sleep at night.

I configured custom rate-limiting rules specifically for the API endpoints — stricter limits on the generation routes, looser on the stats endpoints.

The Application Layer: Why Directus on Lightsail

This is the decision I get asked about most.

Why not just write a custom Express/Fastify API?

I considered it. The problem is that a resume tool needs more than API endpoints. It needs:

A relational database with proper schema management
File storage and serving
An admin panel to inspect requests, debug issues, and manage data
Role-based access control

Building all of that from scratch on a deadline is a classic trap — you spend two weekends writing plumbing instead of building the product. Directus gives you all of it out of the box, and you extend it with custom endpoint extensions written in plain Node.js.

The extension pattern is clean:

module.exports = {
  id: 'generate-resume',
  handler: (router, { services, env }) => {
    router.post('/', async (req, res) => {
      // your logic here
      // services.ItemsService talks to Postgres
      // services.FilesService manages file uploads
      // env carries all your environment variables
    });
  }
};

You write the business logic. Directus handles everything else.

Why Lightsail instead of EC2 or ECS?

Predictability. Lightsail is $20/month, full stop. EC2 with all the surrounding AWS services (ALB, RDS, ECR, ECS Fargate) can run you $80-150/month before you've handled any real traffic — and the billing surprises are real.

For a free side project where I'm absorbing all infrastructure costs, a fixed monthly bill isn't a compromise. It's a requirement.

The trade-off: Lightsail doesn't scale horizontally the way ECS does. If Synapse somehow goes viral and I need to run ten instances behind a load balancer, I'll need to migrate. But I've stopped building for hypothetical traffic spikes. That's a future problem.

The AI Pipeline: Two Models, Two Jobs

This is the part I'm most proud of.

The naive approach to building an AI resume tool is: send the PDF to a powerful model, get back a resume. One model, one prompt, done.

The problem is that this approach is slow and expensive. A larger model like Sonnet costs several times more per token than Haiku (the exact multiple depends on the model versions), and it takes longer to respond. If you use it for everything, you're overpaying for work that doesn't need that much intelligence.

I split the pipeline into two tiers.

Tier 1 — Claude Haiku (fast, cheap, parallel)

Haiku handles everything that's high-volume and doesn't require deep reasoning:

Resume validation — Is this actually a resume? Is it readable? Does it have enough content to work with?
Job description validation — Is this a real JD? Can we extract structured requirements from it? This runs in parallel with resume validation — the user doesn't wait for both sequentially.
Resume data extraction — Parse the PDF text into structured JSON: contact info, work history, education, skills, certifications.
Match scoring — After generation, how well does the resume match the JD? Returns a score, a list of strengths, and a list of gaps.

The JD features (customized generation + scoring) were the most-requested addition after launch. The interesting thing is they didn't require rearchitecting anything — they plugged into the existing two-tier model split cleanly. JD validation joins the parallel validation step. JD context passes into the Sonnet prompt as additional signal. Match scoring fires at the end, only when a JD is present. Three additions, zero new infrastructure.

Validation and extraction run in parallel. All of the calls fire at the same time, so the user waits for the slowest one, not the sum of them.
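
The parallel step can be sketched with Promise.all. This is an illustration of the pattern only: runHaikuTier and the three task functions (validateResume, extractResume, validateJd) are hypothetical names, each standing in for one Haiku call.

```javascript
/**
 * Run the Haiku-tier checks concurrently.
 * tasks holds hypothetical async functions, each wrapping one Haiku call.
 */
async function runHaikuTier(resumeText, jdText, tasks) {
  const jobs = [
    tasks.validateResume(resumeText),
    tasks.extractResume(resumeText),
    // JD validation only joins the batch when a JD was pasted.
    jdText ? tasks.validateJd(jdText) : Promise.resolve(null),
  ];

  // The calls above are already in flight. Promise.all resolves once the
  // slowest one finishes and rejects as soon as any of them fails.
  const [resumeCheck, extracted, jdCheck] = await Promise.all(jobs);
  return { resumeCheck, extracted, jdCheck };
}
```

One design consequence: if validation fails, the extraction call has already been paid for. With a cheap model that is usually an acceptable price for the lower latency.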

Tier 2 — Claude Sonnet (slower, higher quality)

Sonnet fires exactly once: when generating the LaTeX output.

This is the moment that matters to the user. They've reviewed and edited their extracted data. They've confirmed it's accurate. Now they need a document that will actually get them an interview. This is where you spend the extra tokens.

Sonnet receives the structured data and a market-specific prompt (US formatting conventions differ from UAE and India), and returns LaTeX source code ready for compilation.
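
A market-specific prompt could be assembled along these lines. Everything here is a hypothetical sketch: the real prompts are not shown in this post, so MARKET_RULES holds placeholders, and buildGenerationPrompt is an illustrative name.

```javascript
// Placeholder per-market guidance. The actual rules are not part of this post.
const MARKET_RULES = {
  US: '<US formatting conventions go here>',
  UAE: '<UAE formatting conventions go here>',
  INDIA: '<India formatting conventions go here>',
  GLOBAL: '<market-neutral conventions go here>',
};

/**
 * Build the single prompt sent to the larger model.
 * resumeData is the user-confirmed JSON from the review step;
 * jdText is optional and only appended when a JD was provided.
 */
function buildGenerationPrompt(market, resumeData, jdText) {
  const rules = MARKET_RULES[market];
  if (!rules) throw new Error(`Unknown market: ${market}`);

  const parts = [
    'Return only LaTeX source for a resume. Do not invent experience.',
    `Market conventions: ${rules}`,
    `Resume data (JSON): ${JSON.stringify(resumeData)}`,
  ];
  if (jdText) parts.push(`Target job description: ${jdText}`);
  return parts.join('\n\n');
}
```

Keeping the market rules in a lookup table means adding a market is a data change, not a code change.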

The split means I'm spending money where it earns its cost, and not spending it where it doesn't.

The LaTeX Compiler: Unglamorous and Effective

Once Sonnet returns LaTeX source, I need to compile it into a PDF.

I run a Node.js microservice on the same Lightsail host. It receives LaTeX via HTTP POST, writes it to disk, calls pdflatex (a Linux binary), and returns the compiled PDF as a buffer.
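
The core of such a service might look like the sketch below. It is not the actual Synapse code: compileLatex, the resume.tex file name, the 30-second timeout, and the injectable run parameter are assumptions. The pdflatex flags are standard TeX Live options.

```javascript
const { execFile } = require('node:child_process');
const { promisify } = require('node:util');
const fs = require('node:fs/promises');
const os = require('node:os');
const path = require('node:path');

const execFileAsync = promisify(execFile);

/**
 * Compile LaTeX source to a PDF buffer.
 * run is injectable (hypothetical parameter) so the flow can be tested
 * on a machine without TeX installed.
 */
async function compileLatex(source, run = execFileAsync) {
  // Each job gets its own scratch directory so concurrent jobs cannot collide.
  const dir = await fs.mkdtemp(path.join(os.tmpdir(), 'resume-'));
  try {
    await fs.writeFile(path.join(dir, 'resume.tex'), source, 'utf8');
    await run(
      'pdflatex',
      [
        '-interaction=nonstopmode', // never block waiting for user input
        '-halt-on-error',           // fail fast on the first LaTeX error
        '-no-shell-escape',         // the source is model-generated: no shell access
        'resume.tex',
      ],
      { cwd: dir, timeout: 30_000 }
    );
    return await fs.readFile(path.join(dir, 'resume.pdf'));
  } finally {
    // Remove the source, PDF, and aux/log files whether or not compilation worked.
    await fs.rm(dir, { recursive: true, force: true });
  }
}
```

Using execFile with an argument array avoids passing anything through a shell, which matters when the input ultimately comes from a language model.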

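Why not a managed service or a Lambda function? Because LaTeX compilation requires a full TeX distribution installed on the system, and mine is a 3GB install. That is far more than fits in a zipped Lambda deployment package (capped at 250 MB unzipped), and packaging it as a container image would mean maintaining a second deployment target for one binary. Running it as a persistent sidecar on the same machine that runs Directus is the simplest solution that works.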

Why LaTeX at all instead of HTML-to-PDF (Puppeteer, wkhtmltopdf, etc.)?

ATS (Applicant Tracking System) parsers are the reason. HTML-to-PDF conversion can produce PDFs with broken or out-of-order text layers: the PDF looks right to a human, but the parser extracts garbage. LaTeX generates PDFs with a predictable document structure: correct reading order, clean text layers, accurate glyph-to-character mapping. That leaves ATS parsers far less to mishandle.

The extra 3-5 seconds of compile time is worth it.

Storage: Cloudflare R2

Generated PDFs are stored in Cloudflare R2, accessed through Directus's FilesService.

The reason is simple: R2 has zero egress fees. S3 charges per GB transferred out. For a free tool where users download their resumes, egress costs would accumulate directly against my bill. R2 eliminates that entirely.

Files auto-delete after 24 hours by default. If a user saves their resume (by providing their email), the TTL extends to 30 days. This is the privacy model: minimal retention by default, explicit opt-in for longer storage.
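
The retention rule reduces to a small calculation. In this sketch the 24-hour and 30-day windows come from the text above, while the function names and the createdAt/savedByUser fields are hypothetical. How the deletion itself is triggered is not covered here.

```javascript
const HOUR_MS = 60 * 60 * 1000;
const DEFAULT_TTL_MS = 24 * HOUR_MS;    // unsaved resumes
const SAVED_TTL_MS = 30 * 24 * HOUR_MS; // user opted in by providing an email

// Compute the moment a file becomes eligible for deletion.
function expiresAt(createdAt, savedByUser) {
  const ttl = savedByUser ? SAVED_TTL_MS : DEFAULT_TTL_MS;
  return new Date(createdAt.getTime() + ttl);
}

// A cleanup job could call this for each stored file.
function isExpired(file, now = new Date()) {
  return now >= expiresAt(file.createdAt, file.savedByUser);
}
```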

Email: Resend

When a user saves their resume, I send them a download link via Resend. The email is sent from noreply@bilal.one.

Resend is the cleanest transactional email API I've used. One function call, reliable delivery, good deliverability out of the box. The free tier covers the volume I'm running.
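
For illustration, here is roughly what that one call looks like against Resend's REST API (the official SDK wraps the same request). The endpoint and the from/to/subject/html fields follow Resend's documentation. The function name, subject line, and email body are made up for this sketch, and fetchImpl is an injectable stand-in for the global fetch in Node 18+.

```javascript
const RESEND_URL = 'https://api.resend.com/emails';

/**
 * Send the "your resume is saved" email.
 * Subject and HTML body are placeholder copy, not the real template.
 */
async function sendDownloadLink(apiKey, to, downloadUrl, fetchImpl = fetch) {
  const res = await fetchImpl(RESEND_URL, {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${apiKey}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      from: 'noreply@bilal.one',
      to: [to],
      subject: 'Your resume download link',
      html: `<p><a href="${downloadUrl}">Download your resume</a></p>`,
    }),
  });

  // Surface API failures instead of silently dropping the email.
  if (!res.ok) throw new Error(`Resend request failed with status ${res.status}`);
  return res.json();
}
```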

Observability: PostHog + BetterStack

PostHog handles product analytics and error tracking. I track events through the resume generation pipeline — where users drop off, what markets they select, how often extraction succeeds vs. fails. Error tracking captures exceptions with full context.

BetterStack handles infrastructure logging and uptime monitoring. Every request logs to BetterStack. If the Lightsail instance goes down, I get paged.

Both are free at my current scale. Both would have been afterthoughts if I'd treated this like a hobby project — but I didn't. This is a real product. Observability is non-negotiable.

The Database: Neon (Serverless Postgres)

Neon is serverless Postgres with database branching — you can spin up isolated branch environments for testing without provisioning a separate database.

I use it through Directus's ItemsService abstraction. Schema changes are tracked in the repo. The branching feature has been useful for testing migrations without touching production data.

Privacy Architecture

This is built in from the start, not bolted on at the end.

Resumes auto-delete after 24 hours unless the user opts in to 30-day storage
No data is sold or shared
All traffic to the server is proxied through Cloudflare (the raw Lightsail IP is never exposed to users)
The Anthropic API processes data per-request — no training on submitted resumes
Email is only collected if the user explicitly requests a download link

The 24-hour window exists because the resume is generated for a moment in time — a specific job application. Keeping it indefinitely provides no value to the user and creates unnecessary liability for me.

What I'd Change

1. Move DNS to Cloudflare.
Managing DNS on GoDaddy while running everything else through Cloudflare is unnecessary friction. One migration, one less dashboard.

2. Containerize the Lightsail deployment.
Right now, Directus and the LaTeX compiler run directly on the host (a Lightsail instance), with no containers. If I ever need to scale, I'd want Docker Compose at minimum, and ideally ECS Fargate with proper task definitions. The current setup works but isn't portable.

3. Add a queue for the generation pipeline.
Right now, generation is synchronous — the HTTP request stays open while Haiku and Sonnet do their work. For a low-traffic tool, that's fine. Under load, it's a problem. A proper job queue (BullMQ or similar) would decouple submission from processing.
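
The decoupling can be shown with a deliberately tiny in-memory queue. This is not BullMQ and not production-ready: BullMQ keeps jobs in Redis so they survive restarts, whereas this sketch loses everything when the process exits. The class and method names are hypothetical, and it processes one job at a time.

```javascript
class GenerationQueue {
  constructor(worker) {
    this.worker = worker;   // async function doing the slow model + compile work
    this.jobs = new Map();  // jobId -> { status, result?, error? }
    this.pending = [];
    this.running = false;
    this.nextId = 1;
  }

  // Called by the submit endpoint: returns a job id immediately.
  submit(payload) {
    const id = String(this.nextId++);
    this.jobs.set(id, { status: 'queued' });
    this.pending.push({ id, payload });
    if (!this.running) {
      this.running = true;
      setImmediate(() => this.drain()); // start work after the response is sent
    }
    return id;
  }

  // Called by a polling endpoint.
  status(id) {
    return this.jobs.get(id) || { status: 'unknown' };
  }

  async drain() {
    while (this.pending.length > 0) {
      const { id, payload } = this.pending.shift();
      this.jobs.set(id, { status: 'active' });
      try {
        const result = await this.worker(payload);
        this.jobs.set(id, { status: 'done', result });
      } catch (err) {
        this.jobs.set(id, { status: 'failed', error: err.message });
      }
    }
    this.running = false;
  }
}
```

With this shape, the HTTP request that submits a job finishes in milliseconds, and the frontend polls the status endpoint until the job is done or failed.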

4. Better prompt versioning.
I store prompt templates in a Directus collection (ai_configurations) and can override them at runtime without a deploy. But I have no versioning, no rollback, and no A/B testing. For a product this dependent on prompt quality, that's a gap.
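
One possible shape for versioning with rollback, sketched in memory. PromptStore and its methods are hypothetical. In Directus the same idea would presumably map to one row per version plus a pointer to the active one, but that mapping is an assumption. A/B testing is out of scope for this sketch.

```javascript
class PromptStore {
  constructor() {
    this.versions = new Map(); // key -> array of { version, template }
    this.active = new Map();   // key -> active version number
  }

  // Publishing never overwrites: it appends a new version and activates it.
  publish(key, template) {
    const history = this.versions.get(key) || [];
    const version = history.length + 1;
    history.push({ version, template });
    this.versions.set(key, history);
    this.active.set(key, version);
    return version;
  }

  // Return the currently active { version, template } for a prompt key.
  get(key) {
    const history = this.versions.get(key);
    if (!history) throw new Error(`No prompt published for ${key}`);
    return history[this.active.get(key) - 1];
  }

  // Rollback only moves the pointer. Newer versions stay in the history.
  rollback(key, version) {
    const history = this.versions.get(key) || [];
    if (!history[version - 1]) throw new Error(`No version ${version} for ${key}`);
    this.active.set(key, version);
  }
}
```

Because old versions are never mutated, every generated resume can record the prompt version that produced it, which makes a bad output traceable to a specific template.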

The Lesson

The difference between a hobby project and a production product is the 80% of work nobody sees: the security layers, the privacy model, the observability setup, the error handling, the deployment pipeline.

Anyone can get an AI API to return text. The engineering is everything that happens before and after that.

Try Synapse: https://resume.bilal.one

If you found this useful, share it with someone who's grinding through a job search right now.

Next up: the prompt engineering behind the LaTeX generation — why market context matters, what makes a resume ATS-friendly at the prompt level, and how I handle edge cases like non-standard career histories.
