
I Built an AI Agent That Figures Out Your Content Niche — So You Don't Have To

"What kind of content do you create?"


Simple question, right? Except it's not. Not really.

Ask ten different creators that question and you'll get ten wildly different answers. A developer who live-codes on Twitch is nothing like a lifestyle creator on Instagram. A B2B thought leader on LinkedIn has completely different needs than a YouTube educator breaking down finance for Gen Z.

That's exactly the problem I ran into building my AI operator for content creators. I needed a way to actually understand each user before we could help them. A basic onboarding form wasn't going to cut it.

So I built an AI agent that holds a real conversation with users, asks the right follow-up questions, and produces a detailed niche profile — all during onboarding. Here's how it works, and more importantly, why I built it this way.

The Problem With Standard Onboarding

Most onboarding flows ask you to fill out a form. Pick your niche from a dropdown. Choose your goals. Click next.

The problem? Content creation doesn't fit in a dropdown.

The difference between "tech content" and "beginner-friendly Python tutorials for career switchers" is enormous. One is useless to an AI system trying to help you. The other tells you everything.

I needed onboarding to be a conversation — not a form. And I needed that conversation to be persistent, resumable, and capable of reasoning across multiple turns.

That's when I turned to LangGraph.

What Even Is LangGraph?

At a high level, LangGraph is a library for building stateful, multi-step AI workflows as a graph.

Think of it like a state machine for AI agents. You define nodes (things the agent can do) and edges (the rules for moving between them). What makes it powerful is that it handles state persistence natively — meaning your agent can pause, wait for external input (like a user answering a question), and pick back up exactly where it left off.

Here's the mental model: LangGraph is like a choose-your-own-adventure book where the AI decides which page to turn to — and can bookmark its place.

How I Structured the Agent

The agent has two nodes:

analyze — This is where the heavy lifting happens. It builds a prompt from the user's initial niche description (and any follow-up answers they've provided), calls the LLM with structured output via Zod schemas, and returns one of two things:

  1. A set of follow-up questions (when it needs more info)

  2. A completed finalNicheDescription (when it has enough to work with)

awaitUser — This calls LangGraph's interrupt() function, which pauses execution and saves the agent's full state to Postgres as a checkpoint. The agent literally stops running and waits. When the user submits their answers, the frontend resumes the graph by sending them back with a Command({ resume: answers }).
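Here's a minimal sketch of what that node can look like. The state fields and return shape are my assumptions, not the exact source:

// Sketch of the awaitUser node (field names are illustrative)
import { interrupt } from "@langchain/langgraph";

const awaitUser = (state: NicheAgentState) => {
  // interrupt() pauses the graph and checkpoints state; whatever the
  // frontend later sends via Command({ resume: answers }) is returned here
  const answers = interrupt({ questions: state.questions });
  return { userAnswers: answers };
};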

The routing logic is simple: after analyze, if the agent is done (or has looped 3 times), go to END. Otherwise, go to awaitUser and loop back to analyze once answers come in.

// Simplified edge logic: runs after every pass through analyze
const shouldContinue = (state: NicheAgentState) => {
  // Finish when the agent is confident, or when the 3-iteration cap is hit
  if (state.isComplete || state.iterationCount >= 3) return "END";
  return "awaitUser";
};

That 3-iteration cap is important. More on that in a second.
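For context, here's roughly how that routing slots into the graph. This is a sketch: the state annotation is a cut-down stand-in, and the "END" string returned above gets mapped to LangGraph's real END node in the edge config.

import { Annotation, StateGraph, START, END } from "@langchain/langgraph";

// A trimmed version of the state; the real one has more fields
const NicheAgentStateAnnotation = Annotation.Root({
  initialNiche: Annotation<string>(),
  questions: Annotation<string[]>(),
  isComplete: Annotation<boolean>(),
  iterationCount: Annotation<number>(),
});

const workflow = new StateGraph(NicheAgentStateAnnotation)
  .addNode("analyze", analyze)
  .addNode("awaitUser", awaitUser)
  .addEdge(START, "analyze")
  .addConditionalEdges("analyze", shouldContinue, {
    END: END, // map the "END" string returned above to the terminal node
    awaitUser: "awaitUser",
  })
  .addEdge("awaitUser", "analyze");

const graph = workflow.compile({ checkpointer }); // checkpointer: next section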

The Part Most Tutorials Skip: Checkpointing Across HTTP Requests

Here's where things get interesting.

Each time the user submits answers, they're making a new HTTP request. The agent has no memory between requests — unless you explicitly persist it somewhere. This is where LangGraph's Postgres checkpointer comes in.

Every time the graph hits an interrupt(), it saves the entire agent state to Postgres. When the next request comes in with the same threadId, the graph loads that checkpoint and resumes as if nothing happened.
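Setting that up takes only a few lines. A sketch, assuming the official JS checkpointer package and a connection string in an environment variable:

import { PostgresSaver } from "@langchain/langgraph-checkpoint-postgres";

// One checkpointer instance, shared across requests
const checkpointer = PostgresSaver.fromConnString(process.env.DATABASE_URL!);
await checkpointer.setup(); // creates the checkpoint tables on first run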

// Route handler: start vs resume
import { Command } from "@langchain/langgraph";

let input;
if (action === "start") {
  input = { initialNiche: niche };
} else {
  // Resuming: these answers become the return value of interrupt()
  input = new Command({ resume: answers });
}

// The returned stream is then consumed to emit SSE events (more on that later)
const stream = await graph.stream(input, {
  configurable: { thread_id: threadId },
});

The threadId is what ties everything together. It gets sent to the client on the first questions SSE event, and the client sends it back with every subsequent request. Lose the threadId, and the graph can't find its checkpoint — the conversation is gone.

This is also why the agent survives server restarts. The state lives in Postgres, not in memory.

Where People Go Wrong: Over-Questioning

Here's a mistake I almost made.

My first instinct was to let the agent ask as many questions as it needed until it was fully confident. Makes sense in theory. In practice, it's a disaster.

Users don't want to answer six rounds of questions during onboarding. They'll drop off. They'll give lazy answers. They'll start saying "I don't know" just to get through it.

The 3-iteration cap forces the agent to be decisive. By the third loop, it has to commit to a finalNicheDescription regardless of how confident it feels. This actually made the output better, not worse — it stopped the agent from nitpicking and made it prioritise the signal it already had.

Constraints make AI systems more useful, not less.

The Structured Output Problem

One thing I got wrong early on: trusting the LLM to return consistent JSON.

Even with a well-crafted prompt, raw LLM output is unpredictable. The agent might return valid JSON on one call and a slightly different shape on the next. That breaks your downstream code.

The fix was using withStructuredOutput(analysisSchema) — where analysisSchema is a Zod schema that the LLM must conform to. If it doesn't, the call fails and you get a proper error instead of silently broken data.

// Bind the Zod schema so every call returns data in that shape, or throws
const model = getModel(AI_MODELS.GROK_REASONING)
  .withStructuredOutput(analysisSchema);

The output schema captures everything the agent might return: reasoningSteps, isComplete, questions[], and finalDescription (which itself has a detailed shape: niche, subNiche, contentPillars, targetAudience, tone, and whether the creator covers current affairs).
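For the curious, analysisSchema would look something like this. It's a sketch reconstructed from the fields above; the exact sub-shapes (the enum values especially) are my assumptions:

import { z } from "zod";

// Assumed shape of the final profile (field names match the output below)
const finalDescriptionSchema = z.object({
  niche: z.string(),
  subNiche: z.string(),
  contentPillars: z.array(z.string()),
  targetAudience: z.string(),
  tone: z.string(),
  doesCurrentAffairs: z.enum(["yes", "no", "sometimes"]),
});

const analysisSchema = z.object({
  reasoningSteps: z.array(z.string()),
  isComplete: z.boolean(),
  questions: z.array(z.string()),
  finalDescription: finalDescriptionSchema.optional(),
});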

That last field — doesCurrentAffairs — sounds minor but it genuinely changes how the AI operator writes content. Small structured fields like that pay off downstream.

Streaming the Reasoning to the UI

Users get nervous when nothing happens. A spinner for 10 seconds feels broken.

Since the agent does real multi-step reasoning before deciding what to ask, I wanted to stream those reasoning steps to the UI in real time via Server-Sent Events (SSE). Each step shows up as the agent "thinks through" the creator's niche.

The route handler inspects each LangGraph update as it streams and emits typed SSE events (there's a sketch after the list):

  • reasoning_step — intermediate thinking, shown progressively

  • questions — the follow-up questions for the user (includes threadId)

  • final_result — the completed niche profile
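Here's roughly what that emission loop looks like. It's a sketch, not the real handler: I'm assuming the graph is streamed with streamMode "updates", so each chunk is keyed by the node that produced it.

// Sketch: translate LangGraph updates into typed SSE events
const encoder = new TextEncoder();

const body = new ReadableStream({
  async start(controller) {
    const send = (event: string, data: unknown) =>
      controller.enqueue(
        encoder.encode(`event: ${event}\ndata: ${JSON.stringify(data)}\n\n`)
      );

    for await (const update of stream) {
      if (update.analyze?.reasoningSteps) {
        send("reasoning_step", update.analyze.reasoningSteps);
      }
      // ...emit "questions" (including the threadId) and "final_result"
      // in the same way, depending on what the update contains
    }
    controller.close();
  },
});

return new Response(body, {
  headers: { "Content-Type": "text/event-stream" },
});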

On the client side, the component reads the SSE stream manually, buffers partial chunks, and updates state as events arrive. It's a bit fiddly, but the UX payoff is worth it — users can see the agent working, which builds trust.
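If you've never done the manual read, it looks something like this (the endpoint name and parsing details are illustrative):

// Sketch: reading the SSE stream by hand on the client
const res = await fetch("/api/niche-agent", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ action: "start", niche }),
});

const reader = res.body!.getReader();
const decoder = new TextDecoder();
let buffer = "";

while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  buffer += decoder.decode(value, { stream: true });

  // SSE events are delimited by a blank line; keep partial events buffered
  const events = buffer.split("\n\n");
  buffer = events.pop() ?? "";

  for (const raw of events) {
    // parse the "event:" and "data:" lines, then update component state
  }
}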

What Does the Final Output Look Like?

After at most 3 loops, the agent returns a DetailedNiche that looks something like this:

{
  "niche": "Software development",
  "subNiche": "Backend engineering for self-taught developers",
  "contentPillars": ["System design", "API architecture", "Career growth"],
  "targetAudience": "Self-taught developers transitioning to mid-level roles",
  "tone": "Practical, no-fluff, peer-to-peer",
  "doesCurrentAffairs": "sometimes"
}

This gets persisted to Supabase via an upsert on user_id, and every subsequent AI feature in the platform uses it as context. The onboarding agent essentially writes the creative brief that the rest of the system follows.
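That persistence step is nearly a one-liner with supabase-js. The table name here is a guess; the onConflict key comes from the post:

import { createClient } from "@supabase/supabase-js";

const supabase = createClient(
  process.env.SUPABASE_URL!,
  process.env.SUPABASE_SERVICE_ROLE_KEY!
);

// Upsert keyed on user_id, so re-running onboarding overwrites the old profile
await supabase
  .from("niche_profiles")
  .upsert({ user_id: userId, ...detailedNiche }, { onConflict: "user_id" });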


When Should You Build Something Like This?

If your product serves a wide spectrum of users and the type of user fundamentally changes what your product should do — you probably need personalised onboarding.

A generic onboarding form gets you a generic product experience. An agent that asks the right questions gets you a product that feels like it was built specifically for that user.

The LangGraph + Postgres checkpointing pattern is the right call when:

  • Your onboarding requires multiple turns (back-and-forth, not just a form submit)

  • You need the conversation to survive page refreshes and server restarts

  • The output needs to be structured and validated (not just a free-text summary)

If your onboarding is a single form with five fields, you don't need any of this. But if you're building a product where "who is this user, exactly?" is a hard question — this pattern is worth every line of complexity.