
17 min read
Notion AI Customer Research: How a $10B Company Decides What to Build
TL;DR
Notion's customer research practice is the clearest case study in modern SaaS for what happens when a CEO refuses to outsource learning about users. Co-founder Ivan Zhao personally interviewed early users for years, and that habit cascaded into a product-development culture where talking to customers is treated as the primary research instrument — not a quarterly survey, not an NPS dashboard, not a synthetic persona. By the time Notion crossed a reported $10 billion valuation in 2021 and shipped Notion AI in 2023, the company had institutionalized a five-part research practice: founder-led 1:1 interviews, public roadmap signal-gathering, beta cohorts as live research panels, in-product feedback woven into shipping decisions, and qualitative validation before any GA launch. The lesson for product teams in 2026 is not "have your CEO do interviews forever" — it's that conversation, not the survey field, is the primitive that should sit at the bottom of your research stack. As Notion scaled past 100 million users, the bottleneck shifted from collecting conversations to running them at scale, which is exactly the problem Perspective AI was built to solve.
Why Notion Is the SaaS Industry's Most-Cited Research Culture
Notion is referenced more than almost any other SaaS company when product leaders talk about "research culture" because the founder built the company from inside the conversation, not from inside a Looker dashboard. Public talks by Ivan Zhao at First Round Capital's CEO Summit, his appearance on Lenny Rachitsky's podcast, and Notion's own engineering blog all describe the same loop: a small team, a strong opinion about what the product should be, and a relentless cadence of customer conversations that update that opinion week by week.
That practice produced three observable outcomes. First, Notion shipped a product that was famously polarizing in 2018 and famously sticky by 2022 — a sign that the team was iterating on real signal rather than aggregate sentiment. Second, the company has consistently launched features (Notion AI, Notion Calendar, Notion Mail) that arrive feeling like answers to questions users had already asked, rather than guesses leadership made in a roadmap meeting. Third, when Notion needed to validate the largest product bet in its history — wrapping a foundation model into a workspace assistant — it did so through customer conversations, beta cohorts, and qualitative iteration, not through a marketing survey.
For any product team trying to operationalize discovery, that combination of inputs and outcomes is what makes Notion's playbook worth studying. The rest of this post breaks the playbook down into the moves any team can borrow, regardless of whether the founder still does the interviewing.
Move 1: Founder-Led Customer Interviews as the Primary Research Instrument
Ivan Zhao's interview obsession is the foundational move and the most under-appreciated one. In multiple public conversations — including his First Round Review interview — Zhao has described personally talking to users for years, often through video calls, often weekly, and often without a research team mediating the conversation. The pattern matters more than the volume.
The pattern is "founder hears the literal words customers use." Notion's product copy, onboarding sequences, and template library are downstream of those conversations. When users described using Notion as "Lego for software," that phrase made it into Zhao's own pitch language. When users described "wanting one tool instead of five," that sentence shaped Notion's all-in-one positioning. The interview is doing two jobs at once — it is generating product insight, and it is generating the marketing language that makes the product easier to sell.
The version other teams can copy is not "have your founder run all interviews forever." That doesn't scale past Series B. The lesson is that someone with decision authority should be in the conversation directly. When a PM, a designer, or a CEO with shipping power hears the literal words users use, the gap between research and roadmap shrinks. That's the same insight behind the continuous discovery habits framework Teresa Torres documented — discovery is a team sport, but it loses its power when the team running it is one degree removed from the decisions.
This is also where many SaaS teams hit a ceiling. By the time a company reaches 10 million users, the founder cannot personally interview a representative sample. The choice is then: stop interviewing (most companies), hire a research team that produces decks executives skim (many companies), or scale the conversation itself so leaders still hear customer language directly (rare, and what the customer research at scale playbook is built around).
Move 2: The Public Roadmap as Always-On Signal Layer
Notion's public-facing roadmap and feature-request board are the second move. Anyone with a Notion account can submit a feature request, vote on others, and read the team's responses. That's not unusual — Linear, Canny, and Productboard all enable similar workflows. What's unusual is how Notion treats the data.
The roadmap is not the source of truth for what gets built. It is a signal layer that flags candidates for deeper investigation. A request with 5,000 votes triggers interviews, not a build ticket. The discipline is critical: vote counts measure intent in the loosest possible way, while interviews capture the why behind the intent. Without the qualitative follow-up, the team would build the most-voted features rather than the most-valuable ones, which are not the same thing.
The reason this matters for any modern product org is that votes-as-signal is the modern equivalent of the survey form — it flattens preference into a single dimension. A request like "build a mobile app" stacks 20,000 votes from users with totally different jobs to be done. Some want offline access for travel. Some want a quick-capture inbox. Some want the full desktop experience on a phone. Building the literal "mobile app" without disambiguating those jobs would have produced a product that satisfied none of them. Notion's discipline of routing high-vote items into structured interview cohorts is the move that turned the public roadmap from a noise generator into a research input.
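The triage discipline described above can be sketched in a few lines. This is an illustrative sketch, not Notion's actual system — the threshold, names, and data shapes are all hypothetical:

```python
# Hypothetical sketch of the roadmap-triage rule: high-vote requests
# trigger interview cohorts, never build tickets directly.
from dataclasses import dataclass, field

INTERVIEW_THRESHOLD = 5000  # assumed vote count that flags a candidate

@dataclass
class FeatureRequest:
    title: str
    votes: int
    jobs_to_be_done: list[str] = field(default_factory=list)  # filled in later, by interviews

def triage(requests: list[FeatureRequest]):
    """Route high-vote requests into an interview queue, not a build queue."""
    interview_queue = [r for r in requests if r.votes >= INTERVIEW_THRESHOLD]
    watchlist = [r for r in requests if r.votes < INTERVIEW_THRESHOLD]
    return interview_queue, watchlist

requests = [
    FeatureRequest("build a mobile app", votes=20000),
    FeatureRequest("better tables", votes=6200),
    FeatureRequest("custom emoji packs", votes=340),
]
to_interview, to_watch = triage(requests)
# Nothing in to_interview gets built until interviews attach the
# underlying jobs (offline access, quick capture, full desktop parity...)
```

The point of the sketch is the empty `jobs_to_be_done` field: the vote count only nominates a candidate, and the qualitative follow-up is what makes it actionable.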
This is also exactly the kind of work that jobs-to-be-done interviews at scale are designed for — moving from "what do users want" to "what job is this hiring our product to do."
Move 3: Beta Cohorts as Live Research Panels (the Notion AI Validation Path)
The validation path for Notion AI is the cleanest available example of how Notion uses cohorts as research panels rather than as launch hype machines. When Notion AI was first announced in late 2022, it shipped to a closed beta of users who had requested early access — and importantly, the team treated those users as a research instrument, not just as early adopters.
The beta cohort served three functions. First, it told the team which use cases users gravitated toward in practice — summarization vs. translation vs. Q&A vs. content generation. Public commentary from the Notion team in Lenny Rachitsky's interview described being surprised by how much usage clustered around in-context summarization and translation, which shaped the GA feature set. Second, the cohort surfaced the pricing-and-packaging questions early — would users pay an add-on, would they expect it free, what would tip them to churn? Third, it provided a stress test for the underlying model integration — failure modes, latency tolerance, and content-quality thresholds.
None of those questions could have been answered well by a survey. They required watching real users solve real jobs in their actual workspaces, then asking follow-up questions when something looked weird. That's a research instrument. And it's the kind of qualitative validation that gets cited in pieces like the continuous discovery stack for AI-first product teams — because the AI features that ship well are the ones validated through conversation, not through quant adoption metrics.
The lesson generalizes well past AI launches. Any product team with a beta or early-access cohort has a research panel sitting in front of them. Most companies use those cohorts as an NPS-collection device. Notion uses them as the qualitative wind tunnel before GA.
Move 4: In-Product Feedback Woven Into Shipping Decisions
The fourth move is the part most product teams miss. Notion treats in-product feedback — the comments, the bug reports, the small requests embedded in workspaces — as a high-signal stream that gets routed back to PMs and designers, not just to support tickets. This is documented in their engineering blog and discussed in multiple public talks.
The mechanism is mundane but the effect is large. A user types a feature request into a doc. That doc is captured (with permission) into the product team's research backlog. Patterns across many docs become candidates for investigation. The team then reaches back out to users who flagged the same theme and runs structured interviews — closing the loop the user started.
Two things make this work in practice. The first is that the team treats the request as a starting point, not a ticket. A user asking for "better tables" gets followed up on, because "better tables" can mean any of: keyboard navigation, formula support, large dataset performance, mobile editing, or import/export. Without the follow-up, the request is unactionable. With the follow-up, it becomes a job to be done.
The second is that the request volume becomes manageable only because the synthesis layer is fast. This is where the modern stack matters. Notion has a research team and engineering investment that lets them turn a noisy stream of in-product feedback into a structured signal — but most companies in the 50-500 person band don't. For those teams, the modern equivalent is to run conversational follow-ups at the moment of feedback, which is what an AI interviewer can do. (See the customer research tools stack modern teams use for what that looks like in practice.)
Move 5: Qualitative Validation Before Any GA Launch
The fifth and final move is a discipline more than a process. Notion does not GA-launch a feature that hasn't been qualitatively validated. That sounds obvious; it isn't. Most SaaS companies ship features when engineering is done, marketing is ready, and the launch calendar fits. Notion ships when the qualitative signal says the feature is solving a job customers actually have, in a way that fits how they already use the product.
You can see this in the staggered rollout of Notion AI features through 2023-2024. Q&A shipped after summarization because the qualitative validation cycle for Q&A — which depends on workspace context quality — took longer than for summarization. Notion Calendar shipped slowly because the qualitative work to figure out what "calendar inside Notion" should mean was harder than building the integration. The team explicitly delayed launches when interviews flagged that the feature wasn't yet fitting the job.
This is the move that separates research-driven product cultures from research-decorated ones. A research-decorated culture runs studies, produces decks, then ships on a marketing schedule. A research-driven culture lets the qualitative signal change the launch date. Notion is the latter, and it's the cultural choice that other 100-person SaaS teams find hardest to copy because it requires executive willingness to delay.
The Notion Playbook in a Comparison Table

| Move | What Notion does | What to borrow |
| --- | --- | --- |
| 1. Founder-led interviews | Ivan Zhao personally interviewed users weekly for years | Keep someone with shipping authority in the conversation |
| 2. Public roadmap as signal layer | High-vote requests trigger interviews, not build tickets | Treat votes as candidates for investigation, not a source of truth |
| 3. Beta cohorts as research panels | Notion AI's closed beta surfaced use cases, pricing questions, and failure modes | Use early-access cohorts as a qualitative wind tunnel |
| 4. In-product feedback routing | Requests are captured into the research backlog and followed up with interviews | Route feedback to PMs and designers, not just support tickets |
| 5. Qualitative gating before GA | Launches are delayed until interviews confirm the feature fits the job | Let qualitative signal change the launch date |
Where Conversation-At-Scale Would Change Notion's Process
Here's the editorial honesty: Notion's research practice scales beautifully up to a point, and then it stops. With 100 million users and growing, the team cannot personally interview a representative slice of the user base. They cannot follow up on every in-product feedback note. They cannot run structured cohorts for every beta. The bottleneck has shifted from "do we run customer research" (yes, obviously) to "can we run enough conversations to keep the qualitative signal high-resolution at scale."
This is where a tool like Perspective AI fits the modern playbook. The constraint is not that founder interviews are bad — they're the gold standard. The constraint is that the founder cannot do 50,000 of them. AI interviewers solve that math problem: they run the conversation, follow up on vague answers ("you said the new editor felt slow — slow on what?"), capture the why behind a request, and feed structured insight back to the team. They do not replace the founder interview. They give the team a way to keep that qualitative depth at the scale the user base actually demands.
The research-at-scale problem is what resources like the 2026 mid-year state of AI customer interviews and the JTBD interview playbook for product teams are designed to address. It's also why named-company case studies on research culture matter — they show what the practice looks like when it's running well, so the rest of us can borrow the moves.
Lessons Across the Notion / Stripe / Klarna Trio
Notion is one of three named SaaS case studies worth reading together. Stripe's onboarding philosophy shows what conversion-obsession looks like at the activation layer — Stripe replaces forms with progressive disclosure because friction is existential for payments. Klarna's AI customer service deployment shows what happens when conversation replaces a 700-agent support function — controversial, instructive, and now widely studied. Notion shows what conversation looks like when it sits at the bottom of the research stack rather than the support stack.
Together, the three companies show that "AI conversations at scale" is not one feature category. It is a set of moves a modern company makes across product, support, and research — each of which trades a form, a script, or a ticket for an actual conversation. As Bloomberg and the Financial Times have both noted in coverage of the AI-native SaaS wave, the companies that win this cycle are the ones that put conversational interfaces at the center of how they learn from users, not just how they serve them.
What Product Teams in Any Growth Stage Should Borrow
For early-stage teams (pre-Series B): copy Move 1 directly. The founder runs interviews. There is no research team yet, and that's an advantage. The fastest way to lose this advantage is to hire a research function before the founder has internalized customer language.
For Series B-C teams: layer Moves 2-4 in. The public roadmap, the beta cohort discipline, the in-product feedback routing — these are all systems-level moves that survive past founder-led interviewing. The transition is hardest here because the founder can't do all the interviews anymore but the team hasn't built the muscle to take over.
For growth-stage and public companies: Move 5 is the one to enforce. Qualitative gating before GA. The infrastructure to do this at scale — including AI interviewers and conversational research instruments — exists in 2026 in a way it didn't in 2018. Use it. The constraint is no longer "we can't run enough interviews." The constraint is whether leadership is willing to delay a launch when the qualitative signal says the feature isn't ready.
Frequently Asked Questions
How does Notion actually do customer research?
Notion combines five practices: founder-led 1:1 interviews, a public roadmap that surfaces signal candidates, beta cohorts used as structured research panels, in-product feedback routed back to PMs as research backlog, and qualitative validation gating before any GA launch. The combination — not any single move — is what makes the practice work. Most teams adopt one or two of the moves; the multiplier comes from running all five as a coordinated discovery system.
Did Ivan Zhao really do customer interviews himself?
Yes, and he has discussed it publicly multiple times — including on Lenny Rachitsky's podcast and in First Round Review's coverage of Notion's early years. Zhao personally interviewed users for years, weekly, often through video calls. The pattern is well-documented in industry coverage and is widely cited as a foundational element of Notion's product culture. The lesson generalizes: someone with shipping authority should be hearing customer language directly, even if it's not the CEO forever.
How did Notion validate Notion AI before launching it?
Notion AI was released into a closed beta of users who had requested early access, and that cohort functioned as a structured research panel rather than as launch hype. The team used the beta to identify which AI use cases users actually gravitated toward (summarization and in-context translation clustered higher than expected), to stress-test pricing-and-packaging assumptions, and to surface failure modes in the underlying model integration before GA. The qualitative signal shaped the GA feature set.
Can a SaaS company without a famous founder copy this playbook?
Yes — the playbook is portable and most of it does not depend on having a celebrity CEO. The transferable moves are: route in-product feedback to PMs as research backlog (not just support tickets), use beta cohorts as research panels, qualitatively validate features before GA, and keep someone with shipping authority in the conversation directly. Founder-led interviewing is the easiest move when the team is small, but the practice scales through AI interviewers and structured cohorts as the user base grows.
What's the role of AI interviewers in a Notion-style research practice?
AI interviewers solve the scale problem that breaks founder-led research at the 10-million-user threshold. They run the conversation, follow up on vague answers, capture the why behind feature requests, and route structured insight back to the team. They do not replace the founder interview as the gold standard — they extend that depth of conversation to a sample size the founder can't personally cover. Tools like Perspective AI exist because every research-driven product culture eventually hits the bottleneck Notion is now navigating.
Where does Notion's research practice still have gaps?
The honest gap is scale. With 100 million users, Notion cannot personally interview a representative slice of the user base, and structured cohorts cover a small fraction of the feature surface area. The team has institutional discipline that compensates — but the bottleneck has shifted from "do we run research" to "can we run enough conversations to keep the qualitative signal high-resolution at scale." That's the problem AI conversational research tools are built to solve, and it's the next frontier for any SaaS company past Series D.
Conclusion: What Notion's Research Practice Teaches About Scaling Conversation
Notion's customer research practice is worth studying because it answers the central question of modern product development: how does a $10B company keep deciding what to build with the same fidelity it had at 10 users? The answer is not "more dashboards." It is to keep conversation at the bottom of the research stack, treat every cohort as a research panel, gate launches on qualitative signal, and invest in the infrastructure that lets the practice scale without losing depth.
For product, research, and CX teams running into the same bottleneck — too many users, too few real conversations, too much survey data, too little qualitative signal — the takeaway is direct. The form-and-dashboard layer is not where insight comes from. The conversation is. Notion's practice is the proof point. The infrastructure to run that conversation at scale is what Perspective AI was built to provide. If you are scaling past the point where founder-led interviewing covers the user base, the next move is to make conversation a first-class instrument in your research stack — and to run it at the volume your customer base actually demands. See how it works or start a study to bring the Notion research discipline to your own product.