Scale AI's Forward Deployed Engineers: How the $14B Data Labeling Leader Embeds Engineers With Enterprise AI Buyers

TL;DR

Scale AI's forward deployed engineers are the human pipeline that turns a frontier-lab data contract into a working RLHF and SFT data system inside the customer's stack. Founder Alexandr Wang ran the original FDE play himself: sitting in customer offices, writing labeling guidelines by hand, and shipping the first versions of what became Scale's Data Engine. By 2026 that solo motion has scaled into a function of a reported 200+ forward deployed engineers, applied scientists, and data ops leads embedded with OpenAI, Microsoft, Meta, and the U.S. Department of Defense. The unit's job is unusual: discover what "good" looks like for a model the customer is still inventing, translate that into a labeling rubric, and rebuild the pipeline every time the model improves. Scale's $14B+ valuation, the $14.3B Meta investment that landed Wang as Meta's Chief AI Officer in mid-2025, and the Donovan / Thunderforge defense contracts all flow from this embedded model. This post breaks down how the FDE function works, what an engagement actually looks like week-to-week, and which parts of the Scale playbook other AI companies, from Anthropic to Cohere to Mistral, are already copying.

What is Scale AI's forward deployed engineering function?

Scale AI's forward deployed engineering function is the team of engineers, applied scientists, and data ops leads who embed inside customer environments — frontier labs, defense agencies, and Global 2000 enterprises — to design RLHF and supervised fine-tuning data pipelines that produce the labeled examples those customers need to train and align their models. Unlike a traditional solutions engineer who configures a SaaS product, a Scale FDE co-designs the customer's data spec, runs the discovery loop with the customer's research scientists, and operates the labeling workforce that feeds it. The function exists because frontier model training is a moving target: what counts as a "good" response from a reasoning model in 2026 is not what counted as good in 2024, and only an embedded team can keep the rubric and the pipeline in sync.

Scale FDEs sit at the intersection of three roles that used to live in different companies: a labeling vendor's project manager, a frontier lab's research engineer, and a management consultant's discovery lead. That hybrid is the part of the model competitors are now trying to copy, and it is arguably the hottest AI role of 2026.

How Scale AI built its FDE function — from Alexandr Wang's early-days operator role to a 200+ person org

Scale AI built its FDE function by industrializing the customer-embedded motion Alexandr Wang ran personally as a 19-year-old founder. Wang dropped out of MIT in 2016 and launched Scale (then "Scale API") through Y Combinator with co-founder Lucy Guo. The early customer set — Cruise, Waymo, Zoox, GM, and Toyota — all had the same problem: autonomous vehicle teams needed millions of labeled lidar, radar, and camera frames, and existing crowd-labeling vendors couldn't hold the quality bar. Wang's response was to fly to the customer, sit with their perception scientists, and write the labeling guidelines on a whiteboard until the spec was tight enough to hand to operators.

That solo operator-founder loop became the FDE template. By 2021 Scale had layered an applied-engineering team between Wang and the customer to scale the motion. By 2023, with the ChatGPT wave dragging every AI company into RLHF, the unit pivoted from autonomous-vehicle labeling to language-model alignment data, and the customer set rotated to OpenAI, Microsoft, Meta, Anthropic-adjacent partners, and the U.S. government. By 2026 the function is reportedly 200+ people — a mix of forward deployed engineers, applied scientists, data quality leads, and the program managers who run labeler workforces of 100,000+ contributors across 20+ languages.

The inflection point was the Meta deal announced in June 2025: Meta took a 49% non-voting stake in Scale at a $29B post-money valuation in a transaction Bloomberg reported as a $14.3B investment, and Wang himself moved to Meta as Chief AI Officer to run the new Superintelligence Lab. The deal materially repriced what an FDE-led data company is worth — and signaled to every AI startup that "embedded data engineering" is not a cost center but a strategic moat. Other AI labs and customers responded by accelerating their own forward deployed engineering builds, often using Scale's structure as a reference.

Inside a Scale AI FDE engagement — the customer-discovery → data-spec → pipeline-build loop

A Scale FDE engagement runs in a continuous three-phase loop: customer discovery, data spec authoring, and pipeline build-and-iterate. Each loop spans roughly two to six weeks before the team kicks off the next iteration, and a frontier-lab account typically has three to five loops running in parallel across model checkpoints.

Phase 1 — Customer discovery. The FDE pair (usually one engineer plus one applied scientist) sits with the customer's research team and runs structured interviews: what is the model failing at, what does "good" look like for the next checkpoint, where is the existing data thin, what new behaviors need to be elicited. This is not a kickoff call — it's the same kind of discovery loop forward deployed engineers run at OpenAI and Anthropic, just industrialized. Teams that get this phase wrong end up labeling the wrong thing for six weeks. Inside Scale the phase is treated as a research function, not a sales function — the FDE is expected to surface customer truths the customer's own PMs missed.

Phase 2 — Data spec authoring. The FDE turns the discovery output into a formal labeling rubric: a multi-page document defining what counts as a correct response, a preferred response, a rejected response, and an edge case. Modern RLHF specs run 40–120 pages and include positive examples, negative examples, decision trees for ambiguity, and language-specific guidance. The spec is co-signed by the customer's research lead before any labeler sees a task — because a wrong rubric, executed at scale across 10,000 labelers, produces a mountain of toxic training data.
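To make the four rubric categories concrete, here is a minimal Python sketch of how a single labeled response could be represented and checked against a spec. It is an illustration only, not Scale's actual schema: the names `Verdict`, `LabeledResponse`, `rubric_section`, and `validate`, and the rule that rejected and edge-case verdicts need a written rationale, are all assumptions made for this example.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    # The four categories the rubric is described as defining.
    CORRECT = "correct"
    PREFERRED = "preferred"
    REJECTED = "rejected"
    EDGE_CASE = "edge_case"


@dataclass(frozen=True)
class LabeledResponse:
    prompt_id: str
    response_text: str
    verdict: Verdict
    rubric_section: str  # hypothetical pointer into the spec, e.g. "4.2"
    rationale: str = ""


def validate(record: LabeledResponse) -> list[str]:
    """Return a list of problems; an empty list means the record passes."""
    problems = []
    if not record.response_text.strip():
        problems.append("empty response")
    if not record.rubric_section:
        problems.append("missing rubric section")
    # Assumed policy: rejections and edge cases must be explained so the
    # rubric authors can fold them back into the next spec revision.
    needs_rationale = record.verdict in (Verdict.REJECTED, Verdict.EDGE_CASE)
    if needs_rationale and not record.rationale.strip():
        problems.append("rationale required for rejected/edge-case verdicts")
    return problems
```

Tying every label to a rubric section is one plausible way to trace a bad batch of labels back to the ambiguous clause that produced it.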

Phase 3 — Pipeline build-and-iterate. The FDE stands up the actual labeling pipeline: workforce selection (PhDs for advanced math reasoning, MDs for medical, lawyers for legal RLHF), tooling configuration inside Scale's Data Engine, golden-set construction, calibration runs, and inter-annotator agreement gates. As real data starts flowing, the FDE monitors quality dashboards daily, runs spot audits, retrains labelers on edge cases the spec didn't cover, and feeds counter-examples back into the rubric. When the customer ships the next model checkpoint, the loop restarts — usually with a different failure mode.
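An inter-annotator agreement gate of the kind mentioned above can be sketched with Cohen's kappa, a standard chance-corrected agreement statistic for two annotators. The article does not say which metric or threshold Scale uses, so treat the choice of kappa, the function names, and the 0.7 cutoff as assumptions made for illustration.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items where both labels match.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if each annotator labeled at random with
    # their own observed label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1.0 - expected)


def passes_agreement_gate(labels_a, labels_b, threshold=0.7):
    # threshold=0.7 is an illustrative cutoff, not a documented value.
    return cohens_kappa(labels_a, labels_b) >= threshold
```

For example, `passes_agreement_gate(["preferred", "preferred", "preferred", "rejected"], ["preferred", "preferred", "rejected", "rejected"])` returns `False`: raw agreement is 75%, but kappa is only 0.5 once chance agreement is removed. This is why a gate on raw agreement alone can look healthier than the labels really are.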

The reason this is hard to copy is that it requires three muscles in one team: research-grade ML literacy (to converse with the customer's scientists), operations rigor (to run a 100,000-person labeling workforce without quality decay), and management-consulting-grade customer empathy. Most AI startups can hire one of the three. Scale industrialized all three, which is why the debate over solutions engineer vs forward deployed engineer titles is essentially a debate about which of those three muscles you're building.

Defense & government — the Scale Donovan and Scale FedRAMP angle

Scale's defense and government franchise is anchored on Donovan, the company's AI platform for U.S. national security customers, and a series of large prime contracts that depend entirely on the FDE function. Donovan is positioned as a secure, mission-specific AI workspace for the Department of Defense — letting analysts and operators query classified data, summarize intelligence products, and draft courses of action through a conversational interface. The platform was authorized for the DoD Impact Level 6 (IL6) environment in 2024, and by mid-2025 was being deployed inside U.S. Indo-Pacific Command and Special Operations Command workflows.

The headline contract is Thunderforge, awarded by the Defense Innovation Unit in March 2025 as a flagship effort to bring AI agents and decision-support tools into joint-force planning. Thunderforge teams Scale with Anduril and Microsoft, with Scale serving as the prime integrator — a role that exists only because Scale's FDEs can sit inside a combatant command, run discovery with operators in real workflows, and translate that into model fine-tuning specs the way they would for a commercial frontier lab. The Pentagon's broader AI spend supports this: the DoD's FY2025 unclassified AI budget request was approximately $1.8 billion, and the U.S. Government Accountability Office's 2024 review of DoD AI programs identified more than 685 active AI activities across the department, many of which require labeled, mission-specific training data.

On the procurement side, Scale has built up the compliance plumbing the federal market demands: FedRAMP authorization for Scale GenAI Platform, IL4 and IL6 authorizations for Donovan, and ISO/IEC 27001 certification. None of that wins a contract by itself — but combined with an embedded FDE who shows up cleared, on-base, and fluent in joint planning vocabulary, it produces a moat no labeling-only competitor can match. The State of Forward Deployed Engineering 2026 survey found that defense and regulated-industry engagements paid 1.6x more in total comp than horizontal-SaaS engagements, in part because of the clearance requirement.

This is also the part of the Scale playbook that Palantir watchers find familiar — and for good reason. Scale's defense FDE motion is openly modeled on Palantir's, and the broader pattern of Palantir's playbook being copied by Anthropic and OpenAI shows the gravitational pull of the embedded-engineer model across every serious AI buyer in regulated markets.

What other AI companies copy from Scale's FDE playbook

Other AI companies copy four specific moves from Scale's FDE playbook: the founder-as-first-FDE pattern, the discovery-led labeling spec, the customer-embedded engineer with revenue accountability, and the hybrid workforce model. Each shows up in the org charts of newer entrants in different combinations.

Company | Function name | Customer set | What they copied from Scale
OpenAI | Forward Deployed Engineering | Fortune 500, enterprise GPT customers | Customer-embedded engineer pattern, applied-research liaison role
Anthropic | Applied AI Engineering | Regulated enterprises, BigLaw, fintech | Discovery-first methodology, multi-week engagement loops
Cohere | Forward Deployed | Enterprise LLM buyers, financial services | "Build with the customer" engagement model, named accounts
Mistral AI | Forward Deployed Engineering | European enterprises, sovereign AI deals | Embedded engineers with deal accountability
Palantir | Forward Deployed Engineering | Defense, intelligence, large industrials | The original; Scale copied much of this for defense
Harvey AI | Deployment / Forward Deployed | BigLaw firms | Vertical-specialist FDEs with domain training

The pattern is consistent across these companies: a small team of senior engineers with research literacy gets paired with the customer's most strategic deployments, and the team is measured on shipped model behavior rather than software delivery milestones. You can see the same template in OpenAI's forward deployed engineering team and embedded model, in Anthropic's applied AI engineers and Claude enterprise deployments, in Cohere's enterprise LLM build-with-customers strategy, in Databricks' FDE-flavored data lakehouse motion, and in the forward deployed engineering build-out across European enterprise at Mistral AI, the lab specifically modeling itself on Scale's defense work.

The fourth move, Scale's hybrid workforce, is the one competitors have not yet copied at scale: the ability to summon 1,000 PhDs in a week, 200 lawyers in two weeks, or 50 cleared analysts in a month. That capability is what lets a Scale FDE actually deliver on the spec they wrote. Competitors trying to skip this part of the model and rely purely on synthetic data generation are running into the wall the McKinsey State of AI 2024 survey flagged: data quality and lineage remain the top two risks named by enterprise AI adopters, and synthetic-only pipelines still fail those audits.

What the FDE motion teaches GTM and product teams outside AI

The Scale FDE motion is not just a labeling story — it's a customer research story. Every spec-authoring loop is a customer interview cycle: define the failure mode, probe the why, build a rubric, test it, refine it. That's exactly how high-functioning forward deployed engineers run customer discovery outside the labeling context.

For product and CX teams who can't hire 200 forward deployed engineers, the playbook still applies in miniature: capture the "why" behind every customer signal, treat each loop as a chance to refine the spec, and never rely on flat-form survey data to direct the next iteration. That's the bet behind Perspective AI's interviewer agent and intelligent intake platform — running spec-quality customer conversations at scale, without standing up a Donovan-grade labeling org. Teams built for product and built for CX use the same discovery loop pattern Scale runs internally, just applied to user research and post-sale feedback. The customer interview template and jobs-to-be-done interview template are designed for exactly this kind of structured discovery loop.

Frequently Asked Questions

What does a Scale AI forward deployed engineer actually do day-to-day?

A Scale AI forward deployed engineer spends most of their day in a tight loop with the customer's research team and the labeling pipeline they own. Mornings typically involve customer standups, spec edits based on the prior day's labeler edge cases, and reviewing dashboards for inter-annotator agreement drift. Afternoons are split between calibration sessions with labeler workforces, async work on the next version of the rubric, and writing tooling code in Scale's internal Data Engine to handle a new task type. Travel to customer sites and (for defense engagements) on-base work are part of the rhythm.

How big is Scale AI's FDE team in 2026?

Scale AI's FDE team in 2026 is reportedly 200+ people across forward deployed engineers, applied scientists, data quality leads, and embedded program managers — out of a total Scale headcount in the 1,200–1,500 range. The exact number is not publicly disclosed, but the function has grown roughly 5x since 2022 when the RLHF wave hit, and is concentrated in San Francisco, Washington DC, and a London hub serving European frontier labs.

Did Alexandr Wang leaving Scale change the FDE function?

Alexandr Wang's move to Meta as Chief AI Officer in mid-2025 changed Scale's executive leadership but did not dismantle the FDE function. Jason Droege, previously Scale's Chief Strategy Officer and a former Uber Eats executive, stepped in as interim CEO and has publicly kept the FDE-led embedded model as Scale's core go-to-market motion. The Meta investment in fact accelerated FDE hiring, because Meta's Superintelligence Lab is now itself a major FDE customer alongside OpenAI, Microsoft, and DoD.

How does Scale's FDE function differ from a traditional solutions engineer?

A Scale FDE differs from a traditional solutions engineer in three ways: scope, ownership, and measurement. A solutions engineer typically owns a software configuration on top of a fixed product; a Scale FDE co-authors the data spec, owns the labeling pipeline, and ships training data the customer will use to retrain a model. SEs are measured on deal support and PoC conversion; FDEs are measured on model behavior outcomes and revenue retention. The full breakdown lives in the forward deployed engineer vs ML engineer vs solutions architect comparison.

Why is the FDE model worth so much to AI buyers?

The FDE model is worth so much because AI buyers are buying outcomes against a moving target, and only an embedded team can keep the spec current. Buying labeled data without an FDE is like buying compiled code without a build system — the moment the underlying requirements change, the artifact is stale. A Gartner 2024 survey of enterprise AI deployments found that data quality and integration were the top barriers cited by 39% of organizations scaling GenAI, ahead of model performance or cost — which is why the embedded data-engineering role keeps gaining budget.

Can a smaller AI company replicate the Scale FDE playbook?

A smaller AI company can replicate the Scale FDE playbook by starting with the founder-as-first-FDE pattern Wang used, and only scaling the org once two or three repeatable engagement motions are proven. The founder playbook for building a forward deployed engineering function breaks down the staffing ratios, the comp structure, and the customer-discovery rhythm to copy. The mistake most startups make is hiring senior FDEs before there is a repeatable spec-authoring loop to hand them.

Conclusion

Scale AI's forward deployed engineering function is the most industrialized example of the embedded-engineer model in AI today, and the reason the company commands a $14B+ valuation despite operating in the unglamorous middle layer between frontier labs and the data they train on. The playbook — founder-as-first-FDE, discovery-led spec authoring, customer-embedded engineers with model-behavior accountability, and a hybrid workforce of domain experts behind every pipeline — is now being copied by OpenAI, Anthropic, Cohere, Mistral, Harvey, and a growing roster of category-defining AI startups. Defense and government work, anchored on Donovan and Thunderforge, shows the model extending into the highest-trust segments of the market.

The deeper lesson for product, CX, and research leaders outside AI labs is that forward deployed engineering is really a customer-discovery discipline wearing engineering clothes. The teams that win are the ones that treat every customer interaction as a chance to refine the spec, capture the "why" behind every signal, and rebuild the pipeline the moment the requirements move. That's the conversational research loop Perspective AI is built for — at the scale Scale's FDEs operate, but without the 200-person org. If you're building a discovery loop your product team can run weekly, start a research project, explore the interviewer agent, or see pricing to put a Scale-style spec-iteration cadence on your own roadmap.
