Why FAANG Killed the Algo Round (And What Replaced It)
Two years ago, I prepped a friend for a Meta ML engineer loop. He was sharp and fast, the kind of engineer who solves LeetCode mediums in under twelve minutes. We did about 200 problems together over six weeks. He walked in expecting the coding round to be his strongest signal.
He didn't get the offer.
The debrief was strange. The coding round was "fine." Not a red flag, not a strong signal either. What sank him was the ML system design round. The interviewer had described an ambiguous production scenario: a recommender suddenly returning biased outputs, a hundred million users affected, what's your move in the next thirty minutes? My friend defaulted to architecture. He drew a clean diagram. He talked about how he'd rebuild the pipeline. The interviewer kept pulling him back: "what would you do first?"
He didn't have an answer. Six weeks of LeetCode hadn't taught him how to reason about a real system under real pressure.
I've spent most of the time since on the other side of that table, running ML system design and behavioral rounds. What happened to my friend is now happening to a lot of strong engineers, and most of them don't know why. They're prepping for an interview that doesn't really exist anymore.
The contamination problem
The coding round isn't dead. You'll still get one. But its predictive weight has collapsed, and the reason isn't fashion or politics. It's contamination.
By late 2024, Claude Sonnet could solve roughly 80% of LeetCode mediums on the first try in under thirty seconds. GPT-4 wasn't far behind. By 2025, basically every engineer I work with was using these tools daily for actual work. The skill "produce a clean implementation of two-sum in twelve minutes" stopped predicting anything useful about on-the-job performance, because anyone with an API key could now do that without breaking a sweat.
The signal didn't degrade. It got contaminated. There was no longer any way for an interviewer to know whether the person on the other end of the Zoom had grinded NeetCode 150 last weekend or was genuinely fluent at production engineering. Same output. Same speed. Same clean code.
Companies have responded in two distinct directions, and which way they went tells you something about their culture.
The first direction, which Meta piloted in late 2025 using CoderPad with Claude and GPT-4o-mini available to the candidate, is the AI-aware coding round. You get the tools. The interviewer watches what you do with them. Can you tell when the model is wrong? Do you ask it for boilerplate or for architecture? Do you verify or trust? OpenAI runs a version of this too: AI is allowed, but you share your screen and narrate your reasoning the whole way. The signal stopped being "can you write code" and became "can you collaborate with AI like a competent engineer."
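To make "verify or trust" concrete: the behavior this kind of round rewards is spending thirty seconds probing an AI suggestion at its edges before building on it. A minimal sketch of that habit, with a hypothetical model-suggested function and hand-picked edge cases:

```python
# Hypothetical sketch of "verify, don't trust" in an AI-aware coding round.
# Suppose the model suggested this merge_intervals implementation; before
# building on it, probe the edges models most often get wrong.

def merge_intervals(intervals):
    """AI-suggested implementation: merge overlapping (start, end) intervals."""
    if not intervals:
        return []
    intervals = sorted(intervals)
    merged = [intervals[0]]
    for start, end in intervals[1:]:
        if start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

# The verification step: cheap, explicit edge cases, narrated out loud.
assert merge_intervals([]) == []                              # empty input
assert merge_intervals([(1, 3)]) == [(1, 3)]                  # single interval
assert merge_intervals([(1, 3), (2, 6)]) == [(1, 6)]          # simple overlap
assert merge_intervals([(1, 4), (4, 5)]) == [(1, 5)]          # touching endpoints
assert merge_intervals([(5, 6), (1, 2)]) == [(1, 2), (5, 6)]  # unsorted input
```

Whether the tests pass matters less than the fact that you reached for them unprompted.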
The second direction is the opposite: ban the tools, pull back to in-person. The share of interview rounds conducted in person went from 24% in 2022 to 38% in 2025. Nobody talks about it, but every hiring manager I know has noticed the shift. Microsoft now runs split rounds for some ML positions: one where you have AI, one where you don't, scored separately. Both data points are signal.
Either way, the coding-round-as-main-event is over. It's a "don't fail" round now.
What actually got harder
The weight moved to the rounds where AI can't fake it for you yet, and those rounds have gotten meaningfully harder in the last two years.
Reasoning under ambiguity. The biggest shift. ML system design used to be "design a recommendation system." Broad, predictable, a known surface. Today it sounds more like: "Our ranking model is suddenly returning stale results for three percent of users in Southeast Asia. What's your first hour?" There's no clean answer. The interviewer is watching how you frame the problem, what you choose to ignore, which hypotheses you prioritize, what you'd verify first. Candidates who reach for a template get filtered. Candidates who slow down and reason out loud get strong-hire notes.
Meta leans into this hard. Their loops are intentionally vague because what they're hiring for is autonomy under ambiguity, and the only way to test that is to give you ambiguity and watch what you do.
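There's no single right script for that first hour, but the shape of good reasoning is expressible. Here's a hypothetical triage sketch for the stale-results scenario above, assuming you have serving logs with region, model version, and cache cluster attached (the file path, column names, and freshness threshold are all illustrative):

```python
# Hypothetical first-hour triage for "stale results for 3% of users in
# Southeast Asia". Before touching architecture, localize the failure:
# segment the stale-result rate along the dimensions that distinguish
# the leading hypotheses.
import pandas as pd

logs = pd.read_parquet("serving_logs_last_6h.parquet")  # illustrative export

# A result is "stale" if the features it was ranked with are older than
# the freshness SLO (the 900s threshold is an assumption).
logs["stale"] = logs["feature_age_s"] > 900

# Hypothesis 1: a regional cache or replica is behind -> staleness clusters
# by region or cache_cluster. Hypothesis 2: a bad rollout -> clusters by
# model_version. Hypothesis 3: an upstream pipeline stall -> roughly uniform.
for dim in ["region", "datacenter", "model_version", "cache_cluster"]:
    rate = logs.groupby(dim)["stale"].mean().sort_values(ascending=False)
    print(f"\nstale rate by {dim}:\n{rate.head()}")

# Whichever dimension explains the 3% tells you which hypothesis to chase;
# only then does it make sense to talk about fixes or rollbacks.
```

Narrating exactly that sequence, hypotheses first, segmentation second, fixes last, is what separates a strong-hire note from a clean diagram.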
ML system design at real production scale. Not "design YouTube" anymore. Try "design YouTube with a 100ms latency budget, a downstream team that already depends on your output format, a vendor relationship you inherited and can't change, and an A/B test that's already been running for three weeks with flat results." The constraints are now part of the question. The interviewer wants to see whether you can hold the actual messiness of a production decision in your head, not draw a clean architecture diagram divorced from any real context.
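The latency budget alone does a lot of that filtering. A back-of-envelope sketch, with illustrative stage numbers (assumptions, not real YouTube figures):

```python
# Illustrative only: how a 100ms end-to-end budget forces design choices.
BUDGET_MS = 100

stages = {
    "edge + network": 20,       # round trips you don't control
    "candidate retrieval": 25,  # ANN lookup over the corpus
    "feature fetch": 20,        # feature store reads, batched
    "ranking model": 25,        # the part everyone wants to grow
    "re-rank + rules": 5,
}

spent = sum(stages.values())
print(f"spent {spent}ms, headroom {BUDGET_MS - spent}ms")

# With 5ms of headroom, "just use a bigger ranker" isn't an option: every
# extra model millisecond has to be bought back from retrieval or feature
# fetch. That tradeoff *is* the system design question.
assert spent <= BUDGET_MS, "over budget: something has to shrink"
```

A candidate who does this arithmetic out loud before drawing a single box is answering the question that was actually asked.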
AI-collaboration literacy. The newest scoring dimension, and the one nobody is preparing for. Can you describe how you'd build a system with AI in the loop without sounding like it's still 2022? Do you have an actual opinion about when to verify an LLM output and when to trust it? Anthropic and OpenAI score this explicitly. Google and Meta have started to. If you walk into a 2026 ML loop and never mention AI tooling unprompted, you're signaling something about whether you've kept up.
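The opinion interviewers are probing for is something like: trust should scale with the blast radius of acting on the output, not with how confident the model sounds. A minimal sketch of that policy, with hypothetical names throughout:

```python
# Hypothetical verify-vs-trust policy for an LLM in the loop. The heuristic:
# how much checking an output gets depends on the cost of it being wrong.
import json

def call_llm(prompt: str) -> str:
    """Placeholder for whatever model client you actually use."""
    raise NotImplementedError

def classify_ticket(ticket_text: str) -> dict:
    raw = call_llm(
        "Classify this support ticket as JSON with keys "
        f"'category' and 'refund_usd': {ticket_text}"
    )
    out = json.loads(raw)  # cheap structural check: malformed JSON fails loudly

    # Trust tier 1: low-stakes fields pass with schema validation only.
    assert out["category"] in {"billing", "bug", "account", "other"}

    # Trust tier 2: anything that moves money gets an independent check,
    # never the model's own say-so.
    if out.get("refund_usd", 0) > 50:
        out["needs_human_review"] = True
    return out
```

Having a tiered answer like this ready, even a rough one, is what "an actual opinion" sounds like in the room.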
Thinking out loud. I underestimated this one until I started running loops. The candidates who get strong-hire reports almost all reason audibly the entire way through. Not narrating their actions ("now I'm going to write the for loop"). Exposing their thought process. Naming assumptions out loud. Flagging where they're uncertain. Telling you what they're considering before they discard it. The candidates who silently work to a correct answer often get weaker scores than candidates who got to a worse answer with visible reasoning. The interviewer is grading the process, not just the output.
A Meta E5 loop, then and now
For a concrete comparison, here's what an E5 Meta ML loop looked like in 2022 versus today:
2022: Two coding rounds, one ML system design, one ML breadth-and-depth round, one behavioral. Coding was about 35-40% of the weight in committee. Behavioral was often the lightest round.
2026: One coding round, often shorter, with pseudocode accepted for harder parts. One ML system design that's tightly constrained and scenario-based. One ML deep-dive that interrogates a system you've actually built, in detail, with the interviewer pushing on why you made specific decisions. One behavioral round that's gotten genuinely harder, probing for judgment, scope, conflict, and how you handle being wrong.
Coding is maybe 15-20% of the weight now. System design and the deep-dive are the rounds that decide hire vs no-hire. The behavioral round has grown teeth.
If I were prepping today
I'll be blunt: a lot of the prep advice on the internet is now actively misleading. "Do 500 LeetCode problems" was correct in 2018, mostly correct in 2021, shaky by 2023, and completely wrong today. The diminishing returns past about 80-100 problems are extreme. You're spending hours on a 5% signal that you could be spending on a 40% signal.
Rough allocation if I were prepping for a FAANG ML loop right now:
- 30% system design. Real-system reasoning. Read engineering postmortems. Pick a real outage that Cloudflare, AWS, or GitHub has published a writeup for, and walk through it as if you're in the room. Practice naming constraints out loud, before solutions.
- 25% behavioral. This is where most ML engineers underprep. Practice telling stories about non-obvious judgment calls you've actually made. Practice being asked "what would you have done differently?" without getting defensive. Throw away the STAR template.
- 20% ML depth on systems you've shipped. The deep-dive is now an interrogation of your real experience. If you've built one production ranking system end to end, you're better prepared than someone who's read three textbooks.
- 15% coding. Sixty to eighty well-chosen mediums covering the major patterns. Past that, you're optimizing the wrong thing.
- 10% AI-collaboration practice. The new one. Run mock interviews where you reason out loud while using AI tools, and practice explicitly stating when you'd verify, when you'd trust, when you'd push back on a suggestion.
The ratios will feel wrong if you're used to the old advice. They are wrong, for the old interview. They're right for the one you're actually walking into.
The one thing
If there's one thing I want a candidate to internalize:
The interview isn't testing what you know. It's testing how you reason.
LeetCode prep was pattern recognition. See problem, recognize pattern, execute solution. That worked when interviews were about whether you could execute a known pattern fluently. Modern ML loops aren't doing that. They're testing whether you can reason through novel problems where there is no template, with incomplete information, in front of a stranger, under time pressure.
The candidates who get offers slow down when they don't know. They ask clarifying questions. They name their assumptions. They propose multiple approaches and pick one with explicit tradeoffs. Out loud. The whole way.
My friend lost his Meta loop in 2024 because he'd prepared for a 2018 interview. The good news for the rest of us: if you've been shipping production ML systems, you already have most of the judgment the loop is testing for. You just have to practice making it visible.
Grind less. Reason more. Talk out loud.
Prep for questions like these with GradientCast — see our plans. Staff-level ML system design walkthroughs and behavioral answers, built by engineers who run these loops every week.