Netflix Graph Search: AI‑Powered Natural Language Query Generation
Netflix enhanced its enterprise‑wide Graph Search by integrating large language models (LLMs) to translate everyday language into the proprietary Graph Search Filter DSL. This shift reduces user friction, standardizes query creation across diverse UIs, and lays the groundwork for a self‑managed AI search platform.
Context Engineering for LLM‑Driven Filter Generation
Effective translation requires supplying the LLM with concise, relevant metadata about index fields and controlled vocabularies. By extracting type information from the underlying GraphQL schema and curating a focused subset of fields, the system balances context richness against latency, ensuring syntactic and semantic accuracy.
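As a rough illustration of the curation step, the sketch below renders a hand-picked subset of field metadata into compact lines that could be placed in a prompt. The field names, the `FieldMeta` type, and the `render_field_context` helper are all hypothetical; the real metadata would be derived from the GraphQL schema rather than written by hand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldMeta:
    name: str         # index field path (hypothetical examples below)
    type_name: str    # type as it would appear in the GraphQL schema
    description: str  # short hint for the model


# Hypothetical metadata; real entries would be extracted from the schema.
SCHEMA_FIELDS = [
    FieldMeta("title.name", "String", "Display name of the title"),
    FieldMeta("title.genre", "Genre", "Genre from a controlled vocabulary"),
    FieldMeta("title.releaseYear", "Int", "Year of first release"),
    FieldMeta("title.internalAuditId", "ID", "Internal bookkeeping identifier"),
]


def render_field_context(fields, curated_names, max_fields=20):
    """Return one compact line per curated field, capped to bound prompt size."""
    selected = [f for f in fields if f.name in curated_names][:max_fields]
    return "\n".join(
        f"{f.name}: {f.type_name} - {f.description}" for f in selected
    )


context = render_field_context(
    SCHEMA_FIELDS, {"title.name", "title.genre", "title.releaseYear"}
)
print(context)
```

The `max_fields` cap is one simple way to trade context richness against latency: fewer lines mean a shorter prompt, at the cost of possibly omitting a field the user needs.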
Field Retrieval Augmented Generation (RAG)
To identify which fields and values are pertinent to a user’s question, the pipeline employs a RAG pattern that matches query intent against schema metadata. Including only the matched fields avoids the “needle‑in‑the‑haystack” problem, in which relevant details get lost in an overly long context, and reduces hallucinations.
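The retrieval step can be sketched with a deliberately simple scorer. The example below ranks fields by Jaccard token overlap between the question and each field's description; this is a stand-in chosen to keep the example self-contained, and a production system would more plausibly use embedding similarity. The field names and descriptions are hypothetical.

```python
import re

# Hypothetical field descriptions used as the retrieval corpus.
FIELD_DOCS = {
    "title.genre": "genre category of the title such as comedy or drama",
    "title.releaseYear": "year the title was first released",
    "title.country": "country of origin where the title was produced",
    "talent.name": "name of an actor director or other talent",
}


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve_fields(question, field_docs, top_k=2):
    """Return up to top_k field names ranked by Jaccard overlap with the question."""
    q_tokens = tokenize(question)
    scored = []
    for name, doc in field_docs.items():
        d_tokens = tokenize(doc) | tokenize(name)
        overlap = len(q_tokens & d_tokens)
        if overlap:  # fields with no shared token are never included
            scored.append((overlap / len(q_tokens | d_tokens), name))
    # Highest score first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:top_k]]


selected = retrieve_fields("comedy titles released after 2015", FIELD_DOCS)
print(selected)
```

Only the selected fields would then be passed on to the prompt, which is what keeps unrelated fields such as `talent.name` out of the model's context for this question.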
Controlled Vocabulary RAG
When a query references entities such as countries or genres, the system detects the associated controlled vocabulary and injects only the relevant enumerated values, further narrowing the context for the LLM.
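A minimal sketch of this narrowing step, assuming literal substring matching against hypothetical vocabularies: only values that actually appear in the question are kept for injection. Substring matching is a simplification (it would, for instance, match "Comedy" inside "romantic comedy"), and the real matching strategy is not described in the source.

```python
# Hypothetical controlled vocabularies keyed by field name.
VOCABULARIES = {
    "title.genre": ["Action", "Comedy", "Documentary", "Drama"],
    "title.country": ["Brazil", "France", "Japan", "South Korea"],
}


def match_vocabulary_values(question, vocabularies):
    """Map each field to the vocabulary values mentioned in the question."""
    text = question.lower()
    matches = {}
    for field, values in vocabularies.items():
        hits = [value for value in values if value.lower() in text]
        if hits:  # vocabularies with no hit are left out of the context entirely
            matches[field] = hits
    return matches


relevant = match_vocabulary_values(
    "dramas from South Korea or Japan", VOCABULARIES
)
print(relevant)
```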
LLM Prompt Design and Execution
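The prompt instructs the model to produce a filter statement that is syntactically correct (it parses under the DSL grammar), semantically correct (it references only real fields and values), and pragmatically correct (it reflects what the user meant), given the supplied field metadata. Careful wording, combined with few‑shot examples, guides the LLM to respect the DSL grammar while honoring user intent.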
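One way to assemble such a prompt is sketched below. The instruction wording, the few-shot pairs, and the filter syntax (`field == "value" AND ...`) are illustrative assumptions; the actual Graph Search Filter DSL and Netflix's prompt text are not shown in the source.

```python
# Hypothetical few-shot pairs written in an illustrative filter syntax.
FEW_SHOT = [
    ("comedies from France",
     'title.genre == "Comedy" AND title.country == "France"'),
    ("titles released after 2015", "title.releaseYear > 2015"),
]


def build_prompt(question, field_context, vocab_context, examples=FEW_SHOT):
    """Assemble instructions, retrieved context, few-shot pairs, and the question."""
    parts = [
        "Translate the user's question into a single filter statement.",
        "Use only the fields and values listed below.",
        "Output the statement and nothing else.",
        "",
        "Fields:",
        field_context,
        "",
        "Allowed values:",
        vocab_context,
        "",
    ]
    for example_question, statement in examples:
        parts += [f"Question: {example_question}", f"Filter: {statement}", ""]
    # End on an open "Filter:" so the model completes the statement.
    parts += [f"Question: {question}", "Filter:"]
    return "\n".join(parts)


prompt = build_prompt(
    "dramas from Japan",
    "title.genre: Genre\ntitle.country: Country",
    "title.genre: Drama\ntitle.country: Japan",
)
print(prompt)
```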
Prompt Engineering Reference
For deeper insights into crafting effective prompts for small models, see the prompt engineering guide.
Post‑Processing: AST Validation and Hallucination Mitigation
Generated statements are parsed into an Abstract Syntax Tree (AST). If parsing fails, the query is rejected outright. Even when parsing succeeds, the system cross‑checks field names and enumerated values against the metadata to catch hallucinated elements before returning results.
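The two checks described above can be sketched against a toy grammar. The syntax below (comparisons joined by `AND`) is an assumption made purely for illustration and is not the real Graph Search Filter DSL; the field types and vocabularies are likewise hypothetical. A known limitation of this toy parser is that a quoted value containing " AND " would be split incorrectly.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Comparison:
    field: str
    op: str
    value: object  # str for quoted literals, int for numeric literals


_CLAUSE = re.compile(
    r'^([A-Za-z_][\w.]*)\s*(==|!=|>=|<=|>|<)\s*("([^"]*)"|-?\d+)$'
)

# Hypothetical metadata used for the cross-check.
FIELD_TYPES = {"title.genre": "Genre", "title.releaseYear": "Int"}
VOCABULARIES = {"title.genre": {"Comedy", "Drama"}}


def parse(statement):
    """Parse 'clause AND clause ...' into nodes; raise ValueError on bad syntax."""
    nodes = []
    for raw in re.split(r"\s+AND\s+", statement.strip()):
        match = _CLAUSE.match(raw.strip())
        if match is None:
            raise ValueError(f"cannot parse clause: {raw!r}")
        field, op, literal, quoted = match.groups()
        value = quoted if literal.startswith('"') else int(literal)
        nodes.append(Comparison(field, op, value))
    return nodes


def validate(nodes, field_types, vocabularies):
    """Return a list of problems; an empty list means nothing suspicious was found."""
    problems = []
    for node in nodes:
        if node.field not in field_types:
            problems.append(f"unknown field: {node.field}")
        elif node.field in vocabularies and node.value not in vocabularies[node.field]:
            problems.append(f"unknown value for {node.field}: {node.value!r}")
    return problems


good = parse('title.genre == "Drama" AND title.releaseYear > 2015')
bad = parse('title.mood == "Cozy" AND title.genre == "Vibes"')
print(validate(good, FIELD_TYPES, VOCABULARIES))
print(validate(bad, FIELD_TYPES, VOCABULARIES))
```

Separating `parse` from `validate` mirrors the two failure modes in the text: a statement that does not parse is rejected outright, while one that parses can still be flagged for referencing fields or values absent from the metadata.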
User‑Facing Transparency and Disambiguation
To build trust, the UI visualizes the parsed AST as familiar filter chips and facets, allowing users to see and edit the inferred constraints. Optional “@mention” syntax lets users select exact entities, bypassing ambiguity and improving pragmatic correctness.
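To make the mention mechanism concrete, the sketch below assumes a hypothetical `@[Display Name]` syntax and a small in-memory entity catalogue; neither is taken from the source. Mentions are resolved deterministically into exact constraints, which can then be rendered as chip labels alongside whatever the model infers from the remaining text.

```python
import re

# Hypothetical entity catalogue keyed by display name.
ENTITIES = {
    "Stranger Things": {"field": "title.id", "id": "t-001"},
    "South Korea": {"field": "title.country", "id": "KR"},
}

_MENTION = re.compile(r"@\[([^\]]+)\]")


def resolve_mentions(question, entities):
    """Return the question with mention markup removed, plus the pinned constraints."""
    pinned = []
    for name in _MENTION.findall(question):
        if name not in entities:
            raise KeyError(f"unknown mention: {name}")
        entity = entities[name]
        pinned.append((entity["field"], "==", entity["id"]))
    # Keep the display name in the text so the question still reads naturally.
    plain_text = _MENTION.sub(r"\1", question)
    return plain_text, pinned


def to_chips(constraints):
    """Render (field, op, value) triples as short chip labels for the UI."""
    return [f"{field} {op} {value}" for field, op, value in constraints]


plain_text, pinned = resolve_mentions(
    "dramas from @[South Korea] after 2015", ENTITIES
)
print(plain_text)
print(to_chips(pinned))
```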
End‑to‑End Architecture Overview
The solution comprises three stages: (1) pre‑processing builds a tailored context via RAG, (2) the LLM generates the DSL statement, and (3) post‑processing validates and visualizes the result. This hybrid approach leverages generative AI while retaining deterministic safeguards.
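The three stages can be wired together as in the sketch below, where each stage is passed in as a callable. The stubs stand in for the real retrieval, model call, and validator, and every name here is hypothetical; the point is only that the generative step sits between two deterministic ones and that a failed check withholds the statement.

```python
from typing import List, NamedTuple, Optional


class Outcome(NamedTuple):
    statement: Optional[str]  # None when the statement was rejected
    problems: List[str]


def run_pipeline(question, build_context, generate, check):
    """Run pre-processing, generation, and post-processing in order."""
    context = build_context(question)        # stage 1: tailored context
    statement = generate(question, context)  # stage 2: model output
    problems = check(statement, context)     # stage 3: deterministic validation
    return Outcome(None, problems) if problems else Outcome(statement, [])


# Stubs standing in for the real stages.
def stub_context(question):
    return {"fields": {"title.genre"}}


def stub_llm(question, context):
    return 'title.genre == "Drama"'


def stub_hallucinating_llm(question, context):
    return 'title.mood == "Cozy"'


def stub_check(statement, context):
    field = statement.split()[0]
    return [] if field in context["fields"] else [f"unknown field: {field}"]


accepted = run_pipeline("dramas", stub_context, stub_llm, stub_check)
rejected = run_pipeline("dramas", stub_context, stub_hallucinating_llm, stub_check)
print(accepted)
print(rejected)
```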
LLM Technical Foundations
Netflix’s implementation builds on state‑of‑the‑art LLM research; details on a comparable model can be found in the DeepSeek‑V4 technical overview.