Implementing Natural Language Search for Netflix Graph Search with LLMs

3 March 2026 by Suraj Barman

Natural language search lets users ask questions in everyday language and receive results from Netflix's Graph Search platform without writing a Graph Search Filter DSL statement.

Problem Statement & Motivation

Users interact with dozens of UI components across Content and Business Products, each requiring manual construction of DSL filters. This creates friction and inconsistent experiences.

Multiple bespoke query builders increase learning overhead.
Hundreds of index fields make UI forms cumbersome.
SMEs must translate domain knowledge into technical syntax.
Inconsistent DSL support leads to errors.

LLM‑Powered Text‑to‑DSL Engine

The core engine uses a large language model to convert free‑form questions into syntactically valid Graph Search Filter DSL statements.

Prompt design balances instruction clarity with token efficiency.
Model selection prioritizes low latency and high accuracy.
Output is constrained by a JSON schema to enforce grammar.
Ambiguous phrasing is handled through iterative clarification.
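
As a rough illustration of schema-constrained output, the sketch below checks a model reply against a small JSON contract before rendering it as a filter string. The contract (`clauses`, `field`, `op`, `value`), the operator names, and the rendered syntax are all assumptions made for this example; the article does not show the actual Graph Search Filter DSL grammar or the schema Netflix uses.

```python
import json

# Hypothetical output contract: the model must reply with
# {"clauses": [{"field": ..., "op": ..., "value": ...}, ...]}.
# Operator names and rendered symbols are illustrative, not the real DSL.
OP_SYMBOLS = {"eq": "==", "gt": ">", "lt": "<"}


def parse_model_output(raw: str) -> list:
    """Parse the model's JSON reply and reject anything outside the contract."""
    data = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on bad JSON
    if not isinstance(data, dict) or set(data) != {"clauses"}:
        raise ValueError("expected an object with a single 'clauses' key")
    clauses = data["clauses"]
    if not isinstance(clauses, list) or not clauses:
        raise ValueError("'clauses' must be a non-empty list")
    for clause in clauses:
        if not isinstance(clause, dict) or set(clause) != {"field", "op", "value"}:
            raise ValueError(f"malformed clause: {clause!r}")
        if not isinstance(clause["field"], str):
            raise ValueError(f"field name must be a string: {clause!r}")
        if not isinstance(clause["op"], str) or clause["op"] not in OP_SYMBOLS:
            raise ValueError(f"unsupported operator: {clause['op']!r}")
    return clauses


def render_filter(clauses: list) -> str:
    """Render validated clauses as an illustrative AND-joined filter string."""
    return " AND ".join(
        f"{c['field']} {OP_SYMBOLS[c['op']]} {json.dumps(c['value'])}"
        for c in clauses
    )
```

Because the model only ever fills in a fixed structure, a malformed reply fails at the parsing step instead of producing a statement that looks plausible but is invalid.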

Context Engineering & Schema Extraction

Accurate DSL generation requires the LLM to understand field names, types, and controlled vocabularies derived from the GraphQL schema.

Automated schema parser extracts field metadata.
Controlled vocabularies are injected as enumerated value lists.
Metadata includes description, type, and permitted values.
Context payload is cached for fast reuse.
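
The context payload described above could be assembled along these lines. This is a minimal sketch: the `FieldMeta` record, the sample fields, and the version key are hypothetical stand-ins for whatever the schema parser actually emits, and `functools.lru_cache` is just one simple way to reuse a payload once it has been built.

```python
from dataclasses import dataclass
from functools import lru_cache


@dataclass(frozen=True)
class FieldMeta:
    """Metadata for one index field: description, type, and permitted values."""
    name: str
    type: str
    description: str
    permitted_values: tuple = ()


# Stand-in for the schema parser's output, keyed by a hypothetical schema version.
_FIELDS_BY_VERSION = {
    "v1": (
        FieldMeta("genre", "String", "Primary genre of the title", ("Drama", "Comedy")),
        FieldMeta("releaseYear", "Int", "Year of first release"),
    ),
}


@lru_cache(maxsize=8)
def build_context(schema_version: str) -> str:
    """Build the text injected into the prompt; cached per schema version."""
    lines = []
    for meta in _FIELDS_BY_VERSION[schema_version]:
        line = f"{meta.name} ({meta.type}): {meta.description}"
        if meta.permitted_values:
            # Controlled vocabularies are listed as enumerated values.
            line += " | one of: " + ", ".join(meta.permitted_values)
        lines.append(line)
    return "\n".join(lines)
```

Keying the cache on a schema version means a schema change produces a new payload, while repeated requests against the same schema skip the rebuild.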

Validation & Post‑Processing Pipeline

Generated statements undergo multiple checks to ensure they meet syntactic, semantic, and pragmatic criteria.

Syntax validator parses the DSL with the official grammar.
Semantic checker cross‑references field types and allowed values.
Pragmatic layer runs a mock query against a sandbox index to verify intent alignment.
Feedback loop surfaces confidence scores to the UI.
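
The first two stages of this pipeline can be sketched as follows. The grammar here is a toy regular expression, not the official Graph Search grammar, and the field catalogue is invented for the example. The pragmatic sandbox check and the confidence scoring are left out because the article does not describe how they work.

```python
import re
from dataclasses import dataclass, field

# Toy grammar: <field> <op> <value>, clauses joined by " AND ". Illustrative only;
# it does not handle the text " AND " inside a quoted value.
CLAUSE_RE = re.compile(r'^(\w+)\s*(==|>|<)\s*("(?:[^"\\]|\\.)*"|-?\d+)$')

# Hypothetical field catalogue standing in for the extracted schema metadata.
FIELD_TYPES = {"genre": str, "releaseYear": int}
PERMITTED = {"genre": {"Drama", "Comedy"}}


@dataclass
class ValidationResult:
    ok: bool
    errors: list = field(default_factory=list)


def check_syntax(statement: str) -> list:
    """Return (field, op, value) triples, or raise ValueError on a bad clause."""
    triples = []
    for part in statement.split(" AND "):
        match = CLAUSE_RE.match(part.strip())
        if match is None:
            raise ValueError(f"cannot parse clause: {part!r}")
        name, op, raw = match.groups()
        value = raw[1:-1] if raw.startswith('"') else int(raw)
        triples.append((name, op, value))
    return triples


def check_semantics(triples: list) -> list:
    """Cross-reference each clause against field types and allowed values."""
    errors = []
    for name, _op, value in triples:
        expected = FIELD_TYPES.get(name)
        if expected is None:
            errors.append(f"unknown field: {name}")
        elif not isinstance(value, expected):
            errors.append(f"{name} expects {expected.__name__}, got {type(value).__name__}")
        elif name in PERMITTED and value not in PERMITTED[name]:
            errors.append(f"{name} does not allow value {value!r}")
    return errors


def validate(statement: str) -> ValidationResult:
    """Run syntax first, then semantics; stop early if the statement does not parse."""
    try:
        triples = check_syntax(statement)
    except ValueError as exc:
        return ValidationResult(False, [str(exc)])
    errors = check_semantics(triples)
    return ValidationResult(not errors, errors)
```

Running the cheap syntax check first means the semantic checker only ever sees statements that already parsed, which keeps its error messages specific to field and value problems.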

Deployment & Operational Considerations

The solution runs as a self‑managed microservice integrated with existing Netflix applications.

Containerized deployment using Kubernetes for autoscaling.
Observability via metrics, logs, and trace IDs.
Rate limiting protects LLM usage costs.
Rollout includes A/B testing against legacy DSL builders.
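
One common way to implement the rate limiting mentioned above is a token bucket, sketched below. The article does not say which algorithm or limits the service uses, so the class, its parameters, and the injectable clock are assumptions for illustration.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity` requests, refilled at `rate_per_sec`."""

    def __init__(self, rate_per_sec: float, capacity: float, clock=time.monotonic):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Return True and spend `cost` tokens if the request may call the LLM."""
        now = self.clock()
        # Refill in proportion to elapsed time, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

A caller would check `bucket.allow()` before each model request and return a "try again later" response when it is False. Passing a larger `cost` for longer prompts is one way to tie the limit more closely to spend.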

For reference on building resilient services, see the real‑time orchestration framework and the scalable data platform guides.