Catching Just-in-Time Tests (JiTTests): A Guide for Agentic Development

21 March 2026 by Suraj Barman

Just-in-Time Tests (JiTTests) are generated by a large language model at the moment a code change is introduced, so regressions are detected immediately without anyone hand-maintaining tests for that change. This suits the rapid pace of agentic software development: it reduces maintenance overhead and directs engineering effort to genuine defects rather than test upkeep.

What Are JiTTests?

Just-in-Time Tests are bespoke test cases generated in real time when a developer submits a change. Unlike traditional test suites that exist long before a change lands, JiTTests are produced by a large language model that analyses the diff, predicts possible failure points, and writes assertions that target the altered logic. The result is a set of focused checks that run immediately, catching regressions before they reach production.
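
As a rough illustration of what such focused checks might look like, the sketch below assumes a hypothetical change that makes a function called apply_discount clamp its percentage to the range 0 to 100. The function, its behaviour, and the test names are all invented for this example; real generated tests would depend on the actual diff.

```python
def apply_discount(price: float, percent: float) -> float:
    """Hypothetical function as it looks AFTER the change under review.

    Assumption for this example: the change added clamping of `percent`.
    """
    percent = max(0.0, min(percent, 100.0))
    return round(price * (1 - percent / 100), 2)


# Tests a model might plausibly write for that diff. Each one targets
# the altered logic (the clamping) or guards the unchanged common case.
def test_discount_over_100_percent_never_goes_negative():
    assert apply_discount(50.0, 150.0) == 0.0


def test_negative_percent_does_not_raise_price():
    assert apply_discount(50.0, -10.0) == 50.0


def test_typical_discount_is_unchanged():
    assert apply_discount(200.0, 25.0) == 150.0
```

The first two tests probe the boundaries the change touched, while the third checks that ordinary inputs still behave as before.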

Why Traditional Testing Struggles with Agentic Development

Conventional testing relies on manually authored test files that must be updated whenever the code evolves. When code is produced by autonomous agents, changes arrive faster than anyone can keep the test suites current. This mismatch leads to stale tests, missed edge cases, and a growing number of false alerts that waste developer time.

Core Mechanics of Catching JiTTests

The process begins with the version‑control system emitting a change set. A large language model consumes the diff, identifies the public interface affected, and synthesizes assertions that verify input‑output relationships, state transitions, and error handling. These generated tests are then injected into the build pipeline, executed alongside existing checks, and any failure is reported as a potential regression.
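
The first step, working out which public interface a diff touches, could be sketched as follows. This is a deliberately naive approach and an assumption of this example, not a description of any particular system: it scans a unified diff for added Python def lines and treats names without a leading underscore as public. The ChangedFile type and changed_public_api function are hypothetical names.

```python
import re
from dataclasses import dataclass, field


@dataclass
class ChangedFile:
    path: str
    public_functions: list = field(default_factory=list)


# Matches added lines such as "+def name(" or "+    def name(".
# Names starting with "_" are treated as private and skipped.
_ADDED_DEF = re.compile(r"^\+\s*def\s+([A-Za-z][A-Za-z0-9_]*)\s*\(")


def changed_public_api(diff_text: str) -> list:
    """Return the files in a unified diff and the public functions added or redefined in each."""
    files = []
    current = None
    for line in diff_text.splitlines():
        if line.startswith("+++ "):
            path = line[4:].strip()
            if path == "/dev/null":  # file was deleted, nothing to test
                current = None
                continue
            if path.startswith("b/"):  # strip git's "b/" prefix
                path = path[2:]
            current = ChangedFile(path)
            files.append(current)
        elif current is not None:
            match = _ADDED_DEF.match(line)
            if match and match.group(1) not in current.public_functions:
                current.public_functions.append(match.group(1))
    return files
```

A production system would more likely parse the changed files into a syntax tree so that edits inside an existing function body are attributed to that function; the line-based scan here only notices functions whose signature line was added.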

Generating Tests with Large Language Models

LLMs are trained on large code corpora, which gives them a working knowledge of common patterns, libraries, and idioms. When presented with a specific change, the model predicts likely failure modes from the patterns it learned in training and from the semantics of the language. The generated test code should follow the project's language conventions, use descriptive names, and avoid external mocks unless they are required.

Managing False Positives and Test Signal Value

To keep the signal‑to‑noise ratio high, the JiTTest system applies heuristics that prioritize high‑impact failures. Tests that target critical paths, security‑sensitive functions, or performance hotspots receive higher weight. The system also tracks how often past generated tests flagged real defects and adjusts its confidence thresholds accordingly to reduce spurious alerts.
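
One way the weighting and threshold adjustment could fit together is sketched below. Every number, category name, and formula here is an assumption chosen for illustration: a failure's score is its category weight times the model's confidence, and the reporting threshold rises when past generated tests have had a poor track record.

```python
from dataclasses import dataclass

# Hypothetical weights; higher means the failure is more worth reporting.
CATEGORY_WEIGHTS = {
    "critical_path": 1.0,
    "security": 1.0,
    "performance": 0.8,
    "other": 0.5,
}


@dataclass
class Failure:
    test_name: str
    category: str
    model_confidence: float  # 0.0 to 1.0, as estimated by the generator


def historical_precision(true_positives: int, false_positives: int) -> float:
    """Share of past alerts that were real bugs, smoothed towards 0.5.

    The +1 / +2 smoothing keeps the estimate defined with no history.
    """
    return (true_positives + 1) / (true_positives + false_positives + 2)


def adjusted_threshold(precision: float, base: float = 0.4,
                       max_penalty: float = 0.4) -> float:
    """Raise the bar for reporting when past alerts were often spurious."""
    return base + (1.0 - precision) * max_penalty


def should_report(failure: Failure, true_positives: int,
                  false_positives: int) -> bool:
    weight = CATEGORY_WEIGHTS.get(failure.category, CATEGORY_WEIGHTS["other"])
    score = weight * failure.model_confidence
    precision = historical_precision(true_positives, false_positives)
    return score >= adjusted_threshold(precision)
```

With these illustrative values, a security failure at confidence 0.55 is reported when the history shows 8 real bugs against 2 false alarms, but suppressed when there is no history at all, because the threshold is higher in the second case.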

Integration into CI/CD Pipelines

JiTTests are designed to fit naturally into continuous integration workflows. After a pull request is opened, the pipeline triggers the LLM‑driven test generation step, runs the new tests, and reports results in the same interface developers already use. Because the tests are discarded after the run and never committed to the repository, they add no long‑term maintenance burden.

Best Practices for Engineers

Engineers should treat JiTTests as an early warning system, not a replacement for well‑crafted unit or integration suites. Review any failing JiTTest to confirm that it reflects a real defect rather than a faulty generated test, then fix the bug in the code. Over time, patterns observed in generated tests can guide the creation of permanent tests for especially fragile components.