Ecdysis: Graceful Restarts for Rust Services
Ecdysis is a lightweight Rust library that provides zero‑downtime upgrades for high‑throughput network services. It keeps listening sockets open across process generations, so there is no window during a restart in which new connections are refused, and services handling millions of requests per second can be upgraded without interruption.
Technical Overview of Graceful Restarts
A graceful restart replaces a running process with a new binary while keeping existing connections alive. Ecdysis follows the pattern pioneered by web servers such as NGINX: the parent process hands its listening sockets to a child, and the child starts accepting new traffic only after it signals readiness.
Fork‑Exec Model
The core of ecdysis follows the classic Unix fork() then execve() sequence. The parent forks a child, which inherits the open socket file descriptors via a named pipe. The child initializes the new code base; if it crashes, the parent continues serving without interruption.
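The parent-side rule described above (start the new generation, and keep serving if it dies) can be sketched with the standard library alone. This is an illustration rather than ecdysis's implementation: `UpgradeOutcome` and `spawn_next_generation` are hypothetical names, socket handoff and the readiness signal are left out, and "the child exited within a grace period" stands in for "the child crashed". On Unix, `Command::spawn` performs the fork and exec steps internally.

```rust
use std::io;
use std::process::{Child, Command};
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Result of an upgrade attempt, seen from the parent (hypothetical type).
#[derive(Debug)]
pub enum UpgradeOutcome {
    /// The child was still alive when the grace period ended.
    StillRunning(Child),
    /// The child exited early; the parent should simply keep serving.
    /// The exit code is `None` if the child was killed by a signal.
    ExitedEarly(Option<i32>),
}

/// Start `program` as the next generation and watch it for `grace`.
pub fn spawn_next_generation(
    program: &str,
    args: &[&str],
    grace: Duration,
) -> io::Result<UpgradeOutcome> {
    // On Unix this forks the current process and execs `program` in the child.
    let mut child = Command::new(program).args(args).spawn()?;
    let deadline = Instant::now() + grace;
    loop {
        // try_wait() is non-blocking: Some(status) means the child has exited.
        if let Some(status) = child.try_wait()? {
            return Ok(UpgradeOutcome::ExitedEarly(status.code()));
        }
        if Instant::now() >= deadline {
            return Ok(UpgradeOutcome::StillRunning(child));
        }
        sleep(Duration::from_millis(10));
    }
}
```

A parent would call `spawn_next_generation` with the path of the new binary and, on `ExitedEarly`, log the failure and carry on with its existing listeners. A real implementation would wait for an explicit readiness message instead of a timer.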
Socket Inheritance via Named Pipe
Listening sockets are transferred through a named pipe that both processes share. This ensures the underlying kernel socket structure remains open, allowing the parent to keep handling existing connections while the child prepares to accept new ones.
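The reason the kernel socket stays open is that several descriptors can refer to the same socket, and closing one of them does not close the socket. The sketch below shows that property inside a single process, using `TcpListener::try_clone` as a stand-in for a descriptor held by another generation. It does not show the named-pipe transfer itself, and `hand_over` and `accept_and_echo_once` are hypothetical helper names.

```rust
use std::io::{self, Read, Write};
use std::net::{SocketAddr, TcpListener, TcpStream};

/// Duplicate the listener's descriptor, close the original, return the duplicate.
/// Stands in for "the old generation exits, the new one keeps the socket".
pub fn hand_over(original: TcpListener) -> io::Result<TcpListener> {
    // try_clone() creates a second descriptor for the same kernel socket.
    let inherited = original.try_clone()?;
    // Closing one descriptor leaves the socket open while another one exists.
    drop(original);
    Ok(inherited)
}

/// Connect to `addr`, accept the connection on `listener`, and echo `msg` once.
pub fn accept_and_echo_once(
    listener: &TcpListener,
    addr: SocketAddr,
    msg: &[u8],
) -> io::Result<Vec<u8>> {
    // The connection completes in the listen backlog before accept() is called.
    let mut client = TcpStream::connect(addr)?;
    let (mut server_side, _peer) = listener.accept()?;

    client.write_all(msg)?;
    let mut received = vec![0u8; msg.len()];
    server_side.read_exact(&mut received)?;
    server_side.write_all(&received)?;

    let mut echoed = vec![0u8; msg.len()];
    client.read_exact(&mut echoed)?;
    Ok(echoed)
}
```

Binding to `127.0.0.1:0`, passing the listener through `hand_over`, and then calling `accept_and_echo_once` on the result shows that the port never stops listening even though the original descriptor is gone.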
Tokio Integration
Ecdysis offers native async wrappers for Tokio. The TokioEcdysisBuilder creates listeners that are automatically converted into async streams, removing the need for manual glue code and enabling high‑performance, non‑blocking handling of new connections.
Systemd Support
When compiled with the systemd_notify feature, ecdysis sends readiness notifications to systemd, allowing units declared with Type=notify-reload to track upgrades accurately. The systemd_sockets feature further enables socket‑activated services to participate in graceful restarts without extra configuration.
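For readers unfamiliar with the notification mechanism, systemd's sd_notify protocol is a datagram of newline-separated `KEY=VALUE` pairs sent to the Unix socket named in the `NOTIFY_SOCKET` environment variable. The sketch below shows that wire format using only the standard library. It is not how ecdysis implements the feature; `send_state` and `notify_ready` are hypothetical names, and abstract-namespace sockets (paths starting with `@`) are deliberately not handled.

```rust
use std::env;
use std::io;
use std::os::unix::net::UnixDatagram;
use std::path::Path;

/// Send one sd_notify-style datagram, for example "READY=1", to `socket_path`.
pub fn send_state(socket_path: &Path, state: &str) -> io::Result<()> {
    let sock = UnixDatagram::unbound()?;
    sock.send_to(state.as_bytes(), socket_path)?;
    Ok(())
}

/// Report readiness if a notification socket is configured.
/// Returns Ok(false) when NOTIFY_SOCKET is unset, i.e. no manager is listening.
pub fn notify_ready() -> io::Result<bool> {
    let path = match env::var_os("NOTIFY_SOCKET") {
        Some(p) if !p.is_empty() => p,
        _ => return Ok(false),
    };
    // Assumption of this sketch: only filesystem socket paths are supported.
    if path.to_string_lossy().starts_with('@') {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "abstract notification sockets are not handled in this sketch",
        ));
    }
    send_state(Path::new(&path), "READY=1")?;
    Ok(true)
}
```

With `Type=notify-reload`, systemd additionally expects a `RELOADING=1` message carrying a `MONOTONIC_USEC=` timestamp when a reload begins, followed by `READY=1` once it has finished.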
Security Considerations
The fork‑exec approach isolates the new code in a fresh address space, and only explicitly passed descriptors survive the transition. All other handles are marked CLOEXEC, preventing accidental leakage. Services employing seccomp must permit fork() and execve(), a trade‑off accepted by many high‑availability network daemons.
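One way to audit which descriptors would survive an exec is to inspect their close-on-exec flag. The sketch below reads it from `/proc/self/fdinfo`, so it is Linux-specific. The function names are hypothetical, and the `O_CLOEXEC` constant is the value used on common Linux targets such as x86_64 and aarch64, which is an assumption of this example rather than something taken from ecdysis.

```rust
use std::fs;
use std::io;
use std::os::unix::io::AsRawFd;

/// Value of O_CLOEXEC on Linux x86_64 and aarch64 (assumption of this sketch).
const O_CLOEXEC: u32 = 0o2000000;

/// Extract the octal `flags:` field from the text of a fdinfo file.
pub fn parse_fdinfo_flags(fdinfo: &str) -> Option<u32> {
    fdinfo
        .lines()
        .find_map(|line| line.strip_prefix("flags:"))
        .and_then(|value| u32::from_str_radix(value.trim(), 8).ok())
}

/// Report whether the descriptor behind `handle` is marked close-on-exec.
/// Returns an error where /proc/self/fdinfo is unavailable.
pub fn is_close_on_exec<T: AsRawFd>(handle: &T) -> io::Result<bool> {
    let path = format!("/proc/self/fdinfo/{}", handle.as_raw_fd());
    let text = fs::read_to_string(path)?;
    let flags = parse_fdinfo_flags(&text).ok_or_else(|| {
        io::Error::new(io::ErrorKind::InvalidData, "fdinfo has no flags line")
    })?;
    Ok(flags & O_CLOEXEC != 0)
}
```

Sockets and files opened through the Rust standard library are created close-on-exec on Linux, so `is_close_on_exec` should report `true` for them unless the flag has been cleared deliberately to pass the descriptor to a child.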
Practical Example
The snippet below demonstrates a minimal TCP echo server that uses ecdysis to achieve zero‑downtime upgrades. It creates a listener, spawns a task to process connections, signals readiness with ready(), and then waits for a shutdown request before draining in‑flight connections and exiting.
use ecdysis::tokio_ecdysis::{SignalKind, StopOnShutdown, TokioEcdysisBuilder};
use tokio::{net::TcpStream, task::JoinSet};
use futures::StreamExt;
use std::net::SocketAddr;

#[tokio::main]
async fn main() {
    // SIGHUP triggers an upgrade; SIGUSR1 triggers a plain stop.
    let mut ecdysis_builder = TokioEcdysisBuilder::new(SignalKind::hangup()).unwrap();
    ecdysis_builder.stop_on_signal(SignalKind::user_defined1()).unwrap();

    // Create the listening socket that later generations will inherit.
    let addr: SocketAddr = "0.0.0.0:8080".parse().unwrap();
    let stream = ecdysis_builder
        .build_listen_tcp(StopOnShutdown::Yes, addr, |builder, addr| {
            builder.set_reuse_address(true)?;
            builder.bind(addr.into())?;
            builder.listen(128)?;
            Ok(builder.into())
        })
        .unwrap();

    // Accept connections until the stream ends, then drain in-flight handlers.
    let server_handle = tokio::spawn(async move {
        let mut stream = stream;
        let mut set = JoinSet::new();
        while let Some(Ok(socket)) = stream.next().await {
            set.spawn(handle_connection(socket));
        }
        set.join_all().await;
    });

    // Signal readiness, then wait until a shutdown is requested.
    let (_ecdysis, shutdown_fut) = ecdysis_builder.ready().unwrap();
    let shutdown_reason = shutdown_fut.await;
    log::info!("Shutting down: {:?}", shutdown_reason);

    server_handle.await.unwrap();
}

async fn handle_connection(mut socket: TcpStream) {
    // Echo every byte back to the client until it disconnects.
    let (mut reader, mut writer) = socket.split();
    if let Err(err) = tokio::io::copy(&mut reader, &mut writer).await {
        log::warn!("Connection error: {:?}", err);
    }
}
With zero‑downtime deployment increasingly expected of production network services, ecdysis is a timely addition to the Rust ecosystem.