Skip to Content
  • Home
  • Blog
  • Privacy Policy
  • Terms And conditions
  • Disclaimer
  • About Us
      • Home
      • Blog
      • Privacy Policy
      • Terms And conditions
      • Disclaimer
      • About Us
  • Knowledge Base
  • 137 migrations and counting: building a personal OS schema from scratch
  • 137 migrations and counting: building a personal OS schema from scratch

    14 March 2026 by
    Suraj Barman

    This article chronicles the evolution of a personal operating system built atop PostgreSQL, growing from a modest ten‑table Fastify prototype to a 147‑table schema after 137 migrations. It explores design choices such as a unified public schema, soft‑delete columns, pgvector embeddings, and a sync engine that together enable a self‑hosted, AI‑aware digital life manager.

    Initial Schema Design

    The first release defined thirteen tables, each with a UUID primary key generated by gen_random_uuid(). Core entities included notes, contacts, events, and an embeddings table that stored vector representations for semantic search. By installing the pgcrypto and vector extensions, the system prepared for future cryptographic operations and high‑dimensional similarity queries, laying a flexible foundation for later growth.

    Embedding Architecture with pgvector

    Embedding storage leveraged the vector data type, allowing 768‑dimensional arrays to be indexed with ivfflat for efficient nearest‑neighbor lookup. The embeddings table featured a domain column to differentiate source types (notes, emails, etc.) and a model_used field added later to track versioning. This design enabled a single retrieval path for any content, simplifying LLM‑driven augmentation pipelines.

    Unified Public Schema Decision

    All domain tables were placed in the default public schema rather than segregating by business area. This eliminated cross‑schema joins for the knowledge graph, allowing the entity_links table to reference any record via a generic (source_type, source_id) pair. The approach also streamlined hook implementation: a single CRUD interceptor could apply embedding, heat tracking, and entity extraction uniformly across all entities.

    Soft Delete Strategy with deleted_at

    Every table received a deleted_at timestamp from day one, providing a non‑destructive deletion model. This facilitated bidirectional sync with external services such as Gmail and Google Calendar, as the system could differentiate between never‑seen records and those explicitly removed. Restoration became a trivial UPDATE operation, and audit trails could reconstruct deletion timelines without additional logging tables.

    Sync Engine and Change Log

    The sync subsystem introduced change_log, sync_connectors, and sync_history tables. Each CRUD action emitted a log entry containing the affected record_id, operation type, and timestamp. Connectors consumed these entries to propagate changes to third‑party APIs, while sync_history recorded batch outcomes, enabling retry logic and conflict resolution. This architecture turned the database into an event‑sourced source of truth.

    Heat Scoring and Ranking Mechanism

    To prioritize frequently accessed items, a record_heat table stored a decay‑based score between 0 and 1. Every read operation incremented the score, while a background job applied exponential decay to older interactions. Search queries multiplied relevance scores by the heat factor, ensuring that hot records surfaced prominently without sacrificing lexical matching quality.

    Multi‑User Migration Challenges

    Introducing multi‑user support required adding a user_id column to every domain table, a migration that generated thousands of ALTER TABLE statements and index rebuilds. Earlier inclusion of user_id would have avoided this churn. Additionally, embedding versioning was retrofitted by adding model_version to the embeddings table, allowing incremental re‑embedding when switching from nomic‑embed‑text to qwen3‑embedding:0.6b. The experience highlighted the importance of anticipating cross‑tenant identifiers and model provenance from the outset.


    Latest Stories

    Explore fresh ideas and updates from our editorial team.

    See All
    Your Dynamic Snippet will be displayed here... This message is displayed because you did not provide enough options to retrieve its content.

    Copyright © 2026 TechStora. All Rights Reserved.