Tech Stack

Core Stack

| Layer | Technology | Purpose |
|---|---|---|
| Language | TypeScript (ESM, strict, ES2022) | Type-safe development |
| Runtime | Node.js 22+ | Server runtime |
| Package Manager | pnpm 9+ | Fast, disk-efficient package management |
| MCP | Official MCP TypeScript SDK | LLM agent integration (HTTP) |
| HTTP | Fastify 5 | High-performance REST API |
| API Contract | OpenAPI 3.1 (hand-written spec) | API documentation + code generation |
| Codegen | Kubb + openapi-typescript | Generate types and Zod schemas from OpenAPI |
| ORM | Drizzle ORM + Drizzle Kit | Type-safe database queries + migrations |
| Database | PostgreSQL via ParadeDB | Primary data store |
| Full-Text Search | ParadeDB BM25 (pg_search, source_code tokenizer) | Lexical code search |
| Vector Store | pgvector (HNSW, cosine distance) | Semantic similarity search |
| Job Queue | pg-boss (Postgres-backed) | Background job processing (no Redis) |
| Parsers | tree-sitter (9 languages) + custom markdown | AST-based code parsing |
| Embedding | Ollama (Metal) | Vector embedding generation |
| Frontend | Angular 21 + Angular Material + highlight.js | Admin dashboard |
| Testing | Vitest 3 + Testcontainers | Unit and integration tests |
| Container | Docker + Docker Compose | Deployment and local development |

Key Design Choices

PostgreSQL Does Everything

RepoRelay uses a single PostgreSQL instance (via ParadeDB) for:

  • Relational storage — repos, refs, files, symbols, chunks, imports
  • Full-text search — ParadeDB's BM25 index with source_code tokenizer
  • Vector search — pgvector HNSW index with cosine distance
  • Job queue — pg-boss stores jobs in Postgres tables (no Redis needed)
  • Real-time notifications — LISTEN/NOTIFY for indexing progress

This eliminates the need for Redis, Elasticsearch, or dedicated vector databases.
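To make the "cosine distance" used by the vector index concrete, here is a minimal sketch of the metric itself: one minus the cosine similarity of two vectors. The function name `cosineDistance` is illustrative and not part of RepoRelay; in the database the comparison is done by pgvector, not by application code.

```typescript
// Illustrative only: cosine distance = 1 - cosine similarity.
// 0 means same direction, 1 means orthogonal, 2 means opposite.
function cosineDistance(a: number[], b: number[]): number {
  if (a.length !== b.length || a.length === 0) {
    throw new Error("vectors must be non-empty and of equal length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) {
    throw new Error("cosine distance is undefined for zero vectors");
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

Because the metric ignores vector length, two embeddings that point the same way compare as identical regardless of magnitude.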

tree-sitter for Parsing

tree-sitter provides fast, incremental, language-agnostic parsing:

  • Produces concrete syntax trees (CSTs) for 9 languages
  • Custom extractors pull out functions, classes, interfaces, imports, and exports
  • Symbol-aware chunking respects function/class boundaries
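The idea behind symbol-aware chunking can be sketched as follows. This is a simplified illustration, not RepoRelay's implementation: the `SymbolRange` and `Chunk` shapes and the `chunkBySymbols` function are hypothetical, and the symbol ranges are assumed to have already been extracted from the syntax tree. Each symbol becomes its own chunk, and the code between symbols is grouped into separate chunks, so no chunk cuts through a function or class.

```typescript
// Hypothetical shapes for illustration; not RepoRelay's actual types.
interface SymbolRange {
  name: string;
  startLine: number; // 1-based, inclusive
  endLine: number; // 1-based, inclusive
}

interface Chunk {
  symbol: string | null; // null for code that sits between symbols
  startLine: number;
  endLine: number;
  text: string;
}

function chunkBySymbols(source: string, symbols: SymbolRange[]): Chunk[] {
  const lines = source.split("\n");
  const sorted = [...symbols].sort((a, b) => a.startLine - b.startLine);
  const chunks: Chunk[] = [];
  let cursor = 1; // next line not yet assigned to a chunk

  const push = (symbol: string | null, start: number, end: number): void => {
    const text = lines.slice(start - 1, end).join("\n");
    // Skip chunks that contain only whitespace.
    if (text.trim().length > 0) {
      chunks.push({ symbol, startLine: start, endLine: end, text });
    }
  };

  for (const sym of sorted) {
    // This sketch ignores nested or overlapping symbols.
    if (sym.startLine < cursor) continue;
    if (sym.startLine > cursor) push(null, cursor, sym.startLine - 1);
    push(sym.name, sym.startLine, sym.endLine);
    cursor = sym.endLine + 1;
  }
  if (cursor <= lines.length) push(null, cursor, lines.length);
  return chunks;
}
```

A real chunker would also need a policy for nested symbols (methods inside classes) and for symbols larger than the embedding model's context, which this sketch leaves out.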

Hybrid Search via RRF

Search combines two signals for better results:

  1. BM25 — Lexical matching via ParadeDB's pg_search extension
  2. Vector — Semantic similarity via pgvector cosine distance

Results are merged using Reciprocal Rank Fusion (RRF) for a single ranked list.
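As an illustration of the fusion step, here is a minimal sketch of Reciprocal Rank Fusion over two ranked lists of IDs. Each document scores the sum of 1 / (k + rank) across the lists it appears in. The function name `reciprocalRankFusion` and the default k = 60 are assumptions for this example (60 is a commonly used constant); the source does not state which value RepoRelay uses or where the fusion runs.

```typescript
// Minimal RRF sketch. Each inner array is one ranking, best match first.
// Assumes an ID appears at most once per ranking.
function reciprocalRankFusion(
  rankings: string[][],
  k = 60, // assumed smoothing constant, not confirmed by the source
): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

// Usage: fuse a lexical ranking with a semantic ranking.
const bm25Ranking = ["a", "b", "c"];
const vectorRanking = ["b", "d", "a"];
const fused = reciprocalRankFusion([bm25Ranking, vectorRanking]);
console.log(fused.map((r) => r.id)); // [ 'b', 'a', 'd', 'c' ]
```

RRF uses only rank positions, so the BM25 and cosine-distance scores never need to be normalized against each other. Documents found by both signals ("a" and "b" here) rise above those found by only one.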

Full-Index + SHA-256 Dedup

Every ref is indexed by listing ALL files via git ls-tree. The pipeline uses SHA-256 hashing to skip re-parsing, re-chunking, and re-embedding unchanged files. This gives every ref a complete ref_files set while keeping indexing fast for incremental updates.
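The dedup step can be sketched roughly like this. It assumes the hash is computed over file content and that the set of already-indexed hashes is available up front; `FileEntry`, `IndexPlan`, `sha256Hex`, and `planIndexing` are hypothetical names for illustration, not RepoRelay's API. Every file is recorded for the ref, but only content with an unseen hash is queued for parsing, chunking, and embedding.

```typescript
import { createHash } from "node:crypto";

// Hypothetical shapes for illustration.
interface FileEntry {
  path: string;
  content: string;
}

interface IndexPlan {
  refFiles: Array<{ path: string; hash: string }>; // complete file set for the ref
  toProcess: Array<{ path: string; hash: string }>; // content not seen before
}

function sha256Hex(content: string): string {
  return createHash("sha256").update(content, "utf8").digest("hex");
}

function planIndexing(
  files: FileEntry[],
  knownHashes: ReadonlySet<string>,
): IndexPlan {
  // Copy so that duplicates within this ref are also processed only once.
  const seen = new Set<string>(knownHashes);
  const plan: IndexPlan = { refFiles: [], toProcess: [] };
  for (const file of files) {
    const hash = sha256Hex(file.content);
    plan.refFiles.push({ path: file.path, hash });
    if (!seen.has(hash)) {
      seen.add(hash);
      plan.toProcess.push({ path: file.path, hash });
    }
  }
  return plan;
}
```

With this split, the cost of indexing a new ref scales with the amount of changed content rather than with the size of the repository, while the ref still ends up with a full file listing.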

Released under the MIT License.