Mnemonic: Local Agent Memory

Learn about Mnemonic, a local daemon providing AI coding agents persistent memory by capturing git commits and conversations, ensuring agents stop forgetting.

Rust SQLite HNSW ONNX all-MiniLM-L6-v2 MCP

Overview

Mnemonic is an open-source background daemon that gives AI coding agents persistent memory. It passively captures my git commits and my conversations with Claude Code, stores them locally in SQLite with vector, full-text, and knowledge-graph indexes, and serves the relevant slice back to the agent over MCP, so the agent stops re-asking what we already decided.

In the demo I will show it live, end to end: the daemon capturing a real session, hybrid retrieval (BM25 + HNSW vectors + a graph hop, fused with Reciprocal Rank Fusion) answering a query, the recall@5 / recall@20 / MRR eval harness I use to check whether retrieval actually improved, and the macOS menu-bar widget reading the same local HTTP API. Everything runs on-device: no cloud, no API keys.
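
As a rough illustration of the fusion and scoring steps named above, here is a minimal sketch in std-only Rust. It is not Mnemonic's actual code: the function names, the string ids, and the constant k = 60 (a value commonly used for Reciprocal Rank Fusion) are assumptions made for this example. Each retriever contributes 1 / (k + rank) per document, and the metrics are the standard definitions of recall@k and mean reciprocal rank.

```rust
use std::collections::HashMap;

/// Reciprocal Rank Fusion: score(d) = sum over rankings of 1 / (k + rank),
/// with 1-based ranks. Documents missing from a ranking contribute nothing.
fn rrf_fuse(rankings: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for ranking in rankings {
        for (i, id) in ranking.iter().enumerate() {
            *scores.entry((*id).to_string()).or_insert(0.0) += 1.0 / (k + i as f64 + 1.0);
        }
    }
    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    // Highest score first; ties broken by id so the order is deterministic.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    fused
}

/// recall@k: fraction of the relevant ids that appear in the top k results.
fn recall_at_k(ranked: &[&str], relevant: &[&str], k: usize) -> f64 {
    if relevant.is_empty() {
        return 0.0;
    }
    let top = &ranked[..k.min(ranked.len())];
    let hits = relevant.iter().filter(|r| top.contains(*r)).count();
    hits as f64 / relevant.len() as f64
}

/// Reciprocal rank of the first relevant result, or 0.0 if none was retrieved.
fn reciprocal_rank(ranked: &[&str], relevant: &[&str]) -> f64 {
    ranked
        .iter()
        .position(|id| relevant.contains(id))
        .map_or(0.0, |i| 1.0 / (i as f64 + 1.0))
}

/// MRR: mean of the reciprocal ranks over (ranked, relevant) pairs, one per query.
fn mean_reciprocal_rank(runs: &[(Vec<&str>, Vec<&str>)]) -> f64 {
    if runs.is_empty() {
        return 0.0;
    }
    let total: f64 = runs
        .iter()
        .map(|(ranked, relevant)| reciprocal_rank(ranked, relevant))
        .sum();
    total / runs.len() as f64
}
```

Because RRF uses only ranks, the BM25, vector, and graph scores never need to be put on a common scale, which is one reason it is a common choice for hybrid retrieval.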

Links

https://github.com/kossvat/mnemonic
Local-first SQLite memory engine using HNSW vectors and MCP.

Tech stack

Rust

Rust is a high-performance systems programming language that guarantees memory and thread safety via its compile-time ownership model.

Rust is a statically typed systems language engineered for performance and reliability, competing directly with C and C++ on speed. Its core innovation is the ownership model and the borrow checker, which enforce memory and thread safety at compile time: safe Rust rules out data races and null pointer dereferences without a garbage collector. Rust compiles to native machine code, and its "zero-cost abstractions" let high-level features compile down to code comparable to a hand-written low-level equivalent. Major industry players, including Microsoft and Cloudflare, use Rust for critical infrastructure, and it is officially supported for development in the Linux kernel.

https://www.rust-lang.org/

SQLite

SQLite is a C-language library: a self-contained, serverless, zero-configuration SQL database engine embedded directly into the application process.

SQLite is the world's most deployed database engine, functioning as a compact, C-language library (under 900KiB with all features) that eliminates the need for a separate server process. It operates as a serverless, zero-configuration system, storing the entire database (up to 281 terabytes) in a single, cross-platform file. This architecture makes it ideal for countless applications: it is built into all major mobile phones, web browsers, and desktop operating systems. The engine guarantees high reliability, supporting full ACID transactions, and its source code is freely available in the public domain for any use.

https://www.sqlite.org

HNSW

HNSW (Hierarchical Navigable Small World) is a graph-based algorithm for Approximate Nearest Neighbor (ANN) search over high-dimensional vectors. Its search cost scales roughly logarithmically with the number of indexed vectors, which keeps similarity retrieval fast on large collections.

Hierarchical Navigable Small World (HNSW) is one of the most widely used Approximate Nearest Neighbor (ANN) search algorithms in vector databases, offering a strong trade-off between speed and recall. It constructs a multi-layer proximity graph: the sparse upper layers hold long-range connections for rapid traversal, while the dense bottom layer, which contains every element, refines the result toward the nearest neighbors. Because the search is approximate, it can miss a true nearest neighbor; recall is tuned through the search and construction parameters. This hierarchical structure, detailed in the 2016 paper by Malkov and Yashunin, gives roughly logarithmic scaling of search cost with index size. Typical applications include large-scale image retrieval, real-time product recommendation engines, and Retrieval-Augmented Generation (RAG) systems.
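
To make the layer structure concrete, here is a small sketch of how the Malkov and Yashunin paper assigns a new element to its top layer: level = floor(-ln(u) * mL), with u drawn uniformly from (0, 1] and mL = 1 / ln(M), where M is the maximum number of connections per node. The function names are made up for this example, and u is passed in rather than sampled so the code has no dependencies; it is not taken from any particular HNSW library.

```rust
/// Top layer for a new element: floor(-ln(u) * mL), where mL = 1 / ln(M).
/// `u` is a uniform random draw in (0, 1]; `m` is the max connections per node.
fn sample_level(u: f64, m: usize) -> usize {
    assert!(u > 0.0 && u <= 1.0, "u must be in (0, 1]");
    assert!(m >= 2, "m must be at least 2");
    let ml = 1.0 / (m as f64).ln();
    (-u.ln() * ml).floor() as usize
}

/// Probability that an element reaches layer `l` or higher under this rule.
/// level >= l exactly when u <= exp(-l / mL) = M^(-l).
fn prob_level_at_least(l: u32, m: usize) -> f64 {
    (m as f64).powi(-(l as i32))
}
```

With M = 16, only about 1 in 16 elements reaches layer 1 and about 1 in 256 reaches layer 2, which is why the upper layers are sparse enough to act as long-range shortcuts.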

https://arxiv.org/abs/1603.09320

ONNX all-MiniLM-L6-v2

An ultra-lean, 384-dimensional sentence embedding model optimized in ONNX format for lightning-fast CPU and edge-device semantic search.

This packages the popular sentence-transformers/all-MiniLM-L6-v2 model in the portable ONNX format, so it can be executed with ONNX Runtime at a deployment footprint of roughly 90 megabytes for the FP32 weights. It maps sentences and short paragraphs into a dense 384-dimensional vector space, making it a common choice for clustering, duplicate detection, and retrieval-augmented generation (RAG) pipelines. Because inference no longer depends on a Python stack, developers can run local semantic similarity search from languages such as Rust, C++, Java, or browser-based JavaScript with modest CPU overhead.
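
Once two texts are embedded, semantic similarity is usually measured as the cosine of the angle between their vectors. The sketch below shows that comparison in std-only Rust; the function name is illustrative, the embeddings are assumed to have been produced elsewhere (for example by an ONNX Runtime session), and nothing here is specific to how Mnemonic compares vectors.

```rust
/// Cosine similarity between two embedding vectors of equal length.
/// Returns None if the lengths differ, the input is empty, or either
/// vector has zero norm (the angle is undefined in that case).
fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() || a.is_empty() {
        return None;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a * norm_b))
}
```

For this model both slices would have length 384. If the embeddings are L2-normalized ahead of time, the plain dot product gives the same ranking and skips the two square roots.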

https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2

MCP

MCP is an open standard for connecting LLM-based applications and agents to external tools, data sources, and enterprise workflows.

The Model Context Protocol (MCP) functions as a standardized integration layer: think of it as a USB-C port for AI applications. Developed and open-sourced by Anthropic, the protocol lets large language models (LLMs) access real-time context and execute actions through external tools such as GitHub, Jira, or proprietary databases. It uses JSON-RPC 2.0 messages to let servers expose tools (with input schemas), resources, and prompts, which enables AI agents to perform state-changing tasks, such as creating a GitHub issue or running a test script, rather than just generating text. MCP is a useful building block for agentic AI systems that pursue goals autonomously while operating within defined safety and permission boundaries.
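
As an illustration of what travels over that JSON-RPC interface, the sketch below builds an MCP `tools/call` request by hand in std-only Rust. The `tools/call` method and its `name` / `arguments` parameters come from the MCP specification; the tool name `search_memory` and its `query` argument used in the usage note are hypothetical and are not taken from Mnemonic. A real client would normally use a JSON library instead of string formatting.

```rust
/// Escape a string for embedding inside a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters must be \u-escaped in JSON.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Build a JSON-RPC 2.0 `tools/call` request with a single string argument
/// named "query" (a hypothetical argument chosen for this example).
fn tools_call_request(id: u64, tool: &str, query: &str) -> String {
    format!(
        r#"{{"jsonrpc":"2.0","id":{},"method":"tools/call","params":{{"name":"{}","arguments":{{"query":"{}"}}}}}}"#,
        id,
        json_escape(tool),
        json_escape(query)
    )
}
```

Calling `tools_call_request(1, "search_memory", "why sqlite?")` yields a single-line JSON object that an MCP server exposing a tool of that name could answer with a result message carrying the same `id`.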

https://modelcontextprotocol.io/
