qmd + nano-graphrag: You Do Not Need Pinecone for This
Local semantic search and graph RAG that runs on your filesystem. No vector database. No cloud service. Just fast, file-based knowledge retrieval that syncs however you want.
The Expensive Way
You've got notes. Docs. Meeting transcripts. Blog posts. Research. You want your agent to search them semantically. The internet says: spin up Pinecone. Or Weaviate. Or Qdrant. Or Chroma behind a Docker Compose file. Or Microsoft's GraphRAG with Neo4j and an Azure account.
For what? Searching your Obsidian vault and a folder of markdown files? Come off it.
qmd: Semantic Search That Just Works
qmd by Tobias Lütke (yes, the Shopify founder) is a CLI search engine for your markdown. Point it at a folder, it indexes everything locally, and you get hybrid search (BM25 + vector + LLM reranking) that's properly fast. 20,000 stars on GitHub.
qmd collection add docs ./my-docs --pattern "**/*.md"
qmd embed --collection docs
qmd query --collection docs "how does the auth flow work"
That's it. File-based SQLite index. Local embeddings. No server. No Docker. No API keys for the search itself. I've got three collections running: my learnings knowledge base, my Obsidian vault (500+ notes), and all the blog posts on this site. Searches come back in under 5 seconds with reranking, under a second without.
The index is just files on disk. Sync it with git, Google Drive, Dropbox, rsync, whatever you like. I keep mine in a git repo. Push from one machine, pull on another, instant shared knowledge base.
nano-graphrag: GraphRAG Without the Infrastructure
nano-graphrag is exactly what the name says: a simple, hackable GraphRAG implementation in Python. No Neo4j. No external database. File-based storage by default. It builds a knowledge graph from your documents (entities, relationships, communities) and lets you query it with both local search (entity-focused) and global search (theme-focused).
pip install nano-graphrag
The whole thing runs in a single Python process. The graph, embeddings, and community reports all live on your filesystem. I use it in my /reflect skill to index learnings from coding sessions. When I solve a problem, the solution gets captured as a learning note with entity extraction and indexed into nano-graphrag; any future session can then retrieve it by searching for the error message, the technology, or the pattern.
3,700 stars. MIT licensed. The codebase is small enough to read in an afternoon, which is the whole point. If something breaks or you want to customise how entities get extracted, you can actually understand and modify the code. Try doing that with Microsoft's GraphRAG.
Why This Matters
The RAG space has a massive over-engineering problem. For most developers, the knowledge base is a few hundred documents. Maybe a few thousand. At that scale, a SQLite index and a local graph run circles around a cloud vector database because there's zero network latency, zero cold start, and zero monthly bill.
The expensive infrastructure makes sense when you're indexing millions of documents for a production SaaS product. For personal knowledge management, agent memory, and team documentation? Local file-based tools are faster, cheaper, and simpler. And you own all of it.
My Setup
- qmd indexes my Obsidian vault, learnings KB, and blog posts. Three collections, different sync strategies. Hybrid search finds anything in seconds.
- nano-graphrag powers the entity graph in my reflect skill. When I fix a tricky bug, the entities (technology, error, pattern) get extracted and linked. Next time I hit a similar issue, the graph surfaces the prior solution.
- Both sync via git. Push, pull, done. No Pinecone dashboard. No usage-based billing. No vendor lock-in.
You do not need a vector database yet. You might never need one. Start here.