Index any repo in your browser. Chat with its code using your own API key. Embeddings, retrieval, and storage — all on-device.
Try an example
How it works
Any public repo. Private repos with a token.
AST chunking + WebGPU embeddings. No server. Binary quantization shrinks vectors 32x. For example, mlc-ai/web-llm goes from 2.1 MB to 67 KB.
Chat with your LLM of choice. Results cite real code.
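The 32x figure comes from replacing each 32-bit float dimension with a single sign bit. A minimal sketch of that idea (the function name `binaryQuantize` is illustrative, not from the actual codebase):

```typescript
// Sketch: sign-based binary quantization of an embedding vector.
// Each float32 dimension (32 bits) becomes one bit: 1 if positive,
// 0 otherwise — hence the 32x size reduction.
function binaryQuantize(embedding: number[]): Uint8Array {
  const packed = new Uint8Array(Math.ceil(embedding.length / 8));
  embedding.forEach((v, i) => {
    if (v > 0) packed[i >> 3] |= 1 << (i & 7); // set bit for positive dims
  });
  return packed;
}

// A 384-dim float32 vector is 1536 bytes; packed, it is 48 bytes.
const packed = binaryQuantize(new Array(384).fill(0.5));
```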
Route params resolved, indexing starts
Repo tree + blobs fetched in the browser
tree-sitter WASM semantic splits
transformers.js WebGPU vectors
Compressed for fast Hamming search
Vectors + metadata persisted locally
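With vectors stored as packed bits, similarity search reduces to Hamming distance — XOR plus a popcount — which is why the compressed index stays fast. A minimal sketch (the function name is illustrative):

```typescript
// Sketch: Hamming distance between two bit-packed vectors.
// XOR differing bytes, then count set bits; lower distance = more similar.
function hammingDistance(a: Uint8Array, b: Uint8Array): number {
  let dist = 0;
  for (let i = 0; i < a.length; i++) {
    let x = a[i] ^ b[i];
    while (x) { x &= x - 1; dist++; } // Kernighan's popcount trick
  }
  return dist;
}
```

Ranking candidates is then a linear scan over the index, sorting by this distance.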
Natural language query to the repo
Multi-query generation (CodeRAG-style)
Hybrid search. Hamming + regex
Reciprocal Rank Fusion over both paths
Matryoshka reranker on candidates
Best chunks assembled for context
Qwen2-0.5B in a dedicated web worker
Streamed answer + verification loop
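The fusion step above can be sketched as standard Reciprocal Rank Fusion: each result list (Hamming hits, regex hits) contributes 1/(k + rank) per document, and candidates are re-sorted by the summed score. This is a generic RRF sketch, not the app's actual code; k = 60 is the conventional constant.

```typescript
// Sketch: Reciprocal Rank Fusion over multiple ranked result lists.
// Documents appearing high in several lists rise to the top.
function rrf(rankings: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, rank) => {
      // rank is 0-based here, so the score term is 1 / (k + rank + 1)
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank + 1));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

// "b" ranks 2nd and 1st across the two lists, so it wins the fusion.
const fused = rrf([["a", "b", "c"], ["b", "c", "a"]]); // → ["b", "a", "c"]
```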