Case study

Knowledge OS

A map of research fields as evolving problems, built on 14,400 real papers.

Python · scikit-learn · SQLite · OpenAlex · React 19 · Vite · TypeScript Live ↗ Code ↗

Knowledge OS landing page with the knowledge graph — Knowledge OS, live: browse a field as a problem, not a reading list.

The problem

Research tools hand you papers, a long list of PDFs to wade through. They quietly skip the thing you actually want to know: how a field got here. What was tried, what flopped, what won, and what's still wide open. Knowledge OS flips that around. You browse a field as a problem, on a real map, instead of as a folder of citations.

It runs on real data: 14,400 papers and ~379k citations from OpenAlex, sorted into 36 computer-science problems and the sub-problems hiding inside them.

Architecture

A build-time Python pipeline turns a live citation graph into static JSON. Everything at runtime happens in the browser.

The engine is layered, L0 through L6: ingest OpenAlex into SQLite, extract problems and sub-problems with TF-IDF and K-means, compute evolution timelines and a citation "universe" graph, then run a retrieval agent and an "AI Scientist" that points out bridge opportunities between fields. A static export (export_static.py) turns the corpus database into the JSON the React and Vite app reads. Nothing paid sits in the hot path.

Key decisions & trade-offs

Static, client-side, $0 to run. The whole corpus is precomputed to JSON and served from a CDN, so there's no backend, no database, and no API keys at runtime. The trade-off: refreshing the data means re-running the pipeline, not firing a live query.
In-browser retrieval agent instead of an LLM API. Answers are retrieved and synthesized over committed data, so there's no bill and nothing leaks to a third party. The trade-off: it's grounded retrieval, not open-ended generation. By design it would rather say less than hallucinate.
TF-IDF and K-means for sub-problem extraction. Cheap, deterministic, and explainable, with the per-paper LLM path built but switched off by default. The trade-off: clustering is coarser than LLM-grade extraction, which is on the roadmap.

The hardest part

OpenAlex is real-world data, which means the corpus is gloriously messy: odd titles, citation counts that split or double, sparse and duplicated edges. Feeding that straight into TF-IDF and K-means produced "problems" that were really just noise, clusters that didn't map to anything a person would recognise as a field.

Getting it to read cleanly took two passes: normalising the citation graph before clustering, deduping edges and reconciling the split counts so the signal wasn't drowned out, and then tuning the feature extraction until the clusters lined up with real, nameable problems. The guiding rule was to surface the mess honestly rather than paper over it, so the occasional odd title stays visible instead of hiding behind a confident lie.

Outcome

Live at evolve-snowy.vercel.app with 14,400 papers, ~379k citations, and 36 CS problems.
Runs at $0 on static hosting, scales through a CDN, and ships with a test suite (tests/test_knowledge_os.py).
Features: problem-first navigation, evolution timelines, a ⌘K research agent, a knowledge graph, and an "AI Scientist" that finds bridge opportunities across fields.

What I'd do next

From the roadmap: LLM-grade problem extraction across the full corpus, scaling toward 500k papers, and pushing past computer science. The engine is domain-agnostic, so widening it is basically one ingest filter away.