
Engram documentation

Engram turns Microsoft OneNote courses (exported as PDF) into a local, Obsidian-native index of your knowledge — searchable summaries of each page's key ideas and formulas, organized as a hierarchy that is also a concept graph.

Overview

Engram is an index and connector, not a transcriber. For each note it writes a concise, searchable summary of the page's key ideas and all its formulas, and keeps the original page right below as the source of truth. It then links related notes and distils a multi-layer concept graph — so you can find an idea, see what it connects to, and jump back to the page.

🚀 Don't import everything at once — start with one section group (one class).

Bringing in your whole library on day one means committing hours of model time before you know whether the summaries, model, and filters are right for you. Try one course first, look at the results, then scale.

  1. Export one course to PDF (one section group — see Getting notes out) and put the .pdf on your Mac.
  2. Add it to a domain (the domain is auto-created):
    uv run python -m engram add "Data Science" "CSE 234.pdf"
  3. Open the Mac app and read a few notes — are the summaries capturing the right ideas and formulas? Is anything you care about getting skipped?
  4. Tune in Settings (gear): pick the transcription model (Claude Code = best; local 7B = fast/offline) and adjust the filters.
  5. Happy? Add the rest of your courses, then build the graph:
    uv run python -m engram link "Data Science"
    uv run python -m engram concepts "Data Science" --layers 3

Why one course first
It's the cheapest way to (a) judge summary quality, (b) decide between Claude Code and a local model, (c) confirm the filters skip what you expect, and (d) if you use Claude Code, see how it behaves before a batch of a few hundred notes risks hitting rate limits.
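
Once the first course looks right, the remaining steps can be scripted. The following is a minimal sketch, not part of Engram: it only assembles the `add`, `link`, and `concepts` commands shown above, and the helper names `plan` and `run` are hypothetical. It defaults to a dry run that prints the commands instead of executing them.

```python
import shlex
import subprocess

# Prefix used throughout these docs to invoke the CLI from the repo root.
ENGRAM = ["uv", "run", "python", "-m", "engram"]


def plan(domain: str, pdfs: list[str], layers: int = 3) -> list[list[str]]:
    """Build the argv lists: add each PDF, then link, then build the graph."""
    cmds = [ENGRAM + ["add", domain, pdf] for pdf in pdfs]
    cmds.append(ENGRAM + ["link", domain])
    cmds.append(ENGRAM + ["concepts", domain, "--layers", str(layers)])
    return cmds


def run(cmds: list[list[str]], dry_run: bool = True) -> None:
    """Print each command, or execute it and stop at the first failure."""
    for argv in cmds:
        if dry_run:
            print(shlex.join(argv))
        else:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    run(plan("Data Science", ["CSE 234.pdf", "DSC 120.pdf"]))
```

Passing `dry_run=False` executes the commands in order, so a failed `add` stops the batch before the link and graph steps run.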

Install & requirements

# set up; --extra local adds the on-device vision model (mlx-vlm)
uv sync --extra dev --extra local

cd macapp && ./build_app.sh && open Engram.app   # the GUI

How notes are organized

Engram mirrors OneNote's structure with one extra grouping level:

Domain                 e.g. "Data Science"  (many courses, one knowledge base)
└─ Section group       e.g. "CSE 234"       (one exported PDF = one course)
   └─ Section          e.g. "MLSys"
      └─ Note           a OneNote page → <Title>.md  +  <Title>.pdf (the leaf)

On disk: ~/Engram/<Domain>/<SectionGroup>/<Section>/<Title>.{md,pdf}. The split per-note PDF leaf is the source of truth; the .md summary is a searchable projection above it.
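
To make the layout concrete, here is a small sketch that maps one note to its two files. The function `note_paths` is hypothetical, not an Engram API. It assumes the vault root is `~/Engram` (the docs list a `vault_path` config key, which presumably changes it) and that titles are used as file names unchanged.

```python
from pathlib import Path


def note_paths(domain, section_group, section, title,
               vault=Path.home() / "Engram"):
    """Return (summary .md, source .pdf) for one note in the documented layout."""
    folder = Path(vault) / domain / section_group / section
    return folder / f"{title}.md", folder / f"{title}.pdf"


if __name__ == "__main__":
    md, pdf = note_paths("Data Science", "CSE 234", "MLSys", "Intro")
    print(md)
    print(pdf)
```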

Getting notes out of OneNote

OneNote can't reliably export a whole large notebook, so export one section group (one course) at a time. The repo's windows-export/ helper automates this.

The result is one <Course>.pdf per section group, ready for engram add.

An index, not a transcription

Each note becomes a concise, searchable digest: every key concept, definition, and term, plus all formulas verbatim in LaTeX — not a word-for-word copy. Routine examples and arithmetic are summarized; the full page is one scroll below. The summary never invents content, and illegible parts are flagged.

Transcription models

Choose the model that reads your pages in Settings → Transcription model (or via the CLI). All produce the same index-style summary.

| Option | What it is | Best for |
| --- | --- | --- |
| claude-code | Claude vision via your Claude Code login (default) | Best quality & math; no API key. Watch rate limits on big batches. |
| local-7b | Qwen2.5-VL 7B on-device (mlx-vlm) | Fast, free, fully offline/private. |
| local-32b | Qwen2.5-VL 32B on-device | Better local quality; ~3-4× slower, needs ~64 GB. |

uv run python -m engram config set transcribe local-7b   # or claude-code / local-32b

The concept-map & cross-link reasoning runs through Claude Code too (text only) — pick the model under Settings → Concept & cross-link model.

Filters — what gets indexed

Engram indexes knowledge, not everything. By default it skips (and reports — never silently drops) content that's noise for an index. Skipped notes still exist in the source PDF. Toggle these in Settings or the config.

| Filter | Skips | Config |
| --- | --- | --- |
| Papers | attached papers & printouts/scans | skip_papers |
| Homeworks | homework / hw / assignment / lab / project | skip_homework |
| Long code | code notebooks over N pages | skip_code_over_pages |
| Custom | any section/title substring you add (e.g. "lit") | skip_section_patterns |

Page count is the real span
The long-code filter measures a note's actual page span, not the footer "第 N 页" ("Page N") number, which is the section's running page index and can start mid-section.
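
The filter rules can be pictured as one decision function that returns a reason instead of silently dropping a note. This is an illustrative sketch, not Engram's implementation. The four config key names come from the table. Everything else is an assumption: the value types and defaults, the whole-word matching of homework keywords, the case-insensitive substring match for custom patterns, and the `is_paper` / `is_code` flags (how Engram detects papers and code notebooks is not documented here).

```python
import re
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Filters:
    # Key names are from the docs; types and default values are assumed.
    skip_papers: bool = True
    skip_homework: bool = True
    skip_code_over_pages: int = 10
    skip_section_patterns: list[str] = field(default_factory=list)


HOMEWORK_WORDS = {"homework", "hw", "assignment", "lab", "project"}


def page_span(first_pdf_page: int, last_pdf_page: int) -> int:
    """Pages the note really occupies in the PDF, ignoring footer numbers."""
    return last_pdf_page - first_pdf_page + 1


def skip_reason(section, title, first_pdf_page, last_pdf_page,
                is_code, is_paper, filters) -> Optional[str]:
    """Return the config key that caused a skip, or None to index the note."""
    text = f"{section} {title}".lower()
    if filters.skip_papers and is_paper:
        return "skip_papers"
    if filters.skip_homework and HOMEWORK_WORDS & set(re.findall(r"[a-z]+", text)):
        return "skip_homework"
    if is_code and page_span(first_pdf_page, last_pdf_page) > filters.skip_code_over_pages:
        return "skip_code_over_pages"
    for pattern in filters.skip_section_patterns:
        if pattern.lower() in text:
            return "skip_section_patterns"
    return None
```

Returning the key name mirrors the "skips and reports" behaviour: the caller can list every skipped note together with the setting that would re-enable it.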

Cross-links

A holistic, domain-wide pass adds a ## Related section to each note, linking it to related notes, including notes in other section groups. The links are safe by construction: none dangle, every link is symmetric, and the count is capped.

uv run python -m engram link "Data Science"
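
One way to read "safe by construction" is that proposed links pass through a step that enforces all three guarantees at once. The sketch below illustrates that idea under stated assumptions: `safe_links` is a hypothetical name, the cap of 5 is invented for the example, and candidates are assumed to arrive strongest first. It is not Engram's actual code.

```python
from collections import defaultdict


def safe_links(candidates, existing_notes, cap=5):
    """Accept undirected note pairs so that no link dangles, every link is
    mirrored on both notes, and no note exceeds `cap` links."""
    links = defaultdict(list)
    for a, b in candidates:
        if a == b or a not in existing_notes or b not in existing_notes:
            continue  # would dangle or point at itself
        if b in links[a]:
            continue  # already linked
        if len(links[a]) >= cap or len(links[b]) >= cap:
            continue  # one side is already full
        links[a].append(b)
        links[b].append(a)  # mirror immediately, so symmetry always holds
    return {note: related for note, related in links.items() if related}
```

Because a pair is only accepted when both notes have room, the cap never forces a one-sided link to be dropped afterwards.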

Concept graph

Beyond folders, Engram distils the domain into a multi-layer graph of concepts (not files): Concepts → Themes → Areas. Each level is a tab in the app you switch between to change granularity. Nodes are colored by cluster (their parent); edges are solid (two concepts share a note) or dashed (related by the model's general knowledge). Click a concept to see — and open — the notes it covers.

uv run python -m engram concepts "Data Science" --layers 3   # 3 tabs

In the app: the brain icon builds it (pick scope + layers); the Graph icon opens it.
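
The solid-edge rule (two concepts share a note) is simple enough to sketch. The function below is hypothetical and assumes each concept maps to the set of note titles it covers. Dashed edges come from the model's general knowledge, so they cannot be derived this way.

```python
from itertools import combinations


def solid_edges(concept_notes: dict[str, set[str]]) -> set[tuple[str, str]]:
    """Return concept pairs, in alphabetical order, that share a note."""
    edges = set()
    for a, b in combinations(sorted(concept_notes), 2):
        if concept_notes[a] & concept_notes[b]:
            edges.add((a, b))
    return edges


if __name__ == "__main__":
    print(solid_edges({
        "MDP": {"n1", "n2"},
        "Value iteration": {"n2"},
        "Attention": {"n3"},
    }))
```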

Cross-notebook graphs

A concept graph can span one notebook or several. Building across notebooks (e.g. Data Science + Mathematics) surfaces concepts that connect across domains. In the brain builder, just check 2+ notebooks; on the CLI, pass more than one domain:

uv run python -m engram concepts "Data Science" "Mathematics" --layers 3

Combined maps are stored under ~/Engram/_concept_maps/<A + B>/; clicking a concept opens its note even if it lives in the other notebook. A combined map is shared — it opens from either member notebook's entrance, and a switcher lets you flip between a notebook's own map and any shared one.

Chat — query & synthesis

Once a concept graph exists, you can chat with your knowledge base. Context is pulled deterministically from the graph and your notes (no embeddings) and the answer is grounded and cited — it links the notes it drew on as [[wikilinks]] (shown as clickable Sources), and anything the model adds beyond your notes is put under a clearly-marked "Beyond your notes" section. Two modes:

| Mode | For | Context it sees |
| --- | --- | --- |
| Query | a specific question about one course or one concept | that concept's notes + its graph neighbours (or all of a course's note summaries) |
| Synthesis | a big-picture question about a whole field | the concept graph + overview MOCs + the field's note summaries (budgeted, most-central first) |
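
Deterministic context selection for the two modes might look like the sketch below. It is an illustration only. The function names are hypothetical, "most central" is approximated by degree (neighbour count), and the budget is counted in characters; the docs do not say which centrality measure or budget unit Engram uses.

```python
def query_context(concept, neighbours, concept_notes):
    """The concept's own notes first, then its graph neighbours' notes,
    deduplicated and in a stable order."""
    ordered = []
    for c in [concept, *sorted(neighbours.get(concept, ()))]:
        for note in sorted(concept_notes.get(c, ())):
            if note not in ordered:
                ordered.append(note)
    return ordered


def synthesis_context(neighbours, concept_notes, summaries, budget_chars):
    """Walk concepts from highest degree to lowest, adding note summaries
    until the next one would exceed the budget."""
    ranked = sorted(neighbours, key=lambda c: (-len(neighbours[c]), c))
    chosen, used = [], 0
    for c in ranked:
        for note in sorted(concept_notes.get(c, ())):
            if note in chosen:
                continue
            cost = len(summaries[note])
            if used + cost > budget_chars:
                return chosen
            chosen.append(note)
            used += cost
    return chosen
```

Sorting at every step is what makes the result repeatable: the same graph and notes always yield the same context, with no embedding lookup involved.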

In the app: the Chat toolbar button opens it with Query / Synthesis tabs; you can also click a node in the graph and pick "Ask about this" to chat about that concept. The chat model defaults to Claude and is switchable in Settings.

# ask about one concept, or one course
uv run python -m engram chat query "Data Science" "How does value iteration converge?" --concept "Markov Decision Processes"
uv run python -m engram chat query "Data Science" "Give me an exam cheat-sheet" --course "DSC 120"

# a big question about a whole field (one or more notebooks)
uv run python -m engram chat synthesis "Mathematics" -q "What's the unifying story, and what's missing?"

What chat can and can't reach
Chat answers using your notes plus the model's own knowledge (which, depending on the model, may include general/world knowledge). It does not yet browse the internet or use external (MCP) tools; that is planned for a later version.

The Mac app

cd macapp && ./build_app.sh && open Engram.app

CLI reference

# domains
engram domain create "Data Science"
engram domain list
engram domain remove "Data Science"

# add / update one section group (a PDF) — incremental
engram add "Data Science" "CSE 234.pdf" [--group "CSE 234"]

# domain-wide cross-links
engram link "Data Science"

# multi-layer concept map (1 domain, or several for cross-notebook)
engram concepts "Data Science" ["Mathematics"] [--layers 3]

# chat (needs a concept graph): query a course/concept, or synthesise a field
engram chat query "Data Science" "…question…" (--concept "…" | --course "…")
engram chat synthesis "Mathematics" ["Data Science"] -q "…big question…"

# inspect structure (no model) / status / config
engram inspect "CSE 234.pdf"
engram status
engram config set transcribe claude-code      # transcribe / reasoning_model / chat_model / skip_* / vault_path
engram config set-key sk-ant-…                 # optional Claude API key

Run as uv run python -m engram … from the repo root.

Honest limitations

Engram is an early v0.1 research prototype.