Published on May 26, 2025 6:26 PM GMT
In my experience, the most annoyingly unpleasant part of research[1] is reorganizing my notes during and (especially) after a productive research sprint. The "distillation" stage, in Neel Nanda's categorization. I end up with a large pile of variously important discoveries, promising threads, and connections, and the task is to then "refactor" that pile into something compact and well-organized, structured in the image of my newly improved model of the domain of study.
That task is of central importance:
- It's a vital part of the actual research process. If you're trying to discover the true simple laws/common principles underlying the domain, periodically refactoring your mental model of that domain in light of new information is precisely what you should be doing. Reorganizing your notes forces you to do just that: distilling a mess into elegant descriptions.
- It allows you to get a bird's-eye view on your results, what they imply and don't imply, what open questions are the most important to focus on next, what nagging doubts you have, what important research threads or contradictions might've ended up noted down but then forgotten, et cetera.
- It does most of the work of transforming your results into a format ready for consumption by other people.
A Toy Example
Suppose you're studying the properties of matter, and your initial ontology is that everything is some combination of Fire, Water, Air, and Earth. Your initial notes are structured accordingly: there are central notes for each element, branching off from them are notes about interactions between combinations of elements, case studies of specific experiments, attempts to synthesize and generalize experimental results, et cetera.
Suppose that you then discover that a "truer", simpler description of matter involves classifying it along two axes: "wet-dry" and "hot-cold". The Fire/Water/Air/Earth elements are still relevant, revealed to be extreme types of matter sitting in the corners of the wet-dry/hot-cold square. But they're no longer fundamental to how you model matter.
Now you need to refactor your entire mental ontology – and your entire notebase. You need to add new nodes for the wetness/temperature spectra, you need to wholly rewrite the notes about the elements to explicate their nature as extreme states of matter (rather than its basic building blocks), you need to do the same for all notes about elemental interactions, you need to re-interpret the experimental results, and you need to ensure you don't overlook any subtle evidence contradicting the new ontology, or stray thoughts that might lead to an insight regarding an even-more-correct ontology, or absent-minded ideas about high-impact applications and research avenues...
At a sufficiently abstract level, what you should do is: fetch all information relevant to the new ontology from your old-ontology notes, use that information to properly flesh the new ontology out, then redefine the old ontology in the new ontology's terms.
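To make that last step concrete, here's a minimal, purely illustrative code sketch of "redefine the old ontology in the new ontology's terms". The element-to-axis assignments follow the classical scheme; all names and types here are invented for the example, not taken from any real notebase:

```python
from dataclasses import dataclass

# New ontology: matter is characterized by two axes rather than four primitives.
@dataclass(frozen=True)
class Matter:
    wetness: float      # -1.0 (dry) ... +1.0 (wet)
    temperature: float  # -1.0 (cold) ... +1.0 (hot)

# Refactor step: the old primitives are redefined as extreme points of the new axes,
# so every old-ontology note can be re-read in new-ontology terms.
ELEMENTS_AS_MATTER = {
    "Fire":  Matter(wetness=-1.0, temperature=+1.0),  # dry and hot
    "Air":   Matter(wetness=+1.0, temperature=+1.0),  # wet and hot
    "Water": Matter(wetness=+1.0, temperature=-1.0),  # wet and cold
    "Earth": Matter(wetness=-1.0, temperature=-1.0),  # dry and cold
}

def reinterpret(old_note_tags: list[str]) -> list[Matter]:
    """Map an old note's element tags into the new two-axis ontology."""
    return [ELEMENTS_AS_MATTER[t] for t in old_note_tags if t in ELEMENTS_AS_MATTER]
```

The hard part, of course, is everything this sketch elides: working out what the old notes actually said, and noticing which of them quietly contradict the new axes.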
This isn't only a frontier-researcher problem: something similar happens whenever I'm studying a domain that's already well-explored. I start with a flawed model centered around incorrect variables. Gradually, as I learn more, my thinking re-organizes around truer central elements. Once enough changes have accumulated, over the course of months or years, my mental representation ends up having little in common with my initial one.
But the state of the corresponding notebase usually drags behind.
Refactoring your mental ontology is relatively easy: it just requires thinking, and the interface for navigating and editing your world-model is very rich and flexible. Friction costs of mental actions are nonzero, but low.
The same is not true for note-taking. The tools for it do a fairly poor job of accommodating this kind of restructuring; even e. g. Obsidian's canvas falls short. They impose a lot of additional friction, and their interfaces and editing features aren't optimized for such at-scale refactors.
My impression is that a lot of people run into similar issues when trying to use notebases as "second brains".[2]
Why Not Just Start From Scratch?
Arguably, the solution is to just periodically start from scratch. Instead of trying to edit the old notebase, you spin up a new one, writing directly from your updated world-model, and delete the old one.
I think this is very suboptimal, in two ways.
First, those outdated notebases do still hold a lot of value:
- Human memory, especially working memory, is painfully limited. Having a notebase ensures that you don't forget any phenomena and connections you don't commonly encounter[3]; that you're reminded to properly propagate the updates downstream of the new ontology to all corners of your world-model.
- The excitement of discovering an apparently better ontology might blind you to its issues. When thinking in its terms, you would, by definition, only be able to think about the phenomena that could be easily described in its terms. If there are any broad swathes of phenomena it fails at, or any subtle-but-crucial issues with interpreting past data through the new lens, you may end up unable to perceive them. A stable, externalized record of all of this forces you to properly think through it all.
Basically, "rewrite the notebase from scratch" has a lot of the same issues as "rewrite the codebase from scratch".
Second, even if you're taking a sufficiently wise "rewrite it from scratch" approach, where you're constantly reviewing your previous notebase to ensure you're not missing anything... That is a lot, a lot of work.
Work that coincidentally forces you to do useful conceptual thinking, yes. But a significant fraction of it is just drudgery forced on you by UI shortcomings.
What would the ideal interface for this be? Something that slashes the above friction costs. Something that allows you to flexibly vary the representation of your knowledge – the concepts you describe it in terms of – while ensuring that all information (including subtle, forgotten, yet crucially important doubts) is retained.
Generalized Representation-Flipping
In a way, what would be ideal here is a generalization of my idea about an "exploratory medium for mathematics":
A big part of highly theoretical research is flipping between different representations of the problem: viewing it in terms of information theory, in terms of Bayesian probability, in terms of linear algebra; jumping from algebraic expressions to the visualizations of functions or to the nodes-and-edges graphs of the interactions between variables; et cetera.
The key reason behind it is that research heuristics bind to representations. E. g., suppose you're staring at some graph-theory problem. Certain problems of this type are isomorphic to linear-algebra problems, and they may be trivial in linear-algebra terms. But unless you actually project the problem into the linear-algebra ontology, you're not necessarily going to see the trivial solution when staring at the graph-theory representation. (Perhaps the obvious solution is to find the eigenvectors of the adjacency matrix of the graph – but when you're staring at a bunch of nodes connected by edges, that idea isn't obvious in that representation at all.)
This is a bit of a simplified example – the graph theory/linear algebra connection is well-known, so experienced mathematicians may be able to translate between those representations instinctively – but I hope it's illustrative.
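As a minimal illustration of what "projecting into the linear-algebra ontology" buys you, here's a sketch (assuming numpy and networkx; the specific toy graph, and reading the leading eigenvector as a centrality ranking, are just one arbitrary example of a heuristic that only becomes visible in matrix form):

```python
import numpy as np
import networkx as nx

# The "nodes and edges" representation of a small toy graph.
G = nx.Graph([(0, 1), (1, 2), (2, 0), (2, 3), (3, 4)])

# Project into the linear-algebra representation: the adjacency matrix.
A = nx.to_numpy_array(G)

# In this representation, spectral heuristics are suddenly "right there":
# e.g. the eigenvector of the largest eigenvalue ranks nodes by centrality.
eigenvalues, eigenvectors = np.linalg.eigh(A)      # A is symmetric
leading = eigenvectors[:, np.argmax(eigenvalues)]
centrality = np.abs(leading) / np.abs(leading).sum()
print({node: round(score, 3) for node, score in enumerate(centrality)})
```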
As a different concrete example, consider John Wentworth's Bayes Net Algebra. This is essentially an interface for working with factorizations of joint probability distributions. The nodes-and-edges representation is more intuitive and easier to tinker with than the "formulas" representation, which means that having concrete rules for tinkering with graph representations without committing errors would significantly speed up how quickly you can reason through related math problems. Imagine if the derivation of such frameworks were automated: if you could set up a joint PD in terms of formulas, automatically project the setup into graph terms, start tinkering with it by dragging nodes and edges around, and get errors if and only if back-projecting the changed "graph" representation into the "formulas" representation results in a setup that's non-isomorphic to the initial one.
(See also this video, and the article linked above.)
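Here's a rough sketch of what that "project into graph terms, tinker, error out iff the back-projection changes the setup" loop could look like. This is not Wentworth's actual framework: as the isomorphism check it uses the standard Markov-equivalence criterion (same skeleton, same v-structures), and the example factorization, names, and helper functions are all invented for illustration:

```python
from itertools import combinations
import networkx as nx

# "Formula" representation of a joint distribution's factorization:
# P(A) * P(B|A) * P(C|A) * P(D|B,C), written as variable -> parents.
factorization = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"]}

def to_dag(factorization):
    """Project the formula representation into the nodes-and-edges representation."""
    g = nx.DiGraph()
    g.add_nodes_from(factorization)
    for child, parents in factorization.items():
        g.add_edges_from((p, child) for p in parents)
    return g

def same_independence_structure(g1, g2):
    """Two DAGs encode the same conditional independencies iff they share
    the same skeleton and the same v-structures (colliders with non-adjacent parents)."""
    def skeleton(g):
        return {frozenset(e) for e in g.edges()}
    def v_structures(g):
        return {
            (a, node, b)
            for node in g.nodes()
            for a, b in combinations(sorted(g.predecessors(node)), 2)
            if not g.has_edge(a, b) and not g.has_edge(b, a)
        }
    return skeleton(g1) == skeleton(g2) and v_structures(g1) == v_structures(g2)

original = to_dag(factorization)

# "Drag an edge around": reverse A -> B into B -> A.
edited = original.copy()
edited.remove_edge("A", "B")
edited.add_edge("B", "A")

# Error if and only if the edit no longer describes the same family of distributions.
if not same_independence_structure(original, edited):
    raise ValueError("This edit changes the implied independence structure.")
print("Edit preserved the independence structure.")
```

A real tool would need richer checks and an actual UI on top, but the basic loop (edit the convenient representation, verify against the canonical one) is the same.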
A related challenge is refactors. E. g., suppose you're staring at some complicated algebraic expression with an infinite sum. It may be the case that a certain no-loss-of-generality change of variables would easily collapse that expression into a Fourier series, or make some Obscure Theorem #418152/Weird Trick #3475 trivially applicable. But unless you happen to be looking at the problem through that lens, you're not going to be able to spot it. (Especially if you don't know the Obscure Theorem #418152/Weird Trick #3475.)
It's plausible that the above two tasks are what 90% of math research consists of (the "normal-science" part of it), in terms of time expenditure: flipping between representations in search of a representation-chain where every step is trivial.
Basically: You have some abstract construct which is "anchored down" by your notes/math. For any abstract construct, there's an infinite number of valid ways to anchor it. Some of those ways are better from the practical point of view: shorter, simpler to work with. What a good note-taking tool would allow is freely varying the form of your anchors under the constraint of fully preserving the abstract construct.
The Fundamental Problem
Mind, this isn't just a problem with note-taking. This sort of surface-level messiness convergently appears in any situation where we have a system gradually learning/adapting to an unfamiliar domain. Some examples:
- A codebase that's gradually added to over the years, which ends up as a tall spaghetti tower in dire need of a refactor.
- A population of organisms being mutated by evolution, resulting in spaghetti-code DNA structures and apparently messy biological dynamics... which nevertheless turn out to be elegant and simple, if you can find the right lens to look at them from.
- A neural network being incrementally updated by gradient descent. It transforms into a massive black box... which still can, in principle, be translated into a simple symbolic-program form.
- Law systems or bureaucratic regulations, which are gradually adjusted in response to social changes and legal loopholes, until they turn into Kafkaesque nightmares.
In all cases, we start with the description of a system in some initial representation/language/ontology, gradually refine the system, and end up with something that's effectively implemented on a different, "truer" ontology. But that high-level ontology isn't visible by default; we don't get the "interpreter" for free, so what we end up seeing is an inefficient mess.
... Which, if we view it from that perspective, has depressing implications regarding any hope of building "good" note-taking software. None of the powerful processes above (programmers, evolution, interpretability researchers, legislators, company managers), all struggling with isomorphic problems, have managed to solve them. The only "solution" that ever works is to just have a competent human manually untangle the mess.
And indeed, if we think about what "lossless notebase refactors" would require, the answer is fully intelligent edits. That's not even something LLMs can really do: they would lose track of those subtle-but-crucial tidbits/doubts/thoughts I keep talking about.
So: it seems that fully competent notes-editing software is AGI-complete.
Can the Problem Be Ameliorated?
Okay, so a full solution is beyond the scope of a note-taking app. Can the situation still be improved?
Intuitively, yes. Recall that we're not actually asking for fully automatic notebase refactors; we're looking to make manual, human-guided refactors easier on the humans.
So: any ideas regarding how?
- What UI elements would be helpful and ultimately implementable with the current technology? Both regarding how the notes are displayed, and how they can be edited.
- Do you have any note-taking/research-logging strategies that lessen/fix this problem? (It would be very nice if all of the above turned out to just be a skill issue on my part.)
- Are there any lessons from the domains of programming/biology/interpretability/government reform/company management that could be directly imported to this domain?
I've separated out my own thoughts into this comment.
1. Especially pre-paradigmatic research, such as in agent foundations.
2. Source: Vague recollection of various discussions I've read, plus this brief attempt at a public-opinion review I just ran via o3.
3. See the generalized correspondence principle: the new ontology must explain every real phenomenon the previous ontology was able to explain.
   (And as far as keeping notes goes, you should also ideally preserve the explanation regarding what features of the new ontology made it look like the old ontology. "How and why does quantum physics consistently create the impression of classicality?")