LessWrong, January 27
My supervillain origin story

 

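This post describes the author's shift, during graduate school in mathematics, from chasing grand theories to valuing concrete verification. The author was once keen on bold conjectures, but neglecting to check specific cases led to a failed research project. Under the influence of a mentor and friends, the author came to appreciate the importance of checking the smallest nontrivial case (such as GL2). The author still prefers big theories, but now also values critical thinking and questioning established views. The post also reflects on the downsides of excessive skepticism, stresses the need to balance confidence and criticism in research, and recommends a post on “butterfly ideas”.

💡 The author initially pursued grand mathematical theories and longed to come up with forward-looking “gestalt” ideas, but neglected to verify concrete cases and took a long detour in research as a result.

🧐 Under the influence of a mentor and friends, the author gradually realized the importance of checking the smallest nontrivial case (such as GL2), which effectively exposes holes in a theory and pushes the research deeper.

🤔 The author reflects on the downsides of excessive skepticism: critical thinking matters, but being overly picky can mean missing valuable ideas and research directions. Keep an open mind and let ideas correct themselves in practice.

⚖️ The author stresses the need to balance confidence and criticism in research: real progress takes both the courage to propose bold ideas and the rigor to verify and revise them.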

Published on January 27, 2025 12:20 PM GMT

When I started graduate school (for math), I was very interested in big ideas. I had had a couple of experiences of general research intuitions panning out really well, and felt that the core of good research was having a brave idea, a gestalt. I went into grad school looking for the “gestalt people”: the people whose math had that mysterious, cutting-edge flavor but was not too pop (at the time the sexiest thing around was higher category theory, and I was drawn to it and tried to learn it, but I didn’t want to do the “common” thing of working in that field). I ended up choosing an advisor, and skipped over any computational or “applications-driven” stuff that he recommended working on (insofar as doing calculations in Galois theory or string theory counts as applications). It really wasn’t my thing: you had to read long technical papers, use the right lemmas, and apply them to get actual numbers (yuck) that people would then build upon in future research. I wanted to have the big ideas.

I ended up coming up with my own research project to check that an object my advisor had discovered – a certain new category associated to a Lie group – is equivalent to another category that I defined by applying higher category theory to some combinatorial data. This was exactly the kind of general thing I wanted to have: no numbers, purely theoretical, applies to a fully general class of groups: “Gestalt”. My advisor was skeptical but let me work on it. 

I ended up trying to prove a false result for a year and a half.

It wasn’t one of those “cute” false results, where all the cases you check work but there’s an unexpected edge case or counterexample. It was a straight-up false result where, if you really carefully worked out the smallest nontrivial example, you would see it’s false. The problem is that the falseness was still a little sneaky. In this flavor of representation theory the smallest interesting case, the group GL2 (i.e. the group of invertible 2×2 matrices), is already somewhat hard (there is an infamous 300-page book about it), and while you wouldn’t have to read 300 pages to find the contradiction, you would have to work on it intentionally: clearly think through the implications of the result being correct, with a view towards finding a concrete computation or check that would lead to a contradiction. Instead of this, I alternated between trying to find the “correct” proof and thinking about what nice related consequences I would get from it once it was proven.

While this was going on, I was going to a representation theory seminar organized by Pavel Etingof (among others), a professor who is a friend and mentor of mine from high school. (Outside math he is a minor celebrity in mushroom picking circles.)

[Image caption: “it's a fun guy”]

Pavel is a wonderful, warm person who has the best sense of humor of anyone I know. But he is a nightmare lecture attendee (I am known to be bad, but he is much worse). He asks a lot of questions. I once saw him interrupt a graduate student’s talk after the “background context” stage to excitedly point out “oh, and notice that there is a nice obvious consequence of this”, and then proceed to accidentally explain a stronger version of the person’s thesis in 5 minutes.

But one question he almost always asks, at any representation theory talk, is “can we check this for GL2?” He will then derail the talk to go through, on the blackboard, a computation or derivation of what the big formal construction does in this minimal interesting case. (Often once this is done the rest of the talk is moot, since in many contexts GL2 is a “representative” case: once you understand all the nontrivialities at this level, they transpose directly to all other groups.)

While at the time I was inspired by Pavel and on some level noticed the usefulness of working out concrete cases, I never thought of myself as a concrete person: I was a big idea guy. And I paid for this with my PhD research.

In the end I got lucky. The “half” of the equivalence that I knew for sure (a “functor” which just happens not to be an equivalence) was enough to prove a new result, which I wrote up in my thesis. But realizing that months of research had gone down the toilet led me on a villain’s journey of noticing the flaws in Gestalt reasoning. Is this very optimistic idea that you think will prove all of mirror symmetry really reasonable? How hard have you checked it? Did you look at GL2?

I am still at heart a big idea person. I love overconfident statements, I love thinking that “this one idea is all you need (plus a bunch of context around this idea)”. But I also love to nitpick and be skeptical. I love to notice when someone hasn’t actually gone through the work of really (i.e., with a view to disproving it, not just perfunctorily) checking whether their idea applies in the minimal interesting case.

I think this has both good and bad consequences.

The good consequence is that I think I have finally (after over a decade) made progress in internalizing the idea of “checking this for GL2”. In my own research, I try to find a minimal operationalization that’s interesting (i.e., doesn’t follow from other simpler contexts) and nontrivial, where an idea I have might break.

The bad consequence is that I sometimes overdo this when thinking about other people’s research. I do think there is such a thing as “wrongly shaped” research. You won’t get very far if a core untested assumption you made is false, or if you’re trying to make some uber-formal philosophical picture of “what transformers really do” without actually ever looking at a single paper or experiment with real transformers. But there is also research that is “usefully wrong”. I notice cases where someone with a strong intuition of something real and interesting will try to explain something, and others will object that it makes a suspicious assumption that doesn’t stand up to scrutiny, or captures an interesting but oversimplified picture that doesn’t correspond to my understanding of “what is realistically useful”, or fails in this specific case. And I sometimes (in an intuitive sense that’s hard to exactly give examples of) notice the skeptics (usually, me) being wrong here, or even being right but interrupting an interesting chain of reasoning that could be self-correcting. 

A recent and very obvious example of the latter was when Kaarel Hänni and I were discussing the results of the “leap complexity” paper. Kaarel was excited about this paper but I was skeptical: I felt like its assumptions were a bit too limiting. And then after a bit more discussion I jumped at a nitpick that I could use to clearly falsify the paper: some obvious equivariance properties imply that the general result they claim has counterexamples and can’t be correct. Since the authors of this paper are very legit, I thought at first I might be misunderstanding it; Kaarel and I worked through some examples, and at the end he agreed with me that the counterexamples were real. We were halfway through writing the authors a politely phrased “your paper is wrong and garbage” email when, just in case, we tried looking through the paper one more time to see if maybe we were misunderstanding something. We noticed that at the end of the paper the authors themselves go through the exact same counterexample argument, and explain that one of the assumptions they make early in the paper (which, to be fair, is a bit sneakily hidden) gets rid of exactly such counterexamples. The paper was (unsurprisingly) correct. Eventually I came to appreciate this paper’s depth and joined Kaarel as an enthusiastic believer in its “conceptual” message. Thinking through related ideas together led both of us to refine our research ideas in useful ways (in particular, my current ideas about “analogy circuits” owe a lot to these discussions). And there's a lesson in noticing that if I had followed my nose, I’d have dismissed this angle of inquiry and never engaged with these ideas (or, in the shoes of its authors, I would not have even embarked on this research after noticing the counterexamples to the general case).

I don’t know what the moral of this story is. I do still hold firmly to my supervillain inception moment; I believe in carefully working out simple cases and open-mindedly looking for counterexamples to the ideas you're excited about. I wouldn’t mind having the phrase “Check it for GL2” written on my tombstone, and frankly, I’m still learning that lesson. But I guess the supervillain in me has also embarked on a bit of a redemption arc. I’m starting to actively see the issues with excess skepticism, and I’m trying to learn to push through skepticism a bit more in thinking about my own research and others’. I’m still trying to balance these ideas out and to correctly calibrate, in a research context, the struggle between faith and cynicism. If I were to try to extract a meaningful “take” here I would fail or say something banal. Thankfully I don’t have to, since it has been done before, on this site, really clearly and cogently. This is Elizabeth’s post on butterfly ideas. Go read it now, while I go back to brood in my lair.


