[1] Y. Li, R. Bai, and H. Huang. Spin glass model of in-context learning. arXiv:2408.02288, 2024.
[2] M. Mézard, G. Parisi, and M. A. Virasoro. Spin Glass Theory and Beyond. World Scientific, 1986.
[3] H. Huang. Statistical Mechanics of Neural Networks. Springer, 2021.
[4] T. Brown, B. Mann, N. Ryder, et al. Language models are few-shot learners. In Advances in Neural Information Processing Systems, volume 33, pages 1877-1901, 2020.
[5] J. von Oswald, E. Niklasson, E. Randazzo, et al. Transformers learn in-context by gradient descent. In International Conference on Machine Learning (ICML), 2023.
[6] Y. M. Lu, M. I. Letey, J. A. Zavatone-Veth, et al. Asymptotic theory of in-context learning by linear attention. arXiv:2405.11751, 2024.
[7] S. Chen, H. Sheen, T. Wang, et al. Training dynamics of multi-head softmax attention for in-context learning: Emergence, convergence, and optimality. arXiv:2402.19442, 2024.