MarkTechPost@AI · January 17
Google AI Research Introduces Titans: A New Machine Learning Architecture with Attention and a Meta in-Context Memory that Learns How to Memorize at Test Time

Google researchers have introduced a new machine learning architecture called Titans, designed to address the computational challenges Transformer models face when processing long sequences. The architecture introduces a neural long-term memory module that works in concert with the attention mechanism to enable efficient training and inference. Titans comprises a core module, a long-term memory branch, and a persistent memory component, and through this tightly integrated design it can effectively handle sequences of more than 2 million tokens. Experimental results show that Titans outperforms existing models across multiple configurations, excelling in particular at long-range dependencies and "needle-in-a-haystack" tasks; its key advantages are efficient memory management, deep non-linear memory capacity, and effective memory erasure.

🧠 The core innovation of Titans is a neural long-term memory module that works alongside the attention mechanism: attention serves as short-term memory while the neural memory component acts as long-term storage, giving the model effective access to historical information.

⚙️ The architecture consists of three key parts: a Core module that uses attention with a limited window size for short-term memory and primary data processing; a Long-term Memory branch that implements the neural memory module for storing historical information; and a Persistent Memory component containing learnable, data-independent parameters.

🚀 Titans achieves efficient performance through several technical optimizations, including residual connections, SiLU activation functions, ℓ2-norm normalization, and 1D depthwise-separable convolution layers, which together improve the model's processing capacity and efficiency.

🏆 Experimental results show that the three Titans variants, MAC, MAG, and MAL, outperform hybrid models across multiple configurations, especially on long-range dependencies and "needle-in-a-haystack" tasks, demonstrating outstanding long-sequence performance.

✨ The key advantages of Titans are efficient memory management, deep non-linear memory capacity, and effective memory erasure, which allow the model to handle long sequences better and adapt to complex data patterns.

Large Language Models (LLMs) based on Transformer architectures have revolutionized sequence modeling through their remarkable in-context learning capabilities and ability to scale effectively. These models depend on attention modules that function as associative memory blocks, storing and retrieving key-value associations. However, this mechanism has a significant limitation: the computational requirements grow quadratically with the input length. This quadratic complexity in both time and memory poses substantial challenges when dealing with real-world applications such as language modeling, video understanding, and long-term time series forecasting, where the context windows can become extremely large, limiting the practical applicability of Transformers in these crucial domains.
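To make the scaling issue concrete, here is a minimal, self-contained sketch (not from the paper) of vanilla single-head attention: the (n, n) score matrix is what drives the quadratic growth in time and memory as the context length n increases.

```python
# Illustrative sketch of why vanilla self-attention scales quadratically:
# the score matrix alone has shape (n, n), so doubling the context length
# quadruples its memory footprint.
import torch

def naive_attention(q, k, v):
    # q, k, v: (n, d) for a single head; shapes are illustrative.
    scores = q @ k.T / (q.shape[-1] ** 0.5)   # (n, n) score matrix -- the quadratic term
    weights = torch.softmax(scores, dim=-1)
    return weights @ v                         # (n, d)

for n in (1024, 2048, 4096):
    d = 64
    q = k = v = torch.randn(n, d)
    _ = naive_attention(q, k, v)
    # The (n, n) float32 score matrix alone costs n * n * 4 bytes: ~4, ~16, ~64 MiB.
    print(n, f"score matrix ≈ {n * n * 4 / 2**20:.0f} MiB")
```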

Researchers have explored multiple approaches to address the computational challenges of Transformers, with three main categories emerging. First, Linear Recurrent Models have gained attention for efficient training and inference, evolving from first-generation models like RetNet and RWKV with data-independent transition matrices to second-generation architectures incorporating gating mechanisms like Griffin and RWKV6. Second, Transformer-based architectures have attempted to optimize the attention mechanism through I/O-aware implementations, sparse attention matrices, and kernel-based approaches. Third, memory-augmented models focus on persistent and contextual memory designs. However, these solutions often face limitations such as memory overflow and fixed-size constraints.

Researchers at Google have proposed a novel neural long-term memory module designed to enhance attention mechanisms by enabling access to historical context while maintaining efficient training and inference. The innovation lies in creating a complementary system where attention serves as short-term memory for precise dependency modeling within limited contexts, while the neural memory component functions as long-term storage for persistent information. This dual-memory approach forms the foundation of a new architectural family called Titans, which comes in three variants, each offering a different strategy for memory integration. The system shows particular promise in handling extremely long contexts, successfully processing sequences beyond 2 million tokens.
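The exact integration differs across the three Titans variants; the following is only a loose, hypothetical sketch of the general idea, with placeholder names (`DualMemoryBlock`, `window`, and an MLP standing in for the neural memory) chosen for illustration: windowed attention covers the recent context, a separate learned module is queried for older information, and the two streams are combined.

```python
# Hypothetical sketch of the dual-memory idea, not the paper's implementation.
import torch
import torch.nn as nn

class DualMemoryBlock(nn.Module):
    def __init__(self, dim: int, window: int, n_heads: int = 4):
        super().__init__()
        self.window = window
        self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        # Stand-in for the neural long-term memory: an MLP queried with the
        # current tokens to retrieve compressed historical information.
        self.memory = nn.Sequential(nn.Linear(dim, dim), nn.SiLU(), nn.Linear(dim, dim))
        self.mix = nn.Linear(2 * dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, dim); attend only within the most recent window.
        recent = x[:, -self.window:, :]
        short_term, _ = self.attn(recent, recent, recent)   # short-term memory
        long_term = self.memory(recent)                     # long-term retrieval
        return self.mix(torch.cat([short_term, long_term], dim=-1))

x = torch.randn(2, 256, 64)
out = DualMemoryBlock(dim=64, window=128)(x)
print(out.shape)  # torch.Size([2, 128, 64])
```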

The Titans architecture introduces a complex three-part design to integrate memory capabilities effectively. The system consists of three distinct hyper-heads: a Core module utilizing attention with limited window size for short-term memory and primary data processing, a Long-term Memory branch implementing the neural memory module for storing historical information, and a Persistent Memory component containing learnable, data-independent parameters. The architecture is implemented with several technical optimizations, including residual connections, SiLU activation functions, and ℓ2-norm normalization for queries and keys. Moreover, it uses 1D depthwise-separable convolution layers after query, key, and value projections, along with normalization and gating mechanisms.
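A hedged sketch of the projection path described above, assuming illustrative dimensions and kernel size rather than the paper's exact hyperparameters: query, key, and value projections followed by a causal 1D depthwise-separable convolution, SiLU activation, and ℓ2 normalization of queries and keys.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ProjectionPath(nn.Module):
    """Q/K/V projections -> causal depthwise-separable 1D conv -> SiLU,
    with ℓ2 normalization applied to queries and keys (illustrative sizes)."""
    def __init__(self, dim: int, kernel_size: int = 4):
        super().__init__()
        self.q_proj, self.k_proj, self.v_proj = (nn.Linear(dim, dim) for _ in range(3))
        # Depthwise (groups == channels) followed by pointwise 1x1 convolution.
        self.depthwise = nn.Conv1d(dim, dim, kernel_size, groups=dim,
                                   padding=kernel_size - 1)
        self.pointwise = nn.Conv1d(dim, dim, kernel_size=1)

    def _conv(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, dim); convolve over the sequence axis, trim causal padding.
        y = self.depthwise(x.transpose(1, 2))[..., : x.shape[1]]
        return F.silu(self.pointwise(y)).transpose(1, 2)

    def forward(self, x: torch.Tensor):
        q = F.normalize(self._conv(self.q_proj(x)), dim=-1)  # ℓ2-normalized queries
        k = F.normalize(self._conv(self.k_proj(x)), dim=-1)  # ℓ2-normalized keys
        v = self._conv(self.v_proj(x))
        return q, k, v

q, k, v = ProjectionPath(dim=64)(torch.randn(2, 100, 64))
print(q.shape, k.shape, v.shape)  # each (2, 100, 64)
```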

The experimental results demonstrate Titans’ superior performance across multiple configurations. All three variants – MAC, MAG, and MAL – outperform hybrid models like Samba and Gated DeltaNet-H2, with the neural memory module proving to be the key differentiator. Among the variants, MAC and MAG show strong performance, especially in handling longer dependencies, surpassing the MAL-style combinations commonly used in existing hybrid models. In needle-in-a-haystack (NIAH) tasks, Titans outperforms baselines across sequences ranging from 2K to 16K tokens. This superior performance stems from three key advantages: efficient memory management, deep non-linear memory capabilities, and effective memory erasure functionality.

In conclusion, researchers from Google Research introduced a groundbreaking neural long-term memory system that functions as a meta-in-context learner, capable of adaptive memorization during test time. This recurrent model is more effective in identifying and storing surprising patterns in the data stream, offering more complex memory management than traditional methods. The system has proven its superiority in handling extensive contexts through the implementation of three distinct variants in the Titans architecture family. The ability to effectively process sequences exceeding 2 million tokens while maintaining superior accuracy marks a significant advancement in the sequence modeling field and opens new possibilities for handling increasingly complex tasks.
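The "learning to memorize at test time" idea can be pictured as an online write into a small memory network driven by the gradient of an associative loss: the more surprising an input (the larger the gradient), the larger the update. The sketch below is a deliberately simplified illustration with placeholder dimensions and a single plain gradient step; it omits the momentum and forgetting/decay terms that a full update rule would include, and is not the authors' implementation.

```python
# Conceptual sketch of test-time memorization via an associative loss.
import torch
import torch.nn as nn

memory = nn.Sequential(nn.Linear(64, 64), nn.SiLU(), nn.Linear(64, 64))
lr = 1e-2  # illustrative step size

def memorize(key: torch.Tensor, value: torch.Tensor) -> float:
    """One online write: take a gradient step on ||memory(key) - value||^2."""
    loss = (memory(key) - value).pow(2).mean()   # surprise ~ gradient magnitude
    grads = torch.autograd.grad(loss, list(memory.parameters()))
    with torch.no_grad():
        for p, g in zip(memory.parameters(), grads):
            p -= lr * g
    return loss.item()

# Streaming (key, value) pairs at "test time": each is written in turn.
for step in range(3):
    k, v = torch.randn(1, 64), torch.randn(1, 64)
    print(f"step {step}: surprise (loss) = {memorize(k, v):.4f}")
```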




Related tags

Transformer, long-term memory, neural networks, machine learning, Titans architecture