Training-Free ANN-to-SNN Conversion for High-Performance Spiking Transformer

cs.AI updates on arXiv.org, two days ago, 12:39

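The paper proposes a high-performance, training-free ANN-to-SNN conversion framework for Transformer architectures. By introducing the MBE neuron, it approximates nonlinear operations and reduces conversion error and latency, offering a path to efficient deployment of Spiking Transformers in real-world applications.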

arXiv:2508.07710v1 Announce Type: cross

Abstract: Leveraging the event-driven paradigm, Spiking Neural Networks (SNNs) offer a promising approach for constructing energy-efficient Transformer architectures. Compared to directly trained Spiking Transformers, ANN-to-SNN conversion methods bypass the high training costs. However, existing methods still suffer from notable limitations, failing to effectively handle nonlinear operations in Transformer architectures and requiring additional fine-tuning processes for pre-trained ANNs. To address these issues, we propose a high-performance and training-free ANN-to-SNN conversion framework tailored for Transformer architectures. Specifically, we introduce a Multi-basis Exponential Decay (MBE) neuron, which employs an exponential decay strategy and multi-basis encoding method to efficiently approximate various nonlinear operations. It removes the requirement for weight modifications in pre-trained ANNs. Extensive experiments across diverse tasks (CV, NLU, NLG) and mainstream Transformer architectures (ViT, RoBERTa, GPT-2) demonstrate that our method achieves near-lossless conversion accuracy with significantly lower latency. This provides a promising pathway for the efficient and scalable deployment of Spiking Transformers in real-world applications.
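
The abstract does not specify how the MBE neuron works, so the following is only a rough illustration of why an exponentially decaying threshold could lower latency compared with plain rate coding. It is not the paper's method. The function names `encode_exp_decay` and `encode_rate`, the halving factor, and the 8-step window are all assumptions chosen for the sketch.

```python
def encode_exp_decay(x, theta0=1.0, steps=8, decay=0.5):
    """Encode x in [0, theta0) as spikes against a decaying threshold.

    Hypothetical illustration, NOT the paper's MBE neuron. With decay=0.5
    the residual after `steps` steps is below theta0 * 0.5**steps.
    """
    v = x                      # membrane potential, charged once with the input
    threshold = theta0
    spikes, estimate = [], 0.0
    for _ in range(steps):
        threshold *= decay     # threshold shrinks exponentially each step
        if v >= threshold:
            spikes.append(1)
            v -= threshold     # soft reset: subtract the threshold that fired
            estimate += threshold
        else:
            spikes.append(0)
    return spikes, estimate


def encode_rate(x, theta=1.0, steps=8):
    """Rate-code x with an integrate-and-fire neuron and soft reset."""
    v, count = 0.0, 0
    for _ in range(steps):
        v += x                 # constant input current at every step
        if v >= theta:
            count += 1
            v -= theta
    return count, count * theta / steps


if __name__ == "__main__":
    x = 0.3
    spikes, est_exp = encode_exp_decay(x)
    count, est_rate = encode_rate(x)
    print("exp-decay spikes:", spikes, "error:", abs(x - est_exp))
    print("rate-coded count:", count, "error:", abs(x - est_rate))
```

Under these assumptions, the decaying-threshold estimate tightens by a factor of two per time step, whereas the rate-coded estimate only tightens in proportion to 1/steps. That gap is one plausible reading of how an exponential decay strategy could reduce the number of time steps needed for a given accuracy.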

Related tags

ANN-to-SNN conversion, Transformer architecture, Spiking Neural Networks
