Dynamic Long Short-Term Memory Based Memory Storage For Long Horizon LLM Interaction

cs.AI updates on arXiv.org 07月08日 13:54

Dynamic Long Short-Term Memory Based Memory Storage For Long Horizon LLM Interaction

本文提出一种名为Pref-LSTM的动态轻量级框架，结合BERT分类器和LSTM记忆模块，用于LLM个性化存储，以实现长对话的个性化。研究显示，BERT分类器在识别用户偏好方面表现可靠，表明偏好过滤与LSTM门控原理结合的可行性。

arXiv:2507.03042v1 Announce Type: cross Abstract: Memory storage for Large Language models (LLMs) is becoming an increasingly active area of research, particularly for enabling personalization across long conversations. We propose Pref-LSTM, a dynamic and lightweight framework that combines a BERT-based classifier with a LSTM memory module that generates memory embedding which then is soft-prompt injected into a frozen LLM. We synthetically curate a dataset of preference and non-preference conversation turns to train our BERT-based classifier. Although our LSTM-based memory encoder did not yield strong results, we find that the BERT-based classifier performs reliably in identifying explicit and implicit user preferences. Our research demonstrates the viability of using preference filtering with LSTM gating principals as an efficient path towards scalable user preference modeling, without extensive overhead and fine-tuning.

Fish AI Reader

AI辅助创作，多种专业模板，深度分析，高质量内容生成。从观点提取到深度思考，FishAI为您提供全方位的创作支持。新版本引入自定义参数，让您的创作更加个性化和精准。

FishAI

鱼阅，AI 时代的下一个智能信息助手，助你摆脱信息焦虑

联系邮箱 441953276@qq.com

相关标签

LLM 个性化存储 BERT LSTM 偏好识别

相关文章

Import AI 368: 500% faster local LLMs; 38X more efficient red teaming; AI21’s Frankenmodel

xLSTM: Enhancing Long Short-Term Memory LSTM Capabilities for Advanced Language Modeling and Beyond

Learn AI Together — Towards AI Community Newsletter #23

This AI newsletter is all you need #98

Patterns and Middleware for LLM Applications with Kyle Roche - #659

Building LLM-Based Applications with Azure OpenAI with Jay Emery - #657

Mental Models for Advanced ChatGPT Prompting with Riley Goodside - #652

Multimodal, Multi-Lingual NLP at Hugging Face with John Bohannon and Douwe Kiela - #589

Deep Learning for NLP: From the Trenches with Charlene Chambliss - #433

The Unreasonable Effectiveness of the Forget Gate with Jos Van Der Westhuizen - TWiML Talk #240