MarkTechPost@AI, March 16
SYMBOLIC-MOE: Mixture-of-Experts MoE Framework for Adaptive Instance-Level Mixing of Pre-Trained LLM Experts

SYMBOLIC-MOE, proposed by researchers at UNC Chapel Hill, is a symbolic, text-based, and gradient-free Mixture-of-Experts framework designed for adaptive instance-level mixing of pre-trained LLM experts. The framework emphasizes fine-grained specialized skills, such as algebra within mathematics or molecular biology within biomedical reasoning. SYMBOLIC-MOE introduces a skill-based recruiting strategy that dynamically selects the most relevant expert LLMs for each specific reasoning task based on their demonstrated strengths. Experiments show that SYMBOLIC-MOE outperforms strong LLMs such as GPT4o-mini as well as multi-agent approaches across a range of benchmarks, with an absolute average improvement of 8.15% over the best multi-agent baseline.

🧠 The SYMBOLIC-MOE framework consists of three stages: model profile creation and aggregator selection, expert recruitment, and final answer generation, with expert recruitment and answer generation carried out during inference.

💡 SYMBOLIC-MOE introduces an innovative batching strategy to maximize throughput and efficiency. The strategy first analyzes all instances to determine which LLMs are needed, then intelligently groups problem instances by their required experts, allowing each active expert model to receive all of its relevant instances in a single batch and ensuring each expert is loaded only once.

🏆 SYMBOLIC-MOE performs strongly across diverse benchmarks, consistently outperforming all baseline methods, including single-model strategies, multi-agent debate with a single model, and multi-model multi-agent frameworks such as MoA and ReConcile. It surpasses the strongest multi-agent baseline (Self-MoA) by an absolute average improvement of 8.15%.

Like humans, large language models (LLMs) often have differing skills and strengths derived from differences in their architectures and training regimens. However, they struggle to combine specialized expertise across different domains, limiting their problem-solving capabilities compared to humans. Specialized models like MetaMath, WizardMath, and QwenMath excel at mathematical reasoning but often underperform on tasks requiring common sense or medical knowledge. Even within specific domains such as mathematics, models show nuanced variations in capability, e.g., one might excel at algebra while another masters geometry. This creates a need for frameworks that can identify and select the most appropriate expert models for specific problems.

Existing approaches like Mixture-of-Experts (MoE) models distribute computation across multiple specialized components, with recent emphasis on sparse approaches that activate only the most relevant experts per input. The Sparse MoE (SMoE) method has improved efficiency across vision, language, and multimodal tasks, but it requires combining models in the parameter space through joint training. More recent frameworks like MoA (Mixture-of-Agents) attempt to address this by combining LLM outputs symbolically. Further, multi-agent reasoning approaches have emerged as alternatives, such as the student-teacher technique that distills reasoning capabilities from stronger to weaker agents, while debate frameworks allow multiple agents to refine their arguments collectively.

Researchers from UNC Chapel Hill have proposed SYMBOLIC-MOE, a symbolic, text-based, and gradient-free Mixture-of-Experts framework that enables adaptive instance-level mixing of pre-trained LLM experts. It takes a fine-grained perspective by emphasizing specialized skills within broader domains, such as algebra within mathematics or molecular biology within biomedical reasoning. The researchers also introduce a skill-based recruiting strategy that dynamically selects the most relevant expert LLMs for each specific reasoning task based on their demonstrated strengths. SYMBOLIC-MOE outperforms strong LLMs like GPT4o-mini, as well as multi-agent approaches, with an absolute average improvement of 8.15% over the best multi-agent baseline.
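To make the skill-based recruiting idea concrete, here is a minimal Python sketch under assumed inputs: the model names, skill labels, profile scores, and the keyword-based `infer_skills` heuristic are hypothetical stand-ins, since in SYMBOLIC-MOE the profiles are derived from a small validation set and skills are inferred by an LLM rather than by keyword matching.

```python
# A minimal, illustrative sketch of skill-based expert recruitment.
# Model names, skill labels, and profile scores below are hypothetical.

# Hypothetical model profiles: per-skill strength scores for each expert.
MODEL_PROFILES = {
    "QwenMath-7B":   {"algebra": 0.92, "geometry": 0.71, "molecular_biology": 0.35},
    "WizardMath-7B": {"algebra": 0.85, "geometry": 0.80, "molecular_biology": 0.30},
    "BioMed-7B":     {"algebra": 0.40, "geometry": 0.38, "molecular_biology": 0.88},
}

def infer_skills(question: str) -> list[str]:
    """Toy skill annotator: tag a question with the skills it appears to need."""
    keywords = {
        "algebra": ["equation", "polynomial", "solve for"],
        "geometry": ["triangle", "angle", "circle"],
        "molecular_biology": ["protein", "dna", "enzyme"],
    }
    q = question.lower()
    return [skill for skill, words in keywords.items() if any(w in q for w in words)]

def recruit_experts(question: str, k: int = 3) -> list[str]:
    """Score each expert by its profiled strength on the required skills
    and recruit the top-k experts for this specific instance."""
    skills = infer_skills(question) or list(next(iter(MODEL_PROFILES.values())))
    scores = {
        model: sum(profile.get(s, 0.0) for s in skills)
        for model, profile in MODEL_PROFILES.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Example: an algebra question recruits the math-leaning experts first.
print(recruit_experts("Solve for x in the equation x^2 - 5x + 6 = 0", k=2))
```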

SYMBOLIC-MOE consists of three stages: model profile creation and aggregator selection, followed by expert recruitment and final answer generation, the latter two of which take place during inference. To maximize throughput and efficiency, SYMBOLIC-MOE introduces an innovative batching strategy in which all instances are first analyzed to determine which LLMs will be needed. The system then intelligently groups problem instances based on their required experts, allowing each active expert model to receive all relevant instances in a single batch and ensuring each expert is loaded only once. This enables efficient batched inference on a single GPU while supporting a diverse pool of 16 LLMs, with the flexibility to add more GPUs for further parallelization.
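The batching idea in the paragraph above can be sketched as grouping instances by their recruited experts so that each expert is loaded once and processes all of its instances together. In the sketch below, `recruit_fn`, `load_model`, and `generate_batch` are hypothetical hooks for the recruiter and serving backend; this is an illustrative sketch, not the authors' implementation.

```python
from collections import defaultdict

def build_expert_batches(instances, recruit_fn, k=3):
    """Group problem instances by the experts recruited for them, so that each
    expert model sees all of its relevant instances in a single batch."""
    batches = defaultdict(list)              # expert name -> list of (index, question)
    for idx, question in enumerate(instances):
        for expert in recruit_fn(question, k):
            batches[expert].append((idx, question))
    return batches

def run_experts(instances, recruit_fn, load_model, generate_batch, k=3):
    """Load each recruited expert exactly once, run its whole batch, then release it.
    `load_model` and `generate_batch` are hypothetical serving-backend hooks."""
    outputs = defaultdict(dict)              # instance index -> {expert: answer}
    for expert, items in build_expert_batches(instances, recruit_fn, k).items():
        model = load_model(expert)           # single load per active expert
        answers = generate_batch(model, [q for _, q in items])
        for (idx, _), answer in zip(items, answers):
            outputs[idx][expert] = answer
        del model                            # release before loading the next expert
    return outputs                           # per-instance expert answers for the aggregator
```

Grouping this way keeps only one expert resident on the GPU at a time, which is what allows a 16-model pool to be served from a single GPU.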

SYMBOLIC-MOE shows exceptional performance across diverse benchmarks. It consistently outperforms all baseline approaches, surpassing single-model strategies, multi-agent debate with a single model, and multi-model multi-agent frameworks like MoA and ReConcile. It exceeds the strongest multi-agent baseline (Self-MoA) by an absolute average of 8.15%, with gains of 8.28% on MMLU-Pro, 13.45% on AIME, 4.92% on GPQA, and 6.08% on MedMCQA. Using four 7-8B-parameter models, SYMBOLIC-MOE achieves comparable or superior performance to larger 70B-parameter models, outperforming Llama3.3 70B on AIME and GPQA while matching its performance on MedMCQA. Efficiency testing reveals that it runs 44% faster on a single GPU than MoA while achieving better accuracy.

In conclusion, the researchers introduced SYMBOLIC-MOE, a scalable MoE framework that combines models through their symbolic output. The method identifies the skills needed for a given problem and recruits agents based on those skills to engage in a discussion about the input. SYMBOLIC-MOE outperforms standard inference-time scaling methods as well as other debate and mixture-of-agents frameworks, leading to strong performance across domains without human intervention. Its average performance across heterogeneous tasks is in fact stronger than that of advanced proprietary models such as GPT4o-mini. However, the method has limitations: (a) it involves running multiple models, which increases inference cost, and (b) it relies on skills inferred from a small validation set to build the agent profiles.


Check out the Paper and GitHub Page. All credit for this research goes to the researchers of this project.

