Trending
Articles related to "Layer Parallelism"
Layer Parallelism: Enhancing LLM Inference Efficiency Through Parallel Execution of Transformer Layers
MarkTechPost@AI 2025-02-14T18:12:51.000000Z