Articles tagged "LLM部署" (LLM Deployment)
How to run Qwen 2.5 on AWS AI chips using Hugging Face libraries
AWS Machine Learning Blog, 2025-03-13

SGLang: An Open-Source Inference Engine Transforming LLM Deployment through CPU Scheduling, Cache-Aware Load Balancing, and Rapid Structured Output Generation
MarkTechPost@AI, 2025-02-21

Slim-Llama: An Energy-Efficient LLM ASIC Processor Supporting 3-Billion Parameters at Just 4.69mW
MarkTechPost@AI, 2024-12-21

Introducing Fast Model Loader in SageMaker Inference: Accelerate autoscaling for your Large Language Models (LLMs) – part 1
AWS Machine Learning Blog, 2024-12-03