How to run an LLM on your laptop

This article explores the rise and appeal of running large language models (LLMs) locally, especially at a moment when concerns about privacy, data control, and technological autonomy loom large. Where running these models once demanded expensive hardware, advances in the technology mean an ordinary laptop or even a smartphone can now handle them, dramatically lowering the barrier to entry. Local models offer users a way out from under the control of the big AI companies, protect personal data privacy, and allow a deeper understanding of and control over AI behavior. Although local models may not match the performance of top-tier online services, they help users build an intuition for AI's limitations while offering great flexibility and fun.

💻 **Lower technical barriers, broader access**: Running large language models once required expensive GPU servers, but thanks to advances in model optimization and compression, ordinary laptops and even smartphones can now run some of them. Tools such as Ollama and LM Studio have further simplified downloading and using local models, so they are no longer the exclusive domain of technical experts; ordinary users can try them with ease.

🔒 **Privacy protection and data sovereignty**: Unlike online LLM services (such as ChatGPT and Gemini), which may use conversations to train their models, local models run entirely in the user's own environment, so sensitive conversations never need to be uploaded to the cloud. For users who care about data security and do not want to be tracked by big tech companies, this is a significant advantage.

💡 **Decentralizing AI power and autonomous control**: Running LLMs locally is one way to break the hold a handful of tech giants have over AI, helping to distribute that power more widely. Users can independently choose, configure, and adjust their models, free from third-party platforms' frequent updates and potential behavioral changes, gaining greater autonomy and a more predictable experience.

🧠 **A hands-on lesson in AI's limitations**: Although local models may not match the performance of large online models, their shortcomings (such as hallucinations) help users better understand how AI works and where its risks lie. By interacting with these models, users can build an intuitive sense of what AI can and cannot do, and so use more powerful online AI tools more judiciously.

🚀 **The future and fun of local LLMs**: The rise of local LLMs is not only a mark of technological progress but also a source of exploration and enjoyment. Even a small model running on a phone, limited as its performance is, offers a uniquely playable interactive experience that satisfies curiosity and the urge to tinker.

MIT Technology Review’s How To series helps you get things done. 

Simon Willison has a plan for the end of the world. It’s a USB stick, onto which he has loaded a couple of his favorite open-weight LLMs—models that have been shared publicly by their creators and that can, in principle, be downloaded and run with local hardware. If human civilization should ever collapse, Willison plans to use all the knowledge encoded in their billions of parameters for help. “It’s like having a weird, condensed, faulty version of Wikipedia, so I can help reboot society with the help of my little USB stick,” he says.

But you don’t need to be planning for the end of the world to want to run an LLM on your own device. Willison, who writes a popular blog about local LLMs and software development, has plenty of compatriots: r/LocalLLaMA, a subreddit devoted to running LLMs on your own hardware, has half a million members.

For people who are concerned about privacy, want to break free from the control of the big LLM companies, or just enjoy tinkering, local models offer a compelling alternative to ChatGPT and its web-based peers.

The local LLM world used to have a high barrier to entry: In the early days, it was impossible to run anything useful without investing in pricey GPUs. But researchers have had so much success in shrinking down and speeding up models that anyone with a laptop, or even a smartphone, can now get in on the action. “A couple of years ago, I’d have said personal computers are not powerful enough to run the good models. You need a $50,000 server rack to run them,” Willison says. “And I kept on being proved wrong time and time again.”

Why you might want to download your own LLM

Getting into local models takes a bit more effort than, say, navigating to ChatGPT’s online interface. But the very accessibility of a tool like ChatGPT comes with a cost. “It’s the classic adage: If something’s free, you’re the product,” says Elizabeth Seger, the director of digital policy at Demos, a London-based think tank. 

OpenAI, which offers both paid and free tiers, trains its models on users’ chats by default. It’s not too difficult to opt out of this training, and it also used to be possible to remove your chat data from OpenAI’s systems entirely, until a recent legal decision in the New York Times’ ongoing lawsuit against OpenAI required the company to maintain all user conversations with ChatGPT.

Google, which has access to a wealth of data about its users, also trains its models on both free and paid users’ interactions with Gemini, and the only way to opt out of that training is to set your chat history to delete automatically—which means that you also lose access to your previous conversations. In general, Anthropic does not train its models using user conversations, but it will train on conversations that have been “flagged for Trust & Safety review.” 

Training may present particular privacy risks because of the ways that models internalize, and often recapitulate, their training data. Many people trust LLMs with deeply personal conversations—but if models are trained on that data, those conversations might not be nearly as private as users think, according to some experts.

“Some of your personal stories may be cooked into some of the models, and eventually be spit out in bits and bytes somewhere to other people,” says Giada Pistilli, principal ethicist at the company Hugging Face, which runs a huge library of freely downloadable LLMs and other AI resources.

For Pistilli, opting for local models as opposed to online chatbots has implications beyond privacy. “Technology means power,” she says. “And so who[ever] owns the technology also owns the power.” States, organizations, and even individuals might be motivated to disrupt the concentration of AI power in the hands of just a few companies by running their own local models.

Breaking away from the big AI companies also means having more control over your LLM experience. Online LLMs are constantly shifting under users’ feet: Back in April, ChatGPT suddenly started sucking up to users far more than it had previously, and just last week Grok started calling itself MechaHitler on X.

Providers tweak their models with little warning, and while those tweaks might sometimes improve model performance, they can also cause undesirable behaviors. Local LLMs may have their quirks, but at least they are consistent. The only person who can change your local model is you.

Of course, any model that can fit on a personal computer is going to be less powerful than the premier online offerings from the major AI companies. But there’s a benefit to working with weaker models—they can inoculate you against the more pernicious limitations of their larger peers. Small models may, for example, hallucinate more frequently and more obviously than Claude, GPT, and Gemini, and seeing those hallucinations can help you build up an awareness of how and when the larger models might also lie.

“Running local models is actually a really good exercise for developing that broader intuition for what these things can do,” Willison says.

How to get started

Local LLMs aren’t just for proficient coders. If you’re comfortable using your computer’s command-line interface, which allows you to browse files and run apps using text prompts, Ollama is a great option. Once you’ve installed the software, you can download and run any of the hundreds of models it offers with a single command.
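A typical first session looks something like this (the model names here are examples from Ollama’s public library and may change over time; any model Ollama hosts works the same way):

```
ollama run llama3.2     # downloads the model on first use, then opens a chat
ollama pull qwen3:8b    # fetch a model without starting a chat
ollama list             # show the models already on your machine
```

Once `ollama run` opens its chat prompt, you just type; entering `/bye` ends the session.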

If you don’t want to touch anything that even looks like code, you might opt for LM Studio, a user-friendly app that takes a lot of the guesswork out of running local LLMs. You can browse models from Hugging Face from right within the app, which provides plenty of information to help you make the right choice. Some popular and widely used models are tagged as “Staff Picks,” and every model is labeled according to whether it can be run entirely on your machine’s speedy GPU, needs to be shared between your GPU and slower CPU, or is too big to fit onto your device at all. Once you’ve chosen a model, you can download it, load it up, and start interacting with it using the app’s chat interface.

As you experiment with different models, you’ll start to get a feel for what your machine can handle. According to Willison, every billion model parameters require about one GB of RAM to run, and I found that approximation to be accurate: My own 16 GB laptop managed to run Alibaba’s Qwen3 14B as long as I quit almost every other app. If you run into issues with speed or usability, you can always go smaller—I got reasonable responses from Qwen3 8B as well.
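If you want to sanity-check a model against your hardware before downloading it, Willison’s rule of thumb is easy to turn into a back-of-the-envelope calculation. Here is a minimal sketch in Python; the function and the idea of treating the figure as a floor are my own framing of the heuristic, not anything from Willison or the model vendors:

```python
def estimated_ram_gb(params_billions: float, gb_per_billion: float = 1.0) -> float:
    """Rough memory estimate: ~1 GB of RAM per billion parameters.

    Actual usage varies with quantization and context length, so treat
    the result as a rough floor rather than a guarantee."""
    return params_billions * gb_per_billion

# Example: Alibaba's Qwen3 14B on a 16 GB laptop
print(estimated_ram_gb(14))  # ~14.0 GB: a tight fit, so close other apps
```

The arithmetic explains why a 14B model leaves so little headroom on a 16 GB machine, and why dropping to an 8B model frees things up.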

And if you go really small, you can even run models on your cell phone. My beat-up iPhone 12 was able to run Meta’s Llama 3.2 1B using an app called LLM Farm. It’s not a particularly good model—it very quickly goes off into bizarre tangents and hallucinates constantly—but trying to coax something so chaotic toward usability can be entertaining. If I’m ever on a plane sans Wi-Fi and desperate for a probably false answer to a trivia question, I now know where to look.

Some of the models that I was able to run on my laptop were effective enough that I can imagine using them in my journalistic work. And while I don’t think I’ll depend on phone-based models for anything anytime soon, I really did enjoy playing around with them. “I think most people probably don’t need to do this, and that’s fine,” Willison says. “But for the people who want to do this, it’s so much fun.”
