Society's Backend · January 11
Scaling Laws for LLMs, the Actual Cost of Frontier Models, 3 Key Principles for AI at Scale, and More

This week brought a wealth of AI news, covering scaling laws for large language models (LLMs), predictions about artificial general intelligence (AGI), the training efficiency of the DeepSeek V3 model, AI applications in chip design, and AI regulation. This issue also shares free AI course resources, reviews the major AI developments of 2024, and asks whether AI counts as a science. It closes with discussions of AI agents and related research papers, giving readers a comprehensive AI update.

⚖️ The Texas Responsible AI Governance Act (TRAIGA) aims to regulate AI through strict compliance requirements, combating algorithmic discrimination and ensuring ethical deployment, but the act is seen as a complex and burdensome regulatory framework that could lead to excessive censorship of AI development.

🚀 AlphaChip, an AI developed by Google, uses reinforcement learning to dramatically speed up computer chip design by optimizing component placement with little reliance on human experts; a graph neural network captures the complex relationships between chip components, yielding better layouts and faster-converging design iterations.

💡 Research on scaling laws for large language models (LLMs) shows that larger models and more data improve performance, but researchers have begun questioning whether continued scaling remains effective amid reports of a plateau in progress. Even so, scaling laws remain a key framework for predicting and improving LLM performance.

🤖 DeepSeek AI released its DeepSeek-V3 model on December 26, 2024, demonstrating impressive training efficiency. Its development involved significant technical innovations that let it perform well on limited GPU resources, showing that effective engineering and strategy can advance AI at far lower cost than previously expected.

📚 Many free courses are available in 2025 for learning about AI agents, covering topics from multi-agent systems and prompt engineering to building agents and managing workflows, aimed at giving both beginners and experienced professionals the knowledge they need to succeed in the evolving AI landscape.

Happy Weekend! Here's a comprehensive AI reading list from this past week. Thanks to all the incredible authors for creating these helpful articles and learning resources.

I put one of these together each week. If reading about AI updates and topics is something you enjoy, make sure to subscribe.

Society's Backend is reader supported. You can support my work (these reading lists and standalone articles) for 80% off for the first year (just $1/mo). You'll also get the extended reading list each week.

A huge thanks to all supporters.

Get 80% off for 1 year

Next Week on Society’s Backend

This next week, I’ll be sharing two articles:

I also hope to write guest posts for others in the coming months. I’ll make sure to keep you all updated when I do.

What Happened Last Week

OLMo 2, Phi-4, and CES (a lot of Nvidia announcements!) all happened this past week. If you want the full update and why it's important, here are some resources I think are worth checking out:

You can also catch last week’s reading list here:

Reading List

Scaling Laws for LLMs: From GPT-3 to o3

By

Recent advancements in large language models (LLMs) are largely driven by scaling, with larger models and more data leading to better performance. However, researchers are now questioning the effectiveness of continued scaling due to reports of a plateau in progress. Despite these concerns, scaling laws remain a crucial framework for predicting and improving LLM performance.
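
As a quick illustration of what a scaling law looks like in practice, here's a minimal sketch using the parametric loss form popularized by the Chinchilla paper. The coefficient values are the published Chinchilla fits and are purely illustrative, not tied to any model discussed in the article:

```python
import numpy as np

# Chinchilla-style parametric loss: L(N, D) = E + A / N**alpha + B / D**beta.
# Coefficients are the published Chinchilla fits (Hoffmann et al., 2022),
# used here only for illustration.
E, A, B = 1.69, 406.4, 410.7
alpha, beta = 0.34, 0.28

def loss(n_params, n_tokens):
    """Predicted pretraining loss for n_params parameters and n_tokens tokens."""
    return E + A / n_params**alpha + B / n_tokens**beta

C = 1e24                              # fixed compute budget in FLOPs
sizes = np.logspace(9, 12, 200)       # candidate model sizes: 1B .. 1T params
tokens = C / (6 * sizes)              # tokens affordable under C ~= 6 * N * D
losses = [loss(n, d) for n, d in zip(sizes, tokens)]
best = sizes[int(np.argmin(losses))]
print(f"compute-optimal model size at C={C:.0e}: ~{best / 1e9:.0f}B params")
```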

Source

Prophecies of the Flood

By

AI researchers are increasingly predicting the imminent arrival of supersmart AI systems, known as Artificial General Intelligence (AGI), which could dramatically change society. Recent benchmarks show that new AI models, like OpenAI's o3, are outperforming human experts in challenging tasks, indicating rapid advancements. However, there are concerns about our preparedness for these technologies and the need for broader discussions on their ethical use and societal impact.

Source

DeepSeek V3 and the actual cost of training frontier AI models

By

DeepSeek AI released its DeepSeek-V3 model on December 26, 2024, showcasing impressive training efficiency compared to peers. The model's development involved significant technical innovations, allowing it to perform well on limited GPU resources. Despite skepticism about its novelty, DeepSeek's success demonstrates that effective engineering and strategy can lead to significant advancements in AI at lower costs than previously expected.
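
To make the "actual cost" framing concrete, here's a rough back-of-envelope estimate. The parameter, token, and price figures below are DeepSeek's publicly reported numbers; the hardware peak and utilization are my assumptions. Treat the output as an order-of-magnitude sanity check, not real accounting:

```python
# Publicly reported DeepSeek-V3 figures (treated as assumptions here):
# 37B activated parameters per token (MoE), 14.8T training tokens,
# roughly $2 per H800 GPU-hour.
activated_params = 37e9
tokens = 14.8e12
total_flops = 6 * activated_params * tokens   # standard fwd+bwd approximation

h800_peak_flops = 989e12    # assumed BF16 dense peak per GPU
mfu = 0.35                  # assumed model-FLOPs utilization
gpu_hours = total_flops / (h800_peak_flops * mfu) / 3600
cost = gpu_hours * 2.0
print(f"~{gpu_hours / 1e6:.1f}M GPU-hours, ~${cost / 1e6:.1f}M")
# Lands in the same ballpark as the ~2.8M H800 GPU-hours DeepSeek reported.
```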

Source

The Engine

By

Gulliver encounters a machine called the "Engine" in Jonathan Swift's *Gulliver's Travels*, which generates sentences using pieces of wood covered with words. This machine serves as a satirical commentary on the potential of mechanized writing, likening it to modern language models that generate text through probabilistic techniques. Swift critiques the idea of reducing creativity to mere mechanics while hinting at the absurdity of technology replacing human effort in writing.
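
For readers curious how little machinery the probabilistic idea requires, here's a toy word-recombination sketch in the Engine's spirit: a bigram chain over a made-up corpus. It is nothing like a modern LLM, just the minimal version of "pick the next word by what tends to follow":

```python
import random

# Toy corpus, made up for illustration.
corpus = ("the engine turns words into sentences and the sentences "
          "turn into books and the books turn into knowledge").split()

# Build a bigram table: word -> list of words observed to follow it.
follows = {}
for prev, nxt in zip(corpus, corpus[1:]):
    follows.setdefault(prev, []).append(nxt)

word, output = "the", ["the"]
for _ in range(10):
    # Sample a successor; fall back to the whole corpus at dead ends.
    word = random.choice(follows.get(word, corpus))
    output.append(word)
print(" ".join(output))
```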

Source

3 Key Principles for AI at Scale [Part 2]

The three key principles for effective machine learning at scale are scalability, efficiency, and velocity. Scalability focuses on having the necessary infrastructure to train multiple models, while efficiency emphasizes maximizing resource use and tracking metrics for improvement. Velocity aims to optimize the experimentation process, allowing companies to conduct experiments faster and stay competitive in AI development.
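
One concrete way to "track metrics for improvement" on the efficiency front is model FLOPs utilization (MFU): achieved training FLOPs as a fraction of hardware peak. Below is a minimal sketch; the model size, throughput, GPU count, and peak FLOPs are all hypothetical placeholders:

```python
def mfu(tokens_per_second, params, peak_flops_per_gpu, num_gpus):
    """Model FLOPs utilization: achieved training FLOPs over hardware peak,
    using the common 6 * params FLOPs-per-token approximation."""
    achieved = 6 * params * tokens_per_second
    return achieved / (peak_flops_per_gpu * num_gpus)

# Hypothetical run: a 7B dense model at 600k tokens/s across 64 GPUs
# rated at 989 TFLOPs each.
print(f"MFU: {mfu(6e5, 7e9, 989e12, 64):.1%}")   # ~40%
```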

Source

AI x Computing Chips: How to Use Artificial Intelligence to Design Better Chips [Breakdowns]

By

AlphaChip, an AI developed by Google, uses reinforcement learning to significantly speed up the computer chip design process by optimizing component placement. This method learns from previous designs and sequentially makes decisions, improving efficiency and accuracy without heavy reliance on human experts. By leveraging a graph neural network, AlphaChip captures complex relationships between chip components, leading to better layouts and faster convergence in design iterations.
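
To show the sequential-decision framing in miniature, here's a toy placement loop: components go onto a grid one at a time, and the finished layout is scored by total wire length. The components, nets, and random policy are all invented for illustration; AlphaChip's actual agent is a trained graph-neural-network policy, not random search:

```python
import random

GRID = 8
components = ["cpu", "cache", "io", "dram"]
nets = [("cpu", "cache"), ("cpu", "io"), ("cache", "dram"), ("io", "dram")]

def wirelength(placement):
    """Sum of Manhattan distances over connected pairs (lower is better)."""
    return sum(abs(placement[a][0] - placement[b][0]) +
               abs(placement[a][1] - placement[b][1]) for a, b in nets)

def random_episode():
    """One 'episode': sequentially place each component on a free cell."""
    free = [(x, y) for x in range(GRID) for y in range(GRID)]
    placement = {}
    for comp in components:          # the sequential decisions an RL agent makes
        cell = random.choice(free)   # a trained policy would choose, not sample
        free.remove(cell)
        placement[comp] = cell
    return placement, -wirelength(placement)   # reward = negative wirelength

# Crude search standing in for policy improvement: keep the best of many episodes.
best = max((random_episode() for _ in range(2000)), key=lambda e: e[1])
print("best layout:", best[0], "reward:", best[1])
```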

Source

Texas Plows Ahead

By

The Texas Responsible AI Governance Act (TRAIGA) aims to regulate AI by combating algorithmic discrimination and ensuring ethical deployment through strict compliance requirements for developers and users. It creates a powerful regulator, the Texas Artificial Intelligence Council, with broad authority to enforce rules and impose significant penalties for non-compliance. Despite some improvements, TRAIGA is still seen as a complex and burdensome regulatory framework that may lead to excessive censorship in AI development.

Source

13 Free AI Courses on AI Agents in 2025

In 2025, numerous free courses are available to learn about AI agents, which can perform complex tasks autonomously. These courses cover various topics, from multi-agent systems and prompt engineering to building AI agents and managing workflows. They are designed for both beginners and experienced professionals, providing essential knowledge to excel in the evolving AI landscape.

Source

5 Highlights From Society's Backend in 2024

Society's Backend grew from 345 to 3,445 members in 2024, thanks to the support of paid subscribers. Key highlights included creating a free machine learning roadmap, discussing the role of AI in software engineering, and collaborating with AI professionals to share insights. The author plans to enhance content quality and explore new formats in 2025 while continuing to support understanding of AI.

Source

Noteworthy AI Research Papers of 2024 (Part One)

By

This article highlights key advancements in large language model (LLM) research from early 2024, focusing on methods like low-rank adaptation (LoRA) and continued pretraining. Notable papers discuss the effectiveness of LoRA in retaining knowledge while learning new tasks and the introduction of a vast 15 trillion token dataset to aid LLM training. The author plans to continue the review with more exciting developments in the second part.
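
Since LoRA features prominently in the papers covered, here's a minimal NumPy sketch of the core idea: keep the pretrained weight frozen and learn only a rank-r update. Shapes and hyperparameters are arbitrary placeholders:

```python
import numpy as np

# LoRA: freeze the pretrained weight W and train only a rank-r update B @ A,
# so the adapted layer computes x @ (W + (alpha / r) * B @ A).T.
d_in, d_out, r, alpha = 512, 512, 8, 16

rng = np.random.default_rng(0)
W = rng.normal(size=(d_out, d_in))      # pretrained weight, frozen
A = rng.normal(size=(r, d_in)) * 0.01   # trainable, small random init
B = np.zeros((d_out, r))                # trainable, zero init so the
                                        # update starts as a no-op

def lora_forward(x):
    """Adapted forward pass: base projection plus scaled low-rank update."""
    return x @ W.T + (alpha / r) * (x @ A.T @ B.T)

x = rng.normal(size=(4, d_in))
print(lora_forward(x).shape)                        # (4, 512)
print("trainable params:", A.size + B.size, "vs frozen:", W.size)
```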

Source

We ask again: Is AI a science?

