March 14, 2025

Entering AI Autumn: Why LLMs Are Nearing Their Limit 


There’s no question that AI is everywhere, with new use cases emerging almost daily, but the endless buzz obscures a far more complex reality. We are entering an AI paradox: although excitement for these technologies has never been higher, large language models (LLMs) are hitting their limits and now deliver only marginal improvements. This has sparked debate among AI insiders as to whether these tools are “hitting a wall” or whether such concerns are overblown. What’s clear is that, at this stage, simply training LLMs with more data will no longer yield breakthrough improvements.

So, is AI winter upon us?

Not quite yet. Instead, we are entering an AI autumn, a season of change that will usher in a new chapter for these incredible technologies. While exponential progress toward AGI (Artificial General Intelligence) looks increasingly unlikely with LLMs alone (we’ll need a new paradigm for AGI), I believe that hyper-customized models will define the next era of AI innovation. In this phase of AI’s journey, companies will turn general-purpose LLMs into finely optimized AI models tailored for vertical use cases and applications.

Let’s look at how we got to AI autumn and what lies ahead.

Understanding the Slowdown in LLM Improvement

The performance of LLMs has been remarkable, and we have witnessed one of the most significant technological breakthroughs in history over the past few years. Nevertheless, there are intrinsic and practical limitations that are becoming increasingly evident as the field matures:

Scaling and the Case of Diminishing Returns

While scaling laws (bigger models and more data lead to better performance) held true for a time, returns diminished as models grew larger. The improvements became marginal compared to the massive increase in compute and data, and therefore cost, needed to train these models.
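To make that intuition concrete, here is a toy Python sketch of a Chinchilla-style scaling law, L(N, D) = E + A/N^alpha + B/D^beta. The constants roughly follow the published Hoffmann et al. (2022) estimates and are used purely for illustration; they are not a claim about any particular production model.

```python
# Toy illustration of diminishing returns under a Chinchilla-style scaling law.
# Constants roughly follow the Hoffmann et al. (2022) estimates; illustrative only.
E, A, B, ALPHA, BETA = 1.69, 406.4, 410.7, 0.34, 0.28

def predicted_loss(n_params: float, n_tokens: float) -> float:
    """Predicted pretraining loss for a model of n_params trained on n_tokens."""
    return E + A / n_params**ALPHA + B / n_tokens**BETA

# Each 10x jump in scale shaves off less loss than the jump before it.
for params, tokens in [(1e9, 2e10), (1e10, 2e11), (1e11, 2e12), (1e12, 2e13)]:
    print(f"{params:.0e} params, {tokens:.0e} tokens -> loss {predicted_loss(params, tokens):.3f}")
```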

More Data Doesn’t Mean Better AI

State-of-the-art LLMs are already trained on vast datasets encompassing much of the publicly available, high-quality data. At the scale of today’s top LLMs, adding more data doesn’t necessarily mean adding new or valuable knowledge; it may simply repeat or reinforce what the model already knows.

Quality Matters

As the best data sources are exhausted, newer datasets tend to include lower-quality or noisier data. This can degrade model performance or lead to overfitting, where a model mirrors its training data so closely that it fails to generalize to new inputs and produces inaccurate responses.

Additionally, as LLMs took off, more companies began using them to create online content, which means that much of the new content on the internet is now partly or wholly AI-generated. Training AI on AI-generated content doesn’t yield better results; it can make models worse, eventually leading to model collapse.
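As a rough intuition for why that happens, the toy sketch below (my illustration, not a result from the literature) repeatedly fits a simple Gaussian “model” to data generated by the previous generation’s model. With each generation the fitted distribution loses variance, a miniature version of the diversity loss behind model collapse.

```python
# Toy sketch of model collapse: each generation is fit only to samples produced
# by the previous generation, and the learned distribution steadily narrows.
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 0.0, 1.0                                # generation 0: the real data
for gen in range(1, 51):
    synthetic = rng.normal(mu, sigma, size=10)      # data generated by the last model
    mu, sigma = synthetic.mean(), synthetic.std()   # next model is fit to that output only
    if gen % 10 == 0:
        print(f"generation {gen:2d}: mean={mu:+.3f}, std={sigma:.3f}")
```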

The Case for Hyper-Customized Models

While general-purpose LLMs are great starting points, they typically lack the deep domain expertise needed for specific areas. In many ways, this mirrors human expertise: a person cannot become an expert in a particular field without formal education, training, and in-depth experience.

Moving forward, I anticipate that specialized LLMs will increasingly be built through fine-tuning and domain-specific adaptation of general-purpose models, developing deeper expertise in the process. Training on curated, high-quality datasets from a specific domain allows these models to acquire that expertise without the computational overhead of training from scratch.
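As a sketch of what that adaptation can look like in practice, here is a minimal LoRA fine-tuning example assuming the Hugging Face transformers, peft, and datasets libraries. The base model name and the domain_corpus.jsonl file are placeholders for whatever foundation model and curated domain data a team actually uses.

```python
# Minimal sketch of domain adaptation via LoRA fine-tuning (Hugging Face stack).
# The base model and "domain_corpus.jsonl" are placeholders, not recommendations.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

base = "meta-llama/Llama-3.2-1B"                    # placeholder general-purpose model
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base)

# Attach low-rank adapters so only a small fraction of the weights are trained.
model = get_peft_model(model, LoraConfig(r=8, lora_alpha=16, task_type="CAUSAL_LM"))

# Curated, domain-specific text (placeholder file) tokenized for causal LM training.
data = load_dataset("json", data_files="domain_corpus.jsonl")["train"]
data = data.map(lambda ex: tokenizer(ex["text"], truncation=True, max_length=512),
                remove_columns=data.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="domain-expert", num_train_epochs=1,
                           per_device_train_batch_size=4),
    train_dataset=data,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

The appeal of adapters over full retraining is cost: the base model stays frozen, so the resulting “expert” is a small set of extra weights that can be trained and shipped cheaply.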

Companies will develop LLM-powered solutions tailored for specific industries, incorporating domain-relevant terminology, processes, and tools. These models will come with optimized user interfaces and workflows — beyond the prevalent chat interface we’re all familiar with today.

This will also enable companies to create a genuine “moat,” or competitive edge in the market, around their AI products. By encapsulating their proprietary expertise and data into highly specialized models, they will get ahead of the pack and create models others cannot easily replicate.

And for proof? Look no further than the emergence of DeepSeek, a perfect example of a smaller, upstart model trained with a specific goal in mind. Soon, we’ll see “DeepSeek-like” AI built for vertical use cases as LLM improvement slows and specialized models prove their mettle. Imagine using a ChatGPT-like AI that has been optimized for a very specific domain, such as UI/UX design, architecture, financial planning, or cloud computing infrastructure. These smaller, specialized models will be cheaper to serve and able to run directly in our web browsers and on our mobile phones and laptops.
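To make the “runs on your laptop” claim concrete, here is a hedged sketch using llama-cpp-python to load a small, quantized model locally; design-expert-q4.gguf stands in for a hypothetical domain-tuned checkpoint rather than any real release.

```python
# Hedged sketch of local inference with a small quantized model via llama-cpp-python.
# "design-expert-q4.gguf" is a hypothetical domain-tuned checkpoint, not a real file.
from llama_cpp import Llama

llm = Llama(model_path="design-expert-q4.gguf", n_ctx=2048)
out = llm(
    "Suggest an accessible color palette for a financial planning dashboard.",
    max_tokens=128,
)
print(out["choices"][0]["text"])
```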

AI Autumn Accompanies More Changes to Come

They say that change is the only constant, and that’s certainly the case with AI. We’ve witnessed tremendous developments in recent years as AI entered our daily lives and dominated the news cycle. Yet every tool has its limits, and we are beginning to see those emerge clearly with LLMs. Scaling these models no longer yields the returns it once did. Moreover, since today’s models have already been trained on most of the available high-quality data sets, further training on lower-quality or AI-generated data will yield worse models at prohibitive cost.

But this doesn’t mean it’s the beginning of the end. On the contrary, AI autumn will bring exciting new opportunities to prioritize hyper-specialization over generalization. We are moving into a new age of expert AI models, and companies that harness this momentum will see massive benefits — and quickly.

