While large language models continue to make headlines, small language models are where the action is. At least, that’s what Meta appears to be betting on, according to a paper recently released by a team of its research scientists.
Large language models, like ChatGPT, Gemini, and Llama, can use billions, even trillions, of parameters to obtain their results. The size of those models makes them too big to run on mobile devices. So, the Meta scientists noted in their research, there is a growing need for efficient large language models on mobile devices — a need driven by increasing cloud costs and latency concerns.
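To see why billions of parameters are a problem on a phone, a back-of-the-envelope calculation helps. The figures below are illustrative assumptions, not from Meta’s paper: they assume 16-bit (2-byte) weights and ignore activations, caches, and runtime overhead, which only add to the total.

```python
# Rough memory estimate for holding a model's weights in RAM.
# Illustrative assumptions: 16-bit (2-byte) weights; activations,
# KV cache, and runtime overhead are ignored.

def weight_memory_gb(num_params: int, bytes_per_param: int = 2) -> float:
    """Approximate memory needed just for the weights, in gigabytes."""
    return num_params * bytes_per_param / 1e9

for name, params in [
    ("7B-class LLM", 7_000_000_000),
    ("350M-class SLM", 350_000_000),
    ("125M-class SLM", 125_000_000),
]:
    print(f"{name}: ~{weight_memory_gb(params):.2f} GB at 16-bit precision")
```

Even under these generous assumptions, a 7-billion-parameter model needs roughly 14 GB just for its weights, while a sub-billion-parameter model fits comfortably within the memory budget of a typical smartphone.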
In their research, the scientists explained how they created high-quality large language models with fewer than a billion parameters, which they maintained is a good size for mobile deployment.
Contrary to the prevailing belief that data volume and parameter count are what determine model quality, the scientists achieved results with their small language model comparable in some areas to those of Meta’s Llama LLM.
“There’s a prevailing paradigm that ‘bigger is better,’ but this is showing it’s really about how parameters are used,” said Nick DeGiacomo, CEO of Bucephalus, an AI-powered e-commerce supply chain platform based in New York City.
“This paves the way for more widespread adoption of on-device AI,” he told TechNewsWorld.
A Crucial Step
Meta’s research is significant because it challenges the current norm of cloud-reliant AI, which often sees data being crunched in far-off data centers, explained Darian Shimy, CEO and founder of FutureFund, a venture capital firm in San Francisco.
“By bringing AI processing into the device itself, Meta is flipping the script — potentially reducing the carbon footprint associated with data transmission and processing in massive, energy-consuming data centers and making device-based AI a key player in the tech ecosystem,” he told TechNewsWorld.
“This research is the first comprehensive and publicly shared effort of this magnitude,” added Yashin Manraj, CEO of Pvotal Technologies, an end-to-end security software developer in Eagle Point, Ore.
“It is a crucial first step in achieving an SLM-LLM harmonized approach where developers can find the right balance between cloud and on-device data processing,” he told TechNewsWorld. “It lays the groundwork where the promises of AI-powered applications can reach the level of support, automation, and assistance that have been marketed in recent years but lacked the engineering capacity to support those visions.”
Meta scientists have also taken a significant step in downsizing a language model. “They are proposing a model shrunk by an order of magnitude, making it more accessible for wearables, hearables, and mobile phones,” said Nishant Neekhra, global director of business development at Skyworks Solutions, a semiconductor company in Westlake Village, Calif.
“They’re presenting a whole new set of applications for AI while providing new ways for AI to interact in the real world,” he told TechNewsWorld. “By shrinking, they are also solving a major growth challenge plaguing LLMs, which is their ability to be deployed on edge devices.”
High Impact on Health Care
One area where small language models could have a meaningful impact is in medicine.
“The research promises to unlock the potential of generative AI for applications involving mobile devices, which are ubiquitous in today’s health care landscape for remote monitoring and biometric assessments,” Danielle Kelvas, a physician advisor with IT Medical, a global medical software development company, told TechNewsWorld.
By demonstrating that effective SLMs can have fewer than a billion parameters and still perform comparably to larger models in certain tasks, she continued, the researchers are opening the door for widespread adoption of AI in everyday health monitoring and personalized patient care.
Kelvas explained that using SLMs can also ensure that sensitive health data can be processed securely on a device, enhancing patient privacy. They can also facilitate real-time health monitoring and intervention, which is critical for patients with chronic conditions or those requiring continuous care.
She added that the models could also reduce the technological and financial barriers to deploying AI in health care settings, potentially democratizing advanced health monitoring technologies for broader populations.
Reflecting Industry Trends
Meta’s focus on small AI models for mobile devices reflects a broader industry trend towards optimizing AI for efficiency and accessibility, explained Caridad Muñoz, a professor of new media technology at CUNY LaGuardia Community College. “This shift not only addresses practical challenges but also aligns with growing concerns about the environmental impact of large-scale AI operations,” she told TechNewsWorld.
“By championing smaller, more efficient models, Meta is setting a precedent for sustainable and inclusive AI development,” Muñoz added.
Small language models also fit into the edge computing trend, which focuses on bringing AI capabilities closer to users. “The large language models from OpenAI, Anthropic, and others are often overkill — ‘when all you have is a hammer, everything looks like a nail,’” DeGiacomo said.
“Specialized, tuned models can be more efficient and cost-effective for specific tasks,” he noted. “Many mobile applications don’t require cutting-edge AI. You don’t need a supercomputer to send a text message.”
“This approach allows the device to focus on handling the routing between what can be answered using the SLM and specialized use cases, similar to the relationship between generalist and specialist doctors,” he added.
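The generalist/specialist routing DeGiacomo describes can be sketched in a few lines. This is a hypothetical illustration, not anything from Meta’s paper: the complexity heuristic, the function names, and the stub models are all invented here, and a real router might use the on-device SLM itself to decide which queries to escalate.

```python
# Hypothetical sketch of on-device routing between a local SLM and a
# cloud LLM. The "is_simple" heuristic is a toy stand-in for whatever
# classifier a real system would use.

from dataclasses import dataclass
from typing import Callable

@dataclass
class Router:
    run_local_slm: Callable[[str], str]   # fast, private, on-device
    call_cloud_llm: Callable[[str], str]  # slower, more capable

    def is_simple(self, query: str) -> bool:
        # Toy heuristic: short queries stay on-device.
        return len(query.split()) < 20

    def answer(self, query: str) -> str:
        if self.is_simple(query):
            return self.run_local_slm(query)
        return self.call_cloud_llm(query)

# Usage with stub models standing in for real inference calls:
router = Router(
    run_local_slm=lambda q: f"[on-device] {q}",
    call_cloud_llm=lambda q: f"[cloud] {q}",
)
print(router.answer("Set a timer for ten minutes"))
```

The design mirrors the doctor analogy: the cheap generalist handles routine requests locally, and only the cases it cannot confidently serve are referred out to the cloud specialist.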
Profound Effect on Global Connectivity
Shimy maintained that the implications of SLMs for global connectivity are profound.
“As on-device AI becomes more capable, the necessity for continuous internet connectivity diminishes, which could dramatically shift the tech landscape in regions where internet access is inconsistent or costly,” he observed. “This could democratize access to advanced technologies, making cutting-edge AI tools available across diverse global markets.”
While Meta is leading the development of SLMs, Manraj noted that developing countries are aggressively monitoring the situation to keep their AI development costs in check. “China, Russia, and Iran seem to have developed a high interest in the ability to defer compute calculations on local devices, especially when cutting-edge AI hardware chips are embargoed or not easily accessible,” he said.
“We do not expect this to be an overnight or drastic change though,” he predicted, “because complex, multi-language queries will still require cloud-based LLMs to provide cutting-edge value to end users. However, this shift towards allowing an on-device ‘last mile’ model can help reduce the burden of the LLMs to handle smaller tasks, reduce feedback loops, and provide local data enrichment.”
“Ultimately,” he continued, “the end user will clearly be the winner, as this would allow a new generation of capabilities on their devices and a more promising overhaul of front-end applications and how people interact with the world.”
“While the usual suspects are driving innovation in this sector with a promising potential impact on everyone’s daily lives,” he added, “SLMs could also be a Trojan Horse that provides a new level of sophistication in the intrusion of our daily lives by having models capable of harvesting data and metadata at an unprecedented level. We hope that with the proper safeguards, we are able to channel these efforts to a productive outcome.”