Shift to Smaller Language Models (SLMs)


Context:

The trend of scaling LLMs began with OpenAI's GPT-3 (175 billion parameters) in 2020 and continued with GPT-4, widely reported, though never confirmed by OpenAI, to have around 1.7 trillion parameters. By 2024, however, the focus had shifted towards smaller language models, as further scaling of LLMs began to yield diminishing returns.

What is an SLM?

  • Small Language Models (SLMs) are generative AI models built with deliberately compact architectures.
  • They have fewer parameters and are trained on a smaller volume of data.
  • The reduced footprint translates into lower memory and processing demands (see the rough memory calculation after this list).
  • SLMs are well-suited for on-device deployments and applications that prioritise resource efficiency.
  • In effect, SLMs are compact counterparts of large language models (LLMs), broadening the range of high-quality, practical model options available to customers.
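
To see why parameter count matters so much for on-device deployment, here is a minimal back-of-the-envelope sketch (plain Python, no external libraries; the per-parameter byte sizes are standard conventions and the model list is illustrative) estimating the memory needed just to hold a model's weights:

```python
# Rough memory needed to hold model weights alone:
#   memory (bytes) ~= parameter_count * bytes_per_parameter
# Activations, KV cache, and runtime overhead come on top of this.

BYTES_PER_PARAM = {
    "fp32": 4.0,   # full precision (training)
    "fp16": 2.0,   # half precision, common for inference
    "int4": 0.5,   # 4-bit quantisation, common on-device
}

MODELS = {
    "GPT-3 (175B)": 175e9,
    "Llama 3 8B": 8e9,
    "Phi-3-mini (3.8B)": 3.8e9,
}

for name, params in MODELS.items():
    sizes = ", ".join(
        f"{prec}: {params * bpp / 1e9:,.1f} GB"
        for prec, bpp in BYTES_PER_PARAM.items()
    )
    print(f"{name:>18} -> {sizes}")

# Phi-3-mini at int4 needs roughly 1.9 GB -- feasible on a phone --
# whereas GPT-3 at fp16 needs about 350 GB -- data-centre territory.
```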

Development of Smaller Language Models (SLMs):

  • In 2024, Big Tech firms began releasing smaller models:
    • Google DeepMind shipped the Gemini family, pairing the large Ultra with the smaller Nano and Flash models.
    • OpenAI and Meta launched compact versions such as GPT-4o mini and Llama 3 8B.
    • Anthropic's Claude 3 family offers the lightweight Haiku alongside the larger Sonnet and Opus.

Advantages of Small Language Models:

  • Cost and Efficiency: SLMs are cheaper to build and run, requiring less training time and fewer computational resources, which makes them ideal for specialised tasks.
  • Specific Use Cases: They excel at focused applications rather than general-purpose AI tasks, making them suitable for edge devices such as smartphones.
  • Examples:
    • Mistral AI, a startup, offers small models that match much larger LLMs on specific use cases.
    • Microsoft’s Phi-3-mini packs 3.8 billion parameters, small enough to run on consumer hardware (see the loading sketch after this list).
    • Apple Intelligence, running on iPhones and iPads, uses on-device SLMs for specific applications.
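
As a concrete illustration of how lightweight such a model is to use, here is a minimal sketch of loading and querying Phi-3-mini locally with the Hugging Face transformers library. It assumes the transformers and torch packages are installed (older transformers releases may additionally need trust_remote_code=True), and the prompt is purely illustrative:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "microsoft/Phi-3-mini-4k-instruct"  # 3.8B-parameter SLM

# Download (on first run) and load the model in half precision;
# device_map="auto" places it on a GPU if available, else the CPU.
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float16,
    device_map="auto",
)

prompt = "Translate to French: The weather is lovely today."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Generate a short completion; greedy decoding keeps the example simple.
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```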

Drawbacks of Small Language Models:

  • Limited Complexity: While efficient at basic tasks such as language translation, SLMs struggle with more complex work such as coding or logical problem-solving.
  • Performance Ceiling: Smaller parameter counts inherently limit their problem-solving capacity compared to Large Language Models (LLMs).

Use Cases for Large vs. Small Models:

  • Small Language Models (SLMs): Great for focused, simpler tasks such as translation, basic customer service, and other narrow applications. Example: WhatsApp uses Meta’s Llama 3 8B model for language learning.
  • Large Language Models (LLMs): Excel at more complex tasks like coding, logical reasoning, and solving intricate problems.
  • An analogy with the human brain: just as humans rely on a large brain for complex reasoning, LLMs carry larger parameter counts for more advanced capabilities, while SLMs are designed for narrower, simpler tasks.

Relevance for India:

  • In India, where AI adoption is growing but computing resources may be limited, SLMs are ideal because of their affordability and their ability to meet specific needs.
  • Projects like Visvam from IIIT Hyderabad are building small language models tailored to healthcare, agriculture, and education, and to promoting linguistic and cultural diversity.
  • Sarvam AI is also working to create AI solutions that cater to the needs of a billion Indians.
