Meta just launched the largest ‘open’ AI model in history—here’s why it matters

In the world of artificial intelligence (AI), a battle is underway. On one side are companies that believe in keeping the datasets and algorithms behind their advanced software private and confidential. On the other are companies that believe in allowing the public to see what’s under the hood of their sophisticated AI models.

Think of this as the battle between open- and closed-source AI.

In recent weeks, Meta, the parent company of Facebook, took up the fight for open-source AI in a big way by releasing a new collection of large AI models. These include a model named Llama 3.1 405B, which Meta’s founder and chief executive, Mark Zuckerberg, says is “the first frontier-level open source AI model.”

For anyone who cares about a future in which everybody can access the benefits of AI, this is good news.

The danger of closed-source AI—and the promise of open-source AI

Closed-source AI refers to models, datasets and algorithms that are proprietary and kept confidential. Examples include ChatGPT, Google’s Gemini and Anthropic’s Claude.

Though anyone can use these products, there is no way to find out which datasets and source code were used to build the AI model or tool.

While this is a great way for companies to protect their intellectual property and their profits, it risks undermining public trust and accountability. Making AI technology closed-source also slows down innovation and makes a company or other users dependent on a single platform for their AI needs. This is because the platform that owns the model controls changes, licensing and updates.

There is a range of ethical frameworks that seek to improve the fairness, accountability, transparency, privacy and human oversight of AI. However, these principles are often not fully achieved with closed-source AI due to the inherent lack of transparency and external accountability associated with proprietary systems.

In the case of ChatGPT, its parent company, OpenAI, releases neither the dataset nor code of its latest AI tools to the public. This makes it impossible for regulators to audit it. And while access to the service is free, concerns remain about how users’ data are stored and used for retraining models.

By contrast, the code and datasets behind open-source AI models are available for everyone to see.

This fosters rapid development through community collaboration and enables the involvement of smaller organizations and even individuals in AI development. It also makes a huge difference for small and medium-sized enterprises, which can build on existing models rather than bear the colossal cost of training large AI models themselves.

Perhaps most importantly, open-source AI allows for scrutiny and identification of potential biases and vulnerabilities.

However, open-source AI does create new risks and ethical concerns.

For example, quality control in open-source products is usually low. Because hackers can also access the code and data, the models are more prone to cyberattacks and can be customized for malicious purposes, such as by retraining the model with data from the dark web.

An open-source AI pioneer

Among all leading AI companies, Meta has emerged as a pioneer of open-source AI. With its new suite of AI models, it is doing what OpenAI promised to do when it launched in December 2015—namely, advancing digital intelligence “in the way that is most likely to benefit humanity as a whole,” as OpenAI said back then.

Llama 3.1 405B is the largest open-source AI model in history. It is what’s known as a large language model, capable of generating human-language text in multiple languages. It can be downloaded online, but because of its huge size, users will need powerful hardware to run it.
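To give a rough sense of what "powerful hardware" means here, the back-of-the-envelope sketch below estimates the memory needed just to hold the model's weights. It assumes the "405B" in the name means about 405 billion parameters. The bytes-per-parameter figures and the 80 GB device size are illustrative assumptions, not details from Meta, and the estimate ignores the extra memory needed while the model is actually running.

```python
import math

PARAMS = 405e9  # parameter count implied by the "405B" in the model name

# Bytes per parameter at common numeric precisions (illustrative assumption)
BYTES_PER_PARAM = {"16-bit": 2, "8-bit": 1, "4-bit": 0.5}


def weight_memory_gb(params: float, bytes_per_param: float) -> float:
    """Memory needed to hold the weights alone, in gigabytes (10^9 bytes)."""
    return params * bytes_per_param / 1e9


def devices_needed(memory_gb: float, per_device_gb: float = 80.0) -> int:
    """Minimum number of devices whose combined memory fits the weights.

    The 80 GB default is a hypothetical accelerator size.
    """
    return math.ceil(memory_gb / per_device_gb)


if __name__ == "__main__":
    for name, size in BYTES_PER_PARAM.items():
        gb = weight_memory_gb(PARAMS, size)
        print(f"{name}: {gb:.1f} GB of weights, "
              f"at least {devices_needed(gb)} devices of 80 GB each")
```

At 16 bits per parameter, the weights alone come to roughly 810 GB, far more than a typical laptop or desktop can hold, which is why downloading the model is much easier than running it.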

While it does not outperform other models across all metrics, Llama 3.1 405B is considered highly competitive and performs better than existing closed-source and commercial large language models in certain tasks, such as reasoning and coding.

But the new model is not fully open, because Meta hasn’t released the huge dataset used to train it. This is a significant “open” element that is currently missing.

Nonetheless, Meta’s Llama levels the playing field for researchers, small organizations and startups because it can be leveraged without the immense resources required to train large language models from scratch.

Shaping the future of AI

To ensure AI is democratized, we need three key pillars:

  • Governance: regulatory and ethical frameworks to ensure AI technology is being developed and used responsibly and ethically.
  • Accessibility: affordable computing resources and user-friendly tools to ensure a fair landscape for developers and users.
  • Openness: the datasets and algorithms used to train and build AI tools should be open source to ensure transparency.

Achieving these three pillars is a shared responsibility for government, industry, academia and the public. The public can play a vital role by advocating for ethical policies in AI, staying informed about AI developments, using AI responsibly and supporting open-source AI initiatives.

But several questions remain about open-source AI. How can we balance protecting intellectual property and fostering innovation through open-source AI? How can we minimize ethical concerns around open-source AI? How can we safeguard open-source AI against potential misuse?

Properly addressing these questions will help us create a future where AI is an inclusive tool for all. Will we rise to the challenge and ensure AI serves the greater good? Or will we let it become another nasty tool for exclusion and control? The future is in our hands.

Provided by
The Conversation

This article is republished from The Conversation under a Creative Commons license. Read the original article.

Citation:
Meta just launched the largest ‘open’ AI model in history—here’s why it matters (2024, August 3)
retrieved 3 August 2024
from https://techxplore.com/news/2024-08-meta-largest-ai-history.html
