The Open Source Initiative recently dropped the first version of its Open Source AI Definition (OSAID). While not legally enforceable, the idea is to give the industry a concrete standard for what it really means to be “open source” in artificial intelligence.
The term “open source” is bandied around a lot, and some tech companies that claim to be open source actually aren’t. As more companies utilise AI, standards and governance will become increasingly important to business practice.
Meta is just one of the companies that has branded its Llama large language models as open source, despite not meeting the new criteria.
And this is important.
For businesses looking to adopt AI, OSAID isn’t just about semantics: it’s being positioned as a safeguard for transparency, security and keeping AI accessible.
What is open source AI and why does this definition matter?
Open source AI refers to AI systems that are fully accessible to the public in their code, data and design, allowing anyone to use, modify and share the model without restrictions. This openness ensures AI systems can be studied, improved and applied transparently, benefiting a wide community of developers and users.
In an industry where terms like “AI” and “open source” are thrown around loosely, the release of OSAID brings some long-overdue clarity.
With policymakers worldwide drafting AI regulations, including here in Australia, a definition like OSAID gives everyone a common standard and helps regulators identify which models should receive open source benefits, such as favourable compliance treatment.
OSI executive director Stefano Maffulli notes the European Commission, for example, has been closely watching the open source AI space. This definition, Maffulli says, could be a guidepost for what “open” means as laws start to catch up with AI.
Open source AI standards
So what does it actually take for a model to qualify as open source AI under OSAID? The definition has some very specific requirements, or “four freedoms”, that go beyond just free access.
To meet OSAID’s standard, an AI system must give users the freedom to:
- Use the system for any purpose, without having to ask for permission;
- Study how the system works and inspect its components;
- Modify the system for any purpose, including to change its output; and
- Share the system for others to use, with or without modifications, for any purpose.
These criteria create a framework for what transparency and accountability should look like in open source AI.
For businesses looking to bring AI into their operations, this definition aims to ensure they’re working with tools that can be understood, audited and trusted.
The OSAID standard can also help businesses spot models that are “open source” in name only. Truly open source models are transparent, letting companies see what data the model was trained on and how it processes information. This visibility is essential for ensuring data compliance, detecting bias and strengthening security.
In an era where the legal and ethical implications of AI are under scrutiny, using AI tools that don’t align with OSAID could result in compliance issues.
Of course, some argue the OSAID doesn’t go far enough in specifying licensing and data transparency requirements, which are essential for businesses that depend on stable, well-documented AI tools.
OSI has acknowledged that future updates to the definition will likely address these nuances, especially as questions around intellectual property and licensing for training data evolve.
As a result, OSI has set up a committee to monitor the standard’s real-world application and recommend subsequent updates.
Open source AI is complicated
The new standard is certainly going to shine a light on companies claiming the open source label without delivering on the details.
In the case of Meta, its Llama models are labelled as open source but fail to meet the OSAID standard because of their restrictive licensing.
Llama’s licence limits some commercial uses and prohibits activities that could be harmful or illegal. Meta has also been reluctant to disclose details about Llama’s training data.
Meta argues these restrictions serve as “guardrails” against potential misuse, which remains a significant concern in AI development. Without controls, powerful AI models could be repurposed for harmful applications like deepfakes, misinformation or other unethical uses.
These are real risks, and Metaโs position reflects a broader ethical debate within the open source community: how to balance transparency and unrestricted use with responsibility and safety. For Meta, keeping certain restrictions in place allows it to reduce misuse risks without sacrificing model accessibility.
The OSI also acknowledges the need for privacy in some situations. It is transparent about excluding some types of training data, such as medical data, because of privacy laws.
“We want Open Source AI to exist also in fields where data cannot be legally shared,” the FAQs read.
Interestingly, Meta, along with other big tech companies such as Amazon, Google, Microsoft, Cisco, Intel and Salesforce, actually helps support the OSI’s work.
On the one hand, this does show an investment in open source software. But it also raises questions about potential conflicts of interest when a company supports the OSI financially but doesn’t adhere to its standards.
“While the OSI is very grateful for their support, the OSI does not endorse these or any other companies,” the website reads.
This dynamic reveals the complexities at play when the very organisations that fund open source initiatives also have a vested interest in shaping what โopen sourceโ means in AI.
This is all the more reason for international and independent standards to exist: to ensure labels such as “open source” don’t just become a marketing tool to peddle AI models.