Why open-source models are disrupting the AI race

Emmanuel Ohiri

There has been a surge of open-source generative AI projects recently. In 2024 alone, developers launched nearly 137,000 new public generative AI projects on GitHub, with a nearly 60% increase in contributions to these projects.

The release of the DeepSeek-R1 and DeepSeek-R1-Zero models by the Chinese company DeepSeek made a huge splash in the industry after CEO Liang Wenfeng announced that it took just 2,048 NVIDIA H800 GPUs, at a cost of $5.6 million, to create R1.

For context, according to NVIDIA founder Jensen Huang, it took 25,000 Ampere-based GPUs to train the GPT-MoE-1.8T model, and it would take about 8,000 H100 GPUs (of which the H800 is an export-compliant, reduced-bandwidth variant) to train the same model.

Figure: DeepSeek models compared to OpenAI’s o1 models. Source: DeepSeek

The biggest disruptive factor of the DeepSeek-R1 models is not its performance (which we’ll discuss later) but the fact that it was open-sourced upon release, giving everyone the ability (and license) to use the models for free.

In this article, we’ll discuss what the release of open-source models means for the industry and the ongoing AI race. Let’s begin by discussing the difference between license types of AI models.

Types of AI licenses

There are different types of licenses for AI models, and even when a model is open-source, it isn't necessarily a free-for-all. While open-source suggests accessibility, the specific license attached to a model dictates precisely how it can be used, modified, and distributed.

Broadly, open-source AI licenses fall into two main categories: permissive and copyleft. Permissive licenses, like the Apache 2.0 or MIT License (under which DeepSeek-R1 is released), are very liberal. They generally allow almost any use of the model, including commercial applications, modifications, and redistribution, often with minimal restrictions. It's like being given a set of LEGOs and being told you can build whatever you want with them, and even sell your creations; you just need to give credit where it's due.

Copyleft licenses, on the other hand, are a bit more nuanced. They operate on the principle of "share alike." A prominent example is the GNU General Public License (GPL). If you modify a model released under a copyleft license and then distribute your modified version, you're generally required to release your own modifications under the same copyleft license. It's like being given those LEGOs, building something cool, and then being asked to share your instructions so others can build similar things. This ensures that the benefits of the original open-source work continue to be shared and contribute to a collaborative ecosystem.

| License Type | Example Models | License Characteristics | Commercial Use | Modification Allowed | Redistribution | Restrictions |
|---|---|---|---|---|---|---|
| Open-Source (Permissive) | BLOOM (Apache 2.0), GPT-Neo (MIT), DeepSeek-R1 (MIT) | Broad usage rights, modification, and redistribution with minimal restrictions | Yes | Yes | Yes | Proper attribution required; no liability on the authors |
| Open-Source (Copyleft) | Fairseq (GPL), Marian (GNU GPL) | Requires derivatives to be released under the same license (strong copyleft) | Yes | Yes | Yes | Must release modifications under the same license |
| Permissive-but-Restricted | Stable Diffusion (CreativeML), GPT-J (RAIL-M), Llama (Meta) | Allows modification and redistribution but with specific use-case limitations | Limited | Yes | Limited | Restrictions on commercial use, harmful content, or certain industries |
| Proprietary (API Access) | GPT-4 (OpenAI), Claude (Anthropic) | Access through API; no access to the underlying model | Paid only | No | No | No reverse engineering or competing product creation |
| Proprietary (EULA) | Jasper AI, Bard (Google) | Full control by the provider; license agreement governs use | Paid only | No | No | Usage restricted to the terms of the agreement |
| Dual Licensing | MySQL for ML-based apps | Open-source for non-commercial use; proprietary license for commercial deployment | Conditional | Yes (under open license) | Yes (under open license) | Commercial use requires purchasing a proprietary license |
| Custom Research License | Codestral 22B (Mistral) | Research-only license; prohibits commercial and unauthorized use | No | Yes | No | Research purposes only; strict control on redistribution |

Beyond these open-source categories, we also see other licensing types in the AI space:

  • Custom research licenses: Some organizations release models under custom licenses specifically tailored for research purposes. These licenses might place restrictions on commercial use or require specific attribution in research publications. They aim to enable academic exploration while protecting potential commercial interests.
  • Dual licensing: This approach offers the model under two different licenses. One might be a permissive license for non-commercial use, while the other is a commercial license that requires a fee or specific agreement for commercial applications. This allows developers to choose the licensing option that best suits their needs.
  • API access (Proprietary): While not strictly open-source, many companies offer access to their AI models through APIs. This lets developers use the model's capabilities without having access to the underlying code or weights. OpenAI’s ChatGPT is an example of a proprietary model; developers are typically granted access through a paid subscription or usage-based pricing. It's more like renting the LEGO creation rather than owning the LEGOs themselves.
  • Permissive-but-restricted: Some licenses might appear permissive at first glance but contain specific restrictions. For example, a license might allow commercial use but prohibit certain types of applications, like those related to defense or surveillance. These restrictions can be subtle, so careful review is essential.

These licenses define the boundaries of permissible activity when using an AI model. These are broad categories, and within each category, the exact license agreement for a model may differ. For example, Meta specified in its Llama 3 license agreement that you cannot use it to train other large language models, and if you use it in a product or service that exceeds 700 million monthly active users, you’ll need a separate license from Meta.
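As a rough illustration, the broad categories in the table above can be encoded as a small lookup structure. The category names and fields below are purely illustrative simplifications of the table, not a legal reference; always read the actual license text attached to a model.

```python
# Illustrative sketch: encode the broad license categories from the table
# above as a simple lookup. This is a simplification for demonstration,
# not legal guidance.

LICENSE_CATEGORIES = {
    "permissive": {
        "commercial": "yes", "modify": True, "redistribute": "yes",
        "note": "attribution required",
    },
    "copyleft": {
        "commercial": "yes", "modify": True, "redistribute": "yes",
        "note": "derivatives must use the same license",
    },
    "restricted": {
        "commercial": "limited", "modify": True, "redistribute": "limited",
        "note": "use-case restrictions apply",
    },
    "api_proprietary": {
        "commercial": "paid", "modify": False, "redistribute": "no",
        "note": "no access to weights",
    },
}

def can_modify(category: str) -> bool:
    """Return whether a license category generally permits modification."""
    return LICENSE_CATEGORIES[category]["modify"]

print(can_modify("permissive"))       # True
print(can_modify("api_proprietary"))  # False
```

A real compliance check would of course go far beyond this, but even a simple mapping like this is useful when auditing which models in a project can legally ship in a commercial product.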

Why DeepSeek is making a difference

To see how DeepSeek performs against other models, we conducted our own benchmark tests. We focused on four key categories (reasoning, summarization, factual recall, and creative writing), measuring both latency (response time) and token efficiency (average token count per response).

Disclaimer: OpenAI has claimed that DeepSeek's model was partially distilled from ChatGPT outputs.

There have been many comparisons of GPT-4o with DeepSeek-R1, and you can see some of those results in the DeepSeek paper. Our benchmarks compared some of DeepSeek's smaller models with GPT-4, averaging results over multiple trials for accuracy.
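The benchmarking loop described above can be sketched in a few lines: for each prompt, time the model call (latency), count tokens in the response, and average over multiple trials. `call_model` and `count_tokens` below are hypothetical stand-ins; a real setup would call a model API or local inference server and use the model's own tokenizer.

```python
import statistics
import time

def call_model(prompt: str) -> str:
    # Placeholder response; replace with a real API or inference call.
    return "The sky appears blue due to Rayleigh scattering of sunlight."

def count_tokens(text: str) -> int:
    # Crude whitespace tokenization; real benchmarks would use the
    # model's actual tokenizer for accurate counts.
    return len(text.split())

def benchmark(prompt: str, trials: int = 5) -> dict:
    """Average latency and response token count over several trials."""
    latencies, token_counts = [], []
    for _ in range(trials):
        start = time.perf_counter()
        response = call_model(prompt)
        latencies.append(time.perf_counter() - start)
        token_counts.append(count_tokens(response))
    return {
        "avg_latency_s": statistics.mean(latencies),
        "avg_tokens": statistics.mean(token_counts),
    }

result = benchmark("Explain why the sky appears blue to the human eye.")
print(result)
```

Averaging over several trials matters because per-request latency from a hosted API can vary considerably with load and network conditions.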

Figure: Benchmark results comparing DeepSeek (8B parameters) and GPT-4 across various tasks.

As shown in the graph, DeepSeek performs comparably to GPT-4 in certain categories and even outperforms it in others. DeepSeek's performance, especially given its low training cost, is a testament to its engineering, which uses techniques like cold-start data pre-training and reinforcement learning to fine-tune reasoning capabilities.

DeepSeek’s success promises more highly capable, free-to-use AI models in the hands of developers, which bodes well for the future evolution and development of AI.

Considering DeepSeek's limited access to advanced chips due to sanctions, its success also highlights the ongoing compute and data constraints in AI. While its benchmark performance is noteworthy, the model isn’t as strong at certain tasks. For example, it didn’t do as well as ChatGPT in reasoning tasks, as you can see in its answers to certain prompts.

We asked both models to explain why the sky appears blue to the human eye. We prompted each model multiple times and took one representative answer from the same round of questioning.

Prompt: Explain why the sky appears blue to the human eye.

Here is how DeepSeek answered the question:

Response: The sky appears blue to the human eye due to the scattering of sunlight by molecules and small particles in the Earth's atmosphere. This phenomenon is known as Rayleigh scattering. Here's a step-by-step explanation:

  1. Sunlight Composition: The sun emits light with various wavelengths, but most of it is in the form of visible light, with a peak in the blue and green regions of the spectrum.
  2. Rayleigh Scattering: When sunlight enters…

Here is how ChatGPT answered the same question:

Response: The sky appears blue to the human eye due to a phenomenon called Rayleigh scattering. When sunlight reaches the Earth's atmosphere, it is made up of a spectrum of different colors, with blue light having a shorter wavelength than other colors in the spectrum. When sunlight encounters molecules and particles in the Earth's atmosphere, such as oxygen and nitrogen molecules, they scatter the light in different directions. The blue color of the sky is more noticeable when looking directly overhead because there is a larger volume of atmosphere that the sunlight has to pass through, leading to more scattering of the blue light. At sunrise and sunset, when the sun is lower on the horizon, the light has to pass through even more atmosphere, causing the shorter wavelength blue light to be scattered out even more, giving rise to the beautiful hues of orange and red that we see during those times of day. …

ChatGPT's response demonstrates a deeper understanding of the underlying physics, explaining the role of wavelength and atmospheric conditions more comprehensively than DeepSeek. While DeepSeek provides a basic explanation of Rayleigh scattering, it lacks the nuance and detail of ChatGPT's response. It is normal for specific models to outperform others on certain tasks, but as our benchmarks show, ChatGPT also performed better on both token efficiency and latency.

If you want to run your own benchmark tests, you can use our pre-built setup. You can find it on GitHub.

All of this leads us to consider the advantages and disadvantages of using open-source models.

Pros and cons of using open-source AI models

With the release and open-sourcing of the DeepSeek-R1 and DeepSeek-R1-Zero models, there is much for developers and organizations to consider when integrating an AI model into their products and/or services. Let's weigh the advantages and disadvantages of using open-source models:

Advantages of using open-source AI models

  • Accessibility and cost-effectiveness: The most obvious advantage of an open-source model is free access, allowing smaller teams, individual researchers, and even hobbyists to experiment with cutting-edge technology without the hefty price tag associated with proprietary models.
  • Customization and flexibility: Open-source models can be modified and fine-tuned to suit specific needs. You have access to the underlying code and weights, allowing you to adapt the model for unique applications or datasets. This level of customization is simply not possible with closed-source, proprietary models.
  • Community and collaboration: Open-source projects often have vibrant communities of developers and researchers who contribute to the model's improvement. Such a collaborative environment can lead to faster bug fixes, new features, and a wealth of shared knowledge. It's like having a team of experts working together to make the model better.
  • Transparency and auditability: With access to the code, developers can inspect the inner workings of the model, understand its limitations, and identify potential biases. This transparency is important for building trust and ensuring responsible AI development.
  • Rapid innovation: The open-source model allows for rapid iteration and experimentation. Researchers can quickly test new ideas and contribute their improvements back to the community, accelerating the pace of innovation in the field.

Disadvantages of using open-source models

  • Maintenance and support: While community support can be a huge asset, it's not always guaranteed. Open-source projects may lack dedicated support teams, and users might need to rely on community forums or their own expertise to troubleshoot issues.
  • Security risks: Open-source models can be more vulnerable to security risks if not properly maintained. Malicious actors could potentially introduce vulnerabilities into the code, which could be exploited by others. Careful code review and security best practices are essential.
  • Licensing complexities: As discussed earlier, navigating the various open-source licenses can be challenging. You need to carefully understand the terms of each license to ensure compliance and avoid legal issues.
  • Quality and reliability: The quality and reliability of open-source models can vary significantly. Some models might be well-documented and thoroughly tested, while others might be experimental or still in development. It's important to carefully evaluate the model's maturity and suitability for the intended application.
  • Potential for misuse: The open availability of powerful AI models also raises concerns about potential misuse. These models could be used for malicious purposes, such as generating misinformation or creating deepfakes. Responsible development and deployment practices are crucial to mitigate these risks.

Open-source AI models offer a powerful and accessible way to use and deploy cutting-edge generative AI for teams that lack the resources to train models from scratch.

Concluding thoughts

While proprietary companies have held the keys to the most powerful models, DeepSeek's release figuratively throws open the doors, democratizing AI access in an unprecedented way, and this is a good thing. Imagine a world where startups, researchers, and even individual developers can access, use, and collaboratively build cutting-edge AI for free.

DeepSeek's approach could be the catalyst that finally breaks down the barriers to entry, putting AI development within reach of anyone with the drive and the skills, which might start a wave of innovation, leading to the development of AI-powered tools and applications that we can't even conceive of today.

The pressure is now on the proprietary AI companies. They can no longer afford to operate in a vacuum, charging for API access while equally capable open-source alternatives are growing. DeepSeek's move could force a long-overdue reckoning, compelling these companies to rethink their pricing strategies and offer more competitive options. Consumers, ultimately, stand to benefit.

However, open-source AI also comes with challenges, from security concerns to the potential for misuse. These are real issues that need to be addressed head-on. We need robust security protocols, clear ethical guidelines, and ongoing community engagement to ensure that these powerful tools are used responsibly. The open-source community needs to step up and ensure that these models are not just accessible but also safe and reliable.

If you need simple compute resources for testing and prototyping your AI models, you can access the latest NVIDIA GPUs on demand at a cost-effective rate. For companies looking to train generative AIs at a cost-effective rate while prioritizing performance and scalability, use our enterprise solution. We also offer bare metal servers and scalable GPU clusters optimized for AI, HPC, and deep learning. Get started today, or contact us to learn more.
