DeepSeek-V3: An Economical and Public Alternative to Major AI Players
The DeepSeek-V3 model: a budget-friendly, open challenge to the major players in AI.
According to the company's internal testing, the model excels in English, Chinese, coding, and mathematics, competing with top commercial models such as OpenAI's GPT-4o. At 671 billion parameters, DeepSeek-V3 is substantially larger than both Meta's Llama 2 and Llama 3.1, a scale that likely contributes to its performance, though its mixture-of-experts design activates only a fraction of those parameters for any given token.
Users can easily try the model through a free chatbot on DeepSeek's website. The chatbot can search the web and can also expose the model's reasoning process by displaying its intermediate thought steps.
While matching the performance of so-called "frontier models," DeepSeek-V3 stands out for its remarkably lower development and operating costs. DeepSeek claims a training expenditure of only $5.5 million, a small fraction of the more than $100 million OpenAI is estimated to have spent on GPT-4.
The pricing of DeepSeek-V3's online services reflects the same cost-focused strategy: one million tokens cost just $1.10, with a promotional rate of $0.28 currently available, compared with GPT-4o's $10 per million tokens.
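To put those per-million-token rates in perspective, here is a minimal sketch of the arithmetic. The prices come from the figures above; the 50-million-token workload is an illustrative assumption, not a figure from the article.

```python
# Per-million-token prices quoted in the article (USD).
PRICES_PER_MILLION = {
    "DeepSeek-V3 (standard)": 1.10,
    "DeepSeek-V3 (promo)": 0.28,
    "GPT-4o": 10.00,
}

def cost(tokens: int, price_per_million: float) -> float:
    """Dollar cost of processing `tokens` at the given per-million rate."""
    return tokens / 1_000_000 * price_per_million

# Hypothetical workload: 50 million tokens.
for name, price in PRICES_PER_MILLION.items():
    print(f"{name}: ${cost(50_000_000, price):,.2f}")
```

At these rates, the same 50-million-token workload would cost roughly $55 on DeepSeek-V3's standard tier versus $500 on GPT-4o.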
DeepSeek-V3 is also freely available for download and local execution, a significant benefit for users who prioritize data privacy, operate in areas with limited internet connectivity, or want greater control over their AI systems. This sets it apart from models such as Microsoft's Copilot, Google's Gemini, and OpenAI's GPT series, which require a stable internet connection.
For companies focused on data security, utilizing a local version of DeepSeek-V3 provides an effective way to leverage advanced AI while safeguarding sensitive information.
Nonetheless, DeepSeek-V3's sheer size poses a substantial challenge for home users: running the model requires hardware far beyond typical PCs and smartphones. As a result, many individual users may prefer the convenience of the free chatbot.
Currently, there are few verified real-world reports of successful local execution, and the company's performance claims still await independent validation. One blogger reported running the model on a cluster of eight Apple Mac minis, each with an M4 Pro chip and 64 gigabytes of RAM; the entire setup cost over $10,000.
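A rough back-of-the-envelope estimate shows why such a cluster is needed. The 671-billion-parameter count and the 8 × 64 GB cluster come from the article; the bytes-per-parameter figures are standard precision sizes, and the comparison is an illustrative sketch, not a statement about the blogger's actual setup.

```python
# Approximate memory needed just to store 671 billion parameters
# at common numeric precisions (decimal gigabytes).
PARAMS = 671e9  # total parameter count reported for DeepSeek-V3
GB = 1e9

def weights_gb(bytes_per_param: float) -> float:
    """Approximate size of the model weights alone, in GB."""
    return PARAMS * bytes_per_param / GB

for label, bpp in [("fp16 (2 bytes)", 2), ("8-bit (1 byte)", 1), ("4-bit (0.5 byte)", 0.5)]:
    print(f"{label}: ~{weights_gb(bpp):,.0f} GB")

# The cluster described in the article provides 8 * 64 = 512 GB of RAM.
print(f"Cluster RAM: {8 * 64} GB")
```

At full 16-bit precision the weights alone would occupy roughly 1.3 terabytes; even the 512 GB cluster can only hold them under aggressive quantization, which is why consumer hardware falls short.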
DeepSeek acknowledges the model's size and less-than-optimal speed, attributing both to current hardware limitations, but remains optimistic that future hardware advances will naturally ease these constraints.
As outlined in a research paper available on the company’s website, DeepSeek strives for the development of artificial general intelligence while adhering to principles of open access and long-term growth.
Emily Johnson for TROIB News