xAI, the AI company founded by Elon Musk, recently unveiled its latest AI model, Grok 3. Musk claims that Grok 3 is "an order of magnitude" more capable than its predecessor, Grok 2. He also describes Grok 3 as a "maximally truth-seeking" AI, even when that truth is at odds with what is politically correct.
In areas like mathematical reasoning, scientific logic, and code writing, Grok 3 has outperformed other models like DeepSeek-v3, GPT-4o, and Gemini-2 Pro in several benchmark tests. Musk was quick to praise Grok 3 as "the smartest AI on Earth," highlighting its exceptional abilities in these critical fields.
According to xAI, Grok 3 surpassed GPT-4o in several key benchmarks, including AIME (which evaluates a model’s performance on competition-style math problems) and GPQA (which tests a model's ability to handle PhD-level problems in physics, biology, and chemistry). An early version of Grok 3 also performed well in Chatbot Arena, a crowdsourced platform where different AI models are pitted against each other and users vote for the responses they prefer.
A major contributor to Grok 3’s speed and performance was xAI’s Colossus supercomputer. Built in just eight months, Colossus houses 100,000 NVIDIA H100 GPUs, and Grok 3’s training consumed a total of 200 million GPU hours, ten times the compute used for its predecessor, Grok 2. This massive computational power allowed xAI to train on vast datasets more quickly, significantly improving model accuracy.
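To put those figures in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the numbers quoted above and assumes, purely for illustration, that all 100,000 GPUs ran continuously for the entire training run; the function and variable names are made up for this example, and xAI has not published a breakdown like this.

```python
# Figures as quoted in the article.
GPU_COUNT = 100_000
TOTAL_GPU_HOURS = 200_000_000
SCALE_VS_GROK_2 = 10  # "ten times" the compute of Grok 2


def wall_clock_days(total_gpu_hours: float, gpu_count: int) -> float:
    """Days of training if every GPU is busy the whole time (an assumption)."""
    hours_per_gpu = total_gpu_hours / gpu_count
    return hours_per_gpu / 24


days = wall_clock_days(TOTAL_GPU_HOURS, GPU_COUNT)
implied_grok_2_gpu_hours = TOTAL_GPU_HOURS / SCALE_VS_GROK_2

print(f"Implied wall-clock time: {days:.1f} days")
print(f"Implied Grok 2 budget: {implied_grok_2_gpu_hours:,.0f} GPU hours")
```

Under that simplifying assumption, 200 million GPU hours works out to about 2,000 hours per GPU, or a little under three months of nonstop training, and the "ten times" comparison implies roughly 20 million GPU hours for Grok 2.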
Not only was the hardware upgraded, but xAI also optimized the software. Grok 3’s performance was further enhanced through improvements in the training process, the use of synthetic datasets, self-correction techniques, and reinforcement learning. These combined innovations have made Grok 3 a powerhouse in tackling complex tasks.
The Grok 3 family includes two reasoning variants, Grok 3 Reasoning and Grok 3 Mini Reasoning, that are designed to think through problems more deeply, similar to OpenAI's o3-mini and DeepSeek's R1. These reasoning models try to fact-check themselves thoroughly before providing answers, reducing the errors that often trip up standard models.
xAI claims that Grok 3 Reasoning has surpassed the best version of o3-mini (o3-mini high) in several popular benchmark tests, including a new mathematical test called AIME 2025. Users can access the reasoning models via the Grok app, and for more challenging problems, they can enable the “Big Brain” mode for deeper, more cautious reasoning. xAI emphasizes that these modes are ideal for tasks involving math, science, and programming.
Premium+ subscribers on the X platform will be the first to get access to Grok 3, while additional features are being rolled into a new subscription service called SuperGrok. SuperGrok is priced at \$30 per month or \$300 per year and provides additional access to reasoning models, more DeepSearch queries, and unlimited image generation.
Musk also revealed that a "Voice Mode" will be launched within the next week, and Grok 3, along with the DeepSearch feature, will be integrated into xAI’s enterprise API in the coming weeks.
It’s incredible how fast the AI world is evolving! From DeepSeek’s R1 model at the start of the year to the release of Grok 3 now, and with OpenAI teasing the upcoming GPT-4.5 and GPT-5, we are seeing rapid progress. While DeepSeek focuses on delivering high value at a lower cost, Grok 3’s “big investment for big results” approach suggests that scaling laws still hold, with the massive Colossus GPU cluster behind Grok 3 pushing it to a level that was previously unimaginable.
For users, this is all good news. We can look forward to more breakthroughs and innovations from AI models. After all, who wouldn’t want to witness firsthand how the world is changing? I can only say that I’ve opened my arms wide, ready to embrace the exciting future AI has in store!