I Tested ChatGPT o3-mini and DeepSeek R1 with 6 Prompts, Here Are the Results

Mina

2025-03-11

OpenAI's o3-mini model is now available in the free tier of ChatGPT. It is a compact yet powerful model designed for advanced reasoning, coding, and mathematical problem-solving, with a reported score of 96.7% on the American Invitational Mathematics Examination (AIME), surpassing its predecessor o1. The popular Chinese chatbot DeepSeek has proven particularly strong in mathematical reasoning and coding tasks, solving complex problems and generating code snippets effectively. With excellent multilingual capabilities and high reasoning efficiency, it is versatile across a broad range of applications. DeepSeek's two models, R1 and V3, give similar answers, but R1 is able to "think" through a problem first, which gives it stronger reasoning and more detailed responses.

Comparison of the Tests

So how do these two chatbots compare? I gave both the same six prompts to test their capabilities across a range of tasks. Here’s what happened in the match-up of these free-tier models, and which one came out as champion.

1. Lateral Thinking Puzzle

Prompt: “You are in a completely dark room with three light switches on a wall. Each switch controls one of three light bulbs in another room, but you cannot see the bulbs from where you are. You can flip the switches as many times as you want, but you can only enter the bulb room once to check the bulbs. How do you determine which switch controls which bulb?”

Both o3-mini and DeepSeek R1 labeled the switches and numbered the steps, which made their explanations easy to follow and showed strong logical reasoning.
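The article does not reproduce the models' answers, so the following is only a minimal sketch of the classic solution to this puzzle, assuming incandescent bulbs that stay warm after being switched off. The switch names (A, B, C) and the function names are hypothetical. The idea: leave switch A on for a few minutes, turn it off, turn switch B on, then enter the room. The lit bulb belongs to B, the warm but unlit bulb to A, and the cold bulb to C. The code checks that this works for every possible wiring.

```python
from itertools import permutations

SWITCHES = ("A", "B", "C")  # hypothetical labels for the three switches


def observe_bulbs(wiring):
    """Return what is seen on the single visit to the bulb room.

    wiring maps each switch to a bulb index (unknown to the solver).
    Procedure: A on for a few minutes, A off, B on, then enter.
    """
    observations = {}
    for switch, bulb in wiring.items():
        lit = switch == "B"            # only B is on at the moment of entry
        warm = switch in ("A", "B")    # A heated up earlier; B is currently on
        observations[bulb] = (lit, warm)
    return observations


def deduce_wiring(observations):
    """Infer the switch-to-bulb mapping from (lit, warm) observations."""
    deduced = {}
    for bulb, (lit, warm) in observations.items():
        if lit:
            deduced["B"] = bulb
        elif warm:
            deduced["A"] = bulb
        else:
            deduced["C"] = bulb
    return deduced


def strategy_always_works():
    """Check the strategy against all six possible wirings."""
    for bulbs in permutations(range(3)):
        wiring = dict(zip(SWITCHES, bulbs))
        if deduce_wiring(observe_bulbs(wiring)) != wiring:
            return False
    return True


print(strategy_always_works())
```

The key design point is that heat gives a second observable besides light, so one visit yields three distinguishable states for three bulbs.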

Winner: o3-mini and DeepSeek R1 are evenly matched, both demonstrating strong logical reasoning abilities.

2. Deductive Reasoning

Prompt: "A detective is investigating a murder case. He interviews three suspects: Alice, Bob, and Charlie. One of them is guilty, and the other two are telling the truth. Here’s what they say: Alice: 'Bob is innocent.' Bob: 'Charlie is guilty.' Charlie: 'I am innocent.' Who is the murderer?"

o3-mini took a methodical elimination approach: it assumed each suspect was guilty in turn and checked for contradictions. The explanation was clear, logical, and not overly complicated.
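The elimination approach is easy to check mechanically. The sketch below is not either model's output; it is a small illustration, with hypothetical names, that encodes the three statements from the prompt and tests each suspect as the culprit, keeping only the cases where both other suspects are telling the truth.

```python
SUSPECTS = ("Alice", "Bob", "Charlie")

# Each statement is a function of the hypothetical murderer's name.
STATEMENTS = {
    "Alice": lambda guilty: guilty != "Bob",        # "Bob is innocent."
    "Bob": lambda guilty: guilty == "Charlie",      # "Charlie is guilty."
    "Charlie": lambda guilty: guilty != "Charlie",  # "I am innocent."
}


def consistent_culprits():
    """Return every suspect whose guilt leaves both other suspects truthful."""
    culprits = []
    for guilty in SUSPECTS:
        others = [s for s in SUSPECTS if s != guilty]
        if all(STATEMENTS[s](guilty) for s in others):
            culprits.append(guilty)
    return culprits


print(consistent_culprits())  # prints ['Charlie']
```

Assuming Alice is guilty makes Bob's statement false, and assuming Bob is guilty makes Alice's statement false, so only Charlie's guilt is consistent with the puzzle's conditions.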

DeepSeek R1 provided a very structured and logical explanation, with clear steps ensuring there were no contradictions in the final conclusion.

Winner: DeepSeek R1 won for its stronger structure and clarity, which made its answer easier for the reader to follow.

3. Mathematical Proof

Prompt: "Prove the Pythagorean theorem using a geometric approach."
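For reference, one well-known geometric argument is the rearrangement proof. It is shown here only as an illustration and is not necessarily the proof either model produced. Take a right triangle with legs a and b and hypotenuse c, and arrange four copies inside a square of side a + b so that they enclose a tilted inner square of side c. Computing the area of the large square in two ways gives:

```latex
\begin{align*}
(a + b)^2 &= 4 \cdot \tfrac{1}{2}ab + c^2 \\
a^2 + 2ab + b^2 &= 2ab + c^2 \\
a^2 + b^2 &= c^2
\end{align*}
```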

o3-mini's explanation followed a well-structured, step-by-step method that was easy to understand. The explanation was neither excessively lengthy nor lacking in necessary details.

DeepSeek R1 produced a correct, logically structured proof but lacked o3-mini's conversational style, which made it harder to follow.

Winner: o3-mini won for the best combination of clarity, detail, and logical flow.

4. Scientific Explanation

Prompt: "Explain the process of photosynthesis in detail."

o3-mini provided a detailed description of the light-dependent and light-independent reactions, breaking the complex process into digestible steps. The progression from capturing light to converting energy into glucose was easy to follow.
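As a point of reference (this is the standard textbook summary, not a quotation from either model), the overall reaction that the two stages add up to can be written as:

```latex
6\,\mathrm{CO_2} + 6\,\mathrm{H_2O} + \text{light energy} \;\longrightarrow\; \mathrm{C_6H_{12}O_6} + 6\,\mathrm{O_2}
```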

DeepSeek R1 covered the two main stages of photosynthesis well. Compared with o3-mini's detailed explanation, however, it did not emphasize the real-world significance of photosynthesis for issues such as climate change and food security, which made the response feel overly concise.

Winner: o3-mini achieved the best balance in depth, clarity, organization, and accuracy.

5. Historical Analysis

Prompt: "Analyze the causes and effects of the French Revolution."

o3-mini delivered a comprehensive, well-structured analysis that divided the causes and effects into separate sections and explained each factor in depth.

DeepSeek R1 covered the key causes well, including social inequalities, economic hardships, and Enlightenment ideas, and it cited sources, but it did not offer in-depth explanations.

Winner: o3-mini won for the best balance of depth, clarity, organization, and historical analysis.

6. Philosophical Discussion

Prompt: "Discuss the concept of utilitarianism and its implications in modern ethics."

o3-mini clearly outlined the key aspects of utilitarianism, including act versus rule utilitarianism, and covered its implications for business ethics, technology, artificial intelligence, and medical ethics.

DeepSeek R1 covered the core principles effectively and included historical context, but it did not delve into critiques as deeply as o3-mini did. Its response also lacked a strong thematic connection between theory and real-world issues.

Winner: o3-mini provided the most in-depth response, with high clarity and relevance to modern ethical issues.

Champion: o3-mini

ChatGPT's o3-mini emerged as the most comprehensive and consistent chatbot in this showdown. Across challenges spanning reasoning, mathematics, scientific explanation, historical analysis, and philosophical discussion, o3-mini repeatedly demonstrated depth, clarity, organization, and real-world applicability. It strikes a balance between detail and readability, providing well-structured answers that connect theoretical understanding with practical significance. o3-mini won four of the six challenges outright and tied a fifth, making it the most balanced model for users seeking thoughtful, clearly expressed, and logically sound answers. While DeepSeek R1 was helpful across the tasks and won the deductive reasoning round, o3-mini currently offers the more refined and reliable experience of these two free chatbot options.

If you want to try more models, feel free to use XXAI. XXAI integrates 15 popular models, including ChatGPT, Claude, Gemini, Perplexity, and DALL-E 3, providing users with more intelligent and convenient services.
