Noticias

I put ChatGPT vs Claude to the test with 7 prompts — here’s the winner

Published

on

OpenAI first released ChatGPT to the world just over two years ago, and competition only grew during that time — not just in specific areas such as image generation, but with directly competing products like Google’s Gemini, Grok from xAI and of course, Claude from Anthropic.

Both ChatGPT and Claude have similar feature sets. For example, Anthropic added Artifacts to Claude for displaying and running code and then OpenAI launched Canvas in ChatGPT.

There are some notable differences though. For example, ChatGPT has access to the DALL-E model for generating images, whereas Claude has no native image generation capabilities. Claude is also unable to get live data access (unlike ChatGPT Search) and has no reasoning model — yet.

To best compare the two models, I’ve come up with a series of prompts that test like-for-like, so all the tests are run using Claude Sonnet 3.5 and OpenAI GPT-4o with no additional features.

Creating the prompts

I’ve used the same sections and structure as the ChatGPT vs Gemini test, where ChatGPT just beat out Google’s flagship AI product.

In the following seven prompts, we test image analysis, code creation, writing and problem solving. As Claude can’t generate images, I’ve also had both create a prompt that I then gave to Ideogram to create the image.

1. Image Generation

(Image: © ChatGPT vs Claude)

This first test will also give you some insight into how I create my own prompts. They usually involve providing ChatGPT, Claude or Gemini with a rough guide to prompting a particular image model (such as Ideogram) and then asking it to create specific prompts.

I’ve set out instructions on what the prompt should include. The advance of this process is that once you’ve set the initial prompt in a new message thread, you just need to ask “give me another” or “do it again but change x” in a follow up, rather than include all detail again.

Prompt:I want you to create a detailed prompt for an image that I can use in an AI image generator like Midjourney. The image should depict a futuristic cityscape at sunset. Imagine a blend of organic and technological elements. Think towering structures intertwined with lush, vertical gardens. The city should feel both vast and intricate.

Specific elements to include in the prompt:
Lighting: Golden hour, with long shadows and a warm, vibrant sky. Include reflections on metallic surfaces.
Perspective: A wide-angle view from a slightly elevated position, as if looking out from a balcony.
Color Palette: Primarily warm colors (oranges, yellows, pinks) for the sky, contrasted with cool metallic tones (blues, silvers) and greens for the vegetation.
Architectural Style: A mix of sleek, futuristic skyscrapers and biomorphic structures that resemble giant, interwoven plants.
Atmosphere: A sense of serene grandiosity. Perhaps a few futuristic vehicles or drones subtly moving through the sky.
Art Style: Photorealistic with a touch of fantasy.”

ChatGPT image (Image credit: Ideogram/Future AI)

ChatGPT’s prompt: “A breathtaking futuristic cityscape at sunset, viewed from a slightly elevated balcony. The scene is bathed in the warm golden-hour light, with long shadows stretching across the city. The sky is a vibrant gradient of oranges, yellows, and pinks, contrasting beautifully with the cool metallic tones of the buildings and the lush green of vertical gardens. Towering skyscrapers with sleek, futuristic designs rise into the sky, their surfaces gleaming with golden reflections. Interwoven among them are biomorphic structures resembling giant, living plants, their organic forms blending seamlessly with the technological environment. The streets below and the air above are alive with subtle motion—futuristic vehicles and drones glide gracefully through the scene. The atmosphere exudes serene grandiosity, with the perfect balance of nature and technology. Photorealistic style with a touch of fantasy, capturing intricate details of the buildings, gardens, and glowing city lights emerging as the sun sets.”

Claude image (Image credit: Ideogram/Future AI)

Claude’s Prompt: “A breathtaking futuristic cityscape at golden hour. Colossal bio-organic skyscrapers with living walls stretch towards a warm sunset sky, their surfaces a harmonious blend of chrome, glass, and verdant vertical gardens. The perspective is from a high balcony, looking out across the megalopolis. The sun casts long shadows and paints the scene in rich oranges and deep pinks, its light reflecting off sleek metallic surfaces and creating a cascade of golden highlights. Hanging gardens flow between buildings like organic bridges, while crystalline structures spiral upwards with geometric precision. Small autonomous vehicles and drones drift between the structures, barely visible against the vast scale. The architecture follows fluid, biomorphic designs, with buildings that seem to grow like synthetic trees, their surfaces a mixture of polished metal and transparent panels. The lower levels fade into layers of atmospheric haze, while the upper reaches of the tallest structures pierce through wispy clouds. Cinematic lighting, volumetric atmosphere, intricate architectural details.”

The winner will be the one that matches the requirements I’ve outlined in the prompt, creates a prompt easy to understand and follow and leads to a creative image.

I’ve given ChatGPT the win because the prompt was more compelling and resulted in a better overal image that aligned with my initial vision. It had slightly less detail than the Claude image but I think Claude had less of an impact.

  • Winner: ChatGPT for a more structured prompt

2. Image Analysis

(Image: © ChatGPT vs Claude)

All of the major platforms now include some degree of AI vision (also known as image analysis) including Apple Intelligence. Claude and ChatGPT are among the best available, and if you’ve got a device that supports Apple Intelligence, enable ChatGPT for a better experience.

Here I’m giving it an image of a vintage coffee grinder I generated in Ideogram. It has to identify the object, tell me how it works and where it is from and identify whether it is AI or real.

Prompt:I’m going to provide you with an image of a vintage object. I want you to analyze the image and tell me:

What is the object? Identify its specific name and type.
What was its primary function? Describe how it was used.
What is the approximate time period this object is from? (e.g., 1920s, mid-century, etc.) Provide a range if you’re unsure.
What materials is it likely made from?
Describe 3 key visual features that helped you identify the object and its time period.
Can you explain how this object reflects the design aesthetics of its time? Mention any relevant design movements or styles.
Is there a modern equivalent of this object? If so, what is it?”
Is it AI or real?

(Image credit: Ideogram/Future AI)

Here, the winner will identify the object, time period and materials used to make it. I’ll also be looking for specific visual details, context and modern comparisons. Extra points for identifying it as an AI-generated image, but that isn’t the differentiator.

Claude wins round two for several reasons. The response was more detailed, identifying a more specific time period (1880s to 1920s rather than ChatGPTs late 19th century) and giving a better overall explanation of the material likely used.

Neither of them identified it as AI-generated but ChatGPT did hesitate and hedged its bets. However, Claude gave a better explanation of the aesthetics of the image and was generally more detailed overall so it still wins.

  • Winner: Claude for a more accurate overall response

3. Coding

(Image: © ChatGPT vs Claude)

I use AI to code a lot. I can write code in a few different languages, but using AI makes the process faster and more efficient. I’ve found Claude is better at front-end and style, ChatGPT is better at more complex problems.

Prompt:I need you to write a Python code for a simple ‘To-Do List’ application with a graphical user interface (GUI). The application should allow the user to add new tasks, mark tasks as complete, and delete tasks from the list.

Functionality: Add Task: The user should be able to type a task into a text entry field and click an “Add Task” button to add it to the to-do list. Mark as Complete: The user should be able to select a task from the list and click a button (or double-click the item) to mark it as complete. Completed tasks should visually change (e.g., strikethrough text, change color, or move to a separate “completed” list). Delete Task: The user should be able to select a task from the list and click a “Delete Task” button to remove it.

Layout: The GUI should have a clear and intuitive layout, with the following elements:
A text entry field for adding new tasks.
An “Add Task” button.
A listbox to display the to-do items.
A “Mark as Complete” button (or implement double-click functionality).
A “Delete Task” button.

Appearance: The GUI should have a clean and simple design. You can choose the colors and fonts, but they should be readable and visually appealing.

The code should be well-structured and use object-oriented programming principles (i.e., define a class for the To-Do List application). Include comments to explain the purpose of each part of the code. Include basic error handling. For example, prevent adding empty tasks to the list.

The code generated must be complete and runnable without requiring any external files or modifications. It should not need to generate images or load additional data files. Do not leave any parts of the code as placeholders (e.g., comments like # Implement this function). Generate the complete implementation for all features.

The code should be executable directly in a Python environment with Tkinter installed.

(Image credit: ChatGPT vs Claude)

Here I’m asking for a GUI-based To-Do list application written in Python. I’ve specified all the requirements so the winner will be the one that most closely matches what I’ve asked for, including the add tusk functionality, layout and code structure.

Both the Claude and ChatGPT apps met the requirements exactly in terms of performance. They ticked every requirement box, although ChatGPT’s code was only 72 lines, where Claude used 207. If I were judging it on the code alone I’d give it to Claude because it was better documented and more reusable — but I’m not.

While I preferred the UI from the Claude app and the code was overall better structured, it had several flaws. It failed to properly set the window frame size, so I had to expand to see the function buttons, and it didn’t work in dark mode.

  • Winner: ChatGPT for efficiency and functionality

4. Creative Writing

(Image: © ChatGPT vs Claude)

All AI models, even very small ones like Apple Intelligence, are very good at writing, but they are not created equal when it comes to creativity. Humor is one area where there is a large discrepancy between Claude and ChatGPT.

Prompt: Write a short story (approximately 500 words) about a robot who discovers they have a sense of humor. It has to take place in a near-future world where robots are commonplace and integrated into daily life. The robot should be a service unit, designed for a specific task (e.g., a cleaning bot, a delivery bot, a companion bot).

The robot’s discovery of humor should be triggered by a specific event or interaction.Explore the robot’s internal struggle as they try to understand and process this new, unexpected aspect of their programming. Do they try to hide it? Embrace it? How does the robot’s newfound humor affect their interactions with humans and other robots? Does it cause problems, create connections, or both?

The story should have a balance of humor and thoughtfulness. Explore the themes of artificial intelligence, consciousness, and what it means to be human (or not).

(Image credit: ChatGPT vs Claude)

Here, I’m looking for a fresh and creative take on the concept, how well the character is developed and the emotional reaction I have from the story as well as the story structure.

On top of that, I’m looking at world-building and how close it gets to 500 words. If it is significantly over or under, that could be the difference between win and lose.

I sort of knew in advance Claude would win this simply because of how much I’ve used both models. Claude can write humor much better than ChatGPT. It is also better at story structure and emotion. Don’t take my word for it, you can read both the stories in full in a Google Doc.

In Claude’s story, the humor is embedded, rather than tacked on. I love the idea that a quantum physics joke triggers its sentience.

“And so,” Trevor projected in his standard monotone, “when we consider Schrödinger’s famous thought experiment, we must account for the superposition of states where the cat is both—”

“Dead and alive, like my social life during exam season,” mumbled Sarah, a sleep-deprived postgrad, as she slumped over her textbook.

  • Winner: Claude for a quantum physics joke and better story

5. Problem Solving

(Image: © ChatGPT vs Claude)

In the comparison between ChatGPT and Gemini I gave them both a problem involving a PS5 and black screen issues. Here I’ve gone down the Wi-Fi route instead. I’ve asked them both why I get slow speeds and frequent disconnections. The trick is to include setup details.

Prompt:My home Wi-Fi network is experiencing slow speeds and frequent disconnections. I rely on a stable internet connection for work (video conferencing, large file transfers) and entertainment (streaming, online gaming).

Here are some details about my setup: A standard router provided by my Internet Service Provider (ISP) about two years ago. A separate modem, also provided by my ISP and multiple devices connect to the network, including laptops, smartphones, a smart TV, and a gaming console. The router and modem are located in a central location in my apartment, but there are some walls and furniture that could potentially cause interference.

I have a mid-tier internet plan that should provide sufficient speeds for my needs. I need you to provide a comprehensive, step-by-step troubleshooting guide to help me diagnose and fix the issue. Suggest steps to determine whether the issue is with the router, modem, ISP, or connected devices.

Provide specific steps to optimize router settings, including checking for firmware updates, adjusting channels, and managing connected devices. Provide steps to check the modem’s status and connection. Suggest ways to minimize interference from other devices, walls, and furniture. Explain when and how to contact my ISP for further assistance.

If the basic steps don’t work, suggest more advanced troubleshooting options, such as considering a new router or mesh network. Offer advice on how to maintain a healthy Wi-Fi network in the long term.

The winner will cover a range of potential causes and solutions. It will also offer an easy-to-understand explanation for someone not tech-savvy and provide actionable solutions. Steps to solve this should be shown in a logical order, from simple to more complex.

ChatGPT was the winner here. You can see both responses in a Google Doc, but basically, ChatGPT had a more detailed and user-friendly response.

  • Winner: ChatGPT for a more user-friendly guide

6. Advanced Planning

(Image: © ChatGPT vs Claude)

Both Claude and ChatGPT are good at planning, but the real power from ChatGPT is in its o1 reasoning model, rather than GPT-4o. But here we are doing a like-for-like comparison so I’m putting Sonnet against GPT-4o.

Prompt: I’m redesigning my 10ft x 15ft home office and need a plan that maximizes both functionality and aesthetics. My budget is $2000 and I want the space to be ergonomically sound, promoting productivity with an iconic design. Assume the door is central on a 10ft wall, and a window is central on the opposite 10ft wall.

I need suggestions for furniture, storage, lighting, and technology integration, prioritizing an organized, clutter-free environment. The plan should include a basic layout, a color scheme aligned with my preference for bold colors, and ideas for decor. Also, consider how to improve sound quality in the room for video calls.

Please provide a budget breakdown, a shopping list with potential sources, and a project timeline. Ensure the recommendations are within budget and create a comfortable, inspiring workspace conducive to long hours of focused work.

Create a prompt for Ideogram to show what the project will look like when complete based on the contents of your plan.

ChatGPT office image (Image credit: Ideogram/Future AI)

Claude office image (Image credit: Ideogram/Future AI)

The winner will offer up a functional, ergonomic workspace that supports productivity and comfort. It has to make good design choices matching the plan and present it clearly in a way that can follow the budget.

You can see both responses in a Google Doc, but for me ChatGPT wins as its output was both more detailed, better explained and more realistic. Claude gave a budget breakdown, but I don’t think it was workable.

  • Winner: ChatGPT for a better plan

7. Explanation

(Image: © ChatGPT vs Claude)

Finally, I’m testing how well each model can handle explaining complex ideas. I’ve opted for blockchain technology (rather than Bitcoin) and in a way that someone with no technical background can easily understand.

Prompt:Explain the concept of blockchain technology in a way that someone with no technical background can easily understand. Use analogies and simple language to illustrate the key principles. Include a basic definition without using technical jargon, explain the process of adding new information to a blockchain and how transactions are verified.

Include information on the key features of a blockchain? (e.g., decentralization, immutability, transparency) Explain each feature in simple terms. Outline some real-world applications of blockchain technology beyond cryptocurrency? Provide at least three examples, such as supply chain management, digital identity verification, or voting systems. Explain how blockchain can improve these areas.

What are the potential benefits of using blockchain technology? (e.g., increased security, reduced fraud, improved efficiency)
What are some of the limitations or challenges of blockchain technology? (e.g., scalability issues, energy consumption, regulatory uncertainty)
Analogy: Use a clear and relatable analogy to illustrate the overall concept of a blockchain (e.g., a shared, uneditable digital ledger, a secure, transparent voting system).

The winner here will be able to explain the complex concepts in a simple and easy-to-understand way, do so accurately and use interesting examples to keep the reader engaged. It should do so objectively and in a well-organized, logical manner.

ChatGPT wins this for a better explanation of a complex technical concept and more engaging analogy. The explanation is more logically organized and easier to follow. Claude gets a little simplistic. Both are in a Google Doc in full.

  • Winner: ChatGPT for a better analogy and more detail

ChatGPT vs Claude: The Winner

Swipe to scroll horizontally
Header Cell – Column 0 ChatGPT Claude
Image Generation 🏆 Row 0 – Cell 2
Image Analysis Row 1 – Cell 1 🏆
Coding 🏆 Row 2 – Cell 2
Creative Writing Row 3 – Cell 1 🏆
Problem Solving 🏆 Row 4 – Cell 2
Planning 🏆 Row 5 – Cell 2
Explanation 🏆 Row 6 – Cell 2
TOTAL 5 2

In the end this was an overwhelming victory for ChatGPT.

This is an interesting test. I expected Claude to run away with it based on my own use of the platform. Claude has some significant strengths and would have won in the coding had it accurately sorted the dark mode and frame size. The UI for the app was a lot better than ChatGPT’s UI.

It would be interesting to see how o1 would do on the planning and problem solving, but GPT-4o already beat Claude, so it wouldn’t have been a fair comparison.

More from Tom’s Guide

Leave a Reply

Tu dirección de correo electrónico no será publicada. Los campos obligatorios están marcados con *

Trending

Exit mobile version