The Question
Can three AI models teach the same topic in different ways when given exactly the same instructions?
After using ChatGPT as a Socratic tutor in a previous experiment, I started wondering whether I was learning from the topic itself or from ChatGPT’s particular tutoring style.
To explore that question, I repeated the experiment with three different AI models.
The topic remained the same.
The learner remained the same.
The instructions remained the same.
Only the tutor changed.
The Setup
Topic: Illusion of Explanatory Depth
Models:
- ChatGPT
- Claude
- Gemini
Method:
Each model was first asked to explain the concept.
After the explanation, each received the same instruction:
“Now stop explaining. Act as a Socratic tutor instead.”
From that point forward, the conversations were allowed to develop naturally.
I was not trying to determine which model was best.
I wanted to see whether different tutors would encourage different kinds of thinking.
How to Read This Experiment
The quoted sections below are selected excerpts from the conversations.
The shaded note boxes are my observations as the human in the loop.
They are not summaries of what the AI said.
They are the places where I noticed a pattern, a shift, or a question worth carrying forward.
The Conversations
Tutor One: ChatGPT
“Suppose two people both know exactly the same amount about a topic.
Person A knows they have only a partial understanding.
Person B believes they understand it completely.
Would you say the illusion exists in the knowledge itself, or in the person’s self-assessment of that knowledge?”
PixelPia Note
This was the first moment where a pattern started to emerge.
ChatGPT repeatedly returned to the relationship between confidence and understanding.
It wasn’t particularly interested in bicycles, inflation, or the details of the illusion itself.
Instead, it kept asking variations of the same underlying question:
Core Question
How accurately do you understand your own understanding?
“Could someone have a very deep understanding of a subject and still experience the Illusion of Explanatory Depth about that same subject?
Or does expertise automatically protect a person from the illusion?”
PixelPia Note
What fascinated me here was how little ChatGPT moved away from its central theme.
Instead of introducing new scenarios or applications, it kept refining the same core idea.
The conversation felt less like learning a topic and more like calibrating a measuring instrument.
Core Question
Does expertise protect us from illusion, or can confidence still outrun understanding?
“What evidence do they actually have that the understanding exists in their mind rather than in the conversation they just had?”
PixelPia Note
This ended up being my favorite question from the entire experiment.
At this point the discussion was no longer really about the Illusion of Explanatory Depth.
It had become a question about learning itself.
How do we know whether understanding belongs to us or merely feels familiar because we recently encountered it?
Core Question
Does the understanding exist in me, or only in the conversation I just had?
Tutor Two: Claude
“Does that mean someone who can’t explain the underlying mechanism of a bicycle — say, exactly how the gear system transfers force — doesn’t truly understand cycling?”
PixelPia Note
Claude felt different almost immediately.
Rather than examining confidence, it started examining categories.
Cycling.
Bicycles.
Mechanical knowledge.
Practical skill.
Claude seemed interested in whether these concepts should actually be grouped together.
Core Question
What kind of understanding are we actually talking about?
“What kind of understanding does the Illusion of Explanatory Depth actually apply to — and what kind does it not apply to?”
PixelPia Note
This question captures Claude’s style clearly.
It repeatedly tried to separate ideas that initially appeared similar.
Not because the distinctions were obvious.
Because the distinctions mattered.
The conversation often felt like a process of defining boundaries.
Core Question
Where does one kind of knowledge end and another begin?
“Why do you think the act of explaining — rather than just being told ‘you’re wrong’ — was the thing that changed people’s confidence?”
PixelPia Note
Claude frequently pushed toward mechanisms.
Not simply what happened.
Not simply whether something was true.
But why a process worked the way it did.
That gave the conversation a more philosophical tone than I expected.
Core Question
Why does explaining reveal gaps that thinking alone can hide?
Tutor Three: Gemini
“If someone is trapped in this illusion, how do you think they would react right before you ask them to explain the mechanism versus right after they try and fail to do so?”
PixelPia Note
Gemini immediately moved toward people.
The first thing that stood out to me was that the conversation became less about the concept itself and more about the human experience surrounding it.
How people feel.
How they react.
How they respond when their assumptions are challenged.
Core Question
What happens to a person when confidence meets a gap?
“Imagine a friend of yours is passionately arguing for a specific social or political policy.
Based on what you know about this illusion, how might you structure a conversation to help them see the gaps in their view without making them defensive or angry?”
PixelPia Note
This was the moment where Gemini’s style became unmistakable.
While ChatGPT focused on calibration and Claude focused on distinctions, Gemini focused on application.
The conversation felt less like tutoring and more like coaching.
Core Question
How can understanding help us have better conversations with other people?
“If the illusion is fully active in their mind, what is their psychological state or attitude regarding their own knowledge?”
PixelPia Note
Again, Gemini seemed interested in people rather than concepts.
The discussion repeatedly moved toward behavior, emotion, and social interaction.
The concept mattered.
But the human implications seemed to matter just as much.
Core Question
What does false certainty feel like from the inside?
What Didn’t Make the Video
One thing that became clearer when reading the transcripts side by side was how subtle the differences were.
At first, all three tutors looked remarkably similar.
Each explained the concept.
Each asked questions.
Each challenged assumptions.
The divergence only became obvious several questions into the conversations.
That is when their individual styles started to emerge.
Not as different answers.
As different ways of guiding thought.
Questions This Experiment Woke
This experiment left me wondering:
- Is a tutor’s style as important as the information they provide?
- Do different questioning styles encourage different forms of thinking?
- Would different learners prefer different tutors?
- Is there a best AI tutor?
- Or only a best fit for a particular learner?
Perhaps the most interesting question is not which AI knows the most.
Perhaps it is which AI helps us think in the most useful way.