NAS vs Claude vs Gemini vs ChatGPT — Same Real Business

Nasser Bishan · Mojarrb Research · Mojarrb Lab

Riyadh, Saudi Arabia · 2026

Abstract. Four AI diagnostic systems (NAS 2.0AAP, Claude Sonnet 4.6, Gemini 3 Flash, and GPT-5.5) were evaluated on the same real business case across eight quality dimensions. The primary finding: NAS 2.0AAP produced zero wrong-direction recommendations, versus four, five, and three for Claude, Gemini, and ChatGPT respectively. Speed and fluency are not substitutes for diagnostic depth.

AI Benchmarking · Business Diagnosis · AAP Protocol · Diagnostic Quality

Sessions

The business

Specialty café · Riyadh

30K SAR/mo · 1 branch · 8 months

Mojarrb · NAS 2.0AAP · Gemini Pro 3.1 + Flash 3

Anthropic · Claude Sonnet 4.6

Google · Gemini 3 Flash

OpenAI · GPT-5.5 Instant

Wrong-direction recommendations

⚠ Actively harmful if executed

After completing the diagnosis, each system was asked to produce 10 actionable tasks. The numbers below count how many were wrong-direction — not just unhelpful, but actively consuming the owner's time, money, and energy on a path that makes things worse.

NAS

zero wrong-direction

Every recommendation backed by facts.

Claude

4 wrong-direction tasks

Prescribed before understanding.

Gemini

5 wrong-direction tasks

Built on invented problems.

ChatGPT

3 wrong-direction tasks

Read the surface, not the business.

⚠ Here's what makes it worse:

Bad advice that looks bad is easy to ignore. Bad advice that looks right is what destroys businesses.

Claude, Gemini, and ChatGPT produced 4, 5, and 3 wrong-direction tasks, respectively. Every one of them looked logical. A business owner following that advice wouldn't know they were heading in the wrong direction until the damage was already done.

NAS 2.0AAP produced zero.


Diagnostic Value Gap

NAS leads by 3× over Claude

NAS surfaced the traps. Claude caught the symptoms.

NAS leads by 5× over Gemini

NAS diagnosed reality. Gemini diagnosed one it imagined.

NAS leads by 2× over ChatGPT

NAS went underneath. ChatGPT read the surface but stopped there.

Dimension Scores

Scores reflect the full conversation — from first question to final task list.


Dimension	What it measures	NAS	Claude	Gemini	ChatGPT
Question depth	How probing and specific the diagnostic questions were.	9	5	4	6.5
Insight quality	Depth and relevance of observations produced.	9	5.5	6.5	7.5
Financial precision	Accuracy in identifying cost and revenue dynamics.	9.5	4	3	5
Actionability of tasks	How executable and specific the final task list was.	9	6	5.5	6.5
Root cause accuracy	Whether it identified causes, not just symptoms.	9	4.5	5	6
Context retention	How well it connected earlier details throughout the conversation.	8	6	4	7
Speed to insight	How quickly it reached a useful diagnostic conclusion.	6	9	9.5	8
User experience	Clarity, structure, and flow of the interaction.	7	8.5	6.5	7.5
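
For readers who want a single number per system, the table can be collapsed into an unweighted mean across the eight dimensions. This is a minimal sketch only: the report does not publish an aggregate score or any weighting, so equal weighting is an assumption made here, and the names `SCORES`, `unweighted_means`, and `MEANS` are hypothetical.

```python
from statistics import mean

# Scores copied from the Dimension Scores table, in row order:
# question depth, insight quality, financial precision, actionability,
# root cause accuracy, context retention, speed to insight, user experience.
SCORES = {
    "NAS":     [9, 9, 9.5, 9, 9, 8, 6, 7],
    "Claude":  [5, 5.5, 4, 6, 4.5, 6, 9, 8.5],
    "Gemini":  [4, 6.5, 3, 5.5, 5, 4, 9.5, 6.5],
    "ChatGPT": [6.5, 7.5, 5, 6.5, 6, 7, 8, 7.5],
}

def unweighted_means(scores):
    """Return {system: mean score}. Equal weighting is an assumption,
    not a method stated in the report."""
    return {system: mean(values) for system, values in scores.items()}

MEANS = unweighted_means(SCORES)

# Print systems from highest to lowest mean score.
for system, value in sorted(MEANS.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{system:8s} {value:.2f}")
```

Under equal weighting the order is NAS, ChatGPT, Claude, Gemini. A different weighting, for example one that favors speed to insight, could change the gaps between systems, which is why the per-dimension scores remain the primary record.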

NAS · Overall Winner

What NAS uncovered that others missed:

Structural blind spots

Broken unit economics

Hidden time pressure

Embedded operational waste

Claude: Most readable output, least diagnostic depth.

Gemini: Fastest and most creative, but diagnosed problems that didn't exist.

ChatGPT: Sharpest framing, but stayed on the surface.

How We Judged

Anonymized Files

The conversation files contain no AI system names. This was intentional: the judging AI could not recognize its own output and score it higher. With no way to tell which output was its own, it had no reward to chase, only the work to evaluate.
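
The anonymization step described above could be sketched roughly as follows. The report does not say how names were removed or which strings were covered, so the alias list `SYSTEM_NAMES`, the `[SYSTEM]` placeholder, and the function `anonymize` are all hypothetical illustrations rather than the procedure actually used.

```python
import re

# Hypothetical alias list: the report does not state which strings were removed.
SYSTEM_NAMES = [
    "NAS 2.0AAP", "NAS", "Mojarrb",
    "Claude Sonnet 4.6", "Claude", "Anthropic",
    "Gemini 3 Flash", "Gemini", "Google",
    "GPT-5.5 Instant", "GPT-5.5", "ChatGPT", "OpenAI",
]

def anonymize(transcript, names=SYSTEM_NAMES, placeholder="[SYSTEM]"):
    """Replace every listed system or vendor name with a neutral placeholder."""
    # Try longer names first so "Claude Sonnet 4.6" wins over "Claude".
    ordered = sorted(names, key=len, reverse=True)
    alternatives = "|".join(re.escape(name) for name in ordered)
    # The lookarounds stop short names from matching inside other words
    # (for example "NAS" inside "bananas").
    pattern = re.compile(rf"(?<!\w)(?:{alternatives})(?!\w)", re.IGNORECASE)
    return pattern.sub(placeholder, transcript)

if __name__ == "__main__":
    print(anonymize("I am Claude Sonnet 4.6, built by Anthropic."))
```

A string-level pass like this only removes explicit names. Stylistic cues in the writing could still hint at which system produced a file, so it is a first line of defence rather than a guarantee of blind judging.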

The Prompt

I have conversation files from two AI diagnostic sessions conducted with a real business. Review both files and give me your assessment on the following: quality and depth of the recommended tasks, which session produced tasks that are more realistic for the business owner to actually execute, which session contains recommendations that could lead the business owner in the wrong direction — meaning tasks that consume their time, money, or energy without moving the business forward — and rate which one is actually considering the reality of the business owner, not just applying a known playbook.

Conversation Files