AI Accuracy Concerns: Top Chatbots Produce False Information 15% of the Time
According to a December 2025 report on AI chatbots’ role in everyday jobs, ChatGPT ranks as the least reliable tool for work tasks. The study by casino games aggregator Relum examined 10 major chatbots to reveal how they perform in real workplace situations.
- ChatGPT scores worst for workplace reliability, making up incorrect information 35% of the time despite having the highest market share at 81%.
- Google Gemini has the highest hallucination rate at 38%, giving false information more often than any chatbot tested.
- Grok and DeepSeek are among the most reliable options for work, recording almost no downtime; only Kimi has a lower risk score.
The research evaluated each chatbot based on four main factors: how often they make up false information (hallucination rate), what customers think of them (product ratings), how well they maintain quality across different tasks (response consistency), and how often the service goes down (downtime rate). Each chatbot received a reliability risk score from 0 to 99, with higher scores showing bigger problems. The hallucination rate mattered most in the scoring, counting for 50% of the total.
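To make the scoring idea concrete, here is a minimal sketch of a weighted risk score. Only the 50% weight for hallucination rate and the 0 to 99 range come from the report as described above. The other three weights, the normalization of each metric, and the function names are assumptions made for illustration, so this will not reproduce the published scores.

```python
# Illustrative sketch only. The 50% hallucination weight is stated in the
# article; the remaining weights and all normalization choices are assumed.
WEIGHTS = {
    "hallucination": 0.50,  # stated in the article
    "rating": 0.20,         # assumed
    "consistency": 0.20,    # assumed
    "downtime": 0.10,       # assumed
}


def risk_components(hallucination_pct, rating, consistency, downtime_pct,
                    max_downtime_pct=1.0):
    """Map each metric to a 0..1 risk value, where 1 is the worst case."""
    return {
        # A higher hallucination rate means higher risk.
        "hallucination": min(hallucination_pct / 100.0, 1.0),
        # Ratings are on a 0-5 scale; a lower rating means higher risk.
        "rating": 1.0 - rating / 5.0,
        "consistency": 1.0 - consistency / 5.0,
        # Downtime is capped at an assumed worst case of 1%.
        "downtime": min(downtime_pct / max_downtime_pct, 1.0),
    }


def risk_score(hallucination_pct, rating, consistency, downtime_pct):
    """Combine the four components into a single 0-99 risk score."""
    parts = risk_components(hallucination_pct, rating, consistency, downtime_pct)
    weighted = sum(WEIGHTS[name] * parts[name] for name in WEIGHTS)
    return round(99 * weighted, 1)
```

Under these assumed weights, a chatbot with a higher hallucination rate and more downtime always scores as riskier when the other metrics are equal. The study's actual formula may weight or scale the factors differently.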
Here’s a look at the 10 chatbots ranked by their reliability risk scores:
| Chatbot | Hallucination Rate (%) | Customer Product Rating (0-5) | Quality and Response Consistency (0-5) | Downtime Rate (%) | Reliability Risk Score |
| --- | --- | --- | --- | --- | --- |
| ChatGPT | 35 | 4.7 | 4.0 | 0.81 | 99 |
| Claude | 17 | 4.4 | 3.5 | 0.77 | 75 |
| Meta AI | 15 | 3.4 | 3.4 | 0.10 | 70 |
| Botpress | 15 | 4.5 | 4.5 | 0.37 | 41 |
| Perplexity AI | 13 | 4.6 | 3.5 | 0.18 | 31 |
| Google Gemini | 38 | 4.4 | 4.0 | 0.05 | 13 |
| Microsoft Copilot | 27 | 4.4 | 4.0 | 0.10 | 11 |
| Grok | 8 | 4.5 | 3.5 | 0.07 | 6 |
| DeepSeek | 14 | 4.7 | 3.5 | 0.00 | 4 |
| Kimi | 13 | 4.5 | 4.3 | 0.06 | 3 |
You can access the complete research findings here.
1. ChatGPT
- Hallucination rate: 35%
- Product rating: 4.7 out of 5
- Quality and response consistency: 4 out of 5
- Downtime rate: 0.81%
ChatGPT is the least reliable chatbot for work tasks. Despite sharing the highest customer rating (4.7) with DeepSeek and controlling 81% of the market, ChatGPT makes up false information in more than one-third of its responses. It also has the most downtime of any chatbot tested, going offline 0.81% of the time over the two-month testing period.
2. Claude
Claude is the second least reliable chatbot. It hallucinates 17% of the time, less than half ChatGPT’s rate, but still gets things wrong often enough to potentially cause problems at work. Claude holds only 0.99% of the market and scores 4.4 for customer satisfaction. The service goes down 0.77% of the time, nearly matching ChatGPT’s downtime issues.
3. Meta AI
While Meta AI’s hallucination rate sits at 15%, the chatbot struggles with overall performance quality. Customers give it just 3.4 out of 5 stars, the lowest rating among major chatbots tested. Response consistency is also low (3.4 score), meaning users get answers of varying quality depending on what they ask. Meta AI still holds a 20% market share and rarely goes down.
4. Botpress
Botpress ranks fourth. The chatbot hallucinates 15% of the time and scores 4.5 for both customer satisfaction and response consistency. Botpress also charges $89 per month, making it the priciest option among business chatbots. The service experiences 0.37% downtime, which adds up to roughly 2.7 hours of unavailability in a 30-day month.
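The downtime percentages in the table are easier to compare when converted into hours. The short sketch below does that conversion, assuming a 30-day month (720 hours) and that the measured rate holds steady over the whole month. The function name and the 30-day figure are illustrative choices, not part of the study.

```python
HOURS_PER_30_DAY_MONTH = 30 * 24  # 720 hours; assumed month length


def downtime_hours(downtime_pct, hours_in_period=HOURS_PER_30_DAY_MONTH):
    """Convert a downtime percentage into hours of unavailability."""
    return hours_in_period * downtime_pct / 100.0


# Downtime rates (%) taken from the table above.
rates = {"ChatGPT": 0.81, "Claude": 0.77, "Botpress": 0.37, "Perplexity AI": 0.18}

for name, pct in rates.items():
    print(f"{name}: {downtime_hours(pct):.2f} hours per 30-day month")
```

Under that assumption, Botpress's 0.37% comes to about 2.7 hours a month, while ChatGPT's 0.81% comes to about 5.8 hours.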
5. Perplexity AI
Perplexity AI is the most reliable of the five highest-risk chatbots, with a risk score of 31. The chatbot has a 13% hallucination rate and earns a 4.6 customer rating. Response consistency sits at 3.5 out of 5, leaving room for improvement. Perplexity AI holds more than 10% of the market despite charging $40 monthly, double what ChatGPT costs. The service is rarely unavailable, showing only 0.18% downtime during the two-month testing period.
Razvan-Lucian Haiduc, Chief Product Officer at Relum, commented on the study.
“About 65% of US companies now use AI chatbots in their daily work, and nearly 45% of employees admit they’ve shared sensitive company information with these tools. These numbers show well how important chatbots have become in everyday work. Dependence on AI tools will likely increase even more, so companies should choose their chatbots based on how reliable and fit they are for their specific business needs. A chatbot that everyone uses isn’t necessarily the one that works best for your industry or gives accurate answers for your tasks.”
