AI Accuracy Concerns: Top Chatbots Produce False Information 15% of the Time
According to a December 2025 report on AI chatbots’ role in everyday jobs, ChatGPT ranks as the least reliable tool for work tasks. The study by casino games aggregator Relum examined 10 major chatbots to reveal how they perform in real workplace situations.
- ChatGPT scores worst for workplace reliability, making up incorrect information 35% of the time despite having the highest market share at 81%.
- Google Gemini has the highest hallucination rate at 38%, giving false information more often than any chatbot tested.
- Grok and DeepSeek are among the most reliable options for work, recording almost no downtime.
The research evaluated each chatbot on four main factors: how often it makes up false information (hallucination rate), what customers think of it (product ratings), how well it maintains quality across different tasks (response consistency), and how often the service goes down (downtime rate). Each chatbot received a reliability risk score from 0 to 99, with higher scores indicating lower reliability. The hallucination rate mattered most in the scoring, counting for 50% of the total.
Here’s a look at the 10 chatbots ranked by their reliability risk scores:
| Chatbot | Hallucination Rate (%) | Customer Product Rating (0-5) | Quality and Response Consistency (0-5) | Downtime Rate (%) | Reliability Risk Score (0-99) |
| --- | --- | --- | --- | --- | --- |
| ChatGPT | 35 | 4.7 | 4.0 | 0.81 | 99 |
| Claude | 17 | 4.4 | 3.5 | 0.77 | 75 |
| Meta AI | 15 | 3.4 | 3.4 | 0.10 | 70 |
| Botpress | 15 | 4.5 | 4.5 | 0.37 | 41 |
| Perplexity AI | 13 | 4.6 | 3.5 | 0.18 | 31 |
| Google Gemini | 38 | 4.4 | 4.0 | 0.05 | 13 |
| Microsoft Copilot | 27 | 4.4 | 4.0 | 0.10 | 11 |
| Grok | 8 | 4.5 | 3.5 | 0.07 | 6 |
| DeepSeek | 14 | 4.7 | 3.5 | 0.00 | 4 |
| Kimi | 13 | 4.5 | 4.3 | 0.06 | 3 |
You can access the complete research findings here.
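The scoring approach described above can be illustrated with a small weighted-sum sketch. Only the 50% weight on hallucination rate comes from the study; the even split of the remaining 50% across the other three factors, the min-max normalization, and all names in the code (`Chatbot`, `WEIGHTS`, `scaled_risk`, `risk_scores`) are assumptions made for illustration. Because the study's actual formula is not given here, this sketch is not expected to reproduce the published scores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chatbot:
    name: str
    hallucination_pct: float  # higher is worse
    rating: float             # 0-5, higher is better
    consistency: float        # 0-5, higher is better
    downtime_pct: float       # higher is worse


# Only the 0.5 hallucination weight comes from the study.
# The even three-way split of the remainder is an assumption.
WEIGHTS = {
    "hallucination": 0.5,
    "rating": 0.5 / 3,
    "consistency": 0.5 / 3,
    "downtime": 0.5 / 3,
}


def scaled_risk(value, low, high, higher_is_worse=True):
    """Min-max scale a metric into a 0-1 risk; 0.0 if there is no spread."""
    if high == low:
        return 0.0
    fraction = (value - low) / (high - low)
    return fraction if higher_is_worse else 1 - fraction


def risk_scores(bots):
    """Return {name: risk score on a 0-99 scale}; higher means riskier."""
    def bounds(attr):
        values = [getattr(b, attr) for b in bots]
        return min(values), max(values)

    h = bounds("hallucination_pct")
    r = bounds("rating")
    c = bounds("consistency")
    d = bounds("downtime_pct")

    scores = {}
    for b in bots:
        parts = {
            "hallucination": scaled_risk(b.hallucination_pct, *h),
            "rating": scaled_risk(b.rating, *r, higher_is_worse=False),
            "consistency": scaled_risk(b.consistency, *c, higher_is_worse=False),
            "downtime": scaled_risk(b.downtime_pct, *d),
        }
        scores[b.name] = 99 * sum(WEIGHTS[k] * v for k, v in parts.items())
    return scores


# Three rows taken from the study's table.
SAMPLE = [
    Chatbot("ChatGPT", 35, 4.7, 4.0, 0.81),
    Chatbot("Grok", 8, 4.5, 3.5, 0.07),
    Chatbot("Kimi", 13, 4.5, 4.3, 0.06),
]

for name, score in sorted(risk_scores(SAMPLE).items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.1f}")
```

Min-max scaling makes every score relative to the set of chatbots being compared, so adding or removing a chatbot changes the others' scores. A scheme with fixed reference ranges would avoid that, at the cost of having to choose the ranges.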
1. ChatGPT
- Hallucination rate: 35%
- Product rating: 4.7 out of 5
- Quality and response consistency: 4 out of 5
- Downtime rate: 0.81%
ChatGPT is the least reliable chatbot for work tasks. Despite having the highest customer rating at 4.7 and controlling 81% of the market, ChatGPT makes up false information in more than one-third of its responses. The chatbot also experiences more downtime than any other chatbot tested, going offline 0.81% of the time during the two-month testing period.
2. Claude
Claude is the second least reliable chatbot. It hallucinates 17% of the time, less than half ChatGPT’s rate, but still gets things wrong often enough to potentially cause problems at work. Claude holds only 0.99% of the market and scores 4.4 for customer satisfaction. The service goes down 0.77% of the time, nearly matching ChatGPT’s downtime issues.
3. Meta AI
While Meta AI’s hallucination rate sits at 15%, the chatbot struggles with overall performance quality. Customers give it just 3.4 out of 5 stars, the lowest rating among major chatbots tested. Response consistency is also low (3.4 score), meaning users get answers of varying quality depending on what they ask. Meta AI still holds a 20% market share and rarely goes down.
4. Botpress
Botpress ranks fourth. The chatbot hallucinates 15% of the time and maintains a 4.5 rating for both customer satisfaction and response consistency. Botpress also charges $89 per month, making it the priciest option among business chatbots. The service experiences 0.37% downtime, which adds up to roughly 2.7 hours of unavailability in a 30-day month.
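To put the downtime percentages in concrete terms, the conversion to hours can be sketched as below. The function name `downtime_hours` and the 30-day month are assumptions for illustration; the study reports only the percentages.

```python
def downtime_hours(downtime_pct, days=30):
    """Convert a downtime percentage into hours of unavailability.

    Assumes the service is expected to be up 24 hours a day for `days` days.
    """
    return downtime_pct / 100 * days * 24


# Downtime rates from the study's table.
for name, pct in [("ChatGPT", 0.81), ("Claude", 0.77), ("Botpress", 0.37)]:
    print(f"{name}: about {downtime_hours(pct):.1f} hours per 30-day month")
```

By the same arithmetic, ChatGPT’s 0.81% works out to just under six hours in a 30-day month.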
5. Perplexity AI
Perplexity AI is the most reliable of the five highest-risk chatbots. It has a 13% hallucination rate and earns a 4.6 customer rating. Response quality sits at 3.5 out of 5, showing room for improvement in consistency. Perplexity AI holds more than 10% of the market despite charging $40 monthly, double what ChatGPT costs. The service rarely experiences problems, showing only 0.18% downtime during the two-month testing period.
Razvan-Lucian Haiduc, Chief Product Officer at Relum, commented on the study.
“About 65% of US companies now use AI chatbots in their daily work, and nearly 45% of employees admit they’ve shared sensitive company information with these tools. These numbers show well how important chatbots have become in everyday work. Dependence on AI tools will likely increase even more, so companies should choose their chatbots based on how reliable and fit they are for their specific business needs. A chatbot that everyone uses isn’t necessarily the one that works best for your industry or gives accurate answers for your tasks.”
