AI-generated multimodal model struggling to interpret a clock and calendar

AI can write novels and solve complex problems but flunks kindergarten-level time-telling. Researchers found major models like GPT-4o and Gemini 2.0 struggle with clocks and calendars, with accuracy below 25% in some cases. 😉

Why it matters: Time reasoning is crucial for real-world AI applications, from scheduling to autonomous systems. Yet, this area remains largely unexplored.

The kicker: Even the best model, GPT-o1, only nailed 80% of calendar questions. So, maybe don't let AI plan your New Year's Eve party just yet.
