Researchers used questions from the NPR Sunday Puzzle challenge to build a benchmark to test AI 'reasoning' models.
Amazon is using math to help solve one of artificial intelligence’s most intractable problems: its tendency to make up answers, and to repeat them back to us with confidence.