Claude Opus 4.6 accuracy on BridgeBench hallucination test drops from 83% to 68% | AI Paper Digest