March 22, 2025

ikayaniaamirshahzad@gmail.com

o1-pro sets a new record on the Extended NYT Connections benchmark with a score of 81.7, easily outperforming the previous champion, o1 (69.7)!

This benchmark is a harder version of the original NYT Connections benchmark, with additional words added to each puzzle. The original was approaching saturation and effectively required identifying only three categories, since the fourth then fell into place. To guard against training data contamination, I also evaluate performance on only the 100 most recent puzzles. On that subset, o1-pro remains in first place.
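As a rough illustration of the setup described above, here is a minimal Python sketch of a puzzle with extra distractor words and an evaluation restricted to the newest puzzles. The names (`Puzzle`, `puzzle_score`, `most_recent`, `benchmark_score`) and the scoring rule (fraction of groups matched exactly, scaled to 0-100) are assumptions made for this example and are not taken from the benchmark's actual code.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Puzzle:
    published: date
    groups: list[frozenset[str]]  # the four true categories, four words each
    extra_words: frozenset[str]   # distractors added in the extended version


def puzzle_score(puzzle: Puzzle, guesses: list[frozenset[str]]) -> float:
    """Fraction of true groups matched exactly by a guess (assumed rule)."""
    correct = len(set(puzzle.groups) & set(guesses))
    return correct / len(puzzle.groups)


def most_recent(puzzles: list[Puzzle], n: int = 100) -> list[Puzzle]:
    """Keep only the n newest puzzles to limit training data contamination."""
    return sorted(puzzles, key=lambda p: p.published, reverse=True)[:n]


def benchmark_score(puzzles: list[Puzzle], solver) -> float:
    """Mean per-puzzle score, scaled to 0-100. `solver` maps a puzzle to guesses."""
    scores = [puzzle_score(p, solver(p)) for p in puzzles]
    return 100 * sum(scores) / len(scores)
```

Because the distractors in `extra_words` belong to no group, a solver can no longer obtain the fourth category for free by collecting whatever words remain after finding three.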

More info: NYT Connections Benchmark on GitHub

submitted by /u/zero0_one1