LM Curve - Search News

New study accuses LM Arena of gaming its popular AI benchmark

The rapid proliferation of AI chatbots has made it difficult to know which models are actually improving and which are falling behind. Traditional academic benchmarks only tell you so much, which has ...


