I know, I know. The Artificial Analysis Intelligence Index v4.0, which incorporates 10 evaluations (GDPval-AA, τ²-Bench Telecom, Terminal-Bench Hard, SciCode, AA-LCR, AA-Omniscience, IFBench, Humanity's Last Exam, GPQA Diamond, CritPt) and scores all the AI models on them, is trash because you have better models that can analyze and compare the different AI systems.
Do post the results of your testing.