Lila: A Unified Benchmark for Mathematical Reasoning

It’s pretty remarkable how good cutting-edge language models have gotten at answering difficult technical questions. We’re very close to AIs being able to beat the average human student on many undergrad or PhD STEM exams.

This project is impressive: it uses LLMs to explain AI papers on request as you read them.