| Jan 27, 2026 | Our Adaptive ELO and Test-time Correction of Reasoning are accepted by ICLR 2026 |
| Nov 07, 2025 | Our Cognitive Adaptive Selection Strategy is accepted by AAAI 2026 |
| Sep 18, 2025 | Our Vittle is accepted by NeurIPS 2025 |
| May 01, 2025 | Our recent research on trustworthy evaluation, am-ELO, is accepted by ICML 2025 (Spotlight) |
| Oct 15, 2024 | Check out our Awesome LM Evaluation Methodologies, a collection of frontier papers in LM Eval Methodologies |
| Sep 27, 2024 | PertEval and CCAT are accepted by NeurIPS 2024. PertEval is selected as a Spotlight |
| May 30, 2024 | Completed my research internship at Alibaba Cloud. Many thanks to my mentor and all collaborators! |
| May 30, 2024 | PertEval is available on arXiv. [Paper] |
| Jan 23, 2024 | ID-CDF is accepted by TheWebConf 2024. [Paper] |
| Jan 06, 2024 | Started my research internship in LLM evaluation at Alibaba Cloud |
| Sep 18, 2023 | DeepEval is available on arXiv. [Paper] |
| Mar 18, 2023 | One of our studies on Bayesian Item Response Theory is invited for an oral presentation at IMPS 2023, University of Maryland, College Park, USA. [Web] |
| May 20, 2022 | HierCDF is accepted by SIGKDD 2022. [Paper] |