Oct 15, 2024 | Check out our Awesome LM Evaluation Methodologies, a collection of frontier papers on LM evaluation methodologies! |
Sep 27, 2024 | PertEval and CCAT are accepted by NeurIPS 2024. PertEval is selected as a Spotlight paper. |
May 30, 2024 | Finished my research internship at Alibaba Cloud. Many thanks to my mentor and all collaborators! |
May 30, 2024 | PertEval is available on arXiv. [Paper] |
Jan 23, 2024 | ID-CDF is accepted by TheWebConf 2024. [Paper] |
Jan 06, 2024 | Started my research internship in LLM evaluation at Alibaba Cloud! |
Sep 18, 2023 | DeepEval is available on arXiv. [Paper] |
Mar 18, 2023 | Our work on Bayesian Item Response Theory is invited for an oral presentation at IMPS 2023, University of Maryland, College Park, USA. [Web] |
May 20, 2022 | HierCDF is accepted by SIGKDD 2022. [Paper] |