ReviewScore: Misinformed Peer Review Detection with Large Language Models • Paper • 2509.21679 • Published Sep 25, 2025 • 64
Orak: A Foundational Benchmark for Training and Evaluating LLM Agents on Diverse Video Games • Paper • 2506.03610 • Published Jun 4, 2025 • 9
Efficient Generative Modeling with Residual Vector Quantization-Based Tokens • Paper • 2412.10208 • Published Dec 13, 2024 • 19