Native Parallel Reasoner: Reasoning in Parallelism via Self-Distilled Reinforcement Learning Paper • 2512.07461 • Published 26 days ago • 74
LLMs Learn to Deceive Unintentionally: Emergent Misalignment in Dishonesty from Misaligned Samples to Biased Human-AI Interactions Paper • 2510.08211 • Published Oct 9, 2025 • 22
Post: BAAI has released ROME🔥, evaluating 30+ large reasoning models on text & visual reasoning. FlagEval Findings Report: A Preliminary Evaluation of Large Reasoning Models on Automatically Verifiable Textual and Visual Questions (2509.17177) ✨Tests visual reasoning, not just recognition ✨Covers capability × alignment × safety × efficiency ✨More transparent & reliable (less data contamination) ✨Helps make real-world deployment choices
FlagEval Findings Report: A Preliminary Evaluation of Large Reasoning Models on Automatically Verifiable Textual and Visual Questions Paper • 2509.17177 • Published Sep 21, 2025 • 13
Qwen3 Embedding: Advancing Text Embedding and Reranking Through Foundation Models Paper • 2506.05176 • Published Jun 5, 2025 • 77
RoboRefer: Towards Spatial Referring with Reasoning in Vision-Language Models for Robotics Paper • 2506.04308 • Published Jun 4, 2025 • 43
Personalize Anything for Free with Diffusion Transformer Paper • 2503.12590 • Published Mar 16, 2025 • 44
AISafetyLab: A Comprehensive Framework for AI Safety Evaluation and Improvement Paper • 2502.16776 • Published Feb 24, 2025 • 6
VLSBench: Unveiling Visual Leakage in Multimodal Safety Paper • 2411.19939 • Published Nov 29, 2024 • 10