Walking in Others' Shoes: How Perspective-Taking Guides Large Language Models in Reducing Toxicity and Bias Paper • 2407.15366 • Published Jul 22, 2024
Preemptive Answer "Attacks" on Chain-of-Thought Reasoning Paper • 2405.20902 • Published May 31, 2024
Course-Correction: Safety Alignment Using Synthetic Preferences Paper • 2407.16637 • Published Jul 23, 2024
MR-BEN: A Comprehensive Meta-Reasoning Benchmark for Large Language Models Paper • 2406.13975 • Published Jun 20, 2024
The Earth is Flat because...: Investigating LLMs' Belief towards Misinformation via Persuasive Conversation Paper • 2312.09085 • Published Dec 14, 2023