Rongwu Xu

pillowsofwind

https://rongwuxu.site

AI & ML interests

NLP

Organizations

None yet

Authored 6 papers

Walking in Others' Shoes: How Perspective-Taking Guides Large Language Models in Reducing Toxicity and Bias

Paper • 2407.15366 • Published Jul 22, 2024

Preemptive Answer "Attacks" on Chain-of-Thought Reasoning

Paper • 2405.20902 • Published May 31, 2024

Course-Correction: Safety Alignment Using Synthetic Preferences

Paper • 2407.16637 • Published Jul 23, 2024

MR-BEN: A Comprehensive Meta-Reasoning Benchmark for Large Language Models

Paper • 2406.13975 • Published Jun 20, 2024

Knowledge Conflicts for LLMs: A Survey

Paper • 2403.08319 • Published Mar 13, 2024

The Earth is Flat because...: Investigating LLMs' Belief towards Misinformation via Persuasive Conversation

Paper • 2312.09085 • Published Dec 14, 2023