Xu Rongwu 许融武

IIIS
Tsinghua University
Beijing, 100084, P.R. China

Contact me:
[Email1] [Email2] [Github] [Twitter]

Don't Crack Under Pressure

I am a master's student in computer science at IIIS@Tsinghua University. Prior to that, I obtained my bachelor's degree in computer science from CST@Tsinghua University.

My research interests are: natural language processing (NLP) and computational social science (CSS).

In terms of NLP, I focus on: LLM evaluation, robustness and interpretability.
In terms of CSS, I mainly focus on AI safety and public discourse understanding.

Feel free to check out my [CV] for details.

Updates

If you are interested in my work or see potential for collaboration, please do not hesitate to contact me!

[Jul 2024] Check out our survey on knowledge conflicts for (RAG) LLMs! [Paper][Resource][机器之心][Talk (Chinese)][Slide]
[May 2024] Two papers accepted to ACL! Thanks to my collaborators! See you in Bangkok🇹🇭!
[May 2024] Check out LLMs' safety vulnerabilities discovered by tricking them into believing misinformation! [Paper][Resource][Video]
[Dec 2023] I received the Overall Excellence Scholarship at Tsinghua!
[Apr 2023] One paper accepted to EuroS&P! Thanks to my collaborators!
[Dec 2022] Debut of my homepage.

Recent Work

Course-Correction: Safety Alignment Using Synthetic Preferences
Rongwu Xu*, Yishuo Cai*, Zhenhong Zhou, Renjie Gu, Haiqin Wang, Yan Liu, Tianwei Zhang, Wei Xu, Han Qiu
arXiv Preprints
[Paper][Code]

MR-BEN: A Comprehensive Meta-Reasoning Benchmark for Large Language Models
Zhongshen Zeng, Yinhong Liu, Yingjia Wan, Jingyao Li, Pengguang Chen, Jianbo Dai, Yuxuan Yao, Rongwu Xu, Zehan Qi, Wanru Zhao, Linling Shen, Jianqiao Lu, Haochen Tan, Yukang Chen, Hao Zhang, Zhan Shi, Bailin Wang, Zhijiang Guo, Jiaya Jia
arXiv Preprints
[Paper][Code][Project Page]

How Alignment and Jailbreak Work: Explain LLM Safety through Intermediate Hidden States
Zhenhong Zhou, Haiyang Yu, Xinghua Zhang, Rongwu Xu, Fei Huang, Yongbin Li
arXiv Preprints
[Paper][Code]

Knowledge Conflicts for LLMs: A Survey
Rongwu Xu*, Zehan Qi*, Zhijiang Guo, Cunxiang Wang, Hongru Wang, Yue Zhang, Wei Xu
arXiv Preprints
[Paper][Code][机器之心][Talk (Chinese)][Slide]

Preemptive Answer "Attacks" on Chain-of-Thought Reasoning
Rongwu Xu*, Zehan Qi*, Wei Xu
ACL 2024 (Findings) Bangkok, Thailand
[Paper][Code]

The Earth is Flat because...: Investigating LLMs' Belief towards Misinformation via Persuasive Conversation
Rongwu Xu, Brian S. Lin, Shujian Yang, Tianqi Zhang, Weiyan Shi, Tianwei Zhang, Zhixuan Fang, Wei Xu, Han Qiu
ACL 2024 (Oral, Main) Bangkok, Thailand
[Paper][Code][Project Page][Video] ARR Meta: 5

Walking in Others' Shoes: How Perspective-Taking Guides Large Language Models in Reducing Toxicity and Bias
Rongwu Xu, Zi'an Zhou, Tianwei Zhang, Zehan Qi, Su Yao, Ke Xu, Wei Xu, Han Qiu
arXiv Preprints
[Paper]


Exploring Chinese Humor Generation: A Study on Two-Part Allegorical Sayings
Rongwu Xu
IJCNN 2024 Yokohama, Japan
[Paper]

Tempo: Confidentiality Preservation in Cloud-Based Neural Network Training
Rongwu Xu and Zhixuan Fang
IJCNN 2024 Yokohama, Japan
[Paper]

LSync: A Universal Timeline-synchronizing Solution for Live Streaming
Fan Dang*, Yifan Xu*, Rongwu Xu, Xinlei Chen, Yunhao Liu
IEEE/ACM ToN
[Paper]

MISO: Legacy-compatible Privacy-preserving Single Sign-on using Trusted Execution Environments
Rongwu Xu, Sen Yang, Fan Zhang, Zhixuan Fang
IEEE EuroS&P 2023 Delft, The Netherlands
[Paper][Project Page] Bachelor Thesis

LSync: A Universal Event-synchronizing Solution for Live Streaming
Yifan Xu, Fan Dang, Rongwu Xu, Xinlei Chen, Yunhao Liu
IEEE INFOCOM 2022 Virtual
[Paper]

LifeRec: A Mobile App for Lifelog Recording and Ubiquitous Recommendation
Jiayu Li, Hantian Zhang*, Zhiyu He*, Rongwu Xu*, Pingfei Wu*, Min Zhang, Yiqun Liu, Shaoping Ma
ACM CHIIR 2022 Regensburg, Germany
[Paper][Code]

* denotes equal contribution.

Talks and Presentations (Online)

Knowledge Conflicts for LLMs [2-Hour Talk (Chinese)][Talk Slide]
July 2024
LLM Safety Vulnerabilities via Contextual Misinformation [3-Min Clip (English)]
May 2024

Teaching Activities

Lead Teaching Assistant: Introduction to Large Language Model Applications (Lectured by Prof. Xu Wei, Spring 2024)
Check out the [Code] we developed for the course. (10+ contributors, involving LLM-based app design, inference, efficient fine-tuning, etc.)
Teaching Assistant: Operating Systems and Distributed Systems (Lectured by Prof. Xu Wei, Fall 2023)
Teaching Assistant: Distributed Systems (Lectured by Prof. Xu Wei and Prof. Fang Zhixuan, Spring 2022)

Social Activities

President@IIIS Graduate Student Union (Jun 2024 – Current)
Graduate Freshman Counselor@IIIS (May 2024 – Current)
Social Practice Captain@IIIS (Apr 2024 – Jul 2024)
Member@IIIS Graduate Student Union (Sept 2023 – Jun 2024)
Outstanding individual 2023-2024

Visiting

Duke University, Durham, USA (Jun 2021 – Aug 2021)

Experience

Intern at TongYi Vision Intelligence Lab, Alibaba Inc., Beijing, China (Apr 2024 – Current)
Intern at Shanghai Qi Zhi Institute, Shanghai, China (Dec 2022 – Jan 2023)

Misc

I used to play electric guitar, see this. My favorite band is Pink Floyd.
I explored various research topics as an undergraduate and have focused on NLP since 2023.