Abstract: Recent progress in large language models (LLMs) calls for a thorough safety inspection of these models. In this talk, I will discuss three of our recent works on adversarial attacks in natural language. We first review common concepts in jailbreaking LLMs and discuss the trade-off between their usefulness and safety. Then, we move...