Abstract: Recent progress in large language models (LLMs) calls for a thorough safety inspection of these models. In this talk, I will discuss three of our recent works on adversarial attacks in natural language. We first review common concepts in jailbreaking LLMs and discuss the trade-off between their usefulness and safety. Then, we move...