As part of its commitment to driving responsible innovation, Penn Engineering convenes an open dialogue in D.C. on government-funded research and solutions for a safer AI-enabled world
By Holly Wojcik
Rapid advancements across industries, from healthcare and technology to finance and beyond, present novel opportunities as well as challenges. As part of its commitment to develop leading-edge solutions that provide a better future for all, the University of Pennsylvania’s School of Engineering and Applied Science (Penn Engineering) is today bringing together renowned leaders in engineering, academia, industry and policy at the Penn Washington Center in Washington, D.C., for a dialogue on responsibly shaping the future of innovation.
As part of Penn Engineering’s new Responsible Innovation initiative, researchers at the School discovered previously unidentified security vulnerabilities in certain features of AI-governed robots. Funded by the National Science Foundation and the Army Research Laboratory, the research aims to address these emerging vulnerabilities and help ensure the safe deployment of large language models (LLMs) in robotics.
“Our work shows that, at this moment, large language models are just not safe enough when integrated with the physical world,” says George Pappas, UPS Foundation Professor of Transportation in Electrical and Systems Engineering (ESE), in Computer and Information Science (CIS), and in Mechanical Engineering and Applied Mechanics (MEAM).
In the new paper, Pappas, who also serves as the Associate Dean for Research at Penn Engineering, and his coauthors caution that a wide variety of AI-controlled robots can be manipulated or hacked.
RoboPAIR, the algorithm the researchers developed, needed just days to achieve a 100% “jailbreak” rate, bypassing the safety guardrails of the AI governing three different robotic systems: the Unitree Go2, a quadruped robot used in a variety of applications; the Clearpath Robotics Jackal, a wheeled vehicle often used for academic research; and the Dolphin LLM, a self-driving simulator designed by NVIDIA. In the case of the first two, the AI governor is OpenAI’s ChatGPT, which proved vulnerable to jailbreaking attacks, with potentially serious consequences. For example, with its safety guardrails bypassed, the self-driving system could be manipulated to speed through crosswalks.
Prior to publicly releasing the study, Penn Engineering informed the companies about their system vulnerabilities and is working with them to use the research as a framework to advance the testing and validation of these manufacturers’ AI safety protocols. Short of removing LLMs from these systems, there is little that companies can do to fix these vulnerabilities. “We cannot even solve these jailbreaking attacks in chatbots,” points out Hamed Hassani, Associate Professor in ESE and in CIS within Penn Engineering and in Statistics and Data Science at Wharton and one of the paper’s senior authors.
“What is important to underscore here is that systems become safer when you find their weaknesses. This is true for cybersecurity. This is also true for AI safety,” says Alexander Robey, a recent Penn Engineering Ph.D. graduate in ESE, current postdoctoral scholar at Carnegie Mellon University and the paper’s first author. “In fact, AI red teaming, a safety practice that entails testing AI systems for potential threats and vulnerabilities, is essential for safeguarding generative AI systems — because once you identify the weaknesses, then you can test and even train these systems to avoid them.”
“The findings of this paper make abundantly clear that having a safety-first approach is critical to unlocking responsible innovation,” says Vijay Kumar, Nemirovsky Family Dean of Penn Engineering and another coauthor. “We must address intrinsic vulnerabilities before deploying AI-enabled robots in the real world. Indeed our research is developing a framework for verification and validation that ensures only actions that conform to social norms can — and should — be taken by robotic systems.”
Additional coauthors of the paper include Hamed Hassani and Zachary Ravichandran, a doctoral student in the General Robotics, Automation, Sensing and Perception (GRASP) Laboratory.
This study was conducted at the University of Pennsylvania School of Engineering and Applied Science and supported by the U.S. National Science Foundation (NSF) Institute for Emerging CORE Methods in Data Science (NSF-CCF-2217058), the AI Institute for Learning-Enabled Optimization at Scale (NSF-CCF-2112665), the Distributed and Collaborative Intelligent Systems and Technology (DCIST) Collaborative Research Alliance (ARL DCIST CRA W911NF-17-2-0181), the ASSET Center for AI-Enabled Systems, and the NSF Graduate Research Fellowship Program.