Nvidia Debuts Open-Source Tool That Trains AI To Stop Spouting Malicious Remarks
By Nicole Rodrigues, 26 Apr 2023
Artificial intelligence is a rapidly expanding field with potential to improve lives. However, with this possibility comes a significant risk of errors and malicious attacks, especially regarding chatbots. Fortunately, Nvidia has devised a solution to this problem: ‘NeMo Guardrails’.
NeMo Guardrails is an open-source toolkit of guidelines to prevent chatbots from spewing misinformation, toxic comments, and racist remarks. With it, developers from various organizations can train their large language models (LLMs) to follow specific rules and safety measures. This ensures that the machines stick to their intended purpose, remain secure, and do not produce inappropriate content.
One of the critical features of NeMo Guardrails is that it enables companies to set security limits, ensuring their LLMs use accurate information when responding to users. Moreover, the safety rules prevent LLMs from making up answers to questions and from producing toxic responses or malicious code. For example, a customer service chatbot can be configured to decline questions about the weather so it stays on topic. These guardrails also make it harder for users to jailbreak the model and steer it into making racist or hostile comments.
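To give a sense of how such a topical rail is defined, NeMo Guardrails configurations are written in Colang, Nvidia's modeling language for conversational flows. The snippet below is a hypothetical sketch of the weather example above; the specific message names and example phrasings are illustrative, not taken from Nvidia's own configurations:

```
# Examples of user messages the rail should catch
define user ask about weather
  "What's the weather like today?"
  "Will it rain tomorrow?"

# Canned response steering the conversation back on topic
define bot decline off topic
  "Sorry, I can only help with customer service questions."

# Flow tying the two together: when a weather question
# is detected, the bot politely declines
define flow weather rail
  user ask about weather
  bot decline off topic
```

A developer would place flows like this in a Guardrails configuration alongside the model settings, and the toolkit intercepts matching user messages before they ever reach the underlying LLM.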
Gizmodo points out that many users have called for Snapchat's new AI chatbot to be removed because the bot was pushed onto them without warning. On top of that, there were reports that some users were training it to say the N-word to them. This shows that as powerful as these assistants can be, there is still a lot that companies cannot control, and human curiosity often gets the best of us. Proper safety measures such as this might help ensure that these systems do not contribute to more hate.
NeMo Guardrails is open source and can be easily integrated into the tools enterprise developers already use, making it accessible even to those with limited knowledge of the field. It is also designed to work with OpenAI's ChatGPT.
“Nvidia made NeMo Guardrails—the product of several years’ research—open source to contribute to the developer community’s tremendous energy and work on AI safety,” states the company. “Together, our efforts on guardrails will help companies keep their smart services aligned with safety, privacy and security requirements so these engines of innovation stay on track.”
[via Gizmodo and Engadget, cover photo 153385536 © Prostockstudio | Dreamstime.com]