The National Institute of Standards and Technology (NIST), part of the US Department of Commerce, has released Dioptra, a new open-source software package that helps developers evaluate how various attacks could compromise the performance of their AI models.
According to a NIST statement, Dioptra tests the effects of adversarial attacks on machine learning models, helping AI developers and users understand how robust their systems are against a range of adversarial threats.
Freely downloadable, the tool enables AI system developers to quantify how attacks might degrade model performance, providing insights into the frequency and conditions under which their systems might fail.
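Dioptra provides its own experiment-tracking interface, but the core idea it supports can be shown with a short, self-contained sketch: train a simple classifier, perturb the test inputs with an adversarial-style attack, and compare accuracy before and after. The example below is a hypothetical illustration using scikit-learn and a fast-gradient-sign-style perturbation; it does not use Dioptra's actual API, and all names and parameters are our own assumptions.

```python
# Hypothetical sketch: measure how an adversarial perturbation degrades a
# model's accuracy -- the kind of before-vs-after comparison a testbed like
# Dioptra is meant to support. This does NOT use Dioptra's API.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Build a small synthetic binary-classification problem.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
clean_acc = model.score(X_test, y_test)

# Fast-gradient-sign-style attack on logistic regression: for binary
# cross-entropy loss, d(loss)/dx = (p - y) * w, so each test point is
# stepped in the sign of that gradient to increase the loss.
eps = 0.3
p = model.predict_proba(X_test)[:, 1]
grad = (p - y_test)[:, None] * model.coef_   # shape (n_samples, n_features)
X_adv = X_test + eps * np.sign(grad)

adv_acc = model.score(X_adv, y_test)
print(f"clean accuracy:       {clean_acc:.3f}")
print(f"adversarial accuracy: {adv_acc:.3f}")
```

The gap between the two accuracy figures is the kind of degradation measurement the tool is intended to help developers quantify systematically.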
The release of Dioptra follows President Biden’s 2023 executive order on AI, which tasked NIST with helping to test AI models.
In addition to Dioptra, NIST has issued several documents advocating for AI safety and standards, aligning with the executive order.
One such document is the initial public draft of guidelines for developing foundation models, titled “Managing Misuse Risk for Dual-Use Foundation Models.” These guidelines recommend voluntary practices for developers to design models that are resistant to misuse, thereby preventing intentional harm to individuals, public safety, and national security.
The draft outlines seven approaches to mitigate the risks of model misuse, along with recommendations for implementation and transparency.
“These practices aim to prevent models from being used to create biological weapons, conduct offensive cyber operations, or generate harmful content such as child sexual abuse material and nonconsensual intimate imagery,” NIST noted, inviting comments on the draft until September 9.
Companion Documents for AI Safety Guidelines Released
NIST also released two guidance documents as companion resources to its AI Risk Management Framework (AI RMF) and Secure Software Development Framework (SSDF). These documents are intended to help developers manage the risks associated with generative AI.
The AI RMF Generative AI Profile identifies 12 potential risks of generative AI and suggests nearly 200 actions developers can take to mitigate these risks. These risks include the facilitation of cybersecurity attacks, the spread of misinformation and hate speech, and the generation of confabulated or “hallucinated” outputs by AI systems.
The second guidance document, “Secure Software Development Practices for Generative AI and Dual-Use Foundation Models,” complements the SSDF. While the SSDF focuses broadly on software coding practices, this companion resource extends the SSDF to address issues like the introduction of malicious training data that could degrade AI system performance.
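To make the malicious-training-data risk concrete, here is a small hypothetical experiment, not drawn from the NIST guidance itself: flip a fraction of the training labels and compare the resulting model’s test accuracy with that of a model trained on clean data. All names and parameters below are illustrative assumptions.

```python
# Hypothetical sketch: a label-flipping poisoning experiment showing how
# malicious training data can degrade model performance. Illustrative only;
# not a practice taken from the NIST guidance.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

# Baseline: train on clean labels.
clean_model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# "Poison" the training set by flipping 30% of the labels.
rng = np.random.default_rng(1)
poisoned = y_train.copy()
flip_idx = rng.choice(len(poisoned), size=int(0.3 * len(poisoned)), replace=False)
poisoned[flip_idx] = 1 - poisoned[flip_idx]
poisoned_model = LogisticRegression(max_iter=1000).fit(X_train, poisoned)

print(f"clean-data accuracy:    {clean_model.score(X_test, y_test):.3f}")
print(f"poisoned-data accuracy: {poisoned_model.score(X_test, y_test):.3f}")
```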
As part of its efforts to ensure AI safety, NIST has proposed a plan for US stakeholders to collaborate with international partners on developing AI standards.
In November of last year, the US and China, along with at least 25 other countries, agreed to work together to address the risks associated with AI advancements. This agreement, known as the Bletchley Declaration, was signed at the UK AI Safety Summit and aims to establish a common framework for overseeing AI development and ensuring the technology’s safe progression.