Exploring the Hidden Dangers of Multimodal AI as Enkrypt AI Uncovers Critical Safety Gaps
  • Dan Yoo
  • Published 2025.05.08 23:46

Multimodal AI at a Crossroads: Report Reveals CSEM Risks

 


The report shows two Mistral models are 60 times more likely to produce CSEM and 40 times more likely to generate CBRN content than GPT-4o and Claude 3.7 Sonnet. / Source: Multimodal Safety Report.

Boston, MA -- As the world of artificial intelligence (AI) advances at breakneck speed, the integration of text and imagery into generative models, known as multimodal AI, is hailed as a groundbreaking frontier. But amid this excitement, a warning echoes from within the industry. On May 8, Enkrypt AI, a leader in AI safety and compliance solutions, unveiled a revealing report highlighting overlooked risks and vulnerabilities lurking in these powerful systems.

The report stems from rigorous red teaming exercises conducted on several widely used multimodal models, focusing on safety and harm prevention aligned with standards such as the NIST AI Risk Management Framework. What the researchers uncovered is alarming: new jailbreak techniques—ingenious yet malicious manipulations—are exploiting the way these models process combined media. These methods allow harmful instructions to slip past content filters, often embedded within seemingly innocuous images. The result? Potentially dangerous outputs that carry no obvious red flags.

Sahil Agarwal, CEO of Enkrypt AI, emphasizes the gravity of these findings: “Multimodal AI offers incredible benefits, but it also broadens the attack surface in unpredictable ways. This research should serve as a wake-up call: embedding harmful instructions within images has serious implications for enterprise liability, public safety, and vulnerable populations like children.”

The report zeroes in on two prominent models from Mistral—Pixtral-Large (25.02) and Pixtral-12b—and finds them alarmingly susceptible to producing harmful responses: up to 60 times more likely to generate child sexual exploitation material (CSEM) than popular models such as GPT-4o and Claude 3.7 Sonnet. Beyond this, the models were 18 to 40 times more likely to produce dangerous chemical, biological, radiological, and nuclear (CBRN) information when exposed to adversarial prompts, highlighting a serious safety gap.

Most troubling of all, these vulnerabilities are not triggered by overtly malicious text prompts. Instead, they arise from prompt injections cleverly embedded within images—techniques that can easily bypass traditional, text-only safety defenses.
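To illustrate why image-borne instructions can evade filters tuned only to text, the sketch below shows one possible context-aware pre-screening step: extracting any text rendered inside an incoming image with OCR and running it through the same content filter applied to ordinary prompts. This is a minimal sketch, not part of the Enkrypt AI report; the pytesseract dependency and the is_text_safe filter are illustrative assumptions.

```python
# Minimal sketch: OCR-based pre-screening of image inputs, assuming a
# pytesseract install and a text-level safety filter (is_text_safe) that
# the surrounding application already provides. Not from the Enkrypt report.
from PIL import Image
import pytesseract


def is_text_safe(text: str) -> bool:
    """Placeholder for an existing text-content filter (hypothetical)."""
    banned_terms = ["ignore previous instructions"]  # illustrative only
    return not any(term in text.lower() for term in banned_terms)


def screen_image_for_injections(image_path: str) -> bool:
    """Return True if the image appears safe to forward to the model."""
    # Extract any text rendered inside the image; attackers can hide
    # instructions here that a text-only prompt filter never sees.
    embedded_text = pytesseract.image_to_string(Image.open(image_path))
    if embedded_text.strip() and not is_text_safe(embedded_text):
        return False  # route to human review or reject outright
    return True
```

Such a check is only one layer of defense; instructions can also be hidden in ways OCR will not recover, which is why the report pairs guardrails with ongoing testing and monitoring.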

In response, Enkrypt AI is calling for urgent action from the AI community and enterprise developers alike. The report recommends a series of best practices: integrating red teaming datasets into safety protocols, establishing continuous automated testing, deploying context-aware safety guardrails, maintaining real-time monitoring, and communicating model risks transparently.
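One way to operationalize the continuous automated testing recommendation is a small regression harness that replays a red-team dataset against the deployed model on every release and flags any responses a safety classifier rejects. The sketch below is an assumption of this article rather than tooling described in the report; query_model and safety_classifier stand in for whatever client and classifier a given deployment provides.

```python
# Minimal sketch of a continuous red-team regression harness, assuming
# hypothetical query_model() and safety_classifier() helpers supplied by
# the deployment environment. Illustrative only.
import json
from typing import Callable


def run_red_team_suite(
    dataset_path: str,
    query_model: Callable[[str, str], str],    # (prompt, image_path) -> response
    safety_classifier: Callable[[str], bool],  # response -> True if safe
) -> list[dict]:
    """Replay multimodal red-team cases and collect any unsafe responses."""
    failures = []
    with open(dataset_path) as f:
        cases = json.load(f)  # e.g. [{"prompt": ..., "image": ...}, ...]
    for case in cases:
        response = query_model(case["prompt"], case["image"])
        if not safety_classifier(response):
            failures.append({"case": case, "response": response})
    return failures  # a non-empty list should fail the release pipeline
```

Run on every model or guardrail update, a harness like this turns red-team findings into regression tests rather than one-off audits.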

Agarwal concludes with a stark warning: “These are not just theoretical risks. If we fail to prioritize safety in multimodal AI, we jeopardize the well-being of users—and especially vulnerable groups—on a scale we can no longer ignore.”

As the industry explores the immense potential of multimodal models, this report underscores a vital truth: safeguarding these technologies is essential to ensure they serve all of society safely.
 


