DeepSeek warns its open-source AI models are vulnerable to 'jailbreaking'

**DeepSeek Warns Its Open-Source AI Models Are Vulnerable to ‘Jailbreaking’**

*By Dwaipayan Roy | Sep 21, 2025, 02:58 PM*

DeepSeek, a Hangzhou-based start-up, has issued a warning that its artificial intelligence (AI) models are susceptible to being “jailbroken” by malicious actors. The company recently published its findings in a peer-reviewed paper featured in the prestigious academic journal *Nature*.

### Understanding the Vulnerabilities

The study outlines the various vulnerabilities inherent in open-sourced AI models and explains how these weaknesses can be exploited by bad actors to manipulate the models’ outputs for harmful purposes. This raises concerns about the security and ethical implications of deploying open-source AI technologies without adequate safeguards.

### DeepSeek’s Evaluation Process

DeepSeek conducted a thorough evaluation of its AI models using industry-standard benchmarks alongside internal testing procedures. According to Fang Liang, an expert member of China’s AI Industry Alliance (AIIA), the detailed testing protocols were shared in the company’s paper published in *Nature*.

Among the approaches employed was a series of “red-team” assessments based on a framework introduced by AI safety company Anthropic. These assessments involve testers deliberately attempting to coax AI models into generating harmful or dangerous speech, thereby identifying weaknesses and potential exploit paths.

### Addressing Risk: DeepSeek’s Proactive Stance

While many US-based AI companies have been vocal about the risks associated with rapidly advancing AI models, Chinese firms have generally remained more reserved on this topic. However, DeepSeek stands out for its proactive approach.

The company has previously conducted assessments of various AI risks, focusing especially on so-called “frontier risks” — the most severe and difficult-to-mitigate dangers. This vigilance reflects a risk-aware strategy similar to that of leading players like Anthropic and OpenAI, both of which have implemented robust risk mitigation policies to counter threats arising from their AI technologies.

### Conclusion

DeepSeek’s findings and transparent communication underscore the importance of continuous monitoring and securing of AI models, particularly open-source variants. As AI capabilities continue to grow, ensuring these systems cannot be manipulated into harmful outputs remains a critical priority for developers and policymakers alike.
https://www.newsbytesapp.com/news/science/chinese-ai-firm-warns-of-jailbreak-risks-in-its-models/story

DeepSeek warns its open-source AI models are vulnerable to ‘jailbreaking’

Leave a Reply Cancel reply

归档

分类