Cyber Risks

Exploring PLeak: An Algorithmic Method for System Prompt Leakage

May 3, 2025

Key Takeaways

We took a deep dive into the concept of Prompt Leakage (PLeak) by developing strings for jailbreaking system prompts, exploring its transferability, and evaluating it through a guardrail system. PLeak could allow attackers to exploit system weaknesses, which could lead to the exposure of sensitive data such as trade secrets.
Organizations that are currently incorporating or are considering the use of large language models (LLMs) in their workflows must heighten their vigilance against prompt leaking attacks.
Adversarial training and prompt classifier creation are some steps companies can take to proactively secure their systems. Companies can also consider taking advantage of solutions like Trend Vision One™ – Zero Trust Secure Access (ZTSA) to avoid potential sensitive data leakage or unsecure outputs in cloud services. The solution can also deal with GenAI system risks and attacks against AI models.

In the second article of our series on attacking artificial intelligence (AI), let us explore an algorithmic technique designed to induce system prompt leakage in LLMs, which is called PLeak.

System Prompt Leakage pertains to the risk that preset system…

Актуальные книги на английском

Decision Science

Читать на английском

The Decision Intelligence Handbook: Practical Steps for Evidence-Based Decisions in a Complex World

Читать на английском

Algorithms for Decision Making

Читать на английском

An Introduction to Decision Theory (Cambridge Introductions to Philosophy)

Читать на английском

An Introduction to Management Science: Quantitative Approaches to Decision Making

Читать на английском

The Art and Science of Decision-Making: Mastering Critical Thinking and Problem-Solving for Enhanced Productivity and Success

Читать на английском

Актуальные книги на английском

LEAVE A REPLY Cancel reply

МЕРОПРИЯТИЯ

ПОПУЛЯРНЫЕ СТАТЬИ

ПОПУЛЯРНЫЕ ШАБЛОНЫ

Подписывайтесь на новый канал