FixVibe

// プローブ / スポットライト

LLMプロンプトインジェクション

AI機能がユーザー入力を指示として信頼すれば、ユーザーはシステムプロンプトを書き換えられます。

概要

Prompt injection is the new SQL injection, with one large complication: the parser is a probabilistic neural network whose decision boundaries are not specified anywhere. There is no equivalent of parameterized queries because the LLM does not have a structural separation between code and data — both arrive as text in a single context window. Every developer who builds a chatbot, summarization tool, or RAG-backed search starts with the same naive pattern (concatenate the system prompt with the user's input, send it to the LLM), and every one of them is vulnerable until they layer on defenses. Worse: the attacker doesn't need to be the user. Indirect prompt injection — instructions hidden inside documents, web pages, or emails the LLM consumes — turns content fetching into command execution.

仕組み

Prompt injection appears when LLM-facing inputs can override instructions, leak context, or trigger unsafe tool behavior. The risk depends on what the AI feature can read or do inside the product.

被害範囲

Data exfiltration: system prompts (often containing internal context, business logic, or credentials), conversation history from other users in shared deployments, document contents from RAG systems. Reputation damage when chatbots produce offensive output that screenshots cleanly. Phishing assistance via injected URLs that the LLM presents as legitimate. Financial loss when LLMs with tool access execute unintended operations — sending email, posting messages, hitting paid APIs, calling code-execution sandboxes. In agentic systems with broader tool access, the impact grades up to remote code execution.

// what fixvibe checks

What FixVibe checks

FixVibe checks this class with verified-domain active testing that is bounded, non-destructive, and evidence-driven. Public reports describe the affected surface and remediation. For check-specific questions about exact detection heuristics, active payload details, or source-code rule patterns, contact support@fixvibe.app.

鉄壁の防御

Treat user input as untrusted by the LLM. The most reliable structural defense is to constrain output to structured formats (JSON schema with strict validation, function-calling with tool-side permission gates) so the LLM's output is parsed by your code rather than executed. Layer the system prompt to restate boundaries after user content — instructions placed at the end of the context have measurably better adherence than those at the beginning. For tool-using agents, gate every dangerous operation behind a human approval step (don't let an LLM send email, transfer money, or execute code without a confirmation prompt). For indirect injection, sanitize fetched content before adding it to the context — strip HTML attributes, comment markers, and instruction-shaped text patterns. Run a second model as a 'jailbreak detector' over user inputs and outputs in high-stakes deployments; OpenAI's moderation endpoint and Anthropic's classifier APIs cover the common cases. Most importantly, design with assumed compromise: what's the worst the LLM can do if it follows attacker instructions? If the answer is 'leak the system prompt,' you might accept that risk. If the answer is 'send mass email or call paid APIs,' you need the gates.

要点

There is no parameterized-query equivalent for LLMs yet. Defense is currently architectural — constrain what the model can do with its output, not what it can be told to do.

// あなたのアプリで実行してみてください

FixVibe が見守る間も、安心して出荷を続けられます。

FixVibe は攻撃者と同じ視点で、あなたのアプリの公開面を徹底的にテストします —— エージェント不要、インストール不要、クレジットカード不要。新しい脆弱性パターンを継続的に研究し、実用的なチェックと Cursor、Claude、Copilot 向けの貼り付け可能な修正に変換します。

アクティブプローブ
103
このカテゴリで実行されるテスト
モジュール
27
専用の アクティブプローブ チェック
1スキャンごと
384+
全カテゴリ合計のテスト
  • 無料 —— カード不要、インストール不要、Slack 通知不要
  • URL を貼り付けるだけ —— クロール、検査、レポートはお任せ
  • 重大度別に分類、シグナルだけに重複排除
  • 最新の AI 修正プロンプトを Cursor、Claude、Copilot にそのまま貼り付け
無料スキャンを実行

// 最新チェック · 実用的な修正 · 安心してリリース

LLMプロンプトインジェクション — 脆弱性スポットライト | FixVibe · FixVibe