FEAT: set the objective target's system prompt from the CoPyRIT GUI #2056
adrian-gavrila wants to merge 2 commits into
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Pull request overview
This PR brings the framework’s existing prepended_conversation / supports_system_prompt functionality into the CoPyRIT frontend, letting operators set the objective target’s system prompt from the chat composer before the first message is sent.
Changes:
- Adds a collapsible System Prompt UI block above the composer (with disabled state when unsupported and a soft character counter).
- Threads `systemPrompt` state through `ChatWindow` → `ChatInputArea`, and injects `prepended_conversation` into the lazy `createAttack` call only when the active target supports it.
- Adds a small mapper helper (`buildSystemPrompt`) plus request typing (`prepended_conversation` / `PrependedMessageRequest`) and accompanying unit/integration tests.
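
The mapper described above could look roughly like the following. This is a minimal sketch, not the PR's actual code: the fields of `PrependedMessageRequest` (`role`, `content`) are assumptions, since the summary names the type but does not show its shape. The trim-then-`undefined` behavior follows the PR's own description of `buildSystemPrompt`.

```typescript
// Assumed shape: the PR names PrependedMessageRequest but does not list its fields.
interface PrependedMessageRequest {
  role: "system" | "user" | "assistant";
  content: string;
}

// Translate the raw UI string into a prepended_conversation payload.
// Returns undefined for blank input so the field can be omitted from the request.
function buildSystemPrompt(value: string): PrependedMessageRequest[] | undefined {
  const trimmed = value.trim();
  if (trimmed.length === 0) {
    return undefined;
  }
  // A single system-role message at the front of the conversation.
  return [{ role: "system", content: trimmed }];
}
```

Returning `undefined` rather than an empty array lets the caller spread the result into the request without sending an empty `prepended_conversation`.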
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| frontend/src/utils/messageMapper.ts | Adds buildSystemPrompt() to translate the UI string into a prepended_conversation system-role message payload. |
| frontend/src/utils/messageMapper.test.ts | Adds unit tests for buildSystemPrompt() trimming/blank behavior and payload shape. |
| frontend/src/types/index.ts | Extends CreateAttackRequest with prepended_conversation? and introduces PrependedMessageRequest. |
| frontend/src/components/Chat/SystemPromptSetup.tsx | New UI component for configuring a system prompt (collapsible, disabled reason, counter). |
| frontend/src/components/Chat/SystemPromptSetup.test.tsx | New component tests covering toggle behavior, typing, disabled state, and counter behavior. |
| frontend/src/components/Chat/SystemPromptSetup.styles.ts | New Fluent UI v9 styles for the system prompt component. |
| frontend/src/components/Chat/ChatWindow.tsx | Owns systemPrompt state; injects prepended_conversation into createAttack for supported targets; resets on attack clear; passes props down to input area. |
| frontend/src/components/Chat/ChatWindow.test.tsx | Adds integration tests validating show/hide, support gating, and correct forwarding of prepended_conversation. |
| frontend/src/components/Chat/ChatInputArea.tsx | Hosts the new SystemPromptSetup above the composer for brand-new conversations and computes disabled state from capabilities. |

    <SystemPromptSetup
      value={systemPrompt}
      onChange={onSystemPromptChange}
      disabled={!!activeTarget && activeTarget.capabilities?.supports_system_prompt !== true}

Rich agrees with this comment but it is copilot generated:
The supports_system_prompt rule is encoded twice in two different shapes. Here the editor is gated with capabilities?.supports_system_prompt !== true, while ChatWindow gates the send with the inverse (capabilities?.supports_system_prompt ? buildSystemPrompt(...) : undefined). They agree today, but the source of truth is split between the input and the window. If they ever drift you get a silent failure: an enabled editor whose typed value is dropped on send, or vice-versa. Consider deriving one supportsSystemPrompt boolean in ChatWindow and passing it down so the editor's enabled state and the send-time gate can't disagree.
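
One way to act on this suggestion is to compute the capability check once and derive both the editor state and the send-time value from it. The sketch below assumes hypothetical type names (`TargetInfo`, `TargetCapabilities`) and a hypothetical helper `deriveSystemPromptGate`; only `supports_system_prompt` and the two gating expressions come from the diff.

```typescript
// Hypothetical minimal types; only supports_system_prompt appears in the PR.
interface TargetCapabilities {
  supports_system_prompt?: boolean;
}
interface TargetInfo {
  capabilities?: TargetCapabilities | null;
}

interface SystemPromptGate {
  editorDisabled: boolean;          // drives the editor's disabled prop
  promptToSend: string | undefined; // value forwarded at send time
}

// Single source of truth for the capability rule.
function supportsSystemPrompt(target: TargetInfo | null | undefined): boolean {
  return target?.capabilities?.supports_system_prompt === true;
}

// Both gates are derived from the same boolean, so they cannot drift apart.
function deriveSystemPromptGate(
  target: TargetInfo | null | undefined,
  systemPrompt: string,
): SystemPromptGate {
  const supported = supportsSystemPrompt(target);
  const trimmed = systemPrompt.trim();
  return {
    // Mirrors the quoted expression: disabled only when a target is active and unsupported.
    editorDisabled: target != null && !supported,
    promptToSend: supported && trimmed.length > 0 ? trimmed : undefined,
  };
}
```

In a component tree, the parent would call the helper once per render and pass `editorDisabled` down as a prop instead of letting the child re-read `capabilities`.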

      value={systemPrompt}
      onChange={onSystemPromptChange}
      disabled={!!activeTarget && activeTarget.capabilities?.supports_system_prompt !== true}
      disabledReason="This target does not support system prompts."

Rich agrees with this comment but it is copilot generated:
The editor is disabled whenever supports_system_prompt !== true, which also matches the "capabilities not loaded / unknown" case (capabilities == null). But the reason is always "This target does not support system prompts." If capabilities are merely unresolved, that message asserts something untrue. Consider gating this message on an explicit === false.
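
A possible shape for that fix is to treat the flag as three-valued and choose the message per state. This is a sketch under assumptions: the `SupportState` type, both helper names, and the wording of the "unknown" message are hypothetical. Only the "does not support" string is taken from the diff.

```typescript
// Hypothetical minimal type; only supports_system_prompt appears in the PR.
interface TargetCapabilities {
  supports_system_prompt?: boolean;
}

type SupportState = "supported" | "unsupported" | "unknown";

// Distinguish an explicit false from a missing or unresolved capability record.
function systemPromptSupport(caps: TargetCapabilities | null | undefined): SupportState {
  const flag = caps?.supports_system_prompt;
  if (flag === true) return "supported";
  if (flag === false) return "unsupported";
  return "unknown";
}

// Only assert "does not support" when the target explicitly says so.
function disabledReasonFor(state: SupportState): string | undefined {
  switch (state) {
    case "supported":
      return undefined;
    case "unsupported":
      return "This target does not support system prompts.";
    case "unknown":
      // Hypothetical wording for the unresolved case.
      return "Target capabilities are not available yet.";
  }
}
```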

    if (!attackResultId) {
      setMessages([])
      setLoadedConversationId(null)
      setSystemPrompt('')

Rich agrees with this comment but it is copilot generated:
systemPrompt is only reset when attackResultId flips to null, not on target change. If a user types a prompt under a supporting target, switches to a non-supporting one, then sends, the text is retained in state but silently discarded (and the editor is collapsed/disabled so they can't see why). Worth considering resetting or surfacing the retained value on target switch.
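
The "reset on target switch" option can be expressed as a small pure state transition, which keeps it easy to unit test apart from the component. The sketch below is illustrative only: `SystemPromptState`, its `targetId` field, and `onTargetChange` are hypothetical names, and the PR does not show how targets are identified.

```typescript
// Hypothetical state shape; the PR does not show how the active target is keyed.
interface SystemPromptState {
  targetId: string | null;
  systemPrompt: string;
}

// Clear the typed prompt whenever the active target actually changes,
// so no hidden value is carried over to a target that would discard it.
function onTargetChange(
  state: SystemPromptState,
  nextTargetId: string | null,
): SystemPromptState {
  if (state.targetId === nextTargetId) {
    return state; // same target: keep what the user typed
  }
  return { targetId: nextTargetId, systemPrompt: "" };
}
```

The alternative the comment mentions, surfacing the retained value instead of clearing it, would keep `systemPrompt` here and show a notice in the editor.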
Description
Brings the framework's `system_prompt=` capability to the CoPyRIT GUI. A CoPyRIT operator can now type a system prompt into the empty-chat composer before their first message. It is delivered to the target as a single system-role message at the front of the conversation, matching the framework end state.

The backend was already wired for prepended system messages (`CreateAttackRequest.prepended_conversation`, the `POST /attacks` route, `_store_prepended_messages_async`, and the `supports_system_prompt` capability flag), so the gap was frontend-only.

Independent of the framework half, which is in review as open PR #2040.

Changes
- `SystemPromptSetup.tsx` / `.styles.ts` (new): collapsible inline field at the top of the composer. Disabled (greyed, non-expanding, with a reason) when the active target does not support system prompts. Soft character counter.
- `ChatInputArea.tsx`: hosts the field; computes `disabled` from the target's capability flag.
- `ChatWindow.tsx`: owns `systemPrompt` state; injects `prepended_conversation` into the lazy `createAttack` only when the target supports it; resets the value when the attack state clears.
- `messageMapper.ts`: `buildSystemPrompt(value)` translator (trims; `undefined` when blank).
- `types/index.ts`: `prepended_conversation?` and `PrependedMessageRequest`.

The set-once lifecycle is guarded: the prompt is applied only to the lazy create call; the branch/template create paths clone server-side via `source_conversation_id` and never read `systemPrompt`, so there is no double-injection.

Tests and Documentation
`npx tsc --noEmit`; `eslint --max-warnings 0`; doc/samples touched.