Fix arena moderation skipping right model conversation history#3889

Open
Chessing234 wants to merge 1 commit into
lm-sys:mainfrom
Chessing234:fix/arena-moderation-missing-model-b-history

Conversation

@Chessing234

In side-by-side arena mode, the moderation filter builds all_conv_text from both models' histories before each turn. Two files had a copy-paste bug: all_conv_text_right read states[0] (the left model) instead of states[1] (the right model), so Model B's responses were never checked. The vision anonymous arena was worse: it passed only the current user message as all_conv_text, so no prior history from either model was moderated.

This matches the fix already in gradio_block_arena_named.py (34eca62).
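
For reviewers, here is a minimal, self-contained sketch of the indexing bug and its fix. It is not the actual diff: FakeConv, FakeState, the two build_moderation_text_* helpers, and the 1000-character tail are hypothetical stand-ins chosen for illustration, and the real code may assemble all_conv_text differently.

```python
from dataclasses import dataclass, field


@dataclass
class FakeConv:
    """Hypothetical stand-in for a per-model conversation object."""
    messages: list = field(default_factory=list)

    def get_prompt(self) -> str:
        # Flatten (role, message) pairs into one string for moderation.
        return "\n".join(f"{role}: {msg}" for role, msg in self.messages)


@dataclass
class FakeState:
    """Hypothetical stand-in for one side's arena state."""
    conv: FakeConv


def build_moderation_text_buggy(states, text, tail=1000):
    left = states[0].conv.get_prompt()
    right = states[0].conv.get_prompt()  # bug: reads the left model twice
    return left[-tail:] + right[-tail:] + "\nuser: " + text


def build_moderation_text_fixed(states, text, tail=1000):
    left = states[0].conv.get_prompt()
    right = states[1].conv.get_prompt()  # fix: read the right model
    return left[-tail:] + right[-tail:] + "\nuser: " + text


states = [
    FakeState(FakeConv([("user", "hi"), ("assistant", "left reply")])),
    FakeState(FakeConv([("user", "hi"), ("assistant", "right reply")])),
]
buggy = build_moderation_text_buggy(states, "next question")
fixed = build_moderation_text_fixed(states, "next question")

# With the bug, the right model's reply never reaches the filter input.
print("right reply" in buggy)
print("right reply" in fixed)
```

The vision anonymous case is the degenerate version of the same problem: if only the current user message is passed, neither history string is included at all.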

Fixes #3834, #3794

Made with Cursor

In side-by-side arena modes, moderation was reading states[0] twice
instead of states[1] for the right model, and vision anony passed only
the current user message instead of full conversation history.

Co-authored-by: Cursor <cursoragent@cursor.com>
Development

Successfully merging this pull request may close these issues.

[Security] Content Moderation Bypass in Arena Side-by-Side Views (Incomplete Fix for 34eca62)
