feat(proxy): add video/speech/music/sound-effects media endpoints #14
Merged
The sidecar only exposed /v1/images/generations, so OpenAI-compatible clients (LiteLLM) could not reach the gateway's video/audio models: e.g. a call to xai/grok-imagine-video 404'd at the proxy even though the gateway supports it.

- Add POST /v1/videos/generations, /v1/audio/speech, /v1/audio/generations, and /v1/audio/sound-effects, mirroring the existing image route.
- Adapter: video/music/speech/sound_effect async wrappers with Base (dedicated VideoClient/MusicClient/SpeechClient) vs Solana (unified SolanaLLMClient) dispatch; sync SDK clients run in the shared thread pool.

Requires blockrun-llm with Solana media support (SolanaLLMClient.video etc.). Validated live end-to-end on Solana mainnet via the adapter path.
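The adapter dispatch described above could look roughly like the following. This is a minimal sketch, not the PR's code: the two client classes are hypothetical stand-ins that only borrow the names VideoClient and SolanaLLMClient, their method signatures and return shapes are assumptions, and the pool size is arbitrary.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for the SDK clients named in the PR. The real
# blockrun-llm signatures are not shown in this PR text and may differ.
class VideoClient:
    def generate(self, prompt, **params):
        return {"chain": "base", "kind": "video", "prompt": prompt}

class SolanaLLMClient:
    def video(self, prompt, **params):
        return {"chain": "solana", "kind": "video", "prompt": prompt}

# Stands in for the shared thread pool; the size here is an assumption.
_POOL = ThreadPoolExecutor(max_workers=4)

async def video(chain, prompt, **params):
    """Pick the client by chain, then run the blocking call off the event loop."""
    if chain == "solana":
        call = SolanaLLMClient().video      # unified Solana client
    else:
        call = VideoClient().generate       # dedicated Base client
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(_POOL, lambda: call(prompt, **params))
```

The music, speech, and sound_effect wrappers would presumably follow the same shape, differing only in which client method they select.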
113f973 to cc046f8
VickyXAI added a commit that referenced this pull request on Jul 4, 2026
Review follow-up fixing everything flagged on the media-endpoints PR:

- ValueError -> 400 on ALL media routes via a shared _media_endpoint helper (was video-only; music with lyrics + default instrumental=True 500'd instead of returning the SDK's clear message)
- Solana media on a pre-#16 blockrun-llm now degrades to a clear 501 (upgrade hint) instead of an AttributeError 500; pyproject notes the floor bump owed when the SDK release ships
- Long media (video 60-900s, music 60-210s) moved to a dedicated 8-thread pool (BLOCKRUN_LONG_MEDIA_THREADS) and all media routes to their own semaphore (BLOCKRUN_MEDIA_MAX_CONCURRENT, default 20) so a video burst can no longer starve images or brick chat/messages
- Client-supplied budget_seconds/timeout clamped to the 900s server cap (was forwarded verbatim: one request body could pin a worker thread for a day); _run_media wraps every call in asyncio.wait_for so the coroutine + permit always release even if the SDK thread wedges (504)
- Media calls now audit-log via log_proxy_call and surface the in-body settlement txHash as x-blockrun-settlement (paid media traffic was invisible to spend reconciliation)
- asyncio.get_running_loop() in _run_media; VIDEO_PARAM_KEYS single-sourced in the adapter; media client getters collapsed to one cached factory; /v1/responses added to the module endpoint catalog
- tests: full negative-path suite for all 5 media routes (incl. the pre-existing image route) + adapter dispatch/clamp/guard/ceiling tests

Co-authored-by: 1bcMax <viewitter@gmail.com>
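To make the clamp, semaphore, and wait_for interplay concrete, here is a minimal sketch under stated assumptions. The names clamp_budget and run_media, the (status, body) return shape, and the fallback to the cap for missing or non-positive budgets are all hypothetical; only the 900s cap, the 8-thread pool, the default of 20 permits, and the 400/504 mappings come from the commit message.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

SERVER_CAP_SECONDS = 900                        # server cap from the commit message
MEDIA_MAX_CONCURRENT = 20                       # default permit count from the commit message
_LONG_POOL = ThreadPoolExecutor(max_workers=8)  # dedicated long-media pool

def clamp_budget(requested, cap=SERVER_CAP_SECONDS):
    """Clamp a client-supplied budget to the cap (assumed: fall back to cap if unset or <= 0)."""
    if requested is None or requested <= 0:
        return cap
    return min(float(requested), cap)

async def run_media(fn, sem, budget_seconds, pool=_LONG_POOL):
    """Run one blocking media call and return a hypothetical (status, body) pair."""
    timeout = clamp_budget(budget_seconds)
    async with sem:  # the permit is released on every exit path
        loop = asyncio.get_running_loop()
        future = loop.run_in_executor(pool, fn)
        try:
            return 200, await asyncio.wait_for(future, timeout)
        except ValueError as exc:
            # Bad input reported by the SDK becomes a client error.
            return 400, {"error": str(exc)}
        except asyncio.TimeoutError:
            # The coroutine stops waiting; the worker thread itself may still be busy.
            return 504, {"error": "media call timed out"}

async def demo():
    sem = asyncio.Semaphore(MEDIA_MAX_CONCURRENT)
    ok = await run_media(lambda: {"url": "https://example.invalid/v.mp4"}, sem, 86_400)
    bad = await run_media(lambda: int("not-a-number"), sem, 10)  # raises ValueError
    slow = await run_media(lambda: time.sleep(0.3), sem, 0.05)   # exceeds its budget
    return ok[0], bad[0], slow[0]
```

Note that wait_for only frees the coroutine and the semaphore permit; it cannot interrupt the thread, which is presumably why the long-running calls get their own pool.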
What
The sidecar only exposed /v1/images/generations, so OpenAI-compatible clients (LiteLLM) could not reach the gateway's video/audio models: a call to xai/grok-imagine-video 404'd at the proxy even though the gateway supports it.

Adds four OpenAI-shaped media routes, mirroring the existing image route:
- POST /v1/videos/generations: video (default xai/grok-imagine-video; async submit+poll, settles only on completion)
- POST /v1/audio/speech: OpenAI-compatible TTS
- POST /v1/audio/generations: music
- POST /v1/audio/sound-effects: sound effects

Adapter: video/music/speech/sound_effect async wrappers that dispatch Base (dedicated VideoClient/MusicClient/SpeechClient) vs Solana (unified SolanaLLMClient); the sync SDK clients run in the shared thread pool, same pattern as images.

Requires blockrun-llm with Solana media support: BlockRunAI/blockrun-llm#16 (SolanaLLMClient.video etc.). Merge/release that first.

Validation
Tested live on Solana mainnet through the adapter path: grok-imagine-video, speech (flash + turbo), sound-effects, and music all delivered real media URLs; failed video settlements (transient transaction_simulation_failed) correctly took no payment.

Purely additive (307 lines, no deletions); branched off main.
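For a sense of what a client call to the new video route might look like, the sketch below builds (but does not send) a request with the standard library. The body fields model and prompt are an assumption based on the routes being described as OpenAI-shaped, and the localhost:8000 address is a placeholder, not the sidecar's documented address.

```python
import json
import urllib.request

def build_video_request(base_url, prompt, model="xai/grok-imagine-video"):
    """Build a POST for /v1/videos/generations; body fields are assumed, not confirmed."""
    body = json.dumps({"model": model, "prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/videos/generations",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Placeholder address; substitute wherever the sidecar actually listens.
req = build_video_request("http://localhost:8000", "a lighthouse at dusk")
```

Passing req to urllib.request.urlopen would send it; since video generation is described as submit+poll with a long budget, a real client would likely want a generous timeout.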