…rigger on version tags (#18)
* Added GitHub Actions release workflow and updated build workflow to trigger on version tags
* Triggered evaluation-function-base release from release workflow
…g only for tag refs (#22)
* Added OpenAPI request/response validation middleware and integrated OpenAPI specification
* Add embedded µEd OpenAPI specification
* Move µEd OpenAPI spec into runtime/schema
  Relocates the spec from api/ into runtime/schema/ alongside the existing JSON schema files, and renames it to mued_v0.1.0.yml to make the version explicit. Removes the api/ package; embed is now owned by runtime/schema.
* Ignore .idea/ directory
* Make OpenAPI response validation strict for µEd routes
  Previously, responses that failed spec validation were only logged as warnings and forwarded anyway. Now a failed µEd response validation returns 500 to the caller. The legacy / route is unaffected: it has no matching path in the spec, so the middleware passes it through unchanged.
* Update Go version to 1.25 in Dockerfile for builder stage
* Support OpenAPI 3.1.0 spec in router validation
  Pass IsOpenAPI31OrLater and AllowExtraSiblingFields options to the legacy router so description/summary siblings on $ref objects (valid in 3.1.0) don't fail validation. Also propagate errors from OpenAPIMiddleware and NewHttpServer instead of ignoring them.
* Refactor error responses and improve OpenAPI middleware robustness
  Use `writeJSONError` helper for consistent JSON error responses in µEd handler. Enhance OpenAPI response validation to prevent buffer drainage during snapshot handling.
* Add health status response to µEd handler based on test results
* Update µEd test assertion to verify "status" field instead of "tests_passed" field

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* Added Linux-only nsjail-based sandboxing for worker processes, including CLI support, configuration, and testing.
* Added validation for `Content-Length` in `headerPrefixPipe` and tests for oversized and negative values
* Enhanced `build.yml` to compile and install nsjail from source instead of using system package.
* Switched nsjail mode from "once" to "exec" for direct command execution with inherited stdio.
* Replaced `--time_limit` with `--rlimit_cpu` in nsjail arguments to ensure compatibility in containers without cgroupv2.
* Updated sandbox test to replace `--time_limit` with `--rlimit_cpu` and adjusted workflow to run integration tests with elevated permissions.
* Added `MuEdHandler` to handle `/evaluate` and `/evaluate/health` endpoints with authentication and runtime integration, along with associated tests
* Added `workflow_dispatch` trigger to GitHub Actions build workflow
* Removed `NewCommandRoute` and corrected route definitions for `/evaluate` and `/evaluate/health`
* Added `NormalizePath` middleware to canonicalize `/evaluate` and `/evaluate/health` paths across server and lambda integrations
* Added API versioning support for `/evaluate` and `/evaluate/health` endpoints with header validation, default version handling, and capability reporting
* Added OpenAPI request/response validation middleware and integrated OpenAPI specification
* Add embedded µEd OpenAPI specification
* Move µEd OpenAPI spec into runtime/schema
  Relocates the spec from api/ into runtime/schema/ alongside the existing JSON schema files, and renames it to mued_v0.1.0.yml to make the version explicit. Removes the api/ package; embed is now owned by runtime/schema.
* Ignore .idea/ directory
* Make OpenAPI response validation strict for µEd routes
  Previously, responses that failed spec validation were only logged as warnings and forwarded anyway. Now a failed µEd response validation returns 500 to the caller. The legacy / route is unaffected: it has no matching path in the spec, so the middleware passes it through unchanged.
* Simplify µEd response encoding by removing unnecessary "status" field logic

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* Added `MuEdHandler` to handle `/evaluate` and `/evaluate/health` endpoints with authentication and runtime integration, along with associated tests
* Added `workflow_dispatch` trigger to GitHub Actions build workflow
* Removed `NewCommandRoute` and corrected route definitions for `/evaluate` and `/evaluate/health`
* Added `NormalizePath` middleware to canonicalize `/evaluate` and `/evaluate/health` paths across server and lambda integrations
* Added API versioning support for `/evaluate` and `/evaluate/health` endpoints with header validation, default version handling, and capability reporting
* Refactored `/evaluate` and `/evaluate/health` error handling to standardize JSON responses with `writeMuEdError` and included `X-Api-Version` header validation and degraded health status support.
* Added OpenAPI request/response validation middleware and integrated OpenAPI specification
* Add embedded µEd OpenAPI specification
* Move µEd OpenAPI spec into runtime/schema
  Relocates the spec from api/ into runtime/schema/ alongside the existing JSON schema files, and renames it to mued_v0.1.0.yml to make the version explicit. Removes the api/ package; embed is now owned by runtime/schema.
* Ignore .idea/ directory
* Make OpenAPI response validation strict for µEd routes
  Previously, responses that failed spec validation were only logged as warnings and forwarded anyway. Now a failed µEd response validation returns 500 to the caller. The legacy / route is unaffected: it has no matching path in the spec, so the middleware passes it through unchanged.
* Update µEd handler to use dynamic status codes for responses

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
The spec defines task.referenceSolution as a plain object with additionalProperties, not a typed Submission wrapper. Change MuEdTask.ReferenceSolution from *MuEdSubmission to map[string]any and extract its content directly using the submission's type to determine the expected key.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ation

Replace the 1ms timing-based wait with pool.Close() via m.Shutdown, consistent with all other tests in the file that rely on the same background goroutine pattern.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Keep main's version of release.yml which includes the Trigger evaluation-function-base release step. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Use SHIMMY_DEPLOY_TOKEN (PAT) for checkout so the tag push is treated as a user action and triggers the build.yml Docker image workflow. GITHUB_TOKEN-initiated pushes are blocked from triggering other workflows. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Summary
* Use `SHIMMY_DEPLOY_TOKEN` (org-level PAT) for checkout in the release workflow so the tag push triggers `build.yml`
* Previously `GITHUB_TOKEN` was used, which GitHub blocks from triggering other workflows, causing the Docker image build and the `trigger-build` dispatch to evaluation-function-base to never fire

Test plan
* `build.yml` → builds and pushes shimmy Docker image with `latest` tag
* `build_base_images` job dispatches `trigger-build` to evaluation-function-base
* … `latest`

🤖 Generated with Claude Code