Codex GitHub Action Tutorial

The Codex GitHub Action lets you run Codex from a GitHub Actions workflow. That means Codex can review pull requests, generate CI feedback, prepare release notes, or create a patch artifact without you manually opening the Codex CLI.

This tutorial shows the practical workflow: when to use the action, how to structure prompts, how to keep permissions narrow, how to handle outputs, and how to avoid the most common CI security mistakes.

Figure: Hand-drawn Codex GitHub Action workflow. A safe Codex GitHub Action workflow checks out code, runs Codex with a prompt file, captures output, and posts or stores reviewable feedback.

Quick Answer

Use openai/codex-action@v1 when you want Codex to run in CI/CD without installing and authenticating the CLI yourself. Good first workflows are pull request review, release note drafting, migration checks, and CI failure triage.

Start with read-only feedback before giving Codex write permissions. Keep prompt files in the repository, restrict who can trigger the workflow, use the narrowest GitHub token permissions, and avoid exposing API keys to steps that run untrusted repository code.

What The Action Does

The official Codex GitHub Action installs the Codex CLI, starts a Responses API proxy when you provide an API key, and runs codex exec with the permissions and arguments you configure.

Use it when you want:

Codex feedback on pull requests.
Repeatable AI review as part of CI.
Release prep or changelog generation.
Migration checks.
Patch artifacts for failed CI jobs.
A workflow that does not depend on a developer's local machine.

The action is not magic approval. It is automation. You still decide whether the feedback is valid and whether a patch should be merged.

A Good First Workflow: Pull Request Review

Start with a read-only PR review workflow. The job checks out the pull request, runs Codex with a review prompt, and posts the final message back to the PR.

Use a prompt file such as:

# .github/codex/prompts/review.md

Review this pull request for serious issues only.

Focus on:
- correctness bugs
- security or privacy regressions
- missing tests
- behavior outside the stated change

Ignore formatting and naming comments unless they hide a real bug.
Return findings first, ordered by severity, with file and line references when possible.

Prompt files are better than long inline workflow strings because they are easier to review, version, and improve.

Minimal Workflow Shape

A simplified workflow looks like this:

name: Codex pull request review

on:
  pull_request:
    types: [opened, synchronize, reopened]

jobs:
  codex:
    runs-on: ubuntu-latest
    permissions:
      contents: read
    steps:
      - uses: actions/checkout@v5
        with:
          persist-credentials: false

      - name: Run Codex
        uses: openai/codex-action@v1
        with:
          openai-api-key: ${{ secrets.OPENAI_API_KEY }}
          prompt-file: .github/codex/prompts/review.md
          output-file: codex-output.md
          sandbox: read-only

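This first version does not modify code, and it does not post anything yet. It writes the review to codex-output.md so humans can judge the feedback. Posting the message back to the pull request needs an additional step or job with permission to comment.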

Configure Codex Exec

The action maps several inputs to codex exec behavior.

| Input | Use It For |
| --- | --- |
| `prompt` | Inline task instructions |
| `prompt-file` | A repository file with the task prompt |
| `codex-args` | Extra CLI flags such as profiles or output options |
| `model` and `effort` | Agent configuration: which model runs and how much reasoning effort it uses |
| `sandbox` | Permission boundary such as `read-only` or `workspace-write` |
| `output-file` | Save the final Codex message for later steps |
| `codex-version` | Pin a CLI release |
| `codex-home` | Reuse Codex configuration or MCP setup |

For most teams, prompt-file, sandbox, and output-file are the highest-leverage inputs to get right first.
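As a rough illustration of how these inputs combine, the step below sets the three high-leverage inputs plus optional model and version pinning. It is a sketch, not the action's reference configuration: the repository variables `CODEX_MODEL`, `CODEX_EFFORT`, and `CODEX_CLI_VERSION` are hypothetical names you would define yourself, and the values they hold must be ones the action accepts.

```yaml
# Sketch of a single Codex step; place it under `steps:` after checkout.
- name: Run Codex with explicit configuration
  id: run_codex
  uses: openai/codex-action@v1
  with:
    openai-api-key: ${{ secrets.OPENAI_API_KEY }}
    # Reviewed prompt stored in the repository.
    prompt-file: .github/codex/prompts/review.md
    # Final Codex message, saved for later steps.
    output-file: codex-output.md
    # Codex can read the workspace but not modify it.
    sandbox: read-only
    # Hypothetical repository variables; define them under
    # Settings > Secrets and variables > Actions > Variables.
    model: ${{ vars.CODEX_MODEL }}
    effort: ${{ vars.CODEX_EFFORT }}
    codex-version: ${{ vars.CODEX_CLI_VERSION }}
```

Keeping the model and version in repository variables means a configuration change does not need a workflow edit, though it also means the change bypasses pull request review, so weigh that tradeoff for your team.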

Manage Privileges Carefully

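A Codex job on a GitHub-hosted runner can potentially reach whatever the job itself can reach, including the checked-out code and any credentials passed to it. Treat the Codex job like any other automation that receives credentials.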

Use these rules:

Prefer contents: read for feedback-only jobs.
Use persist-credentials: false during checkout unless a later step truly needs push access.
Keep safety-strategy at its safer default unless you understand the tradeoff.
Do not expose OPENAI_API_KEY to arbitrary shell steps that run repository-controlled code.
Restrict who can trigger Codex workflows.
Store prompts in reviewed files instead of building them from untrusted PR text.

If a workflow should create a patch, split responsibilities. Let the Codex job produce a patch artifact with read permissions, then use a separate job with write permissions to apply the patch and open a PR.
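The split described above could look roughly like the workflow below. Treat it as a sketch under assumptions: the prompt file `.github/codex/prompts/fix.md`, the artifact name, and the branch naming are hypothetical, and opening a pull request from a workflow only works if the repository setting that allows GitHub Actions to create pull requests is enabled.

```yaml
name: Codex patch proposal

on:
  workflow_dispatch:   # manual trigger; requires write access to the repository

jobs:
  propose:
    runs-on: ubuntu-latest
    permissions:
      contents: read                 # this job cannot push
    steps:
      - uses: actions/checkout@v5
        with:
          persist-credentials: false

      - name: Run Codex
        uses: openai/codex-action@v1
        with:
          openai-api-key: ${{ secrets.OPENAI_API_KEY }}
          prompt-file: .github/codex/prompts/fix.md   # hypothetical prompt file
          sandbox: workspace-write                     # Codex may edit the checkout only

      - name: Save workspace changes as a patch
        run: |
          # Stage everything so new files are included, then write the diff
          # outside the workspace so the patch never contains itself.
          git add -A
          git diff --cached --binary > "$RUNNER_TEMP/codex.patch"

      - uses: actions/upload-artifact@v4
        with:
          name: codex-patch
          path: ${{ runner.temp }}/codex.patch

  open-pr:
    needs: propose
    runs-on: ubuntu-latest
    permissions:
      contents: write                # only this job can push
      pull-requests: write
    steps:
      - uses: actions/checkout@v5

      - uses: actions/download-artifact@v4
        with:
          name: codex-patch
          path: ${{ runner.temp }}

      - name: Apply patch and open a pull request
        env:
          GH_TOKEN: ${{ github.token }}
        run: |
          # Stop quietly when Codex produced no changes.
          if [ ! -s "$RUNNER_TEMP/codex.patch" ]; then
            echo "Empty patch, nothing to propose."
            exit 0
          fi
          git config user.name "github-actions[bot]"
          git config user.email "41898282+github-actions[bot]@users.noreply.github.com"
          branch="codex/patch-$GITHUB_RUN_ID"
          git switch -c "$branch"
          git apply --index "$RUNNER_TEMP/codex.patch"
          git commit -m "Apply Codex patch"
          git push origin "$branch"
          gh pr create --base "$GITHUB_REF_NAME" --head "$branch" \
            --title "Codex patch proposal" \
            --body "Generated by workflow run $GITHUB_RUN_ID. Review before merging."
```

The job that holds the API key never has a token that can push, and the job that can push never runs Codex. The patch itself is still untrusted input, so the resulting pull request goes through normal review.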

Use AGENTS.md For Review Standards

Codex works better when repository expectations are durable. Add review rules to AGENTS.md:

## Review guidelines

- Prioritize correctness, security, privacy, auth, billing, data loss, and broken tests.
- Ignore pure style comments unless they hide a bug.
- Check that API routes preserve authentication and rate limits.
- Flag any logging of tokens, PII, payment data, or private user content.

This keeps CI feedback aligned with your team's review culture.

Capture Outputs

The action emits a final message output and can write the final Codex message to a file. Use that for comments, artifacts, or downstream summaries.

Common patterns:

Post a PR comment.
Upload codex-output.md as an artifact.
Save a patch artifact.
Extract structured JSON if your prompt and codex-args enforce a schema.

Do not hide the output. A useful AI workflow should leave an audit trail.
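The first two patterns can be combined in one workflow, sketched below. It assumes the action's final message output is named `final-message`; the job names, the artifact name, and the `CODEX_FINAL_MESSAGE` environment variable are illustrative choices.

```yaml
name: Codex pull request review with feedback

on:
  pull_request:
    types: [opened, synchronize, reopened]

jobs:
  codex:
    runs-on: ubuntu-latest
    permissions:
      contents: read
    outputs:
      # Assumes the action exposes an output named final-message.
      final_message: ${{ steps.run_codex.outputs.final-message }}
    steps:
      - uses: actions/checkout@v5
        with:
          persist-credentials: false

      - name: Run Codex
        id: run_codex
        uses: openai/codex-action@v1
        with:
          openai-api-key: ${{ secrets.OPENAI_API_KEY }}
          prompt-file: .github/codex/prompts/review.md
          output-file: codex-output.md
          sandbox: read-only

      - name: Keep the raw output as an audit trail
        uses: actions/upload-artifact@v4
        with:
          name: codex-output
          path: codex-output.md

  post_feedback:
    needs: codex
    if: needs.codex.outputs.final_message != ''
    runs-on: ubuntu-latest
    permissions:
      issues: write
      pull-requests: write
    steps:
      - name: Post Codex feedback on the pull request
        uses: actions/github-script@v7
        env:
          # Pass the message through the environment instead of
          # interpolating it into the script source.
          CODEX_FINAL_MESSAGE: ${{ needs.codex.outputs.final_message }}
        with:
          script: |
            await github.rest.issues.createComment({
              owner: context.repo.owner,
              repo: context.repo.repo,
              issue_number: context.payload.pull_request.number,
              body: process.env.CODEX_FINAL_MESSAGE,
            });
```

Posting happens in a separate job so that comment permissions never sit next to the API key. For pull requests from forks the token is read-only and secrets are unavailable, so this shape fits same-repository pull requests.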

Good First Use Cases

Start with workflows that are helpful but not dangerous:

PR risk summary.
Missing test review.
Release note draft.
Changelog consistency check.
Documentation drift check.
CI failure explanation.

Move to autofix workflows only after the read-only feedback is consistently useful.
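As one example of a low-risk starting point, a release note draft can run on a manual trigger and store its result as an artifact. The sketch below is illustrative: the `since_tag` input, the `commits.txt` file, and the prompt wording are all assumptions to adapt.

```yaml
name: Codex release notes draft

on:
  workflow_dispatch:
    inputs:
      since_tag:
        description: "Tag to compare against"
        required: true

jobs:
  draft:
    runs-on: ubuntu-latest
    permissions:
      contents: read
    steps:
      - uses: actions/checkout@v5
        with:
          fetch-depth: 0               # full history so the tag range resolves
          persist-credentials: false

      - name: Collect commit subjects since the tag
        env:
          SINCE_TAG: ${{ inputs.since_tag }}
        run: git log --oneline "$SINCE_TAG"..HEAD > commits.txt

      - name: Draft release notes
        uses: openai/codex-action@v1
        with:
          openai-api-key: ${{ secrets.OPENAI_API_KEY }}
          prompt: |
            Read commits.txt in the repository root and draft release notes.
            Group entries into features, fixes, and breaking changes.
            Do not describe changes that are not in the list.
          output-file: release-notes-draft.md
          sandbox: read-only

      - uses: actions/upload-artifact@v4
        with:
          name: release-notes-draft
          path: release-notes-draft.md
```

Nothing here writes to the repository or publishes a release. A maintainer downloads the draft, edits it, and decides what ships.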

Common Mistakes

The first mistake is giving Codex write permissions before the team trusts its feedback.

The second mistake is using untrusted PR text directly as a prompt. Pull request text can contain prompt injection or hidden instructions.

The third mistake is running setup scripts with API keys available to the environment. Keep secret exposure narrow.

The fourth mistake is posting noisy review comments. A useful Codex review should focus on serious, actionable findings.

The fifth mistake is treating CI feedback as a merge decision. Codex is one reviewer, not the release manager.
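The second and third mistakes are the easiest to prevent in the workflow file itself. The sketch below shows one way to do it; `./scripts/setup.sh` is a hypothetical setup script standing in for whatever repository-controlled code your job runs.

```yaml
name: Codex pull request review

on:
  pull_request:
    types: [opened, synchronize, reopened]

jobs:
  codex:
    runs-on: ubuntu-latest
    # Only run for branches in this repository, not for forks.
    if: github.event.pull_request.head.repo.full_name == github.repository
    permissions:
      contents: read
    steps:
      - uses: actions/checkout@v5
        with:
          persist-credentials: false

      # Repository-controlled code runs here, so this step gets no secrets.
      - name: Install dependencies
        run: ./scripts/setup.sh   # hypothetical setup script

      - name: Run Codex
        uses: openai/codex-action@v1
        with:
          # The key is an input to this step only. It is not set in a
          # job-level or workflow-level `env:` block.
          openai-api-key: ${{ secrets.OPENAI_API_KEY }}
          # The prompt comes from a reviewed file, not from the PR title or body.
          prompt-file: .github/codex/prompts/review.md
          sandbox: read-only
```

A pull request can also change the prompt file itself, so treat edits under .github/codex/ with the same care as edits to the workflow.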

Bottom Line

The Codex GitHub Action is best for repeatable, reviewable automation. Start with read-only PR feedback, keep permissions narrow, store prompts in the repo, capture outputs, and review every result.

Once the feedback earns trust, you can expand into patch artifacts, release workflows, and more advanced CI assistance.

Decision Checklist

Use this guide as a decision filter before a sales call, trial, or migration plan. For the Codex GitHub Action, the practical question is whether running Codex in CI for AI code review leads to a measurable workflow outcome. A good decision should improve delivery speed, quality, cost control, or operational confidence without creating hidden review, security, or migration work.

Generated changes survive code review with fewer rewrites, fewer broad diffs, and fewer style corrections.
The assistant understands multi-file context, tests, build failures, private repository rules, and local conventions.
Administrators can manage seats, data controls, policy settings, and usage visibility without blocking developers.

Pilot Plan

A useful pilot is small enough to finish quickly but realistic enough to expose integration, data, workflow, and pricing issues. Avoid demo-only tests. The trial should use real tasks, real constraints, and a baseline from the current process so the team can decide with evidence instead of impressions.

Give each candidate the same bug fix, failing-test repair, refactor, and explanation task.
Track accepted diffs, reviewer comments, rework time, test pass rate, and developer satisfaction.
Run the trial with senior maintainers and newer engineers because the value pattern is different for each group.

Metrics To Track

Track metrics that connect the Codex GitHub Action to outcomes a budget owner and an engineering owner can both understand. A tool can look impressive in a demo and still fail if usage is low, quality is uneven, or the cost model changes under real workload volume.

Accepted AI-assisted diffs, rejected suggestions, reviewer comments, and post-merge fixes.
Time to repair failing tests, explain unfamiliar modules, and complete safe refactors.
Seat utilization, premium request exhaustion, and policy exceptions for sensitive repositories.

Budget And Risk Review

A sound AI tooling decision accounts for the subscription or API price, but also for support load, review time, observability, privacy controls, switching cost, and the cost of wrong or low-quality output. Treat the first estimate as a working model and update it with production evidence.

Confirm private code handling, training opt-out, data retention, and enterprise policy controls.
Watch for over-generation: large patches that look productive but increase review cost.
Compare cost per accepted change rather than cost per seat alone.

Revisit the assistant after 30 days of real pull requests. A useful coding tool should reduce review latency and onboarding friction without increasing risky generated code.