Daan Schouteden 84db121a4c Accept review_requested on pull_request webhook events.
Gitea sends review requests as pull_request/review_requested, not only pull_request_review_request.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-03 10:59:22 +02:00

# gitea-pr-review-bot
Central webhook service that reviews Gitea pull requests using the Cursor SDK and posts a full Gitea review (summary + inline comments).

This bot is designed to run once for many repositories in `Bram/*` instead of duplicating workflows in every repo.
## What This Bot Does
- Listens to Gitea webhook events.
- Triggers on:
  - `pull_request` with action `opened`
  - `pull_request` or `pull_request_review_request` with action `review_requested`, when the requested reviewer is the bot user
- Loads PR metadata and changed files from Gitea.
- Builds a review prompt (including optional repo-specific rule files).
- Calls Cursor Cloud Agent (`Agent.prompt`) for structured review output.
- Validates and posts a single consolidated Gitea review.
- Removes itself from requested reviewers after successful review.
- Prevents duplicate processing by dedupe key:
  - `{owner}/{repo}#{pr_number}#{head_sha}`
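
The trigger rules and dedupe key above can be sketched roughly as follows. This is a minimal illustration, not the code in `src/webhook/event-router.ts`: the `RoutedEvent` shape and the function names `isReviewTrigger` and `dedupeKey` are hypothetical.

```typescript
// Hypothetical, simplified event shape; the real typings live in src/types/events.ts.
interface RoutedEvent {
  eventType: string; // webhook event type, e.g. "pull_request"
  action: string; // e.g. "opened" or "review_requested"
  requestedReviewer?: string; // login of the requested reviewer, if any
  owner: string;
  repo: string;
  prNumber: number;
  headSha: string;
}

// Returns true when the event matches one of the documented triggers.
function isReviewTrigger(ev: RoutedEvent, botLogin: string): boolean {
  if (ev.eventType === "pull_request" && ev.action === "opened") {
    return true;
  }
  const isReviewRequest =
    (ev.eventType === "pull_request" ||
      ev.eventType === "pull_request_review_request") &&
    ev.action === "review_requested";
  // Review requests only count when they are addressed to the bot user.
  return isReviewRequest && ev.requestedReviewer === botLogin;
}

// Builds the key in the documented format: {owner}/{repo}#{pr_number}#{head_sha}
function dedupeKey(ev: RoutedEvent): string {
  return `${ev.owner}/${ev.repo}#${ev.prNumber}#${ev.headSha}`;
}
```

Because the head SHA is part of the key, a new push to the same PR produces a new key and can be reviewed again.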
## High-Level Flow
1. `src/server.ts` receives webhook payload.
2. `src/webhook/verify-signature.ts` validates HMAC (`WEBHOOK_SECRET`).
3. `src/webhook/event-router.ts` accepts or ignores event by type/action.
4. `src/run/review-runner.ts` orchestrates full review run.
5. `src/config/load-repo-config.ts` loads optional per-repo overrides.
6. `src/gitea/client.ts` fetches PR + files and writes reviews.
7. `src/prompt/build-review-prompt.ts` assembles review instructions/context.
8. `src/cursor/review-agent.ts` invokes Cursor SDK in cloud mode.
9. `src/cursor/review-schema.ts` validates structured JSON response.
10. `src/gitea/review-api.ts` posts review and handles inline comment fallback.
11. `src/gitea/reviewer-api.ts` removes bot reviewer from PR.
## Architecture Diagram
```mermaid
flowchart TD
A[Gitea Webhook] --> B[src/server.ts]
B --> C[verify-signature.ts]
C --> D[event-router.ts]
D --> E[review-runner.ts]
E --> F[dedupe-store.ts]
E --> G[should-process-event.ts]
E --> H[load-repo-config.ts]
E --> I[gitea/client.ts]
E --> J[build-review-prompt.ts]
E --> K[review-agent.ts]
K --> L[Cursor Cloud Agent.prompt]
K --> M[review-schema.ts]
E --> N[review-api.ts]
N --> O["POST /pulls/{index}/reviews"]
N --> P[DELETE prior bot reviews]
E --> Q[reviewer-api.ts]
Q --> R[PATCH reviewers remove bot]
```
## Project Structure
### Server and Webhook Layer
- `src/server.ts`
  - HTTP server, `/healthz`, `/webhooks/gitea`
  - Signature check + event routing + response codes
- `src/webhook/verify-signature.ts`
  - SHA256 HMAC validation with timing-safe compare
- `src/webhook/event-router.ts`
  - Converts raw webhook headers/payload into supported routed events
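
A minimal sketch of the SHA256 HMAC check with a timing-safe compare, using Node's built-in `node:crypto` module. It assumes the signature is the hex-encoded HMAC-SHA256 of the raw request body (the format Gitea documents for its `X-Gitea-Signature` header). The function name `verifySignature` is hypothetical and may not match `src/webhook/verify-signature.ts`.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Assumption: `signatureHex` is the hex-encoded HMAC-SHA256 of the raw request body.
function verifySignature(
  rawBody: string,
  signatureHex: string,
  secret: string,
): boolean {
  const expected = createHmac("sha256", secret).update(rawBody).digest();
  const received = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws when lengths differ, so reject those up front.
  if (received.length !== expected.length) {
    return false;
  }
  return timingSafeEqual(received, expected);
}
```

The HMAC has to be computed over the exact bytes received, so the raw body should be captured before any JSON parsing.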
### Domain Logic
- `src/domain/should-process-event.ts`
  - Loop guard (skip bot-originated events)
  - Repo enable/disable check
  - Label skip logic
  - Base branch filtering
- `src/domain/dedupe-store.ts`
  - In-memory TTL dedupe store for idempotency
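
An in-memory TTL dedupe store could look roughly like the sketch below. The class name `DedupeStore`, the `claim` method, and the injectable clock are assumptions made for illustration; the real `src/domain/dedupe-store.ts` may be structured differently.

```typescript
// Minimal in-memory TTL store; names are hypothetical.
class DedupeStore {
  private readonly expiresAt = new Map<string, number>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  // `now` is injectable so that tests can control the clock.
  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Returns true if the key was newly claimed, false if it is still held.
  claim(key: string): boolean {
    const t = this.now();
    // Drop expired entries so the map does not grow without bound.
    this.expiresAt.forEach((expiry, k) => {
      if (expiry <= t) {
        this.expiresAt.delete(k);
      }
    });
    if (this.expiresAt.has(key)) {
      return false;
    }
    this.expiresAt.set(key, t + this.ttlMs);
    return true;
  }
}
```

A runner would call `claim` with the dedupe key before starting a review and skip the event when it returns `false`.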
### Gitea Integration
- `src/gitea/client.ts`
  - Typed wrapper around Gitea REST endpoints
  - Pull details/files/reviews, create/delete review, patch pull
  - Optional content-file loading from repository
- `src/gitea/review-api.ts`
  - Delete prior bot reviews
  - Post one consolidated review
  - Validate inline comments against changed-file paths and positions
  - Summary-only fallback if inline set is invalid
- `src/gitea/reviewer-api.ts`
  - Remove bot from requested reviewers list
### Cursor Integration
- `src/cursor/review-agent.ts`
  - Calls Cursor SDK `Agent.prompt` in cloud runtime
  - Enforces review timeout
  - Extracts text from SDK response formats
- `src/cursor/review-schema.ts`
  - Zod schema for strict review JSON contract:
    - `verdict`, `event`, `body`, `comments[]`
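
One common way to enforce a timeout around an asynchronous call such as the review request is to race it against a timer. The helper below is a generic sketch: `withTimeout` is a hypothetical name, and it makes no assumptions about the Cursor SDK beyond the call returning a promise.

```typescript
// Rejects if `work` does not settle within `timeoutMs`.
// Note: this stops waiting, but it does not cancel the underlying work.
function withTimeout<T>(work: Promise<T>, timeoutMs: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => {
      reject(new Error(`Review timed out after ${timeoutMs} ms`));
    }, timeoutMs);
    work.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (error) => {
        clearTimeout(timer);
        reject(error);
      },
    );
  });
}
```

The timeout value would typically come from `REVIEW_TIMEOUT_MS`.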
### Prompting and Config
- `src/prompt/build-review-prompt.ts`
  - Builds complete PR review prompt from title/body/files/patches/rules
- `src/config/load-repo-config.ts`
  - Reads `.gitea/pr-review-bot.yml` when present
  - Loads optional repo-local rule files:
    - `AGENTS.md`
    - `docs/pr-review.md`
    - `.cursor/skills/pr-review/SKILL.md`
    - `.cursor/skills/pr-review/gitea.md`
### Orchestration and Utilities
- `src/run/review-runner.ts`
  - Main end-to-end review execution
  - Dedupe, retry wrappers, posting, reviewer self-removal
- `src/run/retry.ts`
  - Exponential backoff retry helper for transient failures
- `src/types/events.ts`
  - Gitea webhook payload typings
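
An exponential backoff retry helper can be sketched as below. The option names and the injectable `sleep` are assumptions for illustration. This sketch also retries every error, whereas a real helper such as `src/run/retry.ts` would presumably retry only transient failures.

```typescript
interface RetryOptions {
  attempts: number; // total tries, including the first (assumed >= 1)
  baseDelayMs: number; // delay before the second try; doubles after each failure
  sleep?: (ms: number) => Promise<void>; // injectable for tests
}

async function retry<T>(
  fn: () => Promise<T>,
  options: RetryOptions,
): Promise<T> {
  const sleep =
    options.sleep ??
    ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));
  let lastError: unknown;
  for (let attempt = 0; attempt < options.attempts; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      // Wait baseDelayMs, 2x, 4x, ... but not after the final attempt.
      if (attempt < options.attempts - 1) {
        await sleep(options.baseDelayMs * 2 ** attempt);
      }
    }
  }
  throw lastError;
}
```

With `attempts: 4` and `baseDelayMs: 100`, the waits between tries are 100 ms, 200 ms, and 400 ms.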
### Scripts and Deployment
- `scripts/dry-run.ts`
  - Manual run by `owner/repo/pr`
- `Dockerfile`
  - Production container image
- `docker-compose.yml`
  - Local/hosted compose deployment
- `docs/setup.md`
  - Installation and webhook setup
- `docs/operations.md`
  - Runtime behavior, failure handling, rollback
## Environment Variables
Required:

- `CURSOR_API_KEY`
- `GITEA_TOKEN`
- `GITEA_BASE_URL`
- `GITEA_BOT_LOGIN`
- `WEBHOOK_SECRET`
- `PORT`

Optional:

- `DEFAULT_BASE_BRANCH` (default: `main`)
- `MAX_INLINE_COMMENTS` (default: `5`)
- `REVIEW_TIMEOUT_MS` (default: `120000`)
- `DEDUPE_TTL_SECONDS` (default: `1800`)
- `LOG_LEVEL` (default: `info`; options: `debug`, `info`, `warn`, `error`)

See `.env.example`.
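
As an illustration of how these variables and defaults fit together, here is a minimal config loader sketch. The `BotConfig` shape and the helper names are hypothetical. Only the variable names and default values come from the lists above.

```typescript
type LogLevel = "debug" | "info" | "warn" | "error";
type Env = Record<string, string | undefined>;

// Hypothetical config shape; field names are illustrative.
interface BotConfig {
  cursorApiKey: string;
  giteaToken: string;
  giteaBaseUrl: string;
  giteaBotLogin: string;
  webhookSecret: string;
  port: number;
  defaultBaseBranch: string;
  maxInlineComments: number;
  reviewTimeoutMs: number;
  dedupeTtlSeconds: number;
  logLevel: LogLevel;
}

const LOG_LEVELS: readonly string[] = ["debug", "info", "warn", "error"];

function isLogLevel(value: string): value is LogLevel {
  return LOG_LEVELS.indexOf(value) !== -1;
}

// Reads a required variable and fails fast when it is missing or empty.
function required(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Reads an integer variable, using `fallback` when it is unset.
function intOr(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw === "") {
    return fallback;
  }
  const parsed = Number.parseInt(raw, 10);
  if (Number.isNaN(parsed)) {
    throw new Error(`${name} must be an integer, got "${raw}"`);
  }
  return parsed;
}

function loadConfig(env: Env): BotConfig {
  const logLevel = env["LOG_LEVEL"] ?? "info";
  if (!isLogLevel(logLevel)) {
    throw new Error(`Unsupported LOG_LEVEL: ${logLevel}`);
  }
  required(env, "PORT"); // PORT has no default, so it must be set
  return {
    cursorApiKey: required(env, "CURSOR_API_KEY"),
    giteaToken: required(env, "GITEA_TOKEN"),
    giteaBaseUrl: required(env, "GITEA_BASE_URL"),
    giteaBotLogin: required(env, "GITEA_BOT_LOGIN"),
    webhookSecret: required(env, "WEBHOOK_SECRET"),
    port: intOr(env, "PORT", 0),
    defaultBaseBranch: env["DEFAULT_BASE_BRANCH"] ?? "main",
    maxInlineComments: intOr(env, "MAX_INLINE_COMMENTS", 5),
    reviewTimeoutMs: intOr(env, "REVIEW_TIMEOUT_MS", 120000),
    dedupeTtlSeconds: intOr(env, "DEDUPE_TTL_SECONDS", 1800),
    logLevel,
  };
}
```

In a Node process this would be called as `loadConfig(process.env)` at startup, so that a missing secret fails immediately rather than on the first webhook.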
## Review Output Contract
The Cursor response must be strict JSON:
- `verdict`: `approve | request_changes | comment`
- `event`: `APPROVE | REQUEST_CHANGES | COMMENT`
- `body`: string
- `comments`: array of:
  - `path` (changed file path)
  - `new_position` (integer >= 1)
  - `body` (comment text)

Post-validation rules:

- Invalid path/position in inline comments => post summary-only review.
- Inline comments are clamped to configured maximum.
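
The contract and the post-validation rules can be expressed in TypeScript roughly as follows. This is a sketch under assumptions: `finalizeReview` is a hypothetical name, a comment is treated as valid when its path is a changed file and `new_position` is an integer >= 1, and the position check against the actual diff is omitted. The order of validation and clamping is also an assumption.

```typescript
interface InlineComment {
  path: string; // must be one of the changed file paths
  new_position: number; // integer >= 1
  body: string;
}

interface ReviewOutput {
  verdict: "approve" | "request_changes" | "comment";
  event: "APPROVE" | "REQUEST_CHANGES" | "COMMENT";
  body: string;
  comments: InlineComment[];
}

// Applies the two documented rules: summary-only fallback, then clamping.
function finalizeReview(
  review: ReviewOutput,
  changedPaths: string[],
  maxInlineComments: number,
): ReviewOutput {
  const changed = new Set(changedPaths);
  const allValid = review.comments.every(
    (c) =>
      changed.has(c.path) &&
      Number.isInteger(c.new_position) &&
      c.new_position >= 1,
  );
  if (!allValid) {
    // Any invalid path/position: keep the summary, drop all inline comments.
    return { ...review, comments: [] };
  }
  return { ...review, comments: review.comments.slice(0, maxInlineComments) };
}
```

Dropping the whole inline set keeps the review postable even when the model references a file or line that is not part of the diff.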
## Getting Started
1. Copy env template:
   - `cp .env.example .env`
2. Fill secrets/tokens.
3. Install:
   - `npm install`
4. Run locally:
   - `npm run dev`
5. Verify health (use the port set in `PORT`; `8787` is shown here):
   - `curl http://localhost:8787/healthz`
## Useful Commands
- Type check: `npm run check`
- Build: `npm run build`
- Start built app: `npm run start`
- Dry run: `npm run dry-run -- <owner> <repo> <pr_number> [head_sha]`
## Operational Notes
- Keep bot credentials scoped to least privilege.
- Do not log tokens or raw auth headers.
- Dedupe is in-memory (resets on restart).
- The recommended deployment model is a central service with an org-level webhook and per-repo opt-out config.
- Logs are structured JSON to simplify filtering in Docker/log collectors.