# gitea-pr-review-bot

Central webhook service that reviews Gitea pull requests using the Cursor SDK and posts a full Gitea review (summary + inline comments).

This bot is designed to run as one central instance serving many repositories in `Bram/*`, rather than duplicating workflows in every repo.

## What This Bot Does

- Listens to Gitea webhook events.
- Triggers on:
  - `pull_request` with action `opened`
  - `pull_request` or `pull_request_review_request` with action `review_requested`, when the requested reviewer is the bot user
- Loads PR metadata and changed files from Gitea.
- Builds a review prompt (including optional repo-specific rule files).
- Calls Cursor Cloud Agent (`Agent.prompt`) for structured review output.
- Validates and posts a single consolidated Gitea review.
- Removes itself from requested reviewers after successful review.
- Prevents duplicate processing with a dedupe key:
  - `{owner}/{repo}#{pr_number}#{head_sha}`
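
The trigger rules and dedupe key above can be sketched as two small functions. This is a minimal illustration, not the bot's actual router: the `IncomingEvent` shape, its `requestedReviewerLogin` field, and the function names `shouldTrigger` and `dedupeKey` are hypothetical.

```typescript
// Hypothetical, simplified view of an incoming webhook event.
interface IncomingEvent {
  eventType: string;               // e.g. "pull_request"
  action: string;                  // e.g. "opened", "review_requested"
  requestedReviewerLogin?: string; // assumed field, set on review requests
}

// Returns true when the event matches the trigger rules listed above.
function shouldTrigger(event: IncomingEvent, botLogin: string): boolean {
  if (event.eventType === "pull_request" && event.action === "opened") {
    return true;
  }
  const isReviewRequestEvent =
    event.eventType === "pull_request" ||
    event.eventType === "pull_request_review_request";
  return (
    isReviewRequestEvent &&
    event.action === "review_requested" &&
    event.requestedReviewerLogin === botLogin
  );
}

// Builds the dedupe key in the `{owner}/{repo}#{pr_number}#{head_sha}` format.
function dedupeKey(owner: string, repo: string, prNumber: number, headSha: string): string {
  return `${owner}/${repo}#${prNumber}#${headSha}`;
}
```

Because the key includes the head SHA, a new push to the same PR yields a new key, so the PR can be reviewed again when a later event arrives for it.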

## High-Level Flow

1. `src/server.ts` receives webhook payload.
2. `src/webhook/verify-signature.ts` validates HMAC (`WEBHOOK_SECRET`).
3. `src/webhook/event-router.ts` accepts or ignores event by type/action.
4. `src/run/review-runner.ts` orchestrates full review run.
5. `src/config/load-repo-config.ts` loads optional per-repo overrides.
6. `src/gitea/client.ts` fetches PR + files and writes reviews.
7. `src/prompt/build-review-prompt.ts` assembles review instructions/context.
8. `src/cursor/review-agent.ts` invokes Cursor SDK in cloud mode.
9. `src/cursor/review-schema.ts` validates structured JSON response.
10. `src/gitea/review-api.ts` posts review and handles inline comment fallback.
11. `src/gitea/reviewer-api.ts` removes bot reviewer from PR.

## Architecture Diagram

```mermaid
flowchart TD
  A[Gitea Webhook] --> B[src/server.ts]
  B --> C[verify-signature.ts]
  C --> D[event-router.ts]
  D --> E[review-runner.ts]

  E --> F[dedupe-store.ts]
  E --> G[should-process-event.ts]
  E --> H[load-repo-config.ts]
  E --> I[gitea/client.ts]
  E --> J[build-review-prompt.ts]
  E --> K[review-agent.ts]
  K --> L[Cursor Cloud Agent.prompt]
  K --> M[review-schema.ts]

  E --> N[review-api.ts]
  N --> O["POST /pulls/{index}/reviews"]
  N --> P[DELETE prior bot reviews]

  E --> Q[reviewer-api.ts]
  Q --> R[PATCH reviewers remove bot]
```

## Project Structure

### Server and Webhook Layer

- `src/server.ts`
  - HTTP server, `/healthz`, `/webhooks/gitea`
  - Signature check + event routing + response codes
- `src/webhook/verify-signature.ts`
  - SHA256 HMAC validation with timing-safe compare
- `src/webhook/event-router.ts`
  - Converts raw webhook headers/payload into supported routed events
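
The SHA256 HMAC check with a timing-safe compare can be sketched with Node's built-in `crypto` module. This is a minimal sketch, not the contents of `verify-signature.ts`: it assumes the signature arrives hex-encoded and is computed over the raw request body, and the function name and parameters are illustrative.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Verifies a hex-encoded HMAC-SHA256 signature over the raw request body.
// `signatureHex` is assumed to come from the webhook signature header.
function verifySignature(rawBody: string | Buffer, signatureHex: string, secret: string): boolean {
  const expected = createHmac("sha256", secret).update(rawBody).digest();
  const received = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  if (received.length !== expected.length) {
    return false;
  }
  return timingSafeEqual(received, expected);
}
```

The signature must be computed over the exact bytes received, so the check should run before the body is parsed as JSON and re-serialized.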

### Domain Logic

- `src/domain/should-process-event.ts`
  - Loop guard (skip bot-originated events)
  - Repo enable/disable check
  - Label skip logic
  - Base branch filtering
- `src/domain/dedupe-store.ts`
  - In-memory TTL dedupe store for idempotency
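
An in-memory TTL dedupe store can be as small as a map from key to expiry time. The sketch below is one possible shape, not the actual `dedupe-store.ts`: the class name, the `claim` and `sweep` methods, and the injectable clock are assumptions made for illustration.

```typescript
// Minimal in-memory TTL store. Names are illustrative, not the bot's API.
class DedupeStore {
  private readonly expiresAt = new Map<string, number>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Returns true if the key was newly claimed, false if it is a live duplicate.
  claim(key: string): boolean {
    const current = this.now();
    const expiry = this.expiresAt.get(key);
    if (expiry !== undefined && expiry > current) {
      return false;
    }
    this.expiresAt.set(key, current + this.ttlMs);
    return true;
  }

  // Drops expired entries so the map does not grow without bound.
  sweep(): void {
    const current = this.now();
    this.expiresAt.forEach((expiry, key) => {
      if (expiry <= current) {
        this.expiresAt.delete(key);
      }
    });
  }

  get size(): number {
    return this.expiresAt.size;
  }
}
```

With `DEDUPE_TTL_SECONDS` at its default of `1800`, such a store would be constructed as `new DedupeStore(1800 * 1000)`.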

### Gitea Integration

- `src/gitea/client.ts`
  - Typed wrapper around Gitea REST endpoints
  - Pull details/files/reviews, create/delete review, patch pull
  - Optional content-file loading from repository
- `src/gitea/review-api.ts`
  - Delete prior bot reviews
  - Post one consolidated review
  - Validate inline comments against changed-file paths and positions
  - Summary-only fallback if inline set is invalid
- `src/gitea/reviewer-api.ts`
  - Remove bot from requested reviewers list

### Cursor Integration

- `src/cursor/review-agent.ts`
  - Calls Cursor SDK `Agent.prompt` in cloud runtime
  - Enforces review timeout
  - Extracts text from SDK response formats
- `src/cursor/review-schema.ts`
  - Zod schema for strict review JSON contract:
    - `verdict`, `event`, `body`, `comments[]`
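
The review timeout that `review-agent.ts` enforces can be approximated by wrapping the pending call in a timer. This is a generic sketch with a hypothetical `withTimeout` helper. It makes no assumption about the Cursor SDK beyond the call returning a promise, and it does not cancel the underlying work.

```typescript
// Rejects with a timeout error if `work` does not settle within `timeoutMs`.
// The underlying work keeps running; only the caller stops waiting.
function withTimeout<T>(work: Promise<T>, timeoutMs: number, label = "operation"): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => {
      reject(new Error(`${label} timed out after ${timeoutMs} ms`));
    }, timeoutMs);
    work.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (error: unknown) => {
        clearTimeout(timer);
        reject(error);
      },
    );
  });
}
```

A caller would pass the configured `REVIEW_TIMEOUT_MS` value as `timeoutMs`.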

### Prompting and Config

- `src/prompt/build-review-prompt.ts`
  - Builds complete PR review prompt from title/body/files/patches/rules
- `src/config/load-repo-config.ts`
  - Reads `.gitea/pr-review-bot.yml` when present
  - Loads optional repo-local rule files:
    - `AGENTS.md`
    - `docs/pr-review.md`
    - `.cursor/skills/pr-review/SKILL.md`
    - `.cursor/skills/pr-review/gitea.md`

### Orchestration and Utilities

- `src/run/review-runner.ts`
  - Main end-to-end review execution
  - Dedupe, retry wrappers, posting, reviewer self-removal
- `src/run/retry.ts`
  - Exponential backoff retry helper for transient failures
- `src/types/events.ts`
  - Gitea webhook payload typings
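
An exponential backoff retry helper like the one in `src/run/retry.ts` might look like the sketch below. The option names, the default factor of 2, and the `isRetryable` hook are assumptions, not the project's actual signature.

```typescript
interface RetryOptions {
  attempts: number;     // total tries, including the first
  baseDelayMs: number;  // delay before the second try
  factor?: number;      // multiplier applied after each retry (default 2)
  isRetryable?: (error: unknown) => boolean; // default: retry everything
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Runs `fn`, retrying failures with exponentially growing delays.
async function retry<T>(fn: () => Promise<T>, options: RetryOptions): Promise<T> {
  const factor = options.factor ?? 2;
  const isRetryable = options.isRetryable ?? (() => true);
  let delay = options.baseDelayMs;
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn();
    } catch (error) {
      // Give up on the last attempt or on errors marked non-retryable.
      if (attempt >= options.attempts || !isRetryable(error)) {
        throw error;
      }
      await sleep(delay);
      delay *= factor;
    }
  }
}
```

Restricting `isRetryable` to transient failures (for example network errors or 5xx responses) avoids repeating calls that can never succeed, such as those rejected for bad credentials.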

### Scripts and Deployment

- `scripts/dry-run.ts`
  - Manual review run for a given `<owner> <repo> <pr_number>`, with an optional `[head_sha]`
- `Dockerfile`
  - Production container image
- `docker-compose.yml`
  - Local/hosted compose deployment
- `docs/setup.md`
  - Installation and webhook setup
- `docs/operations.md`
  - Runtime behavior, failure handling, rollback

## Environment Variables

Required:

- `CURSOR_API_KEY`
- `GITEA_TOKEN`
- `GITEA_BASE_URL`
- `GITEA_BOT_LOGIN`
- `WEBHOOK_SECRET`
- `PORT`

Optional:

- `DEFAULT_BASE_BRANCH` (default: `main`)
- `MAX_INLINE_COMMENTS` (default: `5`)
- `REVIEW_TIMEOUT_MS` (default: `120000`)
- `DEDUPE_TTL_SECONDS` (default: `1800`)
- `LOG_LEVEL` (default: `info`; options: `debug`, `info`, `warn`, `error`)

See `.env.example`.
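
The variables and defaults above could be loaded and validated at startup roughly as follows. This is a sketch under assumptions: the `BotConfig` shape and the helper names are hypothetical, and the project's real loader may validate differently.

```typescript
type LogLevel = "debug" | "info" | "warn" | "error";
type Env = Record<string, string | undefined>;

interface BotConfig {
  cursorApiKey: string;
  giteaToken: string;
  giteaBaseUrl: string;
  giteaBotLogin: string;
  webhookSecret: string;
  port: number;
  defaultBaseBranch: string;
  maxInlineComments: number;
  reviewTimeoutMs: number;
  dedupeTtlSeconds: number;
  logLevel: LogLevel;
}

// Returns the variable's value, or throws if it is missing or blank.
function requireVar(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Parses a non-negative integer, rejecting values such as "abc" or "1.5".
function parseIntStrict(name: string, raw: string): number {
  const parsed = Number(raw);
  if (!Number.isInteger(parsed) || parsed < 0) {
    throw new Error(`${name} must be a non-negative integer, got "${raw}"`);
  }
  return parsed;
}

// Reads an optional integer variable, using `fallback` when it is unset.
function intVar(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  return raw === undefined || raw.trim() === "" ? fallback : parseIntStrict(name, raw);
}

function logLevelVar(env: Env): LogLevel {
  const raw = env.LOG_LEVEL ?? "info";
  if (raw === "debug" || raw === "info" || raw === "warn" || raw === "error") {
    return raw;
  }
  throw new Error(`LOG_LEVEL must be debug, info, warn, or error, got "${raw}"`);
}

function loadConfig(env: Env): BotConfig {
  return {
    cursorApiKey: requireVar(env, "CURSOR_API_KEY"),
    giteaToken: requireVar(env, "GITEA_TOKEN"),
    giteaBaseUrl: requireVar(env, "GITEA_BASE_URL"),
    giteaBotLogin: requireVar(env, "GITEA_BOT_LOGIN"),
    webhookSecret: requireVar(env, "WEBHOOK_SECRET"),
    port: parseIntStrict("PORT", requireVar(env, "PORT")),
    defaultBaseBranch: env.DEFAULT_BASE_BRANCH ?? "main",
    maxInlineComments: intVar(env, "MAX_INLINE_COMMENTS", 5),
    reviewTimeoutMs: intVar(env, "REVIEW_TIMEOUT_MS", 120000),
    dedupeTtlSeconds: intVar(env, "DEDUPE_TTL_SECONDS", 1800),
    logLevel: logLevelVar(env),
  };
}
```

Calling `loadConfig(process.env)` once at startup makes a missing secret fail fast instead of surfacing on the first webhook.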

## Review Output Contract

The Cursor response must be strict JSON:

- `verdict`: `approve | request_changes | comment`
- `event`: `APPROVE | REQUEST_CHANGES | COMMENT`
- `body`: string
- `comments`: array of:
  - `path` (changed file path)
  - `new_position` (integer >= 1)
  - `body` (comment text)

Post-validation rules:

- If any inline comment has an invalid path or position, the bot posts a summary-only review.
- Inline comments are capped at the configured maximum (`MAX_INLINE_COMMENTS`).
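
The contract and post-validation rules can be illustrated with a dependency-free validator. The project itself uses a Zod schema, so the hand-written checks below are a stand-in that runs without extra packages. The function names, the `validPositions` map, and the order of operations (validate first, then cap) are assumptions.

```typescript
type Verdict = "approve" | "request_changes" | "comment";
type ReviewEvent = "APPROVE" | "REQUEST_CHANGES" | "COMMENT";

interface InlineComment {
  path: string;
  new_position: number;
  body: string;
}

interface ReviewOutput {
  verdict: Verdict;
  event: ReviewEvent;
  body: string;
  comments: InlineComment[];
}

const VERDICTS: readonly string[] = ["approve", "request_changes", "comment"];
const EVENTS: readonly string[] = ["APPROVE", "REQUEST_CHANGES", "COMMENT"];

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

function isInlineComment(value: unknown): value is InlineComment {
  if (!isRecord(value)) return false;
  const position = value.new_position;
  return (
    typeof value.path === "string" &&
    typeof value.body === "string" &&
    typeof position === "number" &&
    Number.isInteger(position) &&
    position >= 1
  );
}

// Parses the raw agent output and enforces the JSON contract, throwing on violations.
function parseReview(raw: string): ReviewOutput {
  const data: unknown = JSON.parse(raw);
  if (!isRecord(data)) throw new Error("review must be a JSON object");
  const { verdict, event, body, comments } = data;
  if (typeof verdict !== "string" || !VERDICTS.includes(verdict)) throw new Error("invalid verdict");
  if (typeof event !== "string" || !EVENTS.includes(event)) throw new Error("invalid event");
  if (typeof body !== "string") throw new Error("invalid body");
  if (!Array.isArray(comments) || !comments.every(isInlineComment)) throw new Error("invalid comments");
  return { verdict: verdict as Verdict, event: event as ReviewEvent, body, comments };
}

// Assumed policy: any comment outside the changed-file positions drops all
// inline comments (summary-only); a fully valid set is cut to `maxInline`.
function applyPostValidation(
  review: ReviewOutput,
  validPositions: Map<string, Set<number>>,
  maxInline: number,
): ReviewOutput {
  const allValid = review.comments.every(
    (comment) => validPositions.get(comment.path)?.has(comment.new_position) === true,
  );
  return { ...review, comments: allValid ? review.comments.slice(0, maxInline) : [] };
}
```

Here `validPositions` maps each changed file path to the positions that can accept a comment. How those positions are derived from the PR's changed files is left out of this sketch.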

## Getting Started

1. Copy env template:
   - `cp .env.example .env`
2. Fill secrets/tokens.
3. Install:
   - `npm install`
4. Run locally:
   - `npm run dev`
5. Verify health:
   - `curl http://localhost:8787/healthz` (use the port set in `PORT`)

## Useful Commands

- Type check: `npm run check`
- Build: `npm run build`
- Start built app: `npm run start`
- Dry run: `npm run dry-run -- <owner> <repo> <pr_number> [head_sha]`

## Operational Notes

- Keep bot credentials scoped to least privilege.
- Do not log tokens or raw auth headers.
- Dedupe state is in-memory, so it resets on restart and a redelivered webhook may be reviewed again.
- The recommended deployment model is a central service with an org-level webhook and per-repo opt-out config.
- Logs are structured JSON to simplify filtering in Docker/log collectors.