AI-powered code review

Ship better code, faster

Automated code reviews for every pull request. Catch bugs, enforce standards, and accelerate your development workflow.

Trusted by engineering teams worldwide

Everything you need for better code reviews

Streamline your workflow with intelligent automation

Instant Reviews

Reviews run automatically on every PR. Get detailed feedback in seconds, not hours.

Custom Skills

Define custom review rules for your team's patterns, conventions, and best practices.

Quick Fixes

Apply suggested fixes with one click. Resolve issues without leaving GitHub.

CLI

Review before you push

Catch issues locally before they reach your PR. Use it from your terminal, CI pipelines, or let your coding agents run it automatically.

~/project
$ grepper review
Reviewing staged changes...
Mode: full
- Analyzing diff...
📋 Reviewed 4 files
🔴 Critical: Unsanitized user input in SQL query
  src/db/queries.ts:42
  User-provided values are interpolated directly into the query string, enabling SQL injection.
🟡 Warning: Missing null check on optional chain
  src/api/handler.ts:18-23
  user.profile may be undefined but is accessed without a guard.
🟢 Praise: Clean error boundary implementation
  src/components/ErrorBoundary.tsx:1-45
Summary: 1 critical, 1 warning, 1 praise

Get started in seconds

npm i -g grepper-dev

Review any diff

Review staged, unstaged, or branch diffs before they become PRs.
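As an illustration only (the diff-source flags below are assumptions; `grepper review --help` is the authority), picking what to review from the terminal might look like:

```shell
# Review whatever is currently staged (the default shown in the demo above)
grepper review

# Hypothetical flags for other diff sources:
grepper review --unstaged     # uncommitted working-tree edits
grepper review --base main    # everything on this branch vs. main
```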

Multiple modes

Full, quick, or security-focused modes. JSON output for CI pipelines.
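By way of example (the `--mode` and `--json` switch names are assumptions, not documented flags), a CI-oriented invocation might look like:

```shell
# Hypothetical: pick a review mode and emit machine-readable output for a pipeline
grepper review --mode security --json > review.json
```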

Project-aware

Reads your CLAUDE.md and repo skills for context-aware reviews.

Agent-friendly

Coding agents like Claude Code and Cursor can run reviews as a tool to validate their own changes.

Works in CI too

# GitHub Actions
- run: npx grepper-dev review --base main --fail-on-critical

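A complete workflow file wrapping that step might look like the sketch below. The checkout settings are an assumption: diffing against `main` generally needs more history than a shallow clone provides.

```yaml
name: Code review
on: [pull_request]

jobs:
  review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0  # full history so the base branch is available for diffing
      - run: npx grepper-dev review --base main --fail-on-critical
```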
Benchmark Results

Performance on the AI Code Review Benchmark

5 repositories, 16 expert-identified issues, head-to-head with industry leaders.

F-Score

50%

Beats Cursor & Greptile

Precision

41.7%

14 false positives

Recall

62.5%

Found 10/16 issues

True Positives

10

6 missed

Leaderboard

F-Score ranking on identical benchmark

1. Augment — 59%
2. Grepper — 50% (9% behind #1)
3. Cursor — 49%
4. Greptile — 45%
Performance Profile
Grepper vs Augment (top competitor)
Repository Breakdown
Performance across different codebases

Detailed Results

Per-repository metrics breakdown

Repository  Language    TP  FP  FN  Precision  Recall  F-Score
cal.com     TypeScript   1   0   1     100%      50%    66.7%
sentry      Python       3   7   1      30%      75%    42.9%
grafana     Go           2   2   3      50%      40%    44.4%
discourse   Ruby         2   2   1      50%      67%    57.1%
keycloak    Java         2   3   0      40%     100%    57.1%
Total       —           10  14   6    41.7%    62.5%    50%

Methodology

Benchmark: AI Code Review Evaluations dataset — 50 PRs across 5 major open-source repositories (cal.com, sentry, grafana, discourse, keycloak).

Golden Comments: Expert-identified issues that a senior reviewer should catch. Each PR contains 2-5 golden comments.

Metrics: Precision is the share of reported issues that match a golden comment, Recall is the share of golden comments found, and F-Score is the harmonic mean of the two.
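The aggregate numbers can be rechecked directly from the raw counts (TP=10, FP=14, FN=6). This is just a sanity check on the arithmetic, not part of the benchmark tooling:

```shell
# Recompute precision, recall, and F-score from the totals above
awk 'BEGIN {
  tp = 10; fp = 14; fn = 6
  p = tp / (tp + fp)        # precision = TP / (TP + FP)
  r = tp / (tp + fn)        # recall    = TP / (TP + FN)
  f = 2 * p * r / (p + r)   # F-score   = harmonic mean of p and r
  printf "precision=%.1f%% recall=%.1f%% f=%.1f%%\n", p*100, r*100, f*100
}'
# prints: precision=41.7% recall=62.5% f=50.0%
```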

Model: Claude Opus 4.5 with specialized review skills tuned for each language and framework.

Benchmark conducted February 2026 • Data from ai-code-review-evaluations

Ready to improve your code quality?

Set up in under 2 minutes. Works with your existing workflow.

Get Started

npm i -g grepper-dev