AI-powered code review

Ship better code, faster

Automated code reviews for every pull request. Catch bugs, enforce standards, and accelerate your development workflow.

Trusted by engineering teams worldwide

Everything you need for better code reviews

Streamline your workflow with intelligent automation

Instant Reviews

Reviews run automatically on every PR. Get detailed feedback in seconds, not hours.

Custom Skills

Define custom review rules for your team's patterns, conventions, and best practices.

Quick Fixes

Apply suggested fixes with one click. Resolve issues without leaving GitHub.

CLI

Review before you push

Catch issues locally before they reach your PR. Use it from your terminal, CI pipelines, or let your coding agents run it automatically.

~/project
$ grepper review
Reviewing staged changes...
Mode: full
- Analyzing diff...
📋 Reviewed 4 files
🔴 Critical: Unsanitized user input in SQL query
  src/db/queries.ts:42
  User-provided values are interpolated directly into the query string, enabling SQL injection.
🟡 Warning: Missing null check on optional chain
  src/api/handler.ts:18-23
  user.profile may be undefined but is accessed without a guard.
🟢 Praise: Clean error boundary implementation
  src/components/ErrorBoundary.tsx:1-45
Summary: 1 critical, 1 warning, 1 praise

Get started in seconds

npm i -g grepper-dev

Review any diff

Review staged, unstaged, or branch diffs before they become PRs.
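As an illustration only (the diff-source flags below are assumptions; `grepper review --help` is the authority), picking what to review from the terminal might look like:

```shell
# Review whatever is currently staged (the default shown in the demo above)
grepper review

# Hypothetical flags for other diff sources:
grepper review --unstaged     # uncommitted working-tree edits
grepper review --base main    # everything on this branch vs. main
```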

Multiple modes

Full, quick, or security-focused modes. JSON output for CI pipelines.
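By way of example (the `--mode` and `--json` switch names are assumptions, not documented flags), a CI-oriented invocation might look like:

```shell
# Hypothetical: pick a review mode and emit machine-readable output for a pipeline
grepper review --mode security --json > review.json
```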

Project-aware

Reads your CLAUDE.md and repo skills for context-aware reviews.

Agent-friendly

Coding agents like Claude Code and Cursor can run reviews as a tool to validate their own changes.

Works in CI too

# GitHub Actions
- run: npx grepper-dev review --base main --fail-on-critical

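A complete workflow file wrapping that step might look like the sketch below. The checkout settings are an assumption: diffing against `main` generally needs more history than a shallow clone provides.

```yaml
name: Code review
on: [pull_request]

jobs:
  review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0  # full history so the base branch is available for diffing
      - run: npx grepper-dev review --base main --fail-on-critical
```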
Benchmark Results

Performance on the AI Code Review Benchmark

5 repositories, 16 expert-identified issues, head-to-head with industry leaders.

F-Score

50%

Beats Cursor & Greptile

Precision

41.7%

14 false positives

Recall

62.5%

Found 10/16 issues

True Positives

10

6 missed

Leaderboard

F-Score ranking on identical benchmark

1. Augment — 59%
2. Grepper — 50% (9% behind #1)
3. Cursor — 49%
4. Greptile — 45%
Performance Profile
Grepper vs Augment (top competitor)
Repository Breakdown
Performance across different codebases

Detailed Results

Per-repository metrics breakdown

Repository  Language    TP  FP  FN  Precision  Recall  F-Score
cal.com     TypeScript   1   0   1     100%      50%    66.7%
sentry      Python       3   7   1      30%      75%    42.9%
grafana     Go           2   2   3      50%      40%    44.4%
discourse   Ruby         2   2   1      50%      67%    57.1%
keycloak    Java         2   3   0      40%     100%    57.1%
Total       —           10  14   6    41.7%    62.5%    50%

Methodology

Benchmark: AI Code Review Evaluations dataset — 50 PRs across 5 major open-source repositories (cal.com, sentry, grafana, discourse, keycloak).

Golden Comments: Expert-identified issues that a senior reviewer should catch. Each PR contains 2-5 golden comments.

Metrics: Precision is the share of reported issues that match a golden comment, Recall is the share of golden comments found, and F-Score is the harmonic mean of the two.
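The aggregate numbers can be rechecked directly from the raw counts (TP=10, FP=14, FN=6). This is just a sanity check on the arithmetic, not part of the benchmark tooling:

```shell
# Recompute precision, recall, and F-score from the totals above
awk 'BEGIN {
  tp = 10; fp = 14; fn = 6
  p = tp / (tp + fp)        # precision = TP / (TP + FP)
  r = tp / (tp + fn)        # recall    = TP / (TP + FN)
  f = 2 * p * r / (p + r)   # F-score   = harmonic mean of p and r
  printf "precision=%.1f%% recall=%.1f%% f=%.1f%%\n", p*100, r*100, f*100
}'
# prints: precision=41.7% recall=62.5% f=50.0%
```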

Model: Claude Opus 4.5 with specialized review skills tuned for each language and framework.

Benchmark conducted February 2026 • Data from ai-code-review-evaluations

Ready to improve your code quality?

Set up in under 2 minutes. Works with your existing workflow.

Get Started

npm i -g grepper-dev