ShipSleuth

Whitepaper

Public GitHub shipping activity as a diligence signal.

ShipSleuth reads public GitHub data the way due diligence analysts read financial filings: carefully, with context, and with every caveat visible.

Adam Arreola · March 2026 · v1.1

1. Problem

Investors, acquirers, analysts, and hiring managers routinely check GitHub profiles to gauge engineering activity. But raw commit counts, star counts, and contribution graphs are misleading without context. Monorepos inflate commit counts. Bots pollute activity graphs. Squash merges compress real work into single commits. Mirror repos create phantom activity. The result: public GitHub data is simultaneously the most accessible and most misread signal in software diligence.

2. Approach

ShipSleuth treats public GitHub data as a structured signal rather than a raw count. For any org, user, or repo, it decomposes visible activity into direct metrics that each answer a specific question:

  • Author commits: How many visible commits are from real accounts, excluding bot accounts?
  • Merged PRs: How much work was integrated through the pull request workflow?
  • Active contributors: How many distinct accounts touched the public codebase?
  • Active repos: How many repositories show meaningful author activity?
  • Releases: How many tagged milestones shipped during the window?
  • Merge velocity: How fast are pull requests integrated after opening?
  • Bus factor: How concentrated is commit activity among contributors?
  • Active days: How many distinct days had at least one push event?
  • Lines changed: Weekly code adds and deletes via GitHub code frequency. Noisy but directionally useful.
  • Commit weight: What is the typical code volume per author commit?
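As one illustration, the bus-factor metric above can be sketched in a few lines. The definition used here (the smallest number of contributors whose commits cover at least half of all author commits) is an assumption for illustration; ShipSleuth may measure concentration differently.

```python
from collections import Counter

def bus_factor(commit_authors, threshold=0.5):
    # Smallest number of contributors whose commits cover `threshold`
    # of all author commits. This exact definition is an assumption;
    # the real metric may weight concentration differently.
    counts = Counter(commit_authors)
    total = sum(counts.values())
    covered = 0
    for rank, (_, n) in enumerate(counts.most_common(), start=1):
        covered += n
        if covered / total >= threshold:
            return rank
    return len(counts)
```

A low bus factor (one or two people producing most commits) is exactly the concentration risk the score's adjustment dimension later penalizes.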

3. Scoring dimensions

ShipSleuth computes a 0-100 composite score from seven weighted dimensions. Each dimension uses log-scale normalization to compress extreme values and avoid rewarding pure volume over sustained, broad activity.

Volume (30%)

Log-scaled commits per 30-day period, pivot at 250. Rewards consistent throughput without overweighting outlier bursts.

Breadth (19%)

Blend of active repos (45%) and active contributors (55%), both log-scaled. Rewards distributed activity over concentration.

Consistency (18%)

Active days as a fraction of the analysis window. Rewards sustained daily shipping over concentrated bursts.

Releases (10%)

Log-scaled release count with pivot at 8. A direct signal that something public shipped.

Recency (10%)

Days since last visible push, decayed at 3 points per day. Rewards fresh activity.

Collaboration (8%)

Blend of merged PR velocity, PR author diversity, and discussion activity. Rewards team-shaped shipping.

Concentration adjustment (5%)

Penalty for top-repo dominance, top-contributor dominance, and high bot share. Rewards organic distribution.

4. Percentile calibration

Absolute metric values are hard to interpret without context. ShipSleuth assigns percentile badges (Top ~1%, Top ~5%, Top ~10%, Top ~25%, Top ~50%) using anchor thresholds derived from real GH Archive data queried via ClickHouse Playground.

The baseline population is ~8.19M human accounts with at least one public PushEvent in the preceding 90 days. Bot/CI accounts (dependabot, renovate, github-actions, etc.) are filtered out — they account for only ~0.3% of push-active accounts. Each metric has its own anchor table mapping value thresholds to percentile tiers, derived from actual GH Archive percentile distributions (P50, P75, P90, P95, P99, P99.9, P99.99).

Key caveat: GH Archive counts PushEvents, not individual commits. ShipSleuth applies a ×2 multiplier to approximate commit counts from push counts. For PRs, GH Archive captures “closed” events which include both merges and rejections. Lines changed are not available in GH Archive and retain estimated anchors.

Per-metric percentiles are interpolated in log-space between anchor points and scaled proportionally to the analysis window (e.g., a 30-day window scales the 90-day thresholds to 1/3). Only positive badges are shown: below Top ~50%, no badge appears. Every percentile display includes “~” to signal these are estimates, and population context (e.g., “estimated top ~400 of 8.2M+”) is shown alongside badges.

Users can click “View the Math” on any analysis result to inspect the exact anchors, window scaling, interpolation, and population estimates for every metric. See the full anchor tables on the methodology page.
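The interpolation and window scaling described above can be sketched as follows. The anchor values here are hypothetical placeholders, not ShipSleuth's real anchor tables; only the mechanics (scale 90-day thresholds to the window, interpolate in log-space, suppress badges below Top ~50%) are from the text.

```python
import math

# Hypothetical anchor table for one metric: (90-day threshold, percentile).
# Real anchors are derived from GH Archive distributions and differ per metric.
ANCHORS_90D = [(40, 50), (150, 75), (400, 90), (900, 95), (3000, 99)]

def percentile_estimate(value, window_days=90):
    # Scale 90-day anchor thresholds proportionally to the analysis window
    # (e.g., a 30-day window uses 1/3 of each threshold).
    scale = window_days / 90
    anchors = [(t * scale, p) for t, p in ANCHORS_90D]
    if value <= anchors[0][0]:
        return None  # below Top ~50%: no badge is shown
    for (t0, p0), (t1, p1) in zip(anchors, anchors[1:]):
        if value <= t1:
            # Interpolate in log-space between the surrounding anchors.
            frac = (math.log(value) - math.log(t0)) / (math.log(t1) - math.log(t0))
            return p0 + frac * (p1 - p0)
    return anchors[-1][1]  # at or beyond the top anchor
```

The `None` return models the "only positive badges are shown" rule; a UI layer would map the returned percentile to a "Top ~X%" badge with the "~" qualifier.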

5. Caveats as first-class output

Every ShipSleuth analysis includes structured caveats alongside its metrics. These are not footnotes. They are part of the analysis. Key caveats include:

  • Private repositories are invisible by default. Users can optionally supplement their analysis with self-reported private activity, but these values are unverified.
  • Monorepos and squash merges distort commit counts in both directions.
  • Bot activity is detected heuristically and may include false positives or negatives.
  • Pagination caps mean very busy repos may be undercounted.
  • Public membership lists only show people who have opted in.
  • GH Archive counts PushEvents, not individual commits — the ×2 multiplier is an approximation that varies by workflow.
  • PR “closed” in GH Archive includes both merges and rejections — the real merged-PR distribution may differ.
  • Percentile anchors are refreshed periodically but represent a snapshot, not a live census.
  • This is a diligence signal, not a verdict.

6. Design principles

  • Honest by default. Surface caveats alongside every metric, not behind a footnote.
  • Direct metrics first. Avoid composite scores when the underlying numbers are more informative.
  • Non-toxic framing. Percentile badges celebrate high performers without shaming low activity.
  • Public signal first. Default to public data. Allow optional self-reported private supplements, clearly labeled and unverified.
  • Open source. The entire methodology is inspectable and reproducible.