Comprehensive technical documentation for GitCheck's GitHub analytics platform, scoring algorithms, and implementation details.
GitCheck uses a sophisticated multi-stage pipeline to analyze GitHub profiles. The process combines GraphQL and REST API calls, caching strategies, and statistical analysis to deliver comprehensive developer insights.
The user submits a GitHub username through the homepage input. The system validates the input and checks rate limits:
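A minimal sketch of what this validation step might look like, assuming the `_honeypot` and `_timestamp` request fields documented for `/api/analyze-username`. The function name, the `MIN_FORM_AGE_MS` threshold, and the "Invalid GitHub username" and "Submitted too quickly" messages are hypothetical. The pattern encodes GitHub's username rules (alphanumerics and single hyphens, no leading or trailing hyphen, at most 39 characters).

```typescript
// Sketch only: MIN_FORM_AGE_MS is a hypothetical threshold, not GitCheck's real value.
const USERNAME_PATTERN = /^[a-z\d](?:[a-z\d]|-(?=[a-z\d])){0,38}$/i;
const MIN_FORM_AGE_MS = 1000;

interface AnalyzeRequest {
  username: string;
  _honeypot: string;   // must stay empty; bots tend to fill every field
  _timestamp: number;  // page load time in epoch milliseconds
}

// Returns an error message, or null when the request looks valid.
function validateAnalyzeRequest(req: AnalyzeRequest, now: number): string | null {
  const username = req.username.trim();
  if (username.length === 0) return "Username is required";
  if (!USERNAME_PATTERN.test(username)) return "Invalid GitHub username";
  if (req._honeypot !== "") return "Bot detected - honeypot field filled";
  if (now - req._timestamp < MIN_FORM_AGE_MS) return "Submitted too quickly";
  return null;
}
```

Rate limiting is left out here because it depends on shared state (for example a per-IP counter) that the text does not describe.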
Before making expensive API calls, the system checks if the profile was recently analyzed:
💡 Performance Benefit:
Cached responses are served in ~50ms vs. ~30-45 seconds for full analysis. This reduces GitHub API usage by 90%+ and provides instant results for repeated queries.
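The cache check could be sketched as a pure function over the profile's `scannedAt` timestamp, using the 24-hour TTL from the cache policy. The `cacheStatus` name and the choice to round `hoursRemaining` down are assumptions.

```typescript
const CACHE_TTL_MS = 24 * 60 * 60 * 1000; // 24-hour TTL
const HOUR_MS = 60 * 60 * 1000;

interface CacheStatus {
  fresh: boolean;          // true while the cached profile may still be served
  nextScanAvailable: Date; // when a fresh analysis becomes possible
  hoursRemaining: number;  // whole hours left, rounded down (assumption)
}

function cacheStatus(scannedAt: Date, now: Date): CacheStatus {
  const nextScanAvailable = new Date(scannedAt.getTime() + CACHE_TTL_MS);
  const msRemaining = Math.max(0, nextScanAvailable.getTime() - now.getTime());
  return {
    fresh: msRemaining > 0,
    nextScanAvailable,
    hoursRemaining: Math.floor(msRemaining / HOUR_MS),
  };
}
```

When `fresh` is true the handler can return the stored profile and skip the GitHub API calls entirely.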
The system uses GitHub's GraphQL API to efficiently fetch repository data in a single request:
query($username: String!, $repoCount: Int!) {
  user(login: $username) {
    login
    name
    bio
    location
    company
    websiteUrl
    avatarUrl(size: 400)
    followers { totalCount }
    following { totalCount }
    organizations(first: 100) {
      nodes { login }
    }
    gists(first: 1) { totalCount }
    createdAt
    repositories(
      first: $repoCount,
      orderBy: {field: STARGAZERS, direction: DESC},
      ownerAffiliations: OWNER,
      isFork: false,
      privacy: PUBLIC
    ) {
      totalCount
      nodes {
        name
        description
        url
        stargazerCount
        forkCount
        watchers { totalCount }
        primaryLanguage { name }
        languages(first: 10) {
          edges { size node { name } }
        }
        updatedAt
        createdAt
        licenseInfo { spdxId }
        openIssuesCount: issues(states: OPEN) { totalCount }
        closedIssuesCount: issues(states: CLOSED) { totalCount }
        openPRsCount: pullRequests(states: OPEN) { totalCount }
        mergedPRsCount: pullRequests(states: MERGED) { totalCount }
      }
    }
  }
}
📊 Data Retrieved:
⚡ Optimization:
Fetches detailed contribution history using REST API endpoints:
Commit Activity (REST API):
Parses last 365 days of push events to calculate total commits, streaks, and activity patterns
Pull Requests & Reviews:
Aggregates contribution statistics across all public repositories
Calculated Metrics:
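As one illustration of these calculated metrics, commit streaks might be derived from the list of days that had at least one commit. This is a sketch under assumptions: the input is a list of UTC calendar days in `YYYY-MM-DD` form, the function names are hypothetical, and the current streak is taken to end on `today` (a day without commits resets it to zero).

```typescript
const DAY_MS = 86400000;

// Converts "YYYY-MM-DD" (UTC) to a whole number of days since the Unix epoch.
function toDayNumber(isoDay: string): number {
  return Date.parse(`${isoDay}T00:00:00Z`) / DAY_MS;
}

// Longest run of consecutive calendar days with at least one commit.
function longestStreak(commitDays: string[]): number {
  const days = Array.from(new Set(commitDays)).map(toDayNumber).sort((a, b) => a - b);
  let longest = 0;
  let run = 0;
  let prev: number | null = null;
  for (const day of days) {
    run = prev !== null && day - prev === 1 ? run + 1 : 1;
    longest = Math.max(longest, run);
    prev = day;
  }
  return longest;
}

// Consecutive days with commits, counting backwards from `today`.
function currentStreak(commitDays: string[], today: string): number {
  const days = new Set(commitDays.map(toDayNumber));
  let cursor = toDayNumber(today);
  let streak = 0;
  while (days.has(cursor)) {
    streak++;
    cursor -= 1;
  }
  return streak;
}
```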
Applies statistical algorithms to compute a normalized 0-100 developer score across four weighted components:
📐 Scoring Methodology:
z = (value - μ) / σ
🎯 Design Philosophy:
The scoring system uses population-based statistics rather than absolute thresholds. This ensures scores remain meaningful as GitHub evolves and prevents inflation over time. A score of 70 always means "better than 70% of developers," regardless of when it was calculated.
All computed metrics are stored in PostgreSQL with automatic cache invalidation:
💾 Database:
PostgreSQL on Neon (serverless, auto-scaling, SSL/TLS encrypted)
⏱️ Cache Policy:
24-hour TTL, instant invalidation available via manual refresh
GitCheck uses a statistical scoring model based on z-score normalization and percentile ranking. The system compares each developer against a baseline population of 100,000+ GitHub users to provide meaningful, percentile-based scores.
Measures the reach and influence of a developer's work through community engagement metrics.
Formula:
rawImpact = totalStars + (totalForks × 2) + (totalWatchers × 0.5) + (followersCount × 0.1)
Metrics Used:
Population Stats:
💡 Why this matters:
Forks are weighted 2x because they indicate not just interest but actual usage and derivative work. Watchers show ongoing engagement. This component heavily favors maintainers of popular open-source projects.
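The impact formula translates directly into code. The sketch below assumes the four inputs are the aggregate counts stored on the profile; the `ImpactInputs` type name is illustrative.

```typescript
interface ImpactInputs {
  totalStars: number;
  totalForks: number;
  totalWatchers: number;
  followersCount: number;
}

// rawImpact = stars + 2 × forks + 0.5 × watchers + 0.1 × followers
function rawImpact(m: ImpactInputs): number {
  return m.totalStars + m.totalForks * 2 + m.totalWatchers * 0.5 + m.followersCount * 0.1;
}
```

For example, 100 stars, 10 forks, 20 watchers, and 50 followers give 100 + 20 + 10 + 5 = 135 before normalization.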
Evaluates repository health, maintenance activity, and development best practices.
Formula:
repoActivityRate = totalRepos / accountAgeYears
maintenanceScore = avgRepoUpdateFrequency × issueResolutionRate
rawQuality = repoActivityRate × maintenanceScore × (1 + gistsCount/100)
Metrics Used:
Population Stats:
💡 Why this matters:
This component rewards consistent repository creation and maintenance. A developer with 5 well-maintained repos scores higher than one with 50 abandoned projects. Issue resolution and gist sharing indicate engagement with best practices.
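A sketch of the code quality formula, with two assumptions that the text does not confirm: `avgRepoUpdateFrequency` and `issueResolutionRate` are treated as fractions between 0 and 1, and the account age is clamped to a hypothetical one-month minimum to avoid dividing by zero for brand-new accounts.

```typescript
interface QualityInputs {
  totalRepos: number;
  accountAgeYears: number;
  avgRepoUpdateFrequency: number; // assumed to be a 0..1 fraction
  issueResolutionRate: number;    // assumed to be closed / (open + closed)
  gistsCount: number;
}

const MIN_AGE_YEARS = 1 / 12; // hypothetical floor, not from the source

function rawQuality(m: QualityInputs): number {
  const years = Math.max(m.accountAgeYears, MIN_AGE_YEARS);
  const repoActivityRate = m.totalRepos / years;
  const maintenanceScore = m.avgRepoUpdateFrequency * m.issueResolutionRate;
  return repoActivityRate * maintenanceScore * (1 + m.gistsCount / 100);
}

// Five well-maintained repos versus fifty neglected ones on a five-year-old account.
const maintained = rawQuality({
  totalRepos: 5, accountAgeYears: 5, avgRepoUpdateFrequency: 0.9, issueResolutionRate: 0.8, gistsCount: 0,
});
const abandoned = rawQuality({
  totalRepos: 50, accountAgeYears: 5, avgRepoUpdateFrequency: 0.05, issueResolutionRate: 0.1, gistsCount: 0,
});
```

Under these made-up inputs `maintained` is about 0.72 and `abandoned` about 0.05, which matches the claim that a few well-kept repositories outscore many abandoned ones.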
Tracks coding frequency, commit patterns, and sustainable development habits.
Formula:
commitsPerYear = totalCommits / accountAgeYears
streakBonus = Math.log10(currentStreak + 1) × 10
rawConsistency = commitsPerYear × (1 + streakBonus/100)
Metrics Used:
Population Stats:
💡 Why this matters:
Consistency indicates sustainable coding habits. The logarithmic streak bonus prevents over-optimization for daily commits while still rewarding regularity. This component favors developers who code steadily over time rather than in intense bursts.
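The consistency formula can be sketched as follows. The one-month floor on account age is a hypothetical guard against division by zero and is not taken from the source.

```typescript
interface ConsistencyInputs {
  totalCommits: number;
  accountAgeYears: number;
  currentStreak: number; // consecutive days with commits
}

function rawConsistency(m: ConsistencyInputs): number {
  const years = Math.max(m.accountAgeYears, 1 / 12); // hypothetical floor
  const commitsPerYear = m.totalCommits / years;
  // log10 growth: a 9-day streak adds 10%, a 99-day streak only 20%
  const streakBonus = Math.log10(m.currentStreak + 1) * 10;
  return commitsPerYear * (1 + streakBonus / 100);
}
```

With 1,000 commits over two years, a 9-day streak yields roughly 550 and a 99-day streak roughly 600, which shows how slowly the bonus grows as streaks lengthen.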
Measures teamwork, code review participation, and open-source contributions.
Formula:
prQuality = totalPRs × (mergedPRs / totalPRs)
orgBonus = Math.log10(organizationsCount + 1) × 15
rawCollaboration = prQuality + totalReviews + orgBonus
Metrics Used:
Population Stats:
💡 Why this matters:
Collaboration skills are essential for professional development. High merge rates indicate quality contributions. Code reviews demonstrate mentorship and code quality awareness. Organization membership shows team participation.
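A sketch of the collaboration formula. Note that `totalPRs × (mergedPRs / totalPRs)` reduces algebraically to `mergedPRs` whenever `totalPRs` is greater than zero; the code keeps the documented form and adds a guard for the zero case, which is an assumption about how an account with no pull requests should be handled.

```typescript
interface CollaborationInputs {
  totalPRs: number;
  mergedPRs: number;
  totalReviews: number;
  organizationsCount: number;
}

function rawCollaboration(m: CollaborationInputs): number {
  // Guard (assumption): no pull requests means no PR contribution, not NaN.
  const prQuality = m.totalPRs > 0 ? m.totalPRs * (m.mergedPRs / m.totalPRs) : 0;
  const orgBonus = Math.log10(m.organizationsCount + 1) * 15;
  return prQuality + m.totalReviews + orgBonus;
}
```

For 50 pull requests with 40 merged, 20 reviews, and 9 organizations, the result is 40 + 20 + 15 = 75.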
Each raw component score is normalized using z-scores to compare against the population distribution:
z = (rawValue - populationMean) / populationStdDev
Where populationMean and populationStdDev are derived from analyzing 100,000+ GitHub profiles. Z-scores typically range from -3 to +5, with 0 representing average.
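The normalization step is a one-liner. The sketch below adds a check for a non-positive standard deviation, which is an assumption about how invalid population statistics should be handled.

```typescript
function zScore(rawValue: number, populationMean: number, populationStdDev: number): number {
  if (populationStdDev <= 0) {
    throw new RangeError("populationStdDev must be positive");
  }
  return (rawValue - populationMean) / populationStdDev;
}
```

With an illustrative mean of 100 and standard deviation of 25, a raw value of 150 sits two standard deviations above average (z = 2).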
Z-scores are converted to percentiles using a 48-point interpolation table based on the standard normal distribution:
The system uses cubic interpolation between lookup points for precision at high percentiles (95-100), where small changes in z-score result in significant percentile differences.
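To illustrate the lookup-and-interpolate idea, here is a deliberately simplified sketch. It is not the 48-point table or the cubic interpolation described above: it uses seven well-known points of the standard normal CDF and linear interpolation, so it is less precise, especially in the tails. The names are hypothetical.

```typescript
// [z, percentile] pairs from the standard normal CDF (coarse, illustrative table).
const NORMAL_CDF_TABLE: ReadonlyArray<readonly [number, number]> = [
  [-3, 0.13], [-2, 2.28], [-1, 15.87], [0, 50], [1, 84.13], [2, 97.72], [3, 99.87],
];

function zToPercentile(z: number): number {
  const first = NORMAL_CDF_TABLE[0];
  const last = NORMAL_CDF_TABLE[NORMAL_CDF_TABLE.length - 1];
  // Clamp outside the table range.
  if (z <= first[0]) return first[1];
  if (z >= last[0]) return last[1];
  // Find the bracketing pair and interpolate linearly between them.
  for (let i = 1; i < NORMAL_CDF_TABLE.length; i++) {
    const [z0, p0] = NORMAL_CDF_TABLE[i - 1];
    const [z1, p1] = NORMAL_CDF_TABLE[i];
    if (z <= z1) {
      return p0 + ((z - z0) / (z1 - z0)) * (p1 - p0);
    }
  }
  return last[1];
}
```

The coarse grid shows why a denser table matters: linear interpolation puts z = 0.5 at about the 67th percentile, while the true normal CDF value is about 69.15.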
Component percentiles are combined using predefined weights to produce the final 0-100 score:
finalScore = (impact × 0.35) + (codeQuality × 0.30) + (consistency × 0.20) + (collaboration × 0.15)
Weights were determined through empirical analysis of what metrics best correlate with developer effectiveness and community recognition.
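The weighted combination could be sketched like this, with the weights copied from the formula above. The `WEIGHTS` constant and `Component` type are illustrative names.

```typescript
const WEIGHTS = {
  impact: 0.35,
  codeQuality: 0.30,
  consistency: 0.20,
  collaboration: 0.15,
} as const;

type Component = keyof typeof WEIGHTS;

// Combines per-component percentiles (0-100) into the final 0-100 score.
function finalScore(percentiles: Record<Component, number>): number {
  return (Object.keys(WEIGHTS) as Component[]).reduce(
    (sum, key) => sum + percentiles[key] * WEIGHTS[key],
    0,
  );
}

const example = finalScore({ impact: 99.99, codeQuality: 50.8, consistency: 74.86, collaboration: 84.85 });
```

Because the weights sum to 1, the result stays within 0-100 whenever every component does. The `example` inputs evaluate to roughly 77.94.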
Final scores are mapped to letter grades for intuitive interpretation:
Developer Profile:
Component Calculations:
Final Score:
score = (99.99 × 0.35) + (50.80 × 0.30) + (74.86 × 0.20) + (84.85 × 0.15)
score = 35.00 + 15.24 + 14.97 + 12.73
score = 77.94
This developer excels at impact (popular projects) and collaboration, but has average code quality metrics and good consistency.
React framework with App Router, Server Components, and Turbopack for blazing-fast builds
Latest React with Server Components, Suspense, and new React Compiler for automatic optimization
Strict mode enabled for type safety and better developer experience
Utility-first CSS with custom design system and responsive breakpoints
Production-ready animation library for smooth transitions and interactive UI elements
Serverless API endpoints with automatic code splitting and edge runtime support
Type-safe database client with migrations, schema management, and query builder
Serverless Postgres with auto-scaling, branching, and sub-second cold starts
GraphQL API for efficient data fetching + REST API fallback for contribution data
Edge network deployment with automatic HTTPS, previews, and performance analytics
model Profile {
  id                   String    @id @default(cuid())
  userId               String?   @unique
  username             String    @unique
  avatarUrl            String?
  bio                  String?
  location             String?
  company              String?
  blog                 String?
  hireable             Boolean   @default(false)

  // Core Metrics
  score                Float?
  percentile           Int?
  totalCommits         Int       @default(0)
  totalRepos           Int       @default(0)
  totalStars           Int       @default(0)
  totalForks           Int       @default(0)
  totalPRs             Int       @default(0)
  mergedPRs            Int       @default(0)
  openPRs              Int       @default(0)

  // Activity Metrics
  currentStreak        Int       @default(0)
  longestStreak        Int       @default(0)
  averageCommitsPerDay Float     @default(0)
  mostActiveDay        String?
  weekendActivity      Float     @default(0)

  // Social Metrics
  followersCount       Int       @default(0)
  followingCount       Int       @default(0)
  organizationsCount   Int       @default(0)
  gistsCount           Int       @default(0)

  // Collaboration Metrics
  totalIssuesOpened    Int       @default(0)
  totalReviews         Int       @default(0)
  totalContributions   Int       @default(0)
  totalWatchers        Int       @default(0)
  totalOpenIssues      Int       @default(0)

  // Repository Health
  averageRepoSize      Float     @default(0)
  accountAge           Float     @default(0)
  accountCreatedAt     DateTime?

  // Language Data (JSON)
  languages            Json      @default("{}")
  frameworks           Json      @default("{}")

  // Repository Data (JSON array)
  topRepos             Json      @default("[]")

  // Contribution Data (JSON array)
  contributions        Json      @default("[]")

  // Scoring Components (JSON)
  scoreComponents      Json?
  scoringMethod        String?   // "fallback" or "pro"
  scoreStrengths       String[]
  scoreImprovements    String[]

  // Cache Management
  scannedAt            DateTime  @default(now())
  lastLanguageScan     DateTime?
  lastFrameworkScan    DateTime?
  lastOrgScan          DateTime?

  // Indexes for performance
  @@index([username])
  @@index([score])
  @@index([scannedAt])
}
🔑 Key Design Decisions:
📈 Scalability Features:
GitCheck provides REST API endpoints for analyzing GitHub profiles and retrieving cached data. All endpoints are serverless and deployed on Vercel's edge network.
Analyzes a GitHub username and returns comprehensive developer metrics. Implements 24-hour caching and rate limiting.
{
  "username": "torvalds",
  "_honeypot": "",             // Must be empty (bot detection)
  "_timestamp": 1704067200000  // Page load time (timing validation)
}

{
  "success": true,
  "cached": false,
  "profile": {
    "username": "torvalds",
    "score": 96.93,
    "percentile": 97,
    "totalStars": 223690,
    "totalForks": 60864,
    // ... additional metrics
  },
  "nextScanAvailable": "2026-01-15T20:11:50.546Z",
  "hoursRemaining": 23
}

{ "error": "Username is required" }
{ "error": "Bot detected - honeypot field filled" }
{ "error": "Rate limit exceeded", "retryAfter": 300 }
{ "error": "GitHub user not found" }

const response = await fetch('/api/analyze-username', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    username: 'torvalds',
    _honeypot: '',
    _timestamp: Date.now() - 2000 // 2 seconds ago
  })
});
const data = await response.json();
if (data.success) {
  console.log(`Score: ${data.profile.score}/100`);
  if (data.cached) {
    console.log(`Cached data, next scan in ${data.hoursRemaining}h`);
  }
}

Retrieves cached profile data for a given username. Fast endpoint (~50ms) for displaying dashboard data.
GET /api/profile?username=torvalds

{
  "user": { "plan": "FREE" },
  "profile": {
    "username": "torvalds",
    "score": 96.93,
    "percentile": 97,
    "scoreComponents": {
      "impact": { "score": 99.99, "weight": 35, "source": "statistical" },
      "codeQuality": { "score": 38.37, "weight": 30, "source": "statistical" },
      "consistency": { "score": 72.96, "weight": 20, "source": "statistical" },
      "collaboration": { "score": 69.50, "weight": 15, "source": "statistical" }
    },
    "scoringMethod": "fallback",
    "totalStars": 223690,
    "totalRepos": 10,
    "languages": { "C": 98, "Rust": 0.3, "Shell": 0.4 },
    "topRepos": [ /* array of repository objects */ ],
    "contributions": [ /* array of contribution data */ ],
    // ... all profile fields
  }
}

const response = await fetch('/api/profile?username=torvalds');
const data = await response.json();
console.log(`Score: ${data.profile.score}/100`);
console.log(`Impact: ${data.profile.scoreComponents.impact.score}%`);
console.log(`Total Stars: ${data.profile.totalStars.toLocaleString()}`);

Calculates a user's global ranking position among all analyzed profiles. Real-time calculation using Prisma aggregation.
GET /api/global-rank?username=torvalds

{
  "rank": 3,
  "totalProfiles": 1247,
  "percentile": 99.76,
  "score": 96.93
}

// userScore and totalProfiles are assumed to have been loaded earlier in the handler
// Count profiles with higher scores
const higherScores = await prisma.profile.count({
  where: { score: { gt: userScore } }
});
// Rank is 1-based
const rank = higherScores + 1;
// Calculate percentile
const percentile = ((totalProfiles - rank + 1) / totalProfiles) * 100;

const response = await fetch('/api/global-rank?username=torvalds');
const { rank, totalProfiles, percentile } = await response.json();
console.log(`Rank #${rank} of ${totalProfiles}`);
console.log(`Percentile: ${percentile.toFixed(2)}`);

Requests return HTTP 429 when the rate limit is exceeded.
💡 Best Practices:
If implementing a client, respect the cache TTL and avoid repeated analysis requests. Use the /api/profile endpoint for displaying data, which has no rate limits and ~50ms response time.
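A client might centralize its handling of the documented `/api/analyze-username` responses along these lines. This is a sketch under assumptions: `retryAfter` is treated as seconds, the 60-second default is hypothetical, and the `NextStep` type and `nextStep` function are illustrative names.

```typescript
interface AnalyzeBody {
  success?: boolean;
  error?: string;
  retryAfter?: number; // assumed to be in seconds
}

type NextStep =
  | { kind: "show" }                  // render the returned profile
  | { kind: "wait"; ms: number }      // back off before retrying
  | { kind: "fail"; message: string }; // surface the error to the user

function nextStep(status: number, body: AnalyzeBody): NextStep {
  if (status === 429) {
    // Hypothetical 60-second default when the server omits retryAfter.
    return { kind: "wait", ms: (body.retryAfter ?? 60) * 1000 };
  }
  if (body.success) return { kind: "show" };
  return { kind: "fail", message: body.error ?? `Request failed with status ${status}` };
}
```

Waiting out the `retryAfter` interval instead of retrying immediately keeps a client from repeatedly hitting the rate limiter.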
Scores are statistically accurate relative to our baseline population of 100,000+ developers. The system uses z-score normalization, which means a score of 70 always represents "better than 70% of developers" regardless of when it was calculated. However, scores reflect GitHub activity patterns, not developer skill, work ethic, or professional competence.
The scoring system weighs impact (35%) most heavily. If you maintain popular open-source projects with many stars and forks, you'll score higher. Conversely, having many private repositories or working on closed-source projects won't increase your score since GitCheck only analyzes public data.
Common reasons for lower scores: few public repositories, low star count, infrequent commits, or a new GitHub account (account age affects several metrics).
Profiles are cached for 24 hours to reduce GitHub API usage and prevent abuse. After 24 hours, you can request a fresh analysis. The cache warning on your dashboard shows the next available scan time.
No. GitCheck only analyzes publicly available GitHub data. We never access private repositories, require OAuth authentication, or store sensitive information. All data comes from GitHub's public API endpoints.
Focus on the four component areas: Impact (create valuable open-source projects that earn stars), Code Quality (maintain repositories consistently, close issues), Consistency (commit regularly, build streaks), and Collaboration (contribute PRs, do code reviews, join organizations).
Yes. Contact us via GitHub issues on our repository or through the homepage contact information. We'll honor deletion requests within 7 days. Note that all data stored is already publicly available on GitHub.
Full analysis requires fetching data from multiple GitHub API endpoints (repos, commits, PRs, contributions), calculating statistical metrics, and writing to the database. We use GraphQL to optimize this, but GitHub's API has inherent latency. Cached responses are served in ~50ms.
Global ranking counts how many profiles in our database have a higher score than yours. If your score is 85.5 and 42 profiles score higher, your rank is #43. Percentile is calculated as: ((totalProfiles - rank + 1) / totalProfiles) × 100
No. GitCheck is an independent analytics platform that uses GitHub's public API. We are not affiliated with, endorsed by, or sponsored by GitHub, Inc.