Data Usage Policy

Last updated: December 4, 2025

Overview

This Data Usage Policy explains how GitCheck collects, processes, analyzes, and uses your GitHub data to provide our analytics service. It supplements our Privacy Policy and Terms of Service.

GitHub Data Collection

Public Repository Data

We collect and analyze the following from your public GitHub repositories:

  • Repository metadata: Names, descriptions, creation dates, stars, forks, watchers
  • Commit history: Commit messages, timestamps, frequency patterns, authorship
  • Code statistics: Lines of code, file counts, language distribution
  • Branch information: Active branches, default branch, branch protection
  • Pull requests: Created PRs, merge status, review participation
  • Issues: Opened issues, closed issues, response times
  • Documentation: README files, Wiki pages, docs folders
  • CI/CD: GitHub Actions workflows, test configurations

Profile Information

  • Username and display name
  • Avatar URL
  • Bio and location
  • Account creation date
  • Follower and following counts
  • Organization memberships
  • Public email (if provided)

Activity Metrics

  • Contribution graph data
  • Commit frequency and patterns
  • Active hours and days
  • Streak information
  • Language usage over time

Data Processing and Analysis

FREE Tier Analysis

For FREE users, we calculate:

  • Basic metrics: Total repos, stars, forks, commits, PRs
  • Activity stats: Current streak, longest streak, contributions
  • Language breakdown: Primary languages with percentages
  • Top repositories: Most starred/forked projects
  • Contribution patterns: Activity heatmap, most active days
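
As an illustration only, the streak figures above could be derived from the set of calendar days on which a user contributed. This is a minimal sketch: the function name and the rule that a streak stays alive if the latest contribution was yesterday are assumptions made for the example, not a description of GitCheck's actual implementation.

```python
from datetime import date, timedelta


def streaks(active_days: set[date], today: date) -> tuple[int, int]:
    """Return (current_streak, longest_streak), both counted in days."""
    one_day = timedelta(days=1)

    # Longest streak: only start counting at days that begin a run.
    longest = 0
    for day in active_days:
        if day - one_day in active_days:
            continue  # a run that started earlier already covers this day
        length = 1
        while day + timedelta(days=length) in active_days:
            length += 1
        longest = max(longest, length)

    # Current streak: count backwards from today. Assumption: if there is
    # no contribution yet today, the streak may still end yesterday.
    cursor = today if today in active_days else today - one_day
    current = 0
    while cursor in active_days:
        current += 1
        cursor -= one_day

    return current, longest
```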

PRO Tier Advanced Analysis

PRO subscribers receive comprehensive analysis across four domains:

1. README Quality Analysis (20% of score)

  • Documentation length and completeness
  • Presence of key sections (Installation, Usage, Contributing)
  • Badge usage (build status, coverage, version)
  • Structure and formatting quality
  • Example code snippets
  • License information

2. Repository Health (25% of score)

  • Maintenance frequency (recent commits)
  • Issue response time and resolution rate
  • PR merge rate and review quality
  • Security: Dependabot alerts, vulnerability scanning
  • Community engagement (stars, forks, contributors)
  • Branch protection and code review policies

3. Developer Patterns (30% of score)

  • Commit patterns by hour (0-23 heatmap)
  • Language evolution tracking
  • Productivity peak identification
  • Collaboration style (solo vs team projects)
  • Consistency of contributions
  • Weekend vs weekday activity ratio
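
To make two of these metrics concrete, the sketch below builds the 0-23 hourly heatmap and the weekend vs weekday ratio from a list of commit timestamps. The function names are hypothetical, and the choice of time zone for the timestamps is left to the caller; the policy does not specify either.

```python
from collections import Counter
from datetime import datetime
from typing import Optional


def hourly_heatmap(timestamps: list[datetime]) -> list[int]:
    """Return 24 counts, one per hour of the day (index 0 = 00:00-00:59)."""
    counts = Counter(ts.hour for ts in timestamps)
    return [counts.get(hour, 0) for hour in range(24)]


def weekend_ratio(timestamps: list[datetime]) -> Optional[float]:
    """Return weekend commits divided by weekday commits.

    Returns None when there are no weekday commits, since the ratio
    is undefined in that case.
    """
    # datetime.weekday() is 0 for Monday, so 5 and 6 are Saturday and Sunday.
    weekend = sum(1 for ts in timestamps if ts.weekday() >= 5)
    weekday = len(timestamps) - weekend
    return weekend / weekday if weekday else None
```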

4. Career Insights (25% of score)

  • Experience level (based on account age and depth)
  • Specialization score (technology focus areas)
  • Consistency rating (commitment patterns)
  • Learning curve (skill development trajectory)
  • Portfolio quality assessment
  • Professional presentation indicators

Scoring Algorithm

Overall Developer Score (0-100)

The comprehensive score is calculated as:

Score = (README × 0.20) + (Repo Health × 0.25) + (Dev Patterns × 0.30) + (Career × 0.25)
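
A worked example of the formula, as a minimal sketch. It assumes each of the four domain scores is itself on a 0-100 scale, which the weights summing to 1.0 imply but the policy does not state outright; the dictionary keys and the rounding to two decimals are hypothetical choices for the example.

```python
# Weights taken from the formula above; they sum to 1.0.
WEIGHTS = {
    "readme": 0.20,
    "repo_health": 0.25,
    "dev_patterns": 0.30,
    "career": 0.25,
}


def overall_score(components: dict[str, float]) -> float:
    """Combine four 0-100 domain scores into one 0-100 weighted score."""
    missing = WEIGHTS.keys() - components.keys()
    if missing:
        raise ValueError(f"missing domain scores: {sorted(missing)}")
    for name in WEIGHTS:
        if not 0 <= components[name] <= 100:
            raise ValueError(f"{name} must be between 0 and 100")
    total = sum(components[name] * weight for name, weight in WEIGHTS.items())
    return round(total, 2)


example = {"readme": 80, "repo_health": 60, "dev_patterns": 90, "career": 70}
print(overall_score(example))  # 16 + 15 + 27 + 17.5 = 75.5
```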

Grade Assignment

  • S Grade: 95-100 (Elite)
  • A Grade: 85-94 (Excellent)
  • B Grade: 70-84 (Very Good)
  • C Grade: 55-69 (Good)
  • D Grade: 40-54 (Fair)
  • F Grade: 0-39 (Needs Improvement)
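
The grade bands above can be read as a lookup on each band's lower bound. The sketch below assumes that a fractional score falls into the band whose lower bound it has reached (so 94.5 maps to A); the policy lists whole-number ranges only, so that rule and the function name are assumptions.

```python
# (lower bound, grade), ordered from the highest band to the lowest.
GRADE_FLOORS = [(95, "S"), (85, "A"), (70, "B"), (55, "C"), (40, "D"), (0, "F")]


def grade_for(score: float) -> str:
    """Map a 0-100 score to its letter grade."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    for floor, grade in GRADE_FLOORS:
        if score >= floor:
            return grade
    raise AssertionError("unreachable: the lowest band starts at 0")
```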

Data Storage and Caching

Database Storage (PostgreSQL via Neon)

We store the following persistently, subject to the retention periods described under Data Retention:

  • User profile: GitHub username, email, avatar, bio
  • Analysis snapshots: Historical scores and metrics
  • Subscription status: FREE/PRO tier and purchase date
  • Session data: Login timestamps, last analysis date

Server-Side Caching (1 hour TTL)

Analysis results are cached to improve performance:

  • PRO analysis results cached for 60 minutes
  • Automatic cache invalidation after TTL expires
  • Manual refresh available via "Recalculate" button
  • Cache keys based on username and analysis type
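
For readers who want a concrete picture of this behavior, here is a minimal in-memory sketch of a cache with a 60-minute TTL, keys built from username and analysis type, and manual invalidation. The class and method names are hypothetical, and lowercasing the username in the key is an assumption; the policy does not say which cache technology GitCheck uses.

```python
import time
from typing import Any, Callable, Optional


class TTLCache:
    """Tiny in-memory cache keyed by (username, analysis type)."""

    def __init__(self, ttl_seconds: float = 3600,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested
        self._entries: dict[tuple[str, str], tuple[float, Any]] = {}

    @staticmethod
    def _key(username: str, analysis_type: str) -> tuple[str, str]:
        # Assumption: usernames are compared case-insensitively.
        return (username.lower(), analysis_type)

    def set(self, username: str, analysis_type: str, value: Any) -> None:
        expires_at = self._clock() + self._ttl
        self._entries[self._key(username, analysis_type)] = (expires_at, value)

    def get(self, username: str, analysis_type: str) -> Optional[Any]:
        key = self._key(username, analysis_type)
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: drop it and report a miss
            return None
        return value

    def invalidate(self, username: str, analysis_type: str) -> None:
        """Manual refresh: forget the entry so the next read recomputes."""
        self._entries.pop(self._key(username, analysis_type), None)
```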

Client-Side Session Storage

Temporary storage during your session:

  • Recent analysis results for instant loading
  • Cleared when you close the browser tab
  • Not shared across devices
  • Used only for UI performance optimization

GitHub API Usage

API Endpoints We Access

  • /user - Basic profile information
  • /users/:username/repos - Repository list
  • /repos/:owner/:repo - Repository details
  • /repos/:owner/:repo/commits - Commit history
  • /repos/:owner/:repo/pulls - Pull request data
  • /repos/:owner/:repo/issues - Issue tracking
  • /repos/:owner/:repo/languages - Language statistics
  • /repos/:owner/:repo/readme - README content
  • /users/:username/events - Contribution activity
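
As an example of how one of these responses feeds the analysis: the /repos/:owner/:repo/languages endpoint returns a map from language name to bytes of code. The sketch below combines those maps across repositories into percentages, one plausible way to produce a language breakdown. The function name and the rounding to one decimal are assumptions.

```python
from collections import Counter


def language_percentages(per_repo: list[dict[str, int]]) -> dict[str, float]:
    """Merge per-repository byte counts into overall percentages.

    Each item in per_repo has the shape returned by the languages
    endpoint, for example {"Python": 600, "HTML": 100}.
    """
    totals: Counter = Counter()
    for languages in per_repo:
        totals.update(languages)

    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}

    # most_common() orders languages from the largest share to the smallest.
    return {
        language: round(100 * byte_count / grand_total, 1)
        for language, byte_count in totals.most_common()
    }
```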

Rate Limits and Optimization

  • We respect GitHub's rate limits (5,000 requests/hour for authenticated users)
  • Intelligent caching reduces redundant API calls
  • Parallel processing for faster analysis
  • Exponential backoff for rate limit handling
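
Exponential backoff means waiting progressively longer between retries, typically doubling the delay each time. The sketch below shows the general technique under stated assumptions: RateLimited is a hypothetical exception that the caller's request function raises when the API reports a rate limit, and the attempt count and delays are illustrative values, not GitCheck's actual settings.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical error raised when the API reports a rate limit."""


def call_with_backoff(
    request: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call request(), retrying with doubling delays on RateLimited."""
    for attempt in range(max_attempts):
        try:
            return request()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Delays grow as 1s, 2s, 4s, ... and are capped at max_delay.
            sleep(min(max_delay, base_delay * 2 ** attempt))
    raise ValueError("max_attempts must be at least 1")
```

In practice a small random jitter is often added to each delay so that many clients do not retry at the same instant.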

Permissions Required

GitCheck requests OAuth scopes:

  • read:user - Read profile information
  • public_repo - Access public repository data

We do NOT request access to private repositories, and we never write to your repositories.

Data Processing Location

  • Application hosting: Vercel (global CDN)
  • Database: Neon PostgreSQL (EU/US regions)
  • GitHub API: api.github.com (GitHub's infrastructure)
  • Payment processing: Stripe (when applicable)

Data Retention

Active Accounts

  • Profile data retained indefinitely while account is active
  • Analysis snapshots kept for historical comparison
  • Cache automatically expires and refreshes

Inactive Accounts

  • Accounts inactive for 12+ months may be archived
  • Archived data can be restored upon login
  • Session data cleared after 30 days of inactivity

Deleted Accounts

  • All user data deleted within 30 days of account deletion
  • Backups purged within 90 days
  • Aggregated anonymous analytics may be retained
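
As a worked example of these windows, the sketch below computes the latest dates by which user data and backups would be removed for a given deletion date. It assumes both windows are counted in calendar days from the date of account deletion, which the policy does not state explicitly; the function and key names are hypothetical.

```python
from datetime import date, timedelta

USER_DATA_DELETION_DAYS = 30  # "within 30 days of account deletion"
BACKUP_PURGE_DAYS = 90        # "backups purged within 90 days"


def deletion_deadlines(deleted_on: date) -> dict[str, date]:
    """Return the latest removal dates for user data and for backups."""
    return {
        "user_data_deleted_by": deleted_on + timedelta(days=USER_DATA_DELETION_DAYS),
        "backups_purged_by": deleted_on + timedelta(days=BACKUP_PURGE_DAYS),
    }


# An account deleted on December 4, 2025:
print(deletion_deadlines(date(2025, 12, 4)))
# user data by 2026-01-03, backups by 2026-03-04
```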

Data Accuracy and Updates

GitCheck analyzes your GitHub data as it stands at the moment you run an analysis. To keep results current:

  • FREE users can re-analyze anytime
  • PRO users can refresh analysis after cache expires (1 hour)
  • New repositories appear in next analysis
  • Historical data reflects GitHub's permanent record
  • Deleted repositories are removed from future analyses

Data Sharing and Third Parties

We do NOT sell your data. We share data only with:

  • GitHub: To fetch your public data via their API
  • Vercel: For hosting and processing
  • Neon: For secure database storage
  • Stripe: For payment processing (PRO users only)
  • Legal authorities: If required by law

All third-party processors are contractually bound to protect your data.

Your Data Rights

Access

View your analyzed data anytime in your dashboard. Request a complete data export by contacting us.

Correction

Data is fetched directly from GitHub. To correct it, update your GitHub profile, then re-analyze.

Deletion

To delete your data, revoke GitHub OAuth access or request account deletion. User data is deleted within 30 days of account deletion, and backups are purged within 90 days.

Portability

Export of your analysis results in JSON format is planned (feature coming soon).

Updates to This Policy

We may update this Data Usage Policy to reflect changes in our practices or legal requirements. Significant updates will be communicated via email or prominent website notice.