Git Data Recovery Techniques: Rescue Lost Commits and Data
Understanding Git's Recovery Capabilities
Git is designed with data integrity and recovery in mind. Unlike many version control systems, Git rarely deletes anything immediately. Commits, trees, and blobs that become unreachable remain in the repository for weeks or months before garbage collection removes them. This design gives you a powerful safety net—most "lost" data can be recovered if you know where to look and which tools to use.
Core principle: Git's object database is effectively append-only in day-to-day use. When you delete a branch or reset a commit, Git removes the reference (pointer) to the objects, but the objects themselves persist until garbage collection prunes them. The git reflog tracks reference changes, and git fsck finds unreachable objects. Together, they form your primary recovery toolkit.
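The principle above can be checked in a throwaway repository. This is a minimal sketch, assuming a POSIX shell with git installed; the branch name scratch and the demo identity are made up for illustration.

```shell
set -eu
# Demo identity so commits work without any global git config (hypothetical values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d) && cd "$repo"
git init -q . && git symbolic-ref HEAD refs/heads/main

git commit -q --allow-empty -m "base"
git checkout -q -b scratch
git commit -q --allow-empty -m "work on scratch"
tip=$(git rev-parse HEAD)

git checkout -q main
git branch -q -D scratch

# The reference is gone...
if git show-ref --verify --quiet refs/heads/scratch; then
  ref_state=present
else
  ref_state=deleted
fi
# ...but the commit object it pointed to is still in the object database.
obj_type=$(git cat-file -t "$tip")
echo "ref: $ref_state, object type: $obj_type"
```

Deleting the branch only removed the pointer; the object stays until garbage collection prunes it.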
The Recovery Toolkit: Essential Commands
| Command | Purpose | When to Use |
|---|---|---|
| git reflog | Shows history of HEAD and branch movements | Recover recently lost commits, branches, or resets |
| git fsck | Verifies database integrity, finds dangling objects | Find unreachable commits not in reflog |
| git log -g | Shows reflog with commit information | Detailed view of reflog entries |
| git show [hash] | Displays object contents | Inspect recovered objects before restoring |
| git branch [name] [hash] | Creates branch at specific commit | Restore a recovered commit as a branch |
| git cherry-pick [hash] | Applies specific commit to current branch | Extract individual commits |
Technique #1: Recovering Lost Commits with Reflog
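The reflog (reference log) is Git's safety net. It records every time the tip of a branch or HEAD is updated: commits, checkouts, resets, rebases, and merges. By default, reflog entries are kept for 90 days, or 30 days when the commit is no longer reachable from the branch's current tip (which is the case for commits dropped by a reset), so you still have weeks to recover from mistakes.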
The Scenario
You ran git reset --hard HEAD~3 to undo some commits, but now you need those commits back. Or you deleted a branch and want to restore it. The reflog is your first line of defense.
Step-by-Step Recovery Using Reflog
# 1. View the reflog
$ git reflog
a1b2c3d HEAD@{0}: reset: moving to HEAD~3
e4f5g6h HEAD@{1}: commit: Add payment integration
i7j8k9l HEAD@{2}: commit: Update API endpoints
m0n1o2p HEAD@{3}: commit: Fix validation bug
q3r4s5t HEAD@{4}: checkout: moving from main to feature
# 2. Identify the commit you need (e4f5g6h in this example)
# 3. Inspect the commit to verify it's correct
$ git show e4f5g6h
# 4. Restore as a new branch
$ git branch recovered-feature e4f5g6h
# 5. Or reset the current branch to that commit (this discards uncommitted changes)
$ git reset --hard e4f5g6h
# 6. Or cherry-pick specific commits onto another branch, oldest first
$ git checkout main
$ git cherry-pick i7j8k9l e4f5g6h
Why this works: The reflog keeps a local history of every position HEAD has held, until the entries expire. Even though you reset to an older commit, the reflog still knows about the commits you "lost." They're still in the object database, just not referenced by any branch.
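The reset scenario can be rehearsed safely in a scratch repository. This is a minimal sketch, assuming a POSIX shell with git installed; it reuses the commit messages from the example, and the demo identity is made up.

```shell
set -eu
# Demo identity so commits work without any global git config (hypothetical values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d) && cd "$repo"
git init -q . && git symbolic-ref HEAD refs/heads/main

for msg in "Fix validation bug" "Update API endpoints" "Add payment integration"; do
  git commit -q --allow-empty -m "$msg"
done
lost_tip=$(git rev-parse HEAD)

# "Lose" the last two commits.
git reset -q --hard HEAD~2

# HEAD@{1} is where HEAD pointed immediately before the reset.
git branch recovered-feature 'HEAD@{1}'
recovered=$(git rev-parse recovered-feature)
git log --oneline recovered-feature
```

Creating a branch rather than resetting again leaves the current branch untouched while you inspect what was recovered.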
Recovering a Deleted Branch with Reflog
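# 1. View HEAD's reflog (the deleted branch's own reflog is removed together with the branch)
$ git reflog
# 2. Find the last commit of the deleted branch
# Look for the entry just before you switched away from the branch
f6e5d4c HEAD@{0}: checkout: moving from deleted-branch to main
a1b2c3d HEAD@{1}: commit: Final feature commit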
# 3. Recreate the branch at that commit
$ git branch deleted-branch a1b2c3d
# 4. Verify your work is restored
$ git checkout deleted-branch
$ git log
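The branch recovery steps above can be sketched end to end as follows. This assumes a POSIX shell with git installed and searches HEAD's reflog by commit subject; in a real repository you would pick the entry by eye. The demo identity is made up.

```shell
set -eu
# Demo identity so commits work without any global git config (hypothetical values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d) && cd "$repo"
git init -q . && git symbolic-ref HEAD refs/heads/main

git commit -q --allow-empty -m "base"
git checkout -q -b deleted-branch
git commit -q --allow-empty -m "Final feature commit"
git checkout -q main
git branch -q -D deleted-branch

# %H is the commit hash, %gs the reflog message for that entry.
tip=$(git log -g --format='%H %gs' HEAD |
      grep 'commit: Final feature commit' | head -n 1 | cut -d' ' -f1)

git branch deleted-branch "$tip"
subject=$(git log -1 --format=%s deleted-branch)
echo "restored branch at: $subject"
```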
Technique #2: Finding Dangling Commits with git fsck
Sometimes commits aren't in the reflog: perhaps the entries have expired, the reflog was cleared, or the object was never referenced by a branch at all (a dropped stash, or a file that was staged but never committed). git fsck (file system check) verifies the object database and can find dangling commits and blobs.
The Scenario
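You created some commits in a detached HEAD state, then switched away without creating a branch, and the reflog entries have since expired or been cleared (for example with git reflog expire). Or you staged a file with git add and never committed it. Or you're recovering from a corrupted repository. Note that commits made on a detached HEAD are recorded in HEAD's reflog, and git fsck treats reflog entries as references by default; pass --no-reflogs to also list commits that only a reflog still holds.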
Finding and Restoring Dangling Commits
# 1. Find dangling objects
$ git fsck --lost-found
dangling commit a1b2c3d4e5f678901234567890abcdef12345678
dangling commit b2c3d4e5f678901234567890abcdef1234567890
dangling blob c3d4e5f678901234567890abcdef123456789012
# 2. Inspect each dangling commit
$ git show a1b2c3d4e5f678901234567890abcdef12345678
# 3. If it's the commit you need, create a branch
$ git branch recovered-work a1b2c3d4e5f678901234567890abcdef12345678
# 4. Dangling blobs might be file contents - examine them
$ git show c3d4e5f678901234567890abcdef123456789012
# 5. Restore blob to a file
$ git show c3d4e5f6 > recovered-file.txt
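How it works: git fsck --lost-found scans the object database and reports objects that aren't reachable from any ref (branch, tag, or HEAD) or, by default, from any reflog entry. These are your "lost" commits and files. The --lost-found option also records dangling objects under .git/lost-found/: commits go to .git/lost-found/commit/ and everything else to .git/lost-found/other/, with the contents of each dangling blob written into its file for easy recovery.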
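To see a dangling commit surface, the sketch below creates one on a detached HEAD and asks fsck to ignore reflogs. It assumes a POSIX shell with git installed; it uses --no-reflogs instead of waiting for reflog entries to expire, and the demo identity is made up.

```shell
set -eu
# Demo identity so commits work without any global git config (hypothetical values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d) && cd "$repo"
git init -q . && git symbolic-ref HEAD refs/heads/main

git commit -q --allow-empty -m "base"
git checkout -q --detach
git commit -q --allow-empty -m "experiment on detached HEAD"
orphan=$(git rev-parse HEAD)
git checkout -q main

# --no-reflogs: do not count commits that only a reflog entry still references.
found=$(git fsck --no-reflogs 2>/dev/null |
        awk '$1 == "dangling" && $2 == "commit" {print $3}')

git branch recovered-work "$found"
echo "recovered: $(git log -1 --format=%s recovered-work)"
```

Without --no-reflogs, the same commit would not be reported, because HEAD's reflog still points at it and git reflog alone would be enough to recover it.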
💡 Pro tip: Combine reflog and fsck for comprehensive recovery. Use reflog for recent losses (within the 30 to 90 day expiry window), and fsck for older or orphaned objects. git gc eventually prunes unreachable objects (by default once they are more than two weeks old and no reflog entry points to them), so act quickly.
Technique #3: Recovering After a Corrupted Repository
Repository corruption is rare but catastrophic. It can happen due to hardware failure, improper shutdowns, or manual tampering with .git directory contents. Git provides tools to diagnose and repair corruption.
⚠️ WARNING: Repository corruption requires immediate action. Stop using the repository, create a backup of the .git folder, and work on a copy. Improper repair attempts can make recovery impossible.
Diagnosing Corruption
# Run a full integrity check
$ git fsck --full
error: object file .git/objects/ab/cdef123... is empty
error: object file .git/objects/12/345678... is missing
missing blob 1234567890abcdef1234567890abcdef12345678
# Check for corrupt packfiles
$ git verify-pack -v .git/objects/pack/*.idx
Recovery Strategies for Corrupted Repositories
Strategy A: Restore from Remote
If you have a healthy remote repository, the simplest solution is to clone a fresh copy and reapply any local uncommitted changes.
# Back up the working directory first
$ cp -r working-directory working-backup
# Clone fresh from remote
$ git clone [remote-url] fresh-repo
# Copy over uncommitted changes, including dotfiles but not the corrupted .git directory
$ rsync -a --exclude='.git' working-backup/ fresh-repo/
$ cd fresh-repo
$ git status # Review and commit changes
Strategy B: Repair Missing Objects from Alternate Sources
If you have another clone of the repository (perhaps on a teammate's machine or a backup), you can copy missing objects.
# In the corrupted repo, list the missing objects
$ cd corrupted-repo
$ git fsck --full
# Add the healthy clone as a remote and fetch from it
$ git remote add backup [path-to-healthy-repo]
$ git fetch backup
# Or, from the parent directory, manually copy loose objects (works only for objects not stored in a packfile)
$ cp healthy-repo/.git/objects/[ab]/[cdef...] corrupted-repo/.git/objects/[ab]/
Strategy C: Recover Using git unpack-objects
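If a packfile is damaged, git unpack-objects can extract the objects that are still readable into loose objects. Move the pack out of the repository first, because unpack-objects skips any object the repository already has.
# Move packfiles out of the object store
$ mv .git/objects/pack/* /tmp/
# Unpack whatever is readable into loose objects (-r keeps going past errors)
$ git unpack-objects -r < /tmp/pack-[name].pack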
# Run fsck to check progress
$ git fsck --full
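The mechanics of Strategy C can be tried on a healthy pack first. This is a minimal sketch, assuming a POSIX shell with git installed: the pack is intact, so it only shows the move-then-unpack procedure, not real corruption. File names and the demo identity are made up.

```shell
set -eu
# Demo identity so commits work without any global git config (hypothetical values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d) && cd "$repo"
git init -q . && git symbolic-ref HEAD refs/heads/main

echo "hello" > file.txt
git add file.txt && git commit -q -m "add file"

# Put every object into a single packfile and drop the loose copies.
git repack -a -d -q

# Move the pack (and its index) out of the object store.
pack_dir=$(mktemp -d)
mv .git/objects/pack/* "$pack_dir"/

# Re-create the objects as loose objects; -r salvages what it can from a damaged pack.
for pack in "$pack_dir"/*.pack; do
  git unpack-objects -q -r < "$pack"
done

git fsck --full
```

unpack-objects writes nothing for objects the repository can already read, which is why the pack has to be moved away before unpacking.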
Technique #4: Recovering Lost Stashes
Stashes are a common source of "lost" work. You stash changes, then later run git stash drop or git stash clear accidentally. Fortunately, stashes are just commits under the hood.
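# Find dropped stash commits (a default stash is a merge commit whose message starts with "WIP on")
$ git fsck --unreachable | grep commit | cut -d' ' -f3 | xargs git log --merges --no-walk --grep=WIP
# Stashes that still exist are listed in the stash reflog
$ git reflog show refs/stash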
# If you find a stash commit, apply it
$ git stash apply [stash-commit-hash]
# Or create a branch from the stash
$ git branch recovered-stash [stash-commit-hash]
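The following sketch drops a stash on purpose and brings it back with the fsck pipeline. It assumes a POSIX shell with git installed and a stash created with the default "WIP on ..." message; the file name and demo identity are made up.

```shell
set -eu
# Demo identity so commits and stashes work without global git config (hypothetical values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d) && cd "$repo"
git init -q . && git symbolic-ref HEAD refs/heads/main

echo "v1" > notes.txt
git add notes.txt && git commit -q -m "add notes"

echo "v2 (unsaved work)" > notes.txt
git stash -q          # working tree goes back to v1
git stash drop -q     # the "accident"

# The stash commit is now unreachable; --merges and --grep=WIP single it out.
lost=$(git fsck --unreachable 2>/dev/null | grep commit | cut -d' ' -f3 |
       xargs git log --merges --no-walk --grep=WIP --format=%H)

git stash apply -q "$lost"
cat notes.txt
```

A stash saved with a custom message does not contain "WIP", so in that case drop the --grep filter and inspect the remaining merge commits by hand.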
Technique #5: Recovering Uncommitted Changes
What if you lost uncommitted changes? Git doesn't track unstaged changes automatically, but there are still recovery options.
The Scenario
You ran git checkout . or git reset --hard and lost all your unstaged changes. Or your editor crashed without saving. Is the work gone forever?
Recovery Options for Uncommitted Changes
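# 1. Find blobs that were staged with git add at some point but never committed
$ git fsck --unreachable | grep blob
# 2. Look in .git/objects for recently written loose objects
$ find .git/objects -type f -mtime -1
# 3. Check editor recovery files (locations depend on your configuration)
$ ls ~/.vim/backup # Vim, if a backup directory is configured; swap files otherwise sit next to the file as .[name].swp
$ ls .idea/workspace.xml # IntelliJ workspace state; file contents live in the IDE's Local History
# 4. Use file recovery tools on the filesystem (extundelete targets ext3/ext4 and needs the partition unmounted)
$ sudo extundelete /dev/sda1 --restore-file path/to/file
# 5. If the changes are still staged, the index still references them
$ git ls-files --stage | grep [filename]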
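Option 1 only helps for content that was staged at some point, which the sketch below demonstrates. It assumes a POSIX shell with git installed; the file name and demo identity are made up.

```shell
set -eu
# Demo identity so commits work without any global git config (hypothetical values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d) && cd "$repo"
git init -q . && git symbolic-ref HEAD refs/heads/main

echo "committed" > app.txt
git add app.txt && git commit -q -m "add app"

echo "staged but never committed" > app.txt
git add app.txt          # this writes a blob into the object database
git reset -q --hard      # index and working tree both go back to HEAD

# The staged blob is now dangling; --lost-found writes its contents
# to .git/lost-found/other/<hash>.
git fsck --lost-found >/dev/null 2>&1
recovered=$(cat .git/lost-found/other/*)
echo "$recovered"
```

Edits that were never passed to git add leave no blob behind, so for those only editor or filesystem recovery applies.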
⚠️ Important: Git does not automatically save unstaged changes. The best prevention is frequent commits or using git add -p to stage incrementally. Consider using IDE features that maintain local history.
Technique #6: Recovering After Force Push
Someone force-pushed to a shared branch and overwrote commits. Now the branch is missing work. Recovery depends on having access to the lost commits elsewhere.
The Scenario
A teammate ran git push --force on the main branch, overwriting several commits. Other team members may have the commits locally, but the remote is now missing them.
Recovery Steps
# 1. Check whether any clone still has the old commits
$ git reflog show origin/main # On teammate's machine
# 2. If found, push them back
$ git push origin [lost-commit-hash]:main --force-with-lease
# 3. If no local copies, check CI/CD artifacts
# Many CI systems keep commit hashes in build logs
# 4. Look in other clones or backups
$ git clone [other-clone-url] recovery-repo
# 5. As last resort, use git reflog on remote (if accessible)
# Some hosting providers expose reflog via API
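Steps 1 and 2 can be simulated locally with a bare repository standing in for the remote. This is a minimal sketch, assuming a POSIX shell with git installed; the clone names alice and bob and the demo identity are hypothetical.

```shell
set -eu
# Demo identity so commits work without any global git config (hypothetical values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d) && cd "$work"

# A bare repository plays the role of the shared remote.
git init -q --bare remote.git
git --git-dir=remote.git symbolic-ref HEAD refs/heads/main

# Alice publishes c1, then Bob clones.
git init -q alice && cd alice
git symbolic-ref HEAD refs/heads/main
git remote add origin ../remote.git
git commit -q --allow-empty -m "c1"
git push -q origin main
cd .. && git clone -q remote.git bob

# Alice publishes c2 and Bob fetches it.
(cd alice && git commit -q --allow-empty -m "c2" && git push -q origin main)
(cd bob && git fetch -q origin)
lost=$(git -C alice rev-parse HEAD)

# Alice rewinds and force-pushes; Bob's next fetch records the forced update.
(cd alice && git reset -q --hard HEAD~1 && git push -q --force origin main)
(cd bob && git fetch -q origin)

# Bob's reflog for origin/main still remembers the overwritten tip.
cd bob
old_tip=$(git rev-parse 'origin/main@{1}')
git push -q --force-with-lease origin "$old_tip:main"
restored=$(git --git-dir=../remote.git rev-parse main)
echo "remote main restored to $restored"
```

The reflog for origin/main exists only in clones that fetched while the old commits were still on the remote, which is why it pays to ask every teammate.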
Technique #7: Recovering Deleted Tags
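Tags can be deleted accidentally. A lightweight tag is just a reference to a commit, while an annotated tag is also an object of its own; either way, deleting the tag leaves the commit in place. If you still have the terminal output, git tag -d prints the hash the tag pointed to ("Deleted tag 'v1.0.0' (was [hash])").
# Search the reflogs (tag updates are only logged when core.logAllRefUpdates is set to always)
$ git reflog --all | grep tag
# Look for unreachable annotated tag objects in fsck
$ git fsck --unreachable | grep tag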
# Recreate the tag at the found commit
$ git tag v1.0.0 [commit-hash]
# If it was an annotated tag, recreate with -a
$ git tag -a v1.0.0 [commit-hash] -m "Release 1.0.0"
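For an annotated tag, the original tag object (with its message and date) can often be reattached instead of recreated. The sketch below shows this, assuming a POSIX shell with git installed; it uses git update-ref to point the tag name back at the recovered object, and the demo identity is made up.

```shell
set -eu
# Demo identity so commits and tags work without global git config (hypothetical values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d) && cd "$repo"
git init -q . && git symbolic-ref HEAD refs/heads/main

git commit -q --allow-empty -m "release commit"
git tag -a v1.0.0 -m "Release 1.0.0"
git tag -d v1.0.0 >/dev/null

# fsck prints lines such as "unreachable tag <hash>" for orphaned tag objects.
tag_obj=$(git fsck --unreachable 2>/dev/null | awk '$2 == "tag" {print $3}')

# Point the tag name back at the original tag object.
git update-ref refs/tags/v1.0.0 "$tag_obj"
kind=$(git cat-file -t v1.0.0)
echo "v1.0.0 is a $kind object again"
```

A deleted lightweight tag leaves no object of its own behind, so there you need the commit hash from the deletion message or from history.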
Technique #8: Recovering from Corrupted Index
The Git index (.git/index) can become corrupted, causing strange behavior with git status and staging.
# 1. Backup current index
$ cp .git/index .git/index.backup
# 2. Remove and rebuild index
$ rm .git/index
$ git reset # Rebuilds index from HEAD
# 3. Re-stage the changes you had staged (git add . stages everything in the working tree)
$ git add .
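The rebuild can be rehearsed by damaging the index of a scratch repository on purpose. This is a minimal sketch, assuming a POSIX shell with git installed; overwriting the file with a short string stands in for real corruption, and the demo identity is made up.

```shell
set -eu
# Demo identity so commits work without any global git config (hypothetical values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d) && cd "$repo"
git init -q . && git symbolic-ref HEAD refs/heads/main

echo "data" > file.txt
git add file.txt && git commit -q -m "add file"

# Simulate corruption by overwriting the index with junk.
printf 'not an index' > .git/index
if git status >/dev/null 2>&1; then before=ok; else before=broken; fi

# Back up, remove, and rebuild the index from HEAD.
cp .git/index .git/index.backup
rm .git/index
git reset -q
if git status >/dev/null 2>&1; then after=ok; else after=broken; fi

echo "before: $before, after: $after"
```

git reset without arguments touches only the index, so files in the working tree are left as they were.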
Technique #9: Recovering Lost Notes
Git notes attach metadata to commits without changing the commit itself. Notes can be lost if the notes ref is deleted.
# Find unreachable commits and inspect them to locate the notes commit
$ git fsck --unreachable | grep commit | cut -d' ' -f3 | xargs git show
# Restore notes ref
$ git update-ref refs/notes/commits [found-notes-commit]
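A full round trip for the default notes ref might look like the sketch below. It assumes a POSIX shell with git installed and relies on the commit message that git notes add writes by default ("Notes added by 'git notes add'"); the demo identity is made up.

```shell
set -eu
# Demo identity so commits and notes work without global git config (hypothetical values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d) && cd "$repo"
git init -q . && git symbolic-ref HEAD refs/heads/main

git commit -q --allow-empty -m "base"
git notes add -m "reviewed by QA"

# Simulate losing the notes ref.
git update-ref -d refs/notes/commits

# The notes commit is now unreachable; pick it out by its default message.
notes_commit=$(git fsck --unreachable 2>/dev/null | grep commit | cut -d' ' -f3 |
               xargs git log --no-walk --grep="Notes added by" --format=%H)

git update-ref refs/notes/commits "$notes_commit"
note=$(git notes show HEAD)
echo "$note"
```

If several notes commits turn up, the most recent one normally contains the full notes tree.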
Technique #10: Automated Recovery Script
Create a recovery script to automate finding and listing recoverable objects:
#!/bin/bash
# git-recover.sh - Find recoverable commits and objects
echo "=== RECOVERABLE COMMITS FROM REFLOG ==="
git reflog --date=local | head -20
# Run fsck once and reuse its output for both reports
dangling=$(git fsck --lost-found 2>/dev/null)
echo "=== DANGLING COMMITS ==="
echo "$dangling" | grep "dangling commit"
echo "=== DANGLING BLOBS ==="
echo "$dangling" | grep "dangling blob"
echo "=== RECENTLY MODIFIED OBJECTS ==="
find .git/objects -type f -mtime -7 | head -10
# Make executable and run
$ chmod +x git-recover.sh
$ ./git-recover.sh
Prevention: Backup Strategies for Git Repositories
The best recovery is prevention. Implement these backup strategies to ensure you never lose Git data:
| Strategy | Implementation | Frequency |
|---|---|---|
| Remote mirrors | git clone --mirror to backup server | Daily |
| Bundle backups | git bundle create repo.bundle --all | Weekly |
| Reflog preservation | git config gc.reflogExpire never (and gc.reflogExpireUnreachable never) | One-time |
| Filesystem snapshots | ZFS/LVM snapshots of .git directory | Hourly |
| CI/CD artifact storage | Store commit SHAs in build artifacts | Per build |
💡 Best practice: For critical repositories, maintain multiple backups in different locations. Use git bundle to create portable backups that can be restored even without a full Git server.
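A bundle backup and restore cycle can be sketched as follows, assuming a POSIX shell with git installed; the paths and demo identity are made up for the demonstration.

```shell
set -eu
# Demo identity so commits work without any global git config (hypothetical values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d) && cd "$repo"
git init -q . && git symbolic-ref HEAD refs/heads/main

echo "data" > file.txt
git add file.txt && git commit -q -m "add file"

backup_dir=$(mktemp -d)

# Pack every ref and the objects they need into a single file.
git bundle create "$backup_dir/repo.bundle" --all 2>/dev/null

# Check that the bundle is valid and self-contained.
git bundle verify "$backup_dir/repo.bundle" >/dev/null 2>&1

# Restoring is an ordinary clone with the bundle file as the source.
git clone -q "$backup_dir/repo.bundle" "$backup_dir/restored"

original=$(git rev-parse main)
restored=$(git -C "$backup_dir/restored" rev-parse refs/remotes/origin/main)
echo "original: $original"
echo "restored: $restored"
```

Because the bundle is a single ordinary file, it can be copied to any storage that holds files, with no Git server involved.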
Recovery Decision Tree
Lost Commits?
- Check git reflog (most recent losses)
- Check git fsck --lost-found (older/orphaned)
- Ask teammates for their local copies
- Check CI/CD logs for commit hashes
- Restore from backup bundles
Corrupted Repository?
- Clone fresh from remote
- Copy missing objects from other clones
- Repair packfiles with git unpack-objects
- Restore from filesystem backups
Lost Uncommitted Changes?
- Check editor recovery files
- Look for staged blobs in object database
- Use filesystem recovery tools
- Check IDE local history
Frequently Asked Questions
How long does Git keep unreachable commits?
By default, reflog entries for commits that are no longer reachable from the current tip of their ref expire after 30 days (gc.reflogExpireUnreachable), and entries for reachable commits expire after 90 days (gc.reflogExpire). Once no ref or reflog entry points to a commit, it becomes eligible for garbage collection, and git gc prunes such objects when they are older than gc.pruneExpire (2 weeks by default). After pruning, the objects are permanently deleted from that repository. However, if you have backups or other clones, you may still recover from those sources.
What is the difference between git log and git reflog?
git log shows the commit history reachable from the current references (branches, tags). git reflog shows the history of where HEAD pointed, including commits that are no longer reachable from any branch. The reflog is local to your repository and isn't shared with remotes. Think of reflog as your personal safety net, while log shows the public history.
Can I recover commits after garbage collection has removed them?
Once garbage collection has physically removed objects from the filesystem, standard Git commands cannot recover them. However, if you have filesystem-level backups, snapshots, or another clone of the repository that still has the objects, you can restore from those sources. Some advanced recovery tools might attempt to reconstruct data from disk blocks, but this is unreliable and not Git-specific. Prevention through regular backups is essential.
How do I restore a deleted file?
Find the commit that deleted the file: git log --diff-filter=D -- [file-path]. Once you have that commit's hash, restore the file with: git checkout [commit-hash]^ -- [file-path]. The caret (^) selects the parent of the deleting commit, which is the last commit where the file still existed. This restores the file to your working directory, staged and ready to commit.
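The file restore described above can be sketched in a scratch repository, assuming a POSIX shell with git installed; the file name config.yml and the demo identity are made up.

```shell
set -eu
# Demo identity so commits work without any global git config (hypothetical values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d) && cd "$repo"
git init -q . && git symbolic-ref HEAD refs/heads/main

echo "important" > config.yml
git add config.yml && git commit -q -m "add config"
git rm -q config.yml && git commit -q -m "remove config"

# --diff-filter=D keeps only commits that deleted the path.
deleting=$(git log -1 --diff-filter=D --format=%H -- config.yml)

# Take the file from the parent of the deleting commit.
git checkout "$deleting^" -- config.yml

staged=$(git diff --cached --name-only)
echo "restored and staged: $staged"
```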
Can I recover commits after a force push to GitHub?
GitHub does not expose a reflog directly, but it keeps some record of recent pushes: the repository's activity view or the Events API may show the commit a branch pointed to before the force push. If you can find that hash and the commit has not yet been cleaned up on GitHub's side, you may be able to create a branch from it again. You can also check whether any forks or clones of the repository have the commits. For GitHub Enterprise, contact support immediately; they may be able to restore from backups within a limited window.