
GIT: Remove sensitive data


From time to time users accidentally commit data like passwords or keys into a git repo. While you can use git rm to remove the file, it will still be in the repo's history. Fortunately, git makes it fairly simple to remove the file from the entire repo history.

Change your files

This step should be blatantly obvious, but some users still skip it. If you committed a password, change it! If you committed a key, generate a new one. If you committed private files, remove them.
Once the commit has been pushed you should consider the data to be compromised.

Purge the file from your repo

Now that the password is changed, you want to remove the file from history and add it to the .gitignore to ensure it is not accidentally re-committed. For our examples, we're going to remove Rakefile from the GitHub gem repo.
$ git clone https://github.com/defunkt/github-gem.git
# Initialized empty Git repository in /Users/tekkub/tmp/github-gem/.git/
# remote: Counting objects: 1301, done.
# remote: Compressing objects: 100% (769/769), done.
# remote: Total 1301 (delta 724), reused 910 (delta 522)
# Receiving objects: 100% (1301/1301), 164.39 KiB, done.
# Resolving deltas: 100% (724/724), done.

$ cd github-gem

$ git filter-branch --index-filter 'git rm --cached --ignore-unmatch Rakefile' \
  --prune-empty --tag-name-filter cat -- --all
# Rewrite 48dc599c80e20527ed902928085e7861e6b3cbe6 (266/266)
# Ref 'refs/heads/master' was rewritten
This command will run through the entire history of every branch and tag, changing any commit that involved the file Rakefile, and any commits afterwards. Commits that are empty afterwards (because they only changed the Rakefile) are removed entirely. Please note that this will overwrite your existing tags.
Now that we've erased the file from history, let's ensure that we don't accidentally commit it again.
$ echo "Rakefile" >> .gitignore

$ git add .gitignore

$ git commit -m "Add Rakefile to .gitignore"
# [master 051452f] Add Rakefile to .gitignore
#  1 files changed, 1 insertions(+), 0 deletions(-)
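Before pushing, it can help to confirm that no commit on any branch or tag still touches the file. The following is a minimal sketch that builds a throwaway repository (the demo directory, identity and file contents are made up for illustration) and runs the same rewrite; in a real clone only the final git log check is needed. It uses --branches --tags rather than --all because git filter-branch keeps a backup of the old refs under refs/original/, and --all would walk those too.

```shell
# Identity and settings for the throwaway demo only (hypothetical values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export FILTER_BRANCH_SQUELCH_WARNING=1  # newer Git versions warn about filter-branch

# Build a tiny repository whose history contains Rakefile.
demo=$(mktemp -d)
cd "$demo"
git init -q .
echo "task :default" > Rakefile
echo "hello" > README
git add Rakefile README
git commit -q -m "Initial commit"
echo "more" >> README
git commit -q -am "Update README"

# Number of commits on branches or tags that touch Rakefile before the rewrite.
before=$(git rev-list --count --branches --tags -- Rakefile)

# The same rewrite as in the article.
git filter-branch --index-filter 'git rm --cached --ignore-unmatch Rakefile' \
  --prune-empty --tag-name-filter cat -- --all >/dev/null 2>&1

# The actual check: list commits on any branch or tag that still touch Rakefile.
leftover=$(git log --branches --tags --oneline -- Rakefile)
if [ -z "$leftover" ]; then
  echo "Rakefile no longer appears on any branch or tag"
else
  echo "Rakefile is still referenced by:"
  echo "$leftover"
fi
```

If the check lists any commits, the rewrite missed a ref and should be repeated before anything is pushed.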
This would be a good time to double-check that you've removed everything that you wanted to from the history. If you're happy with the state of the repo, you need to force-push the changes to overwrite the remote repo.
$ git push origin master --force
# Counting objects: 1074, done.
# Delta compression using 2 threads.
# Compressing objects: 100% (677/677), done.
# Writing objects: 100% (1058/1058), 148.85 KiB, done.
# Total 1058 (delta 590), reused 602 (delta 378)
# To https://github.com/defunkt/github-gem.git
#  + 48dc599...051452f master -> master (forced update)
You will need to run this for every branch and tag that was changed. The --all and --tags flags may help make that easier.
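As a rough sketch of what that looks like, the example below force-pushes every branch and then every tag to a local bare repository standing in for the remote. The paths, branch and tag names are hypothetical, and amending the commit is only a stand-in for the real history rewrite. The two flags are issued as separate pushes.

```shell
# Identity for the throwaway demo only (hypothetical values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# A bare repository plays the role of the remote.
work=$(mktemp -d)
git init -q --bare "$work/remote.git"
git init -q "$work/local"
cd "$work/local"
git remote add origin "$work/remote.git"

# Publish one commit on two branches and one tag.
echo "hello" > README
git add README
git commit -q -m "Initial commit"
git branch topic
git tag v1
git push -q origin --all
git push -q origin --tags

# Stand-in for the history rewrite: replace the commit and move the other refs.
git commit -q --amend -m "Initial commit (rewritten)"
git branch -f topic HEAD
git tag -f v1 HEAD >/dev/null

# Overwrite every branch, then every tag, on the remote.
git push -q --force origin --all
git push -q --force origin --tags

# Each ref on the remote should now point at the rewritten commit.
git ls-remote origin
```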

Cleanup and reclaiming space

While git filter-branch rewrites the history for you, the objects will remain in your local repo until they've been dereferenced and garbage collected. If you are working in your main repo you might want to force these objects to be purged.
$ rm -rf .git/refs/original/

$ git reflog expire --expire=now --all

$ git gc --prune=now
# Counting objects: 2437, done.
# Delta compression using up to 4 threads.
# Compressing objects: 100% (1378/1378), done.
# Writing objects: 100% (2437/2437), done.
# Total 2437 (delta 1461), reused 1802 (delta 1048)

$ git gc --aggressive --prune=now
# Counting objects: 2437, done.
# Delta compression using up to 4 threads.
# Compressing objects: 100% (2426/2426), done.
# Writing objects: 100% (2437/2437), done.
# Total 2437 (delta 1483), reused 0 (delta 0)
Note that pushing the branch to a new or empty GitHub repo and then making a fresh clone from GitHub will have the same effect.
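To see that the cleanup really drops the old data, you can record the object ID of the sensitive file before the rewrite and ask Git whether that object still exists afterwards. This is a sketch in a throwaway repository (the demo directory, identity and file contents are assumptions made for illustration); it runs the same cleanup commands as above and uses git cat-file -e, which exits non-zero when an object is missing.

```shell
# Identity and settings for the throwaway demo only (hypothetical values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export FILTER_BRANCH_SQUELCH_WARNING=1  # newer Git versions warn about filter-branch

demo=$(mktemp -d)
cd "$demo"
git init -q .
echo "secret task" > Rakefile
echo "hello" > README
git add Rakefile README
git commit -q -m "Initial commit"
echo "more" >> README
git commit -q -am "Update README"

# Object ID of the file contents we want gone.
blob=$(git rev-parse HEAD:Rakefile)

git filter-branch --index-filter 'git rm --cached --ignore-unmatch Rakefile' \
  --prune-empty --tag-name-filter cat -- --all >/dev/null 2>&1

# History is rewritten, but the old blob should still be in the object store.
if git cat-file -e "$blob" 2>/dev/null; then before=present; else before=purged; fi

# Cleanup steps from the article.
rm -rf .git/refs/original/
git reflog expire --expire=now --all
git gc --quiet --prune=now

# After garbage collection the blob should no longer exist.
if git cat-file -e "$blob" 2>/dev/null; then after=present; else after=purged; fi
echo "before cleanup: $before, after cleanup: $after"
```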

Dealing with collaborators

You may have collaborators who pulled your tainted branch and created their own branches off of it. After they fetch your new branch, they will need to use git rebase on their own branches to rebase them on top of the new one. Each collaborator should also ensure that their branch doesn't reintroduce the file, as this will override the .gitignore file. Make sure your collaborators use rebase and not merge; otherwise they will just reintroduce the file and the entire tainted history... and likely encounter some merge conflicts.
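One way a collaborator might do this is with the --onto form of git rebase, which replays only their own commits (those made after the old branch tip) onto the rewritten branch. The sketch below simulates it inside a single throwaway repository: the branch name feature, the file feature.txt and the identity are hypothetical, and rewriting the local master stands in for fetching the new origin/master.

```shell
# Identity and settings for the throwaway demo only (hypothetical values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export FILTER_BRANCH_SQUELCH_WARNING=1  # newer Git versions warn about filter-branch

demo=$(mktemp -d)
cd "$demo"
git init -q .
git symbolic-ref HEAD refs/heads/master  # name the initial branch as in the article
echo "secret task" > Rakefile
echo "hello" > README
git add Rakefile README
git commit -q -m "Initial commit"
echo "more" >> README
git commit -q -am "Update README"

# The collaborator's branch, based on the tainted master.
git checkout -q -b feature
echo "new feature" > feature.txt
git add feature.txt
git commit -q -m "Add feature"
old_base=$(git rev-parse master)  # tip of the old master the branch forked from

# Stand-in for fetching the rewritten branch: rewrite master only.
git checkout -q master
git filter-branch --index-filter 'git rm --cached --ignore-unmatch Rakefile' \
  --prune-empty -- master >/dev/null 2>&1

# Replay only the commits made after old_base onto the rewritten master.
git rebase -q --onto master "$old_base" feature

# The rebased branch should no longer have Rakefile anywhere in its history.
tainted=$(git log --oneline feature -- Rakefile)
if [ -z "$tainted" ]; then
  echo "feature is clean"
else
  echo "feature still contains Rakefile"
fi
```

In a real clone the collaborator would need to note the old tip of the branch (for example from their reflog) before fetching, and would pass origin/master as the --onto target.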

Cached data on GitHub

Be warned that force-pushing does not erase commits on the remote repo; it simply introduces new ones and moves the branch pointer to point to them. If you are worried about users accessing the bad commits directly via SHA1, you will have to delete the repo and recreate it. If the commits were viewed online, the pages may also be cached. Check for cached pages after you recreate the repo; if you find any, open a ticket on GitHub Support and provide links so staff can purge them from the cache.

Avoiding accidental commits in the future

There are a few simple tricks to avoid committing things you don't want committed. The first, and simplest, is to use a visual program like GitHub for Mac or gitx to make your commits. This lets you see exactly what you're committing, and ensure that only the files you want are added to the repo. If you're working from the command line, avoid the catch-all commands git add . and git commit -a; instead, use git add filename and git rm filename to stage files individually. You can also use git add --interactive to review each changed file and stage it, or part of it, for commit. Finally, git diff --cached shows the changes you have staged for commit. This is the exact diff that your commit will have, as long as you commit without the -a flag.
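A small sketch of that command-line workflow, in a throwaway repository with made-up file names (app.rb and secrets.yml are hypothetical): only the named file is staged, the staged set is reviewed with git diff --cached, and the commit is made without -a so the secret file stays out.

```shell
# Identity for the throwaway demo only (hypothetical values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

demo=$(mktemp -d)
cd "$demo"
git init -q .
echo "hello" > README
git add README
git commit -q -m "Initial commit"

# Two new files in the working tree; only one of them should be committed.
echo 'puts "hi"' > app.rb
echo 'password: hunter2' > secrets.yml

# Stage by name instead of using `git add .`.
git add app.rb

# Review what is staged before committing.
staged=$(git diff --cached --name-only)
echo "Staged for commit: $staged"

# Commit without -a so nothing else is picked up.
git commit -q -m "Add app.rb"

# secrets.yml should still be untracked.
git status --porcelain
```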

Pascal Fares and Open Source Lebanese Movement
