Hacking:repo-shrink of 2014-06
abslibre.git has gotten large (>150MB) because some files have been committed that shouldn't have been. It is slow to clone, and is taking too much disk space on the server.
To correct this, lukeshu wrote a filter-branch script to shrink the repo. If you have an existing checkout of abslibre.git, you will get a warning about the upstream changing. You have two options:
- Delete (or back up) your checkout, and clone the new version.
- Run the script yourself on your copy. On lukeshu's box, it took about 13 minutes. On the git server, it took about 8 minutes.
If you have unpushed commits in your copy, and are not comfortable with git rebase, then running the script on your copy may be the better option. If you are comfortable with git rebase, then you don't need me to explain what to do.
Running the script
The script is:
cleanup.sh
#!/bin/bash

files=(
  # sources
  libre-testing/hplip-libre/hplip-3.12.4.tar.gz
  libre-testing/hplip-libre/hplip-3.12.4.tar.gz.asc
  pcr/ryzom-hg/.ryzom-hg-20131213

  # sources
  # I've verified that all of these PKGBUILD behave correctly without these files
  libre/blackbox-libre/blackbox-0.70.1.tar.gz
  libre/dvdrip-libre/dvdrip-0.98.11.tar.gz
  pcr/python2-sfml2/1.4.zip
  pcr/python2-sfml2/master.zip
  pcr/qtkeychain/qtkeychain-0.1.zip

  # compressed files
  # I would rather they were uncompressed, but I guess they can stay
  #libre/linux-libre/patch-3.14-gnu-3.14.1-gnu.xz
  #pcr/debootstrap-libre/debootstrap.8.gz

  # binaries
  pcr/wuala # non-free, .jar file checked into git

  # logs
  libre/p7zip-libre/p7zip-libre-9.13-2-i686-build.log

  # vim/kate swap files
  kernels/aufs3-libre/.PKGBUILD.kate-swp
  libre/gnu-ghostscript/.PKGBUILD.swp
  libre/grub2/.archlinux_grub2_mkconfig_fixes.patch.swp
  libre/grub2/.archlinux_grub_mkconfig_fixes.patch.swp
  libre/hplip-libre/.hplip.install.swp
  libre/iceweasel-libre/.libre.patch.swp
  libre/kdebase-konqueror-libre/.PKGBUILD.swp
  libre/linux-libre-tools/.PKGBUILD.swp
  social/hunspell-pt-br/.PKGBUILD.kate-swp
  '~emulatorman'/hunspell-pt-br/.PKGBUILD.kate-swp
)

git filter-branch --prune-empty --index-filter "git rm -r --cached --ignore-unmatch $(printf '%q ' "${files[@]}")" master
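The index filter is passed to git filter-branch as a single string and evaluated by a shell for every commit, so each path has to survive being parsed a second time. That is what the printf '%q ' in the script is for. The sketch below builds the same string from a shortened list and prints it instead of running it. The third entry is a made-up path, added only to show how spaces are handled.

```bash
#!/bin/bash
# Shortened list for illustration. The first two entries come from cleanup.sh;
# the third is hypothetical and exists only to show quoting of spaces.
files=(
  'pcr/wuala'
  '~emulatorman/hunspell-pt-br/.PKGBUILD.kate-swp'
  'hypothetical dir/file with spaces.zip'
)

# Same construction as in cleanup.sh: %q re-quotes every array element so
# that a shell parsing the string later sees exactly one word per path.
index_filter="git rm -r --cached --ignore-unmatch $(printf '%q ' "${files[@]}")"

# Print the command instead of handing it to git filter-branch.
printf '%s\n' "$index_filter"
```

Printing the filter like this before a real run is a cheap way to check that every path in the list comes out as a single shell word.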
Before doing any of this, make sure you have no uncommitted files.
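One way to check this is to look at the output of git status --porcelain, which is empty when nothing is staged, modified, or untracked. The following is only a sketch: the function name tree_is_clean is invented here, and it assumes a git new enough to support the -C option.

```bash
#!/bin/bash
# tree_is_clean DIR: succeed only if the checkout in DIR has no staged,
# unstaged, or untracked changes. (Hypothetical helper, not part of cleanup.sh.)
tree_is_clean() {
  local changes
  # --porcelain prints one line per changed or untracked path, and nothing
  # for a clean tree. Return 2 if git itself fails (e.g. DIR is not a repo).
  changes=$(git -C "$1" status --porcelain) || return 2
  [ -z "$changes" ]
}

# Report on the current directory, but only if it is a git checkout at all.
if git rev-parse --is-inside-work-tree >/dev/null 2>&1; then
  if tree_is_clean .; then
    echo "clean: OK to run cleanup.sh"
  else
    echo "uncommitted files present: commit or stash them first"
  fi
fi
```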
My process of running the script was:
$ emacs cleanup.sh
( enter the above script )
$ cd abslibre
$ time bash ../cleanup.sh
( hundreds (thousands?) of lines of output omitted )
Rewrite 390443060f96a9599bdea9e18811d5a56e4c5b64 (6953/6954)rm 'pcr/qtkeychain/qtkeychain-0.1.zip'
Rewrite d07c850a109062459c30ac81a4097d13603872ee (6954/6954)rm 'pcr/qtkeychain/qtkeychain-0.1.zip'
Ref 'refs/heads/master' was rewritten

real 13m26.555s
user 3m31.543s
sys 1m44.310s
$ cd ..
$ mv abslibre abslibre.bak
$ git clone file:///${PWD}/abslibre.bak abslibre
Cloning into 'abslibre'...
remote: Counting objects: 39965, done.
remote: Compressing objects: 100% (28925/28925), done.
remote: Total 39965 (delta 14456), reused 27420 (delta 8807)
Receiving objects: 100% (39965/39965), 27.82 MiB | 13.71 MiB/s, done.
Resolving deltas: 100% (14456/14456), done.
Checking connectivity... done.
$ cp abslibre.bak/.git/config abslibre/.git/config
$ du -h --max-depth 0 abslibre{,.bak}
50M abslibre
223M abslibre.bak
The reason for performing the clone is that git tries hard not to delete your data, so the "large" version is still sitting in the rewritten repository (which is why abslibre.bak is still 223M above). It could be purged in the existing repository, but it is safer to clone it, especially if you aren't extremely familiar with git. You probably want to keep your abslibre.bak for a while, just in case.
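For the curious, purging in the existing repository would look roughly like the sketch below. It follows the usual recipe of deleting the backup refs that filter-branch leaves under refs/original/, expiring the reflogs, and garbage-collecting. This is not what was done above, and it is destructive: once it has run, the old history cannot be recovered from that repository. The function name is made up for this example. Note also that any other ref still pointing at the old history (a remote-tracking branch such as origin/master, or a tag) will keep the large objects alive until it is updated or removed.

```bash
#!/bin/bash
# purge_rewritten_history REPO: drop the pre-rewrite history from REPO.
# DESTRUCTIVE sketch; hypothetical helper, not part of cleanup.sh.
purge_rewritten_history() {
  local repo=$1 ref

  # filter-branch saves the old branch tips under refs/original/; delete them.
  git -C "$repo" for-each-ref --format='%(refname)' refs/original/ |
    while read -r ref; do
      git -C "$repo" update-ref -d "$ref"
    done

  # Expire all reflog entries so they no longer reference the old commits.
  git -C "$repo" reflog expire --expire=now --all

  # Repack, and delete every object that is now unreachable.
  git -C "$repo" gc --prune=now
}

# Example call (assumes a checkout named abslibre in the current directory):
#   purge_rewritten_history abslibre
```

Comparing du -h --max-depth 0 on the repository before and after is a quick way to see whether anything was actually freed.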