Versioning large files (such as audio samples, videos, datasets, and graphics) can be difficult with distributed version control systems like Git, because every clone carries the full history of every file. Fortunately, an extension to Git makes handling large files easier: Git Large File Storage (LFS) is an open-source project that replaces large files with text pointers inside Git, while storing the contents of the files on a remote server such as GitHub or an AWS bucket.
Installers for Mac, Linux, and Windows are available online at git-lfs.github.com. The site also contains a brief installation guide. In essence, you only need to download the installer, decompress it, and run the installation script. If you have a Mac, Git LFS is also available via Homebrew:
$ brew install git-lfs
After running the installation script, set up LFS via the following command:
$ git lfs install
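Before moving on, it can be worth confirming that the installation worked. The following is a minimal sketch rather than part of the official instructions; the function name check_git_lfs is made up for this example, and it assumes that the git-lfs binary ends up on your PATH:

```shell
# Hypothetical helper: report whether Git LFS is installed and reachable.
check_git_lfs() {
    # git-lfs is the binary that backs the "git lfs" subcommand
    if command -v git-lfs >/dev/null 2>&1; then
        # Print the installed version
        git lfs version
    else
        echo "git-lfs was not found on PATH" >&2
        return 1
    fi
}
```

Calling check_git_lfs prints the installed version, or returns a non-zero status if the binary cannot be found.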
Tracking file types
All you need to do now is to tell Git LFS which file types to track. Navigate to your Git repository, and issue a git lfs track command. For example, if you want Git LFS to automatically handle all .mat files in your repository (although it's rarely a smart idea to have binaries under version control), you would call:
$ git lfs track "*.mat"
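Git LFS stores these patterns in .gitattributes, where a pattern without a slash is matched at any directory depth, so "*.mat" should already cover .mat files in subdirectories. If you prefer to make that explicit, you can use globbing: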
$ git lfs track "**/*.mat"
Or you can track single files:
$ git lfs track myLargeFile.mat
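That's it! The tracking rules are stored in a .gitattributes file in your repository, so remember to commit that file as well. Then continue your work using git commit and git push as usual.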
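Behind the scenes, git lfs track records each pattern in a .gitattributes file in the directory where you ran the command; for "*.mat" the entry reads "*.mat filter=lfs diff=lfs merge=lfs -text". As a rough sketch, a small helper can confirm that a pattern has been registered. The function name lfs_pattern_tracked is hypothetical and not part of Git LFS:

```shell
# Hypothetical helper: succeed if a pattern is registered as an LFS rule.
# Usage: lfs_pattern_tracked PATTERN [path/to/.gitattributes]
lfs_pattern_tracked() {
    pattern=$1
    attrs=${2:-.gitattributes}
    # Without an attributes file, nothing can be tracked
    [ -f "$attrs" ] || return 1
    # Match the pattern literally against the first field, then look for
    # the filter=lfs attribute among the remaining fields of that line
    awk -v p="$pattern" '
        $1 == p {
            for (i = 2; i <= NF; i++)
                if ($i == "filter=lfs") found = 1
        }
        END { exit !found }
    ' "$attrs"
}
```

For example, lfs_pattern_tracked "*.mat" && echo "tracked" prints "tracked" only once the rule is present.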
Storing large files
If you have tried uploading large files to the remote repository before, you might have noticed a warning telling you that GitHub recommends against uploading files larger than 50 MB. Files larger than 100 MB cannot be uploaded at all. With Git LFS installed, the file will instead be uploaded to a dedicated remote host that is different from your remote repository, and the git push command will go through as usual:
$ git commit -am "add large file"
$ git push origin master
Instead of storing the file in the remote repository, Git LFS will upload only a small file reference there. If you try to inspect the file on GitHub, you will find only a note indicating that the file is stored with Git LFS.
Back in the local repository, you will notice that the file is still accessible, until you switch branches.
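To decide which file types are worth tracking, it helps to know which files in the working tree approach the size limits mentioned above. The following sketch lists files above a given size; the function name find_large_files and the default threshold of 50 (counted in MiB) are illustrative choices, not part of Git or Git LFS:

```shell
# Hypothetical helper: list files larger than a threshold given in MiB.
# Usage: find_large_files [directory] [threshold_in_MiB]
find_large_files() {
    dir=${1:-.}
    threshold_mib=${2:-50}
    # find's "c" suffix counts bytes; "+N" means strictly more than N
    threshold_bytes=$((threshold_mib * 1024 * 1024))
    # Skip Git's own storage under .git
    find "$dir" -type f -size +"${threshold_bytes}c" ! -path '*/.git/*'
}
```

Running find_large_files . 50 from the repository root lists candidates for git lfs track before you attempt to push them.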
Retrieving large files
As soon as you switch branches, the locally stored binaries will be gone. If you now inspect the file controlled by Git LFS, all you will find is a tiny text file that might look something like this:
version https://git-lfs.github.com/spec/v1
oid sha256:d63d7c81d9191f17263b0c65f97101083dade9637e069aea23c6be778cbf89bdf
size 68536835
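So where did your file go, you might ask? It is still on the LFS remote host. To download the file from the remote host, use the following command:
$ git lfs fetch
Note that git lfs fetch only downloads the objects into the local LFS storage. To replace the pointer files in your working tree with the actual contents, follow it with git lfs checkout, or use git lfs pull, which combines both steps.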
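Because a pointer file is plain text with one key and value per line, a script can tell pointers apart from real content and read the recorded size. This is a sketch under that assumption; the helper names is_lfs_pointer and lfs_pointer_field are invented for illustration and are not Git LFS commands:

```shell
# Hypothetical helper: succeed if the file looks like a Git LFS pointer,
# judged by the version line that opens every pointer file.
is_lfs_pointer() {
    [ -f "$1" ] || return 1
    head -n 1 "$1" | grep -q '^version https://git-lfs\.github\.com/spec/v1$'
}

# Hypothetical helper: print the value stored under a key ("oid" or "size").
# Usage: lfs_pointer_field FILE KEY
lfs_pointer_field() {
    awk -v key="$2" '$1 == key { print $2; exit }' "$1"
}
```

For the pointer shown earlier, lfs_pointer_field myLargeFile.mat size would print 68536835, and is_lfs_pointer myLargeFile.mat can be used to check whether a file still needs to be replaced by its real contents.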
To see a list of all LFS-related commands, simply type:
$ git lfs