git-clone-subset (1) - Linux Manuals

NAME

git-clone-subset - Clones a subset of a git repository

SYNOPSIS

git-clone-subset [options] repository destination-dir pattern

DESCRIPTION

Clones a repository into a destination-dir and runs on the clone
git filter-branch --prune-empty --tree-filter 'git rm ...' -- --all
to prune from history all files except the ones matching pattern, effectively creating a clone with a subset of files (and history) of the original repository.

Useful for creating a new repository out of a set of files from another repository, migrating (only) their associated history. Very similar to what
git filter-branch --subdirectory-filter
does, but for a file pattern instead of just a single directory.

OPTIONS

-h, --help
show usage information.
repository
URL or local path to the git repository to be cloned.
destination-dir
Directory in which to create the clone. The same rules as for git-clone apply: it will be created if it does not exist, and it must be empty otherwise. Unlike git-clone, however, this argument is not optional: git-clone uses several rules to determine the "humane" dir name of a cloned repo, and git-clone-subset will not risk parsing its output, let alone predicting the chosen name.
pattern
Glob pattern matching the desired files/dirs. It is ultimately evaluated by a call to bash, NOT git or sh, using the extended glob rule '!(<pattern>)'. Quote or escape it on the command line so that your current shell does not expand it prematurely. Only a single pattern is allowed: if more are required, use extglob's "|" syntax. Globs are evaluated with bash's shopt dotglob set, so hidden (dot) files are matched too; beware. Patterns should not contain spaces or special chars like " ' $ ( ) { } `, not even quoted or escaped, since they might interfere with the !() syntax after pattern expansion.

Pattern Examples:

"*.png"
"*.png|*icon*"
"*.h|src/|lib"
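
To preview what a pattern leaves out before running the tool, the negated glob can be tried by hand. The following is only a sketch: list_pruned is a hypothetical helper, not part of git-clone-subset, and it inspects only the top-level entries of a plain directory rather than history. It assumes bash is on the PATH and mirrors the evaluation rules described above (a separate call to bash with extglob and dotglob set).

```shell
# list_pruned PATTERN DIR (hypothetical helper, not part of git-clone-subset)
# Prints the top-level entries of DIR that do NOT match PATTERN, that is,
# the names selected by the extended glob '!(PATTERN)'.
list_pruned() {
    (
        cd "$2" || exit 1
        # PATTERN is pasted unquoted inside !() and expanded by a separate
        # bash process started with extglob and dotglob enabled.
        bash -O extglob -O dotglob -c "printf '%s\n' !($1)"
    )
}
```

For instance, list_pruned '*.png|*icon*' some-dir prints every entry of some-dir, hidden ones included, whose name neither ends in .png nor contains "icon". The pattern is single-quoted so the calling shell does not expand it first.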

LIMITATIONS

Renames are NOT followed. As a workaround, list the rename history with 'git log --follow --name-status --format='%H' -- file | grep "^[RAD]"' and include every name the file has had in the pattern, as in "currentname|oldname|initialname". As a side effect, if a different file has taken the place of an old name, it will be preserved too, and there is no way around this using this tool.
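
The rename workaround can be partly automated. The sketch below assumes the tab-separated layout that git log --name-status prints (a status letter, then one or two paths); names_to_pattern is a hypothetical helper, not part of git-clone-subset. It reads that output on stdin and joins every distinct name with "|".

```shell
# names_to_pattern (hypothetical helper, not part of git-clone-subset)
# Reads 'git log --follow --name-status' output on stdin and prints all
# distinct file names joined with "|", ready to be used as a pattern.
names_to_pattern() {
    awk -F '\t' '
        /^[RAD]/ {
            # Fields after the status are paths; rename lines carry the old
            # path then the new one, so walk backwards to list newer names first.
            for (i = NF; i >= 2; i--)
                if (!seen[$i]++) names[++n] = $i
        }
        END {
            for (i = 1; i <= n; i++)
                printf "%s%s", (i > 1 ? "|" : ""), names[i]
            if (n) printf "\n"
        }'
}
```

A possible use is git log --follow --name-status --format='%H' -- file | names_to_pattern. Names containing spaces or special chars still fall under the restrictions described for pattern, so check the result before passing it on.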

There is no (easy) way to keep only some files in a dir: using 'dir/foo*' as pattern will not work. Instead, keep the whole dir and remove the unwanted files afterwards, using git filter-branch and a (quite complex) combination of cloning, remote add, rebases, etc.

Pattern matching is quite limited, and much of bash's escaping and quoting does not work properly when pattern is expanded inside !().

AUTHOR

Rodrigo Silva (MestreLion) linux [at] rodrigosilva.com