Moving Files from one Git Repository to Another, Preserving History

Posted May 17th, 2011 in Development by Greg Bayer

If you use multiple Git repositories, it’s only a matter of time until you want to move some files from one project to another. Today at Pulse we reached the point where it was time to split up a very large repository that was being used for too many different sub-projects.

After reading some suggested approaches, I spent more time than I would have liked fighting with Git to actually make it happen. In the hopes of helping someone else avoid the same trouble, here’s the solution that ended up working best. The solution is primarily based on ebneter’s excellent question on Stack Overflow.

Another solution is Linus Torvalds’s “The coolest merge, EVER!” Unfortunately, his approach seems to require more manual fiddling than I would like and results in a repository with two roots. I don’t completely understand the implications of this, so I opted for something more like a standard merge.

Goal:

  • Move directory 1 from Git repository A to Git repository B.

Constraints:

  • Git repository A contains other directories that we don’t want to move.
  • We’d like to preserve the Git commit history for the directory we are moving.


Get files ready for the move:

Make a copy of repository A so you can experiment without worrying too much about mistakes. It’s also a good idea to remove the link to the original repository so you can’t accidentally push any changes (line 3). Line 4 is the critical step here. It rewrites your history, keeping only the commits and files that touch directory 1. The result is the contents of directory 1 placed at the root of repository A. You probably want to import these files into repository B within a directory, so move them into one now (lines 5/6). Note that mv * skips hidden files such as .gitignore and prints a harmless error about moving the new directory into itself. Commit your changes and we’re ready to merge these files into the new repository.

git clone <git repository A url>
cd <git repository A directory>
git remote rm origin
git filter-branch --subdirectory-filter <directory 1> -- --all
mkdir <directory 1>
mv * <directory 1>
git add .
git commit
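
To see the steps above end to end, here is a minimal sketch that runs them against a throwaway repository. The names repo-a, dir1, and other are hypothetical stand-ins, and the script assumes a Git version that still ships filter-branch. It swaps mv * for git ls-tree piped into git mv, which is my substitution rather than part of the original recipe, so that hidden files are moved too.

```shell
#!/bin/sh
# Sketch only: "repo-a", "dir1", and "other" are hypothetical stand-ins.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1  # skip the deprecation pause in newer Git

work=$(mktemp -d)
cd "$work"

# Build a small stand-in for repository A with two directories.
git init -q repo-a
cd repo-a
git config user.email "demo@example.com"
git config user.name "Demo"
mkdir dir1 other
echo one > dir1/a.txt
echo hidden > dir1/.hidden
echo x > other/b.txt
git add .
git commit -q -m "add dir1 and other"
echo two >> dir1/a.txt
git commit -q -am "edit dir1/a.txt"

# Keep only the history that touches dir1; its contents move to the root.
git filter-branch --subdirectory-filter dir1 -- --all

# Move every tracked top-level entry (dotfiles included) back under dir1.
mkdir -p dir1
git ls-tree --name-only -z HEAD | xargs -0 -I{} git mv {} dir1/
git commit -q -m "move files under dir1"

# Show the rewritten history plus the final move commit.
git log --oneline
```

The one case this sketch does not handle is a directory 1 that itself contains an entry with the same name as the target directory.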

Merge files into new repository:

Clone repository B if you don’t have a local copy already. On line 3, you’ll add your filtered copy of repository A as a remote of repository B. Then simply pull from this remote’s branch (containing only the directory you want to move) into repository B. The pull copies both files and history. On Git 2.9 and later, the pull refuses to merge unrelated histories unless you add --allow-unrelated-histories. Note: You can use a merge instead of a pull, but pull worked better for me. Finally, you probably want to clean up a bit by removing the remote connection to repository A. The pull records the merge itself, so no further commit is needed and you’re all set.

git clone <git repository B url>
cd <git repository B directory>
git remote add repo-A-branch <git repository A directory>
git pull repo-A-branch master
git remote rm repo-A-branch

Update: Removed final commit thanks to Von’s comment.
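
The merge stage can be tried safely on two throwaway repositories first. The following is a minimal sketch, not the exact commands from this post: repo-a, repo-b, dir1, src, and the make_repo helper are hypothetical, and the --no-rebase, --no-edit, and --allow-unrelated-histories flags are additions that recent Git versions need for a non-interactive pull of unrelated history.

```shell
#!/bin/sh
# Sketch only: repository and directory names are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"

# Helper (demo only): create a repository with a single commit.
make_repo() {
    git init -q "$1"
    git -C "$1" config user.email "demo@example.com"
    git -C "$1" config user.name "Demo"
    mkdir -p "$1/$2"
    echo "$3" > "$1/$2/file.txt"
    git -C "$1" add .
    git -C "$1" commit -q -m "$3"
}

make_repo repo-a dir1 "history from A"   # stands in for the filtered copy of A
make_repo repo-b src "history from B"    # stands in for the destination

# Use whatever default branch name this Git installation created.
branch=$(git -C repo-a symbolic-ref --short HEAD)

cd repo-b
git remote add repo-A-branch ../repo-a
# Merge A's history into B; the flags keep the pull non-interactive.
git pull --no-rebase --no-edit --allow-unrelated-histories repo-A-branch "$branch"
git remote rm repo-A-branch

# Both histories are now joined by a merge commit.
git log --oneline --graph
```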

                                
  • http://zzamboni.org/ zzamboni

    I managed to do what you say by replacing line 4 with “git filter-branch --tree-filter 'rm -rf $(ls | egrep -v )' -- --all”, which causes the filter to remove unneeded things instead of just selecting that one directory.

  • Eve Weinberg

    Hello – i’m trying to follow your steps. I’m confused about step6. Am i supposed to use the quote marks? Do I use the words in= ?

    My target path is: Users/NeverOdd/Desktop/of_v0.8.4_osx_release/Rules-of-Art-School, and is repo B, just a new blank folder? or the URL. My repo B is here: https://github.com/evejweinberg/Rules-of-Art-School

  • Daniel Kahlenberg

    regarding step6 etc. those are placeholders only replace them by real path names.

  • Haley

    This was very useful. I made a couple of tweaks to use a branch before pushing back to the server, but otherwise it did just what I needed.

  • Gaurav Negi

    Thanks, this is useful. However, the Git commit IDs will change. Is there any way to keep the commit IDs the same?

  • balajinix

    This was useful. Thank you.

  • Olga Maciaszek-Sharma

    This is a great tutorial. Thanks.

  • Mike

    To quote Randy Moss, Straight Cash Homey!!! Real Nice. Thank you.

  • http://www.fractalizer.ru Vladislav Rastrusny

    SECURITY NOTE: this operation can be dangerous if you are trying to move isolated content from a private repository into a public one. filter-branch modifies history, but it seems that all original objects that were present in the source repository are left intact inside the .git folder in the resulting repository. At least `git gc --aggressive --force` shows the same number of objects in both repos. Be careful.
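
One way to check the concern raised in this comment, and to clean up afterwards, is sketched below. It follows the general cleanup pattern of deleting the refs/original/ backup refs, expiring the reflog, and pruning; the repository layout (repo-a, dir1, private/key.txt) is hypothetical. Treat it as an illustration on a throwaway repository, not a guarantee that a real repository is free of sensitive objects.

```shell
#!/bin/sh
# Sketch only: "repo-a", "dir1", and "private/key.txt" are hypothetical.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1  # skip the deprecation pause in newer Git

work=$(mktemp -d)
cd "$work"
git init -q repo-a
cd repo-a
git config user.email "demo@example.com"
git config user.name "Demo"
mkdir dir1 private
echo public > dir1/a.txt
echo secret > private/key.txt
git add .
git commit -q -m "initial"

# Remember the blob ID of the private file so we can look for it later.
secret_blob=$(git rev-parse HEAD:private/key.txt)

git filter-branch --subdirectory-filter dir1 -- --all

# The old blob is still stored: refs/original/ and the reflog keep it alive.
git cat-file -e "$secret_blob" && echo "before cleanup: secret blob still present"

# Drop the backup refs, expire the reflog, and prune unreachable objects.
git for-each-ref --format='%(refname)' refs/original/ | xargs -n 1 git update-ref -d
git reflog expire --expire=now --all
git gc --quiet --prune=now

if git cat-file -e "$secret_blob" 2>/dev/null; then
    still_present=yes
else
    still_present=no
fi
echo "after cleanup: secret blob present? $still_present"
```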

  • Vishnu Viswanath

    no need to push after the git commit in the first section of commands(get files ready to move) ?

  • Tiberiu Tanasa

    I’ve tried your solution, but I got stuck at step 4.

    C:aynmisc [development]> git filter-branch --subdirectory-filter -- --all
    usage: git filter-branch [--env-filter ] [--tree-filter ]
    [--index-filter ] [--parent-filter ]
    [--msg-filter ] [--commit-filter ]
    [--tag-name-filter ] [--subdirectory-filter ]
    [--original ] [-d ] [-f | --force]
    […]

    Any idea why I’m getting this?

  • Tiberiu Tanasa

    I forgot to mention that I was using a git shell provided together with GitHub Desktop for Windows (version 3.0.14.0). It seems that it is a problem with this particular git application. I also tried to do the same steps with git installed from https://git-scm.com/ and it works perfect.

  • Dani Church

    For anyone else that gets here and is confused: Bastian’s post seems to have gotten run through an XML filter, which has confused things. Imagine that every =”” disappears, and you get {target-path in repo-b} (replacing the angle brackets with curly braces), which is much easier to understand.

  • Michal Plichta

    Gr8 post! Can you tell me how to remap a committer from repository A to another committer in repository B? In repoA I have commits as: 1stname.2ndname@company.com and in repoB as: userid@company.com
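
One possible answer to the question above is filter-branch’s --env-filter option, which can rewrite author and committer fields while history is being rewritten anyway. The sketch below uses the two addresses from the comment as placeholders and a hypothetical throwaway repository named repo-a. Run it only on a copy, since it changes commit IDs.

```shell
#!/bin/sh
# Sketch only: "repo-a" is hypothetical; the addresses are placeholders.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1  # skip the deprecation pause in newer Git

work=$(mktemp -d)
cd "$work"
git init -q repo-a
cd repo-a
git config user.email "1stname.2ndname@company.com"
git config user.name "Demo"
echo one > a.txt
git add .
git commit -q -m "initial"

# Rewrite every commit, swapping the old address for the new one.
git filter-branch --env-filter '
    if test "$GIT_AUTHOR_EMAIL" = "1stname.2ndname@company.com"; then
        GIT_AUTHOR_EMAIL="userid@company.com"
    fi
    if test "$GIT_COMMITTER_EMAIL" = "1stname.2ndname@company.com"; then
        GIT_COMMITTER_EMAIL="userid@company.com"
    fi
    export GIT_AUTHOR_EMAIL GIT_COMMITTER_EMAIL
' -- --all

# Show author and committer for each rewritten commit.
git log --format='%an <%ae> / %cn <%ce>'
```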

  • Simon Greensmith

    Accepting this is an old post, for the benefit of anyone reading it now, Git have introduced a command “subtree” that will do this in a heartbeat, maintaining only the relevant commit history e.g.:
    From within old repo:
    $ git subtree split -P -b

    Then from within brand new directory:
    $ git init
    $ git pull
    $ git remote add origin
    … etc.
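
The arguments in the comment above were lost, so here is a hedged sketch of what a subtree-based split might look like. The names old-repo, new-repo, dir1, and dir1-only are hypothetical, and the script assumes the git subtree command is installed, which is not the case for every Git package.

```shell
#!/bin/sh
# Sketch only: all repository, directory, and branch names are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"

git init -q old-repo
cd old-repo
git config user.email "demo@example.com"
git config user.name "Demo"
mkdir dir1 other
echo one > dir1/a.txt
echo x > other/b.txt
git add .
git commit -q -m "add dir1 and other"
echo two >> dir1/a.txt
git commit -q -am "edit dir1/a.txt"

# Write the history of dir1 onto a new branch; existing branches are untouched.
git subtree split -P dir1 -b dir1-only

# Pull that branch into a brand new repository.
cd "$work"
git init -q new-repo
cd new-repo
git pull -q ../old-repo dir1-only

# Only the commits that touched dir1 appear here.
git log --oneline
```

As with --subdirectory-filter, the files arrive at the root of the new repository rather than under dir1.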

  • neelima m

    Hi, thanks this is working for directories. Do you know how to move individual files similarly? --subdirectory-filter needs to be replaced with some other option?

  • neelima m

    Hi Simon, This works fine. But my requirement is to add one more step, that is, I want the files in the new repo to be in the same directory structure as they were in the old repo. So after doing a git pull, if I move the files into the old/directory/structure, and do a commit, the previous history of the files is lost. Only my last commit is shown. Any idea of how I can achieve this and still retain all the history for the files that are copied? Thanks.

  • theowoo