Moving Files from One Git Repository to Another, Preserving History
Originally published in 2011. This remains one of the most-referenced guides for this common git workflow.
If you use multiple git repositories, it’s only a matter of time until you want to refactor some files from one project to another. Today at Pulse we reached the point where it was time to split up a very large repository that was starting to be used for too many different sub-projects.
After reading some suggested approaches, I spent more time than I would have liked fighting with Git to actually make it happen. In the hopes of helping someone else avoid the same trouble, here’s the solution that ended up working best.
Goal
Move directory 1 from Git repository A to Git repository B.
Constraints
- Git repository A contains other directories that we don’t want to move.
- We’d like to preserve the Git commit history for the directory we are moving.
Get files ready for the move
Make a copy of repository A so you can mess with it without worrying about mistakes too much. It’s also a good idea to delete the link to the original repository to avoid accidentally making any remote changes (line 3). Line 4 is the critical step here. It goes through your history and files, removing anything that is not in directory 1. The result is the contents of directory 1 spewed out into the base of repository A. You probably want to import these files into repository B within a directory, so move them into one now (lines 5/6). Commit your changes and we’re ready to merge these files into the new repository.
git clone <git repository A url>
cd <git repository A directory>
git remote rm origin
git filter-branch --subdirectory-filter <directory 1> -- --all
mkdir <directory 1>
mv * <directory 1>
git add .
git commit
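To see what these preparation steps do before running them on a real project, here is a minimal sketch against a throwaway repository. The names repo-a, dir1 and dir2 are made up for illustration, the identity variables are placeholders, and the loop simply stands in for the “mv *” step so that the script can run unattended. FILTER_BRANCH_SQUELCH_WARNING only silences the deprecation notice that newer Git versions print for filter-branch.

```shell
#!/bin/sh
# Sketch only: repo-a, dir1 and dir2 are hypothetical names.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export FILTER_BRANCH_SQUELCH_WARNING=1

work=$(mktemp -d)
cd "$work"

# Build a tiny stand-in for repository A: two directories, three commits.
git init -q repo-a
cd repo-a
mkdir dir1 dir2
echo one > dir1/a.txt && git add . && git commit -q -m "add dir1/a.txt"
echo two > dir2/b.txt && git add . && git commit -q -m "add dir2/b.txt"
echo three >> dir1/a.txt && git commit -q -am "edit dir1/a.txt"

# Keep only dir1 and its history; its contents land at the repository root.
git filter-branch --subdirectory-filter dir1 -- --all >/dev/null 2>&1

# Move the files back under dir1 so they import into a directory later.
mkdir -p dir1
for f in *; do
  [ "$f" = dir1 ] || git mv "$f" dir1/
done
git commit -q -m "Move files into dir1"

commit_count=$(git rev-list --count HEAD)
tracked_files=$(git ls-files)
echo "commits: $commit_count"
echo "files: $tracked_files"
```

The commit that only touched dir2 should be gone, leaving the two dir1 commits plus the final move commit.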
Merge files into new repository
Make a clone of repository B if you don’t have one already. On line 3, you’ll add the filtered copy of repository A as a remote of repository B. Then simply pull from that remote’s branch (containing only the directory you want to move) into repository B. The pull copies both files and history; on Git 2.9 and later it needs --allow-unrelated-histories, because the two repositories share no common commit. Note: You can use a fetch followed by a merge instead of a pull, but pull worked better for me. Finally, you probably want to clean up a bit by removing the remote connection to repository A. The pull records the merge itself, so no extra commit is needed.
git clone <git repository B url>
cd <git repository B directory>
git remote add repo-A-branch <git repository A directory>
git pull repo-A-branch master --allow-unrelated-histories
git remote rm repo-A-branch
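The merge step can be rehearsed the same way. The sketch below builds two scratch repositories (repo-a and repo-b are invented names, as are the file names), pulls one into the other, and counts the imported commits. Two details are assumptions on my part rather than part of the steps above: --no-rebase and --no-edit are added so that the pull runs unattended on recent Git versions, and the branch name is read from repository A instead of being hard-coded as master.

```shell
#!/bin/sh
# Sketch only: repo-a, repo-b and the file names are hypothetical.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# Stand-in for repository A after the preparation steps: only dir1 remains.
git init -q repo-a
( cd repo-a
  mkdir dir1
  echo one > dir1/a.txt && git add . && git commit -q -m "A: add dir1/a.txt"
  echo two >> dir1/a.txt && git commit -q -am "A: edit dir1/a.txt" )

# Stand-in for repository B, with unrelated history of its own.
git init -q repo-b
cd repo-b
echo readme > README && git add . && git commit -q -m "B: initial commit"

# Pull whichever branch repository A currently has checked out.
branch=$(git -C ../repo-a symbolic-ref --short HEAD)

git remote add repo-A-branch ../repo-a
git pull -q --no-rebase --no-edit --allow-unrelated-histories repo-A-branch "$branch"
git remote rm repo-A-branch

# The imported file should carry both of its commits from repository A.
imported_commits=$(git log --format=%s -- dir1/a.txt | wc -l | tr -d ' ')
echo "commits touching dir1/a.txt: $imported_commits"
```

If the count comes back lower than the number of commits that touched the directory in repository A, the history did not make it across and the preparation steps are worth rechecking.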
Update: Removed final commit thanks to Von’s comment. Update 2: Added “--allow-unrelated-histories” thanks to several comments.
Comments
Comments from the original blog post.
Kate Ebneter - May 17, 2011 at 1:27 pm
Very nice writeup, and I’m happy to see that my question/answer was helpful!
Von - May 17, 2011 at 5:12 pm
Just tried this on a couple quickly whipped up repos and it worked well. Only comment was there was nothing to commit after the pull, so I’m not sure you need the last two steps.
Greg Bayer - May 18, 2011 at 11:55 am
Great catch! The commit is not required with the pull approach. Will update accordingly.
Anonymous - July 1, 2011 at 11:45 am
Thank you, for explaining this!. Your post will be a big help. I have wanted to do this for some time, but kept putting it off!
Joseph Chiu - July 27, 2011 at 6:44 pm
hi, based on what I read on http://stackoverflow.com/questions/1365541/how-to-move-files-from-one-git-repo-to-another-not-a-clone-preserving-history, I wonder if your line 5 should be “mkdir -p <directory 1>” and line 6 should be “git mv * <directory 1>” ?
Greg Bayer - July 27, 2011 at 10:03 pm
Thanks for the feedback. In this case I believe the results should be similar either way.
“mkdir -p” is only required if the new directory is more than one level deep.
There shouldn’t be much difference between “mv *” and “git mv *” in this case.
Mguyre - September 20, 2011 at 5:02 pm
Need to add a git fetch between steps 4 and 5 for the new repository to retrieve the tags and history
Adam Monsen - November 10, 2011 at 2:40 pm
Great post, thank you. One suggestion: change the title as follows: s/file/one directory/.
Adam Monsen - November 10, 2011 at 2:41 pm
Great post, thank you. One suggestion: change the title as follows: s/Files/one directory/.
Adam Monsen - November 10, 2011 at 10:30 pm
Sorry, I meant: change “Files” to “one directory” or “a directory”.
Greg Bayer - November 11, 2011 at 12:15 am
Thanks for the suggestion. You can actually use this approach to move an arbitrary set of files by first moving them into a temporary directory. Because of this, the current title seems to be appropriate and more general.
Navitf - February 19, 2012 at 8:08 am
Hi, but when files are moved into a temporary directory, the command “git filter-branch --subdirectory-filter” extracts only the history that is relevant to the temporary directory, and thus the real history logs are not preserved. Any idea how to overcome this?
Greg Bayer - February 22, 2012 at 5:48 pm
That’s an interesting point. This wasn’t a problem in my case, so I haven’t looked into it. Maybe another reader can suggest a solution?
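One possible answer to the question above, offered as an editorial sketch rather than anything from the original thread: instead of moving files into a temporary directory, an --index-filter can delete the paths you do not want from every commit, which leaves the remaining files at their original locations with their history. The paths dir1, dir2 and top.txt are invented for the example, and a real repository would list its own unwanted paths in the git rm call.

```shell
#!/bin/sh
# Sketch only: drop unwanted paths from history, keep everything else in place.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export FILTER_BRANCH_SQUELCH_WARNING=1

work=$(mktemp -d)
cd "$work"
git init -q repo-a
cd repo-a
mkdir dir1 dir2
echo one > dir1/a.txt && git add . && git commit -q -m "add dir1/a.txt"
echo two > dir2/b.txt && git add . && git commit -q -m "add dir2/b.txt"
echo top > top.txt    && git add . && git commit -q -m "add top.txt"

# Remove dir2 from the index of every commit. --ignore-unmatch keeps the
# filter from failing on commits where dir2 does not exist, and
# --prune-empty drops commits that no longer change anything.
git filter-branch --prune-empty --index-filter \
  'git rm -r -q --cached --ignore-unmatch dir2' -- --all >/dev/null 2>&1

remaining_commits=$(git rev-list --count HEAD)
removed_paths=$(git ls-files dir2)
kept_history=$(git log --format=%s -- dir1/a.txt)
echo "commits left: $remaining_commits"
echo "history of dir1/a.txt: $kept_history"
```

Because the surviving files never change path, there is no large move commit afterwards and git log works on them without --follow.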
123456 - February 27, 2012 at 5:30 pm
I get an error message when I run the git filter-branch command:
$ git filter-branch --subdirectory-filter mt -- --all
C:\Program Files (x86)\Git/libexec/git-core/git-filter-branch: line 289: /libexec/git-core/git: Bad file number
Could not get the commits
In typical git fashion, the error message is incomprehensible to me. Any idea what’s going wrong?
Greg Bayer - February 27, 2012 at 5:52 pm
I think that means git can’t access one of your files. Based on a few posts I see on stackoverflow.com, this could be caused by a bad network connection or proxy configuration.
123456 - February 27, 2012 at 6:07 pm
Don’t understand how this could be the case. I cloned the repo to local disk.. and to my understanding, “git remote rm origin” severs the link between my local repo and the remote one.. so I don’t see where networks/proxies would enter into it.
Xavier MARTIN - June 22, 2012 at 3:18 am
Really really helpful post…
Thanks much for sharing 🙂
SteveALee - June 22, 2012 at 3:50 pm
if you don’t use git mv * the deleted files will not be staged for a commit. Also use -k to skip error of moving the dir itself
Robbie Van Gorkom - June 28, 2012 at 12:57 pm
We had the same issue at the office, so I wrote a script to do exactly this. https://github.com/vangorra/git_split
Rasheed Barnes - July 16, 2012 at 1:25 pm
Good stuff. worked like a charm.
Bernd - July 24, 2012 at 12:26 pm
Thaaaanks a lot. This post saved me a shit ton of hours. 🙂
devguy - August 14, 2012 at 3:52 pm
this is cool, but it seems that you can only see the old history if you do git log --follow [file], which is kinda inconvenient, especially in a large project. am i missing something? is there a way to modify this process so that the --follow is not required?
Greg Bayer - August 15, 2012 at 1:36 am
You should be able to see the history in all the normal ways. Personally I tend to use gitk or github to view the history for old files.
How do you want to be able to view it?
Bastian Krol - August 15, 2012 at 1:48 pm
Thanks a bunch for this guide.
Here is my version:
To move some directories from repository A to repository B without losing history:
1. git clone tmp-repo
2. cd tmp-repo
3. git checkout
4. git remote rm origin # not really needed
5. git filter-branch --subdirectory-filter -- --all
6. mkdir -p
7. git mv -k *
8. git commit
9. cd # clone it, if you didn’t do already
10. Create a new branch and check it out
11. git remote add origin-tmp-repo
12. git pull origin-tmp-repo
13. rm -rf
Repeat all steps with every directory that needs to be moved. You’ll need a new tmp-repo for every directory, because “git filter-branch --subdirectory …” can only take one directory as an argument and the repo is largely unusable after executing the command. That’s why there is a rm -rf in step 13. When transferring subsequent directories, steps 10 and 11 can be omitted.
When you are done with all directories, you should do git remote rm origin-tmp-repo and git push in local repository B.
webdevguy - November 4, 2012 at 7:01 pm
This was very helpful. I had a much simpler requirement. I needed to pull one directory and all its history out of one repo and create a new repo for just that directory. Here are the steps that worked for me:
git clone
cd
git remote rm origin
git filter-branch --subdirectory-filter <directory 1> -- --all
Nathan Whitehead - December 21, 2012 at 11:21 pm
Thanks for putting this up, worked like a charm. I just set up git as a deployment mechanism, this helped me split out the bits I needed from my main repository.
Samuel Le Berrigaud - February 12, 2013 at 2:07 am
using “git mv *” worked better for me. Somehow the history is ‘better’ kept that way… Not sure why and how though.
Arthur Taborda - April 5, 2013 at 7:13 am
Easier: to merge files to new repository, simply do:
git push :master
zupeanut - July 10, 2013 at 1:45 pm
“It goes through your history and files, removing anything that is not in directory 1”
The problem with this is that files which are currently in directory 1 but previously lived elsewhere will lose their history from before the move.
Cat Lookabaugh - August 8, 2013 at 7:29 am
I was reading this post, and it doesn’t seem to quite be what I’m after. Caveat: I’m new to git, so I just may not understand.
Here’s what I want:
I’m using git to store documentation. We have different docbooks for each of our products and some content that fits more than one book. So, we have a common repository with that content (in a directory) and other repos for each product. The current process to get at the shared content involves some scripts and the git subtree command, with the shared content ending up in the product repo in a directory called shared-files. It’s a bit convoluted. Currently the shared content is only used for two products.
I need to add another set of shared content that will be used by all product docbooks (stored in its own directory in common). I want to be able to pull one or both shared content directories into a product repo. I don’t need any history. I just want the current up-to-date common files in my product repo in their own directory. If I want to update the shared content, I will do so directly in the common repo.
One way to accomplish this would be to clone the common repo and the product repo to my pc and then literally copy the directory I want from common repo into the product repo.
Is there a relatively simple way to do this without cloning the common repo? I’d appreciate your advice!
Greg Bayer - August 8, 2013 at 9:14 am
Have you considered adding the common repo as a submodule of each of the product repos?
Cat Lookabaugh - August 8, 2013 at 9:19 am
My research thus far seems to indicate that subtree is preferred over submodule, though i’m not sure why. would submodule option allow me to pick and choose which directory or directories i want from the common repo?
Greg Bayer - August 8, 2013 at 9:42 am
It wouldn’t let you choose certain directories, but it might be simpler overall if that’s not a hard requirement. You could also combine the submodule approach with a simple script that deletes directories you don’t need from the local clone of the common submodule in each product repo.
You could also consider creating separate common repos for each directory and only pulling in those you need for each product.
Cat Lookabaugh - August 8, 2013 at 10:06 am
food for thought…thanks 🙂
gitnewbie - September 5, 2013 at 11:32 am
Used the exact commands listed above and I see only the first 2 entries with git log although git log on the source shows me many more. Looks like partial history is moved.
Eliyahu - January 7, 2014 at 12:54 pm
Thanks!
Brandon Mintern - February 5, 2014 at 5:24 pm
Thanks! This was very helpful.
Note that if your repository B branch has rebase = true set, then you will almost certainly want to use git pull --no-rebase repo-A-branch master.
Dave - May 9, 2014 at 12:22 pm
This helped … a lot!!! Thank you Greg!!!
Motivated by your excellent posting (I can’t stress that enough), I dug around some more, since I also had to go the other direction … that is, once I did this, and had my desired subdirectory now as the entire project, I then wanted to submerge it into a subdirectory, still keeping the history, of course (i.e. no --follow required).
Which had:
git filter-branch --prune-empty --tree-filter '
if [[ ! -e foo/bar ]]; then
    mkdir -p foo/bar
    git ls-tree --name-only $GIT_COMMIT | xargs -I files mv files foo/bar
fi'
This worked for me, pretty much literally.
If this helps anyone, I dedicate the good will to Greg, who exemplifies what these postings should be all about imho.
Thanks again Greg.
romu - June 18, 2014 at 7:43 am
Hi, bit late to come here now, but I’ve discovered this article because of a similar need.
I ran these instructions on the code I needed to move, and everything was fine…except a little problem, the history is not preserved at all. The files are well moved but with no history.
Any idea? Thanks.
Karthik T - July 10, 2014 at 11:37 pm
I am using this to extract some stuff out as a gem and it works like a … gem! I didn’t realize the first set of commands were destructive.. but no worries!
Carlos Vinicius - August 28, 2014 at 12:31 pm
perfect <3
efalk - September 4, 2014 at 7:45 pm
This is what I’ve always done, but that “mv * ” or “git mv * ” step results in one massive commit which looks like a zillion file deletions followed by a zillion file adds. Is there no way to get filter-branch to leave the directory structure alone?
NW - September 18, 2014 at 9:22 am
This solution appears to only preserve the master branch history for the folder/files moved, not the history of any other branches that involve them. Any thoughts on how to preserve that history as well?
Guest123456 - October 27, 2014 at 11:17 am
I hit the same problem, also on a local repo. The issue seemed to be that the $rev_args argument from the relevant git command is just too long (see https://github.com/github/windows-msysgit/blob/master/libexec/git-core/git-filter-branch) – the resulting shell command when that list is expanded is too long which results in the ‘Bad file number’ error.
I tried the following horrible hack, which worked for me (but YMMV, so be careful!).
- Find and edit git-filter-branch (this was at /libexec/git-core/git-filter-branch for me, using git-bash on Windows)
- Comment out lines 287-9 (git rev-list …. || die “Could not get commits”)
- Replace with this:
rm ../revs
for rev in $rev_args
do
git rev-list --reverse --topo-order --default HEAD --parents --simplify-merges $rev "$@" >> ../revs
done
(so basically just run the same command over and over with each subsequent commit, appending to the same output file).
From that point the rest of the script worked ok (and this step ran quickly).
Andhra - December 23, 2014 at 9:29 pm
thank you
Alexandr Artemov - January 14, 2015 at 7:46 am
Thank you! I had a problem with this script – it always deleted my directory with cloned repo and then failed. I fixed it, don’t remember how. But there’s another problem as well:
if I specify not empty target repo, it fails with the following error:
To git@:tools/utility.git
! [rejected] master -> master (fetch first)
error: failed to push some refs to 'git@:tools/utility.git'
hint: Updates were rejected because the remote contains work that you do
hint: not have locally. This is usually caused by another repository pushing
hint: to the same ref. You may want to first integrate the remote changes
hint: (e.g., 'git pull ...') before pushing again.
hint: See the 'Note about fast-forwards' in 'git push --help' for details.
Gonzalo Casas - February 21, 2015 at 7:32 am
Thanks!
Robert Goldman - March 2, 2015 at 2:25 pm
Thank you very much for this post. It just saved my life.
ChanderG - March 28, 2015 at 9:20 pm
Thanks a lot.
Hyip - April 9, 2015 at 8:17 am
Thanks for the script. I found it works to automate the process into an independent repository.
zzamboni - April 16, 2015 at 8:12 pm
I managed to do what you say by replacing line 4 with “git filter-branch --tree-filter 'rm -rf $(ls | egrep -v )' -- --all”, which causes the filter to remove unneeded things instead of just selecting that one directory.
Eve Weinberg - May 16, 2015 at 11:52 am
Hello – i’m trying to follow your steps. I’m confused about step6. Am i supposed to use the quote marks? Do I use the words in= ?
My target path is: Users/NeverOdd/Desktop/of_v0.8.4_osx_release/Rules-of-Art-School, and is repo B, just a new blank folder? or the URL. My repo B is here: https://github.com/evejweinberg/Rules-of-Art-School
Daniel Kahlenberg - June 10, 2015 at 12:29 am
regarding step6 etc.: those are placeholders only; replace them with real path names.
Haley - June 22, 2015 at 12:49 pm
This was very useful. I made a couple of tweaks to use a branch before pushing back to the server, but otherwise it did just what I needed.
Gaurav Negi - August 24, 2015 at 5:09 pm
Thanks, this is useful. However, the Git commit IDs will change. Is there any way to keep the commit IDs the same?
balajinix - September 14, 2015 at 10:36 pm
This was useful. Thank you.
Olga Maciaszek-Sharma - September 17, 2015 at 12:49 am
This is a great tutorial. Thanks.
Mike - September 23, 2015 at 6:51 am
To quote Randy Moss, Straight Cash Homey!!! Real Nice. Thank you.
Vladislav Rastrusny - October 5, 2015 at 7:50 am
SECURITY NOTE: this operation can be dangerous if you are trying to move isolated content from a private repository into a public one. filter-branch modifies history, but it seems that all original objects that were present in the source repository are left intact inside the .git folder in the resulting repository. At least git gc --aggressive --force shows the same number of objects in both repos. Be careful.
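An editorial sketch of the point raised in the comment above, not part of the original thread: after filter-branch, the old commits stay reachable through the refs/original backup refs and the reflog, so a plain gc keeps them. The script below uses an invented layout (dir1 plus a private directory holding key.txt) to show the old blob surviving the rewrite, then removes the backup refs, expires the reflog and prunes. It is a demonstration under those assumptions, not a guarantee that every trace is gone from a real repository; cloning the result into a fresh repository before publishing is a common extra precaution.

```shell
#!/bin/sh
# Sketch only: dir1, private/key.txt and repo-a are hypothetical names.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export FILTER_BRANCH_SQUELCH_WARNING=1

work=$(mktemp -d)
cd "$work"
git init -q repo-a
cd repo-a
mkdir dir1 private
echo public > dir1/a.txt
echo secret > private/key.txt
git add . && git commit -q -m "add both directories"

# Remember the object ID of the file that must not leak.
secret_blob=$(git rev-parse HEAD:private/key.txt)

git filter-branch --subdirectory-filter dir1 -- --all >/dev/null 2>&1

# The blob is still stored: refs/original and the reflog keep it reachable.
if git cat-file -e "$secret_blob" 2>/dev/null; then present_before=yes; else present_before=no; fi

# Drop the backup refs, expire the reflog, then prune unreachable objects.
git for-each-ref --format='%(refname)' refs/original/ | xargs -n 1 git update-ref -d
git reflog expire --expire=now --all
git gc -q --prune=now

if git cat-file -e "$secret_blob" 2>/dev/null; then present_after=yes; else present_after=no; fi
echo "secret blob present before cleanup: $present_before"
echo "secret blob present after cleanup:  $present_after"
```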
Vishnu Viswanath - February 7, 2016 at 10:07 am
no need to push after the git commit in the first section of commands(get files ready to move) ?
Tiberiu Tanasa - March 2, 2016 at 7:11 am
I’ve tried your solution, but I got stuck at step 4.
C:aynmisc [development]> git filter-branch --subdirectory-filter -- --all
usage: git filter-branch [--env-filter ] [--tree-filter ]
[--index-filter ] [--parent-filter ] [--msg-filter ] [--commit-filter ] [--tag-name-filter ] [--subdirectory-filter ] [--original ] [-d ] [-f | --force] […]
Any idea why I’m getting this?
Tiberiu Tanasa - March 4, 2016 at 4:01 am
I forgot to mention that I was using a git shell provided together with GitHub Desktop for Windows (version 3.0.14.0). It seems that it is a problem with this particular git application. I also tried to do the same steps with git installed from https://git-scm.com/ and it works perfect.
Dani Church - April 19, 2016 at 10:04 am
For anyone else that gets here and is confused: Bastian’s post seems to have gotten run through an XML filter, which has confused things. Imagine that every =”” disappears, and you get {target-path in repo-b} (replacing the angle brackets with curly braces), which is much easier to understand.
Michal Plichta - May 5, 2016 at 8:22 am
Gr8 post! Can you tell me how to remap committers from repository A to other committers in repository B? In repoA I have commits as: 1stname.2ndname@company.com and in repoB as: userid@company.com
Simon Greensmith - May 20, 2016 at 3:55 am
Accepting this is an old post, for the benefit of anyone reading it now, Git have introduced a command “subtree” that will do this in a heartbeat, maintaining only the relevant commit history e.g.:
From within old repo:
$ git subtree split -P -b
Then from within brand new directory:
$ git init
$ git pull
$ git remote add origin
… etc.
neelima m - June 21, 2016 at 1:22 am
Hi, thanks, this is working for directories. Do you know how to move individual files similarly? Does --subdirectory-filter need to be replaced with some other option?
neelima m - June 22, 2016 at 8:45 pm
Hi Simon, This works fine. But my requirement is to add one more step, that is, I want the files in the new repo to be in the same directory structure as they were in the old repo. So after doing a git pull, if I move the files into the old/directory/structure, and do a commit, the previous history of the files is lost. Only my last commit is shown. Any idea of how I can achieve this and still retain all the history for the files that are copied? Thanks.
theowoo - September 3, 2016 at 7:56 am
Here is one way to do that:
https://gist.github.com/theowoo/2823a7e7b785b6fde647d9d2f6e5e68d
Ben Warner - October 7, 2016 at 1:54 am
If using git 2.9 (and later I assume), you will need to use the --allow-unrelated-histories flag on the git pull.
e.g.
git pull repo-A-branch master --allow-unrelated-histories
smartester.com - October 12, 2016 at 5:26 am
I got this done much easier way ..
First I created a new repository.
I went to git client(source tree) and changed the url of my existing remote repository to new repository and did a force push
tej - October 12, 2016 at 6:01 am
First I created a new repository.
I went to git client(source tree) and changed the url of my existing remote repository to new repository and did a force push
niquis7 - October 19, 2016 at 7:47 am
Thanks for this post, it helped me a lot. Git surely has a plethora of tools!
Rajpaul Bagga - November 9, 2016 at 6:36 am
(So that someone who gets the error can find this comment by searching for the error, as I tried to do):
If you don’t add the flag you get this error:
fatal: refusing to merge unrelated histories
Saurabh Jain - December 5, 2016 at 11:55 pm
I followed these steps but am not able to see the log history. When I run git log . I get only one commit.
Piter Vergara - February 17, 2017 at 8:45 am
Thanks!!
I had to do some extra steps to keep my tags. Since the commits are rewritten when we do ‘filter-branch’, the tags will not point to the new commits. So, to also rewrite the tags I have exchanged:
git filter-branch --subdirectory-filter -- --all
to
git filter-branch --subdirectory-filter --tag-name-filter cat -- --all
Besides that, I have used a merge instead of a pull, because pull didn’t add my tags. So, instead of:
git remote add repo-A-branch
git pull repo-A-branch master
I did
git remote add repo-A-branch
git fetch repo-A-branch
git checkout -b master
git merge repo-ambiente/master
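An editorial sketch of the tag behaviour described in the comment above, using an invented repository with one annotated tag (v1.0) and made-up paths. With --tag-name-filter cat, filter-branch recreates the tag under the same name on the rewritten commit, so afterwards the tag should sit inside the new history and its tree should hold only the contents of dir1.

```shell
#!/bin/sh
# Sketch only: repo-a, dir1, dir2 and the tag name v1.0 are hypothetical.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export FILTER_BRANCH_SQUELCH_WARNING=1

work=$(mktemp -d)
cd "$work"
git init -q repo-a
cd repo-a
mkdir dir1 dir2
echo one > dir1/a.txt
echo two > dir2/b.txt
git add . && git commit -q -m "first release"
git tag -a v1.0 -m "release 1.0"
echo more >> dir1/a.txt && git commit -q -am "edit dir1/a.txt"

# Rewrite history for dir1 and carry tags over under their existing names.
git filter-branch --subdirectory-filter dir1 --tag-name-filter cat -- --all >/dev/null 2>&1

# The tag should now point into the rewritten history.
if git merge-base --is-ancestor v1.0 HEAD; then tag_in_new_history=yes; else tag_in_new_history=no; fi
tag_tree=$(git ls-tree --name-only v1.0)
echo "tag reachable from HEAD: $tag_in_new_history"
echo "files at tag: $tag_tree"
```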
Fakabbir Amin - March 4, 2017 at 4:01 am
Won’t mv * move everything to the directory instead of moving a particular folder ? In my case everything is moved to that directory..
Fakabbir Amin - March 4, 2017 at 9:06 pm
Alh, Got it, git filter-branch --subdirectory-filter would take a lot of time and interrupting it partway through is of no use (in my case it took 2.5 hours). To get the console output, set “GIT_TRACE=1” and then run the commands.
aousterh - March 26, 2017 at 6:27 pm
When I tried this approach, I found that step 4 in the merge portion dumped the contents of directory 1 into the root of repository B, instead of into /. Is that the expected behavior? How can I have the files end up in / instead? Thanks!
Mohamed Ezz - April 6, 2017 at 3:00 pm
Thank you for this comment!
Artem - April 10, 2017 at 9:34 am
Thank you for this post. I translated it into Russian, changed a bit and posted into my own blog. Of course I mentioned that I based on your post and posted a link here 🙂
dwec - May 23, 2017 at 8:53 am
Can you elaborate on this for a git novice? I followed the steps and my “repository B” is 268KB whereas the original “repo A” was 2.4MB. If I copy the desired subdirectory from repo A somewhere using unix copy the result is 264KB worth of files. What extra data is being copied using the filter-branch technique described above?
Greg Bayer - May 23, 2017 at 9:50 am
Awesome!
Johnney Darkness - June 6, 2017 at 1:27 pm
I had to use --allow-unrelated-histories to convince git that step 4 part 2 was okay. Also you probably need to clarify to cd back up between the steps … or maybe I got this wrong.
Bart - June 8, 2017 at 11:31 am
You may need the “--allow-unrelated-histories” switch in the pull command… I used “fetch” and “merge” instead of “pull”, and the “merge” needed that switch.
Steve Terpe - October 1, 2017 at 1:53 am
Hey Greg, you may want to update this to note that
git pull repo-A-branch master
now requires the “--allow-unrelated-histories” flag.
Still one of my all-time most useful bookmarks.
Greg Bayer - October 3, 2017 at 4:52 pm
Thanks! Will make the edit.
Steve Swinsburg - January 17, 2018 at 7:58 pm
Thanks for this. Worked well. Perhaps add the last command ‘git push origin master’ so that the new directory gets pushed up.
Sébastien Vermeille - February 28, 2018 at 6:59 am
Big thank you ! You saved my day!
Jaimon - April 25, 2018 at 7:02 am
I’m trying to move a folder from one repo to another without losing the history. In step1, history is intact until step 6. I lose history as soon as I move files from root of the repo to the newly created folder. I’m wondering how others have achieved this. It would be great if one of you (who got it working in recent times) could post the exact commands used. Thanks in advance.
Jaimon - April 29, 2018 at 11:26 am
Finally I got it working by replacing lines #5-8 with one more filter-branch operation.
git filter-branch -f --index-filter \
'git ls-files -s | /usr/local/bin/sed -e "s/\(\t\)\(.*\)$/\1\/\2/" | GIT_INDEX_FILE=$GIT_INDEX_FILE.new \
git update-index --index-info && mv "$GIT_INDEX_FILE.new" "$GIT_INDEX_FILE"' HEAD
Remember to use gnu-sed. I wasted almost 2 days trying to figure out why a simple sed regex doesn’t work. Replace in above sed regex with the folder name that you want to have. -f switch is needed after filter-branch as the previous command must have created a backup reference.
Stanley - May 23, 2018 at 3:10 pm
Thank you for this! Saved heaps of wrestling. I couldn’t get line 4 working but eventually replaced it with this:
[git filter-branch --tree-filter "find . -not -path './directory' -delete"]
which also keeps the directory itself intact
Chris S - May 25, 2018 at 8:15 am
For what it’s worth, I’ve been using this tutorial when needed for the last couple of years. It’s withstood the test of time for all I’m concerned. Nice work!
Mona - June 12, 2018 at 4:55 am
Hi, I did this flow (took a long time) but history is not in the target repository :-/
Rafael - June 13, 2018 at 3:12 pm
Hint about this command:
git pull repo-A-branch master --allow-unrelated-histories
In my first tries, I got:
fatal: Couldn’t find remote ref heads/master
fatal: The remote end hung up unexpectedly
It took me a while to figure it out that master is from “source” , as I wanted to move from develop branch, the correct command is:
git pull repo-A-branch develop --allow-unrelated-histories