A Computer Scientist's Quest: Versioning my work

Sunday, March 14, 2010

Versioning my work

I've found it helpful to keep around clean and working versions of projects that I edit, especially when I am changing someone else's work. This is suggested behavior for generating patches for open source projects, but I find it useful for both programming and document editing.

Of course, version control (CVS or Subversion) helps, but I don't always want to commit my changes back to the repository. Someday I should also investigate a distributed version control system that lets other users "check out" my branches. However, most of the projects I work on use CVS and SVN. I also like to extract the changes that I've made so that I can keep track of my versions by hand. This is useful when I reach milestones in projects and want to make off-line backups.

My preferred method is to use diff and patch. When I'm planning to create my own patches, I first copy the original project's directory tree. For example, if the project I'm working on is in a directory called "foo", then from the parent directory of foo I will:

>$ cp -lpr foo foo-new

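This creates a copy of foo's directory tree in which each file is a hard link to the original rather than a separate copy, which keeps the storage overhead low. Because a hard-linked file shares its data with the original, editing it in place would also change the file in foo, so my editor is configured to break hard links when it saves files. In vim this is done by adding the following to my .vimrc file: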

>$ echo "set backupcopy=auto,breakhardlink" >> ~/.vimrc
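
To see why breaking hard links on save matters, here is a small self-contained sketch. It assumes GNU cp (for the -l option) and works in a throwaway directory created by mktemp; the file name a.txt is only a placeholder. Writing through the existing name changes both trees, while writing a new file and renaming it over the old name (roughly the effect of an editor that breaks the link) leaves foo alone.

```shell
set -eu
work=$(mktemp -d)          # scratch area, assumed disposable
cd "$work"

mkdir foo
echo "original" > foo/a.txt
cp -lpr foo foo-new        # hard-linked copy, as in the post

# Both names show the same inode number: one file on disk, two names.
ls -i foo/a.txt foo-new/a.txt

# Writing in place goes through the shared inode, so foo changes too.
echo "edited in place" > foo-new/a.txt
cat foo/a.txt              # prints "edited in place"

# Put the original content back (this also writes through the shared inode).
echo "original" > foo/a.txt

# Write a new file and rename it over the old name: the link is broken,
# and only foo-new sees the change.
echo "edited safely" > foo-new/a.txt.tmp
mv foo-new/a.txt.tmp foo-new/a.txt
cat foo/a.txt              # prints "original"
cat foo-new/a.txt          # prints "edited safely"
```

Any tool that rewrites files in place (for example a shell redirection or sed without a temporary file) needs the same care inside foo-new.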

Now I will leave the foo directory untouched, and do my development work in foo-new. When I reach a point where I want to make a version patch, I go to the parent of foo-new and run:

>$ diff -uprN foo foo-new > foo-new-1.00.patch
>$ ls -la foo-new-1.00.patch

I check the ls output to make sure my patch is a reasonable size. I will also usually open it and make sure it looks right. Then, I can send my patch to someone, or apply it to a copy of foo. I will often test my patch first, for example:

>$ cp -lpr foo foo-test

>$ cd foo-test
>$ patch -p1 < ../foo-new-1.00.patch
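
Before applying or sending a patch, it can also help to see at a glance which files differ between the two trees. The following is a minimal sketch, assuming a diff that supports -q, -r and -N (GNU diff does) and using made-up file names in a temporary directory. The -q option reports only which files differ, and that list should match the files the patch is expected to touch.

```shell
set -eu
work=$(mktemp -d)          # scratch area, assumed disposable
cd "$work"

# Hypothetical project with one changed file, one unchanged, one new.
mkdir -p foo/src
echo "one"  > foo/src/main.txt
echo "same" > foo/src/util.txt
cp -lpr foo foo-new
echo "two" > foo-new/src/main.txt.tmp
mv foo-new/src/main.txt.tmp foo-new/src/main.txt   # rename breaks the hard link
echo "brand new" > foo-new/src/extra.txt

# diff exits with status 1 when the trees differ; only 2 or more is an error,
# so status 1 must not abort the script under "set -e".
diff -qrN foo foo-new > changed.txt || [ $? -eq 1 ]

# One line per differing file; util.txt should not appear.
cat changed.txt
```

If the list contains files that were never edited on purpose, something probably wrote through a hard link or left editor backup files behind.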

This works great for code, but isn't always the best for less well-structured documents. It also can create fairly large .patch text files that can be cumbersome to apply.

If I am creating a lot of new files, or replacing entire files, I will also just generate tarballs of my entire project. But these can get rather large, and when I really just want the diff, I recently found a way that I can do better.

With tar, I can also archive just the files that have changed. Then when I untar the archive, only those files are overwritten. The first thing I do is create a "staging area" for the new files in the same parent directory as the foo project:

>$ mkdir foo-stage

Next I copy the directory structure (only the directories!) into the staging area, so that I can put new files in the appropriate directories:

>$ cd foo-stage
>$ (cd ../foo && find . -type d ! -name . -print0) | xargs -0 mkdir -p

Then I put the files that I added or changed into their appropriate locations in the staging area, and create a compressed tar archive (from within the foo-stage directory):

>$ tar -zcf ../foo-1.00.tgz .
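
The copying step above is done by hand, but it could plausibly be automated. The sketch below is one possible approach, not the procedure described in this post: it assumes the edited tree lives in foo-new (as in the diff/patch workflow), uses cmp to find files that are new or different, and creates only the directories that are actually needed. The file names are placeholders.

```shell
set -eu
work=$(mktemp -d)          # scratch area, assumed disposable
cd "$work"

# Hypothetical project: one changed file, one unchanged, one new.
mkdir -p foo/src foo/doc
echo "old"  > foo/src/main.txt
echo "same" > foo/doc/readme.txt
cp -lpr foo foo-new
echo "new" > foo-new/src/main.txt.tmp
mv foo-new/src/main.txt.tmp foo-new/src/main.txt   # rename breaks the hard link
echo "added" > foo-new/src/extra.txt

mkdir foo-stage
(cd foo-new && find . -type f) | while IFS= read -r f; do
    # cmp -s prints nothing and fails if the files differ
    # or if the file does not exist in foo at all.
    if ! cmp -s "foo/$f" "foo-new/$f" 2>/dev/null; then
        mkdir -p "foo-stage/$(dirname "$f")"
        cp -p "foo-new/$f" "foo-stage/$f"
    fi
done

# Only the changed and the new file end up staged.
(cd foo-stage && find . -type f | sort)
```

This loop reads one file name per line, so it would not cope with names that contain newlines.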

Finally, to apply the archive to a copy of foo:

>$ cp -lpr foo foo-test
>$ cd foo-test
>$ tar -zxmf ../foo-1.00.tgz

The -m option tells tar not to restore the modification times stored in the archive, so directories that receive no new files keep the timestamps they already had. Extracting replaces files in foo-test only with the files that I added to foo-stage. This method can also be used in conjunction with diff/patch, in which case I would not use the -N option to diff.
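
The whole staging round trip can be checked in a scratch directory. This is a minimal sketch that assumes GNU tar and GNU cp, with placeholder file names. It lists the archive before extracting, then confirms that only the staged file was replaced in foo-test and that the pristine foo tree is unchanged (GNU tar removes an existing file before extracting over it, which breaks the hard link).

```shell
set -eu
work=$(mktemp -d)          # scratch area, assumed disposable
cd "$work"

# Hypothetical project plus a staging area holding one replacement file.
mkdir -p foo/src foo/doc foo-stage/src
echo "old"       > foo/src/main.txt
echo "untouched" > foo/doc/readme.txt
echo "new"       > foo-stage/src/main.txt

(cd foo-stage && tar -zcf ../foo-1.00.tgz .)

# Inspect what the archive will overwrite before applying it.
tar -ztf foo-1.00.tgz

cp -lpr foo foo-test
(cd foo-test && tar -zxmf ../foo-1.00.tgz)

cat foo-test/src/main.txt      # prints "new"
cat foo-test/doc/readme.txt    # prints "untouched"
cat foo/src/main.txt           # prints "old"
```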

1 comment:

Gedare, July 27, 2010 at 4:41 PM
After creating a foo-stage directory and copying the appropriate files into it, the empty directories left over can be a little confusing. The following one-liner will delete the empty directories, leaving only the files that I added:
> find foo-stage -depth -type d -empty -delete
Enter that from the parent directory of foo-stage.