Friday, October 12, 2012

Version control for text/LaTeX?

I often use version control (VC) for text documents, especially LaTeX. Modern VC software works well for code, and even for text when working alone, but collaboratively editing text with VC is a nightmare. The biggest challenge for text VC seems to be structure and formatting. Code is well structured within and between lines; text is less so. Words and sentences can be rearranged within sentences and paragraphs, and line breaks carry no meaning. Because most VC tools compare files line by line, re-wrapping a paragraph after a small edit changes many of the lines that follow, which frustrates reviewing and merging.
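To make the re-wrapping problem concrete, here is a minimal Python sketch using only the standard library. The sample text, the 40-column width, and the helper names (`hard_wrapped`, `sentence_per_line`, `changed_lines`) are assumptions made up for illustration. The sentence splitter is deliberately naive.

```python
import difflib
import re
import textwrap

# Illustrative sample text; not taken from any real document.
OLD = (
    "Version control works well for code. "
    "Text is harder because lines carry no meaning. "
    "A small edit can shift every following line break."
)
# The "small edit": a few extra words in the first sentence only.
NEW = OLD.replace("works well", "works very well indeed")


def hard_wrapped(text, width=40):
    """Lay the paragraph out the way an editor with hard wrapping would."""
    return textwrap.wrap(text, width=width)


def sentence_per_line(text):
    """Naive split after '.', '!' or '?'; abbreviations would need more care."""
    return re.split(r"(?<=[.!?])\s+", text.strip())


def changed_lines(old_lines, new_lines):
    """Count the lines a line-based diff reports as removed or added."""
    diff = difflib.ndiff(old_lines, new_lines)
    return sum(1 for line in diff if line.startswith(("- ", "+ ")))


if __name__ == "__main__":
    print("hard-wrapped:", changed_lines(hard_wrapped(OLD), hard_wrapped(NEW)))
    print("sentence per line:",
          changed_lines(sentence_per_line(OLD), sentence_per_line(NEW)))
```

On this sample, the sentence-per-line layout reports only the edited sentence (one line removed, one added), while the hard-wrapped layout reports more lines because the line breaks after the edit shift.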

Word processors track revisions, but such tools are restrictive and won't work for markup languages. Cloud collaboration tools such as Docs exist, including some for LaTeX, but requiring connectivity while editing is a non-starter for me. I would even be happy with a custom LaTeX solution; the features I want for text VC are similar to those for code:
  • Non-proprietary, application-agnostic, platform-independent FOSS.
  • Revision history: see what has changed.
  • Revert: undo changes as far back as needed.
  • Lock-free: work in parallel, which requires...
  • Pain-free merge: help resolving conflicting commits.
  • Distributed and offline editing: no active (server) connections.
  • External contributions: integrate changes made outside of VC.
Most VC systems track changes as either changesets or snapshots. Changesets seem worthless because of the structural problems of text. Snapshots seem viable, but I haven't seen turn-key solutions for plain text or LaTeX, and handling external contributions seems especially troublesome.

4 comments:

  1. I just keep my LaTeX files in Dropbox, and it works pretty well. I know you said no connectivity, so I don't know if that's helpful (but maybe the versioning happens automatically when you're offline as well? No clue).

  2. I simply use latexdiff with git. Works great.
    Aanjhan

  3. One simple trick makes versioning LaTeX files in common VCSs (say, Subversion or Git) much more effective and pleasant: write each sentence on its own line. It's trivial, but it works great, and it's not hard to get used to (the editor can wrap the lines visually, and you can still distinguish paragraphs clearly thanks to the blank lines between them). We use it for most of the papers and reports written in our department.

    Of course, you should also avoid useless edits such as unnecessary whitespace changes, but since editing a LaTeX file feels more like coding than writing poetry (at least to me), it is not so hard to keep the discipline.

    Having said that, a more intelligent tool would still be nice :)

  4. Thanks for the ideas! Dropbox can work, though it lacks some finer-grained version control capabilities. I think I looked at latexdiff; IIRC it helps with finding the changes, but I don't think it makes conflicts any easier to resolve? Putting every sentence on its own line can help. Changes within a sentence are still hard to see, but at least the changes between lines are easier to tell apart.
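
For the remaining problem of spotting changes inside a single sentence, one possible approach is a word-level diff. The following is a minimal Python sketch using the standard `difflib` module, not a turn-key tool. The function name `word_diff` is hypothetical, and the `[-...-]` / `{+...+}` markers are an assumption borrowed from the style of `git diff --word-diff`.

```python
import difflib


def word_diff(old, new):
    """Mark deleted words as [-word-] and inserted words as {+word+}."""
    a, b = old.split(), new.split()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    out = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Unchanged words are copied through as they are.
            out.extend(a[i1:i2])
        if tag in ("delete", "replace"):
            out.append("[-" + " ".join(a[i1:i2]) + "-]")
        if tag in ("insert", "replace"):
            out.append("{+" + " ".join(b[j1:j2]) + "+}")
    return " ".join(out)


if __name__ == "__main__":
    print(word_diff(
        "Changes within a sentence are hard to see.",
        "Changes within a sentence are easy to see.",
    ))
```

Because the comparison works on whitespace-separated words, it ignores where the line breaks fall, so it can be combined with the sentence-per-line convention or used on hard-wrapped text.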
