A small trick for distributed paper writing with [pdf]latex and git

Update 26.3.2021: This is a repost of a blog post I wrote four years ago. Since then, tools like Overleaf have gained widespread acceptance, and latexmk has removed the need for arcane makefiles in latex projects. But I believe that the general concept of embedding information in compiled latex PDFs using lower-level git commands is still valuable.

Introduction

Most of my collaborators and I use git for version control of draft papers written in latex. I usually bundle a makefile in the git repo to check for unresolved references and rerun [pdf]latex if necessary. In this post I will discuss how to use low-level git commands to embed useful information into the generated PDF.

The trick

My workflow is usually:

  1. Edit the source, adding or editing figures if necessary
  2. Commit and push
  3. Make PDF and email to colleagues to inform them of my changes

Unfortunately, I all too often either forget about step 2 or leave files out when committing. A surefire way of spotting this (and also of knowing which version of the PDF people are discussing) would be to print both the commit and the number of uncommitted changes in the paper directory in the PDF file itself.

If you trust your local system, i.e., you are happy to pass the --shell-escape flag to pdflatex, and you have a fairly recent version of git, this is all easy to do. I have the following in the abstract of a paper I am currently working on (the forced linebreaks and noindents are clunky, but what can you do...):

\\ \noindent
\textbf{Compiled from commit:}
\input "|git describe --first-parent"
\\ \noindent
\textbf{Files in paper directory with uncommitted changes: }
\input "|git diff-index --name-only HEAD -- ./ | wc -l"

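I appreciate that git describe is a porcelain command and git diff-index is plumbing, but they get the results I want. Note that git describe only considers annotated tags by default, so for git describe --first-parent to work, a release has to have been tagged (with an annotated tag) at some point; otherwise git describe exits with an error.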
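
A possible variation on the snippet above, sketched under the same assumptions (pdflatex run with --shell-escape, git on the path). Neither change is part of the original snippet. First, --always lets git describe fall back to an abbreviated commit hash when no tag can be found, and --dirty appends "-dirty" when tracked files have been modified. Second, the count uses git status --porcelain instead of git diff-index, because diff-index only compares tracked files, so a figure that was never added to the repo would not be counted. The catch is that untracked build products (.aux, .log and friends) are counted as well, unless they are listed in .gitignore.

```latex
% Sketch only: a variation on the snippet above, not the original setup.
% Needs pdflatex --shell-escape, like the original.
\\ \noindent
\textbf{Compiled from commit:}
% --always: fall back to an abbreviated hash if no tag is reachable
% --dirty:  append "-dirty" if tracked files have uncommitted changes
\input "|git describe --first-parent --always --dirty"
\\ \noindent
\textbf{Files in paper directory that are modified or untracked: }
% git status --porcelain prints one line per modified or untracked path
% (an untracked directory counts once); assumes build products are
% listed in .gitignore so they do not inflate the count
\input "|git status --porcelain -- ./ | wc -l"
```

Whether the extra strictness is worth it depends on how tidy the repo is: with a good .gitignore, a non-zero count is a fairly reliable sign that something has not been committed.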

Anyway, I have found this useful! I'm not, it seems, the first person to have pondered this:

  • Here is a similar but more sophisticated approach by Thore Husfeldt
  • Something simpler from Ward Muylaert
  • And there's even a gitinfo package in CTAN.

In my case, though, it was inspired by a Hacker News discussion that I can't find now, from several months back. If I can find it I'll add a reference here.