Yesterday, whilst catching up on the Git mailing list, I stumbled upon a patch proposing to improve the hunk header regex for Java. I had never paid much attention to how git-diff(1) picks the method signature to show in hunk headers, though I was vaguely aware that Git carries a bunch of regexes for different languages.

Turns out that by default, as explained in the manual for gitattributes(5), git-diff(1) emulates the behaviour of GNU diff -p and does not consult any of the language-specific regular expressions unless a path has been assigned a diff driver through the diff attribute. This came as a bit of a surprise to me, as Git usually has relatively sane and extensive defaults. Why define all these regexes and then not use them by default?

Perhaps one reason is that it is hard to tell when to use which. Git can only look at the filename, and not all shell scripts share the .sh ending, for example. Surely it would not be too invasive, however, to define sensible defaults for, say, files ending in .py or .rs.
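To make the default behaviour concrete, here is a rough Python sketch of the GNU diff -p style heuristic as I understand it: walk backwards from the start of the hunk and take the first line that begins with a letter, an underscore or a dollar sign. The function name and the sample file are made up for illustration, and the real implementation surely differs in its details.

```python
def default_hunk_context(lines, hunk_start):
    """Return the line a 'diff -p' style heuristic would put in the hunk header.

    lines: file contents as a list of strings, without trailing newlines
    hunk_start: 0-based index of the first line of the hunk
    """
    # Search backwards through the lines that precede the hunk.
    for line in reversed(lines[:hunk_start]):
        first = line[:1]
        # Only unindented lines starting with an ASCII letter, '_' or '$' qualify.
        if first and ((first.isascii() and first.isalpha()) or first in "_$"):
            return line.rstrip()
    return ""


# Hypothetical Python file, used only to show the effect of the heuristic.
source = [
    "import os",
    "",
    "class Greeter:",
    "    def hello(self):",
    "        print('hi')",
    "        return 1",
]

# A hunk starting at the 'return 1' line skips the indented 'def' line.
print(default_hunk_context(source, 5))  # prints "class Greeter:"
```

Because indented lines never qualify, a change inside a method is labelled with the enclosing class rather than the method itself, which is exactly where a language-aware pattern can do better.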

In any case I updated my ~/.config/git/attributes with the following, and am now enjoying better hunk headers across the board:

*.c	diff=cpp
*.cpp	diff=cpp
*.go	diff=go
*.md	diff=markdown
*.pl	diff=perl
*.py	diff=python
*.rs	diff=rust
*.sh	diff=bash
*.tex	diff=tex

The markdown setting is especially neat since the hunk header now shows the nearest preceding section heading, like so:

--- a/posts/weltschmerz.md
+++ b/posts/weltschmerz.md
@@ -24,6 +24,10 @@ ## Download
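
The idea behind that header can be sketched in a few lines of Python: search backwards from the hunk for the closest line that looks like a heading. The regex below is a simplified assumption of mine (up to three spaces, one to six '#' characters, then whitespace), not a copy of Git's built-in pattern, and the sample post is invented.

```python
import re

# Simplified ATX heading pattern; an assumption for illustration only.
HEADING = re.compile(r"^ {0,3}#{1,6}[ \t]")


def nearest_heading(lines, hunk_start):
    """Return the closest heading line above the hunk, or '' if there is none."""
    for line in reversed(lines[:hunk_start]):
        if HEADING.match(line):
            return line.rstrip()
    return ""


# Hypothetical post contents, without trailing newlines.
post = [
    "# Weltschmerz",
    "",
    "Some introduction.",
    "",
    "## Download",
    "",
    "Grab the tarball from the releases page.",
]

# A hunk that starts at the last line is labelled with its section heading.
print(nearest_heading(post, 6))  # prints "## Download"
```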