zunzuncito

For whatever reason I’ve been uncovering software bugs at an unprecedented rate in the past 10 days. This is by no means a bad thing, I enjoy hunting down and fixing bugs, but it does mean that the additional overhead of drafting a post about each bug becomes a bit too much. So instead here’s a quick overview - the linked patches and merge requests will have more information, if you are interested.

Trash size calculation in KIO

I noticed this one pretty much right after starting to use Dolphin but did not end up looking into it until quite a bit later: when displaying the size of the items in the trash, the application would always show 0 bytes. This would also cause the automated cleanup of items to fail - Dolphin simply believed that the trash was empty.

KDE uses the KIO framework to provide management of the trash. A recent commit had changed the construction of a QDirIterator in a way that would make it ignore all items when iterating over the trash directory. Thankfully the fix was straightforward and it was merged quickly.

git-shortlog(1) segfaults outside of a git repository

This one I uncovered as I was writing a small script to give me an overview of commit authors in all the git repositories I had cloned locally. I was happily scanning through my source directory using the --author flag for git-shortlog(1) to generate this, fully expecting git to complain about the few non-git directories I had. Instead of complaints, however, I got a segfault.

Turns out that a change back in May stopped setting SHA1 as the default object hash. This was done to progress the slow-moving transition to stronger hash functions but inadvertently broke git-shortlog(1) whose argument parsing machinery expected a default hash algorithm to be set. I sent a patch upstream.

An infinite loop in plasmashell

I regularly use the Activities functionality in Plasma 6 and switch through my activities using Plasma’s built-in activity manager. A couple of days ago I managed to make plasmashell, the provider for Plasma’s desktop and task bar, freeze - I had hit the “up arrow” key in the activity filter text box when there were no results visible. This was perfectly reproducible, so I went to investigate.

The cause of the issue was a do-while construct not handling a specific sentinel value, making it loop infinitely. For this one I also opened a merge request upstream.

Yesterday whilst catching up on the Git mailing list I stumbled upon this patch proposing to improve the hunk header regex for Java. I had never paid much attention to how git-diff(1) finds the right method signature to show in the headers though I was vaguely aware of a bunch of regexes for different languages.

Turns out that by default, as explained in the manual for gitattributes(5), git-diff(1) emulates the behaviour of GNU diff -p and does not consult any of the language-specific regular expressions. This came as a bit of a surprise to me, as Git usually has relatively sane and extensive defaults. Why define all these regexes and then not use them by default?

Perhaps one reason is that it is hard to tell when to use which. Git can only look at the filename, and not all shell scripts share the .sh ending, for example. Surely it would not be too invasive, however, to define sensible defaults for, say, files ending in .py or .rs.

In any case I updated my ~/.config/git/attributes with the following, and am now enjoying better hunk headers across the board:

*.c	diff=cpp
*.cpp	diff=cpp
*.go	diff=go
*.md	diff=markdown
*.pl	diff=perl
*.py	diff=python
*.rs	diff=rust
*.sh	diff=bash
*.tex	diff=tex

The markdown setting is especially neat since it will now display the nearest section right in the diff, like so:

--- a/posts/weltschmerz.md
+++ b/posts/weltschmerz.md
@@ -24,6 +24,10 @@ ## Download

In the previous post I talked about a couple of different ways to apply patches with mutt(1) or neomutt(1). Turns out Maildir might not be the best format to use for git-am(1) because its files are not guaranteed to be in any specific order (per spec they need only carry unique names).

As git-am(1) does not sort its input, patches might be applied in the wrong order. This came up on the mailing list as well, all the way back in 2013. A fix specific to Maildir files created by mutt(1) was added in 18505c3.

Sadly neomutt(1) changed this format 5 years ago, removing the sequence number that git-am(1) relies on in commit 75b3708 and replacing it with a call to mutt_rand64(). I can only assume no one is using neomutt(1) to export patches to Maildir, since having patches applied in the wrong order is a pretty significant problem.

For now I recommend using the mbox format instead when exporting patches. Whilst that doesn’t guarantee a specific order either, usually mail clients are nice enough to export mails to mbox in the order they are shown.

The core issue remains until git-am(1) learns to sort mails itself.

When maintaining a project sooner or later there comes the time when you need to apply patches that have been submitted to you. Assuming a patch-based workflow, those are going to be one patch per mail, possibly connected in a thread. There’s lots of different ways of getting those patches to their final destination, git-am(1), and in this post I want to take a look at ones that work well with mutt(1) or neomutt(1) since that is what I use.

I want to rely on the default bindings as much as possible. All mentioned bindings should work out of the box.

Applying a single patch is very straightforward: Select the mail in the pager and hit | to pipe it to an external command. This command will be some variation of git-am(1), perhaps git am -s to also add a Signed-off-by trailer.

What if you want to apply a patch series, however? An obvious solution would be to pipe each message in the thread to git-am(1). The <pipe-message> command we invoked earlier with | only applies to the currently selected message, so we can’t use that on the whole thread. Instead we can tag the whole thread using <Esc>t, then use the <tag-prefix> command ; followed by | to send all tagged messages to git-am(1).

There’s two problems with this, though. The first is that depending on the setting of pipe_split, git-am(1) might only apply the first patch in the series. This is the case if pipe_split is set to the default of no; mutt(1) will then concatenate the messages before sending them to the external command. Sadly this concatenated format is slightly different from the mbox format that git-am(1) expects, making it not see anything past the first patch.

With pipe_split set to yes, mutt(1) spawns one process per tagged mail instead, applying all patches correctly. Now that git-am(1) is spawned once per mail, however, you lose its atomicity: Neither ORIG_HEAD will be set correctly, nor will --abort go back to the original branch.

This might not be a big issue, but I am not a fan. Thankfully git-am(1) supports reading patches from mbox files or Maildir structures. So instead of piping mails to git-am(1) via <pipe-message>, let’s save them to the current directory: With the thread still tagged, ;C (<tag-prefix> followed by <copy-message>) will save a copy of it under a given path. Now you can apply the series by hitting ! and running git am -s <path>. This works with mbox_type set to either mbox or Maildir (but see № 9 ).

Of course there is no need to rely on the default bindings, especially if you need to do this kind of thing very often. mutt(1) is easily customizable, making it possible to bind the entire chain of actions to just one keystroke. If you’re interested in a more detailed examination of a patch-based workflow with mutt(1), check out this post by Greg Kroah-Hartman.

Even though git-send-email(1) calls git-format-patch(1) if you provide a revision list, oftentimes when submitting a patch series you should instead run git-format-patch(1) first, and later use git-send-email(1) only to submit the files it generated. There’s a bunch of reasons for this.

send-email is built chiefly for mail submission, not patch formatting. There is no way to pass options meant for format-patch, meaning that you miss out on really good features like --cover-letter or --interdiff.
If you use --annotate or decide to edit the mail body, you will lose all changes if you quit before sending. You can’t save your work and continue writing the mail later either. Once you call send-email, you’re committed; It’s all or nothing.
--compose gets you a worse version of format-patch’s --cover-letter. No diffstat is included by default, and the same problems as in 2) apply.
format-patch outputs text files for you to browse and edit. This can be done on your own time and with your own tools, without the send-email prompt nagging and stressing you.

Providing a convenient way of quickly sending out small patches makes sense, but all in all I think the inclusion of formatting in git-send-email(1) is a glaring misfeature. Hiding git-format-patch(1) away from the user makes git-send-email(1) intransparent and, worse, really clunky to use for regular patch workflows.

Sadly, git-send-email.io still tells newcomers to use only get-send-email(1), without even mentioning git-format-patch(1). This won’t change any time soon, either.

That is a shame. Guiding people to a worse workflow will not increase the standing of mail-based processes. For now I’d recommend linking newcomers this section of Git’s own contribution tutorial instead.

The notion of a “merged” branch is highly dependent on the workflow used for a project. I was wanting to clean up some topic branches in my copy of git.git today, but git branch -d refused to delete them, pointing out that they were not yet merged.

I knew for a fact that they were, which made me look up how git branch -d actually determines that. The manual is not entirely clear, but a comment in the code pointed out that git constructs the merge base of the branch and its upstream (or HEAD if there is none) and checks whether the branch is reachable from that merge base.

In a patch workflow, this will generally not be true. A lot of things may happen to your patches before inclusion, and with git.git they will get at least one other sign-off. They’ll be recorded in a merge commit, but it will not have your original branch as one of its parents.

Therefore, neither git branch -d nor git branch --merged will report your branch as merged. Both of these tools are built for the merge workflow instead.

To see if your work was merged in patch-based workflows, use git-cherry(1). Then you can safely force deletion of the branch with git branch -D

zunzuncito

A plethora of bug fixes

Trash size calculation in KIO

git-shortlog(1) segfaults outside of a git repository

An infinite loop in plasmashell

Better hunk headers with gitattributes(5)

git-am(1) and mail sorting

Applying patches with mutt(1)

Use git-send-email(1) only to send

Patch workflows and git branch -d