Proper Git Commit Messages and an Elegant Git History
Have you ever run a git log command, looked at the output, and thought, “WTF happened here?” I ran git log on a project yesterday and found myself staring at this git commit history:
commit 26ae8c3f0cf2a3d0b4aa52570711b2047b939592
Merge: 02f1f9a a73ecc6
Author: Programmer One
Date: Tue Mar 22 12:03:59 2011 -0700

    Merge branch 'master' of github.com:example/app

commit 02f1f9a146936373fcf8918036684ea209782a4b
Author: Programmer One
Date: Tue Mar 22 12:01:21 2011 -0700

    fix test

commit a73ecc678165284627f67eeede0eb43512422e71
Merge: f8cbf68 62cab2b
Author: Programmer Two
Date: Tue Mar 22 10:17:59 2011 -0700

    Merge branch 'master' of github.com:example/app

commit f8cbf680943e81aa57c5e09226f63817c09a13b5
Author: Programmer Two
Date: Tue Mar 22 10:16:53 2011 -0700

    added logged out user page
What am I supposed to do with that? This tells me nothing about what’s going on in the project. If I ran the same command six months from now, I would have no idea which test “fix test” referred to or which user-related page “user page” referred to. In fact, I have no idea what any of this means now. Let’s take a look at the issues with these commits and explore how we can improve them.
What’s the big deal with git log anyways?
Joel Spolsky put it best when he said, “The difference between a tolerable programmer and a great programmer is not how many programming languages they know, and it’s not whether they prefer Python or Java. It’s whether they can communicate their ideas… By writing clear comments and technical specs, they let other programmers understand their code, which means other programmers can use and work with their code instead of rewriting it. Absent this, their code is worthless. By writing clear technical documentation for end users, they allow people to figure out what their code is supposed to do, which is the only way those users can see the value in their code.”
This same concept applies to git commit messages and git history. The end users in this case are your fellow programmers, and you a couple of months later when you don’t remember what you did to fix that bug. If you want to be a good programmer, you’ll write good commit messages that others can understand and that properly document what you have done. One of the best skills you can learn as a programmer is to write about your code and to explain what you are doing. Git commit messages and code comments are the best place to start.
Merge and Rebase Commits
When your git log has as many merge commits as it does normal commits, it means you need to start rebasing some of your smaller commits. If your project doesn’t have any merge commits, it means you need to start using branches and merging your larger feature branches. Every project should have its own guidelines around merging and rebasing.
Why should you merge and rebase? Because using both methods provides a clean history and it helps differentiate a large branch merge from a simple fix or patch. Merge commits should have their own comments and they should provide a history of milestones.
The difference between merging branches and rebasing branches really comes down to how you want to format your history. In my opinion, when you have a small atomic fix that is a single commit, it’s always best to rebase. When you have a large feature history in a separate branch that you’ve been working on for a while, it’s always best to merge. Anything between these two is really up for grabs, but think about what you want to convey when you merge your code into another branch. Is the merge commit really as significant as your original commit?
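To put a number on the first rule of thumb in this section, one option is to count merge commits against normal commits. The sketch below is a minimal illustration, not part of any git tooling: the function name count_commits and the SAMPLE_LOG text are made up for this example, and it assumes the default git log format, where every commit starts with an unindented "commit <hash>" line and a merge commit has a "Merge:" header on the following line.

```python
# SAMPLE_LOG is illustrative text in the default `git log` layout.
SAMPLE_LOG = """\
commit 26ae8c3f0cf2a3d0b4aa52570711b2047b939592
Merge: 02f1f9a a73ecc6
Author: Programmer One
Date: Tue Mar 22 12:03:59 2011 -0700

    Merge branch 'master' of github.com:example/app

commit 02f1f9a146936373fcf8918036684ea209782a4b
Author: Programmer One
Date: Tue Mar 22 12:01:21 2011 -0700

    fix test

commit a73ecc678165284627f67eeede0eb43512422e71
Merge: f8cbf68 62cab2b
Author: Programmer Two
Date: Tue Mar 22 10:17:59 2011 -0700

    Merge branch 'master' of github.com:example/app

commit f8cbf680943e81aa57c5e09226f63817c09a13b5
Author: Programmer Two
Date: Tue Mar 22 10:16:53 2011 -0700

    added logged out user page
"""


def count_commits(log_text):
    """Return (merge_count, normal_count) for default-format git log text."""
    lines = log_text.splitlines()
    merges = 0
    total = 0
    for i, line in enumerate(lines):
        # Commit messages are indented in the default format, so an
        # unindented "commit " line marks the start of a new commit.
        if line.startswith("commit "):
            total += 1
            # A merge commit lists its parents on the very next line.
            if i + 1 < len(lines) and lines[i + 1].startswith("Merge:"):
                merges += 1
    return merges, total - merges


merges, normal = count_commits(SAMPLE_LOG)
print(f"{merges} merge commits, {normal} normal commits")
```

On a real repository you would feed it the output of git log. Git can also do the counting itself with git rev-list --count --merges HEAD and git rev-list --count --no-merges HEAD. Either way, the number is only a prompt to look at the history, not a rule.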
What should I include in my commit message?
The three basic things to include are a summary or title, a detailed description, and a tracking or ticket number, if it’s applicable. Here is a sample git commit with all that information:
Fixed bug where user can't sign up.

Users were unable to register if they hadn't visited the plans
and pricing page because we expected that tracking
information to exist for the logs we create after a user
signs up. I fixed this by adding a check to the logger
to ensure that if that information was not available
we weren't trying to write it.

[Fixes #2873942]
This is usually followed by the change information for this commit, which git generates automatically. Because of the way git’s logging commands format commit messages and summaries, it’s best not to exceed 50 characters for the title or summary line. This makes it easier to create summary logs and perform interactive merging and rebasing. It’s also best to wrap each line of the detailed description at about 72 characters. The reason for the seemingly arbitrary 72 characters is that a typical terminal window is 80 characters wide and git log indents the commit message by 4 characters. If you leave an additional 4 characters on the other side for symmetry, this leaves you with 72 characters per line.
The tracking or ticket number is optional, but most projects use some kind of tracking or bug management tool, so it’s nice to be able to tie commits back to projects and bugs with a ticket number. It’s often possible to integrate git with your tracking utility as well, so that your commits appear in your project tickets.
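The length guidelines above are easy to check mechanically. Here is a minimal Python sketch of such a check. The names check_commit_message and has_ticket, and the ticket pattern, are hypothetical and modeled only on the sample message above. The blank line between summary and description is a common git convention that this sketch assumes; your project may decide otherwise.

```python
import re

SUMMARY_LIMIT = 50  # title or summary line
BODY_LIMIT = 72     # each line of the detailed description

# Hypothetical ticket format, modeled on "[Fixes #2873942]" from the sample.
TICKET_PATTERN = re.compile(r"\[\w+ #\d+\]")


def check_commit_message(message):
    """Return a list of problems found; an empty list means none."""
    problems = []
    lines = message.strip("\n").split("\n")
    summary = lines[0]
    if not summary.strip():
        problems.append("summary line is empty")
    if len(summary) > SUMMARY_LIMIT:
        problems.append(
            f"summary is {len(summary)} characters (limit {SUMMARY_LIMIT})"
        )
    # Assumed convention: a blank line separates summary from description.
    if len(lines) > 1 and lines[1].strip():
        problems.append("second line should be blank")
    # Description lines start at line 3 of the message.
    for number, line in enumerate(lines[2:], start=3):
        if len(line) > BODY_LIMIT:
            problems.append(
                f"line {number} is {len(line)} characters (limit {BODY_LIMIT})"
            )
    return problems


def has_ticket(message):
    """True if the message contains something shaped like a ticket tag."""
    return TICKET_PATTERN.search(message) is not None


sample = (
    "Fixed bug where user can't sign up.\n"
    "\n"
    "Users were unable to register if they hadn't visited the plans\n"
    "and pricing page because we expected that tracking\n"
    "information to exist for the logs we create after a user\n"
    "signs up.\n"
    "\n"
    "[Fixes #2873942]\n"
)
print(check_commit_message(sample), has_ticket(sample))
```

A check like this only looks at form. A message such as “fix test” passes every length rule and still tells the reader nothing, so it complements a well-written description rather than replacing one.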
Conclusion
As always, this is all up for some customization. There are no hard-and-fast rules about rebasing and merging, but be consistent. You and your fellow developers should agree on a set of rules for when to rebase and merge. There is no 100% correct git commit message template, but use something that makes sense for your project. The guidelines you develop should help produce both a good concise history and a good expanded history. The main thing to take away is that you should actively be thinking about your git history and how you can use it as a tool to understand a project’s history. Treat it as an exercise in making yourself a better programmer.