The purpose of a git commit message is to answer the question “why does this commit exist?” That is the principal question you should be answering when you type `git commit`. This is the question you will be asking yourself when you find that commit in `git blame` or if it shows up in `git bisect`. Try to help your future self out.
The changelog, on the other hand, answers the question “why does the customer care about these changes?” It could be the same reason as why the commit exists, but the question is different. If the customer doesn’t care, maybe it doesn’t even need an entry. Maybe why they care is slightly different than the reason the commit(s) exist.
This is why I advocate for a hand-updated CHANGELOG.md. It’s a very small amount of writing that forces developers to consider how their PRs will impact customers, which is something an auto-generator will never be able to do well.
We autogenerate our CHANGELOG.md from `Changelog-[section]` lines extracted from our git commit messages (the Release Captain massages them if necessary), and have a GH bot which checks that you have a `Changelog-` note somewhere in your PR commits (it can be `Changelog-None: ....`).
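For anyone curious what that kind of extraction might look like, here is a minimal sketch. It is not our actual tooling: the exact trailer syntax, the section names (`Fixed`, `Added`) and the function names are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Assumed trailer shape: "Changelog-<Section>: <text>" on its own line.
# "Changelog-None" is treated as "this commit deliberately has no entry".
TRAILER = re.compile(
    r"^Changelog-(?P<section>[A-Za-z]+):[ \t]*(?P<text>.*)$", re.MULTILINE
)


def collect_entries(commit_messages):
    """Group changelog trailers found in commit messages by section."""
    sections = defaultdict(list)
    for message in commit_messages:
        for match in TRAILER.finditer(message):
            if match["section"].lower() == "none":
                continue
            sections[match["section"]].append(match["text"].strip())
    return dict(sections)


def has_changelog_note(commit_messages):
    """Bot-style check: does any commit in the PR carry a Changelog- trailer?"""
    return any(TRAILER.search(message) for message in commit_messages)


def render(sections):
    """Render the grouped entries as a markdown fragment."""
    lines = []
    for section in sorted(sections):
        lines.append(f"### {section}")
        lines.extend(f"- {entry}" for entry in sections[section])
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"
```

The human pass by the release captain still matters here; the script only collects what the authors wrote.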
A couple jobs ago, I was at a company that had a "mandatory" squash / rebase / merge workflow so history would be clean. On top of that, they forced all their developers to update a change log as part of the merge. That file was a source of contention / merge conflict for nearly every PR, often requiring additional rounds of rebasing. On top of that, it was full of information that could've been gathered from the git log. Big waste of time, in my opinion.
We had something kind of similar, but designed to avoid merge contention. We had each PR include a randomly-generated number, and a `changelogs/` folder where you could add a `${number}.md` that was either blank or had a message that would be added to the changelog. After you made the PR, you could run a bash script to edit the PR to contain the number and generate the `${number}.md` file.
It felt kind of silly and I don't know if anyone actually looked at the changelogs, but it took 2 minutes out of my day and worked well.
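Roughly, the assembly step could be sketched like this. This is not the script we had; the function name and the one-bullet-per-file output format are assumptions, and only the `changelogs/` folder with one `.md` file per PR comes from the setup described above.

```python
from pathlib import Path


def assemble_changelog(fragment_dir):
    """Turn every non-blank <number>.md fragment into one bullet line."""
    entries = []
    for path in sorted(Path(fragment_dir).glob("*.md")):
        # Collapse whitespace so a multi-line fragment stays a single bullet.
        text = " ".join(path.read_text(encoding="utf-8").split())
        if text:  # a blank file means "this PR has no changelog entry"
            entries.append(f"- {text}")
    return "\n".join(entries)
```

Because every PR touches its own file, two PRs never edit the same lines, which is where the lack of merge contention comes from.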
If you have never needed to find out why a particular piece of code is the way it is, then you have been very lucky to work in very clean code bases. In my own experience, this has been an infrequent but inevitable part of work - maybe once a month or so, I have had to understand whether a particular piece of code, that seems wrong, had a good reason for existing or not.
Sometimes it turned out to be a mistake in the original commit, or a workaround for a limitation that no longer exists; other times it has saved me from re-introducing a bug that someone had spent effort fixing.
ArrayBoundCheck didn't say they never had to find out why a particular piece of code is the way it is, just that they didn't have to search through log messages.
For instance, when you use "git blame" and similar tools, the log messages are not involved. You might end up reading the log message of the commit that was responsible for a change, but you didn't search log messages to get there.
In a project with poor commit messages, they will be of little use; reading them won't produce much value, let alone searching.
That seems like the best possible interpretation of the user's comment, anyway.
I think you just haven't discovered you can? I don't know of anyone arbitrarily looking through logs, but it's incredibly useful with git blame when you get to a section of code and don't understand it, typically because it's done in an odd or unintuitive way. The blame shows who wrote it, information as to what they were working on via the actual commit message, and branch information.
If the person still works with you, you can just ask them about it.
If they don't, or you don't want to bother them, the commit message tells you what the change was and oftentimes why, which gives you better context.
All the teams I've worked on included information like feature, bug, defect, and the associated ticket number in the branch name, so you have the information at hand to go look at the ticket directly and see what requirements were needed.
I'm starting to understand why. It appears my workflow has the information other people would want from a git log elsewhere (tests, specs, examples, etc.)
Let's say you have some code written 2 years ago. 2 months ago someone made a bug fix in the code replacing a few of the lines. How would you know from tests, specs or examples which specific lines were modified and by whom? I don't get it.
Why would I want to do that?
Usually if there's a problem I'll either write a new test or see if someone modified a test I thought was covering the case
Usually people 'own' a file or part of the system so that wouldn't really be happening anyway
Because sometimes we can learn from history. If a mistake was made at some point it can be good to understand why. Of course you can adopt the mindset that you don't care what ended up causing a bug, but learning from mistakes is a good thing.
> see if someone modified a test I thought was covering the case
How do you see if someone modified the test then? I feel like we are maybe misunderstanding each other because to me, and seemingly most other commenters here, this is such an obvious no-brainer that it seems something is lost in the communication.
> Because sometimes we can learn from history. If a mistake was made at ...
A commit message helping me learn anything that I didn't get from a code comment sounds like a stretch. I'm doubting this
> How do you see if someone modified the test then? I feel like we are maybe misunderstanding each other
Clearly. I was suggesting I might see commit messages if I'm using git blame to find out if a case was removed from a set of tests or if it never existed in the first place. But I don't see how messages would help at all in anything I do. What I'm looking for is far too specific to be in a commit message and this whole thread about using commit messages to learn sounds nonsensical.
As I look at the top ten committers to our key service, two of us are still here, the other eight (including the lead from day one) switched teams or left the company over the last five years. Unfortunately I don’t think 27% turnover per year is unusually high in tech, so ownership isn’t a good replacement for written records.
That's right; for log messages to be useful, they have to be very disciplined. They have to stick to a certain format and then if the information you're looking for is of the kind which is provided by that schema, then they are useful.
As an example, say log messages are strictly required to contain a bug database ticket number (even if they are not fixes: bug database tracks tasks too) then that is useful; you can quickly search the git log for a bug number to find all of its commits.
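A small sketch of why the strict format matters, assuming ticket IDs shaped like `PROJ-1234` (the ID shape, the sample data layout and the function name are made up for illustration). Matching whole IDs avoids the classic problem where a search for ticket 12 also hits ticket 123.

```python
import re

# Assumed ID shape: uppercase project key, a dash, then digits.
TICKET = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b")


def commits_for_ticket(commits, ticket_id):
    """Return the hashes of commits whose message mentions exactly ticket_id.

    `commits` is a list of (hash, message) pairs.
    """
    return [sha for sha, message in commits if ticket_id in TICKET.findall(message)]
```

On a real repository the same search is usually done with `git log --grep=<pattern>`; the sketch only shows the matching rule.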
Doesn't mean others are in the same boat as you. The team I work on frequently writes detailed commits, to the point where I can (and several times have) successfully searched for information from years ago on a block of code and found the ticket number and the developer's (even my own) reasoning at the time.
Maybe it depends on the project but I've found myself doing this so often that I won't stop, it's such a small task that has given me so much benefit. If it doesn't help anyone then nothing was lost.
To explain why a certain block of code exists. When the code came into existence is rarely that important. You just want to know why.
Why does the code fence against a particular circumstance you didn't think should be possible? Why does it call out to something you think is unrelated? Those questions can be answered by a proper commit message.
Oh jeez, please do not make me run “git log” and then open a hundred tabs in an old bug tracker that may or may not still exist to figure out when a problem may have been introduced. I want code reviewers to insist on at least somewhat useful messages for us to skim at 3 AM.
Could you please stop breaking the site guidelines? You've been doing it a lot lately, unfortunately. Not just with these off-topic complaints about downvoting, but with comments like these:
We eventually ban accounts that carry on like this on HN. I don't want to ban you, so if you'd please review https://news.ycombinator.com/newsguidelines.html and stick to the rules when posting here, we'd appreciate it.
Some of those comments were in response to people calling me a liar with upvotes on their comments and downvotes on mine. It's incredibly irritating and frustrating when it appears those accusing me of lying do not suffer any consequences.
For example (one that you linked to), a person said I was wrong and his comment wasn't flagged, yet when I asked him to stop replying to my comments (many of my comments in the thread weren't on his comments), mine got flagged: https://news.ycombinator.com/item?id=32086054
It's certainly irritating and frustrating, but it doesn't make it ok to break the rules.
If you see a post that ought to have been moderated but hasn't been, the likeliest explanation is that we didn't see it. (There are far too many posts here for us to read them all.) You can help by flagging it or emailing us at hn@ycombinator.com.
(Mostly a rhetorical question) I'll probably never be able to do that if it's based on karma but my question is, if I stop saying things that get my comments flagged but proceed to get into negative ranking (ex: -50 or maybe -100 if I get on an unlucky roll) will I be banned by the system automatically? I assumed so which was the other reason that annoyed me
If karma goes below a certain threshold (I think it's -12) then the account's comments get autokilled, but only for as long as the karma remains below that threshold.
Because if 37 changes are done to the same 15 line function over time, the amount of comment material will dwarf the function. And most of it will pertain to historic versions of the function which are not what actually appears below the comment; a comment made 13 revisions ago makes sense for the 13-revision-old version of the function.
- written by people who have since left the company
- the original documentation has rotted away through wiki replacements, issue tracker replacements, or being lost via people turnover or system replacement.
The result in many cases is the code is the _only_ documentation of the system behaviour, so seeing what was introduced together is important context to understanding why it is the way it is. I probably run git blame more than git commit at this point and there's a real QoL difference between the good commit messages and the "changes for ticket12345" commit messages.
Never looking at git history is like, not reading comments or something. It's an incredibly valuable resource for understanding why the existing code is the way it is.
You didn't answer why. My code passes all the unit tests and we almost always have real code using it immediately. The function works. Why am I reading it? The only things I read are bug reports (usually a spec problem, not normally a logic bug) and new features, or test outputs
If someone complains that our system did something weird, sometimes it’s a mistake we can just fix, but sometimes it’s a non-obvious consequence of some requirement (which I might not have been aware of) that we have to explain to our consumer. It also helps a lot to tell whether it has been like that for a week or five years.
I can’t even imagine having a spec that fully answers anything like this; Microsoft tried for that level of detail but I found they couldn’t keep it up to date.
I think commits should contain atomic-yet-meaningful changes and the commit message should describe this as well as possible.
It's worth rewriting the history to achieve this and squashing or splitting commits until this is the case. You shouldn't do this for the benefit of your users or a changelog, you should do this in order that it is easier to bisect the history or for other contributors to understand exactly the change a commit relates to. There is nothing worse than commits which combine a working bug fix with a half-written feature -- split them out!
Obviously, it's possible to inadvertently create a misleading history by re-arranging the order that work was done or getting rid of failed attempts at a solution, but generally the false reality is easier to understand and good understanding is key.
Yeah, we don’t do much but I started writing up a brief description with a link to the PR, hotfix, or commit, so we can easily find links to relevant changes if we need to. It’s not that difficult to write it up manually. Automating it is too prone to either errors or a less than helpful message.
For instance, I retired the ChangeLog in the TXR project in 2015; commit messages continue to be in the ChangeLog format. A ChangeLog file could easily be produced from the commit messages.
Replicating that information in a file that is checked into git is silly; you're just begging for merge conflicts. Any time anyone sends you a patch, if it is not rebased to your current HEAD, you have a guaranteed conflict in the ChangeLog file.
Why do that to yourself.
> The changelog targets your users. It must answer questions like:
> "What cool new feature is in this version?"
> "Is this annoying bug fixed?"
> "Is it safe to upgrade, or do I need to adjust my code/workflow to this new version?"
Oh, I see what this person's problem is. He's referring to some lower case "changelog" that everyone calls "release notes".
I agree; the git log is not your release notes.
Please don't call "release notes" "changelog" in 2022.
Release notes aren't a change log because, doh, they don't (exactly) log the changes.
(Author here) Yes, "release notes" is probably the appropriate term. The issue remains however: many projects generate their release notes from their git commit messages.
Regarding the conflicts issue, this is where tools like Changie can help: instead of modifying one single file, every merge adds its own separated entry. They are then assembled together at release time with `changie batch $VERSION` to produce a single file for $VERSION, then merged into the global changelog/release notes file with `changie merge`.
This is something you could do with a commit hook which verifies that commit messages have a well-formed release notes entry section, and some seven line Awk script to combine them together when needed.
If "every merge adds its own separated entry" and that doesn't refer to a special section of the commit message, you're doing something silly.
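To make that concrete, here is a rough sketch in Python rather than Awk. The `Release-Notes:` marker, the indented-continuation rule and the function names are hypothetical; the only Git-specific fact relied on is that a `commit-msg` hook receives the path of the message file as its first argument.

```python
import sys

HEADER = "Release-Notes:"  # hypothetical section marker in the commit message


def release_note(message):
    """Return the release-notes text, or None if the section is missing or empty."""
    lines = message.splitlines()
    for i, line in enumerate(lines):
        if line.startswith(HEADER):
            body = [line[len(HEADER):].strip()]
            for cont in lines[i + 1:]:
                if not cont.startswith(" "):  # continuation lines are indented
                    break
                body.append(cont.strip())
            text = " ".join(part for part in body if part)
            return text or None
    return None


def combine(messages):
    """The 'combine them when needed' step: one bullet per commit with a note."""
    notes = (release_note(m) for m in messages)
    return "\n".join(f"- {note}" for note in notes if note)


def main(argv):
    """Hook entry point: reject the commit when the section is absent or empty."""
    with open(argv[1], encoding="utf-8") as fh:
        message = fh.read()
    if release_note(message) is None:
        print("commit message needs a non-empty 'Release-Notes:' section", file=sys.stderr)
        return 1
    return 0
```

A hook file would end with `sys.exit(main(sys.argv))`; a non-zero exit status is what makes Git abort the commit.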
I find that many programmers are hyper-focused on writing automation for things that feel like a chore. This thinking has its purpose, but it’s almost an addiction.
For these folks, putting deliberate effort into change logs, release notes, and documentation feels wasteful.
My hunch is that this is due to a missing feedback loop: we are unlikely to get feedback about documentation, and more likely to get feedback on our project’s code.
My own writing improved after a past project had a strong feedback loop with my documentation’s intended audience. This has been so damn rare in my career, that it’s never surprising when I meet programmers who are uncomfortable with technical writing.
I think the reason programmers want to automate documentation is:
- they are programmers, automating things is their job, that's what they are good at, so of course they will do that
- there is the general "don't repeat yourself" idea. Documentation repeats the code so, ideally, if both are needed one should be generated from the other. Sometimes, the code can be generated from the documentation, but most of the times you can't, so documentation becomes secondary to the code.
According to the GNU coding standards, which have been used for decades for a large amount of the core software on a Linux system, what you should put in the changelog looks quite a lot like a good git commit log[0]. And Linux currently uses the git log as the changelog, and IIRC had a similar format in the pre-git era.
> The changelog targets your users. It must answer questions like:
> "What cool new feature is in this version?"
> "Is this annoying bug fixed?"
> "Is it safe to upgrade, or do I need to adjust my code/workflow to this new version?"
To me, that sounds like what a lot of projects would call "release notes" or (per GNU) "NEWS file"[1], not the changelog.
I agree, I used to have a NEWS file in my projects (later a NEWS.md), but as others commented, the meaning of the term "changelog" has changed. Sites like https://keepachangelog.com/ really refer to release notes or news.
GNU's notion of a changelog predates the widespread use of source control, and is indeed there to cover what is now covered by git history. It's not what anyone using modern software development approaches means by a changelog.
> It's not what anyone using modern software development approaches means by a changelog.
Well, it's a good job then that this forum is cool with blurring the subtle distinctions between technical terms that have been in use for decades, and is happy to just go with whatever mainstream definition of a word has the most traction...
...on this site for news about breaking into computer systems.
Yes, a commit log isn't a changelog. However, a good commit log can make writing your change log much easier. While this isn't an automatic process, writing a changelog becomes a bit of filtering of the commit messages as well as rephrasing them for the intended audience.
> Git commit messages and changelogs do not have the same target audience. [...] Some people dislike merge commits. [...] If that bothers you, you can always ask contributors to rebase before merging.
I think that squashing and merging, much like commit messages and changelogs, also have different target audiences.
If there are contributors to the project who aren't proficient with git, asking them to rebase would be a huge obstacle for them and create a much worse mess. Squashing simplifies their workflow and allows any mistakes (e.g. checking in the database, accidental merges from the wrong branch, etc.) to be cleaned up and kept out of the permanent history.
Agree with the premise of the article: changelogs—or rather, news files—are not the commit history. Describing the changes between releases at a high level is super important but also a skill that’s hard to acquire.
I thought I had written about this way back as well, but what I found instead is a post from 2005 that’s tangentially related. I remember that some projects back in the day tried to replicate the equivalent of git log in their ChangeLog file… by hand!
The missing part of this is filtering git history with "git log --first-parent", but that only works if your repo has good a) merge commit messages and b) hygiene in creating your merges so that your "Pretty" history is on branch 1.
On the other hand, it takes a lot of discipline to make the second branch useful at all. If you constantly make short commits with messages like "stuff", why bother keeping them in the repo at all? There's a ton of junk you don't need in the typical code review, and making it useful requires... Discipline.
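As a toy model of what following only first parents does (the real thing is just `git log --first-parent`), here is a sketch over a made-up commit graph. The dictionary layout and the function name are assumptions for illustration.

```python
def first_parent_history(commits, head):
    """Walk from `head` following only each commit's first parent.

    `commits` maps a hash to (list_of_parent_hashes, message). For a merge
    commit, the first parent is the branch that was merged *into*.
    """
    history = []
    sha = head
    while sha is not None:
        parents, message = commits[sha]
        history.append(message)
        sha = parents[0] if parents else None
    return history


# Tiny example graph: one feature branch with two throwaway commits,
# merged into the main line by "m2".
EXAMPLE = {
    "m2": (["m1", "f2"], "Merge: add CSV export"),
    "f2": (["f1"], "stuff"),
    "f1": (["m1"], "wip"),
    "m1": ([], "Initial commit"),
}
```

Walking `EXAMPLE` from `"m2"` skips "stuff" and "wip" entirely, which is exactly why the merge commit message has to carry the useful description.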
At work we squash and use standard-version to create a generic change log, then have a script that ingests that change log and goes to the PR to grab any associated tickets, screenshots, and release notes the developer may have written. The body of the PR is split into sections (just using markdown headers), so there is control over what goes into the release notes.
A git log is not a change log, sure. But PRs can contain a lot of useful information.
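The section-splitting part is simple enough to sketch. This is not our script; the header names in the usage note and the function name are illustrative assumptions, and it only handles `#`-style markdown headers.

```python
import re

HEADER = re.compile(r"^#{1,6}\s+(.*\S)\s*$")


def split_sections(pr_body):
    """Map each markdown header in a PR body to the text beneath it."""
    sections = {}
    current = None
    for line in pr_body.splitlines():
        match = HEADER.match(line)
        if match:
            current = match.group(1)
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return {name: "\n".join(body).strip() for name, body in sections.items()}
```

With a PR template that has, say, a "Release Notes" header, `split_sections(body).get("Release Notes", "")` gives the developer-written text and ignores everything else in the description.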
I use the git log to feed my changelog. I prefix the stuff that's supposed to go in the release notes with an asterisk and the technical boring stuff is just a normal line. Then at release time I have a script that pulls the asterisk-prefixed lines from the change log into the RELEASENOTES.md. I wouldn't want to bother with more.
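The filtering step is tiny. A sketch of the idea, not my actual script (the function name and the bullet output are assumptions):

```python
def release_note_lines(commit_messages):
    """Keep only the lines marked with a leading asterisk, without the marker."""
    notes = []
    for message in commit_messages:
        for line in message.splitlines():
            stripped = line.strip()
            if stripped.startswith("*"):
                notes.append(stripped.lstrip("*").strip())
    return notes


def render_release_notes(commit_messages):
    """Format the marked lines as a markdown bullet list."""
    return "\n".join(f"- {note}" for note in release_note_lines(commit_messages))
```

Everything without the marker stays in the git log and never reaches the release notes.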
I completely agree with this. Every single project with « conventional commits » that try to generate release notes end up with completely useless release notes. Just bite the bullet and write for humans, it’s not that hard, especially if you do it on the fly as suggested by the post.
If you write dev facing projects, unless your volume is that high, your commit history can absolutely be your change log using conventional commits.
Otherwise your PM can write change logs based on JIRA or something. Your change log generated from commits can still be useful for incident management, etc.
NixOS maintainer here and this has been really annoying me for months. I can't really know what from a 100+ line git log is a breaking change and needs special attention and what is totally uninteresting for consumers of the piece of software.
Yep, something like Git Cliff[1] is great for generating release notes from your commit messages.
And conventional commits are a good thing to do regardless of whether you use them for release notes or not. Commit messages should be helpful and immediately obvious; too often it's "fixed bug" or "finally figured out foo!", which really tells you nothing, so you might as well not have a message.
This is the kind of tool I dislike. It does generate a "nice looking changelog", but the result is not as useful as a separately written changelog, because the content is not curated, so the signal-to-noise ratio can be quite low.
we use https://github.com/anchore/chronicle to generate release notes in a changelog format using the issues and PRs from GitHub as the source of truth. In this way time well spent in the curation of issues and PRs (which is something we need to do anyway) means that we automatically get release notes for free. (disclaimer: I'm the author of chronicle)
Most software is written for internal use and not sold to external customers. For this case Changelogs are busy work without much utility up until the point your organization is big enough that you and your "customer" may as well be separate companies. In that case you're better off writing a blog post or release announcement than a text file change log which is for grey beards, not users.