WE ANALYZED 1 MILLION GIT COMMITS
To Decode Software Dev Trends & Best Practices
by Rajeev Bera
Updated June 18, 2023
We analyzed more then 1 million Git commits to better understand the world of Software Development and Git Best Practices right now.
And I want to make something crystal clear:
This is NOT your typical "Git in 2023" forecast article.
Instead, I'll discuss the most crucial practices for your Git repository.
But that's not all.
I'll also give you new ways to do things. These new ways are working really well right now.
So, if you want to make your Git repository better and faster this year, you will love this extensive guide.
And now it’s time to share what we discovered.
Here is a Summary of Our Key Findings:
- 1The analysis of Git commits reveals the top commit types. They are development, bug fixes, PR, merge, and test-related commits. It reflects the priorities in software development.
- 2The majority of Git commits (about 79%) were about feature development and bug fixes. It indicates a healthy focus on product growth and quality maintenance.
- 3Bug-fixing tasks account for a significant proportion of Git commits, approximately 27%. This shows that there is a strong emphasis on improving software quality and reliability.
- 4When commit messages carry a positive sentiment, they are 93% less likely to associate a commit with a bug fix. This suggests that Developers show more positivity in commit messages when not handling bugs.
- 5The highly successful repositories have a pattern of frequent collaboration. They excel in commit frequency and code quality. Higher pull request and merge commit rates correlate with a 9% increase in test-related commits. It is a crucial indicator of code quality.
- 6The majority of Git commits show positive sentiment, with an average of 57%. This suggests that developers generally have a positive view of the development process. And they feel good about their work.
- 7Frequent committers tend to have more positive sentiment scores. Developers who commit code more than twice a day had an average sentiment score 1.7% higher than less frequent contributors. It suggests a correlation between activity level and positive sentiment.
- 8Comments on pull requests were 43% shorter compared to comments on commits without pull requests. This indicates a more concise and focused way of communicating in collaborative situations.
- 9Bug-fixing commits have comments that are, on average, 22% longer than those in pull requests. It indicates that developers give more detailed descriptions when addressing bugs.
- 10Wednesday and Tuesday are the most active days for commits. Conversely, Friday is the lowest activity, with only 4% commits.
- 11Interestingly, the proportion of documentation-related commits is just 0.69% of the total commits. It emphasizes the efficiency and focus of developers on code production and bug fixes.
- 12Utilizing Git blame leads to 24% fewer lines added in code changes. It shows Git blame improves code understanding & reduces change volume.
- 13Refactoring is a common practice in all repositories, with at least 1.02% of commits dedicated to it. It shows the ongoing effort to improve and simplify the code's structure.
Top Five Commit Types: Reflecting Priorities in Software Development
In software development, what type of changes we make can show what the team focuses on.
From our data, the top five kinds of commits tell us what the team cares about the most.
- 1New Feature Commits: They lead the way, making up around 62% of all commits. This shows how dedicated the team is to improving the software's functionality and making it a top priority.
- 2Bug Fix Commits: At the forefront, bug fix commits makeup approximately 17% of the total commits. It shows the team's hard work to keep the software running smoothly.
- 3Pull Request (PR): Making up about 11% of all the changes. PRs show us that the team is always working on improving the project. The number of PRs also tells us that the team is actively working in collaboration.
- 4Merge Commits: With a contribution of approximately 4% of all commits. These commits help the team ensure that various changes work together seamlessly. And they result in reliable and well-integrated code.
- 5Test Commits: These makeup about 2% of all commits. The team uses them to check if everything in the software works right. This helps the team ensure they're giving out good, reliable code.
Key Takeaway: The analysis of Git commits reveals the top commit types. They are development, bug fixes, PR, merge, and test-related commits. It reflects the priorities in software development.
A Strong Emphasis on Feature Development and Bug Fixes
Mostly, developers use Git to add new features or fix bugs in their projects.
Our study proves this - about 79% of all Git activity is just for this. (Either for adding new features or fixing bugs)
Git commits activity is mostly for adding features and fixing bugs.
This makes sense. When people work on their projects, they want to keep making them better. That's why most Git commits are about improving features or getting rid of problems.
We found exciting things about what these commits are and how often they're done in our study. I will share more about this later.
For now, We need to acknowledge the vast git commits goes into feature development and bug fixes.
Key Takeaway: Feature development and bug fixes relate to 79% of Git commits.
Prioritizing Bug Fixes for Quality Software
The Proportional Bug-fixing Git commit rate is high at 17%.
Our study shows that more than a quarter of the changes programmers make in Git are to fix bugs. This is how programmers make their software better and more reliable.
Bug fixes get priority.
We looked at different commit types. These included adding new features, bug fixing, and refactoring, among others. Our analysis found that bug fixes made up 17% of the changes.
Why is the bug-fixing commit rate so high?
The reason programmers fix bugs often is to make the software work better and last longer. It helps overall enhance the product's stability and reliability.
This is crucial for the software's usability and longevity.
Bug-fixing tasks get a high priority in Git commits.
Key Takeaway: About 17% of changes in Git are for fixing bugs. This is a significant portion. It tells us that making the software better and more reliable is really important.
Fewer Bugs with Positive Sentiment in Commit Messages
We found a clear trend in how the sentiment in commit messages relates to the likelihood of fixing bugs. Commit messages with a positive tone decrease the likelihood of linking to a bug fix by 93%.
Upon studying the correlation between the sentiment of commit messages and the rate of bug fixes, a pattern emerged. It revealed that a higher positivity in commit messages led to a lower rate of associated bug fixes.
It implies developers have more fun when they work on adding new features to a project. They find joy in creating and implementing new functionalities that enhance the user experience.
If you're a developer working on a solo project, the sentiment of your commit messages might not seem of high importance. But, for larger development teams, understanding the sentiments in commit messages becomes critical.
For instance, If you're working on a complex coding project and see a positive sentiment in the commit message, it probably doesn't fix a bug. Instead, it could relate to a new feature implementation, a code improvement, or a performance boost.
Key Takeaway: A 93% lower likelihood of associating a commit with a bug fix correlates with positive sentiment in commit messages. This suggests the developers are more positive when not dealing with bugs in their commit messages.
Successful Repositories: Collaborative, High-Quality Code
We performed an in-depth study across a range of popular repositories.
And specifically examined:
And found that successful repositories tend to have a high frequency of commits. It indicates a continuous and iterative development process.
It is noteworthy that repos with more pull requests and merge commits showed a 9% increase in test-related commits.
Collaborating more boosts code quality by linking test-related commits to success.
Our study also indicates that repos with less collaboration tend to have lower code quality.
Less thorough review processes might cause this. Or the lack of diverse perspectives during development.
For example, consider this pull request that had a long waiting period. The long waiting period for a PR could negatively impact the overall code quality.
Here is a Git tip: Communicate, break down PRs, address feedback promptly for smoother reviews and better code.
Key Takeaway: The highly successful repositories have a pattern of frequent collaboration. They excel in commit frequency and code quality. Higher pull request and merge commit rates correlate with a 9% increase in test-related commits. It is a crucial indicator of code quality.
Positive Sentiment in Git Commits Sits at 57% (on average)
Our analysis unveiled that most Git commits come with a positive sentiment, hitting an average of 57%.
This sentiment evaluation categorizes Git commits into two broad buckets: positive and negative.
Positive sentiment Git commits have the upper hand.
Why are most Git commits positive?
The logical explanation lies in the nature of development work itself. Developers are problem-solvers, and each commit often represents a solved problem or a step closer to a solution.
For instance, a simple commit message like: "Fixed bug in the login feature" is a testament to a problem identified and resolved. And here is a negative sentiment commit: "Stuck with the recursive function."
This commit message depicts a challenge or a roadblock in the development process.
However, our analysis does not disregard the existence of negative sentiment commits. Challenges and roadblocks are part of the development process.
The majority of Git commits show positive sentiment. This suggests that developers generally have a positive outlook on their work.
Key Takeaway: The majority of Git commits show positive sentiment, with an average of 57%. This suggests that developers generally have a positive view of the development process. And they feel good about their work.
Committing Code Frequently, Promotes Positive Sentiment
Our study highlighted an interesting correlation. Frequent code contributors maintain higher average sentiment scores.
Explicitly, developers who commit code more than twice a day have an average sentiment score 1.7% higher than their less active peers.
Why would frequent commit activity correlate with higher sentiment scores?
Here is a plausible explanation. Frequent commits show that developers stay engaged and happy with their work. They're involved and passionate about their project and work.
For instance, consider a developer with sparse commit activity. Their infrequent commits could be a sign of detachment or lack of engagement. And it may reflect in their sentiment score.
It's important to remember that just because they commit more doesn't mean it directly causes them to feel more positive. There could also be involvement of other things.
Key Takeaway: Frequent committers tend to have more positive sentiment scores. Developers who commit code more than twice a day had an average sentiment score of 1.7% higher than less frequent contributors. It suggests a correlation between activity level and positive sentiment.
Pull Requests Encourage More Concise and Focused Communication
We aimed to explore the correlation between the use of pull requests and the nature of the commit message.
Our data unveiled that comments on pull requests were, in fact, 43% shorter compared to comments on commits without pull requests. This highlights the impact of structured communication dynamics in software collaboration.
Direct commits might sometimes trigger longer discussions, for instance.
Here's an example of longer commit.
In contrast, the pull request encourages more pointed and brief exchanges. Here's an example
This can save time and mental resources, allowing developers to focus more on the task at hand.
Key Takeaway: Pull requests lead to 43% shorter comments. It encourages more concise and focused communication.
More Detailed Comments With Bug-Fixing Commits
We found an intriguing trend while analyzing the bug-fixing and pull request commits. Developers often write longer commit messages when fixing bugs.
Our data revealed that bug-fixing comments were approximately 22% longer on average. It indicates a significant level of detail provided when developers address bugs.
A generic or ambiguous description may not provide enough context or explanation. And it could be one of the reasons why developers often take extra care to document their steps while fixing bugs.
This depth of information facilitates both the code review process and any future reference.
Key Takeaway: Bug-fixing commits have comments that are, on average, 22% longer. It indicates developers' tendency to provide detailed descriptions when dealing with bugs.
Wednesday Stand Out for Commits
We try to find out how the commit activities varied based on the day of the week. Our data analysis revealed that Wednesday had a distinct advantage over the other five days of the week. Friday shows the lowest commit activity.
Here is the finding of comparing the “best” day (Wednesday) to the “worst” day (Friday).
Wednesday had 19% Git commits. And Friday had 14% commits. (For most active repos)
We also compared commit rates for weekdays versus weekends.
We found that developers make more commits from Monday to Thursday compared to commits made on Friday, Saturday, or Sunday.
Git commits made on weekdays have significantly more activity than those made on weekends.
Key Takeaway: Software commits made on Wednesday, and Tuesday see more activity than any other day of the week. However, most smaller-scale projects don’t need to structure their commit patterns based on the day of the week.
Low Proportion of Documentation-Related Commits
We also check how much time coders spend on making documents compared to their total work time.
What we found is quite enlightening - a mere 0.69% of total commits pertain to documentation. This shows that developers are more focused on writing code and fixing bugs.
It's important to understand that this isn't indicative of negligence towards documentation. This relatively low percentage can also point toward efficient documentation practices. Effective and concise documentation strategies may reduce the required time.
As with any process, there can be a point of diminishing returns. Software developers' focus is more on coding instead of documentation-related commits. As it directly helps improve software development.
Key Takeaway: The small number of commits linked to documentation. It shows that developers spend most of their time on coding.
Git Blame Improves Code Understanding & Reduces Change Volume
Our study also aimed to determine the effects of utilizing Git Blame on the volume of code changes. This can show how well a coder understands the code and their level of responsibility.
Our study found an interesting pattern. When developers used a tool called Git Blame, they made 24% fewer changes to the code. This shows that Git Blame helps developers understand the code better. Also, it helps to make more careful changes.
Without using the Git Blame ( Its is a git command), developers might change more code than needed because they don't fully understand the original code's reason or use.
The usage of Git Blame tends to coincide with smaller, more focused code changes. Here is an example:
Key Takeaway: Utilizing Git Blame leads to 24% fewer lines added in code changes.
Refactoring: A Common Practice Across Repositories
We explored code development practices in many repositories. And we found that refactoring is common.
Refactoring means making minor changes to the code. And this minor change will make it better without changing its behavior. It's like tidying up your room without moving the furniture.
We discovered that developers dedicate an average of 1.02% of all commits to refactoring. It shows ongoing efforts to improve and simplify the code's structure.
Refactoring is helpful, but doing too much can waste time and resources. The key is to refactor and improve code where it's needed without going overboard.
We have not found in a successful repo more than 3% refactoring commits. Going over this 3% benchmark could mean spending too much time cleaning up.
Key Takeaway: Refactoring is a common practice, with at least 1.02% of commits dedicated to it. It shows the ongoing effort to improve and simplify the code's structure.
I learned a lot about git best practices and industry standards from this study, and I hope you did too.
Now, it's your turn: What's the most important thing you learned from this study?
Do you have any questions?
Please feel free to leave a quick comment below