My work on code analytics started 10 years ago, and the CodeScene analysis tool has been my main focus for the past 4 years.
CodeScene is the first real product built around the concept of behavioral code analysis, which is a radical departure from traditional static analysis techniques.
Over the past years, I have spoken at a ton of conferences, written two books, and published several articles and the occasional research paper on behavioral code analysis. I’ve done my best to popularize the field.
At times, this has been a lonely journey, so it’s great that more people and more companies are joining this community.
The most recent addition is GitLab, which is now entering the behavioral code analysis space. This marks a milestone for the field, and a validation for me personally that what I have been claiming for years makes sense to outsiders as well: behavioral code analysis is as close as we get to a silver bullet for making sense of large-scale codebases. The level of insight and the speed with which we get it continue to fascinate me. Behavioral code analysis also has a clear advantage over silver bullets: it’s real.
GitLab’s entry also provides validation for the CodeScene tool. We never took any venture capital, but decided to build a great product first to prove the value and business model (read the full startup story here). Since then, CodeScene has grown into a tool suite that’s used by organizations around the world; thousands of people are using CodeScene in their daily work on large-scale codebases.
However, to serve a growing community, we need to agree on a common vocabulary that clarifies the concepts. Let me explain.
Behavioral code analysis identifies patterns in how a development organization interacts with the codebase they are building. That is, while the properties of the code are important, there’s even more value in learning how the code got that way and where it is trending. This information is used to prioritize technical debt, detect implicit dependencies that are invisible in the code itself, measure organizational factors like knowledge gaps, and support on- and off-boarding.
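To make the idea concrete, here is a minimal sketch of the kind of raw data such an analysis starts from. It is not how CodeScene works internally; it only assumes that version-control history has been exported as text (for example with something like `git log --name-only --format=--%h`, where the `--` prefix on commit lines is a convention chosen for this example). The function names and the sample log are hypothetical.

```python
from collections import Counter
from itertools import combinations


def parse_commits(log_text):
    """Return one set of changed file paths per commit.

    Assumes each commit starts with a line beginning with "--",
    followed by the paths that commit touched (an illustrative format).
    """
    commits = []
    for line in log_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("--"):
            commits.append(set())
        elif commits:
            commits[-1].add(line)
    # Drop commits that touched no files (e.g. empty merge commits).
    return [files for files in commits if files]


def change_frequency(commits):
    """Count how many commits touched each file."""
    return Counter(path for files in commits for path in files)


def co_changes(commits):
    """Count how often each pair of files changed in the same commit."""
    pairs = Counter()
    for files in commits:
        for a, b in combinations(sorted(files), 2):
            pairs[(a, b)] += 1
    return pairs


# Hypothetical history, made up for illustration only.
SAMPLE_LOG = """
--a1b2c3
src/engine.py
src/engine_test.py

--d4e5f6
src/engine.py
src/engine_test.py
docs/readme.md

--0718aa
src/engine.py
"""

if __name__ == "__main__":
    commits = parse_commits(SAMPLE_LOG)
    print(change_frequency(commits).most_common(3))
    print(co_changes(commits).most_common(3))
```

Files that change often are candidates when prioritizing technical debt, and pairs of files that keep changing together hint at a dependency even when nothing in the code links them. A real analysis would add time windows, thresholds, and author information on top of these raw counts.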
There have been tools in this space before CodeScene, like delta-flora by Michael Feathers, my own open source tools, as well as research tools like the Evolution Radar. All of these have influenced CodeScene.
From its inception, CodeScene has built on research and has been the topic of research itself, such as comparisons to static analysis.
We often joke that naming is one of the hardest problems in software. Those jokes are funny because they are true. Naming is hard. The naming problem is there for a product in an evolving field too. I know, since I have made my fair share of mistakes.
I’m responsible for most of the names and concepts that you find in CodeScene. Some names are new to CodeScene, others are lifted from my books or academic research papers. What follows is an example of how ill-chosen names cause unnecessary confusion:
In my previous writings, and occasionally in the tooling, you may come across the term temporal coupling instead of change coupling.
This is unfortunate since it overloads the term. The fault is all mine; I chose the name temporal coupling, unaware that it had a previous use, to emphasize the notion of co-change in time. In its original use, temporal coupling refers to dependencies in call order between different functions. For example, always invoke function Init before calling the AccelerateToHyperspeed method or bad things will happen. This kind of temporal coupling is a code smell and is discussed in The Pragmatic Programmer: From Journeyman to Master [Ht00].
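For readers who haven't met the original meaning of the term, a small sketch may help. The `Ship` class below is hypothetical and only mirrors the Init and AccelerateToHyperspeed example, with the method names adapted to Python conventions. Here the "bad things" are made explicit as an exception; in real code the failure is often far less obvious.

```python
class Ship:
    """Hypothetical class whose methods must be called in a fixed order."""

    def __init__(self):
        self._initialized = False

    def init(self):
        # Prepares the state that accelerate_to_hyperspeed() relies on.
        self._initialized = True

    def accelerate_to_hyperspeed(self):
        # Temporal coupling in the original sense: this call is only
        # valid after init(), yet nothing in the signature says so.
        if not self._initialized:
            raise RuntimeError("call init() before accelerate_to_hyperspeed()")
        return "hyperspeed"


if __name__ == "__main__":
    ship = Ship()
    ship.init()
    print(ship.accelerate_to_hyperspeed())
```

This coupling lives in the call order at runtime. Change coupling, in contrast, is only visible in the version-control history: it describes files that tend to change together over time.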
My initial thinking is continuously evolving as I learn more by working with others in the community. I will continue to share those learnings. After all, behavioral code analysis is a young discipline, a new generation of code analysis, and leading the way means educating new users and encouraging them to explore the space. When it comes to exploration and learning, a clear and consistent vocabulary is paramount. Join in and welcome to CodeScene!
CodeScene in action: within minutes, the analyses let you build a mental model of a previously unfamiliar codebase.