Behavioral Code Analysis. Find the code complexity that matters.

Written by Adam Tornhill | Nov 12, 2019 5:00:00 PM

The main difference between CodeScene’s behavioral code analysis and traditional code scanning techniques is that static analysis works on a snapshot of the codebase while CodeScene considers the temporal dimension and evolution of the whole system.

This makes it possible for CodeScene to prioritize technical debt and code quality issues based on how the organization actually works with the code. Hence, we can limit the results to information that is relevant, actionable, and translates directly into business value.

CodeScene also goes beyond code as we consider the organization and people side of the system. This gives you valuable information that is invisible in the source code itself, such as measures of team autonomy and off-boarding risks.

This article explores how this is possible, and we also look at some independent academic research to find out how well it works in practice; with code quality it’s your time and money on the line, so let's invest wisely in our tooling.

How is CodeScene different from Static Analysis?

A traditional static code analysis tool focuses on a snapshot of the code as it looks right now. Such tooling is valuable in that it might find code that is overly complex, has heavy dependencies on other parts, or contains error-prone constructs. It’s genuinely useful and I use static code analysis myself – it’s a valuable practice.

However, a static analysis will never be able to tell you if that excess code complexity actually matters – just because a piece of code is complex doesn’t mean it’s a problem. This is where CodeScene’s behavioral code analysis fills an important gap.

CodeScene identifies and prioritizes technical debt based on how the organization works with the code. That is, CodeScene puts technical metrics into a business context. It’s a game changer as developers and managers can now prioritize technical debt using a shared model.

CodeScene analyses multiple data sources to provide its analytics.

The analysis is completely automated. CodeScene prioritizes a small part of your codebase – typically 1-4% – that points to the most likely return on any code quality investments:

CodeScene puts technical metrics into a business context and makes them actionable with short feedback loops for the development organization.
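To make the idea of prioritizing by developer behavior concrete, here is a minimal sketch of one way to weight a complexity measure by how often the code actually changes. This is an illustration under stated assumptions, not CodeScene's actual algorithm: the file names, the use of lines of code as a complexity proxy, the product-based score, and the `rank_hotspots` function are all hypothetical.

```python
from collections import Counter

# Hypothetical input: one entry per (commit, file) pair, for example
# parsed from version-control history. All names and numbers are made up.
changes = [
    "core/engine.py", "core/engine.py", "core/engine.py", "core/engine.py",
    "api/routes.py", "api/routes.py",
    "util/legacy_parser.py",
]

# Assumption: lines of code stand in as a crude complexity proxy.
lines_of_code = {
    "core/engine.py": 1200,
    "api/routes.py": 300,
    "util/legacy_parser.py": 2500,
}


def rank_hotspots(changes, lines_of_code):
    """Rank files by (number of changes * size), highest score first."""
    revisions = Counter(changes)
    scores = {
        path: revisions[path] * lines_of_code.get(path, 0)
        for path in revisions
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


hotspots = rank_hotspots(changes, lines_of_code)
for path, score in hotspots:
    print(f"{path}: {score}")
```

In this toy data set, util/legacy_parser.py is the largest file, yet it ranks below core/engine.py because it is rarely touched. That is the gap a snapshot-only view cannot see: complexity in code nobody works with costs less than moderate complexity in code that changes every week.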

CodeScene compared to Static Analysis (SonarQube)

CodeScene was created to fill a gap: whereas static analysis tools are good at catching coding mistakes and providing detailed feedback to programmers, the same techniques don’t work particularly well for prioritizing technical debt. This is reported in a recent research paper from the University of Ottawa, which concludes that:

“in reality, acting upon all the TD instances is not worthy” (Parthiban, D.G. Examination of tools for managing different dimensions of Technical Debt, 2019).

There’s simply too much technical debt, and the business value from fixing it isn’t clear. But the paper continues:

“There are tools like CodeScene which helps in prioritizing the refactoring targets. It prioritizes TD instances based on their technical debt interest rate”, which is exactly our claim above.

So CodeScene works well in practice for prioritizing technical debt. But what about the impact of its reported issues? Additional research, this time from the University of Victoria’s code quality study, compared CodeScene to SonarQube – a market-leading static analysis tool – and verified the results by human inspection:

“Problems detected by the static analysis tool were likely small issues which would result in little reward if fixed.”
The study also claims that “Next, we ran CodeScene on Bokeh [a codebase], which lead to more significant results.”
The case study concludes that “We found CodeScene to be more useful [..] as it provided us with a higher level view of problems and potential issues. Using CodeScene also shed a light on issues that were not apparent while previously examining the source code.”

Both of these studies also looked at people-factors. The people side of code is at the core of behavioral code analysis and something you just cannot measure in a static analysis.

CodeScene is more than just a technical analysis: detect team coordination bottlenecks and organizational dependencies.

This is important because the code itself is only one component of a software system. A behavioral code analysis fills in the blanks. Since behavioral code analysis builds on social data – CodeScene knows exactly which programmer wrote each piece of code – it’s possible to build up knowledge maps of a codebase and aggregate these on the organizational level. This lets you answer questions like:

System Mastery: Does the current team have enough experience with the codebase or do we have any knowledge gaps?
Off-Boarding Risks: What happens if a core developer leaves, or a team is moved to work on a different product? Where should we focus the hand-over and on-boarding?
Coordination bottlenecks: Are there any parts of the software architecture where multiple teams have to coordinate their work? Such modules frequently lead to waste via merge conflicts and tend to be defect dense.
Team Coupling: CodeScene’s Change Coupling analysis lets you evaluate how well a software architecture aligns with the organizational structure of the teams that build the system. What do the dependencies between teams look like? How can they be streamlined?

Note that none of this information is available from the code alone.
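The questions above can be approximated from authorship data alone. The following is a rough sketch, assuming commit records of the form (author, file, lines added) and a known author-to-team mapping. The data, the helper names `knowledge_map`, `main_author`, and `teams_per_file`, and the use of added lines as a measure of knowledge are all assumptions for illustration, not a description of how CodeScene computes its knowledge maps.

```python
from collections import defaultdict

# Hypothetical commit records: (author, file, lines_added).
commits = [
    ("alice", "billing/invoice.py", 120),
    ("alice", "billing/invoice.py", 40),
    ("bob", "billing/invoice.py", 10),
    ("bob", "shared/auth.py", 60),
    ("carol", "shared/auth.py", 55),
    ("alice", "shared/auth.py", 30),
]

# Hypothetical mapping from author to team.
teams = {"alice": "payments", "bob": "platform", "carol": "identity"}


def knowledge_map(commits):
    """Return {file: {author: share of all added lines in that file}}."""
    added = defaultdict(lambda: defaultdict(int))
    for author, path, lines in commits:
        added[path][author] += lines
    return {
        path: {a: n / sum(by_author.values()) for a, n in by_author.items()}
        for path, by_author in added.items()
    }


def main_author(shares):
    """Author with the largest share of contributions to one file."""
    return max(shares, key=shares.get)


def teams_per_file(commits, teams):
    """Return {file: set of teams that have changed it}."""
    touched = defaultdict(set)
    for author, path, _ in commits:
        touched[path].add(teams[author])
    return dict(touched)


kmap = knowledge_map(commits)
team_map = teams_per_file(commits, teams)

for path, shares in kmap.items():
    owner = main_author(shares)
    print(f"{path}: main author {owner} ({shares[owner]:.0%}), "
          f"{len(team_map[path])} team(s) involved")
```

Read against the list above: a file where a single author holds most of the contributions (billing/invoice.py here) hints at an off-boarding risk if that person leaves, while a file changed by several teams (shared/auth.py here) is a candidate coordination bottleneck.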

CodeScene: a Communication Tool

CodeScene’s behavioral code analyses go beyond the scope of a static analysis tool. CodeScene focuses on code health trends, socio-technical factors like key personnel dependencies and inter-team coordination, but most importantly CodeScene puts its findings into context. This limits the information to what’s relevant and actionable; in CodeScene, you will never ever see 5,000 critical warnings.

These priorities are key. All larger codebases have their fair share of quality issues. The thing is that an organization can – and often does – live with a certain amount of technical debt. CodeScene guides you by highlighting the parts where any technical debt is expensive, and also points out the code quality issues that just require attention but don’t necessarily have to be fixed.

Most importantly, we view CodeScene as a communication tool. Our sweet spot is that we serve the whole engineering organization: not just managers or just developers, but both. Communication is critical to successful software development. You might still find static analysis useful – we do – but it works best as a low-level feedback loop during development. It’s not something you can use to prioritize technical debt or reason about the efficiency of the team structure.

Explore More and try CodeScene

Check out our white paper to learn more about CodeScene, its use cases, and how they fit into your existing workflow and roles.

CodeScene is available as an on-premise version and as a hosted CodeScene SaaS.
