November 29, 2009 19:58
Doesn't it look like Christmas has already started? Recently I had the pleasure of being contacted by Santa Claus Patrick Smacchia, lead developer of NDepend, who offered me a free Pro license.
(In case you don't know: NDepend is an awesome static code analysis tool that measures the quality of .NET apps in a bunch of ways, including code metrics and a bewilderingly abstruse, yet amazing, visualization approach.)
Woo-hoo! I remember playing with a trial version last year, which was kind of nice, but a free Pro is a free Pro. The offer didn't require a blog post in return, but very soon I realized that I couldn't help but post... because I'd picked ReSharper as a guinea pig.
Comparison.
I selected the R# 4.5 libraries and compared them to the dlls of an R# 5.0 nightly build, to see what has changed in the new version.
It took NDepend 10 minutes to analyse everything and return with a lot of results, with some general info displayed here on the left.
Assemblies that have been changed are underlined, the new ones are in bold, and assemblies that no longer exist are struck through.
I had never wondered how many lines of code R# has, and was pretty impressed to see it had half a million (!) of them back in 4.5 and has now grown by ~60K more. And the number of assemblies increased by almost 60%.
So, as everyone probably expected anyway, the new R# is bigger. (And, of course, better, but that's a subject for another post.)
Surely a lot of existing code was affected as well. And as someone who writes plugins for ReSharper, my only concern is the API. I don't care about private types and methods, but anything that was public in 4.5 and got removed or hidden in 5.0 could potentially break the clients.
So I asked NDepend: what was changed in the APIs? Did they really remove those public interfaces named... wait... something with "Handler" at the end? Aye, said NDepend. Beware!
Of the many things you can do when comparing two code bases, breaking change analysis is (almost always) a must. NDepend provides "main" queries to test methods/parameters/interfaces, and there are more fine-grained options to find, say, public methods that became obsolete, or assemblies where comments were changed.
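At its core, a breaking change check boils down to a set difference: whatever was public in the old version and is no longer public in the new one. Here is a minimal Python sketch of that idea. It is not how NDepend does it internally, and the member names are made up purely for illustration.

```python
def breaking_changes(old_api, new_api):
    """Return public members present in the old version but missing in the new one."""
    return sorted(set(old_api) - set(new_api))


# Hypothetical public members of two versions of some library.
old_public = {"Acme.IFooHandler", "Acme.Bar.Run()", "Acme.Baz"}
new_public = {"Acme.Bar.Run()", "Acme.Baz", "Acme.Qux"}

# Members that clients compiled against the old version may no longer find.
print(breaking_changes(old_public, new_public))
```

Additions such as the made-up Acme.Qux don't show up, since new public members cannot break existing clients.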
Improvements.
But wait. Comparing codebases is cool, but improving the new stuff is cooler!
From an architectural point of view, the first things to look at are coupling and cyclomatic complexity. Here's a quote from NDepend recommendations: "Methods where IL cyclomatic complexity is higher than 40 are extremely complex and should be split in smaller methods (except if they are automatically generated by a tool)".
And a query for the most complex methods (ILCC > 40) in the R# 5.0 codebase gives this picture:

As you can see, all complex methods are in Psi namespaces where they build PSI trees representing the code, and this probably reflects the idea that complex business requirements cannot really get translated into simple programming solutions.
But let's look at another metric: lines of code in a method. To count them, NDepend reads .pdb files and computes language-independent "logical" lines of code, not affected by coding styles. Again, their recommendation: "Methods where NbLinesOfCode is higher than 20 are hard to understand and maintain."
Let's double this up to a sanity maximum of 40 logical LOC; anything above that absolutely should be split up. (I understand that such tasks are usually priority 27, but hey!) So let's look at how R# fits that. Assuming that PSI-related code would champion here as well, we run a query that cuts it off from the results:
SELECT METHODS OUT OF NAMESPACES "JetBrains.ReSharper.Psi.*" WHERE NbLinesOfCode > 40
...and here's the result (585 methods):

NDepend as design guideliner.
Reading predefined CQL queries in NDepend is like reading a good design guidelines book. Really. There's a cornucopia of advice on code quality, design, naming conventions, usage of the .NET framework, and more. I'd just like to point out two items.
Look at this one: "The .NET Framework class library provides methods for retrieving custom attributes. By default, these methods search the attribute inheritance hierarchy; for example System.Attribute.GetCustomAttribute searches for the specified attribute type, or any attribute type that extends the specified attribute type. Sealing the attribute eliminates the search through the inheritance hierarchy, and can improve performance."
So if you can mark attributes sealed, this will improve performance. Time for a new query?
SELECT TYPES WHERE IsAttributeClass AND !IsSealed AND !IsAbstract AND IsPublic AND !IsInFrameworkAssembly
...and we can see that 137 R# attributes out of 255 are not sealed. I went to Reflector to spot-check a few of them and see if they have any derived attributes (apparently there's no "CanBeSealed" keyword in CQL, alas) and didn't find any. Looks like room for improvement?
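The missing "CanBeSealed" check is easy enough to describe: an attribute class is a candidate for sealing if it is neither sealed nor abstract and nothing in the analysed code derives from it. Below is a rough Python sketch over a hand-made type list. The TypeInfo record and all type names are hypothetical, and this is not something NDepend or CQL provides.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TypeInfo:
    """Hypothetical, minimal description of a type taken from some analysis dump."""
    name: str
    base: Optional[str]
    is_sealed: bool = False
    is_abstract: bool = False


def sealable_attributes(types, root="System.Attribute"):
    """Names of non-sealed, non-abstract attribute types that nothing derives from."""
    by_name = {t.name: t for t in types}
    used_as_base = {t.base for t in types if t.base}

    def derives_from_root(t):
        seen = set()  # guards against malformed, cyclic input
        while t.base and t.base not in seen:
            if t.base == root:
                return True
            seen.add(t.base)
            t = by_name.get(t.base)
            if t is None:  # base type lives outside the analysed set
                return False
        return False

    return sorted(
        t.name
        for t in types
        if derives_from_root(t)
        and not t.is_sealed
        and not t.is_abstract
        and t.name not in used_as_base
    )


types = [
    TypeInfo("Demo.MarkerAttribute", "System.Attribute"),
    TypeInfo("Demo.BaseRuleAttribute", "System.Attribute"),
    TypeInfo("Demo.StrictRuleAttribute", "Demo.BaseRuleAttribute"),
    TypeInfo("Demo.DoneAttribute", "System.Attribute", is_sealed=True),
    TypeInfo("Demo.Helper", "System.Object"),
]
print(sealable_attributes(types))
```

One caveat: the check only sees the types it is given. A public attribute may still be derived from in someone else's assembly (a plugin, for instance), so sealing it would be a breaking change for that code.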
And another one is boxing/unboxing analysis. It has been my favorite for quite a long time, since it is quite easy to do by searching for "box" in plain IL, but with NDepend the process gets nicer:
More than 4000 methods found. Right click offers a "Go To Reflector" option where you can see how exactly this boxing/unboxing occurs and whether it's possible to avoid it:
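For comparison, the low-tech "search for box in plain IL" approach could look something like the Python sketch below. It assumes a text dump in the usual disassembler style, where each instruction line starts with an IL_xxxx: label, and it only counts the box, unbox and unbox.any opcodes. The sample IL is hand-written for the example.

```python
import re
from collections import Counter

# Matches instruction lines such as "IL_0002:  box  [mscorlib]System.Int32".
BOXING = re.compile(r"^\s*IL_[0-9a-fA-F]+:\s+(box|unbox\.any|unbox)\b")


def count_boxing(il_text):
    """Count boxing-related opcodes in a textual IL dump."""
    counts = Counter()
    for line in il_text.splitlines():
        match = BOXING.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hand-written sample: boxes an int, then unboxes it again.
sample = """
  IL_0000:  ldc.i4.s   42
  IL_0002:  box        [mscorlib]System.Int32
  IL_0007:  stloc.0
  IL_0008:  ldloc.0
  IL_0009:  unbox.any  [mscorlib]System.Int32
  IL_000e:  ret
"""
print(dict(count_boxing(sample)))
```

This tells you that boxing happens and how often, but not which method it happens in or why, which is exactly where the query plus Reflector combination is more pleasant.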

Disclaimer.
Well, I have just re-read the post and realised that it might look like I'm saying R# is badly written. Definitely not. R# is a great product and it was a huge challenge to implement it.
And surely, any software has its room for improvement.
I'm just amazed at how easily you can spot these potential improvements with NDepend.