When I consulted for companies on improving their source code (refactoring it, improving performance, detecting design bottlenecks, finding problematic files, etc.), I needed a way to quickly analyze a code base. However, there were not many tools available that gave me quick insight into code. Commercial tools like Coverity, Klocwork, Lattix, etc. are expensive. Before I could use one, I had to convince my client to license it, and that was difficult. Hence, about 2 years back, I wrote a few Python scripts to help me quickly analyze a codebase. Later I open sourced these Python scripts as 'TCToolkit'.
Recently I have significantly refactored and updated these scripts and added some new ones. I have also moved the TCToolkit code to Bitbucket (https://bitbucket.org/nitinbhide/tctoolkit).
Important updates are listed below:
- Improved the performance of CDD (Code Duplication Detector). On my Dell laptop, the Subversion C code base (around 450 files) can now be analyzed for duplication in about 90 seconds.
- I now use the d3js library for generating the visualizations. Token Tag Cloud (TCC) uses d3js to generate the tag cloud, and CDD uses d3js to display the 'duplication matrix'.
- A new script 'CCOM' (Class Co-occurrence matrix) is added. This script analyzes the code base and finds out which classes are used together. It displays this information in matrix form.
For example, if class A has class B as a member variable, or a member function of class A takes class B as a parameter, then classes A and B are treated as 'co-occurring'. If a function takes two parameters that are objects of class B and class C, then classes B and C are treated as 'co-occurring'. If classes are co-occurring, chances are that there is some dependency between their functionality, and hence changes in one MAY impact the other.
- smjstreemap.py: This script generates a treemap visualization from the metrics produced by SourceMonitor, an excellent freeware code metrics tool. It also uses d3js for displaying the treemap.
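To make the co-occurrence idea behind CCOM a bit more concrete, here is a minimal sketch in Python. It is not the actual CCOM code. It assumes the class names have already been extracted from the source and grouped by the scope in which they appear together (a class with its member types, or the parameter types of one function). The names `cooccurrence_counts`, `as_matrix` and the hand-written `scopes` list are all hypothetical and exist only for illustration.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(scopes):
    """Count how often each unordered pair of class names shares a scope."""
    counts = Counter()
    for names in scopes:
        # sorted(set(...)) drops repeats and gives every pair a stable order
        for pair in combinations(sorted(set(names)), 2):
            counts[pair] += 1
    return counts


def as_matrix(counts):
    """Return (labels, rows) for a symmetric co-occurrence matrix."""
    labels = sorted({name for pair in counts for name in pair})
    index = {name: i for i, name in enumerate(labels)}
    rows = [[0] * len(labels) for _ in labels]
    for (a, b), n in counts.items():
        rows[index[a]][index[b]] = n
        rows[index[b]][index[a]] = n
    return labels, rows


# Hypothetical, hand-written input; a real tool would get this by parsing code.
scopes = [
    ["A", "B"],  # class A has a member variable of class B
    ["A", "B"],  # a member function of class A takes a B parameter
    ["B", "C"],  # a function takes parameters of class B and class C
]

counts = cooccurrence_counts(scopes)
labels, rows = as_matrix(counts)

# Print the matrix, one row per class
for label, row in zip(labels, rows):
    print(label, row)
```

In this sketch a higher count simply means the pair was seen together in more scopes. Classes A and C never share a scope, so their cell stays at zero.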