Announcement

Showing posts with label svnplot. Show all posts

Sunday, September 25, 2016

SVNPlot Version 0.9.0 Released

Today I am releasing SVNPlot Version 0.9.0.


SVNPlot now works on Python 2.7.x and on Python > 3.5.x. The current release also contains many small bug fixes, especially related to Unicode handling.



You can download the installers from Bitbucket Download Page.

Wednesday, October 30, 2013

Simple framework to assess potential risks in a Software Project

If you are on a software project, how do you assess its potential risks? So far I have not seen any coherent way of assessing them. Usually problems are discovered very late in the project life cycle (i.e. just before release dates), and by that time it is too late to take any corrective action. So a common problem is how to detect possible risks as early in the project life cycle as possible.

However, how do you define the 'success' of a project?
  1. The project is delivered to the customer.
  2. Your company got the expected profit margin from the project.
  3. The customer accepted the delivery.
  4. The customer's end users are happy with the delivery.
  5. The number of bugs reported is low, and hence your warranty costs are low.
Ideally a 'successful' project should include all of the above. However, many times you achieve only a few items from this list. For example, the customer accepted the delivery and the end users are happy with the features, but a lot of bugs are reported and rework is high. Or the customer has requested new features, and implementing them requires a lot of changes in the code. Hence your costs are high and your profit margin is low. How do you assess these kinds of risks?

For the last few years, I have been working on various code analysis techniques (check my open source projects SVNPlot and TCToolkit). Based on my experience, I am convinced that analysis of code, design, version control history, etc. gives you a pretty good idea about the success or failure of a project.

Recently I created a simple framework to assess the possible risks.

First we analyze the project in three ways:
  • Code Vs Testing quadrant
  • Requirement Vs Testing quadrant
  • Design Vs Code quadrant
Map where your project falls in each case. The quadrants to which the project is mapped will tell you the possible risks for your project.




I find that if I mentally map the project to these quadrants based on various project metrics, I get a 'rough judgement' of the kinds of problems the project will have in the future.

What do you think ? 

Saturday, February 13, 2010

svnplot - one year later

About one year back (Dec 2008) I changed jobs. Between the two jobs I had some free time, and I wrote the first version of svnplot during those 4-5 days. Then I released it as an 'open source' project on Google Code. Soon many people started using it, and I started getting bug reports filed on the project page. To me, this was an indication that people are really using this project.
  1. I got bug reports from developers/scientists working in places like CERN and AMD.
  2. One of the bug reports mentioned that "I'm using SVNPlot on over 100 of my users' repositories"
  3. Svnplot was mentioned in discussions on StatSVN forums
  4. StatSVN developers added a tag cloud of commonly used words in commit messages, inspired by the similar feature in svnplot. So, as mentioned by Benoit, "it is now a bilateral inspiration", since initially the features in svnplot were inspired by the excellent StatSVN project.
Many people contributed bug fixes and improvements to svnplot.
  1. Chris Glasman added support for repository authentication.
  2. Oscar Castaneda developed and contributed code to convert SVN logs into output files that can be used in CMU's ORA and Apache Agora, as part of Google Summer of Code 2009 (GSoC09). You can read the details of his contribution here.
  3. kitpz2 contributed code for better pie-chart display of directory sizes.
I think the key advantage of svnplot is that it doesn't require a checked-out copy of the repository. Also, it is easy to hack.

So what's next ?
 
I am now working on the next version of svnplot (0.6). The key new feature will be graphs generated on the client side with JavaScript and HTML canvas. This will reduce the dependency on matplotlib and make svnplot easier for users to deploy. After checking a few JavaScript charting libraries like Flot and the jquery.Visualize plugin, I decided to use jqPlot. I am planning to release svnplot 0.6 in a few weeks' time.

Wednesday, January 28, 2009

Using Social Network Analysis with Version control data

As I mentioned in the last post, I am experimenting with using social network analysis (SNA) on version control data. Now, with the SVNPlot project, I have a way of converting Subversion logs into an SQLite database. It allows me to query the data in many different ways.

I used the Rietveld repository data and did some preliminary analysis. I am not an expert on SNA, but the initial results look very interesting and promising. You can see the results on my website.



Update: Oscar Castaneda has added SNA data extraction to SVNPlot as part of his GSoC 2010 project. He used these modifications to analyze Apache repositories and reported his findings at ApacheCon. Check the details at
  1. Life After Google Summer of Code by Oscar Castaneda
  2. Oscar's GSoC 2010 proposal 
  3. Details on how to use his contributions in SVNPlot to extract the data.

Sunday, January 18, 2009

Social Network Analysis and Version Control

Recently I came across the concept of Social Network Analysis.

Given below is a small introduction to Social Network Analysis from the Orgnet site:
Social network analysis [SNA] is the mapping and measuring of relationships and flows between people, groups, organizations, computers, web sites, and other information/knowledge processing entities. The nodes in the network are the people and groups while the links show relationships or flows between the nodes. SNA provides both a visual and a mathematical analysis of human relationships.
The concept originated in the social sciences (sociology, anthropology) to study relationships in communities. Today it is being used in fraud ring detection, identifying leaders in organizational networks, analyzing the resilience of computer networks, and in various other ways. The various case studies on the Orgnet site can give you a good idea of the possibilities.

I started thinking about applying SNA to version control history, with files and authors as nodes. There is some research going on in this area in universities. The references below have a few links. A Google search for "data mining version control" will give you additional links.

With SVNPlot, I now have a way of converting Subversion logs into an SQLite database. Also, Python has some excellent libraries for network analysis. I am using NetworkX for analysis and Matplotlib for visualization. I think such analysis will be useful in
  1. Identifying the key developers and their specific areas in the project.
  2. Identifying key files (files which are involved in code changes more frequently than others).
  3. Identifying clusters of related files (across directories and modules).
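To make the three items above a little more concrete, here is a minimal sketch in plain Python. It is not the actual analysis code: the change records are made up, the function names are hypothetical, and it uses only the standard library instead of NetworkX. It builds weighted author-file edges, counts how often each file changes, and counts how often two files change in the same revision.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Made-up change records: (revision, author, file path). In practice these
# would come from the SQLite database built from the Subversion log.
changes = [
    (1, "alice", "src/core.py"),
    (1, "alice", "src/util.py"),
    (2, "bob", "src/core.py"),
    (3, "alice", "src/core.py"),
    (3, "alice", "docs/readme.txt"),
    (4, "bob", "src/core.py"),
    (4, "bob", "src/util.py"),
]


def author_file_edges(records):
    """Weighted author-file edges: number of change records per (author, file)."""
    return Counter((author, path) for _rev, author, path in records)


def file_change_counts(records):
    """Number of distinct revisions in which each file was changed."""
    seen = {(rev, path) for rev, _author, path in records}
    return Counter(path for _rev, path in seen)


def cochange_counts(records):
    """Number of revisions in which each pair of files changed together."""
    files_by_rev = defaultdict(set)
    for rev, _author, path in records:
        files_by_rev[rev].add(path)
    pairs = Counter()
    for files in files_by_rev.values():
        for a, b in combinations(sorted(files), 2):
            pairs[(a, b)] += 1
    return pairs


edges = author_file_edges(changes)
key_file = file_change_counts(changes).most_common(1)[0]
pairs = cochange_counts(changes)

print(edges[("alice", "src/core.py")])        # alice touched core.py in 2 revisions
print(key_file)                               # most frequently changed file
print(pairs[("src/core.py", "src/util.py")])  # times these two changed together
```

The edge weights and co-change counts could then be loaded into a graph library such as NetworkX as weighted edges, where the usual centrality and clustering measures apply.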
I think the results will be useful to software development companies as well, especially for getting advance warning of problems and, on big projects, for identifying critical developers, planning the technology transfer when people move from one project to another, etc. I see many exciting possibilities.

The initial results are interesting. I will put up the charts/analysis etc on my site in a few days time.

References and Interesting Articles/Links
  1. Introduction to Social Network Analysis (from orgnet.com)
  2. Case studies of Social Network Analysis (from orgnet.com)
  3. Wikipedia page on Social Networks (Check the history of Social Network Analysis)
  4. Social Life of Routers (Computer networks as social networks)
  5. Finding Go-to People and Subject Matter Experts in Organization
  6. Predicting Defects using Network Analysis on Dependency Graphs – ICSE 2008
  7. Mining Software Archives (a special issue of IEEE magazine)

Wednesday, January 14, 2009

SVNPlot - my first opensource project

During the 1 week gap between the two jobs, I finally started an open source project. The project is called SVNPlot. It is inspired by the excellent StatSVN Subversion statistics generation package.

SVNPlot generates graphs similar to StatSVN. The difference is in how the graphs are generated. SVNPlot generates these graphs in two steps. First it converts the Subversion logs into a 'sqlite3' database. Then it uses SQL queries to extract the data from the database and uses the excellent Matplotlib plotting library to plot the graphs.

I believe using SQL queries to extract the necessary data gives great flexibility in data extraction. Also, since sqlite3 is quite fast, it is possible to generate these graphs on demand.
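As a rough illustration of the two-step idea, here is a small sketch using Python's built-in sqlite3 module. The table and column names (commits, revno, author, commitdate) and the sample rows are assumptions made up for this example; SVNPlot's real schema may differ. The first function stands in for the log conversion step, and the second shows how one SQL query returns data ready for plotting.

```python
import sqlite3

# Hypothetical schema for illustration only; not SVNPlot's actual schema.
SCHEMA = """
CREATE TABLE commits (
    revno INTEGER PRIMARY KEY,
    author TEXT,
    commitdate TEXT
);
"""


def load_sample_log(conn):
    """Step 1 (simulated): store a few made-up log entries in the database."""
    conn.executescript(SCHEMA)
    rows = [
        (1, "alice", "2009-01-02"),
        (2, "bob", "2009-01-03"),
        (3, "alice", "2009-01-05"),
        (4, "alice", "2009-02-01"),
    ]
    conn.executemany("INSERT INTO commits VALUES (?, ?, ?)", rows)


def commits_per_author(conn):
    """Step 2: extract plot-ready data with a single SQL query."""
    cur = conn.execute(
        "SELECT author, COUNT(*) FROM commits "
        "GROUP BY author ORDER BY COUNT(*) DESC, author"
    )
    return cur.fetchall()


conn = sqlite3.connect(":memory:")
load_sample_log(conn)
author_counts = commits_per_author(conn)
print(author_counts)  # [('alice', 3), ('bob', 1)]
```

The resulting list of (author, count) pairs could be passed to a Matplotlib bar chart. A different graph needs only a different query, not a new pass over the Subversion log.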

As a tribute to Python and its author, Guido van Rossum, I have generated the graphs for the Rietveld project. Check it out here.

SVNPlot is hosted on Google Code (http://code.google.com/p/svnplot/) and licensed under the New BSD license. For information on installation and usage, check the introduction page here.

I am using Python to implement SVNPlot. I am a novice at Python, hence any suggestions for improvement are welcome.