Sunday, January 18, 2009

Social Network Analysis and Version Control

Recently I came across the concept of Social Network Analysis.

Below is a short introduction to Social Network Analysis from the Orgnet site:
Social network analysis [SNA] is the mapping and measuring of relationships and flows between people, groups, organizations, computers, web sites, and other information/knowledge processing entities. The nodes in the network are the people and groups while the links show relationships or flows between the nodes. SNA provides both a visual and a mathematical analysis of human relationships.
The concept originated in the social sciences (sociology, anthropology) to study relationships within communities. Today it is used in fraud ring detection, identifying leaders in organizational networks, analyzing the resilience of computer networks, and in various other ways. The case studies on the Orgnet site give a good idea of the possibilities.

I started thinking about applying SNA to version control history, with files and authors as nodes. There is some research going on in this area in universities; the references below have a few links. A Google search for "data mining version control" will give you additional links.

With SVNPlot, I now have a way of converting Subversion logs into an SQLite database. Python also has some excellent libraries for network analysis. I am using NetworkX for analysis and Matplotlib for visualization. I think such analysis will be useful for:
  1. Identifying the key developers and their specific areas in the project.
  2. Identifying key files (files that are involved in code changes more frequently than others).
  3. Identifying clusters of related files (across directories and modules).
I think the results will be useful to software development companies as well, especially for getting advance warning of problems and, on big projects, for identifying critical developers, planning the technology transfer when people move from one project to another, etc. I see many exciting possibilities.
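To make the three ideas above concrete, here is a minimal sketch using only the Python standard library. It is not the actual analysis code: the table name `changes`, its columns, and the sample rows are all hypothetical (the real SVNPlot database layout may well differ), and the clustering shown is a plain connected-components pass over files that changed in the same revision, not necessarily what NetworkX would be used for.

```python
import sqlite3
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical sample data: (revision number, author, changed file path).
SAMPLE_CHANGES = [
    (1, "alice", "core/parser.py"),
    (1, "alice", "core/lexer.py"),
    (2, "bob", "ui/window.py"),
    (3, "alice", "core/parser.py"),
    (3, "alice", "docs/grammar.txt"),
    (4, "bob", "ui/window.py"),
    (4, "bob", "ui/dialog.py"),
    (5, "carol", "core/parser.py"),
]


def load_changes(conn):
    """Create an assumed 'changes' table, fill it, and read the rows back."""
    conn.execute("CREATE TABLE changes (revno INTEGER, author TEXT, path TEXT)")
    conn.executemany("INSERT INTO changes VALUES (?, ?, ?)", SAMPLE_CHANGES)
    query = "SELECT revno, author, path FROM changes ORDER BY revno"
    return conn.execute(query).fetchall()


def analyze(rows):
    """Return (files per author, change count per file, co-change clusters)."""
    author_files = defaultdict(set)  # author -> files the author touched
    file_revs = defaultdict(set)     # file -> revisions that changed it
    rev_files = defaultdict(set)     # revision -> files changed together
    for revno, author, path in rows:
        author_files[author].add(path)
        file_revs[path].add(revno)
        rev_files[revno].add(path)

    # Link two files whenever they were changed in the same revision.
    neighbours = defaultdict(set)
    for files in rev_files.values():
        for path in files:
            neighbours[path]  # make sure isolated files are nodes too
        for a, b in combinations(sorted(files), 2):
            neighbours[a].add(b)
            neighbours[b].add(a)

    # Clusters are the connected components of the co-change graph.
    clusters, seen = [], set()
    for start in sorted(neighbours):
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node not in component:
                component.add(node)
                stack.extend(neighbours[node] - component)
        seen |= component
        clusters.append(sorted(component))

    change_counts = Counter({p: len(revs) for p, revs in file_revs.items()})
    return author_files, change_counts, clusters


if __name__ == "__main__":
    rows = load_changes(sqlite3.connect(":memory:"))
    author_files, change_counts, clusters = analyze(rows)
    print("Files per author:", {a: len(f) for a, f in author_files.items()})
    print("Most changed files:", change_counts.most_common(2))
    print("Co-change clusters:", clusters)
```

In this made-up data, the documentation file ends up in the same cluster as the parser sources even though it lives in a different directory, which is the kind of cross-directory relation item 3 is about. On a real repository, connected components would probably be too coarse, and edge weights (how often two files change together) would be needed.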

The initial results are interesting. I will put up the charts, analysis, etc. on my site in a few days' time.

References and Interesting Articles/Links
  1. Introduction to Social Network Analysis (from
  2. Case Studies of Social Network Analysis (from
  3. Wikipedia page on Social Networks (Check the history of Social Network Analysis)
  4. Social Life of Routers (Computer networks as social networks)
  5. Finding Go-to People and Subject Matter Experts in Organization
  6. Predicting Defects using Network Analysis on Dependency Graphs – ICSE 2008
  7. Mining Software Archives (a special issue of IEEE magazine)


F said...

Hi. Great work! Are the scripts you've used available? I'm using svnplot, and have seen the charts you posted on your website. I find they'd be a great addition to svnplot :)

Nitin Bhide said...

I have not added these scripts to SVNPlot yet, as they contain a lot of hard coding and experimental code. If you are interested, I will be happy to share the scripts with you. Please send me a mail.