
Wednesday, March 20, 2013

Software Performance Optimization - A Different Skill

Almost every project I have worked on in the last 18 years has required performance optimization, and over time I have become somewhat of an expert at it across various domains. I have optimized CAD/CAM algorithms, database queries, and caching layers, and along the way considered alternative algorithms, alternative data structures, the impact of page faults, the impact of caching, and so on.

A few things differ from domain to domain, but some basics remain constant. The first rule of optimization is "Don't depend on your gut feel about the location of the performance bottleneck." 99% of the time your gut feel is wrong, so you need tools to locate the bottleneck. Essentially the process boils down to:
  1. Identify an appropriate tool to generate the performance data. Usually this will be a 'profiler', but sometimes other tools are required (e.g. for analyzing long-running database queries).
  2. Generate the performance data using the tool.
  3. Interpret the data and locate the performance bottleneck. This requires some practice (and guidance, if available).
  4. Study the bottleneck code and find a way to eliminate the bottleneck with the least amount of code change. Keeping changes small is important; a large amount of change can introduce new bugs.
  5. Make the code changes and test.
  6. Generate new performance data and verify that the bottleneck is fixed. If not, revert the changes.
  7. If performance has improved, keep the changes and commit them.
  8. Analyze the performance data again for the next bottleneck.
  9. Repeat steps 3-8.
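
To make steps 2 and 6 concrete, here is a minimal sketch (my own illustration, not taken from any particular project) of a scoped timer in C++. It is not a substitute for a real profiler; it just measures one labeled block, which is handy for confirming a hotspot the profiler reported and for verifying that a fix actually reduced the time.

#include <chrono>
#include <cstdio>
#include <string>

// Minimal scoped timer: prints the wall-clock time spent in a labeled block.
// A profiler gives the full picture; this only confirms a suspected hotspot
// and verifies that a fix actually reduced the time (steps 2 and 6 above).
class ScopedTimer {
public:
    explicit ScopedTimer(std::string label)
        : label_(std::move(label)), start_(std::chrono::steady_clock::now()) {}

    ~ScopedTimer() {
        const auto end = std::chrono::steady_clock::now();
        const long long ms =
            std::chrono::duration_cast<std::chrono::milliseconds>(end - start_).count();
        std::printf("%s: %lld ms\n", label_.c_str(), ms);
    }

private:
    std::string label_;
    std::chrono::steady_clock::time_point start_;
};

int main() {
    ScopedTimer total("suspected hotspot");
    // ... the code the profiler pointed at goes here ...
    volatile double sum = 0.0;                  // placeholder workload
    for (int i = 0; i < 10000000; ++i) sum += i * 0.5;
    return 0;
}
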
For most projects, 5 to 10 times speed-ups are possible, though project teams usually find this hard to believe. Recently I worked with the SigmaTEK Systems India team on improving the performance of their Tube Nesting product, and together we improved it by more than 5 times. Here is what the team had to say about the engagement:

It was a real pleasure to work with Nitin on several projects, especially related to performance development.

Nitin came on board in a typical situation, where the customer was unhappy about the speed of the algorithm and there was a lot of pressure to improve it significantly.

Nitin showed us how to systematically analyze code using the simplest tools possible (the emphasis was always on understanding, never too much on tools). His inputs and ideas on how to improve the performance, without having to compromise the quality of the results, were extremely valuable. In addition to code optimization using performance metrics, Nitin was very keen on evaluating the algorithmic techniques as well, and provided us with several alternative approaches, right down to the core level, to evaluate for performance improvement.

This experience has been a real eye-opener for us. Although it sounded like a cliché when Nitin mentioned for the very first time that he has been involved in several projects with optimization improvements of 5X or more, it was extremely satisfying to see him guide us, using his systematic methods and principles, to performance gains of 5X+ in our project as well.



Monday, April 06, 2009

An Unusual Way of Performance Optimization

I have done performance optimization in all my projects. Usually it involves selecting an appropriate data structure or algorithm, caching calculated data to avoid re-computation, and so on. However, sometimes performance optimization poses unique challenges.

In this particular case, the problem involved fitting a spline surface to a cloud of points. To fit the final surface, a reference spline surface was created and the point cloud was projected onto it. The (u,v) coordinates of each projected point were then assigned to the original point as its reference (u,v). We were using the Parasolid geometry kernel, which has a point-projection function. There is no analytical solution for projecting a point onto a spline surface, so the projection algorithms are iterative and converge to the solution within a given tolerance. However, in our case we were projecting anywhere from 3,000 to 10,000 points onto a spline surface, and in the typical use-case the point cloud was projected many times onto different reference surfaces.
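
To make the structure concrete, the naive version of the computation looked roughly like the sketch below. The types and the project_point_to_surface function are hypothetical stand-ins for the Parasolid surface type and its point-projection call, not the real API.

#include <vector>

struct Point3D { double x, y, z; };   // a point from the laser-scanned cloud
struct UV      { double u, v;    };   // parametric coordinates on the surface
struct Surface { /* reference spline surface (stand-in for the kernel's type) */ };

// Hypothetical stand-in for the geometry kernel's point-projection call.
// The real routine is iterative and converges to the closest point on the
// spline surface within a given tolerance. (Declaration only.)
UV project_point_to_surface(const Surface& surf, const Point3D& p);

// Naive version: each of the 3,000-10,000 points is projected independently,
// so every call starts its iterations from scratch.
std::vector<UV> project_cloud(const Surface& surf, const std::vector<Point3D>& cloud) {
    std::vector<UV> result;
    result.reserve(cloud.size());
    for (const Point3D& p : cloud) {
        result.push_back(project_point_to_surface(surf, p));
    }
    return result;
}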

The initial naive version took 30 to 35 minutes to complete the simplest case; complex cases required hours. Obviously this was a deal breaker, and it was clear it would not be acceptable to the customer. We made some obvious changes, such as caching results wherever possible, which improved the computation time somewhat (e.g. from 30 minutes to 10-15 minutes).

Obviously that was not enough; we wanted the simplest cases to finish in seconds. A different solution was needed, and we were stuck. Implementing the projection algorithm ourselves was out of the question (it is a complex algorithm, and the Parasolid implementation was probably already highly optimized). Parallelizing with multiple threads was another option, but that would give only around a 3-4 times speedup, and we needed a much larger speedup than that.

I looked at the Parasolid projection function again and found that it takes an additional parameter: it is possible to give a 'hint' for the projected (u,v) values. If the hint is good, it can cut down the number of iterations required to converge to the solution; but computing the hint itself has to be simple and quick. All of a sudden I realized that the input point cloud is laser-scanned data, so the chance that the previous point is near the current point is very high. The solution was simple: use the computed (u,v) values of the previous point as the hint when projecting the current point. Since in most cases the previous and current points are very near each other, the number of iterations required to compute each projection dropped drastically. The code change required to implement this solution was minimal, but the speed-ups achieved were really dramatic.
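
The change itself is tiny. Reusing the hypothetical types from the earlier sketch, and assuming a projection call that accepts a seed (u,v) (a stand-in for the Parasolid function with the hint parameter, not its real signature), the hint-seeded loop looks roughly like this:

// Hypothetical stand-in for the projection call that accepts a hint: a good
// seed (u,v) lets the iterative solver converge in far fewer iterations.
UV project_point_to_surface(const Surface& surf, const Point3D& p, const UV& hint);

// Laser-scanned points arrive roughly in scan order, so the previous point's
// (u,v) is usually an excellent starting guess for the next point.
std::vector<UV> project_cloud_with_hints(const Surface& surf,
                                         const std::vector<Point3D>& cloud) {
    std::vector<UV> result;
    result.reserve(cloud.size());
    UV hint{0.5, 0.5};                     // neutral seed for the very first point
    for (const Point3D& p : cloud) {
        const UV uv = project_point_to_surface(surf, p, hint);
        result.push_back(uv);
        hint = uv;                         // this projection seeds the next one
    }
    return result;
}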

We made a few other performance improvements as well. When we finally delivered the project, the same simple test case that had taken 30 minutes was finishing in 7-10 seconds: a speed-up of about 180 times. :-)

This was really the most unusual and dramatic performance optimization I have done. It was also certainly a case of 'complex-looking problems usually have very simple solutions'.