I have done performance optimizations in all my projects. Usually it involves selecting the appropriate data structure or algorithm, caching the calculated data to avoid re-computations etc. However, sometimes performance optimizations pose unique challenges.
In this particular case, the problem involved fitting a spline surface to cloud of points. To fit the final surface, a reference spline surface was created and point cloud was projected on to the reference spline surface. The (u,v) coordinates of projected point were then assigned as reference (u,v) point to original point. We were using Parasolid geometry kernel. Parasolid has a point projection function. There is no analytical solution for projecting the point on spline surface. Hence the algorithms for projecting point to spline surface are iterative algorithms which converge to the solution within given tolerance. However, in our case, we were projecting anywhere from 3000 to 10,000 points to a spline surface. In the typical use-case, the projection of point cloud happened many times on different reference surfaces.
Initial naive version took 30 to 35 minutes to complete simplest case. Complex cases required hours. Obvious this was 'deal breaker'. It was clear that this was not going to be acceptable to customer. We did some obvious changes like caching the results wherever possible. This somewhat improved the computation time (e.g. from 30 mins to 10-15 mins).
Obviously that was not enough. We wanted to do the simplest cases in seconds. A different solution was needed and we were stuck. Implementing the projection algorithm was out of question (as it is complex algorithm and Parasolid implementation was probably already highly optimized). Parallelizing with multiple threads was another option but that would give around 3/4 times speedup. We needed much-much larger speedup than that.
I re-looked at the Parasolid projection function and found that there is an additional parameter to the function. It is possible to give a 'hint' about projected (u,v) values. If the hint is good, it can cut down the number of iterations required to converge to the solution. But the way to compute the 'hint (u,v) values' has to be simple and quick. All of a sudden, I realized that the input point cloud is a laser scanned data, so the chances that previous point is near to current point are very high. Now the solution was simple use the calculated (u,v) values of previous point as 'hint' for the calculation of projected (u,v) for the current point. Since most of the cases, the previous point and current point are very near to each other, number of iterations required to compute the projection were drastically reduced. The code change required to implement this solution was 'minimal'. But the speed ups achieved were really dramatic.
We did few other performance improvements. Finally when we delivered the project the same simple testcase which took 30 minutes was finishing in 7-10 seconds. A speed up of 180 times. :-)
This was really most unusual and dramatic performance optimization that I did. Also it was certainly the case that 'complex looking problem usually have very simple solution'