The Art of Performance Tuning and Optimization


About 2500 years ago there lived a brilliant Chinese military strategist, General Sun Tzu. His life’s work was a book entitled “The Art of War”. It is full of ideas, in a way design patterns, for winning conflicts small and large in the most efficient way.

Even though the book is now 25 centuries old, it still inspires leaders in our times. Sun Tzu believed that one of the better ways to win a war was to cleverly avoid the fight. In our beloved software industry there are quite a few who believe the same. You may have heard that the best code is the code you never have to write (think: existing libraries). In this article I will try to explain how, in my opinion, we can make our software perform well without complex optimizations: the Sun Tzu way.

# 1 – He who optimizes prematurely loses at the first load test

Some time ago, just to gain some experience, I decided to create a caching framework: something that “understands” when to cache things and can be easily integrated with a DAL or services layer via attributes. At first I thought I had done well. My code did not cache things that it often had to invalidate. It even used reflection to understand how objects were connected, so that once you invalidated a child object, its parent and all linked objects were invalidated too.

I wrote a lot of smart code in this framework, but when I got to profiling, it turned out that the performance gain was not nearly as big as I had expected. I was not happy to realize that my self-caching data access wasted too many resources. In a way I wasted my time, but I also learned something: there is no such thing as generic caching, just as there are no generic solutions in software development. The lesson was not to cache until I have a working solution and some real usage statistics. Only then can one decide what to cache, that is, what will actually benefit from being cached. Trying to come up with a caching strategy in the early stages of development is counterproductive. It is actually better to get poor stress-test results first: once you analyze them, you may decide to optimize the code instead of caching. Something to think about.
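To make "real usage statistics" concrete, here is a minimal sketch of how an access log could be turned into a shortlist of cache candidates. It is written in Python for brevity and is not taken from the framework described above. The `cache_candidates` function, the log format and both thresholds are assumptions made purely for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class KeyStats:
    reads: int
    writes: int

    @property
    def reads_per_write(self) -> float:
        # A key that is never written is the ideal cache candidate.
        return float("inf") if self.writes == 0 else self.reads / self.writes


def cache_candidates(access_log, min_reads=100, min_ratio=10.0):
    """Return keys worth caching, most-read first.

    access_log is an iterable of (operation, key) pairs, where operation
    is "read" or "write". The thresholds are illustrative assumptions,
    not recommendations.
    """
    reads, writes = Counter(), Counter()
    for operation, key in access_log:
        (reads if operation == "read" else writes)[key] += 1

    stats = {key: KeyStats(reads[key], writes[key]) for key in reads}
    picked = [
        key for key, s in stats.items()
        if s.reads >= min_reads and s.reads_per_write >= min_ratio
    ]
    return sorted(picked, key=lambda key: stats[key].reads, reverse=True)


# Hypothetical log: the catalogue is read often and rarely changed, while
# the cart is invalidated almost as often as it is read.
log = (
    [("read", "catalogue")] * 500 + [("write", "catalogue")] * 2
    + [("read", "cart")] * 300 + [("write", "cart")] * 250
    + [("read", "about_page")] * 20
)
print(cache_candidates(log))
```

With this made-up log only the catalogue qualifies: the cart changes too often for a cache to pay off, and the about page is read too rarely to matter.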

# 2 – While the sword is important, a victorious warrior should also know his spear and shield

It is vital that as a developer you know your tools. So as an exercise I recommend spending an hour on each of the following tools, just to see what they can do and perhaps read some basic tutorials if you find them interesting:

  • Glimpse: an amazing tool which, once added to the application (via painless integration), renders itself on top of your website, allowing Firebug-like analysis of the most interesting aspects of page requests and rendering. Below are just a few of the things it can tell you:
    • Detailed rendering time information
    • Server and client time consumed for the requests
    • Active database connections
    • Queries executed (including the actual SQL… How does it know?!)
    • MVC Routing information
    • MVC Model binding data

There are tons of plugins for profiling: Azure, EntityFramework, SignalR, Knockout, various IOC frameworks and many more. They are available here:

  • CLR Profiler: In some scenarios you may need to look for performance issues deeper in your custom server logic. For this task I recommend a CLR profiler. There are some free ones, e.g.: It doesn’t take great skill to understand how profilers work. You simply click “start profiling”, do your thing and click “stop profiling”. The result should be some sort of tree-view report in which you can explore the method calls made (each one displayed with useful CPU and memory utilization data). It may literally take minutes to find the exact method, and often even the specific line of code, that is your bottleneck or hotspot.
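The start/stop workflow is the same in most profilers. As a rough illustration, the sketch below uses Python's built-in `cProfile` module as a stand-in for a CLR profiler. The functions `slow_lookup`, `fast_lookup` and `handle_request` are hypothetical, and the hotspot is planted on purpose.

```python
import cProfile
import io
import pstats


def slow_lookup(items, targets):
    # Deliberate hotspot: each membership test on a list is O(n).
    return sum(1 for t in targets if t in items)


def fast_lookup(items, targets):
    # Same result using a set, where membership tests are O(1) on average.
    item_set = set(items)
    return sum(1 for t in targets if t in item_set)


def handle_request():
    items = list(range(2_000))
    targets = list(range(0, 4_000, 2))
    return slow_lookup(items, targets), fast_lookup(items, targets)


profiler = cProfile.Profile()
profiler.enable()            # "start profiling"
result = handle_request()    # do your thing
profiler.disable()           # "stop profiling"

# Render the report into a string, sorted by cumulative time per function.
report = io.StringIO()
stats = pstats.Stats(profiler, stream=report)
stats.sort_stats("cumulative").print_stats(8)
print(report.getvalue())
```

In the printed report `slow_lookup` should sit near the top with far more cumulative time than `fast_lookup`, which is exactly the kind of pointer to a bottleneck described above.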
  • SQL Profiler: Often performance issues originate at the data source. Whether it is a badly written stored procedure with too many joins or heavy use of cursors where they are not required, the tool which will help you uncover the bad guy here is SQL Profiler. You can find it under the Tools menu in SQL Server Management Studio. Using this profiler you are effectively listening in on the traffic reaching the database engine. It allows you to filter all the queries going into the database by text, name, client machine and more. It shows what is queried, how often and, most importantly, how the queries are structured. In case you cannot spot anything wrong with a particular bit of SQL, it is worth checking its execution plan. That is usually enough to pinpoint an inefficiency.
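To show what checking an execution plan looks like in practice, here is a small sketch that swaps SQL Server for an in-memory SQLite database, so it runs anywhere with Python's standard library. The `orders` table, the index name and the `plan_for` helper are made up for this example. SQL Server exposes its plans through its own tooling, not through the `EXPLAIN QUERY PLAN` statement used here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, i * 1.5) for i in range(1_000)],
)

QUERY = "SELECT total FROM orders WHERE customer_id = ?"


def plan_for(connection, sql, params):
    """Return the human-readable steps of SQLite's query plan."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]  # the last column holds the description


# Compare the plan for the same query before and after adding an index.
before = plan_for(conn, QUERY, (42,))
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
after = plan_for(conn, QUERY, (42,))

print("without index:", before)
print("with index:   ", after)
```

Before the index exists the plan reports a scan of the whole table. Afterwards it reports a search using `idx_orders_customer`. The exact wording of the plan text varies between SQLite versions.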

# 3 – Mistakes multiply when introduced as conventions and patterns

To use design patterns one should first understand anti-patterns. When you introduce a pattern to your solution you need to be aware that it will be used repeatedly by other developers. Hence all usage scenarios need to be considered and, most importantly, a pattern needs to be fully understood before it is applied. Also, for performance reasons you should understand all the parallel development patterns, such as:

  • Async/Await pattern
  • Task Parallel Library – Task cancellation and continuation concepts
  • Parallel Loops – Parallel.For(…)

These have been available for a while now, but we still do not see them used as often as we should, so performance is lost and servers are under-utilized. There is a good reason we have so many cores available now: use them.
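As a rough illustration of the first pattern in the list, the sketch below uses Python's `asyncio` in place of C#'s async/await. The `fetch` coroutine and the job names are hypothetical stand-ins for I/O-bound calls such as web requests or database queries. Note that this overlaps waiting time on a single thread. Spreading CPU-bound work across cores, as `Parallel.For` does, needs a different mechanism.

```python
import asyncio
import time


async def fetch(name, delay):
    # Stand-in for an I/O-bound call such as a web request or a query.
    await asyncio.sleep(delay)
    return f"{name} done"


async def sequential(jobs):
    # Each call must finish before the next one starts.
    return [await fetch(name, delay) for name, delay in jobs]


async def concurrent(jobs):
    # gather() schedules all coroutines and waits for them together.
    return await asyncio.gather(*(fetch(name, delay) for name, delay in jobs))


def timed(coro_fn, jobs):
    start = time.perf_counter()
    results = asyncio.run(coro_fn(jobs))
    return results, time.perf_counter() - start


jobs = [("orders", 0.2), ("customers", 0.2), ("stock", 0.2)]
seq_results, seq_time = timed(sequential, jobs)
con_results, con_time = timed(concurrent, jobs)
print(f"sequential: {seq_time:.2f}s, concurrent: {con_time:.2f}s")
```

Both versions return the same results, but the concurrent one takes roughly as long as the slowest call instead of the sum of all three.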

# 4 – He who fails to prepare is essentially preparing to fail

So we now know that optimization should happen once the majority of the work has been done. There are still things we can do at the very beginning, though. We can make a lot of good design decisions that will improve our performance:

  • Think about your database access – perhaps you shouldn’t use an ORM and its many levels of abstraction.
  • Think about database choice – perhaps you should consider non-relational databases for some or even all of your data.
  • Think about your layers – do you need 5 layers mapping similar objects from one layer to another on each call? Is there a clear benefit for each separation?
  • Consider single page applications – perhaps it would be better to get most of the data initially and then only call simple save/update methods on each user action?

# Summary – The art of profiling is an essential tool for delivering software solutions

If I were to summarize this article using only one word, it would be: “Think”. And if I were to use only one sentence, I would say: “Always design for performance, utilize parallel computing whenever it makes sense and at the very end use all the profilers you know to find perfect candidates for caching”.


