Monday, 7 January 2013

Performance series: Is thread contention bad?

[Level T2] For better or worse, I have recently been working a lot in the area of performance optimisation and bottleneck hunting.

Performance monitoring is as much an art as it is a science. On the one hand you have all the various metrics (simple, complex or derived) that a system can produce at each level, from the client to the presentation layer to middleware to the database. On the other hand you need the art of comparison and the judgement to decide whether to ignore or act upon a particular change. The work can be exciting at times and slow and frustrating at others - performance bottlenecks can be difficult to find, but finding one is extremely gratifying!

I am going to write a separate post on performance monitoring of an ASP.NET web application, but I will use this post to look at one particular scenario: thread contention. This is especially important given the increasing popularity of asynchronous programming and the use of .NET 4.0's TPL and .NET 4.5's async/await.

So let's imagine this is the problem statement:
From version 1 of the software to version 2 of the software, thread contention has increased substantially. 
And imagine you are responsible for your application's performance. What would you do?

What is Thread Contention?

In .NET, "Contention Rate / sec" (in the ".NET CLR LocksAndThreads" category) is the performance counter that measures the level of contention among managed threads. It reports the number of unsuccessful attempts to acquire a managed lock per second. Where would you take a lock? At the point of synchronisation. So the contention rate is a measure of how often threads synchronise rather than of how much multi-threading is going on.
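
For reference, the same counter can also be sampled from code instead of the Performance Monitor UI. This is only a minimal sketch, assuming the .NET Framework on Windows; the class name ContentionCounterSample and the Watch method are made up for illustration, and the instance name is assumed to match the process name (which holds when only one process with that name is running).

```csharp
using System;
using System.Diagnostics;
using System.Threading;

static class ContentionCounterSample
{
    // Category and counter names as they appear in Performance Monitor.
    public const string Category = ".NET CLR LocksAndThreads";
    public const string Counter = "Contention Rate / sec";

    // Samples the counter for the given instance once per second.
    public static void Watch(string instanceName, int samples)
    {
        using (var counter = new PerformanceCounter(Category, Counter, instanceName, readOnly: true))
        {
            counter.NextValue(); // the first read of a rate counter returns 0
            for (int i = 0; i < samples; i++)
            {
                Thread.Sleep(1000);
                Console.WriteLine("Contention rate/sec: {0:F1}", counter.NextValue());
            }
        }
    }

    static void Main()
    {
        // Assumption: instance name == process name (single instance running).
        Watch(Process.GetCurrentProcess().ProcessName, samples: 10);
    }
}
```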

To confirm that contention tracks synchronisation rather than multi-threading, let's use a console application and monitor Contention Rate / sec in the performance monitor for the console application's process. Initially, let's try this code:

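// Requires: using System; using System.Threading.Tasks;
static void Main(string[] args)
{
    // CPU-bound work with no shared state, so no managed locks are taken.
    Parallel.For(1, 1000000, (i) => Math.Asin(Math.Log(i)));
    Console.Read(); // keep the process alive while watching the counter
}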

In this case contention stays at 0: we have used multi-threading, yet there is no shared resource that requires locking. The second scenario is when we write the loop counter to the console:

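// Requires: using System; using System.Threading.Tasks;
static void Main(string[] args)
{
    // Every iteration writes to the console, which all threads share.
    Parallel.For(1, 1000000, (i) => Console.WriteLine(i));
    Console.Read(); // keep the process alive while watching the counter
}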

Now the performance monitor shows a very high contention rate (on my machine it averages 200 per second). But wait, we did not use any locking here?! Well, we did not, but Console did: all calls to the Console class are synchronised, so it uses locking internally.
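
To make the hidden lock visible, here is a rough sketch in which the loop body takes an explicit lock around a shared counter. It is only an approximation of the effect, not a reproduction of what Console does internally; the names LockDemo, Gate and CountWithLock are made up for illustration. I would expect it to show a non-zero contention rate while it runs.

```csharp
using System;
using System.Threading.Tasks;

static class LockDemo
{
    static readonly object Gate = new object(); // hypothetical shared lock

    // Every iteration takes the same lock, so parallel iterations queue up
    // behind each other and failed lock attempts show up as contention.
    public static long CountWithLock(int iterations)
    {
        long total = 0;
        Parallel.For(0, iterations, i =>
        {
            lock (Gate)
            {
                total++; // safe: only one thread at a time gets here
            }
        });
        return total;
    }

    static void Main()
    {
        Console.WriteLine(CountWithLock(1000000)); // prints 1000000
        Console.Read(); // keep the process alive while watching the counter
    }
}
```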

Can Thread Contention be healthy?

Absolutely! Contention is just a measurement, and you should not base your judgement on Guide Metrics alone. Guide metrics such as "% Processor Time", "Contention Rate / sec" and "Number of Threads" are to be looked at only when Goal Metrics such as throughput (related to scalability) and performance have deteriorated. For example, an increase in response time or a drop in the maximum number of concurrent requests/users is something to look into, but an increase in % Processor Time can be a sign of better utilisation of the system, for example after removing a logical bottleneck such as coarse locking.

So always look at guide metrics in the light of goal metrics.

An important example of this in the case of contention is the use of IO completion ports. The .NET ThreadPool holds two kinds of threads: worker threads and IO completion port threads. Worker threads are known to most developers and can be used for processing a piece of work in the background. IO completion ports are a low-overhead Windows mechanism for asynchronous IO: the operating system carries out the IO-bound operation without tying up a thread, and an IO completion port thread picks up the notification when the job is done. The .NET runtime uses these threads when you call .BeginXXX and .EndXXX on an IO related class such as FileStream (for FileStream, provided the stream was opened for asynchronous IO).
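
As a rough illustration of the Begin/End pattern mentioned above, the sketch below reads a small file with FileStream.BeginRead and FileStream.EndRead, and prints the two thread pool limits. The names IocpDemo and ReadWithBeginEnd and the scratch file are assumptions made for the example. Exactly which thread runs the callback depends on the runtime and the operating system, so treat this as a sketch rather than a guarantee.

```csharp
using System;
using System.IO;
using System.Threading;

static class IocpDemo
{
    // Reads from a file with the Begin/End pattern. FileOptions.Asynchronous
    // asks for asynchronous (overlapped) IO instead of a blocking read.
    public static int ReadWithBeginEnd(string path, byte[] buffer)
    {
        using (var done = new ManualResetEventSlim())
        using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read,
            FileShare.Read, 4096, FileOptions.Asynchronous))
        {
            int bytesRead = 0;
            stream.BeginRead(buffer, 0, buffer.Length, ar =>
            {
                bytesRead = stream.EndRead(ar); // runs when the read completes
                done.Set();
            }, null);
            done.Wait(); // block only so that the sample can return a value
            return bytesRead;
        }
    }

    static void Main()
    {
        int workers, completionPorts;
        ThreadPool.GetMaxThreads(out workers, out completionPorts);
        Console.WriteLine("Max worker threads: {0}, max IO completion threads: {1}",
            workers, completionPorts);

        string path = Path.GetTempFileName(); // hypothetical scratch file
        File.WriteAllText(path, "hello");
        Console.WriteLine("Bytes read: {0}", ReadWithBeginEnd(path, new byte[16]));
        File.Delete(path);
    }
}
```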

ContentionTest is a simple project that illustrates the use of IO completion ports. Try it for yourself and see that setting the async variable to true improves performance but sends the contention rate sky high.

Conclusion

Always look at guide metrics objectively. Do not try to improve guide metrics for their own sake; focus on goal metrics. If a drop in goal metrics coincides with a change in a particular guide metric, review the relationship.

Thread contention rate is a measurement of thread synchronisation. This rate can go up under various conditions such as increased throughput, use of IO completion ports or removal of a bottleneck, in which case there is no cause for concern. If high contention coincides with low throughput, consider refactoring the synchronisation.
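
One possible shape for such a refactoring, sketched under the assumption that the shared state is a simple running total: instead of taking a lock on every iteration, each worker accumulates into its own local value and merges into the shared total once at the end. The names RefactorDemo, SumWithLock and SumWithLocals are made up for illustration; whether this helps in a real application depends on what the lock actually protects.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class RefactorDemo
{
    // Coarse version: every iteration takes the same lock.
    public static long SumWithLock(int n)
    {
        object gate = new object();
        long total = 0;
        Parallel.For(0, n, i => { lock (gate) { total += i; } });
        return total;
    }

    // Refactored version: each worker sums into its own local value and
    // touches the shared total only once, when it finishes.
    public static long SumWithLocals(int n)
    {
        long total = 0;
        Parallel.For(0, n,
            () => 0L,                                    // per-worker starting value
            (i, state, local) => local + i,              // no shared state here
            local => Interlocked.Add(ref total, local)); // one merge per worker
        return total;
    }

    static void Main()
    {
        Console.WriteLine(SumWithLock(1000000));   // prints 499999500000
        Console.WriteLine(SumWithLocals(1000000)); // prints 499999500000
    }
}
```

Both methods return the same result; the difference should only show up in the guide metric (contention) and, if the lock was the bottleneck, in the goal metric (elapsed time).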
