Showing posts with label .NET. Show all posts

Sunday, 20 January 2013

Review of .NET Framework cryptography and symmetric algorithms benchmark

[Level T2] The .NET Framework provides a set of cryptography services in the System.Security.Cryptography namespace. This namespace has been there for years and I have been using some of its classes on and off. Yet I felt I needed to do some research into all the services it provides, and this post is the result.

Symmetric and asymmetric algorithms

Symmetric algorithms use a secret key/password (along with an Initialisation Vector, or IV) to encrypt and decrypt data. A number of these algorithms have been described and documented. The secret key needs to be stored safely, as whoever gets hold of the key can encrypt and decrypt the data. These algorithms are efficient and fast and can encrypt/decrypt large amounts of data - they can even encrypt streams of data.

The Initialisation Vector is used for encrypting the first block of data. It is generally not considered a secret and can be passed as plain text, although it should be different for every message encrypted with the same key. The same IV needs to be used for both encryption and decryption of a given message. More information can be found here.
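A minimal sketch of a symmetric round trip with a key and an IV, using the framework's Aes class. The class and method names (SymmetricSketch, Encrypt, Decrypt) are hypothetical and only there for illustration; the defaults of Aes.Create() (CBC mode, PKCS7 padding) are assumed.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper: encrypts and decrypts a string with AES.
public static class SymmetricSketch
{
    public static byte[] Encrypt(string plainText, byte[] key, byte[] iv)
    {
        using (var aes = Aes.Create())
        using (var encryptor = aes.CreateEncryptor(key, iv))
        using (var output = new MemoryStream())
        {
            using (var cryptoStream = new CryptoStream(output, encryptor, CryptoStreamMode.Write))
            {
                byte[] bytes = Encoding.UTF8.GetBytes(plainText);
                cryptoStream.Write(bytes, 0, bytes.Length);
            } // disposing the CryptoStream writes the final padded block

            return output.ToArray();
        }
    }

    public static string Decrypt(byte[] cipherText, byte[] key, byte[] iv)
    {
        // The same key and IV that were used for encryption must be supplied here.
        using (var aes = Aes.Create())
        using (var decryptor = aes.CreateDecryptor(key, iv))
        using (var input = new MemoryStream(cipherText))
        using (var cryptoStream = new CryptoStream(input, decryptor, CryptoStreamMode.Read))
        using (var reader = new StreamReader(cryptoStream, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }
}
```

A 32-byte key and a 16-byte IV can be produced with a cryptographic random generator; the IV can then travel alongside the cipher text while the key stays secret.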

Algorithms available in .NET Framework:

  • DES: One of the oldest. Uses 8-byte keys.
  • RC2: Can use keys of 5-16 bytes.
  • TripleDES or 3DES: An advanced and more secure DES. Uses keys of 16 or 24 bytes.
  • Rijndael: A purely managed implementation. Can use keys of 16, 24 or 32 bytes.
  • AES: The standardised form of Rijndael with a fixed 16-byte block size. Uses keys of 16, 24 or 32 bytes. [Please see @leastprivilege comment at the end as AES is an alternative Rijndael implementation]

The performance of these algorithms is compared here (code to reproduce this can be found on GitHub):


As can be seen, AES and Rijndael provide the best performance, partly due to the fact that they are purely managed. RC2, DES and 3DES are older algorithms and not the most reliable in terms of security, so they are best avoided. So when it comes to choosing an algorithm, use AES or Rijndael.

HMACSHA1 is a symmetric algorithm but is not used for encryption. It is mainly used for creating a hash-based message authentication code (HMAC), which is a kind of signing.

Asymmetric algorithms, on the other hand, use a public/private key pair. These algorithms can be used for encryption as well as signing. The public key is used for encryption and verifying signatures while the private key is used for decryption and generating signatures. So the idea is that the public key is shared publicly while the private key has to be stored securely. X509 certificates are a common way of packaging public/private key pairs and are usually installed in the machine's keystore, so the operating system (and resource ACL) is responsible for their security.

Asymmetric algorithms are used for encrypting small pieces of data. Unlike symmetric algorithms, they are slow and the size of the input is limited by the key size, so they are not used for encrypting large data or streams. Instead they are usually used to encrypt the secret keys of symmetric algorithms. Signing is an important use of asymmetric algorithms. The .NET Framework itself uses it for signing assemblies (strong naming assemblies).

The most widely used algorithm is RSA, which is implemented in the .NET Framework.
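As an illustration of signing with a key pair, here is a small sketch using RSA. It assumes a runtime where RSA.Create(int) and the SignData/VerifyData overloads taking HashAlgorithmName exist (.NET Framework 4.7.2 or later, or .NET Core); AsymmetricSketch and its method names are hypothetical.

```csharp
using System;
using System.Security.Cryptography;

// Hypothetical helper: signs with the private key, verifies with the public key only.
public static class AsymmetricSketch
{
    public static byte[] Sign(RSA keyPair, byte[] data)
    {
        // Requires the private half of the key pair.
        return keyPair.SignData(data, HashAlgorithmName.SHA256, RSASignaturePadding.Pkcs1);
    }

    public static bool Verify(RSAParameters publicKey, byte[] data, byte[] signature)
    {
        using (var rsa = RSA.Create())
        {
            // Only the public parameters are imported, so this instance cannot sign.
            rsa.ImportParameters(publicKey);
            return rsa.VerifyData(data, signature, HashAlgorithmName.SHA256, RSASignaturePadding.Pkcs1);
        }
    }
}
```

The public parameters can be obtained with ExportParameters(false) on the key pair and handed to anyone who needs to verify signatures.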

An example of using symmetric and asymmetric algorithms

SSL/TLS is an example of using both symmetric and asymmetric algorithms in the same security context session. In this protocol, the client and server negotiate a size for the encryption key (as well as other aspects including the algorithm to use) and then the client generates a random secret key of the agreed size for symmetric encryption of the communication. It then uses the public key of the server to encrypt the secret key and sends it to the server. The server, being the only entity in possession of the private key, decrypts the secret key. This secret key is used for the lifetime of the secure transmission to symmetrically encrypt (and decrypt) the data.

Other cryptography services

Random generation

Most likely you have used the Random class in the .NET Framework. This class is capable of generating random bytes that could, on the face of it, be used as secret keys:
var random = new Random();
var buffer = new byte[1024];
random.NextBytes(buffer);
While this code works, the Random class only produces pseudo-random values: the whole sequence is determined by its seed, so the output is predictable and not suitable for secret keys.

For producing numbers with cryptography-level randomness use RNGCryptoServiceProvider:
var buffer = new byte[1024];
using (var rng = new RNGCryptoServiceProvider())
{
    rng.GetBytes(buffer);
}

Hashing

Hashing is the process by which a signature of small fixed size is created for a much larger piece of data. Hashing algorithms such as MD5 or SHA can be used to generate such hashes. Hashing algorithms can also take a secret key which is used while generating the hash. The ability to produce the hash with a given secret key can be used as proof of ownership of the secret without having to send the secret. This technique is used in OAuth with the HMACSHA1 algorithm. [OAuth2 is different as per @leastprivilege comment. Please see comments]

There are a number of hashing algorithms implemented in the .NET Framework. MD5 is best avoided and one of the SHA variants should be used instead. If you need key-based hashing, use HMACSHA1.
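A short sketch of both kinds of hashing: a plain SHA-256 digest and a keyed HMACSHA1 code. HashingSketch and its method names are hypothetical; the comparison helper is a hand-written constant-time loop, included as an assumption about how a receiver might check a code.

```csharp
using System;
using System.Security.Cryptography;

// Hypothetical helper showing unkeyed and keyed hashing.
public static class HashingSketch
{
    // Plain hash: anyone can compute it, so it only detects accidental changes.
    public static string Sha256Hex(byte[] data)
    {
        using (var sha = SHA256.Create())
        {
            return BitConverter.ToString(sha.ComputeHash(data)).Replace("-", "").ToLowerInvariant();
        }
    }

    // Keyed hash: only holders of secretKey can produce the same 20-byte code.
    public static byte[] ComputeHmac(byte[] secretKey, byte[] message)
    {
        using (var hmac = new HMACSHA1(secretKey))
        {
            return hmac.ComputeHash(message);
        }
    }

    // Compares two codes without stopping at the first differing byte.
    public static bool FixedTimeEquals(byte[] a, byte[] b)
    {
        if (a.Length != b.Length) return false;
        int diff = 0;
        for (int i = 0; i < a.Length; i++) diff |= a[i] ^ b[i];
        return diff == 0;
    }
}
```

The receiver recomputes the code over the message with its own copy of the secret and compares it with the code that was sent; the secret itself never travels.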

How to store the secret key of the encryption

We are commonly in need of encrypting pieces of sensitive data and storing them. These could include content of cookies, user sensitive information, etc. For using a symmetric encryption, we need to use a secret key. So where should we store the secret key?

I have seen many cases where the secret key is hard-coded within the code, usually as a string. First of all, such secret keys can be easily retrieved using reverse engineering tools such as Reflector. Also, these are commonly meaningful strings that can usually be guessed easily.

One way to have a secure secret key is to:
  1. Create a random secret key (of required size) using algorithms such as RNG
  2. Install an X509 certificate on the servers (private and public keys)
  3. Use asymmetric encryption to encrypt the key and store it for example as a file
  4. At the time of using the key, use the private key to decrypt the secret key and then use the key for symmetric encryption/decryption
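The four steps above can be sketched as follows. This is an illustration under stated assumptions rather than a hardened implementation: KeyFileSketch and its method names are hypothetical, the certificate is assumed to hold an RSA key pair, and the GetRSAPublicKey/GetRSAPrivateKey extension methods assume .NET Framework 4.6 or later.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;
using System.Security.Cryptography.X509Certificates;

// Hypothetical helper: keeps a symmetric key on disk, encrypted with a certificate.
public static class KeyFileSketch
{
    // Steps 1 and 3: create a random key, encrypt it with the public key, store it as a file.
    public static void CreateAndStoreKey(X509Certificate2 certificate, string path, int keySizeInBytes)
    {
        var key = new byte[keySizeInBytes];
        using (var rng = RandomNumberGenerator.Create())
        {
            rng.GetBytes(key);
        }

        using (RSA publicKey = certificate.GetRSAPublicKey())
        {
            File.WriteAllBytes(path, publicKey.Encrypt(key, RSAEncryptionPadding.OaepSHA1));
        }

        Array.Clear(key, 0, key.Length); // do not keep the plain key around longer than needed
    }

    // Step 4: decrypt the stored key with the private key when it is needed.
    public static byte[] LoadKey(X509Certificate2 certificate, string path)
    {
        using (RSA privateKey = certificate.GetRSAPrivateKey())
        {
            return privateKey.Decrypt(File.ReadAllBytes(path), RSAEncryptionPadding.OaepSHA1);
        }
    }
}
```

Step 2, installing the certificate, happens outside the code; the certificate would typically be loaded from the machine store with X509Store before calling these methods.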


Conclusion

Avoid older encryption or hashing algorithms such as MD5, DES, RC2 or 3DES. Use newer and managed-only implementations such as AES and Rijndael (and SHA for hashing). AES provides the best performance out of all symmetric encryption algorithms. Use a combination of symmetric and asymmetric algorithms for secure encryption. Use RNG if you need crypto-level randomness.

Special thanks to Dominick Baier @leastprivilege for reviewing this article and posting comments which you can read in the comments section.

Monday, 7 January 2013

Performance series: Is thread contention bad?

[Level T2] For better or worse, I have recently been working a lot on the area of performance optimisation and bottleneck hunting.

Performance monitoring is an art as much as it is a science. On one hand you have all the various metrics (simple, complex or derived) that a system can produce (at each level, from the client to presentation layer to middleware to database) and on the other hand you need the art of comparison and the ability to decide whether to ignore or act upon a particular change. The work can be exciting at times and slow and frustrating at others - performance bottlenecks can be difficult to find, but finding them is extremely gratifying!

I am going to have a separate post on performance monitoring of an ASP.NET web application, but I will use this post to look at one particular scenario: thread contention. This is especially important in the face of the increasing popularity of asynchronous programming and the use of .NET 4.0's TPL and .NET 4.5's async/await.

So let's imagine this is the problem statement:
From version 1 of the software to version 2 of the software, thread contention has increased substantially. 
And imagine you are responsible for your application's performance. What would you do?

What is Thread Contention?

In .NET, "Contention Rate / sec" is the performance counter that measures the level of contention as far as managed threads are concerned. It measures the number of unsuccessful managed lock attempts per second. Where would you take a lock? At the time of synchronisation. So contention rate is a measure of synchronisation rather than of multi-threading.

To confirm the above statement, let's use a console application and monitor the contention rate / sec in the performance monitor for the console application process. Initially, let's try this code:

static void Main(string[] args)
{
    Parallel.For(1, 1000000, (i) => Math.Asin(Math.Log(i)));
    Console.Read();
}

In this case contention will stay at 0: we have used multi-threading, but there is no shared resource requiring locking. The second scenario is when we write the iterator/counter to the console:

static void Main(string[] args)
{
    Parallel.For(1, 1000000, (i) => Console.WriteLine(i));
    Console.Read();
}

Now the performance monitor will show a very high number for contention rate (on my machine it averages 200). But wait, we did not use any locking here?! Well, we did not, but Console did - all calls to the Console class are synchronised, so it uses locking internally.
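The same effect can be produced with an explicit lock. The sketch below is an assumption-laden illustration (ContentionSketch and its method names are hypothetical): the first method takes a shared lock on every iteration, which is the kind of code that drives the counter up, while the second keeps a per-thread subtotal and only synchronises once per thread at the end.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper contrasting a contended lock with per-thread accumulation.
public static class ContentionSketch
{
    private static readonly object Gate = new object();

    // Every iteration competes for the same lock.
    public static long SumWithLock(int count)
    {
        long total = 0;
        Parallel.For(0, count, i =>
        {
            lock (Gate)
            {
                total += i;
            }
        });
        return total;
    }

    // Each thread sums into its own local value; synchronisation happens once per thread.
    public static long SumWithLocalTotals(int count)
    {
        long total = 0;
        Parallel.For(0, count,
            () => 0L,                                     // initial per-thread subtotal
            (i, state, local) => local + i,               // no shared state touched here
            local => Interlocked.Add(ref total, local));  // merge subtotals at the end
        return total;
    }
}
```

Both methods return the same sum; watching the contention counter while running each with a large count should show the difference in synchronisation.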

Can Thread Contention be healthy?

Absolutely! Contention is just a measurement and you should not base your judgement on Guide Metrics alone. Guide metrics such as "% Processor Time", "Contention Rate/sec" and "Number of Threads" are to be looked at only in the presence of deteriorated Goal Metrics such as throughput (related to scalability) and performance. For example, an increase in response time or a drop in the maximum number of concurrent requests/users is something to look at, but an increase in % Processor Time can be a sign of higher/better utilisation of the system after removing a logical bottleneck (for example coarse locking).

So always look at guide metrics in the light of goal metrics.

An important example of this in the case of contention is the use of IO completion ports. The .NET ThreadPool contains two types of threads: worker threads and IO completion port threads. Worker threads are known to most developers and can be used for processing a piece of work in the background. IO completion ports are a low-overhead Windows mechanism that lets an IO-bound operation be carried out without blocking a thread and notifies the application when the job is done. They will be used by the .NET runtime as soon as you use .BeginXXX and .EndXXX on an IO-related class such as FileStream.
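A minimal sketch of the Begin/End pattern on a FileStream, wrapped in a Task with FromAsync. AsyncFileSketch and ReadAsync are hypothetical names; opening the stream with useAsync set to true is what asks for overlapped IO on Windows.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

// Hypothetical helper: reads from a file using BeginRead/EndRead.
public static class AsyncFileSketch
{
    public static Task<int> ReadAsync(string path, byte[] buffer)
    {
        // useAsync: true requests asynchronous (overlapped) IO from the operating system.
        var stream = new FileStream(path, FileMode.Open, FileAccess.Read,
                                    FileShare.Read, 4096, useAsync: true);

        return Task<int>.Factory
            .FromAsync(stream.BeginRead, stream.EndRead, buffer, 0, buffer.Length, null)
            .ContinueWith(readTask =>
            {
                stream.Dispose();       // release the file once the read has completed
                return readTask.Result; // number of bytes read
            }, TaskContinuationOptions.ExecuteSynchronously);
    }
}
```

No thread is blocked while the read is in flight; the continuation runs once the operating system signals completion.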

ContentionTest is a simple project that illustrates the use of IO completion ports. Just try it for yourself and see that setting the async variable to true will improve the performance but will make the contention rate sky-rocket.

Conclusion

Always look at guide metrics objectively. Do not try to improve guide metrics; only focus on goal metrics. If a drop in goal metrics coincides with a change in a particular guide metric, review the relationship.

Thread contention rate is a measurement of thread synchronisation. This rate can go up in various conditions such as increased throughput, use of IO completion ports or removal of a bottleneck, in which case there is no cause for concern. If high contention coincides with low throughput, consider refactoring the synchronisation.

Sunday, 11 November 2012

NoSQL Benchmarking - Redis, MongoDB, Cassandra, RavenDB and SQL Server

Introduction

[Level C2] In the last post, I explained how limitation can lead to a better solution. This is an integral part of the NoSQL offering for me: the fact that we cannot abuse it by storing logic as well as data.

In this post I am going to report my NoSQL benchmark results. There are quite a few benchmarks already reported and available out there but this one focuses on NoSQL offerings available on Windows. If you are not a Windows developer, you might still find the results useful. My benchmark treats all these technologies as key/value stores - although most of them have many other features.

The code used for benchmarking is available in GitHub.

Disclaimer

In a distributed system, performance is not as important as scalability - which is not compared here. Take this for whatever it is worth. I have used a method (key/value storage/retrieval described below) which might or might not match the way you intend to use these technologies.

Use a storage system that suits you best. This report does not necessarily recommend or disapprove of a particular technology. The performance of the NoSQL stores is also affected by the client technology used. However, this is a price we normally pay so I think it is relevant to include it in the measurement. The variety in usage of these technologies means some results might have been skewed by the serialisation techniques.

Each of these technologies has different degrees of availability, consistency and partition tolerance. They also present different settings that can affect these variables. As such, the result of this benchmark must be interpreted in the light of them.

Contenders

Here I briefly explain the technologies compared.

Redis

Redis is a high-throughput caching/NoSQL technology written in C. It is mainly available on Linux but Windows ports can be used, although replication is currently not supported on Windows. The client library of choice is a fully-async library by uber-geek Marc Gravell called BookSleeve. There is an alternative library available which is part of ServiceStack.

I used the Redis port by MSOpenTech which can be downloaded from here. Full instructions for installing and running it are provided there. I have used all the settings out of the box. Redis provides two different persistence mechanisms: RDB and AOF. RDB is faster and is the default, but it can lose the most recent writes if the server stops between snapshots, so whether it is reliable enough depends on what you need from it.

To clear the data, just delete inst1 folder. Version used was 2.4.11.

MongoDB

MongoDB is written in C++ and has a stable port for Windows. It is a classic document database and provides its own query language, which has been abstracted away nicely by the NoRM library. It provides many nice querying features which we do not use here.

Downloading and installation are easy - just unzip it. You only need to create the folder C:\data\db, which is the default storage area, and then run mongod.exe.

To wipe out the data, just delete contents of C:\data\db folder. Version used was 2.0.2.

RavenDB

RavenDB is an emerging document database fully written in C#. It comes with its own client library which uses HTTP for communication. It is a transactional database which is an important feature. It also has features such as map-reduce.

Downloading is easy and no installation is required. Just unzip the package and run Raven.Server.exe. To wipe out the data you just need to delete the data folder. Version used was 1.0.960.

UPDATE
RavenDB's recommended approach is to open and close the session every time which I also used for tests (there is a default cap of 30 operations per session). I also tried a single session for the whole lot but performance was actually worse.

Cassandra

Out of all the NoSql stores I know, this one looks the most like an RDBMS - and I know it the least. It is written in Java and is schema-full. Its strengths are high throughput and, unlike a conventional RDBMS, the ability to scale out without limit.

There are currently two client libraries available for accessing Cassandra, of which I used Fluent Cassandra by Nick Berardi. The version of Cassandra used was 1.1.6.

SQL Server

OK, this one is a conventional RDBMS! But nothing stops you from using it as a key/value store and as we will see it competes very well with NoSql stores - while being transactional.

There are tens, if not hundreds, of libraries for accessing SQL Server and I used none of them. My approach was raw ADO.NET over stored procedures. I kept the connection open - normally we would not do that, but with a single connection we would not be using connection pooling, so I think my approach is more realistic.

I have a script that generates the table and stored procedures for database "benchmark". Please find/replace if your database is called something else. To empty the table I used truncate. Version of the SQL Server was SQL Server Express 2008.

Methods

I used my personal laptop for the test, which had 6GB of RAM and a 256 GB Samsung SSD. CPU would never reach 100% as the test was single-threaded. I am planning to run another set of tests in a multi-threaded fashion.

SQL Server Express and Cassandra were running as services while the others were running as normal exes (similar to a daemon). All of the servers were used with out-of-the-box settings and ran on their standard ports. All servers were running on localhost so no network latency was incurred.

I used a GUID string as the key (with no hyphens) and a randomised byte array of 4-20KB (random size) as the value. The process without storage had negligible effect - taking 0.2 millisecond per operation. Each operation consists of inserting the value against the key, retrieving value using the key and then asking for a non-existent key.

I ran 10,000 operations and took the average for each operation, measured in milliseconds (see results).
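The shape of the measurement can be sketched as below. This is not the benchmark code from GitHub: IKeyValueStore, InMemoryStore and BenchmarkSketch are hypothetical names, and the in-memory dictionary merely stands in for a real store so that the loop is runnable.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical abstraction over the store being measured.
public interface IKeyValueStore
{
    void Insert(string key, byte[] value);
    byte[] Get(string key); // returns null when the key does not exist
}

// Stand-in store so the sketch can run without any server.
public sealed class InMemoryStore : IKeyValueStore
{
    private readonly Dictionary<string, byte[]> _items = new Dictionary<string, byte[]>();

    public int Count { get { return _items.Count; } }

    public void Insert(string key, byte[] value) { _items[key] = value; }

    public byte[] Get(string key)
    {
        byte[] value;
        return _items.TryGetValue(key, out value) ? value : null;
    }
}

public static class BenchmarkSketch
{
    // Returns the average time in milliseconds for one insert + hit + miss cycle.
    public static double AverageMilliseconds(IKeyValueStore store, int operations, Random random)
    {
        var stopwatch = new Stopwatch();
        for (int n = 0; n < operations; n++)
        {
            // Data generation is kept outside the timed section.
            string key = Guid.NewGuid().ToString("N");                    // GUID with no hyphens
            var value = new byte[random.Next(4 * 1024, 20 * 1024 + 1)];  // 4-20KB
            random.NextBytes(value);
            string missingKey = Guid.NewGuid().ToString("N");

            stopwatch.Start();
            store.Insert(key, value);
            store.Get(key);
            store.Get(missingKey);
            stopwatch.Stop();
        }
        return stopwatch.Elapsed.TotalMilliseconds / operations;
    }
}
```

Timing each of the three calls with its own Stopwatch would give the per-operation breakdown rather than a single combined figure.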

Serialisation or conversion to base64 would happen for all but SQL Server and Redis.

If you are running the test yourselves, make sure you run the exe outside the IDE (Visual Studio), since otherwise RavenDB will appear to perform very poorly because searches for non-existent keys throw exceptions.

Results

Since performance degrades when there are more items in the database, I generated results in two scenarios: an empty database, and a database with 200,000 items in the same collection/table/familyColumn, etc.

This is the result for the empty database:



As can be seen, Redis is ultra fast (0.7ms) while RavenDB is slowest with 11.3ms:



Performance degrades in some stores when there are more items in the store (this is especially marked in SQL Server) but the ordering does not change. Redis still shines with the best performance and RavenDB is slow compared to the rest. This is the result when each store already contains 200,000 items:



This is the breakdown of the results:




Conclusion

First of all, it is not about who is the fastest; it is about making an informed decision considering all parameters, including speed. When choosing a NoSQL store you would consider other factors which do not come into this benchmark, and in fact some cannot be benchmarked.

In terms of results, Redis provides the best performance in all scenarios. I think this is also due in part to its excellent, totally async client library. Marc Gravell has done a ton of work to make the client efficient.

The performance of MongoDB, Cassandra and SQL Server is close. SQL Server proves to be a valuable tool in a simple key/value scenario, if you already pay for its license. RavenDB is the slowest of all - bearing in mind that it is still under development.

Feel free to run the test yourselves. Find the code on GitHub here.

In my next series of tests, I will run the tests in a multi-threaded fashion.

Saturday, 22 September 2012

Take your Web API service consumption up to 11 with CacheCow.Client

ASP.NET Web API is here and a lot of teams have already started building software using it. If you have followed this and other blogs on webapibloggers.com, you have probably seen the many possibilities and avenues this framework brings for designing and building clean and scalable services.

ASP.NET Web API exposes all the goodness of HTTP. Caching is an important feature of HTTP and ASP.NET Web API allows for building services and clients that take advantage of this feature. I, along with a few friends in the Web API community, have been busy building caching extensions for Web API in a project called CacheCow which is hosted on GitHub.

CacheCow has two separate components: server and client. These will be used independently by service providers (server) and their consumers (client).

The server component allows for easy handling of HTTP caching scenarios on the server by generating ETags, responding to cache validation requests (see earlier posts on this subject, especially this and for full list here), invalidating the cache and storing cache metadata. Storage of cache metadata is possible in various stores: currently in-memory and SQL Server have been implemented, and RavenDB and Redis are in the pipeline [UPDATE: RavenDB is implemented and NuGet package is available here]. Since storage has been abstracted away, any storage mechanism can be plugged in without making any server changes.

The client component looks after making cache-aware requests, cache validation and cache storage. Currently in-memory and file-based storage are available but other stores such as Redis, SQL Server, MongoDB and RavenDB are in the pipeline. Since storage has been abstracted away, any storage mechanism can be plugged in without making any client changes. One of the important features of storage is total and per-site quota.

It is important to note that while clients can be browsers or native apps (WPF, Silverlight, iOS, Android, etc), arguably more often than not they will be server components themselves. For example, an ASP.NET web site can call services of an ASP.NET Web API server. Middleware components could similarly use resources exposed by Web API. As such it is crucial that cache storage solutions are performant, scalable and configurable.

In this post, I will look into CacheCow.Client a little bit more. For more info, you can read previous posts on the topic in this blog.

CacheCow.Client alternatives

The only alternative to CacheCow.Client (that I am aware of) is using WinINET caching. Internet Explorer also uses this so the cache store will be the same. This is basically the Windows HTTP request stack, which has been exposed in the .NET Framework since v2.0 through WebRequest:

RequestCachePolicy policy = 
        new RequestCachePolicy( RequestCacheLevel.Default);
WebRequest request = WebRequest.Create(uri);
request.CachePolicy = policy;
WebResponse response = request.GetResponse();

As you can see, we can define a cache policy which will be applied to the request and according to the policy, Internet Explorer cache is used. Cache policy has a few possible values that are defined here. Notable values include:

  • CacheOnly: retrieves the request only from cache
  • BypassCache: does not use cache at all and goes straight to the server
  • CacheIfAvailable: retrieves from local or intermediate cache if the resource is available, otherwise retrieves from the server
  • Default: Similar to previous but current cache policy takes effect 

This same mechanism is now exposed in HttpClient but is basically built on top of WebRequest. Henrik fully covers this feature in his blog here.


Basically in order to use WinINET caching with the new Web API stack, you need to create an HttpClient but provide WebRequestHandler as the MessageHandler:

HttpClient client = new HttpClient(new WebRequestHandler()
                           {
                               CachePolicy = new RequestCachePolicy( RequestCacheLevel.Default)
                           });
// This is only a sample. Using .Result is not advised since it can lead to deadlock!
var httpResponseMessage = client.GetAsync("http://carmanager.softxnet.co.uk/api/car/3").Result;
var httpResponseMessage2 = client.GetAsync("http://carmanager.softxnet.co.uk/api/car/3").Result;

Using this feature, you can enable caching with little coding on the client.

Why I would choose CacheCow.Client rather than WinINET

Because it goes to 11! As we saw, it is very easy to get started with caching in HttpClient. But as we noted, it is very likely that HttpClient will be used in a server context, hence having a reliable and scalable solution is very important in production.

Here are a few advantages of CacheCow.Client over WinINET (or rather disadvantages of WinINET):

1. Caching will be shared with Internet Explorer

In a production scenario, you need an implementation which is predictable and reliable. If someone uses Internet Explorer on the machine, the storage area for your application's resources will be taken up by simple browsing. This can lead Internet Explorer to flush application resources in order to store resources for the browsing session.

2. You have little control over quota

With CacheCow.Client, you can define a global and a per-site quota for storage of resources, while no such feature is accessible in WinINET caching (although there could be some registry entries for changing these variables). Also, these variables could be overwritten by the installation of a newer version of Internet Explorer.

3. Cache is local to the machine and cannot be shared across servers

In a production scenario, it is desirable to be able to store caches in a central store so that network traffic and requests can be limited, whereas with WinINET caching each server will use its own local cache store.

4. WinINET is file-based

With WinINET, the cache is stored in a file location, while a high-throughput production environment requires robust caching using solutions such as Redis. By abstracting the storage, the CacheCow client can use any number of storage mechanisms such as Redis, MongoDB, RavenDB, etc.

5. CachePolicy is global for the HttpClient instance

Sometimes you might need to bypass caching. With WinINET, this has to be done by changing the policy at the client level, which applies to all requests for that HttpClient, while CacheCow.Client will not use cached resources if you set the Cache-Control header of the request to no-cache. This is basically the recommended implementation based on the HTTP specification (RFC 2616).
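Setting that header on a single request could look like the sketch below. It only uses the standard System.Net.Http types; NoCacheSketch and CreateBypassRequest are hypothetical names, and how CacheCow.Client reacts to the header is as described above rather than something this code demonstrates.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;

// Hypothetical helper: builds a GET request that asks caches not to serve a stored copy.
public static class NoCacheSketch
{
    public static HttpRequestMessage CreateBypassRequest(string uri)
    {
        var request = new HttpRequestMessage(HttpMethod.Get, uri);

        // Serialised on the wire as "Cache-Control: no-cache".
        request.Headers.CacheControl = new CacheControlHeaderValue { NoCache = true };
        return request;
    }
}
```

The request is then sent with SendAsync on the HttpClient; other requests made through the same client are unaffected because the header lives on the message, not on the client.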

6. With WinINET you do not know if request was retrieved from cache

With WinINET, there is no way to tell if a response was retrieved from the cache or from the origin server. CacheCow.Client adds an x-cachecow header to the response, carrying information which can be used for debugging and troubleshooting.

Introducing CacheCow.Client.FileCacheStore

Last week I finished the first version of a persistent cache store which is file based. It is available on NuGet (package name is CacheCow.Client.FileCacheStore) and the code is available on GitHub.

Using this persistent store is very easy. After getting the package from NuGet, create an HttpClient and pass it a CachingHandler (covered before here) as the delegating handler, setting its store to a new instance of FileStore. When creating a FileStore, you need to specify a folder for storing the cached resources:

var httpClient = new HttpClient(
 new CachingHandler(
 new FileStore("c:\\Cache"))
{
 InnerHandler = new HttpClientHandler()
});

That is all you have to do! Now all your requests will store cacheable resources in a file-based persistent store. 

Currently for quota it uses default values but I am in the process of exposing values so you can configure quota.

CacheCow roadmap

After exposing quota settings, I will be working on CacheCow.Client.RedisCacheStore for a high throughput production level cache storage.

Please keep me posted with your comments and feedback, and raise bugs/issues on the GitHub page. You are awesome!

Monday, 17 September 2012

Server-side Async: Careful with that Axe, Eugene

[Level T3]

In a previous post, I talked about the dangers lurking in doing server-side async operations in .NET 4.0. As you know, .NET 4.5 provides a much better syntax, with the async/await keywords turning your TPL Task-Soups into much more readable and organised code. But even so, async will make debugging your application more difficult, and bugs could take much longer to be reproduced, isolated and fixed.

Task-Soup

In .NET 4.0, when we chain continuations to create a composite task, we could end up with a few problems:

  1. We could end up with an unobserved exception problem. This is nicely described by Ayende here
  2. Nested lambda expressions could create unexpected problems with closure of variables
  3. The code becomes hard to read.
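The closure problem in item 2 can be reproduced in a few lines. In this sketch (ClosureSketch and RunAfterLoop are hypothetical names) the lambdas are created inside a for loop but only run as tasks after the loop has finished, which is what effectively happens when a continuation runs later than the code that created it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical helper showing how a lambda captures a variable, not its value.
public static class ClosureSketch
{
    public static int[] RunAfterLoop(bool copyPerIteration)
    {
        var work = new List<Func<int>>();
        for (int i = 0; i < 3; i++)
        {
            if (copyPerIteration)
            {
                int copy = i;           // a fresh variable for each iteration
                work.Add(() => copy);
            }
            else
            {
                work.Add(() => i);      // every lambda shares the single loop variable
            }
        }

        // The loop is over (i is now 3) before any of the lambdas run.
        Task<int>[] tasks = work.Select(f => Task.Factory.StartNew(f)).ToArray();
        Task.WaitAll(tasks);
        return tasks.Select(t => t.Result).ToArray();
    }
}
```

RunAfterLoop(false) yields 3, 3, 3 while RunAfterLoop(true) yields 0, 1, 2; with several levels of nested lambdas it becomes much harder to see which variable a continuation will actually read.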
On the third note, I will just bring an example from my own code in CacheCow. What is it that we are actually returning here?

return response.Then(r =>
{
 if (r.Content != null)
 {
  TraceWriter.WriteLine("SerializeAsync - before load",
   TraceLevel.Verbose);

  return r.Content.LoadIntoBufferAsync()
   .Then(() =>
   {
    TraceWriter.WriteLine("SerializeAsync - after load", TraceLevel.Verbose);
    var httpMessageContent = new HttpMessageContent(r);
    // All in-memory and CPU-bound so no need to async
    return httpMessageContent.ReadAsByteArrayAsync();
   })
   .Then( buffer =>
      {
       TraceWriter.WriteLine("SerializeAsync - after ReadAsByteArrayAsync", TraceLevel.Verbose);
       return Task.Factory.FromAsync(stream.BeginWrite, stream.EndWrite,
        buffer, 0, buffer.Length, null, TaskCreationOptions.AttachedToParent);                                                        
      }
     );

   ;
 }

Even looking at the brackets gives me a headache.

Is Async worth it at all?

Now, we talk a lot about async operations and their role in improving scalability. But really, is it worth it? How much scalability would it bring? Would it help or hinder?

The answer to these questions is yes, it does help. The more IO you do in your server-side actions, the more you benefit from improved scalability. So it is highly advisable to implement your ApiController actions as async by returning Task or Task<T>.

The truth is, it will help even with your non-IO-bound operations, although it is not advisable to use async in such scenarios. You can test it for yourself: create a sync and an async controller that do exactly the same operation and use a benchmarking tool to compare the performance.

I have a CarManager sample on GitHub which I use for testing CacheCow.Server and it contains two simple controllers: CarController and CarAsyncController. All these do is use an in-memory repository, and their GET only looks up a dictionary by its key:

// sync version
public Car Get(int id)
{
 return _carRepository.Get(id);
}


// async version (on another controller)
public Task<Car> GetAsync(int id)
{
 return Task.Factory.StartNew(() => _carRepository.Get(id));
}

So if you use a benchmarking tool such as ApacheBench (ab.exe), you can see a slight increase in throughput using the async controller. In my case, there was a 10% increase in throughput using async.

My ordeal with a bug

Development of CacheCow has been marred by the existence of a problem which, as we will see, turned out not to be in my code. I have been battling with this for a few weeks (on and off) and could not progress CacheCow development because of it.

OK, here is how my story begins; I think the Sherlock Holmes nature of this troubleshooting could be amusing for others too. After realising that a simple ContinueWith will not flow the context (see previous post), I set about replacing all such cases with Then from TaskHelpers, which checks for the existence of a SynchronizationContext and flows the context if it exists.

On the other hand, lostdev, one of CacheCow's most loyal users, informed me of an occasional null reference exception in CacheCow.Server. Now, I had already fixed a bug related to a null reference exception when a resource was being retrieved for the first time. I attributed the problem to that bug and reported that it was fixed in the current version.

So I started developing file-based cache storage for CacheCow.Client (which will have its own post very soon) and replaced all ContinueWith cases with Then.

And then I started to experience deadlocks in CacheCow.Client when I was using file-based caching and sending concurrent GET requests to the server. As soon as I removed FileStore and replaced it with InMemoryCacheStore, it would work. So I started searching through the client code: debug, look at the threads, debug again, change code, debug... to no avail. The problem appeared as soon as I used file-based caching, so it had to be on the client.

Then I noticed a strange thing: I could only run 4 concurrent calls and the rest would be blocked. Why? Then I started playing with the maxconnection property of the system.net configuration:

  <system.net>
    <connectionManagement>
      <add address="*" maxconnection="N" />
    </connectionManagement>
  </system.net>

and interestingly, by setting the N to a high number, I would get more concurrent connections - but only up to the number defined. Hmmm... so the requests do not quite finish. OK, I fired up Sysinternals' TcpView but unfortunately these connections did not show up (and I do not know why).

I was getting nowhere until I accidentally loaded an earlier version of the server code. To my surprise, I did not get the deadlock but this error which @Tugberk separately reported earlier but attributed to order of handlers:

[NullReferenceException: Object reference not set to an instance of an object.]
System.Web.Http.WebHost.HttpControllerHandler.EndProcessRequest(IAsyncResult result) +112
System.Web.Http.WebHost.HttpControllerHandler.System.Web.IHttpAsyncHandler.EndProcessRequest(IAsyncResult result) +10
System.Web.CallHandlerExecutionStep.OnAsyncHandlerCompletion(IAsyncResult ar) +129

OK, so it is probably happening on the server but the continuation code gets deadlocked on unhandled exception. I am close! So it was time to go to bed and I was positive that I would nail it the day after.

Funnily enough, I woke up the day after and, during my in-bed reading of tweets, stumbled on @Tugberk's tweet about an issue he had just created. It sounded exceedingly similar, so we double-checked our scenarios and it turned out that an HttpResponseMessage with an empty RequestMessage property is not handled in Web API and a null reference exception is thrown at the end of the response clean-up code. And the reason I was seeing it only with the file-based cache store was that the part of the server-side code that returns such responses was being triggered only when using the file-based store (since it was capable of persisting caches and was trying to validate the cache).

So as you can see, a seemingly unrelated problem can really confuse the nature of the bugs in async scenarios.

Conclusion

First of all, always use request.CreateResponse() instead of new HttpResponseMessage. I googled for cases of new HttpResponseMessage and found over 3000 entries. This is really dangerous; I think it is a bug in Web API that needs to be fixed. If you are using new, make sure you set the RequestMessage property.
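Where request.CreateResponse() is not at hand (for example in code that only references System.Net.Http), the safeguard of setting RequestMessage can be sketched as a tiny factory. ResponseFactory and CreateFor are hypothetical names used only for illustration; the only framework members relied on are HttpResponseMessage and its RequestMessage property.

```csharp
using System.Net;
using System.Net.Http;

// Hypothetical helper (not part of Web API or CacheCow).
public static class ResponseFactory
{
    // Creates a response that keeps a reference to the request it answers,
    // so code further down the pipeline never sees a null RequestMessage.
    public static HttpResponseMessage CreateFor(HttpRequestMessage request, HttpStatusCode statusCode)
    {
        return new HttpResponseMessage(statusCode) { RequestMessage = request };
    }
}
```

Inside a handler this would be called as ResponseFactory.CreateFor(request, HttpStatusCode.NotModified) instead of new HttpResponseMessage(HttpStatusCode.NotModified).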

And in general, be careful with doing server-side async operations. It is really a powerful axe but with it you are not quite sure what a slightly off swing could bring. Careful with that axe Eugene.

Friday, 24 August 2012

Server-side TPL Async: Don't risk learning these lessons the hard way

[Level T2]

There have been more than a few times that I have felt I know all about TPL. Only to realise sometime later I was wrong, very wrong. Now you might read this and say to yourself "Come on, this is basic stuff. I know it well, thank you very much". Well, it is possible that you could be right but I advise you carry on reading; what follows can surprise you.

None of what I am going to talk about is new, nor is it being blogged about for the first time. Brad Wilson has an excellent series on the topic here but this is to serve as a digest of his posts targeted at a broader audience, in addition to a few other points.

While this post is not directly related to ASP.NET Web API, most examples (and cases) are related to day-to-day scenarios we encounter in ASP.NET Web API.

Remember this covers pre-async/await keywords in .NET 4.5 and what you need to do if you are using .NET 4.0 and not async/await. Using async/await will cover you for some of the problems described below but not all.

Don't fire and forget

Tasks are ideal for decoupling pieces of functionality. For example, I can perform a database operation and at the same time audit the operation by outputting a log entry, writing to a file, etc. Using tasks I can de-couple these operations so that my database task returns without having to wait for the audit to finish. This makes sense since the database operation is high priority but the audit is low priority:

private void DoDbStuff()
{
   CallDatabase();
   // doing audit entry asynchronously not to bog down database operation
   Task.Factory.StartNew(()=> AuditEntry("Database stuff was done"));
}

In fact, let's say we do not even care whether the audit is successful or not, so we just fire and forget; at most the audit will fail, which is low priority. OK, it all seems innocent?

No! This innocent operation can bring down your application. The reason is that all async exceptions must be observed even if you do not care about them. If you don't observe them, they will haunt you when you least expect it: at the time the finalizer for the task is run by the GC. Such an unhandled exception will kill your app.

The link above talks about various ways of observing an exception. The most practical is to use a continuation and access the .Exception property of the task (just accessing the property is enough, does not need to do anything with the exception itself).

private void DoDbStuff()
{
   CallDatabase();
   // doing audit entry asynchronously not to bog down database operation
   Task.Factory.StartNew(()=> AuditEntry("Database stuff was done"))
      .ContinueWith(t => t.Exception); // fire and forget!
}
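The continuation above can be wrapped into a reusable helper so that every fire-and-forget call observes its exception in the same way. This is only a sketch: TaskObservation, FireAndForget and the onError callback are hypothetical names, and it assumes that swallowing failures of the callback itself is acceptable.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper for illustration only.
public static class TaskObservation
{
    // Starts the work on the thread pool and observes any exception it throws.
    // The returned task is the continuation, which itself never faults.
    public static Task FireAndForget(Action work, Action<Exception> onError)
    {
        return Task.Factory.StartNew(work).ContinueWith(t =>
        {
            // Reading t.Exception is what marks the exception as observed.
            AggregateException error = t.Exception;
            if (error == null || onError == null)
                return;
            try
            {
                onError(error.Flatten());
            }
            catch (Exception)
            {
                // A failing logger must not create a new unobserved exception.
            }
        }, TaskContinuationOptions.ExecuteSynchronously);
    }
}
```

With this in place, the audit call becomes TaskObservation.FireAndForget(() => AuditEntry("Database stuff was done"), LogException), where LogException is whatever logging method you already have.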

Another option, which is more of a safeguard against accidentally unobserved exceptions, is to subscribe to the UnobservedTaskException event on TaskScheduler and mark the exception as observed:

 TaskScheduler.UnobservedTaskException +=
  (sender, e) => { LogException(e.Exception); e.SetObserved(); };

So we register a handler to handle unobserved exceptions and this way they will be "observed". If you need to read more on this, have a look at Jon Skeet's post here.

This problem has made Ayende Rahien run for the hills.

Respect SynchronizationContext

Uncle Jeffrey Richter tells us that

By default, the CLR automatically causes the first thread's execution context to flow to any helper threads.

And then we also learn that we can use ExecutionContext.SuppressFlow() to suppress flow of the thread context.

Now, what happens when we use ContinueWith()? It turns out unlike standard thread switches, context does not flow (I do not have a reference, if you do please let me know). This will help with improving performance of asynchronous task as we know context switching is expensive (and big part of it is context flow).

So why is it important? It is important because so many developers are used to HttpContext.Current. This context is stored in the thread storage area and passed along at the time of context switching. So if the context does not flow, HttpContext.Current will be null.

SynchronizationContext is a similar (but not same) concept. It is about a state that can be shared and used by different threads at the time of switching. I cannot explain this better than Stephen here. So using Post on SynchronizationContext ensures that the execution of continuation will happen in the same context and not necessarily by the same thread.

So basically the idea is that if you are in a Task pipeline (best example being MessageHandlers in ASP.NET Web API), you need to take responsibility for passing the context along the pipeline.

This is a snippet from ASP.NET Web API Source code that displays the steps. First of all you check to see if current context is null, if it is not then you have to use Post() to flow the context:

SynchronizationContext syncContext = SynchronizationContext.Current;

    TaskCompletionSource<Task<TOuterResult>> tcs = new TaskCompletionSource<Task<TOuterResult>>();

    task.ContinueWith(innerTask =>
    {
        if (innerTask.IsFaulted)
        {
            tcs.TrySetException(innerTask.Exception.InnerExceptions);
        }
        else if (innerTask.IsCanceled || cancellationToken.IsCancellationRequested)
        {
            tcs.TrySetCanceled();
        }
        else
        {
            if (syncContext != null)
            {
                syncContext.Post(state =>
                {
                    try
                    {
                        tcs.TrySetResult(continuation(task));
                    }
                    catch (Exception ex)
                    {
                        tcs.TrySetException(ex);
                    }
                }, state: null);
            }
            else
            {
                tcs.TrySetResult(continuation(task));
            }
        }
    }, runSynchronously ? TaskContinuationOptions.ExecuteSynchronously : TaskContinuationOptions.None);

    return tcs.Task.FastUnwrap();

There is a horrifying fact here. Most of the DelegatingHandler code out there (including some of mine) in various samples around the internet does not respect this. Of course, looking at the ASP.NET Web API source code reveals that they do indeed take care of this in their TaskHelpers implementations, and Brad tried to make us aware of it in his blog series. But I think we have not paid enough attention to the implications of ignoring SynchronizationContext.

Now my suggestion is to use the TaskHelpers and its extensions in the ASP.NET Web API (it is open source) or use the one provided in Brad's post.

Don't use Task for CPU-bound operations

Overhead of asynchronous operations is not negligible. You should only use async if you are doing an IO-bound operation (calling another web service/API, reading a file, reading a lot of data from database or running a slow query). I personally think even for normal IO operations, sync is more performant and scalable.

As we have talked about it here, the point about asynchronous programming on server-side is releasing the thread to be able to serve another request. Tasks are normally served by the CLR thread pool. If server already needs managed threads for its operations, it will be using CLR thread pool too. This means that by doing async operations you could be stealing threads needed for server's normal operations. A classic example is ASP.NET, so you should be careful to use async only if needed.

ContinueWith is Evil!

I think by now you should know why standard ContinueWith can be evil. First of all, it does not flow the context. It also makes it easy for unobserved exceptions to creep into your code. My suggestion is to use .Then() from ASP.NET Web API's TaskHelpers.

Performance comparison

I think it is still early days - but I must say I would love to do a benchmark to quantify overhead of server-side asynchronous programming. Well if I do, this place will be where the result will first appear :)

So. Do I think I know all about TPL now? Hardly!

Tuesday, 31 July 2012

Using range header for retrieving range of IEnumerable<T> in ASP.NET Web API


Introduction

[Level T3] In this post, we talk about using HTTP's Range header to achieve requesting data ranges for entities.

Background

The HTTP spec defines a series of headers that a client can use to request partial content. These operations are optional (in most cases the spec uses the word SHOULD) but most servers implement them and browsers have increasingly been using them. If you have ever resumed downloading a big file from the internet, then you have used this feature (in fact all browsers use range if supported by the server). In this case, the client keeps requesting chunks and builds up the file until it is fully downloaded.

So here is how it works in a nutshell:

  1. Server can optionally inform clients, while serving a resource, that it supports partial content. It does that by sending an Accept-Ranges header with a value of the unit it supports, normally bytes. In our case, our server sends back a custom unit that we call x-entity.
  2. Client, either informed by the server of the partial content feature via the Accept-Ranges header or simply trying its luck, sends a request with a Range header with a value of [unit]=[from]-[to], for example bytes=1024-2047. In this example, the client asks for the second KB of the file. Range header can specify multiple ranges, for example bytes=500-600,601-999
  3. Server will return the range requested and include a Content-Range header with value [units] [from]-[to]/[TotalCount]. For example bytes 1024-2047/12345678. It also returns status code 206 (partial content) to inform the client that the content is partial. If the server cannot satisfy the range specified, it will send back status code 416.
Spec does allow for custom units, so a server can implement its own and inform the client of the unit using the Accept-Ranges header. Now the idea is that in ASP.NET Web API, we normally build many actions that return IEnumerable<T>. What if we could use the range to specify the range of the enumerable to be returned? Hmmm....

This feature can be useful in case of pagination on the client so that instead of API implementing a range parameter all the time, we just use HTTP's built-in features and encapsulate the implementation in a reusable component, in this case a filter.

So in the code to follow, we define a custom range unit and call it x-entity. "x-" prefix is a common naming convention on the web to specify custom tokens that are not part of the canonical tokens defined in RFC specs.
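To make the header format concrete, here is a rough sketch of parsing a single-range value such as x-entity=2-5 or x-entity=2-. EntityRange and TryParse are hypothetical names and not the code from the GitHub project; multiple ranges and suffix ranges (such as -5) are deliberately rejected to keep the example short.

```csharp
using System;
using System.Globalization;

// Hypothetical parser for illustration; handles "[unit]=[start]-[end]" and "[unit]=[start]-".
public static class EntityRange
{
    public static bool TryParse(string headerValue, string expectedUnit, out int start, out int? end)
    {
        start = 0;
        end = null;
        if (string.IsNullOrWhiteSpace(headerValue))
            return false;

        // Split "x-entity=2-5" into the unit and the range specification.
        string[] unitAndRange = headerValue.Split('=');
        if (unitAndRange.Length != 2 ||
            !string.Equals(unitAndRange[0].Trim(), expectedUnit, StringComparison.OrdinalIgnoreCase))
            return false;

        // Split "2-5" (or "2-") into its two bounds.
        string[] bounds = unitAndRange[1].Trim().Split('-');
        if (bounds.Length != 2)
            return false;
        if (!int.TryParse(bounds[0], NumberStyles.None, CultureInfo.InvariantCulture, out start))
            return false;

        // An empty upper bound means "everything from start onwards".
        if (bounds[1].Length == 0)
            return true;

        int parsedEnd;
        if (!int.TryParse(bounds[1], NumberStyles.None, CultureInfo.InvariantCulture, out parsedEnd) ||
            parsedEnd < start)
            return false;
        end = parsedEnd;
        return true;
    }
}
```

A filter could call this with the raw Range header value and fall back to returning the full collection whenever it returns false.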

Implementing range in ASP.NET Web API

So where is the best place to implement this? We have these requirements:

  • Access to request headers to read Range header
  • Access to response header to set Accept-Ranges
  • Access to content headers to set Content-Range header
  • Access to content so that it can filter IEnumerable<T>
DelegatingHandler might look promising but by the time it accesses the content, it is already turned into stream by MediaTypeFormatters.

MediaTypeFormatter is an interesting option. I actually created a RangeMediaTypeFormatterWrapper that would wrap the MTFs and intercept the content and if it was of type IEnumerable<T>, it would apply the filtering. Initially it seems MTF does not have access to request headers but in here we had an interesting discussion and it turns out it can access request using GetPerRequestFormatterInstance. But it also needs access to response headers.


So Glenn Block suggested filters and after some thoughts, it seems to be the right approach considering current limitations of MTF. The only drawback is that it has to be explicitly defined on the action - which can in some cases be a blessing in fact. In any case, filter approach as you will see is clean and does everything in the same place.

Filters in ASP.NET Web API are not much different from those in MVC. You get two methods: before (OnActionExecuting) and after (OnActionExecuted) the action where you can change values in request, response, action arguments or simply examine values.

Using the code

You can get the source code from GitHub. As you can see, we have a single controller called CarController. Project is running on port 50714 on my machine so all examples will include this port - yours could be different. So download the project, build and run it. I created a client project to implement all the steps below using HttpClient, but there is an issue with the ASP.NET Web API implementation of the Range header: regardless of the unit set in the Range header, bytes is always sent to the server.

As you can see, we have a simple action with EnableRange filter defined on it:

[EnableRange]
public IEnumerable<Car> Get()
{
 return CarRepository.Instance.Get();
}

So now we use Fiddler (or a similar tool capable of sending HTTP requests, such as Postman) to send this request:

GET http://localhost:50714/api/Car HTTP/1.1
User-Agent: Fiddler
Host: localhost:50714

We will get all the cars in our repository in JSON format. But note the Accept-Ranges header with value of x-entity:

HTTP/1.1 200 OK
Cache-Control: no-cache
Pragma: no-cache
Content-Type: application/json; charset=utf-8
Expires: -1
Accept-Ranges: x-entity
Server: Microsoft-IIS/8.0
Date: Tue, 31 Jul 2012 18:59:56 GMT
Content-Length: 1125

[{"Id":1,"Make":"Vauxhall","Model":"Astra","BuildYear":1997,...

So this should tell the client that it can use Range header. Now let's send a range header requesting 3rd item to 6th item (total of 4 items):

GET http://localhost:50714/api/Car HTTP/1.1
User-Agent: Fiddler
Host: localhost:50714
Range: x-entity=2-5

And here is the response:

HTTP/1.1 206 Partial Content
Cache-Control: no-cache
Pragma: no-cache
Content-Type: application/json; charset=utf-8
Content-Range: x-entity 2-5/10
Expires: -1
Server: Microsoft-IIS/8.0
Date: Tue, 31 Jul 2012 19:00:19 GMT
Content-Length: 447

[{"Id":3,"Make":"Toyota","Model":"Yaris","BuildYear":2003,"Price":3750.0,...

Note the Content-Range header above and also the fact that we got the entities we requested in JSON (not shown fully above). So it tells us that it has sent back items from index 2 to index 5 and total of items is 10. Also note the 206 response.

Server can send back * if the number of items is not known at the time of serving the request. I have used this option since I do not want to run a Count() on an IEnumerable<T>. It is very likely that the data is being retrieved from a database and we do not want to load the whole table into memory. So my approach is to try to cast the value to ICollection using the as keyword. If the cast succeeds then I get the count, otherwise I set the count to *.
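The count-or-star decision can be sketched roughly as follows. ContentRangeFormatter and Format are hypothetical names chosen for this example; the sketch only builds the header value and assumes the first and last indices have already been worked out.

```csharp
using System.Collections;
using System.Globalization;

// Hypothetical helper for illustration only.
public static class ContentRangeFormatter
{
    // Builds "[unit] [first]-[last]/[total]" and uses "*" when the total is not cheaply known.
    public static string Format(string unit, int first, int last, object value)
    {
        // Materialised collections (arrays, List<T>, ...) expose Count without enumerating;
        // lazy sequences fail the cast, so no enumeration is forced.
        ICollection collection = value as ICollection;
        string total = collection != null
            ? collection.Count.ToString(CultureInfo.InvariantCulture)
            : "*";
        return string.Format(CultureInfo.InvariantCulture, "{0} {1}-{2}/{3}", unit, first, last, total);
    }
}
```

For a List<Car> holding ten cars this yields x-entity 2-5/10, while a lazily evaluated query yields x-entity 2-5/*.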

Another option in the spec is that the to part of the range is optional, so the client can send a range header with value 2-. In this case, we must skip the first 2 items and return the rest:

GET http://localhost:50714/api/Car HTTP/1.1
User-Agent: Fiddler
Host: localhost:50714
Range: x-entity=2-

In this case, our server returns this response:

HTTP/1.1 206 Partial Content
Cache-Control: no-cache
Pragma: no-cache
Content-Type: application/json; charset=utf-8
Content-Range: x-entity 2-9/10
Expires: -1
Server: Microsoft-IIS/8.0
Date: Tue, 31 Jul 2012 21:18:05 GMT
Content-Length: 897

[{"Id":3,"Make":"Toyota","Model":"Yaris","BuildYear":2003,"Price":3750.0, ....

Notes on implementation

The crux of the implementation is to call Skip() and Take() on IEnumerable<T>. Our code has to be able to work with all types hence cannot be generic. On the other hand, filters (and attributes as a whole) cannot use generics. As such we just have to use reflection to do this:

// Skip and Take are static extension methods, hence the null target passed to Invoke below
var skipMethod = t.GetMethods().Where(m => m.Name == "Skip" && m.GetParameters().Count() == 2)
 .First().MakeGenericMethod(_elementType);
var takeMethod = t.GetMethods().Where(m => m.Name == "Take" && m.GetParameters().Count() == 2)
 .First().MakeGenericMethod(_elementType);
...
value = skipMethod.Invoke(null, new object[] { value, from });
if(to.HasValue)
 value = takeMethod.Invoke(null, new object[] { value, to - from + 1 }); // range is inclusive

Also it is useful to note that the return value is not accessible in the filter and we have to resort to using Content and casting it to ObjectContent and use the Value property.
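For readers who want to try the reflection calls in isolation, here is a self-contained variation. RangeSlicer and Slice are hypothetical names, and the sketch assumes the methods are looked up on System.Linq.Enumerable. It also checks that the second parameter is an int, because newer versions of .NET add a Take overload that accepts a Range, which would make a lookup based only on the parameter count ambiguous.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Hypothetical helper for illustration only.
public static class RangeSlicer
{
    // Finds Enumerable.Skip<T>(IEnumerable<T>, int) or Enumerable.Take<T>(IEnumerable<T>, int)
    // and closes it over the element type known only at runtime.
    private static MethodInfo FindIntOverload(string name, Type elementType)
    {
        return typeof(Enumerable)
            .GetMethods(BindingFlags.Public | BindingFlags.Static)
            .First(m => m.Name == name
                        && m.GetParameters().Length == 2
                        && m.GetParameters()[1].ParameterType == typeof(int))
            .MakeGenericMethod(elementType);
    }

    // Returns items start..end (inclusive, zero-based); a null end means "to the end".
    public static object Slice(object sequence, Type elementType, int start, int? end)
    {
        object value = FindIntOverload("Skip", elementType)
            .Invoke(null, new object[] { sequence, start });
        if (end.HasValue)
        {
            value = FindIntOverload("Take", elementType)
                .Invoke(null, new object[] { value, end.Value - start + 1 });
        }
        return value;
    }
}
```

The result is still a lazily evaluated IEnumerable<T> of the original element type, so it can be put back into the response content for the formatter to serialise.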

Conclusion

Range header defined in HTTP spec is useful in retrieving partial content. We can use custom units and we defined x-entity unit to enable selecting a range of entities (commonly used in pagination scenarios) and implemented it using a filter.

Monday, 30 July 2012

Serialising request and response in ASP.NET Web API


Introduction

[Level T3] This is a short post on serialising/deserialising HTTP request and response messages in ASP.NET Web API.  Serialising messages manually can be achieved but is hard work and you can run into various problems. ASP.NET Web API provides a means of achieving this through HttpMessageContent. This post is a follow-up to this discussion.

Background

There are many cases where you could be interested in serialising HttpRequestMessage or HttpResponseMessage. For me, I needed this to implement caching features on the HttpClient in CacheCow framework.

Technically speaking HTTP messages arrive in serialised format and all we need is access to the raw stream coming from server - as such no processing would be required. Unfortunately this is not possible since Web API does not read the message as a raw stream and then process it, instead it starts by reading various chunks, parsing it as it goes.

However, the ASP.NET team implemented a feature that can be used for serialisation/deserialisation of request and response messages. If you have read Brad Wilson's batching post, you have probably noticed that HttpMessageContent can be used for implementing client-server batching. Now we will use this for serialisation.

HttpMessageContent

RFC 2616 in its appendices defines content types "message/http" and "application/http". application/http is a content-type that can contain more than one request or response.

Did we not have this in multi-part content-type? As we know, we can include different request or response parts in the same message and each part gets its own share of the headers, so what is the difference?

Well the difference is with multi-part, each part can only have headers related to content. And above all, they share the same status code. In application/http, each "part", as it were, is a complete request or response. For example, the requests each will have their own URI and responses their own status code.

HttpMessageContent can encapsulate multiple HttpRequestMessage or HttpResponseMessage but in our case we just need a single request or response.

Serialiser interface

Let's define an interface for our serialiser:

public interface IHttpMessageSerializer
{
 void Serialize(HttpResponseMessage response, Stream stream);
 void Serialize(HttpRequestMessage request, Stream stream);
 HttpResponseMessage DeserializeToResponse(Stream stream);
 HttpRequestMessage DeserializeToRequest(Stream stream);
}

UPDATE: Latest implementation is fully async and can be found as part of CacheCow library here.

Serialisation

In order to serialise, we need to create a new HttpMessageContent passing request or response and then use ReadAsByteArrayAsync to read the whole message as a byte array:

var httpMessageContent = new HttpMessageContent(request);
var buffer = httpMessageContent.ReadAsByteArrayAsync().Result;

As you can see it is very easy to serialise. Now the only caveat is that if you are serialising in a delegating handler, this will consume the message content stream so that it cannot be read further down the stream. If you do, you will see this error message:

The stream was already consumed. It cannot be read again.

The trick (for now) is to call the method ReadAsByteArrayAsync to force the content to be loaded into the buffer. Although we would not need the buffer we read (since the actual reading will happen inside HttpMessageContent), next time the content will be read from the buffer and not from the network. In my implementation I have made it optional whether to pre-read the content into the buffer.

Deserialisation

The trick with deserialisation is to create a normal HttpRequestMessage or HttpResponseMessage and set the content-type header to "application/http;msgtype=request" or "application/http;msgtype=response" accordingly. Then we use the special extension method (ReadAsHttpRequestMessageAsync or ReadAsHttpResponseMessageAsync) to read the content back into a message:

var request = new HttpRequestMessage();
request.Content = new ByteArrayContent(memoryStream.ToArray());
request.Content.Headers.Add("Content-Type", 
    "application/http;msgtype=request");
return request.Content.ReadAsHttpRequestMessageAsync().Result;

As you can see, all the heavy lifting happens inside the HttpMessageContent and there is really little code that we need to write.

Conclusion

We can use HttpMessageContent to serialise/deserialise request/response in ASP.NET Web API. Full implementation can be found as a GitHub gist here as part of CacheCow library here. This implementation is fully Async and takes advantage of IO completion ports exposed in Begin/End methods.

One word of caution is on cases where message needs to be used after serialisation - which would comprise many cases including serialisation in DelegatingHandler. In these cases we need to invoke ReadAsByteArrayAsync (or similar) to ensure the content is read into the buffer.

Sunday, 22 July 2012

Introducing CacheCow: An HTTP caching framework for server and client


CacheCow

[Level T2] This is a short post to introduce CacheCow, an Open Source framework for HTTP caching on the client and server in ASP.NET Web API.

As some of you probably know, I have been working on caching for a while. If you go back to my post on CachingHandler, you will see that I contributed server-side HTTP caching implementation to the WebApiContrib project and included samples and tests.

However, I realised that even more important part of the caching needs to be implemented on the client. Also implementations of the IEntityTagStore on various databases (in-memory and persisted) and client-side's cache storage all need their own project so this is bigger than just a feature on WebApiContrib. As such, I have decided to start a new project and port the server-side from WebApiContrib. [Please bear in mind, the code in the WebApiContrib will be maintained and supported so if you are using it and have problems or experience bugs, please ping me in twitter or GitHub.]

So the CacheCow framework has been born. The name itself is a play on Cash Cow, meaning it does the heavy lifting for caching with minimal set up and hence promises good return on investment :). Project is open source and hosted on GitHub. And yes, I do accept pull requests; Tugberk has done a great job (also with some help from Sayed Ibrahim Hashimi, for which I am so grateful) and automated the whole build and NuGet package generation, and his PR was merged pretty much immediately. But please contact me beforehand about the work you would like to do - there is a ton of interesting work to do.

How to use CacheCow.Server

At the moment, only server-side CacheCow is ready for use. All you need to do is to use NuGet to get the package:

PM> Install-Package CacheCow.Server

This will add the CacheCow.Server and CacheCow.Common DLLs and the rest is all the same as the CachingHandler post and samples. Just add CachingHandler to the config:

GlobalConfiguration.Configuration.MessageHandlers.Add(new CachingHandler());

This will add this handler with all the default settings and will store the cache state in the memory. There are many dials that you can turn and configure the handler according to your resource organisation - just see the sample in WebApiContrib. This sample connects to the CarManager sample and tests various scenarios for a fairly complex resource organisation.

How to use CacheCow.Server.EntityTagStore.SqlServer

As I pointed out above, by default cache state is stored in memory. This is OK for single server or test scenarios but in case of a server farm you would like the cache state to be maintained for the whole farm, so that when a resource's cache is invalidated, it is done for all servers. In this case you need a central EntityTagStore (cache state store).

Building a cache state store is pretty easy and all you have to do is to implement IEntityTagStore interface which has 5 methods. Since cache might be invalidated not just for a CacheKey (previously called EntityTagKey) but also for a RoutePattern (see CachingHandler post), key-value stores are not rich enough to provide this but conventional databases can be used.

So I have implemented this for SQL Server. In order to get this EntityTagStore, just use NuGet:

PM> Install-Package CacheCow.Server.EntityTagStore.SqlServer

This will download the DLL and also a script file named script.sql located in <project root>\packages\CacheCow.Server.EntityTagStore.SqlServer.0.1.0\scripts

So create a database and run this script against it, and you will get one table and several stored procedures. Then in order to use SqlServerEntityTagStore, create an instance and pass it to the CachingHandler constructor. The default constructor relies on a connection string named "EntityTagStore" being present in your web.config and pointing to your database.

GlobalConfiguration.Configuration.MessageHandlers.Add(
 new CachingHandler(new SqlServerEntityTagStore()));


Alternatively, pass the connection string to the constructor. That is all you have to do to use the SQL Server EntityTagStore.

Roadmap

I am working on the client CachingHandler to be used with HttpClient. This will initially come with an InMemoryCacheStore but then various persistent cache store implementations can be done (for example file-based, SQL CE, etc)

Also CarManager sample needs to be ported for CacheCow which I am hoping to do very soon - although the old sample does work well.

Any question or comment please ping me on twitter (@aliostad) or GitHub.


Sunday, 1 July 2012

The place of Extension Methods in Software Design


Introduction

[Level T3] Extension methods - introduced back in C# 3.0 (.NET 3.5) - are useful tools in a .NET developer's toolset. Apart from their usefulness, the extension method is not an inherently object-oriented concept, yet we use them more and more in our API designs.

Extension methods were initially used for classes whose source code we did not own. But nowadays we are using them increasingly for types whose source we do own.

This post aims to have an in-depth look at the place of extension methods in the API design.

Background

Definition of Extension Methods according to MSDN is:
Extension methods enable you to "add" methods to existing types without creating a new derived type, recompiling, or otherwise modifying the original type.
So as we all know, in order to create an extension method, we need to:
  1. Create a static non-generic class
  2. Create a static method
  3. Make the first parameter the type we want to extend, prefixed with the keyword this
For example (and one of my favourites), we can add this extension method to object to replicate T-SQL's IN operator:

public static bool IsIn(this object item, params object[] list)
{
 if (list == null || list.Length == 0)
  return false;
 return list.Any(x => x == item);
}

Now I can use this like an instance method:

string ali = "ali";
var isIn = ali.IsIn("john", "jack", "shabbi", "ali"); // isIn -> true

If I may digress a little bit here, this is not such a great implementation since:

var isInForInt = 1.IsIn(2, 3, 1); // isInForInt -> false!

As you have probably guessed, defining the extension method on object causes the integers to be boxed, and the == operator then compares the boxed objects by reference rather than the integers themselves, so they surely won't be equal. A generic implementation solves the problem:

public static bool IsIn<T>(this T item, params T[] list)
{
 if (list == null || list.Length == 0)
  return false;
 return list.Any(x => EqualityComparer<T>.Default.Equals(x, item));
}

Reality is, extension method only gives the illusion of method being on the type and what is being compiled is nothing but a plain old static method call. Having a look at the IL generated confirms this:

IL_0056:  call       bool ConsoleApplication1.ExtensionMethods::IsIn<int32>(!!0, !!0[])

ExtensionMethods above is the name of the static class I created for this method.

So extension methods are basically the same utility or helper static methods we have been writing, only glamorised to look like instance methods. Yet they have some additional benefits:

  1. It leads to much more readable and natural code.
  2. I do not have to know the name of the helper class whose static methods I am using - in fact the behaviour has nothing to do with the static class. That class is not really a class in a true sense since it does not exert state or behaviour. And that is why it has to be declared static: to make clear its design intentions.
  3. Fluent API can be easily designed for older types without touching them.
  4. Since it is not really an instance method call, it can be called on null instances. This is a desirable side effect since we can check for nulls in the extension method and cater for them (none of the "object reference not set to an instance..." nonsense!)
  5. Since it can be called on null instances, some type information for the null instance can be determined in the extension method (although it may be a base type or an interface, since it is the compile-time type), whereas a null reference on its own carries no type information at runtime.
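
Points 4 and 5 can be illustrated with a minimal sketch. The class and method names below (NullFriendlyExtensions, OrEmpty, StaticTypeName) are made up for illustration and are not part of the framework:

```csharp
using System;

public static class NullFriendlyExtensions
{
    // Hypothetical helper: safe to call on a null string because the call
    // compiles to NullFriendlyExtensions.OrEmpty(value), a static call.
    public static string OrEmpty(this string value)
    {
        return value ?? string.Empty;
    }

    // Hypothetical helper: reports the compile-time type of the expression,
    // which is still known even when the reference itself is null.
    public static string StaticTypeName<T>(this T item)
    {
        return typeof(T).Name;
    }
}

public static class NullDemo
{
    public static void Main()
    {
        string missing = null;
        Console.WriteLine(missing.OrEmpty().Length);     // 0, no NullReferenceException
        Console.WriteLine(missing.StaticTypeName());     // String

        IComparable asInterface = null;
        Console.WriteLine(asInterface.StaticTypeName()); // IComparable
    }
}
```

Note that the type reported is that of the variable, not of whatever object it might have pointed to, which is why it can be a base type or an interface.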

Extension methods when we do not own the type

This has been the typical scenario. We have all wished that, for example, string had such and such a method, and there was no way to achieve this. Now with extension methods we can. This scenario also applies where an API has already been released (and you own the API) but can no longer be changed. In this case, your API can be enhanced using extension methods.

With such usage there is no decision to be made, since the design has already been done. Extension methods serve merely as a nice utility and syntactic sugar.

One of the most useful use cases I have found is function composition when doing functional programming in C# (see some examples in my other posts here and here). This is especially important since you can achieve readability by method chaining. For example:

  usingReflection
   .Repeat(TotalCount)
   .OutputPerformance(stopwatch, performanceOutput)();

In addition to the examples above, let's have a look at a simple example to swallow the exception and optionally log the error (note how the implementation reuses itself to swallow errors that could arise from logging):

public static class WrapSwallowExtension
{
 public static Action<T> WrapSwallow<T>(this Action<T> action, Action<Exception> logger = null)
 {
  return (T t) =>
  {
   try
   {
    action(t);
   }
   catch (Exception e)
   {
    if (logger != null)
     logger.WrapSwallow()(e);
   }
  };
 }
}

So I can use:

string myString = null;
Action<string> action = (s) => { s.ToLower(); }; // throws NullReferenceException when s is null
action.WrapSwallow()(myString); // swallowed

Here I created a new action just for the sake of the example, but when I am working in a functional scenario I already have my actions and functions to hand.

Extension methods when we own the type

I have heard some saying "Why would you wanna use an extension method when you own the code? Just add the method to the type."

There are cases where you own the type yet you would still use an extension method. Let's have a look at a few such scenarios.

Extension methods for interfaces

This is the most obvious use case. Most of the Linq library is implemented using extension methods (while Microsoft owns the types). An interface cannot contain implementation, but you can use extension methods to attach implemented enhancements to your interfaces.

Without getting into the debate whether implementing ForEach against IEnumerable<T> is semantically correct or not (don't! I am not going there) you might have noticed that the method exists only on List<T>, so you have to call ToList() to use it. Well, this can easily be done for IEnumerable<T> too:

public static class IEnumerableExtensions
{
 public static IEnumerable<T> ForEachOne<T>(this IEnumerable<T> enumerable, Action<T> action)
 {
  foreach (var t in enumerable)
  {
   action(t);
   yield return t;
  }
 }
}

In this particular example, I do not own the source for IEnumerable<T>, but even if I did, extension methods would be the only way to associate implementation with the interface.
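
One thing worth keeping in mind about the implementation above: because it uses yield return, it is lazy, so the action only runs when the result is enumerated. A small sketch (DeferredDemo is just an illustrative name, and ForEachOne is repeated so the sample stands alone):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class IEnumerableExtensions
{
    // Same shape as ForEachOne above: lazy, because of yield return.
    public static IEnumerable<T> ForEachOne<T>(this IEnumerable<T> enumerable, Action<T> action)
    {
        foreach (var t in enumerable)
        {
            action(t);
            yield return t;
        }
    }
}

public static class DeferredDemo
{
    public static void Main()
    {
        var seen = new List<int>();
        var pipeline = new[] { 1, 2, 3 }.ForEachOne(seen.Add);

        Console.WriteLine(seen.Count); // 0: nothing has been enumerated yet

        pipeline.ToList();             // enumerating runs the action for each item
        Console.WriteLine(seen.Count); // 3
    }
}
```

This makes the method chainable with the rest of Linq, at the cost that forgetting to enumerate means the action never runs.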

Overloading

This is the next common case. If you are familiar with ASP.NET MVC, you have probably noticed that most of the functionality of the HtmlHelper class has been implemented using extension methods.

Html.TextBoxFor(x => x.Name)

In fact all different overloads of HtmlHelper for Textbox, RadioButton, Checkbox, TextArea, etc are implemented using extension methods. So the HtmlHelper class itself implements a core set of functionality which will be called by these extension methods.

Now let's look at this fictional interface:

public interface IDependencyResolver
{
   object Resolve(Type t);
   T Resolve<T>();
}

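The interface has two methods for resolving a type: one takes the type as a generic parameter, the other as a Type instance. Whoever implements this will most likely implement the non-generic method and then make the generic method call the non-generic one. Instead, we can keep only the non-generic method on the interface and provide the generic one as an extension method: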

public interface IDependencyResolver
{
 object Resolve(Type t);
}

public static class IDependencyResolverExtension
{
 public static T Resolve<T>(this IDependencyResolver resolver)
 {
  return (T) resolver.Resolve(typeof (T));
 }
}


This will help to:
  • Trim down the interface and make it terser so it can express its design intentions more clearly
  • Save all implementers of the interface having to repeat the same bit of code
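
To see how this plays out for an implementer, here is a rough sketch. DictionaryResolver and its Register method are hypothetical and exist only for this illustration; the interface and the extension are repeated so the sample stands alone:

```csharp
using System;
using System.Collections.Generic;

public interface IDependencyResolver
{
    object Resolve(Type t);
}

public static class IDependencyResolverExtension
{
    public static T Resolve<T>(this IDependencyResolver resolver)
    {
        return (T) resolver.Resolve(typeof(T));
    }
}

// Hypothetical implementation: it only has to write the non-generic method.
public class DictionaryResolver : IDependencyResolver
{
    private readonly Dictionary<Type, Func<object>> _factories =
        new Dictionary<Type, Func<object>>();

    public void Register<T>(Func<T> factory) where T : class
    {
        _factories[typeof(T)] = () => factory();
    }

    public object Resolve(Type t)
    {
        return _factories[t]();
    }
}

public static class ResolverDemo
{
    public static void Main()
    {
        var resolver = new DictionaryResolver();
        resolver.Register<IComparer<string>>(() => StringComparer.Ordinal);

        // The generic overload comes for free from the extension method.
        IComparer<string> comparer = resolver.Resolve<IComparer<string>>();
        Console.WriteLine(comparer.Compare("a", "a")); // 0
    }
}
```
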
When I look at the interface IQueryProvider, I wonder if it was designed before extension methods were available:

public interface IQueryProvider
{
    IQueryable<TElement> CreateQuery<TElement>(Expression expression);
    IQueryable CreateQuery(Expression expression);
    TResult Execute<TResult>(Expression expression);
    object Execute(Expression expression);
}

So the 4 methods could have been reduced to 2. Considering that Linq and extension methods both arrived in the same release (.NET 3.5, with C# 3.0), my suspicion seems quite plausible!
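
As a rough sketch of what that reduction could look like: ISlimQueryProvider, SlimQueryProviderExtensions and SlimAdapter below are all hypothetical names of mine, not framework types, and the CreateQuery<TElement> extension assumes the provider's non-generic CreateQuery returns an object that also implements IQueryable<TElement>:

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical trimmed-down interface; NOT part of the framework.
public interface ISlimQueryProvider
{
    IQueryable CreateQuery(Expression expression);
    object Execute(Expression expression);
}

public static class SlimQueryProviderExtensions
{
    public static TResult Execute<TResult>(this ISlimQueryProvider provider, Expression expression)
    {
        return (TResult) provider.Execute(expression);
    }

    public static IQueryable<TElement> CreateQuery<TElement>(this ISlimQueryProvider provider, Expression expression)
    {
        // Assumption: the non-generic result also implements IQueryable<TElement>.
        return (IQueryable<TElement>) provider.CreateQuery(expression);
    }
}

// Hypothetical adapter over a real IQueryProvider, only to exercise the sketch.
public class SlimAdapter : ISlimQueryProvider
{
    private readonly IQueryProvider _inner;

    public SlimAdapter(IQueryProvider inner)
    {
        _inner = inner;
    }

    public IQueryable CreateQuery(Expression expression)
    {
        return _inner.CreateQuery(expression);
    }

    public object Execute(Expression expression)
    {
        return _inner.Execute(expression);
    }
}

public static class SlimProviderDemo
{
    public static void Main()
    {
        var source = new[] { 1, 2, 3 }.AsQueryable();
        var provider = new SlimAdapter(source.Provider);

        // The generic overload is supplied by the extension method.
        IQueryable<int> query = provider.CreateQuery<int>(source.Expression);
        Console.WriteLine(query.Count());
    }
}
```

One caveat of this approach is that Execute<TResult> goes through object, so value-type results get boxed and unboxed, which may be one reason to keep generic methods on a real interface.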

Dependency layering

Another case where you might decide to use an extension method rather than exposing a direct method on the type is when a type's sole dependency on another type is confined to a single method. This is very common in cases where the dependency is on a layer above the dependent type - whereas naturally it should be the other way around.

For example, let's look at this case:

// THIS WILL NOT WORK!

// sitting at entity layer
public class Foo
{
 // ...

 public Bar ToBar()
 {
  // ...
 }
}

// sitting at business layer
public class Bar
{
 // ...  
}

Now in this example, I have laid out these two classes in different logical layers to better illustrate the case - but they do not have to be; this is all about managing dependencies, whether in the same layer or across layers. We have baked a dependency on Bar into Foo just for the sake of ToBar(). The solution is to make ToBar() an extension method.

So we can write (and completely decouple the two classes):

public class Foo
{
 
}

public class Bar
{

}

public static class FooExtensions
{
 public static Bar ToBar(this Foo foo)
 {
  return new Bar();
 }
}


Providing implementation for enumerations

This is one that probably many of us have done. Enumerations - unfortunately - cannot contain implementations so extension methods are a good place to put the implementation code for enums. This usually has to do with conversion, parsing and formatting.
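
A minimal sketch of the idea, using a made-up OrderStatus enum (all names here are purely illustrative):

```csharp
using System;

public enum OrderStatus
{
    Pending,
    Shipped,
    Delivered
}

public static class OrderStatusExtensions
{
    // Formatting logic that the enum itself cannot contain.
    public static string ToDisplayText(this OrderStatus status)
    {
        switch (status)
        {
            case OrderStatus.Pending: return "Awaiting dispatch";
            case OrderStatus.Shipped: return "On its way";
            case OrderStatus.Delivered: return "Delivered";
            default: throw new ArgumentOutOfRangeException("status");
        }
    }

    public static bool IsFinal(this OrderStatus status)
    {
        return status == OrderStatus.Delivered;
    }
}

public static class EnumDemo
{
    public static void Main()
    {
        var status = OrderStatus.Shipped;
        Console.WriteLine(status.ToDisplayText()); // On its way
        Console.WriteLine(status.IsFinal());       // False
    }
}
```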

Delay decision on API signatures

With regard to an API, anything that goes into the public interface of the type is difficult to change. As such, attempting to provide all possible overloads and use cases of the type on its public interface is likely to fail.

Delaying such decisions by providing base functionality on the type, and then adding more and more extension methods with each release, is a useful approach. The ASP.NET team has used this technique for ASP.NET MVC and more recently for ASP.NET Web API.

Drawbacks

Extension methods are static methods. As such they cannot be mocked using standard mocking frameworks. An extension method should not have any dependency other than the ones passed to it.

Let's look at this case:

public class Foo
{
 public string FileName { get; set; }

 public void Save()
 {
  // ...
 }
}

public static class FooExtensions
{
 public static void SafeSave(this Foo foo)
 {
  var directoryName = Path.GetDirectoryName(foo.FileName);
  if (!Directory.Exists(directoryName))
   Directory.CreateDirectory(directoryName);
  foo.Save();
 }
}

In this case, unit testing any class that uses SafeSave becomes a nightmare. What we need here is to create an interface such as IFileSystem and pass it to the extension method, abstracting it from the real file system.
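
One possible shape for this, as a sketch only: the members of IFileSystem and the FakeFileSystem test double are my assumptions, and Foo.Save is reduced to a stand-in so the sample is self-contained:

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical abstraction over the parts of the file system SafeSave needs.
public interface IFileSystem
{
    bool DirectoryExists(string path);
    void CreateDirectory(string path);
}

public class Foo
{
    public string FileName { get; set; }
    public bool Saved { get; private set; }

    // Stand-in for the real Save(), which is elided in the example above.
    public void Save()
    {
        Saved = true;
    }
}

public static class FooExtensions
{
    // The dependency is now passed in rather than reached for statically.
    public static void SafeSave(this Foo foo, IFileSystem fileSystem)
    {
        var directoryName = Path.GetDirectoryName(foo.FileName);
        if (!fileSystem.DirectoryExists(directoryName))
            fileSystem.CreateDirectory(directoryName);
        foo.Save();
    }
}

// In-memory fake for tests; it never touches the disk.
public class FakeFileSystem : IFileSystem
{
    public readonly HashSet<string> Directories = new HashSet<string>();

    public bool DirectoryExists(string path)
    {
        return Directories.Contains(path);
    }

    public void CreateDirectory(string path)
    {
        Directories.Add(path);
    }
}

public static class SafeSaveDemo
{
    public static void Main()
    {
        var fileSystem = new FakeFileSystem();
        var foo = new Foo { FileName = Path.Combine("reports", "q1.txt") };

        foo.SafeSave(fileSystem);

        Console.WriteLine(fileSystem.Directories.Contains("reports")); // True
        Console.WriteLine(foo.Saved);                                  // True
    }
}
```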

Conclusion

Availability of extension methods has changed the way we design software APIs in the .NET world. We have started to build the basic functionality in the actual types and use extension methods to provide overloading.

There are 5 reasons to use extension methods when you own the type:

  • To associate implementation with interfaces
  • Overloading of an API
  • Removing dependency especially in logical layering
  • Providing implementation for enumerations
  • Delaying decision on API signatures
An extension method should not have any dependencies other than the ones passed to it.