Monday, 7 October 2013

Beware of undisposed or unconsumed HttpResponseMessage

[Level T2] This is just a short post to bring something important to the reader's attention. I guess if you are well into ASP.NET Web API, you are bound to see it - especially if you are consuming APIs using HttpClient.

Let's look at this innocent looking delegating handler (useful on the client side):

public class InnocentLookingDelegatingHandler : DelegatingHandler
{

    protected override Task SendAsync(HttpRequestMessage request, 
        CancellationToken cancellationToken)
    {
        return base.SendAsync(request, cancellationToken)
            .ContinueWith(t =>
                {
                    var response = t.Result;
                    if (response.StatusCode == HttpStatusCode.NotModified)
                    {
                        var cachedResponse = request.CreateResponse(HttpStatusCode.OK);
                        cachedResponse.Content = new StringContent("Just for display purposes");
                        return cachedResponse;
                    }
                    else
                    return response;
                });
    }   
}

There isn't really a lot happening. This delegating handler sniffs the incoming responses and if it intercepts a 304 response, it replaces this with a cached response - of course the implementation is a minimal one just for display purposes.

Now let's use it with a resource which returns 304:

class Program
{
    static void Main(string[] args)
    {
        var client = new HttpClient(new InnocentLookingDelegatingHandler()
                        {
                            InnerHandler = new HttpClientHandler()
                        });
        for (int i = 0; i < 1000 * 1000; i++)
        {
            var request = new HttpRequestMessage(HttpMethod.Get, 
                "http://ajax.googleapis.com/ajax/libs/angularjs/1.0.7/angular.min.js");
            request.Headers.IfModifiedSince = DateTimeOffset.Now;
            var response = client.SendAsync(request).Result;
            response.Dispose();
            Console.Write("\r" + i);
        }
            
    }
}

You can find a ready project to test this on GitHub. Please note that System.Net's maxConnection needs to be set up in the app.config.

So what do we see?

  1. Application starts to leak memory
  2. After a while you get an error telling you have run out of sockets.

We can see the effect using SciTech tool:



Well, reason is the server response in the Task continuation is not getting consumed. What does it mean? Well, adding a line to consume the content is enough:

var cachedResponse = request.CreateResponse(HttpStatusCode.OK);
cachedResponse.Content = new StringContent("Just for display purposes");
response.Content.ReadAsByteArrayAsync().Result; // just read the content which is empty!!
return cachedResponse;

Or simply dispose the response:

var cachedResponse = request.CreateResponse(HttpStatusCode.OK);
cachedResponse.Content = new StringContent("Just for display purposes");
response.Dispose(); // Dispose the response
return cachedResponse;

This should be enough to solve the problem. Now one would think that we have got the response and 304 response actually would never have a content so it is silly to read the content - but well, until you hit this bug.

Thanks to Carl Duguay who reported the issue on CacheCow, the resultant memory leak on 304 responses is fixed now. I think it is very likely that you might run into similar problem so beware when getting a response - always consume or dispose it.

This brings the question that should we always dispose HttpRequestMessage and HttpResponseMessage? Recently, there has many classes that implement IDisposable but they do not require use of dispose pattern. Examples include HttpClient, Task or MemoryStream. On the other hand, you cannot find a single sample that either request or response is used in a Dispose pattern - and the fact that the API is fully async makes use of Dispose pattern very difficult.

In any case, ideas are welcome. But as for the bug we had in CacheCow, it is fixed now.

Wednesday, 25 September 2013

Pro ASP.NET Web API is out!

Some of you who have been reading this humble blog have probably noticed that I have been pretty inactive. While this is true, I have been very busy as one of the authors of the Pro ASP.NET Web API book. I am pleased  - much to my relief and equal joy - to announce that it is finally out!

The process has been a long one and since I joined later in the process, it was hectic at times as I had to manage to keep up with our tight deadline.

I am very grateful for my co-authors Tugberk Ugurlu and Alexander Zeitler, and wonderful people at Apress (Gwenan Spearing, Chris Nelson, Anamika Panchoo ...) who made this possible.

All I can say from my part is that we have cooked the book with the same taste we would have liked to taste any other technical book: practical with real-life examples yet having geeky depth. And above all providing all-round conceptual picture which is hard to grasp by scraping through rims and rims of documentation or various blog posts alone. The book covers why as well as what and we have tried to be read like a book (and not a technical documentation): we have story to tell.

Enough talking. Please provide us with feedbacks positive (or negative) as we would like to hear from your side of story. And if you can, please post a review on Amazon. Happy reading!

Pro ASP.NET Web API

Sunday, 18 August 2013

OWIN and Katana challenges: blues of a library developer

[Level T4] I have to be very careful on how to phrase what I am about to write here. The .NET community celebrates OWIN's integration into .NET HTTP frameworks such ASP.NET Web API, NancyFx and others. This is the result of months and months of hard work and sleepless nights. While I am also jubilant to see this happening, we need to stay objective and keep a self-critical attitude to ensure we make the best of this opportunity. 

TLDR;

Developing a middleware library on top of OWIN that can be equally used by native frameworks is possible but at the current state far from ideal as the overhead of translation between different frameworks can be big. OWIN application and middleware are clearly defined as separate entities in the spec while they are regarded the same in the OWIN implementation itself which renders the distinction redundant.

Introduction

OWIN (Open Web Interface for .NET) is a community effort led by Benjamin Vanderveen and Louis Dejardin to "decouple server and application and, by being an open standard, stimulate the open source ecosystem of .NET web development tools". OWIN consists of a specification which is currently at version 1.0, a set of extension specifications and a small library owin.dll.

Katana is a project lead by Microsoft which provides implementation on top of minimal owin.dll implementation for abstractions such as OwinResquest, OwinResponse and OwinMiddleware. It also provides an HTTP server built on top of HttpListener.

Now each web platform also provides its own plumbing to OWIN.

What OWIN has nailed it

As we saw, the goal of OWIN has been defined as below:

The goal of OWIN is to decouple server and application and, by being an open standard, stimulate the open source ecosystem of .NET web development tools.
There is absolutely no doubt that OWIN has successfully delivered this promise. Regardless of what Server you are using (IIS, Kayak, HttpListener, etc) you can use web framework of choice.




Now this is all well and good. But we do expect more from OWIN. We would like to be able to write a middleware once and use it in any framework. There are many examples of this but main ones are security, formatting, caching, conneg, and alike. This all seemed easy and accessible to me so I set out to implement CacheCow as an OWIN middleware. But the task proved more convoluted than I though.

OWIN application and middleware

Here are how OWIN's application and middleware are defined:
Web Application – A specific application, possibly built on top of a Web Framework, which is run using OWIN compatible Servers.
Middleware – Pass through components that form a pipeline between a server and application to inspect, route, or modify request and response messages for a specific purpose.

Based on the specification (and definition above), an OWIN runtime can be viewed as below: 


Now if we look at the Application Startup section of the sepc, we see applicatrion as the engulfing runtime of the functionality. This is the interface for "application" startup:

public interface IAppBuilder
{

    IDictionary<string, object> Properties { get; }

    IAppBuilder Use(object middleware, params object[] args);

    object Build(Type returnType);

    IAppBuilder New();
}

So our view would probably change to this:


But after more digging, we realise that this cannot represent the use cases because:
  • It is possible to have an application which mixes different web frameworks, i.e. register multiple frameworks each covering a route. Now in this case, is application is composed of all smaller applications or each one is an application?
  • It is possible to have middlewares each based on a different framework. So similar to applications, they can based on a framework.
  • Middlewares does not have to be pass-through, they can be prematurely terminate pipeline chain.
  • Applications can be pass-through. This has been mainly implemented as 404-based chain of responsibility but any implementation is possible.
  • IAppBuilder does not have a notion of application - only middleware. There is no explicit means of registering an application and .Use method has a middleware parameter. Now we all know that this is also meant for applications and all extension methods currently available register applications such as .UseWebApi and .UseNancy.
So in essence, it seems like "application" is a redundant abstraction as far as runtime is concerned. This has no representation in OWIN code and I personally would like it to be removed from the spec or alternatively the definition of middleware and application merged for clarity. So this is how I see OWIN runtime:


ASP.NET Web API and OWIN HTTP Pipeline: Impedance mismatch

ASP.NET Web API provides a very powerful HTTP Pipeline expressed as SendAsync method below:

public Task<HttpResponseMessage> SendAsync(
 HttpRequestMessage request, CancellationToken cancellationToken
)

For those who have worked this pipeline it is clear that building a "middleware" as a DelegatingHandler is very easy.

On the other hand, OWIN provides a similar pipeline with this AppFunc signature:

using AppFunc = Func<
        IDictionary<string, object>, // Environment
        Task>; // Done

In this one, the operation is more or less the same since both request and response exist as part of the dictionary passed in.

As such, one might think that a DelegatingHandler should be very easy to convert to OWIN middleware. I personally expected this to take a few hours and set out to implement CacheCow as an OWIN middleware and show it off in a NancyFx app. I wish that was the case, but as I dig deeper it turned out we are facing hard challenges. At first googling around found this exact question, but to my surprise I found that .UseHttpMessaageHandler() (where a simple handler is returning data) and .UseWebApi() (where a whole ASP.NET Web API application is being set up) have been implemented yet no .UseDelegatingHandler() in site. That did not discourage me as I set out to build one using this little experimental rough and ready bridge:

public class OwinHandlerBridge : HttpMessageHandler
{
    private readonly DelegatingHandler _delegatingHandler;
    private readonly FixedResponseHandler _fixedResponseHandler = new FixedResponseHandler();
    private readonly HttpMessageInvoker _invoker;

    private const string OwinHandlerBridgeResponse = "OwinHandlerBridge_Response";

    private class FixedResponseHandler : HttpMessageHandler
    {
        protected override Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, 
            CancellationToken cancellationToken)
        {
            return Task.FromResult(
                (HttpResponseMessage)
                request.Properties[OwinHandlerBridgeResponse]);
        }

    }

    public OwinHandlerBridge(DelegatingHandler delegatingHandler)
    {
        _delegatingHandler = delegatingHandler;
        _delegatingHandler.InnerHandler = _fixedResponseHandler;
        _invoker = new HttpMessageInvoker(_delegatingHandler);
    }

    protected override Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, 
        CancellationToken cancellationToken)
    {
        var owinContext = request.GetOwinContext();
        request.Properties.Add(OwinHandlerBridgeResponse, owinContext.Response.ToHttpResponseMessage());
        return _invoker.SendAsync(request, cancellationToken);
    }
}

This bridge turns a DelegatingHandler into a HttpMessageHandler so that I can use .UseHttpMessaageHandler(). But there are problems. First of all, I cannot turn use back an OWIN response body since it is implemented as a write-only stream. Also the process of converting Message to and from OWIN to ASP.NET Web API has been reported to be expensive - and prohibitive.

Another issue is also the difference in pipelines so that an OWIN middleware is different from a DelegatingHandler. In ASP.NET Web API, every DelegatingHandler is visited twice in the pipeline (or none in case of a short-circuiting - shown as faded arrows):


In OWIN, a middleware gets the chance to change the request or response only once (or none in case of a short-circuit - shown as faded arrows):


[Correction: since middleware interface is Task-based, practically every component is called twice and you can re-process in the Task continuation, pretty much similar to ASP.NET Web API]

So here is the question, which model should I build my middleware? While many believe it is OWIN, Dominick Baier (@leastprivilege) has already decided to abstract it away from both and created his own model in Thinktecture.IdentityModel to represent an HTTP request and response. Pedro Felix also believes the same.

And the quest for a common HTTP pipeline paradigm continues...


Thursday, 9 May 2013

Performance series - Announcing SuperBenchmarker sb.exe for generating load on your API/Site

[Level T1] Some of you might be doing performance analysis for living, and if so, you have access to expensive performance tools. But for those who are not and get drawn into investigating performance of their own application or their peers, there is really not a great free tool that you can just point to a URL and hammer.

Well, actually there is: Apache benchmark or ab.exe which has served the community since almost 20 years ago. It is free and very performant, i.e. does not require a lot of resources to run so does not affect the performance even if you benchmark your localhost server. But it has limitations (only GET, no HTTPS, no parameterisation) and I could not find a tool so I ended up writing one for my own use. And now I am sharing it.



SuperBenchmarker or sb.exe is a single-exe (ilmerged with dependencies inside) application that can be used to call various endpoints. It can do GET as well as other verbs, can do request templating and you can provide values for placeholders in URL or your template. It also can output a lot of tracing info in case you need it.

It requires .NET 4.5 but you can target any site or API written in Ruby, PHP, Java, ect. You can download it from here if you do not want to use NuGet. [Update: It has been reported that some people have had issue running on x64 windows. If so please comment on this GitHub issue]

So version 0.1 is out and it works although project needs some love with tests and a few features. There are also some shortcuts taken which needs to be sorted out - but it works.You can clone the code and build it (running build.ps1) or simply use NuGet to download it. [Update: preferred method is using Chocolatey - see below]

PM> Install-Package SuperBenchmarker

For usage, see the wiki on the GitHub. I value feedbacks, also if interested to get involved with the project let me know.



Update

Courtesy of Mike Chaliy, this has now a chocolatey package. So if you have chocolatey installed (and if not here are the instructions), simply run this command in command line:

cinst SuperBenchmarker

Info

Usage:
sb.exe -u url [-c concurrency] [-n numberOfRequests] [-m method] [-t template] [-p plugin] [-f file] [-d] [-v] [-k] [-x] [-q] [-h] [-?]
Parameters:
-u Required. Target URL to call. Can include placeholders.
-c Optional. Number of concurrent requests (default=1)
-n Optional. Total number of requests (default=100)
-m Optional. HTTP Method to use (default=GET)
-p Optional. Name of the plugin (DLL) to replace placeholders. Should contain one class which implements IValueProvider. Must reside in the same folder.
-f Optional. Path to CSV file providing replacement values for the test
-d Optional. Runs a single dry run request to make sure all is good (boolean switch)
-v Optional. Provides verbose tracing information (boolean switch)
-k Optional. Outputs cookies (boolean switch)
-x Optional. Whether to use default browser proxy. Useful for seeing request/response in Fiddler. (boolean switch)
-q Optional. In a dry-run (debug) mode shows only the request. (boolean switch)
-h Optional. Displays headers for request and response. (boolean switch)
-? Optional. Displays this help. (boolean switch)


Examples:

-u http://google.com
-u http://google.com -n 1000 -c 10
-u http://google.com -n 1000 -c 10 -d (runs only once)
-u http://localhost/api/myApi/ -t template text (file contains headers to be sent for GET. format is same as HTTP request)
-u http://localhost/api/myApi/ -m POST -t template.txt (file contains headers to be sent for POST. format is same as HTTP request with double CRLF separating headers and payload)
-u http://localhost/api/myApi/{{{ID}}} -f values.txt (values file is CSV and has a column for ID)-u http://localhost/api/myApi/{{{ID}}} -p myplugin.dll (has a public class implementing IValueProvider defined in this exe)
-u http://google.com -h (shows headers)
-u http://google.com -h -q (shows cookies)
-u http://google.com -v (shows some verbose information including URL to target - especially useful if parameterised)

Monday, 1 April 2013

Monitor your ASP.NET Web API application using your own custom counters

[Level T2] OK, so you have created your Web API project and deployed into production and now boss says dude, we have performance problems. Or maybe head of testing wants to benchmark the application and monitor it over the course of next releases so we catch problems early. Or you are just a responsible geek interested in the performance of your code.

NOTE: THIS BLOG POST REFERS TO AN EARLIER VERSION OF PERFIT. PLEASE VISIT https://github.com/aliostad/PerfIt FOR UP-TO-DATE DOCUMENTATION

In any case, no serious web code can be written without having performance in mind. Problem is that existing system, .NET and ASP.NET performance counters can go as far as telling you overall picture of the performance and the metrics usually coalesced into a single value while you need to drill down to individual APIs and see what's happening. Now this can be another burden on your already squeezed time remaining. So how easy is it to publish counters for your individual API? Just these few steps! TLDR; :

  1. Use NuGet to install PerfIt! into your ASP.NET Web API project (make sure you get version +0.1.2 - there was a bug fixed in this version)
  2. Add PerfitDelegatingHandler to the list of your MessageHandlers
  3. Decorate those actions you want to monitor
  4. Use "Add New Item" to add an Installer class into your Web API project. Write just a single line of code for Install and Uninstall.
  5. Use InstallUtil.exe in an administrative command window to install (or uninstall) your counters. Done! 

Seeing performance counters of your own project is kinda cool!

1-Adding PerfIt! to your project

So to add PerfIt! to the project, simply use NuGet console:
PM> Install-Package PerfIt


2-Adding PerfItDelegatingHandler 

Now we need to add the delegating handler:

config.MessageHandlers.Add(new 
      PerfItDelegatingHandler(config, "My test app"));

The string passed in here is the application name. This will be used as the instance name in the performance counter. You can see that in the screenshot above.

3-Decorate actions

For any action that you want the counters to be published, use PerfIt action filter and define the counters you want to see published:

// GET api/values
[PerfItFilter(Description = "Gets all items",
   Counters = new []{CounterTypes.TotalNoOfOperations,
   CounterTypes.AverageTimeTaken})]
public IEnumerable<string> GetAll()
{
    Thread.Sleep(_random.Next(50,300));
    return new string[] { "value1", "value2" };
}

// GET api/values/5
[PerfItFilter(Description = "Gets item by id", 
   Counters = new[] { CounterTypes.TotalNoOfOperations, 
   CounterTypes.AverageTimeTaken })]
public string Get(int id)
{
    Thread.Sleep(_random.Next(50, 300));
    return "value";
}

Here we have decorated GetAll and Get to publish two types of counters (currently these counters are available but more will be added - see below).

The format of the counter will be [controller].[action].[counterType] (see screenshot above). As such, please note that we had to change the first Get to GetAll to so that the counter names do not get mixed up. If you cannot do that (while ASP.NET Web API allows you do change the name as long as the action starts with the verb), alternatively use the Name property of the filter to define your own custom name (which we did not specify here to allow default naming take place).

Description property will appear in the perfmon.exe window and is known as CounterHelp. Counters is an array of string, each string being a defined against PerfIt! runtime (see roadmap for more info) as a counter type. Another option is the ability to define Category for the counter which we also did not specify so by default, it will be the assembly name. You can see the category in the screenshot above as PerfCounterWeb (see the typo!).

4-Adding Installer to your project 

Now use "Add New Item" to add an installer class to your project. You might have not seen this in the new item templates but it is definitely there (screenshot from VS2012 but project is .NET 4.0 so can be done in VS2010):


After adding the class, override Install and Uninstall methods and add the code below (hit F7 to see the code):

public override void Install(IDictionary stateSaver)
{
    base.Install(stateSaver);
    PerfItRuntime.Install();
}

public override void Uninstall(IDictionary savedState)
{
    base.Uninstall(savedState);
    PerfItRuntime.Uninstall();
}

5-Use installutil.exe to install counters

Just make sure you open an administrative command window. Use Visual Studio command prompt since it has the right path for InstallUtil.exe. cd to your bin folder and register your assembly:
c:\projects\myproject\bin>InstallUtil.exe -i MyWebApplication.dll
This should do the trick. Use -u switch to uninstall the counters.

That is all that is needed. Just hit your app and start benchmarking your app in the perfmon.exe.

Turning off the publishing of counters

In normal/production circumstances, you might wanna turn off publishing of the performance counters. In this case, you can put the line below in the appSettings of your web.config:

<appSettings>
    <add key="perfit:publishCounters" value="false"/>

I have kept the default behaviour to publish counters - so to eliminate one configuration step to get up and running. This may change in the future - but unlikely.

PerfIt! Roadmap

I needed to add performance counters to my Web API application. With wife away and a few Easter bank holiday days, I managed to start and finish version 0.1 of PerfIt! library.

Future work includes adding more counter types and looking at improving the pipeline. PerfIt! has been built on the top of its own extensibility framework so very easy to add your own counters, all you need is to implement CounterHandlerBase and then register the handler in PerfItRuntime.HandlerFactories.

Any problems, issues, comments or feedbacks, please use the GitHub issues to get it touch. Enjoy!

Thursday, 21 March 2013

Performance series - Memory leak investigation Part 1

[Level T2] If you have worked a few years within the industry, it is very unlikely that you have not encountered a case of memory leak. Whether your own code or someone else's, it is an annoying problem to diagnose and fix. The problem's scale and the effort to fix it increases exponentially with the scale and complexity of the application and deployment.

Many companies use application benchmarking and performance testing to find these problems early on. But as I experienced myself, problems found earlier as part of performance testing are not necessarily easier to solve. Lengthy cycle of gathering metrics, analysis, coming up with one or more hypothesis/es and then fixing and trying and testing it could be very time consuming and jeopardising delivery and meeting deadlines.

In these cases, focusing on what is important and what is not saves the day. A common problem in troubleshooting performance bugs is information overload and the existence of red herrings that confuse the picture. There is a myriad of performance counters that you could spend hours and days explaining their anomalies that have nothing to do with the problem you are trying to solve. Having worked as a medical doctor before, I have seen this in the field of medicine many many times. Human's body as a very big application - eternally more complex that any given application - demonstrates similar behaviour and I have learnt to be focused on identifying and fixing the problem at hand rather than explaining each and every oddity.

As such, I am starting this series with an overview of the performance counters and tools to be used. There are many posts and even books on this very topic. I am not claiming that this series is a comprehensive and all-you-need guide. I am actually not claiming anything. I am simply sharing my experience and hope it will make your troubleshooting journey less painful and help you find the culprit sooner.

A few notes on the platform

This is exclusively a Windows and mainly a .NET series - although part of this series could help with none .NET applications. Also focus is mainly on ASP.NET web applications as the impact of even a small memory leak can be very big but the same guidelines can be equally applied to desktop applications.

What is a memory leak?

Memory leak is usually considered as the memory which is allocated by the application and becomes inaccessible - as such cannot be deallocated. But I do not like this definition. For example this code does not qualify for memory leak but it will lead to out of memory:

var list = new List<byte[]>();
for(i=0; i< 1000000; i++)
{
   list.Add(new byte[500 * 1024]); // 500 KB
}

As can be seen, data is accessible to the application but application does not unload the allocated memory and keeps piling up memory allocation in heap.

For me, memory leak is a constant/stepwise allocation of memory without deallocation which usually leads to OutOfMemoryException. The reason I say usually is because small leaks can be tolerated as many desktop applications are closed at the end of the day and many website daily recycle their app pools. However, they are still leaks and you have to fix them when you get the time since with change in conditions, leaks that you could tolerate can bring down your site/application.

Process memory/time in 4 different scenarios

In the top right diagram, we see constant allocation but it coincides with deallocation of memory - as such not a leak. This scenario is common in ASP.NET Caching (HttpRuntime.Cache) where 50% of the cache is purged when memory used reaches a certain threshold.

In the bottom right diagram, de-allocation does not happen or its rate is the same as the allocation when the memory size of the application's process reaches a threshold. This pattern can be seen with the SQL Server where it uses as much memory as it can until it reaches a threshold.

How do I establish a memory leak?

All you have to do is to ascertain your application meets the criteria described above. Just observe the memory usage over time under load. It is possible that the leak happens only in a certain condition or when using a particular functionality of the app so you have to be careful about that.

What tool do you need to do that? Even using a Task Manager could be good enough to start with.

Tools to use 

Normally you would start with Task Manager. Just a quick look at the Task Manager or eye-balling the memory size of the process when the app is running and under load is a simple but effective start.

The mainstay of the memory leak analysis is Windows Performance Monitor, aka Perfmon. This is very important in establishing a benchmark, monitoring changes over new version releases and identifying problems. In this post I will look into some essential performance counters.

If you are investigating a serious or live memory leak, you have to have a .NET Memory Profiler. Currently 3 different profilers exist:

  1. RedGate's ANTS Profiler
  2. Jetbrain's dotTrace
  3. Scitech's Memory Profiler

Each tool has its own pros and cons. I have experience with SciTech's and I must say I am really impressed by it.

The most advanced tool is Windbg which is a really useful tool but has a steep learning curve. I will look at using it for memory leak analysis in future posts.

Initial analysis of the memory leak

Let's look at a simple memory leak (you can find the snippets here on GitHub):

var list = new List<byte[]>();
for (int i = 0; i < 1000; i++)
{
    list.Add(new byte[1 * 1000 * 1000]);
    Thread.Sleep(200);
}

Process\Private Bytes performance counter is a popular measure of total memory consumption. Let's look at this counter in this app:


So how do we find out if it is a managed or unmanaged leak? We use .NET CLR Memory\#Bytes in all heaps counter:


As can be seen above, white private bytes increases constantly, heap allocation happens in steps. This stepwise increase of heap is consistent with a managed memory leak. In contrast let's look at this unmanaged leak code:

var list = new List<IntPtr>();
for (int i = 0; i < 400; i++)
{
    list.Add(Marshal.AllocHGlobal(1*1000*1000));
    Thread.Sleep(200);
}
list.ForEach(Marshal.FreeHGlobal);

And here is flat bytes in heap:


So this case is an unmanaged memory leak.

In the next post I will look at a few more performance counter and GC issues.

Tuesday, 19 March 2013

CacheCow Series - Part 0: Getting started and caching basics

[Level T2CacheCow is an implementation of HTTP Caching for ASP.NET Web API. It implements HTTP caching spec (defined in RFC 2616 chapter 13) for both server and client, so you don't have to. You just plug-in the CacheCow component in server and client and the rest is taken care of. You do not have to know details of HTTP caching although it is useful to know some basics.

It is also a flexible and extensible library that instead of pushing a resource organisation approach down your throat, allows you to define your caching strategies and resource organisation using a functional approach. It stays opinionated only about caching where the opinion is HTTP spec. We shall delve into these customisation points in the future posts. But first let's to caching 101.

Caching

Caching is a very overloaded and confused term in ASP.NET. Let's review the cachings available in ASP.NET (I promise not to make it too boring):

HttpRuntime.Cache

This class is used to cache managed objects in SERVER memory using a simple dictionary like key-value approach. You might have used HttpContext.Current.Cache but it is basically nothing but a reference to HttpRuntime.Cache. So how is this different to using a plain dictionary?
  1. You can define expiry so you do not run out of memory and your cache can be refreshed. This is the most important feature.
  2. It is thread-safe
  3. You can never run out of memory with this even with no expiry. As long as it hits a configurable limit, it purges half of the cache.
  4. And also: it stores the data in the unmanaged memory of the worker process and not in heap so does not lengthen GC sweeps.
This cache is useful for keeping objects around without having to acquire them all the time (could be complex calculation or IO cost such as database). No, this is not HTTP caching.

Output cache

In this approach output of a page or a route can be cached on the SERVER so the server code does not get executed. You can define expiry and let the output cached based on parameters. Output cache behind the scene uses HttpRuntime.Cache.

You guessed it right: this is also not HTTP caching.

HTTP Caching

In HTTP Caching, a cacheable resource (or HTTP response in simple terms) gets cached on the CLIENT (and not server). Server is responsible for defining cache policy, expiry and maintaining resource metadata (e.g. ETag, LastModified). Client on the other hand is responsible for actually caching the response, respecting cache expiry directives and validating expired resources.

The beauty of the HTTP caching is that your server will not even be hit when client has cached a resource - unlike other two types of caching discussed. This helps with minimising the scalability, reducing network traffic and allowing for offline client implementation. I believe HTTP caching is the ultimate caching design.

One of the fundamental differences of HTTP caching with other types of caching is that a STALE (or expired) resource is not necessarily unusable and does not get removed automatically. Instead client call server and check if the stale resource is still good to use. If it is, server responds with status 304 (NotModified) otherwise it sends back the new response.

HTTP Caching mechanism can also be used for solving concurrency problems. When a client wants to update a resource (usually using PUT), it can send a conditional call and ask for update to run if and only if the version of the server is the same as the version of the client (based on its ETag or LastModified). This is especially useful with modern distributed systems.

So as you can see, client and server engage in a dance using various headers and things can get really complex. Good new is you don't have to worry about it. CacheCow implements all of this for you out of the box.

CacheCow consists of CacheCow.Server and CacheCow.Component. These two components can be used independently, i.e. each will work regardless the other is used or not - this is the beauty of HTTP Caching as a mixed concern which I explained here.

CacheCow on the server

How easy is it to use CacheCow on the server? You just need to get the CacheCow.Server Nuget package:

PM> Install-Package CacheCow.Server

And then 2 lines of code to be exact (actually 3 but we will see why):

var cachecow = new CachingHandler();
GlobalConfiguration.Configuration.MessageHeandlers.Add(cachecow);

So with this line of code all your resources become cacheable. By default expiry is immediate so client has to validate and double check if it can re-use the cache resource but all of this can be configured (which we will talk about in future posts). Also server now is able to respond to conditional GET or PUT requests.

CacheCow on the server requires a database (technically cache store) for storing cache metadata - but we did not define a store. CacheCow comes with an in-memory database that can be used if you do not specify one. This is good enough for development, testing and small single server scenarios but as you can imagine not scalable. What does it take to define a database? Well, a single line hence the 3rd line and you can choose from SQL Server, Memcached,  RavenDb (Redis to come soon) or build your own by implementing an interface against another database (MySql for example).

So in this way we will have 3 lines (although you may combine first two lines):

var db = new SqlServerEntityTagStore();
var cachecow = new CachingHandler(db);
GlobalConfiguration.Configuration.MessageHeandlers.Add(cachecow);

With these three lines, you have added full feature HTTP caching (ETag generation, responding to conditinal GET and PUT, etc) to your API.

CacheCow on the client

Browsers that we use everyday (and usually even forget to look at as an application) are really amazing HTTP consumption machines. They implement cache storage (file-based) and are capable of handling all client scenarios of HTTP caching.

Client component of CacheCow just does the same but is useful when you want to use HttpClient for consuming HTTP services. Plugging it into your HttpClient is as simple as the server side. First get it:
PM> Install-Package CacheCow.Client
Then

var client = new HttpClient(new CachingHandler()
       { 
           InnerHandler = new HttpClientHandler()
       });


In a similar fashion, client component of CacheCow requires a storage - and this time for the actual responses (i.e. resources). By default, an in-memory storage is used which is OK for testing or single server and small API but for persistent stores you would use from stores such as File, SQL Server, Redis, MongoDB, Memcached aleady implemented or just implement the ICacheStore interface on the top of an alternative storage.

Constructor of the client CachingHandler accepts an implementation of ICacheStore which is your storage of choice.

So now cacheable resource will be cached and CacheCow will do conditional GET and PUT for you.

In the next post, we will look into server component of CacheCow.