Saturday, 12 October 2013

API Layer and its responsibilities - a REST viewpoint

[Level C4] With the emergence and popularity of REST APIs, there is a need to define a layer dedicated to performing API-related functions. This article tries to define these responsibilities and separate them from the responsibilities of other layers.

TLDR; If you are putting business logic of your domain in your API Layer (which is the presentation layer of your service), you are doing it wrong.


Layered (Tiered or N-Tiered) Architecture is a software design pattern that implements a software system in a stack of layers. Each layer has a distinct responsibility and has a dependency upon the deeper layer - with the deepest layer normally accessing a relational database.

This pattern, also known as the Layers pattern, was formalised in the book Pattern-Oriented Software Architecture: the Layers pattern "helps to structure applications that can be decomposed into groups of subtasks in which each group of subtasks is at a particular level of abstraction." This pattern emerged as a replacement for the Client-Server architecture and became very popular in the late 90s and early 2000s.

Service Oriented Architecture (SOA) did not change the popularity and importance of tiered architecture, since each service contains the tiers inside it (Figure 1).

Figure 1 - Layered Architecture in SOA
Layers are usually domain-agnostic and can be uniquely identified across different implementations and systems. The layers themselves are not necessarily physical, i.e. deployed on different machines; they can be simply logical implementations in different libraries - or even in the same library. What identifies a layer is its responsibility not its physical location.

There are different approaches in identifying and naming layers across a system stack. Historically, three tiers have been identified: Presentation Layer, Business Layer and Data Access Layer (Figure 2).
Figure 2 - Traditional Layers
This traditional layering is still prevalent and popular in the industry. As the names imply:

  • Presentation Layer deals with user interactions
  • Business Layer contains the business logic of the domain
  • Data Access Layer is concerned with the persistence of the domain objects
Sometimes a Service Façade Layer is identified between the Business Layer and the Presentation Layer, which is responsible for organising the business services into a coarse-grained set of services. This layer is usually very thin.

However, Eric Evans in his Domain Driven Design book introduced an alternative layering which was a better fit for DDD: Presentation Layer, Application Layer, Domain Layer and Infrastructure Layer. (Figure 3)
Figure 3 - Layers according to DDD (two alternative interpretations)

In this model, the responsibilities of each layer are defined as below:

  • Presentation Layer has the same responsibility as the traditional model: user interaction (user could equally be a machine or human)
  • Application Layer, which I prefer to call the Workflow Layer, is the thin layer that connects and coordinates coarse-grained business tasks. This layer can hold state regarding the progress of a business activity.
  • Domain Layer is the heart of the system which contains all the business logic
  • Infrastructure Layer is responsible for persistence, passing messages to buses and providing framework and utility code across the other layers. As such, all layers can have access to this layer.
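As a rough sketch of this separation (all type names below are hypothetical, made up for illustration), the Application/Workflow Layer merely coordinates coarse-grained tasks while the rules live in the Domain Layer:

```csharp
// Hypothetical sketch - IOrderRepository, PricingService and Order are
// made-up types. The Application (Workflow) Layer coordinates a business
// activity but contains no business rules itself.
public class OrderWorkflow
{
    private readonly IOrderRepository _repository;   // Infrastructure Layer
    private readonly PricingService _pricingService; // Domain Layer

    public OrderWorkflow(IOrderRepository repository, PricingService pricingService)
    {
        _repository = repository;
        _pricingService = pricingService;
    }

    public void PlaceOrder(Order order)
    {
        // Coordination only: the pricing rules live in the Domain Layer
        _pricingService.Price(order);
        _repository.Save(order);
    }
}
```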

API Layer

In none of the above did we see any mention of an API Layer, so what is an API Layer? The API Layer is the outermost layer of a service and is concerned with the presentation of a bounded context's services through the API. In other words, in an SOA, we can call the Presentation Layer of an SOA service its API Layer.

This layer was very thin or non-existent in services exposed through SOAP, but it is an important layer when it comes to REST. The reason for its lack of importance in SOAP services is that an RPC service can easily be exposed through SOAP with some configuration and little or no coding - using various tools and frameworks that make this possible with the click of a button. WCF is an example of this in the Microsoft stack, while IBM WebSphere is a popular tool in the Java world capable of achieving this.

API Layer Responsibilities

With REST, however, the API requires a distinct layer that is responsible for translating HTTP semantics to and from the code world. It is also responsible for cross-cutting concerns such as monitoring, logging, identity, etc. We will look into each of these and expand on their meaning with examples.

Table 1 - API Layer responsibilities (in REST)
Bear in mind that none of the above involves any domain-related business logic. In other words, the API Layer can use a common framework shared among different services, since it is mainly domain-agnostic and does not implement any business logic. As such, this online IBM article (Figure 4) turns into How Not To Do An API Layer, since it puts Domain classes in the API Layer.

Figure 4 - This is IBM's view of an API Layer according to this article.
However, I will call it How Not To Design An API Layer.
(Source: http://www.ibm.com/developerworks/library/j-ts3/layers.gif)
So let's look into each responsibility in depth. But before we do, it is important to define what we mean by no-business-logic. Since the API Layer has access to the underlying layers, it is bound to touch the higher-level abstractions of the domain. However, it does not implement any of the business logic and simply directs calls to the layers below.
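For example, an API Layer controller might look like the hypothetical sketch below (OrderWorkflow and Order are made-up types): it touches the high-level abstractions of the domain but makes no business decisions itself.

```csharp
// Hypothetical sketch - the controller only translates between HTTP
// and the layers below; all business decisions happen deeper down.
public class OrdersController : ApiController
{
    private readonly OrderWorkflow _workflow;

    public OrdersController(OrderWorkflow workflow)
    {
        _workflow = workflow;
    }

    public HttpResponseMessage Post(Order order)
    {
        // No business logic here: delegate to the Workflow Layer and
        // translate the result back into HTTP semantics.
        _workflow.PlaceOrder(order);
        return Request.CreateResponse(HttpStatusCode.Created, order);
    }
}
```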

HTTP Semantics

Talking HTTP is the most important responsibility of a REST API. HTTP provides different axes for sending intent or data which cannot be easily mapped to an RPC method's parameters. This problem is known as the impedance mismatch between RPC and HTTP.

Figure 5 - Impedance mismatch between HTTP/REST and RPC.
Challenge of shaping HTTP requests and responses from RPC methods
is the most important responsibility of the API Layer. (From the book Pro ASP.NET Web API)
Different frameworks provide different solutions to this problem, but it is important to bear in mind that exposing a REST endpoint for your service means you need to do a lot more to take advantage of all the flexibility of HTTP messages.

Table 2 lists important HTTP semantics that are the responsibility of a REST API Layer.

Table 2 - HTTP semantics that are the responsibility of a REST API Layer
A few points are important to consider. While none of the above is truly dependent on a domain's business logic, some could require a business decision in some form of configuration. For example, the cache expiry interval is usually a concern that the business stakeholders need to understand and be happy with. Or take Hypermedia: it exposes the relationships of resources in your API, but the API Layer should really just take the natural relationships defined in your domain and expose them over HTTP.
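For instance, the cache expiry interval can come from configuration, keeping the API Layer domain-agnostic while the business still owns the number. A sketch (the configuration key name is made up):

```csharp
// Sketch: the API Layer sets Cache-Control from configuration rather
// than hard-coding a business decision. The appSettings key is
// hypothetical.
public class CachingHandler : DelegatingHandler
{
    protected override async Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        var response = await base.SendAsync(request, cancellationToken);
        var seconds = int.Parse(
            ConfigurationManager.AppSettings["cache:expirySeconds"]);
        response.Headers.CacheControl = new CacheControlHeaderValue
        {
            Public = true,
            MaxAge = TimeSpan.FromSeconds(seconds)
        };
        return response;
    }
}
```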

Monitoring

If you have an API whose health, performance and usage you do not monitor, then it is not a serious API. Bear in mind that Quality of Service (QoS) is one of the mainstays of an API and can be translated into an SLA. Regardless of whether your API is public or private, you have to monitor it against its baseline.

Identity and Authentication

Very rarely, if ever, does an API allow anonymous clients. Even if your API is free, you should build an identity mechanism such as an API Key. The API Layer is responsible for turning identity/authentication mechanisms (Basic Authentication, OAuth, OAuth 2.0, Windows Authentication) into a rich claims-based identity.

Please note that we did not mention Authorization. Authorization is a concern of the deeper layers of the system and is fully ingrained with the business logic - as such it is not suited for implementation in the API Layer.
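As a sketch of this translation (IApiKeyStore and the X-Api-Key header name are made up for illustration), the API Layer might turn an API Key into a claims-based identity like this:

```csharp
// Sketch: translating an API Key into a claims-based identity.
// IApiKeyStore is a hypothetical abstraction that maps keys to clients.
public class ApiKeyHandler : DelegatingHandler
{
    private readonly IApiKeyStore _keyStore;

    public ApiKeyHandler(IApiKeyStore keyStore)
    {
        _keyStore = keyStore;
    }

    protected override Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        IEnumerable<string> values;
        if (request.Headers.TryGetValues("X-Api-Key", out values))
        {
            var clientName = _keyStore.Lookup(values.First());
            var identity = new ClaimsIdentity(
                new[] { new Claim(ClaimTypes.Name, clientName) }, "ApiKey");
            // Deeper layers make authorization decisions off these claims
            Thread.CurrentPrincipal = new ClaimsPrincipal(identity);
        }
        return base.SendAsync(request, cancellationToken);
    }
}
```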

Logging, tracing and documentation

Developing against an API is not an easy task, and you should make it very easy for the clients of your API to build, debug and deploy. This is the most neglected aspect of an API and can endanger the success of your public API - or of a private API within a big enterprise.

Monetisation and Capping?

Similar to Authorization, I am of the belief that monetisation and usage capping of an API depend heavily on business logic and information that are available in the deeper layers. As such, I believe that monetisation and usage capping must be implemented in the deeper layers and not in the API Layer.

What about the relationship of the API with Server-side generated views?

A common question that crops up on Stack Overflow is how to lay out the architecture of web applications that historically were based around server-side MVC frameworks. For example, should a RoR web application use a Ruby-based API layer? Or, if you are in the Microsoft world, should an ASP.NET MVC application use ASP.NET Web API services?

Well, it depends. The API Layer and server-side MVC frameworks both sit at the Presentation Layer, so you should not really add the networking and serialisation overhead of accessing an API Layer from a server-side MVC framework if they are part of the same application. As we said, the API Layer is a thin layer responsible only for API-centric concerns. In most cases, MVC frameworks should directly use the deeper layers of the system and sit side-by-side with the API Layer. However, if the API Layer is the presentation layer of an SOA service and the MVC application builds its views from multiple APIs, then it is fine to keep them separate.

Monday, 7 October 2013

Beware of undisposed or unconsumed HttpResponseMessage

[Level T2] This is just a short post to bring something important to the reader's attention. I guess if you are well into ASP.NET Web API, you are bound to see it - especially if you are consuming APIs using HttpClient.

Let's look at this innocent looking delegating handler (useful on the client side):

public class InnocentLookingDelegatingHandler : DelegatingHandler
{

    protected override Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, 
        CancellationToken cancellationToken)
    {
        return base.SendAsync(request, cancellationToken)
            .ContinueWith(t =>
                {
                    var response = t.Result;
                    if (response.StatusCode == HttpStatusCode.NotModified)
                    {
                        var cachedResponse = request.CreateResponse(HttpStatusCode.OK);
                        cachedResponse.Content = new StringContent("Just for display purposes");
                        return cachedResponse;
                    }
                    else
                    {
                        return response;
                    }
                });
    }   
}

There isn't really a lot happening here. This delegating handler sniffs the incoming responses and, if it intercepts a 304 response, replaces it with a cached response - of course the implementation is a minimal one, just for display purposes.

Now let's use it with a resource which returns 304:

class Program
{
    static void Main(string[] args)
    {
        var client = new HttpClient(new InnocentLookingDelegatingHandler()
                        {
                            InnerHandler = new HttpClientHandler()
                        });
        for (int i = 0; i < 1000 * 1000; i++)
        {
            var request = new HttpRequestMessage(HttpMethod.Get, 
                "http://ajax.googleapis.com/ajax/libs/angularjs/1.0.7/angular.min.js");
            request.Headers.IfModifiedSince = DateTimeOffset.Now;
            var response = client.SendAsync(request).Result;
            response.Dispose();
            Console.Write("\r" + i);
        }
            
    }
}

You can find a ready project to test this on GitHub. Please note that System.Net's maxConnection needs to be set in the app.config.

So what do we see?

  1. The application starts to leak memory
  2. After a while you get an error telling you that you have run out of sockets.

We can see the effect using SciTech's memory profiler:



Well, the reason is that the server response in the Task continuation is not getting consumed. What does that mean? Well, adding a line to consume the content is enough:

var cachedResponse = request.CreateResponse(HttpStatusCode.OK);
cachedResponse.Content = new StringContent("Just for display purposes");
response.Content.ReadAsByteArrayAsync().Wait(); // just read the content which is empty!!
return cachedResponse;

Or simply dispose the response:

var cachedResponse = request.CreateResponse(HttpStatusCode.OK);
cachedResponse.Content = new StringContent("Just for display purposes");
response.Dispose(); // Dispose the response
return cachedResponse;

This should be enough to solve the problem. Now one would think that since a 304 response never has a content, it is silly to read the content - but well, that holds until you hit this bug.

Thanks to Carl Duguay, who reported the issue on CacheCow; the resultant memory leak on 304 responses is fixed now. I think it is very likely that you might run into a similar problem, so beware when getting a response - always consume or dispose it.
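On the client side, a defensive pattern is simply to wrap the response in a using block - a sketch, assuming you are done with the response inside the block:

```csharp
// Sketch: a helper that always releases the response, even when the
// content is never read. 'client' is any HttpClient instance.
public async Task<string> GetBodyAsync(HttpClient client, HttpRequestMessage request)
{
    using (var response = await client.SendAsync(request))
    {
        // Reading the content consumes the connection; Dispose (via
        // 'using') guarantees the socket is released either way.
        return response.IsSuccessStatusCode
            ? await response.Content.ReadAsStringAsync()
            : null;
    }
}
```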

This brings up the question: should we always dispose HttpRequestMessage and HttpResponseMessage? Recently there have been many classes that implement IDisposable yet do not require use of the dispose pattern - examples include HttpClient, Task and MemoryStream. On the other hand, you cannot find a single sample where either the request or the response is used in a dispose pattern - and the fact that the API is fully async makes using the dispose pattern very difficult.

In any case, ideas are welcome. But as for the bug we had in CacheCow, it is fixed now.

Wednesday, 25 September 2013

Pro ASP.NET Web API is out!

Some of you who have been reading this humble blog have probably noticed that I have been pretty inactive. While this is true, I have been very busy as one of the authors of the Pro ASP.NET Web API book. I am pleased - much to my relief and equal joy - to announce that it is finally out!

The process has been a long one, and since I joined later in the process it was hectic at times, as I had to keep up with our tight deadline.

I am very grateful to my co-authors Tugberk Ugurlu and Alexander Zeitler, and the wonderful people at Apress (Gwenan Spearing, Chris Nelson, Anamika Panchoo ...) who made this possible.

All I can say for my part is that we have cooked the book with the same taste we would want from any other technical book: practical, with real-life examples, yet with geeky depth. And above all, it provides an all-round conceptual picture which is hard to grasp by scraping through reams and reams of documentation or various blog posts alone. The book covers why as well as what, and we have tried to make it read like a book (and not technical documentation): we have a story to tell.

Enough talking. Please provide us with feedback, positive or negative, as we would like to hear your side of the story. And if you can, please post a review on Amazon. Happy reading!

Pro ASP.NET Web API

Sunday, 18 August 2013

OWIN and Katana challenges: blues of a library developer

[Level T4] I have to be very careful about how I phrase what I am about to write here. The .NET community celebrates OWIN's integration into .NET HTTP frameworks such as ASP.NET Web API, NancyFx and others. This is the result of months and months of hard work and sleepless nights. While I am also jubilant to see this happening, we need to stay objective and keep a self-critical attitude to ensure we make the best of this opportunity.

TLDR;

Developing a middleware library on top of OWIN that can be equally used by native frameworks is possible, but in its current state far from ideal, as the overhead of translation between different frameworks can be big. Application and middleware are clearly defined as separate entities in the OWIN spec, yet they are treated the same by the OWIN implementation itself, which renders the distinction redundant.

Introduction

OWIN (Open Web Interface for .NET) is a community effort led by Benjamin van der Veen and Louis DeJardin to "decouple server and application and, by being an open standard, stimulate the open source ecosystem of .NET web development tools". OWIN consists of a specification which is currently at version 1.0, a set of extension specifications and a small library, owin.dll.

Katana is a project led by Microsoft which builds on top of the minimal owin.dll, providing abstractions such as OwinRequest, OwinResponse and OwinMiddleware. It also provides an HTTP server built on top of HttpListener.

Now each web platform also provides its own plumbing to OWIN.

What OWIN has nailed

As we saw, the goal of OWIN has been defined as below:

The goal of OWIN is to decouple server and application and, by being an open standard, stimulate the open source ecosystem of .NET web development tools.
There is absolutely no doubt that OWIN has successfully delivered on this promise. Regardless of which server you are using (IIS, Kayak, HttpListener, etc.) you can use the web framework of your choice.




Now this is all well and good. But we expect more from OWIN: we would like to be able to write a middleware once and use it in any framework. There are many examples of this, but the main ones are security, formatting, caching, conneg and the like. This all seemed easy and accessible to me, so I set out to implement CacheCow as an OWIN middleware. But the task proved more convoluted than I thought.

OWIN application and middleware

Here are how OWIN's application and middleware are defined:
Web Application – A specific application, possibly built on top of a Web Framework, which is run using OWIN compatible Servers.
Middleware – Pass through components that form a pipeline between a server and application to inspect, route, or modify request and response messages for a specific purpose.

Based on the specification (and definition above), an OWIN runtime can be viewed as below: 


Now if we look at the Application Startup section of the spec, we see the application as the engulfing runtime of the functionality. This is the interface for "application" startup:

public interface IAppBuilder
{

    IDictionary<string, object> Properties { get; }

    IAppBuilder Use(object middleware, params object[] args);

    object Build(Type returnType);

    IAppBuilder New();
}

So our view would probably change to this:


But after more digging, we realise that this cannot represent the use cases because:
  • It is possible to have an application which mixes different web frameworks, i.e. registers multiple frameworks each covering a route. In this case, is the application composed of all the smaller applications, or is each one an application in its own right?
  • It is possible to have middlewares each based on a different web framework. So, similar to applications, middlewares can be based on a framework.
  • Middleware does not have to be pass-through; it can prematurely terminate the pipeline chain.
  • Applications can be pass-through. This has mainly been implemented as a 404-based chain of responsibility, but any implementation is possible.
  • IAppBuilder has no notion of an application - only middleware. There is no explicit means of registering an application, and the .Use method takes a middleware parameter. Yet we all know that this is also meant for applications, and all the extension methods currently available register applications, such as .UseWebApi and .UseNancy.
So in essence, it seems that "application" is a redundant abstraction as far as the runtime is concerned. It has no representation in the OWIN code, and I personally would like it to be removed from the spec - or, alternatively, the definitions of middleware and application merged for clarity. So this is how I see the OWIN runtime:


ASP.NET Web API and OWIN HTTP Pipeline: Impedance mismatch

ASP.NET Web API provides a very powerful HTTP pipeline expressed as the SendAsync method below:

public Task<HttpResponseMessage> SendAsync(
 HttpRequestMessage request, CancellationToken cancellationToken
)

For those who have worked with this pipeline, it is clear that building a "middleware" as a DelegatingHandler is very easy.

On the other hand, OWIN provides a similar pipeline with this AppFunc signature:

using AppFunc = Func<
        IDictionary<string, object>, // Environment
        Task>; // Done

In this one, the operation is more or less the same, since both the request and the response exist as part of the dictionary passed in.
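To make this concrete, a minimal raw OWIN middleware (without Katana's helper types) is just a function that captures the next AppFunc in the pipeline and returns its own - a sketch, with a made-up header name:

```csharp
using AppFunc = Func<IDictionary<string, object>, Task>;

public static class StampMiddleware
{
    // Sketch: a pass-through middleware that stamps a response header.
    // "owin.ResponseHeaders" is the environment key defined by the
    // OWIN 1.0 spec; the X-Server-Stamp header name is hypothetical.
    public static AppFunc Create(AppFunc next)
    {
        return async env =>
        {
            var headers = (IDictionary<string, string[]>)
                env["owin.ResponseHeaders"];
            headers["X-Server-Stamp"] = new[] { "my-middleware" };
            await next(env); // hand over to the rest of the pipeline
        };
    }
}
```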

As such, one might think that a DelegatingHandler should be very easy to convert to an OWIN middleware. I personally expected this to take a few hours, so I set out to implement CacheCow as an OWIN middleware and show it off in a NancyFx app. I wish that were the case, but as I dug deeper it turned out we are facing hard challenges. At first, googling around found this exact question, but to my surprise, while .UseHttpMessageHandler() (where a simple handler returns data) and .UseWebApi() (where a whole ASP.NET Web API application is set up) have been implemented, there is no .UseDelegatingHandler() in sight. That did not discourage me, and I set out to build one using this little experimental, rough-and-ready bridge:

public class OwinHandlerBridge : HttpMessageHandler
{
    private readonly DelegatingHandler _delegatingHandler;
    private readonly FixedResponseHandler _fixedResponseHandler = new FixedResponseHandler();
    private readonly HttpMessageInvoker _invoker;

    private const string OwinHandlerBridgeResponse = "OwinHandlerBridge_Response";

    private class FixedResponseHandler : HttpMessageHandler
    {
        protected override Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, 
            CancellationToken cancellationToken)
        {
            return Task.FromResult(
                (HttpResponseMessage)
                request.Properties[OwinHandlerBridgeResponse]);
        }

    }

    public OwinHandlerBridge(DelegatingHandler delegatingHandler)
    {
        _delegatingHandler = delegatingHandler;
        _delegatingHandler.InnerHandler = _fixedResponseHandler;
        _invoker = new HttpMessageInvoker(_delegatingHandler);
    }

    protected override Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, 
        CancellationToken cancellationToken)
    {
        var owinContext = request.GetOwinContext();
        request.Properties.Add(OwinHandlerBridgeResponse, owinContext.Response.ToHttpResponseMessage());
        return _invoker.SendAsync(request, cancellationToken);
    }
}

This bridge turns a DelegatingHandler into an HttpMessageHandler so that I can use .UseHttpMessageHandler(). But there are problems. First of all, I cannot read back an OWIN response body, since it is implemented as a write-only stream. Also, the process of converting messages between OWIN and ASP.NET Web API has been reported to be expensive - and prohibitive.

Another issue is the difference in pipelines: an OWIN middleware is different from a DelegatingHandler. In ASP.NET Web API, every DelegatingHandler is visited twice in the pipeline (or not at all in case of short-circuiting - shown as faded arrows):


In OWIN, a middleware gets the chance to change the request or response only once (or not at all in case of a short-circuit - shown as faded arrows):


[Correction: since the middleware interface is Task-based, practically every component is called twice, and you can re-process in the Task continuation, pretty much as in ASP.NET Web API]
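To illustrate the correction, a Katana-style middleware can run code both before and after the rest of the pipeline, much like a DelegatingHandler's two visits - a sketch using Katana's OwinMiddleware base class, with a made-up header name:

```csharp
// Sketch: because the pipeline is Task-based, a middleware can act
// both before and after the inner pipeline completes.
public class TimingMiddleware : OwinMiddleware
{
    public TimingMiddleware(OwinMiddleware next) : base(next) { }

    public override async Task Invoke(IOwinContext context)
    {
        var stopwatch = Stopwatch.StartNew();   // before the inner pipeline
        await Next.Invoke(context);             // run the rest of the pipeline
        stopwatch.Stop();                       // after: response is available
        context.Response.Headers.Set(
            "X-Elapsed-Ms", stopwatch.ElapsedMilliseconds.ToString());
    }
}
```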

So here is the question: against which model should I build my middleware? While many believe it is OWIN, Dominick Baier (@leastprivilege) has already decided to abstract away from both and created his own model in Thinktecture.IdentityModel to represent an HTTP request and response. Pedro Felix also believes the same.

And the quest for a common HTTP pipeline paradigm continues...


Thursday, 9 May 2013

Performance series - Announcing SuperBenchmarker sb.exe for generating load on your API/Site

[Level T1] Some of you might do performance analysis for a living, and if so, you have access to expensive performance tools. But for those who do not and get drawn into investigating the performance of their own application or their peers', there is really no great free tool that you can just point at a URL and hammer.

Well, actually there is: Apache Bench, or ab.exe, which has served the community for almost 20 years. It is free and very performant, i.e. it does not require a lot of resources to run, so it does not skew the results even if you benchmark your localhost server. But it has limitations (GET only, no HTTPS, no parameterisation) and I could not find a tool that filled the gap, so I ended up writing one for my own use. And now I am sharing it.



SuperBenchmarker, or sb.exe, is a single-exe application (ILMerged with its dependencies inside) that can be used to call various endpoints. It can do GET as well as other verbs, can do request templating, and you can provide values for placeholders in the URL or your template. It can also output a lot of tracing info in case you need it.

It requires .NET 4.5 but you can target any site or API written in Ruby, PHP, Java, etc. You can download it from here if you do not want to use NuGet. [Update: It has been reported that some people have had issues running it on x64 Windows. If so, please comment on this GitHub issue]

So version 0.1 is out and it works, although the project needs some love with tests and a few features. There are also some shortcuts taken which need to be sorted out - but it works. You can clone the code and build it (running build.ps1) or simply use NuGet to download it. [Update: the preferred method is using Chocolatey - see below]

PM> Install-Package SuperBenchmarker

For usage, see the wiki on GitHub. I value feedback; also, if you are interested in getting involved with the project, let me know.



Update

Courtesy of Mike Chaliy, this now has a Chocolatey package. So if you have Chocolatey installed (and if not, here are the instructions), simply run this command in the command line:

cinst SuperBenchmarker

Info

Usage:
sb.exe -u url [-c concurrency] [-n numberOfRequests] [-m method] [-t template] [-p plugin] [-f file] [-d] [-v] [-k] [-x] [-q] [-h] [-?]
Parameters:
-u Required. Target URL to call. Can include placeholders.
-c Optional. Number of concurrent requests (default=1)
-n Optional. Total number of requests (default=100)
-m Optional. HTTP Method to use (default=GET)
-p Optional. Name of the plugin (DLL) to replace placeholders. Should contain one class which implements IValueProvider. Must reside in the same folder.
-f Optional. Path to CSV file providing replacement values for the test
-d Optional. Runs a single dry run request to make sure all is good (boolean switch)
-v Optional. Provides verbose tracing information (boolean switch)
-k Optional. Outputs cookies (boolean switch)
-x Optional. Whether to use default browser proxy. Useful for seeing request/response in Fiddler. (boolean switch)
-q Optional. In a dry-run (debug) mode shows only the request. (boolean switch)
-h Optional. Displays headers for request and response. (boolean switch)
-? Optional. Displays this help. (boolean switch)


Examples:

-u http://google.com
-u http://google.com -n 1000 -c 10
-u http://google.com -n 1000 -c 10 -d (runs only once)
-u http://localhost/api/myApi/ -t template.txt (file contains headers to be sent for GET. Format is same as HTTP request)
-u http://localhost/api/myApi/ -m POST -t template.txt (file contains headers to be sent for POST. format is same as HTTP request with double CRLF separating headers and payload)
-u http://localhost/api/myApi/{{{ID}}} -f values.txt (values file is CSV and has a column for ID)
-u http://localhost/api/myApi/{{{ID}}} -p myplugin.dll (has a public class implementing IValueProvider defined in this exe)
-u http://google.com -h (shows headers)
-u http://google.com -k (shows cookies)
-u http://google.com -v (shows some verbose information including URL to target - especially useful if parameterised)

Monday, 1 April 2013

Monitor your ASP.NET Web API application using your own custom counters

[Level T2] OK, so you have created your Web API project and deployed it into production, and now the boss says: dude, we have performance problems. Or maybe the head of testing wants to benchmark the application and monitor it over the course of the next releases so we catch problems early. Or you are just a responsible geek interested in the performance of your code.

NOTE: THIS BLOG POST REFERS TO AN EARLIER VERSION OF PERFIT. PLEASE VISIT https://github.com/aliostad/PerfIt FOR UP-TO-DATE DOCUMENTATION

In any case, no serious web code can be written without having performance in mind. The problem is that the existing system, .NET and ASP.NET performance counters can only go as far as giving you an overall picture of the performance, and the metrics are usually coalesced into a single value, while you need to drill down to individual APIs and see what's happening. Now this can be another burden on your already squeezed remaining time. So how easy is it to publish counters for your individual API? Just these few steps! TLDR;:

  1. Use NuGet to install PerfIt! into your ASP.NET Web API project (make sure you get version 0.1.2 or later - a bug was fixed in that version)
  2. Add PerfitDelegatingHandler to the list of your MessageHandlers
  3. Decorate those actions you want to monitor
  4. Use "Add New Item" to add an Installer class into your Web API project. Write just a single line of code for Install and Uninstall.
  5. Use InstallUtil.exe in an administrative command window to install (or uninstall) your counters. Done! 

Seeing performance counters of your own project is kinda cool!

1-Adding PerfIt! to your project

So to add PerfIt! to the project, simply use NuGet console:
PM> Install-Package PerfIt


2-Adding PerfItDelegatingHandler 

Now we need to add the delegating handler:

config.MessageHandlers.Add(new 
      PerfItDelegatingHandler(config, "My test app"));

The string passed in here is the application name. This will be used as the instance name in the performance counter. You can see that in the screenshot above.

3-Decorate actions

For any action whose counters you want published, use the PerfIt action filter and define the counters you want to see published:

// GET api/values
[PerfItFilter(Description = "Gets all items",
   Counters = new []{CounterTypes.TotalNoOfOperations,
   CounterTypes.AverageTimeTaken})]
public IEnumerable<string> GetAll()
{
    Thread.Sleep(_random.Next(50,300));
    return new string[] { "value1", "value2" };
}

// GET api/values/5
[PerfItFilter(Description = "Gets item by id", 
   Counters = new[] { CounterTypes.TotalNoOfOperations, 
   CounterTypes.AverageTimeTaken })]
public string Get(int id)
{
    Thread.Sleep(_random.Next(50, 300));
    return "value";
}

Here we have decorated GetAll and Get to publish two types of counters (currently these are the available counter types, but more will be added - see below).

The format of the counter name will be [controller].[action].[counterType] (see screenshot above). As such, please note that we had to rename the first Get to GetAll so that the counter names do not get mixed up. If you cannot do that (ASP.NET Web API allows you to change the name as long as the action starts with the verb), you can alternatively use the Name property of the filter to define your own custom name (which we did not specify here, so as to let the default naming take place).

The Description property will appear in the perfmon.exe window and is known as CounterHelp. Counters is an array of strings, each defined against the PerfIt! runtime (see the roadmap for more info) as a counter type. Another option is the ability to define a Category for the counter, which we also did not specify, so by default it will be the assembly name. You can see the category in the screenshot above as PerfCounterWeb (see the typo!).
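For example, based on the Name and Category properties described above, you could pin the names explicitly - a sketch (the chosen names are made up, and the exact resulting counter name depends on the PerfIt! version):

```csharp
// Sketch: setting Name and Category explicitly instead of relying on
// the default [controller].[action] naming and assembly-name category.
[PerfItFilter(Description = "Gets all items",
    Name = "GetAllValues",
    Category = "MyApiCounters",
    Counters = new[] { CounterTypes.TotalNoOfOperations,
                       CounterTypes.AverageTimeTaken })]
public IEnumerable<string> Get()
{
    return new string[] { "value1", "value2" };
}
```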

4-Adding Installer to your project 

Now use "Add New Item" to add an installer class to your project. You might not have seen this among the item templates but it is definitely there (the screenshot is from VS2012, but the project is .NET 4.0 so this can also be done in VS2010):


After adding the class, override Install and Uninstall methods and add the code below (hit F7 to see the code):

public override void Install(IDictionary stateSaver)
{
    base.Install(stateSaver);
    PerfItRuntime.Install();
}

public override void Uninstall(IDictionary savedState)
{
    base.Uninstall(savedState);
    PerfItRuntime.Uninstall();
}

5-Use installutil.exe to install counters

Make sure you open an administrative command window. Use the Visual Studio command prompt, since it has InstallUtil.exe on its path. cd to your bin folder and register your assembly:
c:\projects\myproject\bin>InstallUtil.exe -i MyWebApplication.dll
This should do the trick. Use -u switch to uninstall the counters.

That is all that is needed. Just hit your app and start watching the counters in perfmon.exe.

Turning off the publishing of counters

Under normal/production circumstances, you might want to turn off publishing of the performance counters. In that case, put the line below in the appSettings of your web.config:

<appSettings>
    <add key="perfit:publishCounters" value="false"/>
</appSettings>

I have kept publishing counters as the default behaviour - to eliminate one configuration step in getting up and running. This may change in the future, but it is unlikely.

PerfIt! Roadmap

I needed to add performance counters to my Web API application. With the wife away and a few Easter bank holiday days to spare, I managed to start and finish version 0.1 of the PerfIt! library.

Future work includes adding more counter types and improving the pipeline. PerfIt! has been built on top of its own extensibility framework, so it is very easy to add your own counters: all you need to do is implement CounterHandlerBase and then register the handler in PerfItRuntime.HandlerFactories.
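As a rough sketch of what that plugging-in might look like - note that the exact members of CounterHandlerBase and the signature of the HandlerFactories entries are assumptions here, so treat this as pseudocode and check the PerfIt! source for the real shapes:

```csharp
// Pseudocode sketch - member names below are assumed, not verified.
public class CacheHitRatioHandler : CounterHandlerBase
{
    // ... override the abstract members of CounterHandlerBase here,
    //     creating and updating a PerformanceCounter of your chosen type ...
}

// At startup, register the handler so the runtime knows how to build it
// (the key and the factory delegate signature are assumptions):
PerfItRuntime.HandlerFactories.Add("CacheHitRatio",
    /* factory delegate that creates a CacheHitRatioHandler */);
```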

For any problems, issues, comments or feedback, please use the GitHub issues to get in touch. Enjoy!

Thursday, 21 March 2013

Performance series - Memory leak investigation Part 1

[Level T2] If you have worked a few years in the industry, it is very unlikely that you have not encountered a case of memory leak. Whether in your own code or someone else's, it is an annoying problem to diagnose and fix. The scale of the problem, and the effort needed to fix it, increase exponentially with the scale and complexity of the application and its deployment.

Many companies use application benchmarking and performance testing to find these problems early on. But as I have experienced myself, problems found early in performance testing are not necessarily easier to solve. The lengthy cycle of gathering metrics, analysing them, forming one or more hypotheses, then fixing, trying and testing can be very time-consuming, jeopardising delivery and deadlines.

In these cases, focusing on what is important and what is not saves the day. A common problem in troubleshooting performance bugs is information overload and the presence of red herrings that confuse the picture. There is a myriad of performance counters, and you could spend hours or days explaining anomalies that have nothing to do with the problem you are trying to solve. Having worked as a medical doctor before, I have seen this in the field of medicine many, many times. The human body - a very big application, infinitely more complex than any piece of software - demonstrates similar behaviour, and I have learnt to focus on identifying and fixing the problem at hand rather than explaining each and every oddity.

As such, I am starting this series with an overview of the performance counters and tools to be used. There are many posts and even books on this very topic. I am not claiming that this series is a comprehensive and all-you-need guide. I am actually not claiming anything. I am simply sharing my experience and hope it will make your troubleshooting journey less painful and help you find the culprit sooner.

A few notes on the platform

This is exclusively a Windows and mainly a .NET series - although parts of it could help with non-.NET applications. The focus is also mainly on ASP.NET web applications, since there the impact of even a small memory leak can be very big, but the same guidelines apply equally to desktop applications.

What is a memory leak?

A memory leak is usually defined as memory that has been allocated by the application and has become inaccessible - and as such cannot be deallocated. But I do not like this definition. For example, the code below does not qualify as a memory leak under it, yet it will lead to out of memory:

var list = new List<byte[]>();
for (int i = 0; i < 1000000; i++)
{
   list.Add(new byte[500 * 1024]); // 500 KB
}

As can be seen, the data is still accessible to the application, but the application never releases the allocated memory and keeps piling up allocations on the heap.

For me, a memory leak is a constant or stepwise allocation of memory without deallocation, which usually leads to an OutOfMemoryException. The reason I say usually is that small leaks can be tolerated: many desktop applications are closed at the end of the day, and many websites recycle their app pools daily. However, they are still leaks and you have to fix them when you get the time, since a change in conditions can turn a leak you could tolerate into one that brings down your site or application.
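A classic real-world source of this pattern in .NET is a static event (or static collection) that keeps objects reachable for the lifetime of the process. The self-contained sketch below (class and member names are made up for illustration) "leaks" one Subscriber per iteration, because the static event's delegate list still references each instance even though the code appears to discard it:

```csharp
using System;

public static class Broadcaster
{
    // Static event: every subscriber stays reachable for the process lifetime.
    public static event EventHandler SomethingHappened;

    public static void Raise()
    {
        var handler = SomethingHappened;
        if (handler != null) handler(null, EventArgs.Empty);
    }
}

public class Subscriber
{
    private readonly byte[] _payload = new byte[500 * 1024]; // 500 KB

    public Subscriber()
    {
        // Subscribing without ever unsubscribing roots this instance
        // (and its 500 KB payload) via the static event's delegate list.
        Broadcaster.SomethingHappened += delegate { var length = _payload.Length; };
    }
}

public class Program
{
    public static void Main()
    {
        for (int i = 0; i < 1000; i++)
        {
            new Subscriber(); // looks discarded, but is still rooted
        }

        // Even a forced full collection cannot reclaim the ~500 MB held above.
        Console.WriteLine(GC.GetTotalMemory(true));
    }
}
```

Fixing this kind of leak means unsubscribing (`-=`) when the subscriber is done, or using a weak-event pattern.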

Process memory/time in 4 different scenarios

In the top right diagram, constant allocation coincides with deallocation of memory - as such, not a leak. This scenario is common with ASP.NET caching (HttpRuntime.Cache), where 50% of the cache is purged when memory usage reaches a certain threshold.

In the bottom right diagram, deallocation either does not happen or proceeds at the same rate as allocation once the memory size of the application's process reaches a threshold. This pattern can be seen with SQL Server, which uses as much memory as it can until it reaches a threshold.

How do I establish a memory leak?

All you have to do is ascertain that your application meets the criteria described above: observe its memory usage over time, under load. It is possible that the leak happens only under a certain condition or when a particular feature of the app is used, so you have to be careful about that.

What tool do you need for this? Even Task Manager can be good enough to start with.

Tools to use 

Normally you would start with Task Manager. A quick look at it - eye-balling the memory size of the process while the app is running and under load - is a simple but effective start.

The mainstay of memory leak analysis is Windows Performance Monitor, aka Perfmon. It is essential for establishing a benchmark, monitoring changes across version releases and identifying problems. In this post I will look at some essential performance counters.

If you are investigating a serious or live memory leak, you need a .NET memory profiler. Three popular profilers are:

  1. Red Gate's ANTS Memory Profiler
  2. JetBrains' dotTrace
  3. SciTech's .NET Memory Profiler

Each tool has its own pros and cons. I have experience with SciTech's and I must say I am really impressed by it.

The most advanced tool is WinDbg, which is really powerful but has a steep learning curve. I will look at using it for memory leak analysis in future posts.

Initial analysis of the memory leak

Let's look at a simple memory leak (you can find the snippets here on GitHub):

var list = new List<byte[]>();
for (int i = 0; i < 1000; i++)
{
    list.Add(new byte[1 * 1000 * 1000]);
    Thread.Sleep(200);
}

The Process\Private Bytes performance counter is a popular measure of total memory consumption. Let's look at this counter for the app above:


So how do we find out whether it is a managed or an unmanaged leak? We use the .NET CLR Memory\# Bytes in all Heaps counter:


As can be seen above, while private bytes increase constantly, heap allocation happens in steps. This stepwise increase of the heap is consistent with a managed memory leak. In contrast, let's look at this unmanaged leak code:

var list = new List<IntPtr>();
for (int i = 0; i < 400; i++)
{
    list.Add(Marshal.AllocHGlobal(1*1000*1000));
    Thread.Sleep(200);
}
list.ForEach(Marshal.FreeHGlobal);

And here # Bytes in all Heaps stays flat:


So this case is an unmanaged memory leak.
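If you want to capture these two counters from code rather than watching them in perfmon, the standard System.Diagnostics API can read both. A minimal sketch (counter and category names are exactly as they appear in perfmon; for both categories the instance name is the process name):

```csharp
using System;
using System.Diagnostics;

public class Program
{
    public static void Main()
    {
        string instance = Process.GetCurrentProcess().ProcessName;

        // Total (managed + unmanaged) memory committed by the process.
        var privateBytes = new PerformanceCounter(
            "Process", "Private Bytes", instance, readOnly: true);

        // Managed memory only - all GC heaps combined.
        var heapBytes = new PerformanceCounter(
            ".NET CLR Memory", "# Bytes in all Heaps", instance, readOnly: true);

        Console.WriteLine("Private Bytes:  {0:N0}", privateBytes.NextValue());
        Console.WriteLine("Bytes in heaps: {0:N0}", heapBytes.NextValue());
    }
}
```

A large and growing gap between the two values points at an unmanaged leak; both growing together points at a managed one.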

In the next post I will look at a few more performance counters and at GC issues.