Saturday, 18 January 2014

Am I going to keep supporting my open source projects?

I have recently been asked by a few people whether I will keep maintaining my open source projects, mainly CacheCow and PerfIt.

Since my previous blog post, some seem to think that I am going to completely drop C#, Microsoft technologies and whatever I have been building so far. There are also some concerns over the future of the open source projects and whether they will be pro-actively maintained.

If you read my post again, you will find that I stressed twice that I will keep doing what I am doing. In fact I have lately been doing a lot of C# and Azure - like never before. I really like Windows Azure and I believe it is a great platform for building cloud applications.

So in short, the answer is a BIG YES. I just pushed some code to CacheCow and released CacheCow.Client 0.5.0-alpha, which has a few improvements. I have plans for finally releasing 0.5, but it has been delayed by my engagement with another project I am working on - as I said, C# and Azure.

So please keep PRs, Issues, Bugs and wish lists coming in.

As for my decision and the recent blog post: it is a change of strategy, hoping to position myself within 2 years to be able to do non-Microsoft work equally well if I have to. I hope it is clear now!


Saturday, 21 December 2013

Thank you Microsoft, and so long ...

[Level C1]

Ideas and views expressed here are purely my own and not those of my employer or any related affiliation.

Yeah, I know. It has been more than 13 years. I still remember how it started. It was New Year's Eve of the new millennium (yeah, year 2000) and I was standing on the shore of the Persian Gulf in Dubai, my stomach full of delicious Lebanese grilled food and a local newspaper in hand, looking for jobs. I was a frustrated doctor, hurt by patient abuse and discouraged by the lack of appreciation and creativity in my job - and by the very small section of medical jobs compared to pages, after pages, after pages of computer jobs. Well, it was the dot-com boom, you know, the bubble - so we know now. And that was it: a life-changing decision to teach myself network administration and get an MCSE certificate.

Funny thing is, I never did. I got distracted, side-tracked by something much more interesting. Yeah, programming: you know, when it gets its hooks in you, it never lets go. It started with Visual Basic, then HTML, and obviously databases, which started with Microsoft Access. Then you had to learn SQL Server, and not to forget web programming: ASP 3.0. I remember I printed some docs and it was 14 pages of A4. It had everything I needed to write ASP pages, connect to a database, store, search - everything. With those 14 pages I built a website for some foreign journalists working in Iran: IranReporter - it was up until a few years ago. I know it looks so out-dated and primitive, but for 14 pages of documentation, not so bad.

Then it was .NET, then the move to C# - and the rest of the story I won't bore you with. But what I knew was that the days of 14-page documentation were over. You had to work really hard to keep up. ASP.NET WebForms, which I never liked and never learnt. With .NET 3.0 and 3.5 came WCF, WPF and LINQ, and there was Silverlight. And there were ORMs, DI frameworks, unit testing and mocking frameworks. And then there was the JavaScript explosion.

And for me, ASP.NET Web API was the big thing. I have spent the best part of the last 2 years learning, blogging, helping out, writing OSS libraries for it and then co-authoring a book on this beautiful technology. And I love it. True, I have not been recognised for it and am not an MVP - and honestly I am not sure what else I could do to deserve to become one - but that is OK; I did what I did because I loved and enjoyed it.

So why Goodbye?

Well, I am not going anywhere! I love C# - I don't like Java. I love Visual Studio - and Eclipse is a pain. I adore ASP.NET Web API. I love NuGet. I actually like Windows and dislike Mac. So I will carry on writing C# and using ASP.NET Web API and Visual Studio. But I have to shift focus. I have no other option, and in the next few lines I will try to explain what I mean.

The reality is that our industry is on the cusp of a very big change. Yes, a whirlwind of change has already happened over the last few years but, to put it bluntly, I was blind.

So based on what I have seen so far I believe IN THE NEXT 5 YEARS:

1. All data problems become Big Data problems

If you don't like generalisations you won't like this one either - when I say all, I mean most. But the point is that industries and companies will either shrink or grow based on their IT agility. The ones shrinking are not worth talking about. The ones growing will have big data problems, if they do not have them already. This is driven by the growth of internet users as well as the realisation of the Internet of Things. The other reason is the adoption of event-based architecture, which will massively increase the amount of data movement and data crunching.

Data will be made of what currently is not data. Your mouse moves and clicks are the kind of data you will have to deal with. Or user eye movement. Or input from thousands of sensors inside your house. And this is the stuff you will be working on day in, day out.

Dear Microsoft, Big Data solutions did not come from you. And despite my wishes, you did not provide a solution of your own. I would love to see an alternative attempt and a rival to Hadoop to choose from. But I am stuck with it, so I had better embrace it. There are some alternatives emerging (real-time, in-memory) but again they come from the Java communities.

2. Any server-side code that cannot be horizontally scaled is gonna die

Is that even a prediction? We have had web farms for years and have had to think about robust session management all this time. But something has changed. First of all, it is not just about server-stickiness any more: you have to recover from site-stickiness and site failures too. On the other hand, this is again more about data.

So the closer you are to the web and the further from the data, the more likely your technology will survive. For example, API-layer technologies are here to stay, and as such ASP.NET Web API has a future. I cannot say the same for middleware technologies such as BizTalk or NServiceBus. Databases? Out of the question.

3. Data locality will still be an issue so technologies closer to data will prevail

Data will still be difficult to move in the years to come. While our bandwidth is growing fast, our data is growing faster, so this will probably become an even bigger problem. So where your data lives will be key in choosing your technologies. If you use Hadoop, using Java technologies will make more and more sense. If your data is in Amazon S3, you are more likely to use Kafka than Azure Service Bus.

4. We need 10x or 100x more data scientists and AI specialists

Again not a prediction. We are already seeing this. So perhaps I can sharpen some of my old skills in computer vision.

So what does it mean for me?

As I said, I will carry on writing C# and ASP.NET Web API, reading and writing from SQL Server, and doing my best to write the best code I can. But I will start working on my Unix skills (by using it at home), pick up one or two JVM languages (Clojure, Scala) and work on Hadoop. I will take Big Data more seriously. I mean really seriously... I need to stay close to where innovation is happening, which sadly is not Microsoft right now. From Big Data to Google Glass and cars, it is all happening in the Java communities - and by Java communities I do not mean the language but the ecosystem around it. And I will be ready to jump ship if I have to.

And I still wish Microsoft would wake up to the sound of its shrinking - not only in the PC market but also in its bread and butter, the enterprise.

PS: This is not just Microsoft's problem. It is also a problem with our .NET community. The fight between Silverlight/XAML and JavaScript took so many years. In the community I am in, we waste so much time fighting over OWIN. I have had enough; I know how to best use my time.

Sunday, 1 December 2013

CacheCow 0.5.0-alpha released for community review: new features and breaking changes

[Level T3]

CacheCow 0.5 was a long time coming, and with everything that has been happening I am already a month behind. So I have released it as an alpha so I can get some feedback from the community.

As some of you know, a few API improvements had been postponed due to breaking changes. On the other hand, there were a few areas I really needed to nail down - the most important one being resource organisation. While resource organisation can be anything, the most common approach is to stick with the standard Web API routing, i.e. "/api/{controller}/{id}", where api/{controller} is a collection resource and api/{controller}/{id} is an instance resource.

I will go through some of the new features and changes, and would really appreciate it if you spent a bit of time reviewing them and sending me feedback.

No more dependency on WebHost package

OK, this had been asked for by Darrel and Tugberk and just made sense. In version 0.4 I added a dependency on HttpConfiguration but, in order not to break compatibility, it would default to GlobalConfiguration.Configuration in WebHost. This has been removed now, so an instance of HttpConfiguration needs to be passed in. This is a breaking change, but fixing it should require only a minimal change in your code, from:
var handler = new CachingHandler(entityTagStore);
config.MessageHandlers.Add(handler);
to:
var handler = new CachingHandler(config, entityTagStore);
config.MessageHandlers.Add(handler);
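Put together, a typical server-side setup in 0.5 might look like the sketch below - InMemoryEntityTagStore is used as an example here; any IEntityTagStore implementation can be passed in:

var config = GlobalConfiguration.Configuration; // or your self-host HttpConfiguration
var entityTagStore = new InMemoryEntityTagStore(); // example store; swap for SQL Server, Redis, etc.
var handler = new CachingHandler(config, entityTagStore);
config.MessageHandlers.Add(handler);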

CacheKeyGenerator is gone

CacheKeyGenerator provided flexibility in generating a CacheKey from the resource and the Vary headers. The reality is, there is no need for an extensibility point here: this is a part of the framework that needs to stay internal. At the end of the day, the Vary headers and the URL of the resource are what matter for generating the CacheKey, and that is implemented inside CacheKey. Exposing CacheKeyGenerator had led to some misuse and confusion surrounding its use cases, so it had to go.

RoutePatternProvider is reborn

A lot of work has been focused on optimising RoutePatternProvider and its possible use cases. First of all, the signature has been changed to receive the whole request rather than bits and pieces (the URL and the values of the Vary headers as a SelectMany result).

The work basically looked at the meaning of the cache key. We have two connected concepts: representation and resource. A resource can have many representations (XML, image, JSON, different encodings or languages). In terms of caching, a representation is identified by a strong ETag while a resource will have a weak ETag, since it cannot be guaranteed that all its representations are byte-by-byte equal - and in fact they are not.

A resource and its representations

It is important to note that each representation needs to be cached separately; however, once the resource changes, the cache for all of its representations is invalidated. This was the fundamental piece missing in the CacheCow implementation and it has been added now. IEntityTagStore has a new method: RemoveResource. This method is pivotal and gets called when a resource changes.
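For reference, the store now looks roughly like this - a sketch based on the method names described in this post; check the CacheCow source for the exact signatures:

// rough sketch, not exact signatures
public interface IEntityTagStore
{
    bool TryGetValue(CacheKey key, out TimedEntityTagHeaderValue eTag);
    void AddOrUpdate(CacheKey key, TimedEntityTagHeaderValue eTag);
    bool TryRemove(CacheKey key);

    // new in 0.5: called when a resource changes so that all of its
    // representations are invalidated in one go
    int RemoveResource(string resourceUri);
}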

RoutePatternProvider is also no longer a delegate; it is now an interface: IRoutePatternProvider. The reason for moving away from a delegate is that there is a new method in there too: GetLinkedRoutePatterns. So what is a RoutePattern?

A RoutePattern is basically similar to a route and its definition conventions in ASP.NET; however, it is meant for resources and their relationships. I will write a separate post on this, but basically a RoutePattern relates collection resources and instance resources. Let's imagine a car API hosted at http://server/api. In this case, http://server/api/car refers to all cars while http://server/api/car/123 refers to the car with Id of 123. To represent these, we use the * sign for collection resources and the + sign for instance resources:

http://server/api/car         collection         http://server/api/car/*
http://server/api/car/123     instance           http://server/api/car/+  

Now normally:

  • POST to collection invalidates collection
  • POST to instance invalidates both instance and collection
  • PUT or PATCH to instance invalidates both instance and collection
  • DELETE to instance invalidates both instance and collection

So in brief, change to collection invalidates collection and change to instance invalidates both.

In a hierarchical resource structure this can get even more complex. If /api/parent/123 gets deleted, /api/parent/123/* (its child collections) and /api/parent/123/+ (its child instances) will most likely be invalidated too, since they do not exist any more. However, implementing this without making some major assumptions is not possible, hence the implementation is left to CacheCow users to tailor to their own API - see the sketch after the next paragraph.

However, since a flat resource organisation (defining all resources using "/api/{controller}/{id}") is quite common, I have implemented IRoutePatternProvider for flat structures in ConventionalRoutePatternProvider, which is the default.
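If you do need hierarchical invalidation, a custom provider along these lines is one option - a minimal sketch, assuming the interface shape described above (the exact signatures in the source may differ):

// minimal sketch of a custom provider for hierarchical APIs:
// deleting /api/parent/123 also invalidates its child collection
// and child instance resources
public class HierarchicalRoutePatternProvider : IRoutePatternProvider
{
    public string GetRoutePattern(HttpRequestMessage request)
    {
        return request.RequestUri.AbsolutePath; // e.g. /api/parent/123
    }

    public IEnumerable<string> GetLinkedRoutePatterns(HttpRequestMessage request)
    {
        var path = request.RequestUri.AbsolutePath.TrimEnd('/');
        yield return path + "/*"; // child collection resources
        yield return path + "/+"; // child instance resources
    }
}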

Prefixing SQL Server Script names with Server_ and Client_

The fact was that the SQL Server EntityTagStore (in the server package) and CacheStore (in the client package) had some stored procedure name clashes, preventing use of the same database for both client and server components. This was requested by Ben Foster and has been done now.

Please provide feedback

I hope this makes sense and I am waiting for your feedback. Please ping me on Twitter or ask your questions on GitHub by raising an issue.

Happy coding ...

Thursday, 21 November 2013

HTTP Content Negotiation on higher levels of media type

[Level T3]

TLDR; [This is a short post, but anyhow] You can now achieve content negotiation on the Domain Model or even Version levels (levels 4 and 5) in the same ASP.NET Web API controller.

As I explained in the previous post, there are real opportunities in adopting 5LMT. On one hand, semantic separation of all these levels of media type helps the organic development of a host of clients with different capabilities and needs. On the other hand, it solves some existing challenges of REST-based APIs.



In the last post we explored how we can have a Roy-correct resource organisation (URLs, I mean, of course) and take advantage of content-based action selection, which is not natively possible in ASP.NET Web API.

Content negotiation is an important aspect of HTTP. It is the dance between client and server to negotiate the optimum media type for a resource. The most common implementation is server-side content negotiation, in which the server decides the format based on the client's preferences, expressed through the Accept, Accept-Language, Accept-Encoding, etc. headers.

For example, the page you are viewing is probably the result of this content negotiation:

Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8

This basically means: give me HTML or XHTML if you can (q=1.0 is implied here), if not then XML is OK, and otherwise I can live with the image/webp format - which is Google's experimental media type, as I was using Chrome.

So, let's have a look at the media types expressed above:

text/html             -> Format
application/xhtml+xml -> Format/Schema
image/webp            -> Format

So as we can see, content negotiation in this request mainly deals with format and schema (levels 2 and 3). This level of content negotiation comes out of the box in ASP.NET Web API. Responsibility for this server-side content negotiation falls on the MediaTypeFormatters and, more importantly, the IContentNegotiator implementation. The latter is a simple interface taking the type to be serialised, the request and a list of formatters, and returning a ContentNegotiationResult:

IContentNegotiator

public interface IContentNegotiator
{
    ContentNegotiationResult Negotiate(Type type, HttpRequestMessage request, 
        IEnumerable<MediaTypeFormatter> formatters);
}
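To see the negotiator in action, you can invoke it by hand - roughly like this, with SomeDomainModel standing in for your own type:

var config = new HttpConfiguration();
var request = new HttpRequestMessage(HttpMethod.Get, "http://server/api/my");
request.Headers.Add("Accept", "application/xml;q=0.9, application/json");

// ask the registered negotiator which formatter/media type would win
var negotiator = config.Services.GetContentNegotiator();
var result = negotiator.Negotiate(typeof(SomeDomainModel), request, config.Formatters);
// result.Formatter is the chosen MediaTypeFormatter and
// result.MediaType the negotiated content type (application/json here)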

This is all well and good. But what if we want to ask for SomeDomainModel or AnotherDomainModel (level 4)? Or, even more complex, ask for a particular version of the media type (level 5)? Unfortunately ASP.NET Web API does not provide this feature out of the box; however, using 5LMT this is very easy.

Let's imagine you have two different representations of the same resource - perhaps one was added recently and can be used by some clients. The new representation can be a class with added properties or, even more likely, with some existing properties removed, resulting in a breaking change. How would you approach this? Using 5LMT, we add a new action to the same controller returning the new representation:

public class MyController
{
    // old action
    public SomeDomainModel Get(int id)
    {
        ...
    }

    // new action added
    public AnotherDomainModel GetTheNewOne(int id)
    {
        ...
    }
}
The request to return each of these will be based on the value of the Accept header:
GET /api/My
Accept: application/SomeDomainModel+json
or to get the new one:
GET /api/My
Accept: application/AnotherDomainModel+json
In order to achieve this, all you have to do is to replace your IHttpActionSelector in the Services of your configuration with MediaTypeBasedActionSelector.
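Assuming its default constructor (the AddFiveLevelsOfMediaType() call shown below does this wiring for you), the manual replacement is a one-liner:

// replace the default action selector so that the domain-model
// media type parameter drives action selection
config.Services.Replace(typeof(IHttpActionSelector),
    new MediaTypeBasedActionSelector());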

This is the non-canonical representation of the media type. You may also use the 5LMT representation, which I believe is superior:
GET /api/My
Accept: application/json;domain-model=SomeDomainModel
or to get the new one:
GET /api/My
Accept: application/json;domain-model=AnotherDomainModel

In order to use this feature, get the NuGet package:
PM> Install-Package AspNetWebApi.FiveLevelsOfMediaType
and then add this line to your WebApiConfig.cs:
config.AddFiveLevelsOfMediaType();
MediaTypeBasedActionSelector class provides two extensibility points:

  • domainNameToTypeNameMapper, a delegate which takes the name passed in the domain-model parameter and returns the class name of the type to be returned (if they are not the same, although it is best to keep them the same)
  • Another one, which deserves its own blog post, allows for custom versioning.

Happy coding...

Wednesday, 13 November 2013

Content-based action selection in ASP.NET Web API

[Level T3]

TLDR; You can now achieve content-based action selection in ASP.NET Web API

You probably will not remember, but last year I proposed a parameter-based implementation of media types that separates the different levels of media type - Five Levels of Media Type, or 5LMT. That discussion was mainly theoretical, focusing on media type semantics and the problems of the popular (yet non-canonical) approaches to media types that are prevalent in the community. The article identified 5 levels (human legibility, format, schema, domain model and version); separating these helps different clients with different capabilities to use the levels they are capable of understanding, without having to fully comprehend all levels.

But really, is this not just nitpicking, arguing issues which have no practical importance? Well, I think not - and here is why. If you would like to see some code instead of endless dry discussions, you will not be disappointed.

If you visit StackOverflow and look for ASP.NET Web API (or REST APIs generally), there are many questions on REST resource design. Coming from an RPC mindset, it is difficult to map all methods to GET, POST, PUT and DELETE. Sometimes we run out of verbs and resort back to RPC-style APIs.

Let's imagine you are designing a digital music player API and you have identified these resources and operations:

Play           PUT     /api/player
Stop           DELETE  /api/player
Vol up         POST    /api/player/volume
Vol down       POST    /api/player/volume
Next           POST    /api/player/song
Previous       POST    /api/player/song
Back to first  PUT     /api/player/song

This is a RESTful resource design (note that idempotent operations have been mapped to PUT or DELETE) and it looks good. But there is a problem: we have the same verb appearing twice or more on some routes, and ASP.NET Web API will throw an exception and complain about ambiguous actions.

At this point, some finally give in and define /api/player/song/next and /api/player/song/previous, i.e. RPC-style URLs. And this is a downward spiral, where your resources are verbs rather than nouns.

The solution is to go back to REST. One of the tenets of REST is self-descriptive messages: we should be able to express our intention with the HTTP message itself. In other words, sending this message to turn up the volume:
POST /api/player/volume HTTP/1.1
Content-Type: application/json;domain-model=IncreaseVolumeCommand

{"notch":"11"}
And that is it. Our HTTP message provides enough information for our Web API to select the right action, as long as that action has been defined as something like:

public void Post(IncreaseVolumeCommand command)
{
   ...
}
or
public void Post(HttpRequestMessage request, IncreaseVolumeCommand command, ...)
{
   ...
}
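For completeness, the command itself can be a plain DTO whose class name matches the domain-model parameter - a minimal sketch, with the property name inferred from the JSON payload above:

// the class name matches the domain-model parameter in Content-Type;
// Notch binds to the "notch" field of the JSON payload
public class IncreaseVolumeCommand
{
    public int Notch { get; set; }
}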
And how is this achieved? A little utility in the 5LMT library called MediaTypeBasedActionSelector does it. To get the library, just add the NuGet package:
PM> Install-Package AspNetWebApi.FiveLevelsOfMediaType
And then in your WebApiConfig class, add the line below:
config.AddFiveLevelsOfMediaType();
What this line does is replace the default action selector with MediaTypeBasedActionSelector and wrap your media type formatters in a decorator. As such, make sure you add this line after you have made all changes to your media type formatters.

After this change, your API will send 5LMT parameters. For example:
Content-Type: application/json; charset=utf-8; domain-model=InventoryItemDetail; version=1.0.0.0; format=application%2fjson; schema=application%2fjson; is-text=True

Another useful feature is support for non-canonical media types. For example, you can send this HTTP request instead of the POST discussed above:
POST /api/player/volume HTTP/1.1
Content-Type: application/vnd.MyCompany.IncreaseVolumeCommand-1.0.0.0+json

{"notch":"11"}

While the use of this media type is discouraged by RFC 2616 (since the above media type is not registered with IANA), it still works with the 5LMT library.

One last thing: this library depends on AspNetWebApi.ApiActionSelector, which makes the action selection implementation of ASP.NET Web API extensible. Unfortunately, most of the original implementation is internal and does not expose extensibility points, so AspNetWebApi.ApiActionSelector had to achieve this with a lot of copy and paste :)

The code for this library is hosted on GitHub.

Happy coding...

Sunday, 27 October 2013

REST APIs and restaurants [Take II*]

[Level C3]



 - "... and sauteed with the finest aromatic saffron - we import directly from our suppliers in Tibet - and garnished with the finest quality ocra from the heart of Morocco and ..."

- : ".. hmmm, bu..."

- : " and Sir, the King Shrimps day fresh from our suppliers in Belgium mildly cooked with a hint of lime zest and aromatic Thai basil snuggled with a swirl of Cornish Creme Fraiche and blossoms of Burmese jasmine and ..."

- : "... I mean, ..."

- : " and I should remind you that our head chef has used a special blend of exotic spices, marinated for 42 minutes exact, to create a sensual oriental effect of ... "

- : "I REALLY DON'T CARE! This stuff is awful!"



⚑  ⚑  ⚑

In more than one way, a REST API is similar to a restaurant. Both hide a very complex process of accomplishing a service (serving food or data) behind a nice, warm and welcoming presentation. And we need to bear in mind that the presentation is the key: no matter what goes on behind the scenes in the kitchen, the food and its presentation are what matter. Your API is the menu, the door to the services provided by the server. Whatever goes on in the kitchen (your server's internal domain) is not the concern of your customers, as long as the output and presentation are good.

Commonly, the perspective from which we look at our services is wrong: it is server-centric. [I have talked before about the definition of server (and client) and their interaction.] In brief, when we talk about the server, this is what comes to mind (as per CSDS):


This view is useful when building a server. It highlights the inherent complexity of the server and how it should hide that complexity behind its API. But it is way too easy to get attached to this bird's-eye view of the server, where all of its internals are visible.

And it is too easy to forget that this is how the client sees the server:

Client perspective of a server, where it sees a black box (here it's blue!) and
the gateway to the server is through its API.
When talking of the domain of an API, we are very likely to mean the server's private domain - its internals and complex processes. In reality, the server's internal domain does not surface beyond the API layer. The API layer is responsible for defining resources, which in turn define a coherent public domain composed of the API and its messages - abstracted away from the server's internal domain. If the server is an online retail API, I can create, edit or cancel orders. The fact that this is achieved through contact with many suppliers and their own custom processes is not a concern of the client.

In fact, in the same way that the kitchen of a restaurant and its dining area are completely separated, the server's public domain is an abstraction on top of its internal domain. This is good for the chef and for the customer. It is true that the food comes out of the kitchen, but the whole process of preparation is irrelevant to the customer.

Why is talking of the server's public domain important? Because we commonly look down on the public domain. When we talk of the DTO messages passed to and fro, we regard them as view models - not even proper models, just DTOs, no behaviour, just property bags. We use a mapper to strip the sensitive data from our real models and spit out the view models to the client. And when we talk about DDD, we only mean the server - we think view models do not deserve the love and care of DDD.

But this results in a brittle API that has to change with every change of the server. It becomes a cut-down version of the server's internal API. We all know examples of good and bad APIs, but too many changes in an API are usually a sign of bad design born of server chauvinism. So always do a separate DDD exercise for your server's public domain. It might sound strange, since I am in favour of no-business-logic-in-API-layer. Well, it is true - we do not implement any business logic in the API layer, but the biggest responsibility of the API layer is talking HTTP. In REST, this translates to defining resources.

So practise DDD in your API layer. Define its aggregate roots, entities and value types - it is simplistic to think such concepts can be entirely hidden from the clients. Specify the commands and queries, with commands usually being the payload of PUT, DELETE and POST requests, and queries usually expressed in the URL itself to make them cache-friendly.



What we see in the picture above defines the server's public domain: the API (a set of resources), input messages (commands and queries) and output messages (entities and value types). The case of queries is - as we said - a bit special, since they are normally expressed in the URL. This has the benefit of cacheability, in contrast with sending a payload (using other methods), which is not suitable for caching. As an example, here is Twitter's API for getting status timelines:

GET /1.1/statuses/user_timeline.json?screen_name=twitterapi&count=2

Defining parameters as query strings is useful as they can also imply optionality. One downside to this approach is that the ordering of parameters cannot be expressed in a query string; while any combination is valid, semantically equal combinations will be considered different resources and cached separately. For example, screen_name=twitterapi&count=2 and count=2&screen_name=twitterapi are semantically equal while they are considered different resources. This reduces the effectiveness of caching for GET requests. For this reason, URL segments separated by / are superior, although in a big API they can result in prohibitive routing complexity.

Items returned in this API are entities; this is evident from their having an ID (id_str) returned as part of the data.


Looking at what each timeline contains, we can see a list of urls, where each url returned is a value type. It has three properties: url, indices and expanded_url.

In the case of commands we have a similar situation. Looking at an example from Twitter, tweeting (sending a tweet) is achieved using this command:

POST /1.1/statuses/update.json

status=Maybe%20he%27ll%20finally%20find%20his%20keys.%20%23peterfalk

Perhaps a better implementation would have been to omit update.json entirely, as this naming has an RPC smell. Using singular or plural is mainly down to taste, although I prefer singular. The command can be expressed by the simple class below:
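A plausible sketch - assuming a hypothetical name such as UpdateStatusCommand, mirroring the status field of the payload:

// hypothetical name; a simple command DTO carrying the tweet text
public class UpdateStatusCommand
{
    public string Status { get; set; }
}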



Not all commands have to have a payload. Deleting a tweet could send just an ID in the URL with the DELETE verb, although the actual Twitter API uses a POST instead.

Resources are basically everything that can be accessed through the API, although a more useful definition is the grouping around identities. For example, the Flickr API defines cameras, people, photos, photo comments, etc. It is true that Flickr is not considered a good example of a REST API when it comes to HTTP semantics; however, it does a good job of defining a coherent domain represented by resources while hiding its internals. A simple example is how it sends you the previous and next photos instead of letting the client guess what these are based on internal implementation.

In brief, it is the responsibility of the API to create a coherent set of resources that lay out a cohesive picture in front of the client.

Common Anti-Patterns

As we said, the main danger is having a server-centric perspective and neglecting the public domain of your API.

While performing an appropriately-sized DDD exercise is important, we need to be careful not to expose too much of the server. One common anti-pattern is exposing the server's internals to the outside world. OData is a RESTful protocol which defines semantics for querying (mainly tabular) data stores:

GET /api/products?$filter=name eq 'Gizmo 3'&$orderby=price desc&$top=1

While OData itself does not necessarily prescribe exposing the server's internals, most implementations are actually built straight on top of an RDBMS, exposing its internal schema and potentially allowing ad-hoc changes using SQL-like commands rather than domain-rich commands. Netflix, which initially warmed up to the idea of exposing its films database to the public, reached the conclusion that this was not such a RESTful idea and retired its OData service.

Another issue is server-centric design: the server assuming how its services will be used. I believe (sorry, Roy!) HATEOAS touches the subjects discussed here. When we talk of Hypermedia As The Engine Of Application State, which application do we mean? OK, let's consider Twitter's API. How can the same hypermedia equally drive the state of a Twitter client, a game that posts scores on Twitter, a 4Square app that declares you a mayor on Twitter, or a realtime map showing the location of tweets using the firehose data pipe? The relationship of clients to servers is not one-to-one any more, and the effectiveness of HATEOAS is questionable. The application is defined by the client, not the server; as such, the server's hypermedia cannot be the engine of the application.

At the end of the day, none of us wants to sound like the pompous chef who bores the customers with self-flattery. The server serves - so design it for servanthood.


* this post was originally posted slightly differently under the name: "Server Chauvinism - API's public domain"

Sunday, 13 October 2013

CacheCow update - Moving to MemoryCache for in-memory stores on both Client and Server

As you might know, CacheCow is a framework that implements HTTP caching for ASP.NET Web API, both for the client and the server. If you are familiar with it, feel free to jump to the section "So what is the change?".

HTTP caching, defined by the HTTP specification (currently at version 1.1 according to RFC 2616, although work on HTTP 2.0 is very close to finishing), is an important feature of HTTP, responsible for the scalability of the web as we know it.

It is important to remember that resources are cached on the client, not on the server. In ASP.NET we can use Output Caching and System.Web.Caching.Cache to cache items on the server, but HTTP caching is different in that resources get stored on the client. A resource is:

  • either not cacheable (identified by no-store value in the Cache-Control header)
  • or can be only cached by the client (identified by private value in the Cache-Control header)
  • or can be cached by the client and all intermediaries (identified by public value in the Cache-Control header)
CacheCow.Client looks after the storage of items in an implementation of the ICacheStore interface. Currently these implementations exist:
  • In-memory
  • Memcached
  • Redis
  • SQL Server
  • File-based
On the other hand, the server also needs to store cache metadata. This is normally information such as Last-Modified and the ETag of the resources. You configure CacheCow.Server to use one of several implementations of IEntityTagStore. Currently these implementations exist:
  • In-memory
  • Memcached
  • RavenDB
  • MongoDB
  • SQL Server

So what is the change?

In the latest release of CacheCow, which stands at 0.4.12, the in-memory implementations of ICacheStore and IEntityTagStore have been changed from being based on ConcurrentDictionary<TKey, TValue> to being based on MemoryCache. The problem with the dictionary-based implementations is that the store just grows and items are never freed. MemoryCache, on the other hand, is designed to keep its memory usage to a threshold and expel old or least frequently used items from the cache.

To use the in-memory stores, all you have to do is use the CachingHandler with its default constructor, which results in the in-memory stores being used, as they are the default. The good thing about these implementations is the ability to set a maximum memory limit. This is achieved in the app.config or web.config of your application:

<?xml version="1.0" encoding="utf-8" ?>
<configuration>
  <system.runtime.caching>
    <memoryCache>
      <namedCaches>
        <add name="TheName" cacheMemoryLimitMegabytes="40" pollingInterval="00:05:00" />
      </namedCaches>
    </memoryCache>
  </system.runtime.caching>
</configuration>

The configuration above sets the memory usage limit to 40MB, and this limit will be checked every 5 minutes. CacheCow uses these names for its MemoryCache objects:

  • CacheCow.Client: "###InMemoryCacheStore_###"
  • CacheCow.Server uses 2 MemoryCache objects:
    • Storing ETag and LastModified: "###_InMemoryEntityTagStore_ETag_###"
    • Storing route patterns: "###_InMemoryEntityTagStore_RoutePattern_###"

You can use the configuration to limit the memory used by these MemoryCache objects. Since MemoryCache is in-memory and lives in the same AppDomain as the application, it provides the fastest and most efficient storage.
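As an illustrative sketch - assuming the names above are used verbatim when CacheCow constructs its caches - a MemoryCache created with a name picks up the matching namedCaches entry from configuration:

// illustrative only: a named MemoryCache reads the matching
// <namedCaches> entry, so naming an entry "###InMemoryCacheStore_###"
// caps the memory of CacheCow.Client's in-memory store
var cache = new MemoryCache("###InMemoryCacheStore_###");
cache.Set("key", "value", new CacheItemPolicy
{
    SlidingExpiration = TimeSpan.FromMinutes(30) // may be trimmed earlier under memory pressure
});
Console.WriteLine(cache.CacheMemoryLimit); // reflects cacheMemoryLimitMegabytes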

Use in-memory storage wherever you can; it surely provides better performance compared to the likes of SQL Server.
The only caveat with these implementations is that, even though you may set a memory limit, memory usage can rise above your threshold. This is not an issue: it is related to garbage collection, and after a GC.Collect() the memory returns to the actual usage.