Sunday 10 March 2013

Pro tip: don't try abstracting the code world from the HTTP world

[Level T2] In Part 5 of my Web API series I explained how the world of HTTP (URI, headers, method, payload) meets the world of code (controller, actions, parameters, etc). In ASP.NET Web API this is partly done in the media type formatter, which is why I talked about the Tower Bridge of MediaTypeFormatter. The problem is that these two worlds will eventually collide hard if you ever try to abstract one away from the other.

So what am I on about?

The problem is routing, as I have raised a few times before. Here I will try to explain why, and propose a solution.

The routes basically live in the application setup. You define them outside your controllers and actions, usually inside global.asax or elsewhere in code that is called at application startup. The idea is that no matter what your routing is, if it is set up correctly, the values will be populated in your actions: all parameters and your data. You then return some data, which will be interpreted and serialised by your MediaTypeFormatters.

So what happens if I want to set the Location header in the response to a POST (when an item has been created)? Now I need to provide the route parameters to build a link to the new item. Although I have supposedly been abstracted away from the route, I have to know which parameters to pass. This is where things start to get a bit hairy: I have defined a route somewhere else and yet I need to know about it here.
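To make the coupling concrete, here is a minimal sketch in Python rather than actual Web API code. The route name `OrderItems`, its template, and the helpers `link` and `post_item` are all hypothetical; `link` only mimics the idea of generating a URL from a route name plus route values.

```python
import re

# Hypothetical route table, defined once at application startup,
# far away from the code that handles the request.
ROUTES = {
    "OrderItems": "api/orders/{orderId}/items/{itemId}",
}

def link(route_name, **values):
    """Fill a named route template; fail if a placeholder has no value."""
    template = ROUTES[route_name]

    def fill(match):
        key = match.group(1)
        if key not in values:
            raise KeyError(f"route '{route_name}' needs parameter '{key}'")
        return str(values[key])

    return "/" + re.sub(r"\{(\w+)\}", fill, template)

def post_item(order_id, item):
    """Hypothetical POST action: returns (status code, response headers)."""
    new_item_id = 42  # pretend the item was stored and given this id
    # The action must know the route's name AND its exact placeholder names.
    location = link("OrderItems", orderId=order_id, itemId=new_item_id)
    return 201, {"Location": location}
```

If someone renames `{itemId}` in the route table, `post_item` breaks at runtime even though nothing in the action itself changed. That is the leak in the abstraction.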

And that by itself is not a big problem. A bigger issue is that ASP.NET does the route matching linearly. So if I happen to define 1000 routes, and my 1000th route gets called often, I can be in trouble as performance can suffer: ASP.NET has to try to match the request against 999 routes before it hits the 1000th.
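The cost of linear matching can be illustrated with a small Python sketch. This is not ASP.NET's actual matcher, just an assumed simplification in which templates are tried in registration order; `match_template`, `linear_match` and the `resourceN` route names are made up for the example.

```python
def match_template(template, path):
    """Return the route values if path fits template, else None."""
    t_parts = template.strip("/").split("/")
    p_parts = path.strip("/").split("/")
    if len(t_parts) != len(p_parts):
        return None
    values = {}
    for t, p in zip(t_parts, p_parts):
        if t.startswith("{") and t.endswith("}"):
            values[t[1:-1]] = p   # placeholder segment captures a value
        elif t != p:
            return None           # literal segment must match exactly
    return values

def linear_match(routes, path):
    """Try routes in registration order; return (index, values, attempts)."""
    for attempts, template in enumerate(routes, start=1):
        values = match_template(template, path)
        if values is not None:
            return attempts - 1, values, attempts
    return None, None, len(routes)

# 1000 hypothetical routes: api/resource1/{id} ... api/resource1000/{id}
routes = [f"api/resource{i}/{{id}}" for i in range(1, 1001)]
index, values, attempts = linear_match(routes, "/api/resource1000/5")
print(index, values, attempts)
```

In this sketch a request for the last registered route costs 1000 template comparisons, and a request that matches nothing costs the same.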

But why should I have 1000 routes? What is wrong with /api/controller/id? Couldn't I end up with just a handful of routes? OK, here is the catch: routing was designed for a flat structure (and originally for ASP.NET MVC), while resource organisation in a Web API has to be hierarchical.

Another issue is hypermedia. Have you tried doing clean and robust hypermedia in Web API using the existing routing? Good luck with that one!

Another problem is that in Web API's really elegant HTTP pipeline (which I love), the resource is just a URL until the request hits the controller dispatcher, which then tries to find a matching controller and then an action. As such, code part of the resource (controller, action, parameters) cannot programmatically (using attributes, etc) define a strategy to the delegating handlers since we do not yet know which controller or action will respond.

Resource organisation is the responsibility of the application

A few issues have been raised with CacheCow about cache invalidation of related resources. This is actually one of the fundamental scenarios that I designed CacheCow for. The problem is that resource organisation is the responsibility of the application, so CacheCow cannot make any assumptions about how resources are organised. The solution is for the application to describe its resource organisation and the relationships between resources. This is definitely not easy since, again, the world of routing is a disjoint one.

So here is what we need:

  • A hierarchical routing with a natural setup rather than arbitrary route definition 
  • Resources aware of related resources. As such a resource can define its hypermedia for the most part
  • Ability for resources to define their strategy when it comes to caching, etc
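As a rough illustration of these three points, here is a Python sketch of a resource tree. It is only one possible shape and not the Resourx design; the `Resource` class, the `resolve` function and the `orders`/`items` hierarchy are hypothetical names invented for the example.

```python
class Resource:
    """A node in a hypothetical resource tree."""

    def __init__(self, segment, cache_max_age=None):
        self.segment = segment              # literal ("orders") or placeholder ("{orderId}")
        self.cache_max_age = cache_max_age  # per-resource caching strategy, in seconds
        self.parent = None
        self.children = []

    def add(self, child):
        child.parent = self
        self.children.append(child)
        return child

    def is_placeholder(self):
        return self.segment.startswith("{")

    def url(self, **values):
        """Build this resource's URL by walking up to the root."""
        parts, node = [], self
        while node is not None:
            if node.is_placeholder():
                parts.append(str(values[node.segment[1:-1]]))
            else:
                parts.append(node.segment)
            node = node.parent
        return "/" + "/".join(reversed(parts))

    def related(self):
        """Resources a change here could affect: the parent, then the children."""
        nodes = list(self.children)
        if self.parent is not None:
            nodes.insert(0, self.parent)
        return nodes

def resolve(root, path):
    """Walk the tree one segment at a time; literal children win over placeholders."""
    parts = path.strip("/").split("/")
    if parts[0] != root.segment:
        return None, {}
    node, values = root, {}
    for part in parts[1:]:
        literal = next((c for c in node.children if c.segment == part), None)
        wildcard = next((c for c in node.children if c.is_placeholder()), None)
        node = literal or wildcard
        if node is None:
            return None, {}
        if node.is_placeholder():
            values[node.segment[1:-1]] = part
    return node, values

# The hierarchy itself is the route table.
api = Resource("api")
orders = api.add(Resource("orders", cache_max_age=60))
order = orders.add(Resource("{orderId}", cache_max_age=300))
items = order.add(Resource("items"))

node, values = resolve(api, "/api/orders/7/items")
to_invalidate = [r.url(**values) for r in order.related()]
```

Lookup cost follows the depth of the URL rather than the number of registered routes, each node can build its own links, and the list in `to_invalidate` is the kind of information a cache would need in order to invalidate related resources.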


When Youssef Moussaoui, a developer on the ASP.NET team, once asked me what to improve in ASP.NET Web API, my immediate answer was routing. And it is not just routing, it is resource organisation.

I know that the ASP.NET team will look into this, but I will probably have a stab at it myself. We have set up the Resourx project, which will try to achieve the 3 goals above. Watch this space!

5 comments:

  1. "As such, code part of the resource (controller, action, parameters) cannot programmatically (using attributes, etc) define a strategy to the delegating handlers since we do not yet know which controller or action will respond."

    With a bit of "hacking" you could easily get this done using attributes. We use a very primitive concept: you write an attribute over every service method that describes the route to this resource. A global method parses all controllers for methods with these attributes and creates the appropriate routes (grouping wherever possible to reduce the number), and voilà: we know the exact route to the method right where the method is defined.

  2. Again, the routes will live in attributes. The problem is that the route lives as a string token outside your code. Also, the search space will increase hugely with every added service.

    But it is probably a better solution anyway.

  3. "The problem is the route live as string token outside your code."

    How else could you even imagine it? One possibility would be to have plain RPC-style methods and take the method name as the URI. Another would be to have a strict hierarchical data format and always go by the controller name.

    I think the attribute approach is the best. You can see which URI to use to access the method where it matters: with the method. Also, the search space can be handled by clever grouping and a custom Controller-/ActionSelector. We currently have over 400 methods, and increasing, and routing does not take measurably long.

    The default routing is in my opinion very, very flawed. It's not very flexible, especially since it takes the URI parameters into consideration for routing. I have to use the parameter name "id" when I have named my route parameter "id". I don't have the choice to change the name if I consider another one more understandable, except by adding yet another route that does not accidentally collide with any other.

    I'm always looking for other, perhaps better, ways to handle routing in a WebApi service.

  4. I am thinking of an alternative method. Basically the controller (which is the resource space) defines its hierarchy and routing simply works off that. Watch the space :)

  5. I'm looking forward to an example implementation. ;-)
