
Sunday, 1 July 2012

The place of Extension Methods in Software Design


Introduction

[Level T3] Extension methods - introduced with C# 3.0 in .NET Framework 3.5 - are a useful part of a .NET developer's toolset. Useful as they are, extension methods are not an inherently object-oriented concept, yet we use them more and more in our API designs.

Extension methods were initially used for classes whose source code we did not own. Nowadays we increasingly use them for types whose source we do own.

This post takes an in-depth look at the place of extension methods in API design.

Background

The definition of extension methods according to MSDN is:
Extension methods enable you to "add" methods to existing types without creating a new derived type, recompiling, or otherwise modifying the original type.
So, as we all know, in order to create an extension method we need to:
  1. Create a static non-generic class
  2. Create a static method in it
  3. Make the first parameter the type we are adding the method to, prefixed with the keyword this
For example (and one of my favourites), we can add this extension method to object to replicate T-SQL's IN operator:

public static bool IsIn(this object item, params object[] list)
{
 if (list == null || list.Length == 0)
  return false;
 return list.Any(x => x == item);
}

Now I can use this like an instance method:

string ali = "ali";
var isIn = ali.IsIn("john", "jack", "shabbi", "ali"); // isIn -> true

If I may digress a little bit here, this is not such a great implementation since:

var isInForInt = 1.IsIn(2, 3, 1); // isInForInt -> false!

As you have probably guessed, defining the extension method on object causes the integers to be boxed, and x == item then compares the references of the boxed objects rather than the integer values, so they are never equal. A generic implementation solves the problem:

public static bool IsIn<T>(this T item, params T[] list)
{
 if (list == null || list.Length == 0)
  return false;
 return list.Any(x => EqualityComparer<T>.Default.Equals(x, item));
}

The reality is that an extension method only gives the illusion of the method being on the type; what gets compiled is nothing but a plain old static method call. A look at the generated IL confirms this:

IL_0056:  call       bool ConsoleApplication1.ExtensionMethods::IsIn<int32>(!!0, !!0[])

ExtensionMethods above is the name of the static class I created for this method.

So extension methods are basically the same utility or helper static methods we have always written, only glamorised to look like instance methods. Yet they bring additional benefits:

  1. They lead to much more readable and natural code.
  2. I do not have to know the name of the helper class whose static methods I am using - in fact the behaviour has nothing to do with the static class. That class is not really a class in the true sense, since it has neither state nor behaviour of its own. That is why it has to be declared static: to make its design intention clear.
  3. A fluent API can easily be designed for older types without touching them.
  4. Since it is not really an instance method call, it can be called on null instances. This is a desirable side effect, since we can check for nulls in the extension method and cater for them (none of the "object reference not set to an instance..." nonsense!)
  5. Since it can be called on null instances, the extension method can still determine some type information for the null instance, namely its compile-time type (which may be a base type or an interface). This is not possible from the null reference itself.
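
Points 4 and 5 can be illustrated with a minimal sketch. The class and method names below (NullTolerantExtensions, OrEmpty, StaticTypeName) are hypothetical and made up for this illustration; they are not part of any library.

```csharp
using System;

public static class NullTolerantExtensions
{
    // Callable on a null string: the compiler emits a plain static call,
    // so no instance is dereferenced before we get a chance to check it.
    public static string OrEmpty(this string value)
    {
        return value ?? string.Empty;
    }

    // T is fixed at compile time from the declared type of the variable,
    // so it is known even when the value itself is null.
    public static string StaticTypeName<T>(this T value) where T : class
    {
        return typeof(T).Name;
    }
}
```

With these in place, a null string variable s gives s.OrEmpty().Length == 0, and a null variable declared as IComparable reports "IComparable" from StaticTypeName(). Note that this is the declared type only; the runtime type of a null reference does not exist.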

Extension methods when we do not own the type

This has been the typical scenario. We have all wished that, for example, string had such and such a method, and there used to be no way to achieve this. Now, using extension methods, we can. This scenario also applies to cases where an API has already been released (and you own it) but can no longer be changed. In this case, your API can be enhanced using extension methods.

With such usage there is no decision to be made, since the design has already been done. Extension methods serve merely as a nice utility and syntactic sugar.

One of the most useful use cases I have found is function composition when programming in a functional style in C# (see some examples in my other posts here and here). This is especially valuable since method chaining improves readability. For example:

  usingReflection
   .Repeat(TotalCount)
   .OutputPerformance(stopwatch, performanceOutput)();

In addition to the examples above, let's have a look at a simple example to swallow the exception and optionally log the error (note how the implementation reuses itself to swallow errors that could arise from logging):

public static class WrapSwallowExtension
{
 public static Action<T> WrapSwallow<T>(this Action<T> action, Action<Exception> logger = null)
 {
  return (T t) =>
  {
   try
   {
    action(t);
   }
   catch (Exception e)
   {
    // the logger is wrapped as well, so a failing logger is swallowed too
    if (logger != null)
     logger.WrapSwallow()(e);
   }
  };
 }
}

So I can use:

string myString = null;
Action<string> action = (s) => { s.ToLower(); }; // throws NullReferenceException when s is null
action.WrapSwallow()(myString); // swallowed

Here I created a new action just for the example, but when I am working in a functional scenario, I already have my actions and functions at hand.

Extension methods when we own the type

I have heard some saying "Why would you wanna use an extension method when you own the code? Just add the method to the type."

There are cases where you own the type yet would still use an extension method. Let's have a look at a few such scenarios.

Extension methods for interfaces

This is the most obvious use case. Most of the LINQ library is implemented using extension methods (even though Microsoft owns the types). An interface cannot contain implementation, but you can use extension methods to add implemented behaviour to your interfaces.

Without getting into the debate on whether implementing ForEach against IEnumerable<T> is semantically correct (don't! I am not going there), you might have noticed that the method exists only on List<T>, so you have to call ToList() to use it. Well, this can easily be done for IEnumerable<T> too:

public static class IEnumerableExtensions
{
 public static IEnumerable<T> ForEachOne<T>(this IEnumerable<T> enumerable, Action<T> action)
 {
  foreach (var t in enumerable)
  {
   action(t);
   yield return t;
  }
 }
}

In this particular example I do not own the source for IEnumerable<T>, but even if I did, extension methods would be the only way to associate implementation with the interface.
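
One thing worth keeping in mind with ForEachOne: because it uses yield return, it is lazy, so the action does not run until the result is enumerated. The sketch below repeats the method so that it is self-contained; ForEachOneDemo and CountSideEffects are hypothetical names added only to demonstrate the behaviour.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class IEnumerableExtensions
{
    // Same method as in the post, repeated to keep the sketch self-contained.
    public static IEnumerable<T> ForEachOne<T>(this IEnumerable<T> enumerable, Action<T> action)
    {
        foreach (var t in enumerable)
        {
            action(t);
            yield return t;
        }
    }
}

public static class ForEachOneDemo
{
    // Returns how many times the action ran for a three-element sequence.
    public static int CountSideEffects(bool enumerate)
    {
        var calls = 0;
        var query = new[] { 1, 2, 3 }.ForEachOne(x => calls++);
        if (enumerate)
            query.ToList(); // forces the iterator to run
        return calls;
    }
}
```

CountSideEffects(false) returns 0 and CountSideEffects(true) returns 3. If you want the List<T>.ForEach behaviour of running immediately, the method would need to return void and drop the yield.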

Overloading

This is the next most common case. If you are familiar with ASP.NET MVC, you have probably noticed that most of the functionality of the HtmlHelper class is implemented using extension methods.

Html.TextBoxFor(x => x.Name)

In fact all the different overloads of HtmlHelper for TextBox, RadioButton, CheckBox, TextArea, etc. are implemented using extension methods. The HtmlHelper class itself implements a core set of functionality which is called by these extension methods.

Now let's look at this fictional interface:

public interface IDependencyResolver
{
   object Resolve(Type t);
   T Resolve<T>();
}

The interface has two methods for resolving a type: one takes the type as a generic parameter, the other as a Type instance. Whoever implements this will most likely implement the non-generic method and then make the generic method call the non-generic one. With extension methods, the generic overload can be moved out of the interface and written once:

public interface IDependencyResolver
{
 object Resolve(Type t);
}

public static class IDependencyResolverExtension
{
 public static T Resolve<T>(this IDependencyResolver resolver)
 {
  return (T) resolver.Resolve(typeof (T));
 }
}


This will help to:
  • Trim down the interface and make it terser so it can express its design intentions more clearly
  • Save all implementers of the interface from having to repeat the same bit of code
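
To see the saving for implementers, here is a minimal sketch. The interface and extension are the ones above; DictionaryResolver and its Register method are hypothetical, made up only to show that an implementer writes nothing but the non-generic method.

```csharp
using System;
using System.Collections.Generic;

public interface IDependencyResolver
{
    object Resolve(Type t);
}

public static class IDependencyResolverExtension
{
    public static T Resolve<T>(this IDependencyResolver resolver)
    {
        return (T) resolver.Resolve(typeof(T));
    }
}

// Hypothetical implementer: only the non-generic method is written here.
public class DictionaryResolver : IDependencyResolver
{
    private readonly Dictionary<Type, Func<object>> _factories =
        new Dictionary<Type, Func<object>>();

    public void Register<T>(Func<T> factory) where T : class
    {
        _factories[typeof(T)] = () => factory();
    }

    public object Resolve(Type t)
    {
        // Throws KeyNotFoundException for unregistered types; a real
        // container would have a more helpful failure mode.
        return _factories[t]();
    }
}
```

After registering a factory with resolver.Register<IComparer<string>>(() => StringComparer.Ordinal), the call resolver.Resolve<IComparer<string>>() compiles against the extension method and returns the typed instance.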
When I look at the interface IQueryProvider, I wonder if it was designed before extension methods were available:

public interface IQueryProvider
{
    IQueryable<TElement> CreateQuery<TElement>(Expression expression);
    IQueryable CreateQuery(Expression expression);
    TResult Execute<TResult>(Expression expression);
    object Execute(Expression expression);
}

So the 4 methods could have been reduced to 2. Considering that LINQ and extension methods both arrived in the same release (C# 3.0 with .NET Framework 3.5), my suspicion seems very likely!

Dependency layering

Another case where you might decide to use an extension method rather than exposing a method directly on the type is when the type's sole dependency on another type is confined to a single method. This is very common where the dependency points to a layer above the dependent type, while naturally it should be the other way around.

For example, let's look at this case:

// THIS WILL NOT WORK!

// sitting at entity layer
public class Foo
{
 // ...

 public Bar ToBar()
 {
  // ...
 }
}

// sitting at business layer
public class Bar
{
 // ...  
}

Now in this example, I have placed the two classes in different logical layers to better illustrate the case, but they do not have to be: this is all about managing dependencies, whether in the same layer or across layers. We have baked a dependency on Bar into Foo just for the sake of ToBar(). The solution is to make ToBar() an extension method.

So we can write (and completely decouple the two classes):

public class Foo
{
 
}

public class Bar
{

}

public static class FooExtensions
{
 public static Bar ToBar(this Foo foo)
 {
  return new Bar();
 }
}


Providing implementation for enumerations

This is one that many of us have probably done. Enumerations - unfortunately - cannot contain implementation, so extension methods are a good place to put implementation code for enums. This usually has to do with conversion, parsing and formatting.
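
As a minimal sketch of the formatting case, assuming a made-up OrderStatus enum (the enum, its values and the display strings are all hypothetical):

```csharp
using System;

public enum OrderStatus
{
    Pending,
    Shipped,
    Delivered
}

public static class OrderStatusExtensions
{
    // Formatting logic that the enum itself cannot host.
    public static string ToDisplayText(this OrderStatus status)
    {
        switch (status)
        {
            case OrderStatus.Pending: return "Awaiting dispatch";
            case OrderStatus.Shipped: return "On its way";
            case OrderStatus.Delivered: return "Delivered";
            default: throw new ArgumentOutOfRangeException("status");
        }
    }

    // A small piece of behaviour attached to the enum.
    public static bool IsFinal(this OrderStatus status)
    {
        return status == OrderStatus.Delivered;
    }
}
```

The calling code then reads as if the enum had members of its own: OrderStatus.Shipped.ToDisplayText() or status.IsFinal().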

Delay decision on API signatures

With regard to an API, anything that goes into the public interface of a type is difficult to change. As such, attempting to provide all possible overloads and use cases on the public interface up front is likely to fail.

Delaying such decisions by providing base functionality on the type and then adding more and more extension methods with each release is a useful process. The ASP.NET team has used this technique for ASP.NET MVC and, more recently, ASP.NET Web API.

Drawbacks

Extension methods are static methods. As such they cannot be mocked using standard mocking frameworks. For this reason an extension method should not have any dependency other than the ones passed to it.

Let's look at this case:

public class Foo
{
 public string FileName { get; set; }

 public void Save()
 {
  // ...
 }
}

public static class FooExtensions
{
 public static void SafeSave(this Foo foo)
 {
  var directoryName = Path.GetDirectoryName(foo.FileName);
  if (!Directory.Exists(directoryName))
   Directory.CreateDirectory(directoryName);
  foo.Save();
 }
}

In this case, unit testing any class that uses SafeSave becomes a nightmare, because the method touches the real file system. What we need here is an IFileSystem interface that is passed to the extension method, abstracting it from the real file system.
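
A minimal sketch of that idea follows. The shape of IFileSystem (two members), the Saved flag on Foo and the FakeFileSystem test double are all assumptions made for illustration; a real abstraction would likely need more members.

```csharp
using System.Collections.Generic;
using System.IO;

// Hypothetical abstraction over the parts of the file system SafeSave needs.
public interface IFileSystem
{
    bool DirectoryExists(string path);
    void CreateDirectory(string path);
}

public class Foo
{
    public string FileName { get; set; }
    public bool Saved { get; private set; }

    public void Save()
    {
        Saved = true; // stand-in for the real save logic
    }
}

public static class FooExtensions
{
    // The dependency is now passed in, so tests can supply a fake.
    public static void SafeSave(this Foo foo, IFileSystem fileSystem)
    {
        var directoryName = Path.GetDirectoryName(foo.FileName);
        if (!fileSystem.DirectoryExists(directoryName))
            fileSystem.CreateDirectory(directoryName);
        foo.Save();
    }
}

// In-memory fake for unit tests; it never touches the disk.
public class FakeFileSystem : IFileSystem
{
    public readonly HashSet<string> Directories = new HashSet<string>();

    public bool DirectoryExists(string path) { return Directories.Contains(path); }
    public void CreateDirectory(string path) { Directories.Add(path); }
}
```

A test can now call foo.SafeSave(new FakeFileSystem()) and inspect Directories afterwards. Path.GetDirectoryName only manipulates the string, so it is safe to leave as a direct call.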

Conclusion

The availability of extension methods has changed the way we design software APIs in the .NET world. We have started to build the basic functionality into the actual types and use extension methods to provide overloads.

There are 5 reasons to use extension methods when you own the type:

  • To associate implementation with interfaces
  • Overloading of an API
  • Removing dependency especially in logical layering
  • Providing implementation for enumerations
  • Delaying decision on API signatures
An extension method should not have any dependencies other than the ones passed to it.

Friday, 8 June 2012

What I think coupling is ...

Introduction

[Level C3] This post is a follow-up to the question/discussion point Darrel Miller has started here. The question is: in the REST world, what is coupling and how can we achieve decoupling? Although this post can be read independently, it is best to start by reading his post first. [This discussion carries on in the next related post]

Motivation

REST has a strong focus on decoupling client and server. With REST awareness and adoption increasing, the challenge of defining new best practices has become more apparent. This is particularly important in light of how much can nowadays be achieved in the browser - previously a limited client. The rise of Single Page Applications (SPAs) is a testament to the popularity of building a rich client in what was previously called a thin client. The diversity of available clients and their capabilities has - in a way - forced us towards REST and decoupling of client and server.

But recently, some have felt that too much power and control has shifted towards the client and that the server has been reduced to a mere data provider. One such group is ROCA, which defines a set of best practices that are in clear opposition to SPA paradigms. I actually meant to write a post on ROCA (which I hopefully will soon) but I suppose this post can serve as a primer, as the issue in question is relevant.

Background

Darrel defines coupling as "a measure of how changes in one thing might cause changes in the other". I would very much go with this definition but I would like to expand upon it.

Coupling is a software design (logical) and architecture (physical) anti-pattern. Loose coupling, on the other hand, is a virtue that allows different compartments (intentionally avoiding the bloated words module and component) of the system to change independently.

In the world of code (logical), we use the Single Responsibility principle (S from SOLID) to decouple pieces of code: a class needs to have a single reason to change. Arguably, the rest of the SOLID principles deal with various degrees of decoupling. For example, D (Dependency Inversion) is about not baking in the dependency.

On the other hand, in SOA (physical) we loosely couple the services. Services can depend on other services, but they should be able to maintain a reasonable level of functionality if other services go down. The key task in building a successful SOA is defining service boundaries. Another important concept is achieving cohesion: keeping all related components in the same service and resisting the urge to break a service into two compartments when the boundary between them is weak.

Defining boundary and achieving cohesion

Coupling, in other words, is baking knowledge of something into a compartment when it is the concern of another compartment.

When I buy stuff from Amazon and it is delivered when I am not at home, they leave a card with which I can claim my package at the depot. The card has just an Id in the form of a barcode. It does not have the row and shelf number where my package is kept at the depot. If it had, it might have been easier for the clerk to use the numbers on the card to fetch the package. But that would bake the knowledge of the location into something that does not need to know it, and what if they had to change the location of the package? That is why the clerk zaps the barcode and the location is shown in the system. They can happily change the location as long as the Id does not change.

On the other hand, the depot does not need to know what is in the package - it is not its concern. If it did, it might have been helpful in rare scenarios but that is not worth considering.

So the key to defining boundaries and achieving cohesion is to understand whose concern an abstraction is. We will present 3 models here: server-concern, client-concern and mixed-concern. In brief, it depends.

Server-Concern

In this case, only the server needs to know about an abstraction. The problem happens when the concept oozes out to the client and the client gets to know server implementation details (exposing the server's bowels).

[Figure: Server-Concern]

Example

A typical example is getting the list of most recent contacts:
GET /api/contacts?lastUsedMoreThan=2012-05-06&count=20
GET /api/contacts/mostRecent 
In the first case, the client gets to know that the server keeps a lastUsed value for contacts. This prevents the server from optimising the algorithm, for example by also taking into account the number of times we have used the contact. In the second case, however, the server hides its implementation, so the change can be made without breaking the client.

Client-Concern

In this scenario, the abstraction is purely a client concept. The problem happens when the server starts to make decisions for the client.

[Figure: Client-Concern]

Example

A typical example is pagination:
GET /api/contacts/page/11
GET /api/contacts?skip=200&count=20
The number of pages available and the number of records per page are a client concern. This kind of detail will differ between an iPhone, a desktop and a tablet (and which tablet). In the first case, the server decides on the number of records per page (and even treats the page as a resource), while in the second case knowledge of pagination is confined to the client, since it is the client's concern.
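
As a small illustration of keeping pagination on the client, here is a sketch of a helper that turns the client's own page number and page size into the skip/count query. ContactPaging and BuildUrl are hypothetical names; only the URL shape comes from the example above.

```csharp
using System;

public static class ContactPaging
{
    // Translates a client-side page number (1-based) and page size into
    // the skip/count query from the second URL above.
    public static string BuildUrl(int pageNumber, int pageSize)
    {
        if (pageNumber < 1) throw new ArgumentOutOfRangeException("pageNumber");
        if (pageSize < 1) throw new ArgumentOutOfRangeException("pageSize");

        var skip = (pageNumber - 1) * pageSize;
        return string.Format("/api/contacts?skip={0}&count={1}", skip, pageSize);
    }
}
```

With 20 records per page, page 11 becomes /api/contacts?skip=200&count=20. A phone that shows 10 records per page can use the same server API without the server knowing or caring.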

Mixed-Concern

In this scenario, both server and client need to work in accord for the feature to work. The problem happens when the server or the client assumes that the other definitely implements the abstraction.

[Figure: Mixed-Concern]

Example

A typical example is HTTP caching. For HTTP caching to work, client and server need to work in tandem. The server needs to return, with each resource, a Cache-Control header and an ETag or a Last-Modified header, and the client needs to use these values in its future conditional requests with If-None-Match or If-Modified-Since.

[Figure: Works - no assumption about the implementation on the other side helps both sides to carry on working without the feature]

However, if the server does not provide caching, or the client does not use and respect the caching parameters from the server, the system does not break - although it can result in an inferior or suboptimal experience.
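
On the client side this could look roughly like the sketch below, which uses HttpRequestMessage from System.Net.Http. ConditionalRequests and Build are hypothetical names, and the caller is assumed to have stored the ETag and Last-Modified values from an earlier response, if the server sent any.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;

public static class ConditionalRequests
{
    // Builds a GET that is conditional only when validators are known.
    // With no validators it is a plain GET, so nothing breaks when the
    // server does not take part in caching.
    public static HttpRequestMessage Build(string url, string etag, DateTimeOffset? lastModified)
    {
        var request = new HttpRequestMessage(HttpMethod.Get, url);

        // The ETag must be in its quoted form, e.g. "\"abc123\"".
        if (etag != null)
            request.Headers.IfNoneMatch.Add(new EntityTagHeaderValue(etag));

        if (lastModified.HasValue)
            request.Headers.IfModifiedSince = lastModified;

        return request;
    }
}
```

The point of the design is that neither header is required: the client adds what it has, and a server that ignores both simply returns the full resource.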

Diversity and compromise

While we have the 3 models above, there are cases where the same feature can be implemented differently.

Let's take an example from engineering. Where do you put the AC/DC power transformer? My PC has the transformer in its power supply (item 6), which is the equivalent of Client-Concern. My laptop uses a power adaptor and takes no mains supply, which is the equivalent of Server-Concern. On the other hand, my electric toothbrush has the transformer split between the charger and the toothbrush itself, so it works by magnetic induction (analogous to mixed-concern). This is clearly a compromise, but the thinking behind it is to make the toothbrush waterproof and safe.

Conclusion

Darrel's definition is excellent, but we have expanded upon it since change (maintenance) is only one facet of the concept. The others are concern (requirement) and knowledge of that concern in a compartment (implementation).

We discussed three models: server-concern, client-concern and mixed-concern. Each of these is a valid pattern, but each comes with its own anti-pattern to be aware of. So in short: it depends.