This has been a revelation: what I am about to reveal here deeply surprised me, and it might surprise you too. This post is mainly about consuming RESTful APIs using HttpClient when the payload is JSON.
UPDATE: I got in touch with the ASP.NET team and they confirmed this as a performance bug. It has now been fixed, but the fix is not yet available.
As you probably know, performance and benchmarking are very close to my heart, and I have recently been focusing on benchmarking a few APIs at work. One of my observations was that Web APIs and web sites, which have historically been IO-bound, now show signs of CPU strain and have become CPU-bound.
When you think logically about it, there is no magic here: by using async/await, you end up putting your CPU to some use, unlike the old times when threads were blocked waiting for the IO to return and the CPU would be twiddling its thumbs. However, I found the CPU overhead of the operations excessive, so I set out to benchmark a few different scenarios.
Test Setup
Two APIs were created, where one was using the other. These two APIs were part of the same cloud service, which was deployed to two separate Medium (A2) web roles. I used two deployments of the same code, one dependent upon version 4.0.30506.0 of the Web API and the other on the latest version, which was 5.2.2. The difference between the two versions of the Web API is the topic of another post, but the differences were not huge, although the newer version showed improved performance.
The API being called returns a customer with its orders. Every customer has between 1 and 3 orders and each order between 1 and 3 items. In the long run, this randomisation evens out. Each document returned is between 1 and 2 KB. So for every request, the outer (more superficial) API makes one call to the deeper API to get the customer and then one separate call for each of that customer's orders. Then it combines the results and sends back the response. Both APIs are deployed in the same Azure Data Centre.
You can find the whole code at GitHub. The code takes 4 different approaches as below:
    public class CustomerController : ApiController
    {
        // Scenario 1: WebClient, synchronous
        public FullCustomer GetSync(int id)
        {
            var webClient = new WebClient();
            var customerString = webClient.DownloadString(BuildUrl(id));
            var customer = JsonConvert.DeserializeObject<Customer>(customerString);
            var fullCustomer = new FullCustomer(customer);
            var orders = new List<Order>();
            foreach (var orderId in customer.OrderIds)
            {
                var orderString = webClient.DownloadString(BuildUrl(id, orderId));
                var order = JsonConvert.DeserializeObject<Order>(orderString);
                orders.Add(order);
            }
            fullCustomer.Orders = orders;
            return fullCustomer;
        }

        // Scenario 2: WebClient, asynchronous
        public async Task<FullCustomer> GetASync(int id)
        {
            var webClient = new WebClient();
            var customerString = await webClient.DownloadStringTaskAsync(BuildUrl(id));
            var customer = JsonConvert.DeserializeObject<Customer>(customerString);
            var fullCustomer = new FullCustomer(customer);
            var orders = new List<Order>();
            foreach (var orderId in customer.OrderIds)
            {
                var orderString = await webClient.DownloadStringTaskAsync(BuildUrl(id, orderId));
                var order = JsonConvert.DeserializeObject<Order>(orderString);
                orders.Add(order);
            }
            fullCustomer.Orders = orders;
            return fullCustomer;
        }

        // Scenario 3: HttpClient, asynchronous, ReadAsAsync<T> on HttpContent
        public async Task<FullCustomer> GetASyncWebApi(int id)
        {
            var httpClient = new HttpClient();
            httpClient.DefaultRequestHeaders.Add("Accept", "application/json");
            var responseMessage = await httpClient.GetAsync(BuildUrl(id));
            var customer = await responseMessage.Content.ReadAsAsync<Customer>();
            var fullCustomer = new FullCustomer(customer);
            var orders = new List<Order>();
            foreach (var orderId in customer.OrderIds)
            {
                responseMessage = await httpClient.GetAsync(BuildUrl(id, orderId));
                var order = await responseMessage.Content.ReadAsAsync<Order>();
                orders.Add(order);
            }
            fullCustomer.Orders = orders;
            return fullCustomer;
        }

        // Scenario 4: HttpClient, asynchronous, read as string then JsonConvert
        public async Task<FullCustomer> GetASyncWebApiString(int id)
        {
            var httpClient = new HttpClient();
            httpClient.DefaultRequestHeaders.Add("Accept", "application/json");
            var responseMessage = await httpClient.GetAsync(BuildUrl(id));
            var customerString = await responseMessage.Content.ReadAsStringAsync();
            var customer = JsonConvert.DeserializeObject<Customer>(customerString);
            var fullCustomer = new FullCustomer(customer);
            var orders = new List<Order>();
            foreach (var orderId in customer.OrderIds)
            {
                responseMessage = await httpClient.GetAsync(BuildUrl(id, orderId));
                var orderString = await responseMessage.Content.ReadAsStringAsync();
                var order = JsonConvert.DeserializeObject<Order>(orderString);
                orders.Add(order);
            }
            fullCustomer.Orders = orders;
            return fullCustomer;
        }

        // Builds the URL of the deeper API for a customer or one of its orders
        private string BuildUrl(int customerId, int? orderId = null)
        {
            string baseUrl = string.Format("http://{0}:8080/api/customer/{1}",
                Request.RequestUri.Host, customerId);
            return orderId.HasValue
                ? string.Format("{0}/order/{1}", baseUrl, orderId.Value)
                : baseUrl;
        }
    }

So as you can see, we use 4 different methods:
1) Using WebClient in the sync fashion
2) Using WebClient in the async fashion
3) Using HttpClient in the async fashion with ReadAsAsync on HttpContent
4) Using HttpClient in the async fashion with reading content as string and then using JsonConvert to deserialise
I used SuperBenchmarker to invoke the main API, which gathers the data from the other API. I ran the tool from another machine within the same Azure Data Centre (not one hosting either API) to make the tests more realistic yet eliminate network idiosyncrasies.
I used 5000 requests with a concurrency of 10, although I tried other numbers as well, which did not make any material difference to the results.
Results
Here is the result for scenario 1 (sync using WebClient):
    TPS: 394 (requests/second)
    Max: 199ms
    Min: 8ms
    Avg: 25ms

    50% below 24ms
    60% below 25ms
    70% below 27ms
    80% below 28ms
    90% below 30ms
    95% below 32ms
    98% below 36ms
    99% below 55ms
    99.9% below 185ms
The result for scenario 2 (async using WebClient) usually shows better throughput but higher CPU usage:

    TPS: 485 (requests/second)
    Max: 291ms
    Min: 5ms
    Avg: 20ms

    50% below 19ms
    60% below 21ms
    70% below 23ms
    80% below 25ms
    90% below 27ms
    95% below 29ms
    98% below 32ms
    99% below 36ms
    99.9% below 284ms

The CPU difference is not huge and can be explained by the increased throughput:
[Figure: CPU usage during scenarios 1 and 2]
Here is the result for scenario 3 (async using HttpClient with ReadAsAsync):

    TPS: 41 (requests/second)
    Max: 12656ms
    Min: 26ms
    Avg: 240ms

    50% below 170ms
    60% below 178ms
    70% below 187ms
    80% below 205ms
    90% below 256ms
    95% below 296ms
    98% below 370ms
    99% below 3181ms
    99.9% below 12573ms
Yeah, shocking. The diagram below compares CPU usage between scenario 1 and 3:
[Figure: CPU usage in scenario 1 (arrow) and scenario 3 (box)]
Scenario 4 (async using HttpClient, reading the content as a string) is definitely better than scenario 3 and is not too far from scenarios 1 and 2:
    TPS: 230 (requests/second)
    Max: 7068ms
    Min: 7ms
    Avg: 43ms

    50% below 20ms
    60% below 22ms
    70% below 24ms
    80% below 26ms
    90% below 29ms
    95% below 34ms
    98% below 110ms
    99% below 144ms
    99.9% below 7036ms
The CPU usage is around 80%, which is definitely worse than in scenarios 1 and 2 (and requires further analysis).
Analysis
Where is the problem? It appears that JSON deserialization is not efficient when reading from a stream. It is possible that deserializing from a stream has to optimise for memory efficiency rather than CPU efficiency, since it is clearly much faster when the whole string is passed in.
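To make the two paths concrete, here is a minimal sketch of deserializing the same JSON from a string and from a stream with Json.NET. The class and method names (`JsonPaths`, `FromString`, `FromStream`) are hypothetical, and treating the stream path as representative of what ReadAsAsync does internally is an assumption, not something verified here.

```csharp
using System.IO;
using System.Text;
using Newtonsoft.Json;

// Hypothetical helper class, for illustration only.
public static class JsonPaths
{
    // String path: the whole payload is already in memory.
    public static T FromString<T>(string json)
    {
        return JsonConvert.DeserializeObject<T>(json);
    }

    // Stream path: tokens are read incrementally through a JsonTextReader.
    // Note that disposing the readers also disposes the underlying stream.
    public static T FromStream<T>(Stream stream)
    {
        using (var streamReader = new StreamReader(stream, Encoding.UTF8))
        using (var jsonReader = new JsonTextReader(streamReader))
        {
            return new JsonSerializer().Deserialize<T>(jsonReader);
        }
    }
}
```

Both methods should produce the same object graph; they differ only in how the input is fed to the deserializer.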
Profiling proves that the problem is indeed JSON Deserialization:
[Figure: Profiling scenario 3 shows that most of the CPU time is spent in JSON deserialisation]
To prove that, we do not have to invoke an API at all. The whole operation can be done inside a console application, so I used the same code that was generating customers and orders. Here I am comparing ReadAsAsync against ReadAsStringAsync followed by JsonConvert.DeserializeObject:
    private static void Main(string[] args)
    {
        const int TotalRun = 10*1000;

        var customerController = new CustomerController();
        var orderController = new OrderController();
        var customer = customerController.Get(1);

        var orders = new List<Order>();
        foreach (var orderId in customer.OrderIds)
        {
            orders.Add(orderController.Get(1, orderId));
        }

        var fullCustomer = new FullCustomer(customer)
        {
            Orders = orders
        };

        var s = JsonConvert.SerializeObject(fullCustomer);
        var bytes = Encoding.UTF8.GetBytes(s);
        var stream = new MemoryStream(bytes);
        var content = new StreamContent(stream);
        content.Headers.ContentType = new MediaTypeHeaderValue("application/json");

        var stopwatch = Stopwatch.StartNew();
        for (int i = 1; i < TotalRun+1; i++)
        {
            var a = content.ReadAsAsync<FullCustomer>().Result;
            if (i % 100 == 0)
                Console.Write("\r" + i);
        }
        Console.WriteLine();
        Console.WriteLine(stopwatch.Elapsed);

        stopwatch.Restart();
        for (int i = 1; i < TotalRun+1; i++)
        {
            var sa = content.ReadAsStringAsync().Result;
            var a = JsonConvert.DeserializeObject<FullCustomer>(sa);
            if (i % 100 == 0)
                Console.Write("\r" + i);
        }
        Console.WriteLine();
        Console.WriteLine(stopwatch.Elapsed);

        Console.Read();
    }
As expected, the result shows a dramatic difference, a factor of roughly 120:
    10000
    00:00:06.2345493
    10000
    00:00:00.0509763
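For reference, the factor follows directly from the two elapsed times, and dividing each by the 10,000 iterations gives a rough per-call cost (figures rounded):

```latex
\frac{6.2345493\ \text{s}}{0.0509763\ \text{s}} \approx 122,
\qquad
\frac{6.2345493\ \text{s}}{10\,000} \approx 0.62\ \text{ms per call},
\qquad
\frac{0.0509763\ \text{s}}{10\,000} \approx 5.1\ \mu\text{s per call}
```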
So this result basically confirms what we have seen. I will get in touch with James Newton-King and try to shed more light on the subject.
In the meantime, if you know your content is not going to be huge and will always be JSON, you might as well forget about content negotiation, read it as a string and then use JsonConvert to deserialize.
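One way to package this workaround is a small extension method on HttpContent. This is only a sketch: `HttpContentJsonExtensions` and `ReadAsJsonStringAsync` are hypothetical names, not part of Web API or Json.NET, and the method assumes the body is always JSON and small enough to buffer in memory.

```csharp
using System.Net.Http;
using System.Threading.Tasks;
using Newtonsoft.Json;

// Hypothetical helper, not part of Web API or Json.NET.
public static class HttpContentJsonExtensions
{
    // Buffers the whole body as a string, then deserialises it with JsonConvert.
    // Skips content negotiation entirely, so it only suits JSON payloads.
    public static async Task<T> ReadAsJsonStringAsync<T>(this HttpContent content)
    {
        var json = await content.ReadAsStringAsync().ConfigureAwait(false);
        return JsonConvert.DeserializeObject<T>(json);
    }
}
```

In scenario 3 the call `responseMessage.Content.ReadAsAsync<Customer>()` could then be swapped for `responseMessage.Content.ReadAsJsonStringAsync<Customer>()`, which is equivalent to what scenario 4 does by hand.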
Conclusion
HttpContent.ReadAsAsync on JSON payloads is really slow: around 120x slower than reading the content as a string and using JsonConvert. I guess it might have to do with the memory efficiency of reading from streams (keeping the memory footprint to a minimum), but that is a guess, and I have been in touch with James Newton-King (creator of Json.NET) to get to the bottom of it.