Category Archives: Uncategorized

Linq performance improvements in .NET Core

By now, you’re probably aware that Microsoft released an open-source and cross-platform version of the .NET platform: .NET Core. This means you can now build and run .NET apps on Linux or macOS. This is pretty cool in itself, but it doesn’t end there: .NET Core also brings a lot of improvements to the Base Class Library.

For instance, Linq has been made faster in .NET Core. I made a little benchmark to compare the performance of some common Linq methods, and the results are quite impressive:


The full code for the benchmark can be found here. As with all microbenchmarks, it has to be taken with a grain of salt, but it gives an idea of the improvements.

Some lines in this table are quite surprising. How can Select run 5000 times almost instantly? First, we have to keep in mind that most Linq operators are lazy: they don’t actually do anything until you enumerate the result, so doing something like array.Select(i => i * i) executes in constant time (it just returns a lazy sequence, without consuming the items in array). This is why I included a call to Count() in my benchmark, to make sure the result is enumerated.

Despite this, it runs 5000 times in 413µs… This is possible due to an optimization in the .NET Core implementation of Select and Count. A useful property of Select is that it produces a sequence with the same number of items as the source sequence. In .NET Core, Select takes advantage of this. If the source is an ICollection<T> or an array, it returns a custom enumerable object that keeps track of the number of items. Count can then just retrieve this value and return it, which produces a result in constant time. The full .NET Framework implementation, on the other hand, naively enumerates the sequence produced by Select, which takes much longer.

It’s interesting to note that in this situation, .NET Core will not execute the projection specified in Select, so it’s a breaking change compared to the desktop framework for code that was relying on side effects of this projection. This has been identified as an issue which has already been fixed on the master branch, so the next release of .NET Core will execute the projection on each item.

OrderBy followed by Count() also runs almost instantly… did Microsoft invent a O(1) sorting algorithm? Unfortunately, no… The explanation is the same as for Select: since OrderBy preserves the item count, the information is recorded so that it can be used by Count, and there is no need to actually sort the input sequence.

OK, so these cases were pretty obvious improvements (which will be rolled back anyway, as mentioned above). What about the SelectAndToArray case? In this test, I call ToArray() on the result of Select, to make sure that the projection is actually performed on each item of the source sequence: no cheating this time. Still, the .NET Core version is 68% faster than the full .NET Framework version. The reason has to do with allocations: since the .NET Core implementation knows how many items are in the result of Select, it can directly allocate an array of the correct size. In the .NET Framework, this information is not available, so it starts with a small array, copies items into it until it’s full, then allocates a larger array, copies the previous array into it, copies the next items from the sequence until the array is full, and so on. This causes a lot of allocations and copies, hence the degraded performance. A few years ago, I suggested an optimized version of ToList and ToArray, where you had to specify the size. The .NET Core implementation basically does the same thing, except that you don’t have to pass the size manually, since it’s passed along the Linq method chain.

Where and WhereAndToArray are both about 8% faster on .NET Core 1.1. Looking at the code (.NET 4.6.2, .NET Core), I can’t see any obvious difference that could explain the better performance, so I suspect it’s mostly due to improvements in the runtime. In this case, ToArray doesn’t know the length of the input sequence, since there is no way to predict how many items Where will yield, so it can’t use the same optimization as with Select and has to build the array the slow way.

We already discussed OrderBy + Count(), which wasn’t a fair comparison since the .NET Core implementation didn’t actually sort the sequence. The OrderByAndToArray case is more interesting, because the sort can’t be skipped. And in this case, the .NET Core implementation is slightly slower than the .NET 4.6.2 one. I’m not sure why this is; again, the implementation is very similar, although there has been a bit of refactoring in .NET Core.

So, on the whole, Linq seems generally faster in .NET Core than in .NET 4.6.2, which is very good news. Of course, I only benchmarked a limited numbers of scenarios, but it shows the .NET Core team is working hard to optimize everything they can.

Easy text parsing in C# with Sprache

A few days ago, I discovered a little gem: Sprache. The name means "language" in German. It’s a very elegant and easy to use library to create text parsers, using parser combinators, which are a very common technique in functional programming. The theorical concept may seem a bit scary, but as you’ll see in a minute, Sprache makes it very simple.

Text parsing

Parsing text is a common task, but it can be tedious and error-prone. There are plenty of ways to do it:

  • manual parsing based on Split, IndexOf, Substring etc.
  • regular expressions
  • hand-built parser that scans the string for tokens
  • full blown parser generated with ANTLR or a similar tool
  • and probably many others…

None of these options is very appealing. For simple cases, splitting the string or using a regex can be enough, but it doesn’t scale to more complex grammars. Building a real parser by hand for non-trivial grammars is, well, non-trivial. ANTLR requires Java, a bit of knowledge, and it relies on code generation, which complicates the build process.

Fortunately, Sprache offers a very nice alternative. It provides many predefined parsers and combinators that you can use to define a grammar. Let’s walk through an example: parsing the challenge in the WWW-Authenticate header of an HTTP response (I recently had to write a parser by hand for this recently, and I wish I had known Sprache then).

The grammar

The WWW-Authenticate header is sent by an HTTP server as part of a 401 (Unauthorized) response to indicate how you should authenticate:

# Basic challenge
WWW-Authenticate: Basic realm="FooCorp"

# OAuth 2.0 challenge after sending an expired token
WWW-Authenticate: Bearer realm="FooCorp", error=invalid_token, error_description="The access token has expired"

What we want to parse is the "challenge", i.e. the value of the header. So, we have an authentication scheme (Basic, Bearer), followed by one or more parameters (name-value pairs). This looks simple enough, we could probably just split by ',' then by '=' to get the values… but the double quotes complicate things, since quoted strings could contain the ',' or '=' characters. Also, the double quotes are optional if the parameter value is a single token, so we can’t rely on the fact they will (or won’t) be there. If we want to parse this reliably, we’re going to have to look at the specs.

The WWW-Authenticate header is described in detail in RFC-2617. The grammar looks like this, in what the RFC calls "augmented Backus-Naur Form" (see RFC 2616 §2.1):

# from RFC-2617 (HTTP Basic and Digest authentication)

challenge      = auth-scheme 1*SP 1#auth-param
auth-scheme    = token
auth-param     = token "=" ( token | quoted-string )

# from RFC2616 (HTTP/1.1)

token          = 1*<any CHAR except CTLs or separators>
separators     = "(" | ")" | "<" | ">" | "@"
               | "," | ";" | ":" | "\" | <">
               | "/" | "[" | "]" | "?" | "="
               | "{" | "}" | SP | HT
quoted-string  = ( <"> *(qdtext | quoted-pair ) <"> )
qdtext         = <any TEXT except <">>
quoted-pair    = "\" CHAR

So, we have a few grammar rules, let’s see how we can encode them in C# code with Sprache, and use them to parse a challenge.

Parsing tokens

Let’s start with the most simple parts of the grammar: tokens. A token is declared as one or more of any characters that are not control chars or separators.

We’ll define our rules in a Grammar class. Let’s start by defining some character classes:

static class Grammar
{
    private static readonly Parser<char> SeparatorChar =
        Parse.Chars("()<>@,;:\\\"/[]?={} \t");

    private static readonly Parser<char> ControlChar =
        Parse.Char(Char.IsControl, "Control character");

}
  • Each rule is declared as a Parser<T>; since these rules match single characters, they are of type Parser<char>.
  • The Parse class from Sprache exposes parser primitives and combinators.
  • Parse.Chars matches any character from the specified string, we use it to specify the list of separator characters.
  • The overload of Parse.Char that we use here takes a predicate that will be called to check if the character matches, and a description of the character class. Here we just use System.Char.IsControl as the predicate to match control characters.

Now, let’s define a TokenChar character class to match characters that can be part of a token. As per the RFC, this can be any character not in the previous classes:

    private static readonly Parser<char> TokenChar =
        Parse.AnyChar
            .Except(SeparatorChar)
            .Except(ControlChar);
  • Parse.AnyChar, as the name implies, matches any character.
  • Except specifies exceptions to the rule.

Finally, a token is a sequence of one or more of these characters:

    private static readonly Parser<string> Token =
        TokenChar.AtLeastOnce().Text();
  • A token is a string, so the rule for a token is of type Parser<string>.
  • AtLeastOnce() means one or more repetitions, and since TokenChar is a Parser<char>, it returns a Parser<IEnumerable<char>>.
  • Text() combines the sequence of characters into a string, returning a Parser<string>.

We’re now able to parse a token. But it’s just a small step, and we still have a lot to do…

Parsing quoted strings

The grammar defines a quoted string as a sequence of:

  • an opening double quote
  • any number of either
    • a "qdtext", which is any character except a double quote
    • a "quoted pair", which is any character preceded by a backslash (this is used to escape double quotes inside a string)
  • a closing double quote

Let’s write the rules for "qdtext" and "quoted pair":

    private static readonly Parser<char> DoubleQuote = Parse.Char('"');
    private static readonly Parser<char> Backslash = Parse.Char('\\');

    private static readonly Parser<char> QdText =
        Parse.AnyChar.Except(DoubleQuote);

    private static readonly Parser<char> QuotedPair =
        from _ in Backslash
        from c in Parse.AnyChar
        select c;

The QdText rule doesn’t require much explanation, but QuotedPair is more interesting… As you can see, it looks like a Linq query: this is Sprache’s way of specifying a sequence. This particular query means: match a backslash (named _ because we ignore it) followed by any character named c, and return just c (quoted pairs are not escape sequences in the same sense as in C, Java or C#, so "\n" isn’t interpreted as "new line" but just as "n").

We can now write the rule for a quoted string:

    private static readonly Parser<string> QuotedString =
        from open in DoubleQuote
        from text in QuotedPair.Or(QdText).Many().Text()
        from close in DoubleQuote
        select text;
  • the Or method indicates a choice between two parsers. QuotedPair.Or(QdText) will try to match a quoted pair, and if that fails, it will try to match a QdText instead.
  • Many() indicates any number of repetition
  • Text() combines the characters into a string

We now have all the basic building blocks, so we can move on to higher level rules.

Parsing challenge parameters

A challenge is made of an auth scheme followed by one or more parameters. The auth scheme is trivial (it’s just a token), so let’s start by parsing the parameters.

Although there isn’t a named rule for it in the grammar, let’s define a rule for parameter values. The value can be either a token or a quoted string:

    private static readonly Parser<string> ParameterValue =
        Token.Or(QuotedString);

Since a parameter is a composite element (name and value), let’s define a class to represent it:

class Parameter
{
    public Parameter(string name, string value)
    {
        Name = name;
        Value = value;
    }
    
    public string Name { get; }
    public string Value { get; }
}

The T in Parser<T> isn’t restricted to characters and strings, it can be any type. So the rule for parsing parameters will be of type Parser<Parameter>:

    private static readonly Parser<char> EqualSign = Parse.Char('=');

    private static readonly Parser<Parameter> Parameter =
        from name in Token
        from _ in EqualSign
        from value in ParameterValue
        select new Parameter(name, value);

Here we match a token (the parameter name), followed by the '=' sign, followed by a parameter value, and we combine the name and value into a Parameter instance.

Now let’s parse a sequence of one or more parameters. Parameters are separated by commas (','), with optional leading and trailing whitespace (look for "#rule" in RFC 2616 §2.1). The grammar for lists allows several commas without items in between, e.g. "item1 ,, item2,item3, ,item4", so the rule for the delimiter can be written like this:

    private static readonly Parser<char> Comma = Parse.Char(',');

    private static readonly Parser<char> ListDelimiter =
        from leading in Parse.WhiteSpace.Many()
        from c in Comma
        from trailing in Parse.WhiteSpace.Or(Comma).Many()
        select c;

We just match the first comma, the rest can be any number of commas or whitespace characters. We return the comma because we have to return something, but we won’t actually use it.

We could now match the sequence of parameters like this:

    private static readonly Parser<Parameter[]> Parameters =
        from first in Parameter.Once()
        from others in (
            from _ in ListDelimiter
            from p in Parameter
            select p).Many()
        select first.Concat(others).ToArray();

But it’s not very straightforward… fortunately Sprache provides an easier option with the DelimitedBy method:

    private static readonly Parser<Parameter[]> Parameters =
        from p in Parameter.DelimitedBy(ListDelimiter)
        select p.ToArray();

Parsing the challenge

We’re almost done. We now have everything we need to parse the whole challenge. Let’s define a class to represent it first:

class Challenge
{
    public Challenge(string scheme, Parameter[] parameters)
    {
        Scheme = scheme;
        Parameters = parameters;
    }
    public string Scheme { get; }
    public Parameter[] Parameters { get; }
}

And finally we can write the top-level rule:

    public static readonly Parser<Challenge> Challenge =
        from scheme in Token
        from _ in Parse.WhiteSpace.AtLeastOnce()
        from parameters in Parameters
        select new Challenge(scheme, parameters);

Note that I made this rule public, unlike the others: it’s the only one we need to expose.

Using the parser

Our parser is done, now we just have to use it, which is pretty straightforward:

void ParseAndPrintChallenge(string input)
{
    var challenge = Grammar.Challenge.Parse(input);
    Console.WriteLine($"Scheme: {challenge.Scheme}");
    Console.WriteLine($"Parameters:");
    foreach (var p in challenge.Parameters)
    {
        Console.WriteLine($"- {p.Name} = {p.Value}");
    }
}

With the OAuth 2.0 challenge example from earlier, this produces the following output:

Scheme: Bearer
Parameters:
- realm = FooCorp
- error = invalid_token
- error_description = The access token has expired

If there’s a syntax error in the input text, the Parse will throw a ParseException with a message describing where and why the parsing failed. For instance, if I remove the space between "Bearer" and "realm", I get the following error:

Parsing failure: unexpected ‘=’; expected whitespace (Line 1, Column 12); recently consumed: earerrealm

You can find the full code for this article here.

Conclusion

As you can see, Sprache makes it very simple to parse complex text. The code isn’t particularly short, but it’s completely declarative; there are no loops, no conditionals, no temporary variables, no state… This makes it very easy to understand, and it can easily be compared with the actual grammar definition to check its correctness. It also provides pretty good feedback in case of error, which is hard to accomplish with a hand-built parser.

What’s new in FakeItEasy 3.0.0?

FakeItEasy is a popular mocking framework for .NET, with an very intuitive and easy-to-use API. For about one year, I’ve been a maintainer of FakeItEasy, along with Adam Ralph and Blair Conrad. It’s been a real pleasure working with them and I had a lot of fun!

Today I’m glad to announce that we’re releasing FakeItEasy 3.0.0, which supports .NET Core and introduces a few useful features.

Let’s see what’s new!

.NET Core support

In addition to .NET 4+, FakeItEasy now supports .NET Standard 1.6, so you can use it in .NET Core projects.

Note that due to limitations in .NET Standard 1.x, there are some minor differences with the .NET 4 version:

  • Fakes are not binary serializable;
  • Self-initializing fakes are not supported (i.e. fakeService = A.Fake<IService>(options => options.Wrapping(realService).RecordedBy(recorder)).

Huge thanks to the people who made .NET Core support possible:

  • Jonathon Rossi, who maintains Castle.Core. FakeItEasy relies heavily on Castle.Core, so it couldn’t have supported .NET Core if Castle.Core didn’t.
  • Jeremy Meng from Microsoft, who did most of the heavy lifting to make both FakeItEasy 3.0.0 and Castle.Core 4.0.0 work on .NET Core.

Analyzer

VB.NET support

The FakeItEasy analyzer, which warns you of incorrect usage of the library, now supports VB.NET as well as C#.

New or improved features

Better syntax for configuring successive calls to the same member

When you configure calls on a fake, it creates rules that are "stacked" on each other, which means you can override a previously configured rule. Combined with the ability to specify the number of times a rule must apply, this lets you say things like "return 42 twice, then throw an exception". Until now, to do that you had to configure the calls in reverse order, which wasn’t very intuitive and meant you had to repeat the call specification:

A.CallTo(() => foo.Bar()).Throws(new Exception("oops"));
A.CallTo(() => foo.Bar()).Returns(42).Twice();

FakeItEasy 3.0.0 introduces a new syntax to make this easier:

A.CallTo(() => foo.Bar()).Returns(42).Twice()
    .Then.Throws(new Exception("oops"));

Note that if you don’t specify how many times the rule must apply, it will apply forever until explicitly overridden. Hence, you can only use Then after Once(), Twice() or NumberOfTimes(...).

This is a breaking change at the API level, as the shape of the configuration interfaces has changed, but unless you manipulate those interfaces explicitly, you shouldn’t be affected.

Automatic support for cancellation

When a method accepts a CancellationToken, it should usually throw an exception when it’s called with a token that is already canceled. Previously this behavior had to be configured manually. In FakeItEasy 3.0.0, fake methods will now throw an OperationCanceledException by default when called with a canceled token. Asynchronous methods will return a canceled task.

This is technically a breaking change, but most users are unlikely to be affected.

Throw asynchronously

FakeItEasy lets you configure a method to throw an exception with Throws. But for async methods, there are actually two ways to "throw":

  • throw an exception synchronously, before actually returning a task (this is what Throws does)
  • return a failed task (which had to be done manually until now)

In some cases the difference can be important to the caller, if it doesn’t directly await the async method. FakeItEasy 3.0.0 introduces a ThrowsAsync method to configure a method to return a failed task:

A.CallTo(() => foo.BarAsync()).ThrowsAsync(new Exception("foo"));

Configure property setters on unnatural fakes

Unnatural fakes (i.e. Fake<T>) now have a CallsToSet method, which does the same as A.CallToSet on natural fakes:

var fake = new Fake<IFoo>();
fake.CallsToSet(foo => foo.Bar).To(0).Throws(new Exception("The value of Bar can't be 0"));

Better API for specifying additional attributes

The syntax to specify additional attributes on fakes was a bit unwieldy; you had to create a collection of CustomAttributeBuilders, which themselves had to be created by specifying the constructor and argument values. The WithAdditionalAttributes method has been retired in FakeItEasy 3.0.0 and replaced with a simpler WithAttributes that accepts expressions:

var foo = A.Fake<IFoo>(x => x.WithAttributes(() => new FooAttribute()));

This is a breaking change.

Other notable changes

Deprecation of self-initializing fakes

Self-initializing fakes are a feature that lets you record the results of calls to a real object, and replay them on a fake object. This feature was used by very few people, and didn’t seem to be a good fit in the core FakeItEasy library, so it will be removed in a future version. We’re considering providing the same functionality as a separate package.

Bug fixes

  • Faking a type multiple times and applying different attributes to the fakes now correctly generates different fake types. (#436)
  • All non-void members attempt to return a Dummy by default, even after being reconfigured by Invokes or DoesNothing (#830)

The full list of changes for this release is available in the release notes.

Other contributors to this release include:

A big thanks to them!

C# methods in git diff hunk headers

If you use git on the command line, you may have noticed that diff hunks often show the method signature in the hunk header (the line that starts with @@), like this:

diff --git a/Program.cs b/Program.cs
index 655a213..5ae1016 100644
--- a/Program.cs
+++ b/Program.cs
@@ -13,6 +13,7 @@ static void Main(string[] args)
         Console.WriteLine("Hello World!");
         Console.WriteLine("Hello World!");
         Console.WriteLine("Hello World!");
+        Console.WriteLine("blah");
     }

This is very useful to know where you are when looking at a diff.

Git has a few built-in regex patterns to detect methods in some languages, including C#; they are defined in userdiff.c. But by default, these patterns are not used… you need to tell git which file extensions should be associated with which language. This can be specified in a .gitattributes file at the root of your git repository:

*.cs    diff=csharp

With this done, git diff should show an output similar to the sample above.

Are we done yet? Well, almost. See, the patterns for C# were added to git a long time ago, and C# has changed quite a bit since then. Some new keywords that can now be part of a method signature are not recognized by the built-in pattern, e.g. async or partial. This is quite annoying, because when some code has changed in an async method, the diff hunk header shows the signature of a previous, non-async method, or the line where the class is declared, which is confusing.

My first impulse was to submit a pull request on Github to add the missing keywords; however I soon realized that the git repository on Github is just a mirror and does not accept pull requests… The contribution process consists of sending a patch to the git mailing list, with a long and annoying checklist of requirements. This process seemed so tedious that I gave it up. I honestly don’t know why they use such a difficult and old-fashioned contribution process, it just discourages casual contributors. But that’s a bit off-topic, so let’s move on and try to solve the problem some other way.

Fortunately, the built-in patterns can be overridden in the git configuration. To define the function name pattern for C#, you need to define the diff.csharp.xfuncname setting in your git config file:

[diff "csharp"]
  xfuncname = ^[ \\t]*(((static|public|internal|private|protected|new|virtual|sealed|override|unsafe|async|partial)[ \\t]+)*[][<>@.~_[:alnum:]]+[ \\t]+[<>@._[:alnum:]]+[ \\t]*\\(.*\\))[ \\t]*$

As you can see, it’s the same pattern as in userdiff.c, with the backslashes escaped and the missing keywords added. With this pattern, git diff now shows the correct function signature in async methods:

diff --git a/Program.cs b/Program.cs
index 655a213..5ae1016 100644
--- a/Program.cs
+++ b/Program.cs
@@ -31,5 +32,6 @@ static async Task FooAsync()
         Console.WriteLine("Hello world");
         Console.WriteLine("Hello world");
         Console.WriteLine("Hello world");
+        await Task.Delay(100);
     }
 }

It took me a while to figure it out, so I hope you find it helpful!

Fun with the HttpClient pipeline

A few years ago, Microsoft introduced the HttpClient class as a modern alternative to HttpWebRequest to make web requests from .NET apps. Not only is this new API much easier to use, cleaner, and asynchronous by design, it’s also easily extensible.

You might have noticed that HttpClient has a constructor that accepts a HttpMessageHandler. What is this handler? It’s an object that accepts a request (HttpRequestMessage) and returns a response (HttpResponseMessage); how it does that is entirely dependent on the implementation. By default, HttpClient uses HttpClientHandler, a handler which sends a request to a server over the network and returns the server’s response. The other built-in handler implementation is an abstract class named DelegatingHandler, and is the one I want to talk about.

The pipeline

DelegatingHandler is a handler that is designed to be chained with another handler, effectively forming a pipeline through which requests and responses will pass, as shown on this diagram:

HttpClient pipeline diagram

(Image from the official ASP.NET website)

Each handler has a chance to examine and/or modify the request before passing it to the next handler in the chain, and to examine and/or modify the response it receives from the next handler. Typically, the last handler in the pipeline is the HttpClientHandler, which communicates directly with the network.

The handler chain can be setup like this:

var pipeline = new MyHandler1()
{
    InnerHandler = new MyHandler2()
    {
        InnerHandler = new HttpClientHandler()
    }
};
var client = new HttpClient(pipeline);

But if you prefer fluent interfaces, you can easily create an extension method to do it like this:

var pipeline = new HttpClientHandler()
    .DecorateWith(new MyHandler2())
    .DecorateWith(new MyHandler1());
var client = new HttpClient(pipeline);

All this might seem a little abstract at this point, but this pipeline architecture enables plenty of interesting scenarios. See, HTTP message handlers can be used to add custom behavior to how requests and responses are processed. I’ll give a few examples.

Side note: I’m presenting this feature from a client-side perspective (since I primarily make client apps), but the same HTTP message handlers are also used on the server-side in ASP.NET Web API.

Unit testing

The first use case that comes to mind, and the first I ever used, is unit testing. If you’re testing a class that makes online payments over HTTP, you don’t want it to actually send requests to the real server… you just want to ensure that the requests it sends are correct, and that it reacts correctly to specific responses. An easy solution to this problem is to create a "stub" handler, and inject it into your class to use instead of HttpClientHandler. Here’s a simple implementation:

class StubHandler : HttpMessageHandler
{
    // Responses to return
    private readonly Queue<HttpResponseMessage> _responses =
        new Queue<System.Net.Http.HttpResponseMessage>();

    // Requests that were sent via the handler
    private readonly List<HttpRequestMessage> _requests =
        new List<System.Net.Http.HttpRequestMessage>();

    protected override Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request,
        CancellationToken cancellationToken)
    {
        if (_responses.Count == 0)
            throw new InvalidOperationException("No response configured");

        _requests.Add(request);
        var response = _responses.Dequeue();
        return Task.FromResult(response);
    }

    public void QueueResponse(HttpResponseMessage response) =>
        _responses.Enqueue(response);

    public IEnumerable<HttpRequestMessage> GetRequests() =>
        _requests;
}

This class lets you record the requests that are sent via the handler and specify the responses that should be returned. For instance, you could write a test like this:

// Arrange
var handler = new StubHandler();
handler.EnqueueResponse(new HttpResponseMessage(HttpStatusCode.Unauthorized));
var processor = new PaymentProcessor(handler);

// Act
var paymentResult = await processor.ProcessPayment(new Payment());

// Assert
Assert.AreEqual(PaymentStatus.Failed, paymentResult.Status);

Of course, rather than creating a stub manually, you could use a mocking framework to generate a fake handler for you. The fact that the SendAsync method is protected makes it a little harder than it should be, but you can easily work around the issue by making a subclass that exposes a public virtual method, and mock that instead:

public abstract class MockableMessageHandler : HttpMessageHandler
{
    protected override sealed Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request,
        CancellationToken cancellationToken)
    {
        return DoSendAsync(request);
    }

    public abstract Task<HttpResponseMessage> DoSendAsync(HttpRequestMessage request);
}

Usage example with FakeItEasy:

// Arrange
var handler = A.Fake<MockableMessageHandler>();
A.CallTo(() => handler.DoSendAsync(A<HttpRequestMessage>._))
    .Returns(new HttpResponseMessage(HttpStatusCode.Unauthorized));
var processor = new PaymentProcessor(handler);
...

Logging

Logging sent requests and received responses can help diagnose issues. This can easily be done with a custom delegating handler:

public class LoggingHandler : DelegatingHandler
{
    private readonly ILogger _logger;

    public LoggingHandler(ILogger logger)
    {
        _logger = logger;
    }

    protected override async Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request,
        CancellationToken cancellationToken)
    {
        _logger.Trace($"Request: {request}");
        try
        {
            // base.SendAsync calls the inner handler
            var response = await base.SendAsync(request, cancellationToken);
            _logger.Trace($"Response: {response}");
            return response;
        }
        catch (Exception ex)
        {
            _logger.Error($"Failed to get response: {ex}");
            throw;
        }
    }
}

Retrying failed requests

Another interesting use case for HTTP message handlers is to automatically retry failed requests. For instance, the server you’re talking to might be temporarily unavailable (503), or it could be throttling your requests (429), or maybe you lost Internet access. Handling the retry for these cases at the application level is a pain, because it can happen virtually in any part of your code. Having this logic at the lowest possible level and implemented in a way that is completely transparent to the callers can make things much easier.

Here’s a possible implementation of a retry handler:

public class RetryHandler : DelegatingHandler
{
    protected override async Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request,
        CancellationToken cancellationToken)
    {
        while (true)
        {
            try
            {
                // base.SendAsync calls the inner handler
                var response = await base.SendAsync(request, cancellationToken);

                if (response.StatusCode == HttpStatusCode.ServiceUnavailable)
                {
                    // 503 Service Unavailable
                    // Wait a bit and try again later
                    await Task.Delay(5000, cancellationToken);
                    continue;
                }

                if (response.StatusCode == (HttpStatusCode)429)
                {
                    // 429 Too many requests
                    // Wait a bit and try again later
                    await Task.Delay(1000, cancellationToken);
                    continue;
                }

                // Not something we can retry, return the response as is
                return response;
            }
            catch (Exception ex) when(IsNetworkError(ex))
            {
                // Network error
                // Wait a bit and try again later
                await Task.Delay(2000, cancellationToken);
                continue;
            }
        }
    }

    private static bool IsNetworkError(Exception ex)
    {
        // Check if it's a network error
        if (ex is SocketException)
            return true;
        if (ex.InnerException != null)
            return IsNetworkError(ex.InnerException);
        return false;
    }
}

Note that it’s a pretty naive and simplistic implementation; for use in production code, you will probably want to add exponential backoff, take the Retry-After header into account to decide how long you have to wait, or be more subtle in how you check if an exception indicates a connection issue. Also, note that in its current state, this handler will retry forever until it succeeds; make sure to pass a cancellation token so that you can stop retrying if necessary.

Other use cases

I can’t give examples for every possible scenario, but here are a few other possible use cases for HTTP message handlers:

  • Custom cookie handling (I actually did that to work around a bug in CookieContainer)
  • Custom authentication (also something I did to implement OAuth2 Bearer authentication)
  • Using the X-HTTP-Method-Override header to pass proxies that forbid certain HTTP methods (see Scott Hanselman’s article for details)
  • Custom encryption or encoding
  • Caching

As you can see, there’s a whole world of possibilities! If you have other ideas, let me know in the comments!

Tuple deconstruction in C# 7

Last time on this blog I talked about the new tuple feature of C# 7. In Visual Studio 15 Preview 3, the feature wasn’t quite finished; it lacked 2 important aspects:

  • emitting metadata for the names of tuple elements, so that the names are preserved across assemblies
  • deconstruction of tuples into separate variables

Well, it looks like the C# language team has been busy during the last month, because both items are now implemented in VS 15 Preview 4, which was released today! They’ve also written nice startup guides about tuples and deconstruction.

It is now possible to write something like this:

var values = ...
var (count, sum) = Tally(values);
Console.WriteLine($"There are {count} values and their sum is {sum}");

(the Tally method is the one from the previous post)

Note that the intermediate variable t from the previous post has disappeared; we now assign the count and sum variables directly from the method result, which looks much nicer IMHO. There doesn’t seem to be a way to ignore part of the tuple (i.e. not assign it to a variable), hopefully it will come later.

An interesting aspect of deconstruction is that it’s not limited to tuples; any type can be deconstructed, as long as it has a Deconstruct method with the appropriate out parameters:

class Point
{
    public int X { get; }
    public int Y { get; }

    public Point(int x, int y)
    {
        X = x;
        Y = y;
    }

    public void Deconstruct(out int x, out int y)
    {
        x = X;
        y = Y;
    }
}

...

var (x, y) = point;
Console.WriteLine($"Coordinates: ({x}, {y})");

The Deconstruct method can also be an extension method, which can be useful if you want to deconstruct a type that you don’t own. The old System.Tuple classes, for example, can be deconstructed using extension methods like this one:

public static void Deconstruct<T1, T2>(this Tuple<T1, T2> tuple, out T1 item1, out T2 item2)
{
    item1 = tuple.Item1;
    item2 = tuple.Item2;
}

...

var tuple = Tuple.Create("foo", 42);
var (name, value) = tuple;
Console.WriteLine($"Name: {name}, Value = {value}");

Finally, methods that return tuples are now decorated with a [TupleElementNames] attribute that indicates the names of the tuple members:

// Decompiled code
[return: TupleElementNames(new[] { "count", "sum" })]
public static ValueTuple<int, double> Tally(IEnumerable<double> values)
{
   ...
}

(the attribute is emitted by the compiler, you don’t actually need to write it yourself)

This makes it possible to share the tuple member names across assemblies, and lets tools like Intellisense provide helpful information about the method.

So, the tuple feature of C# 7 seems to be mostly complete; however, keep in mind that it’s still a preview, and some things could change between now and the final release.

Tuples in C# 7

A tuple is an finite ordered list of values, of possibly different types, which is used to bundle related values together without having to create a specific type to hold them.

In .NET 4.0, a set of Tuple classes has been introduced in the framework, which can be used as follows:

private static Tuple<int, double> Tally(IEnumerable<double> values)
{
	int count = 0;
	double sum = 0.0;
	foreach (var value in values)
	{
	    count++;
	    sum += value;
	}
	return Tuple.Create(count, sum);
}

...

var values = ...
var t = Tally(values);
Console.WriteLine($"There are {t.Item1} values and their sum is {t.Item2}");

There are two annoying issues with the Tuple classes:

  • They’re classes, i.e. reference types. This means they must be allocated on the heap, and garbage collected when they’re no longer used. For applications where performance is critical, it can be an issue. Also, the fact that they can be null is often not desirable.
  • The elements in the tuple don’t have names, or rather, they always have the same names (Item1, Item2, etc), which are not meaningful at all. The Tuple<T1, T2> type conveys no information about what the tuple actually represents, which makes it a poor choice in public APIs.

In C# 7, a new feature will be introduced to improve support for tuples: you will be able to declare tuples types “inline”, a little like anonymous types, except that they’re not limited to the current method. Using this new feature, the code above becomes much cleaner:

static (int count, double sum) Tally(IEnumerable<double> values)
{
	int count = 0;
	double sum = 0.0;
	foreach (var value in values)
	{
	    count++;
	    sum += value;
	}
	return (count, sum);
}

...

var values = ...
var t = Tally(values);
Console.WriteLine($"There are {t.count} values and their sum is {t.sum}");

Note how the return type of the Tally method is declared, and how the result is used. This is much better! The tuple elements now have significant names, and the syntax is nicer too. The feature relies on a new ValueTuple<T1, T2> structure, which means it doesn’t involve a heap allocation.

You can try this feature right now in Visual Studio 15 Preview 3. However, the ValueTuple<T1, T2> type is not (yet) part of the .NET Framework; to get this example to work, you’ll need to reference the System.ValueTuple NuGet package.

Finally, one last remark about the names of tuple members: like many other language features, they’re just syntactic sugar. In the compiled code, the tuple members are only referred to as Item1 and Item2, not count and sum. The Tally method above actually returns a ValueTuple<int, double>, not a specially generated type. Note that the compiler that ships with VS 15 Preview 3 emits no metadata about the names of the tuple members. This part of the feature is not yet implemented, but should be included in the final version. This means that in the meantime, you can’t use tuples across assemblies (well, you can, but you will lose the member names and will have to use Item1 and Item2 to refer to the tuple members).

Pitfall: using var and async together

A few days ago at work, I stumbled upon a sneaky bug in our main app. The code looked innocent enough, and at first glance I couldn’t understand what was wrong… The code was similar to the following:

public async Task<bool> BookExistsAsync(int id)
{
    var store = await GetBookStoreAsync();
    var book = store.GetBookByIdAsync(id);
    return book != null;
}

// For completeness, here are the types and methods used in BookExistsAsync:

private Task<IBookStore> GetBookStoreAsync()
{
    // actual implementation irrelevant
    // ...
}


public interface IBookStore
{
    Task<Book> GetBookByIdAsync(int id);
    // other members omitted for brevity
}

public class Book
{
    public int Id { get; set; }
    // other members omitted for brevity
}

The BookExistsAsync method always returns true. Can you see why ?

Look at this line:

var book = store.GetBookByIdAsync(id);

Quick, what’s the type of book? If you answered Book, think again: it’s Task<Book>. The await is missing! And an async method always returns a non-null task, so book is never null.

When you have an async method with no await, the compiler warns you, but in this case there is an await on the line above. The only thing we do with book is to check that it’s not null; since Task<T> is a reference type, there’s nothing suspicious in comparing it to null. So, the compiler sees nothing wrong; the static code analyzer (ReSharper in this case) sees nothing wrong; and of course the feeble human brain reviewing the code sees nothing wrong either… Obviously, it could easily have been detected with adequate unit test coverage, but unfortunately this method wasn’t covered.

So, how to avoid this kind of mistake? Stop using var and always specify types explicitly? But I like var, I use it almost everywhere! Besides, I think it’s the first time I ever found a bug caused by the use of var. I’m really not willing to give it up…

Ideally, I would have liked ReSharper to spot the issue; perhaps it should consider all Task-returning methods to be implicitly [NotNull], unless specified otherwise. Until then, I don’t have a silver bullet against this issue; just pay attention when you call an async method, and write unit tests!

Test driving C# 7 features in Visual Studio “15” Preview

About two weeks ago, Microsoft released the first preview of the next version of Visual Studio. You can read about what’s new in the release notes. Some of the new features are really nice (for instance I love the new “lightweight installer”), but the most interesting for me is that it comes with a version of the compiler that includes a few of the features planned for C# 7. Let’s have a closer look at them!

Enabling the new features

The new features are not enabled by default. You can enable them individually with /feature: command line switches, but the easiest way is to enable them all by adding __DEMO__ and __DEMO_EXPERIMENTAL__ to the conditional compilation symbols (in Project properties, Build tab).

Local functions

Most functional languages allow you to declare functions in the body of other functions. It’s now possible to do the same in C# 7! The syntax for declaring a method inside another is pretty much what you would expect:

long Factorial(int n)
{
    long Fact(int i, long acc)
    {
        return i == 0 ? acc : Fact(i - 1, acc * i);
    }
    return Fact(n, 1);
}

Here, the Fact method is local to the Factorial method (in case you’re wondering, it’s a tail-recursive implementation of the factorial — which doesn’t make much sense, since C# doesn’t support tail recursion, but it’s just an example).

Of course, it was already possible to simulate local functions with lambda expressions, but there were a few drawbacks:

  • it’s less readable, because you have to declare the delegate type explicitly
  • it’s slower, due to the overhead of creating a delegate instance, and calling the delegate vs. calling the method directly
  • writing recursive lambdas is a bit awkward

Local functions have the following benefits:

  • when a method is only used as a helper for another method, making it local makes the relation more obvious
  • like lambdas, local functions can capture local variables and parameters of their containing method
  • local functions support recursion like any normal method

You can read more about this feature in the Roslyn Github repository.

Ref returns and ref locals

Since the first version of C#, it has always been possible to pass parameters by reference, which is conceptually similar to passing a pointer to a variable in languages like C. Until now, this feature was limited to parameters, but in C# 7 it becomes possible to return values by reference, or to have local variables that refer to the location of another variable. Here’s an example:

static void TestRefReturn()
{
    var foo = new Foo();
    Console.WriteLine(foo); // 0, 0
    
    foo.GetByRef("x") = 42;

    ref int y = ref foo.GetByRef("y");
    y = 99;

    Console.WriteLine(foo); // 42, 99
}

class Foo
{
    private int x;
    private int y;

    public ref int GetByRef(string name)
    {
        if (name == "x")
            return ref x;
        if (name == "y")
            return ref y;
        throw new ArgumentException(nameof(name));
    }

    public override string ToString() => $"{x},{y}";
}

Let’s have a closer look at this code.

  • On line 6, it looks like I’m assigning a value to the return of a method; what does this even mean? Well, the GetByRef method returns a field of the Foo class by reference (note the ref int return type of GetByRef). So, if I pass "x" as an argument, it returns the x field by reference. If I assign a value to that, it actually assigns a value to the x field.
  • On line 8, instead of just assigning a value directly to the field returned by GetByRef, I use a ref local variable y. The local variable now shares the same memory location as the foo.y field. So if I assign a value to it, it changes the value of foo.y.

Note that you can also return an array location by reference:

private MyBigStruct[] array = new MyBigStruct[10];
private int current;

public ref MyBigStruct GetCurrentItem()
{
    return ref array[current];
}

It’s likely that most C# developers will never actually need this feature; it’s pretty low level, and not the kind of thing you typically need when writing line-of-business applications. However it’s very useful for code whose performance is critical: copying a large structure is expensive, so if we can return it by reference instead, it can be a non-negligible performance benefit.

You can learn more about this feature on Github.

Pattern matching

Pattern matching is a feature very common in functional languages. C# 7 introduces some aspects of pattern matching, in the form of extensions to the is operator. For instance, when testing the type of a variable, it lets you introduce a new variable after the type, so that this variable is assigned with the left-hand side operand of the is, but with the type specified as the right-hand side operand (it will be clearer with an example).

Typically, if you need to test that a value is of type DateTime, then do something with that DateTime, you need to test the type, then cast to that type:

object o = GetValue();
if (o is DateTime)
{
    var d = (DateTime)o;
    // Do something with d
}

In C# 7, you can do this instead:

object o = GetValue();
if (o is DateTime d)
{
    // Do something with d
}

d is now declared directly as part of the o is DateTime expression.

This feature can also be used in a switch statement:

object v = GetValue();
switch (v)
{
    case string s:
        Console.WriteLine($"{v} is a string of length {s.Length}");
        break;
    case int i:
        Console.WriteLine($"{v} is an {(i % 2 == 0 ? "even" : "odd")} int");
        break;
    default:
        Console.WriteLine($"{v} is something else");
        break;
}

In this code, each case introduces a variable of the appropriate type, which you can use in the body of the case.

So far I only covered pattern matching against a simple type, but there are also more advanced forms. For instance:

switch (DateTime.Today)
{
    case DateTime(*, 10, 31):
        Console.WriteLine("Happy Halloween!");
        break;
    case DateTime(var year, 7, 4) when year > 1776:
        Console.WriteLine("Happy Independence Day!");
        break;
    case DateTime { DayOfWeek is DayOfWeek.Saturday }:
    case DateTime { DayOfWeek is DayOfWeek.Sunday }:
        Console.WriteLine("Have a nice week-end!");
        break;
    default:
        break;
}

How cool is that!

There’s also another (still experimental) form of pattern matching, using a new match keyword:

object o = GetValue();
string description = o match
    (
        case string s : $"{o} is a string of length {s.Length}"
        case int i : $"{o} is an {(i % 2 == 0 ? "even" : "odd")} int"
        case * : $"{o} is something else"
    );

It’s very similar to a switch, except that it’s an expression, not a statement.

There’s a lot more to pattern matching than what I mentioned here. You can look at the spec on Github for more details.

Binary literals and digit separators

These features were not explicitly mentioned in the VS Preview release notes, but I noticed they were included as well. They were initially planned for C# 6, but didn’t make it in the final release. They’re back in C# 7.

You can now write numeric literal in binary, in addition to decimal an hexadecimal:

int x = 0b11001010;

Very convenient to define bit masks!

To make large numbers more readable, you can also group digits by introducing separators. This can be used for decimal, hexadecimal or binary literals:

int oneBillion = 1_000_000_000;
int foo = 0x7FFF_1234;
int bar = 0b1001_0110_1010_0101;

Conclusion

So, with Visual Studio “15” Preview, you can start experimenting with the new C# 7 features; don’t hesitate to share your feedback on Github! And keep in mind that it’s still pre-release software, lots of things can change before the final release.

Using multiple cancellation sources with CreateLinkedTokenSource

Async programming in C# used to be hard; thanks to .NET 4’s Task Parallel Library and C# 5’s async/await feature, it has become fairly easy, and as a result, is becoming much more common. At the same time, a standardized approach to cancellation has been introduced : cancellation tokens. The basic idea is that you create a CancellationTokenSource that controls the cancellation, and pass the token it provides to the method that you want to be able to cancel. That method will then pass it to the other methods it calls, if they can be canceled, and/or regularly check if cancellation was requested. Upon cancellation, the method will typically throw an OperationCanceledException. Quick and dirty example:

private readonly IBusinessService _businessService;
private CancellationTokenSource _cancellationSource;
private Task _asyncOperation;

private void StartAsyncOperation()
{
    if (_asyncOperation != null)
        return;
    var _cancellationSource = new CancellationTokenSource();
    _asyncOperation = _businessService.DoSomethingAsync(_cancellationSource.Token);
}

// async void is bad; like I said, this is a quick and dirty example
private async void StopAsyncOperation()
{
    try
    {
        _cancellationSource.Cancel();
        // wait for the operation to finish
        await _asyncOperation;
    }
    catch (OperationCanceledException)
    {
        // Operation was successfully canceled
    }
    catch (Exception)
    {
        // Oops, something went wrong
    }
    finally
    {
        _asyncOperation = null;
        _cancellationSource.Dispose();
        _cancellationSource = null;
    }

...

class BusinessService : IBusinessService
{
    public async Task DoSomethingAsync(CancellationToken cancellationToken)
    {
        var data = await GetDataFromServerAsync(cancellationToken);
        foreach (string line in data)
        {
            cancellationToken.ThrowIfCancellationRequested();
            await ProcessLineAsync(line, cancellationToken);
        }
    }

    ...
}

In this case, StopAsyncOperation would be called, for instance, if the user chooses to abort the operation.

This all works pretty well and is rather easy to setup. But what if there is another reason to cancel the operation, known only by the BusinessService and outside the control of the calling method? That’s where the CancellationSource.CreateLinkedTokenSource method comes into play; basically, this method creates a cancellation source that will be canceled when any of the specified tokens is canceled.

Let’s start with a simple case: you have another cancellation token that you also want to take into account. The code would look like this:

    public async Task DoSomethingAsync(CancellationToken cancellationToken)
    {
        var otherToken = GetOtherCancellationToken();
        using (var linkedCts = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken, otherToken))
        {
            var data = await GetDataFromServerAsync(linkedCts.Token);
            foreach (string line in data)
            {
                linkedCts.Token.ThrowIfCancellationRequested();
                await ProcessLineAsync(line, linkedCts.Token);
            }
        }
    }

We created a linked cancellation source based on the two cancellation tokens, then used the token from this new source instead of cancellationToken. If either cancellationToken or otherToken is canceled, linkedCts.Token will be canceled as well. If necessary, the calling code can detect how the operation was canceled by checking the CancellationToken property of the OperationCanceledException.

Now let’s see a slightly more difficult case: the second cancellation source is actually an event. You want to cancel the operation when the event occurs, in addition to user cancellation represented by the cancellationToken parameter. So you need to subscribe to the event and trigger the cancellation when it occurs. Here’s a way to do it:

    public async Task DoSomethingAsync(CancellationToken cancellationToken)
    {
        using (var linkedCts = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken))
        {
            EventHandler handler = (sender, e) => linkedCts.Cancel();
            try
            {
                SomeEvent += handler;
                var data = await GetDataFromServerAsync(linkedCts.Token);
                foreach (string line in data)
                {
                    linkedCts.Token.ThrowIfCancellationRequested();
                    await ProcessLineAsync(line, linkedCts.Token);
                }
            }
            finally
            {
                SomeEvent -= handler;
            }
        }
    }

Here we only pass cancellationToken to CreateLinkedTokenSource, and we directly cancel linkedCts when the event is raised. The code is getting a bit convoluted, but it achieves the desired result.

I can’t really give you a specific real-world use case of this technique, because the cases where I used it are too specific to be of public interest, but I can outline the general scenario. I have a long running operation that is made up of multiple long running operations. The whole operation can be canceled globally, and each of the sub-operations can also be canceled individually, without affecting the others. Here’s the rough outline of what it looks like:

async Task GlobalOperationAsync(CancellationToken cancellationToken)
{
    foreach (var subOperation is SubOperations)
    {
        cancellationToken.ThrowIfCancellationRequested();
        var subToken = subOperation.GetSpecificCancellationToken();
        using (var linkedCts = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken, subToken))
        {
            try
            {
                await subOperation.RunAsync(linkedCts.Token);
            }
            catch (OperationCanceledException ex)
            {
                // Rethrow only if global cancellation was requested
                if (cancellationToken.IsCancellationRequested)
                    throw;
                    
                // otherwise continue running the other sub-operations
            }
        }
    }
}

Note that even though CancellationToken was introduced with the TPL and all the examples I gave were asynchronous, nothing prevents you from using this technique with synchronous code.

I hope you find this helpful. Have a great New Year’s Eve celebration and a happy new year!