
Writing a GitHub Webhook as an Azure Function

I recently experimented with Azure Functions and GitHub apps, and I wanted to share what I learned.

A bit of background

As you may already know, I’m one of the maintainers of the FakeItEasy mocking library. As is common in open-source projects, we use a workflow based on feature branches and pull requests. When a change is requested in a PR during code review, we usually make the change as a fixup commit, because it makes it easier to review, and because we like to keep a clean history. When the changes are approved, the author squashes the fixup commits before the PR is merged. Unfortunately, I’m a little absent-minded, and when I review a PR, I often forget to wait for the author to squash their commits before I merge… This causes the fixup commits to appear in the main dev branch, which is ugly.

Which leads me to the point of this post: I wanted to make a bot that could prevent a PR from being merged if it had commits that needed to be squashed (i.e. commits whose messages start with fixup! or squash!). And while I was at it, I thought I might as well make it usable by everyone, so I made it a GitHub app: DontMergeMeYet.
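To make the rule concrete, here is a minimal sketch of the check itself, assuming the commit messages have already been fetched. The SquashCheck class and its method names are hypothetical and are not taken from the DontMergeMeYet code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, for illustration only.
public static class SquashCheck
{
    // Prefixes that mark a commit as one that should be squashed.
    private static readonly string[] Prefixes = { "fixup!", "squash!" };

    // True if a single commit message starts with one of the prefixes.
    public static bool NeedsSquash(string commitMessage) =>
        commitMessage != null &&
        Prefixes.Any(p => commitMessage.StartsWith(p, StringComparison.Ordinal));

    // True if at least one commit in the pull request still needs squashing.
    public static bool HasCommitsToSquash(IEnumerable<string> commitMessages) =>
        commitMessages.Any(NeedsSquash);
}
```

A pull request would then be reported as not ready to merge whenever HasCommitsToSquash returns true for its commit messages.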

GitHub apps

Now, you might be wondering, what on Earth is a GitHub app? It’s simply a third-party application that is granted access to a GitHub repository using its own identity; what it can do with the repo depends on which permissions were granted. A GitHub app can also receive webhook notifications when events occur in the repo (e.g. a comment is posted, a pull request is opened, etc.).

A GitHub app could, for instance, react when a pull request is opened or updated, examine the PR details, and add a commit status to indicate whether the PR is ready to merge or not (this WIP app does this, but doesn’t take fixup commits into account).

As you can see, it’s a pretty good fit for what I’m trying to do!

In order to create a GitHub app, you need to go to the GitHub apps page, and click New GitHub app. You then fill in at least the name, homepage, and webhook URL, give the app the necessary permissions, and subscribe to the webhook events you need. In my case, I only needed read-only access to pull requests, read-write access to commit statuses, and to receive pull request events.

At this point, we don’t yet have a URL for the webhook, so enter any valid URL; we’ll change it later, once we’ve actually implemented the app.

Azure Functions

I hadn’t paid much attention to Azure Functions before, because I didn’t really see the point. So I started to implement my webhook as a full-blown ASP.NET Core app, but then I realized several things:

  • My app only had a single HTTP endpoint
  • It was fully stateless and didn’t need a database
  • If I wanted the webhook to always respond quickly, the Azure App Service had to be "always on"; that option isn’t available in free plans, and I didn’t want to pay a fortune for a better service plan.

I looked around and realized that Azure Functions had a "consumption plan", with a generous amount (1 million per month) of free requests before I had to pay anything, and functions using this plan are "always on". Since I had a single endpoint and no persistent state, an Azure Function seemed to be the best fit for my requirements.

Interestingly, Azure Functions can be triggered, among other things, by GitHub webhooks. This is very convenient as it takes care of validating the payload signature.

So, Azure Functions turn out to be a perfect match for implementing my webhook. Let’s look at how to create one.

Creating an Azure Function triggered by a GitHub webhook

It’s possible to write Azure functions in JavaScript, C# (csx) or F# directly in the portal, but I wanted the comfort of the IDE, so I used Visual Studio. To write an Azure Function in VS, follow the instructions on this page. When you create the project, a dialog appears to let you choose some options:

New function dialog

  • version of the Azure Functions runtime: v1 targets the full .NET Framework, v2 targets .NET Core. I picked v1, because I had trouble with the dependencies in .NET Core.
  • Trigger: GitHub webhooks don’t appear here, so just pick "HTTP Trigger", we’ll make the necessary changes in the code.
  • Storage account: pick the storage emulator; when you publish the function, a real Azure storage account will be set instead
  • Access rights: it doesn’t matter what you pick, we’ll override it in the code.

The project template creates a class named Function1 with a Run method that looks like this:

public static class Function1
{
    [FunctionName("Function1")]
    public static async Task<HttpResponseMessage> Run(
        [HttpTrigger(AuthorizationLevel.Anonymous, "get", "post", Route = null)]HttpRequestMessage req, TraceWriter log)
    {
        ...
    }
}

Rename the class to something that makes more sense, e.g. GitHubWebHook, and don’t forget to change the name in the FunctionName attribute as well.

Now we need to tell the Azure Functions runtime that this function is triggered by a GitHub webhook. To do this, change the method signature to look like this:

    [FunctionName("GitHubWebHook")]
    public static async Task<HttpResponseMessage> Run(
        [HttpTrigger("POST", WebHookType = "github")] HttpRequestMessage req,
        TraceWriter log)

GitHub webhooks always use the HTTP POST method; the WebHookType property is set to "github" to indicate that it’s a GitHub webhook.

Note that it doesn’t really matter what we respond to the webhook request; GitHub doesn’t do anything with the response. I chose to return a 204 (No content) response, but you can return a 200 or anything else, it doesn’t matter.

Publishing the Azure Function

To publish your function, just right click on the Function App project, and click Publish. This will show a wizard that will let you create a new Function App resource on your Azure subscription, or select an existing one. Not much to explain here, it’s pretty straightforward; just follow the wizard!

When the function is published, you need to tell GitHub how to invoke it. Open the Azure portal in your browser, navigate to your new Function App, and select the GitHubWebHook function. This will show the content of the (generated) function.json file. Above the code view, you will see two links: Get function URL, and Get GitHub secret:

Azure Function URL and secret

You need to copy the URL to the Webhook URL field in the GitHub app settings, and copy the secret to the Webhook secret field. This secret is used to calculate a signature for webhook payloads, so that the Azure Function can ensure the payloads really come from GitHub. As I mentioned earlier, this verification is done automatically when you use a GitHub HTTP trigger.
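If you're curious what that verification involves, here is a rough sketch. GitHub sends the signature in the X-Hub-Signature header as sha1= followed by the hex-encoded HMAC-SHA1 of the raw request body, keyed with the webhook secret. The WebhookSignature class below is hypothetical and only illustrates the principle; it is not the code used by the Azure Functions runtime.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper, for illustration only.
public static class WebhookSignature
{
    // Computes the value expected in the X-Hub-Signature header:
    // "sha1=" followed by the lowercase hex HMAC-SHA1 of the body.
    public static string Compute(string secret, string body)
    {
        using (var hmac = new HMACSHA1(Encoding.UTF8.GetBytes(secret)))
        {
            byte[] hash = hmac.ComputeHash(Encoding.UTF8.GetBytes(body));
            return "sha1=" + BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant();
        }
    }

    // Compares the received header with the expected value, looking at
    // every character instead of stopping at the first difference.
    public static bool IsValid(string secret, string body, string signatureHeader)
    {
        string expected = Compute(secret, body);
        if (signatureHeader == null || signatureHeader.Length != expected.Length)
            return false;

        int diff = 0;
        for (int i = 0; i < expected.Length; i++)
            diff |= expected[i] ^ signatureHeader[i];
        return diff == 0;
    }
}
```

You only need something like this if you handle the webhook with a plain HTTP trigger; with the GitHub webhook type, the check is done before your code runs.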

And that’s it, your webhook is online! Now you can go install the GitHub app into one of your repositories, and your webhook will start receiving events for this repo.

Points of interest

I won’t describe the whole implementation of my webhook in this post, because it would be too long and most of it isn’t that interesting, but I will just highlight a few points of interest. You can find the complete code on GitHub.

Parsing the payload

Rather than reinventing the wheel, we can leverage the Octokit .NET library. Octokit is a library made by GitHub to consume the GitHub REST API. It contains classes representing the entities used in the API, including webhook payloads, so we can just deserialize the request content as a PullRequestEventPayload. However, if we just try to do this with JSON.NET, this isn’t going to work: Octokit doesn’t use JSON.NET, so the classes aren’t decorated with JSON.NET attributes to map the C# property names to the JSON property names. Instead, we need to use the JSON serializer that is included in Octokit, called SimpleJsonSerializer:

private static async Task<PullRequestEventPayload> DeserializePayloadAsync(HttpContent content)
{
    string json = await content.ReadAsStringAsync();
    var serializer = new SimpleJsonSerializer();
    return serializer.Deserialize<PullRequestEventPayload>(json);
}

There’s also another issue: the PullRequestEventPayload from Octokit is missing the Installation property, which we’re going to need later to authenticate with the GitHub API. An easy workaround is to make a new class that inherits PullRequestEventPayload and add the new property:

public class PullRequestPayload : PullRequestEventPayload
{
    public Installation Installation { get; set; }
}

public class Installation
{
    public int Id { get; set; }
}

And we’ll just use PullRequestPayload instead of PullRequestEventPayload.

Authenticating with the GitHub API

We’re going to need to call the GitHub REST API for two things:

  • to get the list of commits in the pull request
  • to update the commit status

In order to access the API, we’re going to need credentials… but which credentials? We could just generate a personal access token and use that, but then we would access the API as a "real" GitHub user, and we would only be able to access our own repositories (for writing, at least).

As I mentioned earlier, GitHub apps have their own identity. What I didn’t say is that when authenticated as themselves, there isn’t much they’re allowed to do: they can only get management information about themselves, and get a token to authenticate as an installation. An installation is, roughly, an instance of the application that is installed on one or more repos. When someone installs your app on their repo, it creates an installation. Once you get a token for an installation, you can access all the APIs allowed by the app’s permissions on the repos it’s installed on.

However, there are a few hoops to jump through to get this token… This page describes the process in detail.

The first step is to generate a JSON Web Token (JWT) for the app. This token has to contain the following claims:

  • iat: the timestamp at which the token was issued
  • exp: the timestamp at which the token expires
  • iss: the issuer, which is actually the app ID (found in the GitHub app settings page)

This JWT needs to be signed with the RS256 algorithm (RSA signature with SHA256); in order to sign it, you need a private key, which must be generated from the GitHub app settings page. You can download the private key in PEM format, and store it somewhere your app can access it. Unfortunately, the .NET APIs to generate and sign a JWT don’t handle the PEM format; they need an RSAParameters object… But Stack Overflow is our friend, and this answer contains the code we need to convert a PEM private key to an RSAParameters object. I just kept the part I needed, and manually reformatted the PEM private key to remove the header, footer, and newlines, so that it could easily be stored in the settings as a single line of text.
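I did the reformatting by hand, but it could also be scripted. Here is a small sketch of what that might look like; the PemHelper class is hypothetical, and it only strips the PEM armor. It does not parse the key or produce the RSAParameters object.

```csharp
using System;
using System.Linq;

// Hypothetical helper, for illustration only.
public static class PemHelper
{
    // Removes the "-----BEGIN ...-----" and "-----END ...-----" lines and
    // all line breaks, leaving only the base64 body on a single line.
    public static string ToSingleLine(string pem)
    {
        var bodyLines = pem
            .Split(new[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(line => line.Trim())
            .Where(line => line.Length > 0
                && !line.StartsWith("-----", StringComparison.Ordinal));

        return string.Concat(bodyLines);
    }
}
```

The resulting string can be passed to Convert.FromBase64String to get the raw key bytes, which is the input the conversion code expects.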

Once you have the private key as an RSAParameters object, you can generate a JWT like this:

public string GetTokenForApplication()
{
    var key = new RsaSecurityKey(_settings.RsaParameters);
    var creds = new SigningCredentials(key, SecurityAlgorithms.RsaSha256);
    var now = DateTime.UtcNow;
    var token = new JwtSecurityToken(claims: new[]
        {
            new Claim("iat", now.ToUnixTimeStamp().ToString(), ClaimValueTypes.Integer),
            new Claim("exp", now.AddMinutes(10).ToUnixTimeStamp().ToString(), ClaimValueTypes.Integer),
            new Claim("iss", _settings.AppId)
        },
        signingCredentials: creds);

    var jwt = new JwtSecurityTokenHandler().WriteToken(token);
    return jwt;
}

A few notes about this code:

  • It requires the following NuGet packages:
    • Microsoft.IdentityModel.Tokens 5.2.1
    • System.IdentityModel.Tokens.Jwt 5.2.1
  • ToUnixTimeStamp is an extension method that converts a DateTime to a UNIX timestamp; you can find it here
  • As per the GitHub documentation, the token lifetime cannot exceed 10 minutes
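For reference, an extension method like ToUnixTimeStamp could be written roughly as follows. This is a sketch that assumes the input is a UTC DateTime, as in the code above; the implementation in the repo may differ.

```csharp
using System;

public static class DateTimeExtensions
{
    // Number of whole seconds elapsed since 1970-01-01T00:00:00Z.
    // A DateTime whose Kind is Unspecified is treated as local time
    // by ToUniversalTime, so prefer passing UTC values.
    public static long ToUnixTimeStamp(this DateTime date)
    {
        return new DateTimeOffset(date.ToUniversalTime()).ToUnixTimeSeconds();
    }
}
```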

Once you have the JWT, you can get an installation access token by calling the "new installation token" API endpoint. You can authenticate to this endpoint by using the generated JWT as a Bearer token:

public async Task<string> GetTokenForInstallationAsync(int installationId)
{
    var appToken = GetTokenForApplication();
    using (var client = new HttpClient())
    {
        string url = $"https://api.github.com/installations/{installationId}/access_tokens";
        var request = new HttpRequestMessage(HttpMethod.Post, url)
        {
            Headers =
            {
                Authorization = new AuthenticationHeaderValue("Bearer", appToken),
                UserAgent =
                {
                    ProductInfoHeaderValue.Parse("DontMergeMeYet"),
                },
                Accept =
                {
                    MediaTypeWithQualityHeaderValue.Parse("application/vnd.github.machine-man-preview+json")
                }
            }
        };
        using (var response = await client.SendAsync(request))
        {
            response.EnsureSuccessStatusCode();
            var json = await response.Content.ReadAsStringAsync();
            var obj = JObject.Parse(json);
            return obj["token"]?.Value<string>();
        }
    }
}

OK, almost there. Now we just need to use the installation token to call the GitHub API. This can be done easily with Octokit:

private IGitHubClient CreateGitHubClient(string installationToken)
{
    var userAgent = new ProductHeaderValue("DontMergeMeYet");
    return new GitHubClient(userAgent)
    {
        Credentials = new Credentials(installationToken)
    };
}

And that’s it, you can now call the GitHub API as an installation of your app.

Note: the code above isn’t exactly what you’ll find in the repo; I simplified it a little for the sake of clarity.

Testing locally using ngrok

When creating your Azure Function, it’s useful to be able to debug on your local machine. However, how will GitHub be able to call your function if it doesn’t have a publicly accessible URL? The answer is a tool called ngrok. Ngrok can create a temporary host name that forwards all traffic to a port on your local machine. To use it, create an account (it’s free) and download the command line tool. Once logged in to the ngrok website, a page will give you the command to save an authentication token on your machine. Just execute this command:

ngrok authtoken 1beErG2VTJJ0azL3r2SBn_2iz8johqNv612vaXa3Rkm

Start your Azure Function in debug from Visual Studio; the console will show you the local URL of the function, something like http://localhost:7071/api/GitHubWebHook. Note the port, and in a new console, start ngrok like this:

ngrok http 7071 --host-header rewrite

This will create a new hostname and start forwarding traffic to the 7071 port on your machine. The --host-header rewrite argument causes ngrok to change the Host HTTP header to localhost, rather than the temporary hostname; Azure Functions don’t work correctly without this.

You can see the temporary hostname in the command output:

ngrok by @inconshreveable                                                                                                                                                                                         (Ctrl+C to quit)

Session Status                online
Account                       Thomas Levesque (Plan: Free)
Version                       2.2.8
Region                        United States (us)
Web Interface                 http://127.0.0.1:4040
Forwarding                    http://89e14c16.ngrok.io -> localhost:7071
Forwarding                    https://89e14c16.ngrok.io -> localhost:7071

Connections                   ttl     opn     rt1     rt5     p50     p90
                              0       0       0.00    0.00    0.00    0.00

Finally, go to the GitHub app settings, and change the webhook URL to https://89e14c16.ngrok.io/api/GitHubWebHook (i.e. the temporary domain with the same path as the local URL).

Now you’re all set. GitHub will send the webhook payloads to ngrok, which will forward them to your app running locally.

Note that unless you have a paid plan for ngrok, the temporary subdomain changes every time you start the tool, which is annoying. So it’s better to keep it running for the whole development session, otherwise you will need to change the GitHub app settings again.

Conclusion

Hopefully you learned a few things from this article. With Azure Functions, it’s almost trivial to implement a GitHub webhook (the only tricky part is the authentication to call the GitHub API, but not all webhooks need it). It’s much lighter than a full-blown web app, and much simpler to write: you don’t have to care about MVC, routing, services, etc. And if that wasn’t enough, the pricing model for Azure Functions makes it a very cheap option for hosting a webhook!

Understanding the ASP.NET Core middleware pipeline

Middlewhat?

The ASP.NET Core architecture features a system of middleware, which are pieces of code that handle requests and responses. Middleware are chained to each other to form a pipeline. Incoming requests are passed through the pipeline, where each middleware has a chance to do something with them before passing them to the next middleware. Outgoing responses are also passed through the pipeline, in reverse order. If this sounds very abstract, the following schema from the official ASP.NET Core documentation should help you understand:

Middleware pipeline

Middleware can do all sorts of things, such as handling authentication, errors, static files, etc… MVC in ASP.NET Core is also implemented as a middleware.

Configuring the pipeline

You typically configure the ASP.NET Core pipeline in the Configure method of your Startup class, by calling Use* methods on the IApplicationBuilder. Here’s an example straight from the docs:

public void Configure(IApplicationBuilder app)
{
    app.UseExceptionHandler("/Home/Error");
    app.UseStaticFiles();
    app.UseAuthentication();
    app.UseMvcWithDefaultRoute();
}

Each Use* method adds a middleware to the pipeline. The order in which they’re added determines the order in which requests will traverse them. So an incoming request will first traverse the exception handler middleware, then the static files middleware, then the authentication middleware, and will eventually be handled by the MVC middleware.

The Use* methods in this example are actually just "shortcuts" to make it easier to build the pipeline. Behind the scenes, they all end up using (directly or indirectly) these low-level primitives: Use and Run. Both add a middleware to the pipeline, the difference is that Run adds a terminal middleware, i.e. a middleware that is the last in the pipeline.

A basic pipeline with no branches

Let’s look at a simple example, using only the Use and Run primitives:

public void Configure(IApplicationBuilder app)
{
    // Middleware A
    app.Use(async (context, next) =>
    {
        Console.WriteLine("A (before)");
        await next();
        Console.WriteLine("A (after)");
    });

    // Middleware B
    app.Use(async (context, next) =>
    {
        Console.WriteLine("B (before)");
        await next();
        Console.WriteLine("B (after)");
    });

    // Middleware C (terminal)
    app.Run(async context =>
    {
        Console.WriteLine("C");
        await context.Response.WriteAsync("Hello world");
    });
}

Here, each middleware is defined inline as an anonymous method; they could also be defined as full-blown classes, but for this example I picked the more concise option. Non-terminal middleware take two arguments: the HttpContext and a delegate to call the next middleware. Terminal middleware only take the HttpContext. Here we have two middleware A and B that just log to the console, and a terminal middleware C which writes the response. Here’s the console output when we send a request to our app:

A (before)
B (before)
C
B (after)
A (after)

We can see that each middleware was traversed in the order in which it was added, then traversed again in reverse order. The pipeline can be represented like this:

Basic pipeline
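If you wonder how a list of Use calls turns into this nested behavior, the following toy model may help. It is a deliberately simplified sketch: ToyPipeline is a hypothetical class, not the real IApplicationBuilder, and it ignores the HttpContext entirely. The idea is that the pipeline is built by wrapping the middleware from the last one to the first, so that each one receives a delegate to the next.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// A toy stand-in for IApplicationBuilder, for illustration only.
public class ToyPipeline
{
    private readonly List<Func<Func<Task>, Func<Task>>> _components =
        new List<Func<Func<Task>, Func<Task>>>();

    // Adds a middleware that receives a delegate to the next middleware.
    public ToyPipeline Use(Func<Func<Task>, Task> middleware)
    {
        _components.Add(next => () => middleware(next));
        return this;
    }

    // Builds the pipeline by wrapping from the last middleware to the first.
    public Func<Task> Build()
    {
        Func<Task> app = () => Task.CompletedTask; // end of the pipeline
        for (int i = _components.Count - 1; i >= 0; i--)
            app = _components[i](app);
        return app;
    }
}

public static class ToyPipelineDemo
{
    // Runs the same A, B, C pipeline as above and returns what was logged.
    public static async Task<List<string>> RunAsync()
    {
        var log = new List<string>();
        var app = new ToyPipeline()
            .Use(async next => { log.Add("A (before)"); await next(); log.Add("A (after)"); })
            .Use(async next => { log.Add("B (before)"); await next(); log.Add("B (after)"); })
            .Use(next => { log.Add("C"); return Task.CompletedTask; }) // terminal: never calls next
            .Build();

        await app();
        return log;
    }
}
```

Because B's call to next() only returns once C has finished, the "after" parts naturally run in reverse order, which matches the console output shown above.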

Short-circuiting middleware

A middleware doesn’t necessarily have to call the next middleware. For instance, if the static files middleware can handle a request, it doesn’t need to pass it down to the rest of the pipeline, it can respond immediately. This behavior is called short-circuiting the pipeline.

In the previous example, if we comment out the call to next() in middleware B, we get the following output:

A (before)
B (before)
B (after)
A (after)

As you can see, middleware C is never invoked. The pipeline now looks like this:

Short-circuited pipeline

Branching the pipeline

In the previous examples, there was only one "branch" in the pipeline: the middleware coming after A was always B, and the middleware coming after B was always C. But it doesn’t have to be that way. You might want a given request to be processed by a completely different pipeline, based on the path or anything else.

There are two types of branches: branches that rejoin the main pipeline, and branches that don’t.

Making a non-rejoining branch

This can be done using the Map or MapWhen method. Map lets you specify a branch based on the request path. MapWhen gives you more control: you can specify a predicate on the HttpContext to decide whether to branch or not. Let’s look at a simple example using Map:

public void Configure(IApplicationBuilder app)
{
    app.Use(async (context, next) =>
    {
        Console.WriteLine("A (before)");
        await next();
        Console.WriteLine("A (after)");
    });

    app.Map(
        new PathString("/foo"),
        a => a.Use(async (context, next) =>
        {
            Console.WriteLine("B (before)");
            await next();
            Console.WriteLine("B (after)");
        }));

    app.Run(async context =>
    {
        Console.WriteLine("C");
        await context.Response.WriteAsync("Hello world");
    });
}

The first argument for Map is a PathString representing the path prefix of the request. The second argument is a delegate that configures the branch’s pipeline (the a parameter represents the IApplicationBuilder for the branch). The branch defined by the delegate will process the request if its path starts with the specified path prefix.

For a request that doesn’t start with /foo, this code produces the following output:

A (before)
C
A (after)

Middleware B is not invoked, since it’s in the branch and the request doesn’t match the prefix for the branch. But for a request that does start with /foo, we get the following output:

A (before)
B (before)
B (after)
A (after)

Note that this request returns a 404 (Not found) response: this is because the B middleware calls next(), but there’s no next middleware, so it falls back to returning a 404 response. To solve this, we could use Run instead of Use, or just not call next().

The pipeline defined by this code can be represented as follows:

Non-rejoining branch

(I omitted the response arrows for clarity)

As you can see, the branch with middleware B doesn’t rejoin the main pipeline, so middleware C isn’t called.

Making a rejoining branch

You can make a branch that rejoins the main pipeline by using the UseWhen method. This method accepts a predicate on the HttpContext to decide whether to branch or not. The branch will rejoin the main pipeline where it left it. Here’s an example similar to the previous one, but with a rejoining branch:

public void Configure(IApplicationBuilder app)
{
    app.Use(async (context, next) =>
    {
        Console.WriteLine("A (before)");
        await next();
        Console.WriteLine("A (after)");
    });

    app.UseWhen(
        context => context.Request.Path.StartsWithSegments(new PathString("/foo")),
        a => a.Use(async (context, next) =>
        {
            Console.WriteLine("B (before)");
            await next();
            Console.WriteLine("B (after)");
        }));

    app.Run(async context =>
    {
        Console.WriteLine("C");
        await context.Response.WriteAsync("Hello world");
    });
}

For a request that doesn’t start with /foo, this code produces the same output as the previous example:

A (before)
C
A (after)

Again, middleware B is not invoked, since it’s in the branch and the request doesn’t match the predicate for the branch. But for a request that does start with /foo, we get the following output:

A (before)
B (before)
C
B (after)
A (after)

We can see that the request passes through the branch (middleware B), then goes back to the main pipeline, ending with middleware C. This pipeline can be represented like this:

Rejoining branch

Note that there is no Use method that accepts a PathString to specify the path prefix. I’m not sure why it’s not included, but it would be easy to write, using UseWhen:

public static IApplicationBuilder Use(this IApplicationBuilder builder, PathString pathMatch, Action<IApplicationBuilder> configuration)
{
    return builder.UseWhen(
        context => context.Request.Path.StartsWithSegments(pathMatch),
        configuration);
}

Conclusion

As you can see, the idea behind the middleware pipeline is quite simple, but it’s very powerful. Most of the features baked in ASP.NET Core (authentication, static files, caching, MVC, etc) are implemented as middleware. And of course, it’s easy to write your own!

Better timeout handling with HttpClient

The problem

If you often use HttpClient to call REST APIs or to transfer files, you may have been annoyed by the way this class handles request timeout. There are two major issues with timeout handling in HttpClient:

  • The timeout is defined at the HttpClient level and applies to all requests made with this HttpClient; it would be more convenient to be able to specify a timeout individually for each request.
  • The exception thrown when the timeout is elapsed doesn’t let you determine the cause of the error. When a timeout occurs, you’d expect to get a TimeoutException, right? Well, surprise, it throws a TaskCanceledException! So, there’s no way to tell from the exception if the request was actually canceled, or if a timeout occurred.

Fortunately, thanks to HttpClient’s flexibility, it’s quite easy to make up for this design flaw.

So we’re going to implement a workaround for these two issues. Let’s recap what we want:

  • the ability to specify timeout on a per-request basis
  • to receive a TimeoutException rather than a TaskCanceledException when a timeout occurs.

Specifying the timeout on a per-request basis

Let’s see how we can associate a timeout value to a request. The HttpRequestMessage class has a Properties property, which is a dictionary in which we can put whatever we need. We’re going to use this to store the timeout for a request, and to make things easier, we’ll create extension methods to access the value in a strongly-typed fashion:

public static class HttpRequestExtensions
{
    private const string TimeoutPropertyKey = "RequestTimeout";

    public static void SetTimeout(
        this HttpRequestMessage request,
        TimeSpan? timeout)
    {
        if (request == null)
            throw new ArgumentNullException(nameof(request));

        request.Properties[TimeoutPropertyKey] = timeout;
    }

    public static TimeSpan? GetTimeout(this HttpRequestMessage request)
    {
        if (request == null)
            throw new ArgumentNullException(nameof(request));

        if (request.Properties.TryGetValue(
                TimeoutPropertyKey,
                out var value)
            && value is TimeSpan timeout)
            return timeout;
        return null;
    }
}

Nothing fancy here, the timeout is an optional value of type TimeSpan. We can now associate a timeout value with a request, but of course, at this point there’s no code that makes use of the value…

HTTP handler

The HttpClient uses a pipeline architecture: each request is sent through a chain of handlers (of type HttpMessageHandler), and the response is passed back through these handlers in reverse order. This article explains this in greater detail if you want to know more. We’re going to insert our own handler into the pipeline, which will be in charge of handling timeouts.

Our handler is going to inherit DelegatingHandler, a type of handler designed to be chained to another handler. To implement a handler, we need to override the SendAsync method. A minimal implementation would look like this:

class TimeoutHandler : DelegatingHandler
{
    protected async override Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request,
        CancellationToken cancellationToken)
    {
        return await base.SendAsync(request, cancellationToken);
    }
}

The call to base.SendAsync just passes the request to the next handler. Which means that at this point, our handler does absolutely nothing useful, but we’re going to augment it gradually.

Taking into account the timeout for a request

First, let’s add a DefaultTimeout property to our handler; it will be used for requests that don’t have their timeout explicitly set:

public TimeSpan DefaultTimeout { get; set; } = TimeSpan.FromSeconds(100);

The default value of 100 seconds is the same as that of HttpClient.Timeout.

To actually implement the timeout, we’re going to get the timeout value for the request (or DefaultTimeout if none is defined), create a CancellationToken that will be canceled after the timeout duration, and pass this CancellationToken to the next handler: this way, the request will be canceled after the timeout has elapsed (this is actually what HttpClient does internally, except that it uses the same timeout for all requests).

To create a CancellationToken whose cancellation we can control, we need a CancellationTokenSource, which we’re going to create based on the request’s timeout:

private CancellationTokenSource GetCancellationTokenSource(
    HttpRequestMessage request,
    CancellationToken cancellationToken)
{
    var timeout = request.GetTimeout() ?? DefaultTimeout;
    if (timeout == Timeout.InfiniteTimeSpan)
    {
        // No need to create a CTS if there's no timeout
        return null;
    }
    else
    {
        var cts = CancellationTokenSource
            .CreateLinkedTokenSource(cancellationToken);
        cts.CancelAfter(timeout);
        return cts;
    }
}

Two points of interest here:

  • If the request’s timeout is infinite, we don’t create a CancellationTokenSource; it would never be canceled, so we save a useless allocation.
  • If not, we create a CancellationTokenSource that will be canceled after the timeout has elapsed (CancelAfter). Note that this CTS is linked to the CancellationToken we receive as a parameter in SendAsync: this way, it will be canceled either when the timeout expires, or when the CancellationToken parameter is itself canceled. You can get more details on linked cancellation tokens in this article.
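To see how a linked source behaves in isolation, here is a small standalone sketch. LinkedTokenDemo is a hypothetical class used only for this demonstration. It shows that cancellation flows from the caller's token to the linked source, but not the other way around.

```csharp
using System;
using System.Threading;

// Hypothetical demo class, for illustration only.
public static class LinkedTokenDemo
{
    // Canceling the caller's source also cancels the linked source.
    public static bool LinkedFollowsCaller()
    {
        using (var caller = new CancellationTokenSource())
        using (var linked = CancellationTokenSource.CreateLinkedTokenSource(caller.Token))
        {
            linked.CancelAfter(TimeSpan.FromMinutes(5)); // timeout far in the future
            caller.Cancel();                             // the caller gives up first
            return linked.IsCancellationRequested;
        }
    }

    // Canceling the linked source leaves the caller's token untouched.
    public static bool CallerUnaffectedByLinked()
    {
        using (var caller = new CancellationTokenSource())
        using (var linked = CancellationTokenSource.CreateLinkedTokenSource(caller.Token))
        {
            linked.Cancel(); // stands in for the timeout firing
            return !caller.IsCancellationRequested;
        }
    }
}
```

Both methods return true. The one-way propagation is what makes it possible to tell, after the fact, whether the caller or the timeout triggered the cancellation.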

Finally, let’s change the SendAsync method to use the CancellationTokenSource we created:

protected async override Task<HttpResponseMessage> SendAsync(
    HttpRequestMessage request,
    CancellationToken cancellationToken)
{
    using (var cts = GetCancellationTokenSource(request, cancellationToken))
    {
        return await base.SendAsync(
            request,
            cts?.Token ?? cancellationToken);
    }
}

We get the CTS and pass its token to base.SendAsync. Note that we use cts?.Token, because GetCancellationTokenSource can return null; if that happens, we use the cancellationToken parameter directly.

At this point, we have a handler that lets us specify a different timeout for each request. But we still get a TaskCanceledException when a timeout occurs… Well, this is going to be easy to fix!

Throwing the correct exception

All we need to do is catch the TaskCanceledException (or rather its base class, OperationCanceledException), and check if the cancellationToken parameter is canceled: if it is, the cancellation was caused by the caller, so we let it bubble up normally; if not, this means the cancellation was caused by the timeout, so we throw a TimeoutException. Here’s the final SendAsync method:

protected async override Task<HttpResponseMessage> SendAsync(
    HttpRequestMessage request,
    CancellationToken cancellationToken)
{
    using (var cts = GetCancellationTokenSource(request, cancellationToken))
    {
        try
        {
            return await base.SendAsync(
                request,
                cts?.Token ?? cancellationToken);
        }
        catch(OperationCanceledException)
            when (!cancellationToken.IsCancellationRequested)
        {
            throw new TimeoutException();
        }
    }
}

Note that we use an exception filter: this way we don’t actually catch the OperationCanceledException when we want to let it propagate, and we avoid unnecessarily unwinding the stack.
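
If exception filters are new to you, the following sketch isolates the decision made by the when clause. The Classify method and its callerCanceled parameter are made up for the illustration; the parameter stands in for cancellationToken.IsCancellationRequested:

```csharp
using System;

public static class ExceptionFilterDemo
{
    // callerCanceled plays the role of cancellationToken.IsCancellationRequested.
    public static string Classify(bool callerCanceled)
    {
        try
        {
            try
            {
                // Simulates the inner handler being canceled.
                throw new OperationCanceledException();
            }
            catch (OperationCanceledException) when (!callerCanceled)
            {
                // Filter matched: nobody asked for cancellation, so it was the timeout.
                throw new TimeoutException();
            }
        }
        catch (TimeoutException)
        {
            return "timeout";
        }
        catch (OperationCanceledException)
        {
            // Filter did not match: the original exception propagated untouched.
            return "canceled by caller";
        }
    }
}
```

When the filter evaluates to false, the inner catch block is never entered, so the exception reaches the outer handler exactly as it was thrown.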

Our handler is done, now let’s see how to use it.

Using the handler

When creating an HttpClient, it’s possible to specify the first handler of the pipeline. If none is specified, an HttpClientHandler is used; this handler sends requests directly to the network. To use our new TimeoutHandler, we’re going to create it, attach an HttpClientHandler as its next handler, and pass it to the HttpClient:

var handler = new TimeoutHandler
{
    InnerHandler = new HttpClientHandler()
};

using (var client = new HttpClient(handler))
{
    client.Timeout = Timeout.InfiniteTimeSpan;
    ...
}

Note that we need to disable the HttpClient’s timeout by setting it to an infinite value, otherwise the default behavior will interfere with our handler.

Now let’s try to send a request with a timeout of 5 seconds to a server that takes too long to respond:

var request = new HttpRequestMessage(HttpMethod.Get, "http://foo/");
request.SetTimeout(TimeSpan.FromSeconds(5));
var response = await client.SendAsync(request);

If the server doesn’t respond within 5 seconds, we get a TimeoutException instead of a TaskCanceledException, so things seem to be working as expected.

Let’s now check that cancellation still works correctly. To do this, we pass a CancellationToken that will be canceled after 2 seconds (i.e. before the timeout expires):

var request = new HttpRequestMessage(HttpMethod.Get, "http://foo/");
request.SetTimeout(TimeSpan.FromSeconds(5));
var cts = new CancellationTokenSource(TimeSpan.FromSeconds(2));
var response = await client.SendAsync(request, cts.Token);

This time, we receive a TaskCanceledException, as expected.

By implementing our own HTTP handler, we were able to solve the initial problem and get smarter timeout handling.

The full code for this article is available here.

Testing and debugging library code from LINQPad

I’ve been meaning to blog about LINQPad for a very long time. In case you don’t know about it, LINQPad is a tool that lets you write and test code very quickly without having to create a full-blown project in Visual Studio. It supports C#, VB.NET, F# and SQL. It was initially intended as an educational tool to experiment with LINQ (its author, Joe Albahari, developed it as a companion to his C# in a Nutshell book), but it’s also extremely useful as a general-purpose .NET scratchpad.

I frequently use LINQPad to quickly test a library that I’m working on. It’s very easy: just reference the assembly you want to test and start using it. But when the library doesn’t behave as expected, it’s often useful to be able to debug it step by step… It turns out that it’s pretty simple to do from LINQPad!

The premium version of LINQPad has an integrated debugger, which isn’t as powerful as the one in Visual Studio, but is useful to debug LINQPad scripts. However, it doesn’t let you step into library code… Fortunately, there’s a trick to use the Visual Studio debugger to debug code running from LINQPad.

First, open your library in Visual Studio if you haven’t already. Build the solution, and add a reference to the assembly to your LINQPad script:

Add reference to the library

Write some code that uses your library:

Use your library from LINQPad

And add this line at the beginning of your LINQPad script:

Debugger.Launch();

When you run the script, it will open a dialog window prompting you to choose a debugger:

Choose a debugger

Select the Visual Studio instance in which your solution is loaded and click OK. This will attach the Visual Studio debugger to the process that is running the LINQPad script, and pause the execution on the call to Debugger.Launch():

Debugging in Visual Studio

You can now debug the LINQPad script and your library code. You can set breakpoints, step into methods, add watches, etc., just as when debugging a normal application!

Linq performance improvements in .NET Core

By now, you’re probably aware that Microsoft released an open-source and cross-platform version of the .NET platform: .NET Core. This means you can now build and run .NET apps on Linux or macOS. This is pretty cool in itself, but it doesn’t end there: .NET Core also brings a lot of improvements to the Base Class Library.

For instance, Linq has been made faster in .NET Core. I made a little benchmark to compare the performance of some common Linq methods, and the results are quite impressive:


The full code for the benchmark can be found here. As with all microbenchmarks, it has to be taken with a grain of salt, but it gives an idea of the improvements.

Some lines in this table are quite surprising. How can Select run 5000 times almost instantly? First, we have to keep in mind that most Linq operators are lazy: they don’t actually do anything until you enumerate the result, so doing something like array.Select(i => i * i) executes in constant time (it just returns a lazy sequence, without consuming the items in array). This is why I included a call to Count() in my benchmark, to make sure the result is enumerated.
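
Here is a small sketch of that laziness, independent of the benchmark (the class name is made up). It counts how many times the projection runs; I use ToList() rather than Count() to force the enumeration, so that the result doesn’t depend on any Count-specific optimization:

```csharp
using System;
using System.Linq;

public static class DeferredExecutionDemo
{
    // Returns how many times the projection ran before and after enumeration.
    public static (int callsBefore, int callsAfter) Run()
    {
        int calls = 0;
        var source = new[] { 1, 2, 3 };

        // Building the query does not invoke the lambda.
        var squares = source.Select(i => { calls++; return i * i; });
        int callsBefore = calls;

        // Enumerating the query runs the projection once per item.
        var list = squares.ToList();
        int callsAfter = calls;

        return (callsBefore, callsAfter);
    }
}
```

The projection is not called at all when the query is created, and is called once per source item when the result is materialized.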

Despite this, it runs 5000 times in 413µs… This is possible due to an optimization in the .NET Core implementation of Select and Count. A useful property of Select is that it produces a sequence with the same number of items as the source sequence. In .NET Core, Select takes advantage of this. If the source is an ICollection<T> or an array, it returns a custom enumerable object that keeps track of the number of items. Count can then just retrieve this value and return it, which produces a result in constant time. The full .NET Framework implementation, on the other hand, naively enumerates the sequence produced by Select, which takes much longer.

It’s interesting to note that in this situation, .NET Core will not execute the projection specified in Select, so it’s a breaking change compared to the desktop framework for code that was relying on side effects of this projection. This has been identified as an issue which has already been fixed on the master branch, so the next release of .NET Core will execute the projection on each item.

OrderBy followed by Count() also runs almost instantly… did Microsoft invent an O(1) sorting algorithm? Unfortunately, no… The explanation is the same as for Select: since OrderBy preserves the item count, the information is recorded so that it can be used by Count, and there is no need to actually sort the input sequence.

OK, so these cases were pretty obvious improvements (which will be rolled back anyway, as mentioned above). What about the SelectAndToArray case? In this test, I call ToArray() on the result of Select, to make sure that the projection is actually performed on each item of the source sequence: no cheating this time. Still, the .NET Core version is 68% faster than the full .NET Framework version. The reason has to do with allocations: since the .NET Core implementation knows how many items are in the result of Select, it can directly allocate an array of the correct size. In the .NET Framework, this information is not available, so it starts with a small array, copies items into it until it’s full, then allocates a larger array, copies the previous array into it, copies the next items from the sequence until the array is full, and so on. This causes a lot of allocations and copies, hence the degraded performance. A few years ago, I suggested an optimized version of ToList and ToArray, where you had to specify the size. The .NET Core implementation basically does the same thing, except that you don’t have to pass the size manually, since it’s passed along the Linq method chain.
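
To illustrate the idea of a ToArray that is told the size up front, here is a rough sketch. It is neither the .NET Core implementation nor the exact code from my earlier suggestion; the ToArray(count) overload and the class name are assumptions made for the example:

```csharp
using System;
using System.Collections.Generic;

public static class SizedEnumerableExtensions
{
    // Hypothetical helper: the caller supplies the exact number of items.
    public static T[] ToArray<T>(this IEnumerable<T> source, int count)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (count < 0) throw new ArgumentOutOfRangeException(nameof(count));

        // Single allocation of the right size: no growing, no intermediate copies.
        var array = new T[count];
        int i = 0;
        foreach (var item in source)
        {
            if (i == count)
                throw new ArgumentException("Source has more items than count.", nameof(count));
            array[i++] = item;
        }

        if (i != count)
            throw new ArgumentException("Source has fewer items than count.", nameof(count));

        return array;
    }
}
```

With this in place, something like items.Select(x => x * x).ToArray(items.Length) allocates the result array only once, which is the saving described above.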

Where and WhereAndToArray are both about 8% faster on .NET Core 1.1. Looking at the code (.NET 4.6.2, .NET Core), I can’t see any obvious difference that could explain the better performance, so I suspect it’s mostly due to improvements in the runtime. In this case, ToArray doesn’t know the length of the input sequence, since there is no way to predict how many items Where will yield, so it can’t use the same optimization as with Select and has to build the array the slow way.

We already discussed OrderBy + Count(), which wasn’t a fair comparison since the .NET Core implementation didn’t actually sort the sequence. The OrderByAndToArray case is more interesting, because the sort can’t be skipped. And in this case, the .NET Core implementation is slightly slower than the .NET 4.6.2 one. I’m not sure why this is; again, the implementation is very similar, although there has been a bit of refactoring in .NET Core.

So, on the whole, Linq seems generally faster in .NET Core than in .NET 4.6.2, which is very good news. Of course, I only benchmarked a limited number of scenarios, but it shows the .NET Core team is working hard to optimize everything they can.

C# methods in git diff hunk headers

If you use git on the command line, you may have noticed that diff hunks often show the method signature in the hunk header (the line that starts with @@), like this:

diff --git a/Program.cs b/Program.cs
index 655a213..5ae1016 100644
--- a/Program.cs
+++ b/Program.cs
@@ -13,6 +13,7 @@ static void Main(string[] args)
         Console.WriteLine("Hello World!");
         Console.WriteLine("Hello World!");
         Console.WriteLine("Hello World!");
+        Console.WriteLine("blah");
     }

This is very useful to know where you are when looking at a diff.

Git has a few built-in regex patterns to detect methods in some languages, including C#; they are defined in userdiff.c. But by default, these patterns are not used… you need to tell git which file extensions should be associated with which language. This can be specified in a .gitattributes file at the root of your git repository:

*.cs    diff=csharp

With this done, git diff should show an output similar to the sample above.

Are we done yet? Well, almost. See, the patterns for C# were added to git a long time ago, and C# has changed quite a bit since then. Some new keywords that can now be part of a method signature are not recognized by the built-in pattern, e.g. async or partial. This is quite annoying, because when some code has changed in an async method, the diff hunk header shows the signature of a previous, non-async method, or the line where the class is declared, which is confusing.

My first impulse was to submit a pull request on GitHub to add the missing keywords; however I soon realized that the git repository on GitHub is just a mirror and does not accept pull requests… The contribution process consists of sending a patch to the git mailing list, with a long and annoying checklist of requirements. This process seemed so tedious that I gave it up. I honestly don’t know why they use such a difficult and old-fashioned contribution process, it just discourages casual contributors. But that’s a bit off-topic, so let’s move on and try to solve the problem some other way.

Fortunately, the built-in patterns can be overridden in the git configuration. To define the function name pattern for C#, you need to define the diff.csharp.xfuncname setting in your git config file:

[diff "csharp"]
  xfuncname = ^[ \\t]*(((static|public|internal|private|protected|new|virtual|sealed|override|unsafe|async|partial)[ \\t]+)*[][<>@.~_[:alnum:]]+[ \\t]+[<>@._[:alnum:]]+[ \\t]*\\(.*\\))[ \\t]*$

As you can see, it’s the same pattern as in userdiff.c, with the backslashes escaped and the missing keywords added. With this pattern, git diff now shows the correct function signature in async methods:

diff --git a/Program.cs b/Program.cs
index 655a213..5ae1016 100644
--- a/Program.cs
+++ b/Program.cs
@@ -31,5 +32,6 @@ static async Task FooAsync()
         Console.WriteLine("Hello world");
         Console.WriteLine("Hello world");
         Console.WriteLine("Hello world");
+        await Task.Delay(100);
     }
 }

It took me a while to figure it out, so I hope you find it helpful!

Fun with the HttpClient pipeline

A few years ago, Microsoft introduced the HttpClient class as a modern alternative to HttpWebRequest to make web requests from .NET apps. Not only is this new API much easier to use, cleaner, and asynchronous by design, it’s also easily extensible.

You might have noticed that HttpClient has a constructor that accepts an HttpMessageHandler. What is this handler? It’s an object that accepts a request (HttpRequestMessage) and returns a response (HttpResponseMessage); how it does that is entirely dependent on the implementation. By default, HttpClient uses HttpClientHandler, a handler which sends a request to a server over the network and returns the server’s response. The other built-in handler implementation is an abstract class named DelegatingHandler, and is the one I want to talk about.

The pipeline

DelegatingHandler is a handler that is designed to be chained with another handler, effectively forming a pipeline through which requests and responses will pass, as shown on this diagram:

HttpClient pipeline diagram

(Image from the official ASP.NET website)

Each handler has a chance to examine and/or modify the request before passing it to the next handler in the chain, and to examine and/or modify the response it receives from the next handler. Typically, the last handler in the pipeline is the HttpClientHandler, which communicates directly with the network.

The handler chain can be set up like this:

var pipeline = new MyHandler1()
{
    InnerHandler = new MyHandler2()
    {
        InnerHandler = new HttpClientHandler()
    }
};
var client = new HttpClient(pipeline);

But if you prefer fluent interfaces, you can easily create an extension method to do it like this:

var pipeline = new HttpClientHandler()
    .DecorateWith(new MyHandler2())
    .DecorateWith(new MyHandler1());
var client = new HttpClient(pipeline);
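
DecorateWith is not part of the framework. A minimal sketch of such an extension method could look like this (the class name is arbitrary):

```csharp
using System;
using System.Net.Http;

public static class HttpMessageHandlerExtensions
{
    // Wraps 'inner' with 'outer' and returns 'outer', so that calls can be
    // chained from the innermost handler outwards.
    public static DelegatingHandler DecorateWith(
        this HttpMessageHandler inner,
        DelegatingHandler outer)
    {
        if (inner == null) throw new ArgumentNullException(nameof(inner));
        if (outer == null) throw new ArgumentNullException(nameof(outer));

        outer.InnerHandler = inner;
        return outer;
    }
}
```

Because each call returns the outer handler, the last handler passed to DecorateWith ends up being the first one in the pipeline.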

All this might seem a little abstract at this point, but this pipeline architecture enables plenty of interesting scenarios. See, HTTP message handlers can be used to add custom behavior to how requests and responses are processed. I’ll give a few examples.

Side note: I’m presenting this feature from a client-side perspective (since I primarily make client apps), but the same HTTP message handlers are also used on the server-side in ASP.NET Web API.

Unit testing

The first use case that comes to mind, and the first I ever used, is unit testing. If you’re testing a class that makes online payments over HTTP, you don’t want it to actually send requests to the real server… you just want to ensure that the requests it sends are correct, and that it reacts correctly to specific responses. An easy solution to this problem is to create a "stub" handler, and inject it into your class to use instead of HttpClientHandler. Here’s a simple implementation:

class StubHandler : HttpMessageHandler
{
    // Responses to return
    private readonly Queue<HttpResponseMessage> _responses =
        new Queue<HttpResponseMessage>();

    // Requests that were sent via the handler
    private readonly List<HttpRequestMessage> _requests =
        new List<HttpRequestMessage>();

    protected override Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request,
        CancellationToken cancellationToken)
    {
        if (_responses.Count == 0)
            throw new InvalidOperationException("No response configured");

        _requests.Add(request);
        var response = _responses.Dequeue();
        return Task.FromResult(response);
    }

    public void QueueResponse(HttpResponseMessage response) =>
        _responses.Enqueue(response);

    public IEnumerable<HttpRequestMessage> GetRequests() =>
        _requests;
}

This class lets you record the requests that are sent via the handler and specify the responses that should be returned. For instance, you could write a test like this:

// Arrange
var handler = new StubHandler();
handler.QueueResponse(new HttpResponseMessage(HttpStatusCode.Unauthorized));
var processor = new PaymentProcessor(handler);

// Act
var paymentResult = await processor.ProcessPayment(new Payment());

// Assert
Assert.AreEqual(PaymentStatus.Failed, paymentResult.Status);
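
For this test to work, the class under test has to accept the handler from the outside. As a rough sketch, and assuming everything about it (PaymentProcessor, Payment, PaymentResult, PaymentStatus and the URL are all hypothetical), it could look like this:

```csharp
using System.Net.Http;
using System.Threading.Tasks;

// All types below are hypothetical, only here to show how the handler is injected.
public enum PaymentStatus { Succeeded, Failed }

public class Payment { }

public class PaymentResult
{
    public PaymentStatus Status { get; set; }
}

public class PaymentProcessor
{
    private readonly HttpClient _client;

    // The handler is injected, so tests can pass a stub instead of HttpClientHandler.
    public PaymentProcessor(HttpMessageHandler handler)
    {
        _client = new HttpClient(handler);
    }

    public async Task<PaymentResult> ProcessPayment(Payment payment)
    {
        // Placeholder URL and empty body; real code would serialize 'payment'.
        using (var response = await _client.PostAsync(
            "https://payments.example.com/charge",
            new StringContent("")))
        {
            return new PaymentResult
            {
                Status = response.IsSuccessStatusCode
                    ? PaymentStatus.Succeeded
                    : PaymentStatus.Failed
            };
        }
    }
}
```

In production code you would pass an HttpClientHandler (or a full pipeline) to the constructor; in tests, the stub.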

Of course, rather than creating a stub manually, you could use a mocking framework to generate a fake handler for you. The fact that the SendAsync method is protected makes it a little harder than it should be, but you can easily work around the issue by making a subclass that exposes a public virtual method, and mock that instead:

public abstract class MockableMessageHandler : HttpMessageHandler
{
    protected override sealed Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request,
        CancellationToken cancellationToken)
    {
        return DoSendAsync(request);
    }

    public abstract Task<HttpResponseMessage> DoSendAsync(HttpRequestMessage request);
}

Usage example with FakeItEasy:

// Arrange
var handler = A.Fake<MockableMessageHandler>();
A.CallTo(() => handler.DoSendAsync(A<HttpRequestMessage>._))
    .Returns(new HttpResponseMessage(HttpStatusCode.Unauthorized));
var processor = new PaymentProcessor(handler);
...

Logging

Logging sent requests and received responses can help diagnose issues. This can easily be done with a custom delegating handler:

public class LoggingHandler : DelegatingHandler
{
    private readonly ILogger _logger;

    public LoggingHandler(ILogger logger)
    {
        _logger = logger;
    }

    protected override async Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request,
        CancellationToken cancellationToken)
    {
        _logger.Trace($"Request: {request}");
        try
        {
            // base.SendAsync calls the inner handler
            var response = await base.SendAsync(request, cancellationToken);
            _logger.Trace($"Response: {response}");
            return response;
        }
        catch (Exception ex)
        {
            _logger.Error($"Failed to get response: {ex}");
            throw;
        }
    }
}

Retrying failed requests

Another interesting use case for HTTP message handlers is to automatically retry failed requests. For instance, the server you’re talking to might be temporarily unavailable (503), or it could be throttling your requests (429), or maybe you lost Internet access. Handling the retry for these cases at the application level is a pain, because it can happen virtually in any part of your code. Having this logic at the lowest possible level and implemented in a way that is completely transparent to the callers can make things much easier.

Here’s a possible implementation of a retry handler:

public class RetryHandler : DelegatingHandler
{
    protected override async Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request,
        CancellationToken cancellationToken)
    {
        while (true)
        {
            try
            {
                // base.SendAsync calls the inner handler
                var response = await base.SendAsync(request, cancellationToken);

                if (response.StatusCode == HttpStatusCode.ServiceUnavailable)
                {
                    // 503 Service Unavailable
                    // Wait a bit and try again later
                    await Task.Delay(5000, cancellationToken);
                    continue;
                }

                if (response.StatusCode == (HttpStatusCode)429)
                {
                    // 429 Too many requests
                    // Wait a bit and try again later
                    await Task.Delay(1000, cancellationToken);
                    continue;
                }

                // Not something we can retry, return the response as is
                return response;
            }
            catch (Exception ex) when(IsNetworkError(ex))
            {
                // Network error
                // Wait a bit and try again later
                await Task.Delay(2000, cancellationToken);
                continue;
            }
        }
    }

    private static bool IsNetworkError(Exception ex)
    {
        // Check if it's a network error
        if (ex is SocketException)
            return true;
        if (ex.InnerException != null)
            return IsNetworkError(ex.InnerException);
        return false;
    }
}

Note that it’s a pretty naive and simplistic implementation; for use in production code, you will probably want to add exponential backoff, take the Retry-After header into account to decide how long you have to wait, or be more subtle in how you check if an exception indicates a connection issue. Also, note that in its current state, this handler will retry forever until it succeeds; make sure to pass a cancellation token so that you can stop retrying if necessary.
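
As an illustration of the first two improvements, here is a sketch of a helper that decides how long to wait. The RetryDelay class, the attempt parameter and the 30-second cap are arbitrary choices for the example; a real implementation would probably also add some random jitter:

```csharp
using System;
using System.Net.Http;

public static class RetryDelay
{
    // 'attempt' is the zero-based number of retries already made.
    // 'response' may be null (e.g. after a network error).
    public static TimeSpan Compute(HttpResponseMessage response, int attempt)
    {
        var retryAfter = response?.Headers.RetryAfter;
        if (retryAfter != null)
        {
            // Retry-After can be expressed as a number of seconds...
            if (retryAfter.Delta.HasValue)
                return retryAfter.Delta.Value;

            // ...or as an absolute date.
            if (retryAfter.Date.HasValue)
            {
                var wait = retryAfter.Date.Value - DateTimeOffset.UtcNow;
                return wait > TimeSpan.Zero ? wait : TimeSpan.Zero;
            }
        }

        // No usable header: exponential backoff (1s, 2s, 4s, ...), capped at 30s.
        double seconds = Math.Min(Math.Pow(2, attempt), 30);
        return TimeSpan.FromSeconds(seconds);
    }
}
```

In the retry handler, the fixed Task.Delay values could then be replaced with the result of RetryDelay.Compute, with a counter incremented on each iteration of the loop.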

Other use cases

I can’t give examples for every possible scenario, but here are a few other possible use cases for HTTP message handlers:

  • Custom cookie handling (I actually did that to work around a bug in CookieContainer)
  • Custom authentication (also something I did to implement OAuth2 Bearer authentication)
  • Using the X-HTTP-Method-Override header to pass proxies that forbid certain HTTP methods (see Scott Hanselman’s article for details)
  • Custom encryption or encoding
  • Caching
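
To give an idea of what the authentication case can look like, here is a minimal sketch of a handler that adds a Bearer token to each request. It is not the handler I actually wrote; BearerTokenHandler and the getToken callback are assumptions for the example, and a real one would also deal with token expiration and 401 responses:

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading;
using System.Threading.Tasks;

public class BearerTokenHandler : DelegatingHandler
{
    private readonly Func<Task<string>> _getToken;

    // getToken is a hypothetical callback that returns a valid access token.
    public BearerTokenHandler(Func<Task<string>> getToken)
    {
        _getToken = getToken ?? throw new ArgumentNullException(nameof(getToken));
    }

    protected override async Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request,
        CancellationToken cancellationToken)
    {
        // Add "Authorization: Bearer <token>" before passing the request on.
        var token = await _getToken();
        request.Headers.Authorization = new AuthenticationHeaderValue("Bearer", token);

        // base.SendAsync calls the inner handler
        return await base.SendAsync(request, cancellationToken);
    }
}
```
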

As you can see, there’s a whole world of possibilities! If you have other ideas, let me know in the comments!

Tuple deconstruction in C# 7

Last time on this blog I talked about the new tuple feature of C# 7. In Visual Studio 15 Preview 3, the feature wasn’t quite finished; it lacked 2 important aspects:

  • emitting metadata for the names of tuple elements, so that the names are preserved across assemblies
  • deconstruction of tuples into separate variables

Well, it looks like the C# language team has been busy during the last month, because both items are now implemented in VS 15 Preview 4, which was released today! They’ve also written nice startup guides about tuples and deconstruction.

It is now possible to write something like this:

var values = ...
var (count, sum) = Tally(values);
Console.WriteLine($"There are {count} values and their sum is {sum}");

(the Tally method is the one from the previous post)

Note that the intermediate variable t from the previous post has disappeared; we now assign the count and sum variables directly from the method result, which looks much nicer IMHO. There doesn’t seem to be a way to ignore part of the tuple (i.e. not assign it to a variable), hopefully it will come later.

An interesting aspect of deconstruction is that it’s not limited to tuples; any type can be deconstructed, as long as it has a Deconstruct method with the appropriate out parameters:

class Point
{
    public int X { get; }
    public int Y { get; }

    public Point(int x, int y)
    {
        X = x;
        Y = y;
    }

    public void Deconstruct(out int x, out int y)
    {
        x = X;
        y = Y;
    }
}

...

var (x, y) = point;
Console.WriteLine($"Coordinates: ({x}, {y})");

The Deconstruct method can also be an extension method, which can be useful if you want to deconstruct a type that you don’t own. The old System.Tuple classes, for example, can be deconstructed using extension methods like this one:

public static void Deconstruct<T1, T2>(this Tuple<T1, T2> tuple, out T1 item1, out T2 item2)
{
    item1 = tuple.Item1;
    item2 = tuple.Item2;
}

...

var tuple = Tuple.Create("foo", 42);
var (name, value) = tuple;
Console.WriteLine($"Name: {name}, Value = {value}");

Finally, methods that return tuples are now decorated with a [TupleElementNames] attribute that indicates the names of the tuple members:

// Decompiled code
[return: TupleElementNames(new[] { "count", "sum" })]
public static ValueTuple<int, double> Tally(IEnumerable<double> values)
{
   ...
}

(the attribute is emitted by the compiler, you don’t actually need to write it yourself)

This makes it possible to share the tuple member names across assemblies, and lets tools like Intellisense provide helpful information about the method.

So, the tuple feature of C# 7 seems to be mostly complete; however, keep in mind that it’s still a preview, and some things could change between now and the final release.

Tuples in C# 7

A tuple is a finite ordered list of values, of possibly different types, which is used to bundle related values together without having to create a specific type to hold them.

In .NET 4.0, a set of Tuple classes was introduced in the framework, which can be used as follows:

private static Tuple<int, double> Tally(IEnumerable<double> values)
{
    int count = 0;
    double sum = 0.0;
    foreach (var value in values)
    {
        count++;
        sum += value;
    }
    return Tuple.Create(count, sum);
}

...

var values = ...
var t = Tally(values);
Console.WriteLine($"There are {t.Item1} values and their sum is {t.Item2}");

There are two annoying issues with the Tuple classes:

  • They’re classes, i.e. reference types. This means they must be allocated on the heap, and garbage collected when they’re no longer used. For applications where performance is critical, it can be an issue. Also, the fact that they can be null is often not desirable.
  • The elements in the tuple don’t have names, or rather, they always have the same names (Item1, Item2, etc), which are not meaningful at all. The Tuple<T1, T2> type conveys no information about what the tuple actually represents, which makes it a poor choice in public APIs.

In C# 7, a new feature will be introduced to improve support for tuples: you will be able to declare tuple types “inline”, a little like anonymous types, except that they’re not limited to the current method. Using this new feature, the code above becomes much cleaner:

static (int count, double sum) Tally(IEnumerable<double> values)
{
    int count = 0;
    double sum = 0.0;
    foreach (var value in values)
    {
        count++;
        sum += value;
    }
    return (count, sum);
}

...

var values = ...
var t = Tally(values);
Console.WriteLine($"There are {t.count} values and their sum is {t.sum}");

Note how the return type of the Tally method is declared, and how the result is used. This is much better! The tuple elements now have meaningful names, and the syntax is nicer too. The feature relies on a new ValueTuple<T1, T2> structure, which means it doesn’t involve a heap allocation.

You can try this feature right now in Visual Studio 15 Preview 3. However, the ValueTuple<T1, T2> type is not (yet) part of the .NET Framework; to get this example to work, you’ll need to reference the System.ValueTuple NuGet package.

Finally, one last remark about the names of tuple members: like many other language features, they’re just syntactic sugar. In the compiled code, the tuple members are only referred to as Item1 and Item2, not count and sum. The Tally method above actually returns a ValueTuple<int, double>, not a specially generated type. Note that the compiler that ships with VS 15 Preview 3 emits no metadata about the names of the tuple members. This part of the feature is not yet implemented, but should be included in the final version. This means that in the meantime, you can’t use tuples across assemblies (well, you can, but you will lose the member names and will have to use Item1 and Item2 to refer to the tuple members).
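
A quick way to convince yourself that the names are only syntactic sugar is the following sketch (the class and method names are made up, and it assumes a compiler and framework where ValueTuple is available):

```csharp
using System;

public static class TupleNamesDemo
{
    static (int count, double sum) Sample() => (3, 6.0);

    public static bool NamesAreAliases()
    {
        var t = Sample();
        // count/sum and Item1/Item2 read the same underlying fields.
        return t.count == t.Item1 && t.sum == t.Item2;
    }

    public static bool RuntimeTypeIsValueTuple()
    {
        object boxed = Sample();
        // The names don't exist at run time; only ValueTuple<int, double> does.
        return boxed.GetType() == typeof(ValueTuple<int, double>);
    }
}
```

Both methods should return true: the named members are just aliases for Item1 and Item2, and the run-time type carries no trace of the names.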

Pitfall: using var and async together

A few days ago at work, I stumbled upon a sneaky bug in our main app. The code looked innocent enough, and at first glance I couldn’t understand what was wrong… The code was similar to the following:

public async Task<bool> BookExistsAsync(int id)
{
    var store = await GetBookStoreAsync();
    var book = store.GetBookByIdAsync(id);
    return book != null;
}

// For completeness, here are the types and methods used in BookExistsAsync:

private Task<IBookStore> GetBookStoreAsync()
{
    // actual implementation irrelevant
    // ...
}


public interface IBookStore
{
    Task<Book> GetBookByIdAsync(int id);
    // other members omitted for brevity
}

public class Book
{
    public int Id { get; set; }
    // other members omitted for brevity
}

The BookExistsAsync method always returns true. Can you see why?

Look at this line:

var book = store.GetBookByIdAsync(id);

Quick, what’s the type of book? If you answered Book, think again: it’s Task<Book>. The await is missing! And an async method always returns a non-null task, so book is never null.

When you have an async method with no await, the compiler warns you, but in this case there is an await on the line above. The only thing we do with book is to check that it’s not null; since Task<T> is a reference type, there’s nothing suspicious in comparing it to null. So, the compiler sees nothing wrong; the static code analyzer (ReSharper in this case) sees nothing wrong; and of course the feeble human brain reviewing the code sees nothing wrong either… Obviously, it could easily have been detected with adequate unit test coverage, but unfortunately this method wasn’t covered.

So, how to avoid this kind of mistake? Stop using var and always specify types explicitly? But I like var, I use it almost everywhere! Besides, I think it’s the first time I ever found a bug caused by the use of var. I’m really not willing to give it up…

Ideally, I would have liked ReSharper to spot the issue; perhaps it should consider all Task-returning methods to be implicitly [NotNull], unless specified otherwise. Until then, I don’t have a silver bullet against this issue; just pay attention when you call an async method, and write unit tests!
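
For reference, here is a sketch of the corrected method, with the type spelled out to show what the compiler can catch when you don’t use var. InMemoryBookStore and BookService are hypothetical, and only there to make the example self-contained:

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

public class Book
{
    public int Id { get; set; }
}

public interface IBookStore
{
    Task<Book> GetBookByIdAsync(int id);
}

// Hypothetical in-memory store, only here to make the sketch runnable.
public class InMemoryBookStore : IBookStore
{
    private readonly Dictionary<int, Book> _books = new Dictionary<int, Book>
    {
        [1] = new Book { Id = 1 }
    };

    public Task<Book> GetBookByIdAsync(int id)
    {
        // book is null when the id is unknown
        _books.TryGetValue(id, out var book);
        return Task.FromResult(book);
    }
}

public class BookService
{
    private readonly IBookStore _store = new InMemoryBookStore();

    public async Task<bool> BookExistsAsync(int id)
    {
        // With an explicit type, forgetting the await is a compile-time error:
        // "Book book = _store.GetBookByIdAsync(id);" doesn't compile, because
        // Task<Book> cannot be implicitly converted to Book.
        Book book = await _store.GetBookByIdAsync(id);
        return book != null;
    }
}
```

With the await in place, the method returns true for an existing book and false otherwise, whether or not you use var; the explicit type only helps if the await is forgotten.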