C# methods in git diff hunk headers

If you use git on the command line, you may have noticed that diff hunks often show the method signature in the hunk header (the line that starts with @@), like this:

diff --git a/Program.cs b/Program.cs
index 655a213..5ae1016 100644
--- a/Program.cs
+++ b/Program.cs
@@ -13,6 +13,7 @@ static void Main(string[] args)
         Console.WriteLine("Hello World!");
         Console.WriteLine("Hello World!");
         Console.WriteLine("Hello World!");
+        Console.WriteLine("blah");
     }

This is very useful for knowing where you are when looking at a diff.

Git has a few built-in regex patterns to detect methods in some languages, including C#; they are defined in userdiff.c. But by default, these patterns are not used… you need to tell git which file extensions should be associated with which language. This can be specified in a .gitattributes file at the root of your git repository:

*.cs    diff=csharp

With this done, git diff should show an output similar to the sample above.

Are we done yet? Well, almost. See, the patterns for C# were added to git a long time ago, and C# has changed quite a bit since then. Some new keywords that can now be part of a method signature are not recognized by the built-in pattern, e.g. async or partial. This is quite annoying, because when some code has changed in an async method, the diff hunk header shows the signature of a previous, non-async method, or the line where the class is declared, which is confusing.

My first impulse was to submit a pull request on Github to add the missing keywords; however, I soon realized that the git repository on Github is just a mirror and does not accept pull requests… The contribution process consists of sending a patch to the git mailing list, with a long and annoying checklist of requirements. This process seemed so tedious that I gave it up. I honestly don’t know why they use such a difficult and old-fashioned contribution process; it just discourages casual contributors. But that’s a bit off-topic, so let’s move on and try to solve the problem some other way.

Fortunately, the built-in patterns can be overridden in the git configuration. To define the function name pattern for C#, you need to define the diff.csharp.xfuncname setting in your git config file:

[diff "csharp"]
  xfuncname = ^[ \\t]*(((static|public|internal|private|protected|new|virtual|sealed|override|unsafe|async|partial)[ \\t]+)*[][<>@.~_[:alnum:]]+[ \\t]+[<>@._[:alnum:]]+[ \\t]*\\(.*\\))[ \\t]*$

As you can see, it’s the same pattern as in userdiff.c, with the backslashes escaped and the missing keywords added. With this pattern, git diff now shows the correct function signature in async methods:

diff --git a/Program.cs b/Program.cs
index 655a213..5ae1016 100644
--- a/Program.cs
+++ b/Program.cs
@@ -31,5 +32,6 @@ static async Task FooAsync()
         Console.WriteLine("Hello world");
         Console.WriteLine("Hello world");
         Console.WriteLine("Hello world");
+        await Task.Delay(100);
     }
 }

It took me a while to figure it out, so I hope you find it helpful!

Fun with the HttpClient pipeline

A few years ago, Microsoft introduced the HttpClient class as a modern alternative to HttpWebRequest to make web requests from .NET apps. Not only is this new API much easier to use, cleaner, and asynchronous by design, it’s also easily extensible.

You might have noticed that HttpClient has a constructor that accepts an HttpMessageHandler. What is this handler? It’s an object that accepts a request (HttpRequestMessage) and returns a response (HttpResponseMessage); how it does that is entirely dependent on the implementation. By default, HttpClient uses HttpClientHandler, a handler which sends a request to a server over the network and returns the server’s response. The other built-in handler is an abstract class named DelegatingHandler, and it’s the one I want to talk about.

The pipeline

DelegatingHandler is a handler that is designed to be chained with another handler, effectively forming a pipeline through which requests and responses will pass, as shown on this diagram:

HttpClient pipeline diagram

(Image from the official ASP.NET website)

Each handler has a chance to examine and/or modify the request before passing it to the next handler in the chain, and to examine and/or modify the response it receives from the next handler. Typically, the last handler in the pipeline is the HttpClientHandler, which communicates directly with the network.

The handler chain can be set up like this:

var pipeline = new MyHandler1()
{
    InnerHandler = new MyHandler2()
    {
        InnerHandler = new HttpClientHandler()
    }
};
var client = new HttpClient(pipeline);

But if you prefer fluent interfaces, you can easily create an extension method to do it like this:

var pipeline = new HttpClientHandler()
    .DecorateWith(new MyHandler2())
    .DecorateWith(new MyHandler1());
var client = new HttpClient(pipeline);
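
The DecorateWith method is not part of the framework; here’s a minimal sketch of what such an extension method might look like (the name and shape are just a convention I made up for this example):

public static class HttpMessageHandlerExtensions
{
    public static DelegatingHandler DecorateWith(
        this HttpMessageHandler innerHandler,
        DelegatingHandler outerHandler)
    {
        // Chain the handlers: the outer handler delegates to the inner one
        outerHandler.InnerHandler = innerHandler;
        return outerHandler;
    }
}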

All this might seem a little abstract at this point, but this pipeline architecture enables plenty of interesting scenarios. See, HTTP message handlers can be used to add custom behavior to how requests and responses are processed. I’ll give a few examples.

Side note: I’m presenting this feature from a client-side perspective (since I primarily make client apps), but the same HTTP message handlers are also used on the server-side in ASP.NET Web API.

Unit testing

The first use case that comes to mind, and the first I ever used, is unit testing. If you’re testing a class that makes online payments over HTTP, you don’t want it to actually send requests to the real server… you just want to ensure that the requests it sends are correct, and that it reacts correctly to specific responses. An easy solution to this problem is to create a "stub" handler, and inject it into your class to use instead of HttpClientHandler. Here’s a simple implementation:

class StubHandler : HttpMessageHandler
{
    // Responses to return
    private readonly Queue<HttpResponseMessage> _responses =
        new Queue<HttpResponseMessage>();

    // Requests that were sent via the handler
    private readonly List<HttpRequestMessage> _requests =
        new List<HttpRequestMessage>();

    protected override Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request,
        CancellationToken cancellationToken)
    {
        if (_responses.Count == 0)
            throw new InvalidOperationException("No response configured");

        _requests.Add(request);
        var response = _responses.Dequeue();
        return Task.FromResult(response);
    }

    public void EnqueueResponse(HttpResponseMessage response) =>
        _responses.Enqueue(response);

    public IEnumerable<HttpRequestMessage> GetRequests() =>
        _requests;
}

This class lets you record the requests that are sent via the handler and specify the responses that should be returned. For instance, you could write a test like this:

// Arrange
var handler = new StubHandler();
handler.EnqueueResponse(new HttpResponseMessage(HttpStatusCode.Unauthorized));
var processor = new PaymentProcessor(handler);

// Act
var paymentResult = await processor.ProcessPayment(new Payment());

// Assert
Assert.AreEqual(PaymentStatus.Failed, paymentResult.Status);

Of course, rather than creating a stub manually, you could use a mocking framework to generate a fake handler for you. The fact that the SendAsync method is protected makes it a little harder than it should be, but you can easily work around the issue by making a subclass that exposes a public virtual method, and mock that instead:

public abstract class MockableMessageHandler : HttpMessageHandler
{
    protected override sealed Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request,
        CancellationToken cancellationToken)
    {
        return DoSendAsync(request);
    }

    public abstract Task<HttpResponseMessage> DoSendAsync(HttpRequestMessage request);
}

Usage example with FakeItEasy:

// Arrange
var handler = A.Fake<MockableMessageHandler>();
A.CallTo(() => handler.DoSendAsync(A<HttpRequestMessage>._))
    .Returns(new HttpResponseMessage(HttpStatusCode.Unauthorized));
var processor = new PaymentProcessor(handler);
...

Logging

Logging sent requests and received responses can help diagnose issues. This can easily be done with a custom delegating handler:

public class LoggingHandler : DelegatingHandler
{
    private readonly ILogger _logger;

    public LoggingHandler(ILogger logger)
    {
        _logger = logger;
    }

    protected override async Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request,
        CancellationToken cancellationToken)
    {
        _logger.Trace($"Request: {request}");
        try
        {
            // base.SendAsync calls the inner handler
            var response = await base.SendAsync(request, cancellationToken);
            _logger.Trace($"Response: {response}");
            return response;
        }
        catch (Exception ex)
        {
            _logger.Error($"Failed to get response: {ex}");
            throw;
        }
    }
}
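
Wiring it into a client follows the same pattern as before; assuming you already have an ILogger instance called logger, it might look like this:

var pipeline = new LoggingHandler(logger)
{
    InnerHandler = new HttpClientHandler()
};
var client = new HttpClient(pipeline);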

Retrying failed requests

Another interesting use case for HTTP message handlers is to automatically retry failed requests. For instance, the server you’re talking to might be temporarily unavailable (503), or it could be throttling your requests (429), or maybe you lost Internet access. Handling the retry for these cases at the application level is a pain, because it can happen virtually in any part of your code. Having this logic at the lowest possible level and implemented in a way that is completely transparent to the callers can make things much easier.

Here’s a possible implementation of a retry handler:

public class RetryHandler : DelegatingHandler
{
    protected override async Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request,
        CancellationToken cancellationToken)
    {
        while (true)
        {
            try
            {
                // base.SendAsync calls the inner handler
                var response = await base.SendAsync(request, cancellationToken);

                if (response.StatusCode == HttpStatusCode.ServiceUnavailable)
                {
                    // 503 Service Unavailable
                    // Wait a bit and try again later
                    await Task.Delay(5000, cancellationToken);
                    continue;
                }

                if (response.StatusCode == (HttpStatusCode)429)
                {
                    // 429 Too many requests
                    // Wait a bit and try again later
                    await Task.Delay(1000, cancellationToken);
                    continue;
                }

                // Not something we can retry, return the response as is
                return response;
            }
            catch (Exception ex) when (IsNetworkError(ex))
            {
                // Network error
                // Wait a bit and try again later
                await Task.Delay(2000, cancellationToken);
                continue;
            }
        }
    }

    private static bool IsNetworkError(Exception ex)
    {
        // Check if it's a network error
        if (ex is SocketException)
            return true;
        if (ex.InnerException != null)
            return IsNetworkError(ex.InnerException);
        return false;
    }
}

Note that it’s a pretty naive and simplistic implementation; for use in production code, you will probably want to add exponential backoff, take the Retry-After header into account to decide how long you have to wait, or be more subtle in how you check if an exception indicates a connection issue. Also, note that in its current state, this handler will retry forever until it succeeds; make sure to pass a cancellation token so that you can stop retrying if necessary.
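
For instance, a small helper along these lines (just a sketch, not part of the handler above) could compute the delay, honoring the Retry-After header when the server provides one and falling back to exponential backoff otherwise:

private static TimeSpan GetRetryDelay(HttpResponseMessage response, int attempt)
{
    // Use the server-provided Retry-After delay if there is one
    var retryAfter = response?.Headers.RetryAfter;
    if (retryAfter?.Delta != null)
        return retryAfter.Delta.Value;

    // Otherwise fall back to exponential backoff: 2s, 4s, 8s, 16s...
    return TimeSpan.FromSeconds(Math.Pow(2, attempt + 1));
}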

Other use cases

I can’t give examples for every possible scenario, but here are a few other possible use cases for HTTP message handlers:

  • Custom cookie handling (I actually did that to work around a bug in CookieContainer)
  • Custom authentication (also something I did to implement OAuth2 Bearer authentication; see the sketch after this list)
  • Using the X-HTTP-Method-Override header to pass proxies that forbid certain HTTP methods (see Scott Hanselman’s article for details)
  • Custom encryption or encoding
  • Caching
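
To give a taste of the custom authentication scenario, here’s a rough sketch of a bearer token handler; the class name and the token retrieval delegate are hypothetical, and a real implementation would also need to handle token expiration:

public class BearerTokenHandler : DelegatingHandler
{
    private readonly Func<Task<string>> _getTokenAsync;

    public BearerTokenHandler(Func<Task<string>> getTokenAsync)
    {
        _getTokenAsync = getTokenAsync;
    }

    protected override async Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request,
        CancellationToken cancellationToken)
    {
        // Attach the bearer token to the outgoing request
        string token = await _getTokenAsync();
        request.Headers.Authorization = new AuthenticationHeaderValue("Bearer", token);
        return await base.SendAsync(request, cancellationToken);
    }
}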

As you can see, there’s a whole world of possibilities! If you have other ideas, let me know in the comments!

Tuple deconstruction in C# 7

Last time on this blog I talked about the new tuple feature of C# 7. In Visual Studio 15 Preview 3, the feature wasn’t quite finished; it lacked 2 important aspects:

  • emitting metadata for the names of tuple elements, so that the names are preserved across assemblies
  • deconstruction of tuples into separate variables

Well, it looks like the C# language team has been busy during the last month, because both items are now implemented in VS 15 Preview 4, which was released today! They’ve also written nice startup guides about tuples and deconstruction.

It is now possible to write something like this:

var values = ...
var (count, sum) = Tally(values);
Console.WriteLine($"There are {count} values and their sum is {sum}");

(the Tally method is the one from the previous post)

Note that the intermediate variable t from the previous post has disappeared; we now assign the count and sum variables directly from the method result, which looks much nicer IMHO. There doesn’t seem to be a way to ignore part of the tuple (i.e. not assign it to a variable), hopefully it will come later.

An interesting aspect of deconstruction is that it’s not limited to tuples; any type can be deconstructed, as long as it has a Deconstruct method with the appropriate out parameters:

class Point
{
    public int X { get; }
    public int Y { get; }

    public Point(int x, int y)
    {
        X = x;
        Y = y;
    }

    public void Deconstruct(out int x, out int y)
    {
        x = X;
        y = Y;
    }
}

...

var (x, y) = point;
Console.WriteLine($"Coordinates: ({x}, {y})");

The Deconstruct method can also be an extension method, which can be useful if you want to deconstruct a type that you don’t own. The old System.Tuple classes, for example, can be deconstructed using extension methods like this one:

public static void Deconstruct<T1, T2>(this Tuple<T1, T2> tuple, out T1 item1, out T2 item2)
{
    item1 = tuple.Item1;
    item2 = tuple.Item2;
}

...

var tuple = Tuple.Create("foo", 42);
var (name, value) = tuple;
Console.WriteLine($"Name: {name}, Value = {value}");

Finally, methods that return tuples are now decorated with a [TupleElementNames] attribute that indicates the names of the tuple members:

// Decompiled code
[return: TupleElementNames(new[] { "count", "sum" })]
public static ValueTuple<int, double> Tally(IEnumerable<double> values)
{
   ...
}

(the attribute is emitted by the compiler, you don’t actually need to write it yourself)

This makes it possible to share the tuple member names across assemblies, and lets tools like Intellisense provide helpful information about the method.

So, the tuple feature of C# 7 seems to be mostly complete; however, keep in mind that it’s still a preview, and some things could change between now and the final release.

Tuples in C# 7

A tuple is a finite ordered list of values, of possibly different types, which is used to bundle related values together without having to create a specific type to hold them.

In .NET 4.0, a set of Tuple classes was introduced in the framework; they can be used as follows:

private static Tuple<int, double> Tally(IEnumerable<double> values)
{
	int count = 0;
	double sum = 0.0;
	foreach (var value in values)
	{
	    count++;
	    sum += value;
	}
	return Tuple.Create(count, sum);
}

...

var values = ...
var t = Tally(values);
Console.WriteLine($"There are {t.Item1} values and their sum is {t.Item2}");

There are two annoying issues with the Tuple classes:

  • They’re classes, i.e. reference types. This means they must be allocated on the heap, and garbage collected when they’re no longer used. For applications where performance is critical, it can be an issue. Also, the fact that they can be null is often not desirable.
  • The elements in the tuple don’t have names, or rather, they always have the same names (Item1, Item2, etc), which are not meaningful at all. The Tuple<T1, T2> type conveys no information about what the tuple actually represents, which makes it a poor choice in public APIs.

In C# 7, a new feature will be introduced to improve support for tuples: you will be able to declare tuple types “inline”, a little like anonymous types, except that they’re not limited to the current method. Using this new feature, the code above becomes much cleaner:

static (int count, double sum) Tally(IEnumerable<double> values)
{
	int count = 0;
	double sum = 0.0;
	foreach (var value in values)
	{
	    count++;
	    sum += value;
	}
	return (count, sum);
}

...

var values = ...
var t = Tally(values);
Console.WriteLine($"There are {t.count} values and their sum is {t.sum}");

Note how the return type of the Tally method is declared, and how the result is used. This is much better! The tuple elements now have significant names, and the syntax is nicer too. The feature relies on a new ValueTuple<T1, T2> structure, which means it doesn’t involve a heap allocation.

You can try this feature right now in Visual Studio 15 Preview 3. However, the ValueTuple<T1, T2> type is not (yet) part of the .NET Framework; to get this example to work, you’ll need to reference the System.ValueTuple NuGet package.

Finally, one last remark about the names of tuple members: like many other language features, they’re just syntactic sugar. In the compiled code, the tuple members are only referred to as Item1 and Item2, not count and sum. The Tally method above actually returns a ValueTuple<int, double>, not a specially generated type. Note that the compiler that ships with VS 15 Preview 3 emits no metadata about the names of the tuple members. This part of the feature is not yet implemented, but should be included in the final version. This means that in the meantime, you can’t use tuples across assemblies (well, you can, but you will lose the member names and will have to use Item1 and Item2 to refer to the tuple members).
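
In other words, the call site shown earlier compiles down to roughly this:

// What the compiler actually sees (roughly)
ValueTuple<int, double> t = Tally(values);
Console.WriteLine($"There are {t.Item1} values and their sum is {t.Item2}");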

Pitfall: using var and async together

A few days ago at work, I stumbled upon a sneaky bug in our main app. The code looked innocent enough, and at first glance I couldn’t understand what was wrong… The code was similar to the following:

public async Task<bool> BookExistsAsync(int id)
{
    var store = await GetBookStoreAsync();
    var book = store.GetBookByIdAsync(id);
    return book != null;
}

// For completeness, here are the types and methods used in BookExistsAsync:

private Task<IBookStore> GetBookStoreAsync()
{
    // actual implementation irrelevant
    // ...
}


public interface IBookStore
{
    Task<Book> GetBookByIdAsync(int id);
    // other members omitted for brevity
}

public class Book
{
    public int Id { get; set; }
    // other members omitted for brevity
}

The BookExistsAsync method always returns true. Can you see why?

Look at this line:

var book = store.GetBookByIdAsync(id);

Quick, what’s the type of book? If you answered Book, think again: it’s Task<Book>. The await is missing! And an async method always returns a non-null task, so book is never null.
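
The fix itself is trivial; just add the missing await:

var book = await store.GetBookByIdAsync(id);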

When you have an async method with no await, the compiler warns you, but in this case there is an await on the line above. The only thing we do with book is to check that it’s not null; since Task<T> is a reference type, there’s nothing suspicious in comparing it to null. So, the compiler sees nothing wrong; the static code analyzer (ReSharper in this case) sees nothing wrong; and of course the feeble human brain reviewing the code sees nothing wrong either… Obviously, it could easily have been detected with adequate unit test coverage, but unfortunately this method wasn’t covered.

So, how to avoid this kind of mistake? Stop using var and always specify types explicitly? But I like var, I use it almost everywhere! Besides, I think it’s the first time I ever found a bug caused by the use of var. I’m really not willing to give it up…

Ideally, I would have liked ReSharper to spot the issue; perhaps it should consider all Task-returning methods to be implicitly [NotNull], unless specified otherwise. Until then, I don’t have a silver bullet against this issue; just pay attention when you call an async method, and write unit tests!

Test driving C# 7 features in Visual Studio “15” Preview

About two weeks ago, Microsoft released the first preview of the next version of Visual Studio. You can read about what’s new in the release notes. Some of the new features are really nice (for instance I love the new “lightweight installer”), but the most interesting for me is that it comes with a version of the compiler that includes a few of the features planned for C# 7. Let’s have a closer look at them!

Enabling the new features

The new features are not enabled by default. You can enable them individually with /feature: command line switches, but the easiest way is to enable them all by adding __DEMO__ and __DEMO_EXPERIMENTAL__ to the conditional compilation symbols (in Project properties, Build tab).

Local functions

Most functional languages allow you to declare functions in the body of other functions. It’s now possible to do the same in C# 7! The syntax for declaring a method inside another is pretty much what you would expect:

long Factorial(int n)
{
    long Fact(int i, long acc)
    {
        return i == 0 ? acc : Fact(i - 1, acc * i);
    }
    return Fact(n, 1);
}

Here, the Fact method is local to the Factorial method (in case you’re wondering, it’s a tail-recursive implementation of the factorial — which doesn’t make much sense, since C# doesn’t support tail recursion, but it’s just an example).

Of course, it was already possible to simulate local functions with lambda expressions (see the sketch after this list), but there were a few drawbacks:

  • it’s less readable, because you have to declare the delegate type explicitly
  • it’s slower, due to the overhead of creating a delegate instance, and calling the delegate vs. calling the method directly
  • writing recursive lambdas is a bit awkward
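
For comparison, here’s what the lambda-based version of the factorial above might look like, showing those drawbacks in action:

long Factorial(int n)
{
    // The delegate type must be spelled out, and the recursive call
    // forces the awkward "declare null, then assign" pattern
    Func<int, long, long> fact = null;
    fact = (i, acc) => i == 0 ? acc : fact(i - 1, acc * i);
    return fact(n, 1);
}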

Local functions have the following benefits:

  • when a method is only used as a helper for another method, making it local makes the relation more obvious
  • like lambdas, local functions can capture local variables and parameters of their containing method
  • local functions support recursion like any normal method

You can read more about this feature in the Roslyn Github repository.

Ref returns and ref locals

Since the first version of C#, it has always been possible to pass parameters by reference, which is conceptually similar to passing a pointer to a variable in languages like C. Until now, this feature was limited to parameters, but in C# 7 it becomes possible to return values by reference, or to have local variables that refer to the location of another variable. Here’s an example:

static void TestRefReturn()
{
    var foo = new Foo();
    Console.WriteLine(foo); // 0, 0
    
    foo.GetByRef("x") = 42;

    ref int y = ref foo.GetByRef("y");
    y = 99;

    Console.WriteLine(foo); // 42, 99
}

class Foo
{
    private int x;
    private int y;

    public ref int GetByRef(string name)
    {
        if (name == "x")
            return ref x;
        if (name == "y")
            return ref y;
        throw new ArgumentException(nameof(name));
    }

    public override string ToString() => $"{x},{y}";
}

Let’s have a closer look at this code.

  • In the statement foo.GetByRef("x") = 42;, it looks like I’m assigning a value to the return of a method; what does this even mean? Well, the GetByRef method returns a field of the Foo class by reference (note the ref int return type of GetByRef). So, if I pass "x" as an argument, it returns the x field by reference. If I assign a value to that, it actually assigns a value to the x field.
  • In the statement ref int y = ref foo.GetByRef("y");, instead of assigning a value directly to the field returned by GetByRef, I use a ref local variable y. The local variable now shares the same memory location as the foo.y field, so if I assign a value to it, it changes the value of foo.y.

Note that you can also return an array location by reference:

private MyBigStruct[] array = new MyBigStruct[10];
private int current;

public ref MyBigStruct GetCurrentItem()
{
    return ref array[current];
}

It’s likely that most C# developers will never actually need this feature; it’s pretty low level, and not the kind of thing you typically need when writing line-of-business applications. However it’s very useful for code whose performance is critical: copying a large structure is expensive, so if we can return it by reference instead, it can be a non-negligible performance benefit.

You can learn more about this feature on Github.

Pattern matching

Pattern matching is a feature very common in functional languages. C# 7 introduces some aspects of pattern matching, in the form of extensions to the is operator. For instance, when testing the type of a value, you can now introduce a new variable after the type; if the test succeeds, that variable is assigned the value being tested, cast to the type on the right-hand side of the is (it will be clearer with an example).

Typically, if you need to test that a value is of type DateTime, then do something with that DateTime, you need to test the type, then cast to that type:

object o = GetValue();
if (o is DateTime)
{
    var d = (DateTime)o;
    // Do something with d
}

In C# 7, you can do this instead:

object o = GetValue();
if (o is DateTime d)
{
    // Do something with d
}

d is now declared directly as part of the o is DateTime expression.

This feature can also be used in a switch statement:

object v = GetValue();
switch (v)
{
    case string s:
        Console.WriteLine($"{v} is a string of length {s.Length}");
        break;
    case int i:
        Console.WriteLine($"{v} is an {(i % 2 == 0 ? "even" : "odd")} int");
        break;
    default:
        Console.WriteLine($"{v} is something else");
        break;
}

In this code, each case introduces a variable of the appropriate type, which you can use in the body of the case.

So far I only covered pattern matching against a simple type, but there are also more advanced forms. For instance:

switch (DateTime.Today)
{
    case DateTime(*, 10, 31):
        Console.WriteLine("Happy Halloween!");
        break;
    case DateTime(var year, 7, 4) when year > 1776:
        Console.WriteLine("Happy Independence Day!");
        break;
    case DateTime { DayOfWeek is DayOfWeek.Saturday }:
    case DateTime { DayOfWeek is DayOfWeek.Sunday }:
        Console.WriteLine("Have a nice week-end!");
        break;
    default:
        break;
}

How cool is that!

There’s also another (still experimental) form of pattern matching, using a new match keyword:

object o = GetValue();
string description = o match
    (
        case string s : $"{o} is a string of length {s.Length}"
        case int i : $"{o} is an {(i % 2 == 0 ? "even" : "odd")} int"
        case * : $"{o} is something else"
    );

It’s very similar to a switch, except that it’s an expression, not a statement.

There’s a lot more to pattern matching than what I mentioned here. You can look at the spec on Github for more details.

Binary literals and digit separators

These features were not explicitly mentioned in the VS Preview release notes, but I noticed they were included as well. They were initially planned for C# 6, but didn’t make it in the final release. They’re back in C# 7.

You can now write numeric literals in binary, in addition to decimal and hexadecimal:

int x = 0b11001010;

Very convenient to define bit masks!

To make large numbers more readable, you can also group digits by introducing separators. This can be used for decimal, hexadecimal or binary literals:

int oneBillion = 1_000_000_000;
int foo = 0x7FFF_1234;
int bar = 0b1001_0110_1010_0101;

Conclusion

So, with Visual Studio “15” Preview, you can start experimenting with the new C# 7 features; don’t hesitate to share your feedback on Github! And keep in mind that it’s still pre-release software, lots of things can change before the final release.

Using multiple cancellation sources with CreateLinkedTokenSource

Async programming in C# used to be hard; thanks to .NET 4’s Task Parallel Library and C# 5’s async/await feature, it has become fairly easy, and as a result, is becoming much more common. At the same time, a standardized approach to cancellation was introduced: cancellation tokens. The basic idea is that you create a CancellationTokenSource that controls the cancellation, and pass the token it provides to the method that you want to be able to cancel. That method will then pass it to the other methods it calls, if they can be canceled, and/or regularly check if cancellation was requested. Upon cancellation, the method will typically throw an OperationCanceledException. Quick and dirty example:

private readonly IBusinessService _businessService;
private CancellationTokenSource _cancellationSource;
private Task _asyncOperation;

private void StartAsyncOperation()
{
    if (_asyncOperation != null)
        return;
    _cancellationSource = new CancellationTokenSource();
    _asyncOperation = _businessService.DoSomethingAsync(_cancellationSource.Token);
}

// async void is bad; like I said, this is a quick and dirty example
private async void StopAsyncOperation()
{
    try
    {
        _cancellationSource.Cancel();
        // wait for the operation to finish
        await _asyncOperation;
    }
    catch (OperationCanceledException)
    {
        // Operation was successfully canceled
    }
    catch (Exception)
    {
        // Oops, something went wrong
    }
    finally
    {
        _asyncOperation = null;
        _cancellationSource.Dispose();
        _cancellationSource = null;
    }
}

...

class BusinessService : IBusinessService
{
    public async Task DoSomethingAsync(CancellationToken cancellationToken)
    {
        var data = await GetDataFromServerAsync(cancellationToken);
        foreach (string line in data)
        {
            cancellationToken.ThrowIfCancellationRequested();
            await ProcessLineAsync(line, cancellationToken);
        }
    }

    ...
}

In this case, StopAsyncOperation would be called, for instance, if the user chooses to abort the operation.

This all works pretty well and is rather easy to set up. But what if there is another reason to cancel the operation, known only by the BusinessService and outside the control of the calling method? That’s where the CancellationTokenSource.CreateLinkedTokenSource method comes into play; basically, this method creates a cancellation source that will be canceled when any of the specified tokens is canceled.

Let’s start with a simple case: you have another cancellation token that you also want to take into account. The code would look like this:

    public async Task DoSomethingAsync(CancellationToken cancellationToken)
    {
        var otherToken = GetOtherCancellationToken();
        using (var linkedCts = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken, otherToken))
        {
            var data = await GetDataFromServerAsync(linkedCts.Token);
            foreach (string line in data)
            {
                linkedCts.Token.ThrowIfCancellationRequested();
                await ProcessLineAsync(line, linkedCts.Token);
            }
        }
    }

We created a linked cancellation source based on the two cancellation tokens, then used the token from this new source instead of cancellationToken. If either cancellationToken or otherToken is canceled, linkedCts.Token will be canceled as well. If necessary, the calling code can detect how the operation was canceled by checking the CancellationToken property of the OperationCanceledException.

Now let’s see a slightly more difficult case: the second cancellation source is actually an event. You want to cancel the operation when the event occurs, in addition to user cancellation represented by the cancellationToken parameter. So you need to subscribe to the event and trigger the cancellation when it occurs. Here’s a way to do it:

    public async Task DoSomethingAsync(CancellationToken cancellationToken)
    {
        using (var linkedCts = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken))
        {
            EventHandler handler = (sender, e) => linkedCts.Cancel();
            try
            {
                SomeEvent += handler;
                var data = await GetDataFromServerAsync(linkedCts.Token);
                foreach (string line in data)
                {
                    linkedCts.Token.ThrowIfCancellationRequested();
                    await ProcessLineAsync(line, linkedCts.Token);
                }
            }
            finally
            {
                SomeEvent -= handler;
            }
        }
    }

Here we only pass cancellationToken to CreateLinkedTokenSource, and we directly cancel linkedCts when the event is raised. The code is getting a bit convoluted, but it achieves the desired result.

I can’t really give you a specific real-world use case of this technique, because the cases where I used it are too specific to be of public interest, but I can outline the general scenario. I have a long running operation that is made up of multiple long running operations. The whole operation can be canceled globally, and each of the sub-operations can also be canceled individually, without affecting the others. Here’s the rough outline of what it looks like:

async Task GlobalOperationAsync(CancellationToken cancellationToken)
{
    foreach (var subOperation in SubOperations)
    {
        cancellationToken.ThrowIfCancellationRequested();
        var subToken = subOperation.GetSpecificCancellationToken();
        using (var linkedCts = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken, subToken))
        {
            try
            {
                await subOperation.RunAsync(linkedCts.Token);
            }
            catch (OperationCanceledException)
            {
                // Rethrow only if global cancellation was requested
                if (cancellationToken.IsCancellationRequested)
                    throw;
                    
                // otherwise continue running the other sub-operations
            }
        }
    }
}

Note that even though CancellationToken was introduced with the TPL and all the examples I gave were asynchronous, nothing prevents you from using this technique with synchronous code.
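
For instance, here’s a small synchronous sketch that combines the caller’s token with a timeout (the ProcessLine method and the 5 minute timeout are just placeholders for this example):

void ProcessLines(IEnumerable<string> lines, CancellationToken cancellationToken)
{
    using (var timeoutCts = new CancellationTokenSource(TimeSpan.FromMinutes(5)))
    using (var linkedCts = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken, timeoutCts.Token))
    {
        foreach (var line in lines)
        {
            // Throws OperationCanceledException if either the caller's token
            // or the timeout token has been canceled
            linkedCts.Token.ThrowIfCancellationRequested();
            ProcessLine(line);
        }
    }
}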

I hope you find this helpful. Have a great New Year’s Eve celebration and a happy new year!

Exception filters in C# 6: their biggest advantage is not what you think

Exception filters are one of the major new features of C# 6. They take advantage of a CLR feature that was there from the start, but wasn’t used in C# until now. They allow you to specify a condition on a catch block:

static void Main()
{
    try
    {
        Foo.DoSomethingThatMightFail(null);
    }
    catch (MyException ex) when (ex.Code == 42)
    {
        Console.WriteLine("Error 42 occurred");
    }
}

As you might expect, the catch block will be entered if and only if ex.Code == 42. If the condition is not satisfied, the exception will bubble up the stack until it’s caught somewhere else or terminates the process.

At first glance, this feature doesn’t seem to bring anything really new. After all, it has always been possible to do this:

static void Main()
{
    try
    {
        Foo.DoSomethingThatMightFail(null);
    }
    catch (MyException ex)
    {
        if (ex.Code == 42)
            Console.WriteLine("Error 42 occurred");
        else
            throw;
    }
}

Since this piece of code is equivalent to the previous one, exception filters are just syntactic sugar, aren’t they? I mean, they are equivalent, right?

WRONG!

Stack unwinding

There is actually a subtle but important difference: exception filters don’t unwind the stack. OK, but what does that mean?

When you enter a catch block, the stack is unwound: this means that the stack frames for the method calls “deeper” than the current method are dropped. This implies that all information about current execution state in those stack frames is lost, making it harder to identify the root cause of the exception.

Let’s assume that DoSomethingThatMightFail throws a MyException with the code 123, and the debugger is configured to break only on uncaught exceptions.

  • In the code that doesn’t use exception filters, the catch block is always entered (based on the type of the exception), and the stack is immediately unwound. Since the exception doesn’t satisfy the condition, it is rethrown. So the debugger will break on the throw; in the catch block; no information on the execution state of the DoSomethingThatMightFail method will be available. In other words, we won’t know what was going on in the method that threw the exception.
  • In the code with exception filters, on the other hand, the filter won’t match, so the catch block won’t be entered at all, and the stack won’t be unwound. The debugger will break in the DoSomethingThatMightFail method, making it easy to see what was going on when the exception was thrown.

Of course, when you’re debugging directly in Visual Studio, you can configure the debugger to break as soon as an exception is thrown, whether it’s caught or not. But you don’t always have that luxury; for instance, if you’re debugging an application in production, you often have just a crash dump to work with, so the fact that the stack wasn’t unwound becomes very useful, since it lets you see what was going on in the method that threw the exception.

Stack vs. stack trace

You may have noticed that I talked about the stack, not the stack trace. Even though it’s common to refer to “the stack” when we mean “the stack trace”, they’re not the same thing. The call stack is a piece of memory allocated to the thread, that contains information for each method call: return address, arguments, and local variables. The stack trace is just a string that contains the names of the methods currently on the call stack (and the location in those methods, if debug symbols are available). The Exception.StackTrace property contains the stack trace as it was when the exception was thrown, and is not affected when the stack is unwound; if you rethrow the same exception with throw;, it is left untouched. It is only overwritten if you rethrow the exception with throw ex;. The stack itself, on the other hand, is unwound when a catch block is entered, as discussed above.

Side effects

It’s interesting to note that an exception filter can contain any expression that returns a bool (well, almost… you can’t use await, for instance). It can be an inline condition, a property, a method call, etc. Technically, there’s nothing to prevent you from causing side effects in the exception filter. In most cases, I would strongly advise against doing that, as it can cause very confusing behavior; it can become really hard to understand the order in which things are executed. However, there is a common scenario that could benefit from side effects in exception filters: logging. You could easily create a method that logs the exception and returns false so that the catch block is not entered. This would allow logging exceptions on the fly without actually catching them, hence without unwinding the stack:

try
{
    DoSomethingThatMightFail(s);
}
catch (Exception ex) when (Log(ex, "An error occurred"))
{
    // this catch block will never be reached
}

...

static bool Log(Exception ex, string message, params object[] args)
{
    Debug.Print(message, args);
    return false;
}

Conclusion

As you can see, exception filters are not just syntactic sugar. Contrary to most C# 6 features, they’re not really a “coding” feature (in that they don’t make the code significantly clearer), but rather a “debugging” feature. Correctly understood and used, they can make it much easier to diagnose problems in your code.

Customizing string interpolation in C# 6

One of the major new features in C# 6 is string interpolation, which allows you to write things like this:

string text = $"{p.Name} was born on {p.DateOfBirth:D}";

A lesser known aspect of this feature is that an interpolated string can be treated either as a String, or as an IFormattable, depending on the context. When it is converted to an IFormattable, it constructs a FormattableString object that implements the interface and exposes:

  • the format string with the placeholders (“holes”) replaced by numbers (compatible with String.Format)
  • the values for the placeholders

The ToString() method of this object just calls String.Format(format, values). But there is also an overload that accepts an IFormatProvider, and this is where things get interesting, because it makes it possible to customize how the values are formatted. It might not be immediately obvious why this is useful, so let me give you a few examples…

Specifying the culture

During the design of the string interpolation feature, there was a lot of debate on whether to use the current culture or the invariant culture to format the values; there were good arguments on both sides, but eventually it was decided to use the current culture, for consistency with String.Format and similar APIs that use composite formatting. Using the current culture makes sense when you’re using string interpolation to build strings to be displayed in the user interface; but there are also scenarios where you want to build strings that will be consumed by an API or protocol (URLs, SQL queries…), and in those cases you usually want to use the invariant culture.

C# 6 provides an easy way to do that, by taking advantage of the conversion to IFormattable. You just need to create a method like this:

static string Invariant(FormattableString formattable)
{
    return formattable.ToString(CultureInfo.InvariantCulture);
}

And you can then use it as follows:

string text = Invariant($"{p.Name} was born on {p.DateOfBirth:D}");

The values in the interpolated strings will now be formatted with the invariant culture, rather than the default culture.

Building URLs

Here’s a more advanced example. String interpolation is a convenient way to build URLs, but if you include arbitrary strings in a URL, you need to be careful to URL-encode them. A custom string interpolator can do that for you; you just need to create a custom IFormatProvider that will take care of encoding the values. The implementation was not obvious at first, but after some trial and error I came up with this:

class UrlFormatProvider : IFormatProvider
{
    private readonly UrlFormatter _formatter = new UrlFormatter();

    public object GetFormat(Type formatType)
    {
        if (formatType == typeof(ICustomFormatter))
            return _formatter;
        return null;
    }

    class UrlFormatter : ICustomFormatter
    {
        public string Format(string format, object arg, IFormatProvider formatProvider)
        {
            if (arg == null)
                return string.Empty;
            if (format == "r")
                return arg.ToString();
            return Uri.EscapeDataString(arg.ToString());
        }
    }
}

You can use the formatter like this:

static string Url(FormattableString formattable)
{
    return formattable.ToString(new UrlFormatProvider());
}

...

string url = Url($"http://foobar/item/{id}/{name}");

It will correctly encode the values of id and name so that the resulting URL only contains valid characters.

Aside: Did you notice the if (format == "r")? It’s a custom format specifier to indicate that the value should not be encoded (“r” stands for “raw”). To use it you just include it in the format string like this: {id:r}. This will prevent the encoding of id.
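
For example, with the Url helper above:

// id is inserted as-is, name is still URL-encoded
string url = Url($"http://foobar/item/{id:r}/{name}");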

Building SQL queries

You can do something similar for SQL queries. Of course, it’s a known bad practice to embed values directly in the query, for security and performance reasons (you should use parameterized queries instead); but for “quick and dirty” developments it can still be useful. And anyway, it’s a good illustration for the feature. When embedding values in a SQL query, you should:

  • enclose strings in single quotes, and escape single quotes inside the strings by doubling them
  • format dates according to what the DBMS expects (typically MM/dd/yyyy)
  • format numbers using the invariant culture
  • replace null values with the NULL literal

(there are probably other things to take care of, but these are the most obvious).

We can use the same approach as for URLs and create a SqlFormatProvider:

class SqlFormatProvider : IFormatProvider
{
    private readonly SqlFormatter _formatter = new SqlFormatter();

    public object GetFormat(Type formatType)
    {
        if (formatType == typeof(ICustomFormatter))
            return _formatter;
        return null;
    }

    class SqlFormatter : ICustomFormatter
    {
        public string Format(string format, object arg, IFormatProvider formatProvider)
        {
            if (arg == null)
                return "NULL";
            if (arg is string)
                return "'" + ((string)arg).Replace("'", "''") + "'";
            if (arg is DateTime)
                return "'" + ((DateTime)arg).ToString("MM/dd/yyyy") + "'";
            if (arg is IFormattable)
                return ((IFormattable)arg).ToString(format, CultureInfo.InvariantCulture);
            return arg.ToString();
        }
    }
}

You can then use the formatter like this:

static string Sql(FormattableString formattable)
{
    return formattable.ToString(new SqlFormatProvider());
}

...

string sql = Sql($"insert into items(id, name, creationDate) values({id}, {name}, {DateTime.Now})");

This will take care of properly formatting the values to produce a valid SQL query.

Using string interpolation when targeting older versions of .NET

As is often the case for language features that leverage .NET framework types, you can use this feature with older versions of the framework that don’t have the FormattableString class; you just have to create the class yourself in the appropriate namespace. Actually, there are two classes to implement: FormattableString and FormattableStringFactory. Jon Skeet was apparently in a hurry to try this, and he has already provided an example with the code for these classes:

using System;

namespace System.Runtime.CompilerServices
{
    public class FormattableStringFactory
    {
        public static FormattableString Create(string messageFormat, params object[] args)
        {
            return new FormattableString(messageFormat, args);
        }

        public static FormattableString Create(string messageFormat, DateTime bad, params object[] args)
        {
            var realArgs = new object[args.Length + 1];
            realArgs[0] = "Please don't use DateTime";
            Array.Copy(args, 0, realArgs, 1, args.Length);
            return new FormattableString(messageFormat, realArgs);
        }
    }
}

namespace System
{
    public class FormattableString
    {
        private readonly string messageFormat;
        private readonly object[] args;

        public FormattableString(string messageFormat, object[] args)
        {
            this.messageFormat = messageFormat;
            this.args = args;
        }
        public override string ToString()
        {
            return string.Format(messageFormat, args);
        }
    }
}

This is the same approach that made it possible to use Linq when targeting .NET 2 (LinqBridge) or caller info attributes when targeting .NET 4 or earlier. Of course, it still requires the C# 6 compiler to work…

Conclusion

The conversion of interpolated strings to IFormattable had been mentioned previously, but it wasn’t implemented until recently; the just released CTP 6 of Visual Studio 2015 ships with a new version of the compiler that includes this feature, so you can now go ahead and use it. This feature makes string interpolation very flexible, and I’m sure people will come up with many other use cases that I didn’t think of.

You can find the code for the examples above on GitHub.

Optimize ToArray and ToList by providing the number of elements

The ToArray and ToList extension methods are convenient ways to eagerly materialize an enumerable sequence (e.g. a Linq query) into an array or a list. However, there’s something that bothers me: both of these methods are very inefficient if they don’t know the number of elements in the sequence (which is almost always the case when you use them on a Linq query). Let’s focus on ToArray for now (ToList has a few differences, but the principle is mostly the same).

Basically, ToArray takes a sequence, and returns an array that contains all the elements from the sequence. If the sequence implements ICollection<T>, it uses the Count property to allocate an array of the right size, and copy the elements into it; here’s an example:

List<User> users = GetUsers();
User[] array = users.ToArray();

In this scenario, ToArray is fairly efficient. Now, let’s change that code to extract just the names from the users:

List<User> users = GetUsers();
string[] array = users.Select(u => u.Name).ToArray();

Now, the sequence passed to ToArray is an IEnumerable<string> returned by Select. It doesn’t implement ICollection<string>, so ToArray doesn’t know the number of elements and cannot allocate an array of the appropriate size. So here’s what it does:

  1. start by allocating a small array (4 elements in the current implementation)
  2. copy elements from the source into the array until the array is full
  3. if there are no more elements in the source, go to 7
  4. otherwise, allocate a new array, twice as large as the previous one
  5. copy the items from the old array to the new array
  6. repeat from step 2
  7. if the array is longer than the number of elements, trim it: allocate a new array with exactly the right size, and copy the elements from the previous array
  8. return the array

If there are few elements, this is quite painless; but for a very long sequence, it’s very inefficient, because of the many allocations and copies.

What is annoying is that, in many cases, we know the number of elements in the source! In the example above, we only use Select, which doesn’t change the number of elements, so we know that it’s the same as in the original list; but ToArray doesn’t know, because the information was lost along the way. If only we had a way to help it by providing this information ourselves…

Well, it’s actually very easy to do: all we have to do is create a new extension method that accepts the count as a parameter. Here’s what it might look like:

public static TSource[] ToArray<TSource>(this IEnumerable<TSource> source, int count)
{
    if (source == null) throw new ArgumentNullException("source");
    if (count < 0) throw new ArgumentOutOfRangeException("count");
    var array = new TSource[count];
    int i = 0;
    foreach (var item in source)
    {
        array[i++] = item;
    }
    return array;
}

Now we can optimize our previous example like this:

List<User> users = GetUsers();
string[] array = users.Select(u => u.Name).ToArray(users.Count);

Note that if you specify a count that is less than the actual number of elements in the sequence, you will get an IndexOutOfRangeException; it’s your responsibility to provide the correct count to the method.

So, what do we actually gain by doing that? From my benchmarks, this improved ToArray is about twice as fast as the built-in one, for a long sequence (tested with 1,000,000 elements). This is pretty good!

Note that we can improve ToList in the same way, by using the List<T> constructor that lets us specify the initial capacity:

public static List<TSource> ToList<TSource>(this IEnumerable<TSource> source, int count)
{
    if (source == null) throw new ArgumentNullException("source");
    if (count < 0) throw new ArgumentOutOfRangeException("count");
    var list = new List<TSource>(count);
    foreach (var item in source)
    {
        list.Add(item);
    }
    return list;
}

In this case, the performance gain is not as big as for ToArray (about 25% instead of 50%), probably because the list doesn’t need to be trimmed, but it’s not negligible.

Obviously, a similar optimization could be made to ToDictionary as well, since the Dictionary<TKey, TValue> class also has a constructor that lets us specify the initial capacity.
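
Here’s a sketch of what such an overload might look like, following the same pattern as above:

public static Dictionary<TKey, TSource> ToDictionary<TSource, TKey>(
    this IEnumerable<TSource> source,
    Func<TSource, TKey> keySelector,
    int count)
{
    if (source == null) throw new ArgumentNullException("source");
    if (keySelector == null) throw new ArgumentNullException("keySelector");
    if (count < 0) throw new ArgumentOutOfRangeException("count");
    var dictionary = new Dictionary<TKey, TSource>(count);
    foreach (var item in source)
    {
        dictionary.Add(keySelector(item), item);
    }
    return dictionary;
}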

The improved ToArray and ToList methods are available in my Linq.Extras library, which also provides many useful extension methods for working on sequences and collections.
