Tuple deconstruction in C# 7

Very poorPoorAverageGoodExcellent (1 votes) 
Loading...

Last time on this blog I talked about the new tuple feature of C# 7. In Visual Studio 15 Preview 3, the feature wasn’t quite finished; it lacked 2 important aspects:

  • emitting metadata for the names of tuple elements, so that the names are preserved across assemblies
  • deconstruction of tuples into separate variables

Well, it looks like the C# language team has been busy during the last month, because both items are now implemented in VS 15 Preview 4, which was released today! They’ve also written nice startup guides about tuples and deconstruction.

It is now possible to write something like this:

var values = ...
var (count, sum) = Tally(values);
Console.WriteLine($"There are {count} values and their sum is {sum}");

(the Tally method is the one from the previous post)

Note that the intermediate variable t from the previous post has disappeared; we now assign the count and sum variables directly from the method result, which looks much nicer IMHO. There doesn’t seem to be a way to ignore part of the tuple (i.e. not assign it to a variable), hopefully it will come later.

An interesting aspect of deconstruction is that it’s not limited to tuples; any type can be deconstructed, as long as it has a Deconstruct method with the appropriate out parameters:

class Point
{
    public int X { get; }
    public int Y { get; }

    public Point(int x, int y)
    {
        X = x;
        Y = y;
    }

    public void Deconstruct(out int x, out int y)
    {
        x = X;
        y = Y;
    }
}

...

var (x, y) = point;
Console.WriteLine($"Coordinates: ({x}, {y})");

The Deconstruct method can also be an extension method, which can be useful if you want to deconstruct a type that you don’t own. The old System.Tuple classes, for example, can be deconstructed using extension methods like this one:

public static void Deconstruct<T1, T2>(this Tuple<T1, T2> tuple, out T1 item1, out T2 item2)
{
    item1 = tuple.Item1;
    item2 = tuple.Item2;
}

...

var tuple = Tuple.Create("foo", 42);
var (name, value) = tuple;
Console.WriteLine($"Name: {name}, Value = {value}");

Finally, methods that return tuples are now decorated with a [TupleElementNames] attribute that indicates the names of the tuple members:

// Decompiled code
[return: TupleElementNames(new[] { "count", "sum" })]
public static ValueTuple<int, double> Tally(IEnumerable<double> values)
{
   ...
}

(the attribute is emitted by the compiler, you don’t actually need to write it yourself)

This makes it possible to share the tuple member names across assemblies, and lets tools like Intellisense provide helpful information about the method.

So, the tuple feature of C# 7 seems to be mostly complete; however, keep in mind that it’s still a preview, and some things could change between now and the final release.

Tuples in C# 7

Very poorPoorAverageGoodExcellent (2 votes) 
Loading...

A tuple is an finite ordered list of values, of possibly different types, which is used to bundle related values together without having to create a specific type to hold them.

In .NET 4.0, a set of Tuple classes has been introduced in the framework, which can be used as follows:

private static Tuple<int, double> Tally(IEnumerable<double> values)
{
	int count = 0;
	double sum = 0.0;
	foreach (var value in values)
	{
	    count++;
	    sum += value;
	}
	return Tuple.Create(count, sum);
}

...

var values = ...
var t = Tally(values);
Console.WriteLine($"There are {t.Item1} values and their sum is {t.Item2}");

There are two annoying issues with the Tuple classes:

  • They’re classes, i.e. reference types. This means they must be allocated on the heap, and garbage collected when they’re no longer used. For applications where performance is critical, it can be an issue. Also, the fact that they can be null is often not desirable.
  • The elements in the tuple don’t have names, or rather, they always have the same names (Item1, Item2, etc), which are not meaningful at all. The Tuple<T1, T2> type conveys no information about what the tuple actually represents, which makes it a poor choice in public APIs.

In C# 7, a new feature will be introduced to improve support for tuples: you will be able to declare tuples types “inline”, a little like anonymous types, except that they’re not limited to the current method. Using this new feature, the code above becomes much cleaner:

static (int count, double sum) Tally(IEnumerable<double> values)
{
	int count = 0;
	double sum = 0.0;
	foreach (var value in values)
	{
	    count++;
	    sum += value;
	}
	return (count, sum);
}

...

var values = ...
var t = Tally(values);
Console.WriteLine($"There are {t.count} values and their sum is {t.sum}");

Note how the return type of the Tally method is declared, and how the result is used. This is much better! The tuple elements now have significant names, and the syntax is nicer too. The feature relies on a new ValueTuple<T1, T2> structure, which means it doesn’t involve a heap allocation.

You can try this feature right now in Visual Studio 15 Preview 3. However, the ValueTuple<T1, T2> type is not (yet) part of the .NET Framework; to get this example to work, you’ll need to reference the System.ValueTuple NuGet package.

Finally, one last remark about the names of tuple members: like many other language features, they’re just syntactic sugar. In the compiled code, the tuple members are only referred to as Item1 and Item2, not count and sum. The Tally method above actually returns a ValueTuple<int, double>, not a specially generated type. Note that the compiler that ships with VS 15 Preview 3 emits no metadata about the names of the tuple members. This part of the feature is not yet implemented, but should be included in the final version. This means that in the meantime, you can’t use tuples across assemblies (well, you can, but you will lose the member names and will have to use Item1 and Item2 to refer to the tuple members).

Pitfall: using var and async together

Very poorPoorAverageGoodExcellent (No Ratings Yet) 
Loading...

A few days ago at work, I stumbled upon a sneaky bug in our main app. The code looked innocent enough, and at first glance I couldn’t understand what was wrong… The code was similar to the following:

public async Task<bool> BookExistsAsync(int id)
{
    var store = await GetBookStoreAsync();
    var book = store.GetBookByIdAsync(id);
    return book != null;
}

// For completeness, here are the types and methods used in BookExistsAsync:

private Task<IBookStore> GetBookStoreAsync()
{
    // actual implementation irrelevant
    // ...
}


public interface IBookStore
{
    Task<Book> GetBookByIdAsync(int id);
    // other members omitted for brevity
}

public class Book
{
    public int Id { get; set; }
    // other members omitted for brevity
}

The BookExistsAsync method always returns true. Can you see why ?

Look at this line:

var book = store.GetBookByIdAsync(id);

Quick, what’s the type of book? If you answered Book, think again: it’s Task<Book>. The await is missing! And an async method always returns a non-null task, so book is never null.

When you have an async method with no await, the compiler warns you, but in this case there is an await on the line above. The only thing we do with book is to check that it’s not null; since Task<T> is a reference type, there’s nothing suspicious in comparing it to null. So, the compiler sees nothing wrong; the static code analyzer (ReSharper in this case) sees nothing wrong; and of course the feeble human brain reviewing the code sees nothing wrong either… Obviously, it could easily have been detected with adequate unit test coverage, but unfortunately this method wasn’t covered.

So, how to avoid this kind of mistake? Stop using var and always specify types explicitly? But I like var, I use it almost everywhere! Besides, I think it’s the first time I ever found a bug caused by the use of var. I’m really not willing to give it up…

Ideally, I would have liked ReSharper to spot the issue; perhaps it should consider all Task-returning methods to be implicitly [NotNull], unless specified otherwise. Until then, I don’t have a silver bullet against this issue; just pay attention when you call an async method, and write unit tests!

Test driving C# 7 features in Visual Studio “15” Preview

Very poorPoorAverageGoodExcellent (1 votes) 
Loading...

About two weeks ago, Microsoft released the first preview of the next version of Visual Studio. You can read about what’s new in the release notes. Some of the new features are really nice (for instance I love the new “lightweight installer”), but the most interesting for me is that it comes with a version of the compiler that includes a few of the features planned for C# 7. Let’s have a closer look at them!

Enabling the new features

The new features are not enabled by default. You can enable them individually with /feature: command line switches, but the easiest way is to enable them all by adding __DEMO__ and __DEMO_EXPERIMENTAL__ to the conditional compilation symbols (in Project properties, Build tab).

Local functions

Most functional languages allow you to declare functions in the body of other functions. It’s now possible to do the same in C# 7! The syntax for declaring a method inside another is pretty much what you would expect:

long Factorial(int n)
{
    long Fact(int i, long acc)
    {
        return i == 0 ? acc : Fact(i - 1, acc * i);
    }
    return Fact(n, 1);
}

Here, the Fact method is local to the Factorial method (in case you’re wondering, it’s a tail-recursive implementation of the factorial — which doesn’t make much sense, since C# doesn’t support tail recursion, but it’s just an example).

Of course, it was already possible to simulate local functions with lambda expressions, but there were a few drawbacks:

  • it’s less readable, because you have to declare the delegate type explicitly
  • it’s slower, due to the overhead of creating a delegate instance, and calling the delegate vs. calling the method directly
  • writing recursive lambdas is a bit awkward

Local functions have the following benefits:

  • when a method is only used as a helper for another method, making it local makes the relation more obvious
  • like lambdas, local functions can capture local variables and parameters of their containing method
  • local functions support recursion like any normal method

You can read more about this feature in the Roslyn Github repository.

Ref returns and ref locals

Since the first version of C#, it has always been possible to pass parameters by reference, which is conceptually similar to passing a pointer to a variable in languages like C. Until now, this feature was limited to parameters, but in C# 7 it becomes possible to return values by reference, or to have local variables that refer to the location of another variable. Here’s an example:

static void TestRefReturn()
{
    var foo = new Foo();
    Console.WriteLine(foo); // 0, 0
    
    foo.GetByRef("x") = 42;

    ref int y = ref foo.GetByRef("y");
    y = 99;

    Console.WriteLine(foo); // 42, 99
}

class Foo
{
    private int x;
    private int y;

    public ref int GetByRef(string name)
    {
        if (name == "x")
            return ref x;
        if (name == "y")
            return ref y;
        throw new ArgumentException(nameof(name));
    }

    public override string ToString() => $"{x},{y}";
}

Let’s have a closer look at this code.

  • On line 6, it looks like I’m assigning a value to the return of a method; what does this even mean? Well, the GetByRef method returns a field of the Foo class by reference (note the ref int return type of GetByRef). So, if I pass "x" as an argument, it returns the x field by reference. If I assign a value to that, it actually assigns a value to the x field.
  • On line 8, instead of just assigning a value directly to the field returned by GetByRef, I use a ref local variable y. The local variable now shares the same memory location as the foo.y field. So if I assign a value to it, it changes the value of foo.y.

Note that you can also return an array location by reference:

private MyBigStruct[] array = new MyBigStruct[10];
private int current;

public ref MyBigStruct GetCurrentItem()
{
    return ref array[current];
}

It’s likely that most C# developers will never actually need this feature; it’s pretty low level, and not the kind of thing you typically need when writing line-of-business applications. However it’s very useful for code whose performance is critical: copying a large structure is expensive, so if we can return it by reference instead, it can be a non-negligible performance benefit.

You can learn more about this feature on Github.

Pattern matching

Pattern matching is a feature very common in functional languages. C# 7 introduces some aspects of pattern matching, in the form of extensions to the is operator. For instance, when testing the type of a variable, it lets you introduce a new variable after the type, so that this variable is assigned with the left-hand side operand of the is, but with the type specified as the right-hand side operand (it will be clearer with an example).

Typically, if you need to test that a value is of type DateTime, then do something with that DateTime, you need to test the type, then cast to that type:

object o = GetValue();
if (o is DateTime)
{
    var d = (DateTime)o;
    // Do something with d
}

In C# 7, you can do this instead:

object o = GetValue();
if (o is DateTime d)
{
    // Do something with d
}

d is now declared directly as part of the o is DateTime expression.

This feature can also be used in a switch statement:

object v = GetValue();
switch (v)
{
    case string s:
        Console.WriteLine($"{v} is a string of length {s.Length}");
        break;
    case int i:
        Console.WriteLine($"{v} is an {(i % 2 == 0 ? "even" : "odd")} int");
        break;
    default:
        Console.WriteLine($"{v} is something else");
        break;
}

In this code, each case introduces a variable of the appropriate type, which you can use in the body of the case.

So far I only covered pattern matching against a simple type, but there are also more advanced forms. For instance:

switch (DateTime.Today)
{
    case DateTime(*, 10, 31):
        Console.WriteLine("Happy Halloween!");
        break;
    case DateTime(var year, 7, 4) when year > 1776:
        Console.WriteLine("Happy Independence Day!");
        break;
    case DateTime { DayOfWeek is DayOfWeek.Saturday }:
    case DateTime { DayOfWeek is DayOfWeek.Sunday }:
        Console.WriteLine("Have a nice week-end!");
        break;
    default:
        break;
}

How cool is that!

There’s also another (still experimental) form of pattern matching, using a new match keyword:

object o = GetValue();
string description = o match
    (
        case string s : $"{o} is a string of length {s.Length}"
        case int i : $"{o} is an {(i % 2 == 0 ? "even" : "odd")} int"
        case * : $"{o} is something else"
    );

It’s very similar to a switch, except that it’s an expression, not a statement.

There’s a lot more to pattern matching than what I mentioned here. You can look at the spec on Github for more details.

Binary literals and digit separators

These features were not explicitly mentioned in the VS Preview release notes, but I noticed they were included as well. They were initially planned for C# 6, but didn’t make it in the final release. They’re back in C# 7.

You can now write numeric literal in binary, in addition to decimal an hexadecimal:

int x = 0b11001010;

Very convenient to define bit masks!

To make large numbers more readable, you can also group digits by introducing separators. This can be used for decimal, hexadecimal or binary literals:

int oneBillion = 1_000_000_000;
int foo = 0x7FFF_1234;
int bar = 0b1001_0110_1010_0101;

Conclusion

So, with Visual Studio “15” Preview, you can start experimenting with the new C# 7 features; don’t hesitate to share your feedback on Github! And keep in mind that it’s still pre-release software, lots of things can change before the final release.

Posted in Uncategorized. Tags: , , . 2 Comments »

Using multiple cancellation sources with CreateLinkedTokenSource

Very poorPoorAverageGoodExcellent (No Ratings Yet) 
Loading...

Async programming in C# used to be hard; thanks to .NET 4’s Task Parallel Library and C# 5’s async/await feature, it has become fairly easy, and as a result, is becoming much more common. At the same time, a standardized approach to cancellation has been introduced : cancellation tokens. The basic idea is that you create a CancellationTokenSource that controls the cancellation, and pass the token it provides to the method that you want to be able to cancel. That method will then pass it to the other methods it calls, if they can be canceled, and/or regularly check if cancellation was requested. Upon cancellation, the method will typically throw an OperationCanceledException. Quick and dirty example:

private readonly IBusinessService _businessService;
private CancellationTokenSource _cancellationSource;
private Task _asyncOperation;

private void StartAsyncOperation()
{
    if (_asyncOperation != null)
        return;
    var _cancellationSource = new CancellationTokenSource();
    _asyncOperation = _businessService.DoSomethingAsync(_cancellationSource.Token);
}

// async void is bad; like I said, this is a quick and dirty example
private async void StopAsyncOperation()
{
    try
    {
        _cancellationSource.Cancel();
        // wait for the operation to finish
        await _asyncOperation;
    }
    catch (OperationCanceledException)
    {
        // Operation was successfully canceled
    }
    catch (Exception)
    {
        // Oops, something went wrong
    }
    finally
    {
        _asyncOperation = null;
        _cancellationSource.Dispose();
        _cancellationSource = null;
    }

...

class BusinessService : IBusinessService
{
    public async Task DoSomethingAsync(CancellationToken cancellationToken)
    {
        var data = await GetDataFromServerAsync(cancellationToken);
        foreach (string line in data)
        {
            cancellationToken.ThrowIfCancellationRequested();
            await ProcessLineAsync(line, cancellationToken);
        }
    }

    ...
}

In this case, StopAsyncOperation would be called, for instance, if the user chooses to abort the operation.

This all works pretty well and is rather easy to setup. But what if there is another reason to cancel the operation, known only by the BusinessService and outside the control of the calling method? That’s where the CancellationSource.CreateLinkedTokenSource method comes into play; basically, this method creates a cancellation source that will be canceled when any of the specified tokens is canceled.

Let’s start with a simple case: you have another cancellation token that you also want to take into account. The code would look like this:

    public async Task DoSomethingAsync(CancellationToken cancellationToken)
    {
        var otherToken = GetOtherCancellationToken();
        using (var linkedCts = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken, otherToken))
        {
            var data = await GetDataFromServerAsync(linkedCts.Token);
            foreach (string line in data)
            {
                linkedCts.Token.ThrowIfCancellationRequested();
                await ProcessLineAsync(line, linkedCts.Token);
            }
        }
    }

We created a linked cancellation source based on the two cancellation tokens, then used the token from this new source instead of cancellationToken. If either cancellationToken or otherToken is canceled, linkedCts.Token will be canceled as well. If necessary, the calling code can detect how the operation was canceled by checking the CancellationToken property of the OperationCanceledException.

Now let’s see a slightly more difficult case: the second cancellation source is actually an event. You want to cancel the operation when the event occurs, in addition to user cancellation represented by the cancellationToken parameter. So you need to subscribe to the event and trigger the cancellation when it occurs. Here’s a way to do it:

    public async Task DoSomethingAsync(CancellationToken cancellationToken)
    {
        using (var linkedCts = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken))
        {
            EventHandler handler = (sender, e) => linkedCts.Cancel();
            try
            {
                SomeEvent += handler;
                var data = await GetDataFromServerAsync(linkedCts.Token);
                foreach (string line in data)
                {
                    linkedCts.Token.ThrowIfCancellationRequested();
                    await ProcessLineAsync(line, linkedCts.Token);
                }
            }
            finally
            {
                SomeEvent -= handler;
            }
        }
    }

Here we only pass cancellationToken to CreateLinkedTokenSource, and we directly cancel linkedCts when the event is raised. The code is getting a bit convoluted, but it achieves the desired result.

I can’t really give you a specific real-world use case of this technique, because the cases where I used it are too specific to be of public interest, but I can outline the general scenario. I have a long running operation that is made up of multiple long running operations. The whole operation can be canceled globally, and each of the sub-operations can also be canceled individually, without affecting the others. Here’s the rough outline of what it looks like:

async Task GlobalOperationAsync(CancellationToken cancellationToken)
{
    foreach (var subOperation is SubOperations)
    {
        cancellationToken.ThrowIfCancellationRequested();
        var subToken = subOperation.GetSpecificCancellationToken();
        using (var linkedCts = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken, subToken))
        {
            try
            {
                await subOperation.RunAsync(linkedCts.Token);
            }
            catch (OperationCanceledException ex)
            {
                // Rethrow only if global cancellation was requested
                if (cancellationToken.IsCancellationRequested)
                    throw;
                    
                // otherwise continue running the other sub-operations
            }
        }
    }
}

Note that even though CancellationToken was introduced with the TPL and all the examples I gave were asynchronous, nothing prevents you from using this technique with synchronous code.

I hope you find this helpful. Have a great New Year’s Eve celebration and a happy new year!

Exception filters in C# 6: their biggest advantage is not what you think

Very poorPoorAverageGoodExcellent (6 votes) 
Loading...

Exception filters are one of the major new features of C# 6. They take advantage of a CLR feature that was there from the start, but wasn’t used in C# until now. They allow you to specify a condition on a catch block:

static void Main()
{
    try
    {
        Foo.DoSomethingThatMightFail(null);
    }
    catch (MyException ex) when (ex.Code == 42)
    {
        Console.WriteLine("Error 42 occurred");
    }
}

As you might expect, the catch block will be entered if and only if ex.Code == 42. If the condition is not verified, the exception will bubble up the stack until it’s caught somewhere else or terminates the process.

At first glance, this feature doesn’t seem to bring anything really new. After all, it has always been possible to do this:

static void Main()
{
    try
    {
        Foo.DoSomethingThatMightFail(null);
    }
    catch (MyException ex)
    {
        if (ex.Code == 42)
            Console.WriteLine("Error 42 occurred");
        else
            throw;
    }
}

Since this piece of code is equivalent to the previous one, exception filters are just syntactic sugar, aren’t they? I mean, they are equivalent, right?

WRONG!

Stack unwinding

There is actually a subtle but important difference: exception filters don’t unwind the stack. OK, but what does that mean?

When you enter a catch block, the stack is unwound: this means that the stack frames for the method calls “deeper” than the current method are dropped. This implies that all information about current execution state in those stack frames is lost, making it harder to identify the root cause of the exception.

Let’s assume that DoSomethingThatMightFail throws a MyException with the code 123, and the debugger is configured to break only on uncaught exceptions.

  • In the code that doesn’t use exception filters, the catch block is always entered (based on the type of the exception), and the stack is immediately unwound. Since the exception doesn’t satisfy the condition, it is rethrown. So the debugger will break on the throw; in the catch block; no information on the execution state of the DoSomethingThatMightFail method will be available. In other words, we won’t know what was going on in the method that threw the exception.
  • In the code with exception filters, on the other hand, the filter won’t match, so the catch block won’t be entered at all, and the stack won’t be unwound. The debugger will break in the DoSomethingThatMightFail method, making it easy to see what was going on when the exception was thrown.

Of course, when you’re debugging directly in Visual Studio, you can configure the debugger to break as soon as an exception is thrown, whether it’s caught or not. But you don’t always have that luxury; for instance, if you’re debugging an application in production, you often have just a crash dump to work with, so the fact that the stack wasn’t unwound becomes very useful, since it lets you see what was going on in the method that threw the exception.

Stack vs. stack trace

You may have noticed that I talked about the stack, not the stack trace. Even though it’s common to refer to “the stack” when we mean “the stack trace”, they’re not the same thing. The call stack is a piece of memory allocated to the thread, that contains information for each method call: return address, arguments, and local variables. The stack trace is just a string that contains the names of the methods currently on the call stack (and the location in those methods, if debug symbols are available). The Exception.StackTrace property contains the stack trace as it was when the exception was thrown, and is not affected when the stack is unwound; if you rethrow the same exception with throw;, it is left untouched. It is only overwritten if you rethrow the exception with throw ex;. The stack itself, on the other hand, is unwound when a catch block is entered, as discussed above.

Side effects

It’s interesting to note that an exception filter can contain any expression that returns a bool (well, almost… you can’t use await, for instance). It can be an inline condition, a property, a method call, etc. Technically, there’s nothing to prevent you from causing side effects in the exception filter. In most cases, I would strongly advise against doing that, as it can cause very confusing behavior; it can become really hard to understand the order in which things are executed. However, there is a common scenario that could benefit from side effects in exception filters: logging. You could easily create a method that logs the exception and returns false so that the catch block is not entered. This would allow logging exceptions on the fly without actually catching them, hence without unwinding the stack:

try
{
    DoSomethingThatMightFail(s);
}
catch (Exception ex) when (Log(ex, "An error occurred"))
{
    // this catch block will never be reached
}

...

static bool Log(Exception ex, string message, params object[] args)
{
    Debug.Print(message, args);
    return false;
}

Conclusion

As you can see, exception filters are not just syntactic sugar. Contrary to most C# 6 features, they’re not really a “coding” feature (in that they don’t make the code significantly clearer), but rather a “debugging” feature. Correctly understood and used, they can make it much easier to diagnose problems in your code.

Customizing string interpolation in C# 6

Very poorPoorAverageGoodExcellent (4 votes) 
Loading...

One of the major new features in C# 6 is string interpolation, which allows you to write things like this:

string text = $"{p.Name} was born on {p.DateOfBirth:D}";

A lesser known aspect of this feature is that an interpolated string can be treated either as a String, or as an IFormattable, depending on the context. When it is converted to an IFormattable, it constructs a FormattableString object that implements the interface and exposes:

  • the format string with the placeholders (“holes”) replaced by numbers (compatible with String.Format)
  • the values for the placeholders

The ToString() method of this object just calls String.Format(format, values). But there is also an overload that accepts an IFormatProvider, and this is where things get interesting, because it makes it possible to customize how the values are formatted. It might not be immediately obvious why this is useful, so let me give you a few examples…

Specifying the culture

During the design of the string interpolation feature, there was a lot of debate on whether to use the current culture or the invariant culture to format the values; there were good arguments on both sides, but eventually it was decided to use the current culture, for consistency with String.Format and similar APIs that use composite formatting. Using the current culture makes sense when you’re using string interpolation to build strings to be displayed in the user interface; but there are also scenarios where you want to build strings that will be consumed by an API or protocol (URLs, SQL queries…), and in those cases you usually want to use the invariant culture.

C# 6 provides an easy way to do that, by taking advantage of the conversion to IFormattable. You just need to create a method like this:

static string Invariant(FormattableString formattable)
{
    return formattable.ToString(CultureInfo.InvariantCulture);
}

And you can then use it as follows:

string text = Invariant($"{p.Name} was born on {p.DateOfBirth:D}");

The values in the interpolated strings will now be formatted with the invariant culture, rather than the default culture.

Building URLs

Here’s a more advanced example. String interpolation is a convenient way to build URLs, but if you include arbitrary strings in a URL, you need to be careful to URL-encode them. A custom string interpolator can do that for you; you just need to create a custom IFormatProvider that will take care of encoding the values. The implementation was not obvious at first, but after some trial and error I came up with this:

class UrlFormatProvider : IFormatProvider
{
    private readonly UrlFormatter _formatter = new UrlFormatter();

    public object GetFormat(Type formatType)
    {
        if (formatType == typeof(ICustomFormatter))
            return _formatter;
        return null;
    }

    class UrlFormatter : ICustomFormatter
    {
        public string Format(string format, object arg, IFormatProvider formatProvider)
        {
            if (arg == null)
                return string.Empty;
            if (format == "r")
                return arg.ToString();
            return Uri.EscapeDataString(arg.ToString());
        }
    }
}

You can use the formatter like this:

static string Url(FormattableString formattable)
{
    return formattable.ToString(new UrlFormatProvider());
}

...

string url = Url($"http://foobar/item/{id}/{name}");

It will correctly encode the values of id and name so that the resulting URL only contains valid characters.

Aside: Did you notice the if (format == "r")? It’s a custom format specifier to indicate that the value should not be encoded (“r” stands for “raw”). To use it you just include it in the format string like this: {id:r}. This will prevent the encoding of id.

Building SQL queries

You can do something similar for SQL queries. Of course, it’s a known bad practice to embed values directly in the query, for security and performance reasons (you should use parameterized queries instead); but for “quick and dirty” developments it can still be useful. And anyway, it’s a good illustration for the feature. When embedding values in a SQL queries, you should:

  • enclose strings in single quotes, and escape single quotes inside the strings by doubling them
  • format dates according to what the DBMS expects (typically MM/dd/yyyy)
  • format numbers using the invariant culture
  • replace null values with the NULL literal

(there are probably other things to take care of, but these are the most obvious).

We can use the same approach as for URLs and create a SqlFormatProvider:

class SqlFormatProvider : IFormatProvider
{
    private readonly SqlFormatter _formatter = new SqlFormatter();

    public object GetFormat(Type formatType)
    {
        if (formatType == typeof(ICustomFormatter))
            return _formatter;
        return null;
    }

    class SqlFormatter : ICustomFormatter
    {
        public string Format(string format, object arg, IFormatProvider formatProvider)
        {
            if (arg == null)
                return "NULL";
            if (arg is string)
                return "'" + ((string)arg).Replace("'", "''") + "'";
            if (arg is DateTime)
                return "'" + ((DateTime)arg).ToString("MM/dd/yyyy") + "'";
            if (arg is IFormattable)
                return ((IFormattable)arg).ToString(format, CultureInfo.InvariantCulture);
            return arg.ToString();
        }
    }
}

You can then use the formatter like this:

static string Sql(FormattableString formattable)
{
    return formattable.ToString(new SqlFormatProvider());
}

...

string sql = Sql($"insert into items(id, name, creationDate) values({id}, {name}, {DateTime.Now})");

This will take care of properly formatting the values to produce a valid SQL query.

Using string interpolation when targeting older versions of .NET

As is often the case for language features that leverage .NET framework types, you can use this feature with older versions of the framework that don’t have the FormattableString class; you just have to create the class yourself in the appropriate namespace. Actually, there are two classes to implement: FormattableString and FormattableStringFactory. Jon Skeet was apparently in a hurry to try this, and he has already provided an example with the code for these classes:

using System;

namespace System.Runtime.CompilerServices
{
    public class FormattableStringFactory
    {
        public static FormattableString Create(string messageFormat, params object[] args)
        {
            return new FormattableString(messageFormat, args);
        }

        public static FormattableString Create(string messageFormat, DateTime bad, params object[] args)
        {
            var realArgs = new object[args.Length + 1];
            realArgs[0] = "Please don't use DateTime";
            Array.Copy(args, 0, realArgs, 1, args.Length);
            return new FormattableString(messageFormat, realArgs);
        }
    }
}

namespace System
{
    public class FormattableString
    {
        private readonly string messageFormat;
        private readonly object[] args;

        public FormattableString(string messageFormat, object[] args)
        {
            this.messageFormat = messageFormat;
            this.args = args;
        }
        public override string ToString()
        {
            return string.Format(messageFormat, args);
        }
    }
}

This is the same approach that made it possible to use Linq when targeting .NET 2 (LinqBridge) or caller info attributes when targeting .NET 4 or earlier. Of course, it still requires the C# 6 compiler to work…

Conclusion

The conversion of interpolated strings to IFormattable had been mentioned previously, but it wasn’t implemented until recently; the just released CTP 6 of Visual Studio 2015 ships with a new version of the compiler that includes this feature, so you can now go ahead and use it. This feature makes string interpolation very flexible, and I’m sure people will come up with many other use cases that I didn’t think of.

You can find the code for the examples above on GitHub.

Optimize ToArray and ToList by providing the number of elements

Very poorPoorAverageGoodExcellent (2 votes) 
Loading...

The ToArray and ToList extension methods are convenient ways to eagerly materialize an enumerable sequence (e.g. a Linq query) into an array or a list. However, there’s something that bothers me: both of these methods are very inefficient if they don’t know the number of elements in the sequence (which is almost always the case when you use them on a Linq query). Let’s focus on ToArray for now (ToList has a few differences, but the principle is mostly the same).

Basically, ToArray takes a sequence, and returns an array that contains all the elements from the sequence. If the sequence implements ICollection<T>, it uses the Count property to allocate an array of the right size, and copy the elements into it; here’s an example:

List<User> users = GetUsers();
User[] array = users.ToArray();

In this scenario, ToArray is fairly efficient. Now, let’s change that code to extract just the names from the users:

List<User> users = GetUsers();
string[] array = users.Select(u => u.Name).ToArray();

Now, the argument of ToArray is an IEnumerable<User> returned by Select. It doesn’t implement ICollection<User>, so ToArray doesn’t know the number of elements, so it cannot allocate an array of the appropriate size. So here’s what it does:

  1. start by allocating a small array (4 elements in the current implementation)
  2. copy elements from the source into the array until the array is full
  3. if there are no more elements in the source, go to 7
  4. otherwise, allocate a new array, twice as large as the previous one
  5. copy the items from the old array to the new array
  6. repeat from step 2
  7. if the array is longer than the number of elements, trim it: allocate a new array with exactly the right size, and copy the elements from the previous array
  8. return the array

If there are few elements, this is quite painless; but for a very long sequence, it’s very inefficient, because of the many allocations and copies.

What is annoying is that, in many cases, we know the number of elements in the source! In the example above, we only use Select, which doesn’t change the number of elements, so we know that it’s the same as in the original list; but ToArray doesn’t know, because the information was lost along the way. If only we had a way to help it by providing this information ourselves….

Well, it’s actually very easy to do: all we have to do is create a new extension method that accepts the count as a parameter. Here’s what it might look like:

public static TSource[] ToArray<TSource>(this IEnumerable<TSource> source, int count)
{
    if (source == null) throw new ArgumentNullException("source");
    if (count < 0) throw new ArgumentOutOfRangeException("count");
    var array = new TSource[count];
    int i = 0;
    foreach (var item in source)
    {
        array[i++] = item;
    }
    return array;
}

Now we can optimize our previous example like this:

List<User> users = GetUsers();
string[] array = users.Select(u => u.Name).ToArray(users.Count);

Note that if you specify a count that is less than the actual number of elements in the sequence, you will get an IndexOutOfRangeException; it’s your responsibility to provide the correct count to the method.

So, what do we actually gain by doing that? From my benchmarks, this improved ToArray is about twice as fast as the built-in one, for a long sequence (tested with 1,000,000 elements). This is pretty good!

Note that we can improve ToList in the same way, by using the List<T> constructor that lets us specify the initial capacity:

public static List<TSource> ToList<TSource>(this IEnumerable<TSource> source, int count)
{
    if (source == null) throw new ArgumentNullException("source");
    if (count < 0) throw new ArgumentOutOfRangeException("count");
    var list = new List<TSource>(count);
    foreach (var item in source)
    {
        list.Add(item);
    }
    return list;
}

In this case, the performance gain is not as as big as for ToArray (about 25% instead of 50%), probably because the list doesn’t need to be trimmed, but it’s not negligible.

Obviously, a similar optimization could be made to ToDictionary as well, since the Dictionary<TKey, TValue> class also has a constructor that lets us specify the initial capacity.

The improved ToArray and ToList methods are available in my Linq.Extras library, which also provides many useful extension methods for working on sequences and collections.

Easy unit testing of null argument validation

Very poorPoorAverageGoodExcellent (No Ratings Yet) 
Loading...

When unit testing a method, one of the things to test is argument validation : for instance, ensure that the method throws a ArgumentNullException when a null argument is passed for a parameter that isn’t allowed to be null. Writing this kind of test is very easy, but it’s also a tedious and repetitive task, especially if the method has many parameters… So I wrote a method that automates part of this task: it tries to pass null for each of the specified arguments, and asserts that the method throws an ArgumentNullException. Here’s an example that tests a FullOuterJoin extension method:

[Test]
public void FullOuterJoin_Throws_If_Argument_Null()
{
    var left = Enumerable.Empty<int>();
    var right = Enumerable.Empty<int>();
    TestHelper.AssertThrowsWhenArgumentNull(
        () => left.FullOuterJoin(right, x => x, y => y, (k, x, y) => 0, 0, 0, null),
        "left", "right", "leftKeySelector", "rightKeySelector", "resultSelector");
}

The first parameter is a lambda expression that represents how to call the method. In this lambda, you should only pass valid arguments. The following parameters are the names of the parameters that are not allowed to be null. For each of the specified names, AssertThrowsWhenArgumentNull will replace the corresponding argument with null in the provided lambda, compile and invoke the lambda, and assert that the method throws a ArgumentNullException.

Using this method, instead of writing a test for each of the arguments that are not allowed to be null, you only need one test.

Here’s the code for the TestHelper.AssertThrowsWhenArgumentNull method (you can also find it on Gist):

using System;
using System.Linq;
using System.Linq.Expressions;
using NUnit.Framework;

namespace MyLibrary.Tests
{
    static class TestHelper
    {
        public static void AssertThrowsWhenArgumentNull(Expression<TestDelegate> expr, params string[] paramNames)
        {
            var realCall = expr.Body as MethodCallExpression;
            if (realCall == null)
                throw new ArgumentException("Expression body is not a method call", "expr");

            var realArgs = realCall.Arguments;
            var paramIndexes = realCall.Method.GetParameters()
                .Select((p, i) => new { p, i })
                .ToDictionary(x => x.p.Name, x => x.i);
            var paramTypes = realCall.Method.GetParameters()
                .ToDictionary(p => p.Name, p => p.ParameterType);
            
            

            foreach (var paramName in paramNames)
            {
                var args = realArgs.ToArray();
                args[paramIndexes[paramName]] = Expression.Constant(null, paramTypes[paramName]);
                var call = Expression.Call(realCall.Method, args);
                var lambda = Expression.Lambda<TestDelegate>(call);
                var action = lambda.Compile();
                var ex = Assert.Throws<ArgumentNullException>(action, "Expected ArgumentNullException for parameter '{0}', but none was thrown.", paramName);
                Assert.AreEqual(paramName, ex.ParamName);
            }
        }

    }
}

Note that it is written for NUnit, but can easily be adapted to other unit test frameworks.

I used this method in my Linq.Extras library, which provides many additional extension methods for working with sequences and collections (including the FullOuterJoin method mentioned above).

A review of NDepend

Very poorPoorAverageGoodExcellent (1 votes) 
Loading...

I’ve been hearing quite a lot about NDepend over the last few years, but I had never tried it until recently, when its creator Patrick Smacchia was kind enough to offer me a license.

NDepend is a static analysis tool for .NET that checks your code base against a large set of rules that fall in various categories, such as code quality, object-oriented design, architecture, naming conventions, etc. All of these rules are completely customizable. It can be used as a standalone tool, or as a Visual Studio extension; there is also a command-line tool to integrate in the build process.

I should note that it’s the first time I write a software review, so this exercise is completely new to me. Although I was offered a free license, I’m not affiliated with NDepend in any way, and I’ll do my best to be as fair and unbiased as possible.

Setup experience

NDepend doesn’t have an installer: it’s just a zip file that you extract into a folder. From there you can run the standalone tool (VisualNDepend.exe), and install the VS plugin (NDepend.Install.VisualStudioAddin.exe).

There is no UI to enter the license key either; you just drop the NDependProLicense.xml file into the NDepend folder.

Admittedly, this tool is intended for professional developers who shouldn’t have any problem with those steps, so it’s not that big a deal, but a more streamlined setup experience would have been nicer.

UI

Perhaps it’s just me, but I found the UI a little confusing; there are just too many windows and tooltips that pop open all the time (I used the tool mostly as a VS extension). NDepend needs a lot of screen space to work comfortably, and at home I only have one screen with a lower-than-average resolution, which made it a bit awkward to use for me.

To be fair, the Dashboard gives a pretty good overview of the project. In the VS extension, there is also an icon in the status bar that lets you see at a glance the code queries and rule violations (click the images to enlarge).

imageimage

You can also view a full report that is rendered as webpage and contains a lot of relevant information about your project.

image

This report can be customized to your specific needs in the NDepend project properties.

Code queries and rules

This is, in my opinion, the best thing about NDepend : the code inspection engine is extremely powerful and customizable. NDepend comes with a lot of default rules :

image

(in this screenshot I have already fixed all warnings, so all rules show a count of 0)

These rules are defined using a domain specific language called CQLinq, which allows you to write complex queries about your code using the familiar Linq syntax. For instance, here’s a simple one that checks for namespaces with few types:

image

The default rules often come with comments that give more information about the rule and explain why it’s relevant. As you can see, the code is mostly standard Linq, and the editor has syntax highlighting and Intellisense. NDepend’s code model includes about everything you could expect (classes, methods, etc), but also a lot of extra information like cyclomatic complexity, number of IL instructions, dependencies between classes or namespaces, etc. The result presentation is quite smart; depending on the output of the query, it shows namespaces, types or members organized by assembly. Result columns that contain lists can be clicked to view the elements of the list, and a click on a code item jumps to the location in code.

Each rule can be enabled or disabled, or set as critical or not. You can modify the default rules, or create your own. Note that rules don’t have to be warnings: you can create a code query that just reports information about your code:

image

So as you can see, CQLinq is a powerful way to check just about any design rule you care to enforce about your code.

Of course, the feature is not perfect… Here are a few downsides:

  • There are a lot of default rules. Arguably, that could also be counted as a quality, but the first time you run NDepend on your project, the sheer number of reported rule violations is quite overwhelming, and usually you don’t really care about most of them. So you have to spend quite a long time reviewing the results to decide which rules you really care about, which ones need to be adjusted to your need, etc. When I did it on a rather small project, it took me 2 hours to fix all warnings; not because I had a lot of things to fix in my code, but because I had a lot of things to fix in the rules! I’m not saying my code was perfect, obviously, and NDepend did help me find and fix a few issues, but many of the rules weren’t really relevant in my specific project. So, if you use NDepend, expect to spend a lot of time adjusting the rules to your needs; once you do that, the tool will really shine, and the analysis results will be a lot more useful to you.
  • There is no easy way to “suppress” a specific occurrence of a rule violation. For instance, in ReSharper you can suppress a warning with a special comment (and the quick fix menu lets you add that comment automatically); in FxCop, you can apply the [SuppressMessage] attribute to a type or member. There is nothing like that in NDepend; if you want to exclude a code item from a rule, you have to modify the code of the CQLinq query itself. Given the flexibility of the query language, it’s understandable that there is no generic way to suppress warnings, but still, it’s annoying; it also means that you can’t just reuse the exact same queries in other projects. There is however a nice feature that partly counterbalances the lack of a generic suppression mechanism: the JustMyCode context. It defines a “view” of the code that only includes your own code, not the code generated by designers or by the compiler. So you can query against the JustMyCode context to ignore rule violations in code that you didn’t write, and you can customize what is considered “not your code” using the same CQLinq syntax.
  • Queries that take IL statistics (number of IL instructions, IL cyclomatic complexity, etc) into account are often biased by complex code constructs such as iterator blocks, anonymous methods or async methods, which results in false positives. Some methods are complex at the IL level, and reported as such, even though the original C# code is rather straightforward.

Dependency management

I guess that’s the feature that gave the tool its name, even though now it does much more than that… NDepend can give you very detailed information about dependencies between assemblies and namespaces (your own, as well as framework or third party assemblies). The dependencies can be viewed as a directed graph:

image

Or as a matrix:

image

Both views are interactive; the matrix view can even be “drilled down” to view dependencies at a lower level.

I didn’t really take advantage of the dependency-related features, because I only tested NDepend on simple projects, but they can certainly be very useful in large solutions to eliminate unwanted coupling between different parts of the code.

Code evolution analysis

NDepend also lets you to compare analysis results between builds. Basically, you set a baseline for the comparison, and it gives you trends to measure the progress of various code metrics over time. I didn’t use this feature myself so I can’t really talk in detail about it, but its usefulness is quite obvious for large projects as it lets you see which aspects are improving or worsening, allowing you to refocus the team’s efforts as necessary.

Conclusion

I have to say that I’m very impressed by NDepend’s analysis engine; it’s incredibly powerful, and the fact that the rules are completely customizable opens a world of possibilities. I love the fact that I can just write a simple Linq query to find all classes or methods that match certain criteria. Regarding the other features, like dependency management, I’m sure they can be very useful, but most of the projects I work on are rather small, so dependencies are usually not a major issue for me.

The way I see it, NDepend is a great tool to keep close tabs on the architecture of large projects, but is probably overkill for small projects. It’s also very useful if you need to enforce strict design guidelines across a large code base; obviously, it won’t completely replace code review, but it can certainly be a big help in the review process.

In any case, NDepend has a lot of obvious qualities, but it’s probably not the right tool for everyone. The only way to decide if you need it or not is to try it for yourself, and see how it works out for you!

css.php