Optimize ToArray and ToList by providing the number of elements


The ToArray and ToList extension methods are convenient ways to eagerly materialize an enumerable sequence (e.g. a Linq query) into an array or a list. However, there’s something that bothers me: both of these methods are very inefficient if they don’t know the number of elements in the sequence (which is almost always the case when you use them on a Linq query). Let’s focus on ToArray for now (ToList has a few differences, but the principle is mostly the same).

Basically, ToArray takes a sequence, and returns an array that contains all the elements from the sequence. If the sequence implements ICollection<T>, it uses the Count property to allocate an array of the right size, and copies the elements into it; here’s an example:

List<User> users = GetUsers();
User[] array = users.ToArray();

In this scenario, ToArray is fairly efficient. Now, let’s change that code to extract just the names from the users:

List<User> users = GetUsers();
string[] array = users.Select(u => u.Name).ToArray();

Now, the argument of ToArray is an IEnumerable<string> returned by Select. It doesn’t implement ICollection<string>, so ToArray doesn’t know the number of elements and cannot allocate an array of the appropriate size. So here’s what it does:

  1. start by allocating a small array (4 elements in the current implementation)
  2. copy elements from the source into the array until the array is full
  3. if there are no more elements in the source, go to 7
  4. otherwise, allocate a new array, twice as large as the previous one
  5. copy the items from the old array to the new array
  6. repeat from step 2
  7. if the array is longer than the number of elements, trim it: allocate a new array with exactly the right size, and copy the elements from the previous array
  8. return the array
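To make the cost more concrete, here is a minimal sketch of the steps above. It is a simplified illustration, not the actual framework source; the class and method names (ToArraySketch, ToArrayByDoubling) are made up for this example.

```csharp
using System;
using System.Collections.Generic;

public static class ToArraySketch
{
    // Simplified illustration of the growth strategy described above;
    // NOT the real implementation of Enumerable.ToArray.
    public static T[] ToArrayByDoubling<T>(IEnumerable<T> source)
    {
        if (source == null) throw new ArgumentNullException("source");

        T[] buffer = new T[4];              // step 1: small initial array
        int count = 0;
        foreach (T item in source)
        {
            if (count == buffer.Length)
            {
                // steps 4-5: allocate an array twice as large, copy the old items
                T[] bigger = new T[buffer.Length * 2];
                Array.Copy(buffer, bigger, count);
                buffer = bigger;
            }
            buffer[count++] = item;         // step 2: fill the current array
        }

        if (count == buffer.Length)
            return buffer;                  // already exactly the right size

        // step 7: trim to the exact size
        T[] result = new T[count];
        Array.Copy(buffer, result, count);
        return result;
    }
}
```

With 1,000,000 elements, this sketch allocates 19 intermediate buffers (from 4 up to 1,048,576 slots) plus the final trimmed array, and copies the accumulated items each time.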

If there are few elements, this is quite painless; but for a very long sequence, it’s very inefficient, because of the many allocations and copies.

What is annoying is that, in many cases, we know the number of elements in the source! In the example above, we only use Select, which doesn’t change the number of elements, so we know that it’s the same as in the original list; but ToArray doesn’t know, because the information was lost along the way. If only we had a way to help it by providing this information ourselves….

Well, it’s actually very easy to do: all we have to do is create a new extension method that accepts the count as a parameter. Here’s what it might look like:

public static TSource[] ToArray<TSource>(this IEnumerable<TSource> source, int count)
{
    if (source == null) throw new ArgumentNullException("source");
    if (count < 0) throw new ArgumentOutOfRangeException("count");
    var array = new TSource[count];
    int i = 0;
    foreach (var item in source)
    {
        array[i++] = item;
    }
    return array;
}

Now we can optimize our previous example like this:

List<User> users = GetUsers();
string[] array = users.Select(u => u.Name).ToArray(users.Count);

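Note that if you specify a count that is less than the actual number of elements in the sequence, you will get an IndexOutOfRangeException; conversely, if the count is too large, the extra slots at the end of the array are simply left at their default value. It’s your responsibility to provide the correct count to the method.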

So, what do we actually gain by doing that? From my benchmarks, this improved ToArray is about twice as fast as the built-in one, for a long sequence (tested with 1,000,000 elements). This is pretty good!

Note that we can improve ToList in the same way, by using the List<T> constructor that lets us specify the initial capacity:

public static List<TSource> ToList<TSource>(this IEnumerable<TSource> source, int count)
{
    if (source == null) throw new ArgumentNullException("source");
    if (count < 0) throw new ArgumentOutOfRangeException("count");
    var list = new List<TSource>(count);
    foreach (var item in source)
    {
        list.Add(item);
    }
    return list;
}

In this case, the performance gain is not as big as for ToArray (about 25% instead of 50%), probably because the list doesn’t need to be trimmed, but it’s not negligible.

Obviously, a similar optimization could be made to ToDictionary as well, since the Dictionary<TKey, TValue> class also has a constructor that lets us specify the initial capacity.
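As a rough sketch of what that could look like (this overload is hypothetical; it is not part of System.Linq, and the class name is made up for the example):

```csharp
using System;
using System.Collections.Generic;

public static class DictionaryCapacityExtensions
{
    // Hypothetical overload: same idea as the ToArray/ToList versions,
    // the expected count is passed to the Dictionary constructor as its capacity.
    public static Dictionary<TKey, TSource> ToDictionary<TSource, TKey>(
        this IEnumerable<TSource> source,
        Func<TSource, TKey> keySelector,
        int count)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (keySelector == null) throw new ArgumentNullException("keySelector");
        if (count < 0) throw new ArgumentOutOfRangeException("count");

        var dictionary = new Dictionary<TKey, TSource>(count);
        foreach (var item in source)
        {
            // Add throws on duplicate keys, like the built-in ToDictionary
            dictionary.Add(keySelector(item), item);
        }
        return dictionary;
    }
}
```

Unlike the ToArray version, a wrong count is harmless here: the capacity is only a hint, and the dictionary still grows if there are more elements than announced.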

The improved ToArray and ToList methods are available in my Linq.Extras library, which also provides many useful extension methods for working on sequences and collections.

Easily convert file sizes to human-readable form


If you write an application that has anything to do with file management, you will probably need to display the size of the files. But if a file has a size of 123456789 bytes, it doesn’t mean that you should just display this value to the user, because it’s hard to read, and the user usually doesn’t need 1-byte precision. Instead, you will write something like 118 MB.

This should be a no-brainer, but there are actually a number of different ways to display byte sizes… For instance, there are several co-existing conventions for units and prefixes:

  • The SI (International System of Units) convention uses decimal multiples, based on powers of 10: 1 kilobyte is 1000 bytes, 1 megabyte is 1000 kilobytes, etc. The prefixes are the ones from the metric system (k, M, G, etc.).
  • The IEC convention uses binary multiples, based on powers of 2: 1 kibibyte is 1024 bytes, 1 mebibyte is 1024 kibibytes, etc. The prefixes are Ki, Mi, Gi etc., to avoid confusion with the metric system.
  • But neither of these conventions is commonly used: the customary convention is to use binary multiples (1024), but decimal prefixes (K, M, G, etc.).

Depending on the context, you might want to use any of these conventions. I’ve never seen the SI convention used anywhere; some apps (I’ve seen it in VirtualBox for instance) use the IEC convention; most apps and operating systems use the customary convention. You can read this Wikipedia article if you want more details: Binary prefix.

OK, so let’s choose the customary convention for now. Now you have to decide which scale to use: do you want to write 0.11 GB, 118 MB, 120563 KB, or 123456789 B? Typically, the scale is chosen so that the displayed value is between 1 and 1024.

A few more things you might have to consider:

  • Do you want to display integer values, or include a few decimal places?
  • Is there a minimum unit to use (for instance, Windows never uses bytes: a 1 byte file is displayed as 1 KB)?
  • How should the value be rounded?
  • How do you want to format the value?
  • For values less than 1KB, do you want to use the word “bytes”, or just the symbol “B”?
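To illustrate the scale selection described above, here is a minimal sketch that uses the customary convention (binary multiples, decimal-style prefixes) and picks the largest unit that keeps the value below 1024. The names (ByteSizeSketch, Format) are made up for this example, and the options listed above (minimum unit, rounding rule, full word for bytes) are deliberately ignored.

```csharp
using System;
using System.Globalization;

public static class ByteSizeSketch
{
    private static readonly string[] Units = { "B", "KB", "MB", "GB", "TB", "PB", "EB" };

    // Customary convention: divide by 1024 until the value is below 1024
    // (or until we run out of units), then round to the requested precision.
    public static string Format(long bytes, int decimalPlaces = 0)
    {
        if (bytes < 0) throw new ArgumentOutOfRangeException("bytes");
        if (decimalPlaces < 0) throw new ArgumentOutOfRangeException("decimalPlaces");

        double value = bytes;
        int unit = 0;
        while (value >= 1024 && unit < Units.Length - 1)
        {
            value /= 1024;
            unit++;
        }

        // Invariant culture keeps the output predictable for this sketch
        string number = value.ToString("F" + decimalPlaces, CultureInfo.InvariantCulture);
        return number + " " + Units[unit];
    }
}
```

For instance, ByteSizeSketch.Format(123456789) gives "118 MB". Even this small version has rough edges (a value just under 1024 KB can round up and be displayed as "1024 KB"), which is part of why the problem is less trivial than it looks.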

OK, enough of this! What’s your point?

So as you can see, displaying a byte size in human-readable form isn’t as straightforward as you might have expected… I’ve had to write code to do it in a number of apps, and I eventually got tired of doing it over and over again, so I wrote a library that attempts to cover all use cases. I called it HumanBytes, for reasons that should be obvious… It is also available as a NuGet package.

Its usage is quite simple. It’s based on a class named ByteSizeFormatter, which has a few properties to control how the value is rendered:

var formatter = new ByteSizeFormatter
{
    Convention = ByteSizeConvention.Binary,
    DecimalPlaces = 1,
    NumberFormat = "#,##0.###",
    MinUnit = ByteSizeUnit.Kilobyte,
    MaxUnit = ByteSizeUnit.Gigabyte,
    RoundingRule = ByteSizeRounding.Closest,
    UseFullWordForBytes = true,
};

var f = new FileInfo("TheFile.jpg");
Console.WriteLine("The size of '{0}' is {1}", f, formatter.Format(f.Length));

In most cases, though, you will just want to use the default settings. You can do that easily with the Bytes extension method:

var f = new FileInfo("TheFile.jpg");
Console.WriteLine("The size of '{0}' is {1}", f, f.Length.Bytes());

This method returns an instance of the ByteSize structure, whose ToString method formats the value using the default formatter. You can change the default formatter settings globally through the ByteSizeFormatter.Default static property.

A note on localization

Not all languages use the same symbol for “byte”, and obviously the word “byte” itself is different across languages. Currently the library only supports English and French; if you want your language to be supported as well, please fork, add your translation, and make a pull request. There are only 3 terms to translate, so it shouldn’t take long ;)


StringTemplate: another approach to string interpolation


With the upcoming version 6 of C#, there’s a lot of talk on CodePlex and elsewhere about string interpolation. Not very surprising, since it’s one of the major features of that release… In case you were living under a rock during the last few months and you haven’t heard about it, string interpolation is a way to insert C# expressions inside a string, so that they’re evaluated at runtime and replaced with their values. Basically, you write something like this:

string text = $"{p.Name} was born on {p.DateOfBirth:D}";

And the compiler transforms it to this:

string text = String.Format("{0} was born on {1:D}", p.Name, p.DateOfBirth);

Note: the syntax shown above is the one from the latest design notes about this feature. It might still change before the final release, and the current preview build of VS2015 uses a different syntax: “\{p.Name} was born on \{p.DateOfBirth:D}”.

I really love this feature. It’s going to be extremely convenient for things like logging, generating URLs or queries, etc. I will probably use it a lot, especially since Microsoft has listened to community feedback and included a way to customize how the embedded expressions are evaluated (see the part about IFormattable in the design notes).

There’s one thing that bothers me, though: since interpolated strings are interpreted by the compiler, they have to be hard-coded; you can’t extract them to resources for localization. This means that this feature cannot be used for localization, and we’re stuck with old-fashioned numeric placeholders in localized strings.

Or are we really?

For a few years now, I’ve been using a custom string interpolation engine that can be used like String.Format, but with named placeholders instead of numeric ones. It takes a format string, and an object with properties that match the placeholder names:

string text = StringTemplate.Format("{Name} was born on {DateOfBirth:D}", new { p.Name, p.DateOfBirth });

Obviously, if you already have an object with the properties you want to include in the string, you can just pass that object directly:

string text = StringTemplate.Format("{Name} was born on {DateOfBirth:D}", p);

The result is exactly what you would expect: the placeholders are replaced with the values of the corresponding properties.

In which ways is it better than String.Format?

  • It’s much more readable: a named placeholder tells you immediately which value will go there
  • It’s less error-prone: you don’t need to pay attention to the order of the values to be formatted
  • When you extract the format strings to resources for localization, the translator sees a name in the placeholder, not a number. This gives more context to the string, and makes it easier to understand what the final string will look like.

Note that you can use the same format specifiers as in String.Format. The StringTemplate class parses your format string into one compatible with String.Format, extracts the property values into an array, and calls String.Format.

Of course, parsing the string and extracting the property values with reflection every time would be very inefficient, so there are some optimizations:

  • each distinct format string is only parsed once, and the result of the parsing is added to a cache, to be reused every time.
  • for each property used in a format string, a getter delegate is generated and cached, to avoid using reflection every time.

This means that the first time you use a given format string, there will be the overhead of parsing and generating the delegates, but subsequent usages of the same format string will be much faster.
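To give an idea of the mechanism, here is a heavily simplified sketch of the approach described above: named placeholders are rewritten to numeric ones, the parsed result is cached per format string, and the final formatting is delegated to String.Format. This is not the actual NString code; the names (NamedFormatSketch, Parsed) are made up, escaped braces are not handled, and property values are read with plain reflection on each call instead of cached getter delegates.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Text.RegularExpressions;

public static class NamedFormatSketch
{
    private sealed class Parsed
    {
        public string NumericFormat;   // e.g. "{0} was born on {1:D}"
        public string[] Names;         // e.g. { "Name", "DateOfBirth" }
    }

    // Each distinct format string is parsed only once
    private static readonly ConcurrentDictionary<string, Parsed> Cache =
        new ConcurrentDictionary<string, Parsed>();

    // Matches {Name}, {Name:format} or {Name,alignment:format}
    private static readonly Regex Placeholder =
        new Regex(@"\{(?<name>[A-Za-z_]\w*)(?<rest>[,:][^}]*)?\}");

    public static string Format(string template, object values)
    {
        if (template == null) throw new ArgumentNullException("template");
        if (values == null) throw new ArgumentNullException("values");

        Parsed parsed = Cache.GetOrAdd(template, Parse);
        object[] args = parsed.Names.Select(n => GetValue(values, n)).ToArray();
        return string.Format(CultureInfo.InvariantCulture, parsed.NumericFormat, args);
    }

    private static Parsed Parse(string template)
    {
        var names = new List<string>();
        string numeric = Placeholder.Replace(template, m =>
        {
            string name = m.Groups["name"].Value;
            int index = names.IndexOf(name);
            if (index < 0) { index = names.Count; names.Add(name); }
            // Keep the alignment/format part untouched so String.Format can apply it
            return "{" + index + m.Groups["rest"].Value + "}";
        });
        return new Parsed { NumericFormat = numeric, Names = names.ToArray() };
    }

    private static object GetValue(object values, string name)
    {
        var property = values.GetType().GetProperty(name);
        if (property == null)
            throw new FormatException("No property named '" + name + "'");
        return property.GetValue(values, null);
    }
}
```

With this sketch, NamedFormatSketch.Format("{Name} is {Age:D3}", new { Name = "Ann", Age = 7 }) returns "Ann is 007".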

The StringTemplate class is part of a library called NString, which also contains a few extension methods to make string manipulations easier. The library is a PCL that can be used with all .NET flavors except Silverlight 5. A NuGet package is available here.

Passing parameters by reference to an asynchronous method


Asynchrony in C# 5 is awesome, and I’ve been using it a lot since it was introduced. But there are a few annoying limitations; for instance, you cannot pass parameters by reference (ref or out) to an asynchronous method. There are good reasons for that; the most obvious is that if you pass a local variable by reference, it is stored on the stack, but the current stack won’t remain available during the whole execution of the async method (only until the first await), so the location of the variable won’t exist anymore.

However, it’s pretty easy to work around that limitation: you only need to create a Ref<T> class to hold the value, and pass an instance of this class by value to the async method:

async void btnFilesStats_Click(object sender, EventArgs e)
{
    var count = new Ref<int>();
    var size = new Ref<ulong>();
    await GetFileStats(tbPath.Text, count, size);
    txtFileStats.Text = string.Format("{0} files ({1} bytes)", count, size);
}

async Task GetFileStats(string path, Ref<int> totalCount, Ref<ulong> totalSize)
{
    var folder = await StorageFolder.GetFolderFromPathAsync(path);
    foreach (var f in await folder.GetFilesAsync())
    {
        totalCount.Value += 1;
        var props = await f.GetBasicPropertiesAsync();
        totalSize.Value += props.Size;
    }
    foreach (var f in await folder.GetFoldersAsync())
    {
        await GetFileStats(f.Path, totalCount, totalSize);
    }
}

The Ref<T> class looks like this:

public class Ref<T>
{
    public Ref() { }
    public Ref(T value) { Value = value; }
    public T Value { get; set; }
    public override string ToString()
    {
        T value = Value;
        return value == null ? "" : value.ToString();
    }
    public static implicit operator T(Ref<T> r) { return r.Value; }
    public static implicit operator Ref<T>(T value) { return new Ref<T>(value); }
}

As you can see, it’s pretty straightforward. This approach can also be used in iterator blocks (i.e. yield return), which also don’t allow ref and out parameters. It also has an advantage over standard ref and out parameters: you can make the parameter optional, if for instance you’re not interested in the result (obviously, the callee must handle that case appropriately).
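Here is a small sketch of the optional-parameter idea. The method and its names (OptionalRefSketch, SumValidAsync, errorCount) are invented for the illustration, and Ref<T> is reduced to the essentials so the snippet is self-contained.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

// Same idea as the Ref<T> class above, reduced to the essentials
public class Ref<T>
{
    public T Value { get; set; }
}

public static class OptionalRefSketch
{
    // "errorCount" is optional: callers that don't care about it can omit it
    public static async Task<int> SumValidAsync(
        IEnumerable<string> inputs, Ref<int> errorCount = null)
    {
        int sum = 0;
        foreach (string input in inputs)
        {
            await Task.Yield(); // stand-in for real asynchronous work

            int n;
            if (int.TryParse(input, out n))
                sum += n;
            else if (errorCount != null) // the callee handles the "not provided" case
                errorCount.Value++;
        }
        return sum;
    }
}
```

A caller interested in the errors passes a new Ref<int>() and reads its Value after awaiting; a caller that isn’t simply calls SumValidAsync(inputs).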

Easy unit testing of null argument validation


When unit testing a method, one of the things to test is argument validation: for instance, ensure that the method throws an ArgumentNullException when a null argument is passed for a parameter that isn’t allowed to be null. Writing this kind of test is very easy, but it’s also a tedious and repetitive task, especially if the method has many parameters… So I wrote a method that automates part of this task: it tries to pass null for each of the specified arguments, and asserts that the method throws an ArgumentNullException. Here’s an example that tests a FullOuterJoin extension method:

[Test]
public void FullOuterJoin_Throws_If_Argument_Null()
{
    var left = Enumerable.Empty<int>();
    var right = Enumerable.Empty<int>();
    TestHelper.AssertThrowsWhenArgumentNull(
        () => left.FullOuterJoin(right, x => x, y => y, (k, x, y) => 0, 0, 0, null),
        "left", "right", "leftKeySelector", "rightKeySelector", "resultSelector");
}

The first parameter is a lambda expression that represents how to call the method. In this lambda, you should only pass valid arguments. The following parameters are the names of the parameters that are not allowed to be null. For each of the specified names, AssertThrowsWhenArgumentNull will replace the corresponding argument with null in the provided lambda, compile and invoke the lambda, and assert that the method throws an ArgumentNullException.

Using this method, instead of writing a test for each of the arguments that are not allowed to be null, you only need one test.

Here’s the code for the TestHelper.AssertThrowsWhenArgumentNull method (you can also find it on Gist):

using System;
using System.Linq;
using System.Linq.Expressions;
using NUnit.Framework;

namespace MyLibrary.Tests
{
    static class TestHelper
    {
        public static void AssertThrowsWhenArgumentNull(Expression<TestDelegate> expr, params string[] paramNames)
        {
            var realCall = expr.Body as MethodCallExpression;
            if (realCall == null)
                throw new ArgumentException("Expression body is not a method call", "expr");

            var realArgs = realCall.Arguments;
            var paramIndexes = realCall.Method.GetParameters()
                .Select((p, i) => new { p, i })
                .ToDictionary(x => x.p.Name, x => x.i);
            var paramTypes = realCall.Method.GetParameters()
                .ToDictionary(p => p.Name, p => p.ParameterType);

            foreach (var paramName in paramNames)
            {
                var args = realArgs.ToArray();
                args[paramIndexes[paramName]] = Expression.Constant(null, paramTypes[paramName]);
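                // realCall.Object is null for static (and extension) methods, so this also works for instance methods
                var call = Expression.Call(realCall.Object, realCall.Method, args);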
                var lambda = Expression.Lambda<TestDelegate>(call);
                var action = lambda.Compile();
                var ex = Assert.Throws<ArgumentNullException>(action, "Expected ArgumentNullException for parameter '{0}', but none was thrown.", paramName);
                Assert.AreEqual(paramName, ex.ParamName);
            }
        }
    }
}

Note that it is written for NUnit, but can easily be adapted to other unit test frameworks.

I used this method in my Linq.Extras library, which provides many additional extension methods for working with sequences and collections (including the FullOuterJoin method mentioned above).

Visual Studio Online + Git integration with Team Explorer


I recently started using Visual Studio Online for personal projects, and I must say it’s a pretty good platform, although it would be nice to be able to host public projects as well as private ones. The thing I like the most is the integration with Visual Studio Team Explorer to manage work items and builds.

However, I noticed a little gotcha when using Git for source control: the remote for VS Online must be named origin, otherwise Team Explorer won’t detect that it’s a VS Online project, and it won’t show the “Builds” and “Work items” pages.

[Screenshots: Team Explorer when the VSO remote is named "origin", and when it is named "vso"]

This is obviously a bug (although a minor one), since the name origin is just a convention and a git remote can have any name; I reported it on Connect. If you encounter it you can easily work around it by renaming your remote to origin:

git remote rename vso origin

A review of NDepend


I’ve been hearing quite a lot about NDepend over the last few years, but I had never tried it until recently, when its creator Patrick Smacchia was kind enough to offer me a license.

NDepend is a static analysis tool for .NET that checks your code base against a large set of rules that fall in various categories, such as code quality, object-oriented design, architecture, naming conventions, etc. All of these rules are completely customizable. It can be used as a standalone tool, or as a Visual Studio extension; there is also a command-line tool to integrate in the build process.

I should note that this is the first time I’ve written a software review, so this exercise is completely new to me. Although I was offered a free license, I’m not affiliated with NDepend in any way, and I’ll do my best to be as fair and unbiased as possible.

Setup experience

NDepend doesn’t have an installer: it’s just a zip file that you extract into a folder. From there you can run the standalone tool (VisualNDepend.exe), and install the VS plugin (NDepend.Install.VisualStudioAddin.exe).

There is no UI to enter the license key either; you just drop the NDependProLicense.xml file into the NDepend folder.

Admittedly, this tool is intended for professional developers who shouldn’t have any problem with those steps, so it’s not that big a deal, but a more streamlined setup experience would have been nicer.

UI

Perhaps it’s just me, but I found the UI a little confusing; there are just too many windows and tooltips that pop open all the time (I used the tool mostly as a VS extension). NDepend needs a lot of screen space to work comfortably, and at home I only have one screen with a lower-than-average resolution, which made it a bit awkward to use for me.

To be fair, the Dashboard gives a pretty good overview of the project. In the VS extension, there is also an icon in the status bar that lets you see at a glance the code queries and rule violations.

[Screenshots: the NDepend dashboard and the status bar icon in Visual Studio]

You can also view a full report that is rendered as a web page and contains a lot of relevant information about your project.

[Screenshot: the NDepend report]

This report can be customized to your specific needs in the NDepend project properties.

Code queries and rules

This is, in my opinion, the best thing about NDepend: the code inspection engine is extremely powerful and customizable. NDepend comes with a lot of default rules:

[Screenshot: the list of default rules]

(in this screenshot I have already fixed all warnings, so all rules show a count of 0)

These rules are defined using a domain specific language called CQLinq, which allows you to write complex queries about your code using the familiar Linq syntax. For instance, here’s a simple one that checks for namespaces with few types:

[Screenshot: the CQLinq query for the rule about namespaces with few types]

The default rules often come with comments that give more information about the rule and explain why it’s relevant. As you can see, the code is mostly standard Linq, and the editor has syntax highlighting and Intellisense. NDepend’s code model includes about everything you could expect (classes, methods, etc), but also a lot of extra information like cyclomatic complexity, number of IL instructions, dependencies between classes or namespaces, etc. The result presentation is quite smart; depending on the output of the query, it shows namespaces, types or members organized by assembly. Result columns that contain lists can be clicked to view the elements of the list, and a click on a code item jumps to the location in code.

Each rule can be enabled or disabled, or set as critical or not. You can modify the default rules, or create your own. Note that rules don’t have to be warnings: you can create a code query that just reports information about your code:

[Screenshot: a code query that only reports information]

So as you can see, CQLinq is a powerful way to check just about any design rule you care to enforce about your code.

Of course, the feature is not perfect… Here are a few downsides:

  • There are a lot of default rules. Arguably, that could also be counted as a quality, but the first time you run NDepend on your project, the sheer number of reported rule violations is quite overwhelming, and usually you don’t really care about most of them. So you have to spend quite a long time reviewing the results to decide which rules you really care about, which ones need to be adjusted to your need, etc. When I did it on a rather small project, it took me 2 hours to fix all warnings; not because I had a lot of things to fix in my code, but because I had a lot of things to fix in the rules! I’m not saying my code was perfect, obviously, and NDepend did help me find and fix a few issues, but many of the rules weren’t really relevant in my specific project. So, if you use NDepend, expect to spend a lot of time adjusting the rules to your needs; once you do that, the tool will really shine, and the analysis results will be a lot more useful to you.
  • There is no easy way to “suppress” a specific occurrence of a rule violation. For instance, in ReSharper you can suppress a warning with a special comment (and the quick fix menu lets you add that comment automatically); in FxCop, you can apply the [SuppressMessage] attribute to a type or member. There is nothing like that in NDepend; if you want to exclude a code item from a rule, you have to modify the code of the CQLinq query itself. Given the flexibility of the query language, it’s understandable that there is no generic way to suppress warnings, but still, it’s annoying; it also means that you can’t just reuse the exact same queries in other projects. There is however a nice feature that partly counterbalances the lack of a generic suppression mechanism: the JustMyCode context. It defines a “view” of the code that only includes your own code, not the code generated by designers or by the compiler. So you can query against the JustMyCode context to ignore rule violations in code that you didn’t write, and you can customize what is considered “not your code” using the same CQLinq syntax.
  • Queries that take IL statistics (number of IL instructions, IL cyclomatic complexity, etc) into account are often biased by complex code constructs such as iterator blocks, anonymous methods or async methods, which results in false positives. Some methods are complex at the IL level, and reported as such, even though the original C# code is rather straightforward.

Dependency management

I guess that’s the feature that gave the tool its name, even though now it does much more than that… NDepend can give you very detailed information about dependencies between assemblies and namespaces (your own, as well as framework or third party assemblies). The dependencies can be viewed as a directed graph:

[Screenshot: the dependency graph]

Or as a matrix:

[Screenshot: the dependency matrix]

Both views are interactive; the matrix view can even be “drilled down” to view dependencies at a lower level.

I didn’t really take advantage of the dependency-related features, because I only tested NDepend on simple projects, but they can certainly be very useful in large solutions to eliminate unwanted coupling between different parts of the code.

Code evolution analysis

NDepend also lets you compare analysis results between builds. Basically, you set a baseline for the comparison, and it gives you trends to measure the progress of various code metrics over time. I didn’t use this feature myself so I can’t really talk in detail about it, but its usefulness is quite obvious for large projects as it lets you see which aspects are improving or worsening, allowing you to refocus the team’s efforts as necessary.

Conclusion

I have to say that I’m very impressed by NDepend’s analysis engine; it’s incredibly powerful, and the fact that the rules are completely customizable opens a world of possibilities. I love the fact that I can just write a simple Linq query to find all classes or methods that match certain criteria. Regarding the other features, like dependency management, I’m sure they can be very useful, but most of the projects I work on are rather small, so dependencies are usually not a major issue for me.

The way I see it, NDepend is a great tool to keep close tabs on the architecture of large projects, but is probably overkill for small projects. It’s also very useful if you need to enforce strict design guidelines across a large code base; obviously, it won’t completely replace code review, but it can certainly be a big help in the review process.

In any case, NDepend has a lot of obvious qualities, but it’s probably not the right tool for everyone. The only way to decide if you need it or not is to try it for yourself, and see how it works out for you!

[WPF] Declare global hotkeys in XAML with NHotkey


A common requirement for desktop applications is to handle system-wide hotkeys, in order to intercept keyboard shortcuts even when they don’t have focus. Unfortunately, there is no built-in feature in the .NET framework to do it.

Of course, this is not a new issue, and there are quite a few open-source libraries that address it (e.g. VirtualInput). Most of them rely on a global system hook, which allows them to intercept all keystrokes, even the ones you’re not interested in. I used some of those libraries before, but I’m not really happy with them:

  • they’re often tied to a specific UI framework (usually Windows Forms), which makes them a bit awkward to use in another UI framework (like WPF)
  • I don’t really like the approach of intercepting all keystrokes. It usually means that you end up with a big method with lots of if/else if to decide what to do based on which keys were pressed.

A better option, in my opinion, is to listen only to the keys you’re interested in, and declare what to do for each of those. The approach used in WPF for key bindings is quite elegant:

<Window.InputBindings>
    <KeyBinding Gesture="Ctrl+Alt+Add" Command="{Binding IncrementCommand}" />
    <KeyBinding Gesture="Ctrl+Alt+Subtract" Command="{Binding DecrementCommand}" />
</Window.InputBindings>

But of course, key bindings are not global, they require that your app has focus… What if we could change that?

NHotkey is a very simple hotkey library that enables global key bindings. All you have to do is set an attached property to true on your key bindings:

<Window.InputBindings>
    <KeyBinding Gesture="Ctrl+Alt+Add" Command="{Binding IncrementCommand}"
                HotkeyManager.RegisterGlobalHotkey="True" />
    <KeyBinding Gesture="Ctrl+Alt+Subtract" Command="{Binding DecrementCommand}"
                HotkeyManager.RegisterGlobalHotkey="True" />
</Window.InputBindings>

And that’s it; the commands defined in the key bindings will now be invoked even if your app doesn’t have focus!

You can also use NHotkey from code:

HotkeyManager.Current.AddOrReplace("Increment", Key.Add, ModifierKeys.Control | ModifierKeys.Alt, OnIncrement);
HotkeyManager.Current.AddOrReplace("Decrement", Key.Subtract, ModifierKeys.Control | ModifierKeys.Alt, OnDecrement);

The library takes advantage of the RegisterHotKey function from the Win32 API. Because it also supports Windows Forms, it is split into 3 parts, so that you don’t need to reference the WinForms assembly from a WPF app or vice versa:

  • The core library, which handles the hotkey registration itself, independently of any specific UI framework. This library is not directly usable, but is used by the other two.
  • The WinForms-specific API, which uses the Keys enumeration from System.Windows.Forms
  • The WPF-specific API, which uses the Key and ModifierKeys enumerations from System.Windows.Input, and supports global key bindings in XAML.

If you install the library from NuGet, add either the NHotkey.Wpf or the NHotkey.WindowsForms package; the core package will be added automatically.
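For the curious, here is roughly what calling the underlying Win32 function through P/Invoke looks like. This is only a sketch to show the native API that a library like this wraps, not NHotkey’s actual code; the class and helper names (NativeHotkeySketch, RegisterIncrement) are made up, and it only works on Windows, with a real window handle that processes the WM_HOTKEY message.

```csharp
using System;
using System.Runtime.InteropServices;

public static class NativeHotkeySketch
{
    [Flags]
    public enum Modifiers : uint
    {
        None = 0x0000,
        Alt = 0x0001,      // MOD_ALT
        Control = 0x0002,  // MOD_CONTROL
        Shift = 0x0004,    // MOD_SHIFT
        Win = 0x0008       // MOD_WIN
    }

    public const int WM_HOTKEY = 0x0312; // message posted to the window when the hotkey is pressed
    public const uint VK_ADD = 0x6B;     // virtual-key code of the numeric keypad "+" key

    [DllImport("user32.dll", SetLastError = true)]
    [return: MarshalAs(UnmanagedType.Bool)]
    public static extern bool RegisterHotKey(IntPtr hWnd, int id, uint fsModifiers, uint vk);

    [DllImport("user32.dll", SetLastError = true)]
    [return: MarshalAs(UnmanagedType.Bool)]
    public static extern bool UnregisterHotKey(IntPtr hWnd, int id);

    // Hypothetical helper: registers Ctrl+Alt+Add for the given window handle.
    // Returns false if the registration failed (e.g. the hotkey is already taken).
    public static bool RegisterIncrement(IntPtr windowHandle, int id)
    {
        uint modifiers = (uint)(Modifiers.Control | Modifiers.Alt);
        return RegisterHotKey(windowHandle, id, modifiers, VK_ADD);
    }
}
```

The window procedure then has to watch for WM_HOTKEY and dispatch on the id, and the hotkey should be released with UnregisterHotKey when it is no longer needed; hiding that plumbing is the point of using a library.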

Tackling timeout issues when uploading large files with HttpWebRequest


If you ever had to upload large volumes of data over HTTP, you probably ran into timeout issues. The default Timeout value for HttpWebRequest is 100 seconds, which means that if it takes more than that from the time you send the request headers to the time you receive the response headers, your request will fail. Obviously, if you’re uploading a large file, you need to increase that timeout… but to which value?

If you know the available bandwidth, you could calculate a rough estimate of how long it should take to upload the file, but it’s not very reliable, because if there is some network congestion, it will take longer, and your request will fail even though it could have succeeded given enough time. So, should you set the timeout to a very large value, like several hours, or even Timeout.Infinite? Probably not. The most compelling reason is that even though the transfer itself could take hours, some phases of the exchange shouldn’t take that long. Let’s decompose the phases of an HTTP upload:

[Figure: timeout1]

Obtaining the request stream or getting the response (orange parts) isn’t supposed to take very long, so obviously we need a rather short timeout there (the default value of 100 seconds seems reasonable). But sending the request body (blue part) could take much longer, and there is no reliable way to decide how long that should be; as long as we keep sending data and the server is receiving it, there is no reason not to continue, even if it’s taking hours. So we actually don’t want a timeout at all there! Unfortunately, the behavior of the Timeout property is to consider everything from the call to GetRequestStream to the return of GetResponse.

In my opinion, it’s a design flaw of the HttpWebRequest class, and one that has bothered me for a very long time. So I eventually came up with a solution. It relies on the fact that the asynchronous versions of GetRequestStream and GetResponse don’t have a timeout mechanism. Here’s what the documentation says:

The Timeout property has no effect on asynchronous requests made with the BeginGetResponse or BeginGetRequestStream method.

In the case of asynchronous requests, the client application implements its own time-out mechanism. Refer to the example in the BeginGetResponse method.

So, a solution could be to use these methods directly (or the new Task-based versions: GetRequestStreamAsync and GetResponseAsync); but more often than not, you already have an existing code base that uses the synchronous methods, and changing the code to make it fully asynchronous is usually not trivial. So, the easy approach is to create synchronous wrappers around BeginGetRequestStream and BeginGetResponse, with a way to specify a timeout for these operations:

    public static class WebRequestExtensions
    {
        public static Stream GetRequestStreamWithTimeout(
            this WebRequest request,
            int? millisecondsTimeout = null)
        {
            return AsyncToSyncWithTimeout(
                request.BeginGetRequestStream,
                request.EndGetRequestStream,
                millisecondsTimeout ?? request.Timeout);
        }

        public static WebResponse GetResponseWithTimeout(
            this HttpWebRequest request,
            int? millisecondsTimeout = null)
        {
            return AsyncToSyncWithTimeout(
                request.BeginGetResponse,
                request.EndGetResponse,
                millisecondsTimeout ?? request.Timeout);
        }

        private static T AsyncToSyncWithTimeout<T>(
            Func<AsyncCallback, object, IAsyncResult> begin,
            Func<IAsyncResult, T> end,
            int millisecondsTimeout)
        {
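            // Start the asynchronous operation, then block until it
            // completes or the timeout elapses
            var iar = begin(null, null);
            if (!iar.AsyncWaitHandle.WaitOne(millisecondsTimeout))
            {
                // Report the timeout as a WebException with status Timeout
                var ex = new TimeoutException();
                throw new WebException(ex.Message, ex, WebExceptionStatus.Timeout, null);
            }
            return end(iar);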
        }
    }

(Note that I used the Begin/End methods rather than the Async methods, in order to keep compatibility with older versions of .NET.)

These extension methods can be used instead of GetRequestStream and GetResponse; each of them will time out if it takes too long, but once you have the request stream, you can take as long as you want to upload the data. Note that the stream itself has its own read and write timeout (the ReadWriteTimeout property, 5 minutes by default), so if 5 minutes go by without any data being uploaded, the Write method will throw an exception. Here is the new upload scenario using these methods:

[Figure: timeout2]

As you can see, the only difference is that the timeout doesn’t apply anymore to the transfer of the request body, but only to obtaining the request stream and getting the response. Here’s a full example that corresponds to the scenario above:

long UploadFile(string path, string url, string contentType)
{
    // Build request
    var request = (HttpWebRequest)WebRequest.Create(url);
    request.Method = WebRequestMethods.Http.Post;
    request.AllowWriteStreamBuffering = false;
    request.ContentType = contentType;
    string fileName = Path.GetFileName(path);
    request.Headers["Content-Disposition"] = string.Format("attachment; filename=\"{0}\"", fileName);
    
    try
    {
        // Open source file
        using (var fileStream = File.OpenRead(path))
        {
            // Set content length based on source file length
            request.ContentLength = fileStream.Length;
            
            // Get the request stream with the default timeout
            using (var requestStream = request.GetRequestStreamWithTimeout())
            {
                // Upload the file with no timeout
                fileStream.CopyTo(requestStream);
            }
        }
        
        // Get response with the default timeout, and parse the response body
        using (var response = request.GetResponseWithTimeout())
        using (var responseStream = response.GetResponseStream())
        using (var reader = new StreamReader(responseStream))
        {
            string json = reader.ReadToEnd();
            var j = JObject.Parse(json);
            return j.Value<long>("Id");
        }
    }
    catch (WebException ex)
    {
        if (ex.Status == WebExceptionStatus.Timeout)
        {
            LogError(ex, "Timeout while uploading '{0}'", fileName);
        }
        else
        {
            LogError(ex, "Error while uploading '{0}'", fileName);
        }
        throw;
    }
}
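
If your code base can use async/await (.NET 4.5 or later), the same idea carries over to the Task-based methods mentioned earlier. The following is only a rough sketch: TaskTimeoutExtensions and WithTimeout are hypothetical names, not framework APIs, and like the synchronous wrappers above, it doesn’t abort the request when the timeout elapses.

```csharp
using System;
using System.Net;
using System.Threading.Tasks;

public static class TaskTimeoutExtensions
{
    // Hypothetical helper: returns the task's result, or throws a
    // WebException with status Timeout if the task doesn't complete in time.
    // A timeout of -1 (Timeout.Infinite) waits indefinitely.
    public static async Task<T> WithTimeout<T>(this Task<T> task, int millisecondsTimeout)
    {
        var delay = Task.Delay(millisecondsTimeout);
        var completed = await Task.WhenAny(task, delay).ConfigureAwait(false);
        if (completed != task)
        {
            var ex = new TimeoutException();
            throw new WebException(ex.Message, ex, WebExceptionStatus.Timeout, null);
        }
        // The task has already completed; awaiting it propagates its result or exception
        return await task.ConfigureAwait(false);
    }
}
```

With such a helper, the two short phases could be written as `await request.GetRequestStreamAsync().WithTimeout(request.Timeout)` and `await request.GetResponseAsync().WithTimeout(request.Timeout)`, while the copy in between (for instance `fileStream.CopyToAsync(requestStream)`) is awaited without any timeout.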

I hope you will find this helpful!

Uploading data with HttpClient using a "push" model

If you have used the HttpWebRequest class to upload data, you know that it uses a “push” model. What I mean is that you call the GetRequestStream method, which opens the connection if necessary, sends the headers, and returns a stream on which you can write directly.

.NET 4.5 introduced the HttpClient class as a new way to communicate over HTTP. It actually relies on HttpWebRequest under the hood, but offers a more convenient and fully asynchronous API. HttpClient uses a different approach when it comes to uploading data: instead of writing manually to the request stream, you set the Content property of the HttpRequestMessage to an instance of a class derived from HttpContent. You can also pass the content directly to the PostAsync or PutAsync methods.

The .NET Framework provides a few built-in implementations of HttpContent; here are some of the most commonly used:

  • ByteArrayContent: represents in-memory raw binary content
  • StringContent: represents text in a specific encoding (this is a specialization of ByteArrayContent)
  • StreamContent: represents raw binary content in the form of a Stream

For instance, here’s how you would upload the content of a file:

async Task UploadFileAsync(Uri uri, string filename)
{
    using (var stream = File.OpenRead(filename))
    {
        var client = new HttpClient();
        var response = await client.PostAsync(uri, new StreamContent(stream));
        response.EnsureSuccessStatusCode();
    }
}

As you may have noticed, nowhere in this code do we write to the request stream explicitly: the content is pulled from the source stream.

This “pull” model is fine most of the time, but it has a drawback: it requires that the data to upload already exists in a form that can be sent directly to the server. This is not always practical, because sometimes you want to generate the request content “on the fly”. For instance, if you want to send an object serialized as JSON, with the “pull” approach you first need to serialize it in memory as a string or MemoryStream, then assign that to the request’s content:

async Task UploadJsonObjectAsync<T>(Uri uri, T data)
{
    var client = new HttpClient();
    string json = JsonConvert.SerializeObject(data);
    var response = await client.PostAsync(uri, new StringContent(json));
    response.EnsureSuccessStatusCode();
}

This is fine for small objects, but obviously not optimal for large object graphs…

So, how could we reverse this pull model to a push model? Well, it’s actually pretty simple: all you have to do is create a class that inherits from HttpContent, and override the SerializeToStreamAsync method to write to the request stream directly. Actually, I intended to blog about my own implementation, but then I did some research, and it turns out that Microsoft has already done the work: the Web API 2 Client library provides a PushStreamContent class that does exactly that. Basically, you just pass a delegate that defines what to do with the request stream. Here’s how it works:

async Task UploadJsonObjectAsync<T>(Uri uri, T data)
{
    var client = new HttpClient();
    var content = new PushStreamContent((stream, httpContent, transportContext) =>
    {
        var serializer = new JsonSerializer();
        using (var writer = new StreamWriter(stream))
        {
            serializer.Serialize(writer, data);
        }
    });
    var response = await client.PostAsync(uri, content);
    response.EnsureSuccessStatusCode();
}
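
For reference, the mechanism behind such a class is quite small. Here’s a simplified sketch of a hand-written content class; DelegateContent is a hypothetical name, and this is not how PushStreamContent is actually implemented. It assumes an asynchronous delegate that writes the body:

```csharp
using System;
using System.IO;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical, simplified stand-in for illustration only
public class DelegateContent : HttpContent
{
    private readonly Func<Stream, Task> _writeAsync;

    public DelegateContent(Func<Stream, Task> writeAsync)
    {
        if (writeAsync == null) throw new ArgumentNullException("writeAsync");
        _writeAsync = writeAsync;
    }

    // Called when the content must be written out; the delegate
    // pushes the data directly to the target stream
    protected override Task SerializeToStreamAsync(Stream stream, TransportContext context)
    {
        return _writeAsync(stream);
    }

    // The length isn't known before the delegate runs, so none is reported
    protected override bool TryComputeLength(out long length)
    {
        length = -1;
        return false;
    }
}
```

In this sketch, the content is considered complete when the task returned by the delegate completes.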

Note that the PushStreamContent class also provides a constructor overload that accepts an asynchronous delegate, if you want to write to the stream asynchronously.
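
As a rough sketch of that asynchronous overload, the following streams a sequence as a JSON array, one item at a time, so the whole payload never has to be held in memory. JsonArrayContent is a hypothetical helper name; the sketch assumes Json.NET and the Web API 2 Client library are referenced:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;
using Newtonsoft.Json;

public static class JsonArrayContent
{
    // Hypothetical helper, for illustration only
    public static HttpContent Create<T>(IEnumerable<T> items)
    {
        // Typed explicitly so that the Task-returning constructor overload is used
        Func<Stream, HttpContent, TransportContext, Task> onStreamAvailable =
            async (stream, httpContent, transportContext) =>
            {
                // Disposing the writer closes the stream, which signals
                // that the content is complete
                using (var writer = new StreamWriter(stream))
                {
                    await writer.WriteAsync("[");
                    bool first = true;
                    foreach (var item in items)
                    {
                        if (!first)
                            await writer.WriteAsync(",");
                        await writer.WriteAsync(JsonConvert.SerializeObject(item));
                        first = false;
                    }
                    await writer.WriteAsync("]");
                }
            };
        return new PushStreamContent(onStreamAvailable);
    }
}
```

The resulting content can be passed to PostAsync just like in the previous example.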

Actually, for this specific use case, the Web API 2 Client library provides a less convoluted approach: the ObjectContent class. You just pass it the object to send and a MediaTypeFormatter, and it takes care of serializing the object to the request stream:

async Task UploadJsonObjectAsync<T>(Uri uri, T data)
{
    var client = new HttpClient();
    var content = new ObjectContent<T>(data, new JsonMediaTypeFormatter());
    var response = await client.PostAsync(uri, content);
    response.EnsureSuccessStatusCode();
}

By default, the JsonMediaTypeFormatter class uses Json.NET as its JSON serializer, but there is an option to use DataContractJsonSerializer instead.
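
Switching serializers should only require setting the formatter’s UseDataContractJsonSerializer property. A minimal sketch, where DataContractJsonContent is a hypothetical helper name:

```csharp
using System.Net.Http;
using System.Net.Http.Formatting;

public static class DataContractJsonContent
{
    // Hypothetical helper: builds content that is serialized with
    // DataContractJsonSerializer instead of Json.NET
    public static HttpContent Create<T>(T data)
    {
        var formatter = new JsonMediaTypeFormatter
        {
            UseDataContractJsonSerializer = true
        };
        return new ObjectContent<T>(data, formatter);
    }
}
```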

Note that if you need to read an object from the response content, this is even easier: just use the ReadAsAsync<T> extension method (also in the Web API 2 Client library). So as you can see, HttpClient makes it very easy to consume REST APIs.
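
For instance, reading a JSON response into an object could look like the sketch below. ApiClient and its method names are hypothetical; the HttpClient is passed in by the caller:

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class ApiClient
{
    // Hypothetical helper: sends a GET request and deserializes the response body
    public static async Task<T> GetObjectAsync<T>(HttpClient client, Uri uri)
    {
        var response = await client.GetAsync(uri);
        return await ReadObjectAsync<T>(response);
    }

    public static async Task<T> ReadObjectAsync<T>(HttpResponseMessage response)
    {
        response.EnsureSuccessStatusCode();
        // ReadAsAsync picks a formatter based on the content type of the response
        return await response.Content.ReadAsAsync<T>();
    }
}
```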
