[WPF] Using Linq to shape data in a CollectionView

Very poorPoorAverageGoodExcellent (7 votes) 
Loading ... Loading ...

WPF provides a simple mechanism for shaping collections of data, via the ICollectionView interface and its Filter, SortDescriptions and GroupDescriptions properties:

// Collection to which the view is bound
public ObservableCollection People { get; private set; }
...

// Default view of the People collection
ICollectionView view = CollectionViewSource.GetDefaultView(People);

// Show only adults
view.Filter = o => ((Person)o).Age >= 18;

// Sort by last name and first name
view.SortDescriptions.Add(new SortDescription("LastName", ListSortDirection.Ascending));
view.SortDescriptions.Add(new SortDescription("FirstName", ListSortDirection.Ascending));

// Group by country
view.GroupDescriptions.Add(new PropertyGroupDescription("Country"));

Even though this technique is not difficult to use, it has a few drawbacks:

  • The syntax is a bit clumsy and unnatural: the fact that the filter parameter is an object whereas we know it’s actually of type Person makes the code less readable because of the cast, and the specification of the sort and group descriptions is a little repetitive
  • Specifying the property names as strings introduces a risk of error, since they’re not verified by the compiler

In the last few years, we got used to use Linq to do this kind of things… it would be nice to be able to do the same for the shaping of an ICollectionView.

Let’s see what syntax we could use to do it with Linq… something like this perhaps?

People.Where(p => p.Age >= 18)
      .OrderBy(p => p.LastName)
      .ThenBy(p => p.FirstName)
      .GroupBy(p => p.Country);

Or, with the Linq query comprehension syntax:

from p in People
where p.Age >= 18
orderby p.LastName, p.FirstName
group p by p.Country;

Obviously, this is not enough: this code only creates a query on the collection, it doesn’t modify the CollectionView… but with just a little extra work, we can get the desired result:

var query =
    from p in People.ShapeView()
    where p.Age >= 18
    orderby p.LastName, p.FirstName
    group p by p.Country;

query.Apply();

The ShapeView method returns a wrapper which encapsulates the default view of the collection, and exposes Where, OrderBy and GroupBy methods with appropriate signatures to specify the shaping of the CollectionView. Creating the query has no direct effect, the changes are only applied to the view when Apply is called: that’s because it’s better to apply all changes at the same time, using ICollectionView.DeferRefresh, to avoid causing a refresh of the view for each clause of the query. When Apply is called, we can see that the view is correctly updated to reflect the query.

This solution allows to define the filter, sort and grouping in a strongly-typed way, which implies that the code is verified by the compiler. It’s also more concise and readable than the original code… Just be careful with one thing: some queries that are correct from the compiler’s point of view won’t be applicable to a CollectionView. For instance, if you try to group the data by the first letter of the last name (p.LastName.Substring(0, 1)), the GroupBy method will fail because only properties are supported by PropertyGroupDescription.

Note that the wrapper won’t overwrite the shaping properties of the CollectionView if you don’t specify the corresponding Linq clause, so you can just modify the current view without specifying everything again. If you need to clear the properties, you can use the ClearFilter, ClearSort and ClearGrouping methods:

// Remove the grouping and add a sort criteria
People.ShapeView()
      .ClearGrouping()
      .OrderBy(p => p.LastName);
      .Apply();

Note that as for a normal Linq query, it’s possible to use either the query comprehension syntax or to call the methods directly, since the former is just syntactic sugar for the latter.

Finally, here’s the complete code of the wrapper and the associated extension methods:

    public static class CollectionViewShaper
    {
        public static CollectionViewShaper<TSource> ShapeView<TSource>(this IEnumerable<TSource> source)
        {
            var view = CollectionViewSource.GetDefaultView(source);
            return new CollectionViewShaper<TSource>(view);
        }

        public static CollectionViewShaper<TSource> Shape<TSource>(this ICollectionView view)
        {
            return new CollectionViewShaper<TSource>(view);
        }
    }

    public class CollectionViewShaper<TSource>
    {
        private readonly ICollectionView _view;
        private Predicate<object> _filter;
        private readonly List<SortDescription> _sortDescriptions = new List<SortDescription>();
        private readonly List<GroupDescription> _groupDescriptions = new List<GroupDescription>();

        public CollectionViewShaper(ICollectionView view)
        {
            if (view == null)
                throw new ArgumentNullException("view");
            _view = view;
            _filter = view.Filter;
            _sortDescriptions = view.SortDescriptions.ToList();
            _groupDescriptions = view.GroupDescriptions.ToList();
        }

        public void Apply()
        {
            using (_view.DeferRefresh())
            {
                _view.Filter = _filter;
                _view.SortDescriptions.Clear();
                foreach (var s in _sortDescriptions)
                {
                    _view.SortDescriptions.Add(s);
                }
                _view.GroupDescriptions.Clear();
                foreach (var g in _groupDescriptions)
                {
                    _view.GroupDescriptions.Add(g);
                }
            }
        }
            
        public CollectionViewShaper<TSource> ClearGrouping()
        {
            _groupDescriptions.Clear();
            return this;
        }

        public CollectionViewShaper<TSource> ClearSort()
        {
            _sortDescriptions.Clear();
            return this;
        }

        public CollectionViewShaper<TSource> ClearFilter()
        {
            _filter = null;
            return this;
        }

        public CollectionViewShaper<TSource> ClearAll()
        {
            _filter = null;
            _sortDescriptions.Clear();
            _groupDescriptions.Clear();
            return this;
        }

        public CollectionViewShaper<TSource> Where(Func<TSource, bool> predicate)
        {
            _filter = o => predicate((TSource)o);
            return this;
        }

        public CollectionViewShaper<TSource> OrderBy<TKey>(Expression<Func<TSource, TKey>> keySelector)
        {
            return OrderBy(keySelector, true, ListSortDirection.Ascending);
        }

        public CollectionViewShaper<TSource> OrderByDescending<TKey>(Expression<Func<TSource, TKey>> keySelector)
        {
            return OrderBy(keySelector, true, ListSortDirection.Descending);
        }

        public CollectionViewShaper<TSource> ThenBy<TKey>(Expression<Func<TSource, TKey>> keySelector)
        {
            return OrderBy(keySelector, false, ListSortDirection.Ascending);
        }

        public CollectionViewShaper<TSource> ThenByDescending<TKey>(Expression<Func<TSource, TKey>> keySelector)
        {
            return OrderBy(keySelector, false, ListSortDirection.Descending);
        }

        private CollectionViewShaper<TSource> OrderBy<TKey>(Expression<Func<TSource, TKey>> keySelector, bool clear, ListSortDirection direction)
        {
            string path = GetPropertyPath(keySelector.Body);
            if (clear)
                _sortDescriptions.Clear();
            _sortDescriptions.Add(new SortDescription(path, direction));
            return this;
        }

        public CollectionViewShaper<TSource> GroupBy<TKey>(Expression<Func<TSource, TKey>> keySelector)
        {
            string path = GetPropertyPath(keySelector.Body);
            _groupDescriptions.Add(new PropertyGroupDescription(path));
            return this;
        }

        private static string GetPropertyPath(Expression expression)
        {
            var names = new Stack<string>();
            var expr = expression;
            while (expr != null && !(expr is ParameterExpression) && !(expr is ConstantExpression))
            {
                var memberExpr = expr as MemberExpression;
                if (memberExpr == null)
                    throw new ArgumentException("The selector body must contain only property or field access expressions");
                names.Push(memberExpr.Member.Name);
                expr = memberExpr.Expression;
            }
            return String.Join(".", names.ToArray());
        }
    }

kick it on DotNetKicks.com

Posted in WPF. Tags: , , . 3 Comments »

[Entity Framework] Using Include with lambda expressions

Very poorPoorAverageGoodExcellent (10 votes) 
Loading ... Loading ...

I’m currently working on a project that uses Entity Framework 4. Even though lazy loading is enabled, I often use the ObjectQuery.Include method to eagerly load associated entities, in order to avoid database roundtrips when I access them:

var query =
    from ord in db.Orders.Include("OrderDetails")
    where ord.Date >= DateTime.Today
    select ord;

Or if I also want to eagerly load the product:

var query =
    from ord in db.Orders.Include("OrderDetails.Product")
    where ord.Date >= DateTime.Today
    select ord;

However, there’s something that really bothers me with this Include method: the property path is passed as a string. This approach has two major drawbacks:

  • It’s easy to make a mistake when typing the property path, and since it’s a string, the compiler doesn’t complain. So we get a runtime error, rather than a compilation error.
  • We can’t take advantage of IDE features like Intellisense and refactoring. If we rename a property in the model, automatic refactoring won’t check the content of the string. We have to manually update all calls to Include that refer to this property, with the risk of missing some of them in the process…

It would be much more convenient to use a lambda expression to specify the property path. The principle is well known, and frequently used to avoid using a string to refer to a property.

The trivial case, where the property to include is directly accessible from the source, is pretty easy to handle, and many implementation can be found on the Internet. We just need to use a method that extracts the property name from an expression :

    public static class ObjectQueryExtensions
    {
        public static ObjectQuery<T> Include<T>(this ObjectQuery<T> query, Expression<Func<T, object>> selector)
        {
            string propertyName = GetPropertyName(selector);
            return query.Include(propertyName);
        }

        private static string GetPropertyName<T>(Expression<Func<T, object>> expression)
        {
            MemberExpression memberExpr = expression.Body as MemberExpression;
            if (memberExpr == null)
                throw new ArgumentException("Expression body must be a member expression");
            return memberExpr.Member.Name;
        }
    }

Using that extension method, the code from the first sample can be rewritten as follows:

var query =
    from ord in db.Orders.Include(o => o.OrderDetails)
    where ord.Date >= DateTime.Today
    select ord;

This code works fine, but only for the simplest cases… In the second example, we also want to eagerly load the OrderDetail.Product property, but the code above can’t handle that case. Indeed, the expression we would use to include the Product property would be something like o.OrderDetails.Select(od => od.Product), but the GetPropertyName method can only handle property accesses, not method calls, and it works only for an expression with a single level.

To get the full path of the property to include, we have to walk through the whole expression tree to extract the name of each property. It sounds like a complicated task, but there’s a class that can help us with it: ExpressionVisitor. This class was introduced in .NET 4.0 and implements the Visitor pattern to walk through all nodes in the expression tree. It’s just a base class for implementing custom visitors, and it does nothing else than just visiting each node. All we need to do is inherit it, and override some methods to extract the properties from the expression. Here are the methods we need to override:

  • VisitMember : used to visit a property or field access
  • VisitMethodCall : used to visit a method call. Even though method calls aren’t directly related to what we want to do, we need to change its behavior in the case of Linq operators: the default implementation visits each parameter in their normal order, but for extension method like Select or SelectMany, we need to visit the first parameter (the this parameter) last, so that we retrieve the properties in the correct order.

    Here’s a new version of the Include method, along with the ExpressionVisitor implementation:

        public static class ObjectQueryExtensions
        {
            public static ObjectQuery<T> Include<T>(this ObjectQuery<T> query, Expression<Func<T, object>> selector)
            {
                string path = new PropertyPathVisitor().GetPropertyPath(selector);
                return query.Include(path);
            }
    
            class PropertyPathVisitor : ExpressionVisitor
            {
                private Stack<string> _stack;
    
                public string GetPropertyPath(Expression expression)
                {
                    _stack = new Stack<string>();
                    Visit(expression);
                    return _stack
                        .Aggregate(
                            new StringBuilder(),
                            (sb, name) =>
                                (sb.Length > 0 ? sb.Append(".") : sb).Append(name))
                        .ToString();
                }
    
                protected override Expression VisitMember(MemberExpression expression)
                {
                    if (_stack != null)
                        _stack.Push(expression.Member.Name);
                    return base.VisitMember(expression);
                }
    
                protected override Expression VisitMethodCall(MethodCallExpression expression)
                {
                    if (IsLinqOperator(expression.Method))
                    {
                        for (int i = 1; i < expression.Arguments.Count; i++)
                        {
                            Visit(expression.Arguments[i]);
                        }
                        Visit(expression.Arguments[0]);
                        return expression;
                    }
                    return base.VisitMethodCall(expression);
                }
    
                private static bool IsLinqOperator(MethodInfo method)
                {
                    if (method.DeclaringType != typeof(Queryable) && method.DeclaringType != typeof(Enumerable))
                        return false;
                    return Attribute.GetCustomAttribute(method, typeof(ExtensionAttribute)) != null;
                }
            }
        }
    

    I already talked about the VisitMethodCall method, so I won’t explain it further. The implementation of VisitMember is very simple: we just push the member name on a stack. Why a stack ? That’s because the expression is not visited in the order one would intuitively expect. For instance, in an expression like o.OrderDetails.Select(od => od.Product), the first visited node is not o but the call to Select, because what precedes it (o.OrderDetails) is actually the first parameter of the static Select method… To retrieve the properties in the correct order, we put them on a stack so that we can read them back in reverse order when we need to build the property path.

    The GetPropertyPath method probably doesn’t need a long explanation: it initializes the stack, visits the expression, and builds the property path from the stack.

    We can now rewrite the code from the second example as follows:

    var query =
        from ord in db.Orders.Include(o => OrderDetails.Select(od => od.Product))
        where ord.Date >= DateTime.Today
        select ord;
    

    This method also works for more complex cases. Let’s add a few new entities to our model: one or more discounts can be applied to each purchased product, and each discount is linked to a sales campaign. If we need to retrieve the associated discounts and campaigns in the query results, we can write something like that:

    var query =
        from ord in db.Orders.Include(o => OrderDetails.Select(od => od.Discounts.Select(d => d.Campaign)))
        where ord.Date >= DateTime.Today
        select ord;
    

    The result is the same as if we had passed “OrderDetails.Discounts.Campaign” to the standard Include method. Since the nested Select calls impair the readability, we can also use a different expression, with the same result:

    var query =
        from ord in db.Orders.Include(o => o.OrderDetails
                                            .SelectMany(od => od.Discounts)
                                            .Select(d => d.Campaign))
        where ord.Date >= DateTime.Today
        select ord;
    

    To conclude, I just have two remarks regarding this solution:

    • A similar extension method is included in the Entity Framework Feature CTP4 (see this article for details). So it is possible that it will eventually be included in the framework (perhaps in a service pack for .NET 4.0 ?).
    • Even though this solution targets Entity Framework 4.0, it should be possible to adapt it for EF 3.5. The ExpressionVisitor class is not available in 3.5, but there is another implementation of it in Joseph Albahari’s LINQKit. I didn’t try it, but it should work the same way…

    kick it on DotNetKicks.com

Automating null checks with Linq expressions

Very poorPoorAverageGoodExcellent (7 votes) 
Loading ... Loading ...

The problem

Have you ever written code like the following ?

X xx = GetX();
string name = "Default";
if (xx != null && xx.Foo != null && xx.Foo.Bar != null && xx.Foo.Bar.Baz != null)
{
    name = xx.Foo.Bar.Baz.Name;
}

I bet you have ! You just need to get the value of xx.Foo.Bar.Baz.Name, but you have to test every intermediate object to ensure that it’s not null. It can quickly become annoying if the property you need is nested in a deep object graph….

A solution

Linq offers a very interesting feature which can help solve that problem : expressions. C# 3.0 makes it possible to retrieve the abstract syntax tree (AST) of a lambda expression, and perform all kinds of manipulations on it. It is also possible to dynamically generate an AST, compile it to obtain a delegate, and execute it.

How is this related to the problem described above ? Well, Linq makes it possible to analyse the AST for the expression that accesses the xx.Foo.Bar.Baz.Name property, and rewrite that AST to insert null checks where needed. So we’re going to create a NullSafeEval extension method, which takes as a parameter the lambda expression defining how to access a property, and the default value to return if a null object is encountered along the way.

That method will transform the expression xx.Foo.Bar.Baz.Name into that :

    (xx == null)
    ? defaultValue
    : (xx.Foo == null)
      ? defaultValue
      : (xx.Foo.Bar == null)
        ? defaultValue
        : (xx.Foo.Bar.Baz == null)
          ? defaultValue
          : xx.Foo.Bar.Baz.Name;

Here’s the implementation of the NullSafeEval method :

        public static TResult NullSafeEval<TSource, TResult>(this TSource source, Expression<Func<TSource, TResult>> expression, TResult defaultValue)
        {
            var safeExp = Expression.Lambda<Func<TSource, TResult>>(
                NullSafeEvalWrapper(expression.Body, Expression.Constant(defaultValue)),
                expression.Parameters[0]);

            var safeDelegate = safeExp.Compile();
            return safeDelegate(source);
        }

        private static Expression NullSafeEvalWrapper(Expression expr, Expression defaultValue)
        {
            Expression obj;
            Expression safe = expr;

            while (!IsNullSafe(expr, out obj))
            {
                var isNull = Expression.Equal(obj, Expression.Constant(null));

                safe =
                    Expression.Condition
                    (
                        isNull,
                        defaultValue,
                        safe
                    );

                expr = obj;
            }
            return safe;
        }

        private static bool IsNullSafe(Expression expr, out Expression nullableObject)
        {
            nullableObject = null;

            if (expr is MemberExpression || expr is MethodCallExpression)
            {
                Expression obj;
                MemberExpression memberExpr = expr as MemberExpression;
                MethodCallExpression callExpr = expr as MethodCallExpression;

                if (memberExpr != null)
                {
                    // Static fields don't require an instance
                    FieldInfo field = memberExpr.Member as FieldInfo;
                    if (field != null && field.IsStatic)
                        return true;

                    // Static properties don't require an instance
                    PropertyInfo property = memberExpr.Member as PropertyInfo;
                    if (property != null)
                    {
                        MethodInfo getter = property.GetGetMethod();
                        if (getter != null && getter.IsStatic)
                            return true;
                    }
                    obj = memberExpr.Expression;
                }
                else
                {
                    // Static methods don't require an instance
                    if (callExpr.Method.IsStatic)
                        return true;

                    obj = callExpr.Object;
                }

                // Value types can't be null
                if (obj.Type.IsValueType)
                    return true;

                // Instance member access or instance method call is not safe
                nullableObject = obj;
                return false;
            }
            return true;
        }

In short, this code walks up the lambda expression tree, and surrounds each property access or instance method call with a conditional expression (condition ? value if true : value if false).

And here’s how we can use this method :

string name = xx.NullSafeEval(x => x.Foo.Bar.Baz.Name, "Default");

Much clearer and concise than our initial code, isn’t it ? :)

Note that the proposed implementation handles not only properties, but also method calls, so we could write something like that :

string name = xx.NullSafeEval(x => x.Foo.GetBar(42).Baz.Name, "Default");

Indexers are not handled yet, but they could be added quite easily ; I will leave it to you to do it if you have the use for it ;)

Limitations

Even though that solution can seem very interesting at first sight, please read what follows before you integrate this code into a real world program…

  • First, the proposed code is just a proof of concept, and as such, hasn’t been thoroughly tested, so it’s probably not very reliable.
  • Secondly, keep in mind that dynamic code generation from an expression tree is tough work for the CLR, and will have a big impact on performance. A quick test shows that using the NullSafeEval method is about 10000 times slower than accessing the property directly…

    A possible approach to limit that issue would be to cache the delegates generated for each expression, to avoid regenerating them every time. Unfortunately, as far as I know there is no simple and reliable way to compare two Linq expressions, which makes it much harder to implement such a cache.

  • Last, you might have noticed that intermediate properties and methods are evaluated several times ; not only this is bad for performance, but more importantly, it could have side effects that are hard to predict, depending on how the properties and methods are implemented.

    A possible workaround would be to rewrite the conditional expression as follows :

    Foo foo = null;
    Bar bar = null;
    Baz baz = null;
    var name =
        (x == null)
        ? defaultValue
        : ((foo = x.Foo) == null)
          ? defaultValue
          : ((bar = foo.Bar) == null)
            ? defaultValue
            : ((baz = bar.Baz) == null)
              ? defaultValue
              : baz.Name;
    

    Unfortunately, this is not possible in .NET 3.5 : that version only supports simple expressions, so it’s not possible to declare variables, assign values to them, or write several distinct instructions. However, in .NET 4.0, support for Linq expressions has been largely improved, and makes it possible to generate that kind of code. I’m currently trying to improve the NullSafeEval method to take advantage of the new .NET 4.0 features, but it turns out to be much more difficult than I had anticipated… If I manage to work it out, I’ll let you know and post the code !

To conclude, I wouldn’t recommend using that technique in real programs, at least not in its current state. However, it gives an interesting insight on the possibilities offered by Linq expressions. If you’re new to this, you should know that Linq expressions are used (among other things) :

  • To generate SQL queries in ORMs like Linq to SQL or Entity Framework
  • To build complex predicates dynamically, like in the PredicateBuilder class by Joseph Albahari
  • To implement “static reflection”, which has generated a lot of buzz on technical blogs lately
css.php