Explain IEnumerable and IQueryable and how EF query works

Hightlights

Understand how LINQ queries work

Query expression and Query provider

LINQ (Language Integrated Query) is a set of API in .NET which provides programming languages with native capabilities to query data from different sources and formats. The data sources can be in-memory collections, XML documents, remote databases or web services etc…

The most visible part of LINQ to developers must be query expressions such as Where, Select, First etc… These are standard query operators that we use to filter, project, aggregate, sort etc… on data sources. LINQ standards query operators are in fact extension methods for IEnumerable types from System.Linq namespace. You also have the possibility to extend these types with custom operators to provide more querying capabilities if needed.

The lesser visible part of LINQ are different engines, aka LINQ providers, that process, translate, transform and execute our query expression to give back results. The main providers include:

LINQ to Objects: operates on in-memory types and collections (list, objects, string etc…)
LINQ to XML: operates on XML documents
LINQ to Entities: operates on Entity Framework conceptual models (entities)
LINQ to SQL: operates on SQL Server database schema

LINQ can be enabled on any data sources

That’s the magic of the technology. You can use LINQ standard syntax to query any data types as long as they are LINQ enabled. There are several options to enable LINQ querying capabilities depending on nature of your data sources.

In-memory data

The data is already there to be used, but you only a way to easily filter, sort, aggregate or transform it. There are two to add LINQ querying capabilities to in-memory types:

Implement IEnumerable<T> interface. The interface contains a single method GetEnumerator() which make it possible to iterate through the collection via the returned iterator. As stated above, when implementing IEnumerable<T> interface, your type is extended with LINQ default (standard) querying methods in System.Linq. An enumerable type allows you to describe what you want to retrieve via LINQ syntax, instead of writing lengthy foreach loops.
Extend the type (using extension methods) with LINQ-like operators to provide custom querying. You by-pass provided querying methods from System.Linq and implement your own querying operators.

The IEnumerable<T> is therefore intended to give types enumerable capability. Enumerable types are inherently queryable by using LINQ standard operators. Queries against Enumerable type are compiled to delegates and executed locally.

Remote data

Working with remote data is a little more complicated. You need a mechanism to connect to the data source, execute query, retrieve results and then maybe transform it into custom types.

In .NET terms, you will need a data provider as a thin layer to access a data store. It does all the dirty work such as initiating connection, sending commands, retrieving results in raw binary form and preparing these results in common types to be consumed in .NET.

Then you need the query provider, which receives LINQ queries, translates them into command language specific to the data source, execute them against a data provider and do all the necessary mappings to fixed types or user-defined types. Of course, the query provider does not actually have to execute these commands. For example, LINQ to SQL translates the query expression into SQL and sends it to the server to execute it.

Here’s a quick grip of the basic mechanism.

IQueryable<T>, together with IQueryProvider interfaces gives you standards to implement your own query provider. IQueryable<T> presents your LINQ queries compiled into expression trees which then will be visited by IQueryProvider for translation and execution. IQueryable objects hold a reference to the underlying query provider (IQueryProvider type). When you materialize your query by iterating or calling First(), ToList(), Sum(), Count()… upon IQueryable object, the query provider then know to execute and return results in forms of enumerable objects (aha!).

So, IEnumerable is intended to represent collections in-memory which you can iterate through, and IQueryable is to represent queries against remote data store.

But in Entity Framework, it is a little bit tricky…

Queries with EF Core 3.1

Entity Framework comprises of its own query provider and data provider (which is built on top of existing storage-specific .NET data providers). The query provider of EF provide LINQ to Entities which allows you to query against entity models (CLR types in EF terminology). EF has its own implementation of IQueryable.

Update query object: IEnumerable vs IQueryable

We see that with Entity Framework, we can query against a DbSet<T> using both IEnumerable and IQueryable object?! Both statements below are valid and process the same.

//(1)
IEnumerable<Product> ienumerableQuery = dbContext.Product.Where(x => x.ProductId < 100);
//(2)
IQueryable<Product> queryable = dbContext.Product.Where(x => x.ProductId < 100);

In both query the provider actually returns a legit IQueryable object (remember IQueryable inherits from IEnumerable). They are both compiled to expression trees and translated into the same SQL statements to be executed at the server at runtime. But as for (1), the compiler implicitly converts the query object returned by Where clause into IEnumerable before assigning it to ienumerableQuery.

The tricky part is, as ienumerableQuery and queryable have different compile types, they behave differently when you extend your query at some later time !

//(3)
ienumerableQuery = ienumerableQuery.Where(x => x.ProductId > 50);
//(4)
queryable = queryable.Where(x => x.ProductId > 50);

When execute, using SQL Server profiler we’ll see that the SQL statements for ienumerableQuery do not contain the added filter (ProductId > 50), but for queryable, the second filter is included nicely. The weirdest part however is that results are correctly returned and the same for both !

What actually happens is when you reassign ienumerableQuery in (3), you are querying against an IEnumerable object, so the compiler decides to use the IEnumerable.Where and compiles operation to a delegate with whatever is given from (1) as source. During runtime, (1) gives out IQueryable object, the compiled delegate from (3) uses this object as source to apply query. When execute, only the initial query produced in (1) is sent over the server, the retrieved results will then be refined in-memory to give you a final list.

As for queryable, the type never changes so in (4) IQueryable.Where is used and hence a new expression tree will be built to reflect the changes. We can manipulate again and again the same queryable object and all changes will be reflected in the final generated SQL to be sent to the server.

Therefore, if you intend to adjust your query incrementally, it is better to use IQueryable query object because you are certain to get only what you need at execution point and hence avoid many redundancies.

The hidden cost of IEnumerable

We’ve seen that IEnumerable query object against Entity Framework once declared and assigned will have its SQL statements generated once and for life. Any subsequent operations added after declaration will be applied in memory.

This can bring about serious performance problem as you request database for redundant objects, which causes useless data to be sent over the network, and increases the size of data that application has to process in memory. On top of that is the cost to have query executed partly at the database and partly in app. Besides performance hit, you might get incoherent results due to different timezone or other environment variables. This should be avoided if your query can be translated to plain SQL, simply by using IQueryable.

Another overhead we’ve seen in the example above is strongly declaring query as IEnumerable involves a lot of type casting.

Deferred execution

Deferred execution is an important feature of query in Entity Framework. It’s said, your query is executed only when it’s absolutely necessary.

When you write this:

IEnumerable<Product> ienumerableQuery = dbContext.Product.Where(x => x.ProductId < 100);

//OR THIS
IQueryable<Product> queryable = dbContext.Product.Where(x => x.ProductId < 100);

You only stored your query commands in a variable (doesn’t matter IEnumerable or IQueryable). Entity Framework will execute the query when you materialize the results, either by iterating using foreach loops, or by calling immediate execution using ToList(), First(), Sum(), Count() etc… All results are fetched and stored in a temporary collection (aka eager evaluation).

And every time you materialize the results, the query will be executed again and again, unless you store your results in a “real” IEnumerable variable such as List, Array to use later. Deferred execution also happens in these cases:

//(A)
IEnumerable<Product> new_ienumerableQuery = ienumerableQuery.Where(x => x.ProductId > 50);

//OR THIS (B)
ienumerableQuery = ienumerableQuery.Where(x => x.ProductId > 50);

//OR THIS (C)
IQueryable<Product> new_queryable= queryable.Where(x => x.ProductId > 50);

//OR THIS (D)
queryable = queryable.Where(x => x.ProductId > 50);

It seems confused at first, but it’s worth to remember that deferred execution is a feature of LINQ query, no matter on which data source. LINQ queries are always executed when the query variable is iterated over, not when the query variable is created. This simple example can prove this:

List<string> source = new List<string> {"dodo", "dudu", "didi", "dada"};
IEnumerable<string> query = source.Where(x=> x.Contains("u"));
var t1 = query.ToList();
source.Add("didu");
var t2 = query.ToList();

We’ll see that t1 contains one element "dudu" while t2 has two, "dudu" and "didu" added to the source after the query has been created.

AsQueryable() vs AsEnumerable()

These two operators are used to convert source object to respective types.

After AsEnumerable(), all operations will be executed locally.

Query after AsQueryale() has no effect if applied on Linq.IEnumerable source object. For example:

IEnumerable<Product> ienumerableQuery = dbContext.Product.Where(x => x.ProductId < 100);
IQueryable<Product> queryale= dbContext.Product.Where(x => x.ProductId < 100);

//these codes following AsQueryable() are evaluated on client side
var t1 = ienumerableQuery.Where(x=> x.ProductId > 50).AsQueryable().Where(x=> x.ProductId == 45);

var t2 = queryable.AsEnumerable().Where(x=> x.ProductId > 50).AsQueryable().Where(x=> x.ProductId == 45);

As seen previously section, the ienumerableQuery variable is of type IQueryable at runtime but of type Linq.IEnumerable at compile time. For this reason, the first Where operator in the second statement is an operator of Linq.IEnumerable interface and so are all subsequent operators. AsQueryable() has no effect because you are trying to cast a base type (IEnumerable) to a derived type (IQueryable), the framework just returns the original source without doing anything. Similarly, in the third statement the queryable has already been casted to IEnumerable after AsEnumerable() so it cannot be restored to IQueryable afterwards because information of the derived type is already lost in the up-cast.

Custom function in queries

Custom functions in application, unless mapped to database function can’t be translated to SQL in LINQ to Entities. This code below will result in exception at runtime.

IQueryable<Product> queryale= dbContext.Product.Where(x => x.ProductId = MyCustomFunction());

If you want to use custom function in your query, you must cast it first to IEnumerable so that it can be evaluated on client side. For example:

IQueryable<Product> queryale= dbContext.Product.AsEnumerable().Where(x => x.ProductId = MyCustomFunction());

Tips to work with IEnumerable and IQueryable

Carefully look into the type of the variable at runtime and compile time to make sure your query does what you really intend to do. For server evaluation, make sure your query is of type Microsoft.EntityFrameworkCore.Query.Internal.EntityQueryable. For client evaluation, make sure it is of type System.Linq.Enumerable.
Execute your query as often as necessary but do not do so repeatedly, because that adds loads to network and database. Make sure to understand deferred execution.
Materialize results to avoid round trips to database.
Use IQueryable if you want to extend your query later on.
Do understand the stakes behind IEnumerable.
Use AsEnumerable() if you want to have a part of the query evaluated on client side.

Additional sources

https://docs.microsoft.com/en-us/dotnet/csharp/programming-guide/concepts/linq/

https://docs.microsoft.com/en-us/dotnet/framework/data/adonet/ef/language-reference/query-execution

https://docs.microsoft.com/en-us/dotnet/framework/data/adonet/ef/overview

https://docs.microsoft.com/en-au/archive/blogs/mattwar/linq-building-an-iqueryable-provider-part-i

https://docs.microsoft.com/en-au/archive/blogs/mattwar/iqueryables-deep-dark-secret

https://docs.microsoft.com/en-us/dotnet/csharp/programming-guide/concepts/linq/enabling-a-data-source-for-linq-querying1

https://docs.microsoft.com/en-us/dotnet/framework/data/adonet/data-providers

https://docs.microsoft.com/en-us/dotnet/framework/data/adonet/ado-net-architecture

https://docs.microsoft.com/en-us/dotnet/csharp/programming-guide/types/casting-and-type-conversions