Exploring the Performance of the ADO.NET Entity Framework - Part 1
Performance Matters
No matter what type of application you are developing, at some point in its lifecycle performance matters; this is especially true when accessing data. In this article, I explore the performance of the ADO.NET Entity Framework by breaking down the stack and showing how to speed up a simple query. I also explain the performance characteristics of the Entity Framework. This is the first post in a two-part series, so keep checking back.
So we’re all on the same page, here’s the configuration that I’m using to run my tests:
· Visual Studio 2008.
· SQL Express (installed with Visual Studio).
· ADO.NET Entity Framework Beta 3.
· Entity Framework Tools December 2007 CTP.
· I work on the Entity Framework team, so I have the symbols loaded.
· A C# console application built under the release mode configuration.
· I’m using Northwind as my database.
· I’m running on my laptop, which is a dual core 2GHz processor with 3GB of RAM.
· I’m using the profiling tools that ship with Visual Studio.
The General Disclaimer
Just as a disclaimer, or a thought-provoking exercise, whichever suits you better, I want to explain why you would use the Entity Framework (EF) and the Entity Data Model (EDM), and how this relates to performance. While this is an important point, I’ll make it as briefly as I can. Whenever you add a layer of abstraction, such as an EDM that reshapes the relational schema of a database, there is going to be a performance cost. That said, the ADO.NET team has taken on the challenge of minimizing that cost in this and in future releases.
Query Execution
I am using the following Northwind model as the basis for most of my exploration into performance.
Figure 1 Northwind Data model using the EDM designer
With this model I can write the following code, which queries for the EntityType Order in the EntitySet Orders and then iterates over the returned result set.
using (NorthwindEntities ne = new NorthwindEntities())
{
    foreach (Order o in ne.Orders)
    {
        int i = o.OrderID;
    }
}
I ran this query through 10 iterations in my project, iterating over a total of 848 rows of data in each one. Here are the results in milliseconds.
First run (ms) | Next 9 runs (ms)
4241 | 13, 13, 14, 13, 13, 13, 13, 24, 14
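A harness along these lines is enough to collect per-run timings like the ones above. This is a minimal sketch, not the exact code behind the numbers: QueryTimer and TimeRuns are illustrative names, and the Thread.Sleep call is only a stand-in workload so the sample runs without a database. In a real test, the delegate would wrap the using/foreach block shown earlier.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class QueryTimer
{
    // Runs the action the requested number of times and returns the
    // elapsed wall-clock time of each run, in milliseconds, in order.
    public static long[] TimeRuns(Action action, int iterations)
    {
        long[] elapsed = new long[iterations];
        for (int i = 0; i < iterations; i++)
        {
            Stopwatch watch = Stopwatch.StartNew();
            action();
            watch.Stop();
            elapsed[i] = watch.ElapsedMilliseconds;
        }
        return elapsed;
    }
}

public static class TimingDemo
{
    public static void Main()
    {
        // Stand-in workload (assumption): replace with the query under test.
        long[] times = QueryTimer.TimeRuns(delegate { Thread.Sleep(5); }, 10);

        // Report the first (cold) run separately from the warm runs.
        Console.WriteLine("First run (ms): " + times[0]);
        for (int i = 1; i < times.Length; i++)
        {
            Console.WriteLine("Run " + (i + 1) + " (ms): " + times[i]);
        }
    }
}
```

Keeping the first run separate matters because it includes the one-time startup work, while the remaining runs show the steady-state cost.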
The first thing to notice is that the initial iteration takes 4241 milliseconds. The first time you create an ObjectContext and execute any operation that accesses the database, a few expensive operations occur. The pie chart below shows a breakdown, by percentage, of where the time is spent in the initial run.
Figure 2 – Code area where the first run spends execution time
Here’s what each of those sections means.
· Loading Metadata 11% – The metadata consists of conceptual, mapping, and logical views of the data model, as defined in the CSDL, MSL, and SSDL files, respectively. During the loading of the metadata, the EDM files are loaded in the MetadataWorkspace and cached in a global cache so other workspaces can take advantage of the existing metadata.
· Initializing Metadata 14% – During the initialization of the metadata, the connection is opened and the actual metadata information is retrieved from the ADO.NET Data Provider, which gets its information from the database.
· Opening Connection 8% – This is simply opening the database connection; the first time is usually the slowest.
· View Generation 56% – A big part of creating an abstracted view of the database is providing the actual view for queries and updates in the store’s native language. During this step, the store views are created. The good news is there is a way of making view generation part of the build process so that this step can be avoided at run time.
· Load Assembly 2% – The CLR types must be validated against the metadata.
· Tracking 1% – As part of the Object Services layer, there is a state manager that tracks changes made to entity objects. This is where objects are added to the state manager. Each object’s identity is created, and this entity key is used to search for instances of the same entity type with the same key in the state manager. If a match is found, the merge option is used to determine the next steps.
· Materialization 7% – This is the process of actually creating the object and filling in all the properties taken from the returned DbDataReader.
· Misc 1% – This includes all of the other small operations, such as execution of the SqlCommand, SQL Server query execution, and plain application code.
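To put these percentages in absolute terms, they can be applied to the 4241 ms first run. The sketch below does that arithmetic. It assumes the profiled run took about as long as the measured first run, which is only approximately true once profiler overhead is involved. FirstRunBreakdown and ShareToMilliseconds are illustrative names.

```csharp
using System;

public static class FirstRunBreakdown
{
    // Converts a share expressed in percent into milliseconds of a total.
    public static double ShareToMilliseconds(double totalMs, double percent)
    {
        return totalMs * percent / 100.0;
    }

    public static void Main()
    {
        const double firstRunMs = 4241; // first-run time reported earlier

        // Phase names and percentages as listed in the breakdown.
        string[] phases =
        {
            "Loading Metadata", "Initializing Metadata", "Opening Connection",
            "View Generation", "Load Assembly", "Tracking",
            "Materialization", "Misc"
        };
        double[] percents = { 11, 14, 8, 56, 2, 1, 7, 1 };

        for (int i = 0; i < phases.Length; i++)
        {
            double ms = ShareToMilliseconds(firstRunMs, percents[i]);
            Console.WriteLine("{0,-22} {1,3}%  ~{2,5:F0} ms", phases[i], percents[i], ms);
        }
    }
}
```

By this estimate, view generation alone accounts for roughly 2375 ms of the first run (56% of 4241 ms), which is why it is the first thing worth moving out of the startup path.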
When I run the EDM generator (EdmGen.exe) command-line tool with the view generation mode (/mode:ViewGeneration), the output is a code file (either C# or Visual Basic) that I can include in my project. Having the views pre-generated reduces the startup time to 2933 milliseconds, a decrease of about 31% from the 4241 milliseconds of the first run. For scenarios where view generation is the primary cost, such as when an object context is created for only a few queries, pre-generating these views and deploying them with your application is a good solution. However, the downside of this solution is the need to keep the generated views synchronized with changes to the model.
If we remove the cost of startup and take a look at just the cost to execute the query and return objects, here’s the breakdown of percent per operation.
Figure 3 - Warm query results
· ObjectContext construction 1.38% – This is the cost of creating the ObjectContext and looking up the existing metadata, which was already loaded during the first run. This matters for Web service and ASP.NET scenarios, where a short-lived context is the usual programming pattern. Because the service itself lives for a long time, metadata caching and query caching significantly reduce this cost.
· Query creation 11.02% – Each time a query is created in an ObjectContext, the Entity SQL query command is cached. This is the one-time overhead of query creation. The good news is that since query creation is cached, similar queries are executed faster. Later, I show how to use compiled LINQ queries to speed this process up even more.
· EntityKey creation 0.28% – No matter what merge option is used, there is always a cost for EntityKey creation.
· Relationship span 4.13% – For queries where the returned objects are tracked in the ObjectStateManager, we include the related ends. This is invisible to the user, except that EntityReferences now have an EntityKey for relationships.
· Object lookup 1.38% – The cost of using an object’s EntityKey to determine if the object already exists in the ObjectStateManager.
· Materialization 73% – The cost of reading from the DbDataReader and creating an object.
· Misc 9.92% – Cost of the remaining operations, including execution in SQL Server.
To better understand how the query process affects performance, here’s the basic logic of a query, in order of execution.
· The query is broken into parts to allow query caching to occur, resulting in a query plan.
· The query is passed through to the .NET Framework data provider and executed against the database.
· The results are returned and the user iterates over the results.
· For each entity, the key properties are used to create an EntityKey.
· If the query is a tracking query, then the EntityKey is used to do identity resolution.
· If the EntityKey is not found, the object is created and the properties copied into the object.
· The object is added to the ObjectStateManager for tracking.
· If merge options are used, then the properties follow the merge option rules.
· If the object is related to other objects in the ObjectContext, then the relationships between the entities are connected.
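The per-row steps above can be sketched with a toy state manager. This is a simplified model for illustration, not the Entity Framework's implementation: ToyOrder, ToyStateManager, and ToyMergeOption are hypothetical types, the key is a plain int rather than an EntityKey, and the final step (connecting relationships) is left out.

```csharp
using System;
using System.Collections.Generic;

// Toy stand-ins (assumptions); these are NOT the Entity Framework types.
public enum ToyMergeOption { AppendOnly, OverwriteChanges, NoTracking }

public class ToyOrder
{
    public int OrderID;
    public string ShipCity;
}

public class ToyStateManager
{
    // Tracked objects, indexed by key, used for identity resolution.
    private readonly Dictionary<int, ToyOrder> tracked = new Dictionary<int, ToyOrder>();

    public int Count { get { return tracked.Count; } }

    // Handles one row of a result set.
    public ToyOrder Materialize(int orderId, string shipCity, ToyMergeOption option)
    {
        // Non-tracking query: skip the lookup and always build a new object.
        if (option == ToyMergeOption.NoTracking)
        {
            return new ToyOrder { OrderID = orderId, ShipCity = shipCity };
        }

        // Identity resolution: is an object with this key already tracked?
        ToyOrder existing;
        if (tracked.TryGetValue(orderId, out existing))
        {
            // The merge option decides whether row values replace current ones.
            if (option == ToyMergeOption.OverwriteChanges)
            {
                existing.ShipCity = shipCity;
            }
            return existing;
        }

        // Key not found: create the object, copy the properties, track it.
        ToyOrder created = new ToyOrder { OrderID = orderId, ShipCity = shipCity };
        tracked.Add(orderId, created);
        return created;
    }
}

public static class IdentityResolutionDemo
{
    public static void Main()
    {
        ToyStateManager manager = new ToyStateManager();

        ToyOrder first = manager.Materialize(10248, "Reims", ToyMergeOption.AppendOnly);
        ToyOrder again = manager.Materialize(10248, "Lyon", ToyMergeOption.AppendOnly);
        Console.WriteLine("Same instance: " + object.ReferenceEquals(first, again));
        Console.WriteLine("City after AppendOnly: " + again.ShipCity);

        manager.Materialize(10248, "Lyon", ToyMergeOption.OverwriteChanges);
        Console.WriteLine("City after OverwriteChanges: " + first.ShipCity);

        ToyOrder untracked = manager.Materialize(10248, "Paris", ToyMergeOption.NoTracking);
        Console.WriteLine("NoTracking reuses instance: " + object.ReferenceEquals(first, untracked));
        Console.WriteLine("Tracked objects: " + manager.Count);
    }
}
```

The model also shows why a non-tracking query can be cheaper: it skips both the key lookup and the bookkeeping in the state manager, at the price of losing identity resolution.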
In my next post, I’ll show some of the performance improvements that can be made on the query itself and how Entity SQL and LINQ to Entities perform.
Brian Dawson
Program Manager, ADO.NET Team
Comments
Anonymous
February 10, 2008
So if it takes so much time to create the ObjectContext, should I create it beforehand? In other words, in a Windows app, can I just create the ObjectContext when the app loads and the user logs in, and then use the ObjectContext until the user logs out or closes the app? In the demos and examples I see these objects being created in a "using", implying that I should create it before every chunk of database operations. Can you please clarify that? Thanks!
Anonymous
February 11, 2008
Great question about the connection. The first open does include getting additional information. We do open and close the connection when needed, and the times get faster over time. As for the ObjectContext and best practice, this really just depends on your application. The first run really is the slowest with the ObjectContext. There is more than just performance to take into account when considering a longer-running ObjectContext. For example, do you need to track entities longer? If so, then "using" may not be the right pattern. The using makes a nice pattern for a short-lived context, such as in the ASP.NET scenarios.
Anonymous
February 11, 2008
Thank you very much Brian! The scenario I had in mind when I suggested the use of a long-running ObjectContext was a Windows app in which tracking entities for a long time is required, an order-taking app for instance. Thanks again.
Anonymous
February 28, 2008
It's cool to have an idea of the performance of the EF. Nonetheless, I think a comparison must be shown between the dataset world and Linq to Entities. It would be interesting to have the overhead of each compared.
Anonymous
May 01, 2008
It doesn't take a performance test to tell you that EF will be MUCH, MUCH faster than any DataSet, which has to be some of the worst code in the entire BCL. I've written an ORM that uses tons of reflection, runtime schema analysis, runtime SQL generation, etc - and it outperformed typed DataSets (where queries and all such are generated at design-time) by several factors.
Anonymous
August 12, 2008
Why does actual metadata information need to be retrieved from the database at run time, taking 14% of the execution time (Initializing Metadata)? I thought the local metadata in all those XML files would be sufficient. I'm sure it serves a purpose, but I don't like the idea of burdening a production db server with metadata queries.
Anonymous
August 20, 2008
And what about the performance in the latest EF release? Has anything changed?
Anonymous
October 01, 2008
Yes, it would be interesting to know this for the latest EF release.
Anonymous
January 31, 2011
This is a nice post, but can you give a brief introduction comparing LINQ to Entities performance with stored procedures?
Anonymous
March 28, 2012
I totally agree with Morten Mertner. I don't use Entity Framework anymore in my applications, because it is extremely SLOW!!! I tried some of the steps proposed in this post (not all). Even if I try them all, how much time will it take to set up???!!! So my conclusion is: I prefer to use old ADO.NET instead of this problematic evolution for nothing. Thank you, bye
Anonymous
April 02, 2012
@kamal101 - Be sure to take a look at this post about the performance improvements in EF: blogs.msdn.com/.../sneak-preview-entity-framework-5-0-performance-improvements.aspx
Anonymous
December 17, 2013
Is it possible to repeat these tests for version 5.0 of the EF? I have been struggling to get the symbols loaded and also to understand what elements need to be inspected to get the percentages as above.
Anonymous
December 18, 2013
@Mark - We are planning to put together an article with some numbers for the latest releases, but our team is heads down working on releases/features at the moment so it won't be immediately. Here is some helpful documentation about performance in EF - msdn.microsoft.com/.../hh949853. It was written for EF5 but still applies to EF6.