Working With Large Models In Entity Framework – Part 2
In my last post I talked about some of the issues you typically face when using a large Entity model in your application. I have also described a few things that you can use to mitigate some of these problems. In this post I will walk through a couple of examples to demonstrate how you can split one large entity model into smaller ones while reusing types thus avoiding duplication.
Sub-dividing the model into smaller models with type reuse
The concept of “Using” in CSDL schema will allow you to do this. This is a pretty powerful feature that will enable you to create multiple models and let you reuse types in different models.
There are a few problems with using this approach which include:
a) No designer support: The designer does not support “Using”. So you would have to create your model first using the designer and then edit the XML to use entities from a related model.
b) Bi-Directional Navigation Not Supported: Since you cannot create cycles when creating model dependencies, you can declare navigation properties only in one model.
But even with these shortcomings, dividing the model into smaller models makes sense in a lot of cases. There are two ways you can sub-divide your model.
a. Multiple CSDL files (models) while sharing MSL and SSDL:
The advantage with this approach is that the object model for your types is much cleaner. You don’t have the problem of having 1000 types in a single namespace. But it won’t solve the problem of performance or Intellisense.
How to create multiple CSDL files that will share MSL and SSDL:
The example uses the Northwind sample database.
1. Create a new ADO.NET Entity Data Model using the Entity Data Model Wizard by pointing to the Northwind database and choosing the Products, Categories and Suppliers tables.
2. Change the “Metadata Artifact Processing” property of the Edmx file to “Copy to Output Directory” and build the solution. This will drop the CSDL, SSDL and MSL files in the build output path.
3. Copy the schema files (CSDL, SSDL and MSL) to another location. This location will be used in the Metadata parameter of the EntityConnection string. Let’s call these files model1.csdl, model.ssdl and model.msl.
4. Open the CSDL file and copy the Categories entity type to a separate CSDL file. Let’s call this file model2.csdl. Use a different namespace for this schema. Let’s say this is NorthwindModelBase.
5. Remove the Categories entity type from model1.csdl.
6. Add a Using element to model1.csdl to import the newly created namespace.
Ex:
<Schema Namespace="NorthwindModel" Alias="Self" xmlns="http://schemas.microsoft.com/ado/2006/04/edm">
<Using Namespace="NorthwindModelBase" Alias="BaseModel" />
7. Change the Categories EntitySet and the FK_Products_Categories association to refer to the Categories entity type through the BaseModel alias.
Ex:
<Association Name="FK_Products_Categories">
<End Role="Categories" Type="BaseModel.Categories" Multiplicity="0..1" />
<End Role="Products" Type="NorthwindModel.Products" Multiplicity="*" />
</Association>
8. Use EdmGen to create two different C# files for these models.
Commands:
a. edmgen /mode:EntityClassGeneration /incsdl:model1.csdl /refcsdl:model2.csdl /outobjectlayer:model1.cs
b. edmgen /mode:EntityClassGeneration /incsdl:model2.csdl /outobjectlayer:model2.cs
9. You can compile these C# files into the same assembly or into different assemblies.
10. The Metadata property of the Connection string should point to both the CSDL files in addition to the SSDL and MSL files.
11. Assuming that the base model was built before the model that uses it, the Categories entity type will not have a navigation property to Products.
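Steps 4 to 7 above can be sketched as follows. This is only an illustration of what model2.csdl might look like after the move: the property list and facets assume the standard Northwind Categories table as the wizard typically generates it, so your generated file may differ.

```xml
<!-- model2.csdl: hypothetical sketch; properties assumed from the standard Northwind Categories table -->
<Schema Namespace="NorthwindModelBase" Alias="Self" xmlns="http://schemas.microsoft.com/ado/2006/04/edm">
  <EntityType Name="Categories">
    <Key>
      <PropertyRef Name="CategoryID" />
    </Key>
    <Property Name="CategoryID" Type="Int32" Nullable="false" />
    <Property Name="CategoryName" Type="String" Nullable="false" MaxLength="15" />
    <Property Name="Description" Type="String" MaxLength="Max" />
    <Property Name="Picture" Type="Binary" MaxLength="Max" />
  </EntityType>
</Schema>
```

The entity set in model1.csdl would then point at the imported type through the alias declared in the Using element:

```xml
<!-- inside the EntityContainer of model1.csdl -->
<EntitySet Name="Categories" EntityType="BaseModel.Categories" />
```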
The schemas for this example can be found in the attached .zip file under the folder MultipleModelsWithSingleSSDLAndMSLFiles
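For step 10, a connection string entry along the following lines should work. It is a sketch under assumptions: the C:\schemas folder, the local SQL Server instance and the entry name NorthwindEntities are placeholders, not values taken from the attached sample. The Metadata keyword takes a pipe-separated list of paths, so both CSDL files are listed next to the shared SSDL and MSL files.

```xml
<!-- App.config fragment; paths and server name are hypothetical -->
<connectionStrings>
  <add name="NorthwindEntities"
       connectionString="metadata=C:\schemas\model1.csdl|C:\schemas\model2.csdl|C:\schemas\model.ssdl|C:\schemas\model.msl;provider=System.Data.SqlClient;provider connection string=&quot;Data Source=.;Initial Catalog=Northwind;Integrated Security=True&quot;"
       providerName="System.Data.EntityClient" />
</connectionStrings>
```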
b. Divide application schemas into different sets of CSDL, MSL and SSDL files:
In the previous section, I described how to break your CSDL file into multiple CSDL files while still sharing the mapping. But as I mentioned earlier this would not solve the performance problems that could come up because of the size of the models. To solve the performance problems, you would have to actually map the database partially into different CSDL files.
Here are the things you need to remember when using this approach:
1. There could be cases where you might have to map the same table to two different models. So you would have some duplicate metadata lying around.
2. There could also be cases where you would expose foreign keys as scalar properties because you do not want to pull in all the related tables into your Entity model.
3. SSDL and MSL do not currently have any concept of reuse, so you can either reuse the types in CSDL (as described in the previous section) or duplicate the information in CSDL too. Reusing the types definitely has some advantages, but given the pain of creating CSDL schemas that import other schemas, you might want to consider duplicating information in CSDL too. This would allow you to work with the designer. But if you are dividing the model for performance and maintainability reasons and you actually want to use these smaller models in a single application, duplicating the information would not be a viable option. There are definitely other disadvantages to duplicating information across multiple model files (typically the same problems that you would see with duplicate code). The way to avoid duplication is the “Using” element in CSDL. In the steps below, I describe how to split the model with the support of Using and no duplication in the model (CSDL) files.
How to split single set of CSDL, SSDL and MSL files into multiple sets:
The example uses the Northwind sample database.
1. We want to create an application that uses the following tables: Products, Categories, Orders, Order Details, Customers and CustomerDemographics.
2. Let’s say for reasons of performance and maintainability we want to split these into two different models with two different containers.
3. To do this, create 2 new ADO.NET Entity Data Models using the Entity Data Model Wizard by pointing to the Northwind database. In the first model, choose the Products, Categories, Orders, Customers and Order Details tables. In the second model, choose Customers and CustomerDemographics. So the Customers table is included in both the first set and the second set. Let’s refer to the first set of schemas as ProductDetails and the second set of schemas as CustomerDetails.
4. You can either choose to reuse the Customers type by using the “Using” element in CSDL or repeat the same type in both the sets.
5. In my sample (shared below) I have chosen to move the Customers type to a separate model called CustomerBase.csdl and reuse the Customers type in both the CustomerDetails model and the ProductDetails model.
6. Change the Customers end in “FK_Customers_Orders” association in ProductDetails model to refer to Customers type defined in CustomerBase model.
Ex :
<Association Name="FK_Customers_Orders">
<End Role="Customers" Type="CustomerBase.Customers" Multiplicity="0..1" />
<End Role="Orders" Type="NorthwindModel.Orders" Multiplicity="*" />
</Association>
7. You need to make a similar change to the CustomerCustomerDemo association that relates Customers to CustomerDemographics.
8. You can also see that there is a navigation property on the Orders type that you can use to navigate to the related Customer. Also observe that the Customers type defined in the CustomerBase model does not have a navigation property to navigate back to the related Orders. Since you are sharing the Customers type between different models, you cannot add that navigation property. For example, if you add a navigation property for Orders on the Customers type, it won’t make sense when you use the Customers type in the CustomerDetails model, since the Orders type is not present in that model.
9. At runtime, you could create either one Context that works with both the schema sets or two different contexts. To create a single context with both the schema sets, you would use the ObjectContext constructor that takes in an EntityConnectionString. In the Metadata parameter of the connection string, specify the paths to both sets of files.
10. Problems with Navigation properties being absent:
As mentioned above, in the ProductDetails model there is a navigation property on the Orders type that you can use to navigate to the related Customer, but you cannot navigate back to Orders from Customers because no NavigationProperty was defined on the Customers type.
Here is some sample code that navigates from Orders to Customer and prints the name of Customer who ordered it. This is pretty simple since there is a Navigation property on Orders type that you can use to Navigate to related Customer.
NorthwindEntities productDetails = new NorthwindEntities(connString);
var orderInfos = from o in productDetails.Orders.Include("Customers")
                 select new { o.OrderID, o.Customers.ContactName };
foreach (var orderInfo in orderInfos)
{
    // Print each order together with the contact name of the customer who placed it.
    Console.WriteLine("Order ID:" + orderInfo.OrderID + " Customer who ordered:" + orderInfo.ContactName);
}
Here is another sample that navigates from Customer to Orders and prints the OrderID of every order that a customer has placed. Since there is no navigation property on Customers, you have to go to the Orders collection and use the Customers navigation property on Orders.
Here are a couple of samples in LINQ that do the navigation back:
Sample 1
NorthwindEntities productDetails = new NorthwindEntities(connString);
var customerInfos = from c in productDetails.Customers
                    select new
                    {
                        Customer = c,
                        Orders = (from o in productDetails.Orders where o.Customers == c select o)
                    };
foreach (var customerInfo in customerInfos)
{
    Console.WriteLine("Customer " + customerInfo.Customer.ContactName);
    foreach (var orderInfo in customerInfo.Orders)
    {
        Console.WriteLine("Order ID:" + orderInfo.OrderID);
    }
}
Sample 2
NorthwindEntities productDetails = new NorthwindEntities(connString);
var customerInfos = from c in productDetails.Customers
                    join o in productDetails.Orders
                        on c.CustomerID equals o.Customers.CustomerID into orders
                    select new { Customer = c, Orders = orders };
foreach (var customerInfo in customerInfos)
{
    Console.WriteLine("Customer " + customerInfo.Customer.ContactName);
    foreach (var orderInfo in customerInfo.Orders)
    {
        Console.WriteLine("Order ID:" + orderInfo.OrderID);
    }
}
The schemas for this example can be found in the attached .zip file under the folder MultipleSchemaSets.
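Returning to step 7, the change to the CustomerCustomerDemo association in the CustomerDetails model might look like the sketch below. The schema namespace CustomerDetailsModel and the alias chosen in the Using element are assumptions made for illustration; use whatever names your generated files contain. The association is shown as many-to-many because the wizard normally maps the CustomerCustomerDemo join table that way.

```xml
<!-- CustomerDetails CSDL: hypothetical sketch; namespace and alias names are assumed -->
<Schema Namespace="CustomerDetailsModel" Alias="Self" xmlns="http://schemas.microsoft.com/ado/2006/04/edm">
  <Using Namespace="CustomerBase" Alias="CustomerBase" />
  <!-- EntityContainer and the CustomerDemographics entity type omitted -->
  <Association Name="CustomerCustomerDemo">
    <End Role="CustomerDemographics" Type="CustomerDetailsModel.CustomerDemographics" Multiplicity="*" />
    <End Role="Customers" Type="CustomerBase.Customers" Multiplicity="*" />
  </Association>
</Schema>
```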
I hope that these posts have been helpful. I would love to hear your feedback and also things that you would like to add to the list of things that I suggested.
Srikanth Mandadi
Development Lead, Entity Framework
Comments
Anonymous
November 26, 2008
Srikanth Mandadi has published the second part of his series on the performance problems of very...
Anonymous
November 26, 2008
We appreciate your efforts addressing this issue. I think that a lot of folks would like to see a much more simplified approach in the long term. My hope is that the Oslo repository has a key role to play here, especially in terms of dividing the models up into sub-domains, being able to reuse them among teams, and the like. In fact, would it be possible for your team to let us know how they see the relationship between EF, M, and the Oslo repository? Just know that we're looking for something more than modifying the EDMX files by hand. Thanks, Travis
Anonymous
November 27, 2008
It's a start. But I'm interested in the possibility of having multiple programmers work with multiple files (like multiple .edmx files) that are combined into a single .edmx at compile time. Each programmer could use a personal file to describe the CSDL, SSDL and MSL for their part of the application (as with a single class). In your scenario A, the SSDL and MSL are single files and do not allow concurrent editing. I want parallel development to be possible, but I don't want to manage associations manually as in scenario B. Is it possible to merge multiple .edmx files into one? Have you thought about a tool to do this?
Anonymous
November 27, 2008
It is a step in the right direction, but there are quite a few steps involved, and this would increase the tendency to make a mistake. Can you make the process easier in v2?
Anonymous
December 01, 2008
Malcolm, We have talked about quite a few things that would improve this process. But other than a few small things here and there, some of the big items that we want to do in this area, like designer support for Using, probably won't make it into V2. Thanks, Srikanth
Anonymous
December 01, 2008
Andrea, I did not completely follow the scenario you are describing, but my guess is that you are talking about allowing users to split and/or reuse their SSDL and MSL files as we allow them to do in the CSDL. If so, this is something we talked about and something we want to enable in the future. Thanks, Srikanth
Anonymous
December 03, 2008
Weekly digest of interesting stuff
Anonymous
December 08, 2008
Hi Srikanth, you understood perfectly what I meant. I'm happy that this feature will be (soon? ;) ) available! It's great that the lead developer reads our requests (or feedback) and answers our needs! Thanks again. Good work. Andrea.
Anonymous
December 10, 2008
Entity Framework development lead Srikanth Mandadi calls this two-part article...
Anonymous
January 20, 2009
Hello everyone! The post Working With Large Models In Entity Framework – Part 2 describes the decomposition...
Anonymous
January 21, 2009
Hi! *.msl may be split easily, but how do you split *.ssdl?
Anonymous
February 17, 2009
After a year of working with LINQ to SQL, I strongly believe that LINQ to SQL and Entity Framework (EF)...
Anonymous
April 08, 2009
I would say a good page to start from is this MSDN document: Performance Considerations for...
Anonymous
May 26, 2009
One of my customers wants to develop an ERP with EF. His database contains more than 600 tables, almost all...
Anonymous
May 27, 2009
One of my customers wants to code an ERP. To make it, he wants to use EF. His DB has more than 600 tables...
Anonymous
June 04, 2009
It looks like people are starting to try using Entity Framework in real projects. And starting to ask awkward...
Anonymous
August 10, 2009
Hi Srikanth, We split our model into two and were working on this till recently. I tried refactoring the code and it is broken now. When I try to load the data from the other model I get an error like "System.InvalidOperationException: The relationship 'DataAccess.EF.ParameterListValuesParameters' does not match any relationship defined in the conceptual model". Do you have any idea about this error? I searched the net for this error but found no help.
Anonymous
August 10, 2009
By the way, I used your option B to create 2 sets of csdl, msl and ssdl files for the models and used "using" in the second model to refer to the entities in the first model. Thanks, Dhileep
Anonymous
October 20, 2009
Having one .edmx file might be good as long as it can be easily edited both manually and via the designer.
- For manual editing, you could add one simple feature that provides some sort of "navigation links". Thus when you edit some piece of XML related to some table in the SSDL part, VS will show you links to (or previews of) the other related pieces of XML in the CSDL and MSL parts.
- In the entity designer we need the ability to group entities into aggregates with different namespaces (e.g. Membership.User, Membership.Role, Store.Product, Store.CreditCard, etc.). It would be great if we could collapse/expand those "aggregates" and zoom in/out between the aggregates view and the entities view. -- navin@php.net
Anonymous
October 20, 2009
Hi Srikanth, Is there any fix for the "large model" issue in version 4?
Anonymous
October 27, 2009
Will any of the "large model" issues mentioned in Part I and Part II be addressed in the upcoming EF4 release? There is a lot of good information about some of the new features, but can you let us know if there is any movement towards making EF more friendly to large models? Or at least if there will be a less manual approach to splitting models in EF4? My team is in the process of migrating an application with LOTS of tables. So far we only have ~80 tables in EF (out of 1,000+), but we will add more tables to the model as we add functionality. Unfortunately we couldn't add all of the tables to EF due to performance problems and other issues mentioned in these blog posts. Thanks in advance for any information you can provide!
Anonymous
January 27, 2011
Have you tried this in EF 4.0, at least once, based on the steps defined in the blog?
Anonymous
January 30, 2011
I am using a T4 template to generate POCO classes, which are then moved to another project. Can you please provide some tips on how to divide the EF model into separate models while using POCO classes? Thanks in advance.
Anonymous
April 26, 2011
Hi, I don't understand how "customerdetails.model.cs" is created in the multiple schema set method. Could you help me? What is the command? And how can I use these files in my project? I copied the files to a directory and changed my connectionStrings in app.config to refer to them, but this causes my project to generate 49 errors!
Anonymous
April 29, 2011
Any answer?
Anonymous
January 11, 2012
With the "using" approach, how does that affect the startup performance of the app? Currently, we have one big edmx, and start-up performance is bad, because EF 4.2 processes the whole model at startup (even with pre-generated views it is still too slow). With the using approach, will EF only process stuff from one diagram on startup (i.e. process the stuff from one diagram the first time something on that diagram is queried) or will it still process all the metadata for the whole application at startup?
Anonymous
December 19, 2013
The very first thing we need is an EF that deals properly with sub-models, because we handle complexity with abstraction, clean code and less plumbing. Line-of-business applications are most of the time as serious as the number of entities, and 100..900 entities with structure is no exception. I am waiting for EF7 to be capable of handling these situations, and for the time being I use creative workarounds :-)