Performance Quiz #13 -- Linq to SQL compiled queries cost
I've written a few articles about Linq now and you know I was a big fan of compiled queries in Linq, but what do they cost? Or more specifically, how many times do you have to use a compiled query in order for the cost of compilation to pay for itself? With regular expressions, for instance, it's usually a mistake to compile a regular expression if you only intend to match it against a fairly small amount of text.
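To make the regular-expression analogy concrete: in .NET the expensive flavor is RegexOptions.Compiled, which emits IL for the pattern. The sketch below uses Python's re module (the pattern and input are made up for illustration) just to show the two usage shapes; the trade-off has the same form even though Python's compile step is far cheaper than IL generation.

```python
import re

# Illustrative pattern and input (not from the post):
pattern = r"\b\w+@\w+\.com\b"
text = "contact alice@example.com or bob@example.com"

# Pay a one-time compile cost up front; reuse is then cheap.
compiled = re.compile(pattern)
hits_compiled = compiled.findall(text)

# No explicit compile step; fine when the pattern is used only a few times.
hits_plain = re.findall(pattern, text)

print(hits_compiled == hits_plain)  # → True: same matches either way
print(len(hits_compiled))           # → 2
```

Note that Python's module-level functions internally cache compiled patterns, so the gap there is small; in .NET, RegexOptions.Compiled does real code generation up front, which is why compiling a pattern for a handful of matches is usually a loss.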
Let's do a specific experiment to get an idea. Using the ubiquitous Northwind database, and getting the same data over and over to control for the cost of the database accesses (and magnify any Linq overheads), we run this query:
var q = (from o in nw.Orders
select new {
OrderID = o.OrderID,
CustomerID = o.CustomerID,
EmployeeID = o.EmployeeID,
ShippedDate = o.ShippedDate
}).Take(5);
and compare it against:
var fq = CompiledQuery.Compile
(
(Northwinds nw) =>
(from o in nw.Orders
select new
{
OrderID = o.OrderID,
CustomerID = o.CustomerID,
EmployeeID = o.EmployeeID,
ShippedDate = o.ShippedDate
}).Take(5)
);
So now the quiz: how many times do I have to use the compiled version of the query in order for it to be cheaper to compile than it would have been to just use the original query directly?
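One way to frame the arithmetic, independent of the answer: if compiling costs C once, and each compiled use saves (u - c) relative to an uncompiled run, the break-even count is ceil(C / (u - c)). A minimal sketch in Python, with made-up illustrative timings rather than measured ones:

```python
import math

def break_even_uses(compile_cost, uncompiled_per_use, compiled_per_use):
    """Smallest number of uses at which a one-time compile pays for itself.

    Returns None when the compiled form is no faster per use,
    in which case compiling never pays off.
    """
    savings_per_use = uncompiled_per_use - compiled_per_use
    if savings_per_use <= 0:
        return None
    return math.ceil(compile_cost / savings_per_use)

# Made-up numbers (milliseconds) purely for illustration:
# compiling costs 5 ms, an uncompiled run 2 ms, a compiled run 1 ms.
print(break_even_uses(5.0, 2.0, 1.0))  # → 5
```

The catch, of course, is that none of C, u, or c is obvious in advance; that's what the experiment is for.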
Comments
Anonymous
January 11, 2008
PingBack from http://geeklectures.info/2008/01/11/performance-quiz-13-linq-to-sql-compiled-queries-cost/
Anonymous
January 11, 2008
Only once, since the query must be "compiled" anyway.
Anonymous
January 11, 2008
[Only once since the query must be "compiled" anyway.] I doubt that. There is almost certainly additional overhead in finding out if the query has already been compiled (using a hashcode or similar mechanism?) and allocating storage for the compiled query in some sort of collection of compiled queries.
Anonymous
January 11, 2008
I doubt it as well. Unfortunately, I do not know enough about Linq to give an informed answer. Therefore, I will be watching Rico's blog with interest as he posts the answer.
JamesNT
Anonymous
January 11, 2008
There is no implicit caching of compiled queries, so I vote for 'once' as well. That's the problem: LINQ doesn't remember the expressions when they are executed twice or more. It always re-interprets an expression tree from scratch. Otherwise, we wouldn't need CompiledQuery functionality in the first place.
Anonymous
January 12, 2008
It depends on the query. Taking into account the cost of the compile, I've had some compiled queries take as many as 15 executions before they paid for themselves.
Anonymous
January 13, 2008
I think queries are cached for the lifetime of the DataContext. And it is worth compiling a query if you use it with different DataContexts more than once.
Anonymous
January 13, 2008
I doubt that queries are cached in the DataContext. I think if you use the same query more than once, compiled queries are better.
Anonymous
January 14, 2008
Well, from my testing (using the query Rico gave), the pay-off is immediate. I'm going to assume the Comments editor is going to mangle the chart below, so to summarize: Compiled always wins, even for just one repetition. At higher repetitions, the ratio settles at 5:1, but oddly, at low reps, compiled wins by as much as 25:1. (Some -- but not all -- of that can be attributed to the timer resolution.)

Reps    Uncompiled  Compiled  Ratio
1       :00.25      :00.014   18:1
5       :00.37      :00.015   25:1
10      :00.27      :00.02    14:1
50      :00.81      :00.06    13:1
500     :01.98      :00.38    >5:1
5000    :14.86      :03.10    <5:1
50000   2:26        :27       >5:1
Anonymous
January 14, 2008
James, your times don't seem to include the compilation time? My tests show that 1 repetition uncompiled is slower than compiling plus 1 repetition of the compiled query. Although I haven't tested specifically with Rico's example, in my tests the compilation time begins to be offset at 15 repetitions. As I stated earlier, this offset is likely dependent on the database and the query involved.
Anonymous
January 14, 2008
Check it out and see if you know the answer. Rico's Performance Tidbits
Anonymous
January 14, 2008
Here is my code for the compiled case. The time is for one call to this method. ("reps" is a hard-coded static const which I manually change between runs.) ("total" is used to prevent the compiler from eliding unused code, and to compare with the results from Test1(), which was the uncompiled version.)

static int Test2()
{
    NorthwindDataContext nwdc = new NorthwindDataContext("(local)");
    var fq = CompiledQuery.Compile((NorthwindDataContext nw) =>
        (from o in nw.Orders
         select new
         {
             OrderID = o.OrderID,
             CustomerID = o.CustomerID,
             EmployeeID = o.EmployeeID,
             ShippedDate = o.ShippedDate
         }).Take(5));
    int total = 0;
    for (int i = 0; i < reps; ++i)
    {
        var t = fq.Invoke(nwdc);
        foreach (var a in t)
        {
            total += a.OrderID;
        }
    }
    return total;
}
Anonymous
January 14, 2008
James: Things you want to control for at low reps:

- jitting time (run once and throw away that result to get all the code jitted)
- state of the connection pool (you don't want to charge the cost of making the connection to one algorithm and not the other, so make several connections first and discard those results too)

What I did was run several thousand batches of both cases first, discard those results, and then measure. Also I tried them in different orders to make sure that the later runs were not penalized by, say, more things on the GC heap (they were not, as it turns out, but I controlled for it anyway). I don't understand how the compiled query can possibly win in the single-use case :)
Anonymous
January 14, 2008
If I include the cost of the compilation in the times, I get ratios like this with that sample code:

Reps    Uncompiled         Compiled           Ratio
1       00:00:00.1047737   00:00:00.0903456   1.2:1
5       00:00:00.0921325   00:00:00.1044020   0.8:1
10      00:00:00.0982715   00:00:00.1024622   1.0:1
50      00:00:00.1724429   00:00:00.1112768   1.5:1
500     00:00:00.6105094   00:00:00.2092043   2.9:1
5000    00:00:05.0588695   00:00:01.2370046   4.1:1
50000   00:00:49.9114562   00:00:11.3373295   4.4:1

Now, with a different query on the same Northwind database the results are different again:

Reps    Uncompiled         Compiled           Ratio
1       00:00:00.4928271   00:00:00.3322023   1.5:1
5       00:00:01.1390319   00:00:01.1036043   1.0:1
10      00:00:02.1391678   00:00:02.0752555   1.0:1
50      00:00:10.0530105   00:00:09.6794834   1.0:1
500     00:01:41.2241573   00:01:35.6616519   1.1:1

And the code/compiled query is:

static int Test()
{
    NorthwindDataContext nwdc = new NorthwindDataContext(@"(local)\SQLExpress");
    int total = 0;
    Stopwatch stopwatch = new Stopwatch();
    stopwatch.Start();
    var cq = CompiledQuery.Compile((NorthwindDataContext nw) =>
        (from customer in nw.Customers
         select new
         {
             Customer = customer,
             Orders = from o in nw.Orders
                      join co in nw.Orders on o.CustomerID equals co.CustomerID
                      where co.CustomerID == customer.CustomerID
                      select o
         }));
    for (int i = 0; i < reps; ++i)
    {
        var q = cq.Invoke(nwdc);
        foreach (var a in q)
        {
            total += a.Orders.Count();
        }
    }
    stopwatch.Stop();
    Trace.WriteLine(String.Format("{0} {1} ({2} ticks)", reps, stopwatch.Elapsed, stopwatch.ElapsedTicks));
    return total;
}

A similar query on a different database produced results that showed compiled was slower than uncompiled up to about 15 reps. So, it really depends on the query and the database. As with any optimization, it should be checked to make sure it really is an optimization and that it's worth doing...
Anonymous
January 14, 2008
Peter correctly points out that if you're bringing back lots of data (as in his second example) all this query compilation stuff is just so much noise. The real cost is in processing the data and creating the objects. With that much data coming back, compilation isn't hurting, but it sure isn't helping much. Pretty much you've got the raw time to read the data and create the objects, and that's it. Which is as it should be :)

I'm guessing now, but I think the primary difference between my results and Peter's (where we were using the same query) is that I put in a dummy run to eliminate the cost of jitting and the cost of getting the database connection pool set up from the experiment. Also I keep the total work constant so that I never have to measure any small times.
Anonymous
January 14, 2008
I was right, but in some sense I cheated. :P
Anonymous
January 14, 2008
Yes, I didn't perform any dummy runs to eliminate jitting in the metrics. Unless I'm NGENing a product I like to see the "worst" case. If the user runs the application for the first time and it is prohibitively slow, they're not going to want to use it again :-). But metrics not including JIT times are equally informative.
Anonymous
January 14, 2008
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Data.Linq;
using System.Diagnostics;
using System.Linq.Expressions;

namespace LINQPerf
{
    class Program
    {
        static void Main(string[] args)
        {
            Stopwatch s, sAvg;
            const int iterations = 10;
            const int ordersToFetch = 10000;
            Northwind nw;
            nw = new Northwind(@"Data Source=D:\Development\Projects\C#\LINQPerf\LINQPerf\Northwind.sdf");
            //nw.Log = Console.Out;
            nw.ObjectTrackingEnabled = false;

            //s = Stopwatch.StartNew();
            //var cacheQuery = from orders in nw.Orders select orders;
            //cacheQuery.Take(ordersToFetch).GetEnumerator();
            //Console.WriteLine("Cache: " + s.ElapsedMilliseconds);

            s = Stopwatch.StartNew();
            (from o in nw.Orders select new { OrderID_N = o.OrderID, CustomerID = o.CustomerID, EmployeeID = o.EmployeeID, ShippedDate = o.ShippedDate }).Take(ordersToFetch).GetEnumerator();
            Console.WriteLine("Non compiled: " + s.ElapsedMilliseconds);
            Console.WriteLine();

            sAvg = Stopwatch.StartNew();
            {
                s = Stopwatch.StartNew();
                (from o in nw.Orders select new { OrderID_N_0 = o.OrderID, CustomerID = o.CustomerID, EmployeeID = o.EmployeeID, ShippedDate = o.ShippedDate }).Take(ordersToFetch).GetEnumerator();
                Console.WriteLine("Non compiled 0: " + s.ElapsedMilliseconds);
                s = Stopwatch.StartNew();
                (from o in nw.Orders select new { OrderID_N_1 = o.OrderID, CustomerID = o.CustomerID, EmployeeID = o.EmployeeID, ShippedDate = o.ShippedDate }).Take(ordersToFetch).GetEnumerator();
                Console.WriteLine("Non compiled 1: " + s.ElapsedMilliseconds);
                s = Stopwatch.StartNew();
                (from o in nw.Orders select new { OrderID_N_2 = o.OrderID, CustomerID = o.CustomerID, EmployeeID = o.EmployeeID, ShippedDate = o.ShippedDate }).Take(ordersToFetch).GetEnumerator();
                Console.WriteLine("Non compiled 2: " + s.ElapsedMilliseconds);
                s = Stopwatch.StartNew();
                (from o in nw.Orders select new { OrderID_N_3 = o.OrderID, CustomerID = o.CustomerID, EmployeeID = o.EmployeeID, ShippedDate = o.ShippedDate }).Take(ordersToFetch).GetEnumerator();
                Console.WriteLine("Non compiled 3: " + s.ElapsedMilliseconds);
                s = Stopwatch.StartNew();
                (from o in nw.Orders select new { OrderID_N_4 = o.OrderID, CustomerID = o.CustomerID, EmployeeID = o.EmployeeID, ShippedDate = o.ShippedDate }).Take(ordersToFetch).GetEnumerator();
                Console.WriteLine("Non compiled 4: " + s.ElapsedMilliseconds);
                s = Stopwatch.StartNew();
                (from o in nw.Orders select new { OrderID_N_5 = o.OrderID, CustomerID = o.CustomerID, EmployeeID = o.EmployeeID, ShippedDate = o.ShippedDate }).Take(ordersToFetch).GetEnumerator();
                Console.WriteLine("Non compiled 5: " + s.ElapsedMilliseconds);
                s = Stopwatch.StartNew();
                (from o in nw.Orders select new { OrderID_N_6 = o.OrderID, CustomerID = o.CustomerID, EmployeeID = o.EmployeeID, ShippedDate = o.ShippedDate }).Take(ordersToFetch).GetEnumerator();
                Console.WriteLine("Non compiled 6: " + s.ElapsedMilliseconds);
                s = Stopwatch.StartNew();
                (from o in nw.Orders select new { OrderID_N_7 = o.OrderID, CustomerID = o.CustomerID, EmployeeID = o.EmployeeID, ShippedDate = o.ShippedDate }).Take(ordersToFetch).GetEnumerator();
                Console.WriteLine("Non compiled 7: " + s.ElapsedMilliseconds);
                s = Stopwatch.StartNew();
                (from o in nw.Orders select new { OrderID_N_8 = o.OrderID, CustomerID = o.CustomerID, EmployeeID = o.EmployeeID, ShippedDate = o.ShippedDate }).Take(ordersToFetch).GetEnumerator();
                Console.WriteLine("Non compiled 8: " + s.ElapsedMilliseconds);
                s = Stopwatch.StartNew();
                (from o in nw.Orders select new { OrderID_N_9 = o.OrderID, CustomerID = o.CustomerID, EmployeeID = o.EmployeeID, ShippedDate = o.ShippedDate }).Take(ordersToFetch).GetEnumerator();
                Console.WriteLine("Non compiled 9: " + s.ElapsedMilliseconds);
            }
            Console.WriteLine("Non compiled avg not in loop: " + ((double)sAvg.ElapsedMilliseconds / 10));

            s = Stopwatch.StartNew();
            for (int i = 0; i < iterations; i++)
            {
                (from o in nw.Orders select new { OrderID_N_AVG = o.OrderID, CustomerID = o.CustomerID, EmployeeID = o.EmployeeID, ShippedDate = o.ShippedDate }).Take(ordersToFetch).GetEnumerator();
            }
            Console.WriteLine("Non compiled avg in loop: " + ((double)s.ElapsedMilliseconds / iterations));
            Console.WriteLine();

            //s = Stopwatch.StartNew();
            var compiledQuery = CompiledQuery.Compile((Northwind nwDb) => (from o in nwDb.Orders select new { OrderID_C = o.OrderID, CustomerID = o.CustomerID, EmployeeID = o.EmployeeID, ShippedDate = o.ShippedDate }).Take(ordersToFetch));
            //Console.WriteLine("Compile: " + s.ElapsedMilliseconds);
            //Console.WriteLine();

            s = Stopwatch.StartNew();
            compiledQuery(nw).GetEnumerator();
            Console.WriteLine("Compiled 0: " + s.ElapsedMilliseconds);
            Console.WriteLine();

            sAvg = Stopwatch.StartNew();
            {
                s = Stopwatch.StartNew();
                compiledQuery(nw).GetEnumerator();
                Console.WriteLine("Compiled 0: " + s.ElapsedMilliseconds);
                s = Stopwatch.StartNew();
                compiledQuery(nw).GetEnumerator();
                Console.WriteLine("Compiled 1: " + s.ElapsedMilliseconds);
                s = Stopwatch.StartNew();
                compiledQuery(nw).GetEnumerator();
                Console.WriteLine("Compiled 2: " + s.ElapsedMilliseconds);
                s = Stopwatch.StartNew();
                compiledQuery(nw).GetEnumerator();
                Console.WriteLine("Compiled 3: " + s.ElapsedMilliseconds);
                s = Stopwatch.StartNew();
                compiledQuery(nw).GetEnumerator();
                Console.WriteLine("Compiled 4: " + s.ElapsedMilliseconds);
                s = Stopwatch.StartNew();
                compiledQuery(nw).GetEnumerator();
                Console.WriteLine("Compiled 5: " + s.ElapsedMilliseconds);
                s = Stopwatch.StartNew();
                compiledQuery(nw).GetEnumerator();
                Console.WriteLine("Compiled 6: " + s.ElapsedMilliseconds);
                s = Stopwatch.StartNew();
                compiledQuery(nw).GetEnumerator();
                Console.WriteLine("Compiled 7: " + s.ElapsedMilliseconds);
                s = Stopwatch.StartNew();
                compiledQuery(nw).GetEnumerator();
                Console.WriteLine("Compiled 8: " + s.ElapsedMilliseconds);
                s = Stopwatch.StartNew();
                compiledQuery(nw).GetEnumerator();
                Console.WriteLine("Compiled 9: " + s.ElapsedMilliseconds);
            }
            Console.WriteLine("Compiled avg not in loop: " + ((double)sAvg.ElapsedMilliseconds / 10));

            s = Stopwatch.StartNew();
            for (int i = 0; i < iterations; i++)
            {
                compiledQuery(nw).GetEnumerator();
            }
            Console.WriteLine("Compiled avg in loop: " + ((double)s.ElapsedMilliseconds / iterations));
            Console.WriteLine();

            sAvg = Stopwatch.StartNew();
            {
                s = Stopwatch.StartNew();
                CompiledQuery.Compile((Northwind nwDb) => (from o in nwDb.Orders select new { OrderID_C_0_0 = o.OrderID, CustomerID = o.CustomerID, EmployeeID = o.EmployeeID, ShippedDate = o.ShippedDate }).Take(ordersToFetch))(nw).GetEnumerator();
                Console.WriteLine("Compiled 0 0: " + s.ElapsedMilliseconds);
                s = Stopwatch.StartNew();
                CompiledQuery.Compile((Northwind nwDb) => (from o in nwDb.Orders select new { OrderID_C_1_0 = o.OrderID, CustomerID = o.CustomerID, EmployeeID = o.EmployeeID, ShippedDate = o.ShippedDate }).Take(ordersToFetch))(nw).GetEnumerator();
                Console.WriteLine("Compiled 1 0: " + s.ElapsedMilliseconds);
                s = Stopwatch.StartNew();
                CompiledQuery.Compile((Northwind nwDb) => (from o in nwDb.Orders select new { OrderID_C_2_0 = o.OrderID, CustomerID = o.CustomerID, EmployeeID = o.EmployeeID, ShippedDate = o.ShippedDate }).Take(ordersToFetch))(nw).GetEnumerator();
                Console.WriteLine("Compiled 2 0: " + s.ElapsedMilliseconds);
                s = Stopwatch.StartNew();
                CompiledQuery.Compile((Northwind nwDb) => (from o in nwDb.Orders select new { OrderID_C_3_0 = o.OrderID, CustomerID = o.CustomerID, EmployeeID = o.EmployeeID, ShippedDate = o.ShippedDate }).Take(ordersToFetch))(nw).GetEnumerator();
                Console.WriteLine("Compiled 3 0: " + s.ElapsedMilliseconds);
                s = Stopwatch.StartNew();
                CompiledQuery.Compile((Northwind nwDb) => (from o in nwDb.Orders select new { OrderID_C_4_0 = o.OrderID, CustomerID = o.CustomerID, EmployeeID = o.EmployeeID, ShippedDate = o.ShippedDate }).Take(ordersToFetch))(nw).GetEnumerator();
                Console.WriteLine("Compiled 4 0: " + s.ElapsedMilliseconds);
                s = Stopwatch.StartNew();
                CompiledQuery.Compile((Northwind nwDb) => (from o in nwDb.Orders select new { OrderID_C_5_0 = o.OrderID, CustomerID = o.CustomerID, EmployeeID = o.EmployeeID, ShippedDate = o.ShippedDate }).Take(ordersToFetch))(nw).GetEnumerator();
                Console.WriteLine("Compiled 5 0: " + s.ElapsedMilliseconds);
                s = Stopwatch.StartNew();
                CompiledQuery.Compile((Northwind nwDb) => (from o in nwDb.Orders select new { OrderID_C_6_0 = o.OrderID, CustomerID = o.CustomerID, EmployeeID = o.EmployeeID, ShippedDate = o.ShippedDate }).Take(ordersToFetch))(nw).GetEnumerator();
                Console.WriteLine("Compiled 6 0: " + s.ElapsedMilliseconds);
                s = Stopwatch.StartNew();
                CompiledQuery.Compile((Northwind nwDb) => (from o in nwDb.Orders select new { OrderID_C_7_0 = o.OrderID, CustomerID = o.CustomerID, EmployeeID = o.EmployeeID, ShippedDate = o.ShippedDate }).Take(ordersToFetch))(nw).GetEnumerator();
                Console.WriteLine("Compiled 7 0: " + s.ElapsedMilliseconds);
                s = Stopwatch.StartNew();
                CompiledQuery.Compile((Northwind nwDb) => (from o in nwDb.Orders select new { OrderID_C_8_0 = o.OrderID, CustomerID = o.CustomerID, EmployeeID = o.EmployeeID, ShippedDate = o.ShippedDate }).Take(ordersToFetch))(nw).GetEnumerator();
                Console.WriteLine("Compiled 8 0: " + s.ElapsedMilliseconds);
                s = Stopwatch.StartNew();
                CompiledQuery.Compile((Northwind nwDb) => (from o in nwDb.Orders select new { OrderID_C_9_0 = o.OrderID, CustomerID = o.CustomerID, EmployeeID = o.EmployeeID, ShippedDate = o.ShippedDate }).Take(ordersToFetch))(nw).GetEnumerator();
                Console.WriteLine("Compiled 9 0: " + s.ElapsedMilliseconds);
            }
            Console.WriteLine("Compiled 0-9 avg not in loop: " + ((double)sAvg.ElapsedMilliseconds / 10));
            Console.ReadLine();
        }
    }
}

Result:

Non compiled: 482
Non compiled 0: 31
Non compiled 1: 16
Non compiled 2: 12
Non compiled 3: 12
Non compiled 4: 12
Non compiled 5: 12
Non compiled 6: 16
Non compiled 7: 12
Non compiled 8: 12
Non compiled 9: 12
Non compiled avg not in loop: 15,2
Non compiled avg in loop: 9,6
Compiled 0: 19
Compiled 0: 6
Compiled 1: 14
Compiled 2: 6
Compiled 3: 6
Compiled 4: 6
Compiled 5: 6
Compiled 6: 6
Compiled 7: 6
Compiled 8: 6
Compiled 9: 6
Compiled avg not in loop: 7,4
Compiled avg in loop: 6,6
Compiled 0 0: 13
Compiled 1 0: 31
Compiled 2 0: 12
Compiled 3 0: 11
Compiled 4 0: 12
Compiled 5 0: 14
Compiled 6 0: 12
Compiled 7 0: 12
Compiled 8 0: 12
Compiled 9 0: 12
Compiled 0-9 avg not in loop: 14,7

Conclusion: compiled expressions are emitted on first run, not within the CompiledQuery.Compile method itself. It seems compilation also affects data mapping.
Anonymous
January 14, 2008
LINQ query versus compiled LINQ query
Anonymous
January 14, 2008
>> A similar query on a different database produced results that showed compiled was slower than uncompiled up to about 15 reps.
>> So, it really depends on the query and the database.

I took a course on Optimizing SQL Server several years ago and learned a very valuable lesson: the answer is ALWAYS "It depends". The general guideline, though, seems to be to use a compiled query for smaller sets of data with moderate re-use. But always profile to check ;)
Anonymous
January 15, 2008
Chabster, when you call GetEnumerator() on a query you are executing it but not reading any of the data. As Rico points out, for large data sets, reading the data is most of the work.
Anonymous
January 15, 2008
mattwar, I don't want data to be read, only to make sure that the query is built and sent to the db. But your point might have some value -- perhaps compilation affects mapping performance?
Anonymous
January 20, 2008
Welcome to the XXXIX issue of Community Convergence. The big news this week is that Microsoft has begun
Anonymous
August 25, 2008
I did a series of postings on Linq Compiled Queries last year; I recently got some questions on those