Channel: Everything SQL Server Compact

Fix for Entity Framework poor INSERT performance with SQL Server Compact and server generated keys


In this blog post I describe how I tracked down the cause of the poor INSERT performance mentioned in the title, and how it can be fixed.

On Stack Overflow, the general opinion was that the reported slowness was “by design” and could not be fixed, but recent tests posted there suggested that something was not being done right.

Since Entity Framework is now Open Source and available on CodePlex, I decided to have a deeper look.

To test if the process could be improved, I created the following console app:

 

   1:  namespace EF6SqlCETest
   2:  {
   3:      using System;
   4:      using System.Data.Entity;
   5:      using System.Diagnostics;
   6:   
   7:      class Program
   8:      {
   9:          static void Main(string[] args)
  10:          {
  11:              using (var db = new StudentContext())
  12:              {
  13:                  Stopwatch sw = new Stopwatch();
  14:                  db.Database.Delete();
  15:                  sw.Start();
  16:                  db.Database.CreateIfNotExists();
  17:                  db.Configuration.AutoDetectChangesEnabled = false;
  18:                  db.Configuration.ProxyCreationEnabled = false;
  19:                  Console.WriteLine(
  20:                      "Db created in {0}", sw.Elapsed.ToString());
  21:                  sw.Restart();
  22:                  for (int i = 0; i < 4000; i++)
  23:                  {
  24:                      var student = new Student { Name = Guid.NewGuid().ToString() };
  25:                      db.Students.Add(student);
  26:                  }
  27:                  Console.WriteLine(
  28:                      "Entities added in {0}", sw.Elapsed.ToString());
  29:   
  30:                  sw.Restart();
  31:                  int recordsAffected = db.SaveChanges();
  32:                  sw.Stop();
  33:                  Console.WriteLine(
  34:                      "Saved {0} entities to the database, press any key to exit.",
  35:                      recordsAffected);
  36:                  Console.WriteLine(
  37:                      "Saved entities in {0}", sw.Elapsed.ToString());
  38:                  Console.ReadKey();
  39:              }
  40:   
  41:          }
  42:      }
  43:   
  44:      public class Student
  45:      {
  46:          public int Id { get; set; }
  47:          public string Name { get; set; }
  48:      }
  49:   
  50:      public class StudentContext : DbContext
  51:      {
  52:          public DbSet<Student> Students { get; set; }
  53:      }
  54:   
  55:  }



The test project and the related app.config is available for download here: http://sdrv.ms/UCL2j5


The test code is a simple Code First DbContext model. For each run I start with a new blank database and create it before calling SaveChanges, so that each part of the process can be timed individually. The two options on lines 17 and 18 (AutoDetectChangesEnabled and ProxyCreationEnabled, both set to false) are there to ensure that the for loop runs quickly; without these options, the loop adding objects takes much longer (test for yourself).


The resulting table looks like this:

CREATE TABLE [Students] (
[Id] int NOT NULL IDENTITY (1,1)
, [Name] nvarchar(4000) NULL
);
GO
ALTER TABLE [Students] ADD CONSTRAINT [PK_dbo.Students] PRIMARY KEY ([Id]);
GO



In order to find out where time was spent during SaveChanges, I ran a Visual Studio Performance Analysis. It turned out that all the time was spent in sqlceqp40.dll, the SQL Server Compact 4.0 unmanaged query processor – so something was amiss.


As described in my earlier blog post, the SQL statement generated after each INSERT in order to return the server-generated id (the IDENTITY value) looked like the following:


SELECT [Id] FROM [Students] WHERE [Id] = @@IDENTITY


So, using the SQL Server Compact Toolbox, I could analyze the 2 statements (the INSERT and the SELECT that follows it):


[Screenshot: the two statements in the SQL Server Compact Toolbox]


And got the following result:


[Screenshot: execution plan showing a table scan]


So for every INSERT a table scan was performed because, for some reason, the SQL Server Compact query processor did not choose an Index Seek. The more rows there were to scan, the worse the performance got, and all the time for the operation was spent doing this.
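
To get a feel for why this grows so badly, here is a rough back-of-the-envelope sketch. It assumes (as a simplification, not something measured) that the lookup after the k-th INSERT scans all k rows then in the table; the class and method names are made up for illustration.

```csharp
using System;

public static class ScanCost
{
    // Hypothetical helper: total rows visited across all inserts,
    // assuming the scan after the k-th INSERT touches all k rows present.
    public static long TotalRowsScanned(int inserts)
    {
        long total = 0;
        for (int k = 1; k <= inserts; k++)
        {
            total += k; // one full scan of the k rows currently in the table
        }
        return total;
    }
}
```

Under that assumption, ScanCost.TotalRowsScanned(4000) gives 8,002,000 row visits for the 4000 inserts in the test app, i.e. the work grows with the square of the number of rows rather than linearly.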


To avoid this, I decided that the replacement statement should avoid table scans but return a value with exactly the same shape as the original statement; that is, it should have the name of the IDENTITY column and be of the correct type (only bigint and int are supported as IDENTITY types with SQL Server Compact).


The return value of @@IDENTITY is numeric, so simply using “SELECT @@IDENTITY AS [Id]” would not work. So the statement should be:


SELECT CAST(@@IDENTITY AS int) AS [Id]


The type could then be either int or bigint and the column alias should of course be the correct column name.
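
As a minimal sketch of that rule (this is not the actual code from the Entity Framework provider; the class, method and parameter names are hypothetical), the statement could be assembled from the key column name and its type like this:

```csharp
using System;

public static class IdentitySql
{
    // Hypothetical helper, not part of Entity Framework.
    // Builds the table-free statement that returns the last IDENTITY value
    // under the key column's own name and type (int or bigint).
    public static string BuildIdentitySelect(string columnName, bool isBigInt)
    {
        if (string.IsNullOrEmpty(columnName))
        {
            throw new ArgumentException("Column name is required.", "columnName");
        }

        string sqlType = isBigInt ? "bigint" : "int";

        // Double any closing bracket so the alias stays a valid quoted identifier.
        string alias = "[" + columnName.Replace("]", "]]") + "]";

        return "SELECT CAST(@@IDENTITY AS " + sqlType + ") AS " + alias;
    }
}
```

For the Students table above, IdentitySql.BuildIdentitySelect("Id", false) yields the statement shown earlier, SELECT CAST(@@IDENTITY AS int) AS [Id]; passing true for isBigInt produces the bigint variant.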


I could then analyze the modified statement:


INSERT INTO [Students] ([Name])
VALUES (N'jasdjsakjdajd');
GO
SELECT CAST(@@IDENTITY AS int) AS [Id]
GO


[Screenshot: execution plan for the modified statement]


And (not surprisingly) no table scan, as the statement does not refer to any table!


So this is what I have implemented in my fix, for which I now need to figure out how to “submit a pull request”.


If you look closer at the fix and compare it to the original code, you will notice that it is quite different. This is partly because a different SQL statement is now generated, but it is also apparent that the original code was based on (or copied from) the implementation for SQL Server, which supports many more key types. As a result, even before my changes, the code for SQL Server Compact was nearly unreadable considering the simple outcome.

