The Art of CRM Batch Jobs

The more CRM projects I do, the bigger a fan I become of scheduled batch jobs. They are not a silver bullet, but in many scenarios they have clear benefits over the more traditional ways of customizing CRM with code, like form scripting, plug-ins and workflows. Of course, the right way of customizing a CRM or Dynamics 365 solution depends a lot on what you want to accomplish, and I see all of the above as important tools in any CRM dev’s tool chest.

You will also often run into cases where more than one of your tools would be a valid solution to your problem, and that is when you should stop and consider the pros and cons of each tool available to you, rather than just reaching for the hammer time after time. In situations like this, I lately find myself going for a batch job approach more and more often. Why that is, I will try to explain in this post, but the key factors for me (if implemented right) are:

  • Scalability
  • Performance
  • Robustness

The typical batch job scenario

Unlike plug-ins, batch jobs are typically used for data processing that is not event driven, involving a large number of records to be processed in a short amount of time. These are typically integration scenarios, or recalculation jobs. Most often, the data will be transactional.

Another typical scenario is that some logic should be executed on a record on a certain date, or when a certain condition is met, without any changes being made to that record at that time. For example, creating a reminder task for each of your contacts whose birthday it is today.

Plugins/workflows vs. Batch Jobs

There are a couple of caveats when working with plug-ins and workflows in CRM. Both types of customizations are built to add processing logic to save events on single records in CRM. They are not meant to be used for long running calculations or integrations affecting many records in the CRM database. If you try to use them that way, you may run into the following challenges:

Both plug-ins and workflows have a 2 min timeout

Now, 2 minutes should be more than enough for anyone, but just think about it. You might have a business critical application running a plug-in that for some reason runs very slowly, and suddenly it just stops! Well written plug-ins or workflows shouldn’t run for more than a couple of seconds tops, and if you see longer execution times than that, you should consider whether you chose the right approach.

Plug-ins and workflows execute in a transaction (DB locking)

This means that if something goes wrong, the state of the affected records gets rolled back, which is nice. However, it also means that the SQL server will ensure that whatever data you used as input for your logic remains in the state you retrieved it in, until the transaction is committed. In other words, whichever records you retrieve within the plug-in/workflow will be locked for read and write until the transaction is done. If you have plug-ins waiting for external resources (web service replies) running on multiple records, you very quickly get into deadlock scenarios, which cause very bad performance.

NB: The read lock can be bypassed if you add the no-lock=’true’ hint to your FetchXML queries, or set NoLock on your QueryExpressions. This way you can read locked data, knowing full well that the state of the data you read might be rolled back if a CRM transaction should fail. No-lock is not supported in Linq to CRM queries…
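
For illustration, here is a minimal C# sketch of reading without read locks, using the NoLock property on a QueryExpression and the no-lock attribute in FetchXML (the entity and column names are just examples):

using Microsoft.Xrm.Sdk;
using Microsoft.Xrm.Sdk.Query;

public static class NoLockExamples
{
    // Reads accounts without taking read locks, even if another transaction holds them.
    public static EntityCollection ReadWithQueryExpression(IOrganizationService service)
    {
        var query = new QueryExpression("account")
        {
            ColumnSet = new ColumnSet("name"),
            NoLock = true // adds a NOLOCK hint to the generated SQL
        };
        return service.RetrieveMultiple(query);
    }

    // The same read expressed as FetchXML with the no-lock attribute.
    public static EntityCollection ReadWithFetchXml(IOrganizationService service)
    {
        var fetchXml = @"<fetch no-lock='true'>
                           <entity name='account'>
                             <attribute name='name' />
                           </entity>
                         </fetch>";
        return service.RetrieveMultiple(new FetchExpression(fetchXml));
    }
}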

Waiting workflows as a scheduler

By using waiting workflows, it is possible to run some logic on a record on a specific date, or when a condition is met. However, in a transactional setting, waiting workflows are really bad, as they fill up the asyncOperationBase table. CRM is not particularly efficient at handling a huge number of unfinished workflows, so this is something you should try to avoid.

Transactions in Batch jobs

Ever since CRM 2015, it has been possible to execute multiple requests against CRM in a single transaction, using the ExecuteTransactionRequest. This is a powerful tool for making robust batch jobs. The biggest advantage is that you can retrieve all the data you want without locking any rows in the database, and still get the roll-back functionality when storing the changes in one very quick operation (short locking).
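
As a minimal sketch (assuming the changed records have already been prepared in memory), saving everything in one transaction could look like this:

using Microsoft.Xrm.Sdk;
using Microsoft.Xrm.Sdk.Messages;

public static class TransactionalSave
{
    // Sends all updates in a single ExecuteTransactionRequest: either everything
    // is committed, or everything is rolled back. Rows are only locked while
    // this one call executes.
    public static void SaveAll(IOrganizationService service, Entity[] updatedRecords)
    {
        var transaction = new ExecuteTransactionRequest
        {
            Requests = new OrganizationRequestCollection(),
            ReturnResponses = false
        };

        foreach (var record in updatedRecords)
        {
            transaction.Requests.Add(new UpdateRequest { Target = record });
        }

        service.Execute(transaction);
    }
}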

Just remember that if you really need to lock the data you fetch until the transaction is finished, you need to use a plug-in or workflow. The CRM SDK does not offer any means of taking DB locks on reads.

Also, the ExecuteTransactionRequest allows you to stack multiple requests together in one request, but it does not let you run logic on the result of one request before moving to the next. This is typically a drawback when creating linked records, where you need the id of the first created record before you can link the second one to it. The workaround is to assign the ids in code before calling create. If you want to do this, you should generate sequential guids using this guide. Otherwise, CRM will not be able to sort the records in the order they were created without an explicit ordering applied. Be aware that the generation of sequential guids uses the server’s MAC address, and since CRM’s internal guids are created on the SQL server, the guids you generate will in most cases not be in sequence with records created manually in CRM.

Best practices for developing batch jobs

Re-use your batch logic for user requests

I prefer to use batch jobs to process data in bulk on a schedule, while at the same time allowing a user to trigger the same logic on individual records manually. This helps offset the main disadvantage of scheduled jobs: that data is updated with a significant delay (i.e. hourly or nightly). I typically do this by implementing the core of the processing logic in a workflow assembly, which is then referenced from the batch job application. The logic that is run on an individual record can then be exposed to a CRM user as a synchronous ad-hoc workflow or a custom action. Both are implemented in practically the same way, and can be run from a command bar (ribbon) button. However, if you want a button to trigger the logic, I prefer a custom action, as you then do not have to hard code the workflow id in JavaScript.
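
A minimal sketch of the pattern, assuming a hypothetical ProcessContactLogic class holding the shared core logic (the batch job references the same assembly and calls the class directly, once per record):

using System;
using System.Activities;
using Microsoft.Xrm.Sdk;
using Microsoft.Xrm.Sdk.Workflow;

// Shared core logic, placed in the workflow assembly and referenced by the batch job.
public class ProcessContactLogic
{
    private readonly IOrganizationService _service;
    public ProcessContactLogic(IOrganizationService service) { _service = service; }

    public void Process(Guid contactId)
    {
        // ... processing for a single record goes here ...
    }
}

// Thin workflow activity wrapper, so a user can trigger the same logic ad hoc.
public class ProcessContactActivity : CodeActivity
{
    protected override void Execute(CodeActivityContext executionContext)
    {
        var workflowContext = executionContext.GetExtension<IWorkflowContext>();
        var serviceFactory = executionContext.GetExtension<IOrganizationServiceFactory>();
        var service = serviceFactory.CreateOrganizationService(workflowContext.UserId);

        new ProcessContactLogic(service).Process(workflowContext.PrimaryEntityId);
    }
}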

Cache non-transactional data

Caching is really efficient in batch jobs because, by the nature of the job, many of the records being processed will require the same input information. Therefore, caching should be applied to all base and configuration data used by the job. I use MemoryCache.Default in .Net, and for large configuration tables I cache individual entries, while for smaller tables (less than 200 records?) I often cache the entire dataset. Just make sure not to cache data that changes on a regular basis.
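
A minimal sketch of the per-entry approach with MemoryCache.Default (the retrieveFromCrm delegate and the 30-minute sliding expiration are just illustrative choices):

using System;
using System.Runtime.Caching;
using Microsoft.Xrm.Sdk;

public static class ConfigCache
{
    private static readonly ObjectCache Cache = MemoryCache.Default;

    // Returns the cached configuration record if present, otherwise retrieves it
    // from CRM via the supplied delegate and caches it with a sliding expiration.
    public static Entity GetConfigRecord(string key, Func<string, Entity> retrieveFromCrm)
    {
        var cached = Cache.Get(key) as Entity;
        if (cached != null)
            return cached;

        var entity = retrieveFromCrm(key);
        Cache.Set(key, entity, new CacheItemPolicy { SlidingExpiration = TimeSpan.FromMinutes(30) });
        return entity;
    }
}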

Transactional data (salesorders, opportunities) should never be cached, as it changes all the time, and the amount of data you would end up caching would be too much. In most cases, you would not get a performance benefit from caching transactional data anyway.

NB: If you are re-using your processing logic for custom workflow activities, remember that custom types need to be serialized before caching (see earlier post).

NBB: MemoryCache does not work in CRM Online plug-ins and workflows, because plug-ins run in isolation and the sandbox process is recycled frequently! If you have exposed your logic to ad hoc user requests in a CRM Online environment, your code will not fail, but you should not expect running workflows to benefit from caching. If you are in an online scenario, you should instead make sure not to fetch more data than necessary.

Use pagination for retrieving data

Make sure that your batch job logic is scalable, so the job won’t crash with timeouts or OutOfMemoryExceptions. Do this by using paging in all large queries, and make sure that the processing of each row is independent of the other rows in the retrieved page set. Also, for the paging to make sense, don’t keep the retrieved data in memory after you are done processing it.
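
A minimal paging sketch (entity name and page size are just examples), where each page is processed and then discarded:

using Microsoft.Xrm.Sdk;
using Microsoft.Xrm.Sdk.Query;

public static class PagedProcessing
{
    public static void ProcessInPages(IOrganizationService service)
    {
        var query = new QueryExpression("salesorderdetail")
        {
            ColumnSet = new ColumnSet("salesorderdetailid"),
            NoLock = true,
            PageInfo = new PagingInfo { PageNumber = 1, Count = 500 }
        };

        EntityCollection page;
        do
        {
            page = service.RetrieveMultiple(query);

            foreach (var record in page.Entities)
            {
                // Process each row independently of the rest of the page
            }

            // Move to the next page; the paging cookie keeps the query efficient,
            // and the previous page is not kept in memory.
            query.PageInfo.PageNumber++;
            query.PageInfo.PagingCookie = page.PagingCookie;
        }
        while (page.MoreRecords);
    }
}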

Save the processing state to the record itself, and split your processing into several independent stages

Rather than creating processing jobs that are long and complex, with many dependencies that can make the entire processing fail, try to split the process up into smaller steps that are independent of each other. To accomplish this, you should update the records you are working on with the current processing status/stage along the way, so the processing can continue in another run without starting over from scratch. In order to truly be able to re-process any individual record from any state/stage, you need to write code where the processing of one record is completely independent of other records, even if the records are related by nature. This is really useful when using pagination to retrieve the record set to work on, because you can never be sure that all the records related to each other are retrieved in the same page.

When the data you are working on is naturally related, and you cannot relate the records in-memory because you have not loaded them all, you need to maintain the relationships at the database level. This is not as efficient as loading everything into memory and doing all the processing in one go, but it scales a lot better. The approach also results in more robust solutions, as you can see the intermediate state of records that failed processing, rather than rolling the processing back to scratch. Having records with detailed error codes related to specific processing stages makes error handling much easier. The chance that a user will actually look at an error and try to fix it (data errors) also increases drastically if the error message is inside CRM, rather than buried in a log file.

Consider an example:

Let’s say we have an invoicing batch job responsible for creating invoices and sending them to the ERP system.

The initial process, which is run daily, could be like the following:

  1. Query all salesorderdetail (orderlines) created within the last 24 hours
  2. Create an invoice with invoice lines for each salesorderdetail (orderline)
    • The invoices are bundled by the related account
  3. When the invoices are created, they are sent to the ERP system through a web service, in a single call containing a list of invoices

The problems with this approach are the following:

  • If one salesorderdetail fails processing, nothing can be invoiced on that account.
  • If the processing logic is not built to expect errors, invoices will be left stranded in an error state, or an entire invoice might have to be rolled back in case of errors (if saved in the same transaction)
  • If the queried dataset has grown very large, the job might fail with timeouts or OutOfMemoryExceptions.
    • Alternatively, you can use paging to avoid this, but then you will not end up with a single invoice per account, as individual lines on the same account might be fetched in separate pages

To improve on the initial design, we can do the following:

  1. Step – Create invoice lines:
    • Query all salesorderdetail (orderlines) in a specific state (i.e. No related invoice line, or invoice status == null)
      • Use paging to reduce the amount of memory used
    • For each salesorderdetail:
      • Create an invoice line
      • Do validation to ensure the record can and should be processed in the next step
      • Set the status on the invoice line (and salesorderdetail). This could also be a validation error code.
  2. Step – Bundling:
    • Query all invoice lines that are validated OK
      • Use paging to reduce the amount of memory used
    • Try to find a draft invoice for the related account in CRM
      • If one is found, add the invoice line to it
      • If not, create a new invoice in draft state
  3. Step – Calc sums and Validate
    • Query all invoices in draft state
      • Use paging to reduce the amount of memory used
    • For each invoice:
      • Retrieve all invoice lines
      • Sum up amounts
    • Do final validation
    • Update status on invoice from draft to “ready to send”, or set an error code.
  4. Step – Send to ERP
    • Query all invoices in “ready to send”
    • Generate message and call ERP web service
    • Update status on all the invoices

Each of these steps can be run completely independently of the others. If an error occurs, or the processing stops, the processing can continue exactly where it stopped the next time. This is because the records in CRM contain the exact state of the processing, as opposed to keeping the state in-memory.

By the way: these steps could be separate batch jobs running at different intervals, or they could be different stages in the same job. It is completely up to you. However, if you expose the job logic as workflows or custom actions to be called by a user manually, it would be a good idea to be able to force each step individually. That gives the most control to the user.

Retrieve first, and save last

If possible, retrieve all the un-cached data you need in as few queries as possible. Then process your logic in-memory. Finally, save all the changes using an ExecuteTransactionRequest.

When retrieving the data for the job, see if you can get all the data you need by joining onto the record set used as input for the processing. This will be the optimal solution in terms of performance. In general, when considering performance in batch jobs, you can ignore the in-memory processing and the initial fetch of records to work on. For any long running job, the only relevant performance factors are the fetches made while processing individual rows, along with saving the changes.
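
As an illustration, the sketch below joins the order lines to their parent order and account in one query, so account information is available without a per-row lookup (the column choices are just examples):

using Microsoft.Xrm.Sdk;
using Microsoft.Xrm.Sdk.Query;

public static class JoinedRetrieve
{
    public static EntityCollection RetrieveOrderLinesWithAccountInfo(IOrganizationService service)
    {
        var query = new QueryExpression("salesorderdetail")
        {
            ColumnSet = new ColumnSet("salesorderdetailid", "quantity", "priceperunit"),
            NoLock = true
        };

        // salesorderdetail -> salesorder -> account, bringing account columns along
        var orderLink = query.AddLink("salesorder", "salesorderid", "salesorderid");
        var accountLink = orderLink.AddLink("account", "customerid", "accountid");
        accountLink.EntityAlias = "acct";
        accountLink.Columns = new ColumnSet("paymenttermscode");

        // Joined columns come back on each row as AliasedValue, e.g. "acct.paymenttermscode"
        return service.RetrieveMultiple(query);
    }
}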

Thread safety

Depending on the nature of the data you are processing, and the business criticality of the application, you might want to consider thread safety. What you want to avoid is having multiple simultaneous instances of the job running at the same time, processing the same data. This can especially be an issue if you use the strategy of fetching all the relevant data for the entire batch in one big query, and doing the rest of the processing in-memory. The processing can run over a long period of time, and without some locking mechanism, there is a risk that the data changes without the change being reflected in the job while it is running.

Even if you fetch the data for each individual record when it is being processed, and you check whether the record still needs processing, there is a risk of double processing. This can happen if the processing is done in a web service layer that supports multiple simultaneous threads, and transactional data is being cached. What can happen in such a scenario is that when you have two threads processing the same large dataset, the second thread catches up with the first one, because it runs much faster once all the required data is cached.

With threads running in parallel, I have seen a case where both threads check whether the data should be processed, both process the data, and only then update the record’s status to prevent further processing. Imagine what that can do in an invoicing scenario!

However, in the most common cases, you don’t need thread safety in batch jobs, because you will have the job hosted in a way that only allows one concurrent thread. As long as a user doesn’t copy the job locally, or a developer decides to debug against production etc. 😉

If you think you need thread safety, I recommend two approaches:

  • Record level locking: If you would like to be able to run processing in multiple threads, and you need to make 100 % sure that you have the current version of each record, you need the running thread to lock the individual records being processed. You can do this by having the job call a web service that performs the actual processing. In the web service application you can use System.Runtime.Caching.MemoryCache to keep a list of records that are being processed and should be locked for other threads. Use the C# lock statement whenever updating the list of locked records, to avoid timing issues.
  • Global locking: If you don’t need the granularity above, you can implement a lock that makes sure only one instance of the job can run at the same time, wherever it is being run from. Do this by creating a Lock entity in CRM, create a single record for each job type, and update this record to lock/unlock for processing (see the sketch after this list). You might also want a new instance of the job to unlock or take over an existing lock as a failsafe, if the lock has been held for an abnormally long time (i.e. the previous job failed to release the lock).
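
A hedged sketch of the global locking approach. The new_batchlock entity and its new_name / new_lockedon fields are hypothetical placeholders, and note that without optimistic concurrency two instances could still race on the update:

using System;
using System.Linq;
using Microsoft.Xrm.Sdk;
using Microsoft.Xrm.Sdk.Query;

public static class JobLock
{
    // Tries to take the lock record for the given job type. Returns false if
    // another instance holds a lock that is not yet considered stale.
    public static bool TryAcquire(IOrganizationService service, string jobName, TimeSpan staleAfter, out Guid lockId)
    {
        var query = new QueryExpression("new_batchlock") { ColumnSet = new ColumnSet("new_lockedon") };
        query.Criteria.AddCondition("new_name", ConditionOperator.Equal, jobName);

        var lockRecord = service.RetrieveMultiple(query).Entities.FirstOrDefault();
        lockId = lockRecord == null ? Guid.Empty : lockRecord.Id;
        if (lockRecord == null)
            return false; // no lock record configured for this job type

        var lockedOn = lockRecord.GetAttributeValue<DateTime?>("new_lockedon");
        if (lockedOn.HasValue && DateTime.UtcNow - lockedOn.Value < staleAfter)
            return false; // another instance holds a fresh lock

        // Take (or take over) the lock by stamping the current time
        var update = new Entity("new_batchlock") { Id = lockRecord.Id };
        update["new_lockedon"] = DateTime.UtcNow;
        service.Update(update);
        return true;
    }

    // Releases the lock by clearing the timestamp.
    public static void Release(IOrganizationService service, Guid lockId)
    {
        var update = new Entity("new_batchlock") { Id = lockId };
        update["new_lockedon"] = null;
        service.Update(update);
    }
}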

Use impersonation

Plug-ins and workflows automatically use impersonation, in the sense that any create or update request will run in the context of the user that triggered the event.

When you move the logic out into a batch job (for performance reasons), you often want the same behavior. Therefore, consider impersonating the user set as owner of the main record used as processing input, when making changes to that record. Impersonation is done by setting CallerId on the OrganizationServiceProxy object.
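
A minimal sketch, assuming the record has been retrieved with its ownerid attribute included:

using System;
using Microsoft.Xrm.Sdk;
using Microsoft.Xrm.Sdk.Client;

public static class ImpersonationHelper
{
    // Performs the update in the context of the record's owner, then restores
    // the original caller (the batch job's service account).
    public static void UpdateAsOwner(OrganizationServiceProxy serviceProxy, Entity record)
    {
        var owner = record.GetAttributeValue<EntityReference>("ownerid");
        var originalCaller = serviceProxy.CallerId;
        try
        {
            serviceProxy.CallerId = owner.Id;
            serviceProxy.Update(record);
        }
        finally
        {
            serviceProxy.CallerId = originalCaller;
        }
    }
}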

Deployment options

On-premise deployment

For an on-premise deployment, I usually go with a simple console application that is scheduled with Windows Task Scheduler.

Some people tell me they prefer to implement the jobs as Windows Services, because that should be more reliable and stable. However, it is not my experience that Windows Services are more reliable than a scheduled task, so I prefer the Task Scheduler, because it has so many scheduling options.

Consider the following when using the Task Scheduler:

  • The task needs to be set to execute whether the owning user is logged in or not.
  • If you need to run the task as a background task, set the owner to “SYSTEM”
    • NB: This will grant full system privileges to the task, so be careful. You also might not be able to open log files etc. generated on the file system.
    • You won’t see the console window open when a background task is running
    • There might be a performance difference between running the task as a background or a foreground task. Server systems are typically configured to prioritize background tasks, while PCs are configured to prioritize foreground tasks.
  • Ensure proper error handling. If the job fails completely, it needs to close itself down. Otherwise it will block further executions later.

For more complex scenarios, you could use Windows Task Scheduler in cooperation with a WCF service layer or WebAPI hosted on IIS.

Cloud deployment

The on-premise deployment options are usually not very convenient if you are using Dynamics 365 Online. Instead you could use Azure WebJobs, which gives you much the same functionality as scheduled tasks, in an easy interface.

If you need a more complex setup using web services, you can host a service application as an Azure App Service Web App. You could also have a look at Azure Logic Apps, which is really good for orchestrating the pieces in between.

CRM JS webresource best practice for ribbon Commands / Forms

Introduction

As a CRM developer working with script web resources in forms and ribbons, I have learned a few lessons over the years, and I have seen everything from best practice implementations to complete unstructured messes.

What I often see when taking over an existing implementation is a lack of namespacing in the JavaScript web resources. A flat, function based code structure, where a myriad of functions are dumped directly into the global scope, is not only hard to read, but also increases the chance of overriding existing functionality. For example, commonly seen function names are “onLoad”, “onSave”, “validate” or “calculate”. Imagine if you are using two different web resources with conflicting definitions of “onLoad” on the same form, without namespacing; you are headed for trouble!

Another common “mistake” is using the same web resource for form events and for ribbon commands. This is not a problem in itself, but you should be aware that the scripts will be loaded twice, and that it might make debugging more confusing. While form web resources are loaded as normal included scripts during form initialization, command code for ribbon buttons is loaded as unnamed dynamic script blocks when the ribbon loads. This means that you cannot set a breakpoint in the ribbon code before clicking a button the first time (because the script is not yet loaded). It also means that if you use the same web resource for the form and the ribbon, the dynamic ribbon script will override the form script, and thus you will lose the ability to debug the loaded form script (the dynamic script block can still be debugged if you can locate it…).

This post is an attempt to share some best practices that can prevent these issues, based on my experience. Code examples are found at the bottom.

General JavaScript development

Switch to TypeScript

Do your coding in TypeScript to get better IntelliSense and compile time error checking. ReSharper 9+ has TypeScript support, to help you write even better TypeScript.

NB: To really be flying with TypeScript, you should use third party libraries that have TypeScript definition files. Luckily there are TypeScript definition files for jQuery, AngularJS, XrmServiceToolkit, the Xrm.Page API, etc. See more at http://definitelytyped.org/

Follow general best practices

Follow general best practices for JavaScript, which to some extent also apply to TypeScript: http://www.w3schools.com/js/js_best_practices.asp

There are also many online resources to improve your TypeScript style and code quality. Read up on the official documentation and guides like https://github.com/panuhorsmalahti/typescript-style-guide

Here are also two good CRM specific links, from JoeCRM:

CRM JavaScript Best Practices

JavaScript Best Practices for Microsoft Dynamics CRM 2015

Handle dependency conflicts

Be aware that CRM and certain managed solutions use common third party libraries like jQuery. If you are using different versions of these libraries, you might run into compatibility problems, as the last loaded version overrides the others. jQuery has a noConflict function that can be used to load several versions of jQuery at once, and this is facilitated by module loaders like requireJS (http://requirejs.org/)

Optimize your scripts for good performance

If you have a lot of script files, and it is hurting load time, consider bundling and minifying scripts. This can be done with for example Gulp (http://gulpjs.com/)

General CRM Scripting

Namespaces

ALWAYS use namespaces in CRM JavaScript, especially if you divide your logic into multiple scripts. It will make the code a lot easier to maintain and read when you can see exactly what is being referenced. Also, you should introduce namespaces from the start, mostly because it is a pain to change all the references later on.

I prefer to use a hierarchical namespace structure, with the following naming convention:

[Orgname].CRM.[EntityLogicalName].[Area]

For example: Contoso.CRM.Account.Form and Contoso.CRM.Account.Ribbon

I also prefer to have a separate namespace for internal functions and variables, so that the only things exposed through the main namespace are the functions registered as commands or event handlers in the forms.

For example: Contoso.CRM.Account.Form._privateMembers

Prepare for re-use

Avoid having duplicate code in your form JS and ribbon JS. Place common code in a third library instead if possible, like Contoso.CRM.Account.Common.js. This will not only make the code re-usable, but also easier to maintain, as changes will only have to be made in one place.

Use Virtual Folders for webresources

To make it easier to keep your scripts organized, group them in folders on your development computer, and make the web resource name reflect that path. This is also a recommended practice from Microsoft (https://msdn.microsoft.com/en-us/library/gg309473.aspx)

For example: A file C:\ContosoCRM\Webresources\JavaScript\Account\AccountForm.js could be registered as “new_/Account/AccountForm.js”, with “new_” being the CRM publisher prefix.

Ribbons / Command bar

Use an editor

Use a ribbon editor like Ribbon Workbench (https://ribbonworkbench.uservoice.com/). There are others, but Ribbon Workbench is the one I have had best experiences with.

Load the actual logic in the form when appropriate

If the command/logic is only to be used on the form ribbon/command bar, consider having the actual logic loaded by the form itself, instead of as a dynamic script in the ribbon. This makes debugging easier and promotes reuse of the logic for form events. Also, making changes to web resources is a lot faster if you don’t have to change the command definition in the ribbon at the same time (e.g. a changed function signature).

For commands that should be available on the Entity home page and in related views, all required scripts should be loaded by the ribbon itself.

Debugging Ribbon Commands

If you have trouble debugging the dynamic script blocks, you can try the “Breakpoints in Dynamic JavaScript” feature in Chrome. This is described along with other debugging approaches in this post: http://blogs.msdn.com/b/crm/archive/2015/11/29/debugging-custom-javascript-code-in-crm-using-browser-developer-tools.aspx

Loading dependencies in Ribbons / Command bars

If you need to load other JavaScript files in a ribbon command before executing your function (e.g. jQuery), load them by calling the function “isNaN”. The reason for this is that a JavaScript action in ribbon XML requires a web resource and a function name. isNaN() is part of the ECMAScript standard and will therefore always be declared. It will also always execute without errors, and it has very little overhead.

Add a Keyboard Shortcut to your button

I think it is a good idea to add keyboard shortcuts for command bar buttons in forms. I have successfully registered keyboard shortcuts in the form’s onLoad event using this library: http://www.webreference.com/programming/javascript-keyboard-shortcuts/2.html
Add a shortcut hint as a tooltip on the button in the ribbon, so you won’t forget.

Code examples

Ribbon command script (JavaScript)

This example uses a minimal implementation, where the actual logic is in a separate file that is used both by the ribbon and form events.

var Contoso = Contoso || {};
Contoso.CRM = Contoso.CRM || {};
Contoso.CRM.Account = Contoso.CRM.Account || {};

Contoso.CRM.Account.Ribbon = {
    helloWorldCommand: function () {
        Contoso.CRM.Account.Common.helloWorldCommand();
    }
}

Ribbon command script (TypeScript)

This example uses a minimal implementation, where the actual logic is in a separate file that is used both by the ribbon and form events.

module Contoso.CRM.Account.Ribbon {
    export function helloWorldCommand() {
        Common.helloWorldCommand();
    }
}

Common code used by Ribbon and Form (JavaScript)

var Contoso = Contoso || {};
Contoso.CRM = Contoso.CRM || {};
Contoso.CRM.Account = Contoso.CRM.Account || {};

Contoso.CRM.Account.Common = {
    helloWorldCommand: function () {
        alert("Hello World!");
    }
}

Common code used by Ribbon and Form (TypeScript)

module Contoso.CRM.Account.Common {
    export function helloWorldCommand() {
        alert("Hello World!");
    }
}

Form script (JavaScript)

Adds a keyboard shortcut in the form, to the ribbon button.

Contoso.CRM.Account.Form = {
    _privateMembers: {
        initialize: function() {
            //...
        }
    },

    onLoad: function() {
        // Register shortcuts using http://www.webreference.com/programming/javascript-keyboard-shortcuts/2.html
        shortcut.add("Ctrl+H", Contoso.CRM.Account.Common.helloWorldCommand, { 'type': 'keydown', 'propagate': true, 'target': document });

        Contoso.CRM.Account.Form._privateMembers.initialize();
    }
}

Form script (TypeScript)

Adds a keyboard shortcut in the form, to the ribbon button.

module Contoso.CRM.Account.Form {
    
    // private members
    function initialize() {
        //...
    }

    // public/exported members
    export function onLoad() {
        // Register shortcuts using http://www.webreference.com/programming/javascript-keyboard-shortcuts/2.html
        shortcut.add("Ctrl+H", Contoso.CRM.Account.Common.helloWorldCommand, { 'type': 'keydown', 'propagate': true, 'target': document });

        initialize();
    }
}

Limitations when using OData in CRM to retrieve records

This post describes the limitations of the OData endpoint in CRM 2011, which should still be relevant for CRM 2013 and 2015. There is no reason to believe that the OData endpoint will be developed further, as Microsoft is working on a brand new Web API to replace both the current OData endpoint and the SOAP endpoint.

Please refer to this article for general information on the available options when querying CRM using OData: OData system query options using the OData endpoint

Re-blogged from http://blogs.msdn.com/b/crm/archive/2011/03/02/using-odata-retrieve-in-microsoft-dynamics-crm-2011.aspx:

With the release of Microsoft Dynamics CRM 2011, Microsoft has added a Windows Communication Foundation (WCF) Data Services (OData) endpoint. The endpoint facilitates CRUD operations on entities via scripts, using Atom or Json format. This post mentions some of the considerations when using the endpoint, specifically around the use of retrieves.

First, the operations supported over this endpoint are limited to create, retrieve, update and delete. The REST philosophy does not support other operations, and so Microsoft followed suit. Other operations were not implemented, since the story around service operations was not fully developed in the WCF Data Services offering at the time.

The $format and $inlinecount operators are not supported. CRM’s OData endpoint only supports $filter, $select, $expand, $top, $skip and $orderby.

Some of the restrictions when using the implemented operators are:

$expand
  • A maximum of 6 expansions is allowed
$top
  • Page size is fixed to a maximum of 50 records
  • $top gives the total records returned across multiple pages
$skip
  • When used with distinct queries, the total (skip + top) record size is limited to 5000
  • In CRM, distinct queries do not use paging cookies, so we are limited by the CRM platform limitation of 5000 records
$select
  • One level of navigation property selection is allowed, i.e.

…/AccountSet?$select=Name,PrimaryContactId,account_primary_contact

…/AccountSet?$select=Name,PrimaryContactId,account_primary_contact/LastName&$expand=account_primary_contact

$filter
  • Conditions on only one group of attributes are allowed. By a group of attributes I am referring to a set of conditions joined by And/Or clauses.
  • The attribute group may be on the root entity

…/TaskSet?$expand=Contact_Tasks&$filter=Subject eq ‘test’ and Subject ne null

  • (or) on the expanded entity.

…/TaskSet?$expand=Contact_Tasks&$filter=Contact_Tasks/FirstName eq ‘123’

  • Arithmetic, datetime and math operators are not supported
  • Among the string functions, substringof, endswith and startswith are supported
$orderby
  • Ordering is only allowed on the root entity
Navigation
  • Only one level of navigation is allowed in any direction of a relationship. The relationship can be 1:N, N:1 or N:N

Working with the OrganizationServiceContext

When developing .Net code that works with CRM data through the CRM Organization Service, we use the OrganizationServiceProxy class of the CRM SDK. Alternatively, we can use the OrganizationServiceContext class, which is generated by crmsvcutil.exe along with early bound proxy classes for each entity in our CRM organization. The OrganizationServiceContext class builds on top of the OrganizationServiceProxy and adds a lot of extra functionality, like tracking changes and managing identities and relationships, and it gives you access to the Microsoft Dynamics CRM LINQ provider.

See more information and examples here:

Use the OrganizationServiceContext class

In most typical cases I will use the OrganizationServiceContext class in order to use Linq to query data in CRM. Linq is by far my favorite method of querying CRM, for the following reasons:

  • Queries can be written type safe and with 100% IntelliSense support, i.e. no strings in your code to specify attribute names. ReSharper is also a big help when writing Linq to CRM.
  • Compact queries, either with SQL-like syntax or lambda expressions.
  • Prettier and more readable code than FetchXML and QueryExpressions.
  • It is possible to unit test your queries, by stubbing the data sets in the CrmServiceContext using Microsoft Fakes.
  • A query can return more than one entity type at the same time, by constructing an anonymously typed container as the return object (see the sketch after this list).
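
A small sketch of such a query, using the early bound classes generated by crmsvcutil and returning fields from two entities through an anonymous type:

using System;
using System.Linq;
using Microsoft.Xrm.Sdk;

public static class LinqExamples
{
    public static void ListAccountsWithPrimaryContacts(IOrganizationService service)
    {
        using (var ctx = new CrmServiceContext(service))
        {
            // Join contacts to the accounts where they are the primary contact,
            // and project the result into an anonymous type.
            var rows = (from c in ctx.ContactSet
                        join a in ctx.AccountSet
                            on c.ContactId equals a.PrimaryContactId.Id
                        select new
                        {
                            AccountName = a.Name,
                            ContactName = c.FullName
                        }).ToList();

            foreach (var row in rows)
            {
                Console.WriteLine("{0}: {1}", row.AccountName, row.ContactName);
            }
        }
    }
}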

Linq to CRM also has some limitations, for which you should use FetchXML or QueryExpressions instead. Most notably, it is missing outer joins and the “in” operator. Other limitations you should be aware of are described here:

Use LINQ to construct a Query

Another useful aspect of the OrganizationServiceContext is modifying data in CRM. This is done by attaching objects (entities) to the service context, and submitting the changes using the SaveChanges method. However, you need to be aware that CRM records are by default automatically added to the context when performing Linq queries, so saving changes might have unforeseen consequences.

Before you start creating and updating object through the service context, I suggest reading this excellent post by Scott Durow:

Do you understand MergeOptions?

Comparison of OrganizationServiceProxy and OrganizationServiceContext

Bulk insert/update supported

  • OrganizationServiceProxy: No. All Update/Create/Execute calls are handled as individual web service calls.
  • OrganizationServiceContext: Yes. All changes in the service context are processed in a single web service call with SaveChanges. This also supports relating objects to each other in the context, and having multiple inter-relationships created in the same service call.

When to use OrganizationServiceProxy

  • When you need simple CRUD operations on a single or a few records.
  • When you are using Linq and don’t understand merge options, or the differences when modifying objects with the two different approaches.
  • When you are on CRM 2011 rollup 11 or earlier.

When to use OrganizationServiceContext

  • When you need to modify or create many objects at the same time, and you want the bulk update for improved performance.
  • When you are already tracking objects you have retrieved through the service context, and you can reduce the code by just saving changes to the retrieved objects.

Creation of objects

OrganizationService


var account = new Account
{
    Name = "Test account",
    Address1_City = "Copenhagen"
};
var accountId = service.Create(account);

NB: The id is returned

OrganizationServiceContext


using (var ctx = new CrmServiceContext(service))
{
    var account = new Account
    {
        AccountId = Guid.NewGuid(),
        Name = "Test account",
        Address1_City = "Copenhagen"
    };
    ctx.AddObject(account);

    ctx.SaveChanges();
}

NB: The id is not returned. Ids should be assigned manually in code before saving, if the id is to be used later in the code. Alternatively, use ctx.AddLink() to have references between the created objects set automatically in SaveChanges, without having to know the ids beforehand:


using (var ctx = new CrmServiceContext(service))
{
    var account = new Account
    {
        Name = "Test account",
        Address1_City = "Copenhagen"
    };
    ctx.AddObject(account);

    var contact = new Contact
    {
        FirstName = "John",
        LastName = "Doe"
    };
    ctx.AddObject(contact);

    ctx.AddLink(contact, new Relationship("contact_customer_accounts"), account);

    ctx.SaveChanges(); // Creates two objects, and sets a reference between them
}

Update of objects

OrganizationService


var accountUpdate = new Account
{
    AccountId = accountId,
    Name = "Test Account"
};

service.Update(accountUpdate);

OrganizationServiceContext


using (var ctx = new CrmServiceContext(service))
{
    var accountUpdate = new Account
    {
        AccountId = accountId,
        Name = "Test Account"
    };
    ctx.Attach(accountUpdate); // The object is now tracked, and will be evaluated by SaveChanges()

    ctx.UpdateObject(accountUpdate); // Equivalent of OrganizationService.Update()

    ctx.SaveChanges(); // This performs the actual call against CRM 
}

As mentioned earlier, objects that are retrieved from CRM through the service context are automatically tracked, based on the merge options in effect. This can lead to unexpected errors if you are not aware of the automatic tracking of objects:

using (var ctx = new CrmServiceContext(service))
{
    var trackedAccount = ctx.AccountSet.First(a => a.AccountId.Value == accountId); // Automatically tracked in context

    var accountUpdate = new Account
    {
        AccountId = accountId,
        Name = "Test Account"
    };
    ctx.Attach(accountUpdate); // Error: System.InvalidOperationException: The context is already tracking a different 'account' entity with the same identity.
    ctx.UpdateObject(accountUpdate);           

    ctx.SaveChanges();
}
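
One possible way around the conflict above (a sketch, under the assumption that you don’t need the queried entity to be tracked) is to switch off automatic tracking with MergeOption.NoTracking from Microsoft.Xrm.Sdk.Client before querying, and only attach the object you intend to update:

using (var ctx = new CrmServiceContext(service))
{
    ctx.MergeOption = MergeOption.NoTracking; // queried entities are no longer attached to the context

    var account = ctx.AccountSet.First(a => a.AccountId.Value == accountId); // not tracked

    var accountUpdate = new Account
    {
        AccountId = accountId,
        Name = "Test Account"
    };
    ctx.Attach(accountUpdate);       // no identity conflict, because the query above was not tracked
    ctx.UpdateObject(accountUpdate);

    ctx.SaveChanges();
}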