Examine Provider for LINQ2Umbraco
FEBRUARY 27 2011
Over the last couple of months I have been fortunate enough to work on a couple of Umbraco projects that involved using Examine for searching, and in the latest project we are using a Lucene index for caching Umbraco Members. This works surprisingly well, and the performance of Examine as opposed to the standard Member API is beyond comparison. But performance wasn't the only gain: we have to retrieve members based on different properties, which we can now do with a simple search.
After having worked with members through Examine I started thinking about whether there was anything to gain by using Examine to store and retrieve Umbraco documents. There is no doubt that Examine is super fast, but storing a page with its properties would need some modifications:
1. When Examine is indexing content it uses standard Lucene Analyzers, and HTML is stripped from the content. That makes perfect sense in a search scenario, but not so much when using the index as an Umbraco document store.
2. Not everything is indexed by default when using Examine. A few standard properties are not indexed at all, and others are Analyzed by default.
So the first step is to create a custom indexer that stores a complete document with all its properties, none of which are analyzed upon indexing. Creating a custom Examine indexer is fairly straightforward. Shannon and Aaron have really done a great job with the implementation of Examine, and extending, modifying or creating is possible in every way imaginable.
With storing sorted, retrieval is the next step, and creating a provider for LINQ2Umbraco is an obvious choice - well, at least in my opinion. Who doesn't love strongly typed objects :) Again I have to give a shout-out to the black belt Ninja of the Aussie Umbraco Clan for creating a provider-based LINQ2Umbraco implementation. After spending a couple of hours reading the source I started to create my own provider, which uses Examine for retrieving Umbraco documents instead of using the XML context like the standard implementation. I made the provider by creating an ExamineDataProvider which implements the abstract UmbracoDataProvider class, an ExamineAssociationTree which implements the abstract AssociationTree class and an ExamineTree which implements the abstract Tree class. I also made an ExamineDataContext, which implements the IUmbracoDataContext interface. Pretty straightforward, right? The hard part was figuring out how to replace the XML context lookups with Examine searches, but it actually worked out pretty well.
Because my initial intention was to create something that performed well, performance testing will be the final examination (no pun intended). In order to test the performance of the Examine provider I created a number of test cases, which retrieve and handle data in different ways. For comparison I created the same tests using the standard LINQ2Umbraco provider. Each test case is run 15 times and the outcome is the average of these test runs.
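To make the "run 15 times and average" approach concrete, here is a minimal sketch of how such a timing harness could look. It is not the code used for the numbers in this post: the `Benchmark` class, its `Average` method and the placeholder workload are hypothetical names made up for illustration, and only `Stopwatch` and `TimeSpan` come from the .NET framework.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper, not part of LINQ2Umbraco or Examine.
public static class Benchmark
{
    // Runs the test case the given number of times and returns the mean elapsed time.
    public static TimeSpan Average(Action testCase, int runs = 15)
    {
        if (testCase == null) throw new ArgumentNullException("testCase");
        if (runs <= 0) throw new ArgumentOutOfRangeException("runs");

        long totalTicks = 0;
        for (int i = 0; i < runs; i++)
        {
            var stopwatch = Stopwatch.StartNew();
            testCase();
            stopwatch.Stop();
            // Elapsed.Ticks is measured in TimeSpan ticks (100 ns each).
            totalTicks += stopwatch.Elapsed.Ticks;
        }

        return TimeSpan.FromTicks(totalTicks / runs);
    }
}

public static class Program
{
    public static void Main()
    {
        // Placeholder workload; a real test case would query the data context here.
        TimeSpan average = Benchmark.Average(() =>
        {
            int sum = 0;
            for (int i = 0; i < 1000; i++) sum += i;
        });

        // The default TimeSpan format is hh:mm:ss.fffffff.
        Console.WriteLine("Average: " + average);
    }
}
```

A harness like this averages every run, including the first one, so any one-off warm-up cost ends up in the reported number.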
The Umbraco setup is version 4.6.1 and .NET 4.0 with 7 different Document Types with the following structure:
Home (Number of nodes in Umbraco: 1)
- EventOverview (Number of nodes in Umbraco: 1)
-- EventType (Number of nodes in Umbraco: 5)
--- Year (Number of nodes in Umbraco: 22)
---- Month (Number of nodes in Umbraco: 237)
----- Event (Number of nodes in Umbraco: 2,896)
------ EventSection (Number of nodes in Umbraco: 28,830)
I generated a number of nodes in Umbraco to test different scenarios using the above structure. The scenarios range from getting a specific node by id to getting the parent of a node, getting its ancestors and traversing its children. These are all common tasks in my opinion, and I could probably have included a lot more.
1. Scenario - DataContext: The first scenario is not all that interesting, as it's just running a simple get from the DataContext of each provider, and as you can see from the numbers there is no significant difference between the two.
| Test | Examine | LINQ2Umbraco |
| --- | --- | --- |
| GetMonths | 00:00:00.0000506 | 00:00:00.0000508 |
| GetYears | 00:00:00.0000477 | 00:00:00.0000472 |
| GetEvents | 00:00:00.0000466 | 00:00:00.0000467 |
| GetVarious | 00:00:00.0001604 | 00:00:00.0001605 |

Example:
var months = _dataContext.Months;
2. Scenario - Get Parent: In this test I get a specific node by id and look up its parent by type.
| Test | Examine | LINQ2Umbraco |
| --- | --- | --- |
| GetMonthParent | 00:00:00.0045961 | 00:00:00.6830741 |
| GetEventParent | 00:00:00.3434473 | 00:00:00.9745453 |
| GetEventSectionParent | 00:00:00.7446487 | 00:00:01.2388632 |

Example:
var month = _dataContext.Months.Where(x => x.Id == 11321).FirstOrDefault();
var year = month.Parent<Year>();
3. Scenario - Traverse: In this test I get Months from the DataContext and loop through all months.
| Test | Examine | LINQ2Umbraco |
| --- | --- | --- |
| TraverseMonths | 00:00:00.0038782 | 00:00:00.6199738 |

Example:
var months = _examineDataContext.Months;
foreach (var month in months)
{
    var name = month.NodeName;
}
4. Scenario - Traverse Children: In this test I get e.g. Events from the DataContext and loop through all events and their children.
| Test | Examine | LINQ2Umbraco |
| --- | --- | --- |
| GetYearMonthTraverseChildren | 00:00:00.0662222 | 00:00:00.8251347 |
| GetEventsTraverseChildren | 00:00:10.7358665 | 00:00:23.0377022 |

Example:
var events = _examineDataContext.Events;
foreach (var @event in events)
{
    if (@event == null) continue;
    foreach (var child in @event.Children)
    {
        string name = child.NodeName;
    }
}
5. Scenario - Count: In this test I simply count the number of items of different types in the DataContext.
| Test | Examine | LINQ2Umbraco |
| --- | --- | --- |
| GetMonthsCount | 00:00:00.0031931 | 00:00:00.6178589 |
| GetYearsCount | 00:00:00.0004367 | 00:00:00.5692922 |
| GetEventsCount | 00:00:00.3420866 | 00:00:00.9023448 |
| GetEventSectionsCount | 00:00:00.7414970 | 00:00:01.1714143 |

Example:
var events = _examineDataContext.Events;
int count = events.Count();
6. Scenario - Ancestors: In this test I get a specific Event from the DataContext, and get different Ancestors based on type
| Test | Examine | LINQ2Umbraco |
| --- | --- | --- |
| GetEventAncestorAsYear | 00:00:00.3837243 | 00:00:01.0451501 |
| GetEventAncestorAsHome | 00:00:00.0011323 | 00:00:00.0014134 |

Example:
var firstEvent = _examineDataContext.Events.Where(x => x.Id == 1210).FirstOrDefault();
var home = firstEvent.AncestorOrDefault<Home>();
Note: All numbers are calculated averages from 15 test runs in the following format: hh:mm:ss.fffffff (hours, minutes, seconds and fractional seconds)
For now I'll let the numbers speak for themselves. But I would like to include some tests for the new Query options that come with Umbraco 4.7 for a little more perspective, and comment some more on the numbers from the various test cases.
If you have some test cases that you think would be valuable for this comparison, please feel free to add a comment with your test case.
I will try to wrap up the code in a solution and package it up with an Umbraco site and test cases, so others can have a go. So check back for updates.
All in all I'm pretty happy with the performance tests so far.
Here are links to the solution containing LINQ2Umbraco with my addition of an Examine Provider (please note that the solution contains the original LINQ2Umbraco source as it has internal dependencies):
Download solution Download assemblies only
Both the solution and the assemblies include an Examine indexer, which you should use for indexing your content before trying to use the provider to get strongly typed objects from your index. To use it for indexing, set type="LINQ2Examine.Examine.Indexer.ExamineNodeIndexer, LINQ2Examine" for your ExamineIndexProvider in your ExamineSettings.config.
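As a rough sketch, the indexer entry in ExamineSettings.config could look something like the fragment below. Only the type attribute is taken from this post. The provider name "MyIndexIndexer" is a hypothetical placeholder, and any further attributes your setup needs (index set, data service and so on) are left out here, so treat it as a starting point and not as a complete configuration.

```xml
<Examine>
  <ExamineIndexProviders>
    <providers>
      <!-- "MyIndexIndexer" is a made-up name; pick one that matches your own index set. -->
      <!-- Other attributes your installation may require are intentionally omitted. -->
      <add name="MyIndexIndexer"
           type="LINQ2Examine.Examine.Indexer.ExamineNodeIndexer, LINQ2Examine" />
    </providers>
  </ExamineIndexProviders>
</Examine>
```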
Reference your generated datacontext like this:
public partial class UmbracoDataContext : LINQ2Examine.ExamineDataContext, IUmbracoDataContext
public interface IUmbracoDataContext : LINQ2Examine.IUmbracoDataContext
When using the Examine DataContext you initialize like this:
ExamineDataProvider dataProvider = new ExamineDataProvider("MyIndexSearcher");
Generated.IUmbracoDataContext dataContext = new Generated.UmbracoDataContext(dataProvider);
I will add the test solution over the coming weekend, as I need to clean up the current solution and move it to a demo site instead.