Intermittent NHibernate-related errors

Jul 12 at 3:37 PM
We're using the most recent version of the Rework.NavigationCache module to cache a fairly large taxonomy menu. We recently began seeing seemingly random NHibernate-related exceptions on our production site. They are very intermittent, but when they happen an app restart is required to get things working again.

Here are some of the exceptions we've seen:
System.ObjectDisposedException: Session is closed!
Found shared references to a collection: Orchard.ContentManagement.Records.ContentItemRecord.Versions
The column cannot contain null values. [ Column name = ContentItemRecord_id,Table name = Orchard_Framework_ContentItemVersionRecord ]
Enumerator was modified
We have been able to reproduce the issue if we put the site under load with a mix of simultaneous content item views and content item updates that involve changes to taxonomy terms associated with the content item. If we disable the Rework.NavigationCache the issue never happens.

Has anyone seen this before? I don't see anything that I believe would cause this in the code but the issue only appeared after we installed the module and cannot be recreated with it disabled.

I also see a comment in the GetCachedMenuPartContent method of the MenuWidgetPartDriver that says the following:
Due to the issues of HtmlMenuItem throwing MSPTC errors (i.e. the transaction scope seems to be closed and most hosts don't allow MSPTC) on the following line - MenuItemLink-HtmlMenuItem.cshtml <span class="raw">@Html.Raw(Model.Content.BodyPart.Text) I am going to replace the cached version with a freshly queried version. This is not limited to HtmlMenuItem unfortunately, anytime in a view that low level data is attempted to be accessed this whole process can break
I'm not sure I fully understand the comment or what MSPTC is (did it mean MSDTC?), but we do have what might be considered a "low level data" call in a MenuItemLink alternate view that's used by the menu in question. Here's what it looks like:
@if (Model.Item.Content.ContentItem.ProductCategoriesTerm.BadgeImage.MediaParts.Count == 0) { 
<a href="@Model.Href">@Model.Text</a>
}
Commenting out the if statement in this view eliminates the issue, but I don't understand why and we need to access the properties of the menu item.

Has anyone run into this before? Any thoughts or suggestions would be appreciated. Thanks!
Coordinator
Jul 12 at 4:16 PM
Hi Josh,

Sorry, looking back I believe it was MSDTC (I will try to get that updated). Basically, what is happening (as I understand it) is that I have to clone the menu items in order to prevent changes from being persisted back to the database. This works in the majority of cases, except when a view attempts low-level data access (i.e. tries to retrieve more information from the database that wasn't already passed into the view or cached in my clone). In those cases, the server freaks out and throws the MSDTC error stating there is no open scope to call the database with.

So even though this module is great at what it does (I am biased, I know), it fails on these deep calls. Unfortunately, I have not been able to find a way around this that works every time. Even in the BodyPart example, I hate having to get a "fresh" MenuItemLink from the database each time, as that somewhat defeats the purpose, but if you don't then the deep clone doesn't work.
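
To illustrate the failure mode described above, here is a minimal sketch in plain C#. It is not the module's code and does not use NHibernate: `FakeSession`, `CachedMenuItem` and `LazyAccessDemo` are hypothetical names invented for the example, and the only assumption is that "deep" data is loaded on first access through a session that has already been closed by the time the cached item is reused.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for an ORM session; this is NOT NHibernate's API.
public sealed class FakeSession : IDisposable
{
    public bool IsOpen { get; private set; } = true;

    public List<string> LoadMediaParts(int termId)
    {
        // Any database access after the session is closed fails.
        if (!IsOpen) throw new ObjectDisposedException("FakeSession", "Session is closed!");
        return new List<string> { "badge-" + termId + ".png" };
    }

    public void Dispose() { IsOpen = false; }
}

// Hypothetical cached menu item: Text is copied eagerly, MediaParts is loaded on first access.
public sealed class CachedMenuItem
{
    private readonly FakeSession _session;
    private readonly int _termId;
    private List<string> _mediaParts;

    public CachedMenuItem(FakeSession session, int termId, string text)
    {
        _session = session;
        _termId = termId;
        Text = text;
    }

    public string Text { get; }

    public List<string> MediaParts
    {
        get { return _mediaParts ?? (_mediaParts = _session.LoadMediaParts(_termId)); }
    }
}

public static class LazyAccessDemo
{
    public static string Run()
    {
        CachedMenuItem item;
        using (var session = new FakeSession())
        {
            // The item is built (and would be cached) while the session is open.
            item = new CachedMenuItem(session, 7, "Shoes");
        }

        // Later use: eagerly copied data is still available...
        var text = item.Text;

        // ...but the deep call needs the session that has since been closed.
        try
        {
            var count = item.MediaParts.Count;
            return "loaded " + count;
        }
        catch (ObjectDisposedException)
        {
            return "session closed";
        }
    }
}
```

In this sketch `LazyAccessDemo.Run()` ends up in the `catch` branch, which mirrors the "no open scope" situation: the data copied into the cached item is fine, and only the access that still needs the database breaks.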

So, here are my thoughts on what would be a BETTER Navigation Cache module (and yes, even on 1.8.1 we still need this module for performance). When I built the module I approached it from the perspective of the underlying data access (i.e. how to mitigate calls to the database). But as I think about this today, a better approach would likely be something more like Output Cache (i.e. is there a way we can just store the rendered results, unique to Roles of course so it works even when the user is logged in) and skip over the whole data processing.
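
As a rough sketch of the "store the results, unique to Roles" idea, the following plain C# keeps rendered menu HTML in a dictionary keyed by menu name plus the normalized set of roles. Everything here is an assumption made for illustration: `RoleKeyedMenuCache`, `BuildKey` and `GetOrRender` are hypothetical names, not part of Orchard or Rework.NavigationCache, and a real implementation would need to hook into Orchard's rendering and cache invalidation.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;

// Hypothetical role-keyed store for rendered menu markup (not an Orchard API).
public sealed class RoleKeyedMenuCache
{
    private readonly ConcurrentDictionary<string, string> _html =
        new ConcurrentDictionary<string, string>();

    // Users with the same set of roles share one entry, regardless of role order or casing.
    public static string BuildKey(string menuName, IEnumerable<string> roles)
    {
        var normalized = roles
            .Select(r => r.Trim().ToLowerInvariant())
            .Distinct()
            .OrderBy(r => r, StringComparer.Ordinal);
        return menuName + "|" + string.Join(",", normalized);
    }

    // Returns the stored markup, rendering only on a miss.
    // Note: under contention GetOrAdd may invoke the render delegate more than once.
    public string GetOrRender(string menuName, IEnumerable<string> roles, Func<string> render)
    {
        return _html.GetOrAdd(BuildKey(menuName, roles), _ => render());
    }

    // Would be called whenever the menu or its content changes.
    public void Clear()
    {
        _html.Clear();
    }
}
```

Because the stored value is a finished string, nothing in it can trigger a later database call, which is the main attraction over caching cloned content items. The cost is one entry per distinct role combination rather than one per user.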

This is not something I can jump on right away, though I welcome your thoughts. As for your immediate needs, you could exclude your item from being cached (i.e. get a fresh copy from the database as I do for "HtmlMenuItem"), but that would defeat a lot of the point of the module. It would be a bit faster than native Orchard, but you would still have N calls to the database for each taxonomy. Perhaps you could put in a nested database cache that gets reset when your taxonomies change, but again, that is a lot of custom management that I built this module to avoid.

Thoughts welcome.
Jul 12 at 4:59 PM
Edited Jul 12 at 4:59 PM
Thanks for the quick response Jeff. I'll take some time to think about the options you mentioned - very helpful.

One more question: this happens only under load and appears to be timing related, as we can never reproduce it on demand. Do you know why? It seems that the issue you're describing would happen every time we try to make a deep call. Has that been your experience?

And by the way, the module is great! The performance gains we've seen with it have been excellent. Thank you for making it available!
Jul 12 at 5:05 PM
Edited Jul 12 at 5:05 PM
I also agree with your ideas regarding an improved version of the module. Actually, even better might be support for donut hole or fragment caching in Orchard. That way we could take any widget and cache its output. I've seen a ton of discussions on this topic but never any implementations besides some AJAX-based approaches that are problematic for SEO. I'd be happy to help if I can...
Coordinator
Jul 12 at 5:10 PM
Now that is interesting, yes, I was able to reproduce mine on demand. However, as memory serves me now, I do believe on localhost the "HtmlMenuItem" would sometimes work. I can't rightly tell you why it would be intermittent. One possible idea would be that sometimes it can piggyback on an existing open connection? I don't know if that is plausible or not. Put differently, another solution to the problem of the clone would be if it could (somehow) reopen the connection to the SQL server. I am pretty sure that is NOT an option though as it could lead to other issues. In short, I just don't know.

Here is another concept that could work for you (as you have a very specific example of failure). The first time the deep call for data occurs, it is operating on the original item from the database (i.e. not the clone). So, what if you could somehow cache this value (in the view? The Shape OnDisplaying event would seem to be better) so that your view just checks a passed-in cached value every time rather than making the deep call? This would be more efficient than running the SQL call anyway. Of course, you have to make this cache expire at the same time as the underlying Rework.NavigationCache (not hard: whenever content is published, just tap into the trigger event).

That might be the most efficient approach.
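
A minimal sketch of that suggestion, in plain C# with no Orchard dependencies: the result of the deep call (here, whether a menu item has a badge image) is stored by menu item id and cleared by the same signal that clears the navigation cache. `NavigationInvalidator`, `BadgeFlagCache` and the loader delegate are hypothetical names for the example. It assumes the loader only runs on a miss, which, if the two caches expire together, would coincide with a request that still has the original item and an open session.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical stand-in for "whenever content is published, tap into the trigger event".
public sealed class NavigationInvalidator
{
    public event Action Invalidated;

    public void OnContentPublished()
    {
        var handler = Invalidated;
        if (handler != null) handler();
    }
}

// Hypothetical cache of the one deep value the view needs.
public sealed class BadgeFlagCache
{
    private readonly ConcurrentDictionary<int, bool> _hasBadge =
        new ConcurrentDictionary<int, bool>();

    public BadgeFlagCache(NavigationInvalidator invalidator)
    {
        // Expire together with the navigation cache.
        invalidator.Invalidated += _hasBadge.Clear;
    }

    // loadFromDatabase is only invoked on a miss; afterwards the view reads the stored flag.
    public bool HasBadge(int menuItemId, Func<int, bool> loadFromDatabase)
    {
        return _hasBadge.GetOrAdd(menuItemId, loadFromDatabase);
    }
}
```

The view would then test the cached flag instead of walking `ProductCategoriesTerm.BadgeImage.MediaParts` on the cloned item.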

One final note: I have local code changes that will be pushed out soon, as I have switched Rework.NavigationCache to use the Orchard.Caching module. It shouldn't impact you much at all; just an FYI that it is coming.
Coordinator
Jul 12 at 5:12 PM
Yes, fragment output caching would be the best. Orchard could also be improved a lot by allowing a checkbox for "Role Based Output Caching". I have sites where users log in and they get no benefit from Output Caching (though they often could if the Role could be a "key" in the cache, as I hardly EVER cache per user). Great thoughts; I'd love to expand on them.