Some ideas that came up while tinkering with this library

Nov 18, 2014 at 11:12 PM
So! I was tasked with adding some second-level caching support to one of the applications at work, and EFCache seemed like a natural fit. We've been using it for a while now, and our results have been great!

I had to replace a few components, though, and that work led to some questions/ideas.
  1. Why does the InMemoryCache class use a dictionary/hard references to maintain the cached items? A System.Runtime.Caching.MemoryCache object is much more adjustable and automatically supports a lot of the things that are handled explicitly in InMemoryCache (like removing expired items). Plus, it has built-in memory pressure limits, all of which can be adjusted after deployment via the app.config. :) (See the sketch after this list for the kind of thing I mean.)
  2. Why does InMemoryCache use hard locks when accessing the cached items? Wouldn't a ReaderWriterLock or ReaderWriterLockSlim be leaner and better model the many-readers/one-writer concurrency relationship you're going for?
  3. Do you have any plans for MSDTC support? It was a requirement for us, so I've already implemented it once, and what I did or something similar might help other people.
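To make 1 and 2 concrete, here's roughly the shape I have in mind - just a sketch, where the class name, the dependent-entity-set bookkeeping, and anything beyond GetItem/PutItem are made up for illustration rather than taken from EFCache:

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Runtime.Caching;
    using System.Threading;

    // Sketch only: names and the dependency index are illustrative, not EFCache's actual code.
    public class MemoryCacheBackedCache
    {
        private readonly MemoryCache _cache = new MemoryCache("EFCache");

        // query key -> entity sets it depends on; read far more often than written,
        // so a ReaderWriterLockSlim fits the many-readers/one-writer pattern (point 2)
        private readonly Dictionary<string, string[]> _dependencies = new Dictionary<string, string[]>();
        private readonly ReaderWriterLockSlim _lock = new ReaderWriterLockSlim();

        public bool GetItem(string key, out object value)
        {
            value = _cache.Get(key);    // MemoryCache handles its own thread safety and expiration
            return value != null;
        }

        public void PutItem(string key, object value, IEnumerable<string> dependentEntitySets,
            TimeSpan slidingExpiration, DateTimeOffset absoluteExpiration)
        {
            // MemoryCache rejects policies that set both expirations, so pick whichever applies
            var policy = new CacheItemPolicy();
            if (slidingExpiration != ObjectCache.NoSlidingExpiration)
                policy.SlidingExpiration = slidingExpiration;
            else
                policy.AbsoluteExpiration = absoluteExpiration;

            _lock.EnterWriteLock();
            try { _dependencies[key] = dependentEntitySets.ToArray(); }
            finally { _lock.ExitWriteLock(); }

            _cache.Set(key, value, policy);
        }

        public void InvalidateSets(IEnumerable<string> entitySets)
        {
            var invalidated = new HashSet<string>(entitySets);
            List<string> keysToRemove;

            _lock.EnterReadLock();      // many readers can scan the index concurrently
            try
            {
                keysToRemove = _dependencies
                    .Where(pair => pair.Value.Any(invalidated.Contains))
                    .Select(pair => pair.Key)
                    .ToList();
            }
            finally { _lock.ExitReadLock(); }

            foreach (var key in keysToRemove)
                _cache.Remove(key);     // index cleanup omitted for brevity
        }
    }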
Coordinator
Nov 19, 2014 at 3:40 AM
1,2 - InMemoryCache is just a very simple implementation that was supposed to prove that things work and also to show how to plug in your own caching mechanism.
  3. Are you asking about System.Transactions/MSDTC support? I have not looked into this much yet. If there is interest and it is feasible I might look into it. I would very likely accept a (quality) pull request.
Thanks,
Pawel
Nov 19, 2014 at 4:14 AM
On 1 and 2 - fair enough...though I'll say I was surprised when I realized I needed to replace it.

And yes! We use System.Transactions/TransactionScope/MSDTC transactions a lot, so that was kind of essential for us - not sure how important it is to the more general audience.

My current design (perhaps not optimal or entirely transaction-isolation-level-aware) works like this:
  1. When changes are applied and there's an active System.Transactions transaction, the modified entitysets are associated with its transaction id, but the cache is not updated as the changes are not yet committed.
  2. An event handler is attached to the transaction to expire the affected sets when/if the transaction is committed (roughly as in the sketch below).
  3. From that point on, if a query from within that same transaction id touches any of the affected entity sets, it's treated as a cache miss, because the underlying data is different for that transaction than it is for the system as a whole.
  4. When the transaction reaches the end of its life, the event handler attached in step 2 springs into action. If it was rolled back, everything referencing it gets unregistered. If it was committed, any cached queries referencing the affected entity sets get purged from the cache so that they can be globally repopulated as they're reissued.
Also, in the current code, it's a separate MSDTCAwareTransactionHandler (name negotiable), so if a given app didn't need that behavior, it could still reference the vanilla handler and avoid any overhead.
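If it helps to see it, here's a rough sketch of steps 2-4 - the type and member names are invented for illustration, and the real code lives in the MSDTCAwareTransactionHandler mentioned above rather than standing alone like this:

    using System;
    using System.Collections.Concurrent;
    using System.Collections.Generic;
    using System.Linq;
    using System.Transactions;

    // Sketch only: the invalidation bookkeeping around System.Transactions, not the actual handler.
    public class ScopeAwareInvalidation
    {
        // transaction id -> entity sets modified inside that (not yet committed) transaction
        private readonly ConcurrentDictionary<string, HashSet<string>> _pendingSets =
            new ConcurrentDictionary<string, HashSet<string>>();

        // steps 1/2: record the affected sets and watch for the transaction's outcome
        public void RegisterAffectedSets(Transaction transaction, IEnumerable<string> entitySets)
        {
            var id = transaction.TransactionInformation.LocalIdentifier;
            var sets = _pendingSets.GetOrAdd(id, _ => new HashSet<string>());
            lock (sets)
            {
                foreach (var set in entitySets)
                    sets.Add(set);
            }

            // a real implementation would subscribe only once per transaction
            transaction.TransactionCompleted += OnTransactionCompleted;
        }

        // step 3: queries inside the same transaction that touch a pending set are cache misses
        public bool IsDirtyForTransaction(Transaction transaction, IEnumerable<string> entitySets)
        {
            HashSet<string> sets;
            if (!_pendingSets.TryGetValue(transaction.TransactionInformation.LocalIdentifier, out sets))
                return false;

            lock (sets)
            {
                return entitySets.Any(sets.Contains);
            }
        }

        // step 4: on commit, invalidate the affected sets globally; on rollback, just forget them
        private void OnTransactionCompleted(object sender, TransactionEventArgs e)
        {
            HashSet<string> sets;
            if (!_pendingSets.TryRemove(e.Transaction.TransactionInformation.LocalIdentifier, out sets))
                return;

            if (e.Transaction.TransactionInformation.Status == TransactionStatus.Committed)
            {
                // hypothetical call into the ICache implementation:
                // _cache.InvalidateSets(sets);
            }
        }
    }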

We can hash out the details, or, if it makes sense, I can put together a pull request this weekend.

Thanks!
Nov 26, 2014 at 2:34 AM
Hi wscalf,

Has the improved code been merged into the current version yet?

Could you please share your modified version (on CodeProject or another blog)?

Thanks!
Nov 26, 2014 at 2:59 PM
Edited Nov 26, 2014 at 3:01 PM
It hasn't been merged in yet, no; I just got the go-ahead from my boss today.

The way the code is structured, I could make it a separate package that depends upon and extends EFCache (as that's what we're currently doing), but I think it would be more cohesive and easier to consume as part of the base.

Now that I have the go-ahead, I'll put together a pull request for Pawel and see what he thinks.
Coordinator
Nov 26, 2014 at 5:03 PM
@wscalf - create a pull request and I will take a look
Dec 3, 2014 at 1:34 PM
I have most of my changes staged in a local repository, but I've hit a snag - the tests aren't showing up in Test Explorer. I can see from the attributes being used that they're xUnit tests, and I thought xUnit was natively supported in Visual Studio 2013, but it doesn't seem to be working. Do I need an add-on? Is there one in particular you use?
Coordinator
Dec 3, 2014 at 6:24 PM
You need the xUnit runner. You can install it from here: https://xunit.codeplex.com/releases/view/114414. I think the 1.9.2 stable release should be enough (I have 0.99.8 on my boxes since I have some projects using xUnit 2.0 - if you feel adventurous you can try it out - it can run xUnit 1.x tests).

Thanks,
Pawel
Dec 4, 2014 at 5:35 AM
Update:
Cool - I got the test runner going and am plodding along. I've started committing to a fork (https://efcache.codeplex.com/SourceControl/network/forks/wscalf/efcachemerge), so that's there if you want a preview - the ScopeAwareTransactionHandler, some extensive refactoring of the actual cache mechanism, and some other assorted goodies are already in there. I'll be adding some tests to cover some of the new functionality before making the actual pull request.

I have made one potentially controversial change: I removed the test for the Purge() method on the InMemoryCache class and turned the method into a stub marked with ObsoleteAttribute. I'm not completely clear on why it was there, and if it needs to come back I can see what I can do. The System.Runtime.Caching.MemoryCache class already does a pretty good job of cleaning itself up automatically (either when you read from it or on a configurable interval), so the method seemed redundant. Since it was specific to the InMemoryCache concrete class, I didn't see it affecting any other types, and deprecating it gives consumers of the library a chance to remove calls to it - if they're using it. In the code base I'm migrating from, we implemented the ICache interface directly, and so didn't have the Purge() and Count members.
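The stub itself amounts to something like this (illustrative only - the class name and attribute message here are made up, not copied from the fork):

    using System;

    // illustrative stand-in for the reworked cache class
    public class CacheWithDeprecatedPurge
    {
        [Obsolete("The underlying MemoryCache evicts expired entries on its own; calls to Purge() can be removed.")]
        public void Purge()
        {
            // intentionally a no-op - kept so existing callers keep compiling while they migrate
        }
    }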

Also potentially cool: I did some rudimentary and unscientific benchmarking on the InMemoryCache before and after. The test looked a lot like this:
    using System;
    using System.Diagnostics;
    using System.Linq;
    using System.Threading;
    using System.Threading.Tasks;
    using EFCache;

    class Program
    {
        static void Main(string[] args)
        {
            var cache = new InMemoryCache();

            // Random isn't thread-safe, so each worker thread gets its own instance
            var random = new ThreadLocal<Random>(
                () => new Random(Environment.TickCount ^ Thread.CurrentThread.ManagedThreadId));

            var watch = Stopwatch.StartNew();

            Parallel.ForEach(Enumerable.Range(0, 1000000), i =>
            {
                var key = random.Value.Next(1, 32).ToString();

                // on a miss, add the item with a 1 ms sliding expiration and no dependent entity sets
                object data;
                if (!cache.GetItem(key, out data))
                    cache.PutItem(key, i, new string[] { }, TimeSpan.FromMilliseconds(1), DateTimeOffset.MaxValue);
            });

            watch.Stop();

            Console.WriteLine("Completed in {0} seconds.", watch.Elapsed.TotalSeconds);
            Console.ReadKey(intercept: true);
        }
    }
It ran through a million hits to the cache, each using a randomly selected key from a pool of 32 options, and added the item to the cache if it was missing (with a sliding expiration of 1 ms), across several worker threads (4 on the development VM I'm using). The results were somewhat impressive:
Before: 2.2565414s
After: 0.44606s

Feel free to peruse if you like, and let me know if anything jumps out at you - I'll get you a formal pull request in a day or two, time permitting.

Thanks!
Coordinator
Dec 4, 2014 at 6:16 AM
Edited Dec 4, 2014 at 6:17 AM
Purge is needed in cases where you have a lot of one-off queries that pollute the cache and would otherwise never be removed. Items in the in-memory cache are only removed if you try to get an item and it happens to be expired (and even then it is not actually removed, but replaced with a fresh item). Since purging the cache is just an implementation detail (e.g. it can be implemented by having a thread that periodically purges the cache), I did not make it a method on the ICache interface. I believe this method is useful and people are using it, so I would like to leave it as is.
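For example, the periodic purge can be as small as a timer that calls Purge() on a schedule (just a sketch - the interval and the hosting class are arbitrary):

    using System;
    using System.Threading;
    using EFCache;

    static class CachePurger
    {
        // keep a reference so the timer isn't garbage collected
        private static Timer _purgeTimer;

        public static void Start(InMemoryCache cache)
        {
            // sweep out stale one-off entries every five minutes (interval chosen arbitrarily)
            _purgeTimer = new Timer(_ => cache.Purge(), null,
                TimeSpan.FromMinutes(5), TimeSpan.FromMinutes(5));
        }
    }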

Thanks,
Pawel
Dec 4, 2014 at 12:34 PM
I think I follow, and given that behavior it makes sense.

I'm using .NET's built-in cache (http://msdn.microsoft.com/en-us/library/system.runtime.caching.memorycache%28v=vs.110%29.aspx), which behaves a little differently, though. It already has a thread that purges it periodically (on an interval that can be adjusted either in code or via app.config - the default is every 3 minutes), and when you retrieve an item it checks not only that item for expiration but also any items stored in close proximity to it, so stale entries do end up getting cleaned out. This is another reason I prefer it over using a vanilla dictionary for storing the cached data.
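For illustration, tuning those knobs from code looks something like this - the values are just examples, and the same settings can be put in app.config instead:

    using System.Collections.Specialized;
    using System.Runtime.Caching;

    static class TunedCacheFactory
    {
        public static MemoryCache Create()
        {
            var settings = new NameValueCollection
            {
                { "pollingInterval", "00:03:00" },          // how often the background sweep runs
                { "cacheMemoryLimitMegabytes", "256" },     // hard cap on the cache's memory use
                { "physicalMemoryLimitPercentage", "50" }   // trim when overall memory pressure is high
            };

            return new MemoryCache("EFCache", settings);
        }
    }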

Unfortunately, it doesn't appear to offer a mechanism to force it to reevaluate all of the items it contains. If this is a major sticking point, I can take another pass at making it purge on-demand or leave the InMemoryCache alone and provide an alternate cache implementation, but it seems to me like it already addresses the need (which, as you pointed out, is an implementation detail stemming from using a dictionary), and it'd be confusing to have multiple memory caches.

Not sure. Thoughts?

Thanks!
Coordinator
Dec 5, 2014 at 10:02 PM
But you don't have to have Purge in your implementation. That is why it is not on the ICache interface: ICache describes the contract, Purge is not part of it, and no one expects any ICache implementation to have a Purge method. Even if you implemented it, there would not be a uniform way to call it. The sole reason InMemoryCache has the Purge method is that its eviction mechanism is so simplistic that it does not handle stale items at all. If, as an InMemoryCache user, you found that stale cached items take too much space, you would call InMemoryCache.Purge (but you won't ever be able to call ((ICache)myInMemoryCache).Purge, because it won't compile).
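In code terms, the difference is simply:

    using EFCache;

    static class PurgeVisibility
    {
        static void Demo()
        {
            var memoryCache = new InMemoryCache();
            memoryCache.Purge();      // fine: Purge lives on the concrete InMemoryCache class

            ICache cache = memoryCache;
            // cache.Purge();         // does not compile: Purge is not part of the ICache contract
        }
    }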

Hope this makes sense.
Pawel
Dec 5, 2014 at 11:58 PM
It does - I think I'm the one who's being confusing. :)

I was describing a replacement for the InMemoryCache rather than an alternate implementation, which is where the Purge() method became an issue. Making an alternate implementation solves that problem, and is what we did in our application, but putting it in the library introduces another one: it could be confusing for consumers of the library to have to choose between two local, in-memory implementations. I think I can draw up some documentation that outlines the differences (the MemoryCache-based one is faster in multithreaded environments and provides some really nice enterprise app features, while the currently built-in one is leaner and can actually be faster when there are hundreds of thousands of queries being cached), but the more I think about it, the more I'm leaning toward putting the cache implementation on hold for the time being.

Instead, I'm going to focus on getting the transaction scope support and tests over to you, since it should drop right in and it seems to be what people are asking for. We can come back to the caches.

Thanks!
Coordinator
Dec 7, 2014 at 6:27 AM
I did not realize that you wanted to replace the InMemoryCache instead of creating an alternate implementation. I think you can either replace the current implementation, in which case you would just remove the Purge method because it is not useful at all if the cache takes care of removing stale entries itself, or you can create an alternate implementation (e.g. BetterInMemoryCache) and post it as a separate package on NuGet. You can point out that the default InMemoryCache is just a very simple implementation and that for production your solution is just better. I could provide a link to your implementation of InMemoryCache on my blog and also suggest it as a better solution for real-world applications. As I said before, the implementation included in the package is very simplistic and served three purposes: to prove that the ICache interface has everything needed to implement a caching mechanism, to allow people to get started (without this, EFCache would be a dead project because you would not be able to use it without writing your own caching mechanism), and to show how to implement your own caching mechanism. If using System.Runtime.Caching.MemoryCache fulfills these objectives, I don't care too much about my own implementation. Removing the Purge method would be a breaking change, but I think it's reasonable (note - I would not make it a major release for this reason). Anyway, as you said, the actual InMemoryCache is orthogonal to adding support for transaction scope and can (and should) be done separately.

Thanks,
Pawel
Coordinator
Dec 9, 2014 at 4:26 PM
@wscalf - Thanks a lot for the PR. I will try to look at it over the weekend.