Who hasn't seen "cache everything!" in almost every best-practices guide? What they forget to tell you is that by then the cache itself can become one of the biggest bottlenecks of your program, so you try to use as many dictionaries as you can (at least one per caching need) and avoid locking and blocking in any way possible.
When Microsoft decided it was time for .NET to start using multi-threading heavily, they came out with the Parallel Extensions, and later on they added ConcurrentDictionary. I've always wondered what price we're paying for a dictionary that works in a thread-safe environment.
I can't stress this enough: do not use a regular Dictionary with multi-threading!
When the dictionary is updated by two (or more) threads, it can corrupt its internal structures and throw exceptions; sometimes items will disappear, and sometimes the whole dictionary will stop working.
In the past I've implemented dictionaries with the lock keyword, with a spinlock, with reader/writer locks, and even a copy-on-write dictionary: whenever you had to make a change, it would copy itself, apply the modification, and replace the reference. I didn't care about losing information; I just wanted the quickest dictionary.
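The copy-on-write idea can be sketched roughly like this (my own illustration, not the code from any of those implementations; this variant retries with a compare-and-swap instead of blindly replacing the reference, so it doesn't lose writes, but reads are completely lock-free):

```csharp
using System.Collections.Generic;
using System.Threading;

// A minimal copy-on-write dictionary sketch: every write clones the
// underlying Dictionary, applies the change, and atomically swaps the
// reference. Readers always see a consistent snapshot without locking.
class CopyOnWriteDictionary<TKey, TValue>
{
    private Dictionary<TKey, TValue> _current = new Dictionary<TKey, TValue>();

    public void Set(TKey key, TValue value)
    {
        while (true)
        {
            var snapshot = _current;
            // Clone the snapshot and apply the change to the copy.
            var copy = new Dictionary<TKey, TValue>(snapshot) { [key] = value };
            // Publish the copy only if no other writer swapped in the meantime;
            // otherwise retry against the newer snapshot.
            if (Interlocked.CompareExchange(ref _current, copy, snapshot) == snapshot)
                return;
        }
    }

    public bool TryGetValue(TKey key, out TValue value)
        => _current.TryGetValue(key, out value);
}
```

This trades write cost (a full copy per change) for lock-free reads, so it only pays off when reads vastly outnumber writes.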
So how does ConcurrentDictionary work? Well, using ILSpy we can take a look. I can't post the source code here, so you'll have to do your own digging, but apparently it uses Monitor.Enter for the writing portion, which is just an alternative to the lock keyword, while the reading portion seems pretty close to the regular Dictionary.
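From the caller's side, ConcurrentDictionary hides all of that locking behind atomic helpers. A quick example of the API (the specific keys and values here are just for illustration):

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

class ConcurrentDictionaryDemo
{
    static void Main()
    {
        var dict = new ConcurrentDictionary<int, int>();

        // Many threads writing at once is safe; the writes are serialized
        // internally while reads mostly avoid taking locks.
        Parallel.For(0, 100000, i => dict[i % 100] = i);

        // GetOrAdd inserts atomically only if the key is missing.
        int added = dict.GetOrAdd(12345, key => key * 2);

        // AddOrUpdate atomically updates an existing value.
        dict.AddOrUpdate(12345, 1, (key, old) => old + 1);

        Console.WriteLine($"{added}, {dict[12345]}, {dict.Count}");
        // 24690, 24691, 101 (keys 0..99 plus 12345)
    }
}
```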
I remembered reading about lock-free hashmaps (PDF) in the past, but I didn't have the time to look up the algorithm, understand it, and implement it. Luckily someone already did; I can't find the original C# implementation, but here's what I found now: https://github.com/hackcraft/Ariadne.
I did some benchmarks and the results are pretty interesting.
The categories (the X axis) are the number of concurrent threads attempting to write and read.
Note: the graphs do not work in IE for now.
You may find the benchmark project at:
https://github.com/drorgl/ForBlog/tree/master/DictionaryBenchmarks
I've taken the read/write-locked dictionary from Lorenz Cuno Klopfenstein, and I've implemented a minimal spinlock dictionary just for the test; don't use it.
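For reference, a read/write-locked dictionary follows this general shape (a sketch in the spirit of that approach, not Klopfenstein's actual code): many readers may proceed in parallel, while a writer takes the lock exclusively.

```csharp
using System.Collections.Generic;
using System.Threading;

// A plain Dictionary guarded by a ReaderWriterLockSlim: reads share the
// lock, writes are exclusive. Simple and correct, but every operation
// still pays the lock acquisition cost.
class ReadWriteLockedDictionary<TKey, TValue>
{
    private readonly Dictionary<TKey, TValue> _dict = new Dictionary<TKey, TValue>();
    private readonly ReaderWriterLockSlim _lock = new ReaderWriterLockSlim();

    public void Set(TKey key, TValue value)
    {
        _lock.EnterWriteLock();
        try { _dict[key] = value; }
        finally { _lock.ExitWriteLock(); }
    }

    public bool TryGetValue(TKey key, out TValue value)
    {
        _lock.EnterReadLock();
        try { return _dict.TryGetValue(key, out value); }
        finally { _lock.ExitReadLock(); }
    }
}
```

This is the pattern the read/write-locked contender in the benchmark represents: it shines when reads dominate, since readers never block each other.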
The ThreadSafeDictionary is the star of these benchmarks; it performs exceptionally well, so if your program relies heavily on dictionaries and threading, use it.
Thursday, May 24, 2012
The referenced article uses Parallel.For, which in turn uses the TPL.
"By default, For and ForEach will utilize however many threads the underlying scheduler provides, so changing MaxDegreeOfParallelism from the default only limits how many concurrent tasks will be used."
ref: https://msdn.microsoft.com/en-us/library/system.threading.tasks.paralleloptions.maxdegreeofparallelism.aspx
The number of threads affects the benchmarks, in this case through lock contention. Reference:
https://www.jetbrains.com/help/profiler/2016.1/Lock_Contention.html
While locks can affect application performance, in most cases these microbenchmarks will not reveal a real bottleneck, as most real-world applications have many other flows and processes, such as IO, that take far more time than the locking itself.
If your application does suffer from locking problems that affect its performance, I would start by analyzing it with a tool such as the Visual Studio Performance Profiler, whose concurrency profiler can point to possible locking issues.