Simplify your locking strategy -- simpler is better

Well, I didn't mean to, but it seems I'm writing a series of articles on locking.

After my last posting I got several requests for a simple example of a case that could be improved with simpler locking.  Here's something of that ilk, similar to a situation I encountered in The Real World.  The second version is based on what I recommended.

 

    using System.Collections.Generic;
    using System.IO;

    class FreeThreaded1
    {
        private static volatile Dictionary<string, string> dict = new Dictionary<string, string>();

        static void UpdateThreadProc() // one thread runs this
        {
            for (;;)
            {
                using (StreamReader sr = new StreamReader("c:\\pickupdir\\dictionary.txt"))
                {
                    lock (dict)
                    {
                        dict.Clear();

                        String line = null;

                        while (null != (line = sr.ReadLine()))
                        {
                            int iTab = line.IndexOf('\t');

                            if (iTab < 1 || iTab > line.Length-2)
                                continue;

                            string key = line.Substring(0, iTab);
                            string val = line.Substring(iTab+1);

                            dict.Add(key, val);
                        }
                    }
                }

                System.Threading.Thread.Sleep(5000);
            }
        }

        static string ReadEntry(string s)  // this is called by many threads at random
        {
            lock (dict)
            {
                return dict[s];
            }
        }
    }


    class FreeThreaded2
    {
        private static volatile Dictionary<string, string> dict = new Dictionary<string, string>();

        static void UpdateThreadProc() // one thread runs this
        {
            for (;;)
            {
                using (StreamReader sr = new StreamReader("c:\\pickupdir\\dictionary.txt"))
                {
                    Dictionary<string, string> d = new Dictionary<string, string>();
                    String line = null;

                    while (null != (line = sr.ReadLine()))
                    {
                        int iTab = line.IndexOf('\t');

                        if (iTab < 1 || iTab > line.Length-2)
                            continue;

                        string key = line.Substring(0, iTab);
                        string val = line.Substring(iTab+1);

                        d.Add(key, val);
                    }
                    dict = d;
                }

                System.Threading.Thread.Sleep(5000);
            }
        }

        static string ReadEntry(string s)  // this is called by many threads at random
        {
            return dict[s];
        }
    }

Can you spot at least three reasons why I like the second one better? Not that it's perfect...

Comments

  • Anonymous
    April 28, 2006
    Well...

    - Avoids locks (probably faster)
    - Atomic commits (you never end up with a half-updated dictionary)
    - Less code (clearer and probably more bug-free, especially when there's less concurrency-related code)
  • Anonymous
    April 28, 2006
    It looks like the second version should be better memory-wise, since the old dict instance can be collected as a whole instead of having to do the work of Clear().
  • Anonymous
    April 28, 2006
    FreeThreaded2 is better; reasons:

    - Less locking, especially when calling ReadEntry; assuming it's called quite often, the application runs faster.

    - The contention is not transferred to other threads (while the new dictionary is being filled, the other threads are still served from the old one). This can also be a disadvantage, depending on the situation.

    - I guess there is some advantage in allocating a new dictionary over calling Clear.

    Disadvantage: it assumes that dict = d; is an atomic operation.
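
    A minimal sketch of how that assumption could be made explicit rather than implicit (FreeThreadedSketch and Publish are made-up names, just for illustration); Interlocked.Exchange both performs the assignment and issues a full memory barrier:

        using System.Collections.Generic;
        using System.Threading;

        class FreeThreadedSketch
        {
            private static Dictionary<string, string> dict = new Dictionary<string, string>();

            static void Publish(Dictionary<string, string> newDict)
            {
                // Reference assignment is already atomic; Interlocked.Exchange adds a
                // full fence so the new reference is immediately visible to other CPUs.
                Interlocked.Exchange(ref dict, newDict);
            }

            static string ReadEntry(string s)
            {
                return dict[s];  // a reader may still briefly see the previous dictionary
            }
        }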
  • Anonymous
    April 28, 2006
    Does the volatile keyword protect dict from processor reordering? Or is a call to System.Threading.Thread.MemoryBarrier() required?
  • Anonymous
    April 28, 2006
    Also, with the no-lock approach, if a context switch happens while processing dict[s] and the value of dict changes, can the old dict be garbage collected?
  • Anonymous
    April 28, 2006
    disadvantage - assumes that dict = d; is an atomic operation.

    Why wouldn't it be? It's only updating a reference.
  • Anonymous
    April 29, 2006
    The volatile keyword forces a full update of a reference all the way out to system memory whenever that variable changes on any thread, so "dict = d;" would indeed be an atomic operation.
  • Anonymous
    April 29, 2006
    I am also quite interested in when an operation is atomic and when it is not. Memory barriers are very interesting, but I do not know by what means they can be replaced. I found them used in the BCL code:
    taken from System.Environment:
    private static void InitResourceHelper()
    {
         bool flag1 = false;
         bool flag2 = false;
         RuntimeHelpers.PrepareConstrainedRegions();
         try
         {
               RuntimeHelpers.PrepareConstrainedRegions();
               try
               {
               }
               finally
               {
                     Thread.BeginCriticalRegion();
                     flag1 = true;
                     Monitor.Enter(Environment.InternalSyncObject);
                     flag2 = true;
               }
               if (Environment.m_resHelper == null)
               {
                     Environment.ResourceHelper helper1 = new Environment.ResourceHelper();
                     Thread.MemoryBarrier();
                     Environment.m_resHelper = helper1;
               }
         }
         finally
         {
               if (flag2)
               {
                     Monitor.Exit(Environment.InternalSyncObject);
               }
               if (flag1)
               {
                     Thread.EndCriticalRegion();
               }
         }
    }

    I think it would be quite interesting to see why the code was written this way, in terms of:
    - Concurrency
    - Exception Safety (Constrained Execution Regions)
    - Performance

    Yours,
     Alois Kraus

  • Anonymous
    April 29, 2006
    My docs say about volatile:
    The system always reads the current value of a volatile object at the point it is requested, even if the previous instruction asked for a value from the same object. Also, the value of the object is written immediately on assignment.

    Which says nothing about the write itself actually being atomic. That means that if the write is, say, two instructions, you can be interrupted in the middle of the write, and if you read the memory in the meantime you'll get a half-updated reference (say, the first 4 bytes updated and the second 4 not).

    Maybe I'm wrong on this, but I introduced a few locks just to be on the safe side.
  • Anonymous
    April 29, 2006
    georgem:

    If I'm not mistaken, the atomicity of the assignment is guaranteed by the C# language. From ECMA-334 (the C# specification), §12.5, Atomicity of variable references: 'Reads and writes of the following data types shall be atomic: bool, char, byte, sbyte, short, ushort, uint, int, float, and reference types. ...'
  • Anonymous
    April 30, 2006
    Yes, better to read the specifications (although they are quite boring to read).
    It is specified in §12.6.6 of ECMA-335 (CLI):
    Object references shall be treated as though they are stored in the native word size.

    and

    All reads and writes of certain properly aligned data types are guaranteed to occur atomically.

    and

    A conforming CLI shall guarantee that read and write access to properly aligned memory locations no larger than the native word size (the size of type native int) is atomic (see §12.6.2) when all the write accesses to a location are the same size.

    you know, this is still flashbacks from the native world ;-)
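
    A small sketch of the practical consequence (made-up names, not from the thread): a reference field can simply be assigned, while a 64-bit field needs Interlocked (or a lock) to avoid torn reads and writes on a 32-bit CLR:

        using System.Threading;

        class AtomicityNotes
        {
            static object reference;   // reads and writes of a reference are atomic per ECMA-335
            static long bigValue;      // 64 bits: can tear on a 32-bit CLR without help

            static void Publish(object o, long v)
            {
                reference = o;                          // fine: never observed half-written
                Interlocked.Exchange(ref bigValue, v);  // keeps the 64-bit write from tearing
            }

            static long ReadBigValue()
            {
                // Interlocked.Read gives an atomic 64-bit read even on 32-bit hardware.
                return Interlocked.Read(ref bigValue);
            }
        }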
  • Anonymous
    April 30, 2006
    As it happens the volatile is not even needed in this example.  I included it only to ensure that the reading threads would see the update as soon as possible.  But in this case it doesn't actually matter if they see an old dictionary for a read or two -- it will just appear to them that the update has not yet happened, which must necessarily be legal or the whole thing is doomed in the first place.

    As has been noted object writes are guaranteed atomic (so you see all of it or none of it) but they are not necessarily immediately visible unless volatile.

    I'm not going to dare to explain the complex environment example in just a few words... that one is in need of a 17 page article :)
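
    For the narrower point about visibility, though, here is a minimal sketch (made-up names, just an illustration) of the write-side pattern the BCL snippet above uses, applied to the dictionary: without volatile, an explicit Thread.MemoryBarrier() before the publish ensures the new dictionary's contents are written before the reference to it becomes visible:

        using System.Collections.Generic;
        using System.Threading;

        class FreeThreaded2a
        {
            // No volatile this time; the barrier below does the work on the write side.
            private static Dictionary<string, string> dict = new Dictionary<string, string>();

            static void Publish(Dictionary<string, string> newDict)
            {
                // Keep the stores that filled newDict from being reordered past the
                // store of the reference, as in Environment.InitResourceHelper above.
                Thread.MemoryBarrier();
                dict = newDict;
            }

            static string ReadEntry(string s)
            {
                // A reader may see the old dictionary for a little while after a publish;
                // as noted above, that just looks like an update that hasn't happened yet.
                return dict[s];
            }
        }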

  • Anonymous
    May 01, 2006
    While the second sample has all the threading benefits mentioned above, the cost is double the memory use during the load. In the first example, dict is cleared before the load, so all the entries can be garbage collected. In the second case, both fully-loaded dictionaries are in memory at the same time.
  • Anonymous
    May 01, 2006
    Igor makes a good point in his comment above.  The reality could be very interesting indeed.  

    For instance, in a perfect world, recycling the entire dictionary might be enough to keep it out of generation two at all times... thus making the memory much cheaper.

    Or the reverse could happen: freeing the entire dictionary might be a disaster because it lives just long enough to get into generation 2 and then dies... ergo it contributes to mid-life crisis.

    Is either of these likely to be an issue?

    That's a whole 'nother article but feel free to comment.
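
    If you want to peek at which way it's going, here is a tiny diagnostic sketch (GenerationCheck and ReportBeforeSwap are made-up names): just before dropping the old dictionary, ask the GC which generation it reached and how many gen 2 collections have happened.

        using System;
        using System.Collections.Generic;

        class GenerationCheck
        {
            static void ReportBeforeSwap(Dictionary<string, string> oldDict)
            {
                // If a 5-second-old dictionary is already in generation 2, each refresh
                // is feeding the most expensive collector -- the mid-life-crisis case.
                Console.WriteLine("old dictionary is in generation {0}; gen 2 collections so far: {1}",
                    GC.GetGeneration(oldDict),
                    GC.CollectionCount(2));
            }
        }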
  • Anonymous
    May 02, 2006
    A further possibility for improvement... add another private variable to store the last-modification-time of the dictionary file.

    Have UpdateThreadProc parse the file the first time, then do a file stat to see if the file has changed since the last parse.

    If this is a real bottleneck, it becomes worthwhile to have a journal file that stores only changes.  UpdateThreadProc can integrate those changes into the shared dictionary while only reading the small journal file.  Another lazy thread can be run to integrate the journal file into the main file during periods of low activity.  This can be done in such a way that the journal file and the main file are both accessible except for the time it takes to rename a file.
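
    A minimal sketch of the first suggestion, the last-modification-time check (UpdateLoopSketch and FileChanged are made-up names):

        using System;
        using System.IO;

        class UpdateLoopSketch
        {
            private static DateTime lastWrite = DateTime.MinValue;

            // Call this at the top of the update loop and skip the parse entirely
            // when the file hasn't changed since the last pass.
            static bool FileChanged(string path)
            {
                DateTime current = File.GetLastWriteTimeUtc(path);
                if (current == lastWrite)
                    return false;      // nothing new to pick up this time around

                lastWrite = current;
                return true;           // caller should rebuild and publish a new dictionary
            }
        }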