Locking -- Isolation -- Unit of Work -- Performance -- Solution
A few days ago I posted a concurrency problem for commentary and I got a very nice set of responses.
Now I can't say that there is any answer to this problem that I would say is universally correct but there are some interesting observations that we can/should make about the proffered code. Many of these points were covered in the commentary but not everything was so hopefully you’ll still find it interesting to read this. Again, as always, forgive me for generalizing in the interest of greater clarity in some cases below.
Now the first solution had one lock controlling all the access. This has perhaps the most understandable behavior of the two solutions and maybe about as understandable as we can get.
// it's easy to explain what this does
class ExampleCoarse
{
public ExampleCoarse() {}
private Object myLock = new Object();
private int count;
public void DoIt(StreamWriter sw)
{
lock(myLock)
{
long ticks = DateTime.Now.Ticks;
int count = IncrementCount();
LogToStream(sw, ticks, count);
}
}
private int IncrementCount()
{
return ++count;
}
private void LogToSteam(StreamWriter sw, long ticks, int count)
{
sw.WriteLine("{0} {1}", ticks, count);
}
}
The first thing to notice is that I said in the assumptions that there was only going to be one instance of this object – and yet the object has no static members. Presumably that was done to allow for several logically independent copies of the object at some point in the future. If the “count” and “myLock” members were truly static then the single instance choice would be locked in forever. As it stands the users of the object are sort of forced to deal with the fact that there is some shared instance because they must provide the “this” pointer. Fair enough I suppose.
But even in this simplest form of the code there are at least two other concerns. It appears that writing to the stream will be serialized … but will it? Well not exactly. The StreamWriter (sw) provided is external to the class. We have no idea if it is going to be shared by other threads. If it is then it will have to be the case that this is the only place that StreamWriter is used because otherwise lock(myLock) will not offer any protection (see “Same Resource, Same Lock or else locking won't give you the protection you need” for more details on this).
That’s bad enough. But can we say that it is wise to even use the external StreamWriter? What does it do? The trouble with StreamWriters is that they abstract the underlying stream with a nice interface. Coincidentally that’s also what’s good about them. So what is going to Actually Happen when we call sw.WriteLine()? Well the answer is pretty much anything could happen. For all we know the underlying Stream synchronously sends a letter to Grandma via US certified return-receipt air mail. So is it a good idea to call such a method inside a lock? Well… if we have a good understanding of what the stream is going to be for other reasons (not shown here) then maybe. But if this was framework code, it might be an astonishingly bad idea indeed.
Looks like for even the simple code, as written, to get good results it needs to make some pretty significant assumptions.
Let's have a look at the second version now:
// it's not so easy to explain this... or how to use it.. but is it better?
class ExampleFine
{
public ExampleFine() {}
private Object myLock = new Object();
private int count;
public void DoIt(StreamWriter sw)
{
long ticks = DateTime.Now.Ticks;
int count = IncrementCount();
LogToStream(sw, ticks, count);
}
private int IncrementCount()
{
int c = System.Threading.Interlocked.Increment(count);
return c;
}
private void LogToSteam(StreamWriter sw, long ticks, int count)
{
lock (myLock)
{
sw.WriteLine("{0} {1}", ticks, count);
}
}
}
That's of course more complicated, but what does it do? Well naturally it has all the same issues as the "easy" case. But additionally it suffers from the fact that it's basically impossible to characterize the output. Counts and ticks do not necessarily increase together and they can also be out of order in the file. The count could be arbitrarily unrelated to the ticks. Is it at least faster? Hrm... maybe... it does have finer locks but then the interlocked operation isn't cheap either and we don't know if the stream is fast (like say a memory stream) or slow (like that letter to Gramma). If the streaming operation is comparatively quick then contention would be low and the extra locks would just slow us down. On the other hand a very slow stream will dominate the cost as everyone is gonna be waiting for Gramma for so long that the rest of the time isn’t going to matter a whit. So somewhere in between there would be a cutoff where the extra concurrency is paying for the extra locking.
Wow that’s complicated.
So in return for getting streamed output that’s totally meaningless we get…maybe nothing at all. And don’t forget the Gramma stream could be asynchronous – further confusing the situation for both cases.
I’m not really liking that second choice very much at all at this point.
Could we do better? Yes I think we could. Here’s a couple of suggestions:
// it's easy to explain what this does
class ExampleHybrid
{
public ExampleHybrid() {}
private Object myLock = new Object();
private int count;
public void DoIt(StreamWriter sw)
{
lock(myLock)
{
long ticks = DateTime.Now.Ticks;
int count = IncrementCount();
}
LogToStream(sw, ticks, count);
}
private int IncrementCount()
{
return ++count;
}
private void LogToSteam(StreamWriter sw, long ticks, int count)
{
sw.WriteLine("{0} {1}", ticks, count);
}
}
In this case we’re basically coming right out and saying to our callers that the StreamWriter (sw) they provide us better not be shared with other threads, or else it better be internally threadsafe. Either way we’re staying out of it. And by the way we aren’t guaranteeing the output order but we are guaranteeing that at least the tick will match the counts in some meaningful way. It’s an interesting compromise – and actually easier to understand than the previous case. If the stream is slow at least we won’t be holding a lock while it does its business.
And maybe one last choice for discussion
// it's easy to explain what this does
class ExampleNoOutput
{
public ExampleNoOutput() {}
private Object myLock = new Object();
private int count;
public void DoIt(out long ticksOut, out int countOut)
{
lock(myLock)
{
ticksOut = DateTime.Now.Ticks;
countOut = IncrementCount();
}
}
private int IncrementCount()
{
return ++count;
}
}
And in this last example we finesse the output issues by declining to take a dependence on the stream entirely. “Leave me out of your output logging problems”, says this class, “I wash my hands of all that messy I/O”.
This could perhaps be the wisest course of all. Although the services are more modest they are readily understood and any desired logging strategy can be built on top of them. Whatever chuncking, synchronization, etc. is up to higher level code that is likely be better capable of making that decision intelligently. Moving synchronization primitives “up the stack” to code with more overall “awareness” of what is going on is a common theme to get more sensible synchronization. Definitely something you should keep in mind.
Is there an overall winner? Hardly. Though I think the second of the two original versions is probably the overall loser.
Thanks very much for the thoughtful comments on the first posting.
Comments
- Anonymous
April 27, 2006
I like ExampleNoOutput better. After all, you are trying to do two things, incrementing and logging, in the original class (and don't have all controls over them). To me, this violates SRP
http://foozle.berkeley.edu/projects/streek/agile/oo-design-principles.html - Anonymous
April 27, 2006
I'm assuming that ExampleNoOutput.DoIt taking the StreamWriter as an argument is a typo. Otherwise, it doesn't make sense to me. - Anonymous
April 27, 2006
Oh yes it is, I'll go ahead and fix that. Duh... sorry - Anonymous
April 29, 2006
Thanks for the continued discussion. I liked the original post, as it was thought-provoking. I know you stated that you wanted to keep things simple for examples sakem but with simplicity comes limitation, too.
Example 2 was the overall looser, I agree, not only because its confusing complexity, but because of the fact that it could deadlock if the stream writer was contentious. From the examples, the goal was to log ticks on a count in the order DoIt() was called. Simplifying and forcing the actual logging up the stack is one way of solving the problem, but you loose the central "logging-here" nature you had before. You have to duplicate code to log from each place you call DoIt(), rather than logging from a single place.
There has to be a better solution to the problem, I think. While it may not maintain the ultimate simplicity, part of programming is finding a balance between simplicity and functionality. There should be a way to provide centralized, ordered logging to an arbitrary stream without having the locking issues of the current examples. Perhapse it includes multiple locks, one for each different part of the work being done, or perhapse its more complicated than that. Anyway, I guess I'm the type who doesn't stop with the simplest solution when I know there has to be a better way that maintains a balance of functionality and simplicity. ;) - Anonymous
May 03, 2006
Of course when I say "solution" I really mean "my solution" because there's not really just one...