simple question about parallel programming

3 comments, last by KulSeran 13 years, 6 months ago
Hey,

Let's say we have a data structure and 2 threads: one writing and one reading. We know that if we don't use some thread-safety technique in this scenario, something bad may occur. But I'm having trouble understanding what this "something bad" is. I think there are 2 options:

1) The reader could read the data just as the writer is writing it, so the reader's data stays old although it has actually changed. This scenario sounds bad if you're doing something time-critical but it's ok if your read operations are frequent and error tolerant.

or

2) If your read and write operations somehow start at the same time, this crashes something (application, OS, something...). This scenario is unacceptable in any case.

Which option defines the "something bad" in thread-safety definitions? Or is there something else?

Quote:Original post by troxtril
2) If your read and write operations somehow start at the same time, this crashes something (application, OS, something...). This scenario is unacceptable in any case.

This shouldn't cause a crash (directly).

Quote:Which option defines the "something bad" for thread safety definitions?


That's for you to decide. If you take "something bad" to mean "broken invariant", you won't go far wrong. So, though you can usually read and write an int atomically (with one instruction) on x86 (assuming appropriate alignment), you can't do the same with multiple ints without some kind of mutual exclusion mechanism. If the state of those ints is related in some fashion, just reading and writing them willy-nilly will likely violate an implicit assumption made by the wider system.
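To make the "broken invariant" idea concrete, here is a minimal C++ sketch. The `Balance` class, its invariant (`left + right == 100`) and the `count_broken_snapshots` helper are all made up for illustration; a mutex is just one of the possible mutual exclusion mechanisms.

```cpp
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical example type: two ints that must always sum to 100.
class Balance {
public:
    // Update both ints as one indivisible step from the reader's point of view.
    void transfer(int amount) {
        std::lock_guard<std::mutex> lock(mutex_);
        left_ -= amount;
        right_ += amount;
    }

    // Read both ints under the same lock, so the pair belongs together.
    std::pair<int, int> snapshot() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return {left_, right_};
    }

private:
    mutable std::mutex mutex_;
    int left_ = 100;
    int right_ = 0;
};

// Runs one writer and one reader thread and returns how many snapshots
// violated the invariant. With the lock in place this should be zero.
int count_broken_snapshots(int iterations) {
    Balance balance;
    int broken = 0;  // only touched by the reader thread, read after join

    std::thread writer([&] {
        for (int i = 0; i < iterations; ++i) {
            balance.transfer(i % 2 == 0 ? 1 : -1);
        }
    });
    std::thread reader([&] {
        for (int i = 0; i < iterations; ++i) {
            const std::pair<int, int> values = balance.snapshot();
            if (values.first + values.second != 100) {
                ++broken;
            }
        }
    });

    writer.join();
    reader.join();
    return broken;
}
```

If you drop the two `lock_guard` lines, the reader can land between `left_ -= amount` and `right_ += amount` and see a sum that never "really" existed (and, formally, the program then has a data race).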

[Edited by - the_edd on October 7, 2010 6:44:00 AM]
3) Depending on the compiler and/or underlying memory implementation, the write may never show up for the reader.

4) The reader may read a value that was never written, for instance if the write isn't atomic (half-write, read, half-write: the reader now has half new data, half old data). This also applies to groups of variables.
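For points 3 and 4, one common remedy in C++11 and later is `std::atomic`. The sketch below is only an illustration (the `publish_and_read` function and the value 42 are made up): the writer fills in plain data and then sets an atomic flag, and the reader waits on that flag before touching the data.

```cpp
#include <atomic>
#include <thread>

// Hypothetical helper: a writer publishes a value, a reader waits for it.
int publish_and_read() {
    int payload = 0;                 // plain data, written before the flag
    std::atomic<bool> ready{false};  // flag the reader polls
    int seen = -1;

    std::thread reader([&] {
        // The acquire load pairs with the release store below, so once
        // the flag is seen as true, the write to payload is visible too.
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        seen = payload;
    });
    std::thread writer([&] {
        payload = 42;
        ready.store(true, std::memory_order_release);
    });

    writer.join();
    reader.join();
    return seen;
}
```

With a plain `bool` instead of `std::atomic<bool>`, the compiler is allowed to read the flag once and spin forever, which is point 3. A `std::atomic<T>` load also never returns a half-written value, which covers point 4 for a single variable; groups of variables still need a lock or similar.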


Your scenario 1 can be a valid use case; chaotic relaxation when searching for stable states, for instance.

I don't think scenario 2 will ever occur in practice, though.
Million-to-one chances occur nine times out of ten!
Quote:Original post by troxtril
Which option defines the "something bad" in thread-safety definitions? Or is there something else?

Incorrect or potentially incorrect behaviour, which would have been correct in a single-threaded model.

Quote:it's ok if your read operations are frequent and error tolerant.

The problem is that error tolerance is actually very hard and very specific to the language and platform you're using. Frequency is irrelevant really.

Say you're copying a string. So you call strlen() on the string to find out how long it is, allocate a buffer of that size (plus one for the terminator), then call strcpy() to copy the string. In a multi-threaded system, the other thread might double the length of the string you're copying after you read the length but before you copy it, so that it becomes too big for your buffer, causing a crash.

How can you be 'tolerant' of that? You can use strncpy to ensure that you don't overrun your buffer, so you've solved your crash. But now you still find that you copied a prefix of the new string instead of the entire old string or the entire new string. So the behaviour is still incorrect. To ensure correct behaviour you'd have to use a more complex mechanism.
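One such mechanism, sketched here under assumptions, is to do the "measure" and the "copy" while holding the same mutex that the writer holds when it changes the string. The `SharedText` struct and the `set_text` / `copy_text` functions are hypothetical names, and the fixed 64-byte buffer is only there to keep the example short.

```cpp
#include <cstring>
#include <mutex>
#include <string>

// Hypothetical shared state: a C string plus the mutex that guards it.
struct SharedText {
    std::mutex mutex;
    char text[64] = "hello";
};

// Writer: replace the text while holding the lock.
void set_text(SharedText& shared, const char* value) {
    std::lock_guard<std::mutex> lock(shared.mutex);
    std::strncpy(shared.text, value, sizeof(shared.text) - 1);
    shared.text[sizeof(shared.text) - 1] = '\0';  // strncpy may not terminate
}

// Reader: measure and copy under one lock, so the length and the
// contents always describe the same version of the string.
std::string copy_text(SharedText& shared) {
    std::lock_guard<std::mutex> lock(shared.mutex);
    const std::size_t length = std::strlen(shared.text);
    return std::string(shared.text, length);
}
```

The reader now gets either the entire old string or the entire new string, never a prefix of one mixed with the length of the other. The price is that both threads have to agree to go through `set_text` and `copy_text`; one stray unlocked access brings the problem back.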
Quote:
Let's say we have a data structure and 2 threads: one writing and one reading.

1) The reader could read the data just as the writer is writing it, so the reader's data stays old although it has actually changed. This scenario sounds bad if you're doing something time-critical but it's ok if your read operations are frequent and error tolerant.

Mike nl almost nailed a key point for you. Your "data structure" isn't atomically writeable.
If you have a std::string, the exact process Kylotan outlined could happen. A string operation like "cin >> your_name" looks like one operation, but it is many operations. If you don't lock, any invariants in your data structure can be considered broken. Locking around your structure's reads and writes makes the operations behave properly.
