I'm experimenting with a distributed sorting routine where you have a large file which is read into memory and divided into chunks. Each chunk is passed to a thread where it is sorted.
A useful abstraction that I read about is a "thread barrier" which is basically a waiting point which causes all threads to block until all threads have reached the barrier. I've implemented a sorting routine in C which uses the thread barrier, and it works fine, but I have some important questions.
It seems to me that the only way to actually get a distributed sorting algorithm like the one I describe to actually work properly is to use *TWO* thread barriers. Although, the textbooks I read from seem to imply that only one thread barrier is necessary.
The following pseudo-code outlines the program I wrote, and shows why two thread barriers are necessary to properly synchronize the sorting. The worker thread goes like:
void* worker_thread(void* state)
{
// create unique ID for this thread
int id = uniqueID
// main loop:
for(;;) {
barrier_wait(&Barrier1);
begin = sort_task[id].start_pointe
r;
end = begin + sort_task[id].nelements;
sort(begin, end);
if (terminate) return 0;
barrier_wait(&Barrier2);
}
}
And the control thread goes like:
void control()
{
// set up threads and all that...
for (unsigned i = 0; i < nchunks; ++i) {
// read in next chunk of file
if (i == nchunks - 1) terminate = true;
// Distribute sorting task among threads
for (unsigned k = 0; k < nthreads; ++k) {
sort_task[k].start_pointer
= &io_buffer[byte_offset];
sort_task[k].nelements = segment_size / sizeof(int);
byte_offset += segment_size;
}
// Wait for workers threads to become ready before proceeding
barrier_wait(&Barrier1);
if (terminate) {
// Join with all worker threads and then exit
}
else {
// Wait for worker threads to finish before proceeding to the next
// iteration
barrier_wait(&Barrier2);
}
// Finally write the sorted output buffer to disk
}
}
The above outline works fine, but requires TWO barrier waits. Is that normal? I can't see how this can work without two barrier waits. The first barrier wait is needed so that all the worker threads wait until the control thread allows them to start sorting. The second barrier wait is needed so that the control thread waits until the worker threads are done sorting.
Again, this works fine - but I suspect I'm not doing this correctly, because the textbooks I've read do not mention a need to use TWO barrier waits. So am I doing this correctly, or is there some easier way which only involves a single barrier wait?
Start Free Trial