C++ - How to make Win32/MFC threads loop in lockstep?
I'm new to multithreading in Windows, so this might be a trivial question: what's the easiest way of making sure that threads perform a loop in lockstep?
I tried passing a shared array of events to all threads and using WaitForMultipleObjects at the end of the loop to synchronize them, but this gives me a deadlock after one, sometimes two, cycles. Here's a simplified version of my current code (with just 2 threads, but I'd like to make it scalable):
    typedef struct
    {
        int rank;
        HANDLE* step_events;
    } IterationParams;

    int main(int argc, char **argv)
    {
        // ...

        IterationParams p[2];
        HANDLE step_events[2];
        for (int j=0; j<2; ++j)
        {
            step_events[j] = CreateEvent(NULL, FALSE, FALSE, NULL);
        }

        for (int j=0; j<2; ++j)
        {
            p[j].rank = j;
            p[j].step_events = step_events;
            AfxBeginThread(Iteration, p+j);
        }

        // ...
    }

    UINT Iteration(LPVOID pParam)
    {
        IterationParams* p = (IterationParams*)pParam;
        int rank = p->rank;

        for (int i=0; i<100; i++)
        {
            if (rank == 0)
            {
                printf("%dth iteration\n",i);
                // do something
                SetEvent(p->step_events[0]);
                WaitForMultipleObjects(2, p->step_events, TRUE, INFINITE);
            }
            else if (rank == 1)
            {
                // do something else
                SetEvent(p->step_events[1]);
                WaitForMultipleObjects(2, p->step_events, TRUE, INFINITE);
            }
        }
        return 0;
    }
(I know I'm mixing C and C++; it's actually legacy C code that I'm trying to parallelize.)
Reading the docs at MSDN, I think this should work. However, thread 0 only prints once, occasionally twice, and then the program hangs. Is this a correct way of synchronizing threads? If not, what would you recommend (is there really no built-in support for a barrier in MFC?).
EDIT: this solution is WRONG, even including Alessandro's fix. For example, consider this scenario:
- Thread 0 sets its event and calls Wait, blocks
- Thread 1 sets its event and calls Wait, blocks
- Thread 0 returns from Wait, resets its event, and completes a cycle without Thread 1 getting control
- Thread 0 sets its own event and calls Wait. Since Thread 1 had no chance to reset its event yet, Thread 0's Wait returns immediately and the threads go out of sync.
So the question remains: how does one safely ensure that the threads stay in lockstep?
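The interleaving in the list above can be replayed deterministically. The following is a minimal single-threaded sketch, not Win32 code: `EventModel` and `replay_bad_interleaving` are hypothetical names, the events are modeled as plain booleans, and it assumes manual-reset events that each thread resets itself after waking, as the scenario describes.

```cpp
#include <array>

// Hypothetical model of the two-event scheme (plain booleans, no real threads).
struct EventModel {
    std::array<bool, 2> signaled = {{false, false}};  // one manual-reset event per thread
    std::array<int, 2> completed = {{0, 0}};          // barrier crossings finished per thread

    void set(int rank) { signaled[rank] = true; }

    // Models a wait-all on both events: it can return only when both are signaled.
    bool wait_would_return() const { return signaled[0] && signaled[1]; }

    // After waking, a thread resets its own event and counts the crossing as done.
    void wake_and_reset(int rank) {
        signaled[rank] = false;
        ++completed[rank];
    }
};

// Replays the interleaving from the scenario and returns how many barrier
// crossings thread 0 is ahead of thread 1 (0 would mean lockstep).
// Returns -1 if the model does not behave as the scenario assumes.
inline int replay_bad_interleaving() {
    EventModel m;
    m.set(0);                                // thread 0 sets its event...
    if (m.wait_would_return()) return -1;    // ...and blocks: thread 1 is not there yet
    m.set(1);                                // thread 1 sets its event; both are signaled
    if (!m.wait_would_return()) return -1;
    m.wake_and_reset(0);                     // thread 0 wakes first and resets its event
    m.set(0);                                // thread 0 reaches the barrier again
    if (!m.wait_would_return()) return -1;   // thread 1's event is still set: no blocking
    m.wake_and_reset(0);                     // thread 0 crosses a second time
    return m.completed[0] - m.completed[1];  // thread 1 has not crossed even once
}
```

In this model `replay_bad_interleaving()` returns 2: thread 0 has crossed the barrier twice while thread 1 is still waiting to be scheduled after its first wait.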
Introduction
I implemented a simple C++ program for your consideration (tested in Visual Studio 2010). It uses only Win32 APIs (and the standard library for console output and a bit of randomization). You should be able to drop it into a new Win32 console project (without precompiled headers), compile and run.
Solution
    #include <stdio.h>
    #include <stdlib.h>
    #include <tchar.h>
    #include <windows.h>

    //---------------------------------------------------------
    // Defines the synchronization info structure. All threads will
    // use the same instance of this struct to implement the rendezvous/
    // barrier synchronization pattern.
    struct SyncInfo
    {
        SyncInfo(int threadsCount) : awaiting(threadsCount), threadsCount(threadsCount), semaphore(::CreateSemaphore(0, 0, 1024, 0)) {};
        ~SyncInfo() { ::CloseHandle(this->semaphore); }
        volatile LONG awaiting; // how many threads still have to complete their iteration (LONG matches InterlockedDecrement)
        const int threadsCount;
        const HANDLE semaphore;
    };

    //---------------------------------------------------------
    // Thread-specific parameters. Note that sync is a reference
    // (i.e. all threads share the same SyncInfo instance).
    struct ThreadParams
    {
        ThreadParams(SyncInfo &sync, int ordinal, int delay) : sync(sync), ordinal(ordinal), delay(delay) {};
        SyncInfo &sync;
        const int ordinal;
        const int delay;
    };

    //---------------------------------------------------------
    // Called at the end of each iteration, it will "rendezvous"
    // (meet) with all the other threads before returning (so that
    // the next iteration can begin). In practical terms this function
    // will block until all the other threads finish their iteration.
    static void RandezvousOthers(SyncInfo &sync, int ordinal)
    {
        if (0 == ::InterlockedDecrement(&(sync.awaiting))) { // are we the last ones to arrive?
            // at this point, all the other threads are blocking on the semaphore
            // so we can manipulate shared structures without having to worry
            // about conflicts
            sync.awaiting = sync.threadsCount;
            wprintf(L"Thread %d is the last to arrive, releasing synchronization barrier\n", ordinal);
            wprintf(L"---~~~---\n");

            // let's release the other threads from their slumber
            // by using the semaphore
            ::ReleaseSemaphore(sync.semaphore, sync.threadsCount - 1, 0); // "threadsCount - 1" because this last thread will not block on the semaphore
        }
        else { // nope, there are other threads still working on the iteration so let's wait
            wprintf(L"Thread %d is waiting on synchronization barrier\n", ordinal);
            ::WaitForSingleObject(sync.semaphore, INFINITE); // note that the return value should be validated at this point ;)
        }
    }

    //---------------------------------------------------------
    // Defines the worker thread lifetime. It starts by retrieving
    // the thread-specific parameters, then loops through 5 iterations
    // (rendezvous-ing with the other threads at the end of each),
    // and then finishes (the thread can then be joined).
    static DWORD WINAPI ThreadProc(void *p)
    {
        ThreadParams *params = static_cast<ThreadParams *>(p);
        wprintf(L"Starting thread %d\n", params->ordinal);

        for (int i = 1; i <= 5; ++i) {
            wprintf(L"Thread %d is executing iteration #%d (%d delay)\n", params->ordinal, i, params->delay);
            ::Sleep(params->delay);
            wprintf(L"Thread %d is synchronizing end of iteration #%d\n", params->ordinal, i);
            RandezvousOthers(params->sync, params->ordinal);
        }

        wprintf(L"Finishing thread %d\n", params->ordinal);
        return 0;
    }

    //---------------------------------------------------------
    // Program to illustrate the iteration-lockstep C++ solution.
    int _tmain(int argc, _TCHAR* argv[])
    {
        // prepare to run
        ::srand(::GetTickCount()); // pseudo-randomize random values :-)
        SyncInfo sync(4);
        ThreadParams p[] = {
            ThreadParams(sync, 1, ::rand() * 900 / RAND_MAX + 100), // a delay between 100 and 1000 milliseconds simulates the work an iteration does
            ThreadParams(sync, 2, ::rand() * 900 / RAND_MAX + 100),
            ThreadParams(sync, 3, ::rand() * 900 / RAND_MAX + 100),
            ThreadParams(sync, 4, ::rand() * 900 / RAND_MAX + 100),
        };

        // let the threads rip
        HANDLE t[] = {
            ::CreateThread(0, 0, ThreadProc, p + 0, 0, 0),
            ::CreateThread(0, 0, ThreadProc, p + 1, 0, 0),
            ::CreateThread(0, 0, ThreadProc, p + 2, 0, 0),
            ::CreateThread(0, 0, ThreadProc, p + 3, 0, 0),
        };

        // wait for the threads to finish (join)
        ::WaitForMultipleObjects(4, t, TRUE, INFINITE);

        return 0;
    }
Sample output
Running this program on my machine (dual-core) yields the following output:
    Starting thread 1
    Starting thread 2
    Starting thread 4
    Thread 1 is executing iteration #1 (712 delay)
    Starting thread 3
    Thread 2 is executing iteration #1 (798 delay)
    Thread 4 is executing iteration #1 (477 delay)
    Thread 3 is executing iteration #1 (104 delay)
    Thread 3 is synchronizing end of iteration #1
    Thread 3 is waiting on synchronization barrier
    Thread 4 is synchronizing end of iteration #1
    Thread 4 is waiting on synchronization barrier
    Thread 1 is synchronizing end of iteration #1
    Thread 1 is waiting on synchronization barrier
    Thread 2 is synchronizing end of iteration #1
    Thread 2 is the last to arrive, releasing synchronization barrier
    ---~~~---
    Thread 2 is executing iteration #2 (798 delay)
    Thread 3 is executing iteration #2 (104 delay)
    Thread 1 is executing iteration #2 (712 delay)
    Thread 4 is executing iteration #2 (477 delay)
    Thread 3 is synchronizing end of iteration #2
    Thread 3 is waiting on synchronization barrier
    Thread 4 is synchronizing end of iteration #2
    Thread 4 is waiting on synchronization barrier
    Thread 1 is synchronizing end of iteration #2
    Thread 1 is waiting on synchronization barrier
    Thread 2 is synchronizing end of iteration #2
    Thread 2 is the last to arrive, releasing synchronization barrier
    ---~~~---
    Thread 4 is executing iteration #3 (477 delay)
    Thread 3 is executing iteration #3 (104 delay)
    Thread 1 is executing iteration #3 (712 delay)
    Thread 2 is executing iteration #3 (798 delay)
    Thread 3 is synchronizing end of iteration #3
    Thread 3 is waiting on synchronization barrier
    Thread 4 is synchronizing end of iteration #3
    Thread 4 is waiting on synchronization barrier
    Thread 1 is synchronizing end of iteration #3
    Thread 1 is waiting on synchronization barrier
    Thread 2 is synchronizing end of iteration #3
    Thread 2 is the last to arrive, releasing synchronization barrier
    ---~~~---
    Thread 2 is executing iteration #4 (798 delay)
    Thread 3 is executing iteration #4 (104 delay)
    Thread 1 is executing iteration #4 (712 delay)
    Thread 4 is executing iteration #4 (477 delay)
    Thread 3 is synchronizing end of iteration #4
    Thread 3 is waiting on synchronization barrier
    Thread 4 is synchronizing end of iteration #4
    Thread 4 is waiting on synchronization barrier
    Thread 1 is synchronizing end of iteration #4
    Thread 1 is waiting on synchronization barrier
    Thread 2 is synchronizing end of iteration #4
    Thread 2 is the last to arrive, releasing synchronization barrier
    ---~~~---
    Thread 3 is executing iteration #5 (104 delay)
    Thread 4 is executing iteration #5 (477 delay)
    Thread 1 is executing iteration #5 (712 delay)
    Thread 2 is executing iteration #5 (798 delay)
    Thread 3 is synchronizing end of iteration #5
    Thread 3 is waiting on synchronization barrier
    Thread 4 is synchronizing end of iteration #5
    Thread 4 is waiting on synchronization barrier
    Thread 1 is synchronizing end of iteration #5
    Thread 1 is waiting on synchronization barrier
    Thread 2 is synchronizing end of iteration #5
    Thread 2 is the last to arrive, releasing synchronization barrier
    ---~~~---
    Finishing thread 4
    Finishing thread 3
    Finishing thread 2
    Finishing thread 1
Note that, for simplicity, each thread gets a random iteration duration, but all iterations of that thread use the same duration (i.e. it doesn't change between iterations).
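The delay printed in parentheses comes from the expression `::rand() * 900 / RAND_MAX + 100` in the program. As a small worked check of its range, here is the same arithmetic as a standalone function. `delay_ms` is a hypothetical helper, not part of the original program, and the widening to `long long` is an added assumption for platforms where `RAND_MAX` is much larger than MSVC's 32767 and the `int` multiplication could overflow.

```cpp
#include <cstdlib>

// Hypothetical helper: maps a value r in [0, RAND_MAX] to a delay in milliseconds
// using the same formula as the program (r * 900 / RAND_MAX + 100).
inline int delay_ms(int r) {
    // Widen before multiplying so r * 900 cannot overflow when RAND_MAX is large.
    const long long scaled = static_cast<long long>(r) * 900 / RAND_MAX;
    return static_cast<int>(scaled) + 100;
}
```

`delay_ms(0)` gives 100 and `delay_ms(RAND_MAX)` gives 1000, so every delay falls between 100 and 1000 milliseconds, which agrees with the 104 ms delay seen in the sample output.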
How does it work?
The "core" of the solution is in the "RandezvousOthers" function. This function will either block on a shared semaphore (if the thread on which it was called was not the last one to call it) or reset the sync structure and unblock all the threads blocking on the shared semaphore (if the thread on which it was called was the last one to call it).
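For readers without a Windows toolchain, the same "count down, last arrival releases everyone" idea can be sketched in portable C++11. This is not the code above: it swaps the Win32 semaphore for a `std::mutex` plus `std::condition_variable`, and it adds a generation counter, a common guard so that a thread racing ahead into the next iteration cannot be released by the previous round's wakeup. The names `LockstepBarrier` and `run_lockstep` are hypothetical.

```cpp
#include <atomic>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical portable counterpart of SyncInfo + RandezvousOthers.
class LockstepBarrier {
public:
    explicit LockstepBarrier(int threads)
        : threads_(threads), awaiting_(threads), generation_(0) {}

    // Blocks until all threads have called arrive() for the current round.
    void arrive() {
        std::unique_lock<std::mutex> lock(mutex_);
        const unsigned my_generation = generation_;
        if (--awaiting_ == 0) {
            // Last one to arrive: reset the counter and start a new generation.
            awaiting_ = threads_;
            ++generation_;
            released_.notify_all();
        } else {
            // Wait until the generation changes; the predicate also absorbs
            // spurious wakeups.
            released_.wait(lock, [&] { return generation_ != my_generation; });
        }
    }

private:
    std::mutex mutex_;
    std::condition_variable released_;
    const int threads_;
    int awaiting_;
    unsigned generation_;
};

// Runs `threads` workers for `iterations` rounds and reports whether every
// worker saw all others finish round i before it left the barrier for round i.
inline bool run_lockstep(int threads, int iterations) {
    LockstepBarrier barrier(threads);
    std::atomic<int> arrivals(0);
    std::atomic<bool> in_lockstep(true);
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&] {
            for (int i = 1; i <= iterations; ++i) {
                ++arrivals;        // stand-in for the iteration's real work
                barrier.arrive();  // rendezvous with the other workers
                // By now every worker must have counted round i.
                if (arrivals.load() < i * threads) in_lockstep = false;
            }
        });
    }
    for (auto& w : workers) w.join();
    return in_lockstep.load();
}
```

A call such as `run_lockstep(4, 5)` mirrors the four threads and five iterations of the program above. The check inside the worker is only a sanity test for the sketch; real code would put its per-iteration work where `++arrivals` is.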