Saturday, August 23, 2014

best practices with Python's multiprocessing module

Thanks to dano and others on StackOverflow, I have more best practices for Python's multiprocessing module:

1) Put everything in an if __name__=='__main__' block instead of at top level.  This is a Windows thing (child processes re-import the main module there), but it's good practice in general: top-level code runs whenever the module is imported, not only when it is executed directly.

2) I'd previously been using signal.pause() to wait for subprocesses to exit and/or die.  Alas, this has two problems: a) it isn't available on Windows, and b) if a subprocess dies, the parent still hangs around until it is explicitly killed or interrupted.

Here's an example of the right and wrong way to wait for a subprocess.  If the signal.pause() path is enabled, the program runs forever, ignoring the child process that exits with an error.  The join() path is more verbose, but it runs on Windows and returns as soon as each child exits.

source

    import multiprocessing, signal, sys, time
    
    def wait_crash():
        time.sleep(1)
        sys.exit('crash')           # child exits with an error
    
    if __name__=='__main__':
        procs = [ multiprocessing.Process(target=wait_crash) ]
        for proc in procs:
            proc.start()
        if 0:
            signal.pause()          # wrong: not on Windows, and hangs forever
        else:
            for proc in procs:
                proc.join()         # right: returns once each child exits

output

    crash
