Wordclouds (and multiple processes) are fun
Let's say you're writing a browser toy to display what people say in their all-important #beer tweets. On the server you want to scan Twitter for #beer and store the tweets. Every so often, a browser will fetch the list of words, then display the most recent words from those tweets as a word cloud.
To reduce complexity, you don't want to add any extra packages, which might be untested and/or sketchy. So what do you do with a standard "batteries included" Python? You use multiprocessing!
The multiprocessing module lets you write programs as a system of connected processes. In this case, one is a producer: it does work, then pushes information to a list of tweets shared across the system. Another process is a consumer: it waits for data from the producer, then processes it for display in the browser as a pretty word cloud.
Without writing asynchronous code, it's hard to do a lot of I/O in a single Python process. By splitting your project into multiple tasks, each with its own process, each task can run on a separate CPU in parallel. The multiprocessing module helps us start and stop processes, and communicate data back and forth.
Server programming: log early and often
It's better to have too much logging than not enough. Your operations people don't know your code the way you do. When they see an overall system problem, it's easier for them to sift out the irrelevant log messages than it is to add more logging to a complex system.
Log early, log often -- you and your operations people will love you for it.
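A sketch of what that looks like with worker processes, using the standard logging module. The `worker` function and its squaring "work" are placeholders; the `%(processName)s` field in the format string is what lets operations tell the processes apart in the output.

```python
import logging
import multiprocessing

def worker(n):
    # Each process sets up its own logger; log early, log often.
    logging.basicConfig(
        level=logging.INFO,
        format="%(asctime)s %(processName)s %(levelname)s %(message)s")
    log = logging.getLogger(__name__)
    log.info("worker %d starting", n)
    result = n * n  # stand-in for real work
    log.info("worker %d finished with %d", n, result)
    return result

if __name__ == "__main__":
    procs = [multiprocessing.Process(target=worker, args=(i,),
                                     name="worker-%d" % i)
             for i in range(3)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```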
This post was inspired by Playing with REALTIME data, Python and D3 by Brett Dangerfield. His code actually scans Twitter and does the word cloud display.
If you're even a little curious about Python, run, don't walk, to get the Python Cookbook by David Beazley and Brian K. Jones. I've been programming in Python for 15 years and learn new tools and techniques from every chapter!
In modern Python 3, take a look at the more graceful concurrent.futures solution.
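For comparison, here's a sketch of the same kind of work with concurrent.futures. `ProcessPoolExecutor` hides the process start/stop and queue plumbing behind a simple `map` call; `count_words` is a made-up stand-in for the per-tweet processing.

```python
from concurrent.futures import ProcessPoolExecutor

def count_words(tweet):
    # Stand-in for whatever per-tweet work the consumer would do.
    return len(tweet.split())

if __name__ == "__main__":
    tweets = ["#beer is great", "cold #beer tonight"]
    with ProcessPoolExecutor() as pool:
        # map distributes the tweets across worker processes for us.
        counts = list(pool.map(count_words, tweets))
    print(counts)
```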