Saturday, April 18, 2009

Fast Parallel Downloading (for apt-get)

I'm rebuilding an Ubuntu server. Normally apt-get downloads one file at a time, which can get dull when you're installing 598 packages. I found the tool "apt-fast", which downloads one or two files quickly by using multiple streams per file. This is somewhat sketchy: it requires installing additional software, assumes the pieces of each file get spliced together correctly, and doesn't handle network problems gracefully.

I have a solution: xargs

Xargs walks on water. It is incredibly useful. In a nutshell, it reads a list of items (file names, URLs, whatever you give it) from standard input and runs a command on them. I'll post a lot more later, but here's how to speed up apt-get:
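
For anyone who hasn't met xargs yet, here is a minimal sketch of what it does. The file names are made up for illustration, and "echo wget" only prints the command line xargs would run, so nothing is downloaded.

```shell
# Three made-up file names, one per line, stand in for a real list.
names=$(printf '%s\n' a.deb b.deb c.deb)

# xargs appends the items it reads from stdin to the command it is given.
one_call=$(printf '%s\n' "$names" | xargs echo wget)
echo "$one_call"    # wget a.deb b.deb c.deb

# -n 1 limits each invocation to one item, so the command runs three times.
per_item=$(printf '%s\n' "$names" | xargs -n 1 echo wget)
echo "$per_item"
```

Swapping "echo wget" for a real command is all it takes to go from a dry run to the real thing.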

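# Work in apt's package cache so apt-get finds the downloaded .deb files
cd /var/cache/apt/archives/
# List the URIs apt-get would fetch, without downloading anything
apt-get -y --print-uris install ubuntu-desktop^ > debs.list
# Pull out the URLs and fetch them with several wget processes at once
grep -E -o -e "http://[^\']+" debs.list | xargs -l3 -P5 wget -nv
# Install as usual; the packages are already sitting in the cache
apt-get -y install ubuntu-desktop^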

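Replace "ubuntu-desktop^" with whichever task or package you want. ubuntu-desktop is a task, a huge collection of packages, and the "^" on the end is what tells apt-get to treat the name as a task rather than a single package, so it is required (and magic). For an ordinary package, leave the "^" off.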
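
To see what the URL-extraction step in the commands above is doing, here is a small sketch run against made-up sample output. I'm assuming each package line from --print-uris starts with the URL in single quotes, followed by the file name, size, and checksum. The package, size, and checksum below are invented. The pattern only matches http:// URLs, so sources that use https would need it adjusted.

```shell
# Invented sample of what "apt-get --print-uris" might write to debs.list.
sample=$(cat <<'EOF'
Reading package lists...
'http://archive.ubuntu.com/ubuntu/pool/main/h/hello/hello_2.2-2_i386.deb' hello_2.2-2_i386.deb 52310 MD5Sum:0123456789abcdef0123456789abcdef
EOF
)

# -o prints only the matching part of each line: everything from "http://"
# up to (but not including) the closing single quote. Lines without a URL
# produce no output at all.
url=$(printf '%s\n' "$sample" | grep -oE "http://[^']+")
echo "$url"
```

The result is one bare URL per line, which is exactly the kind of list xargs likes to eat.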

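The options say to hand three lines of input, meaning three package URLs, to each wget (-l3, an older spelling of -L 3), and to run up to five of those wget batches at a time in parallel (-P5). Each wget fetches its three files one after another, so at most five downloads are in flight at once. These settings are arbitrary, but provide a nice speedup while also not hammering the Ubuntu repository servers too hard.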
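
If you want to check the batching before pointing it at a real mirror, a dry run along these lines should do. The URLs are placeholders I made up, and "echo wget -nv" prints each command line instead of downloading. I've left -P off here so the output order stays predictable; adding -P 5 would run up to five of these at once.

```shell
# Seven placeholder URLs, one per line, standing in for the extracted list.
urls=$(for i in 1 2 3 4 5 6 7; do echo "http://example.invalid/pkg$i.deb"; done)

# -L 3 gives each command at most three input lines, so seven URLs
# become three command lines: 3 + 3 + 1.
batches=$(printf '%s\n' "$urls" | xargs -L 3 echo wget -nv)
echo "$batches"
```

Raising the batch size means fewer wget processes get started; raising -P means more connections to the mirror at once, so be gentle.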

