Friday, June 17, 2016

Quality DevOps: installing and verifying Network Time Protocol (NTP)

I lurve Ansible. It lets me install or update software on one or 100 instances, easily. The entire system becomes a set of scripts to run and run and run again until I get things exactly the way I want them.

In today's devops ecosystem, where "infrastructure is code", how do we test our infrastructure?

Ansible gives us one way to do this. When we install or update a service, we also run a service-specific command to make doubly sure that things are working as expected. If something's not quite right, Ansible will abort and we can figure out what went kablooey.

Save the following into "ntp.yml" and run with ansible-playbook -vvi myhost ntp.yml

Thanks to phillipuniverse !

# ntp.yml -- install NTP time sync daemon
# Adapted from
# USAGE: ansible-playbook -vvi myhost ntp.yml

- hosts: all
  become: yes
  gather_facts: no


  tasks:

    - name: Install NTP
      apt: package=ntp state=present update_cache=yes
      tags: ntp

    - name: Make sure NTP is started up
      service: name=ntp state=started enabled=yes
      tags: ntp

    # Fail the play unless timedatectl reports the clock as synchronized.
    - name: verify NTP synchronized
      command: timedatectl status
      register: ntp_result
      changed_when: false
      failed_when: "'synchronized: yes' not in ntp_result.stdout"
      tags: ntp

  handlers:

    # Runs only when a task notifies it; nothing above does yet.
    - name: restart ntp
      service: name=ntp state=restarted
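
To see what that failed_when line is really checking, here's a tiny Python sketch of the same substring test. The sample strings are made up for illustration, not captured from a real host. As far as I know, older systemd releases print "NTP synchronized" and newer ones print "System clock synchronized", and the substring matches both.

```python
def ntp_synchronized(timedatectl_output: str) -> bool:
    """Mirror the playbook's failed_when test: look for 'synchronized: yes'."""
    return "synchronized: yes" in timedatectl_output


# Illustrative samples only (assumed wording, not real captured output).
OLD_STYLE = "NTP enabled: yes\nNTP synchronized: yes\n"
NEW_STYLE = "System clock synchronized: yes\nNTP service: active\n"
NOT_YET = "NTP enabled: yes\nNTP synchronized: no\n"
```

One thing to keep in mind: right after a fresh install the daemon may not have synced yet, so a "no" here can simply mean "try again in a minute".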

Thursday, June 2, 2016

DevOps, Availability, and Risk

Excellent points from an episode of Arrested DevOps, entitled "Who Owns Your Availability?" (TLDR: you do!)

My thoughts:

- technical risk can produce business risk, as in "your hundred employees can't do anything for an hour", up to "your database is gone therefore the company is gone" kinds of risk.  Or, "feature X doesn't work for user class Y" kinds of risk. Do you as a business prioritize consumers paying you, or you delivering their stuff, or your admins/phone people delighting your customers, or your developers fixing bugs?

From the show (Charity Majors, Pete Cheslock, ADO crew). (Anything in quote marks is my foggy recollection, not an exact quote.):

- cache ("vendor") your dependencies

If you can't deploy to production because GitHub or a 3rd party package server in China is down, things are not good.  Likewise, if your server is connecting to China and all your packages are local, perhaps it's time for a security check. (If you don't know what servers your server is talking to, that's another risk.)

- what is your Risk Profile? What is considered acceptable risk?

When your company is just starting out, it's probably fine to rely on the internet being available all the time. Not being able to deploy for an hour or a day might be okay. Spending resources on growing your company might be a good tradeoff vs security and availability.

- your dependencies are cached. What about deps of deps of deps?

- "Packerize the base"

If your system has a baked, reliable base, with a little bit of changes on top, then it's easier to track down and fix things that break.

One mechanism is "baking" all your random dependencies into a Docker layer. Another is a network volume or bucket you control -- Amazon S3 for example (deb-s3). It can still go down, but if it's up you get everything in one place. It'll be there for you even if the original host is not happy for whatever reason. One person mentioned she had more problems with GitHub's reliability than her own.

Another failure mode: known-good version is broken. Your business depends on the "beer-1.0" package. It's been working fine for months.  Developer gets drunk and uploads a broken package, but uses the same version number -- "beer-1.0" is now broken.  You can no longer make changes to your business!  Since you own your availability, it's your problem.

- "if you treat your devs like children, they'll act like children. They'll become subject matter experts on doing things the wrong way. We as devops can be spirit guides, career counselors for your leveling up skills." Developers own the code, the availability. Give them pagers and wake them up when the site has problems.

- site should have "circuit breakers" - if the site is in "continuous partial failure", that's better than just being down for everyone, full stop.
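
To make the "circuit breaker" idea concrete, here's a small sketch, assuming a single-threaded caller. The class, its parameters, and the thresholds are all hypothetical and not from the show. After a few failures in a row it stops calling the broken dependency for a while and serves a fallback instead, so the rest of the site keeps working.

```python
import time


class CircuitBreaker:
    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures  # failures in a row before opening
        self.reset_after = reset_after    # seconds to wait before retrying
        self.clock = clock                # injectable so tests can fake time
        self.failures = 0
        self.opened_at = None             # None means the circuit is closed

    def call(self, func, fallback):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                return fallback()         # open: skip the broken dependency
            self.opened_at = None         # waited long enough: allow one trial
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            return fallback()
        self.failures = 0                 # success closes the circuit fully
        return result
```

Usage would look something like `breaker.call(fetch_recommendations, lambda: [])`, where `fetch_recommendations` stands in for whatever flaky thing your page depends on.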

I dig the Arrested DevOps podcast, and listen to it often. Thanks!

Monday, May 30, 2016

Django trick: keep "runserver" from crashing on Python syntax error

When developing Django, the "runserver" command is convenient. It runs our appserver, and reloads itself when we change our app's source code.  We can have a rapid "edit stuff then see what happened" cycle.

However, runserver has an annoying habit. If we're typing so fast that we add a Python syntax error to our code, the command will crash. We expect that when we fix the syntax error we'll see our results, but that doesn't happen. The appserver has crashed, and is no more.

The following workaround works wonders. If we add a syntax error, "runserver" will crash, and this loop will wait for a moment then re-run it. We can now type as fast as possible all the time, and Django will always show us what we want to see. Or, what we give it, which sometimes is enough ;)

while true; do ./ runserver; sleep 15; done


In my creative and job work, I've learned how to move around and have great ideas.

It turns out these things are related.

In "Doodle Revolution", Sunni Brown points out that moving and creative thinking are related.  Steve Jobs would go on "power walks" not to clear his mind but to focus and to work through creative blocks.  Einstein would play the violin, improvising melodies while pondering complicated problems. Tesla could design and run machines in his mind, not bothering to draw them out!

At work I walk around as much as possible. Often I'll jog around the block to stimulate the little gray cells. Before or after lunch I'll draw my coworkers or tourists or cute dogs. At meetings and when learning new material I'll take notes on paper, adding lines and diagrams and fonts as much as possible. It's fun, and it helps me concentrate on the material and integrate it into my brain.

I recommend everyone go out and buy Brown's book, or at least watch her TED talk "Doodlers, unite!" She's on the twitters at @SunniBrown

Here are my notes from 2015's Continuous Delivery "cdSummit SoCal" conference

Friday, April 29, 2016

talk: Functional Programming and Django QuerySets

Thanks to everyone for coming, to Media Temple for hosting, and to Esther for coordinating! I thought Justin's talk (his first!) on Swagger was fun and extremely useful. I hope everyone had a good time.

Here are the slides and notes for "Functional Programming and Django QuerySets" (2016 edition) -- have fun!

Friday, April 15, 2016

talk: Functional Programming and Django QuerySets soon!

I'll be speaking at the next SoCal Python meetup! If you want to go, sign up soon, there are only a few spots left.

This talk is mostly about Functional Programming in the Python world.  FP is great for testing, but it can be a bit mysterious at times. I highlight the awesomeness of FP, the dangerous bit, and how it is related to Django QuerySets.

I've given this talk a few times, and people always get excited about it. FP is not magic!  It's a clean and graceful way to write code.

Here's a link with my previous version's slides, notes, and other references:
Each time I give a talk I change it up a little or a lot. This time I'll probably add more cat photos.

Thursday, March 24, 2016

UPDATED: quickly download lots of Python packages

This trick downloads Python packages up to 9x faster than normal:

egrep -o '^([A-Za-z].*==[^ ]+)' requirements.txt | xargs -n1 -P9 pip download

After things are downloaded, actually build and install the packages:

pip install -r requirements.txt

EDIT: original code gave `egrep: Invalid range end` -- fixed. Also added "-n1" to xargs so it'll download in parallel, vs sequentially.
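
If you're curious which lines the egrep pattern picks out of requirements.txt, here's a rough Python equivalent. The sample file contents and the example.com URL are invented for illustration, and Python's regex engine isn't identical to egrep's, so treat this as an approximation.

```python
import re

# Same pattern as the egrep call: a line starting with a letter,
# then "==", then the version up to the first space.
PINNED = re.compile(r"^([A-Za-z].*==[^ ]+)")


def pinned_requirements(text):
    """Return the 'name==version' entries the pattern would extract."""
    found = []
    for line in text.splitlines():
        match = PINNED.match(line)
        if match:
            found.append(match.group(1))
    return found


# Made-up sample file for illustration.
SAMPLE = """\
# comment lines are skipped
Django==1.9.5
requests==2.9.1  # trailing comment is dropped
-e git+https://example.com/repo.git#egg=thing
gunicorn
"""
```

Note that unpinned lines (like "gunicorn" above) and "-e" lines are skipped by the parallel download, but the later `pip install -r requirements.txt` still picks them up.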