Using Ansible to spawn a VM and then perform tasks on it

When working with VM instances in the cloud, it is often necessary to spawn one and provision it with certain packages. It would be very cumbersome to expect an engineer to do this manually every time.

Ansible provides modules for provisioning VM instances with various cloud service providers, as well as modules for performing tasks on a server. So the order of play starts to become clear:

  1. Provision a server
  2. SSH into it
  3. Perform provisioning operations

Server? What server?!

Now it is easy to do 2) and 3) with Ansible - almost every example you will see is based on that. So one would assume that a playbook that looks something like this would be sufficient:

- hosts: localhost
  tasks:
    - name: "Spawn a server"
      ec2:
        instance_type: t2.micro
        image: "ami-8b8c57f8"
        region: eu-west-1
        wait: yes
        state: present
        instance_tags:
          Name: test_server

- hosts: tag_Name_test_server
  become: true
  tasks:
    - name: "Install postgresql"
      yum:
        name: postgresql
        state: present
        update_cache: true

But there is a problem when we bring spawning a server into the mix: the inventory. Normally, the inventory (the list of servers and metadata about them, such as IP addresses) is loaded once, when Ansible first runs. (See the Ansible documentation for more details about the inventory.) This means that when the second play tries to connect to the server to install postgresql, Ansible will not know about it, since it loaded its inventory at the beginning of the playbook, before the server even existed.

The solution is the following task:

- meta: refresh_inventory

This causes Ansible to reload its inventory, so from that point onwards the inventory will include any new servers that were created since the playbook began.
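
For the refreshed inventory to place the new instance in the tag_Name_test_server group, the inventory source has to be dynamic and has to group hosts by their EC2 tags. The original setup is not shown here, so the following is only a minimal sketch, assuming the aws_ec2 inventory plugin from the amazon.aws collection is installed; the file name is hypothetical (the plugin only requires that it ends in aws_ec2.yml or aws_ec2.yaml).

# inventory/test.aws_ec2.yml (hypothetical file name)
# Assumed usage: ansible-playbook -i inventory/test.aws_ec2.yml playbook.yml
plugin: amazon.aws.aws_ec2
regions:
  - eu-west-1
filters:
  # Only list instances that are actually running
  instance-state-name: running
keyed_groups:
  # Builds one group per tag, e.g. tag_Name_test_server for Name=test_server
  - key: tags
    prefix: tag

With an inventory like this, each refresh queries EC2 again, which is why the instance spawned earlier in the run shows up under its tag group afterwards.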

I said, what server?!

Unfortunately, adding the meta: refresh_inventory task between the spawning and provisioning tasks is not all that is required. It takes a while for a cloud instance to boot after it is spawned, yet Ansible will try to connect to it straight away and fail because the SSH server has not started yet. The wait_for module to the rescue.

By adding a task like the following, Ansible will (as the name suggests) wait for a connection on the given port. The search_regex parameter causes it to wait not just until the port is open, but until the server responds with a string matching it when the connection is open, which is what we need in this case:

- hosts: tag_Name_test_server
  become: false
  gather_facts: no
  tasks:
    - name: "Wait for server port 22 to appear"
      local_action:
        module: wait_for
        host: "{{ inventory_hostname }}"
        port: 22
        delay: 15
        timeout: 120
        search_regex: "OpenSSH"

Note the following:

local_action is used while the play targets the new host, so the task itself runs on the local machine but we can still look up the hostname of the new host through inventory_hostname.

become (previously known in Ansible as 'sudo') is set to false because, since we are performing the task on the local host, we do not need or want to sudo to the root user.

gather_facts is set to no - gathering facts tells Ansible to connect to the host and look up some metadata like IP addresses and number of CPUs, which would completely defeat the purpose of what we are trying to do in the first place.
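
To make the role of local_action more concrete, the same wait can also be written with delegate_to, which Ansible treats as equivalent to local_action when the delegation target is localhost. This is only an alternative sketch of the task shown earlier, not a change to the approach: the module and all of its parameters are the same.

- hosts: tag_Name_test_server
  become: false
  gather_facts: no
  tasks:
    - name: "Wait for server port 22 to appear"
      wait_for:
        # Still resolves to the new instance, because the play targets it
        host: "{{ inventory_hostname }}"
        port: 22
        delay: 15
        timeout: 120
        search_regex: "OpenSSH"
      # Run the check from the control machine instead of the new host
      delegate_to: localhost

Either form works; delegate_to keeps the module name at the top of the task, which some people find easier to read.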

Here is the entire playbook. We are also adding a predefined SSH key and creating a security group for it so we can SSH to the host.

- hosts: localhost
  tasks:
    - name: "Create security group to allow SSH access"
      ec2_group:
        name: "test-server"
        description: "test server firewall"
        region: eu-west-1
        state: present
        rules:
          - proto: tcp
            from_port: 22
            to_port: 22
            cidr_ip: 0.0.0.0/0
        rules_egress:
          - proto: all
            cidr_ip: 0.0.0.0/0

    - name: "Spawn a server"
      ec2:
        instance_type: t2.micro
        image: "ami-8b8c57f8"
        groups: "test-server"
        region: eu-west-1
        wait: yes
        key_name: "My SSH key"
        state: present
        instance_tags:
          Name: test_server

    - name: "Refresh inventory"
      meta: refresh_inventory

- hosts: tag_Name_test_server
  become: false
  gather_facts: no
  tasks:
    - name: "Wait for server port 22 to appear"
      local_action:
        module: wait_for
        host: "{{ inventory_hostname }}"
        port: 22
        delay: 15
        timeout: 120
        search_regex: "OpenSSH"

- hosts: tag_Name_test_server
  become: true
  user: ec2-user
  tasks:
    - name: "Install postgresql"
      yum:
        name: postgresql
        state: present
        update_cache: true