Auto partition secondary EBS on CentOS 7

If you add an additional blank EBS volume to a CentOS 7 EC2 instance, it won't be auto-partitioned the way the primary volume is resized on launch. Additionally, if you're baking an image you will almost certainly run into problems with the volume not automounting on different instance types, and even trying to mount it manually can give misleading error messages.

The observed behaviour when launching from the stock CentOS 7 AMI

When launching an EC2 instance, any ext3/ext4 filesystem that is already present, either because it was specified in the AMI or because a snapshot was manually specified to populate the secondary EBS volume, will be resized on start up.
However, if no snapshot was given for the additional EBS volume, or the snapshot didn't already have a partition defined (i.e. the disk is empty and uninitialised), then no partitions will be created.
If you pass cloud-init the disk_setup and fs_setup options through the user data, it still doesn't initialise the second disk. To make things more confusing, the cloud-init docs imply that this should just work, with no hint as to what is wrong. It turns out that the CentOS 7 install of cloud-init excludes some of the modules from cloud.cfg that would normally be included.
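You can check this on a running instance: the distro's module list lives in /etc/cloud/cloud.cfg, and on stock CentOS 7 the disk_setup module (which handles both the disk_setup and fs_setup config keys) is not in it. A quick look, assuming nothing beyond the standard tools on the image:

# Show which modules the image runs during the init stage;
# on stock CentOS 7, disk_setup is absent from this list.
grep -A 20 '^cloud_init_modules' /etc/cloud/cloud.cfg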

When launching an EC2 instance from the baked AMI

Another problem comes after creating an AMI from this instance with the root EBS volume plus the secondary EBS volume: on any new EC2 instance launched from that AMI, the second EBS volume that gets attached and populated won't be recognised and mounted.
If you later launch the AMI and override the secondary volume's snapshot with another snapshot created on a different host (i.e. one pre-loaded with the data to save downloading it all), then without the proper configuration the volume also won't be mounted.
You can also hit this problem if you shut down an instance that's working fine and start it as another instance type (say, switching between M4 and M5).

The problem is device names

The root volume's device name is deliberately kept the same, but the device names of the other attached volumes change depending on the instance type and the setup of the operating system. The variations are generally too broad for cloud-init to sensibly handle automatically.
To make matters worse, mixing advice from the cloud-init docs and the EC2 documentation on device naming makes it easy to get confused about what name to give the volumes when launching the instance, and to wrongly expect that name to control which device name the OS assigns them on boot.

From the Amazon docs:

With the following instances, EBS volumes are exposed as NVMe block devices: C5, C5d, i3.metal, M5, and M5d. The device names are /dev/nvme0n1, /dev/nvme1n1, and so on. The device names that you specify in a block device mapping are renamed using NVMe device names (/dev/nvme[0-26]n1).

From the cloud-init docs on mounts:

When specifying the fs_spec, if the device name starts with one of xvd, sd, hd, or vd, the leading /dev may be omitted.
I found the cloud-init doc to be a little misleading, so I'll show what worked for me. I always built the AMIs with the same instance type, but configured things so that cloud-init could detect where the volume ended up when booted as another instance type.
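To see where a volume has actually ended up on a given instance, it's easiest to ask the OS directly. A quick check on a Nitro-based instance (M5/C5 and similar), assuming the extra volume was attached as /dev/sdf in the block device mapping:

# The volume mapped as /dev/sdf shows up as an NVMe device
# (e.g. /dev/nvme1n1) on M5/C5 instance types.
lsblk -o NAME,SIZE,TYPE,MOUNTPOINT

# The stable identifiers (EBS volume id, and later the filesystem label)
# are exposed as symlinks here.
ls -l /dev/disk/by-id/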

Creating the EC2 instance and baking the AMI

This uses Packer to launch an EC2 instance from the CentOS 7 AMI, do some software installs, and then make the necessary changes for it to mount the volume correctly when launched with another instance type.

The Packer template

{
  "builders": [
    {
      "source_ami": "ami-0d13c26f",
      "type": "amazon-ebs",
      "instance_type": "m5.2xlarge",
      "ssh_username": "centos",
      "name": "amazon",
      "user_data_file": "init_user_data.yaml",
      "launch_block_device_mappings": [
        {
          "device_name": "/dev/sda1",
          "volume_size": "20",
          "volume_type": "gp2",
          "delete_on_termination": true
        },
        {
          "device_name": "/dev/sdf",
          "volume_size": "100",
          "volume_type": "gp2",
          "delete_on_termination": true
        }
      ]
    }
  ],
  "provisioners": [
    {
      "type": "puppet-masterless",
      "manifest_file": "{{user `manifest_file`}}",
      "module_paths": "{{user `module_paths`}}"
    },
    {
      "type": "shell",
      "script": "cleanup.sh"
    }
  ]
}
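Assuming the template is saved as template.json, with the two user variables declared in a variables block and the remaining builder fields Packer requires (such as ami_name) filled in, a build would be kicked off with something like:

# The paths here are placeholders for wherever your Puppet code lives.
packer build \
  -var 'manifest_file=puppet/manifests/site.pp' \
  -var 'module_paths=puppet/modules' \
  template.json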

User data: init_user_data.yaml

The important change here: to get the disk partitioning to work, you need to override the cloud_init_modules list that CentOS 7 defines, so that the disk_setup module actually runs.
#cloud-config

# centos doesn't include disk_setup by default
cloud_init_modules:
- migrator
- bootcmd
- write-files
- growpart
- disk_setup
- resizefs
- set_hostname
- update_hostname
- update_etc_hosts
- rsyslog
- users-groups
- ssh

disk_setup:
  /dev/nvme1n1:
    table_type: 'gpt'
    layout: True
    overwrite: False

fs_setup:
- label: data-mnt
  filesystem: 'ext4'
  device: '/dev/nvme1n1'
  partition: auto
  overwrite: False

mounts:
- [ /dev/nvme1n1, /mnt, "auto", "defaults,nofail", "0", "0" ]
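After the first boot with this user data, it's worth confirming that the partition and filesystem were actually created before baking the AMI. A quick check, using the same device name as above:

# Should show a partition under nvme1n1 with an ext4 filesystem
# labelled data-mnt.
sudo lsblk -f /dev/nvme1n1

# cloud-init logs each module it runs; disk_setup should appear here.
sudo grep -i disk_setup /var/log/cloud-init.log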
As part of the Puppet setup, a cloud-init config file is installed to be used on subsequent boots. This is mainly to stop cloud-init changing anything on the next boot, and it mounts the second drive by label instead of device name to increase portability.

/etc/cloud/cloud.cfg.d/01_mnt-fs.cfg

mounts:
 - ["LABEL=data-mnt", /mnt/data, "auto", "defaults,nofail", "0", "0"]
Once done, we have Packer run a final script so the image can start up on different instance types. If the drive mount remains in /etc/fstab and /etc/mtab, then you won't be able to mount the drive when it has another device name.

cleanup.sh

#!/bin/bash
sudo yum -y clean all
# umount the extra volume to stop conflicts when re-mounting
sudo umount -l /mnt
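With that in place, an instance launched from the baked AMI on a different instance type (an M4, say, where the volume appears under an xvd name rather than an nvme one) should still come up with the data volume mounted. A quick sanity check on the new instance:

# The label-based entry should be mounted regardless of what the kernel
# named the underlying device on this instance type.
findmnt /mnt/data
df -h /mnt/data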
