My MAAS deployment is essentially dead in the water.
I’ve been troubleshooting this for the last week or so and I’ve finally uncovered the problem: MAAS is somehow changing my BIOS boot order, and machines are failing deployment because of it.
According to @ltrager,
However, searching through Launchpad, I might have found that I’m not the only one.
- Bug #1894217 “2.8.2 deploy and commission fails corrupted bootor...” : Bugs : MAAS
- Bug #1789650 “Servers set to boot from disk after MAAS installat...” : Bugs : curtin
Here is my experience:
- I double-check my boot order.
- I commission the machine, and it gathers the correct storage info.
- I check the BIOS; the boot order remains unchanged.
- I deploy the machine and get “Failed deployment”, although the machine actually finishes deploying. The installation log shows:
BootCurrent: 0006
Timeout: 0 seconds
BootOrder: 0002
Boot0000* EFI Network 1
Boot0001* EFI Network 2
Boot0002* EFI Fixed Disk Boot Device 1
Boot0003* EFI Fixed Disk Boot Device 1
Boot0004* EFI Fixed Disk Boot Device 1
Boot0005* EFI Fixed Disk Boot Device 1
Running command ['udevadm', 'settle'] with allowed return codes [0] (capture=False)
TIMED subp(['udevadm', 'settle']): 0.008
Running command ['umount', '/tmp/tmpj7vthvop/target/sys/firmware/efi/efivars'] with allowed return codes [0] (capture=False)
Running command ['umount', '/tmp/tmpj7vthvop/target/sys'] with allowed return codes [0] (capture=False)
Running command ['umount', '/tmp/tmpj7vthvop/target/run'] with allowed return codes [0] (capture=False)
Running command ['umount', '/tmp/tmpj7vthvop/target/proc'] with allowed return codes [0] (capture=False)
Running command ['umount', '/tmp/tmpj7vthvop/target/dev'] with allowed return codes [0] (capture=False)
Running command ['mount', '--bind', '/dev', '/tmp/tmpj7vthvop/target/dev'] with allowed return codes [0] (capture=False)
Running command ['mount', '--bind', '/proc', '/tmp/tmpj7vthvop/target/proc'] with allowed return codes [0] (capture=False)
Running command ['mount', '--bind', '/run', '/tmp/tmpj7vthvop/target/run'] with allowed return codes [0] (capture=False)
Running command ['mount', '--bind', '/sys', '/tmp/tmpj7vthvop/target/sys'] with allowed return codes [0] (capture=False)
Running command ['mount', '--bind', '/sys/firmware/efi/efivars', '/tmp/tmpj7vthvop/target/sys/firmware/efi/efivars'] with allowed return codes [0] (capture=False)
Running command ['unshare', '--fork', '--pid', '--', 'chroot', '/tmp/tmpj7vthvop/target', 'efibootmgr', '-v'] with allowed return codes [0] (capture=True)
Running command ['udevadm', 'settle'] with allowed return codes [0] (capture=False)
TIMED subp(['udevadm', 'settle']): 0.010
Running command ['umount', '/tmp/tmpj7vthvop/target/sys/firmware/efi/efivars'] with allowed return codes [0] (capture=False)
Running command ['umount', '/tmp/tmpj7vthvop/target/sys'] with allowed return codes [0] (capture=False)
Running command ['umount', '/tmp/tmpj7vthvop/target/run'] with allowed return codes [0] (capture=False)
Running command ['umount', '/tmp/tmpj7vthvop/target/proc'] with allowed return codes [0] (capture=False)
Running command ['umount', '/tmp/tmpj7vthvop/target/dev'] with allowed return codes [0] (capture=False)
Setting currently booted 0006 as the first UEFI loader.
New UEFI boot order: 0006,0002
Running command ['mount', '--bind', '/dev', '/tmp/tmpj7vthvop/target/dev'] with allowed return codes [0] (capture=False)
Running command ['mount', '--bind', '/proc', '/tmp/tmpj7vthvop/target/proc'] with allowed return codes [0] (capture=False)
Running command ['mount', '--bind', '/run', '/tmp/tmpj7vthvop/target/run'] with allowed return codes [0] (capture=False)
Running command ['mount', '--bind', '/sys', '/tmp/tmpj7vthvop/target/sys'] with allowed return codes [0] (capture=False)
Running command ['mount', '--bind', '/sys/firmware/efi/efivars', '/tmp/tmpj7vthvop/target/sys/firmware/efi/efivars'] with allowed return codes [0] (capture=False)
Running command ['unshare', '--fork', '--pid', '--', 'chroot', '/tmp/tmpj7vthvop/target', 'efibootmgr', '-o', '0006,0002'] with allowed return codes [0] (capture=False)
Invalid BootOrder order entry value0006
^
efibootmgr: entry 0006 does not exist
Running command ['udevadm', 'settle'] with allowed return codes [0] (capture=False)
TIMED subp(['udevadm', 'settle']): 0.010
Running command ['umount', '/tmp/tmpj7vthvop/target/sys/firmware/efi/efivars'] with allowed return codes [0] (capture=False)
Running command ['umount', '/tmp/tmpj7vthvop/target/sys'] with allowed return codes [0] (capture=False)
Running command ['umount', '/tmp/tmpj7vthvop/target/run'] with allowed return codes [0] (capture=False)
Running command ['umount', '/tmp/tmpj7vthvop/target/proc'] with allowed return codes [0] (capture=False)
Running command ['umount', '/tmp/tmpj7vthvop/target/dev'] with allowed return codes [0] (capture=False)
finish: cmd-install/stage-curthooks/builtin/cmd-curthooks/install-grub: FAIL: installing grub to target devices
finish: cmd-install/stage-curthooks/builtin/cmd-curthooks/configuring-bootloader: FAIL: configuring target system bootloader
finish: cmd-install/stage-curthooks/builtin/cmd-curthooks: FAIL: curtin command curthooks
Traceback (most recent call last):
  File "/curtin/curtin/commands/main.py", line 202, in main
    ret = args.func(args)
  File "/curtin/curtin/commands/curthooks.py", line 1770, in curthooks
    builtin_curthooks(cfg, target, state)
  File "/curtin/curtin/commands/curthooks.py", line 1736, in builtin_curthooks
    setup_grub(cfg, target, osfamily=osfamily)
  File "/curtin/curtin/commands/curthooks.py", line 701, in setup_grub
    uefi_reorder_loaders(grubcfg, target)
  File "/curtin/curtin/commands/curthooks.py", line 462, in uefi_reorder_loaders
    in_chroot.subp(['efibootmgr', '-o', new_boot_order])
  File "/curtin/curtin/util.py", line 708, in subp
    return subp(*args, **kwargs)
  File "/curtin/curtin/util.py", line 275, in subp
    return _subp(*args, **kwargs)
  File "/curtin/curtin/util.py", line 141, in _subp
    cmd=args)
curtin.util.ProcessExecutionError: Unexpected error while running command.
Command: ['unshare', '--fork', '--pid', '--', 'chroot', '/tmp/tmpj7vthvop/target', 'efibootmgr', '-o', '0006,0002']
Exit code: 8
Reason: -
Stdout: ''
Stderr: ''
Unexpected error while running command.
Command: ['unshare', '--fork', '--pid', '--', 'chroot', '/tmp/tmpj7vthvop/target', 'efibootmgr', '-o', '0006,0002']
Exit code: 8
Reason: -
Stdout: ''
Stderr: ''
- I check the BIOS again, and the boot order is now different.
- Finally, since the machine actually deploys even though it reports “Failed deployment”, I can check the boot entries from the deployed system:
root@4-R420:~# efibootmgr
BootCurrent: 0008
Timeout: 0 seconds
BootOrder: 0008,0002,0006,0007
Boot0000* EFI Network 1
Boot0001* EFI Network 2
Boot0002* EFI Fixed Disk Boot Device 1
Boot0003* EFI Fixed Disk Boot Device 1
Boot0004* EFI Fixed Disk Boot Device 1
Boot0005* EFI Fixed Disk Boot Device 1
Boot0006* EFI Network 1
Boot0007* EFI Network 2
Boot0008* ubuntu
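To make the mismatch in the installer log concrete: `BootCurrent` is 0006, but the entries listed at that point only go from Boot0000 to Boot0005, so `efibootmgr -o 0006,0002` is rejected. Below is a minimal Python sketch of that check. It is not curtin's actual code; `parse_efibootmgr` and `missing_entries` are hypothetical helper names I made up, and the sample text is copied from the log in this post.

```python
import re

# Sample copied from the installer log in this post (efibootmgr output
# as seen during deployment, before the reorder attempt).
SAMPLE = """\
BootCurrent: 0006
Timeout: 0 seconds
BootOrder: 0002
Boot0000* EFI Network 1
Boot0001* EFI Network 2
Boot0002* EFI Fixed Disk Boot Device 1
Boot0003* EFI Fixed Disk Boot Device 1
Boot0004* EFI Fixed Disk Boot Device 1
Boot0005* EFI Fixed Disk Boot Device 1
"""

# Matches lines such as "Boot0002* EFI Fixed Disk Boot Device 1".
ENTRY_RE = re.compile(r"^Boot([0-9A-Fa-f]{4})(\*?)\s+(.*)$")


def parse_efibootmgr(text):
    """Hypothetical helper: parse efibootmgr output into a plain dict."""
    state = {"current": None, "order": [], "entries": {}}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("BootCurrent:"):
            state["current"] = line.split(":", 1)[1].strip()
        elif line.startswith("BootOrder:"):
            value = line.split(":", 1)[1].strip()
            state["order"] = [v for v in value.split(",") if v]
        else:
            match = ENTRY_RE.match(line)
            if match:
                state["entries"][match.group(1)] = match.group(3)
    return state


def missing_entries(state, proposed_order):
    """Hypothetical helper: entry numbers in the order with no BootNNNN line."""
    return [num for num in proposed_order if num not in state["entries"]]


state = parse_efibootmgr(SAMPLE)
# Same idea as the log line "Setting currently booted 0006 as the first UEFI loader".
proposed = [state["current"]] + [n for n in state["order"] if n != state["current"]]
print("proposed order:", ",".join(proposed))
print("missing entries:", missing_entries(state, proposed))
```

With the sample above, the proposed order is 0006,0002 and 0006 is reported as missing, which matches the "entry 0006 does not exist" error in the log.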
One of the bug reports says that a fix has been released for curtin:
I’m running MAAS 2.8.2 from the snap:
root@controller:~# snap list
Name Version Rev Tracking Publisher Notes
core18 20200724 1885 latest/stable canonical✓ base
maas 2.8.2-8577-g.a3e674063 8980 2.8/stable canonical✓ -
maas-cli 0.6.5 13 latest/stable canonical✓ -
snapd 2.46.1 9279 latest/stable canonical✓ snapd
With all that said, my final question is: when will this fix be integrated into the next MAAS snap?