Upgrading Cisco Catalyst 9800 WLAN Controller in HA

My two 9800-40s run in HA.
They have been running 16.10.1e for some time now, and I wanted to upgrade to the newly suggested 16.11.1c, or go directly to 16.12.1 to test it out, while we are still in the summer holidays and not much is going on. A newer release will definitely be needed in a few weeks, when my first Wi-Fi 6 (or 802.11ax, depending on marketing flavour) APs arrive and go online.

The official Cisco documentation (found here) for upgrading an HA pair is a little lacking:

Yes, this is it. The whole section.
Not only is there not much detail, it also describes the "old IOS style" upgrade (now called "bundle mode"), which sets the .bin file in the boot variable. That way, the image does not get copied automatically to the standby box, there is no rollback, and, if I read correctly, you can't predownload the AP image.

The IOS XE "new style" upgrade (officially called "install mode") works just as the release notes state, and, even though they do not say so, it does everything on your standby box too.

I downloaded the new .bin file from the Cisco download section and copied it to the bootflash on my primary box.
Then you use the "install add file" command:

9800-40#install add file bootflash:C9800-40-universalk9_wlc.16.12.01.SPA.bin
install_add: START Thu Aug 22 11:30:41 UTC 2019
install_add: Adding PACKAGE

--- Starting initial file syncing ---
[1]: Copying bootflash:C9800-40-universalk9_wlc.16.12.01.SPA.bin from chassis 1 to chassis 2
[2]: Finished copying to chassis 2
Info: Finished copying bootflash:C9800-40-universalk9_wlc.16.12.01.SPA.bin to the selected chassis
Finished initial file syncing

--- Starting Add ---
Performing Add on all members
[1] Add package(s) on chassis 1
[1] Finished Add on chassis 1
[2] Add package(s) on chassis 2
[2] Finished Add on chassis 2
Checking status of Add on [1 2]
Add: Passed on [1 2]
Finished Add

Image added. Version: 16.12.1.0.544
SUCCESS: install_add Thu Aug 22 11:33:18 UTC 2019


As you can see, it got automatically copied to the standby, and checked.
"show install summary" displays the current state:

9800-40#sh install summary 
[ Chassis 1 2 ] Installed Package(s) Information:
State (St): I - Inactive, U - Activated & Uncommitted,
C - Activated & Committed, D - Deactivated & Uncommitted
--------------------------------------------------------------------------------
Type St Filename/Version
--------------------------------------------------------------------------------
IMG I 16.12.1.0.544
IMG C 16.10.1e.0.441

--------------------------------------------------------------------------------
Auto abort timer: inactive
--------------------------------------------------------------------------------


On both chassis, 16.10.1e is activated and committed, and 16.12.1 is inactive.
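
If you want to check this state from a script instead of by eye, the summary is easy to parse. The following is only a minimal sketch in Python, assuming you already have the "show install summary" text in a string (how you collect it is up to you). The function name parse_install_summary and the regular expression are my own, not anything Cisco provides, and I only tested the pattern against the IMG rows shown above.

```python
import re

# State codes exactly as printed in the legend of "show install summary".
STATE_NAMES = {
    "I": "Inactive",
    "U": "Activated & Uncommitted",
    "C": "Activated & Committed",
    "D": "Deactivated & Uncommitted",
}

# Assumed row layout: TYPE, one state letter, version (e.g. "IMG I 16.12.1.0.544").
SUMMARY_ROW = re.compile(r"^\s*([A-Z]+)\s+([IUCD])\s+(\S+)\s*$")


def parse_install_summary(text):
    """Return a list of (type, state_name, version) tuples found in the text."""
    rows = []
    for line in text.splitlines():
        match = SUMMARY_ROW.match(line)
        if match:
            pkg_type, state, version = match.groups()
            rows.append((pkg_type, STATE_NAMES[state], version))
    return rows


# Sample taken from the output above.
summary_sample = """\
State (St): I - Inactive, U - Activated & Uncommitted,
C - Activated & Committed, D - Deactivated & Uncommitted
Type St Filename/Version
IMG I 16.12.1.0.544
IMG C 16.10.1e.0.441
"""

for pkg_type, state, version in parse_install_summary(summary_sample):
    print(f"{pkg_type} {version}: {state}")
```

The legend and header lines are skipped because they do not fit the three-column pattern, so only the actual package rows are reported.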

If you are ready to start the upgrade, type "install activate" (you can do the add and activate in one command if you like: "install add file xxx activate"):

9800-40#install activate 
install_activate: START Fri Aug 23 04:17:34 UTC 2019

System configuration has been modified.
Press Yes(y) to save the configuration and proceed.
Press No(n) for proceeding without saving the configuration.
Press Quit(q) to exit, you may save configuration and re-enter the command. [y/n/q]y
Modified configuration has been saved
install_activate: Activating PACKAGE
Following packages shall be activated:
/bootflash/C9800-rpboot.16.12.01.SPA.pkg
/bootflash/C9800-mono-universalk9_wlc.16.12.01.SPA.pkg
/bootflash/C9800-hw-programmables.16.12.01.SPA.pkg

This operation requires a reload of the system. Do you want to proceed? [y/n]y
--- Starting Activate ---
Performing Activate on all members
[1] Activate package(s) on chassis 1
--- Starting list of software package changes ---
Old files list:
Removed C9800-mono-universalk9_wlc.16.10.01e.SPA.pkg
Removed C9800-rpboot.16.10.01e.SPA.pkg
New files list:
Added C9800-mono-universalk9_wlc.16.12.01.SPA.pkg
Added C9800-rpboot.16.12.01.SPA.pkg
Finished list of software package changes
[1] Finished Activate on chassis 1
[2] Activate package(s) on chassis 2
--- Starting list of software package changes ---
Old files list:
Removed C9800-mono-universalk9_wlc.16.10.01e.SPA.pkg
Removed C9800-rpboot.16.10.01e.SPA.pkg
New files list:
Added C9800-mono-universalk9_wlc.16.12.01.SPA.pkg
Added C9800-rpboot.16.12.01.SPA.pkg
Finished list of software package changes
[2] Finished Activate on chassis 2
Checking status of Activate on [1 2]
Activate: Passed on [1 2]
Finished Activate

Install will reload the system now!
SUCCESS: install_activate Fri Aug 23 04:22:23 UTC 2019


[Sidenote: If I read the documentation here correctly, if you type "n" above to not proceed, this would be the time for the AP image predownload, using "ap image predownload" and monitoring the status with "show ap image". I did not try a predownload today.]

Now the whole stack reloads. After both controllers come back up, you can check whether the new version is running:

9800-40#sh ver
Cisco IOS XE Software, Version 16.12.01
Cisco IOS Software [Gibraltar], C9800 Software (C9800_IOSXE-K9), Version 16.12.1, RELEASE SOFTWARE (fc4)


But it has not been committed yet. After another reload, or when the auto abort timer runs out (it seems to default to 360 minutes; I did not set it explicitly), it should roll back and come back online with the old version. This can come in handy in remote locations: if something goes wrong, wait for the rollback, or send remote hands to switch the box off and on. You can see the status here:

9800-40#sh install summary 
[ Chassis 1 2 ] Installed Package(s) Information:
State (St): I - Inactive, U - Activated & Uncommitted,
C - Activated & Committed, D - Deactivated & Uncommitted
--------------------------------------------------------------------------------
Type St Filename/Version
--------------------------------------------------------------------------------
IMG U 16.12.1.0.544

--------------------------------------------------------------------------------
Auto abort timer: active on install_activate, time before rollback - 05:53:27
--------------------------------------------------------------------------------
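
If you monitor the upgrade from a script, the remaining rollback time in the "Auto abort timer" line can be pulled out with a few lines of Python. This is just a sketch under the assumption that the line always looks like the two variants seen in this post ("inactive", or "time before rollback - HH:MM:SS"); the function name rollback_time_left is made up for this example.

```python
import re
from datetime import timedelta

# Assumed format, taken from the output above: "time before rollback - 05:53:27".
ABORT_TIMER = re.compile(r"time before rollback - (\d+):(\d{2}):(\d{2})")


def rollback_time_left(summary_text):
    """Return the remaining time as a timedelta, or None if no timer is running."""
    match = ABORT_TIMER.search(summary_text)
    if match is None:
        return None
    hours, minutes, seconds = (int(part) for part in match.groups())
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


timer_active = "Auto abort timer: active on install_activate, time before rollback - 05:53:27"
timer_inactive = "Auto abort timer: inactive"

print(rollback_time_left(timer_active))
print(rollback_time_left(timer_inactive))
```

A result of None simply means no timer line with a countdown was found, which is what you would expect before the activate and after the commit.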


As you can see, the state is "Activated & Uncommitted". Commit with "install commit" (as before, you could do it all in one command, "install add file xxx activate commit", if you are sure about it).

9800-40#install commit 
install_commit: START Fri Aug 23 04:58:01 UTC 2019
install_commit: Committing PACKAGE

--- Starting Commit ---
Performing Commit on all members
[1] Commit package(s) on chassis 1
[1] Finished Commit on chassis 1
[2] Commit package(s) on chassis 2
[2] Finished Commit on chassis 2
Checking status of Commit on [1 2]
Commit: Passed on [1 2]
Finished Commit

SUCCESS: install_commit Fri Aug 23 04:58:18 UTC 2019


Then the status is:

9800-40#sh install summary 
[ Chassis 1 2 ] Installed Package(s) Information:
State (St): I - Inactive, U - Activated & Uncommitted,
C - Activated & Committed, D - Deactivated & Uncommitted
--------------------------------------------------------------------------------
Type St Filename/Version
--------------------------------------------------------------------------------
IMG C 16.12.1.0.544

--------------------------------------------------------------------------------
Auto abort timer: inactive
--------------------------------------------------------------------------------


And our HA is healthy:

9800-40#sh chassis 
Chassis/Stack Mac Address : dddd.cccc.eeee - Local Mac Address
Mac persistency wait time: Indefinite
Local Redundancy Port Type: FIBRE
H/W Current
Chassis# Role Mac Address Priority Version State IP
-------------------------------------------------------------------------------------
*1 Active 7777.3333.aaaa 1 V02 Ready x.y.z.10
2 Standby dddd.cccc.eeee 2 V02 Ready x.y.z.11
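
The same kind of check can be scripted for the HA state. Below is a small Python sketch that reads the two member rows of "show chassis" and decides whether the pair looks fine. What counts as "healthy" here (exactly one Active, one Standby, both in state Ready) is my own assumption, as are the names parse_chassis and ha_is_healthy; the column order is taken from the output above.

```python
import re

# Assumed column order: [*]Chassis#, Role, MAC, Priority, H/W version, State, IP.
CHASSIS_ROW = re.compile(
    r"^\s*(\*?)(\d+)\s+(\S+)\s+(\S+)\s+(\d+)\s+(\S+)\s+(\S+)"
)


def parse_chassis(text):
    """Return one dict per chassis row found in the "show chassis" text."""
    members = []
    for line in text.splitlines():
        match = CHASSIS_ROW.match(line)
        if match:
            star, number, role, mac, priority, hw_version, state = match.groups()
            members.append({
                "chassis": int(number),
                "role": role,
                "state": state,
                "local": star == "*",  # "*" marks the chassis you are logged in to
            })
    return members


def ha_is_healthy(members):
    """Assumption: healthy means one Active, one Standby, both Ready."""
    roles = sorted(member["role"] for member in members)
    all_ready = all(member["state"] == "Ready" for member in members)
    return roles == ["Active", "Standby"] and all_ready


chassis_sample = """\
Chassis# Role Mac Address Priority Version State IP
*1 Active 7777.3333.aaaa 1 V02 Ready x.y.z.10
2 Standby dddd.cccc.eeee 2 V02 Ready x.y.z.11
"""

print(ha_is_healthy(parse_chassis(chassis_sample)))
```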


That's it!
Now, of course there is more to do. With code updates like this, especially when lots of new features are introduced, I do a diff of my config before and after the upgrade. That way you can see whether things were added, removed, or changed. Monitor for a while whether clients join the way they should, and whether all APs come back up.
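
For the config diff, any diff tool will do. If you want it scripted, here is a minimal Python sketch using the standard difflib module. It assumes you saved the running config as text before and after the upgrade; the two short strings below are placeholders I made up, not real configuration from my controllers.

```python
import difflib


def config_diff(before, after):
    """Return a unified diff of two configuration texts as a single string."""
    diff = difflib.unified_diff(
        before.splitlines(),
        after.splitlines(),
        fromfile="before-upgrade",
        tofile="after-upgrade",
        lineterm="",
    )
    return "\n".join(diff)


# Placeholder texts; in practice, read the two saved config files instead.
config_before = "hostname 9800-40\nplaceholder-setting old-value\n"
config_after = "hostname 9800-40\nplaceholder-setting new-value\n"

print(config_diff(config_before, config_after))
```

Lines starting with "-" only exist in the old config, lines starting with "+" only in the new one. An empty result means the two texts are identical.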

I had two minor issues. Two APs did not stay connected but kept joining and disjoining; I just replaced them and will look into that later. And the controllers booted at the Advantage license level, although I had them set to Essentials. So I need another reboot, because you can't just switch license levels.