Cisco

Upgrading Cisco 9800 Controller in HA

Upgrading Cisco 9800 Controller in HA

My two 9800-40’s run in HA. They have been running 16.10.1e for some time now, and I wanted to upgrade to the new suggested 16.11.1c or directly to 16.12.1 to test it out, as long as we are still in the summer holidays and there isn’t that much going on. A newer release is definitely needed in a few weeks, when my first Wi-Fi 6 (or 802.11ax, depending on marketing flavour) APs arrive to go online.

The official Cisco documentation (found here) for upgrading an HA pair is a little lacking:

Official HA upgrade documentation

Yes, this is it. The whole section. Not only is there not much detail, but it is in “old IOS style” upgrade (now called “bundle mode”), setting the .bin file in the boot variable. With this, it does not get copied automatically to the standby-box, there is no rollback, and, if I read correctly, you can’t AP image predownload that way.

NOTE: Documentation has been improved vastly since then. This shows how early I was with the C9800 - I was told one of the first in Europe and the first in Austria.

IOS XE “new style” (officially called “install mode”) works just as the release notes state, and, even if it does not say, it does everything on your standby-box too.

I downloaded the new .bin file from the Cisco download section and copied it to the bootflash on my primary box. Then you use the “install add file” command:

6 more weeks with the Cisco Catalyst 9800-40

6 more weeks with the Cisco Catalyst 9800-40

In my last post I wrote about my first findings in trying to implement the C9800 in my wireless network.

As this was a “try&buy”, and things were looking good - not perfect, but very reasonable and with great prospects - I went from “try” to “buy”, and started migrating.

So I want to share some more insight about the process, about issues and about my setup. As of the time I began writing this post, 16.10.1e was the lastest released software. During writing, 16.11.1b was released, that includes some missing features (mDNS, CAPWAP over NAT/PAT,…) and fixes multiple bugs. As I am not yet running this software, this post is about 16.10, and if I discover new insights, I will make a new post about 16.11.

NOTE: This was a very early look on the C9800, with software version 16.10. A lot has changed over 16.11, 16.12 an on to the 17 train.

One of my two 9800-40 (in HA-SSO), mounted into the destination Rack. Connection as Multichassis-Etherchannel (20G for now, option to 40G) and 1G HA link, in different buildings, 2 PSUs on different circuits (one with UPS, one without), one building includes diesel generator.

C9800-40 mounted in the rack

  • I went full IPv6 in the wireless infrastructure. Gone are the days of RFC1918 space, weird routes and NAT. The controller itself does have an IPv4 address in addition to the IPv6, as the RADIUS servers are (not yet) IPv6, and there are some APs outside of the network, that do not have an IPv6 connection. But my management, and the CAPWAP connection between AP and controller is IPv6 only. The APs do not even have an IPv4 address.
6 weeks with Ciscos newest wireless contorller, the Catalyst 9800

6 weeks with Ciscos newest wireless contorller, the Catalyst 9800

With a lot of fanfare, Cisco introduced their newest wireless controller platform, the Catalyst 9800, late last year. They are still commited to AireOS (the software the classic WLCs run - like the 5508/5520 or the giant 8540), but every time you talk to someone from Cisco it is clear that they want the new Catalyst 9800 - based on IOS-XE - to succeed.

The two Catalyst 9800-40 Controllers in my lab

Just to be clear, this is not a converged access controller as the 5760, that also ran on IOS-XE - this was my first thought, but the blog posts from Dave from wifireference.com or Phil from networkphil.com were assuring that this is not the case.

No matter what flavor you like - as VM as the 9800-CL or as hardware appliance as 9800-40 and 9800-80 - Cisco has got you covered.

As my WiSM2s are ageing, get no new software and are about to run out of licenses, I gave this a shot, and ordered a pair of 9800-40s, to run in HA. Sessions and Meet-the-Engineers at Cisco Live! in Barcelona were reassuring. Our Cisco partner arranged a speedy delivery - despite pretty long waiting times at the moment - and also got me a contact directly at Cisco. This has been proven to be very valuable - someone you can direct questions to that arise all the time, little issues you run into, or defects you find. There is already a good amount of documentation online, but it is still lacking in detail.

So, I had the C9800-40s now for about 6 weeks, and learned a LOT in this time:

Cisco outdoor AP not connecting to WLC

Cisco outdoor AP not connecting to WLC

This might be some old news for most, but I only have a few outdoor APs and touch one every few years. But since I just got a new Cisco Aironet 1542I and was surprised again, I’d thought I document this for me. This also applies not only to the 1542, but 1532, 1552 and probably the whole outdoor portfolio, on Cisco wireless controllers.

The simple issue is: the AP is out of the box in “bridge mode”, and the controller wont let it join just that way without authenticating it. To fix, you have two possibilities: authenticate it on the controller, or switch it from bridge mode to local mode:

Cisco access point firmware corruption issues and how to remediate

Cisco access point firmware corruption issues and how to remediate

Over the last years, some nasty bugs have accumulated with Cisco lightweight APs, involving flash corruption. In the first instances, APs have lost configuration (coming up with default name and role), later some refused to boot or were stuck in a boot loop. That occured at any given time the AP reboots - power outage, software upgrade,… The most common I have seen:

*Mar 1 00:00:30.483: %SYS-5-RELOAD: Reload requested by Init. Reload Reason:
Radio FW image download failed ....

There are a number of bugs that resulted in these behaviours, ranging from software version 8.0 to 8.5. Cisco Field Notice: FN - 70330 lists the corresponding bug IDs, software versions and APs (including 1700/2700/3700, which were hit the hardest it seems).

If you are running one of the affected versions, seeing this problems and want to upgrade - Cisco made a Python tool, that tries to fix the flash before you’re upgrading, saving you from stranded APs. You can read how to use it in their document Understanding Various AP-IOS Flash Corruption Issues.

If you already have non-functional APs, they are probably fixable, just not remote. Here’s how you do it:

Making sense of Cisco WLC access point firmware pre-download

Making sense of Cisco WLC access point firmware pre-download

Firmware pre-download on the Cisco WLC wireless LAN controllers is a beautiful thing.

“In the old days"™ the update went like this:

  1. Upload firmware to controller
  2. Reboot controller
  3. Accesspoints want to rejoin controller, see non matching firmware, start updating, 10 at a time … 1 hour later …
  4. Most of your APs updated, some are still updating…

In large wireless LANs, this was not really an ideal solution, as the Wi-Fi was down for a long time if you had many APs on the controller. So Cisco introduced the pre-download feature with version 6.0, many years ago. It is documented in the official config guides, for example here.

The issue I have with this, that it was not really clear to me, which primary/backup command does what, because you have primary and backup images on WLC and AP.

So here a little guide, with the example of a usual upgrade path: