6 more weeks with the Cisco Catalyst 9800-40

In my last post I wrote about my first findings in trying to implement the C9800 in my wireless network.

As this was a "try&buy", and things were looking good - not perfect, but very reasonable and with great prospects - I went from "try" to "buy", and started migrating.

So I want to share some more insight about the process, about issues and about my setup. At the time I began writing this post, 16.10.1e was the latest released software. While I was writing, 16.11.1b was released, which includes some missing features (mDNS, CAPWAP over NAT/PAT,...) and fixes multiple bugs. As I am not yet running this software, this post is about 16.10, and if I discover new insights, I will make a new post about 16.11.

One of my two 9800-40s (in HA-SSO), mounted in its destination rack. Connected via multichassis EtherChannel (20G for now, with the option to go to 40G) and a 1G HA link. The two units are in different buildings, with 2 PSUs on different circuits (one with UPS, one without); one building has a diesel generator.



- I went full IPv6 in the wireless infrastructure. Gone are the days of RFC1918 space, weird routes and NAT. The controller itself does have an IPv4 address in addition to the IPv6 one, as the RADIUS servers are not (yet) on IPv6, and there are some APs outside of the network that do not have an IPv6 connection. But my management and the CAPWAP connection between AP and controller are IPv6 only. The APs do not even have an IPv4 address.

We have multiple distribution layers; each of them now has its own "ap management" VLAN. DHCPv6 gives out addresses to the APs, including "DHCPv6 option 52", which carries the controller's IPv6 address.
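
To make it a bit more concrete what that option carries, here is a minimal Python sketch of the wire format of DHCPv6 option 52 as defined in RFC 5417: a 2-byte option code, a 2-byte length, then one or more 16-byte controller addresses. The function names and the 2001:db8:: documentation addresses are assumptions for illustration only, not taken from my setup, and this is not how any particular DHCPv6 server is configured.

```python
import ipaddress
import struct

# Option code for the CAPWAP Access Controller option in DHCPv6 (RFC 5417).
OPTION_CAPWAP_AC_V6 = 52


def encode_capwap_ac_v6(controllers):
    """Build the raw option bytes for a list of controller IPv6 addresses.

    Hypothetical helper: 2-byte code, 2-byte length, then 16 bytes per address.
    """
    if not controllers:
        raise ValueError("at least one controller address is required")
    payload = b"".join(ipaddress.IPv6Address(a).packed for a in controllers)
    return struct.pack("!HH", OPTION_CAPWAP_AC_V6, len(payload)) + payload


def decode_capwap_ac_v6(data):
    """Parse raw option bytes back into a list of IPv6 address strings."""
    if len(data) < 4:
        raise ValueError("option too short")
    code, length = struct.unpack("!HH", data[:4])
    if code != OPTION_CAPWAP_AC_V6:
        raise ValueError("not a CAPWAP AC v6 option")
    if length == 0 or length % 16 != 0 or len(data) != 4 + length:
        raise ValueError("length must be a non-zero multiple of 16 bytes")
    return [
        str(ipaddress.IPv6Address(data[i:i + 16]))
        for i in range(4, 4 + length, 16)
    ]


if __name__ == "__main__":
    # Example addresses from the IPv6 documentation prefix (assumed values).
    raw = encode_capwap_ac_v6(["2001:db8::10", "2001:db8::11"])
    print(raw.hex())
    print(decode_capwap_ac_v6(raw))
```

Something like this can be handy for checking a packet capture when an AP gets an address but never finds the controller.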



Two caveats here - first: 2802 APs with a manufacturing date of March 2019 still ship with 8.2 code, which is not IPv6 capable on COS APs. So to install them, you would need to "prime" them first on IPv4. Once they start shipping with newer code, this is no longer necessary. They boot, get an IPv6 address and the controller address from DHCP, and migrate from 8.2 code to the 16.x code. If you want to migrate and you're already running newer code, this is not an issue. It's just that the AP is "out of the box" on code that does not do IPv6.

Second: The "lower cost" APs (I saw it with 1832i, 1542i and 1810w - the 2800 does not show this, and the 2700 does not run COS) have a red blinking LED. They work, they connect to the controller, turn on radios, accept clients, absolutely no issue - but the LED blinks red because of an "ethernet failure" when they don't have an IPv4 address. This is already under investigation at Cisco.

- My contact directly at Cisco - I said it before and I stand by it - is worth gold. As the documentation is not yet full-fledged, and there are of course issues that no one has run into yet, it is so valuable to check back with someone who immediately tries to figure out what went wrong - configuration, environment, or bug - has tons of suggestions and deep debug commands available, and can - if it is a defect - file it right away, so you can track it. So far there have been about 100 emails between me and my Cisco contact, and 5 phone calls/WebEx meetings. I now have multiple bug IDs "to my name", probably more to come; some issues are still under investigation, but no deal breakers so far.

- The mobility tunnel between AireOS (I have a WLC 3504 for migration) on 8.5.140-special and the 9800 was somewhat painful, because of multiple issues on the AireOS side. For example: If they are in the same IPv4 subnet, the 3504 sends some packets via the default gateway instead of directly. That said, there were no severe problems, and there were workarounds for everything. I now have a handful of APs on the 3504 (mostly outdoor APs that are harder to replace) and mobility works.

- Do not edit a WLAN in the GUI when you are connected via that WLAN. You will be disconnected - this is expected of course and happens in AireOS too - but you won't be able to reconnect, as the WLAN stays disabled. The changes to the config are done, but it is in shutdown state and needs a "no shut".

- NETCONF is nice and all, but one thing irks me: You need a Priv-15 user even if you only query read-only objects, for example when gathering data for statistics.

- Missing features: no SE-Connect yet, no mDNS proxy yet (comes with 16.11), no CAPWAP over NAT/PAT yet (comes with 16.11), no RLAN yet (comes with 16.11).

- The license level is not part of the config, so remember to set it on your HA box too.

- While migrating the APs, I had one AP that downloaded the software and connected to the controller, but did not really come "up". Going back to AireOS worked, back to the 9800 did not. After factory resetting the AP, everything went smoothly. This was only 1 AP. 2 other ones were really slow to download the software, but came up eventually and now work as expected.

- LED disable is only available in site-tags, which is really inconvenient. This will be on a per-AP basis with 16.11. LED-blink to locate is also a 16.11 feature.

- Entering a RADIUS secret in the GUI silently discards "%", while entering it on the CLI works. This was a real head-scratcher, as I could not see why one RADIUS server was not working. I have already reported this bug to Cisco.

- CSCvo04998 is really fun, as it consumes many more licenses than you have APs on the controller, by now even more than the box limit - over 2000 licenses on the 9800-40, which supports 2000 APs.
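
On the NETCONF point above: to show what a purely read-only query looks like, here is a minimal Python sketch that builds a NETCONF <get> RPC with a subtree filter, using only the standard library. It assumes the device exposes the standard ietf-interfaces YANG model (RFC 7223), which is an assumption for illustration and not something tested against the 9800 here. The function name build_get_rpc is hypothetical, and the sketch only constructs the XML; it does not open a session.

```python
import xml.etree.ElementTree as ET

# Base NETCONF namespace (RFC 6241) and the ietf-interfaces namespace (RFC 7223).
NETCONF_NS = "urn:ietf:params:xml:ns:netconf:base:1.0"
IF_NS = "urn:ietf:params:xml:ns:yang:ietf-interfaces"


def build_get_rpc(message_id="101"):
    """Return a <get> RPC asking only for interface state, i.e. read-only data.

    Hypothetical helper: nothing in this request changes configuration.
    """
    rpc = ET.Element(f"{{{NETCONF_NS}}}rpc", {"message-id": message_id})
    get = ET.SubElement(rpc, f"{{{NETCONF_NS}}}get")
    # A subtree filter limits the reply to the selected container.
    flt = ET.SubElement(get, f"{{{NETCONF_NS}}}filter", {"type": "subtree"})
    ET.SubElement(flt, f"{{{IF_NS}}}interfaces-state")
    return ET.tostring(rpc, encoding="unicode")


if __name__ == "__main__":
    print(build_get_rpc())
```

Even a request like this, which cannot modify anything, has to be sent by a Priv-15 user on 16.10.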



But all issues aside - the controller works, has thousands of users on it and performs as expected. Migrate at your own speed, test all the features you use, and see if you can get help from Cisco or your VAR during migration. The 9800 is not perfect yet, but given that it is a very new product, I am very impressed. And I already like it more than AireOS. As soon as the few old non-11ac APs are gone, the 3504 will be shut down and AireOS will be completely gone from my network.

If you have any questions, hit me up on Twitter.