Over the last few weeks we have been encountering some strange networking issues at our off-site data center. The symptoms were slow ping times, packet loss, and service timeouts. After checking all of the network devices on the outside-facing link, we turned to the internal infrastructure: our new Cisco 3750 series switches.
After reviewing the configuration using the ‘show running-config’ command, we checked each of our bonded network interfaces with the ‘show interface’ command to see if we could uncover any errors, dropped packets, etc.
Next we decided to check the switch's available resources. We used ‘show proc cpu’ to check the CPU usage. This is when we saw the following line:
‘CPU utilization for five seconds: 97%/0%; one minute: 97%; five minutes: 98%’
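For anyone who wants to track this figure over time, here is a minimal Python sketch that pulls the numbers out of that summary line. In the ‘97%/0%’ pair, the first value is total utilization over the last five seconds and the second is the share spent at interrupt level. The function name and dictionary keys are our own choices for illustration, and the pattern assumes the line format shown above, which may differ on other IOS versions.

```python
import re

# Matches the summary line printed at the top of 'show proc cpu' output.
# Assumption: the wording matches the line quoted in this post.
CPU_LINE = re.compile(
    r"CPU utilization for five seconds: (\d+)%/(\d+)%; "
    r"one minute: (\d+)%; five minutes: (\d+)%"
)

def parse_cpu_line(line):
    """Return the four percentages from the summary line, or None if absent."""
    match = CPU_LINE.search(line)
    if match is None:
        return None
    total_5s, interrupt_5s, one_min, five_min = map(int, match.groups())
    return {
        "five_sec_total": total_5s,          # total CPU over the last 5 seconds
        "five_sec_interrupt": interrupt_5s,  # portion spent at interrupt level
        "one_min": one_min,
        "five_min": five_min,
    }

sample = "CPU utilization for five seconds: 97%/0%; one minute: 97%; five minutes: 98%"
print(parse_cpu_line(sample))
```

How the line gets from the switch to the script (SSH, SNMP, a saved log) is left out on purpose, since that depends on your setup.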
We immediately knew that this was what we were looking for. After contacting Cisco support, we learned that an obscure bug in this version of IOS causes this kind of behavior. If you are interested in learning more about this specific issue, refer to Cisco bug ID ‘CSCsd95669’.
The only known fixes at this point are to upgrade your version of IOS or to restart your switch. We decided to restart the switch and monitor the CPU at regular intervals. If the problem appears again, we will upgrade.
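The regular-interval monitoring can be sketched roughly as follows. This is an illustration rather than the script we run: ‘watch_cpu’ and its parameters are hypothetical, and ‘read_cpu’ stands in for whatever you use to fetch the current CPU percentage from the switch. It only reports a problem after several consecutive high readings, so a single busy moment does not raise an alarm.

```python
import time

def watch_cpu(read_cpu, threshold=90, needed=3, interval=300,
              max_polls=None, sleep=time.sleep):
    """Poll read_cpu() until `needed` consecutive readings reach `threshold`.

    Returns True when that happens, or False if `max_polls` readings
    were taken without a long enough streak.
    """
    streak = 0
    polls = 0
    while max_polls is None or polls < max_polls:
        value = read_cpu()  # hypothetical callable returning an int percentage
        polls += 1
        # A reading below the threshold resets the streak.
        streak = streak + 1 if value >= threshold else 0
        if streak >= needed:
            return True
        sleep(interval)
    return False

# Dry run with canned readings and no real waiting.
readings = iter([40, 95, 97, 98])
alarm = watch_cpu(lambda: next(readings), max_polls=4, sleep=lambda _: None)
print(alarm)
```

The threshold, streak length, and five-minute interval are arbitrary starting points; tune them to what is normal for your switch.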