Over the last few weeks we have been encountering some strange networking issues at our off-site data center. The problem was characterized by slow ping times, packet loss, and service timeouts. After checking all of the network devices on the outside-facing link, we turned to the internal infrastructure: our new Cisco 3750 series switches.
After reviewing the configuration using the ‘show running-config’ command, we checked each of our bonded networking interfaces with the ‘show interface’ command to see if we could uncover any errors, dropped packets, etc.
Next we decided to check the switch’s available resources. We used ‘show proc cpu’ to check the CPU usage. This is when we saw the following line:
‘CPU utilization for five seconds: 97%/0%; one minute: 97%; five minutes: 98%’
We immediately knew that this was what we were looking for. After contacting Cisco support, we learned that an obscure bug in this version of IOS causes this kind of behavior. If you are interested in learning more about this specific issue, refer to Cisco bug ID ‘CSCsd95669’.
The only known fixes at this point are to upgrade your version of IOS or to restart the switch. We decided to restart the switch and monitor the CPU at regular intervals. If the problem appears again, we will upgrade.
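For the regular CPU checks, the summary line from ‘show proc cpu’ is easy to parse. The following is only a rough sketch in Python, assuming the line has exactly the format quoted above; the function names and the 90% threshold are made up for illustration, and collecting the output from the switch (SSH, console, etc.) is left out.

```python
import re

# Matches the summary line printed by 'show proc cpu', as quoted in the post.
# The first five-second figure is total CPU; the one after the slash is the
# share spent at interrupt level.
CPU_LINE = re.compile(
    r"CPU utilization for five seconds: (\d+)%/(\d+)%; "
    r"one minute: (\d+)%; five minutes: (\d+)%"
)


def parse_cpu_line(text):
    """Return the utilization figures as a dict, or None if no line matches."""
    match = CPU_LINE.search(text)
    if match is None:
        return None
    five_sec, interrupt, one_min, five_min = (int(g) for g in match.groups())
    return {
        "five_sec": five_sec,
        "interrupt": interrupt,
        "one_min": one_min,
        "five_min": five_min,
    }


def is_overloaded(stats, threshold=90):
    """Hypothetical alert rule: both longer averages at or above the threshold."""
    return stats["one_min"] >= threshold and stats["five_min"] >= threshold


line = "CPU utilization for five seconds: 97%/0%; one minute: 97%; five minutes: 98%"
stats = parse_cpu_line(line)
print(stats, is_overloaded(stats))
```

Checking the one-minute and five-minute averages rather than the five-second figure avoids alerting on short spikes.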
CSCsd95669 Bug Details
Information contained within bug ID CSCsd95669 is only available to Cisco employees. It is our policy to make all externally-facing bugs available in Bug Toolkit so the system administrators have been automatically alerted to the problem
Great policy they have — can you shed any more light — I have the issue.
Adam,
Here is the information that I have on this issue:
CSCsd95669 Invalid hw forwarding entry causing traffic to be sw forwarded
Symptom:
Traffic is being forwarded in software even though all forwarding information
is correct. This may cause high cpu and packet loss at high traffic rates.
Several iterations of “show controller cpu | inc host” will show increased
retrieve packet counts while the issue is occurring. This is the cpu queue the
traffic is being sent to.
Not all traffic flows are affected.
Conditions:
Unknown how the system gets into this state
Workaround:
Reloading the system will resolve issue. Root cause has not been determined so
the system may return to this state.
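The symptom check in the bug note (running “show controller cpu | inc host” several times and watching the retrieve count grow) comes down to a rate calculation. Here is a minimal sketch in Python, assuming you read the counts off the switch by hand and note the time of each reading; the function name and the sample numbers are hypothetical, and the output of the command itself is not parsed.

```python
def counter_rate(samples):
    """Average increase per second between the first and last sample.

    samples: list of (time_in_seconds, counter_value) pairs in time order.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    (t_first, c_first), (t_last, c_last) = samples[0], samples[-1]
    elapsed = t_last - t_first
    if elapsed <= 0:
        raise ValueError("samples must span a positive time interval")
    return (c_last - c_first) / elapsed


# Hypothetical readings taken ten seconds apart.
readings = [(0.0, 1000), (10.0, 1040)]
print(counter_rate(readings))  # 4.0
```

The bug note gives no threshold for what counts as high, so the resulting rate is only meaningful when compared against the same measurement on a switch that is known to be healthy.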
Can you tell me the IOS version you’re using?
And how will I know if the numbers in “sh controllers cpu | incl host” are high?
We have a switch, (c3750TS), with IOS 12.2 (46) AdvIP services which shows some of the same probs you’re describing. Now I can upgrade to 12.2 (53), but I would like to know if this will solve anything.
Best regards
Pieter,
The bug was seen in IOS version ‘12.2(35)SE5’.
It is a bit unclear whether upgrading will help. It appears that they are not 100% sure what is causing this problem; at least that is what I understood from one of the Cisco support reps. Doing a reload (restart of the switch) fixed our issue and we have not had any problems since.
We saw CPU usage of 97% using:
‘show proc cpu’
After the reboot we are running at about 5% usage. What does ‘show proc cpu’ report on your device?
Hoi Shain,
Our CPU usage is a lot lower. We’re hitting somewhere between 5 and 10%.
The count in “show controller cpu | inc host” is increasing at a steady rate of 3 to 5 per second, but we see this on other switches as well.
Since the IOS version you’re referring to is much older than ours, I think this bug ID might not apply to our problem.
Another indication is that rebooting our switch didn’t help.
Thanks for your input though.