
This is a ping graph to a switch in our network... it's a Cisco Catalyst 2960X-48FPS-L.

 

This pattern keeps repeating. Only 12 ports are in use and traffic on the switch is very light. It repeats the pattern, then the peaks stop for about 9 minutes, then it starts again... this continues indefinitely, even at night when no one is on it.

 

What would cause a pattern like this? 

[Attached: Capture.PNG, ping graph showing the repeating latency pattern]

 

Stack bandwidth usage is at 0% while this is happening, the stack packet error rate is 0%, the CPU temperature is 41 °C, and port utilization is below 5% on all ports (almost all are at 0%).

 

The average jitter is 105 ms, and the average ping is 54.9 ms with a max of 550 ms.
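For reference, the jitter figure above can be approximated as the mean absolute difference between consecutive ping times (a common approximation; the exact formula PingPlotter uses may differ). A minimal sketch:

```python
def mean_jitter(rtts):
    """Mean absolute difference between consecutive round-trip times (ms)."""
    if len(rtts) < 2:
        return 0.0
    diffs = [abs(b - a) for a, b in zip(rtts, rtts[1:])]
    return sum(diffs) / len(diffs)

# Example with a spiky series similar to the one described in this thread:
samples = [1, 37, 7, 1, 500, 1, 1, 4, 1, 482]
print(round(mean_jitter(samples), 1))  # 173.0
```

The large spikes dominate the average, which is why the jitter (105 ms) can exceed the average ping (54.9 ms).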

 

Graphed with PingPlotter using the ICMP engine (Windows DLL).
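The same latency series can be collected without PingPlotter by parsing the output of the Windows ping command. A minimal sketch (the regex assumes the `time=Nms` / `time<1ms` reply format shown below; the sample lines here use a placeholder TTL):

```python
import re

# Matches both "time=37ms" and "time<1ms" in Windows ping reply lines.
TIME_RE = re.compile(r"time[<=](\d+)ms")

def parse_ping_times(lines):
    """Extract round-trip times (ms) from Windows ping reply lines.

    'time<1ms' is recorded as 0, since the true value is below 1 ms.
    Lines without a time field (e.g. timeouts) are skipped.
    """
    times = []
    for line in lines:
        m = TIME_RE.search(line)
        if m:
            times.append(0 if "time<" in line else int(m.group(1)))
    return times

sample = [
    "Reply from 10.1.3.20: bytes=32 time<1ms TTL=128",
    "Reply from 10.1.3.20: bytes=32 time=37ms TTL=128",
    "Request timed out.",
]
print(parse_ping_times(sample))  # [0, 37]
```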

 

Windows ping example:


Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=37ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=7ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=8ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=26ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=500ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=4ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=482ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=9ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=6ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=458ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=428ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=9ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=5ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=406ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=10ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=121ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=6ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=3ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=119ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=5ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=2ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=97ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=10ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=7ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=66ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=35ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=8ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=7ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=499ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=470ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=11ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=8ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=447ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=416ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=405ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=13ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=6ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=386ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=357ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=4ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=4ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=11ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=327ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=306ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=9ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=5ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=282ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=10ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=6ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=9ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=258ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=20ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=235ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=3ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=5ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=9ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=209ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=2ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=184ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=4ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=154ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=6ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=3ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=9ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=9ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=126ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=10ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=93ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=74ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=4ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=13ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=50ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=19ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=498ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=483ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=14ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=2ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time<1ms TTL=172.30.0.20
Reply from 10.1.3.20: bytes=32 time=4ms TTL=172.30.0.20


 

 

Every 5th ping has high latency, starting at around 500 ms and decreasing with each cycle of five pings until it reaches near zero, then it jumps back up to 500 ms and repeats.
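Assuming the spikes really do land on every fifth reply, a quick sanity check on a parsed latency series is to confirm all spikes fall on the same position modulo 5. A minimal sketch (the 50 ms threshold is an arbitrary cutoff, not from the thread):

```python
def spike_indices(rtts, threshold=50):
    """Return positions of replies whose RTT exceeds the threshold (ms)."""
    return [i for i, t in enumerate(rtts) if t > threshold]

def spikes_every_n(rtts, n=5, threshold=50):
    """True if all spikes fall on the same position modulo n."""
    idx = spike_indices(rtts, threshold)
    return len(idx) > 1 and len({i % n for i in idx}) == 1

# Synthetic series mimicking the described staircase: a spike every
# 5th ping, decreasing by ~25 ms per cycle from 500 ms toward zero.
series = []
for peak in range(500, 0, -25):
    series.extend([1, 3, 0, 2, peak])
print(spikes_every_n(series))  # True
```

On real data, a result of True would support the "every 5th ping" observation, and the spacing of the spike indices gives the period of whatever process is delaying the replies.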


 


What exactly are you pinging? The IP of the switch? If you ping through the switch, between devices connected to it, do you see the same pattern?

 

Are you logged into the switch while this is pinging? Does it do it when you're not logged in (no SSH/Telnet or web GUI session, if you have that enabled)? Do you have anything else monitoring the switch, like an SNMP query to it?

 

What IOS are you running on it? I show 15.2(5b)E as the current MD release, which you should probably be running. Are you running the IOS image with web-based management or not? etc.

 

Have you opened a TAC case with cisco?

 

  On 01/12/2016 at 15:49, BudMan said:

What exactly are you pinging? The IP of the switch? If you ping through the switch, between devices connected to it, do you see the same pattern?


Pinging the IP of the switch from a workstation plugged into it, and from another location in the building that goes through another switch both show the same symptoms.

Running a ping from inside the IOS CLI gives me the same results going out, the latency spikes and decrements in the same pattern.

 

  10 minutes ago, BudMan said:

Are you logged into the switch while this is pinging? Does it do it when you're not logged in (no SSH/Telnet or web GUI session, if you have that enabled)? Do you have anything else monitoring the switch, like an SNMP query to it?


I get the same pattern regardless of whether I'm logged into the switch via Telnet or the web GUI, and we do not have SNMP monitoring enabled on these switches.

 

  10 minutes ago, BudMan said:

 

 

What IOS are you running on it? I show 15.2(5b)E as the current MD release, which you should probably be running. Are you running the IOS image with web-based management or not? etc.


It appears the switches haven't been upgraded in a couple of years... it's at 15.0(2)EX4. We have web-based management but never actually use it.

  10 minutes ago, BudMan said:

 

Have you opened a TAC case with cisco?

 


Not yet. I was just curious what might cause a pattern like that before going down that road or doing a firmware upgrade. I've never seen a pattern like this before and wondered if anyone knew what would cause it.

I personally have never seen anything like that before.. Before you open a TAC case I would probably just update to the current IOS, since I can tell you from experience the first thing TAC is going to tell you to do is update ;) hehe

  On 01/12/2016 at 16:08, BudMan said:

I personally have never seen anything like that before.. Before you open a TAC case I would probably just update to the current IOS, since I can tell you from experience the first thing TAC is going to tell you to do is update ;) hehe


I'm shocked you've never seen something like that before ;) I thought you'd seen it all lol

do not ping the switch.

 

the switch sucks.

 

I made a call to Cisco TAC regarding high ping times and drops to the switch. The result was that this is normal: do not use the switch's management interface as a valid measure of connectivity with the ping command. Ping another device that is plugged into the switch instead. FWIW, I have the same series switch... well, this one: WS-C2960XR-48LPS-I.

 

Edit: I just did a ping test of 100 pings to my local gateway (the same switch mentioned above); here are my results:

 

    Packets: Sent = 100, Received = 100, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 0ms, Maximum = 17ms, Average = 3ms

 

They are a low-end enterprise switch. They blow, but it's what the business allows us to afford. The distance from my computer to it is about 20 feet.

 

Here is another one with a laptop connected to it through about a 3-foot patch cable:

 

    Packets: Sent = 100, Received = 100, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 0ms, Maximum = 806ms, Average = 33ms

 

I reiterate: do not use them as a valid measurement of anything other than whether you can get basic communication up at all. Device-to-device pings between hosts attached to the switch are all under 1 ms, even on different VLANs, even several switches down. The switches themselves do not give ICMP responses any priority.

  On 01/12/2016 at 16:37, sc302 said:

do not ping the switch. the switch sucks. [...] I reiterate, do not use them as a valid measurement of anything other than if you can or cannot get basic communications up.

I get the same pattern pinging any device plugged into the switch...

Do you have any sort of QoS or rate limiting set up?

 

I have seen a lot, sure, but never anything like this.. This seems like some sort of queue loading and unloading, etc. ;)

 

I am with sc302 though; that is why I asked whether the same thing happens when pinging through the switch. An actual device like a router or switch, whose main function is to route and switch, doesn't always put high priority on ping responses. Also, that's PingPlotter, right? Are you sure it's even sending an ICMP echo request, and not sending to a UDP port and waiting for the ICMP unreachable response?

 

Also

TTL=172.30.0.20

 

What kind of TTL is that?? Were you trying to manipulate the pings to hide your IPs and messed up the formatting? Didn't you say work wanted you to hide even your RFC 1918 addresses?

  On 01/12/2016 at 19:55, BudMan said:

do you have any sort of qos or ratelimiting setup? [...] What kind of ttl is that?? Where you trying to manipulate the pings to hide your IPs and messed up the formatting..

I don't know where the 172 came from; we have no 172 private ranges... the only thing I changed was the 10.10 addresses. I didn't notice it either until you mentioned it; I wonder if I did a replace by accident.

 

And the 10.1 isn't even our range; we use a 10.10.0.0 255.255.0.0 subnet ;) I just have to mask addresses when I can't replace them easily... in text it's a replace with something not in our range, in pictures it's blanking them out...

Cisco is suddenly interested in this switch too... it's updated to the latest firmware now and it's still acting odd. To quote one of their techs, something not normal is happening based on a dump they took; they can't figure out what causes that step-down pattern. It's reproducible even with nothing but two test systems plugged into it... and only one of these switches is doing it; we have 10 others not showing this.

  On 02/12/2016 at 18:16, neufuse said:

it's updated now to the latest firmware


So see, that was the first thing they had you do, right ;) hehehe

 

Yeah, never seen anything like that.. So please keep us updated on what Cisco figures out; hope it's not just "hey, here is a new switch".. Cuz that sort of fix doesn't answer the question of why it's happening ;)

Well, Cisco has spent almost a day and a half looking at our switch now. They can't explain it, so they cleared the running config and rebuilt it like new... and boom.... same pattern.... now we are on to trying some newer "debug" firmware to gather more data.

Wow.. interesting.. Please keep us updated; I am very curious what they find.. It would be a letdown if they just replace the hardware without any actual reason for the odd behavior.

 

Like when they cancel a TV show without any sort of closure to the storylines..

 

Well, do you actually need to use this switch?? They could always play with it in their lab after you have a working one.. But having it on your site does keep them honest, as long as you can live without a working switch in your environment.

 

Have you tried changing the ICMP payload size, either up or down? Does it do the same thing if you set the byte size to zero? Or if you increase it from the default 32 bytes?
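One way to run that payload-size sweep from a Windows host is to script `ping -l <size>` for a few sizes. A minimal sketch (the size list and count are arbitrary choices, not from the thread, and the target IP is the masked switch address used above):

```python
import subprocess

TARGET = "10.1.3.20"        # masked switch IP from the thread
SIZES = [0, 32, 512, 1024]  # payload sizes in bytes; 32 is the Windows default

def build_command(size, count=20):
    """Windows ping: -n sets the reply count, -l sets the payload size in bytes."""
    return ["ping", "-n", str(count), "-l", str(size), TARGET]

for size in SIZES:
    cmd = build_command(size)
    print("running:", " ".join(cmd))
    # Uncomment to actually run the sweep and watch for the staircase:
    # subprocess.run(cmd, check=False)
```

If the staircase pattern changes with payload size, that would point at per-packet processing in the control plane rather than a fixed-period background task.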

This topic is now closed to further replies.