The (hidden) power of ‘ping’


PING

Ping is a Unix utility that sends an ICMP ECHO_REQUEST from the client ( originating machine ) to the server ( destination machine ) and waits for the ICMP ECHO_REPLY. It takes its name from submarine sonar search: you send a short sound burst and listen for an echo, a ping, coming back. There is a myth that Ping is actually an acronym for the words ‘Packet INternet Groper’, but there is no evidence to support it.


For years, system admins, managers, developers, and the guy next door have been using this utility, primarily to check if the ‘target system is alive’. Ping is thus undercredited as a utility that has only one function: “Type in ping <target_machine>; if it responds, the machine is UP; if it doesn’t, either the machine is down OR the network is down.” NOT NECESSARILY TRUE!

What (other) information can a ‘ping’ give you?

  1. Whether the <target_machine> is alive [ there are cases where this doesn't work ]
  2. How long each packet exchange took
  3. Inter-packet gap
  4. Other ICMP messages that might otherwise get buried in the system software
  5. Exponential moving average
  6. How busy the target system is, including the network path from the host to the target

How you use this information to draw conclusions and build metrics is up to you. I’ll show some examples later in this article.

Let’s look at a sample ping output:

-bash-3.00$ ping www.google.com
PING www.l.google.com (64.233.169.104) 56(84) bytes of data.
64 bytes from yo-in-f104.google.com (64.233.169.104): icmp_seq=0 ttl=246 time=7.97 ms
64 bytes from yo-in-f104.google.com (64.233.169.104): icmp_seq=1 ttl=246 time=13.7 ms
64 bytes from yo-in-f104.google.com (64.233.169.104): icmp_seq=2 ttl=246 time=9.69 ms
64 bytes from yo-in-f104.google.com (64.233.169.104): icmp_seq=3 ttl=246 time=8.15 ms
 
--- www.l.google.com ping statistics ---
4 packets transmitted, 4 received, 0% packet loss
rtt min/avg/max/mdev = 7.971/9.884/13.716/2.313 ms

Pretty normal, eh? What’s so new in that?
I’d bet half the people reading this don’t know the significance of every number and abbreviation in the output. Knowing them in detail helps you understand what is actually happening.
It also lets you vary the parameters in order to detect certain problems.

1. When you ping www.google.com, the output shows www.l.google.com. In the DNS database, www.google.com is a CNAME ( an alias ) that points to www.l.google.com.
2. The second line ( the one starting with PING ) has the IP address of the host that was found. This IP address comes from the DNS query. ( I will be writing an article about how the DNS query actually works! Watch out for it )
3. That line also has two numbers ( 56 and 84 ) as bytes of data, while every reply shows another one ( 64 bytes ). Now what are these?

A ping is an ICMP packet: 20 bytes of IP header + 8 bytes of ICMP header + XX bytes of data ( or call it the payload ).

In the default case the payload is 56 bytes. Adding the 8-byte ICMP header gives the 64 bytes reported in each reply, and adding the 20-byte IP header gives 84 bytes in total. That is where the 56(84) and the 64 come from. Never thought of it, did you?
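The size arithmetic can be checked with a small Python sketch that assembles an echo request by hand. It is an illustration only: the filler payload, the identifier value and the helper names are assumptions made for this example (a real ping puts a timestamp in the payload), and nothing is sent on the network.

```python
import struct

IP_HEADER_LEN = 20    # IPv4 header without options
ICMP_HEADER_LEN = 8   # type, code, checksum, identifier, sequence number


def internet_checksum(data: bytes) -> int:
    """One's-complement sum over 16-bit words, as used by ICMP."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(ident: int, seq: int, payload_len: int = 56) -> bytes:
    """Build an ICMP echo request (type 8, code 0) with a filler payload."""
    payload = bytes(i & 0xFF for i in range(payload_len))
    header = struct.pack("!BBHHH", 8, 0, 0, ident, seq)
    checksum = internet_checksum(header + payload)
    header = struct.pack("!BBHHH", 8, 0, checksum, ident, seq)
    return header + payload


packet = build_echo_request(ident=0x1234, seq=0)
print(len(packet))                  # 64: ICMP header + default payload
print(IP_HEADER_LEN + len(packet))  # 84: with the IP header added
```

Passing payload_len=500 gives a 508-byte ICMP packet, which is what the -s500 switch would produce.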

image reference : www.caida.org


4. The next four lines show the successful replies from the IP address, along with its reverse DNS host name. Notice that the reverse DNS name ( yo-in-f104.google.com ) is not the name we pinged, and that another run can land on a different host ( yo-in-f99.google.com appears in a later output in this article ). That is the magic of DNS and load balancing, which I will be discussing in a separate article.

5. Each reply also carries 3 major things: (a) icmp_seq, (b) ttl and (c) time in milliseconds
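If you want to work with these three fields in a script, a regular expression is enough to pull them out of a reply line. The sketch below assumes the Linux output format shown in this article; other ping implementations print slightly different lines, and the function and field names are made up for this example.

```python
import re

# Matches reply lines in the format shown in this article.
REPLY_RE = re.compile(
    r"(?P<size>\d+) bytes from (?P<host>\S+) \((?P<ip>[\d.]+)\): "
    r"icmp_seq=(?P<seq>\d+) ttl=(?P<ttl>\d+) time=(?P<rtt>[\d.]+) ms"
)


def parse_reply(line):
    """Return the fields of one reply line as a dict, or None if it is not a reply."""
    match = REPLY_RE.search(line)
    if match is None:
        return None
    return {
        "size": int(match.group("size")),
        "host": match.group("host"),
        "ip": match.group("ip"),
        "icmp_seq": int(match.group("seq")),
        "ttl": int(match.group("ttl")),
        "rtt_ms": float(match.group("rtt")),
    }


sample = ("64 bytes from yo-in-f104.google.com (64.233.169.104): "
          "icmp_seq=0 ttl=246 time=7.97 ms")
print(parse_reply(sample))
```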

(a) icmp_seq is the sequence number of each packet transmitted by the client and echoed back by the server. If a sequence number is missing, or there is a gap in the sequence, that request or its reply was lost on the way. A common cause is the client sending more than the server ( or the network in between ) can take. ICMP has a message for exactly this situation, called source quench, although it has since been deprecated and modern hosts rarely send it. As ICMP is a lower level protocol it can only report errors, not correct them ( as TCP does by retransmitting ).

How to simulate source quench?

Image Courtesy: www.cartoonstock.com


You can use the switch -i0: this sets the interval to 0, which means that the client keeps sending packets continuously without pausing between them. This results in the server getting busy and dropping packets. ( On Linux, intervals this short normally require root. )
You can also use the switch -s500: this sets the payload size to 500 bytes ( 508 bytes with the ICMP header ), so the server has to accept and process more data from the client. ( Do not confuse this with The Ping of Death, which is a malformed ping larger than 65,535 bytes, the maximum size of an IP packet. )
You can use both at the same time for a stronger effect:

ping -i0 -s500 www.google.com

Let’s see an output:

64 bytes from host232.hostmonster.com (74.220.215.232): icmp_seq=24 ttl=50 time=114 ms
64 bytes from host232.hostmonster.com (74.220.215.232): icmp_seq=25 ttl=50 time=123 ms
64 bytes from host232.hostmonster.com (74.220.215.232): icmp_seq=27 ttl=50 time=112 ms

Here we see that icmp sequence number 26 is lost. This indicates that the server ( or something on the path ) is busy and cannot accept data at this rate. In a production environment, something like this can be used to compare past ‘good’ results against a problem scenario. If all of a sudden you see heavy packet loss on a normal ping, with 64-byte packets at the default 1-second interval, there could be something wrong with either of the devices OR the network pipe, OR the internet connection itself could be slow. Comparing these figures against a different server / different network / different subnet helps you determine the problem area, and in some cases you can actually pinpoint the culprit.
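Spotting a missing sequence number by eye gets tedious on long runs. As a rough sketch, the Python below scans captured ping output for gaps in icmp_seq; the function name is invented for this example, and it can only see gaps between the first and last reply that actually arrived.

```python
import re

SEQ_RE = re.compile(r"icmp_seq=(\d+)")


def missing_sequences(ping_output):
    """Return the icmp_seq numbers absent between the first and last reply seen."""
    seen = {int(n) for n in SEQ_RE.findall(ping_output)}
    if not seen:
        return []
    return [n for n in range(min(seen), max(seen) + 1) if n not in seen]


output = """\
64 bytes from host232.hostmonster.com (74.220.215.232): icmp_seq=24 ttl=50 time=114 ms
64 bytes from host232.hostmonster.com (74.220.215.232): icmp_seq=25 ttl=50 time=123 ms
64 bytes from host232.hostmonster.com (74.220.215.232): icmp_seq=27 ttl=50 time=112 ms
"""
print(missing_sequences(output))  # [26]
```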

(b) TTL

Any IP packet that gets sent out has a TTL field, which is normally set to a relatively high number (many Unix systems send ping packets with a TTL of 255; other systems default to 64 or 128). As the packet traverses the network, the TTL is decreased by one at each HOP ( or router ) it goes through; when the TTL drops to 0, the router discards the packet. The IP specification says that the TTL should be set to around 60 (the currently recommended default is 64), though it’s often 255 for ping packets. The main purpose of this is so that a packet doesn’t live forever on the network and eventually dies when it is deemed “lost.” But for ping purposes, it provides additional information. The ttl that ping prints belongs to the reply packet, so it can be used to determine approximately how many router hops the reply has gone through. If the TTL varies in successive pings, it could indicate that successive reply packets are going via different routes, which isn’t a great thing.

So a TTL of 245 for google.com means that, assuming the reply left the server with a TTL of 255, it took (255-245 = ) 10 hops to travel between the google.com server and us.
In a production environment, a lower TTL than usual in the ping output means that the packet took a higher number of hops to reach its destination. Time to check your network routing table?
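The hop arithmetic is easy to script, with one caveat: you have to guess the TTL the remote host started with. The sketch below assumes the starting value is the nearest of 64, 128 or 255 at or above the observed TTL. That is a common rule of thumb, not something ping tells you, and the function name is made up for this example.

```python
# Typical starting TTLs; this list is an assumption, not an exhaustive one.
COMMON_INITIAL_TTLS = (64, 128, 255)


def estimate_hops(observed_ttl, initial_ttl=None):
    """Estimate router hops from the TTL seen on a reply.

    If initial_ttl is not given, guess the smallest common starting
    value that is not below the observed TTL.
    """
    if initial_ttl is None:
        initial_ttl = min(t for t in COMMON_INITIAL_TTLS if t >= observed_ttl)
    return initial_ttl - observed_ttl


print(estimate_hops(245))  # 10, guessing a starting TTL of 255
print(estimate_hops(50))   # 14, guessing a starting TTL of 64
```

If you know what the remote system uses, pass initial_ttl explicitly instead of relying on the guess.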

(c) time

The time is how long, in milliseconds (ms), the packet took to reach the destination and come back. Thus it is also called RTT or the Round Trip Time. In a production environment a HIGH RTT suggests that there is congestion somewhere in the network.

6. After ping terminates ( normally with ctrl+c OR with the switch -c<#>, where <#> is the number of packets to send before stopping ) you will see the number of packets transmitted, the number received and the percentage lost. In this example output it’s a 100% success, but in a run like the source quench example above, where icmp_seq 26 went missing, the received count drops below the transmitted count and the loss percentage is non-zero.

7. min/avg/max/mdev are the minimum / average / maximum / standard deviation of the round trip times. To get meaningful numbers for analysis, send at least 50 to 100 packets.
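To see where the summary line comes from, here is a minimal Python sketch that recomputes it from the four round trip times in the first sample output. Treating mdev as the population standard deviation is an assumption on my part, but it lands close to what ping printed; the function name is invented for this example.

```python
import math


def rtt_summary(rtts):
    """Return (min, avg, max, mdev) for a list of round trip times in ms."""
    n = len(rtts)
    avg = sum(rtts) / n
    # Population standard deviation, written as sqrt(E[x^2] - E[x]^2).
    variance = max(0.0, sum(x * x for x in rtts) / n - avg * avg)
    return min(rtts), avg, max(rtts), math.sqrt(variance)


low, avg, high, mdev = rtt_summary([7.97, 13.7, 9.69, 8.15])
print("rtt min/avg/max/mdev = %.3f/%.3f/%.3f/%.3f ms" % (low, avg, high, mdev))
```

The result will not match 7.971/9.884/13.716/2.313 digit for digit, because the per-reply times in the output are already rounded.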

What else?

Let’s take a look at the end of another output:

64 bytes from yo-in-f99.google.com (64.233.169.99): icmp_seq=29 ttl=245 (truncated)
--- www.l.google.com ping statistics ---
30 packets transmitted, 30 received, 0% packet loss, time 540ms
rtt min/avg/max/mdev = 7.560/7.996/8.609/0.257 ms,  ipg/ewma 18.632/7.945 ms

Focus on the end of the last line: ipg/ewma. That is: inter-packet gap / exponential moving average.
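One way to read the ipg figure is as the average gap between the packets ping sent: the total run time divided by the number of gaps. This reading is an assumption, but it fits the output above, as the short Python sketch below shows (the function name is made up for this example).

```python
def mean_send_interval_ms(total_time_ms, transmitted):
    """Average gap between sends: total run time over the number of gaps."""
    if transmitted < 2:
        raise ValueError("need at least two packets to have a gap")
    return total_time_ms / (transmitted - 1)


# "30 packets transmitted ... time 540ms" from the statistics above
print(round(mean_send_interval_ms(540, 30), 2))  # 18.62
```

That is close to the 18.632 ms ping reported; the printed total time is rounded to whole milliseconds, which would explain the small difference.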

InterPacket Gap (measured in milliseconds)

The ipg that ping prints is the average time between the packets it sent during the run. It shows up in the resulting ping statistics when you specify the -i0 or the -f switch. Do not confuse it with the Ethernet inter-frame gap (IFG), which works at a much smaller scale: the IFG is an idle time period appended to the end of every frame by the Ethernet adapter. This idle time gives the network media a chance to stabilize, and other network components time to process the frame.
The minimum interframe gap is 96 bit times (the time it takes to transmit 96 bits of raw data on the medium), which is
9.6 μs for 10 Mbit/s Ethernet,
960 ns for 100 Mbit/s (fast) Ethernet,
96 ns for 1 Gbit/s (gigabit) Ethernet, and
9.6 ns for 10 Gbit/s (10 gigabit) Ethernet.
This is the minimum gap specified by the Ethernet protocol and required for collision-free transmission. There are ways to reduce it for faster UDP transmission, but doing so can cause heavy collisions if the other devices ( client and server ) aren’t able to handle the higher rate of transmission.

In the ideal case, on a 10 Mbps line with a 9.6 μs IFG, the gap costs 14.28% of the line’s capacity when sending minimum-size ( 64-byte ) frames.

Once you have the IFG and your Ethernet pipe speed, you can easily work out the network efficiency.
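As a rough sketch of that calculation in Python: the 14.28% figure matches the share of line time taken by the 96-bit gap when every frame is a minimum-size 64-byte frame with its 8-byte preamble. That interpretation is my assumption, as are the constant and function names.

```python
IFG_BITS = 96           # minimum inter-frame gap, in bit times
PREAMBLE_BITS = 8 * 8   # preamble + start-of-frame delimiter


def ifg_seconds(rate_bps):
    """Duration of the minimum inter-frame gap at a given line rate."""
    return IFG_BITS / rate_bps


def ifg_overhead(frame_bytes):
    """Fraction of line time spent idle in the gap for back-to-back frames."""
    frame_bits = frame_bytes * 8 + PREAMBLE_BITS
    return IFG_BITS / (frame_bits + IFG_BITS)


print(ifg_seconds(10_000_000))           # gap in seconds at 10 Mbit/s
print("%.2f%%" % (100 * ifg_overhead(64)))    # minimum-size frames
print("%.2f%%" % (100 * ifg_overhead(1518)))  # full-size frames
```

The overhead depends only on frame size, not on line speed, which is why large frames use the pipe far more efficiently than small ones.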

Exponential Weighted Moving Average (measured in milliseconds)

An estimated packet rate is used to identify abnormal activities and attacks. The Ethernet adapter estimates the arrival of the next packet based on the timing of the previous packets. If the next packet is expected to take long, it can go to sleep (saving power).
I’d like to say more about EWMA, but it is beyond the scope of this article. On a production system, a quick look at the rtt and the ewma will tell you if something is wrong: in the regular case, the average rtt ≈ ewma.

During operations, the effective idle time is measured using an exponential weighted moving average (EWMA), which considers recent packets to be exponentially more important than past ones. The Unix load average is calculated in the same way.
The calculated idle time is subtracted from the EWMA-measured one; the resulting number is called ‘avgidle’. A perfectly loaded link has an avgidle of zero: packets arrive exactly at the calculated interval.
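For the curious, an exponentially weighted moving average takes only a few lines of Python. This is a generic sketch, not ping’s own code: the weight of 1/8 is an assumed smoothing factor (a common choice for round trip time estimates), and the function name is made up.

```python
def ewma(samples, weight=1 / 8):
    """Exponentially weighted moving average of a sequence of samples.

    Each new sample pulls the average toward itself by `weight`,
    so older samples fade out exponentially.
    """
    average = None
    for sample in samples:
        if average is None:
            average = sample            # seed with the first sample
        else:
            average += weight * (sample - average)
    return average


print(ewma([10.0, 10.0, 18.0]))  # 11.0: one slow reply moves the average by 1/8 of the jump
```

On a steady link the ewma sits right on the average rtt; when recent replies slow down, the ewma climbs above the long-run average before the average itself moves much.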



9 Responses to “The (hidden) power of ‘ping’”

  1.  Prakash Says:

    This is so cool. 5 years as a sysadmin and I did not know all this. kudos to you man ++++ very detailed.

  2.  Roystien M Says:

    This is going into my bookmarks :) great one;

  3.  Dipak Says:

    dude… please upload the DNS article asap.. this is awesome,
    i was not knowing about the ping in this much detail. great work man.

  4.  Abdul Says:

    Very nice article. To the point and it covers everything. I have become your fan. I will show this to my professor. Never thought ping could give so much information. I tried it and yes, I could see so many things. I am an IT support in my graduate school and I have to check machine status every now and then. You saved my life.

  5.  Arup Says:

    Really a very good one Nish. !!!! Please let me know when you publish any new article.

  6.  Alaric Says:

    Thanks.

  7.  Bernd Eckenfels Says:

    I think the info about Ethernet IFG is a bit misleading here. In the case of ping, the ipg looks more like the calculated source quench delay. So it’s basically the time ping waited in between sending.

    and the ewma can be compared to avg for a long running ping to see some trends.

    BTW: do you know what the “pipe” classification is in recent iputils output?

    Bernd

  8.  admin Says:
  9.  Yonit Says:

    Hi – great explanation for ping – I’ve gone and printed it out to read again to fully understand.
    thank you !
