Creating an ICMP traceroute in Python

I am trying to implement an ICMP-based traceroute in Python. I found a very helpful guide ( https://blogs.oracle.com/ksplice/entry/learning_by_doing_writing_your ) that let me create a UDP-based traceroute, so it should only need modification. However, I have looked around and am having trouble changing the send socket and making it work. Is anybody able to assist me?
#!/usr/bin/python
import socket

def main(dest_name):
    dest_addr = socket.gethostbyname(dest_name)
    port = 33434
    max_hops = 30
    icmp = socket.getprotobyname('icmp')
    udp = socket.getprotobyname('udp')
    ttl = 1
    while True:
        recv_socket = socket.socket(socket.AF_INET, socket.SOCK_RAW, icmp)
        send_socket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, udp)
        send_socket.setsockopt(socket.SOL_IP, socket.IP_TTL, ttl)
        recv_socket.bind(("", port))
        send_socket.sendto("", (dest_name, port))
        curr_addr = None
        curr_name = None
        try:
            _, curr_addr = recv_socket.recvfrom(512)
            curr_addr = curr_addr[0]
            try:
                curr_name = socket.gethostbyaddr(curr_addr)[0]
            except socket.error:
                curr_name = curr_addr
        except socket.error:
            pass
        finally:
            send_socket.close()
            recv_socket.close()
        if curr_addr is not None:
            curr_host = "%s (%s)" % (curr_name, curr_addr)
        else:
            curr_host = "*"
        print "%d\t%s" % (ttl, curr_host)
        ttl += 1
        if curr_addr == dest_addr or ttl > max_hops:
            break

if __name__ == "__main__":
    main('google.com')

Not sure why you chose scapy (nice module though it is), as this is certainly possible using only Python. To send an ICMP packet, you simply send it out over your recv_socket. To do that, you'll first need to create an ICMP packet.
However, it seems what you want is to send out a UDP packet over an ICMP socket. This won't work as you might think.
First, let me say apparently there exists a patch to the Linux kernel that'll allow SOCK_DGRAM with IPPROTO_ICMP: ICMP sockets (linux). I've not tested this.
Generally, though, this combination of socket flags won't work. This is because an ICMP socket expects an ICMP header. If you were to send an empty string over recv_socket similarly to send_socket, the kernel will drop the packet. Further, if you were to layer a UDP header over an ICMP header, the receiving system will only react to the received ICMP header, and treat the UDP header as nothing more than just data appended to ICMP. In fact, in its ICMP reply to you, the remote system will include your UDP header simply as data you sent to it in the first place.
The reason you're able to send an empty string over send_socket is because the kernel has created the UDP header for you, and what you send over that UDP socket is simply appended as data to the UDP header. It's not like that with an ICMP socket. As I wrote, you'll need to create the icmp header to send on this socket.
The actuality of what happens in a UDP "ping" is this: A UDP packet is sent over a UDP socket to a remote system, using a (hopefully) unopened port as the destination port. This elicits an ICMP response back from the remote system (type 3, code 3). At this point, you will need an ICMP-registered socket to handle the ICMP reply back, which requires root (or administrator on Windows) privileges.
To create an icmp echo header is quite easy:
import struct
icmp = struct.pack(">BBHHH", 8, 0, 0, 0, 0)
icmp = struct.pack(">BBHHH", 8, 0, checksum(icmp), 0, 0)
A couple things to note with this header. First, arguments four and five are the "Identifier" and "Sequence Number" fields. Typically the identifier is set to a process ID while the sequence number starts at 1 so you can track the sequence of sends to replies. In this example, I've set the numbers to zero just for illustration (although, it is perfectly normal to keep it this way if you don't care about them).
Second, see the checksum(icmp) function I allude to? I'm assuming you have access to code that does this. It is known as the 1's complement checksum, and is described in RFC 1071. The checksum is vital because, without it being correctly computed, the receiving system might not forward the packet to any of the ICMP socket handlers.
Third, if you're wondering why I initially created the header with a zero checksum, it's because the checksum is defined over the message with that field set to zero; the actual value is computed from that and then written back into the header.
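For readers who don't have a checksum() at hand, here is a minimal Python 3 sketch of the RFC 1071 algorithm together with the two-step header build described above. The helper name build_echo_request and its default arguments are my own choices for illustration, not something from the answer.

```python
import struct


def checksum(data):
    """Internet checksum (RFC 1071) over a bytes object."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    # Fold the carries back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(identifier=0, sequence=0, payload=b""):
    """Build an ICMP echo request (type 8, code 0) with a valid checksum."""
    # Step 1: header with the checksum field set to zero.
    header = struct.pack("!BBHHH", 8, 0, 0, identifier, sequence)
    # Step 2: compute the checksum over header + payload and write it back.
    csum = checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, csum, identifier, sequence) + payload


packet = build_echo_request()
```

A handy sanity check is that running checksum() over a finished packet gives 0, which is the same property a receiver relies on when validating it.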
Because the struct module deals with what's called "packed" binary data, I advise that you familiarize yourself with bit shifting and bitwise operators. You're going to need this as you advance further into raw sockets, particularly if and when you need to extract certain bits from bit fields that may or may not overlap. For example, the following is the IP header:
     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  0 |Version|  IHL  |Type of Service|          Total Length         |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  4 |         Identification        |Flags|      Fragment Offset    |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  8 |  Time to Live |    Protocol   |         Header Checksum       |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 12 |                       Source Address                          |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 16 |                    Destination Address                        |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 20 |                    Options                    |    Padding    |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Assuming you wanted to insert IP address 192.168.1.10 in the source field, first you must note its length, 4 bytes. To correctly insert this into struct (just this field, not rest of header, as an example), you must:
struct.pack(">I", 192 << 24| 168 << 16| 1 << 8| 10)
Doing this adds the proper integer for this field, 3232235786L.
(Some readers may point out that the same result could be had with struct.pack(">BBBB", 192, 168, 1, 10). Although this would work for this use case, it is incorrect in general. In this case, it works because an IP address is byte-oriented; however, fields that are not byte-oriented, such as the checksum field, will fail because the resulting value of a checksum is greater than 255, that is to say, it is expected to be a 16-bit value. So don't do this in general i.e. use the exact bits expected by a protocol field.)
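To make the comparison concrete, here is a small Python 3 check that the shifted form, the per-byte form, and the standard library's socket.inet_aton all yield the same four bytes for this address. The variable names are mine, purely for illustration.

```python
import socket
import struct

# One 32-bit field, assembled with shifts (the approach recommended above).
shifted = struct.pack(">I", 192 << 24 | 168 << 16 | 1 << 8 | 10)

# Four separate bytes; equivalent here only because the field is byte-oriented.
per_byte = struct.pack(">BBBB", 192, 168, 1, 10)

# The standard library's own dotted-quad to packed-bytes conversion.
from_text = socket.inet_aton("192.168.1.10")

# Reading the field back as a single integer.
as_int = struct.unpack(">I", shifted)[0]
```

In Python 3 the integer prints as 3232235786 with no trailing L, since the separate long type no longer exists.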
As another example for learning bit shifting and operations, take the VER field of the IP header. This is a nibble-sized field, i.e. 4 bits. In order to extract this, you would do the following as an example (assuming IHL is zero):
# Assume ip[] already has a full IP header.
# Slicing (ip[0:1]) yields a one-byte string in both Python 2 and Python 3.
ver = struct.unpack("!B", ip[0:1])[0]
# Shifting right by 4 produces the integer 4, which is required for sending IPv4
ver >> 4
In a nutshell, I've right-shifted a one-byte value down 4 bits. In this downshift, the new value compared to the old value now looks like this:
# OLD VALUE IN BINARY; value is 64
# 01000000
# NEW VALUE IN BINARY; value is 4
# 00000100
It's important to note that, without this shift, checking the raw binary value would result in 64, which isn't what is expected. For the inverse, in order to "pack" the value 4 into the upper 4 bits of one byte, do this:
# New value will be 64, or binary 01000000
x = 4 << 4
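In a real header the IHL nibble is not zero, so the first byte is usually 0x45 rather than 0x40. A short Python 3 sketch of pulling both nibbles apart and putting them back together, assuming a header without options (the sample byte is my own, not taken from a capture):

```python
first_byte = 0x45  # assumed sample: version 4, IHL 5 (no options)

version = first_byte >> 4        # upper nibble
ihl_words = first_byte & 0x0F    # lower nibble, counted in 32-bit words
header_len = ihl_words * 4       # header length in bytes

# Packing is the inverse: shift the version up and OR in the IHL.
rebuilt = (version << 4) | ihl_words
```

The right shift discards the IHL bits, which is why the version still comes out as 4 even though the raw byte is 69.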
Hope this points you in the right direction.

Old question, but adding another point for clarity as I recently had to do just this.
You can write a native Python version of ICMP traceroute using sockets. The two important issues to solve are:
1) creating the ICMP header with the correct checksum (as Eugene states above)
2) setting the TTL of the socket using sockopts
For the first issue, there's a great example in the pyping module that works perfectly; it's what I used when I wrote my traceroute utility.
For the second, it's as easy as:
current_socket = socket.socket(socket.AF_INET, socket.SOCK_RAW,
                               socket.getprotobyname("icmp"))
current_socket.setsockopt(socket.IPPROTO_IP, socket.IP_TTL, ttl)
where 'ttl' is the integer value of the TTL you're setting
Then it's just a matter of doing a send/recv and watching the type/code in the return packets, in a control structure that increments the TTL.
I modeled my header pack/unpack after what was already written in the pyping module; just look for header type = 11 with code 0 for the intermediate hops, and header type = 0 for when you've reached your destination.
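The type/code check can be sketched roughly as follows in Python 3. This is not the pyping code; classify_reply and its return labels are hypothetical names, and it assumes the bytes come from recvfrom() on a raw IPv4 ICMP socket, where the IP header precedes the ICMP header.

```python
import struct

ICMP_ECHO_REPLY = 0      # sent by the destination itself
ICMP_TIME_EXCEEDED = 11  # sent by a router when the TTL runs out


def classify_reply(packet):
    """Classify a raw IPv4 packet received on a raw ICMP socket."""
    ihl = (packet[0] & 0x0F) * 4  # IP header length in bytes
    icmp_type, icmp_code = struct.unpack("!BB", packet[ihl:ihl + 2])
    if icmp_type == ICMP_TIME_EXCEEDED and icmp_code == 0:
        return "hop"
    if icmp_type == ICMP_ECHO_REPLY:
        return "destination"
    return "other"
```

A traceroute loop would send one echo request per TTL, call something like this on each reply, and stop on "destination" or once the maximum hop count is reached.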
Scapy works fine, but it's slower, and is an additional external dependency. I was able to write a robust traceroute in an afternoon with AS-lookup (via RADb) and geolocation, using nothing but sockets.

Ended up writing my own using scapy as this wasn't possible.

Related

How to send unsigned characters over TCP in this communication protocol format in Python?

So, I'm working on a project with a proprietary communication protocol. I need to send data in a given format, along with Up, Down, Left and Right coordinates, to a particular IP address. Communication happens over TCP, and I need to write a TCP client in Python that sends data in the following format:
IP address/Handshake address: 192.166.166.166
This is the Data format.
Data type is unsigned char 0-255 (8-bit binary) and data length is 6-64 bit.
Data length of each frame is 10-64 bit.
So, If I want to move object at 1.0.0.1 to (23, 45, 67, 89), this is the given instruction:
Send 255 255 10 3 1 0 0 1 23 45 67 89 to specified IP address. I imagine specified IP address is 192.166.166.166. You can refer to data format to understand this data that I'm supposed to send. It's quite simple.
The question is, how am I supposed to send this series of unsigned chars over TCP in python?
I've tried following:
import socket
host = '192.166.166.166'
port = 80 # 80 Because TCP
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((host, port))
data = '255 255 10 3 1 0 0 1 23 45 67 89'
s.sendall(data)
result = s.recv(1024)
s.close()
print('Received', repr(result))
Obviously this is not working. I've not specified unsigned char and I'm just sending raw data and spaces.
This is what I get in return from server:
('Received', '\'HTTP/1.0 400 Bad Request\\r\\nServer: Mini-IoT-314\\r\\nDate: , 31 1969 23:59:59 GMT\\r\\nPragma: no-
cache\\r\\nCache-Control: no-cache\\r\\nContent-Type:
text/html\\r\\nConnection: close\\r\\n\\r\\n<HTML><HEAD><TITLE>400 Bad
Request</TITLE></HEAD>\\n<BODY BGCOLOR="#cc9999"><H4>400 Bad
Request</H4>\\nCan\\\'t parse request.\\n</BODY></HTML>\\n\'')
Now, I'm not sure what to do and how to send this data so server can process this data appropriately. I would really appreciate a help here.

Not really my area of expertise, just an idea:
You might need to send the data as a Structure.
Have a look at python's ctypes: https://docs.python.org/3/library/ctypes.html#structured-data-types
from ctypes import *
class payload(Structure):
    _fields_ = [("data_sign1", c_ubyte),
                ...]
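A lighter-weight alternative to a ctypes Structure, assuming (as the question states) that every value is a single unsigned byte, is to pack the numbers with the struct module. This is only a sketch in Python 3; build_frame is a made-up helper name, and the meaning of each position in the frame is whatever the device's protocol document says.

```python
import struct


def build_frame(values):
    """Pack a sequence of integers (0-255) as one unsigned byte each."""
    return struct.pack("%dB" % len(values), *values)


frame = build_frame([255, 255, 10, 3, 1, 0, 0, 1, 23, 45, 67, 89])
# sock.sendall(frame)  # 'sock' would be an already connected TCP socket
```

This sends 12 raw bytes, whereas the string '255 255 10 ...' sends the ASCII characters for the digits and spaces. struct.pack also raises struct.error if a value falls outside 0-255, which catches mistakes early.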

1) I tested your code in Python 3.5 with a slight modification: rather than sending a string, I sent bytes, converting the string with
bytes(data,'utf-8')
2) Do not use port 80; your HTTP server is probably running on that port.
Try some other port and it should work.

meaning of python socket address info output

I'm trying to understand the meaning of the Python socket address info output.
import socket
rawSocket = socket.socket(socket.PF_PACKET, socket.SOCK_RAW, socket.htons(0x0800))
pkt = rawSocket.recvfrom(2048)
print pkt[1]
('ens33', 2048, 1, 1, 'HE \xfd\x12h')
ens33 is the interface sending the data.
I guess that 2048 is the buffer size.
I have no idea what the first "1" is. Sometimes it's "0".
I noticed the second "1" relates to the interface (i.e. "772" for "lo")
'HE \xfd\x12h' : Reverting the converted hex values, we get '\x48\x45\x20\xfd\x12\x68', it gives the mac address of host machine in a VM bridged connection.
So, the main question is about #3: what does the 1 or 0 mean here?

In short, a 1 in the third field means it's a broadcast packet. A 0 would mean it was a packet addressed to the machine running the Python code. Details follow.
This is based on Python 3.6, but the answer should be similar for other Py2 or Py3 versions. The answer is split between the source for the socket module, the packet(7) man page, and the Linux source.
The Python library includes function makesockaddr(). For PF_PACKET sockets (same as AF_PACKET), the relevant portion gives you the following order of fields. The explanation from the man page follows each field.
ifname (e.g., ens33) (the interface name, as you noted)
sll_protocol (e.g., 2048) Physical-layer protocol
sll_pkttype (e.g., 1) Packet type
sll_hatype (e.g., 1) ARP hardware type
sll_addr (e.g., the MAC address, as you noted above) Physical-layer address
The Linux source gives the various values for packet type. In that list, 1 is PACKET_BROADCAST. 0 is PACKET_HOST, explained as "To us".
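To make the field order easier to read at a glance, here is a small Python 3 sketch that labels the tuple. The function name and dictionary keys are my own; the packet-type numbers are the PACKET_* values from the Linux headers.

```python
# Packet type values from the Linux headers (linux/if_packet.h).
PACKET_TYPES = {
    0: "PACKET_HOST",
    1: "PACKET_BROADCAST",
    2: "PACKET_MULTICAST",
    3: "PACKET_OTHERHOST",
    4: "PACKET_OUTGOING",
}


def describe_ll_address(addr):
    """Label the 5-tuple returned as the address by recvfrom() on a packet socket."""
    ifname, proto, pkttype, hatype, hwaddr = addr
    return {
        "interface": ifname,
        "protocol": "0x%04x" % proto,
        "packet_type": PACKET_TYPES.get(pkttype, "unknown (%d)" % pkttype),
        "hardware_type": hatype,
        "address": ":".join("%02x" % b for b in bytearray(hwaddr)),
    }


info = describe_ll_address(("ens33", 2048, 1, 1, b"HE \xfd\x12h"))
```

For the tuple in the question this shows 2048 as 0x0800, the protocol number passed to socket.htons() when the socket was created, rather than a buffer size.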

Python detect the length of the data with socket

I found this code to detect the length of encrypted data in the frame :
header = self.request.recv(5)
if header == '':
    #print 'client disconnected'
    running = False
    break
(content_type, version, length) = struct.unpack('>BHH', header)
data = self.request.recv(length)
Source:
https://github.com/EiNSTeiN-/poodle/blob/master/samples/poodle-sample-1.py
https://gist.github.com/takeshixx/10107280
https://gist.github.com/ixs/10116537
This code listens to the connection between a client and a server. When the client talks to the server, self.request.recv(5) reads the frame header, which contains the length. We then use that length to read the data.
If we print the exchange between the client and the server :
Client --> [proxy] -----> Server
length : 24 #why 24 ?
Client --> [proxy] -----> Server
length: 80 #length of the data
Client <-- [proxy] <----- Server
We can see that the client sends two packets to the server.
If I change
data = self.request.recv(length)
to
data = self.request.recv(4096)
Only one exchange is made.
Client --> [proxy] -----> Server
length: 109 #length of the data + the header
Client <-- [proxy] <----- Server
My question is why we only need to take a size of 5 to get the length and content_type information? Is there understandable documentation about this?
Why are there two requests: one with 24 and another with the length of our data?

why we only need to take a size of 5 to get the length and content_type information?
Because obviously that's the way the protocol was designed.
Binary streams only guarantee that when some bytes are put into one end of the stream, they arrive in the same order on the other end of the stream. For message transmission through binary streams the obvious problem is: where are the message boundaries? The classical solution to this problem is to add a prefix to messages, a so-called header. This header has a fixed size, known to both communication partners. That way, the recipient can safely read header, message, header, message (I guess you grasp the concept, it is an alternating fashion). As you see, the header does not contain message data -- it is just communication "overhead". This overhead should be kept small. The most efficient (space-wise) way to store such information is in binary form, using some kind of code that must, again, be known to both sides of the communication. Indeed, 5 bytes of information is quite a lot.
The '>BHH' format string indicates that this 5 byte header is built up like this:
unsigned char (1 Byte)
unsigned short (2 Bytes)
unsigned short (2 Bytes)
Plenty of room for storing information such as length and content type, don't you think? This header can encode 256 different content types, 65536 different versions, and a message length between 0 and 65535 bytes.
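Here is a minimal Python 3 sketch of the header-then-message pattern, using the same '>BHH' layout. recv_exact and read_message are names I made up; the loop is there because recv(n) may legitimately return fewer than n bytes, so a single recv(length) is not guaranteed to deliver the whole message.

```python
import struct

HEADER = struct.Struct(">BHH")  # content type, version, payload length


def recv_exact(sock, n):
    """Read exactly n bytes; a single recv() may return fewer."""
    chunks = []
    while n > 0:
        chunk = sock.recv(n)
        if not chunk:
            raise EOFError("peer closed the connection mid-message")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def read_message(sock):
    """Read one header-prefixed message: returns (type, version, payload)."""
    content_type, version, length = HEADER.unpack(recv_exact(sock, HEADER.size))
    return content_type, version, recv_exact(sock, length)
```

Calling read_message() in a loop yields one message per call regardless of how the bytes were split into TCP segments, which is why the 5-byte header is enough to find every boundary.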
Why are there two requests: one with 24 and another with the length of our data?
If your network forensics / traffic analysis does not correspond to what you have inferred from the code, one of the two analyses is wrong or incomplete. In this case, I guess that your traffic analysis is correct, but that you have not understood all the relevant code for this kind of communication. Note that I did not look at the source code you linked to.

Wrong TCP checksum calculation by Scapy

After asking this, I wanted to run a simple test. I captured some traffic using tcpdump, filtered out a TCP ACK packet in Wireshark, and exported the filtered packet to sample.pcap.
Now this is pretty much my code for TCP checksum recalculation:
from scapy.all import *
ack_pkt = sniff(offline="sample.pcap", count=1)[0]
print "Original:\t", ack_pkt[TCP].chksum
del ack_pkt[TCP].chksum
print "Deleted:\t", ack_pkt[TCP].chksum
ack_pkt[TCP]=ack_pkt[TCP].__class__(str(ack_pkt[TCP]))
print "Recalculated:\t", ack_pkt[TCP].chksum
The output I'm getting is:
WARNING: No route found for IPv6 destination :: (no default route?)
Original: 30805
Deleted: None
Recalculated: 55452
Is this checksum recalculation process correct, or is something else needed to recalculate the checksum? Since scapy has been widely used for a long time, I guess there is something wrong in my recalculation.
Updated with packet information: (Ethernet header is not shown for a better view.)
To view the packet in hex string:
from binascii import hexlify as hex2
ack_pkt = sniff(offline="sample.pcap", count=1)[0]
print ack_pkt.chksum, ack_pkt[TCP].chksum
print hex2(str(ack_pkt[IP]))
del ack_pkt.chksum
del ack_pkt[TCP].chksum
print ack_pkt.chksum, ack_pkt[TCP].chksum
print hex2(str(ack_pkt[IP]))
ack_pkt=ack_pkt.__class__(str(ack_pkt))
print ack_pkt.chksum, ack_pkt[TCP].chksum
print hex2(str(ack_pkt[IP]))
ack_pkt[TCP].chksum=0
print hex2(str(ack_pkt[IP]))
And the output I get is:
26317 30805
450000345bc840004006*66cd*0e8b864067297c3a0016a2b9f11ddc3fe61e9a8d801000f7*7855*00000101080a47e8a8af0b323857
None None
450000345bc840004006*66cd*0e8b864067297c3a0016a2b9f11ddc3fe61e9a8d801000f7*d89c*00000101080a47e8a8af0b323857
26317 55452
450000345bc840004006*66cd*0e8b864067297c3a0016a2b9f11ddc3fe61e9a8d801000f7*d89c*00000101080a47e8a8af0b323857
450000345bc840004006*66cd*0e8b864067297c3a0016a2b9f11ddc3fe61e9a8d801000f7*0000*00000101080a47e8a8af0b323857
(* is only for marking the checksum bytes.)
Isn't it strange? After deleting the checksums, ack_pkt.show() shows both checksum fields as None. But are they recalculated when the packet is converted to a hex string?
With ack_pkt[TCP].chksum=0, the checksum in the output is simply 0.
Note:
I've tried with ack_pkt[TCP].show2() and I'm getting the same value as I'm getting above.

1) Did you try to copy a complete packet, not just its TCP part:
ack_pkt=ack_pkt.__class__(str(ack_pkt))
2) Did you try the following approach, explicitly converting the packet to a string and recreating it using the string?
3) If none of the above works, please post the TCP ACK packet you are working with

I tried a manual checksum recalculation in both Python and C (with libpcap), taking the one's complement of the one's complement sum, and I get the same value as scapy does for the packet in question. So I guess the checksum calculation in the Linux kernel is modified somehow.
(It's better to say one guy is wrong rather than saying all other 3 were wrong. :P)
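For anyone who wants to repeat the manual calculation, here is a Python 3 sketch that recomputes the TCP checksum over the pseudo-header and segment, using the packet bytes from the hex dump in the question. The function names are mine, and the code assumes an IPv4 packet carrying TCP.

```python
import binascii
import struct


def internet_checksum(data):
    """One's complement of the one's complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def tcp_checksum(ip_packet):
    """Recompute the TCP checksum of a raw IPv4 packet given as bytes."""
    ihl = (ip_packet[0] & 0x0F) * 4
    total_len = struct.unpack("!H", ip_packet[2:4])[0]
    src, dst = ip_packet[12:16], ip_packet[16:20]
    segment = bytearray(ip_packet[ihl:total_len])
    segment[16:18] = b"\x00\x00"  # zero the checksum field before summing
    # Pseudo-header: source, destination, zero byte, protocol 6, TCP length.
    pseudo = src + dst + struct.pack("!BBH", 0, 6, len(segment))
    return internet_checksum(pseudo + bytes(segment))


packet = binascii.unhexlify(
    "450000345bc84000400666cd0e8b864067297c3a"
    "0016a2b9f11ddc3fe61e9a8d801000f778550000"
    "0101080a47e8a8af0b323857"
)
print(hex(tcp_checksum(packet)))  # 0xd89c, i.e. 55452
```

The result agrees with scapy's recalculated value rather than the captured 0x7855. One common cause of such a mismatch, which nothing in this thread confirms for this capture, is TCP checksum offloading: when the capture is taken on the sending host, the network card fills in the real checksum only after tcpdump has already seen the packet.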

Converting hex string to packet in Scapy

My aim is to sniff a packet from a pcap file, modify the last 4 bytes of the packet and send it. For now, I'm doing like this:
from scapy.all import *
from struct import pack
from struct import unpack
pkt = sniff(offline="my.pcap", count=1)
pkt_hex = str(pkt)
# getting output like '\x00\x04\x00 ... ... \x06j]'
last_4 = unpack('!I',pkt_hex[-4:])[0]
# getting last 4 bytes and converting it to integer
rest = pkt_hex[:-4]
# getting whole packet in string except last 4 bytes
new_pkt = rest + pack('>I',(last_4+1))
# created the whole packet again with last 4 bytes incremented by 1
# the new string is like '\x00\x04\x00 ... ... \x06j^'
Now my problem is that I'm unable to convert it back to a layered Scapy packet object, and hence unable to send it using sendp.
PS: I've to recalculate the checksum. But once I'll convert it to a packet object, I can recalculate the checksum following this.

You can rebuild the packet using the class of the original packet, however there are other errors in your program.
The official API documentation on the sniff function states that it returns a list:
sniff(prn=None, lfilter=None, count=0, store=1, offline=None, L2socket=None, timeout=None)
Sniffs packets from the network and return them in a packet list.
Therefore, rather than extracting the packet with pkt_hex = str(pkt), the correct form to extract the packet is pkt_hex = str(pkt[0]).
Once that is done, you are free to alter the packet, update its checksum (as suggested here) and rebuild it using the class of the original packet, as follows (note my comments):
from scapy.all import *
from struct import pack
from struct import unpack
pkts = sniff(offline="my.pcap", count=1)
pkt = pkts[0] # <--- NOTE: correctly extract the packet
del pkt.chksum # <--- NOTE: prepare for checksum recalculation
del pkt[TCP].chksum # <--- NOTE: prepare for TCP checksum recalculation (depends on the transport layer protocol in use)
pkt_hex = str(pkt)
# getting output like '\x00\x04\x00 ... ... \x06j' <--- NOTE: there is no trailing ']'
last_4 = unpack('!I',pkt_hex[-4:])[0]
# getting last 4 bytes and converting it to integer
rest = pkt_hex[:-4]
# getting whole packet in string except last 4 bytes
new_hex_pkt = rest + pack('>I',(last_4+1))
# created the whole packet again with last 4 bytes incremented by 1
# the new string is like '\x00\x04\x00 ... ... \x06k' <--- NOTE: 'j' was incremented to 'k' (rather than ']' to '^')
new_pkt = pkt.__class__(new_hex_pkt) # <--- NOTE: rebuild the packet and recalculate its checksum
sendp(new_pkt) # <--- NOTE: send the new packet
EDIT:
Note that this doesn't preserve the packet's timestamp, which would change to the current time. In order to retain the original timestamp, assign it to new_pkt.time:
new_pkt.time = pkt.time
However, as explained here, even after changing the packet's timestamp and sending it, the updated timestamp won't be reflected in the received packet on the other end since the timestamp is set in the receiving machine as the packet is received.
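The byte manipulation itself can be isolated from Scapy and tested on its own. Below is a Python 3 sketch; bump_last_u32 is a hypothetical helper name, and the masking step is my addition so that a value of 0xFFFFFFFF wraps to zero instead of making struct.pack raise an error.

```python
import struct


def bump_last_u32(raw):
    """Return raw with its last 4 bytes (big-endian integer) incremented by 1."""
    (value,) = struct.unpack("!I", raw[-4:])
    value = (value + 1) & 0xFFFFFFFF  # wrap around instead of overflowing
    return raw[:-4] + struct.pack("!I", value)


sample = b"\x00\x04\x00\x00\x00\x06j"  # made-up bytes ending in 'j'
bumped = bump_last_u32(sample)
```

With Scapy under Python 3 the raw bytes would come from bytes(pkt) rather than str(pkt), and the result can be passed to pkt.__class__(...) as in the answer.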
