how to dump http traffic?

I am working with webservices, and I need to get a dump of all the HTTP requests and responses, so that I can debug the interoperability between the devices.
I have a small PC with 3 NICs that are bridged, so that it acts as a hub and I can tap the traffic. I am looking for a way to easily dump the HTTP traffic, so that I can analyze the SOAP messages exchanged by the two devices.
Since I would prefer to implement this in Python, I tried scapy with the HTTP extension, but it does not seem to work: I see each request parsed three times (I wonder if this is due to the use of a bridge) and I am not able to see the responses.
Is there any other way to implement such a tool? I prefer python, but it is not mandatory.
Another small question
I add a subquestion: using the HTTP interpreter that I linked in the previous question, I sometimes get packets that are only recognized as HTTP and not as HTTPRequest or HTTPResponse. Such packets look gzipped, and I think this happens because a response body does not fit in a single packet. Is there a way with scapy to have these packets merged together? I need a way to get the body of the messages. Again, not only in Python, and not only with scapy.

I finally solved my problem by running tshark in a pipe and parsing its output with a Python script. Most of the decoding is done by the following command:
tshark -l -f "tcp port 80" -R "http.request or http.response" -i br0 -V
which outputs the decoded HTTP packets; my script performs the remaining operations. (On tshark 1.10 and later, the single-pass display filter option is -Y rather than -R.)
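The script itself is not shown, so the following is only a minimal sketch of how the tshark pipe could be consumed from Python. It assumes that each packet in the -V output begins with a line starting with "Frame "; the names split_frames and run_tshark are hypothetical.

```python
import subprocess


def split_frames(lines):
    """Group tshark -V output lines into one list of lines per packet.

    Assumption: every decoded packet starts with a line beginning with
    "Frame " (for example "Frame 1: 74 bytes on wire ...").
    """
    frame = []
    for line in lines:
        if line.startswith("Frame ") and frame:
            yield frame
            frame = []
        if line.strip():  # drop the blank separator lines
            frame.append(line.rstrip("\n"))
    if frame:
        yield frame


def run_tshark(interface="br0"):
    """Start the tshark command from the answer and yield decoded packets.

    Needs tshark installed and capture privileges. Newer tshark versions
    expect -Y instead of -R for the display filter.
    """
    cmd = ["tshark", "-l", "-f", "tcp port 80",
           "-R", "http.request or http.response",
           "-i", interface, "-V"]
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True)
    return split_frames(proc.stdout)
```

Each item yielded by run_tshark() is the list of text lines for one packet, which can then be searched for the HTTP headers or the SOAP body. Because the current packet is only emitted when the next "Frame " line arrives, the last packet seen is held back until more traffic shows up.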

For the raw sniffing I'd go with tcpdump writing to a pcap file.
tcpdump -i <interface> -s 65535 -w file.pcap port 80
The -s 65535 sets the snapshot length so that whole packets are captured rather than truncated, and -w writes the raw packets to a file. I'm assuming your HTTP goes over port 80, but you can make an arbitrarily complex filter expression. Make sure the interface is the one that leads to the server, so that you see what the server is sending and receiving rather than how traffic gets to your bridge host.
You can then parse the pcap with scapy at your leisure, knowing that the capture is happening in a well tested, fast, and reliable manner.
from scapy.all import rdpcap
packets = rdpcap("/tmp/file.pcap")
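Once the capture is in a pcap file, the two problems from the question (requests seen several times, bodies split over several packets) can be handled offline. The following is a rough sketch, not a full TCP reassembler: it joins the payloads of one direction of one connection by sequence number, dropping duplicated data. The function names are hypothetical, and sequence number wraparound is ignored.

```python
def reassemble(segments):
    """Join TCP payloads for ONE direction of ONE connection.

    segments: iterable of (seq, payload_bytes) tuples.
    Duplicates and overlaps (e.g. the same packet captured several
    times on a bridge) are dropped; reassembly stops at the first gap.
    Assumption: the sequence number does not wrap around.
    """
    stream = bytearray()
    next_seq = None
    for seq, payload in sorted(segments, key=lambda s: s[0]):
        if not payload:
            continue
        if next_seq is None:
            next_seq = seq
        if seq + len(payload) <= next_seq:
            continue  # data already seen
        if seq > next_seq:
            break  # a segment is missing from the capture
        stream += payload[next_seq - seq:]
        next_seq = seq + len(payload)
    return bytes(stream)


def segments_from_pcap(path, src_ip, sport):
    """Collect (seq, payload) tuples sent by src_ip:sport (requires scapy).

    Caveat: Ethernet padding on short frames may appear as payload.
    """
    from scapy.all import rdpcap, IP, TCP
    return [(p[TCP].seq, bytes(p[TCP].payload))
            for p in rdpcap(path)
            if IP in p and TCP in p
            and p[IP].src == src_ip and p[TCP].sport == sport]
```

For example, reassemble(segments_from_pcap("/tmp/file.pcap", server_ip, 80)) would return the raw bytes the server sent on that connection (server_ip being whatever address the device uses), from which the HTTP headers and a possibly gzipped body can be split.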

There are some respectable traffic sniffers around already, so you probably have no need to implement one of your own. Wireshark is among the most popular. Not only does it let you capture traffic, it also has some great tools for filtering and analyzing the packets.
sharktools allows you to use Wireshark's packet dissection engine from Python, e.g. to filter the packets.
If you have very specific needs or just want to learn something new, pylibpcap is a Python interface for the libpcap library, which is used by (almost) every traffic capture program out there.

Related

Is there any way to extract the payload from a UDP packet in real time and use that payload for another application?

I'm receiving some UDP packets from a network interface called tun0. I am able to see those packets in Wireshark. What I need is to extract the payloads from those packets. I tried to use Python sockets, but I'm unable to extract the payload, and I think that's because the packets use a uIP stack. Is there any way to pass the payload from Wireshark directly to a real-time process? Or are there any other suggestions?

If I understand correctly, what you're trying to accomplish is to get the udp.stream out of Wireshark without the headers.
If you're using Windows you might want to use PowerShell:
http://winpowershell.blogspot.com/2010/01/powershell-udp-clientserver.html?m=1
But if you're using Linux, or you don't have PowerShell or aren't comfortable using it, you can use tshark like this:
tshark -r $file -R '(ip.addr eq 10.0.0.X and ip.addr eq 10.0.0.X) and (udp.port eq X and udp.port eq X)' -T fields -e data
Hopefully this will work.
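With -T fields -e data, tshark prints each payload as a hexadecimal string, one packet per line. The following is a small sketch of turning that output back into bytes in Python so another application can use it. The function names are hypothetical, and colons are stripped because some tshark versions print colon-separated hex.

```python
def decode_tshark_data(line):
    """Convert one line of `tshark -T fields -e data` output to bytes."""
    cleaned = line.strip().replace(":", "")
    return bytes.fromhex(cleaned)


def payloads(lines):
    """Yield the payload bytes for every non-empty output line."""
    for line in lines:
        if line.strip():
            yield decode_tshark_data(line)
```

For near real-time use, tshark can capture live with -i tun0 and -l (line-buffered) instead of -r $file, with its output piped into a script that calls payloads(sys.stdin).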

How to bind ports for TCP clients when using raw packet sending TCP packets in Python? [duplicate]

Ok, I realize this situation is somewhat unusual, but I need to establish a TCP connection (the 3-way handshake) using only raw sockets (in C, in linux) -- i.e. I need to construct the IP headers and TCP headers myself. I'm writing a server (so I have to first respond to the incoming SYN packet), and for whatever reason I can't seem to get it right. Yes, I realize that a SOCK_STREAM will handle this for me, but for reasons I don't want to go into that isn't an option.
The tutorials I've found online on using raw sockets all describe how to build a SYN flooder, but this is somewhat easier than actually establishing a TCP connection, since you don't have to construct a response based on the original packet. I've gotten the SYN flooder examples working, and I can read the incoming SYN packet just fine from the raw socket, but I'm still having trouble creating a valid SYN/ACK response to an incoming SYN from the client.
So, does anyone know a good tutorial on using raw sockets that goes beyond creating a SYN flooder, or does anyone have some code that could do this (using SOCK_RAW, and not SOCK_STREAM)? I would be very grateful.
MarkR is absolutely right -- the problem is that the kernel is sending reset packets in response to the initial packet because it thinks the port is closed. The kernel is beating me to the response and the connection dies. I was using tcpdump to monitor the connection already -- I should have been more observant and noticed that there were TWO replies one of which was a reset that was screwing things up, as well as the response my program created. D'OH!
The solution that seems to work best is to use an iptables rule, as suggested by MarkR, to block the outbound packets. However, there's an easier way to do it than using the mark option, as suggested. I just match whether the reset TCP flag is set. During the course of a normal connection this is unlikely to be needed, and it doesn't really matter to my application if I block all outbound reset packets from the port being used. This effectively blocks the kernel's unwanted response, but not my own packets. If the port my program is listening on is 9999 then the iptables rule looks like this:
iptables -t filter -I OUTPUT -p tcp --sport 9999 --tcp-flags RST RST -j DROP

You want to implement part of a TCP stack in userspace... this is ok, some other apps do this.
One problem you will come across is that the kernel will be sending out (generally negative, unhelpful) replies to incoming packets. This is going to screw up any communication you attempt to initiate.
One way to avoid this is to use an IP address and interface on which the kernel is not running its own IP stack. That works, but you will need to deal with link-layer stuff (specifically, ARP) yourself. That would require a socket lower than IPPROTO_IP, SOCK_RAW: you need a packet socket (I think).
It may also be possible to block the kernel's responses using an iptables rule- but I rather suspect that the rules will apply to your own packets as well somehow, unless you can manage to get them treated differently (perhaps applying a netfilter "mark" to your own packets?)
Read the man pages socket(7), ip(7) and packet(7), which explain the various options and ioctls that apply to each type of socket.
Of course you'll need a tool like Wireshark to inspect what's going on. You will need several machines to test this, I recommend using vmware (or similar) to reduce the amount of hardware required.
Sorry I can't recommend a specific tutorial.
Good luck.

I realise that this is an old thread, but here's a tutorial that goes beyond the normal SYN flooders: http://www.enderunix.org/docs/en/rawipspoof/
Hope it might be of help to someone.

I can't help you out on any tutorials.
But I can give you some advice on the tools that you could use to assist in debugging.
First off, as bmdhacks has suggested, get yourself a copy of wireshark (or tcpdump - but wireshark is easier to use). Capture a good handshake. Make sure that you save this.
Capture one of your handshakes that fails. Wireshark has quite good packet parsing and error checking, so if there's a straightforward error it will probably tell you.
Next, get yourself a copy of tcpreplay. This should also include a tool called "tcprewrite".
tcprewrite will allow you to split your previously saved capture files into two - one for each side of the handshake.
You can then use tcpreplay to play back one side of the handshake so you have a consistent set of packets to play with.
Then you use wireshark (again) to check your responses.

I don't have a tutorial, but I recently used Wireshark to good effect to debug some raw sockets programming I was doing. If you capture the packets you're sending, wireshark will do a good job of showing you if they're malformed or not. It's useful for comparing to a normal connection too.

There are structures for IP and TCP headers declared in netinet/ip.h & netinet/tcp.h respectively. You may want to look at the other headers in this directory for extra macros & stuff that may be of use.
You send a packet with the SYN flag set and a random sequence number (x). You should receive a SYN+ACK from the other side. This packet carries an acknowledgement number (y = x+1), which is the next sequence number the other side expects to receive, as well as the other side's own sequence number (z). You send back an ACK packet with sequence number x+1 and ack number z+1 to complete the connection.
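The numbering is easy to get off by one, so here is the arithmetic from the paragraph above as a small Python sketch. It only computes the sequence and acknowledgement fields, not the packets themselves; the function names are hypothetical. Sequence numbers are 32-bit, so the additions wrap modulo 2**32.

```python
SEQ_MOD = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


def syn_ack_fields(client_isn, server_isn):
    """Fields of the SYN+ACK a server sends in reply to a SYN with seq x.

    seq is the server's own initial sequence number (z),
    ack is x + 1 because the SYN consumes one sequence number.
    """
    return {"seq": server_isn % SEQ_MOD, "ack": (client_isn + 1) % SEQ_MOD}


def final_ack_fields(client_isn, server_isn):
    """Fields of the client's final ACK: seq = x + 1, ack = z + 1."""
    return {"seq": (client_isn + 1) % SEQ_MOD,
            "ack": (server_isn + 1) % SEQ_MOD}
```

For the server in the question, syn_ack_fields is the relevant half: take the sequence number from the incoming SYN, pick your own initial sequence number, and expect the client's final ACK to match final_ack_fields.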
You also need to make sure you calculate appropriate TCP/IP checksums & fill out the remainder of the header for the packets you send. Also, don't forget about things like host & network byte order.
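As an illustration of the checksum step, here is a sketch of the 16-bit one's-complement Internet checksum used by both IP and TCP, written in Python for brevity (the C version has the same structure). Note that for TCP the checksum must be computed over a pseudo-header (source address, destination address, protocol, TCP length) followed by the TCP header and data, not over the TCP segment alone.

```python
import struct


def internet_checksum(data):
    """Return the 16-bit one's-complement checksum of data.

    The checksum field inside data must be zero while computing.
    """
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    # "!" = network (big-endian) byte order, "H" = unsigned 16-bit word
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:  # fold the carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF
```

A useful self-check: running the same function over a header that already contains its correct checksum returns 0.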
TCP is defined in RFC 793, available here: http://www.faqs.org/rfcs/rfc793.html

Depending on what you're trying to do it may be easier to get existing software to handle the TCP handshaking for you.
One open source IP stack is lwIP (http://savannah.nongnu.org/projects/lwip/) which provides a full tcp/ip stack. It is very possible to get it running in user mode using either SOCK_RAW or pcap.

If you are using raw sockets and you send with a source MAC address different from the real one, Linux will ignore the response packet and will not send an RST.

Using pcapy or scapy to monitor self-generated (HTTP) network traffic

I need to monitor how long it takes for a certain website to respond when addressed. I would like to sniff the traffic on port 80, but only when there is traffic being exchanged with the targeted site. I have searched SO and it seems like pcapy or scapy is the right tool for the job, but they seem deeper than I need. I have been studying the following script:
Network traffic monitor with pcapy in python
and I think I need to change the
def __handle_packet(self, header, data):
    # method is called for each packet by dispatch call (pcapy)
    self._dispatch_bytes_sum += header.getlen()  # header.getlen() #len(data)
    logger.debug("header: (len:{0}, caplen:{1}, ts:{2}), d:{3}".format(header.getlen(), header.getcaplen(), header.getts(), len(data)))
    # self.dumper.dump(header, data)
to somehow only unpack/handle packets that are destined for the target site. Note that this is for a Windows XP machine on a LAN and it is critical that the browser initiate the traffic.
Any pointers appreciated?

The problem with scapy is that it doesn't handle reassembling TCP streams. The HTTP you're looking for is likely to be embedded in a TCP stream. To quote the docs:
Scapy is based on a stimulus/response model. This model does not work well for a TCP stack. On the other hand, quite often, the TCP stream is used as a tube to exchange messages that are stimulus/response-based.
Like you said, scapy is better suited to lower-layer things. You could, for instance, probably track IP packets on DHCP requests. As with many network tools, the complexity and stream-based nature of TCP mean that once you cross that layer it gets harder to reassemble everything, deal with retransmissions and other edge cases, and coherently pull data out.
Could you use something like curl or urllib and see how long it takes for the response to come back?
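A minimal sketch of that suggestion with urllib from the Python 3 standard library follows. The function name time_request and the opener parameter are hypothetical (the parameter exists only so the timing logic can be exercised without a network). Note that this times a request made by the script itself, not one initiated by the browser, so it only fits if that requirement can be relaxed.

```python
import time
import urllib.request


def time_request(url, timeout=10.0, opener=urllib.request.urlopen):
    """Fetch url and return (status, body_length, elapsed_seconds).

    The elapsed time covers connecting, sending the request and
    reading the whole response body.
    """
    start = time.perf_counter()
    with opener(url, timeout=timeout) as response:
        body = response.read()
        status = response.status
    elapsed = time.perf_counter() - start
    return status, len(body), elapsed
```

Calling time_request("http://example.com/") in a loop and logging the third element of the result gives a simple response-time series for the site.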

Python: how to calculate data received and send between two ipaddresses and ports

I guess it's socket programming. But I have never done socket programming except for running the tutorial examples while learning Python. I need some more ideas to implement this.
What I specifically need is to run a monitoring program on a server which will poll or listen to traffic being exchanged between different IPs across different popular ports. For example, how do I get the data received and sent through port 80 between 192.168.1.10 and 192.168.1.1 (which is the gateway)?
I checked out a number of ready made tools like MRTG, Bwmon, Ntop etc but since we are looking at doing some specific pattern studies, we need to do data capturing within the program.
Idea is to monitor some popular ports and do a study of network traffic across some periods and compare them with some other data.
We would like to figure a way to do all this with Python....

You probably want to use scapy for that. Just sniff all ethernet traffic on a particular interface, drop everything that is not TCP and doesn't match the port.
Not sure if scapy can already track TCP connections (stuff like recognizing duplicate sequence numbers, extracting just the payload stream) but I would guess it probably can, and if not it's not too hard to hack together a good-enough TCP connection tracker that works for 95% of the traffic.
Alternatives would be to use sockets directly (look for raw sockets) or libpcap, which can both be done from Python. You may also want to check out the filter expression syntax of the 'tcpdump' command-line tool; maybe it can do what you want already.
I bet there are more specialized high-level tools for this, but I don't know them.
PS: if you don't know wireshark yet, go check it out and play around with it first. It can follow TCP streams and will teach you what TCP connection tracking means. Maybe its commandline binary, tshark, can be used to extract TCP streams for what you want.
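The sniff-and-filter approach suggested above can be sketched roughly as follows. The byte counting is plain Python; the capture helper uses scapy's sniff and needs capture privileges. The function names and the record layout are hypothetical, and the lengths counted are whole frame lengths, not just TCP payload.

```python
from collections import Counter


def tally_bytes(records, host_a, host_b, port):
    """Sum bytes per direction between host_a and host_b on a TCP port.

    records: iterable of (src_ip, dst_ip, sport, dport, length) tuples.
    """
    totals = Counter()
    for src, dst, sport, dport, length in records:
        if port not in (sport, dport):
            continue  # not the port we are studying
        if (src, dst) == (host_a, host_b):
            totals["a_to_b"] += length
        elif (src, dst) == (host_b, host_a):
            totals["b_to_a"] += length
    return totals


def sniff_records(iface, count):
    """Capture `count` TCP packets on iface and return them as records.

    Requires scapy and root/administrator privileges.
    """
    from scapy.all import sniff, IP, TCP
    packets = sniff(iface=iface, filter="tcp", count=count)
    return [(p[IP].src, p[IP].dst, p[TCP].sport, p[TCP].dport, len(p))
            for p in packets if IP in p and TCP in p]
```

For the example in the question, tally_bytes(sniff_records("eth0", 1000), "192.168.1.10", "192.168.1.1", 80) would give the bytes seen in each direction; the interface name and packet count here are placeholders.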

IPTraf is an ncurses-based IP LAN monitoring tool. It can generate network statistics for TCP, UDP, ICMP and more.
Since you're thinking of running it from Python, you may consider using screen (a screen manager with VT100/ANSI terminal emulation) to work around the ncurses issues, and you may want to pass the logging and interval parameters to IPTraf, which make it log to a file at a given interval. It's a little bit tricky, but you can eventually get what you are looking for by parsing the log file.

twisted - print IP datagrams from/to proxy

I have a twisted proxy from here: Python Twisted proxy - how to intercept packets .
It prints the HTTP data, and I would also like to intercept and examine the raw IP datagrams. How do I hook a callback for the IP packets?
http://twistedmatrix.com/documents/11.0.0/api/twisted.pair.ip.IPProtocol.html

Twisted doesn't have a built-in friendly way to hook in a listener on a raw IP socket (SOCK_RAW). This is for several reasons:
using SOCK_RAW can be tricky and it can work in non-obvious ways;
in most environments, using such a socket requires elevated privileges;
and the packets you actually get through a raw socket differ a lot between operating systems (e.g., you won't get any raw TCP-protocol IP packets on *BSD/Darwin through a raw socket, even if you're root).
The best way to capture raw datagrams in general, in a remotely portable manner, is with libpcap. Here is a link to someone who appears to have combined pcap and Twisted in a reasonably intelligent way; that may help.

Twisted doesn't include comprehensive support for operating at the IP level. There is some support for parsing IP datagrams, as you found, but no built-in support for hooking into platform support for sending or receiving these.
You might want to take a look at scapy.
