Python detect the length of the data with socket - python

I found this code to detect the length of encrypted data in the frame :
header = self.request.recv(5)
if header == '':
    #print 'client disconnected'
    running = False
    break
(content_type, version, length) = struct.unpack('>BHH', header)
data = self.request.recv(length)
Source:
https://github.com/EiNSTeiN-/poodle/blob/master/samples/poodle-sample-1.py
https://gist.github.com/takeshixx/10107280
https://gist.github.com/ixs/10116537
This code listens to the connection between a client and a server. When the client talks to the server, self.request.recv(5) reads the 5-byte header of the frame, which contains the length. We then use that length to read the data.
If we print the exchange between the client and the server :
Client --> [proxy] -----> Server
length : 24 #why 24 ?
Client --> [proxy] -----> Server
length: 80 #length of the data
Client <-- [proxy] <----- Server
We can see that the client sends two packets to the server.
If I change
data = self.request.recv(length)
to
data = self.request.recv(4096)
Only one exchange is made.
Client --> [proxy] -----> Server
length: 109 #length of the data + the header
Client <-- [proxy] <----- Server
My question is: why do we only need to read 5 bytes to get the length and content_type information? Is there understandable documentation about this?
Why are there two requests: one with 24 and another with the length of our data?

Why do we only need to read 5 bytes to get the length and content_type information?
Because obviously that's the way the protocol was designed.
Binary streams only guarantee that when some bytes are put into one end of the stream, they arrive in the same order on the other end of the stream. For message transmission through binary streams the obvious problem is: where are the message boundaries? The classical solution to this problem is to add a prefix to messages, a so-called header. This header has a fixed size, known to both communication partners. That way, the recipient can safely read header, message, header, message (I guess you grasp the concept, it is an alternating fashion). As you see, the header does not contain message data -- it is just communication "overhead". This overhead should be kept small. The most efficient (space-wise) way to store such information is in binary form, using some kind of code that must, again, be known to both sides of the communication. Indeed, 5 bytes of information is quite a lot.
The '>BHH' format string indicates that this 5 byte header is built up like this:
unsigned char (1 Byte)
unsigned short (2 Bytes)
unsigned short (2 Bytes)
Plenty of room for storing information such as length and content type, don't you think? This header can encode 256 different content types, 65536 different versions, and a message length between 0 and 65535 bytes.
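The header-then-message reading described above can be sketched as follows. This is a minimal illustration, not the code from the linked samples: the helper names recv_exact and recv_record are hypothetical, and the content type and version values in the demo are arbitrary. Note that recv(n) may return fewer than n bytes, so the sketch loops until the requested count has arrived.

```python
import socket
import struct

# Same layout as '>BHH': content type (1 byte), version (2), length (2).
HEADER = struct.Struct(">BHH")


def recv_exact(sock, count):
    """Read exactly `count` bytes; recv() alone may return fewer."""
    chunks = []
    remaining = count
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("peer closed the connection mid-message")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def recv_record(sock):
    """Read one fixed-size header, then the payload it announces."""
    content_type, version, length = HEADER.unpack(recv_exact(sock, HEADER.size))
    return content_type, version, recv_exact(sock, length)


# Demo over a local socket pair (values 23 and 0x0301 are just examples).
left, right = socket.socketpair()
payload = b"x" * 80
left.sendall(HEADER.pack(23, 0x0301, len(payload)) + payload)
record = recv_record(right)
left.close()
right.close()
print(record[0], hex(record[1]), len(record[2]))
```

Because the header size is fixed at 5 bytes, the reader always knows how much to read next: 5 bytes, then `length` bytes, then 5 bytes again.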
Why are there two requests: one with 24 and another with the length of our data?
If your network forensics / traffic analysis does not correspond to what you have inferred from the code, one of the two analyses is wrong or incomplete. In this case, I guess that your traffic analysis is correct, but that you have not understood all the code relevant to this kind of communication. Note that I did not look at the source code you linked to.

Related

How can I parse length-prefixed messages from a TCP stream in Twisted Python?

I'm writing a TCP client/server with Python Twisted to replace one written in Java or C#.
I have to parse length-prefixed string messages, where the prefix is an ANS (alphanumeric string), over a persistent connection.
Like this:
message format : [alpha numeric string:4byte][message data]
example-1 : 0004ABCD ==> ABCD
example-2 : 0002AB0005HELLO ==> AB, HELLO
This can't be solved with IntNProtocol or NetStringProtocol.
And if a client sends a 2 KB message from the application layer, the kernel splits the message data by MSS (maximum segment size) and sends it as separate packets.
In a TCP send/receive environment, this often happens:
example : 1000HELLO {995 bytes of data not yet arrived}
So the receiver has to buffer what it has (in an array, a queue, ...) and wait for the remaining data.
In Twisted, I don't know how to parse multiple large messages.
Can anybody point me to some information or a URL?
from twisted.internet import protocol

class ClientProtocol(protocol.Protocol):
    def dataReceived(self, data):
        # how can I code to parse multiple large messages?
        # is there a solution to read a specific size of data?
        pass
It looks like you can implement this protocol using StatefulProtocol as a base. Your protocol basically has two states. In the first state, you're waiting for 4 bytes which you will interpret as a zero-padded base 10 integer. In the second state, you're waiting for a number of bytes equal to the integer read in the first state.
from twisted.protocols.stateful import StatefulProtocol

class ANSProtocol(StatefulProtocol):
    def getInitialState(self):
        # First state: wait for the 4-byte length prefix
        return (self._state_length, 4)

    def _state_length(self, length_bytes):
        # Interpret the prefix as a zero-padded base 10 integer
        length = int(length_bytes)
        return self._state_content, length

    def _state_content(self, content):
        self.application_logic(content)
        return self.getInitialState()

    def application_logic(self, content):
        # Application logic operating on `content`
        # ...
        pass
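The same two-state idea can also be sketched without Twisted, as a plain function over a receive buffer. This is only an illustration under the message format from the question (4 ASCII digits, then the data); the name extract_messages is hypothetical and not part of any library. Whatever it cannot parse yet is returned so the caller can prepend it to the next chunk.

```python
PREFIX_SIZE = 4  # four ASCII digits, zero-padded, as in the question


def extract_messages(buffer):
    """Split `buffer` into complete messages plus the unparsed remainder."""
    messages = []
    offset = 0
    while len(buffer) - offset >= PREFIX_SIZE:
        length = int(buffer[offset:offset + PREFIX_SIZE])
        end = offset + PREFIX_SIZE + length
        if end > len(buffer):
            break  # body not fully arrived yet; wait for more data
        messages.append(buffer[offset + PREFIX_SIZE:end])
        offset = end
    return messages, buffer[offset:]


# Two complete messages followed by a partial third one.
messages, rest = extract_messages(b"0002AB0005HELLO0004AB")
print(messages, rest)

# Once the missing bytes arrive, the remainder completes the message.
more, rest_after = extract_messages(rest + b"CD")
print(more, rest_after)
```

In a dataReceived-style callback, the remainder would be kept on the protocol instance and concatenated with the next chunk before calling the function again.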

Python sending multiple strings using socket.send() / socket.recv()

I am trying to send multiple strings using the socket.send() and socket.recv() function.
I am building a client/server, and "in a perfect world" would like my server to call the socket.send() function a maximum of 3 times (for a certain server command) while having my client call socket.recv() 3 times. This doesn't seem to work; the client hangs waiting for another response.
server:
clientsocket.send(dir)
clientsocket.send(dirS)
clientsocket.send(fileS)
client:
response1 = s.recv(1024)
if response1:
    print "\nReceived response: \n" , response1
response2 = s.recv(1024)
if response2:
    print "\nReceived response: \n" , response2
response3 = s.recv(1024)
if response3:
    print "\nReceived response: \n" , response3
I was going through the tedious task of joining all my strings together then reparsing them in the client, and was wondering if there was a more efficient way of doing it?
edit:
My output of response1 gives me unusual results. The first time I print response1, it prints all 3 of the responses in 1 string (all mashed together). The second time I call it, it gives me the first string by itself. The following calls to recv are now glitched/bugged and display the 2nd string, then the third string. It then starts to display the other commands but as if it was behind and in a queue.
Very unusual, but I will likely stick to joining the strings together in the server then parsing them in the client
You wouldn't send bytes/strings over a socket like that in a real-world app.
You would create a messaging protocol on-top of the socket, then you would put your bytes/strings in messages and send messages over the socket.
You probably wouldn't create the messaging protocol from scratch either. You'd use a library like nanomsg or zeromq.
server
from nanomsg import Socket, PAIR
sock = Socket(PAIR)
sock.bind('inproc://bob')
sock.send(dir)
sock.send(dirS)
sock.send(fileS)
client
from nanomsg import Socket, PAIR
sock = Socket(PAIR)
# the client connects to the address the server has bound
sock.connect('inproc://bob')
response1 = sock.recv()
response2 = sock.recv()
response3 = sock.recv()
In nanomsg, recv() will return exactly what was sent by send() so there is a one-to-one mapping between send() and recv(). This is not the case when using lower-level Python sockets where you may need to call recv() multiple times to get everything that was sent with send().
TCP is a streaming protocol and there are no message boundaries. Whether a blob of data was sent with one or a hundred send calls is unknown to the receiver. You certainly can't assume that 3 sends can be matched with 3 recvs. So, you are left with the tedious job of reassembling fragments at the receiver.
One option is to layer a messaging protocol on top of the pure TCP stream. This is what zeromq does, and it may be an option for reducing the tedium.
The answer to this has been covered elsewhere.
There are two solutions to your problem.
Solution 1:
Mark the end of your strings: send(escape(dir) + MARKER). Your client then keeps calling recv() until it gets the end-of-message marker. If recv() returns multiple strings, you can use the marker to know where they start and end. You need to escape the marker if your strings contain it. Remember to unescape on the client too.
Solution 2:
Send the length of your strings before you send the actual string. Your client then keeps calling recv() until it has read all the bytes. If recv() returns multiple strings, you know where they start and end since you know how long they are. When sending the length of your string, make sure you use a fixed number of bytes so you can distinguish the string length from the string in the byte stream. You will find the struct module useful.
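Solution 1 can be sketched roughly as below. The sketch makes a simplifying assumption that is not in the answer: messages never contain the marker, so escaping is left out (frame() rejects such messages instead). MARKER, frame and MarkerFramer are hypothetical names chosen for illustration. The chunks passed to feed() stand in for whatever recv() happens to return.

```python
MARKER = b"\n"  # assumed end-of-message marker; must not occur in messages


def frame(message):
    """Append the marker on the sending side."""
    if MARKER in message:
        raise ValueError("message contains the marker; escaping not shown")
    return message + MARKER


class MarkerFramer:
    """Collects received chunks and yields only complete messages."""

    def __init__(self):
        self._buffer = b""

    def feed(self, chunk):
        self._buffer += chunk
        # Everything before the last marker is complete; the tail is kept.
        *complete, self._buffer = self._buffer.split(MARKER)
        return complete


# Three sends arrive as two arbitrary chunks, as TCP is free to do.
stream = frame(b"dir") + frame(b"dirS") + frame(b"fileS")
framer = MarkerFramer()
first = framer.feed(stream[:12])
second = framer.feed(stream[12:])
print(first, second)
```

The receiver recovers the three original strings regardless of how the byte stream was split, which is the property the plain recv(1024) calls in the question do not have.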

Mixed formatted List/String from Serial

I am new to Python. I have some experience with Pascal and a little with C++.
At the moment I have to write some code for a research project demonstrator.
The setup is as follows:
We have an 868MHz radio master device. I can communicate with this device via a USB port (COM4 at the moment, but that may change in the future).
The 868MHz master communicates with a slave unit. The slave unit replies with a message that I can read from the USB port.
Up to this point everything works well. I request data packages and also receive them.
Once I have received the data packages, I have a problem that I seem unable to solve on my own.
I use Anaconda 32 bit with the Spyder editor
# -*- coding: utf-8 -*-
"""
Created on Thu May 7 13:35:59 2015
#author: roland
"""
import serial
portnr = 3 #Serial Port Number 0=Com1, 3=Com4
portbaud = 38400 #Baud rate
tiout = 0.1 #Timeout in seconds
i = 1
wrword = ([0x02,0x04,0x00,0x00,0x00,0x02,0x71,0xF8])
try:
    ser = serial.Serial(portnr, portbaud, timeout=tiout) # open port
except:
    ser.close() # close port
    ser = serial.Serial(portnr, portbaud, timeout=tiout) # open port
print(ser.name) # check which port was really used
while (i < 100):
    ser.write(wrword)
    seread = ser.readline()
    print(seread)
    i = i+1
    sere = seread.split()
    try:
        readdat = str(sere[0])
    except:
        print("Index Error")
    retlen = len(readdat)
    print(retlen)
    readdat = readdat[2:retlen-1]
    print(readdat)
ser.close() # close port
The variable wrword is my request to the 868MHz radio master.
The Format is as follows:
0x02 Address of the unit
0x04 Command to send information from a certain register range
0x00 0x00 Address of first Register (Start address 0 is valid!)
0x00 0x02 Number of registers to be sent (in this case registers 0 and 1 shall be transmitted to the radio master)
0x71 0xF8 Checksum of the command sentence.
The program sends the command sequence successfully to the master unit and the slave unit answers. Each time the command is sent an answer is expected. Nevertheless, it may happen that no correct answer is given; that's why the try command is in use.
I know I use ser.readline(), but this is sufficient for the application.
I receive a list as answer from the USB Port.
The data look as follows:
b'\x02\x04\x04\x12\xb6\x12\xa5\xe0\xc1' (This is the Output from print(seread) )
For clarification this answer is correct and must be read as follows:
\x02 Address of the answering unit
\x04 Function that was executed (Read from certain register area)
\x04 Number of Bytes of the answer
\x12 \xb6 Value of first register (2 Byte)
\x12 \xa5 Value of second register (2 Byte)
\xe0 \xc1 Checksum of answer
If the data from the COM port always had this format, I might be able to get the data values from both registers. But unfortunately the data format is not always the same.
Sometimes I receive answers in the following style:
b'\x02\x04\x04\x12\x8e\x12{\xe1T'
The answer is similar to the example above (different values in the registers and a different checksum), but the format I receive has changed.
If I replace the symbols that are obviously not hex values with their ASCII hex codes, I get a valid answer telegram.
b'\x02\x04\x04\x12\x8e\x12{\xe1T'
becomes
b'\x02\x04\x04\x12\x8e\x12\x7b\xe1\x54'
when I replace the ASCII symbols with their hex codes by hand.
So my questions are:
Is it possible to force Python to give me the answer in a defined format?
If not, is it possible to handle the list, or the string I can derive from the list, in such a way that I get my values in the required format?
Can somebody give me a hint on how to extract my register values from the list and convert the two hex numbers of each register into one integer value per register (the first value is the high byte, the second the low byte)?
Thanks in advance for your answer(s)
sincerely
Roland
I found a solution.
While writing a small test program I stumbled upon the fact that the variable seread already contains the data in a format that is suitable and usable for me.
I assume that the Spyder editor causes the format change when displaying byte type objects.
If I access the single bytes using seread[i], with i in the range 0 to len(seread)-1, I receive the correct values for the single bytes.
So I can access my data and calculate my measurement values as required.
Nevertheless thanks to keety for reading my question.
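For reference, the two displayed forms are the same bytes: when a bytes object is printed, Python shows any byte that is a printable ASCII character as that character, so b'{' and b'\x7b' are equal. The register values can then be extracted with the struct module. The sketch below assumes the reply layout described in the question (address, function, byte count, big-endian 2-byte registers, 2-byte checksum); parse_reply is a hypothetical helper and it does not verify the checksum.

```python
import struct


def parse_reply(frame):
    """Split a reply into address, function, register values and checksum."""
    address, function, byte_count = struct.unpack(">BBB", frame[:3])
    # Each register is 2 bytes, high byte first (big-endian unsigned short).
    fmt = ">%dH" % (byte_count // 2)
    registers = struct.unpack(fmt, frame[3:3 + byte_count])
    checksum = frame[3 + byte_count:3 + byte_count + 2]  # not verified here
    return address, function, registers, checksum


reply = b'\x02\x04\x04\x12\x8e\x12{\xe1T'

# The "changed" format is only a display difference, not different data.
same_bytes = reply == b'\x02\x04\x04\x12\x8e\x12\x7b\xe1\x54'

address, function, registers, checksum = parse_reply(reply)
print(same_bytes, address, function, registers)
```

For this reply the registers come out as 0x128e and 0x127b, that is 4750 and 4731.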

Converting hex string to packet in Scapy

My aim is to sniff a packet from a pcap file, modify the last 4 bytes of the packet and send it. For now, I'm doing like this:
from scapy.all import *
from struct import pack
from struct import unpack
pkt = sniff(offline="my.pcap", count=1)
pkt_hex = str(pkt)
# getting output like '\x00\x04\x00 ... ... \x06j]'
last_4 = unpack('!I',pkt_hex[-4:])[0]
# getting last 4 bytes and converting it to integer
rest = pkt_hex[:-4]
# getting whole packet in string except last 4 bytes
new_pkt = rest + pack('>I',(last_4+1))
# created the whole packet again with last 4 bytes incremented by 1
# the new string is like '\x00\x04\x00 ... ... \x06j^'
Now my problem is that I'm unable to convert it back to a layered Scapy packet object, and hence unable to send it using sendp.
PS: I've to recalculate the checksum. But once I'll convert it to a packet object, I can recalculate the checksum following this.
You can rebuild the packet using the class of the original packet, however there are other errors in your program.
The official API documentation on the sniff function states that it returns a list:
sniff(prn=None, lfilter=None, count=0, store=1, offline=None, L2socket=None, timeout=None)
Sniffs packets from the network and return them in a packet list.
Therefore, rather than extracting the packet with pkt_hex = str(pkt), the correct form to extract the packet is pkt_hex = str(pkt[0]).
Once that is done, you are free to alter the packet, update its checksum (as suggested here) and rebuild it using the class of the original packet, as follows (note my comments):
from scapy.all import *
from struct import pack
from struct import unpack
pkts = sniff(offline="my.pcap", count=1)
pkt = pkts[0] # <--- NOTE: correctly extract the packet
pkt_hex = str(pkt)
# getting output like '\x00\x04\x00 ... ... \x06j' <--- NOTE: there is no trailing ']'
last_4 = unpack('!I',pkt_hex[-4:])[0]
# getting last 4 bytes and converting it to integer
rest = pkt_hex[:-4]
# getting whole packet in string except last 4 bytes
new_hex_pkt = rest + pack('>I',(last_4+1))
# created the whole packet again with last 4 bytes incremented by 1
# the new string is like '\x00\x04\x00 ... ... \x06k' <--- NOTE: 'j' was incremented to 'k' (rather than ']' to '^')
new_pkt = pkt.__class__(new_hex_pkt) # <--- NOTE: rebuild the packet from the modified bytes
del new_pkt.chksum # <--- NOTE: clear the checksum so it is recalculated when the packet is built for sending
del new_pkt[TCP].chksum # <--- NOTE: same for the TCP checksum (depends on the transport layer protocol in use)
sendp(new_pkt) # <--- NOTE: send the new packet
EDIT:
Note that this doesn't preserve the packet's timestamp, which would change to the current time. In order to retain the original timestamp, assign it to new_pkt.time:
new_pkt.time = pkt.time
However, as explained here, even after changing the packet's timestamp and sending it, the updated timestamp won't be reflected in the received packet on the other end since the timestamp is set in the receiving machine as the packet is received.

How to receive HTTP response data using a socket?

As you know, sometimes we can't know the size of the data (if there is no Content-Length in the HTTP response header).
What is the best way to receive HTTP response data (using a socket)?
The following code can get all the data, but it blocks at buf = sock.recv(1024).
from socket import *
import sys
sock = socket(AF_INET, SOCK_STREAM)
sock.connect(('www.google.com', 80))
index = "GET / HTTP/1.1\r\nHOST:www.google.com\r\nConnection:keep-alive\r\n\r\n"
sock.send(index)
data = ""
while True:
    buf = sock.recv(1024)
    if not len(buf):
        break
    data += buf
I'm assuming you are writing the sender as well.
A classic approach is to prefix any data sent over the wire with the length of the data. On the receive side, you just greedily append all data received to a buffer, then iterate over the buffer each time new data is received.
So if I send 100 bytes of data, I would prefix an int 100 to the beginning of the packet, and then transmit. Then, the receiver knows exactly what it is looking for. If you want to get fancy, you can use a special endline sequence like \x00\x01\x02 to indicate the proper end of packet. This is an easily implemented form of error checking.
Use a bigger size first, do a couple of tests, then see what the length of those buffers is; you will then have an idea of what the maximum size would be. Then just use that number +100 or so to be sure.
Testing different scenarios will be your best bet on finding your ideal buf size.
It would also help to know what protocol you are using the sockets for, then we would have a better idea and response for you.
Today I ran into the same question again.
And I found that the simple way is to use httplib.
from httplib import HTTPResponse
r = HTTPResponse(sock)
r.begin()
# now you can use HTTPResponse methods to get what you want.
print r.read()
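In Python 3 the same class lives in http.client. The sketch below is only an illustration of the idea: instead of contacting a real server it writes a canned chunked response (an assumption made so the example is self-contained) into one end of a local socket pair and lets HTTPResponse parse it from the other end. The helper name read_http_response is hypothetical.

```python
import socket
from http.client import HTTPResponse


def read_http_response(sock):
    """Let HTTPResponse parse the headers and find the end of the body."""
    response = HTTPResponse(sock)
    response.begin()  # reads the status line and headers
    return response.status, response.read()


# Canned response without Content-Length; the body is chunk-encoded.
server_side, client_side = socket.socketpair()
server_side.sendall(
    b"HTTP/1.1 200 OK\r\n"
    b"Transfer-Encoding: chunked\r\n"
    b"\r\n"
    b"5\r\nhello\r\n"
    b"6\r\n world\r\n"
    b"0\r\n\r\n"
)

status, body = read_http_response(client_side)
print(status, body)

server_side.close()
client_side.close()
```

Here read() returns as soon as the terminating zero-length chunk has been parsed, even though the sending side has not closed the connection, which is the situation where a bare recv() loop keeps blocking.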
