Sometimes Receiving shifted data with TCP - python

I'm receiving packets whose content does not start at the beginning of the buffer but at byte 2 or 3. This only happens sometimes, and when it does, all of the received data is shifted.
I set the buffer size to 4096 bytes. I'm using the recv method of Python sockets and am receiving data from a Siemens PLC (in case that is a possible source of the issue).
Am I missing something, or must the error be somewhere other than in my program?

Related

How many bytes can be send() over tcp without ever receive(), before send() blocks -- dependent on buffer sizes?

In Python 3.9 I wrote a TCP server that never calls receive(), and a client that sends 1 KB chunks to the server. Beforehand, I set the send and receive buffer sizes to values in the KB range.
My expectation was to be able to send (send-buffer + receive-buffer) bytes before send() would block. However:
On Windows 10: send() consistently blocks only after (2 x send-buffer + receive-buffer) bytes.
On Raspberry Debian GNU/Linux 11 (bullseye):
setting buffer sizes (with setsockopt) results in twice the buffer size (as reported by getsockopt).
send() blocks after roughly (send-buffer + 2 x receive-buffer) bytes, relative to the buffer sizes set with setsockopt.
Questions: Where does the "excess" data go? Why do the implementations behave so differently?
All tests were done on the same machine (win->win, raspi->raspi) with various send/receive buffer sizes in the range 5 - 50 KB.
TCP is a byte stream; there is no 1:1 relationship between sends and reads. send() copies data from the sender's buffer into a local kernel buffer, which is then transmitted to the remote peer in the background, where it is received into a kernel buffer, and finally copied by receive() into the receiver's buffer. send() will not block as long as the local kernel still has buffer space available. In the background, the sending kernel will transmit buffered data as long as the receiving kernel still has buffer space available. receive() will block only when the receiving kernel has no data available.
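A rough way to observe this on a single machine is sketched below. It is only an illustration under assumptions: the function name, the loopback address, the 8 KB buffer sizes and the safety cap are made up for the example, and a non-blocking send() is used as a stand-in for "send() would block". The byte count it reports depends on the operating system, because kernels round or scale the requested buffer sizes (Linux, for instance, doubles the value passed to setsockopt to leave room for bookkeeping overhead).

```python
import socket


def bytes_until_send_blocks(sndbuf=8192, rcvbuf=8192, chunk=1024,
                            cap=64 * 1024 * 1024):
    """Send to a peer that never reads; return (bytes accepted, SO_SNDBUF reported)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Set before listen() so the accepted socket can pick up this size.
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, rcvbuf)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, sndbuf)
    client.connect(listener.getsockname())
    server, _ = listener.accept()  # connected, but recv() is never called

    client.setblocking(False)  # raise BlockingIOError instead of blocking
    total = 0
    try:
        while total < cap:  # cap is only a safety net for the sketch
            total += client.send(b"x" * chunk)
    except BlockingIOError:
        pass  # the kernel has no buffer space left for this socket right now

    reported = client.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    for s in (client, server, listener):
        s.close()
    return total, reported


if __name__ == "__main__":
    total, reported = bytes_until_send_blocks()
    print("accepted", total, "bytes; SO_SNDBUF reported as", reported)
```

Comparing the accepted byte count with the values reported by getsockopt on each platform shows how much the kernel buffers on both sides absorbed before the sender stalled.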

Python - serial does not read correct value

I am trying to set up communication between an STM32 and a laptop.
I am trying to receive data over the serial port, sent by the STM32. The actual bytes I am sending are 0x08 0x09 0x0A 0x0B
I checked on the oscilloscope and I am indeed sending the correct values in the correct order.
What I actually receive is:
b'\n\x0b\x08\t'
I assume that Python is not reading an input that is larger than 3 bits, but I cannot figure out why.
Please find my code below:
import serial

ser = serial.Serial('COM3', 115200, bytesize=8)
while 1:
    if ser.inWaiting() != 0:
        print(ser.read(4))
If someone could help, it would be nice ! :)
Check your UART baud rate: the rate passed to pySerial must be the same as the one configured on the STM32.
Looking at how you use the pySerial library, two things come to mind:
You are not providing a read timeout parameter when you initialize your COM port.
You are waiting for 4 bytes of data from the serial port.
The problem with this approach is that your code will wait forever until it gets these 4 bytes. If you are sending the same array repeatedly from the STM32, it looks to me like you received 0x0A, 0x0B from an older packet and 0x08, 0x09 from a newer one, so the Python code printed those 4 bytes. It has in fact also received 0x0A, 0x0B of the newer packet into its buffer, but it has to wait for 2 more bytes before read(4) is allowed to return with data to print.
Adding a read timeout and limiting the read to a single byte at a time might solve your problem.
Also, for further development, if you want to set up regular communication between the microcontroller and the computer, I would suggest moving the received bytes into a separate buffer and recognizing individual packets in a parser running in a separate thread. In Python, more complex serial communication will be painful to build, as even with a separate thread it will be quite slow.
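The "separate buffer plus packet parser" idea could look roughly like the sketch below. Note the assumptions: the frame format (a start byte 0xAA followed by 4 payload bytes) and the FrameParser class are hypothetical and not part of the original STM32 code, which sends the 4 bytes without any marker. Without some marker or length field the receiver has no way to find the start of a packet once it is out of step. The parser is independent of pySerial; whatever ser.read() returns would simply be passed to feed().

```python
class FrameParser:
    """Collects incoming bytes and cuts them into fixed-size frames.

    Hypothetical frame layout: one start byte, then `payload_len` data bytes.
    """

    def __init__(self, start=0xAA, payload_len=4):
        self.start = bytes([start])
        self.payload_len = payload_len
        self.buf = bytearray()

    def feed(self, data):
        """Add newly received bytes; return a list of complete payloads."""
        self.buf.extend(data)
        frames = []
        while True:
            i = self.buf.find(self.start)
            if i < 0:
                self.buf.clear()      # no start byte at all: discard the noise
                break
            del self.buf[:i]          # drop everything before the start byte
            if len(self.buf) < 1 + self.payload_len:
                break                 # frame not complete yet, wait for more
            frames.append(bytes(self.buf[1:1 + self.payload_len]))
            del self.buf[:1 + self.payload_len]
        return frames


if __name__ == "__main__":
    parser = FrameParser()
    # Tail of an old packet, then the start of a new one.
    print(parser.feed(b"\x0a\x0b\xaa\x08\x09"))
    # The rest of the new packet arrives later.
    print(parser.feed(b"\x0a\x0b\xaa"))
```

With this layout the leftover 0x0A, 0x0B of an older packet is thrown away instead of being glued to the next one. A payload that itself contains 0xAA can still confuse resynchronization, which is why real protocols usually add a length field or checksum as well.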

Issue with receiving a byte from ATMEGA2560

I'm trying to receive a byte from an ATmega2560 (using USART) on my PC, and the byte can arrive at an unexpected time. How do I ensure that I don't miss the byte in my Python code (which has many functions running)?
You didn't say what operating system you are using or how the ATmega2560 is connected to the computer, but the drivers in your operating system responsible for receiving the serial data from the ATmega2560 will almost certainly have a buffer for holding incoming bytes, so you don't need to worry about constantly reading from the serial port in your Python program. Just read when you get around to it, and the byte should be waiting for you in the buffer.
It's easy to test that this is the case: send a byte from the AVR, purposely wait a few seconds, then read the byte and make sure it was received correctly.

reading both tcp and udp packets from same socket

I am trying to read packets in a router, like this in python:
# (skipping the exception handling code here)
s = socket.socket(socket.AF_PACKET, socket.SOCK_RAW, socket.ntohs(0x0003))
while True:
    p = s.recvfrom(2000)
    pkt = p[0]
    # process pkt here ...
Answers to a related question (36115971) say that the parameters and methods for UDP and TCP data are different (some say recv is for TCP and recvfrom is for UDP, and others say the opposite; similarly, some say 1024 as the buffer size for TCP and larger for UDP, and again some say the reverse). Since I am reading in a router, I do not have separate sockets for TCP and UDP and need to read both from the same socket, so I am a bit confused about how I should read the incoming packets.
(1) Should I use recv() or recvfrom(), if I want to read both TCP and UDP packets?
(2) Do the calls return data one packet at a time, or do they return after the buffer is filled up? eg, if I have a large buffer of 4096 bytes, and the incoming streaming 2 packets have 2400 bytes each, will the call return as soon as the 1st packet ends, or will it return after filling up the buffer from the 2nd packet also?
(2a) same question, but if I have a smaller buffer of 2000 bytes. It is clear that on the 1st call I will get the first 2000 bytes of the 1st packet. But on the next call, will I get the last 400 bytes of the 1st packet, or the first 2000 bytes of the 2nd packet?
(3) If I am delayed in making the next call, maybe because I was busy processing the 1st dataset, am I in danger of losing data, or will the OS keep its internal queue of the incoming packets to be given to me when I call the next time? If the OS keeps its internal queue, where can I find information about its size?
NOTE: Some of the given replies have been divergent, so let me put in some boundaries to my question. Hopefully these restrictions will help to give more specific answers.
(a) My objective is to sniff the incoming packets with python sockets only. So other solutions involving tcpdump or tshark etc are outside the scope.
(b) The objective is to only sniff for incoming packets. Additional details like packet reordering (for connection oriented protocols like TCP) are outside the scope, actually they are avoidable overhead.
If you're reading packets from a raw socket (as shown in your source code), then you can easily read all packets from the same socket. Be sure this is what you intend to do. A raw socket is for doing packet inspection for troubleshooting, forensic, security or educational purposes. You cannot easily communicate with another system this way.
And likewise, the receive calls will not differ here by protocol because you are not actually using TCP or UDP, you're simply receiving the raw packets that those protocols build and decode.
(1) Should I use recv() or recvfrom(), if I want to read both TCP and UDP packets?
Either one will work. recv() will return to you only the actual packet data, while recvfrom will return to you the data along with metadata about the packet, including the interface from which the data was received (and other things defined in struct sockaddr_ll from the packet(7) man page).
(2) Do the calls return data one packet at a time, or do they return after the buffer is filled up? eg, if I have a large buffer of 4096 bytes, and the incoming streaming 2 packets have 2400 bytes each, will the call return as soon as the 1st packet ends, or will it return after filling up the buffer from the 2nd packet also?
When using a raw socket like this, you get exactly one packet at a time. You will never get more than one. If the buffer you give is not large enough, then the packet will be truncated (with the ending bytes discarded).
(2a) same question, but if I have a smaller buffer of 2000 bytes. It is clear that on the 1st call I will get the first 2000 bytes of the 1st packet. But on the next call, will I get the last 400 bytes of the 1st packet, or the first 2000 bytes of the 2nd packet?
Generally speaking, packets on most networks are limited to about 1514 bytes. This is because the traditional "MTU" (Maximum Transmission Unit) that is configured on the network interface is 1500 bytes, and usually an Ethernet header containing two MAC addresses (6 bytes each) plus a two-byte Ethertype is prepended to that. In a switch or router, you may also see packets that carry an additional 4-byte VLAN tag (IEEE 802.1Q). (But some networks internally use "jumbo" packets up to about 9K in size for specific purposes.)
You should also understand that, in writing an application, one can send UDP datagrams (or TCP buffers) larger than the maximum packet size. In that case, the OS breaks those up into smaller chunks for sending (and they are re-assembled on the destination side before being handed to an application). When you're receiving raw packets like this, you will see the packets in their low-level, possibly fragmented, state.
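Since the raw socket hands over whole link-layer frames, telling TCP from UDP is a matter of looking at the headers yourself. The following is a minimal sketch, assuming Ethernet framing as described above and IPv4 only. The function name classify_frame is made up for the example, and IPv6, IP options and fragments are deliberately ignored. It relies on the 14-byte Ethernet header layout, the optional 4-byte 802.1Q tag, and the protocol field at byte 9 of the IPv4 header (6 for TCP, 17 for UDP).

```python
import struct

ETHERTYPE_IPV4 = 0x0800
ETHERTYPE_VLAN = 0x8100  # IEEE 802.1Q tag follows the MAC addresses


def classify_frame(frame):
    """Return 'tcp', 'udp', 'other' or 'short' for one raw Ethernet frame."""
    if len(frame) < 14:
        return "short"
    # 6-byte destination MAC, 6-byte source MAC, 2-byte Ethertype
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    offset = 14
    if ethertype == ETHERTYPE_VLAN:
        if len(frame) < 18:
            return "short"
        # Skip the 2-byte tag control field; the real Ethertype follows it.
        (ethertype,) = struct.unpack("!H", frame[16:18])
        offset = 18
    if ethertype != ETHERTYPE_IPV4:
        return "other"
    if len(frame) < offset + 20:  # minimum IPv4 header length
        return "short"
    protocol = frame[offset + 9]  # IPv4 "protocol" field
    return {6: "tcp", 17: "udp"}.get(protocol, "other")
```

In the sniffing loop from the question, the result of classify_frame(pkt) could be used to route each packet to TCP-specific or UDP-specific processing.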
(3) If I am delayed in making the next call, maybe because I was busy processing the 1st dataset, am I in danger of losing data, or will the OS keep its internal queue of the incoming packets to be given to me when I call the next time? If the OS keeps its internal queue, where can I find information about its size?
The OS will keep a queue of packets for you. The size is of course limited, since there is no way you would be able to keep up with, say, a 1Gb NIC at full line rate (let alone a 10Gb or higher NIC). The size is configured in a system-specific way. On Linux, and probably other Unix-based systems, you can call getsockopt with SOL_SOCKET / SO_RCVBUF to get an idea of the queue space available.
On Linux, at least, the size can be set with setsockopt up to a system-imposed maximum (which itself can be configured with various sysctl settings).
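As a small illustration of those two calls, the sketch below queries and then changes SO_RCVBUF. It uses an ordinary UDP socket so that it runs without root privileges; the same getsockopt/setsockopt calls should apply to the AF_PACKET socket from the question. The function name and the requested size of 256 KB are arbitrary choices for the example, and the value you get back is up to the kernel (on Linux it is capped by the net.core.rmem_max sysctl and reported doubled).

```python
import socket


def rcvbuf_before_and_after(requested=256 * 1024):
    """Return SO_RCVBUF as reported before and after asking for `requested` bytes."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        before = s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
        s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested)
        # The kernel may round, double or cap the requested value.
        after = s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    finally:
        s.close()
    return before, after


if __name__ == "__main__":
    before, after = rcvbuf_before_and_after()
    print("SO_RCVBUF before:", before, "after:", after)
```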
I think you should not do that, because TCP provides guarantees such as reliability, ordering, flow control, and congestion control, whereas UDP does not guarantee anything.
The protocol is fixed by the operating system at the moment the socket is created, which is why I think you cannot do what you are describing.
Open two different sockets, one native UDP socket and one native TCP socket.

How does the python socket.recv() method know that the end of the message has been reached?

Let's say I'm using 1024 as buffer size for my client socket:
recv(1024)
Let's assume the message the server wants to send to me consists of 2024 bytes.
Only 1024 bytes can be received by my socket. What's happening to the other 1000 bytes?
1. Will the recv method wait for a certain amount of time (say 2 seconds) for more data to come and stop working after this time span? (I.e., if the rest of the data arrives after 3 seconds, the data will not be received by the socket any more?)
or
2. Will the recv method stop working immediately after having received 1024 bytes of data? (I.e., will the other 1000 bytes be discarded?)
In case 1. is correct: is there a way for me to determine the amount of time the recv call should wait before returning, or is it determined by the system? (I.e., could I tell the socket to wait for 5 seconds before it stops waiting for more data?)
UPDATE:
Assume, I have the following code:
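import socket
import sys

# `port` is assumed to be defined elsewhere
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((sys.argv[1], port))
s.send(b'Hello, world')  # Python 3 sockets send bytes, not str
data = s.recv(1024)
print("received: {}".format(data))
s.close()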
Assume that the server sends data of size > 1024 bytes. Can I be sure that the variable "data" will contain all the data (including those beyond the 1024th byte)?
If I can't be sure about that, how would I have to change the code so that I can always be sure that the variable "data" will contain all the data sent (in one or many steps) from the server?
It depends on the protocol. Some protocols like UDP send messages, and exactly 1 message is returned per recv. Assuming you are talking about TCP specifically, there are several factors involved. TCP is stream oriented, and because of things like the amount of currently outstanding send/recv data, lost/reordered packets on the wire, delayed acknowledgement of data, and the Nagle algorithm (which can hold back some small sends until earlier data has been acknowledged), its behavior can change subtly as a conversation between client and server progresses.
All the receiver knows is that it is getting a stream of bytes. It could get anything from 1 to the fully requested buffer size on any recv. There is no one-to-one correlation between the send call on one side and the recv call on the other.
If you need to figure out message boundaries, it's up to the higher-level protocol to define them. Take HTTP for example. It starts with a \r\n delimited header that includes a count of the remaining bytes the client should expect to receive. The client knows how to read the header because of the \r\n, and then knows exactly how many bytes are coming next. Part of the charm of RESTful protocols is that they are HTTP based and somebody else already figured this stuff out!
Some protocols use NUL to delimit messages. Others may have a fixed-length binary header that includes a count of any variable data to come. I like zeromq, which has a robust messaging system on top of TCP.
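The fixed-length header variant can be sketched in a few lines. This is an illustration rather than any particular protocol: the 4-byte big-endian length prefix and the helper names recv_exact, send_msg and recv_msg are choices made for the example. The key point is that the receiver keeps calling recv until it has exactly the number of bytes it expects, regardless of how the stream was split up on the way.

```python
import socket
import struct


def recv_exact(sock, n):
    """Read exactly n bytes from a stream socket, looping over short reads."""
    chunks = []
    remaining = n
    while remaining:
        chunk = sock.recv(remaining)
        if not chunk:  # 0 bytes: the peer closed the connection
            raise ConnectionError("peer closed before the full message arrived")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def send_msg(sock, payload):
    """Send one message: 4-byte big-endian length, then the payload."""
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def recv_msg(sock):
    """Receive one message written by send_msg."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return recv_exact(sock, length)
```

With this in place, a 2024-byte message sent with send_msg comes back whole from a single recv_msg call, even though the underlying recv calls may each return only part of it.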
More details on what happens with receive...
When you do recv(1024), there are 6 possibilities:
1. There is no receive data. recv will wait until there is receive data. You can change that by setting a timeout.
2. There is partial receive data. You'll get that part right away. The rest is either buffered or hasn't been sent yet and you just do another recv to get more (and the same rules apply).
3. There is more than 1024 bytes available. You'll get 1024 of that data and the rest is buffered in the kernel waiting for another receive.
4. The other side has shut down the socket. You'll get 0 bytes of data. 0 means you will never get more data on that socket. But if you keep asking for data, you'll keep getting 0 bytes.
5. The other side has reset the socket. You'll get an exception.
6. Some other strange thing has gone on and you'll get an exception for that.
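A receive loop that handles these outcomes might look like the sketch below. It assumes a simple protocol where the sender closes the connection when it is done. The function name read_until_closed and the 5 second default timeout are arbitrary, and treating a timeout or reset as "stop and return what we have" is just one possible policy.

```python
import socket


def read_until_closed(sock, bufsize=1024, timeout=5.0):
    """Collect data until the peer closes, resets, or goes quiet for `timeout` seconds."""
    sock.settimeout(timeout)
    received = bytearray()
    while True:
        try:
            chunk = sock.recv(bufsize)
        except socket.timeout:
            break  # no data arrived within the timeout
        except ConnectionResetError:
            break  # the other side reset the connection
        # Any other OSError is left to propagate to the caller.
        if not chunk:
            break  # 0 bytes: the other side shut down, nothing more will come
        received += chunk  # partial or full buffer; loop to fetch the rest
    return bytes(received)
```

Called in place of the single s.recv(1024) from the question, this returns everything the server sent before it closed the connection, not just the first 1024 bytes.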
