TCP flows by their own nature will grow until they fill the maximum capacity of the links used from src to dst (if all those links are empty).
Is there an easy way to limit that? I want to be able to send TCP flows with a maximum rate of X Mbps.
I thought about just sending X bytes per second using the socket.send() function and then sleeping the rest of the time. However if the link gets congested and the rate gets reduced, once the link gets uncongested again it will need to recover what it could not send previously and the rate will increase.
At the TCP level, the only control you have is how many bytes you pass off to send(), and how often you call it. Once send() has handed over some bytes to the networking stack, it's entirely up to the networking stack how fast (or slow) it wants to send them.
Given the above, you can roughly limit your transmission rate by monitoring how many bytes you have sent and how much time has elapsed since you started sending, and then holding off subsequent calls to send() (and/or reducing the number of data bytes you pass to send()) to keep the average rate from going higher than your target rate.
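A minimal sketch of that bookkeeping in Python, assuming a blocking socket. The names SendPacer and paced_send, the chunk size, and the injectable clock and sleep parameters are all illustrative choices, not part of any library:

```python
import time


class SendPacer:
    """Tracks bytes sent since start and reports how long to wait
    so the average rate stays at or below the target."""

    def __init__(self, rate_bytes_per_sec, clock=time.monotonic):
        self.rate = float(rate_bytes_per_sec)
        self.clock = clock
        self.start = clock()
        self.sent = 0

    def record(self, nbytes):
        self.sent += nbytes

    def delay_needed(self):
        # Earliest time at which `sent` bytes fit under the target rate.
        earliest = self.start + self.sent / self.rate
        return max(0.0, earliest - self.clock())


def paced_send(sock, data, pacer, chunk_size=4096, sleep=time.sleep):
    """Hand `data` to send() in chunks, sleeping between calls as needed."""
    view = memoryview(data)
    offset = 0
    while offset < len(view):
        wait = pacer.delay_needed()
        if wait > 0:
            sleep(wait)
        n = sock.send(view[offset:offset + chunk_size])
        pacer.record(n)  # send() may accept fewer bytes than offered
        offset += n
```

Because this tracks the average since the start, a stalled connection is followed by a catch-up burst, which is the behaviour the question worries about. Resetting start and sent after a long stall, or switching to a token bucket with a capped burst size, would be one way to avoid that.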
If you want any finer control than that, you'll need to use UDP instead of TCP. With UDP you have direct control of exactly when each packet gets sent. (Whereas with TCP it's the networking stack that decides when to send each packet, what will be in the packet, when to resend a dropped packet, etc)
How can we determine the packet rate of clients connected to our server in case of multi client server using Winsock. The idea I came up with is keeping a frequency map for IP addresses of all the clients and storing the packets count for some arbitrary amount k seconds. Now after k seconds we traverse the map and see what IP addresses have more than 100*k packets, now we block these IP addresses. After every k seconds we empty the map and start again.
PSEUDO CODE: (k = 10)
map<string,int> map;
void calculate() {
    for(auto &ip : map){
        // threshold is 100*k packets, with k = 10
        if(ip.second>1000) blacklist(ip.first);
    }
    map.clear();
    Sleep(10000);   // k seconds, in milliseconds
    calculate();
}
int s = socket(AF_INET,SOCK_STREAM, IPPROTO_TCP);
// bind(), listen()
calculate();
while(1) {
    if(recv(s,buff,len)>0) map[client.ip]++;
}
Per comments:
If someone is sending too fast, I'd like to block him permanently rather than receiving his messages less frequently. Something like this is what I'm trying to achieve
If this was UDP, I'd 100% be onboard with what you are trying to do and give the code I have. But this is TCP and your assumptions are flawed.
Let's say the sender invokes this:
send(sock, buffer, 1000, 0);
And then on the other side, you invoke this:
recv(sock, buffer, 1000, 0)
Did you know that recv may do any of the following:
It may return any value less than or equal to 1000. It could return 1 and expect you to invoke it another 999 times to consume the entire message. One of the biggest confusions with TCP sockets is assuming that each send call mirrors a recv call in a 1:1 fashion. Lots of buggy apps have shipped that way.
More probably, you'll get 1 or 2 recv calls because of IP fragmentation and/or TCP segmentation. How fast you invoke recv also matters. But this is never guaranteed or expected to be consistent. What you observe with local testing on your own LAN will not resemble actual internet behavior.
How many recv calls you get has nothing to do with how many actual IP packets or TCP segments, because "the packets" will get coalesced anyway by the TCP stack on the recv side.
Similarly, how many bytes you pass to send doesn't influence the packet count. TCP, including any number of routers and gateways in between, may split up this 1000 byte stream into additional fragments and segments.
I'm going to offer two suggestions:
Detect flood attacks by counting application protocol messages and/or the size of these application protocol messages - but not individual recv calls. That is, as you recv data, you'll accumulate this data stream into logical protocol messages based on a fixed size of bytes or a delimiter based message structure and pass it up to a higher part of your application for processing. Do the incremental count then.
Instead of trying to thwart flood attacks at the message level, it's probably simpler to just throttle clients to a fixed rate of data. That is, each time you recv data, count how many bytes it returns and use that with a timer to measure an incoming bytes/second rate. If the remote side exceeds your limit, insert sleep statements in between recv calls. This will implicitly make TCP slow the other side down from sending too fast.
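The first suggestion (count protocol messages, not recv calls) could look roughly like the following Python sketch. It assumes a newline-delimited protocol; the class name MessageFramer and the delimiter are illustrative assumptions, and a fixed-size framing would follow the same pattern:

```python
class MessageFramer:
    """Accumulates arbitrary recv() chunks and yields complete
    delimiter-terminated messages, counting them as it goes."""

    def __init__(self, delimiter=b"\n"):
        self.delimiter = delimiter
        self.buffer = bytearray()
        self.message_count = 0

    def feed(self, chunk):
        """Add one recv() result; return the messages it completed."""
        self.buffer += chunk
        messages = []
        while True:
            idx = self.buffer.find(self.delimiter)
            if idx < 0:
                break  # the rest is a partial message; wait for more data
            messages.append(bytes(self.buffer[:idx]))
            del self.buffer[:idx + len(self.delimiter)]
        self.message_count += len(messages)
        return messages
```

One framer per client connection would give a per-client message count that does not depend on how TCP happened to split or coalesce the byte stream.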
I am a client receiving UDP multicast data sent by a sender server (stock exchange data). I continuously receive a UDP multicast packet flow, sequentially numbered from 1 to approximately 35,000,000, sent uniformly over a period of 6 hours. I need to ensure that all packets up to some N have been received before each batch of packets is processed, which happens roughly every 256 packets. That is, I need reliable UDP.
Reliable UDP is mimicked using TCP retransmission. If any UDP packets are lost or not received, they are requested over TCP by specifying the missing packet range (starting number, ending number).
The sender keeps a record of all the packets (stock exchange data) it has sent via UDP multicast so far, and it resends over TCP only the packet numbers that the receiver specifically requests over TCP. This is how the receiver achieves UDP reliability. The UDP drop ratio is very small (less than 0.001%), except when joining the UDP multicast in the middle of the day, in which case all previously sent UDP packets from 1 to some N must be resent over TCP while the live UDP multicast transmission of packet number N+1 onward is being received. I can't ask the sender (the stock exchange) to change its protocol; it is fixed.
What is the efficient algorithm to implement this in terms of CPU?
The issue is speed (big-O complexity). I can write a naive algorithm using several nested loops and methods, but it is not necessarily the best.
I am thinking of maintaining a number N which confirms I have received UDP packets 1 through N. Any packet number M that is not the next expected packet number N+1 will be buffered, for say 256 packets, and then TCP will be used to request the missing numbers. Normal UDP reception then resumes from the last confirmed number once the TCP request is filled.
Example:
Suppose UDP packets received by receiver are in the following sequence {1,2,3,6,7,8,9,10 ...}
After packet No. 3, the next packet is No. 6. Packets 4 through 5 are missing.
So the missing packets {4,5} are requested using TCP request({4 through 5}), and {6,7,8,9,10} are buffered. There is enough space on the 10GBaseT LAN card for buffering 35,000,000 packets.
So: receive UDP {1,2,3}, refill by TCP request {4,5}, continue receive UDP {6,7,8,9,10, ...}
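The bookkeeping in this example can be sketched in Python with a dictionary of buffered packets and two counters, which makes each packet amortized O(1) with no nested loops. The class name GapTracker and the return convention are illustrative assumptions; the sketch also treats any jump in sequence number as loss, so merely reordered UDP packets would trigger a redundant request:

```python
class GapTracker:
    """Delivers packets in sequence order and reports ranges to re-request."""

    def __init__(self, first_seq=1):
        self.next_expected = first_seq      # lowest number not yet delivered
        self.highest_seen = first_seq - 1   # highest number received so far
        self.pending = {}                   # seq -> payload, waiting for a gap

    def on_packet(self, seq, payload):
        """Handle a packet from UDP or from a TCP refill.

        Returns (ready, gap): payloads now deliverable in order, and an
        inclusive (first, last) range to request over TCP, or None.
        """
        if seq < self.next_expected or seq in self.pending:
            return [], None  # duplicate, ignore
        gap = None
        if seq > self.highest_seen + 1:
            gap = (self.highest_seen + 1, seq - 1)
        self.highest_seen = max(self.highest_seen, seq)
        self.pending[seq] = payload
        ready = []
        while self.next_expected in self.pending:
            ready.append(self.pending.pop(self.next_expected))
            self.next_expected += 1
        return ready, gap
```

With the sequence above, packet 6 yields the gap (4, 5), packets 6 onward wait in pending, and the TCP refill of 4 and 5 releases everything buffered behind them.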
I assume since you are using multicast that there are going to be multiple receivers of this data? (Because if not, you'd probably be using unicast instead)
Therefore, if the receivers are going to have the option of requesting TCP retransmission of packets they didn't get, that means that the transmitting program will need to keep a copy of recently-sent UDP packets in memory, so that when it receives a retransmit-request, it will have the requested data available to retransmit. Assuming you're stamping each packet with a unique ID, it can store this data in a std::map or std::unordered_map or similar for quick lookup.
The real question becomes: how much of this old-packet data should the transmitter retain? Ideally it would retain all of it, because you never know how much a given receiver might have missed and might want to request; but that would require unbounded memory, so it's not a realistic option. Probably the best you can do is decide how much RAM you're willing to tie up for this purpose, keep a count of the total number of bytes you have in your table, and when it reaches the limit, start dropping the oldest packets from the table in order to keep its size under the limit.
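A rough Python sketch of such a size-limited table, assuming each packet carries a unique, increasing ID. The class name RetransmitStore and its methods are hypothetical, chosen only for illustration:

```python
from collections import OrderedDict


class RetransmitStore:
    """Keeps recently sent packets, evicting the oldest once a byte budget
    is exceeded."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.total_bytes = 0
        self.packets = OrderedDict()  # seq -> payload, oldest first

    def add(self, seq, payload):
        if seq in self.packets:
            return  # assumed unique IDs; ignore accidental repeats
        self.packets[seq] = payload
        self.total_bytes += len(payload)
        while self.total_bytes > self.max_bytes and self.packets:
            _, oldest = self.packets.popitem(last=False)
            self.total_bytes -= len(oldest)

    def get_range(self, first, last):
        """Return the (seq, payload) pairs still held in [first, last]."""
        return [(s, self.packets[s])
                for s in range(first, last + 1) if s in self.packets]
```

A retransmit request for a range that has already been evicted simply returns fewer packets than asked for, so the protocol needs some way to tell the receiver that the data is gone.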
I wrote an open-source library that uses essentially the technique you describe (multicast UDP + TCP-retransmit-to-recover-from-packet-loss) to synchronize databases across multiple hosts as quickly as possible; some things I learned while implementing it include:
If/when you can, pack your data-messages together into larger packets, up to the MTU of the network you are transmitting over (e.g. 1388 bytes for IPv4/Ethernet). Very small packet-sizes (like 48-bytes/packet) are inefficient, since the fixed-sized packet-headers make up a greater percentage of the total data sent/received.
Only try to send when your sending-socket indicates it is ready-for-write. (i.e. don't assume that you will never fill up the socket's outgoing-data-buffer; if your traffic is "bursty", you probably will at some point)
Minimize UDP packet loss by making your UDP sockets' send and receive buffers as large as you can get away with
Further minimize UDP packet loss by doing all the UDP receiving in a dedicated, high-priority thread (which can then route the received UDP data back to a normal-priority thread for further processing -- the main thing is to avoid allowing the receiving UDP-socket's incoming-data-buffer to overflow if possible)
For the TCP retransmission part, keep in mind that TCP streams can potentially slow down to nearly zero bytes-per-second in the worst case scenario, which makes it important to ensure that poor TCP performance to client A doesn't block the TCP communications to/from clients B, C, D, etc. This can be accomplished either via non-blocking I/O and select() (or poll() or similar), or asynchronous networking, or via multiple threads; avoid blocking I/O unless you are implementing a thread-per-socket model (and probably avoid that model as well, since a thread that is indefinitely-blocked-inside-recv() is difficult to shut down cleanly)
Think about under what circumstances (if any) it is acceptable for a client to never receive a particular packet at all; are there situations where that is okay? Or must the entire system grind to a halt until every receiver has received every packet in the group, regardless of how long that might take?
If you want to get really fancy, you can look into Forward Error Correction algorithms that encode data across packets, such that the receiver can still decode all of the data even if it never receives (up to a certain percentage of) the packets. This makes the need for a re-transmit request less likely, at the cost of making all of the packets slightly larger.
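As a toy illustration of the Forward Error Correction idea, the simplest scheme is a single XOR parity packet per group, which lets the receiver rebuild any one lost packet in that group. This Python sketch assumes all packets in a group have the same length; the function names are illustrative, and production systems typically use stronger codes (Reed-Solomon, for example) that survive more than one loss:

```python
def xor_parity(packets):
    """Build one parity packet for a group of equal-length packets."""
    parity = bytearray(len(packets[0]))
    for packet in packets:
        for i, byte in enumerate(packet):
            parity[i] ^= byte
    return bytes(parity)


def recover_missing(received, parity):
    """Rebuild the single missing packet of a group.

    `received` holds every packet of the group except the lost one.
    XOR-ing them with the parity cancels them out, leaving the lost packet.
    """
    missing = bytearray(parity)
    for packet in received:
        for i, byte in enumerate(packet):
            missing[i] ^= byte
    return bytes(missing)
```

With a group of three data packets plus one parity packet, the overhead is one extra packet in four, and a single loss in the group no longer needs a TCP retransmit request.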
I am trying to read packets in a router, like this in python:
import socket

# (skipping the exception handling code here)
# 0x0003 is ETH_P_ALL: capture frames of every protocol
s = socket.socket(socket.AF_PACKET, socket.SOCK_RAW, socket.ntohs(0x0003))
while True:
    p = s.recvfrom(2000)
    pkt = p[0]
    # process pkt here ...
Answers to a related question (36115971) say that parameters and methods for UDP vs TCP data are different (some say recv is for TCP and recvfrom is for UDP, and others say the opposite; similarly, some say 1024 as buffer size for TCP and larger for UDP, and again some say the reverse). In my case of reading in a router, I do not have different sockets for TCP and UDP, so I need to read both from the same socket, and I am a bit confused about how I should read the incoming packets.
(1) Should I use recv() or recvfrom(), if I want to read both TCP and UDP packets?
(2) Do the calls return data one packet at a time, or do they return after the buffer is filled up? eg, if I have a large buffer of 4096 bytes, and the incoming streaming 2 packets have 2400 bytes each, will the call return as soon as the 1st packet ends, or will it return after filling up the buffer from the 2nd packet also?
(2a) same question, but if I have a smaller buffer of 2000 bytes. It is clear that on the 1st call I will get the first 2000 bytes of the 1st packet. But on the next call, will I get the last 400 bytes of the 1st packet, or the first 2000 bytes of the 2nd packet?
(3) If I am delayed in making the next call, maybe because I was busy processing the 1st dataset, am I in danger of losing data, or will the OS keep its internal queue of the incoming packets to be given to me when I call the next time? If the OS keeps its internal queue, where can I find information about its size?
NOTE: Some of the given replies have been divergent, so let me put in some boundaries to my question. Hopefully these restrictions will help to give more specific answers.
(a) My objective is to sniff the incoming packets with python sockets only. So other solutions involving tcpdump or tshark etc are outside the scope.
(b) The objective is to only sniff for incoming packets. Additional details like packet reordering (for connection oriented protocols like TCP) are outside the scope, actually they are avoidable overhead.
If you're reading packets from a raw socket (as shown in your source code), then you can easily read all packets from the same socket. Be sure this is what you intend to do. A raw socket is for doing packet inspection for troubleshooting, forensic, security or educational purposes. You cannot easily communicate with another system this way.
And likewise, the receive calls will not differ here by protocol because you are not actually using TCP or UDP, you're simply receiving the raw packets that those protocols build and decode.
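Since the socket hands you whole link-layer frames, telling TCP from UDP is a matter of parsing the headers yourself. A minimal Python sketch, assuming Ethernet framing and IPv4 only; the function name classify_frame is hypothetical, and IPv6 or other link types would need extra cases:

```python
import struct

ETHERTYPE_IPV4 = 0x0800
ETHERTYPE_VLAN = 0x8100  # IEEE 802.1Q tag


def classify_frame(frame):
    """Return 'tcp', 'udp', 'other', or None if the frame is too short."""
    if len(frame) < 14:
        return None
    # Ethernet header: 6-byte dst MAC, 6-byte src MAC, 2-byte Ethertype.
    ethertype = struct.unpack("!H", frame[12:14])[0]
    offset = 14
    if ethertype == ETHERTYPE_VLAN:
        # A 4-byte VLAN tag pushes the real Ethertype 4 bytes further in.
        if len(frame) < 18:
            return None
        ethertype = struct.unpack("!H", frame[16:18])[0]
        offset = 18
    if ethertype != ETHERTYPE_IPV4:
        return "other"
    if len(frame) < offset + 20:
        return None
    protocol = frame[offset + 9]  # IPv4 protocol field
    return {6: "tcp", 17: "udp"}.get(protocol, "other")
```

In the loop from the question, this would be called as classify_frame(pkt) on each received frame.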
(1) Should I use recv() or recvfrom(), if I want to read both TCP and UDP packets?
Either one will work. recv() will return to you only the actual packet data, while recvfrom will return to you the data along with metadata about the packet, including the interface from which the data was received (and other things defined in struct sockaddr_ll from the packet(7) man page).
(2) Do the calls return data one packet at a time, or do they return after the buffer is filled up? eg, if I have a large buffer of 4096 bytes, and the incoming streaming 2 packets have 2400 bytes each, will the call return as soon as the 1st packet ends, or will it return after filling up the buffer from the 2nd packet also?
When using a raw socket like this, you get exactly one packet at a time. You will never get more than one. If the buffer you give is not large enough, then the packet will be truncated (with the ending bytes discarded).
(2a) same question, but if I have a smaller buffer of 2000 bytes. It is clear that on the 1st call I will get the first 2000 bytes of the 1st packet. But on the next call, will I get the last 400 bytes of the 1st packet, or the first 2000 bytes of the 2nd packet?
As noted above, a packet that does not fit in the buffer is truncated, so the next call returns the next packet, not the leftover bytes. Generally speaking, though, packets on most networks are limited to about 1514 bytes. This is because the traditional "MTU" (Maximum Transmission Unit) that is configured on the network interface is 1500 bytes, and usually an Ethernet header containing two MAC addresses (6 bytes each) plus a two-byte Ethertype is prepended to that. In a switch or router, you may also see packets that carry an additional 4-byte VLAN tag (IEEE 802.1Q). (But some networks internally use "jumbo" packets up to about 9K in size for specific purposes.)
You should also understand that, in writing an application, one can send UDP datagrams (or TCP buffers) larger than the maximum packet size. In that case, the OS breaks those up into smaller chunks for sending (and they are re-assembled on the destination side before being handed to an application). When you're receiving raw packets like this, you will see the packets in their low-level, possibly fragmented, state.
(3) If I am delayed in making the next call, maybe because I was busy processing the 1st dataset, am I in danger of losing data, or will the OS keep its internal queue of the incoming packets to be given to me when I call the next time? If the OS keeps its internal queue, where can I find information about its size?
The OS will keep a queue of packets for you. The size is of course limited, since there is no way you would be able to keep up with, say, a 1Gb NIC at full line rate (let alone a 10Gb or higher NIC), so packets are dropped once the queue is full. The size is configured in a system-specific way. On Linux -- and probably other Unix-based systems -- you can call getsockopt with SOL_SOCKET / SO_RCVBUF to get an idea of the queue space available.
On Linux, at least, the size can be set with setsockopt up to a system-imposed maximum (which itself can be configured with various sysctl settings).
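In Python, reading and changing that buffer size might look like the sketch below. It uses an ordinary UDP socket so it runs without root privileges, but the same two calls apply to the AF_PACKET socket; the helper name resize_receive_buffer and the 1 MiB request are arbitrary illustrative choices:

```python
import socket


def resize_receive_buffer(sock, requested_bytes):
    """Ask the OS for a receive buffer size and report what it granted."""
    before = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested_bytes)
    after = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    return before, after


if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        before, after = resize_receive_buffer(s, 1 << 20)
        print("SO_RCVBUF before:", before, "after:", after)
```

The value read back can differ from the request: Linux doubles the requested value to allow for bookkeeping overhead and caps it at the net.core.rmem_max sysctl, so it is worth checking what was actually granted.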
I think you should not do that, because TCP provides guarantees such as reliability, ordering, flow control, and congestion control, whereas UDP does not guarantee anything.
These parameters are fixed by the operating system when the socket is created. That is why I think you cannot do what you are describing.
Open two different sockets, one native UDP sock and one native TCP sock.
Using ARP poisoning, I am in the middle of the connection between (1) the router and (2) the victim computer. How can I retransmit the packets to their destination? (preferably with Scapy)
I have this :
send(ARP(op=ARP.is_at, psrc=router_ip, hwdst=victim_mac, pdst=victim_ip))
send(ARP(op=ARP.is_at, psrc=victim_ip, hwdst=router_mac, pdst=router_ip))
Reviewing Scapy's API documentation suggests these alternatives:
The send function accepts 2 additional arguments that could prove useful:
loop: send the packets endlessly if not 0.
inter: time in seconds to wait between 2 packets.
Therefore, executing the following statement would send the packets in an endless loop:
send([ARP(op=ARP.is_at, psrc=router_ip, hwdst=victim_mac, pdst=victim_ip),
ARP(op=ARP.is_at, psrc=victim_ip, hwdst=router_mac, pdst=router_ip)],
inter=1, loop=1)
The sr function accepts 3 arguments that could prove useful:
retry: if positive, how many times to resend unanswered packets. if negative, how many consecutive unanswered probes before giving up. Only the negative value is really useful.
timeout: how much time to wait after the last packet has been sent. By default, sr will wait forever and the user will have to interrupt (Ctrl-C) it when he expects no more answers.
inter: time in seconds to wait between each packet sent.
Since no answers are expected to be received for the sent ARP packets, specifying these arguments with the desired values enables sending the packets in a finite loop, in contrast to the previous alternative, which forces an endless one.
On the down side, this is probably a bit less efficient since resources are allocated towards packet receipt and handling, but this is negligible.
Therefore, executing the following statement would send the packets in a finite loop of 1000 iterations:
sr([ARP(op=ARP.is_at, psrc=router_ip, hwdst=victim_mac, pdst=victim_ip),
ARP(op=ARP.is_at, psrc=victim_ip, hwdst=router_mac, pdst=router_ip)],
retry=999, inter=1, timeout=1)
I have to write a reliable, totally-ordered multicast system from scratch in Python. I can't use any external libraries. I'm allowed to use a central sequencer.
There seem to be two immediate approaches:
1. write an efficient system, attaching a unique id to each multicasted message, having the sequencer multicast sequence numbers for the message id's it receives, and sending back and forth ACK's and NACK's.
2. write an inefficient flooding system, where each multicaster simply re-sends each message it receives once (unless it was sent by that particular multicaster.)
I'm allowed to use the second option, and am inclined to do so.
I'm currently multicasting UDP messages (which seems to be the only option,) but that means that some messages might get lost. That means I have to be able to uniquely identify each sent UDP message, so that it can be re-sent according to #2. Should I really generate unique numbers (e.g. using the sender address and a counter) and pack them into each and every UDP message sent? How would I go about doing that? And how do I receive a single UDP message in Python, and not a stream of data (i.e. socket.recv)?
The flooding approach can cause a bad situation to get worse. If messages are dropped due to high network load, having every node resend every message will only make the situation worse.
The best approach to take depends on the nature of the data you are sending. For example:
Multimedia data: no retries, a dropped packet is a dropped frame, which won't matter when the next frame gets there anyway.
Fixed period data: Recipient node keeps a timer that is reset each time an update is received. If the time expires, it requests the missing update from the master node. Retries can be unicast to the requesting node.
If neither of these situations applies (every packet has to be received by every node, and the packet timing is unpredictable, so recipients can't detect missed packets on their own), then your options include:
Explicit ACK from every node for each packet. Sender retries (unicast) any packet that is not ACKed.
TCP-based grid approach, where each node manually repeats received packets to neighbor nodes, relying on TCP mechanisms to ensure delivery.
You could possibly rely on recipients noticing a missed packet upon reception of one with a later sequence number, but this requires the sender to keep the packet around until at least one additional packet has been sent. Requiring positive ACKs is more reliable (and provable).
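Whichever option is chosen, each message needs an identifier that receivers can ACK or detect gaps in. A minimal Python sketch using only the standard library, assuming a 32-bit sender ID and a 64-bit per-sender counter packed in front of the payload; the header layout and the function names are illustrative assumptions, not a standard format:

```python
import struct

# Network byte order: 4-byte sender ID, 8-byte sequence number.
HEADER = struct.Struct("!IQ")


def pack_message(sender_id, seq, payload):
    """Prefix the payload with (sender_id, seq) so it is uniquely identified."""
    return HEADER.pack(sender_id, seq) + payload


def unpack_message(datagram):
    """Split a received datagram back into (sender_id, seq, payload)."""
    sender_id, seq = HEADER.unpack_from(datagram)
    return sender_id, seq, datagram[HEADER.size:]
```

On a UDP socket, each recvfrom() call returns exactly one datagram, so the bytes it returns can be passed straight to unpack_message(); the stream behaviour applies only to TCP sockets.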
The approach you take is going to depend very much on the nature of the data that you're sending, the scale of your network and the quantity of data you're sending. In particular it is going to depend on the number of targets each of your nodes is connected to.
If you're expecting this to scale to a large number of targets for each node and a large quantity of data then you may well find that the overhead of adding an ACK/NAK to every packet is sufficient to adversely limit your throughput, particularly when you add retransmissions into the mix.
As Frank Szczerba has said, multimedia data has the benefit of being able to recover from lost packets. If you have any control over the data that you're sending, you should try to design the payloads so that you minimise the susceptibility to dropped packets.
If the data that you're sending cannot tolerate dropped packets and you're trying to scale to high utilisation of your network, then perhaps UDP is not the best protocol to use. Implementing a series of TCP proxies (where each node retransmits, unicast, to all other connected nodes - similar to your flooding idea) would be a more reliable mechanism.
With all of that said, have you considered using true multicast for this application?
Just saw the "homework" tag... these suggestions might not be appropriate for a homework problem.
IMHO, you should choose an existing reliable UDP protocol. There are several you can choose, take a look at this SO question: What do you use when you need reliable UDP?
I personally like and use MoldUDP, which is the protocol used by Nasdaq's ITCH market data feed.