Python (but also C): strange sorting done by gethostbyname

I have found this problem in Python but I was also able to reproduce it with a basic C program.
I am in CentOS 6 (tested also on 7), I have not tested on other Linux distributions.
I have an application on 2 VMs. One has IP address 10.0.13.30 and the other 10.0.13.56. They share an FQDN to allow DNS-based load balancing (and high availability) via gethostbyname or getaddrinfo (this is what the Python docs suggest).
If my client application is on a different sub-net (10.0.12.x for example), I have no problem: socket.gethostbyname(FQDN) randomly returns 10.0.13.30 or 10.0.13.56.
But if my client application is on the same sub-network, it always returns the same entry, and it always seems to be the "closest" one: deployed on 10.0.13.31 it always returns 10.0.13.30, and on 10.0.13.59 it always returns 10.0.13.56.
On these servers, CLI commands such as ping and dig return the results in a different order almost every time.
I have searched many threads on this and concluded that it seems to be a kind of prioritization done by glibc "to improve the chances of success", but I have not found any way to disable it.
In my case the 2 client VMs and the 2 server VMs are clearly on the same VMware host connected to a single router, so I do not see why the fact that the last byte of the server's IP is closest to the last byte of the client's IP should be taken into account.
This is a replication of a problem that I have at a customer site, so just moving the VMs to a different sub-net is not an option for me :-(
Does anybody have an idea how to get correct load balancing within the same sub-network? I can partially control the VM configuration, so if a setting has to be changed I can do it.

Instead of hoping that the standard library will do load balancing for you, use socket.getaddrinfo() and explicitly choose one of the resulting hosts at random. This also makes it easy to fail over to a different host if the first one you try is unavailable.
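A minimal sketch of that approach (the FQDN and port below are placeholders): shuffle the addresses that getaddrinfo returns, then try them in order, so the pick is random and a dead host simply falls through to the next one.
import random
import socket

def connect_balanced(fqdn, port):
    # getaddrinfo returns one tuple per A record for the FQDN.
    addrs = socket.getaddrinfo(fqdn, port, socket.AF_INET, socket.SOCK_STREAM)
    random.shuffle(addrs)  # override whatever order the resolver chose
    for _family, _socktype, _proto, _canon, sockaddr in addrs:
        try:
            return socket.create_connection(sockaddr, timeout=5)
        except OSError:
            continue  # this host is down: fail over to the next address
    raise OSError('no reachable host for %s' % fqdn)

conn = connect_balanced('service.example.com', 8080)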

Related

Use Python to naively find an Arduino's IP address from another PC on the local network

I have an ESP8266 NodeMCU device running a local HTTP server. I followed the quick-start instructions here.
My goal is to have a large number of these devices running in sync. To do that, I wrote this script:
#!/usr/bin/env python
import time
import sys
import socket
import requests

def myFunction():
    # This is what I have right now...
    ipAddresses = ["192.168.1.43", "192.168.1.44"]
    #
    # Instead, I need to search the local network for all the arduinos
    # [ESP8266s] and append their addresses to the array.
    #
    for address in ipAddresses:
        TurnOnLED(address)

def TurnOnLED(address):
    try:
        r = requests.post('http://' + address + '/LedON')
        r.close()
    except requests.exceptions.ConnectionError:
        pass

# Main
def Main():
    try:
        print("Press CTRL-C to stop.")
        while True:
            myFunction()
            time.sleep(60)
    except KeyboardInterrupt:
        sys.exit(0)

if __name__ == "__main__":
    Main()
This works, and allows me to control all of my devices from my desktop PC. My difficulty is with finding each IP address dynamically. I considered assigning a static IP address, but I would like them to be retail-optimized, so I cannot guarantee any particular IP address being unused out-of-the-box.
For the same reason, I want to be able to install the same Arduino code on all of them. This is what I mean by naive: I want my Python script to find the IP address of each device over the local network without any [unique] 'help' from the devices themselves. I want to install the same code on each device and have my script find them all without any additional setup.
My first attempt was using the Python socket module and looking for every hostname that begins with "ESP-", since those are the first four characters of the factory hostname on all the boards. My query consistently turned up nothing, though.
This answer here has some good information, but my solution runs on a Macintosh, and I will not have the full hostname, only "ESP-".
So, now that I know what the constraints are, and what is just copied from the tutorial, here are my 2 cents (keep in mind that I, too, am just a hobbyist at everything that involves voltage, and not even a good one).
1st strategy : PC is server
So, if I assume that your devices are, for example, temperature sensors, and you want to frequently grab all the temperatures, then one strategy could be for those devices to all connect to the server every minute, report the temperature, and disconnect, for example using an HTTPS request.
So host the server on your PC, and have your devices behave as clients.
# Example request from MicroPython
import urequests
rep = urequests.get(URL)  # URL of the server endpoint, e.g. 'http://192.168.1.2:8000/report'
print(rep.text)
rep.close()
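On the PC side, a minimal stdlib sketch of such a server (the port and the idea of carrying the reading in the URL are assumptions):
# PC-side sketch: a tiny HTTP server that logs which device reported
# in and what it sent (port 8000 is an assumption).
from http.server import BaseHTTPRequestHandler, HTTPServer

class ReportHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # client_address[0] is the device's IP; self.path carries the
        # payload, e.g. /report?temp=21.5
        print('Device %s sent %s' % (self.client_address[0], self.path))
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b'ok')

HTTPServer(('', 8000), ReportHandler).serve_forever()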
2nd strategy : a third party is server
A variant of that strategy is to have a sort of central data repository that acts as the server. So, I mean, your PC is not the server, nor is any ESP, but another machine (which could be one ESP, a Raspberry Pi, or even a droplet or EC2 instance in the cloud; it doesn't matter, as long as you can host a backend on it).
So the ESP8266s are clients of that server. And so is your PC. That server has to be able to answer the requests "set value from client" and "get value for the PC".
The drawback of those first 2 strategies is that they are OK when the devices are the ones sending data. They fit less well if the devices are (as in your LED example, but I surmise that was just an example) mostly the ones receiving data. Because then, you would have to wait for the ESP to connect before it can get the message "you need to switch on the LED".
3rd strategy : central registry
Or you can have it both ways. That is, keep your current architecture: when your PC wants something from the ESP8266s, it connects to them (they are the servers) and sends its requests.
But, in parallel to that, the ESP8266s also behave as clients to register themselves in a central directory, which could be on the PC or on a third party. The objective of that central directory is not to gather the data from the ESP8266s, just to keep an up-to-date list of them. Each minute, in parallel with their server activity, the ESP8266s send an "I am alive" message to this central directory. And then the PC (which may or may not be hosting that central directory) just needs to fetch all IPs associated with a not-too-old "I am alive" message to get its list of IPs. A minimal sketch of that directory follows.
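Here is a sketch of that directory (the port, the payload, and the 120-second freshness window are all assumptions): devices send a UDP "I am alive" datagram every minute, and the PC calls recently_alive() to get the current list of IPs.
import socket
import time

REGISTRY_PORT = 9999      # assumed port the heartbeats arrive on
FRESH_SECONDS = 120       # a heartbeat older than this is considered stale

last_seen = {}            # device IP -> time of its last heartbeat

def listen_for_heartbeats():
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(('', REGISTRY_PORT))
    while True:
        data, (ip, _port) = sock.recvfrom(64)
        if data == b'I am alive':
            last_seen[ip] = time.time()

def recently_alive():
    now = time.time()
    return [ip for ip, t in last_seen.items() if now - t < FRESH_SECONDS]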
4th strategy : ARP
Once your PC is on, it can scan the network using ARP requests, with scapy. Search for "scanning local network with scapy".
For example, here is a tutorial.
From there, you get a list of IPs, associated with MAC addresses. And now you need to know which ones are the ESP8266s. Here too you can apply several ideas. For example, you may use the MAC address to guess which ones are the ESP8266s. Or you may simply try a dummy request on every found IP to check which ones are ESP8266s (using a specific API of the server code you wrote for the ESP8266).
Or you may decide to host the server on each ESP8266 on a specific port, like 8123, so that you can quickly rule out devices that are not listening on port 8123. A sketch of such a scan follows.
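Here is a minimal scapy sketch of that scan (run it as root; the 192.168.1.0/24 subnet is an assumption, adjust it to your network):
# ARP-scan the local subnet with scapy. Each reply gives an
# (IP, MAC) pair of a live host.
from scapy.all import ARP, Ether, srp

answered, _ = srp(
    Ether(dst='ff:ff:ff:ff:ff:ff') / ARP(pdst='192.168.1.0/24'),
    timeout=2, verbose=False)

for _sent, received in answered:
    # received.psrc is the IP, received.hwsrc the MAC; you could filter
    # on the MAC vendor prefix, or probe each IP with your dummy request.
    print(received.psrc, received.hwsrc)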
5th strategy : don't reinvent the wheel
The best strategy is clearly a mix of my second and my third: having a directory, and a third party handling messages. But that is reinventing message brokers.
There is one well-known middleware fitted for the ESP8266 (I mean, for low-profile IoT devices): MQTT.
That needs some more tutorial reading and long trial and error on your behalf. You can start here, for example. But that is just the first example I found on Google searching "MQTT ESP8266 micropython"; there are zillions of resources on that.
It may seem not to be the easiest way (compared to just copying and pasting some code that lists all the alive IPs on a network). But in the long run, if you intend to have so many ESP8266s that you can't afford to assign them static IPs and simply list their IPs, you probably really need a message broker like that, and preferably not one that you reinvent.
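Here is a minimal sketch of that setup, assuming a broker reachable at 192.168.1.10 and a leds/all topic (both assumptions); the device side uses MicroPython's umqtt.simple, and deriving the client id from the chip's unique id lets the exact same code run on every board:
# Device side (MicroPython). Same code on every board: the client id
# comes from the chip's unique id, so no per-device setup is needed.
import machine
import ubinascii
from umqtt.simple import MQTTClient

def on_message(topic, msg):
    if msg == b'LedON':
        pass  # drive the LED pin here

client_id = ubinascii.hexlify(machine.unique_id())
client = MQTTClient(client_id, '192.168.1.10')  # broker IP is assumed
client.set_callback(on_message)
client.connect()
client.subscribe(b'leds/all')
while True:
    client.wait_msg()  # block until the broker pushes a message

# PC side (CPython, paho-mqtt): one publish reaches every device.
#   import paho.mqtt.publish as publish
#   publish.single('leds/all', 'LedON', hostname='192.168.1.10')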

How to make HTTP requests from different clients appear to come from the same IP address?

I'm using a 3rd-party API that invalidates the OAuth token if the requests came from different IP addresses. This is causing issues because the service runs on multiple hosts.
Ideally, I want only the requests to this particular API to be routed through a single IP.
I thought about setting up a proxy server, but I'm concerned that I won't be able to scale this proxy beyond 1 machine.
Any suggestions?
The ideal option here would of course be to obtain an OAuth token for each machine. (Or, even better, to get the service to allow you to share a token across IPs.) But I assume there's some reason you can't do that.
In which case you probably do want a proxy server here.
Routing only the requests to this particular API through that proxy is dead simple: set up an explicit proxy rather than a transparent one, and specify that explicit proxy for these particular requests only.
Since you haven't shown us, or even described, your code, I can't show you how to do that with whatever library you're using, but here's how to do it with requests, and it's not much harder with the stdlib urllib or most other third-party libraries.
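A minimal sketch with requests (the proxy address and API URL are placeholders); only the calls that pass proxies= go through the proxy, so the rest of your traffic keeps its normal routing:
import requests

# Your single-egress-IP forward proxy (placeholder address).
PROXIES = {
    'http': 'http://proxy.internal:3128',
    'https': 'http://proxy.internal:3128',
}

resp = requests.get('https://api.example.com/v1/resource',
                    headers={'Authorization': 'Bearer <token>'},
                    proxies=PROXIES)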
But, for completeness: it's not at all impossible to make the separate machines appear to have the same IP address, as long as all of your machines are behind a router that you control. In fact, that's exactly what you get with a typical home DSL/cable setup via NAT: each machine has its own internal-only address, but they all share one public address.
But it's probably not what you want. For one thing, if your machines are actually GCP hosts, you don't control the router, and you may not even be able to control whether they're on the same network (in case you were thinking of running a software router to pipe them all through). Also, NAT causes all kinds of problems for servers. And, since your worry here is scaling, using NAT is a nightmare once you have to scale beyond a single subnet, even more so if these instances are meant to be servers (which seems likely, if you're running them on GCP).
And finally, to use NAT to talk to just one service, you either need very complicated routing tables, or an extra network interface per machine (that you can put behind a different router). So I doubt it's what you actually want here.

ZeroRPC auto-assign free port number

I am using ZeroRPC for a project where there may be multiple instances running on the same machine. For this reason, I need to be able to auto-assign unused port numbers. I know how to accomplish this with regular sockets using socket.bind(('', 0)) or with ZeroMQ using the bind_to_random_port method, but I cannot figure out how to do this with ZeroRPC.
Since ZeroRPC is based on ZeroMQ, it must be possible.
Any ideas?
Having read the details about ZeroRPC-python's current state, the safest option to solve the task would be to create a central LotterySINGLETON that would, upon an instance's <-REQ/REP-> request, send back the next free port#.
This approach is isolated from ZeroRPC-dev/mod(s) modifications of the otherwise stable ZeroMQ API, and gives you full control over which port#-s are pre-configured/included-in/excluded-from the LotterySINGLETON's draws.
The other way around would be to try to bypass the ZeroRPC layer and ask ZeroMQ directly for the next random port, but the ZeroRPC documentation discourages bypassing its own controls imposed on the (otherwise pure) ZeroMQ framework elements (which is quite reasonable to emphasise, as bypassing erodes the consistency of the ZeroRPC layer's add-on operations & services, so it should rather be "obeyed" than "challenged" by trial/error...).
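A minimal pyzmq sketch of that LotterySINGLETON idea (the lottery's own port 6000 and the pool range are assumptions): a REP socket hands each requesting instance the next port from a pre-configured pool.
import zmq

PORT_POOL = iter(range(42000, 42100))  # ports reserved for instances
                                       # (a real one would recycle them)
ctx = zmq.Context.instance()
lottery = ctx.socket(zmq.REP)
lottery.bind('tcp://127.0.0.1:6000')   # the singleton's well-known port
while True:
    lottery.recv()                     # any request means "draw me a port"
    lottery.send_string(str(next(PORT_POOL)))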
The following will let ZMQ choose a free port:
s = zerorpc.Server(SomeClass())
s.bind('tcp://127.0.0.1:0')
The problem with this is that now you don't know which port it bound to. I managed to find the port with netstat and successfully connected to it, but that's probably not what you want to do. I made a separate question out of this: Find out bound ports of zerorpc server
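For reference, plain pyzmq can report the port it picked, either as the return value of bind_to_random_port or via the LAST_ENDPOINT socket option; whether zerorpc exposes its underlying socket is version-dependent, so the sketch below uses a bare zmq socket.
import zmq

ctx = zmq.Context.instance()
sock = ctx.socket(zmq.REP)
port = sock.bind_to_random_port('tcp://127.0.0.1')
print(port)  # the port ZMQ actually bound, e.g. 49152

# Equivalent after a wildcard bind:
sock2 = ctx.socket(zmq.REP)
sock2.bind('tcp://127.0.0.1:0')
print(sock2.getsockopt(zmq.LAST_ENDPOINT))  # e.g. b'tcp://127.0.0.1:49153'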

Decentralized networking in Python - How?

I want to write a Python script that will check the user's local network for other instances of the script currently running.
For the purposes of this question, let's say that I'm writing an application that runs solely via the command line, and will just update the screen when another instance of the application is "found" on the local network. Sample output below:
$ python question.py
Thanks for running ThisApp! You are 192.168.1.101.
Found 192.168.1.102 running this application.
Found 192.168.1.104 running this application.
What libraries/projects exist to help facilitate something like this?
One way to do this would be for the application in question to broadcast UDP packets, and for your application to receive them from the different nodes and then display them. The Twisted networking framework provides facilities for doing such a job, and the documentation provides some simple examples too.
Well, you could write something using the socket module. You would have to have two programs though: a server on the user's local computer, and a client program that interfaces with the server. The server would use the select module to listen for multiple connections. The client program would send something to the server when it is run, or whenever you want it to. The server could then print out which connections it is maintaining, including details such as the IP address.
This is documented extremely well at this link, more so than you need, but it will explain it to you as it did to me: http://ilab.cs.byu.edu/python/
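A minimal sketch of that select-based server (port 50007 is an assumption): each new connection is reported with its address, and closed connections are dropped.
import select
import socket

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
server.bind(('', 50007))
server.listen(5)

sockets = [server]
while True:
    readable, _, _ = select.select(sockets, [], [])
    for s in readable:
        if s is server:
            conn, addr = server.accept()
            sockets.append(conn)
            print('Found %s running this application.' % addr[0])
        elif not s.recv(1024):  # empty read: the client disconnected
            sockets.remove(s)
            s.close()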
You can try broadcast UDP; I found an example here: http://vizible.wordpress.com/2009/01/31/python-broadcast-udp/
You can have a server-based solution: a central server where clients register themselves, and query for other clients being registered. A server framework like Twisted can help here.
In a peer-to-peer setting, push technologies like UDP broadcasts can be used, where each client puts out a heartbeat packet every so often on the network, for others to receive. Basic modules like socket would help with that; a minimal sketch follows.
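A sketch of that heartbeat (the port and payload are assumptions): one function broadcasts, the other listens and prints each peer it hears.
import socket
import time

HEARTBEAT_PORT = 50000  # assumed; any free UDP port works

def broadcast_heartbeats():
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
    while True:
        s.sendto(b'ThisApp-alive', ('<broadcast>', HEARTBEAT_PORT))
        time.sleep(10)

def listen_for_peers():
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.bind(('', HEARTBEAT_PORT))
    while True:
        data, (ip, _port) = s.recvfrom(64)
        if data == b'ThisApp-alive':
            print('Found %s running this application.' % ip)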
Alternatively, you could go for a pull approach, where the interested peer needs to discover the others actively. This is probably the least straightforward. For one, you need to scan the network, i.e. find out which IPs belong to the local network, and go through them. Then you would need to contact each IP in turn. If your program opens a TCP port, you could try to connect to it and find out whether your program is running there. If you want your program to be completely ignorant of these queries, you might need to open an ssh connection to the remote IP and scan the process list for your program. All this might involve various modules and libraries. One you might want to look at is execnet.

Proper way to publish and find services on a LAN using Python

My app opens a TCP socket and waits for data from other users on the network using the same application. At the same time, it can broadcast data to a specified host on the network.
Currently, I need to manually enter the IP of the destination host to be able to send data. I want to be able to find a list of all hosts running the application and have the user pick which host to broadcast data to.
Is Bonjour/ZeroConf the right route to go to accomplish this? (I'd like it to be cross-platform: OS X/Win/*nix.)
it can broadcast data to a specified host on the network
This is a non-sequitur.
I'm presuming that you don't actually mean broadcast; you mean unicast, or just "send"?
Is Bonjour/ZeroConf the right route to go to accomplish this?
This really depends on your target environment and what your application is intended to do.
As Ignacio points out, you need to install the Apple software on Windows for Zeroconf/mDNS to work at the moment.
This might be suitable for small office / home use.
However larger networks may have Layer 2 Multicast disabled for a variety of reasons, at which point your app might be in trouble.
If you want it to work in the enterprise environment, then some configuration is required, but that doesn't have to be done at the edge (in the app client instances).
It could be via a DHCP option, or by DNS service records. In these cases you'd possibly be writing a queryable server to track active clients, much like a BitTorrent tracker.
Two things to consider while designing your networked app:
Would there ever be a reason to run more than one "installation" of your application on a network?
Always consider the implications of versioning: One client is more up to date than another, can they still talk to each other or at least fail gracefully?
Zeroconf/DNS-SD is an excellent idea in this case. It's provided by Bonjour on OS X and Windows (but must be installed separately or as part of an Apple product on Windows), and by Avahi on FOSS *nix.
I think that ZeroConf is a very good start. You may find this document useful.
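A minimal sketch using the third-party python-zeroconf package (pip install zeroconf); the service type, name, address, and port below are placeholders, and the exact ServiceInfo signature varies a little between versions of the library:
import socket
from zeroconf import ServiceBrowser, ServiceInfo, Zeroconf

zc = Zeroconf()

# Publish this instance so peers can find it.
info = ServiceInfo(
    '_myapp._tcp.local.',
    'myhost._myapp._tcp.local.',
    addresses=[socket.inet_aton('192.168.1.101')],
    port=9000,
)
zc.register_service(info)

# Browse for other instances of the same service type.
class Listener:
    def add_service(self, zc, type_, name):
        print('Found', zc.get_service_info(type_, name))
    def remove_service(self, zc, type_, name):
        print('Lost', name)
    def update_service(self, zc, type_, name):
        pass  # required by newer versions of the library

browser = ServiceBrowser(zc, '_myapp._tcp.local.', Listener())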
I keep a list on a webpage, which is nice if you need communication over the Internet.
<dl_service updated="2010-12-03 11:55:40+01:00">
  <client name="internal" ip="10.0.23.234" external_ip="1.1.1.1"/>
  <client name="bigone" ip="2.2.2.2" external_ip="2.2.2.2">
    <message type="connect" from="Bigone" to="internal" />
  </client>
</dl_service>
My initial idea was to add firewall punching and all that, but I just couldn't be bothered: too many of the hosts were using external IPs for it to be a problem.
But I really recommend Zeroconf, at least if you use Linux + Mac OS X; I don't know about Windows at all.
