Python Sockets: gethostbyaddr : Reverse DNS Lookup Failure - python

I've been having a problem with getting the host name while using socket.gethostbyaddr(ip_addr) on specific sites.
I will not go into detail about which site this is not working for.
Getting the host by name works fine for every site I've tried so far, but when I try to get the site name back from the IP address I get an error saying host not found.
A fix or an alternative would be nice for this to have complete data. If there is no fix I can merely leave out the host name. no biggie. Thanks for the help.
# not full code
import socket
hostip = socket.gethostbyname(hostname)
print(socket.gethostbyaddr(hostip))
Error: socket.herror: [Errno 11004] host not found

Not every IP address has reverse DNS. Sometimes this is on purpose, sometimes it's because you're looking at an internal address and there's no need for it inside the network so it wasn't worth setting up, sometimes someone just screwed up.
Why would anyone do this on purpose? Most commonly, because multiple domain names map to the same IP address.
For example, a shared hosting site might map sites for three of its customers, www.foo.com, www.bar.com, and www.baz.com, all to 1.2.3.4. HTTP gives you the requested host name in a Host: header, so the server can figure out which site your browser wanted to go to. But outside of HTTP (or some other higher-level protocol), there's no way to figure out which of the three names you meant with 1.2.3.4. So, there's nothing they can provide that would be useful to you. There may also be a name like shared_1234.hostingcompany.com which is useful to their own IT people, in which case they might provide that, but otherwise, they won't bother with any reverse DNS.
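Since a missing reverse record is normal, the practical fix is to treat the lookup as optional. Here is a minimal sketch of that; the helper name reverse_lookup is made up for illustration, and returning None for "no name" is just one possible convention.

```python
import socket

def reverse_lookup(ip_addr):
    """Return the reverse-DNS name for ip_addr, or None if there is none.

    Hypothetical helper: the name and the None convention are assumptions,
    not part of the socket module.
    """
    try:
        # gethostbyaddr returns (hostname, aliaslist, ipaddrlist)
        hostname, _aliases, _addresses = socket.gethostbyaddr(ip_addr)
        return hostname
    except (socket.herror, socket.gaierror):
        # herror: no reverse record for this address ("host not found")
        # gaierror: the argument itself could not be resolved
        return None
```

With this, the calling code can store the host name when one exists and simply leave the field empty otherwise, which matches what the question says is acceptable.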

Related

Get gTLD or ccTLD from IP address

There are many questions on SO related to fetching an IP address from URL, but not vice versa.
As the title suggests, I would like to get the website URL of its respective IP address. For instance:
>>> import socket
>>> print(socket.gethostbyname('google.com'))
This looks up the domain and returns 172.217.20.14. I am looking for the counterpart, e.g.:
>>> print(socket.getnamebyhost('172.217.20.14'))
Anything similar that would return the domain as google.com for the IP specified.
Is this possible to do in python3?
If yes, how can this be achieved?
UPDATE
Unfortunately, the way I'm approaching this is wrong. There are IPs that share a one-to-many relationship i.e. the nameserver points to numerous urls, unless the PTR record indicates otherwise. My question rephrased:
How do IP-to-domain data providers like ipinfo.io return top-level domains for a single IP?
To my understanding, the A or AAAA records play an important role, but the only thing I get from these are ns rather than the domain. I don't know how to extract the gTLD or ccTLD from the records. I'm open to any suggestions, if anyone is willing to share an answer on how to parse gTLD(s) or ccTLD(s) from any IP. Preferably in python, but a shell script would also suffice.
socket.gethostbyaddr('172.217.20.14') would be the right way to go here, but it will not necessarily return the domain you expect. Here's why:
Domain to IP resolution goes like:
domain > root server > origin server > origin server's hostname to IP configurations.
Now to reverse engineer it, we have to take into account:
There can be multiple domains sharing that same IP address as is the case with shared hosting.
Assuming the domain has a dedicated IP, nslookup or gethostbyaddr 'should' return the domain name, but there can be proxy servers in front, like Cloudflare and whatever Google is using.
So even if you do this manually, you cannot find out the actual IP that Google's server is running on, as exposing it would open their central server to all kinds of attacks, most importantly DDoS.
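To make the reverse direction concrete: a reverse lookup does not search forward records for the address. It asks for the PTR record stored under a special name derived from the IP, so it can only ever return whatever the owner of that address block chose to publish there. A small sketch using the standard ipaddress module shows the name that gets queried; the helper name ptr_query_name is made up for illustration.

```python
import ipaddress

def ptr_query_name(ip_addr):
    """Return the DNS name whose PTR record a reverse lookup of ip_addr asks for.

    Hypothetical helper name; reverse_pointer itself is a standard attribute
    of ipaddress.IPv4Address and ipaddress.IPv6Address.
    """
    return ipaddress.ip_address(ip_addr).reverse_pointer

# The octets are reversed and placed under in-addr.arpa:
print(ptr_query_name("172.217.20.14"))  # 14.20.217.172.in-addr.arpa
```

If no PTR record exists under that name, or it points at an infrastructure name rather than the site's domain, then gethostbyaddr has nothing better to offer. That is why the domains hosted on an IP cannot be recovered from DNS alone.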

NET::ERR_CERT_COMMON_NAME_INVALID - Error Message

I built a website some time ago with Flask. Now all of a sudden when I try to navigate there I get the following:
NET::ERR_CERT_COMMON_NAME_INVALID
Your connection is not private
Attackers might be trying to steal your information from www.mysite.org (for example, passwords, messages, or credit cards). Learn more
Does anyone know what's going on?
The error means: The host name you use in the web browser does not match one of the names present in the subjectAlternativeName extension in the certificate.
If your server has multiple DNS entries, you need to include all of them in the certificate to be able to use them with https. If you access the server using its IP address, like https://10.1.2.3, then the IP address also has to be present in the certificate (of course this only makes sense if you have a static IP address that never changes).
The certificate subject alternative name can be a domain name or IP address. If the certificate doesn’t have the correct subjectAlternativeName extension, users get a NET::ERR_CERT_COMMON_NAME_INVALID error letting them know that the connection isn’t private. If the certificate is missing a subjectAlternativeName extension, users see a warning in the Security panel in Chrome DevTools that lets them know the subject alternative name is missing.
https://support.google.com/chrome/a/answer/7391219?hl=en
For Chrome 58 and later, only the subjectAlternativeName extension, not commonName, is used to match the domain name and site certificate. So, if you are missing the Subject Alternative Name in your certificate then you will experience the NET::ERR_CERT_COMMON_NAME_INVALID error.
In order to have a Subject Alternative Name (SAN) on an SSL certificate, you must first edit your OpenSSL configuration. On Ubuntu/Debian, that can be found at /etc/ssl/openssl.cnf. Find the section of that file with the heading [ v3_ca ] and add a line with your SAN there. Each entry needs a type prefix such as DNS: or IP:, and multiple entries are separated by commas:
subjectAltName = DNS:www.example.com
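The matching rule described above can be sketched in Python. This is a simplified illustration, not the browser's actual algorithm: it assumes a certificate dict in the shape returned by ssl.SSLSocket.getpeercert(), looks only at DNS entries in subjectAltName, and handles only a single left-most wildcard label. The function names are made up for the example.

```python
def san_dns_names(cert):
    """Return the DNS entries from a getpeercert()-style certificate dict."""
    # subjectAltName is a tuple of (type, value) pairs, e.g. ("DNS", "example.org")
    return [value for kind, value in cert.get("subjectAltName", ()) if kind == "DNS"]

def hostname_matches(cert, hostname):
    """Simplified check: does hostname match any DNS entry in the SAN list?"""
    host = hostname.lower().rstrip(".")
    for pattern in san_dns_names(cert):
        pattern = pattern.lower()
        if pattern == host:
            return True
        if pattern.startswith("*."):
            # A wildcard stands for exactly one left-most label
            head, _, rest = host.partition(".")
            if head and rest == pattern[2:]:
                return True
    return False

# Hypothetical certificate data for illustration only
cert = {"subjectAltName": (("DNS", "example.org"), ("DNS", "*.example.net"))}
print(hostname_matches(cert, "example.org"))      # True
print(hostname_matches(cert, "www.example.org"))  # False: the www name is not listed
```

The second call shows a common way to end up with this error: the certificate lists the bare domain, but the site is visited under its www name (or the other way round).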

urllib2 - get resource if you already know the IP

In my python script, I am fetching pages but I already know the IP of the server.
So I could save it the hassle of doing a DNS lookup if I can somehow pass in the IP and hostname in the request.
So, if I call
http://111.111.111.111/
and then pass the hostname in the HOST attribute, I should be OK. However, the issue I see is on the server side: if the user looks at the incoming request (i.e. REQUEST_URI), they will see I went for the IP.
Anyone have any ideas?
First, the main idea is suspect. You may "know" the IP of the server, but that knowledge is temporary: how long it stays correct is governed by the DNS TTL. For a stable configuration, the server admin can publish a DNS record with a long TTL (e.g. a few days), so the lookup will almost always be answered by the nearest caching resolver or nscd. For a changing configuration, the TTL can be reduced to a few seconds or even to 0 (meaning no caching), which is useful for some kinds of load balancers. What you are building is your own resolver cache that ignores TTLs, and that can send requests to non-functioning or wrong servers and return incorrect content. So I suggest not doing this.
If you are sure you must do this and you can't use external tools such as a custom resolver or even /etc/hosts, try installing a custom "opener" (see the urllib2.build_opener() function in the documentation) that overrides the DNS lookup. However, I have never done this myself; this is based only on reading the documentation just now.
You can add the ip address mapping to the hosts file.
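If you still want to try it, here is a minimal sketch of the idea from the question: connect to the known IP and send the site name in the Host header. It uses Python 3's http.client rather than urllib2, which is a substitution on my part, and the function name fetch_via_ip and its parameters are made up for illustration. It covers plain HTTP only.

```python
import http.client

def fetch_via_ip(ip, hostname, path="/", port=80, timeout=5.0):
    """GET path from the server at ip, presenting hostname in the Host header.

    Hypothetical helper: no DNS lookup happens because the connection
    target is already a numeric address.
    """
    conn = http.client.HTTPConnection(ip, port, timeout=timeout)
    try:
        # Supplying our own Host header stops http.client from adding one
        # based on the connection address.
        conn.request("GET", path, headers={"Host": hostname})
        response = conn.getresponse()
        return response.status, response.read()
    finally:
        conn.close()
```

On the server-side concern: http.client puts only the path in the request line, so the server sees the path plus the Host header carrying the name. The IP does not appear in the request itself.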

get the host the user is coming from

I wanted to know how I can get the host the user came from in Python.
How do I extract it?
I tried this:
host = self.request._environ['HTTP_HOST']
but it's empty...
Do you have any idea what it should be?
Thanks.
self.request._environ['HTTP_HOST'] tells you your host name.
You can use self.request.remote_addr to get the remote IP address. You'll need to do a reverse DNS lookup (which might fail) if you need a host name from that.
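The difference between the two values can be seen in a bare WSGI sketch. This is not the framework from the question; it assumes that self.request there wraps a standard WSGI environ, in which HTTP_HOST is the Host header the client sent (your site's name) and REMOTE_ADDR is the address of the connecting client. The app function is hypothetical.

```python
def app(environ, start_response):
    """Tiny WSGI app that reports both values side by side."""
    # Name of the site the client asked for (taken from the Host header)
    site_host = environ.get("HTTP_HOST", "")
    # IP address of the client that opened the connection
    client_ip = environ.get("REMOTE_ADDR", "")
    body = ("site=%s client=%s" % (site_host, client_ip)).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

Only the client IP says anything about where the user is coming from. Turning it into a name is the reverse lookup mentioned above, and it can come back empty.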

Alternate host/IP for python script

I want my Python script to access a URL through an IP specified in the script instead of through the default DNS for the domain. Basically I want the equivalent of adding an entry to my /etc/hosts file, but I want the change to apply only to my script instead of globally on the whole server. Any ideas?
Whether this works or not will depend on whether the far end site is using HTTP/1.1 name-based virtual hosting or not.
If they're not, you can simply replace the hostname part of the URL with their IP address, per @Greg's answer.
If they are, however, you have to ensure that the correct Host: header is sent as part of the HTTP request. Without that, a virtual hosting web server won't know which site's content to give you. Refer to your HTTP client API (Curl?) to see if you can add or change default request headers.
You can use an explicit IP number to connect to a specific machine by embedding that into the URL: http://127.0.0.1/index.html is equivalent to http://localhost/index.html
That said, it isn't a good idea to use IP numbers instead of DNS entries. IPs change a lot more often than DNS entries, meaning your script has a greater chance of breaking if you hard-code the address instead of letting it resolve normally.
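For the "like /etc/hosts, but only for my script" part of the question, one approach is to wrap socket.getaddrinfo for the duration of the request. The sketch below is an assumption-laden illustration rather than an official API: the context manager name host_overrides is made up, the patch is process-wide while active (all threads see it), and it only affects libraries that resolve names through Python's socket module.

```python
import contextlib
import socket

@contextlib.contextmanager
def host_overrides(mapping):
    """Temporarily resolve the host names in mapping to the given IP strings.

    Hypothetical helper: works by swapping socket.getaddrinfo and
    restoring it on exit.
    """
    original = socket.getaddrinfo

    def patched(host, *args, **kwargs):
        # Substitute the IP for known names; pass everything else through
        return original(mapping.get(host, host), *args, **kwargs)

    socket.getaddrinfo = patched
    try:
        yield
    finally:
        socket.getaddrinfo = original

# Usage: inside the block, "intranet.invalid" resolves to 127.0.0.1
with host_overrides({"intranet.invalid": "127.0.0.1"}):
    infos = socket.getaddrinfo("intranet.invalid", 80, socket.AF_INET, socket.SOCK_STREAM)
```

Because the URL still contains the real host name, the HTTP client keeps sending the correct Host: header, so name-based virtual hosting continues to work. The warning above about hard-coded IPs going stale applies here as well.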
