urllib2.urlopen timeout works only when connected to Internet - python

I'm working on Python 2.7 code that reads a value from an HTML page using the urllib2 library. I want the urllib2.urlopen call to time out after 5 seconds when there is no Internet connection and jump to the remaining code.
It works as expected when the computer is connected to a working Internet connection, and for testing, if I set timeout=0.1 it times out immediately without opening the URL, as expected. But when there is no Internet, the timeout does not work whether I set it to 0.1, 5, or any other value. It simply never times out.
This is my code:
import urllib2

url = "https://alfahd.witorbit.net/fingerprint.php?n"
try:
    response = urllib2.urlopen(url, timeout=5).read()
    print response
except Exception as e:
    print e
Result when connected to Internet with timeout value 5:
180
Result when connected to Internet with timeout value 0.1 :
<urlopen error timed out>
Seems timeout is working.
Result when NOT connected to the Internet, with any timeout value (it times out after about 40 seconds every time I open the URL, regardless of the value I set for timeout=):
<urlopen error [Errno -3] Temporary failure in name resolution>
How can I make urllib2.urlopen time out when there is no Internet connectivity? Am I missing something? Please guide me to solve this issue. Thanks!

Because name resolution happens before the request is made, it is not subject to the timeout. You can prevent this name-resolution error by providing the IP for the host in your /etc/hosts file. For example, if the host is subdomain.example.com and the IP is 10.10.10.10, you would add the following line to the /etc/hosts file:
10.10.10.10 subdomain.example.com
Alternatively, you may be able to simply use the IP address directly; however, some web servers require that you use the hostname, in which case you'll need to modify the hosts file so the name resolves offline.
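If editing /etc/hosts isn't practical, another workaround is to enforce your own deadline on the DNS lookup before calling urlopen. Here is a minimal sketch for Python 2.7, assuming the URL from the question; the resolve_with_timeout helper and the thread-based approach are my own illustration, not part of urllib2:
import socket
import threading
import urllib2
import urlparse

def resolve_with_timeout(hostname, timeout):
    # Run the blocking DNS lookup in a daemon thread so we can give up
    # after `timeout` seconds instead of waiting the ~40 s system default.
    result = []
    def worker():
        try:
            result.append(socket.gethostbyname(hostname))
        except socket.error:
            pass
    t = threading.Thread(target=worker)
    t.daemon = True
    t.start()
    t.join(timeout)
    return result[0] if result else None

url = "https://alfahd.witorbit.net/fingerprint.php?n"
host = urlparse.urlparse(url).hostname
if resolve_with_timeout(host, 5) is None:
    print "no connectivity (DNS lookup failed or timed out)"
else:
    print urllib2.urlopen(url, timeout=5).read()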

Related

Why do I get "HTTP/1.1 400 Bad Request" error on execution?

I want to send (through a proxy) a request to a website. The script, built with the socket library in Python, looks like this:
import socket
TargetDomainName="www.stackoverflow.com"
TargetIP="151.101.65.69"
TargetPort=80
ProxiesIP=["107.151.182.247"]
ProxiesPort=[80]
Connect=f"CONNECT {TargetDomainName} HTTP/1.1"
Connection=socket.socket(socket.AF_INET, socket.SOCK_STREAM)
Connection.connect((ProxiesIP[0],ProxiesPort[0]))
Connection.sendto(str.encode(Connect),(TargetIP, TargetPort))
Connection.sendto(("GET /" + TargetIP + " HTTP/1.1\r\n").encode('ascii'), (TargetIP, TargetPort))
Connection.sendto(("Host: " + ProxiesIP[0] + "\r\n\r\n").encode('ascii'), (TargetIP, TargetPort))
print (Connection.recv(1028))
Connection.close()
My question is: why do I get the 400 Bad Request error?
You did not indicate whether the 400 reply is coming from the proxy or the target server. But both of your commands are malformed.
Your CONNECT command is missing a port number, a Host header (required since you are requesting HTTP 1.1), and the trailing line breaks that terminate the command properly.
Your GET command is sent to the target server (if CONNECT is successful) and should not be requesting a resource by IP address. It is also sending the wrong value for the Host header. The command is relative to the target server, so it needs to specify the target server's host name.
Also, you should be using send()/sendall() instead of sendto().
Try something more like this instead:
import socket
TargetDomainName="www.stackoverflow.com"
TargetIP="151.101.65.69"
TargetPort=80
ProxiesIP=["107.151.182.247"]
ProxiesPort=[80]
Connection=socket.socket(socket.AF_INET, socket.SOCK_STREAM)
Connection.connect((ProxiesIP[0], ProxiesPort[0]))
Connection.sendall((f"CONNECT {TargetDomainName}:{TargetPort} HTTP/1.1\r\n").encode("ascii"))
Connection.sendall((f"Host: {TargetDomainName}:{TargetPort}\r\n\r\n").encode("ascii"))
print (Connection.recv(1028))
Connection.sendall(("GET / HTTP/1.1\r\n").encode('ascii'))
Connection.sendall((f"Host: {TargetDomainName}\r\n").encode('ascii'))
Connection.sendall(("Connection: close\r\n\r\n").encode('ascii'))
print (Connection.recv(1028))
Connection.close()
You really need to read the proxy's reply before sending the GET command. The proxy will send its own HTTP reply indicating whether it successfully connected to the target server or not.
You really should not be implementing HTTP manually though, there are plenty of HTTP libraries for Python that can handle these details for you. Python even has one built-in: http.client
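For instance, here is a rough equivalent of the corrected code above using http.client, which performs the CONNECT handshake for you via set_tunnel(); the proxy address is the (possibly no longer working) one from the question:
import http.client

TargetDomainName = "www.stackoverflow.com"
ProxyIP = "107.151.182.247"  # proxy from the question; may be stale
ProxyPort = 80

conn = http.client.HTTPConnection(ProxyIP, ProxyPort, timeout=10)
conn.set_tunnel(TargetDomainName, 80)  # issues the CONNECT request for us
conn.request("GET", "/")  # http.client adds the Host header automatically
resp = conn.getresponse()
print(resp.status, resp.reason)
conn.close()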

Python SOAP client - connection issue

I am using Python soap API client Zeep and here is the code that I have written:
from zeep import Client

def myapi(request):
    client = Client("https://siteURL.asmx?wsdl")
    key = client.service.LogOnUser('myusername', 'mypassword')
    print(key)
It gives me the error: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond
However, when I run the command below, the URL works fine and lists all the services it offers:
python -mzeep https://siteURL.asmx?wsdl
Please help me understand why the code above is not working.
PS: I couldn't share the site URL which I am trying to connect to.
Additional Info: The site/page is accessible only through intranet and I am testing locally from intranet itself.
Traceback error:
Exception Type: ConnectionError at /music/mypersonalapi/
Exception Value: HTTPSConnectionPool(host='URL I have hidden', port=81):
Max retries exceeded with url: /ABC/XYZ/Logon.asmx
(Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x0546E770>:
Failed to establish a new connection:
[WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond',))
Please note: I have removed URL and Host information from my traceback due to confidentiality
What this command does:
python -mzeep https://site/url.asmx?wsdl
is:
c = Client("https://site/url.asmx?wsdl")
c.wsdl.dump()
Both alternatives use port 443, since that is the default HTTPS port.
From your traceback we see
Exception Value: HTTPSConnectionPool(host='URL I have hidden', port=81):
which would have been similar to
python -mzeep https://site:81/url.asmx?wsdl
I.e. the command line and your code are not connecting to the same address (also note that port values below 1024 require system-level permissions to use -- in case you're writing/controlling the service too).
The last line does say "..failed because the connected party did not properly respond after a period of time..", but that is not the underlying reason. In line 3 you can read
Max retries exceeded with url: /ABC/XYZ/Logon.asmx
in other words, you've tried (and failed) to log on too many times, and the server is probably doubling the time it takes to respond with every attempt (a well-known mitigation strategy against clients that repeatedly fail to log in, i.e. look like an attack). The extended delay is most likely what causes the error message you see at the bottom.
You'll need to wait a while, or reset your account for the service, and if the service is yours then perhaps turn off this feature during development?
Maybe this can help. I had the same connection problem (Max retries exceeded...). I solved it by increasing the transport timeout:
from requests import Session
from zeep import Client
from zeep.transports import Transport

session = Session()
client = Client(wsdl=wsdl, transport=Transport(session=session, timeout=120))

Get socket level errors from tornado HTTPClient

Suppose that I have http server located at (host:port) and I'm trying to use HTTPClient example from the tornado docs to fetch some data from that server, so I have a code below
from tornado import httpclient

http_client = httpclient.HTTPClient()
try:
    response = http_client.fetch("http://host:port", connect_timeout=2)
    print response.body
except httpclient.HTTPError as e:
    print e.code
    print e.message
http_client.close()
This works fine, and when everything goes well (the host is available, the server is working, etc.) my client successfully receives data from the server. Now I want to handle errors that can occur during the connection process, depending on the error code. I'm trying to get the error code via e.code in my exception handler, but it contains HTTP error codes only, and I need some lower-level error codes, for instance:
1) If the server is not available on the destination host and port, it prints out
599
HTTP 599: [Errno 111] Connection refused
2) If it is impossible to reach the host (for example, due to network connectivity problems), I receive the same HTTP 599 error code:
599
HTTP 599: [Errno 101] Network is unreachable
So I need to get the errno in my exception handler, since it indicates the actual lower-level socket error, and the only solution I see for now is to grep it out of the text message I receive (e.message), but that solution looks rough and ugly.
I think your only option is to pull it out of e.message:
import re

m = re.search(r"\[Errno (\d+)\]", e.message)
if m:
    errno = m.group(1)
On my platform (Linux), I actually get a socket.gaierror exception when I take your code example and use a bogus URL, which provides direct access to errno. But if you're dealing with tornado.httpclient.HTTPError, I don't think there's any way for you to get at it directly.
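Putting those two observations together, here is a hedged sketch of the handler (the regex extraction from above plus the errno module's lookup table; "http://host:port" is the placeholder from the question):
import errno
import re
import socket
from tornado import httpclient

http_client = httpclient.HTTPClient()
try:
    response = http_client.fetch("http://host:port", connect_timeout=2)
    print response.body
except httpclient.HTTPError as e:
    m = re.search(r"\[Errno (\d+)\]", str(e))
    if m:
        code = int(m.group(1))
        # errno.errorcode maps numbers to names, e.g. 111 -> 'ECONNREFUSED'
        print code, errno.errorcode.get(code, "unknown")
except socket.gaierror as e:
    # DNS failures may surface directly (as noted above), with .errno available
    print e.errno
http_client.close()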

Logging into a website which uses Microsoft Forefront "Threat Management Gateway"

I want to use python to log into a website which uses Microsoft Forefront, and retrieve the content of an internal webpage for processing.
I am not new to python but I have not used any URL libraries.
I checked the following posts:
How can I log into a website using python?
How can I login to a website with Python?
How to use Python to login to a webpage and retrieve cookies for later usage?
Logging in to websites with python
I have also tried a couple of modules such as requests, but I am still unable to understand how this should be done. Is it enough to send the username/password? Or should I somehow use cookies to authenticate? Any sample code would really be appreciated.
This is the code I have so far:
import requests

NAME = 'XXX'
PASSWORD = 'XXX'
URL = 'https://intra.xxx.se/CookieAuth.dll?GetLogon?curl=Z2F&reason=0&formdir=3'

def main():
    # Start a session so we can have persistent cookies
    session = requests.session()
    # This is the form data that the page sends when logging in
    login_data = {
        'username': NAME,
        'password': PASSWORD,
        'SubmitCreds': 'login',
    }
    # Authenticate
    r = session.post(URL, data=login_data)
    # Try accessing a page that requires you to be logged in
    r = session.get('https://intra.xxx.se/?t=1-2')
    print r

main()
but the above code results in the following exception on the session.post line:
raise ConnectionError(e)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='intra.xxx.se', port=443): Max retries exceeded with url: /CookieAuth.dll?GetLogon?curl=Z2F&reason=0&formdir=3 (Caused by <class 'socket.error'>: [Errno 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond)
UPDATE:
I noticed that I was providing the wrong username/password.
Once that was corrected, I get an HTTP 200 response with the above code, but when I try to access any internal site I get an HTTP 401 response. Why is this happening? What is wrong with the above code? Should I be using the cookies somehow?
TMG can be notoriously fussy about what types of connections it blocks. The next step is to find out why TMG is blocking your connection attempts.
If you have access to the TMG server, log in to it, start the TMG management user-interface (I can't remember what it is called) and have a look at the logs for failed requests coming from your IP address. Hopefully it should tell you why the connection was denied.
It seems you are attempting to connect to it over an intranet. One way I've seen it block connections is if it receives them from an address it considers to be on its 'internal' network. (TMG has two network interfaces as it is intended to be used between two networks: an internal network, whose resources it protects from threats, and an external network, where threats may come from.) If it receives on its external network interface a request that appears to have come from the internal network, it assumes the IP address has been spoofed and blocks the connection. However, I can't be sure that this is the case as I don't know what this TMG server's internal network is set up as nor whether your machine's IP address is on this internal network.

I/O error(socket error): [Errno 111] Connection refused

I have a program that uses urllib to periodically fetch a URL, and I see intermittent errors like:
I/O error(socket error): [Errno 111] Connection refused.
It works 90% of the time, but the other 10% it fails. If I retry the fetch immediately after it fails, it succeeds. I'm unable to figure out why this is so. I tried to see if any ports are available, and they are. Any debugging ideas?
For additional info, the stack trace is:
File "/usr/lib/python2.6/urllib.py", line 203, in open
return getattr(self, name)(url)
File "/usr/lib/python2.6/urllib.py", line 342, in open_http
h.endheaders()
File "/usr/lib/python2.6/httplib.py", line 868, in endheaders
self._send_output()
File "/usr/lib/python2.6/httplib.py", line 740, in _send_output
self.send(msg)
File "/usr/lib/python2.6/httplib.py", line 699, in send
self.connect()
File "/usr/lib/python2.6/httplib.py", line 683, in connect
self.timeout)
File "/usr/lib/python2.6/socket.py", line 512, in create_connection
raise error, msg
Edit - A Google search isn't very helpful; what I got out of it is that the server I'm fetching from sometimes refuses connections. How can I verify it's not a bug in my code and that this is indeed the case?
Use a packet sniffer like Wireshark to look at what happens. You need to see a SYN-flagged packet going out, a SYN+ACK-flagged packet coming in, and then an ACK-flagged packet going out. After that, the port is considered open on the local side.
If you only see the first packet and the error message comes after several seconds of waiting, the other side is not answering at all (like in: unplugged cable, overloaded server, misguided packet was discarded) and your local network stack aborts the connection attempt. If you see RST packets, the host actually denies the connection. If you see "ICMP Port unreachable" or host unreachable packets, a firewall or the target host inform you of the port actually being closed.
Of course you cannot expect the service to be available at all times (consider all the points of failure in between you and the data), so you should try again later.
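If you want to script that observation instead of using the Wireshark GUI, here is a hedged sketch using scapy (a third-party packet-capture library, not mentioned in the thread; requires root) to print the TCP flags of the handshake packets:
from scapy.all import sniff

def show_flags(pkt):
    # S = SYN, SA = SYN+ACK, A = ACK, R = RST
    print pkt.sprintf("%IP.src% -> %IP.dst% flags=%TCP.flags%")

# Capture the first ten TCP packets to/from port 80
sniff(filter="tcp port 80", prn=show_flags, count=10)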
Getting an ECONNREFUSED errno means that your kernel was refused a connection at the other end, so if it's a bug, it's either in your kernel or in the other end.
What you can do is to trap the error in a very specific way and try again in a little while, since this seems to work:
# This is Python > 2.5 code
import errno, time, urllib

for attempt in range(MAXIMUM_NUMBER_OF_ATTEMPTS):
    try:
        data = urllib.urlopen(url).read()  # your urllib call here
    except EnvironmentError as exc:  # replace " as " with ", " for Python<2.6
        if exc.errno == errno.ECONNREFUSED:
            time.sleep(A_COUPLE_OF_SECONDS)
        else:
            raise  # re-raise otherwise
    else:  # we tried, and we had no failure, so
        break
else:  # we never broke out of the for loop
    raise RuntimeError("maximum number of unsuccessful attempts reached")
Replace the two all-caps constants with your favourite numbers.
I previously had this problem with my EC2 instance (I was using couchdb to serve resources -- I'm considering Amazon's S3 for the future).
One thing to check (assuming EC2) is that the couchdb port is added to the open ports in your security policy.
I specifically encountered
"[Errno 111] Connection refused"
over EC2 when the instance was stopped and started. The problem seems to be a pidfile race. The solution for me was to kill couchdb (entirely and properly) via:
pkill -f couchdb
and then restarting with:
/etc/init.d/couchdb restart
I'm not exactly sure what's causing this. You can try looking in your socket.py (mine is a different version, so the line numbers from the trace don't match, and I'm afraid some other details might not match as well).
Anyway, it seems like good practice to put your URL-fetching code in a try: ... except: ... block, and to handle failures with a short pause and a retry. The URL you're trying to fetch may be down or too loaded, and that's something you'll only be able to handle with a retry anyway.
It seems that the server is not running properly, so check the connection from a terminal with:
telnet ip port
For example:
telnet localhost 8069
If it prints "Connected to localhost", there is no problem with the connection.
If instead it prints "Connection refused", there is a problem with the connection.
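A quick Python equivalent of that telnet check (a minimal sketch; localhost and 8069 are the example values above):
import socket

def port_open(host, port, timeout=5):
    # Try a plain TCP connect; ECONNREFUSED and friends all raise socket.error.
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.settimeout(timeout)
    try:
        s.connect((host, port))
        return True
    except socket.error:
        return False
    finally:
        s.close()

print port_open("localhost", 8069)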
