urllib2 fails when URL has a port number appended - python

The code below:
import urllib2
file = urllib2.urlopen("http://foo.bar.com:82")
works just fine on my Mac (OS X 10.8.4 running Python 2.7.1). It opens the URL and I can parse the file with no problems.
When I try the EXACT same code (these two lines) in GoDaddy Python 2.7.3 (or 2.4) I receive an error:
urllib2.URLError: <urlopen error (111, 'Connection refused')>
The problem seems to have something to do with the port (:82), which is an essential part of the address. I have tried using a forwarding address with masking, etc., and nothing works.
Any idea why it would work in one environment and not in the other (ostensibly similar) environment? Any ideas how to get around this? I also tried Mechanize to no avail. Previous posts have suggested focusing on urllib2.HTTPBasicAuthHandler, but it works fine on my OS X environment without anything special.
Ideas are welcome.

Connection refused means that your operating system tried to contact the remote host, but got a "closed port" message.
Most likely, this is because of a firewall between GoDaddy and foo.bar.com. Perhaps foo.bar.com is only reachable from your computer or your local network, but it could also be GoDaddy blocking access to unusual ports.

From a quick look at the GoDaddy support forums, it looks like they only support outgoing requests to ports 80 (HTTP) and 443 (HTTPS) on their shared hosts. See e.g.
http://support.godaddy.com/groups/web-hosting/forum/topic/curl-to-ports-other-than-80/
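A quick way to confirm this from the GoDaddy host is to try a raw TCP connection to the same host and port; if that also fails, the problem is the network, not urllib2 (a minimal sketch, with foo.bar.com:82 standing in for the real address):
import socket
# If this also raises "Connection refused", the port is blocked or closed somewhere
# between the GoDaddy host and foo.bar.com; urllib2 is not at fault.
s = socket.create_connection(("foo.bar.com", 82), timeout=10)
print("TCP connection to port 82 succeeded")
s.close()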

Related

Python v2.7 Requests v2.5.1 - all get requests return HTTP Error 503

I'm running anaconda python 2.7 and the latest Requests library on a Windows 7 desktop connected to a corporate network with an outbound proxy server at 10.0.0.255.
My python script reads as follows:
import requests
r = requests.get("http://google.com")
I've also tried many different intranet and internet URLs, HTTP and HTTPS, all with the same result: a 503 error.
I thought the proxy might somehow be at fault, so I added the proxies=prox argument with the following definition:
prox = {
    "http": "10.0.0.255:80",
    "https": "10.0.0.255:443"
}
That made no difference, but it's entirely possible that my ports are wrong, as the documentation is a bit sparse on this parameter (only one example).
I did try localhost and it gave me a different error:
ConnectionError: ('Connection aborted.', error(10061, 'No connection could be made because the target machine actively refused it'))
My machine hates me. Great.
At this point I'm stumped. It's probably something related to all the security c_rp on this machine, but I'm not sure what my next move is.
I am a N00b to python, and haven't coded in 20 years. That said, I wrote hard core C and ran memory debugs deep in architecture to find leaks, so I'm not completely dumb, just very, very rusty.
Doing a GET request on localhost won't do anything unless there is a webserver running on localhost:80. Set up a node.js webserver running on localhost and then try again.
Most corporate proxies use port 8080 for all traffic.
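If the corporate proxy does listen on 8080, a minimal sketch would look like this (the proxy address comes from the question; the port and the scheme prefixes are assumptions):
import requests
# 10.0.0.255 is the proxy from the question; port 8080 and the http:// prefix are guesses
# typical of corporate proxies.
proxies = {
    "http": "http://10.0.0.255:8080",
    "https": "http://10.0.0.255:8080",
}
r = requests.get("http://google.com", proxies=proxies, timeout=10)
print(r.status_code)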

"getaddrinfo failed", what does that mean?

File "C:\Python27\lib\socket.py", line 224, in meth
return getattr(self._sock,name)(*args) gaierror: [Errno 11004]
getaddrinfo failed
Getting this error when launching the hello world sample from here:
http://bottlepy.org/docs/dev/
It most likely means the hostname can't be resolved.
import socket
socket.getaddrinfo('localhost', 8080)
If it doesn't work there, it's not going to work in the Bottle example. You can try '127.0.0.1' instead of 'localhost' in case that's the problem.
The problem, in my case, was that some install at some point defined an environment variable http_proxy on my machine when I had no proxy.
Removing the http_proxy environment variable fixed the problem.
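A quick way to check for a stray proxy variable before digging further (a minimal sketch using only the standard library):
import os
# Print any proxy-related environment variables; an unexpected value here can
# make urllib try to resolve a proxy host that doesn't exist.
for var in ("http_proxy", "https_proxy", "HTTP_PROXY", "HTTPS_PROXY"):
    print("%s=%s" % (var, os.environ.get(var)))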
The problem in my case was that I needed to add environment variables for http_proxy and https_proxy.
E.g.,
http_proxy=http://your_proxy:your_port
https_proxy=https://your_proxy:your_port
To set these environment variables in Windows, see the answers to this question.
Make sure you pass a proxy argument in your command, for example:
pip install --proxy=http://proxyhost:proxyport pixiedust
Use a proxy port that has a direct connection (with or without a password). Speak with your corporate IT administrator. A quick way is to find the network settings used in Eclipse, which will have a direct connection.
You will encounter this issue often if you work behind a corporate firewall. You will have to check your Internet Explorer settings: Internet Options > Connections > LAN settings.
Uncheck "Use automatic configuration script".
Check "Use a proxy server for your LAN", and make sure you have entered the right address and port.
Click OK.
Come back to the Anaconda terminal and you can retry the install commands.
Maybe this will help someone. I had my proxy set up in the Python script but kept getting the error mentioned in the question.
Below is the block of code; it takes my username and password as constants defined at the beginning of the script.
import urllib.request as req   # assumption: on Python 2 this would be "import urllib2 as req"

if use_proxy:
    proxy = req.ProxyHandler({'https': proxy_url})
    auth = req.HTTPBasicAuthHandler()
    opener = req.build_opener(proxy, auth, req.HTTPHandler)
    req.install_opener(opener)
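For reference, the constants the block above expects might look like this (hypothetical proxy host and credentials; the real values are defined at the top of the script):
use_proxy = True
proxy_url = 'http://myuser:mypassword@proxy.example.com:8080'  # hypothetical host; credentials embedded in the proxy URL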
If you are using a corporate laptop and have not connected to Direct Access or the office VPN, the above block will throw an error. All you need to do is connect to your organization's VPN and then run your Python script.
Thanks
I spent a good few hours fixing this, but the solution turned out to be really simple. My FTP server address started with ftp://. I removed the prefix and the code started working.
FTP address before:
ftp_css_address = "ftp://science-xyz.xyz.xyz.int"
Changed it to:
ftp_css_address = "science-xyz.xyz.xyz.int"

getaddrinfo unable to resolve host

I'm having a weird problem. I have this Python application and when I try to open a url in the application, for instance urllib2.urlopen("http://google.com", None) I get the following error:
IOError: [Errno socket error] [Errno 8] nodename nor servname provided, or not known
However when I do the same thing on the python command line interpreter it works fine. The same python executable is being used for both the application and the command line.
nslookup google.com seems to work fine. I opened up Wireshark, and it looks like when the application tries to open google.com, only an mDNS query goes out, for "My-Name-MacBook-Pro.local". However, when the command line tries to open google.com, a regular DNS query goes out for "google.com". I found that if I hardcoded Google's IP in /etc/hosts, the request from the application finally started working.
It seems something weird must be altering how the application resolves domain names, but I have no idea what could be doing this.
I'm running Mac OSX 10.6.7 and Python 2.6.
Edit: I am not using a proxy to access the internet
Check that you don't have the HTTP_PROXY environment variable set, which could be preventing this (in which case, that would be a bad error message). With that cleared, try again, like:
import urllib
r = urllib.urlopen('http://www.google.com')
print r.read()

Fabric says 'no route to host', even though I can access it over SSH

I'm having some issues uploading a file to a server with Fabric. I get the following output:
Fatal error: Low level socket error connecting to host ssh.example.com: No route to host
Aborting.
The weird thing is, when I connect manually using ssh (same host string, I copy-pasted it from the fabfile to make sure), it works perfectly as expected. I can also use scp to copy the file to the same location manually.
The offending line in my Fabfile is this, if it helps:
put('media.tgz','/home/private/media.tgz')
Also, I'm connecting to a different host from the rest of my fabfile using the @hosts() decorator (this particular method uploads static media, which is served from somewhere different from the app itself).
I had the same issue. I never investigated it fully, but using the IP address instead of the hostname helped. This particular host had an IPv6 AAAA record while my client had no IPv6 connectivity; maybe that is the reason. HTH
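If the AAAA record is indeed the cause, pinning the task to the server's IPv4 address is a quick workaround (a sketch against the Fabric 1.x API; the IP address is a placeholder):
from fabric.api import hosts, put

@hosts('203.0.113.10')    # the server's IPv4 address instead of 'ssh.example.com'
def upload_media():
    put('media.tgz', '/home/private/media.tgz')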

strange urllib2 failures on some systems

I've got a python script that simply grabs a page with urllib2, and then proceeds to use BeautifulSoup to parse that stuff. Code is:
class Foo(Bar):
    def fetch(self):
        try:
            self.mypage = urllib2.urlopen(self.url + 'MainPage.htm', timeout=30).read()
        except urllib2.URLError:
            sys.stderr.write("Error: system at %s not responding\n" % self.url)
            sys.exit(1)
The system I'm trying to access is remote and behind a Linux router that does port forwarding between the public static IP and the LAN IP of the actual system.
I was getting failures on some systems, and at first I thought about a bug in urllib2/Python, or some weird TCP stuff (the HTTP server is actually an embedded card in an industrial system). But then I tried other systems and urllib2 works as expected, and I can also correctly access the HTTP server using links2 or wget even on systems where urllib2 fails.
Ubuntu 10.04 LTS 32-bit behind Apple AirPort NAT on remote ADSL: everything works
Mac OS X 10.6 in LAN with the server, remote behind NAT, etc.: everything works
Ubuntu 10.04 LTS 64-bit with public IP: urllib2 times out, links and wget work
Gentoo Linux with public IP: urllib2 times out, links and wget work
I have verified with tcpdump on the Linux router (HTTP server end) that urllib2 always completes the TCP handshake, even from the problematic systems, but then it seems to hang there. I tried toggling syncookies and ECN on and off, but that didn't change anything.
How could I debug and possibly solve this issue?
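One thing worth trying is turning on urllib2's HTTP-level tracing to see exactly where the request stalls after the handshake (a minimal sketch; the URL is a placeholder for your device's address):
import urllib2
handler = urllib2.HTTPHandler(debuglevel=1)   # print the raw request/response exchange to stdout
opener = urllib2.build_opener(handler)
page = opener.open('http://203.0.113.5/MainPage.htm', timeout=30).read()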
You could also switch to using httplib2.
After nearly 17 months I don't have access to that specific system anymore, so I won't be able to accept any real answer to this question.
At least I can tell future readers what answers are not good:
changing to httplib2
no, we're not getting ICMP redirects
no, we don't even drop ICMP fragmentation packets
cheers.
