I am using Python.org version 2.7 64 bit on Windows Vista 64 bit. I am looking at the docs and sample code for URLLIB here:
https://docs.python.org/3/howto/urllib2.html
...and trying to submit the following code to access data from the Guardian API:
from urllib2 import Request, urlopen, URLError
response = urllib.request.urlopen('http://beta.content.guardianapis.com/search?tag=football%2Fworld-cup-2014&api-key=uexnxqm5bfwca4tn2m47wnhv')
html = response.read()
print html
This is not working and is kicking out the following error:
Traceback (most recent call last):
File "C:/Python27/stack", line 4, in <module>
response = urllib.request.urlopen('http://beta.content.guardianapis.com/search?tag=football%2Fworld-cup-2014&api-key=uexnxqm5bfwca4tn2m47wnhv')
NameError: name 'urllib' is not defined
The page address for the documents points to a subdirectory called 'urllib2', but the code examples reference a module called 'urllib'. On PyPI I can find no installation for 'urllib'. If I just run the import statement, the code executes without causing an error, but with the rest of the code it does not work.
Can anyone tell me which 'urllib' module I should have installed and/or why the code is producing this error?
Thanks
You are using Python 2.7, but trying to follow a HOWTO written for Python 3.
Use the correct documentation instead: https://docs.python.org/2/howto/urllib2.html. Note how the URL contains a 2, not a 3, and how the styling of the documentation differs materially.
Next, you are importing several names from the urllib2 module:
from urllib2 import Request, urlopen, URLError
This means you have now bound the name urlopen (together with Request and URLError), so you don't (and can't) use the urllib2 module prefix in your code:
response = urlopen('http://beta.content.guardianapis.com/search?tag=football%2Fworld-cup-2014&api-key=uexnxqm5bfwca4tn2m47wnhv')
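As an aside, the %2F in that URL is a percent-encoded slash. Rather than hand-encoding, the standard library can build the query string for you. A minimal sketch, shown with Python 3's urllib.parse (in Python 2 the same urlencode lives in the urllib module):

```python
from urllib.parse import urlencode

# Build the query string instead of hand-encoding %2F and friends.
# The tag and api-key values are the ones from the question's URL.
params = {
    "tag": "football/world-cup-2014",
    "api-key": "uexnxqm5bfwca4tn2m47wnhv",
}
query = urlencode(params)
print(query)  # tag=football%2Fworld-cup-2014&api-key=...
```

urlencode handles the percent-escaping for every reserved character, so the slash in the tag never needs to be written as %2F by hand.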
Please use requests, or if you really need the urllib API, urllib3, which is shipped with requests.
Everything else has way too many gotchas, for example when it comes to SSL.
Related
Hi, I am new to Python; I use Python 3 on a Mac. I don't know if this is relevant. Now to the question: I need data from an API for school, but I get an error.
<module 'requests' from '/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/requests/__init__.py'>. Can somebody explain what this means?
import requests
requests.get('https://api.github.com')
print(requests)
You are printing the module requests instead of the response of your request.
Try this one:
import requests
res = requests.get('https://api.github.com')
print(res.content)
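The same slip can be reproduced with any module; it is not specific to requests. Printing a module object always gives that <module ...> repr, while printing a value the module produced gives the data you wanted. A quick stdlib-only illustration:

```python
import json  # any module shows the same behavior requests did

print(json)                      # <module 'json' from '...'> - the module itself
data = json.dumps({"ok": True})
print(data)                      # {"ok": true} - the value you actually wanted
```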
I've been trying to do this with repl.it and have tried several solutions on this site, but none of them work. Right now, my code looks like
import urllib
url = "http://www.pythonchallenge.com/pc/def/linkedlist.php?nothing=12345"
print (urllib.urlopen(url).read())
but it just says "AttributeError: module 'urllib' has no attribute 'urlopen'".
If I add import urllib.urlopen, it tells me there's no module named that. How can I fix my problem?
The syntax you are using for the urllib library is from Python v2. The library has changed somewhat for Python v3. The new notation would look something more like:
import urllib.request
response = urllib.request.urlopen("http://www.google.com")
html = response.read()
The html object is a bytes object containing the returned HTML of the site; call html.decode('utf-8') if you need a str. Much like with the original urllib library, you should not expect images or other linked resources to be included in this returned object.
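The decoding step can be sketched without the network call; the bytes literal below is just a hypothetical stand-in for what response.read() returns:

```python
# In Python 3, response.read() returns bytes, not str.
raw = b"<html><body>hello</body></html>"  # stand-in for response.read()
text = raw.decode("utf-8")                # now a str you can search or print
print(type(raw).__name__, type(text).__name__)  # bytes str
```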
The confusing part here is that, in Python 3, this would fail if you did:
import urllib
response = urllib.request.urlopen("http://www.google.com")
html = response.read()
This strange module-importing behavior is, I am told, intended and working as designed. But it is non-intuitive and awkward, and more importantly for you, it makes the situation harder to debug. Enjoy.
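You can see this for yourself without a network connection. The sketch below first clears any cached urllib modules so it behaves like a fresh interpreter, then shows that the bare package import does not expose the request submodule:

```python
import sys

# Forget any urllib modules loaded earlier, to mimic a fresh interpreter.
for name in [m for m in sys.modules if m == "urllib" or m.startswith("urllib.")]:
    del sys.modules[name]

import urllib
has_request_before = hasattr(urllib, "request")  # False: the bare package only

import urllib.request
has_request_after = hasattr(urllib, "request")   # True: submodule now bound

print(has_request_before, has_request_after)  # False True
```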
Python3
import requests

url = "http://www.pythonchallenge.com/pc/def/linkedlist.php?nothing=12345"
r = requests.get(url).content
print(r)
or
import urllib.request
url = "http://www.pythonchallenge.com/pc/def/linkedlist.php?nothing=12345"
r = urllib.request.urlopen(url).read()
print(r)
I want to download a file to python as a string. I have tried the following, but it doesn't seem to work. What am I doing wrong, or what else might I do?
from urllib import request
webFile = request.urlopen(url).read()
print(webFile)
The following example works.
from urllib.request import urlopen
url = 'http://winterolympicsmedals.com/medals.csv'
output = urlopen(url).read()
print(output.decode('utf-8'))
Alternatively, you could use requests, which provides a more human-readable syntax. Keep in mind that requests is a third-party dependency you must install, which may increase the complexity of deploying the application, depending on your production environment.
import requests
url = 'http://winterolympicsmedals.com/medals.csv'
output = requests.get(url).text
print(output)
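Since that payload is CSV, you may want rows rather than one big string. A sketch using the stdlib csv module on a stand-in snippet (the rows here are illustrative, not the real file's contents):

```python
import csv
import io

# Stand-in for the decoded download; the real file has many more rows.
text = "Year,City,Sport\n1924,Chamonix,Figure skating\n"
rows = list(csv.reader(io.StringIO(text)))
print(rows[0])  # ['Year', 'City', 'Sport']
print(rows[1])  # ['1924', 'Chamonix', 'Figure skating']
```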
In Python 3.x, use the 'urllib' package like this:
from urllib.request import urlopen
data = urlopen('http://www.google.com').read() #bytes
body = data.decode('utf-8')
Another good library for this is http://docs.python-requests.org
It's not built-in, but I've found it to be much more usable than urllib*.
I am working on an open-source project called RubberBand, which allows you to do what the title says: locally execute a Python file that is located on a web server. However, I have run into a problem. If a comma is located in a string (e.g. "http:"), it will return an error.
'''
RubberBand Version 1.0.1 'Indigo-Charlie'
http://www.lukeshiels.com/rubberband
CHANGE-LOG:
Changed Error Messages.
Changed Whole Code Into one function, rather than three.
Changed Importing required libraries into one line instead of two
'''
#Edit Below this line
import httplib, urlparse

def executeFromURL(url):
    if (url == None):
        print "!# RUBBERBAND_ERROR: No URL Specified #!"
    else:
        CORE = None
        good_codes = [httplib.OK, httplib.FOUND, httplib.MOVED_PERMANENTLY]
        host, path = urlparse.urlparse(url)[1:3]
        try:
            conn = httplib.HTTPConnection(host)
            conn.request('HEAD', path)
            CORE = conn.getresponse().status
        except StandardError:
            CORE = None
        if (CORE in good_codes):
            exec(url)
        else:
            print "!# RUBBERBAND_ERROR: File Does Not Exist On WEBSERVER #!"
RubberBand in three lines without error checking:
import requests

def execute_from_url(url):
    exec(requests.get(url).content)
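A Python 3 sketch of the same idea, kept to the standard library (urllib2/httplib became urllib.request; the fetch itself is untested here, but the exec step is demonstrated on a local string instead of a download):

```python
from urllib.request import urlopen

def execute_from_url(url):
    # urlopen raises HTTPError for non-2xx statuses, which replaces
    # the manual good_codes check from the original.
    source = urlopen(url).read().decode("utf-8")
    exec(source)

# The exec step itself, demonstrated on a local string instead of a download:
namespace = {}
exec("result = 2 + 2", namespace)
print(namespace["result"])  # 4
```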
You should use a return statement in your if (url == None): block, as there is no point in carrying on with your function.
Whereabouts in your code is the error? Is there a full traceback? URIs with commas parse fine with the urlparse module.
Is it perhaps httplib.ResponseNotReady when calling CORE = conn.getresponse().status?
Nevermind that error message, that was me quickly testing your code and re-using the same connection object. I can't see what would be erroneous in your code.
I would suggest to check this question.
Avoid commas in the URL, that's my suggestion.
Can I use commas in a URL?
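For what it's worth, commas are legal in URL paths and query strings, and the stdlib parser handles them. A quick check (urllib.parse in Python 3; the same names live in the urlparse module in Python 2):

```python
from urllib.parse import urlparse

# Hypothetical URL containing commas in both the path and the query
parts = urlparse("http://host.com/path,with,commas?ids=1,2,3")
print(parts.path)   # /path,with,commas
print(parts.query)  # ids=1,2,3
```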
This seems to work well for me:
import urllib
(fn,hd) = urllib.urlretrieve('http://host.com/file.py')
execfile(fn)
I prefer to use standard libraries, because I'm using python bundled with third party software (abaqus) which makes it a real headache to add packages.
I have the following code for parsing a YouTube feed and returning the YouTube movie ID. How can I rewrite this to be Python 2.4 compatible? I suppose 2.4 doesn't support the parse_qs function.
YTSearchFeed = feedparser.parse("http://gdata.youtube.com" + path)
videos = []
for yt in YTSearchFeed.entries:
    url_data = urlparse.urlparse(yt['link'])
    query = urlparse.parse_qs(url_data[4])
    id = query["v"][0]
    videos.append(id)
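For reference, this is what that parse_qs call does on a hypothetical watch URL of the shape the feed entries contain (Python 3 spelling shown here; in 2.x both names live in the urlparse module):

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical YouTube link; the video ID "abc123" is made up for illustration.
link = "http://www.youtube.com/watch?v=abc123&feature=youtube_gdata"
query = parse_qs(urlparse(link).query)
print(query["v"][0])  # abc123
```

parse_qs returns a dict mapping each parameter name to a list of values, which is why the original code indexes query["v"][0].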
I assume your existing code runs on 2.6 or something newer, and you're trying to go back to 2.4? parse_qs used to be in the cgi module before it was moved to urlparse. Try importing cgi and using cgi.parse_qs.
Inspired by TryPyPy's comment, I think you could make your source run in either environment by doing:
import urlparse  # if we're pre-2.6, this will not include parse_qs

try:
    from urlparse import parse_qs
except ImportError:
    # old version, grab it from cgi
    from cgi import parse_qs
    urlparse.parse_qs = parse_qs
But I don't have 2.4 to try this out, so no promises.
I tried that, and still it wasn't working.
It's easier to simply copy the parse_qs/qsl functions over from the cgi module to the urlparse module.
Problem solved.