python urllib error - python

so I have this code:
def crawl(self, url):
    data = urllib.request.urlopen(url)
    print(data)
but then when I call the function, it returns
data = urllib.request.urlopen(url)
AttributeError: 'module' object has no attribute 'request'
what did I do wrong? I already imported urllib..
using python 3.1.3

In Python 3, urllib is a package made up of submodules, including urllib.request, urllib.response, urllib.error, and urllib.parse, each with its own purpose.
Wherever you had import urllib or import urllib2 in Python 2, replace them with the submodules you need, for example:
import urllib.request
import urllib.response
import urllib.error
The classes and functions are largely the same.
BTW, use the 2to3 tool if you are converting from Python 2 to Python 3.

urllib.request is a separate module; import it explicitly.
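
To illustrate the answers above, here is a minimal sketch of the question's crawl function with the explicit submodule import. The data: URL is only an assumption made so the example runs without network access; any http(s) URL can be passed the same way.

```python
import urllib.request  # importing the submodule explicitly makes urllib.request available


def crawl(url):
    """Fetch url and return the response body decoded as text."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8")


# A data: URL keeps this demonstration offline (assumption for the example)
page = crawl("data:text/plain;charset=utf-8,hello")
print(page)
```

Using the response as a context manager closes the connection once the body has been read.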

Related

Import urllib for module "request" error

import urllib request
import requests
goog_url = "https://query1.finance.yahoo.com/v7/finance/download/GOOG?period1=1501517722&period2=1504196122&interval=1d&events=history&crumb=bU42Yaj88Bt"
def download_stock_data(csv_url):
    response = ur.urlopen(csv_url)
    csv = response.read()
    csv_str = str(csv)
    lines = csv_str.split("\\n")
    dest_url = r'goog.csv'
    fx = open(dest_url, "w")
    for line in lines:
        fx.write(line + "\n")
    fx.close()
download_stock_data(goog_url)
I'm trying to import a CSV file from the internet with this code. But I continue, despite my best efforts, to get a syntax error that says that it cannot find the request module of the urllib import.
File "/Users/Micmaster/PycharmProjects/pythonProject/firstProject.py", line 1
import urllib request
^
SyntaxError: invalid syntax
I've tried many different variations "from urllib import request", "import urllib.request", "import urllib", "import urllib2.request" and even changing versions of my interpreter on pycharm. Any help would be appreciated, thanks!
The urllib.request module is part of the Python 3 standard library;
for Python 2 you'd use urllib2.
I will write about Python 2.
Why are you trying to import request from the urllib package when, in Python 2, the package doesn't have it?
On the next line you import requests, so you could simply use requests.
I don't know why you need urllib, but if you do, change import urllib request to import urllib as ur.
import urllib.request
goog_url = "https://query1.finance.yahoo.com/v7/finance/download/GOOG?period1=1501517722&period2=1504196122&interval=1d&events=history&crumb=bU42Yaj88Bt"
def download_stock_data(csv_url):
    response = urllib.request.urlopen(csv_url)
    # read() returns bytes; decode them rather than calling str(), which would yield "b'...'"
    csv_str = response.read().decode("utf-8")
    lines = csv_str.splitlines()
    dest_url = r'goog.csv'
    # the with block closes the file even if a write fails
    with open(dest_url, "w") as fx:
        for line in lines:
            fx.write(line + "\n")
download_stock_data(goog_url)
Your code should look like this. However, it still fails with HTTP Error 401: Unauthorized, so the URL you are using is probably wrong.
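
One detail in the question's code is worth a small illustration: response.read() returns bytes in Python 3, and calling str() on bytes gives the printable representation rather than the text. That is probably why the question had to split on "\\n". The sample CSV rows below are made up for the sketch.

```python
# Bytes as they might come back from response.read(); the rows are made-up sample data
raw = b"Date,Close\n2017-08-01,930.83\n"

as_repr = str(raw)             # the repr of the bytes object, starting with b' and holding backslash + n
as_text = raw.decode("utf-8")  # the actual text, with real newline characters

print(as_repr.split("\\n"))    # splits on the two characters backslash and n
print(as_text.splitlines())    # ['Date,Close', '2017-08-01,930.83']
```

Decoding first means the written file does not start with b' or end with a stray quote.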

python: called function from other file needs modules

I am calling a function from functions.py into work.py, which works fine:
from functions import get_ad_page_urls
The get_ad_page_urls function makes use of, among other things, the requests module.
Now, whether or not I import the requests module into work.py, when I run the called function in work.py, it gives an error: NameError: name 'requests' is not defined.
I have defined get_ad_page_urls in functions.py including the module, like so,
def get_ad_page_urls():
    import requests
    <rest of function>
or excluding the module, like so,
import requests
def get_ad_page_urls():
    <rest of function>
but it doesn't matter, the NameError persists.
How should I write the function such that when I call the function in work.py everything works fine?
Traceback:
get_ad_page_urls(page_root_url)
Traceback (most recent call last):
File "<ipython-input-253-ac55b8b1e24c>", line 1, in <module>
get_ad_page_urls(page_root_url)
File "/Users/myname/Documents/RentIndicator/Python Code/idealista_functions.py", line 35, in get_ad_page_urls
NameError: name 'requests' is not defined
functions.py
import requests
import bs4
import re
from bs4 import BeautifulSoup
def get_ad_page_urls(page_root_url):
    response = requests.get(page_root_url)
    soup = bs4.BeautifulSoup(response.text)
    container=soup.find("div",{"class":"items-container"})
    return [link.get("href") for link in container.findAll("a", href=re.compile("^(/inmueble/)((?!:).)*$"))]
work.py
import requests
import bs4
import re
from bs4 import BeautifulSoup
from functions import get_ad_page_urls
city='Valencia'
lcity=city.lower()
root_url = 'https://www.idealista.com'
house_href='/alquiler-habitacion/'
page_root_url = root_url +house_href +lcity+ '-' + lcity + '/'
get_ad_page_urls(page_root_url)
Mine works perfectly fine running on python 3.4.4
functions.py
import requests
def get_ad_page_urls():
    return requests.get("https://www.google.com")
work.py
from functions import get_ad_page_urls
print(get_ad_page_urls())
# outputs <Response [200]>
Make sure they are in the same directory. You might be using two different python versions and one of them doesn't have requests?
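
As a rough illustration of why the import belongs in functions.py and not in work.py: a function resolves global names in the module where it was defined, not in the module that calls it. The sketch below builds a hypothetical in-memory stand-in for functions.py and uses json in place of requests so it runs without third-party packages; the names functions and dump are assumptions for the example.

```python
import types

# Hypothetical stand-in for functions.py, built in memory for the demonstration
functions = types.ModuleType("functions")
exec(
    "import json\n"
    "def dump(value):\n"
    "    return json.dumps(value)\n",
    functions.__dict__,
)

# The caller does not need its own 'import json': dump looks the name up
# in the globals of the module that defined it.
result = functions.dump({"ok": True})
print(result)                                           # {"ok": true}
print(functions.dump.__globals__ is functions.__dict__)  # True
```

So importing requests at the top of functions.py, as in the answer above, should be enough on its own.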

Python : no JSON object could be decoded

I am trying to run this app:
https://github.com/bmjr/guhTrends
I have python 2.7.x running the following script at command line. I reckon it was written using python3.x. What is deprecated in the code below?
import urllib
import json
import matplotlib.pyplot as plt
dates = urllib.request.urlopen('http://charts.spotify.com/api/tracks/most_streamed/global/weekly/')
dataDates = json.loads(dates.read().decode())
the error:
Traceback (most recent call last):
File "DataMining.py", line 6, in <module>
dates = urllib.request.urlopen('http://charts.spotify.com/api/tracks/most_streamed/global/weekly/')
AttributeError: 'module' object has no attribute 'request'
That script won't work under Python 2, because Python 2's urllib has no request module.
Use urllib2.urlopen instead of urllib.request.urlopen if you want to run that script under Python 2.
To make the script work on both Python 2 and Python 3, use the six module, a Python 2 and 3 compatibility library.
from six.moves import urllib
import json
import matplotlib.pyplot as plt
dates = urllib.request.urlopen('http://charts.spotify.com/api/tracks/most_streamed/global/weekly/')
dataDates = json.loads(dates.read().decode())
You're requesting a resource that is not currently available (I'm seeing a 504). Since this could potentially happen any time you request a remote service, always check the status code on the response; it's not that your code is necessarily wrong, in this case it's that you're assuming the response is valid JSON without checking whether the request was successful.
Check the urllib documentation to see how to do this (or, preferably, follow the recommendation at the top of that page and use the requests package instead).
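
A minimal sketch of the advice above using only the standard library (Python 3): urlopen raises urllib.error.HTTPError for 4xx/5xx responses, so the error can be caught before any JSON decoding is attempted. The helper name fetch_json and the data: URLs are assumptions made so the example runs offline.

```python
import json
import urllib.error
import urllib.request


def fetch_json(url):
    """Return the decoded JSON from url, or None if the request or the decoding fails."""
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            body = response.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        # Raised for HTTP error statuses such as 401 or 504
        print("server returned status", err.code)
        return None
    except urllib.error.URLError as err:
        # Raised when the server cannot be reached at all
        print("request failed:", err.reason)
        return None
    try:
        return json.loads(body)
    except ValueError:
        print("response body was not valid JSON")
        return None


# Offline demonstration with data: URLs (assumption for the example)
good = fetch_json("data:application/json,%7B%22a%22%3A%201%7D")
bad = fetch_json("data:text/plain,not%20json")
print(good, bad)
```

HTTPError is a subclass of URLError, which is why it is caught first.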

Python: Importing urllib.quote

I would like to use urllib.quote(). But python (python3) is not finding the module.
Suppose, I have this line of code:
print(urllib.quote("châteu", safe=''))
How do I import urllib.quote?
import urllib or
import urllib.quote both give
AttributeError: 'module' object has no attribute 'quote'
What confuses me is that urllib.request is accessible via import urllib.request
In Python 3.x, quote lives in the urllib.parse submodule, so import urllib.parse and call urllib.parse.quote:
>>> import urllib.parse
>>> urllib.parse.quote("châteu", safe='')
'ch%C3%A2teu'
According to Python 2.x urllib module documentation:
NOTE
The urllib module has been split into parts and renamed in Python 3 to
urllib.request, urllib.parse, and urllib.error.
If you need to handle both Python 2.x and 3.x you can catch the exception and load the alternative.
try:
from urllib import quote # Python 2.X
except ImportError:
from urllib.parse import quote # Python 3+
You could also use the python compatibility wrapper six to handle this.
from six.moves.urllib.parse import quote
urllib went through some changes in Python 3; quote can now be imported from the parse submodule:
>>> from urllib.parse import quote
>>> quote('"')
'%22'
This is how I handle this, without using exceptions.
import sys
if sys.version_info.major > 2: # Python 3 or later
    from urllib.parse import quote
else: # Python 2
    from urllib import quote
Use six:
from six.moves.urllib.parse import quote
six will simplify compatibility problems between Python 2 and Python 3, such as different import paths.
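
Since the question passes safe='', here is a small sketch (Python 3) of what that argument changes. The sample string is made up for the example.

```python
from urllib.parse import quote

path = "a/b c"  # made-up sample value

default_quoted = quote(path)         # "/" is in the default safe set, so it is kept
fully_quoted = quote(path, safe="")  # nothing is exempt, so "/" is escaped too

print(default_quoted)  # a/b%20c
print(fully_quoted)    # a%2Fb%20c
```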

NameError for urllib

When I type dir(urllib) in the Python shell I get NameError: Name urllib not defined.
What is causing this?
It is because you have to import urllib first. Try these one at a time:
import urllib
dir(urllib)
