When I tried this I am getting an Attribute Error : 'Response' object has no attribute 'css'
I tried with this code :
response.css('h1.ctn-article-title::text').extract()
can anyone help please?
i'm trying to get text "Update Primary Care" from below code which is title :
Update Primary Care
CME
i'm placing my entire code :
response = requests.get(url, headers = headers)
Traceback (most recent call last):
File "<console>", line 1, in <module>
NameError: name 'requests' is not defined
import requests
response = requests.get(url, headers = headers)
Traceback (most recent call last):
File "<console>", line 1, in <module>
NameError: name 'url' is not defined
url = 'somethingurl'
response = requests.get(url, headers = headers)
response.css('h1.ctn-article-title::text').extract()
Traceback (most recent call last):
File "<console>", line 1, in <module>
AttributeError: 'Response' object has no attribute 'css'
response.css('h1').extract()
Traceback (most recent call last):
File "<console>", line 1, in <module>
AttributeError: 'Response' object has no attribute 'css'
response.css('h1.ctn-article-title::text').extract()
As Tarun pointed out in the comments: You are mixing scrapy and requests code.
If you want to create a scrapy response from requests response you can try:
from scrapy.http import TextResponse
import requests
url = 'http://stackoverflow.com'
resp = requests.get(url)
resp = TextResponse(body=resp.content, url=url)
resp.xpath('//div')
# works!
See the docs for requests.Response and scrapy.http.TextResponse objects.
In this case the line where your error occurs expects a CSSResponse object not a normal response. Try to create a CSSResponse instead of the normal Response to resolve the error.
You can get it here
More specifically use an HtmlResponse because your response would be some HTML and not plain text. HtmlResponse is a subclass of CSSResponse so it inherits the missing method.
add this line in your code and it will work fine
remove any imports for requests from any other package.
from scrapy.http import Request
Related
First of all, I am getting this error. When I try running
pip3 install --upgrade json
in an attempt to resolve the error, python is unable to find the module.
The segment of code I am working with can be found below the error, but some further direction as for the code itself would be appreciated.
Error:
Traceback (most recent call last):
File "Chicago_cp.py", line 18, in <module>
StopWork_data = json.load(BeautifulSoup(StopWork_response.data,'lxml'))
File "/usr/lib/python3.8/json/__init__.py", line 293, in load
return loads(fp.read(),
TypeError: 'NoneType' object is not callable
Script:
#!/usr/bin/python
import json
from bs4 import BeautifulSoup
import urllib3
http = urllib3.PoolManager()
# Define Merge
def Merge(dict1, dict2):
res = {**dict1, **dict2}
return res
# Open the URL and the screen name
StopWork__url = "someJsonUrl"
Violation_url = "anotherJsonUrl"
StopWork_response = http.request('GET', StopWork__url)
StopWork_data = json.load(BeautifulSoup(StopWork_response.data,'lxml'))
Violation_response = http.request('GET', Violation_url)
Violation_data = json.load(BeautifulSoup(Violation_response.data,'lxml'))
dict3 = Merge(StopWork_data,Violation_data)
print (dict1)
json.load expects a file object or something else with a read method. The BeautifulSoup object doesn't have a method read. You can ask it for any attribute and it will try to find a child tag with that name, i.e. a <read> tag in this case. When it doesn't find one it returns None which causes the error. Here's a demo:
import json
from bs4 import BeautifulSoup
soup = BeautifulSoup("<p>hi</p>", "html5lib")
assert soup.read is None
assert soup.blablabla is None
assert json.loads is not None
json.load(soup)
Output:
Traceback (most recent call last):
File "main.py", line 8, in <module>
json.load(soup)
File "/usr/lib/python3.8/json/__init__.py", line 293, in load
return loads(fp.read(),
TypeError: 'NoneType' object is not callable
If the URL is returning JSON then you don't need BeautifulSoup at all because that's for parsing HTML and XML. Just use json.loads(response.data).
I have defined the following function to get the redirected URLs using Requests library. However i get the error KeyError: 'location'
def get_redirected_url(r_url):
r = requests.get(r_url, allow_redirects=False)
url = r.headers['Location']
return url
Calling the function
get_redirected_url('http://omgili.com/ri/.wHSUbtEfZQujfav8g98PjRMi_ogV.5EwBTfg476RyS2Gqya3tDAwNIv8Yi8wQ9AK4.U2mxeyq2_xbUjqsOx8NYY8r0qgxD.4Bm2SrouZKnrg1jqRxEfVmGbtTaKTaaDJtOjtS46fYr6A5UJoh9BYxVtDGJIsbSfgshRXR3FVr4-')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 3, in get_redirected_url
File "/home/user/PycharmProjects/untitled/venv/lib/python3.6/site-packages/requests/structures.py", line 54, in __getitem__
return self._store[key.lower()][1]
KeyError: 'location'
Is it failing because the redirection waits for 5 seconds? If so, how do we incorporate that as well?
I have tried the other answers like this and this. But unable to crack it.
It is simple: r.headers doesn't have 'Location' key. You may have use the wrong key.
Edit: the site you want to browse with requests is protected.
I am unable to load the json data & getting errors which mention below.
My code is ,
import requests
import json
url = 'https://172.28.1.220//actifio/api/info/lsjobhistory?sessionid=cafc8f31-fb39-4020-8172-e8f0085004fd'
ret=requests.get(url,verify=False)
data=json.load(ret)
print(data)
Getting error
Traceback (most recent call last):
File "pr.py", line 7, in <module>
data=json.load(ret)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/json/__init__.py", line 293, in load
return loads(fp.read(),
AttributeError: 'Response' object has no attribute 'read'
You dont actually need to import json
try this
import requests
url = 'https://172.28.1.220//actifio/api/info/lsjobhistory?sessionid=cafc8f31-fb39-4020-8172-e8f0085004fd'
ret = requests.get(url,verify=False)
data = ret.json()
print(data)
I have code like this :
import requests
import xmltodict
Locations = ['http://129.94.5.93:49154/setup.xml', 'http://129.94.5.92:49154/setup.xml', 'http://129.94.5.93:49154/setup.xml', 'http://129.94.5.92:49154/setup.xml', 'http://129.94.5.95:80/description.xml']
for item in Locations:
r = requests.get(item)
reply = xmltodict.parse(r.text)
this gives me the following error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "ssdpxml.py", line 57, in <module>
r = requests.get(item)
AttributeError: 'str' object has no attribute 'get'
But it works when I do this:
r=requests.get('http://129.94.5.95:80/description.xml')
Why do I get the above mentioned error??
When I try to use the .json() method of a response object from the requests library, I get an error:
>>> import requests
>>> response = requests.get("http://example.com/myfile.json")
>>> response_json = response.json()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'Response' object has no attribute 'json'
Why am I getting this error and how can I fix it?
Your version of the requests library is too old. JSON support was added in version 0.12.1, released nearly 2 years ago.
The currently released version is 2.2.1.