I am learning Python and using the Requests library.
I want to use a CookieJar to store cookies, but I cannot figure out how to add a response's cookies to an existing CookieJar:
CookieJar.extract_cookies requires a request object; I don't understand which request to reference and why. I want to add the cookies to a CookieJar, not to a request...
So I tried
import http.cookiejar
import requests

cj = http.cookiejar.CookieJar()
tmp = requests.utils.dict_from_cookiejar(resp.cookies)
requests.utils.add_dict_to_cookiejar(cj, tmp)
The third line fails:
File "[...]\Python35-32\lib\site-packages\requests\utils.py", line 336, in add_dict_to_cookiejar
return cookiejar_from_dict(cookie_dict, cj)
File "[...]\Python35-32\lib\site-packages\requests\cookies.py", line 515, in cookiejar_from_dict
names_from_jar = [cookie.name for cookie in cookiejar]
File "[...]\Python35-32\lib\site-packages\requests\cookies.py", line 515, in <listcomp>
names_from_jar = [cookie.name for cookie in cookiejar]
AttributeError: 'str' object has no attribute 'name'
As the cookie jar in Requests is dict-like as well, I finally tried
requests.utils.add_dict_to_cookiejar(cj, resp.cookies)
which fails with the same error...
What am I doing wrong?
Try it this way:
# Create jar one
one = requests.cookies.RequestsCookieJar()
# Create jar two
two = requests.cookies.RequestsCookieJar()
# Set a cookie value in each jar
one.set("one_key", "one_value")
two.set("two_key", "two_value")
print(one)
<RequestsCookieJar[<Cookie one_key=one_value for />]>
print(two)
<RequestsCookieJar[<Cookie two_key=two_value for />]>
# Now merge two into one
one.update(two)
print(one)
<RequestsCookieJar[<Cookie one_key=one_value for />, <Cookie two_key=two_value for />]>
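The same idea answers the original question: a Response's .cookies is itself a RequestsCookieJar, so it can be merged into an existing RequestsCookieJar with update(), and a plain http.cookiejar.CookieJar can take the cookies one by one via set_cookie(). A minimal sketch (the httpbin URL is only an illustrative endpoint, not from the question):
import http.cookiejar
import requests

# httpbin sets a cookie and then redirects; skip the redirect so the
# Set-Cookie response is the one whose cookies we read
resp = requests.get("https://httpbin.org/cookies/set?one_key=one_value",
                    allow_redirects=False)

jar = requests.cookies.RequestsCookieJar()
jar.update(resp.cookies)        # merge directly, no request object needed

plain_jar = http.cookiejar.CookieJar()
for cookie in resp.cookies:     # resp.cookies yields Cookie objects
    plain_jar.set_cookie(cookie)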
Although this is most likely a newbie question, I struggled to find any information online to help me with my problem.
My code is meant to scrape onion sites. It can connect to Tor, and the web scraper works fine as a stand-alone, but when I tried combining both code blocks I kept getting numerous errors regarding the keyword argument in my code. Even attempting to delete it presents me with bugs. I am a bit lost on what I'm supposed to do.
import socket
import socks
import requests
from pywebcopy import save_webpage

socks.set_default_proxy(socks.SOCKS5, "127.0.0.1", 9050)
socket.socket = socks.socksocket

def get_tor_session():
    session = requests.session()
    # Tor uses the 9050 port as the default socks port
    session.proxies = {'http': 'socks5h://127.0.0.1:9050',
                       'https': 'socks5h://127.0.0.1:9050'}
    return session

session = get_tor_session()
print(session.get("http://httpbin.org/ip").text)

kwargs = {'project_name': 'site folder'}

save_webpage(
    # url of the website
    session.get(url="http://elfqv3zjfegus3bgg5d7pv62eqght4h6sl6yjjhe7kjpi2s56bzgk2yd.onion"),
    # folder where the copy will be saved
    project_folder=r"C:\Users\admin\Desktop\WebScraping",
    **kwargs
)
In this case, I'm presented with the following error:
TypeError: Cannot mix str and non-str arguments
Attempting to replace
project_folder=r"C:\Users\admin\Desktop\WebScraping",
**kwargs
with
kwargs,
project_folder=r"C:\Users\admin\Desktop\WebScraping"
presents me with this error:
TypeError: save_webpage() got multiple values for argument
Traceback for the first error:
File "C:\Users\admin\Desktop\WebScraping\tor.py", line 43, in <module>
**kwargs
File "C:\Users\admin\anaconda3\lib\site-packages\pywebcopy\api.py", line 58, in save_webpage
config.setup_config(url, project_folder, project_name, **kwargs)
File "C:\Users\admin\anaconda3\lib\site-packages\pywebcopy\configs.py", line 189, in setup_config
SESSION.load_rules_from_url(urljoin(project_url, '/robots.txt'))
File "C:\Users\admin\anaconda3\lib\urllib\parse.py", line 487, in urljoin
base, url, _coerce_result = _coerce_args(base, url)
File "C:\Users\admin\anaconda3\lib\urllib\parse.py", line 120, in _coerce_args
raise TypeError("Cannot mix str and non-str arguments")
I'd really appreciate an explanation of what causes such a bug and how to avoid it in the future.
Not sure why this hasn't been answered yet. As mentioned in my comment, simply change this:
save_webpage(
    # url of the website
    session.get(url=...),
    # folder where the copy will be saved
    project_folder=r"C:\Users\admin\Desktop\WebScraping",
    **kwargs
)
To:
save_webpage(
    # url of the website
    url=...,
    # folder where the copy will be saved
    project_folder=r"C:\Users\admin\Desktop\WebScraping",
    **kwargs
)
save_webpage makes the request internally. In the original call, session.get(...) returned a Response object rather than a URL string; pywebcopy then passed that object to urllib.parse.urljoin, which raised TypeError: Cannot mix str and non-str arguments because it expected a string.
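Putting the pieces of the question together, the corrected call would be (URL, folder, and kwargs taken verbatim from the question's code):
kwargs = {'project_name': 'site folder'}

save_webpage(
    # pass the URL string itself; save_webpage fetches it internally
    url="http://elfqv3zjfegus3bgg5d7pv62eqght4h6sl6yjjhe7kjpi2s56bzgk2yd.onion",
    # folder where the copy will be saved
    project_folder=r"C:\Users\admin\Desktop\WebScraping",
    **kwargs
)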
SOLVED
Adding the following code resolved the issue:
def getaddrinfo(*args):
    # return the hostname untouched so name resolution is left to the
    # Tor proxy (local DNS cannot resolve .onion addresses)
    return [(socket.AF_INET, socket.SOCK_STREAM, 6, '', (args[0], args[1]))]

socket.getaddrinfo = getaddrinfo
I am using Python 3 and trying to connect to dstk. I am getting an error with the urllib package.
I researched a lot on SO and could not find anything similar to this problem.
api_url = self.api_base + '/street2coordinates'
api_body = json.dumps(addresses)
#api_url = api_url.encode("utf-8")
#api_body = api_body.encode("utf-8")
print(type(api_url))
response_string = six.moves.urllib.request.urlopen(api_url, api_body).read()
response = json.loads(response_string)
If I do not encode api_url and api_body, I get the following:
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 1247, in do_request_
raise TypeError(msg)
TypeError: POST data should be bytes, an iterable of bytes, or a file object. It cannot be of type str.
However, if I encode them to UTF-8 (uncommenting the lines), then I get the following error:
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 514, in open
req.data = data
AttributeError: 'bytes' object has no attribute 'data'
This seems like a circular error to me and I am not able to resolve it. I did try solutions from SO, such as changing it to json.load, but nothing seems to work.
You are encoding both the URL and the request body, but only the body should be encoded.
This ought to work:
api_url = self.api_base + '/street2coordinates'
api_body = json.dumps(addresses)
api_body = api_body.encode("utf-8")
response_string = six.moves.urllib.request.urlopen(api_url, api_body).read()
response = json.loads(response_string)
urlopen's arguments are passed on to an opener, which does not know whether it has been handed a URL or a Request instance. So it checks whether the "url" is a string: if it is, it creates a Request from it; if not, it assumes the "url" is already a Request instance and tries to set its data attribute, causing the exception that you are seeing.
The code in question is in OpenerDirector.open in CPython's urllib/request.py.
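A condensed sketch of that logic (paraphrased from CPython's urllib/request.py, not the verbatim source):
# paraphrased from urllib.request.OpenerDirector.open
def open(self, fullurl, data=None):
    # accept a URL string or a Request object
    if isinstance(fullurl, str):
        req = Request(fullurl, data)
    else:
        # anything that isn't a str is assumed to be a Request...
        req = fullurl
        if data is not None:
            req.data = data  # ...so url bytes fail here: bytes has no .data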
I'm on Python 3.5.1, using requests, the relevant part of the code is as follows:
req = requests.post(self.URL, data={"username": username, "password": password})
self.cookies = {"MOODLEID1_": req.cookies["MOODLEID1_"], "MoodleSession": req.cookies["MoodleSession"]}
self.URL has the correct page, and the POST is working as intended; I added some prints to check that, and it passed.
My output:
Traceback (most recent call last):
File "D:/.../main.py", line 14, in <module>
m.login('first.last', 'pa$$w0rd!')
File "D:\...\moodle2.py", line 14, in login
self.cookies = {"MOODLEID1_": req.cookies["MOODLEID1_"], "MoodleSession": req.cookies["MoodleSession"]}
File "D:\...\venv\lib\site-packages\requests\cookies.py", line 287, in __getitem__
return self._find_no_duplicates(name)
File "D:\...\venv\lib\site-packages\requests\cookies.py", line 345, in _find_no_duplicates
raise KeyError('name=%r, domain=%r, path=%r' % (name, domain, path))
KeyError: "name='MOODLEID1_', domain=None, path=None"
I'm trying to debug at runtime to check what req.cookies contains. But what I get is surprising, at least for me. If I put a breakpoint on self.cookies = {...} and run [(c.name, c.value, c.domain) for c in req.cookies], I get an empty list, as if there weren't any cookies in there.
The site does send cookies; checking with a Chrome extension, I found two, "MOODLEID1_" and "MoodleSession", so why am I not getting them?
The response doesn't appear to contain any cookies. Look for one or more Set-Cookie headers in req.headers.
Cookies stored in a browser are there because a response included a Set-Cookie header for each of those cookies. You'll have to find what response the server sets those cookies with; apparently it is not this response.
If you need to retain those cookies (once set) across requests, use a requests.Session() object; it'll retain any cookies returned by responses and send them out again as appropriate with new requests.
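A minimal sketch of that approach, with URL, username, and password standing in for the question's values; a Session captures cookies set anywhere in the exchange, including by intermediate redirect responses, which req.cookies on the final response misses:
import requests

session = requests.Session()
# any Set-Cookie header seen during the exchange, redirects included,
# lands in session.cookies rather than on a single response
session.post(URL, data={"username": username, "password": password})
print(session.cookies.get_dict())  # e.g. MoodleSession, MOODLEID1_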
I am using Python's AWeber API (https://github.com/aweber/AWeber-API-Python-Library), and I frequently get these exceptions. I have no idea why this happens. Any ideas?
File "/<path>/aweber_api/entry.py", line 160, in __getattr__
return self._child_collection(attr)
File "/<path>/aweber_api/entry.py", line 151, in _child_collection
self._child_collections[attr] = self.load_from_url(url)
File "/<path>/aweber_api/base.py", line 38, in load_from_url
response = self.adapter.request('GET', url)
File "/<path>/aweber_api/oauth.py", line 60, in request
'{0}: {1}'.format(error_type, error_msg))
APIException: UnauthorizedError: Combination of nonce, timestamp, and consumer_key must be unique. https://labs.aweber.com/docs/troubleshooting#unauthorized
The error message is actually due to OAuth. You are sending the same request multiple times. You need to generate your request again (even with the same command and parameters) to get a new timestamp and nonce.
This is an OAuth measure to ensure it isn't dealing with the exact same request multiple times, e.g. if your program actually sends the command twice at the exact same time.
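For illustration, a minimal sketch of the values that must be fresh on every request; these are generic OAuth 1.0 parameters, not AWeber-specific code:
import time
import uuid

# each signed request needs a (nonce, timestamp) pair the server has
# never seen before for this consumer key
oauth_nonce = uuid.uuid4().hex           # random, never reused
oauth_timestamp = str(int(time.time()))  # current Unix time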
I'm trying to write a Python client for a WSDL service. I'm using the Suds library to handle the SOAP messages.
When I try to call the service, I get a Suds exception: <rval /> not mapped to message part. If I set the retxml Suds option I get XML which looks OK to me.
Is the problem with the client code? Am I missing some flag which will allow Suds to correctly parse the XML? Alternatively, the problem could be with the server. Is the XML not structured correctly?
My code is as follows (method names changed):
c = Client(url)
p = c.factory.create('MyParam')
p.value = 100
c.service.run(p)
This results in a Suds exception:
File "/home/.../test.py", line 38, in test
res = self.client.service.run(p)
File "/usr/local/lib/python2.6/dist-packages/suds-0.3.9-py2.6.egg/suds/client.py", line 539, in __call__
return client.invoke(args, kwargs)
File "/usr/local/lib/python2.6/dist-packages/suds-0.3.9-py2.6.egg/suds/client.py", line 598, in invoke
result = self.send(msg)
File "/usr/local/lib/python2.6/dist-packages/suds-0.3.9-py2.6.egg/suds/client.py", line 627, in send
result = self.succeeded(binding, reply.message)
File "/usr/local/lib/python2.6/dist-packages/suds-0.3.9-py2.6.egg/suds/client.py", line 659, in succeeded
r, p = binding.get_reply(self.method, reply)
File "/usr/local/lib/python2.6/dist-packages/suds-0.3.9-py2.6.egg/suds/bindings/binding.py", line 151, in get_reply
result = self.replycomposite(rtypes, nodes)
File "/usr/local/lib/python2.6/dist-packages/suds-0.3.9- py2.6.egg/suds/bindings/binding.py", line 204, in replycomposite
raise Exception('<%s/> not mapped to message part' % tag)
Exception: <rval/> not mapped to message part
The returned XML (modified to remove customer identifiers):
<S:Envelope xmlns:S="http://schemas.xmlsoap.org/soap/envelope/">
<S:Body>
<ns2:getResponse xmlns:ns2="http://api.xxx.xxx.com/api/">
<rval xmlns="http://xxx.xxx.xxx.com/api/">
<ns2:totalNumEntries>
2
</ns2:totalNumEntries>
<ns2:entries>
<ns2:id>
1
</ns2:id>
</ns2:entries>
<ns2:entries>
<ns2:id>
2
</ns2:id>
</ns2:entries>
</rval>
</ns2:getResponse>
</S:Body>
</S:Envelope>
Possible duplicate of: What does suds mean by "<faultcode/> not mapped to message part"?
Here is my answer from that question:
I had a similar issue where the call was successful, but Suds crashed while parsing the response. The workaround I used was the Suds option to return raw XML, then parsing the response with BeautifulSoup.
Example:
client = Client(url)
client.set_options(retxml=True)  # return the raw XML reply instead of parsing it
soapresp_raw_xml = client.service.submit_func(data)
soup = BeautifulStoneSoup(soapresp_raw_xml)
value_i_want = soup.find('ns:NewSRId')
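On a current install the same workaround would use BeautifulSoup 4, which replaced BeautifulStoneSoup; a sketch, keeping the example tag from above and assuming lxml is installed for the XML parser:
from bs4 import BeautifulSoup  # bs4 replaces BeautifulStoneSoup

raw_xml = client.service.submit_func(data)
soup = BeautifulSoup(raw_xml, "xml")
# bs4's XML parser keeps the namespace prefix separate from the name,
# so the tag is looked up by its local name
value_i_want = soup.find("NewSRId")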
This exception actually means that the response from the SOAP service contains a tag <rval> that doesn't exist in the service's WSDL schema.
Keep in mind that the Suds library caches the WSDL schema, which is why the problem may occur if the schema was changed recently: the responses then match the new schema but are validated by the Suds client against the old one. In that case rm /tmp/suds/* will help you.
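If deleting the cache directory by hand is inconvenient, Suds can also be told not to cache the schema at all; a minimal sketch using the NoCache class that ships with the library:
from suds.client import Client
from suds.cache import NoCache

# fetch and parse the WSDL fresh on every run instead of using /tmp/suds/*
client = Client(url, cache=NoCache())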