Why can't I get cookie value in Playwright? - python

First, sorry for my poor English.
I want to use Playwright to get the cookies, but I can't.
I tried three approaches I found, and got nothing.
Using page.on
def get_cookie(request_or_response):
    all_headers = request_or_response.all_headers()
    print(all_headers)

page.on('request', get_cookie)
page.on('response', get_cookie)
>>>
{'accept-ranges': 'bytes', 'age': '9576', 'cache-control': 'max-age=600', 'content-length': '6745', 'content-type': 'image/png', 'date': 'Thu, 30 Jun 2022 01:09:20 GMT', 'etag': '"206578bcab2ad71:0"', 'expires': 'Thu, 30 Jun 2022 01:19:20 GMT', 'last-modified': 'Tue, 06 Apr 2021 06:11:52 GMT', 'server': 'NWS_SPMid', 'x-cache-lookup': 'Cache Hit', 'x-daa-tunnel': 'hop_count=1', 'x-nws-log-uuid': '16892018456232999193', 'x-powered-by': 'ASP.NET'}
{'accept-ranges': 'bytes', 'age': '9576', 'cache-control': 'max-age=600', 'content-length': '6745', 'content-type': 'image/png', 'date': 'Thu, 30 Jun 2022 01:09:20 GMT', 'etag': '"206578bcab2ad71:0"', 'expires': 'Thu, 30 Jun 2022 01:19:20 GMT', 'last-modified': 'Tue, 06 Apr 2021 06:11:52 GMT', 'server': 'NWS_SPMid', 'x-cache-lookup': 'Cache Hit', 'x-daa-tunnel': 'hop_count=1', 'x-nws-log-uuid': '16892018456232999193', 'x-powered-by': 'ASP.NET'}
...(and more like this)
It returned something, but there is no cookie here.
Using browser_context.cookies (Resolved! Thanks to @Charchit)
context = browser.new_context()
page = context.new_page()
page.goto(url)
cookies = context.cookies
print(cookies)
>>>
<bound method BrowserContext.cookies of <BrowserContext browser=<Browser type=<BrowserType name=chromium executable_path=/Users/swong/Library/Caches/ms-playwright/chromium-1005/chrome-mac/Chromium.app/Contents/MacOS/Chromium> version=102.0.5005.40>>>
Using JS
cookie = page.evaluate('console.log(document.cookie)')
print(cookie)
>>>
None
I opened the network tab on the Chromium page, and the cookie I want was there in the request headers.
Please help me. Thank you all!
Here's my code example. The site is in Chinese; I hope you don't mind, it's just a simple login page.
from playwright.sync_api import sync_playwright

url = 'https://so.gushiwen.cn/user/login.aspx'

def get_cookie(request_or_response):
    headers_array = request_or_response.headers_array()
    print('「headersArray」:', headers_array)

with sync_playwright() as p:
    browser = p.chromium.launch(headless=False)
    context = browser.new_context()
    page = context.new_page()
    page.goto(url)
    page.fill('#email', '6j3y4ecy@spymail.one')
    page.fill('#pwd', '6j3y4ecy@spymail.one')
    page.wait_for_timeout(5000)  # input the captcha code manually
    page.on('request', get_cookie)
    page.on('response', get_cookie)
    print('logging in...')
    page.click('#denglu')
    page.wait_for_timeout(50000)  # wait for nothing
    browser.close()

In your second method, change cookies = context.cookies to cookies = context.cookies(). It's a method; you need to call it. Check the documentation:
context = browser.new_context()
page = context.new_page()
page.goto(url)
cookies = context.cookies()
print(cookies)
Also, your first method is not advisable: even if you get the Cookie header from a request, you can't really store and use it elsewhere unless you use a factory function or a global variable. Besides, why do that when BrowserContext has a method specifically for it :)
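For what it's worth, context.cookies() returns a list of dicts, each with at least 'name' and 'value' keys (plus 'domain', 'path', and so on). If you want to reuse the cookies elsewhere, a small helper can collapse them into a plain mapping. A minimal sketch; the sample data below is made up, only shaped like Playwright's output:

```python
def cookie_dict(cookies):
    """Collapse a Playwright-style cookies() list into a {name: value} dict."""
    return {c["name"]: c["value"] for c in cookies}

# Hypothetical sample shaped like context.cookies() output
sample = [{"name": "sessionid", "value": "abc123",
           "domain": ".gushiwen.cn", "path": "/"}]
print(cookie_dict(sample))  # {'sessionid': 'abc123'}
```

You could then pass the result to another HTTP client, e.g. requests, via its cookies parameter.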
Edit
The reason the first method seemingly does not work is that it returns the headers of the requests and responses made. Cookies can also be created through JavaScript on the page itself; those may not show up in the headers at all.
Secondly, the headers you printed for the first method in your question seem to be for a single request only. After running your code, there were many more requests and responses, which in turn printed many more headers. From the responses in particular, you can retrieve the cookies set by the server by looking for the 'set-cookie' header.
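To make the 'set-cookie' part concrete, here is a small standard-library sketch of parsing a raw set-cookie header value into name/value pairs; the header string below is invented, not taken from the site in the question:

```python
from http.cookies import SimpleCookie

def cookies_from_set_cookie(header_value):
    """Parse a raw Set-Cookie header value into {name: value} pairs."""
    jar = SimpleCookie()
    jar.load(header_value)  # attributes like Path/HttpOnly are parsed, not returned
    return {name: morsel.value for name, morsel in jar.items()}

# Hypothetical header value, similar to what all_headers() might contain
print(cookies_from_set_cookie("sessionid=abc123; Path=/; HttpOnly"))
# {'sessionid': 'abc123'}
```

That said, context.cookies() remains the simpler route, since it also captures cookies created by page JavaScript.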

Related

Windows PowerShell & InfluxDB: Unable to Write Data to Bucket

I am new to InfluxDB. I understand we need to use Windows PowerShell to interact with InfluxDB.
Using Python in the shell, I tried to write data to the bucket using the code below:
import influxdb_client, os, time
from influxdb_client import InfluxDBClient, Point, WritePrecision
from influxdb_client.client.write_api import SYNCHRONOUS

token = os.environ.get("INFLUXDB_TOKEN")
org = "nil"
url = "https://us-west-2-1.aws.cloud2.influxdata.com/"
client = influxdb_client.InfluxDBClient(url=url, token=token, org=org)

bucket = "MyBucket"
write_api = client.write_api(write_options=SYNCHRONOUS)

for value in range(5):
    point = (
        Point("measurement1")
        .tag("tagname1", "tagvalue1")
        .field("field1", value)
    )
    write_api.write(bucket=bucket, org="nil", record=point)
    time.sleep(1)  # separate points by 1 second
But I get the error
influxdb_client.rest.ApiException: (401)
Reason: Unauthorized
HTTP response headers: HTTPHeaderDict({'Date': 'Thu, 29 Dec 2022 01:44:17 GMT', 'Content-Type': 'application/json; charset=utf-8', 'Content-Length': '55', 'Connection': 'keep-alive', 'trace-id': '5374d7ae5df282f4', 'trace-sampled': 'false', 'x-platform-error-code': 'unauthorized', 'Strict-Transport-Security': 'max-age=15724800; includeSubDomains', 'X-Influxdb-Request-ID': '2e52a39e6d7442b5fc7eb7306ef004d4', 'X-Influxdb-Build': 'Cloud'})
HTTP response body: {"code":"unauthorized","message":"unauthorized access"}
A 401 indicates there is likely something wrong with the Authorization header. Could you try enabling the debug log as follows to see the details:
client = influxdb_client.InfluxDBClient(url=url, token=token, org=org, debug=True)  # debug=True enables verbose logging of HTTP requests
Both the HTTP request headers and body will be logged to standard output. Please check the HTTP headers for an "Authorization: Token" header, which looks similar to:
Authorization: Token 7mexfXXXXXXXXXXXXX
Please double-check whether the header is missing or malformed.
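A common cause of this particular 401 is that INFLUXDB_TOKEN was never exported in the shell, so os.environ.get() returns None and the client sends no credentials at all. A hedged sketch of a guard you could put before building the client (influx_token is a hypothetical helper name):

```python
import os

def influx_token(env=os.environ):
    """Return the InfluxDB token, or None (with a warning) if it is unset."""
    token = env.get("INFLUXDB_TOKEN")
    if not token:
        print("INFLUXDB_TOKEN is not set; a 401 may simply mean "
              "no credentials were sent")
    return token

# Usage sketch: only construct the client when a token is present.
# token = influx_token()
# if token:
#     client = influxdb_client.InfluxDBClient(url=url, token=token, org=org)
```

In PowerShell the variable would be set with $env:INFLUXDB_TOKEN = "..." for the current session, which is easy to forget between sessions.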

Tornado Google+ Oauth Error Code 400

I have a problem with Google+ OAuth using the Tornado framework. I use AngularJS as the front end and Python Tornado as the backend, behind an nginx server. I send an HTTP request to the Google+ API from AngularJS, and my Tornado API redirects to the Google login. After a successful login it redirects back to my app. At the time of the redirect I think it refreshes automatically, i.e. there are two redirect calls from Google.
You can see there are two HTTP redirect calls from Tornado OAuth2.
This is my code:
class GoogleOAuth2LoginHandler(tornado.web.RequestHandler,
                               tornado.auth.GoogleOAuth2Mixin):
    @tornado.gen.coroutine
    def get(self):
        if self.get_argument('code', False):
            user = yield self.get_authenticated_user(
                redirect_uri='http://your.site.com/auth/google',
                code=self.get_argument('code')
            )
            # Save the user with e.g. set_secure_cookie
        else:
            yield self.authorize_redirect(
                redirect_uri='http://your.site.com/auth/google',
                client_id=self.settings['google_oauth']['key'],
                scope=['profile', 'email'],
                response_type='code',
                extra_params={'approval_prompt': 'auto'}
            )
Error:
Google auth error: HTTPResponse(_body=None,buffer=<_io.BytesIO object at 0xb37809bc>,code=400,effective_url='https://accounts.google.com/o/oauth2/token',error=HTTPError('HTTP 400: Bad Request',),headers={'X-Consumed-Content-Encoding': 'gzip', 'Alternate-Protocol': '443:quic,p=1', 'X-Xss-Protection': '1; mode=block', 'X-Content-Type-Options': 'nosniff', 'Transfer-Encoding': 'chunked', 'Set-Cookie': 'NID=76=iaY_jJFPzvLg3_h3eqUFMt4fecbELKk9_bGJju-mwsHBNlxeDqSrtmpyazsrJ3mDgtDnTnzsw5_fjIfV8GcUAegoNgxGi5ynpcfg0vEWULSeVXKio_ANxEoK9C-F5oRs;Domain=.google.com;Path=/;Expires=Sat, 13-Aug-2016 10:17:46 GMT;HttpOnly', 'Expires': 'Fri, 12 Feb 2016 10:17:46 GMT', 'Server': 'GSE', 'Connection': 'close', 'Cache-Control': 'private, max-age=0', 'Date': 'Fri, 12 Feb 2016 10:17:46 GMT', 'P3p': 'CP="This is not a P3P policy! See https://support.google.com/accounts/answer/151657?hl=en for more info."', 'Alt-Svc': 'quic=":443"; ma=604800; v="30,29,28,27,26,25"', 'Content-Type': 'application/json; charset=utf-8', 'X-Frame-Options': 'SAMEORIGIN'},reason='Bad Request',request=,request_time=0.4158029556274414,time_info={})
We had the same problem with exactly the same setup (Tornado + nginx + AngularJS). I rewrote the OAuth authentication part without Tornado's mixins and the problem was resolved. You could use Tornado's AsyncHTTPClient, but I used aiohttp since I host Tornado inside asyncio.
Below is the new code; the commented-out parts are the old code.
from backend.helpers.async_oauth2.client import Client

oauth_client = Client(
    app_settings.security.google.client_id,
    app_settings.security.google.client_secret,
    app_settings.security.google.redirect_uri,
    "https://accounts.google.com/o/oauth2/auth",
    "https://accounts.google.com/o/oauth2/token")

access = await oauth_client.get_token(code, grant_type="authorization_code")
# access = await self.get_authenticated_user(
#     redirect_uri=app_settings.security.google.redirect_uri,
#     code=code)
# user = await self.oauth2_request(
#     "https://www.googleapis.com/oauth2/v1/userinfo",
#     access_token=str(access["access_token"]))
user = await oauth_client.http_get(
    "https://www.googleapis.com/oauth2/v1/userinfo?{}".format(
        url_parse.urlencode({'access_token': str(access["access_token"])})))

How to change request headers

How do I set request headers? I am downloading an image from Instagram, and I want to change its filename and get the file size. The available request headers are listed on the wiki.
This is what I have done until now:
from flask import Flask
import requests

app = Flask(__name__)

@app.route('/try')
def trial():
    img = 'https://igcdn-photos-e-a.akamaihd.net//hphotos-ak-xaf1//t51.2885-15//e35//12093691_1082288621781484_1524190206_n.jpg'
    imgData = requests.get(img)
    return imgData  # this gives me an error: Response object is not callable
Edit: I want to set the Content-Disposition header. Another question: once I set the headers, how would I give the image to the client side?
I read on the Internet that to send the file to the client side I should set the request header. But how do I send the file? Sorry for asking another question in the same post.
The question you asked has nothing to do with Flask. Flask is a web framework; Requests is an HTTP library, and it is what will help solve your issue.
You just need to create a dict with your headers (key: value pairs where the key is the name of the header and the value is, well, the value of the pair) and pass that dict to the headers parameter on the .get or .post method.
headers = {'Content-Type': 'text/plain'}
r = requests.get('http://example.com', headers=headers)
If you wish to check the header values set when the request was sent, you can simply do this:
print r.headers
print r.headers['Content-Type'] # Output - 'text/html'
resp = requests.get('https://igcdn-photos-e-a.akamaihd.net/hphotos-ak-xaf1/t51.2885-15/e35/12093691_1082288621781484_1524190206_n.jpg')
resp.headers
And you will get a response like this:
{'content-length': '73921', 'last-modified': 'Fri, 30 Oct 2015 15:18:29 GMT', 'connection': 'keep-alive', 'cache-control': 'max-age=1209600', 'date': 'Wed, 18 Nov 2015 08:44:15 GMT', 'access-control-allow-origin': '*', 'content-type': 'image/jpeg', 'timing-allow-origin': '*'}
You can get the headers... check out content-length. But I believe you will not get the file name, because there is no content-disposition header. So try something else to get the file name.
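On the follow-up question (sending the image to the client with a chosen filename): Content-Disposition belongs on the response you return, not on the request you send. A minimal sketch of building that header value; content_disposition is a hypothetical helper, and the Flask usage in the comment assumes the imgData variable from the question:

```python
def content_disposition(filename, inline=False):
    """Build a Content-Disposition header value for a download response."""
    disposition = "inline" if inline else "attachment"
    return f'{disposition}; filename="{filename}"'

# In a Flask view you would return something roughly like:
# return Response(imgData.content,
#                 mimetype=imgData.headers.get('content-type', 'image/jpeg'),
#                 headers={'Content-Disposition': content_disposition('photo.jpg')})
print(content_disposition("photo.jpg"))  # attachment; filename="photo.jpg"
```

With "attachment" the browser offers a download under that filename; with inline=True it renders the image in the page instead.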

HTTP request with requests in python returning text instead of json

import requests

def thready(name, count):
    payload = {'Accept': 'application/json;charset=utf-8',
               'X-Mashape-Key': 'key'}
    link = "https://montanaflynn-gender-guesser.p.mashape.com/?name=" + name
    r = requests.get(link, headers=payload)
    print r.headers
    data = r.json()
    print data

count = 0
thready("bob", count)
So I just tried to do a simple HTTP request in Python. In r.headers I get:
{'date': 'Wed, 22 Jul 2015 06:30:12 GMT', 'content-length': '178', 'content-type': 'text/html', 'connection': 'keep-alive', 'server': 'Mashape/5.0.6'}
In the request header I said it should return JSON, yet the response is text. I'm very confused about this; some insight would be very helpful.
There is nothing wrong with the code.
I inspected your API request upstream by going to the source, the API provider's page, and it looks like the API is defunct/not working as expected.
That is exactly why you're getting errors.
Solution:
1) Contact the provider to try to resolve the problem; it's on them, not on you, to fix it.
2) Find an alternative API on the same portal: https://www.mashape.com/explore?query=gender
Good luck.
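More generally, when an API may return an HTML error page instead of JSON, it pays to check the Content-Type (or catch the decode error) before trusting r.json(). A small standard-library sketch of that defensive check; parse_json_or_none is a hypothetical helper, and the sample inputs are made up:

```python
import json

def parse_json_or_none(body_text, content_type):
    """Return parsed JSON if the response claims to be JSON, else None."""
    if 'application/json' not in content_type:
        return None  # server sent something else, e.g. an HTML error page
    try:
        return json.loads(body_text)
    except ValueError:
        return None  # claimed JSON but body did not parse

print(parse_json_or_none('{"name": "bob"}', 'application/json;charset=utf-8'))
# {'name': 'bob'}
print(parse_json_or_none('<html>error</html>', 'text/html'))
# None
```

With requests you would feed it r.text and r.headers.get('content-type', ''), then handle the None case instead of letting r.json() raise.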

Linkedin Api for python not working correctly

LinkedIn's documentation is confusing like crazy. I just want to get some basic information: a company's recent updates, the comments on each update, and how many likes each update got. I tried to follow the documentation, and this is my code:
from linkedin import linkedin
import oauth2 as oauth
import httplib2
api_key = '9puxXXXXXXX'
secret_key = 'brtXoXEkXXXXXXXXX'
auth_token = '75e15760-XXXXXXXXXXXXXXXXXXXXXX'
auth_secret = '10d8caXXXXXXXXXXXXXXXXXXXXXXXXXXXX'
RETURN_URL = 'http://localhost:8000'
cos = oauth.Consumer(api_key,secret_key)
access_token = oauth.Token(key=auth_token, secret=auth_secret)
client = oauth.Client(cos,access_token)
resp,content = client.request("http://api.linkedin.com/v1/companies/1219692/updates?start=0&count=10", "GET", "")
This code is supposed to get the 10 most recent updates for Apple, but this is what I get when I run:
print resp
print content
{'status': '200', 'content-length': '78', 'content-location': u'http://api.linkedin.com/v1/companies/216984/updates?count=10&oauth_body_hash=2jmj7l5rSw0yVb%2FvlWAYkK%2FYBwk%3D&oauth_nonce=87365476&oauth_timestamp=1372347259&oauth_consumer_key=9puxXXXXXXX&oauth_signature_method=HMAC-SHA1&oauth_version=1.0&start=0&oauth_token=75e1576XXXXXXX&oauth_signature=EhcMiQXXXXXXX%3D', 'transfer-encoding': 'chunked', 'vary': '*', 'server': 'Apache-Coyote/1.1', 'connection': 'close', '-content-encoding': 'gzip', 'date': 'Thu, 27 Jun 2013 15:34:18 GMT', 'x-li-request-id': '84BXIU5ZQK', 'x-li-format': 'xml', 'content-type': 'text/xml;charset=UTF-8'}
What am I doing wrong?
Your code does not quite make sense: you imported the linkedin module, which seems to be this module, but I can't see you using that API wrapper anywhere in your code. If I'm wrong, you can still use the wrapper above and easily handle the data back from LinkedIn. Just take a look at this example:
Querying updates
sample:
from linkedin import server
application = server.quick_api(KEY, SECRET)
application.get_company_updates(1035, params={'count': 2})
where 1035 is the ID of the company you are trying to get updates from.
