I'm trying to log in to a specific site in Python, but without success. Can anyone help?
import requests
url = 'https://us0.forgeofempires.com/page/'
user_agent = 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:90.0) Gecko/20100101 Firefox/90.0'
session = requests.Session()
session.headers['User-Agent'] = user_agent
_xsrf = session.cookies.get('_xsrf', domain='.forgeofempires.com')
login = session.post(url, {
'login[userid]':'user',
'login[password]':'pass',
'_xsrf':_xsrf
})
print(login)
f = open('results.html', 'w', encoding = 'utf-8')
f.write(login.text)
f.close()
Update
The following code finally works. I had to figure out by trial and error the fewest headers that needed to be set. As I previously stated, you must first visit the URL https://us.forgeofempires.com/glps/iframe-login so that the necessary cookies are set before you send the login POST. You must also extract the 'XSRF-TOKEN' value from the cookies and use it to set the 'x-xsrf-token' header. In my earlier attempt I was erroneously sending the form encoded as JSON.
Note that the output is in JSON format; it is not HTML. Your output file should probably have the file extension of .json.
import requests
initial_url = 'https://us.forgeofempires.com/glps/iframe-login'
login_url = 'https://us.forgeofempires.com/glps/login_check'
session = requests.Session()
user_agent = 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:90.0) Gecko/20100101 Firefox/90.0'
session.headers['User-Agent'] = user_agent
r = session.get(initial_url)
xsrf_token = session.cookies.get('XSRF-TOKEN', domain='us.forgeofempires.com')
session.headers['x-xsrf-token'] = xsrf_token
session.headers['x-requested-with'] = 'XMLHttpRequest'
login = session.post(login_url, data={
'login[userid]': 'HTWyou',
'login[password]': 'kjh&*76dfgG',
'login[remember_me]': False
})
#f = open('results.html', 'w', encoding = 'utf-8')
f = open('results.json', 'w', encoding='utf-8')
f.write(login.text)
f.close()
"""
# code to print the headers:
hdrs = login.request.headers
for k in sorted(hdrs.keys()):
print(f'{k}: {hdrs[k]}')
"""
print(login.json())
Prints:
{'success': True, 'url': '/', 'player_id': 851846598}
The headers being passed up by the login XHR Request
:authority: us.forgeofempires.com
:method: POST
:path: /glps/login_check
:scheme: https
accept: application/json, text/plain, */*
accept-encoding: gzip, deflate, br
accept-language: en-US,en;q=0.9
content-length: 74
content-type: application/x-www-form-urlencoded; charset=UTF-8
cookie: device_view=full; metricsUvId=da9ca1f1-8e32-411f-ac7d-bf389df25469; portal_tid=1627319201921-36801; portal_ref_session=1; portal_ref_url=https://us0.forgeofempires.com/; portal_data=portal_tid=1627319201921-36801&portal_ref_session=1&portal_ref_url=https://us0.forgeofempires.com/; PHPSESSID=54955p3mbbstetnnquivk12rk2r0ihqmk41nagl620r5g9p5; XSRF-TOKEN=mQxleEoXU2Hyz6LP1ZlM7huHE-DLt5pg8qySl-M1vtI
dnt: 1
origin: https://us.forgeofempires.com
referer: https://us.forgeofempires.com/glps/iframe-login
sec-ch-ua: "Chromium";v="92", " Not A;Brand";v="99", "Google Chrome";v="92"
sec-ch-ua-mobile: ?0
sec-fetch-dest: empty
sec-fetch-mode: cors
sec-fetch-site: same-origin
user-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.107 Safari/537.36
x-requested-with: XMLHttpRequest
x-xsrf-token: mQxleEoXU2Hyz6LP1ZlM7huHE-DLt5pg8qySl-M1vtI
The headers being passed up by session.post
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Content-Length: 89
Content-Type: application/x-www-form-urlencoded
Cookie: PHPSESSID=86lis3h0s8291fgkto5r2lcnqr7a6c22nnotveeo2iuuusht; XSRF-TOKEN=Qxm6zfZrDqaI8ce7MDQ5rQqpHHOUozKaFu-NL2Av9KU; device_view=full
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:90.0) Gecko/20100101 Firefox/90.0
x-requested-with: XMLHttpRequest
x-xsrf-token: Qxm6zfZrDqaI8ce7MDQ5rQqpHHOUozKaFu-NL2Av9KU
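One way to shorten the trial and error is to compare the header names in the two dumps above and see which ones the script never sends. This is a minimal sketch; the helper name `missing_headers` is hypothetical, and the two lists are abbreviated from the dumps.

```python
def missing_headers(browser_headers, script_headers):
    """Return browser header names (lowercased) that the script did not send."""
    sent = {name.lower() for name in script_headers}
    return sorted(
        name.lower()
        for name in browser_headers
        # HTTP/2 pseudo-headers such as ":authority" are set by the client itself.
        if not name.startswith(":") and name.lower() not in sent
    )

# Abbreviated header names taken from the two dumps above.
browser = [":authority", "accept", "content-type", "cookie", "origin",
           "referer", "user-agent", "x-requested-with", "x-xsrf-token"]
script = ["Accept", "Content-Type", "Cookie", "User-Agent",
          "x-requested-with", "x-xsrf-token"]

print(missing_headers(browser, script))  # ['origin', 'referer']
```

Headers that show up in this list but turned out not to be required (here, origin and referer) can be left out of the script.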
Related
I need your help in putting together a post request.
The output I get is html, but the plan was to get the following:
Below are all the data for the desired item:
General
Request URL: https://dgslivebetting.betonline.ag/ngwbet.aspx/gvFrameHtml
Request Method: POST
Status Code: 200
Remote Address: 104.17.64.19:443
Referrer Policy: strict-origin-when-cross-origin
Response Headers
cache-control: no-cache
cf-cache-status: DYNAMIC
cf-ray: 76800ae95afc35b3-DME
content-encoding: br
content-type: application/json; charset=utf-8
date: Thu, 10 Nov 2022 16:07:42 GMT
expires: -1
pragma: no-cache
server: cloudflare
set-cookie: server_persistent=!zk3OrErnBetHZkiKJcby5Il79pzHsf7dxKD0PcVuB54Z2dznuEbqgGAVDWLDvoqpVSDnVq+Jtf91LHo=; path=/; Httponly; Secure
x-newrelic-app-data: PxQFUFRTDQMHR1NRBQkOVVABDhFORDQHUjZKA1ZLVVFHDFYPHjZWADdTRRcPAF0cXgMWAFJFaAcXQU4cBRAlEFEPXSpMVVgQH1UXUR1RHVBUAA9QVloUHgFIQ1YCAg9fAAgFAFZXUFYDUQBAFF5VXkAAZA==
Request Headers
:authority: dgslivebetting.betonline.ag
:method: POST
:path: /ngwbet.aspx/gvFrameHtml
:scheme: https
accept: */*
accept-encoding: gzip, deflate, br
accept-language: ru-RU,ru;q=0.9,en-US;q=0.8,en;q=0.7
content-length: 12
content-type: application/json; charset=UTF-8
cookie: _xpid=574830729; _xpkey=K_F3GRHECOTdjT306mOafHByLTxopGhY; LPVID=MxZmQyM2Q5OTFlOTU0ZTJk; _hjSessionUser_2115245=eyJpZCI6IjQ3MzAxYmQwLTQ4ODgtNWNjMC1hZGZjLWJlZDBmNDgwZDJjZCIsImNyZWF0ZWQiOjE2NjY1NTY0MjQwOTIsImV4aXN0aW5nIjp0cnVlfQ==; CT.CONTENT.NA.STATUS=1; _gid=GA1.2.1666042031.1667883501; PreviousUrlNav=%2Fsportsbook%2Flive-betting; chQuickBet=undefined; inputAmount=100.00; kameleoonVisitorCode=_js_ti27yqxpj7dd4k1x; DD-LINK-NAREDIRECT=0; ASP.NET_SessionId=5acflzzgqtjdvsnjc5wtwuys; tz=Eastern%20Standard%20Time; btpdb.1PR3l09.dGZjLjY2ODI2ODU=U0VTU0lPTg; oddsfmt=dec; _hjSession_2115245=eyJpZCI6Ijk2NzBiMjNkLWY4MGQtNDM5OS1hYWNhLWQyODBjNmZlYzNkMSIsImNyZWF0ZWQiOjE2NjgwOTM2NzY4OTUsImluU2FtcGxlIjpmYWxzZX0=; _hjAbsoluteSessionInProgress=0; _hjIncludedInSessionSample=0; LPSID-90263191=bLgFHbiuTjOcwCg1FgR16g; __cf_bm=5LozQOf4P4COCn1rVD5emsVzukFSNbWdS7kvBVodzJ4-1668096251-0-AQ+nY5HeihIwV+gAI1oaFKJJxOtgXWs5czIr198Ffrh18P1q4nriEcszp/j7dwjuDjVuki1jlT6IByy2ewOCcXSUWavF+3MCcBF4Yb8sfDPVkvoSufxJ46feYuPiCiPcw0eW9oTUnrmZNcEkZ1732RDx6LWq1OElUvT0Uk6sk1n1; _gat_UA-190679354-1=1; _ga_KC6V6402HY=GS1.1.1668096234.18.1.1668096460.0.0.0; _ga=GA1.1.1142263304.1666556424; server_persistent=!Tdbrpsz3tJ8jlNmKJcby5Il79pzHsfLVz91fFnDrXObiJE45d6idCUAVcW4Qmd/g598vNFaqTVuVRvk=
origin: https://dgslivebetting.betonline.ag
referer: https://dgslivebetting.betonline.ag/ngwbet.aspx
sec-ch-ua: "Google Chrome";v="107", "Chromium";v="107", "Not=A?Brand";v="24"
sec-ch-ua-mobile: ?0
sec-ch-ua-platform: "Windows"
sec-fetch-dest: empty
sec-fetch-mode: cors
sec-fetch-site: same-origin
user-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36
x-newrelic-id: VgcFUVNTDxACV1NaDgIDVlw=
x-requested-with: XMLHttpRequest
Please help me figure out how I can get what I want.
My code:
import requests
import cloudscraper
scraper = cloudscraper.create_scraper()
url = 'https://dgslivebetting.betonline.ag/ngwbet.aspx/gvFrameHtml'
data = {"gameID":0}
headers = {
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36",
'Referer': "https://dgslivebetting.betonline.ag/ngwbet.aspx/gvFrameHtml"
}
r = requests.post(url, data=data, headers=headers)
print(r.text)
In order to get JSON back, you need to add the Content-Type header to your request.
Your current example shows that you are only sending these headers:
headers = {
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36",
'Referer': "https://dgslivebetting.betonline.ag/ngwbet.aspx/gvFrameHtml"
}
At the very least, you'll need to add Content-Type: application/json; charset=UTF-8 to the request. Otherwise, requests performs an application/x-www-form-urlencoded form post, which is why you're getting back HTML from this site instead of JSON.
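A minimal sketch of one way to do this with requests: passing the payload through the `json=` argument serializes the body as JSON and sets the Content-Type header for you. The request is only prepared here, not sent, so the result can be inspected offline. Whether the site also needs cookies or other headers from the capture is not something this sketch can confirm.

```python
import json
import requests

url = 'https://dgslivebetting.betonline.ag/ngwbet.aspx/gvFrameHtml'
headers = {
    # User-Agent shortened for the example.
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)',
    'X-Requested-With': 'XMLHttpRequest',
}

# json= (instead of data=) makes requests encode the body as JSON
# and add a JSON Content-Type header.
prepared = requests.Request('POST', url, json={'gameID': 0}, headers=headers).prepare()

print(prepared.headers['Content-Type'])
print(json.loads(prepared.body))

# To actually send it: response = requests.Session().send(prepared)
```

With `data=` and a dict, the same call would produce a form-encoded body, which matches the behavior described above.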
I am trying to download the file from here.
In Firefox, the link redirects me to a webpage where I have to log in with my credentials. After I do so, Firefox automatically prompts me to save the desired file with a pop-up window.
I am trying to replicate that behavior with Python without success.
I've tried the code from this question, namely
import requests
session = requests.session()
url = "https://metamap.nlm.nih.gov/download/public_mm_linux_main_2020.tar.bz2"
data= {'username': 'xxxxx', 'password':'xxxxxx'}
session.post(url, data=data)
response = requests.get("https://metamap.nlm.nih.gov/download/public_mm_linux_main_2020.tar.bz2")
print(response.text)
Insights from Firefox
Using the Network tab from the Developer Mode, I have been able to see the following:
Apart from username and password, the POST request also sends an execution parameter, which seems to be a very long hashed string. I don't know what this is or how to replicate it.
POST /cas/login?service=https%3a%2f%2fmetamap.nlm.nih.gov%2fdownload%2fpublic_mm_linux_main_2020.tar.bz2 HTTP/1.1
Host: utslogin.nlm.nih.gov
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:79.0) Gecko/20100101 Firefox/79.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate, br
Content-Type: application/x-www-form-urlencoded
Content-Length: 4900
Origin: https://utslogin.nlm.nih.gov
Connection: keep-alive
Referer: https://utslogin.nlm.nih.gov/cas/login?service=https%3a%2f%2fmetamap.nlm.nih.gov%2fdownload%2fpublic_mm_linux_main_2020.tar.bz2
Cookie: TGC=exxxxxx; JSESSIONID=xxxxx
Upgrade-Insecure-Requests: 1
Afterwards, there is a GET request to the same address with a ticket parameter.
GET /download/public_mm_linux_main_2020.tar.bz2?ticket=xxxxxx HTTP/1.1
Host: metamap.nlm.nih.gov
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:79.0) Gecko/20100101 Firefox/79.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate, br
Referer: https://utslogin.nlm.nih.gov/cas/login?service=https%3a%2f%2fmetamap.nlm.nih.gov%2fdownload%2fpublic_mm_linux_main_2020.tar.bz2
Connection: keep-alive
Cookie: MOD_AUTH_CAS_S=xxxxxx
Upgrade-Insecure-Requests: 1
Finally, another GET triggers the download.
GET /download/public_mm_linux_main_2020.tar.bz2 HTTP/1.1
Host: metamap.nlm.nih.gov
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:79.0) Gecko/20100101 Firefox/79.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate, br
Referer: https://utslogin.nlm.nih.gov/cas/login?service=https%3a%2f%2fmetamap.nlm.nih.gov%2fdownload%2fpublic_mm_linux_main_2020.tar.bz2
Connection: keep-alive
Cookie: MOD_AUTH_CAS_S=xxxxx
Upgrade-Insecure-Requests: 1
In between all of this, cookies are set and read.
I don't know what information is useful. I'm putting here all that I can see.
Thank you in advance.
EDIT:
After more exploration I am trying to replicate the 3 request pattern seen in Mozilla/Chrome.
This is my code now:
import requests
url_1 = "https://utslogin.nlm.nih.gov/cas/login"
# url_2 = "https://metamap.nlm.nih.gov/download/public_mm_linux_main_2020.tar.bz2"
params = {'service': 'https%3a%2f%2fmetamap.nlm.nih.gov%2fdownload%2fpublic_mm_linux_main_2020.tar.bz2'}
data = {'username': 'xxxxx',
'password':'xxxxx',
'execution': '1cb37c1a-141b-4891-b961-ea5c4b20deed_ZXlKaGJHY2lPaUpJVXpVeE1pSjkuWXpWb1EzaFdTMmxOZGxsdUwwUlFiWEozSzBkSlVuUXhURlpuZEROSmVpdGtObFpVYjFkU01FMWlhMjkxS3lzNWVrbHlNblptTWt4MVdrdE9SU3RJTmsxTGNISmhNblYxTmtOVVExWkJPWFJZUkVKbFREWk1WV2RaZEhSSU5qbHNNbTlZVldONGMyTndNWGgxVGtWalZtcHRkRWMxZGxaSGNsVmtTVmh6UzJSWmFEQjBVRmRMUlVSalRsTk5WVVYxYlU4elRHUjNZV0pSWVhCblVsQXdjemRhZW5ac1NHeDNhMU5pYjA5UlVYTk1ZWHB1UWtkeGJtUTFRa1V5T0RseVNWcHlSa3RZY2pGS09FeFNjekZNY21KVmFsaE1la3RKZFdsd1pGaEhNVmhIY3pJMU4zRkZVVUZrYTA1aE5IZGljMlZZWTA5Q1kyNXFTbTlUUkRsR1UxcHdVa2wyUlhaU1dsbFpaRU5yVFhwQmQwOWlOVTh4WTNaVmNIWXJaalJqUmxGMU4xSmxVVWx3VjNwR1ZrdDROamhaTjI5UlJXTjVXbXg2YVdGbE5HTlpkM0JxUW5kVWJUaHVXVGx6V0VKek5VRmpjV2c0TmpsSlUydHJRWEZUVVRaeWNEUTNSVko2ZGtnclltZG5iRkY2VW5oQmMwcFVSRXAzV0c5eVJqSTVNSEJXTVhOSWNpczFMMVp3VDFWRlJsUnJlSE5ETlVaS2VYQjJPV2hWU2pOVVZ6RlBMM0pDYzBOWlQzbEJkV3g1TWtSYWMwaEhjR0ZSUjI5NGFIZGpha2RJTkVzeVRGaFRjMnh0UldKSFdVWm9halF6YUhKa04wUnNabmQ2U1ZaUE1tWkRUR2RPTlhORVJHOXhNbWR5WWxCWldFUkJjMmQyY1dGamVWTjZZMWRCUTBOclRXOW9XWGRpUkVKS09XZGlSVzl1WVZOQlpUVjZZamhSVEhKdGEyYzNUM0pxYVhCdmJUTmxRbWMwVXpJMGIycE5WRXBXZVhKV2NWRTJNSEJ6UTNobFVUZzJOa1ZMWlhkMGJGQjBRbVIwUjBGNlMyNURPVEZ6UWtWS1RHbGhWRzR4Ulc0MU0ybHRaRVIzVm1Ga1ZGVTVVemQ0UVd4Q2JtUndTMEZ1Tmtac1l6SmtWRVpxYUhKTlRFVnljMnhCV1VkMk5YaEZkbEpqZDBoSUwzQkZNVlJVYTFkUmFIUjBhMXBwTVZSdWRXcFdkR1EzYkhoQ2JXZDFjekZyV1hGbVVVaDBRV1JWTlZwQk1HSkRSV3BSWmxwSVNDOVNRbTR4V1dkNmJXNVlWRzl6VWxKWlRHUkdTV2h4TUhkVWRrMW9XR3AyVEdGRVZYRkhiMnB1Y2pSemFXNVBRa3h1Y0VGTUszWk9TekpYUjFsRVNYSkdOSFJ3ZVhwT1dqUldVMFozWVRoRVdIRkRZbmxEVGxSdWRqaE1SQzg0V2swdmJYcGpNRmxoUkVObllsVmtNVWRxYmpSM01VUlhURlpWY2t0VVJGRTNhMVZ4WlhSa2JVTnlWVVU0U2pWd2VDOWxZVFJZVmpoU1ZqWkxNVlJDVVUxS1lsUlJZM1k1ZWtwNmNqQTBZbFJXVUhaNmQxUnNkSFpuTWxsQmMxSkpjbWhOZW1SQ1ZEVllWeXR5ZFd0WFRsRkNOVkZNV2xweFRFVlpRelpSV0VoUWIyWlJZMjFSU1hScFJuQkhlRVI0Um5aV1l6RnJXazlpUlRjcmVFeE5ORWQwTTAxWVNYbGtTVzVLYnpsVFdXOUJhVXB5WWpGek1ERjZkMGxsZEVVMWRWTnVXbFprTjIwMFQza3lNWGgxUkhFME1FVlVLMXBqUzJWM1NEaFFPVFkwVEVwR2RtbElVelUzUVhJM1NHMVphMUJOVjJsUGFuZHJWRzVvT0hWdkwzVnVOVnBDV
2tGSmJ6UkVWSFUwV2lzd1MwRXhSazlaUW5ZNE9EbFBkekZLZWpWa2RqZ3JUekEwT0ZaeGEzVmthamw0YWpSU09EUmFPRFIzT1VoUGN6QlpOVTlQZDFFMVdtWTJRMjFRYTFSUUx6aGhZMGxETTJsUVZFcHpMeXM0ZEVWS1MzaFdiRnBTU1RaU1dYYzJZWGR0UWpaUldsSXJVV2N3YVRZNWNIbEJSbUZGV2k4eFkwNTFUakUzVm5oWGFWbDVla1JhTlhKeVQzRkRSM05hTUVST1EyNUNTVkJFVkhKMldIRnpha3huVlZSNmQwTkdWVGN4YVhRMVFURldXUzlXUjB4a1VIUndZMFJSTTFobEwzSjZMemc1SzNoVGJGUjNkV3A2ZEd3elZtZFpPRVV2YVhaMlJUVXJkREJxWmtsSlZIRnhNa050ZGpSYUwwZEpVV3BXVVdodFFYbG5SazVOZUZCT2IycENVelkyT1dGRWIwUXlXblZFVFdVMmRFRlhXaXN5TWpVekwyZzRWSHBWZEdWU1pDczVURlI2Y0haYVRubzJXblpJYWl0Qk9Dc3djMlZIU0VJd2EwaFpNVWwxZEV4NE1ESnBWVFJOTUVSNVRXTk9Za3hDYjBwRWNtNUhVVzlSYjBoNk5rOXllSEpZYW5oUVdXbzRkeTkzYjJRM1dqSnRhekZNTVZKaGMyTjRXRGQ2V25sMFZpOVhVakUwTDBZM00yMTZVMDV0UTJRMFUweHBWbmhCUW14ME1ETmlWVGs0VHpaRFpIcGpjMk5xWVhWMGVtbzBTVmhpTTNWeVRHUnJSbmt3TVV4RlVVWkNha2Q1VkVkVWFuTnFOblpuTm1KTWNqZFRiVVZHTmxoTlYwUk5TbG8wZUdKTVNWbG9OU3NyVFRKQ2VGQktTek5hV0RKbVRraHdUbVY2YVdoUk9HMXNPVk42VVRsaGRuYzNaVlZoUzNkaFJHdEpZMjRyUkhwVGIzRjFTVlpZZWxGNllUZENiMVUwU1hOQlVteEtkVkpSZDFoak1TdExjR1JXV0dwWmFYbGtMMU5hYTBjMk5sSTFOSGRoTTBaVGFrZzVia1JOTmpWV0wzQnhaalZVS3pGNWJITm9SV1puVmprMFNIUm1halozWW10Q1VXRldTMGh5Wld4MFMzQkdUMFIyTmxCVmFUZEVLemhTTDNGVWVuTnNNa2hzUWpOUVlrZHdhSGRRVUZadGJWTnNPSFJHYVhoQ1RIVk9iMkZIUVVSdE1WSXpWM0pRZWt0aGMzcExhbWwxUWxNMFJWWXhjSE5uVVRZMVJsQkhXbEJYZDNORFVVaHZRM00wTkRaS1lsSkhaRll6UlhsS09IUmFNMHMxYkhwTk1XYzVSak5yYmpVclVqZzNWRGhNZEVwb1ZEQlRiMWN4VkZkdU1VTlplVmgzTlVRMllYUkpRbEJ5T0N0NVVYZDRhVkpsU1ZsTmMwaHpRelFyVFRBNU5GUnhVVWRUTTJjeFNWZEdTa0Z2YjJsTVFreHFXRTUwYkhveVZtTmhTSE5DVVdaTU1rTkhSR1JVWkRCMkt6RnRhbkpPVVhOeVVFMUZaMUpQVFhZMVJrUm9XVlZpSzA1SFUwMWhlVlJNV2taRGNqTmFVMFZpTkdsT1dVVjJUVzR5Y2pkTk4zQnFhMFZOWjJoSFpsaElXbGxVVjFCalJraGFaak5RYjA4eFJVcDBURUpUYURaeWJXbGlTek4xVG1aUVNYSlJaVEU1V213Mk1qUTJia3RTV1dvemVEVkZNVWhZYVdJMFVtUkJTWGxNWmsxS09WRmxRWFJ3VG1zdmVDdHlNaTlTWnpKSGRrZHRTMWhyV2xKbFUyczVSWFJSVTNaa1JrRlRRMnRFZUZFdlVXRkhOVGhpT0RGaUswTjFRbGd3ZEVSTmJsTkxiVVV4YVdGVVMwaFBLMXBpV2pWa1NXSjZOM0J2YlRoclRubHpOMFJ2ZVZWbk9ITnBlREZKWWtNckx6SnBkRmhtW
khGa05HZ3dhVEkzVTNwWFJFOXpWMnRqZURkSUsyZEtTRGxSYzNKeGMyWmxZbEJOU1hnekwwY3pWbmRNV0hScmMyVk5ZVzAzYW5nemQwWndSbVZFYTA5YWIzWndZVVpJVGk5TVZHVk9lRlIwU2poUlpWbDFaVTFaY1dGbGEzWkROVmxUWjNoa2JVMURPVkpWWm1WRlJYb3JiWEpaYkU4NFYzb3JkRTlYUWxOQlZFaFViMHhoVEhKVGQwMU5PVEpHV1dFNFJGb3hWM04zY214SGVGcFNlbEZ2VTFWNEsyaDNLMEpITDJaMGEycGxhM0pEZW0xTWMzUm9PSE4zUjJaV1FrcHRSVk4wVkc1M2FFRmxSbVJUVFZWelpFaEdXSEZyVkhkMU9HWmlabkZKWlhsdU16RlpOREpQYTJabldGaEtlRzR2WjNVM2JUWkJQVDAuSXFFcjcxblloaURBYjl6QUtnN2hRVVNoaE9QbVZQWlhnT1NZQ0lVN3E3NWw0TGtHS2x4OWhDbzVkMVNRQlBnaVRfbkQtemVEdHpyS0F4NldvV0JLZ1E=',
'submit': 'LOGIN'}
r = requests.get(url_1, params = params)
print(r.headers)
print(r.cookies)
print(r.status_code)
headers = {'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
'Accept-Encoding': 'gzip, deflate, br',
'Content-Type': 'application/x-www-form-urlencoded',
'Sec-Fetch-Dest': 'document',
'Sec-Fetch-Mode': 'navigate',
'Sec-Fetch-Site': 'same-origin',
'Sec-Fetch-User': '?1',
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.135 Safari/537.36',
'Upgrade-Insecure-Requests': 1,
'Referer': 'https://utslogin.nlm.nih.gov/cas/login?service=https%3a%2f%2fmetamap.nlm.nih.gov%2fdownload%2fpublic_mm_linux_main_2020.tar.bz2',
'Origin': 'https://utslogin.nlm.nih.gov',
'Host': 'utslogin.nlm.nih.gov',
'Connection': 'keep-alive',
'Cache-Control': 'max-age=0',
'Accept-Language': 'en-GB,en-US;q=0.9,en;q=0.8,es-CO;q=0.7,es-AR;q=0.6,es;q=0.5,it;q=0.',
'Content-Length': 4634}
r_2 = requests.post(url_1, data=data, cookies=r.cookies, params = params)
print("-----------------")
print(r_2.headers)
print(r_2.cookies)
print(r_2.status_code)
execution and submit are parameters that are also passed during the POST request. With the first GET I emulate the normal loading of the page, after which a cookie is given to me. Using that cookie and submitting the form, I expect to get back another cookie and a ticket with which to download the file. However, instead of this, the output shows the response headers of another GET, as if nothing had happened.
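The execution value is most likely issued afresh each time the login page is loaded (an assumption based on how CAS login forms usually behave), so a value copied from the browser would be stale by the time the script posts it. A minimal sketch of reading hidden form fields from the page with the standard library; `HiddenInputParser`, `hidden_fields`, and the sample HTML are all hypothetical stand-ins for the real login page.

```python
from html.parser import HTMLParser

class HiddenInputParser(HTMLParser):
    """Collect name/value pairs of <input type="hidden"> fields."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != 'input':
            return
        attr = dict(attrs)
        # Attribute values can be None when written without a value.
        if (attr.get('type') or '').lower() == 'hidden' and attr.get('name'):
            self.fields[attr['name']] = attr.get('value') or ''

def hidden_fields(html):
    parser = HiddenInputParser()
    parser.feed(html)
    return parser.fields

# Stand-in for the login page; the real HTML would come from session.get(...).text
sample_page = '''
<form method="post">
  <input type="text" name="username">
  <input type="hidden" name="execution" value="abc123_token">
</form>
'''
fields = hidden_fields(sample_page)
print(fields)  # {'execution': 'abc123_token'}
```

In the real flow, the GET and the POST would both go through one requests.Session so that cookies carry over, and the POST data would merge these hidden fields with the username and password. Note also that requests URL-encodes `params` values itself, so the service URL should be passed unencoded there.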
By the way, to whomever wants to help me: the credentials from that site are acquired prior by registering and some hours may pass between registration and credential arrival.
Please help. I'm trying to create a POST request to an .asp site that requires cookies, but the way I handle them does not seem to return anything. I have read through some questions on similar topics but can't find the _SessionID cookie that some of them refer to. Please help me formulate this POST request so that it works.
Headers
:authority: safer.fmcsa.dot.gov
:method: POST
:path: /query.asp
:scheme: https
accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
accept-encoding: gzip, deflate, br
accept-language: en-US,en;q=0.9
cache-control: max-age=0
content-length: 85
content-type: application/x-www-form-urlencoded
cookie: ASP.NET_SessionId=ywxszihqlu1yciwe5z5gm4qt; etype=au; ASPSESSIONIDQECTCDRB=KGFOBHBBLKCBKBFIAPEBMIHJ; ASPSESSIONIDQGARCDRA=LKEDBAOBMOMDNGBNBFEMMIPB; ASPSESSIONIDSEBQCBSB=DAMJMNKCNJKHCMDCIJBPKEHD; ASPSESSIONIDCEQRADQC=EIEJCDLBHHCCKHCNJNIMHDKA; ASPSESSIONIDAESTCBQC=KPDPJNHCLOBJENEHPNIFKJLH; LI_carrier=67449; ASPSESSIONIDAGSQADRC=CPIBAKEDDPNFCIPLIGLOKKLA; ASPSESSIONIDAERTDBQD=FMKFHJJAJKNIGCBCCFJFCMNF; AWSALB=Xc7OAuZUmx6vgE5l9NaawsH8oBWjy6eZ3B62kw2rZ5HieoRlMu4SSmVVcaPJPcPjp1fVt9U/T9FaRflgNHwtzmsK4X4e+y+yoGArTfgpb75NWo/ilAek0Qk/sFYI; AWSALBCORS=Xc7OAuZUmx6vgE5l9NaawsH8oBWjy6eZ3B62kw2rZ5HieoRlMu4SSmVVcaPJPcPjp1fVt9U/T9FaRflgNHwtzmsK4X4e+y+yoGArTfgpb75NWo/ilAek0Qk/sFYI
origin: https://safer.fmcsa.dot.gov
referer: https://safer.fmcsa.dot.gov/CompanySnapshot.aspx
sec-fetch-dest: document
sec-fetch-mode: navigate
sec-fetch-site: same-origin
sec-fetch-user: ?1
upgrade-insecure-requests: 1
user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.132 Safari/537.36
Form Data
searchtype: ANY
query_type: queryCarrierSnapshot
query_param: USDOT
query_string: 2300842
My Code So Far
def checkDOT():
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:65.0) Gecko/20100101 Firefox/65.0',
        'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
        'Accept-Language': 'en-US,en;q=0.9',
        'Cache-Control': 'no-cache,no-store,must-revalidate,max-age=0,private',
        'upgrade-insecure-requests': '1',
        'Connection': 'keep-alive',
        'origin': 'https://safer.fmcsa.dot.gov',
        'referer': 'https://safer.fmcsa.dot.gov/CompanySnapshot.aspx'
    }
    s = requests.Session()
    data = {
        'searchtype': 'ANY',
        'query_type': 'queryCarrierSnapshot',
        'query_param': 'USDOT',
        'query_string': '2300842'
    }
    params = (
        ('pageNumber', '0'),
        ('itemsPerPage', '15'),
    )
    url = 'https://safer.fmcsa.dot.gov/CompanySnapshot.aspx'
    response = s.get(url, headers=headers, data=data, params=params)
    if response:
        print(response.content)
    else:
        print("This did not work")
GET requests don't use the data parameter, and your code makes a GET request (s.get). Is that right?
I can get a html page with:
headers = {
'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:65.0) Gecko/20100101 Firefox/65.0',
}
url = 'https://safer.fmcsa.dot.gov/CompanySnapshot.aspx'
response = requests.get(url, headers=headers, verify=False)
print(response.text)
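The captured browser request is a POST to /query.asp with the form fields in the body, so a closer match would look something like the sketch below. The request is only prepared, not sent, so the encoded body can be checked offline. Whether the server also requires the cookies from the capture is not verified here.

```python
from urllib.parse import parse_qs
import requests

form = {
    'searchtype': 'ANY',
    'query_type': 'queryCarrierSnapshot',
    'query_param': 'USDOT',
    'query_string': '2300842',
}

# The capture shows the form going to /query.asp, not to CompanySnapshot.aspx.
req = requests.Request('POST', 'https://safer.fmcsa.dot.gov/query.asp', data=form)
prepared = req.prepare()

print(prepared.headers['Content-Type'])
print(parse_qs(prepared.body))

# To actually send it with cookie handling:
#   s = requests.Session()
#   s.get('https://safer.fmcsa.dot.gov/CompanySnapshot.aspx')  # pick up cookies first
#   response = s.send(prepared)
```

Passing a dict through `data=` is what produces the application/x-www-form-urlencoded body seen in the capture.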
I am trying to use the requests.Session.get method in Python which can take a headers dictionary as an argument. When I copy headers from "Inspect Element" in Mozilla Firefox they look like this:
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Encoding: gzip, deflate, br
Accept-Language: en-US,en;q=0.5
Connection: keep-alive
Host: cdn.sstatic.net
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:59.0) Gecko/20100101 Firefox/59.0
But session.get needs them as a dictionary:
{"User-Agent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5)
AppleWebKit 537.36 (KHTML, like Gecko) Chrome",
"Accept":"text/html,application/xhtml+xml,application/xml;
q=0.9,image/webp,*/*;q=0.8"}
Is there a method in Python that takes a string and inserts characters into it at given places (in this example, inserting a " on both sides of every colon, inserting ", to the left and " to the right of every new-line character, and adding a \ character for line continuation before every new-line character), or one that just automatically converts strings of this form into a dictionary?
headers_as_text = """Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Encoding: gzip, deflate, br
Accept-Language: en-US,en;q=0.5
Connection: keep-alive
Host: cdn.sstatic.net
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:59.0) Gecko/20100101 Firefox/59.0"""
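def get_key_value(text):
    # Split on the first colon only; header values may contain colons themselves.
    # If there is no colon, the whole line becomes the key and the value is "".
    key, _, value = text.partition(":")
    return key.strip(), value.strip()

# Drop blank lines, then build the dictionary from the (key, value) pairs.
text_lines = filter(None, (line.strip() for line in headers_as_text.split("\n")))
dict_values = dict(map(get_key_value, text_lines))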
The final value is a dictionary:
dict_values = {
'Host': 'cdn.sstatic.net',
'Accept-Encoding': 'gzip, deflate, br',
'Connection': 'keep-alive',
'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:59.0) Gecko/20100101 Firefox/59.0',
'Accept-Language': 'en-US,en;q=0.5',
'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8'
}
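Header dumps copied from Chrome often include HTTP/2 pseudo-headers (lines starting with a colon, such as :authority) plus headers that requests normally fills in on its own. A minimal variant that filters those out is sketched below; the function name `headers_from_devtools` and the default skip list are assumptions made for illustration.

```python
def headers_from_devtools(raw, skip=('content-length', 'cookie', 'host')):
    """Turn a copied header dump into a dict suitable for requests.

    Lines starting with ':' (HTTP/2 pseudo-headers) are dropped, as are the
    names in `skip`: requests computes Content-Length, derives Host from the
    URL, and a Session manages cookies.
    """
    headers = {}
    for line in raw.splitlines():
        line = line.strip()
        if not line or line.startswith(':'):
            continue
        name, _, value = line.partition(':')
        if name.strip().lower() in skip:
            continue
        headers[name.strip()] = value.strip()
    return headers

raw = """:authority: us.forgeofempires.com
:method: POST
accept: application/json, text/plain, */*
content-length: 74
x-requested-with: XMLHttpRequest"""

print(headers_from_devtools(raw))
# {'accept': 'application/json, text/plain, */*', 'x-requested-with': 'XMLHttpRequest'}
```

The result can be passed as `headers=` to a single request or merged into a session with `session.headers.update(...)`.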
Here is what my XHR data looks like when captured in Chrome:
Request Header
POST my_url?X-Progress-ID=ee821652321919bc7ae61fbe0b625990&userpkgname=file_name.tar.gz HTTP/1.1
Host: 10.110.134.28
Connection: keep-alive
Content-Length: 17461
Accept: application/json, text/plain, */*
Origin: https://10.110.134.28
X-XSRF-TOKEN: bGIwdfFE-oaL_1yVrCzw0iHvv4yUHLC28xjw
User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.116 Safari/537.36
Content-Type: multipart/form-data; boundary=----WebKitFormBoundarybfK9jSdLoc2Mpj0i
Referer: https://10.110.134.28/
Accept-Encoding: gzip, deflate, br
Accept-Language: en-US,en;q=0.8
Cookie: XSRF-TOKEN=bGIwdfFE-oaL_1yVrCzw0iHvv4yUHLC28xjw; sid=s%3AXglfJNLQ9zzp3eHjQ2QOpk19kFKDDMvJ.ZMKyZd1Gx13lz2MnJgty5WncnilySzfoThGktkhlk4w
Payload
------WebKitFormBoundarybfK9jSdLoc2Mpj0i
Content-Disposition: form-data; name="package"; filename="file_name.tar.gz"
Content-Type: application/x-gzip
------WebKitFormBoundarybfK9jSdLoc2Mpj0i--
And this is how I am building my request.
files = {'package': (<file_name>, open(config_path, 'rb'), 'application/x-gzip')}
requests.post(url, files=files)
This is what my request headers look like:
{
'Content-Length' : '17449',
'Accept-Encoding' : 'gzip, deflate',
'Accept' : '*/*',
'User-Agent' : 'python-requests/2.10.0',
'Connection' : 'keep-alive',
'Cookie' : 'XSRF-TOKEN=zmklLEL0-gDJOfNBk113MuTpBkLo0j6MAzw0; sid=s%3AC2JZDCfpg_CgkU7qSlS5YTvWXwpgMX35.5nU7W02TPNYtMkIQ4W%2B1bjd87A7KyJbh3shoNqqADXE',
'Content-Type' : 'multipart/form-data; boundary=270d9e02bf214dc7a09c3081cba5b0e0',
'XSRF-TOKEN' : 'zmklLEL0-gDJOfNBk113MuTpBkLo0j6MAzw0'
}
When I make the request, I get a 502 Bad Gateway response, and only after a few seconds, while in Chrome I get 200 OK instantly.
So most probably I am not building my request correctly. Any suggestions?
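Comparing the two dumps, two differences stand out: Chrome sends the token in a header named X-XSRF-TOKEN while the script sends XSRF-TOKEN, and Chrome's URL carries the X-Progress-ID and userpkgname query parameters. Whether either one causes the 502 is not confirmed; the sketch below only shows how a request closer to the browser's could be built. The URL path is the placeholder from the capture, the token and file bytes are dummies, and the X-Progress-ID value is copied from the capture even though it is probably generated per upload.

```python
import io
import requests

token = 'value-of-the-XSRF-TOKEN-cookie'  # placeholder; read it from the session cookies
payload = io.BytesIO(b'dummy archive bytes')  # stands in for open(config_path, 'rb')

files = {'package': ('file_name.tar.gz', payload, 'application/x-gzip')}

req = requests.Request(
    'POST',
    'https://10.110.134.28/my_url',  # "my_url" is the placeholder used in the capture
    params={
        'X-Progress-ID': 'ee821652321919bc7ae61fbe0b625990',  # copied from the capture
        'userpkgname': 'file_name.tar.gz',
    },
    files=files,
    headers={'X-XSRF-TOKEN': token},  # same header name the browser used
)
prepared = req.prepare()

print(prepared.url)
print(prepared.headers['Content-Type'])
```

Sending it through the same requests.Session that obtained the cookies (`session.send(prepared)`) keeps the sid and XSRF-TOKEN cookies attached.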