Issue in passing session information for scraping - python

I went to this website
www4.fmovies.to
then I clicked a movie and checked its CDN URL via Inspect->Network
and got below details
https://cdn.mcloud.to/stream/sf:i0:q2:h3:p23:l1/LR6ljfLn3hrEjSfrOp19wg/1542603600/i/f/2/nr69r8/hls/480/480-0013.ts
:authority: cdn.mcloud.to
:method: GET
:path: /stream/sf:i0:q2:h3:p23:l1/LR6ljfLn3hrEjSfrOp19wg/1542603600/i/f/2/nr69r8/hls/480/480-0001.ts
:scheme: https
accept: */*
accept-encoding: gzip, deflate, br
accept-language: en-US,en;q=0.9
cookie: __cfduid=d0847f9ac6d9a8da1dd131d1a0a91ea991542533053; _ga=GA1.2.485859786.1542533055; _gid=GA1.2.1916946057.1542533055; _gat=1
origin: https://mcloud.to
referer: https://mcloud.to/embed/#P#O8SE2916SEOA5?sub.file=https%253A%252F%252Fstatic1.akacdn.ru%252Fsubtitle%252F40039.vtt%253Fv1&ui=oAhi567w9OQEhJWEdbl0s%40Ep0Ir2VvG1xiK9JqKx
user-agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36
created header information using the above information and then ran
request = requests.get(url, headers=headers)
But am getting 403 Not Authorized. What is the issue?

You need to pass referer header that is the src attribute of video content iframe that looks like
<iframe src="https://mcloud.to/embed/#9#4ZS04Z10SWOE5?ui=pwxi4Kjr6%40wHmIqHcrl0yeFfpYqUUIW1wCKlJr6x" allow="autoplay; fullscreen" scrolling="no" allowfullscreen="yes" style="width: 100%; height: 100%;" frameborder="no"></iframe>
The code looks like
headers = {'user-agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:60.0)Gecko/20100101 Firefox/60.', 'pragma': 'no-cache', 'connection': 'keep-alive', 'cache-control': 'no-cache', 'referer': 'https://mcloud.to/embed/#9#4ZS04Z10SWOE5?ui=pwxi4Kjr6%40wHmIqHcrl0yeFfpYqUUIW1wCKlJr6x'}
requests.get('https://cdn.mcloud.to/stream/sf:i0:q2:h2:p24:l1/WjLDZuCBHmtyv63lT-RoVQ/1542603600/g/c/0/rj0m0m/hls/480/480-0000.ts', headers=headers)

Related

Python requests: POST form data, some fields updating but others not

I'm trying to post form data to a printers Web interface, i can post fields successfully but there's a certain type of form data which doesn't seem to work/update.
My Code:
import sys
import requests
IP='xxx.xxx.xxx.xxx'
PrinterSession = requests.session()
csrfdata = PrinterSession .get('http://xxx.xxx.xxx.xxx/properties/authentication/login.php')
csrftoken = str(csrfdata.content)[str(csrfdata.content).index('CSRFToken')+18:str(csrfdata.content).index('CSRFToken')+146]
login_data ={
'webUsername' : 'admin',
'webPassword' : 'xxxx',
'CSRFToken' : csrftoken,
'NextPage' : '/properties/authentication/luidLogin.php',
'_fun_function' : 'HTTP_Authenticate_fn',
'frmaltDomain' : 'default',
}
LoginPrinter = PrinterSession.post("http://xxx.xxx.xxx.xxx/userpost/printer.set", data=login_data )
headersSMTP = {
'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,* /*;q=0.8,application/signed-exchange;v=b3;q=0.9',
'Accept-Encoding': 'gzip, deflate, br',
'Accept-Language': 'en-US,en;q=0.9',
'Cache-Control': 'max-age=0',
'Connection': 'keep-alive',
'Content-Length': '328',
'Content-Type': 'application/x-www-form-urlencoded',
'Cookie': 'PageToShow=; statusNumNodes=8; statusSelected=n1; frmCompany=; frmIFax=; frmFaxNumber=; frmProtocol=SMB; frmDocumentPath=; frmLoginName=Xerox; frmServerName=; frmNdsContext=; frmSmbShare=Scans; frmNdsTree=; frmIpv6_Host_1=%3A%3A; frmFirstName=David; frmLastName=Fester; frmFriendlyName=SMB%20Scan; frmEmail=dawie180#gmail.com; frmDisplayName=Fester%2C%20David; frmServerVolume=Scans; frmIpv4_1_1=12; frmIpv4_1_2=12; frmIpv4_1_3=12; frmIpv4_1_4=12; frmXrxAdd_1=Hn; frmHnAdd_1=asdasdsdasda; propNumNodes=117; PHPSESSID=457428a534bf077cc6bb0fff7ee80f7f; propSelected=n49; propHierarchy=001010000010000000000000000; WebTimerPopupID=15',
'Host': '10.241.24.28',
'Origin': 'https://10.241.24.28',
'Referer': 'https://10.241.24.28/protocols/smtp/required.php?from=email_req_smtp',
'sec-ch-ua': '" Not A;Brand";v="99", "Chromium";v="90", "Google Chrome";v="90"',
'sec-ch-ua-mobile': '?0',
'Sec-Fetch-Dest': 'frame',
'Sec-Fetch-Mode': 'navigate',
'Sec-Fetch-Site': 'same-origin',
'Sec-Fetch-User': '?1',
'Upgrade-Insecure-Requests': '1',
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.85 Safari/537.36',
}
cookietolist=headersSMTP['Cookie'].split(";")
cookietolist[27]=' PHPSESSID='+PrinterSession.cookies.get_dict()['PHPSESSID']
StringCookie = ";".join(cookietolist)
headersSMTP['Cookie']=StringCookie
postSMTPform = {
'_fun_function': 'HTTP_Set_Config_Attrib_fn',
'_fun_function': 'HTTP_CN_Set_fn',
'_fun_function': 'HTTP_SNMP_Set_SvcMon_fn',
'NextPage': '/properties/email/required.php',
'CSRFToken': csrftoken,
'POP3_MAILBOX_ADDRESS': 'smtp.test.com',
'connectivity.smtp.server': 'xxx.xxx.xxx.xxx:25',
}
PostSMTP = PrinterSession.post('http://'+IP+'/dummypost/printer.set', data=postSMTPform, headers=headersSMTP)
Post Request inspected via Chrome:
Request Headers:
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
Accept-Encoding: gzip, deflate, br
Accept-Language: en-US,en;q=0.9
Cache-Control: max-age=0
Connection: keep-alive
Content-Length: 364
Content-Type: application/x-www-form-urlencoded
Cookie: PHPSESSID=c960f2c4fb2f99c44333e601d3cc6a79; propSelected=n1; propNumNodes=117; WebTimerPopupID=2
Host: 10.241.24.28
Origin: https://10.241.24.28
Referer: https://10.241.24.28/protocols/smtp/required.php?from=email_cfg_over
sec-ch-ua: " Not A;Brand";v="99", "Chromium";v="90", "Google Chrome";v="90"
sec-ch-ua-mobile: ?0
Sec-Fetch-Dest: frame
Sec-Fetch-Mode: navigate
Sec-Fetch-Site: same-origin
Sec-Fetch-User: ?1
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.93 Safari/537.36
Form Data:
_fun_function: HTTP_Set_Config_Attrib_fn
_fun_function: HTTP_CN_Set_fn
_fun_function: HTTP_SNMP_Set_SvcMon_fn
NextPage: /config_overview/email.php
connectivity.smtp.server: 10.10.10.10:25
POP3_MAILBOX_ADDRESS: test#home.com
CSRFToken: eb590d22d4b0c75038816bd82a26e188f9a49941029c42765dd790cb8acd07b4966f5a910f493f2d839cb36d5d0a77a7632f614e06fcd67019c0f95f508d7da1
So "POP3_MAILBOX_ADDRESS: test#home.com" works fine, updates the SMTP From Address in the printer, all other Fields using the same Uppercase and underscore style key works fine, but the other field "connectivity.smtp.server: 10.10.10.10:25" does not work, it does not update the setting in the printer and it seems every field which uses this lower case seperated by periods style does not work.
The "POP3_MAILBOX_ADDRESS: test#home.com" works fine even without parsing the headers with the post request, i added the headers after and it still doesn't work. Don't know what im doing wrong.

Fatal erro in POST using request module Python 3

I'm working on a web scraper build in python. Until now I build the following code:
import requests
headers = {
'authority': 'truegamedata.com',
'accept': '*/*',
'x-requested-with': 'XMLHttpRequest',
'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.152 Safari/537.36',
'content-type': 'application/x-www-form-urlencoded; charset=UTF-8',
'sec-gpc': '1',
'origin': 'https://truegamedata.com',
'sec-fetch-site': 'same-origin',
'sec-fetch-mode': 'cors',
'sec-fetch-dest': 'empty',
'referer': 'https://truegamedata.com/weapon_builder.php',
'accept-language': 'pt-BR,pt;q=0.9,en-US;q=0.8,en;q=0.7',
}
data = {
'weapon_name': '^%^5B^%^22Kilo 141^%^22^%^2C^%^22wz^%^22^%^5D'
}
response = requests.post('https://truegamedata.com/SQL_calls/base_data.php', headers=headers, data=data)
print(response.text)
For some reason, I get the following error:
<br />
<b>Fatal error</b>: Uncaught Error: Call to a member function execute() on bool in /home/customer/www/truegamedata.com/public_html/SQL_calls/base_data.php:29
Stack trace:
#0 {main}
thrown in <b>/home/customer/www/truegamedata.com/public_html/SQL_calls/base_data.php</b> on line <b>29</b><br />
Does anyone know why this is happening? And how I can get this response?
Here is the request from Chorme Dev tools:
Request URL: https://truegamedata.com/SQL_calls/base_data.php
Request Method: POST
Status Code: 200
Remote Address: 127.0.0.1:61696
Referrer Policy: strict-origin-when-cross-origin
cache-control: no-store, no-cache, must-revalidate
content-encoding: br
content-type: text/html; charset=UTF-8
date: Fri, 12 Feb 2021 20:08:45 GMT
expires: Thu, 19 Nov 1981 08:52:00 GMT
host-header: 8441280b0c35cbc1147f8ba998a563a7
pragma: no-cache
server: nginx
vary: Accept-Encoding
x-httpd-modphp: 1
x-proxy-cache-info: DT:1
:authority: truegamedata.com
:method: POST
:path: /SQL_calls/base_data.php
:scheme: https
accept: */*
accept-encoding: gzip, deflate, br
accept-language: pt-BR,pt;q=0.9
content-length: 42
content-type: application/x-www-form-urlencoded; charset=UTF-8
cookie: PHPSESSID=375e8ebdfa9174d6db5eb8c1cda4411b; game=wz
origin: https://truegamedata.com
referer: https://truegamedata.com/weapon_builder.php
sec-fetch-dest: empty
sec-fetch-mode: cors
sec-fetch-site: same-origin
sec-gpc: 1
user-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.152 Safari/537.36
x-requested-with: XMLHttpRequest
weapon_name: ["FR 5.56","wz"]
I tried to give as much information as possible, if anything is missing let me know

curl get response 403

How can I get around the 403 code?
I was trying to get search result from a website, however I got "Response[403]" message, I've found similar get solving 403 error by adding headers to request.get, however it didn't work for my problem. What should I do to correctly get the result I want?
Google Chrome Request
Request
Request URL: http://178.32.46.165/
Request Method: GET
Status Code: 200 OK
Remote Address: 178.32.46.165:80
Referrer Policy: strict-origin-when-cross-origin
Request Headers
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
Accept-Encoding: gzip, deflate
Accept-Language: tr-TR,tr;q=0.9
Cache-Control: max-age=0
Connection: keep-alive
Host: 178.32.46.165
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.102 Safari/537.36
Python Request
import requests
headers = {
'Connection': 'keep-alive',
'Cache-Control': 'max-age=0',
'Upgrade-Insecure-Requests': '1',
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.102 Safari/537.36',
'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
'Accept-Language': 'tr-TR,tr;q=0.9',
}
response = requests.get('http://178.32.46.165/', headers=headers, verify=False)
print(response)
Response
<Response [403]>
If you're getting a 403 response, you're likely not meant to be able to access this resource. This is not a Python issue, this is an issue with whatever website you're trying to access.
Additionally, this seems to be a temporary issue. I pasted your code as-is into my Python prompt and received <Response [200]>
I think it may provide blocking depending on the country.

Download file after POST form submission

I am trying to download the file from here.
In Firefox, the link redirects me to a webpage where I have to log in with my credentials. After doing so, Firefox automatically prompts me to save the desired file with a pop-up windows.
I am trying to replicate that behavior with Python without success.
I've tried the code from this question, namely
import requests
session = requests.session()
url = "https://metamap.nlm.nih.gov/download/public_mm_linux_main_2020.tar.bz2"
data= {'username': 'xxxxx', 'password':'xxxxxx'}
session.post(url, data=data)
response = requests.get("https://metamap.nlm.nih.gov/download/public_mm_linux_main_2020.tar.bz2")
print(response.text)
Insights from Firefox
Using the Network tab from the Developer Mode, I have been able to see the following:
The POST request, apart from username and password sends also an execution parameter, which seems to be very long hashed string. I don't know what this is or how to replicate this.
POST /cas/login?service=https%3a%2f%2fmetamap.nlm.nih.gov%2fdownload%2fpublic_mm_linux_main_2020.tar.bz2 HTTP/1.1
Host: utslogin.nlm.nih.gov
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:79.0) Gecko/20100101 Firefox/79.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate, br
Content-Type: application/x-www-form-urlencoded
Content-Length: 4900
Origin: https://utslogin.nlm.nih.gov
Connection: keep-alive
Referer: https://utslogin.nlm.nih.gov/cas/login?service=https%3a%2f%2fmetamap.nlm.nih.gov%2fdownload%2fpublic_mm_linux_main_2020.tar.bz2
Cookie: TGC=exxxxxx; JSESSIONID=xxxxx
Upgrade-Insecure-Requests: 1
Afterwards, there is a GET request to the same address with a ticket parameter.
GET /download/public_mm_linux_main_2020.tar.bz2?ticket=xxxxxx HTTP/1.1
Host: metamap.nlm.nih.gov
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:79.0) Gecko/20100101 Firefox/79.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate, br
Referer: https://utslogin.nlm.nih.gov/cas/login?service=https%3a%2f%2fmetamap.nlm.nih.gov%2fdownload%2fpublic_mm_linux_main_2020.tar.bz2
Connection: keep-alive
Cookie: MOD_AUTH_CAS_S=xxxxxx
Upgrade-Insecure-Requests: 1
Finally, another GET triggers the download.
GET /download/public_mm_linux_main_2020.tar.bz2 HTTP/1.1
Host: metamap.nlm.nih.gov
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:79.0) Gecko/20100101 Firefox/79.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate, br
Referer: https://utslogin.nlm.nih.gov/cas/login?service=https%3a%2f%2fmetamap.nlm.nih.gov%2fdownload%2fpublic_mm_linux_main_2020.tar.bz2
Connection: keep-alive
Cookie: MOD_AUTH_CAS_S=xxxxx
Upgrade-Insecure-Requests: 1
In between all of this, cookies are set and read.
I don't know what information is useful. I'm putting here all that I can see.
Than you in advanced.
EDIT:
After more exploration I am trying to replicate the 3 request pattern seen in Mozilla/Chrome.
This is my code now:
import requests
url_1 = "https://utslogin.nlm.nih.gov/cas/login"
# url_2 = "https://metamap.nlm.nih.gov/download/public_mm_linux_main_2020.tar.bz2"
params = {'service': 'https%3a%2f%2fmetamap.nlm.nih.gov%2fdownload%2fpublic_mm_linux_main_2020.tar.bz2'}
data = {'username': 'xxxxx',
'password':'xxxxx',
'execution': '1cb37c1a-141b-4891-b961-ea5c4b20deed_ZXlKaGJHY2lPaUpJVXpVeE1pSjkuWXpWb1EzaFdTMmxOZGxsdUwwUlFiWEozSzBkSlVuUXhURlpuZEROSmVpdGtObFpVYjFkU01FMWlhMjkxS3lzNWVrbHlNblptTWt4MVdrdE9SU3RJTmsxTGNISmhNblYxTmtOVVExWkJPWFJZUkVKbFREWk1WV2RaZEhSSU5qbHNNbTlZVldONGMyTndNWGgxVGtWalZtcHRkRWMxZGxaSGNsVmtTVmh6UzJSWmFEQjBVRmRMUlVSalRsTk5WVVYxYlU4elRHUjNZV0pSWVhCblVsQXdjemRhZW5ac1NHeDNhMU5pYjA5UlVYTk1ZWHB1UWtkeGJtUTFRa1V5T0RseVNWcHlSa3RZY2pGS09FeFNjekZNY21KVmFsaE1la3RKZFdsd1pGaEhNVmhIY3pJMU4zRkZVVUZrYTA1aE5IZGljMlZZWTA5Q1kyNXFTbTlUUkRsR1UxcHdVa2wyUlhaU1dsbFpaRU5yVFhwQmQwOWlOVTh4WTNaVmNIWXJaalJqUmxGMU4xSmxVVWx3VjNwR1ZrdDROamhaTjI5UlJXTjVXbXg2YVdGbE5HTlpkM0JxUW5kVWJUaHVXVGx6V0VKek5VRmpjV2c0TmpsSlUydHJRWEZUVVRaeWNEUTNSVko2ZGtnclltZG5iRkY2VW5oQmMwcFVSRXAzV0c5eVJqSTVNSEJXTVhOSWNpczFMMVp3VDFWRlJsUnJlSE5ETlVaS2VYQjJPV2hWU2pOVVZ6RlBMM0pDYzBOWlQzbEJkV3g1TWtSYWMwaEhjR0ZSUjI5NGFIZGpha2RJTkVzeVRGaFRjMnh0UldKSFdVWm9halF6YUhKa04wUnNabmQ2U1ZaUE1tWkRUR2RPTlhORVJHOXhNbWR5WWxCWldFUkJjMmQyY1dGamVWTjZZMWRCUTBOclRXOW9XWGRpUkVKS09XZGlSVzl1WVZOQlpUVjZZamhSVEhKdGEyYzNUM0pxYVhCdmJUTmxRbWMwVXpJMGIycE5WRXBXZVhKV2NWRTJNSEJ6UTNobFVUZzJOa1ZMWlhkMGJGQjBRbVIwUjBGNlMyNURPVEZ6UWtWS1RHbGhWRzR4Ulc0MU0ybHRaRVIzVm1Ga1ZGVTVVemQ0UVd4Q2JtUndTMEZ1Tmtac1l6SmtWRVpxYUhKTlRFVnljMnhCV1VkMk5YaEZkbEpqZDBoSUwzQkZNVlJVYTFkUmFIUjBhMXBwTVZSdWRXcFdkR1EzYkhoQ2JXZDFjekZyV1hGbVVVaDBRV1JWTlZwQk1HSkRSV3BSWmxwSVNDOVNRbTR4V1dkNmJXNVlWRzl6VWxKWlRHUkdTV2h4TUhkVWRrMW9XR3AyVEdGRVZYRkhiMnB1Y2pSemFXNVBRa3h1Y0VGTUszWk9TekpYUjFsRVNYSkdOSFJ3ZVhwT1dqUldVMFozWVRoRVdIRkRZbmxEVGxSdWRqaE1SQzg0V2swdmJYcGpNRmxoUkVObllsVmtNVWRxYmpSM01VUlhURlpWY2t0VVJGRTNhMVZ4WlhSa2JVTnlWVVU0U2pWd2VDOWxZVFJZVmpoU1ZqWkxNVlJDVVUxS1lsUlJZM1k1ZWtwNmNqQTBZbFJXVUhaNmQxUnNkSFpuTWxsQmMxSkpjbWhOZW1SQ1ZEVllWeXR5ZFd0WFRsRkNOVkZNV2xweFRFVlpRelpSV0VoUWIyWlJZMjFSU1hScFJuQkhlRVI0Um5aV1l6RnJXazlpUlRjcmVFeE5ORWQwTTAxWVNYbGtTVzVLYnpsVFdXOUJhVXB5WWpGek1ERjZkMGxsZEVVMWRWTnVXbFprTjIwMFQza3lNWGgxUkhFME1FVlVLMXBqUzJWM1NEaFFPVFkwVEVwR2RtbElVelUzUVhJM1NHMVphMUJOVjJsUGFuZHJWRzVvT0hWdkwzVnVOVnBDV2tGSmJ6UkVWSFUwV2lzd1MwRXhSazlaUW5ZNE9EbFBkekZLZWpWa2RqZ3JUekEwT0ZaeGEzVmthamw0YWpSU09EUmFPRFIzT1VoUGN6QlpOVTlQZDFFMVdtWTJRMjFRYTFSUUx6aGhZMGxETTJsUVZFcHpMeXM0ZEVWS1MzaFdiRnBTU1RaU1dYYzJZWGR0UWpaUldsSXJVV2N3YVRZNWNIbEJSbUZGV2k4eFkwNTFUakUzVm5oWGFWbDVla1JhTlhKeVQzRkRSM05hTUVST1EyNUNTVkJFVkhKMldIRnpha3huVlZSNmQwTkdWVGN4YVhRMVFURldXUzlXUjB4a1VIUndZMFJSTTFobEwzSjZMemc1SzNoVGJGUjNkV3A2ZEd3elZtZFpPRVV2YVhaMlJUVXJkREJxWmtsSlZIRnhNa050ZGpSYUwwZEpVV3BXVVdodFFYbG5SazVOZUZCT2IycENVelkyT1dGRWIwUXlXblZFVFdVMmRFRlhXaXN5TWpVekwyZzRWSHBWZEdWU1pDczVURlI2Y0haYVRubzJXblpJYWl0Qk9Dc3djMlZIU0VJd2EwaFpNVWwxZEV4NE1ESnBWVFJOTUVSNVRXTk9Za3hDYjBwRWNtNUhVVzlSYjBoNk5rOXllSEpZYW5oUVdXbzRkeTkzYjJRM1dqSnRhekZNTVZKaGMyTjRXRGQ2V25sMFZpOVhVakUwTDBZM00yMTZVMDV0UTJRMFUweHBWbmhCUW14ME1ETmlWVGs0VHpaRFpIcGpjMk5xWVhWMGVtbzBTVmhpTTNWeVRHUnJSbmt3TVV4RlVVWkNha2Q1VkVkVWFuTnFOblpuTm1KTWNqZFRiVVZHTmxoTlYwUk5TbG8wZUdKTVNWbG9OU3NyVFRKQ2VGQktTek5hV0RKbVRraHdUbVY2YVdoUk9HMXNPVk42VVRsaGRuYzNaVlZoUzNkaFJHdEpZMjRyUkhwVGIzRjFTVlpZZWxGNllUZENiMVUwU1hOQlVteEtkVkpSZDFoak1TdExjR1JXV0dwWmFYbGtMMU5hYTBjMk5sSTFOSGRoTTBaVGFrZzVia1JOTmpWV0wzQnhaalZVS3pGNWJITm9SV1puVmprMFNIUm1halozWW10Q1VXRldTMGh5Wld4MFMzQkdUMFIyTmxCVmFUZEVLemhTTDNGVWVuTnNNa2hzUWpOUVlrZHdhSGRRVUZadGJWTnNPSFJHYVhoQ1RIVk9iMkZIUVVSdE1WSXpWM0pRZWt0aGMzcExhbWwxUWxNMFJWWXhjSE5uVVRZMVJsQkhXbEJYZDNORFVVaHZRM00wTkRaS1lsSkhaRll6UlhsS09IUmFNMHMxYkhwTk1XYzVSak5yYmpVclVqZzNWRGhNZEVwb1ZEQlRiMWN4VkZkdU1VTlplVmgzTlVRMllYUkpRbEJ5T0N0NVVYZDRhVkpsU1ZsTmMwaHpRelFyVFRBNU5GUnhVVWRUTTJjeFNWZEdTa0Z2YjJsTVFreHFXRTUwYkhveVZtTmhTSE5DVVdaTU1rTkhSR1JVWkRCMkt6RnRhbkpPVVhOeVVFMUZaMUpQVFhZMVJrUm9XVlZpSzA1SFUwMWhlVlJNV2taRGNqTmFVMFZpTkdsT1dVVjJUVzR5Y2pkTk4zQnFhMFZOWjJoSFpsaElXbGxVVjFCalJraGFaak5RYjA4eFJVcDBURUpUYURaeWJXbGlTek4xVG1aUVNYSlJaVEU1V213Mk1qUTJia3RTV1dvemVEVkZNVWhZYVdJMFVtUkJTWGxNWmsxS09WRmxRWFJ3VG1zdmVDdHlNaTlTWnpKSGRrZHRTMWhyV2xKbFUyczVSWFJSVTNaa1JrRlRRMnRFZUZFdlVXRkhOVGhpT0RGaUswTjFRbGd3ZEVSTmJsTkxiVVV4YVdGVVMwaFBLMXBpV2pWa1NXSjZOM0J2YlRoclRubHpOMFJ2ZVZWbk9ITnBlREZKWWtNckx6SnBkRmhtWkhGa05HZ3dhVEkzVTNwWFJFOXpWMnRqZURkSUsyZEtTRGxSYzNKeGMyWmxZbEJOU1hnekwwY3pWbmRNV0hScmMyVk5ZVzAzYW5nemQwWndSbVZFYTA5YWIzWndZVVpJVGk5TVZHVk9lRlIwU2poUlpWbDFaVTFaY1dGbGEzWkROVmxUWjNoa2JVMURPVkpWWm1WRlJYb3JiWEpaYkU4NFYzb3JkRTlYUWxOQlZFaFViMHhoVEhKVGQwMU5PVEpHV1dFNFJGb3hWM04zY214SGVGcFNlbEZ2VTFWNEsyaDNLMEpITDJaMGEycGxhM0pEZW0xTWMzUm9PSE4zUjJaV1FrcHRSVk4wVkc1M2FFRmxSbVJUVFZWelpFaEdXSEZyVkhkMU9HWmlabkZKWlhsdU16RlpOREpQYTJabldGaEtlRzR2WjNVM2JUWkJQVDAuSXFFcjcxblloaURBYjl6QUtnN2hRVVNoaE9QbVZQWlhnT1NZQ0lVN3E3NWw0TGtHS2x4OWhDbzVkMVNRQlBnaVRfbkQtemVEdHpyS0F4NldvV0JLZ1E=',
'submit': 'LOGIN'}
r = requests.get(url_1, params = params)
print(r.headers)
print(r.cookies)
print(r.status_code)
headers = {'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
'Accept-Encoding': 'gzip, deflate, br',
'Content-Type': 'application/x-www-form-urlencoded',
'Sec-Fetch-Dest': 'document',
'Sec-Fetch-Mode': 'navigate',
'Sec-Fetch-Site': 'same-origin',
'Sec-Fetch-User': '?1',
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.135 Safari/537.36',
'Upgrade-Insecure-Requests': 1,
'Referer': 'https://utslogin.nlm.nih.gov/cas/login?service=https%3a%2f%2fmetamap.nlm.nih.gov%2fdownload%2fpublic_mm_linux_main_2020.tar.bz2',
'Origin': 'https://utslogin.nlm.nih.gov',
'Host': 'utslogin.nlm.nih.gov',
'Connection': 'keep-alive',
'Cache-Control': 'max-age=0',
'Accept-Language': 'en-GB,en-US;q=0.9,en;q=0.8,es-CO;q=0.7,es-AR;q=0.6,es;q=0.5,it;q=0.',
'Content-Length': 4634}
r_2 = requests.post(url_1, data=data, cookies=r.cookies, params = params)
print("-----------------")
print(r_2.headers)
print(r_2.cookies)
print(r_2.status_code)
execution and submit are parameters also passed during the POST request. With the first GET I emulate the normal loading of the page, after which a cookie is given to me. Using that cookie and submitting the form, I expect to get back another cookie and a ticket with which to download the file. However, instead of getting this, what I see in output is the response header of another GET as is anything had happened.
By the way, to whomever wants to help me: the credentials from that site are acquired prior by registering and some hours may pass between registration and credential arrival.

python post request with cookies asp.net

Please help. I'm trying to create a POST request on an .asp site that requires cookies, but the way I handle them seems not to return anything. Read through some questions of similar topic but can't find the _SessionID cookie some are referring to. Please help me formulate this POST request so it works.
Headers
:authority: safer.fmcsa.dot.gov
:method: POST
:path: /query.asp
:scheme: https
accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
accept-encoding: gzip, deflate, br
accept-language: en-US,en;q=0.9
cache-control: max-age=0
content-length: 85
content-type: application/x-www-form-urlencoded
cookie: ASP.NET_SessionId=ywxszihqlu1yciwe5z5gm4qt; etype=au; ASPSESSIONIDQECTCDRB=KGFOBHBBLKCBKBFIAPEBMIHJ; ASPSESSIONIDQGARCDRA=LKEDBAOBMOMDNGBNBFEMMIPB; ASPSESSIONIDSEBQCBSB=DAMJMNKCNJKHCMDCIJBPKEHD; ASPSESSIONIDCEQRADQC=EIEJCDLBHHCCKHCNJNIMHDKA; ASPSESSIONIDAESTCBQC=KPDPJNHCLOBJENEHPNIFKJLH; LI_carrier=67449; ASPSESSIONIDAGSQADRC=CPIBAKEDDPNFCIPLIGLOKKLA; ASPSESSIONIDAERTDBQD=FMKFHJJAJKNIGCBCCFJFCMNF; AWSALB=Xc7OAuZUmx6vgE5l9NaawsH8oBWjy6eZ3B62kw2rZ5HieoRlMu4SSmVVcaPJPcPjp1fVt9U/T9FaRflgNHwtzmsK4X4e+y+yoGArTfgpb75NWo/ilAek0Qk/sFYI; AWSALBCORS=Xc7OAuZUmx6vgE5l9NaawsH8oBWjy6eZ3B62kw2rZ5HieoRlMu4SSmVVcaPJPcPjp1fVt9U/T9FaRflgNHwtzmsK4X4e+y+yoGArTfgpb75NWo/ilAek0Qk/sFYI
origin: https://safer.fmcsa.dot.gov
referer: https://safer.fmcsa.dot.gov/CompanySnapshot.aspx
sec-fetch-dest: document
sec-fetch-mode: navigate
sec-fetch-site: same-origin
sec-fetch-user: ?1
upgrade-insecure-requests: 1
user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.132 Safari/537.36
Form Data
searchtype: ANY
query_type: queryCarrierSnapshot
query_param: USDOT
query_string: 2300842
My Code So Far
def checkDOT():
headers = {
'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:65.0) Gecko/20100101 Firefox/65.0',
'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
'Accept-Language': 'en-US,en;q=0.9',
'Cache-Control': 'no-cache,no-store,must-revalidate,max-age=0,private',
'upgrade-insecure-requests': '1',
'Connection': 'keep-alive',
'origin': 'https://safer.fmcsa.dot.gov',
'referer': 'https://safer.fmcsa.dot.gov/CompanySnapshot.aspx'
}
s = requests.Session()
data = {
'searchtype': 'ANY',
'query_type': 'queryCarrierSnapshot',
'query_param': 'USDOT',
'query_string': '2300842'
}
params = (
('pageNumber', '0'),
('itemsPerPage', '15'),
)
url = 'https://safer.fmcsa.dot.gov/CompanySnapshot.aspx'
response = s.get(url, headers=headers, data=data, params=params)
if response:
print(response.content)
else:
print("This did not work")
get requests dont use data parameter, and your code is requests.get,is that right?
I can get a html page with:
headers = {
'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:65.0) Gecko/20100101 Firefox/65.0',
}
url = 'https://safer.fmcsa.dot.gov/CompanySnapshot.aspx'
response = requests.get(url, headers=headers, verify=False)
print(response.text)

Categories

Resources