I'm trying to make a POST request using the following code, converted from a cURL command:
import requests

s = requests.Session()

headers = {
'authority': 'www.discoverecentral.com',
'accept': '*/*',
'accept-language': 'en-US,en;q=0.9',
'content-type': 'application/x-www-form-urlencoded',
# Requests sorts cookies= alphabetically
# 'cookie': 'COOKIE_SUPPORT=true; GUEST_LANGUAGE_ID=en_US; _ga=GA1.2.976004968.1651680892; _gid=GA1.2.1647177146.1652091069; ak_bmsc=376E16054B8CE1667585CF4B843B1281~000000000000000000000000000000~YAAQVJPIF87oQqyAAQAA5y8jrQ9DHy/4GZJUo1mSNg5U7s7R0A1ATGV+bFMIIp99MPTSGgwRJbLppQ33OtTnvp4dT1gF31OZ01N5b7SAvYbzGh6p1JHCPRkuLI7LI/yDQ/Y24KBTfsRYeTkILDOlI948yMwXay1lXdXMwVmiUOhfUV1TqPoS/kuHVjF+Pu5TYaGVoHmz2tARel9ydbLCv44P+yYkEssPPJanuEtdg3A3IYXH4SzSbaqhN+yV2OmwbYj9C4rHP3Vb1R7g2zQAKzS8Z+kwdV5Ns13EVuFPb+bVNxAKUIsnMKy7Lpxa05e+l38JktfKWtto7bBkfAzH7FyibI/6iyCvw/cghpDaE/PkXqXZDZh6GFWkVUABzngytkXRkS1aTG9VwhBJap2iJbWaVvA=; SAML_SP_SESSION_KEY=_bd42396230f077643c06f7bb75c60202169a8011748d5bf587745d054563; JSESSIONID=429078A672F540F1159490C033065E11.jvm6; _gat=1; bm_sv=237874D0F3F147A8B5E9FE30ABD61E37~YAAQVJPIFxrtQqyAAQAA/dqCrQ8Pjm4VHd954FLp0cvcoavAJFayiPFK25Q0lEeLQz4Ejuy7Q2GTzcT1DC0xhWkz2XAC6zLrqBc93TFAOG9zTjPZFqUTKfu9XplU5QowZlz76ekHhvprJpnen+rsaOPGScci0EPsUaU4LXyknJADa97lizWyy/1RpFDuSUnspML6cYGOBwVmpVs3EM13bfVQCuB7r4li7iMJ0toY6hl30+YIzwF7ESB1xrlwvl59Uvumf3j4w4UC1kw=~1; LFR_SESSION_STATE_4814701=1652178477701',
'origin': 'https://www.discoverecentral.com',
'referer': 'https://www.discoverecentral.com/group/merchant/my-reports',
'sec-ch-ua': '" Not A;Brand";v="99", "Chromium";v="101", "Google Chrome";v="101"',
'sec-ch-ua-mobile': '?0',
'sec-ch-ua-platform': '"Windows"',
'sec-fetch-dest': 'empty',
'sec-fetch-mode': 'cors',
'sec-fetch-site': 'same-origin',
'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.54 Safari/537.36',
'x-requested-with':'XMLHttpRequest',
}
params = {
'p_p_id': 'DiscoverMyReportPortlet_WAR_discovermyreportportlet',
'p_p_lifecycle': '2',
'p_p_state': 'normal',
'p_p_mode': 'view',
'p_p_resource_id': 'retreiveHierarchyList',
'p_p_cacheability': 'cacheLevelPage',
}
data = '&direction=des&orderBy=Default&selectedLocalEntityId=6011&gridPageSize=5000'
response = s.post('https://www.discoverecentral.com/group/merchant/my-reports', params=params, headers=headers, data=data)
The response is a 401 Unauthorized. I know it has something to do with the structure of the data being passed into the request. Has anyone come across a similar issue?
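For reference, requests will form-encode a plain dict for you, whereas the string above is sent verbatim, leading '&' included. A minimal sketch of the dict form (field names taken from the string above; whether this alone clears the 401 depends on the site's login and session cookies):
data = {
    'direction': 'des',
    'orderBy': 'Default',
    'selectedLocalEntityId': '6011',
    'gridPageSize': '5000',
}
# The session encodes the dict as application/x-www-form-urlencoded automatically.
response = s.post('https://www.discoverecentral.com/group/merchant/my-reports', params=params, headers=headers, data=data)
print(response.status_code)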
Related
I need to get all image prompts from the https://lexica.art/?q=history request, but the website returns a 403 error code when I try to send a request.
I already tried setting the User-Agent header and copying all of the request headers, but it still isn't working.
Here is my code:
import requests
url="https://lexica.art/api/trpc/prompts.infinitePrompts?batch=1&input={%220%22%3A{%22json%22%3A{%22text%22%3A%22history%22%2C%22searchMode%22%3A%22images%22%2C%22source%22%3A%22search%22%2C%22cursor%22%3A250}}}"
headers = {
'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8',
'Accept-Encoding': 'gzip, deflate, br',
'Accept-Language': 'en-US,en;q=0.5',
'Alt-Used': 'lexica.art',
'cache-control': 'max-age=0',
'Connection': 'keep-alive',
'Host': 'lexica.art',
'Sec-Fetch-Dest': 'document',
'Sec-Fetch-Mode': 'navigate',
'Sec-Fetch-Site': 'cross-site',
'TE': 'trailers',
'Upgrade-Insecure-Requests': '1',
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/106.0.0.0 Safari/537.36'
}
r=requests.get(url, headers=headers)
print(r.status_code)
I used the Selenium library instead, and everything is working fine.
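For illustration, a minimal sketch of that Selenium approach (it assumes Chrome and the selenium package are installed; extracting the individual prompts from the rendered HTML is left out):
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

options = Options()
options.add_argument('--headless=new')  # run Chrome without opening a window
driver = webdriver.Chrome(options=options)

driver.get('https://lexica.art/?q=history')
html = driver.page_source  # fully rendered page, JavaScript executed
driver.quit()
print(len(html))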
I am trying to do some web scraping from here, but I am struggling to get the access token automatically. Every time I do the web scraping, I need to manually update the Bearer token. Is there a way to do this automatically?
Let me show you how I do it manually:
import requests

url_WiZink = 'https://www.creditopessoal.wizink.pt/gravitee/gateway/api-chn-loans/v1/loans/quotation'
headers_WiZink = {'Accept': 'application/json',
'Accept-Encoding': 'gzip, deflate, br',
'Accept-Language': 'pt-PT',
'Authorization': 'Bearer de6ea490-381e-417f-ab77-3aad0d7eb63c',
'Connection': 'keep-alive',
'Content-Length': '266',
'Content-Type': 'application/json;charset=UTF-8',
'Host': 'www.creditopessoal.wizink.pt',
'Origin': 'https://www.wizink.pt',
'Referer': 'https://www.wizink.pt/',
'sec-ch-ua': '" Not A;Brand";v="99", "Chromium";v="90", "Google Chrome";v="90"',
'sec-ch-ua-mobile': '?0',
'Sec-Fetch-Dest': 'empty',
'Sec-Fetch-Mode': 'cors',
'Sec-Fetch-Site': 'same-site',
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.212 Safari/537.36',
'X-Channel-Id': 'LOANSIMULATOR',
'X-Client-Id': 'simWzkPt',
'X-Country-Id': 'PRT',
'X-Device-UUID': 'd14e9b629804cbba1ac7c3e78ab39a56'}
payload_WiZink = {"productCode":"WZP01","fixedTermLoanId":84,"impositionAmount":{"amount":10000,"currency":"EUR"},"settlementDay":"5","dueOrAdvanceInterestIndicator":"3","nominalInterest":"8.0000000","feeRateId":"05","settlementFrequencyId":"0001","deprecationFrequencyId":"0001"}
response_WiZink = requests.post(url_WiZink, headers=headers_WiZink, json=payload_WiZink, verify=False).json()
For that website, you can get an access token by calling their oauth/token endpoint:
import requests
access_token = requests.post(
'https://www.creditopessoal.wizink.pt/gravitee/gateway/api-chn-auth-server/v1/oauth/token',
headers={'Authorization': 'Basic c2ltV3prUHQ6YmllZTktZmR6dzAzLXBvZWpuY2Q='},
data={'grant_type': 'client_credentials'},
).json()['access_token']
print(access_token)
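The token can then be dropped into the Authorization header of the quotation request, for example (reusing url_WiZink, headers_WiZink and payload_WiZink from the code above):
headers_WiZink['Authorization'] = f'Bearer {access_token}'
response_WiZink = requests.post(url_WiZink, headers=headers_WiZink, json=payload_WiZink).json()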
I have a Python program that calls nseindia.com and tries to fetch the indices data using the URL https://www1.nseindia.com/live_market/dynaContent/live_watch/stock_watch/liveIndexWatchData.json
This code works fine on my system, but when I deploy it to Heroku it gets stuck at the URL call.
import requests
url = "https://www1.nseindia.com/live_market/dynaContent/live_watch/stock_watch/liveIndexWatchData.json"
headers = {
'authority': 'beta.nseindia.com',
'cache-control': 'max-age=0',
'dnt': '1',
'upgrade-insecure-requests': '1',
'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.117 Safari/537.36',
'sec-fetch-user': '?1',
'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
'sec-fetch-site': 'none',
'sec-fetch-mode': 'navigate',
'accept-encoding': 'gzip, deflate, br',
'accept-language': 'en-US,en;q=0.9,hi;q=0.8',
}
response = requests.get(url=url, headers=headers)
print(response.json())
Any suggestions?
Here is the basic login process mapped out (for a successful login on the site) to give a high-level understanding. Below it is my code, which only gets me the first cookie, while I need the second cookie to access the JSON file. I have working code that currently fetches the file, but the (second) cookie it relies on expires after a week, so that approach is not sustainable.
Step 1 - Purpose: Get first cookie that will be "c1" below
Request URL: https://www.fakesite.com/sign-in.html
Request Method: GET
Status Code: 200
Step 2 - Purpose: Use "c1" to make a POST request to obtain second cookie "c2" <<<<-Can't get past here
Request URL: https://www.fakesite.com/sign-in.html
Request Method: POST
Status Code: 302
Step 3 - Purpose: Use "c2" and auth_token (doesn't change) to GET the json file I need <<<I have working code when manually getting 7-day cookie
Request URL: https://www.fakesite.com/ap1/file.json
Request Method: GET
Status Code: 200
import requests
headers = {
'authority': 'www.fakesite.com',
'sec-ch-ua': '" Not A;Brand";v="99", "Chromium";v="98", "Google Chrome";v="98"',
'sec-ch-ua-mobile': '?0',
'sec-ch-ua-platform': '"Windows"',
'upgrade-insecure-requests': '1',
'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.102 Safari/537.36',
'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
'sec-fetch-site': 'same-origin',
'sec-fetch-mode': 'navigate',
'sec-fetch-user': '?1',
'sec-fetch-dest': 'document',
'referer': 'https://www.fakesite.com/goodbye.html?service=logout',
'accept-language': 'en-US,en;q=0.9',
}
s = requests.Session()
r1 = s.get('https://www.fakesite.com/sign-in.html', headers=headers)
c1 = s.cookies['fakesite']
print(r1.status_code)
print(c1)
c2= 'fakesite='+c1+'; language=en'
headers = {
'authority': 'www.fakesite.com',
'cache-control': 'max-age=0',
'sec-ch-ua': '" Not A;Brand";v="99", "Chromium";v="98", "Google Chrome";v="98"',
'sec-ch-ua-mobile': '?0',
'sec-ch-ua-platform': '"Windows"',
'upgrade-insecure-requests': '1',
'origin': 'https://www.fakesite.com',
'content-type': 'application/x-www-form-urlencoded',
'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.102 Safari/537.36',
'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
'sec-fetch-site': 'same-origin',
'sec-fetch-mode': 'navigate',
'sec-fetch-user': '?1',
'sec-fetch-dest': 'document',
'referer': 'https://www.fakesite.com/sign-in.html',
'accept-language': 'en-US,en;q=0.9',
'cookie': c2,
}
data = {
'csrf_token': '12345=',
'username': 'userme',
'password': 'passwordme',
'returnUrl': '/sign-in.html',
'service': 'login',
'Login': 'Login'
}
r2 = requests.post('https://www.fakesite.com/sign-in.html', headers=headers, data=data)
c3 = s.cookies['fakesite']
print(r2.status_code)
print(c3)
OUTPUT:
200
151f3e3ba82030e5b7c03bc310ed5ad5
200
151f3e3ba82030e5b7c03bc310ed5ad5
My code just returns the first cookie again when I try to print all of them. I feel like I have tried everything, to no avail. When I try to go to the last URL directly after logging in, I get a 401 because I need the first cookie to get the second cookie first.
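For comparison, here is a minimal sketch of the three-step flow in which every request goes through the same Session object, so the cookie stored in one step is sent automatically on the next (URLs, form fields and the csrf_token value are the placeholders from the question; a real login will also need a fresh CSRF token and whatever headers the site expects):
import requests

s = requests.Session()
headers = {'user-agent': 'Mozilla/5.0'}  # plus any other headers the site requires

# Step 1: GET the sign-in page; the session stores the first cookie ("c1") in its jar.
s.get('https://www.fakesite.com/sign-in.html', headers=headers)

# Step 2: POST the login form through the SAME session, so "c1" is sent automatically
# and the cookie set by the 302 response ("c2") is captured by the session.
login = {'csrf_token': '12345=', 'username': 'userme', 'password': 'passwordme',
         'returnUrl': '/sign-in.html', 'service': 'login', 'Login': 'Login'}
s.post('https://www.fakesite.com/sign-in.html', headers=headers, data=login)

# Step 3: request the JSON with both cookies already in the session's cookie jar.
r = s.get('https://www.fakesite.com/ap1/file.json', headers=headers)
print(r.status_code)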
I have an issue with the particular website https://damas.terna.it/DMSPCAS08.
I am trying either to scrape the data or to fetch the Excel file that is included on the page.
I tried to fetch the Excel file with a POST request.
import requests
from bs4 import BeautifulSoup
import json
import datetime
url = 'https://damas.terna.it/api/Ntc/GetNtc'
headers = {
'Host': 'damas.terna.it',
'Connection': 'keep-alive',
'sec-ch-ua': '" Not A;Brand";v="99", "Chromium";v="90", "Google Chrome";v="90"',
'sec-ch-ua-mobile': '?0',
'Upgrade-Insecure-Requests': '1',
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.93 Safari/537.36',
'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
'Sec-Fetch-Site': 'same-origin',
'Sec-Fetch-Mode': 'navigate',
'Sec-Fetch-User': '?1',
'Sec-Fetch-Dest': 'document',
'Referer': 'https://damas.terna.it/DMSPCAS08',
'Accept-Encoding': 'gzip, deflate, br',
'Accept-Language': 'en-US,en;q=0.9',
'Cookie': '__RequestVerificationToken=5mfiSM2dKte532I8fd3MRdn6nnHbSezkQX29r3fyF2tMegXsvnOInpxy8JvFuDyRVS6pZs03y-NL3CsNsItN1yboc128Kk51yEiuUU0mel41; pers_damas_2019=378972352.20480.0000; rxVisitor=1619766836714T8SRPFLUAH62F1R9K6KG3EKK104BSDFE; dtCookie=7$EC34ED7BFB4A503D379D8D8C69242174|846cd19ce7947380|1; rxvt=1619774462088|1619771198468; dtPC=7$369396404_351h-vIDMPHTCURIGKVKBWVVEWOAMRMKDNWCUH-0e1; DamasNetCookie=F-evmYb7kIS_YTMr2mwuClMB1zazemmhl9vzSXynWeuCII_keDb_jQr4VLSYif9t3juDS6LkOuIXKFfe8pydxSzHPfZzGveNB6xryj2Czp9J1qeWFFT9dYFlRXFWAHuaEIyUQQDJmzWfDBrFCWr309mZoE6hkCKzDtoJgIoor9bed1kQgcdeymAH9lrtrKxwsheaQm2qA-vWWqKjCiruO1VkJ6II_bcrAXU2A_ZPQPznE1_8AEC_AwXmBXETubMQwFAnDXsOLDfEYeQ61TGAupF3d-wz3aZfRs5eCA3kw-en-kpEbS0trwFBQzB-098610GIbIPki9ThVitZ2LN2zza6nn1A8qchqfQC_CZEgu6Jt1glfhHceWS6tvWCuyOEqo2jJpxAajMYXPB6mzlHzX13TiV-jgeFSPehugMAgms_exqocw9w27e4lI5laYZu0rkKkznpZ1mJLOhORwny8-bKa3nRUt7erFv7ul3nLLrgd3FP907tHpTh-qXt1Bmr6OqknDZr_EBN8GY_B2YHV-8hC0AjdqQqpS0xOpp7z_CzzgByTOHSNdeKjVgQfZLQ7amnp71lhxgPeJZvOIl_mIWOr_gWRy_iK6UuzrA3udCTV7bAnUXKB8gX89d9ShQf5tZDxPFchrAQBtdmDChQOA; dtLatC=2; dtSa=true%7CC%7C-1%7CExport%20to%20.xls%7C-%7C1619772685174%7C369396404_351%7Chttps%3A%2F%2Fdamas.terna.it%2FDMSPCAS08%7CTerna%20%5Ep%20NTC%7C1619772662568%7C%7C'
}
parameters = {
'busDay': "2021-05-01",
'busDayTill': "2021-05-01",
'borderDirId': '1',
'borderDirName': "TERNA-APG"
}
response = requests.post(url, data=parameters, headers=headers)
soup = BeautifulSoup(response.text, "html.parser")
print(soup.prettify())
I am receiving this error:
The parameters dictionary contains an invalid entry for parameter 'parameters' for method 'System.Web.Mvc.ActionResult GetNtc(Damas.Core.Data.DataSource.Data.ParametersModel)' in 'Terna.Web.Controllers.CapacityManagement.NtcController'. The dictionary contains a value of type 'System.Collections.Generic.Dictionary`2[System.String,System.Object]', but the parameter requires a value of type 'Damas.Core.Data.DataSource.Data.ParametersModel'.
Parameter name: parameters
Using response = requests.post(url, data=json.dumps(parameters), headers=headers) instead, i.e. sending the parameters as a JSON body rather than as form fields, seems to solve the issue.
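Equivalently, the json= keyword lets requests serialize the dictionary and set the Content-Type: application/json header in one step:
response = requests.post(url, json=parameters, headers=headers)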