Python how to format requests data - python

I would like to format the data field of my request.
I tried several approaches, but none of them worked:
headers = {
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:83.0) Gecko/20100101 Firefox/83.0',
'Accept': 'application/json, text/javascript, */*; q=0.01',
'Accept-Language': 'tr-TR,tr;q=0.8,en-US;q=0.5,en;q=0.3',
'Content-Type': 'application/json; charset=utf-8',
'Origin': 'https://www.example.com',
'Connection': 'keep-alive',
'Referer': 'https://www.example.com/',
}
data = '{ "loginId": "mailadress#hotmail.com", "password": "password123"}'
response = s.post('https://auth-api.example.com/login', headers=headers, data=data)
I tried:
data = '{ "loginId": "{}", "password": "{}"}'.format(liste[0],liste[1])
Also:
data = "'{ "loginId": "{}", "password": "{}"}'".format(liste[0],liste[1])
quote = ' " '
data = quote + '{ "loginId": "{}", "password": "{}"}' + quote.format(liste[0],liste[1])
So how can I format the data field?

The documentation of the post function specifies that data must be a dictionary, list of tuples, bytes, or file-like object.
So, try:
data = {"loginId": liste[0], "password": liste[1]}
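The str.format attempts in the question fail because the literal braces of the JSON text are themselves read as replacement fields. As a minimal sketch, assuming the server expects a JSON body (the Content-Type header in the question suggests so), the body can be built with json.dumps instead. The helper name build_login_body is hypothetical, and the liste values are placeholders standing in for the asker's list.

```python
import json

# Placeholder credentials standing in for the asker's `liste` (assumption).
liste = ["mailadress#hotmail.com", "password123"]


def build_login_body(login_id, password):
    """Serialize the login fields to a JSON string; json.dumps handles quoting and escaping."""
    return json.dumps({"loginId": login_id, "password": password})


body = build_login_body(liste[0], liste[1])
print(body)
```

The resulting string could then be passed as data=body, or the dictionary itself could be passed through the json argument so that requests does the serialization.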

I would use the requests module like this:
import requests
url = 'https://auth-api.example.com/login'
try:
    # auth=(user, password) sends HTTP Basic credentials rather than a JSON body;
    # verify=False disables TLS certificate verification
    response = requests.post(url, auth=("mailadress#hotmail.com", "password123"), headers=headers, verify=False)
except requests.exceptions.RequestException as e:
    print(e)
    print("unable to run request on %s" % url)

Related

Obtain data of Freight Index in python

I am trying to get the data from this website, https://en.macromicro.me/charts/947/commodity-ccfi-scfi , for the China and Shanghai Containerized Freight Index.
I understand that the data is fetched from an API. How do I find out how the call is made, and how do I extract the data using Python?
I am new to HTML in general, so I have no idea where to start.
I tried:
import requests
url = "https://en.macromicro.me/charts/data/947/commodity-ccfi-scfi"
resp = requests.get(url)
resp = resp.json()
But the response is <Response [404]>
If I change the url to https://en.macromicro.me/charts/data/947/
the response is {'success': 0, 'data': [], 'msg': 'error #644'}
Try the following:
import requests
headers = {
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/93.0.4577.63 Safari/537.36',
'Referer': 'https://en.macromicro.me/charts/947/commodity-ccfi-scfi',
'X-Requested-With': 'XMLHttpRequest',
'Accept': 'application/json, text/javascript, */*; q=0.01',
'Authorization': 'Bearer 9633cefba71a598adae0fde0b56878fe',
'Cookie': 'PHPSESSID=n74gv10hc8po7mrj491rk4sgo1; _ga=GA1.2.1231997091.1631627585; _gid=GA1.2.1656837390.1631627585; _gat=1; _hjid=c52244fd-b912-4d53-b0e3-3f11f430b51c; _hjFirstSeen=1; _hjAbsoluteSessionInProgress=0'}
r = requests.get('https://en.macromicro.me/charts/data/947', headers=headers)
print(r.json())
Output:
{'success': 1, 'data': {' ...}

Python requests POST method unexpected content in response

I am trying to get the EPG data from the web page https://www.meo.pt/tv/canais-programacao/guia-tv using Python requests. I use this module a lot, but mainly the GET method. This request, however, uses POST. Every time you scroll down the page, a request is sent to the API below with these params to load additional program data into the page:
import requests
#post request
url = 'https://www.meo.pt/_layouts/15/Ptsi.Isites.GridTv/GridTvMng.asmx/getProgramsFromChannels'
headers = {
'Accept': '*/*',
'Accept-Encoding': 'gzip, deflate, br',
'Accept-Language': 'en-GB,en-US;q=0.9,en;q=0.8',
'Connection': 'keep-alive',
'Content-Length': '214',
'Content-type': 'application/json; charset=UTF-8',
'Host': 'www.meo.pt',
'Origin': 'https://www.meo.pt',
'Referer': 'https://www.meo.pt/tv/canais-programacao/guia-tv',
'sec-ch-ua': '"Google Chrome";v="89", "Chromium";v="89", ";Not A Brand";v="99"',
'sec-ch-ua-mobile': '?0',
'Sec-Fetch-Dest': 'empty',
'Sec-Fetch-Mode': 'cors',
'Sec-Fetch-Site': 'same-origin',
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.114 Safari/537.36',
'X-KL-Ajax-Request': 'Ajax_Request'
}
data = {"service":"channelsguide",
"channels":["LVTV","TOROTV","CAÇAP","CAÇAV","RTPACRS","CLUBB","MCM T","TRACE","24KITC","E!"],
"dateStart":"2021-04-20T23:00:00.000Z",
"dateEnd":"2021-04-21T23:00:00.000Z",
"accountID":""}
r = requests.post(url=url, headers=headers, data=data)
print(r.text)
I have tried this request both with and without the headers, as I don't know whether they are needed for a POST request. However, neither option returns what I was expecting, which was a JSON object containing the program data for these channels.
What am I doing wrong?
Consider using the json argument instead of data in the request function. The json argument serializes your dictionary to a JSON body, whereas with data a dictionary is sent form-encoded.
data = {"service":"channelsguide",
"channels":["LVTV","TOROTV","CAÇAP","CAÇAV","RTPACRS","CLUBB","MCM T","TRACE","24KITC","E!"],
"dateStart":"2021-04-20T23:00:00.000Z",
"dateEnd":"2021-04-21T23:00:00.000Z",
"accountID":""}
r = requests.post(url=url, headers=headers, json=data)
If you want to keep using the data argument, you should serialize the data dictionary to JSON yourself so that the correct body format is sent.
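To make the difference between the two arguments concrete, here is a small offline sketch that uses only the standard library to reproduce, roughly, the two body encodings. It is an approximation of what requests produces for a dictionary, not its actual code path, and the payload is a shortened version of the one in the question.

```python
import json
from urllib.parse import urlencode

# Shortened version of the payload from the question.
payload = {
    "service": "channelsguide",
    "channels": ["LVTV", "TOROTV"],
    "accountID": "",
}

# Roughly what data=payload sends: a form-encoded body,
# with each list item repeated under the same key.
form_body = urlencode(payload, doseq=True)

# Roughly what json=payload (or data=json.dumps(payload)) sends.
json_body = json.dumps(payload)

print(form_body)
print(json_body)
```

An endpoint that expects JSON cannot parse the first form, which would be consistent with the unexpected response described in the question.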
You can use this example of how to POST JSON data to the API URL:
import json
import requests
url = "https://www.meo.pt/_layouts/15/Ptsi.Isites.GridTv/GridTvMng.asmx/getProgramsFromChannels"
payload = {
"accountID": "",
"channels": [
"SCPHD",
"EURHD",
"EURS2HD",
"DISNY",
"CART",
"BIGGS",
"SICK",
"NICKELO",
"DISNYJ",
"PANDA",
],
"dateEnd": "2021-04-21T22:00:00.000Z",
"dateStart": "2021-04-20T22:00:00.000Z",
"service": "channelsguide",
}
data = requests.post(url, json=payload).json()
# pretty print the data:
print(json.dumps(data, indent=4))
Prints:
{
"d": {
"__type": "Ptsi.Isites.GridTv.CanaisService.GridTV",
"ExtensionData": {},
"services": [],
"channels": [
{
"__type": "Ptsi.Isites.GridTv.CanaisService.Channels",
"ExtensionData": {},
"id": 36,
"name": "SPORTING TV HD",
"sigla": "SCPHD",
"friendlyUrlName": "Sporting_TV_HD",
"url": "https://meogo.meo.pt/direto?canalUrl=Sporting_TV_HD",
"meogo": true,
"logo": "https://www.meo.pt/PublishingImages/canais/sporting-tv-hd.png",
"isAdult": false,
"categories": [
{
"ExtensionData": {},
"id": 2,
"name": "Desporto"
}
],
...
If you want to keep using the data argument, you should serialize the data dictionary to JSON to send the correct body format.
You should also set the headers as:
headers = {
'Content-type': 'application/json'
}
The complete code is:
import json
import requests
url = "https://www.meo.pt/_layouts/15/Ptsi.Isites.GridTv/GridTvMng.asmx/getProgramsFromChannels"
headers = {
'Content-type': 'application/json'
}
payload = {
"accountID": "",
"channels": [
"SCPHD",
"EURHD",
"EURS2HD",
"DISNY",
"CART",
"BIGGS",
"SICK",
"NICKELO",
"DISNYJ",
"PANDA",
],
"dateEnd": "2021-04-21T22:00:00.000Z",
"dateStart": "2021-04-20T22:00:00.000Z",
"service": "channelsguide",
}
resp = requests.post(url,headers=headers,data=json.dumps(payload))
print(resp.text)

Script gets stuck while sending post requests with parameters

I'm trying to get a JSON response by issuing a POST request with the appropriate parameters to a webpage. When I run the script, it gets stuck and doesn't produce any result. It doesn't throw any error either. This is the site link. I chose three options from the three dropdowns in the form on that site before hitting the Get times & tickets button.
I've tried with:
import requests
from bs4 import BeautifulSoup
url = 'https://www.thetrainline.com/'
link = 'https://www.thetrainline.com/api/journey-search/'
payload = {"passengers":[{"dateOfBirth":"1991-01-31"}],"isEurope":False,"cards":[],"transitDefinitions":[{"direction":"outward","origin":"1f06fc66ccd7ea92ae4b0a550e4ddfd1","destination":"7c25e933fd14386745a7f49423969308","journeyDate":{"type":"departAfter","time":"2021-02-11T22:45:00"}}],"type":"single","maximumJourneys":4,"includeRealtime":True,"applyFareDiscounts":True}
with requests.Session() as s:
    s.headers['User-Agent'] = 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.104 Safari/537.36'
    s.headers['content-type'] = 'application/json'
    s.headers['accept'] = 'application/json'
    r = s.post(link,json=payload)
    print(r.status_code)
    print(r.json())
How can I get a JSON response by issuing POST requests with parameters to that site?
You are missing the required headers: x-version and referer. The referer header refers to the search form, and you can build it yourself. Before calling journey-search, you have to post an availability request.
import requests
from requests.models import PreparedRequest
headers = {
'authority': 'www.thetrainline.com',
'pragma': 'no-cache',
'cache-control': 'no-cache',
'x-version': '2.0.18186',
'dnt': '1',
'accept-language': 'en-GB',
'sec-ch-ua-mobile': '?0',
'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 11_1_0) AppleWebKit/537.36 (KHTML, like Gecko) '
'Chrome/88.0.4324.96 Safari/537.36',
'content-type': 'application/json',
'accept': 'application/json',
'origin': 'https://www.thetrainline.com',
'sec-fetch-site': 'same-origin',
'sec-fetch-mode': 'cors',
'sec-fetch-dest': 'empty',
}
with requests.Session() as s:
    origin = "6e2242b3f38bbbd8d8124e1d84d319e1"
    destination = "15bcf02bc44ea754837c8cf14569f608"
    localDateTime = "2021-02-03T19:30:00"
    dateOfBirth = "1991-02-03"
    passenger_type = "single"
    # Build the referer URL (the search-results page) from the search parameters
    req = PreparedRequest()
    params = {
        "origin": origin,
        "destination": destination,
        "outwardDate": localDateTime,
        "outwardDateType": "departAfter",
        "journeySearchType": passenger_type,
        "passengers[]": dateOfBirth
    }
    req.prepare_url("https://www.thetrainline.com/book/results", params)
    headers.update({"referer": req.url})
    s.headers = headers
    # First request: availability
    payload_availability = {
        "origin": origin,
        "destination": destination,
        "outwardDefinition": {
            "localDateTime": localDateTime,
            "searchMethod": "DEPARTAFTER"
        },
        "passengerBirthDates": [{
            "id": "PASSENGER-0",
            "dateOfBirth": dateOfBirth
        }],
        "maximumNumberOfJourneys": 4,
        "discountCards": []
    }
    r = s.post('https://www.thetrainline.com/api/coaches/availability', json=payload_availability)
    r.raise_for_status()
    # Second request: the journey search itself
    payload_search = {
        "passengers": [{"dateOfBirth": "1991-02-03"}],
        "isEurope": False,
        "cards": [],
        "transitDefinitions": [{
            "direction": "outward",
            "origin": origin,
            "destination": destination,
            "journeyDate": {
                "type": "departAfter",
                "time": localDateTime}
        }],
        "type": passenger_type,
        "maximumJourneys": 4,
        "includeRealtime": True,
        "applyFareDiscounts": True
    }
    r = s.post('https://www.thetrainline.com/api/journey-search/', json=payload_search)
    r.raise_for_status()
    print(r.json())
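The referer in the code above is built with PreparedRequest.prepare_url. As an alternative sketch, the same kind of URL can be assembled with the standard library alone, which makes it easy to inspect offline. The helper name build_referer is hypothetical; the parameter values are copied from the answer.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_referer(base_url, params):
    """Append URL-encoded query parameters to base_url."""
    return base_url + "?" + urlencode(params)


params = {
    "origin": "6e2242b3f38bbbd8d8124e1d84d319e1",
    "destination": "15bcf02bc44ea754837c8cf14569f608",
    "outwardDate": "2021-02-03T19:30:00",
    "outwardDateType": "departAfter",
    "journeySearchType": "single",
    "passengers[]": "1991-02-03",
}
referer = build_referer("https://www.thetrainline.com/book/results", params)
print(referer)

# Decoding the query string recovers the original keys and values.
decoded = parse_qs(urlsplit(referer).query)
```

Characters such as the colons in the timestamp and the brackets in passengers[] are percent-encoded in the result, and decoding the query restores them.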
As Sers's reply says, headers are missing.
When crawling websites, you have to keep anti-crawling mechanisms in mind. A website may block your requests based on your IP address, request headers, cookies, and various other factors.

Python requests appending ampersand (&) to URL when adding results from variable as parameter

Below is my code, which retrieves data from the database, puts it into a variable in CSV format, and then tries to append it to a GET request URL. However, the GET request returns null because the request URL has an ampersand (&) in it.
The question is: how do I get rid of it?
This is the URL, note the ampersand (&):
https://demo-api.ig.com/gateway/deal/clientsentiment?marketIds=&JGB,BCHUSD,AT20,
import requests
import json
import time
import datetime
import csv
import pandas as pd
import psycopg2
conn_string = "host=' dbname='' user='' password=''"
conn = psycopg2.connect(conn_string)
cursor=conn.cursor()
# Query to source marketIds
postgreSQL_select_Query = "SELECT DISTINCT () FROM static WHERE TYPE!='' AND marketId!='None'"
cursor.execute(postgreSQL_select_Query)
#print("Selecting marketId from table using cursor.fetchall")
instrument_static_marketId = cursor.fetchall()
cursor.execute(postgreSQL_select_Query )
#This puts the sql result into nice CSV format
y=','.join([y[0] for y in cursor.fetchall() ])
print(y)
# closing database connection.
conn.close ()
def main():
    headers = {
        'Connection': 'keep-alive',
        'Origin': 'https://.com',
        'X-IG-API-KEY': '',
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36',
        'Content-Type': 'application/json; charset=UTF-8',
        'Accept': 'application/json; charset=UTF-8',
        'X-SECURITY-TOKEN': '',
        'CST': '',
        'Sec-Fetch-Site': 'same-site',
        'Sec-Fetch-Mode': 'cors',
        'Referer': 'https://',
        'Accept-Encoding': 'gzip, deflate, br',
        'Accept-Language': 'en-GB,en-US;q=0.9,en;q=0.8',
    }
    response = requests.get('https://demo-api.ig.com/gateway/deal/clientsentiment?marketIds=',params=y, headers=headers)
    print(response.url)
    result = response.json()
    print(result)
if __name__ == '__main__':
    main()
You've included part of a parameter in your URL, which is incorrect and confuses requests.
Leave that off and pass a dictionary for params, just like you're already doing with headers:
y = 'JGB,BCHUSD,AT20'
params = {
    'marketIds': y,
}
url = 'https://demo-api.ig.com/gateway/deal/clientsentiment'
response = requests.get(url, params=params, headers=headers)
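The stray ampersand appears because the URL already ends in a query (marketIds=) and the plain string passed as params is joined to it with "&". As a rough offline sketch of what the key/value form should yield, the query can be built with the standard library's urlencode. The ids are sample values taken from the URL in the question.

```python
from urllib.parse import urlencode

base_url = "https://demo-api.ig.com/gateway/deal/clientsentiment"
market_ids = ["JGB", "BCHUSD", "AT20"]  # sample ids taken from the question's URL

# Join the ids into one comma-separated value and let urlencode attach the key.
query = urlencode({"marketIds": ",".join(market_ids)})
full_url = base_url + "?" + query
print(full_url)
```

The commas come out percent-encoded as %2C, which a server would normally decode back to commas. If literal commas are required, urlencode accepts safe=",".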

Unable to login to a site with requests

For fun, I'm trying to use Python requests to log on to my school's student portal. This is what I've come up with so far. I'm trying to be very explicit on the headers, because I'm getting a 200 status code (the code you also get when failing to login) instead of a 302 (successful login).
import sys
import os
import requests
def login(username, password):
    url = '(link)/home.html#sign-in-content'
    values = {
        'translator_username' : '',
        'translator_password' : '',
        'translator_ldappassword' : '',
        'returnUrl' : '',
        'serviceName' : 'PS Parent Portal',
        'serviceTicket' : '',
        'pcasServerUrl' : '\/',
        'credentialType' : 'User Id and Password Credential',
        'account' : username,
        'pw' : password,
        'translatorpw' : password
    }
    headers = {
        'accept' : 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
        'accept-encoding' : 'gzip, deflate, br',
        'accept-language' : 'en-US,en;q=0.9',
        'cache-control' : 'max-age=0',
        'connection' : 'keep-alive',
        'content-type' : 'application/x-www-form-urlencoded',
        'host' : '(link)',
        'origin' : '(link)',
        'referer' : '(link)guardian/home.html',
        'upgrade-insecure-requests' : '1'
    }
    with requests.Session() as s:
        p = s.post(url, data=values)
        if p.status_code == 302:
            print(p.text)
        print('Authentication error', p.status_code)
        r = s.get('(link)guardian/home.html')
        print(r.text)
def main():
    login('myname', 'mypass')
if __name__ == '__main__':
    main()
Using Chrome to examine the network requests, all of these headers are under 'Request Headers' in addition to a long cookie number, content-length, and user-agent.
The forms are as follows:
pstoken:(token)
contextData:(text)
translator_username:
translator_password:
translator_ldappassword:
returnUrl:(url)guardian/home.html
serviceName:PS Parent Portal
serviceTicket:
pcasServerUrl:\/
credentialType:User Id and Password Credential
account:f
pw:(id)
translatorpw:
Am I missing something with the headers/form names? Is it a problem with cookies?
If I look at p.request.headers, this is what is sent:
{'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.89 Safari/537.36', 'accept-encoding': 'gzip, deflate, br', 'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8', 'connection': 'keep-alive', 'accept-language': 'en-US,en;q=0.9', 'cache-control': 'max-age=0', 'content-type': 'application/x-www-form-urlencoded', 'host': '(url)', 'origin': '(url)', 'referer': '(url)guardian/home.html', 'upgrade-insecure-requests': '1', 'Content-Length': '263'}
p.text gives me the HTML of the login page
Tested with PowerAPI, requests, Mechanize, and RoboBrowser. All fail.
What response do you expect? The way you check the response is wrong.
with requests.Session() as s:
    p = s.post(url, data=values)
    if p.status_code == 302:
        print(p.text)
    print('Authentication error', p.status_code)
    r = s.get('(link)guardian/home.html')
    print(r.text)
In your code, you print Authentication error regardless of status_code. I think it should at least look like this:
with requests.Session() as s:
    p = s.post(url, data=values)
    if p.status_code == 302:
        print(p.text)
        r = s.get('(link)guardian/home.html')
        print(r.text)
    else:
        print('Authentication error', p.status_code)
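One caveat, assuming the portal really signals a successful login with a 302 as the question describes: requests follows redirects by default, so p.status_code reports the final page (typically 200) and the 302 only shows up in p.history. The sketch below checks the whole redirect chain. The helper name login_redirected is hypothetical, and the SimpleNamespace objects are stand-ins for real response objects so that the snippet runs offline.

```python
from types import SimpleNamespace


def login_redirected(response):
    """Return True if any response in the redirect chain had status 302."""
    chain = list(response.history) + [response]
    return any(r.status_code == 302 for r in chain)


# Stand-ins for response objects (assumption: only .status_code and .history are used).
redirect = SimpleNamespace(status_code=302, history=[])
final_page = SimpleNamespace(status_code=200, history=[redirect])
login_page = SimpleNamespace(status_code=200, history=[])

print(login_redirected(final_page))
print(login_redirected(login_page))
```

Another option is to pass allow_redirects=False to s.post so that the 302 itself is returned and can be compared directly.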
