Web scraping a page that requires cookies with requests - Python

I'm trying to scrape the results of this booking website. The site drops a cookie to recognise the session. I've tried replicating that with requests, but I'm still getting an Invalid Session ID error in the response. What am I doing wrong?
import requests

url = 'https://alilauro-tickets.certusonline.com/php/proxy.php'

s = requests.Session()
s.get(url)

data = {
    'msg': 'TimeTable',
    'req': '{"getAvailability":"Y","getBasicPrice":"Y","getRouteAnalysis":"Y","directOnly":"Y","legs":1,"pax":1,"origin":"BEV","destination":"FOR","tripRequest":[{"tripfrom":"BEV","tripto":"FOR","tripdate":"2020-03-21","tripleg":0}]}'
}

r = s.post(url, data=data, cookies=s.cookies)
Here is the error I get:
'sessionID': none, 'errorCode': '620', 'errorDescription': 'Invalid Session Number'
Here is the cookie information:
[screenshot of the cookie information]

Indeed, the cookie is present when you call https://alilauro-tickets.certusonline.com/php/proxy.php, but it is not valid until a JavaScript function calls https://alilauro-tickets.certusonline.com/php/proxy.php?msg=Connect. This is a protection against CSRF, as Dan-Dev mentioned in the comments.
Using the following would work:
import requests
import json

url = "https://alilauro-tickets.certusonline.com/php/proxy.php"

session = requests.Session()

# initialise the server-side session first, otherwise the TimeTable call is rejected
r = session.post(url, data={"msg": "Connect"})

r = session.post(url, data={
    "msg": "TimeTable",
    "req": json.dumps({
        "getAvailability": "Y",
        "getBasicPrice": "Y",
        "getRouteAnalysis": "Y",
        "directOnly": "Y",
        "legs": "1",
        "pax": 1,
        "origin": "FOR",
        "destination": "BEV",
        "tripRequest": [{
            "tripfrom": "FOR",
            "tripto": "BEV",
            "tripdate": "2020-03-20",
            "tripleg": 0
        }]
    })
})

print(json.loads(r.text)["VWS_Trips_Trip"])

Related

Moodle Create User API Invalid Parameter Exception

I am trying to register a user via the API using Python Requests.
Here is my code
import requests
url = "www.website.com/webservice/rest/server.php?createpassword=1&username=thisissifumwike&auth=manual&password=Tester#2023.&firstname=Domain&lastname=Tester&email=hello#gmail.com&maildisplay=sifugmail&city=Daressalaam&country=Tanzania&timezone=99&description=I am a good tester&firstnamephonetic=Sifu&lastnamephonetic=Domain&middlename=Tester&alternatename=Tunety&interests=books, codes, forex&idnumber=FMM001&institution=FMM&department=Traders&phone1=0611222333&phone2=0611333222&address=Daressalaam&lang=en&calendartype=gregorian&theme=default&mailformat=1&customfields=&wstoken=token&wsfunction=core_user_create_users&moodlewsrestformat=json"
payload={}
headers = {}
response = requests.request("POST", url, headers=headers, data=payload)
print(response.text)
However, upon sending that request I get this response:
{
    "exception": "invalid_parameter_exception",
    "errorcode": "invalidparameter",
    "message": "Invalid parameter value detected"
}
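No answer is included in this thread, but as a hedged guess at the cause: the '#' characters in the hand-built URL start a URL fragment, so everything after the first '#' is never sent, and the spaces in the query string are not URL-encoded either. Moodle's core_user_create_users function also expects the user fields nested as users[0][...]. Below is a rough sketch that lets requests build and encode the request; the values are copied from the question, and the users[0][...] layout is an assumption to verify against the Moodle web services documentation:
import requests

# Sketch only -- the users[0][...] nesting is how Moodle's REST layer typically
# encodes the list-of-users parameter for core_user_create_users; verify this
# against your Moodle version.
base_url = "https://www.website.com/webservice/rest/server.php"

params = {
    "wstoken": "token",
    "wsfunction": "core_user_create_users",
    "moodlewsrestformat": "json",
}

# Values copied from the question; requests URL-encodes them, so '#' and spaces
# are safe here. Note that Moodle validates the email format, so it must be a
# real address.
data = {
    "users[0][username]": "thisissifumwike",
    "users[0][auth]": "manual",
    "users[0][password]": "Tester#2023.",
    "users[0][firstname]": "Domain",
    "users[0][lastname]": "Tester",
    "users[0][email]": "hello#gmail.com",
}

response = requests.post(base_url, params=params, data=data)
print(response.text)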

FTX API error "Missing parameter accounts"

I hope someone can help me with this.
Following this official doc from FTX:
https://docs.ftx.com/#request-historical-balances-and-positions-snapshot
I keep getting this error:
{'success': False, 'error': 'Missing parameter accounts', 'errorCode': 'parameter_missing'}
The code was fine and got responses from the other API calls, but the one above requires parameters, and those are not working :-(
Here is my code:
import requests
from requests import Request, Session
import time
import hmac
import json

s = requests.Session()
url = "https://ftx.com/api/historical_balances/requests"

ts = int(time.time() * 1000)
tsN = int(time.time())

params = {
    "accounts": ["main", "subaccounts"],
    "endTime": tsN,
}

request = requests.Request("POST", url, params=params)
prepared = request.prepare()

signature_payload = f'{ts}{prepared.method}{prepared.path_url}'.encode()
if prepared.body:
    signature_payload += prepared.body
signature = hmac.new('MYSECRET'.encode(), signature_payload, 'sha256').hexdigest()

request.headers = {
    'FTX-KEY': 'MYKEY',
    'FTX-SIGN': signature,
    'FTX-TS': str(ts),
}

r = s.send(request.prepare())
r.json()
print('Output Json :', r.json())
Do you have any suggestions, please? I'm going crazy with this... Thanks!!
Try switching request = requests.Request("POST", url, params=params) to request = requests.Request("POST", url, data=params) or request = requests.Request("POST", url, json=params), because I believe they want you to send accounts and endTime in the body of the request, not as URL parameters.
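This combined snippet is not part of the original answer, just a rough sketch of that change. It reuses the question's placeholders (MYKEY, MYSECRET) and endpoint, and extends the signature so it also covers the JSON body:
import time
import hmac
import requests

s = requests.Session()
url = "https://ftx.com/api/historical_balances/requests"
ts = int(time.time() * 1000)

params = {
    "accounts": ["main", "subaccounts"],
    "endTime": int(time.time()),
}

# send the parameters as a JSON body instead of query-string parameters
request = requests.Request("POST", url, json=params)
prepared = request.prepare()

# the signature covers the timestamp, method, path and, when present, the body
signature_payload = f'{ts}{prepared.method}{prepared.path_url}'.encode()
if prepared.body:
    body = prepared.body
    if isinstance(body, str):
        body = body.encode()
    signature_payload += body
signature = hmac.new('MYSECRET'.encode(), signature_payload, 'sha256').hexdigest()

# set the auth headers on the already-prepared request so the JSON
# Content-Type added by prepare() is kept
prepared.headers['FTX-KEY'] = 'MYKEY'
prepared.headers['FTX-SIGN'] = signature
prepared.headers['FTX-TS'] = str(ts)

r = s.send(prepared)
print('Output Json :', r.json())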

How to give a session from requests to requests_html?

I want to scrape data from a site with a login. I used the requests library to log in, but I don't get the JS-rendered data from it. So I also use requests_html to get the JS data, but now I can't pass the session from requests to requests_html, or reuse the active session to scrape.
I know that there is "selenium", but when I use it there is always a reCAPTCHA on the page, so I decided to use requests_html.
If there are other possibilities, which might be easier, I will gladly accept suggestions.
Here is my code:
from requests_html import HTMLSession
import requests

url = '...'
url2 = '...'

headers = {
    ...
}

data = {
    '_csrf': '...',
    'User[username]': '...',
    'User[password]': '...'
}

session = requests.Session()
session.post(url, headers=headers, data=data)

session = HTMLSession()
r = session.get(url2)
r.html.render()
print(r.html.html)
Why don't you use requests_html.HTMLSession as the session object, instead of requests.Session?
It inherits from requests.Session, so it's perfectly capable of calling HTTP methods like post.
from bs4 import BeautifulSoup
from requests_html import HTMLSession

# example target: a local DVWA instance -- adjust the URLs to your own site
url = 'http://localhost/login.php'

payload = {
    'username': 'admin',
    'password': 'password',
    'Login': 'Login'
}

with HTMLSession() as c:
    r = c.get(url)
    login_html = BeautifulSoup(r.html.html, "html.parser")
    # pick up the hidden CSRF token from the login form and add it to the payload
    csrf_token_name = None
    csrf_token_value = None
    for tag in login_html.find_all('input'):
        if tag.attrs['type'] == 'hidden':
            csrf_token_name = tag.attrs['name']
            csrf_token_value = tag.attrs['value']
    payload[csrf_token_name] = csrf_token_value
    p = c.post(url, data=payload)
    r = c.get('http://localhost/vulnerabilities/xss_r/?name=xx1xx')
    if 'Reflected' in r.text:
        print('test')

Logging in to a website with requests

I need to log in to a website with requests, but everything I have tried doesn't work:
from bs4 import BeautifulSoup as bs
import requests

s = requests.session()
url = 'https://www.ent-place.fr/CookieAuth.dll?GetLogon?curl=Z2F&reason=0&formdir=5'

def authenticate():
    headers = {'username': 'myuser', 'password': 'mypasss', '_Id': 'submit'}
    page = s.get(url)
    soup = bs(page.content)
    value = soup.form.find_all('input')[2]['value']
    headers.update({'value_name': value})
    auth = s.post(url, params=headers, cookies=page.cookies)

authenticate()
or:
import requests

payload = {
    'inUserName': 'user',
    'inUserPass': 'pass'
}

with requests.Session() as s:
    p = s.post('https://www.ent-place.fr/CookieAuth.dll?GetLogon?curl=Z2F&reason=0&formdir=5', data=payload)
    print(p.text)
    print(p.status_code)
    r = s.get('A protected web page url')
    print(r.text)
When I check this with .status_code, it returns 200, but I want 401 or 403 so I can write something like 'if login'...
I have found this, but I think it works in Python 2, while I use Python 3 and I don't know how to convert it:
import requests
import sys

payload = {
    'username': 'sopier',
    'password': 'somepassword'
}

with requests.Session(config={'verbose': sys.stderr}) as c:
    c.post('http://m.kaskus.co.id/user/login', data=payload)
    r = c.get('http://m.kaskus.co/id/myform')
    print 'sopier' in r.content
Does somebody know how to do this? I have tested every script I have found and none of them work...
When you submit the logon, the POST request is sent to https://www.ent-place.fr/CookieAuth.dll?Logon, not https://www.ent-place.fr/CookieAuth.dll?GetLogon?curl=Z2F&reason=0&formdir=5 -- you get redirected to that URL afterwards.
When I tested this, the post request contains the following parameters:
curl:Z2F
flags:0
forcedownlevel:0
formdir:5
username:username
password:password
SubmitCreds.x:69
SubmitCreds.y:9
SubmitCreds:Ouvrir une session
So, you'll likely need to supply those additional parameters as well.
Also, the line s.post(url, params=headers, cookies=page.cookies) is not correct. You should pass headers into the keyword argument data, not params -- params encodes into the request URL, while you need to send it as form data. And I'm assuming you really mean payload where you say headers:
s.post(url, data=headers, cookies=page.cookies)
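Putting the points above together, a sketch might look like the following. This combined snippet is not in the original answer; it reuses the form fields listed above and the question's placeholder credentials, so treat it as a starting point rather than a working script:
import requests

start_url = 'https://www.ent-place.fr/CookieAuth.dll?GetLogon?curl=Z2F&reason=0&formdir=5'
login_url = 'https://www.ent-place.fr/CookieAuth.dll?Logon'  # the form's real POST target

# the form fields observed when submitting the logon form
payload = {
    'curl': 'Z2F',
    'flags': '0',
    'forcedownlevel': '0',
    'formdir': '5',
    'username': 'myuser',
    'password': 'mypasss',
    'SubmitCreds.x': '69',
    'SubmitCreds.y': '9',
    'SubmitCreds': 'Ouvrir une session',
}

with requests.Session() as s:
    s.get(start_url)                       # pick up the initial cookies
    p = s.post(login_url, data=payload)    # send the credentials as form data, not params
    print(p.status_code)
    r = s.get('A protected web page url')  # placeholder from the question
    print(r.text)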
The site you're trying to login to has an onClick JavaScript when you process the login form. requests won't be able to execute JavaScript for you. This may cause issues with the site functionality.

Python requests: Can't send message to myself via Facebook

I wrote a simple Python script that is supposed to send a message from myself to myself via Facebook.
The problem is that I can't obtain a page access token.
Permissions are set correctly.
import json
import requests

ACCESS_TOKEN = 'generated this with all possible permissions'

r = requests.get('https://graph.facebook.com/me', params={'access_token': ACCESS_TOKEN})
my_id = r.json()['id']
print(my_id)

r = requests.get('https://graph.facebook.com/me/accounts', params={'access_token': ACCESS_TOKEN})
PAGE_ACCESS_TOKEN = r.json()  # ['access_token'] can't be found because 'data' is empty
print(PAGE_ACCESS_TOKEN)

json = {
    'recipient': {'id': my_id},
    'message': {'text': 'Hello'}
}
params = {
    'access_token': PAGE_ACCESS_TOKEN
}

r = requests.post('https://graph.facebook.com/me/messages', json=json, params=params)
print(r)
It's as if I don't have access to the data in /user/accounts despite all permissions being set. I only get my user id.
11330393********
{'data': []}
<Response [400]>
What can be done to fix this?
