Python3, download file from url by button click - python

I need to download a file from a link like this: https://freemidi.org/getter-13560
But I can't use urllib.request or the requests library, because they download HTML instead of the MIDI file. Is there any solution? Here is also the page with the button itself (link).

By sending the proper headers and using a session, we can download and save the file with the requests module.
import requests
headers = {
"Host": "freemidi.org",
"Connection": "keep-alive",
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36",
"Accept-Encoding": "gzip, deflate, br",
"Accept-Language": "en-US,en;q=0.9",
}
session = requests.Session()
# The website sets the cookies on the first request
req1 = session.get("https://freemidi.org/getter-13560", headers=headers)
# Request again to download the file
req2 = session.get("https://freemidi.org/getter-13560", headers=headers)
print(len(req2.content))  # Size of the MIDI file in bytes
with open("testFile.mid", "wb") as saveMidi:
    saveMidi.write(req2.content)
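
A quick way to confirm that the second response really is the MIDI file and not an HTML page is to look at its first bytes. The sketch below relies on the fact that Standard MIDI files start with the chunk ID MThd; the helper name looks_like_midi is made up for illustration.

```python
def looks_like_midi(data: bytes) -> bool:
    """Return True if data starts with the Standard MIDI File header chunk ID."""
    return data[:4] == b"MThd"


# A MIDI header chunk versus the start of an HTML page.
print(looks_like_midi(b"MThd\x00\x00\x00\x06"))   # True
print(looks_like_midi(b"<!DOCTYPE html><html>"))  # False
```

In the script above, looks_like_midi(req2.content) could be checked before writing testFile.mid.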

Related

Signing in to a website using discord Oauth (only using python requests, no selenium)

I'm trying very hard to log in to a website that uses Discord OAuth. To do this I have to make a request to
https://discord.com/api/v9/oauth2/authorize?client_id=896549597550358548&response_type=code&redirect_uri=https://www.monkeebot.xyz/oauth/discord&scope=identify guilds
However, Discord returns an error response:
<Response [200]>
{'location': 'https://www.monkeebot.xyz/oauth/discord?error=access_denied&error_description=The+resource+owner+or+authorization+server+denied+the+request'}
url = "https://discord.com/api/v9/oauth2/authorize?client_id=896549597550358548&response_type=code&redirect_uri=https://www.monkeebot.xyz/oauth/discord&scope=identify guilds"
headers = {
"authorization": "",
"user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/104.0.5112.81 Safari/537.36 Edg/104.0.1293.47",
"accept": "*/*",
"accept-encoding": "gzip, deflate, br",
"accept-language": "en-GB,en;q=0.9,en-US;q=0.8",
"cache-control": "no-cache",
"content-length": "36",
"content-type": "application/json",
"origin": "https://discord.com",
"pragma": "no-cache",
"referer": "https://discord.com/oauth2/authorize?client_id=896549597550358548&redirect_uri=https://www.monkeebot.xyz/oauth/discord&response_type=code&scope=identify%20guilds",
"user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/104.0.5112.81 Safari/537.36 Edg/104.0.1293.47",
}
r = requests.post(url, headers=headers, json={}, allow_redirects=True)
Does anybody have any experience with doing this? I'm not entirely sure what I'm doing wrong. I have added the correct authorization etc., and I have tried it on different websites that use Discord OAuth, but I get this error every time.
Any help or debugging tips would be greatly appreciated.
Before executing this script, create your own auth code and put it in the "Authorization" header below. I don't want to share mine :)
import requests
import json
def generate_code():
    url = "https://discord.com/api/v9/oauth2/authorize?client_id=896549597550358548&response_type=code&redirect_uri=https://www.monkeebot.xyz/oauth/discord&code=0nNWo24HJQgwytTMlqkwgdil9fW24k&scope=identify guilds"
    payload = json.dumps({"permissions": "0", "authorize": True})
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:103.0) Gecko/20100101 Firefox/103.0",
        "Accept": "*/*",
        "Accept-Language": "en-US,en;q=0.5",
        "Content-Type": "application/json",
        "Authorization": "USE YOUR AUTH CODE HERE. YOu need to CREATE it first",
        "X-Super-Properties": "eyJvcyI6IldpbmRvd3MiLCJicm93c2VyIjoiRmlyZWZveCIsImRldmljZSI6IiIsInN5c3RlbV9sb2NhbGUiOiJlbi1VUyIsImJyb3dzZXJfdXNlcl9hZ2VudCI6Ik1vemlsbGEvNS4wIChXaW5kb3dzIE5UIDEwLjA7IFdpbjY0OyB4NjQ7IHJ2OjEwMy4wKSBHZWNrby8yMDEwMDEwMSBGaXJlZm94LzEwMy4wIiwiYnJvd3Nlcl92ZXJzaW9uIjoiMTAzLjAiLCJvc192ZXJzaW9uIjoiMTAiLCJyZWZlcnJlciI6IiIsInJlZmVycmluZ19kb21haW4iOiIiLCJyZWZlcnJlcl9jdXJyZW50IjoiIiwicmVmZXJyaW5nX2RvbWFpbl9jdXJyZW50IjoiIiwicmVsZWFzZV9jaGFubmVsIjoic3RhYmxlIiwiY2xpZW50X2J1aWxkX251bWJlciI6MTQxMjY5LCJjbGllbnRfZXZlbnRfc291cmNlIjpudWxsfQ==",
        "X-Discord-Locale": "en-US",
        "X-Debug-Options": "bugReporterEnabled",
        "Alt-Used": "discord.com",
        "Cookie": "__dcfduid=7dc5606419a411ed9b1fee90ba2eef9f; __sdcfduid=7dc5606419a411ed9b1fee90ba2eef9f20120e864cf33496bedf659a7820954f7611454b15423f7553e3e85c08756a75",
    }
    response = requests.request("POST", url, headers=headers, data=payload)
    return response.json()
print(generate_code())

why can't it print response after making a request to an api/url?

I just found the Best Buy Canada API by inspecting the XHR requests in the browser.
aboveurl = 'https://www.bestbuy.ca/ecomm-api/availability/products?accept=application%2Fvnd.bestbuy.simpleproduct.v1%2Bjson&accept-language=en-CA&skus=14962185'
I've tried:
response = requests.get(aboveurl)
print(response.text)
and also:
r = requests.get(aboveurl).json()
print(r)
When I run my code in VS Code, it starts and keeps running but never displays anything.
You have to add headers to your request to get a response:
import requests
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.76 Safari/537.36', "Upgrade-Insecure-Requests": "1","DNT": "1","Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8","Accept-Language": "en-US,en;q=0.5","Accept-Encoding": "gzip, deflate"}
aboveurl = "https://www.bestbuy.ca/ecomm-api/availability/products?accept=application%2Fvnd.bestbuy.simpleproduct.v1%2Bjson&accept-language=en-CA&skus=14962185"
html = requests.get(aboveurl,headers=headers)
print(f'request code: {html.status_code}\n request text: {html.text}')
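
requests applies no timeout unless one is passed explicitly, so a server that never answers a request it dislikes leaves the script waiting, which may explain the "keeps running" symptom. Below is a minimal sketch that makes such a stall visible. fetch_text is a hypothetical helper, not part of requests, and the 10-second limit is an arbitrary choice.

```python
from typing import Callable, Mapping


def fetch_text(get: Callable, url: str, headers: Mapping[str, str],
               timeout: float = 10.0) -> str:
    """Call get(url, headers=..., timeout=...) and return the body text.

    `get` is expected to behave like requests.get.
    """
    response = get(url, headers=headers, timeout=timeout)
    response.raise_for_status()  # turn 4xx/5xx answers into exceptions
    return response.text
```

Calling fetch_text(requests.get, aboveurl, headers) would then either return the body or raise an exception (for example requests.exceptions.Timeout) instead of waiting indefinitely.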

How to know request cookies when a URL sent to server as a request using Python

I want to find out the request headers for a URL, or at least the cookie value in those headers.
url = 'https://research.ibfd.org/endecapod/myN=3+10+4293760389+4293746984&Nu=global_rollup_key&Np=2&Nao=20&Nty=0&Ns=decision_date_sortable|1'
Headers for that URL are
headers = {
"Accept":"application/json; charset=utf-8",
"Accept-Encoding":"gzip, deflate, br",
"Accept-Language":"en-US,en;q=0.9",
"Connection":"keep-alive",
"Cookie": "optimizelyEndUserId=oeu1575288961321r0.6329519991798649; optimizelyBuckets=%7B%7D; _ga=GA1.2.693353487.1575288963; _hjid=cfef24b9-ec0f-4677-90eb-6f25e4a4351e; s_ecid=MCMID%7C89844361514127015796100074732098692046; optimizelySegments=%7B%22785891794%22%3A%22gc%22%2C%22786481600%22%3A%22false%22%2C%22787091486%22%3A%22referral%22%7D; angular-ab-tests-ibfd_preview=no; IBFD_SESSION=1xwvMTmbcR714KGplMpDLoYVqTikDSbU; _hjIncludedInSample=1; AMCVS_2F71365B5AA928340A495E05%40AdobeOrg=1; s_cc=true; wt_tags=SUB:kbase:masna.venu#thomsonreuters.com::username:; s_sq=%5B%5BB%5D%5D; _hjAbsoluteSessionInProgress=1; has_js=1; _gid=GA1.2.1180329552.1595404659; s_ppvl=Home%2520-%2520Tax%2520Research%2520Platform%2520-%2520IBFD%2C90%2C90%2C588%2C1422%2C588%2C1280%2C720%2C1.35%2CP; s_ppn=Search%20-%20Tax%20Research%20Platform%20-%20IBFD; s_nr=1595405276724-Repeat; AMCV_2F71365B5AA928340A495E05%40AdobeOrg=1075005958%7CMCIDTS%7C18466%7CMCMID%7C89844361514127015796100074732098692046%7CMCOPTOUT-1595412476s%7CNONE%7CMCAID%7CNONE%7CvVersion%7C4.4.1; s_ppv=Search%2520-%2520Tax%2520Research%2520Platform%2520-%2520IBFD%2C100%2C75%2C21135.408203125%2C360%2C588%2C1280%2C720%2C1.35%2CL",
#"Cookie":"optimizelyEndUserId=oeu1575288961321r0.6329519991798649; optimizelyBuckets=%7B%7D; _ga=GA1.2.693353487.1575288963; _hjid=cfef24b9-ec0f-4677-90eb-6f25e4a4351e; s_ecid=MCMID%7C89844361514127015796100074732098692046; optimizelySegments=%7B%22785891794%22%3A%22gc%22%2C%22786481600%22%3A%22false%22%2C%22787091486%22%3A%22referral%22%7D; angular-ab-tests-ibfd_preview=no; _gid=GA1.2.1850576194.1594914732; IBFD_SESSION=IstWIRdSWRTYlHdmcgyqwaigHIGkr6Q3; AMCVS_2F71365B5AA928340A495E05%40AdobeOrg=1; _hjIncludedInSample=1; s_ppn=Search%20-%20Tax%20Research%20Platform%20-%20IBFD; s_ppvl=%5B%5BB%5D%5D; s_cc=true; wt_tags=SUB:kbase:masna.venu#thomsonreuters.com::username:; s_sq=%5B%5BB%5D%5D; _hjAbsoluteSessionInProgress=1; AMCV_2F71365B5AA928340A495E05%40AdobeOrg=1075005958%7CMCIDTS%7C18461%7CMCMID%7C89844361514127015796100074732098692046%7CMCOPTOUT-1594974899s%7CNONE%7CMCAID%7CNONE%7CvVersion%7C4.4.1; s_nr=1594967699992-Repeat; s_ppv=Search%2520-%2520Tax%2520Research%2520Platform%2520-%2520IBFD%2C100%2C91%2C3987%2C759%2C642%2C1280%2C720%2C1.35%2CP",
"Host":"research.ibfd.org",
"Referer":"https://research.ibfd.org/",
"Sec-Fetch-Dest":"empty",
"Sec-Fetch-Mode":"cors",
"Sec-Fetch-Site":"same-origin",
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.106 Safari/537.36"
}
I got those headers from Chrome developer tools. The cookie value in them changes every day, and I want to get that cookie value with a script.
Can anyone help me get that?
Thanks in advance.
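
No answer is recorded for this question. As a starting point, the two Cookie headers pasted above can be split into name/value pairs to see which cookies actually change between days. The sketch below does that with plain string handling. parse_cookie_header and changed_cookies are illustrative names, and the sample strings are shortened stand-ins for the full headers.

```python
def parse_cookie_header(header: str) -> dict:
    """Split a Cookie request header into a {name: value} dict."""
    cookies = {}
    for pair in header.split(";"):
        name, sep, value = pair.strip().partition("=")
        if sep:  # skip fragments that contain no "="
            cookies[name] = value
    return cookies


def changed_cookies(old: str, new: str) -> list:
    """Names of cookies in `new` that are missing from or differ in `old`."""
    old_c, new_c = parse_cookie_header(old), parse_cookie_header(new)
    return sorted(name for name in new_c if old_c.get(name) != new_c[name])


old = "s_cc=true; IBFD_SESSION=IstWIRdSWRTYlHdmcgyqwaigHIGkr6Q3"
new = "s_cc=true; IBFD_SESSION=1xwvMTmbcR714KGplMpDLoYVqTikDSbU"
print(changed_cookies(old, new))  # ['IBFD_SESSION']
```

If the changing value turns out to be a session cookie set by the server, a requests.Session() keeps cookies from earlier responses in session.cookies, so the script may not need the pasted header at all. Whether this site issues IBFD_SESSION without a login is not known from the question.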

Python POST request says "Not Acceptable!"

My school has a website where students can sign up for classes during a study period, and I forget to do it all the time. I'm trying to make a script that logs in and signs up for me. This is what I have:
import requests
payload = {
"username": "",
"pwd": "",
"User-Agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/42.0.2311.135 Safari/537.36 Edge/12.246"
}
with requests.Session() as s:
    p = s.post('http://www.bloomingtonsouth.org/plus/login.php', data=payload)
    print(p.text)
    r = s.get('http://www.bloomingtonsouth.org/plus/student.php')
    print(r.text)
The output: "Not Acceptable!Not Acceptable!An appropriate representation of the requested resource could not be found on this server. This error was generated by Mod_Security.
Not Acceptable!Not Acceptable!An appropriate representation of the requested resource could not be found on this server. This error was generated by Mod_Security."
What am I doing wrong?
Edit 1:
headers = {
"Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9",
"Accept-Encoding": "gzip, deflate",
"Accept-Language": "en-US,en;q=0.9",
"Cache-Control": "max-age=0",
"Connection": "keep-alive",
"Host": "bloomingtonsouth.org",
"Referer": "http://bloomingtonsouth.org/plus/login.php",
"Upgrade-Insecure-Requests": "1",
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.117 Safari/537.36"
}
url = "http://www.bloomingtonsouth.org/plus/login.php"
payload = {"username": "", "password": "", "submitted":"TRUE"}
post = requests.post(url, headers=headers, data=payload)
r = requests.get('http://www.bloomingtonsouth.org/plus/student.php')
print(r.text)
I tried this, using the headers my browser sends, and it still doesn't work.
User-Agent needs to be a header, not part of the payload.
import requests
def login():
    # Send the User-Agent as a request header, not as a form field
    headers = {"User-Agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/42.0.2311.135 Safari/537.36 Edge/12.246"}
    url = "http://www.bloomingtonsouth.org/plus/login.php"
    payload = {"username": "", "pwd": ""}
    post = requests.post(url, headers=headers, data=payload)
    return post
login()
Try this code for logging in and let me know if it doesn't work.
If you have the full login request, you could use this online tool, which converts a curl command (for example one copied from the browser's developer tools) to Python requests code:
https://curl.trillworks.com/
If that is unclear, please post the website's full URL.
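
Beyond the User-Agent, one more detail in the attempts above may matter: the first snippet uses a Session, while Edit 1 calls requests.post and requests.get separately, so any login cookie would not be carried over to the second request. The sketch below combines both points. It takes a session object so it can be used with requests.Session(); login_and_fetch is a hypothetical helper, and the form field names username and pwd are taken from the question's first snippet and may not match the real form.

```python
LOGIN_URL = "http://www.bloomingtonsouth.org/plus/login.php"
STUDENT_URL = "http://www.bloomingtonsouth.org/plus/student.php"
USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/79.0.3945.117 Safari/537.36"
)


def login_and_fetch(session, username: str, password: str) -> str:
    """Log in, then fetch the student page on the same session."""
    # Headers set on the session are sent with every request it makes.
    session.headers.update({"User-Agent": USER_AGENT})
    # Credentials go in the form body; the field names are assumptions.
    session.post(LOGIN_URL, data={"username": username, "pwd": password}, timeout=10)
    # Cookies from the login response are reused by the session automatically.
    return session.get(STUDENT_URL, timeout=10).text
```

With requests installed, this could be tried as: with requests.Session() as s: print(login_and_fetch(s, "user", "secret")).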

Why can't I open this file using python when it opens fine in my browser?

I can open this JPG link in my browser and it displays fine. But when I try to read it in Python using the exact same headers, it is unreadable.
PIL says "OSError: cannot identify image file <_io.BytesIO object at 0x00000145034C9EB8>", and the file written to disk cannot be opened.
EDIT: I discovered that if I remove the User-Agent header, it works... but strangely, if I enter the address in Chrome, it sends the exact same User-Agent and does show the file in the browser.
import requests
from PIL import Image
import io
headers = {"Host": "i.telegraph.co.uk",
"Connection": "keep-alive",
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36",
"Upgrade-Insecure-Requests": "1",
"Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8",
"Accept-Encoding": "gzip, deflate",
"Accept-Language": "en-US,en;q=0.8"}
link="http://i.telegraph.co.uk/multimedia/archive/02068/police-news_2068686b.jpg"
r=requests.get(link, headers=headers)
print(r.status_code)
print(r.headers)
print(len(r.content))
with open("c:/users/s/documents/temp123.jpg", "wb") as f:
    f.write(r.content)
i = Image.open(io.BytesIO(r.content))
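
When PIL cannot identify a file, a useful first check is what the server actually sent. The sketch below inspects the leading bytes for a few well-known signatures; sniff_format is an illustrative helper. One possibility worth testing (an assumption, not something the question confirms) is that the server varies its response by request headers, for instance because the Accept header here advertises image/webp.

```python
def sniff_format(data: bytes) -> str:
    """Guess the format of a response body from its leading bytes."""
    if data.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "webp"
    if data[:6] in (b"GIF87a", b"GIF89a"):
        return "gif"
    if data.lstrip()[:1] == b"<":
        return "html/xml"  # likely an error or block page, not an image
    return "unknown"


print(sniff_format(b"\xff\xd8\xff\xe0" + b"\x00" * 8))  # jpeg
print(sniff_format(b"RIFF\x00\x00\x00\x00WEBPVP8 "))    # webp
```

Printing sniff_format(r.content) next to r.headers.get("Content-Type") in the script above would show whether the body is a JPEG at all when the User-Agent header is present.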
