Why does requests fail to download an Excel file from the web? - python

The URL is a direct link to a web file (an .xlsb file) that I am trying to download. The code below runs without raising an error and the file is created at the given path, but when I try to open it, Excel reports that the file is corrupt. The response status is 400, so it is a bad request. Any advice on this?
import requests
url = 'http://rigcount.bakerhughes.com/static-files/55ff50da-ac65-410d-924c-fe45b23db298'
file_name = r'local path with xlsb extension'
with open(file_name, "wb") as file:
    response = requests.request(method="GET", url=url)
    file.write(response.content)

This works for me. Try the following:
from requests import get
url = 'http://rigcount.bakerhughes.com/static-files/55ff50da-ac65-410d-924c-fe45b23db298'
# make HTTP request to fetch data
r = get(url)
# check if request is success
r.raise_for_status()
# write out byte content to file
with open('out.xlsb', 'wb') as out_file:
    out_file.write(r.content)
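A 400 status most likely means the body is an error page rather than the workbook, and saving that body under an .xlsb name is what Excel then reports as corrupt. Below is a minimal sketch of a guard, assuming the status code and body have already been read from the response. It relies on .xlsb workbooks being packaged as ZIP containers; the helper names `looks_like_xlsb` and `save_if_valid` are hypothetical.

```python
import io
import zipfile


def looks_like_xlsb(payload: bytes) -> bool:
    """Return True if payload is a ZIP container, which is how .xlsb files are packaged."""
    return zipfile.is_zipfile(io.BytesIO(payload))


def save_if_valid(status_code: int, payload: bytes, path: str) -> bool:
    """Write payload to path only if the request succeeded and the body looks like a workbook."""
    if status_code != 200 or not looks_like_xlsb(payload):
        # Nothing is written, so no misleading .xlsb file is left on disk.
        return False
    with open(path, "wb") as out_file:
        out_file.write(payload)
    return True
```

With requests this could be called as `save_if_valid(r.status_code, r.content, 'out.xlsb')`.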

Related

Can't download csv.gz from website using Python

I'm currently trying to download a csv.gz file from the following link: https://www.cryptoarchive.com.au/bars/pair. As you can see, opening the link with a browser simply opens the save file dialogue. However, passing the link to requests or urllib simply downloads HTML as opposed to the actual file.
This is the current approach I'm trying:
EDIT: Updated to reflect changes I've made.
url = "https://www.cryptoarchive.com.au/bars/pair"
file_name = "test.csv.gz"
headers = {"PLAY_SESSION": play_session}
r = requests.get(url, stream=True, headers=headers)
with open(file_name, "wb") as f:
    for chunk in r.raw.stream(1024, decode_content=False):
        if chunk:
            f.write(chunk)
            f.flush()
The only saved cookie I can find is the PLAY_SESSION. Setting that as a header doesn't change the result I'm getting.
Further, I've tried posting a request to the login page like this:
login = "https://www.cryptoarchive.com.au/signup"
data = {"email": email,
        "password": password,
        "accept": "checked"}
with requests.Session() as s:
    p = s.post(login, data=data)
    print(p.text)
However, this also doesn't seem to work and I especially can't figure out what to pass to the login page or how to actually check the checkbox...
Browsing that URL in a private window shows the error:
Please login/register first.
To get that file, you need to log in to the site first. Logging in will probably give you a session token, a cookie, or something similar that you need to include in the request.
Both @Daniel Argüelles' and @Abhyudaya Sharma's answers helped me. The solution was simply to get the PLAY_SESSION cookie after logging in to the website and pass it to the request function.
cookies = {"PLAY_SESSION": play_session}
url = "https://www.cryptoarchive.com.au/bars/pair"
r = requests.get(url, stream=True, cookies=cookies)
with open(file_name, "wb") as f:
    for chunk in r.raw.stream(1024, decode_content=False):
        if chunk:
            f.write(chunk)
            f.flush()
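The earlier attempt with `headers={"PLAY_SESSION": play_session}` sends a header literally named PLAY_SESSION, whereas a cookie travels inside the `Cookie` header. Here is a small sketch of that difference, using the standard library's `urllib.request` in place of requests so the request can be inspected without network access. The helper name `request_with_cookies`, the example URL, and the cookie value are placeholders.

```python
import urllib.request


def request_with_cookies(url: str, cookies: dict) -> urllib.request.Request:
    """Build a GET request that carries the given cookies in a single Cookie header."""
    # Cookies are sent as "name=value" pairs joined by "; " under the Cookie header.
    cookie_header = "; ".join(f"{name}={value}" for name, value in cookies.items())
    return urllib.request.Request(url, headers={"Cookie": cookie_header})


# Placeholder URL and cookie value, for illustration only.
req = request_with_cookies("https://example.com/file", {"PLAY_SESSION": "abc123"})
print(req.get_header("Cookie"))
```

Passing `cookies=` to `requests.get`, as in the accepted solution above, takes care of building this header.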

How can I send a local HTML file to Telegram with a Python bot

This is my code:
file_address = './charts/candle_chart.html'
response = bot.sendDocument(chat_id=chat_id, document=file_address, timeout=timeout)
but I get an exception with a "host not found" error.
Telegram thinks this file is on the web.
Can anybody help me?
The telegram-bot documentation says:
The document argument can be either a file_id, an URL or a file from disk: open(filename, 'rb')
So it must be something like this:
html_file = open('./charts/candle_chart.html', 'rb')
response = bot.sendDocument(chat_id=chat_id, document=html_file, timeout=timeout)
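A variation that also closes the file once the upload has finished might look like the sketch below. It assumes `bot` exposes `sendDocument` with the keyword arguments shown in the question; the wrapper name `send_html_report` and the default timeout are made up for illustration.

```python
def send_html_report(bot, chat_id, path, timeout=30):
    """Open the file at path in binary mode and hand the file object to the bot."""
    # Passing an open file object (not the path string) is what tells the
    # library to upload local content instead of treating the value as a URL.
    with open(path, "rb") as html_file:
        return bot.sendDocument(chat_id=chat_id, document=html_file, timeout=timeout)
```

It would be called as `send_html_report(bot, chat_id, './charts/candle_chart.html')`.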

Downloading a .csv file from a response URL (which contains s3 bucket credentials) using python

I am trying to download a .csv file from a url as shown below:
http://s3.amazonaws.com/bucket_name/path/to/file/filename?AWSAccessKeyId=Key&Expires=15&Signature=SomeCharacters
The URL is returned by a web service that I call from Python. I can't hard-code the AWS credentials because the Signature changes on every call and the URL remains active for only 15 seconds. I tried the following:
s3.meta.client.download_file('bucketname', url, 'localpath')
and also
r = requests.get(url, auth=('username','password'), verify=False, stream=True)
r.raw.decode_content = True
with open("filename", 'wb') as f:
    shutil.copyfileobj(r.raw, f)
But neither method works when invoking the URL directly. Can someone please suggest something?
Thanks,
Milind
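No answer is recorded for this question. One thing worth trying, offered as an assumption rather than a confirmed fix: a presigned URL already carries its credentials in the query string (AWSAccessKeyId, Expires, Signature), so it can be fetched with a plain GET and no `auth` argument, as soon as it is received. Here is a standard-library sketch, with `download_to_file` as a hypothetical helper name:

```python
import shutil
import urllib.request


def download_to_file(url: str, dest_path: str, chunk_size: int = 64 * 1024) -> None:
    """Stream the body of url into dest_path without loading it all into memory."""
    # No auth argument: the presigned URL is assumed to carry its own signature.
    with urllib.request.urlopen(url) as response, open(dest_path, "wb") as out_file:
        shutil.copyfileobj(response, out_file, chunk_size)
```

Because the URL in the question expires after 15 seconds, the download would need to start immediately after the web service returns it.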

How to upload a pdf by sending a POST Request to an API

I have tried to upload a pdf by sending a POST Request to an API in R and in Python but I am not having a lot of success.
Here is my code in R
library(httr)
url <- "https://envoc-apply-api.azurewebsites.net/api/apply"
POST(url, body = upload_file("filename.pdf"))
The status I receive is 500, when I want a status of 202.
I have also tried the exact path instead of just the filename, but that produces a "file does not exist" error.
My code in Python
import requests
url ='https://envoc-apply-api.azurewebsites.net/api/apply'
files = {'file': open('filename.pdf', 'rb')}
r = requests.post(url, files=files)
Error I received
FileNotFoundError: [Errno 2] No such file or directory: 'filename.pdf'
I have been trying to use these two guides as examples.
R https://cran.r-project.org/web/packages/httr/vignettes/quickstart.html
Python http://requests.readthedocs.io/en/latest/user/quickstart/
Please let me know if you need any more info.
Any help will be appreciated.
You need to specify a full path to the file:
import requests
url ='https://envoc-apply-api.azurewebsites.net/api/apply'
files = {'file': open(r'C:\Users\me\filename.pdf', 'rb')}  # raw string so the backslashes are not read as escape sequences
r = requests.post(url, files=files)
or something like that. Otherwise Python looks for filename.pdf relative to the current working directory and never finds it when it tries to open it.
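A quick way to see which file Python will actually try to open is to resolve the path first and fail early with a clear message. This sketch uses `pathlib`; the helper name `resolve_upload_path` is hypothetical.

```python
from pathlib import Path


def resolve_upload_path(filename: str) -> Path:
    """Return the absolute path for filename, or raise if no such file exists."""
    # Relative names are resolved against the current working directory.
    path = Path(filename).expanduser().resolve()
    if not path.is_file():
        raise FileNotFoundError(f"No file at {path} (cwd is {Path.cwd()})")
    return path
```

The result can then be opened with `open(resolve_upload_path('filename.pdf'), 'rb')` before building the `files` dictionary.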

Write PDF file from URL using urllib2

I'm trying to save a dynamic PDF file generated by a web server using Python's urllib2 module.
I use the following code to get the data from the server and write it to a file, in order to store the PDF on a local disk:
import urllib2
import cookielib
theurl = 'https://myweb.com/?pdf&var1=1'
cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
opener.addheaders.append(('Cookie', cookie))
request = urllib2.Request(theurl)
print("... Sending HTTP GET to %s" % theurl)
f = opener.open(request)
data = f.read()
f.close()
opener.close()
FILE = open('report.pdf', "w")
FILE.write(data)
FILE.close()
This code runs without errors, but Adobe Reader does not recognize the written PDF file. If I make the request manually using Firefox, I receive the file without problems and can view it normally.
Comparing the received HTTP headers (Firefox and urllib2), the only difference is an HTTP header field, "Transfer-Encoding = chunked". This field is received in Firefox, but it does not seem to be received when I make the urllib2 request.
Any suggestion?
Try changing,
FILE = open('report.pdf', "w")
to
FILE = open('report.pdf', "wb")
The extra 'b' opens the file in binary mode. Currently you are writing binary data in ASCII/text mode.
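The effect can be reproduced without a server. The sketch below is Python 3 (the question uses Python 2's urllib2) and forces Windows-style newline translation with `newline="\r\n"` to imitate what text mode does on Windows. The byte string is made-up sample data, not a real PDF.

```python
import os
import tempfile

# Made-up sample bytes standing in for PDF data; note the bare "\n" bytes.
data = b"%PDF-1.4\n\x00\x01binary\npayload\n"

with tempfile.TemporaryDirectory() as tmp_dir:
    text_path = os.path.join(tmp_dir, "text_mode.pdf")
    binary_path = os.path.join(tmp_dir, "binary_mode.pdf")

    # Text mode: every "\n" is rewritten as "\r\n", so the bytes on disk change.
    with open(text_path, "w", encoding="latin-1", newline="\r\n") as f:
        f.write(data.decode("latin-1"))

    # Binary mode: the bytes are written exactly as received.
    with open(binary_path, "wb") as f:
        f.write(data)

    with open(text_path, "rb") as f:
        text_result = f.read()
    with open(binary_path, "rb") as f:
        binary_result = f.read()

print(binary_result == data)  # True
print(text_result == data)    # False
```

The text-mode copy ends up three bytes longer than the original, which is enough to make a PDF reader reject the file.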
