I found this code that uploads a file from a direct link to Google Drive using Google Colab, but I have to edit the code each time I want to upload a different URL.
Can anyone change the code so that I can enter the URL in a form instead of editing the code, and ideally use the form to name the file manually? Automatic naming would also be fine, like "1.mp4", "2.mp4" and so on.
This is the code:
import requests

file_url = "http://1.droppdf.com/files/5iHzx/automate-the-boring-stuff-with-python-2015-.pdf"
r = requests.get(file_url, stream=True)
with open("/content/gdrive/My Drive/python.pdf", "wb") as file:
    for block in r.iter_content(chunk_size=1024):
        if block:
            file.write(block)
You can make the file URL a form parameter by adding #@param {type:"string"} to the end of the line:
file_url = "http://1.droppdf.com/files/5iHzx/automate-the-boring-stuff-with-python-2015-.pdf" #@param {type:"string"}
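The form idea above can be extended to cover the file name and the "1.mp4", "2.mp4" auto-naming the question asks for. The following is only a sketch: `next_auto_name` and `save_url` are hypothetical helper names, and the download uses the standard-library `urllib.request` instead of `requests` so the block has no dependencies (the `iter_content` loop from the question would work just as well). It assumes Drive is already mounted at /content/gdrive.

```python
import os
import shutil
import urllib.parse
import urllib.request


def next_auto_name(dest_dir, ext):
    """Return the first unused name of the form '1<ext>', '2<ext>', ... in dest_dir."""
    n = 1
    while os.path.exists(os.path.join(dest_dir, f"{n}{ext}")):
        n += 1
    return f"{n}{ext}"


def save_url(file_url, dest_dir, file_name=""):
    """Stream file_url into dest_dir; auto-number the file when file_name is empty."""
    if not file_name:
        # Reuse the extension from the URL path, e.g. ".pdf" or ".mp4"
        ext = os.path.splitext(urllib.parse.urlparse(file_url).path)[1]
        file_name = next_auto_name(dest_dir, ext)
    dest = os.path.join(dest_dir, file_name)
    # copyfileobj copies in chunks, so large files are not held in memory
    with urllib.request.urlopen(file_url) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest


# In a Colab cell, the two form fields would be declared like this:
# file_url = ""   #@param {type:"string"}
# file_name = ""  #@param {type:"string"}
# save_url(file_url, "/content/gdrive/My Drive", file_name)
```

Leaving the file name field empty triggers the numbered naming; filling it in saves under that name instead.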
From Google Colab, I am trying to create a DataFrame from an xlsx file I have in a GitHub repo.
As the URL I use the permalink from GitHub; the repo is public and my account is connected to Colab. Reading the file fails with:
XLRDError: Unsupported format, or corrupt file: Expected BOF record; found b'\n\n\n\n\n\n<!'
Thank you in advance for your help!
The problem is probably the URL you are using.
Try this to see what requests.get actually returns:
url = "https://github.com/your-user-name/your-repo-name/blob/main/data/raw/your-file-name.xlsx"
import requests
from pprint import pprint
response = requests.get(url)
pprint(response.content)
It is an HTML page, not the spreadsheet. This is not what you want.
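Instead of reading the printed bytes by eye, the first few bytes of the response can be checked programmatically. This is a rough sketch with a hypothetical helper name, `describe_payload`; it relies on the fact that an .xlsx file is a ZIP container and therefore starts with the bytes `PK\x03\x04`, while an HTML page starts with a doctype or `<html` tag after optional whitespace.

```python
def describe_payload(data: bytes) -> str:
    """Guess what kind of content a downloaded byte string holds."""
    if data.startswith(b"PK\x03\x04"):
        # xlsx (like docx and other Office formats) is a ZIP archive
        return "zip container"
    # Ignore leading whitespace/newlines and compare case-insensitively
    head = data.lstrip()[:15].lower()
    if head.startswith(b"<!doctype html") or head.startswith(b"<html"):
        return "html page"
    return "unknown"
```

Calling `describe_payload(response.content)` before handing the file to pandas makes the "HTML instead of spreadsheet" case explicit.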
There are a couple of ways to solve this; this Medium post might be useful.
However, one simple fix is to use a URL like the example below:
https://raw.githubusercontent.com/your-username/name-of-the-repository/master/name-of-the-file.xlsx
I've already tried this and it works.
import requests
import pandas as pd
url = "https://raw.githubusercontent.com/your-username/name-of-the-repository/master/name-of-the-file.xlsx"
response = requests.get(url)
dest = 'local-file.xlsx'
with open(dest, 'wb') as file:
    file.write(response.content)
frame = pd.read_excel(dest)
frame.head()
Conclusion: change your URL.
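The URL change can also be done in code. The sketch below uses a hypothetical helper, `github_blob_to_raw`, and assumes the usual GitHub layout of user / repo / `blob` / branch / path; it only rewrites the string and does not check that the file exists.

```python
from urllib.parse import urlsplit


def github_blob_to_raw(url):
    """Turn a github.com 'blob' page URL into its raw.githubusercontent.com form."""
    parts = urlsplit(url)
    segments = parts.path.strip("/").split("/")
    # Expected layout: user / repo / "blob" / branch / path...
    if parts.netloc != "github.com" or len(segments) < 5 or segments[2] != "blob":
        raise ValueError(f"not a GitHub blob URL: {url}")
    user, repo, _, *rest = segments
    return "https://raw.githubusercontent.com/" + "/".join([user, repo, *rest])


print(github_blob_to_raw(
    "https://github.com/your-user-name/your-repo-name/blob/main/data/raw/your-file-name.xlsx"
))
```

Any query string on the original URL (such as `?raw=true`) is dropped, since the raw host does not need it.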
Use the link behind the "View raw" button. For my file I use the URL below:
import pandas as pd
url = 'https://github.com/mehadisaki/Sales-Forecasting-model-development-/blob/main/TV%20Delivery_2016-2022.xlsx?raw=true'
db = pd.read_excel(url)
With Google Colab one thing you could do is use the wget command, like this.
!wget "https://raw.githubusercontent.com/your-username/name-of-the-repository/master/name-of-the-file.xlsx"
I am downloading some files from the FAO GAEZ database, which uses an HTTP POST based login form.
I am thus using the requests module. Here is my code:
my_user = "blabla"
my_pass = "bleble"
site_url = "http://www.gaez.iiasa.ac.at/w/ctrl?_flow=Vwr&_view=Welcome&fieldmain=main_lr_lco_cult&idPS=0&idAS=0&idFS=0"
file_url = "http://www.gaez.iiasa.ac.at/w/ctrl?_flow=VwrServ&_view=AAGrid&idR=m1ed3ed864793f16e83ba9a5a975066adaa6bf1b0"
with requests.Session() as s:
    s.get(site_url)
    # Pass the variables, not the literal strings 'my_user' / 'my_pass'
    s.post(site_url, data={'_username': my_user, '_password': my_pass})
    r = s.get(file_url)
    if r.ok:
        with open(my_path + "\\My file.zip", "wb") as c:
            c.write(r.content)
However, with this procedure I download the HTML of the page.
I suspect that to solve the problem I have to add the name of the zip file to the url, i.e. new_file_url = file_url + "/file_name.zip". The problem is that I don't know the "file_name". I've tried with the name of the file which I obtain when I download it manually, but it does not work.
Any idea how to solve this? If you need more details on the GAEZ website, see also: Python - Login and download specific file from website
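Rather than guessing the file name, it may help to look at what the server says about the response. The sketch below is not specific to GAEZ: it assumes the server sends standard `Content-Type` and `Content-Disposition` headers, which is not guaranteed, and the helper names are hypothetical. It accepts any mapping of headers, such as `r.headers` from requests.

```python
from email.message import Message


def filename_from_headers(headers):
    """Return the file name announced in Content-Disposition, or None."""
    value = headers.get("Content-Disposition")
    if not value:
        return None
    # Let the standard library parse the header parameters
    msg = Message()
    msg["Content-Disposition"] = value
    return msg.get_filename()


def is_html(headers):
    """True when the server labelled the body as an HTML page."""
    return headers.get("Content-Type", "").lower().startswith("text/html")
```

If `is_html(r.headers)` is true after the login step, the session was most likely served a page (for example the login form again) rather than the archive, and writing `r.content` to a .zip file will not produce a usable download.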
I am trying to have my server, in Python 3, grab files from URLs. Specifically, I would like to pass a URL into a function and have the function fetch an audio file (in any of several formats) and save it as an MP3, probably using ffmpeg or ffmpy. If the URL also has a PDF, I would like to save that too, as a PDF. I haven't done much research on the PDF part yet, but I have been working on the audio piece and wasn't sure if this was even possible.
I have looked at several questions here, most notably:
How do I download a file over HTTP using Python?
It's a little old, but I tried several methods from there and always ran into some sort of issue. I have tried the requests library, urllib, streamripper, and maybe one other.
Is there a way to do this, and is there a recommended library?
For example, most of the ones I have tried do save something, such as the HTML page or, in this case, an empty file called 'file.mp3'.
Streamripper returned a "try changing user agents" error.
I am not sure if this is possible, and I am sure there is something I'm not understanding here. Could someone point me in the right direction?
This isn't necessarily the code I'm trying to use, just an example of something I have used that doesn't work.
import requests
url = "http://someurl.com/webcast/something"
r = requests.get(url)
with open('file.mp3', 'wb') as f:
    f.write(r.content)
# Retrieve HTTP meta-data
print(r.status_code)
print(r.headers['content-type'])
print(r.encoding)
Edit:
import requests
import ffmpy
import datetime
import os
## THIS SCRIPT CAN BE PASSED A URL AND IF THE URL RETURNS
## HTTP HEADER FOR CONTENT TYPE AUDIO/MPEG, THE FILE WILL
## BE SAVED AS THE CURRENT-DATE-AND-TIME.MP3
##
## THIS SCRIPT CAN BE PASSED A URL AND IF THE URL RETURNS
## HTTP HEADER FOR CONTENT TYPE application/pdf, THE FILE WILL
## BE SAVED AS THE CURRENT-DATE-AND-TIME.PDF
##
## THIS SCRIPT CAN BE PASSED A URL AND IF THE URL RETURNS
## HTTP HEADER FOR CONTENT TYPE other than application/pdf, OR
## audio/mpeg, THE FILE WILL NOT BE SAVED
def BordersPythonDownloader(url):
    print('Beginning file download requests')
    r = requests.get(url, stream=True)
    contype = r.headers['content-type']
    if contype == "audio/mpeg":
        print("audio file")
        filename = '[{}].mp3'.format(str(datetime.datetime.now()))
        with open('file.mp3', 'wb+') as f:
            f.write(r.content)
        ff = ffmpy.FFmpeg(
            inputs={'file.mp3': None},
            outputs={filename: None}
        )
        ff.run()
        if os.path.exists('file.mp3'):
            os.remove('file.mp3')
    elif contype == "application/pdf":
        print("pdf file")
        filename = '[{}].pdf'.format(str(datetime.datetime.now()))
        with open(filename, 'wb+') as f:
            f.write(r.content)
    else:
        print("URL DID NOT RETURN AN AUDIO OR PDF FILE, IT RETURNED {}".format(contype))
# INSERT YOUR URL FOR TESTING
# OR CALL THIS SCRIPT FROM ELSEWHERE, PASSING IT THE URL
#DEFINE YOUR URL
#url = 'http://archive.org/download/testmp3testfile/mpthreetest.mp3'
#CALL THE SCRIPT; PASSING IT YOUR URL
#x = BordersPythonDownloader(url)
#ANOTHER EXAMPLE WITH A PDF
#url = 'https://www.cisco.com/c/en/us/td/docs/switches/lan/catalyst6500/ios/12-2SY/configuration/guide/sy_swcg/etherchannel.pdf'
#x = BordersPythonDownloader(url)
Thanks Richard, this code works and helps me understand this better. Any suggestions for improving the above working example?
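Two small improvements might be worth considering; this is a sketch with hypothetical helper names, not a replacement for the working script. First, a Content-Type header can carry parameters (for example "application/pdf; charset=binary"), so comparing the whole header to "audio/mpeg" can miss valid responses. Second, str(datetime.datetime.now()) contains spaces and colons, which some file systems reject, so an explicit strftime format is safer.

```python
import datetime

# Media types the script accepts, mapped to the extension to save under
EXTENSIONS = {"audio/mpeg": ".mp3", "application/pdf": ".pdf"}


def media_type(content_type_header):
    """Strip parameters such as '; charset=...' and normalise case."""
    return content_type_header.split(";", 1)[0].strip().lower()


def timestamped_name(content_type_header, now=None):
    """Return a file-system-safe timestamped name, or None for other types."""
    ext = EXTENSIONS.get(media_type(content_type_header))
    if ext is None:
        return None
    now = now or datetime.datetime.now()
    return now.strftime("%Y-%m-%d_%H-%M-%S") + ext
```

In the script, `timestamped_name(r.headers.get('content-type', ''))` could replace the two separate `filename = ...` lines, with a `None` result taking the place of the final `else` branch.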
I am writing an application that creates a midi file using the MIDIUtil library. When the user submits an HTML form, a midi file object is created with MIDIUtil. How do I allow the user to download this as a .mid file? I have tried the following code, but I end up downloading a file of 0 bytes.
return Response(myMIDIFile, mimetype='audio/midi')
I use a variant of the following code to allow my users to download images they generate. The below code should work for you. Please note that you will most likely need to specify the full server path to the file being downloaded.
from flask import send_file
download_filename = FULL_PATH_TO_YOUR_MIDI_FILE
return send_file(download_filename, mimetype="audio/midi", as_attachment=True)
I ended up using this, and it worked.
new_file = open('test.mid', 'wb')
myMIDI.writeFile(new_file)
new_file.close()
new_file = open('test.mid', 'rb')
return send_file(new_file, mimetype='audio/midi')
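The temporary file on disk can probably be avoided by writing into an in-memory buffer. The sketch below assumes that `writeFile` accepts any binary file-like object and does not close it, which I have not verified for every MIDIUtil version; `to_rewound_buffer` is a hypothetical helper. The important step is `seek(0)`: a buffer that has just been written to is positioned at its end, and sending it without rewinding is one common way to end up with a 0-byte download.

```python
import io


def to_rewound_buffer(write_func):
    """Call write_func(buffer) and return the buffer positioned at byte 0."""
    buffer = io.BytesIO()
    write_func(buffer)
    # Without this, a later read() starts at the end and yields b""
    buffer.seek(0)
    return buffer


# Inside a Flask view this might be used as follows
# (download_name requires Flask 2.0 or later):
# buffer = to_rewound_buffer(myMIDI.writeFile)
# return send_file(buffer, mimetype="audio/midi",
#                  as_attachment=True, download_name="test.mid")
```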
You might want to just try using send_file:
from flask import send_file
return send_file("yourmidifile.mid", as_attachment=True, mimetype="audio/midi")
I'm currently creating an app that is supposed to take an input in the form of a URL (here, pointing to a PDF file), recognize the target as a PDF, and then upload it to a tmp folder I have on a server.
I have absolutely no idea how to proceed with this. I've already made a form containing a FileField, which works perfectly, but when it comes to URLs I have no clue.
Thank you for all answers, and sorry about my English.
The first 4 bytes of a pdf file are %PDF so you could just download the first 4 bytes from that url and compare them to %PDF. If it matches, then download the whole file.
Example:
import urllib.request

url = 'your_url'
req = urllib.request.urlopen(url)
first_four_bytes = req.read(4)
# read() returns bytes, so compare against a bytes literal
if first_four_bytes == b'%PDF':
    pdf_content = urllib.request.urlopen(url).read()
    # save to temp folder
else:
    # file is not PDF
    pass
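The same check can be done with a single request by keeping the connection open and streaming the rest of the body once the first four bytes match. This is a sketch with a hypothetical function name, `fetch_if_pdf`; it only inspects the magic bytes, so it does not prove that the rest of the file is a valid PDF.

```python
import os
import shutil
import tempfile
import urllib.request

PDF_MAGIC = b"%PDF"


def fetch_if_pdf(url, tmp_dir):
    """Save url into tmp_dir if it starts with %PDF; return the path or None."""
    with urllib.request.urlopen(url) as response:
        head = response.read(len(PDF_MAGIC))
        if head != PDF_MAGIC:
            return None
        # mkstemp picks a unique name, so parallel uploads do not collide
        fd, path = tempfile.mkstemp(suffix=".pdf", dir=tmp_dir)
        with os.fdopen(fd, "wb") as out:
            out.write(head)                     # the bytes already consumed
            shutil.copyfileobj(response, out)   # the remainder, in chunks
    return path
```

A `None` result means the URL did not point at a PDF and nothing was written to the temp folder.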