attach_file is not picking up the absolute URL even though the file exists. It is able to pick up an internal path and send the file, but not the absolute URL:
email.attach_file("http://devuserapi.doctorinsta.com/static/pdfs/Imran_1066.pdf",mimetype="application/pdf")
The file opens when I copy and paste the URL into a browser. What could be the issue?
Thanks in advance.
attach_file takes a path to a file on your filesystem, not a URL, so you have to pass it a local path.
See https://docs.djangoproject.com/en/1.9/topics/email/
One untested possibility is to use the attach method instead and download the file on the fly:
import urllib2

# Download the PDF over HTTP and attach its bytes directly to the message
response = urllib2.urlopen("http://devuserapi.doctorinsta.com/static/pdfs/Imran_1066.pdf")
email.attach('Imran_1066.pdf', response.read(), mimetype="application/pdf")
It lacks error checking to make sure the file was downloaded, of course, and I haven't actually tried it myself, but that might be an alternative for you.
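If you are on Python 3, where urllib2 no longer exists, a minimal sketch of the same idea (assuming the requests library is installed) would be:

import requests

# Download the PDF and attach its raw bytes; response.content is bytes,
# which is what EmailMessage.attach() expects for a binary attachment.
pdf_url = "http://devuserapi.doctorinsta.com/static/pdfs/Imran_1066.pdf"
resp = requests.get(pdf_url)
resp.raise_for_status()  # fail loudly if the download did not succeed
email.attach('Imran_1066.pdf', resp.content, mimetype="application/pdf")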
Related
I am using Python 3.8.12. I tried the following code to download files from URLs with the requests package, but got an 'Unknown file format' message when opening the zip file. I tested different zip URLs, but every downloaded zip file is 18 KB and none of them can be opened successfully.
import requests
file_url = 'https://www.censtatd.gov.hk/en/EIndexbySubject.html?pcode=D5600091&scode=300&file=D5600091B2022MM11B.zip'
file_download = requests.get(file_url, allow_redirects=True, stream=True)
open(save_path+file_name, 'wb').write(file_download.content)
(Screenshots: zip file opening error message and zip file sizes)
However, once I updated the URL to file_url = 'https://www.td.gov.hk/datagovhk_tis/mttd-csv/en/table41a_eng.csv', the code worked well and the csv file could be downloaded perfectly.
I tried the requests, urllib, wget, and zipfile/io packages, but none of them worked.
The reason may be that the zip URL directs to both the zip file and a web page, while the csv URL directs to the csv file only.
I am really new to this field; could anyone help with it? Thanks a lot!
You might examine the headers after sending a HEAD request to get information about the file; examining Content-Type reveals the actual type of the file:
import requests
file_url = 'https://www.censtatd.gov.hk/en/EIndexbySubject.html?pcode=D5600091&scode=300&file=D5600091B2022MM11B.zip'
r = requests.head(file_url)
print(r.headers["Content-Type"])
which gives the output
text/html
So the file your URL points to is actually an HTML page.
import wget

# The censtatd URL returns an HTML page, not the zip archive
url = 'https://www.censtatd.gov.hk/en/EIndexbySubject.html?pcode=D5600091&scode=300&file=D5600091B2022MM11B.zip'
# url = 'https://golang.org/dl/go1.17.3.windows-amd64.zip'  # example of a direct download link
wget.download(url)
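As a defensive check before writing anything to disk, one option (a sketch, not tested against this particular site) is to verify the Content-Type of the response and only save it when it actually looks like a zip archive:

import io
import zipfile
import requests

file_url = 'https://www.censtatd.gov.hk/en/EIndexbySubject.html?pcode=D5600091&scode=300&file=D5600091B2022MM11B.zip'
resp = requests.get(file_url, allow_redirects=True)

content_type = resp.headers.get('Content-Type', '')
if 'zip' in content_type and zipfile.is_zipfile(io.BytesIO(resp.content)):
    with open('D5600091B2022MM11B.zip', 'wb') as f:
        f.write(resp.content)
else:
    # The server sent something else (here: an HTML page), so don't save it as a zip
    print('Not a zip file, got Content-Type:', content_type)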
Here is an example URL: "https://procurement-notices.undp.org/view_file.cfm?doc_id=257280"
If you put it in the browser, a file will start downloading to your system.
I want to download this file using Python and store it somewhere on my computer.
This is how I tried:
import requests
# first_url = 'https://readthedocs.org/projects/python-guide/downloads/pdf/latest/'
second_url="https://procurement-notices.undp.org/view_file.cfm?doc_id=257280"
myfile = requests.get(second_url , allow_redirects=True)
# this works for the first URL
# open('example.pdf' , 'wb').write(myfile.content)
# this didn't work for either of them
# open('example.txt' , 'wb').write(myfile.content)
# this works for the second URL
open('example.doc' , 'wb').write(myfile.content)
First: if I put first_url in the browser it downloads a PDF file, and second_url downloads a .doc file. How can I know what type of file the URL will give us, i.e. what type of file will be downloaded, so that I use the correct filename in open(...)?
Second: if I use the second URL in the browser, a file with the name "T__proc_notices_notices_080_k_notice_doc_79545_770020123.docx" starts downloading. How can I know this file name when I try to download the file?
If you know of a better solution, kindly let me know.
Kindly have a quick look at the "Downloading Files from URLs and zip downloaded files in python" question as well.
myfile.headers['content-type'] will give you the MIME type of the URL's content, and myfile.headers['content-disposition'] gives you info like the filename (if the response contains that header at all).
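As an illustration of that idea (a sketch; the header parsing and the fallback filename are assumptions, not a general-purpose parser), you could pick both the type and the filename from the response headers:

import requests

url = "https://procurement-notices.undp.org/view_file.cfm?doc_id=257280"
myfile = requests.get(url, allow_redirects=True)

# MIME type of the body, e.g. "application/msword"
print(myfile.headers.get('content-type'))

# Content-Disposition often looks like: attachment; filename="something.docx"
disposition = myfile.headers.get('content-disposition', '')
filename = 'downloaded_file'  # fallback if the header is missing
if 'filename=' in disposition:
    filename = disposition.split('filename=')[1].strip('"; ')

with open(filename, 'wb') as f:
    f.write(myfile.content)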
You can use the response's Content-Type header: for the first URL it is application/pdf and for the second URL it is application/msword, so you can save the file accordingly. You can make an extension dictionary that stores the possible file formats and their types and match against it. Your second question is essentially the same as this one, so I am taking your two URLs from that question, and for the file names I am just using integers:
import requests

all_urls = ['https://omextemplates.content.office.net/support/templates/en-us/tf16402488.dotx',
            'https://procurement-notices.undp.org/view_file.cfm?doc_id=257280']

# Map Content-Type values to file extensions
extension_dict = {'application/vnd.openxmlformats-officedocument.wordprocessingml.document': '.docx',
                  'application/vnd.openxmlformats-officedocument.wordprocessingml.template': '.dotx',
                  'application/vnd.ms-word.document.macroEnabled.12': '.docm',
                  'application/vnd.ms-word.template.macroEnabled.12': '.dotm',
                  'application/pdf': '.pdf',
                  'application/msword': '.doc'}

for i, url in enumerate(all_urls):
    resp = requests.get(url)
    # Look up the extension from the response's Content-Type header
    file_extension = extension_dict[resp.headers['Content-Type']]
    with open(f"{i}{file_extension}", 'wb') as f:
        f.write(resp.content)
For MIME types, see this answer.
I have been trying wget recently instead of requests; it is pretty straightforward and easy to use, but I have been having a problem with a specific link.
When I try to download a PNG image from Wikipedia, for some reason wget.download keeps raising an IndexError when trying to write the file, specifically this error:
wget.download(url, f"C:/Users/Family/Pictures/downloads/{name}")
File "C:\Users\Family\AppData\Local\Programs\Python\Python38\lib\site-packages\wget.py", line 527, in download
filename = detect_filename(url, out, headers)
File "C:\Users\Family\AppData\Local\Programs\Python\Python38\lib\site-packages\wget.py", line 486, in detect_filename
names["headers"] = filename_from_headers(headers) or ''
File "C:\Users\Family\AppData\Local\Programs\Python\Python38\lib\site-packages\wget.py", line 258, in filename_from_headers
name = fnames[0].split('=')[1].strip(' \t"')
IndexError: list index out of range
I tried setting a specific filename but it still didn't work. When I use wget in my cmd it does not seem to have a problem with the url, so how can I fix this?
import wget
# This is the link to the image
url = "https://upload.wikimedia.org/wikipedia/commons/thumb/b/b6/Image_created_with_a_mobile_phone.png/1200px-Image_created_with_a_mobile_phone.png"
# It does not seem to have a problem detecting the filename
name = wget.detect_filename(url)
# I tried to set a specific filename but I still got the error with or without it
wget.download(url, f"C:/Users/Family/Pictures/downloads/{name}")
I have the same case as you. When I did some research on the response headers, I found that in my case, if the response does not contain a 'content-disposition': 'attachment; filename="xxx"' header, wget throws this error. Hope this helps.
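If you just need the image on disk, one workaround (a sketch using requests instead of wget, reusing the output path from the question) is to pick the filename from the URL yourself and skip the header-based detection entirely:

import os
import requests

url = "https://upload.wikimedia.org/wikipedia/commons/thumb/b/b6/Image_created_with_a_mobile_phone.png/1200px-Image_created_with_a_mobile_phone.png"

# Take the last path segment as the filename instead of relying on Content-Disposition
name = url.rsplit('/', 1)[-1]

resp = requests.get(url)
resp.raise_for_status()
with open(os.path.join("C:/Users/Family/Pictures/downloads", name), 'wb') as f:
    f.write(resp.content)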
I'm trying to serve a PDF file using Flask, but I don't want the file to download as an attachment. I simply want it to appear in the user's browser as a separate page. I've tried passing the as_attachment=False option to the send_from_directory method, but no luck.
Here is my function so far:
@app.route('/download_to_browser/<filename>')
def download_to_browser(filename):
    return send_from_directory(directory=some_directory,
                               filename=filename,
                               as_attachment=False)
The function works in the sense that the file is downloading to my computer but I'd much rather just display it in the browser (and let the user download the file if they want to).
I read here that I need to change the Content-Disposition header, but I'm not sure how that can be done efficiently (perhaps using a custom response?). Any help?
Note: I'm not using Flask-Uploads at the moment but I probably will down the line.
You can try to add the mimetype parameter to send_from_directory:
return send_from_directory(directory=some_directory,
                           filename=filename,
                           mimetype='application/pdf')
That works for me, at least with Firefox.
If you need more control over headers, you can use a custom response, but you will lose the advantages of send_file() (I think it does something clever to serve the file directly from the webserver.)
with open(filepath, 'rb') as f:  # read the PDF as bytes
    file_content = f.read()
response = make_response(file_content, 200)
response.headers['Content-Type'] = 'application/pdf'
response.headers['Content-Disposition'] = ...
return response
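For the Content-Disposition value, one option (a sketch; the route, path, and filenames are just examples) is to ask the browser to display the file inline rather than download it:

from flask import Flask, make_response

app = Flask(__name__)

@app.route('/view_pdf/<filename>')
def view_pdf(filename):
    # Hypothetical directory; sanitize filename before doing this in production
    filepath = f"/path/to/pdfs/{filename}"
    with open(filepath, 'rb') as f:
        file_content = f.read()
    response = make_response(file_content, 200)
    response.headers['Content-Type'] = 'application/pdf'
    # "inline" asks the browser to render the PDF instead of saving it;
    # the filename is only a hint used if the user chooses to download it.
    response.headers['Content-Disposition'] = f'inline; filename="{filename}"'
    return response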
I'm still pretty new to scripting. I'm trying to figure out a way to output a list of URLs after redirects have occurred. I have about 800 sites in a text file that I want to test for redirects with a Python script, and I want to output the final redirected URL to a file (each on its own line). Is this possible?
With the file open, I can't figure out how to make urllib2.urlopen() read a line from a text file. It seems to require a URL? Maybe there is another module or something else I should be using instead?
Please help.
Thanks!
I'd use the requests library:
import requests
with open('urls.txt') as url_file:
    for url in url_file:
        resp = requests.get(url.strip())
        print(resp.url)
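Since the question asks for the final URLs written to a file rather than printed, a small extension of the same idea (the output filename is just an example) could be:

import requests

# Read each URL, follow redirects, and record where it ended up, one per line
with open('urls.txt') as url_file, open('final_urls.txt', 'w') as out_file:
    for url in url_file:
        url = url.strip()
        if not url:
            continue  # skip blank lines
        resp = requests.get(url)
        out_file.write(resp.url + '\n')  # resp.url is the URL after redirects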