Django FileResponse doesn't download an HTML file - python

I'm trying to download a pre-generated HTML file, but nothing I've tried works.
Searching StackOverflow, I found that return FileResponse(open(file_path, 'rb')) will download a file, but instead of being downloaded, the HTML is just rendered in the tab. I think the problem is that the browser receives the HTML and, instead of displaying the "Save as" dialog, just renders it in the current tab.
In my main template I have a form with the target="_blank" attribute and a submit button that opens (without AJAX) a new tab which is supposed to download the file automatically.
What I want: after I submit the form, a new tab appears, the view for that URL runs some code (not related to the download), and after that process (which is working fine) an HTML file is downloaded to the device. The HTML file exists and has no problems; the only problem is that I want to DOWNLOAD the file but the browser displays it instead.
Note 1: Yes, I know that with right click -> download I can download the HTML that I see, but this system is for non-IT people, so I need to make it as easy as possible.
Note 2: I mention "without AJAX" because I found in another post that FileResponse doesn't work from AJAX.

You should set a special header on your response:
Content-Disposition: attachment; filename="cool.html"
response = FileResponse(open(file_path, 'rb'))
response['Content-Disposition'] = 'attachment; filename="cool.html"'
return response

Related

Download Excel File via button on website with Python

I am currently working on code that downloads an Excel file from a website. The file is hidden behind an Export button (see website: https://www.amundietf.co.uk/retail/product/view/LU1437018838). However, I have already identified the link behind it, which is the following: https://www.amundietf.co.uk/retail/ezaap/service/ExportDataProductSheet/Fwd/en-GB/718/LU1437018838/object/export/repartition?idFonctionnel=export-repartition-pie-etf&exportIndex=3&hasDisclaimer=1. Since the link does not lead directly to the file but rather executes some Java widget, I am not able to download the file via Python. I have tried the following code:
import re
import requests
link = 'https://www.amundietf.co.uk/retail/ezaap/service/ExportDataProductSheet/Fwd/en-GB/718/LU1437018838/object/export/repartition?idFonctionnel=export-repartition-pie-etf&exportIndex=3&hasDisclaimer=1'
r = requests.get(link, verify=False)
However, I am not able to connect to the file. Does somebody have an idea how to do this?
I would recommend using HTML:
<html lang=en>
<body>
<a href="...">Click here to download</a>
</body>
</html>
In the href attribute of the <a> tag, you can put the path to your own Excel file. I used an external link to an example file I found on Google. To open it in a new tab, add target="_blank" as an attribute of <a>.
Hope it works!

trying to download full HTML pages

I am trying to download a few hundred HTML pages in order to parse them and calculate some measures.
I tried it with Linux wget, and with a loop of the following code in Python:
import urllib.request
url = "https://www.camoni.co.il/411788/168022"
html = urllib.request.urlopen(url).read()
But the HTML file I get doesn't contain all the content I see in the browser for the same page. For example, text I see on the screen is not found in the HTML file. Only when I right-click the page in the browser and choose "Save As" do I get the full page.
The problem: I need a large number of pages and cannot do it by hand.
URL example: https://www.camoni.co.il/411788/168022 (the last number changes)
Thank you
That's because the site is not static. It uses JavaScript (in this case the jQuery library) to fetch additional data from the server and insert it into the page.
So instead of trying to GET the raw HTML, you should inspect the requests in the developer tools. There's a POST request to https://www.camoni.co.il/ajax/tabberChangeTab with this data:
tab_name=tab_about
memberAlias=ד-ר-דינה-ראלט-PhD
currentURL=/411788/ד-ר-דינה-ראלט-PhD
The result is HTML that is then inserted into the page.
So instead of just downloading the page, you should inspect the page and its requests to get the data, or use a headless browser such as Google Chrome to emulate the 'Save As' button and save the data.
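A rough sketch of reproducing the POST request described above with the standard library. The endpoint and field names are the ones listed in this answer; the helper name build_tab_request is made up for illustration, and the site may also expect cookies or extra headers that are not shown here.

```python
import urllib.parse
import urllib.request

AJAX_URL = "https://www.camoni.co.il/ajax/tabberChangeTab"
MEMBER_ALIAS = "ד-ר-דינה-ראלט-PhD"
CURRENT_URL = "/411788/ד-ר-דינה-ראלט-PhD"


def build_tab_request(member_alias, current_url, tab_name="tab_about"):
    # Hypothetical helper: packs the three form fields seen in the
    # developer tools into a URL-encoded POST body.
    form = {
        "tab_name": tab_name,
        "memberAlias": member_alias,
        "currentURL": current_url,
    }
    body = urllib.parse.urlencode(form).encode("utf-8")
    return urllib.request.Request(AJAX_URL, data=body, method="POST")


req = build_tab_request(MEMBER_ALIAS, CURRENT_URL)

# Sending it is left commented out so the sketch has no network side effects:
# html_fragment = urllib.request.urlopen(req).read().decode("utf-8")
```

To cover many pages, you would loop over the member aliases and page URLs and parse each returned HTML fragment.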

Automatically Upload CSV to Webapp Upload Form

How can I make a bot which will automatically go to a file upload form on my webapp at a public link, click the upload a file button, select a file and submit it?
For example, I have a file called "stored.csv" on my desktop, and I have a webapp with an upload form that looks like this:
All I'm trying to do is have a script which can grab that stored.csv file, go to the public link (http://website.com/upload/) that takes you to this page and then submit the file so that it all happens automatically when the script is run.
It would be much easier to send the POST request that the button triggers directly.
All you need to do is:
Open the developer tools in your browser on that page (most likely F12)
In the window that appears, click the "Network" tab
Leaving this window open, choose any file and click "submit"
A new record will appear at the end of the "Network" tab, containing information about the request that was made
Knowing the request you need to make, you can easily implement it in Python:
import requests as req
url = "Url that you will acquire"
data = {
"smth" : "path/to/file" # just copy the body from the known request
}
res = req.post(url=url, data=data)
print(res.status_code)  # requests exposes the HTTP status as status_code
And that's it.
There is some stuff that you'll need to figure out on your own, but now you have a map.
Hope this will help!
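If the request you see in the "Network" tab is a multipart form upload (typical for file inputs), requests can build it through its files parameter. This is only a sketch: the field name "file" is a guess that must be replaced with the name shown in the recorded request, and build_upload_request is a made-up helper.

```python
import io

import requests


def build_upload_request(url, field_name, filename, fileobj):
    # files= makes requests encode the body as multipart/form-data,
    # the way a browser does for <input type="file">.
    request = requests.Request(
        "POST",
        url,
        files={field_name: (filename, fileobj, "text/csv")},
    )
    return request.prepare()


# In-memory stand-in for open("stored.csv", "rb"); "file" is an assumed field name.
prepared = build_upload_request(
    "http://website.com/upload/",
    "file",
    "stored.csv",
    io.BytesIO(b"a,b\n1,2\n"),
)

# To actually send it:
# with requests.Session() as session:
#     res = session.send(prepared)
#     print(res.status_code)
```

Preparing the request first lets you compare its headers and body with what the browser sent before anything hits the server.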

How to download a file using python that is sent after some delay by server?

I have to download a large number of files from a local server. When opening the URL in the browser (Firefox), the page opens with the content "File being generated.. Wait.." and then a popup comes up with the option to save the required .xlsx file.
I tried to save the page object using urllib, but it saves the .html file with the content as "File being generated.. Wait..". I used the code as described here (using urllib2):
How do I download a file over HTTP using Python?
I don't know how to download the file that is sent later by the server. It works fine in browser. How to emulate it using python?
First of all, you have to find the exact URL where the document is generated. You can use Firefox with the Live HTTP Headers add-on.
Then use Python to "simulate" the same request.
I hope that helps.
PS: or share the URL of the site, and then I could help you better.
import requests
url = 'https://readthedocs.org/projects/python-guide/downloads/pdf/latest/'
myfile = requests.get(url, allow_redirects=True)
open('c:/example.pdf', 'wb').write(myfile.content)
A bit old, but I faced the same problem.
The key to the solution is allow_redirects=True.
Is it as simple as
import urllib2
import time
response = urllib2.urlopen('http://www.example.com/')
time.sleep(10) # Or however long you need.
html = response.read()
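If the server keeps answering with the "File being generated.. Wait.." page until the file is ready, one possible approach is to repeat the request until the response is no longer HTML. A minimal sketch, assuming the finished file arrives with a non-HTML Content-Type; poll_for_file and the fetch callable are hypothetical names, and whether your server really behaves this way should be checked in the browser's network tools first.

```python
import time


def poll_for_file(fetch, attempts=10, delay=1.0, sleep=time.sleep):
    """Call fetch() until it returns something other than an HTML page.

    fetch is a caller-supplied function returning (content_type, body_bytes).
    """
    for attempt in range(attempts):
        content_type, body = fetch()
        if not content_type.startswith("text/html"):
            return body  # looks like the real file, not the placeholder page
        if attempt < attempts - 1:
            sleep(delay)  # wait before asking again
    raise TimeoutError("file was not ready after %d attempts" % attempts)


# Example wiring with requests (url is whatever request the browser makes):
# import requests
# def fetch():
#     r = requests.get(url, allow_redirects=True)
#     return r.headers.get("Content-Type", ""), r.content
# open("report.xlsx", "wb").write(poll_for_file(fetch, attempts=20, delay=3))
```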

How to get the pdf file that downloads when I click 'submit' which also redirects me to new page

I am using mechanize to automatically download some pdf documents from webpages. When there is a pdf icon on the page, I can do this to get the file:
b.find_link(text="PDF download")
req = b.click_link(text="PDF download")
b.open(req)
Then I just write it to a new file.
However, for some of the documents I need, there is no direct 'PDF download' link on the page. Instead I have to click a 'submit' button to make a "delivery request" for the document: after clicking this button, the download starts while I am taken to another page which says "delivery request in progress" and then, once the download has finished, "Your delivery request is complete".
I have tried using mechanize to click the submit button, and then save the file that downloads by doing this:
b.select_form(nr=0)
b.submit()
downloaded_file = b.response().read()
but this stores the html of the page I am redirected to, not the file that downloads.
How do I get the file that downloads after I click 'submit'?
For anyone with a similar problem, I found a workaround: mechanize emulates a browser that doesn't have JavaScript, so I turned JavaScript off in my browser too. Then, when I went to the download page, I could see a link that said 'if the download hasn't already started, click here to download'. I could then get mechanize to find that link, follow it in the normal way, and write the response to a new file.
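To illustrate the idea of locating that fallback link, here is a small sketch that uses the standard library's html.parser instead of mechanize. The sample HTML and the /files/doc.pdf path are invented for the example; the real page's markup may differ.

```python
from html.parser import HTMLParser


class LinkTextFinder(HTMLParser):
    """Collects the href of every <a> whose visible text contains a phrase."""

    def __init__(self, needle):
        super().__init__()
        self.needle = needle.lower()
        self.matches = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            if self.needle in "".join(self._text).lower():
                self.matches.append(self._href)
            self._href = None


def find_fallback_link(html, needle="click here to download"):
    finder = LinkTextFinder(needle)
    finder.feed(html)
    return finder.matches[0] if finder.matches else None


sample_html = (
    "<p>Delivery request in progress. "
    '<a href="/files/doc.pdf">If the download hasn\'t already started, '
    "click here to download</a></p>"
)
print(find_fallback_link(sample_html))  # prints /files/doc.pdf
```

Once you have the href, you can open it with mechanize (or any HTTP client that shares the session) and write the response to a file.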
