Download file from Django Project root using a button - python

This is the web page I'm currently building with Django 1.8.
I want the user to be able to export the data as .csv.
When the user:
writes a subreddit name in the box
presses the button 'Get Data'
What happens:
a test.csv file is created (saved in the root of the project)
data is retrieved using Praw
data is inserted into the .csv
data is rendered for the users to see
The problem now is:
I want the 'Export to Excel' button to download the generated file from the root of the Django project.
This is for the button:
<form class="export_excel" id="login_form" action="/app/export">
{% csrf_token %}
<button class="btn btn-lg btn-primary btn-block" value="Export to Excel" type="submit">Export To Excel</button>
</form>
This is in app/views.py:
def export(request):
    filename = "test.csv" # this is the file people must download
    response['Content-Disposition'] = 'attachment; filename=' + filename
    response['Content-Type'] = 'application/vnd.ms-excel; charset=utf-16'
    return response
This is in app/urls.py:
# app/urls.py
from django.conf.urls import url
from . import views
# Create your urls here.
urlpatterns = [
(...)
url(r'^export/$', views.export, name='export')
]
This is the error I'm getting when clicking the button:
Question is: How can I make the user export the file using the button? What am I doing wrong?
Thanks in advance for your help / guidance

You must first create the response object in order to assign headers to it.
from django.http import HttpResponse
def export(request):
    filename = "test.csv" # this is the file people must download
    # Build the response first; headers can only be set on an existing object
    with open(filename, 'rb') as f:
        response = HttpResponse(f.read(), content_type='application/vnd.ms-excel')
    response['Content-Disposition'] = 'attachment; filename=' + filename
    response['Content-Type'] = 'application/vnd.ms-excel; charset=utf-16'
    return response
Taken from here
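As a side note to the answer above, here is a minimal sketch of the same idea without going through a file on disk: the CSV is built in memory and paired with the headers a download response needs. `build_csv_download` and the sample rows are hypothetical names used only for illustration; in a Django view the returned bytes and headers would be handed to `HttpResponse`. The sketch assumes `text/csv`, the registered media type for CSV, rather than the Excel type used above.

```python
import csv
import io


def build_csv_download(rows, filename):
    """Return (body_bytes, headers) for a CSV file download.

    Hypothetical helper: not part of Django or of the original post.
    """
    # newline="" stops StringIO from translating the writer's line endings.
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)
    writer.writerows(rows)
    body = buffer.getvalue().encode("utf-8")
    headers = {
        "Content-Type": "text/csv; charset=utf-8",
        "Content-Disposition": 'attachment; filename="{}"'.format(filename),
    }
    return body, headers


body, headers = build_csv_download([["title", "score"], ["Hello", 42]], "test.csv")
```

Either way, the response object has to exist before any header is assigned to it, which is the point of the fix above.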

Related

Read from uploaded XLSX file in Python CGI script using Pandas

I am creating a tool where either
A new XLSX file is generated for the user to download
The user can upload an XLSX file they have, I will read the contents of that file, and use them to generate a new file for the user to download.
I would like to make use of Pandas to read the XLSX file into a dataframe, so I can work with it easily. However, I can't get it working. Can you help me?
Example extract from CGI file:
import pandas as pd
import cgi
from mako.template import Template
from mako.lookup import TemplateLookup
import http.cookies as Cookie
import os
import tempfile
import shutil
import sys
cookie = Cookie.SimpleCookie(os.environ.get("HTTP_COOKIE"))
method = os.environ.get("REQUEST_METHOD", "GET")
templates = TemplateLookup(directories = ['templates'], output_encoding='utf-8')
if method == "GET": # This is for getting the page
    template = templates.get_template("my.html")
    sys.stdout.flush()
    sys.stdout.buffer.write(b"Content-Type: text/html\n\n")
    sys.stdout.buffer.write(
        template.render())
if method == "POST":
    form = cgi.FieldStorage()
    print("Content-Type: application/vnd.ms-excel")
    print("Content-Disposition: attachment; filename=NewFile.xlsx\n")
    output_path = "/tmp/" + next(tempfile._get_candidate_names()) + '.xlsx'
    data = *some pandas dataframe previously created*
    if "editfile" in form:
        myfilename = form['myfile'].filename
        with open(myfilename, 'wb') as f:
            f.write(form['myfile'].file.read())
        data = pd.read_excel(myfilename)
    data.to_excel(output_path)
    with open(path, "rb") as f:
        sys.stdout.flush()
        shutil.copyfileobj(f, sys.stdout.buffer)
Example extract from HTML file:
<p>Press the button below to generate a new version of the xlsx file</p>
<form method=post>
<p><input type=submit value='Generate new version of file' name='newfile'>
<div class="wrapper">
</div>
</form>
<br>
<p>Or upload a file.</p>
<p>In this case, a new file will be created using the contents of this file.</p>
<form method="post" enctype="multipart/form-data">
<input id="fileupload" name="myfile" type="file" />
<input value="Upload and create new file" name='editfile' type="submit" />
</form>
This works without the if "editfile" in form: bit so I know something is going wrong when I am trying to access the file that the user has uploaded.
The problem is that whilst a file is created, the created file has a file size of 0 KB and will not open in Excel. Crucially, the file that the user has uploaded can not be found in the location that I have written it out.
You've passed myfilename to pandas; however that file doesn't exist on the server yet. You'll have to save the file somewhere locally first before using it.
The following will save the uploaded file to the current directory (the same directory as the CGI script). Of course, you're welcome to save it to some more suitable directory, depending on your setup.
form = cgi.FieldStorage()
myfilename = form['myfile'].filename
with open(myfilename, 'wb') as f: # Save the file locally
    f.write(form['myfile'].file.read())
data = pd.read_excel(myfilename)
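If a more suitable directory is preferred, the save step could be wrapped in a small helper. This is only a sketch: `save_upload` and the temporary target directory are hypothetical, and `os.path.basename` is applied on the assumption that the client-supplied filename should not be trusted to be a safe path.

```python
import io
import os
import shutil
import tempfile


def save_upload(fileobj, client_filename, directory):
    """Copy an uploaded file-like object into directory and return the saved path.

    Hypothetical helper; fileobj would be form['myfile'].file in the CGI script.
    """
    # Keep only the final path component so the client cannot pick the directory.
    # Backslashes are normalised first because some browsers send Windows paths.
    safe_name = os.path.basename(client_filename.replace("\\", "/"))
    if not safe_name:
        raise ValueError("upload has no usable filename")
    target = os.path.join(directory, safe_name)
    with open(target, "wb") as out:
        shutil.copyfileobj(fileobj, out)
    return target


# Stand-in for an uploaded file, so the sketch runs without a web server.
upload_dir = tempfile.mkdtemp()
saved_path = save_upload(io.BytesIO(b"fake xlsx bytes"), "../report.xlsx", upload_dir)
```

The returned path is what would then be passed to pd.read_excel.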

Problem with PDF download that I cannot open

I am working on a script to extract text from law cases using https://case.law/docs/site_features/api. I have created methods for search and create-xlsx, which work well, but I am struggling with the method to open an online pdf link, write (wb) in a temp file, read and extract the data (core text), then close it. The ultimate goal is to use the content of these cases for NLP.
I have prepared a function (see below) to download the file:
def download_file(file_id):
    http = urllib3.PoolManager()
    folder_path = "path_to_my_desktop"
    file_download = "https://cite.case.law/xxxxxx.pdf"
    file_content = http.request('GET', file_download)
    file_local = open( folder_path + file_id + '.pdf', 'wb' )
    file_local.write(file_content.read())
    file_content.close()
    file_local.close()
The script appears to work: it downloads the file and creates it on my desktop. But when I try to open the file manually, I get this message from Acrobat Reader:
Adobe Acrobat Reader could not open 'file_id.pdf' because it is either not a supported file type or because the file has been damaged (for example, it was sent as an email attachment and wasn't correctly decoded).
I thought the library was the problem, so I tried Requests / xlswriter / urllib3... (example below; I also tried reading the file from the script to see whether Adobe was the issue, but apparently not)
# Download the pdf from the search results
URL = "https://cite.case.law/xxxxxx.pdf"
r = requests.get(URL, stream=True)
with open('path_to_desktop + pdf_name + .pdf', 'w') as f:
    f.write(r.text)
# open the downloaded file and remove '<[^<]+?>' for easier reading
with open('C:/Users/amallet/Desktop/r.pdf', 'r') as ff:
    data_read = ff.read()
    stripped = re.sub('<[^<]+?>', '', data_read)
    print(stripped)
the output is:
document.getElementById('next').value = document.location.toString();
document.getElementById('not-a-bot-form').submit();
With 'wb' and 'rb' instead (and with the stripping step removed), the script is:
r = requests.get(test_case_pdf, stream=True)
with open('C:/Users/amallet/Desktop/r.pdf', 'wb') as f:
    f.write(r.content)
with open('C:/Users/amallet/Desktop/r.pdf', 'rb') as ff:
    data_read = ff.read()
    print(data_read)
and the output is :
<html>
<head>
<noscript>
<meta http-equiv="Refresh" content="0;URL=?no_js=1&next=/pdf/7840543/In%20re%20the%20Extradition%20of%20Garcia,%20890%20F.%20Supp.%20914%20(1994).pdf" />
</noscript>
</head>
<body>
<form method="post" id="not-a-bot-form">
<input type="hidden" name="csrfmiddlewaretoken" value="5awGW0F4A1b7Y6bxrYBaA6GIvqx4Tf6DnK0qEMLVoJBLoA3ZqOrpMZdUXDQ7ehOz">
<input type="hidden" name="not_a_bot" value="yes">
<input type="hidden" name="next" value="/pdf/7840543/In%20re%20the%20Extradition%20of%20Garcia,%20890%20F.%20Supp.%20914%20(1994).pdf" id="next">
</form>
<script>
document.getElementById(\'next\').value = document.location.toString();
document.getElementById(\'not-a-bot-form\').submit();
</script>
<a href="?no_js=1&next=/pdf/7840543/In%20re%20the%20Extradition%20of%20Garcia,%20890%20F.%20Supp.%20914%20(1994).pdf">Click here to continue</a>
</body>
</html>
but none of these work. The PDF is not password protected, and I tried other websites and it doesn't work there either.
Therefore, I am wondering whether I have another issue that is not linked to the code itself.
Please let me know if you need additional information.
thank you
It looks like instead of the PDF the web server is providing you with a web page intended to prevent bots from downloading data from the site.
There is nothing wrong with your code, but if you still want to do this you'll have to work around the bot prevention of the website.
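One way to notice this situation early is to check what actually came back before writing it to disk. The sketch below assumes that a real PDF normally starts with the `%PDF-` signature; `looks_like_pdf` is a hypothetical helper name, and the sample byte strings are made up for illustration.

```python
def looks_like_pdf(data):
    """Return True if the bytes appear to be a PDF rather than, say, an HTML page."""
    # PDF files normally begin with the signature "%PDF-"; some readers tolerate
    # a little leading junk, so search the first 1024 bytes rather than byte 0 only.
    return b"%PDF-" in data[:1024]


html_page = b"<html><head><noscript>...</noscript></head></html>"
pdf_start = b"%PDF-1.4\n%\xe2\xe3\xcf\xd3\n"

html_check = looks_like_pdf(html_page)
pdf_check = looks_like_pdf(pdf_start)
```

With Requests, the bytes to test would be r.content. If the check fails, saving the data with a .pdf extension only produces a file that Acrobat cannot open.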

Flask - Empty Files When Uploaded

I have a project which allows users to upload files, they will eventually get processed and then serve a result file.
I've just realised, now that I am trying to implement the processing part of the system, that the files are empty when uploaded and I'm not sure where this is going wrong. The files are successfully named and put in the directories, but are empty.
Any help would be much appreciated.
The html form for uploads:
<div class="tile is-vertical is-parent coach" id='fileUploadTile'>
<div class="tile is-child box has-background-light">
<form action='/' method='POST' enctype="multipart/form-data">
<div class='drop-zone'>
<span class='drop-zone__prompt is-family-monospace'>Drag and drop or click here to upload files</span>
<input class="drop-zone__input" type="file" multiple name="uploaded_file" id='fileLoader'>
</div> <!-- end of drop-zone-->
<div class='buttons are-medium' id='fileButtons'>
<input type='submit' class="button is-success is-light is-outlined is-family-monospace" id='submitFiles' value='Process file(s)'>
</form>
<button class='button is-danger is-light is-outlined is-family-monospace' name='resetFiles' id='resetFiles'>Reset File(s)</button>
</div> <!-- end of buttons -->
</div> <!-- end of tile is-child -->
</div> <!-- end of tile is-vertical -->
The code which attempts to validate the files (check extensions, file size, name, etc) and then save:
def file_upload():
    if session:
        session_name = session.get('public_user')
        print("New request from " + str(session_name))
        # update when last request was sent
        active_sessions[session_name] = datetime.now()
    else:
        session_token = generate_session_token()
        print("Generating new session... " + session_token)
        session['public_user'] = session_token # session = key, last request = value
        active_sessions[session_token] = datetime.now()
        os.mkdir('uploads/'+session['public_user']) #create directory for uploaded files
    if request.method == "POST":
        if request.files:
            files_processed = True
            files = request.files.getlist("uploaded_file")
            upload_path = app.config['UPLOAD_FOLDER'] + \
                str(session['public_user'])
            # loop, as possibility of multiple file uploads
            for file_to_upload in files:
                file_to_upload.seek(0, os.SEEK_END)
                file_length = file_to_upload.tell()
                file_name = check_existing_file_name(file_to_upload.filename)
                # Secures file name against user input
                file_name = secure_filename(file_name)
                # Checks the file name isn't blank
                if file_to_upload.filename == "":
                    print("Error with file" + file_to_upload.filename +
                          " - name must not be blank")
                    files_processed = False
                    continue
                # Checks the file has an allowed extension
                elif not allowed_ext(file_to_upload.filename):
                    print("Error with file" + file_to_upload.filename +
                          " - extension not supported")
                    files_processed = False
                    continue
                # Checks file size
                elif file_length > app.config['MAX_FILE_SIZE']:
                    print("Error with file" +
                          file_to_upload.filename + " file too big")
                    files_processed = False
                    continue
                else: # Else, passes all validation and is saved.
                    file_path = upload_path + "/" + file_name
                    file_to_upload.save(file_path)
            # If files have been processed, return a render with success message
            if files_processed is True:
                return render_template('index.html', is_home='yes', succ="Now processing your files...")
            else: # Else, normal redirect.
                return redirect(request.url)
        else: # If no files request, redirect to index.
            return redirect(request.url)
    else: # If not a POST request, load page as normal.
        return render_template('index.html', is_home='yes')
Sorry if this is something easy or silly I've missed - this is my first project in Python and my first time using Flask.
This all looks pretty ok.
Unrelated: I would check for the empty file name against file_name, not file_to_upload.filename.
About your problem:
The only sensible explanation is that one of your earlier calls leaves the file pointer at the end of the file, and save then has nothing left to read.
Unfortunately, I have no time to try this on my PC, so please do a seek(0) again, this time just before you call the save method.
I will try this later and update my answer accordingly.
Another way to find out what is going on is to use a debugger. That is not too complicated.
I made a 5 min video - just for debugging Flask: https://www.youtube.com/watch?v=DB4peJ1Lm2M
I was having the same issue and the recommended fix above helped me! From what I understand, it is because the file pointer is left at the end of the file when you grab the file size. This just has to be undone with another file_to_upload.seek(0).
Just for clarity, your code should now look like:
file_path = upload_path + "/" + file_name
file_to_upload.seek(0)
file_to_upload.save(file_path)
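The effect can be reproduced with an in-memory stream. This sketch uses `io.BytesIO` as a stand-in for the uploaded file object, on the assumption that the Flask upload object behaves the same way for the `seek`, `tell` and `read` calls involved.

```python
import io
import os

stream = io.BytesIO(b"hello world")

# Measure the size the same way the question does.
stream.seek(0, os.SEEK_END)
size = stream.tell()

# Reading now starts at the end of the stream, so nothing comes back.
data_without_rewind = stream.read()

# Rewinding first makes the full content available again.
stream.seek(0)
data_after_rewind = stream.read()
```

Anything that consumes the stream after the size check, including save, sees an empty stream unless it is rewound first.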

How to display a page and start the file download from it in django?

I have a method in my views.py that builds an HTTP response:
# # content-type of response
response = HttpResponse(content_type='application/ms-excel')
# #decide file name
response['Content-Disposition'] = 'attachment; filename="ThePythonDjango.xls"'
#adding new sheets and data
new_wb = xlwt.Workbook(encoding='utf-8')
new_wb.save(response)
This works fine if I only return the response.
But I also want to return a render:
return render(request, 'uploadpage/upload.html', {"excel_data": seller_info, "message":message, "errormessage":errormessage})
I was wondering if there's a way to do both.
It could be done by making the view behave differently on the existence of some query parameter, the download parameter, for example. Actually, it would be equivalent to having two separate views:
The first view should return the rendered HTML response, which automatically starts the file download. For example via an iframe, just put this code into the template:
<iframe width="1" height="1" frameborder="0" src="?download"></iframe>
You can see some other methods to start automatic download here: How to start automatic download of a file in Internet Explorer?
The second view should contain your XLS generation code and return the HttpResponse with the XLS contents.
Combining this into a single view could look like:
def xls_download_view(request):
    if 'download' in request.GET:
        response = HttpResponse(content_type='application/ms-excel')
        response['Content-Disposition'] = 'attachment; filename="ThePythonDjango.xls"'
        new_wb = xlwt.Workbook(encoding='utf-8')
        ...
        new_wb.save(response)
        return response
    else:
        ...
        return render(...)

Django- Create a downloadable excel file using pd.read_html & df.to_excel

I currently have a python script that uses pd.read_html to pull data from a site. I then use df.to_excel which sets 'xlsxwriter' as the engine.
I am trying to find a way to incorporate this into a Django web app. However, I am lost as to how to do this, or even whether it is possible.
I've seen a few ways to create downloadable excel files in django but none that have pandas as the driving force of creating the data in the excel file. My python code for creating the excel file without django is somewhat long so not sure what to show. Below is part of my pandas code:
xlWriter = pd.ExcelWriter(excel_sheet2, engine='xlsxwriter')
workbook = xlWriter.book
money_fmt = workbook.add_format({'num_format': 42, 'align': 'center', 'text_wrap': True})
text_fmt = workbook.add_format({'bold': True, 'align': 'center', 'text_wrap': True})
for i, df in enumerate(dfs):
    for col in df.columns[1:]:
        df.loc[df[col] == '-', col] = 0
        df[col] = df[col].astype(float)
    df.to_excel(xlWriter, sheet_name='Sheet{}'.format(i))
Below is my templates.html code
{% block content %}
<form type="get" action="." style="margin: 0">
<input id="search_box" type="text" name="search_box" placeholder="Enter URL..." >
<button id="search_submit" type="submit" >Submit</button>
</form>
{% endblock %}
And this is the beginning of my views.py
def financials(request):
    return render(request, 'personal/financials.html')
    if request.method == 'GET':
        search_query = request.GET.get('search_box', None)
        url = search_query
        dfs = pd.read_html(url, flavor='html5lib')
Why don't you just call your pandas functions within the Django view and save the file to /tmp? Once you have the file, you can send it and tell the browser to treat it as a file in your response.
You can then just return the file:
from django.http import HttpResponse
def my_view(request):
    # your pandas code here to grab the data;
    # my_data is expected to hold the contents of the generated file
    response = HttpResponse(my_data, content_type='application/vnd.ms-excel')
    response['Content-Disposition'] = 'attachment; filename="foo.xls"'
    return response
https://docs.djangoproject.com/en/dev/ref/request-response/#telling-the-browser-to-treat-the-response-as-a-file-attachment
I just wanted to add what I finally came up with that got everything working. Instead of including my data within the HttpResponse, I included the response within the wb.save() command. This got everything working correctly including my formatting of the spreadsheet prior to downloading.
wb = load_workbook(excel_sheet2)
response = HttpResponse(content_type='application/vnd.ms-excel')
response['Content-Disposition'] = 'attachment; filename="Data.xlsx"'
wb.save(response)
return response
