I have built a Django app that extracts data from an MSSQL server and displays the results in a table in a template.
What I want to do now is export the same SQL query results to an Excel file. I use the pymssql driver and SQLAlchemy to connect to the database.
This is what I did, but somehow the Excel file wasn't created when the function was called:
import datetime
import pandas as pd
from sqlalchemy import create_engine

def download_excel(request):
    if "selectdate" in request.POST and "selectaccount" in request.POST:
        selected_date = request.POST["selectdate"]
        selected_acc = request.POST["selectaccount"]
        convert = datetime.datetime.strptime(selected_date, "%Y-%m-%d").toordinal()
        engine = create_engine('mssql+pymssql://username:password@servername/db')
        connection = engine.connect()
        # stmt is the SQL query; its definition is not shown here
        df = pd.read_sql(stmt, connection)
        # raw string so the backslashes in the Windows path are kept literally
        writer = pd.ExcelWriter(r'C:\excel\export.xls')
        df.to_excel(writer, sheet_name='bar')
My code actually worked. I expected it to save the Excel file to the 'C:\excel' folder, so I was looking for it there and couldn't find it. The file was actually exported to my Django project folder instead.
How do I allow the end user to download the file to their desktop instead of exporting it to the server itself?
I managed to get it to work after a lot of research. This code exports the SQL query results to an Excel file that the end user can download:
import datetime
from io import BytesIO

import pandas as pd
from django.http import HttpResponse
from sqlalchemy import create_engine

def download_excel(request):
    if "selectdate" in request.POST and "selectaccount" in request.POST:
        selected_date = request.POST["selectdate"]
        selected_acc = request.POST["selectaccount"]
        convert = datetime.datetime.strptime(selected_date, "%Y-%m-%d").toordinal()
        engine = create_engine('mssql+pymssql://username:password@servername/db')
        # results is the executed query result; the query itself is not shown here
        df = pd.DataFrame(data=list(results), columns=results.keys())

        #### download excel file ####
        excel_file = BytesIO()
        xlwriter = pd.ExcelWriter(excel_file, engine='xlsxwriter')
        df.to_excel(xlwriter, sheet_name='sheetname')
        xlwriter.close()    # finish writing the workbook into the buffer
        excel_file.seek(0)  # rewind so read() returns the whole file

        response = HttpResponse(excel_file.read(), content_type='application/vnd.openxmlformats-officedocument.spreadsheetml.sheet')
        # set the file name in the Content-Disposition header
        response['Content-Disposition'] = 'attachment; filename=myfile.xlsx'
        return response
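An in-memory buffer such as excel_file above keeps a current position, so reading it straight after writing to it returns nothing unless it is rewound first. The following is a minimal standard-library sketch of that behaviour; the payload is a placeholder and not real spreadsheet data.

```python
from io import BytesIO

buffer = BytesIO()
buffer.write(b"spreadsheet bytes would go here")  # placeholder payload

# After writing, the position sits at the end, so read() returns nothing.
unrewound = buffer.read()

# Option 1: rewind to the start, then read.
buffer.seek(0)
rewound = buffer.read()

# Option 2: getvalue() returns everything regardless of the position.
whole = buffer.getvalue()

print(len(unrewound), len(rewound), len(whole))
```

Either option should work when building the response body; getvalue() avoids having to remember the seek.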
I have this dataframe, and I want to save it as an Excel file in a SharePoint folder.
This is my code:
from office365.runtime.auth.client_credential import ClientCredential
from office365.sharepoint.client_context import ClientContext
# auth
client_credentials = ClientCredential(var_client_id, var_client_secret)
ctx = ClientContext(var_sp_site).with_credentials(client_credentials)
df = pd.DataFrame(sql_table)
var_relative_url = "sharepoint_path/sharepoint_path"
target_folder = ctx.web.get_folder_by_server_relative_url(var_relative_url)
target_folder.upload_file(content=df.to_excel(excel_writer='teste.xlsx'), file_name='teste.xlsx').execute_query() # Here is my problem
When I execute this code, the Excel file is created in the folder, but when I try to open the file in the SharePoint interface it raises an error ("cannot be opened").
This code will run on a cloud function, so I can't use local files to upload.
I'm investigating this issue right now. It's not solved yet, but I can give you a workaround: use .save()
wb = pd.ExcelWriter(outputFile, mode='w', engine="openpyxl")
myDataFrame.to_excel(wb, sheet_name='sheet1', index=False)
wb.save()  # newer pandas versions deprecate save() in favour of close()
From error to warning ;)
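For the question itself, one likely cause is that df.to_excel() returns None when it is given a file name, so there is no content to upload. A possible approach that avoids local files is to write the workbook into a BytesIO buffer and upload its bytes. This is only a sketch: the helper name dataframe_to_xlsx_bytes and the sample data are made up for illustration, and it assumes an Excel engine such as openpyxl or xlsxwriter is installed.

```python
from io import BytesIO

import pandas as pd


def dataframe_to_xlsx_bytes(df, sheet_name="Sheet1"):
    """Serialise a DataFrame to xlsx bytes without touching the disk (hypothetical helper)."""
    buffer = BytesIO()
    # Leaving the with-block closes the writer, which writes the workbook into the buffer.
    with pd.ExcelWriter(buffer) as writer:
        df.to_excel(writer, sheet_name=sheet_name, index=False)
    return buffer.getvalue()


# Sample data standing in for the SQL table from the question.
df = pd.DataFrame({"id": [1, 2], "name": ["a", "b"]})
xlsx_bytes = dataframe_to_xlsx_bytes(df)

# The bytes can then replace the to_excel() call in the question's upload line:
# target_folder.upload_file(content=xlsx_bytes, file_name='teste.xlsx').execute_query()
```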
I am trying to run a query, with the result saved as a CSV that is uploaded to a SharePoint folder. This is within Databricks via Pyspark.
My code below is close to doing this, but the final line is not functioning correctly - the file generated in SharePoint does not contain any data, though the dataframe does.
I'm new to Python and Databricks, if anyone can provide some guidance on how to correct that final line I'd really appreciate it!
from shareplum import Site, Office365
from shareplum.site import Version
import pandas as pd
sharepointUsername =
sharepointPassword =
sharepointSite =
website =
sharepointFolder =
# Connect to SharePoint Folder
authcookie = Office365(website, username=sharepointUsername, password=sharepointPassword).GetCookies()
site = Site(sharepointSite, version=Version.v2016, authcookie=authcookie)
folder = site.Folder(sharepointFolder)
FileName = "Data_Export.csv"
df = spark.sql(Query)
pandasdf = df.toPandas()
folder.upload_file(pandasdf.to_csv(FileName, encoding = 'utf-8'), FileName)
Sure my code is still garbage, but it does work. I needed to convert the dataframe into a variable containing CSV formatted data prior to uploading it to SharePoint; effectively I was trying to skip a step before. Last two lines were updated:
from shareplum import Site, Office365
from shareplum.site import Version
import pandas as pd
sharepointUsername =
sharepointPassword =
sharepointSite =
website =
sharepointFolder =
# Connect to SharePoint Folder
authcookie = Office365(website, username=sharepointUsername, password=sharepointPassword).GetCookies()
site = Site(sharepointSite, version=Version.v2016, authcookie=authcookie)
folder = site.Folder(sharepointFolder)
FileName = "Data_Export.csv"
df = (spark.sql(QueryAllocation)).toPandas().to_csv(header=True, index=False, encoding='utf-8')
folder.upload_file(df, FileName)
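The difference between the two versions comes down to what to_csv returns: given a path it writes the file and returns None, while without a path it returns the CSV text. A small sketch of that behaviour, using a made-up dataframe and a temporary directory in place of the Spark query result:

```python
import os
import tempfile

import pandas as pd

# Sample data standing in for the query result.
df = pd.DataFrame({"id": [1, 2], "name": ["a", "b"]})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "Data_Export.csv")
    # With a path, pandas writes the file and returns None.
    returned_with_path = df.to_csv(path, index=False)

# Without a path, pandas returns the CSV content as a string.
csv_text = df.to_csv(index=False)

print(returned_with_path)
print(csv_text)
```

Passing the first form to an upload call hands it None, which would explain an empty file; the second form hands it the data.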
In Python I am utilizing Office 365 REST Python Client library to access and read an excel workbook that contains many sheets.
While the authentication is successful, I am unable to append the right path of sheet name to the file name in order to access the 1st or 2nd worksheet by its name, which is why the output from the sheet is not JSON, rather IO Bytes which my code is not able to process.
My end goal is to simply access the specific work sheet by its name 'employee_list' and transform it into JSON or Pandas Data frame for further usage.
Code snippet below -
import io
import json
import pandas as pd
from office365.runtime.auth.authentication_context import AuthenticationContext
from office365.runtime.auth.user_credential import UserCredential
from office365.runtime.http.request_options import RequestOptions
from office365.sharepoint.client_context import ClientContext
from office365.sharepoint.files.file import File
from io import BytesIO
username = 'abc@a.com'
password = 'abcd'
site_url = 'https://sample.sharepoint.com/sites/SAMPLE/_layouts/15/Doc.aspx?OR=teams&action=edit&sourcedoc={739271873}'
ctx = ClientContext(site_url).with_credentials(UserCredential(username, password))
request = RequestOptions("{0}/_api/web/".format(site_url))
response = ctx.execute_request_direct(request)
json_data = json.loads(response.content) # ERROR ENCOUNTERED JSON DECODE ERROR SINCE DATA IS IN BYTES
You can access it by sheet index, check the following code....
import xlrd
loc = "File location"
wb = xlrd.open_workbook(loc)
sheet = wb.sheet_by_index(0)
# For row 0 and column 0
print(sheet.cell_value(0, 0))
You can try to add the component 'sheetname' to the url like so.
It seems that URL constructed to access data is not correct. You should test full URL in your browser as working and then modify code to get going. You may try this with some changes, I have verified that URL formed with this logic would return JSON data.
import io
import json
import pandas as pd
from office365.runtime.auth.authentication_context import AuthenticationContext
from office365.runtime.auth.user_credential import UserCredential
from office365.runtime.http.request_options import RequestOptions
from office365.sharepoint.client_context import ClientContext
from office365.sharepoint.files.file import File
from io import BytesIO
username = 'abc@a.com'
password = 'abcd'
site_url = "https://sample.sharepoint.com/_vti_bin/ExcelRest.aspx/RootFolder/ExcelFileName.xlsx/Model/Ranges('employee_list!A1%7CA10')?$format=json"
# Replace RootFolder/ExcelFileName.xlsx with actual path of excel file from the root.
# Replace A1 and A10 with actual start and end of cell range.
ctx = ClientContext(site_url).with_credentials(UserCredential(username, password))
request = RequestOptions(site_url)
response = ctx.execute_request_direct(request)
json_data = json.loads(response.content)
Source: https://learn.microsoft.com/en-us/sharepoint/dev/general-development/sample-uri-for-excel-services-rest-api
The update I'm using (Office365-REST-Python-Client==2.3.11) allows simpler access to an Excel file in the SharePoint repository.
# from original_question import pd,\
#                                    username,\
#                                    password,\
#                                    UserCredential,\
#                                    File,\
#                                    BytesIO
user_credentials = UserCredential(user_name=username,
                                  password=password)
file_url = ('https://sample.sharepoint.com'
            '/...')  # the rest of the path is cut off in the original
    ## absolute path of excel file on SharePoint
excel_file = BytesIO()
    ## initiating binary object
excel_file_online = File.from_url(abs_url=file_url)
    ## requesting file from SharePoint
excel_file_online = excel_file_online.with_credentials(
    user_credentials)
    ## validating file with accessible credentials
excel_file_online.download(excel_file).execute_query()
    ## writing binary response of the
    ## file request into bytes object
We now have a binary copy of the Excel file as a BytesIO object named excel_file. From here, reading it as a pd.DataFrame is as straightforward as reading an Excel file stored on a local drive, e.g.:
pd.read_excel(excel_file) # -> pd.DataFrame
Hence, if you are interested in a specific sheet like 'employee_list', you may prefer to read it as
employee_list = pd.read_excel(excel_file,
                              sheet_name='employee_list')
    # -> pd.DataFrame
or
data = pd.read_excel(excel_file,
                     sheet_name=None)  # -> dict
employee_list = data.get('employee_list')
    # -> [pd.DataFrame, None]
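I know you stated you can't use a BytesIO object, but for those coming here who are reading the file in as a BytesIO object like I was looking for, you can use the sheet_name arg in pd.read_excel:
url = "https://sharepoint.site.com/sites/MySite/MySheet.xlsx"
sheet_name = 'Sheet X'
# relative_url is the server-relative path of the file (not defined in the original)
response = File.open_binary(ctx, relative_url)
bytes_file_obj = io.BytesIO()
bytes_file_obj.write(response.content)  # copy the downloaded bytes into the buffer
bytes_file_obj.seek(0)  # rewind before reading
df = pd.read_excel(bytes_file_obj, sheet_name=sheet_name)  # select the sheet by name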
I'd like to upload an excel file in my web app, read the contents of it and display some cells. So basically I don't need to save the file as it's a waste of time.
Relevant code:
if form.validate_on_submit():
    f = form.xml_file.data.stream
    xml = f.read()
    workbook = xlrd.open_workbook(xml)
    sheet = workbook.sheet_by_index(0)
I can't wrap my mind around this as I keep getting filetype errors no matter what I try. I'm using Flask Uploads, WTF.file and xlrd for reading the file.
Reading the file works okay if I save it previously with f.save
To answer my own question, I solved it with
if form.validate_on_submit():
    # Put the file object (stream) into a var
    xls_object = form.xml_file.data.stream
    # Open it as a workbook
    workbook = xlrd.open_workbook(file_contents=xls_object.read())
For example, the following code creates the xlsx file first and then sends it as a download, but I'm wondering whether it is possible to send the xlsx data as it is being created. Imagine a very large xlsx file needs to be generated: the user has to wait until it is finished before the download starts. What I'd like is to start the download in the user's browser and then send the data over as it is generated. This seems trivial with a .csv file but not with an xlsx file.
from io import BytesIO

from django.http import HttpResponse
from xlsxwriter.workbook import Workbook

def your_view(request):
    # your view logic here

    # create a workbook in memory
    output = BytesIO()
    book = Workbook(output)
    sheet = book.add_worksheet('test')
    sheet.write(0, 0, 'Hello, world!')
    book.close()    # finish writing the workbook into the buffer
    output.seek(0)  # rewind so read() returns the whole file

    # construct response
    response = HttpResponse(output.read(), content_type="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet")
    response['Content-Disposition'] = "attachment; filename=test.xlsx"
    return response
Are you able to write tempfiles to disk while generating the XLSX?
If you are able to use tempfile you won't be memory bound, which is nice, but the download will still only start when the XLSX writer is done assembling the document.
If you can't write tempfiles, you'll have to follow this example http://xlsxwriter.readthedocs.org/en/latest/example_http_server.html and your code is unfortunately completely memory bound.
Streaming CSV is very easy, on the other hand. Here is code we use to stream any iterator of rows in a CSV response:
import csv
import io

def csv_generator(data_generator):
    # csv.writer needs a text buffer on Python 3
    csvfile = io.StringIO()
    csvwriter = csv.writer(csvfile)

    def read_and_flush():
        # return what has been written so far, then empty the buffer
        data = csvfile.getvalue()
        csvfile.seek(0)
        csvfile.truncate()
        return data.encode('utf-8')

    for row in data_generator:
        csvwriter.writerow(row)
        yield read_and_flush()

def csv_stream_response(response, iterator, file_name="xxxx.csv"):
    response.content_type = 'text/csv'
    response.content_disposition = 'attachment;filename="' + file_name + '"'
    response.charset = 'utf8'
    response.content_encoding = 'utf8'
    response.app_iter = csv_generator(iterator)
    return response
An xlsx file is a zip archive that contains several individual files, so you can't create it on the fly and send it out as it is being created.
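To illustrate the container layout, here is a small standard-library sketch that builds a zip archive in memory with xlsx-style part names. The XML bodies are placeholders, so the result is not a valid workbook; the point is only that the archive's index (the end-of-central-directory record) is written last, when the archive is closed.

```python
import io
import zipfile

buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
    # Part names follow the xlsx layout; the contents are placeholders.
    archive.writestr("[Content_Types].xml", "<Types/>")
    archive.writestr("xl/workbook.xml", "<workbook/>")
    archive.writestr("xl/worksheets/sheet1.xml", "<worksheet/>")

data = buffer.getvalue()

# The member list is read from the central directory at the end of the file.
members = zipfile.ZipFile(io.BytesIO(data)).namelist()

# Position of the end-of-central-directory signature within the archive.
eocd_offset = data.rfind(b"PK\x05\x06")

print(members)
print(eocd_offset, len(data))
```

Because a reader locates every member through that trailing record, a client generally cannot open the workbook until the final bytes have arrived.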