How to read in tableau (twbx) file into python? - python

I would like to produce some custom output with python with data from Tableau files. I dont have access to the Tableau server to run the 'Tabpy' library.
Is there any other way to do it?
Thank you in advance

You may find the following link useful.
https://community.tableau.com/thread/152463
One of the posts in the thread mentioned the following which is worth exploring:
If you're looking to generate a TWBX dynamically, you should rename
your .twbx file to .zip, extract the contents and you can do whatever
you want with those in Python to dynamically create or adjust a
workbook file. The structure / definition of the workbook file is just
XML so no special code needed to read and parse that.

Related

How to have a pandas DATAFRAME saved into a SHAREPOINT as csv file?

I have a DataFrame that I would like to store as a CSV file in a Sharepoint.
It seems that the only way is to first save CSV file locally and then, using Shareplum, upload file to Sharepoint.
Is there a way to directly save DataFrame into Sharepoint as CSV file, without creating a local file?
Thanks a lot for your help.
It should be possible to write the csv content to an in-memory text buffer (e.g. StringIO or ByteIO) rather than to a local file - here is an example (last section of the page).
After that, you could use a library for writing the content directly to a Sharepoint: This discussion shows several approaches how to do that, including the Office365-REST-Python-Client and also SharePlum, which you have already mentioned.
Here are two more sources (Microsoft technical doc) that you might find useful:
How can I upload a file to Sharepoint using Python?
How to get and upload files from sharepoint with python?

Adding the "isExportable" attribute to an excel xmlMaps.xml file

So for a previous question I asked how to add a custom made xmlMap to an excel file in python and I was "successful" by opening the xlsx file as an archive and extracting the file structure, followed by adding the xmlMaps.xml file to the structure and including it in the "rels". I can now open the excel file and see that the xml source map is attached, but I cannot export it. It mentions that I need to set the attribute "xmlmap.isExportable" to True, but I have no clue about how to do this, preferably using python.
All I have found on google is this: https://learn.microsoft.com/en-us/office/vba/api/excel.xmlmap.isexportable
My old question regarding the case: Adding XML Source to xlsx file in python
Any help is greatly appreciated
Best regards
Martin
Turns out I just needed to understand how a xlsx file works and how xml connections are stored and referenced throughout multiple sub files

Embed CSV in Excel and import the data

I wrote a tool that extracts data from a large DB and outputs it to an Excel file along with (conditional) formatting to improve readability. For this I use Python with openpyxl on a Linux machine. It works great, but this package is rather slow for writing Excel.
It seems to be a lot quicker to dump the table as (compressed) csv, import that into Excel and apply formatting there using a macro/vba.
To automate the process I'd like to create an empty Excel file pre-loaded with the required VBA to do the formatting; a template. For every data dump, the data is embedded (compressed using deflate) into the Excel file and loaded into the Workbook upon opening the document (or using a "LOAD" button to circumvent macro related security things).
However, just adding some file into the Excel file raises an error when opened:
We found a problem with some content in 'Werkmap1_test_embed.xlsx'. Do you want us to try to recover as much as we can? If you trust the source of this workbook, click Yes.
Clicking Yes opens the file and shows some tracing information as XML:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<recoveryLog xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
<logFileName>Repair Result to Werkmap1_OLE_Word0.xml</logFileName>
<summary>Errors were detected in file '/Users/joostk/mnt/cluster/Werkmap1_OLE_Word.xlsx'</summary>
<additionalInfo>
<info>Excel completed file level validation and repair. Some parts of this workbook may have been repaired or discarded.</info>
</additionalInfo>
</recoveryLog>
Is it possible to avoid this? How would I embed a file into the Excel ZIP? Do I need to update some file table (which I could not file easily).
When that's done, I'd like to import the data. Can I access files in the Excel ZIP from VBA? I guess not, and I need to extract the data to some temporary path and load it from there.
I have found these helpful answers elsewhere to load ZIP and plain text:
https://stackoverflow.com/a/35781621/4998990
https://stackoverflow.com/a/11267603/4998990
Many thanks for sharing your thoughts!
so my "Answer" here is that this is caused by using Named Ranges, or an underlying table, or an embedded Query/Connection. When you start manipulating this file you will get the error that you are talking about:
There is no harm to the file if you click "yes" and open. Excel will open this in Repaired Mode which will require you to re-save the file.
The way I've worked around this is to re-read the "repaired" file, in python, and save it as another file or replace it. Essentially just do an extra step of re-reading the data into memory, and write it to a new file. The error will go away. As always, test this method before deploying to production to ensure no records are lost. The way I solve it is with two lines of pandas.
import pandas as pd
repair = pd.read_excel('PATH_TO_REPAIR_FILE')
new_file = repair.to_excel('PATH_TO_WHERE_NEW_FILE_GOES')

Create a new CSV File?

I'm looking to create a new .csv file using python and then write to the file after that. I couldn't find a command to create a new file using the CSV library. I thought something like
NewFile = csv.creatfile(PATH)
might exist but I couldn't find something like that.
Thank you in advance! Happy to answer any questions you may have!
The csv module doesn't handle file creation; Python handles file creation as a builtin function called open -- once you have a file handle created using open, you can use it with csv.writer or csv.DictWriterto write data to the file.

Editing excel spreadsheets with python

I'll start off by saying that I'm new to python. I'm trying to create an application that is a simple Q+A and will export the answers to specific cells of an excel. I have an existing spreadsheet that i would like to modify and save as a separate outfile leaving the original untouched. I've seen various ways that i can append the file but will overwrite the original.
As an example, i would like this code;
hq = input('Headquarters: ')
to put the response in cell S1
Am I way off base trying to use Python for this task? Any Help would be greatly appreciated!
-Paul
There may not be very straightforward solutions but there are a couple of tools which might help you.
The first one is openpyxl: https://openpyxl.readthedocs.org/en/2.0.2/# If you have xlsx files, you should be able to modify them with this.
You might also be able to do what you want to do by using xlutils module: http://pythonhosted.org/xlutils/index.html However, then you'll need to first read the file, then edit it, and then save it to another file. Formatting may be lost, etc.
This is heavily YMMV due to the not-so-well defined file format, but I'd start with openpyxl.

Categories

Resources