I am working on a project that researches book records as cataloged in libraries. For part of the analysis, I need to create an Excel sheet with many regions like the following:
The data for these structures is stored in a database, and I'm using Python.
Being a pandas newbie, I tried to see if I could do this with DataFrames/MultiIndexes, but from the tutorials I have followed so far, I doubt this is the best way because, as you can see from the screen dump, the rows and columns are not very regular.
Should I rather use openpyxl for that task? Or some other API/stack I have not yet encountered?
Thanks for taking the time to read my question.
I am working on a personal project to learn Python scripting for Excel, and I want to learn how to move data from one workbook to another.
In this example, I am emulating a company employee ledger that has name, position, address, and more (it is organized by row, so every employee takes up one row). The goal is to transfer a selected number of people to a new ledger (another Excel file). I have a list of emails in a .txt file (it could even be another Excel file, but I thought .txt would be easier), and I want the script to run through the .txt file, get the emails, and look for any rows that have a matching email address (all emails are in column B). If any are found, it should copy that entire row to the new Excel file.
I tried a lot of ways to make this work, but I could not figure it out. I am really new to python so I am not even sure if this is possible. Would really appreciate some help!
There are essentially two packages for manipulating Excel files. For reading in data and performing analysis, the standard package is pandas. You can save the results as .xlsx, but you are only working with the tabular data, not the file itself (i.e., you are extracting data FROM the file, not working WITH the file).
What you need, however, is to manipulate the Excel files directly, which is better done with openpyxl.
You can also read files (such as your text file) with the open function in a with statement, which is built into Python rather than a third-party import like pandas or openpyxl.
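As a rough sketch of how those pieces could fit together for the ledger question: the function names below are made up for illustration, the emails are assumed to sit one per line in the text file, and the email column is assumed to be column B as described in the question. Only cell values are copied, not formatting.

```python
def load_emails(txt_path):
    """Read one email address per line; blank lines are ignored."""
    with open(txt_path, encoding="utf-8") as fh:
        return {line.strip().lower() for line in fh if line.strip()}


def matching_rows(rows, emails, email_col=1):
    """Yield the rows whose email cell (index 1 is column B) is in `emails`."""
    for row in rows:
        cell = row[email_col] if len(row) > email_col else None
        if isinstance(cell, str) and cell.strip().lower() in emails:
            yield row


def copy_matching_rows(src_xlsx, emails_txt, dst_xlsx):
    """Copy every matching row of the active sheet into a new workbook."""
    # Third-party import kept local so the helpers above work without it.
    from openpyxl import Workbook, load_workbook

    emails = load_emails(emails_txt)
    source = load_workbook(src_xlsx, read_only=True)
    target = Workbook()
    out_sheet = target.active
    for row in matching_rows(source.active.iter_rows(values_only=True), emails):
        out_sheet.append(row)  # appends the tuple of values as a new row
    target.save(dst_xlsx)
    source.close()
```

A call such as copy_matching_rows("ledger.xlsx", "emails.txt", "selected.xlsx") (hypothetical file names) would then write the matching rows to the new file. The header row is not copied because it does not match any email.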
Part of learning to program includes learning how to use documentation.
As such, here is the documentation you require with sufficient examples to learn openpyxl: https://openpyxl.readthedocs.io/en/stable/
And you can learn about pandas here: https://pandas.pydata.org/docs/user_guide/index.html
And you can learn about python with open here: https://docs.python.org/3/tutorial/inputoutput.html
Hope this helps.
EDIT: It's possible that I or someone else can give you a specific example using your data, code, etc., but you would have to provide it in full. Since you're learning, I suggest using the documentation or YouTube.
I'm building a website that'll have a Django backend. I want to be able to serve medical billing data from a database that Django will have access to. However, all of the data we receive is in Excel spreadsheets, so I've been looking for a way to get the data from a spreadsheet and import it into a Django model. I know there are some Django packages that can do this, but I'm having a hard time understanding how to use them. On top of that, I'm using Python 3 for this project. I've used win32com to automate Excel in the past, so I could write a function that grabs the data from the spreadsheet. What I want to figure out is how to write the data to a Django model. Any advice is appreciated.
Use http://www.python-excel.org/ and consider this process:
1. Make a view where the user can upload the xls file.
2. Open the file with xlrd: xlrd.open_workbook(filename)
3. Extract the rows and build a dict that maps the data you want to sync to the database.
4. Use the models to add, update, or delete the information.
If you follow this process, you will learn a lot about how loading and extracting work and how they fit your requirements. I recommend doing steps 2 and 3 in the shell first, so you can experiment quickly and avoid the upload/test/error cycle of a Django view.
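The extraction and sync steps above might look roughly like this. It is only a sketch: the function names are invented for illustration, the first row of the sheet is assumed to be a header whose names match the model's field names, and the model and key field passed in are whatever your project defines.

```python
def rows_to_records(header, rows):
    """Turn spreadsheet rows into dicts keyed by the header names."""
    return [dict(zip(header, row)) for row in rows]


def read_xls(filename):
    """Return (header, data_rows) from the first sheet of an .xls file."""
    import xlrd  # third-party; xlrd 2.x reads the legacy .xls format only

    sheet = xlrd.open_workbook(filename).sheet_by_index(0)
    header = sheet.row_values(0)
    rows = [sheet.row_values(i) for i in range(1, sheet.nrows)]
    return header, rows


def sync_records(model, records, key):
    """Insert or update one model instance per record, matched on `key`."""
    for record in records:
        # Everything except the lookup field is written as a default.
        defaults = {k: v for k, v in record.items() if k != key}
        model.objects.update_or_create(defaults=defaults, **{key: record[key]})
```

In a Django shell this could be tried as header, rows = read_xls("billing.xls") followed by sync_records(Claim, rows_to_records(header, rows), "claim_id"), where the file name, the Claim model, and the claim_id field are all hypothetical. Note that xlrd returns numeric cells as floats, so integer fields may need converting first.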
Hope this kickoff base works for you.
Why don't you use django-import-export?
It's a Django app that lets you import Excel files from the admin section.
It's very easy to install; its documentation includes an installation tutorial and an example.
Excel spreadsheets can be saved as .csv files, and there are already plenty of examples and explanations online on how to work with them.
In general, if you are having difficulty understanding documentation or packages, my advice would be to search for specific examples or see if whatever you are trying to do has already been done. Play with it to get a working understanding, and then modify it to fit your needs.
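For instance, once a sheet has been exported to CSV, the standard library alone can read it. A minimal sketch, assuming the first line of the file is a header row (the function name is made up for illustration):

```python
import csv


def read_csv_records(path):
    """Return one dict per data row, keyed by the names in the header row."""
    # "utf-8-sig" also strips the byte-order mark some CSV exports add.
    with open(path, newline="", encoding="utf-8-sig") as fh:
        return list(csv.DictReader(fh))
```

Every value comes back as a string, so numbers and dates need converting before they are stored in a model.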
I've been trying to do this and I really have no clue. I've searched a lot, and I know that I can easily merge the files with VBA or other languages, but I really want to do it with Python.
Can anyone get me on track?
I wish there were straightforward support in openpyxl/xlsxwriter for copying sheets across different workbooks.
However, as I see it, you would have to mash up a recipe using a couple of libraries:
One for reading the worksheet data, and
Another for writing the data to a unified xlsx.
For both of the above there are a lot of options in terms of Python packages.
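One possible recipe, sketched here with openpyxl doing both the reading and the writing (an assumption on my part; other package pairs would work too). The function names are invented for illustration. It copies cell values only, so formatting, charts, and formulas are not carried over, and it assumes at least one input file.

```python
def unique_title(title, taken):
    """Return a sheet title of at most 31 characters that is not in `taken`."""
    lowered = {t.lower() for t in taken}  # Excel compares titles case-insensitively
    candidate = title[:31]
    n = 1
    while candidate.lower() in lowered:
        suffix = f"_{n}"
        candidate = title[:31 - len(suffix)] + suffix
        n += 1
    return candidate


def merge_workbooks(paths, out_path):
    """Copy the values of every sheet in `paths` into one new workbook."""
    from openpyxl import Workbook, load_workbook  # third-party

    merged = Workbook()
    merged.remove(merged.active)  # drop the default empty sheet
    for path in paths:
        # data_only=True reads the cached results of formulas, not the formulas.
        source = load_workbook(path, read_only=True, data_only=True)
        for sheet in source.worksheets:
            title = unique_title(sheet.title, set(merged.sheetnames))
            target = merged.create_sheet(title=title)
            for row in sheet.iter_rows(values_only=True):
                target.append(row)
        source.close()
    merged.save(out_path)
```

The title helper is there because two source workbooks often both contain a sheet called "Sheet1", and Excel limits sheet names to 31 characters.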
I need to create some Excel tables, but these tables don't have a simple look.
There are some pictures, some special fonts etc.
But the complicated parts are static, that means always the same.
So my idea was, I will create an excel-template with these tricky parts and then from python just insert dynamic data to this template.
I am working with the pandas framework, but I haven't found a way to do that, with or without it.
Any idea?
There isn't an easy way to do this with any of the usual "direct file manipulation" libraries in Python (xlrd, xlwt, XlsxWriter, OpenPyXL; these are what pandas uses). The reason is that the structure of a workbook file is such that it's impossible or prohibitively difficult (depending on whether you're talking about .xls or .xlsx) to do anything resembling "in-place" editing, short of re-implementing Excel itself.
So for what you're trying to do, your best option is to let Excel do the work. (I'm assuming you can run Excel, since you mention that you'd like to create Excel templates.) There are ways to automate Excel, the most straightforward probably being Microsoft's VBA or VBScript. But if you want to do it in Python, you can, using PyWin32 or pywinauto.
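A minimal sketch of the PyWin32 route, assuming Windows with Excel installed, a template whose first worksheet receives the data, and an output file with the same extension as the template. The function names and the choice of target cells are illustrative only.

```python
import os


def block_to_cells(rows, top_row, left_col):
    """Map a 2-D block of values to {(row, col): value}, 1-based like Excel."""
    return {
        (top_row + r, left_col + c): value
        for r, row in enumerate(rows)
        for c, value in enumerate(row)
    }


def fill_template(template_path, output_path, rows, top_row, left_col):
    """Open the template in Excel, write the block, and save it as a copy."""
    import win32com.client  # PyWin32; requires Windows and Excel

    excel = win32com.client.Dispatch("Excel.Application")
    try:
        # Excel resolves relative paths on its own terms, so pass absolute ones.
        workbook = excel.Workbooks.Open(os.path.abspath(template_path))
        sheet = workbook.Worksheets(1)
        for (row, col), value in block_to_cells(rows, top_row, left_col).items():
            sheet.Cells(row, col).Value = value
        workbook.SaveAs(os.path.abspath(output_path))
        workbook.Close(False)  # already saved; do not prompt again
    finally:
        excel.Quit()
```

Because Excel itself opens and saves the file, the pictures and fonts in the template stay untouched. With pandas, the rows could come from something like df.values.tolist().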
I recently started to automate a report at work using Python. Since my data was provided to me as an Excel sheet, I felt the best way to do this was to use a Python module for Excel. My module of choice was openpyxl. It worked great: I've used it to perform calculations and organise my data, ready to plot charts. Now here's the problem...
I know that you cannot update existing charts using openpyxl so that option went out the window.
What I then tried to do was link the data in my openpyxl spreadsheet to another spreadsheet containing the charts (which is in turn linked to my Word document, where the charts are to be displayed). After doing this I ran my script and, to my annoyance, the data links between my openpyxl spreadsheet and the charts spreadsheet had been severed. I guess this is because openpyxl creates a new spreadsheet when you save using the save function, so the links are severed.
My question is: are there any ways to maintain the data links?
It is currently not possible to maintain links between files. I think it would be possible to keep them as metadata but, for fairly obvious reasons, it won't necessarily be possible to validate them. The best way for this to happen would be through a pull request.
If you're on Windows, you might look at the Python for Windows extensions, which allow you to remote-control the applications.