Is it possible to convert 'dynamic' excel formulas to python code? - python

Is it possible to convert excel formulas to python code? For example:
"=TEXT(SORT(PROPER(UNIQUE(FILTER("
"ws_1!A:A,ws_2!B:B=ws_3!C3"
')))), "")'
Or is it not possible? I looked into Pycel, xlcalculator and the formulas module, but unfortunately I cannot find an example more complicated than sum(A,B).
I could probably do it with pandas, but then it won't recalculate live in the spreadsheet. Or can I save a Python script in a cell instead of a formula?
If you have any idea how to translate simpler formulas such as the one below, or know of any library that can do it, I would be grateful for the tips:
'=IFERROR(VLOOKUP(C2,ws!A2:B3,2,0), "Invalid")'
My motivation is to avoid long Excel formulas in Python code and to make the logic testable.
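One way to get something testable is to re-express each formula as a plain Python function rather than trying to translate the formula string automatically. The following is only a sketch under that assumption: the names vlookup and filter_unique_proper_sorted are made up for illustration, the ranges are passed in as ordinary lists, and the outer TEXT(...) wrapper is left out. str.title() is used as a stand-in for PROPER, and the comparison is case-sensitive, unlike Excel's, so the results may differ in edge cases.

```python
def vlookup(key, table, col_index, default="Invalid"):
    """Rough analogue of IFERROR(VLOOKUP(key, table, col_index, 0), default).

    table is a list of rows; col_index is 1-based, as in Excel.
    """
    for row in table:
        # Exact match on the first column; return the requested column.
        if row and row[0] == key and 0 < col_index <= len(row):
            return row[col_index - 1]
    # No match (or column out of range): behave like IFERROR's fallback.
    return default


def filter_unique_proper_sorted(values, criteria, target):
    """Rough analogue of SORT(PROPER(UNIQUE(FILTER(values, criteria = target))))."""
    # FILTER: keep values whose paired criteria cell equals the target.
    kept = [v for v, c in zip(values, criteria) if c == target]
    # UNIQUE: drop duplicates while keeping first occurrences.
    unique = list(dict.fromkeys(kept))
    # PROPER then SORT: title-case each value and sort ascending.
    return sorted(v.title() for v in unique)


table = [["a", 1], ["b", 2]]
found = vlookup("b", table, 2)
missing = vlookup("z", table, 2)
names = filter_unique_proper_sorted(
    ["carol", "bob", "alice smith", "carol"],
    ["x", "y", "x", "x"],
    "x",
)
```

Functions like these can be covered by ordinary unit tests, but they do not live in the spreadsheet, so the workbook will not recalculate them on its own.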

Related

Python tool that shows calculations in Excel

Is there a good tool where I can do a complex calculation in Python but have it show the results as Excel formulas? I'm thinking of a use case where I want to do complex financial projections with more business logic than is comfortable to write directly in Excel. However, the end-users are familiar with Excel and want to verify my work by checking a spreadsheet.
To be more concrete, I would like to write things like
total_sales = europe_sales + us_sales
and have that translate to Excel formulas like
A3 = A2 + A1
Obviously, this would be for generating more complex spreadsheets with dozens of columns across an arbitrary number of rows.
Using a pandas DataFrame can be an option for you.
import pandas as pd
df = pd.read_excel('excel.xlsx')
df['total_sales'] = df['europe_sales'] + df['us_sales']
There is no tool to my knowledge that does exactly what you're asking. That said, I think you have a couple of options.
If what you are doing is not too complex, write it in Excel.
You can use VBA to write macros. However, if your supervisors can understand your VBA code, they will more than likely understand your Python script too.
If your boss really wants to check your work, try to replicate a previous example (if one exists) and confirm the output, or create a couple of edge/test cases to confirm your script works, or just have them confirm the results on the actual data the first time and trust it going forward.
Create a flow chart to help explain what you're doing.
Excel is very powerful, but a lot of things end up being a pain. If you're working on a task that is not straightforward to do in Excel, chances are it will be difficult for your boss to confirm/check your Excel logic anyway. My best advice is to talk to your boss and explain that you can give them the result in Excel, but the functions/logic are not easily implemented there.
Best of luck.
Having Python know how to translate itself into Excel formulas would be, well, a lot. That said, you can enter text in the format of an Excel formula and it'll work (but may not be all that easy or useful...).
For example, I did df.to_excel() on this dataframe and the formula column evaluated in Excel. Because to_excel() writes the index to column A by default, columns a and b end up in spreadsheet columns B and C, which is why the formulas refer to b and c.
pd.DataFrame({'a':[1,2,3],'b':[2,3,4],'c':['=b2+c2','=b3+c3','=b4+c4']})
   a  b       c
0  1  2  =b2+c2
1  2  3  =b3+c3
2  3  4  =b4+c4
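The formula strings above were typed by hand. As a rough sketch of how they could be generated from column names instead (so that total_sales = europe_sales + us_sales becomes =A2+B2 and so on), something like the following might work. The helper names column_letter and sum_formulas are made up for illustration, and the sketch assumes the header sits in row 1 and that no index column is written, so the first named column is column A.

```python
from string import ascii_uppercase


def column_letter(index):
    """Convert a 0-based column index to Excel letters (0 -> A, 26 -> AA)."""
    letters = ""
    index += 1
    while index:
        index, rem = divmod(index - 1, 26)
        letters = ascii_uppercase[rem] + letters
    return letters


def sum_formulas(columns, terms, n_rows, first_row=2):
    """Build one '=A2+B2'-style formula string per data row.

    columns: all column names in sheet order (assumed to start at column A).
    terms:   the names of the columns to add together.
    """
    refs = [column_letter(columns.index(name)) for name in terms]
    return [
        "=" + "+".join(f"{ref}{row}" for ref in refs)
        for row in range(first_row, first_row + n_rows)
    ]


columns = ["europe_sales", "us_sales", "total_sales"]
total_sales = sum_formulas(columns, ["europe_sales", "us_sales"], n_rows=3)
```

The resulting list can be assigned to a dataframe column before writing the file, in the same way as the hand-written strings.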

Python and CSV with formulae

I have a CSV file with formulae, like this:
1;;2.74;0
=A1+C1;=A2;=C1
What's the best way to convert formulae into numbers, as follows
1;;2.74;0
3.74;3.74;2.74
?
The only way I know is to read it with csv.reader as a list of lists and then loop through each element. But it seems there must be a simpler way.
P. S. I have a hint to use eval
The CSV format does not support formulas. It is a plain-text-only format.
Some popular software, such as MS Excel, will calculate formulas found in a CSV file, but I am not aware of a parser that does this. You may, however, attempt to write your own. How well that works will depend on how advanced the formulas in your CSV are.
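For the simple case in the question, a hand-written parser could look roughly like the sketch below. It is not a general formula engine: it assumes formulas contain only A1-style cell references, numbers and the operators + - * /, treats empty cells as 0, and walks the expression with the ast module instead of calling eval directly. The function names are made up for illustration.

```python
import ast
import csv
import io
import operator
import re

CELL_REF = re.compile(r"\b([A-Z]+)([0-9]+)\b")
OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.Div: operator.truediv}


def column_index(letters):
    """Convert column letters to a 0-based index (A -> 0, AA -> 26)."""
    n = 0
    for ch in letters:
        n = n * 26 + (ord(ch) - ord("A") + 1)
    return n - 1


def _arith(node):
    """Evaluate a parsed expression made only of numbers and + - * /."""
    if isinstance(node, ast.Expression):
        return _arith(node.body)
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in OPS:
        return OPS[type(node.op)](_arith(node.left), _arith(node.right))
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_arith(node.operand)
    raise ValueError("unsupported expression")


def evaluate_csv(text, delimiter=";"):
    """Return the rows of the CSV with formula cells replaced by numbers."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    cache = {}

    def cell_value(row, col, seen=frozenset()):
        key = (row, col)
        if key in cache:
            return cache[key]
        if key in seen:
            raise ValueError("circular reference")
        try:
            raw = rows[row][col].strip()
        except IndexError:
            raw = ""  # reference to a cell outside the data
        if raw.startswith("="):
            def substitute(match):
                ref = cell_value(int(match.group(2)) - 1,
                                 column_index(match.group(1)), seen | {key})
                return f"({float(ref or 0)!r})"  # empty cells count as 0
            expr = CELL_REF.sub(substitute, raw[1:])
            result = _arith(ast.parse(expr, mode="eval"))
        else:
            try:
                result = float(raw) if raw else ""
            except ValueError:
                result = raw  # plain text cell
        cache[key] = result
        return result

    # Plain cells keep their original text; only formulas are replaced.
    return [[cell_value(r, c) if cell.startswith("=") else cell
             for c, cell in enumerate(row)] for r, row in enumerate(rows)]


result = evaluate_csv("1;;2.74;0\n=A1+C1;=A2;=C1\n")
```

Here result keeps the first row as read and holds floats in the second row; writing it back out with csv.writer is left to the reader.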

Looping through various excel worksheets for curve fitting in python

I'm relatively new to performing data analysis with python, thus I'm sorry if this question seems too noob.
I have an excel-file with many different sheets. I've also written a script which fits and plots the data contained in these excel sheets. I have code written to perform curve fitting for only one of the sheets.
My idea was to create a loop that would iterate through all my sheets and apply the script in each one of them at a time, but I'm not really sure how to do this. Could you provide me any guidance, or any place where I could learn/read about how to do this? I have tried to search a bit around but I haven't been able to find anything useful.
Thanks!
I'm assuming you're trying to loop through different sheets of a single excel file. If so, you can use pd.read_excel's sheet_name parameter to pass sheets programmatically. (See https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_excel.html)
So, if you wanted to loop through the first three sheets, you could use:
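import pandas as pd

for i in range(3):
    # sheet_name accepts a zero-based sheet position or a sheet name
    df = pd.read_excel('path/to/file.xlsx', sheet_name=i)
    do_stuff(df)  # placeholder for your fitting and plotting code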
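If the number of sheets is not known in advance, passing sheet_name=None to pd.read_excel returns a dict mapping each sheet name to its DataFrame, which avoids hard-coding range(3). A small sketch of that variation follows; the helper names load_all_sheets and apply_to_sheets are made up for illustration, and pandas is imported inside the loader only so the second helper can be tried without it.

```python
def load_all_sheets(path):
    """Read every worksheet of an Excel file into {sheet name: DataFrame}."""
    import pandas as pd  # imported lazily; needs pandas and an Excel engine

    return pd.read_excel(path, sheet_name=None)


def apply_to_sheets(sheets, func):
    """Run func on each sheet's data and collect the results by sheet name."""
    return {name: func(data) for name, data in sheets.items()}


# Stand-in data so the helper can be tried without a workbook;
# with a real file this would be load_all_sheets('path/to/file.xlsx').
demo_sheets = {"run_1": [1.0, 2.0, 3.0], "run_2": [4.0, 5.0]}
counts = apply_to_sheets(demo_sheets, len)
```

With a real workbook, func would be the existing fitting and plotting routine, and the returned dict keeps each result next to the name of the sheet it came from.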

Can I translate/duplicate an existing graph in Excel into Python?

I have many graphs in Excel that I would like to convert to Python but am struggling with how to do so using Matplotlib. Is there a package or method that would essentially convert/translate all the formatting and data series selection into python?
Once I could see a few examples of the correct code I think I could start doing this directly in python but I do not have much experience manually creating graph code (I use Excel insert graphs mostly) so am looking for a bridge.

How to find the source data of a series chart in Excel

I have some pretty strange data I'm working with, as can be seen in the image. I can't seem to find any source data for the numbers these graphs are presenting.
Furthermore, if I search for the source, it only points to an empty cell for each graph.
Ideally I want to be able to retrieve the highlighted labels in each case using Python, and it seems finding the source is the only way to do this, so if you know of a Python module that can do that I'd be happy to use it. Otherwise, if you can help me find the source data, that would be even perfecter :P
So far I've tried the xlrd module for Python as well as manually showing all hidden cells, but neither works.
Here's a link to the file: Here
EDIT: I ended up just converting the xlsx to a PDF using the cloudconvert.com API, then using pdftotext to convert that to a .txt file. This captures everything, including the numbers on the edge of the chart, which can then be searched with an algorithm.
If a hopeless internet wanderer comes upon this thread with the same problem, you can PM me for more details :P
