Python tool that shows calculations in Excel - python

Is there a good tool where I can do a complex calculation in Python but have it show the results as Excel formulas? I'm thinking of a use case where I want to do complex financial projections with more business logic than is comfortable to write directly in Excel. However, the end-users are familiar with Excel and want to verify my work by checking a spreadsheet.
To be more concrete, I would like to write things like
total_sales = europe_sales + us_sales
and have that translate to Excel formulas like
A3 = A2 + A1
Obviously, this would be for generating more complex spreadsheets, with dozens of columns across an arbitrary number of rows.

A pandas DataFrame may be an option for you:
import pandas as pd
df = pd.read_excel('excel.xlsx')
df['total_sales'] = df['europe_sales'] + df['us_sales']

To my knowledge, there is no tool that does exactly what you're asking. That said, I think you have a couple of options:
If what you are doing is not too complex, write it in Excel.
You can use VBA to write macros; however, if your supervisors can understand your VBA code, they will more than likely understand your Python script too.
If your boss really wants to check your work, try to replicate a previous example (if one exists) and confirm the output, or create a couple of edge/test cases to confirm your script works, or just have them confirm the results on the actual data the first time and trust it going forward.
Create a flow chart to help explain what you're doing.
Excel is very powerful, but a lot of things end up being a pain. If you're working on a task that is not straightforward to do in Excel, chances are it will be difficult for your boss to confirm/check your Excel logic anyway. My best advice is to talk to your boss and explain that you can give them the result in Excel, but that the functions/logic are not easily implemented there.
Best of luck.

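Having Python translate its own calculations into Excel formulas would be, well, a lot. That said, you can write the Excel formula as a string into a cell and it'll work (but it may not be all that easy or useful...).
For example, I did df.to_excel() on this dataframe and the formula column evaluated in Excel. With the default index written to column A and the header in row 1, columns a and b land in Excel columns B and C, which is why the formulas refer to b2, c2 and so on.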
pd.DataFrame({'a':[1,2,3],'b':[2,3,4],'c':['=b2+c2','=b3+c3','=b4+c4']})
a b c
0 1 2 =b2+c2
1 2 3 =b3+c3
2 3 4 =b4+c4
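Writing those formula strings by hand does not scale to dozens of columns, so they could be generated instead. The following is a minimal sketch, not an existing library: column_letter and build_sum_formulas are hypothetical helper names, and it assumes the columns are written starting at column A with a single header row (so the first data row is row 2).

```python
from string import ascii_uppercase


def column_letter(index):
    """Convert a zero-based column index to an Excel column letter (0 -> A, 26 -> AA)."""
    letters = ""
    index += 1  # switch to one-based for the base-26 conversion
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        letters = ascii_uppercase[remainder] + letters
    return letters


def build_sum_formulas(columns, terms, n_rows, first_row=2):
    """Build one '=A2+B2'-style formula per data row, adding the named columns."""
    letters = {name: column_letter(i) for i, name in enumerate(columns)}
    return [
        "=" + "+".join(f"{letters[term]}{row}" for term in terms)
        for row in range(first_row, first_row + n_rows)
    ]


# Column order as it would appear in the sheet (assumed layout).
columns = ["europe_sales", "us_sales", "total_sales"]
formulas = build_sum_formulas(columns, ["europe_sales", "us_sales"], n_rows=3)
print(formulas)  # ['=A2+B2', '=A3+B3', '=A4+B4']
```

The resulting strings could then be assigned to a DataFrame column such as total_sales and written with df.to_excel(..., index=False), so that the first DataFrame column really does land in column A.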

Related

Python: Take CSV tables and find the 4 highest values and their location

I have a folder of 310 .csv files. Here is a sample of what the contents look like
I need to create a program that goes through all the files, lists the file name, then lists the top 4 values from the table and the x-value associated with each. Ideally this would all be saved to a text document, but anything that prints in a readable format would be fine.
So, what is stopping you? Loop through the files, use pandas.read_csv to import each csv file, and merge/join them all into one DataFrame. Sort by the value column and slice off the top 4 rows (or use DataFrame.nlargest), and you can always print/visualize anything directly in a Jupyter Notebook. Exporting can be done using df.to_csv or any other method, and if you need a short introduction to pandas, look here.
Keep in mind that it is always a good idea to include a Minimal, Reproducible Example. Especially for a complicated merge operation between many DataFrames, this could help you a lot. However, there is no way around some research.
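The loop described above can be sketched without pandas, using only the standard library (csv and heapq are swapped in here for read_csv and sorting). Since the sample file is not shown, the column names "x" and "y" are assumptions, as are the function names top_values and report; each file is assumed to have a header row and numeric values.

```python
import csv
import heapq
from pathlib import Path


def top_values(csv_path, n=4, x_col="x", y_col="y"):
    """Return the n rows with the largest y values as (y, x) tuples, largest first."""
    with open(csv_path, newline="") as handle:
        rows = [(float(r[y_col]), float(r[x_col])) for r in csv.DictReader(handle)]
    return heapq.nlargest(n, rows)


def report(folder, out_path, n=4):
    """Write each file name followed by its top-n values to a text file."""
    with open(out_path, "w") as out:
        for csv_path in sorted(Path(folder).glob("*.csv")):
            out.write(f"{csv_path.name}\n")
            for y, x in top_values(csv_path, n):
                out.write(f"  x={x}  y={y}\n")
```

Calling report("path/to/folder", "top4.txt") would then produce one block per file; both paths are placeholders.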

Is it possible to convert 'dynamic' excel formulas to python code?

Is it possible to convert excel formulas to python code? For example:
"=TEXT(SORT(PROPER(UNIQUE(FILTER("
"ws_1!A:A,ws_2!B:B=ws_3!C3"
')))), "")'
Or is it not possible? I have looked into Pycel, xlcalculator, and the formulas module, but unfortunately I cannot find any example more complicated than sum(A,B).
I could probably do it with pandas, but then it would not keep updating live in the spreadsheet. Or can I save a Python script in a cell instead of a formula?
If you have any idea how to translate simpler formulas such as the one below, or know of a library that can do it, I would be grateful for tips:
'=IFERROR(VLOOKUP(C2,ws!A2:B3,2,0), "Invalid")'
My motivation is to avoid having long Excel formulas inside Python code, and to make the logic testable.
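For the simpler formula, one hand-written equivalent might look like the sketch below. It is not produced by Pycel, xlcalculator, or formulas; vlookup_or_default is a hypothetical name, and ws_rows is made-up data standing in for the range ws!A2:B3. One known difference is that Excel's exact-match VLOOKUP ignores case for text, while the == comparison here does not.

```python
def vlookup_or_default(key, table, default="Invalid"):
    """Mimic IFERROR(VLOOKUP(key, table, 2, 0), default) for a two-column table."""
    for lookup_value, result in table:
        if lookup_value == key:
            return result  # the first exact match wins, as in VLOOKUP
    return default  # no match: VLOOKUP would give #N/A, which IFERROR replaces


# Stand-in for the range ws!A2:B3 (hypothetical data).
ws_rows = [("apple", 1.20), ("pear", 0.80)]

print(vlookup_or_default("pear", ws_rows))  # 0.8
print(vlookup_or_default("plum", ws_rows))  # Invalid
```

A plain function like this can be covered by ordinary unit tests, which is the testability the question is after.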

Looping through various excel worksheets for curve fitting in python

I'm relatively new to performing data analysis with python, thus I'm sorry if this question seems too noob.
I have an excel-file with many different sheets. I've also written a script which fits and plots the data contained in these excel sheets. I have code written to perform curve fitting for only one of the sheets.
My idea was to create a loop that would iterate through all my sheets and apply the script in each one of them at a time, but I'm not really sure how to do this. Could you provide me any guidance, or any place where I could learn/read about how to do this? I have tried to search a bit around but I haven't been able to find anything useful.
Thanks!
I'm assuming you're trying to loop through different sheets of a single excel file. If so, you can use pd.read_excel's sheet_name parameter to pass sheets programmatically. (See https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_excel.html)
So, if you wanted to loop through the first three sheets, you could use:
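import pandas as pd

for i in range(3):
    df = pd.read_excel('path/to/file.xlsx', sheet_name=i)  # i is a zero-based sheet index
    do_stuff(df)  # placeholder for your own fitting/plotting code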
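If the number of sheets is not known in advance, pd.read_excel(path, sheet_name=None) returns a dict mapping each sheet name to its DataFrame, and the loop can run over that dict instead. The sketch below shows only the loop-and-collect pattern in plain Python: the sheets dict is hard-coded stand-in data with made-up numbers, and fit_line is a hypothetical placeholder for the real curve-fitting routine.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept (placeholder fit)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Stand-in for the dict returned by pd.read_excel(path, sheet_name=None);
# here each "sheet" is just a pair of x and y lists.
sheets = {
    "run_1": ([0, 1, 2, 3], [1, 3, 5, 7]),
    "run_2": ([0, 1, 2, 3], [2, 2, 2, 2]),
}

# Apply the same fit to every sheet and keep the results keyed by sheet name.
results = {}
for name, (xs, ys) in sheets.items():
    results[name] = fit_line(xs, ys)

for name, (slope, intercept) in results.items():
    print(f"{name}: slope={slope:.3f}, intercept={intercept:.3f}")
```

Keeping the results in a dict keyed by sheet name makes it easy to tell afterwards which fit belongs to which sheet.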

Divide one "column" by another in Tab Delimited file

I have many files with three million lines in identical tab delimited format. All I need to do is divide the number in the 14th "column" by the number in the 12th "column", then set the number in the 14th column to the result.
Although this is a very simple operation, I'm really struggling to work out how to achieve it. I've spent a good few hours searching this website, but unfortunately the answers I've seen have gone completely over my head, as I'm a novice coder!
The tools I have are Notepad++, UltraEdit (which can run JavaScript, although I'm not familiar with it), and Python 3.6 (I have very basic Python knowledge). Other answers have suggested using something called "awk", but when I looked this up it seemed to need Unix, and I only have Windows. What's the best tool for getting this done? I'm more than willing to learn something new.
In Python there are a few ways to handle csv. For your particular use case I think pandas is what you are looking for.
You can load your file with df = pandas.read_csv(path, sep='\t', header=None). With no header row the columns are labeled 0, 1, 2, ..., so the 14th and 12th columns are df[13] and df[11], and the division and replacement are as easy as df[13] /= df[11].
Finally, you can write your data back in tab-delimited format with df.to_csv(path, sep='\t', header=False, index=False).
I leave it to you to fill in the remaining details of the pandas functions, but I promise it is very easy, and you'll probably benefit from learning it for a long time.
Hope this helps.
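For files with millions of lines, the same division can also be done one row at a time with the standard csv module, so the whole file never has to sit in memory. This is a sketch of that alternative, not the pandas approach above; divide_column is a hypothetical name, and it assumes no header row, numeric values in both columns, and a nonzero 12th column.

```python
import csv


def divide_column(in_path, out_path, numerator=13, denominator=11):
    """Copy a tab-delimited file, replacing column `numerator` with
    numerator / denominator. Indexes are zero-based, so 13 and 11 are
    the 14th and 12th columns."""
    with open(in_path, newline="") as src, open(out_path, "w", newline="") as dst:
        reader = csv.reader(src, delimiter="\t")
        writer = csv.writer(dst, delimiter="\t", lineterminator="\n")
        for row in reader:
            row[numerator] = str(float(row[numerator]) / float(row[denominator]))
            writer.writerow(row)
```

Writing to a separate output file, rather than overwriting the input, leaves the original data intact if something goes wrong halfway through.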

Move data from a MATLAB program to python

I am fairly new to programming, and started out with Python 3. The data I want to analyze has already been processed in a MATLAB program, and I need to export it. I don't know anything about MATLAB, and after searching the web I came up with this:
fileID = fopen('VarA.txt','w');
fprintf(fileID,'%.10f \n',data_1(:,1));
fclose(fileID);
fileID = fopen('varB.txt','w');
fprintf(fileID,'%.10f \n',data_1(:,2));
fclose(fileID);
This gives me 2 .txt files with the x and y coordinates respectively. I have about 1000 strings (each of which contains about 10k datapoints), so this seems like an awful way to do it.
My question, then, is: how can I export these datasets efficiently? For instance, I am currently writing each dataset to 2 different .txt files, which separates the 2 variables, instead of saving a hash in 1 file.
If you are interested in exporting a lot of structured numerical data, I recommend you use the HDF5 functions in MATLAB to write it, and the corresponding Python library to read it.
To simplify the code you showed there, read about dlmwrite in the MATLAB help.
The way you chose (or the dlmwrite variant) may sound awful at first, but very often it will not really have an impact on performance.
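On the Python side, reading such a file back is short. The sketch below assumes that MATLAB wrote both columns of data_1 into a single comma-delimited text file (a comma is dlmwrite's default delimiter); read_xy is a hypothetical helper name and any file name passed to it is a placeholder.

```python
import csv


def read_xy(path):
    """Read a two-column comma-delimited text file into lists of x and y floats."""
    xs, ys = [], []
    with open(path, newline="") as handle:
        for row in csv.reader(handle):
            if not row:
                continue  # skip blank lines
            xs.append(float(row[0]))
            ys.append(float(row[1]))
    return xs, ys
```

With one file per dataset, the x and y values stay paired, which avoids having to match up two separate files afterwards.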
