Semi-Interactive Pandas Dataframe in a GUI - python

There are a number of excellent answers to the question about GUIs for displaying dataframes, but what I'm looking to do is a bit more advanced.
I'd like to display a dataframe, but have a couple of the columns be interactive where the user can manually overwrite values (and the rest be static). It would be useful to have "total" rows that change with the overwritten values and eventually have some interactive buttons around the dataframe for loading and clearing data.
QTPandas looks promising, but appears to be dead, as it is built on a really old version of pandas (0.17.1). Can this be done in Qt? Is something else better?

I love RStudio as my IDE, as I can not only view all the objects I create but also edit data in the IDE itself. There are many other great features too.
You can use RStudio for Python coding as well (using the reticulate package).
Spyder also lets you view and edit a data frame.
However, if you're looking for a dedicated GUI with drag & drop features, you can use Pandas GUI.
Features of pandasgui are:
View DataFrames and Series (with MultiIndex support)
Interactive plotting
Filtering
Statistical summary
Data editing and copy / paste
Import CSV files with drag & drop
Search toolbar
Its first version was released in March 2019 and it is still under development. As of this writing, you can't use it in Colab.

While not a GUI in itself, xlwings leverages Excel as a GUI and makes pandas dataframes interactive for users; it was our library of choice.
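Whichever front end is used, the behaviour the question asks for (a few editable columns, the rest static, and a "total" row that follows the overwritten values) can be sketched in plain pandas. This is only a minimal sketch: the column names, the EDITABLE set and the helper names overwrite and with_total are hypothetical, and a real GUI would call something like this from its cell-edit callback.

```python
import pandas as pd

EDITABLE = {"qty", "price"}  # hypothetical: the columns a user may overwrite


def overwrite(df, row, col, value):
    """Return a copy of df with one cell replaced; reject static columns."""
    if col not in EDITABLE:
        raise ValueError(f"column {col!r} is read-only")
    out = df.copy()
    out.loc[row, col] = value
    return out


def with_total(df):
    """Append a 'total' row that sums the numeric columns."""
    total = df.sum(numeric_only=True).rename("total")
    return pd.concat([df, total.to_frame().T])


df = pd.DataFrame({"item": ["a", "b"], "qty": [1, 2], "price": [10.0, 20.0]})
edited = overwrite(df, 0, "qty", 5)   # user overwrites one editable cell
view = with_total(edited)             # totals are recomputed from the edited data
```

Keeping the totals out of the stored data and recomputing them for display means an overwrite can never leave a stale total behind.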

Related

Is there a way to export the data viewer in VS Code?

I am viewing the data viewer for the counter dictionary. The data is nicely put in 2 columns, but I can't seem to find an option to export as a CSV or to excel. Selecting all and copying doesn't work for some reason, only the rows that are currently on the screen are copied, even though all the rows are selected. I am running VScode on a Mac.
Sorry, but it's impossible for now. This feature request is still open on GitHub, where you can join the discussion.
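Until the viewer supports export, one workaround is to write the object to a file from code. A minimal sketch, assuming the "counter dictionary" is a collections.Counter; the helper name counter_to_csv, the column headers and the output file name are made up for illustration:

```python
import csv
from collections import Counter


def counter_to_csv(counter, path):
    """Write every (key, count) pair of a Counter to a two-column CSV file."""
    with open(path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(["key", "count"])
        # most_common() with no argument yields all pairs, highest count first
        writer.writerows(counter.most_common())


counts = Counter("abracadabra")
counter_to_csv(counts, "counter_export.csv")
```

The resulting file opens directly in Excel, and it contains all rows rather than only the ones currently on screen.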

moving data in excel with python

I would like to automatically move a value from the table into a new column and repeat it on every following row, up to the next row that contains only a single value, but I don't know which tool to use.
This is probably not a Python, pandas or dataframe question, but more about running a macro in Excel.
One can run macros in Excel with Python using: https://www.xlwings.org/
This is open source and free, comes preinstalled with Anaconda and WinPython, and works on Windows and macOS.
That said, you might simply prefer the native Excel VBA editor for this and "record a macro".
Hope this is helpful.
Using ffill answers the question directly:
df['col'] = df['col'].ffill()
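To make the one-liner concrete, here is a small sketch on made-up data. The layout (a label that appears once, followed by rows where it is missing) is an assumption about what the question's table looks like:

```python
import pandas as pd

# Hypothetical layout: a label appears once, followed by rows that lack it
df = pd.DataFrame({
    "col": ["group A", None, None, "group B", None],
    "value": [1, 2, 3, 4, 5],
})

# Forward-fill: each missing entry takes the last non-missing value above it
df["col"] = df["col"].ffill()
```

After the call, the first three rows carry "group A" and the last two carry "group B".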

View dataframe while debugging in VS Code

I'm trying to explore switching from PyCharm to VS Code. I can't find a way right now to view my pandas DataFrames in a tabular format while debugging.
When I right click on a df object, there is no option to view.
I have the python extension downloaded. Am I missing something?
The Microsoft VS Code team finally made this feature available with the latest update of the product. More details can be found in the official blog.
It works like a charm and is very intuitive. In short:
Set a breakpoint (by clicking at the leftmost edge of the code area, before the line number)
Start debugging (the Run menu at the top has a Start Debugging option)
When the debugger stops at the breakpoint, find the required dataframe in the VARIABLES panel (the VARIABLES panel is inside the Run and Debug area)
Right click on the dataframe and select the option View Value in Data Viewer. TADA :)
You can now print the DataFrame in the DEBUG CONSOLE.
This comes from the GitHub issue mentioned in @Christina Zhou's answer.
My solution for viewing DataFrames in a tabular format while debugging is to simply copy and paste them into an Excel spreadsheet using
df.to_clipboard()
from the debug console. Even some of my colleagues running PyCharm are using this technique, since it gives you way more flexibility to inspect your data.
It seems like currently you can do it only using the Jupyter notebook in VS Code, using the variables explorer.
So it looks like this isn't a thing right now in VS Code.
If anyone wants to show their support for the development of this feature, I found this open issue here:
https://github.com/microsoft/vscode-python/issues/7063
You can use the view() function from the xlwings library. It will show you the DataFrame in Excel:
import pandas as pd
from xlwings import view
df = pd.DataFrame({'A':[1,2], 'B':[3,4]})
view(df)
A better way would be to convert the function to pandas method:
from pandas.core.base import PandasObject
PandasObject.view = view
now you only need to type:
df.view()
Two more options for vscode are the following ones:
jupyter notebooks
saving to CSV and using the edit csv extension
Both require more effort but the view is more helpful.
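For the CSV route, a small helper that can be called from the debug console keeps the effort low. This is only a sketch: the name dump_csv and the choice of the system temp directory are assumptions, not part of any extension.

```python
import os
import tempfile

import pandas as pd


def dump_csv(df, name="debug_df"):
    """Write df to a CSV file in the temp directory and return its path."""
    path = os.path.join(tempfile.gettempdir(), f"{name}.csv")
    df.to_csv(path, index=True)  # keep the index so rows can be matched back
    return path


path = dump_csv(pd.DataFrame({"A": [1, 2], "B": [3, 4]}))
```

The returned path can then be opened in VS Code with whichever CSV viewer or editor extension is installed.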
The interactive shell looks like a good start. Right click the .py file in your explorer. You'll be able to view pandas dataframes from there.

Charts from Excel to PowerPoint with Python

I have an Excel workbook that is created using the excellent "xlsxwriter" module. In this workbook, there are about 200 embedded charts. I am now trying to export all those charts into several PowerPoint presentations. Ideally, I want to preserve the original format and embedded data without linking to an external Excel workbook.
I am sure there is a way to do this using VBA. But, I was wondering if there is a way to do this using Python. Is there a way to put xlsxwriter chart objects into powerpoints ?
I have looked at python-pptx and can't find anything about getting charts or data series from excel work book.
Any help is appreciated !
After spending hours trying different things, I have found a solution to this problem. Hopefully, it will help someone save some time. The following code will copy all the charts from "workbook_with_charts.xlsx" to "Final_PowerPoint.pptx".
For some reason that I am yet to understand, it works better when this Python program is run from a CMD terminal. It sometimes breaks down if you run it several times, even though the first run is usually OK.
Another issue is in the fifth line: if you pass False, as in PowerPoint.Presentations.Add(False), it does not work with Microsoft Office 2013, although both True and False work with Microsoft Office 2010.
It would be great if someone could clarify these two issues.
# Import the necessary libraries (requires pywin32 and a Windows Office install)
import win32com.client
from win32com.client import constants
PowerPoint = win32com.client.Dispatch("PowerPoint.Application")
Excel = win32com.client.Dispatch("Excel.Application")
presentation = PowerPoint.Presentations.Add(True)
workbook = Excel.Workbooks.Open(Filename="C:\\.........\\workbook_with_charts.xlsx", ReadOnly=1, UpdateLinks=False)
for ws in workbook.Worksheets:
    for chart in ws.ChartObjects():
        # Copy each chart from Excel to the clipboard
        chart.Activate()
        chart.Copy()
        # Append a blank slide and paste the chart onto it
        Slide = presentation.Slides.Add(presentation.Slides.Count + 1, constants.ppLayoutBlank)
        Slide.Shapes.PasteSpecial(constants.ppPasteShape)
        # Make the slide title the same as the chart title
        # This is optional
        textbox = Slide.Shapes.AddTextbox(1, 100, 100, 200, 300)
        textbox.TextFrame.TextRange.Text = str(chart.Chart.ChartTitle.Text)
presentation.SaveAs("C:\\...........\\Final_PowerPoint.pptx")
presentation.Close()
workbook.Close()
print('Charts Finished Copying to Powerpoint Presentation')
Excel.Quit()
PowerPoint.Quit()
The approach I'd be inclined toward with the current python-pptx version is to read the Excel sheets for their data and recreate the charts in python-pptx. That of course would require knowing what the chart formatting is, etc., so I could see why you might not want to do that.
Importing charts directly from Excel has been done in the past, see the pull request here on GitHub: https://github.com/scanny/python-pptx/pull/65
But it involved a large amount of surgery on python-pptx, and many versions back now, so at most it might be a good guide to what strategies might work. You'd need to want it pretty bad I suppose to go that route :)
I don't have enough reputation to comment, but if you get the same issue as @R__raki__ then you can use the integer value defined by the VBA reference. For this case it would be 12.
So replace
Slide=presentation.Slides.Add(presentation.Slides.Count+1,constants.ppLayoutBlank)
with
Slide=presentation.Slides.Add(presentation.Slides.Count+1,12)
See here for more.

Python Excel - OpenPyxl Limitation

I recently started to automate a report at work using Python. Since my data was provided to me in the form of an excel sheet, I felt the best way to do this was to use an excel python module. My module of choice was openpyxl. It worked great, I've used it to perform calculations and organise my data ready to plot charts. Now here's the problem...
I know that you cannot update existing charts using openpyxl so that option went out the window.
What I then tried to do was link the data in my openpyxl spreadsheet to another spreadsheet containing the charts (which is then linked to my Word document where the charts are to be displayed). After doing this I ran my script and, to my annoyance, the data links between my openpyxl spreadsheet and the charts spreadsheet had been severed. I guess this is because openpyxl creates a new spreadsheet when you save using the save function, so the links are severed.
My question is: are there any ways to maintain the data links?
It is currently not possible to maintain links between files. I think it would be possible to keep them as metadata but, for fairly obvious reasons, it won't necessarily be possible to validate them. The best way for this to happen would be through a pull request.
If you're on Windows you might look at using the Python for Windows extensions (pywin32), which will allow you to remote control the applications.
