Can't see csv file (converted from df) in files - python

After saving my dataframe to a csv in a specific location, the csv file doesn't appear in the location I saved it to. Is there any reason why it possibly is not showing?
Here is the code to save my dataframe to csv:
df.to_csv(r'C:\Users\gibso\OneDrive\Documents\JOSEPH\export_dataframe.csv', index = False)
Even saving an empty df does not seem to work.
import pandas as pd
olympics={}
df = pd.DataFrame(olympics)
df.to_csv(r'C:\Users\gibso\OneDrive\Documents\JOSEPH\export_dataframe.csv', index = False)
Thanks for the help!

I would rather use the module openpyxl. Example of saving:
import openpyxl
workbook = openpyxl.Workbook()
sheet = workbook.active
# Work on your workbook. Once finished:
workbook.save(file_name) # file_name is a variable you must define
Don't forget to install openpyxl with pip first!

Related

I want this piece of code to make a new sheet in an existing xlsx instead of making a new file

Right now this piece of code converts the CSV into an xlsx for me, but it creates a whole new file. I want it to add a new sheet to an existing workbook and write the CSV data there instead.
import pandas as pd
import numpy as np
df_new = pd.read_csv('Names.csv')
GFG = pd.ExcelWriter('Names.xlsx')
df_new.to_excel(GFG, index = False)
GFG.close()  # ExcelWriter.save() was removed in pandas 2.0
If you could write out the whole code, it would be helpful, as I'm not fully familiar with the pandas architecture.
To append new data you have to write in append mode
GFG = pd.ExcelWriter('Names.xlsx', mode='a')
Append mode seems to need the openpyxl engine instead of xlsxwriter
GFG = pd.ExcelWriter('Names.xlsx', mode='a', engine='openpyxl')
and you may need to install the openpyxl module
pip install openpyxl
Maybe if you install openpyxl then it will use it automatically when you use mode='a'
If you want to add a new sheet, you may need to pass sheet_name
df_new.to_excel(GFG, index=False, sheet_name='First Sheet')
df_new.to_excel(GFG, index=False, sheet_name='Second Sheet')
Doc with examples: ExcelWriter
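Putting the pieces above together, a minimal sketch might look like the following. It assumes a reasonably recent pandas with openpyxl installed. The helper name add_sheet_from_csv is made up for illustration, and the target workbook must already exist, since append mode opens an existing file.

```python
import pandas as pd


def add_sheet_from_csv(csv_path, xlsx_path, sheet_name):
    """Append the CSV at csv_path as a new sheet in an existing workbook.

    Hypothetical helper for illustration, not part of pandas.
    """
    df_new = pd.read_csv(csv_path)
    # mode="a" opens the existing workbook instead of overwriting it;
    # append mode relies on the openpyxl engine.
    with pd.ExcelWriter(xlsx_path, mode="a", engine="openpyxl") as writer:
        df_new.to_excel(writer, index=False, sheet_name=sheet_name)
```

A call such as add_sheet_from_csv('Names.csv', 'Names.xlsx', 'Names') would then add a sheet called Names. Using the writer in a with block saves and closes the file on exit, so no explicit save call is needed. In recent pandas versions, reusing a sheet name that already exists raises an error unless if_sheet_exists is set.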

How to load win32com Excel worksheet to Pandas df?

I have the following code:
import pandas as pd
import win32com.client
excel_app = win32com.client.Dispatch("Excel.Application")
file_path = r"path to the file"
file_password = "file password"
workbook = excel_app.Workbooks.Open(file_path, Password=file_password)
sheet = workbook.Sheets("sheet name")
Now I'd like to take the sheet variable and load it into a Pandas df. I was trying to accomplish it via saving the sheet to a separate file and then reading it from Pandas, but it seems to be over-complicating the issue, as the file is both password protected and in .xlsm format, so re-opening it directly from Pandas isn't straightforward.
How do I do it?
The UsedRange property of the sheet will return an array that encompasses all the cells in the worksheet that have data.
df = pd.DataFrame(sheet.UsedRange())
The resulting column headers are the column numbers and the index is the row number, both zero-based.
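If the first row of the sheet holds the real column names, it can be promoted to the header. The sketch below assumes the data arrives as a tuple of row tuples, which is the shape a multi-cell range value takes in win32com. The helper name rows_to_dataframe and the sample data are made up so the example runs without Excel.

```python
import pandas as pd


def rows_to_dataframe(values):
    """Build a DataFrame from a tuple of row tuples, using row 0 as the header.

    Hypothetical helper; `values` is assumed to look like the tuple of
    tuples that sheet.UsedRange.Value returns for a multi-cell range.
    """
    header, *rows = values
    return pd.DataFrame(rows, columns=list(header))


# Stand-in for sheet.UsedRange.Value so the sketch runs without Excel.
sample_values = (("name", "score"), ("Ann", 1.0), ("Bob", 2.0))
df = rows_to_dataframe(sample_values)
```

With a real worksheet the call would be rows_to_dataframe(sheet.UsedRange.Value). A used range of a single cell yields a plain scalar rather than nested tuples, so that case would need separate handling.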

Pandas how to keep sheets untouched

I have an Excel workbook with more than one worksheet (i.e. sheet1 and sheet2),
and I did this:
import pandas
df1 = pandas.read_excel('file.xlsx', sheet_name='sheet1')
#### doing something on sheet1, sheet2 is not touched ####
df1.to_excel('file.xlsx', sheet_name='sheet1')
After doing the above, I found sheet2 missing after saving the file.
Is there a way to open and save on same file without affecting other worksheets?
A possible way to do that is to load all of your sheets, modify only the first one, and write them all back. Although it works, you may lose any custom styling in your tables.
import pandas as pd
# Load all sheets into a dict of {sheet name: DataFrame}
workbook = pd.read_excel('file.xlsx', sheet_name=None)
# do something to workbook['sheet1']
# Write all sheets to excel file
writer = pd.ExcelWriter('file.xlsx', engine='xlsxwriter')
for sheet, df in workbook.items():
    df.to_excel(writer, sheet_name=sheet)
writer.close()  # writer.save() was removed in pandas 2.0
As far as I know, the only way to overwrite a sheet ─ while keeping the other ones untouched ─ requires using third-party libraries. For instance,
here's an option with openpyxl:
First, modify the data as you wish:
import pandas as pd
fname = 'file.xlsx'
target_sheet = 'sheet1'
df = pd.read_excel('file.xlsx', sheet_name='sheet1')
# further modification to `df` ...
then, save it to the specified sheet:
# Load required functions
from openpyxl import load_workbook
from openpyxl.utils.dataframe import dataframe_to_rows
# Read excel file (all sheets)
wb = load_workbook(fname)
# Get the index from target-sheet
idx = wb.sheetnames.index(target_sheet)
# Delete the existing target-sheet
del wb[target_sheet]
# Create a new empty target-sheet
wb.create_sheet(target_sheet, idx)
# Write `df` data on it
for r in dataframe_to_rows(df, index=False, header=True):
    wb[target_sheet].append(r)
# Save file
wb.save(fname)
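Newer pandas versions can perform the same replacement through ExcelWriter itself, still using openpyxl underneath. The sketch below assumes pandas 1.3 or later, where the if_sheet_exists option is available. The helper name replace_sheet is made up for illustration.

```python
import pandas as pd


def replace_sheet(df, xlsx_path, sheet_name):
    """Overwrite one sheet of an existing workbook, leaving the others in place.

    Hypothetical helper; relies on ExcelWriter's if_sheet_exists option
    (pandas >= 1.3) together with the openpyxl engine.
    """
    with pd.ExcelWriter(
        xlsx_path, mode="a", engine="openpyxl", if_sheet_exists="replace"
    ) as writer:
        # Only the named sheet is rewritten; other sheets stay in the file.
        df.to_excel(writer, sheet_name=sheet_name, index=False)
```

For the question's workbook this would be called as replace_sheet(df, 'file.xlsx', 'sheet1'). As with the openpyxl version, any custom styling on the replaced sheet is not carried over.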
I suspect something is going on in the below section that is throwing off the code:
####doing something on ws1, ws2 is not touched######
When I ran your code on my system, the workbook still contained both worksheets.
As an isolation test, can you comment out or remove the code in that section and confirm whether the error still appears?

How to write multiIndex-columns excel with pandas

I want to export a multiIndex-column.
I read an excel file (https://drive.google.com/open?id=1G6nE5wiNRf5sip22dQ8dfhuKgxzm4f8E) and exported it with the following code:
df = pd.read_excel('sample.xlsx')
df.to_excel('sample2.xlsx', index = False)
However, sample2.xlsx has a different format from sample.xlsx.
For example, there are merged cells in sample.xlsx but not in sample2.xlsx and the blank cells in sample.xlsx become Unnamed:xx.
You can view sample2.xlsx here.
How to solve this problem?
Thank you.
Since you are working with xlsx files, the openpyxl package will do the job.
import openpyxl
wb_obj = openpyxl.load_workbook('sample.xlsx')
wb_obj.save('sample2.xlsx')
Further reading on openpyxl

Overwriting existing cells in an XLSX file using Python

I am trying to find a library that overwrites an existing cell to change its contents using Python.
What I want to do:
1. Read from an .xlsx file.
2. Compare the cell data to determine whether a change is needed.
3. Change the data in the cell, e.g. overwrite the date in cell 'O2'.
4. Save the file.
I have tried the following libraries:
- xlsxwriter
- a combination of xlrd, xlwt and xlutils
- openpyxl
xlsxwriter: only writes to a new Excel sheet and file.
combination: works to read from .xlsx but only writes to .xls.
openpyxl: reads from an existing file but doesn't seem to write to existing cells; it can only create new rows and cells, or an entirely new workbook.
Any suggestions would greatly be appreciated. Other libraries? how to manipulate the libraries above to overwrite data in an existing file?
from win32com.client import Dispatch
import os
xl = Dispatch("Excel.Application")
xl.Visible = True # otherwise excel is hidden
# newest excel does not accept forward slash in path
wbs_path = r'C:\path\to\a\bunch\of\workbooks'
for wbname in os.listdir(wbs_path):
    if not wbname.endswith(".xlsx"):
        continue
    wb = xl.Workbooks.Open(wbs_path + '\\' + wbname)
    sh = wb.Worksheets("name of sheet")
    sh.Range("A1").Value = "some new value"
    wb.Save()
    wb.Close()
xl.Quit()
Alternatively, you can use xlwings, which (if I had to guess) seems to be using this approach under the hood.
>>> import xlwings as xw
>>> wb = xw.Book() # this will create a new workbook
>>> wb = xw.Book('FileName.xlsx') # connect to an existing file in the current working directory
>>> wb = xw.Book(r'C:\path\to\file.xlsx') # on Windows: use raw strings to escape backslashes
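For machines without Excel installed, openpyxl can also assign to a cell that already exists and save back to the same path. The following is a minimal sketch of the read, compare, overwrite, save steps from the question. The helper name overwrite_cell and its parameters are made up for illustration.

```python
from openpyxl import load_workbook


def overwrite_cell(xlsx_path, sheet_name, cell, new_value):
    """Overwrite one cell if its value differs; return True if it changed.

    Hypothetical helper for illustration.
    """
    wb = load_workbook(xlsx_path)
    ws = wb[sheet_name]
    changed = ws[cell].value != new_value
    if changed:
        ws[cell] = new_value  # assigning to an existing cell replaces its value
        wb.save(xlsx_path)    # saving under the same name updates the file
    return changed
```

A call such as overwrite_cell('FileName.xlsx', 'Sheet1', 'O2', 'new value') would cover the 'O2' case. One caveat: openpyxl does not read every kind of item in a workbook, so content such as shapes may be lost when a file is opened and saved under the same name.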
