Can't export dataframes and other objects to Excel using pandas? - python

I am trying to export multiple dataframes to a single spreadsheet in Excel from pandas.
I keep getting the error:
'numpy.int64' object has no attribute 'to_excel'
despite never importing numpy. Here is my code:
import datetime
import pandas as pd
import warnings
import xlwings as xw
... dataframe calculations ...
# Export to Excel
writer = pd.ExcelWriter('C:/users/test.xlsx', engine='xlsxwriter')
workbook = writer.book
worksheet = workbook.add_worksheet('PythonCalculations')
writer.sheets['PythonCalculations'] = worksheet
myNumber.to_excel(writer,sheet_name='PythonCalculations',startrow = 0 , startcol=0)
df1.to_excel(writer,sheet_name='PythonCalculations',startrow = 2, startcol = 0)
Here, for example, myNumber = 100 and its type is numpy.int64. Whenever I try to execute the line with myNumber in it, it returns the error 'numpy.int64' object has no attribute 'to_excel'.
When I execute the line with df1, it runs, but no Excel file is created. Even when I created the Excel file first and then ran the lines, it wasn't populated. How do I fix this?

You could also just use xlwings, which your code already imports. First take a look at the xlwings docs, https://docs.xlwings.org/en/stable/quickstart.html . Once you are able to connect, something as simple as the code below should work. Note that I'm not adding a worksheet; I'm using an existing one.
wb = xw.Book('my_workbook.xlsm')
sht = wb.sheets['a_sheet_that_already_exists']
rng1 = sht.range('A1') # Arbitrary cell.
rng2 = sht.range('J2') # Arbitrary cell far enough right to leave room for df1.
# Your dataframe collections are df1 and df2.
# Showing .options in case you care.
rng1.options(index=True, header=True).value = df1
rng2.options(index=True, header=True).value = df2
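If you would rather stay with pandas, here is a minimal sketch. It rests on two assumptions about the question's code: the AttributeError arises because myNumber is a bare scalar (pandas hands back numpy scalars from many calculations, and to_excel is a DataFrame/Series method), and no file appeared because the writer was never saved or closed. The helper names scalar_to_frame and export_to_sheet are hypothetical.

```python
import numpy as np
import pandas as pd


def scalar_to_frame(value):
    # Wrap a bare scalar (e.g. numpy.int64) in a 1x1 DataFrame so it gains to_excel.
    return pd.DataFrame([[value]])


def export_to_sheet(path, scalar, frame, sheet_name="PythonCalculations"):
    # Leaving the with block closes the writer, which is when the file is written.
    with pd.ExcelWriter(path) as writer:
        scalar_to_frame(scalar).to_excel(
            writer, sheet_name=sheet_name, startrow=0, startcol=0,
            index=False, header=False,
        )
        frame.to_excel(writer, sheet_name=sheet_name, startrow=2, startcol=0)


my_number = np.int64(100)  # stand-in for the question's myNumber
cell = scalar_to_frame(my_number)
```

Calling export_to_sheet('C:/users/test.xlsx', myNumber, df1) would then place the number in A1 and df1 starting two rows below it, provided an Excel engine such as xlsxwriter or openpyxl is installed.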

Related

Is there a way to pass keys as a variable in a DataFrame's 'get' method in pandas?

So I have an Excel dataset with 2 sheets that I want to read using pandas. To make the code more dynamic, I decided to store the sheet names list into a variable which I later pass into the 'get' method to create separate DataFrames for both.
import pandas as pd
# access all sheets as one DataFrame
excel_file = pd.ExcelFile(r"C:\Users\user\Desktop\example.xlsx")
email_list = pd.read_excel(excel_file)
sheets = excel_file.sheet_names
# separate sheets into separate DataFrames.
sheet1 = email_list.get(sheets[0])
sheet2 = email_list.get(sheets[1])
Then from these DataFrames I extract a column of emails as a Series:
sheet1list = sheet1['Email']
sheet2list = sheet2['Email']
But when I run it, I run into the following error:
Traceback (most recent call last):
File "c:\Users\ga201\Desktop\My Python Projects\main.py", line 30, in <module>
sheet1list = sheet1['Email']
TypeError: 'NoneType' object is not subscriptable
Why does it make the sheet1 variable a NoneType?
Is there a way to actually use it through variables or is the only
way to type in the sheet name?
Just curious. I've tried using f strings, but they don't work either.
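You're misinterpreting the default behaviour of pd.read_excel. If you don't specify the sheet_name parameter, the function returns only the first sheet as a DataFrame. Calling email_list.get(sheets[0]) on that DataFrame looks for a column named after the sheet, finds none, and returns None, hence the TypeError.
Use email_list = pd.read_excel(excel_file, sheet_name=None) to get DataFrames for all sheets.
The return value will then be a dictionary of DataFrames with the sheet names as keys, so email_list.get(sheets[0]) works as intended.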
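As a rough illustration of working with that dictionary, the sketch below pulls one column out of every sheet using variables as keys. The helper names load_all_sheets and column_per_sheet are hypothetical, and the hand-built frames dictionary is only a stand-in so the example runs without a workbook.

```python
import pandas as pd


def load_all_sheets(path):
    # sheet_name=None asks read_excel for every sheet: {sheet name: DataFrame}.
    return pd.read_excel(path, sheet_name=None)


def column_per_sheet(frames, column="Email"):
    # Pull the same column out of every sheet, keyed by sheet name.
    return {name: frame[column] for name, frame in frames.items()}


# Stand-in for load_all_sheets(...) so the example runs without a workbook.
frames = {
    "Sheet1": pd.DataFrame({"Email": ["a@example.com"]}),
    "Sheet2": pd.DataFrame({"Email": ["b@example.com", "c@example.com"]}),
}
sheets = list(frames)            # plays the role of excel_file.sheet_names
emails = column_per_sheet(frames)
sheet1list = emails[sheets[0]]   # a variable works fine as the key
sheet2list = emails[sheets[1]]
```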

Can't see csv file (converted from df) in files

After saving my dataframe to a csv in a specific location, the csv file doesn't appear in the location I saved it to. Is there any reason why it possibly is not showing?
Here is the code to save my dataframe to csv:
df.to_csv(r'C:\Users\gibso\OneDrive\Documents\JOSEPH\export_dataframe.csv', index = False)
Even switching to an empty df does not seem to work.
import pandas as pd
olympics={}
df = pd.DataFrame(olympics)
df.to_csv(r'C:\Users\gibso\OneDrive\Documents\JOSEPH\export_dataframe.csv', index = False)
Thanks for the help!
I would rather use the module openpyxl. Example of saving:
import openpyxl
workbook = openpyxl.Workbook()
sheet = workbook.active
# Work on your workbook. Once finished:
workbook.save(file_name) # file_name is a variable you must define
Don't forget to install openpyxl with pip first!
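Before switching libraries, it may help to confirm whether to_csv wrote the file at all. The sketch below uses a hypothetical helper, save_csv_and_report, and a temporary folder as a stand-in for the question's path. If the file is reported as existing but is not visible in the file browser, the cause probably lies outside pandas (for example a synced or redirected Documents folder); that is a guess, not something the question confirms.

```python
import tempfile
from pathlib import Path

import pandas as pd


def save_csv_and_report(df, target):
    # Resolve to an absolute path so it is clear where the file is written.
    path = Path(target).resolve()
    df.to_csv(path, index=False)
    return path, path.exists(), path.stat().st_size


# Demo in a temporary folder (stand-in for the question's OneDrive path).
with tempfile.TemporaryDirectory() as folder:
    demo = pd.DataFrame({"city": ["Athens", "Paris"], "year": [1896, 1900]})
    written_path, exists, size = save_csv_and_report(
        demo, Path(folder) / "export_dataframe.csv"
    )
```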

excel rename multiple sheet names from list

I have multiple sheets in Excel that were converted from a dataframe, and I have collected the sheet names in a list. I want to rename the sheets to the duplicated column values that I have collected as shown below.
Here is my code:
dups = df.set_index('Group').index.get_duplicates()
After converting from dataframe to excel I have collected the sheetnames in a list.
xls = pd.ExcelFile('filename', on_demand=True)
sheets=xls.sheet_names
I also used as shown below:
for i in group:  # names to be renamed, collected as list
    wb=openpyxl.load_workbook('file.xlsx')
    worksheet = wb.get_sheet_names()
    worksheet.title = i
    wb1.save('file.xlsx')
But, I got the AttributeError: 'list' object has no attribute 'title'.
Now, I want to rename the sheets to the dups value.
I would like to know if it is possible.
Pleased to hear some suggestions.
You can use openpyxl for this:
import openpyxl
file_loc = 'myexcel.xlsx'
workbook = openpyxl.load_workbook(file_loc)
worksheet = workbook['Sheet1']  # get_sheet_by_name('Sheet1') is deprecated
worksheet.title = 'MySheetName'
workbook.save(file_loc)
You can run this in a loop to rename all the sheets. Let me know if this helps.
It is possible to iterate over the workbook using for sheet in wb.
Here is an example:
import openpyxl
import os
os.chdir('C:\\Users\\Vahan\\Desktop\\xlsx')
wb = openpyxl.load_workbook('example.xlsx')
for sheet in wb:  # or wb.worksheets
    sheet.title = 'RenamedSheets'  # openpyxl appends a number to keep titles unique
wb.save('example.xlsx')
This functionality may help you achieve what you are trying to do.
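To give each sheet its own name from a list, the loop can be paired with zip. This is only a sketch: rename_sheets is a hypothetical helper, it assumes the list is in the same order as the sheets, and the in-memory workbook stands in for load_workbook('example.xlsx').

```python
from openpyxl import Workbook


def rename_sheets(workbook, new_names):
    # Pair each existing sheet with its new name, in workbook order.
    sheets = workbook.worksheets
    if len(sheets) != len(new_names):
        raise ValueError("need exactly one new name per sheet")
    for sheet, name in zip(sheets, new_names):
        sheet.title = name
    return workbook.sheetnames


wb = Workbook()              # a new workbook starts with one sheet
wb.create_sheet("Second")    # add a second sheet to rename
renamed = rename_sheets(wb, ["GroupA", "GroupB"])
```

With a real file, follow the call with wb.save(...) to persist the new names. If a new name matches a title that is still in use, openpyxl will add a numeric suffix rather than overwrite it.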

Openpyxl: 'Worksheet' object has no attribute 'values'

My goal is to read in an Excel file and view the formulas themselves in a pandas DataFrame (e.g. '= A3') rather than the values Excel computes from them, which is what pandas returns by default.
My goal was described here: How can I see the formulas of an excel spreadsheet in pandas / python?
Openpyxl is supposed to support this, but I can't get the import to function correctly. Anyone spot the error?
import pandas as pd
from openpyxl import load_workbook
from openpyxl.utils.dataframe import dataframe_to_rows
df = pd.DataFrame()
wb = load_workbook(filename = 'name.xlsx')
sheet_names = wb.get_sheet_names()
name = sheet_names[0]
sheet_ranges = wb[name]
df = pd.DataFrame(sheet_ranges.values)
> AttributeError: 'Worksheet' object has no attribute 'values'
(Note: the exact implementation of the answer at the linked question yields KeyError: 'Worksheet range names does not exist.' My code above resolved this, but then gets stuck as described.)
Check your version of openpyxl; it seems you have an older one.
openpyxl 2.4.2
import openpyxl
print(openpyxl.__version__)
The values property for worksheets was only added in 2.4.0-a1 (2016-04-11).
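On a recent enough openpyxl, the approach from the question can be sketched as follows. The helper name sheet_to_frame is hypothetical, and the in-memory workbook is a stand-in for load_workbook(filename='name.xlsx'), which keeps formulas as text by default (data_only=False).

```python
import pandas as pd
from openpyxl import Workbook


def sheet_to_frame(worksheet):
    # worksheet.values yields one tuple of cell values per row (openpyxl >= 2.4).
    return pd.DataFrame(worksheet.values)


# In-memory workbook as a stand-in for a file loaded with load_workbook.
wb = Workbook()
ws = wb.active
ws["A1"] = 10
ws["A2"] = "=A1*2"   # stored as a formula string, not a computed value
df = sheet_to_frame(ws)
```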

Is it possible to read data from an Excel sheet in Python using Xlsxwriter? If so how?

I'm doing the following calculation.
worksheet.write_formula('E5', '=({} - A2)'.format(number))
I want to print the value in E5 on the console. Can you help me do it? Is it possible with XlsxWriter, or should I use a different library to do the same?
It is not possible to read data from an Excel file using XlsxWriter.
There are some alternatives listed in the documentation.
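Since XlsxWriter is write-only and does not evaluate formulas, one workaround is to compute the result in Python, print that, and optionally store it with the formula through write_formula's value argument. The sketch below assumes the value destined for A2 is known in Python; a2_value and write_difference are hypothetical names.

```python
import io

import xlsxwriter


def write_difference(worksheet, number, a2_value):
    # Compute the result in Python, since XlsxWriter will not evaluate the formula.
    result = number - a2_value
    worksheet.write("A2", a2_value)
    # The last argument stores the precomputed result alongside the formula.
    worksheet.write_formula("E5", "=({} - A2)".format(number), None, result)
    return result


# In-memory workbook so the example does not touch the disk.
workbook = xlsxwriter.Workbook(io.BytesIO(), {"in_memory": True})
worksheet = workbook.add_worksheet()
e5_value = write_difference(worksheet, number=10, a2_value=4)
print(e5_value)
workbook.close()
```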
If you want to use xlsxwriter for formatting and formulas that pandas can't handle, you can at least load your Excel file's data into an xlsxwriter object using pandas. Here's how:
import pandas as pd
import xlsxwriter

def xlsx_to_workbook(xlsx_in_file_url, xlsx_out_file_url, sheetname):
    """
    Read an Excel file into an xlsxwriter workbook worksheet
    """
    workbook = xlsxwriter.Workbook(xlsx_out_file_url)
    worksheet = workbook.add_worksheet(sheetname)
    # Read the input file into a pandas DataFrame
    df = pd.read_excel(xlsx_in_file_url)
    # A list of column headers
    list_of_columns = df.columns.values
    for col in range(len(list_of_columns)):
        # Write column headers.
        # If you don't have column headers, remove the following line and use "row" rather than "row + 1" in the if/else statements below
        worksheet.write(0, col, list_of_columns[col])
        for row in range(len(df)):
            value = df[list_of_columns[col]][row]
            # Test for NaN (NaN != NaN); otherwise worksheet.write raises an error.
            if value != value:
                worksheet.write(row + 1, col, "")
            else:
                worksheet.write(row + 1, col, value)
    return workbook, worksheet

# Create a workbook:
# read your Excel file into a workbook/worksheet object to be manipulated with xlsxwriter.
# This assumes that the Excel file has column headers.
workbook, worksheet = xlsx_to_workbook("my_excel.xlsx", "my_future_excel.xlsx", "My Sheet Name")
###########################################################
# Do all your fancy formatting and formula manipulation here
###########################################################
# Write/close the file my_future_excel.xlsx
workbook.close()
This doesn't answer the specific question, just a suggestion: try reading the data from Excel with pandas. After that you can manipulate the data using the DataFrame's built-in methods:
df = pd.read_excel(file_,index_col=None, header=0)
Here df is a pandas.DataFrame; the pandas cookbook is a good place to learn what it can do. If you are not familiar with this package, you may be pleasantly surprised by it.
