I am trying to load an Excel file using the openpyxl library, but I keep getting a ValueError. I also created a new Excel file and tried loading it with pandas, but I still get an exception related to cell "C7".
openpyxl
DestFile ="C:\\Users\\yaxee\\OneDrive\\Desktop\\NBET Extraction data\\December Data Extration 2020 NBET\\XYX.xlsx"
wb2 = xl.load_workbook(DestFile)
Error
ValueError: invalid literal for int() with base 10: '7.0'
pandas
df = pd.read_excel (r'C:\Users\yaxee\OneDrive\Desktop\NBET Extraction data\December Data Extration 2020 NBET\XYX.xlsx')
Error
Exception: cell name 'C7.0' but row number is '7'
I can post the full traceback if needed.
Here is the full script I'm working with:
import openpyxl as xl
ExtractionFile ="C:\\Users\\yaxee\\OneDrive\\Desktop\\NBET Extraction data\\August Data Extration 2020 NBET\\NOR04082020.xlsx"
wb1 = xl.load_workbook(ExtractionFile, data_only=True)
daily_broadcast = wb1.worksheets[0]
DestFile ="C:\\Users\\yaxee\\OneDrive\\Desktop\\NBET Extraction data\\August Data Extration 2020 NBET\\SampleOct.xlsx"
wb2 = xl.load_workbook(DestFile)
peak_gen = wb2.worksheets[0]
off_gen = wb2.worksheets[1]
energy_gen = wb2.worksheets[2]
energy_sent = wb2.worksheets[3]
instlld_cap = wb2.worksheets[4]
gen_cap = wb2.worksheets[5]
onBar_cap = wb2.worksheets[6]
gen_6am = wb2.worksheets[7]
unutilized = wb2.worksheets[8]
col_count = 6
step = 2
read_start_row = 73
write_start_row = 4
amount_of_rows = 54
#peak generation capability code
for row in range(5, 34):
    a = daily_broadcast.cell(row = row, column = 25)
    peak_gen.cell(row = row-1,column = col_count).value = a.value
wb2.save(str(DestFile))

#off generation capability code
for row in range(5, 34):
    b = daily_broadcast.cell(row = row, column = 27)
    off_gen.cell(row = row-1,column = col_count).value = b.value
wb2.save(str(DestFile))

#Energy generated code
for row in range(39, 68):
    c = daily_broadcast.cell(row = row, column = 25)
    energy_gen.cell(row = row-35,column = col_count).value = c.value
wb2.save(str(DestFile))

#Energy dispatched code
for row in range(39, 68):
    d = daily_broadcast.cell(row = row, column = 27)
    energy_sent.cell(row = row-35,column = col_count).value = d.value
wb2.save(str(DestFile))

#Installed Capacity code
for i in range(0, amount_of_rows, step):
    e = daily_broadcast.cell(row = read_start_row + i, column = 13)
    instlld_cap.cell(row = write_start_row+(i/step),column = col_count).value = e.value
wb2.save(str(DestFile))

#Generation Capability code
for i in range(0, amount_of_rows, step):
    f = daily_broadcast.cell(row = read_start_row + i, column = 15)
    gen_cap.cell(row = write_start_row+(i/step),column = col_count).value = f.value
wb2.save(str(DestFile))

#On Bar Capability code
for i in range(0, amount_of_rows, step):
    g = daily_broadcast.cell(row = read_start_row + i, column = 19)
    onBar_cap.cell(row = write_start_row+(i/step),column = col_count).value = g.value
wb2.save(str(DestFile))

#Generation at 6am code
for i in range(0, amount_of_rows, step):
    g = daily_broadcast.cell(row = read_start_row + i, column = 21)
    gen_6am.cell(row = write_start_row+(i/step),column = col_count).value = g.value
wb2.save(str(DestFile))
This happens when a worksheet has been written with a floating-point number as the row or column index, for example worksheet.cell(row=1.0, column=1).value = 'some value'. Excel reads the resulting file without any issues, but opening it with openpyxl raises this error. In the script above, write_start_row+(i/step) produces a float because / is true division in Python 3. The remedy is to always use integers for the row and column indices, for example write_start_row + (i // step).
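As a minimal sketch of the difference, the snippet below recomputes the destination rows from the question's constants (write_start_row, amount_of_rows, step) with both division operators. It does not touch any workbook; the list names float_rows and int_rows are only illustrative.

```python
# Constants copied from the question's script
write_start_row = 4
amount_of_rows = 54
step = 2

# True division (/) always yields a float in Python 3, e.g. 4 + 6 / 2 == 7.0
float_rows = [write_start_row + (i / step) for i in range(0, amount_of_rows, step)]

# Floor division (//) keeps the result an int, which is what a row index should be
int_rows = [write_start_row + (i // step) for i in range(0, amount_of_rows, step)]

print(float_rows[:4])
print(int_rows[:4])
```

Using the integer form for every row= and column= argument should avoid writing cell references such as 'C7.0' in the first place.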
Related
How can I copy a row, for example from D51 to F51, and paste these values into the row T20 to AF20?
I know how to load a spreadsheet:
workbook = load_workbook(output)
sheet = workbook.active
But I don't know how to iterate in a loop to get this:
sheet["T2"] = "=D6"
sheet["U2"] = "=E6"
sheet["V2"] = "=F6"
sheet["W2"] = "=G6"
sheet["X2"] = "=H6"
sheet["Y2"] = "=I6"
sheet["Z2"] = "=J6"
sheet["AA2"] = "=K6"
sheet["AB2"] = "=L6"
sheet["AC2"] = "=M6"
sheet["AD2"] = "=N6"
sheet["AE2"] = "=O6"
sheet["AF2"] = "=P6"
You can achieve this with the code below.
Note that the file output.xlsx is opened, updated, and saved. The function num_to_excel_col is borrowed from here.
This writes the text "=D6", "=E6", and so on into row 2, starting at column 20 (T) and continuing for 15 columns. The num_to_excel_col function converts a column number to the equivalent Excel column letters (for example, 27 becomes AA).
import openpyxl

workbook = openpyxl.load_workbook('output.xlsx')
ws = workbook.active

def num_to_excel_col(n):
    # Convert a 1-based column number to Excel column letters (1 -> A, 27 -> AA)
    if n < 1:
        raise ValueError("Number must be positive")
    result = ""
    while True:
        if n > 26:
            n, r = divmod(n - 1, 26)
            result = chr(r + ord('A')) + result
        else:
            return chr(n + ord('A') - 1) + result

outcol = 4  # the formulas start by referencing column 'D'
for col in range(20, 35):  # col 20 is T; this covers 15 columns
    txt = "=" + num_to_excel_col(outcol) + "6"
    print(txt)
    ws.cell(row=2, column=col).value = txt
    outcol += 1

workbook.save("output.xlsx")
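openpyxl also ships its own column-number helper, openpyxl.utils.get_column_letter, so the hand-written converter is optional. The sketch below uses it on an in-memory workbook; the function link_row and its parameters are hypothetical names chosen for illustration, and the count of 13 matches the T2 to AF2 range in the question.

```python
from openpyxl import Workbook
from openpyxl.utils import get_column_letter

# In-memory workbook so the sketch runs without an existing output.xlsx
wb = Workbook()
ws = wb.active

def link_row(ws, src_row, src_first_col, dst_row, dst_first_col, count):
    """Write formulas in dst_row that point at consecutive cells of src_row."""
    for offset in range(count):
        src_letter = get_column_letter(src_first_col + offset)
        target = ws.cell(row=dst_row, column=dst_first_col + offset)
        target.value = f"={src_letter}{src_row}"

# D6..P6 (13 cells) referenced from T2..AF2
link_row(ws, src_row=6, src_first_col=4, dst_row=2, dst_first_col=20, count=13)
```

With a real file, the same call would sit between load_workbook('output.xlsx') and workbook.save('output.xlsx').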
I am trying to write the results from the loop into an Excel file (keys = column names, values = row data). This code generates the file, but it only writes one row of data. How can I make it append the other rows to the file?
import json
import requests
import pandas as pd
p = (('BusinessName', 'CustomerNameToSearch'), ('PageSize', '2'), ('CountryCode', 'CA'))
prepare_link = requests.get('https://api.myapiloopuplink?', auth=BearerAuth('PMay4TY5K577b76154i97yC9DlbPytqd'), params=p)
test = requests.get(prepare_link.url, auth=BearerAuth('PMay4TY5K577b76154i97yC9DlbPytqd'), params=p)
data = json.loads(test.text)
CustomerIdList = []
for customer in data['Data']:
BusinessID = customer['BusinessId']
BusinessName = customer['BusinessName']
CustomerIdList.append(str(customer['BusinessId']))
for i in CustomerIdList:
links2 = ("https://api.myapiloopuplink/"+i+"/History?count=1")
test2 = requests.get(links2, auth=BearerAuth('PMay4TY5K577b76154i97yC9DlbPytqd'))
data2 = json.loads(test2.text)
start_row = 0
for extradetails in data2['Data']:
myDict = {}
myDict["BusinessId"] = customer['BusinessId']
myDict["BusinessName"] = customer['BusinessName']
myDict["Year"] = extradetails['Year']
myDict["Rate"] = extradetails['Rate']
print(myDict)
k = list(myDict.keys())
v = list(myDict.values())
#print(k)
#print(v)
x = [myDict]
df = pd.DataFrame(x)
df.to_excel ('locationandnameoffile.xlsx', sheet_name = 'sheet1', index = False, startrow=start_row)
start_row = start_row + len(df) + 1
This is the output I currently get
This is the output I am trying to get
In the loop I get the right results when I print (it shows multiple rows)
print(myDict)
I think the problem is here:
for extradetails in data2['Data']:
myDict = {}
myDict["BusinessId"] = customer['BusinessId']
myDict["BusinessName"] = customer['BusinessName']
myDict["Year"] = extradetails['Year']
myDict["Rate"] = extradetails['Rate']
print(myDict)
k = list(myDict.keys())
v = list(myDict.values())
#print(k)
#print(v)
x = [myDict]
df = pd.DataFrame(x) #problem
df.to_excel ('locationandnameoffile.xlsx', sheet_name = 'sheet1', index = False, startrow=start_row)#problem
start_row = start_row + len(df) + 1
You are creating a new Excel file on every iteration of the loop, so each write overwrites the previous one. Instead, collect the rows and create the Excel file once, after the loop completes, like this:
datas = []  # create this list once, before the outermost loop
for extradetails in data2['Data']:
    myDict = {}
    myDict["BusinessId"] = customer['BusinessId']
    myDict["BusinessName"] = customer['BusinessName']
    myDict["Year"] = extradetails['Year']
    myDict["Rate"] = extradetails['Rate']
    print(myDict)
    datas.append(myDict)  # one dict per output row

# Build the DataFrame from all collected rows and write the file in a single call
df = pd.DataFrame(datas)
df.to_excel('locationandnameoffile.xlsx', sheet_name='sheet1', index=False)
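A minimal sketch of the collect-then-write pattern follows. The customers and history structures are hypothetical stand-ins for the two API responses (the real field values are not known here), and the to_excel call is left commented out so the example runs without touching the file system.

```python
import pandas as pd

# Hypothetical sample data standing in for the two API responses
customers = [
    {"BusinessId": 101, "BusinessName": "Acme"},
    {"BusinessId": 102, "BusinessName": "Globex"},
]
history = {
    101: [{"Year": 2019, "Rate": 1.5}, {"Year": 2020, "Rate": 1.7}],
    102: [{"Year": 2020, "Rate": 2.1}],
}

rows = []  # created once, outside every loop
for customer in customers:
    for extradetails in history[customer["BusinessId"]]:
        rows.append({
            "BusinessId": customer["BusinessId"],
            "BusinessName": customer["BusinessName"],
            "Year": extradetails["Year"],
            "Rate": extradetails["Rate"],
        })

# A list of dicts becomes one DataFrame row per dict, with keys as column names
df = pd.DataFrame(rows)
print(df)

# Single write after all rows are collected:
# df.to_excel("locationandnameoffile.xlsx", sheet_name="sheet1", index=False)
```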
I'm trying to write a Python script with openpyxl that saves data in an Excel file and also draws a scatter chart with some trendlines. The problem is that openpyxl's trendlines are always black by default, so my question is how to give them different colors.
from openpyxl import Workbook
from openpyxl.chart import *
from openpyxl.chart.trendline import Trendline
from openpyxl.styles import *
from openpyxl.chart.shapes import *
from openpyxl.drawing.colors import *

wb = Workbook()
ws = wb.create_sheet(name)

#writes the Headlines of the columns into excel-file
for c in range(1,5):
    ws.cell(1, c).value = ueberschriften[c-1]

#writes the data into excel-file
for c in range(2, points+2):
    e = c-2
    for d in range(1,5):
        f = d-1
        b = a[e][f]
        ws.cell(c, d).value = b

for row in a:
    ws.append(row)

chart = ScatterChart()
chart.title = "Measuring"
chart.style = 13
chart.x_axis.title = 'Time'
chart.y_axis.title = 'Value'

xvalues = Reference(ws, min_col=1, min_row=2, max_row=points+1)
for i in range(2, 5):
    values = Reference(ws, min_col=i, min_row=1, max_row=points+1)
    series = Series(values, xvalues, title_from_data=True)
    chart.series.append(series)

l = chart.series[0]
l.graphicalProperties.line.solidFill = "FF0000"
l.trendline = Trendline(trendlineType = 'poly', order = fit_order) #trendline with polynomial fitting
l1 = chart.series[1]
l1.graphicalProperties.line.solidFill = "0000FF"
l1.trendline = Trendline(trendlineType = 'poly', order = fit_order)
l2 = chart.series[2]
l2.graphicalProperties.line.solidFill = "00FF00"
l2.trendline = Trendline(trendlineType = 'poly', order = fit_order)

ws.add_chart(chart, "A10")
wb.save(r"C:\****\****\testFile.xlsx")
Let's say you want your linear trend line to be green. Then...
from openpyxl.chart.trendline import Trendline
from openpyxl.chart.shapes import GraphicalProperties
from openpyxl.drawing.line import LineProperties
line_props = LineProperties(solidFill='00FF00')
g_props = GraphicalProperties(ln=line_props)
linear_trendline = Trendline(spPr=g_props)
my_chart.series[0].trendline = linear_trendline
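The same idea can be applied to the polynomial trendlines from the question. This is a sketch under assumptions: colored_trendline is a hypothetical helper name, the order of 2 stands in for the question's fit_order, and the three colors are the ones the question already uses for its series.

```python
from openpyxl.chart.trendline import Trendline
from openpyxl.chart.shapes import GraphicalProperties
from openpyxl.drawing.line import LineProperties

def colored_trendline(hex_color, order=2):
    """Build a polynomial trendline whose line uses the given RGB hex color."""
    line_props = LineProperties(solidFill=hex_color)
    g_props = GraphicalProperties(ln=line_props)
    return Trendline(trendlineType='poly', order=order, spPr=g_props)

# One trendline per series, matching the series colors from the question
trendlines = [colored_trendline(c) for c in ("FF0000", "0000FF", "00FF00")]
```

Each object can then be assigned in turn, for example chart.series[0].trendline = trendlines[0], before the chart is added to the worksheet.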
I'm trying to write a script that updates a column based on a transaction ID. I'm using Python 3 and openpyxl to read the Excel file.
In the above image, the highlighted cells in column K should be updated with the same value, as they share the same transaction ID in column C. Then, when it gets to C12, column K is updated with a different value because the value in C has changed, and so on.
So far I have:
from openpyxl import load_workbook, Workbook
import re
wb = load_workbook(filename = 'Testing.xlsx')
ws = wb['Test']
for r in range(2, ws.max_row + 1):
    column_c = ws.cell(row = r, column = 3).value
    column_h = ws.cell(row = r, column = 8).value
    column_i = ws.cell(row = r, column = 9).value
    column_j = ws.cell(row = r, column = 10).value
    previous = None
    while (previous == column_c):
        ws.cell(row = r, column = 11).value = column_j_formatted
        if (previous != column_c):
            continue
wb.save('Testing_processed.xlsx')
UPDATE
I have tried to replace the while loop with:
previous_col_c = ws.cell(row=r-1, column=3)
for row_num in range (2, ws.max_row + 1):
    current_col_c = ws.cell(row=r, column=3)
    current_col_j = ws.cell(row=r, column=11)
    if current_col_c == previous_col_c:
        ws.cell(row = r, column = 11).value = column_j_formatted
    previous_col_c = current_col_c
Just to illustrate how the openpyxl API makes this kind of task very easy.
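txn = None
filler = None
for row in ws.iter_rows(min_row=2):
    c = row[2]   # column C: transaction ID
    k = row[10]  # column K: value to fill in
    if c.value != txn:
        # New transaction: remember its ID and the value to copy down
        txn = c.value
        filler = k.value
    if not k.value:
        k.value = filler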
But really the work should be done in the source of the data, presumably a database.
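The fill-down logic itself does not depend on openpyxl. As a minimal sketch, the function below applies the same rule to plain lists; fill_down_by_key and the sample rows are hypothetical, with the first item playing the role of the transaction ID and the second the value to propagate.

```python
def fill_down_by_key(rows, key_index, value_index):
    """Copy the first value seen for each run of equal keys into the empty cells below it."""
    current_key = None
    filler = None
    filled = []
    for row in rows:
        row = list(row)  # copy so the input is left untouched
        if row[key_index] != current_key:
            # Key changed: start a new run and remember its value
            current_key = row[key_index]
            filler = row[value_index]
        if not row[value_index]:
            row[value_index] = filler
        filled.append(row)
    return filled

sample = [["T1", "x"], ["T1", None], ["T1", None], ["T2", "y"], ["T2", None]]
result = fill_down_by_key(sample, key_index=0, value_index=1)
print(result)
```

Note that, like the worksheet version, this only fills cells when the first row of each run carries the value.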
I am trying to copy the values from some cells, but it gives me this error. I even tried without using the def cell(x,y) helper, but I still get the same error.
This is the error:
learn_tar.cell(row=learn_tar, column=1).value = sheet.cell(row=learn_tar, column=1).value
AttributeError: 'int' object has no attribute 'cell'
Source:
import openpyxl
def cell(x,y):
    cell = sheet.cell(row=x,column=y).value
    return cell
def percentage(percent, whole):
    return int((percent * whole) / 100.0)
ex = openpyxl.load_workbook("Final_excel2.xlsx")
sheet = ex.get_sheet_by_name('Sheet1')
num = [0,0,0]
per = [0,0,0]
for row in range(2,4798):
if cell(row,1) == '1: Progression':
num[0] = num[0] + 1
elif cell(row,1) == '2: Incidence':
num[1] = num[1] + 1
elif cell(row,1) == '3: Non-exposed control group':
num[2] = num[2] + 1
for column in range(2,49):
#doing stuff
per[0] = percentage(70,num[0])
per[1] = percentage(70,num[1])
per[2] = percentage(70,num[2])
learn_att = ex.create_sheet('Learn-Att',2)
learn_tar = ex.create_sheet('Learn-Tar',3)
test_att = ex.create_sheet('Test-Att',4)
test_tar = ex.create_sheet('Test-Tar',5)
learn_att = 1
learn_tar = 1
test_att = 1
test_tar = 1
for row in range(2,4798):
if row<=1391:
if row<=974:
learn_tar.cell(row=learn_tar, column=1).value = cell(row,1)
learn_att+= 1
learn_tar+= 1
else:
test_tar.cell(row = test_tar,column = 1).value = cell(row,1)
test_att+= 1
test_tar+= 1
for column in range(2,49):
if row<=1391:
if row<=974:
learn_att.cell(row = learn_att,column = column - 1).value = cell(row,column)
else:
test_att.cell(row = test_att,column = column - 1).value = cell(row,column)
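You overwrite learn_tar, which holds the worksheet, with the integer 1:
learn_tar = ex.create_sheet('Learn-Tar',3)
...
learn_tar = 1
After that assignment, learn_tar.cell(...) is a call on an int, hence the AttributeError. Remove:
learn_tar = 1
and:
learn_tar+= 1
from your code, and track the current row in a separately named counter instead. The same applies to learn_att, test_att, and test_tar.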
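A minimal sketch of keeping the worksheet and its row counter under different names, using an in-memory workbook instead of Final_excel2.xlsx. The counter name learn_tar_row and the three sample labels written to the source sheet are assumptions made for illustration.

```python
from openpyxl import Workbook

# In-memory stand-in for the source workbook
wb = Workbook()
source = wb.active
for label in ("1: Progression", "2: Incidence", "3: Non-exposed control group"):
    source.append([label])

learn_tar = wb.create_sheet("Learn-Tar")  # the worksheet object
learn_tar_row = 1                         # the next row to write, kept as a separate int

for row in range(1, source.max_row + 1):
    value = source.cell(row=row, column=1).value
    learn_tar.cell(row=learn_tar_row, column=1).value = value
    learn_tar_row += 1
```

Because learn_tar is never reassigned, learn_tar.cell(...) keeps working for every iteration.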