Is there a good way to replace values in classes for openpyxl in Python? - python

I have extracted cell values from Excel, stored them in class instances, and want to replace the wrong values after applying a specific conversion factor, i.e. something like the loop at the bottom of the code below.
Any suggestions?
products = []
for row in sheet.iter_rows(min_row=8, max_row=3030, max_col=10, values_only=True):
    product = Product(id=row[PRODUCT_ID],
                      Molecule=row[PRODUCT_MOLECULE],
                      Pack_Size=row[PRODUCT_PACK_SIZE],
                      Pack_Quantity=row[PRODUCT_PACK_QUANTITY],
                      Product_Treatment=row[PRODUCT_TREATMENT])
    products.append(product)
print(products[1000])
for index, item in enumerate(products):
    if item != PRODUCT_PACK_QUANTITY * PRODUCT_PACK_SIZE / 1:
        products[index] = round(products[index]) - (products[index] PRODUCT_PACK_QUANTITY * PRODUCT_PACK_SIZE / 1)
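One possible reading of the loop above, as a minimal sketch: PRODUCT_PACK_QUANTITY and PRODUCT_PACK_SIZE are column positions (they are used to index into row), so the comparison has to read attributes from each Product rather than use those constants directly. The Product dataclass below is a stand-in, and both the total field and the rule total == pack_quantity * pack_size are assumptions made for illustration; the question does not state the real conversion factor.

```python
import math
from dataclasses import dataclass


@dataclass
class Product:
    # Hypothetical stand-in for the asker's Product class.
    id: int
    pack_size: float
    pack_quantity: float
    total: float  # assumed field that may hold a wrong value


def fix_totals(products):
    """Overwrite total wherever it disagrees with pack_quantity * pack_size.

    Returns the number of products that were corrected.
    """
    corrected = 0
    for product in products:
        expected = product.pack_quantity * product.pack_size
        if not math.isclose(product.total, expected):
            product.total = expected  # mutate the object in place
            corrected += 1
    return corrected


products = [
    Product(id=1, pack_size=10, pack_quantity=3, total=30),
    Product(id=2, pack_size=5, pack_quantity=4, total=99),
]
num_corrected = fix_totals(products)
```

Because the objects are mutated in place, there is no need to reassign products[index]; the list keeps pointing at the same, now corrected, instances.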

Related

Is it possible to update a row of data using position of column (e.g. like a list index) in Python / SQLAlchemy?

I am trying to compare two rows of data to one another which I have stored in a list.
for x in range(0, len_data_row):
    if company_data[0][0][x] == company_data[1][0][x]:
        print ('MATCH 1: {} - {}'.format(x, company_data[0][0][x]))
        # do nothing
    if company_data[0][0][x] is None and company_data[1][0][x] is not None:
        print ('MATCH 2: {} - {}'.format(x, company_data[1][0][x]))
        # update first company_id with data from 2nd
    if company_data[0][0][x] is not None and company_data[1][0][x] is None:
        print ('MATCH 3: {} - {}'.format(x, company_data[0][0][x]))
        # update second company_id with data from 1st
Pseudocode of what I want to do:
If the data at index [x] of a list is not None for row 2 but is blank for row 1, then write row 2's value at index [x] to row 1 in my database.
The part I can't figure out is whether SQLAlchemy lets you specify which column is being updated by an "index". (I think in db-land "index" means something different from what I mean, which is a list index, e.g. list[1].) I would also like to know whether you can dynamically specify which column is being updated by passing a variable to the update code. Here's what I'm looking to do (it doesn't work, of course):
def some_name(column_by_index, column_value):
    u = table_name.update().where(table_name.c.id==row_id).values(column_by_index=column_value)
    db.execute(u)
Thank you!
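The attempt above fails because values(column_by_index=column_value) uses the literal keyword column_by_index as the column name. A minimal sketch of one way around this, assuming SQLAlchemy Core: look up the column name by position with table.c.keys() and pass a dictionary to values(). The company table, its columns, the in-memory SQLite engine, and the helper name update_by_position are all made up for illustration.

```python
from sqlalchemy import Column, Integer, MetaData, String, Table, create_engine

engine = create_engine("sqlite://")  # in-memory database, illustration only
metadata = MetaData()

# Hypothetical table standing in for the asker's table_name.
company = Table(
    "company",
    metadata,
    Column("id", Integer, primary_key=True),
    Column("name", String),
    Column("city", String),
)
metadata.create_all(engine)


def update_by_position(conn, table, row_id, column_index, column_value):
    """Update the column found at a list-style position in the table definition."""
    column_name = table.c.keys()[column_index]  # e.g. 2 -> "city"
    stmt = (
        table.update()
        .where(table.c.id == row_id)
        .values({column_name: column_value})  # dict key chosen at runtime
    )
    conn.execute(stmt)


with engine.begin() as conn:
    conn.execute(company.insert().values(id=1, name="Acme", city=None))
    update_by_position(conn, company, row_id=1, column_index=2, column_value="Oslo")
    row = conn.execute(company.select().where(company.c.id == 1)).fetchone()
```

The position here refers to the order of columns in the Table definition, which only matches the order of a fetched row if the query selected all columns in that same order.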

Trying to iterate over a df and add a column with a value for each row

I am trying to use two dataframes to add a new column, num_events, to one of my dataframes.
My first attempt was to use iteritems(), but I am getting the error: KeyError: 'min_lat'
Basic code:
for index, row in v_roads.iteritems():
    min_lat = row['min_lat']
    min_lon = row['min_lon']
    max_lat = row['max_lat']
    max_lon = row['max_lon']
    total_events = 0
    for i, r in accidents.iteritems():
        e_lat = r['latitude']
        e_lon = r['longitude']
        if e_lat >= min_lat and e_lat <= max_lat and e_lon >= min_lon and e_lon <= max_lon:
            total_events += 1
    row['num_events'] = total_events
I understand that iteritems() lazily iterates over (index, value) tuples, but I am unsure of another way to do what I want, which is for each row, get the data at the row's columns, min_lat, min_lon, etc. and store that data in variables.
Could someone please point me in the right direction towards a correct approach?
EDIT: To clear this up for some: yes, I do want to add a new column, but I am stuck at reading data from a specific row's columns.
Data example: v_roads and accidents (tables not preserved)
So to solve this, this is what I did:
for row in v_roads.itertuples():
    # Where 11 is the index of the min_lat col...
    min_lat = row[11]
    min_lon = row[12]
    ...
It's not that hard, so hopefully this helps anybody else with a similar issue.
Note that tuples are immutable, so if you wanted to make changes you would need to do something like:
for i, row in v_roads.head(1000).iterrows():
    # .at replaces DataFrame.set_value, which was removed in pandas 1.0
    v_roads.at[i, 'colNameHere'] = variableToPlace
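For context on the original KeyError: on a DataFrame, iteritems() yields (column name, Series) pairs rather than rows, so row['min_lat'] looks for a row labelled 'min_lat' (the method was also removed in pandas 2.0). A minimal sketch of an alternative follows, using itertuples() with named attributes instead of positional indices and a boolean mask for the bounding-box test. The sample data and the helper name count_events are invented for illustration; only the column names come from the question.

```python
import pandas as pd

# Made-up sample data using the column names from the question.
v_roads = pd.DataFrame({
    "min_lat": [0.0, 10.0],
    "max_lat": [5.0, 15.0],
    "min_lon": [0.0, 10.0],
    "max_lon": [5.0, 15.0],
})
accidents = pd.DataFrame({
    "latitude": [1.0, 2.0, 12.0, 50.0],
    "longitude": [1.0, 4.0, 11.0, 50.0],
})


def count_events(road, events):
    """Count events inside one road's bounding box (bounds inclusive)."""
    inside = (
        events["latitude"].between(road.min_lat, road.max_lat)
        & events["longitude"].between(road.min_lon, road.max_lon)
    )
    return int(inside.sum())


# itertuples() yields namedtuples, so columns are available as attributes.
v_roads["num_events"] = [
    count_events(road, accidents) for road in v_roads.itertuples(index=False)
]
```

Building the whole column as a list and assigning it once avoids writing to the dataframe row by row inside the loop.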

Find an efficient way of searching in nested python lists

I am very new to this forum; I am a network engineer learning Python to automate some tasks and make my work more efficient. Straight to the point: I have a big Excel workbook of 4 sheets with around 50K rows in each sheet. After a couple of weeks of learning and extensive searching, I was able to load all the cell values into a nested list, e.g.
list[sheet_index][row_index][column_index]
With the inputs loaded, the next part is manipulating the data. My task is to take a specific column value from each row and search for it in the entire workbook; if it is found, the corresponding data from a different column should be written in line with the original searched object.
My method is as follows:
1. Get the cell values into a big list (as mentioned above).
2. Flatten that list into a separate one-dimensional list.
3. In a loop, get the specific value from a row (fixed column) and search the entire one-dimensional list; if found, write the corresponding value to a different Excel file.
So far this method works, but with a very long delay, and speed was my motivation for moving from an Excel VBA program to Python. So I am here to ask the experts whether there's something very basic I am missing. Here is the code:
import xlrd
import xlwt
# copy() is called below but was never imported; xlutils is assumed to be its source
from xlutils.copy import copy
from compiler.ast import flatten  # Python 2 only

datafile = 'Peering_DB.xls'

# Data Read Function Definition
def main(datafile):
    wb = xlrd.open_workbook(datafile)
    wwb = copy(wb)
    data = [[[wb.sheet_by_index(i).cell_value(r, col)
              for col in range(wb.sheet_by_index(i).ncols)]
             for r in range(wb.sheet_by_index(i).nrows)]
            for i in range(0,4)]
    data1 = flatten(data)
    k = 2
    x = 0
    while x < 4:
        r = wb.sheet_by_index(x).nrows
        A = data[x][k][1]
        B = data[x][k][2]
        counter = 4
        # full scan of the flattened list for every row
        loc = [loc for (loc , e ) in enumerate(data1) if e == A]
        if len(loc) != 1:
            for n in range(len(loc)):
                if data1[loc[n] + 1] != B:
                    wwb.get_sheet(x).write(k,counter,data1[loc[n] + 1])
                    counter = counter + 1
        else:
            wwb.get_sheet(x).write(k,counter,"No Backup")
        k = k + 1
        if k == r - 1 and x < 3:
            print 'Page number ', x , 'Completed'
            x = x + 1
            k = 2
        elif k == r and x == 3:
            print "Operation Completed Successfully"
            break
    wwb.save('Peering_output.xls')

main(datafile)
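The delay in the code above most likely comes from rescanning the whole flattened list once per row. A minimal sketch of a common alternative is to build a dictionary index in one pass and then do constant-time lookups. This is a simplification, not a drop-in replacement: it assumes the searched value only needs to be matched against the same key column (column 1) in every sheet, and the function names and sample data are hypothetical.

```python
from collections import defaultdict


def build_index(sheets, key_col=1, value_col=2):
    """Map each key-column value to every value-column entry seen with it."""
    index = defaultdict(list)
    for sheet in sheets:
        for row in sheet:
            index[row[key_col]].append(row[value_col])
    return index


def find_backups(sheets, key_col=1, value_col=2):
    """For each row, list the other value-column entries sharing its key."""
    index = build_index(sheets, key_col, value_col)
    results = []
    for sheet in sheets:
        sheet_results = []
        for row in sheet:
            others = [v for v in index[row[key_col]] if v != row[value_col]]
            sheet_results.append(others or ["No Backup"])
        results.append(sheet_results)
    return results


# Tiny made-up workbook: two sheets, rows of [label, key, value].
sheets = [
    [["r0", "A", "x"], ["r1", "B", "y"]],
    [["r2", "A", "z"]],
]
backups = find_backups(sheets)
```

With the index built once, each row costs a dictionary lookup rather than a scan over every cell in the workbook, and the results can then be written out in a single pass.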

Google chart input data

I have a Python script that builds inputs for a Google chart. It creates the column headers and the correct number of rows, but it repeats the data for the last row in every row. I tried explicitly setting the row indices rather than using a loop (which wouldn't work in practice, but should have worked in testing), and it still gives me the same values for each entry. The code did work when it was on the same page as the HTML user form.
end1 = number of rows in the data table
end2 = number of columns in the data table represented by a list of column headers
viewData = data stored in database
c = connections['default'].cursor()
c.execute("SELECT * FROM {0}.\"{1}\"".format(analysis_schema, viewName))
viewData=c.fetchall()
curDesc = c.description
end1 = len(viewData)
end2 = len(curDesc)
Creates column headers:
colOrder=[curDesc[2][0]]
if activityOrCommodity=="activity":
    tableDescription={curDesc[2][0] : ("string", "Activity")}
elif (activityOrCommodity == "commodity") or (activityOrCommodity == "aa_commodity"):
    tableDescription={curDesc[2][0] : ("string", "Commodity")}
for i in range(3,end2 ):
    attValue = curDesc[i][0]
    tableDescription[curDesc[i][0]]= ("number", attValue)
    colOrder.append(curDesc[i][0])
Creates row data:
data=[]
values = {}
for i in range(0,end1):
    for j in range(2, end2):
        if j == 2:
            values[curDesc[j][0]] = viewData[i][j].encode("utf-8")
        else:
            values[curDesc[j][0]] = viewData[i][j]
    data.append(values)
dataTable = gviz_api.DataTable(tableDescription)
dataTable.LoadData(data)
return dataTable.ToJSon(columns_order=colOrder)
An example javascript output:
var dt = new google.visualization.DataTable({cols:[{id:'activity',label:'Activity',type:'string'},{id:'size',label:'size',type:'number'},{id:'compositeutility',label:'compositeutility',type:'number'}],rows:[{c:[{v:'AA26FedGovAccounts'},{v:49118957568.0},{v:1.94956132673}]},{c:[{v:'AA26FedGovAccounts'},{v:49118957568.0},{v:1.94956132673}]},{c:[{v:'AA26FedGovAccounts'},{v:49118957568.0},{v:1.94956132673}]},{c:[{v:'AA26FedGovAccounts'},{v:49118957568.0},{v:1.94956132673}]},{c:[{v:'AA26FedGovAccounts'},{v:49118957568.0},{v:1.94956132673}]}]}, 0.6);
It seems you're appending values to data, but values is never reset between iterations, so every entry in data refers to the same dictionary and ends up showing the last row.
I assume this is not intended? If so, just move values = {} inside the first for loop in your row-building code.
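A minimal sketch of why every row ends up identical, using made-up rows and column names rather than the database cursor from the question: appending the same dictionary object repeatedly stores several references to one object, so later writes show up in every entry.

```python
rows = [("a", 1), ("b", 2), ("c", 3)]  # stand-in for viewData
columns = ["name", "size"]             # stand-in for the cursor description

# Buggy pattern: one dictionary created outside the loop and reused.
shared = {}
buggy = []
for row in rows:
    for col, cell in zip(columns, row):
        shared[col] = cell
    buggy.append(shared)  # appends a reference to the same dict each time

# Fixed pattern: a fresh dictionary for every row.
fixed = []
for row in rows:
    values = {}
    for col, cell in zip(columns, row):
        values[col] = cell
    fixed.append(values)

buggy_names = [entry["name"] for entry in buggy]
fixed_names = [entry["name"] for entry in fixed]
```

Here buggy_names contains the last row's name three times, while fixed_names keeps one name per row.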

How to refer another row in row_iterator Pandas?

I have the following code
row_iterator = temp.iterrows()
for i, row in row_iterator:
    row['InterE'] = row['xs'] - (row['xs'] - row['InterS']) * exp(-row['ak1'])
    if row['InterE'][:-1] < 1:
        row['InterS'] = row['InterE'][:-1]
    else:
        row['InterS'] = row['InterE'][:-1] - row['InterE'][:-1] * row['xi'][:-1]
But it returns me the following error:
invalid index to scalar variable.
Could you help me?
You should avoid iterating especially when you can vectorise the operation.
So
# calculate the 'InterE' column for the entire dataframe
# (exp must be a vectorised function such as numpy.exp to accept a Series)
temp['InterE'] = temp['xs'] - (temp['xs'] - temp['InterS']) * exp(-temp['ak1'])
# for values less than 1, assign a neighbouring row's value; note that shift(-1)
# aligns each row with the following row (use shift(1) for the previous row)
temp.loc[temp['InterE'] < 1, 'InterS'] = temp['InterE'].shift(-1)
# for the other condition, perform the alternative calculation and assign
temp.loc[temp['InterE'] >=1, 'InterS'] = temp['InterE'].shift(-1) - (temp['InterE'].shift(-1) * temp['xi'].shift(-1))
Let me know if this does what you want; if not, please post the data and desired output.
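Since the answer hinges on shift, here is a minimal sketch of how the sign of its argument behaves, using a made-up three-element Series. A positive value moves data down, so each row sees the previous row's value; a negative value moves data up, so each row sees the following row's value.

```python
import pandas as pd

s = pd.Series([10, 20, 30])

previous_row = s.shift(1)   # row i holds the value from row i - 1
next_row = s.shift(-1)      # row i holds the value from row i + 1

# The vacated position becomes NaN, so the result is a float Series.
first_is_missing = bool(pd.isna(previous_row.iloc[0]))
last_is_missing = bool(pd.isna(next_row.iloc[2]))
```

Which direction is appropriate depends on whether the calculation should look at the row before or the row after, which the question does not make fully clear.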
