How can I make the cell reference increase by one each time the loop moves on to the next sheet? I got it to loop through the different sheets, but I'm not sure how to add 1 to the cell reference as it goes.
for sheet in sheetlist:
    wsX = wb.get_sheet_by_name('{}'.format(sheet))
    ws2['D4'] = wsX['P6'].value
I'm trying to get just the ['D4'] to change to D5, D6, D7 and so on, up to D25, automatically.
No need for counters or clumsy string conversion: openpyxl provides an API for programmatic access.
for idx, sheet in enumerate(sheetlist, start=4):
    wsX = wb[sheet]
    cell = ws2.cell(row=idx, column=4)  # column D, rows 4, 5, 6, ...
    cell.value = wsX['P6'].value
for i, sheet in enumerate(sheetlist):
    wsX = wb.get_sheet_by_name('{}'.format(sheet))
    cell_no = 'D' + str(i + 4)  # D4, D5, D6, ... as i counts up from 0
    ws2[cell_no] = wsX['P6'].value
Write this outside of the loop:
x = 'D4'
and write this at the end of each pass through the loop:
x = x[0] + str(int(x[1:]) + 1)
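Putting those two pieces into the loop from the question, it would look roughly like this (a sketch assuming the wb, ws2 and sheetlist objects from the question):
x = 'D4'  # outside the loop
for sheet in sheetlist:
    wsX = wb[sheet]
    ws2[x] = wsX['P6'].value
    # bump the row part of the reference: 'D4' -> 'D5' -> 'D6' ...
    x = x[0] + str(int(x[1:]) + 1)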
Try this one... it's commented so you can understand what it's doing.
# counter
i = 4
for sheet in sheetlist:
    # stop once D25 has been filled
    if i > 25:
        break
    wsX = wb.get_sheet_by_name('{}'.format(sheet))
    # dynamic way to get the cell
    cell1 = 'D' + str(i)
    ws2[cell1] = wsX['P6'].value
    # incrementing counter
    i += 1
I need help saving the values my for-loop iterates over.
With this script I create my CSV the way I need it, but there is no record of which values of w and c each row belongs to. How can I add this information in two more columns?
import pandas as pd

df = pd.read_csv(...)
country_list = df.Country.unique()
wave_list = df.Wave.unique()

dn = pd.DataFrame()

for w in wave_list:
    print("Wave is: " + str(w))
    wave_select = df[df["Wave"] == w]  # Select Rows for Waves
    for c in country_list:
        print("Country is: " + str(c))
        country_select = df[df["Country"] == c]  # Select Rows for Countries
        out = country_select["Sea"].value_counts(normalize=True)*100  # Calculate Percentage
        print(out)
        dn = dn.append(out)

dn.to_csv(...)
I would be very grateful for help.
Before loop:
dn = pd.DataFrame(columns=['wave','country','out'])
Inside inner loop instead of dn = dn.append(out):
dn = dn.append({'wave':w,'country':c,'out':out}, ignore_index=True)
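Note that DataFrame.append was removed in pandas 2.0. A minimal sketch of the same idea that collects plain dicts and builds the frame once; the frame, the extra column names and the output file name below are made-up stand-ins for the real data:
import pandas as pd

# toy stand-in for the real CSV (hypothetical values)
df = pd.DataFrame({
    "Wave":    [1, 1, 2, 2],
    "Country": ["DE", "FR", "DE", "FR"],
    "Sea":     ["yes", "no", "yes", "yes"],
})

rows = []  # one record per (wave, country, sea value)
for w in df["Wave"].unique():
    for c in df["Country"].unique():
        sel = df[(df["Wave"] == w) & (df["Country"] == c)]
        if sel.empty:
            continue
        out = sel["Sea"].value_counts(normalize=True) * 100
        for sea_value, pct in out.items():
            rows.append({"wave": w, "country": c, "sea": sea_value, "pct": pct})

dn = pd.DataFrame(rows)
dn.to_csv("output.csv", index=False)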
Recently I have been working with an Excel sheet for work and I need to format it in a certain way (shown below). The following is the Excel sheet I'm working with (apologies for the REDACTED, as some of the information is sensitive; I also apologize for the image, since I am fairly new to Stack Overflow and do not know how to add Excel data):
Above is the format that I currently am using, but I need to convert the data to the following format:
As you can see I need the data to go from 10 lines, down to 1 line per unique LBREFID. I have already tried to use different Pandas functions such as .tolist() and .pivot() for the data, but that would result in data that does not resemble the desired format. This is an interesting problem that I, unfortunately, do not have the time to solve. Thank you in advance for your help.
from openpyxl import load_workbook
tests = ["BMCELL", "PLASMA", "NEOPLASMABM", "NEOPLASMATBM", "CD138", "CD56", "CYCLIND1", "KAPPA", "LAMBDA", "NEOPLASMA"]
df = load_workbook(filename='GregFileComparison\\NovemberData.xlsx')
sheet = df['Sheet1']

# clear columns A-E, I-J and M-N in blocks of rows, stepping down 10 rows at a time
i = 0
a = 3
e = 11
while i <= 227:
    for row in sheet['A' + str(a) + ':E' + str(e)]:
        for cell in row:
            cell.value = None
    for row in sheet['I' + str(a) + ':J' + str(e)]:
        for cell in row:
            cell.value = None
    for row in sheet['M' + str(a) + ':N' + str(e)]:
        for cell in row:
            cell.value = None
    a += 10
    e += 10
    i += 1

sheet.delete_cols(12)
sheet.delete_cols(7)

# insert nine blank columns starting at column 11
i = 11
while i <= 19:
    sheet.insert_cols(i)
    i += 1

# write the test names as headers in row 1, starting at column 10
counter = 10
i = 0
while i <= 9:
    sheet.cell(row=1, column=counter).value = tests[i]
    counter += 1
    i += 1

# move each result cell in column J up and across into its own test column
j = 0
i = 3
counter = 1
while j <= 250:
    while counter <= 9:
        sheet.move_range("J" + str(i), rows=-(counter), cols=counter)
        i += 1
        counter += 1
    j += 1
    counter = 0

sheet.delete_cols(6)
sheet.delete_cols(6)
df.save('output.xlsx')
I found that hardcoding the transformations on the excel sheet worked best.
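For comparison, when the data can be reduced to one (LBREFID, test, result) triple per row, a pandas pivot also gives one line per LBREFID. This is only a sketch under that assumption; the LBREFID, TEST and RESULT column names and values are invented, not taken from the redacted sheet:
import pandas as pd

# hypothetical long-format data: one row per (LBREFID, test)
long_df = pd.DataFrame({
    "LBREFID": ["A1", "A1", "A2", "A2"],
    "TEST":    ["BMCELL", "PLASMA", "BMCELL", "PLASMA"],
    "RESULT":  [1.0, 2.0, 3.0, 4.0],
})

# one row per LBREFID, one column per test
wide_df = long_df.pivot(index="LBREFID", columns="TEST", values="RESULT").reset_index()
print(wide_df)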
I have a huge dataframe in Python, over 7 million rows. My general problem is that I need to run over one column and start a new number every time I see a '#' in that column. The first time I see a '#' the counter becomes 1 and that row is dropped; every row after it gets that number until the next '#', where the counter goes up by one again, and so on.
I already have some code in place, but as it is a plain loop it is super slow!
i = 0
j = 0
while i < len(data):
    if data.iloc[i][0] == '#':
        j = j + 1
        data = data.drop(data.index[i])
    else:
        data.iloc[i][0] = j
        i = i + 1
return data
Try something like this; it is vectorised, so it avoids the row-by-row loop:
m = (data.iloc[:, 0] == '#')           # mask of the '#' marker rows
data.iloc[:, 0] = m.cumsum()           # running count: 1 after the first '#', 2 after the second, ...
data.drop(m.index[m], inplace=True)    # remove the marker rows themselves
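A quick toy run of the same idea (the sample values here are made up):
import pandas as pd

data = pd.DataFrame({'col': ['#', 'a', 'b', '#', 'c']})

m = (data.iloc[:, 0] == '#')   # marker rows at positions 0 and 3
data.iloc[:, 0] = m.cumsum()   # column becomes 1, 1, 1, 2, 2
data = data.drop(m.index[m])   # drop the '#' rows
print(data)
#    col
# 1    1
# 2    1
# 4    2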
I want the entire row to be removed (shift cells up) if there are no values in the entire row. I'm using openpyxl.
My code:
for row in range(1, ws1.max_row):
    flag = 0
    for col in range(1, 50):
        if ws1.cell(row, col).value is not None:
            flag = 1
    if flag == 0:
        ws1.delete_rows(row, 1)
The rows are not getting deleted in the above case.
I tried using iter_rows function to do the same and it gives me:
TypeError: '>' not supported between instances of 'tuple' and 'int'
for row in ws1.iter_rows(min_row=1, max_col=50, max_row=ws1.max_row):
    flag = 0
    for cell in row:
        if cell.value is not None:
            flag = 1
    if flag == 0:
        ws1.delete_rows(row, 1)
Help is appreciated!
The following is a generic approach to finding and then deleting empty rows.
empty_rows = []
for idx, row in enumerate(ws.iter_rows(max_col=50), start=1):
    empty = not any(cell.value for cell in row)
    if empty:
        empty_rows.append(idx)

# delete from the bottom up so the indices recorded above stay valid
for row_idx in reversed(empty_rows):
    ws.delete_rows(row_idx, 1)
Thanks to Charlie Clark for the help. Here is a working solution I came up with; let me know if I can make any improvements to it:
i = 1
emptyRows = []

for row in ws1.iter_rows(min_row=1, max_col=50, max_row=ws1.max_row):
    flag = 0
    for cell in row:
        if cell.value is not None:
            flag = 1
    if flag == 0:
        emptyRows.append(i)
    i += 1

for x in emptyRows:
    ws1.delete_rows(x, 1)
    # shift the recorded row numbers up to account for the row just deleted
    emptyRows[:] = [y - 1 for y in emptyRows]
I have a python program which does a SOAP request to a server, and it works fine:
I get the answer from the server, parse it, clean it, and when I am done I end up with a string like this:
name|value|value_name|default|seq|last_modify|record_type|1|Detail|0|0|20150807115904|zero_out|0|No|0|0|20150807115911|out_ind|1|Partially ZeroOut|0|0|20150807115911|...
Basically, it is a string with values delimited by "|". I also know the structure of the database I am requesting, so I know that it has 6 columns and various rows. I basically need to split the string after every 6th "|" character, to obtain something like:
name|value|value_name|default|seq|last_modify|
record_type|1|Detail|0|0|20150807115904|
zero_out|0|No|0|0|20150807115911|
out_ind|1|Partially ZeroOut|0|0|20150807115911|...
Can you tell me how to do that in Python? Thank you!
Here's a functional-style solution.
s = 'name|value|value_name|default|seq|last_modify|record_type|1|Detail|0|0|20150807115904|zero_out|0|No|0|0|20150807115911|out_ind|1|Partially ZeroOut|0|0|20150807115911|'
for row in map('|'.join, zip(*[iter(s.split('|'))] * 6)):
    print(row + '|')
Output:
name|value|value_name|default|seq|last_modify|
record_type|1|Detail|0|0|20150807115904|
zero_out|0|No|0|0|20150807115911|
out_ind|1|Partially ZeroOut|0|0|20150807115911|
For info on how zip(*[iter(seq)] * rowsize) works, please see the links at Splitting a list into even chunks.
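In short, [iter(seq)] * n puts the same iterator object into the list n times, so each tuple zip builds pulls n consecutive items from it. A tiny illustration:
it = iter(['a', 'b', 'c', 'd', 'e', 'f'])
# all three positions share the one iterator, so zip consumes items in threes
print(list(zip(*[it] * 3)))   # [('a', 'b', 'c'), ('d', 'e', 'f')]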
data = "name|value|value_name|default|seq|last_modify|record_type|1|Detail|0|0|20150807115904|zero_out|0|No|0|0|20150807115911|out_ind|1|Partially ZeroOut|0|0|20150807115911|"
splits = data.split('|')
splits = list(filter(None, splits)) # Filter empty strings
row_len = 6
rows = ['|'.join(splits[i:i + row_len]) + '|' for i in range(0, len(splits), row_len)]
print(rows)
>>> ['name|value|value_name|default|seq|last_modify|', 'record_type|1|Detail|0|0|20150807115904|', 'zero_out|0|No|0|0|20150807115911|', 'out_ind|1|Partially ZeroOut|0|0|20150807115911|']
How about this:
a = 'name|value|value_name|default|seq|last_modify|record_type|1|Detail|0|0|20150807115904|zero_out|0|No|0|0|20150807115911|out_ind|1|Partially ZeroOut|0|0|20150807115911|'
b = a.split('|')
c = [b[6*i:6*(i+1)] for i in range(len(b)//6)] # this is a very workable form of data storage
print('\n'.join('|'.join(i) for i in c)) # produces your desired output
# prints:
# name|value|value_name|default|seq|last_modify
# record_type|1|Detail|0|0|20150807115904
# zero_out|0|No|0|0|20150807115911
# out_ind|1|Partially ZeroOut|0|0|20150807115911
Here is a flexible generator approach:
def splitOnNth(s, d, n, keep=False):
    i = s.find(d)
    j = 1
    while True:
        while i > 0 and j % n != 0:
            i = s.find(d, i + 1)
            j += 1
        if i < 0:
            yield s
            return  # end generator
        else:
            yield s[:i+1] if keep else s[:i]
            s = s[i+1:]
            i = s.find(d)
            j = 1
# test runs, showing `keep` in action:
test = 'name|value|value_name|default|seq|last_modify|record_type|1|Detail|0|0|20150807115904|zero_out|0|No|0|0|20150807115911|out_ind|1|Partially ZeroOut|0|0|20150807115911|'
for s in splitOnNth(test, '|', 6, True):
    print(s)
print('')
for s in splitOnNth(test, '|', 6):
    print(s)
Output:
name|value|value_name|default|seq|last_modify|
record_type|1|Detail|0|0|20150807115904|
zero_out|0|No|0|0|20150807115911|
out_ind|1|Partially ZeroOut|0|0|20150807115911|
name|value|value_name|default|seq|last_modify
record_type|1|Detail|0|0|20150807115904
zero_out|0|No|0|0|20150807115911
out_ind|1|Partially ZeroOut|0|0|20150807115911
There are really many ways to do it. Even with a loop:
a = 'name|value|value_name|default|seq|last_modify|record_type|1|Detail|0|0|20150807115904' \
    '|zero_out|0|No|0|0|20150807115911|out_ind|1|Partially ZeroOut|0|0|20150807115911|'
new_a = []
ind_start, ind_end = 0, 0
for _ in range(a.count('|') // 6):
    for _ in range(6):
        # advance to the next '|' delimiter
        ind_end = a.index('|', ind_end + 1)
    print(a[ind_start:ind_end + 1])
    new_a.append(a[ind_start:ind_end + 1])
    ind_start = ind_end + 1
The print is just to show the results; you can remove it:
name|value|value_name|default|seq|last_modify|
record_type|1|Detail|0|0|20150807115904|
zero_out|0|No|0|0|20150807115911|
out_ind|1|Partially ZeroOut|0|0|20150807115911|