How to switch pointer to next column in csv module of python - python

I am reading data from an xls file column by column, and I want to write that data to a CSV file column by column as well.
The problem is that I can't figure out how to move the pointer to the next column.
Currently I get the following output from my code:
abc
pqr
,
def
ghi
What I want is:
abc,def
pqr,ghi
My sample code:
for k in col1,col2:
    for i in range(2,10):
        test = (sheet.cell(row=i, column=k).value)
        c.writerow([test])
    c.writerow(",") #switch to next column.. Not working
Please help...

The problem is that you call writerow for every single element. Instead, collect the values that belong to one row in a list and write the entire row out at once (writerow accepts any iterable of values, so a list or a tuple both work).
You are also iterating over rows in the inner loop, where you should be iterating over columns. The way you had it, you were going down every row of one column before moving to the next column, when it should be the other way around. Like reading a book, you tackle each word individually (column item in the inner loop) and you move down to the next line (row item in the outer loop) when you are done with the current line.
# Rows 2-9
for i in range(2,10):
    row = []
    for k in col1,col2:
        row.append(str(sheet.cell(row=i, column=k).value))  # use unicode() on Python 2
    c.writerow(row)  # write the complete row in one call
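The row-outside, column-inside loop order can be tried without a spreadsheet. This is only a minimal sketch: the `grid` dictionary is a hypothetical stand-in for `sheet.cell(row=i, column=k).value`, and the CSV is written to an in-memory buffer instead of a file.

```python
import csv
import io

# Hypothetical stand-in for the worksheet: grid[(row, column)] -> cell value
grid = {
    (2, 1): "abc", (2, 2): "def",
    (3, 1): "pqr", (3, 2): "ghi",
}

buffer = io.StringIO()  # in-memory text file, so nothing touches the disk
writer = csv.writer(buffer, lineterminator="\n")

for i in (2, 3):  # outer loop: one pass per output row
    row = [grid[(i, k)] for k in (1, 2)]  # inner loop: collect every column of that row
    writer.writerow(row)  # one writerow call per complete row

csv_text = buffer.getvalue()
print(csv_text)
```

The csv writer inserts the commas itself, so there is no need to write a separator manually.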

Related

How to delete a first and second line in each cell of column in csv

I didn't find a solution for this, so maybe someone can help me.
I have a CSV file with a "splitted_text" column. How can I delete the first and second line of text in each row?
Based on the screenshot, I'm assuming that by:
delete the first and second line of text
you mean "delete the first and second element of each list contained in the splitted_text column".
In which case, you can simply apply a function along the column:
df["clean_splitted_text"] = df["splitted_text"].apply(lambda x: x[2:])
to save the cleaned output in a new column, or
df["splitted_text"] = df["splitted_text"].apply(lambda x: x[2:])
if you want to overwrite the content of that column (!)
Here, lambda x: x[2:] is a function that removes the first two elements of a list. If that list has 2 or fewer elements, then it returns the empty list: [].
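The slicing behaviour is easy to check on plain lists, without pandas. The sample values below are made up for illustration. The last part covers one assumption worth verifying on real data: if the column was loaded from a CSV file, each cell may be the text representation of a list rather than a list, in which case `x[2:]` would cut off two characters instead of two elements.

```python
import ast

drop_first_two = lambda x: x[2:]  # same function as in the answer

# Hypothetical cell contents: lists of different lengths
cells = [["h1", "h2", "body", "footer"], ["h1", "h2"], ["h1"], []]
cleaned = [drop_first_two(cell) for cell in cells]
print(cleaned)

# If a cell is actually a string that looks like a list, slicing works on characters
as_text = "['h1', 'h2', 'body']"
sliced_text = drop_first_two(as_text)  # drops the characters "['" only
parsed = drop_first_two(ast.literal_eval(as_text))  # parse to a real list first
print(sliced_text, parsed)
```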

iterrows() loop is only reading last value and only modifying first row

I have a dataframe test. My goal is to search in the column t1 for specific strings, and if it matches exactly a specific string, put that string in the next column over called t1_selected. Only thing is, I can't get iterrows() to go over the entire dataframe, and to report results in respective rows.
for index, row in test.iterrows():
    if any(['ABCD_T1w_MPR_vNav_passive' in row['t1']]):
        #x = ast.literal_eval(row['t1'])
        test.loc[i, 't1_selected'] = str(['ABCD_T1w_MPR_vNav_passive'])
I am only trying to get ABCD_T1w_MPR_vNav_passive to be in the 4th row under the t1_selected, while all the other rows will have not found. The first entry in t1_selected is from the last row under t1 which I didn't include in the screenshot because the dataframe has over 200 rows.
I tried to initialize an empty list to append output of
import ast
x = ast.literal_eval(row['t1'])
to see if I can put x in there, but the same issue occurred.
Is there anything I am missing?
for index, row in test.iterrows():
    if any(['ABCD_T1w_MPR_vNav_passive' in row['t1']]):
        #x = ast.literal_eval(row['t1'])
        test.loc[index, 't1_selected'] = str(['ABCD_T1w_MPR_vNav_passive'])
Here index is the label of the row currently being visited, so each match is written to its own row. The loop never updates i, so with test.loc[i, ...] every write went to the same row.
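The same mistake can be reproduced with plain lists, which makes the effect easier to see. This is only an analogy, not the asker's dataframe: the sample values are invented, and `i = 0` is an assumed leftover from earlier code.

```python
target = "ABCD_T1w_MPR_vNav_passive"
values = ["something_else", target, "another_scan"]  # hypothetical 't1' column

i = 0  # stale variable from earlier code (assumption); the loop never changes it

buggy = ["not found"] * len(values)
for index, value in enumerate(values):
    if target in value:
        buggy[i] = target  # always writes to position 0

fixed = ["not found"] * len(values)
for index, value in enumerate(values):
    if target in value:
        fixed[index] = target  # writes to the position that matched

print(buggy)
print(fixed)
```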

How to return the string of a header based on the max value of a cell in Openpyxl

Good morning guys! Quick question about openpyxl:
I am working with Python, editing an xlsx document and generating various stats. Part of my script generates the max value of each row in a cell range:
temp_list=[]
temp_max=[]
for row in sheet.iter_rows(min_row=3, min_col=10, max_row=508, max_col=13):
    print(row)
    for cell in row:
        temp_list.append(cell.value)
    print(temp_list)
    temp_max.append(max(temp_list))
    temp_list=[]
I would also like to be able to print the string of the header of the column that contains the max value for the cell range desired. My data structure looks like this :
Any idea on how to do so?
Thanks!
This seems like a typical INDEX/MATCH Excel problem.
Have you tried retrieving the index for the max value in each temp_list?
You can use a function like numpy.argmax() to get the index of your max value within your "temp_list" array, then use this index to locate the header and append the string to a new list called, say, "max_headers" which contains all the header strings in order of appearance.
It would look something like this
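for cell in row:
    temp_list.append(cell.value)
i_max = int(np.argmax(temp_list))  # 0-based position of the first max in this row
max_headers.append(sheet.cell(row=1, column=10 + i_max).value)  # 10 = min_col of your range
And so on and so forth. np.argmax accepts a plain Python list, so temp_list needs no conversion, but numpy has to be imported as np and the max_headers list has to be defined before the loop. Because argmax counts from 0 within the range, the first column of the range (10 here) is added to get the worksheet column.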
First, Thanks Bernardo for the hint. I found a decently working solution but still have a little issue. Perhaps someone can be of assistance.
Let me amend my initial statement : here is the code I am working with now :
temp_list=[]
headers_list=[]
# Index starts at 1. Here we set the rows/columns containing the data to be analyzed.
for row in sheet.iter_rows(min_row=3, min_col=27, max_row=508, max_col=32):
    for cell in row:
        temp_list.append(cell.value)
    for cell in row:
        if cell.value == max(temp_list):
            print(str(cell.column))
            print(cell.value)
            print(sheet.cell(row=1, column=cell.column).value)
            headers_list.append(sheet.cell(row=1,column=cell.column).value)
        else:
            print('keep going.')
    temp_list = []
This loop works, but has one issue: if a row contains the same maximum value twice (e.g. 25, 9, 25, 8, 9), it prints two headers instead of one. My question is:
How can I get this loop to take into account only the first match of the max value in a row?
You probably want something like this:
headers = [c for c in next(ws.iter_rows(min_col=27, max_col=32, min_row=1, max_row=1, values_only=True))]
for row in ws.iter_rows(min_row=3, min_col=27, max_row=508, max_col=32, values_only=True):
    mx = max(row)
    idx = row.index(mx)  # index() returns the position of the first match only
    col = headers[idx]
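The first-match behaviour of `index()` can be checked on plain tuples, which is the form `iter_rows(..., values_only=True)` yields. The header names and values below are invented for illustration, and the sketch assumes every cell holds a number: a blank cell arrives as `None`, and `max()` raises a `TypeError` when asked to compare `None` with a number.

```python
# Hypothetical header row and data rows, shaped like values_only=True output
headers = ("Q1", "Q2", "Q3", "Q4", "Q5")
rows = [
    (25, 9, 25, 8, 9),  # the max (25) appears twice
    (1, 7, 3, 7, 2),    # the max (7) appears twice
]

max_headers = []
for row in rows:
    mx = max(row)
    idx = row.index(mx)  # position of the first occurrence of the max
    max_headers.append(headers[idx])

print(max_headers)
```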

manipulating excel spreadsheets with python

I am new to Python, especially when it comes to using it with Excel. I need to write code that searches for the string “Mac”, “Asus”, “AlienWare”, “Sony”, or “Gigabit” within a longer string in each cell of column A. Depending on which of these strings it finds, it should write that string to the corresponding row in column C; if it finds none of the five, it should write “Other” instead. For example, if cell A2 contained the string “ProLiant Asus DL980 G7”, the correct code would write “Asus” to cell C2. It should do this for every single cell in column A. Every cell in column A is expected to contain one of the five strings Mac, Asus, AlienWare, Sony, or Gigabit; for any cell that does not, I want “Other” written to column C. So far, this is the code that I have (not much at all):
import openpyxl
wb = openpyxl.load_workbook(path)
sheet = wb.active
for i in range(1, sheet.max_row + 1):  # openpyxl rows are 1-based
    cell1 = sheet.cell(row=i, column=1)
    cell2 = sheet.cell(row=i, column=3)
    # missing code here
wb.save(path)
You haven't tried writing any code to solve the problem. You might want to first get openpyxl to write to the Excel workbook and verify that this works, even if it's only dummy data.
Once that is working all you'd need is a simple function that takes in a string as an argument.
def get_column_c_value(string_from_column_a):
    if "Lenovo" in string_from_column_a:
        return "Lenovo"
    elif "HP" in string_from_column_a:
        return "HP"
    # add the strings you need to check for here, in the same format as above
    else:
        return "Other"
Try out those and if you have any issues let me know where you're getting stuck.
I have not worked much with openpyxl, but it sounds like you are trying to do a simple string search.
You can access individual cells by using
cell1.internal_value
Then, your if/else statement would look something like
if "HP" in str(cell1.internal_value):
Data can be assigned directly to a cell, so you could have
ws['C' + str(i)] = "HP"
where ws is your worksheet and i is the row number. You could do this for every row of your data.
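Putting the two answers together, the lookup could be sketched as one small function that loops over the five names from the question. The name `classify_brand` and the `BRANDS` tuple are illustrative choices, not part of either answer, and the match is case-sensitive, which is an assumption about the data.

```python
# The five strings from the question, checked in this order
BRANDS = ("Mac", "Asus", "AlienWare", "Sony", "Gigabit")

def classify_brand(text):
    """Return the first brand found in text, or "Other" if none is present."""
    text = "" if text is None else str(text)  # empty cells come back as None
    for brand in BRANDS:
        if brand in text:  # case-sensitive substring test
            return brand
    return "Other"

print(classify_brand("ProLiant Asus DL980 G7"))
print(classify_brand("Some unknown machine"))
```

Inside the loop from the question, the result could then be stored with `sheet.cell(row=i, column=3).value = classify_brand(sheet.cell(row=i, column=1).value)`.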

write comma delimited text to excel

I'm iterating through a bunch of SDE's, then iterating through those and producing feature class titles, and it's in a list. I then perform a list.replace() and turn it into a comma delimited string.
So, I want to take that delimited string such as:
SDE_name1,thing1,thing2,thing3,thing4
SDE_name2,thing1,thing2,thing3,thing4
SDE_name3,thing1,thing2,thing3,thing4
and insert it into excel using XLWT
I can get it to write the entire length into one cell, with or without the commas,
but I'd like it to write each item into a new column...
So column A would have SDE_name1
Column B would have thing1
Column C would have thing2 etc etc etc
So far I've tried:
listrow=2
listcol=0
for row in list:
    worksheet.write(listrow,listcol,row,style)
    listrow+=1
wbk.save(bookname)
and
listrow=2
listcol=0
list=something.split(anotherlist,delimiter=",")
for row in list:
    worksheet.write(listrow,listcol,row,style)
    listrow+=1
wbk.save(bookname)
So both with and without a delimiter. Either way, everything ends up in one column: each item in the list is written to a new row, but I need each item after a comma to be written to a new column.
Any idea?
You are not iterating through the columns. As it's not clear exactly what the variables in your example refer to, assume you start with the following
data = ["SDE_name1,thing1,thing2,thing3,thing4",
        "SDE_name2,thing1,thing2,thing3,thing4",
        "SDE_name3,thing1,thing2,thing3,thing4"]
The code to write the Excel file becomes:
rowCount = 2
for row in data:  # row = "SDE_name1,thing1,thing2,thing3,thing4"
    colCount = 0
    for column in row.split(","):  # column = "SDE_name1", then "thing1", "thing2", ...
        worksheet.write(rowCount, colCount, column, style)
        colCount += 1
    rowCount += 1
wbk.save(bookname)
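The row and column bookkeeping can also be left to `enumerate`. The sketch below only computes the (row, column, value) triples and does not call xlwt at all, so it runs without a workbook; the helper name `cells_from_lines` is made up for this example.

```python
data = ["SDE_name1,thing1,thing2,thing3,thing4",
        "SDE_name2,thing1,thing2,thing3,thing4",
        "SDE_name3,thing1,thing2,thing3,thing4"]

def cells_from_lines(lines, first_row=2):
    """Yield (row, column, value) for every comma-separated item."""
    for row_idx, line in enumerate(lines, start=first_row):
        for col_idx, value in enumerate(line.split(",")):
            yield row_idx, col_idx, value

cells = list(cells_from_lines(data))
print(cells[:3])
```

Each triple can then be passed on as `worksheet.write(row_idx, col_idx, value, style)`, followed by a single `wbk.save(bookname)` once the loop has finished.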
