I am new to Python but have experience with C, sh, and Java.
My first homework is CSV field replacement: changing a field when it has a specific value.
My CSV file has 24 columns and uses ~ as the field separator.
I want to change the 24th field to T if it is True.
My code looks like this:
import csv
with open('MyCustomerList.csv', 'r', encoding='utf-8') as fi:
    reader = csv.reader(fi, delimiter='~')
    for row in reader:
        if row[23] == 'True':
            print(row[23])
But it gives an error like this:
C:\>c:\Python34\python.exe pyt1.py
Traceback (most recent call last):
File "pyt1.py", line 5, in <module>
if row[23] == 'True':
IndexError: list index out of range
I could not figure out the issue.
What is the error?
The error 'list index out of range' means that you tried to access an element past the end of the list. I assume your file is supposed to have 24 values per row, so first try printing every row with its index and length:
for i, row in enumerate(reader):
    print(i, len(row), row)
and check that every row really has a field at index 23 (indexing starts from 0).
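As a rough sketch of the replacement the question asks for, the loop below guards the index with a length check so that short rows cannot raise IndexError. The sample data, the three-field width and the helper name replace_last_true are made up for illustration; the real file would use n_fields=24.

```python
import csv
import io

# Hypothetical sample: three '~'-separated fields instead of 24,
# with one short row like the one that triggers the IndexError.
sample = "a~b~True\nshort~row\nc~d~False\n"


def replace_last_true(lines, n_fields, delimiter="~"):
    """Rewrite field number n_fields from 'True' to 'T'.

    Rows with fewer than n_fields fields are passed through unchanged.
    """
    out = []
    for row in csv.reader(lines, delimiter=delimiter):
        if len(row) >= n_fields and row[n_fields - 1] == "True":
            row[n_fields - 1] = "T"
        out.append(row)
    return out


rows = replace_last_true(io.StringIO(sample), n_fields=3)
for number, row in enumerate(rows, start=1):
    # Printing the length makes any short row easy to spot.
    print(number, len(row), row)
```

With a real file, the io.StringIO object would be replaced by the file object returned by open().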
I am trying to find a few items in a CSV file. When I run the code, it sometimes works but sometimes produces the error list index out of range.
def find_check_in(name,date):
    x = 0
    f = open('employee.csv','r')
    reader = csv.reader(f, delimiter=',')
    for row in reader:
        id = row[0]
        dt = row[1]
        v = row[2]
        a = datetime.strptime(dt,"%Y-%m-%d")
        if v == "Check-In" and id=="person":
            x = 1
    f.close()
    return x
Traceback (most recent call last):
File "", line 51, in
x=find_check_in(name,date)
File "", line 21, in find_check_in
id = row[0]
IndexError: list index out of range
Your CSV file contains blank lines. For each of them row becomes an empty list, which has no index 0, hence the error. Make sure your input CSV has no blank lines, or add a condition to process the row only if it isn't empty:
for row in reader:
    if row:
        # the rest of your code
It seems that reader is returning a row with no elements. Does your data contain any such rows? Or perhaps you need to pass newline='' to open() when opening the file, as the csv documentation recommends?
https://docs.python.org/3/library/csv.html#csv.reader
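A minimal sketch of the "skip empty rows" idea from the answers above, using in-memory sample data so it runs on its own. The sample contents and the helper name non_empty_rows are assumptions for illustration, not part of the original code.

```python
import csv
import io

# Hypothetical sample with a blank line in the middle.
sample = "person,2020-01-01,Check-In\n\nother,2020-01-02,Check-Out\n"


def non_empty_rows(lines):
    """Yield only the rows that contain at least one field."""
    for row in csv.reader(lines):
        if row:  # a blank line is parsed as an empty list
            yield row


rows = list(non_empty_rows(io.StringIO(sample)))
for row in rows:
    print(row[0], row[2])
```

When reading from disk, the csv documentation suggests opening the file with open(path, newline='') and passing that file object to the helper.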
The following is the content of my input CSV file
file3.csv:
a,ab
b,cd
c,nav
d,test
name,port
I want to write this into an existing CSV file, in specific columns.
For example:
I want to write a, b, c, d, name into column AA
and ab, cd, nav, test, port into column AB.
Python Script:
import csv
f1 = open ("file3.csv","r") # open input file for reading
with open('file4.csv', 'wb') as f: # output csv file
    writer = csv.writer(f)
    with open('file3.csv','r') as csvfile: # input csv file
        reader = csv.reader(csvfile, delimiter=',')
        for row in reader:
            row[7] = f1.readline() # edit the 8th column
            writer.writerow(row)
f1.close()
I am getting the following error:
MacBook-Pro:test$ python three.py
Traceback (most recent call last):
File "three.py", line 10, in
row[7] = f1.readline() # edit the 8th column
IndexError: list assignment index out of range
You cannot assign to a list index that does not exist yet. You need to extend the row to the required length before assigning elements to specific indices.
If you want to assign to row[7], do this first:
if len(row) < 8:
    row += [None] * (8 - len(row))
So, your inner loop will likely need to look something like:
for row in reader:
    if len(row) < 8:
        row += [None] * (8 - len(row))
    new_values = f1.readline().strip().split(',')
    row[7:7+1+len(new_values)] = new_values
    writer.writerow(row)
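The padding step can be pulled out into a small helper. This is only a sketch: the name pad_row and the empty-string fill value are assumptions chosen for illustration (the answer above pads with None, which csv.writer also writes as an empty field).

```python
def pad_row(row, width, fill=""):
    """Return a copy of row extended with fill values up to width items.

    Rows that are already long enough are returned unchanged in content.
    """
    return row + [fill] * (width - len(row))


# A two-field row cannot be assigned at index 7 until it is padded.
row = pad_row(["a", "ab"], 8)
row[7] = "new value"
print(row)
```

Multiplying a list by a negative number yields an empty list, so the helper needs no explicit length check.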
I am 99% of the way there...
def xl_to_csv(xl_file):
    wb = xlrd.open_workbook(xl_file)
    sh = wb.sheet_by_index(0)
    output = 'output.csv'
    op = open(output, 'wb')
    wr = csv.writer(op, quoting=csv.QUOTE_ALL)
    for rownum in range(sh.nrows):
        part_number = sh.cell(rownum,1)
        #wr.writerow(sh.row_values(rownum)) #writes entire row
        wr.writerow(part_number)
    op.close()
using wr.writerow(sh.row_values(rownum)) I can write the entire row from the Excel file to a CSV, but there are like 150 columns and I only want one of them. So, I'm grabbing the one column that I want using part_number = sh.cell(rownum,1), but I can't seem to get the syntax correct to just write this variable out to a CSV file.
Here's the traceback:
Traceback (most recent call last):
File "test.py", line 61, in <module>
xl_to_csv(latest_file)
File "test.py", line 32, in xl_to_csv
wr.writerow(part_number)
_csv.Error: sequence expected
Try this:
wr.writerow([part_number.value])
The argument must be a list-like object.
The quickest fix is to throw your partnum in a list (and as per Abdou you need to add .value to get the value out of a cell):
for rownum in range(sh.nrows):
    part_number = sh.cell(rownum,1).value # added '.value' to get value from cell
    wr.writerow([part_number]) # added brackets to give writerow the list it wants
More generally, you can use a list comprehension to grab the columns you want:
cols = [1, 8, 110]
for rownum in range(sh.nrows):
    wr.writerow([sh.cell(rownum, colnum).value for colnum in cols])
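The same column-selection pattern can be tried without an Excel file. The sketch below stands in a plain list of lists for the worksheet and writes to an in-memory buffer; the sample table and the variable names are assumptions for illustration only.

```python
import csv
import io

# Hypothetical stand-in for the worksheet: each inner list is one row.
table = [
    ["id1", "PN-100", "x"],
    ["id2", "PN-200", "y"],
]
cols = [1]  # indices of the columns to keep

buffer = io.StringIO()
writer = csv.writer(buffer, quoting=csv.QUOTE_ALL)
for record in table:
    # writerow wants a sequence of fields, so a single value
    # still has to be wrapped in a list.
    writer.writerow([record[c] for c in cols])

output = buffer.getvalue()
print(output)
```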
I have a CSV that looks something like this:
F02303521,"Smith,Andy",GHI,"Smith,Andy",GHI,,,
F04300621,"Parker,Helen",CERT,"Yu,Betty",IOUS,,,
I want to delete all the lines where the 2nd column equals the 4th column (e.g. when Smith,Andy = Smith,Andy). I tried to do this in Python by using " as the delimiter and splitting the columns into:
F02303521, Smith,Andy ,GHI, Smith,Andy ,GHI,,,
I tried this python code:
testCSV = 'test.csv'
deletionText = 'linestodelete.txt'
correct = 'correctone.csv'
i = 0
j = 0 #where i & j keep track of line number
with open(deletionText,'w') as outfile:
    with open(testCSV, 'r') as csv:
        for line in csv:
            i = i + 1 #on the first line, i will equal 1.
            PI = line.split('"')[1]
            investigator = line.split('"')[3]
            #if they equal each other, write that line number into the text file
            #as to be deleted.
            if PI == investigator:
                outfile.write(i)
#From the TXT, create a list of line numbers you do not want to include in output
with open(deletionText, 'r') as txt:
    lines_to_be_removed_list = []
    # for each line number in the TXT
    # remove the return character at the end of line
    # and add the line number to list domains-to-be-removed list
    for lineNum in txt:
        lineNum = lineNum.rstrip()
        lines_to_be_removed_list.append(lineNum)
with open(correct, 'w') as outfile:
    with open(deletionText, 'r') as csv:
        # for each line in csv
        # extract the line number
        for line in csv:
            j = j + 1 # so for the first line, the line number will be 1
            # if csv line number is not in lines-to-be-removed list,
            # then write that to outfile
            if (j not in lines_to_be_removed_list):
                outfile.write(line)
but for this line:
PI = line.split('"')[1]
I get:
Traceback (most recent call last):
File "C:/Users/sskadamb/PycharmProjects/vastDeleteLine/manipulation.py", line 11, in
PI = line.split('"')[1]
IndexError: list index out of range
I thought it would give PI = Smith,Andy and investigator = Smith,Andy. Why does that not happen?
Any help would be greatly appreciated, thanks!
When you think csv, think pandas, which is a great data analysis library for Python. Here's how to accomplish what you want:
import pandas as pd
fields = ['field{}'.format(i) for i in range(8)]
df = pd.read_csv("data.csv", header=None, names=fields)
df = df[df['field1'] != df['field3']]
print(df)
This prints:
      field0        field1 field2    field3 field4  field5  field6  field7
1  F04300621  Parker,Helen    CERT  Yu,Betty    IOUS     NaN     NaN     NaN
Try splitting on the comma, not the quote.
x.split(",")
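One caveat with a plain split on commas is that it would also cut the quoted names such as "Smith,Andy" in two. As a sketch of an alternative using only the standard library, csv.reader treats a quoted field as a single value, so the two name columns can be compared directly. The sample lines are copied from the question; the variable names are illustrative.

```python
import csv
import io

sample = (
    'F02303521,"Smith,Andy",GHI,"Smith,Andy",GHI,,,\n'
    'F04300621,"Parker,Helen",CERT,"Yu,Betty",IOUS,,,\n'
)

# Keep a row unless its 2nd and 4th columns are equal.
# Rows with fewer than four fields are kept as they are.
kept = [
    row
    for row in csv.reader(io.StringIO(sample))
    if len(row) < 4 or row[1] != row[3]
]

for row in kept:
    print(row[0], row[1], row[3])
```

Passing the kept rows to csv.writer would write them back out with the quoting restored where a field contains a comma.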
I am developing a simple application that reads a CSV file and produces some results based on the data points in the columns. Data.csv:
Something, everything, 6, xy
Something1, everything1, 7, ab
Something2, everything2, 9, pq
I open the file as follows:
FileOpen = open('../sources/data.csv', 'rU')
FileRead = csv.reader(FileOpen, delimiter = ',')
FileRead.next()
for row in FileRead:
    #This does not work
    if row[0] == 'something' and row[1] == 'something1':
        print row[2]
    #This works
    if row[0] == 'something' and row[3] == 'xy':
        print row[2]
The above code does not print anything. But if I use row[0] and row[3] in the if condition, it works well. So the problem seems to be with columns 1 and 2, while columns 0 and 3 work fine. Is the CSV file format wrong? I followed Microsoft's procedure to create the CSV from an Excel file.
The use and naming of row is completely correct. The main problem is the white space in your file. If I print row, I get
['Something', ' everything', ' 6']
               ^              ^
The solution will most likely deal with
Dialect.skipinitialspace
When True, whitespace immediately following the delimiter is ignored. The default is False.
from here https://docs.python.org/2/library/csv.html#dialects-and-formatting-parameters
You pass this option in the constructor like this:
FileRead = csv.reader(FileOpen, delimiter = ',', skipinitialspace=True)
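To see the effect of this option in isolation, here is a small sketch that parses one sample line from the question twice, once with the default settings and once with skipinitialspace=True. The variable names are illustrative.

```python
import csv
import io

sample = "Something, everything, 6, xy\n"

# Default behaviour: the space after each comma stays in the field.
plain = next(csv.reader(io.StringIO(sample)))

# With skipinitialspace=True the space after the delimiter is dropped.
stripped = next(csv.reader(io.StringIO(sample), skipinitialspace=True))

print(plain)
print(stripped)
```

With the default settings a comparison such as row[1] == 'everything' fails, because the parsed field is ' everything' with a leading space.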
Yes, it was the spaces after all. To remove spaces in Excel, insert a new column next to the column with the spaces and use =TRIM(C1). Then you can copy and paste the data into a new file and create a CSV from that.