I am having trouble writing modified values to an output CSV file. I want to read the CSV file and change every value equal to 'Never-worked' or 'Without-pay' to 'Not-working'. My code makes the replacement, but the changed values are not written back: the output file is identical to the original. What am I doing wrong?
import csv
infile = open('income.csv','rb')
outfile = open('income-new.csv', 'wb')
def changeOccupation(cell):
    if (cell.lower() == 'never-worked' or cell.lower() == 'without-pay'):
        cell = 'Not-working'
        print(cell)
#go through each line of the file
for line in infile:
    row = line.split(',')
    for cell in row:
        changeOccupation(cell)
    #print(row)
    outfile.write(','.join(row))
infile.close()
outfile.close()
You need to return the new value and write the changed row back:
def changeOccupation(cell):
    if (cell.lower() == 'never-worked' or cell.lower() == 'without-pay'):
        return 'Not-working'
    return cell
#go through each line of the file
for line in infile:
    row = line.split(',')
    new_row = [changeOccupation(cell) for cell in row]
    outfile.write(','.join(new_row))
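For comparison, the same replacement can be sketched with the csv module, which takes care of quoted fields and line endings. This is a minimal Python 3 sketch, not the asker's original code: the helper name `rewrite_occupations` and the sample rows are hypothetical.

```python
import csv
import io

def change_occupation(cell):
    """Return 'Not-working' for the two target values, else the cell unchanged."""
    if cell.lower() in ('never-worked', 'without-pay'):
        return 'Not-working'
    return cell

def rewrite_occupations(infile, outfile):
    """Copy CSV rows from infile to outfile, replacing matching cells."""
    reader = csv.reader(infile)
    writer = csv.writer(outfile, lineterminator='\n')
    for row in reader:
        writer.writerow([change_occupation(cell) for cell in row])

# In-memory demonstration; the sample rows are made up for illustration.
source = io.StringIO('39,Never-worked,Bachelors\n50,Private,HS-grad\n')
target = io.StringIO()
rewrite_occupations(source, target)
result = target.getvalue()
```

With real files, the same function could be called on handles opened with `open('income.csv', newline='')` and `open('income-new.csv', 'w', newline='')`, as the csv documentation recommends.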
Related
I have a CSV file that looks like this:
I have a column called “Inventory”. I pulled its data from another source, which put it in a dictionary format, as you can see.
I need to iterate through the 1000+ lines: if the keywords comforter, sheets and pillow all exist in a row, write “bedding” to the “Location” column for that row; otherwise write “home-fashions”.
I have gotten as far as the if statement telling me whether a row belongs in bedding or “home-fashions”; I just do not know how to write the corresponding result to the “Location” field for that line.
In my script I'm only printing to see my results, but in the end I want to write to the same CSV file.
from csv import DictReader
with open('test.csv', 'r') as read_obj:
    csv_dict_reader = DictReader(read_obj)
    for line in csv_dict_reader:
        if 'comforter' in line['Inventory'] and 'sheets' in line['Inventory'] and 'pillow' in line['Inventory']:
            print('Bedding')
            print(line['Inventory'])
        else:
            print('home-fashions')
            print(line['Inventory'])
The last column of your CSV contains commas. If that column is not quoted, DictReader will split it at those commas, so you cannot read it correctly that way.
import re
data = []
with open('test.csv', 'r') as f:
    # Get the header row
    header = next(f).strip().split(',')
    for line in f:
        # Parse 4 columns
        row = re.findall('([^,]*),([^,]*),([^,]*),(.*)', line)[0]
        # Create a dictionary of one row
        item = {header[0]: row[0], header[1]: row[1], header[2]: row[2],
                header[3]: row[3]}
        # Add each row to the list
        data.append(item)
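As an alternative to the regular expression, and under the same assumption that only the last column contains commas, `str.split` with a `maxsplit` argument keeps the extra commas in the final field. The helper name `parse_line`, the header names other than "Location" and "Inventory", and the sample line are hypothetical.

```python
def parse_line(line, header):
    """Split a line into len(header) fields; extra commas stay in the last field."""
    values = line.rstrip('\n').split(',', len(header) - 1)
    return dict(zip(header, values))

# Hypothetical header and data line, for illustration only.
header = ['ID', 'Name', 'Location', 'Inventory']
item = parse_line("1,Store A,,{'comforter': 2, 'sheets': 4}\n", header)
```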
After preparing your data, you can check with your conditions.
for item in data:
    if all([x in item['Inventory'] for x in ['comforter', 'sheets', 'pillow']]):
        item['Location'] = 'Bedding'
    else:
        item['Location'] = 'home-fashions'
Write output to a file.
import csv
# newline='' stops the csv module from writing extra blank lines on Windows
with open('output.csv', 'w', newline='') as f:
    dict_writer = csv.DictWriter(f, data[0].keys())
    dict_writer.writeheader()
    dict_writer.writerows(data)
csv.DictReader returns a dict, so just assign the new value to the column:
if 'comforter' in line['Inventory'] and ...:
    line['Location'] = 'Bedding'
else:
    line['Location'] = 'home-fashions'
print(line['Inventory'])
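Putting the assignment together with a writer might look like the sketch below. It assumes the "Inventory" column is quoted in the file so that DictReader keeps it in one field; the function name `assign_locations`, the "ID" column, and the sample rows are hypothetical.

```python
import csv
import io

KEYWORDS = ('comforter', 'sheets', 'pillow')

def assign_locations(infile, outfile):
    """Read rows with DictReader, fill in 'Location', write them with DictWriter."""
    reader = csv.DictReader(infile)
    writer = csv.DictWriter(outfile, fieldnames=reader.fieldnames,
                            lineterminator='\n')
    writer.writeheader()
    for row in reader:
        if all(word in row['Inventory'] for word in KEYWORDS):
            row['Location'] = 'Bedding'
        else:
            row['Location'] = 'home-fashions'
        writer.writerow(row)

# In-memory demonstration with made-up rows.
sample = io.StringIO(
    "ID,Location,Inventory\n"
    "1,,\"{'comforter': 1, 'sheets': 2, 'pillow': 4}\"\n"
    "2,,{'towel': 3}\n"
)
output = io.StringIO()
assign_locations(sample, output)
lines = output.getvalue().splitlines()
```

To end up with the changes in the same CSV file, one common approach is to write to a temporary file and then replace the original, since reading and rewriting one file through the same open handle is error-prone.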
I am trying to find a few items in a CSV file. When I run the code, it sometimes works but sometimes fails with the error "list index out of range".
def find_check_in(name,date):
    x = 0
    f = open('employee.csv','r')
    reader = csv.reader(f, delimiter=',')
    for row in reader:
        id = row[0]
        dt = row[1]
        v = row[2]
        a = datetime.strptime(dt,"%Y-%m-%d")
        if v == "Check-In" and id=="person":
            x = 1
    f.close()
    return x
Traceback (most recent call last):
  File "", line 51, in
    x=find_check_in(name,date)
  File "", line 21, in find_check_in
    id = row[0]
IndexError: list index out of range
Your CSV file contains blank lines, resulting in row becoming an empty list, in which case there is no index 0, hence the error. Make sure your input CSV has no blank line, or add a condition to process the row only if it isn't empty:
for row in reader:
    if row:
        # the rest of your code
It seems that reader is returning a row with no elements. Does your data contain any such rows? Or perhaps you need to pass the newline='' argument to open() when opening the file you hand to reader?
https://docs.python.org/3/library/csv.html#csv.reader
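A minimal sketch of the guard described above, which skips blank or short rows before indexing into them. The signature differs from the question's function (it takes an open file and a person ID instead of `name` and `date`), and the sample data is made up.

```python
import csv
import io

def find_check_in(csv_file, person):
    """Return 1 if any row records a 'Check-In' for person, else 0."""
    for row in csv.reader(csv_file):
        if len(row) < 3:
            continue  # blank or incomplete row: nothing to index
        if row[0] == person and row[2] == 'Check-In':
            return 1
    return 0

# The blank line in the middle would raise IndexError without the guard.
data = io.StringIO('alice,2020-01-01,Check-In\n\nbob,2020-01-01,Check-Out\n')
found = find_check_in(data, 'alice')
```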
I have a CSV file with contents:
scenario1,5,dosomething
scenario2,10,donothing
scenario3,8,dosomething
scenario4,5,donothing
I would like to take the contents of a variable to firstly see if it is in the first column, if true - I would like to get the row number where it is found and the entire line contents. There will be no duplicate values in column 1 of the csv.
I can partly do the first step which is to find if the variable is in the csv, returning the whole line.
import csv
configfile = csv.reader(open('/file.csv', "rb"), delimiter=",")
v = 'scenario1'
for row in configfile:
    if v in row[0]:
        print row
The results I receive would be:
['scenario1','5','dosomething']
But I need assistance with the second part please. This is to find the row number.
Try this:
import csv
with open("ooo.csv", "r") as f:
    reader = csv.reader(f)
    for line_num, content in enumerate(reader):
        if content[0] == "scenario1":
            print content, line_num + 1
Or without csv module:
with open("ooo.csv") as f:
    for l, i in enumerate(f):
        data = i.split(",")
        if data[0] == "scenario1":
            print data, l + 1
Output:
['scenario1', '5', 'dosomething'] 1
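The same lookup can be wrapped in a small function that returns both the row number and the row. This is a Python 3 sketch; the function name `find_row` is hypothetical, and `enumerate(..., start=1)` replaces the manual `+ 1`.

```python
import csv
import io

def find_row(csv_file, key):
    """Return (line_number, row) for the first row whose first column equals key."""
    for line_num, row in enumerate(csv.reader(csv_file), start=1):
        if row and row[0] == key:
            return line_num, row
    return None  # key not found

data = io.StringIO(
    'scenario1,5,dosomething\n'
    'scenario2,10,donothing\n'
    'scenario3,8,dosomething\n'
)
match = find_row(data, 'scenario3')
```

Because column 1 has no duplicates, returning at the first match is enough.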
I'm trying to make a small Python script to speed up things at work and have a small script kind of working, but it's not working as I want it to. Here's the current code:
import re
import csv
#import pdb
#pdb.set_trace()
# Variables
newStock = "newStock.csv" #csv file with list of new stock
allActive = "allActive.csv" #csv file with list of all active
skusToCheck= []
totalNewProducts = 0
i = 0
# Program Start - Open first csv
a = open(newStock)
csv_f = csv.reader(a)
# Copy each row into array thingy
for row in csv_f:
    skusToCheck.append(row[0])
# Get length of array
totalNewProducts = len(skusToCheck)
# Open second csv
b = open(allActive)
csv_f = csv.reader(b)
# Open blank csv file to write to
csvWriter = csv.writer(open('writeToMe.csv', 'w'), delimiter=',', quotechar='|', quoting=csv.QUOTE_MINIMAL)
# Check first value in first row,first file against each entry in 2nd row in second file
with open(allActive, 'rt') as b:
    reader = csv.reader(b, delimiter=",")
    for row in reader:
        if skusToCheck[i] == row[1]:
            print(skusToCheck[i]) # output to screen for debugging
            print(row) # debugging
            csvWriter.writerow(row) #write matching row to new file
            i += 1 # increment where we are in the first file
Pseudo code would be:
Open file one and store all values from column one in skusToCheck
Check this value against values in column 2 in file 2
If it finds a match, (once I have this working, i want it to look for partial matches too) copy the row to file 3
If not move onto the next value in skusToCheck and repeat
I can't seem to get lines 33 - 40 to loop. It will check the first value and find a match in the second file, but won't move onto the next value from skusToCheck.
You need to follow the hint from jonrsharpe's first comment, i.e. modify your loop to:
# Check first value in first row,first file against each entry in 2nd row in second file
with open(allActive, 'rt') as b:
    reader = csv.reader(b, delimiter=",")
    for row in reader:
        if len(row)>1:
            for sku in skusToCheck:
                if sku == row[1]:
                    print(sku) # output to screen for debugging
                    print(row) # debugging
                    csvWriter.writerow(row) #write matching row to new file
                    break
For each row in allActive, this checks every SKU in skusToCheck and writes the row as soon as one of them matches.
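When only exact matches are needed, the inner loop can be replaced by a set lookup. The sketch below is one possible variant, not the answerer's code; the function name `matching_rows` and the sample SKUs are made up.

```python
import csv
import io

def matching_rows(new_stock, all_active):
    """Yield rows of all_active whose second column is a SKU listed in new_stock."""
    skus = {row[0] for row in csv.reader(new_stock) if row}
    for row in csv.reader(all_active):
        if len(row) > 1 and row[1] in skus:
            yield row

# In-memory stand-ins for newStock.csv and allActive.csv.
new_stock = io.StringIO('A100\nB200\n')
all_active = io.StringIO('1,A100,lamp\n2,C300,chair\n3,B200,desk\n')
matches = list(matching_rows(new_stock, all_active))
```

A set only supports exact membership, so partial matches would still need a loop over the SKUs with a test such as `sku in row[1]`.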
The file unique.txt contains 2 tab-separated columns; total.txt contains 3 tab-separated columns.
I take each row from unique.txt and look for it in total.txt. If it is present, I extract the entire row from total.txt and save it in a new output file.
###Total.txt
column a column b column c
interaction1 mitochondria_205000_225000 mitochondria_195000_215000
interaction2 mitochondria_345000_365000 mitochondria_335000_355000
interaction3 mitochondria_345000_365000 mitochondria_5000_25000
interaction4 chloroplast_115000_128207 chloroplast_35000_55000
interaction5 chloroplast_115000_128207 chloroplast_15000_35000
interaction15 2_10515000_10535000 2_10505000_10525000
###Unique.txt
column a column b
mitochondria_205000_225000 mitochondria_195000_215000
mitochondria_345000_365000 mitochondria_335000_355000
mitochondria_345000_365000 mitochondria_5000_25000
chloroplast_115000_128207 chloroplast_35000_55000
chloroplast_115000_128207 chloroplast_15000_35000
mitochondria_185000_205000 mitochondria_25000_45000
2_16595000_16615000 2_16585000_16605000
4_2785000_2805000 4_2775000_2795000
4_11395000_11415000 4_11385000_11405000
4_2875000_2895000 4_2865000_2885000
4_13745000_13765000 4_13735000_13755000
My program:
file=open('total.txt')
file2 = open('unique.txt')
all_content=file.readlines()
all_content2=file2.readlines()
store_id_lines = []
ff = open('match.dat', 'w')
for i in range(len(all_content)):
    line=all_content[i].split('\t')
    seq=line[1]+'\t'+line[2]
    for j in range(len(all_content2)):
        if all_content2[j]==seq:
            ff.write(seq)
            break
Problem:
This does not give the desired output (the 1st-column values of the rows that fulfil the if condition). I need something like: if the jth row of unique.txt == the ith row of total.txt, then write the ith row of total.txt into the new file.
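import csv
with open('unique.txt') as uniques, open('total.txt') as total:
    # Both files are tab-separated, so the delimiter must be set explicitly
    uniques = list(tuple(line) for line in csv.reader(uniques, delimiter='\t'))
    totals = {}
    for line in csv.reader(total, delimiter='\t'):
        # Key each total.txt row by its last two columns
        totals[tuple(line[1:])] = line
with open('output.txt', 'w') as outfile:
    writer = csv.writer(outfile, delimiter='\t')
    for line in uniques:
        # Write only the rows that have a match in total.txt
        if line in totals:
            writer.writerow(totals[line])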
I would write your code this way:
file=open('total.txt')
list_file = list(file)
file2 = open('unique.txt')
list_file2 = list(file2)
store_id_lines = []
ff = open('match.dat', 'w')
for curr_line_total in list_file:
    line=curr_line_total.split('\t')
    seq=line[1]+'\t'+ line[2]
    if seq in list_file2:
        ff.write(curr_line_total)
Please avoid readlines() and use the with syntax when you open your files.
You don't need readlines() because iterating over a file object already yields its lines one at a time.
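A sketch of that advice, assuming both files are tab-separated as described in the question: the rows of unique.txt go into a set, and total.txt is scanned line by line without readlines(). The function name `matching_lines` is hypothetical, and the in-memory data is a small excerpt of the question's samples.

```python
import io

def matching_lines(total, unique):
    """Yield lines of total whose last two tab-separated columns appear in unique."""
    wanted = {tuple(line.rstrip('\n').split('\t')) for line in unique}
    for line in total:
        fields = line.rstrip('\n').split('\t')
        if tuple(fields[1:3]) in wanted:
            yield line

total = io.StringIO(
    'interaction1\tmitochondria_205000_225000\tmitochondria_195000_215000\n'
    'interaction15\t2_10515000_10535000\t2_10505000_10525000\n'
)
unique = io.StringIO(
    'mitochondria_205000_225000\tmitochondria_195000_215000\n'
    '4_2785000_2805000\t4_2775000_2795000\n'
)
matched = list(matching_lines(total, unique))
```

With real files, the handles could come from `with open('total.txt') as total, open('unique.txt') as unique, open('match.dat', 'w') as out:`, followed by `out.writelines(matching_lines(total, unique))`.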