python csv replace listitem - python

I have the following output from a CSV file:
word1|word2|word3|word4|word5|word6|01:12|word8
word1|word2|word3|word4|word5|word6|03:12|word8
word1|word2|word3|word4|word5|word6|01:12|word8
What I need to do is change the time string so it looks like 00:01:12.
My idea is to extract the list item [7] and add "00:" as a string to the front.
import csv
with open('temp', 'r') as f:
    reader = csv.reader(f, delimiter="|")
    for row in reader:
        fixed_time = (str("00:") + row[7])
        begin = row[:6]
        end = row[:8]
        print begin + fixed_time +end
I get this error message:
TypeError: can only concatenate list (not "str") to list.
I also had a look at this post:
how to change [1,2,3,4] to '1234' using python
I need to know whether my approach is the right way to solve this. Maybe I need to use split or something else.
Thanks for any help.

The line that's throwing the exception is
print begin + fixed_time +end
because begin and end are both lists and fixed_time is a string. Whenever you take a slice of a list (that's the row[:6] and row[:8] parts), a list is returned. If you just want to print it out, you can do
print begin, fixed_time, end
and you won't get an error.
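If you want a single list rather than three printed pieces, one option is to wrap the string in a list before concatenating. The sketch below is only an illustration on one sample row; it assumes the time is at index 6 (counting from 0), so the tail is taken as row[7:] rather than the row[:8] from the question. The name fixed_row is made up for this example.

```python
# One row as csv.reader would return it for the sample input
row = "word1|word2|word3|word4|word5|word6|01:12|word8".split("|")

# Slices are lists, so wrap the new string in a list before concatenating.
# The time sits at index 6 (indexing starts at 0), so the tail is row[7:].
fixed_row = row[:6] + ["00:" + row[6]] + row[7:]

print("|".join(fixed_row))
```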
Corrected code:
I'm opening a new file for writing (I'm calling it 'final', but you can call it whatever you want), and I'm just writing everything to it with the one modification. It's easiest to just change the one element of the list that holds the time (row[6] here), and use '|'.join to put a pipe character between the columns.
import csv
with open('temp', 'r') as f, open('final', 'w') as fw:
    reader = csv.reader(f, delimiter="|")
    for row in reader:
        # just change the element in the row to have the extra zeros
        row[6] = '00:' + row[6]
        # write the row back out, separated by | characters, and a new line.
        fw.write('|'.join(row) + '\n')
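As a variation on the same idea, csv.writer can produce the pipe-separated output instead of '|'.join, which also takes care of quoting if a field ever contains the delimiter. This is a minimal sketch, not a replacement for the code above: io.StringIO objects stand in for the 'temp' and 'final' files so it runs on its own, and lineterminator="\n" is my choice to match the plain newline used above.

```python
import csv
import io

# In-memory stand-ins for the input and output files
source = io.StringIO("word1|word2|word3|word4|word5|word6|01:12|word8\n")
out = io.StringIO()

reader = csv.reader(source, delimiter="|")
writer = csv.writer(out, delimiter="|", lineterminator="\n")
for row in reader:
    row[6] = "00:" + row[6]   # prefix the time column
    writer.writerow(row)      # the writer joins the fields with '|'

print(out.getvalue())
```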

You can use a regex for that:
>>> txt = """\
... word1|word2|word3|word4|word5|word6|01:12|word8
... word1|word2|word3|word4|word5|word6|03:12|word8
... word1|word2|word3|word4|word5|word6|01:12|word8"""
>>> import re
>>> print(re.sub(r'\|(\d\d:\d\d)\|', r'|00:\1|', txt))
word1|word2|word3|word4|word5|word6|00:01:12|word8
word1|word2|word3|word4|word5|word6|00:03:12|word8
word1|word2|word3|word4|word5|word6|00:01:12|word8

Related

Read and convert row in text file into list of string

I have a text file data.txt that contains 2 rows of text.
first_row_1 first_row_2 first_row_3
second_row_1 second_row_2 second_row_3
I would like to read the second row of the text file and convert the contents into a list of strings in Python. The list should look like this:
txt_list_str=['second_row_1','second_row_2','second_row_3']
Here is my attempted code:
import csv
with open('data.txt', newline='') as f:
    reader = csv.reader(f)
    row1 = next(reader)
    row2 = next(reader)
    my_list = row2.split(" ")
I got the error AttributeError: 'list' object has no attribute 'split'.
I am using Python 3.
EDIT: Thanks for all the answers. I am sure all of them work. But can someone tell me what is wrong with my own attempted code? Thanks.
The reason your code doesn't work is that you are calling split on a list, but it is a string method. csv.reader splits on commas by default, and your line has none, so row2 is a list whose single element is the whole line. In your example you would therefore use row2[0] to access that element.
my_list = row2[0].split(" ")
Alternatively, if you have access to the numpy library you can use loadtxt.
import numpy as np
f = np.loadtxt("data.txt", dtype=str, skiprows=1)
print (f)
# ['second_row_1' 'second_row_2' 'second_row_3']
The result of this is an array as opposed to a list. If you require a list, you can convert the array with tolist(), which also turns the elements into plain Python strings.
print(f.tolist())
# ['second_row_1', 'second_row_2', 'second_row_3']
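Another option, sketched here as an illustration only, is to keep csv.reader but tell it that the fields are separated by spaces, so the second row already comes back as the list you want. An io.StringIO object stands in for open('data.txt', newline='') so the snippet runs on its own.

```python
import csv
import io

# Stand-in for open('data.txt', newline=''); same two rows as in the question
f = io.StringIO(
    "first_row_1 first_row_2 first_row_3\n"
    "second_row_1 second_row_2 second_row_3\n"
)

reader = csv.reader(f, delimiter=" ")
next(reader)                  # skip the first row
txt_list_str = next(reader)   # the second row, already split into fields
print(txt_list_str)
```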
Use the built-in open function to open the file.
E.g.
>>> fp = open('temp.txt')
A file object is an iterator over its lines, so call next on it to read the first line and ignore it.
>>> next(fp)
'first_row_1 first_row_2 first_row_3\n'
Store the second line in a variable.
>>> second_line = next(fp)
>>> second_line
'second_row_1 second_row_2 second_row_3'
Use the string split method to get the items as a list. split takes an optional separator argument; if none is given, it splits on whitespace.
>>> second_line.split()
['second_row_1', 'second_row_2', 'second_row_3']
And finally close the file.
>>> fp.close()
Note: There are a number of ways to get the same output.
But you should first try the fix DavidG suggested in the comments.
with open("file.txt", "r") as f:
    next(f) # skipping first line; will work without this too
    for line in f:
        txt_list_str = line.split()
    print(txt_list_str)
Output
['second_row_1', 'second_row_2', 'second_row_3']

Using python to parse a log file with real case - how to skip lines in csv.reader for loop, how to use different delimiter

Here is a section of the log file I want to parse:
And here is the code I am writing:
import csv
with open('Coplanarity_PedCheck.log','rt') as tsvin, open('YX2.csv', 'wt') as csvout:
    read_tsvin = csv.reader(tsvin, delimiter='\t')
    for row in read_tsvin:
        print(row)
        filters = row[0]
        if "#Log File Initialized!" in filters:
            print(row)
            datetime = row[0]
            print("looklook",datetime[23:46])
            csvout.write(datetime[23:46]+",")
        BS = row[0]
        print("looklook",BS[17:21])
        csvout.write(datetime[17:21]+",")
        csvout.write("\n")
    csvout.close()
I need to get the date and time information from row1, then get "left" from row2, then need to skip section 4. How should I do it?
Since csv.reader makes row1 a list with only 1 element, I converted it to a string again to slice out the datetime info I need. But I think that is not efficient.
I did the same thing for row2. Then I want to skip rows 3-6, but I don't know how.
Also, csv.reader converts my float data into text, how can I convert them back before I write them into another file?
You are going to want to learn to use regular expressions.
For example, you could do something like this:
import csv
import re

with open('Coplanarity_PedCheck.log', 'rt') as tsvin, open('YX2.csv', 'wt') as csvout:
    read_tsvin = csv.reader(tsvin, delimiter='\t')
    # Get the first line and use the first field
    header = next(read_tsvin)[0]
    # Capture the text between square brackets
    m = re.search(r'\[([0-9: -]+)\]', header)
    datetime = m.group(1)
    csvout.write(datetime + ',')
    # Find if 'Left' is in line 2
    direction = next(read_tsvin)[0]
    m = re.search('Left', direction)
    if m:
        # If 'Left' is found, write the matched text
        csvout.write(m.group(0))
    csvout.write('\n')
    # Skip lines 3-6
    for i in range(4):
        next(read_tsvin)
    # Loop over the rest of the rows
    for row in read_tsvin:
        pass  # Parse the data here
# No explicit close() is needed: the with block closes both files
Modifying this to look for a line containing '#Log File Initialized!' rather than hard-coding the first line would be fairly simple using regular expressions. Take a look at the regular expression documentation.
This probably isn't exactly what you want to do, but rather a suggestion for a good starting point.
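The question about getting floats back was not covered above: csv.reader always returns strings, so each field has to be converted with float() before use. Here is a small sketch of both steps. The log lines are invented for the example, since the real file's layout is not shown in the question, so the pattern and the tab-separated data row are assumptions.

```python
import re

# Hypothetical log lines; the real file's layout is not shown in the question
lines = [
    "#Log File Initialized! [2016-03-01 10:15:30]",
    "Side: Left",
    "0.125\t0.250\t-0.375",
]

# Pull whatever sits between square brackets on the first line
match = re.search(r"\[([0-9: -]+)\]", lines[0])
timestamp = match.group(1) if match else ""

# Data row: split on tabs, then convert each text field to float
values = [float(field) for field in lines[2].split("\t")]

print(timestamp)
print(values)
```

When writing the numbers back out, they need to be turned into strings again, for example with str(value) or a format specification.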

Use python to parse values from ping output into csv

I wrote some code using RE to look for "time=" and save the following value in a string. Then I use the csv.writer method writerow, but each number is interpreted as a column, and this gives me trouble later. Unfortunately there is no 'writecolumn' method. Should I save the values as an array instead of a string and write every row separately?
import re
import csv
inputfile = open("ping.txt")
teststring = inputfile.read()
values = re.findall(r'time=(\d+.\d+)', teststring)
with open('parsed_ping.csv', 'wb') as csvfile:
    writer = csv.writer(csvfile, delimiter=' ',
                        quotechar='|', quoting=csv.QUOTE_MINIMAL)
    writer.writerow(values)
EDIT: I understood that "values" is already a list. I tried to iterate it and write a row for each item with
for item in values:
    writer.writerow(item)
Now I get a space after each character, like
4 6 . 6
4 7 . 7
EDIT2: The spaces are the delimiters. If I change the delimiter to a comma, I get commas between digits. I just don't get why it's interpreting each digit as a separate column.
If your csv file only contains one column, it's not really a "comma-separated file" anymore, is it?
Just write the list to the file directly:
import re
inputfile = open("ping.txt")
teststring = inputfile.read()
values = re.findall(r'time=(\d+\.\d+)', teststring)
with open('parsed_ping.csv', 'w') as csvfile:
    csvfile.write("\n".join(values))
I solved this. I just needed to use square brackets in the writer.
for item in values:
    writer.writerow([item])
This gives me the correct output.
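The reason the brackets matter is that writerow expects a sequence of fields, and a string is itself a sequence of characters, so each character becomes its own column. The sketch below shows both cases side by side; it writes to io.StringIO buffers instead of a file so it can be run as is, and the sample values are taken from the output shown in the question.

```python
import csv
import io

values = ["46.6", "47.7"]

# A bare string is iterated character by character: one field per character
buf = io.StringIO()
csv.writer(buf).writerow(values[0])
as_string = buf.getvalue()

# A one-element list gives exactly one field per row
buf = io.StringIO()
writer = csv.writer(buf)
for item in values:
    writer.writerow([item])
as_list = buf.getvalue()

print(repr(as_string))
print(repr(as_list))
```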

python: adding a zero if my value is less than 3 digits long

I have a csv file that needs a zero added in front of the number if it's less than 4 digits.
I only have to update a particular row:
import csv
f = open('csvpatpos.csv')
csv_f = csv.reader(f)
for row in csv_f:
    print row[5]
then I want to parse through that row and add a 0 to the front of any number that is shorter than 4 digits. And then input it into a new csv file with the adjusted data.
You want to use string formatting for these things:
>>> '{:04}'.format(99)
'0099'
Format String Syntax documentation
When you think about parsing, you either need to think about regex or pyparsing. In this case, regex would perform the parsing quite easily.
But that's not all, once you are able to parse the numbers, you need to zero fill it. For that purpose, you need to use str.format for padding and justifying the string accordingly.
Consider your string
st = "parse through that row and add a 0 to the front of any number that is shorter than 4 digits."
You can split it on runs of digits and pad each run, something like this:
Implementation
import re
parts = re.split(r"(\d+)", st)
''.join("{:0>4}".format(elem) if elem.isdigit() else elem for elem in parts)
Output
'parse through that row and add a 0000 to the front of any number that is shorter than 0004 digits.'
The following code will read in the given csv file, iterate through each row and each item in each row, and output it to a new csv file.
import csv
import os
f = open('csvpatpos.csv')
# open temp .csv file for output
out = open('csvtemp.csv','w')
csv_f = csv.reader(f)
for row in csv_f:
    # create a temporary list for this row
    temp_row = []
    # iterate through all of the items in the row
    for item in row:
        # add the zero filled value of each temporary item to the list
        temp_row.append(item.zfill(4))
    # join the current temporary list with commas and write it to the out file
    out.write(','.join(temp_row) + '\n')
out.close()
f.close()
Your results will be in csvtemp.csv. If you want to save the data with the original filename, just add the following code to the end of the script:
# remove original file
os.remove('csvpatpos.csv')
# rename temp file to original file name
os.rename('csvtemp.csv','csvpatpos.csv')
Pythonic Version
The code above is very verbose in order to make it understandable. Here is the code refactored to make it more Pythonic:
import csv
new_rows = []
with open('csvpatpos.csv','r') as f:
    csv_f = csv.reader(f)
    for row in csv_f:
        row = [ x.zfill(4) for x in row ]
        new_rows.append(row)
# 'wb' is the Python 2 mode; on Python 3 use open('csvpatpos.csv', 'w', newline='')
with open('csvpatpos.csv','wb') as f:
    csv_f = csv.writer(f)
    csv_f.writerows(new_rows)
I'll leave you with two hints:
s = "486"
s.isdigit() == True
for finding what things are numbers.
And
s = "486"
s.zfill(4) == "0486"
for filling in zeroes.
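Putting the two hints together, a rough sketch of padding only one column (index 5, as in the question) could look like the following. The helper name pad_column and the sample rows are made up for illustration, and io.StringIO stands in for the real input and output files.

```python
import csv
import io

def pad_column(rows, index, width=4):
    """Zero-pad rows[i][index] to `width` when it consists only of digits."""
    for row in rows:
        if row[index].isdigit():
            row[index] = row[index].zfill(width)
        yield row

# Made-up sample data standing in for csvpatpos.csv
source = io.StringIO("a,b,c,d,e,86\na,b,c,d,e,1234\na,b,c,d,e,n/a\n")
out = io.StringIO()
csv.writer(out).writerows(pad_column(csv.reader(source), 5))
print(out.getvalue())
```

Checking isdigit() first leaves non-numeric cells such as headers untouched, and zfill does nothing to values that are already four digits or longer.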

Python split and csv; Modification of existing Python Script

Another user was kind enough to help me with a script that reads in a file, removes/replaces '::', and moves columns to headers:
(I am reposting it as it may be useful to someone in this form; my question follows.)
import csv
with open('infile', "rb") as fin, open('outfile', "wb") as fout:
    reader = csv.reader(fin)
    writer = csv.writer(fout)
    for i, line in enumerate(reader):
        split = [item.split("::") for item in line if item.strip()]
        if not split: # blank line
            continue
        keys, vals = zip(*split)
        if i == 0:
            # first line: write header
            writer.writerow(keys)
        writer.writerow(vals)
I was not aware that the last column of this file had the following text at the end:
{StartPoint::7858.35924983374[%2C]1703.69341358077[%2C]-3.075},{EndPoint::7822.85045874375[%2C]1730.80294308742[%2C]-3.53962362760298}
How do I modify this existing code to take the above and:
1. remove the brackets { }
2. convert the '[%2C]' to a ',' - making it comma delim like the rest of the file
3. Produce 'Xa Ya Za' and 'Xb Yb Zb' as headers for the values liberated in #2
The above text is the input file. Output from the original script produces this:
{StartPoint,EndPoint}
7858.35924983374[%2C]1703.69341358077[%2C]-3.075, 7822.85045874375[%2C]1730.80294308742[%2C]-3.53962362760298}
Is it possible to insert a simple strip command in there?
Thanks, I appreciate your guidance - I am a Python newbie
It seems like you're looking for the replace method on strings: http://docs.python.org/2/library/stdtypes.html#str.replace
Perhaps something like this after the zip:
keys = [key.replace('{', '') for key in keys]
vals = [val.split('[%2C]') for val in vals]
I'm not sure whether csv.writer will handle a nested list like the one vals becomes after this, but you get the idea. If it doesn't, vals needs to be a flat list. Perhaps something like this would flatten it out:
vals = [item for sublist in vals for item in sublist]
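For the three numbered requirements taken together, here is one possible sketch that works on the cell text directly rather than on the output of the existing script. It is only an illustration: the regular expression assumes every group looks like {Name::x[%2C]y[%2C]z}, and the suffixes a and b are assigned in order of appearance, which is my assumption about how 'Xa Ya Za' and 'Xb Yb Zb' should map to StartPoint and EndPoint.

```python
import re

cell = ("{StartPoint::7858.35924983374[%2C]1703.69341358077[%2C]-3.075},"
        "{EndPoint::7822.85045874375[%2C]1730.80294308742[%2C]-3.53962362760298}")

header, values = [], []
# Each {...} group holds a name and three coordinates joined by '[%2C]'
groups = re.findall(r"\{(\w+)::([^}]*)\}", cell)
for suffix, (name, coords) in zip("ab", groups):
    header += [axis + suffix for axis in "XYZ"]   # Xa, Ya, Za, then Xb, Yb, Zb
    values += coords.split("[%2C]")               # already flat, no nesting

print(header)
print(values)
```

Both lists are flat, so they can be passed straight to writer.writerow.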
