parse list in python with \t

I have a file, each line looks like this:
#each line is a list:
a = ['1\t2\t3\t4\t5']
#type(a) is list
#str(a) shows as below:
["['1\\t2\\t3\\t4\\t5']"]
I want an output of only 1 2 3 4 5, how should I accomplish that? Thanks everyone.

I agree with Bhargav Rao. Use a for loop.
for i in a:
    i = i.replace('\t', ' ')  # str.replace returns a new string, so keep the result
    print(i)
I believe it will print out the way you want.
Otherwise please specify.

If you are reading lines from a file, you can use the following approach:
with open("file.txt", "r") as f_input:
    for line in f_input:
        print(line.rstrip("\n").replace("\t", " "))
If the file you are reading from is actually columns of data, then the following approach might be suitable:
import csv
with open("file.txt", "r") as f_input:
    csv_input = csv.reader(f_input, delimiter="\t")
    for cols in csv_input:
        print(" ".join(cols))
cols would give you a list of columns for each row in the file, i.e. cols[0] would be the first column. You can then easily print them out with spaces as shown.
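To try the tab-delimited reader without touching a real file, something like the following sketch may help. It assumes Python 3, and the sample data and the names sample, rows and first_column are made up for illustration.

```python
import csv
import io

# Stand-in for the file contents; this sample data is invented for the example.
sample = "1\t2\t3\n4\t5\t6\n"

# csv.reader accepts any iterable of lines, so a StringIO works like an open file.
rows = list(csv.reader(io.StringIO(sample), delimiter="\t"))

# cols[0] is the first column of each row.
first_column = [cols[0] for cols in rows]

for cols in rows:
    print(" ".join(cols))
```

Each row comes back as a list of strings, so any numeric conversion still has to be done explicitly.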

There are many ways of doing this.
You can use a for loop, or you can just join the list and replace every \t with a space:
a = ['1\t2\t3\t4\t5']
a = "".join(a).replace("\t", " ")
print(a)
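If you want the individual values rather than one reformatted string, splitting on the tab is another option. This is only a sketch, assuming each line holds a list of tab-separated strings as in the question; flatten_tabbed and fields are illustrative names, not anything from the original post.

```python
def flatten_tabbed(items):
    """Split every string in `items` on tabs and return one flat list of fields."""
    fields = []
    for item in items:
        fields.extend(item.split("\t"))
    return fields


a = ['1\t2\t3\t4\t5']
fields = flatten_tabbed(a)
print(" ".join(fields))  # prints: 1 2 3 4 5
```

The fields are still strings; wrap them in int() if you need numbers.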

Related

Removing certain separators from csv file with pandas or csv

I've got multiple csv files, which I received in the following line format:
-8,000E-04,2,8E+1,
The first and the third comma are meant to be decimal separators, the second comma is a column delimiter and I think the last one is supposed to indicate a new line. So the csv should only consist of two columns and I have to prepare the data in order to plot it. Therefore I need to specify the two columns as x and y to plot the data. I tried removing or replacing the separators in every line, but by doing that I'm no longer able to specify the two columns. Is there a way to remove certain separators from every line of the csv?
You can split the string read from each line as follows:
line = "-8,000E-04,2,8E+1,"
list_string = line.split(',')
x = float(list_string[0] + "." + list_string[1])
y = float(list_string[2] + "." + list_string[3])
print(x, y)
The result is:
-0.0008 28.0
You can then arrange x and y in columns, or in whatever form you need.
Here is a short Python program to convert your csv files:
import csv
f1 = "in_test.csv"
f2 = "out_test.csv"
with open(f1, newline='') as csv_reader:
    reader = csv.reader(csv_reader, delimiter=',')
    with open(f2, mode='w', newline='') as csv_writer:
        writer = csv.writer(csv_writer, delimiter=";")
        for row in reader:
            out_row = [row[0] + '.' + row[1], row[2] + '.' + row[3]]
            writer.writerow(out_row)
Sample input:
-8,000E-04,2,8E+1,
-2,000E-03,2,7E+2,
Sample output:
-8.000E-04;2.8E+1
-2.000E-03;2.7E+2
I think you should replace the second comma using regex. Well, I'm definitely not an expert at it, but I've managed to come up with this:
import re
s = "-8,000E-04,2,8E+1,"
pattern = "^([^,]*,[^,]*),(.*),$"
grps = re.search(pattern, s).groups()
res = [float(s.replace(",", ".")) for s in grps]
print(res)
# [-0.0008, 28.0]
Sample csv file:
-8,000E-04,2,8E+1,
6,0E-6,-45E+2,
-5,550E-6,-6,2E+1,
And you can do something like this:
x = []
y = []
regex = re.compile("^([^,]*,[^,]*),(.*),$")
with open("a.csv") as f:
    for line in f:
        result = regex.search(line).groups()
        x.append(float(result[0].replace(",", ".")))
        y.append(float(result[1].replace(",", ".")))
The result is:
print(x, y)
# [-0.0008, 6e-06, -5.55e-06] [28.0, -4500.0, -62.0]
I'm not sure this is the most efficient way, but it works.
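The regex approach can be wrapped in a small helper that fails loudly on a line it does not understand, instead of raising an AttributeError on a None match. This is a sketch only: parse_line and LINE_RE are hypothetical names, and it keeps the same assumption as the pattern above, namely that the first value always contains a decimal comma.

```python
import re

# Same pattern as in the answer above: the first value contains one decimal
# comma, and everything up to the trailing comma is the second value.
LINE_RE = re.compile(r"^([^,]*,[^,]*),(.*),$")


def parse_line(line):
    """Return (x, y) as floats for one line in the '-8,000E-04,2,8E+1,' format."""
    match = LINE_RE.search(line.strip())
    if match is None:
        raise ValueError("unexpected line format: %r" % line)
    x_text, y_text = match.groups()
    return float(x_text.replace(",", ".")), float(y_text.replace(",", "."))


print(parse_line("-8,000E-04,2,8E+1,"))  # prints: (-0.0008, 28.0)
```

Calling it once per line and appending the two values to x and y gives the same lists as the loop above.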

how to select a specific column of a csv file in python

I am a beginner in Python and would like your opinion.
I wrote this code that reads the only column in a file on my pc and puts it in a list.
I have difficulties understanding how I could modify the same code with a file that has multiple columns and select only the column of my interest.
Can you help me?
list = []
with open(r'C:\Users\Desktop\mydoc.csv') as file:
    for line in file:
        item = int(line)
        list.append(item)
results = []
for i in range(0,1086):
    a = list[i-1]
    b = list[i]
    c = list[i+1]
    results.append(b)
print(results)
You can use pandas.read_csv() method very simply like this:
import pandas as pd
my_data_frame = pd.read_csv('path/to/your/data')
results = my_data_frame['name_of_your_wanted_column'].values.tolist()
A useful module for the kind of work you are doing is the imaginatively named csv module.
Many csv files have a "header" at the top, this by convention is a useful way of labeling the columns of your file. Assuming you can insert a line at the top of your csv file with comma delimited fieldnames, then you could replace your program with something like:
import csv
with open(r'C:\Users\Desktop\mydoc.csv') as myfile:
    csv_reader = csv.DictReader(myfile)
    for row in csv_reader:
        print(row['column_name_of_interest'])
The above will print to the terminal all the values that match your specific 'column_name_of_interest' after you edit it to match your particular file.
It's normal to work with lots of columns at once, so that dictionary method of packing a whole row into a single object, addressable by column-name can be very convenient later on.
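Here is a self-contained sketch of the DictReader idea that runs without a file on disk. It assumes Python 3; the header names id, name and score and the sample rows are invented for the example.

```python
import csv
import io

# Made-up sample with a header row; in practice this would be the open file.
sample = "id,name,score\n1,ann,90\n2,bob,85\n"

reader = csv.DictReader(io.StringIO(sample))

# Each row is a dict keyed by the header names, so a column is picked by name.
scores = [int(row["score"]) for row in reader]
print(scores)  # prints: [90, 85]
```

Values always arrive as strings, which is why the example converts them with int().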
For a pure Python implementation, use the csv module.
data.csv
Project1,folder1/file1,data
Project1,folder1/file2,data
Project1,folder1/file3,data
Project1,folder1/file4,data
Project1,folder2/file11,data
Project1,folder2/file42a,data
Project1,folder2/file42b,data
Project1,folder2/file42c,data
Project1,folder2/file42d,data
Project1,folder3/filec,data
Project1,folder3/fileb,data
Project1,folder3/filea,data
Your Python program should read it line by line:
import csv
a = []
with open('data.csv') as csv_file:
    reader = csv.reader(csv_file, delimiter=',')
    for row in reader:
        print(row)
        # ['Project1', 'folder1/file1', 'data']
If you print the row element you will see that it is a list like this:
['Project1', 'folder1/file1', 'data']
To collect every element of column 1, append that element of each row to the list:
a.append(row[1])
The list a will then contain:
['folder1/file1', 'folder1/file2', 'folder1/file3', 'folder1/file4', 'folder2/file11', 'folder2/file42a', 'folder2/file42b', 'folder2/file42c', 'folder2/file42d', 'folder3/filec', 'folder3/fileb', 'folder3/filea']
Here is the complete code:
import csv
a = []
with open('data.csv') as csv_file:
    reader = csv.reader(csv_file, delimiter=',')
    for row in reader:
        a.append(row[1])
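The same loop can be turned into a reusable function that also tolerates blank or short lines. This is just a sketch: column_values and its convert parameter are hypothetical names, and the two sample lines are taken from data.csv above.

```python
import csv


def column_values(lines, index, convert=str):
    """Return column `index` from an iterable of CSV lines, skipping short rows."""
    values = []
    for row in csv.reader(lines):
        # Blank lines come back as empty lists, so check the length first.
        if len(row) > index:
            values.append(convert(row[index]))
    return values


lines = ["Project1,folder1/file1,data", "Project1,folder1/file2,data", ""]
print(column_values(lines, 1))  # prints: ['folder1/file1', 'folder1/file2']
```

An open file object can be passed as lines, and convert=int would turn a numeric column into integers.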

How to write a sentence into a CSV file using Python

I have a sentence like !=This is great
How do I write the sentence into a .CSV file with "!" in the first column and "This is great" in another column?
You can use the pandas to_csv method:
import pandas as pd
col1 = []
col2 = []
f = '!=This is great'
l1 = f.split('=')
col1.append(l1[0])
col2.append(l1[1])
df = pd.DataFrame()
df['col1'] = col1
df['col2'] = col2
# index=False keeps the row index out of the file, so "!" lands in the first column
df.to_csv('test.csv', index=False)
Split the text and write it to an output file:
text = open('in.txt').read() #if from input file
text = '!=This is great' #if not from input file
with open('out.csv','w') as f:
    f.write(','.join(text.split('=')))
Output:
!,This is great
If you have multiple lines, you will have to loop through the input file and split each one.
Of course, you could open the file with open() and write each line manually with a comma delimiter, but Python's standard csv library handles this for you, and it also lets you specify the dialect.
In [1]: import csv
In [2]: sentence="!=This is great"
In [3]: with open("test.csv", "w", newline='') as f:
   ...:     my_csvwriter = csv.writer(f)
   ...:     my_csvwriter.writerow(sentence.split("="))
With multiple sentences, assuming they are in a list, you can iterate through it when writing.
with open("test.csv", "w", newline='') as f:
    my_csvwriter = csv.writer(f)
    for sentence in sentences:
        my_csvwriter.writerow(sentence.split("="))
The library also handles commas inside a sentence, so you don't have to do it yourself. For instance:
sentence = "!=Hello, my name is.."
with open("test.csv", "w", newline='') as f:
    my_csvwriter = csv.writer(f)
    my_csvwriter.writerow(sentence.split("="))
# This will be written: !,"Hello, my name is.."
# With that quote, you could still open it in excel without confusing it
# and it knows that `Hello, my name is..` is in the same column
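To see the quoting without opening Excel, you can write to an in-memory buffer and read it back. This is a sketch assuming Python 3; buffer, written and parsed are illustrative names, and split("=", 1) is my own addition so that a later "=" inside the sentence does not create extra columns.

```python
import csv
import io

sentence = "!=Hello, my name is.."

# Write one row into an in-memory buffer instead of a file.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(sentence.split("=", 1))  # split only on the first "="
written = buffer.getvalue()

# Reading it back shows the comma stayed inside the second column.
parsed = list(csv.reader(io.StringIO(written)))
print(parsed)  # prints: [['!', 'Hello, my name is..']]
```

The writer adds the quotes only around the field that needs them, and the reader removes them again.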

Iterating through particular rows in a csvFile in Python

I have a programming assignment that involves CSV files. So far, my only issue is obtaining values from specific rows, namely the rows that the user wants to look up.
When I got frustrated I just appended each column to a separate list, which is very slow (when the list is printed for test) because each column has hundreds of values.
Question:
The desired rows are the rows whose index[0] == user_input. How can I obtain these particular rows only and ignore the others?
This should give you an idea:
import csv
with open('file.csv', newline='') as f:
    reader = csv.reader(f, delimiter=',')
    # Build the list inside the with block, while the file is still open
    user_rows = [row for row in reader if row[0] == user_input]
Python has the csv module:
import csv
rows = []
with open('a.csv', 'r') as f:
    for row in csv.reader(f, delimiter=','):
        if row[0] == user_input:
            rows.append(row)
def filter_csv_by_prefix(csv_path, prefix):
    with open(csv_path, 'r') as f:
        return tuple(filter(lambda line: line.split(',')[0] == prefix, f.readlines()))
for line in filter_csv_by_prefix('your_csv_file', 'your_prefix'):
    print(line)
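If the file is large, a generator avoids holding every row in memory and still lets the csv module do the parsing. This is only a sketch: matching_rows and the sample lines are made up, and any iterable of lines (including an open file) can be passed in.

```python
import csv


def matching_rows(lines, wanted):
    """Yield parsed rows whose first field equals `wanted`."""
    for row in csv.reader(lines):
        # `row and ...` skips blank lines, which parse to an empty list.
        if row and row[0] == wanted:
            yield row


sample = ["a,1,x", "b,2,y", "a,3,z"]
print(list(matching_rows(sample, "a")))  # prints: [['a', '1', 'x'], ['a', '3', 'z']]
```

Rows are produced one at a time, so you can stop early or process them as they arrive.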

Python: Even after specifying delimiter, csv writer delimits at wrong place

I'm trying to write lists like this to a CSV file:
['ABC','One,Two','12']
['DSE','Five,Two','52']
To a file like this:
ABC One,Two 12
DSE Five,Two 52
Basically, write anything inside '' to a cell.
However, it is splitting One and Two into different cells and merging ABC with One in the first cell.
Part of my script:
out_file_handle = open(output_path, "ab")
writer = csv.writer(out_file_handle, delimiter = "\t", dialect='excel', lineterminator='\n', quoting=csv.QUOTE_NONE)
output_final = (tsv_name_list.split(".")[0]+"\t"+key + "\t" + str(listOfThings))
output_final = str([output_final]).replace("[","").replace("]","").replace('"',"").replace("'","")
output_final = output_final.split("\\t")
print output_final #gives the first lists of strings I mentioned above.
writer.writerow(output_final)
First output_final line gives
ABC One,Two 12
DSE Five,Two 52
Using the csv module simply works, so you're going to need to be more specific about what's convincing you that the elements are bleeding across cells. For example, using the (now quite outdated) Python 2.7:
import csv
data_lists = [['ABC','One,Two','12'],
              ['DSE','Five,Two','52']]
with open("out.tsv", "wb") as fp:
    writer = csv.writer(fp, delimiter="\t", dialect="excel", lineterminator="\n")
    writer.writerows(data_lists)
I get an out.tsv file of:
dsm#winter:~/coding$ more out.tsv
ABC One,Two 12
DSE Five,Two 52
or
>>> out = open("out.tsv").readlines()
>>> for row in out: print repr(row)
...
'ABC\tOne,Two\t12\n'
'DSE\tFive,Two\t52\n'
which is exactly as it should be. Now if you take these rows, which are tab-delimited, and for some reason split them using commas as the delimiter, sure, you'll think that there are two columns, one with ABC\tOne and one with Two\t12. But that would be silly.
You've set up the CSV writer, but then for some reason you completely ignore it and try to output the lines manually. That's pointless. Use the functionality available.
writer = csv.writer(...)
for row in tsv_list:
    writer.writerow(row)
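For anyone on Python 3, a round trip through an in-memory buffer shows that the comma stays inside its cell as long as both sides agree on the tab delimiter. This is a sketch; buffer, text and round_trip are illustrative names.

```python
import csv
import io

data_lists = [['ABC', 'One,Two', '12'],
              ['DSE', 'Five,Two', '52']]

# Write the rows tab-delimited into an in-memory buffer.
buffer = io.StringIO()
writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
writer.writerows(data_lists)
text = buffer.getvalue()

# Read them back with the same delimiter; 'One,Two' is still a single field.
round_trip = list(csv.reader(io.StringIO(text), delimiter="\t"))
print(round_trip == data_lists)  # prints: True
```

If the output looks wrong in a spreadsheet, the usual cause is that it was imported with a comma as the delimiter rather than a tab.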
