Entire string in For loop, not just character by character - python

Using CSV writer, I am trying to write a list of strings to a file.
Each string should occupy a separate row.
sectionlist = ["cat", "dog", "frog"]
When I implement the following code:
import csv

with open('pdftable.csv', 'wt') as csvfile:
    writer = csv.writer(csvfile, delimiter=',')
    for i in sectionlist:
        writer.writerow(i)
I create
c,a,t
d,o,g
f,r,o,g
when I want
cat
dog
frog
Why does the for loop parse each character separately, and how can I pass the entire string to csv.writer so that each string is written on its own row?

It doesn't look like you even need to use csv writer.
l = ["cat", "dog", "frog"]  # Don't name your variable list!
with open('pdftable.csv', 'w') as csvfile:
    for word in l:
        csvfile.write(word + '\n')
Or, as @GP89 suggested:
with open('pdftable.csv', 'w') as csvfile:
    csvfile.writelines(word + '\n' for word in l)  # writelines does not add newlines itself

I think that what you need is:
with open('pdftable.csv', 'wt') as csvfile:
    writer = csv.writer(csvfile, delimiter=',')
    for i in sectionlist:
        writer.writerow([i])  # note the square brackets here
writerow treats its argument as an iterable, so if you pass a string, it sees each character as one element of the row; since you want the whole string to be a single item, you must enclose it in a list or a tuple.
P.S.: That said, if your particular case is no more complex than what you are posting, you may not need csv.writer at all, as suggested by other answers.

The problem is that i is a string (a word), not a list (a row). Strings are also iterable sequences (of characters) in Python, so the CSV function accepts the object without error, even though the results look "strange".
Either fix sectionlist so that it is a list of lists of strings (rows), making i a list of strings; wrap each word in a list when passing it to writerow; or simply don't use writerow, which expects a sequence of strings.
Trivially, the following structure would be saved correctly:
sectionlist = [
    ["cat", "meow"],
    ["dog"],
    ["frog", "hop", "pond"]
]
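A structure like that can be written in one call with writerows. A minimal, self-contained sketch (reusing the question's filename; the extra words are just illustrative data):

```python
import csv

sectionlist = [
    ["cat", "meow"],
    ["dog"],
    ["frog", "hop", "pond"],
]

with open('pdftable.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile, delimiter=',')
    # each inner list becomes one row, one string per cell
    writer.writerows(sectionlist)
```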


Why does writing a list of words to a CSV file write the individual characters instead? [duplicate]

Objective: To extract the text from the anchor tag inside every list item in models and put it in a csv.
I'm trying this code:
with open('Sprint_data.csv', 'ab') as csvfile:
    spamwriter = csv.writer(csvfile)
    models = soup.find_all('li', {"class": "phoneListing"})
    for model in models:
        model_name = unicode(u' '.join(model.a.stripped_strings)).encode('utf8').strip()
        spamwriter.writerow(unicode(u' '.join(model.a.stripped_strings)).encode('utf8').strip())
It's working fine except each cell in the csv contains only one character.
Like this:
| S | A | M | S | U | N | G |
Instead of:
|SAMSUNG|
Of course I'm missing something. But what?
.writerow() requires a sequence ('', (), []) and places each item in its own column of the row, sequentially. If your desired string is not an item in a sequence, writerow() will iterate over each letter in your string, and each letter will be written to your CSV in a separate cell.
After you import csv, suppose this is your list:
myList = ['Diamond', 'Sierra', 'Crystal', 'Bridget', 'Chastity', 'Jasmyn', 'Misty', 'Angel', 'Dakota', 'Asia', 'Desiree', 'Monique', 'Tatiana']
listFile = open('Names.csv', 'wb')
writer = csv.writer(listFile)
for item in myList:
    writer.writerow(item)
The above script will produce the following CSV:
Names.csv
D,i,a,m,o,n,d
S,i,e,r,r,a
C,r,y,s,t,a,l
B,r,i,d,g,e,t
C,h,a,s,t,i,t,y
J,a,s,m,y,n
M,i,s,t,y
A,n,g,e,l
D,a,k,o,t,a
A,s,i,a
D,e,s,i,r,e,e
M,o,n,i,q,u,e
T,a,t,i,a,n,a
If you want each name in its own cell, the solution is to simply place your string (item) in a sequence. Here I use square brackets []:
listFile2 = open('Names2.csv', 'wb')
writer2 = csv.writer(listFile2)
for item in myList:
    writer2.writerow([item])
The script with .writerow([item]) produces the desired results:
Names2.csv
Diamond
Sierra
Crystal
Bridget
Chastity
Jasmyn
Misty
Angel
Dakota
Asia
Desiree
Monique
Tatiana
writerow accepts a sequence. You're giving it a single string, so it's treating that as a sequence, and strings act like sequences of characters.
What else do you want in this row? Nothing? If so, make it a list of one item:
spamwriter.writerow([u' '.join(model.a.stripped_strings).encode('utf8').strip()])
(By the way, the unicode() call is completely unnecessary since you're already joining with a unicode delimiter.)
This is usually the solution I use:
import csv
with open("output.csv", 'w', newline='') as output:
    wr = csv.writer(output, dialect='excel')
    for element in list_of_things:
        wr.writerow([element])
# no explicit close() needed: the with block closes the file
This should give you an output with all your list elements in a single column rather than a single row.
The key points here are to iterate over the list and to use [element] to avoid the csv writer's sequencing issue.
Hope this is of use!
Just surround it with a list sign (i.e. []):
writer.writerow([str(one_column_value)])

List to csv without commas in Python

I have the following problem.
I would like to save a list into a csv (in the first column).
See example here:
import csv
mylist = ["Hallo", "der Pixer", "Glas", "Telefon", "Der Kühlschrank brach kaputt."]
def list_na_csv(file, mylist):
    with open(file, "w", newline="") as csv_file:
        csv_writer = csv.writer(csv_file)
        csv_writer.writerows(mylist)
list_na_csv("example.csv", mylist)
My output in excel looks like this:
Desired output is:
You can see that I have two issues: firstly, each character is followed by a comma; secondly, I don't know how to use an encoding, for example UTF-8 or cp1250. How can I fix this, please?
I tried to search similar question, but nothing worked for me. Thank you.
You have two problems here.
writerows expects a list of rows, that is, a list of iterables. As a string is iterable, you end up writing each word on a different row, one character per field. If you want one row with one word per field, you should use writerow:
csv_writer.writerow(mylist)
by default, the csv module uses the comma as the delimiter (this is the most common one). But Excel is a pain in the ass with it: it expects the delimiter to be the one of the locale, which is the semicolon (;) in many West European countries, including Germany. If you want to use easily your file with your Excel you should change the delimiter:
csv_writer = csv.writer(csv_file, delimiter=';')
After your edit, you want all the data in the first column, one element per row. This is a kind of degenerate csv file, because it only has one value per record and no separator. If the fields can never contain a semicolon nor a newline, you could just write a plain text file:
...
with open(file, "w", newline="") as csv_file:
    for row in mylist:
        print(row, file=csv_file)
...
If you want to be safe and prevent future problems if you later want to process more corner cases values, you could still use the csv module and write one element per row by including it in another iterable:
...
with open(file, "w", newline="") as csv_file:
    csv_writer = csv.writer(csv_file, delimiter=';')
    csv_writer.writerows([elt] for elt in mylist)
...
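Pulling the pieces together into a runnable sketch (the filename and the utf-8 encoding are illustrative choices; cp1250 would work the same way):

```python
import csv

mylist = ["Hallo", "der Pixer", "Glas", "Telefon", "Der Kühlschrank brach kaputt."]

with open('example.csv', 'w', newline='', encoding='utf-8') as csv_file:
    csv_writer = csv.writer(csv_file, delimiter=';')
    # wrap each element in a list so it becomes a one-field row
    csv_writer.writerows([elt] for elt in mylist)
```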
l = ["Hallo", "der Pixer", "Glas", "Telefon", "Der Kühlschrank brach kaputt."]
with open("file.csv", "w") as msg:
    msg.write(",".join(l))
For less trivial examples:
l = ["Hallo", "der, Pixer", "Glas", "Telefon", "Der Kühlschrank, brach kaputt."]
with open("file.csv", "w") as msg:
    msg.write(",".join(['"' + x + '"' for x in l]))
Here you basically put every list element between quotes, to prevent the intra-field comma problem.
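Note that hand-rolled quoting breaks as soon as a value itself contains a double quote; the csv module handles that case for you. A sketch using quoting=csv.QUOTE_ALL with made-up values chosen to show the corner cases:

```python
import csv
import io

l = ["Hallo", "der, Pixer", 'Glas "extra"']

buf = io.StringIO()
writer = csv.writer(buf, quoting=csv.QUOTE_ALL)
# every field is quoted; embedded quotes are doubled per the CSV convention
writer.writerow(l)
```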
Try this, it will work 100%:
import csv
mylist = ["Hallo", "der Pixer", "Glas", "Telefon", "Der Kühlschrank brach kaputt."]
def list_na_csv(file, mylist):
    with open(file, "w") as csv_file:
        csv_writer = csv.writer(csv_file)
        csv_writer.writerow(mylist)
list_na_csv("example.csv", mylist)
If you want to write the entire list of strings to a single row, use csv_writer.writerow(mylist) as mentioned in the comments.
If you want to write each string to a new row, as I believe your reference to writing them in the first column implies, you'll have to format your data as the class expects: "A row must be an iterable of strings or numbers for Writer objects". On this data that would look something like:
csv_writer.writerows((entry,) for entry in mylist)
There, I'm using a generator expression to wrap each word in a tuple, thus making it an iterable of strings. Without something like that, your strings are themselves iterables and lead to it delimiting between each character as you've seen.
Using csv to write a single entry per line is almost pointless, but it does have the advantage that it will escape your delimiter if it appears in the data.
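That delimiter escaping is easy to see in isolation (a sketch using an in-memory buffer instead of a file):

```python
import csv
import io

buf = io.StringIO()
writer = csv.writer(buf)
# the value contains the delimiter, so csv wraps it in quotes;
# a plain write() would have produced two fields here
writer.writerow(["a,b"])
```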
To specify an encoding, the docs say:
Since open() is used to open a CSV file for reading, the file will by default be decoded into unicode using the system default encoding (see locale.getpreferredencoding()). To decode a file using a different encoding, use the encoding argument of open:
import csv
with open('some.csv', newline='', encoding='utf-8') as f:
    reader = csv.reader(f)
    for row in reader:
        print(row)
The same applies to writing in something other than the system default encoding: specify the encoding argument when opening the output file.
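Applied to the asker's second issue, that is just the encoding argument on open() when writing (a sketch; the filename is illustrative and cp1250 is one of the encodings the asker mentioned):

```python
import csv

# write with an explicit Central European Windows code page
with open('some.csv', 'w', newline='', encoding='cp1250') as f:
    writer = csv.writer(f, delimiter=';')
    writer.writerow(['Der Kühlschrank', 'brach kaputt'])
```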
try split("\n")
example:
counter = 0
amazinglist = ["hello", "hi"]
for x in amazinglist:
    ok = amazinglist[counter].split("\n")  # split("\n") wraps the string in a one-element list
    writer.writerow(ok)
    counter += 1

How to write list elements into a tab-separated file?

I have searched the web but I haven't found the answer to my problem:
I have a dictionary with lists as values, and every list has a different length. For example:
dict_with_lists[value1] = [word1, word2, word3, word4]
dict_with_lists[value2] = [word1, word2, word3]
My main problem is that I want to write the list elements into a file, tab-separated, and when a list is finished, the next list should start on a new line.
I found a solution like that:
with open('fname', 'w') as file:
    file.writelines('\t'.join(i) + '\n' for i in nested_list)
But it doesn't only separate the words with tabs but also the characters.
If nested_list is one of your dictionary values, then you are applying '\t'.join() to the individual words. You'd want to join the whole list:
file.write('\t'.join(nested_list) + '\n')
or, if you were to loop over the values of the dictionary:
file.writelines(
    '\t'.join(nested_list) + '\n'
    for nested_list in dict_with_lists.values())
The above uses the file.writelines() method correctly; passing in an iterable of strings to write. If you were to pass in a single string, then you are only causing Python extra work as it loops over all the individual characters of that string to write those separately, after which the underlying buffer has to assemble those back into bigger strings again.
However, there is no need to re-invent the character-separated-values writing wheel here. Use the csv module, setting the delimiter to '\t':
import csv
with open('fname', 'w', newline='') as file:
    writer = csv.writer(file, delimiter='\t')
    writer.writerows(dict_with_lists.values())
The above writes all lists in the dict_with_lists dictionary to a file. The csv.writer() object doesn't mind if your lists are of differing lengths.
You need to turn each list value in the dictionary into a string of tab-separated values that also have a '\n' newline character at the end of each one of them:
value1, value2 = 'key1', 'key2'
dict_with_lists = {}
dict_with_lists[value1] = ['word1', 'word2', 'word3', 'word4']
dict_with_lists[value2] = ['word1', 'word2', 'word3']
fname = 'dict_list_values.tsv'
with open(fname, 'w') as file:
    file.writelines(
        '\t'.join(values) + '\n' for values in dict_with_lists.values()
    )
I think you're doing a list comprehension over the list inside of the dictionary. An alternative solution would be
with open('fname', 'w') as file:
    for nested_list in dict_with_lists.values():
        for word in nested_list:
            file.write(word + '\t')
        file.write('\n')
I'm just looping over the values of the dictionary, which are lists in this case, writing each word followed by a tab and ending each list with a newline. I haven't tested it, but theoretically I think it should work.
Instead of jumping in and answering your question directly, I'm going to give you a hint on how to tackle your actual problem.
There is another way to store data like that (dictionaries of non-tabular form) and it is by saving it in the JSON-string format.
import json
with open('fname', 'w') as f:
    json.dump(dict_with_lists, f)
And then the code to load it would be:
import json
with open('fname') as f:
    dict_with_lists = json.load(f)

Read Tuple from csv in Python

I am trying to read a row in a csv, which I have previously written.
The written row looks like this when it is read back: ['New York', '(30,40)'] (the tuple has been converted to a string).
I need to read each item from the tuple to operate with the ints, but I can't when it is read as a string: if I do something like tuple[0], what I get is '(', the first character of the stringified tuple.
Maybe this is a question about how I write and read the rows, which actually is this way:
def writeCSV(data, name):
    fileName = name + '.csv'
    with open(fileName, 'a') as csvfile:
        writer = csv.writer(csvfile, delimiter=',')
        writer.writerow(data)
def readCSV(filename):
    allRows = []
    with open(filename, 'rb') as f:
        reader = csv.reader(f, delimiter=' ')
        for row in reader:
            allRows.append(row)
    return allRows
What I want is to read that tuple for row, not like a string, but like a tuple to operate with each item after.
Is it possible?
You need to use ast.literal_eval() on your tuple string:
>>> my_file_line = ['New York', '(30,40)']
>>> import ast
>>> my_tuple = ast.literal_eval(my_file_line[1])
>>> my_tuple[0]
30
Currently, the list you got after reading the file holds, at index 1, a valid string representation of a tuple. ast.literal_eval will convert your tuple string to a tuple object, and then you can access the tuple by index.
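A minimal round trip combining the two steps (the filename is illustrative):

```python
import ast
import csv

# writing: the tuple is stringified by csv.writer
with open('cities.csv', 'w', newline='') as f:
    csv.writer(f).writerow(['New York', (30, 40)])

# reading: rebuild the real tuple from its string form
with open('cities.csv', newline='') as f:
    row = next(csv.reader(f))

city, coords = row[0], ast.literal_eval(row[1])
```

coords is now a real tuple, so coords[0] is the integer 30.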
Since you're producing the file yourself, why not make it right from the start:
import csv
data = ['New York', (30, 40)]
with open("out.csv", "w", newline="") as f:
    cw = csv.writer(f)
    cw.writerow(data)                      # wrong
    cw.writerow(data[:1] + list(data[1]))  # okay
The first call writes each item of data but converts each item to str. The tuple gets converted, whereas choosing another layout avoids this:
New York,"(30, 40)"
which explains the need to evaluate it afterwards.
The second writerow call writes 3 elements (now 3 columns) but preserves the data (except for the ints, which are converted to str anyway):
New York,30,40
in that case, a simple string to integer conversion on rows does the job.
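Reading that flattened layout back, the conversion is a sketch like this (filename illustrative):

```python
import csv

# write the flattened form: city name plus two separate int columns
with open('out.csv', 'w', newline='') as f:
    csv.writer(f).writerow(['New York', 30, 40])  # ints become '30', '40' on disk

# read it back and convert the numeric columns to ints
with open('out.csv', newline='') as f:
    row = next(csv.reader(f))

name, coords = row[0], tuple(int(x) for x in row[1:])
```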
Note that csv isn't the best way to serialize mixed data. Better use json for that.

Use python to parse values from ping output into csv

I wrote a code using RE to look for "time=" and save the following value in a string. Then I use the csv.writer attribute writerow, but each number is interpreted as a column, and this gives me trouble later. Unfortunately there is no 'writecolumn' attribute. Should I save the values as an array instead of a string and write every row separately?
import re
import csv
inputfile = open("ping.txt")
teststring = inputfile.read()
values = re.findall(r'time=(\d+.\d+)', teststring)
with open('parsed_ping.csv', 'wb') as csvfile:
    writer = csv.writer(csvfile, delimiter=' ',
                        quotechar='|', quoting=csv.QUOTE_MINIMAL)
    writer.writerow(values)
EDIT: I understood that "values" is already a list. I tried to iterate it and write a row for each item with
for item in values:
    writer.writerow(item)
Now i get a space after each character, like
4 6 . 6
4 7 . 7
EDIT2: The spaces are the delimiters. If I change the delimiter to a comma, I get commas between digits. I just don't get why it's interpreting each digit as a separate column.
If your csv file only contains one column, it's not really a "comma-separated values" file anymore, is it?
Just write the list to the file directly:
import re
inputfile = open("ping.txt")
teststring = inputfile.read()
values = re.findall(r'time=(\d+\.\d+)', teststring)
with open('parsed_ping.csv', 'w') as csvfile:
    csvfile.write("\n".join(values))
I solved this. I just needed to use square brackets in the writer.
for item in values:
    writer.writerow([item])
This gives me the correct output.
