Problem in writing numbers to a csv file using Python

I have this array of strings:
my_array = ['5.0', '6.066', '7.5', '7.83', '9.75']
and I want to write the first 3 items to my csv file.
I am using this code:
import csv

n = 0
with open("file.csv", "w", newline="") as e:
    while n < 3:
        writer = csv.writer(e)
        writer.writerow(my_array[n])
        n = n + 1
The output is:
5,.,0
6,.,0,6,6
7,.,5
but I don't want to separate numbers with comma
for example the output must be:
5.0
6.066
7.5
What should I do?

writerow() wants an iterable, and each element of that iterable is written as one column of the csv. Because a string is itself an iterable, each element (character) of that string is written to a separate column. To fix this, pass writerow() a list that contains only your single string.
You can do something like this:
import csv

my_array = ['5.0', '6.066', '7.5', '7.83', '9.75']
with open('example.csv', 'w+') as f:
    writer = csv.writer(f)
    for elm in my_array[:3]:
        writer.writerow([elm])
Output of example.csv:
5.0
6.066
7.5

Apparently you don't actually want a CSV (comma separated values) file, so don't use the csv module.
Simply write each string from the list to a separate line in the file.
Some options are:
with open("file.txt", "w") as f:
    for value in my_array[:3]:
        print(value, file=f)

with open("file.txt", "w") as f:
    f.write("\n".join(my_array[:3]))
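If you would rather keep the csv module anyway, the same result can be had in one call by wrapping each string in a one-element list (a sketch of the same idea as the answer above):

```python
import csv

my_array = ['5.0', '6.066', '7.5', '7.83', '9.75']

with open("file.csv", "w", newline="") as f:
    writer = csv.writer(f)
    # writerows() takes an iterable of rows; each one-element list
    # becomes a row with a single column, so nothing is split up
    writer.writerows([value] for value in my_array[:3])
```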

Related

List to csv without commas in Python

I have a following problem.
I would like to save a list into a csv (in the first column).
See example here:
import csv

mylist = ["Hallo", "der Pixer", "Glas", "Telefon", "Der Kühlschrank brach kaputt."]

def list_na_csv(file, mylist):
    with open(file, "w", newline="") as csv_file:
        csv_writer = csv.writer(csv_file)
        csv_writer.writerows(mylist)

list_na_csv("example.csv", mylist)
My output in excel looks like this:
Desired output is:
You can see that I have two issues: firstly, each character is followed by a comma; secondly, I don't know how to specify an encoding, for example UTF-8 or cp1250. How can I fix this, please?
I tried to search similar question, but nothing worked for me. Thank you.
You have two problems here.
writerows expects a list of rows, said differently, a list of iterables. As a string is iterable, you write each word in a different row, one character per field. If you want one row with one word per field, you should use writerow:
csv_writer.writerow(mylist)
by default, the csv module uses the comma as the delimiter (this is the most common one). But Excel is a pain in the ass with it: it expects the delimiter to be the locale's, which is the semicolon (;) in many Western European countries, including Germany. If you want to use your file easily with Excel, you should change the delimiter:
csv_writer = csv.writer(csv_file, delimiter=';')
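To see the effect of the delimiter without touching the filesystem, you can write into an io.StringIO buffer (a minimal sketch):

```python
import csv
import io

buf = io.StringIO()
writer = csv.writer(buf, delimiter=';')
writer.writerow(["Hallo", "der Pixer", "Glas"])

# each list element lands in its own semicolon-separated field
print(buf.getvalue())
```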
After your edit, you want all the data in the first column, one element per row. This is kind of a degenerate csv file, because it only has one value per record and no separator. If the fields can never contain a semicolon or a newline, you could just write a plain text file:
...
with open(file, "w", newline="") as csv_file:
    for row in mylist:
        print(row, file=csv_file)
...
If you want to be safe and prevent future problems if you later want to process more corner cases values, you could still use the csv module and write one element per row by including it in another iterable:
...
with open(file, "w", newline="") as csv_file:
    csv_writer = csv.writer(csv_file, delimiter=';')
    csv_writer.writerows([elt] for elt in mylist)
...
l = ["Hallo", "der Pixer", "Glas", "Telefon", "Der Kühlschrank brach kaputt."]

with open("file.csv", "w") as msg:
    msg.write(",".join(l))
For less trivial examples:
l = ["Hallo", "der, Pixer", "Glas", "Telefon", "Der Kühlschrank, brach kaputt."]

with open("file.csv", "w") as msg:
    msg.write(",".join(['"' + x + '"' for x in l]))
Here you basically wrap every list element in quotes, to prevent the intra-field comma problem.
Try this; it will work:
import csv

mylist = ["Hallo", "der Pixer", "Glas", "Telefon", "Der Kühlschrank brach kaputt."]

def list_na_csv(file, mylist):
    with open(file, "w") as csv_file:
        csv_writer = csv.writer(csv_file)
        csv_writer.writerow(mylist)

list_na_csv("example.csv", mylist)
If you want to write the entire list of strings to a single row, use csv_writer.writerow(mylist) as mentioned in the comments.
If you want to write each string to a new row, as I believe your reference to writing them in the first column implies, you'll have to format your data as the class expects: "A row must be an iterable of strings or numbers for Writer objects". On this data that would look something like:
csv_writer.writerows((entry,) for entry in mylist)
There, I'm using a generator expression to wrap each word in a tuple, thus making it an iterable of strings. Without something like that, your strings are themselves iterables and lead to it delimiting between each character as you've seen.
Using csv to write a single entry per line is almost pointless, but it does have the advantage that it will escape your delimiter if it appears in the data.
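That escaping is easy to demonstrate: with the default QUOTE_MINIMAL behavior, the writer only adds quotes when a field actually contains the delimiter (a minimal sketch):

```python
import csv
import io

buf = io.StringIO()
writer = csv.writer(buf, delimiter=';')
writer.writerow(["Glas"])                            # no quoting needed
writer.writerow(["Der Kühlschrank; brach kaputt."])  # field contains ';'
```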
To specify an encoding, the docs say:
Since open() is used to open a CSV file for reading, the file will by default be decoded into unicode using the system default encoding (see locale.getpreferredencoding()). To decode a file using a different encoding, use the encoding argument of open:

    import csv
    with open('some.csv', newline='', encoding='utf-8') as f:
        reader = csv.reader(f)
        for row in reader:
            print(row)

The same applies to writing in something other than the system default encoding: specify the encoding argument when opening the output file.
Try split("\n").
Example:

    counter = 0
    amazing_list = ["hello", "hi"]
    for x in amazing_list:
        ok = amazing_list[counter].split("\n")
        writer.writerow(ok)
        counter += 1

Incorrect format in CSV excel file when loading JSON data into CSV

I have some data in a JSON file, and I have used the code below to write them into csv file, but I found that the each word in a sentence has occupied one column, I want to store whole sentence in a single column.
This is the code:
import csv
import json

for line in open('test1.json', 'r'):
    if not line.strip():
        continue
    data = json.loads(line)
    text = data["text"]
    filtered_text = clean_tweets(text)
    print(filtered_text)
    with open('test1.csv', 'a', encoding='utf-8') as f:
        csvWriter = csv.writer(f)
        csvWriter.writerow(filtered_text)
This is the output of csv file.
writerow() expects an iterable parameter. Each item in the iterable is placed in its own column. Strings are iterable, hence you get a single character in each column.
Put the string(s) in a list:
csvWriter.writerow([filtered_text])
But since you seem to only have one column, using the csv module is unnecessary. Just use:
with open('test1.csv', 'a', encoding='utf8') as f:
    f.write(filtered_text + '\n')  # add newline if needed
Another option:
with open('test1.csv', 'a', encoding='utf8') as f:
    print(filtered_text, file=f)  # print() adds the newline
csvWriter.writerow() takes a list (or similar) of columns; you need to supply, for example, csvWriter.writerow([tweet_id, filtered_tweet]).
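Putting that together for the JSON case, here is a sketch that keeps each whole sentence in a single column (the field names "id" and "text" are just illustrative, matching the question's data["text"] access):

```python
import csv
import io
import json

# two JSON records, one per line, standing in for test1.json
lines = ['{"id": 1, "text": "first whole sentence"}',
         '{"id": 2, "text": "second whole sentence"}']

buf = io.StringIO()
writer = csv.writer(buf)
for line in lines:
    data = json.loads(line)
    # wrap the values in a list: one row, one value per column
    writer.writerow([data["id"], data["text"]])
```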

How to split tsv file into smaller tsv file based on row values

I have a tsv file in.txt which I would like to split into a smaller tsv file called out.txt.
I would like to import only the rows of in.txt which contain a string value My String Value in column 6 into out.txt.
import csv
# r is textmode
# rb is binary mode
# binary mode is faster
with open('in.txt', 'rb') as tsvIn, open('out.txt', 'w') as tsvOut:
    tsvIn = csv.reader(tsvIn, delimiter='\t')
    tsvOut = csv.writer(tsvOut)
    for row in tsvIn:
        if "My String Value" in row:
            tsvOut.writerows(row)
My output looks like this.
D,r,a,m,a
1,9,6,1,-,0,4,-,1,3
H,y,u,n, ,M,o,k, ,Y,o,o
B,e,o,m,-,s,e,o,n, ,L,e,e
M,u,-,r,y,o,n,g, ,C,h,o,i,",", ,J,i,n, ,K,y,u, ,K,i,m,",", ,J,e,o,n,g,-,s,u,k, ,M,o,o,n,",", ,A,e,-,j,a, ,S,e,o
A, ,p,u,b,l,i,c, ,a,c,c,o,u,n,t,a,n,t,',s, ,s,a,l,a,r,y, ,i,s, ,f,a,r, ,t,o,o, ,s,m,a,l,l, ,f,o,r, ,h,i,m, ,t,o, ,e,v,e,n, ,g,e,t, ,a, ,c,a,v,i,t,y, ,f,i,x,e,d,",", ,l,e,t, ,a,l,o,n,e, ,s,u,p,p,o,r,t, ,h,i,s, ,f,a,m,i,l,y,., ,H,o,w,e,v,e,r,",", ,h,e, ,m,u,s,t, ,s,o,m,e,h,o,w, ,p,r,o,v,i,d,e, ,f,o,r, ,h,i,s, ,s,e,n,i,l,e,",", ,s,h,e,l,l,-,s,h,o,c,k,e,d, ,m,o,t,h,e,r,",", ,h,i,s, ,.,.,.
K,o,r,e,a,n,",", ,E,n,g,l,i,s,h
S,o,u,t,h, ,K,o,r,e,a
It should look like this with tab separated values
Drama Hyn Mok Yoo A public accountant's salary is far to small for him...etc
There are a few things wrong with your code. Let's look at it line by line.
import csv
Import module csv. Ok.
with open('in.txt','rb') as tsvIn, open('out.txt', 'w') as tsvOut:
With auto-closed binary file read handle tsvIn from in.txt, and text write handle tsvOut from out.txt, do... (Note: you probably want to use mode wb instead of mode w; see this post)
tsvIn = csv.reader(tsvIn, delimiter='\t')
Let tsvIn be the result of the call of function reader in module csv with arguments tsvIn and delimiter='\t'. Ok.
tsvOut = csv.writer(tsvOut)
Let tsvOut be the result of the call of function writer in module csv with argument tsvOut. You proably want to add another argument, delimiter='\t', too.
for row in tsvIn:
For each element in tsvIn as row, do...
if "My String Value" in row:
If string "My String Value" is present in row. You mentioned that you wanted to show only those rows whose sixth element was equal to the string, thus you should use something like this instead...
if len(row) >= 6 and row[5] == "My String Value":
This means: If the length of row is at least 6, and the sixth element of row is equal to "My String Value", do...
tsvOut.writerows(row)
Call method writerows of object tsvOut with argument row. Remember that in Python, a string is just a sequence of characters, and a character is a single-element string. Thus, a character is a sequence. Then, we have that row is, according to the docs, a list of strings, each representing a column of the row. Thus, a row is a list of strings. Then, we have the writerows method, which expects a list of rows, that is, a list of lists of strings, that is, a list of lists of sequences of characters. It happens that you can interpret each of row's elements as a row, when it's actually a string, and each element of that string as a string (as characters are strings!). All of this means that you'll get a messy, character-by-character output. You should try this instead...
tsvOut.writerow(row)
Method writerow expects a single row as an argument, not a list of rows, thus this will yield the expected result.
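The corrected loop, end to end, might look like this (sketched against an in-memory TSV instead of in.txt):

```python
import csv
import io

# a two-row stand-in for in.txt; only the first row matches in column 6
tsv_in = ("c1\tc2\tc3\tc4\tc5\tMy String Value\tc7\n"
          "c1\tc2\tc3\tc4\tc5\tother\tc7\n")

out = io.StringIO()
reader = csv.reader(io.StringIO(tsv_in), delimiter='\t')
writer = csv.writer(out, delimiter='\t')
for row in reader:
    # keep only rows whose sixth column equals the target string
    if len(row) >= 6 and row[5] == "My String Value":
        writer.writerow(row)
```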
Try this:

import csv

with open('in.txt', 'r') as tsvIn, open('out.txt', 'w') as tsvOut:
    reader = csv.reader(tsvIn, delimiter='\t')
    writer = csv.writer(tsvOut, delimiter='\t')
    for row in reader:
        if "My String Value" in row:
            writer.writerow(row)

Use python to parse values from ping output into csv

I wrote code using re to look for "time=" and save the following value in a string. Then I use the csv.writer method writerow, but each number is interpreted as a column, and this gives me trouble later. Unfortunately there is no 'writecolumn' method. Should I save the values as an array instead of a string and write every row separately?
import re
import csv

inputfile = open("ping.txt")
teststring = inputfile.read()
values = re.findall(r'time=(\d+.\d+)', teststring)
with open('parsed_ping.csv', 'wb') as csvfile:
    writer = csv.writer(csvfile, delimiter=' ',
                        quotechar='|', quoting=csv.QUOTE_MINIMAL)
    writer.writerow(values)
EDIT: I understood that "values" is already a list. I tried to iterate it and write a row for each item with
for item in values:
    writer.writerow(item)
Now I get a space after each character, like
4 6 . 6
4 7 . 7
EDIT2: The spaces are the delimiters. If I change the delimiter to a comma, I get commas between digits. I just don't get why it's interpreting each digit as a separate column.
If your csv file only contains one column, it's not really a "comma-separated file" anymore, is it?
Just write the list to the file directly:
import re

inputfile = open("ping.txt")
teststring = inputfile.read()
values = re.findall(r'time=(\d+\.\d+)', teststring)
with open('parsed_ping.csv', 'w') as csvfile:
    csvfile.write("\n".join(values))
I solved this. I just needed to use square brackets in the writer.
for item in values:
    writer.writerow([item])
This gives me the correct output.
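The difference between the two calls is easy to see side by side (a minimal sketch):

```python
import csv
import io

buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow("46.6")    # a string: every character becomes a column
writer.writerow(["46.6"])  # a one-element list: one single column
```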

Parse a string containing a large integer in Python

I am having trouble parsing a data set from a .txt file into an Excel file (.csv) in Python.
The source code looks like:
fin = open(filename, 'r')
reader = csv.reader(fin)
for line in reader:
    list3 = str(line).split()
    print list3
    print str(list3[1])
My data sample looks like:
10134.5 -123 9.9527
And Python screen output looks like this
["['10134.5", '-123', '9.9527,"']"
-131.7000
So I'm assuming list3[1] is a float or a number at this moment, which causes some overflow because 100,000 is larger than it can hold...
Do you know how to make Python treat it as a string, not an integer?
You do not need to split, or to cast to string... the numbers inside the list are already strings.
fin = open(filename, 'r')
reader = csv.reader(fin)
for line in reader:
    print(line)
output
['10134.5', '-123', '9.9527']
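Since csv.reader already hands you strings, you only need an explicit conversion when you actually want numbers (a sketch using an in-memory file in place of the .txt input):

```python
import csv
import io

# the sample line from the question, space-delimited
data = io.StringIO("10134.5 -123 9.9527")
row = next(csv.reader(data, delimiter=' '))

# row is a list of strings; convert on demand
numbers = [float(x) for x in row]
```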
