python doesn't append each line but skips some

python doesn't append each line but skips some - python

I have a complete_list_of_records which has a length of 550
this list would look something like this:
Apples
Pears
Bananas
The issue is that when i use:
with open("recordedlines.txt", "a") as recorded_lines:
for i in complete_list_of_records:
recorded_lines.write(i)
the outcome of the file is 393 long and the structure someplaces looks like so
Apples
PearsBananas
Pineapples
I have tried with "w" instead of "a" append and manually inserted "\n" for each item in the list but this just creates blank spaces on every second row and still som rows have the same issue with dual lines in one.
Anyone who has encountered something similar?

From the comments seen so far, I think there are strings in the source list that contain newline characters in positions other than at the end. Also, it seems that some strings end with newline character(s) but not all.
I suggest replacing embedded newlines with some other character - e.g., underscore.
Therefore I suggest this:
with open("recordedlines.txt", "w") as recorded_lines:
for line in complete_list_of_records:
line = line.rstrip() # remove trailing whitespace
line = line.replace('\n', '_') # replace any embedded newlines with underscore
print(line, file=recorded_lines) # print function will add a newline

You could simply strip all whitespaces off in any case and then insert a newline per hand like so:
with open("recordedlines.txt", "a") as recorded_lines:
for i in complete_list_of_records:
recorded_lines.write(i.strip() + "\n")

you need to use
file.writelines(listOfRecords)
but the list values must have '\n'
f = open("demofile3.txt", "a")
li = ["See you soon!", "Over and out."]
li = [i+'\n' for i in li]
f.writelines(li)
f.close()
#open and read the file after the appending:
f = open("demofile3.txt", "r")
print(f.read())
output will be
See you soon!
Over and out.
you can also use for loop with write() having '\n' at each iteration

[Soln][1]
complete_list_of_records =['1.Apples','2.Pears','3.Bananas','4.Pineapples']
with open("recordedlines.txt", "w") as recorded_lines:
for i in complete_list_of_records:
recorded_lines.write(i+"\n")
I think it should work.
Make sure that, you write as a string.

Related

Im trying to unify words together on my python application [duplicate]

This question already has answers here:
How to read a file without newlines?
(12 answers)
Closed 5 years ago.
I have a .txt file with values in it.
The values are listed like so:
Value1
Value2
Value3
Value4
My goal is to put the values in a list. When I do so, the list looks like this:
['Value1\n', 'Value2\n', ...]
The \n is not needed.
Here is my code:
t = open('filename.txt')
contents = t.readlines()

This should do what you want (file contents in a list, by line, without \n)
with open(filename) as f:
mylist = f.read().splitlines()

I'd do this:
alist = [line.rstrip() for line in open('filename.txt')]
or:
with open('filename.txt') as f:
alist = [line.rstrip() for line in f]

You can use .rstrip('\n') to only remove newlines from the end of the string:
for i in contents:
alist.append(i.rstrip('\n'))
This leaves all other whitespace intact. If you don't care about whitespace at the start and end of your lines, then the big heavy hammer is called .strip().
However, since you are reading from a file and are pulling everything into memory anyway, better to use the str.splitlines() method; this splits one string on line separators and returns a list of lines without those separators; use this on the file.read() result and don't use file.readlines() at all:
alist = t.read().splitlines()

After opening the file, list comprehension can do this in one line:
fh=open('filename')
newlist = [line.rstrip() for line in fh.readlines()]
fh.close()
Just remember to close your file afterwards.

I used the strip function to get rid of newline character as split lines was throwing memory errors on 4 gb File.
Sample Code:
with open('C:\\aapl.csv','r') as apple:
for apps in apple.readlines():
print(apps.strip())

for each string in your list, use .strip() which removes whitespace from the beginning or end of the string:
for i in contents:
alist.append(i.strip())
But depending on your use case, you might be better off using something like numpy.loadtxt or even numpy.genfromtxt if you need a nice array of the data you're reading from the file.

from string import rstrip
with open('bvc.txt') as f:
alist = map(rstrip, f)
Nota Bene: rstrip() removes the whitespaces, that is to say : \f , \n , \r , \t , \v , \x and blank ,
but I suppose you're only interested to keep the significant characters in the lines. Then, mere map(strip, f) will fit better, removing the heading whitespaces too.
If you really want to eliminate only the NL \n and RF \r symbols, do:
with open('bvc.txt') as f:
alist = f.read().splitlines()
splitlines() without argument passed doesn't keep the NL and RF symbols (Windows records the files with NLRF at the end of lines, at least on my machine) but keeps the other whitespaces, notably the blanks and tabs.
.
with open('bvc.txt') as f:
alist = f.read().splitlines(True)
has the same effect as
with open('bvc.txt') as f:
alist = f.readlines()
that is to say the NL and RF are kept

I had the same problem and i found the following solution to be very efficient. I hope that it will help you or everyone else who wants to do the same thing.
First of all, i would start with a "with" statement as it ensures the proper open/close of the file.
It should look something like this:
with open("filename.txt", "r+") as f:
contents = [x.strip() for x in f.readlines()]
If you want to convert those strings (every item in the contents list is a string) in integer or float you can do the following:
contents = [float(contents[i]) for i in range(len(contents))]
Use int instead of float if you want to convert to integer.
It's my first answer in SO, so sorry if it's not in the proper formatting.

I recently used this to read all the lines from a file:
alist = open('maze.txt').read().split()
or you can use this for that little bit of extra added safety:
with f as open('maze.txt'):
alist = f.read().split()
It doesn't work with whitespace in-between text in a single line, but it looks like your example file might not have whitespace splitting the values. It is a simple solution and it returns an accurate list of values, and does not add an empty string: '' for every empty line, such as a newline at the end of the file.

with open('D:\\file.txt', 'r') as f1:
lines = f1.readlines()
lines = [s[:-1] for s in lines]

The easiest way to do this is to write file.readline()[0:-1]
This will read everything except the last character, which is the newline.

Removing formatting when reading from file [duplicate]

This question already has answers here:
How to read a file without newlines?
(12 answers)
Closed 5 years ago.
I have a .txt file with values in it.
The values are listed like so:
Value1
Value2
Value3
Value4
My goal is to put the values in a list. When I do so, the list looks like this:
['Value1\n', 'Value2\n', ...]
The \n is not needed.
Here is my code:
t = open('filename.txt')
contents = t.readlines()

This should do what you want (file contents in a list, by line, without \n)
with open(filename) as f:
mylist = f.read().splitlines()

I'd do this:
alist = [line.rstrip() for line in open('filename.txt')]
or:
with open('filename.txt') as f:
alist = [line.rstrip() for line in f]

You can use .rstrip('\n') to only remove newlines from the end of the string:
for i in contents:
alist.append(i.rstrip('\n'))
This leaves all other whitespace intact. If you don't care about whitespace at the start and end of your lines, then the big heavy hammer is called .strip().
However, since you are reading from a file and are pulling everything into memory anyway, better to use the str.splitlines() method; this splits one string on line separators and returns a list of lines without those separators; use this on the file.read() result and don't use file.readlines() at all:
alist = t.read().splitlines()

After opening the file, list comprehension can do this in one line:
fh=open('filename')
newlist = [line.rstrip() for line in fh.readlines()]
fh.close()
Just remember to close your file afterwards.

I used the strip function to get rid of newline character as split lines was throwing memory errors on 4 gb File.
Sample Code:
with open('C:\\aapl.csv','r') as apple:
for apps in apple.readlines():
print(apps.strip())

for each string in your list, use .strip() which removes whitespace from the beginning or end of the string:
for i in contents:
alist.append(i.strip())
But depending on your use case, you might be better off using something like numpy.loadtxt or even numpy.genfromtxt if you need a nice array of the data you're reading from the file.

from string import rstrip
with open('bvc.txt') as f:
alist = map(rstrip, f)
Nota Bene: rstrip() removes the whitespaces, that is to say : \f , \n , \r , \t , \v , \x and blank ,
but I suppose you're only interested to keep the significant characters in the lines. Then, mere map(strip, f) will fit better, removing the heading whitespaces too.
If you really want to eliminate only the NL \n and RF \r symbols, do:
with open('bvc.txt') as f:
alist = f.read().splitlines()
splitlines() without argument passed doesn't keep the NL and RF symbols (Windows records the files with NLRF at the end of lines, at least on my machine) but keeps the other whitespaces, notably the blanks and tabs.
.
with open('bvc.txt') as f:
alist = f.read().splitlines(True)
has the same effect as
with open('bvc.txt') as f:
alist = f.readlines()
that is to say the NL and RF are kept

I had the same problem and i found the following solution to be very efficient. I hope that it will help you or everyone else who wants to do the same thing.
First of all, i would start with a "with" statement as it ensures the proper open/close of the file.
It should look something like this:
with open("filename.txt", "r+") as f:
contents = [x.strip() for x in f.readlines()]
If you want to convert those strings (every item in the contents list is a string) in integer or float you can do the following:
contents = [float(contents[i]) for i in range(len(contents))]
Use int instead of float if you want to convert to integer.
It's my first answer in SO, so sorry if it's not in the proper formatting.

I recently used this to read all the lines from a file:
alist = open('maze.txt').read().split()
or you can use this for that little bit of extra added safety:
with f as open('maze.txt'):
alist = f.read().split()
It doesn't work with whitespace in-between text in a single line, but it looks like your example file might not have whitespace splitting the values. It is a simple solution and it returns an accurate list of values, and does not add an empty string: '' for every empty line, such as a newline at the end of the file.

with open('D:\\file.txt', 'r') as f1:
lines = f1.readlines()
lines = [s[:-1] for s in lines]

The easiest way to do this is to write file.readline()[0:-1]
This will read everything except the last character, which is the newline.

Reduce multiple blank lines to single (Pythonically)

How can I reduce multiple blank lines in a text file to a single line at each occurrence?
I have read the entire file into a string, because I want to do some replacement across line endings.
with open(sourceFileName, 'rt') as sourceFile:
sourceFileContents = sourceFile.read()
This doesn't seem to work
while '\n\n\n' in sourceFileContents:
sourceFileContents = sourceFileContents.replace('\n\n\n', '\n\n')
and nor does this
sourceFileContents = re.sub('\n\n\n+', '\n\n', sourceFileContents)
It's easy enough to strip them all, but I want to reduce multiple blank lines to a single one, each time I encounter them.
I feel that I'm close, but just can't get it to work.

This is a reach, but perhaps some of the lines aren't completely blank (i.e. they have only whitespace characters that give the appearance of blankness). You could try removing all possible whitespace between newlines.
re.sub(r'(\n\s*)+\n+', '\n\n', sourceFileContents)
Edit: realized the second '+' was superfluous, as the \s* will catch newlines between the first and last. We just want to make sure the last character is definitely a newline so we don't remove leading whitespace from a line with other content.
re.sub(r'(\n\s*)+\n', '\n\n', sourceFileContents)
Edit 2
re.sub(r'\n\s*\n', '\n\n', sourceFileContents)
Should be an even simpler solution. We really just want to a catch any possible space (which includes intermediate newlines) between our two anchor newlines that will make the single blank line and collapse it down to just the two newlines.

Your code works for me. Maybe there is a chance of carriage return \r would be present.
re.sub(r'[\r\n][\r\n]{2,}', '\n\n', sourceFileContents)

You can use just str methods split and join:
text = "some text\n\n\n\nanother line\n\n"
print("\n".join(item for item in text.split('\n') if item))

Very simple approach using re module
import re
text = 'Abc\n\n\ndef\nGhijk\n\nLmnop'
text = re.sub('[\n]+', '\n', text) # Replacing one or more consecutive newlines with single \n
Result:
'Abc\ndef\nGhijk\nLmnop'

If the lines are completely empty, you can use regex positive lookahead to replace them with single lines:
sourceFileContents = re.sub(r'\n+(?=\n)', '\n', sourceFileContents)

If you replace your read statement with the following, then you don't have to worry about whitespace or carriage returns:
with open(sourceFileName, 'rt') as sourceFile:
sourceFileContents = ''.join([l.rstrip() + '\n' for l in sourceFile])
After doing this, both of your methods you tried in the OP work.
OR
Just write it out in a simple loop.
with open(sourceFileName, 'rt') as sourceFile:
lines = ['']
for line in (l.rstrip() for l in sourceFile):
if line != '' or lines[-1] != '\n':
lines.append(line + '\n')
sourceFileContents = "".join(lines)

I guess another option which is longer, but maybe prettier?
with open(sourceFileName, 'rt') as sourceFile:
last_line = None
lines = []
for line in sourceFile:
# if you want to skip lines with only whitespace, you could add something like:
# line = line.lstrip(" \t")
if last_line != "\n":
lines.append(line)
last_line = line
contents = "".join(lines)
I was trying to find some clever generator function way of writing this, but it's been a long week so I can't.
Code untested, but I think it should work?
(edit: One upside is I removed the need for regular expressions which fixes the "now you have two problems" problem :) )
(another edit based on Marc Chiesa's suggestion of lingering whitespace)

For someone who can't do regex like me, if the code to process is python:
import autopep8
autopep8.fixcode('your_code')
Another quick solution, just in case your code isn't Python:
for x in range(100):
content.replace(" ", " ") # reduce the number of multiple whitespaces
# then
for x in range(20):
content.replace("\n\n", "\n") # reduce the number of multiple white lines
Note that if you have more than 100 consecutive whitespaces or 20 consecutive new lines, you'll want to increase the repetition times.

If decoding from unicode, watch out for non-breaking spaces which show up in cat -vet as M-BM-:
sourceFileContents = sourceFile.read()
sourceFileContents = re.sub(r'\n(\s*\n)+','\n\n',sourceFileContents.replace("\xc2\xa0"," "))

Getting rid of \n when using .readlines() [duplicate]

This question already has answers here:
How to read a file without newlines?
(12 answers)
Closed 5 years ago.
I have a .txt file with values in it.
The values are listed like so:
Value1
Value2
Value3
Value4
My goal is to put the values in a list. When I do so, the list looks like this:
['Value1\n', 'Value2\n', ...]
The \n is not needed.
Here is my code:
t = open('filename.txt')
contents = t.readlines()

This should do what you want (file contents in a list, by line, without \n)
with open(filename) as f:
mylist = f.read().splitlines()

I'd do this:
alist = [line.rstrip() for line in open('filename.txt')]
or:
with open('filename.txt') as f:
alist = [line.rstrip() for line in f]

You can use .rstrip('\n') to only remove newlines from the end of the string:
for i in contents:
alist.append(i.rstrip('\n'))
This leaves all other whitespace intact. If you don't care about whitespace at the start and end of your lines, then the big heavy hammer is called .strip().
However, since you are reading from a file and are pulling everything into memory anyway, better to use the str.splitlines() method; this splits one string on line separators and returns a list of lines without those separators; use this on the file.read() result and don't use file.readlines() at all:
alist = t.read().splitlines()

After opening the file, list comprehension can do this in one line:
fh=open('filename')
newlist = [line.rstrip() for line in fh.readlines()]
fh.close()
Just remember to close your file afterwards.

I used the strip function to get rid of newline character as split lines was throwing memory errors on 4 gb File.
Sample Code:
with open('C:\\aapl.csv','r') as apple:
for apps in apple.readlines():
print(apps.strip())

for each string in your list, use .strip() which removes whitespace from the beginning or end of the string:
for i in contents:
alist.append(i.strip())
But depending on your use case, you might be better off using something like numpy.loadtxt or even numpy.genfromtxt if you need a nice array of the data you're reading from the file.

from string import rstrip
with open('bvc.txt') as f:
alist = map(rstrip, f)
Nota Bene: rstrip() removes the whitespaces, that is to say : \f , \n , \r , \t , \v , \x and blank ,
but I suppose you're only interested to keep the significant characters in the lines. Then, mere map(strip, f) will fit better, removing the heading whitespaces too.
If you really want to eliminate only the NL \n and RF \r symbols, do:
with open('bvc.txt') as f:
alist = f.read().splitlines()
splitlines() without argument passed doesn't keep the NL and RF symbols (Windows records the files with NLRF at the end of lines, at least on my machine) but keeps the other whitespaces, notably the blanks and tabs.
.
with open('bvc.txt') as f:
alist = f.read().splitlines(True)
has the same effect as
with open('bvc.txt') as f:
alist = f.readlines()
that is to say the NL and RF are kept

I had the same problem and i found the following solution to be very efficient. I hope that it will help you or everyone else who wants to do the same thing.
First of all, i would start with a "with" statement as it ensures the proper open/close of the file.
It should look something like this:
with open("filename.txt", "r+") as f:
contents = [x.strip() for x in f.readlines()]
If you want to convert those strings (every item in the contents list is a string) in integer or float you can do the following:
contents = [float(contents[i]) for i in range(len(contents))]
Use int instead of float if you want to convert to integer.
It's my first answer in SO, so sorry if it's not in the proper formatting.

I recently used this to read all the lines from a file:
alist = open('maze.txt').read().split()
or you can use this for that little bit of extra added safety:
with f as open('maze.txt'):
alist = f.read().split()
It doesn't work with whitespace in-between text in a single line, but it looks like your example file might not have whitespace splitting the values. It is a simple solution and it returns an accurate list of values, and does not add an empty string: '' for every empty line, such as a newline at the end of the file.

with open('D:\\file.txt', 'r') as f1:
lines = f1.readlines()
lines = [s[:-1] for s in lines]

The easiest way to do this is to write file.readline()[0:-1]
This will read everything except the last character, which is the newline.

CSV File to list

I have a CSV file which is made of words in the first column. (1 word per row)
I need to print a list of these words, i.e.
CSV File:
a
and
because
have
Output wanted:
"a","and","because","have"
I am using python and so far I have the follwing code;
text=open('/Users/jessieinchauspe/Dropbox/Smesh/TMT/zipf.csv')
text1 = ''.join(ch for ch in text)
for word in text1:
print '"' + word + '"' +','
This is returning:
"a",
"",
"a",
"n",
...
Whereas I need everything one one line, and not by character but by word.
Thank you for your help!
EDIT: this is a screenshot of the preview of the CSV file

Just loop over the file directly:
with open('/Users/jessieinchauspe/Dropbox/Smesh/TMT/zipf.csv') as text:
print ','.join('"{0}"'.format(word.strip()) for word in text)
The above code:
Loops over the file; this gives you a line (including the newline \n character).
Uses .strip() to remove whitespace around the word (including the newline).
Uses .format() to put the word in quotes ('word' becomes '"word"')
Uses ','.join() to join all quoted words together into one list with commas in between.

When you do :
text=open('/Users/jessieinchauspe/Dropbox/Smesh/TMT/zipf.csv')
that basically returns an iterator with each line as an element. If you want a list out of that and you're sure that there is only one word per line than all you need to do is
result=list(text)
print result
Otherwise you can get the first words only like so :
result = list(x.split(',')[0] for x in text)
print result

You could also use the CSV module:
import csv
input_f = '/Users/jessieinchauspe/Dropbox/Smesh/TMT/zipf.csv'
output_f = '/Users/jessieinchauspe/Dropbox/Smesh/TMT/output.csv'
with open(input_f, 'r') as input_handle, open(output_f, 'w') as output_handle:
writer = csv.writer(output_handle)
writer.writerow(list(input_handle))

If you put a comma at the end of the print statement it suppresses the newline.
print '"' + word + '"' +',',
Will give you the output on one line.

print ','.join('"%s"' % line.strip() for line in open('/tmp/test'))

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

python doesn't append each line but skips some - python

You could simply strip all whitespaces off in any case and then insert a newline per hand like so: with open("recordedlines.txt", "a") as recorded_lines: for i in complete_list_of_records: recorded_lines.write(i.strip() + "\n")

[Soln][1] complete_list_of_records =['1.Apples','2.Pears','3.Bananas','4.Pineapples'] with open("recordedlines.txt", "w") as recorded_lines: for i in complete_list_of_records: recorded_lines.write(i+"\n") I think it should work. Make sure that, you write as a string.

Related

Im trying to unify words together on my python application [duplicate]

Removing formatting when reading from file [duplicate]

Reduce multiple blank lines to single (Pythonically)

Getting rid of \n when using .readlines() [duplicate]

CSV File to list

Categories

Resources