join one word lines in a file - python

I have a file with one word per line, and I want to join the lines with a single space. I tried this, but it does not work:
for line in file:
    new = ' '.join(line)
    print(new)
This does not work either:
new = file.replace('\n', ' ')
print(new)

You can also use list comprehensions:
whole_string = " ".join([word.strip() for word in file])
print(whole_string)

You can add each line to a list, then join it up afterwards:
L = []
for line in file:
    L.append(line.strip('\n'))
print(" ".join(L))
Your current solution calls join on a single string (one line) rather than on a list of lines, so it puts a space between every character.

A one line solution to this problem would be the following:
print(open('thefile.txt').read().replace('\n', ' '))

I think this is what you want:
' '.join(l.strip() for l in file)

Yet another way:
with open('yourfilename.txt', 'r') as file:
    words = ' '.join(map(str.rstrip, file))
As you can see from several other answers, file is an iterator, so you can iterate over it, and on each loop it gives you a line read from the file (including the \n at the end, which is why we're all stripping it off).
Logically speaking, map applies the given function (i.e. str.rstrip) to each line read in, and the results are passed on to join.
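As a minimal sketch of the map/join approach described above: the io.StringIO object and its sample words are assumptions that stand in for a real open file, so the snippet runs without touching the disk.

```python
import io

# Stand-in for an open text file with one word per line (assumed sample data).
fake_file = io.StringIO("alpha\nbeta\ngamma\n")

# Iterating the file yields lines with the trailing "\n"; str.rstrip removes it.
joined = " ".join(map(str.rstrip, fake_file))
print(joined)
```

With a real file, the same expression works inside a with open(...) block.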

Related

python doesn't append each line but skips some

I have a complete_list_of_records with a length of 550.
The list looks something like this:
Apples
Pears
Bananas
The issue is that when I use:
with open("recordedlines.txt", "a") as recorded_lines:
    for i in complete_list_of_records:
        recorded_lines.write(i)
the resulting file is only 393 lines long, and in some places it looks like this:
Apples
PearsBananas
Pineapples
I have tried "w" instead of "a" (append) and manually inserted "\n" for each item in the list, but this just creates blank lines on every second row, and some rows still have the same issue of two items on one line.
Has anyone encountered something similar?
From the comments so far, I think some strings in the source list contain newline characters in positions other than the end. It also seems that some strings end with newline character(s), but not all.
I suggest replacing embedded newlines with some other character, e.g. an underscore:
with open("recordedlines.txt", "w") as recorded_lines:
    for line in complete_list_of_records:
        line = line.rstrip()  # remove trailing whitespace
        line = line.replace('\n', '_')  # replace any embedded newlines with underscore
        print(line, file=recorded_lines)  # print function will add a newline
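A small sketch of how this normalization behaves, under assumed data: the records list below is hypothetical (inconsistent trailing newlines plus one embedded newline), and io.StringIO stands in for the output file.

```python
import io

# Hypothetical records: inconsistent trailing newlines plus one embedded newline.
records = ["Apples\n", "Pears", "Bananas\n", "Pine\napples\n"]

out = io.StringIO()  # in-memory stand-in for the output file
for record in records:
    cleaned = record.rstrip().replace("\n", "_")
    print(cleaned, file=out)  # print appends exactly one newline

lines = out.getvalue().splitlines()
print(lines)
```

Each record ends up on exactly one line, so the line count of the output matches the length of the list.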
You could simply strip all whitespace in any case and then add a newline by hand, like so:
with open("recordedlines.txt", "a") as recorded_lines:
    for i in complete_list_of_records:
        recorded_lines.write(i.strip() + "\n")
You can use
file.writelines(listOfRecords)
but each item in the list must already end with '\n', because writelines does not add line separators:
f = open("demofile3.txt", "a")
li = ["See you soon!", "Over and out."]
li = [i + '\n' for i in li]
f.writelines(li)
f.close()
# open and read the file after appending:
f = open("demofile3.txt", "r")
print(f.read())
The output will be
See you soon!
Over and out.
You can also use a for loop that calls write() with '\n' appended on each iteration.
complete_list_of_records = ['1.Apples', '2.Pears', '3.Bananas', '4.Pineapples']
with open("recordedlines.txt", "w") as recorded_lines:
    for i in complete_list_of_records:
        recorded_lines.write(i + "\n")
I think this should work.
Make sure that each item you write is a string.

Can someone explain the 5th line of this code to me?

I understand it a little, but I want an exact explanation of that particular line. I'm confused about the syntax.
Otherwise, I know how the code works and what it is doing. I just want to clarify my understanding of the syntax.
Code:
import docx2txt

def extract_text_from_doc(doc_path):
    temp = docx2txt.process("resumes/Chinmaya_Kaundanya_Resume.docx")
    text = [line.replace('\t', ' ') for line in temp.split('\n') if line]
    return ' '.join(text)
It's the list comprehension version of:
text = []
for line in temp.split('\n'):
    if line:
        text.append(line.replace('\t', ' '))
It iterates through temp line by line; if the line is not empty, it replaces '\t' (tabs) with spaces and puts the result in the list text.
It's basically a list comprehension.
It iterates through each line, checks that the line is not empty, and then replaces the tab characters with spaces.
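To see the equivalence concretely, here is a small sketch. The temp string is an assumed sample standing in for whatever docx2txt.process() returns, so no document file is needed.

```python
# Assumed sample text standing in for the output of docx2txt.process().
temp = "Name:\tAda\n\nSkills:\tPython\n"

# Comprehension form from the question.
text = [line.replace('\t', ' ') for line in temp.split('\n') if line]

# Equivalent explicit loop.
text_loop = []
for line in temp.split('\n'):
    if line:  # empty strings are falsy, so blank lines are skipped
        text_loop.append(line.replace('\t', ' '))

print(' '.join(text))
```

Both forms build the same list; the comprehension just folds the loop, the condition and the append into one expression.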

Python - Get specific characters from text file or from list

I have a text file with this in it:
Curtain Open time: 8:00
When I wrote to the file I used this:
File.write("Curtain Open Time: " + Var_CurtainOpenTime + "\n")
I used the "\n" to move to the next line so that more data can be written. Var_CurtainOpenTime is a variable; in this case it was "8:00". I have some code to read the line, which looks like this:
FileRead = open('File.txt', 'r')
Printing this would read "Curtain Open Time: 8:00".
I want to get just "8:00". I previously used FileRead.split(" ") to separate each word, but I get ["Curtain", "Open", "Time:", "8:00\n"]. So I believe I need to remove the first 3 items somehow and also remove the '\n' from the last item. I don't know how to approach this. Any help?
Try the following:
with open('File.txt') as f:
    # drop the newline, split on whitespace, and take the fourth word of the first line
    result = [line.replace('\n', '').split()[3:][0] for line in f][0]
or just:
FileRead = open('File.txt', 'r')
result = [line.replace('\n', '').split()[3:][0] for line in FileRead][0]
You just need to change .split(" ") to .split() and then take the last list item:
with open('file.txt') as f:
    print(f.read().split()[-1])
Well once you have the list from the split, you can remove the first 3 terms by doing l=l[3:] (where l is your list). Then you can remove the \n by doing s = s[:-1] where s is your desired string. This is using list slicing. You can look at documentation if you want to understand it further.
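A short sketch of the slicing steps described above. The line variable is an assumed sample of what would be read from the file.

```python
# Assumed sample line, as it would be read from the file.
line = "Curtain Open Time: 8:00\n"

words = line.split(" ")   # ['Curtain', 'Open', 'Time:', '8:00\n']
words = words[3:]         # drop the first three items
value = words[0][:-1]     # slice off the trailing '\n'

# Splitting on any whitespace avoids the manual newline handling.
value_simple = line.split()[-1]
print(value, value_simple)
```

Note that [:-1] removes the last character whatever it is, so it is only safe when the line is known to end with a newline.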

python reading file infinite loop

pronunciation_file = open('dictionary.txt')
pronunciation = {}
line = pronunciation_file.readline()
while line != '':
    n_line = line.strip().split(' ', 1)
    pronunciation[n_line[0]] = n_line[1].strip()
    line = pronunciation_file.readline()
print(pronunciation)
The code turns a file of words and their pronunciations into a dictionary (keys are words, values are pronunciations), for example 'A AH0\n...' into {'A': 'AH0', ...}.
The problem: if I put the print inside the loop, it prints normally (although it prints every unfinished dictionary). However, if I put the print outside the loop as above, the shell returns nothing, and when I close it, it says the program is still running (which is probably an infinite loop).
Help please.
I also tried cutting the file down to the first few hundred words and running the program. It works for very short files, but past a certain length it starts returning nothing.
That is not the usual way to read from a file:
pronunciation = {}
# with will also close your file
with open(your_file) as f:
    # iterate over the file object
    for line in f:
        # unpack key/value for your dict and use rstrip
        k, v = line.rstrip().split(' ', 1)
        pronunciation[k] = v
You simply open the file and iterate over the file object. Use .rstrip() if you want to remove whitespace from the end of the string; there is also no need to call strip twice on the same line.
You can also simplify your code to just dict and a generator expression:
with open("dictionary.txt") as f:
    pronunciation = dict(line.rstrip().split(" ", 1) for line in f)
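A runnable sketch of the dict-plus-generator form, assuming a two-line sample in the "WORD PRONUNCIATION" layout described in the question; io.StringIO stands in for the real dictionary.txt.

```python
import io

# Assumed sample in the "WORD PRONUNCIATION" layout described in the question.
fake_file = io.StringIO("A AH0\nABLE EY1 B AH0 L\n")

# maxsplit=1 keeps everything after the first space together as the value.
pronunciation = dict(line.rstrip().split(" ", 1) for line in fake_file)
print(pronunciation)
```

dict() accepts any iterable of two-item sequences, so a line without a space would raise a ValueError here rather than being silently skipped.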
Not tested, but if you want to use a while loop, the idiom is more like this:
pronunciation = {}
with open(fn) as f:
    while True:
        line = f.readline()
        if not line:
            break
        l, r = line.split(' ', 1)
        pronunciation[l] = r.strip()
But the more modern Python idiom for reading a file line-by-line is to use a for loop as Padraic Cunningham's answer uses. A while loop is more commonly used to read a binary file fixed chunk by fixed chunk in Python.
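For comparison, a sketch of the chunked binary read that the while loop is better suited to. The io.BytesIO object and the chunk size of 4 are assumptions made only for illustration; a real program would use open(path, "rb") and a much larger chunk size.

```python
import io

# In-memory stand-in for a binary file opened with open(path, "rb").
binary_file = io.BytesIO(b"0123456789")

CHUNK_SIZE = 4  # arbitrary size chosen for the illustration
chunks = []
while True:
    chunk = binary_file.read(CHUNK_SIZE)
    if not chunk:  # read() returns b"" at end of file
        break
    chunks.append(chunk)

print(chunks)
```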

How to read a text file into a string variable and strip newlines?

I have a text file that looks like:
ABC
DEF
How can I read the file into a single-line string without newlines, in this case creating a string 'ABCDEF'?
For reading the file into a list of lines, but removing the trailing newline character from each line, see How to read a file without newlines?.
You could use:
with open('data.txt', 'r') as file:
    data = file.read().replace('\n', '')
Or, if the file content is guaranteed to be one line:
with open('data.txt', 'r') as file:
    data = file.read().rstrip()
In Python 3.5 or later, using pathlib you can copy text file contents into a variable and close the file in one line:
from pathlib import Path
txt = Path('data.txt').read_text()
and then you can use str.replace to remove the newlines:
txt = txt.replace('\n', '')
You can read from a file in one line:
text = open('very_Important.txt', 'r').read()
Please note that this does not close the file explicitly.
CPython will normally close the file when the file object is garbage collected, but other Python implementations may not do so promptly.
To write portable code, it is better to use with or to close the file explicitly. Short is not always better. See https://stackoverflow.com/a/7396043/362951
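A minimal sketch showing that a with block closes the file on exit. The temporary file is an assumption used only to make the snippet self-contained; its name and contents are arbitrary.

```python
import os
import tempfile

# Create a throwaway file so the sketch is self-contained (path is arbitrary).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("ABC\nDEF\n")
    path = tmp.name

with open(path) as f:
    data = f.read().replace("\n", "")

closed_after_with = f.closed  # the with block closed the file on exit
os.remove(path)
print(data, closed_after_with)
```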
To join all lines into a string and remove newlines, I normally use:
with open('t.txt') as f:
    s = " ".join([l.rstrip("\n") for l in f])
with open("data.txt") as myfile:
    data = "".join(line.rstrip() for line in myfile)
join() will join a list of strings, and rstrip() with no arguments will trim whitespace, including newlines, from the end of strings.
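The difference between rstrip() and rstrip("\n") can be sketched as follows. The sample content, with trailing spaces on the first line, is an assumption chosen to make the difference visible.

```python
import io

fake_file = io.StringIO("ABC  \nDEF\n")  # assumed sample; note trailing spaces

# rstrip() with no argument trims all trailing whitespace, not only "\n".
data = "".join(line.rstrip() for line in fake_file)

fake_file.seek(0)
# rstrip("\n") removes only trailing newline characters.
data_keep_spaces = "".join(line.rstrip("\n") for line in fake_file)
print(repr(data), repr(data_keep_spaces))
```

Which one is appropriate depends on whether trailing spaces in the file are meaningful.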
This can be done using the read() method :
text_as_string = open('Your_Text_File.txt', 'r').read()
Or, since the default mode is 'r' (read), simply use:
text_as_string = open('Your_Text_File.txt').read()
I'm surprised nobody mentioned splitlines() yet.
with open("data.txt", "r") as myfile:
    data = myfile.read().splitlines()
Variable data is now a list that looks like this when printed:
['LLKKKKKKKKMMMMMMMMNNNNNNNNNNNNN', 'GGGGGGGGGHHHHHHHHHHHHHHHHHHHHEEEEEEEE']
Note there are no newlines (\n).
At that point, it sounds like you want to print back the lines to console, which you can achieve with a for loop:
for line in data:
    print(line)
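A brief sketch of how splitlines() differs from split("\n"), using an assumed sample string with mixed line endings:

```python
text = "ABC\r\nDEF\n"  # assumed sample with mixed line endings

# splitlines() recognizes "\n", "\r\n" and "\r" and drops the separators.
with_splitlines = text.splitlines()

# split("\n") leaves the "\r" in place and yields a trailing empty string.
with_split = text.split("\n")
print(with_splitlines, with_split)
```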
It's hard to tell exactly what you're after, but something like this should get you started:
with open("data.txt", "r") as myfile:
    data = ' '.join([line.replace('\n', '') for line in myfile.readlines()])
I have fiddled around with this for a while and prefer to use read in combination with rstrip. Without rstrip("\n"), the string keeps the file's trailing newline, which in most cases is not very useful.
with open("myfile.txt") as f:
    file_content = f.read().rstrip("\n")
print(file_content)
Here are four snippets to choose from:
with open("my_text_file.txt", "r") as file:
    data = file.read().replace("\n", "")
or
with open("my_text_file.txt", "r") as file:
    data = "".join(file.read().split("\n"))
or
with open("my_text_file.txt", "r") as file:
    data = "".join(file.read().splitlines())
or
with open("my_text_file.txt", "r") as file:
    data = "".join([line.rstrip("\n") for line in file])
You can compress this into two lines of code:
content = open('filepath', 'r').read().replace('\n', ' ')
print(content)
If your file reads:
hello how are you?
who are you?
blank blank
the Python output is:
hello how are you? who are you? blank blank
You can also strip each line and concatenate into a final string.
myfile = open("data.txt","r")
data = ""
lines = myfile.readlines()
for line in lines:
    data = data + line.strip()
This would also work out just fine.
This is a one line, copy-pasteable solution that also closes the file object:
_ = open('data.txt', 'r'); data = _.read(); _.close()
f = open('data.txt', 'r')
string = ""
while True:
    line = f.readline()
    if not line:
        break
    string += line
f.close()
print(string)
Python 3: search for "list comprehension" if the square bracket syntax is new to you.
with open('data.txt') as f:
    lines = [line.strip('\n') for line in f]
One-liner:
List: "".join([line.rstrip('\n') for line in open('file.txt')])
Generator: "".join(line.rstrip('\n') for line in open('file.txt'))
A list is usually faster than a generator but heavier on memory, because every line is held at once. A generator is usually slower but lighter on memory, because it yields lines one at a time. For "".join(), I think both should work well. Remove the "".join() call to get the list or the generator itself.
Note: explicitly calling close() on the file is probably not needed here.
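A quick sketch confirming that the two forms give the same string. The source text is an assumed sample, and io.StringIO stands in for open('file.txt').

```python
import io

source = "ABC\nDEF\n"  # assumed sample file content

# List comprehension: materializes every stripped line before joining.
from_list = "".join([line.rstrip("\n") for line in io.StringIO(source)])

# Generator expression: hands lines to join() one at a time.
from_generator = "".join(line.rstrip("\n") for line in io.StringIO(source))

print(from_list, from_list == from_generator)
```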
Have you tried this?
x = "yourfilename.txt"
y = open(x, 'r').read()
print(y)
To remove line breaks in Python, you can use the replace method of a string.
This example removes all three types of line breaks (\n, \r\n and \r):
my_string = open('lala.json').read()
print(my_string)
my_string = my_string.replace("\r","").replace("\n","")
print(my_string)
Example file is:
{
"lala": "lulu",
"foo": "bar"
}
You can try it using this repl:
https://repl.it/repls/AnnualJointHardware
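The same idea as a self-contained sketch. The string below is an assumed sample that mixes Unix, Windows and old Mac line endings instead of reading lala.json from disk.

```python
# Assumed sample containing Unix, Windows and old Mac line endings.
my_string = '{\n"lala": "lulu",\r\n"foo": "bar"\r}'

# Removing "\r" and "\n" separately covers "\n", "\r\n" and "\r".
flattened = my_string.replace("\r", "").replace("\n", "")
print(flattened)
```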
I don't feel that anyone addressed the [ ] part of your question. Because the file had multiple lines, reading each line into your variable created a list, even after you replaced the \n with ''. If you have a variable x and print it out with
x
or print(x)
or str(x)
you will see the entire list, brackets included. If you access a single element of the list,
x[0]
the brackets are omitted. If you print the result of str(), you will see just the data, without the '' either.
print(str(x[0]))
Maybe you could try this? I use this in my programs.
Data = open('data.txt', 'r')
data = Data.readlines()
for i in range(len(data)):
    data[i] = data[i].strip() + ' '
data = ''.join(data).strip()
A regular expression works too:
import re
with open("depression.txt") as f:
    l = re.split(' ', re.sub('\n', ' ', f.read()))[:-1]
print(l)
['I', 'feel', 'empty', 'and', 'dead', 'inside']
with open('data.txt', 'r') as file:
    data = [line.strip('\n') for line in file.readlines()]
    data = ''.join(data)
from pathlib import Path
line_lst = Path("to/the/file.txt").read_text().splitlines()
This is arguably the best way to get all the lines of a file; the '\n' characters are already stripped by splitlines(), which recognizes Windows, Mac and Unix line endings.
If you nonetheless want to strip each line:
line_lst = [line.strip() for line in Path("to/the/file.txt").read_text().splitlines()]
strip() is just a useful example; you can process each line as you please.
If, in the end, you just want the concatenated text:
txt = ''.join(Path("to/the/file.txt").read_text().splitlines())
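A self-contained sketch of the pathlib approach. The temporary directory, the file name and the sample content (with extra spaces) are all assumptions made for illustration.

```python
import tempfile
from pathlib import Path

# Throwaway directory and file name, chosen only for this illustration.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / "file.txt"
    path.write_text("  ABC\nDEF  \n")

    line_lst = [line.strip() for line in path.read_text().splitlines()]
    txt = "".join(path.read_text().splitlines())

print(line_lst, repr(txt))
```

Note that splitlines() removes only the line endings; the surrounding spaces survive in txt unless each line is stripped as well.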
This works:
Change your file to:
LLKKKKKKKKMMMMMMMMNNNNNNNNNNNNN GGGGGGGGGHHHHHHHHHHHHHHHHHHHHEEEEEEEE
Then:
file = open("file.txt")
line = file.read()
words = line.split()
This creates a list named words that equals:
['LLKKKKKKKKMMMMMMMMNNNNNNNNNNNNN', 'GGGGGGGGGHHHHHHHHHHHHHHHHHHHHEEEEEEEE']
That got rid of the "\n". To answer the part about the brackets getting in your way, just do this:
for word in words:  # assuming words is the list above
    print(word)  # prints each word in the file on a different line
Or:
print(words[0] + ",", words[1])  # "+" joins without a space; the comma between arguments adds a space
This returns:
LLKKKKKKKKMMMMMMMMNNNNNNNNNNNNN, GGGGGGGGGHHHHHHHHHHHHHHHHHHHHEEEEEEEE
with open(player_name, 'r') as myfile:
    data = myfile.readline()
    words = data.split(" ")
    word = words[0]
This code reads the first line and then uses split to store its space-separated words in a list.
Then you can easily access any word, or even store it in a string.
You can also do the same thing using a for loop.
file = open("myfile.txt", "r")
lines = file.readlines()
result = ''  # start with an empty string
for i in range(len(lines)):
    result += lines[i].rstrip('\n') + ' '
print(result)
Try the following:
with open('data.txt', 'r') as myfile:
    data = myfile.read()
    sentences = data.split('\\n')
    for sentence in sentences:
        print(sentence)
Caution: It does not remove the \n. It is just for viewing the text as if there were no \n
