Python whitespace in file [duplicate] - python

I have a .txt file with some words in it, such as "example". I also have the following code:
name = open("file.txt", "r")
print(name.read())
print("text")
input()
Why is there whitespace in the output, like this:
"example
text"
And how do I stop that from happening?

You may be getting "extra" whitespace because the file "file.txt" ends with trailing whitespace. Check the bytes at the end of the file, especially the '\n' and '\r' characters.
To avoid the problem, strip the text before printing it:
print(name.read().rstrip())
print("text")
str.rstrip removes any whitespace at the end of the string. Although I am not sure what caused your problem, str.rstrip should stop it from happening.
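One more point is worth keeping in mind: print() itself writes a newline after its arguments, so two consecutive print() calls end up on separate lines even when the file has no trailing whitespace. If the goal is to get both pieces on one line, a minimal sketch could look like this. It assumes the file holds the single word "example" followed by a newline, and io.StringIO stands in for the real file so that the snippet is self-contained.

```python
import io

# Stand-in for open("file.txt", "r"); the contents are an assumption for illustration.
name = io.StringIO("example\n")

# rstrip() drops trailing whitespace, including the final "\n" read from the file.
content = name.read().rstrip()

# end="" stops print() from appending its own newline after the text.
print(content, end="")
print("text")
```

With these assumed contents the two calls print exampletext on a single line.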

See "Python Remove Character from String". Removing the newline characters should solve your problem:
Name = open("file.txt", "r")
print(Name.read().replace('\n', ''))
print("text")
input()

You can do something like this:
ans = ''
with open('test.txt', 'r') as f:
    for line in f:
        for word in line.split():
            ans += word + ' '
print(ans)
This separates the text word by word, so you can do whatever you want with each word.
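The same idea can be written more compactly with str.split and str.join. This is only a sketch: io.StringIO stands in for the real file, and the sample text is made up for illustration.

```python
import io

# Stand-in for open('test.txt', 'r'); the sample text is hypothetical.
f = io.StringIO("some   words\nspread over\n\nseveral lines\n")

# split() with no argument splits on any run of whitespace
# (spaces, tabs, newlines), so one call handles the whole file.
words = f.read().split()

# join() puts a single space between words and none after the last one.
ans = " ".join(words)
print(ans)
```

Unlike the loop above, this version leaves no trailing space at the end of the result.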

Related

Python reading from file vs directly assigning literal

I asked a Python question a few minutes ago about how Python's newlines work, only to have it closed as a duplicate of another question that is not even similar and has nothing to do with Python.
I have text containing '\n' and '\t' characters in a file. I read it using
open().read()
I then stored the result in an identifier. My expectation was that text such as
I\nlove\tCoding
read from a file and assigned to an identifier would be the same as the string literal
"I\nlove\tCoding"
assigned directly to an identifier.
My assumption was wrong, though:
word = "I\nlove\tCoding"
ends up being different from
word = open(*.txt).read()
where the content of *.txt is exactly the same as the string "I\nlove\tCoding".
Edit:
I did make a typo; I meant \t and \n. When I search for \t with the re module's search(), it returns None, even though \t is there. Why is this?
You need to differentiate between newlines/tabs and their corresponding escape sequences:
import re

for filename in ('test1.txt', 'test2.txt'):
    print(f"\n{filename} contains:")
    with open(filename, 'r') as f:
        fileData = f.read()
    print(fileData)
    # the first pattern matches the escape sequence (a backslash followed by "n"),
    # the second matches a real newline character
    for pattern in (r'\\n', r'\n'):
        m = re.search(pattern, fileData)
        if m:
            print(f"found {pattern}")
Out:
test1.txt contains:
I\nlove\tCoding
found \\n
test2.txt contains:
I
love Coding
found \n
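The string you get after reading from the file is equivalent to the literal "I\\nlove\\tCoding": each backslash is an ordinary character, not the start of an escape sequence. If you want a string literal that equals the string read from the file, use the r prefix, for example word = r"I\nlove\tCoding".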
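The difference can be checked without any file at all. The sketch below assumes the file contains a literal backslash followed by a letter, which is what a raw string produces; the variable names are made up for illustration.

```python
import re

# Escape sequences are interpreted: this holds a real newline and a real tab.
from_literal = "I\nlove\tCoding"

# Raw string: every backslash is kept as typed, like text read from a file.
from_raw = r"I\nlove\tCoding"

print(len(from_literal))  # 13 characters
print(len(from_raw))      # 15 characters

# The regex \t matches a real tab character, which the raw text does not contain.
has_real_tab = re.search(r"\t", from_raw) is not None

# The regex \\t matches a backslash followed by the letter t.
has_escaped_tab = re.search(r"\\t", from_raw) is not None

print(has_real_tab, has_escaped_tab)
```

This is why re.search() for \t returns None on such file contents: the text holds the two characters backslash and t, not a tab.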

Why string getting from file is not equal to common string? [duplicate]

I am on Python 3.5 and want to find matching words in a file. The word I am looking for is awesome, and the very first word in the .txt file is also awesome. Why, then, is addedWord not equal to word? Can someone give me the reason?
myWords.txt
awesome
shiny
awesome
clumsy
Code for matching
addedWord = "awesome"
with open("myWords.txt", 'r') as openfile:
    for word in openfile:
        if addedWord is word:
            print ("Match")
I also tried:
d = word.replace("\n", "").rstrip()
a = addedWord.replace("\n", "").rstrip()
if a is d:
    print ("Matched :" +word)
I also checked the types with type(addedWord) and type(word). Both are of class 'str', but the values are still not equal. Is anything wrong here?
There are two problems with your code.
1) Strings returned from iterating files include the trailing newline. As you suspected, you'll need to .strip(), .rstrip() or .replace() the newline away.
2) String comparison should be performed with ==, not is.
So, try this:
if addedWord == word.strip():
    print ("Match")
Those two strings will never be the same object, so you should not use is to compare them. Use ==.
Your intuition to strip off the newlines was spot-on, but you just need a single call to strip() (it will strip all whitespace including tabs and newlines).
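Both fixes together can be sketched as follows. io.StringIO stands in for myWords.txt with the contents shown in the question, and the names added_word and matches are introduced only for this example.

```python
import io

added_word = "awesome"

# Stand-in for open("myWords.txt", 'r'), using the contents from the question.
openfile = io.StringIO("awesome\nshiny\nawesome\nclumsy\n")

matches = []
for line_number, word in enumerate(openfile, start=1):
    # word still ends with "\n" here, so strip it before comparing.
    # == compares the characters; "is" would compare object identity.
    if added_word == word.strip():
        matches.append(line_number)

print(matches)
```

Collecting the line numbers rather than printing "Match" makes it easy to see that both occurrences of the word are found.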

Python code for searching .txt file. Print lines containing words in parentheses

I think my regex function is wrong. I get an error. I want the program to read the .txt file line by line and print only the lines that don't contain words in parentheses.
Here is the code I used.
import os
infile=open("/Users/Julio/Desktop/database.txt", "r")
for line in infile:
    if not re.search('(')
        print(line, end='')
infile.close()
For something as simple as matching a single character, why not avoid regex and do something like:
with open("database.txt", "rt") as f:
    for line in f:
        if "(" not in line:
            print(line, end="")
Here is a solution that does not use regex and looks for single words in parentheses. Try to avoid regex unless you have to.
for line in file_text.split("\n"):
    # True if any whitespace-separated word is wrapped in parentheses
    has_paren_word = any(
        word.startswith("(") and word.endswith(")") for word in line.split()
    )
    if not has_paren_word:
        print(line)
Some people, when confronted with a problem, think
“I know, I'll use regular expressions.” Now they have two problems.
Note: I hope you were asking how to look for single words in parentheses, because your question was unclear at the time of answering.
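For completeness, the regex version can be made to work too. The posted code imports os instead of re, passes only one argument to re.search, omits the colon after the if, and uses "(" unescaped, which is a metacharacter and raises re.error. A hedged sketch of a corrected version follows; the function name lines_without_parens and the sample lines are made up for illustration.

```python
import re

def lines_without_parens(lines):
    """Return the lines that contain no opening parenthesis."""
    # "(" opens a group in a regex, so a literal one is written as r"\(".
    pattern = re.compile(r"\(")
    return [line for line in lines if not pattern.search(line)]

# Hypothetical sample standing in for the lines of database.txt.
sample = ["alpha\n", "beta (note)\n", "gamma\n"]

for line in lines_without_parens(sample):
    # The lines already end with "\n", so suppress print's own newline.
    print(line, end="")
```

Because a file object is itself an iterable of lines, an open file could be passed to the function in place of the sample list.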

file.readlines leaving blank lines [duplicate]

I have read that file.readlines reads the whole file line by line and stores it in a list.
If I have a file like so -
Sentence 1
Sentence 2
Sentence 3
and I use readlines to print each sentence like so -
file = open("test.txt")
for i in file.readlines():
    print i
The output is
Sentence 1

Sentence 2

Sentence 3
My question is why do I get the extra line between each sentence and how can I get rid of it?
UPDATE
I found that using i.strip also removes the extra lines. Why does this happen? As far as I know, strip removes the whitespace at the beginning and end of a string.
file.readlines() returns a list of strings, and each string keeps its trailing newline. The print statement then adds a newline of its own after whatever it prints. That's why you get the extra lines.
To remove extra newline, use str.rstrip:
print i.rstrip('\n')
or use sys.stdout.write (after import sys), which adds no newline of its own:
sys.stdout.write(i)
BTW, don't use file.readlines unless you need all lines at once. Just iterate the file.
with open("test.txt") as f:
    for i in f:
        print i.rstrip('\n')
...
UPDATE
In Python 3, to prevent print from writing a trailing newline, you can use print(i, end='').
In Python 2, you can use the same feature if you add: from __future__ import print_function
Answer to UPDATE
Tabs and newlines are also considered whitespace.
>>> ' \r\n\t\v'.isspace()
True
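The practical difference between strip() with no argument and rstrip('\n') can be seen in a short Python 3 sketch; the sample line is made up for illustration.

```python
# Hypothetical line with a leading tab and a Windows-style line ending.
line = "\t Sentence 1 \r\n"

# strip() with no argument removes every kind of whitespace from both ends.
fully_stripped = line.strip()

# rstrip("\n") removes only trailing "\n" characters and nothing else.
newline_stripped = line.rstrip("\n")

print(repr(fully_stripped))
print(repr(newline_stripped))
```

Passing an explicit argument is the safer choice when leading indentation or other trailing whitespace is meaningful and only the line ending should go.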
file.readlines() (and also file.readline()) keeps the trailing newlines. Use
print i.replace('\n', '')
if you don't want them.
It may seem weird to include the newline at the end of the line, but this allows you, for example, to tell whether the last line has a newline character or not. That case is tricky in many languages' I/O.
The below will strip the newline for you.
i = i.rstrip("\n") #strips newline
Hope this helps.
This worked for me with Python 3.
I found the rstrip() function really useful in these situations:
for i in readtext:
    print(i.rstrip())
with open(txtname, 'r') as txtfile:
    lines = txtfile.readlines()
lines = [j for j in lines if j != '\n']
with open(outname, 'w') as outfile:
    outfile.writelines(lines)
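The two ideas from the answers above, stripping the line ending and dropping empty lines, can be combined in Python 3. This is a sketch only: io.StringIO stands in for the real file, the sample contents are an assumption, and the names cleaned and non_blank are made up.

```python
import io

# Stand-in for open("test.txt"); the blank second line is added for illustration.
f = io.StringIO("Sentence 1\n\nSentence 2\nSentence 3")

# Iterate over the file directly and remove only the trailing newline of each line.
cleaned = [line.rstrip("\n") for line in f]

# A line that consisted solely of "\n" is now an empty string and can be filtered out.
non_blank = [line for line in cleaned if line]

for line in non_blank:
    print(line)
```

Since the stored strings no longer end in a newline, a plain print() call produces exactly one line break per sentence.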

Iterating over full text lines instead of characters [duplicate]

I noticed when I try to iterate over a file with lines such as
"python"
"please"
"work"
I only get individual characters back, such as,
"p"
"y"
"t"...
How can I get it to give me the full word? I've been trying for a couple of hours and can't find a method. I'm using the newest version of Python.
Edit: All the quotation marks are new lines.
You can iterate over a file object:
for line in open('file'):
    for word in line.split():
        do_stuff(word)
See the docs for the details:
http://docs.python.org/2/library/stdtypes.html#bltin-file-objects
If you are storing the words in a single string, you can split them on spaces with the split function.
>>> "python please work".split(' ')
['python', 'please', 'work']
If you have your data in a single string which spans several lines (e.g. it contains '\n' characters), you will need to split it before iterating. This is because iterating over a string (rather than a list of strings) will always iterate over characters, rather than words or lines.
Here's some example code:
text = "Spam, spam, spam.\nLovely spam!\nWonderful spam!"
lines = text.splitlines() # or use .split("\n") to do it manually
for line in lines:
    do_whatever(line)
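A likely cause of the behaviour in the question is that the file was read into one string with read() and that string was then iterated. The contrast can be sketched like this, with the string contents taken from the question and the variable names introduced for illustration.

```python
# The whole file as one string, as read() would return it.
text = "python\nplease\nwork"

# Iterating over the string itself yields one character at a time.
first_chars = [ch for ch in text][:3]
print(first_chars)

# splitlines() turns the string into a list of lines first.
lines = text.splitlines()
for line in lines:
    print(line)
```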
