I want to read a file while preserving the spaces in each line.
My current code:
def data():
    f = open("save.aln")
    for line in f.readlines():
        print "</br>"
        print line
I am using Python, and the output is embedded in HTML.
File to be read - http://pastebin.com/EaeKsyvg
Thanks
It seems that your problem is that you need to preserve spaces in HTML. The simple solution would be to put your output inside a <pre> element:
def data():
    print "<pre>"
    f = open("save.aln")
    for line in f.readlines():
        print line
    print "</pre>"
Note that in this case you don't need the <br> elements either, since the newline characters are also preserved.
The problem you are facing is that HTML collapses runs of whitespace. @itsadok's solution is great, and I upvoted it. But it's not the only way to do this.
If you want to explicitly turn those spaces into HTML non-breaking space entities, you could do this:
def data():
    f = open("save.aln")
    for line in f.readlines():
        print "<br />"
        print line.replace(" ", "&nbsp;")
Cheers
import cgi
with open('save.aln') as f:
    for line in f:
        print cgi.escape(line) # escape <, >, &
        print '<br/>'
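The two ideas above, escaping markup characters and wrapping the output in <pre>, can be combined. What follows is only a minimal Python 3 sketch: it assumes html.escape as a stand-in for cgi.escape (which was removed in Python 3.8), and the function name render_pre and the sample lines are made up for illustration.

```python
import html

def render_pre(lines):
    """Return an HTML fragment that preserves spaces and line breaks."""
    # Escape <, > and & so the text cannot be interpreted as markup.
    body = "".join(html.escape(line) for line in lines)
    # Inside <pre>, browsers keep runs of spaces and newlines as written.
    return "<pre>" + body + "</pre>"

if __name__ == "__main__":
    # Hypothetical alignment-like lines; a real script would read save.aln.
    sample = ["seq1    ACGT--A\n", "seq2    A<GT>&A\n"]
    print(render_pre(sample))
```

With this approach no <br> tags or &nbsp; entities are needed, since <pre> already preserves both newlines and spaces.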
Related
I'm using Python 3 to loop through lines of a .txt file that contains strings. These strings will be used in a curl command. However, it is only working correctly for the last line of the file. I believe the other lines end with newlines, which throws the string off:
url = "https://"
with open(file) as f:
    for line in f:
        str = (url + line)
        print(str)
This will return:
https://
endpoint1
https://
endpoint2
https://endpoint3
How can I get all the strings to concatenate like the last line?
I've looked at a couple of answers like How to read a file without newlines?, but this answer converts all content in the file to one line.
Use str.strip
Example:
url = "https://"
with open(file) as f:
    for line in f:
        s = (url + line.strip())
        print(s)
If the strings end with newlines, you can call .strip() to remove them. For example:
url = "https://"
with open(file) as f:
    for line in f:
        str = (url + line.strip())
        print(str)
I think str.strip() will solve your problem
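Note that .strip() removes all leading and trailing whitespace, not only the newline. If only the line terminator should be dropped, .rstrip("\n") is a narrower alternative. A small sketch, where io.StringIO stands in for the opened file and the endpoint names are made up:

```python
import io

url = "https://"
# io.StringIO stands in for the opened file; the endpoint names are made up.
fake_file = io.StringIO("endpoint1\nendpoint2\nendpoint3")

# rstrip("\n") removes only trailing newlines and leaves other whitespace alone.
urls = [url + line.rstrip("\n") for line in fake_file]
for u in urls:
    print(u)
```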
Every time I run this part of the code everything goes smoothly, but when it writes the variable to the file it shows up with quotation marks. Is there a way to remove them and write it as plain text?
try:
    with open(tokens) as f:
        lines = f.readlines()
        answer = random.choice(lines)
        print(answer)
except:
    file_name = tokens
    opened_file = open(tokens, 'a')
    opened_file.write("%r\n" %user_input)
    opened_file.close()
What gets written to the file looks like this:
'Whats up'
and I want it to look like this:
Whats up
In the line that writes to the file you are using %r as the format specifier in the string. The Python docs say:
Replacing %s and %r:
>>> "repr() shows quotes: {!r}; str() doesn't: {!s}".format('test1', 'test2')
"repr() shows quotes: 'test1'; str() doesn't: test2"
So replace this line
opened_file.write("%r\n" %user_input)
with
opened_file.write("%s\n" %user_input)
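The difference can be checked without touching a file. A minimal sketch, with a made-up value for user_input:

```python
user_input = "Whats up"  # made-up sample value

with_repr = "%r\n" % user_input  # %r formats with repr(), which keeps the quotes
with_str = "%s\n" % user_input   # %s formats with str(), which gives plain text

print(with_repr, end="")
print(with_str, end="")
```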
So far in Python I have created a file using this code:
text_file = open("Sentences_Positions.txt", "w")
text_file.write (str(positions))
text_file.write (str(ssplit))
text_file.close()
The code creates the file and writes to it the individual words, which I previously split. I need to find a way to open the file, join the split words, and print the result. I have tried:
text_file = open("Sentences_Positions.txt", "r")
rejoin = ("Sentences_positions.txt").join('')
print (rejoin)
But all this does is print a blank line in the shell. How should I approach this, and what other code could I try?
Read the file content, split it, and join the pieces with '':
content = text_file.read().split(' ')
print(''.join(content))
Replace:
rejoin = ("Sentences_positions.txt").join('')
with:
rejoin = ''.join(text_file.read().split(' '))
Also, you should probably not call open on its own but use it as a context manager:
with open("Sentences_Positions.txt") as text_file:
    rejoin = ''.join(text_file.read().split(' '))
print (rejoin)
Otherwise the file remains open. Using the context manager, it will close it when it's done. (True for the first part of your code as well).
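Note that joining with '' drops the spaces between the words. If the goal is to rebuild the original sentence, one option is to write the words separated by spaces and join them with ' ' after reading. This is only a sketch under assumptions: ssplit is taken to be a plain list of words, the sample words are made up, and a temporary directory is used so that no real file is overwritten.

```python
import os
import tempfile

ssplit = ["the", "quick", "brown", "fox"]  # hypothetical word list

# Temporary location, chosen here only to keep the example self-contained.
path = os.path.join(tempfile.mkdtemp(), "Sentences_Positions.txt")

# Write the words separated by single spaces.
with open(path, "w") as text_file:
    text_file.write(" ".join(ssplit))

# Read them back, split on whitespace, and rebuild the sentence.
with open(path) as text_file:
    words = text_file.read().split()

rejoin = " ".join(words)
print(rejoin)
```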
I have a quick and dirty build script that needs to update a couple of lines in a small xml config file. Since the file is so small, I'm using an admittedly inefficient process to update the file in place just to keep things simple:
def hide_osx_dock_icon(app):
    for line in fileinput.input(os.path.join(app, 'Contents', 'Info.plist'), inplace=True):
        line = re.sub(r'(<key>CFBundleDevelopmentRegion</key>)', '<key>LSUIElement</key><string>1</string>\g<1>', line.strip(), flags=re.IGNORECASE)
    print line.strip()
The idea is to find the <key>CFBundleDevelopmentRegion</key> text and insert the LSUIElement content right in front of it. I'm doing something just like this in another area and it's working fine so I guess I'm just missing something, but I don't see it.
What am I doing wrong?
You are printing only the last line, because your print statement falls outside of the for loop:
for line in fileinput.input(os.path.join(app, 'Contents', 'Info.plist'), inplace=True):
    line = re.sub(r'(<key>CFBundleDevelopmentRegion</key>)', '<key>LSUIElement</key><string>1</string>\g<1>', line.strip(), flags=re.IGNORECASE)
print line.strip()
Indent that line to match the previous:
for line in fileinput.input(os.path.join(app, 'Contents', 'Info.plist'), inplace=True):
    line = re.sub(r'(<key>CFBundleDevelopmentRegion</key>)', '<key>LSUIElement</key><string>1</string>\g<1>', line.strip(), flags=re.IGNORECASE)
    print line.strip()
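For readers on Python 3, the same fix might look roughly like the sketch below. It is not the original build script: the function takes the plist path directly (a hypothetical change of signature), and the plist content used in the usage part is made up.

```python
import fileinput
import os
import re
import tempfile

def hide_osx_dock_icon(plist_path):
    """Insert the LSUIElement key before CFBundleDevelopmentRegion, in place."""
    pattern = re.compile(r'(<key>CFBundleDevelopmentRegion</key>)', re.IGNORECASE)
    replacement = r'<key>LSUIElement</key><string>1</string>\g<1>'
    with fileinput.input(plist_path, inplace=True) as lines:
        for line in lines:
            # With inplace=True, print() writes into the file being edited,
            # so every line has to be printed inside the loop.
            print(pattern.sub(replacement, line.strip()))

# Usage on a throwaway file with made-up content.
path = os.path.join(tempfile.mkdtemp(), "Info.plist")
with open(path, "w") as f:
    f.write("<dict>\n<key>CFBundleDevelopmentRegion</key>\n<string>en</string>\n</dict>\n")
hide_osx_dock_icon(path)
with open(path) as f:
    result = f.read()
print(result)
```

Any line that is not printed inside the loop is lost from the rewritten file, which is why the misplaced print left only the last line.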
I have a file in UTF-8, where some lines contain the U+2028 Line Separator character (http://www.fileformat.info/info/unicode/char/2028/index.htm). I don't want it to be treated as a line break when I read lines from the file. Is there a way to exclude it from separators when I iterate over the file or use readlines()? (Besides reading the entire file into a string and then splitting by \n.) Thank you!
I can't duplicate this behaviour in python 2.5, 2.6 or 3.0 on mac os x - U+2028 is always treated as non-endline. Could you go into more detail about where you see this error?
That said, here is a subclass of the "file" class that might do what you want:
#!/usr/bin/python
# -*- coding: utf-8 -*-
class MyFile (file):
    def __init__(self, *arg, **kwarg):
        file.__init__(self, *arg, **kwarg)
        self.EOF = False
    def next(self, catchEOF = False):
        if self.EOF:
            raise StopIteration("End of file")
        try:
            nextLine= file.next(self)
        except StopIteration:
            self.EOF = True
            if not catchEOF:
                raise
            return ""
        if nextLine.decode("utf8")[-1] == u'\u2028':
            return nextLine+self.next(catchEOF = True)
        else:
            return nextLine
A = MyFile("someUnicode.txt")
for line in A:
    print line.strip("\n").decode("utf8")
I couldn't reproduce that behavior but here's a naive solution that just merges readline results until they don't end with U+2028.
#!/usr/bin/env python
from __future__ import with_statement
def my_readlines(f):
    buf = u""
    for line in f.readlines():
        uline = line.decode('utf8')
        buf += uline
        if uline[-1] != u'\u2028':
            yield buf
            buf = u""
    if buf:
        yield buf
with open("in.txt", "rb") as fin:
    for l in my_readlines(fin):
        print l
Thanks to everyone for answering.
I think I know why you might not have been able to replicate this. I just realized that it happens if I decode the file when opening, as in:
f = codecs.open(filename, encoding='utf-8')
for line in f:
    print line
The lines are not split on U+2028 if I open the file first and then decode individual lines:
f = open(filename)
for line in f:
    print line.decode("utf8")
(I'm using Python 2.6 on Windows. The file was originally UTF16LE and then it was converted into UTF8).
This is very interesting, I guess I won't be using codecs.open much from now on :-).
If you use Python 3.0 (note that I don't, so I can't test), according to the documentation you can pass an optional newline parameter to open to specify which line separator to use. However, the documentation doesn't mention U+2028 at all (it only mentions \r, \n, and \r\n as line separators), so it's actually a surprise to me that this even occurs (although I can confirm this even with Python 2.6).
The codecs module is doing the RIGHT thing. U+2028 is named "LINE SEPARATOR" with the comment "may be used to represent this semantic unambiguously". So treating it as a line separator is sensible.
Presumably the creator would not have put the U+2028 characters there without good reason ... does the file have u"\n" as well? Why do you want lines not to be split on U+2028?
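The difference discussed in this thread can be seen in Python 3 without any file. The sketch below uses a made-up sample string and compares str.splitlines(), which treats U+2028 as a line boundary, with splitting on "\n" and with iterating over a text stream. That codecs.open shows the same splitting because its reader relies on splitlines() internally is an assumption here, not something the thread confirms.

```python
import io

# Made-up sample: one U+2028 inside the first line, "\n" at the line ends.
text = "first\u2028still first\nsecond\n"

# str.splitlines() treats U+2028 as a line boundary.
by_splitlines = text.splitlines()

# Splitting on "\n" only leaves U+2028 inside the line.
by_newline = text.split("\n")

# Python 3 text streams break lines only on \n, \r and \r\n.
by_iteration = list(io.StringIO(text))

print(by_splitlines)
print(by_newline)
print(by_iteration)
```

So in Python 3, iterating over a file opened with open() keeps U+2028 inside the line, while calling splitlines() on the decoded text does not.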