How can I use the readline method? - python

I have this trivial code:
from sys import argv
script, input_file = argv
def fma(f):
    f.readline()
current_file = open(input_file)
fma(current_file)
The contents of the txt file are:
Hello this is a test.\n
I like cheese and macaroni.\n
I love to drink juice.\n
\n
\n
I put the \n chars so you know I hit enter in my text editor.
What I want to accomplish is to get back every single line and every \n character.
The problem is that when running the script I get nothing back. What am I doing wrong, and how can I fix it so that it works as I stated above?

Your function reads a line, but does nothing with it:
def fma(f):
    f.readline()
You'd need to return the string that f.readline() gives you. Keep in mind, in the interactive prompt, the last value produced is printed automatically, but that isn't how Python code in a .py file works.
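For example (a sketch, assuming the file from the question is saved as test.txt, a hypothetical name):
# In the interactive prompt the return value is echoed automatically:
# >>> f = open('test.txt')
# >>> f.readline()
# 'Hello this is a test.\n'
# In a script, nothing is shown unless you print or return the value:
f = open('test.txt')
print repr(f.readline())   # prints 'Hello this is a test.\n'
f.close()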

Pretty certain what you actually want is f.readlines, not f.readline.
# module start
from __future__ import with_statement # for python 2.5 and earlier

def readfile(path):
    with open(path) as src:
        return src.readlines()

if __name__ == '__main__':
    import sys
    print readfile(sys.argv[1])
# module end
Note that I am using the with context manager to open your file more safely (it takes care of closing the file for you). In Python 2.6 and later you don't need the __future__ import at the top to use it, but I have a habit of including it for anyone still using older Python.
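As a quick check (a sketch, assuming the module above is saved and the file from the question is test.txt, a hypothetical name), printing the returned list shows that every \n is preserved:
lines = readfile('test.txt')
print lines
# something like:
# ['Hello this is a test.\n', 'I like cheese and macaroni.\n',
#  'I love to drink juice.\n', '\n', '\n']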

def fma(f):
    f.readline()
f.readline() is a method call; it returns a value, in this case one line from the file. You need to "do" something with that value, for example:
def fma(f):
    print f.readline()

Your function is not returning anything.
def fma(f):
    data = f.readline()
    return data
f.readline() reads a single line from the file; a newline character (\n) is left at the end of the string, and is only omitted on the last line of the file if the file doesn’t end in a newline.
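A quick illustration of that trailing newline (a sketch, using a hypothetical two-line file sample.txt whose last line has no final newline):
f = open('sample.txt')
print repr(f.readline())   # e.g. 'first line\n'  (newline kept)
print repr(f.readline())   # e.g. 'second line'   (no \n on the last line)
print repr(f.readline())   # ''  (empty string once the end of file is reached)
f.close()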

The script should be as follows:
from sys import argv
script, input_file = argv
def fma(f):
    line = f.readline()
    return line
current_file = open(input_file)
print fma(current_file)
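Since the goal is to get back every single line, a hedged extension of the same idea is to keep calling fma() until readline() returns an empty string at end of file:
current_file = open(input_file)
line = fma(current_file)
while line:                  # readline() returns '' once the end of file is reached
    print repr(line)         # repr makes the \n characters visible
    line = fma(current_file)
current_file.close()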

Related

replace new line in a different file with an underscore (without using with)

I posted a question yesterday in a similar vein but didn't quite get the response I wanted because I wasn't specific enough. Basically, the function takes a .txt file as the argument and returns a string with all \n characters replaced with an '_', so everything ends up on the same line. I want to do this without using with. I thought I did this correctly, but when I run it and check the file, nothing has changed. Any pointers?
This is what I did:
def one_line(filename):
    wordfile = open(filename)
    text_str = wordfile.read().replace("\n", "_")
    wordfile.close()
    return text_str
one_line("words.txt")
but to no avail. I open the text file and it remains the same.
The contents of the textfile are:
I like to eat
pancakes every day
and the output that's supposed to be shown is:
>>> one_line("words.txt")
'I like to eat_pancakes every day_'
The fileinput module in the Python standard library allows you to do this in one fell swoop.
import fileinput
for line in fileinput.input(filename, inplace=True):
    line = line.replace('\n', '_')
    print(line, end='')
The requirement to avoid a with statement is trivial but rather pointless. Anything which looks like
with open(filename) as handle:
    stuff
can simply be rewritten as
handle = open(filename)
try:
    stuff
finally:
    handle.close()
If you take out the try/finally you have a bug which leaves handle open if an error happens. The purpose of the with context manager for open() is to simplify this common use case.
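Applied to the function from the question, a try/finally version (a sketch, no with) could look like this:
def one_line(filename):
    wordfile = open(filename)
    try:
        return wordfile.read().replace("\n", "_")
    finally:
        wordfile.close()   # runs even if read() raises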
You are missing some steps. After you obtain the updated string, you need to write it back to the file; example below without using with:
def one_line(filename):
    wordfile = open(filename)
    text_str = wordfile.read().replace("\n", "_")
    wordfile.close()
    return text_str

def write_line(s):
    # Open the file in write mode
    wordfile = open("words.txt", 'w')
    # Write the updated string to the file
    wordfile.write(s)
    # Close the file
    wordfile.close()

s = one_line("words.txt")
write_line(s)
Or using with
with open("file.txt",'w') as wordfile:
#Write the updated string to the file
wordfile.write(s)
With pathlib you could achieve what you want this way:
from pathlib import Path
path = Path(filename)
contents = path.read_text()
contents = contents.replace("\n", "_")
path.write_text(contents)

Python failing to read lines properly

I'm supposed to open a file, read it line by line and display the lines.
Here's the code:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import os
import re
in_path = "../vas_output/Glyph/20140623-FLYOUT_mins_cleaned.csv"
out_path = "../vas_gender/Glyph/"
csv_read_line = open(in_path, "rb").read().split("\n")
line_number = 0
for line in csv_read_line:
    line_number += 1
    print str(line_number) + line
Here's the contents of the input file:
12345^67890^abcedefg
random^test^subject
this^sucks^crap
And here's the result:
this^sucks^crapjectfg
Some weird combo of all three. In addition, the line_number output is missing. Printing len(csv_read_line) outputs 1, for some reason, no matter how many lines are in the input file. Changing the split character from \n to ^ gives the expected output, though, so I'm assuming the problem is probably with the input file.
I'm using a Mac, and did both the python code and the input file (on Sublime Text) on the Mac itself.
Am I missing something?
You seem to be splitting on "\n" which isn't necessary, and could be incorrect depending on the line terminators used in the input file. Python includes functionality to iterate over the lines of a file one at a time. The advantages are that it will worry about processing line terminators in a portable way, as well as not requiring the entire file to be held in memory at once.
Further, note that you are opening the file in binary mode (the b character in your mode string) when you actually intend to read the file as text. This can cause problems similar to the one you are experiencing.
Also, you do not close the file when you are done with it. In this case that isn't a problem, but you should get in the habit of using with blocks when possible to make sure the file gets closed at the earliest possible time.
Try this:
with open(in_path, "r") as f:
line_number = 0
for line in f:
line_number += 1
print str(line_number) + line.rstrip('\r\n')
So your example just works for me. But then, I just copied your text into a text editor on Linux and did it that way, so any carriage returns will have been wiped out.
Try this code though:
import os
in_path = "input.txt"
with open(in_path, "rb") as inputFile:
for lineNumber, line in enumerate(inputFile):
print lineNumber, line.strip()
It's a little cleaner, and the for line in file style deals with line breaks for you in a system independent way - Python's open has universal newline support.
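If the stray carriage returns are the concern (for example a file saved with old Mac or Windows line endings), a hedged alternative is to ask Python 2's open for universal-newline mode explicitly instead of binary mode:
# 'rU' is Python 2's universal-newline mode: '\r', '\r\n' and '\n'
# are all translated to '\n' while reading
with open(in_path, 'rU') as inputFile:
    for lineNumber, line in enumerate(inputFile, start=1):
        print lineNumber, line.rstrip('\n')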
I'd try the following Pythonic code:
#!/usr/bin/env python
in_path = "../vas_output/Glyph/20140623-FLYOUT_mins_cleaned.csv"
out_path = "../vas_gender/Glyph/"
with open(in_path, 'rb') as f:
    for i, line in enumerate(f):
        print(str(i) + line)
There are several improvements that can be made here to make it more idiomatic python.
import csv
in_path = "../vas_output/Glyph/20140623-FLYOUT_mins_cleaned.csv"
out_path = "../vas_gender/Glyph/"
# Let's open the file and make sure that it closes when we unindent
with open(in_path, "rb") as input_file:
    # Create a csv reader object that will parse the input for us
    reader = csv.reader(input_file, delimiter="^")
    # Enumerate over the rows (these will be lists of strings) and keep track
    # of the line number using python's built in enumerate function
    for line_num, row in enumerate(reader):
        # You can process whatever you would like here. But for now we will just
        # print out what you were originally printing
        print str(line_num) + "^".join(row)

remove first char from each line in a text file

I'm new to Python, and to programming in general.
I want to remove the first char from each line in a text file and write the changes back to the file. For example, I have a file with 36 lines, and the first char in each line contains a symbol or a number, and I want it removed.
I made a little bit of code here, but it doesn't work as expected; it only duplicates whole lines. Any help would be appreciated in advance!
from sys import argv
run, filename = argv
f = open(filename, 'a+')
f.seek(0)
lines = f.readlines()
for line in lines:
    f.write(line[1:])
f.close()
Your code already does remove the first character. I saved exactly your code as both dupy.py and dupy.txt, then ran python dupy.py dupy.txt, and the result is:
from sys import argv
run, filename = argv
f = open(filename, 'a+')
f.seek(0)
lines = f.readlines()
for line in lines:
    f.write(line[1:])
f.close()
rom sys import argv
un, filename = argv
= open(filename, 'a+')
.seek(0)
ines = f.readlines()
or line in lines:
f.write(line[1:])
.close()
It's not copying entire lines; it's copying lines with their first character stripped.
But from the initial statement of your problem, it sounds like you want to overwrite the lines, not append new copies. To do that, don't use append mode. Read the file, then write it:
from sys import argv
run, filename = argv
f = open(filename)
lines = f.readlines()
f.close()
f = open(filename, 'w')
for line in lines:
    f.write(line[1:])
f.close()
Or, alternatively, write a new file, then move it on top of the original when you're done:
import os
from sys import argv
run, filename = argv
fin = open(filename)
fout = open(filename + '.tmp', 'w')
lines = fin.readlines()
for line in lines:
    fout.write(line[1:])
fout.close()
fin.close()
os.rename(filename + '.tmp', filename)
(Note that this version will not work as-is on Windows, but it's simpler than the actual cross-platform version; if you need Windows, I can explain how to do this.)
You can make the code a lot simpler, more robust, and more efficient by using with statements, looping directly over the file instead of calling readlines, and using tempfile:
import os
import tempfile
from sys import argv
run, filename = argv
with open(filename) as fin, tempfile.NamedTemporaryFile(delete=False) as fout:
    for line in fin:
        fout.write(line[1:])
os.rename(fout.name, filename)
On most platforms, this guarantees an "atomic write"—when your script finishes, or even if someone pulls the plug in the middle of it running, the file will end up either replaced by the new version, or untouched; there's no way it can end up half-way overwritten into unrecoverable garbage.
Again this version won't work on Windows. Without a whole lot of work, there is no way to implement this "write-temp-and-rename" algorithm on Windows. But you can come close with only a bit of extra work:
with open(filename) as fin, tempfile.NamedTemporaryFile(delete=False) as fout:
    for line in fin:
        fout.write(line[1:])
outname = fout.name
os.remove(filename)
os.rename(outname, filename)
This does prevent you from half-overwriting the file, but it leaves a hole where you may have deleted the original file, and left the new file in a temporary location that you'll have to search for. You can make this a little nicer by putting the file somewhere easier to find (see the NamedTemporaryFile docs to see how). Or renaming the original file to a temporary name, then writing to the original filename, then deleting the original file. Or various other possibilities. But to actually get the same behavior as on other platforms is very difficult.
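One of those variants sketched out (rename the original aside first, then write the new file under the original name; the '.bak' suffix is just illustrative):
import os
from sys import argv

run, filename = argv
backup = filename + '.bak'            # hypothetical temporary name
os.rename(filename, backup)           # move the original aside first
with open(backup) as fin, open(filename, 'w') as fout:
    for line in fin:
        fout.write(line[1:])
os.remove(backup)                     # drop the old copy once the new file is written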
You can either read all the lines into memory and then recreate the file,
from sys import argv
run, filename = argv
with open(filename, 'r') as f:
    data = [i[1:].rstrip('\n') for i in f]
with open(filename, 'w') as f:
    f.writelines(i + '\n' for i in data) # this is for linux. for win use \r\n
or you can create another file and move the data from the first file to the second line by line, then rename it if you'd like:
import os
from sys import argv
run, filename = argv
new_name = filename + '.tmp'
with open(filename, 'r') as f_in, open(new_name, 'w') as f_out:
    for line in f_in:
        f_out.write(line[1:])
os.rename(new_name, filename)
At its most basic, your problem is that you need to seek back to the beginning of the file after you read its complete contents into the list lines. Since you are making the file shorter, you also need to use truncate to adjust the official length of the file after you're done. Furthermore, open mode a+ (a is for append) overrides seek and forces all writes to go to the end of the file. So your code should look something like this:
import sys
def main(argv):
    filename = argv[1]
    with open(filename, 'r+') as f:
        lines = f.readlines()
        f.seek(0)
        for line in lines:
            f.write(line[1:])
        f.truncate()
if __name__ == '__main__': main(sys.argv)
It is better, when doing something like this, to write the changes to a new file and then rename it over the old file when you're done. This causes the update to happen "atomically" - a concurrent reader sees either the old file or the new one, not some mangled combination of the two. That looks like this:
import os
import sys
import tempfile
def main(argv):
    filename = argv[1]
    with open(filename, 'r') as inf:
        with tempfile.NamedTemporaryFile(dir=".", delete=False) as outf:
            tname = outf.name
            for line in inf:
                outf.write(line[1:])
    os.rename(tname, filename)
if __name__ == '__main__': main(sys.argv)
(Note: Atomically replacing a file via rename does not work on Windows; you have to os.remove the old name first. This unfortunately does mean there is a brief window (no pun intended) where a concurrent reader will find that the file does not exist. As far as I know there is no way to avoid this.)
import re
with open(filename, 'r+') as f:
    modified = re.sub('^.', '', f.read(), flags=re.MULTILINE)
    f.seek(0, 0)
    f.write(modified)
    f.truncate()
In the regex pattern:
^ means 'start of string'
^ with flag re.MULTILINE means 'start of line'
^. means 'exactly one character at the start of a line'
The start of a line is the start of the string or any position after a newline (a newline is \n).
So we might fear that some newlines in sequences like \n\n\n\n\n\n\n could match the regex pattern.
But the dot matches any character EXCEPT a newline, so none of the newlines match this regex pattern.
During the reading of the file triggered by f.read(), the file's pointer goes until the end of the file.
f.seek(0,0) moves the file's pointer back to the beginning of the file
f.truncate() puts a new EOF = end of file at the point where the writing has stopped. It's necessary since the modified text is shorter than the original one.
Compare this with what happens without that line; a quick demonstration follows.
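A tiny demonstration of the difference (a sketch with made-up content in a hypothetical demo.txt):
import re

with open('demo.txt', 'w') as f:    # hypothetical sample file
    f.write('abc\ndef\n')           # 8 characters on disk

with open('demo.txt', 'r+') as f:
    modified = re.sub('^.', '', f.read(), flags=re.MULTILINE)
    f.seek(0, 0)
    f.write(modified)               # writes only the 6 characters 'bc\nef\n'
    # without f.truncate() here, the last 2 old characters survive and the
    # file ends up as 'bc\nef\nf\n' instead of 'bc\nef\n'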
To be honest, I'm really not sure how good or bad an idea nesting with open() is, but you can do something like this.
with open(filename_you_reading_lines_FROM, 'r') as f0:
    with open(filename_you_appending_modified_lines_TO, 'a') as f1:
        for line in f0:
            f1.write(line[1:])
While there was some discussion of best practice and whether it would run on Windows or not, being new to Python I took the first example that worked, got it to run in my Windows environment (which has cygwin binaries on its Path), and removed the first 3 characters (which were line numbers) from a sample file:
import os
from sys import argv
run, filename = argv
fin = open(filename)
fout = open(filename + '.tmp', 'w')
lines = fin.readlines()
for line in lines:
    fout.write(line[3:])
fout.close()
fin.close()
I chose not to automatically overwrite since I wanted to be able to eyeball the output.
python c:\bin\remove1st3.py sampleCode.txt

Printing the output of a Line search

I'm new to programming pretty much in general, and I am having difficulty getting this command to print its output to the .txt document. My goal in the end is to swap the term "Sequence" out for a variable so I can integrate it into a custom easygui with multiple inputs and returns, but that's a story for later down the road. For the sake of testing and completing the current project I will just be manually altering the term.
I've been successful in getting another program to send its output to a .txt, but this one is being difficult. I don't know if I have been overlooking something simple, but I have been stuck on this for more time than I would like.
When it searches for the lines, it prints the fields in the file that I want; however, when it goes to write, it only picks up the last line of the file and puts that in the .txt as the output. I know what the issue is, but I haven't been able to wrap my head around how to fix it, mainly due to my lack of knowledge of the language, I think.
I am using Sublime Text 2 on Windows
def main():
    import os
    filelist = list()
    filed = open('out.txt', 'w')
    searchfile = open("asdf.csv")
    for lines in searchfile:
        if "Sequence" in lines:
            print lines
    filelist.append(lines)
    TheString = " ".join(filelist)
    searchfile.close()
    filed.write(TheString)
    filed.close()
main()
It sounds like you want the lines you are printing out to be collected in the variable "filelist", which is then written to the file at the .write() call. Only a difference of indentation (which is significant in Python) prevents this from happening:
def main():
    import os
    filelist = list()
    filed = open('out.txt', 'w')
    searchfile = open("asdf.csv")
    for lines in searchfile:
        if "Sequence" in lines:
            print lines
            filelist.append(lines)
    TheString = " ".join(filelist)
    searchfile.close()
    filed.write(TheString)
    filed.close()
main()
Having
filelist.append(lines)
at the same level of indentation as
print lines
tells Python that they are in the same block, and that the second statement also belongs to the "then" clause of the if statement.
Your problem is that you are not appending inside the loop; as a consequence you are only appending the last line. Do it like this:
for lines in searchfile:
    if "Sequence" in lines:
        print lines
        filelist.append(lines)
BONUS: This is the "pythonic" way to do what you want:
def main():
    with open('asdf.csv', 'r') as src, open('out.txt', 'w') as dest:
        dest.writelines(line for line in src if 'Sequence' in line)
def main():
    seq = "Sequence"
    record = file("out.txt", "w")
    search = file("in.csv", "r")
    output = list()
    for line in search:
        if seq in line: output.append(line)
    search.close()
    record.write(" ".join(output))
    record.close()

python zipfile module with TextIOWrapper

I wrote the following piece of code to read a text file inside of a zipped directory. Since I don't want the output in bytes I added the TextIOWrapper to display the output as a string. Assuming that this is the right way to read a zip file line by line (if it isn't let me know), then why does the output print a blank line? Is there any way to get rid of it?
import zipfile
import io
def test():
    zf = zipfile.ZipFile(r'C:\Users\test\Desktop\zip1.zip')
    for filename in zf.namelist():
        words = io.TextIOWrapper(zf.open(filename, 'r'))
        for line in words:
            print(line)
    zf.close()
test()
>>>
This is a test line...

This is a test line...

>>>
The two lines in the file inside of the zipped folder are:
This is a test line...
This is a test line...
Thanks!
ZipFile.open opens the zipped file in binary mode, which doesn't strip out carriage returns (i.e. '\r'), and neither did the defaults for TextIOWrapper in my test. Try configuring TextIOWrapper to use universal newlines (i.e. newline=None):
import zipfile
import io
zf = zipfile.ZipFile('data/test_zip.zip')
for filename in zf.namelist():
    with zf.open(filename, 'r') as f:
        words = io.TextIOWrapper(f, newline=None)
        for line in words:
            print(repr(line))
Output:
'This is a test line...\n'
'This is a test line...'
The normal behavior when iterating a file by line in Python is to retain the newline at the end. The print function also adds a newline, so you'll get a blank line. To just print the file you could instead use print(words.read()). Or you could use the end option of the print function: print(line, end='').
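Putting those pieces together, a sketch of the whole loop with the duplicate newline suppressed (reusing the path from the question, and assuming Python 3 as in the print calls above):
import io
import zipfile

with zipfile.ZipFile(r'C:\Users\test\Desktop\zip1.zip') as zf:
    for filename in zf.namelist():
        with zf.open(filename, 'r') as f:
            for line in io.TextIOWrapper(f, newline=None):
                print(line, end='')   # the line already carries its own newline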
