Deleting a line from a file in Python - python

I'm trying to delete a specific line that contains a specific string.
I've a file called numbers.txt with the following content:
peter
tom
tom1
yan
What I want to delete is that tom from the file, so I made this function:
def deleteLine():
    fn = 'numbers.txt'
    f = open(fn)
    output = []
    for line in f:
        if not "tom" in line:
            output.append(line)
    f.close()
    f = open(fn, 'w')
    f.writelines(output)
    f.close()
The output is:
peter
yan
As you can see, the problem is that the function deletes both tom and tom1, but I want to delete only tom. This is the output I want:
peter
tom1
yan
Any ideas on how to change the function so it does this correctly?

Change the line:
if not "tom" in line:
to:
if "tom" != line.strip():

That's because
if not "tom" in line
checks whether tom is not a substring of the current line. But tom is a substring of tom1, so that line is deleted as well.
You probably want one of the following:
if not "tom\n" == line  # exact comparison, including the newline
if "tom\n" != line  # the same comparison, written the usual way
if not "tom" == line.strip()  # first removes surrounding whitespace from `line`
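To put the exact comparison in context, here is a minimal Python 3 sketch of the whole rewrite. The names drop_exact and delete_exact_line are made up for this example. It compares each line with its trailing newline removed, so tom1 is kept and a last line without a newline is still handled.

```python
def drop_exact(lines, target):
    """Return the lines that are not exactly equal to `target` (newline ignored)."""
    # rstrip("\n") removes only the line ending, so "tom1" never equals "tom".
    return [line for line in lines if line.rstrip("\n") != target]


def delete_exact_line(path, target):
    """Rewrite the file at `path` without the lines equal to `target`."""
    with open(path) as f:
        kept = drop_exact(f, target)
    # Reopen for writing only after everything has been read.
    with open(path, "w") as f:
        f.writelines(kept)
```

Calling delete_exact_line('numbers.txt', 'tom') on the file from the question should leave peter, tom1 and yan.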

Just for fun, here's a two-liner to do it.
lines = filter(lambda x:x[0:-1]!="tom", open("names.txt", "r"))
open("names.txt", "w").write("".join(lines))
Challenge: someone post a one-liner for this.
You could also use the fileinput module to get arguably the most readable result:
import fileinput
for l in fileinput.input("names.txt", inplace=1):
    if l != "tom\n": print l[:-1]

You can use regex.
import re
if not re.match("^tom$", line):
    output.append(line)
The $ matches at the end of the string, or just before a trailing newline, so tom followed by a newline matches but tom1 does not.

I'm new to programming and Python (a few months), and this is my solution:
import fileinput
c = 0 # counter
for line in fileinput.input("korrer.csv", inplace=True, mode="rb"):
    # the line I want to delete
    if c == 3:
        c += 1
        pass
    else:
        line = line.replace("\n", "")
        print line
        c += 1
I'm sure there is a simpler way; this is just an idea. (My English is not very good!)

Related

print last line from the variable output in python

I am looking to get the last line produced from the line variable
bash-4.1$ cat file1_dup.py
#!/usr/bin/python
with open("file1.txt") as f:
    lines = f.readlines()
    for line in lines:
        if "!" in line:
            line = line.split()[-1].strip()
            print line
The output I am getting is as follows:
-122.1058
-123.1050
-125.10584323
The result I want printed is
-125.10584323
I also found a hint by googling that produces the desired output, but it seems a bit complicated to me at this point:
bash-4.1$ cat file2_dup.py
#!/usr/bin/python
def file_read_from_tail(fname,n):
    with open(fname) as f:
        f=f.read().splitlines()
        lines=[x for x in f]
        for i in range(len(lines)-n,len(lines)):
            line = lines[i].split()[-1]
            #line = line.split()[-1]
            print line
file_read_from_tail('file1.txt',1)
This yields the desired output, as follows:
bash-4.1$ ./file2_dup.py
-125.10584323
PS: I borrowed this question, out of interest, from:
how to read a specific line and print a specific position in this line using python
You could test whether the new value is smaller than the one before, like this:
#!/usr/bin/python
res_line = 0
with open("file1.txt") as f:
    lines = f.readlines()
    for line in lines:
        if "!" in line:
            line = float(line.split()[-1].strip())
            if res_line > line:
                res_line = line
print res_line
Edit:
You can use enumerate() to get the lines indexed in a loop:
with open("file1.txt", "rt") as f:
    lines = f.readlines()
    for line, content in enumerate(lines):
        # apply your logic to line and/or content here
        # by adding ifs to select the lines you want...
        print line, content.strip() # do your thing
will output (just to illustrate because I didn't specify any conditions in code above):
0 -122.1058
1 -123.1050
2 -125.10584323
3
Alternatively, select your specific line with a condition in a list comprehension
by using this code instead:
with open("file1.txt", "rt") as f:
    lines = f.readlines()
    result = [ content.strip() for line, content in enumerate(lines)
               if line == len(lines) - 2] # creates a list with only the last
                                          # non-blank line (the file ends with a blank one)
    print result[0]
that will output:
-125.10584323
Try the following:
print [line.split()[-1].strip() for line in lines if '!' in line][-1]
A better way, I think, is to create an empty list, append each value that passes the condition, and then pick the index you want. This is handy because it works for whichever matching line you need.
Suppose you want the second-to-last match: just change the print statement to print(lst[-2]).
#!/usr/bin/python
file = open('file1.txt', 'r')
lst = list()
for line in file:
    if "!" in line:
        x= line.split()
        lst.append(x[-1])
print(lst[-1])
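If only the last match is needed, one option is to skip the list and remember just the most recent value. This is a rough Python 3 sketch. The names last_marked_value and marker are hypothetical, and since the real contents of file1.txt are not shown, it only assumes that the wanted value is the last whitespace-separated field of a line containing "!".

```python
def last_marked_value(lines, marker="!"):
    """Return the last field of the last line containing `marker`, or None."""
    last = None
    for line in lines:
        if marker in line:
            fields = line.split()
            if fields:
                # Overwrite on every match; the final value is the last match.
                last = fields[-1]
    return last
```

With a file, this could be called as last_marked_value(open_file), since a file object yields its lines one by one.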

python parse and print text before .(dot)

I am trying to write a program that parses the text file qwer.txt and prints the value after '.' and before '=':
qwer.txt
john.xavier=s/o john
jane.victory=s/o ram
output:
xavier
victory
My program shows the entire line. Please help me display only the text between . and =
with open("qwer.txt", 'r') as my_file:
    a = my_file.readlines()
    for line in a:
        for part in line.split():
            if "=" in part:
                print part.split(' ')[-1]
Please help! Answers will be appreciated.
with open("qwer.txt", 'r') as my_file:
    for line in my_file:
        print line.split('=')[0].split('.')[1]
You might need to understand the with statement better :-)
Here is my solution:
with open("qwer.txt", 'r') as my_file:
    for line in my_file:
        name = line.split("=", 1)[0]
        print name.split(".")[-1]
The two lines can be combined like this as well:
print line.split("=", 1)[0].split(".")[-1]
The official doc of "with" statement is here
Fun little way using regex rather than splitting, and will ignore bad lines rather than erroring (pretty slick if I do say so myself). Also gives you a nice list of names if you want to use them further rather than outputting.
import re
r = re.compile(r'.+?\.(.+)?\=.+')  # raw string so the backslashes reach the regex engine unchanged
with open("qwer.txt", 'r') as f:
    names = [r.match(x).group(1) for x in f.read().splitlines() if r.match(x)]
for name in names: print name
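Another option, sketched here for Python 3, is str.partition, which never raises on a malformed line. The helper name name_after_dot is invented for this example, and returning None for lines without '=' or '.' is an assumption about what ignoring bad lines should mean.

```python
def name_after_dot(line):
    """Return the text between the last '.' before '=' and the '=' itself."""
    key, sep, _ = line.partition("=")
    if not sep:
        return None  # no '=' on this line
    _, dot, name = key.rpartition(".")
    if not dot:
        return None  # no '.' before the '='
    return name.strip()
```

For the two sample lines in the question this should give xavier and victory.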

Cancel next() function in python script

I have a very large file with wrong information.
this one
is the
xxx 123gt few 1121
12345 fre 233fre
problematic file.
It contains
xxx hy 456 efe
rtg 1215687 fwe
many errors
That I'd like
toget rid of
I wrote a script. Whenever xxx is encountered:
The line is replaced with a custom string (something).
The very next line is replaced with another custom string (stg).
Here is the script:
subject='problematic.txt'
pattern='xxx'
subject2='resolved.txt'
output = open(subject2, 'w')
line1='something'
line2='stg'
with open(subject) as myFile:
    for num, line in enumerate(myFile, 1): #to get the line number
        if pattern in line:
            print 'found at line:', num
            line = line1 #replace the line containing xxx with 'something'
            output.write(line)
            line = next(myFile, "") # move to the next line
            line = line2 #replace the next line with 'stg'
            output.write(line)
        else:
            output.write(line) # save as is
output.close()
myFile.close()
It works well with the first xxx occurrence, but not with the subsequent ones. The reason is that next() advances the iteration, so my script makes changes in the wrong places.
Here is the output:
found at line: 3
found at line: 6
instead of :
found at line: 3
found at line: 7
Consequently, the changes are not made in the right place... Ideally, undoing next() after I changed the line with line2 would solve my problem, but I couldn't find a previous() function. Anyone? Thanks!!
Your current code almost works. I believe that it correctly identifies and filters out the right lines of your input file, but it reports the line numbers it finds the matches at incorrectly, since the enumerate generator doesn't see the skipped lines.
Though you could rewrite it in various ways as the other answers suggest, you don't need to make major changes (unless you want to, for other design reasons). Here's the code with the minimal changes needed pointed out by new comments:
with open(subject) as myFile:
    gen = enumerate(myFile, 1) # save the enumerate generator to a variable
    for num, line in gen: # iterate over it, as before
        if pattern in line:
            print 'found at line:', num
            line = line1
            output.write(line)
            next(gen, None) # advance the generator and throw away the results
            line = line2
            output.write(line)
        else:
            output.write(line)
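The same idea as a self-contained Python 3 sketch that works on any iterable of lines, so it can be tried without the files. The function name replace_pairs and its return value (the output lines plus the reported line numbers) are assumptions made for the example, not part of the original script.

```python
def replace_pairs(lines, pattern, first, second):
    """Replace each line containing `pattern`, and the line after it."""
    out, found = [], []
    # One shared iterator: the for loop and next() both advance it,
    # so the line numbers stay in step with the input.
    numbered = enumerate(lines, 1)
    for num, line in numbered:
        if pattern in line:
            found.append(num)
            out.append(first)
            next(numbered, None)  # skip the following line (no error at end of input)
            out.append(second)
        else:
            out.append(line)
    return out, found
```

Fed the eleven sample lines from the question, it reports matches at lines 3 and 7.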
When you think you need to look ahead, it is almost always simpler to restate the problem in terms of looking back. In this case, just keep track of the previous line and look at that to see if it matches your target string.
infilename = "problematic.txt"
outfilename = "resolved.txt"
pattern = "xxx"
replace1 = "something"
replace2 = "stg"
with open(infilename) as infile:
    with open(outfilename, "w") as outfile:
        previous = ""
        for linenum, current in enumerate(infile):
            if pattern in previous:
                print "found at line", linenum
                previous, current = replace1, replace2
            if linenum: # skip the first (blank) previous line
                outfile.write(previous)
            previous = current
        outfile.write(previous) # write the final line
This seems to work with the string to be replaced appearing both at odd and even line numbers:
with open ('test.txt', 'r') as f:
    for line in f:
        line = line.strip ()
        if line == 'apples': #to be replaced
            print ('manzanas') #replacement 1
            print ('y más manzanas') #replacement 2
            next (f)
            continue
        print (line)
Sample input:
apples
pears
apples
pears
pears
apples
pears
pears
Sample output:
manzanas
y más manzanas
manzanas
y más manzanas
pears
manzanas
y más manzanas
pears
There is no previous function because that's not how the iterator protocol works. Especially with generators, the concept of a "previous" element may not even exist.
Instead you want to iterate over your file with two cursors, zipping them together:
from itertools import tee
with open(subject) as f:
    its = tee(f)
    next(its[1]) # advance the second iterator past the first line
    for first,second in zip(*its): # in python 2, use itertools.izip
        #do something to first and/or second, comparing them appropriately
The above is just like doing for line in f:, except you now have your first line in first and the line immediately after it in second.
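A small Python 3 sketch of the two-cursor idea as a reusable helper. The name pairwise_lines is made up here. On Python 3.10 and later, itertools.pairwise should do the same job.

```python
from itertools import tee


def pairwise_lines(lines):
    """Yield (line, following_line) pairs from any iterable of lines."""
    first, second = tee(lines)
    next(second, None)  # move the second cursor one line ahead
    # zip stops when the shorter iterator runs out, so the last line
    # only ever appears as the second element of a pair.
    return zip(first, second)
```

Note that an input with fewer than two lines produces no pairs at all.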
I would just set a flag to indicate that you want to skip the next line, and check for that in the loop instead of using next:
with open(foo) as myFile:
    skip = False
    for line in myFile:
        if skip:
            skip = False
            continue
        if pattern in line:
            output.write("something")
            output.write("stg")
            skip = True
        else:
            output.write(line)
You need to buffer the lines in some way. This is easy to do for a single line:
class Lines(object):
    def __init__(self, f):
        self.f = f # file object
        self.prev = None # previous line
    def next(self):
        if not self.prev:
            try:
                self.prev = next(self.f)
            except StopIteration:
                return
        return self.prev
    def consume(self):
        if self.prev is not None:
            self.prev = next(self.f)
Now you need to call Lines.next() to fetch the next line, and Lines.consume() to consume it. A line is kept buffered until it is consumed:
>>> f = open("table.py")
>>> lines = Lines(f)
>>> lines.next()
'import itertools\n'
>>> lines.next() # same line
'import itertools\n'
>>> lines.consume() # remove the current buffered line
>>> lines.next()
'\n' # next line
You can zip the lines this way to get both pointers at once:
with open(subject) as myFile:
    lines = myFile.readlines()
    for current, next in zip(lines, lines[1:]):
        ...
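edit: this is just to demonstrate the idea of zipping the lines. For big files, avoid readlines() and iterate lazily instead. Zipping the file object with itself does not give overlapping pairs, because both arguments would share one iterator, so use itertools.tee:
from itertools import tee, izip  # Python 2; on Python 3 use the built-in zip
with open(subject) as myFile:
    it1, it2 = tee(myFile)
    next(it2, None)  # move the second iterator one line ahead
    for current, next_line in izip(it1, it2):
        ...
Note that a file object is already an iterator, so there is no need to wrap it in iter() before passing it to tee.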

Count number of lines in a txt file with Python excluding blank lines

I wish to count the number of lines in a .txt file which looks something like this:
apple
orange
pear
hippo
donkey
Where there are blank lines used to separate blocks. The result I'm looking for, based on the above sample, is five (lines).
How can I achieve this?
As a bonus, it would be nice to know how many blocks/paragraphs there are. So, based on the above example, that would be two blocks.
non_blank_count = 0
with open('data.txt') as infp:
    for line in infp:
        if line.strip():
            non_blank_count += 1
print 'number of non-blank lines found %d' % non_blank_count
UPDATE: Re-read the question, OP wants to count non-blank lines .. (sigh .. thanks #RanRag).
(I need a break from the computer ...)
A short way to count the number of non-blank lines could be:
with open('data.txt', 'r') as f:
    lines = f.readlines()
num_lines = len([l for l in lines if l.strip(' \n') != ''])
I am surprised to see that there isn't a clean pythonic answer yet (as of Jan 1, 2019). Many of the other answers create unnecessary lists, count in a non-pythonic way, loop over the lines of the file in a non-pythonic way, do not close the file properly, do unnecessary things, assume that the end of line character can only be '\n', or have other smaller issues.
Here is my suggested solution:
with open('myfile.txt') as f:
    line_count = sum(1 for line in f if line.strip())
The question does not define what blank line is. My definition of blank line: line is a blank line if and only if line.strip() returns the empty string. This may or may not be your definition of blank line.
sum([1 for i in open("file_name","r").readlines() if i.strip()])
Assuming the blank lines contain only the newline character, it should be somewhat faster to avoid calling str.strip, which creates a new string, and instead check whether the line consists only of whitespace using str.isspace:
with open('data.txt') as f:
    non_blank_lines = sum(not line.isspace() for line in f)
Demo:
from io import StringIO
s = '''apple
orange
pear
hippo
donkey'''
non_blank_lines = sum(not line.isspace() for line in StringIO(s))
# 5
You can further use str.isspace with itertools.groupby to count the number of contiguous lines/blocks in the file:
from itertools import groupby
no_paragraphs = sum(k for k, _ in groupby(StringIO(s), lambda x: not x.isspace()))
print(no_paragraphs)
# 2
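For the bonus question, both numbers can also be collected in a single pass. This is only a sketch for Python 3. The helper count_lines_and_blocks is hypothetical, it treats whitespace-only lines as blank, and a block is taken to mean a run of consecutive non-blank lines.

```python
from itertools import groupby


def count_lines_and_blocks(lines):
    """Return (number of non-blank lines, number of blocks) in one pass."""
    non_blank = 0
    blocks = 0
    # groupby yields one group per run of consecutive blank or non-blank lines.
    for has_text, group in groupby(lines, key=lambda line: bool(line.strip())):
        if has_text:
            blocks += 1
            non_blank += sum(1 for _ in group)
    return non_blank, blocks
```

It accepts an open file directly, for example count_lines_and_blocks(f) inside a with open('data.txt') as f: block.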
Non-blank lines counter:
lines_counter = 0
with open ('test_file.txt') as f:
    for line in f:
        if line != '\n':
            lines_counter += 1
Blocks counter:
para_counter = 0
prev = '\n'
with open ('test_file.txt') as f:
    for line in f:
        if line != '\n' and prev == '\n':
            para_counter += 1
        prev = line
This bit of Python code should solve your problem:
with open('data.txt', 'r') as f:
    lines = len(list(filter(lambda x: x.strip(), f)))
This is how I would've done it:
f = open("file.txt")
l = [x for x in f.readlines() if x != "\n"]
print len(l)
readlines() will make a list of all the lines in the file and then you can just take those lines that have at least something in them.
Looks pretty straightforward to me!
Pretty straightforward, I believe:
f = open('path','r')
count = 0
for lines in f:
    if lines.strip():
        count +=1
print count
My one liner would be
print(sum(1 for line in open(path_to_file,'r') if line.strip()))

I am having a problem with my python program that im running

From an input file I'm supposed to extract only the first name of each student and save the result in a new file called "student-firstname.txt". The output file should contain a list of first names (not including the middle name). I was able to get rid of the last name, but I'm having a problem removing the middle name. Any help or suggestions?
The student names in the file look something like this (last name, first name, and middle initial):
Martin, John
Smith, James W.
Brown, Ashley S.
my python code is:
f=open("studentname.txt", 'r')
f2=open ("student-firstname.txt",'w')
str = ''
for line in f.readlines():
    str = str + line
    line=line.strip()
    token=line.split(",")
    f2.write(token[1]+"\n")
f.close()
f2.close()
f=open("studentname.txt", 'r')
f2=open ("student-firstname.txt",'w')
for line in f.readlines():
    token=line.split()
    f2.write(token[1]+"\n")
f.close()
f2.close()
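Split token[1] on whitespace. Calling split() with no argument ignores the leading space left after the comma, which split(' ') would turn into an empty first item.
fname = token[1].split()[0]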
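As a rough Python 3 sketch, the splitting can be wrapped in a small helper. The name first_name is invented for this example, and returning None for lines without a comma or without a first name is an assumption about how bad input should be handled.

```python
def first_name(line):
    """Return the first name from a 'Last, First M.' line, or None."""
    _, comma, rest = line.partition(",")
    if not comma:
        return None  # no comma, so the expected layout is missing
    parts = rest.split()  # split() with no argument ignores leading spaces
    return parts[0] if parts else None
```

Each non-None result could then be written to the output file followed by a newline.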
with open("studentname.txt") as f, open("student-firstname.txt", 'w') as fout:
    for line in f:
        firstname = line.split()[1]
        print >> fout, firstname
Note:
you could use a with statement to make sure that the files are always closed even in case of an exception. You might need contextlib.nested() on old Python versions
'r' is a default mode for files. You don't need to specify it explicitly
.readlines() reads all lines at once. You could iterate over the file line by line directly
To avoid hardcoding the filenames you could use fileinput. Save it to firstname.py:
#!/usr/bin/env python
import fileinput
for line in fileinput.input():
    firstname = line.split()[1]
    print firstname
Example: $ python firstname.py studentname.txt >student-firstname.txt
Check out regular expressions. Something like this will probably work:
>>> import re
>>> nameline = "Smith, James W."
>>> names = re.match("(\w+),\s+(\w+).*", nameline)
>>> if names:
...     print names.groups()
('Smith', 'James')
Line 3 basically says find a sequence of word characters as group 1, followed by a comma, some space characters and another sequence of word characters as group 2, followed by anything in nameline. (Group 0 is the whole match.)
f = open("file")
o = open("out","w")
for line in f:
    # split() returns a list, so take its first item (the first name) before adding "\n"
    o.write(line.rstrip().split(",")[1].strip().split()[0]+"\n")
f.close()
o.close()
