Using Python, how do I print the lines of a text file, given a starting and ending line number?
I have come up with a function, but this doesn't work.
def printPart(src, des, varFile):
    returnLines = ""
    for curLine in range(src, des):
        returnLines += linecache.getline(varFile, curLine)
    return returnLines
Since file objects are iterable in Python, you can apply all the functions from itertools to them. Have a look at itertools.islice(). (Since this is homework, I'll leave the details to you.)
I would start at the first line of the file and call readline() in a loop, incrementing a counter (count += 1) for each line read. Once count reaches the starting line number, start printing. Once it reaches the ending line number, stop (for example with sys.exit()).
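The itertools.islice() idea mentioned above could look roughly like the sketch below. It assumes 1-based line numbers and an inclusive end line; the name read_part and its parameters are illustrative and do not come from the original posts.

```python
from itertools import islice


def read_part(path, start, end):
    """Return lines start..end (1-based, inclusive) of the file as one string."""
    with open(path) as source:
        # islice counts from 0 and excludes the stop index,
        # so lines start..end map to the slice [start - 1, end).
        return "".join(islice(source, start - 1, end))
```

Calling print(read_part("example.txt", 2, 4), end="") would then print lines 2 through 4 of a hypothetical example.txt. Because islice stops reading once the end line is reached, the rest of the file is never loaded.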
I am trying to check the first character in each line from a separate data file. This is the loop that I am using, but for some reason I get an error that says string index out of range.
for line_no in length:
    line_being_checked = linecache.getline(file_path, line_no)
    print(line_being_checked[0])
From what I understand, length is the number of lines you want to check in the file.
You could do something like this:
for line in open("file.txt", "r").read().splitlines():
    print(line[0])
This way, you can be sure that the length is correct.
As for the error, it is possible that you have an empty line, so you could use len(line) to check whether that is the case.
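A minimal sketch of that empty-line check follows; the function name first_characters is made up for illustration. Note also that linecache.getline() returns an empty string for a line number that does not exist (line numbers start at 1), which would trigger the same "string index out of range" error.

```python
def first_characters(path):
    """Return the first character of every non-empty line in the file."""
    firsts = []
    with open(path) as source:
        for line in source:
            text = line.rstrip("\n")
            if text:  # an empty string has no index 0, so skip it
                firsts.append(text[0])
    return firsts
```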
In Codio, there is a challenge to take a file, search for the number of times a string appears in it, then print that number. I was able to get the result using some suggestions, but I am still unclear on a few things.
The main question is, at what point is the loop searching for the substring S? The count() syntax that I see everywhere involves using the string to be searched, followed by the dot operator, and then the function with the substring we want to find as the parameter. It would look something like: P.count(S)
What confuses me is that the function is using line in place of P. So does this mean the function is searching line for the substring? And if so, how does that work if line is simply the counter variable for the for loop? I just want to have a clearer understanding of how this function is working in this context to get me the correct count of times that substring S appears in file P.
import sys
P= sys.argv[1]
S= sys.argv[2]
# Your code goes here
f = open(P, 'r')
c = 0
for line in f.readlines():
    c += line.count(S)
print(c)
does this mean the function is searching "line" for the substring
Yes, that's exactly what it means. And the value of line changes in every loop iteration.
And if so, how does that work if "line" is simply the counter variable for the "for" loop
It's not. Python for loops don't have counters. line is the actual line of text.
for letter in ['A', 'B', 'C']:
    print(letter)
prints
A
B
C
Let's dissect the loop:
for line in f.readlines():
    c += line.count(S)
f is the file object for your open file.
readlines() is a method that reads the whole file and returns a list of strings, each of which is one line of the file (including its trailing newline).
Thus, the statement for line in f.readlines(): iterates the variable line through the file contents; on each loop iteration, line will be the appropriate string value, the next line of the file.
Therefore, line.count(S) returns the number of non-overlapping times the target string S appears in that line of the file. The increment c += adds that to your running total.
Does that make things clear enough?
BTW, please learn to use descriptive variable names. One-letter names with mixed upper- and lower-case are a bad habit in the long run.
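As a rough illustration of both points (the per-line count and the descriptive names), the loop could be wrapped like this. The names count_occurrences, path, and substring are assumptions chosen for readability, not part of the challenge.

```python
def count_occurrences(path, substring):
    """Count non-overlapping occurrences of substring across all lines."""
    total = 0
    with open(path) as source:
        for line in source:  # line is the text of the current line
            total += line.count(substring)
    return total
```

Called as count_occurrences(P, S), it should produce the same total as the loop in the question.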
In the code below readline() will not increment. I've tried using a value, no value and variable in readline(). When not using a value I don't close the file so that it will iterate but that and the other attempts have not worked.
What happens is just the first byte is displayed over and over again.
If I don't use a function and just place the code in the while loop (without 'line' variable in readline()) it works as expected. It will go through the log file and print out the different hex numbers.
i=0
x=1
def mFinder(line):
    rgps=open('c:/code/gps.log', 'r')
    varr=rgps.readline(line)
    varr=varr[12:14].rstrip()
    rgps.close()
    return varr
while x<900:
    val=mFinder(i)
    i+=1
    x+=1
    print val
    print 'this should change'
It appears you have misunderstood what file.readline() does. Passing in an argument does not tell the method to read a specific numbered line.
The documentation tells you what happens instead:
file.readline([size])
Read one entire line from the file. A trailing newline character is kept in the string (but may be absent when a file ends with an incomplete line). If the size argument is present and non-negative, it is a maximum byte count (including the trailing newline) and an incomplete line may be returned.
Note the final sentence of that quote: you are passing in a maximum byte count, so rgps.readline(1) reads a single byte, not the first line.
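The effect of the size argument can be seen in a small Python 3 sketch. It uses io.StringIO as a stand-in for the log file, which is an assumption made only to keep the example self-contained.

```python
import io

log = io.StringIO("first line\nsecond line\n")

partial = log.readline(1)   # size limit of 1: at most one character is read
rest = log.readline()       # continues from the current position in the same line
following = log.readline()  # the next full line

print(repr(partial), repr(rest), repr(following))
# 'f' 'irst line\n' 'second line\n'
```

In the question's code the file is reopened on every call, so each call starts again at the beginning of the file.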
You need to keep a reference to the file object around until you are done with it, and repeatedly call readline() on it to get successive lines. You can pass the file object to a function call:
def finder(fileobj):
    line = fileobj.readline()
    return line[12:14].rstrip()

with open('c:/code/gps.log') as rgps:
    x = 0
    while x < 900:
        section = finder(rgps)
        print section
        # do stuff
        x += 1
You can also loop over files directly, because they are iterators:
for line in openfileobject:
    pass  # do something with line
or use the next() function to get the next line, as long as you don't mix .readline() calls and iteration (including next()). If you combine this with a generator function, you can leave the file object entirely to a separate function that will read lines and produce sections until you are done:
def read_sections():
    with open('c:/code/gps.log') as rgps:
        for line in rgps:
            yield line[12:14].rstrip()

for section in read_sections():
    # do something with `section`.
    pass
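For reference, a Python 3 variant of the same generator idea might look like the sketch below. The path, start, and stop parameters are additions for illustration; the defaults mirror the [12:14] slice used above.

```python
def read_sections(path, start=12, stop=14):
    """Yield characters start..stop-1 of each line, stripped of trailing whitespace."""
    with open(path) as source:
        for line in source:
            yield line[start:stop].rstrip()
```

To cap the number of lines processed, as the while x < 900 loop does, the generator could be wrapped in itertools.islice(read_sections(path), 900).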
My function only reports the last word in a file of words as an anagram (is_anagram is the first helper function). Yet every word in the file is an anagram of the word I tested, and each returns True when I call the helper function on its own, outside the main function. I am not sure whether it has something to do with \n being part of the string. I tried adding an if statement to delete it when present, but that did not work either. I also tested to make sure it is running through each word in the .txt file, and it is.
def is_anagram(string1, string2):
    """Returns True if the two strings are anagrams of each other.
    str, str -> bool"""
    if sorted(string1)==sorted(string2):
        return True
    else:
        return False

def find_anagrams(word):
    final = []
    content = open("small_list.txt")
    content.close
    while True:
        line = content.readline()
        print(line)
        if is_anagram(word, line) == True:
            print("bruh")
            final.append(line)
        elif line == '':
            break
    return final
This is expected, based on the method you use to read a line (file.readline). From the documentation:
f.readline() reads a single line from the file; a newline character
(\n) is left at the end of the string, and is only omitted on the last
line of the file if the file doesn’t end in a newline.
Your line has a trailing newline, but word certainly does not. So, in the end, all you'd need to change is:
line = content.readline().rstrip()
Well, that's all you'd need to change to get it working. Additionally, I'd also recommend using the with...as context manager to handle file I/O. It's good practice, and you'll thank yourself for it.
with open("small_list.txt") as f:
    for line in f:
        if is_anagram(word, line.rstrip()):
            ... # do something here
It's better to use a for loop to iterate over the lines of a file (rather than a while, it's cleaner). Also, there's no need to explicitly call f.close() when you use a context manager (you're not currently doing it, you're only referencing the method without actually calling it).
Incorporating @Christian Dean's suggestion in this answer, you can simplify your anagram function as well: call sorted and return the result in a single line:
def is_anagram(a, b):
    return sorted(a) == sorted(b)
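Putting the pieces of this answer together, a complete version might look like the sketch below. The path parameter is an addition for illustration (the question hard-codes small_list.txt), and the file is assumed to hold one word per line.

```python
def is_anagram(a, b):
    return sorted(a) == sorted(b)


def find_anagrams(word, path="small_list.txt"):
    """Return every word in the file (one per line) that is an anagram of word."""
    matches = []
    with open(path) as source:
        for line in source:
            candidate = line.rstrip()  # drop the trailing newline before comparing
            if is_anagram(word, candidate):
                matches.append(candidate)
    return matches
```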
Update: My current question is how can I get my code to read to the EOF starting from the beginning with each new search phrase.
This is an assignment I am doing and currently stuck on. Mind you this is a beginner's programming class using Python.
jargon = open("jargonFile.txt","r")
searchPhrase = raw_input("Enter the search phrase: ")
while searchPhrase != "":
    result = jargon.readline().find(searchPhrase)
    if result == -1:
        print "Cannot find this term."
    else:
        print result
    searchPhrase = raw_input("Enter the search phrase: ")
jargon.close()
The assignment is to take a user's searchPhrase, find it in a file (jargonFile.txt), and print the result (the line on which it occurred and the character position of the occurrence). I will use a counter to find the line number of the occurrence, but I will implement that later. For now, my question is about the error I am getting: I can't find a way to make it search the entire file.
Sample run:
Enter the search phrase: dog
16
Enter the search phrase: hack
Cannot find this term.
Enter the search phrase:
"dog" is found in the first line; however, it is also found in other lines of the jargonFile (multiple times as a string), but only the first occurrence in the first line is shown. The string hack is found numerous times in the jargonFile, but my code is set up to search only the first line. How may I go about solving this problem?
If this is not clear enough I can post up the assignment if need be.
First you open the file and read it into a string with readline(). Later on you try to readline() from the string you obtained in the first step.
You need to take care what object (thing) you're handling: open() gave you a file "jargon", readline on jargon gave you the string "jargonFile".
So jargonFile.readline does not make sense anymore
Update as answer to comment:
Okay, now that the str error problem is solved think about the program structure:
big loop
    enter a search term
    open file
    inner loop
        read a line
        print result if string found
    close file
You need to change your program so it follows that description.
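The structure described above could be sketched in Python 3 as follows. The function names search_file and query_loop are illustrative, and returning (line number, column) pairs is an assumption about what "print result" should show.

```python
def search_file(path, phrase):
    """Return (line_number, column) for each line that contains phrase."""
    hits = []
    # The file is opened on every call, so each search starts at line 1.
    with open(path) as source:
        for line_number, line in enumerate(source, start=1):
            column = line.find(phrase)
            if column != -1:
                hits.append((line_number, column))
    return hits


def query_loop(path):
    """Big loop: keep asking for search terms until an empty one is entered."""
    phrase = input("Enter the search phrase: ")
    while phrase != "":
        hits = search_file(path, phrase)
        if hits:
            for line_number, column in hits:
                print(line_number, column)
        else:
            print("Cannot find this term.")
        phrase = input("Enter the search phrase: ")
```

Running query_loop("jargonFile.txt") would start the interactive session.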
Update II:
SD, if you want to avoid reopening the file, you still need two loops, but this time one loop reads the file into memory. When that's done, the second loop asks for the search term. So you would structure it like this:
create empty list
open file
read loop:
    read a line from the file
    append the line to the list
close file
query loop:
    ask the user for input
    for each line in the list:
        print result if string found
For extra points from your professor, add some comments to your solution that mention both possible solutions and say why you chose the one you did. Hint: in this case it is a classic tradeoff between execution time (memory is fast) and memory usage (what if your jargon file contains 100 million entries ... ok, you'd use something more complicated than a flat file in that case, but you can't load it in memory either).
Oh and one more hint to the second solution: Python supports tuples ("a","b","c") and lists ["a","b","c"]. You want to use the latter one, because list can be modified (a tuple can't.)
myList = ["Hello", "SD"]
myList.append("How are you?")
for line in myList:
    print line
==>
Hello
SD
How are you?
Okay that last example contains all the new stuff (define list, append to list, loop over list) for the second solution of your program. Have fun putting it all together.
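For comparison, the second solution (read the file once, then search the list in memory) might be sketched in Python 3 like this. The names load_lines and search_lines are hypothetical, and the (line number, column) result format is an assumption.

```python
def load_lines(path):
    """Read the whole file once and return its lines as a list."""
    lines = []
    with open(path) as source:
        for line in source:
            lines.append(line)
    return lines


def search_lines(lines, phrase):
    """Return (line_number, column) for each stored line containing phrase."""
    return [
        (number, line.find(phrase))
        for number, line in enumerate(lines, start=1)
        if phrase in line
    ]
```

The query loop then calls search_lines(lines, phrase) for each phrase the user enters, without touching the file again.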
Hmm, I don't know anything at all about Python, but it looks to me like you are not iterating through all the lines of the file for the search string entered.
Typically, you need to do something like this:
enter search string
open file
if file has data
    start loop
        get next line of file
        search the line for your string and do something
    Exit loop if line was end of file
So for your code:
jargon = open("jargonFile.txt","r")
searchPhrase = raw_input("Enter the search phrase: ")
while searchPhrase != "":
    <<if file has data?>>
        <<while>>
            result = jargon.readline().find(searchPhrase)
            if result == -1:
                print "Cannot find this term."
            else:
                print result
        <<result is not end of file>>
    searchPhrase = raw_input("Enter the search phrase: ")
jargon.close()
Cool, did a little research on the page DNS provided and Python happens to have the "with" keyword. Example:
with open("hello.txt") as f:
    for line in f:
        print line
So another form of your code could be:
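searchPhrase = raw_input("Enter the search phrase: ")
while searchPhrase != "":
    found = False
    with open("jargonFile.txt") as f:
        for line in f:
            result = line.find(searchPhrase)
            if result != -1:
                # Report every line that contains the phrase.
                found = True
                print result
    # Report a miss once per search, not once per non-matching line.
    if not found:
        print "Cannot find this term."
    searchPhrase = raw_input("Enter the search phrase: ")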
Note that "with" automatically closes the file when you're done.
Your file is jargon, not jargonFile (a string). That's probably what's causing your error message. You'll also need a second loop to read each line of the file from the beginning until you find the word you're looking for. Your code currently stops searching if the word is not found in the current line of the file.
How about trying to write code that only gives the user one chance to enter a string? Input that string, search the file until you find it (or not) and output a result. After you get that working you can go back and add the code that allows multiple searches and ends on an empty string.
Update:
To avoid iterating the file multiple times, you could start your program by slurping the entire file into a list of strings, one line at a time. Look up the readlines method of file objects. You can then search that list for each user input instead of re-reading the file.
You shouldn't try to reinvent the wheel. Just use the re module functions.
Your program could work better if you used:
result = jargon.read()
instead of:
result = jargon.readline()
Then you could use the re.findall() function and join the strings (with the indexes) you searched for with str.join().
This could get a little messy, but if you take some time to work it out, this could fix your problem.
The Python documentation covers this well.
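A rough sketch of the re-based approach follows. It swaps in re.finditer() for the re.findall() named above, because findall returns only the matched strings while the match objects from finditer also expose positions. The helper name find_offsets is made up for illustration.

```python
import re


def find_offsets(text, phrase):
    """Return the start offset of every non-overlapping match of phrase in text."""
    # re.escape makes the phrase match literally, even if it contains
    # characters such as "." or "*" that are special in regular expressions.
    return [match.start() for match in re.finditer(re.escape(phrase), text)]
```

With text = jargon.read(), find_offsets(text, searchPhrase) gives character offsets into the whole file. Turning those into line numbers would need extra work, such as counting the newlines before each offset.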
Every time you enter a search phrase, it looks for it on the next line, not the first one. You need to re-open the file for every search phrase if you want it to behave as you describe.
Take a look at the documentation for File objects:
http://docs.python.org/library/stdtypes.html#file-objects
You might be interested in the readlines method. For a simple case where your file is not enormous, you could use that to read all the lines into a list. Then, whenever you get a new search string, you can run through the whole list to see whether it's there.
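The point that a file object keeps its position between reads can be illustrated with a short Python 3 sketch. It uses io.StringIO in place of the real file (an assumption for self-containment) and also shows seek(0), an alternative to re-opening that the answers above do not mention.

```python
import io

jargon = io.StringIO("dog\nhack\n")  # stands in for an open file

first_pass = [line for line in jargon]   # consumes every line
second_pass = [line for line in jargon]  # nothing left to read
jargon.seek(0)                           # move back to the beginning
third_pass = [line for line in jargon]   # all lines are available again
```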