Function keeps returning None - python

I'm trying to write a function to count how many lines in my input file begin with 'AJ000012.1' but my function keeps returning None. I'm a beginner and not entirely sure what the problem is and why this keeps happening. The answer is supposed to be 13 and when I just write code eg:
count=0
input=BLASTreport
for line in input:
    if line.startswith('AJ000012.1'):
        count=count+1
print('Number of HSPs: {}'.format(count))
I get the right answer. When I try to make this a function and call it, it does not work:
def nohsps(input):
    count=0
    for line in input:
        if line.startswith('AJ000012.1'):
            count=count+1
        return
ans1=nohsps(BLASTreport)
print('Number of HSPs: {}'.format(ans1))
Any help would be seriously appreciated, thank you!
(HSP stands for high scoring segment pair if you're wondering. The input file is a BLAST report file that lists alignment results for a DNA sequence)

A bare return, with no value after it, returns None. You want to return something, and based on your description that something is count. Furthermore, you are returning inside your for loop, so the function exits before it has seen every line and you never get the count you expect. To count all occurrences of your match, move the return outside of the loop:
def nohsps(input):
    count=0
    for line in input:
        if line.startswith('AJ000012.1'):
            count=count+1
    return count
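As a quick check, the corrected function can be run on a small in-memory list. This is only a sketch: the sample lines are made up and stand in for BLASTreport, which the question does not show, and the parameter is called lines here (an assumption) so that it does not shadow the built-in input.

```python
def nohsps(lines):
    """Count the lines that begin with the accession 'AJ000012.1'."""
    count = 0
    for line in lines:
        if line.startswith('AJ000012.1'):
            count = count + 1
    # The return sits outside the loop, so every line is examined first.
    return count

# Hypothetical sample data standing in for the real BLAST report.
sample_report = [
    'AJ000012.1\tfirst made-up hit\n',
    'XY999999.9\tunrelated made-up line\n',
    'AJ000012.1\tsecond made-up hit\n',
]
print('Number of HSPs: {}'.format(nohsps(sample_report)))  # prints "Number of HSPs: 2"
```

With the return indented one level deeper, the same call would hand back a value before the later lines had been read.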

Related

How to do this recursion method to print out every 3 letters from a string on a separate line?

I'm making a method that takes a string and outputs parts of the string on separate lines according to a window.
For example:
I want to output every 3 letters of my string on a separate line.
Input : "Advantage"
Output:
Adv
ant
age
Input2: "231141515"
Output:
231
141
515
My code:
def print_method(input):
    mywindow = 3
    start_index = input[0]
    if(start_index == input[len(input)-1]):
        exit()
    print(input[1:mywindow])
    printmethod(input[mywindow:])
However I get a runtime error.... Can someone help?
I think this is what you're trying to get. Here's what I changed:
Renamed input to input_str. input is the name of a built-in function in Python (not a keyword), so using it as a variable name shadows the built-in and is best avoided.
Added the missing _ in the recursive call to print_method.
Print from 0:mywindow instead of 1:mywindow (which would skip the first character). When you start at 0, you can also just say :mywindow to get the same result.
Changed the exit statement (was that sys.exit?) to a return, which is probably what was wanted, and changed the if condition so the function returns once it is given an empty string. The last string printed might be shorter than 3 characters; if you would rather not print such a short final piece, you could use if len(input_str) < 3: return instead.
def print_method(input_str):
    mywindow = 3
    if not input_str:  # or you could do if len(input_str) == 0
        return
    print(input_str[:mywindow])
    print_method(input_str[mywindow:])
Edit: sorry, I missed the title. If this is not a learning exercise for recursion, you shouldn't use recursion here, because it is less efficient and creates a new slice of the remaining string on every call.
def chunked_print(string, window=3):
    for i in range(0, len(string) // window + 1):
        print(string[i*window:(i+1)*window])
This works when the window size does not divide the string length, but it prints an extra empty line when it does. You can modify that according to your needs.
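If the trailing empty line is unwanted, one possible variation steps through the string in strides of window, so the loop never starts a chunk at or past the end. The names chunks and chunked_print_exact are made up for this sketch and are not part of the answer above.

```python
def chunks(string, window=3):
    """Return consecutive pieces of `string`, each at most `window` characters long."""
    # range() with a step never yields a start index at or beyond len(string),
    # so no empty piece is produced when window divides the length exactly.
    return [string[i:i + window] for i in range(0, len(string), window)]

def chunked_print_exact(string, window=3):
    """Print each piece on its own line."""
    for piece in chunks(string, window):
        print(piece)

chunked_print_exact("Advantage")  # prints Adv, ant, age on separate lines
chunked_print_exact("abcdefgh")   # prints abc, def, gh on separate lines
```

Splitting the slicing from the printing is a design choice made here so the pieces can be inspected without capturing output.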

How can I simplify this Python code (assignment from a book)?

I am studying the book "Python for Everybody" by Charles R. Severance and I have a question about Exercise 2 from Chapter 7.
The task is to go through the mbox-short.txt file and "When you encounter a line that starts with “X-DSPAM-Confidence:” pull apart the line to extract the floating-point number on the line. Count these lines and then compute the total of the spam confidence values from these lines. When you reach the end of the file, print out the average spam confidence."
Here is my way of doing this task:
fname = input('Enter the file name: ')
try:
    fhand = open(fname)
except:
    print('File cannot be opened:', fname)
    exit()
count = 0
values = list()
for line in fhand:
    if line.startswith('X-DSPAM-Confidence:'):
        string = line
        count = count + 1
        colpos = string.find(":")
        portion = string[colpos+1:]
        portion = float(portion)
        values.append(portion)
print('Average spam confidence:', sum(values)/count)
I know this code works because I get the same result as in the book. However, I think it can be simpler, because I used a list in it (declared it and then stored values in it). "Lists" is the next topic in the book, and when solving this task I didn't know anything about lists and had to google them. I solved the task this way because this is what I'd do in R (which I am already quite familiar with): I'd make a vector in which I'd store the values from my iteration.
So my question is: Can this code be simplified? Can I do the same task without using a list? If yes, how can I do it?
You could change the "values" object to a float that holds a running total. The overhead of a list is not really needed in this problem.
values = 0.0
Then in the loop use
values += portion
Otherwise, there really is not a simpler way, because the problem consists of several tasks and you must complete all of them to solve it:
Open File
Check For Error
Loop Through Lines
Find certain lines
Total up said lines
Print average
If you can do it in 3 lines of code, great, but that doesn't necessarily make what goes on in the background any simpler. It will also probably look ugly.
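The running-total idea can be sketched as follows. To keep the example self-contained it works on a made-up list of lines rather than on mbox-short.txt, and the function name average_confidence is an assumption introduced here.

```python
def average_confidence(lines):
    """Average the numbers on lines that start with 'X-DSPAM-Confidence:'."""
    prefix = 'X-DSPAM-Confidence:'
    count = 0
    total = 0.0  # running sum, replaces the list of values
    for line in lines:
        if line.startswith(prefix):
            count = count + 1
            # float() ignores the leading space and the trailing newline.
            total = total + float(line[len(prefix):])
    if count == 0:
        return None  # no matching lines, so there is no average
    return total / count

# Hypothetical sample lines; a real run would iterate over the open file instead.
sample_lines = [
    'X-DSPAM-Confidence: 0.75\n',
    'Subject: made-up line that is ignored\n',
    'X-DSPAM-Confidence: 0.25\n',
]
print('Average spam confidence:', average_confidence(sample_lines))  # prints 0.5
```

Because an open file is itself iterable line by line, the same function should also accept fhand in place of sample_lines.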
You could filter the file's lines before the loop, then collapse the other variables into one and get the values with a list comprehension. Your count is then the length of that list.
interesting_lines = (line for line in fhand if line.startswith('X-DSPAM-Confidence:'))
values = [float(line[(line.find(":")+1):]) for line in interesting_lines]
count = len(values)
Can I do the same task without using list?
If the output only needs to be an average, yes: you can accumulate the sum and the count in their own variables, and then you don't need a list to call sum(values) on.
Note that open(fname) is giving you an iterable collection anyway, and you're looping over the "list of lines" in the file.
List-comprehensions can often replace for-loops that add to a list:
fname = input('Enter the file name: ')
try:
    fhand = open(fname)
except:
    print('File cannot be opened:', fname)
    exit()
values = [float(l[l.find(":")+1:]) for l in fhand if l.startswith('X-DSPAM-Confidence:')]
print('Average spam confidence:', sum(values)/len(values))
The inner part is simply your code combined, so perhaps less readable.
EDIT: Without using lists, it can be done with "reduce":
from functools import reduce
fname = input('Enter the file name: ')
try:
    fhand = open(fname)
except:
    print('File cannot be opened:', fname)
    exit()
sum, count = reduce(lambda acc, l: (acc[0] + float(l[l.find(":")+1:]), acc[1]+1) if l.startswith('X-DSPAM-Confidence:') else acc, fhand, (0,0))
print('Average spam confidence:', sum / count)
Reduce is often called "fold" in other languages, and it basically allows you to iterate over a collection with an "accumulator". Here, I iterate the collection with an accumulator which is a tuple of (sum, count). With each item, we add to the sum and increment the count. See Reduce documentation.
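To make the accumulator easier to follow, here is a small sketch that replaces the lambda with a named function and folds over a made-up list of lines. The names step and sample_lines are assumptions for illustration, and total is used instead of sum so the built-in is not shadowed.

```python
from functools import reduce

def step(acc, line):
    """Fold one line into the (total, count) accumulator."""
    total, count = acc
    if line.startswith('X-DSPAM-Confidence:'):
        value = float(line[line.find(':') + 1:])
        return (total + value, count + 1)
    return acc  # non-matching lines leave the accumulator unchanged

# Hypothetical sample lines standing in for the open file.
sample_lines = [
    'X-DSPAM-Confidence: 0.5\n',
    'Subject: made-up line that is ignored\n',
    'X-DSPAM-Confidence: 0.25\n',
]

# (0.0, 0) is the starting accumulator: nothing summed, nothing counted.
total, count = reduce(step, sample_lines, (0.0, 0))
print('Average spam confidence:', total / count)  # prints 0.375
```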
All this being said, "simplify" does not necessarily mean as little code as possible, so I would stick with your own code if you're not comfortable with these shorthand notations.

BFS, need to create the set once and for all, not everytime

I have an issue. I've solved an assignment with this code:
def hamta():
    ordlista=[]
    fil=open("labb9text.txt")
    ordlista=[]
    for line in fil.readlines():
        ordlista.append(line.strip())
    return ordlista
def setlista():
    ordlista=hamta()
    setlista=set()
    for a in ordlista:
        if a not in setlista:
            setlista.add(a)
    return setlista
def hittabarn(parent):
    mangd=setlista()
    children=[]
    lparent=list(parent)
    mangd.remove(parent) # the list must be read in once and for all, not in hittabarn
    for word in mangd:
        letters=list(word)
        count=0
        i=0
        for a in letters:
            if a==lparent[i]:
                count+=1
                i+=1
            else:
                i+=1
            if count==2:
                if word not in children:
                    children.append(word)
            if i>2:
                break
    return children
The problem is that every time I want to run hittabarn, I have to call setlista, which means I have to rebuild the set of words each time. It's not supposed to work like that: the assignment says I should load the list only once, at the beginning. How do I solve this? Is it possible to give setlista a parameter, and then give hittabarn another parameter? That's what I've been trying, but I can't really make it work.
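One way the parameter idea might look, as a rough sketch only: build the set once and pass it into the search function on every call. The names load_words and find_children are made up here, the words come from a small in-memory list instead of labb9text.txt, and "child" is assumed to mean a word of the same length that differs from the parent in exactly one position (for three-letter words, the same as the two matching letters counted above).

```python
def load_words(lines):
    """Build the word set once from an iterable of lines (hypothetical helper)."""
    return {line.strip() for line in lines if line.strip()}

def find_children(parent, words):
    """Return the words that differ from `parent` in exactly one position."""
    children = []
    for word in words:
        if len(word) != len(parent):
            continue
        differences = sum(1 for a, b in zip(word, parent) if a != b)
        # The parent itself has 0 differences, so it is skipped without
        # removing it from the shared set.
        if differences == 1:
            children.append(word)
    return sorted(children)

# Loaded a single time, then reused for every lookup.
words = load_words(["fan\n", "fin\n", "man\n", "gul\n"])
print(find_children("fan", words))  # prints ['fin', 'man']
print(find_children("fin", words))  # prints ['fan']
```

Not calling remove on the shared set matters in this design, since a removal made during one lookup would otherwise still be in effect for the next one.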

How to put random lines from a file into a string in python?

What I want to do is write code that has a file (named in the code, no need for user input) and picks a random line from it, whatever that line is: a long line, an IP address, or even a single word. At the end of the loop it should put the line into a string so I can use it in other parts of the code.
I tried using randomchoice(lines) but wasn't sure how to continue from there.
After that I tried using:
import random
def random_line(afile):
    line = next(afile)
    for num, aline in enumerate(afile):
        if random.randrange(num + 2): continue
        line = aline
    return line
which, for some reason, also didn't work for me.
The last method you posted worked for me. Maybe you are not opening the file correctly. Here is another approach, using random.choice
import random
def random_line(f):
    return random.choice([line for line in f])
f = open("sample.txt", 'r')
print(random_line(f))
Edit:
Another way would be (thanks to @zhangxaochen):
def random_line(f):
    return random.choice(f.readlines())
Translating another answer of mine from C:
import random
def random_line(afile):
    count = 0
    kept_line = None
    for line in afile:
        # Keep the current line with probability 1/(count+1)
        if random.randint(0, count) == 0:
            kept_line = line
        count += 1
    return kept_line
Edit: This appears to do the same thing as random.choice. I wonder if they use the same algorithm?
Edit 2: from the comments and a little experimentation it appears random.choice uses a different algorithm, which will be much more efficient if all of the elements are already in memory. This isn't usually the case for files unless you use readlines. There will be a tradeoff between having to keep the entire file in memory vs. having to calculate n random numbers.
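The single-pass approach is a form of what is usually called reservoir sampling (with a reservoir of one line). A rough way to see that it picks lines evenly is to run it many times over a small in-memory file and tally the results. The sample text, the seed, and the use of io.StringIO in place of a real file are assumptions made for this sketch.

```python
import io
import random
from collections import Counter

def random_line(afile):
    """Pick one line in a single pass without loading the whole file."""
    count = 0
    kept_line = None
    for line in afile:
        # The (count+1)-th line replaces the kept one with probability 1/(count+1).
        if random.randint(0, count) == 0:
            kept_line = line
        count += 1
    return kept_line

random.seed(0)  # fixed seed so repeated runs of this sketch behave the same
sample_text = "alpha\nbeta\ngamma\n"

# io.StringIO gives a file-like object that can be iterated line by line.
tally = Counter(
    random_line(io.StringIO(sample_text)).strip() for _ in range(3000)
)
print(tally)  # each of the three lines should appear roughly 1000 times
```

For an empty file the function returns None, so callers may want to check for that before using the result.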

while loop in python issue

I started learning Python a few weeks ago (no prior programming knowledge) and got stuck on the following issue, which I do not understand. Here is the code:
def run():
    count = 1
    while count<11:
        return count
        count=count+1
print(run())
What confuses me is why does printing this function result in: 1?
Shouldn't it print: 10?
I do not want to make a list of values from 1 to 10 (just to make myself clear), so I do not want to append the values. I just want to increase the value of my count until it reaches 10.
What am I doing wrong?
Thank you.
The first thing that you do in the while loop is return the current value of count, which happens to be 1. The loop never actually runs past the first iteration. Python is indentation sensitive (and all languages that I know of are order-sensitive).
Move your return after the while loop.
def run():
    count = 1
    while count<11:
        count=count+1
    return count
Change to:
def run():
    count = 1
    while count<11:
        count=count+1
    return count
print(run())
so you're returning the value after your loop.
A return ends the function immediately, which prevents it from ever reaching the line that increments count.
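The three variants can be compared side by side in a short sketch (the function names are made up for this comparison). Note that with the condition count<11 the corrected function returns 11 rather than 10, because the loop only stops once count is no longer below 11; a condition of count<10 gives the 10 the question expected.

```python
def run_original():
    count = 1
    while count < 11:
        return count       # exits on the first pass; the next line never runs
        count = count + 1

def run_fixed():
    count = 1
    while count < 11:
        count = count + 1
    return count           # reached only after the loop has finished

def run_to_ten():
    count = 1
    while count < 10:      # stops as soon as count reaches 10
        count = count + 1
    return count

print(run_original())  # prints 1
print(run_fixed())     # prints 11
print(run_to_ten())    # prints 10
```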
