Converting float to sum in Python

fname = input("Enter file name: ")
count=0
fh = open(fname)
for line in fh:
    if not line.startswith("X-DSPAM-Confidence:") : continue
    count=count+1
    halo=line.find("0")
    gh=line[halo:]
    tg=gh.rstrip()
    ha=float(tg)
    total=0
    for value in range(ha):
        total=total+value
print total
The file contains a list of decimal numbers, for example:
0.1235
0.1236
0.1678
I convert each value to a float. Note that 'tg' is a single string, not a list or array:
ha=float(tg)
total=0
for value in range(ha):
    total=total+value
print total
error: start must be an integer
I know the mistake is in how I use range. What should I use instead of range?

If you want to get a sum of floats, just use the code:
fname = input("Enter file name: ")
count = 0
total = 0
fh = open(fname)
for line in fh:
    if not line.startswith("X-DSPAM-Confidence:"): continue
    count += 1
    halo = line.find("0")
    gh = line[halo:]
    tg = gh.rstrip()
    ha = float(tg)
    total += ha
print(total)

You are passing a float as the argument to range, which does not make sense. When n is its only argument, range(n) produces the n integers from 0 to n-1 (as a list in Python 2). For example:
>>> range(3)
[0, 1, 2]
So you can see that range of a float does not make sense.
If I understand your code correctly, I think you want to replace:
for value in range(ha):
    total=total+value
with:
total += ha
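To make the difference concrete, here is a minimal sketch. The sample lines and the names `sample_lines`, `range_accepts_float`, `total`, and `count` are illustrative assumptions, not taken from the original file. It shows that range rejects a float, while a running total simply accumulates each value:

```python
# Hypothetical sample lines standing in for the real file contents.
sample_lines = [
    "X-DSPAM-Confidence: 0.1235\n",
    "Subject: not a confidence line\n",
    "X-DSPAM-Confidence: 0.1236\n",
    "X-DSPAM-Confidence: 0.1678\n",
]

# range() only accepts integers, so passing a float raises TypeError.
try:
    range(0.1235)
    range_accepts_float = True
except TypeError:
    range_accepts_float = False

# Keep a running total instead of looping over range().
total = 0.0
count = 0
for line in sample_lines:
    if not line.startswith("X-DSPAM-Confidence:"):
        continue
    value = float(line.split(":")[1])  # float() ignores surrounding whitespace
    total += value
    count += 1

print(range_accepts_float)
print(count, round(total, 4))
```

The key point is that `total` is initialised once, before the loop, rather than being reset for every matching line.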
On a separate note, and trying not to be too pedantic, I am pretty impressed by how many principles of PEP 8 your code violates. You may think it's not a big deal, but if you care, I would suggest you read it (https://www.python.org/dev/peps/pep-0008/)

Related

How to find and extract values from a txt file?

Write a program that prompts for a file name, then opens that file and reads through the file, looking for lines of the form:
X-DSPAM-Confidence: 0.8475
Count these lines, extract the floating point values from each of the lines, and compute the average of those values and produce an output as shown below. Do not use the sum() function or a variable named sum in your solution.
This is my code:
fname = input("Enter a file name:",)
fh = open(fname)
count = 0
# this variable is to add together all the 0.8745's in every line
num = 0
for ln in fh:
    ln = ln.rstrip()
    count += 1
    if not ln.startswith("X-DSPAM-Confidence: ") : continue
    for num in fh:
        if ln.find(float(0.8475)) == -1:
            num += float(0.8475)
        if not ln.find(float(0.8475)) : break
# problem: values aren't adding together and gq variable ends up being zero
gq = int(num)
jp = int(count)
avr = (gq)/(jp)
print ("Average spam confidence:",float(avr))
The problem is when I run the code it says there is an error because the value of num is zero. So I then receive this:
ZeroDivisionError: division by zero
When I change the initial value of num to None a similar problem occurs:
int() argument must be a string or a number, not 'NoneType'
This is also not accepted by the python COURSERA autograder when I put it at the top of the code:
from __future__ import division
The file name for the sample data they have given us is "mbox-short.txt". Here's a link http://www.py4e.com/code3/mbox-short.txt
I edited your code as shown below. I think your task is to find the numbers next to X-DSPAM-Confidence:, so I used your code to identify the X-DSPAM-Confidence: lines. I then split each line on ':', took the element at index 1, and converted it to a float.
fname = input("Enter a file name:",)
fh = open(fname)
count = 0
# this variable is to add together all the 0.8745's in every line
num = 0
for ln in fh:
    ln = ln.rstrip()
    if not ln.startswith("X-DSPAM-Confidence:") : continue
    count+=1
    num += float(ln.split(":")[1])
gq = num
jp = count
avr = (gq)/(jp)
print ("Average spam confidence:",float(avr))
Open files using with, so the file is automatically closed.
See the in-line comments.
Desired lines are in the form X-DSPAM-Confidence: 0.6961, so split them on the space.
'X-DSPAM-Confidence: 0.6961'.split(' ') creates a list with the number at index 1.
fname = input("Enter a file name:",)
with open(fname) as fh:
    count = 0
    num = 0  # collect and add each found value
    for ln in fh:
        ln = ln.rstrip()
        if not ln.startswith("X-DSPAM-Confidence:"):  # find this string or continue to next ln
            continue
        num += float(ln.split(' ')[1])  # split on the space and add the float
        count += 1  # increment count for each matching line
    avr = num / count  # compute average
print(f"Average spam confidence: {avr}")  # print value
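The same idea can be wrapped in a small function that also guards against the ZeroDivisionError from the question. This is only a sketch: the function name `average_confidence`, the `prefix` parameter, and the in-memory sample data are assumptions made for illustration, and `io.StringIO` stands in for the real mbox-short.txt file.

```python
import io

def average_confidence(lines, prefix="X-DSPAM-Confidence:"):
    """Return (count, average) for the lines that start with prefix."""
    count = 0
    total = 0.0
    for line in lines:
        if not line.startswith(prefix):
            continue
        # Everything after the prefix is the number; float() ignores
        # the leading space and the trailing newline.
        total += float(line[len(prefix):])
        count += 1
    if count == 0:
        return 0, None  # no matching lines, so there is no average
    return count, total / count

# Hypothetical sample data in place of the real file.
sample = io.StringIO(
    "From: someone@example.com\n"
    "X-DSPAM-Confidence: 0.8475\n"
    "X-DSPAM-Confidence: 0.6961\n"
)
count, avg = average_confidence(sample)
print("Average spam confidence:", avg)
```

Because the function accepts any iterable of lines, an open file object can be passed in the same way as the `StringIO` sample.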

Having some problem trying to troubleshoot my code on an assignment on python 3

I've been attempting this assignment but I've encountered a few problems which I am still unable to resolve. Firstly, I am unable to collect the correct sum of numbers from the text, so my average value is way off. Secondly, on line 14 it feels quite strange to have to define my sum as a string before changing it back to a float, although it does not give me a traceback. Lastly, the question states not to use the sum() function, but I'm having trouble avoiding it. If possible, I would like to understand the rationale behind the question restricting us from using the sum() function.
Some help would be greatly appreciated!
file name: https://www.py4e.com/code3/mbox-short.txt , input should be mbox-short.txt
P.S.: I added the count to the final output just to see how many lines it registered.
Assignment :
Write a program that prompts for a file name, then opens that file and reads through the file, looking for lines of the form:
X-DSPAM-Confidence: 0.8475
Count these lines and extract the floating point values from each of the lines and compute the average of those values and produce an output as shown below. Do not use the sum() function or a variable named sum in your solution.
You can download the sample data at http://www.py4e.com/code3/mbox-short.txt. When you are testing below, enter mbox-short.txt as the file name.
fname =input("Enter file name: ")
fhand = open(fname)
for lx in fhand :
    if not lx.startswith("X-DSPAM-Confidence:") :
        continue
    ly = lx.replace("X-DSPAM-Confidence:"," ")
    ly = ly.strip()
def avg():
    sum = 0
    count = 0
    count = count
    for values in ly :
        count = count + 1
        sum = str(sum) + values
    return print("Average spam confidence:", count, float(sum) / count)
avg()
I have made some changes to your code. Store each float in a list, then iterate over that list to add up the total.
fname =input("Enter file name: ")
fhand = open(fname)
num_list = []
for lx in fhand :
    if not lx.startswith("X-DSPAM-Confidence:") :
        continue
    ly = lx.replace("X-DSPAM-Confidence:","")
    num_list.append(float(ly))
def avg():
    total = 0
    count = 0
    for values in num_list:
        count = count + 1
        total += values
    return print("Average spam confidence:", count, total / count)
avg()
Output:
Average spam confidence: 27 0.7507185185185187
This worked for me:
summition = 0
fname =input("Enter file name: ")
count = 0
fhand = open(fname)
for lx in fhand :
    if not lx.startswith("X-DSPAM-Confidence:") :
        continue
    ly = lx.replace("X-DSPAM-Confidence:"," ")
    ly = ly.strip()
    summition += float(ly)
    count = count + 1
fhand.close()
print("Average Spam " + str(count)+ " " + str(summition/count))
Hints about the original code:
Always close the file handle.
return print(...) returns None, because print() itself returns None.
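A short sketch of the last hint, assuming Python 3 (where print is a function). The function names `avg_printing` and `avg_returning` are made up for this illustration:

```python
def avg_printing(values):
    total = 0.0
    for v in values:
        total += v
    # print() displays the result but returns None, so this function returns None.
    return print("Average:", total / len(values))

def avg_returning(values):
    total = 0.0
    for v in values:
        total += v
    # Returning the number lets the caller decide what to do with it.
    return total / len(values)

printed = avg_printing([0.5, 1.0])
returned = avg_returning([0.5, 1.0])
print(printed, returned)
```

Returning the value and printing it at the call site keeps the function reusable in further calculations.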

How to read an input file of integers separated by a space using readlines in Python 3?

I need to read an input file (input.txt) which contains one line of integers (13 34 14 53 56 76) and then compute the sum of the squares of each number.
This is my code:
# define main program function
def main():
    print("\nThis is the last function: sum_of_squares")
    print("Please include the path if the input file is not in the root directory")
    fname = input("Please enter a filename : ")
    sum_of_squares(fname)
def sum_of_squares(fname):
    infile = open(fname, 'r')
    sum2 = 0
    for items in infile.readlines():
        items = int(items)
        sum2 += items**2
    print("The sum of the squares is:", sum2)
    infile.close()
# execute main program function
main()
If each number is on its own line, it works fine.
But, I can't figure out how to do it when all the numbers are on one line separated by a space. In that case, I receive the error: ValueError: invalid literal for int() with base 10: '13 34 14 53 56 76'
You can use file.read() to get a string and then use str.split to split by whitespace.
You'll need to convert each number from a string to an int first and then use the built-in sum function to calculate the sum.
As an aside, you should use the with statement to open and close your file for you:
def sum_of_squares(fname):
    with open(fname, 'r') as myFile:  # This closes the file for you when you are done
        contents = myFile.read()
    sumOfSquares = sum(int(i)**2 for i in contents.split())
    print("The sum of the squares is:", sumOfSquares)
Output:
The sum of the squares is: 13242
You are trying to turn a string with spaces in it, into an integer.
What you want to do is use the split method (here, it would be items.split(' ')), which will return a list of strings containing the numbers, without any spaces this time. You will then iterate through this list and convert each element to an int, as you are already trying to do.
I believe you will find what to do next. :)
Here is a short code example, with more pythonic methods to achieve what you are trying to do.
# The `with` statement is the proper way to open a file.
# It opens the file, and closes it accordingly when you leave it.
with open('foo.txt', 'r') as file:
    # You can directly iterate your lines through the file.
    for line in file:
        # You want a new sum number for each line.
        sum_2 = 0
        # Creating your list of numbers from your string.
        lineNumbers = line.split(' ')
        for number in lineNumbers:
            # Casting EACH number that is still a string to an integer...
            sum_2 += int(number) ** 2
        print('For this line, the sum of the squares is {}.'.format(sum_2))
You could try splitting your items on space using the split() function.
From the doc: For example, ' 1 2 3 '.split() returns ['1', '2', '3'].
def sum_of_squares(fname):
    infile = open(fname, 'r')
    sum2 = 0
    for items in infile.readlines():
        sum2 += sum(int(i)**2 for i in items.split())  # accumulate across lines
    print("The sum of the squares is:", sum2)
    infile.close()
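The difference between split() and split(' ') matters when the input has repeated spaces or a trailing newline. A small sketch, using a made-up `line` value with a deliberate double space:

```python
line = "13 34  14 53 56 76\n"  # note the double space and the trailing newline

# No argument: splits on any run of whitespace and drops empty strings.
print(line.split())

# Explicit single space: keeps an empty string for the double space
# and leaves the newline attached to the last item.
print(line.split(" "))

sum_of_squares = 0
for token in line.split():
    sum_of_squares += int(token) ** 2
print(sum_of_squares)
```

With split(' '), the empty string produced by the double space would make int() raise a ValueError, which is why the argument-free form is usually the safer choice here.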
Just keep it really simple, no need for anything complicated. Here is a commented step by step solution:
def sum_of_squares(filename):
    # create a summing variable
    sum_squares = 0
    # open file
    with open(filename) as file:
        # loop over each line in file
        for line in file.readlines():
            # create a list of strings split by whitespace
            numbers = line.split()
            # loop over potential numbers
            for number in numbers:
                # check if string is a number
                if number.isdigit():
                    # add square to accumulated sum
                    sum_squares += int(number) ** 2
    # when we reach here, we're done, and exit the function
    return sum_squares
print("The sum of the squares is:", sum_of_squares("numbers.txt"))
Which outputs:
The sum of the squares is: 13242

Python function to display information about data

I am trying to write two functions. The first generates random numbers between 1 and 100 and saves them to a file; its parameter is how many numbers to generate. The second reads this file and displays the total sum, average, highest, and lowest value. I have managed to get the first function to work, but the second function does not. When running the program I get TypeError: 'int' object is not iterable, due to the "number=int(number)" line in the readNumbers() function, which I don't understand, because I thought the number had to be changed from a string to an int for the calculations to work. Any help would be appreciated. Also, is there a way to find the maximum and minimum values without using max() and min()? I would personally like to know.
def generateNumbers(i):
    myFile=open("numbers.txt","w")
    for n in range(i):
        import random
        numbers=random.randint(1,100)
        numbers=str(numbers)
        myFile.write(numbers)
    myFile.close()
generateNumbers(400)
def readNumbers():
    myFile=open("numbers.txt","r")
    for number in myFile:
        number=int(number)
        total=(sum(number))
        average=sum/float(len(number))
        while number!=-1:
            num_list.append(number)
        high= max(num_list)
        low= min(num_list)
        print(total,average,high,low)
    myFile.close()
readNumbers()
Your program has several problems.
Writing the numbers:
You're not separating the numbers in any way, so your file will contain one very long string of digits (and to nitpick, your numbers variable actually holds a single random_number).
If you call, for example, myFile.write("\n") after myFile.write(random_number), each number will be separated by a newline.
Once the numbers are separated by newlines, your reading loop will work, except that you should read them into a list first and then compute the total and average.
For example:
num_list = []
for number in myFile:
    if number != "\n":  # there is an empty line at the end
        num_list.append(int(number))
total = sum(num_list)
average = total / len(num_list)
high = max(num_list)
low = min(num_list)
Note that you don't need the while loop
You could also do the reading with python's list comprehension, and close the file automatically with a context manager
with open("numbers.txt", "r") as f:
num_list = [int(number) for number in f if number != "\n"]
(edited to avoid error on empty line at end)
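Putting both halves together, a round trip might look like the sketch below. The function names `generate_numbers` and `read_numbers` and the temporary directory are assumptions for illustration; the original code writes to numbers.txt in the current directory.

```python
import os
import random
import tempfile

def generate_numbers(path, how_many):
    """Write how_many random integers (1 to 100), one per line."""
    with open(path, "w") as f:
        for _ in range(how_many):
            f.write(str(random.randint(1, 100)) + "\n")

def read_numbers(path):
    """Read the integers back, skipping any blank lines."""
    with open(path) as f:
        return [int(line) for line in f if line.strip()]

# A temporary directory keeps the sketch from overwriting a real numbers.txt.
path = os.path.join(tempfile.mkdtemp(), "numbers.txt")
generate_numbers(path, 400)
num_list = read_numbers(path)
total = sum(num_list)
average = total / len(num_list)
print(len(num_list), total, average, max(num_list), min(num_list))
```

Writing the newline at the same time as each number is what makes the line-by-line read possible.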
Kindal is right.
You should do something like:
total = 0
count = 0
for number in myFile:
    total += int(number)  # each line is a string, so convert it first
    count += 1
average = total / count
Moreover, I think you can simplify your code a bit. Read the file into a list, and then you can easily calculate the sum, average, min, and max.
with open("my filename.txt", 'r') as myFile:
    number_list = list()
    for number in myFile:
        number = int(number)
        if number != -1:  # Is it necessary for you?
            number_list.append(number)
total = sum(number_list)  # named total so the built-in sum() is not shadowed
average = total / len(number_list)
min_val = min(number_list)
max_val = max(number_list)
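On the question of finding the highest and lowest values without max() and min(), one possible approach is a single pass that tracks both. This is a sketch, and `low_and_high` is a hypothetical helper name:

```python
def low_and_high(numbers):
    """Return (lowest, highest) without calling min() or max()."""
    iterator = iter(numbers)
    try:
        first = next(iterator)
    except StopIteration:
        raise ValueError("numbers must not be empty")
    # Start both trackers at the first value, then compare the rest.
    low = high = first
    for n in iterator:
        if n < low:
            low = n
        elif n > high:
            high = n
    return low, high

print(low_and_high([42, 7, 100, 13]))
```

Seeding both trackers with the first element avoids having to pick an arbitrary starting value.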

How to find the smallest number in a text file (Python 2.5)

So far, my code finds the largest number in a text file, but it doesn't find the smallest number. When it's run, the smallest number is still 0 while the largest is 9997.
integers = open('numbers.txt', 'r') # opens numbers.txt
largestInt = 0 # making variables to store the largest/smallest int
smallestInt = 0
# loop where we check every line for largest/smallest int
for line in integers:
    while largestInt <= line.strip():
        largestInt = line
    while smallestInt >= line.strip():
        smallestInt = line
# print results
print "Smallest = ", smallestInt
print "Largest = ", largestInt
numbers.txt looks like:
6037
-2228
8712
-5951
9485
8354
1467
8089
559
9439
-4274
9278
-813
1156
-7528
1843
-9329
574
and so on.
What's wrong here? If I'm doing something wrong, or the logic is incorrect please correct me.
EDIT
I'd like to say thanks to @Martijn Pieters and @Gexos for explaining what I'm doing wrong. I understand why my code works now!
Final Code:
integers = open('numbers.txt', 'r') # opens numbers.txt
largestInt = 0 # making variables to store the largest/smallest int
smallestInt = 0
# loop where we check every line for largest/smallest int
for line in integers:
    if largestInt <= int(line.strip()): # converted a string into an int
        largestInt = int(line.strip()) # made that int into largestInt
    if smallestInt >= int(line.strip()): # converted a string into an int
        smallestInt = int(line.strip()) # made that int into smallestInt
integers.close() # closes the file
# print results
print "Smallest = ", smallestInt
print "Largest = ", largestInt
Results
Smallest = -9993
Largest = 9997
You are comparing strings, not integers; turn your line into an integer before comparing:
largestInt = float('-inf')
smallestInt = float('inf')
for line in integers:
    number = int(line)
    if largestInt < number:
        largestInt = number
    if smallestInt > number:
        smallestInt = number
Note that you want to use if here, not while; the latter creates a loop.
I started largestInt and smallestInt with float('-inf') and float('inf'), respectively, numbers guaranteed to be smaller and larger than anything else. This makes the first time you test for largestInt < number always true, whatever number is on the first line.
Comparing strings is done lexicographically, where characters are compared one by one; 10 is smaller than 2 because 1 sorts before 2.
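A quick sketch of the difference, using a few made-up lines in the style of numbers.txt (the names `lines`, `as_strings`, and `as_ints` are illustrative):

```python
# Strings are compared character by character, integers by value.
print("10" < "2")
print(10 < 2)

# Hypothetical lines as they would be read from the file.
lines = ["1467\n", "559\n", "-813\n"]

# Comparing the raw text picks "559", because "5" sorts after "1" and "-".
as_strings = max(line.strip() for line in lines)

# Converting first gives the numerically largest value.
as_ints = max(int(line) for line in lines)

print(as_strings, as_ints)
```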
You could use the max() and min() built-in functions for ease, but it'll be a bit less efficient because internally these functions do loops as well:
numbers = set(int(line) for line in integers)  # set comprehensions ({...}) need Python 2.7+
largestInt = max(numbers)
smallestInt = min(numbers)
While I usually like solutions that use min and max, that would require two linear passes in this case, with a lot of memory overhead. Here's a method that needs one linear pass and constant memory:
with open('numbers.txt') as infile:
    # Python 2 only: '' compares greater than any int, and None compares smaller.
    smallest, largest = '', None
    for line in infile:
        n = int(line)
        smallest = min(n, smallest)
        largest = max(n, largest)
print "the smallest number is", smallest
print "the largest number is", largest
As others have stated, you need to convert your lines to integers first. Additionally, your script will not output the correct minimum number if that number is bigger than zero. To fix this, set both your maximum number and minimum number to the first entry in your file. Then check all other numbers, and see if they're bigger/smaller than the current number.
with open('numbers.txt', 'r') as data_file:
    num = int(next(data_file))
    min_num, max_num = num, num
    for line in data_file:
        num = int(line)
        if num > max_num:
            max_num = num
        elif num < min_num:
            min_num = num
print 'Smallest number: {}'.format(min_num)
print 'Largest number: {}'.format(max_num)
This can also be solved with list comprehensions:
nums = [int(line) for line in open('numbers.txt', 'r')]
min_num, max_num = min(nums), max(nums)
A few things. You need to cast the line from a string into an int: int(line.strip()). Currently you are comparing an int to a string. You should also cast in the assignment, largestInt = int(line.strip()), and do the same for smallestInt.
You should not be using while. while is for looping, not for comparing. You should be using if.
And last but not least, make sure to close the file at the end: integers.close()
integers = open('numbers.txt', 'r')
intList = [int(x) for x in integers.readlines()]
print max(intList), min(intList)
with open('number.txt') as f:
    s = [int(n.strip()) for n in f]
print min(s), max(s)
You are comparing strings, not ints. You need to call the int function on them at some stage most likely to convert them to numbers.
This is an example of a different approach:
with open('numbers.txt', 'r') as f:
    integers = [int(n) for n in f.readlines()]
smallest = min(integers)
biggest = max(integers)
Using with ensures the file is automatically closed after the list comprehension, which is good practice. The list comprehension results in:
[6037, -2228, 8712, -5951, 9485, 8354, 1467, 8089, 559, 9439, -4274, 9278, -813, 1156, -7528, 1843, -9329, 574]
Then min and max are called on that list.
One thing to note about setting your smallestInt to 0: if the smallest number your text document contains is 1, you'll still end up with 0 as the answer.
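The pitfall is easy to reproduce with a list of positive numbers. In this sketch the function names `smallest_from_zero` and `smallest_from_first` are hypothetical, chosen only to contrast the two starting values:

```python
def smallest_from_zero(numbers):
    smallest = 0  # wrong whenever every number is greater than 0
    for n in numbers:
        if n < smallest:
            smallest = n
    return smallest

def smallest_from_first(numbers):
    smallest = numbers[0]  # start from a value that is actually in the data
    for n in numbers[1:]:
        if n < smallest:
            smallest = n
    return smallest

positives = [5, 3, 9]
print(smallest_from_zero(positives), smallest_from_first(positives))
```

The same reasoning applies to a largest value initialised to 0 when every number in the file is negative.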
