Find median and mode from .txt file Python 3.4.1 - python

I am trying to determine the median and mode from a list of numbers in "numbers.txt" file.
I am EXTREMELY new to python and have ZERO coding experience.
This is what I have so far calculating mean, sum, count, max, and min but I have no idea where to go from here.
number_file_name = 'numbers.txt'
number_sum = 0
number_count = 0
number_average = 0
number_maximum = 0
number_minimum = 0
number_range = 0
do_calculation = True
while(do_calculation):
    while (True):
        try:
            # Get the name of a file
            number_file_name = input('Enter a filename. Be sure to include .txt after the file name: ')
            random_number_count = 0
            print('')
            random_number_file = open(number_file_name, "r")
            print ('File Name: ', number_file_name, ':', sep='')
            print('')
            numbers = random_number_file.readlines()
            random_number_file.close
        except:
            print('An error occured trying to read', random_number_file)
        else:
            break
    try:
        number_file = open(number_file_name, "r")
        is_first_number = True
        for number in number_file:
            number = int(number) # convert the read string to an int
            if (is_first_number):
                number_maximum = number
                number_minimum = number
                is_first_number = False
            number_sum += number
            number_count += 1
            if (number > number_maximum):
                number_maximum = number
            if (number < number_minimum):
                number_minimum = number
        number_average = number_sum / number_count
        number_range = number_maximum - number_minimum
        index = 0
        listnumbers = 0
        while index < len(numbers):
            numbers[index] = int(numbers[index])
            index += 1
        number_file.close()
    except Exception as err:
        print ('An error occurred reading', number_file_name)
        print ('The error is', err)
    else:
        print ('Sum: ', number_sum)
        print ('Count:', number_count)
        print ('Average:', number_average)
        print ('Maximum:', number_maximum)
        print ('Minimum:', number_minimum)
        print ('Range:', number_range)
        print ('Median:', median)
    another_calculation = input("Do you want to enter in another file name? (y/n): ")
    if(another_calculation !="y"):
        do_calculation = False

If you want to find the median and mode of the numbers, you need to keep track of the actual numbers you've encountered so far. You can either create a list holding all the numbers, or a dictionary mapping numbers to how often you've seen those. For now, let's create a (sorted) list from those numbers:
with open("numbers.txt") as f:
    numbers = []
    for line in f:
        numbers.append(int(line))
numbers.sort()
Or shorter (inside the with block):
    numbers = sorted(map(int, f))
Now you can use built-in functions to calculate the count, sum, min and max:
count = len(numbers)
max_num = max(numbers)
min_num = min(numbers)
sum_of_nums = sum(numbers)
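Calculating the median and mode can also be done concisely using the sorted list. Note that for an even number of values this median picks the upper of the two middle values instead of averaging them, and the mode calls numbers.count() once per element, which is fine for small files but slow for large ones:
median = numbers[len(numbers)//2]
mode = max(numbers, key=lambda n: numbers.count(n))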
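A minimal sketch of a median that averages the two middle values for an even count, with the mode tallied in a single pass by collections.Counter. The function name median_and_mode is hypothetical, and when several values tie for the highest count, which one is returned depends on the order in which Counter saw them:

```python
from collections import Counter


def median_and_mode(numbers):
    """Return (median, mode) for a non-empty list of numbers."""
    if not numbers:
        raise ValueError("need at least one number")
    ordered = sorted(numbers)
    middle = len(ordered) // 2
    if len(ordered) % 2:
        # Odd count: the single middle value is the median.
        median = ordered[middle]
    else:
        # Even count: average the two middle values.
        median = (ordered[middle - 1] + ordered[middle]) / 2
    # Counter tallies every value once; most_common(1) returns a list
    # holding the (value, count) pair with the highest count.
    mode = Counter(ordered).most_common(1)[0][0]
    return median, mode


print(median_and_mode([3, 1, 2, 2]))  # (2.0, 2)
```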

Maybe there is a reason for it, but why are you avoiding Python libraries? NumPy and SciPy have everything you need for such a task.
Have a look at numpy.genfromtxt(), numpy.mean() and scipy.stats.mode().
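If installing third-party packages is not an option, the standard library's statistics module (available since Python 3.4) offers a similar shortcut. This is a sketch that swaps in statistics for the NumPy/SciPy functions named above; summarize and sample_lines are hypothetical names, and the input is assumed to hold one integer per line. Before Python 3.8, statistics.mode raises StatisticsError when there is no single most common value.

```python
import statistics


def summarize(lines):
    """Parse one integer per line and return basic statistics."""
    # Skip blank lines so a trailing newline in the file is harmless.
    numbers = [int(line) for line in lines if line.strip()]
    return {
        "mean": statistics.mean(numbers),
        "median": statistics.median(numbers),
        "mode": statistics.mode(numbers),
    }


# Stand-in for the lines of numbers.txt; an open file object works the same way.
sample_lines = ["4\n", "1\n", "4\n", "7\n"]
print(summarize(sample_lines))
```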

Related

My assignment is to create a file with comma-separated random numbers, then write Python code to compute their sum and average. How do I do this?

n = ("random_numbers", "r+")
a = int(input("How many number do want to input? Type 0 to exit:"))
sum = 0
count = 0
number = 0
for i in range(a):
    x = int(input("Enter a number:"))
    n.write(str(x) + str(','))
    sum = sum + number
    count += 1
average = sum/count
n.write('the sum of the numbers is' + sum)
n.write('the average of the numbers is' + average)
n.seek(0)
n.read()
n.close()
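When run, this code raises the error: AttributeError: 'tuple' object has no attribute 'write'
The error occurs because n = ("random_numbers", "r+") creates a tuple; the call to open() is missing. Beyond that, you can use random.sample to generate your random numbers: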
import random
a = int(input("How many number do want to input?"))
with open('my_file.txt', 'w') as fp:
    my_numbers = random.sample(range(1000), a)
    fp.write(','.join(map(str, my_numbers)))
    fp.write( '\nthe sum of the numbers is ' + str(sum(my_numbers)))
    fp.write( '\nthe average of the numbers is ' + str(sum(my_numbers) / len(my_numbers)))
In the above example, if a > 1000 the code fails with a ValueError, because random.sample cannot draw more values than the range contains. Another way to generate your random numbers, which allows repeated values, is random.randint:
import random
a = int(input("How many number do want to input?"))
with open('my_file.txt', 'w') as fp:
    my_numbers = [random.randint(0, 1000) for _ in range(a)]
    fp.write(','.join(map(str, my_numbers)))
    fp.write( '\nthe sum of the numbers is ' + str(sum(my_numbers)))
    fp.write( '\nthe average of the numbers is ' + str(sum(my_numbers) / len(my_numbers)))
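The assignment also asks for code that sums and averages the numbers stored in the file. A minimal sketch, assuming the comma-separated numbers sit on the first line of the file as in the code above; sum_and_average is a hypothetical helper name:

```python
def sum_and_average(first_line):
    """Return (total, average) for a line of comma-separated integers."""
    values = [int(field) for field in first_line.strip().split(",") if field.strip()]
    if not values:
        raise ValueError("no numbers found")
    total = sum(values)
    return total, total / len(values)


# With a real file, pass its first line instead of a literal string:
# with open('my_file.txt') as fp:
#     print(sum_and_average(fp.readline()))
print(sum_and_average("10,20,30\n"))  # (60, 20.0)
```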

How can I get the average of a range of inputs?

I have to create a program that shows the arithmetic mean of a list of variables. There are supposed to be 50 grades.
I'm pretty much stuck. Right now I've only got:
for c in range (0,50):
    grade = ("What is the grade?")
Also, how could I print the count of grades that are below 50?
Any help is appreciated.
If you don't mind using NumPy, this is very easy:
import numpy as np
print(np.mean(grades))
Or if you'd rather not import anything:
print(float(sum(grades)) / len(grades))
To get the number of grades below 50, assuming you have them all in a list, you could do:
grades2 = [x for x in grades if x < 50]
print(len(grades2))
Assuming you have a list with all the grades:
avg = sum(grades)/len(grades)
For a plain Python list this is typically faster than numpy.mean(), which first has to convert the list to an array.
To find the number of grades less than 50 you can put it in a loop with a conditional statement.
numPoorGrades = 0
for g in grades:
    if g < 50:
        numPoorGrades += 1
You could also write this a little more compactly using a list comprehension.
numPoorGrades = len([g for g in grades if g < 50])
First of all, assuming grades is a list containing the grades, you would want to iterate over the grades list, and not iterate over range(0,50).
Second, in every iteration you can use one variable to count how many grades you have seen so far and another to sum all the grades so far. Something like this:
num_grades = 0
sum_grades = 0
for grade in grades:
    num_grades += 1 # this is the same as writing num_grades = num_grades + 1
    sum_grades += grade # same as writing sum_grades = sum_grades + grade
Now all you need to do is divide sum_grades by num_grades to get the result.
average = float(sum_grades) / max(num_grades, 1)
I used the max function, which returns the larger of num_grades and 1: if the list of grades is empty, num_grades will be 0, and division by 0 is undefined.
I used float to get a fractional result.
To count the number of grades lower than 50, you can add another variable, num_failed, initialize it to 0 just like num_grades, and add an if that checks whether grade is lower than 50 and, if so, increases num_failed by 1.
Try the following. The function isNumber tries to convert the input, which is read as a string, to a float, which I believe covers the integer range too and is the floating-point type in Python 3, the version I'm using. The try...except block is similar to the try...catch statement found in other programming languages.
#Checks whether the value is a valid number:
def isNumber( value ):
    try:
        float( value )
        return True
    except ValueError:
        return False
#Variables initialization:
numberOfGradesBelow50 = 0
sumOfAllGrades = 0
#Input:
for c in range( 0, 5 ):
    currentGradeAsString = input( "What is the grade? " )
    while not isNumber( currentGradeAsString ):
        currentGradeAsString = input( "Invalid value. What is the grade? " )
    currentGradeAsFloat = float( currentGradeAsString )
    sumOfAllGrades += currentGradeAsFloat
    if currentGradeAsFloat < 50.0:
        numberOfGradesBelow50 += 1
#Displays results:
print( "The average is " + str( sumOfAllGrades / 5 ) + "." )
print( "You entered " + str( numberOfGradesBelow50 ) + " grades below 50." )
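Once the grades are collected in a list, both results can be computed in a few lines. A compact sketch, assuming Python 3 (where / is true division); grade_stats and its threshold parameter are hypothetical names:

```python
def grade_stats(grades, threshold=50):
    """Return (average, number of grades below threshold)."""
    if not grades:
        # Avoid dividing by zero when no grades were entered.
        return 0.0, 0
    average = sum(grades) / len(grades)
    # Add 1 for every grade that falls below the threshold.
    below = sum(1 for g in grades if g < threshold)
    return average, below


print(grade_stats([40, 55, 70, 35]))  # (50.0, 2)
```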

IndexError: list index out of range PYTHON (w/o using for loop)

I am trying to find averages from a text file. The text file has columns of numbers and I want to find the average of each column. I get the following error:
IndexError: list index out of range
The code I am using is:
import os
os.chdir(r"path of my file")
file_open = open("name of my file", "r")
file_write = open ("average.txt", "w")
line = file_open.readlines()
list_of_lines = []
length = len(list_of_lines[0])
total = 0
for i in line:
    values = i.split('\t')
    list_of_lines.append(values)
count = 0
for j in list_of_lines:
    count +=1
for k in range(0,count):
    print k
    list_of_lines[k].remove('\n')
for o in range(0,count):
    for p in range(0,length):
        print list_of_lines[p][o]
        number = int(list_of_lines[p][o])
        total + number
    average = total/count
    print average
The error is in line
length = len(list_of_lines[0])
Please let me know if I can provide any more information.
The issue is you are trying to get the length of something in the array, not the array itself.
Try this:
length = len(list_of_lines)
You wrote length = len(list_of_lines[0])
list_of_lines is defined right above this line as a list with 0 items in it. As a result, you cannot select the first item (index number 0) because index number 0 does not exist. Therefore, it is out of range. If you need the number of columns, compute len(list_of_lines[0]) after the loop that fills the list.
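For the underlying goal of averaging each column, a sketch along these lines may help. It assumes tab-separated numeric fields and rows of equal length (zip stops at the shortest row); column_averages is a hypothetical helper name, not part of the original code:

```python
def column_averages(lines, separator="\t"):
    """Return the average of each column in separator-delimited text lines."""
    rows = [
        [float(field) for field in line.strip().split(separator)]
        for line in lines
        if line.strip()  # ignore blank lines
    ]
    if not rows:
        return []
    # zip(*rows) regroups the values column by column.
    return [sum(column) / len(column) for column in zip(*rows)]


sample = ["1\t10\n", "3\t20\n", "5\t30\n"]
print(column_averages(sample))  # [3.0, 20.0]
```

An open file object can be passed in place of the sample list, since iterating over a file yields its lines.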

calling sum() on a list of numbers extracted from a document

I need a little help with a sum function. I'm trying to locate all the lines with the prefix "X-DSPAM-Confidence:" in a document. After I extract them, I want to call sum() on them and calculate the average. Thanks, heaps!!!
for line in (fhand):
    line = line.rstrip()
    if not line.startswith("X-DSPAM-Confidence:"):
        continue
    else:
        n = float(line[line.find(":") + 1:])
        a = sum(n)
        count = count + 1
        print (n)
        print (a)
print (total / count)
I don't know if I understood this correctly, but as far as I can see, you only need to store the sum of the values in a variable, something like:
total = 0.0
count = 0
for line in (fhand):
    line = line.rstrip()
    if not line.startswith("X-DSPAM-Confidence:"):
        continue
    else:
        n = float(line[line.find(":") + 1:])
        total += n
        count = count + 1
print (total / count)
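If you do want to call sum(), note that it expects an iterable such as a list, not a single float, so the values have to be collected first. A sketch of that variant; average_confidence is a hypothetical name and the sample lines are made up for illustration:

```python
def average_confidence(lines, prefix="X-DSPAM-Confidence:"):
    """Average the numbers that follow the prefix, or None if there are none."""
    values = []
    for line in lines:
        line = line.rstrip()
        if line.startswith(prefix):
            # float() tolerates the space after the colon.
            values.append(float(line[len(prefix):]))
    if not values:
        return None
    # sum() works here because values is a list of floats.
    return sum(values) / len(values)


sample = [
    "X-DSPAM-Confidence: 0.5\n",
    "Subject: hello\n",
    "X-DSPAM-Confidence: 0.75\n",
]
print(average_confidence(sample))  # 0.625
```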

Python - Using a created list as a parameter

When I run my code it tells me: TypeError: unorderable types: str() < float(). I can't figure out why it won't let me compare these two numbers. The list I am using is defined, and the numbers in it have been converted to floats, so I'm not sure what else to do. Any suggestions?
def countGasGuzzlers(list1, list2):
    total = 0
    CCount = 0
    HCount = 0
    for line in list1:
        if num < 22.0:
            total = total + 1
            CCount = CCount + 1
    for line in list2:
        if num < 27.0:
            total = total + 1
            Hcount = Hcount = 1
    print('City Gas Guzzlers: ',CCount)
    print('Highway Gas Guzzlers: ',HCount)
    print('Total Gas Guzzlers: ',total)
This is my list definition. I'm pretty sure it's fine, but maybe there are some bugs in here as well?
CityFile = open('F://SSC/Spring 2015/CSC 110/PythonCode/Chapter 8/HW 4/carModelData_city','r')
for line in CityFile:
    CityData = CityFile.readlines()
    for num in CityData:
        numCityData = float(num)
        CityList = numCityData
HwyFile = open('F://SSC/Spring 2015/CSC 110/PythonCode/Chapter 8/HW 4/carModelData_hwy','r')
for line in HwyFile:
    HwyData = HwyFile.readlines()
    for num in HwyData:
        numHwyData = float(num)
        HwyList = numHwyData
I believe you are incorrectly referencing num instead of line, which is the loop variable in your for loops. You either need to use num as the loop variable or use line in the if condition.
def countGasGuzzlers(list1, list2):
    total = 0
    CCount = 0
    HCount = 0
    for line in list1:
        if float(line) < 22.0:
            total = total + 1
            CCount = CCount + 1
    for line in list2:
        if float(line) < 27.0:
            total = total + 1
            HCount = HCount + 1
    print('City Gas Guzzlers: ',CCount)
    print('Highway Gas Guzzlers: ',HCount)
    print('Total Gas Guzzlers: ',total)
Another issue I see is with the way you are creating the lists. You convert each num in the file to a float and then assign it directly to the list variable, so that variable ends up holding a single float instead of a list. You need to append each value to the list instead of writing list = num.
The code would look like this:
CityFile = open('F://SSC/Spring 2015/CSC 110/PythonCode/Chapter 8/HW 4/carModelData_city','r')
for line in CityFile:
    CityData = CityFile.readlines()
    for num in CityData:
        numCityData = float(num)
        CityList.append(numCityData)
HwyFile = open('F://SSC/Spring 2015/CSC 110/PythonCode/Chapter 8/HW 4/carModelData_hwy','r')
for line in HwyFile:
    HwyData = HwyFile.readlines()
    for num in HwyData:
        numHwyData = float(num)
        HwyList.append(numHwyData)
Please also make sure CityList and HwyList are initialized as lists before this code, like below:
CityList = []
HwyList = []
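A more compact sketch of the same idea, offered as one possible variant rather than the original approach. It builds each list in a single pass and returns the counts instead of printing them. The names read_floats and count_below are hypothetical, and the sample values are made up. Iterating over the lines only once also sidesteps a subtle issue: calling readlines() inside a for loop over the same file appears to skip the first line in Python 3, because the loop has already consumed it.

```python
def read_floats(lines):
    """Convert an iterable of text lines into a list of floats."""
    return [float(line) for line in lines if line.strip()]


def count_below(values, limit):
    """Count how many values are strictly below limit."""
    return sum(1 for value in values if value < limit)


# Stand-ins for the contents of the city and highway data files;
# an open file object can be passed to read_floats the same way.
city = read_floats(["18.5\n", "25.0\n", "21.9\n"])
highway = read_floats(["26.0\n", "30.2\n"])

city_count = count_below(city, 22.0)
highway_count = count_below(highway, 27.0)
print('City Gas Guzzlers: ', city_count)
print('Highway Gas Guzzlers: ', highway_count)
print('Total Gas Guzzlers: ', city_count + highway_count)
```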
