Organizing python code for handling statistic information

I want to build statistics on which builds succeeded or failed, and how many of each there were per project.
I create a ProjectStat instance for each new project I encounter and keep that project's statistics inside it. To print the overall statistics I have to loop over all ProjectStat instances. To print the success statistics per project I have to loop over them again, and so on for every kind of statistic. My question is how to simplify these loops, i.e. how to avoid walking the dictionary every time. Would decorators or the decorator pattern be a Pythonic way to do this? If so, how can they be used when the number of ProjectStat instances changes dynamically?
Here is the code:
class ProjectStat(object):
    projectSuccess = 0
    projectFailed = 0
    projectTotal = 0
    def addRecord(self, record):
        if len(record) == 5: record.append(None)
        try:
            (datetime, projectName, branchName, number, status, componentName) = record
        except ValueError:
            pass
        self.projectTotal += 1
        if status == 'true': self.projectSuccess += 1
        else: self.projectFailed += 1
    def addDecorator(self, decorator):
        decorator = decorator
def readBuildHistoryFile():
    dict = {}
    f = open("filename")
    print("reading the file")
    try:
        for line in f.readlines():
            #print(line)
            items = line.split()
            projectName = items[1]
            projectStat = dict[projectName] = dict.get(projectName, ProjectStat())
            projectStat.addRecord(items)
            print(items[1])
    finally:
        f.close()
    success = 0
    failed = 0
    total = 0
    for k in dict.keys():
        projectStat = dict[k]
        success += projectStat.projectSuccess
        failed += projectStat.projectFailed
        total += projectStat.projectTotal
    print("Total: " + str(total))
    print("Success: " + str(success))
    print("Failed: " + str(failed))
if __name__ == '__main__':
    readBuildHistoryFile()

I'm not sure I understand the question, but I'll try to answer anyway :)
Option 1:
total = sum(project.projectTotal for project in dict.values())
success = sum(project.projectSuccess for project in dict.values())
failed = sum(project.projectFailed for project in dict.values())
Option 2 (in Python 3, reduce has to be imported from functools; the initial value (0, 0, 0) keeps it from failing on an empty dictionary):
from functools import reduce
(total, success, failed) = reduce(lambda x, y: (x[0] + y[0], x[1] + y[1], x[2] + y[2]), [(project.projectTotal, project.projectSuccess, project.projectFailed) for project in dict.values()], (0, 0, 0))
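Another way to avoid repeating the loop at every call site is to let one container object own all the per-project instances and expose each statistic as a method. The following is only a sketch under assumptions: the BuildStats class, its method names, and the sample records are hypothetical and not part of the question's code. It uses a defaultdict so that a new ProjectStat appears automatically the first time a project name is seen, which also covers the case where the number of projects changes dynamically.

```python
from collections import defaultdict


class ProjectStat:
    """Per-project counters (same role as ProjectStat in the question)."""

    def __init__(self):
        self.success = 0
        self.failed = 0

    @property
    def total(self):
        return self.success + self.failed

    def add_record(self, status):
        # The question's log format marks a successful build with 'true'.
        if status == "true":
            self.success += 1
        else:
            self.failed += 1


class BuildStats:
    """Hypothetical container that owns every ProjectStat instance."""

    def __init__(self):
        # defaultdict creates a ProjectStat the first time a project is seen.
        self.projects = defaultdict(ProjectStat)

    def add_record(self, project_name, status):
        self.projects[project_name].add_record(status)

    def totals(self):
        """Return (total, success, failed) summed over all projects."""
        stats = self.projects.values()
        success = sum(p.success for p in stats)
        failed = sum(p.failed for p in stats)
        return success + failed, success, failed

    def success_rates(self):
        """Return {project name: fraction of successful builds}."""
        return {name: p.success / p.total for name, p in self.projects.items()}


# Made-up sample records: (project name, status).
stats = BuildStats()
for name, status in [("alpha", "true"), ("alpha", "false"), ("beta", "true")]:
    stats.add_record(name, status)

print(stats.totals())
print(stats.success_rates())
```

The loops still exist, but each one lives in a single method, so callers ask for a statistic instead of walking the dictionary themselves.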

Related

AttributeError: 'QuerySet' object has no attribute 'url'

This code raises the error below and I am completely lost on how to fix it. The code runs fine until it reaches the last object saved in the database, and then it throws this error. The code is a tool that compares the HTML of a website between two points in time to detect errors, even when the website is running and returning a 200 response code.
This is the error:
in check_html print(monitor.url) AttributeError: 'QuerySet' object has no attribute 'url'
def run_monitors():
    delete_from_db()
    monitors = Monitor.objects.filter(is_active=True)
    monitors._fetch_all()
    asyncio.run(_run_monitors(monitors))
    check_html(monitor=monitors)
def check_html(monitor):
    start_time = time.time()
    print(monitor.url)
    # The URLs to compare
    old_html = monitor.html_compare
    new_url = monitor.url
    # Get the HTML of each URL
    try:
        old_html = old_html
        # html1.raise_for_status()
    except Exception as e:
        print(e)
    try:
        html2 = requests.get(new_url)
        html2.raise_for_status()
    except Exception as e:
        print(e)
        return None
    html2 = html2.text[:10000]
    # Create a SequenceMatcher object to compare the HTML of the two URLs
    matcher = difflib.SequenceMatcher(None, old_html, html2)
    similarity_ratio = matcher.ratio() * 100
    response_time = time.time() - start_time
    monitor.html_compare = html2
    html_failure = False
    counter = monitor.fault_counter
    if similarity_ratio <= 90 and counter == 0:
        print(f"The two HTMLs have {similarity_ratio:}% in common.")
        print("change detected")
        html_failure = False
        counter += 1
    elif similarity_ratio > 90 and counter == 0:
        print(f"The two HTMLs have {similarity_ratio:.2f}% in common.")
        print("no change detected")
        html_failure = False
        counter = 0
    elif similarity_ratio > 90 and counter >= 1:
        print(f"The two HTMLs have {similarity_ratio:.2f}% in common.")
        if counter >= 4:
            print(f"HTML fault detected")
            html_failure = True
        else:
            counter += 1
            print(f"checks if fault persists, current fault counter: {counter}")
    elif similarity_ratio < 90 and counter >= 1:
        print("Fault is presumably resolved")
        html_failure = False
        counter = 0
    monitor.fault_counter = counter
    # Print the similarity ratio between the two URLs
    monitor.save(update_fields=['html_compare', 'fault_counter'])
    return html_failure
def send_notifications():
    for monitor in Monitor.objects.all():
        multiple_failures, last_result = has_multiple_failures(monitor)
        result = check_html(monitor)
        no_notification_timeout = (not monitor.last_notified) or \
            monitor.last_notified < timezone.now() - timedelta(hours=1)
        if multiple_failures and no_notification_timeout and monitor.is_active:
            _send_notification(monitor, last_result)
        if result:
            _send_notification(monitor, last_result)
I already tried wrapping the check_html call in a for loop that iterates over every object in monitors, but that just reported that monitors can't be iterated over. It was a long shot, but it still didn't work.

You have passed the whole queryset to the check_html() function. filter() returns a QuerySet, an iterable of zero or more objects, not a single Monitor, so it has no url attribute. Either loop over the queryset inside check_html(), or pass only one object to the function.
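The difference between passing the collection and passing one object can be shown without Django. This is a minimal sketch: FakeMonitor is a hypothetical stand-in for the Monitor model, a plain list plays the role of the QuerySet, and check_html is reduced to a single line. With the real model the loop would read `for monitor in Monitor.objects.filter(is_active=True): check_html(monitor)`.

```python
from dataclasses import dataclass


@dataclass
class FakeMonitor:
    """Hypothetical stand-in for the Django Monitor model."""
    url: str
    is_active: bool = True


def check_html(monitor):
    # Works on ONE monitor: a single object has a .url attribute,
    # a collection of monitors does not.
    return f"checked {monitor.url}"


def check_all(monitors):
    # Iterate over the collection and hand each object to check_html().
    return [check_html(monitor) for monitor in monitors if monitor.is_active]


monitors = [
    FakeMonitor("https://example.com/a"),
    FakeMonitor("https://example.com/b", is_active=False),
]
print(check_all(monitors))
```

Calling check_html(monitors) with the whole list raises an AttributeError in the same way the QuerySet did in the question.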

I found the issue. I had added the check_html function so that it runs on a certain command, and at the end of the script that command passed the whole queryset to check_html.
So I just had to remove the check_html call from run_monitors.
Thank you for your help, guys.

One of the methods doesn't work correctly when I call it

I need to run two checks on log files and display the results. Each method works correctly on its own, but when I run the whole program, hit_unique_check always returns "PASS: All hits are unique.". For two of my three .log files this result is incorrect.
import os
class ReadFiles:
    def __init__(self):
        self.current_file = ""
        self.shoot_from = "Shoot from"
        self.hit_player = "Hit player"
    def equally_check(self):
        shoot_from_list = []
        hit_player_list = []
        for line in self.current_file:
            if self.shoot_from in line:
                shoot_from_list.append(line)
            elif self.hit_player in line:
                hit_player_list.append(line)
        if len(shoot_from_list) == len(hit_player_list):
            print(" PASS: Shoots and hits are equal.\n")
        else:
            print(" FAIL: Shoots and hits are NOT equal.\n")
    def hit_unique_check(self):
        unique_hit_list = []
        duplicates = []
        for line in self.current_file:
            if self.hit_player in line:
                unique_hit_list.append(line)
            else:
                continue
        for i in unique_hit_list:
            if unique_hit_list.count(i) > 1:
                duplicates.append(i)
                print(i)
            else:
                continue
        if len(duplicates) < 1:
            print(" PASS: All hits are unique.\n")
        else:
            print(" FAIL: This hits are duplicated.\n")
    def run(self):
        for file in os.listdir():
            if file.endswith(".log"):
                print(f"Log file - {file}")
                self.current_file = open(f"{file}", 'rt')
                print(self.current_file.readlines, f"")
                self.equally_check()
                self.hit_unique_check()
                self.current_file.close()
if __name__ == "__main__":
    run = ReadFiles()
    run.run()
When I run my Python code the result is always the same: "PASS: All hits are unique.". For some files it should be "FAIL: This hits are duplicated.". I'm not sure the problem is in hit_unique_check itself, and I have no idea what to do.
Can you explain how I can make this method work correctly when it runs together with the other one, not only on its own?

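The underlying problem is that an open file object is an iterator: equally_check consumes it to the end, so by the time hit_unique_check loops over self.current_file there are no lines left and it never finds a duplicate. Reading the lines into a list once and passing that list to both checks avoids this. Consider this organization. Each function has one task: to evaluate and return its result. It's up to the caller to decide what to do with the result. Also note that I'm using counters instead of lists, since you don't really care what the lists contain, and a defaultdict, to avoid repeated searches of your hit list.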
import os
from collections import defaultdict
class ReadFiles:
    def __init__(self):
        self.shoot_from = "Shoot from"
        self.hit_player = "Hit player"
    def equally_check(self, lines):
        shoot_from = 0
        hit_player = 0
        for line in lines:
            if self.shoot_from in line:
                shoot_from += 1
            elif self.hit_player in line:
                hit_player += 1
        return shoot_from == hit_player
    def hit_unique_check(self, lines):
        unique_hit_list = defaultdict(int)
        for line in lines:
            if self.hit_player in line:
                unique_hit_list[line] += 1
        duplicates = 0
        for k, v in unique_hit_list.items():
            if v > 1:
                duplicates += 1
                print(k)
        return not duplicates
    def run(self):
        for filename in os.listdir():
            if filename.endswith(".log"):
                print(f"Log file - {filename}")
                # Read the file once so that both checks see every line.
                with open(filename, 'rt') as f:
                    lines = f.readlines()
                print(lines)
                if self.equally_check(lines):
                    print(" PASS: Shoots and hits are equal.\n")
                else:
                    print(" FAIL: Shoots and hits are NOT equal.\n")
                if self.hit_unique_check(lines):
                    print(" PASS: All hits are unique.\n")
                else:
                    print(" FAIL: This hits are duplicated.\n")
if __name__ == "__main__":
    run = ReadFiles()
    run.run()
You could even replace the first loop in hit_unique_check with a Counter:
from collections import Counter
...
    def hit_unique_check(self, lines):
        unique_hit_list = Counter(line for line in lines if self.hit_player in line)
        for k, v in unique_hit_list.items():
            ...
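The reason the original version only failed when both checks ran together is that a file object can be iterated only once. The sketch below illustrates that with io.StringIO, which behaves like an open text file; the LOG contents and the helper names count_hits and duplicated_hits are made up for the illustration.

```python
import io
from collections import Counter

# Made-up log contents with one duplicated hit line.
LOG = "Shoot from A\nHit player B\nHit player B\nShoot from C\n"


def count_hits(lines):
    # Count only the lines that mention a hit.
    return sum(1 for line in lines if "Hit player" in line)


def duplicated_hits(lines):
    # Counter maps each hit line to the number of times it occurs.
    counts = Counter(line for line in lines if "Hit player" in line)
    return [line for line, n in counts.items() if n > 1]


# Passing the file object itself: the first pass exhausts the iterator,
# so the second pass sees no lines at all.
handle = io.StringIO(LOG)
first_pass = count_hits(handle)
second_pass = count_hits(handle)
print(first_pass, second_pass)

# Reading the lines once into a list: every check sees all of them.
lines = io.StringIO(LOG).readlines()
print(count_hits(lines), duplicated_hits(lines))
```

The second pass over the same handle finds nothing, which is exactly why hit_unique_check always reported that all hits were unique.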

Extracting two or three smaller functions from main() to find errors more easily

I would like to extract shorter functions from the larger main function to make the code more readable and errors easier to find, without removing functionality.
I was thinking of splitting it down the middle, as shown with "def calc_mean():". However, one issue is that data, among other things, is not defined in this function. How should I change the code so that the original program still works after being split into two?

It is a never-ending loop.
user_input calls main, then main calls user_input, then user_input calls main, and so on.
FIX 1
Remove filename = input(user_input()) from the main function and pass filename as an argument from user_input. In this case the script should call the user_input function first.
FIX 2
Replace filename = input(user_input()) with filename = user_input(), and adjust the user_input function so that it only asks for user input and returns it. In this case the script should call the main function first.
Also, at the bottom of the script it should be
if __name__ == '__main__': # not `main`
    call_some_function()
Update
def user_input(): # adjust this if you need
    filename = input()
    return filename
def read_data(filename):
    data = dict()
    with open(filename, 'r') as h:
        for line in h:
            four_vals = line.split(',')
            batch = four_vals[0]
            if batch not in data:
                data[batch] = []
            data[batch] += [(float(four_vals[1]), float(four_vals[2]), float(four_vals[3]))]
    return data
def calc_mean(sample):
    n = 0
    x_sum = 0
    for (x, y, val) in sample:
        if x**2 + y**2 <= 1:
            x_sum += val
            n += 1
    if n == 0: # empty sample, or no point inside the unit circle
        return None
    return x_sum / n
def main():
    '''
    This is the main body of the program.
    '''
    filename = user_input()
    data = read_data(filename)
    for batch, sample in data.items():
        average = calc_mean(sample)
        if average is not None:
            print(f"{batch}\t{average}")
        else:
            print(f"{batch}\tNo data")
if __name__ == '__main__':
    main()

Python - print statistics from one file into another

import sys
import pickle
import string
def Menu():
    print ("***********MENU************")
    print ("0. Quit")
    print ("1. Read text file")
    print ("2. Display counts")
    print ("3. Display statistics of word lengths")
    print ("4. Print statistics to file")
def readFile():
    while True:
        fileName = input("Please enter a file name: ")
        if (fileName.lower().endswith(".txt")):
            break
        else:
            print("That was an incorrect file name. Please try again.")
            continue
    return fileName
THE_FILE = ""
myDictionary = 0
def showCounts(fileName):
    numCount = 0
    dotCount = 0
    commaCount = 0
    lineCount = 0
    wordCount = 0
    with open(fileName, 'r') as f:
        for line in f:
            wordCount+=len(line.split())
            lineCount+=1
            for char in line:
                if char.isdigit() == True:
                    numCount+=1
                elif char == '.':
                    dotCount+=1
                elif char == ',':
                    commaCount+=1
    print("Number count: " + str(numCount))
    print("Comma count: " + str(commaCount))
    print("Dot count: " + str(dotCount))
    print("Line count: " + str(lineCount))
    print("Word count: " + str(wordCount))
def showStats(fileName):
    temp1 = []
    temp2 = []
    lengths = []
    myWords = []
    keys = []
    values = []
    count = 0
    with open(fileName, 'r') as f:
        for line in f:
            words = line.split()
            for word in words:
                temp2.append(word)
                temp1.append(len(word))
    for x in temp1:
        if x not in lengths:
            lengths.append(x)
    lengths.sort()
    dictionaryStats = {}
    for x in lengths:
        dictionaryStats[x] = []
    for x in lengths:
        for word in temp2:
            if len(word) == x:
                dictionaryStats[x].append(word)
    for key in dictionaryStats:
        print("Key = " + str(key) + " Total number of words with " + str(key) + " characters = " + str(len(dictionaryStats[key])))
    return dictionaryStats
def printStats(aDictionary):
    aFile = open("statsWords.dat", 'w')
    for key in aDictionary:
        aFile.write(str(key) + " : " + str(aDictionary[key]) + "\n")
    aFile.close()
choice = -1
while choice !=0:
    Menu()
    choice = (int(input("Please choose 1-4 to perform function. Press 0 to exit the program. Thank you. \n")))
    if choice == 0:
        print ("Exit program. Thank you.")
        sys.exit
    elif choice == 1:
        THE_FILE = readFile()
    elif choice == 2:
        showCounts(THE_FILE)
    elif choice == 3:
        showStats(THE_FILE)
    elif choice == 4:
        printStats(myDictionary)
    else:
        print ("Error.")
I'm trying to open a file, display the statistics of its word lengths, and then write those statistics to a new file. I can read the file and display the statistics, but when I print the statistics to a file I get an error: "int" object is not iterable. Any ideas? Thanks guys!
Error:
Traceback (most recent call last):
  File "hw4_ThomasConnor.py", line 111, in <module>
    printStats(myDictionary)
  File "hw4_ThomasConnor.py", line 92, in printStats
    for key in aDictionary:
TypeError: 'int' object is not iterable

The problem is that you set myDictionary to 0 at the top of your program and then pass it to your file-writing function in printStats(myDictionary).
That function contains the line for key in aDictionary, and since you passed in 0, this is effectively for key in 0, which is where the error comes from.
You need to pass the result of the showStats function to your printStats function.
As this looks like homework, I will leave it at that for now.
Sorry, I am confused. In the showStats function I have to somehow say "send results to printStats function", and then in the printStats function I have to call the results? How would I do that exactly?
The printStats function is expecting a dictionary to print. This dictionary is generated by the showStats function (in fact, it returns this dictionary).
So you need to send the result of the showStats function to the printStats function.
To save the return value of a function, assign it to a name on the left-hand side (LHS) of the call expression, like this:
>>> def foo(bar):
...     return bar*2
...
>>> def print_results(result):
...     print('The result was: {}'.format(result))
...
>>> result = foo(2) # Save the returned value
Since result is just like any other name in Python, you can pass it to any other function:
>>> print_results(result)
The result was: 4
If we don't want to store the result of the function, and just want to send it to another function, then we can use this syntax:
>>> print_results(foo(2))
The result was: 4
You need to do something similar in your main loop where you execute the functions.
Since the dictionary you want to print is returned by the showStats function, you must call showStats before calling printStats. This poses a problem if your user selects 4 before selecting 3, so make sure you find a workaround for this. A simple workaround would be to prompt the user to calculate the stats by selecting 3 before selecting 4. Try to think of another way to get around this problem.
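One possible shape for such a main loop is sketched below. It is not the original program: show_stats, format_stats and handle_choice are hypothetical helpers, the file I/O is replaced by a list of words, and the menu input is replaced by a fixed sequence of choices so the example runs on its own. The point it illustrates is keeping the dictionary returned by option 3 and using None, rather than 0, to mark "not calculated yet".

```python
def show_stats(words):
    """Group words by length (stand-in for showStats, minus the file I/O)."""
    stats = {}
    for word in words:
        stats.setdefault(len(word), []).append(word)
    return stats


def format_stats(stats):
    """Build the lines printStats would write, one per word length."""
    return [f"{length} : {stats[length]}" for length in sorted(stats)]


def handle_choice(choice, words, stats):
    """Run one menu choice; return (updated stats, output lines)."""
    if choice == 3:
        stats = show_stats(words)  # keep the returned dictionary
        return stats, [f"{k}: {len(v)} words" for k, v in sorted(stats.items())]
    if choice == 4:
        if stats is None:  # option 4 chosen before option 3
            return stats, ["Please run option 3 first."]
        return stats, format_stats(stats)
    return stats, ["Error."]


words = "the cat sat on a mat".split()
stats = None  # None (not 0) marks "not calculated yet"
for choice in (4, 3, 4):
    stats, output = handle_choice(choice, words, stats)
    print(output)
```

Instead of refusing, the branch for option 4 could also call show_stats itself when stats is None, which is one of the other ways around the ordering problem.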

Here:
THE_FILE = ""
myDictionary = 0
you assign an integer to myDictionary.
Later you call:
printStats(myDictionary)
and when that function tries to iterate over the keys of the dictionary, it fails because an integer is not iterable.

Performing an action as python script closes

I was wondering if it is possible to perform an action at a given point in a basic Python script, say when it is closed. I have the following code to find prime numbers (just for fun):
number = 1
primelist = []
nonprime = []
while number < 1000:
    number += 1
    for i in range(number):
        if i != 1 and i != number and i !=0:
            if number%i == 0:
                nonprime.append(number)
            else:
                primelist.append(number)
nonprimes = open("nonprimes.txt", "w")
for nonprime in set(primelist) & set(nonprime):
    nonprimes.write(str(nonprime) + ", ")
nonprimes.close()
So basically I want to run the last part when the script is stopped. If this isn't possible, is there a way to, say, press "space" while the program is running and have it save the list?
Cheers in advance :)
EDIT:
I've modified the code to use the atexit module as suggested, but it doesn't appear to be working. Here it is:
import time, atexit
class primes():
    def __init__(self):
        self.work(1)
    def work(self, number):
        number = 1
        self.primelist = []
        self.nonprime = []
        while number < 20:
            time.sleep(0.1)
            print "Done"
            number += 1
            for i in range(number):
                if i != 1 and i != number and i !=0:
                    if number%i == 0:
                        self.nonprime.append(number)
                    else:
                        self.primelist.append(number)
        nonprimes = open("nonprimes.txt", "w")
        for nonprime in set(self.primelist) & set(self.nonprime):
            nonprimes.write(str(nonprime) + ", ")
        nonprimes.close()
    def exiting(self, primelist, nonprimelist):
        primelist = self.primelist
        nonprimelist = self.nonprime
        nonprimes = open("nonprimes.txt", "w")
        for nonprime in set(self.primelist) & set(self.nonprime):
            nonprimes.write(str(nonprime) + ", ")
        nonprimes.close()
    atexit.register(exiting)
if __name__ == "__main__":
    primes()

I'm pretty certain the file object cleans up and flushes its contents to the file when it is reclaimed, but the best way to go about this is to use a with statement:
with open("nonprimes.txt", "w") as nonprimes:
    for nonprime in set(primelist) & set(nonprime):
        nonprimes.write(str(nonprime) + ", ")
The boilerplate of closing the file is performed automatically when the with block ends.

Python has an atexit module that allows you to register code you want executed when a script exits:
import atexit, sys
def doSomethingAtExit():
    print("Doing something on exit")
atexit.register(doSomethingAtExit)
if __name__ == "__main__":
    sys.exit(1)
    print("This won't get called")
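Applied to the prime-number script, one likely reason the edited version in the question does nothing useful at exit is that exiting is registered as a plain function, with no instance and no arguments to call it with. The sketch below registers a bound method of an existing object instead. It is a rework under assumptions, not the asker's code: the PrimeFinder class, its method names and the run helper are hypothetical, and the composite test is simplified to a direct divisibility check.

```python
import atexit


class PrimeFinder:
    """Hypothetical rework of the question's primes class."""

    def __init__(self):
        self.nonprimes = []

    def work(self, limit):
        # Collect every composite number in [2, limit).
        for number in range(2, limit):
            if any(number % i == 0 for i in range(2, number)):
                self.nonprimes.append(number)

    def save(self, path="nonprimes.txt"):
        # Takes no required arguments, so it can be registered as-is.
        with open(path, "w") as out:
            out.write(", ".join(str(n) for n in self.nonprimes))


def run(limit, path="nonprimes.txt"):
    finder = PrimeFinder()
    # Register the bound method BEFORE the long-running work starts;
    # atexit calls it on normal exit, sys.exit(), or an unhandled
    # exception such as KeyboardInterrupt (Ctrl+C).
    atexit.register(finder.save, path)
    finder.work(limit)
    return finder
```

Calling run(1000) and pressing Ctrl+C part-way through would still write the numbers found so far. The handler does not run if the process is killed by a signal Python does not handle or exits through os._exit().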
