Benford's law program

Benford's law program - python

I have to write a program that proves Benford's Law for two data lists. I think I have most of the code down, but there are small errors that I am missing. I am sorry if this is not how the site is supposed to be used, but I really need help. Here is my code:
def getData(fileName):
    data = []
    f = open(fileName,'r')
    for line in f:
        data.append(line)
    f.close()
    return data
def getLeadDigitCounts(data):
    counts = [0,0,0,0,0,0,0,0,0]
    for i in data:
        pop = i[1]
        digits = pop[0]
        int(digits)
        counts[digits-1] += 1
    return counts
def showResults(counts):
    percentage = 0
    Sum = 0
    num = 0
    Total = 0
    for i in counts:
        Total += i
    print"number of data points:",Sum
    print
    print"digit number percentage"
    for i in counts:
        Sum += i
        percentage = counts[i]/float(Sum)
        num = counts[i]
        print"5%d 6%d %f"%(i,num,percentage)
def showLeadingDigits(digit,data):
    print"Showing data with a leading",digit
    for i in data:
        if digit == i[i][1]:
            print i
def processFile(name):
    data = getData(name)
    counts = getLeadDigitCounts(data)
    showResults(counts)
    digit = input('Enter leading digit: ')
    showLeadingDigits(digit, data)
def main():
    processFile('TexasCountyPop2010.txt')
    processFile('MilesofTexasRoad.txt')
main()
Again, sorry if this is not how I am supposed to use this site. Also, I can only use programming techniques that the professor has shown us, so if you could just give me advice to clean up the code as it is, I would really appreciate it.
Also, here are a few lines from my data.
Anderson County 58458
Andrews County 14786
Angelina County 86771
Aransas County 23158
Archer County 9054
Armstrong County 1901

Your error is coming from this line:
int(digits)
This doesn't actually do anything to digits: int() returns a new value and leaves the variable unchanged. If you want digits to hold an integer, you have to reassign the variable:
digits = int(digits)
Also, to properly parse your data, I would do something like this:
for line in data:
    place, population = line.rsplit(None, 1)
    lead = int(population[0])  # leading digit of the last field
    counts[lead - 1] += 1

Let's walk through one cycle of your code, and I think you'll see what the problem is. I'll be using this file as data:
An, 10, 22
In, 33, 44
Out, 3, 99
Now getData returns:
["An, 10, 22",
"In, 33, 44",
"Out, 3, 99"]
Now take a look at the first pass through the loop:
for i in data:
    # i = "An, 10, 22"
    pop = i[1]
    # pop = 'n', the second character of i
    digits = pop[0]
    # digits = 'n', the first character of pop
    int(digits)
    # Error here: int('n') raises ValueError; you probably also wanted digits = int(digits)
    counts[digits-1] += 1
Depending on how your data is structured, you need to figure out the logic to extract the digits you expect to get from your file. This logic might fit better in the getData function, but that mostly depends on the specifics of your data.
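A minimal sketch of that extraction in Python 3, assuming every line looks like the sample data in the question (a place name followed by a whitespace-separated number). The function name lead_digit_counts and the inline sample list are illustrative and not part of the original program.

```python
def lead_digit_counts(lines):
    """Count leading digits 1-9 of the last whitespace-separated field."""
    counts = [0] * 9
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        number = fields[-1]
        lead = number.lstrip("0")[:1]  # first non-zero character, if any
        if lead.isdigit():
            counts[int(lead) - 1] += 1
    return counts


# Sample lines copied from the question; real input would come from the file.
sample = [
    "Anderson County 58458\n",
    "Andrews County 14786\n",
    "Angelina County 86771\n",
    "Aransas County 23158\n",
    "Archer County 9054\n",
    "Armstrong County 1901\n",
]
print(lead_digit_counts(sample))
```

Taking the last field rather than a fixed character position means county names of any length are handled the same way.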

Just to share a different (and maybe more step-by-step) approach. It's in Ruby.
The thing is, Benford's Law generally doesn't hold when you draw random values uniformly from one fixed range. The upper bound of the range you are drawing from has to vary, or be unbounded.
In other words, say you used a random number generator with a fixed range to draw from, e.g. 1-100. You would end up with a random dataset, yes, but 1 would appear as the first digit about as often as 9 or any other digit.
**The interesting** part happens when you let the computer (or nature) decide randomly, on each draw, how large the random number is allowed to be. Then you get a dataset whose leading digits skew toward small values in a way that resembles Benford's Law. I have written this Ruby code for you, which illustrates that pattern.
Take a look at this bit of code I've put together for you!
It's a bit WET, but I'm sure it'll explain.
<-- RUBY CODE BELOW -->
dataset = []
999.times do
  random = rand(999)
  dataset << rand(random)
end
startwith1 = []
startwith2 = []
startwith3 = []
startwith4 = []
startwith5 = []
startwith6 = []
startwith7 = []
startwith8 = []
startwith9 = []
dataset.each do |element|
  case element.to_s.split('')[0].to_i
  when 1 then startwith1 << element
  when 2 then startwith2 << element
  when 3 then startwith3 << element
  when 4 then startwith4 << element
  when 5 then startwith5 << element
  when 6 then startwith6 << element
  when 7 then startwith7 << element
  when 8 then startwith8 << element
  when 9 then startwith9 << element
  end
end
a = startwith1.length
b = startwith2.length
c = startwith3.length
d = startwith4.length
e = startwith5.length
f = startwith6.length
g = startwith7.length
h = startwith8.length
i = startwith9.length
sum = a + b + c + d + e + f + g + h + i
p "#{a} times first digit = 1; equating #{(a * 100) / sum}%"
p "#{b} times first digit = 2; equating #{(b * 100) / sum}%"
p "#{c} times first digit = 3; equating #{(c * 100) / sum}%"
p "#{d} times first digit = 4; equating #{(d * 100) / sum}%"
p "#{e} times first digit = 5; equating #{(e * 100) / sum}%"
p "#{f} times first digit = 6; equating #{(f * 100) / sum}%"
p "#{g} times first digit = 7; equating #{(g * 100) / sum}%"
p "#{h} times first digit = 8; equating #{(h * 100) / sum}%"
p "#{i} times first digit = 9; equating #{(i * 100) / sum}%"
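For comparison, Benford's Law predicts that leading digit d occurs with probability log10(1 + 1/d). A short Python sketch (the function name benford_expected is illustrative) that prints those expected percentages, so they can be held against the counts that either program produces:

```python
import math


def benford_expected():
    """Return the Benford probability for each leading digit 1-9."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


for digit, p in benford_expected().items():
    print(f"{digit}: {100 * p:.1f}%")
```

The nine probabilities sum to 1, with digit 1 expected roughly 30% of the time and digit 9 under 5%.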

Related

Draw a centered triforce surrounded by hyphens using Python

I want to draw a triangle of asterisks from a given n which is an odd number and at least equal to 3. So far I did the following:
def main():
    num = 5
    for i in range(num):
        if i == 0:
            print('-' * num + '*' * (i + 1) + '-' * num)
        elif i % 2 == 0:
            print('-' * (num-i+1) + '*' * (i + 1) + '-' * (num-i+1))
        else:
            continue
if __name__ == "__main__":
    main()
And got this as the result:
-----*-----
----***----
--*****--
But how do I edit the code so the number of hyphens corresponds to the desirable result:
-----*-----
----***----
---*****---
--*-----*--
-***---***-
*****-*****

There's probably a better way but this seems to work:
def triangle(n):
    assert n % 2 != 0 # make sure n is an odd number
    hyphens = n
    output = []
    for stars in range(1, n+1, 2):
        h = '-'*hyphens
        s = '*'*stars
        output.append(h + s + h)
        hyphens -= 1
    pad = n // 2
    mid = n
    for stars in range(1, n+1, 2):
        fix = '-'*pad
        mh = '-'*mid
        s = '*'*stars
        output.append(fix + s + mh + s + fix)
        pad -= 1
        mid -= 2
    print(*output, sep='\n')
triangle(5)
Output:
-----*-----
----***----
---*****---
--*-----*--
-***---***-
*****-*****

Think about what it is you're iterating over and what you're doing with your loop. Currently you're iterating up to the maximum number of hyphens you want, and you seem to be treating this as the number of asterisks to print, but if you look at the edge of your triforce, the number of hyphens is decreasing by 1 each line, from 5 to 0. To me, this would imply you need to print num-i hyphens each iteration, iterating over line number rather than the max number of hyphens/asterisks (these are close in value, but the distinction is important).
I'd recommend trying to make one large solid triangle first, i.e.
-----*-----
----***----
---*****---
--*******--
-*********-
***********
since this is a simpler problem to solve and is just one modification away from what you're trying to do (this is where the distinction between number of asterisks and line number will be important, as your pattern changes dependent on what line you're on).
I'll help get you started; for any odd n, the number of lines you need to print is going to be (n+1). If you modify your range to be over this value, you should be able to figure out how many hyphens and asterisks to print on each line to make a large triangle, and then you can just modify it to cut out the centre.
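A minimal sketch of that first step in Python, assuming n is odd and the triangle has n + 1 rows as described above. The function name solid_triangle is illustrative; cutting out the centre is left as the follow-up modification.

```python
def solid_triangle(n):
    """Return the rows of a solid, hyphen-padded triangle with n + 1 rows."""
    rows = []
    for line in range(n + 1):
        hyphens = "-" * (n - line)      # padding shrinks by one per row
        stars = "*" * (2 * line + 1)    # width grows by two per row
        rows.append(hyphens + stars + hyphens)
    return rows


print("\n".join(solid_triangle(5)))
```

Iterating over the line number keeps the hyphen count and the asterisk count as two separate expressions, which is the distinction the answer points at.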

Python beginner - RGB values to HEX. How bad is my code?

I recently wrote a program that calculates the hex value when given RGB values. I was just wondering if my code is terrible (I did my best to write it from scratch without much help). I'm still a beginner and trying to learn.
Any help would be greatly appreciated (guidance about how I could do things better, etc.).
Thank you
# sets the HEX letters for numbers above 9
hex_table = {'0':0,'1':1,'2':2,'3':3,'4':4,'5':5,'6':6,'7':7,'8':8, '9':9,
             'a':10, 'b':11, 'c':12, 'd':13, 'e':14, 'f':15}
# creates variable for the keys in dictionary
key_list = list(hex_table.keys())
# creates variable for values in dictionary
val_list = list(hex_table.values())
def test(r= int(input('red value: ')),g= int(input('green value: ')), b= int(input('blue value: '))):
    # finds the index of the value
    red_value = r // 16
    green_value = g // 16
    blue_value = b // 16
    # Calculate the remainder
    red_float = float(r) / 16
    red_remainder = red_float % 1
    green_float = float(g) / 16
    green_remainder = green_float % 1
    blue_float = float(b) / 16
    blue_remainder = blue_float % 1
    # adds '#' in front of the result
    print('#',end='')
    #find the first two values in HEX code
    if r >= 10:
        print(key_list[val_list.index(red_value)],end='')
        second_letter = (int(red_remainder * 16))
        print(key_list[val_list.index(second_letter)],end='')
    elif r <10:
        print(red_value,end='')
        print(int(red_remainder * 16),end='')
    #find the next two values
    if g >= 10:
        print(key_list[val_list.index(green_value)],end='')
        second_letter = (int(green_remainder * 16))
        print(key_list[val_list.index(second_letter)],end='')
    elif g <10:
        print(green_value,end='')
        print(int(green_remainder * 16),end='')
    #find the last two values
    if b >= 10:
        print(key_list[val_list.index(blue_value)],end='')
        second_letter = (int(blue_remainder * 16))
        print(key_list[val_list.index(second_letter)],end='')
    elif b <10:
        print(blue_value,end='')
        print(int(blue_remainder * 16),end='')
test()

You could cut the code down a lot by repeating yourself less: you calculate the digits for all three colours in the same function, so the same logic appears three times.
I've written my answer in JavaScript, but I will explain what I am doing.
function getDigit(val) {
    var alphabet = ["a", "b", "c", "d", "e", "f"]
    if (val >= 10 && val < 16)
        val = alphabet[val - 10];
    return val.toString();
}
function helper(val) {
    var first = Math.floor(val / 16)
    var second = val % 16;
    return getDigit(first) + getDigit(second)
}
function rgbToHex(red, green, blue) {
    return helper(red) + helper(green) + helper(blue);
}
I have created two functions to help calculate the digits for each of the RGB values.
The helper() function calculates the first and second hex digits of a value. Let's use 24 and 172 as examples.
To find the first digit, divide the value by 16 and floor the answer so it rounds down to a whole number.
24 / 16 = 1.5, Floor(1.5) = 1;
Therefore our first digit is 1.
For the second digit we take the remainder of the value divided by 16.
24 % 16 = 8
So the full value for 24 would be 18.
Let's look at 172 now.
172 / 16 = 10.75, Floor(10.75) = 10;
This will not work as is, because the digit should be "a" rather than "10". This is where the getDigit() function comes in: it takes the value and checks whether it is between 10 and 15. We then subtract 10 from the value to find which letter to use from the 'alphabet' array.
So for 10, we get 10 - 10 = 0; which means we use the value at index 0, which gives us "a".
We can do the same for the second digit.
172 % 16 = 12; Now the getDigit() function is called again.
12 - 10 = 2; so we take the item at index 2 in the array, which is "c".
So for 172 the value will be ac.
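Since the question is about Python, here is a rough Python equivalent of the same idea. It is a sketch, not a port of the answer's exact code: the names HEX_DIGITS, to_hex_pair and rgb_to_hex are made up for illustration, and the range check and leading '#' are additions.

```python
HEX_DIGITS = "0123456789abcdef"


def to_hex_pair(value):
    """Convert an integer 0-255 to its two-character hex form."""
    if not 0 <= value <= 255:
        raise ValueError("value must be between 0 and 255")
    # Quotient gives the first digit, remainder gives the second.
    return HEX_DIGITS[value // 16] + HEX_DIGITS[value % 16]


def rgb_to_hex(red, green, blue):
    """Join the three two-digit pairs behind a leading '#'."""
    return "#" + to_hex_pair(red) + to_hex_pair(green) + to_hex_pair(blue)


print(to_hex_pair(24))          # 18
print(to_hex_pair(172))         # ac
print(rgb_to_hex(220, 20, 60))  # #dc143c
```

Indexing into one lookup string replaces both the letter table and the separate branch for values below 10.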

Here's my slightly amended code:
hex_table = {'0':0,'1':1,'2':2,'3':3,'4':4,'5':5,'6':6,'7':7,'8':8,'9':9,
             'A':10, 'B':11, 'C':12, 'D':13, 'E':14, 'F':15}
key_list = list(hex_table.keys())
val_list = list(hex_table.values())
def rgb(num_val):
    # first hex digit is the quotient, second is the remainder
    first = num_val // 16
    second = num_val % 16
    return key_list[val_list.index(first)] + key_list[val_list.index(second)]
def rgb_hex(r,g,b):
    return f'#{rgb(r)}{rgb(g)}{rgb(b)}'
print(rgb_hex(220,20,60))  # prints #DC143C

This is an old question, but for information: I developed a package with some utilities related to colors and colormaps. It contains the rgb2hex function you were looking for, which converts an RGB triplet into a hex value (similar functions can be found in many other packages, e.g. matplotlib). It's on PyPI:
pip install colormap
and then
>>> from colormap import rgb2hex
>>> rgb2hex(0, 128, 64)
'#008040'
Validity of the inputs is checked (values must be between 0 and 255).

How to get percentage of combinations computed?

I have this password generator, which computes combinations with a length of 2 to 6 characters from a list containing lowercase letters, capital letters and numbers (without 0), 61 characters altogether.
All I need is to show the percentage (in steps of 5) of the combinations already created. I tried to compute the total number of combinations of the selected length, derive a boundary value from that number (the 5 % step values), and count each combination written to the text file; when the count of combinations meets the boundary value, print xxx % completed. But this code doesn't seem to work.
Do you know how to easily show the percentage please?
Sorry for my english, I'm not a native speaker.
Thank you all!
def pw_gen(characters, length):
    """generate all characters combinations with selected length and export them to a text file"""
    # counting number of combinations according to a formula in documentation
    k = length
    n = len(characters) + k - 1
    comb_numb = math.factorial(n)/(math.factorial(n-length)*math.factorial(length))
    x = 0
    # first value
    percent = 5
    # step of percent done to display
    step = 5
    # 'step' % of combinations
    boundary_value = comb_numb/(100/step)
    try:
        # output text file
        with open("password_combinations.txt", "a+") as f:
            for p in itertools.product(characters, repeat=length):
                combination = ''.join(p)
                # write each combination and create a new line
                f.write(combination + '\n')
                x += 1
                if boundary_value <= x <= comb_numb:
                    print("{} % complete".format(percent))
                    percent += step
                    boundary_value += comb_numb/(100/step)
                elif x > comb_numb:
                    break

First of all, I think you are using an incorrect formula for the number of combinations: itertools.product creates variations with repetition, so the correct formula is n^k (n to the power of k).
Also, you overcomplicated the percentage calculation a little bit. I just modified your code to work as expected.
import itertools
def pw_gen(characters, length):
    """generate all characters combinations with selected length and export them to a text file"""
    k = length
    n = len(characters)
    comb_numb = n ** k
    x = 0
    next_percent = 5
    percent_step = 5
    with open("password_combinations.txt", "a+") as f:
        for p in itertools.product(characters, repeat=length):
            combination = ''.join(p)
            # write each combination and create a new line
            f.write(combination + '\n')
            x += 1
            percent = 100.0 * x / comb_numb
            if percent >= next_percent:
                print(f"{next_percent} % complete")
                # advance past every step already reached, so no step prints twice
                while next_percent <= percent:
                    next_percent += percent_step
The tricky part is the while loop, which makes sure that everything works for very small sets (where a single combination accounts for more than one step's percentage of the results).

Removed try:, since you are not handling any errors with except.
Also removed elif:, since this condition is never met anyway.
Besides, your formula for comb_numb is not the right one, since itertools.product generates every ordered arrangement with repetition; the total is len(characters) ** length. With those changes, your code is good.
import itertools, string
def pw_gen(characters, length):
    """generate all characters combinations with selected length and export them to a text file"""
    # counting number of combinations: n to the power of length
    comb_numb = len(characters) ** length
    x = 0
    # first value
    percent = 5
    # step of percent done to display
    step = 5
    # 'step' % of combinations
    boundary_value = comb_numb/(100/step)
    # output text file
    with open("password_combinations.txt", "a+") as f:
        for p in itertools.product(characters, repeat=length):
            combination = ''.join(p)
            # write each combination and create a new line
            f.write(combination + '\n')
            x += 1
            if boundary_value <= x:
                print("{} % complete".format(percent))
                percent += step
                boundary_value += comb_numb/(100/step)
pw_gen(string.ascii_letters, 4)
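Both answers rest on the same counting claim, which is easy to check on a tiny case. The sketch below assumes a three-character alphabet and a length of 2, chosen only to keep the numbers small; it compares what itertools.product actually yields with n ** k and with the combinations-with-repetition formula used in the question.

```python
import itertools
import math

characters = "abc"   # illustrative alphabet, not the 61-character list
length = 2

n = len(characters)
# Count what itertools.product really produces.
produced = sum(1 for _ in itertools.product(characters, repeat=length))
# The question's formula: unordered selections with repetition.
multiset_formula = math.comb(n + length - 1, length)

print(produced)          # 9
print(n ** length)       # 9
print(multiset_formula)  # 6
```

Because "ab" and "ba" are different passwords, the ordered count n ** k is the one the progress percentage should be measured against.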

how do I run a defined function multiple times with a for statement so that the return compounds?

I am pretty new at Python and have a defined function for doubling a number. I want to double the number three times using a for statement. This is from lesson 6.3 in Dan Bader's Python Basics. For some reason, this one has me stumped.
Below, I tried adding:
number = number * 2 after my for statement but my result is
20
40
80
def doubles(number):
    """Takes one number as its input and doubles it."""
    double = number * 2
    return double
number = 5
for x in range(0, 3):
    print(doubles(number))
Actual results are:
10
10
10
Expected results are:
10
20
40

def doubles(number):
    """Takes one number as its input and doubles it."""
    double = number * 2
    return double
number = 5
for x in range(0, 3):
    print(doubles(number))
    number=doubles(number)

Sounds like you want number (the global one) to retain the result of calling doubles; so do that explicitly:
for x in range(0,3):
    number = doubles(number)
    print(number)

You need to add the number to itself on each pass: number = number + number, or in short number += number.
I hope it's easier to see when you write it like this:
number = 5
for x in range(0, 3):
    double = number * 2
    print(double)
    number += number

You are incrementing x but not changing the value of number along the way:
def doubles(number):
    """Takes one number as its input and doubles it."""
    double = number * 2
    return double
number = 5
for x in range(0, 3):
    print(doubles(number))
    number*=2

Here's the solution with minimum coding.
def double(a):
    print(a)
    for b in range(1,5):
        b = a * 2
        a = b
        print(a)
double(10)

Add numbers in hexadecimal base without converting bases?

I need to write a function that takes two numbers in hexadecimal and calculates their sum. I'm not allowed to convert them to decimal; the code is supposed to do the addition "manually" using loops.
For example, this is how it should work:
    1
    1 f 5   (A)
+     5 a   (B)
-------------
=   2 4 f
Here is an input example:
>>> add("a5", "17")
'bc'
I've started building my code but I got stuck, I thought I would divide into three ifs, one that would sum up only numbers, other sums numbers and letters, and the third one sums letters, but I don't know how to continue from here:
def add_hex(A,B):
    lstA = [int(l) for l in str(A)]
    lstB = [int(l) for l in str(B)]
    if len(A)>len(B):
        A=B
        B=A
    A='0'*(len(B)-len(A))+A
    remainder=False
    result=''
    for i in range(len(B)-1)
        if (A[i]>0 and A[i]<10) and (B[i]>0 and B[i]<10):
            A[i]+B[i]=result
            if A[i]+B[i]>10:
                result+='1'
Any help is greatly appreciated, I have no clue how to start on this!

You can have a sub-function that adds two single-digit hex numbers and returns their single-digit sum and a carry (either 0 or 1). This function will take three inputs: two numbers you want to add and a carry-in. You can then loop through the digits of the two numbers you want to add from least significant to most significant, and apply this function for every pair of digits while taking into account the carry at each stage.
So let's try your example:
A 5
1 7 +
We start at the least significant digits, 5 and 7, and perform the 1-digit addition. 5₁₆ + 7₁₆ = 12₁₀. 12₁₀ is less than 16₁₀, so the output of our 1-digit add is 12₁₀ = C₁₆ with a carry of 0.
Now we add A and 1 (our carry-in is 0 so we can just add them normally). A₁₆ + 1₁₆ = 11₁₀. 11₁₀ is less than 16₁₀, so the output of our 1-digit add is 11₁₀ = B₁₆ with a carry of 0. (If we had a non-zero carry-in, we would just add 1 to this value.)
Hence, our overall result is:
A 5
1 7 +
-----
B C
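The procedure above can be sketched in Python roughly as follows. This is one possible reading, not the only way to do it: it assumes lowercase input, and it assumes that looking up a single digit's position in a table is allowed, since the whole numbers are never converted to another base. The names HEX_DIGITS and add_digits are illustrative.

```python
HEX_DIGITS = "0123456789abcdef"


def add_digits(a, b, carry_in):
    """Add two hex digits plus a carry; return (sum_digit, carry_out)."""
    total = HEX_DIGITS.index(a) + HEX_DIGITS.index(b) + carry_in
    return HEX_DIGITS[total % 16], total // 16


def add_hex(a, b):
    """Add two lowercase hex strings digit by digit, right to left."""
    width = max(len(a), len(b))
    a = a.rjust(width, "0")   # pad the shorter number with leading zeros
    b = b.rjust(width, "0")
    digits = []
    carry = 0
    for x, y in zip(reversed(a), reversed(b)):
        digit, carry = add_digits(x, y, carry)
        digits.append(digit)
    if carry:
        digits.append("1")    # a final carry becomes a new leading digit
    return "".join(reversed(digits))


print(add_hex("a5", "17"))   # bc
print(add_hex("1f5", "5a"))  # 24f
```

Keeping the single-digit step in its own function mirrors the sub-function described in the answer, so the loop only has to pass the carry along.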

I think we just remember the patterns of addition, like the following.
"0" + "0" = "0"
"0" + "1" = "1"
"0" + "2" = "2"
.
.
.
"f" + "d" = "1c"
"f" + "e" = "1d"
"f" + "f" = "1e"
We have a dictionary of all of these patterns because we learned them in school or somewhere. And we also learned how to carry.
So I think the manual addition algorithm looks like this:
Remember the patterns, including the carry.
Calculate:
Translate two digits into one digit (a+b->c).
Treat the carry correctly.
And here is my code for that. It may be a bit tricky.
import itertools
def add_hex(A,B):
    A = "0"+A
    B = "0"+B
    #Remember all patterns, including carry, in variable d.
    i2h = dict(zip(range(16), "0123456789abcdef"))
    a = [(i,j) for i in "0123456789abcdef" for j in "0123456789abcdef"]
    b = list(map(lambda t: int(t[0],16)+int(t[1],16), a))
    c = ["0"+i2h[i] if i<16 else "1"+i2h[i-16] for i in b]#list of digits including carry
    d = dict(zip(a,c))#d={(digit,digit):carry+digit,...}
    #Calculate with variable d.
    result = ""
    cur = "0"
    nex = "0"
    for i in itertools.zip_longest(A[::-1], B[::-1], fillvalue = "0"):
        cur = d[(nex, d[i][1])][1] #cur = carry + digit + digit
        if d[i][0]=='1' or d[(nex, d[i][1])][0]=='1':#nex = carry out of this column
            nex = "1"
        else:
            nex = "0"
        result += cur
    return result[::-1]
#Test
A = "fedcba"
B = "012346"
print(add_hex(A,B))
print(hex(int(A,16)+int(B,16)))#For validation
I hope it helps. :)
