The program I'm writing simulates rolling 4 dice and adds the results together into a "Total" column. I'm trying to print the outcomes for 10,000 dice rolls, but for some reason the value of each die drops to 0.0 partway through and stays that way until the end. Could anyone tell me what's going wrong here and how to fix it? Thanks :)
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

np.random.seed(101)
four_dice = np.zeros([pow(10, 4), 5])  # 10,000 rows, 5 columns
n = 0
outcomes = [1, 2, 3, 4, 5, 6]
for i in outcomes:
    for j in outcomes:
        for k in outcomes:
            for l in outcomes:
                four_dice[n, :] = [i, j, k, l, i + j + k + l]
                n += 1
four_dice_df = pd.DataFrame(four_dice, columns=('1', '2', '3', '4', 'Total'))
print(four_dice_df)  # print the table
OUTPUT
1 2 3 4 Total
0 1.0 1.0 1.0 1.0 4.0
1 1.0 1.0 1.0 2.0 5.0
2 1.0 1.0 1.0 3.0 6.0
3 1.0 1.0 1.0 4.0 7.0
4 1.0 1.0 1.0 5.0 8.0
... ... ... ... ... ...
9995 0.0 0.0 0.0 0.0 0.0
9996 0.0 0.0 0.0 0.0 0.0
9997 0.0 0.0 0.0 0.0 0.0
9998 0.0 0.0 0.0 0.0 0.0
9999 0.0 0.0 0.0 0.0 0.0
[10000 rows x 5 columns]
Does this work for what you want?
import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randint(1, 7, size=(10000, 4)), columns=[1, 2, 3, 4])
df['total'] = df.sum(axis=1)
You ran out of dice combinations. You made your table 10^4 rows long, but there are only 6^4 = 1296 combinations. Every row from 1296 through 9999 stays 0, because that's the value np.zeros initialized it with.
To fix this, size your table with the proper value: pow(6, 4) rows.
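A minimal sketch of that fix, reusing the loop from the question with the array sized to pow(6, 4) rows:

```python
import numpy as np
import pandas as pd

rows = pow(6, 4)  # 1296 distinct ordered outcomes for four six-sided dice
four_dice = np.zeros([rows, 5])
n = 0
outcomes = [1, 2, 3, 4, 5, 6]
for i in outcomes:
    for j in outcomes:
        for k in outcomes:
            for l in outcomes:
                four_dice[n, :] = [i, j, k, l, i + j + k + l]
                n += 1
four_dice_df = pd.DataFrame(four_dice, columns=('1', '2', '3', '4', 'Total'))
# The last row is now 6, 6, 6, 6, 24 -- no zero-filled tail
```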
Response to OP comment:
Of course you can write a loop. In this case, the controlling factor should be the number of results you want; you then generate the roll sequences to fulfill your needs. The Pythonic way to do this is to use the itertools package: product (rather than permutations, which would skip rolls with repeated values) will give you the rolls in order; cycle will repeat the sequence until you quit asking.
However, the more obvious way for your current programming is perhaps to simply count in base 6:
digits = [1, 1, 1, 1, 1]
for i in range(10000):
    # Record your digits in the data frame
    ...
    # Add one for the next iteration; roll over if the die is already 6
    for idx, die in enumerate(digits):
        if die < 6:
            digits[idx] += 1
            break
        else:  # Reset die to 1 and continue to next die
            digits[idx] = 1
This will increment the dice, left to right, until you either have one that doesn't need a reset to 1, or run out of dice.
Another possibility is to copy any of the many base-conversion functions available online. Convert your iteration counter i to base 6, take the lowest 4 digits (one per die), and add 1 to each digit.
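For concreteness, here is what the itertools route could look like (a sketch): product enumerates the 1296 ordered rolls, and cycle repeats them until the 10,000 rows are filled.

```python
import itertools

# All 6**4 = 1296 ordered rolls in order; cycle repeats the sequence so the
# 10,000-row table wraps around once the combinations are exhausted.
# (product rather than permutations, because dice values can repeat.)
rolls = itertools.cycle(itertools.product([1, 2, 3, 4, 5, 6], repeat=4))
table = []
for _ in range(10000):
    i, j, k, l = next(rolls)
    table.append([i, j, k, l, i + j + k + l])
# Row 1296 starts over at 1, 1, 1, 1
```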
Related
This is my DataFrame:
3 4 5 6 97 98 99 100
0 1.0 2.0 3.0 4.0 95.0 96.0 97.0 98.0
1 50699.0 16302.0 50700.0 16294.0 50735.0 16334.0 50737.0 16335.0
2 57530.0 33436.0 57531.0 33438.0 NaN NaN NaN NaN
3 24014.0 24015.0 34630.0 24016.0 NaN NaN NaN NaN
4 44933.0 2611.0 44936.0 2612.0 44982.0 2631.0 44972.0 2633.0
1792 46712.0 35340.0 46713.0 35341.0 46759.0 35387.0 46760.0 35388.0
1793 61283.0 40276.0 61284.0 40277.0 61330.0 40323.0 61331.0 40324.0
1794 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
1795 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
1796 27156.0 48331.0 27157.0 48332.0 NaN NaN NaN NaN
How do I apply the function below and get the answers back for each row in one run?
values is the array of values in each row, and N is 100.
def entropy_s(values, N):
    a = scipy.stats.entropy(values, base=2)
    a = round(a, 2)
    global CONSTANT_COUNT, RANDOM_COUNT, LOCAL_COUNT, GLOBAL_COUNT, ODD_COUNT
    if math.isnan(a):
        a = 0.0
    if a == 0.0:
        CONSTANT_COUNT += 1
    elif a < round(math.log2(N), 2):
        LOCAL_COUNT += 1
        RANDOM_COUNT += 1
    elif a == round(math.log2(N), 2):
        RANDOM_COUNT += 1
        GLOBAL_COUNT += 1
        LOCAL_COUNT += 1
    else:
        ODD_COUNT += 1
I assume the values are supposed to be rows? In that case, I suggest the following:
Rows will be fed to the function, and you can get each column of a row via row.column_name.
def func(N=100):
    def entropy_s(values):
        a = scipy.stats.entropy(values, base=2)
        a = round(a, 2)
        global CONSTANT_COUNT, RANDOM_COUNT, LOCAL_COUNT, GLOBAL_COUNT, ODD_COUNT
        if math.isnan(a):
            a = 0.0
        if a == 0.0:
            CONSTANT_COUNT += 1
        elif a < round(math.log2(N), 2):
            LOCAL_COUNT += 1
            RANDOM_COUNT += 1
        elif a == round(math.log2(N), 2):
            RANDOM_COUNT += 1
            GLOBAL_COUNT += 1
            LOCAL_COUNT += 1
        else:
            ODD_COUNT += 1
    return entropy_s

df.apply(func(100), axis=1)
If you want to feed the rows in as lists, you can do this:
df.apply(lambda x: func(100)([k for k in x]), axis=1)
import functools
series = df.apply(functools.partial(entropy_s, N=100), axis=1)
# or
series = df.apply(lambda x: entropy_s(x, N=100), axis=1)
axis=1 will push the rows of your df to the first arg of apply.
You will get a pd.Series of Nones, though, because your function doesn't return anything.
I highly suggest to avoid using globals in your function.
EDIT: If you want meaningful help, you need to ask meaningful questions. Which errors are you getting?
Here is a quick and dirty example that demonstrates what I've suggested. If you get an error, your function likely has a bug (for example, it doesn't return anything), or it doesn't know how to handle NaN.
In [6]: df = pd.DataFrame({1: [1, 2, 3], 2: [3, 4, 5], 3: [6, 7, 8]})
In [7]: df
Out[7]:
1 2 3
0 1 3 6
1 2 4 7
2 3 5 8
In [8]: df.apply(lambda x: np.sum(x), axis=1)
Out[8]:
0 10
1 13
2 16
dtype: int64
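One way to drop the globals entirely, as suggested above, is to have the function return a label per row and tally the labels afterwards. A sketch: the entropy here is a plain base-2 Shannon entropy standing in for scipy.stats.entropy (which normalizes the values the same way), and the label names plus the collapsed three-way branching are illustrative choices, not the original's exact counters.

```python
import math
from collections import Counter

def classify(values, N=100):
    # Base-2 Shannon entropy of the row, normalized like scipy.stats.entropy;
    # the category names here are made up for illustration.
    total = sum(values)
    probs = [v / total for v in values if v > 0]
    a = round(-sum(p * math.log2(p) for p in probs), 2)
    limit = round(math.log2(N), 2)
    if a == 0.0:
        return 'constant'
    elif a < limit:
        return 'local'
    else:
        return 'global'

# Tally labels instead of mutating global counters
rows = [[1, 1, 1, 1], [5, 0, 0, 0], [3, 1, 1, 1]]
counts = Counter(classify(row) for row in rows)
```

With a DataFrame this becomes Counter(df.apply(classify, axis=1)); the original's NaN handling and extra counters would slot into the same return-a-label pattern.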
Exercise 7.3 from Think Python 2nd Edition:
To test the square root algorithm in this chapter, you could compare it with
math.sqrt. Write a function named test_square_root that prints a table like this:
1.0 1.0 1.0 0.0
2.0 1.41421356237 1.41421356237 2.22044604925e-16
3.0 1.73205080757 1.73205080757 0.0
4.0 2.0 2.0 0.0
5.0 2.2360679775 2.2360679775 0.0
6.0 2.44948974278 2.44948974278 0.0
7.0 2.64575131106 2.64575131106 0.0
8.0 2.82842712475 2.82842712475 4.4408920985e-16
9.0 3.0 3.0 0.0
The first column is a number, a; the second column is the square root of a computed with the function from Section 7.5; the third column is the square root computed by math.sqrt; the fourth column is the absolute value of the difference between the two estimates.
It took me a while to get to this point:
import math

def square_root(a):
    x = a / 2
    epsilon = 0.0000001
    while True:
        y = (x + a/x) / 2
        if abs(y-x) < epsilon:
            break
        x = y
    return y

def last_digit(number):
    rounded = '{:.11f}'.format(number)
    dig = str(rounded)[-1]
    return dig

def test_square_root():
    for a in range(1, 10):
        if square_root(a) - int(square_root(a)) < .001:
            f = 1
            s = 13
        elif last_digit(math.sqrt(a)) == '0':
            f = 10
            s = 13
        else:
            f = 11
            s = 13
        print('{0:.1f} {1:<{5}.{4}f} {2:<{5}.{4}f} {3}'.format(a, square_root(a), math.sqrt(a), abs(square_root(a)-math.sqrt(a)), f, s))

test_square_root()
That's my current output:
1.0 1.0 1.0 1.1102230246251565e-15
2.0 1.41421356237 1.41421356237 2.220446049250313e-16
3.0 1.73205080757 1.73205080757 0.0
4.0 2.0 2.0 0.0
5.0 2.2360679775 2.2360679775 0.0
6.0 2.44948974278 2.44948974278 8.881784197001252e-16
7.0 2.64575131106 2.64575131106 0.0
8.0 2.82842712475 2.82842712475 4.440892098500626e-16
9.0 3.0 3.0 0.0
I'm more focused now on achieving the right output; I'll polish the code itself afterwards. Here are my main problems:
Format the last column (I used {:.12g} once, but then the '0.0' turned into just '0', so what should I do?)
Fix the values of the last column. As you can see, there should be only two numbers greater than 0 (when a = 2 and a = 8), but there are two more (when a = 1 and a = 6). I printed those alone to see what was going on and the results were the same; I can't understand it.
Thanks for your help! :)
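One way to address both points (a sketch, not the book's exact code): iterate Newton's method until the estimate stops changing at all instead of using a fixed epsilon, which makes square_root agree with math.sqrt to the last bit for most inputs and removes the spurious differences at a = 1 and a = 6; and print the difference with a bare {}, which renders 0.0 as '0.0' while keeping scientific notation for tiny values. The fixed 11-digit middle columns here don't reproduce the book's shortened 1.0 / 2.0 entries, which the f/s logic in the question already handles.

```python
import math

def square_root(a):
    # Newton's method, iterated until the estimate stops changing at all
    x = a / 2
    for _ in range(100):  # safety cap; convergence takes only ~10 steps here
        y = (x + a / x) / 2
        if y == x:
            break
        x = y
    return x

for a in range(1, 10):
    s = square_root(a)
    m = math.sqrt(a)
    # a bare {} prints 0.0 as '0.0' and tiny values in scientific notation
    print('{:.1f} {:<13.11f} {:<13.11f} {}'.format(float(a), s, m, abs(s - m)))
```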
I have this DataFrame with x-axis data organized in columns. However, columns that had no data were omitted, so the steps between column labels are uneven. For instance:
0.1 0.2 0.5 ...
0 1 4 7 ...
1 2 5 8 ...
2 3 6 9 ...
I want to plot each of those with x-axis np.arange(0, max(df.columns), step=0.1), and also a combined plot of all of them. Is there any easy way to achieve this with matplotlib.pyplot?
plt.plot(np.arange(0, max(df.columns), step=0.1), new_data)
Any help would be appreciated.
If I understood you correctly, your final dataframe is supposed to look like this:
0.0 0.1 0.2 0.3 0.4 0.5
0 0.0 1 4 0.0 0.0 7
1 0.0 2 5 0.0 0.0 8
2 0.0 3 6 0.0 0.0 9
which can be generated (and then also plotted) like this:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
df = pd.DataFrame({0.1:[1,2,3],0.2:[4,5,6],0.5:[7,8,9]})
## make sure to actually include the maximum value (add one step)
# or alternatively rather use np.linspace() with appropriate number of points
xs = np.arange(0, max(df.columns) +0.1, step=0.1)
df = df.reindex(columns=xs, fill_value=0.0)
plt.plot(df.T)
plt.show()
which yields a figure with one line per row of the original data (the filled-in columns show up as drops to 0).
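If you also want one panel per row in addition to the combined view, the reindexed frame can be split up like this (a sketch extending the answer's df; the subplot layout is my choice, not part of the question):

```python
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({0.1: [1, 2, 3], 0.2: [4, 5, 6], 0.5: [7, 8, 9]})
xs = np.arange(0, max(df.columns) + 0.1, step=0.1)
df = df.reindex(columns=xs, fill_value=0.0)

# One panel per row, plus a combined panel at the end
fig, axes = plt.subplots(len(df) + 1, 1, sharex=True)
for ax, (idx, row) in zip(axes, df.iterrows()):
    ax.plot(row.index, row.values)
    ax.set_title('row {}'.format(idx))
axes[-1].plot(df.columns, df.T.values)  # all rows combined
axes[-1].set_title('combined')
plt.tight_layout()
plt.show()
```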
I am trying to cycle through a list of numbers (mostly decimals), but I want to return both 0.0 and the max number.
for example
maxNum = 3.0
steps = 5
increment = 0
time = 10
while increment < time:
    print increment * (maxNum / steps) % maxNum
    increment += 1
I am getting this as an output
0.0
0.6
1.2
1.8
2.4
0.0
but I want 3.0 as the largest number before starting back at 0.0, i.e.:
0.0
0.6
1.2
1.8
2.4
3.0
0.0
Note, I have to avoid logical loops for the calculation part.
You could create the numbers that you want, then use itertools.cycle to cycle through them:

import itertools

nums = itertools.cycle(0.6*i for i in range(6))
for t in range(10):
    print(next(nums))
Output:
0.0
0.6
1.2
1.7999999999999998
2.4
3.0
0.0
0.6
1.2
1.7999999999999998
A small change did the trick:

maxNum = 3.0
steps = 5
i = 0
times = 10
step = maxNum / steps
while i < times:
    print(step * (i % (steps + 1)))
    i += 1
0.0
0.6
1.2
1.7999999999999998
2.4
3.0
0.0
0.6
1.2
1.7999999999999998
You could add an if statement that looks ahead: if the next printed number would be 0.0, print maxNum first.

maxNum = 3.0
steps = 5
increment = 0
time = 10
while increment < time:
    print(round(increment * (maxNum / steps) % maxNum, 2))
    increment += 1
    if round(increment * (maxNum / steps) % maxNum, 2) == 0.0:
        print(maxNum)
0.0
0.6
1.2
1.8
2.4
3.0
0.0
0.6
1.2
1.8
2.4
3.0
I have the following code:

a = 0.0
b = 0.0
c = 1.0
while a < 300:
    a = a + 1
    b = b + 1
    c = c * b
    d = 3**a
    e = (a+1)*c
    f = d / e
    print a, f
The moment f becomes less than 1, I get "0" displayed... why?
The moment f becomes less than 1, I get "0" displayed
That's not what happens. The first time f is less than 1, 4.0 0.675 is printed. That's not 0:
1.0 1.5
2.0 1.5
3.0 1.125
4.0 0.675
5.0 0.3375
The value of f then quickly becomes very, very small, to the point where Python starts using scientific notation with negative exponents:
6.0 0.144642857143
7.0 0.0542410714286
8.0 0.0180803571429
9.0 0.00542410714286
10.0 0.00147930194805
11.0 0.000369825487013
12.0 8.53443431568e-05
13.0 1.82880735336e-05
14.0 3.65761470672e-06
15.0 6.8580275751e-07
Note the -05, -06, etc. The value of f has become so small that shifting the decimal point with an exponent is the more compact way to display it. If you were to format those values using fixed-point notation, they'd need many leading zeros:
>>> format(8.53443431568e-05, '.53f')
'0.00008534434315680000128750276600086976941383909434080'
>>> format(6.8580275751e-07, '.53f')
'0.00000068580275751000002849671038224199648425383202266'
Eventually, f is printed as 0.0:
165.0 5.89639287564e-220
166.0 1.05923225311e-221
167.0 1.89148616627e-223
168.0 3.35766775077e-225
169.0 5.92529603077e-227
170.0 0.0
The last value before 0.0 is 5.925 with two hundred and twenty-six zeros between the decimal point and the first digit:
>>> format(5.92529603077e-227, '.250f')
'0.0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000592529603077000028336751'
That is still far above the absolute minimum value a float object can represent; see the min attribute of the sys.float_info named tuple:
>>> import sys
>>> sys.float_info.min
2.2250738585072014e-308
So f itself never underflows here. What actually produces the 0.0 is the denominator: at a = 170, e = (a+1)*c is 171!, which exceeds the largest representable float (about 1.8e308) and overflows to inf, and a finite number divided by inf is exactly 0.0.
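A quick sketch confirming the mechanics at a = 170: 170! still fits in a float, but 171 times it exceeds the float maximum and becomes inf, and dividing the finite numerator 3**170 by inf gives exactly 0.0.

```python
import math

c = float(math.factorial(170))  # about 7.26e306, still a finite float
e = 171 * c                     # exceeds ~1.8e308, overflows to inf
f = (3.0 ** 170) / e            # finite / inf
print(e, f)                     # inf 0.0
```

(Under the pure-underflow scenario, f would instead have had to pass through subnormal values down to about 5e-324 before reaching 0.0.)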