calculate the range and quartiles - python

Marks    Freq
0-10     5
10-20    13
20-30    20
30-40    32
40-50    60
I want to calculate the quartiles and the range of the data above using Python, and also show it in a suitable plot using Matplotlib. Please help.
[1]: https://i.stack.imgur.com/x0cNf.png

I used this formula to solve it
# Formula for finding "i"th quartile:
# Q_i = L + h/f (i*N/4 - c.f)
data = {
(0, 10): 5,
(10, 20): 13,
(20, 30): 20,
(30, 40): 32,
(40, 50): 60
}
i = 1 # Quartile you want to find
x = (i * sum(data.values())) / 4 # Precalculate i*N/4
c_f = [sum(list(data.values())[:n]) for n in range(1, len(data) + 1)] # Cumulative frequencies
# Calculate class which the quartile is in
# (L = lower, u = upper, f = frequency, c = cumulative frequency)
for ((L, u), f), c in zip(data.items(), c_f):
    if c >= x:
        break
h = u - L # Class size
C_f = c - f
Q_i = L + ((h/f) * (x - C_f))
print('Quartile ', i, ': ', Q_i, sep='')
Output: Quartile 1: 27.25
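The same formula can be wrapped in a function so that all three quartiles come from one call each. This is only a sketch: the name grouped_quartile is made up for illustration, and the range is taken as the upper bound of the last class minus the lower bound of the first class, which is an assumption about how the range of grouped data should be defined here.

```python
def grouped_quartile(classes, i):
    """Return the i-th quartile (i = 1, 2, 3) of grouped data.

    `classes` maps (lower, upper) class bounds to frequencies, in ascending order.
    """
    target = i * sum(classes.values()) / 4  # i*N/4
    cumulative = 0  # cumulative frequency of the classes before the current one
    for (lower, upper), freq in classes.items():
        if cumulative + freq >= target:
            # Q_i = L + h/f * (i*N/4 - c.f)
            return lower + (upper - lower) / freq * (target - cumulative)
        cumulative += freq
    raise ValueError("empty frequency table")


data = {(0, 10): 5, (10, 20): 13, (20, 30): 20, (30, 40): 32, (40, 50): 60}

quartiles = [grouped_quartile(data, i) for i in (1, 2, 3)]
# Assumed definition: highest class bound minus lowest class bound
data_range = max(u for _, u in data) - min(l for l, _ in data)

print("Quartiles:", quartiles)
print("Range:", data_range)
```

For the plot, the class bounds and frequencies could be passed to Matplotlib's plt.bar to draw a histogram-style chart.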

Related

How to solve a Knapsack problem with extra constraints? (Or alternative algorithms)

Expanding upon a common dynamic programming solution for the knapsack problem:
def knapSack(W, wt, val, n):
    results = []
    K = [[0 for x in range(W + 1)] for x in range(n + 1)]
    # Build table K[][] in bottom-up manner
    for i in range(n + 1):
        for w in range(W + 1):
            if (i == 0) or (w == 0):
                K[i][w] = 0
            elif wt[i-1] <= w:
                K[i][w] = max(val[i-1] + K[i-1][w-wt[i-1]], K[i-1][w])
            else:
                K[i][w] = K[i-1][w]
    print(K[n][W])
    return(set(results))
# Driver code
val = [60, 100, 120]
wt = [10, 20, 30]
W = 50
n = len(val)
knapSack(W, wt, val, n)  # prints 220
Now perhaps I add extra constraints:
# Driver code
val_all = [60, 100, 120, 50, 80, 10]
wt_all = [10, 20, 30, 15, 20, 5]
W_all = 50
n_all = len(val_all)
val_1 = [60, 100, 120]
wt_1 = [10, 20, 30]
W_1 = 30
n_1 = len(val_1)
val_2 = [50, 80, 10]
wt_2 = [15, 20, 5]
W_2 = 25
n_2 = len(val_2)
I want to maximise all 3, using the same values. val_all has a solution of 240 [60, 100, 80]. val_1 is maxed at 160 [60, 100] and val_2 would max at 90 [80, 10] but given I want the same values and 10 does not sit in the other two sets the max solution would be 80.
I am also wondering if you can add to the function to give you the values chosen as well as the maximum value. And is this approach feasible for large lists as I have a list of 150,000 different values each with different weights.
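On returning the chosen values as well as the maximum: one common way is to walk back through the finished table. The sketch below is not from the original post; knapsack_with_items is an illustrative name, and it returns item indices rather than values. It keeps the O(n * W) table, so it does not address the scaling concern for very large inputs.

```python
def knapsack_with_items(capacity, weights, values):
    """Return (best_value, chosen_indices) for the 0/1 knapsack problem."""
    n = len(values)
    # K[i][w] = best value using the first i items with capacity w
    K = [[0] * (capacity + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for w in range(capacity + 1):
            K[i][w] = K[i - 1][w]  # skip item i-1
            if weights[i - 1] <= w:
                take = values[i - 1] + K[i - 1][w - weights[i - 1]]
                K[i][w] = max(K[i][w], take)

    # Walk back through the table: if the value changed between rows,
    # item i-1 was taken.
    chosen = []
    w = capacity
    for i in range(n, 0, -1):
        if K[i][w] != K[i - 1][w]:
            chosen.append(i - 1)
            w -= weights[i - 1]
    chosen.reverse()
    return K[n][capacity], chosen


best, items = knapsack_with_items(50, [10, 20, 30], [60, 100, 120])
print(best, items)
```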
There may be a better algorithm. My problem is that I have 150,000 values, each with a weight, and I need to select any number of them so that the total weight gets as close as possible to a ceiling value W. However, the data is actually a mixture of two different types of values, and the sum of the weights of each type also has its own ceiling, W1 and W2. I'd like to maximise all three sums using the same set of values. Any value chosen for W1 or W2 must also be chosen for W.
This knapsack code won't be very useful as I have 150k values with an average weight of 50 and a weight ceiling of 6mil. The time complexity given such large loops will be huge.
Based on the top comment I found a package that is very fast and allows for this:
import pandas as pd
from mip import Model, xsum, maximize, BINARY
all = pd.read_csv('df_all.csv')
X = pd.read_csv('df_x_only.csv')
Y = pd.read_csv('df_y_only.csv')
p = all.id.values # as we arent optimizing the value of the index, p is irrelevant and we replace xsum(p[i] with xsum(w[i] in the objective function
w = all.new_weights.values
w1 = X.new_weights.values
w2 = Y.new_weights.values
c = 5876834
Cx = 4902953
Cy = 719051.4
I = range(len(w))
I1 = range(len(w1))
I2 = range(len(w2))
m = Model("knapsack")
x = [m.add_var(var_type=BINARY) for i in I]
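# One objective: the sum of the three weighted totals.
# x[i] is indexed by position in `all`, so w1[i] and w2[i] must describe the same item as x[i].
m.objective = maximize(xsum(w[i] * x[i] for i in I) + xsum(w1[i] * x[i] for i in I1) + xsum(w2[i] * x[i] for i in I2))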
m += xsum(w[i] * x[i] for i in I) <= c*1.05
m += xsum(w[i] * x[i] for i in I) >= c*0.95
m += xsum(w1[i] * x[i] for i in I1) <= Cx*1.05
m += xsum(w1[i] * x[i] for i in I1) >= Cx*0.95
m += xsum(w2[i] * x[i] for i in I2) <= Cy*1.05
m += xsum(w2[i] * x[i] for i in I2) >= Cy*0.95
m.optimize()
selected = [i for i in I if x[i].x >= 0.99]

Split intervals longer than a threshold

I have a list of tuples, each defining an interval (start, end).
I would like to split the intervals which are longer than a certain threshold.
Example:
Initial list: segs = [(0,100),(120,140),(160,200)]
Threshold: 30
Desired output:
split_segs = [(0,30),(30,60),(60,90),(90,100),(120,140),(160,190),(190,200)]
I came up with this code.
thr = 30.
split_segs = []
for a,b in segs:
    if b-a < thr:
        split_segs.extend([(a,b)])
    else:
        n = int((b-a)/thr)
        for i in range(n):
            if b-(a + (i+1)*thr) < thr:
                split_segs.extend([(a+(i+1)*thr, b)])
            else:
                split_segs.extend([(a+i*thr, a+(i+1)*thr)])
It works but looks very clumsy to me. Any better or more pythonic solution?
You can do this slightly more elegantly by extending with a range that has a step of threshold:
segs = [(0,100),(120,140),(160,200)]
threshold = 30
split_segs = []
for seg in segs:
    (a, b) = seg
    diff = b - a
    if diff <= threshold:
        split_segs.append(seg)
    else:
        split_segs.extend((n - threshold, n) for n in range(a + threshold, b + 1, threshold))
        if diff % threshold:
            # complete the gap
            split_segs.append((b - diff % threshold, b))
print(split_segs)
This is a recursive solution for your problem:
segs = [(0,100),(120,140),(160,200)]
threshold = 30
def divide(to_divide):
    divided = []
    if to_divide[1] - to_divide[0] > threshold:
        divided.append((to_divide[0], to_divide[0] + threshold))
        divided.extend(divide((to_divide[0] + threshold, to_divide[1])))
        return divided
    else:
        return [to_divide]
divided = [el for x in segs for el in divide(x)]
print(divided)
The output will be:
[(0, 30), (30, 60), (60, 90), (90, 100), (120, 140), (160, 190), (190, 200)]
UPDATE: if you prefer a non-recursive solution, this is a possible one:
segs = [(0,100),(120,140),(160,200)]
threshold = 30
def divide(to_divide):
    divided = []
    divided.extend((to_divide[0] + i * threshold, to_divide[0] + (i+1) * threshold) for i in range((to_divide[1] - to_divide[0]) // threshold))
    if divided:
        if divided[-1][1] != to_divide[1]:
            divided.append((divided[-1][1], to_divide[1]))
    else:
        divided.append((to_divide[0], to_divide[1]))
    return divided
divided = [el for x in segs for el in divide(x)]
print(divided)

How many boxes can we put to the big one?

I'm trying to work on simple algorithms.
I have 600 snacks and two kinds of boxes: one holds 45 snacks and the other holds 60. I need to find all the ways the snacks can be packed into these boxes.
I have this code, but somehow it doesn't work correctly.
k = 0
for x in range(0,601):
    for y in range(0, int(600 // 45) + 1):
        for z in range(0, int(600 // 60) +1):
            if x +45 * y + 45 * z == 600:
                print(x,'45=',y,'60=',z)
                k=k+1
print(k)
If I got you right it is simple math. You have 600 items and want use these 600 items in boxes size of 45 and size of 60. I don’t know what you use x for?
k=0
for y in range(0,20):
    for z in range(0,20):
        if 45 * y + 60 * z == 600:
            print('45=',y,'60=',z)
            k = k + 1
print(k)
Result will be:
45= 0 60= 10
45= 4 60= 7
45= 8 60= 4
45= 12 60= 1
4
At a first glance, z seems to represent the box that can hold 60 snacks. So the line of code if x +45 * y + 45 * z == 600: does not seem ok. The multiplication factor for z should be 60, i.e., if x +45 * y + 60 * z == 600:
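With the factor for z corrected to 60, the loop over x is not needed, because x follows from y and z. The sketch below assumes x means loose snacks left outside the boxes; the question does not say what x is, so that reading and the function name fillings are only illustrative.

```python
def fillings(total=600, small=45, large=60):
    """Return (loose, n_small, n_large) tuples with
    loose + small * n_small + large * n_large == total."""
    found = []
    for y in range(total // small + 1):
        # z can only go as far as the snacks left after the small boxes
        for z in range((total - small * y) // large + 1):
            loose = total - small * y - large * z  # plays the role of x
            found.append((loose, y, z))
    return found


# Keep only the packings with no loose snacks
exact = [(y, z) for loose, y, z in fillings() if loose == 0]
print(exact)
print(len(exact))
```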
The answer is (EDIT: the solution is rewritten as functions):
Both functions return a list of tuples with the combinations found.
Both functions iterate over only one box size and filter by the second one.
The length of the list that both functions return is the number of options.
def box_comb(snacks, boxA, boxB):
    res = []
    for a in range(snacks // boxA + 1): # Iterate by boxA
        free_space = snacks - boxA * a
        if free_space % boxB == 0: # Filter by boxB
            b = free_space // boxB # Calculate the count of boxB
            res.append((a, b))
    return res
# Try this
comb = box_comb(snacks=600,
boxA=60,
boxB=45)
print(comb)
print(f"Number of combinations = {len(comb)}")
The output:
[(1, 12), (4, 8), (7, 4), (10, 0)]
Number of combinations = 4
Single line solution:
The same algorithm written as single line solution
def box_comb_2(snacks, boxA, boxB):
    return [(a, (snacks - a * boxA) // boxB) for a in range(snacks // boxA + 1)
            if (snacks - a * boxA) % boxB == 0]
# try this
comb = box_comb_2(snacks=600,
boxA=60,
boxB=45)
print(comb)
print(f"Number of combinations = {len(comb)}")
The output is
[(1, 12), (4, 8), (7, 4), (10, 0)]
Number of combinations = 4

Min. iterations drawing regular shape("turtle")

I'm trying to find the least number of iterations necessary to form a regular polygon without my "turtle" (shape) repeating its motion.... and noticed a strange(?) relationship which I cannot pinpoint.
If you run the code below and experiment with different values (NOTE: make sure to replace parameters 'x' & 'n' with actual numbers - of your choice):
import turtle
def draw_square():
    wn = turtle.Screen()
    wn.bgcolor("black")
    mike = turtle.Turtle()
    mike.shape("turtle")
    mike.color("yellow")
    mike.speed(100)
    count = 0
    while count < n: # replace n with number!
        mike.forward(100)
        mike.right(90)
        mike.forward(100)
        mike.right(90)
        mike.forward(100)
        mike.right(90)
        mike.forward(100)
        mike.right(x) # replace x with number!
if __name__ == "__main__":
    draw_square()
You will find the turtle moving in a circular(-ish) motion.
For example, you'll notice that when x = 100, min. value of n needed to form a regular shape is 36 (since 100°- 90°=10°; 360°/10°=36).
Further tests show:
x = 1, (min.) n = 360 # 360°/1° = 360
x = 5, (min.) n = 72 # 360°/5° = 72
x = 9, (min.) n = 10* # 360°/9° = 10*
x = 10, (min.) n = 9* # 360°/10° = 9*
x = 45, (min.) n = 8 # 360°/45° = 8
x = 90, (min.) n = 1* # 360°/90° = 4*
## NOTE: no obvs. solution for n, if x isn't factor of 360....
*: Strangely, you must divide the result by 4 to get min. value of n for some numbers.
I had initially thought it was to do with multiples of 9, or four rotations for square, but [above] led me to reject my hypotheses.
So anyone have any better ideas as to a generic rule? Cheers.
So anyone have any better ideas as to a generic rule?
I believe I've narrowed it down. There are some errors in your table. And there are four different types of exception, not just the "divide the result by 4" one. In fact, across the factors of 360, the exceptions occur more often than the simple 360 / x rule. The four exceptions are:
After computing n = 360 / x, if x is a:
A) multiple of 8 then n *= 4
B) multiple of 4 then n *= 2
C) multiple of 6 and not a multiple of 9 then n /= 2
D) multiple of 2 then n /= 4
The rules must be applied in the above order and only one rule can fire. If no rules apply, leave n as it is. The revised table for all factors of 360:
x = 1, n = 360 , 360° / 1° = 360
x = 2, n = 45 (/ 4), 360° / 2° = 180 (D)
x = 3, n = 120 , 360° / 3° = 120
x = 4, n = 180 (* 2), 360° / 4° = 90 (B)
x = 5, n = 72 , 360° / 5° = 72
x = 6, n = 30 (/ 2), 360° / 6° = 60 (C)
x = 8, n = 180 (* 4), 360° / 8° = 45 (A)
x = 9, n = 40 , 360° / 9° = 40
x = 10, n = 9 (/ 4), 360° / 10° = 36 (D)
x = 12, n = 60 (* 2), 360° / 12° = 30 (B)
x = 15, n = 24 , 360° / 15° = 24
x = 18, n = 5 (/ 4), 360° / 18° = 20 (D)
x = 20, n = 36 (* 2), 360° / 20° = 18 (B)
x = 24, n = 60 (* 4), 360° / 24° = 15 (A)
x = 30, n = 6 (/ 2), 360° / 30° = 12 (C)
x = 36, n = 20 (* 2), 360° / 36° = 10 (B)
x = 40, n = 36 (* 4), 360° / 40° = 9 (A)
x = 45, n = 8 , 360° / 45° = 8
x = 60, n = 12 (* 2), 360° / 60° = 6 (B)
x = 72, n = 20 (* 4), 360° / 72° = 5 (A)
x = 90, n = 1 (/ 4), 360° / 90° = 4 (D)
x = 120, n = 12 (* 4), 360° / 120° = 3 (A)
x = 180, n = 4 (* 2), 360° / 180° = 2 (B)
x = 360, n = 4 (* 4), 360° / 360° = 1 (A)
The code that generated the above table:
EXCEPTIONS = [
    ('A', lambda x: x % 8 == 0, lambda n: n * 4, "(* 4)"),
    ('B', lambda x: x % 4 == 0, lambda n: n * 2, "(* 2)"),
    ('C', lambda x: x % 6 == 0 and x % 9 != 0, lambda n: n // 2, "(/ 2)"),
    ('D', lambda x: x % 2 == 0, lambda n: n // 4, "(/ 4)"),
]
for x in range(1, 360 + 1):
    if 360 % x != 0:
        continue
    n = 360 // x
    for exception, test, outcome, explain in EXCEPTIONS:
        if test(x):
            n = outcome(n)
            exception = f"({exception})"
            break
    else: # no break
        exception = explain = '' # no rule applies
    angle = 360 // x
    print(f"x = {x:3}, n = {n:3} {explain:5}, 360° / {x:3}° = {angle:3} {exception}")
My rework of your code which I used to test individual table entries:
from turtle import Screen, Turtle
def draw_square(angle, repetitions):
    mike = Turtle("turtle")
    mike.speed('fastest')
    mike.color("yellow")
    count = 0
    while count < repetitions:
        mike.forward(100)
        mike.right(90)
        mike.forward(100)
        mike.right(90)
        mike.forward(100)
        mike.right(90)
        mike.forward(100)
        mike.right(angle)
        count += 1
if __name__ == "__main__":
    wn = Screen()
    wn.bgcolor("black")
    draw_square(9, 40)
    wn.exitonclick()
Further on from the rule-set identified by @cdlane, I've found a quick way to find just the min. number of iterations for any input x, regardless of whether it's a factor of 360 or not, needed to complete a regular shape! (Of course, I've also realised there will not be a minimum value at all for some, e.g. when x is 20.75.)
The code below shows my corrections to the faults identified and the addition of heading(), to check whether mike has returned to its original heading after the cycle(s):
import turtle
def draw_square(angle, repetitions):
    mike = turtle.Turtle()
    mike.shape("turtle")
    mike.color("red")
    mike.speed("fastest")
    count = 0
    while count < repetitions:
        mike.forward(100)
        mike.right(90)
        mike.forward(100)
        mike.right(90)
        mike.forward(100)
        mike.right(90)
        mike.forward(100)
        mike.right(angle)
        count += 1
        print("Turn ", count, "; ", mike.heading())
        if mike.heading() == 0:
            break
    print("Min. iterations needed to complete cycle: ", count)
if __name__ == "__main__":
    wn = turtle.Screen()
    wn.bgcolor("black")
    x = int(input("Enter angle: "))
    n = int(input("Enter boundary: ")) # Upper bound on iterations; a very large number is advisable, since the while loop will probably break before reaching it anyway
    draw_square(x, n)
    wn.exitonclick()
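The heading check above can also be done without drawing anything. Each pass through the loop turns the turtle right by 90 + 90 + 90 + x = 270 + x degrees, so the heading is back at 0 after the smallest n for which n * (270 + x) is a multiple of 360. The sketch below assumes x is a whole number of degrees; min_iterations is an illustrative name, not something from the posts above.

```python
from math import gcd


def min_iterations(x):
    """Smallest n such that n * (270 + x) is a multiple of 360 (x in whole degrees)."""
    turn_per_iteration = (270 + x) % 360
    # gcd(360, 0) is 360, so x = 90 gives 1: a single square already closes.
    return 360 // gcd(360, turn_per_iteration)


for x in (1, 2, 9, 10, 45, 90, 100, 360):
    print(f"x = {x:3}, n = {min_iterations(x):3}")
```

For the factors of 360 this agrees with the revised table above, and it gives 36 for x = 100, as in the question.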

Python generate all combinations of directions including diagonals in 3 dimensions

I want to generate all directions from a point in a 3D grid, but I can't quite get my head around the next bit. For the record it's all stored in a single list, so I need some maths to calculate where the next point will be.
I only really need 3 calculations to calculate any of the 26 or so different directions (up, up left, up left forwards, up left backwards, up right, up right forwards, etc), so I decided to work with X, Y, Z, then split them into up/down left/right etc, to then get the correct number to add or subtract. Generating this list to get the maths working however, seems to be the hard bit.
direction_combinations = 'X Y Z XY XZ YZ XYZ'.split()
direction_group = {}
direction_group['X'] = 'LR'
direction_group['Y'] = 'UD'
direction_group['Z'] = 'FB'
So basically, using the below code, this is the kind of stuff I'd like it to do, but obviously not have it hard coded. I could do it in a hacky way, but I imagine there's something really simple I'm missing here.
#Earlier part of the code to get this bit working
#I've also calculated the edges but it's not needed until after I've got this bit working
grid_size = 4
direction_maths = {}
direction_maths['U'] = pow(grid_size, 2)
direction_maths['R'] = 1
direction_maths['F'] = grid_size
direction_maths['D'] = -direction_maths['U']
direction_maths['L'] = -direction_maths['R']
direction_maths['B'] = -direction_maths['F']
#Bit to get working
starting_point = 25
current_direction = 'Y'
possible_directions = [direction_group[i] for i in list(current_direction)]
for y in list(possible_directions[0]):
    print(starting_point + direction_maths[y])
# 41 and 9 are adjacent on the Y axis
current_direction = 'XYZ'
possible_directions = [direction_group[i] for i in list(current_direction)]
for x in list(possible_directions[0]):
    for y in list(possible_directions[1]):
        for z in list(possible_directions[2]):
            print(starting_point + direction_maths[x] + direction_maths[y] + direction_maths[z])
# 44, 36, 12, 4, 46, 38, 14 and 6 are all adjacent on the corner diagonals
Here's a general idea of how the grid looks with the list indexes (using 4x4x4 as an example):
________________
/ 0 / 1 / 2 / 3 /
/___/___/___/___/
/ 4 / 5 / 6 / 7 /
/___/___/___/___/
/ 8 / 9 /10 /11 /
/___/___/___/___/
/12 /13 /14 /15 /
/___/___/___/___/
________________
/16 /17 /18 /19 /
/___/___/___/___/
/20 /21 /22 /23 /
/___/___/___/___/
/24 /25 /26 /27 /
/___/___/___/___/
/28 /29 /30 /31 /
/___/___/___/___/
________________
/32 /33 /34 /35 /
/___/___/___/___/
/36 /37 /38 /39 /
/___/___/___/___/
/40 /41 /42 /43 /
/___/___/___/___/
/44 /45 /46 /47 /
/___/___/___/___/
________________
/48 /49 /50 /51 /
/___/___/___/___/
/52 /53 /54 /55 /
/___/___/___/___/
/56 /57 /58 /59 /
/___/___/___/___/
/60 /61 /62 /63 /
/___/___/___/___/
Edit: Using the answers mixed with what I posted originally (wanted to avoid converting to and from 3D points if possible), this is what I ended up with to count the number of complete rows :)
def build_directions():
    direction_group = {}
    direction_group['X'] = 'LR'
    direction_group['Y'] = 'UD'
    direction_group['Z'] = 'FB'
    direction_group[' '] = ' '
    #Come up with all possible directions
    all_directions = set()
    for x in [' ', 'X']:
        for y in [' ', 'Y']:
            for z in [' ', 'Z']:
                x_directions = list(direction_group[x])
                y_directions = list(direction_group[y])
                z_directions = list(direction_group[z])
                for i in x_directions:
                    for j in y_directions:
                        for k in z_directions:
                            all_directions.add((i+j+k).replace(' ', ''))
    #Narrow list down to remove any opposite directions
    some_directions = all_directions
    opposite_direction = all_directions.copy()
    for i in all_directions:
        if i in opposite_direction:
            new_direction = ''
            for j in list(i):
                for k in direction_group.values():
                    if j in k:
                        new_direction += k.replace(j, '')
            opposite_direction.remove(new_direction)
    return opposite_direction
from collections import defaultdict
class CheckGrid(object):
    def __init__(self, grid_data):
        self.grid_data = grid_data
        self.grid_size = calculate_grid_size(self.grid_data)
        self.grid_size_squared = pow(self.grid_size, 2)
        self.grid_size_cubed = len(grid_data)
        self.direction_edges = {}
        self.direction_edges['U'] = range(self.grid_size_squared)
        self.direction_edges['D'] = range(self.grid_size_squared*(self.grid_size-1), self.grid_size_squared*self.grid_size)
        self.direction_edges['R'] = [i*self.grid_size+self.grid_size-1 for i in range(self.grid_size_squared)]
        self.direction_edges['L'] = [i*self.grid_size for i in range(self.grid_size_squared)]
        self.direction_edges['F'] = [i*self.grid_size_squared+j+self.grid_size_squared-self.grid_size for i in range(self.grid_size) for j in range(self.grid_size)]
        self.direction_edges['B'] = [i*self.grid_size_squared+j for i in range(self.grid_size) for j in range(self.grid_size)]
        self.direction_edges[' '] = []
        self.direction_maths = {}
        self.direction_maths['D'] = pow(self.grid_size, 2)
        self.direction_maths['R'] = 1
        self.direction_maths['F'] = self.grid_size
        self.direction_maths['U'] = -self.direction_maths['D']
        self.direction_maths['L'] = -self.direction_maths['R']
        self.direction_maths['B'] = -self.direction_maths['F']
        self.direction_maths[' '] = 0
    def points(self):
        total_points = defaultdict(int)
        opposite_directions = build_directions()
        all_matches = set()
        #Loop through each point
        for starting_point in range(len(self.grid_data)):
            current_player = self.grid_data[starting_point]
            if current_player:
                for i in opposite_directions:
                    #Get a list of directions and calculate movement amount
                    possible_directions = [list(i)]
                    possible_directions += [[j.replace(i, '') for i in possible_directions[0] for j in direction_group.values() if i in j]]
                    direction_movement = sum(self.direction_maths[j] for j in possible_directions[0])
                    #Build list of invalid directions
                    invalid_directions = [[self.direction_edges[j] for j in possible_directions[k]] for k in (0, 1)]
                    invalid_directions = [[item for sublist in j for item in sublist] for j in invalid_directions]
                    num_matches = 1
                    list_match = [starting_point]
                    #Use two loops for the opposite directions
                    for j in (0, 1):
                        current_point = starting_point
                        while current_point not in invalid_directions[j]:
                            current_point += direction_movement*int('-'[:j]+'1')
                            if self.grid_data[current_point] == current_player:
                                num_matches += 1
                                list_match.append(current_point)
                            else:
                                break
                    #Add a point if enough matches
                    if num_matches == self.grid_size:
                        list_match = tuple(sorted(list_match))
                        if list_match not in all_matches:
                            all_matches.add(list_match)
                            total_points[current_player] += 1
        return total_points
Here's basically the same thing that @AnnoSielder did, but it makes use of itertools to reduce the amount of code.
from itertools import product
# Get a list of all 26 possible ways to move from a given coordinate in a 3 coordinate system.
base_deltas = list(filter(lambda point: not all(axis == 0 for axis in point), product([-1, 0, 1], repeat=3)))
# Define your max axis length or your grid size
grid_size = 4
# Simple function that applies the deltas to the given coordinate and returns you the list.
def apply_deltas(deltas, coordinate):
    return [
        (coordinate[0]+x, coordinate[1]+y, coordinate[2]+z)
        for x, y, z in deltas
    ]
# This will determine whether the point is inside the bounds of the given grid
is_in_bounds = lambda point: all(0 <= axis < grid_size for axis in point)
# Define your point, in this case it's block #27 in your example
coordinate = [3, 2, 1]
# Apply the deltas, then keep only the points accepted by the is_in_bounds lambda
directions = list(filter(is_in_bounds, apply_deltas(base_deltas, coordinate)))
# directions is now the list of 17 coordinates that you could move to.
Don't make things unnecessarily complicated. Do not describe a point in 3 dimensions with 1 number: 3 coordinates means 3 numbers.
Should be something like this:
numb = 37
cube_size = 4
# convert to x - y - z
start = [0, 0, 0]
start[2] = numb // cube_size ** 2  # integer division keeps the coordinates whole
numb = numb % cube_size ** 2
start[1] = numb // cube_size
start[0] = numb % cube_size
for x in [-1, 0, 1]:
    current_x = start[0] + x
    for y in [-1, 0, 1]:
        current_y = start[1] + y
        for z in [-1, 0, 1]:
            current_z = start[2] + z
            #reconvert
            convert = current_x + current_y * cube_size + current_z * cube_size ** 2
            print("x: " + str(current_x) + " y: " + str(current_y) + " z: " + str(current_z) + " => " + str(convert))
Simply generate your x/y/z coordinates, then run through all the possibilities of adding -1/0/1 to these coordinates and re-convert each result to your number in the grid.
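The convert, offset and re-convert idea can be combined with a bounds check so that only cells inside the grid come back as flat indices. This is a sketch rather than code from either answer; the name neighbours is illustrative, and it assumes the layout index = x + y * size + z * size ** 2 used in the snippet above.

```python
from itertools import product


def neighbours(index, size):
    """Flat indices of the cells adjacent to `index` (faces, edges and corners)."""
    # Convert the flat index to x/y/z
    z, rest = divmod(index, size * size)
    y, x = divmod(rest, size)
    found = []
    for dx, dy, dz in product((-1, 0, 1), repeat=3):
        if (dx, dy, dz) == (0, 0, 0):
            continue  # skip the starting cell itself
        nx, ny, nz = x + dx, y + dy, z + dz
        # Drop anything that would leave the grid
        if all(0 <= c < size for c in (nx, ny, nz)):
            found.append(nx + ny * size + nz * size * size)
    return sorted(found)


print(neighbours(25, 4))
print(len(neighbours(27, 4)))
```

Checking the bounds on the x/y/z values avoids the wrap-around that happens when an offset is added directly to a flat index at the edge of the grid.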
