Find the longest arithmetic progression inside a sequence


Suppose I have a sequence of increasing numbers, and I want to find the length of the longest arithmetic progression within the sequence. Here an arithmetic progression means an increasing subsequence (not necessarily contiguous) with a common difference, such as [2, 4, 6, 8] or [3, 6, 9, 12].
For example,
for [5, 10, 14, 15, 17], [5, 10, 15] is the longest arithmetic progression, with length 3;
for [10, 12, 13, 20, 22, 23, 30], [10, 20, 30] is the longest arithmetic progression with length 3;
for [7, 10, 12, 13, 15, 20, 21], [10, 15, 20] or [7, 10, 13] are the longest arithmetic progressions with length 3.
This site
https://prismoskills.appspot.com/lessons/Dynamic_Programming/Chapter_22_-_Longest_arithmetic_progression.jsp
offers some insight into the problem: loop over the middle element j and consider every triple of elements (i, j, k) around it. I intend to use this algorithm in Python, and my code is as follows:
def length_of_AP(L):
    n = len(L)
    Table = [[0 for _ in range(n)] for _ in range(n)]
    length_of_AP = 2
    # initialise the last column of the table as all i and (n-1) pairs have length 2
    for i in range(n):
        Table[i][n-1] = 2
    # loop around the list and find i, k such that L[i] + L[k] = 2 * L[j]
    for j in range(n - 2, 0, -1):
        i = j - 1
        k = j + 1
        while i >= 0 and k < n:
            difference = (L[i] + L[k]) - 2 * L[j]
            if difference < 0:
                k = k + 1
            else:
                if difference > 0:
                    i = i - 1
                else:
                    Table[i][j] = Table[j][k] + 1
                    length_of_AP = max(length_of_AP, Table[i][j])
                    k = k + 1
                    i = i - 1
    return length_of_AP
This function works fine with [1, 3, 4, 5, 7, 8, 9], but it doesn't work for [5, 10, 14, 15, 20, 25, 26, 27, 28, 30, 31], where I am supposed to get 6 but I get 4. I suspect that the run 25, 26, 27, 28 inside the list is confusing my function. How do I change my function so that it gives me the desired result?
Any help would be appreciated.

Following your link and running the second sample, it looks like the code actually finds the proper LAP
5, 10, 15, 20, 25, 30,
but fails to find the proper length. I didn't spend too much time analyzing the code, but the piece
// Any 2-letter series is an AP
// Here we initialize only for the last column of lookup because
// all i and (n-1) pairs form an AP of size 2
for (int i=0; i<n; i++)
    lookup[i][n-1] = 2;
looks suspicious to me. It seems that you need to initialize the whole lookup table with 2 instead of just the last column; when I do so, it gets the correct length on your sample as well.
So get rid of the "initialise" loop and change your third line to the following code:
# initialise whole table with 2 as all (i, j) pairs have length 2
Table = [[2 for _ in range(n)] for _ in range(n)]
Moreover, their
Sample Execution:
Max AP length = 6
3, 5, 7, 9, 11, 13, 15, 17,
contains this bug as well and prints the correct sequence only by sheer luck. If I modify sortedArr to
int sortedArr[] = new int[] {3, 4, 5, 7, 8, 9, 11, 13, 14, 15, 16, 17, 18, 112, 113, 114, 115, 116, 117, 118};
I get the following output:
Max AP length = 7
112, 113, 114, 115, 116, 117, 118,
which is obviously wrong, as the original 8-item sequence 3, 5, 7, 9, 11, 13, 15, 17 is still there.
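Applied to the question's function, the fix suggested above might look like the following sketch. The name longest_ap_length and the early return for inputs shorter than three elements are my own additions, and the input is assumed to be strictly increasing.

```python
def longest_ap_length(seq):
    """Length of the longest arithmetic progression in a strictly increasing list."""
    n = len(seq)
    if n <= 2:
        return n  # assumption: 0, 1 or 2 elements trivially form a progression
    # table[i][j] = length of the longest progression that starts seq[i], seq[j];
    # every pair is a progression of length 2, so the whole table starts at 2
    table = [[2] * n for _ in range(n)]
    best = 2
    for j in range(n - 2, 0, -1):  # seq[j] is the middle element of a triple
        i, k = j - 1, j + 1
        while i >= 0 and k < n:
            total = seq[i] + seq[k]
            if total < 2 * seq[j]:
                k += 1
            elif total > 2 * seq[j]:
                i -= 1
            else:
                # seq[i], seq[j], seq[k] are equally spaced: extend the pair (j, k)
                table[i][j] = table[j][k] + 1
                best = max(best, table[i][j])
                i -= 1
                k += 1
    return best

print(longest_ap_length([5, 10, 14, 15, 20, 25, 26, 27, 28, 30, 31]))  # 6
```

Filling the whole table with 2 costs nothing extra, since the table is O(n^2) in size either way.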

Did you try it?
Here's a quick brute force implementation, for small datasets it should run fast enough:
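import itertools as it

def gen(seq):
    # every pair a < b gives a candidate start a and common difference b - a
    diff = ((b - a, a) for a, b in it.combinations(sorted(seq), 2))
    for d, n in diff:
        k = []
        # extend the progression while the next term is present in seq
        while n in seq:
            k.append(n)
            n += d
        yield (d, k)

def arith(seq):
    return max(gen(seq), key=lambda x: len(x[1]))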
In [1]: arith([7, 10, 12, 13, 15, 20, 21])
Out[1]: (3, [7, 10, 13])
In [2]: %timeit arith([7, 10, 12, 13, 15, 20, 21])
10000 loops, best of 3: 23.6 µs per loop
In [3]: seq = {random.randrange(1000) for _ in range(100)}
In [4]: arith(seq)
Out[4]: (171, [229, 400, 571, 742, 913])
In [5]: %timeit arith(seq)
100 loops, best of 3: 3.79 ms per loop
In [6]: seq = {random.randrange(1000000) for _ in range(1000)}
In [7]: arith(seq)
Out[7]: (81261, [821349, 902610, 983871])
In [8]: %timeit arith(seq)
1 loop, best of 3: 434 ms per loop

Related

How can I find the longest contiguous subsequence in a rising sequence in Python?

I need to find the longest contiguous subsequence in a rising sequence in Python.
For example if I have A = [1, 2, 3, 5, 8, 9, 11, 13, 17, 18, 19, 20, 21, 25, 27, 28, 29, 30]
The answer would be [17, 18, 19, 20, 21] because it's the longest contiguous subsequence with 5 numbers (whereas [1, 2, 3] is 3 numbers long and [27, 28, 29, 30] is 4 numbers long.)
My code is stuck in an endless loop
num_list = [1, 2, 3, 5, 8, 9, 11, 13, 17, 18, 19, 20, 21, 23, 25, 26, 27]
longest_sequence = {}
longest_sequence_length = 1
for num in num_list:
    sequence_length = 1
    while True:
        if (num + sequence_length) in num_list:
            sequence_length += 1
        else:
            if sequence_length > longest_sequence_length:
                longest_sequence_length_length = sequence_length
                longest_sequence = {"start": num, "end": num + (sequence_length - 1)}
            break
print(f"The longest sequence is {longest_sequence_length} numbers long"
      f" and it's between {longest_sequence['start']} and {longest_sequence['end']}")

You can use numpy to solve it in one line:
import numpy as np
A = [1, 2, 3, 5, 8, 9, 11, 13, 17, 18, 19, 20, 21, 25, 27, 28, 29, 30]
out = max(np.split(A, np.where(np.diff(A) != 1)[0] + 1), key=len).tolist()
You can also get the same result in plain Python in three steps.
(i) First, find the differences between consecutive elements in A; these are stored in diff (zip(A, A[1:]) pairs up consecutive elements).
(ii) Then split A at the indices where the difference is not 1; this is the second loop. If a difference is 1, append the value to the running sublist; otherwise, start a new sublist with that value.
(iii) Finally, use max() with key=len to find the longest sublist.
The numpy code above does exactly the same job.
diff = [j-i for i,j in zip(A, A[1:])]
splits = [[A[0]]]
for x,d in zip(A[1:], diff):
    if d == 1:
        splits[-1].append(x)
    else:
        splits.append([x])
out = max(splits, key=len)
Output:
[17, 18, 19, 20, 21]

In line 13 you need a break instead of a continue statement.
Also, in line 11 you made a small mistake: you added an extra "_length" to your variable name.
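With both corrections applied, the loop from the question can be written roughly as below. Wrapping it in a function called longest_run, replacing the while True / break pair with a loop condition, and looking values up in a set are my own choices rather than part of the original code.

```python
def longest_run(nums):
    """Return (length, start, end) of the longest run of consecutive integers."""
    present = set(nums)  # set membership is O(1), unlike `in` on a list
    best_length, best_start = 0, None
    for num in nums:
        length = 1
        # keep extending while the next consecutive value is present
        while num + length in present:
            length += 1
        if length > best_length:
            best_length, best_start = length, num
    if best_start is None:  # empty input
        return (0, None, None)
    return (best_length, best_start, best_start + best_length - 1)

num_list = [1, 2, 3, 5, 8, 9, 11, 13, 17, 18, 19, 20, 21, 23, 25, 26, 27]
length, start, end = longest_run(num_list)
print(f"The longest sequence is {length} numbers long and it's between {start} and {end}")
```

For the list from the question this reports a run of 5 numbers between 17 and 21.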

A variation of the knapsack problem where some items must be included but don't count towards the objective

I am trying to solve a variation of the 0/1 knapsack problem where n items must be picked, but only k < n items' values are counted towards the objective function.
My idea was to set up two vectors of binary variables, x and y - x denoting which n items are picked, and y denoting which k items are counted towards the objective - however my problem is ensuring that y is a subset of x.
I am using the python mip library, and here is the code I have so far (a slightly modified version of the knapsack example in the mip documentation):
from mip import Model, xsum, maximize, BINARY
values = [10, 13, 18, 31, 7, 15, 8, 11, 3, 9, 13, 12, 11, 6, 18, 11, 18, 13, 12, 11]
weights = [11, 15, 20, 24, 9, 16, 12, 3, 6, 9, 17, 13, 20, 9, 32, 14, 19, 20, 12, 13]
max_weight = 200
I = range(len(weights))
m = Model("knapsack")
x = [m.add_var(var_type=BINARY) for i in I]
y = [m.add_var(var_type=BINARY) for i in I]
m += xsum(x[i] for i in I) == 15 # n
m += xsum(y[i] for i in I) == 11 # k
m += xsum(weights[i] * x[i] for i in I) <= max_weight
m.objective = maximize(xsum(values[i] * y[i] for i in I))
# `m += xsum(x[i] * y[i] for i in I) == 11` doesn't work
m.optimize()
selected_x = [i for i in I if x[i].x >= 0.99]
selected_y = [i for i in I if y[i].x >= 0.99]
print("selected items: {}".format(selected_x))
print("selected items: {}".format(selected_y))
#Output:
# selected items: [0, 1, 2, 4, 6, 7, 8, 11, 12, 13, 15, 16, 17, 18, 19]
# selected items: [1, 2, 3, 5, 10, 11, 14, 16, 17, 18, 19]
Any help would be great, thank you.
Edit: for anyone finding this in the future, simply adding
for i in I:
    m += x[i] >= y[i]
works perfectly.
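To see why this linear constraint makes y a subset of x, it may help to enumerate the four binary combinations for a single item: the only one ruled out is an item that is counted (y = 1) without being picked (x = 0). The snippet below is a plain-Python illustration of that truth table and does not use the mip library.

```python
from itertools import product

# Enumerate every binary (x, y) pair and keep those satisfying x >= y.
allowed = [(x, y) for x, y in product((0, 1), repeat=2) if x >= y]
print(allowed)  # [(0, 0), (1, 0), (1, 1)]
```

Unlike the product x[i] * y[i] in the commented-out attempt, x[i] >= y[i] is linear in the variables, which is what a mixed-integer linear model needs.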

How to swap the new added number into the correct position in binary heap?

This homework question passes in a list where index 1 holds the newly added node, which is also the root. I then have to check whether its children are smaller than it and, if so, swap it with the smaller child. I've written some code, but it's not working.
def perc_down(data):
    count = 0
    index = 1
    l, r = 2 * index, 2 * index + 1
    while index < len(data):
        if data[index] > data[l] and data[index] > data[r]:
            min_i = data.index(min(data[l], data[r]))
            data[index], data[min_i] = data[min_i], data[index]
            count += 1
            index = min_i
    return count

values = [0, 100, 7, 8, 9, 22, 45, 12, 16, 27, 36]
swaps = perc_down(values)
print('Binary heap =',values)# should be [0, 7, 9, 8, 16, 22, 45, 12, 100, 27, 36]
print('Swaps =', swaps)# should be 3

Give l and r values inside the while loop
    while index <= len(data) // 2:
        l, r = 2 * index, 2 * index + 1
        if r >= len(data):
            r = index
        if data[index] > data[l] or data[index] > data[r]:
            min_i = data.index(min(data[l], data[r]))
            data[index], data[min_i] = data[min_i], data[index]
            count += 1
            index = min_i
            print(data) #Added this for easy debugging.
    return count
Also run the loop only up to the middle index, because in a binary min heap the nodes beyond len(data) // 2 have no children.
Output:
[0, 7, 100, 8, 9, 22, 45, 12, 16, 27, 36]
[0, 7, 9, 8, 100, 22, 45, 12, 16, 27, 36]
[0, 7, 9, 8, 16, 22, 45, 12, 100, 27, 36]
Binary heap = [0, 7, 9, 8, 16, 22, 45, 12, 100, 27, 36]
Swaps = 3
I revised the algorithm to handle indices whose children do not exist.
For values = [0, 100, 7, 11, 9, 8, 45, 12, 16, 27, 36], 100 arrives at index 5 after 2 swaps; index 5 has no right child, so when r exceeds the length of the list we just set it back to the current index.
Heapified list: Binary heap = [0, 7, 8, 11, 9, 36, 45, 12, 16, 27, 100].
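A variant of the loop above that stops as soon as the heap property holds, and that never indexes past the end of the list, might look like the sketch below. It works with child indices directly instead of calling data.index, which returns the first matching position and so could pick the wrong one if values repeat. The layout (index 0 unused, root at index 1) follows the question; the rest is my own arrangement.

```python
def perc_down(data):
    """Sift the item at index 1 down a 1-based binary min-heap; return the swap count."""
    count = 0
    index = 1
    last = len(data) - 1  # index 0 is an unused placeholder
    while 2 * index <= last:  # continue while a left child exists
        child = 2 * index
        # prefer the right child when it exists and is smaller
        if child + 1 <= last and data[child + 1] < data[child]:
            child += 1
        if data[index] <= data[child]:
            break  # heap property already holds here
        data[index], data[child] = data[child], data[index]
        index = child
        count += 1
    return count

values = [0, 100, 7, 8, 9, 22, 45, 12, 16, 27, 36]
swaps = perc_down(values)
print('Binary heap =', values)  # [0, 7, 9, 8, 16, 22, 45, 12, 100, 27, 36]
print('Swaps =', swaps)         # 3
```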

Selecting a random sample from a very large generator

I am trying to test some strategies for a game, which can be defined by 10 non-negative integers that add up to 100. There are 109 choose 9, or roughly 10^12 of these, so comparing them all is not practical. I would like to take a random sample of about 1,000,000 of these.
I have tried the methods from the answers to this question, and this one, but all still seem far too slow to work. The quickest method seems like it will take about 180 hours on my machine.
This is how I've tried to make the generator (adapted from a previous SE answer). For some reason, changing prob does not seem to impact the run time of turning it into a list.
def tuples_sum_sample(nbval,total, prob, order=True) :
    """
    Generate all the tuples L of nbval positive or nul integer
    such that sum(L)=total.
    The tuples may be ordered (decreasing order) or not
    """
    if nbval == 0 and total == 0 : yield tuple() ; raise StopIteration
    if nbval == 1 : yield (total,) ; raise StopIteration
    if total==0 : yield (0,)*nbval ; raise StopIteration
    for start in range(total,0,-1) :
        for qu in tuples_sum(nbval-1,total-start) :
            if qu[0]<=start :
                sol=(start,)+qu
                if order :
                    if random.random() <prob:
                        yield sol
                else :
                    l=set()
                    for p in permutations(sol,len(sol)) :
                        if p not in l :
                            l.add(p)
                            if random.random()<prob:
                                yield p
Rejection sampling seems like it would take about 3 million years, so this is out as well.
randsample = []
while len(randsample)<1000000:
    x = (random.randint(0,100),random.randint(0,100),random.randint(0,100),random.randint(0,100),random.randint(0,100),random.randint(0,100),random.randint(0,100),random.randint(0,100),random.randint(0,100),random.randint(0,100))
    if sum(x) == 100:
        randsample.append(x)
randsample
Can anyone think of another way to do this?
Thanks

A couple of frame-challenging questions:
Is there any reason you must generate the entire population, then sample that population?
Why do you need to check if your numbers sum to 100?
You can generate a set of numbers that sum to a value. Check out the first answer here:
Random numbers that add to 100: Matlab
Then generate the number of such sets you desire (1,000,000 in this case).
import numpy as np

def set_sum(number=10, total=100):
    initial = np.random.random(number-1) * total
    sort_list = np.append(initial, [0, total]).astype(int)
    sort_list.sort()
    set_ = np.diff(sort_list)
    return set_

if __name__ == '__main__':
    import timeit
    a = set_sum()
    n = 1000000
    sample = [set_sum() for i in range(n)]

Numpy to the rescue!
Specifically, you need a multinomial distribution:
import numpy as np
desired_sum = 100
n = 10
np.random.multinomial(desired_sum, np.ones(n)/n, size=1000000)
It outputs a matrix with a million rows of 10 random integers in a few seconds. Each row sums up to 100.
Here's a smaller example:
np.random.multinomial(desired_sum, np.ones(n)/n, size=10)
which outputs:
array([[ 8, 7, 12, 11, 11, 9, 9, 10, 11, 12],
[ 7, 11, 8, 9, 9, 10, 11, 14, 11, 10],
[ 6, 10, 11, 13, 8, 10, 14, 12, 9, 7],
[ 6, 11, 6, 7, 8, 10, 8, 18, 13, 13],
[ 7, 7, 13, 11, 9, 12, 13, 8, 8, 12],
[10, 11, 13, 9, 6, 11, 7, 5, 14, 14],
[12, 5, 9, 9, 10, 8, 8, 16, 9, 14],
[14, 8, 14, 9, 11, 6, 10, 9, 11, 8],
[12, 10, 12, 9, 12, 10, 7, 10, 8, 10],
[10, 7, 10, 19, 8, 5, 11, 8, 8, 14]])
The sums appear to be correct:
sum(np.random.multinomial(desired_sum, np.ones(n)/n, size=10).T)
# array([100, 100, 100, 100, 100, 100, 100, 100, 100, 100])
Python only
You could also start with a list of 10 zeroes, iterate 100 times and increment a random cell each time:
import random
desired_sum = 100
n = 10
row = [0] * n
for _ in range(desired_sum):
    row[random.randrange(n)] += 1
row
# [16, 7, 9, 7, 10, 11, 4, 19, 4, 13]
sum(row)
# 100
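One caveat: both approaches above draw from a multinomial distribution, which favours rows whose entries are close to 10 and is not uniform over all valid tuples. If every tuple should be equally likely, which is an assumption on my part about what the question wants, a stars-and-bars draw is one option: pick 9 bar positions out of the 109 slots mentioned in the question and read off the gaps. The function name random_composition is hypothetical.

```python
import random

def random_composition(parts=10, total=100):
    """Draw `parts` non-negative integers summing to `total`, uniformly at random."""
    slots = total + parts - 1  # 109 slots hold 100 stars and 9 bars
    bars = sorted(random.sample(range(slots), parts - 1))
    # sentinels on both ends, then count the stars between neighbouring bars
    edges = [-1] + bars + [slots]
    return tuple(b - a - 1 for a, b in zip(edges, edges[1:]))

sample = [random_composition() for _ in range(1000)]
print(sample[0], sum(sample[0]))
```

Each choice of bar positions corresponds to exactly one tuple, so choosing the positions uniformly makes the tuples uniform as well.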

Count total number of occurrences of given list of integers in another

How do I count the number of times the same integer occurs?
My code so far:
def searchAlgorithm (target, array):
    i = 0 #iterating through elements of target list
    q = 0 #iterating through lists sublists via indexes
    while q < 4:
        x = 0 #counting number of matches
        for i in target:
            if i in array[q]:
                x += 1
            else:
                x == 0
        print(x)
        q += 1

a = [8, 12, 14, 26, 27, 28]
b = [[4, 12, 17, 26, 30, 45], [8, 12, 19, 24, 33, 47], [3, 10, 14, 31, 39, 41], [4, 12, 14, 26, 30, 45]]
searchAlgorithm(a, b)
The output of this is:
2
2
1
3
What I want to achieve is counting how many times each result (1, 2 or 3 matches) occurs.
I have tried:
v = 0
if searchAlgorithm(a, b) == 2:
    v += 1
print(v)
But that results in 0

You can use intersection of sets to find elements that are common in both lists. Then you can get the length of the sets. Here is how it looks:
num_common_elements = (len(set(a).intersection(i)) for i in b)
You can then iterate over the generator num_common_elements to use the values. Or you can cast it to a list to see the results:
print(list(num_common_elements))
[Out]: [2, 2, 1, 3]
If you want to implement the intersection count yourself, you can use the built-in sum function. For lists without duplicate values, this is equivalent to len(set(x).intersection(set(y))):
sum(i in y for i in x)
This works because the generator yields values such as True, False, False, True, True, indicating whether each value of the first list is present in the second. sum then treats each True as 1 and each False as 0, giving the size of the intersection.
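Combined with collections.Counter, the set-based count also answers the original question of how often each number of matches occurs. A small sketch using the lists from the question (the names matches and tally are mine):

```python
from collections import Counter

a = [8, 12, 14, 26, 27, 28]
b = [[4, 12, 17, 26, 30, 45], [8, 12, 19, 24, 33, 47],
     [3, 10, 14, 31, 39, 41], [4, 12, 14, 26, 30, 45]]

# number of values each sublist shares with a
matches = [len(set(a).intersection(row)) for row in b]
# how many sublists produced each match count
tally = Counter(matches)

print(matches)   # [2, 2, 1, 3]
print(tally[2])  # 2
```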

This is based on what I understand from your question. Probably you are looking for this:
from collections import Counter

def searchAlgorithm (target, array):
    i = 0 #iterating through elements of target list
    q = 0 #iterating through lists sublists via indexes
    lst = []
    while q < 4:
        x = 0 #counting number of matches
        for i in target:
            if i in array[q]:
                x += 1
            else:
                x == 0
        lst.append(x)
        q += 1
    print(Counter(lst))

a = [8, 12, 14, 26, 27, 28]
b = [[4, 12, 17, 26, 30, 45], [8, 12, 19, 24, 33, 47], [3, 10, 14, 31, 39, 41], [4, 12, 14, 26, 30, 45]]
searchAlgorithm(a, b)
# Counter({2: 2, 1: 1, 3: 1})

Thanks to everyone for the helpful feedback. I have since come up with a simpler solution that does exactly what I want.
By storing the match counts in a list, I can return that list from the searchAlgorithm function and simply use .count() to count how many times a specific number of matches occurs in it.
def searchAlgorithm (target, array):
    i = 0
    q = 0
    results = []
    while q < 4:
        x = 0 #counting number of matches
        for i in target:
            if i in array[q]:
                x += 1
            else:
                x == 0
        results.append(x)
        q += 1
    return results

a = [8, 12, 14, 26, 27, 28]
b = [[4, 12, 17, 26, 30, 45], [8, 12, 19, 24, 33, 47], [3, 10, 14, 31, 39, 41], [4, 12, 14, 26, 30, 45]]
searchAlgorithm(a, b)
# the original post called this with winNum and lotto, which are not defined here; a and b stand in for them
d2 = (searchAlgorithm(a, b).count(2))
