I am trying to generate vectors of length k^n whose entries are numbers from [0, ..., k-1]. n and k are given in advance.
from itertools import product

k = 4
n = 2
args = list(product(range(k), repeat=n))
# vector=str([i for i in range(k)]*(n+1))
for i in product(range(k), repeat=k ** n):
    if (check(i, args)): print(i)
The commented line is not important; it was just an idea of mine.
I need to generate these vectors under one condition: each number from [0; k-1] must appear in a vector at least n times. So this is a task about permutations with replacement, with a special condition that controls which numbers I can get. What should I do?
For example, with k=2, n=2 I have vectors of 4 elements and want to see 0 and 1 each TWO or more times.
I should get 0011 0101 0110 1001 1010 1100
The example is easy, but with k=5, n=2 (for example) the vector has 25 elements: I want to see each of 0 1 2 3 4 at least 2 times, and the other 15 elements can be any of 0 1 2 3 4, so it becomes difficult.
UPDATE:
Here is a solution that generates the necessary combinations only. It is in principle faster, although the complexity is still exponential and you can quickly hit the limits of recursion.
def my_vectors(k, n):
    # Minimum repetitions per element
    base_repetitions = [n] * k
    # "Unassigned" repetitions
    rest = k ** n - k * n
    # List reused for permutation construction
    permutation = [-1] * (k ** n)
    # For each possible repetition assignment
    for repetitions in make_repetitions(base_repetitions, rest):
        # Make all possible permutations
        yield from make_permutations(repetitions, permutation)

# Finds all possible repetition assignments
def make_repetitions(repetitions, rest, first=0):
    if rest <= 0:
        yield repetitions
    else:
        for i in range(first, len(repetitions)):
            repetitions[i] += 1
            yield from make_repetitions(repetitions, rest - 1, i)
            repetitions[i] -= 1

# Make all permutations with repetitions
def make_permutations(repetitions, permutation, idx=0):
    if idx >= len(permutation):
        # Copy into a tuple so later mutation does not affect the result
        yield tuple(permutation)
        # If you are going to use the permutation within a loop only
        # maybe you can avoid copying the list and do just:
        # yield permutation
    else:
        for elem in range(len(repetitions)):
            if repetitions[elem] > 0:
                repetitions[elem] -= 1
                permutation[idx] = elem
                yield from make_permutations(repetitions, permutation, idx + 1)
                repetitions[elem] += 1

for v in my_vectors(3, 2):
    print(v)
Output:
(0, 0, 0, 0, 0, 1, 1, 2, 2)
(0, 0, 0, 0, 0, 1, 2, 1, 2)
(0, 0, 0, 0, 0, 1, 2, 2, 1)
(0, 0, 0, 0, 0, 2, 1, 1, 2)
(0, 0, 0, 0, 0, 2, 1, 2, 1)
(0, 0, 0, 0, 0, 2, 2, 1, 1)
(0, 0, 0, 0, 1, 0, 1, 2, 2)
(0, 0, 0, 0, 1, 0, 2, 1, 2)
(0, 0, 0, 0, 1, 0, 2, 2, 1)
(0, 0, 0, 0, 1, 1, 0, 2, 2)
...
This is an inefficient but simple way to implement it:
from itertools import product
from collections import Counter

def my_vectors(k, n):
    for v in product(range(k), repeat=k ** n):
        count = Counter(v)
        if all(count[i] >= n for i in range(k)):
            yield v

for v in my_vectors(3, 2):
    print(v)
Output:
(0, 0, 0, 0, 0, 1, 1, 2, 2)
(0, 0, 0, 0, 0, 1, 2, 1, 2)
(0, 0, 0, 0, 0, 1, 2, 2, 1)
(0, 0, 0, 0, 0, 2, 1, 1, 2)
(0, 0, 0, 0, 0, 2, 1, 2, 1)
(0, 0, 0, 0, 0, 2, 2, 1, 1)
(0, 0, 0, 0, 1, 0, 1, 2, 2)
(0, 0, 0, 0, 1, 0, 2, 1, 2)
(0, 0, 0, 0, 1, 0, 2, 2, 1)
(0, 0, 0, 0, 1, 1, 0, 2, 2)
...
Obviously, as soon as your numbers get slightly bigger it will take forever to run, so it is only useful either for very small problems or as a baseline for comparison.
In any case, the number of items that the problem produces is exponentially large anyway, so although you can make it significantly better (i.e. generate only the right elements instead of all the possible ones and discarding), it cannot be "fast" for any size.
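To get a feel for how quickly the output grows, the number of valid vectors can be counted without generating them. This is only a sketch under the same meaning of k and n as above; count_vectors is a hypothetical helper name, not part of either solution. It sums a multinomial coefficient over every way of distributing the "unassigned" repetitions.

```python
from math import factorial
from itertools import combinations_with_replacement


def count_vectors(k, n):
    """Count vectors of length k**n over 0..k-1 where each value appears >= n times."""
    length = k ** n
    rest = length - k * n  # positions left after the mandatory n copies of each value
    if rest < 0:
        return 0
    total = 0
    # Each multiset of size `rest` describes which values get the extra copies.
    for extra in combinations_with_replacement(range(k), rest):
        counts = [n] * k
        for value in extra:
            counts[value] += 1
        # Multinomial coefficient: length! / (c_0! * c_1! * ... * c_{k-1}!)
        ways = factorial(length)
        for c in counts:
            ways //= factorial(c)
        total += ways
    return total


print(count_vectors(2, 2))
print(count_vectors(3, 2))
```

For k=2, n=2 this agrees with the six vectors listed in the question.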
Related
Pretend I have a pandas Series that consists of 0s and 1s, but this can work with numpy arrays or any iterable. I would like to create a function that takes an array and an input n and returns a new series that contains 1s at the n indices leading up to every 1 in the original series. Here is an example:
array = np.array([0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1])
> preceding_indices_function(array, 2)
np.array([0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1])
For each time there is a 1 in the input array, the two indices preceding it are filled in with 1 regardless of whether there is a 0 or 1 in that index in the original array.
I would really appreciate some help on this. Thanks!
Use a convolution with np.convolve:
N = 2
# craft a custom kernel
kernel = np.ones(2*N+1)
kernel[-N:] = 0
# array([1, 1, 1, 0, 0])
out = (np.convolve(array, kernel, mode='same') != 0).astype(int)
Output:
array([0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1])
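For reuse, the same convolution can be wrapped in a function of n. This is a sketch rather than the answerer's code: preceding_ones is a hypothetical name, and it assumes the input has at least 2*n+1 elements, since mode='same' returns the length of the longer operand.

```python
import numpy as np


def preceding_ones(array, n):
    """Mark every position that is a 1 or has a 1 within the next n positions."""
    array = np.asarray(array)
    # Kernel of length 2n+1 whose first n+1 entries are 1, e.g. [1, 1, 1, 0, 0] for n=2.
    kernel = np.zeros(2 * n + 1)
    kernel[: n + 1] = 1
    # With mode='same', hits[i] = array[i] + array[i+1] + ... + array[i+n].
    hits = np.convolve(array, kernel, mode="same")
    return (hits != 0).astype(int)


array = np.array([0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1])
print(preceding_ones(array, 2))
```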
Unless you don't want to use numpy, mozway's convolution is the best solution.
But since several iterative solutions have been given, I'll add my itertools-based solution:
[a or b or c for a,b,c in itertools.zip_longest(array, array[1:], array[2:], fillvalue=0)]
zip_longest is the same as the classical zip, but if the iterators have different "lengths", the number of iterations is that of the longest one, and exhausted iterators yield None, unless you pass a fillvalue parameter to zip_longest.
So here itertools.zip_longest(array, array[1:], array[2:], fillvalue=0) gives a sequence of triplets (a,b,c) of 3 consecutive elements (a being the current element, b the next, c the one after; b and c are 0 if there is no next element or no element after the next).
From there, a simple comprehension builds a list of [a or b or c], which is 1 if a or b or c is 1, and 0 otherwise.
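The same idea extends from two preceding indices to any n by zipping n+1 shifted views. A minimal sketch, assuming the input supports slicing; preceding_ones_zip is a hypothetical name introduced only for illustration.

```python
from itertools import zip_longest


def preceding_ones_zip(array, n):
    """Return 1 where the element itself or any of the next n elements is truthy."""
    # array[0:], array[1:], ..., array[n:] line up each element with its successors.
    shifted = [array[i:] for i in range(n + 1)]
    # fillvalue=0 pads the shorter slices so the result keeps the input length.
    return [int(any(window)) for window in zip_longest(*shifted, fillvalue=0)]


data = [0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1]
print(preceding_ones_zip(data, 2))
```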
import numpy as np
array = np.array([0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1])
array = np.array([a or array[idx+1] or array[idx+2] for idx, a in enumerate(array[:-2])] + [array[-2] or array[-1]] + [array[-1]])
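This function works if array is a list, and should work with other mutable sequences as well (note that it modifies array in place):
def preceding_indices_function(array, n):
    for i in range(len(array)):
        if array[i] == 1:
            # set up to n preceding positions to 1, without running past the start
            for j in range(n):
                if i-j-1 >= 0:
                    array[i-j-1] = 1
    return array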
I got a solution that is similar to the other one but slightly simpler in my opinion:
>>> [1 if (array[i+1] == 1 or array[i+2] == 1) else x for i,x in enumerate(array) if i < len(array) - 2]
[0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1]
Is there a vectorized way to change all 1s that fall within offset of the first 1 into 0s (transform A into B)? I'm currently trying to do this on a numpy array with over 1 million items where speed is critical.
The 1s represent a signal trigger and the 0s represent no trigger. For example: Given an offset of 5, whenever there is a 1, the following 5 items must be 0 (to remove signal concurrency).
Example 1:
offset = 3
A = np.array([1, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 0])
B = np.array([1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0])
Example 2:
offset = 2
A = np.array([1, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 0])
B = np.array([1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0])
From the comments, it seems that the question is not only about using NumPy and …; the main objective is to speed up the code. Since you are using the partial solution mentioned by JohanC (which needs much more consideration for this question), I suggest the following methods:
def com_():
    n = 1
    for i in range(1, len(A)+1):
        if A[n-1] == 1:
            A[n:n+offset] = 0
            n += offset + 1
        else:
            n += 1
        if n > len(A):
            break

#nb.jit(forceobj=True)
def com_fast():
    B = A.tolist()
    n = 1
    while n < len(B):
        if B[n-1] == 1:
            for i in range(offset):
                if n+i < len(B):
                    B[n+i] = 0
            n += offset + 1
        else:
            n += 1
The first method uses A as a NumPy array with a loop. The second converts the input to a list and loops over it, and is accelerated by numba, as mentioned by hpaulj in the comments.
Using the same inputs (1,000,000 in length) for the methods, and running on Google Colab TPU:
1000 loops, best of 5: 153 ms per loop # for com_()
1000 loops, best of 5: 10.2 ms per loop # for com_fast()
I think these are acceptable run times for data that large.
I don't think this question can be solved with NumPy alone, or if it can, it would be very difficult and need a lot of thought (I tried and got good results, but in the end it still needed loops). My guess is that numba and similar libraries give comparable run times, so there is no need to restrict yourself to NumPy.
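For comparison, here is a sketch of the same rule that only visits the positions holding a 1 instead of every element, and returns a new array rather than changing A. It is not one of the timed methods above; suppress_within_offset is a hypothetical name and no timings are claimed for it.

```python
import numpy as np


def suppress_within_offset(a, offset):
    """Keep a 1 only if it is more than `offset` positions after the last kept 1."""
    a = np.asarray(a)
    out = np.zeros_like(a)
    next_allowed = 0  # first index at which a trigger may be kept again
    for idx in np.flatnonzero(a):
        if idx >= next_allowed:
            out[idx] = 1
            next_allowed = idx + offset + 1
    return out


A = np.array([1, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 0])
print(suppress_within_offset(A, 3))
```

The Python loop still runs once per 1 in the input, so for dense triggers it is not vectorized in any real sense.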
Given a pattern [1,1,0,1,1] and a binary list of length 100, [0,1,1,0,0,...,0,1], I want to count the number of occurrences of this pattern in the list. Is there a simple way to do this without tracking each item at every index with a variable?
Note something like this, [...,1, 1, 0, 1, 1, 1, 1, 0, 1, 1,...,0] can occur but this should be counted as 2 occurrences.
Convert your list to string using join. Then do:
text.count(pattern)
If you need to count overlapping matches then you will have to use regex matching or define your own function.
Edit
Here is the full code:
def overlapping_occurences(string, sub):
    count = start = 0
    while True:
        start = string.find(sub, start) + 1
        if start > 0:
            count+=1
        else:
            return count

given_list = [1, 1, 0, 1, 1, 1, 1, 0, 1, 1]
pattern = [1,1,0,1,1]
text = ''.join(str(x) for x in given_list)
print(text)
pattern = ''.join(str(x) for x in pattern)
print(pattern)
print(text.count(pattern)) #for no overlapping
print(overlapping_occurences(text, pattern))
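The regex route for overlapping matches can be sketched with a zero-width lookahead, which matches at every start position without consuming characters. count_overlapping is a hypothetical helper name, and the inputs are assumed to be the joined digit strings built as in the code above.

```python
import re


def count_overlapping(text, pattern):
    """Count occurrences of pattern in text, including overlapping ones."""
    # (?=...) is a lookahead: it succeeds at each position where the pattern
    # starts, but consumes nothing, so overlapping starts are all found.
    return len(re.findall("(?=" + re.escape(pattern) + ")", text))


print(count_overlapping("1101111011", "11011"))
print(count_overlapping("11011011", "11011"), "11011011".count("11011"))
```

The second line contrasts the overlapping count with str.count, which skips past each match it finds.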
l1 = [1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 0]
l1str = str(l1).replace(" ", "").replace("[", "").replace("]", "")
l3 = [1, 1, 0, 1, 1]
l3str = str(l3).replace(" ", "").replace("[", "").replace("]", "")
l1str = l1str.replace(l3str, "foo")
foo = l1str.count("foo")
print(foo)
You can always use the naive way:
loop over slices of the list (each slice starts at the i-th index and ends at i + [length of pattern]).
You can improve it: notice that if you found an occurrence at index i, you can skip i+1 and i+2 and check from i+3 onwards (meaning you can check whether the pattern contains a sub-pattern that will ease your search).
It costs O(n*m).
You can also use backwards convolution (a convolution-based pattern matching algorithm).
This costs O(n*log(n)), which is better.
I think a simple regex would suffice:
import re

def find(sample_list):
    list_1 = [1,1,0,1,1]
    # "1, 1, 0, 1, 1": the list's string form without the brackets
    str_1 = str(list_1)[1:-1]
    print(len(re.findall(str_1, str(sample_list))))
Hope this solves your problem.
from collections import Counter

a = [1,1,0,1,1]
b = [1,1,0,1,1,1,1,0,1,1]
lst = list()
# collect every window of len(a) consecutive elements of b
for i in range(len(b)-len(a)+1):
    lst.append(tuple(b[i:i+len(a)]))
c = Counter(lst)
print(c[tuple(a)])
output
2
The loop can be written in one line, for "cleaner" but less readable code:
lst = [tuple(b[i:i+len(a)]) for i in range(len(b)-len(a)+1)]
NOTE: I'm using tuples because they are immutable objects and can be hashed.
You can also use the hash idea and create your own hash method, such as multiplying each value by 10 raised to its position, e.g.
[1,0,1] = 1 * 1 + 0 * 10 + 1 * 100 = 101
That way you can make a single pass over the list and check whether it contains the pattern by simply checking if the value of the current sub-list == 101.
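The single-pass idea can be sketched as a rolling value that is updated as each element arrives. Note the assumptions: this version uses base 2 instead of base 10, which is only valid because the data is binary, and count_with_rolling_value is a hypothetical name.

```python
def count_with_rolling_value(data, pattern):
    """Count (possibly overlapping) occurrences of a 0/1 pattern in a 0/1 list."""
    m = len(pattern)
    if m == 0 or len(data) < m:
        return 0
    # Encode the pattern as an integer, first element as the most significant bit.
    target = 0
    for bit in pattern:
        target = (target << 1) | bit
    mask = (1 << m) - 1  # keeps only the last m bits of the window
    window = 0
    count = 0
    for i, bit in enumerate(data):
        # Shift in the new element and drop the one that left the window.
        window = ((window << 1) | bit) & mask
        if i >= m - 1 and window == target:
            count += 1
    return count


print(count_with_rolling_value([1, 1, 0, 1, 1, 1, 1, 0, 1, 1], [1, 1, 0, 1, 1]))
```

Because the encoding is exact for a fixed window length, equal values mean equal windows, so no second comparison is needed.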
You can solve it using following two steps:
Combine all elements of the list in a single string
Use python count function to match the pattern in the string
a_new = ''.join(map(str,a))
pattern = ''.join(map(str,pattern))
a_new.count(pattern)
You can divide the lookup list into chunks the size of the pattern you are looking for. You can achieve this with a simple recipe that uses itertools.islice to yield a sliding window iterator:
>>> from itertools import islice
>>> p = [1,1,0,1,1]
>>> l = [0,1,1,0,0,0,1,1,0,1,1,1,0,0,1]
>>> [tuple(islice(l,k,len(p)+k)) for k in range(len(l)-len(p)+1)]
This will give you output like:
[(0, 1, 1, 0, 0), (1, 1, 0, 0, 0), (1, 0, 0, 0, 1), (0, 0, 0, 1, 1), (0, 0, 1, 1, 0), (0, 1, 1, 0, 1), (1, 1, 0, 1, 1), (1, 0, 1, 1, 1), (0, 1, 1, 1, 0), (1, 1, 1, 0, 0), (1, 1, 0, 0, 1)]
Now you can use collections.Counter to count the occurrence of each sublist in sequence like
>>> from collections import Counter
>>> c = Counter([tuple(islice(l,k,len(p)+k)) for k in range(len(l)-len(p)+1)])
>>> c
Counter({(0, 1, 1, 0, 1): 1, (1, 1, 1, 0, 0): 1, (0, 0, 1, 1, 0): 1, (0, 1, 1, 1, 0): 1, (1, 1, 0, 0, 0): 1, (0, 0, 0, 1, 1): 1, (1, 1, 0, 1, 1): 1, (0, 1, 1, 0, 0): 1, (1, 0, 1, 1, 1): 1, (1, 1, 0, 0, 1): 1, (1, 0, 0, 0, 1): 1})
To fetch frequency of your desired sequence use
>>> c.get(tuple(p),0)
1
Note I have used tuple everywhere as dict keys since list is not a hashable type in python so cannot be used as dict keys.
You can try a range-based approach:
pattern_data=[1,1,0,1,1]
data=[1,1,0,1,1,0,0,0,0,1,1,1,1,0,0,1,1,0,1,1,1,1,0,1,1,0,0,0,0,0,1,1,0,1,0,1,1,0,1,1,1,1,0,1,1,0,0,0,0,0,0,0,1,1,0,1,1,0,1,1,1,1,0,1,1,0,0,0,0,0,0,0,0,0,0,1,1,0,1,1,1,1,0,1,1,0,0,0,0,0,0,0,0,0,0,1,1,0,1,1]
count=0
for i in range(0,len(data),1):
    # compare the window starting at i with the pattern
    if data[i:i+len(pattern_data)]==pattern_data:
        print(i,data[i:i+len(pattern_data)])
        count+=1
print(count)
output:
0 [1, 1, 0, 1, 1]
15 [1, 1, 0, 1, 1]
20 [1, 1, 0, 1, 1]
35 [1, 1, 0, 1, 1]
40 [1, 1, 0, 1, 1]
52 [1, 1, 0, 1, 1]
55 [1, 1, 0, 1, 1]
60 [1, 1, 0, 1, 1]
75 [1, 1, 0, 1, 1]
80 [1, 1, 0, 1, 1]
95 [1, 1, 0, 1, 1]
11
I'm new to python and FFT. I have taken a small task in Python to find the shuffling order for a given number of datapoints.
My objective is to have an output like below for N datapoints. Here N=8, so we have 3 sets:
[0, 1, 0, 1, 0, 1, 0, 1]
[0, 0, 1, 1, 0, 0, 1, 1]
[0, 0, 0, 0, 1, 1, 1, 1]
The code I tried is below. Could someone help me see where I went wrong and suggest modifications to the code to produce the desired output?
import math

le=8
steps=int(math.ceil(math.log(le,2)))
pos2=[]
m=0
for k in range(0,steps):
    x=2**k
    #print x
    pos1=[]
    for i in range(0,le):
        if m<x:
            pos1.append(0)
            m=m+1
        else:
            pos1.append(1)
            m=0
    pos2.append(pos1)
You immediately get back to appending 0s after appending only one 1. Here is a working version with slightly different logic:
import math

le = 8
steps = int(math.ceil(math.log(le, 2)))
pos2 = []
for k in range(0, steps):
    x = 2**k
    pos1 = []
    # alternate runs of x zeros and x ones until the row is full
    while len(pos1) < le:
        for i in range(0, x):
            pos1.append(0)
        for i in range(0, x):
            pos1.append(1)
    pos2.append(pos1)
    print(pos1)
This will print
[0, 1, 0, 1, 0, 1, 0, 1]
[0, 0, 1, 1, 0, 0, 1, 1]
[0, 0, 0, 0, 1, 1, 1, 1]
and here is a one-liner for you to examine:
import math

le = 8
pos2 = [[(i // 2**k) % 2 for i in range(le)] for k in range(int(math.ceil(math.log(le, 2))))]
print(pos2)
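Each row is just bit k of the index i, so the same table can be sketched with integer bit operations, which avoids relying on a floating-point logarithm. bit_patterns is a hypothetical name, and le is assumed to be at least 1.

```python
def bit_patterns(le):
    """Row k holds bit k of each index 0..le-1."""
    # Number of bits needed to represent the largest index, le - 1.
    steps = (le - 1).bit_length()
    return [[(i >> k) & 1 for i in range(le)] for k in range(steps)]


for row in bit_patterns(8):
    print(row)
```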
The goal is to create a list of 99 elements. All elements must be 1s or 0s. The first element must be a 1. There must be 7 1s in total.
import random
import math
import time

# constants determined through testing
generation_constant = 0.96

def generate_candidate():
    coin_vector = []
    coin_vector.append(1)
    for i in range(0, 99):
        random_value = random.random()
        if (random_value > generation_constant):
            coin_vector.append(1)
        else:
            coin_vector.append(0)
    return coin_vector

def validate_candidate(vector):
    vector_sum = sum(vector)
    sum_test = False
    if (vector_sum == 7):
        sum_test = True
    first_slot = vector[0]
    first_test = False
    if (first_slot == 1):
        first_test = True
    return (sum_test and first_test)

vector1 = generate_candidate()
while (validate_candidate(vector1) == False):
    vector1 = generate_candidate()
print(vector1, sum(vector1), validate_candidate(vector1))
Most of the time, the output is correct, saying something like
[1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 1, 0, 0, 0] 7 True
but sometimes, the output is:
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] 2 False
What exactly am I doing wrong?
I'm not certain I understand your requirements, but here's what it sounds like you need:
#!/usr/bin/python3
import random
ones = [ 1 for i in range(6) ]
zeros = [ 0 for i in range(99 - 6) ]
list_ = ones + zeros
random.shuffle(list_)
list_.insert(0, 1)
print(list_)
print(list_.count(1))
print(list_.count(0))
HTH
The algorithm you gave works, though it's slow. Note that the ideal generation_constant can actually be calculated using the binomial distribution. The optimum is ≈0.928571429 which will fit the conditions 1.104% of the time. If you set the first element to 1 manually, then the optimum generation_constant is ≈0.93877551 which will fit the conditions 16.58% of the time.
The above is based on the binomial distribution, which says that the probability of having exactly k "success" events out of N total tries, where each try has probability p, is P(k | N, p) = N! * p^k * (1 - p)^(N - k) / (k! * (N - k)!). Just stick that into Excel, Mathematica, or a graphing calculator and maximize P.
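The formula is also easy to evaluate directly in Python. A small sketch, assuming Python 3.8+ for math.comb; binomial_pmf is a hypothetical helper name, and the call below takes the case of 6 ones among the 98 remaining slots, where the per-slot probability of a 1 is 1 - generation_constant.

```python
from math import comb


def binomial_pmf(k, N, p):
    """Probability of exactly k successes in N independent tries of probability p."""
    # comb(N, k) equals N! / (k! * (N - k)!)
    return comb(N, k) * p ** k * (1 - p) ** (N - k)


# For fixed k and N, the probability is largest at p = k / N.
p_best = 6 / 98
print(1 - p_best, binomial_pmf(6, 98, p_best))
```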
Alternatively:
To generate a list of 99 numbers where the first and 6 additional items are 1 and the remaining elements are 0, you don't need to call random.random so much. Generating pseudo-random numbers is very expensive.
There are two ways to avoid calling random so much.
The most processor efficient way is to only call random 6 times, for the 6 ones you need to insert:
import random

# create vector of 99 0's
vector = [0 for i in range(99)]
# set first element to 1
vector[0] = 1
# list of locations of all 0's (a real list, so that pop() is available)
indexes = list(range(1, 99))
# only need to loop 6 times for remaining 6 ones
for i in range(6):
    # select one of the 0 locations at random
    # "pop" it from the list so it can't be selected again
    # and set its corresponding element in vector to 1.
    vector[indexes.pop(random.randint(0, len(indexes) - 1))] = 1
Alternatively, to save on memory, you can just test each new index to make sure it will actually set something:
import random

# create vector of 99 0's
vector = [0 for i in range(99)]
# only need to loop 7 times
for i in range(7):
    index = 0 # first element is set to 1 first
    while vector[index] == 1: # keep calling random until a 0 is found
        index = random.randint(0, 98) # random index to check/set
    vector[index] = 1 # set the random (or first) element to 1
The second one will always set the first element to 1 first, because index = random.randint(0, 98) only ever gets called if vector[0] == 1.
With genetic programming you want to control your domain so that invalid configurations are eliminated as much as possible. The fitness is supposed to rate valid configurations, not eliminate invalid ones. Honestly, this problem doesn't really seem to be a good fit for genetic programming. You have outlined the domain, but I don't see a fitness description anywhere.
Anyway, that being said, the way I would populate the domain would be: since the first element is always 1, ignore it; since the remaining 98 elements contain only 6 ones, shuffle 6 ones into 92 zeros. Or even enumerate the possibilities, as your domain isn't very large.
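One way to sketch that population step is to choose the 6 positions directly with random.sample, which never picks the same index twice. make_vector and its parameters are hypothetical names used only for illustration.

```python
import random


def make_vector(length=99, ones=7):
    """Build a 0/1 list with a leading 1 and `ones` 1s in total."""
    vector = [0] * length
    vector[0] = 1
    # Pick the remaining ones - 1 distinct positions among indices 1..length-1.
    for idx in random.sample(range(1, length), ones - 1):
        vector[idx] = 1
    return vector


v = make_vector()
print(len(v), sum(v), v[0])
```

Every vector produced this way is valid by construction, so no validation loop is needed.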
I had a feeling it was your use of sum(), but sum() does not modify the list in place, so that is not the cause:
>>> mylist = [1,2,3,4]
>>> sum(mylist)
10
>>> mylist
[1, 2, 3, 4]
Here's a (somewhat) pythonic recursive version
import random

def generate_vector():
    generation_constant = .96
    myvector = [1]+[ 1 if random.random() > generation_constant else 0 for i in range(0,99)]
    mysum = 0
    for a in myvector:
        mysum = (mysum + a)
    if mysum == 7 and myvector[0]==1:
        return myvector
    # invalid candidate: try again
    return generate_vector()
and for good measure
def generate_test():
    for i in range(0,10000):
        vector = generate_vector()
        total = 0
        for a in vector:
            total = total + a
        # print only the vectors that violate the conditions
        if total != 7 or vector[0]!=1:
            print(vector)
output:
>>> generate_test()
>>>