Iterating over pairs of characters in a string in Python - python

Here is a combination/permutation generator function that accepts any number of lists of values and a template string of any length. For a given template string, the generator yields lists of combinations or permutations of values that match the template.
The answer was given by a colleague here: https://stackoverflow.com/a/48121777/8820330
from itertools import permutations, combinations, product
def groups(sources, template, mode='P'):
    func = permutations if mode == 'P' else combinations
    keys = sources.keys()
    combos = [func(sources[k], template.count(k)) for k in keys]
    for t in product(*combos):
        d = {k: iter(v) for k, v in zip(keys, t)}
        yield [next(d[k]) for k in template]
# tests
sources = {
    'a': [0, 1, 2],
    'b': [3, 4, 5],
    'c': [6, 7, 8],
}
templates = 'aa', 'abc', 'abba', 'cab'
for template in templates:
    print('\ntemplate', template)
    for i, t in enumerate(groups(sources, template, mode='C'), 1):
        print(i, t)
Output:
template aa
1 [0, 1]
2 [0, 2]
3 [1, 2]
template abc
1 [0, 3, 6]
2 [0, 3, 7]
3 [0, 3, 8]
4 [0, 4, 6]
5 [0, 4, 7]
6 [0, 4, 8]
7 [0, 5, 6]
8 [0, 5, 7]
9 [0, 5, 8]
10 [1, 3, 6]
11 [1, 3, 7]
12 [1, 3, 8]
13 [1, 4, 6]
14 [1, 4, 7]
15 [1, 4, 8]
16 [1, 5, 6]
17 [1, 5, 7]
18 [1, 5, 8]
19 [2, 3, 6]
20 [2, 3, 7]
21 [2, 3, 8]
22 [2, 4, 6]
23 [2, 4, 7]
24 [2, 4, 8]
25 [2, 5, 6]
26 [2, 5, 7]
27 [2, 5, 8]
template abba
1 [0, 3, 4, 1]
2 [0, 3, 5, 1]
3 [0, 4, 5, 1]
4 [0, 3, 4, 2]
5 [0, 3, 5, 2]
6 [0, 4, 5, 2]
7 [1, 3, 4, 2]
8 [1, 3, 5, 2]
9 [1, 4, 5, 2]
template cab
1 [6, 0, 3]
2 [7, 0, 3]
3 [8, 0, 3]
4 [6, 0, 4]
5 [7, 0, 4]
6 [8, 0, 4]
7 [6, 0, 5]
8 [7, 0, 5]
9 [8, 0, 5]
10 [6, 1, 3]
11 [7, 1, 3]
12 [8, 1, 3]
13 [6, 1, 4]
14 [7, 1, 4]
15 [8, 1, 4]
16 [6, 1, 5]
17 [7, 1, 5]
18 [8, 1, 5]
19 [6, 2, 3]
20 [7, 2, 3]
21 [8, 2, 3]
22 [6, 2, 4]
23 [7, 2, 4]
24 [8, 2, 4]
25 [6, 2, 5]
26 [7, 2, 5]
27 [8, 2, 5]
As you can see, the generator supports template strings built from single-character keys. However, I would like to know how to adapt the generator so that it supports templates whose keys are longer than one character, for example:
sources = {
    'a1': [0, 1, 2],
    'a2': [3, 4, 5],
    'a3': [6, 7, 8],
}
templates = 'a1a2a3'
It seems to me that the problem lies in how the template string is iterated: the loop visits individual characters, whereas in my case it would have to visit pairs of characters. How can this be done?

Define your templates like this:
sources = {
    'a1': [0, 1, 2],
    'a2': [3, 4, 5],
    'a3': [6, 7, 8],
}
templates = [['a1', 'a2', 'a3']]
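And then it will work out of the box. The reason is that groups only iterates over the template and calls template.count(k), and a list supports both operations just as a string does. A Python string is an immutable sequence of one-character strings, so 'abc' and ['a', 'b', 'c'] behave the same here; a list simply lets each element be a multi-character key such as 'a1'.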
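If the templates really do arrive as plain strings such as 'a1a2a3', one possible approach is to cut them into fixed-width keys before calling groups. The sketch below assumes every key is exactly two characters long; split_keys and its width parameter are hypothetical names introduced only for this illustration, while groups is the generator from the question.

```python
from itertools import combinations, permutations, product

def groups(sources, template, mode='P'):
    # Generator from the question; template may be any sequence of keys.
    func = permutations if mode == 'P' else combinations
    keys = sources.keys()
    combos = [func(sources[k], template.count(k)) for k in keys]
    for t in product(*combos):
        d = {k: iter(v) for k, v in zip(keys, t)}
        yield [next(d[k]) for k in template]

def split_keys(template, width=2):
    # Hypothetical helper: cut 'a1a2a3' into ['a1', 'a2', 'a3'].
    # Assumes every key has the same length (width characters).
    return [template[i:i + width] for i in range(0, len(template), width)]

sources = {
    'a1': [0, 1, 2],
    'a2': [3, 4, 5],
    'a3': [6, 7, 8],
}
template = split_keys('a1a2a3')
results = list(groups(sources, template, mode='C'))
print(template)
print(len(results), results[0], results[-1])
```

Passing the raw string 'a1a2a3' instead would fail with a KeyError, because iterating over a string yields the single characters 'a', '1', and so on, none of which is a key of sources.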

Related

Convert one-dimensional array to two-dimensional array so that each element is a row in the result

I want to know how to convert this: array([0, 1, 2, 3, 4, 5]) to this:
array([[0, 0, 0],
[1, 1, 1],
[2, 2, 2],
[3, 3, 3],
[4, 4, 4],
[5, 5, 5]])
In short, given a flat array, repeat each element inside the array n times, so that each element creates a sub-array of n of the same element, and concatenate these sub-arrays into one, so that each row contains an element from the original array repeated n times.
I can do this:
def repeat(lst, n):
    return [[e]*n for e in lst]
>>> repeat(range(10), 4)
[[0, 0, 0, 0],
[1, 1, 1, 1],
[2, 2, 2, 2],
[3, 3, 3, 3],
[4, 4, 4, 4],
[5, 5, 5, 5],
[6, 6, 6, 6],
[7, 7, 7, 7],
[8, 8, 8, 8],
[9, 9, 9, 9]]
How to do this in NumPy?
You can use numpy's repeat like this:
np.repeat(range(10), 4).reshape(10,4)
which gives:
[[0 0 0 0]
[1 1 1 1]
[2 2 2 2]
[3 3 3 3]
[4 4 4 4]
[5 5 5 5]
[6 6 6 6]
[7 7 7 7]
[8 8 8 8]
[9 9 9 9]]
You can use tile that handles dimensions:
a = np.array([0, 1, 2, 3, 4, 5])
N = 4
np.tile(a[:,None], (1, N))
# or
np.tile(a, (N, 1)).T
or broadcast_to:
np.broadcast_to(a, (N, a.shape[0])).T
# or
np.broadcast_to(a[:,None], (a.shape[0], N))
Or multiply by an array of ones:
a[:,None]*np.ones(N, dtype=a.dtype)
output:
array([[0, 0, 0, 0],
[1, 1, 1, 1],
[2, 2, 2, 2],
[3, 3, 3, 3],
[4, 4, 4, 4],
[5, 5, 5, 5]])

Efficiently structure data into series by ID

I am looking at a very large structured dataset that I would like to make unstructured. Here is the example…
x1 x2 x3 day id
1 5 9 2 A
9 7 9 3 B
3 1 4 1 A
2 6 5 1 B
3 5 8 2 B
3 2 3 2 C
The rows above are presented in a random order. Another way to think of this example is as follows…
x = [[1, 5, 9, 2, "A"],
     [9, 7, 9, 3, "B"],
     [3, 1, 4, 1, "A"],
     [2, 6, 5, 1, "B"],
     [3, 5, 8, 2, "B"],
     [3, 2, 3, 2, "C"]]
Once processed, the desired output is…
[[[3, 1, 4, 1], [1, 5, 9, 2]],
[[2, 6, 5, 1], [3, 5, 8, 2], [9, 7, 9, 3]],
[[3, 2, 3, 2]]],
[[1, A], [1,B], [2,C]]
The first list has the x variables, and the second list has the start date with each identifier.
I have an idea of how to achieve this, but it is in O(n^3). Is there a more efficient method, maybe in O(nlogn)?
Edit: Although mentioned in my previous post, I have made it clearer that the rows are presented in random order. I have also removed redundant column in the code example.
Try:
x = [
    [3, 1, 4, 1, 1, "A"],
    [1, 5, 9, 2, 2, "A"],
    [2, 6, 5, 1, 1, "B"],
    [3, 5, 8, 2, 2, "B"],
    [9, 7, 9, 3, 3, "B"],
    [3, 2, 3, 2, 2, "C"],
]
out = {}
for row in x:
    out.setdefault(row[-1], []).append(row[:-1])
print(list(out.values()) + [[[v[0][-1], k] for k, v in out.items()]])
Prints:
[
[[3, 1, 4, 1, 1], [1, 5, 9, 2, 2]],
[[2, 6, 5, 1, 1], [3, 5, 8, 2, 2], [9, 7, 9, 3, 3]],
[[3, 2, 3, 2, 2]],
[[1, "A"], [1, "B"], [2, "C"]],
]
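Since the question stresses that the rows arrive in random order, one possible O(n log n) variant is to sort by identifier and day first and then group consecutive rows. This is only a sketch: it assumes the five-column layout from the question (identifier in the last column, day in the second-to-last), and series_by_id is a hypothetical name.

```python
from itertools import groupby
from operator import itemgetter

def series_by_id(rows):
    # Sort by (id, day) so each id's rows are adjacent and in day order.
    ordered = sorted(rows, key=lambda r: (r[-1], r[-2]))
    series, starts = [], []
    for ident, grp in groupby(ordered, key=itemgetter(-1)):
        values = [row[:-1] for row in grp]       # drop the id column
        series.append(values)
        starts.append([values[0][-1], ident])    # earliest day for this id
    return series, starts

x = [[1, 5, 9, 2, "A"],
     [9, 7, 9, 3, "B"],
     [3, 1, 4, 1, "A"],
     [2, 6, 5, 1, "B"],
     [3, 5, 8, 2, "B"],
     [3, 2, 3, 2, "C"]]

series, starts = series_by_id(x)
print(series)
print(starts)
```

The sort dominates the cost; the grouping pass afterwards is linear in the number of rows.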

variable access between Function calls behavior in Python

I am using recursion and storing a value at every step of the recursive calls.
My program looks like this:
lst=[]
def after_occurance(ls,l,curr):
    for i in range(l,curr):
        if ls[i]==ls[curr]:
            return False
    return True
def permutate(A,l,r):
    if l==r:
        ans=A.copy()
        print(A,ans)
        # change the commenting of the following 2 lines to see the difference
        lst.append(A)
        #lst.append(ans)
        print(lst)
        return lst
    else:
        for i in range(l,r+1):
            if after_occurance(A,l,i):
                A[i],A[l] = A[l],A[i]
                permutate(A,l+1,r)
                hm[A[l]]=1
                A[l],A[i] = A[i],A[l]
            else:
                continue
lst.clear()
A=[1,2,6]
A=sorted(A)
permutate(A,0,len(A)-1)
return lst
These are the two kinds of output produced when toggling between the two commented lines:
[1, 2, 6] [1, 2, 6]
[[1, 2, 6]]
[1, 6, 2] [1, 6, 2]
[[1, 2, 6], [1, 6, 2]]
[2, 1, 6] [2, 1, 6]
[[1, 2, 6], [1, 6, 2], [2, 1, 6]]
[2, 6, 1] [2, 6, 1]
[[1, 2, 6], [1, 6, 2], [2, 1, 6], [2, 6, 1]]
[6, 2, 1] [6, 2, 1]
[[1, 2, 6], [1, 6, 2], [2, 1, 6], [2, 6, 1], [6, 2, 1]]
[6, 1, 2] [6, 1, 2]
[[1, 2, 6], [1, 6, 2], [2, 1, 6], [2, 6, 1], [6, 2, 1], [6, 1, 2]]
[1 2 6 ] [1 6 2 ] [2 1 6 ] [2 6 1 ] [6 1 2 ] [6 2 1 ]
[1, 2, 6] [1, 2, 6]
[[1, 2, 6]]
[1, 6, 2] [1, 6, 2]
[[1, 6, 2], [1, 6, 2]]
[2, 1, 6] [2, 1, 6]
[[2, 1, 6], [2, 1, 6], [2, 1, 6]]
[2, 6, 1] [2, 6, 1]
[[2, 6, 1], [2, 6, 1], [2, 6, 1], [2, 6, 1]]
[6, 2, 1] [6, 2, 1]
[[6, 2, 1], [6, 2, 1], [6, 2, 1], [6, 2, 1], [6, 2, 1]]
[6, 1, 2] [6, 1, 2]
[[6, 1, 2], [6, 1, 2], [6, 1, 2], [6, 1, 2], [6, 1, 2], [6, 1, 2]]
[1 2 6 ] [1 2 6 ] [1 2 6 ] [1 2 6 ] [1 2 6 ] [1 2 6 ]
Can somebody explain this behavior, and what basic rule should I follow when making recursive calls and accessing variables in Python?
So, this is the code you really wanted to post:
def after_occurance(ls, l, curr):
    for i in range(l, curr):
        if ls[i] == ls[curr]:
            return False
    return True
def permutate(A, l, r):
    if l == r:
        ans = A.copy()
        # change the commenting of the following 2 lines to see the difference
        #lst.append(A)
        lst.append(ans)
        return
    else:
        for i in range(l, r + 1):
            if after_occurance(A, l, i):
                A[i],A[l] = A[l],A[i]
                permutate(A, l + 1, r)
                A[l],A[i] = A[i],A[l]
            else:
                continue
lst = []
A = [1,2,6]
A = sorted(A)
permutate(A, 0, len(A) - 1)
print(lst)
The difference comes from appending a copy() of A or just a reference to A.
When you append a reference to A, all the future changes to A show up in lst because the result is lst = [A, A, A, A, A, ....] and so lst cannot be anything apart from a list of the same thing.
When you append a copy() of A, you make a new list which is not changed after the append() and so records the history of how A looked over time.
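The same effect can be reproduced without any recursion. The following minimal sketch uses made-up names (current, by_reference, by_copy) purely to illustrate the difference between appending a reference and appending a copy:

```python
current = [1, 2]
by_reference = []
by_copy = []

for value in (3, 4):
    current.append(value)
    by_reference.append(current)       # stores the same list object again
    by_copy.append(current.copy())     # stores a snapshot of the list

print(by_reference)
print(by_copy)
# Every entry of by_reference is the very same object as current.
print(all(item is current for item in by_reference))
```

Here by_reference ends up as two entries that both show the final state of current, while by_copy keeps the state at the time of each append.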

All combinations of values from two lists representing a certain feature

I have three lists:
a = [0,1,2]
b = [3,4,5]
c = ['aab', 'abb', 'aaa']
How can I create all three-element combinations, where each sequence in list c specifies which list the number at each position of an output sequence is taken from?
For example (pseudocode):
for i=0 in range(len(c)):
print: [0,1,3]
[0,1,4]
...
[0,2,5]
...
[1,2,4]
[1,2,5]
The same applies to the remaining indexes i. The values within an individual sublist must not be repeated.
I will be very grateful for any tips.
This generator function will handle 'ab' template strings with the a's and b's in any order, and the output lists will not contain repeated items if the a and b lists are disjoint. We use itertools.combinations to generate combinations of the required order, and combine the a and b combinations using itertools.product. We get them in the correct order by turning each a and b combination into an iterator and select from the correct iterator via a dictionary.
from itertools import combinations, product
def groups(a, b, c):
    for pat in c:
        acombo = combinations(a, pat.count('a'))
        bcombo = combinations(b, pat.count('b'))
        for ta, tb in product(acombo, bcombo):
            d = {'a': iter(ta), 'b': iter(tb)}
            yield [next(d[k]) for k in pat]
# tests
a = [0,1,2]
b = [3,4,5]
templates = ['aab', 'abb', 'aaa'], ['aba'], ['bab']
for c in templates:
    print('c', c)
    for i, t in enumerate(groups(a, b, c), 1):
        print(i, t)
    print()
output
c ['aab', 'abb', 'aaa']
1 [0, 1, 3]
2 [0, 1, 4]
3 [0, 1, 5]
4 [0, 2, 3]
5 [0, 2, 4]
6 [0, 2, 5]
7 [1, 2, 3]
8 [1, 2, 4]
9 [1, 2, 5]
10 [0, 3, 4]
11 [0, 3, 5]
12 [0, 4, 5]
13 [1, 3, 4]
14 [1, 3, 5]
15 [1, 4, 5]
16 [2, 3, 4]
17 [2, 3, 5]
18 [2, 4, 5]
19 [0, 1, 2]
c ['aba']
1 [0, 3, 1]
2 [0, 4, 1]
3 [0, 5, 1]
4 [0, 3, 2]
5 [0, 4, 2]
6 [0, 5, 2]
7 [1, 3, 2]
8 [1, 4, 2]
9 [1, 5, 2]
c ['bab']
1 [3, 0, 4]
2 [3, 0, 5]
3 [4, 0, 5]
4 [3, 1, 4]
5 [3, 1, 5]
6 [4, 1, 5]
7 [3, 2, 4]
8 [3, 2, 5]
9 [4, 2, 5]
I should mention that even though combinations returns iterators, and product happily takes iterators as arguments, product has to read each iterator completely and keep its contents in memory (internally as a tuple) because it has to run over those contents multiple times. So if the number of combinations is huge this can consume a fair amount of RAM.
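The memory remark above can be observed directly. In this sketch, tracked is a hypothetical helper generator that records every item pulled from it; by the time product has yielded its first result, both inputs have already been read to the end.

```python
from itertools import product

consumed = []

def tracked(values):
    # Hypothetical helper: a generator that records each item as it is pulled.
    for v in values:
        consumed.append(v)
        yield v

pairs = product(tracked([1, 2, 3]), tracked('ab'))
first = next(pairs)

print(first)
print(consumed)
```

Only one pair has been requested, yet consumed already holds every item of both inputs, which is why very large inputs to product can be costly.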
If you want permutations instead of combinations, that's easy. We just call itertools.permutations instead of itertools.combinations.
from itertools import permutations, product
def groups(a, b, c):
    for pat in c:
        acombo = permutations(a, pat.count('a'))
        bcombo = permutations(b, pat.count('b'))
        for ta, tb in product(acombo, bcombo):
            d = {'a': iter(ta), 'b': iter(tb)}
            yield [next(d[k]) for k in pat]
# tests
a = [0,1,2]
b = [3,4,5]
templates = ['aaa'], ['abb']
for c in templates:
    print('c', c)
    for i, t in enumerate(groups(a, b, c), 1):
        print(i, t)
    print()
output
c ['aaa']
1 [0, 1, 2]
2 [0, 2, 1]
3 [1, 0, 2]
4 [1, 2, 0]
5 [2, 0, 1]
6 [2, 1, 0]
c ['abb']
1 [0, 3, 4]
2 [0, 3, 5]
3 [0, 4, 3]
4 [0, 4, 5]
5 [0, 5, 3]
6 [0, 5, 4]
7 [1, 3, 4]
8 [1, 3, 5]
9 [1, 4, 3]
10 [1, 4, 5]
11 [1, 5, 3]
12 [1, 5, 4]
13 [2, 3, 4]
14 [2, 3, 5]
15 [2, 4, 3]
16 [2, 4, 5]
17 [2, 5, 3]
18 [2, 5, 4]
Finally, here's a version that handles any number of lists, and template strings of any length. It only accepts a single template string per call, but that shouldn't be an issue. You can also choose whether you want to generate permutations or combinations via an optional keyword arg.
from itertools import permutations, combinations, product
def groups(sources, template, mode='P'):
    func = permutations if mode == 'P' else combinations
    keys = sources.keys()
    combos = [func(sources[k], template.count(k)) for k in keys]
    for t in product(*combos):
        d = {k: iter(v) for k, v in zip(keys, t)}
        yield [next(d[k]) for k in template]
# tests
sources = {
    'a': [0, 1, 2],
    'b': [3, 4, 5],
    'c': [6, 7, 8],
}
templates = 'aa', 'abc', 'abba', 'cab'
for template in templates:
    print('\ntemplate', template)
    for i, t in enumerate(groups(sources, template, mode='C'), 1):
        print(i, t)
output
template aa
1 [0, 1]
2 [0, 2]
3 [1, 2]
template abc
1 [0, 3, 6]
2 [0, 3, 7]
3 [0, 3, 8]
4 [0, 4, 6]
5 [0, 4, 7]
6 [0, 4, 8]
7 [0, 5, 6]
8 [0, 5, 7]
9 [0, 5, 8]
10 [1, 3, 6]
11 [1, 3, 7]
12 [1, 3, 8]
13 [1, 4, 6]
14 [1, 4, 7]
15 [1, 4, 8]
16 [1, 5, 6]
17 [1, 5, 7]
18 [1, 5, 8]
19 [2, 3, 6]
20 [2, 3, 7]
21 [2, 3, 8]
22 [2, 4, 6]
23 [2, 4, 7]
24 [2, 4, 8]
25 [2, 5, 6]
26 [2, 5, 7]
27 [2, 5, 8]
template abba
1 [0, 3, 4, 1]
2 [0, 3, 5, 1]
3 [0, 4, 5, 1]
4 [0, 3, 4, 2]
5 [0, 3, 5, 2]
6 [0, 4, 5, 2]
7 [1, 3, 4, 2]
8 [1, 3, 5, 2]
9 [1, 4, 5, 2]
template cab
1 [6, 0, 3]
2 [7, 0, 3]
3 [8, 0, 3]
4 [6, 0, 4]
5 [7, 0, 4]
6 [8, 0, 4]
7 [6, 0, 5]
8 [7, 0, 5]
9 [8, 0, 5]
10 [6, 1, 3]
11 [7, 1, 3]
12 [8, 1, 3]
13 [6, 1, 4]
14 [7, 1, 4]
15 [8, 1, 4]
16 [6, 1, 5]
17 [7, 1, 5]
18 [8, 1, 5]
19 [6, 2, 3]
20 [7, 2, 3]
21 [8, 2, 3]
22 [6, 2, 4]
23 [7, 2, 4]
24 [8, 2, 4]
25 [6, 2, 5]
26 [7, 2, 5]
27 [8, 2, 5]
from itertools import product, chain
setups = ['aab', 'abb', 'aaa']
sources = {
'a': [0,1,2],
'b': [3,4,5]
}
combinations = (product(*map(sources.get, setup)) for setup in setups)
combinations is a nested lazy iterator, i.e. nothing has been computed or stored in memory yet. If you want an iterator of lists instead:
combinations = map(list, (product(*map(sources.get, setup)) for setup in setups))
Or you might want to flatten the result:
combinations = chain.from_iterable(product(*map(sources.get, setup)) for setup in setups)
If I understand it correctly, you can achieve the goal with a dictionary bookkeeping the correspondence of a character like "a" to a variable name a.
from collections import defaultdict
a = [0,1,2]
b = [3,4,5]
c = ["aab", "abb", "aaa"]
d = {"a": a, "b": b}
d2 = defaultdict(list)
for seq in c:
    l = []
    for idx, v in enumerate(seq):
        l.append(d[v][idx])
    print(l)
    d2[seq].append(l)
# Out:
#[0, 1, 5]
#[0, 4, 5]
#[0, 1, 2]
print(d2)
# defaultdict(<class 'list'>, {'aab': [[0, 1, 5]], 'abb': [[0, 4, 5]], 'aaa': [[0, 1, 2]]})
Put the lists in a dictionary so you can access them with strings.
Use the characters in each sequence to determine which lists to use.
Use itertools.product to get the combinations.
import itertools, collections
from pprint import pprint
d = {'a':[0,1,2], 'b':[3,4,5]}
c = ['aab', 'abb', 'aaa']
def f(t):
    t = collections.Counter(t)
    return max(t.values()) < 2
for seq in c:
    data = (d[char] for char in seq)
    print(f'sequence: {seq}')
    pprint(list(filter(f, itertools.product(*data))))
    print('***************************')
Result for sequence 'abb':
sequence: abb
[(0, 3, 4),
(0, 3, 5),
(0, 4, 3),
(0, 4, 5),
(0, 5, 3),
(0, 5, 4),
(1, 3, 4),
(1, 3, 5),
(1, 4, 3),
(1, 4, 5),
(1, 5, 3),
(1, 5, 4),
(2, 3, 4),
(2, 3, 5),
(2, 4, 3),
(2, 4, 5),
(2, 5, 3),
(2, 5, 4)]
edit to filter out tuples with duplicates
I like the idea of a callable dict that can be used with map. It could be used here.
class CallDict(dict):
    def __call__(self, key):
        return self[key] #self.get(key)
e = CallDict([('a',[0,1,2]), ('b',[3,4,5])])
for seq in c:
    data = map(e, seq)
    print(f'sequence: {seq}')
    for thing in filter(f, itertools.product(*data)):
        print(thing)
    print('***************************')
I couldn't help myself, here is a generic version of @PM2Ring's solution/answer. Instead of filtering out unwanted items, it doesn't produce them in the first place.
d = {'a':[0,1,2], 'b':[3,4,5]}
c = ['aab', 'abb', 'aaa', 'aba']
def g(d, c):
    for seq in c:
        print(f'sequence: {seq}')
        counts = collections.Counter(seq)
        ## data = (itertools.combinations(d[key],r) for key, r in counts.items())
        data = (itertools.permutations(d[key],r) for key, r in counts.items())
        for thing in itertools.product(*data):
            q = {key:iter(other) for key, other in zip(counts, thing)}
            yield [next(q[k]) for k in seq]
for t in g(d, c):
    print(t)
It looks like you're looking for some way to programmatically call itertools.product
from itertools import product
d = {'a': [0,1,2],
'b': [3,4,5]}
c = ['aab', 'abb', 'aaa']
for s in c:
    print(list(product(*[d[x] for x in s])))

How for loop works in python?

I have a lists of lists in variable lists something like this:
[7, 6, 1, 8, 3]
[1, 7, 2, 4, 2]
[5, 6, 4, 2, 3]
[0, 3, 3, 1, 6]
[3, 5, 2, 14, 3]
[3, 11, 9, 1, 1]
[1, 10, 2, 3, 1]
When I write lists[1] I get vertically:
6
7
6
3
5
11
10
but when I loop it:
for i in lists:
    print(i)
I get this horizontally.
7
6
1
8
3
etc...
So, how does it work? How can I modify the loop so that it gives me everything vertically?
Short answer:
for l in lists:
    print(l[1])
Lists of lists
list_of_lists = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
for sublist in list_of_lists:
    for x in sublist:
        print(x)
Here is how you would print out the list of lists columns.
lists = [[7, 6, 1, 8, 3],
[1, 7, 2, 4, 2],
[5, 6, 4, 2, 3],
[0, 3, 3, 1, 6],
[3, 5, 2, 14, 3],
[3, 11, 9, 1, 1],
[1, 10, 2, 3, 1]]
for i in range(0, len(lists[1])):
    for j in range(0, len(lists)):
        print(lists[j][i], end=' ')
    print("\n")
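As an alternative to index arithmetic, the columns can also be obtained by transposing with the built-in zip. This is a minimal sketch using the data from the question; the name columns is introduced here only for illustration.

```python
lists = [[7, 6, 1, 8, 3],
         [1, 7, 2, 4, 2],
         [5, 6, 4, 2, 3],
         [0, 3, 3, 1, 6],
         [3, 5, 2, 14, 3],
         [3, 11, 9, 1, 1],
         [1, 10, 2, 3, 1]]

# zip(*lists) pairs up the i-th items of every row, i.e. it yields columns.
columns = list(zip(*lists))

for column in columns:
    print(*column)
```

Each element of columns is a tuple holding one column, so columns[1] contains the second item of every row.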
