Generate MultiIndex DataFrame at different length - python

I need to initialize a MultiIndex DataFrame from the following data.
id = ['a','b','c']
days = [2,5,4], meaning each id has its own number of days: 'a' has days 1 and 2; 'b' has days 1,...,5; and 'c' has days 1,...,4. In other words, the number of days varies by id.
Within each day there are 4 periods, prd = [0,1,2,3].
What I expect is a MultiIndex with one entry for each id, day and period:
MultiIndex([('a',1,0),
('a',1,1),
('a',1,2),
('a',1,3),
('a',2,0),
('a',2,1),
('a',2,2),
('a',2,3),
('b',1,0),
('b',1,1),
('b',1,2),
...
('b',5,1),
('b',5,2),
('b',5,3),
('c',1,0),
('c',1,1),
('c',1,2),
...
('c',4,1),
('c',4,2),
('c',4,3),
],
names=['id','day','prd']
)
Here is what I tried in Python:
Because the number of days differs per id, I build two full lists of ids and days with a loop and a list comprehension, then zip them together to get tuple pairs. I then use itertools.product() to combine them with the periods. But what I get looks like
[(('a',1),0),
(('a',1),1),
(('a',1),2),....]
With pd.MultiIndex.from_product() I get a similar result: the first two levels are grouped together and the third is kept separate.
Since product doesn't help either way, the old-fashioned approach is to also stretch prd into a full-length list matching the other two, and zip all three at once.
Is there a better way to generate the index from the start, without this long chain of loops, list comprehensions, zip and product? Is there anything in pandas that handles this case, rather than native Python data structures?
Many thanks!

Create the combinations using a list comprehension with zip:
import pandas as pd
id = ['a','b','c']
prd = [0,1,2,3]
days = [2,5,4]
# for each (days, id) pair, emit every day 1..d and every period
result = [(idx, i, p) for d, idx in zip(days, id) for i in range(1, d+1) for p in prd]
print(pd.MultiIndex.from_tuples(result))
MultiIndex([('a', 1, 0),
('a', 1, 1),
('a', 1, 2),
('a', 1, 3),
('a', 2, 0),
('a', 2, 1),
('a', 2, 2),
('a', 2, 3),
('b', 1, 0),
('b', 1, 1),
('b', 1, 2),
('b', 1, 3),
('b', 2, 0),
('b', 2, 1),
('b', 2, 2),
('b', 2, 3),
('b', 3, 0),
('b', 3, 1),
('b', 3, 2),
('b', 3, 3),
('b', 4, 0),
('b', 4, 1),
('b', 4, 2),
('b', 4, 3),
('b', 5, 0),
('b', 5, 1),
('b', 5, 2),
('b', 5, 3),
('c', 1, 0),
('c', 1, 1),
('c', 1, 2),
('c', 1, 3),
('c', 2, 0),
('c', 2, 1),
('c', 2, 2),
('c', 2, 3),
('c', 3, 0),
('c', 3, 1),
('c', 3, 2),
('c', 3, 3),
('c', 4, 0),
('c', 4, 1),
('c', 4, 2),
('c', 4, 3)],
)
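As a pandas-oriented variant of the comprehension above, one could build a small from_product index per id and stitch the pieces together with MultiIndex.append. This is only a sketch: the name ids (instead of id), the level names taken from the question, and the 'value' column are illustrative assumptions.

```python
import pandas as pd

ids = ['a', 'b', 'c']      # renamed from `id` to avoid shadowing the builtin
days = [2, 5, 4]
prd = [0, 1, 2, 3]

# One small from_product index per id, each with its own day range.
parts = [
    pd.MultiIndex.from_product([[i], range(1, d + 1), prd],
                               names=['id', 'day', 'prd'])
    for i, d in zip(ids, days)
]

# Stitch the per-id pieces together, preserving their order.
index = parts[0].append(parts[1:])

# Hypothetical empty frame carrying that index; the 'value' column is an assumption.
df = pd.DataFrame(index=index, columns=['value'])
print(len(index))
```

Whether this reads better than the single comprehension is a matter of taste; it mainly keeps the per-id structure explicit.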

You can use np.repeat and np.tile here. This is useful when id, prd and days are large.
Repeat each id value days * len(prd) times; np.multiply computes those repeat counts.
Tile the prd values once per day across all ids; np.sum(days) gives that total number of days.
Use pd.MultiIndex.from_arrays to build the desired output.
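import numpy as np
import pandas as pd
id = ['a','b','c']
prd = [0,1,2,3]
days = [2,5,4]
x = np.repeat(id,np.multiply(days, len(prd)))  # each id, days*len(prd) times
y = np.concatenate([np.arange(1, i+1).repeat(len(prd)) for i in days])  # day numbers per id
z = np.tile(prd,np.sum(days))  # periods, once per day
pd.MultiIndex.from_arrays([x,y,z])
# Similar to
# pd.MultiIndex.from_tuples(np.c_[x,y,z].tolist())
# but np.c_ stacks the columns into one array, so the numbers are cast to
# strings (as in the output shown below); from_arrays keeps day and prd as integers.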
# x y z
# | | |
# V V V
MultiIndex([('a', '1', '0'),
('a', '1', '1'),
('a', '1', '2'),
('a', '1', '3'),
('a', '2', '0'),
('a', '2', '1'),
('a', '2', '2'),
('a', '2', '3'),
('b', '1', '0'),
('b', '1', '1'),
('b', '1', '2'),
('b', '1', '3'),
('b', '2', '0'),
('b', '2', '1'),
('b', '2', '2'),
('b', '2', '3'),
('b', '3', '0'),
('b', '3', '1'),
('b', '3', '2'),
('b', '3', '3'),
('b', '4', '0'),
('b', '4', '1'),
('b', '4', '2'),
('b', '4', '3'),
('b', '5', '0'),
('b', '5', '1'),
('b', '5', '2'),
('b', '5', '3'),
('c', '1', '0'),
('c', '1', '1'),
('c', '1', '2'),
('c', '1', '3'),
('c', '2', '0'),
('c', '2', '1'),
('c', '2', '2'),
('c', '2', '3'),
('c', '3', '0'),
('c', '3', '1'),
('c', '3', '2'),
('c', '3', '3'),
('c', '4', '0'),
('c', '4', '1'),
('c', '4', '2'),
('c', '4', '3')],
)
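A small check, assuming the same inputs as above (with id renamed to ids and level names taken from the question, both illustrative choices): it rebuilds the index with from_arrays and compares it against the plain comprehension, which suggests the day and prd levels stay integers in this construction.

```python
import numpy as np
import pandas as pd

ids = ['a', 'b', 'c']
prd = [0, 1, 2, 3]
days = [2, 5, 4]

x = np.repeat(ids, np.multiply(days, len(prd)))
y = np.concatenate([np.arange(1, d + 1).repeat(len(prd)) for d in days])
z = np.tile(prd, np.sum(days))

index = pd.MultiIndex.from_arrays([x, y, z], names=['id', 'day', 'prd'])

# Reference result from the pure-Python comprehension.
expected = [(i, d, p)
            for n, i in zip(days, ids)
            for d in range(1, n + 1)
            for p in prd]

print(index.tolist() == expected)
print(index.get_level_values('day').dtype)
```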

Related

Not random sample of permutations of a set of lists

I have a set of lists and a list with all permutations of the set of lists.
import itertools
mylist = []
mylist.append(['a', 'b', 'c', 'd', 'e'])
mylist.append(['1', '2', '3', '4', '5'])
mylist.append(['f', 'g', 'h', 'i', 'j'])
# Print all permutations
print(list(itertools.product(*mylist)))
Output
[('a', '1', 'f'), ('a', '1', 'g'), ('a', '1', 'h'), ('a', '1', 'i'), ('a', '1', 'j'), ('a', '2', 'f'), ('a', '2', 'g'), ('a', '2', 'h'), ('a', '2', 'i'), ('a', '2', 'j'), ('a', '3', 'f'), ('a', '3', 'g'), ('a', '3', 'h'), ('a', '3', 'i'), ('a', '3', 'j'), ('a', '4', 'f'), ('a', '4', 'g'), ('a', '4', 'h'), ('a', '4', 'i'), ('a', '4', 'j'), ('a', '5', 'f'), ('a', '5', 'g'), ('a', '5', 'h'), ('a', '5', 'i'), ('a', '5', 'j'), ('b', '1', 'f'), ('b', '1', 'g'), ('b', '1', 'h'), ('b', '1', 'i'), ('b', '1', 'j'), ('b', '2', 'f'), ('b', '2', 'g'), ('b', '2', 'h'), ('b', '2', 'i'), ('b', '2', 'j'), ('b', '3', 'f'), ('b', '3', 'g'), ('b', '3', 'h'), ('b', '3', 'i'), ('b', '3', 'j'), ('b', '4', 'f'), ('b', '4', 'g'), ('b', '4', 'h'), ('b', '4', 'i'), ('b', '4', 'j'), ('b', '5', 'f'), ('b', '5', 'g'), ('b', '5', 'h'), ('b', '5', 'i'), ('b', '5', 'j'), ('c', '1', 'f'), ('c', '1', 'g'), ('c', '1', 'h'), ('c', '1', 'i'), ('c', '1', 'j'), ('c', '2', 'f'), ('c', '2', 'g'), ('c', '2', 'h'), ('c', '2', 'i'), ('c', '2', 'j'), ('c', '3', 'f'), ('c', '3', 'g'), ('c', '3', 'h'), ('c', '3', 'i'), ('c', '3', 'j'), ('c', '4', 'f'), ('c', '4', 'g'), ('c', '4', 'h'), ('c', '4', 'i'), ('c', '4', 'j'), ('c', '5', 'f'), ('c', '5', 'g'), ('c', '5', 'h'), ('c', '5', 'i'), ('c', '5', 'j'), ('d', '1', 'f'), ('d', '1', 'g'), ('d', '1', 'h'), ('d', '1', 'i'), ('d', '1', 'j'), ('d', '2', 'f'), ('d', '2', 'g'), ('d', '2', 'h'), ('d', '2', 'i'), ('d', '2', 'j'), ('d', '3', 'f'), ('d', '3', 'g'), ('d', '3', 'h'), ('d', '3', 'i'), ('d', '3', 'j'), ('d', '4', 'f'), ('d', '4', 'g'), ('d', '4', 'h'), ('d', '4', 'i'), ('d', '4', 'j'), ('d', '5', 'f'), ('d', '5', 'g'), ('d', '5', 'h'), ('d', '5', 'i'), ('d', '5', 'j'), ('e', '1', 'f'), ('e', '1', 'g'), ('e', '1', 'h'), ('e', '1', 'i'), ('e', '1', 'j'), ('e', '2', 'f'), ('e', '2', 'g'), ('e', '2', 'h'), ('e', '2', 'i'), ('e', '2', 'j'), ('e', '3', 'f'), ('e', '3', 'g'), ('e', '3', 'h'), ('e', '3', 'i'), ('e', '3', 'j'), ('e', '4', 'f'), ('e', '4', 'g'), ('e', 
'4', 'h'), ('e', '4', 'i'), ('e', '4', 'j'), ('e', '5', 'f'), ('e', '5', 'g'), ('e', '5', 'h'), ('e', '5', 'i'), ('e', '5', 'j')]
From above list I want to extract 10 items under the following conditions:
The a should appear 6 times, the b 3 times and the c 1 time.
The 5 should appear 5 times (the rest is random)
The f should appear 2 times and the i should appear 4 times (the rest is random)
Your list of permutations actually contains only combinations. To impose frequencies on your selection of 10, you can pre-fill parts of the combinations with the required values and complete the rest with random values from the remaining elements of the corresponding list. Then shuffle the parts before assembling them into the 10 combinations:
from random import choices,sample
part0 = sample(['a']*6 + ['b']*3 + choices("cde",k=1) ,10)
part1 = sample(['5']*5 + choices("1234",k=5) ,10)
part2 = sample(['f']*2 + ['i']*4 + choices("ghj",k=4) ,10)
result = list(zip(part0,part1,part2))
Output:
print(*result,sep="\n")
('a', '5', 'i')
('a', '3', 'j')
('b', '5', 'i')
('a', '4', 'g')
('b', '5', 'h')
('b', '5', 'g')
('a', '4', 'f')
('a', '3', 'i')
('a', '1', 'i')
('d', '5', 'f')
\ \ \____ 4 times 'i', 2 times 'f'
\ \________ 5 times '5'
\____________ 6 times 'a', 3 times 'b'
Note that this may produce duplicate combinations. To work around that, you can place the 4 lines in a loop that regenerates the combinations until they are all distinct (with your conditions and a selection of 10 this results in 3.75 attempts on average):
For example:
result = set()
while len(result) != 10:
    part0 = sample(['a']*6 + ['b']*3 + choices("cde",k=1) ,10)
    part1 = sample(['5']*5 + choices("1234",k=5) ,10)
    part2 = sample(['f']*2 + ['i']*4 + choices("ghj",k=4) ,10)
    result = set(zip(part0,part1,part2))
If you really want permutations, then you can randomize the position of items produced by zip:
...
result = set(map(lambda c:tuple(sample(c,3)),zip(part0,part1,part2)))
which then gives actual permutations in the result (duplicates become less likely, so only 1.22 attempts are needed on average):
('5', 'b', 'i')
('2', 'j', 'a')
('b', 'j', '5')
('g', '2', 'a')
('2', 'f', 'a')
('i', '5', 'a')
('g', '5', 'b')
('i', '1', 'a')
('d', '5', 'f')
('i', '4', 'a')
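The retry loop can be wrapped in a function and the frequencies checked with collections.Counter. This is a sketch under one assumption that differs from the code above: the question asks for 'c' exactly once, so 'c' is pinned here instead of being drawn from "cde". The function name draw_ten is hypothetical.

```python
from collections import Counter
from random import choices, sample

def draw_ten():
    """Redraw until the 10 tuples are all distinct."""
    while True:
        # Fixed counts first, random filler second, then shuffle via sample().
        part0 = sample(['a'] * 6 + ['b'] * 3 + ['c'], 10)
        part1 = sample(['5'] * 5 + choices("1234", k=5), 10)
        part2 = sample(['f'] * 2 + ['i'] * 4 + choices("ghj", k=4), 10)
        result = set(zip(part0, part1, part2))
        if len(result) == 10:
            return sorted(result)

picked = draw_ten()
first = Counter(t[0] for t in picked)
second = Counter(t[1] for t in picked)
third = Counter(t[2] for t in picked)
print(first, second['5'], third['f'], third['i'])
```

Counting per position this way is a quick sanity check that the imposed frequencies survived the shuffling.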

Get every pair-wise combination of values between two lists

Let's say I have two lists;
a = ['A', 'B', 'C', 'D']
b = ['1', '2', '3', '4']
I know I can get every permutation of these two lists like so:
for r in itertools.product(a, b): print (r[0] + r[1])
But what I'm looking for is every pairwise combination stored in a tuple. So, for example, some combinations would be:
[(A, 1), (B, 2), (C, 3), (D, 4)]
[(A, 1), (B, 3), (C, 2), (D, 4)]
[(A, 1), (B, 4), (C, 3), (D, 2)]
[(A, 1), (B, 3), (C, 4), (D, 2)]
[(A, 1), (B, 2), (C, 4), (D, 3)]
So it would iterate through every possible combination so that no letter has the same number value. I'm at a loss for an efficient way to do this (particularly since I need to scale this to three lists in my actual example)
We can make permutations of one of the lists (for example here the latter one), and then zip each permutation together with the first list, like:
from functools import partial
from itertools import permutations
def pairwise_comb(xs, ys):
    return map(partial(zip, xs), permutations(ys))
or in case all subelements should be lists (here these are still iterables that can take a specific shape when you "materialize" these):
from functools import partial
from itertools import permutations
def pairwise_comb(xs, ys):
    return map(list, map(partial(zip, xs), permutations(ys)))
For the given sample input, we obtain:
>>> for el in pairwise_comb(a, b):
...     print(list(el))
...
[('A', '1'), ('B', '2'), ('C', '3'), ('D', '4')]
[('A', '1'), ('B', '2'), ('C', '4'), ('D', '3')]
[('A', '1'), ('B', '3'), ('C', '2'), ('D', '4')]
[('A', '1'), ('B', '3'), ('C', '4'), ('D', '2')]
[('A', '1'), ('B', '4'), ('C', '2'), ('D', '3')]
[('A', '1'), ('B', '4'), ('C', '3'), ('D', '2')]
[('A', '2'), ('B', '1'), ('C', '3'), ('D', '4')]
[('A', '2'), ('B', '1'), ('C', '4'), ('D', '3')]
[('A', '2'), ('B', '3'), ('C', '1'), ('D', '4')]
[('A', '2'), ('B', '3'), ('C', '4'), ('D', '1')]
[('A', '2'), ('B', '4'), ('C', '1'), ('D', '3')]
[('A', '2'), ('B', '4'), ('C', '3'), ('D', '1')]
[('A', '3'), ('B', '1'), ('C', '2'), ('D', '4')]
[('A', '3'), ('B', '1'), ('C', '4'), ('D', '2')]
[('A', '3'), ('B', '2'), ('C', '1'), ('D', '4')]
[('A', '3'), ('B', '2'), ('C', '4'), ('D', '1')]
[('A', '3'), ('B', '4'), ('C', '1'), ('D', '2')]
[('A', '3'), ('B', '4'), ('C', '2'), ('D', '1')]
[('A', '4'), ('B', '1'), ('C', '2'), ('D', '3')]
[('A', '4'), ('B', '1'), ('C', '3'), ('D', '2')]
[('A', '4'), ('B', '2'), ('C', '1'), ('D', '3')]
[('A', '4'), ('B', '2'), ('C', '3'), ('D', '1')]
[('A', '4'), ('B', '3'), ('C', '1'), ('D', '2')]
[('A', '4'), ('B', '3'), ('C', '2'), ('D', '1')]
This yields 24 possible pairings: the order of 'A', 'B', 'C' and 'D' stays fixed, and the 4 numbers can be assigned to them in 4! = 4×3×2×1 = 24 ways.
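Since the question mentions scaling to three lists, the same idea (keep the first list fixed, permute the others) might be extended as below. This is a sketch; the name aligned_combos and the third list c are assumptions for illustration.

```python
from itertools import permutations, product

def aligned_combos(first, *others):
    """Yield every way to align `first` with one permutation of each other list."""
    for perms in product(*(permutations(o) for o in others)):
        yield list(zip(first, *perms))

a = ['A', 'B', 'C', 'D']
b = ['1', '2', '3', '4']
c = ['W', 'X', 'Y', 'Z']   # hypothetical third list

two = list(aligned_combos(a, b))
three = list(aligned_combos(a, b, c))
print(len(two), len(three))
```

With two lists this gives the 4! pairings; with three lists each additional list multiplies the count by another 4!.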
It may be a lot easier than you think. What about:
import itertools
a = ['A', 'B', 'C', 'D']
b = ['1', '2', '3', '4']
for aperm in itertools.permutations(a):
    for bperm in itertools.permutations(b):
        print(list(zip(aperm, bperm)))
First outputs:
[('A', '1'), ('B', '2'), ('C', '3'), ('D', '4')]
[('A', '1'), ('B', '2'), ('C', '4'), ('D', '3')]
[('A', '1'), ('B', '3'), ('C', '2'), ('D', '4')]
[('A', '1'), ('B', '3'), ('C', '4'), ('D', '2')]
[('A', '1'), ('B', '4'), ('C', '2'), ('D', '3')]
[('A', '1'), ('B', '4'), ('C', '3'), ('D', '2')]
[('A', '2'), ('B', '1'), ('C', '3'), ('D', '4')]
[('A', '2'), ('B', '1'), ('C', '4'), ('D', '3')]
[('A', '2'), ('B', '3'), ('C', '1'), ('D', '4')]
...
(There are 576 lines printed for these two 4-element lists)
Edit: If you want to generalize this to more iterables, you could do something like:
import itertools
a = ['A', 'B', 'C', 'D']
b = ['1', '2', '3', '4']
gens = [itertools.permutations(lst) for lst in (a,b)]
for perms in itertools.product(*gens):
    print(list(zip(*perms)))
Which outputs the same thing, but could be easily extended, e.g.
import itertools
a = ['A', 'B', 'C', 'D']
b = ['1', '2', '3', '4']
c = ['W', 'X', 'Y', 'Z']
gens = [itertools.permutations(lst) for lst in (a,b,c)] # add c
for perms in itertools.product(*gens): # no change
    print(list(zip(*perms))) # no change
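Permuting both lists prints each pairing many times in different orders. A quick way to see this, sketched here with frozenset as an order-insensitive key (an illustrative choice), is to count the distinct pairings among the 576 results:

```python
import itertools

a = ['A', 'B', 'C', 'D']
b = ['1', '2', '3', '4']

# Every permutation of a zipped with every permutation of b.
all_lists = [tuple(zip(ap, bp))
             for ap in itertools.permutations(a)
             for bp in itertools.permutations(b)]

# Ignore the order of the pairs inside each result.
distinct = {frozenset(pairs) for pairs in all_lists}

print(len(all_lists), len(distinct))
```

If the order of the pairs within each result does not matter, permuting only one of the lists is enough.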
You can use recursion with yield for a no-import solution:
a = ['A', 'B', 'C', 'D']
b = ['1', '2', '3', '4']
def combinations(d, current=[]):
    # current is never mutated (a new list is built per call), so the default is safe
    if len(current) == 4:
        yield current
    elif d[0] and d[1]:
        for i in d[0]:
            # drop the chosen letter and the number it was paired with
            _d0, _d1 = [c for c in d[0] if c != i], [c for c in d[1] if c != d[1][0]]
            yield from combinations([_d0, _d1], current + [[i, d[1][0]]])
for i in combinations([a, b]):
    print(i)
Output:
[['A', '1'], ['B', '2'], ['C', '3'], ['D', '4']]
[['A', '1'], ['B', '2'], ['D', '3'], ['C', '4']]
[['A', '1'], ['C', '2'], ['B', '3'], ['D', '4']]
[['A', '1'], ['C', '2'], ['D', '3'], ['B', '4']]
[['A', '1'], ['D', '2'], ['B', '3'], ['C', '4']]
[['A', '1'], ['D', '2'], ['C', '3'], ['B', '4']]
[['B', '1'], ['A', '2'], ['C', '3'], ['D', '4']]
[['B', '1'], ['A', '2'], ['D', '3'], ['C', '4']]
[['B', '1'], ['C', '2'], ['A', '3'], ['D', '4']]
[['B', '1'], ['C', '2'], ['D', '3'], ['A', '4']]
[['B', '1'], ['D', '2'], ['A', '3'], ['C', '4']]
[['B', '1'], ['D', '2'], ['C', '3'], ['A', '4']]
[['C', '1'], ['A', '2'], ['B', '3'], ['D', '4']]
[['C', '1'], ['A', '2'], ['D', '3'], ['B', '4']]
[['C', '1'], ['B', '2'], ['A', '3'], ['D', '4']]
[['C', '1'], ['B', '2'], ['D', '3'], ['A', '4']]
[['C', '1'], ['D', '2'], ['A', '3'], ['B', '4']]
[['C', '1'], ['D', '2'], ['B', '3'], ['A', '4']]
[['D', '1'], ['A', '2'], ['B', '3'], ['C', '4']]
[['D', '1'], ['A', '2'], ['C', '3'], ['B', '4']]
[['D', '1'], ['B', '2'], ['A', '3'], ['C', '4']]
[['D', '1'], ['B', '2'], ['C', '3'], ['A', '4']]
[['D', '1'], ['C', '2'], ['A', '3'], ['B', '4']]
[['D', '1'], ['C', '2'], ['B', '3'], ['A', '4']]
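The recursion above hard-codes a length of 4. A variant that stops when the second list is exhausted might look like the following sketch; the name pairings and the tuple-based accumulator are assumptions, not part of the answer above.

```python
def pairings(xs, ys, current=()):
    """Pair ys[0] with each remaining x in turn, then recurse on the rest."""
    if not ys:
        yield list(current)
        return
    for i, x in enumerate(xs):
        rest = xs[:i] + xs[i + 1:]          # xs without the chosen element
        yield from pairings(rest, ys[1:], current + ((x, ys[0]),))

a = ['A', 'B', 'C', 'D']
b = ['1', '2', '3', '4']

found = list(pairings(a, b))
print(len(found))
print(found[0])
```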

Python: reconstruct after pos_tag

I have the following results after using pos_tag:
list = [('a', '1'), ('b', '2'), ('c', '3'), ('d', '4')]
Now, I have to reconstruct like the following:
a b c d
I used:
[x[0] for x in list]
But, it resulted in
['a', 'b', 'c' , 'd']
Use the join method with " " as the separator to build a single string:
data = [('a', '1'), ('b', '2'), ('c', '3'), ('d', '4')]
s= " ".join(x[0] for x in data)
print(s)
Output:
a b c d

Get and make a histogram of the number of non equal characters in two strings

I have the following sample, for instance (just for explanation):
Real_value Predicted_values
hello halo
communication commanecetpo
what waht
is is
up down
neural narel
network natwark
computer computer
vision vison
convolutional conventioanl
hebbian hebien
learing larnig
transfer trasfert
The first column holds the real values and the second the predicted values. I want to compare the two columns row by row to detect where the two strings differ.
I did the following:
dictionnary = []  # collected (real_char, predicted_char) pairs
for i in range(len(df)):
    if df.manual_raw_value[i] != df.raw_value[i]:
        text = df.manual_raw_value[i]
        text2 = df.raw_value[i]
        x = len(text)
        y = len(text2)
        z = min(x, y)  # only compare up to the shorter string
        for t in range(z):
            if text[t] != text2[t]:
                d = (text[t], text2[t])
                dictionnary.append(d)
print(dictionnary)
[ ('a', 'n'),
('n', 'g'),
('g', 'e'),
('e', '.'),
('.', 'f'),
('f', 'r'),
("'", 'E'),
('E', 'S'),
('S', 'C'),
('C', 'O'),
('O', 'M'),
('M', 'P'),
('P', 'T'),
('T', 'E'),
('C', 'Q'),
('6', 'G'),
('9', 'o'),
('1', 'i'),
("'", 'E'),
('E', 'a'),
('a', 'u'),
('.', ','),
...]
The key of the dictionary represents the real value.
Now I want to count the number of occurrences, as follows:
[('a' : 'e'), ('a','e'), ('b','d')]
becomes
[('a' : 'e') : 2, ('b','d') : 1]
I tried:
collections.Counter(dictionnary)
[ ('/', '1'): 2,
('/', 'M'): 2,
('/', 'W'): 2,
('/', 'h'): 8,
('/', 'm'): 2,
('/', 't'): 6,
('0', '-'): 2,
('0', '1'): 2,
('0', '3'): 2,
('0', '4'): 6,
('0', '5'): 2,
('0', '6'): 2,
('0', '7'): 4,
('0', '9'): 2,
('0', 'C'): 2,
('0', 'D'): 4,
('0', 'O'): 16,
('0', 'Q'): 4,
('0', 'U'): 2,
('0', 'm'): 4,
('0', 'o'): 2,
('0', '\xc3'): 2,
('1', ' '): 2,
('1', '/'): 2,
('1', '0'): 4,
('1', '2'): 2,
('1', '3'): 2,
('1', '4'): 2,
('1', '6'): 2,
('1', 'H'): 2,
('1', 'I'): 24,
('1', 'S'): 2,
('1', 'i'): 6,
('1', 'l'): 6,
('2', '3'): 2,
('2', '8'): 2,
('2', 'N'): 2,
('2', 'S'): 2, ..]
To plot a histogram I tried the following:
import numpy as np
import matplotlib.pyplot as plt
pos = np.arange(len(dictionnary.keys()))
width = 1.0
ax = plt.axes()
ax.set_xticks(pos + (width / 2))
ax.set_xticklabels(dictionnary.keys())
plt.bar(dictionary.keys(), ******, width, color='g')
plt.show()
However, dictionnary.keys() raises the following error:
Traceback (most recent call last):
File "/home/ahmed/anaconda3/envs/cv/lib/python2.7/site-packages/IPython/core/interactiveshell.py", line 2881, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-94-5d466717162c>", line 1, in <module>
dictionnary_new.keys()
AttributeError: 'list' object has no attribute 'keys'
Edit 1:
dictionnary_new = collections.Counter(dictionnary) # it works
import numpy as np
import matplotlib.pyplot as plt
pos = np.arange(len(dictionnary_new.keys()))
width = 1.0
ax = plt.axes()
ax.set_xticks(pos + (width / 2))
ax.set_xticklabels(dictionnary_new.keys())
plt.bar(dictionnary_new.keys(), dictionnary_new.values(), width, color='g')
plt.show()
I got the following error:
Traceback (most recent call last):
File "/home/ahmed/anaconda3/envs/cv/lib/python2.7/site-packages/IPython/core/interactiveshell.py", line 2881, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-117-4155944ddaf3>", line 11, in <module>
plt.bar(dictionnary_new.keys(), dictionnary_new.values(), width, color='g')
File "/home/ahmed/anaconda3/envs/cv/lib/python2.7/site-packages/matplotlib/pyplot.py", line 2705, in bar
**kwargs)
File "/home/ahmed/anaconda3/envs/cv/lib/python2.7/site-packages/matplotlib/__init__.py", line 1892, in inner
return func(ax, *args, **kwargs)
File "/home/ahmed/anaconda3/envs/cv/lib/python2.7/site-packages/matplotlib/axes/_axes.py", line 2105, in bar
left = [left[i] - width[i] / 2. for i in xrange(len(left))]
TypeError: unsupported operand type(s) for -: 'tuple' and 'float'
Thanks a lot
First of all, I think you have a typo in your example of pairs:
>>> lst = [{'a': 'e'}, {'a': 'e'}, {'b': 'd'}]
>>> collections.Counter([tuple(i.items()) for i in lst])
Counter({(('a', 'e'),): 2, (('b', 'd'),): 1})
Having said that, I don't think this is the proper way to address this. In your code, when you append things to your dictionary variable, don't use a dictionary, use tuples! Replace:
d= {text[t] : text2[t]}
dictionnary.append(d)
With:
d= (text[t], text2[t])
dictionnary.append(d)
And then you can just use:
collections.Counter(dictionnary)
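Putting the pieces together, the counting and the plot could be sketched as below. The function names mismatch_counts and plot_mismatches, the "r->p" label format and the top parameter are all assumptions; the bar chart uses integer positions with string tick labels, which avoids passing tuples to plt.bar as x values.

```python
from collections import Counter

def mismatch_counts(pairs):
    """Count (real_char, predicted_char) pairs that differ, position by position."""
    counts = Counter()
    for real, predicted in pairs:
        # zip stops at the shorter string, like min(x, y) in the question
        counts.update((r, p) for r, p in zip(real, predicted) if r != p)
    return counts

def plot_mismatches(counts, top=20):
    """Bar chart of the most frequent mismatches (requires matplotlib)."""
    import matplotlib.pyplot as plt
    items = counts.most_common(top)
    labels = [f"{r}->{p}" for (r, p), _ in items]
    heights = [n for _, n in items]
    positions = range(len(items))
    plt.bar(positions, heights, color='g')
    plt.xticks(positions, labels, rotation=90)
    plt.show()

sample_pairs = [("hello", "halo"), ("what", "waht"), ("is", "is"), ("neural", "narel")]
counts = mismatch_counts(sample_pairs)
print(counts.most_common(3))
```

Calling plot_mismatches(counts) would then draw the chart; it is left out of the snippet so the counting part runs without a display.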
Would something like this work for you?
df['string diff'] = df.apply(lambda x: distance.levenshtein(x['Real Value'], x['Predicted Values']), axis=1)
plt.hist(df['string diff'])
plt.show()
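The distance.levenshtein call above comes from the third-party distance package. If that package is not available, a plain dynamic-programming edit distance can stand in; the following is a minimal sketch of the standard algorithm, not that package's implementation.

```python
def levenshtein(s, t):
    """Edit distance with unit-cost insert, delete and substitute."""
    prev = list(range(len(t) + 1))            # distances from "" to t[:j]
    for i, cs in enumerate(s, start=1):
        curr = [i]                            # distance from s[:i] to ""
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            curr.append(min(prev[j] + 1,          # delete cs
                            curr[j - 1] + 1,      # insert ct
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]

for real, predicted in [("hello", "halo"), ("what", "waht"), ("is", "is")]:
    print(real, predicted, levenshtein(real, predicted))
```

Note that a swap of adjacent letters (as in "what" and "waht") counts as two substitutions here, since transpositions are not a single operation in this distance.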

Python Iteration Through Lists with Stepped Progression

Given the following lists:
letters = ('a', 'b', 'c', 'd', 'e', 'f', 'g')
numbers = ('1', '2', '3', '4')
How can I produce an iterated list that produces the following:
output = [('a', '1'), ('b', '2'), ('c', '3'), ('d', '4'),
('e', '1'), ('f', '2'), ('g', '3'), ('a', '4'),
('b', '1'), ('c', '2'), ('d', '3'), ('e', '4'),
('f', '1'), ('g', '2')...]
I feel like I should be able to produce the desired output by using
output = list(zip(letters, itertools.cycle(numbers)))
But this produces the following:
output = [('a', '1'), ('b', '2'), ('c', '3'), ('d', '4'),
('e', '1'), ('f', '2'), ('g', '3')]
Any help would be greatly appreciated.
If you are looking for an infinite generator, you can use cycle with zip for both lists, in the form of zip(itertools.cycle(x), itertools.cycle(y)). That would supply you with the required generator:
>>> for x in zip(itertools.cycle(letters), itertools.cycle(numbers)):
...     print(x)
...
('a', '1')
('b', '2')
('c', '3')
('d', '4')
('e', '1')
('f', '2')
('g', '3')
('a', '4')
('b', '1')
('c', '2')
('d', '3')
...
If you want a finite list of elements, this should work
import itertools
letters = ('a', 'b', 'c', 'd', 'e', 'f', 'g')
numbers = ('1', '2', '3', '4')
max_elems = 10
list(itertools.islice((zip(itertools.cycle(letters), itertools.cycle(numbers))), max_elems))
results in
[('a', '1'), ('b', '2'), ('c', '3'), ('d', '4'), ('e', '1'), ('f', '2'), ('g', '3'), ('a', '4'), ('b', '1'), ('c', '2')]
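If the goal is one full round before the pattern repeats, the length can be derived instead of picked by hand: two cycles realign after the least common multiple of their lengths. A sketch under that assumption (the variable name period is illustrative):

```python
import itertools
import math

letters = ('a', 'b', 'c', 'd', 'e', 'f', 'g')
numbers = ('1', '2', '3', '4')

# The paired sequence repeats after lcm(len(letters), len(numbers)) items.
period = len(letters) * len(numbers) // math.gcd(len(letters), len(numbers))

pairs = list(itertools.islice(
    zip(itertools.cycle(letters), itertools.cycle(numbers)), period))

print(period)
print(pairs[:8])
```

Because 7 and 4 share no common factor, every letter meets every number exactly once within that period.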
