Is there a clear way to iterate over the items of each generator in a list? I believe the simplest way to show the essence of the question is to provide an example. Here it is:
0. Assume we have a function returning a generator:
def gen_fun(hint):
    for i in range(1,10):
        yield "%s %i" % (hint, i)
1. Clear solution with straight iteration order:
hints = ["a", "b", "c"]
for hint in hints:
    for txt in gen_fun(hint):
        print(txt)
This prints
a 1
a 2
a 3
...
b 1
b 2
b 3
...
2. Cumbersome solution with inverted iterating order
hints = ["a", "b", "c"]
generators = list(map(gen_fun, hints))
any = True
while any:
    any = False
    for g in generators:
        try:
            print(next(g))
            any = True
        except StopIteration:
            pass
This prints
a 1
b 1
c 1
a 2
b 2
...
This works as expected and does what I want.
Bonus points:
The same task, but the gen_fun ranges can differ, e.g.:
def gen_fun(hint):
    if hint == 'a':
        m = 5
    else:
        m = 10
    for i in range(1,m):
        yield "%s %i" % (hint, i)
The correct output for this case is:
a 1
b 1
c 1
a 2
b 2
c 2
a 3
b 3
c 3
a 4
b 4
c 4
b 5
c 5
b 6
c 6
b 7
c 7
b 8
c 8
b 9
c 9
The question:
Is there a cleaner way to implement case 2?
If I understand the question correctly, you can use zip() to achieve the same thing as that whole while any loop:
hints = ["a", "b", "c"]
generators = list(map(gen_fun, hints))
for x in zip(*generators):
    for txt in x:
        print(txt)
output:
a 1
b 1
c 1
a 2
b 2
...
UPDATE:
If the generators are of different lengths, zip 'trims' them all to the shortest. You can use itertools.zip_longest (itertools.izip_longest in Python 2, as suggested by this q/a) to get the opposite behaviour and keep yielding until the longest generator is exhausted. You'll need to filter out the padding values, though:
from itertools import zip_longest
hints = ["a", "b", "c"]
generators = list(map(gen_fun, hints))
for x in zip_longest(*generators):
    for txt in x:
        if txt is not None:  # skip the None padding added by zip_longest
            print(txt)
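If the generators might yield falsy values (an empty string, 0), the padding can be told apart from real items by passing a dedicated sentinel as fillvalue. A minimal sketch, assuming the bonus-case gen_fun from the question; round_robin and _MISSING are names made up for this illustration:

```python
from itertools import zip_longest

_MISSING = object()  # hypothetical sentinel; never equal to a real item


def gen_fun(hint):
    # Bonus-case generator from the question: 'a' is shorter than the others.
    m = 5 if hint == 'a' else 10
    for i in range(1, m):
        yield "%s %i" % (hint, i)


def round_robin(*iterables):
    # Take one item from each iterable in turn, skipping exhausted ones.
    for group in zip_longest(*iterables, fillvalue=_MISSING):
        for item in group:
            if item is not _MISSING:
                yield item


for txt in round_robin(*map(gen_fun, ["a", "b", "c"])):
    print(txt)
```

Because the check is an identity test against the sentinel, values such as None or "" produced by the generators themselves are passed through unchanged.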
You might want to look into itertools.product:
from itertools import product
# Case 1
for tup in product('abc', range(1,4)):
    print('{0} {1}'.format(*tup))
print('---')
# Case 2
for tup in product(range(1,4), 'abc'):
    print('{1} {0}'.format(*tup))
Output:
a 1
a 2
a 3
b 1
b 2
b 3
c 1
c 2
c 3
---
a 1
b 1
c 1
a 2
b 2
c 2
a 3
b 3
c 3
Note that the only difference between cases 1 and 2 is the order of the arguments passed to product and of the fields in the format string.
How do I get the "rest of the list" after the current element for an iterator in a loop?
I have a list:
[ "a", "b", "c", "d" ]
They are not actually letters, they are words, but the letters are there for illustration, and there is no reason to expect the list to be small.
For each member of the list, I need to:
def f(depth, list):
    for i in list:
        print(f"{depth} {i}")
        f(depth+1, rest_of_the_list_after_i)
f(0,[ "a", "b", "c", "d" ])
The desired output (with spaces for clarity) would be:
0 a
  1 b
    2 c
      3 d
    2 d
  1 c
    2 d
  1 d
0 b
  1 c
    2 d
  1 d
0 c
  1 d
0 d
I explored enumerate with little luck.
The reality of the situation is that there is a yield terminating condition. But that's another matter.
I am using (and learning with) python 3.10
This is not homework. I'm 48 :)
You could also look at it like:
0 a 1 b 2 c 3 d
2 d
1 c 2 d
1 d
0 b 1 c 2 d
1 d
0 c 1 d
0 d
That illustrates the stream nature of the thing.
Seems like there are plenty of answers here, but another way to solve your given problem:
def f(depth, l):
    for idx, item in enumerate(l):
        # indent by depth; print the whole item, since the real items are words
        step = f"{depth * ' '} {depth} {item}"
        print(step)
        f(depth + 1, l[idx + 1:])
f(0,[ "a", "b", "c", "d" ])
def f(depth, alist):
    # you don't need the loop if you only care about the first element
    # for i in list:
    if not alist:  # stop once the list is exhausted
        return
    print(f"{depth} {alist[0]}")
    next_depth = depth + 1
    rest_list = alist[1:]
    f(next_depth, rest_list)
This doesn't seem like a very useful method, though.
def f(depth, alist):
    # if you actually want to iterate over it
    for i, item in enumerate(alist):
        print(f"{depth} {item}")
        next_depth = depth + 1
        rest_list = alist[i + 1:]  # everything after the current element
        f(next_depth, rest_list)
I guess this code is what you're looking for
def f(depth, lst):
    for e,i in enumerate(lst):
        print(f"{depth} {i}")
        f(depth+1, lst[e+1:])
f(0,[ "a", "b", "c", "d" ])
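Since the question mentions a large list and a yield-based terminating condition, the same recursion can also be written as a generator that passes a start index instead of slicing, which avoids copying the tail of the list at every step. A minimal sketch; walk and its start parameter are names chosen for this illustration, not something from the question:

```python
def walk(items, depth=0, start=0):
    # Yield (depth, item) pairs; recurse on the elements after position i
    # by passing an index instead of copying the tail with a slice.
    for i in range(start, len(items)):
        yield depth, items[i]
        yield from walk(items, depth + 1, i + 1)


for depth, item in walk(["a", "b", "c", "d"]):
    print(f"{'  ' * depth}{depth} {item}")
```

Because it is a generator, the caller can stop consuming it (for example with break) as soon as its own terminating condition is met.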
I cannot find a solution for this very specific problem I have.
In essence, I have two lists with two elements each: [A, B] and [1,2]. I want to create a nested loop that iterates over and expands the second list, appending each element of the first list after each iteration.
What I want to see in the end is this:
A B
1 A
1 B
2 A
2 B
1 1 A
1 2 A
2 1 A
2 2 A
1 1 B
1 2 B
2 1 B
2 2 B
1 1 1 A
1 1 2 A
...
My problem is that my attempt at doing this recursively splits the A and B apart so that this pattern emerges (note the different first line, too):
A
1 A
2 A
1 1 A
1 2 A
2 1 A
2 2 A
1 1 1 A
1 1 2 A
...
B
1 B
2 B
1 1 B
1 2 B
2 1 B
2 2 B
1 1 1 B
1 1 2 B
...
How do I keep A and B together?
Here is the code:
def second_list(depth):
    if depth < 1:
        yield ''
    else:
        for elements in [' 1 ', ' 2 ']:
            for other_elements in list(second_list(depth-1)):
                yield elements + other_elements
for first_list in [' A ', ' B ']:
    for i in range(0,4):
        temp=second_list(i)
        for temp_list in list(temp):
            print(temp_list + first_list)
I would try something in the following style:
l1 = ['A', 'B']
l2 = ['1', '2']
def expand(l1, l2):
    nl1 = []
    for e in l1:
        for f in l2:
            nl1.append(f+e)
            yield nl1[-1]
    yield from expand(nl1,l2)
for x in expand(l1, l2):
    print (x)
    if len(x) > 5:
        break
Note: the first line of your output does not seem to follow the same rule, so it is not generated here; you can add it manually if you want.
Note 2: it would be more elegant not to build the list of newly generated elements, but then you would have to compute them twice.
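Another way to keep A and B together is to make the depth the outermost loop and build the numeric prefixes with itertools.product. This is only a sketch under an assumption: the desired listing in the question orders depth 1 differently from depth 2, so this version groups by letter within each depth (matching the depth-2 lines) and emits A and B on separate lines at depth 0. The names labelled and max_depth are made up for this illustration:

```python
from itertools import product


def labelled(letters, digits, max_depth):
    # Depth is the outermost loop, so every letter is emitted for a given
    # prefix length before any longer prefix is produced.
    for depth in range(max_depth):
        for letter in letters:
            for prefix in product(digits, repeat=depth):
                yield ' '.join(prefix + (letter,))


for line in labelled('AB', '12', 3):
    print(line)
```

Swapping the two inner loops would instead interleave the letters per prefix (1 A, 1 B, 2 A, 2 B), which matches the depth-1 lines of the question.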
I have a data set which contains something like this:
SNo Cookie
1 A
2 A
3 A
4 B
5 C
6 D
7 A
8 B
9 D
10 E
11 D
12 A
So let's say we have 5 cookies: 'A, B, C, D, E'. Now I want to count how many times each cookie reoccurs after a different cookie has been encountered. For example, in the data above, cookie A was encountered again at the 7th position and then at the 12th. NOTE: we don't count A at the 2nd position, because it directly follows another A, but before positions 7 and 12 we had seen other cookies, so we count those occurrences. So essentially I want something like this:
Sno Cookie Count
1 A 2
2 B 1
3 C 0
4 D 2
5 E 0
Can anyone give me logic or python code behind this?
One way to do this would be to first get rid of consecutive Cookies, then find where the Cookie has been seen before using duplicated, and finally groupby cookie and get the sum:
no_doubles = df[df.Cookie != df.Cookie.shift()].copy()  # copy so that adding a column does not trigger SettingWithCopyWarning
no_doubles['dups'] = no_doubles.Cookie.duplicated()
no_doubles.groupby('Cookie').dups.sum()
This gives you:
Cookie
A 2.0
B 1.0
C 0.0
D 2.0
E 0.0
Name: dups, dtype: float64
Start by removing consecutive duplicates, then count the survivors:
no_dups = df[df.Cookie != df.Cookie.shift()] # Borrowed from @sacul
no_dups.groupby('Cookie').count() - 1
# SNo
#Cookie
#A 2
#B 1
#C 0
#D 2
#E 0
pandas.factorize and numpy.bincount
If immediately repeated values are not counted then remove them.
Do a normal counting of values on what's left.
However, that is one more than what is asked for, so subtract one.
factorize
Filter out immediate repeats
bincount
Produce pandas.Series
i, r = pd.factorize(df.Cookie)
mask = np.append(True, i[:-1] != i[1:])
cnts = np.bincount(i[mask]) - 1
pd.Series(cnts, r)
A 2
B 1
C 0
D 2
E 0
dtype: int64
pandas.value_counts
Zip the cookies with a lagged copy of themselves and keep only the non-repeats
c = df.Cookie.tolist()
pd.value_counts([a for a, b in zip(c, [None] + c) if a != b]).sort_index() - 1
A 2
B 1
C 0
D 2
E 0
dtype: int64
defaultdict
from collections import defaultdict
def count(s):
    d = defaultdict(lambda:-1)
    x = None
    for y in s:
        d[y] += y != x
        x = y
    return pd.Series(d)
count(df.Cookie)
A 2
B 1
C 0
D 2
E 0
dtype: int64
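The same "drop immediate repeats, count, subtract one" logic can also be sketched without pandas, using itertools.groupby to collapse consecutive runs. The function name count_reoccurrences is hypothetical, and the input is assumed to be the Cookie column as a plain list:

```python
from collections import Counter
from itertools import groupby


def count_reoccurrences(cookies):
    # groupby collapses each run of consecutive equal values into one key.
    runs = Counter(key for key, _ in groupby(cookies))
    # The first run of each cookie is not a reoccurrence, so subtract one.
    return {cookie: n - 1 for cookie, n in sorted(runs.items())}


cookies = list("AAABCDABDEDA")  # the Cookie column from the question
print(count_reoccurrences(cookies))
```

With a DataFrame, the list could be obtained with df.Cookie.tolist(), as in the value_counts variant above.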
I am reading a file with pd.read_csv and removing all the values that are -1. Here's the code
import pandas as pd
import numpy as np
columns = ['A', 'B', 'C', 'D']
catalog = pd.read_csv('data.txt', sep=r'\s+', names=columns, skiprows=1)
a = catalog['A']
b = catalog['B']
c = catalog['C']
d = catalog['D']
print(len(b)) # answer is 700
# remove rows that are -1 in column b
idx = np.where(b != -1)[0]
a = a[idx]
b = b[idx]
c = c[idx]
d = d[idx]
print(len(b)) # answer is 612
So I am assuming that I have successfully managed to remove all the rows where the value in column b is -1.
In order to test this, I am doing the following naive way:
for i in range(len(b)):
    print(i, a[i], b[i])
It prints out the values until it reaches a row which was supposedly filtered out. But now it gives a KeyError.
You can filter by boolean indexing:
catalog = catalog[catalog['B'] != -1]
a = catalog['A']
b = catalog['B']
c = catalog['C']
d = catalog['D']
A KeyError is expected here: after filtering, the index labels no longer run 0, 1, 2, ..., so a[i] looks up a label that may have been removed.
One possible solution is to convert the Series to lists:
for i in range(len(b)):
    print(i, list(a)[i], list(b)[i])
Sample:
catalog = pd.DataFrame({'A':list('abcdef'),
'B':[-1,5,4,5,-1,4],
'C':[7,8,9,4,2,3],
'D':[1,3,5,7,1,0]})
print (catalog)
A B C D
0 a -1 7 1
1 b 5 8 3
2 c 4 9 5
3 d 5 4 7
4 e -1 2 1
5 f 4 3 0
#the filtered DataFrame has no index 0 or 4
catalog = catalog[catalog['B'] != -1]
print (catalog)
A B C D
1 b 5 8 3
2 c 4 9 5
3 d 5 4 7
5 f 4 3 0
a = catalog['A']
b = catalog['B']
c = catalog['C']
d = catalog['D']
print (b)
1 5
2 4
3 5
5 4
Name: B, dtype: int64
#a[i] in the first iteration looks up index label 0 (a[0]), which does not exist, so KeyError;
#same problem for b[0]
for i in range(len(b)):
    print (i, a[i], b[i])
KeyError: 0
#convert the Series to lists, so list(a)[0] returns the first value of the list - there is no Series index
for i in range(len(b)):
    print (i, list(a)[i], list(b)[i])
0 b 5
1 c 4
2 d 5
3 f 4
Another solution is to create a default index 0, 1, ... with reset_index and drop=True:
catalog = catalog[catalog['B'] != -1].reset_index(drop=True)
print (catalog)
A B C D
0 b 5 8 3
1 c 4 9 5
2 d 5 4 7
3 f 4 3 0
a = catalog['A']
b = catalog['B']
c = catalog['C']
d = catalog['D']
#the default index values now match a[0] and b[0]
for i in range(len(b)):
    print (i, a[i], b[i])
0 b 5
1 c 4
2 d 5
3 f 4
If you filter out indices, then
for i in range(len(b)):
    print(i, a[i], b[i])
will attempt to access the removed indices. Instead, you can use the following:
for i, ae, be in zip(a.index, a.values, b.values):
    print(i, ae, be)
def wordCount(inPath):
    inFile = open(inPath, 'r')
    lineList = inFile.readlines()
    counter = {}
    for line in range(len(lineList)):
        currentLine = lineList[line].rstrip("\n")
        for letter in range(len(currentLine)):
            if currentLine[letter] in counter:
                counter[currentLine[letter]] += 1
            else:
                counter[currentLine[letter]] = 1
    sorted(counter.keys(), key=lambda counter: counter[0])
    for letter in counter:
        print('{:3}{}'.format(letter, counter[letter]))
inPath = "file.txt"
wordCount(inPath)
This is the output:
a 1
k 1
u 1
l 2
12
h 5
T 1
r 4
c 2
d 1
s 5
i 6
o 3
f 2
H 1
A 1
e 10
n 5
x 1
t 5
This is the output I want:
12
A 1
H 1
T 1
a 1
c 2
d 1
e 10
f 2
h 5
i 6
k 1
l 2
n 5
o 3
r 4
s 5
t 5
u 1
x 1
How do I sort the "counter" alphabetically?
I've tried simply sorting by keys and by values, but that doesn't give alphabetical order with capitals first.
Thank you for your help!
sorted(counter.keys(), key=lambda counter: counter[0])
alone does nothing: it returns a result that isn't used at all (unless you recall it using _, but that is an interactive-interpreter practice).
Unlike a list, which you can sort in place with its .sort() method, dictionary keys cannot be sorted "in place". What you can do is iterate over the sorted keys:
for letter in sorted(counter.keys()):
The key=lambda counter: counter[0] argument is unnecessary here: your keys are single letters.
Aside: your whole code could be simplified a great deal using collections.Counter to count the letters.
import collections
c = collections.Counter("This is a Sentence")
for k,v in sorted(c.items()):
    print("{} {}".format(k,v))
result (including space char):
3
S 1
T 1
a 1
c 1
e 3
h 1
i 2
n 2
s 2
t 1
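Applied to the original task of counting the characters in a file, this could look roughly as follows. It is a sketch, not the asker's exact function: letter_counts is a hypothetical name, it accepts any iterable of lines, and io.StringIO stands in for the opened file so the example runs on its own:

```python
import collections
import io


def letter_counts(lines):
    # Count every character except the trailing newline of each line.
    counter = collections.Counter()
    for line in lines:
        counter.update(line.rstrip("\n"))
    return counter


# io.StringIO stands in for open("file.txt") so the sketch runs on its own.
sample = io.StringIO("Hi there\nHello\n")
for letter, n in sorted(letter_counts(sample).items()):
    print('{:3}{}'.format(letter, n))
```

Sorting the (character, count) pairs orders them by character code, which is why the space comes first, then capitals, then lowercase letters.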