Python Iterating through two lists only iterates through last element - python

I am trying to iterate through a nested list but am getting incorrect results. I am trying to get the count of each element across all the sublists.
l = [['<s>', 'a', 'a', 'b', 'b', 'c', 'c', '</s>'], ['<s>', 'a', 'c', 'b', 'c', '</s>'], ['<s>', 'b', 'c', 'c', 'a', 'b', '</s>']]
dict = {}
for words in l:
    for letters in words:
        dict[letters] = words.count(letters)
for x in dict:
    print(x + ":" + str(dict[x]))
At the moment, I am getting:
<s>:1
a:1
b:2
c:2
</s>:1
It seems as if it is only iterating through the last list in 'l' : ['<s>', 'b', 'c', 'c', 'a', 'b', '</s>']
but I am trying to get:
<s>: 3
a: 4
b: 5
c: 6
</s>:3

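In the inner for loop, you are not adding to the current value of dict[letters]; you are setting it to whatever count is found in the current sublist (peculiarly named words). Each sublist therefore overwrites the counts from the previous one, and only the last sublist's counts survive.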
Fixing your code with a vanilla dict:
>>> l = [['<s>', 'a', 'a', 'b', 'b', 'c', 'c', '</s>'], ['<s>', 'a', 'c', 'b', 'c', '</s>'], ['<s>', 'b', 'c', 'c', 'a', 'b', '</s>']]
>>> d = {}
>>> for sublist in l:
...     for x in sublist:
...         d[x] = d.get(x, 0) + 1
...
>>> d
{'<s>': 3, 'a': 4, 'b': 5, 'c': 6, '</s>': 3}
Note that I am not calling list.count in each inner for loop. Calling count will iterate over the whole list again and again. It is far more efficient to just add 1 every time a value is seen, which can be done by looking at each element of the (sub)lists exactly once.
Using a Counter (its repr lists elements from most to least common):
>>> from collections import Counter
>>> Counter(x for sub in l for x in sub)
Counter({'c': 6, 'b': 5, 'a': 4, '<s>': 3, '</s>': 3})
Using a Counter without manually unnesting the nested list:
>>> from collections import Counter
>>> from itertools import chain
>>> Counter(chain.from_iterable(l))
Counter({'c': 6, 'b': 5, 'a': 4, '<s>': 3, '</s>': 3})
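To print the counts in the token:count format from the question, the Counter can be iterated like a normal dict. This is a minimal sketch, assuming Python 3.7+ so that keys come back in first-seen order; the names counts, lines, token and n are only illustrative.

```python
from collections import Counter
from itertools import chain

l = [['<s>', 'a', 'a', 'b', 'b', 'c', 'c', '</s>'],
     ['<s>', 'a', 'c', 'b', 'c', '</s>'],
     ['<s>', 'b', 'c', 'c', 'a', 'b', '</s>']]

# Flatten the nested list lazily and count every element exactly once.
counts = Counter(chain.from_iterable(l))

# A Counter is a dict subclass, so items() yields (element, count) pairs.
lines = [token + ":" + str(n) for token, n in counts.items()]
print("\n".join(lines))
```

If the most frequent elements should come first, counts.most_common() can be iterated instead of counts.items().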

The dictionary entry is being overwritten on every iteration; it should be incremented instead:
count_dict[letters] += 1
For += to work on a key that has not been seen yet, initialize the dictionary as a defaultdict:
from collections import defaultdict
count_dict = defaultdict(int)
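Put together, a complete version of the defaultdict approach might look like the sketch below. It increments by 1 per element rather than adding words.count(letters) inside the inner loop, because the latter would add a sublist's count once for every occurrence and so overcount repeated elements.

```python
from collections import defaultdict

l = [['<s>', 'a', 'a', 'b', 'b', 'c', 'c', '</s>'],
     ['<s>', 'a', 'c', 'b', 'c', '</s>'],
     ['<s>', 'b', 'c', 'c', 'a', 'b', '</s>']]

count_dict = defaultdict(int)      # missing keys start at int() == 0
for words in l:
    for letters in words:
        count_dict[letters] += 1   # one increment per occurrence

print(dict(count_dict))
```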

As @Vishnudev said, you must add to the current count. But dict[letters] must already exist, or you will get a KeyError exception. You can use the get method of dict with a default value to avoid this:
l = [['<s>', 'a', 'a', 'b', 'b', 'c', 'c', '</s>'],
     ['<s>', 'a', 'c', 'b', 'c', '</s>'],
     ['<s>', 'b', 'c', 'c', 'a', 'b', '</s>']]
dict = {}
for words in l:
    for letters in words:
        dict[letters] = dict.get(letters, 0) + 1

As your question notes, the result only reflects the last sublist. This happens because on every iteration the previous dictionary values are overwritten by the values for the current sublist. You need to keep the previous counts and add to them instead.
You can try this:
l = [['<s>', 'a', 'a', 'b', 'b', 'c', 'c', '</s>'], ['<s>', 'a', 'c', 'b', 'c', '</s>'], ['<s>', 'b', 'c', 'c', 'a', 'b', '</s>']]
d = {}
for lis in l:
    for x in lis:
        if x in d:
            d[x] += 1
        else:
            d[x] = 1
The resulting dictionary d will be:
{'<s>': 3, 'a': 4, 'b': 5, 'c': 6, '</s>': 3}
I hope this helps!

Related

Insert frequency of elements in list without using temporary list

Input : ['a', 'a', 'b', 'b', 'b', 'b', 'c', 'c', 'd', 'a']
Output: ['a', 'a', 2, 'b', 'b', 'b', 'b', 4, 'c', 'c', 2, 'd', 1, 'a',1]
What is the best way to get the output above without using a temporary list in Python? I am trying a while loop, but the frequency of the last element is missed because the loop terminates.
l = ['a', 'a', 'b', 'b', 'b', 'b', 'c', 'c', 'd', 'a']
i = 0
idx = 0
length = len(l)
while i < length:
    if l[i - 1] != l[i]:
        l.insert(i, (i - idx))
        length += 1
        idx = i + 1
        i += 2
    else:
        i += 1
print(l)
The original code has a crucial bug: since i=0 in the first iteration, the condition is checking if the first and last items of the list match, which I don't think is what's intended.
Here's the corrected code.
l = ['a', 'a', 'b', 'b', 'b', 'b', 'c', 'c', 'd', 'a']
i = 0  # dynamic index
n = 1  # length of sequence
length = len(l)
while i < length:
    if i == length-1 or l[i] != l[i+1]:
        l.insert(i+1, n)
        length += 1
        n = 1
        i += 2
    else:
        n += 1
        i += 1
print(l)
Instead, I'm comparing to the subsequent item. This of course would be a problem for the last item, but this condition is caught first by the check i == length-1.
Just add the last frequency at the end, so after the loop:
l.append(i - idx)
But make sure your input list isn't empty!
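A minimal sketch of that fix applied to the loop from the question, wrapped in a hypothetical helper named insert_run_lengths. Two things here are my own assumptions rather than part of the suggestion: i starts at 1 so that items[i - 1] never wraps around to the last item, and an empty list is returned unchanged.

```python
def insert_run_lengths(items):
    """Insert each run's length after the run, modifying `items` in place."""
    if not items:            # nothing to count in an empty list
        return items
    i = 1                    # start at 1 so items[i - 1] never wraps to items[-1]
    start = 0                # index where the current run begins
    length = len(items)
    while i < length:
        if items[i - 1] != items[i]:
            items.insert(i, i - start)   # run ended just before index i
            length += 1
            start = i + 1                # next run starts after the inserted count
            i += 2
        else:
            i += 1
    items.append(length - start)         # count for the final run
    return items


print(insert_run_lengths(['a', 'a', 'b', 'b', 'b', 'b', 'c', 'c', 'd', 'a']))
```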
itertools is your friend:
import itertools
l = ['a', 'a', 'b', 'b', 'b', 'b', 'c', 'c', 'd', 'a']
list(itertools.chain(*[(k, len(list(g))) for k,g in itertools.groupby(l)]))
['a', 2, 'b', 4, 'c', 2, 'd', 1, 'a', 1]
.. but fixing your own code is more satisfying ;)
Edit: So the problem definition changed again? Python 3.9+ needed:
list(itertools.chain(*[(*(gg:=list(g)), len(gg)) for k,g in itertools.groupby(l)]))
['a', 'a', 2, 'b', 'b', 'b', 'b', 4, 'c', 'c', 2, 'd', 1, 'a', 1]
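The same groupby idea can be spelled out as a plain loop, which may be easier to read than the one-liner. This is only a sketch: with_run_lengths is a hypothetical name, and like the one-liner it builds a new list rather than modifying the input in place.

```python
from itertools import groupby


def with_run_lengths(items):
    """Return a new list with each run of equal items followed by its length."""
    result = []
    for _, group in groupby(items):   # groupby yields consecutive equal items together
        run = list(group)
        result.extend(run)            # the items of the run, unchanged
        result.append(len(run))       # followed by how many there were
    return result


print(with_run_lengths(['a', 'a', 'b', 'b', 'b', 'b', 'c', 'c', 'd', 'a']))
```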
Another approach could be:
from functools import reduce
breaks = [0] + [i for i in range(1, len(l)) if l[i] != l[i-1]] + [len(l)]
reduce(lambda x, y: x + y, [l[breaks[i-1]:breaks[i]] + [breaks[i] - breaks[i-1]] for i in range(1, len(breaks))])
OUTPUT
['a', 'a', 2, 'b', 'b', 'b', 'b', 4, 'c', 'c', 2, 'd', 1, 'a', 1]
Alternatively, if you would like to do everything inside the loop, this could be another approach
l = ['a', 'a', 'b', 'b', 'b', 'b', 'c', 'c', 'd', 'a']
l = list(reversed(l))
e = l.pop()
res = []
res.append(e)
i = 1
while len(l) > 0:
    new_e = l.pop()
    if new_e == e:
        i += 1
    else:
        res.append(i)
        i = 1
        e = new_e
    res.append(new_e)
res.append(i)
print(res)
OUTPUT
['a', 'a', 2, 'b', 'b', 'b', 'b', 4, 'c', 'c', 2, 'd', 1, 'a', 1]
You can insert the repetition counts as you go if you are careful with resetting the counter so that the insertions in the list don't skew your end of streak detection:
L = ['a', 'a','b', 'b', 'b', 'b', 'c', 'c', 'd', 'a']
count = 0                      # current number of repetitions
for i,c in enumerate(L):       # go through list's indexes and items
    if count and L[i-1] != c:  # end of non-empty streak
        L.insert(i,count)      # insert count
        count = 0              # and reset it
    else:
        count += 1             # count repetitions
L.append(count)                # count for last streak
print(L)
['a', 'a', 2, 'b', 'b', 'b', 'b', 4, 'c', 'c', 2, 'd', 1, 'a', 1]
When the count is inserted, the next iteration will be on the same item (c) that caused the end of streak. Because we reset the count to zero after insertion, that item will go to the else: part and be counted as 1 (0+1) thus properly starting a new streak.
Another way to approach this is to track the index of the first item of a streak and insert the difference between indexes when detecting a change. Here too, you need to be careful with the effect of inserting into the list that you are iterating over. The index of items that cause a break in the streak will be 1 more than their original index after inserting the count:
L = ['a', 'a','b', 'b', 'b', 'b', 'c', 'c', 'd', 'a']
s = 0                      # index of start of streak
for i,c in enumerate(L):   # go through list's indexes and items
    if L[s] != c:          # end of streak
        L.insert(i,i-s)    # insert count
        s = i+1            # track start of next streak
L.append(len(L)-s)         # count for last streak
print(L)
['a', 'a', 2, 'b', 'b', 'b', 'b', 4, 'c', 'c', 2, 'd', 1, 'a', 1]

Want to change an element of a list to a dictionary key in python

I want to convert a list to a dictionary, using the first element of the list as the key and the remaining elements as that key's value. Thanks in advance.
This is what I have:
lst = ['a', 'b', 'c', 'd']
print(lst)
['a', 'b', 'c', 'd']
and this is what I want:
dic = {'a':['b', 'c', 'd']}
print(dic)
{'a': ['b', 'c', 'd']}
or
print(dic['a'])
['b', 'c', 'd']
You can try:
lst = ['a', 'b', 'c', 'd']
dct = {lst[0]:lst[1:]}
It will give you the desired result.
I thought of a function in case you need to do this for several lists.
def todict(lst):
    first, *rest = lst
    return {first: rest}
In[1]: todict(lst)
Out[1]: {'a': ['b', 'c', 'd']}
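For the several-lists case, one possible way to use such a function is to merge the single-key dictionaries into one. This is a sketch under assumptions: the sample data in lists and the name combined are made up for illustration, lists that share a first element overwrite each other, and an empty list would raise ValueError during unpacking.

```python
def todict(lst):
    # The first element becomes the key, everything else becomes the value.
    first, *rest = lst
    return {first: rest}


lists = [['a', 'b', 'c', 'd'], ['x', 'y'], ['k']]

combined = {}
for lst in lists:
    combined.update(todict(lst))   # later lists win if the first element repeats

print(combined)
```

A list with a single element, such as ['k'], maps its key to an empty list.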

How can I print two counter side by side in python?

I have a function like this. I need to print the two listings (by frequency and alphabetical) side by side in Python.
from collections import Counter

def kelime_sayma(metin):
    kelimeler = metin.split()
    kelime_sayi = Counter(kelimeler)
    for i,value in kelime_sayi.most_common():
        print('{} {}'.format(i, value))
    for j,value in sorted(kelime_sayi.items()):
        print('{} {}'.format(j, value))
You can try:
>>> a
['c', 'a', 'b', 'b', 'c', 'b', 'b', 'b', 'a', 'a', 'b', 'a', 'b', 'c', 'c', 'a', 'c', 'a', 'a', 'b', 'a', 'a', 'c', 'a', 'b', 'c', 'c', 'c', 'b', 'a']
>>> b=Counter(a)
>>> b
Counter({'a': 11, 'b': 10, 'c': 9})
>>> for i,j in zip(b.most_common(), b.items()):
...     print('{} {} {} {}'.format(i[0], i[1], j[0], j[1]))
Output:
a 11 c 9
b 10 a 11
c 9 b 10
Question: print two dictionaries side by side
Use zip(*iterables) to pair up the two sequences, and enumerate to number the rows from 1:
for i, (v1, v2) in enumerate(zip(kelime_sayi.most_common(), sorted(kelime_sayi.items())), 1):
    print('{} {} {}'.format(i, v1, v2))
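To make the two listings line up in columns, the word and count from each pair can be unpacked and padded with format specifiers. A rough sketch, assuming words of at most ten characters; side_by_side and the sample sentence are hypothetical and not part of the original function.

```python
from collections import Counter


def side_by_side(text):
    """Return rows pairing the by-frequency and alphabetical views of the word counts."""
    counts = Counter(text.split())
    by_count = counts.most_common()    # most frequent first
    by_word = sorted(counts.items())   # alphabetical
    rows = []
    for (w1, n1), (w2, n2) in zip(by_count, by_word):
        # Left-align words in 10 columns, right-align counts in 3.
        rows.append('{:<10} {:>3}    {:<10} {:>3}'.format(w1, n1, w2, n2))
    return rows


for row in side_by_side("b a b c b a"):
    print(row)
```

Both views come from the same Counter, so they always have the same number of rows and zip does not drop anything.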

Iterating over two lists A and B

I am trying to iterate over two lists A and B, where B is equal to A with the current item A[i] removed.
For example, with listA = ['A', 'B', 'C', 'D']:
For the first item, 'A', in list A, I want list B to be ['B', 'C', 'D']. For the second item, 'B', in list A, I want list B to be ['A', 'C', 'D'].
What I have tried until now:
listA = ['A', 'B', 'C', 'D']
for term in listA:
    listA.remove(term)
    for item in listA:
        print(listA)
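If all you want is to print the sublists, it will be like:
for i in range(len(listA)):
    print(listA[:i]+listA[i+1:])
Or, if the items are unique and the order of the result does not matter (sets are unordered and drop duplicates):
for i in listA:
    print(list(set(listA) - {i}))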
Try this:
>>> la = ['A', 'B', 'C', 'D']
>>> for i in la:
...     _temp = la.copy()
...     _temp.remove(i)
...     print(_temp)
...
Output:
['B', 'C', 'D']
['A', 'C', 'D']
['A', 'B', 'D']
['A', 'B', 'C']
*If you want to assign the printed output to new variables, use a dictionary where the key is the name of the list and the value is the printed output.
Is this what you want?
listA = ['A', 'B', 'C', 'D']
Bs = [listA[:idx] + listA[idx + 1:] for idx in range(len(listA))]
for B in Bs:
    print(B)
Taking the above solutions a step further, you can store each resulting list under the item it excludes by using a dictionary comprehension:
keys_map = {x: [item for item in listA if item != x] for x in listA}
print(keys_map)
Output
{
'A': ['B', 'C', 'D'],
'B': ['A', 'C', 'D'],
'C': ['A', 'B', 'D'],
'D': ['A', 'B', 'C']
}
and access the desired key like so
keys_map.get('A')
# returns
['B', 'C', 'D']
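One caveat when choosing between these approaches: filtering by value removes every item equal to the key, while slicing by index removes only one position. The difference shows up only when the list contains duplicates. A small sketch with a made-up list that repeats 'A'; the names by_value and by_index are illustrative.

```python
listA = ['A', 'B', 'A', 'C']

# Value-based filter: drops every item equal to x, so both 'A's disappear.
by_value = {x: [item for item in listA if item != x] for x in listA}

# Index-based slicing: drops only the item at position i.
by_index = [listA[:i] + listA[i + 1:] for i in range(len(listA))]

print(by_value['A'])
print(by_index[0])
```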

Mapping one value to another in a list

In a Python list, how can I map all instances of one value to another value?
For example, suppose I have this list:
x = [1, 3, 3, 2, 3, 1, 2]
Now, perhaps I want to change all 1's to 'a', all 2's to 'b', and all 3's to 'c', to create another list:
y = ['a', 'c', 'c', 'b', 'c', 'a', 'b']
How can I do this mapping elegantly?
You should use a dictionary and a list comprehension:
>>> x = [1, 3, 3, 2, 3, 1, 2]
>>> d = {1: 'a', 2: 'b', 3: 'c'}
>>> [d[i] for i in x]
['a', 'c', 'c', 'b', 'c', 'a', 'b']
>>>
>>> x = [True, False, True, True, False]
>>> d = {True: 'a', False: 'b'}
>>> [d[i] for i in x]
['a', 'b', 'a', 'a', 'b']
>>>
The dictionary serves as a translation table of what gets converted into what.
An alternative solution is to use the built-in function map which applies a function to a list:
>>> x = [1, 3, 3, 2, 3, 1, 2]
>>> subs = {1: 'a', 2: 'b', 3: 'c'}
>>> list(map(subs.get, x)) # list() not needed in Python 2
['a', 'c', 'c', 'b', 'c', 'a', 'b']
Here the dict.get method was applied to each element of the list x, and each number was exchanged for its corresponding letter in subs.
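The two lookups behave differently for values that have no entry in the dictionary: d[i] raises KeyError, while dict.get returns None unless a default is supplied. A short sketch, assuming unmapped values should simply be kept as they are; the extra 4 in x and the names with_none and with_fallback are added for illustration.

```python
x = [1, 3, 3, 2, 4]              # 4 has no entry in subs
subs = {1: 'a', 2: 'b', 3: 'c'}

# dict.get without a default returns None for missing keys.
with_none = list(map(subs.get, x))

# Passing the element itself as the default leaves unmapped values unchanged.
with_fallback = [subs.get(i, i) for i in x]

print(with_none)
print(with_fallback)
```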
In [255]: x = [1, 3, 3, 2, 3, 1, 2]
In [256]: y = ['a', 'c', 'c', 'b', 'c', 'a', 'b']
In [257]: [dict(zip(x,y))[i] for i in x]
Out[257]: ['a', 'c', 'c', 'b', 'c', 'a', 'b']
