python dictionary comprehension iterator - python

Hey, I have a question about the following Python code I wrote:
#create a list of elements
#use a dictionary to find out the frequency of each element
list = [1,2,6,3,4,5,1,1,3,2,2,5]
list.sort()
dict = {i: list.count(i) for i in list}
print(dict)
In the dictionary comprehension, "for i in list" is the sequence supplied to it, right? So it takes 1, 2, 3, 4... as the keys. My question is: why doesn't it take 1 three times? Because I've said "for i in list", doesn't it have to take each and every element in the list as a key?
(I'm new to Python, so go easy on me!)

My question is why doesn't it take 1 three times ?
That's because dictionary keys are unique. If there is another entry found for the same key, the previous value for that key will be overwritten.
Well, for your issue, if you are only after counting the frequency of each element in your list, then you can use collections.Counter
And please don't use list as variable name. It's a built-in.
>>> lst = [1,2,6,3,4,5,1,1,3,2,2,5]
>>> from collections import Counter
>>> Counter(lst)
Counter({1: 3, 2: 3, 3: 2, 5: 2, 4: 1, 6: 1})
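A small sketch of the overwriting described above. The names `seen`, `count_and_record` and `freq` are illustrative and not from the original post. It records every key the comprehension visits, to show that 1 really is visited three times even though it ends up as a single key:

```python
lst = [1, 2, 6, 3, 4, 5, 1, 1, 3, 2, 2, 5]

# Record each key as the comprehension visits it (illustrative helper list).
seen = []

def count_and_record(value):
    seen.append(value)
    return lst.count(value)

freq = {i: count_and_record(i) for i in lst}

print(len(seen))   # 12: every element was visited, duplicates included
print(len(freq))   # 6: repeated keys overwrote their earlier entries
print(freq[1])     # 3
```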

Yes, your suspicion is correct: 1 will come up 3 times during iteration. However, since dictionaries have unique keys, each time 1 comes up it replaces the previously generated key/value pair with the newly generated one. This gives the right answer, but it's not the most efficient approach. You could convert the list to a set instead to avoid reprocessing duplicate keys:
dict = {i: list.count(i) for i in set(list)}
However, even this method is inefficient because list.count does a full pass over the list for each distinct value, i.e. O(n²) comparisons in the worst case. You could do this in a single pass over the list, but you wouldn't use a dict comprehension:
xs = [1,2,6,3,4,5,1,1,3,2,2,5]
counts = {}
for x in xs:
    counts[x] = counts.get(x, 0) + 1
The result for counts is: {1: 3, 2: 3, 3: 2, 4: 1, 5: 2, 6: 1}
Edit: I didn't realize there was something in the library to do this for you. You should use Rohit Jain's solution with collections.Counter instead.
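As a quick check, the three approaches mentioned in this thread can be compared side by side. This is only a sketch; the names `by_comprehension`, `by_loop` and `by_counter` are made up for the comparison:

```python
from collections import Counter

xs = [1, 2, 6, 3, 4, 5, 1, 1, 3, 2, 2, 5]

# One full pass of xs per distinct value.
by_comprehension = {x: xs.count(x) for x in set(xs)}

# A single pass with a plain dict.
by_loop = {}
for x in xs:
    by_loop[x] = by_loop.get(x, 0) + 1

# A single pass using the standard library.
by_counter = Counter(xs)

print(by_comprehension == by_loop == dict(by_counter))  # True
```

All three agree on the counts; they differ only in how many times the list is scanned.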

Related

Dictionary comprehension Bigram

I have a Python assignment to extract bigrams from a string into a dictionary. I think I found a solution online, but I can't remember where. It seems to work, but I am having trouble understanding it as I am new to Python. Can anyone explain the code below, which takes a string, extracts pairs of characters, counts their occurrences, and puts them into a dictionary?
'''
s = 'Mississippi' # Your string
# Dictionary comprehension
dic_ = {k : s.count(k) for k in{s[i]+s[i+1] for i in range(len(s)-1)}}
'''
First let's understand comprehensions:
A list, dict, set, etc. can be made with a comprehension. Basically, a comprehension takes the values produced by a generator-like expression and collects them into a new object. A generator is just an object that yields a different value on each iteration. Using a list as an example: a list comprehension takes the values the generator outputs and puts each one into its own slot in a list. Take this expression for example:
x for x in range(0, 10)
This gives 0 on the first iteration, then 1, then 2, etc. To make this a list we use [] (list brackets) like so:
[x for x in range(0, 10)]
This would give:
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9] #note: range does not include the second input
For a dictionary and for a set we use {}, but since dictionaries use key-value pairs, the expression inside differs between sets and dictionaries. For a set it is the same as for a list:
{x for x in range(0, 10)} #gives the set --> {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
For a dictionary, however, we need a key and a value. Since enumerate yields two items at a time, it can be useful for dictionaries in some cases:
{key: value for key, value in enumerate([1,2,3])}
In this case the keys are the indexes and the values are the items in the list. So this gives:
{0: 1, 1: 2, 2: 3} #dictionary
It doesn't make a set because we denote x : y which is the format for items in a dictionary, not a set.
Now, let's break this down:
This part of the code:
{s[i]+s[i+1] for i in range(len(s)-1)}
makes a set containing every pair of adjacent letters. s[i] is one letter and s[i+1] is the letter after it, so the expression says: build this pair (s[i]+s[i+1]) for every position in the string (for i in range(len(s)-1)). Notice the -1: the last letter has no letter after it, so we don't want to run it for the last letter.
Now that we have a set let's save it to a variable so it's easier to see:
setOfPairs = {s[i]+s[i+1] for i in range(len(s)-1)}
Then our original comprehension would change to:
{k : s.count(k) for k in setOfPairs}
This says we want a dictionary with keys k and values s.count(k). Since every k comes from our set of pairs (for k in setOfPairs), the keys of the dictionary are the pairs. Since s.count(k) returns the number of times k occurs in s, the values of the dictionary are the number of times each key appears in s.
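The two steps above can be run separately to inspect the intermediate set. This is a sketch using the string from the question; `set_of_pairs` is just a snake_case spelling of the `setOfPairs` name used above, and `sorted` is added only so the printed order is predictable:

```python
s = 'Mississippi'

# Step 1: the set of distinct adjacent pairs (set order is arbitrary).
set_of_pairs = {s[i] + s[i + 1] for i in range(len(s) - 1)}
print(sorted(set_of_pairs))
# ['Mi', 'ip', 'is', 'pi', 'pp', 'si', 'ss']

# Step 2: map each pair to the number of times str.count finds it in s.
dic_ = {k: s.count(k) for k in set_of_pairs}
print(dic_['ss'], dic_['Mi'])  # 2 1
```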
Let's take this apart one step at a time:
1. s[i] is the code to select the i-th letter in the string s.
2. s[i]+s[i+1] concatenates the letter at position i and the letter at position i+1.
3. s[i]+s[i+1] for i in range(len(s)-1) iterates over each index i (except the last one) and so computes all the bigrams.
4. Since the expression in 3 is surrounded by curly brackets, the result is a set, meaning that all duplicate bigrams are removed.
5. for k in {s[i]+s[i+1] for i in range(len(s)-1)} therefore iterates over all unique bigrams in the given string s.
6. Lastly, {k : s.count(k) for k in{s[i]+s[i+1] for i in range(len(s)-1)}} maps each bigram k to the number of times it appears in s, because the str.count function returns the number of (non-overlapping) times a substring appears in a string.
I hope that helps. If you want to know more about list/set/dict comprehensions in Python, the relevant entry in the Python documentation is here: https://docs.python.org/3/tutorial/datastructures.html?highlight=comprehension#list-comprehensions
dic_ = {k : s.count(k) for k in{s[i]+s[i+1] for i in range(len(s)-1)}}
Read it backwards:
dic_ = {k : s.count(k)
# Step 3: for each pair of characters, count how many times it occurs in the string,
# then store the two-character string as the key and the count as the value.
for k in{s[i]+s[i+1]
# Step 2: from each position, take two characters out of the string.
for i in range(len(s)-1)}}
# Step 1: loop over all but the last character of the string.
The code may be inefficient for long strings. The set in Step 2 removes duplicate pairs, so each distinct pair is counted only once, but s.count(k) still rescans the whole string for every one of them.
Refactoring so that the counts are built up in a single pass may speed it up. Benchmark it on, say, a billion-base-pair DNA sequence.
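One possible single-pass refactoring, offered as a sketch rather than a drop-in replacement: the helper name `bigram_counts` is hypothetical, and it swaps str.count for collections.Counter. The two are not always equivalent, because str.count skips overlapping matches while this version counts every adjacent pair:

```python
from collections import Counter

def bigram_counts(s):
    """Count every adjacent pair of characters in one pass over s."""
    return Counter(a + b for a, b in zip(s, s[1:]))

print(bigram_counts('Mississippi')['ss'])  # 2

# str.count skips overlapping matches, so the two approaches can differ:
print('aaa'.count('aa'))           # 1
print(bigram_counts('aaa')['aa'])  # 2
```

For 'Mississippi' both approaches give the same dictionary; they only diverge on strings with overlapping repeats such as 'aaa'.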

Fill list using Counter with 0 values

Is it possible to count how many times a value appears in a given list, and get 0 if the item is not in the list?
I have to use zip, but the first list has 5 items and the other one, created using count, has only 3. That's why I need to fill the other two positions with 0 values.
You can achieve this with itertools.zip_longest.
With zip_longest, you can zip two lists of different lengths; the missing values are filled with None by default. You may define a suitable fill value, as I have done below.
from itertools import zip_longest
a = ['a','b','c','d','e']
b = [1,4,3]
final_lst = list(zip_longest(a,b, fillvalue=0))
final_dict = dict(list(zip_longest(a,b, fillvalue=0))) #you may convert answer to dictionary if you wish
Otherwise, if what you are trying to do is count the number of times items in a reference list appear in another list (also recording reference items that don't appear in the other list), you may use a dictionary comprehension:
ref_list = ['a','b','c','d','e']#reference list
other_list = ['a','b','b','d','a','d','a','a','a']
count_dict = {n:other_list.count(n) for n in ref_list}
print (count_dict)
Output
{'a': 5, 'b': 2, 'c': 0, 'd': 2, 'e': 0}
Use collections.Counter, and then call get with a default value of 0 to see how many times any given element appears:
>>> from collections import Counter
>>> counts = Counter([1, 2, 3, 1])
>>> counts.get(1, 0)
2
>>> counts.get(2, 0)
1
>>> counts.get(5, 0)
0
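Building on the Counter approach, a sketch of how the counts could be lined up with a reference list before zipping. The lists are borrowed from the dictionary-comprehension answer above and `aligned` is an illustrative name. It relies on the fact that indexing a Counter with a missing key returns 0, so `get` is optional:

```python
from collections import Counter

ref_list = ['a', 'b', 'c', 'd', 'e']
other_list = ['a', 'b', 'b', 'd', 'a', 'd', 'a', 'a', 'a']

counts = Counter(other_list)

# Indexing a Counter with a missing key returns 0 instead of raising KeyError.
aligned = [counts[item] for item in ref_list]
print(aligned)                       # [5, 2, 0, 2, 0]
print(list(zip(ref_list, aligned)))
```

Because `aligned` has the same length as `ref_list`, a plain zip is enough and no fill value is needed.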
If you want to count how many times a value appears in a list, you could do this:
def count_in_list(list_, value):
    count = 0
    for e in list_:
        if e == value:
            count += 1
    return count
And use the code like this:
MyList=[1,3,1,1,1,1,1,2]
count_in_list(MyList,1)
Output:
6
This will work without any additional things such as imports.

Python: dictionary list comprehension for count of number of instances

I have the following:
a = rand(100).round() #for example
count = {}
for i in a:
    count[i] = count.get(i, 0) + 1
# print(a)
print(count)
The last line returns something like {0.0: 52, 1.0: 48}
I would like to do the for loop as dictionary comprehension. But,
count = {i: count.get(i,0)+1 for i in a}
always returns {0.0: 1, 1.0: 1}
What am I doing wrong?
Why not use the appropriately named Counter?
>>> from collections import Counter
>>> c = Counter([1,1,1,1,1,1,1,1,1,1,1,1,5,4,3,2,3,4,1,3,13,12,13,2,1,13,4,4,4])
>>> c
Counter({1: 14, 4: 5, 3: 3, 13: 3, 2: 2, 5: 1, 12: 1})
The statement
count = {i: count.get(i,0)+1 for i in a}
is composed of two parts:
{i: count.get(i,0)+1 for i in a}
and
count = ...
The first part computes a dictionary. While it is being evaluated, count is still the empty dictionary you defined earlier, which has no relation to the dictionary being constructed by the comprehension.
Only once construction has finished is the new dictionary assigned to count (replacing the empty one). During evaluation of the comprehension count stays empty, so every get returns the default value of 0.
There is no way to refer to the object under construction (e.g. a list or a dictionary) from the expressions used inside the comprehension.
I think your comprehension version looks like this:
count = {}
count = {i: count.get(i,0)+1 for i in a}
When the comprehension is executed, count refers to the empty dictionary created in the previous line, so count.get(i,0) always returns 0. That is why every value in the result is 1. If you hadn't defined count in the previous line, you would get
NameError: global name 'count' is not defined
because count is not defined yet in the program.
Note: You cannot reference the dictionary being constructed in the dictionary comprehension.
So updating the dictionary will not work inside the comprehension. The solution you already have is fine.
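A short sketch that reproduces the behaviour explained above, using a plain list instead of the random array from the question. The names `attempt` and `working` are illustrative:

```python
a = [0.0, 1.0, 1.0, 0.0, 1.0]

count = {}
# The name `count` still refers to the empty dict while this runs.
attempt = {i: count.get(i, 0) + 1 for i in a}
print(attempt)  # {0.0: 1, 1.0: 1}

# A comprehension that works, because it never needs the partial result.
working = {i: a.count(i) for i in set(a)}
print(working == {0.0: 2, 1.0: 3})  # True
```

The working variant rescans the list once per distinct value, so the original loop (or collections.Counter) is still the better choice for large inputs.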

How to return dictionary keys as a list in Python?

With Python 2.7, I can get dictionary keys, values, or items as a list:
newdict = {1:0, 2:0, 3:0}
newdict.keys()
# [1, 2, 3]
With Python >= 3.3, I get:
newdict.keys()
# dict_keys([1, 2, 3])
How do I get a plain list of keys with Python 3?
This will convert the dict_keys object to a list:
list(newdict.keys())
On the other hand, you should ask yourself whether or not it matters. It is Pythonic to assume duck typing -- if it looks like a duck and it quacks like a duck, it is a duck. The dict_keys object can be iterated over just like a list. For instance:
for key in newdict.keys():
    print(key)
Note that dict_keys doesn't support insertion newdict[k] = v, though you may not need it.
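To illustrate where a dict_keys view and a list actually differ, here is a small sketch; `keys_view` and `keys_list` are illustrative names. The view stays in sync with the dictionary, while the list is a one-off copy that can be indexed:

```python
newdict = {1: 0, 2: 0, 3: 0}
keys_view = newdict.keys()
keys_list = list(newdict)

newdict[4] = 0  # add a key after both objects were created

print(4 in keys_view)  # True: the view tracks the dictionary
print(4 in keys_list)  # False: the list is a snapshot

try:
    keys_view[0]
except TypeError as exc:
    print("views cannot be indexed:", exc)
```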
Python >= 3.5 alternative: unpack into a list literal [*newdict]
New unpacking generalizations (PEP 448) were introduced with Python 3.5 allowing you to now easily do:
>>> newdict = {1:0, 2:0, 3:0}
>>> [*newdict]
[1, 2, 3]
Unpacking with * works with any object that is iterable and, since dictionaries return their keys when iterated through, you can easily create a list by using it within a list literal.
Adding .keys() i.e [*newdict.keys()] might help in making your intent a bit more explicit though it will cost you a function look-up and invocation. (which, in all honesty, isn't something you should really be worried about).
The *iterable syntax is similar to doing list(iterable), and its behaviour was initially documented in the Calls section of the Python Reference manual. PEP 448 loosened the restriction on where *iterable could appear, allowing it to be placed in list, set and tuple literals as well; the reference manual section on Expression lists was updated to state this.
[*newdict] is equivalent to list(newdict), except that it's faster (at least for small dictionaries) because no function call is actually performed:
%timeit [*newdict]
1000000 loops, best of 3: 249 ns per loop
%timeit list(newdict)
1000000 loops, best of 3: 508 ns per loop
%timeit [k for k in newdict]
1000000 loops, best of 3: 574 ns per loop
With larger dictionaries the speed is pretty much the same (the overhead of iterating through a large collection trumps the small cost of a function call).
In a similar fashion, you can create tuples and sets of dictionary keys:
>>> *newdict,
(1, 2, 3)
>>> {*newdict}
{1, 2, 3}
Beware of the trailing comma in the tuple case!
list(newdict) works in both Python 2 and Python 3, providing a simple list of the keys in newdict. keys() isn't necessary.
You can also use a list comprehension:
>>> newdict = {1:0, 2:0, 3:0}
>>> [k for k in newdict.keys()]
[1, 2, 3]
Or, shorter,
>>> [k for k in newdict]
[1, 2, 3]
Note: Order is not guaranteed on versions under 3.7 (ordering is still only an implementation detail with CPython 3.6).
A bit off on the "duck typing" definition: dict.keys() returns an iterable object, not a list-like object. It will work anywhere an iterable will work, but not everywhere a list will. A list is also an iterable, but an iterable is NOT necessarily a list (or sequence...).
In real use-cases, the most common thing to do with the keys in a dict is to iterate through them, so this makes sense. And if you do need them as a list you can call list().
Very similarly for zip() -- in the vast majority of cases, it is iterated through -- why create an entire new list of tuples just to iterate through it and then throw it away again?
This is part of a large trend in python to use more iterators (and generators), rather than copies of lists all over the place.
dict.keys() should work with comprehensions, though -- check carefully for typos or something... it works fine for me:
>>> d = dict(zip(['Sounder V Depth, F', 'Vessel Latitude, Degrees-Minutes'], [None, None]))
>>> [key.split(", ") for key in d.keys()]
[['Sounder V Depth', 'F'], ['Vessel Latitude', 'Degrees-Minutes']]
If you need to store the keys separately, here's a solution that requires less typing than every other solution presented thus far, using Extended Iterable Unpacking (Python3.x+):
newdict = {1: 0, 2: 0, 3: 0}
*k, = newdict
k
# [1, 2, 3]
Operation      No. of characters
k = list(d)    9 (excluding whitespace)
k = [*d]       6
*k, = d        5
Converting to a list without using the keys method makes it more readable:
list(newdict)
and, when looping through dictionaries, there's no need for keys():
for key in newdict:
    print(key)
unless you are modifying it within the loop which would require a list of keys created beforehand:
for key in list(newdict):
    del newdict[key]
On Python 2 there is a marginal performance gain using keys().
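A sketch of why the list() snapshot matters when deleting inside the loop: in Python 3, changing a dictionary's size while iterating over it directly raises RuntimeError.

```python
newdict = {1: 0, 2: 0, 3: 0}

try:
    for key in newdict:
        del newdict[key]       # changes the dict's size mid-iteration
except RuntimeError as exc:
    print("direct iteration failed:", exc)

newdict = {1: 0, 2: 0, 3: 0}
for key in list(newdict):      # iterate over a snapshot of the keys
    del newdict[key]
print(newdict)  # {}
```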
Yes, there is a simpler way to do this in Python 3.x: use the built-in list() function.
newdict = {1:0, 2:0, 3:0}
key_list = list(newdict)
print(key_list)
#[1, 2, 3]
I can think of 2 ways in which we can extract the keys from the dictionary.
Method 1: -
To get the keys using .keys() method and then convert it to list.
some_dict = {1: 'one', 2: 'two', 3: 'three'}
list_of_keys = list(some_dict.keys())
print(list_of_keys)
-->[1,2,3]
Method 2: -
To create an empty list and then append keys to the list via a loop.
You can get the values with this loop as well (use .keys() for just keys and .items() for both keys and values extraction)
list_of_keys = []
list_of_values = []
for key,val in some_dict.items():
    list_of_keys.append(key)
    list_of_values.append(val)
print(list_of_keys)
-->[1,2,3]
print(list_of_values)
-->['one','two','three']
Beyond the classic (and probably more correct) way to do this (some_dict.keys()) there is also a more "cool" and surely more interesting way to do this:
some_dict = { "foo": "bar", "cool": "python!" }
print( [*some_dict] == ["foo", "cool"] ) # True
Note: this solution shouldn't be used in a develop environment; I showed it here just because I thought it was quite interesting from the *-operator-over-dictionary side of view. Also, I'm not sure whether this is a documented feature or not, and its behaviour may change in later versions :)
You can use a simple method like the one below (note that in Python 3 this gives a dict_keys view rather than a list):
keys = newdict.keys()
print(keys)
This is the best way to get key List in one line of code
dict_variable = {1:"a",2:"b",3:"c"}
[key_val for key_val in dict_variable.keys()]
Get a list of keys with specific values
You can select a subset of the keys that satisfies a specific condition. For example, if you want to select the list of keys where the corresponding values are not None, then use
[k for k,v in newdict.items() if v is not None]
Slice the list of keys in a dictionary
Since Python 3.7, dicts preserve insertion order. So one use case of list(newdict) might be to select keys from a dictionary by its index or slice it (not how it's "supposed" to be used but certainly a possible question). Instead of converting to a list, use islice from the built-in itertools module, which is much more efficient since converting the keys into a list just to throw away most of it is very wasteful. For example, to select the second key in newdict:
from itertools import islice
next(islice(newdict, 1, 2))
or to slice the second to fifth key (the stop index is exclusive):
list(islice(newdict, 1, 5))
For large dicts, it's thousands of times faster than list(newdict)[1] etc.
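A runnable sketch of the islice idea with a sample dictionary. The dictionary contents and the helper name `nth_key` are made up for illustration, and it assumes Python 3.7+ so that insertion order is preserved:

```python
from itertools import islice

newdict = {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5, 'f': 6}

def nth_key(d, n):
    """Return the key at zero-based position n (hypothetical helper)."""
    return next(islice(d, n, n + 1))

print(nth_key(newdict, 1))           # b
print(list(islice(newdict, 1, 5)))   # ['b', 'c', 'd', 'e']
```

Note that islice uses zero-based, half-open bounds like ordinary slicing, so the second through fifth keys are positions 1 up to (but not including) 5.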

What is the Pythonic way to replace a specific set element?

I have a python set set([1, 2, 3]) and always want to replace the third element of the set with another value.
It can be done like below:
def change_last_elemnent(data):
    result = []
    for i, j in enumerate(list(data)):
        if i == 2:
            j = 'C'
        result.append(j)
    return set(result)
But is there a more Pythonic way to do that, one that is smarter and more readable?
Thanks in advance.
Sets are unordered, so the 'third' element doesn't really mean anything.
If removing an arbitrary element and adding a new one is what you want, you can simply do:
data.pop()
data.add(new_value)
If you wish to remove an item from the set by value and replace it, you can do:
data.remove(value) #data.discard(value) if you don't care if the item exists.
data.add(new_value)
If you want to keep ordered data, use a list and do:
data[index] = new_value
To show that sets are not ordered:
>>> list({"dog", "cat", "elephant"})
['elephant', 'dog', 'cat']
>>> list({1, 2, 3})
[1, 2, 3]
You can see that it is only a coincidence of CPython's implementation that '3' is the third element of a list made from the set {1, 2, 3}.
Your example code is also deeply flawed in other ways. new_list doesn't exist. At no point is the old element removed from the list, and the act of looping through the list is entirely pointless. Obviously, none of that really matters as the whole concept is flawed.
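If replacing by value is what is actually needed, the remove-and-add steps above could be wrapped in a small function. This is only a sketch: the name `replace_in_set` is hypothetical, and it returns a new set rather than mutating the original:

```python
def replace_in_set(data, old_value, new_value):
    """Return a new set with old_value swapped for new_value (hypothetical helper)."""
    if old_value not in data:
        return set(data)
    return (data - {old_value}) | {new_value}

print(replace_in_set({1, 2, 3}, 3, 'C') == {1, 2, 'C'})  # True
```

Returning a copy keeps the caller's set untouched; use data.remove and data.add directly if in-place mutation is preferred.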
