Data frame:
pair = collections.defaultdict(collections.Counter)
e.g.
pair = {'doc1': {'word1':4, 'word2':3},
'doc2': {'word1':2, 'word3':4},
'doc3': {'word2':2, 'word4':1},
...}
I want to keep the data frame but change the type of the inner values, i.e. {'word1':4, 'word2':3}, {'word1':2, 'word3':4}, ... Each one is currently a Counter and I need a dict.
I tried this to get the data from pair, but I do not know how to create a dict for each doc:
new_pair = collections.defaultdict(collections.Counter)
for doc, tab in testing.form.items():
    for word, freq in tab.items():
        new_pair[doc][word] = freq
I do not want to change the output. I just need that in each doc, the data type is dict, not Counter.
A Counter is already a dict, or rather a subclass of it. But if you really need exactly a dict for some reason, it's a one-liner:
>>> from collections import Counter
>>> c = Counter(word1=4, word2=3)
>>> c
Counter({'word1': 4, 'word2': 3})
>>> dict(c)
{'word1': 4, 'word2': 3}
Any Mapping (anything that behaves like a dictionary) can be passed into dict, and you will get a dict with the same contents. There is no need to iterate over it to construct it yourself.
This gives you one loop, with one line in the body instead of a nested loop. But any code of the form:
thing = a new empty collection
for elem in old_thing:
    Add something to do with elem to thing
Can usually be done in one line using a generator expression or a list, set or dict comprehension. We're building a dict, so a dict comprehension (the Examples section is what you're most interested in) seems likely. I'll leave coming up with it as an exercise for the reader. ;-)
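For readers who would rather see the exercise worked, here is a minimal sketch of that dict comprehension. It assumes pair has the shape shown in the question; the sample data is copied from there.

```python
from collections import Counter, defaultdict

# Sample data in the shape described in the question.
pair = defaultdict(Counter)
pair['doc1'].update({'word1': 4, 'word2': 3})
pair['doc2'].update({'word1': 2, 'word3': 4})
pair['doc3'].update({'word2': 2, 'word4': 1})

# dict(tab) copies each Counter into a plain dict; the outer
# comprehension rebuilds the top-level mapping as a plain dict too.
new_pair = {doc: dict(tab) for doc, tab in pair.items()}

print(type(new_pair['doc1']))
print(new_pair['doc1'])
```

The contents are unchanged; only the types differ, so code that compares or prints the counts keeps working.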
Since a Counter is already a dict, I would like to suggest this in addition to @lvc's answer:
>>> c = Counter(word1=4, word2=3)
>>> c
Counter({'word1': 4, 'word2': 3})
>>> isinstance(c,dict)
True
>>> {**c}
{'word1': 4, 'word2': 3}
This also lets you add more keys or combine multiple dicts or Counters:
>>> {**c, 'total': sum(c.values())}
{'word1': 4, 'word2': 3, 'total': 7}
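One caveat is worth a quick sketch: when the same key appears in both mappings, unpacking keeps the value from the right-hand one rather than summing the two. The names a and b below are made up for illustration.

```python
from collections import Counter

a = Counter(word1=4, word2=3)
b = Counter(word1=2, word3=4)

# Unpacking builds a plain dict; for the shared key 'word1'
# the value from b (the later mapping) wins.
merged = {**a, **b}

# Counter addition sums the counts; dict() then converts the result.
summed = dict(a + b)

print(merged)
print(summed)
```

So unpacking is fine for converting or extending a single Counter, but if the counts should be added together, do the addition on the Counters first.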
Maybe you are looking for:
>>> from collections import defaultdict
>>> pair = defaultdict(dict)
>>> pair[3][2]='hello'
>>>
>>> pair
defaultdict(<type 'dict'>, {3: {2: 'hello'}})
>>>
>>> pair[3]
{2: 'hello'}
>>>
new_pair = {}  # simple dict at the top level
for doc, tab in testing.form.items():
    inner = new_pair.setdefault(doc, {})  # plain dict per document
    for word, freq in tab.items():
        # accumulate the count for this word
        inner[word] = inner.get(word, 0) + freq
A Counter is also a dict. But depending on what you need, the following code may be what you want.
new_pair = {}
for doc, tab in pair.items():
    new_pair[doc] = {}
    for word, freq in tab.items():
        new_pair[doc][word] = freq
The new_pair dict is what you want. Good luck!
I tried to make the program store only the last 3 scores (values) for each key (name). However, the program either stores 3 scores and then never updates them, or keeps appending more values than it should.
The code I have so far:
#appends values if a key already exists
while tries < 3:
    d.setdefault(name, []).append(scores)
    tries = tries + 1
Though I could not fully understand your question, what I take from it is that you want to store only the last three scores in the list. That is a simple task.
d.setdefault(name,[]).append(scores)
if len(d[name])>3:
    del d[name][0]
This code checks after every addition whether the length of the list exceeds 3. If it does, the first element (the one added before the last three) is deleted.
Use a collections.defaultdict + collections.deque with a max length set to 3:
from collections import deque,defaultdict
d = defaultdict(lambda: deque(maxlen=3))
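Then call d[name].append(score): if the key does not exist, the key/value pair is created; if it does exist, the score is simply appended.
Deleting an element from the start of a list is an inefficient solution, because every remaining element has to be shifted down. A deque with maxlen discards the oldest item automatically.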
Demo:
from random import randint
for _ in range(10):
    for name in range(4):
        d[name].append(randint(1,10))
print(d)
defaultdict(<function <lambda> at 0x7f06432906a8>, {0: deque([9, 1, 1], maxlen=3), 1: deque([5, 5, 8], maxlen=3), 2: deque([5, 1, 3], maxlen=3), 3: deque([10, 6, 10], maxlen=3)})
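Because the demo above uses random scores, the eviction is hard to see in its output. Here is a deterministic sketch of the same idea; record_score and the name 'alice' are hypothetical and only serve the illustration.

```python
from collections import defaultdict, deque

# Each new key gets its own deque that holds at most three items.
d = defaultdict(lambda: deque(maxlen=3))

def record_score(name, score):
    """Hypothetical helper: store a score, keeping only the latest three."""
    d[name].append(score)

for score in [7, 4, 9, 10, 2]:
    record_score('alice', score)

# The two oldest scores (7 and 4) have been pushed out.
print(list(d['alice']))
```

No length check or explicit deletion is needed; the deque drops the oldest entry itself once it is full.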
One good way of keeping the last N items in Python is a deque with maxlen N, so in this case you can use the defaultdict and deque classes from the collections module.
Example:
>>> from collections import defaultdict ,deque
>>> l=[1,2,3,4,5]
>>> d=defaultdict(deque)
>>> d['q']=deque(maxlen=3)
>>> for i in l:
...     d['q'].append(i)
...
>>> d
defaultdict(<type 'collections.deque'>, {'q': deque([3, 4, 5], maxlen=3)})
A slight variation on another answer, in case you want to extend the list stored under name with several scores at once:
d.setdefault(name,[]).extend(scores)
if len(d[name])>3:
    del d[name][:-3]
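from collections import defaultdict
d = defaultdict(list)
d[key].append(val)
d[key] = d[key][-3:]  # keep only the last three values
# One-line variant; note that it stops appending once three values are
# stored, so it keeps the first three rather than the last three:
len(d[key])>2 or d[key].append(val)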
I want to know if there is a command that can do this:
>>>A=dict()
>>>A[1]=3
>>>A
{1:3}
>>>A[1].add(5) #This is the command that I don't know if exists.
>>>A
{1:(3,5)}
I mean, add another value to the same key without discarding the old value.
Is it possible to do this?
You could make the dictionary values into lists:
>>> A = dict()
>>> A[1] = [3]
>>> A
{1: [3]}
>>> A[1].append(5) # Add a new item to the list
>>> A
{1: [3, 5]}
>>>
You may also be interested in dict.setdefault, which has functionality similar to collections.defaultdict but without the need to import:
>>> A = dict()
>>> A.setdefault(1, []).append(3)
>>> A
{1: [3]}
>>> A.setdefault(1, []).append(5)
>>> A
{1: [3, 5]}
>>>
A defaultdict of type list will create an empty list in case you access a key that does not exist in the dictionary so far. This often leads to quite elegant code.
>>> from collections import defaultdict
>>> d = defaultdict(list)
>>> d[1].append(3)
>>> d[1].append(2)
>>> d
defaultdict(<type 'list'>, {1: [3, 2]})
Using a defaultdict eliminates the "special case" of the initial insert.
from collections import defaultdict
A = defaultdict(list)
for num in (3,5):
    A[1].append(num)
As others pointed out, store the values in a list, but remember to check whether the key is already in the dictionary to decide whether to append or to create a new list for that key:
A = dict()
if key in A:
    A[key].append(value)
else:
    A[key] = [value]
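If the .add() spelling from the question is appealing, and duplicates and insertion order do not matter, a set as the value type is another option. This is a sketch of a variation on the answers above, not something they proposed.

```python
from collections import defaultdict

# Missing keys start out as an empty set, so .add() works right away.
A = defaultdict(set)
A[1].add(3)
A[1].add(5)
A[1].add(5)  # a duplicate; the set silently ignores it

print(A[1])
```

Use a list instead when repeated values or the order of insertion need to be preserved.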
index = {
    u'when_air': 0,
    u'chrono': 1,
    u'age_marker': 2,
    u'name': 3
}
How can I write this in a more elegant (and clearer) way than manually setting each value?
Something like:
index = dict_from_range(
    [u'when_air', u'chrono', u'age_marker', u'name'],
    range(4)
)
You can feed the results of zip() to the builtin dict():
>>> names = [u'when_air', u'chrono', u'age_marker', u'name']
>>> print(dict(zip(names, range(4))))
{'chrono': 1, 'name': 3, 'age_marker': 2, 'when_air': 0}
In Python 3, zip() returns an iterator of tuples (in Python 2 it returns a list), where the i-th tuple pairs the i-th element of names with the i-th element of range(4). dict() knows how to build a dictionary from those pairs.
Notice that if you give sequences of unequal lengths to zip(), the result is truncated to the shortest one. Thus it might be smart to use range(len(names)) as the argument, to guarantee an equal length.
>>> print(dict(zip(names, range(len(names)))))
{'chrono': 1, 'name': 3, 'age_marker': 2, 'when_air': 0}
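To make the truncation concrete, here is a small sketch using the same names list. The deliberately short range(2) is an assumption added for illustration.

```python
names = ['when_air', 'chrono', 'age_marker', 'name']

# Only two indices are supplied, so zip() stops after two pairs
# and the last two names are silently dropped.
short = dict(zip(names, range(2)))

# Deriving the range from the list guarantees one index per name.
full = dict(zip(names, range(len(names))))

print(short)
print(full)
```

No error is raised in the truncated case, which is why tying the range to len(names) is the safer habit.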
You can use a dict comprehension together with the built-in function enumerate to build the dictionary from the keys in the desired order.
Example:
keys = [u'when_air', u'chrono', u'age_marker', u'name']
d = {k: i for i,k in enumerate(keys)}
print d
The output is:
{u'age_marker': 2, u'when_air': 0, u'name': 3, u'chrono': 1}
Note that with Python 3.4 the enum module was added. It may provide the desired semantics more conveniently than a dictionary.
For reference:
http://legacy.python.org/dev/peps/pep-0274/
https://docs.python.org/2/library/functions.html#enumerate
https://docs.python.org/3/library/enum.html
index = {k:v for k,v in zip(['when_air','chrono','age_marker','name'],range(4))}
This?
from itertools import count
keys = [u'when_air', u'chrono', u'age_marker', u'name']
# count() yields 0, 1, 2, ... and zip() stops when keys runs out
print(dict(zip(keys, count())))
I'm new to python and I have become stuck on a data type issue.
I have a script which looks a bit like this
dd = defaultdict(list)
for i in arr:
    dd[color].append(i)
which creates a default dict which resembles something along the lines of
dd = [('blue', [2, 4]), ('red', [1]), ('yellow', [1, 3])]
However I need to now access the first list([2,4]). I have tried
print(dd[0])
but this gave me the following output
[][][]
I know the defaultdict has data in it, as I have printed it in its entirety. However, other than accessing the list by its dictionary key, I don't know how to get it, and I don't know the name of the key until I populate the dict.
I have thought about creating a list of lists rather than a defaultdict, but being able to search by key is going to be really useful for another part of the code, so I would like to keep this data structure if possible.
Is there a way to grab the list by an index number, or can you only do it using a key?
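You can get a list of the keys, pick a key by index, then look up that key. On Python 3, dd.keys() is a view that cannot be indexed, so build a list from the dictionary first:
print(dd[list(dd)[0]])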
Note that before Python 3.7 a dictionary was an unordered collection (insertion order is guaranteed from 3.7 on). On older versions the order of keys is undefined. Consider the following example:
from collections import defaultdict
d = defaultdict (int)
d['a'] = 1
d['b'] = 2
d['c'] = 3
d['d'] = 4
d['e'] = 5
print (d)
My Python2 gives:
defaultdict(<type 'int'>, {'a': 1, 'c': 3, 'b': 2, 'e': 5, 'd': 4})
Python3 output is different by the way:
defaultdict(<class 'int'>, {'c': 3, 'b': 2, 'a': 1, 'e': 5, 'd': 4})
So, on Python versions before 3.7, you will have to use some other means to remember the order in which you populate the dictionary. Either maintain a separate list of keys (colors) in the order you need, or use OrderedDict.
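As a sketch of the OrderedDict route: the (color, value) pairs below are invented so that the result matches the shape shown in the question, and setdefault stands in for the defaultdict(list) behaviour.

```python
from collections import OrderedDict

# Hypothetical input, chosen to reproduce the question's example.
pairs = [('blue', 2), ('red', 1), ('yellow', 1), ('blue', 4), ('yellow', 3)]

dd = OrderedDict()
for color, i in pairs:
    # setdefault creates the list on first sight of a color
    dd.setdefault(color, []).append(i)

# next(iter(dd)) gives the first key without building a list of all keys.
first_key = next(iter(dd))
first_list = dd[first_key]

print(first_key, first_list)
```

Lookup by key still works as before, so the rest of the code can keep using dd['blue'] and friends.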
So I realized that
dict1.update(dict2)
replaces the values in dict1 with those from dict2 when a key exists in both dictionaries. Is there any way to add the values of dict2 to dict1 directly when the key is present, instead of looping over the key/value pairs?
You say you want to add the values, but not what type they are. If they are numeric, you may be able to use collections.Counter instead of dict:
>>> from collections import Counter
>>> a = Counter({'a':1, 'b':2})
>>> b = Counter({'a':5.4, 'c':6})
>>> a + b
Counter({'a': 6.4, 'c': 6, 'b': 2})
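Since the question asks about updating dict1 in place, it may help to see that Counter.update adds counts rather than replacing them. This is a minimal sketch assuming numeric values; dict1 and dict2 reuse the names from the question with made-up contents.

```python
from collections import Counter

dict1 = Counter({'a': 1, 'b': 2})
dict2 = {'a': 5, 'c': 6}  # a plain dict works as the argument

# Unlike dict.update, Counter.update adds the incoming counts
# to the existing ones and modifies dict1 in place.
dict1.update(dict2)

print(dict1)
```

Note that a + b builds a new Counter and drops any entry whose total is zero or negative, whereas update keeps every key.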