I want to write a function that takes a string and counts how many times each letter occurs, using a dictionary. I've already done it with an if/else statement, but now I want to use the .get method. So far my code looks like this:
def histogram(s):
    d = dict()
    for c in s:
        d.get(c)
        d[c] = 1
    return d
g = histogram('bronto')
print(g)
This prints:
{'b': 1, 'r': 1, 'o': 1, 'n': 1, 't': 1}
However, as you can see, there should be 2 o's. I can't do d[c] += 1, because the key hasn't been set yet. How do I get the function to count the extra letters within that for loop?
That's exactly what collections.Counter is for:
from collections import Counter
g = Counter('bronto')
However if you want to use plain dicts and dict.get you need to process the return value of dict.get, for example with:
d[c] = d.get(c, 0) + 1
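Put together, the dict.get version of the original function might look like the sketch below. The function name and test string come from the question; nothing else is assumed.

```python
def histogram(s):
    """Count how often each character occurs in s, using dict.get."""
    d = {}
    for c in s:
        # d.get(c, 0) returns the current count, or 0 if c has not been seen yet
        d[c] = d.get(c, 0) + 1
    return d


g = histogram('bronto')
print(g)  # {'b': 1, 'r': 1, 'o': 2, 'n': 1, 't': 1}
```

The key point is that d.get(c) on its own only reads a value; the result has to be used in the assignment.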
You'll want to check if the entry exists in the dictionary before trying to add to it. The simplest extension of what you've written so far is to check each character as you go.
def histogram(s):
    d = dict()
    for c in s:
        if c in d:
            d[c] += 1
        else:
            d[c] = 1
    return d
g = histogram('bronto')
print(g)
Apart from the d[c] = d.get(c, 0) + 1, and the Counter solutions, I'd like to point out the existence of defaultdict:
from collections import defaultdict
def histogram(s):
    d = defaultdict(int)
    for c in s:
        d[c] += 1
    return d
A defaultdict does not raise a KeyError on item lookup (d[key]) as long as it has a default factory. It is initialized with that factory, which can be any callable that takes no arguments (a class or a function). If a key is missing, the factory is called without arguments and the returned value is stored under the key before normal operation resumes.
For this case I'd use Counter, but defaultdict can be useful in more general scenarios.
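A small sketch of that behaviour may help. The keys 'a', 'z' and 'q' are arbitrary names chosen for illustration:

```python
from collections import defaultdict

d = defaultdict(int)   # int() returns 0, so missing keys start at 0

d['a'] += 1            # 'a' is missing: int() is called, then 1 is added
print(d['a'])          # 1

print(d['z'])          # 0, and merely reading d['z'] stored the key
print('z' in d)        # True

print(d.get('q'))      # None: .get() does not call the factory
print('q' in d)        # False
```

Note that only d[key] lookups trigger the factory; .get() and the in operator leave the dictionary unchanged.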
I am writing a small program using a list comprehension in Python, and I need to assign values to a dictionary.
It gives me a syntax error.
all_freq = {}
Input = 'google.com'
[all_freq[s] += 1 if s in Input else all_freq[s] = 1 for s in Input]
It says "[" was not closed.
Could you please help me.
Use a normal for loop, not a list comprehension, as you are not trying to create a list of anything.
all_freq = {}
for s in Input:
    if s in all_freq:
        all_freq[s] += 1
    else:
        all_freq[s] = 1
which can be simplified slightly to
all_freq = {}
for s in Input:
    if s not in all_freq:
        all_freq[s] = 0
    all_freq[s] += 1
which can be replaced entirely with
from collections import Counter
all_freq = Counter(Input)
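If a one-line expression is still wanted, a dict comprehension is a closer fit than a list comprehension, because it builds the mapping as a value instead of trying to assign inside an expression. A rough sketch, using the Input string from the question (it calls str.count once per distinct character, so it will be slower than Counter on long inputs):

```python
from collections import Counter

Input = 'google.com'

# One entry per distinct character; str.count scans the string each time
all_freq = {s: Input.count(s) for s in set(Input)}

print(all_freq['o'])                      # 3
print(all_freq == dict(Counter(Input)))   # True
```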
Inspired by the earlier post, you can also do it this way with defaultdict, although Counter is of course the best tool for quick tallying:
from collections import defaultdict
all_freq = defaultdict(int)  # missing keys start at 0
for s in 'google':           # increment the count for each letter in turn
    all_freq[s] += 1
print(all_freq)
defaultdict(<class 'int'>, {'g': 2, 'o': 2, 'l': 1, 'e': 1})
This is what I have so far as a function
example = "Sample String"
def func(text, let):
    count= {}
    for let in text.lower():
        let = count.keys()
        if let in text:
            count[let] += 1
        else:
            count[let] = 1
    return count
I want to return something like this
print(func(example, "sao"))
{'s': 2, 'a' : 1}
I am not very sure what I could improve on
I would use Counter from the collections module in the standard library:
from collections import Counter
def func(text, let):
    c = Counter(text.lower())
    return {l: c[l] for l in let if l in c.keys()}
Breaking it down:
Counter will return the count of letters in your string:
In [5]: Counter(example.lower())
Out[5]:
Counter({'s': 2,
'a': 1,
'm': 1,
'p': 1,
'l': 1,
'e': 1,
' ': 1,
't': 1,
'r': 1,
'i': 1,
'n': 1,
'g': 1})
So then all you need to do is return a dictionary of the appropriate letters, which can be done in a dictionary comprehension:
# iterate over every letter in `let`, and get the Counter value for that letter,
# if that letter is in the Counter keys
{l: c[l] for l in let if l in c.keys()}
Fixing your code
If you prefer to use your approach, you could make your code work properly with this:
def func(text, let):
    count = {}
    for l in text.lower():
        if l in let:
            if l in count.keys():
                count[l] += 1
            else:
                count[l] = 1
    return count
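The nested if can also be flattened with dict.get. This is only a sketch that keeps the question's names (func, text, let); the wanted set is an addition for faster membership tests:

```python
def func(text, let):
    count = {}
    wanted = set(let)  # set membership is a constant-time check per character
    for ch in text.lower():
        if ch in wanted:
            # start from 0 the first time a wanted letter is seen
            count[ch] = count.get(ch, 0) + 1
    return count


print(func("Sample String", "sao"))  # {'s': 2, 'a': 1}
```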
from functools import reduce
def count(text, letters):
    return reduce(
        lambda d, letr: d.update({letr: d.get(letr, 0) + 1}) or d,
        filter(lambda l: l in letters, text), {}
    )
Read it backwards.
Creates an empty dictionary.
{}
Filters letters from text.
lambda l: l in letters
This lambda function returns true if l is in letters
filter(lambda l: l in letters, text)
reduce will iterate over the object returned by filter, which will
only produce letters in text, if they are in letters.
lambda d, letr: d.update({letr: d.get(letr, 0) + 1}) or d
Updates the dictionary with the count of the letters it encounters.
Each time reduce takes an item from the filter object, it calls this lambda function. Since dict.update() returns None, which is falsy, we append "or d" so that the lambda returns the dict to reduce, which passes it back into the lambda on the next call, thus building up the counts. We also use d.get(letr, 0) in the lambda instead of d[letr], which supplies a default of 0 if the letter is not yet in the dictionary.
At the end reduce returns the dict, and we return that from count.
This is similar to how "map reduce" works.
You can read about functional style and lambda expressions in the python docs.
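For comparison, the reduce call unrolls to roughly the plain loop below. The name count_unrolled is made up for this sketch, and the reduce version is repeated so the snippet runs on its own. Like the original, neither function lower-cases text, so an upper-case letter is only counted if it appears in letters.

```python
from functools import reduce


def count(text, letters):
    # The reduce-based version, repeated so the sketch is self-contained
    return reduce(
        lambda d, letr: d.update({letr: d.get(letr, 0) + 1}) or d,
        filter(lambda l: l in letters, text), {}
    )


def count_unrolled(text, letters):
    d = {}                                # the initial value passed to reduce
    for letr in text:
        if letr in letters:               # what filter() does
            d[letr] = d.get(letr, 0) + 1  # what the update lambda does
    return d


print(count("sample string", "sao"))           # {'s': 2, 'a': 1}
print(count_unrolled("sample string", "sao"))  # {'s': 2, 'a': 1}
```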
>>> def func(text: str, let: str):
...     text, count = text.lower(), {}
...     for i in let:
...         if text.count(i) != 0:
...             count[i] = text.count(i)
...     return count
...
>>> print(func("Sample String", "sao"))
{'s': 2, 'a': 1}
I'm trying to build a method where if an item is not in a dictionary then it uses the last member of a list and updates the dictionary accordingly. Sort of like a combination of the pop and setdefault method. What I tried was the following:
dict1 = {1:2,3:4,5:6}
b = 7
c = [8,9,10]
e = dict1.setdefault(b, {}).update(pop(c))
So I would like dict1 to end up updated with {7: 10}; that is to say, if b is not among the keys of dict1, the code adds an entry to dict1 using b as the key and the last item of c as the value.
It might be possible for you to abuse a defaultdict:
from collections import defaultdict
c = [8, 9, 10]
dict1 = defaultdict(c.pop, {1: 2, 3: 4, 5: 6})
b = 7
e = dict1[b]
This will pop an item from c and make it a value of dict1 whenever a key missing from dict1 is accessed. (That means the expression dict1[b] on its own has side-effects.) There are many situations where that behaviour is more confusing than helpful, though, in which case you can opt for explicitness:
if b in dict1:
    e = dict1[b]
else:
    e = dict1[b] = c.pop()
which can of course be wrapped up in a function:
def get_or_pop(mapping, key, source):
    if key in mapping:
        v = mapping[key]
    else:
        v = mapping[key] = source.pop()
    return v
⋮
e = get_or_pop(dict1, b, c)
Considering your variables, you could use the following code snippet
dict1[b] = dict1.pop(b, c.pop())
This assigns to the key b of dict1 the value c.pop(), the last item of the list c (the same value as c[-1], except that pop also removes it from the list). Note that this gives the intended result because the key b = 7 is not in your original dictionary.
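One caveat with this one-liner, sketched here with the question's values: Python evaluates the arguments before making the call, so c.pop() runs even when the key is already present, and the list loses an item either way.

```python
dict1 = {1: 2, 3: 4, 5: 6}
c = [8, 9, 10]

# Key 7 is missing: the popped 10 becomes the value
dict1[7] = dict1.pop(7, c.pop())
print(dict1[7], c)   # 10 [8, 9]

# Key 1 exists: its value 2 is kept, but c.pop() has still removed 9 from c
dict1[1] = dict1.pop(1, c.pop())
print(dict1[1], c)   # 2 [8]
```

If items in c must only be consumed for missing keys, an explicit membership test is the safer choice.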
prior={}
conditionProb={}
Counts={}
for i in range(len(trainingData)):
    label=trainingLabels[i]
    prior[label]+=1
    datum=trainingData[i]
    for j in range(len(datum)):
        Counts[(i,j,label)]+=1
        if(datum[j]>0):
            conditionProb[(i,j,label)]+=1
When I run this code, it raises a KeyError, because the keys in prior have not been initialized, so there is no value to increment. I could initialize the three dicts with loops, but that seems like too much code for the job. Is there another way to do this, e.g. overriding the default behaviour of dict? I am not familiar with Python. Any ideas are appreciated.
You can use defaultdict to initialize keys to 0:
from collections import defaultdict
prior = defaultdict(lambda: 0)
conditionProb = defaultdict(lambda: 0)
Counts = defaultdict(lambda: 0)
for i, (label, data) in enumerate(zip(trainingLabels, trainingData)):
    prior[label] += 1
    for j,datum in enumerate(data):
        Counts[i, j, label] += 1
        if datum > 0:
            conditionProb[i, j, label] += 1
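To try the loop end to end, here is a sketch with toy inputs. The contents of trainingLabels and trainingData are invented purely for illustration, and defaultdict(int) is used as an equivalent spelling of defaultdict(lambda: 0):

```python
from collections import defaultdict

# Hypothetical toy data, invented only to make the loop runnable
trainingLabels = ['spam', 'ham']
trainingData = [[1, 0], [0, 2]]

prior = defaultdict(int)          # int() returns 0, same effect as lambda: 0
conditionProb = defaultdict(int)
Counts = defaultdict(int)

for i, (label, data) in enumerate(zip(trainingLabels, trainingData)):
    prior[label] += 1
    for j, datum in enumerate(data):
        Counts[i, j, label] += 1
        if datum > 0:
            conditionProb[i, j, label] += 1

print(dict(prior))          # {'spam': 1, 'ham': 1}
print(dict(conditionProb))  # {(0, 0, 'spam'): 1, (1, 1, 'ham'): 1}
```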
You can use defaultdict from the collections module. You construct it by passing a factory for the values, in this case int; calling int() with no arguments returns 0, so missing keys start at 0. Do it like this:
from collections import defaultdict
my_dict = defaultdict(int)
my_dict['foo'] += 2
You can use Counter:
>>> from collections import Counter
>>> c = Counter()
>>> c['a'] += 2
>>> c
Counter({'a': 2})
I have a problem concerning a comparison between a char key in a dict and a char within a list.
The Task is to read a text and count all beginning letters.
I have a list with chars:
bchars = ('i','g','h','n','h')
and a dict with the alphabet and frequency default to zero:
d = dict(dict())
for i in range(97,123):
    d[i-97]={chr(i):0}
Now I want to check it like this:
for i in range(len(bchars)):
    for j in range(len(d)):
        if(bchars[i] in d[j]):
            d[j][chr(i+97)] +=1
        else:
            d[j][chr(i+97)] +=0
So if the char in the list is a key at that position, then += 1, else += 0.
I thought that by using an if/else statement I could avoid the KeyError.
Is there any more elegant solution for that?
The specific problem is that you check whether bchars[i] is in d[j], but then the key you actually use is chr(i+97).
chr(i+97) is the index of the ith character in bchars, but mapped to ASCII characters starting from 'a'. Why would you want to use this as your key?
I think you really want to do:
for i in range(len(bchars)):
    for j in range(len(d)):
        if(bchars[i] in d[j]):
            d[j][bchars[i]] += 1
        else:
            d[j][bchars[i]] = 1
Note that you can't use += in the else; remember how you literally just checked whether the key was there and decided it wasn't?
More broadly, though, your code doesn't make sense - it is overcomplicated and does not use the real power of Python's dictionaries. d looks like:
{0: {'a': 0}, 1: {'b': 0}, 2: {'c': 0}, ...}
It would be much more sensible to build a dictionary mapping character directly to count:
{'a': 0, 'b': 0, 'c': 0, ...}
then you can simply do:
for char in bchars:
    if char in d:
        d[char] += 1
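One way to build that flat dictionary, sketched with the bchars tuple from the question, is dict.fromkeys over string.ascii_lowercase (an assumption here is that only the 26 lowercase ASCII letters matter, as in the question's range(97,123)):

```python
import string

bchars = ('i', 'g', 'h', 'n', 'h')

# Map every lowercase ASCII letter directly to a count of 0
d = dict.fromkeys(string.ascii_lowercase, 0)

for char in bchars:
    if char in d:        # skip anything that is not a lowercase ASCII letter
        d[char] += 1

print(d['h'], d['i'], d['a'])  # 2 1 0
```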
Python even comes with a class just for doing this sort of thing.
The nested dictionary doesn't seem necessary:
d = [0] * 26
for c in bchars:
    d[ord(c)-97] += 1
You might also want to look at the Counter class in the collections module.
from collections import Counter
bchars = ('i','g','h','n','h')
counts = Counter(bchars)
print(counts)
print(counts['h'])
prints
Counter({'h': 2, 'i': 1, 'g': 1, 'n': 1})
2