The code below is meant to count how many times each character occurs in the sentence assigned to the variable "message".
message = 'It was a bright cold day in April, and the clocks were striking thirteen.'
count = {}
for character in message:
    count.setdefault(character, 0)
    count[character] = count[character] + 1
print(count)
The code runs successfully with the output below:
{' ': 13, ',': 1, '.': 1, 'A': 1, 'I': 1, 'a': 4, 'c': 3, 'b': 1, 'e': 5, 'd': 3, 'g': 2,
'i': 6, 'h': 3, 'k': 2, 'l': 3, 'o': 2, 'n': 4, 'p': 1, 's': 3, 'r': 5, 't': 6, 'w': 2, 'y': 1}
What is this part of the code doing in the program?
count[character] = count[character] + 1
The line of code you mention in your question does two things:
It reads the existing dictionary value for the given key (character) and adds 1 to it: count[character] + 1.
It stores the result back under the same key: count[character] = ...
You can actually make your code a bit shorter, because the setdefault method returns the current value for the given key:
message = 'It was a bright cold day in April, and the clocks were striking thirteen'
count = {}
for character in message:
    count[character] = count.setdefault(character, 0) + 1
print(count)
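To make the point about the return value concrete, here is a minimal sketch. The key "x" and the variable names are only illustrative, not part of the original code.

```python
count = {}

# Key is missing: setdefault() stores the default 0 and returns it.
returned_first = count.setdefault("x", 0)
count["x"] = returned_first + 1

# Key is present: setdefault() returns the stored 1 and ignores the default.
returned_second = count.setdefault("x", 0)
count["x"] = returned_second + 1

print(returned_first, returned_second, count)  # 0 1 {'x': 2}
```

Because the call both guarantees the key exists and hands back its value, the two-line version in the question can be folded into one line.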
count[character] = count[character] + 1 increases the count of the current character by one. First, setdefault() checks whether the current character already exists as a key in the dictionary. If it doesn't, the character is added with the default value of 0; if it does, nothing changes. Then the value stored for that character is incremented by one.
Let me give an example. On the first (zeroth) character in message, 'I', setdefault() sees that 'I' is not yet in the dictionary and adds it with the value 0. That value is then incremented by one, so the value of 'I' in the dictionary is now 1.
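If it helps, the walk-through can be reproduced with a small trace. This is only a sketch: the short string "aab" and the trace list are assumptions chosen for illustration, not part of the original program.

```python
message = "aab"  # hypothetical short input, easier to follow than the full sentence
count = {}
trace = []       # snapshot of the dictionary after each character

for character in message:
    # Adds the key with value 0 only if it is missing; otherwise does nothing.
    count.setdefault(character, 0)
    # Read the current value, add 1, store it back under the same key.
    count[character] = count[character] + 1
    trace.append(dict(count))

for step in trace:
    print(step)
# {'a': 1}
# {'a': 2}
# {'a': 2, 'b': 1}
```

The second 'a' shows the case where setdefault() leaves the dictionary alone and only the increment changes the value.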
I am trying to count the occurrences of each letter of a word:
word = input("Enter a word")
Alphabet=['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z']
for i in range(0,26):
    print(word.count(Alphabet[i]))
This currently outputs a count for every letter of the alphabet, including the letters that don't occur in the word (shown as 0).
How do I list the letters vertically with the frequency alongside it, e.g., like the following?
word="Hello"
H 1
E 1
L 2
O 1
from collections import Counter
counts = Counter(word)  # Counter({'l': 2, 'H': 1, 'e': 1, 'o': 1})
for letter, n in counts.items():  # iterate over counts, not word, so a repeated letter prints once
    print(letter, n)
Try using Counter, which will create a dictionary that contains the frequencies of all items in a collection.
Otherwise, you could do a condition on your current code to print only if word.count(Alphabet[i]) is greater than 0, though that would be slower.
def char_frequency(str1):
    freq = {}  # not named "dict", which would shadow the built-in type
    for n in str1:
        if n in freq:
            freq[n] += 1
        else:
            freq[n] = 1
    return freq
print(char_frequency('google.com'))
As Pythonista said, this is a job for collections.Counter:
from collections import Counter
print(Counter('cats on wheels'))
This prints (the order of letters with equal counts can vary between Python versions):
Counter({'s': 2, ' ': 2, 'e': 2, 'c': 1, 'a': 1, 't': 1, 'o': 1, 'n': 1, 'w': 1, 'h': 1, 'l': 1})
s = input()
t = s.lower()
for i in range(len(s)):
    b = t.count(t[i])
    print("{} -- {}".format(s[i], b))
An easy and simple solution without a library:
string = input()
f = {}
for i in string:
    f[i] = f.get(i,0) + 1
print(f)
Here is the link for get(): https://docs.quantifiedcode.com/python-anti-patterns/correctness/not_using_get_to_return_a_default_value_from_a_dictionary.html
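For anyone unsure what get() does in that loop, here is a small sketch. The sample string "banana" is an assumption used in place of input() so the snippet runs unattended.

```python
text = "banana"  # hypothetical sample input
f = {}
for ch in text:
    # f.get(ch, 0) returns the stored count, or 0 when ch has no entry yet.
    # Unlike setdefault(), get() never modifies the dictionary itself.
    f[ch] = f.get(ch, 0) + 1

print(f)              # {'b': 1, 'a': 3, 'n': 2}
print(f.get("z", 0))  # 0
print("z" in f)       # False
```

The last two lines show that asking for a missing key with get() returns the default without adding the key.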
Following up what LMc said, your code was already pretty close to functional. You just needed to post-process the result set to remove 'uninteresting' output. Here's one way to make your code work:
#!/usr/bin/env python3
word = input("Enter a word: ")
Alphabet = [
    'a','b','c','d','e','f','g','h','i','j','k','l','m',
    'n','o','p','q','r','s','t','u','v','w','x','y','z'
]
hits = [
    (Alphabet[i], word.count(Alphabet[i]))
    for i in range(len(Alphabet))
    if word.count(Alphabet[i])  # keep only letters that occur at least once
]
for letter, frequency in hits:
    print(letter.upper(), frequency)
But the solution using collections.Counter is much more elegant/Pythonic.
For future reference: when you have a list with all the words you want, let's say wordlist, it's pretty simple to print the words that start with a given letter:
for numbers in range(len(wordlist)):
    if wordlist[numbers][0] == 'a':
        print(wordlist[numbers])
Another way could be to remove repeated characters and iterate only on the unique characters (by using set()) and then counting the occurrence of each unique character (by using str.count())
def char_count(string):
    freq = {}
    for char in set(string):
        freq[char] = string.count(char)
    return freq
if __name__ == "__main__":
    s = "HelloWorldHello"
    print(char_count(s))
    # Output: {'e': 2, 'o': 3, 'W': 1, 'r': 1, 'd': 1, 'l': 5, 'H': 2}
It might make sense to include all letters of the alphabet, even those with a count of 0. For example, if you're interested in calculating the cosine similarity between letter distributions, you typically need a value for every letter.
You can use this method:
from collections import Counter
def character_distribution_of_string(pass_string):
    letters = ["a","b","c","d","e","f","g","h","i","j","k","l","m","n","o","p","q","r","s","t","u","v","w","x","y","z"]
    chars_in_string = Counter(pass_string)
    res = {}
    for letter in letters:
        if letter in chars_in_string:
            res[letter] = chars_in_string[letter]
        else:
            res[letter] = 0
    return res
Usage:
character_distribution_of_string("This is a string that I want to know about")
Full Character Distribution
{'a': 4,
'b': 1,
'c': 0,
'd': 0,
'e': 0,
'f': 0,
'g': 1,
'h': 2,
'i': 3,
'j': 0,
'k': 1,
'l': 0,
'm': 0,
'n': 3,
'o': 3,
'p': 0,
'q': 0,
'r': 1,
's': 3,
't': 6,
'u': 1,
'v': 0,
'w': 2,
'x': 0,
'y': 0,
'z': 0}
You can extract the character vector easily:
list(character_distribution_of_string("This is a string that I want to know about").values())
giving...
[4, 1, 0, 0, 0, 0, 1, 2, 3, 0, 1, 0, 0, 3, 3, 0, 0, 1, 3, 6, 1, 0, 2, 0, 0, 0]
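As a rough illustration of why a fixed-length vector is useful, here is a sketch that compares two such vectors with cosine similarity. The helper names letter_vector and cosine_similarity are hypothetical, and unlike the function above this version lowercases the text first; treat both as assumptions.

```python
import math
import string
from collections import Counter

def letter_vector(text):
    """Return 26 counts, one per lowercase ASCII letter, in a-z order."""
    counts = Counter(text.lower())  # assumption: fold case before counting
    return [counts[letter] for letter in string.ascii_lowercase]

def cosine_similarity(u, v):
    """Dot product divided by the product of the Euclidean norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # assumption: an all-zero vector is similar to nothing
    return dot / (norm_u * norm_v)

v1 = letter_vector("listen")
v2 = letter_vector("silent")  # anagram, so the letter counts are identical
v3 = letter_vector("abc")     # shares no letters with "listen"

print(cosine_similarity(v1, v2))  # very close to 1
print(cosine_similarity(v1, v3))  # 0.0
```

Both vectors always have 26 entries in the same order, which is what lets zip() pair up the counts letter by letter.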
Initialize an empty dictionary and iterate over every character of the word. If the current character is present in the dictionary, increment its value by 1; if not, set its value to 1.
word="Hello"
characters={}
for character in word:
    if character in characters:
        characters[character] += 1
    else:
        characters[character] = 1
print(characters)
import string
word = input("Enter a word: ")
word = word.lower()
Alphabet=list(string.ascii_lowercase)
res = []
for i in range(0,26):
    res.append(word.count(Alphabet[i]))
for i in range (0,26):
    if res[i] != 0:
        print(str(Alphabet[i].upper()) + " " + str(res[i]))
If using libraries or built-in functions is to be avoided then the following code may help:
s = "aaabbc"       # Sample string
dict_counter = {}  # Empty dict holding characters as keys and counts as values
for char in s:     # Traverse the whole string character by character
    if char not in dict_counter:
        # First occurrence (also covers the empty dict): add with count = 1
        dict_counter[char] = 1
    else:
        # The character is already in the dict: update its count
        dict_counter[char] += 1
for key, val in dict_counter.items():  # Loop over each key/value pair for printing
    print(key, val)
Output:
a 3
b 2
c 1
def string(n):
    a=list()
    n=n.replace(" ","")
    for i in (n):
        c=n.count(i)
        a.append(i)
        a.append(c)
    y=dict(zip(*[iter(a)]*2))
    print(y)
string("Lets hope for better life")
#Output:{'L': 1, 'e': 5, 't': 3, 's': 1, 'h': 1, 'o': 2, 'p': 1, 'f': 2, 'r': 2, 'b': 1, 'l': 1, 'i': 1}
Notice that the output contains two entries for the letter L, one uppercase and one lowercase. If you want them counted together, see the code below.
The output lists each character only once and ignores spaces.
If you want to count uppercase and lowercase letters together, then:
def string(n):
    n=n.lower() # or use n.upper()
    a=list()
    n=n.replace(" ","")
    for i in (n):
        c=n.count(i)
        a.append(i)
        a.append(c)
    y=dict(zip(*[iter(a)]*2))
    print(y)
string("Lets hope for better life")
#output:{'l': 2, 'e': 5, 't': 3, 's': 1, 'h': 1, 'o': 2, 'p': 1, 'f': 2, 'r': 2, 'b': 1, 'i': 1}
So I have this code that is supposed to count the characters in a user-inputted sentence:
import pprint
message = str(input())
count = {}
for character in message:
    count.setdefault(character, 0)
    count[character] = count[character] + 1
    pprint.pprint(count)
However, the problem is that it gives an output for every successive character, i.e. if you give it a sentence with 3 characters in it, e.g. "the", it will give 3 outputs:
the
{'t': 1}
{'h': 1, 't': 1}
{'e': 1, 'h': 1, 't': 1}
Process finished with exit code 0
How do I get it to give only the final output with all characters counted? Thanks.
You're getting multiple outputs because the print (pprint.pprint) is in a for loop.
Just remove the indentation from the pprint.pprint(count) line, so that it isn't in the for loop:
import pprint
message = str(input())
count = {}
for character in message:
    count.setdefault(character, 0)
    count[character] = count[character] + 1
pprint.pprint(count)
You print the output on every iteration of the loop, so this is expected. You can achieve the same result with much simpler code using Counter:
from collections import Counter
import pprint
message = str(input())
count = Counter(message)
pprint.pprint(dict(count))
import pprint
message = str(input())
count = {}
for character in message:
    count.setdefault(character, 0)
    count[character] = count[character] + 1
    pprint.pprint(count)
You just have to move pprint.pprint(count) out of the for loop:
import pprint
message = str(input())
count = {}
for character in message:
    count.setdefault(character, 0)
    count[character] = count[character] + 1
pprint.pprint(count)
output:
messaggio
{'a': 1, 'e': 1, 'g': 2, 'i': 1, 'm': 1, 'o': 1, 's': 2}
If I understood your objective correctly, this might work. The check character.isalpha() filters out non-alphabetic characters in the string, such as !"#$%&'()*+,-./:;<=>?#[\]^_{|}~` and white space, so that only the alphabetic letters are counted.
from pprint import pprint
message = 'This is a test string to check whether character to occurences mapping works correctly' #str(input())
occurrences_dict = {}
for character in message:
    if character.isalpha() and character not in occurrences_dict:
        occurrences_dict[character] = message.count(character)
pprint(occurrences_dict)
{'T': 1,
'a': 4,
'c': 9,
'e': 8,
'g': 2,
'h': 5,
'i': 4,
'k': 2,
'l': 1,
'm': 1,
'n': 3,
'o': 5,
'p': 2,
'r': 8,
's': 6,
't': 8,
'u': 1,
'w': 2,
'y': 1}
I have been working through Automate the Boring Stuff by Al Sweigart. I'm struggling to understand the code below:
INPUT
message = 'It was a bright cold day in April, and the clocks were striking thirteen.'
count = {}
for character in message:
    count.setdefault(character, 0)
    count[character] = count[character] + 1
print(count)
OUTPUT
{'I': 1, 't': 6, ' ': 13, 'w': 2, 'a': 4, 's': 3, 'b': 1, 'r': 5, 'i': 6, 'g': 2, 'h': 3, 'c': 3, 'o': 2, 'l': 3, 'd': 3, 'y': 1, 'n': 4, 'A': 1, 'p': 1, ',': 1, 'e': 5, 'k': 2, '.': 1}
QUESTION
Since it does not matter what the variable in a for loop is called (i.e. character can be changed to x, pie, etc.), how does the code know to run the loop through each character in the string?
It's not about the variable's name, it's about the object this variable points to. The implementation of the loop in the Python virtual machine knows how to iterate over objects based on their types.
Iterating over something is implemented as iterating over iter(something), which in turn is the same as iterating over something.__iter__(). Different classes implement their own versions of __iter__, so that loops work correctly.
str.__iter__ iterates over the individual characters of a string, list.__iter__ - over the list's elements and so on.
You could create your own object and iterate over it:
class MyClass:
    def __iter__(self):
        return iter([1,2,3,4])
my_object = MyClass()
for x in my_object:
    print(x)
This will print the numbers from 1 to 4.
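To see the same mechanism on a string, here is a rough sketch of what a for loop does behind the scenes, written out by hand with iter() and next(). The short string and the seen list are only illustrative.

```python
message = "abc"

# Roughly what "for character in message:" does, spelled out by hand.
iterator = iter(message)  # asks the string for an iterator (message.__iter__())
seen = []
while True:
    try:
        character = next(iterator)  # fetch the next item
    except StopIteration:           # raised once the iterator is exhausted
        break
    seen.append(character)

print(seen)  # ['a', 'b', 'c']
```

The name character plays no role in deciding what comes out; the iterator returned by the string does.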
A string is a sequence of characters in Python. That means that when you loop over a string, you loop over each character; in your case, each character that is read is assigned to the variable character.
Then, setdefault maps character to 0 if character is not yet in the dict. The rest looks quite straightforward.
Strings in Python are sequences of characters: https://docs.python.org/3/library/stdtypes.html#textseq. Therefore, the for c in m: line iterates over every element of the m sequence, i.e. over every character of the string.
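A quick sketch of what "sequence" means in practice. The string "clock" and the loop variable pie are arbitrary choices for illustration.

```python
m = "clock"

print(len(m))   # 5
print(m[0])     # c
print(m[-1])    # k
print(m[1:4])   # loc
print(list(m))  # ['c', 'l', 'o', 'c', 'k']

# The loop variable can have any name; the string decides what is yielded.
collected = [pie for pie in m]
print(collected == list(m))  # True
```

Length, indexing, slicing and iteration all work on a string the same way they work on a list.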