Create function to count occurrence of characters in a string - python

I want to count the occurrences of each character in a string. I have this code:
string = "Foo Fighters"
def conteo(string):
    copia = ''
    for i in string:
        if i not in copia:
            copia = copia + i
    conteo = [0]*len(copia)
    for i in string:
        if i in copia:
            conteo[copia.index(i)] = conteo[copia.index(i)] + 1
    out = ['0']*2*len(copia)
    for i in range(len(copia)):
        out[2*i] = copia[i]
        out[2*i + 1] = conteo[i]
    return (out)
I want it to return something like: ['f', 2, 'o', 2, ' ', 1, 'i', 1, 'g', 1, 'h', 1, 't', 1, 'e', 1, 'r', 1, 's', 1]
How can I do this without using a Python library?
Thank you

Use Python's Counter (part of the standard library):
>>> from collections import Counter
>>> s = 'foo fighters'
>>> counter = Counter(s)
>>> counter
Counter({'f': 2, 'o': 2, ' ': 1, 'e': 1, 'g': 1, 'i': 1, 'h': 1, 's': 1, 'r': 1, 't': 1})
>>> counter['f']
2

Depending on why you want this information, one method could be to use a Counter:
from collections import Counter
print(Counter("Foo Fighters"))
Of course, to create exactly the same output as requested, use itertools as well:
from collections import Counter
from itertools import chain
c = Counter("Foo Fighters")
output = list(chain.from_iterable(c.items()))
print(output)
# ['F', 2, 'o', 2, ' ', 1, 'i', 1, 'g', 1, 'h', 1, 't', 1, 'e', 1, 'r', 1, 's', 1]

It's not clear whether you want a critique of your current attempt or a pythonic solution. Below is one way where output is a dictionary.
from collections import Counter
mystr = "Foo Fighters"
c = Counter(mystr)
Result
Counter({' ': 1,
         'F': 2,
         'e': 1,
         'g': 1,
         'h': 1,
         'i': 1,
         'o': 2,
         'r': 1,
         's': 1,
         't': 1})
Output as list
I purposely do not flatten the tuples in this list, as it's a good idea to maintain structure until absolutely necessary. Flattening them into a single list is a trivial task.
list(c.items())
# [('F', 2),
#  ('o', 2),
#  (' ', 1),
#  ('i', 1),
#  ('g', 1),
#  ('h', 1),
#  ('t', 1),
#  ('e', 1),
#  ('r', 1),
#  ('s', 1)]
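Since the question asks for a solution without library imports, here is a minimal sketch that uses only a plain dict. The function name count_chars is hypothetical, and the ordering assumes Python 3.7+, where dicts preserve insertion order.

```python
def count_chars(text):
    """Return a flat list [char, count, char, count, ...] in first-seen order."""
    counts = {}
    for ch in text:
        # dict.get supplies 0 the first time a character is seen
        counts[ch] = counts.get(ch, 0) + 1
    flat = []
    for ch, n in counts.items():
        flat.extend([ch, n])
    return flat


print(count_chars("Foo Fighters"))
```

Unlike the requested output, this keeps 'F' in upper case; calling count_chars(text.lower()) would merge cases if that is what is wanted.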

Related

How do I count letters occurring in a string without using a dictionary or list.count()?

I am trying to count how many times each letter occurs without using count() or dict().
I wrote something, but I am still having issues with my code.
myString = []
#countList = [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
myString ="pynativepynvepynative"
countList = [len(myString)+1]
for i in range(len(myString)):
    #print("Here 0")
    for j in range(len(countList)):
        #print("Here 1")
        if i == countList[j]:
            #print("Here 1.1")
            countList[j+1] = (countList[j+1] + 1)
            break
        else:
            #print("Here 2")
            countList.append(myString[i])
            countList.append(1)
            break
print(countList)
Expected output:
['p', 3, 'y', 3, 'n', 3, 'a', 2, 't', 2, 'i', 2, 'v', 3, 'e', 3]
Actual output:
[22, 'p', 1, 'y', 1, 'n', 1, 'a', 1, 't', 1, 'i', 1, 'v', 1, 'e', 1, 'p', 1, 'y', 1, 'n', 1, 'v', 1, 'e', 1, 'p', 1, 'y', 1, 'n', 1, 'a', 1, 't', 1, 'i', 1, 'v', 1, 'e', 1]
What you can do is collect the unique letters from the string, and then for each unique letter loop through the string to count its frequency.
def func_count(string):
    letter = []
    for char in string:
        if char not in letter:
            letter.append(char)
    res = []
    for let in letter:
        count = 0
        for char in string:
            if let == char:
                count+=1
        res.extend([let, count])
    # res = {a:b for a,b in zip(res[::2], res[1::2])}
    return res
string = "pynativepynvepynative"
solution = func_count(string)
print(solution)
Output:
['p', 3, 'y', 3, 'n', 3, 'a', 2, 't', 2, 'i', 2, 'v', 3, 'e', 3]
Edit: if you want the solution in dict form, add res = {a:b for a,b in zip(res[::2], res[1::2])} to the function just before the return.
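The dict conversion mentioned in the edit works by slicing the flat list into its even and odd positions. A small sketch of that step on its own (the variable names here are illustrative only):

```python
flat = ['p', 3, 'y', 3, 'n', 3, 'a', 2]

# Even positions hold the letters, odd positions hold the counts.
letters = flat[::2]
counts = flat[1::2]

# zip pairs each letter with its count; dict turns the pairs into a mapping.
as_dict = dict(zip(letters, counts))
print(as_dict)
```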
Starting from the code in my question, I was able to get the right answer by modifying it.
My problem was that I did not know how to initialize countList properly.
countList = []
myString ="pynativepynvepynative"
for i in range(len(myString)):
    #print("Here 0")
    for j in range(len(countList)):
        #print("Here 1")
        #print(myString[i])
        #print(j)
        #print(countList[j])
        if myString[i] == countList[j]:
            #print("Here 1.1")
            #print(myString[i])
            countList[j+1] = (countList[j+1] + 1)
            break
    else :
        # for-else: runs only when the inner loop finished without a break,
        # i.e. the letter is not in countList yet
        #print("Here 2")
        countList.append(myString[i])
        countList.append(1)
print(countList)
Actual output:
['p', 3, 'y', 3, 'n', 3, 'a', 2, 't', 2, 'i', 2, 'v', 3, 'e', 3]
Use collections.Counter, the dict subclass for counting objects, which makes this a one-liner:
>>> from collections import Counter
>>> c = Counter('pynativepynvepynative')
>>> c
Counter({'p': 3, 'y': 3, 'n': 3, 'v': 3, 'e': 3, 'a': 2, 't': 2, 'i': 2})
(Technically this isn't a plain dict, it's a subclass of dict.)
You can get a list of tuples from it:
>>> c.most_common()
[('p', 3), ('y', 3), ('n', 3), ('v', 3), ('e', 3), ('a', 2), ('t', 2), ('i', 2)]
Lists or tuples are undesirable for counting things, because you want to be able to separately access and sort by the keys (the objects you're counting) and the values (the counts). In theory you can do that with a list of lists or tuples, but it's a pain, and Counter already defines several of the methods you'll need.
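To illustrate the point about accessing keys and counts separately, here is a short sketch using only documented Counter and dict methods; the variable names are illustrative.

```python
from collections import Counter

c = Counter('pynativepynvepynative')

by_letter = sorted(c.items())    # (letter, count) pairs, alphabetical by letter
top_three = c.most_common(3)     # the three highest counts
total = sum(c.values())          # total number of letters counted
missing = c['z']                 # a missing key yields 0 instead of raising KeyError

print(by_letter)
print(top_three, total, missing)
```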

Dictionary rearrangement and sorting

I need to count how many times each value appears in the dict books, and print the values in descending order of their number of occurrences.
books = {
    123457889: 'A',
    252435234: 'A',
    434234341: 'B',
    534524365: 'C',
    354546589: 'D',
    146546547: 'D',
    353464543: 'F',
    586746547: 'E',
    511546547: 'F',
    546546647: 'F',
    541146127: 'F',
    246546127: 'A',
    434545127: 'B',
    533346127: 'E',
    544446127: 'F',
    546446127: 'G',
    155654627: 'G',
    546567627: 'G',
    145452437: 'H',
}
The output should look like this:
'F': 5,
'A': 3,
'G': 3,
'B': 2,
'D': 2,
'E': 2,
'C': 1,
'H': 1
I tried this:
import pprint
# to get the values from books
clist = [v for v in books.values()]
# values in books as keys in count,
count = {}
for c in clist:
    count.setdefault(c, 0)
    count[c] += 1
pprint.pprint(count)
But I couldn't get the dict sorted.
Your code works fine. You can do this much more easily with Counter from the collections module. Simply pass books.values() to Counter:
from collections import Counter
counts = Counter(books.values())
print(counts)
Output:
Counter({'F': 5, 'A': 3, 'G': 3, 'E': 2, 'D': 2, 'B': 2, 'H': 1, 'C': 1})
To produce the layout you are expecting, ordered by count, you can iterate over the result of the most_common method and print each line (the relative order of entries with equal counts can vary between Python versions):
for char, value in counts.most_common():
    print("'{}': {}".format(char, value))
Output:
'F': 5
'G': 3
'A': 3
'E': 2
'D': 2
'B': 2
'C': 1
'H': 1
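If the order of ties matters, a plain dict can also be sorted with an explicit key. The sketch below is one possible approach, not part of the answers above: it sorts by descending count and then alphabetically, and for brevity it uses a list holding the same values as books.values() in the question.

```python
# Same values as books.values() in the question, written out as a list.
values = list("AABCDDFEFFFABEFGGGH")

count = {}
for v in values:
    count[v] = count.get(v, 0) + 1

# Sort by count descending (hence the minus sign), then by letter ascending.
ordered = sorted(count.items(), key=lambda kv: (-kv[1], kv[0]))

for letter, n in ordered:
    print("'{}': {}".format(letter, n))
```

With this key, entries that share a count always come out in alphabetical order, regardless of the Python version.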

Using python for frequency analysis

I am trying to use python to help me crack Vigenère ciphers. I am fairly new to programming but I've managed to make an algorithm to analyse single letter frequencies. This is what I have so far:
Ciphertext = str(input("What is the cipher text?"))
Letters = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
def LetterFrequency():
    LetterFrequency = {'A': 0, 'B': 0, 'C': 0, 'D': 0, 'E': 0, 'F': 0, 'G': 0, 'H': 0, 'I': 0, 'J': 0, 'K': 0, 'L': 0, 'M': 0, 'N': 0, 'O': 0, 'P': 0, 'Q': 0, 'R': 0, 'S': 0, 'T': 0, 'U': 0, 'V': 0, 'W': 0, 'X': 0, 'Y': 0, 'Z': 0}
    for letter in Ciphertext.upper():
        if letter in Letters:
            LetterFrequency[letter]+=1
    return LetterFrequency
print (LetterFrequency())
But is there a way for me to print the answers in descending order starting from the most frequent letter? The answers are shown in random order right now no matter what I do.
Also, does anyone know how to extract specific letters from a large block of text to perform frequency analysis? For instance, if I wanted to group every third letter of the text “THISISARATHERBORINGEXAMPLE” together for analysis, I would need to get:
T H I
S I S
A R A
T H E
R B O
R I N
G E X
A M P
L E
Normally I would have to do this by hand in either notepad or excel which takes ages. Is there a way to get around this in python?
Thanks in advance,
Tony
For the descending order you could use Counter:
>>> x = "this is a rather boring example"
>>> from collections import Counter
>>> Counter(x)
Counter({' ': 5, 'a': 3, 'e': 3, 'i': 3, 'r': 3, 'h': 2, 's': 2, 't': 2, 'b': 1, 'g': 1, 'm': 1, 'l': 1, 'o': 1, 'n': 1, 'p': 1, 'x': 1})
As for the second question, you could step through the text three characters at a time.
To exclude spaces you can try what @not_a_robot suggests in the comments, or
delete the entry manually like this:
>>> y = Counter(x)
>>> del y[' ']
>>> y
Counter({'a': 3, 'e': 3, 'i': 3, 'r': 3, 'h': 2, 's': 2, 't': 2, 'b': 1, 'g': 1, 'm': 1, 'l': 1, 'o': 1, 'n': 1, 'p': 1, 'x': 1})
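For the "every third letter" part of the question, slicing is probably enough. The following is a minimal sketch assuming a group size of 3; the names step, rows and columns are illustrative and do not come from the answers above.

```python
text = "THISISARATHERBORINGEXAMPLE"
step = 3  # assumed group size (for example, a guessed key length)

# Rows of three letters, as laid out by hand in the question.
rows = [text[i:i + step] for i in range(0, len(text), step)]

# Columns: every third letter starting at offsets 0, 1 and 2.
columns = [text[offset::step] for offset in range(step)]

print(rows)
print(columns)
```

Each entry of columns can then be passed to Counter for its own frequency analysis.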
Another approach, although the collections.Counter example from @coder is your best bet.
from operator import itemgetter
Letters = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
Ciphertext = "this is a rather boring example"
def LetterFrequency():
    LetterFrequency = {letter: 0 for letter in Letters}
    for letter in Ciphertext.upper():
        if letter in Letters:
            LetterFrequency[letter]+=1
    return LetterFrequency
def sort_dict(dct):
    # sort the (letter, count) pairs by count, highest first
    return sorted(dct.items(), key = itemgetter(1), reverse = True)
print(sort_dict(LetterFrequency()))
Which prints this, a list of tuples sorted by frequency in descending order:
[('A', 3), ('I', 3), ('E', 3), ('R', 3), ('T', 2), ('S', 2), ('H', 2), ('L', 1), ('G', 1), ('M', 1), ('P', 1), ('B', 1), ('N', 1), ('O', 1), ('X', 1), ('Y', 0), ('J', 0), ('D', 0), ('U', 0), ('F', 0), ('C', 0), ('Q', 0), ('W', 0), ('Z', 0), ('K', 0), ('V', 0)]
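The two answers can also be combined: Counter accepts any iterable, so the non-letters can be filtered out before counting rather than deleted afterwards. A hedged sketch, with illustrative variable names:

```python
from collections import Counter
from string import ascii_uppercase

ciphertext = "this is a rather boring example"

# Count only A-Z; spaces and punctuation never enter the Counter.
freq = Counter(ch for ch in ciphertext.upper() if ch in ascii_uppercase)

# most_common() yields (letter, count) pairs, highest count first.
for letter, n in freq.most_common():
    print(letter, n)
```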

How to define section number that 2d-array is divided to using Python?

I have this data structure:
It is a 2D array that is divided into 3 sections. For each letter in the array I need to determine its section number. For example, letters a, b, c, d are in Section 1; e, f, g, h are in Section 2.
My code. First, the preparation of the 2D array:
from itertools import cycle
letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l']
# 2D array initialization
width, height = 3, 6
repRange = cycle(range(1, 3))
values = [0] * (width - 1)
array2d = [[next(repRange)] + values for y in range(height)]
# Filling the array with letters:
m = 0
for i in range(height):
    for j in range(1,width):
        array2d[i][j] = letters[m]
        m+=1
# Printing:
for row in array2d:
    print(row)
Output:
[1, 'a', 'b']
[2, 'c', 'd']
[1, 'e', 'f']
[2, 'g', 'h']
[1, 'i', 'j']
[2, 'k', 'l']
Now I need to determine the section number of each letter and save it along with the letter itself. I use a defineSection function and save the values in a dictionary:
def defineSection(i, division, height):
    if i <= division:
        return 1
    elif division*2 >= i > division :
        return 2
    elif division*3 >= i > division*2 :
        return 3
dic = {}
for i in range(height):
    for j in range(1,width):
        section = defineSection(i+1, 2, height)
        dic.update({array2d[i][j] : section})
for item in dic.items():
    print(item)
Output:
('f', 2)
('b', 1)
('c', 1)
('e', 2)
('k', 3)
('g', 2)
('d', 1)
('a', 1)
('l', 3)
('h', 2)
('i', 3)
('j', 3)
It determined the section number of each letter correctly. But the defineSection function is primitive and will not work if the number of rows is greater than 6.
I don't know how to implement defineSection so that it determines the section number automatically, taking into account only the current row number, the division and the total number of rows.
Question: Is there a way to determine the section number without so many if-elif conditions and independently of the total number of rows?
You can simplify your matrix creation code immensely. All you need is a single iterator over the letters: passing the same iterator to zip twice makes it yield the letters two at a time.
In [3]: from itertools import cycle
In [4]: letters = "abcdefghijkl"
In [5]: ranges = cycle(range(1,3))
In [6]: iter_letters = iter(letters)
In [7]: matrix = [[i,a,b] for i,a,b in zip(ranges,iter_letters,iter_letters)]
In [8]: matrix
Out[8]:
[[1, 'a', 'b'],
[2, 'c', 'd'],
[1, 'e', 'f'],
[2, 'g', 'h'],
[1, 'i', 'j'],
[2, 'k', 'l']]
As for assigning sections, note that a section is every two rows, which is four letters, so you can use simple floor division to "skip" counts.
In [9]: sections = {letter:(i//4 + 1) for i,letter in enumerate(letters)}
In [10]: sections
Out[10]:
{'a': 1,
'b': 1,
'c': 1,
'd': 1,
'e': 2,
'f': 2,
'g': 2,
'h': 2,
'i': 3,
'j': 3,
'k': 3,
'l': 3}
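The same floor-division idea can be applied to row numbers directly, which matches the signature the question asks about. A minimal sketch, assuming 1-based row numbers and a fixed number of rows per section; define_section is a hypothetical replacement for defineSection and does not need the total row count.

```python
def define_section(row, division):
    """Return the 1-based section of a 1-based row, with `division` rows per section."""
    return (row - 1) // division + 1


# Rows 1-2 fall in section 1, rows 3-4 in section 2, and so on for any height.
sections = [define_section(row, 2) for row in range(1, 9)]
print(sections)
```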

counting letters in a string python

I have to write a function, countLetters(word), that takes in a word as argument and returns a list that counts the number of times each letter appears. The letters must be sorted in alphabetical order.
This is my attempt:
def countLetters(word):
    x = 0
    y = []
    for i in word:
        for j in range(len(y)):
            if i not in y[j]:
                x = (i, word.count(i))
                y.append(x)
    return y
I first tried it without the if i not in y[j]
countLetters("google")
result was
[('g', 2), ('o', 2), ('o', 2), ('g', 2), ('l', 1), ('e', 1)]
when I wanted
[('e', 1), ('g', 2), ('l', 1), ('o', 2)]
When I added the if i not in y[j] filter, it just returns an empty list [].
Could someone please point out my error here?
I recommend the collections module's Counter if you're on Python 2.7+:
>>> import collections
>>> s = 'a word and another word'
>>> c = collections.Counter(s)
>>> c
Counter({' ': 4, 'a': 3, 'd': 3, 'o': 3, 'r': 3, 'n': 2, 'w': 2, 'e': 1, 'h': 1, 't': 1})
You can do the same in any version of Python with an extra line or two:
>>> c = {}
>>> for i in s:
...     c[i] = c.get(i, 0) + 1
...
This would also be useful to check your work.
To sort in alphabetical order (the above is sorted by frequency):
>>> for letter, count in sorted(c.items()):
...     print('{letter}: {count}'.format(letter=letter, count=count))
...
 : 4
a: 3
d: 3
e: 1
h: 1
n: 2
o: 3
r: 3
t: 1
w: 2
or to keep in a format that you can reuse as a dict:
>>> import pprint
>>> pprint.pprint(dict(c))
{' ': 4,
 'a': 3,
 'd': 3,
 'e': 1,
 'h': 1,
 'n': 2,
 'o': 3,
 'r': 3,
 't': 1,
 'w': 2}
Finally, to get that as a list:
>>> pprint.pprint(sorted(c.items()))
[(' ', 4),
 ('a', 3),
 ('d', 3),
 ('e', 1),
 ('h', 1),
 ('n', 2),
 ('o', 3),
 ('r', 3),
 ('t', 1),
 ('w', 2)]
I think the problem lies in your outer for loop, as you are iterating over each letter in the word.
If the word contains more than one of a certain letter, for example "bees", when it iterates over this, it will now count the number of 'e's twice as the for loop does not discriminate against unique values. Look at string iterators, this might clarify this more. I'm not sure this will solve your problem, but this is the first thing that I noticed.
You could maybe try something like this:
tally = {}
for s in check_string:
    # dict.has_key() was removed in Python 3; "in" works in both versions
    if s in tally:
        tally[s] += 1
    else:
        tally[s] = 1
and then you can just retrieve the tally for each letter from that dictionary.
Your list y is always empty. You are never getting inside a loop for j in range(len(y))
P.S. your code is not very pythonic
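For comparison, here is one way the function from the question could be written so that the result list is actually filled. This is a sketch rather than the only fix: count_letters is a hypothetical name, and it still uses word.count as the original attempt does.

```python
def count_letters(word):
    """Return (letter, count) tuples sorted alphabetically by letter."""
    result = []
    # set(word) keeps each letter once; sorted() puts them in alphabetical order
    for letter in sorted(set(word)):
        result.append((letter, word.count(letter)))
    return result


print(count_letters("google"))
```

Iterating over the unique letters instead of over y avoids the problem that the inner loop never runs while y is still empty.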
Works fine with the latest Py3 and Py2:
from collections import Counter
def countItems(iterable):
    return sorted(Counter(iterable).items())
Using a dictionary and pprint, based on the answer from @Aaron Hall:
import pprint
def countLetters(word):
    y = {}
    for i in word:
        if i in y:
            y[i] += 1
        else:
            y[i] = 1
    return y
res1 = countLetters("google")
pprint.pprint(res1)
res2 = countLetters("Google")
pprint.pprint(res2)
Output:
{'e': 1, 'g': 2, 'l': 1, 'o': 2}
{'G': 1, 'e': 1, 'g': 1, 'l': 1, 'o': 2}
I am not sure what your expected output is. According to the problem statement, it seems you should sort the word first so that the letter counts come out in sorted order. The code below may be helpful:
def countLetters(word):
    letter = []
    cnt = []
    for c in sorted(word):
        if c not in letter:
            letter.append(c)
            cnt.append(1)
        else:
            # word is sorted, so a repeated letter is always the latest one added
            cnt[-1] += 1
    # list() so that Python 3 returns a list rather than a zip iterator
    return list(zip(letter, cnt))
print(countLetters('hello'))
This will give you [('e', 1), ('h', 1), ('l', 2), ('o', 1)]
You can create a dict of characters first, and then a list of tuples:
text = 'hello'
my_dict = {x : text.count(x) for x in text}
my_list = [(key, my_dict[key]) for key in my_dict]
print(my_dict)
print(my_list)
Output:
{'h': 1, 'e': 1, 'l': 2, 'o': 1}
[('h', 1), ('e', 1), ('l', 2), ('o', 1)]
