How to apply a for loop to a string in Python

How do I apply a for loop to a string in Python which allows me to count each letter in a word? The ultimate goal is to discover the most common letter.
This is the code so far:
print "Type 'exit' at any time to exit the program."
continue_running = True
while continue_running:
    word = raw_input("Please enter a word: ")
    if word == "exit":
        continue_running = False
    else:
        while not word.isalpha():
            word = raw_input("ERROR: Please type a single word containing alphabetic characters only:")
        print word
        if len(word) == 1:
            print word + " has " + str(len(word)) + " letter."
        else:
            print word + " has " + str(len(word)) + " letters."
        if sum(1 for v in word.upper() if v in ["A", "E", "I", "O", "U"]) == 1:
            print "It also has ", sum(1 for v in word.upper() if v in ["A", "E", "I", "O", "U"]), " vowel."
        else:
            print "It also has ", sum(1 for v in word.upper() if v in ["A", "E", "I", "O", "U"]), " vowels."
        if sum(1 for c in word if c.isupper()) == 1:
            print "It has ", sum(1 for c in word if c.isupper()), " capital letter."
        else:
            print "It has ", sum(1 for c in word if c.isupper()), " capital letters."
        for loop in word:
I know I can use
collections.Counter(word.upper()).most_common(1)[0]
but this isn't the way I want to do it.

You can simply loop directly over strings; they are sequences just like lists and tuples; each character in the string is a separate element:
for character in word.upper():
    # count the character.
Counts can then be collected in a dictionary:
counts = {}
for character in word.upper():
    # count the character.
    counts[character] = counts.get(character, 0) + 1
after which you'd pick the most common one; you can use the max() function for that:
most_common = max(counts, key=counts.__getitem__) # maximum by value
Demo:
>>> word = 'fooBARbazaar'
>>> counts = {}
>>> for character in word.upper():
...     # count the character.
...     counts[character] = counts.get(character, 0) + 1
...
>>> counts
{'A': 4, 'B': 2, 'F': 1, 'O': 2, 'R': 2, 'Z': 1}
>>> max(counts, key=counts.__getitem__)
'A'
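For Python 3, the same counting loop can be wrapped in a small function. This is only a sketch: the name `most_common_letter` is made up for illustration and does not appear in the answer above, and it assumes a non-empty word (the question's `isalpha()` check already guarantees that). On a tie, `max()` returns the first maximal key, which is the letter seen first because dictionaries keep insertion order in Python 3.7+.

```python
def most_common_letter(word):
    """Return (letter, count) for the most frequent letter, ignoring case.

    Hypothetical helper; assumes `word` is not empty.
    """
    counts = {}
    for character in word.upper():
        # start at 0 the first time a character is seen, then add 1
        counts[character] = counts.get(character, 0) + 1
    letter = max(counts, key=counts.get)  # key with the largest count
    return letter, counts[letter]


print(most_common_letter("fooBARbazaar"))  # ('A', 4)
```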

If you don't want to reinvent the wheel, try using collections.Counter:
>>> from collections import Counter
>>> x = Counter("stackoverflow")
>>> x
Counter({'o': 2, 'a': 1, 'c': 1, 'e': 1, 'f': 1, 'k': 1, 'l': 1, 's': 1, 'r': 1, 't': 1, 'w': 1, 'v': 1})
>>> print max(x, key=x.__getitem__)
o

Related

Removing duplicates using a dictionary

I am writing a function that is supposed to count how many individual records are duplicated, not how many duplicate occurrences there are in total. Right now my output is the total number of duplications, which I don't want.
For example, if there are 4 duplicates of one record, it gives me 4 instead of 1; if there are 6 duplicates spread over 2 individual records, it should give me 2.
Could someone please help me find the bug?
Thank you.
def duplicate_count(text):
    text = text.lower()
    dict = {}
    word = 0
    if len(text) != "":
        for a in text:
            dict[a] = dict.get(a,0) + 1
        for a in text:
            if dict[a] > 1:
                word = word + 1
        return word
    else:
        return "0"
Fixed it:
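def duplicate_count(text):
    text = text.lower()
    counts = {}  # renamed from dict so the built-in is not shadowed
    for a in text:
        counts[a] = counts.get(a, 0) + 1
    # count each distinct character once if it occurs at least twice;
    # an empty string gives 0, so no separate length check is needed
    return sum(1 for a in counts.values() if a >= 2)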
You can do this with set and sum. First, set removes all duplicates, so the loop runs once per distinct character instead of once per character of the input. Each distinct character is then stored in a dictionary together with the number of times it occurs in the text. Finally, sum consumes a generator over those values and counts how many of them are greater than 1.
def dup_cnt(t:str) -> int:
    if not t: return 0
    t = t.lower()
    d = dict()
    for c in set(t):
        d[c] = t.count(c)
    return sum(v>1 for v in d.values())
print(dup_cnt('aabccdeefggh')) #4
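Calling `t.count(c)` once per distinct character rescans the whole string each time. As an alternative sketch, `collections.Counter` builds all the counts in a single pass; the function name `count_duplicated_chars` is illustrative only and not from the answers above.

```python
from collections import Counter


def count_duplicated_chars(text):
    """Return how many distinct characters occur more than once (case-insensitive)."""
    counts = Counter(text.lower())  # one pass over the text
    return sum(1 for n in counts.values() if n > 1)


print(count_duplicated_chars("aabccdeefggh"))  # 4
```

An empty string needs no special case here: the Counter is empty and the sum is 0.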
I don't fully understand the question.
I assume you want either the number of duplicated letters or the count of each letter in the text. Something like this should help:
def duplicate_count(text):
    count_dict = {}
    for letter in text.lower():
        count_dict[letter] = count_dict.setdefault(letter, 0) + 1
    return count_dict
ret = duplicate_count('asuhvknasiasifjiasjfija')
# Get all letter details
print(ret)
#{'a': 5, 's': 4, 'u': 1, 'h': 1, 'v': 1, 'k': 1, 'n': 1, 'i': 4, 'f': 2, 'j': 3}
# Get all letter count
print(len(ret))
# 10
# Get only the letters that appear more than once in the text
dedup = {k: v for k, v in ret.items() if v > 1}
# Get only duplicated letter details
print(dedup)
# {'a': 5, 's': 4, 'i': 4, 'f': 2, 'j': 3}
# Get only duplicated letter count
print(len(dedup))
# 5

How would I make a Horizontal list into a vertical list

How would I make my vowel counter print a vertical list instead of a horizontal one? Every time I use \n it just gives me an error, and I have no clue what I am doing wrong. I am a beginner coder and still have trouble solving this. I have looked for an answer and haven't found one. Could anybody help? At the moment the output looks like this: {'a': 1, 'e': 1, 'i': 2, 'o': 1, 'u': 1}.
How do I make it look vertical, like this?
'a': 1
'e': 1
'i': 2
'o': 1
'u': 1
def count_vowels(string, vowels):
    string = string.casefold()
    count = {}.fromkeys(vowels, 0)
    # To count the vowels
    for character in string:
        if character in count:
            count[character] += 1
    return count
vowels = ("a", "e", "i", "o", "u")
string = "Counting all the strings"
print(count_vowels(string, vowels))
If you want something other than Python's default formatting, then you have to do it yourself:
vowels = ("a", "e", "i", "o", "u")
string = "Counting all the strings"
for k,v in count_vowels(string, vowels).items():
    print( f"{k}: {v}" )
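The loop above prints `a: 1` without the quotes shown in the question's desired output. If the quotes matter, the `!r` conversion in an f-string formats each key with `repr()`. A minimal sketch, where `format_counts` is a hypothetical helper name:

```python
def format_counts(counts):
    """Return one "'key': value" line per dictionary entry."""
    # {key!r} applies repr() to the key, so the string 'a' keeps its quotes
    return "\n".join(f"{key!r}: {value}" for key, value in counts.items())


print(format_counts({"a": 1, "e": 1, "i": 2, "o": 1, "u": 1}))
```

Building the text with `"\n".join(...)` keeps the formatting separate from the printing, so the same string could also be written to a file.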
Converting the dictionary to a string and stripping the braces also works:
...
string = "CoUnting All the strings"
counter = count_vowels(string, vowels)
print(*str(counter).strip('{}').split(', '), sep='\n')
#'a': 1
#'e': 1
#'i': 2
#'o': 1
#'u': 1

Finding the occurrence of a specific character from a text file

I want to write a function that takes a file name and a list as arguments and reads the text within the corresponding file, then creates a dictionary whose keys are the characters of the provided list, and the values are the counts of these characters within the text. If the file does not exist, then the function returns an empty dictionary.
For example:
sample.txt
This is an example sentence
This is yet another sentence
Here is the code that I've written so far:
def text2dict(filename, characters):
    count_word = dict()
    for char in characters:
        count_word[char] = 0
    with open(filename) as input_text:
        text = input_text.read()
    words = text.lower().split()
    for word in words:
        _word = word.strip('.,:-)()')
        if word in count_word:
            count_word[_word] += 1
    return count_word
file_name = "sample.txt"
list_char = ["a", "b", "c", "t"]
text2dict(file_name, list_char)
Expected Output:
{'a':3, 'b':0, 'c':2, 't':6}
The output I got:
{'a': 0, 'b': 0, 'c': 0, 't': 0}
You can use str.count() for that. There is also no need to pre-fill the dictionary any more, because each count is assigned directly instead of being incremented with +=. The text is lowercased first so that capital letters, such as the T in "This", are counted as well.
def text2dict(filename, characters):
    count_letters = dict()
    with open(filename) as input_text:
        text = input_text.read().lower()
    for k in characters:
        count_letters[k] = text.count(k)
    return count_letters
With this you get the expected result:
>>> file_name = r"sample.txt"
>>> list_char = ["a", "b", "c", "t"]
>>> print(text2dict(file_name, list_char))
{'a': 3, 'b': 0, 'c': 2, 't': 6}
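The question also asks for an empty dictionary when the file does not exist. One possible way to cover that in Python 3 is to catch `FileNotFoundError` around the `open()` call. This is a sketch rather than part of the original answer, and it assumes the text should be lowercased before counting, as in the asker's own attempt.

```python
def text2dict(filename, characters):
    """Count each character of `characters` in the file, ignoring case.

    Returns an empty dict if the file does not exist.
    """
    try:
        with open(filename) as input_text:
            text = input_text.read().lower()
    except FileNotFoundError:
        return {}
    # str.count scans the text once per requested character
    return {char: text.count(char) for char in characters}
```

Other I/O problems, such as missing read permission, still raise as usual; only the missing-file case named in the question is mapped to `{}`.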
You are checking whether a whole word exists in the dictionary, but the dictionary only contains letters.
Spoiler, if you want a working version right away:
def text2dict(filename, characters):
    count_word = dict()
    for char in characters:
        count_word[char] = 0
    with open(filename) as input_text:
        text = input_text.read()
    words = text.lower().split()
    for word in words:
        _word = list(word)
        for i in _word:
            if i in count_word:
                count_word[i] += 1
    return count_word
You can use collections.Counter(). Count the characters of the whole file once, then look up each wanted letter (a Counter returns 0 for a missing key):
from collections import Counter
with open('test.txt') as file:
    counts = Counter(file.read().lower())
wanted = ["a", "b", "c", "t"]
finalDict = {letter: counts[letter] for letter in wanted}
Output, assuming test.txt holds the sample text from the question:
{'a': 3, 'b': 0, 'c': 2, 't': 6}
Shortened down all the way, for funsies:
from collections import Counter
finalDict = {letter: counts[letter] for counts in [Counter(open('test.txt').read().lower())] for letter in ["a", "b", "c", "t"]}

Anagram test for two strings in python

This is the question:
Write a function named test_for_anagrams that receives two strings as
parameters, both of which consist of alphabetic characters and returns
True if the two strings are anagrams, False otherwise. Two strings are
anagrams if one string can be constructed by rearranging the
characters in the other string using all the characters in the
original string exactly once. For example, the strings "Orchestra" and
"Carthorse" are anagrams because each one can be constructed by
rearranging the characters in the other one using all the characters
in one of them exactly once. Note that capitalization does not matter
here i.e. a lower case character can be considered the same as an
upper case character.
My code:
def test_for_anagrams (str_1, str_2):
    str_1 = str_1.lower()
    str_2 = str_2.lower()
    print(len(str_1), len(str_2))
    count = 0
    if (len(str_1) != len(str_2)):
        return (False)
    else:
        for i in range(0, len(str_1)):
            for j in range(0, len(str_2)):
                if(str_1[i] == str_2[j]):
                    count += 1
        if (count == len(str_1)):
            return (True)
        else:
            return (False)
#Main Program
str_1 = input("Enter a string 1: ")
str_2 = input("Enter a string 2: ")
result = test_for_anagrams (str_1, str_2)
print (result)
The problem is that when I enter the strings Orchestra and Carthorse, the result is False. The same happens for the strings The eyes and They see. Any help would be appreciated.
I'm new to python, so excuse me if I'm wrong
I believe this can be done in a different approach: sort the given strings and then compare them.
def anagram(a, b):
    # string to list
    str1 = list(a.lower())
    str2 = list(b.lower())
    #sort list
    str1.sort()
    str2.sort()
    #join list back to string
    str1 = ''.join(str1)
    str2 = ''.join(str2)
    return str1 == str2
print(anagram('Orchestra', 'Carthorse'))
The problem is that you only check whether any matching characters exist in the two strings and increment the counter for every match. You do not account for characters that you have already matched with another one. That's why the following also fails:
>>> test_for_anagrams('aa', 'aa')
False
Even though the strings are equal (and therefore also anagrams), you match each a of the first string with each a of the other string, so the count ends up as 4 instead of 2 and the result is False.
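To see the over-counting directly, the nested loops from the question can be reduced to a function that only returns the counter. This is an illustrative sketch; `count_matching_pairs` is a made-up name, not part of the question or the answer.

```python
def count_matching_pairs(str_1, str_2):
    """Count every (i, j) pair with str_1[i] == str_2[j], as the nested loops do."""
    return sum(1 for a in str_1 for b in str_2 if a == b)


print(count_matching_pairs("aa", "aa"))                # 4, but len("aa") is 2
print(count_matching_pairs("orchestra", "carthorse"))  # 11, but the length is 9
```

Every repeated letter contributes the product of its counts in the two strings, so the total only equals the length when no letter repeats.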
What you should do in general is count every character occurrence and make sure that every character occurs as often in each string. You can count characters by using a collections.Counter object. You then just need to check whether the counts for each string are the same, which you can easily do by comparing the counter objects (which are just dictionaries):
from collections import Counter
def test_for_anagrams (str_1, str_2):
    c1 = Counter(str_1.lower())
    c2 = Counter(str_2.lower())
    return c1 == c2
>>> test_for_anagrams('Orchestra', 'Carthorse')
True
>>> test_for_anagrams('aa', 'aa')
True
>>> test_for_anagrams('bar', 'baz')
False
For completeness: if simply importing Counter and being done with it is not in the spirit of the exercise, you can use plain dictionaries to count the letters.
def test_for_anagrams(str_1, str_2):
    counter1 = {}
    for c in str_1.lower():
        counter1[c] = counter1.get(c, 0) + 1
    counter2 = {}
    for c in str_2.lower():
        counter2[c] = counter2.get(c, 0) + 1
    # print statements so you can see what's going on,
    # comment out/remove at will
    print(counter1)
    print(counter2)
    return counter1 == counter2
Demo:
print(test_for_anagrams('The eyes', 'They see'))
print(test_for_anagrams('orchestra', 'carthorse'))
print(test_for_anagrams('orchestr', 'carthorse'))
Output:
{' ': 1, 'e': 3, 'h': 1, 's': 1, 't': 1, 'y': 1}
{' ': 1, 'e': 3, 'h': 1, 's': 1, 't': 1, 'y': 1}
True
{'a': 1, 'c': 1, 'e': 1, 'h': 1, 'o': 1, 's': 1, 'r': 2, 't': 1}
{'a': 1, 'c': 1, 'e': 1, 'h': 1, 'o': 1, 's': 1, 'r': 2, 't': 1}
True
{'c': 1, 'e': 1, 'h': 1, 'o': 1, 's': 1, 'r': 2, 't': 1}
{'a': 1, 'c': 1, 'e': 1, 'h': 1, 'o': 1, 's': 1, 'r': 2, 't': 1}
False
Traverse string test and check whether each character is still available in test1. If it is, remove that one occurrence so it cannot be matched twice; otherwise return False.
The strings are anagrams if every character of test1 has been used up at the end.
def anagram(test, test1):
    remaining = list(test1)
    for data in test:
        if data in remaining:
            # consume the match so a repeated letter is not counted twice
            remaining.remove(data)
        else:
            return False
    return not remaining
anagram("abcd","adbc")
I have written an anagram program in a basic, easy-to-understand way.
def compare(str1,str2):
    if str1 is None or str2 is None:
        print(" You didn't enter a string.")
    elif len(str1) != len(str2):
        print(" The strings entered are not anagrams.")
    else:
        b=[]
        c=[]
        for i in str1:
            #print(i)
            b.append(i)
        b.sort()
        print(b)
        for j in str2:
            #print(j)
            c.append(j)
        c.sort()
        print(c)
        if (b==c and b!=[] ):
            print(" The strings entered are anagrams.")
        else:
            print(" The strings entered are not anagrams.")
str1=input(" Enter the first String :")
str2=input(" Enter the second String :")
compare(str1,str2)
A more concise and Pythonic way to do it is with sorted() and lower()/upper().
First use lower() (or upper()) to make the case consistent, then sort the characters of each string and compare the results. Note that sorted() returns a list, so the case conversion has to be applied to the string before sorting:
# Function definition
def test_for_anagrams (str_1, str_2):
    return sorted(str_1.lower()) == sorted(str_2.lower())
#Main Program
str_1 = input("Enter a string 1: ")
str_2 = input("Enter a string 2: ")
result = test_for_anagrams (str_1, str_2)
print (result)
Another solution:
def test_for_anagrams(my_string1, my_string2):
    s1,s2 = my_string1.lower(), my_string2.lower()
    count = 0
    if len(s1) != len(s2) :
        return False
    for char in s1 :
        if s2.count(char,0,len(s2)) == s1.count(char,0,len(s1)):
            count = count + 1
    return count == len(s1)
My solution is:
#anagrams
def is_anagram(a, b):
    if sorted(a) == sorted(b):
        return True
    else:
        return False
print(is_anagram("Alice", "Bob"))
def anagram(test,test1):
    test_value = []
    if len(test) == len(test1):
        for i in test:
            value = test.count(i) == test1.count(i)
            test_value.append(value)
    else:
        test_value.append(False)
    if False in test_value:
        return False
    else:
        return True
Check the lengths of test and test1. If they match, traverse test and compare each character's count in both strings, storing the result of every comparison in a list. The strings are anagrams only if that list contains no False.

Determining Letter Frequency Of Cipher Text

I am trying to make a tool that finds the frequencies of letters in some type of cipher text.
Let's suppose it is all lowercase a-z with no numbers. The encoded message is in a txt file.
I am trying to build a script to help in cracking substitution or possibly transposition ciphers.
Code so far:
cipher = open('cipher.txt','U').read()
cipherfilter = cipher.lower()
cipherletters = list(cipherfilter)
alpha = list('abcdefghijklmnopqrstuvwxyz')
occurrences = {}
for letter in alpha:
    occurrences[letter] = cipherfilter.count(letter)
for letter in occurrences:
    print letter, occurrences[letter]
All it does so far is show how many times a letter appears.
How would I print the frequency of all letters found in this file?
import collections
d = collections.defaultdict(int)
for c in 'test':
    d[c] += 1
print d # defaultdict(<type 'int'>, {'s': 1, 'e': 1, 't': 2})
From a file:
myfile = open('test.txt')
for line in myfile:
    line = line.rstrip('\n')
    for c in line:
        d[c] += 1
For the genius that is the defaultdict container, we must give thanks and praise. Otherwise we'd all be doing something silly like this:
s = "andnowforsomethingcompletelydifferent"
d = {}
for letter in s:
    if letter not in d:
        d[letter] = 1
    else:
        d[letter] += 1
The modern way:
from collections import Counter
string = "ihavesometextbutidontmindsharing"
Counter(string)
#>>> Counter({'i': 4, 't': 4, 'e': 3, 'n': 3, 's': 2, 'h': 2, 'm': 2, 'o': 2, 'a': 2, 'd': 2, 'x': 1, 'r': 1, 'u': 1, 'b': 1, 'v': 1, 'g': 1})
If you want to know the relative frequency of a letter c, you would have to divide number of occurrences of c by the length of the input.
For instance, taking Adam's example:
s = "andnowforsomethingcompletelydifferent"
n = len(s) # n = 37
and storing the absolute frequency of each letter in
counts[letter]
we obtain the relative frequencies by:
from string import ascii_lowercase # this is "a...z"
for c in ascii_lowercase:
    print c, counts[c]/float(n)
Putting it all together, we get something like this:
# get input
s = "andnowforsomethingcompletelydifferent"
n = len(s) # n = 37
# get absolute frequencies of letters
import collections
counts = collections.defaultdict(int) # named counts so the built-in dict is not shadowed
for c in s:
    counts[c] += 1
# print relative frequencies
from string import ascii_lowercase # this is "a...z"
for c in ascii_lowercase:
    print c, counts[c]/float(n)
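The answers in this thread use Python 2 print statements. As a rough Python 3 equivalent, the sketch below combines `Counter` with the division by the total described above. The function name `relative_frequencies` is an assumption for illustration, and it divides by the number of a-z letters only, so spaces and punctuation in a cipher text do not skew the result.

```python
from collections import Counter
from string import ascii_lowercase  # "abcdefghijklmnopqrstuvwxyz"


def relative_frequencies(text):
    """Map each letter a-z to its share of all a-z letters in `text`."""
    letters = [c for c in text.lower() if c in ascii_lowercase]
    total = len(letters)
    counts = Counter(letters)
    # a Counter returns 0 for letters that never occur
    return {c: (counts[c] / total if total else 0.0) for c in ascii_lowercase}


freqs = relative_frequencies("andnowforsomethingcompletelydifferent")
# print the letters from most to least frequent
for letter, freq in sorted(freqs.items(), key=lambda item: item[1], reverse=True):
    print(letter, round(freq, 3))
```

Sorting by frequency is convenient for substitution ciphers, where the most frequent cipher letters are usually compared against the most frequent letters of the expected language.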
