Writing Program to count different letters

I'm writing a program that takes a string and two parameters and counts the letter e's that match the specified parameters. I get the correct count for the first and last combinations but not for the two in between. Here is my function:
def count_letter_e(string_to_be_counted,ignore_case=True,ignore_accent=True):
    """
    Return the number of times 'e' and its variations appears in a string given the parameters specified.
    Parameters
    ----------
    string_to_be_counted: str
        A string containing e's that need to be counted
    Returns
    -------
    total: int
        Number of times 'e' and the specified variations appear in the given string
    """
    #counting individual letters to be used to calculate totals in if statements
    #Gets all counts of lowercase 'e'
    e_counted=string_to_be_counted.count('e')
    é_counted=string_to_be_counted.count('é')
    ê_counted=string_to_be_counted.count('ê')
    è_counted=string_to_be_counted.count('è')
    #Get all counts of Uppercase 'E'
    E_counted=string_to_be_counted.count('E')
    É_counted=string_to_be_counted.count('É')
    Ê_counted=string_to_be_counted.count('Ê')
    È_counted=string_to_be_counted.count('È')
    #Create a total variable
    total=0
    #check which parameters have been set
    if ignore_case == True and ignore_accent == True:
        total=e_counted + é_counted + ê_counted + è_counted + E_counted + É_counted + Ê_counted + È_counted
        return total
        total=0
    elif ignore_case == True and ignore_accent == False:
        total= e_counted + E_counted
        return total
        total=0
    elif ignore_case == False and ignore_accent == True:
        total= e_counted + é_counted + ê_counted + è_counted
        return total
        total=0
    elif ignore_case == False and ignore_accent == False:
        total=e_counted
        return total
        total=0
Here are the sentences that I'm testing:
sentence_1=("ThE weEk will bè frÊe until thÉre is a shÈèp that is freêd from thé pen")
sentence_2=("Thé redEyê fèlt likE a rÊal pain until I got hit in the hÊel by a freE sÈed")
sentence_3=("The frée pÊa made a gêtaway towards thé hèêl of a pÉnquin but only made it to the knEÈ")
sentence_4=("ThErÉ is a knêe that nèÊds to meÈt the queen for tÈsting of léaning pizza")
Here is the output vs. the desired output for each:
sentence 1: 14 v 14 (This is good)
sentence 2: 7 v 8 (This is not good)
sentence 3: 10 v 7 (This is not good)
sentence 4: 5 v 5 (This is good)
Any help would be appreciated!

Here is the improvement for your function:
def count_letter_e(string_to_be_counted,ignore_case=True,ignore_accent=True):
    chars_to_count = { # chars that will be counted
                       # based on the "ignore_case+ignore_accent" state
        (True, True):'eéêèEÉÊÈ',
        (True, False):'eE',
        (False, True):'eéêè',
        (False, False):'e'
    }
    condition = (ignore_case, ignore_accent)
    result = 0
    for c in chars_to_count[condition]:
        result += string_to_be_counted.count(c)
    return result
Or just the same in a shortcut way:
def count_letter_e(string_to_be_counted,ignore_case=True,ignore_accent=True):
    chars_to_count = {
        (True, True):'eéêèEÉÊÈ',
        (True, False):'eE',
        (False, True):'eéêè',
        (False, False):'e'
    }
    return sum([string_to_be_counted.count(c) for c in chars_to_count[(ignore_case, ignore_accent)]])
The value of this approach is not only in a significant code reduction, but also in the fact that all the settings of your function are now in one place - in the dictionary chars_to_count - and you can quickly and flexibly change them for other count-tasks.
Results:
sentence_1 = "ThE weEk will bè frÊe until thÉre is a shÈèp that is freêd from thé pen"
sentence_2 = "Thé redEyê fèlt likE a rÊal pain until I got hit in the hÊel by a freE sÈed"
sentence_3 = "The frée pÊa made a gêtaway towards thé hèêl of a pÉnquin but only made it to the knEÈ"
sentence_4 = "ThErÉ is a knêe that nèÊds to meÈt the queen for tÈsting of léaning pizza"
print(count_letter_e(sentence_1, True, True)) # 14
print(count_letter_e(sentence_2, True, False)) # 8
print(count_letter_e(sentence_3, False, True)) # 10
print(count_letter_e(sentence_4, False, False)) # 5
Note that your original code produces the same results.
And it seems that there is no error - based on the logic of the program, the desired results should be the same as in the printout above.

I like @MaximTitarenko's approach, but here's another option. It's not as DRY as it could be, but it makes the counting logic very clear.
def count_letter_e(string, ignore_case=True, ignore_accent=True):
    if ignore_case and ignore_accent:
        counts = [
            string.count('e'),
            string.count('é'),
            string.count('ê'),
            string.count('è'),
            string.count('E'),
            string.count('É'),
            string.count('Ê'),
            string.count('È'),
        ]
    elif ignore_case and not ignore_accent:
        counts = [
            string.count('e'),
            string.count('E'),
        ]
    elif not ignore_case and ignore_accent:
        counts = [
            string.count('e'),
            string.count('é'),
            string.count('ê'),
            string.count('è'),
        ]
    elif not ignore_case and not ignore_accent:
        counts = [
            string.count('e'),
        ]
    return sum(counts)
sentence_1 = 'ThE weEk will bè frÊe until thÉre is a shÈèp that is freêd from thé pen'
sentence_2 = 'Thé redEyê fèlt likE a rÊal pain until I got hit in the hÊel by a freE sÈed'
sentence_3 = 'The frée pÊa made a gêtaway towards thé hèêl of a pÉnquin but only made it to the knEÈ'
sentence_4 = 'ThErÉ is a knêe that nèÊds to meÈt the queen for tÈsting of léaning pizza'
print(count_letter_e(sentence_1, True, True))
print(count_letter_e(sentence_2, True, False))
print(count_letter_e(sentence_3, False, True))
print(count_letter_e(sentence_4, False, False))
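As a possible alternative to listing every accented variant by hand, the same four cases can be sketched with Unicode normalization from the standard library. The names strip_accents and count_letter_e_normalized are hypothetical, and the sketch assumes that "ignoring accents" means folding any accented e (not only é, ê, è) onto a plain e, which is slightly broader than the answers above.

```python
import unicodedata

def strip_accents(text):
    """Drop combining marks so that 'é', 'ê', 'è' all become 'e'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def count_letter_e_normalized(text, ignore_case=True, ignore_accent=True):
    # Hypothetical helper: same parameter meaning as count_letter_e above.
    if ignore_accent:
        text = strip_accents(text)
    else:
        # Compose first, so an 'e' followed by a combining accent
        # is not mistaken for a plain 'e'.
        text = unicodedata.normalize("NFC", text)
    if ignore_case:
        text = text.lower()
    return text.count("e")

print(count_letter_e_normalized("Été", True, True))  # 2
```

With the four test sentences and the same parameter combinations as above, this sketch should give the same totals (14, 8, 10 and 5).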

Related

Longest Common Prefix from list elements in Python

I have a list as below:
strs = ["flowers", "flow", "flight"]
Now, I want to find the longest prefix of the elements from the list. If there is no match then it should return "". I am trying to use the 'Divide and Conquer' rule for solving the problem. Below is my code:
strs = ["flowers", "flow", "flight"]
firstHalf = ""
secondHalf = ""
def longestCommonPrefix(strs) -> str:
    minValue = min(len(i) for i in strs)
    length = len(strs)
    middle_index = length // 2
    firstHalf = strs[:middle_index]
    secondHalf = strs[middle_index:]
    minSecondHalfValue = min(len(i) for i in secondHalf)
    matchingString=[] #Creating a stack to append the matching characters
    for i in range(minSecondHalfValue):
        secondHalf[0][i] == secondHalf[1][i]
    return secondHalf
print(longestCommonPrefix(strs))
I was able to find the middle and divide the list into two parts. Now I am trying to use the second half to get the longest prefix, but I am unable to do so. I created a stack to which I would add the consecutive matching characters, and then I would compare it with firstHalf. But how can I get the consecutive matching characters from the start?
Expected output:
"fl"
Just a suggestion would also help. I can give it a try.

No matter what, you need to look at each character from each string in turn (until you find a set of corresponding characters that doesn't match), so there's no benefit to splitting the list up. Just iterate through and break when the common prefix stops being common:
def common_prefix(strs) -> str:
    prefix = ""
    for chars in zip(*strs):
        if len(set(chars)) > 1:
            break
        prefix += chars[0]
    return prefix
print(common_prefix(["flowers", "flow", "flight"])) # fl
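For comparison, the standard library already has a character-by-character prefix function, os.path.commonprefix. Despite living in os.path, it works on arbitrary strings. The wrapper name common_prefix_stdlib below is only an illustration.

```python
import os.path

def common_prefix_stdlib(words):
    """Longest common prefix via os.path.commonprefix (compares characters, not path components)."""
    return os.path.commonprefix(list(words))

print(common_prefix_stdlib(["flowers", "flow", "flight"]))  # fl
```

It returns an empty string for an empty list or when nothing matches.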

Even if this problem has already found its solution, I would like to post my approach (I considered the problem interesting, so started playing around with it).
So, a divide-and-conquer solution splits one big task into many smaller subtasks, whose results are then combined by further small tasks, and so on until you reach the final solution. The typical example is a sum of numbers (let's take 1 to 8), which can be done sequentially (1 + 2 = 3, then 3 + 3 = 6, then 6 + 4 = 10... until the end) or by splitting the problem (1 + 2 = 3, 3 + 4 = 7, 5 + 6 = 11, 7 + 8 = 15, then 3 + 7 = 10 and 11 + 15 = 26...). The second approach has the clear advantage that it can be parallelized, which can improve run time considerably in the right setup; that is why it generally goes hand in hand with topics like multithreading.
So my approach:
import math
def run(lst):
    if len(lst) > 1:
        lst_split = [lst[2 * (i-1) : min(len(lst) + 1, 2 * i)] for i in range(1, math.ceil(len(lst)/2.0) + 1)]
        lst = [Processor().process(*x) for x in lst_split]
        if any([len(x) == 0 for x in lst]):
            return ''
        return run(lst)
    else:
        return lst[0]
class Processor:
    def process(self, w1, w2 = None):
        if w2 != None:
            zipped = list(zip(w1, w2))
            for i, (x, y) in enumerate(zipped):
                if x != y:
                    return w1[:i]
                if i + 1 == len(zipped):
                    return w1[:i+1]
        else:
            return w1
        return ''
lst = ["flowers", "flow", "flight", "flask", "flock"]
print(run(lst))
OUTPUT
fl
If you look at the run function, the passed lst gets split into pairs, which then get processed (this is where you could start multiple threads, but let's not focus on that). The resulting list gets reprocessed until only one element remains.
An interesting aspect of this problem is: if, after a pass, you get one empty match (two words with no common start), you can stop the reduction, given that you know the solution already! Hence the introduction of
if any([len(x) == 0 for x in lst]):
    return ''
I don't think functools.reduce offers a way to stop the iteration early when a specific condition is met.
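If early termination matters, one option is a plain loop instead of reduce. The following is only a sketch with a hypothetical name, common_prefix_early_exit: it keeps shrinking the candidate prefix and stops as soon as it becomes empty.

```python
def common_prefix_early_exit(words):
    """Fold the words pairwise, but stop as soon as the prefix is empty."""
    if not words:
        return ""
    prefix = words[0]
    for word in words[1:]:
        # Shorten the candidate until the current word starts with it.
        while not word.startswith(prefix):
            prefix = prefix[:-1]
        if not prefix:
            break  # nothing in common, no need to look at the remaining words
    return prefix

print(common_prefix_early_exit(["flowers", "flow", "flight", "flask", "flock"]))  # fl
```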
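Out of curiosity: another solution could take advantage of regex:
import re
from functools import reduce
# A raw string avoids invalid-escape warnings; ^ anchors the match at the start
# so that only a common prefix (not a common substring) is captured.
pattern = re.compile(r"^(\w+)\w* \1\w*")
def find(x, y):
    v = pattern.findall(f'{x} {y}')
    return v[0] if len(v) else ''
reduce(find, lst)
OUTPUT
'fl'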

Sort of "divide and conquer" :
solve for 2 strings
solve for the other strings
def common_prefix2_(s1: str, s2: str)-> str:
    if not s1 or not s2: return ""
    for i, z in enumerate(zip(s1,s2)):
        if z[0] != z[1]:
            break
    else:
        i += 1
    return s1[:i]
from functools import reduce
def common_prefix(l:list):
    return reduce(common_prefix2_, l[1:], l[0]) if len(l) else ''
Tests
for l in [["flowers", "flow", "flight"],
          ["flowers", "flow", ""],
          ["flowers", "flow"],
          ["flowers", "xxx"],
          ["flowers" ],
          []]:
    print(f"{l if l else '[]'}: '{common_prefix(l)}'")
# output
['flowers', 'flow', 'flight']: 'fl'
['flowers', 'flow', '']: ''
['flowers', 'flow']: 'flow'
['flowers', 'xxx']: ''
['flowers']: 'flowers'
[]: ''
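For readers who want the literal divide-and-conquer shape the question asks about, here is a minimal recursive sketch: split the list in half, solve each half, then merge the two partial prefixes. The names lcp_pair and lcp_divide_and_conquer are made up for this example.

```python
def lcp_pair(a, b):
    """Common prefix of two strings."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return a[:n]

def lcp_divide_and_conquer(words):
    """Split the list in halves, solve each half, then combine the results."""
    if not words:
        return ""
    if len(words) == 1:
        return words[0]
    mid = len(words) // 2
    left = lcp_divide_and_conquer(words[:mid])
    right = lcp_divide_and_conquer(words[mid:])
    return lcp_pair(left, right)

print(lcp_divide_and_conquer(["flowers", "flow", "flight"]))  # fl
```

This does the same amount of character comparison as the iterative versions; the split only pays off if the two halves are actually processed in parallel.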

Define a function to check if each element of a list is a palindrome and return a list

I have to do the following:
Define a function called isSymmetricalVec that takes a list of elements, checks if each element in the list is a palindrome, then returns the results in a list. For example, given ["1441", "Apple", "radar", "232", "plane"] the function returns [TRUE, FALSE, TRUE, TRUE, FALSE].
I wrote the following code but I'm stuck at the point where I cannot return the result in a list.
def isSymmetricalVec(myList):
    for myString in myList:
        myList = []
        mid = (len(myString)-1)//2
        start = 0
        last = len(myString)-1
        flag = 0
        while(start<mid):
            if (myString[start]== myString[last]):
                start += 1
                last -= 1
            else:
                flag = 1
                break;
        if flag == 0:
            print(bool(1))
        else:
            print(bool(0))
# Enter a list of strings to check whether it is symmetrical or not
myList = ["12321", "12345", "madam", "modem"]
isSymmetricalVec(myList)
My function returns the following but the result is not in a list format:
True
False
True
False
How can I modify my code to return a result in a list format?

Your function should return a value instead of printing it.
I created an empty list new_lst and appended each result to it.
def isSymmetricalVec(myList):
    new_lst = []
    for myString in myList:
        start = 0
        last = len(myString) - 1
        flag = 0
        # Compare until the two indices meet. The original condition
        # (start < mid) skipped the middle pair of even-length strings,
        # so a string like "1231" was wrongly reported as symmetrical.
        while start < last:
            if myString[start] == myString[last]:
                start += 1
                last -= 1
            else:
                flag = 1
                break
        if flag == 0:
            new_lst.append(True)
        else:
            new_lst.append(False)
    return new_lst
# Enter a list of strings to check whether it is symmetrical or not

Your function is not actually returning anything. What you see is only printed, not returned.
You would want to keep each answer in a list like this.
def isSymmetricalVec(myList):
    list_of_answers = []
    for myString in myList:
        ...
        while(start<mid):
            ...
        if flag == 0:
            print(bool(1))
            list_of_answers.append(True)
        else:
            print(bool(0))
            list_of_answers.append(False)
    return list_of_answers
This way your answers are printed and returned.
Finally, you would need a variable to hold the returned list.
# Enter a list of strings to check whether it is symmetrical or not
myList = ["12321", "12345", "madam", "modem"]
list_of_answers = isSymmetricalVec(myList)

def isSymmetricalVec(myString):
    start = 0
    last = len(myString)-1
    flag = 0
    # Compare until the two indices meet; stopping at the midpoint index
    # (start < mid) would skip the middle pair of even-length strings.
    while start < last:
        if myString[start] == myString[last]:
            start += 1
            last -= 1
        else:
            flag = 1
            break
    if flag == 0:
        return True
    else:
        return False
test = ["1221", "madam", "hello world"]
final = [isSymmetricalVec(i) for i in test]
print(final)
I rewrote the code a bit and this would be my solution. The function now checks a single string and returns a boolean, and a list comprehension builds the list of results. This keeps the original logic while making the function reusable on its own.

Your function doesn't return anything; it just prints values.
There is a good way to shorten your algorithm with Python slices. This function also returns a list of boolean objects.
from math import ceil, floor
def isSymmetricalVec(myList):
    list_of_answers = []
    for string in myList:
        strLength = len(string)
        #There is a string palindrome check
        if string[:floor(strLength//2)] == string[ceil(strLength/2):][::-1]:
            toAppend = True
        else:
            toAppend = False
        list_of_answers.append(toAppend)
    return list_of_answers
It is worth adding that it's better to use True and False instead of bool(1) and bool(0).
An example:
>>> isSymmetricalVec(['11211', 'AbbA', '12f23'])
[True, True, False]
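The slice idea can be taken one step further: a string is a palindrome exactly when it equals its own reverse, so the whole check fits in a list comprehension. The snake_case name is_symmetrical_vec is just an illustrative variant of the function from the question.

```python
def is_symmetrical_vec(strings):
    """Return one boolean per string: True if it reads the same backwards."""
    return [s == s[::-1] for s in strings]

print(is_symmetrical_vec(["1441", "Apple", "radar", "232", "plane"]))
# [True, False, True, True, False]
```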

Python : checking if all letters in two words are exactly the same but not in same order (amphisbaena)

A word is an amphisbaena if the first half and the last half of the word contain exactly the same letters, but not necessarily in the same order. In case the word has an odd number of letters, the middle letter is ignored in this definition (or it belongs to both halves).
My code works in most cases except for example with: 'eisegesis' -> eise esis
My code doesn't check that each letter appears the same number of times in both halves. Here the letter 's' appears twice in the second half but only once in the first. How can I adjust my code?
def amphisbaena(word):
    """
    >>> amphisbaena('RESTAURATEURS')
    True
    >>> amphisbaena('eisegesis')
    False
    >>> amphisbaena('recherche')
    True
    """
    j = int(len(word) / 2)
    count = 0
    tel = 0
    firstpart, secondpart = word[:j], word[-j:]
    for i in firstpart.lower():
        if i in secondpart.lower():
            count +=1
    for i in secondpart.lower():
        if i in firstpart.lower():
            tel +=1
    if 2 * j == count + tel:
        return True
    else:
        return False

I would have done something like this:
def amphisbaena(word):
    j = int(len(word) / 2)
    firstpart, secondpart = word[:j], word[-j:]
    return sorted(firstpart) == sorted(secondpart)

You need to count letters in both halves separately and compare counts for each letter. Simplest is to use a collections.Counter:
def amphisbaena(word):
    from collections import Counter
    w = word.lower()
    half = len(w) // 2
    # Use the lowercased copy w (not word) so the comparison ignores case
    return half == 0 or Counter(w[:half]) == Counter(w[-half:])
While this is not quite as simple as just comparing the sorted halves, it is O(N) as opposed to O(N log N).
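To see why a plain membership test is not enough for 'eisegesis', it may help to compare the two halves directly. This small sketch uses two hypothetical helpers: one only checks which letters occur, the other also checks how often.

```python
from collections import Counter

def same_letters_by_set(a, b):
    """Only checks which letters occur, not how often."""
    return set(a) == set(b)

def same_letters_by_count(a, b):
    """Checks that every letter occurs equally often in both strings."""
    return Counter(a) == Counter(b)

first, second = "eise", "esis"  # the two halves of 'eisegesis'
print(same_letters_by_set(first, second))    # True: both contain e, i and s
print(same_letters_by_count(first, second))  # False: 'e' and 's' counts differ
```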

You can do this with a lambda function in one line:
string_1='recherche'
half=int(len(string_1)/2)
amphisbaena=lambda x: True if sorted(x[:half])==sorted(x[-half:]) else False
print(amphisbaena(string_1))
output:
True
With another string:
string_1='eisegesis'
half=int(len(string_1)/2)
amphisbaena=lambda x: True if sorted(x[:half])==sorted(x[-half:]) else False
print(amphisbaena(string_1))
output:
False

Check whether the last three flips were all heads or all tails in Python

So I have a challenge in which I have to create a programme that simulates a coin flip by generating a random number corresponding to either heads or tails. When three consecutive 'H' (heads) or 'T' (tails) are output, my programme should stop. I have tried to get this to work; here is my code so far:
import random
active=True
list1 = []
b = 0
while active:
    l=random.randint(0,1)
    b += 1
    i='a'
    if l == 0:
        i = 'H'
    else:
        i = 'T'
    list1.append(i)
    if list1[:-3] is ['H','H','H']:
        active = False
    elif list1[:-3] is ['T','T','T']:
        active = False
    else:
        active = True
    print(list1),
It seems that the only thing not working is the part that checks for 3 corresponding heads or tails, does anybody know how I may code this part correctly?

The problem, as was mentioned in the comments above, is that list1[:-3] should be list1[-3:] (getting the last three elements of the list, instead of everything up to the last three elements) and comparisons should be done with == instead of is. The adjusted program would be:
import random
active=True
list1 = []
b = 0
while active:
    l=random.randint(0,1)
    b += 1
    i='a'
    if l == 0:
        i = 'H'
    else:
        i = 'T'
    list1.append(i)
    if list1[-3:] == ['H','H','H']:
        active = False
    elif list1[-3:] == ['T','T','T']:
        active = False
    else:
        active = True
    print(list1)
However, I think it might also be useful to see a condensed approach at writing the same program:
import random
flips = []
while flips[-3:] not in (['H'] * 3, ['T'] * 3):
    flips.append(random.choice('HT'))
print(flips)
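The two fixes described in this answer (the slice direction and == versus is) can be checked in isolation. The following is a small illustrative snippet with made-up data, not part of the original program.

```python
flips = ['T', 'H', 'H', 'H']
expected = ['H', 'H', 'H']

head = flips[:-3]    # everything except the last three elements
window = flips[-3:]  # only the last three elements

print(head)                # ['T']
print(window)              # ['H', 'H', 'H']
print(window == expected)  # True: same contents
print(window is expected)  # False: two different list objects
```

The is operator compares object identity, so a freshly sliced list is never the same object as another list, even when the contents match.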

You can do this by tracking a running list of flips. If the new flip is the same as the previous, append it to the list. Else, if the new flip is not the same as the previous, clear the list and append the new flip. Once the length reaches 3, break from the while loop:
import random
flipping = True
flips = []
flip_status = {0: "H", 1: "T"}
current_flip = None
while flipping:
    current_flip = flip_status[random.randint(0,1)]
    print(current_flip)
    if len(flips) == 0:
        flips.append(current_flip)
    else:
        if current_flip == flips[-1]:
            flips.append(current_flip)
            if len(flips) == 3:
                break
        else:
            flips = []
            flips.append(current_flip)
print("Flips: " + str(flips))
Here's a sample run:
T
T
H
T
T
H
T
H
H
T
T
T
Flips: ['T', 'T', 'T']
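Because the output is random, a run like the one above is hard to reproduce. One way to make the logic testable is to wrap it in a function that accepts its own random generator. This is a sketch under that assumption; the name flip_until_streak and its parameters are hypothetical.

```python
import random

def flip_until_streak(streak=3, rng=None):
    """Flip until the last `streak` results are identical; return all flips."""
    rng = rng or random.Random()
    flips = []
    # Keep flipping while there are too few flips or the last ones differ.
    while len(flips) < streak or len(set(flips[-streak:])) != 1:
        flips.append(rng.choice("HT"))
    return flips

# Passing a seeded generator makes the sequence repeatable between runs.
print(flip_until_streak(3, random.Random(42)))
```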

Finding the length of longest repeating?

I have tried plenty of different methods to achieve this, and I don't know what I'm doing wrong.
reps=[]
len_charac=0
def longest_charac(strng):
    for i in range(len(strng)):
        if strng[i] == strng[i+1]:
            if strng[i] in reps:
                reps.append(strng[i])
                len_charac=len(reps)
    return len_charac

Remember in Python counting loops and indexing strings aren't usually needed. There is also a builtin max function:
def longest(s):
    maximum = count = 0
    current = ''
    for c in s:
        if c == current:
            count += 1
        else:
            count = 1
            current = c
        maximum = max(count,maximum)
    return maximum
Output:
>>> longest('')
0
>>> longest('aab')
2
>>> longest('a')
1
>>> longest('abb')
2
>>> longest('aabccdddeffh')
3
>>> longest('aaabcaaddddefgh')
4

Simple solution:
def longest_substring(strng):
    len_substring=0
    longest=0
    for i in range(len(strng)):
        if i > 0:
            if strng[i] != strng[i-1]:
                len_substring = 0
        len_substring += 1
        if len_substring > longest:
            longest = len_substring
    return longest
Iterates through the characters in the string and checks against the previous one. If they are different then the count of repeating characters is reset to zero, then the count is incremented. If the current count beats the current record (stored in longest) then it becomes the new longest.

Compare two things and there is one relation between them:
'a' == 'a'
    True
Compare three things, and there are two relations:
'a' == 'a' == 'b'
    True    False
Combine these ideas - repeatedly compare things with the things next to them, and the chain gets shorter each time:
'a' == 'a' == 'b'
    True == False
        False
It takes one reduction for the 'b' comparison to be False, because there was one 'b'; two reductions for the 'a' comparison to be False because there were two 'a'. Keep repeating until the relations are all False, and that is how many consecutive equal characters there were.
def f(s):
    repetitions = 0
    while any(s):
        repetitions += 1
        s = [ s[i] and s[i] == s[i+1] for i in range(len(s)-1) ]
    return repetitions
>>> f('aaabcaaddddefgh')
4
NB. Matching characters at the start become True; after that, only comparisons between Trues matter, and the loop stops when all the Trues are gone and the list is all Falses.
It can also be squished into a recursive version, passing the depth in as an optional parameter:
def f(s, depth=1):
    s = [ s[i] and s[i]==s[i+1] for i in range(len(s)-1) ]
    return f(s, depth+1) if any(s) else depth
>>> f('aaabcaaddddefgh')
4
I stumbled on this while trying for something else, but it's quite pleasing.

You can use itertools.groupby to solve this pretty quickly, it will group characters together, and then you can sort the resulting list by length and get the last entry in the list as follows:
from itertools import groupby
print(sorted([list(g) for k, g in groupby('aaabcaaddddefgh')],key=len)[-1])
This should give you:
['d', 'd', 'd', 'd']
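Since the question asks for the length rather than the run itself, the same groupby idea can be sketched without sorting: take the maximum group size directly. The function name longest_run_length is made up for this example.

```python
from itertools import groupby

def longest_run_length(text):
    """Length of the longest run of identical consecutive characters."""
    # groupby yields one group per run; default=0 covers the empty string.
    return max((sum(1 for _ in group) for _, group in groupby(text)), default=0)

print(longest_run_length('aaabcaaddddefgh'))  # 4
```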

This works:
def longestRun(s):
    if len(s) == 0: return 0
    runs = ''.join('*' if x == y else ' ' for x,y in zip(s,s[1:]))
    starStrings = runs.split()
    if len(starStrings) == 0: return 1
    return 1 + max(len(stars) for stars in starStrings)
Output:
>>> longestRun("aaabcaaddddefgh")
4

First off, Python is not my primary language, but I can still try to help.
1) It looks like you are exceeding the bounds of the string. On the last iteration, you compare the last character against the character beyond it; in Python, strng[i+1] then raises an IndexError.
2) You start off with an empty reps list and check whether every character is in it. That check fails every time, and your append is inside that if statement, so nothing is ever appended.

def longest_charac(string):
    longest = 0
    if string:
        flag = string[0]
        tmp_len = 0
        for item in string:
            if item == flag:
                tmp_len += 1
            else:
                flag = item
                tmp_len = 1
            if tmp_len > longest:
                longest = tmp_len
    return longest
This is my solution. Maybe it will help you.

Just for context, here is a recursive approach that avoids dealing with loops:
def max_rep(prev, text, reps, rep=1):
    """Recursively consume all characters in text and find longest repetition.
    Args
        prev: string of previous character
        text: string of remaining text
        reps: list of ints of all repetitions observed
        rep: int of current repetition observed
    """
    if text == '': return max(reps)
    if prev == text[0]:
        rep += 1
    else:
        rep = 1
    return max_rep(text[0], text[1:], reps + [rep], rep)
Tests:
>>> max_rep('', 'aaabcaaddddefgh', [])
4
>>> max_rep('', 'aaaaaabcaadddddefggghhhhhhh', [])
7
