How to compare Char Strings

How to compare Char Strings - python

So I'm trying to find the least/smallest char in a string. The program is supposed to compare the characters with each other and find the smallest one. It should look like this when called:
least("rcDefxB")
The least char is B
This is the code that I have so far:
def least(inputS):
    for i in range(len(inputS)-1):
        current = inputS[i]
        nextt = inputS[i+1]
        if current > nextt:
            current = nextt
            nextt = inputS[i+2]
            print('The least char is',current)
But the output that I get is this:
least("rcDefxB")
C
D
IndexError: string index out of range
in line least nextt = inputS[i+2]
I probably incremented the wrong way or compared my characters the wrong way. I feel like I have the right setup; let me know where I messed up in my code.

You could just use:
min("rcDefxB")
If you really want to write it on your own, you could use:
def leastChar(inputString):
    min_char = inputString[0]
    for char in inputString:
        if char < min_char:
            min_char = char
    print('The least char is %s' % min_char)
Both methods require a non-empty string.
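To illustrate the non-empty requirement, here is a minimal sketch of a wrapper that tolerates an empty string. The names least_char and fallback are hypothetical and not part of the answer above; the sketch assumes Python 3.4 or later, where min() accepts a default keyword.

```python
def least_char(text, fallback=None):
    """Return the smallest character of text, or fallback if text is empty."""
    # min() raises ValueError on an empty iterable unless default= is given.
    return min(text, default=fallback)


print(least_char("rcDefxB"))  # B
print(least_char(""))         # None
```

Returning a fallback rather than raising is only one possible design; raising an error on empty input may be preferable when an empty string indicates a bug upstream.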

Eric Duminil's solution is better, but if you want your own code to work properly, you should modify it as follows:
inputString = "rcDefxB"
currentChar = inputString[0]  # smallest character seen so far
index = 0
while index < len(inputString) - 1:
    nextChar = inputString[index + 1]
    if currentChar > nextChar:
        currentChar = nextChar
    index += 1
print('The least char is', currentChar)

If by smallest you mean the ASCII code position, just use:
>>> s = "aBCdefg"
>>> min(s)
'B'
But if you mean the alphabetical position, ignoring upper or lower case:
>>> min(s, key=lambda x: x.upper())
'a'
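A short sketch of why the two calls disagree: plain min() compares code points, and every uppercase ASCII letter has a smaller code point than every lowercase one. Using str.casefold as the key is a substitution made here for illustration; for this input it behaves the same as the lambda above.

```python
s = "aBCdefg"

# Code points: every uppercase ASCII letter sorts before every lowercase one.
print(ord("B"), ord("a"))  # 66 97

# Plain min() compares code points, so the uppercase 'B' wins.
print(min(s))  # B

# Comparing case-folded keys ranks the letters alphabetically instead.
print(min(s, key=str.casefold))  # a
```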

Please consider the following approach:
def least(inputString):
    leastChar = min(list(inputString))
    print('The leastchar is', leastChar)
After running least("rcDefxB"), you'll have:
The leastchar is B

Related

Looking for patterns like 'mop' 'map', 'mXp' in the string, starting with lower case 'm' and ending with lower case 'p'

Hello, I'm having trouble with this question. I've attempted it, but it seems like I'm doing something wrong.
Here's the question
'Look for patterns like 'mop' 'map', 'mXp' in the string, starting with lower case 'm' and ending with lower case 'p'. Return a string where for all such words, the middle letter is gone, so 'mopXmap' yields 'mpXmp'.'
Here's what I have so far:
def mop_map(string):
    if len(string) <= 2:
        return string
    i = 0
    case = ""
    for index in range(0, len(string)):
        b = string[i]
        if i < len(string)-2:
            e = string[i+2]
            if b == "m" and e == "p":
                case = case + (b + e)
                i = i + 2
            else:
                case = case + b
                i = i + 1
        else:
            case = case + b
            i = i + 1
    return case
assert(mop_map('') == '')
assert(mop_map('abc') == 'abc')
assert(mop_map('mp') == 'mp')
assert(mop_map('mop') == 'mp')
assert(mop_map('map') == 'mp')
assert(mop_map('mipXmap') == 'mpXmp')
assert(mop_map('m pm p') == 'mpmp')
assert(mop_map('mmm1pm2p') == 'mmmpmp')
It passes until it hits "mop". I feel like I made it way more complicated than it really needs to be. The answer should be simple, right?

You've made heroic efforts to solve this problem. You're right to suspect there's an easier approach though. You can simply use regular expressions with the re module.
import re
pattern = re.compile(r'm.p')  # 'm', any single character, then 'p' (your tests include spaces and digits)
results = re.findall(pattern, your_string)
results will contain a list of all non-overlapping substrings that match your pattern.
Now you can print your words with the middle character removed, if you wish.
for result in results:
    final = result[0] + result[2]  # keep the 'm' and the 'p', drop the middle character
    print(final)
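Since the exercise asks for a returned string rather than printed matches, a substitution may fit more directly. The following is a sketch, not the answerer's code: it assumes the middle character can be anything (the asker's tests include a space and digits) and uses re.sub to replace each match in one pass.

```python
import re


def mop_map(text):
    """Drop the middle character of every 'm?p' triple, scanning left to right."""
    # '.' matches any single character; re.DOTALL lets it match a newline too.
    return re.sub(r"m.p", "mp", text, flags=re.DOTALL)


for sample in ["", "abc", "mp", "mop", "mipXmap", "m pm p", "mmm1pm2p"]:
    print(repr(sample), "->", repr(mop_map(sample)))
```

Because re.sub scans for non-overlapping matches from left to right, 'mmm1pm2p' becomes 'mmmpmp', which agrees with the last assertion in the question.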

return longest alphabetical substring

The aim of the program is to print the longest substring within variable s that is in alphabetical order.
s ='abchae'
currentlen = 0
longestlen = 0
current = ''
longest = ''
alphabet = 'abcdefghijklmnopqrstuvwxyz'
for char in s:
    for number in range(0,len(s)):
        if s[number] == char:
            n = number
    nxtchar = 1
    alphstring = s[n]
    while alphstring in alphabet == True and n+nxtchar <= 5:
        alphstring += s[n+nxtchar]
        nxtchar += 1
        currentlen = len(alphstring)
        current = alphstring
    if currentlen > longestlen:
        longest = current
print longest
When run, the program doesn't print anything. I don't seem to see what's wrong with the code. Any help would be appreciated.

I'd use regex for this
import re
string = 'abchae'
alphstring = re.compile(r'a*b*c*d*e*f*g*h*i*j*k*l*m*n*o*p*q*r*s*t*u*v*w*x*y*z*', re.I)
longest = ''
for match in alphstring.finditer(string):
    if len(match.group()) > len(longest):
        longest = match.group()
print(longest)
Output:
abch
Note: The re.I flag makes the regular expression ignore case. If this is not the desired behavior, you can delete the flag and it will only match lowercase characters.

Like Kasramvd said, I do not understand the logic behind your code. Are you sure your code can run without raising an IndentationError? As far as I can see, the following part has wrong indentation (the second row):
for number in range(0,len(s)):
if s[number] == char:
n = number
If you fix that indentation error, you can run your code without error, and the last line (print longest) does work; it just does not do what you expect: it only prints a blank line.

I think I understood what you meant.
First you need to fix the indentation problem in your code, that would make it run:
for number in range(0,len(s)):
    if s[number] == char:
        n = number
Second, that condition will match two positions, 0 and 4, since a appears twice in s. I believe you only want the first, so you should probably add a break statement after you find a match.
for number in range(0,len(s)):
    if s[number] == char:
        n = number
        break
Finally, alphstring in alphabet == True will always evaluate to False. Python chains comparison operators, so the expression means (alphstring in alphabet) and (alphabet == True), and alphabet == True is never true. You need parentheses to make this work, or you can simply remove the == True.
ex: while (alphstring in alphabet) == True and n+nxtchar <= 5:
I believe that you were looking for the string abch, which is what I managed to obtain with these changes.
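The operator-chaining point can be checked directly. This small sketch reuses the variable names from the question; the names chained, grouped and plain are introduced here only for illustration.

```python
alphabet = "abcdefghijklmnopqrstuvwxyz"
alphstring = "abc"

# Chained form: evaluated as (alphstring in alphabet) and (alphabet == True).
chained = alphstring in alphabet == True

# Parenthesised form: compares the membership result itself with True.
grouped = (alphstring in alphabet) == True

# Idiomatic form: the membership test is already a boolean.
plain = alphstring in alphabet

print(chained, grouped, plain)  # False True True
```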

This is my solution:
result = ""
s = 'abchae'
max_length = 0
for i in range(len(s)):
    for j in range(i + 1, len(s) + 1):
        sub = s[i:j]
        # a substring is in alphabetical order if sorting leaves it unchanged
        if list(sub) == sorted(sub) and len(sub) > max_length:
            max_length = len(sub)
            result = sub
print(result)
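For comparison, the same result can probably be had in a single pass, without testing every substring. This is a sketch under the assumption that "alphabetical order" means each character is greater than or equal to the one before it; the function name is hypothetical.

```python
def longest_alphabetical_substring(s):
    """Return the first longest run of s whose characters never decrease."""
    if not s:
        return ""
    best = current = s[0]
    for prev, char in zip(s, s[1:]):
        # Extend the run while order holds; otherwise start a new run at char.
        current = current + char if char >= prev else char
        if len(current) > len(best):
            best = current
    return best


print(longest_alphabetical_substring("abchae"))           # abch
print(longest_alphabetical_substring("azcbobobegghakl"))  # beggh
```

When two runs tie for the longest length, this sketch keeps the earlier one, because the comparison uses a strict greater-than.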

Why is the following Python code wrong?

I have the following problem for my assignment:
Write a program that prints the longest substring of s in which the letters occur in alphabetical order. For example, if s = 'azcbobobegghakl', then your program should print:
Longest substring in alphabetical order is: beggh
The code I wrote for the problem is this:
s = 'azcbobobegghakl'
current_index = 1
first_index = 0
result_string = ''
current_string = s[first_index]
while current_index < len(s):
    if ord(s[first_index]) <= ord(s[current_index]):
        current_string += s[current_index]
    elif ord(s[current_index]) < ord(s[first_index]):
        current_string = ''
    if len(current_string) > len(result_string):
        result_string = current_string[:]
    current_index += 1
    first_index += 1
print('Longest substring in alphabetical order is: ' + result_string)
The code doesn't give the correct result; for some reason, it gives eggh instead of beggh.
And since this is an assignment, I'm not asking for the corrected code, just a hint about where I went wrong, since I want to solve the problem by myself and don't want to cheat.
Thanks.

The error is here:
current_string = ''
You should not reset it to an empty string when you find s[current_index] < s[first_index]; the character at current_index can itself start the next candidate substring.
Other hints:
no need to use ord.
what happens if s='a'?
no need to copy with result_string = current_string[:], since strings are immutable
Hints OVER ;P

Python: Compare two strings and return the longest segment that they have in common

As a novice in Python, I have written a working function that will compare two strings and search for the longest substring shared by both strings. For instance, when the function compares "goggle" and "google", it will identify "go" and "gle" as the two common substrings (excluding single letters), but will only return "gle" since it's the longest one.
I would like to know if any part of my code can be improved or rewritten, as it may be considered lengthy and convoluted. I'd also be very glad to see other approaches to the solution. Thanks in advance!
def longsub(string1, string2):
    sublist = []
    i=j=a=b=count=length=0
    while i < len(string1):
        while j < len(string2):
            if string1[i:a+1] == string2[j:b+1] and (a+1) <= len(string1) and (b+1) <= len(string2):
                a+=1
                b+=1
                count+=1
            else:
                if count > 0:
                    sublist.append(string1[i:a])
                count = 0
                j+=1
                b=j
                a=i
        j=b=0
        i+=1
        a=i
    while len(sublist) > 1:
        for each in sublist:
            if len(each) >= length:
                length = len(each)
            else:
                sublist.remove(each)
    return sublist[0]
Edit: Comparing "goggle" and "google" may have been a bad example, since they are equal length with longest common segments in the same positions. The actual inputs would be closer to this: "xabcdkejp" and "zkdieaboabcd". Correct output should be "abcd".

There actually happens to be a function for this in the standard library: difflib.SequenceMatcher.find_longest_match

EDIT: This algorithm only works when the words have the longest segment in the same indices
You can get away with only one loop. Use helper variables. Something like this (needs refactoring) http://codepad.org/qErRBPav:
word1 = "google"
word2 = "goggle"
longestSegment = ""
tempSegment = ""
for i in range(len(word1)):
    if word1[i] == word2[i]:
        tempSegment += word1[i]
    else:
        tempSegment = ""
    if len(tempSegment) > len(longestSegment):
        longestSegment = tempSegment
print(longestSegment)  # "gle"
EDIT: mgilson's proposal of using find_longest_match (works for varying positions of the segments):
from difflib import SequenceMatcher
word1 = "google"
word2 = "goggle"
s = SequenceMatcher(None, word1, word2)
match = s.find_longest_match(0, len(word1), 0, len(word2))
print(word1[match.a:match.a + match.size])  # "gle"
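For readers who want to see how such a search can work without difflib, here is a sketch of the textbook dynamic-programming approach to the longest common substring. It is not what difflib does internally and not the asker's method; the function name is hypothetical. It is tried on the inputs from the question's edit.

```python
def longest_common_substring(a, b):
    """Return the first longest substring that occurs in both a and b."""
    best_len = 0
    best_end = 0  # index in a just past the end of the best match
    # prev[j] = length of the common suffix of a[:i-1] and b[:j]
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the match that ended at the previous characters.
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len = curr[j]
                    best_end = i
        prev = curr
    return a[best_end - best_len:best_end]


print(longest_common_substring("xabcdkejp", "zkdieaboabcd"))  # abcd
print(longest_common_substring("google", "goggle"))           # gle
```

The running time is proportional to len(a) * len(b), and only two rows of the table are kept in memory at a time.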

Letter Count on a string

Python newb here. I'm trying to count the number of letter "a"s in a given string. Code is below. It keeps returning 1 instead of 3 for the string "banana". Any input appreciated.
def count_letters(word, char):
    count = 0
    while count <= len(word):
        for char in word:
            if char == word[count]:
                count += 1
            return count
print count_letters('banana','a')

The other answers show what's wrong with your code. But there's also a built-in way to do this, if you weren't just doing this for an exercise:
>>> 'banana'.count('a')
3
Danben gave this corrected version:
def count_letters(word, char):
    count = 0
    for c in word:
        if char == c:
            count += 1
    return count
Here are some other ways to do it; hopefully they will teach you more about Python!
Similar, but with a shorter for loop. It exploits the fact that booleans count as 1 if true and 0 if false:
def count_letters(word, char):
    count = 0
    for c in word:
        count += (char == c)
    return count
Short for loops can generally be turned into list/generator comprehensions. This generates a value for each letter, False (counted as 0) if the letter doesn't match char and True (counted as 1) if it does, and then sums them:
def count_letters(word, char):
    return sum(char == c for c in word)
The next one filters out all the characters that don't match char, and counts how many are left:
def count_letters(word, char):
    return len([c for c in word if c == char])
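A quick sketch of the boolean arithmetic these variants rely on, checked against the built-in str.count. The variable names are chosen here for illustration.

```python
word, char = "banana", "a"

# bool is a subclass of int, so True adds 1 and False adds 0.
print(True + True)                           # 2
print(sum(char == c for c in word))          # 3
print(len([c for c in word if c == char]))   # 3
print(word.count(char))                      # 3
```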

One problem is that you are using count to refer both to the position in the word that you are checking, and the number of char you have seen, and you are using char to refer both to the input character you are checking, and the current character in the string. Use separate variables instead.
Also, move the return statement outside the loop; otherwise you will always return after checking the first character.
Finally, you only need one loop to iterate over the string. Get rid of the outer while loop and you will not need to track the position in the string.
Taking these suggestions, your code would look like this:
def count_letters(word, char):
    count = 0
    for c in word:
        if char == c:
            count += 1
    return count

A simple way is as follows:
def count_letters(word, char):
    return word.count(char)
Or there's another way to count each element directly:
from collections import Counter
Counter('banana')
Of course, you can specify one element, e.g.
Counter('banana')['a']
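A brief sketch of what Counter gives back, for readers who have not used it. The variable name counts is introduced here for illustration.

```python
from collections import Counter

counts = Counter("banana")      # one pass over the string

print(counts)                   # Counter({'a': 3, 'n': 2, 'b': 1})
print(counts["a"])              # 3
print(counts["z"])              # 0, missing keys count as zero
print(counts.most_common(1))    # [('a', 3)]
```

Counter is most useful when several different letters need counting, since calling word.count once per letter rescans the string each time.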

Your return is in your for loop! Be careful with indentation, you want the line return count to be outside the loop. Because the for loop goes through all characters in word, the outer while loop is completely unneeded.
A cleaned-up version:
def count_letters(word, to_find):
    count = 0
    for char in word:
        if char == to_find:
            count += 1
    return count

You have a number of problems:
There's a problem with your indentation as others already pointed out.
There's no need to have nested loops. Just one loop is enough.
You're using char to mean two different things, but the char variable in the for loop will overwrite the data from the parameter.
This code fixes all these errors:
def count_letters(word, char):
    count = 0
    for c in word:
        if char == c:
            count += 1
    return count
A much more concise way to write this is to use a generator expression:
def count_letters(word, char):
    return sum(char == c for c in word)
Or just use the built-in method count that does this for you:
print('abcbac'.count('c'))

I see a few things wrong.
You reuse the identifier char, so that will cause issues.
You're saying if char == word[count] instead of word[some index]
You return after the first iteration of the for loop!
You don't even need the while. If you rename the char param to search,
for char in word:
    if char == search:
        count += 1
return count

Alternatively, you can use:
mystring = 'banana'
number = mystring.count('a')

count_letters=""
number=count_letters.count("")
print(number)

"banana".count("ana") returns 1 instead of 2 !
str.count only counts non-overlapping occurrences: after a match, the search resumes past the end of that match, so the second "ana", which overlaps the first, is not seen.
So if you want a "full count" that includes overlaps, you have to implement your own counter that advances one position at a time.
Correct me if I'm wrong...
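One way such a counter might look, as a sketch: it uses str.find and restarts the search one character after each match start. The function name count_overlapping is hypothetical.

```python
def count_overlapping(text, sub):
    """Count occurrences of sub in text, including overlapping ones."""
    if not sub:
        raise ValueError("sub must be non-empty")
    count = 0
    start = text.find(sub)
    while start != -1:
        count += 1
        # Resume one character after the match start, not after its end.
        start = text.find(sub, start + 1)
    return count


print("banana".count("ana"))               # 1
print(count_overlapping("banana", "ana"))  # 2
```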

def count_letter(word, char):
    count = 0
    for c in word:
        if c == char:
            count += 1
    return count  # the return belongs outside the for loop
r = count_letter("banana", "a")
print(r)
3

x=str(input("insert string"))
c=0
for i in x:
    if 'a' in i:
        c=c+1
print(c)

The following program takes a string as input and outputs a pandas DataFrame that holds the letter counts.
Sample Input
hello
Sample Output
  char  freq.
0    h      1
1    e      1
2    l      2
3    o      1
import pandas as pd
def count_letters(word, char):
    return word.count(char)
text = input()
text_split = text.split()
list1 = []
list2 = []
for i in text_split:
    for j in i:
        counter = count_letters(text, j)
        list1.append(j)
        list2.append(counter)
dictn = dict(zip(list1, list2))
df = pd.DataFrame(list(dictn.items()), columns=['char', 'freq.'])
print(df)
