return longest alphabetical substring

return longest alphabetical substring - python

The aim of the program is to print the longest substring within variable s that is in alphabetical order.
s ='abchae'
currentlen = 0
longestlen = 0
current = ''
longest = ''
alphabet = 'abcdefghijklmnopqrstuvwxyz'
for char in s:
for number in range(0,len(s)):
if s[number] == char:
n = number
nxtchar = 1
alphstring = s[n]
while alphstring in alphabet == True and n+nxtchar <= 5:
alphstring += s[n+nxtchar]
nxtchar += 1
currentlen = len(alphstring)
current = alphstring
if currentlen > longestlen:
longest = current
print longest
When run, the program doesn't print anything. I don't seem to see what's wrong with the code. Any help would be appreciated.

I'd use regex for this
import re
string = 'abchae'
alphstring = re.compile(r'a*b*c*d*e*f*g*h*i*j*k*l*m*n*o*p*q*r*s*t*u*v*w*x*y*z*', re.I)
longest = ''
for match in alphstring.finditer(string):
if len(match.group()) > len(longest):
longest = match.group()
print(longest)
Output:
abch
Note: The flag re.I in the regex expression causes the regex to ignore case. If this is not the desired behavior you can delete the flag and it will only match lowercase characters.

Like Kasramvd said, I do not understand the logic behind your code. You sure your code can run without raise IndentationError? As I concerned, the following part (the second row, have wrong indentation).
for number in range(0,len(s)):
if s[number] == char:
n = number
If you fixed that indentation error, you can run you code without error, and the last row (print longest) does work, it just does not work as you expect, it only prints a blank line.

I think I understood what you meant.
First you need to fix the indentation problem in your code, that would make it run:
for number in range(0,len(s)):
if s[number] == char:
n = number
Second, that condition will return two numbers 0 and 4 since a appears two times in s. I believe you only want the first so you should probably add a break statement after you find a match.
for number in range(0,len(s)):
if s[number] == char:
n = number
break
Finally, alphstring in alphabet == True will always return False. Because alphabet will never be True, you need parentheses to make this work or remove the == True.
ex: while (alphstring in alphabet) == True and n+nxtchar <= 5:
I believe that you were looking for the string abch which is what I managed to obtain with these changes

This is my solution:
result = ""
s = 'abchae'
alphabet = 'abcdefghijklmnopqrstuvwxyz'
max_length=0
for i in range(len(s)):
for j in range(len(s)):
if s[i:j] in alphabet and len(s[i:j])>max_length:
max_length = len(s[i:j])
result = s[i:j]
print result

Related

identifying the substring when the number of characters in between don't matter

def checkPattern(x, string):
e = len(string)
if len(x) < e:
return False
for i in range(e - 1):
x = string[i]
y = string[i + 1]
last = x.rindex(x)
first = x.index(y)
if last == -1 or first == -1 or last > first:
return False
return True
if __name__ == "__main__":
x = str(input())
string = "hello"
if checkPattern(x, string) is True:
print('YES')
if checkPattern(x, string) is False:
print('NO')
So basically the code is supposed to identify a substring when the number of characters between the substring's letters don't matter. string = "hello" is supposed to be the substring. While the characters in between don't matter the order still matters so If I type "h.e.l.l.o" for example it's a YES, but if it's something like "hlelo" it's a NO. I sorta copied the base of the code and I'm still a little new to python so sorry if the question and code aren't clear.

Assuming I understand, and the input hlelo is No and the input h.e..l.l.!o is Yes, then the following code should work:
def checkPattern(x, string):
assert x and string, "Error. Both inputs should be non-empty. "
count_idx = 0 # index which counts where you are.
for letter in x:
if letter == string[count_idx]:
count_idx += 1 # increment to check the next string
if count_idx == len(string):
return True
# pattern was found if counter found matches equal to string length
return False
if __name__ == "__main__":
inp = input()
string = "hello"
if checkPattern(inp, string) is True:
print('YES')
if checkPattern(inp, string) is False:
print('NO')
Explaination: Regardless of the input string, x, you want to loop through each character of the search-string hello, to check if you find each character in the correct order. What my solution does is that it counts how many of the characters h, e, l, l, o it has found, starting from 0. If it finds a match for h, it moves on to check for a match for e, and so on. Ultimately, if you search through the entire string x, and the counter does not equal to the length of the search string (i.e. you could not find all the hello characters), it returns false.
EDIT: Small debug in the way the return worked. Instead returns if ever the counter goes over the length. Also added more examples given in comments

Here is my solution to this problem:
pattern = "hello"
def patternCheck(word, pattern) -> bool:
plist = list(pattern)
wlist = list(word)
for p in plist:
if p in wlist:
for _ in range(wlist.index(p) , -1, -1):
wlist.pop(_)
else:
return False
return True
print(patternCheck("h.e.l.l.o", pattern))
print(patternCheck("aalohel", pattern))
print(patternCheck("hhhhheeelllooo", pattern))
Explanation
First we convert our strings to a list
plist = list(pattern)
wlist = list(word)
Now we check using a for loop if every element in our pattern list is in the word list.
for p in plist:
if p in wlist:
If yes then we remove all the elements from index 0 to the index of that element.
for _ in range(wlist.index(p) , -1, -1):
wlist.pop(_)
We are removing elements in decreasing order of there indices to protect ourself from the IndexError: pop index out of range.
If the for loop ends normally then there was a match and we return True. Else if the element was not found in the word list in the first place then we return false as there is no match.

Find alphabet neighbour letters in string (2 types: aABClg = ABC, aAbcd = aA)

EDIT: The problem, and the answer lies in using the enumerate function to access the index of iterables, but I'm still working on applying it properly.
I was asked to generate a random word with N length, and to print uppercase alphabet neighbours and lower - upper neighbours. I really don't know how to put this better.
The example is in the title, here is my code so far, and I think it works, I just need to fix the error made by the index search in the ascii_uppercase list variable.
Also, please excuse the messy do - while loop at the beginning.
import string
import random
letters = list(string.ascii_uppercase + string.ascii_lowercase)
enter = int(input('N: '))
def randomised(signs = string.ascii_uppercase + string.ascii_lowercase, X = enter):
return( ''.join(random.choice(signs) for _ in range(X)))
if enter == 1:
print('END')
while enter > 1:
enter = int(input('N: '))
word1 = randomised()
word2 = list(word1)
neighbour = ''
same = ''
for j in word2:
if word2[j] and word2[j+1] in string.ascii_uppercase and letters.index(j) == word2.index(j) and letters.index(j+1) == word2.index(j+1):
same += j
same += j+1
for i in word2:
if word2[i] == i.upper and word2[i+1] == (i+1).upper:
neighbour += i
neighbour += i+1
print('Created: {}, Neighbour uppercase letters: {}, Neighbour same letters: {}' .format(word1,same,neighbour))
expected behaviour:
N: 7
Created: aaBCDdD, Neighbour uppercase letters: BCD, Neighbour same letters: dD
N: 1
N: END

I am not so sure, but i think your problem might stem from the use of "i+1" and "j+1" without limiting the iterations stop before the last one.
The next thing is that you need to put this code in english for people to be able to provide better answers, i am not native english, but the core of my code is in english so it will be understandable worldwide, there are other general improvements that can be done, but those are learn while coding.
I hope your assignment goes great.
Another recommendation you can use "a".islower() to see if the character is lowercase, isupper() to se if it is uppercase, you don't need to import the whole alphabet. There are many builtin functions to deal with common scenarios and those tend to be more efficient than what most people would do without them.
Edit: the error is because you are doing string + number

Here is a simple working code (without formatting of the output). I used a simple pairwise iteration to compare each character with the previous one.
# generation of random word
N = 50
word = ''.join(random.choices(ascii_letters, k=N))
# forcing word for testing
word = 'aaBCDdD'
# test of conditions
cond1 = ''
cond2 = ''
flag1 = False # flag to add last character of stretch
for a,b in zip('_'+word, word+'_'):
if a.isupper() and b.isupper() and ord(a) == ord(b)-1:
cond1 += a
flag1 = True
elif flag1:
flag1 = False
cond1 += a
if a.islower() and b.isupper() and a.lower() == b.lower():
cond2 += a+b
print(cond1, cond2, sep='\n')
# BCD
# dD
NB. In case the conditions are met several times in the word, the identified patterns will just be concatenated
Example on random word of 500 characters:
IJRSOPHILMMNVW
kKdDyY

Remove string character after run of n characters in string

Suppose you have a given string and an integer, n. Every time a character appears in the string more than n times in a row, you want to remove some of the characters so that it only appears n times in a row. For example, for the case n = 2, we would want the string 'aaabccdddd' to become 'aabccdd'. I have written this crude function that compiles without errors but doesn't quite get me what I want:
def strcut(string, n):
for i in range(len(string)):
for j in range(n):
if i + j < len(string)-(n-1):
if string[i] == string[i+j]:
beg = string[:i]
ends = string[i+1:]
string = beg + ends
print(string)
These are the outputs for strcut('aaabccdddd', n):
n
output
expected
1
'abcdd'
'abcd'
2
'acdd'
'aabccdd'
3
'acddd'
'aaabccddd'
I am new to python but I am pretty sure that my error is in line 3, 4 or 5 of my function. Does anyone have any suggestions or know of any methods that would make this easier?

This may not answer why your code does not work, but here's an alternate solution using regex:
import re
def strcut(string, n):
return re.sub(fr"(.)\1{{{n-1},}}", r"\1"*n, string)
How it works: First, the pattern formatted is "(.)\1{n-1,}". If n=3 then the pattern becomes "(.)\1{2,}"
(.) is a capture group that matches any single character
\1 matches the first capture group
{2,} matches the previous token 2 or more times
The replacement string is the first capture group repeated n times
For example: str = "aaaab" and n = 3. The first "a" is the capture group (.). The next 3 "aaa" matches \1{2,} - in this example a{2,}. So the whole thing matches "a" + "aaa" = "aaaa". That is replaced with "aaa".
regex101 can explain it better than me.

you can implement a stack data structure.
Idea is you add new character in stack, check if it is same as previous one or not in stack and yes then increase counter and check if counter is in limit or not if yes then add it into stack else not. if new character is not same as previous one then add that character in stack and set counter to 1
# your code goes here
def func(string, n):
stack = []
counter = None
for i in string:
if not stack:
counter = 1
stack.append(i)
elif stack[-1]==i:
if counter+1<=n:
stack.append(i)
counter+=1
elif stack[-1]!=i:
stack.append(i)
counter = 1
return ''.join(stack)
print(func('aaabbcdaaacccdsdsccddssse', 2)=='aabbcdaaccdsdsccddsse')
print(func('aaabccdddd',1 )=='abcd')
print(func('aaabccdddd',2 )=='aabccdd')
print(func('aaabccdddd',3 )=='aaabccddd')
output
True
True
True
True

The method I would use is creating a new empty string at the start of the function and then everytime you exceed the number of characters in the input string you just not insert them in the output string, this is computationally efficient because it is O(n) :
def strcut(string,n) :
new_string = ""
first_c, s = string[0], 0
for c in string :
if c != first_c :
first_c, s= c, 0
s += 1
if s > n : continue
else : new_string += c
return new_string
print(strcut("aabcaaabbba",2)) # output : #aabcaabba

Simply, to anwer the question
appears in the string more than n times in a row
the following code is small and simple, and will work fine :-)
def strcut(string: str, n: int) -> str:
tmp = "*" * (n+1)
for char in string:
if tmp[len(tmp) - n:] != char * n:
tmp += char
print(tmp[n+1:])
strcut("aaabccdddd", 1)
strcut("aaabccdddd", 2)
strcut("aaabccdddd", 3)
Output:
abcd
aabccdd
aaabccddd
Notes:
The character "*" in the line tmp = "*"*n+string[0:1] can be any character that is not in the string, it's just a placeholder to handle the start case when there are no characters.
The print(tmp[n:]) line simply removes the "*" characters added in the beginning.

You don't need nested loops. Keep track of the current character and its count. include characters when the count is less or equal to n, reset the current character and count when it changes.
def strcut(s,n):
result = '' # resulting string
char,count = '',0 # initial character and count
for c in s: # only loop once on the characters
if c == char: count += 1 # increase count
else: char,count = c,1 # reset character/count
if count<=n: result += c # include character if count is ok
return result

Just to give some ideas, this is a different approach. I didn't like how n was iterating each time even if I was on i=3 and n=2, I still jump to i=4 even though I already checked that character while going through n. And since you are checking the next n characters in the string, you method doesn't fit with keeping the strings in order. Here is a rough method that I find easier to read.
def strcut(string, n):
for i in range(len(string)-1,0,-1): # I go backwards assuming you want to keep the front characters
if string.count(string[i]) > n:
string = remove(string,i)
print(string)
def remove(string, i):
if i > len(string):
return string[:i]
return string[:i] + string[i+1:]
strcut('aaabccdddd',2)

Super Reduced String python

I've been having problems with this simple hackerrank question. My code works in the compiler but hackerrank test is failing 6 test cases. One of which my output is correct for (I didn't pay premium). Is there something wrong here?
Prompt:
Steve has a string of lowercase characters in range ascii[‘a’..’z’]. He wants to reduce the string to its shortest length by doing a series of operations in which he selects a pair of adjacent lowercase letters that match, and then he deletes them. For instance, the string aab could be shortened to b in one operation.
Steve’s task is to delete as many characters as possible using this method and print the resulting string. If the final string is empty, print Empty String
Ex.
aaabccddd → abccddd → abddd → abd
baab → bb → Empty String
Here is my code:
def super_reduced_string(s):
count_dict = {}
for i in s:
if (i in count_dict.keys()):
count_dict[i] += 1
else:
count_dict[i] = 1
new_string = ''
for char in count_dict.keys():
if (count_dict[char] % 2 == 1):
new_string += char
if (new_string is ''):
return 'Empty String'
else:
return new_string
Here is an example of output for which it does not work.
print(super_reduced_string('abab'))
It outputs 'Empty String' but should output 'abab'.

By using a counter, your program loses track of the order in which it saw characters. By example with input 'abab', you program sees two a's and two b's and deletes them even though they are not adjacent. It then outputs 'Empty String' but should output 'abab'.
O(n) stack-based solution
This problem is equivalent to finding unmatched parentheses, but where an opening character is its own closing character.
What this means is that it can be solved in a single traversal using a stack.
Since Python can return an actual empty string, we are going to output that instead of 'Empty String' which could be ambiguous if given an input such as 'EEEmpty String'.
Code
def super_reduced_string(s):
stack = []
for c in s:
if stack and c == stack[-1]:
stack.pop()
else:
stack.append(c)
return ''.join(stack)
Tests
print(super_reduced_string('aaabab')) # 'abab'
print(super_reduced_string('aabab')) # 'bab'
print(super_reduced_string('abab')) # 'abab'
print(super_reduced_string('aaabccddd ')) # 'abd'
print(super_reduced_string('baab ')) # ''

I solved it with recursion:
def superReducedString(s):
if not s:
return "Empty String"
for i in range(0,len(s)):
if i < len(s)-1:
if s[i] == s[i+1]:
return superReducedString(s[:i]+s[i+2:])
return s
This code loops over the string and checks if the current and next letter/position in the string are the same. If so, these two letters/positions I get sliced from the string and the newly created reduced string gets passed to the function.
This occurs until there are no pairs in the string.
TESTS:
print(super_reduced_string('aaabccddd')) # 'abd'
print(super_reduced_string('aa')) # 'Empty String'
print(super_reduced_string('baab')) # 'Empty String'

I solved it by creating a list and then add only unique letters and remove the last letter that found on the main string. Finally all the tests passed!
def superReducedString(self, s):
stack = []
for i in range(len(s)):
if len(stack) == 0 or s[i] != stack[-1]:
stack.append(s[i])
else:
stack.pop()
return 'Empty String' if len(stack) == 0 else ''.join(stack)

I used a while loop to keep cutting down the string until there's no change:
def superReducedString(s):
repeat = set()
dups = set()
for char in s:
if char in repeat:
dups.add(char + char)
else:
repeat.add(char)
s_old = ''
while s_old != s:
s_old = s
for char in dups:
if char in s:
s = s.replace(char, '')
if len(s) == 0:
return 'Empty String'
else:
return s

alphabetical order in Python

Okay I have questions regarding the following code:
s = "wxyabcd"
myString = s[0]
longest = s[0]
for i in range(1, len(s)):
if s[i] >= myString[-1]:
myString += s[i]
if len(myString) > len(longest):
longest = myString
else:
myString = s[i]
print longest
Answer: "abcd"
w
wx
wxy
a
ab
abc
abcd
I am new to Python and I am trying to learn how some of these loops work but I am very confused. This found what the longest string in alphabetical order was... The actual answer was "abcd" but I know that the process it went through was one by one.
Question: Can someone please guide me through the code so I can understand it better? Since there are 7 characters I am assuming it starts by saying: "For each item in range 1-7 if the item is 'more' than myString [-1] which is 'w' then I add the letter plus the item in i which in this case it would be 'x'.
I get lost right after this... So from a - z : a > z? Is that how it is? And how then when s[i] != myString[-1] did it skip to start from 'a' in s[i].
Sorry I am all over the place. Anyways i've tried to search places online to help me learn this but some things are just hard. I know that in a few months ill get the hang of it and hopefully be more fluent.
Thank you!

Here's a bit of an explanation of the control flow and what's going on with Python's indexing, hope it helps:
s = "wxyabcd"
myString = s[0] # 'w'
longest = s[0] # 'w' again, for collecting other chars
for i in range(1, len(s)): # from 1 to 7, exclusive of 7, so 2nd index to last
if s[i] >= myString[-1]: # compare the chars, e.g. z > a, so x and y => True
myString += s[i] # concatenate on to 'w'
if len(myString) > len(longest): # evident?
longest = myString # reassign longest to myString
else:
myString = s[i] # reassign myString to where you are in s.
print longest

# s is a 7 character string
s = "wxyabcd"
# set `mystring` to be the first character of s, 'w'
myString = s[0]
# set `longest` to be the first character of s, 'w'
longest = s[0]
# loop from 1 up to and not including length of s (7)
# Which equals for i in (1,2,3,4,5,6):
for i in range(1, len(s)):
# Compare the character at i with the last character of `mystring`
if s[i] >= myString[-1]:
# If it is greater (in alphabetical sense)
# append the character at i to `mystring`
myString += s[i]
# If this makes `mystring` longer than the previous `longest`,
# set `mystring` to be the new `longest`
if len(myString) > len(longest):
longest = myString
# Otherwise set `mystring` to be a single character string
# and start checking from index i
else:
myString = s[i]
# `longest` will be the longest `mystring` that was formed,
# using only characters in descending alphabetic order
print longest

Two approaches I can think of (quickly)
def approach_one(text): # I approve of this method!
all_substrings = list()
this_substring = ""
for letter in text:
if len(this_substring) == 0 or letter > this_substring[-1]:
this_substring+=letter
else:
all_substrings.append(this_substring)
this_substring = letter
all_substrings.append(this_substring)
return max(all_substrings,key=len)
def approach_two(text):
#forthcoming

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

return longest alphabetical substring - python

This is my solution: result = "" s = 'abchae' alphabet = 'abcdefghijklmnopqrstuvwxyz' max_length=0 for i in range(len(s)): for j in range(len(s)): if s[i:j] in alphabet and len(s[i:j])>max_length: max_length = len(s[i:j]) result = s[i:j] print result

Related

identifying the substring when the number of characters in between don't matter

Find alphabet neighbour letters in string (2 types: aABClg = ABC, aAbcd = aA)

Remove string character after run of n characters in string

Super Reduced String python

alphabetical order in Python

Categories

Resources