Finding substring in python [duplicate]

Finding substring in python [duplicate] - python

This question already has answers here:
Python: Find a substring in a string and returning the index of the substring
(7 answers)
Closed 2 years ago.
I'm new to python and I'm trying different methods to accomplish the same task, right now I'm trying to figure out how to get a substring out of a string using a for loop and a while loop. I quickly found that this is a really easy task to accomplish using regex. For example if I have a string: "ABCDEFGHIJKLMNOP" and I want to find if "CDE" exists then print out "CDE" + the rest of the string how would I do that using loops? Right now I'm using:
for i, c in enumerate(myString):
which returns each index and character, which I feel is a start but I can't figure out what to do after. I also know there are a lot of build in functions to find substrings by doing: myString.(Function) but I would still like to know if it's possible doing this with loops.

Given:
s = 'ABCDEFGHIJKLMNOP'
targets = 'CDE','XYZ','JKL'
With loops:
for t in targets:
for i in range(len(s) - len(t) + 1):
for j in range(len(t)):
if s[i + j] != t[j]:
break
else:
print(s[i:])
break
else:
print(t,'does not exist')
Pythonic way:
for t in targets:
i = s.find(t)
if i != -1:
print(s[i:])
else:
print(t,'does not exist')
Output (in both cases):
CDEFGHIJKLMNOP
XYZ does not exist
JKLMNOP

Here's a concise way to do so:
s = "ABCDEFGHIJKLMNOP"
if "CDE" in s:
print s[s.find("CDE")+len("CDE"):]
else:
print s
Prints:
FGHIJKLMNOP
The caveat here is of course, if the sub-string is not found, the original string will be returned.
Why do this? Doing so allows you to check whether or not the original string was found or not. As such, this can be conceptualized into a simple function (warning: no type checks enforced for brevity - it is left up to the reader to implement them as necessary):
def remainder(string, substring):
if substring in string:
return string[string.find(substring)+len(substring):]
else:
return string

Getting the remainder of the string using a for-loop:
n = len(substr)
rest = next((s[i+n:] for i in range(len(s) - n + 1) if s[i:i+n] == substr),
None) # return None if substr not in s
It is equivalent to:
_, sep, rest = s.partition(substr)
if substr and not sep: # substr not in s
rest = None

Related

python Question: Could any one explain how the above class reversed the string return val? sorry coming back to python after many years [duplicate]

I want to use recursion to reverse a string in python so it displays the characters backwards (i.e "Hello" will become "olleh"/"o l l e h".
I wrote one that does it iteratively:
def Reverse( s ):
result = ""
n = 0
start = 0
while ( s[n:] != "" ):
while ( s[n:] != "" and s[n] != ' ' ):
n = n + 1
result = s[ start: n ] + " " + result
start = n
return result
But how exactly do I do this recursively? I am confused on this part, especially because I don't work with python and recursion much.
Any help would be appreciated.

def rreverse(s):
if s == "":
return s
else:
return rreverse(s[1:]) + s[0]
(Very few people do heavy recursive processing in Python, the language wasn't designed for it.)

To solve a problem recursively, find a trivial case that is easy to solve, and figure out how to get to that trivial case by breaking the problem down into simpler and simpler versions of itself.
What is the first thing you do in reversing a string? Literally the first thing? You get the last character of the string, right?
So the reverse of a string is the last character, followed by the reverse of everything but the last character, which is where the recursion comes in. The last character of a string can be written as x[-1] while everything but the last character is x[:-1].
Now, how do you "bottom out"? That is, what is the trivial case you can solve without recursion? One answer is the one-character string, which is the same forward and reversed. So if you get a one-character string, you are done.
But the empty string is even more trivial, and someone might actually pass that in to your function, so we should probably use that instead. A one-character string can, after all, also be broken down into the last character and everything but the last character; it's just that everything but the last character is the empty string. So if we handle the empty string by just returning it, we're set.
Put it all together and you get:
def backward(text):
if text == "":
return text
else:
return text[-1] + backward(text[:-1])
Or in one line:
backward = lambda t: t[-1] + backward(t[:-1]) if t else t
As others have pointed out, this is not the way you would usually do this in Python. An iterative solution is going to be faster, and using slicing to do it is going to be faster still.
Additionally, Python imposes a limit on stack size, and there's no tail call optimization, so a recursive solution would be limited to reversing strings of only about a thousand characters. You can increase Python's stack size, but there would still be a fixed limit, while other solutions can always handle a string of any length.

I just want to add some explanations based on Fred Foo's answer.
Let's say we have a string called 'abc', and we want to return its reverse which should be 'cba'.
def reverse(s):
if s == "":
return s
else:
return reverse(s[1:]) + s[0]
s = "abc"
print (reverse(s))
How this code works is that:
when we call the function
reverse('abc') #s = abc
=reverse('bc') + 'a' #s[1:] = bc s[0] = a
=reverse('c') + 'b' + 'a' #s[1:] = c s[0] = a
=reverse('') + 'c' + 'b' + 'a'
='cba'

If this isn't just a homework question and you're actually trying to reverse a string for some greater goal, just do s[::-1].

def reverse_string(s):
if s: return s[-1] + reverse_string(s[0:-1])
else: return s
or
def reverse_string(s):
return s[-1] + reverse_string(s[0:-1]) if s else s

I know it's too late to answer original question and there are multiple better ways which are answered here already. My answer is for documentation purpose in case someone is trying to implement tail recursion for string reversal.
def tail_rev(in_string,rev_string):
if in_string=='':
return rev_string
else:
rev_string+=in_string[-1]
return tail_rev(in_string[:-1],rev_string)
in_string=input("Enter String: ")
rev_string=tail_rev(in_string,'')
print(f"Reverse of {in_string} is {rev_string}")

s = input("Enter your string: ")
def rev(s):
if len(s) == 1:
print(s[0])
exit()
else:
#print the last char in string
#end="" prints all chars in string on same line
print(s[-1], end="")
"""Next line replaces whole string with same
string, but with 1 char less"""
return rev(s.replace(s, s[:-1]))
rev(s)

if you do not want to return response than you can use this solution. This question is part of LeetCode.
class Solution:
i = 0
def reverseString(self, s: List[str]) -> None:
"""
Do not return anything, modify s in-place instead.
"""
if self.i >= (len(s)//2):
return
s[self.i], s[len(s)-self.i-1] = s[len(s)-self.i-1], s[self.i]
self.i += 1
self.reverseString(s)

Python 2.7 finding if some anagram of one string is a substring of another [duplicate]

This question already has answers here:
Anagram of String 2 is Substring of String 1
(5 answers)
Closed 5 years ago.
EDIT: Posting my final solution because this was a very helpful thread and I want to add some finality to it. Using the advice from both answers below I was able to craft a solution. I added a helper function in which I defined an anagram. Here is my final solution:
def anagram(s1, s2):
s1 = list(s1)
s2 = list(s2)
s1.sort()
s2.sort()
return s1 == s2
def Question1(t, s):
t_len = len(t)
s_len = len(s)
t_sort = sorted(t)
for start in range(s_len - t_len + 1):
if anagram(s[start: start+t_len], t):
return True
return False
print Question1("app", "paple")
I am working on some practice technical interview questions and I'm stuck on the following question:
Find whether an anagram of string t is a substring of s
I have worked out the following two variants of my code, and a solution to this I believe lies in a cross between the two. The problem I am having is that the first code always prints False., regardless of input. The second variation works to some degree. However, it cannot sort individual letters. For example t=jks s=jksd will print True! however t=kjs s=jksd will print False.
def Question1():
# Define strings as raw user input.
t = raw_input("Enter phrase t:")
s = raw_input("Enter phrase s:")
# Use the sorted function to find if t in s
if sorted(t.lower()) in sorted(s.lower()):
print("True!")
else:
print("False.")
Question1()
Working variant:
def Question1():
# Define strings as raw user input.
t = raw_input("Enter phrase t:")
s = raw_input("Enter phrase s:")
# use a loop to find if t is in s.
if t.lower() in s.lower():
print("True!")
else:
print("False.")
Question1()
I believe there is a solution that lies between these two, but I'm having trouble figuring out how to use sorted in this situation.

You're very much on the right track. First, please note that there is no loop in your second attempt.
The problem is that you can't simply sort all of s and then look for sorted(t) in that. Rather, you have to consider each len(t) sized substring of s, and check that against the sorted t. Consider the trivial example:
t = "abd"
s = "abdc"
s trivially contains t. However, when you sort them, you get the strings abd and abcd, and the in comparison fails. The sorting gets other letters in the way.
Instead, you need to step through s in chunks the size of t.
t_len = len(t)
s_len = len(s)
t_sort = sorted(t)
for start in range(s_len - t_len + 1):
chunk = s[start:start+t_len]
if t_sort == sorted(chunk):
# SUCCESS!!

I think your problem lies in the "substring" requirement. If you sort, you destroy order. Which means that while you can determine that an anagram of string1 is an anagram of a substring of string2, until you actually deal with string2 in order, you won't have a correct answer.
I'd suggest iterating over all the substrings of length len(s1) in s2. This is a straightforward for loop. Once you have the substrings, you can compare them (sorted vs sorted) with s1 to decide if there is any rearrangement of s1 that yields a contiguous substring of s2.
Viz:
s1 = "jks"
s2 = "aksjd"
print('s1=',s1, ' s2=', s2)
for offset in range(len(s2) - len(s1) + 1):
ss2 = s2[offset:offset+len(s1)]
if sorted(ss2) == sorted(s1):
print('{} is an anagram of {} at offset {} in {}'.format(ss2, s1, offset, s2))

Python - packing/unpacking by letters

I'm just starting to learn python and I have this exercise that's puzzling me:
Create a function that can pack or unpack a string of letters.
So aaabb would be packed a3b2 and vice versa.
For the packing part of the function, I wrote the following
def packer(s):
if s.isalpha(): # Defines if unpacked
stack = []
for i in s:
if s.count(i) > 1:
if (i + str(s.count(i))) not in stack:
stack.append(i + str(s.count(i)))
else:
stack.append(i)
print "".join(stack)
else:
print "Something's not quite right.."
return False
packer("aaaaaaaaaaaabbbccccd")
This seems to work all proper. But the assignment says that
if the input has (for example) the letter a after b or c, then
it should later be unpacked into it's original form.
So "aaabbkka" should become a3b2k2a, not a4b2k2.
I hence figured, that I cannot use the "count()" command, since
that counts all occurrences of the item in the whole string, correct?
What would be my options here then?
On to the unpacking -
I've thought of the basics what my code needs to do -
between the " if s.isalpha():" and else, I should add an elif that
checks whether or not the string has digits in it. (I figured this would be
enough to determine whether it's the packed version or unpacked).
Create a for loop and inside of it an if sentence, which then checks for every element:
2.1. If it has a number behind it > Return (or add to an empty stack) the number times the digit
2.2. If it has no number following it > Return just the element.
Big question number 2 - how do I check whether it's a number or just another
alphabetical element following an element in the list? I guess this must be done with
slicing, but those only take integers. Could this be achieved with the index command?
Also - if this is of any relevance - so far I've basically covered lists, strings, if and for
and I've been told this exercise is doable with just those (...so if you wouldn't mind keeping this really basic)
All help appreciated for the newbie enthusiast!
SOLVED:
def packer(s):
if s.isalpha(): # Defines if unpacked
groups= []
last_char = None
for c in s:
if c == last_char:
groups[-1].append(c)
else:
groups.append([c])
last_char = c
return ''.join('%s%s' % (g[0], len(g)>1 and len(g) or '') for g in groups)
else: # Seems to be packed
stack = ""
for i in range(len(s)):
if s[i].isalpha():
if i+1 < len(s) and s[i+1].isdigit():
digit = s[i+1]
char = s[i]
i += 2
while i < len(s) and s[i].isdigit():
digit +=s[i]
i+=1
stack += char * int(digit)
else:
stack+= s[i]
else:
""
return "".join(stack)
print (packer("aaaaaaaaaaaabbbccccd"))
print (packer("a4b19am4nmba22"))
So this is my final code. Almost managed to pull it all off with just for loops and if statements.
In the end though I had to bring in the while loop to solve reading the multiple-digit numbers issue. I think I still managed to keep it simple enough. Thanks a ton millimoose and everyone else for chipping in!

A straightforward solution:
If a char is different, make a new group. Otherwise append it to the last group. Finally count all groups and join them.
def packer(s):
groups = []
last_char = None
for c in s:
if c == last_char:
groups[-1].append(c)
else:
groups.append([c])
last_char = c
return ''.join('%s%s'%(g[0], len(g)) for g in groups)
Another approach is using re.
Regex r'(.)\1+' can match consecutive characters longer than 1. And with re.sub you can easily encode it:
regex = re.compile(r'(.)\1+')
def replacer(match):
return match.group(1) + str(len(match.group(0)))
regex.sub(replacer, 'aaabbkka')
#=> 'a3b2k2a'

I think You can use `itertools.grouby' function
for example
import itertools
data = 'aaassaaasssddee'
groupped_data = ((c, len(list(g))) for c, g in itertools.groupby(data))
result = ''.join(c + (str(n) if n > 1 else '') for c, n in groupped_data)
of course one can make this code more readable using generator instead of generator statement

This is an implementation of the algorithm I outlined in the comments:
from itertools import takewhile, count, islice, izip
def consume(items):
from collections import deque
deque(items, maxlen=0)
def ilen(items):
result = count()
consume(izip(items, result))
return next(result)
def pack_or_unpack(data):
start = 0
result = []
while start < len(data):
if data[start].isdigit():
# `data` is packed, bail
return unpack(data)
run = run_len(data, start)
# append the character that might repeat
result.append(data[start])
if run > 1:
# append the length of the run of characters
result.append(str(run))
start += run
return ''.join(result)
def run_len(data, start):
"""Return the end index of the run of identical characters starting at
`start`"""
return start + ilen(takewhile(lambda c: c == data[start],
islice(data, start, None)))
def unpack(data):
result = []
for i in range(len(data)):
if data[i].isdigit():
# skip digits, we'll look for them below
continue
# packed character
c = data[i]
# number of repetitions
n = 1
if (i+1) < len(data) and data[i+1].isdigit():
# if the next character is a digit, grab all the digits in the
# substring starting at i+1
n = int(''.join(takewhile(str.isdigit, data[i+1:])))
# append the repeated character
result.append(c*n) # multiplying a string with a number repeats it
return ''.join(result)
print pack_or_unpack('aaabbc')
print pack_or_unpack('a3b2c')
print pack_or_unpack('a10')
print pack_or_unpack('b5c5')
print pack_or_unpack('abc')
A regex-flavoured version of unpack() would be:
import re
UNPACK_RE = re.compile(r'(?P<char> [a-zA-Z]) (?P<count> \d+)?', re.VERBOSE)
def unpack_re(data):
matches = UNPACK_RE.finditer(data)
pairs = ((m.group('char'), m.group('count')) for m in matches)
return ''.join(char * (int(count) if count else 1)
for char, count in pairs)
This code demonstrates the most straightforward (or "basic") approach of implementing that algorithm. It's not particularly elegant or idiomatic or necessarily efficient. (It would be if written in C, but Python has the caveats such as: indexing a string copies the character into a new string, and algorithms that seem to copy data excessively might be faster than trying to avoid this if the copying is done in C and the workaround was implemented with a Python loop.)

Python reversing a string using recursion

I want to use recursion to reverse a string in python so it displays the characters backwards (i.e "Hello" will become "olleh"/"o l l e h".
I wrote one that does it iteratively:
def Reverse( s ):
result = ""
n = 0
start = 0
while ( s[n:] != "" ):
while ( s[n:] != "" and s[n] != ' ' ):
n = n + 1
result = s[ start: n ] + " " + result
start = n
return result
But how exactly do I do this recursively? I am confused on this part, especially because I don't work with python and recursion much.
Any help would be appreciated.

def rreverse(s):
if s == "":
return s
else:
return rreverse(s[1:]) + s[0]
(Very few people do heavy recursive processing in Python, the language wasn't designed for it.)

To solve a problem recursively, find a trivial case that is easy to solve, and figure out how to get to that trivial case by breaking the problem down into simpler and simpler versions of itself.
What is the first thing you do in reversing a string? Literally the first thing? You get the last character of the string, right?
So the reverse of a string is the last character, followed by the reverse of everything but the last character, which is where the recursion comes in. The last character of a string can be written as x[-1] while everything but the last character is x[:-1].
Now, how do you "bottom out"? That is, what is the trivial case you can solve without recursion? One answer is the one-character string, which is the same forward and reversed. So if you get a one-character string, you are done.
But the empty string is even more trivial, and someone might actually pass that in to your function, so we should probably use that instead. A one-character string can, after all, also be broken down into the last character and everything but the last character; it's just that everything but the last character is the empty string. So if we handle the empty string by just returning it, we're set.
Put it all together and you get:
def backward(text):
if text == "":
return text
else:
return text[-1] + backward(text[:-1])
Or in one line:
backward = lambda t: t[-1] + backward(t[:-1]) if t else t
As others have pointed out, this is not the way you would usually do this in Python. An iterative solution is going to be faster, and using slicing to do it is going to be faster still.
Additionally, Python imposes a limit on stack size, and there's no tail call optimization, so a recursive solution would be limited to reversing strings of only about a thousand characters. You can increase Python's stack size, but there would still be a fixed limit, while other solutions can always handle a string of any length.

I just want to add some explanations based on Fred Foo's answer.
Let's say we have a string called 'abc', and we want to return its reverse which should be 'cba'.
def reverse(s):
if s == "":
return s
else:
return reverse(s[1:]) + s[0]
s = "abc"
print (reverse(s))
How this code works is that:
when we call the function
reverse('abc') #s = abc
=reverse('bc') + 'a' #s[1:] = bc s[0] = a
=reverse('c') + 'b' + 'a' #s[1:] = c s[0] = a
=reverse('') + 'c' + 'b' + 'a'
='cba'

If this isn't just a homework question and you're actually trying to reverse a string for some greater goal, just do s[::-1].

def reverse_string(s):
if s: return s[-1] + reverse_string(s[0:-1])
else: return s
or
def reverse_string(s):
return s[-1] + reverse_string(s[0:-1]) if s else s

I know it's too late to answer original question and there are multiple better ways which are answered here already. My answer is for documentation purpose in case someone is trying to implement tail recursion for string reversal.
def tail_rev(in_string,rev_string):
if in_string=='':
return rev_string
else:
rev_string+=in_string[-1]
return tail_rev(in_string[:-1],rev_string)
in_string=input("Enter String: ")
rev_string=tail_rev(in_string,'')
print(f"Reverse of {in_string} is {rev_string}")

s = input("Enter your string: ")
def rev(s):
if len(s) == 1:
print(s[0])
exit()
else:
#print the last char in string
#end="" prints all chars in string on same line
print(s[-1], end="")
"""Next line replaces whole string with same
string, but with 1 char less"""
return rev(s.replace(s, s[:-1]))
rev(s)

if you do not want to return response than you can use this solution. This question is part of LeetCode.
class Solution:
i = 0
def reverseString(self, s: List[str]) -> None:
"""
Do not return anything, modify s in-place instead.
"""
if self.i >= (len(s)//2):
return
s[self.i], s[len(s)-self.i-1] = s[len(s)-self.i-1], s[self.i]
self.i += 1
self.reverseString(s)

Python: Clever ways at string manipulation

I'm new to Python and am currently reading a chapter on String manipulation in "Dive into Python."
I was wondering what are some of the best (or most clever/creative) ways to do the following:
1) Extract from this string: "stackoverflow.com/questions/ask" the word 'questions.' I did string.split(/)[0]-- but that isn't very clever.
2) Find the longest palindrome in a given number or string
3) Starting with a given word (i.e. "cat")-- find all possible ways to get from that to another three- letter word ("dog"), changing one letter at a time such that each change in letters forms a new, valid word.
For example-- cat, cot, dot, dog

As personal exercise, here's to you, (hopefully) well commented code with some hints.
#!/usr/bin/env python2
# Let's take this string:
a = "palindnilddafa"
# I surround with a try/catch block, explanation following
try:
# In this loop I go from length of a minus 1 to 0.
# range can take 3 params: start, end, increment
# This way I start from the thow longest subsring,
# the one without the first char and without the last
# and go on this way
for i in range(len(a)-1, 0, -1):
# In this loop I want to know how many
# Palidnrome of i length I can do, that
# is len(a) - i, and I take all
# I start from the end to find the largest first
for j in range(len(a) - i):
# this is a little triky.
# string[start:end] is the slice operator
# as string are like arrays (but unmutable).
# So I take from j to j+i, all the offsets
# The result of "foo"[1:3] is "oo", to be clear.
# with string[::-1] you take all elements but in the
# reverse order
# The check string1 in string2 checks if string1 is a
# substring of string2
if a[j:j+i][::-1] in a:
# If it is I cannot break, 'couse I'll go on on the first
# cycle, so I rise an exception passing as argument the substring
# found
raise Exception(a[j:j+i][::-1])
# And then I catch the exception, carrying the message
# Which is the palindrome, and I print some info
except Exception as e:
# You can pass many things comma-separated to print (this is python2!)
print e, "is the longest palindrome of", a
# Or you can use printf formatting style
print "It's %d long and start from %d" % (len(str(e)), a.index(str(e)))
After the discussion, and I'm little sorry if it goes ot. I've written another implementation of palindrome-searcher, and if sberry2A can, I'd like to know the result of some benchmark tests!
Be aware, there are a lot of bugs (i guess) about pointers and the hard "+1 -1"-problem, but the idea is clear. Start from the middle and then expand until you can.
Here's the code:
#!/usr/bin/env python2
def check(s, i):
mid = s[i]
j = 1
try:
while s[i-j] == s[i+j]:
j += 1
except:
pass
return s[i-j+1:i+j]
def do_all(a):
pals = []
mlen = 0
for i in range(len(a)/2):
#print "check for", i
left = check(a, len(a)/2 + i)
mlen = max(mlen, len(left))
pals.append(left)
right = check(a, len(a)/2 - i)
mlen = max(mlen, len(right))
pals.append(right)
if mlen > max(2, i*2-1):
return left if len(left) > len(right) else right
string = "palindnilddafa"
print do_all(string)

No 3:
If your string is s:
max((j-i,s[i:j]) for i in range(len(s)-1) for j in range(i+2,len(s)+1) if s[i:j]==s[j-1:i-1:-1])[1]
will return the answer.

For #2 - How to find the longest palindrome in a given string?

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Finding substring in python [duplicate] - python

Getting the remainder of the string using a for-loop: n = len(substr) rest = next((s[i+n:] for i in range(len(s) - n + 1) if s[i:i+n] == substr), None) # return None if substr not in s It is equivalent to: _, sep, rest = s.partition(substr) if substr and not sep: # substr not in s rest = None

Related

python Question: Could any one explain how the above class reversed the string return val? sorry coming back to python after many years [duplicate]

Python 2.7 finding if some anagram of one string is a substring of another [duplicate]

Python - packing/unpacking by letters

Python reversing a string using recursion

Python: Clever ways at string manipulation

Categories

Resources