Navigating around a string in python

Pardon the incredibly trivial/noob question; at least it should be easy to answer. I've been working through the Coderbyte problems and solving the easy ones in Python, but I've hit a wall. The problem is to return True if every alphabetic character in a string (e.g. d+==f+d++) is surrounded by plus signs (+), and False otherwise. I'm blanking on the concept that would help me navigate around these strings. I tried a loop with an if statement, but it failed to loop through the text entirely and always returned False (from the first problem):
def SimpleSymbols(str):
    split = list(str)
    rawletters = "abcdefghijklmnopqrstuvwxyz"
    letters = list(rawletters)
    for i in split:
        if i in letters and (((split.index(i)) - 1) == '+') and (((split.index(i)) + 1) == '+'):
            return True
        else:
            return False
print SimpleSymbols(raw_input())
Also editing to add the problem statement: "Using the Python language, have the function SimpleSymbols(str) take the str parameter being passed and determine if it is an acceptable sequence by either returning the string true or false. The str parameter will be composed of + and = symbols with several letters between them (ie. ++d+===+c++==a) and for the string to be true each letter must be surrounded by a + symbol. So the string to the left would be false. The string will not be empty and will have at least one letter."
Any assistance would be greatly appreciated. Thank you!

Here's how I would do the first part (if I weren't using regex):
import string
LOWERCASE = set(string.ascii_lowercase)
def plus_surrounds(s):
    """Return True if `+` surrounds a single ascii lowercase letter."""
    # initialize to 0 -- The first time through the loop is guaranteed not
    # to find anything, but it will initialize `idx1` and `idx2` for us.
    # We could actually make this more efficient by factoring out
    # the first 2 `find` operations (left as an exercise).
    idx2 = idx1 = 0
    # if the indices are negative, we hit the end of the string.
    while idx2 >= 0 and idx1 >= 0:
        # if they're 2 spaces apart, check the character between them
        # otherwise, keep going.
        if (idx2 - idx1 == 2) and (s[idx1+1] in LOWERCASE):
            return True
        idx1 = s.find('+', idx2)
        idx2 = s.find('+', max(idx1+1, 0))
    return False
assert plus_surrounds('s+s+s')
assert plus_surrounds('+s+')
assert not plus_surrounds('+aa+')
I think that if you study this code and understand it, you should be able to get the second part without too much trouble.
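As a point of comparison, here is one way the full check might look with a plain loop. It is only a sketch: the name simple_symbols is made up for illustration, and it returns a bool where the problem statement asks for the string true or false. It also avoids the two things that trip up the code in the question: split.index(i) - 1 is an integer, so comparing it to '+' is never true, and returning from both branches of the if ends the loop on the first character.

```python
import string

LETTERS = set(string.ascii_letters)

def simple_symbols(s):
    """Return True if every letter in s has a '+' directly on both sides."""
    for i, ch in enumerate(s):
        if ch not in LETTERS:
            continue  # '+' and '=' need no checking
        # A letter at either end of the string cannot be surrounded.
        if i == 0 or i == len(s) - 1:
            return False
        # Compare the neighbouring characters, not the index numbers.
        if s[i - 1] != '+' or s[i + 1] != '+':
            return False
    # Only report success after every character has been examined.
    return True
```

With this sketch, simple_symbols("++d+===+c++==a") is False because the trailing a has nothing to its right.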

More of a note than an answer, but I wanted to mention regular expressions as a solution. Not because they're the right one for your scenario (this looks distinctly homework-ish, so I understand you're almost certainly not allowed to use regex), but to impress upon you early that in Python, almost EVERYTHING is solved by import foo.
import re
def SimpleSymbols(target):
    # Reject the string if it contains a character other than letters, + or =,
    # or if any letter lacks a + immediately before or after it.
    return not (re.search(r"[^a-zA-Z+=]",target) or re.search(r"(?<!\+)\w|\w(?!\+)",target))

Related

Trying to sort two combined strings alphabetically without duplicates

Challenge: Take 2 strings s1 and s2 including only letters from a to z. Return a new sorted string, the longest possible, containing distinct letters - each taken only once - coming from s1 or s2.
# Examples
a = "xyaabbbccccdefww"
b = "xxxxyyyyabklmopq"
assert longest(a, b) == "abcdefklmopqwxy"
a = "abcdefghijklmnopqrstuvwxyz"
assert longest(a, a) == "abcdefghijklmnopqrstuvwxyz"
So I am just starting to learn, but so far I have this:
def longest(a1, a2):
    for letter in max(a1, a2):
        return ''.join(sorted(a1+a2))
which returns all the letters but I am trying to filter out the duplicates.
This is my first time on stack overflow so please forgive anything I did wrong. I am trying to figure all this out.
I also do not know how to indent in the code section if anyone could help with that.
You have two options here. The first is the answer you asked for and the second is an alternative method.
To filter out duplicates, you can make a blank string and then go through the returned string. For each character, if it is already in the new string, move on to the next one; otherwise add it.
out = ""
for i in returned_string:
    if i not in out:
        out += i
return out
This would be embedded inside a function.
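Put together with the sorting step from the question, the whole function might look like the sketch below. The parameter names s1 and s2 come from the challenge text, and combined stands in for what the snippet above calls returned_string.

```python
def longest(s1, s2):
    """Return the sorted, duplicate-free letters found in s1 or s2."""
    # Sort first, so the filtered result stays in alphabetical order.
    combined = ''.join(sorted(s1 + s2))
    out = ""
    for ch in combined:
        # Keep only the first occurrence of each letter.
        if ch not in out:
            out += ch
    return out
```

This version passes both asserts given in the question.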
The second option is to use Python's sets. For what you want to do, you can think of them as lists with no duplicate elements. You could simplify your function to
def longest(a: str, b: str):
    return "".join(sorted(set(a).union(set(b))))
This makes a set from all the characters in a, and then another one from all the characters in b. It then combines them (union) into another set. Sets are unordered, so sorted() puts the characters in alphabetical order before they are joined into the final string. Hope this helps.

How to remove punctuation and capitalization in my palindrome assignment?

The problem is trying to use strings to prove whether a word or phrase is a palindrome.
def is_palindrome(input_string):
    left = 0
    right = len(input_string) - 1
    while left < right:
        if input_string[left] != input_string[right]:
            return False
        left += 1
        right -= 1
    return True
This is what I attempted, but when I call is_palindrome("race car") it returns False when it is supposed to return True. I need help finding code to add to this so that punctuation and capitalization are ignored.
For characters that aren't letters, there is a string function called .isalpha() that returns True if the string contains only alphabetic characters, False otherwise. You'll need to iterate over each character of the string, retaining only the characters that pass this check.
To make your function case-insensitive, you can use .upper() or .lower(), which return the string with all of its letters uppercased or lowercased, respectively.
Note that this answer is deliberately incomplete, (i.e. no code), per the recommendations here. If, after several days from the posting of this answer, you're still stuck, post a comment and I can advise further.
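For later readers, here is one way the two hints might be combined. It is a sketch, not necessarily what the answerer had in mind: the helper name normalize is made up for illustration, and the two-pointer loop is the one from the question.

```python
def normalize(text):
    """Keep only alphabetic characters, lowercased (hypothetical helper)."""
    return "".join(ch.lower() for ch in text if ch.isalpha())

def is_palindrome(input_string):
    # Clean the input once, then compare from both ends as before.
    cleaned = normalize(input_string)
    left = 0
    right = len(cleaned) - 1
    while left < right:
        if cleaned[left] != cleaned[right]:
            return False
        left += 1
        right -= 1
    return True
```

With the spaces dropped and the case folded, "race car" is compared as "racecar".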

writing an adaptor removal tool, advice on ignoring case on the sequence

I am learning how to code. I need to code, among other things, an adaptor removal tool. My script works fine except in cases where the sequence is a mix of lower and upper case.
adaptor sequence== TATA
sequence == TAtaGATTACA
This is the function for the adaptor removal
elif operation == "adaptor-removal":
    adaptor = args.adaptor
    reads = sequences(args.input, format)
    num_reads = len(reads)
    bases = "".join([read["seq"] for read in reads])
    adaptors_found = 0
    for read in reads:
        for i, j in read.items():
            if i == "seq":
                if j.startswith(adaptor.upper()) or j.startswith(adaptor.lower()):
                    adaptors_found += 1
                    j = j.replace(adaptor.upper(), "", 1)
                    j = j.replace(adaptor.lower(), "", 1)
                args.output.write("%s\n" % j)
    print_summary(operation)
    print("%s adaptors found" % adaptors_found)
I tried with:
if j.startswith(adaptor,re.I):
but doesn't work, I don't really understand why. Can anyone experienced guide me through this?
Thank you very much
Let's suppose j is TAtaGATTACA and adaptor is TATA.
Is j.startswith(adaptor.upper()) true? No, because j doesn't start with TATA.
Is j.startswith(adaptor.lower()) true? No, because j doesn't start with tata.
The easiest way to compare two strings case-insensitively is to convert both of them to the same case, upper or lower, and then compare those two strings as if you were comparing them case-sensitively. It doesn't matter whether you choose upper-case or lower-case, as long as you choose the same for both.
Is j.lower().startswith(adaptor.lower()) true? Yes, because j.lower() starts with tata.
Also, take care with your two .replace() calls: it's possible that one of them may end up removing text further along in j, which I don't believe you want. If you just want to trim the adaptor off the front of j, you are better off using a string slice:
if j.lower().startswith(adaptor.lower()):
    adaptors_found += 1
    j = j[len(adaptor):]
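The same idea can be pulled out into a small helper, which makes it easy to check against the mixed-case example from the question. The function name strip_adaptor and its return shape are assumptions made for this sketch; they are not part of the original script.

```python
def strip_adaptor(seq, adaptor):
    """Return (trimmed_seq, found); the prefix match ignores case.

    Hypothetical helper: only a leading adaptor is removed, so text
    further along in seq is never touched.
    """
    if seq.lower().startswith(adaptor.lower()):
        # Slice by length so the rest of the read keeps its original case.
        return seq[len(adaptor):], True
    return seq, False
```

Calling strip_adaptor("TAtaGATTACA", "TATA") gives ("GATTACA", True), and a read that does not start with the adaptor comes back unchanged.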
Finally, you also ask why
if j.startswith(adaptor,re.I):
doesn't do what you want. The answer is that if you pass a second parameter to .startswith(), the value of this second parameter is the start position that you search from, not a flag that controls the matching:
"abcd".startswith("cd") # False
"abcd".startswith("cd", 2) # True
It happens that re.I can be converted to the integer 2. So the following is also True, although it looks odd:
"abcd".startswith("cd", re.I)

Single specific character removal without slicing or strip

How can I remove a single character from a string?
Basically I have a string like:
abccbaa
I wish to remove the first and last letter. With the string.rstrip or string.lstrip methods, all of the occurrences are removed and I get the string bccb. The same goes for replace.
Is there a way of doing so? I cannot import anything, and I can't use slicing (except for accessing a single letter). I also cannot use any kind of loop.
To give the whole picture, I need to write a recursive palindrome algorithm. My current code is:
def is_palindrome(s):
    if s == '':
        return True
    if s[0] != s[-1]:
        return False
    else:
        s = s.replace(s[0], '')
        s = s.replace(s[-1], '')
        return is_palindrome(s)
print is_palindrome("abccbaa")
As you can see, it works unless it is given a string like the one in the print line, since more than the "edge" letters are stripped.
Slicing or replacing the string isn't needed, and it is costly because it creates new strings over and over. In languages where strings are much less convenient to handle (like C), you wouldn't even think of doing it that way.
Of course you need some kind of looping, but recursion takes care of that.
You could do it "the old way": just pass the start and end indices recursively, with a nested function to hide the start condition from the caller:
def is_palindrome(s):
    def internal_method(s,start,end):
        if start>=end:
            return True
        if s[start] != s[end]:
            return False
        else:
            return internal_method(s,start+1,end-1)
    return internal_method(s,0,len(s)-1)
Recursion stops if start meets end or if the checked letters don't match (with a different outcome, of course).
A little testing suggests it works :)
>>> is_palindrome("")
True
>>> is_palindrome("a")
True
>>> is_palindrome("ab")
False
>>> is_palindrome("aba")
True
>>> is_palindrome("abba")
True
>>> is_palindrome("abbc")
False
I'm taking a guess at what you are looking for here, since your question wasn't all that clear, but this takes off the first and last character of the string you provided:
Python 2.7.14 (default, Nov 12 2018, 12:56:03)
>>> string = "abccbaa"
>>> print(string[1:-1])
bccba

What is the syntactical structure of a += statement in python?

I have discovered something in python today. But haven't found a clear explanation for it yet.
In python it seems that this works:
variable += a_single_statement
So, following statements are correct:
variable += another_variable
variable += (another_variable - something_else)
But doing the following is incorrect:
variable += a_variable - b_variable
Could someone explain why this is the case, preferably with a link to the documentation to the syntactical structure that explains what the operands of a += operator are, what expressions are expected and what their structure is? Also, are my thoughts, outlined above, even correct?
The behavior seems to be different from other programming languages I'm used to, and that last 'statement' leads to a syntax error.
Edit: the code where it doesn't work. It might be a whitespace error instead :/
T = input()
counter = 0
# For each word, figure out edit length to palindrome
for _ in range(T):
    counter += 1
    word = raw_input()
    word_len = len(word) #stored for efficiency
    index = 0
    sum_edits = 0
    # Iterate half the word and always compare characters
    # at equal distance d from the beginning and from
    # the ending of the word
    while index < word_len/2.0:
        sum_edits += max(ord(word[index]), ord(word[word_len-index-1])) -
                     min(ord(word[index]), ord(word[word_len - index - 1]))
        index += 1
    print sum_edits
It's code to detect how many edits it would take to make a word into a palindrome, if you could only change letters 'downwards' towards an 'a'.
Does this mean you can not arbitrarily break up a line in python code, if it's clear that the 'expression' has to continue anyway? Or can you only break up lines of code if they are surrounded with parentheses?
Sorry, I'm very new to python.
It has nothing to do with +=. Python simply doesn't let you split a statement across multiple lines unless there's an open (, {, or [, or unless you use a line continuation with \. It might look obvious that those two lines are supposed to be one statement, but then when you have statements like
a = loooooooooooooooooooooooooooooong_thiiiiiiiiiiiiiiiiiiiiiiiing
    + ooooooooooooootheeeeeeeeeeeeer_thiiiiiiiiiiiiiiiiiiiing
is that one statement or two? If you allow
a = loooooooooooooooooooooooooooooong_thiiiiiiiiiiiiiiiiiiiiiiiing +
    ooooooooooooootheeeeeeeeeeeeer_thiiiiiiiiiiiiiiiiiiiing
to be one statement, then either interpretation of having the + operator on the second line is confusing and bug-prone. JavaScript tries to allow this kind of thing, and its semicolon insertion causes all kinds of problems.
It's usually recommended to use parentheses if you're not already inside brackets or braces.
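Applied to the line from the question, the parenthesized form might look like the sketch below. The wrapper function edits_to_palindrome is a name invented here so the snippet can run on its own, and it is written for Python 3 rather than the Python 2 of the question.

```python
def edits_to_palindrome(word):
    """Sum the ordinal differences between mirrored character pairs."""
    word_len = len(word)
    sum_edits = 0
    for index in range(word_len // 2):
        # The open parenthesis tells Python the expression continues
        # on the next line, so no backslash is needed.
        sum_edits += (max(ord(word[index]), ord(word[word_len - index - 1])) -
                      min(ord(word[index]), ord(word[word_len - index - 1])))
    return sum_edits
```

Without the outer parentheses, the first line would end on a dangling minus sign, which is the syntax error from the question.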
