Having trouble understanding bisection searching and recursion - python

Here's the problem I'm trying to wrap my head around:
We can use the idea of bisection search to determine if a character
is in a string, so long as the string is sorted in alphabetical order.
First, test the middle character of a string against the character
you're looking for (the "test character"). If they are the same, we
are done - we've found the character we're looking for!
If they're not the same, check if the test character is "smaller" than
the middle character. If so, we need only consider the lower half of
the string; otherwise, we only consider the upper half of the string.
(Note that you can compare characters using Python's < function.)
Implement the function isIn(char, aStr) which implements the above
idea recursively to test if char is in aStr. char will be a single
character and aStr will be a string that is in alphabetical order. The
function should return a boolean value.
As you design the function, think very carefully about what the base
cases should be.
Here's the code I tried to do. I'm getting errors, but I'm falling behind in understanding the basics of how to do this problem.
def isIn(char, aStr):
    '''
    char: a single character
    aStr: an alphabetized string
    returns: True if char is in aStr; False otherwise
    '''
    # Your code here
    middle_char = len(aStr)/2
    if char == middle_char:
        True
    elif char == "" or char == 1:
        False
    elif char < aStr[:middle_char]:
        return isIn(char,aStr(middle_char)
    else:
        return isIn(char, aStr(middle_char))

One reason you're falling behind is that you're trying to write a recursive function when you haven't yet mastered writing simple statements. You have about 10 lines of active code here, including at least four syntax errors and two semantic errors.
Back off and use incremental programming. Write a few lines of code, test them, and don't advance until you're sure they work as expected. Insert diagnostic print statements to check values as you go. For instance, start with force-fed values and no actual function call, like this:
# def isIn(char, aStr):
'''
char: a single character
aStr: an alphabetized string
returns: True if char is in aStr; False otherwise
'''
char = 'q'
aStr = "abcdefghijklmnopqrstuvwxyz"
print("parameters:", char, aStr)
middle_char = len(aStr)//2  # integer division; in Python 3, / would give 13.0
print(len(aStr), middle_char)
print("if", char, "==", middle_char, ":")
This gives you the output
parameters: q abcdefghijklmnopqrstuvwxyz
26 13
if q == 13 :
Obviously, a character is not going to equal the integer 13.
Fix this before you go any further. Then you can try actually writing your first if statement.
See how that works?

middle_char = len(aStr)/2
if char == middle_char:
middle_char here is half the length (i.e. a number), not a character, so it is never going to be equal to your char value. Use
middle_index = len(aStr)//2
middle_char = aStr[middle_index]
to actually get the middle char value. Note the integer division (//): we want to make sure that the resulting index is a whole number.
elif char == "" or char == 1:
You've already tested (well, tried to test) the case where there is one char left, so you don't need to handle that specifically. You do need to test for an empty string before you try extracting values. Note that char is the character you are searching for; it is aStr that shrinks, so the emptiness check belongs on aStr.
elif char < aStr[:middle_char]:
Here you actually do try to index into the string. Unfortunately, what you are actually doing is slicing it: aStr[:middle_char] is the whole first half of the string (everything before the middle index), so char is compared against a substring rather than against the single middle character. Use aStr[middle_index] to get one character.
return isIn(char,aStr(middle_char)
else:
return isIn(char, aStr(middle_char))
- Missing closing parenthesis on the first return.
- aStr() is not a function call. You need [ and ] to index or slice.
- You are trying to pass just a single char into the recursive call. You need to slice the string and pass the resulting substring into the recursive call.
- Both of these (ignoring the missing parenthesis) are identical calls. You need one to call with the first half of aStr and one with the second half.
Your task says to think about the base cases. There are two base cases, followed by two recursive cases (I'm listing them because you almost got them spot on):
- empty string (return False)
- mid char = search char (return True)
- mid char > search char (search left substring)
- mid char < search char (search right substring)
Note that there is no need to explicitly check for a non-matching string with a length of 1, as that will pass an empty string into the next call.
Something for you to think about: why does the string need to be sorted? What happens if the string isn't sorted?
A working implementation:
def isin(char, aStr):
    if not aStr:
        return False
    mid_index = len(aStr)//2  # integer division, so this is a valid index
    mid_char = aStr[mid_index]
    return True if mid_char == char else isin(char, aStr[:mid_index] if mid_char > char else aStr[mid_index+1:])
DO NOT just use this code. It is only for your reference, so you can understand what it is doing and rewrite your code once you understand it. There is no point in just copying the code if you don't understand it. It won't help you in the future.
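If the conditional expression on the last line is hard to read, the same cases can be spelled out with explicit branches. This is only an illustrative sketch; the name is_in_verbose is hypothetical and not part of the assignment.

```python
def is_in_verbose(char, a_str):
    """Recursive bisection search; a_str must be sorted alphabetically."""
    if a_str == "":
        return False  # base case: nothing left to search
    mid_index = len(a_str) // 2
    mid_char = a_str[mid_index]
    if mid_char == char:
        return True  # base case: found the character
    if char < mid_char:
        # recursive case: only the part before the middle can contain char
        return is_in_verbose(char, a_str[:mid_index])
    # recursive case: only the part after the middle can contain char
    return is_in_verbose(char, a_str[mid_index + 1:])


print(is_in_verbose("q", "abcdefghijklmnopqrstuvwxyz"))
print(is_in_verbose("q", "abcxyz"))
```

Each recursive call drops the middle character and one half of the string, so the string gets strictly shorter and the empty-string base case is always reached eventually.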
You do seem to have the general idea of what you need to do (I'm guessing you have gone over this in class), but are lacking knowledge of the how (syntax etc.).
I recommend going through the Python tutorial in your own time, doing the exercises it takes you through. It will introduce you to the features of the language in turn, and this will really help you.
Good luck!

Related

Randomly replacing letters, numbers, and punctuation in a string: can this code be condensed?

Writing a function to check an input string for numbers, and if there are any, to randomize every digit, letter, and punctuation mark in the string. (i.e. "hello3.14" might become "jdbme6?21")
This code works (and the goal makes sense in context, I promise) but it sure seems redundant. Not sure how to tighten it up. The ELSE is just there to make me feel better about loose ends, but it's probably disposable.
My primary question is, Can this method be condensed?
Secondary question, Is there a completely different, better way I should do this?
Thanks for any guidance.
import random
import string
def new_thing(old_thing):
    output_str = ''
    if any(char.isdigit() for char in old_thing):
        for char in old_thing:
            get_new = char
            if char in string.digits:
                while get_new == char:
                    get_new = random.choice(string.digits)
                output_str += get_new
            elif char in string.ascii_lowercase:
                while get_new == char:
                    get_new = random.choice(string.ascii_lowercase)
                output_str += get_new
            elif char in string.punctuation:
                while get_new == char:
                    get_new = random.choice(string.punctuation)
                output_str += get_new
            else:
                output_str += char
        print(output_str)
    else:
        print("lol no numbers gg")
new_thing(input("Type a thing: ").lower())
As pointed out, this isn't really the place for reviewing code, but since it's here I wanted to point out how to do your selections without needing a while loop.
A while loop will work, but has a real downside: it no longer takes a consistent amount of time to finish. It also has a theoretical downside in that there is no guarantee it will ever succeed, but you're not likely to actually hit that problem.
I'll use digits as an example. In all cases, you'll need the index of the character.
char_idx = string.digits.find(char)
One way is to make a new string of only available characters.
avail_digits = string.digits[:char_idx]+string.digits[char_idx+1:]
get_new = random.choice(avail_digits)
Another is to pick a random offset between 1 and the length of the string minus one, and add that to your current index. Use the modulus operator % to wrap around to the beginning. Because the offset is never 0 and never the full length, you will never wrap around to your original character.
get_new_idx = (char_idx + random.randrange(1, len(string.digits))) % len(string.digits)
get_new = string.digits[get_new_idx]
You might find it simpler to consider the current index as a marker between a lower and upper part of the string. You still select an index from a range one shorter than the string, but if the new index is the current index or higher, add one to shift it into the range of the upper part.
get_new_idx = random.randrange(0, len(string.digits)-1)
if get_new_idx >= char_idx:
    get_new_idx += 1
get_new = string.digits[get_new_idx]
Your best bet is to make a function for this and call it for each category (pass the character and category string as parameters). You can even move the check for the category into the function and return None if it's not in the category, but I'll leave that for you to try.
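One possible shape for such a function, as a sketch only: the names pick_different and randomize are hypothetical, and the helper assumes each category string has at least two distinct characters. It uses the "shift past the current index" approach and returns None when the character is not in the category.

```python
import random
import string


def pick_different(char, category):
    """Return a random character from category that is not char.

    Returns None when char does not belong to category.
    """
    char_idx = category.find(char)
    if char_idx == -1:
        return None
    # Choose among len(category) - 1 slots, then skip over char's own slot.
    new_idx = random.randrange(len(category) - 1)
    if new_idx >= char_idx:
        new_idx += 1
    return category[new_idx]


def randomize(text):
    """Replace every digit, lowercase letter and punctuation mark in text."""
    out = []
    for char in text:
        for category in (string.digits, string.ascii_lowercase, string.punctuation):
            new_char = pick_different(char, category)
            if new_char is not None:
                out.append(new_char)
                break
        else:
            out.append(char)  # not in any category: keep unchanged
    return "".join(out)


print(randomize("hello3.14"))
```

Every call does a fixed amount of work per character, so there is no retry loop whose running time depends on luck.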
Hope that helps a little.
I'd suggest not doing lots of in operations where the right-hand side is a sequence such as a string or list. Each one takes O(n) time, so a chain of such if statements does more work than a single dictionary lookup. I'd also note that by rejecting the same character as input you're biasing the result, which was part of why the Enigma machine was broken.
I'd do something like:
from random import choice
from string import digits, ascii_lowercase, ascii_uppercase, punctuation
# generate a dictionary once that turns characters into the char classes
# that can be used to draw replacements from
CHAR_CLASSES = {
    c: charclass
    for charclass in (digits, ascii_lowercase, ascii_uppercase, punctuation)
    for c in charclass
}
def _randomizer(c):
    "get a replacement character drawn from the same class"
    charclass = CHAR_CLASSES.get(c)
    return choice(charclass) if charclass else c
def randomize(text):
    return ''.join(map(_randomizer, text))
print(randomize("Hello World 123"))
You could golf this code down if you want, but I've tried to make it somewhat verbose and readable.
If you really want to exclude characters from even appearing as themselves, you could replace the initialization of CHAR_CLASSES with:
CHAR_CLASSES = {
    c: charclass.replace(c, '')
    for charclass in (digits, ascii_lowercase, ascii_uppercase, punctuation)
    for c in charclass
}
but I'd suggest not doing that!

First recurring character problem in Python

I'm trying to solve a problem that I have with a recurring character problem.
I'm a beginner in development so I'm trying to think of ways I can do this.
thisWord = input()
def firstChar(thisWord):
    for i in range(len(thisWord)):
        for j in range(i+1, len(thisWord)):
            if thisWord[i] == thisWord[j]:
                return thisWord[i]
print(firstChar(thisWord))
This is what I came up with. In plenty of use cases, the result is fine. The problem I found after some fiddling around is that with a word like "statistics", where the "t" is the first recurring letter rather than the "s" because of the distance between the letters, my code counts the "s" first and returns that as the result.
I've tried weird solutions like measuring the entire string first for each possible case, creating variables for string length, and then comparing it to another variable, but I'm just ending up with more errors than I can handle.
Thank you in advance.
So you want to find the first letter that recurs in your text, with "first" being determined by the recurrence, not the first occurrence of the letter? To illustrate that with your "statistics" example, the t is the first letter that recurs, but the s had its first occurrence before the first occurrence of the t. I understand that in such cases, it's the t you want, not the s.
If that's the case, then I think a set is what you want, since it allows you to keep track of letters you've already seen before:
thisword = "statistics"
set_of_letters = set()
firstchar = None  # stays None if no letter recurs
for letter in thisword:
    if letter not in set_of_letters:
        set_of_letters.add(letter)
    else:
        firstchar = letter
        break
print(firstchar)
Whenever you're looking at a certain character in the word, you should not check whether the character will occur again at all, but whether it has already occurred. The algorithmically optimal way would be to use a set to store and look up characters as you go, but it could just as well be done with your double loop. The second one should then become for j in range(i).
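As a sketch of the double-loop variant just described (the function name first_recurring is hypothetical), the inner loop looks backwards at the characters before position i instead of forwards:

```python
def first_recurring(word):
    """Return the first character that repeats an earlier one, or None."""
    for i in range(len(word)):
        # Only look at characters that come before position i.
        for j in range(i):
            if word[i] == word[j]:
                return word[i]
    return None  # no character occurs twice


print(first_recurring("statistics"))  # t
```

This still does a quadratic number of comparisons in the worst case, which is why the set-based version is preferable for long inputs.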
This is not an answer to your problem (one was already provided), but a suggestion for a better solution:
def firstChar(thisWord):
    occurrences: dict[str, int] = {char: 0 for char in thisWord}  # At the beginning no character has been seen yet
    for char in thisWord:
        occurrences[char] += 1  # You found this char
        if occurrences[char] == 2:  # This was already found one time before
            return char  # So you return it as the first duplicate
This works as expected:
>>> firstChar("statistics")
't'
EDIT:
occurrences: dict[str, int] = {char: 0 for char in thisWord}
This line of code creates a dictionary with the chars from thisWord as keys and 0 as values, so that you can use it to count the occurrences starting from 0 (before finding a char its count is 0).

My code is incorrectly removing a string from a larger string

"""
This code takes two strings and returns a copy of the first string with
all instances of the second string removed
"""
# This function removes the letter from the word in the event that the
# word has the letter in it
def remove_all_from_string(word, letter):
    while letter in word:
        find_word = word.find(letter)
        word_length = len(word)
        if find_word == -1:
            continue
        else:
            word = word[:find_word] + word[find_word + word_length:]
    return word
# This call of the function states the word and what letter will be
# removed from the word
print(remove_all_from_string("bananas", "an"))
This code is meant to remove a defined string from a larger defined string. In this case the larger string is "bananas" and the smaller string which is removed is "an".
In this case the smaller string is removed multiple times. I believe I am very close to the solution of getting the correct output, but I need the code to output "bas". Instead, it outputs "ba".
The code is supposed to remove all instances of "an" and print whatever is left, however it does not do this. Any help is appreciated.
Your word_length should be len(letter), not len(word), and since the while condition already ensures that the substring is present, you don't need to test the value of find_word.
def remove_all_from_string(word, replacement):
    word_length = len(replacement)
    while replacement in word:
        find_word = word.find(replacement)
        word = word[:find_word] + word[find_word + word_length:]
    return word
Note that str.replace exists:
def remove_all_from_string(word, replacement):
    return word.replace(replacement, "")
You can simply use the .replace() function for python strings.
def remove_all_from_string(word, letter):
    word = word.replace(letter, "")
    return word
print(remove_all_from_string("bananas", "an"))
Output: bas
The Python language has built-in utilities to do that in a single expression.
The fact that you need to do this indicates you are doing some exercise to better understand coding, and that is important. (Hint: to do it in a single step, just use the string replace method.)
So, first thing: avoid using built-in tools that perform more than basic tasks. In this case, in your tentative code, you are using the string find method. It is powerful, but combining it with slicing to find and remove all occurrences of a substring is harder than doing so step by step.
So, what you need is to have variables that record the state of your search and your result. Variables are "free": do not hesitate to create as many as you need, and update them inside the proper if blocks to keep track of your solution.
In this case, you can start with a "position = 0", and increase this "0" until you are at the end of the parent string. You check the character at that position. If it matches the starting character of your substring, you update other variables indicating you are "inside a match", and start a new "position_at_substring" index to track the "matchee". If at any point the character in the main string does not correspond to the character in the substring, it is not an occurrence and you bail out (and copy the skipped characters to your result; therefore you also have to accumulate all skipped characters in a "match_check" substring).
Build your code with the simplest 'while', 'if' and variable updates - stick it all inside a function, so that whenever it works, you can reuse it at will with no effort, and you will have learned a lot.
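A minimal sketch of that step-by-step approach, assuming the same function name and parameters as the question. It is a simplified variant: instead of accumulating a separate "match_check" string, it compares forward from the current position and copies a single character when the comparison fails.

```python
def remove_all_from_string(word, letter):
    """Remove every occurrence of letter from word without find() or replace()."""
    if not letter:
        return word  # nothing to remove; also avoids an endless loop
    result = ""
    position = 0
    while position < len(word):
        # Count how many characters of letter match starting at position.
        offset = 0
        while (offset < len(letter)
               and position + offset < len(word)
               and word[position + offset] == letter[offset]):
            offset += 1
        if offset == len(letter):
            position += len(letter)  # full match: skip over it
        else:
            result += word[position]  # no match here: keep one character
            position += 1
    return result


print(remove_all_from_string("bananas", "an"))  # bas
```

This makes a single left-to-right pass, like str.replace. The repeated find-and-slice loop can behave differently when removing one occurrence creates a new one (for example removing "ab" from "aabb").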

Need to Figure Out Why String Functions Aren't Executing in My Conditionals

Write a function called 'string_type' which accepts one
string argument and determines what type of string it is.
If the string is empty, return "empty".
If the string is a single character, return "character".
If the string represents a single word, return "word".
The string is a single word if it has no spaces.
If the string is a whole sentence, return "sentence".
The string is a sentence if it contains spaces, but at most one period.
If the string is a paragraph, return "paragraph". The
string is a paragraph if it contains both spaces and
multiple periods (we won't worry about other punctuation marks).
If the string is multiple paragraphs, return "page".
The string is a page if it contains any newline
characters ("\n").
I'm allowed to use Python 3's built-in string functions (e.g., len, count, etc.)
I have been able to write a function with different conditions. At first, I tried doing conditions in the order outlined in the problem, however, I wasn't getting answers that matched my test case. I then reversed the order starting with a condition to check if the string is a page, then paragraph, etc.
def string_type(a_string):
    if a_string.count("\n") >= 1:
        return "page"
    elif a_string.count("") >= 1 and a_string.count(".") > 1:
        return "paragraph"
    elif len(a_string) > 1 and a_string.count("") > 1 and a_string.count(".") == 1:
        return "sentence"
    elif len(a_string) > 1 and a_string.count("") == 0:
        return "word"
    elif len(a_string) == 1:
        return "character"
    else:
        return "empty"
Below are some lines of code that will test your function.
You can change the value of the variable(s) to test your
function with different inputs.
If your function works correctly, this will originally print
#empty
#character
#word
#sentence
#paragraph
#page
print(string_type(""))
print(string_type("!"))
print(string_type("CS1301."))
print(string_type("This is too many cases!"))
print(string_type("There's way too many ostriches. Why are there so many ostriches. The brochure said there'd only be a few ostriches."))
print(string_type("Paragraphs need to have multiple sentences. It's true.\nHowever, two is enough. Yes, two sentences can make a paragraph."))
When I run my current code, I get the following results:
#empty
#character
#sentence (instead of word)
#empty (instead of sentence)
#paragraph
#page
I have been tweaking both my word and sentence conditionals, however, I haven't figured out how to correct. Any explanation of what I did wrong and how to fix is appreciated.
The places where you're searching for spaces in your string are wrong.
elif a_string.count("") >= 1
This will try to find the empty string "" in the input, which it will always find.
That part (and others) should be:
elif a_string.count(" ") >= 1
Note it's " " - space.
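Note that fixing the space alone may not be enough for the fourth test case: "This is too many cases!" contains no period, so a condition requiring count(".") == 1 still fails, while the task says "at most one period". One possible restructuring, as a sketch that checks the simplest cases first (the ordering is my own choice, not taken from the assignment):

```python
def string_type(a_string):
    if not a_string:
        return "empty"
    if "\n" in a_string:
        return "page"
    if len(a_string) == 1:
        return "character"
    if " " not in a_string:
        return "word"
    # From here on the string contains at least one space.
    if a_string.count(".") <= 1:
        return "sentence"  # at most one period
    return "paragraph"     # spaces and multiple periods


print(string_type("CS1301."))
print(string_type("This is too many cases!"))
```

Because each branch returns early, later branches can rely on what earlier ones ruled out, which keeps every condition down to a single test.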

Python treats + as a digit?

Basically I'm making a small piece of code to remove the end character of a string until the string passes .isdigit.
def string_conversion(string):
    for i in string:
        if i.isdigit == False:
            string = string[0:len[string]-1]
            print(string) #This is just here to see if it goes through
test = "60+"
string_conversion(test)
I used http://www.pythontutor.com/visualize.html to see why I wasn't getting any output, it passed a + symbol as a digit, just like the numbers 6 and 0.
Am I doing something wrong or does python treat + as a digit?
str.isdigit is a method not an attribute, so (as was already mentioned in the comments) you need to call it.
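A short demonstration of the difference, as a sketch: without the parentheses you compare the method object itself to False, which is never equal, so the if block never runs.

```python
ch = "+"

# Comparing the bound method object to False: always False.
print(ch.isdigit == False)

# Calling the method and comparing its result: "+" is not a digit.
print(ch.isdigit() == False)

# The idiomatic spelling of the same test.
print(not ch.isdigit())
```

The first line prints False and the other two print True, so Python does not treat + as a digit; the original condition simply never tested it.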
Also, you check each character (starting from the left side) with isdigit, but if it doesn't pass the isdigit() test you remove a character from the right side. That doesn't seem right. You probably wanted to iterate in reverse: for i in reversed(string) (or, using slicing, for i in string[::-1]).
But you also could simplify the logic by using a while loop:
def string_conversion(string):
    while string and not string.isdigit():
        string = string[:-1]
    return string
def string_conversion(string):
    for i, s in enumerate(string):
        if not s.isdigit():
            string = string[:i]  # cut at the first non-digit character
            break
    print(string) #This is just here to see if it goes through
test = "60+"
string_conversion(test)
Try this.
