How to use multiple 'if' statements nested inside an enumerator? - python

I have a massive string of letters all jumbled up, 1.2k lines long.
I'm trying to find a lowercase letter that has EXACTLY three capital letters on either side of it.
This is what I have so far
def scramble(sentence):
    try:
        for i, v in enumerate(sentence):
            if v.islower():
                if sentence[i-4].islower() and sentence[i+4].islower():
                    ....
                    ....
    except IndexError:
        print()  # Trying to deal with the problem of reaching the end of the list
# This section is checking if the fourth letters before and after i are
# lowercase, to ensure the central lowercase letter has exactly three
# uppercase letters around it
But now I am stuck with the next step. What I would like to achieve is create a for-loop in range of (-3,4) and check that each of these letters is uppercase. If in fact there are three uppercase letters either side of the lowercase letter then print this out.
For example
for j in range(-3,4):
    if j != 0:
        #Some code to check if the letters in this range are uppercase
        #if j != 0 is there because we already know it is lowercase
        #because of the previous if v.islower(): statement.
If this doesn't make sense, this would be an example output if the code worked as expected
scramble("abcdEFGhIJKlmnop")
OUTPUT
EFGhIJK
One lowercase letter with three uppercase letters either side of it.

Here is a way to do it "Pythonically" without regular expressions:
s = 'abcdEFGhIJKlmnop'
# range(len(s) - 6) so that the final 7-character window is also checked
words = [s[i:i+7] for i in range(len(s) - 6) if s[i:i+3].isupper() and s[i+3].islower() and s[i+4:i+7].isupper()]
print(words)
And the output is:
['EFGhIJK']
And here is a way to do it with regular expressions, which is, well, also Pythonic :-)
import re
words = re.findall(r'[A-Z]{3}[a-z][A-Z]{3}', s)

If you can't use regular expressions, maybe this for loop can do the trick:
if v.islower():
    if sentence[i-4].islower() and sentence[i+4].islower():
        for k in range(1,4):
            if sentence[i-k].islower() or sentence[i+k].islower():
                break
            if k == 3:
                return i
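The fragment above can be fleshed out into a complete function. What follows is only a minimal sketch, assuming the goal is to collect every 7-character hit; the name find_guarded_lowercase and the explicit bounds checks are illustrative additions, not part of the original answer. The bounds checks replace the try/except from the question and stop negative indexes from silently wrapping around to the end of the string.

```python
def find_guarded_lowercase(sentence):
    """Return every 7-character run: 3 capitals, 1 lowercase, 3 capitals (exactly)."""
    hits = []
    for i, v in enumerate(sentence):
        if not v.islower():
            continue
        # Three characters are needed on each side of position i.
        if i < 3 or i + 3 >= len(sentence):
            continue
        left = sentence[i - 3:i]
        right = sentence[i + 1:i + 4]
        if not all(c.isupper() for c in left + right):
            continue
        # "Exactly three": a fourth neighbour, if present, must not be a capital.
        before_ok = i - 4 < 0 or not sentence[i - 4].isupper()
        after_ok = i + 4 >= len(sentence) or not sentence[i + 4].isupper()
        if before_ok and after_ok:
            hits.append(sentence[i - 3:i + 4])
    return hits


print(find_guarded_lowercase("abcdEFGhIJKlmnop"))  # ['EFGhIJK']
```

Returning a list rather than the first index is a design choice made here so that a long input can report all hits in one pass.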

Regex is probably the easiest. Using a modified version of @Israel Unterman's answer that accounts for the outside edges and non-uppercase surroundings, the full regex might be:
s = 'abcdEFGhIJKlmnopABCdEFGGIddFFansTBDgRRQ'
import re
words = re.findall(r'(?:^|[^A-Z])([A-Z]{3}[a-z][A-Z]{3})(?:[^A-Z]|$)', s)
# words is ['EFGhIJK', 'TBDgRRQ']
Using non-capturing (?:...) groups keeps the check for the start or end of the string, or for a non-uppercase neighbour, out of the match groups, leaving only the desired tokens in the result list. This should account for all the conditions listed by the OP.
(removed all my prior code as it was generally *bad*)
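To see why the extra edge groups matter, the short comparison below runs the plain pattern and the guarded pattern on the same sample string. It is only an illustration; the variable names loose and strict are made up for this sketch.

```python
import re

s = 'abcdEFGhIJKlmnopABCdEFGGIddFFansTBDgRRQ'

# Plain pattern: also accepts runs that sit inside a longer block of capitals.
loose = re.findall(r'[A-Z]{3}[a-z][A-Z]{3}', s)

# Guarded pattern: each side must be bounded by a non-capital or the string edge.
strict = re.findall(r'(?:^|[^A-Z])([A-Z]{3}[a-z][A-Z]{3})(?:[^A-Z]|$)', s)

print(loose)   # ['EFGhIJK', 'ABCdEFG', 'TBDgRRQ']
print(strict)  # ['EFGhIJK', 'TBDgRRQ']
```

'ABCdEFG' is rejected by the guarded pattern because it is followed by a fourth capital, 'G'.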

Related

First recurring character problem in Python

I'm trying to solve a problem that I have with a recurring character problem.
I'm a beginner in development so I'm trying to think of ways I can do this.
thisWord = input()
def firstChar(thisWord):
    for i in range(len(thisWord)):
        for j in range(i+1, len(thisWord)):
            if thisWord[i] == thisWord[j]:
                return thisWord[i]
print(firstChar(thisWord))
This is what I came up with. In plenty of use cases, the result is fine. The problem I found after some fiddling around is that with a word like "statistics", where the "t" is the first recurring letter rather than the "s" because of the distance between the letters, my code counts the "s" first and returns that as the result.
I've tried weird solutions like measuring the entire string first for each possible case, creating variables for string length, and then comparing it to another variable, but I'm just ending up with more errors than I can handle.
Thank you in advance.
So you want to find the first letter that recurs in your text, with "first" being determined by the recurrence, not the first occurrence of the letter? To illustrate that with your "statistics" example, the t is the first letter that recurs, but the s had its first occurrence before the first occurrence of the t. I understand that in such cases, it's the t you want, not the s.
If that's the case, then I think a set is what you want, since it allows you to keep track of letters you've already seen before:
thisword = "statistics"
set_of_letters = set()
for letter in thisword:
    if letter not in set_of_letters:
        set_of_letters.add(letter)
    else:
        firstchar = letter
        break
print(firstchar)
Whenever you're looking at a certain character in the word, you should not check whether the character will occur again at all, but whether it has already occurred. The algorithmically optimal way would be to use a set to store and look up characters as you go, but it could just as well be done with your double loop. The second one should then become for j in range(i).
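As a rough sketch of that last suggestion, here is the double loop with the inner range changed to range(i), so each character is compared only with the characters before it. The function name first_recurring and the None return for words without a repeat are assumptions added for illustration.

```python
def first_recurring(word):
    """Return the first character that has already appeared earlier in word."""
    for i in range(len(word)):
        # Look backwards only: has word[i] been seen at an earlier index?
        for j in range(i):
            if word[i] == word[j]:
                return word[i]
    return None  # no character recurs


print(first_recurring("statistics"))  # t
```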
This is not an answer to your problem (one was already provided), but a suggestion for a better solution:
def firstChar(thisWord):
    occurrences: dict[str, int] = {char: 0 for char in thisWord}  # At the beginning no character has been seen yet
    for char in thisWord:
        occurrences[char] += 1  # You found this char
        if occurrences[char] == 2:  # This was already found one time before
            return char  # So you return it as the first duplicate
This works as expected:
>>> firstChar("statistics")
't'
EDIT:
occurrences: dict[str, int] = {char: 0 for char in thisWord}
This line of code creates a dictionary with the chars from thisWord as keys and 0 as values, so that you can use it to count the occurrences starting from 0 (before finding a char its count is 0).

Why does it recognize the second capital T as 0?

I'm trying to make a short program that will find all the capital letters in a single string. I got it to work for the first two capital letters but it won't return the correct position of the last capital letter. What did I do wrong?
def capital_indexes(n):
    listOfUpperPlaces = []
    for x in n:
        print(x)
        if x.isupper():
            characterPlace = n.index(x)
            print(characterPlace)
            listOfUpperPlaces.append(characterPlace)
    return listOfUpperPlaces
print(capital_indexes("TEsTo"))
That is because n.index(x) returns the first occurrence of x in the string n. Because "T" occurs multiple times, n.index(x) returns the index of the first "T" every time.
You want to iterate through range(len(n)), like:
def capital_indexes(n):
    listOfUpperPlaces = []
    for x in range(len(n)):
        print(n[x])
        if n[x].isupper():
            print(x)
            listOfUpperPlaces.append(x)
    return listOfUpperPlaces
print(capital_indexes("TEsTo"))
The issue is the call to n.index(x).
This searches the string to find x, and it's able to find a capital T right at the beginning of the string.
A better way to do this would be to use enumerate, which gives you both the index and the item at the same time.
Can't code very well from a phone, but something like:
for index, character in enumerate(n):
    if character.isupper():
        list_of_upper_places.append(index)
This will handle duplicates correctly, and will also be faster, since you don't need to search through the string just to count which character you are currently checking. It will be easier to read for most python programmers too.
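For completeness, a self-contained version of the enumerate approach could look like the sketch below. The parameter name text and the list comprehension are stylistic choices made here, not something taken from the answers above.

```python
def capital_indexes(text):
    """Return the index of every uppercase character in text."""
    # enumerate yields (index, character) pairs, so no search is needed.
    return [index for index, character in enumerate(text) if character.isupper()]


print(capital_indexes("TEsTo"))  # [0, 1, 3]
```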

My code is incorrectly removing a string from a larger string

"""
This code takes two strings and returns a copy of the first string with
all instances of the second string removed
"""
# This function removes the letter from the word in the event that the
# word has the letter in it
def remove_all_from_string(word, letter):
    while letter in word:
        find_word = word.find(letter)
        word_length = len(word)
        if find_word == -1:
            continue
        else:
            word = word[:find_word] + word[find_word + word_length:]
    return word
# This call of the function states the word and what letter will be
# removed from the word
print(remove_all_from_string("bananas", "an"))
This code is meant to remove a defined string from a larger defined string. In this case the larger string is "bananas" and the smaller string to be removed is "an".
In this case the smaller string is removed multiple times. I believe I am very close to the solution of getting the correct output, but I need the code to output "bas". Instead, it outputs "ba".
The code is supposed to remove all instances of "an" and print whatever is left, however it does not do this. Any help is appreciated.
Your word_length should be len(letter). Also, since the while condition already guarantees that the substring is in word, you don't need to test the value of find_word:
def remove_all_from_string(word, replacement):
    word_length = len(replacement)
    while replacement in word:
        find_word = word.find(replacement)
        word = word[:find_word] + word[find_word + word_length:]
    return word
Note that str.replace exists:
def remove_all_from_string(word, replacement):
    return word.replace(replacement, "")
You can simply use the .replace() function for python strings.
def remove_all_from_string(word, letter):
    word = word.replace(letter, "")
    return word
print(remove_all_from_string("bananas", "an"))
Output: bas
The Python language has built-in utilities to do that in a single expression.
The fact that you need to do this by hand suggests you are doing some exercise to better understand coding, and that is important. (Hint: to do it in one go, just use the string replace method.)
So, first thing: avoid using built-in tools that perform more than basic tasks. In your tentative code you are using the string find method. It is powerful, but combining it with slicing to find and remove all occurrences of a substring is harder than doing so step by step.
What you need is variables that record the state of your search and your result. Variables are "free": do not hesitate to create as many as you need, and update them inside the proper if blocks to keep track of your solution.
In this case, you can start with "position = 0" and increase it until you reach the end of the parent string. Check the character at that position: if it matches the first character of your substring, update other variables to indicate that you are "inside a match", and start a new "position_at_substring" index to track your progress through the substring. If at any point the character in the main string does not correspond to the character in the substring, it is not an occurrence and you bail out, copying the skipped characters to your result (so you also have to accumulate the skipped characters in a "match_check" string).
Build your code with the simplest 'while', 'if' and variable updates, and stick it all inside a function, so that once it works you can reuse it at will with no effort, and you will have learned a lot.
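One possible shape for that step-by-step approach is sketched below, using only while, if and index variables. It is a simplified take, not the only way to follow the advice: instead of a separate match_check buffer it counts matched characters in a variable called matched, and the parameter name sub is made up for this example.

```python
def remove_all_from_string(word, sub):
    """Return word with each non-overlapping occurrence of sub removed (single pass)."""
    if sub == "":
        return word  # nothing to remove
    result = ""
    position = 0
    while position < len(word):
        # Count how many characters of sub match, starting at position.
        matched = 0
        while (matched < len(sub)
               and position + matched < len(word)
               and word[position + matched] == sub[matched]):
            matched += 1
        if matched == len(sub):
            position += len(sub)  # full match: skip over it
        else:
            result += word[position]  # not an occurrence: keep this character
            position += 1
    return result


print(remove_all_from_string("bananas", "an"))  # bas
```

Because it scans the input once from left to right, this sketch never re-examines text that only becomes adjacent after a removal.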

How to change a single letter in input string

I'm a newbie in Python, so I have a question. I want to change a letter in a word if the first letter appears more than once. I also want to use input to get the word from the user. I'll present the problem using an example:
word = 'restart'
After changes the word should be like this:
word = 'resta$t'
I tried a couple of ideas but always got stuck. Is there a simple solution for this?
Thanks in advance.
EDIT: In response to Simas Joneliunas
It's not my homework. I've just finished reading some basic Python tutorials and I found some questions that I couldn't solve on my own. My first thought was to separate the word into single letters and then find the position of the letter I want to replace with "$". I wrote the code below, but I couldn't come up with a way to get to a specific position and replace it.
word = 'restart'
how_many = {}
for x in word:
    if x in how_many:
        how_many[x] += 1
    else:
        how_many[x] = 1
for y in how_many:
    if how_many[y] > 0:
        print(y, how_many[y])
Using str.replace:
s = "restart"
new_s = s[0] + s[1:].replace(s[0], "$")
Output:
'resta$t'
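Wrapped in a function, the same idea might look like the sketch below. The function name mask_first_letter_repeats, the mask parameter and the empty-string guard are illustrative additions; the guard is there because s[0] raises IndexError on an empty string.

```python
def mask_first_letter_repeats(word, mask="$"):
    """Replace later occurrences of the first letter of word with mask."""
    if not word:
        return word  # s[0] would fail on an empty string
    first = word[0]
    # Keep the first letter itself, then replace its repeats in the remainder.
    return first + word[1:].replace(first, mask)


print(mask_first_letter_repeats("restart"))  # resta$t
```

To read the word from the user, the call could be given input() as its argument.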
Try:
"".join(["$" if ch in word[:i] else ch for i, ch in enumerate(word)])
enumerate iterates through the string (i.e. a sequence of characters) and keeps a running index of the iteration.
word[:i] is the part of the string before the current index, i.e. the characters that have already appeared.
"$" if ch in word[:i] else ch means: replace the character at the current position with $ if it has already appeared earlier; otherwise keep the character.
"".join() joins the list of characters into a single string.
Note that this replaces every repeated character, not only repeats of the first letter, so 'restart' becomes 'resta$$'.
This is where the Python console is handy, since it lets you experiment. Because you have to keep track of which letters you have seen, a good visual approach is to start with a list of the alphabet. In the loop, first check whether the current letter is still in the list: if it is, remove it from the list; if it is not, the letter has appeared before, so replace it with $ as in the example above.
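A small sketch of that alphabet-list idea follows, assuming the input contains only lowercase ASCII letters; the name mask_repeats is hypothetical. Like the join-based one-liner, it masks every repeated letter, so it is a variation on the question rather than an exact match for the 'resta$t' output.

```python
import string


def mask_repeats(word):
    """Replace each letter that has already appeared with '$' (lowercase ASCII assumed)."""
    unseen = list(string.ascii_lowercase)  # letters not encountered yet
    result = []
    for letter in word:
        if letter in unseen:
            unseen.remove(letter)  # first sighting: keep it and mark it as seen
            result.append(letter)
        else:
            result.append("$")  # already removed from the list, so it is a repeat
    return "".join(result)


print(mask_repeats("restart"))  # resta$$
```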

How to find a grouping of upper and lower case characters in a string in python 3

I'm working on Python Challenges, and the level I'm on asks us to find a lower case letter surrounded on both sides by exactly three upper case letters. I've written the following code, which seems pretty crude but I feel should work. However, all I get is an empty string.
source="Hello there" #the string I have to work with
key=""#where I want to put the characters that fit
for i in source:
    if i==i.lower(): # if it's lowercase
        x=source.index(i) #makes number that's the index of i
        if source[x-1].upper()==source[x-1] and source[x-2]==source[x-2].upper() and source[x-3].upper()==source[x-3]: #checks that the three characters before it are upper case
            if source[x+1].upper()==source[x+1] and source[x+2].upper()==source[x+2] and source[x+3].upper()==source[x+3]: #checks three characters after are uppercase
                if source[x+4].lower()==source[x+4] and source[x-4].lower()==source[x-4]: #checks that the fourth characters are lowercase
                    key+=i #adds the character to key
print(key)
I know this is really, really messy but I don't understand why it just returns an empty string. If you have any idea what's wrong, or a more efficient way to do it, I would really appreciate it. Thanks
This is far, far easier with a regular expression.
re.findall(r'(?<![A-Z])[A-Z]{3}([a-z])(?=[A-Z]{3}(?:\Z|[^A-Z]))', text)
Here's how it works:
(?<![A-Z]) is a negative lookbehind assertion which makes sure that we are not preceded by an upper-case letter.
[A-Z]{3} is three upper-case letters.
([a-z]) is the lower-case letter we are looking for.
(?=[A-Z]{3}(?:\Z|[^A-Z])) is a lookahead assertion that makes sure we are followed by three upper-case letters but not four.
You will probably need to change the grouping depending on what you actually want to find. This finds the lower-case letter.
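A runnable version of the call above might look like this. The sample text is made up for illustration, and the pattern is stored in a variable only for readability.

```python
import re

# Sample input, assumed for illustration only.
text = "abcdEFGhIJKlmnopABCdEFGGIddFFansTBDgRRQ"

pattern = r'(?<![A-Z])[A-Z]{3}([a-z])(?=[A-Z]{3}(?:\Z|[^A-Z]))'
letters = re.findall(pattern, text)
print(letters)  # ['h', 'g']
```

The 'd' in 'ABCdEFGG' is skipped because four capitals follow it, which is what the inner (?:\Z|[^A-Z]) part of the lookahead rules out.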
I would suggest using the itertools.groupby method with a keyfunc to distinguish lowercase from capital letters.
First you need a helper function to refactor the check logic:
def check(subseq):
    return (subseq[0][0] and len(subseq[0][1]) == 3
            and len(subseq[1][1]) == 1
            and len(subseq[2][1]) == 3)
Then group up and check:
from itertools import groupby

def findNeedle(mystr):
    # Group consecutive characters by whether they are uppercase
    seq = [(k,list(g)) for k,g in groupby(mystr, str.isupper)]
    for i in range(len(seq) - 2):
        if check(seq[i:i+3]):
            return seq[i+1][1][0]
Inspect seq in the interpreter to see how this works, it should be very clear.
Edit: Some typos, I didn't test the code.
Now a test:
>>> findNeedle("Hello there HELxOTHere")
'x'
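To make the suggestion about inspecting seq concrete, the snippet below prints the grouped structure for a short sample string. The groups are joined into strings here purely for readability, whereas the answer keeps them as lists.

```python
from itertools import groupby

# Consecutive characters are grouped by the result of str.isupper.
seq = [(k, "".join(g)) for k, g in groupby("HELxOTHere", str.isupper)]
print(seq)
# [(True, 'HEL'), (False, 'x'), (True, 'OTH'), (False, 'ere')]
```

The needle is the middle element of any three neighbouring groups whose lengths are 3, 1 and 3 and whose first group is uppercase.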
