Ignore space in python - python

I need this code to ignore (not replace) spaces. Basically, it should capitalise every second letter of the alphabet only.
def spaces(test_space):
    text = (str.lower, str.upper)
    return ''.join(text[i%2](x) for i, x in enumerate(test_space))
print(spaces('Ignore spaces and other characters'))
print(spaces('Ignore spaces and 3rd characters!'))
Output
iGnOrE SpAcEs aNd oThEr cHaRaCtErS
iGnOrE SpAcEs aNd 3Rd cHaRaCtErS!

This sounds like homework, so I'm only going to give suggestions and resources, not complete code.
One way to do this would be to:
Replace every space with two of some character that can't appear in your text, for example "$$". This can easily be done with Python's str.replace method. We replace a space with two characters because each space "throws off" the index by one (mod 2), so replacing each space with two characters corrects the problem (since 2 mod 2 = 0).
Capitalize every other character using your current program.
Replace each occurrence of '$$' with a space.
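(If you would rather not use a placeholder, remove the spaces while saving their indexes, capitalize every other character, and put the spaces back using the indexes you saved.)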
Output: iGnOrE sPaCeS aNd 3rD cHaRaCtErS!
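For readers who want to check the placeholder steps above, here is a minimal sketch. It assumes the text never contains "$"; the names alternate_case and spaces_with_placeholder are illustrative and do not come from the question.

```python
def alternate_case(text):
    # The asker's original logic: lower-case even indexes, upper-case odd ones.
    funcs = (str.lower, str.upper)
    return "".join(funcs[i % 2](ch) for i, ch in enumerate(text))


def spaces_with_placeholder(text, placeholder="$$"):
    # Assumption: the placeholder character never occurs in the input.
    if placeholder[0] in text:
        raise ValueError("placeholder character occurs in the text")
    padded = text.replace(" ", placeholder)   # each space becomes two characters
    toggled = alternate_case(padded)          # reuse the existing program
    return toggled.replace(placeholder, " ")  # restore the spaces


print(spaces_with_placeholder("Ignore spaces and 3rd characters!"))
# iGnOrE sPaCeS aNd 3rD cHaRaCtErS!
```

Note that digits and punctuation still take up a position in the alternation with this approach; only spaces are neutralised.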
Alternatively, you could iterate through the string (using a loop), keeping a position counter, but use a regex to ignore all non-alphabetic characters in the counter. You could probably also accomplish this succinctly with a list comprehension, although it might be more confusing to read.
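The counter idea can be sketched roughly as follows. This version swaps the suggested regex for str.isalpha(), which is a simpler way to decide whether a character is a letter; the function name alternate_letters is made up for illustration.

```python
def alternate_letters(text):
    out = []
    letter_count = 0  # counts letters only, so spaces and digits never shift the pattern
    for ch in text:
        if ch.isalpha():
            out.append(ch.upper() if letter_count % 2 else ch.lower())
            letter_count += 1
        else:
            out.append(ch)  # non-letters are copied through unchanged
    return "".join(out)


print(alternate_letters("Ignore spaces and 3rd characters!"))
# iGnOrE sPaCeS aNd 3Rd ChArAcTeRs!
```

Unlike the placeholder approach, the digit 3 does not consume a position here, so the result differs after "3".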

def spaces(test_space):
    return " ".join(
        [
            "".join(
                char.upper() if i % 2 == 1 else char.lower()
                for i, char in enumerate(word)
            )
            for word in test_space.split()
        ]
    )
outputs
iGnOrE sPaCeS aNd oThEr cHaRaCtErS

Related

Swap last two characters in a string, make it lowercase, and add a space

I'm trying to take the last two letters of a string, swap them, make them lowercase, and leave a space in the middle. For some reason the output gives me white space before the word.
For example, if the input was APPLE then the output should be e l
It would also be nice to ignore non-letter characters, so if the word was App3e then the output would be e p
def last_Letters(word):
    last_two = word[-2:]
    swap = last_two[-1:] + last_two[:1]
    for i in swap:
        if i.isupper():
            swap = swap.lower()
    return swap[0]+ " " +swap[1]
word = input(" ")
print(last_Letters(word))
You can try with the following function:
import re
def last_Letters(word):
    letters = re.sub(r'\d', '', word)
    if len(letters) > 1:
        return letters[-1].lower() + ' ' + letters[-2].lower()
    return None
It follows these steps:
removes all the digits
if at least two characters remain:
  lowercases the last two characters
  builds the required string by concatenating the last letter, a space and the second-to-last letter
  and returns the string
otherwise returns None
Since I said there was a simpler way, here's what I would write:
text = input()
result = ' '.join(reversed([ch.lower() for ch in text if ch.isalpha()][-2:]))
print(result)
How this works:
[ch.lower() for ch in text] creates a list of lowercase characters from some iterable text
adding if ch.isalpha() filters out anything that isn't an alphabetical character
adding [-2:] selects the last two from the preceding sequence
and reversed() takes the sequence and returns an iterable with the elements in reverse
' '.join(some_iterable) will join the characters in the iterable together with spaces in between.
So, result is set to be the last two characters of all of the alphabetical characters in text, in reverse order, separated by a space.
Part of what makes Python so powerful and popular is that, once you learn to read the syntax, the code very naturally tells you exactly what it is doing. If you read the statement out loud, it is self-describing.
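To try the one-liner without typing input, it can be wrapped in a function. This is only a sketch; the name last_two_swapped is hypothetical.

```python
def last_two_swapped(text):
    # Keep letters only, lower-cased, then take the last two in reverse order.
    letters = [ch.lower() for ch in text if ch.isalpha()]
    return " ".join(reversed(letters[-2:]))


print(last_two_swapped("APPLE"))  # e l
print(last_two_swapped("App3e"))  # e p
```

With fewer than two letters the slice simply returns what is there, so a single-letter input gives that letter back and an empty input gives an empty string.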

Why I am getting these different outputs?

print('xyxxyyzxxy'.lstrip('xyy'))
# output:zxxy
print("xyxefgooeeee".lstrip("efg"))
# output:xyxefgooeeee
print('reeeefooeeee'.lstrip('eeee'))
# output:reeeefooeeee
For the last two print statements I expected output similar to the first one, where 'xyxxyy' was stripped, but they are not stripped in the same way. Why is that?
In Python, .lstrip('xyy') removes leading characters that appear in 'xyy'. For example:
txt = ",,,,,ssaaww.....banana"
x = txt.lstrip(",.asw")
print(x)
The output will be: banana
string.lstrip(chars) removes characters from the left side of the string until it reaches a character that does not appear in chars.
In your second and third examples, the first character of the string does not appear in chars, so no characters are removed from the string.
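The behaviour described above can be modelled in a few lines. This is a sketch of the idea, not how CPython implements it, and it assumes chars is always given (the real method strips whitespace when called without an argument); lstrip_chars is an illustrative name.

```python
def lstrip_chars(s, chars):
    allowed = set(chars)  # order and repetition in chars are irrelevant
    i = 0
    # Advance past leading characters for as long as they are in the set.
    while i < len(s) and s[i] in allowed:
        i += 1
    return s[i:]


print(lstrip_chars("xyxxyyzxxy", "xyy"))     # zxxy
print(lstrip_chars("xyxefgooeeee", "efg"))   # xyxefgooeeee
print(lstrip_chars("reeeefooeeee", "eeee"))  # reeeefooeeee
```

In the last two calls the loop stops immediately because the first character ('x' or 'r') is not in the set.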
I just learned that lstrip() removes all combinations of the characters passed as its argument from the left-hand side.
I think because the order of char doesn't matter.
xyy or yxx will result in the same thing. It will remove chars from the left side until it sees a char that is not included. For example:
print('xyxxyyzxxy'.lstrip('xyy'))
zxxy
print('xyxxyyzxxy'.lstrip('yxx'))
zxxy
In fact, if you only used 2 chars 'xy' or 'yx' you will get the same thing:
print('xyxxyyzxxy'.lstrip('xy'))
zxxy
In the other cases, the first left char is not included, therefore there's no stripping
lstrip treats its argument as a set of characters and removes those characters from the start (left side) of the string.
print('xyxefgooeeee'.lstrip('yxefg'))
"""In 'xyxefgooeeee' the first char is 'x', which exists in 'yxefg', so it
is removed. lstrip then moves on through 'y', 'x', 'e', 'f', 'g' (all
removed) and stops at 'o', which does not exist in 'yxefg'. It therefore
returns the string starting at that 'o'.
"""
Output: ooeeee
print('xyxefgooeeee'.lstrip('efg'))
"""In 'xyxefgooeeee' the first char 'x' does not exist in 'efg', so it is
not removed, lstrip does not move on to the next char, and the entire
original string is returned.
"""
Output: xyxefgooeeee

How can I delete the letter in string

How can I remove a letter from a string in Python?
For example, given the word "study", I want a list like this: "tudy", "stdy", "stuy", "stud".
I have to use something like
for i in range(len(string)):
    sublist.append(string0.replace(string[i], ""))
It works well. However, if I change the word to "studys", replacing s with "" makes both s characters disappear, and it no longer works (I get tudy instead of study/tudys). I need help.
Here's one:
s = 'studys'
lst = [s[:i] + s[i + 1:] for i in range(len(s))]
print(lst)
Output:
['tudys', 'sudys', 'stdys', 'stuys', 'studs', 'study']
Explanation:
Your code did not work because replace finds all the occurrences of the character in the word, and replaces them with the character you want. Now you can specify the number of counts to replace, as someone suggested in the comments, but even then replace checks the string from the beginning. So if you said, string.replace('s','',1) it will check the string from the start and as soon as it finds the first 's' it will replace it with '' and break, so you will not get the intended effect of removing the character at the current index.
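A small comparison may make the difference concrete. It is only a sketch using the word from the question; the variable names by_replace and by_slice are made up.

```python
s = "studys"

# replace(..., 1) removes the FIRST occurrence of the character, wherever i points.
by_replace = [s.replace(s[i], "", 1) for i in range(len(s))]

# Slicing removes exactly the character at index i.
by_slice = [s[:i] + s[i + 1:] for i in range(len(s))]

print(by_replace[-1])  # tudys (the leading 's' was removed, not the trailing one)
print(by_slice[-1])    # study
```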

Python - string index out of range issue

This is the question I was given to solve:
Create a program that inputs a phrase (like a famous quotation) and prints all of the words that start with h-z.
I solved the problem, but the first two methods didn't work and I wanted to know why:
#1 string index out of range
quote = input("enter a 1 sentence quote, non-alpha separate words: ")
word = ""
for character in quote:
    if character.isalpha():
        word += character.upper()
    else:
        if word[0].lower() >= "h":
            print(word)
            word = ""
        else:
            word = ""
I get the IndexError: string index out of range message for any words after "g". Shouldn't the else statement catch it? I don't get why it doesn't, because if I remove the brackets [] from word[0], it works.
#2: last word not printing
quote = input("enter a 1 sentence quote, non-alpha separate words: ")
word = ""
for character in quote:
    if character.isalpha():
        word += character.upper()
    else:
        if word.lower() >= "h":
            print(word)
            word = ""
        else:
            word = ""
In this example, it works to a degree. It eliminates any words before 'h' and prints words after 'h', but for some reason doesn't print the last word. It doesn't matter what quote I use, it doesn't print the last word even if it's after 'h'. Why is that?
You're indexing word[0]. This accesses the first character of the string word. If word is empty (that is, word == ""), there is no first character to access, so you get an IndexError. This happens whenever a non-alphabetic character (e.g. a space, a digit or a dash) is reached while word is still empty, for example at the start of the quote or right after another separator. The else branch can't catch it, because the exception is raised while the if condition itself is being evaluated. Without the [0], the empty string is simply compared with "h", which is a valid comparison (and False).
The second problem, where your second code snippet leaves off the last word, comes from the approach you're using. It looks like you're trying to walk through the sentence character by character and decide whether to print a word after having read through it (which you know because you hit a space character). But this means the last word is never printed: the last character in your sentence isn't a space, it's just the last letter of the last word, so your else branch is never executed for it.
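If you want to keep the character-by-character loop, both problems can be patched with two small changes. The sketch below is one possible fix, not the only one: it guards against an empty word and appends a space as a sentinel so the last word is flushed. It collects the words in a list (the name collect_h_to_z is hypothetical) instead of printing inside the loop.

```python
def collect_h_to_z(quote):
    found = []
    word = ""
    # The trailing space is a sentinel: it forces the final word to be checked.
    for character in quote + " ":
        if character.isalpha():
            word += character.upper()
        else:
            # "word and ..." skips empty words, so word[0] can never fail.
            if word and word[0].lower() >= "h":
                found.append(word)
            word = ""
    return found


print(collect_h_to_z("go  with heart"))  # ['WITH', 'HEART']
```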
I'd recommend using an entirely different approach, using the method string.split(). This method is built-in to python and will transform one string into a list of smaller strings, split across the character/substring you specify. So if I do
quote = "Hello this is a sentence"
words = quote.split(' ')
print(words)
you'll end up seeing this:
['Hello', 'this', 'is', 'a', 'sentence']
A couple of things to keep in mind on your next approach to this problem:
You need to account for empty words (like if I have two spaces in a row for some reason), and make sure they don't break the script.
You need to account for non-alphabetic characters like numbers and dashes. You can either ignore them or handle them differently, but you have to have something in place.
You need to make sure that you handle the last word at some point, even if the sentence doesn't end in a space character.
Good luck!
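One way the three points above might be handled together is sketched below. It uses re.split on runs of non-letters rather than a plain quote.split(' '), which is a substitution on my part; it assumes only ASCII letters count as word characters, and the name words_h_to_z is illustrative.

```python
import re


def words_h_to_z(quote):
    # Splitting on runs of non-letters handles repeated spaces, digits and dashes;
    # the "if w" filter drops empty strings at either end of the quote.
    words = [w for w in re.split(r"[^A-Za-z]+", quote) if w]
    return [w.upper() for w in words if w[0].lower() >= "h"]


for w in words_h_to_z("Wheresoever you go, go with all your heart"):
    print(w)
```

Because the whole quote is split up front, the last word needs no special treatment.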
Instead of what you're doing, you can iterate over each word in the string and check whether it begins with one of those letters. Read about str.split(): its parameter is the separator, in this case ' ' since you want the individual words, and it returns a list of strings. Iterate over that list in a loop and it should work.

Split a string using a list of strings as a pattern

Consider an input string :
mystr = "just some stupid string to illustrate my question"
and a list of strings indicating where to split the input string:
splitters = ["some", "illustrate"]
The output should look like
result = ["just ", "some stupid string to ", "illustrate my question"]
I wrote some code which implements the following approach. For each of the strings in splitters, I find its occurrences in the input string, and insert something which I know for sure would not be a part of my input string (for example, this '!!'). Then I split the string using the substring that I just inserted.
for s in splitters:
    mystr = re.sub(r'(%s)'%s,r'!!\1', mystr)
result = re.split('!!', mystr)
This solution seems ugly, is there a nicer way of doing it?
Splitting with re.split will always remove the matched string from the output (NB, this is not quite true, see the edit below). Therefore, you must use a positive lookahead expression ((?=...)) to match without removing the match. However, re.split ignores empty matches (this changed in Python 3.7, which does split on empty matches), so simply using a lookahead expression doesn't work on older versions. Instead, you will lose at least one character at each split (even trying to trick re with "boundary" matches (\b) does not work). If you don't care about losing one whitespace / non-word character at the end of each item (assuming you only split at non-word characters), you can use something like
re.split(r"\W(?=some|illustrate)", mystr)
which would give
["just", "some stupid string to", "illustrate my question"]
(note that the spaces after just and to are missing). You could then programmatically generate these regexes using str.join. Note that each of the split markers is escaped with re.escape so that special characters in the items of splitters do not affect the meaning of the regular expression in any undesired ways (imagine, e.g., a ) in one of the strings, which would otherwise lead to a regex syntax error).
the_regex = r"\W(?={})".format("|".join(re.escape(s) for s in splitters))
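On Python 3.7 and later, re.split does split on empty matches, so a bare lookahead appears to be enough and no character is lost. The sketch below assumes such a version and reuses the question's mystr and splitters; the name lookahead is made up.

```python
import re

mystr = "just some stupid string to illustrate my question"
splitters = ["some", "illustrate"]

# Zero-width lookahead: matches the position just before any splitter.
lookahead = "(?={})".format("|".join(re.escape(s) for s in splitters))
result = re.split(lookahead, mystr)

print(result)
# ['just ', 'some stupid string to ', 'illustrate my question']
```

If the string begins with one of the splitters, the first item of the result is an empty string, which you may want to filter out.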
Edit (HT to @Arkadiy): Grouping the actual match, i.e. using (\W) instead of \W, returns the non-word characters as separate items in the list. Joining every two subsequent items then produces the desired list as well. You can then also drop the requirement of having a non-word character by using (.) instead of \W:
the_new_regex = r"(.)(?={})".format("|".join(re.escape(s) for s in splitters))
the_split = re.split(the_new_regex, mystr)
the_actual_split = ["".join(x) for x in itertools.izip_longest(the_split[::2], the_split[1::2], fillvalue='')]
Because normal text and auxiliary characters alternate, the_split[::2] contains the normal split text and the_split[1::2] the auxiliary characters. Then, itertools.izip_longest is used to combine each text item with the corresponding removed character, and the last item (which has no counterpart among the removed characters) with fillvalue, i.e. ''. Then, each of these tuples is joined using "".join(x). Note that this requires itertools to be imported (you could of course do this in a simple loop, but itertools provides very clean solutions to these things). Also note that itertools.izip_longest is called itertools.zip_longest in Python 3.
This leads to a further simplification of the regular expression, because instead of using auxiliary characters, the lookahead can be replaced with a simple matching group ((some|illustrate) instead of (.)(?=some|illustrate)):
the_newest_regex = "({})".format("|".join(re.escape(s) for s in splitters))
the_raw_split = re.split(the_newest_regex, mystr)
the_actual_split = ["".join(x) for x in itertools.izip_longest([""] + the_raw_split[1::2], the_raw_split[::2], fillvalue='')]
Here, the slice indices on the_raw_split have swapped, because now each captured splitter (the odd-indexed items) must be prepended to the text item that follows it instead of appended to the one before. Also note the [""] + part, which is necessary to pair the first text item with "" to fix the order.
(end of edit)
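For Python 3, the capturing-group variant can be written with itertools.zip_longest. This is a sketch of the same idea wrapped in a function; the name split_keeping_splitters is hypothetical.

```python
import itertools
import re


def split_keeping_splitters(mystr, splitters):
    pattern = "({})".format("|".join(re.escape(s) for s in splitters))
    raw = re.split(pattern, mystr)  # [text, splitter, text, splitter, ..., text]
    # Pair each splitter with the text that follows it; the first text gets "".
    pairs = itertools.zip_longest([""] + raw[1::2], raw[::2], fillvalue="")
    return ["".join(pair) for pair in pairs]


print(split_keeping_splitters(
    "just some stupid string to illustrate my question",
    ["some", "illustrate"],
))
# ['just ', 'some stupid string to ', 'illustrate my question']
```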
Alternatively, you can (if you want) use string.replace instead of re.sub for each splitter (I think that is a matter of preference in your case, but in general it is probably more efficient)
for s in splitters:
    mystr = mystr.replace(s, "!!" + s)
Also, if you use a fixed token to indicate where to split, you do not need re.split, but can use string.split instead:
result = mystr.split("!!")
What you could also do (instead of relying on the replacement token not to be in the string anywhere else or relying on every split position being preceded by a non-word character) is finding the split strings in the input using string.find and using string slicing to extract the pieces:
def split(string, splitters):
    while True:
        # Get the positions to split at for all splitters still in the string
        # that are not at the very front of the string
        split_positions = [i for i in (string.find(s) for s in splitters) if i > 0]
        if len(split_positions) > 0:
            # There is still somewhere to split
            next_split = min(split_positions)
            yield string[:next_split] # Yield everything before that position
            string = string[next_split:] # Retain the rest of the string
        else:
            yield string # Yield the rest of the string
            break # Done.
Here, [i for i in (string.find(s) for s in splitters) if i > 0] generates a list of positions where the splitters can be found, for all splitters that are in the string (for this, i < 0 is excluded) and not right at the beginning (where we (possibly) just split, so i == 0 is excluded as well). If there are any left in the string, we yield (this is a generator function) everything up to (excluding) the first splitter (at min(split_positions)) and replace the string with the remaining part. If there are none left, we yield the last part of the string and exit the function. Because this uses yield, it is a generator function, so you need to use list to turn it into an actual list.
Note that you could also replace yield whatever with a call to some_list.append (provided you defined some_list earlier) and return some_list at the very end. I do not consider that to be very good code style, though.
TL;DR
If you are OK with using regular expressions, use
the_newest_regex = "({})".format("|".join(re.escape(s) for s in splitters))
the_raw_split = re.split(the_newest_regex, mystr)
the_actual_split = ["".join(x) for x in itertools.izip_longest([""] + the_raw_split[1::2], the_raw_split[::2], fillvalue='')]
else, the same can also be achieved using string.find with the following split function:
def split(string, splitters):
    while True:
        # Get the positions to split at for all splitters still in the string
        # that are not at the very front of the string
        split_positions = [i for i in (string.find(s) for s in splitters) if i > 0]
        if len(split_positions) > 0:
            # There is still somewhere to split
            next_split = min(split_positions)
            yield string[:next_split] # Yield everything before that position
            string = string[next_split:] # Retain the rest of the string
        else:
            yield string # Yield the rest of the string
            break # Done.
Not especially elegant but avoiding regex:
mystr = "just some stupid string to illustrate my question"
splitters = ["some", "illustrate"]
indexes = [0] + [mystr.index(s) for s in splitters] + [len(mystr)]
indexes = sorted(list(set(indexes)))
print([mystr[i:j] for i, j in zip(indexes[:-1], indexes[1:])])
# ['just ', 'some stupid string to ', 'illustrate my question']
I should acknowledge here that a little more work is needed if a word in splitters occurs more than once because str.index finds only the location of the first occurrence of the word...
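The extra work for repeated occurrences might look like the sketch below, which repeatedly calls str.find with a start offset to collect every position. It is an illustration under that assumption rather than a tested extension of the answer; split_at_all is a made-up name.

```python
def split_at_all(mystr, splitters):
    cuts = {0, len(mystr)}  # a set removes duplicate cut positions
    for s in splitters:
        start = mystr.find(s)
        while start != -1:
            cuts.add(start)
            start = mystr.find(s, start + 1)  # look for the next occurrence
    ordered = sorted(cuts)
    return [mystr[i:j] for i, j in zip(ordered[:-1], ordered[1:])]


print(split_at_all("a some b some c", ["some"]))
# ['a ', 'some b ', 'some c']
```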
