Print words in between two keywords - python

I am trying to write code in Python that prints all the words in between two keywords.
scenario = "This is a test to see if I can get Python to print out all the words in between Python and words"
go = False
start = "Python"
end = "words"
for line in scenario:
    if start in line: go = True
    elif end in line:
        go = False
        continue
    if go: print(line)
I want it to print "to print out all the".

A slightly different approach: create a list in which each element is a word of the sentence, then use list.index() to find the positions where the start and end words first occur. We can then return the words in the list between those indices. Since we want a string back rather than a list, we join them together with a space.
# list of words ['This', 'is', 'a', 'test', ...]
words = scenario.split()
# list of words between start and end ['to', 'print', ..., 'the']
matching_words = words[words.index(start)+1:words.index(end)]
# join back to one string with spaces between
' '.join(matching_words)
Result:
to print out all the
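One caveat with list.index(): it raises ValueError when a keyword is absent, and a plain second call searches from the beginning of the list rather than from the start keyword. Here is a minimal sketch that handles both cases. The function name words_between and the choice to return an empty string on failure are my own assumptions, not part of the answer above.

```python
def words_between(text, start, end):
    """Return the words between the first `start` and the next `end`."""
    words = text.split()
    try:
        start_idx = words.index(start) + 1
        # Begin the search for `end` only after `start` was found.
        end_idx = words.index(end, start_idx)
    except ValueError:
        # A keyword is missing, or `end` never follows `start`.
        return ""
    return " ".join(words[start_idx:end_idx])


scenario = "This is a test to see if I can get Python to print out all the words in between Python and words"
print(words_between(scenario, "Python", "words"))
```

The second argument to index() is the position to start searching from, which keeps an earlier occurrence of the end keyword from producing an empty or backwards slice.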

Your initial problem is that you're iterating over scenario, the string, character by character instead of splitting it into separate words (use scenario.split()). There are also other issues, such as switching to searching for the end word only once the start has been found. Instead, you might like to use index to find the two strings and then slice the string:
scenario = "This is a test to see if I can get Python to print out all the words in between Python and words"
start = "Python"
end = "words"
start_idx = scenario.index(start)
end_idx = scenario.index(end)
print(scenario[start_idx + len(start):end_idx].strip())
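If you would rather keep the loop from the question, it can be made to work by iterating over words and tracking whether the start keyword has been seen yet. This is only a sketch: the function name words_between_loop is made up for illustration, and if the end keyword never appears it simply returns everything after the start keyword.

```python
def words_between_loop(text, start, end):
    collected = []
    go = False  # becomes True once the start keyword has been seen
    for word in text.split():
        if not go:
            if word == start:
                go = True
        elif word == end:
            break  # stop at the first end keyword after the start
        else:
            collected.append(word)
    return " ".join(collected)


scenario = "This is a test to see if I can get Python to print out all the words in between Python and words"
print(words_between_loop(scenario, "Python", "words"))
```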

You can accomplish this with a simple regex
import re
txt = "This is a test to see if I can get Python to print out all the words in between Python and words"
x = re.search(r"(?<=Python\s).*?(?=\s+words)", txt)
print(x.group())  # to print out all the
Here is the regex in action --> REGEX101
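If the keywords live in variables, the pattern can be built from them. The sketch below assumes the keywords are ordinary words (so that \b word boundaries make sense) and that at least one word sits between them. The name between_regex is hypothetical.

```python
import re


def between_regex(text, start, end):
    # re.escape guards against keywords containing regex metacharacters.
    pattern = r"\b{}\b\s+(.*?)\s+\b{}\b".format(re.escape(start), re.escape(end))
    match = re.search(pattern, text)
    return match.group(1) if match else ""


txt = "This is a test to see if I can get Python to print out all the words in between Python and words"
print(between_regex(txt, "Python", "words"))
```

The word boundaries keep "words" from matching inside a longer word such as "keywords", which the plain lookahead version would not guard against.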

Split the string and go over it word by word to find the index at which the two keywords occur. Once you have those two indices, combine the list between those indices into a string.
scenario = 'This is a test to see if I can get Python to print out all the words in between Python and words'
start_word = 'Python'
end_word = 'words'
# Split the string into a list of words
words = scenario.split()
# Find start and end indices
start = words.index(start_word) + 1
end = words.index(end_word)
# Construct a string from the words between `start` and `end`
result = ' '.join(words[start:end])
# Print the result
print(result)

Related

How to avoid .replace replacing a word that was already replaced

Given a string, I have to reverse every word while keeping each word in its place.
I tried:
def backward_string_by_word(text):
    for word in text.split():
        text = text.replace(word, word[::-1])
    return text
But if I have the string Ciao oaiC, when it tries to reverse the second word, that word is identical to the first one after it has already been reversed, so it gets replaced again. How can I avoid this?
You can use join in one line plus generator expression:
text = "test abc 123"
text_reversed_words = " ".join(word[::-1] for word in text.split())
s.replace(x, y) is not the correct method to use here:
It does two things:
find x in s
replace it with y
But you do not really need to find anything here, since you already have the word you want to replace. The problem is that replace starts searching for x from the beginning of the string each time, not at the position you are currently at, so it also finds the word you have already replaced, not just the one you want to replace next.
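To make the failure concrete, here are the two replace calls that the loop from the question performs on "Ciao oaiC", written out by hand (the names step1 and step2 are just for illustration):

```python
text = "Ciao oaiC"

# First iteration: every "Ciao" becomes "oaiC".
step1 = text.replace("Ciao", "oaiC")
print(step1)  # oaiC oaiC

# Second iteration: every "oaiC" becomes "Ciao", including the one
# produced by the first iteration.
step2 = step1.replace("oaiC", "Ciao")
print(step2)  # Ciao Ciao
```

The expected result was "oaiC Ciao", but because replace works on the whole string, the first word gets flipped back.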
The simplest solution is to collect the reversed words in a list, and then build a new string out of this list by concatenating all reversed words. You can concatenate a list of strings and separate them with spaces by using ' '.join().
def backward_string_by_word(text):
    reversed_words = []
    for word in text.split():
        reversed_words.append(word[::-1])
    return ' '.join(reversed_words)
If you have understood this, you can also write it more concisely by skipping the intermediate list with a generator expression:
def backward_string_by_word(text):
    return ' '.join(word[::-1] for word in text.split())
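One thing to be aware of: split() followed by ' '.join() collapses runs of whitespace (tabs, double spaces, newlines) into single spaces. If the original spacing matters, a possible alternative is re.sub with a function as the replacement. This is a sketch of a different technique than the one above, and the name backward_keep_spacing is made up.

```python
import re


def backward_keep_spacing(text):
    # \S+ matches each run of non-whitespace characters, i.e. each word;
    # the whitespace between matches is left untouched.
    return re.sub(r"\S+", lambda m: m.group(0)[::-1], text)


print(backward_keep_spacing("Ciao  oaiC\tabc"))
```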
Splitting a string converts it to a list. You can just reassign each value of that list to the reverse of that item. See below:
text = "The cat tac in the hat"
def backwards(text):
    split_word = text.split()
    for i in range(len(split_word)):
        split_word[i] = split_word[i][::-1]
    return ' '.join(split_word)
print(backwards(text))

(python) I keep getting an IndexError: string index out of range

I am trying to solve the question
Implement the mapper, mapFileToCount, which takes a string (text from a file) and returns the number of capitalized words in that string. A word is defined as a series of characters separated from other words by either a space or a newline. A word is capitalized if its first letter is capitalized (A vs a).
and my python code currently reads
def mapFileToCount(s):
    lines = (str(s)).splitlines()
    words = (str(lines)).split(" ")
    up = 0
    for word in words:
        if word[0].isupper() == True:
            up = up + 1
    return up
However I keep getting the error IndexError: string index out of range
please help
Right now, given the input "Hi huy \n hi you there":
lines will be ['Hi huy ', ' hi you there']
words will be ["['Hi", 'huy', "',", "'", 'hi', 'you', "there']"], because you split str(lines), the string representation of the list, rather than the text itself.
I'd suggest you split on any whitespace at once with words = re.split(r"\s+", s).
The IndexError then comes from cases like "Hi where are you  " (with trailing spaces): the split leaves an empty string at the end, and you can't access the first character of an empty string. So just add a condition to the if:
if word, because an empty string is falsy and any other string is truthy
if word[0].isupper() for your test
import re
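def mapFileToCount(s):
    # Raw string so that \s reaches the regex engine unchanged
    words = re.split(r"\s+", s)
    up = 0
    for word in words:
        # Skip empty strings before looking at the first character
        if word and word[0].isupper():
            up = up + 1
    return up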
The "string index out of range" error means that the index you are trying to access does not exist in the string. In other words, you're asking for the character at a position that lies outside the string.
In your code it's word[0], which fails when word is an empty string.
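As a side note, str.split() called with no arguments splits on any run of whitespace (spaces, tabs, newlines) and never returns empty strings, so word[0] is always safe. A minimal sketch using that behavior follows; the name count_capitalized is my own, not from the exercise.

```python
def count_capitalized(s):
    # split() with no arguments discards leading, trailing and repeated
    # whitespace, so every `word` has at least one character.
    return sum(1 for word in s.split() if word[0].isupper())


print(count_capitalized("Hi huy \n hi you there"))
```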

Python - Capture string with or without specific character

I am trying to capture the part of a sentence that comes after a specific word. The sentences in my code all differ, and a sentence doesn't necessarily contain the word to split by. If the word doesn't appear, I just need something like a blank string or an empty list.
Example 1: working
my_string="Python is a amazing programming language"
print(my_string.split("amazing",1)[1])
programming language
Example 2:
my_string="Java is also a programming language."
print(my_string.split("amazing",1)[1]) # amazing word doesn't appear in the sentence.
Error: IndexError: list index out of range
Output needed :empty string or list ..etc.
I tried something like this, but it still fails.
my_string.split("amazing",1)[1] if my_string.split("amazing",1)[1] == None else my_string.split("amazing",1)[1]
When you use the .split() method, you can pick the part of the resulting list you want with either integer indices or slices. If you want to check for a specific word in your string, you can do something like this:
my_str = "Python is cool"
my_str_list = my_str.split()
if 'cool' in my_str_list:
    print(my_str)
output:
"Python is cool"
Otherwise, you can run a for loop in a list of strings to check if it finds the word in multiple strings.
You have some options here. You can split and check the result:
tmp = my_string.split("amazing", 1)
result = tmp[1] if len(tmp) > 1 else ''
Or you can check for containment up front:
result = my_string.split("amazing", 1)[1] if 'amazing' in my_string else ''
The first option is more efficient if most of the sentences have matches, the second one if most don't.
Another option similar to the first is
result = my_string.split("amazing", 1)[-1]
if result == my_string:
result = ''
In all cases, consider doing something equivalent to
result = result.lstrip()
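Yet another possibility is str.partition, which always returns a 3-tuple (before, separator, after) and gives empty strings for the last two parts when the separator is not found, so no length check is needed. A short sketch, where the function name text_after is just for illustration:

```python
def text_after(sentence, keyword):
    # partition never raises: if `keyword` is absent, the result is
    # (sentence, '', ''), so `tail` is simply an empty string.
    _, _, tail = sentence.partition(keyword)
    return tail.lstrip()


print(repr(text_after("Python is a amazing programming language", "amazing")))
print(repr(text_after("Java is also a programming language.", "amazing")))
```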
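Instead of using index 1, use index -1, which refers to the last item in the list.
my_string="Java is also a programming language."
print(my_string.split("amazing",1)[-1])
This avoids the IndexError, but note that when the word is absent it returns the whole string, 'Java is also a programming language.', rather than an empty string.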

Python : find words in string without white space

I'm trying to write a function that finds words in a string without white space, such as 'Daysaregood'.
I iterate over every letter, checking whether the letters read so far form a word by comparing them against a list of suggestions from the enchant module.
This is what I tried:
import enchant
import time
fulltext = []
def work(out):
    if len(out) > 0:
        word = ''
        wd = ""
        # iterate over every letter
        for i in out:
            word = word + i
            print(word)
            d = enchant.Dict('en_US')
            # a list of words to compare to
            list = d.suggest(word.title())
            print(list)
            # check if word exists
            if word.title() in list:
                print('Word found')
                wd = word
            else:
                print('Word not found')
        print('\n' + wd)
        fulltext.append(str(wd))
        time.sleep(2)
        work(out[len(wd):])
    else:
        print('\n fulltext : ')
        print(fulltext)
word = "Daysaregood"
work(word)
For this text the script runs the way I want, and I get a list like this:
['Days', 'are', 'good'].
But when I try something like 'spaceshuttle', the function gets confused by 'space' and steals the 's' from 'shuttle', so I get this:
['spaces', 'hut', 't', 'l', 'e'].
My goal is to return every word by itself and store them in a list.
Any help is appreciated.
The issue with your task is that the desired output doesn't follow strict rules, per se. If you were to input 'pineapple', would you expect ['pine', 'apple'] or ['pineapple']? It would be rather difficult / impossible to have it predict this.
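To illustrate the ambiguity, here is a small sketch that lists every way a string can be split into known words. It uses a tiny hand-written vocabulary set instead of enchant so that it runs on its own; both the vocabulary and the function name segmentations are assumptions made for this example. It uses plain recursion without caching, so it is only suitable for short inputs.

```python
def segmentations(text, vocabulary):
    """Return every way to split `text` into words from `vocabulary`."""
    if not text:
        return [[]]  # one way to split the empty string: no words
    results = []
    for end in range(1, len(text) + 1):
        prefix = text[:end]
        if prefix in vocabulary:
            # Keep this prefix only if the remainder can be split too.
            for rest in segmentations(text[end:], vocabulary):
                results.append([prefix] + rest)
    return results


# Hypothetical vocabulary, just large enough for the two examples.
vocabulary = {"space", "spaces", "shuttle", "hut", "pine", "apple", "pineapple"}
print(segmentations("spaceshuttle", vocabulary))
print(segmentations("pineapple", vocabulary))
```

Because a prefix is kept only when the rest of the string can also be split, 'spaces' is rejected for 'spaceshuttle' ('huttle' leads nowhere). For 'pineapple' there are two valid answers, and choosing between them needs some extra rule, such as preferring the fewest words.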

How might I create an acronym by splitting a string at the spaces, taking the character indexed at 0, joining it together, and capitalizing it?

My code
beginning = input("What would you like to acronymize? : ")
second = beginning.upper()
third = second.split()
fourth = "".join(third[0])
print(fourth)
I can't seem to figure out what I'm missing. The code is supposed to take the phrase the user inputs, put it all in caps, split it into words, join the first character of each word together, and print it. I feel like there should be a loop somewhere, but I'm not entirely sure if that's right or where to put it.
Say input is "Federal Bureau of Agencies"
Typing third[0] gives you the first element of the split, which is "FEDERAL" (the input was already upper-cased). You want the first character of each element in the split. Use a generator expression or list comprehension to apply [0] to each item in the list:
val = input("What would you like to acronymize? ")
print("".join(word[0] for word in val.upper().split()))
In Python, it would not be idiomatic to use an explicit loop here. Generator expressions are shorter and easier to read, and do not require the use of an explicit accumulator variable.
When you run the code third[0], Python will index the variable third and give you the first part of it.
The results of .split() are a list of strings. Thus, third[0] is a single string, the first word (all capitalized).
You need some sort of loop to get the first letter of each word, or else you could do something with regular expressions. I'd suggest the loop.
Try this:
fourth = "".join(word[0] for word in third)
There is a little for loop inside the call to .join(). Python calls this a "generator expression". The variable word will be set to each word from third, in turn, and then word[0] gets you the char you want.
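For comparison, the regular-expression route mentioned above could look roughly like this. It is only a sketch: the function name acronym is made up, and \b\w treats any letter, digit or underscore that follows a word boundary as a "first letter", so a phrase containing apostrophes or hyphens would contribute extra characters.

```python
import re


def acronym(phrase):
    # \b\w matches the first word character after each word boundary.
    return "".join(re.findall(r"\b\w", phrase)).upper()


print(acronym("Federal Bureau of Agencies"))  # FBOA
```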
works for me this way:
>>> a = "What would you like to acronymize?"
>>> a.split()
['What', 'would', 'you', 'like', 'to', 'acronymize?']
>>> ''.join([i[0] for i in a.split()]).upper()
'WWYLTA'
>>>
One intuitive approach would be:
get the sentence using input (or raw_input in python 2)
split the sentence into a list of words
get the first letter of each word
join the letters with an empty string
Here is the code:
sentence = input('What would you like to acronymize?: ')
words = sentence.split()  # split the sentence into words
just_first_letters = []  # a list containing just the first letter of each word
# traverse the list of words, adding the first letter of
# each word to just_first_letters
for word in words:
    just_first_letters.append(word[0])
result = "".join(just_first_letters)  # join the list of first letters
print(result)
# acronym2.py
# illustrating how to build an acronym
import string
def main():
    sent = input("Enter the sentence: ")  # take input sentence with spaces
    # capitalize each word, then split the string so each word becomes a string
    for word in string.capwords(sent).split():
        # loop through the split strings and print the first letter of
        # each one on the same line to get your acronym
        print(word[0], end="")
    print()
main()
name = input("Enter uppercase with lowercase name")
print(f'the original string = {name}')
def uppercase(name):
    # keep only the characters that are uppercase
    res = [char for char in name if char.isupper()]
    print("The uppercase characters in string are : " + "".join(res))
uppercase(name)
