Python: Iterate through string and print only specific words - python

I'm taking a class in python and now I'm struggling to complete one of the tasks.
The aim is to ask for an input, iterate through that string and print only the words that start with a letter greater than g. If the word starts with a letter greater than g, we print that word. Otherwise, we empty the word and iterate through the next word(s) in the string to do the same check.
This is the code I have, and the output. Would be grateful for some tips on how to solve the problem.
# [] create words after "G" following the Assignment requirements use of functions, methods and keywords
# sample quote "Wheresoever you go, go with all your heart" ~ Confucius (551 BC - 479 BC)
# [] copy and paste in edX assignment page
quote = input("Enter a sentence: ")
word = ""
# iterate through each character in quote
for char in quote:
    # test if character is alpha
    if char.isalpha():
        word += char
    else:
        if word[0].lower() >= "h":
            print(word.upper())
        else:
            word=""
Enter a sentence: Wheresoever you go, go with all your heart
WHERESOEVER
WHERESOEVERYOU
WHERESOEVERYOUGO
WHERESOEVERYOUGO
WHERESOEVERYOUGOGO
WHERESOEVERYOUGOGOWITH
WHERESOEVERYOUGOGOWITHALL
WHERESOEVERYOUGOGOWITHALLYOUR
The output should look like,
Sample output:
WHERESOEVER
YOU
WITH
YOUR
HEART

Simply a list comprehension with split will do:
s = "Wheresoever you go, go with all your heart"
print(' '.join([word for word in s.split() if word[0].lower() > 'g']))
# Wheresoever you with your heart
Modifying to match with the desired output (Making all uppercase and on new lines):
s = "Wheresoever you go, go with all your heart"
print('\n'.join([word.upper() for word in s.split() if word[0].lower() > 'g']))
'''
WHERESOEVER
YOU
WITH
YOUR
HEART
'''
Without list comprehension:
s = "Wheresoever you go, go with all your heart"
for word in s.split():  # Split the sentence into words and iterate through each.
    if word[0].lower() > 'g':  # Check if the first character (lowercased) > g.
        print(word.upper())  # If so, print the word all capitalised.

Here is a readable and commented solution. The idea is to first split the sentence into a list of words using re.findall (from the re module) and iterate through this list, instead of iterating over each character as you did. It is then quite easy to print only the words starting with a letter greater than 'g':
import re
# Prompt for an input sentence
quote = input("Enter a sentence: ")
# Split the sentence into a list of words
words = re.findall(r'\w+', quote)
# Iterate through each word
for word in words:
    # Print the word if its 1st letter is greater than 'g'
    if word[0].lower() > 'g':
        print(word.upper())
To go further, here is also the one-line style solution based on exactly the same logic, using list comprehension:
import re
# Prompt for an input sentence
quote = input("Enter a sentence: ")
# Print each word starting by a letter greater than 'g', in upper case
print(*[word.upper() for word in re.findall(r'\w+', quote) if word[0].lower() > 'g'], sep='\n')

import string
s = "Wheresoever you go, go with all your heart"
out = s.translate(str.maketrans(string.punctuation, " "*len(string.punctuation)))
desired_result = [word.upper() for word in out.split() if word and word[0].lower() > 'g']
print(*desired_result, sep="\n")

Your problem is that you're only resetting word to an empty string in the else clause. You need to reset it to an empty string immediately after the print(word.upper()) statement as well for the code as you've written it to work correctly.
That being said, if it's not explicitly disallowed for the class you're taking, you should look into string methods, specifically string.split()
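A minimal sketch of the character-by-character approach with that reset applied, assuming words are separated by any non-alphabetic character. The function name words_after_g is made up for illustration. Besides the reset, it guards against an empty word (two separators in a row, as in "go, go") and appends a trailing space so the last word is checked too, since the sample sentence does not end with a separator.

```python
def words_after_g(quote):
    """Return the upper-cased words of quote whose first letter comes after 'g'."""
    found = []
    word = ""
    # A trailing space makes sure the final word is flushed like any other.
    for char in quote + " ":
        if char.isalpha():
            word += char
        else:
            # Skip empty words produced by consecutive separators.
            if word and word[0].lower() > "g":
                found.append(word.upper())
            # Reset in every case, not only when the word is rejected.
            word = ""
    return found


for w in words_after_g("Wheresoever you go, go with all your heart"):
    print(w)
```

Returning a list instead of printing inside the loop keeps the check easy to test; the printing happens in the caller.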

Related

Filter a list of strings by a char in same position

I am trying to make a simple function that gets three inputs: a list of words, list of guessed letters and a pattern. The pattern is a word with some letters hidden with an underscore. (for example the word apple and the pattern '_pp_e')
For some context it's a part of the game hangman where you try to guess a word and this function gives a hint.
I want to make this function to return a filtered list of words from the input that does not contain any letters from the list of guessed letters and the filtered words contain the same letters and their position as with the given pattern.
I tried making this work with three loops.
First loop that filters all words by the same length as the pattern.
Second loop that checks for similarity between the pattern and the given word. If the not filtered word does contain the letter but not in the same position I filter it out.
Final loop checks the filtered word that it does not contain any letters from the given guessed list.
I tried making it work without a lot of success, and I would love some help. Any tips for making the code shorter (without using third-party libraries) would also be very much appreciated.
Thanks in advance!
Example: pattern: "d _ _ _ _ a _ _ _ _", guessed letters list ['b','c'], and a word list containing all the words in English.
output list: ['delegating', 'derogation', 'dishwasher']
this is the code for more context:
def filter_words_list(words, pattern, wrong_guess_lst):
    lst_return = []
    lst_return_2 = []
    lst_return_3 = []
    new_word = ''
    for i in range(len(words)):
        if len(words[i]) == len(pattern):
            lst_return.append(words[i])
    pattern = list(pattern)
    for i in range(len(lst_return)):
        count = 0
        word_to_check = list(lst_return[i])
        for j in range(len(pattern)):
            if pattern[j] == word_to_check[j] or (pattern[j] == '_' and
                                                  (not (word_to_check[j] in
                                                        pattern))):
                count += 1
        if count == len(pattern):
            lst_return_2.append(new_word.join(word_to_check))
    for i in range(len(lst_return_2)):
        word_to_check = lst_return_2[i]
        for j in range(len(wrong_guess_lst)):
            if word_to_check.find(wrong_guess_lst[j]) == -1:
                lst_return_3.append(word_to_check)
    return lst_return_3
The easiest, and likely quite efficient, way to do this would be to translate your pattern into a regular expression, if regular expressions are in your "toolbox". (The re module is in the standard library.)
In a regular expression, . matches any single character. So, we replace all _s with .s and add "^" and "$" to anchor the regular expression to the whole string.
import re
def filter_words(words, pattern, wrong_guesses):
    re_pattern = re.compile("^" + re.escape(pattern).replace("_", ".") + "$")
    # get words that
    # (a) are the correct length
    # (b) aren't in the wrong guesses
    # (c) match the pattern
    return [
        word
        for word in words
        if (
            len(word) == len(pattern) and
            word not in wrong_guesses and
            re_pattern.match(word)
        )
    ]
all_words = [
"cat",
"dog",
"mouse",
"horse",
"cow",
]
print(filter_words(all_words, "c_t", []))
print(filter_words(all_words, "c__", []))
print(filter_words(all_words, "c__", ["cat"]))
prints out
['cat']
['cat', 'cow']
['cow']
If you don't care for using regexps, you can instead translate the pattern to a dict mapping each defined position to the character that should be found there:
def filter_words_without_regex(words, pattern, wrong_guesses):
    # get a map of the pattern's defined letters to their positions
    letter_map = {i: letter for i, letter in enumerate(pattern) if letter != "_"}
    # get words that
    # (a) are the correct length
    # (b) aren't in the wrong guesses
    # (c) have the correct letters in the correct positions
    return [
        word
        for word in words
        if (
            len(word) == len(pattern) and
            word not in wrong_guesses and
            all(word[i] == ch for i, ch in letter_map.items())
        )
    ]
The result is the same.
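Both functions above treat wrong_guesses as a list of whole words. The question describes it as a list of guessed letters, so here is a hedged sketch of that reading. The name filter_by_letters and the sample word list are made up for illustration, and the pattern is assumed to contain no spaces. It also applies the usual hangman rule that a letter already revealed in the pattern cannot be hiding behind an underscore.

```python
def filter_by_letters(words, pattern, wrong_letters):
    """Keep words that fit pattern and avoid every wrongly guessed letter."""
    wrong = set(wrong_letters)
    revealed = set(pattern) - {"_"}
    matches = []
    for word in words:
        if len(word) != len(pattern):
            continue
        if wrong & set(word):
            continue  # contains a letter already known to be wrong
        fits = all(
            # A revealed position must match exactly; a hidden position
            # must not hold a letter that is already revealed elsewhere.
            (w == p) if p != "_" else (w not in revealed)
            for w, p in zip(word, pattern)
        )
        if fits:
            matches.append(word)
    return matches


sample = ["apple", "ample", "maple", "apply", "angle"]
print(filter_by_letters(sample, "a__le", ["n"]))  # ['apple', 'ample']
```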
Probably not the most efficient, but this should work:
def filter_words_list(words, pattern, wrong_guess_lst):
    fewer_words = [w for w in words if not any([wgl in w for wgl in wrong_guess_lst])]
    equal_len_words = [w for w in fewer_words if len(w) == len(pattern)]
    pattern_indices = [idl for idl, ltr in enumerate(pattern) if ltr != '_']
    word_indices = [[idl for idl, ltr in enumerate(w) if ((ltr in pattern) and (ltr != '_'))] for w in equal_len_words]
    # all(...) is required here: a bare generator expression is always truthy
    out = [w for wid, w in zip(word_indices, equal_len_words) if ((wid == pattern_indices) and all(w[pid] == pattern[pid] for pid in pattern_indices))]
    return out
The idea is to first remove all words that have letters in your wrong_guess_lst.
Then, remove everything which does not have the same length (you could also merge this condition in the first one..).
Next, for both pattern and your remaining words, you create a pattern mask, which indicates the positions of non '_' letters.
To be a candidate, the masks have to be identical AND the letters in these positions have to be identical as well.
Note that I replaced a lot of the for loops in your code with list comprehensions. A list comprehension is a very useful construct, especially if you don't want to use other libraries.
Edit: I cannot really tell you, where your code went wrong as it was a little too long for me..
Here the regex rule is constructed explicitly, so no separate check on the word's length is needed. To achieve this, the groupby function from the itertools module of the standard library is used:
'_ b _ _ _' -- regex-- > r'^.{1}b.{3}$'
Here is how to filter the dictionary by a guess string:
import itertools as it
import re
# sample dictionary
dictionary = "a ability able about above accept according account across act action activity actually add address"
dictionary = dictionary.split()
guess = '_ b _ _ _'
guess = guess.replace(' ', '') # remove white spaces
# construction of the regex rule
regex = r'^'
for _, i in it.groupby(guess, key=lambda x: x == '_'):
    if '_' in (l:=list(i)):
        regex += ''.join(f'.{{{len(l)}}}') # escape the curly brackets
    else:
        regex += ''.join(l)
regex += '$'
# processing the regex rule
pattern = re.compile(regex)
# filter the dictionary by the rule
l = [word for word in dictionary if pattern.match(word)]
print(l)
Output
['about', 'above']

Python program to generate all possible words by change letter 'c' in the word "ace"

This is my first post here, so apologies in advance for the formatting. I am trying to write a simple program in Python that takes a word, in this case "ace", and generates all the possible words obtained by switching out the letter 'c' for each letter of the alphabet. What I have tried is turning both my word and the alphabet into lists so I can create some kind of loop that runs through all the possibilities and eventually cross-references them with a dictionary of possible English words (I haven't got there yet). I don't have a strong programming background, so this has proven to be harder than I thought. My limited work is below; I have been at this for a few hours. Thanks!
My code (it doesn't work at the moment):
#take letter ace and input new middle letter
word = list("ace")
alphabet = list("abcdefghijklmnopqrstuvwxyz")
wordnew = []
counter = 0
for word[1] in word:
    wordnew = word.replace("c", alphabet[0] if counter < 25)
    print(wordnew)
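Note that the name between the for and the in should be a plain variable. for word[1] in word is technically legal Python (each item gets assigned to word[1]), but it is not what you want here. The SyntaxError in your code comes from the next line: the conditional expression alphabet[0] if counter < 25 has no else branch.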
You can iterate over each letter of the alphabet and create a list of words generated from ace:
alphabet = "abcdefghijklmnopqrstuvwxyz"
words = []
for letter in alphabet:
    words.append("ace".replace("c", letter))
You can even do this in one line, using a list comprehension:
words = [ "ace".replace("c", letter) for letter in "abcdefghijklmnopqrstuvwxyz" ]
Note how I didn't have to turn alphabet into a list—in Python, strings are iterable, meaning that you can loop through them just like you can with lists.
Of course, you can print them all, add this at the end:
print(words)
PS: You could also turn "abcdefghijklmnopqrstuvwxyz" into string.ascii_lowercase, though you'll have to import the string module (built into python).
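As a small sketch of that PS, the following uses string.ascii_lowercase instead of a hand-typed alphabet. The helper name swap_letter is made up for illustration. It replaces the letter by position rather than with str.replace, on the assumption that you only want to change one slot: str.replace would swap every "c" in a word that contains more than one.

```python
import string


def swap_letter(word, index):
    """Return every string made by replacing word[index] with a letter a-z."""
    return [word[:index] + letter + word[index + 1:]
            for letter in string.ascii_lowercase]


candidates = swap_letter("ace", 1)
print(len(candidates))  # 26
print(candidates[:3])   # ['aae', 'abe', 'ace']
```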
You're close, here is a simple way to do it
>>> word = "ace"  # no need to make it a list; keep it a string so you can use .replace on it
>>> alphabet = "abcdefghijklmnopqrstuvwxyz"  # a string in a for loop is treated like a list of characters
>>> for letter in alphabet:
...     print(word.replace("c", letter))  # printed here, but you could also .append to a list or use a list comprehension, which is the preferred way to build a list
...
aae
abe
ace
ade
aee
afe
age
ahe
aie
aje
ake
ale
ame
ane
aoe
ape
aqe
are
ase
ate
aue
ave
awe
axe
aye
aze
>>>
>>> wordnew = [word.replace("c",letter) for letter in alphabet] #the list comprehension version of the above
>>> wordnew
['aae', 'abe', 'ace', 'ade', 'aee', 'afe', 'age', 'ahe', 'aie', 'aje', 'ake', 'ale', 'ame', 'ane', 'aoe', 'ape', 'aqe', 'are', 'ase', 'ate', 'aue', 'ave', 'awe', 'axe', 'aye', 'aze']
>>>
word = 'ace'
alphabet = list("abcdefghijklmnopqrstuvwxyz")
for letter in alphabet:
    print(word.replace('c', letter))
If you want the list of all possible "words" after replacement of letter "c" by any other letter from the alphabet you can simply do the following
word = "ace"
alphabet = "abcdefghijklmnopqrstuvwxyz"
new_words = [word.replace('c', ch) for ch in alphabet]
print(new_words)
word = "ace"
alphabet = list("abcdefghijklmnopqrstuvwxyz")
for letter in alphabet:
    wordnew = word.replace("c", letter)
    print(wordnew)
You should iterate through the alphabet, not the word.
Assuming you somehow have a list of all the words that exist, this works:
word = "ace"
alphabet = "abcdefghijklmnopqrstuvwxyz"
wordnew = []
counter = 0
# list full of all real words
legitWords = ['ace', 'man', 'math', 'marry', 'age',
              'ape', 'are', 'ate', 'awe', 'axe']
for letter in alphabet:  # looping through every letter in the alphabet
    newWord = word.replace('c', letter)  # replaces c with current letter
    if newWord in legitWords:
        # adds the counter and appends the new word, if it really exists
        counter += 1
        wordnew.append(newWord)
Note that you don't have to convert the strings to lists as you have done, because they are iterable.

How to read two characters from an input string?

I want my program to read one character from an input of random string, but when there is a space in the string, I want it to read the next two characters.
For example, if I type H He, I want it to return the value for H, then detect a space then return He. How do I do this?
This code is a small part of a school assignment (calculating the molecular mass of arbitrary compounds).
string=input('enter:')
pos=0
start=None
for a in string:
    if a == 'H':
        print(string[start:1])
    elif a == ' ':
        pos=int(string.find(' '))+1
        start=pos
        print(string[start:1])
You can split the string with space and then get both the values.
string=input('enter:')
values = string.split(' ')
if len(values) > 1:
    print("first char:", values[0])
    print("remaining:", values[1])
else:
    print("first char: ", values[0])
To split the string on uppercase letters instead of spaces:
import re
elements = re.findall('[A-Z][^A-Z]*', 'NaCl')
print(elements)
You can create a list for the string which you enter and print the list like below:
string=input('enter:')
l=list(string.split())
for i in l:
    print(i)
For your new request
string=input('enter: ')
i=0
l=len(string)
while (i<l):
    if(i<l-1):
        if(string[i].isupper() and string[i+1].isupper()):
            print(string[i])
        elif(string[i].isupper() and string[i+1].islower()):
            print('{}{}'.format(string[i],string[i+1]))
    elif(i==l-1 and string[i].isupper()):
        print(string[i])
    i=i+1
Also, I was wondering if it was possible to do the same thing but separating on the lowercase letters?
For example, read 'HHe' as 'H', 'He' or 'NaCl' as 'Na', 'Cl'. Sorry, this is a bit selfish, but I was wondering if it could be done.
How about this?
import re
words = [
'HHe',
'NaCl',
]
pattern = re.compile(r'[A-Z][a-z]*')
for word in words:
    print(pattern.findall(word))
output:
['H', 'He']
['Na', 'Cl']
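Since the input in the question mixes spaces and adjacent symbols ("H He" versus "HHe"), the same pattern can be applied to the whole line at once, because findall simply skips characters that do not match. A small sketch, where the function name split_symbols is made up for illustration and an element symbol is assumed to be one uppercase letter followed by optional lowercase letters:

```python
import re

# One uppercase letter followed by zero or more lowercase letters.
SYMBOL = re.compile(r"[A-Z][a-z]*")


def split_symbols(text):
    """Return the element-like symbols found in text, in order."""
    return SYMBOL.findall(text)


print(split_symbols("H He"))      # ['H', 'He']
print(split_symbols("HHe NaCl"))  # ['H', 'He', 'Na', 'Cl']
```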

Search strings in list containing specific letters in random order

I am writing code in Python 2.7 in which I have defined a list of strings. I then want to search this list's elements for a set of letters. The letters may appear in any order, i.e. search the list for every single letter from the input.
I have been googling around but I haven't found a solution.
Here's what i got:
wordlist = ['mississippi','miss','lake','que']
letters = str(aqk)
for item in wordlist:
    if item.find(letters) != -1:
        print item
This is an example. Here the only output should be 'lake' and 'que', since each of these words contains at least one of 'a', 'q' and 'k'.
How can I rewrite my code so that this will be done?
Thanks in advance!
Alex
It would be easy using set():
wordlist = ['mississippi','miss','lake','que']
letters = set('aqk')
for word in wordlist:
    if letters & set(word):
        print word
Output:
lake
que
Note: The & operator does an intersection between the two sets.
for item in wordlist:
    for character in letters:
        if character in item:
            print item
            break
Here goes your solution:
for item in wordlist:
    b = False
    for c in letters:
        b = b | (item.find(c) != -1)
    if b:
        print item
[word for word in wordlist if any(letter in word for letter in 'aqk')]
Using sets and the in syntax to check.
wordlist = ['mississippi','miss','lake','que']
letters = set('aqk')
for word in wordlist:
    if any(ch in letters for ch in word):
        print word
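The answers above all match a word when it contains at least one of the letters, which is what the expected output ('lake' and 'que') implies. If every letter were required instead, a subset test would do it. A short sketch of both readings; the helper names contains_any and contains_all are made up for illustration, and print is called with parentheses so the snippet also runs on Python 3:

```python
def contains_any(word, letters):
    """True if word shares at least one character with letters."""
    return bool(set(letters) & set(word))


def contains_all(word, letters):
    """True if every character of letters occurs somewhere in word."""
    return set(letters) <= set(word)


wordlist = ['mississippi', 'miss', 'lake', 'que']
print([w for w in wordlist if contains_any(w, 'aqk')])  # ['lake', 'que']
print([w for w in wordlist if contains_all(w, 'aqk')])  # []
print([w for w in wordlist if contains_all(w, 'ms')])   # ['mississippi', 'miss']
```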

how to search for a capital letter within a string and return the list of words with and without capital letters

My homework assignment is to write a program that reads a string from the user and creates a list of words from the input. Create two lists, one containing the words that contain at least one upper-case letter and one containing the words that don't contain any upper-case letters.
Use a single for loop to print out the words with upper-case letters in them, followed by the words with no upper-case letters in them, one word per line.
What I know is not correct:
s= input("Enter your string: ")
words = sorted(s.strip().split())
for word in words:
    print (word)
Because it only sorts the sequence correctly if the capital is the first character. For this assignment a capital could appear anywhere within a word, such as 'tHis is a sTring'.
I was playing around with a solution that looked similar to this, just to see if I could get the words with CAPS out, but it just wasn't working:
s = input("Please enter a sentence: ")
while True:
    cap = 0
    s = s.strip().split()
    for c in s:
        if c in "ABCDEFGHIJKLMNOPQRSTUVWXYZ":
            print(c[:cap])
            cap += 1
    else:
        print("not the answer")
        break
But a regular expression would probably do a better job than writing out the whole alphabet.
Any help is much appreciated. Needless to say I am new to python.
>>> import re
>>> w = 'AbcDefgHijkL'
>>> r = re.findall('([A-Z])', w)
>>> r
['A', 'D', 'H', 'L']
This can give you all letters in caps in a word...Just sharing the idea
>>> r = re.findall('([A-Z][a-z]+)', w)
>>> r
['Abc', 'Defg', 'Hijk']
The above gives you all the pieces that start with a capital letter. Note: the last one is not captured, since a lone capital is not followed by any lowercase letter, but even that can be captured:
>>> r = re.findall('([A-Z][a-z]*)', w)
>>> r
['Abc', 'Defg', 'Hijk', 'L']
This will return true if capital letter is found in the word:
>>> word = 'abcdD'
>>> bool(re.search('([A-Z])', word))
True
Hint: "Create two lists"
s= input("Enter your string: ")
withcap = []
without = []
for word in s.strip().split():
    # your turn
The way you are using for ... else is wrong: the else block is executed when the loop finishes without hitting a break. The logic you are trying to implement looks like this:
for c in s:
    if c.isupper():
        # s contains a capital letter
        # <do something>
        break  # one such letter is enough
else:  # we didn't `break` out of the loop
    # therefore have no capital letter in s
    # <do something>
which you can also write much shorter with any
if any(c in "ABCDEFGHIJKLMNOPQRSTUVWXYZ" for c in s):
    # <do something>
else:
    # <do something>
Sounds like regexs would be easier for the first part of the problem (a regex that just looks for [A-Z] should do the trick).
For the second part, I'd recommend using two lists, as that's an easy way to print everything out in one for loop. Have one list of non_upper_words and one of upper_words.
So, the basic outline of the program would be:
split the string into an array of words.
for each word in array: if regex returns true, add to upper_words. Else: add to non_upper_words.
print each word in the first array and then in the second.
I wrote this out in pseudo-code because it's a programming assignment, so you should really write the actual code yourself. Hope it helps!
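For readers who are not doing the assignment themselves, the outline above could be sketched roughly as follows. The function name split_by_case is made up for illustration, and the [A-Z] character class is assumed to be enough, i.e. only ASCII capitals count.

```python
import re


def split_by_case(text):
    """Split text into (words with an ASCII capital, words without one)."""
    upper_words, non_upper_words = [], []
    for word in text.split():
        # re.search finds a capital anywhere in the word, not just at the start.
        if re.search(r"[A-Z]", word):
            upper_words.append(word)
        else:
            non_upper_words.append(word)
    return upper_words, non_upper_words


upper_words, non_upper_words = split_by_case("tHis is a sTring")
# A single loop over the concatenated lists prints the capitalised words first.
for word in upper_words + non_upper_words:
    print(word)
```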
You can use isupper method for your purpose:
text = 'sssample Text with And without'
uppers = []
lowers = []
# Note that the loop below could be modified to skip ,.-\/; etc. if necessary
for word in text.split():
    uppers.append(word) if word[0].isupper() else lowers.append(word)
EDITED: You can also use islower method the following way:
text = 'sssample Text with And without'
other = []
lowers = []
# Note that the loop below could be modified to skip ,.-\/; etc. if necessary
for word in text.split():
    lowers.append(word) if word.islower() else other.append(word)
OR depends on what you really need you can take a look at istitle method:
titled = []
lowers = []
for word in text.split():
    titled.append(word) if word.istitle() else lowers.append(word)
AND with simple if else statement:
titled = []
lowers = []
for word in text.split():
    if word.istitle():
        titled.append(word)
    else:
        lowers.append(word)
You can use List Comprehensions to get all upper case characters and lower case characters.
def get_all_cap_lowercase_list(inputStr):
    cap_temp_list = [c for c in inputStr if c.isupper()]
    low_temp_list = [c for c in inputStr if c.islower()]
    print("List of Cap {0} and List of Lower {1}".format(cap_temp_list,low_temp_list))
    upper_case_count = len(cap_temp_list)
    lower_case_count = len(low_temp_list)
    print("Count of Cap {0} and Count of Lower {1}".format(upper_case_count,lower_case_count))
get_all_cap_lowercase_list("Hi This is demo Python program")
And The output is:
List of Cap ['H', 'T', 'P'] and List of Lower ['i', 'h', 'i', 's',
'i', 's', 'd', 'e', 'm', 'o', 'y', 't', 'h', 'o', 'n', 'p', 'r', 'o',
'g', 'r', 'a', 'm']
Count of Cap 3 and Count of Lower 22
Try doing the following:
Split the string into a list where each item is a separate word.
For every word in that list, iterate through and check for capital letters (consider the string constants such as string.ascii_uppercase, called string.uppercase in Python 2). If it has a capital letter, insert it onto the front of your result list. If not, append it to the end of your result list.
Iterate through your results, printing them. Or, if you want to avoid iterating, join the items in the string using the newline character \n.
Thank you to everyone for your input and help, it was all very informative. Here is the answer that I finally submitted.
s = input("Please enter a sentence: ")
withcap = []
without = []
for word in s.split():
    if word.islower():
        without.append(word)
    else:
        withcap.append(word)
onelist = withcap + without
for word in onelist:
    print (word)
I think your answer might only be searching for words where the first letter is capitalized. To find words that contain a capital letter anywhere in the word, you'd have to enumerate over each letter in the word, like this:
uin = input("Enter your text: ")
##create lists to hold upper and lower case words
up_words = []
no_up_words = []
for i, word in enumerate(uin.strip().split()):
    if word.islower():
        no_up_words.append(word)
    else:
        up_words.append(word)
print(up_words, no_up_words)
My regex:
import re
vendor = "MyNameIsJoe. IWorkInDDFinc."
ven = re.split(r'(?<=[a-z])[A-Z]|[A-Z](?=[a-z])', vendor)
I need to split the words so that the result is:
My Name Is Joe. I Work In DDF inc.
