Get text between two signs in a sentence - python

The task is to get the text between two signs in a sentence.
The user enters a sentence on one line and the signs on the next (in this case [ and ]).
Example:
In this sentence [need to get] only [few words].
The output needs to look like:
need to get few words
Does anyone have a clue how to do this?
My idea is to split the input so that I can check every element of the list: if a word starts with [ and ends with ], I save it to another list. The problem is that a word may start with [ without ending with ].
P.S. The user will never enter an empty string or nest signs, as in [word [another] word].

You can use a regex:
import re
text = 'In this sentence [need to get] only [few words] and not [unbalanced'
' '.join(re.findall(r'\[(.*?)\]', text))
Output: 'need to get few words'
Or '(?<=\[).*?(?=\])' as regex using lookarounds
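Since the question says the signs come from user input, the same idea can be wrapped in a function that takes them as arguments. This is only a sketch: the name text_between and its parameters are made up for illustration, and re.escape is used so that signs such as [ or ( are treated literally rather than as regex syntax.

```python
import re

def text_between(text, open_sign, close_sign):
    """Return the pieces of text enclosed by the two signs, joined by spaces."""
    # re.escape turns characters like '[' into literals instead of regex syntax.
    pattern = re.escape(open_sign) + r'(.*?)' + re.escape(close_sign)
    return ' '.join(re.findall(pattern, text))

print(text_between('In this sentence [need to get] only [few words].', '[', ']'))
# need to get few words
```

An opening sign without a matching closing sign simply produces no match, as in the example above.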

You can use regular expressions like this:
import re
your_string = "In this sentence [need to get] only [few words]"
matches = re.findall(r'\[([^\[\]]*)]', your_string)
print(' '.join(matches))
Solution without regex:
your_string = "In this sentence [need to get] only [few words]"
result_parts = []
current_square_brackets_part = ''
need_to_add_letter_to_current_square_brackets_part = False
for letter in your_string:
    if letter == '[':
        need_to_add_letter_to_current_square_brackets_part = True
    elif letter == ']':
        need_to_add_letter_to_current_square_brackets_part = False
        result_parts.append(current_square_brackets_part)
        current_square_brackets_part = ''
    elif need_to_add_letter_to_current_square_brackets_part:
        current_square_brackets_part += letter
print(' '.join(result_parts))

Here is a more classical solution using parsing.
It reads the string character by character and keeps it only if a flag is set. The flag is set when meeting a [ and unset on ]
text = 'In this sentence [need to get] only [few words] and not [unbalanced'
add = False
l = []
m = []
for c in text:
    if c == '[':
        add = True
    elif c == ']':
        if add and m:
            l.append(''.join(m))
        add = False
        m = []
    elif add:
        m.append(c)
out = ' '.join(l)
print(out)
Output: need to get few words
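A variation on the same parsing idea is to let str.find jump from sign to sign instead of testing every character. The sketch below is an assumption-laden illustration, not part of the answer above: extract_between and its parameters are hypothetical names, and unlike the loop above it does not skip empty pairs such as [].

```python
def extract_between(text, open_sign='[', close_sign=']'):
    """Collect the text found between each open_sign/close_sign pair."""
    parts = []
    pos = 0
    while True:
        start = text.find(open_sign, pos)
        if start == -1:
            break  # no more opening signs
        end = text.find(close_sign, start + len(open_sign))
        if end == -1:
            break  # unbalanced opening sign: nothing more to collect
        parts.append(text[start + len(open_sign):end])
        pos = end + len(close_sign)
    return ' '.join(parts)

print(extract_between('In this sentence [need to get] only [few words] and not [unbalanced'))
# need to get few words
```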

Related

How to get the position of a character in Python and store it in a variable?

I am looking for a way to store the position (an integer) of a character in a variable. Right now I'm using an approach I used in Delphi 2010, which is not right according to Jupyter Notebook.
This is the code I have so far:
def animal_crackers(text):
    for index in text:
        if index==' ':
            if text[0] == text[pos(index)+1]:
                return True
            else:
                return False
        else:
            pass
The aim is to take two words (word + space + word) and return True if both words begin with the same letter, and False otherwise.
For getting the index of a letter in a string (as the title asks), just use str.index(), or str.find() if you don't want an error to be raised if the letter/substring could not be found:
>>> text = 'seal sheep'
>>> text.index(' ')
4
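The difference between the two calls only shows up when the character is missing. A small sketch using the same text value (the variable names are just for illustration):

```python
text = 'seal sheep'

space_pos = text.find(' ')    # 4, the same position str.index() reports
missing_pos = text.find('z')  # -1, because 'z' does not occur in text
print(space_pos, missing_pos)

try:
    text.index('z')           # index() raises instead of returning -1
except ValueError as error:
    print('index() raised ValueError:', error)
```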
However for your program, you do not need to use str.index if you want to identify the first and second word. Instead, use str.split() to break up a given text into a list of substrings:
>>> words = text.split() # With no arguments, splits words by whitespace
>>> words
['seal', 'sheep']
Then, you can take the letter of the first word and check if the second word begins with the same letter:
# For readability, you can assign the two words into their own variables
>>> first_word, second_word = words[0], words[1]
>>> first_word[0] == second_word[0]
True
Combined into a function, it may look like this:
def animal_crackers(text):
    words = text.split()
    first_word, second_word = words[0], words[1]
    return first_word[0] == second_word[0]
Assuming that text is a single line containing two words:
def animal_crackers(text):
    words = text.split()
    if len(words) < 2:
        return False  # fewer than two words, so there is nothing to compare
    # here, the .lower() calls are only necessary if the check should NOT be case-sensitive
    # if you do care about the case of the letters, remove them
    if words[0][0].lower() == words[1][0].lower():
        return True
    else:
        return False

check if a pattern is in a list of words

I need an output containing only the words that match a pattern exactly: the same length, the same letters in the same positions, and none of the pattern's letters anywhere else in the word.
For example:
words = ['hatch','catch','match','chat','mates']
pattern = '_atc_'
Needed output:
['hatch','match']
I tried nested for loops, but that didn't work for a pattern that starts and ends with '_':
def filter_words_list(words, pattern):
    relevant_words = []
    for word in words:
        if len(word) == len(pattern):
            for i in range(len(word)):
                for j in range(len(pattern)):
                    if word[i] != pattern[i]:
                        break
                    if word[i] == pattern[i]:
                        relevant_words.append(word)
thx !
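You can use a regex: replace each underscore with '.', which matches any single character.
The input then looks like:
words = ['hatch','catch','match','chat','mates']
pattern = '.atc.'
and the code is:
import re
def filter_words_list(words, pattern):
    ret = []
    for word in words:
        # fullmatch requires the whole word to fit the pattern, which also enforces the length
        if re.fullmatch(pattern, word):
            ret.append(word)
    return ret
Note that '.' also accepts letters that already appear in the pattern, so this returns 'catch' as well as 'hatch' and 'match'.
Hope that helped.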
You could use a regular expression:
import re
words = ['hatch','catch','match','chat','mates']
pattern = re.compile('[^atc]atc[^atc]')
result = list(filter(pattern.fullmatch, words))
print(result)
Output
['hatch', 'match']
The pattern '[^atc]atc[^atc]' matches any single character that is not a, t or c ([^atc]), followed by 'atc', followed again by any single character that is not a, t or c.
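The regex above is written by hand for one pattern. A possible way to derive it from any underscore pattern is sketched below; pattern_to_regex and its wildcard parameter are hypothetical names introduced here, and the sketch assumes every wildcard should exclude all letters that are fixed elsewhere in the pattern.

```python
import re

def pattern_to_regex(pattern, wildcard='_'):
    """Compile an underscore pattern such as '_atc_' into a regex."""
    fixed = ''.join(sorted(set(pattern) - {wildcard}))
    # A wildcard may be any character except the fixed letters of the pattern.
    any_other = '[^' + re.escape(fixed) + ']' if fixed else '.'
    parts = [any_other if ch == wildcard else re.escape(ch) for ch in pattern]
    return re.compile(''.join(parts))

words = ['hatch', 'catch', 'match', 'chat', 'mates']
print(list(filter(pattern_to_regex('_atc_').fullmatch, words)))
# ['hatch', 'match']
```

Using fullmatch keeps the length check implicit: a word that is longer or shorter than the pattern cannot match.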
As an alternative you could write your own matching function that will work with any given pattern:
from collections import Counter
def full_match(word, pattern='_atc_'):
    if len(pattern) != len(word):
        return False
    pattern_letter_counts = Counter(e for e in pattern if e != '_')  # count characters that are not wild card
    word_letter_counts = Counter(word)  # count letters
    if any(count != word_letter_counts.get(ch, 0) for ch, count in pattern_letter_counts.items()):
        return False
    return all(p == w for p, w in zip(pattern, word) if p != '_')  # the word must match in all characters that are not wild card
words = ['hatch', 'catch', 'match', 'chat', 'mates']
result = list(filter(full_match, words))
print(result)
Output
['hatch', 'match']
Further
See the documentation on the built-in functions any and all.
See the documentation on Counter.

python regex matching "ab" or "ba" words

I tried matching words that include the letters "ab" or "ba", e.g. "ab"olition, f"ab"rics, pro"ba"ble. I came up with the following regular expression:
r"[Aa](?=[Bb])[Bb]|[Bb](?=[Aa])[Aa]"
But the matches include words that start or end with non-alphanumeric characters such as ", (, ), /. How can I get rid of those? I just want a list of the matching words.
import sys
import re
word=[]
dict={}
f = open('C:/Python27/brown_half.txt', 'rU')
w = open('C:/Python27/brown_halfout.txt', 'w')
data = f.read()
word = data.split() # word is list
f.close()
for num2 in word:
    match2 = re.findall("\w*(ab|ba)\w*", num2)
    if match2:
        dict[num2] = (dict[num2] + 1) if num2 in dict.keys() else 1
for key2 in sorted(dict.iterkeys()):
    print "%s: %s" % (key2, dict[key2])
print len(dict.keys())
Here, I don't know how to mix it up with "re.compile~~" method that 1st comment said...
To match all the words with ab or ba (case insensitive):
import re
text = 'fabh, obar! (Abtt) yybA, kk'
pattern = re.compile(r"(\w*(ab|ba)\w*)", re.IGNORECASE)
# to print all the matches
for match in pattern.finditer(text):
    print(match.group(0))
# to print the first match
print(pattern.search(text).group(0))
https://regex101.com/r/uH3xM9/1
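The question also counts how often each matching word occurs. One way to combine that with a compiled pattern is sketched below. Two things here are assumptions rather than part of the answer: the sample text repeats a word so the count is visible, and words are lowercased before counting. The group is written as (?:...) because re.findall returns only the group contents when a pattern contains a capturing group.

```python
import re
from collections import Counter

text = 'fabh, obar! (Abtt) yybA, kk fabh'
# (?:...) groups without capturing, so findall returns the whole words.
pattern = re.compile(r'\w*(?:ab|ba)\w*', re.IGNORECASE)

counts = Counter(word.lower() for word in pattern.findall(text))
for word in sorted(counts):
    print('%s: %d' % (word, counts[word]))
print(len(counts))  # number of distinct matching words
```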
Regular expressions are not the best tool for the job in this case. They'll complicate stuff way too much for such simple circumstances. You can instead use Python's builtin in operator (works for both Python 2 and 3)...
sentence = "There are no probable situations whereby that may happen, or so it seems since the Abolition."
words = [''.join(filter(lambda x: x.isalpha(), token)) for token in sentence.split()]
for word in words:
    word = word.lower()
    if 'ab' in word or 'ba' in word:
        print('Word "{}" matches pattern!'.format(word))
As you can see, 'ab' in word evaluates to True if the string 'ab' is found as-is (that is, exactly) in word, and to False otherwise. For example, 'ba' in 'probable' is True and 'ab' in 'Abolition' is False. The second line takes care of dividing the sentence into words and stripping any punctuation characters. word = word.lower() makes word lowercase before the comparisons, so that for word = 'Abolition', 'ab' in word is True.
I would do it this way:
First, strip the unwanted characters from your string using either of the two techniques below:
a - By building a translation dictionary and using the translate method:
>>> import string
>>> del_punc = dict.fromkeys(ord(c) for c in string.punctuation)
>>> s = 'abolition, fabrics, probable, test, case, bank;, halfback 1(ablution).'
>>> s = s.translate(del_punc)
>>> print(s)
abolition fabrics probable test case bank halfback 1ablution
b - Using the re.sub method (re.escape keeps characters such as ] and \ from breaking the character class):
>>> import string
>>> import re
>>> s = 'abolition, fabrics, probable, test, case, bank;, halfback 1(ablution).'
>>> s = re.sub(r'[%s]' % re.escape(string.punctuation), '', s)
>>> print(s)
abolition fabrics probable test case bank halfback 1ablution
Next will be finding your words containing 'ab' or 'ba':
a - Splitting over whitespaces and finding occurrences of your desired strings, which is the one I recommend to you:
>>> [x for x in s.split() if 'ab' in x.lower() or 'ba' in x.lower()]
['abolition', 'fabrics', 'probable', 'bank', 'halfback', '1ablution']
b -Using re.finditer method:
>>> pat
re.compile('\\b.*?(ab|ba).*?\\b', re.IGNORECASE)
>>> for m in pat.finditer(s):
...     print(m.group())
...
abolition
fabrics
probable
test case bank
halfback
1ablution
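The line "test case bank" in that output appears because . also matches spaces, so a lazy .*? can run across several words before it reaches "ba". A possible fix, sketched here with a hypothetical word_pat, is to allow only word characters around the pair:

```python
import re

s = 'abolition fabrics probable test case bank halfback 1ablution'
# \w* cannot cross a space, so each match stays inside a single word.
word_pat = re.compile(r'\b\w*(?:ab|ba)\w*\b', re.IGNORECASE)

matches = [m.group() for m in word_pat.finditer(s)]
print(matches)
# ['abolition', 'fabrics', 'probable', 'bank', 'halfback', '1ablution']
```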
string = "your string here"
lowercase = string.lower()
if 'ab' in lowercase or 'ba' in lowercase:
    print(True)
else:
    print(False)
Try this one
[(),/]*([a-z]|(ba|ab))+[(),/]*

How to format my string

def main():
    print('Please enter a sentence without spaces and each word has ' + \
          'a capital letter.')
    sentence = input('Enter your sentence: ')
    for ch in sentence:
        if ch.isupper():
            capital = ch
            sentence = sentence.replace(capital, ' ' + capital)
main()
Ex: sentence = 'ExampleSentenceGoesHere'
I need this to print as: Example sentence goes here
As of right now, it prints as: Example Sentence Goes Here (with a space at the beginning)
You can iterate over the string character by character and replace every upper case letter with a space and appropriate lower case letter:
>>> s = 'ExampleSentenceGoesHere'
>>> "".join(' ' + i.lower() if i.isupper() else i for i in s).strip().capitalize()
'Example sentence goes here'
Note that isupper() is what checks whether a character is upper case. Calling strip() and capitalize() just takes care of the leading space and the first letter.
Also see relevant threads:
Elegant Python function to convert CamelCase to snake_case?
How to check if a character is upper-case in Python?
You need to convert each uppercase letter to a lowercase one using capital.lower(). You should also skip the first letter of the sentence so that it stays capitalised and doesn't get a space in front of it. You can do this using a flag, like so:
is_first_letter = True
for ch in sentence:
    if is_first_letter:
        is_first_letter = False
        continue
    if ch.isupper():
        capital = ch
        sentence = sentence.replace(capital, ' ' + capital.lower())
I'd probably use re and re.split("[A-Z]", text) but I'm assuming you can't do that because this looks like homework. How about:
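def main():
    text = input(">>")
    newtext = ""
    for character in text:
        if character.isupper():
            ch = " " + character.lower()
        else:
            ch = character
        newtext += ch
    # newtext starts with a space and a lowercased first letter, so swap in the original capital
    text = text[0]+newtext[2:]
    print(text)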
You could also do:
transdict = {letter:" "+letter.lower() for letter in 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'}
transtable = str.maketrans(transdict)
text.translate(transtable).strip().capitalize()
But again I think that's outside the scope of the assignment
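For completeness, a regex version could look like the sketch below. It uses re.sub with a replacement function rather than the re.split mentioned above, and split_camel_case is a name invented for this example.

```python
import re

def split_camel_case(text):
    """Turn 'ExampleSentenceGoesHere' into 'Example sentence goes here'."""
    # (?<!^) skips a capital at the very start, so no leading space is added.
    return re.sub(r'(?<!^)[A-Z]', lambda m: ' ' + m.group(0).lower(), text)

print(split_camel_case('ExampleSentenceGoesHere'))
# Example sentence goes here
```

Because the first letter is never touched, no strip() or capitalize() call is needed afterwards.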

How do I print words with only 1 vowel?

My code so far is below, but I'm so lost that it doesn't do anything close to what I want:
vowels = 'a','e','i','o','u','y'
#Consider 'y' as a vowel
input = input("Enter a sentence: ")
words = input.split()
if vowels == words[0]:
    print(words)
So for an input like this:
"this is a really weird test"
I want it to print only:
this, is, a, test
because each of them contains only 1 vowel.
Try this:
vowels = set(('a','e','i','o','u','y'))
def count_vowels(word):
    return sum(letter in vowels for letter in word)
my_string = "this is a really weird test"
def get_words(my_string):
    for word in my_string.split():
        if count_vowels(word) == 1:
            print(word)
Result:
>>> get_words(my_string)
this
is
a
test
Here's another option:
import re
words = 'This sentence contains a bunch of cool words'
for word in words.split():
    if len(re.findall('[aeiouy]', word)) == 1:
        print(word)
Output:
This
a
bunch
of
words
You can translate all the vowels to a single vowel and count that vowel:
import string
trans = string.maketrans('aeiouy','aaaaaa')
strs = 'this is a really weird test'
print [word for word in strs.split() if word.translate(trans).count('a') == 1]
>>> s = "this is a really weird test"
>>> [w for w in s.split() if len(w) - len(w.translate(None, "aeiouy")) == 1]
['this', 'is', 'a', 'test']
Not sure if words with no vowels are required. If so, just replace == 1 with < 2
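The translate(None, "aeiouy") call is the Python 2 form. On Python 3 the same idea can be written with str.maketrans, whose third argument lists characters to delete. A sketch, with drop_vowels and one_vowel_words as illustrative names:

```python
VOWELS = 'aeiouy'
# The third argument of str.maketrans lists the characters to delete.
drop_vowels = str.maketrans('', '', VOWELS)

def one_vowel_words(sentence):
    """Keep the words that lose exactly one character when vowels are removed."""
    return [w for w in sentence.split()
            if len(w) - len(w.translate(drop_vowels)) == 1]

print(one_vowel_words('this is a really weird test'))
# ['this', 'is', 'a', 'test']
```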
You can use a for loop to collect the substrings into an array of strings, ending a substring whenever the next character is a space.
Then, for each substring, check whether it contains exactly one vowel (a, e, i, o, u); if it does, add it to another array.
After that, join all the strings in that second array with commas and spaces.
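Those steps might be sketched as follows. The function name is hypothetical, split() stands in for the manual space check, and 'y' is counted as a vowel because the question asks for that.

```python
def words_with_one_vowel(sentence, vowels='aeiouy'):
    """Return the one-vowel words of sentence, joined with ', '."""
    selected = []
    for word in sentence.split():      # step 1: cut the sentence at whitespace
        vowel_count = sum(1 for ch in word.lower() if ch in vowels)
        if vowel_count == 1:           # step 2: keep words with exactly one vowel
            selected.append(word)
    return ', '.join(selected)         # step 3: join with commas and spaces

print(words_with_one_vowel('this is a really weird test'))
# this, is, a, test
```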
Try this:
vowels = ('a','e','i','o','u','y')
words = [i for i in input('Enter a sentence ').split() if i != '']
interesting = [word for word in words if sum(1 for char in word if char in vowels) == 1]
I found so much nice code here, and I want to show my ugly one:
v = 'aoeuiy'
o = 'oooooo'
sentence = 'i found so much nice code here'
words = sentence.split()
trans = str.maketrans(v,o)
for word in words:
    if not word.translate(trans).count('o') >1:
        print(word)
I find your lack of regex disturbing.
Here's a plain regex only solution (ideone):
import re
str = "this is a really weird test"
words = re.findall(r"\b[^aeiouy\W]*[aeiouy][^aeiouy\W]*\b", str)
print(words)
