a = ["I like apple","I love you so much"]
I want to replace the words "apple" and "love" with "orange" and "like", respectively,
but I don't want to change anything else in the elements (sentences) of a, so the output must be
a= ["I like orange","I like you so much"]
Nothing except those two words should change. Not even a space should be added or dropped.
You can have a dictionary to do the mapping:
word_mapping = {
'apple': 'orange',
'love': 'like'
}
And then, providing you have a list of strings:
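from functools import reduce  # reduce is not a builtin in Python 3

def translate(text):
    # Apply every replacement in word_mapping to text, one key at a time
    return reduce(lambda x, y: x.replace(y, word_mapping[y]), word_mapping, text)

def translate_all(text_list):
    return [translate(s) for s in text_list]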
e.g.:
a = ["I like apple","I love you so much"]
b = translate_all(a)
print(b)
# > ['I like orange', 'I like you so much']
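One caveat with str.replace is that it matches substrings, so "apple" inside "pineapple" would also be rewritten. A minimal sketch of a whole-word variant, assuming words are delimited by regex word boundaries (the name translate_words is made up for illustration and is not part of the answer above):

```python
import re

word_mapping = {'apple': 'orange', 'love': 'like'}

# One alternation of all keys, anchored on word boundaries.
pattern = re.compile(r'\b(' + '|'.join(map(re.escape, word_mapping)) + r')\b')

def translate_words(text):
    # Each match is looked up once, so one replacement never feeds another.
    return pattern.sub(lambda m: word_mapping[m.group(1)], text)

a = ["I like apple", "I love you so much", "pineapple lover"]
b = [translate_words(s) for s in a]
print(b)
# ['I like orange', 'I like you so much', 'pineapple lover']
```

Whitespace and every other character are left untouched, because only the matched words are substituted.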
You can use:
a[0].replace('apple', 'orange')
a[1].replace('love', 'like')
If you want to leave a as it is and build a new list, try this:
>>> a = ["I like apple","I love you so much"]
>>> [i.replace('apple','orange').replace('love','like') for i in a]
['I like orange', 'I like you so much']
>>>
Strings are immutable in Python, i.e. they cannot be modified in place.
So, a call like
a[0].replace("apple","orange")
will return a new string and not change the original element of the list.
a[0]=a[0].replace("apple","orange")
This should do the job.
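To make the difference concrete, here is a small sketch (the variable names before and same_list are only for illustration). It first shows that a bare replace call leaves the list alone, then updates every element while keeping the same list object by assigning to a slice:

```python
a = ["I like apple", "I love you so much"]
same_list = a  # second reference to the same list object

a[0].replace("apple", "orange")  # returns a new string that is discarded
before = list(a)                 # snapshot: still the original sentences

# Slice assignment replaces the contents but keeps the list object itself.
a[:] = [s.replace("apple", "orange").replace("love", "like") for s in a]

print(before)     # ['I like apple', 'I love you so much']
print(a)          # ['I like orange', 'I like you so much']
print(same_list is a)  # True
```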
Are you looking for something like this?
The following code loops over the strings in the list, replaces apple with orange and love with like, and returns the new list. The two replace calls have to be chained: joining them with or would return the result of the first call whenever it is a non-empty string, so love would never be replaced.
[i.replace("apple","orange").replace("love","like") for i in a]
Here's an execution
>>> a = ["I like apple","I love you so much"]
>>> [i.replace("apple","orange").replace("love","like") for i in a]
['I like orange', 'I like you so much']
replace_map = [
('apple', 'orange'),
('love', 'like')
]
def replace_all(s, m):
    for k, v in m:
        s = s.replace(k, v)
    return s
b = [replace_all(s, replace_map) for s in a]
Using a list comprehension, as proposed by Patrick Haugh, Sudheesh Singanamalla and Mohideen ibn Mohammed, is probably the more Pythonic way.
A step-by-step, less Pythonic version is:
>>> a = ["I like apple","I love you so much"]
>>> for i in range(len(a)):
...     if a[i].find('apple') != -1:  # compare by value; "is not -1" tests identity
...         a[i] = a[i].replace('apple', 'orange')
...     if a[i].find('love') != -1:
...         a[i] = a[i].replace('love', 'like')
...
>>> a
['I like orange', 'I like you so much']
Or more directly:
>>> for i in range(len(a)):
...     a[i] = a[i].replace('apple', 'orange')
...     a[i] = a[i].replace('love', 'like')
...
>>> a
['I like orange', 'I like you so much']
I am looking to match a phrase inside a list.
I'm using python to match a phrase inside a list. The phrases can be inside the list, or they can not be inside a list.
list1 = ['I would like to go to a party', 'I am sam', 'That is correct', 'I am currently living in Texas']
phrase1= 'I would like to go to a party'
phrase2= 'I am sam'
If phrase1 and phrase2 are both inside list1, return correct or 100%. The purpose is to make sure that phrase1 and phrase2 are matched word for word.
Conversely, if neither phrase is inside the list, or only one of them is, as in list2 below, then return false or 0%.
list2 = ['I am mike', 'I don\'t go to party', 'I am sam']
phrase1= 'I would like to go to a party'
phrase2= 'I am sam'
The phrases are not fixed to these two. They can be whatever the user sets, for instance 'I am not good.'
It seems like you simply want to check for membership in the list:
list1 = ['I would like to go to a party', 'I am sam', 'That is correct', 'I am currently living in Texas']
phrase1 = 'I would like to go to a party'
phrase2 = 'I am sam'
if phrase1 in list1 and phrase2 in list1:
    # whatever you want, this will execute if True
    pass
else:
    # whatever you want, this will execute if False
    pass
I am not sure I understand you, but maybe you can try
if phrase1 in list1:
to check whether a phrase is in a list.
You can use all and a comprehension:
def check(phrase_list, *phrases):
    return all(p in phrase_list for p in phrases)
In use:
list1 = ['I would like to go to a party', 'I am sam', 'That is correct', 'I am currently living in Texas']
phrase1= 'I would like to go to a party'
phrase2= 'I am sam'
print(check(list1, phrase1, phrase2))
#True
print(check(list1, 'I am sam', 'dragon'))
#False
You can also use a set
Like this:
set(list1) >= {phrase1, phrase2}
#True
Or like this:
# you can call this the same way I called the other check
def check(phrase_list, *phrases):
    # compare against the list that was passed in, not the global list1
    return set(phrase_list) >= set(phrases)
Edit
To print 100% or 0% you could simply use an if statement or use boolean indexing:
print(('0%', '100%')[check(list1, phrase1, phrase2)])
To do this in your return statement:
return ('0%', '100%')[the_method_you_choose]
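Putting the pieces together, a minimal sketch of a function that returns the percentage string directly could look like this. The name match_percentage is hypothetical, and the all-or-nothing result follows the question's wording (100% only when every phrase is an exact element of the list):

```python
def match_percentage(phrase_list, *phrases):
    """Return '100%' if every phrase is an exact element of phrase_list, else '0%'."""
    return '100%' if all(p in phrase_list for p in phrases) else '0%'

list1 = ['I would like to go to a party', 'I am sam', 'That is correct',
         'I am currently living in Texas']
list2 = ['I am mike', "I don't go to party", 'I am sam']
phrase1 = 'I would like to go to a party'
phrase2 = 'I am sam'

print(match_percentage(list1, phrase1, phrase2))  # 100%
print(match_percentage(list2, phrase1, phrase2))  # 0%
```

A conditional expression is used here instead of tuple indexing with a Boolean, which some readers find easier to follow.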
I came across the following question and was wondering what would be an elegant way to solve it.
Let's say we have two strings:
string1 = "I love to eat $(fruit)"
string2 = "I love to eat apples"
The only difference between those strings is $(fruit) and apples.
So, I can find the fruit is apples, and a dict{fruit:apples} could be returned.
Another example would be:
string1 = "I have $(food1), $(food2), $(food3) for lunch"
string2 = "I have rice, soup, vegetables for lunch"
I would like to have a dict{food1:rice, food2:soup, food3:vegetables} as the result.
Anyone have a good idea about how to implement it?
Edit:
I think I need the function to be more powerful.
ex.
string1 = "I want to go to $(place)"
string2 = "I want to go to North America"
result: {place : North America}
ex.
string1 = "I won $(index)place in the competition"
string2 = "I won firstplace in the competition"
result: {index : first}
The Rule would be: map the different parts of the string and make them a dict
So I guess all answers using str.split() or trying to split the string will not work. There is no rule that says what characters would be used as a separator in the string.
I think this can be cleanly done with regex-based splitting. This should also handle punctuation and other special characters (where a split on space is not enough).
import re
p = re.compile(r'[^\w$()]+')
mapping = {
    x[2:-1]: y
    for x, y in zip(p.split(string1), p.split(string2))
    if x != y
}
For your examples, this returns
{'fruit': 'apples'}
and
{'food1': 'rice', 'food2': 'soup', 'food3': 'vegetables'}
One solution is to replace $(name) with (?P<name>.*) and use that as a regex:
import re

def make_regex(text):
    # Turn every $(name) placeholder into a named group (?P<name>.*)
    replaced = re.sub(r'\$\((\w+)\)', r'(?P<\1>.*)', text)
    return re.compile(replaced)

def find_mappings(mapper, text):
    return make_regex(mapper).match(text).groupdict()
Sample usage:
>>> string1 = "I have $(food1), $(food2), $(food3) for lunch"
>>> string2 = "I have rice, soup, vegetable for lunch"
>>> string3 = "I have rice rice rice, soup, vegetable for lunch"
>>> make_regex(string1).pattern
'I have (?P<food1>.*), (?P<food2>.*), (?P<food3>.*) for lunch'
>>> find_mappings(string1, string2)
{'food1': 'rice', 'food3': 'vegetable', 'food2': 'soup'}
>>> find_mappings(string1, string3)
{'food1': 'rice rice rice', 'food3': 'vegetable', 'food2': 'soup'}
Note that this can handle non alpha numeric tokens (see food1 and rice rice rice). Obviously this will probably do an awful lot of backtracking and might be slow. You can tweak the .* regex to try and make it faster depending on your expectations on "tokens".
For production ready code you'd want to re.escape the parts outside the (?P<name>.*) groups. A bit of pain in the ass to do because you have to "split" that string and call re.escape on each piece, put them together and call re.compile.
Since my answer got accepted I wanted to include a more robust version of the regex:
def make_regex(text):
    regex = ''.join(map(extract_and_escape, re.split(r'\$\(', text)))
    return re.compile(regex)

def extract_and_escape(partial_text):
    m = re.match(r'(\w+)\)', partial_text)
    if m:
        group_name = m.group(1)
        return ('(?P<%s>.*)' % group_name) + re.escape(partial_text[len(group_name)+1:])
    return re.escape(partial_text)
This avoids issues when the text contains special regex characters (e.g. I have $(food1) and it costs $$$). The first solution would end up treating $$$ as three $ anchors (which would fail to match); this robust solution escapes them.
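As a variation on the same idea, here is a compact sketch that relies on re.split with a capturing group to separate literal text from placeholder names, uses non-greedy groups, and requires the whole string to match. The function name match_template is hypothetical, and it assumes placeholder names are valid Python identifiers (a name starting with a digit would not be a valid regex group name):

```python
import re

def match_template(template, text):
    # With one capturing group, re.split returns
    # [literal, name, literal, name, ..., literal].
    parts = re.split(r'\$\((\w+)\)', template)
    regex = ''.join(
        '(?P<%s>.+?)' % part if i % 2 else re.escape(part)
        for i, part in enumerate(parts)
    )
    m = re.fullmatch(regex, text)
    return m.groupdict() if m else None

print(match_template("I want to go to $(place)", "I want to go to North America"))
# {'place': 'North America'}
print(match_template("I won $(index)place in the competition",
                     "I won firstplace in the competition"))
# {'index': 'first'}
```

Returning None when the text does not fit the template avoids the AttributeError that calling groupdict() on a failed match would raise.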
I suppose this does the trick.
s_1 = 'I had $(food_1), $(food_2) and $(food_3) for lunch'
s_2 = 'I had rice, meat and vegetable for lunch'
result = {}
for elem1, elem2 in zip(s_1.split(), s_2.split()):
    if elem1.startswith('$'):
        # strip the trailing comma from both the placeholder and the value
        result[elem1.strip(',')[2:-1]] = elem2.strip(',')
print(result)
# {'food_1': 'rice', 'food_2': 'meat', 'food_3': 'vegetable'}
If you'd rather not use regex:
string1 = "I have $(food1), $(food2), $(food3) for lunch"
string2 = "I have rice, soup, vegetable for lunch"
trans_table = str.maketrans({'$': '', '(': '', ')': '', ',': ''})
{
    substr1.translate(trans_table): substr2.translate(trans_table)
    for substr1, substr2 in zip(string1.split(), string2.split())
    if substr1 != substr2
}
Output:
{'food1': 'rice', 'food2': 'soup', 'food3': 'vegetable'}
Alternatively, something a bit more flexible:
def substr_parser(substr, chars_to_ignore='$(),'):
    trans_table = str.maketrans({char: '' for char in chars_to_ignore})
    substr = substr.translate(trans_table)
    # More handling here
    return substr

{
    substr_parser(substr1): substr_parser(substr2)
    for substr1, substr2 in zip(string1.split(), string2.split())
    if substr1 != substr2
}
Same output as above.
You can use re:
import re
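def get_dict(a, b):
    # placeholder names found in the template
    keys = re.findall(r'(?<=\$\().*?(?=\))', a)
    # a callable replacement avoids "bad escape \w" errors in re.sub
    values = re.findall(re.sub(r'\$\(.*?\)', lambda m: r'(\w+)', a), b)
    # with several groups, findall returns a list holding one tuple
    return dict(zip(keys, values if not isinstance(values[0], tuple) else values[0]))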
d = [["I love to eat $(fruit)", "I love to eat apple"], ["I have $(food1), $(food2), $(food3) for lunch", "I have rice, soup, vegetable for lunch"]]
results = [get_dict(*i) for i in d]
Output:
[{'fruit': 'apple'}, {'food3': 'vegetable', 'food2': 'soup', 'food1': 'rice'}]
You can do:
>>> dict((x.strip('$(),'),y.strip(',')) for x,y in zip(string1.split(), string2.split()) if x!=y)
{'food1': 'rice', 'food2': 'soup', 'food3': 'vegetable'}
Or with a regex:
>>> import re
>>> dict((x, y) for x,y in zip(re.findall(r'\w+', string1), re.findall(r'\w+', string2)) if x!=y)
{'food1': 'rice', 'food2': 'soup', 'food3': 'vegetable'}
zip in combination with a dictionary comprehension works well here: we can zip the two lists of words and keep only the pairs that are not equal.
l = [*zip(s1.split(),s2.split())]
d = {i[0].strip('$(),'): i[1] for i in l if i[0] != i[1] }
Hi, I have the following two lists and want to build a third, updated list: if any string from the list 'wrong' appears inside an element of the list 'old', that entire element should be filtered out. In other words, I want the updated list to be equivalent to the 'new' list.
wrong = ['top up','national call']
old = ['Hi Whats with ','hola man top up','binga dingo','on a national call']
new = ['Hi Whats with', 'binga dingo']
You can use filter:
>>> list(filter(lambda x:not any(w in x for w in wrong), old))
['Hi Whats with ', 'binga dingo']
Or, a list comprehension,
>>> [i for i in old if not any(x in i for x in wrong)]
['Hi Whats with ', 'binga dingo']
If you're not comfortable with any of those, use a simple for loop based solution like below:
>>> result = []
>>> for i in old:
...     for x in wrong:
...         if x in i:
...             break
...     else:
...         result.append(i)
...
>>> result
['Hi Whats with ', 'binga dingo']
>>> wrong = ['top up','national call']
>>> old = ['Hi Whats with ','hola man top up','binga dingo','on a national call']
>>> [i for i in old if all(x not in i for x in wrong)]
['Hi Whats with ', 'binga dingo']
>>>
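Note that the outputs above keep the trailing space of 'Hi Whats with ', whereas the expected 'new' list in the question has none. A small sketch that wraps the filter in a function and also strips surrounding whitespace, assuming that is what the asker wants (the name drop_matching is made up for illustration):

```python
def drop_matching(lines, banned):
    # Keep only lines containing none of the banned substrings,
    # and trim surrounding whitespace from the survivors.
    return [line.strip() for line in lines if not any(b in line for b in banned)]

wrong = ['top up', 'national call']
old = ['Hi Whats with ', 'hola man top up', 'binga dingo', 'on a national call']

print(drop_matching(old, wrong))
# ['Hi Whats with', 'binga dingo']
```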
I want to write a Python program to test whether any phrase from a list occurs in a string.
string ='I love my travel all over the world'
list =['I love','my travel','all over the world']
So I want to test whether any of the items in the list matches that string, and print the ones that do: 'I love', 'my travel' or 'all over the world'.
any(x in string for x in list)
Or do I need to use text mining to solve the problem?
Your current solution is probably the best to use in this given scenario. You could encapsulate it as a function if you wanted.
def list_in_string(slist, string):
    return any(x in string for x in slist)
You can't do this:
if any(x in string for x in word_list):
    print(x)
because x only exists inside the generator expression. The any function consumes the generator, discards the x variable, and simply returns a Boolean (True or False).
You can however, just break apart your any function so that you can get your desired output.
string ='I love traveling all over the world'
word_list =['I love','traveling','all over the world']
for x in word_list:
    if x in string:
        print(x)
This will output:
>>>
I love
traveling
all over the world
>>>
Update using string.split() :
string =['I', 'love','traveling','all', 'over', 'the', 'world']
word_list =['I love','traveling','all over the world']
count = 0
for x in word_list:
    for y in x.split():
        if y in string:
            count += 1
    # print only multi-word phrases whose words were all found
    if count == len(x.split()) and ' ' in x:
        print(x)
    count = 0
This will output:
>>>
I love
all over the world
>>>
If you want a True or False returned, you can definitely use any(), for example:
>>> string = 'I love my travel all over the world'
>>> list_string =['I love',
'my travel',
'all over the world',
'Something something',
'blah']
>>> any(x for x in list_string if x in string)
True
>>>
Otherwise, you could do some simple list comprehension:
>>> string ='I love my travel all over the world'
>>> list_string =['I love',
'my travel',
'all over the world',
'Something something',
'blah']
>>> [x for x in list_string if x in string]
['I love', 'my travel', 'all over the world']
>>>
Depending on what you want returned, both of these work perfectly.
You could also probably use regular expression, but it's a little overkill for something so simple.
For completeness, one may mention the find method:
_string ='I love my travel all over the world'
_list =['I love','my travel','all over the world','spam','python']
for i in range(len(_list)):
    if _string.find(_list[i]) > -1:
        print(_list[i])
Which outputs:
I love
my travel
all over the world
Note: this solution is not as elegant as the in usage mentioned, but may be useful if the position of the found substring is needed.
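As a sketch of the case where the position is needed, the following collects the start index of each phrase that occurs in the string. The names text, phrases and positions are only for illustration:

```python
text = 'I love my travel all over the world'
phrases = ['I love', 'my travel', 'all over the world', 'spam', 'python']

positions = {}
for phrase in phrases:
    index = text.find(phrase)  # -1 means the phrase does not occur
    if index != -1:
        positions[phrase] = index

print(positions)
# {'I love': 0, 'my travel': 7, 'all over the world': 17}
```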
I am trying to append to an empty list the "sentences" from a different list that contain # (hashtags).
Currently my code gives me a new list whose length reflects the total number of loop iterations rather than one entry per matching sentence.
The code snippet is given below
import re
old_list = ["I love #stackoverflow because #people are very #helpful!","But I dont #love hastags",
"So #what can you do","Some simple senetnece","where there is no hastags","however #one can be good"]
new_list = [ ]
for tt in range(0, len(old_list)):
    for ui in old_list:
        if bool(re.search(r"#(\w+)", old_list[tt])) == True:
            new_list.append(old_list[tt])
Please let me know how to append only the single sentence.
I am not sure what output you want, but this will preserve each original sentence along with its matching set of hashtags:
>>> import re
>>> old_list = ["I love #stackoverflow because #people are very #helpful!","But I dont #love hastags",
... "So #what can you do","Some simple senetnece","where there is no hastags","however #one can be good"]
>>> hash_regex = re.compile(r'#(\w+)')
>>> [(hash_regex.findall(l), l) for l in old_list]
[(['stackoverflow', 'people', 'helpful'], 'I love #stackoverflow because #people are very #helpful!'), (['love'], 'But I dont #love hastags'), (['what'], 'So #what can you do'), ([], 'Some simple senetnece'), ([], 'where there is no hastags'), (['one'], 'however #one can be good')]
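If the goal is simply a list with each hashtagged sentence appearing once, a single pass with one test per sentence is enough. A minimal sketch, assuming that is the desired output (the inner loop in the question is what causes each sentence to be appended several times):

```python
import re

old_list = ["I love #stackoverflow because #people are very #helpful!",
            "But I dont #love hastags",
            "So #what can you do", "Some simple senetnece",
            "where there is no hastags", "however #one can be good"]

hashtag = re.compile(r'#\w+')

# Keep a sentence once if it contains at least one hashtag.
new_list = [sentence for sentence in old_list if hashtag.search(sentence)]

print(len(new_list))  # 4
```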