How to extract a string contained in nested list? - python

Please help me extract the strings that contain a particular piece of text. This is what I have tried:
lst = [['abc', 'abgoodhj', 'rygbadkk'], ['jhjbadnm'], ['hjhj', 'iioytu'], ['hjjh', 'ghjgood1hj', 'jjkkbadgghhj', 'hjhgkll']]
for lst1 in lst:
    good_wrd = [txt for txt in lst1 if txt.contains('good')]
    bad_wrd = [txt for txt in lst1 if txt.contains('bad')]
I want the words that contain good and bad.
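For context on why the attempt above fails: Python strings have no contains method, so txt.contains('good') raises AttributeError, and substring tests are written with the in operator instead. The loop also rebinds good_wrd and bad_wrd on every pass, so only the matches from the last sublist would survive. A minimal sketch of both points (the names word, has_contains, is_match and last_only are illustrative, not from the question):

```python
word = 'abgoodhj'

# str has no .contains() method; the "in" operator does the substring test.
has_contains = hasattr(word, 'contains')
is_match = 'good' in word

# Rebinding the name inside the loop keeps only the result for the last sublist.
lst = [['abgoodhj'], ['hjhj']]
for lst1 in lst:
    last_only = [txt for txt in lst1 if 'good' in txt]

print(has_contains, is_match, last_only)
```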

Use a list comprehension to create a new list:
good_wrd = [
    word
    for sub_lst in lst
    for word in sub_lst
    if "good" in word
]
bad_wrd = [
    word
    for sub_lst in lst
    for word in sub_lst
    if "bad" in word
]
Alternatively using for loops:
good_wrd = []
bad_wrd = []
for sub_lst in lst:
    for word in sub_lst:
        if "bad" in word:
            bad_wrd.append(word)
        elif "good" in word:
            good_wrd.append(word)
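If the same filter is needed for several search strings, one option is to flatten the nested list once and wrap the test in a small helper. This is only a sketch: words_containing is a hypothetical helper name, and itertools.chain.from_iterable is used here to do the flattening.

```python
from itertools import chain

lst = [['abc', 'abgoodhj', 'rygbadkk'], ['jhjbadnm'], ['hjhj', 'iioytu'],
       ['hjjh', 'ghjgood1hj', 'jjkkbadgghhj', 'hjhgkll']]

def words_containing(nested, target):
    """Return every word in the nested list that contains target."""
    return [word for word in chain.from_iterable(nested) if target in word]

good_wrd = words_containing(lst, 'good')
bad_wrd = words_containing(lst, 'bad')
print(good_wrd)  # ['abgoodhj', 'ghjgood1hj']
print(bad_wrd)   # ['rygbadkk', 'jhjbadnm', 'jjkkbadgghhj']
```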

This would work:
lst = [['abc', 'abgoodhj', 'rygbadkk'], ['jhjbadnm'], ['hjhj', 'iioytu'], ['hjjh', 'ghjgood1hj', 'jjkkbadgghhj', 'hjhgkll']]
good_wrd = []
bad_wrd = []
for lst1 in lst:
    good_wrd.extend([txt for txt in lst1 if 'good' in txt])
    bad_wrd.extend([txt for txt in lst1 if 'bad' in txt])
print(good_wrd)
print(bad_wrd)

target1 = 'good'
target2 = 'bad'
goods = []
bads = []
for lis in lst:
    for txt in lis:
        if target1 in txt:
            goods.append(txt)
        elif target2 in txt:
            bads.append(txt)
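One thing to be aware of with the if/elif form: a word that contains both targets is only recorded under the first one that matches. If that matters, a possible variation is to check every target independently and collect the matches in a dict. The sketch below assumes a made-up word 'goodbad' to show the difference; targets and matches are illustrative names.

```python
lst = [['abc', 'abgoodhj'], ['goodbad', 'jhjbadnm']]  # 'goodbad' is a made-up example
targets = ['good', 'bad']

matches = {target: [] for target in targets}
for sub in lst:
    for txt in sub:
        for target in targets:
            if target in txt:  # independent checks, no elif
                matches[target].append(txt)

print(matches)
```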

Related

Limit test phrase to a few characters

I'm having an issue limiting characters (not words) while trying different solutions, as shown below. Any help is appreciated!
test_list = ['Running the boy gladly jumped']
test_words = test_list[0].split()
suffix_list = ['ed', 'ly', 'ing']
final_list = []
for word in test_words:
    if suffix_list[0] == word[-len(suffix_list[0]):]:
        final_list.append(word[0:-len(suffix_list[0])])
    elif suffix_list[1] == word[-len(suffix_list[1]):]:
        final_list.append(word[0:-len(suffix_list[1])])
    elif suffix_list[2] == word[-len(suffix_list[2]):]:
        final_list.append(word[0:-len(suffix_list[2])])
    else:
        final_list.append(word)
final_list = [' '.join(final_list)]
print (final_list)
If you mean to include only the first 8 characters of each word, you can do this with a list comprehension over final_list like so:
final_list = [word[:min(len(word), 8)] for word in final_list]
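A side note on the slice above: Python slices clamp to the length of the string, so the min(len(word), 8) guard should not be needed and word[:8] gives the same result. A quick check, with sample words made up for illustration:

```python
words = ['Runn', 'continuously']  # sample words, not from the question

with_min = [word[:min(len(word), 8)] for word in words]
plain = [word[:8] for word in words]  # slicing past the end is safe

print(with_min)
print(plain)
```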
This removes the suffixes and limits each resulting word to at most limit characters:
limit = 8
test_list = ['Running the boy gladly jumped continuously']
test_words = test_list[0].split()
suffix_list = ['ed', 'ly', 'ing']
final_list = []
for word in test_words:
    for suffix in suffix_list:
        if word.endswith(suffix):
            final_list.append(word[:-len(suffix)][:limit])
            break
    else:
        final_list.append(word[:limit])
print(' '.join(final_list))
Prints:
Runn the boy glad jump continuo
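The same logic can be wrapped in a function so the suffixes and the limit become parameters. This is a sketch only; strip_and_limit is a hypothetical name, and the defaults simply mirror the values used above.

```python
def strip_and_limit(sentence, suffixes=('ed', 'ly', 'ing'), limit=8):
    """Strip the first matching suffix from each word, then cap its length."""
    result = []
    for word in sentence.split():
        for suffix in suffixes:
            if word.endswith(suffix):
                word = word[:-len(suffix)]
                break  # only the first matching suffix is removed
        result.append(word[:limit])
    return ' '.join(result)

print(strip_and_limit('Running the boy gladly jumped continuously'))
# Runn the boy glad jump continuo
```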
You could use slicing to get the first 8 words (this line replaces the final join in your code):
final_list = [' '.join(final_list[:8])]

How can I filter out a list using another list

I've got a list which includes other lists like this:
l = [['a book','an owl','a banana'],['a car','an apple','a carrot']]
And I've got another list like this:
f = [['book','banana'],['car','apple']]
Now what I want to do is find the items in l that contain the words from their counterpart in f. For example, the items that contain "book" and "banana" (from item 1 in f) among "a book", "an owl" and "a banana" (from item 1 in l).
The next step is to append the found items to a new list, so the result ends up like this:
l_filtered = [['a book','a banana'],['a car','an apple']]
I've been trying to do this with the code below, but it returns nothing but a bunch of empty lists ([]).
Here is the code I wrote:
l = [['a book','an owl','a banana'],['a car','an apple','a carrot']]
f = [['book','a'],['car','apple']]
s = []
for item in l:
    eachlist = item
    filtered = []
    for item in f:
        eachmatcherlist = item
        for item in eachmatcherlist:
            eachword = item
            finder = [s for s in eachlist if eachword in s ]
            filtered.extend(finder)
    s.append(filtered)
    filtered.clear()
print(s)
print(s)
Another issue is that I want to append to s the items from l that contain both "book" and "a". Hypothetically, if my code worked, it would return all the items with "a" or "book" in them, but I want the items with "book" and "a".
I'd really appreciate any help.
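A likely reason for the empty lists, assuming filtered.clear() runs right after s.append(filtered) as in the code above: append stores a reference to the same list object, so clearing it afterwards also empties what is already inside s. A minimal sketch of the effect (s2 and filtered2 are illustrative names):

```python
s = []
filtered = ['a book']
s.append(filtered)   # s holds a reference to the same list object
filtered.clear()     # this also empties the element inside s
after_clear = list(s)

s2 = []
filtered2 = ['a book']
s2.append(filtered2.copy())  # append an independent copy instead
filtered2.clear()

print(after_clear, s2)  # [[]] [['a book']]
```

Since filtered is rebound to a fresh list at the top of each outer iteration anyway, simply dropping the clear() call should have the same effect as copying.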
I worked on your code a little:
l = [['a book','an owl','a banana'],['a car','an apple','a carrot']]
f = [['book','banana'],['car','apple']]
result = []
for phrases,checks in zip(l,f):
    tmp = []
    for phrase in phrases:
        for check in checks:
            for word in phrase.split():
                if check == word:
                    tmp.append(phrase)
    result.append(tmp)
print(result)
Results in:
[['a book', 'a banana'], ['a car', 'an apple']]
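For the second part of the question (keeping only the phrases that contain every word of their counterpart, such as both "book" and "a"), one possible approach is all() over the check words. A sketch, assuming whole-word matching via split() as in the code above:

```python
l = [['a book', 'an owl', 'a banana'], ['a car', 'an apple', 'a carrot']]
f = [['book', 'a'], ['car', 'apple']]

result = [
    [phrase for phrase in phrases
     if all(check in phrase.split() for check in checks)]  # every check word must appear
    for phrases, checks in zip(l, f)
]
print(result)  # [['a book'], []]
```

The second sublist comes out empty because no single phrase contains both "car" and "apple".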

Sorting a list based on upper and lower case

I have a list:
List1 = ['name','is','JOHN','My']
I want to put the pronoun first in a new list and the name last. The other items should be in the middle, and their positions can change.
So far I have written:
my_list = ['name','is','JOHN','My']
new_list = []
for i in my_list:
    if i.isupper():
        my_list.remove(i)
        new_list.append(i)
print(new_list)
Here, I can't tell whether an item is entirely upper case or only its first letter is upper case.
Output I get:
['name','is','JOHN','My']
Output I want:
['My','name','is','JOHN']
or:
['My','is','name','JOHN']
EDIT: I have seen this post and it doesn’t have answers to my question.
i.isupper() will tell you if it's all uppercase.
To test if just the first character is uppercase and the rest lowercase, you can use i.istitle()
To make your final result, you can append to different lists based on the conditions.
all_cap = []
init_cap = []
non_cap = []
for i in my_list:
    if i.isupper():
        all_cap.append(i)
    elif i.istitle():
        init_cap.append(i)
    else:
        non_cap.append(i)
new_list = init_cap + non_cap + all_cap
print(new_list)
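The three-list approach can also be written as a single sort with a key function, since Python's sort is stable and keeps the relative order of items with the same rank. A sketch, where case_rank is a hypothetical helper name:

```python
my_list = ['name', 'is', 'JOHN', 'My']

def case_rank(word):
    """0 for Title-case words, 2 for ALL-CAPS words, 1 for everything else."""
    if word.isupper():   # checked first, so a one-letter word like 'I' ranks as 2
        return 2
    if word.istitle():
        return 0
    return 1

new_list = sorted(my_list, key=case_rank)
print(new_list)  # ['My', 'name', 'is', 'JOHN']
```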
How about this:
s = ['name', 'is', 'JOHN', 'My']
pronoun = ''
name = ''
for i in s:
    if i.isupper():
        name = i
    if i.istitle():
        pronoun = i
result = [pronoun, s[0], s[1], name]
print(result)
Don't # me pls XD. Try this.
my_list = ['name','is','JOHN','My']
new_list = ['']
for i in range(len(my_list)):
    if my_list[i][0].isupper() and my_list[i][1].islower():
        new_list[0] = my_list[i]
    elif my_list[i].islower():
        new_list.append(my_list[i])
    elif my_list[i].isupper():
        new_list.append(my_list[i])
print(new_list)

Remove short overlapping string from list of string

I have a list of strings: mylist = ["Hanks", "Tom Hanks","Tom","Tom Can"]. I need to remove the shorter strings that are substrings of another string in the list.
For example in the case above, output should be : ["Tom Hanks","Tom Can"].
What I have done in python:
mylist = ["Hanks", "Tom Hanks","Tom","Tom Can"]
newlst = []
for x in mylist:
    noexist = True
    for j in mylist:
        if x==j:continue
        noexist = noexist and not(x in j)
    if (noexist==True):
        newlst.append(x)
print(newlst)
The code works fine. How can I make it efficient?
If order in output does not matter (replace ',' character with a character that doesn't occur in strings of your list):
mylist = ["Hanks", "Tom Hanks","Tom","Tom Can"]
mylist.sort(key = len)
newlst = []
for i,x in enumerate(mylist):
    if x not in ','.join(mylist[i+1:]):
        newlst.append(x)
List comprehension alternative (less readable):
mylist = ["Hanks", "Tom Hanks","Tom","Tom Can"]
mylist.sort(key = len)
newlst = [x for i,x in enumerate(mylist) if x not in ','.join(mylist[i+1:])]
output:
['Tom Can', 'Tom Hanks']
And if you want to keep the order:
mylist = ["Hanks", "Tom Hanks","Tom","Tom Can"]
mylist_sorted = mylist.copy()
mylist_sorted.sort(key = len)
newlst = [x for i,x in enumerate(mylist_sorted) if x not in ','.join(mylist_sorted[i+1:])]
newlst = [x for x in mylist if x in newlst]
output:
['Tom Hanks', 'Tom Can']
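A variation that avoids the joined string (and therefore the need to pick a separator character that never occurs in the data) is to process the strings longest first and compare each one only against the strings already kept. This is a sketch rather than a benchmarked improvement; kept is an illustrative name.

```python
mylist = ["Hanks", "Tom Hanks", "Tom", "Tom Can"]

kept = []
# Longest first: a string can only hide inside something longer, and that
# longer string is either kept already or is itself inside a kept string.
for candidate in sorted(set(mylist), key=len, reverse=True):
    if not any(candidate in longer for longer in kept):
        kept.append(candidate)

# Restore the original order.
newlst = [x for x in mylist if x in kept]
print(newlst)  # ['Tom Hanks', 'Tom Can']
```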
See if this helps. The answer is based on the sample list from the question:
mylist = ["Hanks", "Tom Hanks","Tom","Tom Can"]
newlist = []
newstring = "|".join(mylist)
for a in mylist:
    if newstring.count(a) == 1:
        print("Big string: ",a)
        newlist.append(a)
    else:
        print("Small String: ",a)
print(newlist)
The if/else print statements are there to show how the loop traverses the list and checks the condition.
A pretty minor improvement, without changing the overall algorithm: once you find another element that contains the current element, you can break out of the inner loop, since the remaining comparisons no longer matter.
mylist = ["Hanks", "Tom Hanks","Tom","Tom Can"]
newlist = []
for elem in mylist:
    for candidate in mylist:
        if elem == candidate:
            continue
        elif elem in candidate:
            break
    else:
        newlist.append(elem)
print(newlist)
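The for/else loop above can also be expressed with any(), which stops at the first match in the same way the break does. A sketch, with remove_contained as a hypothetical function name:

```python
def remove_contained(strings):
    """Keep strings that are not a substring of any different string in the list."""
    return [
        elem for elem in strings
        if not any(elem != other and elem in other for other in strings)
    ]

mylist = ["Hanks", "Tom Hanks", "Tom", "Tom Can"]
print(remove_contained(mylist))  # ['Tom Hanks', 'Tom Can']
```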
If your strings are always words, you can just split on the words and filter by set operations, which should be quite fast.
from collections import Counter
items = ["Hanks", "Tom Hanks","Tom","Tom Can"]
items = set(items) # Don't want to think about uniqueness
item_words = {} # {item: all_words}
word_counts = Counter() # {word: item_counts}
word_lookups = {} # {word: {all_words: {item, ...}, ...}, ...}
for item in items:
    words = frozenset(item.split())
    item_words[item] = words
    for word in words:
        word_lookups.setdefault(word, {}).setdefault(words, set()).add(item)
        word_counts[word] += 1
def is_ok(item):
    words = item_words[item]
    min_word = min(words, key=word_counts.__getitem__)
    if word_counts[min_word] == 1:
        return True # This item has a unique word
    for all_words, others in word_lookups[min_word].items():
        if not words.issubset(all_words):
            continue # Not all words present
        for other in others:
            if item == other:
                continue # Don't remove yourself
            if item in other:
                return False
    return True # No matches
final = [item for item in items if is_ok(item)]
If you want to be very fast, consider a variation on the Aho–Corasick algorithm, where you would construct patterns for all your entries, and match them against all your inputs, and discard any patterns that have more than one match. This could potentially be linear in time.

In how many lists does the word appear?

I have different lists in python:
list1 = [hello,there,hi]
list2 = [my,name,hello]
I need to make a dictionary with the key being the number of lists a word appears in. So my answer would look like
{2:hello,1:hi ....}
I am new to python and I have no idea how to do this.
You need to use a dictionary to store key-value results.
Here is some code to help you get started, but you'll have to adapt it to your exact problem.
#!/usr/bin/python
list1 = ["hello","there","hi"]
list2 = ["my","name","hello"]
result = dict()
for word in list1:
    if word in result:
        result[word] = result[word] + 1
    else:
        result[word] = 1
for word in list2:
    if word in result:
        result[word] = result[word] + 1
    else:
        result[word] = 1
print(result)
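The two loops can be collapsed with collections.Counter. The sketch below also converts each list to a set first, on the assumption that a word repeated inside one list should still count as appearing in only one list:

```python
from collections import Counter

list1 = ["hello", "there", "hi"]
list2 = ["my", "name", "hello"]

counts = Counter()
for words in (list1, list2):
    counts.update(set(words))  # set(): each list contributes at most 1 per word

print(counts["hello"], counts["hi"])  # 2 1
```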
As a first step, build a word-count dictionary that you will invert afterwards.
Initialize it:
words_count = {}
Then, for each list of words, do this:
for word in list_of_words:
    if word not in words_count:
        words_count[word] = 1
    else:
        words_count[word] += 1
Then invert words_count like so:
inv_words_count = {v: k for k, v in words_count.items()}
inv_words_count is the desired result
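One caveat with a plain inversion: dictionary keys are unique, so when several words share the same count only one of them survives. If all words should be kept, a possible alternative is to group them into lists per count. A sketch, where the contents of words_count are assumed from the question's two lists and by_count is an illustrative name:

```python
from collections import defaultdict

# Assumed counts for the question's two lists.
words_count = {"hello": 2, "there": 1, "hi": 1, "my": 1, "name": 1}

by_count = defaultdict(list)
for word, count in words_count.items():
    by_count[count].append(word)  # every word with this count is kept

print(dict(by_count))
# {2: ['hello'], 1: ['there', 'hi', 'my', 'name']}
```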
I have slightly modified your input lists (list1 & list2) as shown below:
list1 = ['hello,there,hi'] # Added quotes as it is a string
list2 = ['my,name,hello']
Here is the logic:
list1 = list1[0].split(',')
list2 = list2[0].split(',')
list_final = list1 + list2
dict_final = {}
for item in list_final:
    if item in dict_final.keys():
        dict_final.update({item:(dict_final.get(item) + 1)})
    else:
        dict_final.update({item:1})
Hope it will work as you are expecting :)
