check if a nested list contains a substring - python

How to check if a nested list contains a substring?
strings = [[],["one", "two", "three"]]
substring = "wo"
strings_with_substring = [string for string in strings if substring in string]
print(strings_with_substring)
This script just prints:
[]
How can I fix it? The output should be:
two
==
Sayse, the solution you provided doesn't work for me. I am new to Python, so I am sure I am missing something here. Any thoughts?
import re
s = [[],["one", "two", "three"]]
substring = "wo"
# strings_with_substring = [string for string in strings if substring in string]
strings_with_substring = next(s for sl in strings for s in sl if substring in s)
print(strings_with_substring)

You are missing another level of iteration. Here is the looping logic without using a comprehension:
for sublist in strings:
    for item in sublist:
        if substring in item:
            print(item)
Roll that up to a comprehension:
[item for sublist in strings for item in sublist if substring in item]

You're looking for
next(s for sl in strings for s in sl if substring in s)
This outputs "two". If you want a list of all matching elements, keep your list comprehension but apply the same amendments (the extra level of iteration); likewise, change next to any if you just want a boolean result.
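As a minimal sketch of the three variants mentioned above, using the data from the question: passing a default to next (here None, an illustrative choice) avoids a StopIteration when nothing matches.

```python
strings = [[], ["one", "two", "three"]]
substring = "wo"

# First match only; the second argument is returned when nothing matches.
first_match = next((s for sl in strings for s in sl if substring in s), None)

# Every match, collected into a flat list.
all_matches = [s for sl in strings for s in sl if substring in s]

# Just a yes/no answer; stops at the first hit.
has_match = any(substring in s for sl in strings for s in sl)

print(first_match)  # two
print(all_matches)  # ['two']
print(has_match)    # True
```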

Since you said it should just print the string, you could use itertools to flatten your list and run it through a filter that you loop over.
from itertools import chain
strings = [[], ['one', 'two', 'three']]
substring = 'wo'
for found in filter(lambda s: substring in s, chain.from_iterable(strings)):
    print(found)

Related

Join characters from list of strings by index

For example, I have the following list.
list=['abc', 'def','ghi','jkl','mn']
I want to make a new list as:
newList=['adgjm','behkn','cfil']
I want to pick the first character of each element to form a new string and append it to the new list, then do the same with the second character of every element, and so on.
Thanks for the help.
One way is zipping the strings in the list, which will interleave the characters from each string in the specified fashion, and join them back with str.join:
l = ['abc', 'def','ghi','jkl']
list(map(''.join, zip(*l)))
# ['adgj', 'behk', 'cfil']
For strings with different lengths, use zip_longest, and fill with an empty string:
from itertools import zip_longest
l = ['abcZ', 'def','ghi','jkl']
list(map(''.join, zip_longest(*l, fillvalue='')))
# ['adgj', 'behk', 'cfil', 'Z']
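For comparison, here is a small sketch that runs both variants on the five-element list from the question (the variable names words, truncated, and padded are only illustrative). Plain zip stops at the shortest string, so only zip_longest reproduces the requested output.

```python
from itertools import zip_longest

words = ['abc', 'def', 'ghi', 'jkl', 'mn']  # the list from the question

# zip stops as soon as the shortest string ('mn') runs out.
truncated = [''.join(chars) for chars in zip(*words)]

# zip_longest keeps going and pads missing characters with ''.
padded = [''.join(chars) for chars in zip_longest(*words, fillvalue='')]

print(truncated)  # ['adgjm', 'behkn']
print(padded)     # ['adgjm', 'behkn', 'cfil']
```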
You can try this way:
>>> list1 =['abc', 'def','ghi','jkl']
>>> newlist = []
>>> for args in zip(*list1):
...     newlist.append(''.join(args))
...
>>> newlist
['adgj', 'behk', 'cfil']
Or using list comprehension:
>>> newlist = [''.join(args) for args in zip(*list1)]
>>> newlist
['adgj', 'behk', 'cfil']
You can try this:
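words = ['abc', 'def', 'ghi', 'jkl']  # renamed from list to avoid shadowing the built-in
n = len(words[0])  # assumes every word is at least as long as the first one
newList = []
for i in range(n):
    newword = ''
    for word in words:
        newword += word[i]
    newList.append(newword)
print(newList)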

Does string contain any of the words in my list?

I want to check a string to see if it contains any of the words I have in my list.
The list has somewhere around 100 individual words.
I have tried using regex but can't get it to work...
string = '<div class="header_links">$$ - $$$, Dansk, Veganske retter, Glutenfri retter</div>'
list = ['Café','Afrikansk','............','Sushi','Svensk','Sydamerikansk','Syditaliensk','Szechuan','Taiwansk','Thai','Tibetansk','Østeuropæisk','Dansk']
In this case the string has 'Dansk' in it. The string could contain more than one of the words in the list.
I want to write a piece of code that prints the words in the list that are also in the string.
In this case the output should be: Dansk
If there were more than one word in the string, it should be: Dansk, ...., ....
I hope someone can help.
>>> list = ['Café','Afrikansk','............','Sushi','Svensk','Sydamerikansk','Syditaliensk','Szechuan','Taiwansk','Thai','Tibetansk','Østeuropæisk','Dansk']
>>> string = """<div class="header_links">$$ - $$$, Dansk, Veganske retter, Glutenfri retter</div>"""
>>> [x for x in list if x in string]
['Dansk']
I recommend not using list as a variable name, as it usually refers to the built-in type list (like str or int).
Use a list comprehension with a membership check:
[x for x in lst if x in string]
Note that I have renamed your list to lst, as list is built-in.
Example:
string = '<div class="header_links">$$ - $$$, Dansk, Veganske retter, Glutenfri retter</div>'
lst = ['Café','Afrikansk','Sushi','Svensk','Sydamerikansk','Syditaliensk','Szechuan','Taiwansk','Thai','Tibetansk','Østeuropæisk','Dansk']
print([x for x in lst if x in string])
# ['Dansk']
In your case you can use the following, where my_list is your list of words:
string_intersection = set(string.replace(',', '').split()).intersection(my_list)
print(*string_intersection, sep =',')
output:
Dansk
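One caveat with the plain `x in string` check is that it matches substrings, so 'Thai' would also be found inside 'Thailand'. If whole-word matches are wanted, a sketch along these lines may help; the helper name words_in_text and the shortened cuisines list are assumptions made for illustration, not part of the original question.

```python
import re

def words_in_text(words, text):
    """Return the words that occur in text as whole words, in list order."""
    found = []
    for word in words:
        # re.escape guards against words that contain regex metacharacters;
        # \b anchors the match at word boundaries on both sides.
        if re.search(r'\b' + re.escape(word) + r'\b', text):
            found.append(word)
    return found

text = '<div class="header_links">$$ - $$$, Dansk, Veganske retter, Glutenfri retter</div>'
cuisines = ['Café', 'Sushi', 'Thai', 'Dansk', 'Veganske retter']  # shortened sample list

print(', '.join(words_in_text(cuisines, text)))  # Dansk, Veganske retter
```

Unlike the set-intersection approach, this also handles multi-word entries such as 'Veganske retter', at the cost of one regex search per word.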

Splitting string into two by comma using python

I have the following data in a list; it is a hex number:
['aaaaa955554e']
I would like to split this into ['aaaaa9,55554e'] with a comma.
I know how to split this when there is a delimiter in between, but how should I do it in this case?
Thanks
This will do what I think you are looking for:
yourlist = ['aaaaa955554e']
new_list = [','.join([x[i:i+6] for i in range(0, len(x), 6)]) for x in yourlist]
It will put a comma at every sixth character in each item in your list. (I am assuming you will have more than just one item in the list, and that the items are of unknown length. Not that it matters.)
I assume you want to split at every 6th character.
Using regex:
import re
lst = ['aaaaa955554e']
newlst = re.findall(r'\w{6}', lst[0])
# ['aaaaa9', '55554e']
Using a list comprehension, this works for multiple items in lst:
lst = ['aaaaa955554e']
newlst = [item[i:i+6] for item in lst for i in range(0, len(item), 6)]
# ['aaaaa9', '55554e']
This could be done using a regular expression substitution as follows:
import re
print(re.sub(r'([a-zA-Z]+\d)(.*?)', r'\1,\2', 'aaaaa955554e', count=1))
Giving you:
aaaaa9,55554e
This splits after seeing the first digit.
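The slicing and regex approaches above agree on the sample value, but they can differ when the length is not a multiple of six. The sketch below uses a hypothetical helper chunk and a made-up 14-character value to show the difference: slicing keeps the short tail, while the `\w{6}` pattern silently drops it.

```python
import re

def chunk(text, size=6):
    """Split text into consecutive pieces of at most `size` characters."""
    return [text[i:i + size] for i in range(0, len(text), size)]

value = 'aaaaa955554e12'  # made-up value, two characters longer than the original

print(chunk(value))                      # ['aaaaa9', '55554e', '12']
print(re.findall(r'\w{6}', value))       # ['aaaaa9', '55554e']
print(','.join(chunk('aaaaa955554e')))   # aaaaa9,55554e
```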

How to get string list index?

I have a list text_lines = ['asdf','kibje','ABC','beea'] and I need to find the index where the string ABC appears.
ABC = [s for s in text_lines if "ABC" in s]
ABC is now ['ABC'].
How do I get the index?
First match only (raises StopIteration if not found):
index = next(i for i, s in enumerate(text_lines) if "ABC" in s)
Or, collect all of them:
indices = [i for i, s in enumerate(text_lines) if "ABC" in s]
text_lines = ['asdf','kibje','ABC','beea']
abc_index = text_lines.index('ABC')
If 'ABC' appears only once, the above code works, because index gives the index of the first occurrence.
For multiple occurrences, you can check wim's answer.
Simple python list function.
index = text_lines.index("ABC")
If the string is more complicated, you may need to combine this with regex, but for an exact match this simple solution is best.
Did you mean the index of "ABC"?
If there is just one "ABC", you can use the built-in index() method of list:
text_lines.index("ABC")
Otherwise, if there is more than one "ABC", you can use enumerate over the list:
indices = [idx for idx,val in enumerate(text_lines) if val == "ABC"]
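Both list.index and next without a default raise an exception when nothing is found (ValueError and StopIteration respectively). A minimal sketch of handling the missing case follows; the helper name find_index and the choice of None as the "not found" value are assumptions for illustration.

```python
text_lines = ['asdf', 'kibje', 'ABC', 'beea']

def find_index(lines, target):
    """Return the index of the first exact match, or None if absent."""
    try:
        return lines.index(target)
    except ValueError:
        return None

# Substring variant: index of the first line containing 'AB', or None.
first_containing = next((i for i, s in enumerate(text_lines) if 'AB' in s), None)

print(find_index(text_lines, 'ABC'))  # 2
print(find_index(text_lines, 'XYZ'))  # None
print(first_containing)               # 2
```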

Finding a substring within a list in Python [duplicate]

Background:
Example list: mylist = ['abc123', 'def456', 'ghi789']
I want to retrieve an element if there's a match for a substring, like abc
Code:
sub = 'abc'
print(any(sub in mystring for mystring in mylist))
The above prints True if any of the elements in the list contain the pattern.
I would like to print the element which matches the substring. So if I'm checking 'abc' I only want to print 'abc123' from the list.
print([s for s in mylist if sub in s])
If you want them separated by newlines:
print("\n".join(s for s in mylist if sub in s))
Full example, with case insensitivity:
mylist = ['abc123', 'def456', 'ghi789', 'ABC987', 'aBc654']
sub = 'abc'
print("\n".join(s for s in mylist if sub.lower() in s.lower()))
All the answers work but they always traverse the whole list. If I understand your question, you only need the first match. So you don't have to consider the rest of the list if you found your first match:
mylist = ['abc123', 'def456', 'ghi789']
sub = 'abc'
next((s for s in mylist if sub in s), None) # returns 'abc123'
If the match is at the end of the list or for very small lists, it doesn't make a difference, but consider this example:
import timeit
mylist = ['abc123'] + ['xyz123']*1000
sub = 'abc'
timeit.timeit('[s for s in mylist if sub in s]', setup='from __main__ import mylist, sub', number=100000)
# for me 7.949463844299316 with Python 2.7, 8.568840944994008 with Python 3.4
timeit.timeit('next((s for s in mylist if sub in s), None)', setup='from __main__ import mylist, sub', number=100000)
# for me 0.12696599960327148 with Python 2.7, 0.09955992100003641 with Python 3.4
Use a simple for loop:
seq = ['abc123', 'def456', 'ghi789']
sub = 'abc'
for text in seq:
    if sub in text:
        print(text)
yields
abc123
This prints all elements that contain sub:
for s in filter(lambda x: sub in x, mylist):
    print(s)
I'd just use a simple regex. You can do something like this:
import re
old_list = ['abc123', 'def456', 'ghi789']
new_list = [x for x in old_list if re.search('abc', x)]
for item in new_list:
    print(item)
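One thing to keep in mind with the regex approach is that re.search treats the substring as a pattern, so characters such as '.' act as wildcards. If a literal match is intended, wrapping the substring in re.escape is one option; the sample list and names below are made up for illustration.

```python
import re

old_list = ['abc123', 'a.c456', 'ghi789']  # made-up sample data
sub = 'a.c'

# Unescaped: '.' matches any character, so 'abc123' also matches.
as_pattern = [x for x in old_list if re.search(sub, x)]

# Escaped: only strings containing the literal text 'a.c' match.
as_literal = [x for x in old_list if re.search(re.escape(sub), x)]

print(as_pattern)  # ['abc123', 'a.c456']
print(as_literal)  # ['a.c456']
```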
