Finding a substring within a list in Python [duplicate] - python

This question already has answers here:
How to check if a string is a substring of items in a list of strings
(18 answers)
Closed 4 years ago.
Background:
Example list: mylist = ['abc123', 'def456', 'ghi789']
I want to retrieve an element if there's a match for a substring, like abc
Code:
sub = 'abc'
print(any(sub in mystring for mystring in mylist))
The above prints True if any element of the list contains the substring.
I would like to print the element which matches the substring. So if I'm checking 'abc' I only want to print 'abc123' from list.

print([s for s in mylist if sub in s])
If you want them separated by newlines:
print("\n".join(s for s in mylist if sub in s))
Full example, with case insensitivity:
mylist = ['abc123', 'def456', 'ghi789', 'ABC987', 'aBc654']
sub = 'abc'
print("\n".join(s for s in mylist if sub.lower() in s.lower()))
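The filter above can be wrapped in a small helper. This is only a sketch: the name find_matches and its ignore_case flag are made up for illustration, and str.casefold() is used here as a slightly more thorough alternative to lower().

```python
def find_matches(items, sub, ignore_case=False):
    """Return every string in items that contains sub."""
    if ignore_case:
        needle = sub.casefold()
        return [s for s in items if needle in s.casefold()]
    return [s for s in items if sub in s]


mylist = ['abc123', 'def456', 'ghi789', 'ABC987', 'aBc654']
print(find_matches(mylist, 'abc'))                    # ['abc123']
print(find_matches(mylist, 'abc', ignore_case=True))  # ['abc123', 'ABC987', 'aBc654']
```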

All the answers work but they always traverse the whole list. If I understand your question, you only need the first match. So you don't have to consider the rest of the list if you found your first match:
mylist = ['abc123', 'def456', 'ghi789']
sub = 'abc'
next((s for s in mylist if sub in s), None) # returns 'abc123'
If the match is at the end of the list or for very small lists, it doesn't make a difference, but consider this example:
import timeit
mylist = ['abc123'] + ['xyz123']*1000
sub = 'abc'
timeit.timeit('[s for s in mylist if sub in s]', setup='from __main__ import mylist, sub', number=100000)
# for me 7.949463844299316 with Python 2.7, 8.568840944994008 with Python 3.4
timeit.timeit('next((s for s in mylist if sub in s), None)', setup='from __main__ import mylist, sub', number=100000)
# for me 0.12696599960327148 with Python 2.7, 0.09955992100003641 with Python 3.4
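To see why next() is so much faster in that case, the sketch below counts how many elements are actually inspected. The helpers first_match and counting are hypothetical names introduced only for this illustration.

```python
def first_match(items, sub, default=None):
    """Return the first string containing sub, or default if none does."""
    return next((s for s in items if sub in s), default)


def counting(items, counter):
    """Yield items one by one, recording how many were consumed."""
    for item in items:
        counter[0] += 1
        yield item


mylist = ['abc123'] + ['xyz123'] * 1000
seen = [0]
result = first_match(counting(mylist, seen), 'abc')
print(result, seen[0])              # abc123 1
print(first_match(mylist, 'nope'))  # None
```

Only one element is consumed before the generator stops, whereas a list comprehension would visit all 1001.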

Use a simple for loop:
seq = ['abc123', 'def456', 'ghi789']
sub = 'abc'
for text in seq:
    if sub in text:
        print(text)
yields
abc123

This prints all elements that contain sub:
for s in filter(lambda x: sub in x, mylist):
    print(s)

I'd just use a simple regex, you can do something like this
import re
old_list = ['abc123', 'def456', 'ghi789']
new_list = [x for x in old_list if re.search('abc', x)]
for item in new_list:
    print(item)
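One caveat with the regex approach: characters such as . or + have a special meaning in a pattern. If the substring should be matched literally, re.escape() can be applied first. A small sketch with made-up sample data:

```python
import re

items = ['a.c123', 'abc456', 'a+c789']
sub = 'a.c'

# Unescaped, the dot matches any character.
unescaped = [x for x in items if re.search(sub, x)]
# Escaped, only a literal "a.c" matches.
escaped = [x for x in items if re.search(re.escape(sub), x)]

print(unescaped)  # ['a.c123', 'abc456', 'a+c789']
print(escaped)    # ['a.c123']
```

For plain literal substrings, the in operator used in the other answers avoids the issue entirely.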

Related

check if a nested list contains a substring

How to check if a nested list contains a substring?
strings = [[],["one", "two", "three"]]
substring = "wo"
strings_with_substring = [string for string in strings if substring in string]
print(strings_with_substring)
This script just prints:
[]
How can I fix it? The output should be:
two
==
Sayse, solution you provided doesn't work for me. I am new to python. I am sure I am missing something here. any thoughts?
import re
s = [[],["one", "two", "three"]]
substring = "wo"
# strings_with_substring = [string for string in strings if substring in string]
strings_with_substring = next(s for sl in strings for s in sl if substring in s)
print(strings_with_substring)
You are missing another level of iteration. Here is the looping logic without using a comprehension:
for sublist in strings:
    for item in sublist:
        if substring in item:
            print(item)
Roll that up to a comprehension:
[item for sublist in strings for item in sublist if substring in item]
You're looking for
next(s for sl in strings for s in sl if substring in s)
This outputs "two". If you want a list of all matching elements, replace next with a list comprehension using the same for and if clauses. Likewise, change next to any if you just want a boolean result.
Since you said it should just print the string, you could use itertools to flatten your list and run it through a filter that you loop over.
from itertools import chain
strings = [[], ['one', 'two', 'three']]
substring = 'wo'
for found in filter(lambda s: substring in s, chain.from_iterable(strings)):
    print(found)
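The three variants mentioned in these answers (all matches, first match, boolean) can be put side by side. This is a sketch; the extra sublist ["network"] is added here only so that more than one element matches.

```python
from itertools import chain

strings = [[], ["one", "two", "three"], ["network"]]
substring = "wo"

# Flatten one level of nesting.
flat = list(chain.from_iterable(strings))

all_matches = [s for s in flat if substring in s]
# The default of None avoids StopIteration when nothing matches.
first = next((s for s in flat if substring in s), None)
found = any(substring in s for s in flat)

print(all_matches)  # ['two', 'network']
print(first)        # two
print(found)        # True
```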

How to extract strings between two markers for each object of a list in python

I have a list of strings. Each string contains the same two markers. I would like to extract the text between those two markers for each string in the list.
Example:
The markers are 'XXX' and 'YYY', so I want to extract 78665786 and 6866 from:
['XXX78665786YYYjajk', 'XXX6866YYYz6767'....]
You can just loop over your list and grab the substring. You can do something like:
import re
my_list = ['XXX78665786YYYjajk', 'XXX6866YYYz6767']
output = []
for item in my_list:
    output.append(re.search('XXX(.*)YYY', item).group(1))
print(output)
Output:
['78665786', '6866']
import re
l = ['XXX78665786YYYjajk', 'XXX6866YYYz6767'....]
l = [re.search(r'XXX(.*)YYY', i).group(1) for i in l]
This should work
Another solution would be:
import re
test_string=['XXX78665786YYYjajk','XXX78665783336YYYjajk']
int_val=[int(re.search(r'\d+', x).group()) for x in test_string]
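The split() method splits a string into parts around a separator.
list1 = ['XXX78665786YYYjajk', 'XXX6866YYYz6767']
list2 = []
for i in list1:
    # Take the text after the first "XXX", then the text before the following "YYY".
    after_start = i.split("XXX")[1]
    list2.append(after_start.split("YYY")[0])
print(list2)  # ['78665786', '6866']
The results are saved into a list.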
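The regex answers above assume that every string contains both markers, and the greedy .* runs to the last end marker in the string. A more defensive sketch follows; the helper name between and the two extra sample strings are hypothetical.

```python
import re


def between(text, start, end):
    """Return the text between the first start marker and the next end marker, or None."""
    # re.escape keeps markers literal; .*? stops at the first end marker.
    pattern = re.escape(start) + r'(.*?)' + re.escape(end)
    match = re.search(pattern, text)
    return match.group(1) if match else None


items = ['XXX78665786YYYjajk', 'XXX6866YYYz6767', 'no markers here', 'XXX1YYY2YYY']
extracted = [between(s, 'XXX', 'YYY') for s in items]
print(extracted)  # ['78665786', '6866', None, '1']
```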

Join characters from list of strings by index

For example, I have the following list.
list=['abc', 'def','ghi','jkl','mn']
I want to make a new list as:
newList=['adgjm','behkn','cfil']
That is, take the first character of every element to form a new string and append it to the new list, then do the same with the second character of every element, and so on.
Thanks for the help.
One way is zipping the strings in the list, which will interleave the characters from each string in the specified fashion, and join them back with str.join:
l = ['abc', 'def','ghi','jkl']
list(map(''.join, zip(*l)))
# ['adgj', 'behk', 'cfil']
For strings with different length, use zip_longest, and fill with an empty string:
from itertools import zip_longest
l = ['abcZ', 'def','ghi','jkl']
list(map(''.join, zip_longest(*l, fillvalue='')))
# ['adgj', 'behk', 'cfil', 'Z']
You can try this way:
>>> list1 =['abc', 'def','ghi','jkl']
>>> newlist = []
>>> for args in zip(*list1):
...     newlist.append(''.join(args))
...
>>> newlist
['adgj', 'behk', 'cfil']
Or using list comprehension:
>>> newlist = [''.join(args) for args in zip(*list1)]
>>> newlist
['adgj', 'behk', 'cfil']
You can try this:
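words = ['abc', 'def', 'ghi', 'jkl']  # renamed from "list" to avoid shadowing the built-in
n = len(words[0])  # assumes every string is as long as the first one
newList = []
for i in range(n):
    newword = ''
    for word in words:
        newword += word[i]
    newList.append(newword)
print(newList)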
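The index-based loop can also be adapted to the list from the question, where the last element 'mn' is shorter than the others. A sketch, assuming that shorter strings should simply be skipped once they run out of characters:

```python
words = ['abc', 'def', 'ghi', 'jkl', 'mn']

# Length of the longest string; default=0 covers an empty list.
longest = max((len(w) for w in words), default=0)

new_list = []
for i in range(longest):
    # Only take position i from strings that are long enough.
    new_list.append(''.join(w[i] for w in words if i < len(w)))

print(new_list)  # ['adgjm', 'behkn', 'cfil']
```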

Python: Check if Any List Item In String, If So Assign Found Item to Variable

I've the following list:
my_list = ['a', 'b', 'c']
I've the following list of strings:
my_strings = ['azz', 'bzz', 'czz']
I'm doing the following to determine whether any item of my_list is contained within an item of my_strings:
for my_string in my_strings:
    if any(x in my_string for x in my_list):
        # Do Stuff
What's the best practice for retaining the x as found in my_list so that I might be able to then do the following:
#Do Stuff
new_var = my_string.split('x')[1]
The desired result is to assign zz to new_var, having determined that a from my_list was in azz from my_strings.
It is simple indeed, you can even do it in a beautiful one liner using list comprehension as follows:
new_var = [my_string.split(x)[1] for my_string in my_strings for x in my_list if x in my_string]
This returns the second element of each split, for every string in my_strings that contains an element of my_list.
You should not use any because you actually care about which one matches:
for my_string in my_strings:
    for x in my_list:
        if x in my_string:
            # Do Stuff
            new_var = my_string.split(x)[1]  # split on the matched item, not the literal 'x'
            break
You should not use any() because you need to know specifically which element matches a certain string. Simply use a regular for loop instead:
>>> def split(strings, lst):
...     for string in strings:
...         for el in lst:
...             if el in string:
...                 yield string.split(el)[1]
...
>>>
>>> for string in split(['azz', 'bzz', 'czz'], ['a', 'b', 'c']):
...     print(string)
...
zz
zz
zz
>>>
For example, to split the first value of my_strings on the first value of my_list:
my_strings[0].split(my_list[0])
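Another way to keep hold of the matching item is next() with a default, combined with str.partition(), which splits on the first occurrence only. This is a sketch; the extra string 'qqq' and the results list are added here just to show the no-match case.

```python
my_list = ['a', 'b', 'c']
my_strings = ['azz', 'bzz', 'czz', 'qqq']

results = []
for my_string in my_strings:
    # First item of my_list found in my_string, or None if there is none.
    hit = next((x for x in my_list if x in my_string), None)
    if hit is None:
        continue
    # partition returns (before, separator, after) for the first occurrence.
    _, _, new_var = my_string.partition(hit)
    results.append((hit, new_var))

print(results)  # [('a', 'zz'), ('b', 'zz'), ('c', 'zz')]
```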

Python: how to ignore 'substring not found' error

Let's say that you have a string array 'x', containing very long strings, and you want to search for the following substring: "string.str", within each string in array x.
In the vast majority of the elements of x, the substring in question will be in the array element. However, maybe once or twice, it won't be. If it's not, then...
1) is there a way to just ignore the case and then move onto the next element of x, by using an if statement?
2) is there a way to do it without an if statement, in the case where you have many different substrings that you're looking for in any particular element of x, where you might potentially end up writing tons of if statements?
You want the try and except block. Here is a simplified example:
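a = 'hello'
try:
    # Note: a slice past the end returns '' and does not raise. The pattern
    # matters for calls such as a.index(sub), which raises ValueError.
    print(a[6:])
except Exception:
    pass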
Expanded example:
a = ['hello', 'hi', 'hey', 'nice']
for i in a:
    try:
        print(i[3:])
    except Exception:
        pass
lo
e
You can use list comprehension to filter the list concisely:
Filter by length:
a_list = ["1234", "12345", "123456", "123"]
print([elem[3:] for elem in a_list if len(elem) > 3])
# ['4', '45', '456']
Filter by substring:
a_list = ["1234", "12345", "123456", "123"]
a_substring = "456"
print([elem for elem in a_list if a_substring in elem])
# ['123456']
Filter by multiple substrings (keeps only the elements that contain all of the substrings):
a_list = ["1234", "12345", "123456", "123", "56", "23"]
substrings = ["56","23"]
print([elem for elem in a_list if all(sub in elem for sub in substrings)])
# ['123456']
Well, if I understand what you wrote, you can use the continue keyword to jump to the next element in the array.
elements = ["Victor", "Victor123", "Abcdefgh", "123456", "1234"]
astring = "Victor"
for element in elements:
    if astring in element:
        pass  # do stuff
    else:
        continue  # redundant here: the loop moves on to the next element without it
Sorry for my English.
Use any() to see if any of the substrings are in an item of x. any() consumes a generator expression and short-circuits: it returns True at the first expression that evaluates to True and stops consuming the generator.
>>> substrings = ['list', 'of', 'sub', 'strings']
>>> x = ['list one', 'twofer', 'foo sub', 'two dollar pints', 'yard of hoppy poppy']
>>> for item in x:
...     if any(sub in item.split() for sub in substrings):
...         print(item)
...
list one
foo sub
yard of hoppy poppy
>>>
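The "substring not found" message in the title is what str.index() raises as a ValueError. The sketch below shows three ways of dealing with a missing substring; the sample data and variable names are made up for illustration.

```python
x = ['has string.str inside', 'nothing here', 'string.str at start']
sub = 'string.str'

# Option 1: str.index raises ValueError on a miss, so catch exactly that.
positions = []
for item in x:
    try:
        positions.append(item.index(sub))
    except ValueError:
        continue  # skip elements without the substring
print(positions)  # [4, 0]

# Option 2: str.find returns -1 instead of raising.
found_at = [item.find(sub) for item in x]
print(found_at)  # [4, -1, 0]

# Option 3: many substrings without many if statements: loop over them.
substrings = ['string.str', 'here']
hits = {item: [s for s in substrings if s in item] for item in x}
print(hits['nothing here'])  # ['here']
```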
