Checking the elements of a list for multiple strings - python

Say I have a list:
['[name]\n', 'first_name,jane\n', 'middle_name,anna\n', 'last_name,doe\n', '[age]\n', 'age,30\n', 'dob,1/1/1988\n']
How could I check whether the strings 'jane', 'anna' and 'doe' are ALL contained in some element of the list?

For each name you can use any() to see if it is contained in any of the strings in the list, then use all() to make sure this is true for every name:
>>> data = ['[name]\n', 'first_name,jane\n', 'middle_name,anna\n', 'last_name,doe\n', '[age]\n', 'age,30\n', 'dob,1/1/1988\n']
>>> names = ['jane', 'anna', 'doe']
>>> all(any(name in sub for sub in data) for name in names)
True
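If you also want to know which names were not found, the same any() check can drive a list comprehension. This is only a minimal sketch; the helper name missing_names is made up for illustration.

```python
def missing_names(names, data):
    """Return the names that do not occur as a substring of any element in data."""
    return [name for name in names if not any(name in sub for sub in data)]


data = ['[name]\n', 'first_name,jane\n', 'middle_name,anna\n',
        'last_name,doe\n', '[age]\n', 'age,30\n', 'dob,1/1/1988\n']

print(missing_names(['jane', 'anna', 'doe'], data))   # []
print(missing_names(['jane', 'bob'], data))           # ['bob']
```

Note that `in` does substring matching here, so 'ann' would also count as found because of 'anna'.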

Related

Find Strings Located Between Specific Strings in List Python

I'm writing code that's pulling in data from a website and it's printing out all the text between certain tags. I am storing the result into a list every time the code pulls data from a tag so I have a list looking something like
Warning
Not
News
Legends
Name1
Name2
Name3
Pickle
Stop
Hello
I want to search this list of strings for the keywords Legends and Pickle and print whatever strings are between them.
As a follow-up, I may create a list of all possible legend names and then, whenever I generate my list, print out the names that occur in it. Any insight into either of these questions?
For the second approach, you could create a regex alternation of expected matching names, then use a list comprehension to generate a list of matches:
import re
tags = ['Warning', 'Not', 'News', 'Legends', 'Name1', 'Name2', 'Name3', 'Pickle', 'Stop', 'Hello']
names = ['Name1', 'Name2', 'Name3']
regex = r'^(?:' + r'|'.join(names) + r')$'
matches = [x for x in tags if re.search(regex, x)]
print(matches) # ['Name1', 'Name2', 'Name3']
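If the expected names could contain regex metacharacters, one possible variation is to escape each name and use fullmatch instead of the ^...$ anchors. This is a sketch only; the name 'A.B' is invented to show the effect of escaping.

```python
import re

tags = ['Warning', 'Not', 'News', 'Legends', 'Name1', 'Name2', 'Name3',
        'Pickle', 'Stop', 'Hello']
names = ['Name1', 'Name2', 'Name3', 'A.B']  # 'A.B' is a made-up name with a metacharacter

# re.escape keeps characters such as '.' from being treated as regex syntax
pattern = re.compile('|'.join(re.escape(name) for name in names))

# fullmatch only succeeds when the whole string is one of the names
matches = [tag for tag in tags if pattern.fullmatch(tag)]
print(matches)  # ['Name1', 'Name2', 'Name3']
```

For exact matches like these, a plain set lookup (tag in set(names)) would probably do the same job without a regex.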
Try this:
words = [
"Warning", "Not", "News", "Legends", "Name1",
"Name2", "Name3", "Pickle", "Stop", "Hello"
]
words_in_between = words[words.index("Legends") + 1:words.index("Pickle")]
print(words_in_between)
output:
['Name1', 'Name2', 'Name3']
This assumes that both "Legends" and "Pickle" are in the list exactly once.
You can use the list.index() method to find the numerical index of an item within a list, and then use list slicing to return the items in your list between those two points:
your_list = ['Warning','Not','News','Legends','Name1','Name2','Name3','Pickle','Stop','Hello']
your_list[your_list.index('Legends')+1:your_list.index('Pickle')]
The caveat is that .index() returns only the index of the first occurrence of the given item, so if your list has two 'Legends' items, you'll only get the index of the first one.
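If the markers can appear more than once, a single pass over the list avoids the first-occurrence limitation of .index(). A rough sketch, assuming each 'Legends' is closed by the next 'Pickle'; the function name segments_between is hypothetical.

```python
def segments_between(items, start, stop):
    """Collect the items found between each start marker and the next stop marker."""
    segments = []
    current = None          # None means we are not inside a start/stop pair
    for item in items:
        if item == start:
            current = []    # open a new segment (restarts if start repeats)
        elif item == stop and current is not None:
            segments.append(current)
            current = None
        elif current is not None:
            current.append(item)
    return segments         # an unclosed trailing segment is dropped


words = ['Warning', 'Legends', 'Name1', 'Name2', 'Pickle', 'Stop',
         'Legends', 'Name3', 'Pickle']
print(segments_between(words, 'Legends', 'Pickle'))
# [['Name1', 'Name2'], ['Name3']]
```

Unlike the slicing version, this returns an empty list instead of raising ValueError when a marker is missing.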
You can use list.index() to get the index of the first occurrence of Legends and of Pickle. Then you can use list slicing to get the elements in between:
l = ['Warning','Not','News','Legends','Name1','Name2','Name3','Pickle','Stop','Hello']
l[l.index('Legends')+1 : l.index('Pickle')]
['Name1', 'Name2', 'Name3']
NumPy's where function gives you all occurrences of a given item. So first make the list a NumPy array:
import numpy as np
my_array = np.array(["Warning","Not","News","Legends","Name1","Name2","Name3","Pickle","Stop","Hello","Legends","Name1","Name2","Name3","Pickle"])
From here on you can use NumPy functions:
legends = np.where(my_array == "Legends")
pickle = np.where(my_array == "Pickle")
Concatenate the two index arrays for easier looping:
stack = np.concatenate([legends, pickle], axis=0)
Then collect the values between each Legends and Pickle pair (one column of stack per pair):
np.concatenate([my_array[stack[0, i] + 1:stack[1, i]] for i in range(stack.shape[1])])
For the array above, the result is:
array(['Name1', 'Name2', 'Name3', 'Name1', 'Name2', 'Name3'], dtype='<U7')

comprehension list in pymongo query

Good morning everyone,
I would like to write a pymongo snippet to query from a database all documents having a specific field value within a given list (so also a value which is a subset of any element in the given list).
Within python, I would have two lists and I would like to find all elements from list one which are contained in at least one element from list two. Eg:
list1 = ['abc', 'bob', 'joe_123']
list2 = ['abc', 'joe', 'mike']
for string in list2:
    print( string, any([ string in xxx for xxx in list1 ]) )
which gives the correct result:
abc True
joe True
mike False
How could I get the same from pymongo? I tried the "$in" operator, but the result is not complete.
from pymongo import MongoClient
from pprint import pprint
client = MongoClient('mongodb://localhost:27017')
db = client['test_query']
my_coll = db['test_collection']
list1 = ['abc', 'bob', 'joe_123']
list2 = ['abc', 'joe', 'mike']
for string in list2:
    my_coll.insert_one({'string' : string})
cursor = my_coll.find({'string' : { '$in' : list1} })
pprint([c for c in cursor])
[{'_id': ObjectId('5f60ce9b682ff1bf4dcafb94'), 'string': 'abc'}]
which misses the case of joe, a substring of joe_123.
Could anyone give me a hint on this?
More generally, what is the syntax to incorporate a python comprehension list into a pymongo query?
Thank you so much
Marco
$in in MongoDB does not match on a partial string.
To do this you must use $regex, e.g.
for string in list1:
    my_coll.insert_one({'string' : string})
query = {'$regex': '|'.join(list2)}
cursor = my_coll.find({'string' : query})
pprint([c for c in cursor])
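If the search strings might contain characters with a special meaning in regular expressions, the pattern can be built with escaping first. The sketch below only constructs the filter document and checks the pattern locally with Python's re module; the helper name substring_query is hypothetical, and since MongoDB evaluates $regex with its own PCRE engine, re.escape should be treated as an approximation for unusual characters.

```python
import re

def substring_query(field, needles):
    """Build a MongoDB filter matching documents whose field contains any needle."""
    # re.escape stops characters like '.' or '+' from acting as regex operators
    pattern = '|'.join(re.escape(needle) for needle in needles)
    return {field: {'$regex': pattern}}


list2 = ['abc', 'joe', 'mike']
query = substring_query('string', list2)
print(query)  # {'string': {'$regex': 'abc|joe|mike'}}

# Local sanity check of the pattern with Python's re module (no database needed)
pattern = query['string']['$regex']
print([s for s in ['abc', 'bob', 'joe_123'] if re.search(pattern, s)])
# ['abc', 'joe_123']
```

The resulting dictionary would then be passed to my_coll.find(query) as in the answer above.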

How can I extract names from a concatenated string using Python?

Suppose I have a string of concatenated names like so:
names = 'johnwilliamsfrankbrown'.
How do I go from here to a list of names and surnames ["john", "williams", "frank", "brown"]?
So far I only found pieces of code to extract words from non concatenated strings.
As timgeb noted in the comments, this is only possible if you already know which names you expect. Assuming that you have this information, you can extract them like this:
>>> import re
>>> names = ['john', 'frank', 'brown', 'williams']
>>> regex = '(' + '|'.join(names) + ')'
>>> separated_names = re.findall(regex, 'johnwilliamsfrankbrown')
>>> separated_names
['john', 'williams', 'frank', 'brown']
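One thing to watch with this approach: alternatives in a regex are tried left to right, so a short name can shadow a longer one that starts with it. A small sketch with invented names, sorting by length so that longer names are tried first:

```python
import re

names = ['john', 'johnson', 'frank']   # hypothetical names; 'john' is a prefix of 'johnson'
text = 'johnsonfrank'

# Alternatives are tried left to right, so 'john' wins and 'son' is skipped
naive = re.findall('|'.join(map(re.escape, names)), text)
print(naive)   # ['john', 'frank']

# Putting longer names first lets 'johnson' match as a whole
by_length = sorted(names, key=len, reverse=True)
greedy = re.findall('|'.join(map(re.escape, by_length)), text)
print(greedy)  # ['johnson', 'frank']
```

This does not resolve every ambiguity (some strings can be split in more than one valid way), but it avoids the most common surprise.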

Python LOB to List

Using:
cur.execute(SQL)
response = cur.fetchall()  # response[0][0] is a LOB object
names = response[0][0].read()
I have the following SQL response as the string names:
'Mike':'Mike'
'John':'John'
'Mike/B':'Mike/B'
As you can see it comes formatted. It is actually formatted like: \\'Mike\\':\\'Mike\\'\n\\'John\\'... and so on.
In order to check whether, for example, Mike is inside the list at least once (I don't care how many times, only that it is there),
I would like to have something like this:
l = ['Mike', 'Mike', 'John', 'John', 'Mike/B', 'Mike/B'],
so I could simply iterate over the list and ask
for name in l:
    if 'Mike' == name:
        pass  # do something
Any ideas how I could do that?
Many thanks
Edit:
When I do:
list = names.split()
I receive a list which is nearly what I want, but the elements inside still look like this:
list = ["\\'Mike\\':\\'Mike\\'", ...]
names = ["\\'Mike\\':\\'Mike\\'", ...]
for name in names:
    if "Mike" in name:
        print("Mike is here")
The \\' business is caused by MySQL escaping the '
If you have a list of names, try this:
my_names = ["Tom", "Dick", "Harry"]
names = ["\\'Mike\\':\\'Mike\\'", ...]
for name in names:
    for my_name in my_names:
        if my_name in name:
            print(my_name, "is here")
import re
pattern = re.compile(r"[\n\\:']+")
list_of_names = pattern.split(names)
# ['', 'Mike', 'Mike', 'John', 'John', 'Mike/B', '']
# Quick-tip: Try not to name a list with "list" as "list" is a built-in
You can keep your results this way or do a final cleanup to remove empty strings
clean_list = list(filter(lambda x: x!='', list_of_names))
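Putting the split and the cleanup together on a concrete string may make the result easier to see. The sample text below is rebuilt from the question's description (each quote preceded by a backslash, one pair per line), so treat it as an assumption about the real LOB content.

```python
import re

# Sample text rebuilt from the question's description: each ' is preceded by a backslash
names = ("\\'Mike\\':\\'Mike\\'\n"
         "\\'John\\':\\'John\\'\n"
         "\\'Mike/B\\':\\'Mike/B\\'")

# Split on runs of newlines, backslashes, colons and single quotes, dropping empty pieces
name_list = [part for part in re.split(r"[\n\\:']+", names) if part]
print(name_list)  # ['Mike', 'Mike', 'John', 'John', 'Mike/B', 'Mike/B']

# A plain membership test answers "is Mike in there at least once?"
print('Mike' in name_list)  # True
```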

Removing characters from a tuple

I'm using
Users = win32net.NetGroupGetUsers(IP,'none',0),
to get all the local users on a system. The output is a tuple,
(([{'name': u'Administrator'}, {'name': u'Guest'}, {'name': u'Tom'}], 3, 0),)
I want to clean this up so it just prints out "Administrator, Guest, Tom". I tried using strip and replace, but you can't use those on tuples. Is there a way to convert this into a string so I can manipulate it, or is there an even simpler way to go about it?
This should not end with a comma:
Users = win32net.NetGroupGetUsers(IP,'none',0),
The trailing comma turns the result into a single item tuple containing the result, which is itself a tuple.
The data you want is in Users[0].
>>> print(Users[0])
[{'name': u'Administrator'}, {'name': u'Guest'}, {'name': u'Tom'}]
To unpack this list of dictionaries we use a generator expression:
Users = win32net.NetGroupGetUsers(IP,'none',0)
print(', '.join(d['name'] for d in Users[0]))
', '.join(user['name'] for user in Users[0][0])
result = (([{'name': u'Administrator'}, {'name': u'Guest'}, {'name': u'Tom'}], 3, 0),)
in_list = result[0][0]  # first item of the outer tuple, then the list of dicts
names = [x['name'] for x in in_list]
print(names)
[u'Administrator', u'Guest', u'Tom']
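For reference, here is a self-contained sketch of the same idea that runs without win32net, using the sample value from the question (minus the accidental trailing comma). The variable names for the second and third tuple items are guesses at their meaning, not something the question confirms.

```python
# Sample value copied from the question; in real use it would come from win32net
result = ([{'name': 'Administrator'}, {'name': 'Guest'}, {'name': 'Tom'}], 3, 0)

# Unpack the 3-tuple; only the first item (the list of dicts) is needed here
users, _total, _resume = result   # names for the 2nd and 3rd items are guesses

joined = ', '.join(user['name'] for user in users)
print(joined)  # Administrator, Guest, Tom
```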
