comprehension list in pymongo query - python

Good morning everyone,
I would like to write a pymongo snippet that queries a database for all documents whose value in a specific field matches a given list, where a match also includes a value that is a substring of any element of the list.
In plain Python, I would have two lists and would like to find all elements of list2 that are contained in at least one element of list1. E.g.:
list1 = ['abc', 'bob', 'joe_123']
list2 = ['abc', 'joe', 'mike']
for string in list2:
    print(string, any(string in xxx for xxx in list1))
which gives the correct result:
abc True
joe True
mike False
How could I get the same from pymongo? I tried the "$in" operator, but the result is not complete.
from pymongo import MongoClient
from pprint import pprint
client = MongoClient('mongodb://localhost:27017')
db = client['test_query']
my_coll = db['test_collection']
list1 = ['abc', 'bob', 'joe_123']
list2 = ['abc', 'joe', 'mike']
for string in list2:
    my_coll.insert_one({'string': string})
cursor = my_coll.find({'string': {'$in': list1}})
which misses the case where 'joe' is a substring of 'joe_123':
pprint([c for c in cursor])
[{'_id': ObjectId('5f60ce9b682ff1bf4dcafb94'), 'string': 'abc'}]
Could anyone give me a hint on this?
More generally, what is the syntax for incorporating a Python list comprehension into a pymongo query?
Thank you so much
Marco

The $in operator in MongoDB compares string values for exact equality, so it does not match on a partial string.
To match substrings you must use $regex, e.g.
for string in list1:
    my_coll.insert_one({'string': string})
query = {'$regex': '|'.join(list2)}
cursor = my_coll.find({'string' : query})
pprint([c for c in cursor])
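One caveat with joining the strings directly: any regex metacharacters in list2 (such as . or +) would be interpreted as pattern syntax. A minimal sketch, assuming the same list1 and list2 as above, escapes each string with re.escape before joining. The matching is previewed here with Python's re module rather than a live MongoDB server, so the matched list only approximates what the $regex query would return.

```python
import re

list1 = ['abc', 'bob', 'joe_123']
list2 = ['abc', 'joe', 'mike']

# Escape each string so characters like '.' or '+' are matched literally.
pattern = '|'.join(re.escape(s) for s in list2)

# This dict is what would be passed to my_coll.find(); no server is needed to build it.
query = {'string': {'$regex': pattern}}

# Local preview of the same unanchored match, using Python's re module (an approximation).
matched = [s for s in list1 if re.search(pattern, s)]
print(query)
print(matched)
```

Note that this finds stored strings that contain any element of list2, which is the direction used in the answer above.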

Related

Checking the elements of a list for multiple strings

Say I have a list:
['[name]\n', 'first_name,jane\n', 'middle_name,anna\n', 'last_name,doe\n', '[age]\n', 'age,30\n', 'dob,1/1/1988\n']
How could I check whether the strings 'jane', 'anna' and 'doe' are ALL contained in the list, each within some element?
For each name, you can use any() to check whether it is contained in any of the strings in the list, then use all() to make sure this is true for every name:
>>> data = ['[name]\n', 'first_name,jane\n', 'middle_name,anna\n', 'last_name,doe\n', '[age]\n', 'age,30\n', 'dob,1/1/1988\n']
>>> names = ['jane', 'anna', 'doe']
>>> all(any(name in sub for sub in data) for name in names)
True
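If the check fails, it can help to know which names were not found. A small sketch along the same lines; the missing variable and the extra name 'smith' are introduced here purely for illustration:

```python
data = ['[name]\n', 'first_name,jane\n', 'middle_name,anna\n', 'last_name,doe\n',
        '[age]\n', 'age,30\n', 'dob,1/1/1988\n']
names = ['jane', 'anna', 'doe', 'smith']  # 'smith' is a hypothetical missing name

# Collect every name that is not a substring of any element in data.
missing = [name for name in names if not any(name in sub for sub in data)]
print(missing)
```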

How can I extract names from a concatenated string using Python?

Suppose I have a string of concatenated names like so:
name.s = 'johnwilliamsfrankbrown'.
How do I go from here to a list of names and surnames ["john", "williams", "frank", "brown"]?
So far I have only found pieces of code that extract words from non-concatenated strings.
As timgeb noted in the comments, this is only possible if you already know which names you expect. Assuming that you have this information, you can extract them like this:
>>> import re
>>> names = ['john', 'frank', 'brown', 'williams']
>>> regex = '(' + '|'.join(names) + ')'
>>> separated_names = re.findall(regex, 'johnwilliamsfrankbrown')
>>> separated_names
['john', 'williams', 'frank', 'brown']
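One thing to watch with this approach: regex alternation tries the alternatives left to right, so if one expected name is a prefix of another (say 'john' and 'johnson'), the shorter one can win. A hedged sketch that sorts the names longest-first and escapes them; the split_names helper and the extra sample names are made up for illustration:

```python
import re

def split_names(text, names):
    """Split concatenated text into known names, preferring longer names."""
    # Longest names first, so 'johnson' is tried before 'john'.
    ordered = sorted(names, key=len, reverse=True)
    regex = '|'.join(re.escape(n) for n in ordered)
    return re.findall(regex, text)

print(split_names('johnsonfrankbrown', ['john', 'johnson', 'frank', 'brown']))
```

Characters in the text that belong to no known name are silently skipped by re.findall, so this still only works when the expected names are known in advance.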

Python LOB to List

Using:
cur.execute(SQL)
response = cur.fetchall()  # response[0][0] is a LOB object
names = response[0][0].read()
I have the following SQL response as the string names:
'Mike':'Mike'
'John':'John'
'Mike/B':'Mike/B'
As you can see, it comes formatted. It is actually formatted like: \\'Mike\\':\\'Mike\\'\n\\'John\\'... and so on.
I want to check whether, for example, Mike is in the list at least once (I don't care how many times, only that it appears at least once).
I would like to have something like this:
l = ['Mike', 'Mike', 'John', 'John', 'Mike/B', 'Mike/B']
so I could simply iterate over the list and ask:
for name in l:
    if 'Mike' == name:
        pass  # do something
Any ideas how I could do that?
Many thanks
Edit:
When I do:
list = names.split()
I receive a list which is nearly what I want, but the elements inside still look like this:
list = ["\\'Mike\\':\\'Mike\\'", ...]
names = ["\\'Mike\\':\\'Mike\\'", ...]
for name in names:
    if "Mike" in name:
        print("Mike is here")
The \\' business is caused by MySQL escaping the '.
If you have a list of names, try this:
my_names = ["Tom", "Dick", "Harry"]
names = ["\\'Mike\\':\\'Mike\\'", ...]
for name in names:
    for my_name in my_names:
        if my_name in name:
            print(my_name + " is here")
import re
pattern = re.compile(r"[\n\\:']+")
list_of_names = pattern.split(names)
# ['', 'Mike', 'Mike', 'John', 'John', 'Mike/B', 'Mike/B', '']
# Quick tip: avoid naming a list "list", since "list" is a built-in
You can keep your results this way or do a final cleanup to remove empty strings:
clean_list = list(filter(lambda x: x != '', list_of_names))
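An alternative sketch, assuming names holds the raw string in the format described in the question: matching the runs of characters that are not separators with re.findall avoids the empty strings altogether, and a set makes the "at least once" check direct. The sample string below is reconstructed from the question's description, and unique_names is a name introduced here.

```python
import re

# Raw string as described in the question: escaped quotes, colons and newlines.
names = "\\'Mike\\':\\'Mike\\'\n\\'John\\':\\'John\\'\n\\'Mike/B\\':\\'Mike/B\\'"

# Match runs of characters that are NOT newline, backslash, colon or quote.
list_of_names = re.findall(r"[^\n\\:']+", names)
print(list_of_names)

# A set answers "does this name appear at least once?" directly.
unique_names = set(list_of_names)
if 'Mike' in unique_names:
    print("Mike is here")
```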

how to match python list using regular expression

I have the following lists in Python: ["john","doe","1","90"] and ["prince","2","95"]. The first numeric column is the id field and the second numeric column is the score. I would like to use re in Python to parse out the fields and print them. So far, I only know how to split the fields on commas. Can anyone help?
You would be better off using a dictionary than a regex (I don't see how a regex would be used here):
{'name': 'John Doe', 'id': '1', 'score': '90'}
Or better yet, use numbers:
{'name': 'John Doe', 'id': 1, 'score': 90}
You don't really need a regular expression here. You can just use isinstance() and slicing.
This should do what you want:
a_list = ['john', 'doe', '1', '90']
for i, elem in enumerate(a_list):
    try:
        elem = int(elem)
    except ValueError:
        pass
    if isinstance(elem, int):
        # i is the index of the first numeric field (the id)
        names_part = a_list[:i]
        id_and_score = a_list[i:]
        break
print('name(s): {0}, id: {1}, score: {2}'.format(' '.join(names_part), id_and_score[0], id_and_score[1]))
This solution could be improved if we knew the source of your data. If there is a way to predict the field positions, you can just turn your list into a dict as suggested. If you extract the data yourself, consider building a dict instead of a list, which saves you from having to do the above.
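The dict suggestion can be sketched as follows, assuming (as in both example lists) that the last two items are always the id and the score and everything before them is the name. The to_record helper is a hypothetical name, and the star-unpacking syntax requires Python 3:

```python
def to_record(fields):
    """Turn ['john', 'doe', '1', '90'] into a dict with numeric id and score."""
    # Assumption: the last two items are id and score; the rest form the name.
    *name_parts, id_, score = fields
    return {'name': ' '.join(name_parts), 'id': int(id_), 'score': int(score)}

print(to_record(['john', 'doe', '1', '90']))
print(to_record(['prince', '2', '95']))
```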

Removing characters from a tuple

I'm using
Users = win32net.NetGroupGetUsers(IP,'none',0),
to get all the local users on a system. The output is a tuple,
(([{'name': u'Administrator'}, {'name': u'Guest'}, {'name': u'Tom'}], 3, 0),)
I want to clean this up so it just prints out "Administrator, Guest, Tom". I tried using strip and replace, but you can't use those on tuples. Is there a way to convert this into a string so I can manipulate it, or is there an even simpler way to go about it?
This should not end with a comma:
Users = win32net.NetGroupGetUsers(IP,'none',0),
The trailing comma turns the result into a single item tuple containing the result, which is itself a tuple.
The data you want is in Users[0].
>>> print(Users[0])
[{'name': u'Administrator'}, {'name': u'Guest'}, {'name': u'Tom'}]
To unpack this list of dictionaries we use a generator expression:
Users = win32net.NetGroupGetUsers(IP,'none',0)
print(', '.join(d['name'] for d in Users[0]))
', '.join(user['name'] for user in Users[0][0])
data = (([{'name': u'Administrator'}, {'name': u'Guest'}, {'name': u'Tom'}], 3, 0),)
in_list = data[0][0]
names = [x['name'] for x in in_list]
print(names)
[u'Administrator', u'Guest', u'Tom']
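For reference, the same extraction in Python 3 syntax, run against the sample output from the question rather than a live win32net call (so the tuple literal below is copied data, not an API result; user_dicts is a name introduced here):

```python
# Sample data as printed in the question (outer tuple caused by the trailing comma).
users = (([{'name': 'Administrator'}, {'name': 'Guest'}, {'name': 'Tom'}], 3, 0),)

# users[0] is the real result tuple; its first item is the list of user dicts.
user_dicts = users[0][0]
names = ', '.join(d['name'] for d in user_dicts)
print(names)
```

Without the trailing comma in the original assignment, the extra [0] would not be needed and the list would be users[0].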
