combine fields and values in python for dictionary - python

I'm probably overlooking something, but I've looked everywhere for a way to do this. I am trying to join fields and values, which come out separated for SQL, into something I can use with MongoDB.
So for example (input):
fields = ['first-name', 'last-name', 'email-address', 'phone-number']
values = ['John', 'Doe', 'john.doe@johndoe.com', '1-800-123-4567']
Output:
{
    'first-name': 'John',
    'last-name': 'Doe',
    'email-address': 'john.doe@johndoe.com',
    'phone-number': '1-800-123-4567'
}
I need it like this so I can do a simple lookup (I know I don't strictly need to do this):
from pymongo import MongoClient

def getFirstName(self, lastname):
    client = MongoClient()
    db = client.test.contacts
    # find_one returns a single document (or None); find would return a cursor
    result = db.find_one({'last-name': lastname})
    return result['first-name']

self.getFirstName("Doe")
My app supports MySQL and PostgreSQL, so I can't really change how it emits fields and values without breaking those. Sorry if I made code errors; I typed this off the top of my head.
If you need more info, just ask.

You can use zip to pair up the two lists and pass the result to dict():
dict(zip(fields, values))
This assumes, though, that the two lists are always the same length; zip stops at the shorter list, so any extra items are silently dropped.
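As a minimal sketch of the zip approach with the length assumption made explicit, the helper below refuses mismatched lists instead of dropping items. The function name combine is hypothetical and only used for illustration; the sample data is taken from the question.

```python
def combine(fields, values):
    """Pair each field name with its value, refusing mismatched lengths."""
    if len(fields) != len(values):
        raise ValueError(
            "expected %d values, got %d" % (len(fields), len(values))
        )
    # zip yields (field, value) pairs; dict() turns them into a mapping.
    return dict(zip(fields, values))


fields = ['first-name', 'last-name', 'email-address', 'phone-number']
values = ['John', 'Doe', 'john.doe@johndoe.com', '1-800-123-4567']

document = combine(fields, values)
print(document['last-name'])  # Doe
```

The resulting dict can be passed as-is to a MongoDB insert, since its keys are plain strings.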

You could use a dict comprehension and iterate through the lists like:
d = {fields[i] : values[i] for i in range(len(fields))}

Related

Python Mongodb sorting too big, how to use index?

I'm trying to iterate in Python over all elements of a large Mongodb database.
Usually, I do:
mgclient = MongoClient('mongodb://user:pwd@0.0.0.0:27017')
mgdb = mgclient['mongo']
mgcol = mgdb['name']
for mg_ob in mgcol.find().sort('Date').sort('time'):
    # DOTHINGS
But it says "Sort operation used more than the maximum 33554432 bytes of RAM. Add an index, or specify a smaller limit".
So I created an index named 'SortedTime', but I don't understand how I can use it now.
Basically, I'm trying to have something like:
mgclient = MongoClient('mongodb://user:pwd@0.0.0.0:27017')
mgdb = mgclient['mongo']
mgcol = mgdb['name']
for mg_ob in mgcol.find()['SortedTime']:
    # DOTHINGS
Any ideas? A helping hand would be much appreciated.
I hope this post will help others. Thank you very much
Update:
I managed to make it work thanks to Joe. After I created the Index:
resp = mgcol.create_index(
    [
        ("date", 1),
        ("time", 1)
    ]
)
print("index response:", resp)
What I did was just:
mgclient = MongoClient('mongodb://user:pwd@0.0.0.0:27017')
mgdb = mgclient['mongo']
mgcol = mgdb['name']
for mg_ob in mgcol.find():
    # DOTHINGS
No need to use the index name.
Your query is sorting on 2 fields, Date and time, so you will need an index that includes these fields first in the key specification.
Working from the mongo shell, you might use the createIndex shell helper:
db.getSiblingDB("mongo").getCollection("name").createIndex({Date:1, time:1})
Working from the client side, you might use the createIndexes database command.
Once the index has been created, query just like you did before and the mongod's query executor should use the index.
You can use explain() to get detailed query execution stages to see which indexes were considered and the comparative performance of each.
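From PyMongo, the same idea might look like the sketch below. It assumes the field names Date and time from the question (adjust the case to match your documents), and the helper name sorted_cursor is hypothetical. Both sort keys go into a single sort() call because PyMongo only honours the last sort() applied to a cursor, so chaining two calls would sort by time alone.

```python
# Compound key specification: (field, direction) pairs, 1 = ascending.
SORT_KEYS = [("Date", 1), ("time", 1)]


def sorted_cursor(collection, keys):
    """Ensure an index on `keys` exists, then return a cursor sorted by them.

    `collection` is expected to be a pymongo Collection.
    """
    # Creating an index that already exists with the same keys is harmless.
    collection.create_index(keys)
    # One sort() call with every key, so the query matches the index.
    return collection.find().sort(keys)


# Usage against a real server (connection details are placeholders):
#   from pymongo import MongoClient
#   mgcol = MongoClient("mongodb://user:pwd@0.0.0.0:27017")["mongo"]["name"]
#   for mg_ob in sorted_cursor(mgcol, SORT_KEYS):
#       ...
#   # explain() on the cursor returns the query plan for inspection:
#   plan = mgcol.find().sort(SORT_KEYS).explain()
```

The index does not need to be referenced by name; the server picks it because its leading keys match the sort.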

Django exclude for same field with different values

I have to create a query like this:
Obj.objects.exclude(title__iexact="Hello", title__iexact="Hell")
I want to create the exclude query as a dictionary from a list and use kwargs to pass it to exclude. Is it possible?
I know this will never exclude any records, because the condition is impossible. But the query comes from a parser, so I need to make sure it can be passed correctly to the SQL engine.
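I got it: Q objects. The link below is pretty good:
http://www.michelepasin.org/blog/2010/07/20/the-power-of-djangos-q-objects/
Some code:
import operator
from functools import reduce
from django.db.models import Q

negative_rules = []
for rule in rules:
    # Each entry is a (lookup, value) pair, e.g. ("title__iexact", "Hello")
    negative_rules.append((rule.key.lower() + "__iexact", rule.value))
# OR the Q objects together; fall back to an empty Q when there are no rules
negative_rules = reduce(operator.or_, [Q(x) for x in negative_rules]) if negative_rules else Q()
Obj.objects.exclude(negative_rules)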

Retrieve all related records using values_list or similar

I have 3 models: Album, Picture and PictureFile, linked by Picture.album and Picture.pic_file.
I'd like to do something like the following, and end up with a list of PictureFile records:
pic_files = Picture.objects.filter(album_id=1).exclude(pic_file=None).values_list('pic_file', flat=True)
The above just seems to spit out a list of pic_file_ids. I'm wondering whether Django provides a built-in way to handle this, or whether I simply need to loop through each one and construct the list myself.

Modify the order in which properties are displayed in MongoDB

I am using PyMongo to insert data (title, description, phone_number ...) into MongoDB. However, when I use mongo client to view the data, it displays the properties in a strange order. Specifically, phone_number property is displayed first, followed by title and then comes description. Is there some way I can force a particular order?
The above question and answer are quite old. Anyhow, if somebody visits this I feel like I should add:
The original answer below is wrong. In MongoDB, documents ARE ordered key-value pairs. However, PyMongo represents documents as Python dicts, which historically were not ordered (CPython 3.6 dicts retain insertion order as an implementation detail, and Python 3.7 made it a language guarantee). This is a limitation of the PyMongo driver, not of MongoDB.
Be aware that this limitation actually affects usability: if you query the database for a subdocument, it will only match if the order of the key-value pairs is the same.
Just try the following code yourself:
from pymongo import MongoClient

db = MongoClient().testdb
col = db.testcol
subdoc = {
    'field1': 1,
    'field2': 2,
    'field3': 3
}
document = {
    'subdoc': subdoc
}
col.insert_one(document)
# count_documents replaces Cursor.count(), which was removed in PyMongo 4
print(col.count_documents({'subdoc': subdoc}))
Each time this code is executed, the 'same' document is added to the collection, so the printed value 'should' increase by one on every run. It does not, because find only matches subdocuments whose keys are in the same order, and on Python versions where dicts are unordered the subdoc can be inserted with its keys in an arbitrary order.
See the following answer for how to use an ordered dict to overcome this: https://stackoverflow.com/a/30787769/4273834
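As a pure-Python analogy (no database involved), the sketch below shows the difference the ordering makes. Plain dict comparison ignores key order, while OrderedDict comparison respects it, which is closer to how MongoDB compares an embedded document in an exact-match query. The variable names are hypothetical.

```python
from collections import OrderedDict

stored = OrderedDict([("field1", 1), ("field2", 2), ("field3", 3)])
same_order = OrderedDict([("field1", 1), ("field2", 2), ("field3", 3)])
other_order = OrderedDict([("field3", 3), ("field1", 1), ("field2", 2)])

# Plain dicts compare equal regardless of key order.
print(dict(stored) == dict(other_order))  # True

# OrderedDicts compare equal only when the key order matches too.
print(stored == same_order)   # True
print(stored == other_order)  # False
```

Building the subdocument as an ordered mapping with a fixed key order, both when inserting and when querying, keeps the two sides consistent.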
Original answer (2013):
MongoDB documents are BSON objects, unordered dictionaries of key-value pairs. So you can't rely on, or set, a specific field order. The only thing you can control is which fields are returned and which are not; see the docs on find's projection argument.
Also see related questions on SO:
MongoDB field order and document position change after update
Can MongoDB and its drivers preserve the ordering of document elements
Ordering fields from find query with projection
Hope that helps.

Loading a DB table into nested dictionaries in Python

I have a table in a MySQL database that I want to load into a dictionary in Python.
The table columns are as follows:
id,url,tag,tagCount
tagCount is the number of times that a tag has been repeated for a certain url. So I need a nested dictionary (a dictionary of dictionaries) to load this table, because each url has several tags, each with a different tagCount. The code that I used is this (the whole table is about 22,000 records):
from collections import defaultdict

cursor.execute('''SELECT url, tag, tagCount
                  FROM wtp''')
urlTagCount = cursor.fetchall()
d = defaultdict(defaultdict)
for url, tag, tagCount in urlTagCount:
    d[url][tag] = tagCount
print(d)
First of all, I want to know whether this is correct, and if it is, why does it take so much time? Are there any faster solutions? I am loading this table into memory for fast access, to avoid slow database operations, but at this speed it has become a bottleneck itself; it is even much slower than DB access. Can anyone help? Thanks.
You need to ensure that the dictionary (and each of the nested dictionaries) exists before you assign a key and value to it. setdefault is helpful for this purpose. You end up with something like this:
d = {}
for url, tag, tagCount in urlTagCount:
    d.setdefault(url, {})[tag] = tagCount
Maybe you could try normal dicts with tuple keys, like:
d = dict()
for url, tag, tagCount in urlTagCount:
    d[(url, tag)] = tagCount
In any case, did you try:
d = defaultdict(dict)
instead of
d = defaultdict(defaultdict)
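A minimal sketch comparing the two suggested layouts, using a few made-up rows in place of cursor.fetchall() (the URLs, tags and counts are hypothetical sample data):

```python
from collections import defaultdict

# Hypothetical rows standing in for cursor.fetchall(): (url, tag, tagCount).
rows = [
    ("a.example", "python", 3),
    ("a.example", "mongodb", 1),
    ("b.example", "python", 7),
]

nested = defaultdict(dict)  # url -> {tag: tagCount}
flat = {}                   # (url, tag) -> tagCount
for url, tag, tag_count in rows:
    nested[url][tag] = tag_count
    flat[(url, tag)] = tag_count

print(nested["a.example"]["python"])  # 3
print(flat[("b.example", "python")])  # 7
```

The nested form makes it easy to list every tag for one url, while the tuple-keyed form keeps a single flat dictionary when only exact (url, tag) lookups are needed.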
I managed to verify the code, and it works perfectly. For amateurs like me, I suggest never trying to print a very large nested dictionary: printing d on the last line of the code was what made it slow. If you remove that line, or access the dictionary with actual keys, it is very fast.
