Python MongoDB retrieve value of ISODate field - python

I am writing a script in Python that will query a MongoDB database and parse the results into a format that can be imported into a relational database.
The data is stored in an associative array. I am able to query most of the fields by using dot notation such as $status.state.
Issue:
The issue is that the ISODate field $last_seen does not return a value when I attempt to use dot notation.
# Last seen dot notation does not return a value
"u_updated_timestamp": "$last_seen.date"
Here is the data structure:
{
"status" : {
"state" : "up",
},
"addresses" : {
"ipv4" : "192.168.1.1"
},
"last_seen" : ISODate("2016-04-29T14:06:17.441Z")
}
Here is the code that I am starting with. All of the other fields are returning in the correct format. However, the last_seen ISO date field is not returning any value at all. What other steps are required in order to retrieve the value?
I tried $dateToString, but it did not work (we are running PyMongo 2.7, and $dateToString requires MongoDB 3.0+).
computers = db['computer'].aggregate([
{"$project" : {
"u_ipv4": "$addresses.ipv4",
"u_status": "$status.state",
# Last seen dot notation does not return a value
"u_updated_timestamp": "$last_seen.date"
}}
])
I also tried simply $last_seen, but that returns both the key and the value; I only need the value.
UPDATE: The desired format is flexible. It could be a unix timestamp or mm-dd-yyyy. Any format would be acceptable. The main issue is that there is no date value being returned at all with this query as it stands currently.
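Since last_seen holds a BSON date rather than a subdocument, "$last_seen.date" resolves to nothing. Given the server version constraint, one workaround is to project "$last_seen" itself and format the value in Python after the query (a minimal sketch; the returned document is stubbed out as a plain dict):

```python
from datetime import datetime

# PyMongo returns BSON dates as Python datetime objects, so the
# formatting can happen client-side after the aggregation. Here the
# returned document is stubbed out as a plain dict.
doc = {"u_updated_timestamp": datetime(2016, 4, 29, 14, 6, 17, 441000)}

# Project "$last_seen" on its own (no ".date" suffix) in the pipeline,
# then format the resulting datetime in Python:
formatted = doc["u_updated_timestamp"].strftime("%m-%d-%Y")
```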

Related

How to set type of field when inserting into mongodb

So I am querying an API, receiving data, and then storing it in MongoDB.
All was working fine until I started using Mongo's aggregation pipeline. That's when I realized Mongo is storing the numeric data as strings, so my aggregation pipeline won't work: I am doing numerical computations such as calculating averages, but Mongo sees the values as strings.
How can I set the type of a field during insert, so that I can specify that it is a float, etc.?
What I have tried so far is the code below, but it does not work, because the mongo shell complains that the field name starts with a number:
db.weeklycol.find().forEach(function(ch) {
    db.weeklycol.update(
        {"_id": ch._id},
        {"$set": {
            "4_close": parseInt(ch.4_close)
        }}
    );
});
To access a property whose name is unusual, use bracket notation []:
ch['4_close']
As for saving numbers, I ran a test:
> db.test.insertOne({_id:1, field: 2})
{ "acknowledged" : true, "insertedId" : 1 }
> db.test.find({_id:1})
{ "_id" : 1, "field" : 2 }
The number seems to be stored correctly. Can you please post an exact code example with some dummy values where the inserted object has a property with a numeric value and the stored document has that value turned into a string?
I have managed to resolve this with the code below. I change the variable type before inserting it by calling this function on the whole dictionary just before the insert: if a value is a number it is converted, otherwise it is left alone.
I have a similar function which converts to dates as well; instead of float on line 4, it uses a date conversion.
def convertint(bdic):
    for key, value in bdic.items():
        try:
            bdic[key] = float(value)
        except (TypeError, ValueError):
            pass
    return bdic
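The date-converting variant mentioned above is not shown in the original; a hypothetical sketch (the format string and field names are assumptions) might look like:

```python
from datetime import datetime

def convertdate(bdic, fmt="%Y-%m-%d"):
    # Like convertint, but parses date strings into datetime objects
    # so PyMongo stores them as BSON dates rather than strings.
    for key, value in bdic.items():
        try:
            bdic[key] = datetime.strptime(value, fmt)
        except (TypeError, ValueError):
            pass
    return bdic

doc = convertdate({"first": "2016-03-01", "name": "example"})
```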

Elasticsearch fix date field value. (change field value from int to string)

I used Python to ship data from MongoDB to Elasticsearch.
There is a timestamp field named update_time, mapped like:
"update_time" : {
    "type": "date",
    "format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||epoch_millis"
},
epoch_millis is used for timestamps.
After everything was done, I used a range query to find docs within a date range, but nothing came back. After some research, I think the problem is this: a Python timestamp from int(time.time()) has 10 digits (seconds), but Elasticsearch expects 13 digits (milliseconds). Official example:
PUT my_index/my_type/2?timestamp=1420070400000.
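The seconds-vs-milliseconds mismatch can be seen directly in Python (a small illustration; 1420070400 corresponds to 2015-01-01T00:00:00Z, matching the official example's 1420070400000):

```python
import time

seconds = 1420070400     # 10-digit Unix timestamp (seconds)
millis = seconds * 1000  # 13-digit value that epoch_millis expects

# int(time.time()) yields seconds, so multiply by 1000 before indexing:
now_millis = int(time.time() * 1000)
```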
So I tried to update the update_time field by multiplying it by 1000:
{
    "script": {
        "inline": "ctx._source.update_time=ctx._source.update_time*1000"
    }
}
Unfortunately, I found that all the update_time values became negative. Then it occurred to me that Java's int type maxes out at 2^31 - 1 (2147483647), much lower than 1420070400000, so I guess the timestamp field should not be an int?
I then wanted to update the field value from int to string (I don't want to change the field's mapping type; I know a mapping change requires a reindex, but here I only need to change the value).
But I couldn't figure out what script to use; the official scripting documentation doesn't mention this.
I didn't know Groovy was a language; I thought it was only used for math methods, because the ES docs only covered that.
The solution is simple: just convert the int to a long.
curl -XPOST http://localhost:9200/my_index/_update_by_query -d '
{
    "script": {
        "inline": "ctx._source.update_time=(long)ctx._source.update_time*1000"
    }
}'

Simple MongoDB query slow

I am new to MongoDB. I am trying to write some data to a Mongo database from a Python script; the data structure is simple:
{"name":name, "first":"2016-03-01", "last":"2016-03-01"}
I have a script that queries whether the "name" exists; if it does, it updates the "last" date, otherwise it creates the document.
if db.collections.find_one({"name": the_name}):
The size of the data is actually very small: under 5 MB and fewer than 150k records.
It was fast at first (e.g., for the first 20,000 records) and then got slower and slower. I checked the profiler output; some queries took more than 50 milliseconds, but I don't see anything abnormal about those records.
Any ideas?
Update 1:
It seems there is no index on the "name" field:
> db.my_collection.getIndexes()
[
{
"v" : 1,
"key" : {
"_id" : 1
},
"name" : "_id_",
"ns" : "domains.my_collection"
}
]
First, you should check whether the collection has an index on the "name" field. See the output of the following command in the mongo CLI.
db.my_collection.getIndexes();
If there is no index, create it (note: in a production environment you should create the index in the background).
db.my_collection.createIndex({name:1},{unique:true});
And if you want to insert a document when it does not exist, or update a field when it does, you can do it in one step without pre-querying. Use the update command with the upsert option and the $set/$setOnInsert operators (see https://docs.mongodb.org/manual/reference/operator/update/setOnInsert/).
db.my_collection.update(
    {name:"the_name"},
    {
        $set:{last:"current_date"},
        $setOnInsert:{first:"current_date"}
    },
    {upsert:true}
);
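In PyMongo the same one-step upsert might be sketched as follows (the collection and field names follow the question; only the filter and update documents are built here, with the actual call shown as a comment since no server connection is assumed):

```python
from datetime import date

the_name = "example"              # hypothetical name value
today = date.today().isoformat()  # e.g. "2016-03-01"

filter_doc = {"name": the_name}
update_doc = {
    "$set": {"last": today},           # always refresh "last"
    "$setOnInsert": {"first": today},  # set "first" only when inserting
}

# With a live connection (PyMongo 3.x) this would be:
# db.my_collection.update_one(filter_doc, update_doc, upsert=True)
# Older PyMongo (2.x) uses:
# db.my_collection.update(filter_doc, update_doc, upsert=True)
```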

Set correct data type with mongodb and python

I've just started using MongoDB via the Python driver (pymongo). When I'm posting new data (which actually comes from a MySQL db), some data types appear to be incorrectly mapped. For example, some single-digit numbers are inserted as long ints, and a timestamp is inserted as a string. Also, a date stored in MySQL as YY-MM-DD is changed to YY-MM-DD 00:00:00 (i.e., a time is added). This seems like a waste of space. Is this normal for MongoDB, or should I somehow change the data types that are incorrectly(?) mapped?
P.S. I did search through the MongoDB docs but couldn't find anything matching my query.
post = {
"title": video_title,
"ext_id": video_external_id,
"source": video_source,
"date_added": d1,
"views":{
"views_all": views_all,
"views_year": views_yr,
"views_day": views_day,
"views_week": views_wk,
"views_month": views_mo
},
"video_type": 0,
"hd": video_hd,
"features": featured,
"video_thumbs": video_thumbnails,
"video_main_thumb": video_main_thumbnail,
"length": video_length,
"length_sort": video_length,
"rating": {
"rating_all": rating_all,
"rating_year": rating_yr,
"rating_day": rating_day,
"rating_week": rating_wk,
"rating_month": rating_mo
}
}
posts = db.posts
post_id = video_list.insert(post)
For example, some single digit number are inserted as long ints
PyMongo stores a Python 2.x long as BSON int64 regardless of value, and a Python int as BSON int32 or int64 depending on the value. See the table here for a mapping of Python types to BSON types.
a timestamp is inserted as a string
Assuming the timestamp was passed in ISO-8601 format, that's correct. If you want to store the timestamp as a BSON datetime, pass a Python datetime object instead.
Also the date which is stored in MySQL as YY-MM-DD is changed to YY-MM-DD 00:00:00 (i.e a time is added)
BSON (the storage format used by MongoDB) does not have a date type without a time component. It is up to your application to convert dates to datetimes.
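To store the timestamp as a BSON datetime, the string can be parsed into a Python datetime before the insert (a sketch; the field names and the format string are assumptions based on the question's post dict):

```python
from datetime import datetime

# A timestamp that arrived from MySQL as a string:
raw = "2016-04-29 14:06:17"

# Parsed into a datetime, PyMongo stores it as a BSON date rather
# than a string:
d1 = datetime.strptime(raw, "%Y-%m-%d %H:%M:%S")

post = {"title": "example video", "date_added": d1}
```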

Why am I getting 2 different results for similar queries?

I'm trying to link one document to another by storing the ObjectID of one document in the other. I'm trying a couple of different approaches that should produce the same result, but they actually look different. Here are the approaches:
Method 1
owner['ownedCar'] = db.cars.find_one({ '_id' : ObjectId( $theCarsObjectIDstring ) }, {'_id': 1})
db.owners.save(owner)
which looks like this in the database:
{
_id {"$oid": "502186421fe3321dfa000001"}
}
and Method 2
car = db.cars.find_one( { '_id' : ObjectId( $theCarsObjectIDstring ) } )
owner['ownedCar'] = car['_id']
db.owners.save(owner)
which looks like this:
{"$oid": "502186421fe3321dfa000001"}
Shouldn't they look the same? What's the preferred way to link documents?
These two results are the same; the difference is in how you pick out the result to populate the linked field.
When you use the second parameter of find to select fields, even if you select just one, it always returns an object with field names as keys and field values as values. You assign that whole object to the linked field, so you don't get just the ID back as its value. So the result of your first query is:
{
_id {"$oid": "502186421fe3321dfa000001"}
}
And you make the field equal that.
Alternatively, in the second query you explicitly pick out car['_id'], so the value of the linked field is just the id.
This is a driver- and language-level difference in how values are returned.
I would say the second method is best, since the first adds unnecessary bloat to the field in the form of the extra wrapping object.
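The difference in shape can be illustrated without a database (a small sketch; the _id value here is a stand-in string rather than a real ObjectId):

```python
# What find_one({...}, {'_id': 1}) returns: a whole document (a dict),
# even when only one field is projected.
projected = {"_id": "502186421fe3321dfa000001"}

# Method 1 stores the dict itself:
owned_car_method1 = projected

# Method 2 extracts just the value:
owned_car_method2 = projected["_id"]
```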
