Problem:
I'm trying to fetch a record with the Django ORM from a table that contains a JSON field, using the following line:
test_object = House.objects.get(id=301)
Error:
TypeError: the JSON object must be str, bytes or bytearray, not dict
Possible issue:
I noticed that a previous developer changed the format of the JSON field in the table; it seems the JSON was badly formatted. This is the script that was used to reformat the JSON column:
import json

for i in data:
    jsonstring = json.dumps(i.result)
    # strip stray backslashes, then replace bare NaN tokens with null
    new_clear = jsonstring.replace("\\", "")
    new_clear = new_clear.replace("NaN", "null")
    i.result = json.loads(new_clear)
    i.save()
Comments:
In pgAdmin the JSON field looks fine and is formatted properly; see a partial copy of the JSON below:
{"owner_id": 45897, "first_name": "John", "last_name": "DNC", "estate_id": 3201, "sale_date": "3/18/19", "property_street": "123 main st", "property_city": "Miami", "property_state": "FL", "property_zipcode": 33125, "Input_First_Name": "John", "Input_Last_Name": "DNC"}
I would like to know how to deal with this JSON field in order to query the object. Any help will be appreciated. Thanks.
Check if there's a custom decoder being used in the field (docs reference).
If the JSON data is valid in the DB, try connecting to the DB in a shell using psycopg2.connect(), running the query, and decoding the raw value with json.loads().
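A minimal sketch of that check (the connection string and the app_house table/column names below are placeholders, not taken from the question):

import json

import psycopg2

# placeholder connection string and table/column names
conn = psycopg2.connect("dbname=mydb user=me password=secret host=localhost")
cur = conn.cursor()
cur.execute("SELECT result::text FROM app_house WHERE id = %s", (301,))
raw = cur.fetchone()[0]

# if this raises, the stored value is not valid JSON text
parsed = json.loads(raw)
print(type(parsed), parsed.get("owner_id"))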
I'd intended to post this as a comment, but I don't have enough reputation, in case that's of any concern.
Related
I am trying to upsert user data to Salesforce using a Python PATCH request. I have a dataset in the form of a DataFrame that contains several null values. While trying to upsert the data to Salesforce, it throws this error:
{'message': 'Deserializing the instance of date from VALUE_STRING, value nan, cannot be executed or the request is missing a required field at [line:1, column:27]', 'errorCode': 'JSON_PARSER_ERROR'}
To resolve the error I tried to replace the values with None and also with null, as in the code below. Still, I receive the same error.
df_1.fillna(value=None, method=None, inplace=True)
df_1 = df_1.replace(np.NaN, "null")
The error then is:
{'message': 'Deserializing the instance of date from VALUE_STRING, value null, cannot be executed or the request is missing a required field at [line:1, column:27]', 'errorCode': 'JSON_PARSER_ERROR'}
Any possible leads would be immensely helpful.
You'll need to find a way to inspect the final JSON generated just before it's sent out.
You can use Workbench to experiment (there's "Utilities -> REST Explorer" after you log in), and over the normal REST API the update is a straightforward operation:
PATCH {your instance url}/services/data/v55.0/sobjects/Account/0017000000Lg8Wh
(put your account id) with body (either form works)
{
    "BillingCity": null,
    "BillingCountry": ""
}
should clear the fields. A "null" won't work; it counts as a string.
====
If you're using "Bulk API 2.0" (https://trailhead.salesforce.com/content/learn/modules/api_basics/api_basics_bulk; I think you'd notice, it's a different, asynchronous dance of initialise job, upload data, start processing, periodically check "is it done yet"...), then for the JSON format null should work too; for XML you need a special tag, and if your format is CSV it's supposed to be #N/A.
Try
df.replace({np.nan: None}, inplace=True)
This is the equivalent of submitting a null or empty string value to Salesforce.
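A minimal end-to-end sketch of that approach; the DataFrame contents, instance URL, session token, and Ext_Id__c external-ID field below are all placeholders:

import json

import numpy as np
import pandas as pd
import requests

# hypothetical frame with a missing date
df_1 = pd.DataFrame({"Name": ["Acme"], "Close_Date__c": [np.nan]})

# NaN is not valid JSON; None serializes to null
df_1 = df_1.replace({np.nan: None})
record = df_1.to_dict(orient="records")[0]

# placeholder instance URL, session token, and external-ID path
url = ("https://example.my.salesforce.com/services/data/v55.0"
       "/sobjects/Account/Ext_Id__c/12345")
resp = requests.patch(
    url,
    headers={"Authorization": "Bearer <session token>",
             "Content-Type": "application/json"},
    data=json.dumps(record),  # None becomes null in the payload
)
print(resp.status_code, resp.text)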
I have a simplified JSON string as follows:
j = '{"_id": {"_id": "5923e0e8bf681d1000abea4c", "copyingData": true}, "currency": "USD"}'
What I'd like to do is deserialize it so that, in return, I'd have a dictionary with _id as a bson ObjectId and currency as a string, similar to the original document retrieved using pymongo.
How do I go about it?
I tried to use bson.json_util.loads with various arguments, but it just loaded it as plain JSON (so that _id is a dict).
Thank you!
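For what it's worth, json_util.loads only recognizes MongoDB extended JSON keys such as $oid and $date, and the string above nests a plain "_id" string, so it parses as an ordinary dict. A minimal sketch of converting it by hand:

import json

from bson import ObjectId

j = '{"_id": {"_id": "5923e0e8bf681d1000abea4c", "copyingData": true}, "currency": "USD"}'

doc = json.loads(j)
# rebuild the ObjectId from the nested hex string (this drops the copyingData flag)
doc["_id"] = ObjectId(doc["_id"]["_id"])
print(doc)  # {'_id': ObjectId('5923e0e8bf681d1000abea4c'), 'currency': 'USD'}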
I am using Python 2.7 and psycopg2 to connect to PostgreSQL.
I read a bunch of data from a source that has strings like 'Aéropostale'. I then store it in the database. However, in PostgreSQL it ends up as 'A\u00e9ropostale', but I want it stored as 'Aéropostale'.
The encoding of the PostgreSQL database is UTF-8.
Please tell me how I can store the actual string 'Aéropostale' instead.
I suspect that the problem is happening in Python. Please advise.
EDIT:
Here is my data source:
response_json = json.loads(response.json())
response is obtained via a service call and looks like:
print(type(response.json()))
>> <type 'str'>
print(response.json())
>> {"NameRecommendation": {"ValueRecommendation": [{"Value": "\"Handmade\""}, {"Value": "Abercrombie & Fitch"}, {"Value": "A\u00e9ropostale"}, {"Value": "Ann Taylor"}]}}
From the above data, my goal is to construct a list of all the ValueRecommendation.Value entries and store it in a PostgreSQL json-typed column. So the Python equivalent of the list I want to store is:
py_list = ["Handmade", "Abercrombie & Fitch", "A\u00e9ropostale", "Ann Taylor"]
Then I convert py_list into its JSON representation using json.dumps():
json_py_list = json.dumps(py_list)
And finally, to insert, I use psycopg2.cursor() and mogrify():
conn = psycopg2.connect("connectionString")
cursor = conn.cursor()
cursor.execute(cursor.mogrify("INSERT INTO table (columnName) VALUES (%s)", (json_py_list,)))
As I mentioned earlier, using the above logic, strings with special characters like é are getting stored as \u escape codes.
Please spot my mistake.
json.dumps escapes non-ASCII characters by default so its output can work in non-Unicode-safe environments. You can turn this off with:
json_py_list = json.dumps(py_list, ensure_ascii=False)
Now you will get UTF-8-encoded bytes (unless you change that too with encoding=), so you'll need to make sure your database connection is using that encoding.
In general it shouldn't make any difference, as both forms are valid JSON, and even with ensure_ascii off there are still characters that get \u-encoded.
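A quick illustration of the difference (Python 2.7 syntax, matching the question):

# -*- coding: utf-8 -*-
import json

py_list = [u"Handmade", u"Abercrombie & Fitch", u"Aéropostale", u"Ann Taylor"]

print(json.dumps(py_list))                      # [..., "A\u00e9ropostale", ...]
print(json.dumps(py_list, ensure_ascii=False))  # [..., "Aéropostale", ...]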
I am collecting stock market data periodically and storing it in MongoDB using pymongo as:
db.apple_stock.insert({'time': datetime.datetime.utcnow(), 'price': price})
Now I want the output in JSON format so that I can use it with Highstock:
[
    [1403546401372, 343],
    [1403560801637, 454],
    [1403575202199, 345],
    [1403618402379, 345]
]
Tornado is running on the server, and 'mysite.com/api/stock.json' should provide the above data as JSON.
So I query my database and use pymongo's json_util to dump to JSON:
from bson.json_util import dumps
dumps(query_result)
I am getting output as:
[
    [{"$date": 1403546401372}, 343],
    [{"$date": 1403560801637}, 454],
    [{"$date": 1403575202199}, 353]
]
So how do I change the first item of each pair from a dictionary to a plain value containing only the timestamp? Is there a function available that does this, or do I have to iterate through the list and convert it myself?
Secondly, if I really do have to iterate the list, what is the proper way of storing the data in MongoDB so that I get the required output directly?
The dumps functionality is deliberately supplied in the pymongo driver utilities in order to provide what is known as extended JSON format.
The purpose of this is to provide "type fidelity", since JSON itself is not aware of strict "types" such as those the BSON format is designed for and that MongoDB itself uses. It allows the transfer of JSON to clients that "may" be able to understand the "types" presented, such as "$date" or "$oid", and then use those keys to "de-serialize" the JSON into a specific "type" under that language specification.
What you will find "otherwise" with a standard form of JSON encode under most language implementations is either:
An interesting tuple of the time values
A string representing the time value
An epoch timestamp representing the time value
The best case for "custom serialization" is either to iterate the returned structure and implement serialization of the types yourself, or to just use the dumps form along with a "decode" of the JSON and then remove those "keys" identifying the "type".
Or, of course, just live with the base JSON encode outside of the special library and its results.
The proper BSON date types that result in this are the "proper" way of storing data in MongoDB. The usage of the library function is "by design", but the final way you use it is actually up to you.
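A sketch of the first option, iterating and serializing yourself; the collection and field names follow the question, and the connection details are placeholders:

import calendar
import json

from pymongo import MongoClient

# db handle as in the question (connection details are placeholders)
db = MongoClient("mongodb://localhost:27017")["mydb"]

def to_highstock(cursor):
    # turn each {'time': datetime, 'price': n} document into the
    # [epoch_ms, price] pairs Highstock expects (sub-second precision is dropped)
    return [
        [calendar.timegm(doc["time"].timetuple()) * 1000, doc["price"]]
        for doc in cursor
    ]

pairs = to_highstock(db.apple_stock.find({}, {"_id": 0, "time": 1, "price": 1}))
print(json.dumps(pairs))  # e.g. [[1403546401000, 343], ...]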
When I try to get this data from my MongoDB database using flask-restful and pymongo, I get some weirdly formatted data.
For example, this is what the data looks like in the database:
{ "_id" : ObjectId("5217f3cc7466c06862c4a4f7"), "Hello" : "World" }
This is what it looks like when it gets returned from the database.
"{\"_id\": {\"$oid\": \"5217f3cc7466c06862c4a4f7\"}, \"Hello\": \"World\"}"
Using this code:
def toJSON(data):
    return json.dumps(data, default=json_util.default)
And this:
def get(self, objectid):
    collection = db["products"]
    result = collection.find_one({"_id": ObjectId(objectid)})
    return toJSON(result)
Anyone know what I'm doing wrong?
No, that's supposed to be like that.
MongoDB uses BSON, which extends JSON with some extra types, such as ObjectId. To represent those in JSON, you get the weird-looking $oid and friends.
The backslashes are most likely added by some tool to allow for quotes inside a string literal (which is itself enclosed by quotes), unless you are somehow double-encoding things.
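A small round-trip illustration of that extended JSON encoding (nothing here is specific to flask-restful):

from bson import ObjectId, json_util

doc = {"_id": ObjectId("5217f3cc7466c06862c4a4f7"), "Hello": "World"}

s = json_util.dumps(doc)
print(s)                   # {"_id": {"$oid": "5217f3cc7466c06862c4a4f7"}, "Hello": "World"}
print(json_util.loads(s))  # round-trips back to an ObjectId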
flask-restful expects you to return a dictionary, not JSON, here. It will convert the dictionary into JSON on its own. So your code should look like:
def get(self, objectid):
    collection = db["products"]
    result = collection.find_one({"_id": ObjectId(objectid)})
    result["_id"] = str(result["_id"])
    return result
When you return JSON, flask-restful sees that it is a string and escapes the double quotes.
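A quick way to reproduce the double-encoding effect with just the standard library:

import json

payload = json.dumps({"Hello": "World"})
print(payload)              # {"Hello": "World"}
print(json.dumps(payload))  # "{\"Hello\": \"World\"}", encoded again as a string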