Golang - detecting duplicate keys in JSON input - python

I recently completed a project where I used the "object hook" in Python to detect whether a JSON key was a duplicate of another key. Normal JSON decoders seem to just keep the last value encountered, but I would like to detect the duplicate and return an error. My new project (at a new company) is to write this in Go, so I'm wondering whether there is a method similar to Python's object hook. I had also used a different object hook to get an "ordered dict" in Python: essentially a list form of the JSON input, with the ordering of the original JSON preserved. I haven't been tasked with that in Go yet, but I bet it's coming. Input on either of these JSON capabilities as they relate to Go would be appreciated!

What you need is a json library which works like a bufio.Scanner or SAX. One is implemented here: github.com/garyburd/json. It will generate events as it scans which you can use to discover duplicate keys.
Here's an example of how to use it:
package main

import (
	"fmt"
	"github.com/garyburd/json"
	"io"
	"strings"
)

type Nothing struct{}

// Context records the kind of the enclosing container and the keys seen in it.
type Context struct {
	Kind json.Kind
	Keys map[string]Nothing
}

func Validate(rdr io.Reader) error {
	scanner := json.NewScanner(rdr)
	stack := []Context{}
	for scanner.Scan() {
		if scanner.Kind() == json.Object || scanner.Kind() == json.Array {
			// Entering a container: push a fresh key set.
			stack = append(stack, Context{
				Kind: scanner.Kind(),
				Keys: map[string]Nothing{},
			})
		} else if scanner.Kind() == json.End {
			if len(stack) == 0 {
				return fmt.Errorf("expected start object or array")
			}
			stack = stack[:len(stack)-1]
		} else if len(stack) > 0 {
			current := stack[len(stack)-1]
			if current.Kind == json.Object {
				key := string(scanner.Name())
				_, exists := current.Keys[key]
				if exists {
					return fmt.Errorf("found duplicate key: %v", key)
				}
				current.Keys[key] = Nothing{}
			}
		}
	}
	return nil
}

func main() {
	rdr := strings.NewReader(`
	{
		"x": 10,
		"y": {
			"z": 1,
			"z": 2
		},
		"z": [1,2,3,4,5]
	}
	`)
	err := Validate(rdr)
	if err == nil {
		fmt.Println("valid json!")
	} else {
		fmt.Println("invalid json:", err)
	}
}
As it walks through the JSON object it builds a stack of hash tables (one for each nested object or array). Any duplicate key in one of those hash tables results in an error. If you need more detail, you could easily add a Name property to the Context and walk the stack backwards to generate a JSON path (for example, "a.b.c.d is a duplicate key").
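For comparison, the Python approach the question refers to can be sketched with the standard library's object_pairs_hook, which receives each object's members as a list of (key, value) pairs before duplicates are collapsed. The names reject_duplicates and validate are made up for illustration; this is a minimal sketch, not the asker's original code.

```python
import json


def reject_duplicates(pairs):
    """Hypothetical hook: build a dict from (key, value) pairs, failing on repeats."""
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise ValueError(f"found duplicate key: {key}")
        seen[key] = value
    return seen


def validate(text):
    # object_pairs_hook is called once for every JSON object that is decoded.
    return json.loads(text, object_pairs_hook=reject_duplicates)


doc = '{"x": 10, "y": {"z": 1, "z": 2}, "z": [1, 2, 3, 4, 5]}'
try:
    validate(doc)
    print("valid json!")
except ValueError as err:
    print("invalid json:", err)
```

Passing collections.OrderedDict as the hook instead would keep the members in their original order, which covers the second capability mentioned in the question.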

Related

Access dictionary to X depth with a list of X values

Situation
I want a function that takes a full dictionary path as a parameter and returns the value or node I need, without walking the dictionary node by node.
Code
This is the function. Obviously, as it stands, it raises TypeError: unhashable type: 'list', but it is only meant to convey the idea.
def get_section(api_data, section):
    if "/" in section:
        section = section.split("/")
        return api_data.json()[section]
    return api_data.json()[section]
Example
JSON
{
    "component": {
        "name": "gino",
        "measures": [
            {
                "value": "12"
            },
            {
                "value": "14"
            }
        ]
    },
    "metrics": {
        ...
    }
}
Expectation
analyses = get_section(analyses_data, "component/measures") # Returns measures node
analyses = get_section(analyses_data, "component/name") # Returns 'gino'
analyses = get_section(analyses_data, "component/measures/value") # Returns error, because it's ambiguous
Request
How can I do it?
Edits
Added examples for clarity
A cool solution could be:
def get_section(api_data, section):
    return [api_data := api_data[sec] for sec in section.split("/")][-1]
So if you execute it with:
analyses_data = {
    "analyses": {
        "dates": {
            "xyz": "abc"
        }
    }
}
print(get_section(analyses_data, "analyses/dates/xyz")) # Returns: abc
Or since you are accessing a json using a custom method:
print(get_section(analyses_data.json(), "analyses/dates/xyz")) # Returns: abc
This works because the := operator (available since Python 3.8) is an assignment that also returns the assigned value. The comprehension loops over every part of the section string, reassigning api_data to the result of accessing that key and storing each intermediate result in a list. The [-1] at the end then returns the last assignment, which corresponds to the last accessed key (i.e. the deepest dictionary level reached).
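The same walk can also be written without the assignment expression. The sketch below swaps in functools.reduce, which is a different technique from the answer above but follows the same idea of descending one key at a time; the sample data is the one from the answer.

```python
from functools import reduce


def get_section(data, section):
    """Walk a nested dict along a "/"-separated path such as "a/b/c"."""
    # reduce feeds the result of each lookup into the next one.
    return reduce(lambda node, key: node[key], section.split("/"), data)


analyses_data = {"analyses": {"dates": {"xyz": "abc"}}}
print(get_section(analyses_data, "analyses/dates/xyz"))
```

When the path reaches a list, as with "component/measures/value" in the question, indexing the list with a string raises TypeError, which matches the expectation that an ambiguous path is an error.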

Get either a single object or list of objects from an argument of PYBIND11 binded c++ function

I'm trying to bind a function that accepts multiple arguments and keyword arguments with PYBIND.
The function looks something like this:
{
    OutputSP output;
    InputSP input;
    if (args.size() == 1)
    {
        input = py::cast<InputSP>(args[0]);
    }
    else if (args.size() == 2)
    {
        // Assign to the variables declared above, which are passed to method() below.
        output = py::cast<OutputSP>(args[0]);
        input = py::cast<InputSP>(args[1]);
    }
    // TODO: Default values - should be elsewhere?
    Flags mode(Flags::FIRST_OPTION);
    std::vector<MultiFlags> status({ MultiFlags::COMPLETE });
    for (auto& kv : kwargs)
    {
        auto key_name = py::cast<string>(kv.first);
        if (key_name == "flag")
        {
            mode = py::cast<Flags>(kv.second);
        }
        else if (key_name == "status")
        {
            status = py::cast<std::vector<MultiFlags> >(kv.second);
        }
    }
    return method(output, input, mode, status);
}
Where Flags and MultiFlags are defined like this
py::enum_<Flags>(m, "Flags")
.value("FirstOption", Flags::FIRST_OPTION)
.value("SecondOption", Flags::SECOND_OPTION)
.export_values();
py::enum_<MultiFlags>(m, "MultiFlags")
.value("Complete", MultiFlags::COMPLETE)
.value("Incomplete", MultiFlags::INCOMPLETE)
.export_values();
and the wrapper with
m.def("method", &method_wrapper);
Now this should work fine if the call contains
status=[MultiFlags.Complete]
I'm looking for a way to check the type of the kwarg in advance so I could also accept a call that contains
status=MultiFlags.Complete
but I can't find anything relevant in the pybind11 docs. What is the correct way to do this?
Found it.
py::cast throws py::cast_error when casting fails, so I can treat the two options with try-catch:
else if (key_name == "status")
{
    try
    {
        status = py::cast<std::vector<MultiFlags> >(kv.second);
    }
    catch (py::cast_error&)
    {
        status = { py::cast<MultiFlags>(kv.second) };
    }
}

TypeError: unhashable type: 'dict' Python/Flask

So, I'm working on a little program that is supposed to send values to a Firestore database. Almost everything is working fine, but I get an error from this part of the code. I'm trying to save the string that is inside temp:
if block == "ITEMS":
    champs = form.areaItems.data  # Get the user input text field from the WTForm (it's a dict for whatever reason)
    itemsChamps = ItemsChamps(champs.values())  # Store the dict values inside itemsChamps
    temp = next(iter(itemsChamps.name))  # Get the 1st value from itemsChamps (I only want the 1st value)
    data = {
        "items": {
            champs: {
                "string": temp
            }
        }
    }
Here is the error :
File "C:\[..]\flaskblog\routes.py", line 63, in ajouter
"string": temp
TypeError: unhashable type: 'dict'
My code may look a bit confusing; I'm a newbie, sorry for that!
Edit 1: It works now!
I feel so dumb right now. I was a bit confused by all the code I wrote, and there were a few mistakes:
if block == "ITEMS":
    champs = form.itemsFields.data  # I was using the wrong form field...
    itemsChamps = ItemsChamps(form.areaItems.data.values())  # I'm now getting all the values from the right field
    temp = next(iter(itemsChamps.name))  # Didn't touch this, it works
    data = {
        "items": {
            champs: {
                "string": temp
            }
        }
    }
Thank you for giving me a little of your time!
The problem is with this piece of code. champs is a dictionary, and you are using it as a key. A dict key has to be hashable (for example a str, int, float or tuple of hashable values), and a dictionary is not.
data = {
    "items": {
        champs: {
            "string": temp
        }
    }
}
If champs["user_input"] is the data you are interested in you can change champs to champs["user_input"] to solve this.
You are trying to use a dict as a dict-key, to cite your comment: "is a dict for some reason". Dicts are not hashable and therefore cannot be used as keys. Maybe extract the data from the dict and use this as key?
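A minimal sketch of the failure and of the suggested fix follows. The contents of champs (a single "user_input" entry) are invented for illustration, since the question does not show what the form field actually returns.

```python
# Hypothetical stand-in for form.areaItems.data from the question.
champs = {"user_input": "sword"}

try:
    # Using the dict itself as a key fails because dicts are not hashable.
    data = {"items": {champs: {"string": "x"}}}
except TypeError as err:
    print("TypeError:", err)

# Extract a hashable value (here the first one) and use that as the key.
key = next(iter(champs.values()))
data = {"items": {key: {"string": "x"}}}
print(data)
```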

Unwanted double quotes around server response in Python Flask-RESTful [duplicate]

Given a string of JSON data, how can I safely turn that string into a JavaScript object?
Obviously I can do this unsafely with something like:
var obj = eval("(" + json + ')');
but that leaves me vulnerable to the JSON string containing other code, which it seems very dangerous to simply eval.
JSON.parse(jsonString) is a pure JavaScript approach so long as you can guarantee a reasonably modern browser.
The jQuery method is now deprecated. Use this method instead:
let jsonObject = JSON.parse(jsonString);
Original answer using deprecated jQuery functionality:
If you're using jQuery just use:
jQuery.parseJSON( jsonString );
It's exactly what you're looking for (see the jQuery documentation).
This answer is for IE < 7, for modern browsers check Jonathan's answer above.
This answer is outdated and Jonathan's answer above (JSON.parse(jsonString)) is now the best answer.
JSON.org has JSON parsers for many languages including four different ones for JavaScript. I believe most people would consider json2.js their goto implementation.
Use the simple code example in "JSON.parse()":
var jsontext = '{"firstname":"Jesper","surname":"Aaberg","phone":["555-0100","555-0120"]}';
var contact = JSON.parse(jsontext);
and reversing it:
var str = JSON.stringify(arr);
This seems to be the issue:
An input received via Ajax, a websocket, etc. will be in string format, but you need to know whether it can be parsed by JSON.parse. The trouble is, if you always run it through JSON.parse, the program MAY continue "successfully", but you'll still see an error thrown in the console with the dreaded "Error: unexpected token 'x'".
var data;
try {
  data = JSON.parse(jqxhr.responseText);
} catch (_error) {}
data || (data = {
  message: 'Server error, please retry'
});
I'm not sure about other ways to do it but here's how you do it in Prototype (JSON tutorial).
new Ajax.Request('/some_url', {
  method: 'get',
  requestHeaders: {Accept: 'application/json'},
  onSuccess: function(transport) {
    var json = transport.responseText.evalJSON(true);
  }
});
Calling evalJSON() with true as the argument sanitizes the incoming string.
If you're using jQuery, you can also use:
$.getJSON(url, function(data) { });
Then you can do things like
data.key1.something
data.key1.something_else
etc.
Just for fun, here is a way using a function:
jsonObject = (new Function('return ' + jsonFormatData))()
$.ajax({
  url: url,
  dataType: 'json',
  data: data,
  success: callback
});
The callback is passed the returned data, which will be a JavaScript object or array as defined by the JSON structure and parsed using the $.parseJSON() method.
Using JSON.parse is probably the best way.
Here's an example
var jsonRes = '{ "students" : [' +
'{ "firstName":"Michel" , "lastName":"John" ,"age":18},' +
'{ "firstName":"Richard" , "lastName":"Joe","age":20 },' +
'{ "firstName":"James" , "lastName":"Henry","age":15 } ]}';
var studentObject = JSON.parse(jsonRes);
The easiest way using parse() method:
var response = '{"result":true,"count":1}';
var JsonObject= JSON.parse(response);
Then you can get the values of the JSON elements, for example:
var myResponseResult = JsonObject.result;
var myResponseCount = JsonObject.count;
Using jQuery as described in the jQuery.parseJSON() documentation:
JSON.parse(jsonString);
Try using the method with this Data object. ex:Data='{result:true,count:1}'
try {
  eval('var obj=' + Data);
  console.log(obj.count);
}
catch(e) {
  console.log(e.message);
}
This method really helps in Nodejs when you are working with serial port programming
I found a "better" way:
In CoffeeScript:
try data = JSON.parse(jqxhr.responseText)
data ||= { message: 'Server error, please retry' }
In Javascript:
var data;
try {
  data = JSON.parse(jqxhr.responseText);
} catch (_error) {}
data || (data = {
  message: 'Server error, please retry'
});
JSON parsing is always a pain. If the input is not as expected it throws an error and crashes what you are doing.
You can use the following tiny function to safely parse your input. It always returns an object, even if the input is not valid or is already an object, which is better for most cases:
JSON.safeParse = function (input, def) {
  // Convert null to empty object
  if (!input) {
    return def || {};
  } else if (Object.prototype.toString.call(input) === '[object Object]') {
    return input;
  }
  try {
    return JSON.parse(input);
  } catch (e) {
    return def || {};
  }
};
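For the Python side of a Flask-style service, the same defensive pattern might look like the sketch below. The helper name safe_parse is hypothetical, and unlike the JavaScript version it does not coerce valid non-object JSON (such as "0" or "null") into a dict; that difference is an assumption of this sketch.

```python
import json


def safe_parse(text, default=None):
    """Hypothetical helper: return parsed JSON, or a fallback when parsing fails."""
    if default is None:
        default = {}
    if isinstance(text, dict):
        return text  # already decoded, pass it through
    try:
        return json.loads(text)
    except (TypeError, ValueError):
        # TypeError: input is not str/bytes (e.g. None); ValueError: invalid JSON.
        return default


print(safe_parse('{"result": true, "count": 1}'))
print(safe_parse("not json"))
```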
Parse the JSON string with JSON.parse(), and the data becomes a JavaScript object:
JSON.parse(jsonString)
Here, JSON is the built-in object used to process JSON data.
Imagine we received this text from a web server:
'{ "name":"John", "age":30, "city":"New York"}'
To parse into a JSON object:
var obj = JSON.parse('{ "name":"John", "age":30, "city":"New York"}');
Here obj is the respective JSON object which looks like:
{ "name":"John", "age":30, "city":"New York"}
To fetch a value use the . operator:
obj.name // John
obj.age //30
Convert a JavaScript object into a string with JSON.stringify().
JSON.parse(jsonString);
JSON.parse will convert the string into an object.
JSON.parse() converts any JSON string passed into the function into a JavaScript object.
To understand it better, press F12 to open "Inspect Element" in your browser and go to the console to write the following commands:
var response = '{"result":true,"count":1}'; //sample json object(string form)
JSON.parse(response); //converts passed string to JSON Object.
Now run the command:
console.log(JSON.parse(response));
You'll get output as an Object {result: true, count: 1}.
In order to use that Object, you can assign it to the variable, maybe obj:
var obj = JSON.parse(response);
By using obj and the dot (.) operator you can access properties of the JSON object.
Try to run the command:
console.log(obj.result);
Official documentation:
The JSON.parse() method parses a JSON string, constructing the JavaScript value or object described by the string. An optional reviver function can be provided to perform a transformation on the resulting object before it is returned.
Syntax:
JSON.parse(text[, reviver])
Parameters:
text
: The string to parse as JSON. See the JSON object for a description of JSON syntax.
reviver (optional)
: If a function, this prescribes how the value originally produced by parsing is transformed, before being returned.
Return value
The Object corresponding to the given JSON text.
Exceptions
Throws a SyntaxError exception if the string to parse is not valid JSON.
If we have a doubly encoded string like this (note the outer quotes and the escaped inner quotes):
"{\"status\":1,\"token\":\"65b4352b2dfc4957a09add0ce5714059\"}"
then we can simply use JSON.parse twice to convert this string to a JavaScript object:
var sampleString = '"{\\"status\\":1,\\"token\\":\\"65b4352b2dfc4957a09add0ce5714059\\"}"';
var jsonString = JSON.parse(sampleString);
var jsonObject = JSON.parse(jsonString);
And we can extract values from the JSON object using:
// instead of last JSON.parse:
var { status, token } = JSON.parse(jsonString);
The result will be:
status = 1 and token = 65b4352b2dfc4957a09add0ce5714059
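On the Python side, unwanted outer quotes usually mean the payload was serialized twice. The sketch below reproduces that with the standard json module and shows that two decoding passes undo it; the payload values are the ones from the example above, and how the double encoding arises in any particular server is an assumption.

```python
import json

payload = {"status": 1, "token": "65b4352b2dfc4957a09add0ce5714059"}

# Encoding an already-encoded string wraps it in quotes and escapes the inner ones.
double_encoded = json.dumps(json.dumps(payload))

once = json.loads(double_encoded)   # first pass: still a str
twice = json.loads(once)            # second pass: now a dict

print(type(once).__name__, type(twice).__name__)
print(twice["status"], twice["token"])
```

The cleaner fix is normally to serialize only once on the server, so that the client needs a single parse.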
Performance
There are already good answers to this question, but I was curious about performance, so on 2020.09.21 I ran tests on macOS High Sierra 10.13.6 with Chrome v85, Safari v13.1.2 and Firefox v80 for the chosen solutions.
Results
The eval/Function (A, B, C) approach is fast on Chrome (but for a big deep object, N=1000, they crash with "maximum call stack size exceeded")
eval (A) is fast/medium fast on all browsers
JSON.parse (D, E) is fastest on Safari and Firefox
Details
I performed 4 test cases:
for a small shallow object
for a small deep object
for a big shallow object
for a big deep object
The objects used in the above tests:
let obj_ShallowSmall = {
  field0: false,
  field1: true,
  field2: 1,
  field3: 0,
  field4: null,
  field5: [],
  field6: {},
  field7: "text7",
  field8: "text8",
}
let obj_DeepSmall = {
  level0: {
   level1: {
    level2: {
     level3: {
      level4: {
       level5: {
        level6: {
         level7: {
          level8: {
           level9: [[[[[[[[[['abc']]]]]]]]]],
  }}}}}}}}},
};
let obj_ShallowBig = Array(1000).fill(0).reduce((a,c,i) => (a['field'+i]=getField(i),a) ,{});
let obj_DeepBig = genDeepObject(1000);

// ------------------
// Show objects
// ------------------
console.log('obj_ShallowSmall:',JSON.stringify(obj_ShallowSmall));
console.log('obj_DeepSmall:',JSON.stringify(obj_DeepSmall));
console.log('obj_ShallowBig:',JSON.stringify(obj_ShallowBig));
console.log('obj_DeepBig:',JSON.stringify(obj_DeepBig));

// ------------------
// HELPERS
// ------------------
function getField(k) {
  let i=k%10;
  if(i==0) return false;
  if(i==1) return true;
  if(i==2) return k;
  if(i==3) return 0;
  if(i==4) return null;
  if(i==5) return [];
  if(i==6) return {};
  if(i>=7) return "text"+k;
}
function genDeepObject(N) {
  // generate: {level0:{level1:{...levelN: {end:[[[...N-times...['abc']...]]] }}}...}}}
  let obj={};
  let o=obj;
  let arr = [];
  let a=arr;
  for(let i=0; i<N; i++) {
    o['level'+i]={};
    o=o['level'+i];
    let aa=[];
    a.push(aa);
    a=aa;
  }
  a[0]='abc';
  o['end']=arr;
  return obj;
}
Below snippet presents chosen solutions
// src: https://stackoverflow.com/q/45015/860099
function A(json) {
  return eval("(" + json + ')');
}
// https://stackoverflow.com/a/26377600/860099
function B(json) {
  return (new Function('return ('+json+')'))()
}
// improved https://stackoverflow.com/a/26377600/860099
function C(json) {
  return Function('return ('+json+')')()
}
// src: https://stackoverflow.com/a/5686237/860099
function D(json) {
  return JSON.parse(json);
}
// src: https://stackoverflow.com/a/233630/860099
function E(json) {
  return $.parseJSON(json)
}
// --------------------
// TEST
// --------------------
let json = '{"a":"abc","b":"123","d":[1,2,3],"e":{"a":1,"b":2,"c":3}}';
[A,B,C,D,E].map(f=> {
  console.log(
    f.name + ' ' + JSON.stringify(f(json))
  )})
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>
This snippet only presents the functions used in the performance tests; it does not perform the tests itself!
And here are example results for chrome
Converting the object to JSON, and then parsing it, works for me, like:
JSON.parse(JSON.stringify(object))
The recommended approach to parse JSON in JavaScript is to use JSON.parse()
Background
The JSON API was introduced with ECMAScript 5 and has since been implemented in >99% of browsers by market share.
jQuery once had a $.parseJSON() function, but it was deprecated with jQuery 3.0. In any case, for a long time, it was nothing more than a wrapper around JSON.parse().
Example
const json = '{ "city": "Boston", "population": 500000 }';
const object = JSON.parse(json);
console.log(object.city, object.population);
Browser Compatibility
Is JSON.parse supported by all major browsers?
Pretty much, yes (see reference).
This is an older question, I know, but nobody has mentioned this solution using new Function(), an anonymous function that returns the data.
Just an example:
var oData = 'test1:"This is my object",test2:"This is my object"';
if( typeof oData !== 'object' )
  try {
    oData = (new Function('return {'+oData+'};'))();
  }
  catch(e) { oData=false; }
if( typeof oData !== 'object' )
  { alert( 'Error in code' ); }
else {
  alert( oData.test1 );
  alert( oData.test2 );
}
This is a little safer because it executes inside a function and is not compiled directly into your code. So if there is a function declaration inside it, it will not be bound to the default window object.
I use this to 'compile' configuration settings of DOM elements (for example the data attribute) simple and fast.
Summary:
JavaScript (both in the browser and in Node.js) has a built-in JSON object. This object has two convenient methods for dealing with JSON:
JSON.parse() takes a JSON string as argument and returns a JS object
JSON.stringify() takes a JS object as argument and returns a JSON string
Other applications:
Besides conveniently dealing with JSON, they can be used for other purposes. Combining both JSON methods makes it very easy to create deep clones of arrays or objects. For example:
let arr1 = [1, 2, [3 ,4]];
let newArr = arr1.slice();
arr1[2][0] = 'changed';
console.log(newArr); // not a deep clone
let arr2 = [1, 2, [3 ,4]];
let newArrDeepclone = JSON.parse(JSON.stringify(arr2));
arr2[2][0] = 'changed';
console.log(newArrDeepclone); // A deep clone, values unchanged
You also can use reviver function to filter.
var data = JSON.parse(jsonString, function reviver(key, value) {
  // your code here to filter
});
For more information read JSON.parse.
Just to cover parsing for different input types:
Parse the data with JSON.parse(), and the data becomes a JavaScript object.
var obj = JSON.parse('{ "name":"John", "age":30, "city":"New York"}');
When using the JSON.parse() on a JSON derived from an array, the method will return a JavaScript array, instead of a JavaScript object.
var myArr = JSON.parse(this.responseText);
console.log(myArr[0]);
Date objects are not allowed in JSON.
For dates, do something like this:
var text = '{ "name":"John", "birth":"1986-12-14", "city":"New York"}';
var obj = JSON.parse(text);
obj.birth = new Date(obj.birth);
Functions are not allowed in JSON.
If you need to include a function, write it as a string.
var text = '{ "name":"John", "age":"function () {return 30;}", "city":"New York"}';
var obj = JSON.parse(text);
obj.age = eval("(" + obj.age + ")");
Another option
const json = '{ "fruit": "pineapple", "fingers": 10 }'
let j0s,j1s,j2s,j3s
console.log(`{ "${j0s="fruit"}": "${j1s="pineapple"}", "${j2s="fingers"}": ${j3s="10"} }`)
Try this. This one is written in typescript.
export function safeJsonParse(str: string) {
  try {
    return JSON.parse(str);
  } catch (e) {
    return str;
  }
}

Use Python and JSON to recursively get all keys associated with a value

Given data organized in JSON format (code example below), how can we get the path of keys and sub-keys associated with a given value?
For example, given the input "23314" we need to return a list with:
Fanerozoico, Cenozoico, Quaternario, Pleistocenico, Superior.
Since the data is a JSON file, we decoded it using Python and the json library:
import json
def decode_crono(crono_file):
    with open(crono_file) as json_file:
        data = json.load(json_file)
From here on, we do not know how to process it to get what we need.
We can access keys like this:
k = data["Fanerozoico"]["Cenozoico"]["Quaternario"]["Pleistocenico"].keys()
or values like this:
v = data["Fanerozoico"]["Cenozoico"]["Quaternario"]["Pleistocenico"]["Superior"].values()
but this is still far from what we need.
{
    "Fanerozoico": {
        "id": "20000",
        "Cenozoico": {
            "id": "23000",
            "Quaternario": {
                "id": "23300",
                "Pleistocenico": {
                    "id": "23310",
                    "Superior": {
                        "id": "23314"
                    },
                    "Medio": {
                        "id": "23313"
                    },
                    "Calabriano": {
                        "id": "23312"
                    },
                    "Gelasiano": {
                        "id": "23311"
                    }
                }
            }
        }
    }
}
It's a little hard to understand exactly what you are after here, but it seems like for some reason you have a bunch of nested json and you want to search it for an id and return a list that represents the path down the json nesting. If so, the quick and easy path is to recurse on the dictionary (that you got from json.load) and collect the keys as you go. When you find an 'id' key that matches the id you are searching for you are done. Here is some code that does that:
def all_keys(search_dict, key_id):
    def _all_keys(search_dict, key_id, keys=None):
        if not keys:
            keys = []
        for i in search_dict:
            if search_dict[i] == key_id:
                return keys + [i]
            if isinstance(search_dict[i], dict):
                potential_keys = _all_keys(search_dict[i], key_id, keys + [i])
                if 'id' in potential_keys:
                    keys = potential_keys
                    break
        return keys
    return _all_keys(search_dict, key_id)[:-1]
The reason for the nested function is to strip off the 'id' key that would otherwise be on the end of the list.
This is really just to give you an idea of what a solution might look like. Beware the python recursion limit!
Based on the assumption that you need the full dictionary path until a key named id has a particular value, here's a recursive solution that iterates the whole dict. Bear in mind that:
The code is not optimized at all
For huge json objects it might yield StackOverflow :)
It will stop at first encountered value found (in theory there shouldn't be more than 1 if the json is semantically correct)
The code:
import json

SEARCH_KEY_NAME = "id"
FOUND_FLAG = ()
CRONO_FILE = "a.jsn"

def decode_crono(crono_file):
    with open(crono_file) as json_file:
        return json.load(json_file)

def traverse_dict(dict_obj, value):
    for key in dict_obj:
        key_obj = dict_obj[key]
        if key == SEARCH_KEY_NAME and key_obj == value:
            # Found the id: start building the path from an empty tuple.
            return FOUND_FLAG
        elif isinstance(key_obj, dict):
            inner = traverse_dict(key_obj, value)
            if inner is not None:
                return (key,) + inner
    return None

if __name__ == "__main__":
    value = "23314"
    json_dict = decode_crono(CRONO_FILE)
    result = traverse_dict(json_dict, value)
    print(result)
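Since both answers warn about the recursion limit, here is a sketch of the same search done iteratively with an explicit stack. The function name find_path and the inlined sample data are illustrative assumptions; the logic mirrors the recursive versions rather than replacing them.

```python
def find_path(tree, value, key_name="id"):
    """Return the tuple of keys leading to the dict whose key_name equals value."""
    stack = [((), tree)]  # pairs of (path so far, node to inspect)
    while stack:
        path, node = stack.pop()
        if node.get(key_name) == value:
            return path
        for key, child in node.items():
            if isinstance(child, dict):
                stack.append((path + (key,), child))
    return None  # value not present anywhere in the tree


# Trimmed copy of the question's data, inlined so the sketch runs on its own.
data = {
    "Fanerozoico": {"id": "20000", "Cenozoico": {"id": "23000", "Quaternario": {
        "id": "23300", "Pleistocenico": {
            "id": "23310",
            "Superior": {"id": "23314"},
            "Medio": {"id": "23313"},
        }}}}
}
print(find_path(data, "23314"))
```

Because the pending nodes live in a list instead of on the call stack, the nesting depth is limited only by available memory.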
