Using Bad Json in Python

Using Bad Json in Python - python

I am having a json in a file which i want to access in my Python Code. The Json file looks like :
{
"fc1" : {
region : "Delhi",
marketplace : "IN"
},
"fc2" : {
region : "Rajasthan",
marketplace : "IN"
}
}
The above json i want to use in my Python code. I want to access according to its keys("fc1", "fc2")
Since this is not like actual json, i am facing difficulty in accessing the values in json.
Is there any way in python language to access these type of json.
Thanks.

I agree with the comment that, if you generated that file, then you should put quotes around region and marketplace when generating it (or have the person who generated it do the same). However, if this absolutely isn't an option for whatever reason, the following approach might work:
import json
data_string = """
{
"fc1":{
region:"Delhi",
marketplace: "IN"
},
"fc2" : {
region:"Rajasthan",
marketplace: "IN"
}
}
"""
data = json.loads(data_string.replace('region', '"region"').replace('marketplace', '"marketplace"'))
data
>>>{'fc1': {'region': 'Delhi', 'marketplace': 'IN'},
'fc2': {'region': 'Rajasthan', 'marketplace': 'IN'}}
Note that you would have to do the same for any unquoted key.

There is module dirtyjson which reads this incorrect JSON.
import dirtyjson
data_string = """
{
"fc1":{
region:"Delhi",
marketplace: "IN"
},
"fc2" : {
region:"Rajasthan",
marketplace: "IN"
}
}
"""
data = dirtyjson.loads(data_string)
print(data)
print(data['fc1'])
print(data['fc2'])

Related

How to get value off JSON in Robot framework

I have for example a log that will change each time it is run an example is below. I will like to take one of the value(id) lets say as a variable and log only the id to console or use that value somewhere else.
[
{
"#type": "type",
"href": [
{
"#url": "url1",
"#method": "get"
},
{
"#url": "url2",
"#method": "post"
},
{
"#url": "url3",
"#method": "post"
}
],
"id": "3",
"meta": [
{
"key": "key1",
"value": "value1"
},
{
"key": "key2",
"value": "value2"
}
]
}
]
I want to get the id in a variable because the id changes after each time the robot framework is ran

You can see here that the JSON you are getting is in list format. Which means that to get a value from the JSON, you'll first need to read the JSON object in, then get the dictionary out of the list and only then access the key value you'd need to extract.
Robot Framework supports using Python statements with Evaluate keyword. When we need to simply parse some string to JSON we can get by using
${DATA}= Evaluate json.loads("""${DATA}""")
Notice that the ${DATA} here should contain your JSON as a string.
Now that we have the JSON object, we can do whatever we want with it. We can actually see from your JSON that it is actually a dictionary nested inside a list object (See surrounding []). So first, extract dictionary from the list, then access the dictionary key normally. The following should work fine.
${DICT}= Get From List ${DATA} 0
${ID}= Get From Dictionary ${DICT} id

Examine and tweak a given analyzer?

I'm using the French analyzer.
Having examined the output from IndexClient.analyze(...) for this analyzer I'm a little unhappy with some of the stopwords (e.g. the expression 'ayant-cause' comes out as 'caus', because 'ayant' is a stopword: French stopwords).
How do I go about examining these stopwords and then tweaking them? Do I have to create a custom analyzer based on the existing French one? Or can I directly tweak the French one?
NB I am using the Python elasticsearch module ("thin client"), but an answer in terms of REST commands would be fine.

Yes, you can easily tweak the existing analyzer and examine them using the Analyze API of elasticsearch
Ultimately analyzer is made of three things, char filter, tokeniser and token-filter and you can create your own combination of these things to build your own custom analyzer and test it using the REST API.

Spent quite a bit of time figuring out at least a workaround arrangement.
Having downloaded that French stop-words file from Github I then edited it (e.g. to exclude "ayant"). Currently residing in the "config" directory of my installed ES setup (although you can set an absolute path).
Then I made my settings/mappings object like this:
{
'settings' : {
'analysis' : {
'analyzer' : {
'tweaked_french' : {
'type' : 'french',
# NB W10, config path currently D:\apps\ElasticSearch\elasticsearch-7.10.2\config
'stopwords_path' : 'tweaked_french_stop.txt',
},
},
},
},
'mappings': {
'dynamic': 'strict',
'properties': {
'my_french_field' : {
'type' : 'text',
'term_vector' : 'with_positions_offsets',
'fields' : {
'french' : {
'type' : 'text',
'analyzer' : 'tweaked_french',
'term_vector' : 'with_positions_offsets',
},
},
},
},
},
}
What is then rather wonderful is that, according to my experiments, you can get a query object to find and use that custom-built analyser (i.e. it's there and available, in the installed index). So your query object is relatively simple:
{
'query': {
'simple_query_string': {
'query': query_text,
'fields': [
'my_french_field.french',
],
'analyzer' : 'tweaked_french',
},
},
'highlight': {
'fields': {
'my_french_field.french': {
'type': 'fvh',
...
},
},
'number_of_fragments': 0
}
}
After that you can query in French: your query gets stemmed and the result is used for the search. If "ayant" is a word in your query string, it will now return hits including "ayant-cause", proving that both the query and the mapping spec are using the tweaked stop-word list.
I'd still like to know whether a way exists not involving using an external file, i.e. of editing on-the-fly what is already there (or of just seeing what it already there...).

Python: Pretty print json file with array ID's

I would like to pretty print a json file where i can see the array ID's. Im working on a Cisco Nexus Switch with NX-OS that runs Python (2.7.11). Looking at following code:
cmd = 'show interface Eth1/1 counters'
out = json.loads(clid(cmd))
print (json.dumps(out, sort_keys=True, indent=4))
This gives me:
{
"TABLE_rx_counters": {
"ROW_rx_counters": [
{
"eth_inbytes": "442370508663",
"eth_inucast": "76618907",
"interface_rx": "Ethernet1/1"
},
{
"eth_inbcast": "4269",
"eth_inmcast": "49144",
"interface_rx": "Ethernet1/1"
}
]
},
"TABLE_tx_counters": {
"ROW_tx_counters": [
{
"eth_outbytes": "217868085254",
"eth_outucast": "66635610",
"interface_tx": "Ethernet1/1"
},
{
"eth_outbcast": "1137",
"eth_outmcast": "557815",
"interface_tx": "Ethernet1/1"
}
]
}
}
But i need to access the field by:
rxuc = int(out['TABLE_rx_counters']['ROW_rx_counters'][0]['eth_inucast'])
rxmc = int(out['TABLE_rx_counters']['ROW_rx_counters'][1]['eth_inmcast'])
rxbc = int(out['TABLE_rx_counters']['ROW_rx_counters'][1]['eth_inbcast'])
txuc = int(out['TABLE_tx_counters']['ROW_tx_counters'][0]['eth_outucast'])
txmc = int(out['TABLE_tx_counters']['ROW_tx_counters'][1]['eth_outmcast'])
txbc = int(out['TABLE_tx_counters']['ROW_tx_counters'][1]['eth_outbcast'])
So i need to know the array ID (in this example zeros and ones) to access the information for this interface. It seems pretty easy with only 2 arrays, but imagine 500. Right now, i always copy the json code to jsoneditoronline.org where i can see the ID's:
Is there an easy way to make the IDs visible within python itself?

You posted is valid JSON.
The image is from a tool that takes the data from JSON and displays it. You can display it in any way you want, but the contents in the file will need to be valid JSON.
If you do not need to load the JSON later, you can do with it whatever you like, but json.dumps() will give you JSON only.

How does Python JSON library dealing with time?

So I'm currently learning MongoDB and I'm using PyMongo rather than MongoDB shell.
When I started trying the basic CRUD operations, I found it is hard to load the bios data using PyMongo, since the original data posted on the website had a strange ISODATA for time.
The original python JSON library seemed to be not support this and the mongoimport seemed to be not support this either(not sure). But I found this, after modifying into {$date:"2017-04-01T05:00:00Z"}, mongoimport was working.
Right now I'm using subprocess to call a external command to import the data. So my question is, how to use python correctly read the JSON data and using PyMongo to insert the data.
Details
the bios data in the mongodb documentation looks like this
{
"_id" : 1,
"name" : {
"first" : "John",
"last" : "Backus"
},
"birth" : ISODate("1924-12-03T05:00:00Z"),
"death" : ISODate("2007-03-17T04:00:00Z"),
"contribs" : [
"Fortran",
"ALGOL",
"Backus-Naur Form",
"FP"
],
"awards" : [
{
"award" : "W.W. McDowell Award",
"year" : 1967,
"by" : "IEEE Computer Society"
},
{
"award" : "National Medal of Science",
"year" : 1975,
"by" : "National Science Foundation"
},
{
"award" : "Turing Award",
"year" : 1977,
"by" : "ACM"
},
{
"award" : "Draper Prize",
"year" : 1993,
"by" : "National Academy of Engineering"
}
]
}
And when I try to parse it with Python's JSON library, I get a error messagejson.decoder.JSONDecodeError because of the "birth" : ISODate("1924-12-03T05:00:00Z"),. And mongoimport can not parse this because of the same reason.
When I modified,
"birth" : ISODate("1924-12-03T05:00:00Z"), into
"birth" : $date:"2017-04-01T05:00:00Z"
mongoimport was working but python still wasn't able to parse it.
What I am asking here is a way to deal this problem within Python and PyMongo rather than calling a external commands.

The example that you're looking at was probably intended to be used within the mongo shell, where the use of the ISODate bson type can be parsed as shown.
Outside of that, we have the challenge that JSON does not have a date datatype, nor does it have a standard way of representing dates. To deal with this challenge, MongoDB created something called Extended JSON, which can encode dates in JSON similar to how you have shown with $date.
In order to work with Extended JSON in Python / PyMongo, you could use json_util.
Here's a brief example:
from bson.json_util import loads
from pymongo import MongoClient
json = '''
{
"_id" : 1,
"name" : {
"first" : "John",
"last" : "Backus"
},
"birth" : {"$date":"2017-04-01T05:00:00.000Z"},
"death" : {"$date":"2017-04-01T05:00:00.000Z"}
}
'''
bson = loads(json)
print(str(bson))
db = MongoClient().test
collection = db.bios
collection.insert(bson)

JSON parsing using python

I'm attempting to understand the basics of JSON and thought using some Google translate examples would be interesting. I'm not actually making requests via the API but they have the following example I have saved as "file.json":
{
"data": {
"detections": [
[
{
"language": "en",
"isReliable": false,
"confidence": 0.18397073
}
]
]
}
}
I'm reading in the file and used simplejson:
json_data = open('file.json').read()
json = simplejson.loads(json_data)
>>> json
{'data': {'detections': [[{'isReliable': False, 'confidence': 0.18397073, 'language': 'en'}]]}}
I've tried multiple ways to print the value of 'language' with no success. For example, this fails. Any pointers would be appreciated!
print json['detections']['language']

You need json['data']['detections'][0][0]['language']. As your example data shows, 'language' is a key of a dict that is inside a list that is inside another list that inside the 'detections' dict which is inside the 'data' dict.

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Using Bad Json in Python - python

There is module dirtyjson which reads this incorrect JSON. import dirtyjson data_string = """ { "fc1":{ region:"Delhi", marketplace: "IN" }, "fc2" : { region:"Rajasthan", marketplace: "IN" } } """ data = dirtyjson.loads(data_string) print(data) print(data['fc1']) print(data['fc2'])

Related

How to get value off JSON in Robot framework

Examine and tweak a given analyzer?

Python: Pretty print json file with array ID's

How does Python JSON library dealing with time?

JSON parsing using python

Categories

Resources