I want to store some sequence information in JSON. For example, I want to store a variable that can take the following values:
some_random_string_2
some_random_string_3
some_random_string_4
...
To do so, I have tried using the following format:
json_obj = {
    "k1": {
        "nk1": "some_random_string_{$1}",
        "patterns": {
            "p1": {
                "pattern": "[2-9]|[1-9]\\d+",
                "symbol_type": "int",
                "start_symbol": 2,
                "step": 1
            }
        }
    }
}
The JSON above contains a regex pattern for the variable part of the string, its type, the start symbol, and the step. But it seems unnecessarily complicated and difficult to generate a sequence from.
Is there a simpler way to store this sequence information so that it's easier to generate the sequence while parsing?
Currently, I don't have an exhaustive list of patterns, so we'll have to assume that it can be anything that can be written as a regular expression. On a side note, I'll be using Python to parse this JSON and generate the sequence.
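For the integer case shown above, the start symbol and step already determine the sequence, so one possible simplification is to store just a template, a start, and a step. The sketch below assumes a hypothetical flattened layout (the keys "template", "start", and "step" and the helper generate are made up for illustration), and it does not cover arbitrary regular expressions:

```python
import itertools
import json

# Hypothetical, flattened description of the sequence: a format template
# plus the first value and the increment.
spec_text = '{"template": "some_random_string_{n}", "start": 2, "step": 1}'


def generate(spec, limit):
    """Yield the first `limit` strings described by `spec`."""
    counter = itertools.count(spec["start"], spec["step"])
    for n in itertools.islice(counter, limit):
        yield spec["template"].format(n=n)


spec = json.loads(spec_text)
first_three = list(generate(spec, 3))
print(first_three)
```

Because the generator is lazy, the caller decides how many values to take, and the JSON itself stays a plain description of the sequence.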
This is not so much an error I am having; rather, I would like to know the reasoning behind the following.
For example, a tutorial page has:
import json

json_string = """
{
    "researcher": {
        "name": "Ford Prefect",
        "species": "Betelgeusian",
        "relatives": [
            {
                "name": "Zaphod Beeblebrox",
                "species": "Betelgeusian"
            }
        ]
    }
}
"""
data = json.loads(json_string)
Which is OK, but my question is: why go to the bother of putting the JSON in a string and then calling json.loads, when the same thing can be obtained with
otro = {
    "researcher": {
        "name": "Ford Prefect",
        "species": "Betelgeusian",
        "relatives": [
            {
                "name": "Zaphod Beeblebrox",
                "species": "Betelgeusian"
            }
        ]
    }
}
print(type(otro))
print(otro)
print(otro == data)  # True
Because your second example is not JSON at all; it is Python. The two have superficial similarities, but mixing them up will only confuse you.
For example, the values None, True, and False are valid in Python but not in JSON, where they are represented by null, true, and false, respectively. Another difference is in how Unicode characters are represented. There are also many Python constructs which cannot be represented in JSON at all.
Which to use in practice depends on your use case. If you are exercising or testing code which needs to work on actual JSON input, pass it JSON, not something else. The example you are citing is trying to demonstrate how to use JSON functions from Python, and the example data is embedded in a string just to make the example self-contained; in reality you would probably be receiving the data from a file or a network API.
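To make the difference between a Python literal and JSON text concrete, here is a small sketch using only the standard json module. The sample values are made up for illustration:

```python
import json

# A Python dict using Python-only spellings and a non-ASCII character.
python_value = {"flag": True, "missing": None, "name": "caf\u00e9"}

# Serializing produces JSON text with true/null and, by default,
# an escaped form of the non-ASCII character.
json_text = json.dumps(python_value)
print(json_text)

# Parsing the JSON text gives back an equal Python object.
round_trip = json.loads(json_text)

# A tuple has no JSON counterpart and comes back as a list.
as_list = json.loads(json.dumps({"pair": (1, 2)}))

# A set cannot be serialized at all by the default encoder.
try:
    json.dumps({"tags": {1, 2}})
    set_ok = True
except TypeError:
    set_ok = False
```

The JSON text is a plain string that any language can parse, whereas the dict literal only means something to the Python interpreter.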
I'm totally new to Python and can't find a way to create a data structure to hold the string below.
For example, below is a sample string; I can have multiple strings like this (as records) in a file. The number of main variables is fixed, but the number of elements inside a value, for example in types, can vary.
{"name": "pim pom",
"types": "amy \n klim\nshining rock(ABC)\nflying\nchanning",
"url": "http://doingrock.com",
"image": "http://static.doingrock.com/rockisland.jpg",
"pullTime": "PT3AM",
"rockHeight": "8",
"dateLive": "2010-10-14",
"hitTime": "PT8PM",
"desc": "Amazing view"}
What's a better way to create a data structure in Python to access the elements in this string?
Please suggest.
Storing the record as a Python dictionary is absolutely fine. You can use a list instead of a string for "types" inside the dict:
"types": ["amy", "klim", "shining rock(ABC)", "flying", "channing"]
I have a bunch of JSON files, and suppose each have the following structure:
{
"fields": {
"name": "Bob",
"key": "bob"
},
"results": {
"bob": { ... }
}
}
For some unfortunate reason, while the structure of the JSON is fairly consistent, there is one dynamic key under "results". Defining the schema for what is under "fields" is fairly straightforward to me.
So, for several JSON files, the final schema might be:
fieldSchema = StructField(...)
resultSchema = StructField("results", StructType([StructField("bob", ...)]))
finalSchema = StructType([fieldSchema, resultSchema])
Where the problem is this line: StructField("bob", ...)
Obviously, bob is not the key I'm looking for. This name for the StructField would ideally be some kind of wildcard character, regex pattern, or worst case, some dynamic field based on other fields.
I'm a newbie to Spark and have been scouring the documentation and historical StackOverflow posts, but I've been unable to find anything.
Long story short, I want to be able to pass some kind of wide net for the name parameter in StructField to encompass a variety of different keys, similar to a regex pattern.
Is it possible to check whether a particular element exists in JSON without iterating through it? For example, in the following JSON data, I want to check whether an appid with the value 4000 exists. I need to process hundreds of similar JSON data sets, so it needs to be quick and efficient.
{
"response": {
"game_count": 62,
"games": [
{
"appid": 10,
"playtime_forever": 15
},
{
"appid": 20,
"playtime_forever": 0
},
...
{
"appid": 4000,
"playtime_2weeks": 104,
"playtime_forever": 21190
}
]
}
}
The relevant object is contained in an array, so no, it is not possible to find it without iteration. The data could be massaged to use the appid as the key for an object that contains the games as the value, but that requires additional preprocessing.
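As a rough sketch of that preprocessing step, the array can be turned into a dict keyed by appid once, after which each membership check is a constant-time lookup. The sample data is an abridged version of the response in the question, and the name games_by_appid is made up for illustration:

```python
import json

# Abridged sample of the response shown in the question.
response_text = """
{"response": {"game_count": 3, "games": [
  {"appid": 10, "playtime_forever": 15},
  {"appid": 20, "playtime_forever": 0},
  {"appid": 4000, "playtime_2weeks": 104, "playtime_forever": 21190}
]}}
"""

data = json.loads(response_text)

# One pass over the array builds the index; later lookups do not iterate.
games_by_appid = {game["appid"]: game for game in data["response"]["games"]}

has_4000 = 4000 in games_by_appid
has_30 = 30 in games_by_appid
print(has_4000, has_30)
```

Building the index still iterates once, so it pays off mainly when the same data set is queried several times.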
However, it could be possible to craft a parser such that the appropriate data can be extracted immediately upon parsing. This would piggyback the iteration within the parser itself instead of being explicit code after the fact. See the object_hook argument of the parser.
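A minimal sketch of the object_hook approach follows. json.loads calls the hook for every JSON object it decodes, so the appid values can be collected during parsing. The helper make_collector and the abridged sample data are assumptions made for illustration:

```python
import json

# Abridged sample of the response shown in the question.
response_text = """
{"response": {"game_count": 3, "games": [
  {"appid": 10, "playtime_forever": 15},
  {"appid": 20, "playtime_forever": 0},
  {"appid": 4000, "playtime_2weeks": 104, "playtime_forever": 21190}
]}}
"""


def make_collector(found_ids):
    """Return a hook that records every appid seen while parsing."""
    def hook(obj):
        if "appid" in obj:
            found_ids.add(obj["appid"])
        return obj  # keep the decoded object unchanged
    return hook


seen_appids = set()
parsed = json.loads(response_text, object_hook=make_collector(seen_appids))

print(4000 in seen_appids)
```

The parser still visits every object, so this does not avoid the iteration; it only folds it into the parsing pass.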
If I have JSON data stored in a string called 'data' (e.g. the example below) how do I access specific information (such as messages->unread or pokes->most_recent)?
{
"messages": {
"unread": 0,
"most_recent": 1300047276
},
"pokes": {
"unread": 0,
"most_recent": 0
},
"shares": {
"unread": 0,
"most_recent": 0
},
"friend_requests": [],
"group_invites": [],
"event_invites": []
}
I'd like something like data['messages']['unread'] to work - but of course it won't when my data is stored as a string!
A JSON parser has been bundled with Python since 2.6: the json module. To deserialize a string, use json.loads, e.g.
import json
data = json.loads(...)
You can also load directly from a file-like object with json.load.
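Applied to the data from the question, that could look like the following sketch. The JSON is abridged, and io.StringIO merely stands in for a real file object:

```python
import io
import json

# Abridged version of the JSON string from the question.
data = (
    '{"messages": {"unread": 0, "most_recent": 1300047276},'
    ' "pokes": {"unread": 0, "most_recent": 0},'
    ' "friend_requests": []}'
)

# json.loads turns the string into nested dicts and lists.
parsed = json.loads(data)
unread_messages = parsed["messages"]["unread"]
latest_message = parsed["messages"]["most_recent"]

# json.load does the same for any file-like object.
parsed_from_file = json.load(io.StringIO(data))

print(unread_messages, latest_message)
```

Once parsed, lookups such as parsed["pokes"]["most_recent"] work as ordinary dictionary access.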
You run the string through a JSON parser to turn it into a suitable data structure for whatever language you are using (dicts, lists, strings, etc. in Python). A number of parsers for a variety of languages are listed near the bottom of http://json.org/.