how to create data structure of dynamic string? - python

I'm totally new in python, not able to find any way where I could create data structure to hold below string.
for example, below is a sample string, I can have these kind of multiple strings (as record) in a file. main variables will be static in count but elements for example in types can vary.
{"name": "pim pom",
"types": "amy \n klim\nshining rock(ABC)\nflying\nchanning",
"url": "http://doingrock.com",
"image": "http://static.doingrock.com/rockisland.jpg",
"pullTime": "PT3AM",
"rockHeight": "8",
"dateLive": "2010-10-14",
"hitTime": "PT8PM",
"desc": "Amazing view"}
what's a better way of create data structure in python to access elements in this string ?
please suggest

The way you stored the strings using python dictionary is absolutely fine. You can use list instead of string for "types", inside the dict:
"types": ["amy", "klim", "shining rock(ABC)", "flying", "channing"]

Related

Validate object values against yaml configuration

I have an application where a nested Python dictionary is created based on a JSON document that I get as a response from an API. Example:
colleagues = [
{ "name": "John",
"skills": ["python", "java", "scala"],
"job": "developer"
},
{ "name": "George",
"skills": ["c", "go", "nodejs"],
"job": "developer"
}]
This dictionary can have many more nested levels.
What I want to do is let the user define their own arbitrary conditions (e.g. in order to find colleagues that have "python" among their skills, or whose name is "John") in a YAML configuration file, which I will use to check against the Python dictionary.
I thought about letting them configure that in the following manner in the YAML file, but this would require using exec(), which I want to avoid for security reasons:
constraints:
- "python" in colleagues[x]["skills"]
- colleagues[x]["name"] == "John"
What other options are there for such a problem, so that the user can specify their own constraints for the dictionary values? Again, the dictionary above is just an example. The actual one is much larger in size and nesting levels.
You could use a Lucene query parser to convert queries like "skill:python" and "name:John" to executable predicate functions, and then filter your list of colleagues using those predicates. Googling for "python lucene parser" will turn up several parsing options.

Storing sequence Information in JSON

I want to store some sequence information in a JSON. For example, I want to store a variable value which can have following values:
some_random_string_2
some_random_string_3
some_random_string_4
...
To do so, I have tried using the following format:
json_obj = {
"k1": {
"nk1": "some_random_string_{$1}"
"patterns": {
"p1": {
"pattern": "[2-9]|[1-9]\d+",
"symbol_type": "int",
"start_symbol": 2,
"step": 1
}
}
}
}
Above json contains regex pattern for variable string, its type, start symbol and step. But, it seems unnecessarily complicated and difficult to generate sequence from.
Is there some simpler way to store this sequence information so that its easier to generate the sequence while parsing?
Currently, I don't have exhaustive patterns list, so we'll have to assume, that it can be anything that can be written as a regular exp. On a side note, I'll be using python to parse this json and generate a sequence.

Parsing Json data sent by android clinet using "post" request in django view

I am trying to loop around a json data sent by my android client. I used the code below but its not working for me. Any possible error that I am doing...????
def api_json(request):
try:
x101=json.loads(request.body)
print x101
for data in x101:
print data+"xp"
asset_code=data['asset_code']
credential=data['credential']
d1=data['d1']
d2=data['d2']
d3=data['d3']
angle=data['angle']
status=data['status']
operator=data['operator']
location=data['location']
print asset_code,credential,d1,d2,d3,angle,status,operator,location
v=Verification(asset_code=asset_code,
scan_time=datetime.datetime.now(),
credential=credential,
d1=d1,
d2=d2,
d3=d3,
angle=angle,
status=status,
operator=operator,
location=location,
image='')
v.save()
except:
print 'nope'
return HttpResponse('success')
error trace:
TypeError: string indices must be integers
Assuming your JSON decodes to a dictionary, for data in x101 iterates through the keys of that dictionary. So data['d1'] will give the TypeError that you see, "string indices must be integers".
Since you have given absolutely no details about what the data structure actually looks like, we can only guess, but you perhaps want to iterate through the dict's values with for data in x101.values().
In any case, you should definitely remove that try/except that does nothing except print "nope". Errors are there for a reason, and silencing them will only prevent you from debugging properly, as we see here.
Edit
x101 is just a single dict. You say that there will frequently be more than one dict, but it can't possibly work like that: the only way to have multiple dicts is to have them inside a list (ie a JSON array). And if so, they would have to always be in a list, even when there is just one. So your structure should be:
[
{
"angle": "10",
"asset_code": "XPS1020",
"credential": "wqw2323ds2",
"d1": "1",
"d2": "2",
"d3": "3",
"location": "Bangalore",
"operator": "pradeep",
"status": "1"
}
]
and then your code will work as is, whether there is a single dict or many.

What is the best way to search millions of JSON files?

I've very recently picked up programming in Python and am working on creating a database.
I've already worked out extracting all these files from their source so they are all in a directory on my computer.
All of these files are structured the same way and what I want to do is search these multidimensional dictionaries and locate the value for a specific set of keys.
These json files are all structured similarly,
{
"userid": 34535367,
"result": {
"list": [
{
"name": 264,
"age": 64,
"id": 456345345
},
{
"name": 263,
"age": 42,
"id": 364563463456
}
]
}
}
In my case, I would like to search for the "name" key and return the relevant data(quality, id and the original userid) for the thousands of names just like it from my millions of JSON files.
Basically I'm very new at this and the little programming knowledge I have is in Python. I'm happy to start learning whatever I need to, but I'm not sure which direction to go.
If your goal is to create a database, then you should look on how databases work and solve the same problem you are trying to solve right now :)
NoSQL databases (like mangodb) work also with json documents and implements most likely a whole set of tools to search and filter documents.
Now to answer your question, there is no quick way to do so unless you do some preprocessing, meaning that you store different information about the data (called metadata).
This is a huge subject and I don't have enough expertise to give you all the answers, but I can give you a simple tip: Use indexes.
An index is a sorted key/value map where for every value, we store the documents that contains that value (or the file + position of the Json document) . For example an index for the name property would like this:
{
263: ('jsonfile10.json', '0')
264: ('jsonfile10.json', '30'),
# The json document can be found on the jsonfile10.json file on line 30
}
By keeping an index for the most queried values, you can turn a linear time search into a logarithmic time search not to mention that inserting a new document is much faster. in your case, you seems to only need an index on the name field.
Creating/updating the index is done when you insert, update or remove a document. Using a balanced binary tree can accelerate the updates on the index.
As a suggestion, why don't you just process all the incoming files and insert the data into a database? You will have a toolset to query that database. SQLite for example will do (as well as any other more sophisticated database):
http://www.sqlite.org/
http://docs.python.org/2/library/sqlite3.html
Simple other solution might be to build a file mapping name_id to /file/path. Then you can logarithmically do a binary search by the name id. But I'd still advise using a proper database as maintaining the index will be more cumbersome than doing some inserts/deletes.

Altering JSON array using python

This is the way reading from a .json file on ubuntu terminal:
python -c "import json;print json.loads(open('json_file.json', 'r').read())['foo']['bar']"
What I'd like to do is altering the JSON file, adding new objects and arrays. So how to do this in python?
json_file.json:
{
"data1" :
[
{
"unit" : "Unit_1",
"value" : "20"
},
{
"unit" : "Unit_2",
"value" : "10"
}
]
}
First of all, create a new python file.
import json
data = json.loads(open('json_file.json', 'r').read())
The data is then just a bunch of nested dictionaries and lists.
You can modify it the same way you would modify any python dictionary and list; it shouldn't be hard to find a resource on this as it is one of the most basic python functionalities. You can find a complete reference at the official python documentation, and if you are familiar with arrays/lists and associative arrays/hashes in any language, this should be enough to get you going. If it's not, you can probably find a tutorial and if that doesn't help, if you are able to create a well-formed specific question then you could ask it here.
once you are done, you can put everything back into json:
print json.dumps(data)
For more information on how to customize the output, and about the json module overall, see the documentation.

Categories

Resources