So I have a text file,
question_one = {question:"what is 2+2", answer: "4", fake1: "5"}
question_two = {question:"what is the meaning of life?", answer:"pizza", fake:"42"}
How can I then import these dictionaries so that I could use them like this,
print(question_one["question"])
print(question_two["question"])
So the out come would be
what is 2+2
what is the meaning of life?
I would like this so that I can add questions to a text file from within the program and then save them should I add more, If this is possible another way please let me know!
The simplest way would be to store your questions into a JSON file, like #Thom Wiggers is suggesting.
Here's an example:
[
{
"question": "what is 2+2",
"answer": "4",
"fake1": "5"
},
{
"question": "what is the meaning of life?",
"answer": "pizza",
"fake1": "42"
}
]
import json
with open('questions.json') as f:
questions = json.load(f)
for question in questions:
print(question['question'])
You can read more about the JSON module in the official documentation.
If you only want to serialize data, you want to use pickle or json. exec will execute all Python code, and can be a serious security problem.
pickle is faster, and is specificity tailored to Python, while json can be read & written work by just about any programming language, and is still fairly human-readable & human-editable.
Now, to answer the question as you asked it (you probably don't want to do this):
You can use exec()
This function supports dynamic execution of Python code. object must
be either a string or a code object. If it is a string, the string is
parsed as a suite of Python statements which is then executed (unless
a syntax error occurs).
ie.
exec(open('data.txt', 'r').read())
Another way to do is would be to (ab)use import, assuming your file is named data.py:
import data
data.question_one['question']
This is obviously not what import was intended for... I've 'used' import like this in the past, and regretted it (there are a number of caveats, I'll leave it as an exercise to the reader to think about what they might be).
Warning Both are eval-like statements, and should be used with care, any Python code in data.txt will be executed, which may be potentially dangerous. Be very sure you trust the source of whatever you pass to exec(), and don't use if you only want to serialize data (instead of running Python code as such).
Related
I know that JSON is good for encoding/decoding complex data to/from text files, but can it also encode/decode functions to/from files?
You can always use eval or exec and store your functions as plain text :
>>> exec(compile("def f(a):\n print(a)", "exec", "exec"))
>>> f(2)
2
Let me stress, however, that executing code from an untrusted source like a JSON file seems like a big red warning flag. It's tricky to get right without risking giving anyone with access to your JSON file access to everything else, and even then there's plenty of better ways to do it.
If you don't require being able to execute arbitrary code, a better way would be the way Celery stores its tasks : store the name of the function and its arguments in JSON. You can then check if the function is allowed to be executed, and pass arguments to it. You're still vulnerable if your functions are badly designed, but much less than with arbitrary code.
First, define the function :
>>> def f(a):
... print(a)
Store and retrieve a JSON with the name and arguments :
>>> json_data = {"name": "f", "args": ["hello world"]}
Check that we're allowed to execute this :
>>> if json_data["name"] not in ["f"]:
... raise Exception("forbidden function!")
Retrieve and execute the function :
>>> json_func = globals(json_data["name"])
>>> json_func(*json_data["args"])
hello world
Some background first: I have a few rather simple data structures which are persisted as json files on disk. These json files are shared between applications of different languages and different environments (like web frontend and data manipulation tools).
For each of the files I want to create a Python "POPO" (Plain Old Python Object), and a corresponding data mapper class for each item should implement some simple CRUD like behavior (e.g. save will serialize the class and store as json file on disk).
I think a simple mapper (which only knows about basic types) will work. However, I'm concerned about security. Some of the json files will be generated by a web frontend, so a possible security risk if a user feeds me some bad json.
Finally, here is the simple mapping code (found at How to convert JSON data into a Python object):
class User(object):
def __init__(self, name, username):
self.name = name
self.username = username
import json
j = json.loads(your_json)
u = User(**j)
What possible security issues do you see?
NB: I'm new to Python.
Edit: Thanks all for your comments. I've found out that I have one json where I have 2 arrays, each having a map. Unfortunately this starts to look like it gets cumbersome when I get more of these.
I'm extending the question to mapping a json input to a recordtype. The original code is from here: https://stackoverflow.com/a/15882054/1708349.
Since I need mutable objects, I'd change it to use a namedlist instead of a namedtuple:
import json
from namedlist import namedlist
data = '{"name": "John Smith", "hometown": {"name": "New York", "id": 123}}'
# Parse JSON into an object with attributes corresponding to dict keys.
x = json.loads(data, object_hook=lambda d: namedlist('X', d.keys())(*d.values()))
print x.name, x.hometown.name, x.hometown.id
Is it still safe?
There's not much wrong that can happen in the first case. You're limiting what arguments can be provided and it's easy to add validation/conversion right after loading from JSON.
The second example is a bit worse. Packing things into records like this will not help you in any way. You don't inherit any methods, because each type you define is new. You can't compare values easily, because dicts are not ordered. You don't know if you have all arguments handled, or if there is some extra data, which can lead to hidden problems later.
So in summary: with User(**data), you're pretty safe. With namedlist there's space for ambiguity and you don't really gain anything. (compared to bare, parsed json)
If you blindly accept users json input without sanity check, you are at risk of become json injection victim.
See detail explanation of json injection attack here: https://www.acunetix.com/blog/web-security-zone/what-are-json-injections/
Besides security vulnerability, parse JSON to Python object this way is not type safe.
With your example of User class, I would assume you expect both fields name and username to be string type. What if the json input is like this:
{
"name": "my name",
"username": 1
}
j = json.loads(your_json)
u = User(**j)
type(u.username) # int
You have gotten an object with unexpected type.
One solution to make sure type safe is to use json schema to validate input json. more about json schema: https://json-schema.org/
I have a json object saved inside test_data and I need to know if the string inside test_data['sign_in_info']['package_type'] contains the string "vacation_package" in it. I assumed that in could help but I'm not sure how to use it properly or if it´s correct to use it. This is an example of the json object:
"checkout_details": {
"file_name" : "pnc04",
"test_directory" : "test_pnc04_package_today3_signedout_noinsurance_cc",
"scope": "wdw",
"number_of_adults": "2",
"number_of_children": "0",
"sign_in_info": {
"should_login": false,
**"package_type": "vacation_package"**
},
package type has "vacation_package" in it, but it's not always this way.
For now I´m only saving the data this way:
package_type = test_data['sign_in_info']['package_type']
Now, is it ok to do something like:
p= "vacation_package"
if(p in package_type):
....
Or do I have to use 're' to cut the string and find it that way?
You answer depends on what exactly you expect to get from test_data['sign_in_info']['package_type']. Will 'vacation_package' always be by itself? Then in is fine. Could it be part of a larger string? Then you need to use re.search. It might be safer just to use re.search (and a good opportunity to practice regular expressions).
No need to use re, assuming you are using the json package. Yes, it's okay to do that, but are you trying to see if there is a "package type" listed, or if the package type contains vacation_package, possibly among other things? If not, this might be closer to what you want, as it checks for exact matches:
import json
data = json.load(open('file.json'))
if data['sign_in_info'].get('package_type') == "vacation_package":
pass # do something
This is the way reading from a .json file on ubuntu terminal:
python -c "import json;print json.loads(open('json_file.json', 'r').read())['foo']['bar']"
What I'd like to do is altering the JSON file, adding new objects and arrays. So how to do this in python?
json_file.json:
{
"data1" :
[
{
"unit" : "Unit_1",
"value" : "20"
},
{
"unit" : "Unit_2",
"value" : "10"
}
]
}
First of all, create a new python file.
import json
data = json.loads(open('json_file.json', 'r').read())
The data is then just a bunch of nested dictionaries and lists.
You can modify it the same way you would modify any python dictionary and list; it shouldn't be hard to find a resource on this as it is one of the most basic python functionalities. You can find a complete reference at the official python documentation, and if you are familiar with arrays/lists and associative arrays/hashes in any language, this should be enough to get you going. If it's not, you can probably find a tutorial and if that doesn't help, if you are able to create a well-formed specific question then you could ask it here.
once you are done, you can put everything back into json:
print json.dumps(data)
For more information on how to customize the output, and about the json module overall, see the documentation.
I have data coming in to a python server via a socket. Within this data is the string '<port>80</port>' or which ever port is being used.
I wish to extract the port number into a variable. The data coming in is not XML, I just used the tag approach to identifying data for future XML use if needed. I do not wish to use an XML python library, but simply use something like regexp and strings.
What would you recommend is the best way to match and strip this data?
I am currently using this code with no luck:
p = re.compile('<port>\w</port>')
m = p.search(data)
print m
Thank you :)
Regex can't parse XML and shouldn't be used to parse fake XML. You should do one of
Use a serialization method that is nicer to work with to start with, such as JSON or an ini file with the ConfigParser module.
Really use XML and not something that just sort of looks like XML and really parse it with something like lxml.etree.
Just store the number in a file if this is the entirety of your configuration. This solution isn't really easier than just using JSON or something, but it's better than the current one.
Implementing a bad solution now for future needs that you have no way of defining or accurately predicting is always a bad approach. You will be kept busy enough trying to write and maintain software now that there is no good reason to try to satisfy unknown future needs. I have never seen a case where "I'll put this in for later" has led to less headache later on, especially when I put it in by doing something completely wrong. YAGNI!
As to what's wrong with your snippet other than using an entirely wrong approach, angled brackets have a meaning in regex.
Though Mike Graham is correct, using regex for xml is not 'recommended', the following will work:
(I have defined searchType as 'd' for numerals)
searchStr = 'port'
if searchType == 'd':
retPattern = '(<%s>)(\d+)(</%s>)'
else:
retPattern = '(<%s>)(.+?)(</%s>)'
searchPattern = re.compile(retPattern % (searchStr, searchStr))
found = searchPattern.search(searchStr)
retVal = found.group(2)
(note the complete lack of error checking, that is left as an exercise for the user)