Can JSON for Python Encode/decode functions from text files?

I know that JSON is good for encoding/decoding complex data to/from text files, but can it also encode/decode functions to/from files?

You can always use eval or exec and store your functions as plain text :
>>> exec(compile("def f(a):\n print(a)", "exec", "exec"))
>>> f(2)
2
Let me stress, however, that executing code from an untrusted source like a JSON file seems like a big red warning flag. It's tricky to get right without risking giving anyone with access to your JSON file access to everything else, and even then there's plenty of better ways to do it.
If you don't require being able to execute arbitrary code, a better way would be the way Celery stores its tasks : store the name of the function and its arguments in JSON. You can then check if the function is allowed to be executed, and pass arguments to it. You're still vulnerable if your functions are badly designed, but much less than with arbitrary code.
First, define the function :
>>> def f(a):
...     print(a)
Store and retrieve a JSON with the name and arguments :
>>> json_data = {"name": "f", "args": ["hello world"]}
Check that we're allowed to execute this :
>>> if json_data["name"] not in ["f"]:
...     raise Exception("forbidden function!")
Retrieve and execute the function :
>>> json_func = globals()[json_data["name"]]
>>> json_func(*json_data["args"])
hello world
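A minimal sketch of the same idea, assuming the JSON arrives as a string and using an explicit registry dict instead of globals(). The names ALLOWED and run_from_json are illustrative, not part of any library, and f also returns its argument here so the result can be inspected:

```python
import json

def f(a):
    print(a)
    return a

# Explicit registry: only functions listed here can ever be called.
ALLOWED = {"f": f}

def run_from_json(text):
    """Parse {"name": ..., "args": [...]} and call the named function if allowed."""
    data = json.loads(text)
    name = data["name"]
    if name not in ALLOWED:
        raise ValueError("forbidden function: %r" % name)
    return ALLOWED[name](*data.get("args", []))

result = run_from_json('{"name": "f", "args": ["hello world"]}')
```

Looking the name up in a dedicated dict means the allow-list and the lookup table are the same object, so a function cannot be reachable without also being permitted.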

Related

Python-How to execute code and store into variable?

So I have been struggling with this issue for what seems like forever now (I'm pretty new to Python). I am using Python 3.7 (need it to be 3.7 due to variations in the versions of packages I am using for the project) to develop an AI chatbot system that can converse with you based on your text input. The program reads the contents of a series of .yml files when it starts. In one of the .yml files I am developing a syntax for when the first 5 characters match a ^###^ pattern, it will instead execute the code and return the result of that execution rather than just output text back to the user. For example:
Normal Conversation:
- - What is AI?
- Artificial Intelligence is the branch of engineering and science devoted to constructing machines that think.
Service/Code-based conversation:
- - Say hello to me
- ^###^print("HELLO")
The idea is that when you ask it to say hello to you, the ^###^print("HELLO") string will be retrieved from the .yml file and the first 5 characters of the response will be removed. The rest of the response will be sent to a separate function in the Python code, which will run the code and return the result, giving the nice, clean result of HELLO to the user. I realize that this may be a bit hard to follow, but I will straighten up my code and condense everything once I have this whole error resolved. As a side note: Oracle is just what I am calling the project. I'm not trying to weave Java into this whole mess.
THE PROBLEM is that it does not store the result of the code being run/executed/evaluated into the variable like it should.
My code:
def executecode(input):
    print("The code to be executed is: ",input)
    #note: the input may occasionally have single quotes and/or double quotes in the input string
    result = eval("{}".format(input))
    print ("The result of the code eval: ", result)
    test = eval("2+2")
    test
    print(test)
    return result
@app.route("/get")
def get_bot_response():
    userText = request.args.get('msg')
    print("Oracle INTERPRETED input: ", userText)
    ChatbotResponse = str(english_bot.get_response(userText))
    print("CHATBOT RESPONSE VARIABLE: ", ChatbotResponse)
    #The interpreted string was a request due to the ^###^ pattern in front of the response in the custom .yml file
    if ChatbotResponse[:5] == '^###^':
        print("---SERVICE REQUEST---")
        print(executecode(ChatbotResponse[5:]))
        interpreter_response = executecode(ChatbotResponse[5:])
        print("Oracle RESPONDED with: ", interpreter_response)
    else:
        print("Oracle RESPONDED with: ", ChatbotResponse)
    return ChatbotResponse
When I run this code, this is the output:
Oracle INTERPRETED input: How much RAM do you have?
CHATBOT RESPONSE VARIABLE: ^###^print("HELLO")
---SERVICE REQUEST---
The code to be executed is: print("HELLO")
HELLO
The result of the code eval: None
4
None
The code to be executed is: print("HELLO")
HELLO
The result of the code eval: None
4
Oracle RESPONDED with: None
Essentially, need it to say HELLO for the "The result of the code eval:" output. This should get it to where the chatbot responds with HELLO in the web interface, which is the end goal here. It seems as if it IS executing the code due to the HELLO's after the "The code to be executed is:" output text. It's just not storing it into a variable like I need it to.
I have tried eval, exec, ast.literal_eval(), converting the input to string with str(), changing up the single and double quotes, putting \ before pairs of quotes, and a few other things. Whenever I get it to where the program interprets "print("HELLO")" when it executes the code, it complains about the syntax. Also, from several days of looking online I have figured out that exec and eval aren't generally favored due to a bunch of issues, however I genuinely do not care about that at the moment because I am trying to make something that works before I make something that is good and works. I have a feeling the problem is something small and stupid like it always is, but I have no idea what it could be. :(
I used these 2 resources as the foundation for the whole chatbot project:
Text Guide
Youtube Guide
Also, I am sorry for the rather lengthy and descriptive question. It's rare that I have to ask a question of my own on stackoverflow because if I have a question, it usually already has a good answer. It feels like I've tried everything at this point. If you have a better suggestion of how to do this whole system or you think I should try approaching this another way, I'm open to ideas.
Thank you for any/all help. It is very much appreciated! :)

The issue is that Python's print() doesn't have a return value, meaning it will always return None. eval simply evaluates an expression and returns the value of that expression. Since print() returns None, an eval of a print call will also return None.
>>> from_print = print('Hello')
Hello
>>> from_eval = eval("print('Hello')")
Hello
>>> from_print is from_eval is None
True
What you need is an I/O stream manager! Here is a possible solution that captures any output and returns it if the expression evaluates to None.
from contextlib import redirect_stdout, redirect_stderr
from io import StringIO
# NOTE: I use the arg name `code` since `input` is a python builtin
def executecodehelper(code):
    # Capture all potential output from the code
    stdout_io = StringIO()
    stderr_io = StringIO()
    with redirect_stdout(stdout_io), redirect_stderr(stderr_io):
        # If `code` is already a string, this should work just fine without the need for formatting.
        result = eval(code)
    return result, stdout_io.getvalue(), stderr_io.getvalue()
def executecode(code):
    result, std_out, std_err = executecodehelper(code)
    if result is None:
        # This code didn't return anything. Maybe it printed something?
        if std_out:
            return std_out.rstrip() # Deal with trailing whitespace
        elif std_err:
            return std_err.rstrip()
        else:
            # Nothing was printed AND the return value is None!
            return None
    else:
        return result
As a final note, this approach is heavily linked to eval, since eval can only evaluate a single expression. If you want to extend your bot to multi-line statements, you will need to use exec, which changes the logic. Here's a great resource detailing the differences between eval and exec: What's the difference between eval, exec, and compile?
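As a rough sketch of what the exec variant might look like: exec always returns None, so only the captured output is useful. The function name execute_block is made up for illustration, and running the snippet in a fresh dict does not make untrusted code safe.

```python
from contextlib import redirect_stdout
from io import StringIO

def execute_block(code):
    """Run a (possibly multi-line) snippet with exec and return what it printed."""
    buffer = StringIO()
    namespace = {}  # separate namespace so the snippet does not touch our globals
    with redirect_stdout(buffer):
        exec(code, namespace)
    return buffer.getvalue().rstrip()

output = execute_block("for i in range(3):\n    print('HELLO', i)")
```

Here output holds the three printed lines joined by newlines, which the bot could return to the user in place of the None that exec itself produces.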

One simple approach is to create a new list and add each updated value of the variable to it. For example, suppose you have a variable named myVar that holds the values (or the questions, it doesn't matter).
1- First, declare a new list in your code as below:
myList = []
2- Whenever you need to answer or display the value held in myVar, append it:
myList.append(myVar)
That covers the case where the values are produced one at a time, as from a generator. If instead the values are already known, change the second step to the following:
myList.append('The first answer of the first question')
myList.append('The second answer of the second question')
All the values will then be stored in your list. You can also do this in other ways; for example, a loop is much better if you have many values or answers.

Replacing strings with variables inside file in Python

I have a bunch of files with many tags inside of the form {my_var}, {some_var}, etc. I am looking to open them, and replace them with my_var and some_var that I've read into Python.
To do these sorts of things I've been using inspect.cleandoc():
import inspect, markdown
my_var='this'
some_var='that'
something=inspect.cleandoc(f'''
All my vars are {some_var} and {my_var}. This is all.
''')
print(something)
#All my vars are that and this. This is all.
But I'd like to do this by reading files file1.md and file2.md
### file1.md
There are some strings such as {my_var} and {some_var}.
Done.
### file2.md
Here there are also some vars: {some_var}, {my_var}. Also done.
Here's the Python code:
import inspect, markdown
my_var='this'
some_var='that'
def filein(file):
    with open(file, 'r') as file:
        data = file.read()
    return data
for filei in ['file1.md','file2.md']:
    fin=filein(filei)
    pre=inspect.cleandoc(f'''{fin}''')
However, the above does not evaluate the strings inside filei and replace them with this (my_var) and that (some_var), and instead keeps them as strings {my_var} and {some_var}.
What am I doing wrong?

You can use the .format method.
You can use ** to pass it a dictionary containing the variable.
Therefore you can use locals() or globals(), which return dictionaries of all the local and global variables respectively.
e.g.
text = text.format(**globals())
Complete code:
my_var="this"
some_var="that"
for file in ["file1.md", "file2.md"]:
    with open(file, "r") as f:
        text = f.read()
    text = text.format(**globals())
    print(text)

f-strings are a static replacement mechanism: they're an intrinsic part of the bytecode, not a general-purpose templating mechanism.
I've no idea what you think inspect.cleandoc does, but it does not do that.
Python generally avoids magic, meaning it really doesn't give a rat's ass about your local variables unless you specifically make it, which is not the case here. Python generally works with explicitly provided dicts (mappings of some term to its replacement).
I guess what you want here is the format/format_map methods, which do apply to format strings using {} e.g.
filein(file).format(my_var=my_var, some_var=some_var)
This can be risky if the files you're reading are under the control of a third party though: str.format allows attribute access and thus ultimately provides tools for arbitrary code execution. In that case, tools like string.Template, old-style string substitution (%) or a proper template engine might be a better idea.
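A small sketch of the string.Template option, assuming the files can be changed to use ${my_var} placeholders instead of {my_var} (Template does not understand bare braces). The sample text is taken from file1.md; the values dict is illustrative:

```python
from string import Template

values = {"my_var": "this", "some_var": "that"}

# Template placeholders are $name or ${name}; no attribute access is possible.
text = "There are some strings such as ${my_var} and ${some_var}."
rendered = Template(text).safe_substitute(values)

# safe_substitute leaves unknown placeholders untouched instead of raising.
partial = Template("${my_var} and ${unknown}").safe_substitute(values)
```

Using safe_substitute rather than substitute means a file with an unexpected placeholder is passed through instead of raising KeyError; which behavior is preferable depends on the use case.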

Representation of python dictionaries with unicode in database queries

I have a problem that I would like to know how to efficiently tackle.
I have data that is JSON-formatted (used with dumps / loads) and contains unicode.
This is part of a protocol implemented with JSON to send messages. So messages will be sent as strings and then loaded into python dictionaries. This means that the representation, as a python dictionary, afterwards will look something like:
{u"mykey": u"myVal"}
It is no problem in itself for the system to handle such structures, but the thing happens when I'm going to make a database query to store this structure.
I'm using pyOrient towards OrientDB. The command ends up something like:
"CREATE VERTEX TestVertex SET data = {u'mykey': u'myVal'}"
Which will end up in the data field getting the following values in OrientDB:
{'_NOT_PARSED_': '_NOT_PARSED_'}
I'm assuming this problem relates to other cases as well when you wish to make a query or somehow represent a data object containing unicode.
How could I efficiently get a representation of this data, of arbitrary depth, to be able to use it in a query?
To clarify even more, this is the string the db expects:
"CREATE VERTEX TestVertex SET data = {'mykey': 'myVal'}"
If I'm simply stating the wrong problem/question and should handle it some other way, I'm very much open to suggestions. But what I want to achieve is to have an efficient way to use python2.7 to build a db-query towards orientdb (using pyorient) that specifies an arbitrary data structure. The data property being set is of the OrientDB type EMBEDDEDMAP.
Any help greatly appreciated.
EDIT1:
More explicitly stating that the first code block shows the object as a dict AFTER being dumped / loaded with json to avoid confusion.

Dargolith:
OK, based on your last response it seems you are simply looking for code that will dump a Python expression in a way that lets you control how unicode and other data types print. Here is a very simple function that provides this control. There are ways to make this function more efficient (for example, by using a string buffer rather than doing all of the recursive string concatenation happening here). Still, this is a very simple function, and as it stands its execution time is probably dominated by your DB lookup.
As you can see in each of the 'if' statements, you have full control of how each data type prints.
def expr_to_str(thing):
    if hasattr(thing, 'keys'):
        pairs = ['%s:%s' % (expr_to_str(k),expr_to_str(v)) for k,v in thing.iteritems()]
        return '{%s}' % ', '.join(pairs)
    if hasattr(thing, '__setslice__'):
        parts = [expr_to_str(ele) for ele in thing]
        return '[%s]' % (', '.join(parts),)
    if isinstance(thing, basestring):
        return "'%s'" % (str(thing),)
    return str(thing)
print "dumped: %s" % expr_to_str({'one': 33, 'two': [u'unicode', 'just a str', 44.44, {'hash': 'here'}]})
outputs:
dumped: {'two':['unicode', 'just a str', 44.44, {'hash':'here'}], 'one':33}

I went on to use json.dumps() as sobolevn suggested in the comment. I didn't think of that one at first since I wasn't really using json in the driver. It turned out however that json.dumps() provided exactly the formats I needed on all the data types I use. Some examples:
>>> json.dumps('test')
'"test"'
>>> json.dumps(['test1', 'test2'])
'["test1", "test2"]'
>>> json.dumps([u'test1', u'test2'])
'["test1", "test2"]'
>>> json.dumps({u'key1': u'val1', u'key2': [u'val21', 'val22', 1]})
'{"key2": ["val21", "val22", 1], "key1": "val1"}'
If you need to take more control of the format, quotes or other things regarding this conversion, see the reply by Dan Oblinger.
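For illustration, here is roughly how the query string could be assembled with json.dumps. The helper name build_create_vertex is hypothetical, TestVertex comes from the question, and this sketch does nothing to guard against injection through the data itself:

```python
import json

def build_create_vertex(data):
    """Build the CREATE VERTEX command with the dict serialized as JSON."""
    # json.dumps drops the u'' prefixes and handles nesting of arbitrary depth.
    return "CREATE VERTEX TestVertex SET data = " + json.dumps(data)

command = build_create_vertex({u"mykey": u"myVal"})
```

The resulting command uses double quotes around keys and values rather than the single quotes shown in the question, which per the answer above was accepted by the database.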

Python parsing json data

I have a json object saved inside test_data and I need to know if the string inside test_data['sign_in_info']['package_type'] contains the string "vacation_package" in it. I assumed that in could help but I'm not sure how to use it properly or if it's correct to use it. This is an example of the json object:
"checkout_details": {
    "file_name" : "pnc04",
    "test_directory" : "test_pnc04_package_today3_signedout_noinsurance_cc",
    "scope": "wdw",
    "number_of_adults": "2",
    "number_of_children": "0",
    "sign_in_info": {
        "should_login": false,
        "package_type": "vacation_package"
    },
package type has "vacation_package" in it, but it's not always this way.
For now I'm only saving the data this way:
package_type = test_data['sign_in_info']['package_type']
Now, is it ok to do something like:
p= "vacation_package"
if(p in package_type):
    ....
Or do I have to use 're' to cut the string and find it that way?

Your answer depends on what exactly you expect to get from test_data['sign_in_info']['package_type']. Will 'vacation_package' always be by itself? Then in (or ==) is fine. Could it be part of a larger string? Then in still works, since on strings it is a substring test. You only need re.search if you want to match a pattern rather than a fixed string (though it is a good opportunity to practice regular expressions).

No need to use re, assuming you are using the json package. Yes, it's okay to do that, but are you trying to see if there is a "package type" listed, or if the package type contains vacation_package, possibly among other things? If not, this might be closer to what you want, as it checks for exact matches:
import json
data = json.load(open('file.json'))
if data['sign_in_info'].get('package_type') == "vacation_package":
    pass # do something
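To make the difference between the checks concrete, here is a short comparison on a made-up value (the "_deluxe" suffix and the regular expression are assumptions for illustration only):

```python
import re

package_type = "vacation_package_deluxe"  # hypothetical value

contains = "vacation_package" in package_type                    # substring test
exact = package_type == "vacation_package"                       # exact match
pattern = re.search(r"vacation_\w+", package_type) is not None   # pattern match
```

With this value, contains and pattern are True while exact is False, so the choice depends on whether the field may carry extra text around "vacation_package".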

Pythonic way to import multiple dictionaries from text file

So I have a text file,
question_one = {question:"what is 2+2", answer: "4", fake1: "5"}
question_two = {question:"what is the meaning of life?", answer:"pizza", fake:"42"}
How can I then import these dictionaries so that I could use them like this,
print(question_one["question"])
print(question_two["question"])
So the outcome would be
what is 2+2
what is the meaning of life?
I would like this so that I can add questions to the text file from within the program and save them as I add more. If this is possible another way, please let me know!

The simplest way would be to store your questions in a JSON file, as @Thom Wiggers is suggesting.
Here's an example:
[
    {
        "question": "what is 2+2",
        "answer": "4",
        "fake1": "5"
    },
    {
        "question": "what is the meaning of life?",
        "answer": "pizza",
        "fake1": "42"
    }
]
import json
with open('questions.json') as f:
    questions = json.load(f)
for question in questions:
    print(question['question'])
You can read more about the JSON module in the official documentation.
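Since the question also asks about adding questions from within the program, here is a sketch of the load, append, save round trip. It writes to a temporary directory so it can run anywhere; in practice path would point at your own questions.json:

```python
import json
import os
import tempfile

# Assumption: a throwaway location for the demo file.
path = os.path.join(tempfile.mkdtemp(), "questions.json")

# Create the initial file with one question.
with open(path, "w") as f:
    json.dump([{"question": "what is 2+2", "answer": "4", "fake1": "5"}], f, indent=4)

# Load the existing questions, add one, and write the whole list back.
with open(path) as f:
    questions = json.load(f)
questions.append({"question": "what is the meaning of life?",
                  "answer": "pizza", "fake1": "42"})
with open(path, "w") as f:
    json.dump(questions, f, indent=4)

with open(path) as f:
    reloaded = json.load(f)
```

Rewriting the whole file on each save is the simplest option and is fine for a small question bank.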

If you only want to serialize data, you want to use pickle or json. exec will execute all Python code, and can be a serious security problem.
pickle is faster and is specifically tailored to Python, while json can be read and written by just about any programming language, and is still fairly human-readable and human-editable.
Now, to answer the question as you asked it (you probably don't want to do this):
You can use exec()
This function supports dynamic execution of Python code. object must
be either a string or a code object. If it is a string, the string is
parsed as a suite of Python statements which is then executed (unless
a syntax error occurs).
ie.
exec(open('data.txt', 'r').read())
Another way to do it would be to (ab)use import, assuming your file is named data.py:
import data
data.question_one['question']
This is obviously not what import was intended for... I've 'used' import like this in the past, and regretted it (there are a number of caveats, I'll leave it as an exercise to the reader to think about what they might be).
Warning: Both are eval-like constructs and should be used with care. Any Python code in the file will be executed, which is potentially dangerous. Be very sure you trust the source of whatever you pass to exec() (or import), and don't use either if you only want to serialize data rather than run Python code.
