What is the equivalent of Python's ast.literal_eval() in Julia? - python

Is there anything in Julia which is equivalent to Python's literal_eval provided by the package ast (Abstract Syntax Tree)?
A summary of its (literal_eval) description:
This function only evaluates Python literal structures: strings,
bytes, numbers, tuples, lists, dicts, sets, booleans, and None, and
can be used for safely evaluating strings from untrusted sources
without the need to parse the values oneself. It is not capable of
evaluating arbitrarily complex expressions, for example involving
operators or indexing.
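For reference, a minimal sketch of the Python behaviour being asked about. The helper name is_rejected is made up for this illustration; only ast.literal_eval itself comes from the standard library.

```python
import ast

# Literal structures are parsed and returned as ordinary Python objects.
parsed = ast.literal_eval("{'a': [1, 2.5, (3, None)], 'b': True}")


def is_rejected(source):
    """Return True if ast.literal_eval refuses to evaluate `source`."""
    try:
        ast.literal_eval(source)
    except (ValueError, SyntaxError):
        return True
    return False


# Function calls and indexing are not literals, so they are refused
# instead of being executed.
call_rejected = is_rejected("__import__('os').getcwd()")
index_rejected = is_rejected("[1, 2, 3][0]")
```

The point is that the string is never executed as code: anything outside the literal subset raises an error.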

There is no equivalent, although you could potentially write one fairly easily by parsing code and then recursively ensuring that you only have certain syntactic forms in the resulting expression before evaluating it. However, unlike Python where a lot of basic types and their syntax and behavior are built in and unchangeable, Julia's "built in" types are just user-defined types that happen to be defined before the system starts up. Let's explore what happens, for example, when you use vector literal syntax:
julia> :([1,2,3]) |> dump
Expr
  head: Symbol vect
  args: Array{Any}((3,))
    1: Int64 1
    2: Int64 2
    3: Int64 3
  typ: Any
julia> f() = [1,2,3]
f (generic function with 2 methods)
julia> @code_lowered f()
CodeInfo(:(begin
nothing
return (Base.vect)(1, 2, 3)
end))
julia> methods(Base.vect)
# 3 methods for generic function "vect":
vect() in Base at array.jl:63
vect(X::T...) where T in Base at array.jl:64
vect(X...) in Base at array.jl:67
So [1,2,3] is just a syntactic form that is lowered as a call to the Base.vect function, i.e. Base.vect(1,2,3). Now, we might in the future make it possible to "seal" some functions so that one can't add any submethods or overwrite their behavior in any way, but currently modifying the behavior of Base.vect for some set of arguments is entirely possible:
julia> function Base.vect(a::Int, b::Int, c::Int)
    warn("SURPRISE!")
    return invoke(Base.vect, Tuple{Any,Any,Any}, a, b, c)
end
julia> [1,2,3]
WARNING: SURPRISE!
3-element Array{Int64,1}:
1
2
3
Since an array literal is overloadable in Julia it's not really a purely literal syntax. Of course, I don't recommend doing what I just did – "SURPRISE!" is not something you want to see in the middle of your program – but it is possible and therefore the syntax is not "safe" in the sense of this question. Some other constructs which are expressed with literals in Python or JavaScript (or most scripting languages), are explicitly function calls in Julia, such as constructing dictionaries:
julia> Dict(:foo => 1, :bar => 2, :baz => 42)
Dict{Symbol,Int64} with 3 entries:
:baz => 42
:bar => 2
:foo => 1
This is just a function call to the Dict type with three pair object arguments, not a literal syntax at all. The a => b pair syntax itself is also just a special syntax for a function call to the => operator, which is an alias for the Pair type:
julia> dump(:(a => b))
Expr
  head: Symbol call
  args: Array{Any}((3,))
    1: Symbol =>
    2: Symbol a
    3: Symbol b
  typ: Any
julia> :foo => 1.23
:foo=>1.23
julia> =>
Pair
julia> Pair(:foo, 1.23)
:foo=>1.23
What about integer literals? Surely those are safe! Well, yes and no. Small integer literals are currently safe, since they are converted in the parser directly to Int values, without any overloadable entry points (that could change in the future, however, allowing user code to opt into different behaviors for integer literals). Large enough integer literals, however, are lowered to macro calls, for example:
julia> :(18446744073709551616)
:(@int128_str "18446744073709551616")
An integer literal that is too big for the Int64 type is lowered as a macro call with a string argument containing the integer digits, allowing the macro to parse the string and return an appropriate integer object – in this case an Int128 value – to be spliced into the abstract syntax tree. But you can define new behaviors for these macros:
julia> macro int128_str(s)
    warn("BIG SUPRISE!")
    9999999999999999
end
julia> 18446744073709551616
WARNING: BIG SUPRISE!
9999999999999999
Essentially, there is no meaningful "safe literal subset" of Julia. Philosophically, Julia is very different from Python: instead of building in a fixed set of types with special capabilities that are inaccessible to user-defined types, Julia includes powerful enough mechanisms in the language that the language can be built from within itself – a process known as "bootstrapping". These powerful language mechanisms are just as available to Julia programmers as they are to the programmers of Julia. This is where much of Julia's flexibility and power comes from. But with great power comes great responsibility and all that... so don't actually do any of the things I've done in this answer unless you have a really good reason :)
To get back to your original problem, the best one could do to create a parser for safe literal object construction using Julia syntax would be to implement a parser for a subset of Julia, giving literals their usual meaning in a way that cannot be overloaded. This safe syntax subset could include numeric literals, string literals, array literals, and Dict constructors, for example. But it would probably be more practical to just use JSON syntax and parse it using Julia's JSON package.

I know I'm late here, but Meta.parse does the job:
julia> eval(Meta.parse("[1,2,3]"))
3-element Array{Int64,1}:
1
2
3
Specifically, Meta.parse turns your string into an Expr which eval then turns into a useable data structure. Definitely works in Julia 1.0.
https://discourse.julialang.org/t/how-to-convert-a-string-into-an-expression/11160

Related

Scientific notation literal with variable exponent in Python

The literal 3e4 represents the float 30000.0 in Python (3.8 at least).
>>> print(3e4)
30000.0
The syntax of the following code is clearly invalid:
x=4
3ex
3ex is not a valid expression, but the example helps me ask my question:
Clearly, the expression 3*10**4 represents the same number, but my question here is purely related to the scientific notation literals. Just for my curiosity, is there a way to use the same syntax with a variable power, better than:
x=4
eval(f"1e{x}")
One subtle difference between 3e4 and 3*10**4 is the type (float and int respectively).
Is there also a difference in execution time perhaps in calculating these two expressions?
Is there a way to use the same syntax with a variable power?
To your first question: no, the documentation does not suggest you can.
When a float is instantiated from a string, CPython calls its C function PyOS_string_to_double, which deals with locale issues (. vs ,) before passing the string to the C function strtod.
The documentation for PyOS_string_to_double does not mention any special way to configure the exponent.
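Since float() accepts the same scientific-notation text as the literal, one way to get a variable exponent without eval is to build the string and pass it to float(). This is only a sketch; the variable names are made up for the illustration.

```python
x = 4

# float() parses the same scientific-notation text as the literal 3e4,
# so no eval() is needed to plug in a variable exponent.
from_string = float(f"3e{x}")

# Arithmetic alternatives; note that the result types differ.
from_int_power = 3 * 10 ** x        # int
from_float_power = 3 * 10.0 ** x    # float
```

Unlike eval, float() only ever parses a number, so a malformed string raises ValueError rather than running arbitrary code.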
Is there also a difference in execution time perhaps?
To your second question, about performance: this is easily benchmarked, but we do not have a candidate to benchmark against, so the question is moot.
I hope this satiates your curiosity. If not, feel free to dig into the C code that I linked.

What are the types of Python operators?

I tried type(+), hoping to learn more about how this operator is represented in Python, but I got SyntaxError: invalid syntax.
My main problem is to turn a string representing an operation, such as "3+4", into the real operation to be computed in Python (so as to get an int back: 7).
I am also trying to avoid easy solutions requiring the os library if possible.
Operators don't really have types, as they aren't values. They are just syntax whose implementation is often defined by a magic method (e.g., + is defined by the appropriate type's __add__ method).
You have to parse your string:
First, break it down into tokens: ['3', '+', '4']
Then, parse the token string into an abstract syntax tree (i.e., something that stores the idea of + having 3 and 4 as its operands).
Finally, evaluate the AST by applying functions stored at a node to the values stored in its children.
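The three steps above can be sketched with the standard ast module, which does the tokenizing and parsing for you. This is a rough illustration that only supports numbers and four binary operators; the names BINARY_OPS, evaluate and calculate are made up for the example.

```python
import ast
import operator

# Map AST operator node types to the functions that implement them.
BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def evaluate(node):
    """Recursively evaluate a small arithmetic AST."""
    if isinstance(node, ast.Expression):
        return evaluate(node.body)
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in BINARY_OPS:
        func = BINARY_OPS[type(node.op)]
        # Apply the function stored at this node to its evaluated children.
        return func(evaluate(node.left), evaluate(node.right))
    raise ValueError(f"unsupported syntax: {type(node).__name__}")


def calculate(text):
    # ast.parse performs the tokenizing and parsing steps.
    return evaluate(ast.parse(text, mode="eval"))


result = calculate("3+4")
```

Anything outside the whitelisted node types (names, calls, attribute access) raises ValueError instead of being executed.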
The built-in eval function probably does what you want:
eval('3+4')
returns 7.

Passing dictionary to a function with **

I'm trying to understand the following.
def exp(**argd):
    print(argd)
a={1:'a',2:'b'}
exp(**a)
This will give TypeError: exp() keywords must be strings.
This works fine if I use a={'1':'a','2':'b'}. Why can't I pass a dictionary with numeric keys to the exp function?
exp(**a) in your example expands literally to exp(1='a', 2='b'), which is an error because integer literals cannot be variable names.
You might think, why doesn't the ** process cast keys into strings as part of the expansion? There's no one singular reason, but in general Python's philosophy is "explicit is better than implicit", and implicit casting can have some pitfalls -- many object types that are distinct from each other, for instance, will cast to the same string, which could cause unintended consequences if you relied on implicit string casting during expansion.
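A small sketch of both cases, plus the explicit conversion. Here exp returns its keyword dict instead of printing it so the result can be inspected; the variable names are made up for the example.

```python
def exp(**argd):
    # Return the collected keyword arguments so they can be inspected.
    return argd


# String keys unpack into keyword arguments without complaint.
string_keys = exp(**{'1': 'a', '2': 'b'})

# Integer keys are rejected at call time with a TypeError.
try:
    exp(**{1: 'a', 2: 'b'})
    int_keys_error = None
except TypeError as err:
    int_keys_error = err

# Converting the keys explicitly makes the intent clear.
numbered = {1: 'a', 2: 'b'}
converted = exp(**{str(key): value for key, value in numbered.items()})
```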
Because you cannot (Guido is probably the only one who can tell you why). It makes keyword names partially adhere to variable naming rules. The **a_dict syntax unpacks the dict:
a={1:'a',2:'b'}
exp(**a) #is basically exp(1='a',2='b')
which is obviously a syntax error
although it does allow funny things like
a = {'a variable':7,'some$thing':88}
exp(**a)
as long as they are strings. It seems the only rule enforced is that the keys are strings; this is likely to guarantee that they are hashable (a huge guess...)
disclaimer: this is probably a gross oversimplification

Symbol vs Operator in Python

I'm reviewing for a test over some basic Python syntax stuff and I want to make sure I have a proper understanding of the difference between a symbol and an operator. A symbol can be a string of characters or an operator, and an operator can only be something that does something to characters or strings, right?
An operator is a syntactic representation of some important Python function, for example the infix + operator in a + b. There is a module called operator that represents the standard operators as functions. Also, special methods (as in the hus787 comment above) can override operators for instances of a class.
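A short sketch of both points. The Money class is hypothetical and exists only for this illustration; operator.add and the __add__ special method are standard Python.

```python
import operator


class Money:
    """Hypothetical class that overrides + through the __add__ special method."""

    def __init__(self, cents):
        self.cents = cents

    def __add__(self, other):
        return Money(self.cents + other.cents)


# The operator module exposes + as an ordinary function.
same_result = operator.add(2, 3) == 2 + 3

# For Money instances, a + b dispatches to Money.__add__.
total = Money(150) + Money(250)
```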
Symbol is an element of Python grammar. Symbol can represent a whole program, a statement, operator, name, literal, etc, even indent and dedent (in case of Python).
This terminology is not even Python-specific.
A symbol in a programming language is either a binding to some value (e.g. variable identifiers), a value itself (e.g. "foo", 123, True), a keyword (e.g. def, class, import, try, except, ...), or another language-specific construct ((), {}, [], ...).
So a symbol does not always have to be a string of characters.
In contrast, an operator defines a specific function over one or more values. (There are unary, binary, ternary, ... operators.)
e.g. + in 1+1 and < in a<b are operators.
It's worth noting that from a compiler's standpoint, everything you write in your code is a symbol. Even +, -, *, / are mere symbols to a lexical analyzer. (I assume this is outside the scope of your question.) Hence we will restrict our answer to the domain of language syntax.
However, this idea is universal across programming languages.

Why does Python not perform type conversion when concatenating strings?

In Python, the following code produces an error:
a = 'abc'
b = 1
print(a + b)
(The error is "TypeError: cannot concatenate 'str' and 'int' objects").
Why does the Python interpreter not automatically try using the str() function when it encounters concatenation of these types?
The problem is that the conversion is ambiguous, because + means both string concatenation and numeric addition. The following question would be equally valid:
Why does the Python interpreter not automatically try using the int() function when it encounters addition of these types?
This is exactly the loose-typing problem that unfortunately afflicts JavaScript.
There's a very large degree of ambiguity with such operations. Consider this case instead:
a = '4'
b = 1
print(a + b)
It's not clear if a should be coerced to an integer (resulting in 5), or if b should be coerced to a string (resulting in '41'). Since type juggling rules are transitive, passing a numeric string to a function expecting numbers could get you in trouble, especially since almost all arithmetic operators have overloaded operations for strings too.
For instance, in JavaScript, to make sure you deal with integers and not strings, a common practice is to multiply a variable by one; in Python, the multiplication operator repeats strings, so '41' * 1 is a no-op. It's probably better to just ask the developer to clarify.
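To make the ambiguity concrete, here is a small sketch of the mixed case and the two explicit ways to resolve it. The variable names are made up for the example.

```python
a = '4'
b = 1

# Mixing str and int with + is refused rather than guessed at.
try:
    a + b
    mixed_error = None
except TypeError as err:
    mixed_error = err

# The developer resolves the ambiguity explicitly, one way or the other.
as_number = int(a) + b    # numeric addition
as_text = a + str(b)      # string concatenation

# Multiplying a string by one repeats it once, leaving it unchanged.
repeated = '41' * 1
```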
The short answer would be because Python is a strongly typed language.
This was a design decision made by Guido. It could have been one way or another really, concatenating str and int to str or int.
The best explanation is still the one given by Guido; you can check it here
The other answers have provided pretty good explanations, but have failed to mention that this feature is known as strong typing. Languages that perform implicit conversions are weakly typed.
Because Python does not perform type conversion when concatenating strings. This behavior is by design, and you should get in the habit of performing explicit type conversions when you need to coerce objects into strings or numbers.
Change your code to:
a = 'abc'
b = 1
print(a + str(b))
And you'll see the desired result.
Python would have to know what's in the string to do it correctly. There's an ambiguous case: what should '5' + 5 generate? A number or a string? That should certainly throw an error. Now, to determine whether that situation holds, Python would have to examine the string to tell. Should it do that every time you try to concatenate or add two things? Better to just let the programmer convert the string explicitly.
More generally, implicit conversions like that are just plain confusing! They're hard to predict, hard to read, and hard to debug.
That's just how they decided to design the language. Probably the rationale is that requiring explicit conversions to string reduces the likelihood of unintended behavior (e.g. integer addition if both operands happen to be ints instead of strings).
Wrap the int in a list to disambiguate the '+' operation:
['foo', 'bar'] + [5]
this returns: ['foo', 'bar', 5]
