Does Perl 6 have an equivalent to Python's bytearray method?

I can't find a bytearray method, or anything similar to Python's, in the Raku docs. In Python, bytearray is defined like this:
class bytearray([source[, encoding[, errors]]])
Return a new array of bytes. The bytearray class is a mutable sequence of integers in the range 0 <= x < 256. It has most of the usual methods of mutable sequences, described in Mutable Sequence Types, as well as most methods that the str type has, see String Methods.
Does Raku provide this, either as a built-in or through some module?

I think you're looking for Buf - a mutable sequence of (usually unsigned) integers. Reading from a file opened with :bin returns a Buf.

brian d foy's answer is essentially correct. You can pretty much translate this code into Perl 6
my $frame = Buf.new;
$frame.append(0xA2);
$frame.append(0x01);
say $frame; # OUTPUT: «Buf:0x<a2 01>␤»
However, the declaration is not the same:
bu = bytearray( 'þor', encoding='utf8',errors='replace')
in Python would be equivalent to this in Perl 6
my $bú = Buf.new('þor'.encode('utf-8'));
say $bú; # OUTPUT: «Buf:0x<c3 be 6f 72>␤»
And to get something equivalent to the errors argument, the approach is different because of the way Perl 6 handles Unicode normalization; you would probably have to use the UTF8-C8 (UTF-8 Clean-8) encoding.
For most uses, however, I guess Buf, as indicated by brian d foy, is correct.
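For comparison, here is a rough sketch of the Python side of the same operations, so the two can be read next to each other. It assumes Python 3; the variable names are only illustrative.

```python
# Build a byte buffer incrementally, mirroring the Buf example above.
frame = bytearray()
frame.append(0xA2)
frame.append(0x01)
print(frame.hex())  # a201

# Construct from text with an explicit encoding and error handler.
bu = bytearray('þor', encoding='utf8', errors='replace')
print(bu.hex())  # c3be6f72

# bytearray is mutable: a single byte can be reassigned in place.
bu[2] = ord('a')
print(bu.decode('utf8'))  # þar
```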

Related

What is the equivalent of Python's ast.literal_eval() in Julia?

Is there anything in Julia which is equivalent to Python's literal_eval provided by the package ast (Abstract Syntax Tree)?
A summary of its (literal_eval) description:
This function only evaluates Python literal structures: strings,
bytes, numbers, tuples, lists, dicts, sets, booleans, and None, and
can be used for safely evaluating strings from untrusted sources
without the need to parse the values oneself. It is not capable of
evaluating arbitrarily complex expressions, for example involving
operators or indexing.
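As a quick illustration of the behaviour described in that summary, the following sketch (assuming Python 3.8 or later) shows literal_eval accepting a nested literal and rejecting a function call and an indexing expression. The input strings are made up for the example.

```python
import ast

# Plain literal structures are evaluated into Python objects.
value = ast.literal_eval("[1, 2.5, 'three', {'k': (None, True)}]")
print(value)

# Anything beyond literals (calls, indexing) is refused with ValueError.
rejected = []
for source in ("len('abc')", "[1, 2, 3][0]"):
    try:
        ast.literal_eval(source)
    except ValueError:
        rejected.append(source)
print(rejected)
```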
There is no equivalent, although you could potentially write one fairly easily by parsing code and then recursively ensuring that you only have certain syntactic forms in the resulting expression before evaluating it. However, unlike Python where a lot of basic types and their syntax and behavior are built in and unchangeable, Julia's "built in" types are just user-defined types that happen to be defined before the system starts up. Let's explore what happens, for example, when you use vector literal syntax:
julia> :([1,2,3]) |> dump
Expr
head: Symbol vect
args: Array{Any}((3,))
1: Int64 1
2: Int64 2
3: Int64 3
typ: Any
julia> f() = [1,2,3]
f (generic function with 2 methods)
julia> @code_lowered f()
CodeInfo(:(begin
nothing
return (Base.vect)(1, 2, 3)
end))
julia> methods(Base.vect)
# 3 methods for generic function "vect":
vect() in Base at array.jl:63
vect(X::T...) where T in Base at array.jl:64
vect(X...) in Base at array.jl:67
So [1,2,3] is just a syntactic form that is lowered as a call to the Base.vect function, i.e. Base.vect(1,2,3). Now, we might in the future make it possible to "seal" some functions so that one can't add any submethods or overwrite their behavior in any way, but currently modifying the behavior of Base.vect for some set of arguments is entirely possible:
julia> function Base.vect(a::Int, b::Int, c::Int)
warn("SURPRISE!")
return invoke(Base.vect, Tuple{Any,Any,Any}, a, b, c)
end
julia> [1,2,3]
WARNING: SURPRISE!
3-element Array{Int64,1}:
1
2
3
Since an array literal is overloadable in Julia it's not really a purely literal syntax. Of course, I don't recommend doing what I just did – "SURPRISE!" is not something you want to see in the middle of your program – but it is possible and therefore the syntax is not "safe" in the sense of this question. Some other constructs which are expressed with literals in Python or JavaScript (or most scripting languages), are explicitly function calls in Julia, such as constructing dictionaries:
julia> Dict(:foo => 1, :bar => 2, :baz => 42)
Dict{Symbol,Int64} with 3 entries:
:baz => 42
:bar => 2
:foo => 1
This is just a function call to the Dict type with three pair object arguments, not a literal syntax at all. The a => b pair syntax itself is also just a special syntax for a function call to the => operator, which is an alias for the Pair type:
julia> dump(:(a => b))
Expr
head: Symbol call
args: Array{Any}((3,))
1: Symbol =>
2: Symbol a
3: Symbol b
typ: Any
julia> :foo => 1.23
:foo=>1.23
julia> =>
Pair
julia> Pair(:foo, 1.23)
:foo=>1.23
What about integer literals? Surely those are safe! Well, yes and no. Small integer literals are currently safe, since they are converted in the parser directly to Int values, without any overloadable entry points (that could change in the future, however, allowing user code to opt into different behaviors for integer literals). Large enough integer literals, however, are lowered to macro calls, for example:
julia> :(18446744073709551616)
:(@int128_str "18446744073709551616")
An integer literal that is too big for the Int64 type is lowered as a macro call with a string argument containing the integer digits, allowing the macro to parse the string and return an appropriate integer object – in this case an Int128 value – to be spliced into the abstract syntax tree. But you can define new behaviors for these macros:
julia> macro int128_str(s)
warn("BIG SUPRISE!")
9999999999999999
end
julia> 18446744073709551616
WARNING: BIG SUPRISE!
9999999999999999
Essentially, there is no meaningful "safe literal subset" of Julia. Philosophically, Julia is very different from Python: instead of building in a fixed set of types with special capabilities that are inaccessible to user-defined types, Julia includes powerful enough mechanisms in the language that the language can be built from within itself – a process known as "bootstrapping". These powerful language mechanisms are just as available to Julia programmers as they are to the programmers of Julia. This is where much of Julia's flexibility and power comes from. But with great power comes great responsibility and all that... so don't actually do any of the things I've done in this answer unless you have a really good reason :)
To get back to your original problem, the best one could do to create a parser for safe literal object construction using Julia syntax would be to implement a parser for a subset of Julia, giving literals their usual meaning in a way that cannot be overloaded. This safe syntax subset could include numeric literals, string literals, array literals, and Dict constructors, for example. But it would probably be more practical to just use JSON syntax and parse it using Julia's JSON package.
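The "parse, then recursively allow only certain syntactic forms" idea is language independent. Here is a minimal sketch of it in Python rather than Julia, purely to show the shape of such a whitelist walker. The function name safe_literal and the chosen whitelist (numbers, strings, booleans, None, lists, dicts) are assumptions made for this example, and it assumes Python 3.8 or later.

```python
import ast

def safe_literal(source):
    """Evaluate a small whitelist of literal forms; reject everything else."""
    def convert(node):
        # Scalars: numbers, strings, booleans and None.
        if isinstance(node, ast.Constant) and isinstance(
            node.value, (int, float, str, bool, type(None))
        ):
            return node.value
        # Containers are rebuilt by recursing into their elements.
        if isinstance(node, ast.List):
            return [convert(item) for item in node.elts]
        if isinstance(node, ast.Dict):
            return {convert(k): convert(v) for k, v in zip(node.keys, node.values)}
        # Any other syntactic form (calls, operators, names, ...) is refused.
        raise ValueError(f"disallowed syntax: {type(node).__name__}")

    tree = ast.parse(source, mode="eval")
    return convert(tree.body)

value = safe_literal('[1, "two", {"three": 4.0}]')
print(value)

try:
    safe_literal('__import__("os")')
except ValueError as exc:
    error = str(exc)
    print(error)  # disallowed syntax: Call
```

Because nothing is ever evaluated, only rebuilt from the syntax tree, there is no entry point that user code could overload. The price is that even harmless forms outside the whitelist, such as -1 (a unary operator applied to 1), are rejected unless they are added explicitly.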
I know I'm late here, but Meta.parse does the job:
julia> eval(Meta.parse("[1,2,3]"))
3-element Array{Int64,1}:
1
2
3
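Specifically, Meta.parse turns your string into an Expr which eval then turns into a usable data structure. Definitely works in Julia 1.0. Note that eval runs whatever expression it is given, so unlike literal_eval this is not safe for strings from untrusted sources.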
https://discourse.julialang.org/t/how-to-convert-a-string-into-an-expression/11160

python: how to generate char by adding int

I can use 'a'+1 to get 'b' in C, so what is the convenient way to do this in Python?
I can write it like:
chr(ord('a')+1)
but I don't know whether it is the best way.
Yes, this is the best way. Python doesn't automatically convert between a character and an int the way C and C++ do.
Python doesn't actually have a character type, unlike C, so yes, chr(ord(...)) is the way to do it.
If you wanted to do it a bit more cleanly, you could do something like:
def add(c, x):
    return chr(ord(c)+x)
There is the bytearray type in Python -
it is slower than regular strings, but behaves mostly like a C string:
it is mutable, accessing individual elements yields integers in the range 0 - 255 instead of substrings of length 1, and you can assign to the elements. Still, it is represented as a string, and in Python 2 it can be used in most places a string can without being cast to a str object:
>>> text = bytearray("a")
>>> text
bytearray(b'a')
>>> print text
a
>>> text[0]+=1
>>> print text
b
>>> text[0]
98
>>> print "other_text" + text
other_textb
When using Python 3, to use the contents of a bytearray as a text object, simply call its decode method with an appropriate encoding such as "latin1" or "utf-8":
>>> print ("other_text" + text.decode("latin1"))
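For reference, the session above can be replayed under Python 3 roughly as follows. This is a sketch; the main differences are that the constructor needs a bytes literal (or an encoding) and that print is a function.

```python
text = bytearray(b"a")    # Python 3: pass bytes, or a str plus an encoding
text[0] += 1              # elements are ints, so arithmetic works directly
print(text)               # bytearray(b'b')
print(text[0])            # 98

# Mixing with str requires an explicit decode in Python 3.
combined = "other_text" + text.decode("latin1")
print(combined)           # other_textb
```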
What you're doing is really the right way. Python does not conflate a character with its numerical codepoint, as C and similar languages do. The reason is that once you go beyond ASCII, the same integral value can represent different characters, depending on the encoding. C emphasizes direct access to the underlying hardware formats, but Python emphasizes well-defined semantics.
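To make the encoding point concrete, here is a small sketch (Python 3) in which the same byte value decodes to two different characters. The choice of latin1 and cp1251 is just an example pair of encodings.

```python
raw = bytes([0xE9])               # a single byte with value 233

western = raw.decode("latin1")    # é (U+00E9)
cyrillic = raw.decode("cp1251")   # й (U+0439)
print(western, cyrillic)

# ord/chr work on Unicode code points, independent of any byte encoding.
print(ord(western), ord(cyrillic))   # 233 1081
print(chr(ord("a") + 1))             # b
```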

Why does Python not perform type conversion when concatenating strings?

In Python, the following code produces an error:
a = 'abc'
b = 1
print(a + b)
(The error is "TypeError: cannot concatenate 'str' and 'int' objects").
Why does the Python interpreter not automatically try using the str() function when it encounters concatenation of these types?
The problem is that the conversion is ambiguous, because + means both string concatenation and numeric addition. The following question would be equally valid:
Why does the Python interpreter not automatically try using the int() function when it encounters addition of these types?
This is exactly the loose-typing problem that unfortunately afflicts JavaScript.
There's a very large degree of ambiguity with such operations. Consider this case instead:
a = '4'
b = 1
print(a + b)
It's not clear if a should be coerced to an integer (resulting in 5), or if b should be coerced to a string (resulting in '41'). Since type juggling rules are transitive, passing a numeric string to a function expecting numbers could get you in trouble, especially since almost all arithmetic operators have overloaded operations for strings too.
For instance, in JavaScript, to make sure you deal with integers and not strings, a common practice is to multiply a variable by one; in Python, the multiplication operator repeats strings, so '41' * 1 is a no-op. It's probably better to just ask the developer to clarify.
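The two possible readings, and the string-repetition behaviour mentioned above, can be spelled out like this (a short sketch for Python 3):

```python
a = '4'
b = 1

as_number = int(a) + b    # explicit: treat both as integers
as_text = a + str(b)      # explicit: treat both as strings
print(as_number)          # 5
print(as_text)            # 41

# Multiplying a string repeats it, so "times one" converts nothing.
print('41' * 1)           # 41
print('41' * 2)           # 4141

# Leaving the choice implicit raises instead of guessing.
try:
    a + b
    raised = False
except TypeError:
    raised = True
print(raised)             # True
```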
The short answer would be because Python is a strongly typed language.
This was a design decision made by Guido. It could have been one way or another really, concatenating str and int to str or int.
The best explanation is still the one given by Guido; you can check it here
The other answers have provided pretty good explanations, but have failed to mention that this feature is known as strong typing. Languages that perform implicit conversions are weakly typed.
Because Python does not perform type conversion when concatenating strings. This behavior is by design, and you should get in the habit of performing explicit type conversions when you need to coerce objects into strings or numbers.
Change your code to:
a = 'abc'
b = 1
print(a + str(b))
And you'll see the desired result.
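Besides str(), string formatting is another way to make the conversion explicit. A brief sketch, assuming Python 3.6 or later for the f-string:

```python
a = 'abc'
b = 1

with_fstring = f"{a}{b}"           # formatting calls format() on each value
with_format = "{}{}".format(a, b)  # same idea, older syntax
print(with_fstring)                # abc1
print(with_format)                 # abc1
print(a, b, sep="")                # abc1 (print converts each argument itself)
```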
Python would have to know what's in the string to do it correctly. There's an ambiguous case: what should '5' + 5 generate? A number or a string? That should certainly throw an error. Now to determine whether that situation holds, Python would have to examine the string to tell. Should it do that every time you try to concatenate or add two things? Better to just let the programmer convert the string explicitly.
More generally, implicit conversions like that are just plain confusing! They're hard to predict, hard to read, and hard to debug.
That's just how they decided to design the language. Probably the rationale is that requiring explicit conversions to string reduces the likelihood of unintended behavior (e.g. integer addition if both operands happen to be ints instead of strings).
Tell Python that the int is a list to disambiguate the '+' operation.
['foo', 'bar'] + [5]
this returns: ['foo', 'bar', 5]

Exploits in Python - manipulating hex strings

I'm quite new to Python and trying to port a simple exploit I've written for a stack overflow (just a nop sled, shell code and return address). This isn't for nefarious purposes but rather for a security lecture at a university.
Given a hex string (deadbeef), what are the best ways to:
represent it as a series of bytes
add or subtract a value
reverse the order (for x86 memory layout, i.e. efbeadde)
Any tips and tricks regarding common tasks in exploit writing in python are also greatly appreciated.
In Python 2.6 and above, you can use the built-in bytearray class.
To create your bytearray object:
b = bytearray.fromhex('deadbeef')
To alter a byte, you can reference it using array notation:
b[2] += 7
To reverse the bytearray in place, use b.reverse(). To create an iterator that iterates over it in reverse order, you can use the reversed function: reversed(b).
You may also be interested in the new bytes class in Python 3, which is like bytearray but immutable.
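Putting the three tasks from the question together, a sketch in Python 3 could look like the following. The struct.pack call is an addition not mentioned above; it is one common way to get a little-endian 32-bit layout, shown here as an option rather than as the only approach.

```python
import struct

b = bytearray.fromhex('deadbeef')   # represent the hex string as bytes
b[2] += 7                           # per-byte arithmetic: 0xbe + 7 == 0xc5
print(b.hex())                      # deadc5ef

b.reverse()                         # reverse the byte order in place
print(b.hex())                      # efc5adde

# Arithmetic on the whole 32-bit value, then little-endian packing.
value = int('deadbeef', 16) + 0x10
packed = struct.pack('<I', value)
print(packed.hex())                 # ffbeadde
```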
Not sure if this is the best way...
hex_str = "deadbeef"
bytes = "".join(chr(int(hex_str[i:i+2],16)) for i in xrange(0,len(hex_str),2))
rev_bytes = bytes[::-1]
Or might be simpler:
bytes = "\xde\xad\xbe\xef"
rev_bytes = bytes[::-1]
In Python 2.x, regular str values are binary-safe. You can use the binascii module's b2a_hex and a2b_hex functions to convert to and from hexadecimal.
You can use ordinary string methods to reverse or otherwise rearrange your bytes. However, doing any kind of arithmetic would require you to use the ord function to get numeric values for individual bytes, then chr to convert the result back, followed by concatenation to reassemble the modified string.
For mutable sequences with easier arithmetic, use the array module with type code 'B'. These can be initialized from the results of a2b_hex if you're starting from hexadecimal.
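A sketch of the array-based approach, written for Python 3 (where tobytes replaces the older tostring method):

```python
import array
import binascii

# Type code 'B' means unsigned char, so each element is an int in 0..255.
data = array.array('B', binascii.a2b_hex('deadbeef'))
data[3] -= 1                        # 0xef - 1 == 0xee
data.reverse()                      # in-place reversal, as with a list

result = binascii.b2a_hex(data.tobytes())
print(result)                       # b'eebeadde'
```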

Python: Why does ("hello" is "hello") evaluate as True? [duplicate]

Why does "hello" is "hello" produce True in Python?
I read the following here:
If two string literals are equal, they have been put to same
memory location. A string is an immutable entity. No harm can
be done.
So there is one and only one place in memory for every Python string? Sounds pretty strange. What's going on here?
Python (like Java, C, C++, .NET) uses string pooling / interning. The interpreter realises that "hello" is the same as "hello", so it optimizes and uses the same location in memory.
Another goodie: "hell" + "o" is "hello" ==> True
So there is one and only one place in memory for every Python string?
No, only ones the interpreter has decided to optimise, which is a decision based on a policy that isn't part of the language specification and which may change in different CPython versions.
e.g. on my install (2.6.2 Linux):
>>> 'X'*10 is 'X'*10
True
>>> 'X'*30 is 'X'*30
False
similarly for ints:
>>> 2**8 is 2**8
True
>>> 2**9 is 2**9
False
So don't rely on 'string' is 'string': even just looking at the C implementation it isn't safe.
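If you really do need identity of equal strings, sys.intern requests it explicitly; otherwise compare with ==. A small sketch (Python 3); whether a is b holds for the two objects below is implementation dependent, so the example deliberately does not print it.

```python
import sys

a = "hello"
b = "".join(["hel", "lo"])   # built at run time, so possibly a separate object

print(a == b)                # True: equality compares contents

# Interning both guarantees they refer to one shared object.
a_interned = sys.intern(a)
b_interned = sys.intern(b)
print(a_interned is b_interned)   # True
```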
Literal strings are probably grouped based on their hash or something similar. Two of the same literal strings will be stored in the same memory, and any references both refer to that.
Memory        Code
-------
|              myLine = "hello"
|            /
|hello     <
|            \
|              myLine = "hello"
-------
The is operator returns true if both arguments are the same object. Your result is a consequence of this, and the quoted bit.
In the case of string literals, these are interned, meaning they are compared to known strings. If an identical string is already known, the literal takes that value, instead of an alternative one. Thus, they become the same object, and the expression is true.
The Python interpreter/compiler parses the string literals, i.e. the quoted list of characters. When it does this, it can detect "I've seen this string before", and use the same representation as last time. It can do this since it knows that strings defined in this way cannot be changed.
Why is it strange? If the string is immutable, it makes a lot of sense to store it only once. .NET has the same behavior.
I think if any two variables (not just strings) contain the same value, the value will be stored only once not twice and both the variables will point to the same location. This saves memory.
