Convert bytes -> string -> back to bytes, and get original value - python

I checked all Stack Overflow questions on this matter and none answers my problem. I need to convert \\ to \.
This is what I am trying:
>>> a = b'\xe5jb\x8c?Q$\xf3\x1d\x97^\xfa3O\xa6U.txt'
>>> b = str(a)
>>> b
"b'\\xe5jb\\x8c?Q$\\xf3\\x1d\\x97^\\xfa3O\\xa6U.txt'"
>>> b = b.replace('b\'','')
>>> b = b[:len(b)-1]
>>> b
'\\xe5jb\\x8c?Q$\\xf3\\x1d\\x97^\\xfa3O\\xa6U.txt'
>>> c = bytes(b,'utf8')
>>> c
b'\\xe5jb\\x8c?Q$\\xf3\\x1d\\x97^\\xfa3O\\xa6U.txt'
>>> a == c
False
How do I make a==c True? I tried
.replace("\\\\","\\")
but this doesn't help; the string remains the same. I need to store the bytes in variable 'a' in a file as text and read them back. Python 3.8, Windows 10.

You can convert c to a string with the decode method, and then use ast.literal_eval to evaluate it as a bytes literal after wrapping it with b'...':
from ast import literal_eval
a = b'\xe5jb\x8c?Q$\xf3\x1d\x97^\xfa3O\xa6U.txt'
c = b'\\xe5jb\\x8c?Q$\\xf3\\x1d\\x97^\\xfa3O\\xa6U.txt'
c = literal_eval("b'%s'" % c.decode())
print(a == c)
This outputs: True
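If the goal is only to store the bytes as text and read them back, it may be simpler not to strip the b'...' wrapper at all. Here is a minimal sketch of three round trips using only the standard library. The variable names are mine, not from the question, and which text format is best depends on who else has to read the file:

```python
import ast
import base64

original = b'\xe5jb\x8c?Q$\xf3\x1d\x97^\xfa3O\xa6U.txt'

# Option 1: keep the full repr (with the b'...' wrapper) and parse it back.
as_repr = repr(original)
restored_from_repr = ast.literal_eval(as_repr)

# Option 2: hexadecimal text, two characters per byte.
as_hex = original.hex()
restored_from_hex = bytes.fromhex(as_hex)

# Option 3: Base64 text, more compact than hex.
as_b64 = base64.b64encode(original).decode('ascii')
restored_from_b64 = base64.b64decode(as_b64)

print(restored_from_repr == original)
print(restored_from_hex == original)
print(restored_from_b64 == original)
```

All three text forms contain only ASCII characters, so they can be written to an ordinary text file.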

Use .replace() function for string


Why does indexing a binary string return an integer in python3?

If given a binary string in python like
bstring = b'hello'
why does bstring[0] return the ascii code for the char 'h' (104) and not the binary char b'h' or b'\x68'?
It's probably also good to note that b'h' == 104 returns False (this cost me about 2 hours of debugging, so I'm a little annoyed)
Because bytes are not characters.
It returns the value of the byte (as an integer) at that index. Slicing, by contrast, returns a bytes object.
If you take 'hello', this is quite simple: 5 ASCII characters -> 5 bytes:
b'hello' == 'hello'.encode('utf-8')
# True
len('hello'.encode('utf-8'))
# 5
If you were to use non-ASCII characters, each could be encoded as several bytes, and indexing would give you only part of a character:
len('å'.encode('utf-8'))
# 2
'å'.encode('utf-8')[0]
# 195
'å'.encode('utf-8')[1]
# 165
Think of bytes less as a “string” and more of an immutable list (or tuple) with the constraints that all elements be integers in range(256).
So, think of:
>>> bstring = b'hello'
>>> bstring[0]
104
as being equivalent to
>>> btuple = (104, 101, 108, 108, 111)
>>> btuple[0]
104
except with a different sequence type.
It's actually str that behaves weirdly in Python. If you index a str, you don't get a char object like you would in some other languages; you get another str.
>>> string = 'hello'
>>> string[0]
'h'
>>> type(string[0])
<class 'str'>
>>> string[0][0]
'h'
>>> string[0][0][0]
'h'
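If what you want is a length-1 bytes object rather than an integer, slicing or rebuilding from the integer should do it. A small sketch (the variable names are just for illustration):

```python
bstring = b'hello'

first_int = bstring[0]         # indexing gives an int
first_bytes = bstring[0:1]     # slicing gives a bytes object of length 1
rebuilt = bytes([first_int])   # build bytes from an iterable of ints

print(first_int)
print(first_bytes)
print(rebuilt)
print(list(bstring))

# Compare like with like: int against int, bytes against bytes.
print(first_int == ord('h'))
print(first_bytes == b'h')
```

Comparing b'h' == 104 is False because a bytes object never compares equal to an int, whatever it contains.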

Internal working of strings with null characters

I just tried replacing a character in a python string with a null ('') character. Some weird things are happening. Can someone please explain why all this is happening?
>>> a = "SampleText"
>>> a
'SampleText'
>>> a.replace('a','\0')
'S\x00mpleText'
>>> len(a)
10
>>> a.replace('\0','a')
'SampleText'
>>> len(a)
10
>>> a.replace('a','')
'SmpleText'
>>> len(a)
10
>>> a.replace('','a')
'aSaaamapalaeaTaeaxata'
>>> len(a)
10
The replace method returns a new string rather than modifying the original (strings are immutable), so you need to assign the result back to a variable. If you write a = a.replace('a','\0'), it will work as you expect.
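To make that concrete, here is a short sketch of the same calls with the result assigned back. Note also that '\0' (one NUL character) and '' (the empty string) are different things, which is why the lengths differ:

```python
a = "SampleText"

# The return value is discarded here, so a itself is unchanged.
a.replace('a', '\0')
print(repr(a), len(a))

# Assigning the result back rebinds a to the new string.
a = a.replace('a', '\0')
print(repr(a), len(a))

# Replacing with the empty string removes the character entirely.
without_a = "SampleText".replace('a', '')
print(repr(without_a), len(without_a))
```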

Safely unpacking results of str.split [duplicate]

I've often been frustrated by the lack of flexibility in Python's iterable unpacking. Take the following example:
a, b = "This is a string".split(" ", 1)
Works fine. a contains "This" and b contains "is a string", just as expected. Now let's try this:
a, b = "Thisisastring".split(" ", 1)
Now, we get a ValueError:
ValueError: not enough values to unpack (expected 2, got 1)
Not ideal, when the desired result was "Thisisastring" in a, and None or, better yet, "" in b.
There are a number of hacks to get around this. The most elegant I've seen is this:
a, *b = mystr.split(" ", 1)
b = b[0] if b else ""
Not pretty, and very confusing to Python newcomers.
So what's the most Pythonic way to do this? Store the return value in a variable and use an if block? The *varname hack? Something else?
This looks perfect for str.partition:
>>> a, _, b = "This is a string".partition(" ")
>>> a
'This'
>>> b
'is a string'
>>> a, _, b = "Thisisastring".partition(" ")
>>> a
'Thisisastring'
>>> b
''
>>>
How about adding the default(s) at the end and throwing away the unused ones?
>>> a, b, *_ = "This is a string".split(" ", 1) + ['']
>>> a, b
('This', 'is a string')
>>> a, b, *_ = "Thisisastring".split(" ", 1) + ['']
>>> a, b
('Thisisastring', '')
>>> a, b, c, *_ = "Thisisastring".split(" ", 2) + [''] * 2
>>> a, b, c
('Thisisastring', '', '')
Similar (works in Python 2 as well):
>>> a, b, c = ("Thisisastring".split(" ", 2) + [''] * 2)[:3]
>>> a, b, c
('Thisisastring', '', '')
The *varname hack seems very pythonic to me:
Similar to how function parameters are handled
Lets you use a one-liner or if block or nothing to correct type of the element if desired
You could also try something like the following if you don't find that clear enough for new users
def default(default, tuple_value):
    return tuple(map(lambda x: x if x is not None else default, tuple_value))
Then you can do something like
a, *b = default("", s.split(...))
Then you should be able to depend on b[0] being a string.
I fully admit that the definition of default is obscure, but if you like the effect, you can refine until it meets your aesthetic. In general this is all about what feels right for your style.
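If you need this in several places, another option is to wrap the padding idea from the earlier answers in a small helper. This is only a sketch: split_padded is a hypothetical name of mine, not a standard library function, and it assumes count is at least 1:

```python
def split_padded(text, sep, count, fill=""):
    """Split text into exactly `count` parts, padding with `fill` if needed."""
    # Splitting at most count - 1 times yields at most count parts.
    parts = text.split(sep, count - 1)
    # Pad on the right when the separator occurred fewer times than hoped.
    parts += [fill] * (count - len(parts))
    return parts


a, b = split_padded("This is a string", " ", 2)
print(a, b, sep=" | ")

a, b = split_padded("Thisisastring", " ", 2)
print(repr(a), repr(b))

a, b, c = split_padded("one two", " ", 3)
print(repr(a), repr(b), repr(c))
```

For the two-variable case, str.partition is still the shorter choice; a helper like this mainly pays off when you unpack three or more parts.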

Comparing two variables in Python

I'm trying to test for equality of a and b here. I'm not sure why the output prints 'not equal' even though both a and b equates to '75', which is the same value.
a = (len(products)) #products is a dictionary, its length is 75
b = (f.read()) #f is a txt file that contains only '75'
if(str(a) == str(b)):
    print ('equal')
else:
    print ('not equal')
Add an int() around the f.read() to typecast the str to an int.
>>> b = f.read().strip() # call strip to remove unwanted whitespace chars
>>> type(b)
<type 'str'>
>>>
>>> type(int(b))
<type 'int'>
>>>
>>> b = int(b)
Now you can compare a and b with the knowledge that they'd have values of the same type.
File contents are always returned as strings or bytes, so you need to convert (typecast) them accordingly.
The value of 'a' is the integer 75, while the value of 'b' is the string "75". Comparing them directly for equality gives False because they are not the same type. Try casting b to an integer with:
b = int(b)
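Here is a self-contained sketch of the comparison. Two things are assumptions on my part: io.StringIO stands in for the real text file, and I assume the file ends with a newline, which is a common reason the string comparison fails even after calling str() on both sides:

```python
import io

# Stand-ins for the question's objects (assumed for illustration).
products = {i: None for i in range(75)}   # a dictionary with 75 entries
f = io.StringIO("75\n")                   # file text with a trailing newline

raw = f.read()
print(repr(raw))

# String comparison fails because of the extra whitespace.
print(str(len(products)) == raw)

# Converting the stripped text to int compares number with number.
file_count = int(raw.strip())
print(len(products) == file_count)
```

int() already ignores surrounding whitespace, so the strip() call is mostly there to make the intent obvious.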

Strange Python behavior from inappropriate usage of 'is not' comparison?

I (incorrectly?) used 'is not' in a comparison and found this curious behavior:
>>> a = 256
>>> b = int('256')
>>> c = 300
>>> d = int('300')
>>>
>>> a is not b
False
>>> c is not d
True
Obviously I should have used:
>>> a != b
False
>>> c != d
False
But it worked for a long time due to small-valued test cases, until I happened to use the number 495.
If this is invalid syntax, then why? And shouldn't I at least get a warning?
"is" is not a check of equality of value, but a check that two variables point to the same instance of an object.
ints and strings are confusing here, because is and == can happen to give the same result due to how the internals of the language work.
For small numbers, Python is reusing the object instances, but for larger numbers, it creates new instances for them.
See this:
>>> a=256
>>> b=int('256')
>>> c=300
>>> d=int('300')
>>> id(a)
158013588
>>> id(b)
158013588
>>> id(c)
158151472
>>> id(d)
158151436
which is exactly why a is b, but c isn't d.
Don't use is [not] to compare integers; use == and != instead. Even though is works in current CPython for small numbers due to an optimization, it's unreliable and semantically wrong. The syntax itself is valid, but the benefits of a warning (which would have to be checked on every use of is and could be problematic with subclasses of int) are presumably not worth the trouble.
This is covered elsewhere on SO, but I didn't find it just now.
An int is an object in Python, and CPython caches small integers in the range [-5, 256] by default, so wherever you use an int in that range, you get the same object.
a = 256
b = 256
a is b # True
If you create two integers outside [-5, 256], Python may create two distinct objects (though they have the same value). The result below is what you typically see when each line is entered separately at the interactive prompt.
a = 257
b = 257
a is b # False
In your case, using != instead to compare the value is the right way.
a = 257
b = 257
a != b # False
To understand why this occurs, take a look at the small_ints array (Python-2.6.5/Objects/intobject.c:78) and the _PyInt_Init function (Python-2.6.5/Objects/intobject.c:1292) in the Python sources.
A similar thing can occur with lists:
>>> a = [12]
>>> id_a = id(a)
>>> del(a)
>>> id([1,2,34]) == id_a
True
>>>
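In CPython, the memory of a deleted list can be reused for a new list, so the new object may end up with the same id. This is an implementation detail, not guaranteed behavior.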
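To summarize the thread in code: use == and != for values, and keep is for identity checks such as comparing against None. In this sketch the result of the is line is deliberately left without an expected value, since it depends on the interpreter:

```python
c = 300
d = int('300')

# Value comparison: well defined, independent of caching.
values_equal = (c == d)
print(values_equal)

# Identity comparison: depends on interpreter internals, so do not rely on it.
same_object = (c is d)
print(same_object)

# A typical legitimate use of "is": checking for the None singleton.
result = None
print(result is None)
```

On the question of a warning: Python 3.8 and later emit a SyntaxWarning when is or is not is used with a literal (for example x is 300), but not when both sides are variables, as in the question.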
