I have this code:
toto = 'récépissé.pdf'.encode('utf-8')
print(toto)
=> b'r\xc3\xa9c\xc3\xa9piss\xc3\xa9.pdf'
Of course it returns a bytes object, but I still want a string containing the encoded result.
I want a string like this:
'r\xc3\xa9c\xc3\xa9piss\xc3\xa9.pdf'
If I try to get a string by using str() or decode(), it reverts to the initial value.
PS: the final purpose is to pass string data in a header for the Dropbox API.
Probably this is what you are looking for:
toto = 'récépissé.pdf'.encode('utf-8')
print(str(toto).split("'")[1])
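A minimal sketch of two ways to read "a string with the content of the encoded result", assuming Python 3. The variable names are illustrative and not from the thread. The first keeps the escape sequences as literal backslash text; the second maps each byte to one character with the same code point, which may be what a header value needs.

```python
# Assumes Python 3; variable names are illustrative only.
name = 'récépissé.pdf'
encoded = name.encode('utf-8')

# Option 1: the text of the bytes literal, with real backslash characters.
# repr(encoded) looks like b'...', so drop the leading b' and the trailing '.
escaped_text = repr(encoded)[2:-1]

# Option 2: one character per byte (latin-1 maps byte N to code point N).
per_byte_text = encoded.decode('latin-1')

print(escaped_text)
print(len(per_byte_text) == len(encoded))
```

Which option fits depends on what the receiving side expects, so treat both as starting points rather than a definitive answer.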
I am writing a program to call an API. I am trying to convert my data payload into JSON, so I am using json.loads() to achieve this.
However, I have encountered the following problem.
I set my variable as follows:
apiVar = [
    "https://some.url.net/api/call",  # url
    '{"payload1":"email#user.net", "payload2":"stringPayload"}',  # payload
    {"Content-type": "application/json", "Accept": "text/plain"}  # headers
]
Then I tried to convert the value of apiVar[1] into a JSON object.
jsonObj = json.loads(apiVar[1])
However, instead of giving me output like the following:
{"payload1":"email#user.net", "payload2":"stringPayload"}
It gives me this instead:
{'payload1':'email#user.net', 'payload2':'stringPayload'}
I know for sure that this is not a valid JSON format. What I would like to know is, why does this happen? I tried searching for a solution but was not able to find anything. All the code examples suggest it should have given me double quotes instead.
How should I fix it so that it gives the double-quoted output?
json.loads() takes a JSON string and converts it into the equivalent Python data structure, which in this case is a dict containing strings. And Python strings display in single quotes by default.
If you want to convert a Python data structure to JSON, use json.dumps(), which will return a string. Or if you're outputting straight to a file, use json.dump().
In any case, your payload is already valid JSON, so the only reason to load it is if you want to make changes to it before calling the API.
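A short sketch of the round trip described above, using the payload string from the question. The edit to "payload2" is only there to show a reason for loading the JSON at all.

```python
import json

payload = '{"payload1":"email#user.net", "payload2":"stringPayload"}'

obj = json.loads(payload)       # JSON text -> Python dict
print(type(obj))                # a dict; Python shows its strings in single quotes

obj["payload2"] = "changed"     # optional change before calling the API

text = json.dumps(obj)          # Python dict -> JSON text with double quotes
print(text)
```

If no change is needed, the original payload string can be sent as-is.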
You need to use json.dumps to convert the object back into JSON format.
The string with single quotes that you are referencing is probably the output of str() or repr(), which simply shows the data as a Python object (a dictionary), not as JSON. Try taking a look at this:
print(type(jsonObj))
print(str(jsonObj))
print(json.dumps(jsonObj))
I have a string s, its contents are variable. How can I make it a raw string? I'm looking for something similar to the r'' method.
I believe what you're looking for is the "unicode_escape" codec (or "string-escape" in Python 2). For example, if you have a variable that you want to turn into a 'raw string':
a = '\x89'
a.encode('unicode_escape')
b'\\x89'
Note: use string-escape for Python 2.x. In Python 3, encode() returns bytes, so call .decode() on the result if you need a str.
I was searching for a similar solution and found the solution via:
casting raw strings python
Raw strings are not a different kind of string. They are a different way of describing a string in your source code. Once the string is created, it is what it is.
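Since strings in Python are immutable, you cannot "make it" anything different. You can, however, create a new string from s. Note that the following returns a string equal to s, because the r prefix only affects how the literal '{}' itself is parsed:
raw_s = r'{}'.format(s)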
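To illustrate the point that a raw string is only a way of writing a literal, here is a small sketch, assuming Python 3. It shows that formatting an existing string through a raw literal leaves it unchanged, while the unicode_escape codec produces a new string in which control characters are spelled out as escape sequences.

```python
# A raw literal and a regular literal can describe the same string.
raw_literal = r'C:\new\table'
plain_literal = 'C:\\new\\table'
print(raw_literal == plain_literal)

# s contains a real tab character.
s = 'col1\tcol2'

# Formatting through a raw literal does not change the content of s.
formatted = r'{}'.format(s)
print(formatted == s)

# unicode_escape builds a new string with a backslash and a 't' instead of the tab.
escaped = s.encode('unicode_escape').decode('ascii')
print(escaped)
```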
As of Python 3.6, you can use the following (similar to @slashCoder):
def to_raw(string):
    return fr"{string}"
my_dir = "C:\data\projects"
to_raw(my_dir)
yields 'C:\\data\\projects'. I'm using it on a Windows 10 machine to pass directories to functions.
Raw strings apply only to string literals. They exist so that you can more conveniently express strings that would be modified by escape sequence processing. This is most especially useful when writing out regular expressions, or other forms of code in string literals. If you want a unicode string without escape processing in Python 2, just prefix it with ur, like ur'somestring'.
For Python 3, the way to do this that doesn't add double backslashes and simply preserves \n, \t, etc. is:
a = 'hello\nbobby\nsally\n'
escaped = a.encode('unicode-escape').decode().replace('\\\\', '\\')
print(escaped)
Which gives a value that can be written as CSV:
hello\nbobby\nsally\n
There doesn't seem to be a solution for other special characters, however, that may get a single \ before them. It's a bummer. Solving that would be complex.
For example, to serialize a pandas.Series containing a list of strings with special characters into a text file in the format BERT expects, with a CR between each sentence and a blank line between each document:
with open('sentences.csv', 'w') as f:
    current_idx = 0
    for idx, doc in sentences.items():
        # Insert a newline to separate documents
        if idx != current_idx:
            f.write('\n')
        # Write each sentence exactly as it appeared, one line each
        for sentence in doc:
            f.write(sentence.encode('unicode-escape').decode().replace('\\\\', '\\') + '\n')
This outputs (for the Github CodeSearchNet docstrings for all languages tokenized into sentences):
Makes sure the fast-path emits in order.
@param value the value to emit or queue up\n@param delayError if true, errors are delayed until the source has terminated\n@param disposable the resource to dispose if the drain terminates
Mirrors the one ObservableSource in an Iterable of several ObservableSources that first either emits an item or sends\na termination notification.
Scheduler:\n{@code amb} does not operate by default on a particular {@link Scheduler}.
@param the common element type\n@param sources\nan Iterable of ObservableSource sources competing to react first.
A subscription to each source will\noccur in the same order as in the Iterable.
@return an Observable that emits the same sequence as whichever of the source ObservableSources first\nemitted an item or sent a termination notification\n@see ReactiveX operators documentation: Amb
...
Just format like that:
s = "your string"; raw_s = r'{0}'.format(s)
With a small correction to @Jolly1234's answer, here is the code:
raw_string=path.encode('unicode_escape').decode()
s = "hel\nlo"
raws = '%r' % s  # conversion to raw string
# print(raws) will print 'hel\nlo' with single quotes.
print(raws[1:-1])  # will print hel\nlo without single quotes.
# raws[1:-1] slices off the surrounding quotes
The solution that worked for me was:
fr"{original_string}"
Suggested in the comments by @ChemEnger
I suppose the repr function can help you:
s = 't\n'
repr(s)
"'t\\n'"
repr(s)[1:-1]
't\\n'
Just simply use the encode function.
my_var = 'hello'
my_var_bytes = my_var.encode()
print(my_var_bytes)
And then to convert it back to a regular string, do this:
my_var_bytes = b'hello'
my_var = my_var_bytes.decode()
print(my_var)
--EDIT--
The above does not make the string raw but instead encodes it to bytes and decodes it.
I'm having an issue with inserting text into a specific element in an XML tree.
My goal is to take an image, convert it to base64, then insert the base64 string into an element.
Below is my current code:
with open("t.png", "rb") as imageFile:
    str = base64.b64encode(imageFile.read())
tree=ElementTree()
tree = ET.parse('image-template.xml')
root = tree.getroot()
for z in root.iter('body'):
    z.text=(str)
tree.write('new_branding.xml')
If I insert a variable with a shorter character length the code seems to work properly. When I try inserting the long character length of a base64 string I get the following error:
" "cannot serialize %r (type %s)" % (text, type(text).__name__)"
Is there something I need to add to my for loop to insert longer strings?
As it stands, your str variable is actually of type bytes. You can test this by putting a print(type(str)) statement after you declare str. It will print <class 'bytes'>.
To get it working, you should first change the name of your str variable so that you aren't overwriting the built-in str type.
If you change it to, say, imageString, you can then change the line where you declare it to something like imageString = base64.b64encode(imageFile.read()).decode('ascii'). Note that calling .decode() converts your variable from bytes to a string. Wrapping the bytes in str() instead would produce the literal text b'...' in Python 3, including the b prefix and the quotes.
Anyway, that should work, or at least it does on my end.
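A self-contained sketch of the approach, assuming Python 3. It uses made-up bytes and an in-memory XML template (both hypothetical stand-ins for t.png and image-template.xml) so that it runs without any files.

```python
import base64
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the bytes read from an image file.
image_bytes = b'\x89PNG\r\n\x1a\n fake image data'

# b64encode returns bytes; decode them to get a str that ElementTree can serialize.
image_string = base64.b64encode(image_bytes).decode('ascii')

# Hypothetical stand-in for the parsed template.
root = ET.fromstring('<template><body/></template>')
for element in root.iter('body'):
    element.text = image_string

xml_text = ET.tostring(root, encoding='unicode')
print(xml_text)
```

With real files, the same decode step would go right after reading and encoding the image.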
I get this string from stdin.
{u'trades': [Custom(time=1418854520, sn=47998, timestamp=1418854517,
price=322, amount=0.269664, tid=48106793, type=u'ask',
start=1418847319, end=1418847320), Custom(time=1418854520, sn=47997,
timestamp=1418854517, price=322, amount=0.1, tid=48106794,
type=u'ask', start=1418847319, end=1418847320),
Custom(time=1418854520, sn=47996, timestamp=1418854517, price=321.596,
amount=0.011, tid=48106795, type=u'ask', start=1418847319,
end=1418847320)]}
My program fails when I try to access jsonload["trades"]. If I use jsonload[0], I only receive one character: {.
I checked that the problem is not in reading the text from stdin, but I don't know whether it is a problem with the format received (I used the Incursion library) or a problem in my Python code. I have tried many combinations of json.load/s and json.dump/s, but without success.
inputdata = sys.stdin.read()
jsondump = json.dumps(inputdata)
jsonload = json.loads(jsondump)
print jsonload
print type(jsonload) # returns "<type 'unicode'>"
print repr(jsonload) # returns the same string, shown as u" ..same string.... "
for row in jsonload["trades"]: # error here: TypeError: string indices must be integers
You read input data into a string. This is then turned into a JSON encoded string by json.dumps. You then turn it back into a plain string using json.loads. You have not interpreted the original data as JSON at any point.
Try just converting the input data from json:
inputdata = sys.stdin.read()
jsonload = json.loads(inputdata)
However, this will not work because your snippet does not contain valid JSON data. It looks like serialized Python code. You can check the input data using http://jsonlint.com
The use of u'trades' shows me that you have a unicode Python string. The JSON equivalent would be "trades". To convert the Python code you can eval it, but this is a dangerous operation if the data comes from an untrusted source.
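A sketch of the difference, assuming Python 3 and a simplified, hypothetical input that contains only plain literals. ast.literal_eval is a safer alternative to eval for such input, but it would reject the Custom(...) calls in the real data, so this only illustrates the idea.

```python
import ast
import json

# Hypothetical, simplified input: a Python literal, not JSON.
text = "{u'trades': [{u'price': 322, u'type': u'ask'}]}"

# dumps followed by loads just returns the original string unchanged.
round_trip = json.loads(json.dumps(text))
print(type(round_trip))

# json.loads rejects the text because of the single quotes and u prefixes.
try:
    json.loads(text)
    is_json = True
except ValueError:
    is_json = False
print(is_json)

# literal_eval evaluates Python literals only, without running arbitrary code.
data = ast.literal_eval(text)
for row in data['trades']:
    print(row['price'], row['type'])
```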
I am using python v2.7.3 and am trying to get a conversion to work but am having some issues.
This is code that works the way I would like it to:
testString = "\x00\x13\xA2\x00\x40\xAA\x15\x47"
print 'Test String:',testString
This produces the following result
Test String: ¢@ªG
Now I load the same string as above along with some other data:
\x00\x13\xA2\x00\x40\xAA\x15\x47123456
into a SQLite3 database and then pull it from the database as such:
cur.execute('select datafield from databasetable')
rows = cur.fetchall()
if len(rows) == 0:
    print 'Sorry Found Nothing!'
else:
    for row in rows:
        print row[0][:32]
This however produces the following result:
\x00\x13\xA2\x00\x40\xAA\x15\x47
I can not figure out how to convert the database stored string to the bytes string, if that is what it is, as the first snippet of code does. I actually need it to load into a variable in that format so I can pass it to a function for further processing.
The following I have tried:
print "My Addy:",bytes(row[0][:32])
print '{0}'.format(row[0][:32])
...
They all produce the same results...
Please
First, can anyone tell me what format the first results are in? I think it's bytes format but am not sure.
Second, How can I convert the database stored text into
Any help and I would be eternally grateful.
Thanks in advance,
Ed
The problem is that you're not storing the value in the database properly. You want to store a sequence of bytes, but you're storing an escaped version of those bytes instead.
When entering string literals into a programming language, you can use escape codes in your source code to access non-printing characters. That's what you've done in your first example:
testString = "\x00\x13\xA2\x00\x40\xAA\x15\x47"
print 'Test String:',testString
But this is processing done by the Python interpreter as it's reading through your program and executing it.
Change the database column to a binary blob instead of a string, then go back to the code you're using to store the bytes in SQLite3, and have it store the actual bytes ('ABC', 3 bytes) instead of an escaped string ('\x41\x42\x43', 12 characters).
If you truly need to store the escaped string in SQLite3 and convert it at run-time, you might be able to use ast.literal_eval() to evaluate it as a string literal.
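A minimal sketch of that run-time conversion, assuming Python 3 (the question uses Python 2.7, where the details differ) and assuming the stored text contains no quote characters. The stored value below is a hypothetical stand-in for what the database returns.

```python
import ast

# Hypothetical value as read back from the database: 32 literal characters
# (backslash, 'x', two hex digits, repeated 8 times), not 8 bytes.
stored = r"\x00\x13\xA2\x00\x40\xAA\x15\x47"

# Wrap the text in a bytes literal and let literal_eval process the escapes.
# Assumption: the stored text contains no quote characters.
raw = ast.literal_eval("b'" + stored + "'")

print(len(stored), len(raw))
print(raw)
```

Storing the real bytes in a blob column, as suggested above, avoids this step entirely.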