Python convert Hexadecimal Character to Respective Symbols? [duplicate] - python

This question already has answers here:
How do I url unencode in Python?
(3 answers)
Closed 5 years ago.
I'm trying to find a python package/sample code that can convert the following input "why+don%27t+you+want+to+talk+to+me" to "why+don't+you+want+to+talk+to+me".
That is, converting hex codes like %27 to their respective symbols ('). I could hardcode the whole hex character set and then swap each code for its symbol. However, I want a simple and scalable solution.
Thanks for helping

You can use urllib's unquote function.
import urllib.parse
urllib.parse.unquote('why+don%27t+you+want+to+talk+to+me')
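As a minimal sketch of the answer above (Python 3 assumed; the variable names are illustrative), note that unquote only decodes %xx escapes and leaves '+' untouched, which matches the output the question asks for. If the '+' signs should also become spaces, unquote_plus from the same module does that:

```python
from urllib.parse import unquote, unquote_plus

encoded = "why+don%27t+you+want+to+talk+to+me"

# unquote() decodes %xx escapes only; '+' characters are left as they are.
decoded = unquote(encoded)

# unquote_plus() additionally turns '+' into a space (form-style encoding).
decoded_plus = unquote_plus(encoded)

print(decoded)
print(decoded_plus)
```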

Related

String format: getting u'' inside the final string [duplicate]

This question already has answers here:
Removing u in list
(8 answers)
Closed 3 years ago.
I have a list of id's and I am trying the following below:
final = "ids: {}".format(tuple(id_list))
For some reason I am getting the following:
"ids: (u'213231231', u'weqewqqwe')"
Could anyone help out on why the u is coming inside my final string. When I am trying the same in another environment, I get the output without the u''. Any specific reason for this?
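These are unicode strings in Python 2. Formatting a tuple uses the repr of each element, and in Python 2 the repr of a unicode string carries the u prefix; in Python 3 it does not, which is a likely reason the other environment behaves differently.
To get plain string literals, you can first map the elements with str: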
>>> final = "ids: {}".format(tuple(map(str, id_list)))
>>> final
"ids: ('213231231', 'weqewqqwe')"
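An alternative sketch that does not depend on the tuple's repr at all: build the quoted, comma-separated part explicitly. This is not the answerer's method, just one way to get the same text in both Python 2 and Python 3 for ASCII ids; the sample values are the ones from the question.

```python
id_list = [u"213231231", u"weqewqqwe"]  # sample values taken from the question

# Quote each id ourselves instead of relying on repr(), which is what
# adds the u prefix in Python 2.
quoted = ", ".join("'{}'".format(item) for item in id_list)
final = "ids: ({})".format(quoted)

print(final)
```

Unlike tuple formatting, this does not add a trailing comma when the list has a single element.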

How to make query string readable in Python? [duplicate]

This question already has answers here:
Decode escaped characters in URL
(5 answers)
Closed 5 years ago.
How to make this string readable in Python 2.7?
%D0%9A%D0%BE%D0%BD%D1%86%D0%B5%D0%BF%D1%86%D0%B8%D1%8F_%D0%A4%D0%B5%D0%B4%D0%B5%D1%80%D0%B0%D0%BB%D1%8C%D0%BD%D0%BE%D0%B9_%D1%86%D0%B5%D0%BB%D0%B5%D0%B2%D0%BE%D0%B9_%D0%BF%D1%80%D0%BE%D0%B3%D1%80%D0%B0%D0%BC%D0%BC%D1%8B_%D1%80%D0%B0%D0%B7%D0%B2%D0%B8%D1%82%D0%B8%D1%8F_%D0%BE%D0%B1%D1%80%D0%B0%D0%B7%D0%BE%D0%B2%D0%B0%D0%BD%D0%B8%D1%8F_%D0%BD%D0%B0_2016-2020_%D0%B3%D0%B3
This string contains Cyrillic symbol and it's a part of a URL (a query string parameter).
Use urllib.unquote from the standard library. The documentation describes it as follows:
urllib.unquote(string)
Replace %xx escapes by their single-character equivalent.
Example: unquote('/%7Econnolly/') yields '/~connolly/'.
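A hedged sketch for Python 3, where the function lives in urllib.parse and decodes the escaped bytes as UTF-8 by default. In Python 2.7, urllib.unquote returns a byte string, so a further .decode('utf-8') would be needed to get readable Cyrillic. The sample below is a shortened version of the string from the question, and the parameter name title is hypothetical.

```python
from urllib.parse import parse_qs, unquote

# Shortened sample: the first word of the question's string plus the years.
escaped = "%D0%9A%D0%BE%D0%BD%D1%86%D0%B5%D0%BF%D1%86%D0%B8%D1%8F_2016-2020"

# Decode a single escaped value; UTF-8 is the default encoding.
readable = unquote(escaped)

# When the value is part of a full query string, parse_qs decodes it too.
# "title" is a made-up parameter name used only for illustration.
params = parse_qs("title=" + escaped)

print(readable)
print(params)
```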

encoding string that has been decoded with %' to unicode [duplicate]

This question already has answers here:
Transform URL string into normal string in Python (%20 to space etc)
(3 answers)
Url decode UTF-8 in Python
(5 answers)
Decode escaped characters in URL
(5 answers)
Closed 5 years ago.
An HTML POST encoded my string like this:
Ostrołęka => Ostro%C5%82%C4%99ka
How do I decode it back into readable form in Python?
Sorry for possible duplicate.
EDIT: Solution in 'possible duplicate' doesn't solve above problem
Python 2:
from urllib import unquote
x = unquote('Ostro%C5%82%C4%99ka')
Python 3:
from urllib.parse import unquote
x = unquote('Ostro%C5%82%C4%99ka')
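To make the difference between the two versions concrete, here is a small Python 3 sketch. unquote_to_bytes shows the raw UTF-8 bytes, which is roughly what Python 2's unquote hands back and why an explicit .decode('utf-8') is needed there; quote goes the other way.

```python
from urllib.parse import quote, unquote, unquote_to_bytes

encoded = "Ostro%C5%82%C4%99ka"

# Python 3: unquote() decodes the escaped bytes as UTF-8 by default.
text = unquote(encoded)

# The raw bytes behind the escapes, decoded by hand.
raw = unquote_to_bytes(encoded)
text_from_bytes = raw.decode("utf-8")

# Round trip: quote() percent-encodes the UTF-8 bytes again.
round_trip = quote(text)

print(text, raw, round_trip)
```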

Is there a way to convert unicode to the nearest ASCII equivalent? [duplicate]

This question already has answers here:
Convert a Unicode string to a string in Python (containing extra symbols)
(12 answers)
Closed 7 years ago.
To give an example from Turkish, "şğüı" should become "sgui".
I'm sure each language has its own conversion rules, and sometimes a character might be converted to multiple ASCII characters, like "alpha"/"phi" etc.
I'm wondering whether there is a library or method that performs this conversion.
What you are asking is called transliteration.
Try the Unidecode library.
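Unidecode is a third-party package, so the sketch below deliberately swaps in a different, standard-library technique: decompose each character with unicodedata and drop the combining marks. It is only a rough approximation of transliteration. Characters with no decomposition (such as the Turkish dotless ı) need an explicit fallback, and anything still outside ASCII is silently dropped. The names to_ascii and FALLBACK are made up for this example.

```python
import unicodedata

# Hypothetical fallback table for characters that have no decomposition.
FALLBACK = {"\u0131": "i"}  # Turkish dotless i

def to_ascii(text):
    """Approximate ASCII form: strip accents, apply fallbacks, drop the rest."""
    text = "".join(FALLBACK.get(ch, ch) for ch in text)
    # NFKD splits e.g. 's with cedilla' into 's' plus a combining cedilla.
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Anything still non-ASCII is discarded rather than transliterated.
    return stripped.encode("ascii", "ignore").decode("ascii")

print(to_ascii("şğüı"))
```

For multi-character mappings such as Greek letters to "alpha" or "phi", a dedicated transliteration library like the one suggested above is the better fit.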

Replacing HTML representation to ascii using Python [duplicate]

This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
Decode HTML entities in Python string?
I have parsed some HTML text, but some punctuation marks, such as the apostrophe, are replaced by ’. How do I convert them back to `?
P.S: I am using Python/Feedparser
Thanks
The PSF Wiki has some ways of doing it. Here is one way:
import htmllib  # Python 2 only

def unescape(s):
    p = htmllib.HTMLParser(None)
    p.save_bgn()
    p.feed(s)
    return p.save_end()
See http://wiki.python.org/moin/EscapingHtml
This helped me:
import HTMLParser  # Python 2 module name
hparser = HTMLParser.HTMLParser()
new_text = hparser.unescape(raw_text)
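Both answers above target Python 2. As a sketch for Python 3, where htmllib no longer exists, the standard library offers html.unescape, which handles numeric and named entities. The sample text is invented for illustration.

```python
import html

# Invented sample containing a numeric and a named entity.
raw_text = "It&#8217;s a feed title &amp; more"

# html.unescape() converts character references to their Unicode characters.
new_text = html.unescape(raw_text)

print(new_text)
```

Note that &#8217; decodes to a typographic right single quotation mark, not the ASCII apostrophe; replacing it with ' afterwards is a separate step if plain ASCII is wanted.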
