Decode %xx in Python 3.6 [duplicate] - python

This question already has answers here:
Transform URL string into normal string in Python (%20 to space etc)
(3 answers)
Closed 4 years ago.
I have
%64%6f%63%75%6d%65%6e%74%2e%77%72%69%74%65%28%27%3c%61%20%68%72%65%66%3d%22%6d%61%69%6c%74%6f%3a%62%65%6e%2e%61%6e%67%65%72%40%6b%6e%6f%62%62%65%2e%63%6f%6d%22%20%72%65%6c%3d%22%6e%6f%69%6e%64%65%78%2c%20%6e%6f%66%6f%6c%6c%6f%77%22%3e%62%65%6e%2e%61%6e%67%65%72%40%6b%6e%6f%62%62%65%2e%63%6f%6d%3c%2f%61%3e%27%29%3b
It's from a JavaScript tag that I scraped.
Unfortunately, none of the solutions in Javascript unescape() vs. Python urllib.unquote() seem to work in Python 3.

In Python 3, unquote() lives in the urllib.parse module:
>>> from urllib.parse import unquote
>>> unquote('%64%6f%63%75%6d%65%6e%74%2e')
'document.'
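As a minimal sketch of the same call with the encoding spelled out: unquote() decodes the escaped bytes as UTF-8 by default. If the string was produced by JavaScript's escape(), non-ASCII characters below 256 are written as Latin-1 bytes, so encoding='latin-1' may be needed. That origin is an assumption here, and the variable names are only illustrative.

```python
from urllib.parse import unquote

# Shortened sample of the scraped string (illustrative variable name).
scraped = '%64%6f%63%75%6d%65%6e%74%2e%77%72%69%74%65%28'

decoded = unquote(scraped)  # UTF-8 is the default encoding
print(decoded)

# Assumption: input produced by JavaScript escape(), which emits Latin-1 bytes.
latin1_decoded = unquote('caf%E9', encoding='latin-1')
print(latin1_decoded)
```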

Related

Python - How to convert HTML entity to UTF-8 [duplicate]

This question already has answers here:
Decode HTML entities in Python string?
(6 answers)
Closed 3 years ago.
In Python 2.7, I want to convert strings like
"&#8364;", "&#380;"
and similar to UTF-8 strings.
How do I do it?
Python 3:
>>> import html
>>> html.unescape('&copy;')
'©'
>>> html.unescape('&#8364;')
'€'
>>> html.unescape('&#380;')
'ż'
unescape() is in the standard-library html module (Python 3.4 and later).
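A short sketch of how this might look when several kinds of character reference are mixed. The sample list and variable names are made up for illustration. Note that unescape() returns a str, so getting UTF-8 bytes is a separate encode() step.

```python
import html

# Named, decimal and hexadecimal references all go through unescape().
samples = ['&copy;', '&euro;', '&#8364;', '&#380;', '&#x17C;']
decoded = [html.unescape(s) for s in samples]
print(decoded)

# The result is a str; encode explicitly if UTF-8 bytes are needed.
utf8_bytes = html.unescape('&#380;').encode('utf-8')
print(utf8_bytes)
```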

Python convert Hexadecimal Character to Respective Symbols? [duplicate]

This question already has answers here:
How do I url unencode in Python?
(3 answers)
Closed 5 years ago.
I'm trying to find a python package/sample code that can convert the following input "why+don%27t+you+want+to+talk+to+me" to "why+don't+you+want+to+talk+to+me".
That is, converting hex codes like %27 to the characters they stand for (here '). I could hardcode the whole hex character set and swap each code for its symbol, but I want a simple and scalable solution.
Thanks for helping
You can use urllib's unquote function.
import urllib.parse
urllib.parse.unquote('why+don%27t+you+want+to+talk+to+me')
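One detail worth checking against your data: unquote() leaves '+' untouched, which matches the output asked for in the question. If the string came from a form submission or query string, where '+' stands for a space, unquote_plus() may be what you want instead. A small sketch comparing the two (variable names are illustrative):

```python
from urllib.parse import unquote, unquote_plus

raw = 'why+don%27t+you+want+to+talk+to+me'

kept_plus = unquote(raw)       # only %xx escapes are decoded
as_spaces = unquote_plus(raw)  # '+' is also turned into a space
print(kept_plus)
print(as_spaces)
```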

encoding string that has been decoded with %' to unicode [duplicate]

This question already has answers here:
Transform URL string into normal string in Python (%20 to space etc)
(3 answers)
Url decode UTF-8 in Python
(5 answers)
Decode escaped characters in URL
(5 answers)
Closed 5 years ago.
An HTML POST request percent-encoded my string like this:
Ostrołęka => Ostro%C5%82%C4%99ka
How do I decode it back into readable form in Python?
Sorry for the possible duplicate.
EDIT: The solution in the 'possible duplicate' doesn't solve the above problem.
Python 2:
from urllib import unquote
x = unquote('Ostro%C5%82%C4%99ka')  # byte string; x.decode('utf-8') gives unicode
Python 3:
from urllib.parse import unquote
x = unquote('Ostro%C5%82%C4%99ka')  # str 'Ostrołęka'; UTF-8 is the default
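To see why this works, here is a hedged Python 3 round trip: quote() percent-encodes the UTF-8 bytes of the string, and unquote() reverses it. unquote_to_bytes() is shown only to expose the intermediate bytes. Variable names are illustrative.

```python
from urllib.parse import quote, unquote, unquote_to_bytes

original = 'Ostrołęka'

encoded = quote(original)               # percent-encode the UTF-8 bytes
raw_bytes = unquote_to_bytes(encoded)   # undo the escaping, keep raw bytes
decoded = unquote(encoded)              # undo the escaping and decode as UTF-8

print(encoded)
print(raw_bytes)
print(decoded)
```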

How to convert `%xx` code in URL back to the corresponding UTF-8 character [duplicate]

This question already has answers here:
How do I url unencode in Python?
(3 answers)
Closed 8 years ago.
For example, I want to convert
'http://en.wikipedia.org/wiki/Ana%C3%AFs_Croze'
to
u'http://en.wikipedia.org/wiki/Anaïs_Croze'
How to do this in Python?
>>> import urllib2
>>> print urllib2.unquote('http://en.wikipedia.org/wiki/Ana%C3%AFs_Croze')
http://en.wikipedia.org/wiki/Anaïs_Croze
The above code as a runnable 'bunk' http://codebunk.com/bunk#-Iy8_GcBQ02jlMauuYP4
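The snippet above is Python 2 only (print statement, urllib2). A rough Python 3 equivalent, as a sketch: unquote() from urllib.parse handles the whole URL, and urlsplit() can be used first if only the path should be decoded. Variable names are illustrative.

```python
from urllib.parse import unquote, urlsplit

url = 'http://en.wikipedia.org/wiki/Ana%C3%AFs_Croze'

whole = unquote(url)                     # decode every %xx escape in the URL
path_only = unquote(urlsplit(url).path)  # decode just the path component

print(whole)
print(path_only)
```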

Replacing HTML representation to ascii using Python [duplicate]

This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
Decode HTML entities in Python string?
I have parsed some HTML text, but some punctuation, such as the apostrophe, comes out as &#8217;. How do I convert it back to '?
P.S.: I am using Python/Feedparser.
Thanks
The PSF Wiki has some ways of doing it. Here is one way (Python 2):
import htmllib
def unescape(s):
    p = htmllib.HTMLParser(None)
    p.save_bgn()
    p.feed(s)
    return p.save_end()
See http://wiki.python.org/moin/EscapingHtml
This helped me
import HTMLParser
hparser=HTMLParser.HTMLParser()
new_text=hparser.unescape(raw_text)
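Both answers above target Python 2: htmllib does not exist in Python 3, and HTMLParser.unescape() was removed in Python 3.9. A minimal Python 3 sketch uses html.unescape() instead. The sample text is made up, and the final replace() is only an assumption about wanting a plain ASCII apostrophe.

```python
import html

raw_text = 'It&#8217;s here'  # illustrative input with a numeric entity

new_text = html.unescape(raw_text)            # entity becomes U+2019
ascii_text = new_text.replace('\u2019', "'")  # optional: plain ASCII apostrophe

print(new_text)
print(ascii_text)
```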
