Parse Json without quotes in Python

Parse Json without quotes in Python - python

I am trying to parse JSON input as string in Python, not able to parse as list or dict since the JSON input is not in a proper format (Due to limitations in the middleware can't do much here.)
{
"Records": "{Output=[{_fields=[{Entity=ABC , No=12345, LineNo= 1, EffDate=20200630}, {Entity=ABC , No=567, LineNo= 1, EffDate=20200630}]}"
}
I tried json.loads and ast.literal (invalid syntax error).
How can I load this?

The sad answer is: the contents of your "Records" field are simply not JSON. No amount of ad-hoc patching (= to :, adding quotes) will change that. You have to find out the language/format specification for what the producing system emits and write/find a proper parser for that particular format.
As a clutch, and only in the case that the above example already captures all the variability you might see in production data, a much simpler approach based on regular expressions (see package re or edd's pragmatic answer) might be sufficient.

If the producer of the data is consistent, you can start with something like the following, that aims to bridge the JSON gap.
import re
import json
source = {
"Records": "{Output=[{_fields=[{Entity=ABC , No=12345, LineNo= 1, EffDate=20200630}, {Entity=ABC , No=567, LineNo= 1, EffDate=20200630}]}"
}
s = source["Records"]
# We'll start by removing any extraneous white spaces
s2 = re.sub('\s', '', s)
# Surrounding any word with "
s3 = re.sub('(\w+)', '"\g<1>"', s2)
# Replacing = with :
s4 = re.sub('=', ':', s3)
# Lastly, fixing missing closing ], }
## Note that }} is an escaped } for f-string.
s5 = f"{s4}]}}"
>>> json.loads(s5)
{'Output': [{'_fields': [{'Entity': 'ABC', 'No': '12345', 'LineNo': '1', 'EffDate': '20200630'}, {'Entity': 'ABC', 'No': '567', 'LineNo': '1', 'EffDate': '20200630'}]}]}
Follow up with some robust testing and have a nice polished ETL with your favorite tooling.

As i understand you are trying to parse the value of the Records item in the dictionary as JSON, unfortunately you cannot.
The string in that value is not JSON, and you must write a parser that will first parse the string into a JSON string according to the format that the string is written in by yourself. ( We don't know what "middleware" you are talking of unfortunately ).
tldr: Parse it into a JSON string, then parse the JSON into a python dictionary. Read this to find out more about JSON ( Javascript Object Notation ) rules.

There you go. This code will make it valid json:
notjson = """{
"Records": "{Output=[{_fields=[{Entity=ABC , No=12345, LineNo= 1, EffDate=20200630}, {Entity=ABC , No=567, LineNo= 1, EffDate=20200630}]}"
}"""
notjson = notjson.replace("=","':") #adds a singlequote and makes it more valid
notjson = notjson.replace("{","{'")
notjson = notjson.replace(", ",", '")
notjson = notjson.replace("}, '{","}, {")
json = "{" + notjson[2:]
print(json)
print(notjson)

Related

pyparsing syntax tree from named value list

I'd like to parse tag/value descriptions using the delimiters :, and •
E.g. the Input would be:
Name:Test•Title: Test•Keywords: A,B,C
the expected result should be the name value dict
{
"name": "Test",
"title": "Title",
"keywords: "A,B,C"
}
potentially already splitting the keywords in "A,B,C" to a list. (This is a minor detail since the python built in split method of string will happily do this).
Also applying a mapping
keys={
"Name": "name",
"Title": "title",
"Keywords": "keywords",
}
as a mapping between names and dict keys would be helpful but could be a separate step.
I tried the code below https://trinket.io/python3/8dbbc783c7
# pyparsing named values
# Wolfgang Fahl
# 2023-01-28 for Stackoverflow question
import pyparsing as pp
notes_text="Name:Test•Title: Test•Keywords: A,B,C"
keys={
"Name": "name",
"Titel": "title",
"Keywords": "keywords",
}
keywords=list(keys.keys())
runDelim="•"
name_values_grammar=pp.delimited_list(
pp.oneOf(keywords,as_keyword=True).setResultsName("key",list_all_matches=True)
+":"+pp.Suppress(pp.Optional(pp.White()))
+pp.delimited_list(
pp.OneOrMore(pp.Word(pp.printables+" ", exclude_chars=",:"))
,delim=",")("value")
,delim=runDelim).setResultsName("tag", list_all_matches=True)
results=name_values_grammar.parseString(notes_text)
print(results.dump())
and variations of it but i am not even close to the expected result. Currently the dump shows:
['Name', ':', 'Test']
- key: 'Name'
- tag: [['Name', ':', 'Test']]
[0]:
['Name', ':', 'Test']
- value: ['Test']
Seems i don't know how to define the grammar and work on the parseresult in a way to get the needed dict result.
The main questions for me are:
Should i use parse actions?
How is the naming of part results done?
How is the navigation of the resulting tree done?
How is it possible to get the list back from delimitedList?
What does list_all_matches=True achieve - it's behavior seems strange
I searched for answers on the above questions here on stackoverflow and i couldn't find a consistent picture of what to do.
Pyparsing delimited list only returns first element
Finding lists of elements within a string using Pyparsing
PyParsing seems to be a great tool but i find it very unintuitive. There are fortunately lots of answers here so i hope to learn how to get this example working
Trying myself i took a stepwise approach:
First i checked the delimitedList behavior see https://trinket.io/python3/25e60884eb
# Try out pyparsing delimitedList
# WF 2023-01-28
from pyparsing import printables, OneOrMore, Word, delimitedList
notes_text="A,B,C"
comma_separated_values=delimitedList(Word(printables+" ", exclude_chars=",:"),delim=",")("clist")
grammar = comma_separated_values
result=grammar.parseString(notes_text)
print(f"result:{result}")
print(f"dump:{result.dump()}")
print(f"asDict:{result.asDict()}")
print(f"asList:{result.asList()}")
which returns
result:['A', 'B', 'C']
dump:['A', 'B', 'C']
- clist: ['A', 'B', 'C']
asDict:{'clist': ['A', 'B', 'C']}
asList:['A', 'B', 'C']
which looks promising and the key success factor seems to be to name this list with "clist" and the default behavior looks fine.
https://trinket.io/python3/bc2517e25a
shows in more detail where the problem is.
# Try out pyparsing delimitedList
# see https://stackoverflow.com/q/75266188/1497139
# WF 2023-01-28
from pyparsing import printables, oneOf, OneOrMore,Optional, ParseResults, Suppress,White, Word, delimitedList
def show_result(title:str,result:ParseResults):
"""
show pyparsing result details
Args:
result(ParseResults)
"""
print(f"result for {title}:")
print(f" result:{result}")
print(f" dump:{result.dump()}")
print(f" asDict:{result.asDict()}")
print(f" asList:{result.asList()}")
# asXML is deprecated and doesn't work any more
# print(f"asXML:{result.asXML()}")
notes_text="Name:Test•Title: Test•Keywords: A,B,C"
comma_text="A,B,C"
keys={
"Name": "name",
"Titel": "title",
"Keywords": "keywords",
}
keywords=list(keys.keys())
runDelim="•"
comma_separated_values=delimitedList(Word(printables+" ", exclude_chars=",:"),delim=",")("clist")
cresult=comma_separated_values.parseString(comma_text)
show_result("comma separated values",cresult)
grammar=delimitedList(
oneOf(keywords,as_keyword=True)
+Suppress(":"+Optional(White()))
+comma_separated_values
,delim=runDelim
)("namevalues")
nresult=grammar.parseString(notes_text)
show_result("name value list",nresult)
#ogrammar=OneOrMore(
# oneOf(keywords,as_keyword=True)
# +Suppress(":"+Optional(White()))
# +comma_separated_values
#)
#oresult=grammar.parseString(notes_text)
#show_result("name value list with OneOf",nresult)
output:
result for comma separated values:
result:['A', 'B', 'C']
dump:['A', 'B', 'C']
- clist: ['A', 'B', 'C']
asDict:{'clist': ['A', 'B', 'C']}
asList:['A', 'B', 'C']
result for name value list:
result:['Name', 'Test']
dump:['Name', 'Test']
- clist: ['Test']
- namevalues: ['Name', 'Test']
asDict:{'clist': ['Test'], 'namevalues': ['Name', 'Test']}
asList:['Name', 'Test']
while the first result makes sense for me the second is unintuitive. I'd expected a nested result - a dict with a dict of list.
What causes this unintuitive behavior and how can it be mitigated?

Issues with the grammar being that: you are encapsulating OneOrMore in delimited_list and you only want the outer one, and you aren't telling the parser how your data needs to be structured to give the names meaning.
You also don't need the whitespace suppression as it is automatic.
Adding parse_all to the parse_string function will help to see where not everything is being consumed.
name_values_grammar = pp.delimited_list(
pp.Group(
pp.oneOf(keywords,as_keyword=True).setResultsName("key",list_all_matches=True)
+ pp.Suppress(pp.Literal(':'))
+ pp.delimited_list(
pp.Word(pp.printables, exclude_chars=':,').setResultsName('value', list_all_matches=True)
, delim=',')
)
, delim='•'
).setResultsName('tag', list_all_matches=True)
Should i use parse actions? As you can see, you don't technically need to, but you've ended up with a data structure that might be less efficient for what you want. If the grammar gets more complicated, I think using some parse actions would make sense. Take a look below for some examples to map the key names (only if they are found), and cleaning up list parsing for a more complicated grammar.
How is the naming of part results done? By default in a ParseResults object, the last part that is labelled with a name will be returned when you ask for that name. Asking for all matches to be returned using list_all_matches will only work usefully for some simple structures, but it does work. See below for examples.
How is the navigation of the resulting tree done? By default, everything gets flattened. You can use pyparsing.Group to tell the parser not to flatten its contents into the parent list (and therefore retain useful structure and part names).
How is it possible to get the list back from delimitedList? If you don't wrap the delimited_list result in another list then the flattening that is done will remove the structure. Parse actions or Group on the internal structure again to the rescue.
What does list_all_matches=True achieve - its behavior seems strange It is a function of the grammar structure that it seems strange. Consider the different outputs in:
import pyparsing as pp
print(
pp.delimited_list(
pp.Word(pp.printables, exclude_chars=',').setResultsName('word', list_all_matches=True)
).parse_string('x,y,z').dump()
)
print(
pp.delimited_list(
pp.Word(pp.printables, exclude_chars=':,').setResultsName('key', list_all_matches=True)
+ pp.Suppress(pp.Literal(':'))
+ pp.Word(pp.printables, exclude_chars=':,').setResultsName('value', list_all_matches=True)
)
.parse_string('x:a,y:b,z:c').dump()
)
print(
pp.delimited_list(
pp.Group(
pp.Word(pp.printables, exclude_chars=':,').setResultsName('key', list_all_matches=True)
+ pp.Suppress(pp.Literal(':'))
+ pp.Word(pp.printables, exclude_chars=':,').setResultsName('value', list_all_matches=True)
)
).setResultsName('tag', list_all_matches=True)
.parse_string('x:a,y:b,z:c').dump()
)
The first one makes sense, giving you a list of all the tokens you would expect. The third one also makes sense, since you have a structure you can walk. But the second one you end up with two lists that are not necessarily (in a more complicated grammar) going to be easy to match up.
Here's a different way of building the grammar so that it supports quoting strings with delimiters in them so they don't become lists, and keywords that aren't in your mapping. It's harder to do this without parse actions.
import pyparsing as pp
import json
test_string = "Name:Test•Title: Test•Extra: '1,2,3'•Keywords: A,B,C,'D,E',F"
keys={
"Name": "name",
"Title": "title",
"Keywords": "keywords",
}
g_key = pp.Word(pp.alphas)
g_item = pp.Word(pp.printables, excludeChars='•,\'') | pp.QuotedString(quote_char="'")
g_value = pp.delimited_list(g_item, delim=',')
l_key_value_sep = pp.Suppress(pp.Literal(':'))
g_key_value = g_key + l_key_value_sep + g_value
g_grammar = pp.delimited_list(g_key_value, delim='•')
g_key.add_parse_action(lambda x: keys[x[0]] if x[0] in keys else x)
g_value.add_parse_action(lambda x: [x] if len(x) > 1 else x)
g_key_value.add_parse_action(lambda x: (x[0], x[1].as_list()) if isinstance(x[1],pp.ParseResults) else (x[0], x[1]))
key_values = dict()
for k,v in g_grammar.parse_string(test_string, parse_all=True):
key_values[k] = v
print(json.dumps(key_values, indent=2))

Another approach using regular expressions would be:
def _extractByKeyword(keyword: str, string: str) -> typing.Union[str, None]:
"""
Extract the value for the given key from the given string.
designed for simple key value strings without further formatting
e.g.
Title: Hello World
Goal: extraction
For keyword="Goal" the string "extraction would be returned"
Args:
keyword: extract the value associated to this keyword
string: string to extract from
Returns:
str: value associated to given keyword
None: keyword not found in given string
"""
if string is None or keyword is None:
return None
# https://stackoverflow.com/a/2788151/1497139
# value is closure of not space not / colon
pattern = rf"({keyword}:(?P<value>[\s\w,_-]*))(\s+\w+:|\n|$)"
import re
match = re.search(pattern, string)
value = None
if match is not None:
value = match.group('value')
if isinstance(value, str):
value = value.strip()
return value
keys={
"Name": "name",
"Title": "title",
"Keywords": "keywords",
}
notes_text="Name:Test Title: Test Keywords: A,B,C"
lod = {v: _extractByKeyword(k, notes_text) for k,v in keys.items()}
The extraction function was tested with:
import typing
from dataclasses import dataclass
from unittest import TestCase
class TestExtraction(TestCase)
def test_extractByKeyword(self):
"""
tests the keyword extraction
"""
#dataclass
class TestParam:
expected: typing.Union[str, None]
keyword: typing.Union[str, None]
string: typing.Union[str, None]
testParams = [
TestParam("test", "Goal", "Title:Title\nGoal:test\nLabel:title"),
TestParam("test", "Goal", "Title:Title\nGoal:test Label:title"),
TestParam("test", "Goal", "Title:Title\nGoal:test"),
TestParam("test with spaces", "Goal", "Title:Title\nGoal:test with spaces\nLabel:title"),
TestParam("test with spaces", "Goal", "Title:Title\nGoal:test with spaces Label:title"),
TestParam("test with spaces", "Goal", "Title:Title\nGoal:test with spaces"),
TestParam("SQL-DML", "Goal", "Title:Title\nGoal:SQL-DML"),
TestParam("SQL_DML", "Goal", "Title:Title\nGoal:SQL_DML"),
TestParam(None, None, "Title:Title\nGoal:test"),
TestParam(None, "Label", None),
TestParam(None, None, None),
]
for testParam in testParams:
with self.subTest(testParam=testParam):
actual = _extractByKeyword(testParam.keyword, testParam.string)
self.assertEqual(testParam.expected, actual)

For the time being i am using a simple work-around see https://trinket.io/python3/7ccaa91f7e
# Try out parsing name value list
# WF 2023-01-28
import json
notes_text="Name:Test•Title: Test•Keywords: A,B,C"
keys={
"Name": "name",
"Title": "title",
"Keywords": "keywords",
}
result={}
key_values=notes_text.split("•")
for key_value in key_values:
key,value=key_value.split(":")
value=value.strip()
result[keys[key]]=value # could do another split here if need be
print(json.dumps(result,indent=2))
output:
{
"name": "Test",
"title": "Test",
"keywords": "A,B,C"
}

Python parse string environment variables

I would like to parse environment variables from a string.
For example:
envs = parse_env('name="John Doe" age=21 gender=male')
print(envs)
# outputs: {"name": "John Doe", "age": 21, "gender": "male"}
What is the best and most minimalistic way to achieve this?
Thank you.

If you can dictate that your values will never contain the special characters that you use in your input format (namely = and ), this is very easy to do with split:
>>> def parse_env(envs):
... pairs = [pair.split("=") for pair in envs.split(" ")]
... return {var: val for var, val in pairs}
...
>>> parse_env("name=John_Doe age=21 gender=male")
{'name': 'John_Doe', 'age': '21', 'gender': 'male'}
If your special characters mean different things in different contexts (e.g. a = can be the separator between a var and value OR it can be part of a value), the problem is harder; you'll need to use some kind of state machine (e.g. a regex) to break the string into tokens in a way that takes into account the different ways that a character might be used in the string.

You can use Python Dotevn to parse a .env file with your variables. Your file .env
should be something like:
NAME="John Doe"
AGE=21
GENDER="male"
Then you should be able to run:
from dotenv import load_dotenv
load_dotenv(dotenv_path=env_path)
Good luck!

Replace single quotes with double quotes but leave ones within double quotes untouched

The ultimate goal or the origin of the problem is to have a field compatible with in json_extract_path_text Redshift.
This is how it looks right now:
{'error': "Feed load failed: Parameter 'url' must be a string, not object", 'errorCode': 3, 'event_origin': 'app', 'screen_id': '6118964227874465', 'screen_class': 'Promotion'}
To extract field I need from the string in Redshift, I replaced single quotes with double quotes.
The particular record is giving error because inside value of error, there is a single quote there. With that, the string will be a invalid json if those get replaced as well.
So what I need is:
{"error": "Feed load failed: Parameter 'url' must be a string, not object", "errorCode": 3, "event_origin": "app", "screen_id": "6118964227874465", "screen_class": "Promotion"}

Several ways, one is to use the regex module with
"[^"]*"(*SKIP)(*FAIL)|'
See a demo on regex101.com.
In Python:
import regex as re
rx = re.compile(r'"[^"]*"(*SKIP)(*FAIL)|\'')
new_string = rx.sub('"', old_string)
With the original re module, you'd need to use a function and see if the group has been matched or not - (*SKIP)(*FAIL) lets you avoid exactly that.

I tried a regex approach but found it to complicated and slow. So i wrote a simple "bracket-parser" which keeps track of the current quotation mode. It can not do multiple nesting you'd need a stack for that. For my usecase converting str(dict) to proper JSON it works:
example input:
{'cities': [{'name': "Upper Hell's Gate"}, {'name': "N'zeto"}]}
example output:
{"cities": [{"name": "Upper Hell's Gate"}, {"name": "N'zeto"}]}'
python unit test
def testSingleToDoubleQuote(self):
jsonStr='''
{
"cities": [
{
"name": "Upper Hell's Gate"
},
{
"name": "N'zeto"
}
]
}
'''
listOfDicts=json.loads(jsonStr)
dictStr=str(listOfDicts)
if self.debug:
print(dictStr)
jsonStr2=JSONAble.singleQuoteToDoubleQuote(dictStr)
if self.debug:
print(jsonStr2)
self.assertEqual('''{"cities": [{"name": "Upper Hell's Gate"}, {"name": "N'zeto"}]}''',jsonStr2)
singleQuoteToDoubleQuote
def singleQuoteToDoubleQuote(singleQuoted):
'''
convert a single quoted string to a double quoted one
Args:
singleQuoted(string): a single quoted string e.g. {'cities': [{'name': "Upper Hell's Gate"}]}
Returns:
string: the double quoted version of the string e.g.
see
- https://stackoverflow.com/questions/55600788/python-replace-single-quotes-with-double-quotes-but-leave-ones-within-double-q
'''
cList=list(singleQuoted)
inDouble=False;
inSingle=False;
for i,c in enumerate(cList):
#print ("%d:%s %r %r" %(i,c,inSingle,inDouble))
if c=="'":
if not inDouble:
inSingle=not inSingle
cList[i]='"'
elif c=='"':
inDouble=not inDouble
doubleQuoted="".join(cList)
return doubleQuoted

Using Python Parse to get a string of numbers, letters, whitespace, and symbols

I am attempting to parse a log using the Parse library from Python. (https://pypi.python.org/pypi/parse) For my purposes I need to use the type specifiers in the format string, however, some of the data that I am parsing might be a combination of several of those types.
For example:
"4.56|test-1 Cool|dog"
I can parse the number of the front using the format specifier g (general number) and w (word) for "dog" at the end. However, the middle phrase "test-1 Cool" is a number, letters, whitespace, and a dash. Using any of the specifiers alone doesn't seem to work (have tried W,w,s, and S). I would like to extract that phrase as a string.
Without the problem phrase, I would just do this:
test = "|4.56|dog|"
result = parse('|{number:g}|{word:w}|', test)
EDIT: I have had some success using a custom type conversion shown below:
def SString(string):
return string
test = "|4.56|test-1 Cool|dog|"
result = parse('|{number:g}|{other:SString}|{word:w}|', test, dict(SString=SString))

You can do that with some code like this:
from parse import *
test = "4.56|test-1 Cool|dog"
result = parse('{number:g}|{other}|{word:w}', test)
print result
#<Result () {'other': 'test-1 Cool', 'word': 'dog', 'number': 4.56}>
Also, for type checking you can use re module (for example):
from parse import *
import re
def SString(string):
if re.match('\w+-\d+ \w+',string):
return string
else:
return None
test = "|4.56|test-1 Cool|dog|"
result = parse('|{number:g}|{other:SString}|{word:w}|', test, dict(SString=SString))
print(result)
#<Result () {'other': 'test-1 Cool', 'word': 'dog', 'number': 4.56}>
test = "|4.56|t3est Cool|dog|"
result = parse('|{number:g}|{other:SString}|{word:w}|', test, dict(SString=SString))
print(result)
#<Result () {'other': None, 'word': 'dog', 'number': 4.56}>

How about trying
test.split("|")

Regex to reformat improper JSON data

I have some data that are not properly saved in an old database. I am moving the system to a new database and reformatting the old data as well. The old data looks like this:
a:10:{
s:7:"step_no";s:1:"1";
s:9:"YOUR_NAME";s:14:"Firtname Lastname";
s:11:"CITIZENSHIP"; s:7:"Indian";
s:22:"PROPOSE_NAME_BUSINESS1"; s:12:"ABC Limited";
s:22:"PROPOSE_NAME_BUSINESS2"; s:15:"XYZ Investment";
s:22:"PROPOSE_NAME_BUSINESS3";s:0:"";
s:22:"PROPOSE_NAME_BUSINESS4";s:0:"";
s:23:"PURPOSE_NATURE_BUSINESS";s:15:"Some dummy content";
s:15:"CAPITAL_COMPANY";s:24:"20 Million Capital";
s:14:"ANOTHER_AMOUNT";s:0:"";
}
I want the new look to be in proper JSON format so I can read in python jut like this:
data = {
"step_no": "1",
"YOUR_NAME":"Firtname Lastname",
"CITIZENSHIP":"Indian",
"PROPOSE_NAME_BUSINESS1":"ABC Limited",
"PROPOSE_NAME_BUSINESS2":"XYZ Investment",
"PROPOSE_NAME_BUSINESS3":"",
"PROPOSE_NAME_BUSINESS4":"",
"PURPOSE_NATURE_BUSINESS":"Some dummy content",
"CAPITAL_COMPANY":"20 Million Capital",
"ANOTHER_AMOUNT":""
}
I am thinking using regex to strip out the unwanted parts and reformatting the content using the names in caps would work but I don't know how to go about this.

Regexes would be the wrong approach here. There is no need, and the format is a little more complex than you assume it is.
You have data in the PHP serialize format. You can trivially deserialise it in Python with the phpserialize library:
import phpserialize
import json
def fixup_php_arrays(o):
if isinstance(o, dict):
if isinstance(next(iter(o), None), int):
# PHP has no lists, only mappings; produce a list for
# a dictionary with integer keys to 'repair'
return [fixup_php_arrays(o[i]) for i in range(len(o))]
return {k: fixup_php_arrays(v) for k, v in o.items()}
return o
json.dumps(fixup_php(phpserialize.loads(yourdata, decode_strings=True)))
Note that PHP strings are byte strings, not Unicode text, so especially in Python 3 you'd have to decode your key-value pairs after the fact if you want to be able to re-encode to JSON. The decode_strings=True flag takes care of this for you. The default is UTF-8, pass in an encoding argument to pick a different codec.
PHP also uses arrays for sequences, so you may have to convert any decoded dict object with integer keys to a list first, which is what the fixup_php_arrays() function does.
Demo (with repaired data, many string lengths were off and whitespace was added):
>>> import phpserialize, json
>>> from pprint import pprint
>>> data = b'a:10:{s:7:"step_no";s:1:"1";s:9:"YOUR_NAME";s:18:"Firstname Lastname";s:11:"CITIZENSHIP";s:6:"Indian";s:22:"PROPOSE_NAME_BUSINESS1";s:11:"ABC Limited";s:22:"PROPOSE_NAME_BUSINESS2";s:14:"XYZ Investment";s:22:"PROPOSE_NAME_BUSINESS3";s:0:"";s:22:"PROPOSE_NAME_BUSINESS4";s:0:"";s:23:"PURPOSE_NATURE_BUSINESS";s:18:"Some dummy content";s:15:"CAPITAL_COMPANY";s:18:"20 Million Capital";s:14:"ANOTHER_AMOUNT";s:0:"";}'
>>> pprint(phpserialize.loads(data, decode_strings=True))
{'ANOTHER_AMOUNT': '',
'CAPITAL_COMPANY': '20 Million Capital',
'CITIZENSHIP': 'Indian',
'PROPOSE_NAME_BUSINESS1': 'ABC Limited',
'PROPOSE_NAME_BUSINESS2': 'XYZ Investment',
'PROPOSE_NAME_BUSINESS3': '',
'PROPOSE_NAME_BUSINESS4': '',
'PURPOSE_NATURE_BUSINESS': 'Some dummy content',
'YOUR_NAME': 'Firstname Lastname',
'step_no': '1'}
>>> print(json.dumps(phpserialize.loads(data, decode_strings=True), sort_keys=True, indent=4))
{
"ANOTHER_AMOUNT": "",
"CAPITAL_COMPANY": "20 Million Capital",
"CITIZENSHIP": "Indian",
"PROPOSE_NAME_BUSINESS1": "ABC Limited",
"PROPOSE_NAME_BUSINESS2": "XYZ Investment",
"PROPOSE_NAME_BUSINESS3": "",
"PROPOSE_NAME_BUSINESS4": "",
"PURPOSE_NATURE_BUSINESS": "Some dummy content",
"YOUR_NAME": "Firstname Lastname",
"step_no": "1"
}

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Parse Json without quotes in Python - python

Related

pyparsing syntax tree from named value list

Python parse string environment variables

Replace single quotes with double quotes but leave ones within double quotes untouched

Using Python Parse to get a string of numbers, letters, whitespace, and symbols

Regex to reformat improper JSON data

Categories

Resources