I have a string in a format like this:
json= 5843080158430803{"name":"NAME", "age":"56",}
How can I get {"name":"NAME", "age":"56",} using regex or split in Python (and which is the better method for this)?
Thanks in advance.
Split the string at the first occurrence of {, which gives a list, and take the second element of that list.
We also have to add the { back, because the split function removes it.
json = '5843080158430803{"name":"NAME", "age":"56",}'
json = '{' + json.split('{', 1)[1]
print(json)
Result: {"name":"NAME", "age":"56",}
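If you'd rather not split and then re-add the brace, slicing from the position of the first { should give the same result. This is only a minimal sketch; the helper name strip_prefix is made up for illustration. Note that the extracted text still has a trailing comma, so json.loads would reject it as-is.

```python
def strip_prefix(text):
    """Return text from the first '{' onward (hypothetical helper)."""
    start = text.find('{')
    if start == -1:
        # Nothing that looks like an object in the input.
        raise ValueError("no '{' found in input")
    return text[start:]


raw = '5843080158430803{"name":"NAME", "age":"56",}'
payload = strip_prefix(raw)
print(payload)
```

Unlike the split version, this raises a clear error when there is no { at all instead of failing with an IndexError.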
Perhaps you could split at the first { and then remove the part before it.
I am assuming the json you have above is actually a string. Then you could do:
json_prefix = json.split("{", 1)[0]
json = json.replace(json_prefix, "", 1)
I have a text file with these contents:
{
hello : 1,two:three,four:five,six:seven,
}
How do I remove the last , in the above string?
I want to use the contents as a dictionary later on, but they can't be parsed because of the trailing delimiter.
Code:
import json
d2 = json.load(open('test.txt'))
I can't change the source code, because I am extracting data from a JSON file (written with json.dump) and creating a new JSON file. Is there any way of doing this other than changing the dump or the source code?
This removes the last , in your string without the need for any further imports.
with open('test.txt', 'r') as f:
    s = f.read()
s = s[::-1].replace(',', '', 1)[::-1]
The output of s is then:
{
hello : 1,two:three,four:five,six:seven
}
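The same effect can probably be had without reversing the string twice, by splitting once at the last comma. A small sketch, where drop_last_comma is a hypothetical helper name and the sample text is the file content from the question:

```python
def drop_last_comma(text):
    """Remove only the last ',' in text; leave text unchanged if there is none."""
    head, sep, tail = text.rpartition(',')
    # rpartition returns ('', '', text) when no comma is present.
    return head + tail if sep else text


s = '{\nhello : 1,two:three,four:five,six:seven,\n}'
print(drop_last_comma(s))
```

Like the reversed replace, this removes the last comma wherever it is, even if it is not actually a trailing one.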
A simple replace should work fine.
broken_json = '''{
hello : 1,two:three,four:five,six:seven,
bye : 42,ick:poo,zoo:bar,}'''
j = broken_json.replace(',}', '}').replace(',\n}','\n}')
The result at this point is still not valid JSON, because the dictionary keys need to be quoted; but this is outside the scope of your question so I will not try to tackle that part.
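If the whitespace between the comma and the closing brace can vary, a regular expression may be more forgiving than chaining replace calls. This is a sketch under the assumption that no string value in the data contains a comma directly before a } or ]; strip_trailing_commas is an illustrative name.

```python
import re


def strip_trailing_commas(text):
    # Drop a comma that is followed only by whitespace and a closing } or ].
    return re.sub(r',(\s*[}\]])', r'\1', text)


broken_json = '{\nhello : 1,two:three,four:five,six:seven,\nbye : 42,ick:poo,zoo:bar,}'
fixed = strip_trailing_commas(broken_json)
print(fixed)
```

The comma at the end of the first data line is kept, because it is followed by more data rather than by a closing brace.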
string = '{hello : 1,two:three,four:five,six:seven,}'
length = len(string)
for i in range(length):
    if string[i] == ',':
        # Overwritten at every comma, so the result drops only the last one.
        string2 = string[0:i] + string[i + 1:length]
print(string2)
The output is a string, so convert it to a dict afterwards.
import re
string ='{hello : 1,two:three,four:five,six:seven,}'
match = re.search(r'(.*),[^,]*$', string)
print (match.group(1)+"}")
@selcuk, try this one for a more efficient version.
I have some csv files that I need to convert to json. Some of the float values in the csv are numeric strings (to maintain trailing zeros). When converting to json, all keys and values are wrapped in double quotes. I need the numeric string float values to not have quotes, but maintain the trailing zeros.
Here is a sample of the input csv file:
ACCOUNTNAMEDENORM,DELINQUENCYSTATUS,RETIRED,INVOICEDAYOFWEEK,ID,BEANVERSION,ACCOUNTTYPE,ORGANIZATIONTYPEDENORM,HIDDENTACCOUNTCONTAINERID,NEWPOLICYPAYMENTDISTRIBUTABLE,ACCOUNTNUMBER,PAYMENTMETHOD,INVOICEDELIVERYTYPE,DISTRIBUTIONLIMITTYPE,CLOSEDATE,FIRSTTWICEPERMTHINVOICEDOM,HELDFORINVOICESENDING,FEINDENORM,COLLECTING,ACCOUNTNUMBERDENORM,CHARGEHELD,PUBLICID
John Smith,2.0000000000,0.0000000000,5.0000000000,1234567.0000000000,69.0000000000,1.0000000000,,4321987.0000000000,1,000-000-000-00,10012.0000000000,10002.0000000000,3.0000000000,,1.0000000000,0,,0,000-000-000-00,0,bc:1234346
The json output I am getting is:
{"ACCOUNTNAMEDENORM":"John Smith","DELINQUENCYSTATUS":"2.0000000000","RETIRED":"0.0000000000","INVOICEDAYOFWEEK":"5.0000000000","ID":"1234567.0000000000","BEANVERSION":"69.0000000000","ACCOUNTTYPE":"1.0000000000","ORGANIZATIONTYPEDENORM":null,"HIDDENTACCOUNTCONTAINERID":"4321987.0000000000","NEWPOLICYPAYMENTDISTRIBUTABLE":"1","ACCOUNTNUMBER":"000-000-000-00","PAYMENTMETHOD":"12345.0000000000","INVOICEDELIVERYTYPE":"98765.0000000000","DISTRIBUTIONLIMITTYPE":"3.0000000000","CLOSEDATE":null,"FIRSTTWICEPERMTHINVOICEDOM":"1.0000000000","HELDFORINVOICESENDING":"0","FEINDENORM":null,"COLLECTING":"0","ACCOUNTNUMBERDENORM":"000-000-000-00","CHARGEHELD":"0","PUBLICID":"xx:1234346"}
Here is the code I am using:
import csv
import json
csvfile = open('output2.csv', 'r')
jsonfile = open('output2.json', 'w')
readHeaders = csv.reader(csvfile)
fieldnames = next(readHeaders)
reader = csv.DictReader(csvfile, fieldnames)
for row in reader:
    json.dump(row, jsonfile, separators=(',', ':'))
    jsonfile.write('\n')
I would like the output to have no quotes around float values, similar to the following:
{"ACCOUNTNAMEDENORM":"John Smith","DELINQUENCYSTATUS":2.0000000000,"RETIRED":0.0000000000,"INVOICEDAYOFWEEK":5.0000000000,"ID":1234567.0000000000,"BEANVERSION":69.0000000000,"ACCOUNTTYPE":1.0000000000,"ORGANIZATIONTYPEDENORM":null,"HIDDENTACCOUNTCONTAINERID":4321987.0000000000,"NEWPOLICYPAYMENTDISTRIBUTABLE":"1","ACCOUNTNUMBER":"000-000-000-00","PAYMENTMETHOD":12345.0000000000,"INVOICEDELIVERYTYPE":98765.0000000000,"DISTRIBUTIONLIMITTYPE":3.0000000000,"CLOSEDATE":null,"FIRSTTWICEPERMTHINVOICEDOM":1.0000000000,"HELDFORINVOICESENDING":"0","FEINDENORM":null,"COLLECTING":"0","ACCOUNTNUMBERDENORM":"000-000-000-00","CHARGEHELD":"0","PUBLICID":"xx:1234346"}
Now that I understand your question better from your comments, here's a completely different answer. Note that it doesn't use the json module and just does the necessary processing "manually". Although it could probably be done with the module, getting it to format the Python data types it recognizes by default differently can be fairly involved (I know from experience) compared to the relatively simple logic used below.
Another note: Like your code, this converts each row of the csv file into a valid JSON object and writes each one to the file on a separate line. However, the contents of the resulting file technically won't be valid JSON, because all of these individual objects would need to be comma-separated and enclosed in [ ] brackets (i.e. thereby becoming a valid JSON "Array").
import csv

with open('output2.csv', 'r', newline='') as csvfile, \
     open('output2.json', 'w') as jsonfile:
    for row in csv.DictReader(csvfile):
        newfmt = []
        for field, value in row.items():
            field = '"{}"'.format(field)
            try:
                float(value)
            except ValueError:
                # Not numeric: empty becomes null, anything else a quoted string.
                value = 'null' if value == '' else '"{}"'.format(value)
            else:
                # Integer-looking values stay quoted; only floats are left unquoted.
                try:
                    int(value)
                except ValueError:
                    pass
                else:
                    value = '"{}"'.format(value)
            newfmt.append((field, value))
        json_repr = '{' + ','.join(':'.join(pair) for pair in newfmt) + '}'
        jsonfile.write(json_repr + '\n')
This is the JSON written to the file:
{"ACCOUNTNAMEDENORM":"John Smith","DELINQUENCYSTATUS":2.0000000000,"RETIRED":0.0000000000,"INVOICEDAYOFWEEK":5.0000000000,"ID":1234567.0000000000,"BEANVERSION":69.0000000000,"ACCOUNTTYPE":1.0000000000,"ORGANIZATIONTYPEDENORM":null,"HIDDENTACCOUNTCONTAINERID":4321987.0000000000,"NEWPOLICYPAYMENTDISTRIBUTABLE":"1","ACCOUNTNUMBER":"000-000-000-00","PAYMENTMETHOD":12345.0000000000,"INVOICEDELIVERYTYPE":98765.0000000000,"DISTRIBUTIONLIMITTYPE":3.0000000000,"CLOSEDATE":null,"FIRSTTWICEPERMTHINVOICEDOM":1.0000000000,"HELDFORINVOICESENDING":"0","FEINDENORM":null,"COLLECTING":"0","ACCOUNTNUMBERDENORM":"000-000-000-00","CHARGEHELD":"0","PUBLICID":"bc:1234346"}
Shown again below with added whitespace:
{"ACCOUNTNAMEDENORM": "John Smith",
"DELINQUENCYSTATUS": 2.0000000000,
"RETIRED": 0.0000000000,
"INVOICEDAYOFWEEK": 5.0000000000,
"ID": 1234567.0000000000,
"BEANVERSION": 69.0000000000,
"ACCOUNTTYPE": 1.0000000000,
"ORGANIZATIONTYPEDENORM": null,
"HIDDENTACCOUNTCONTAINERID": 4321987.0000000000,
"NEWPOLICYPAYMENTDISTRIBUTABLE": "1",
"ACCOUNTNUMBER": "000-000-000-00",
"PAYMENTMETHOD": 12345.0000000000,
"INVOICEDELIVERYTYPE": 98765.0000000000,
"DISTRIBUTIONLIMITTYPE": 3.0000000000,
"CLOSEDATE": null,
"FIRSTTWICEPERMTHINVOICEDOM": 1.0000000000,
"HELDFORINVOICESENDING": "0",
"FEINDENORM": null,
"COLLECTING": "0",
"ACCOUNTNUMBERDENORM": "000-000-000-00",
"CHARGEHELD": "0",
"PUBLICID": "bc:1234346"}
Might be a bit of overkill, but with pandas it would be pretty simple:
import pandas as pd
data = pd.read_csv('output2.csv')
data.to_json('output2.json')
One solution is to use a regular expression to see if the string value looks like a float, and convert it to a float if it is.
import re
null = None
j = {"ACCOUNTNAMEDENORM":"John Smith","DELINQUENCYSTATUS":"2.0000000000",
"RETIRED":"0.0000000000","INVOICEDAYOFWEEK":"5.0000000000",
"ID":"1234567.0000000000","BEANVERSION":"69.0000000000",
"ACCOUNTTYPE":"1.0000000000","ORGANIZATIONTYPEDENORM":null,
"HIDDENTACCOUNTCONTAINERID":"4321987.0000000000",
"NEWPOLICYPAYMENTDISTRIBUTABLE":"1","ACCOUNTNUMBER":"000-000-000-00",
"PAYMENTMETHOD":"12345.0000000000","INVOICEDELIVERYTYPE":"98765.0000000000",
"DISTRIBUTIONLIMITTYPE":"3.0000000000","CLOSEDATE":null,
"FIRSTTWICEPERMTHINVOICEDOM":"1.0000000000","HELDFORINVOICESENDING":"0",
"FEINDENORM":null,"COLLECTING":"0","ACCOUNTNUMBERDENORM":"000-000-000-00",
"CHARGEHELD":"0","PUBLICID":"xx:1234346"}
for key in j:
    if j[key] is not None:
        if re.match(r"^\d+?\.\d+?$", j[key]):
            j[key] = float(j[key])
I used null = None here to deal with the "null"s that show up in the JSON. But you can replace 'j' here with each CSV row you're reading, then use this to update the row before writing it back with the floats replacing the strings.
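If you're OK with converting any numerical string into a float, then you can skip the regular expression (the re.match() call). Note that j[key].isnumeric() is not a drop-in replacement, because it returns False for strings that contain a decimal point; wrapping float(j[key]) in a try/except ValueError accepts any numeric string instead.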
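Here is a rough sketch of the "convert anything numeric" variant without a regular expression. The helper name to_float_if_possible and the shortened sample row are only for illustration. Be aware that this also turns "1" and "0" into 1.0 and 0.0, and that float() accepts strings such as "nan" and "inf" as well.

```python
def to_float_if_possible(value):
    """Return float(value) for numeric strings, otherwise value unchanged."""
    if value is None:
        return None
    try:
        return float(value)
    except ValueError:
        return value


row = {
    "ACCOUNTNAMEDENORM": "John Smith",
    "DELINQUENCYSTATUS": "2.0000000000",
    "ORGANIZATIONTYPEDENORM": None,
    "ACCOUNTNUMBER": "000-000-000-00",
    "CHARGEHELD": "0",
}
converted = {key: to_float_if_possible(value) for key, value in row.items()}
print(converted)
```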
EDIT: I don't think floats in Python handle the "precision" in a way you might think. It may look like 2.0000000000 is being "truncated" to 2.0, but I think this is more of a formatting and display issue, rather than losing information. Consider the following examples:
>>> float(2.0000000000)
2.0
>>> float(2.00000000001)
2.00000000001
>>> float(1.00) == float(1.000000000)
True
>>> float(3.141) == float(3.140999999)
False
>>> float(3.141) == float(3.1409999999999999)
True
>>> print('%.10f' % 3.14)
3.1400000000
It's possible though to get the JSON to have those zeroes, but in that case it comes down to treating the number as a string, namely a formatted one.
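To make that concrete, here is a small sketch comparing what json.dumps emits for a float with a hand-formatted version that keeps ten decimal places. The key name is taken from the question; the hand-built string is just one way to do it, not something the json module offers directly.

```python
import json

value = float("2.0000000000")

# json.dumps uses the shortest repr of the float, so the zeros are gone.
default_text = json.dumps({"DELINQUENCYSTATUS": value})

# Formatting the number ourselves keeps the zeros, and the result is still valid JSON.
padded_text = '{"DELINQUENCYSTATUS": %.10f}' % value

print(default_text)
print(padded_text)
# Both texts parse back to the same Python value.
print(json.loads(padded_text) == json.loads(default_text))
```

So the trailing zeros only exist in the text of the file; once a JSON parser reads either version, the number is the same.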
Interesting, I was actually looking for the opposite, i.e. output that keeps the quotes.
In my case it was very easy to get rid of them: just remove the separators=(',', ':') parameter.
For me, simply adding this parameter was enough to keep them.
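It may be worth double-checking this against the standard library json module: there, separators only controls the punctuation between items and between keys and values, and string values are quoted either way. A quick sketch to compare the two calls (the sample row is made up from the question's field names):

```python
import json

row = {"RETIRED": "0.0000000000", "CLOSEDATE": None}

compact = json.dumps(row, separators=(',', ':'))
default = json.dumps(row)

print(compact)
print(default)
```

If the quotes differ between two runs, the difference is more likely in the data being dumped (strings versus floats) than in the separators argument.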
I'm writing some code with Python and Vincent to display some map data.
The example from the docs looks like this:
import vincent
county_topo = r'us_counties.topo.json'
state_topo = r'us_states.topo.json'
geo_data = [{'name': 'counties',
'url': county_topo,
'feature': 'us_counties.geo'},
{'name': 'states',
'url': state_topo,
'feature': 'us_states.geo'}]
vis = vincent.Map(geo_data=geo_data, scale=3000, projection='albersUsa')
del vis.marks[1].properties.update
vis.marks[0].properties.update.fill.value = '#084081'
vis.marks[1].properties.enter.stroke.value = '#fff'
vis.marks[0].properties.enter.stroke.value = '#7bccc4'
vis.to_json('map.json', html_out=True, html_path='map_template.html')
Running this code outputs an HTML file, but it's formatted improperly. It's in some kind of Python string representation, b'<html>....</html>'.
If I remove the quotes and the leading b, the HTML page works as expected when run through the built-in Python server.
What's wrong with my output statement?
From the Docs:
A prefix of 'b' or 'B' is ignored in Python 2; it indicates that the
literal should become a bytes literal in Python 3 (e.g. when code is
automatically converted with 2to3). A 'u' or 'b' prefix may be
followed by an 'r' prefix.
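One plausible cause (an assumption on my part, I haven't checked Vincent's source) is that a bytes object is being converted with str() before it is written, which in Python 3 produces exactly this b'...' wrapper. The sketch below shows the difference between str() and decode(), plus a hypothetical helper, repr_to_text, that recovers the text from a file that already contains the wrapper.

```python
import ast

payload = b'<html><head><title>MyFile</title></head></html>'

# str() on bytes yields the repr, including the b'...' wrapper.
as_repr = str(payload)
# decode() yields the actual text.
as_text = payload.decode('utf-8')


def repr_to_text(repr_text, encoding='utf-8'):
    """Turn the text "b'...'" back into the string it represents."""
    value = ast.literal_eval(repr_text.strip())
    if not isinstance(value, bytes):
        raise ValueError('input is not a bytes literal')
    return value.decode(encoding)


print(as_repr)
print(as_text)
print(repr_to_text(as_repr) == as_text)
```

Compared with cutting off characters by position, ast.literal_eval also turns escape sequences such as \n inside the literal back into real characters.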
You can slice it using:
with open('map_template.html', 'r+') as f:
    html = f.read()[2:-1]
    # Rewind and clear the file before writing the cleaned text back.
    f.seek(0)
    f.truncate()
    f.write(html)
This will open your html file,
b'<html><head><title>MyFile</title></head></html>'
And remove the first 2 and last character, giving you:
<html><head><title>MyFile</title></head></html>
I have a database which returns the rows as lists in the following format:
data = ['(1000,"test value",0,0.00,0,0)', '(1001,"Another test value",0,0.00,0,0)']
After that, I use json_str = json.dumps(data) to get a JSON string. After applying json.dumps(), I get the following output:
json_str = ["(1000,\"test value\",0,0.00,0,0)", "(1001,\"Another test value\",0,0.00,0,0)"]
However, I need the JSON string in the following format:
json_str = [(1000,\"test value\",0,0.00,0,0), (1001,\"Another test value\",0,0.00,0,0)]
So basically, I want to remove the surrounding double quotes. I tried to accomplish this with json_str = json_str.strip('"') but this doesn't work. Then, I tried json_str = json_str.replace('"', '') but this also removes the escaped quotes.
Does anybody know a way to accomplish this, or is there a function in Python similar to json.dumps() which produces the same result, but without the surrounding double quotes?
You are dumping a list of strings, so json.dumps does exactly what you are asking it to do. A rather ugly solution to your problem could be something like the following.
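import json

def split_and_convert(s):
    # Strip the surrounding parentheses, then split on commas.
    # Note: bits[1] keeps its own double quotes.
    bits = s[1:-1].split(',')
    return (
        int(bits[0]), bits[1], float(bits[2]),
        float(bits[3]), float(bits[4]), float(bits[5])
    )

data_to_dump = [split_and_convert(s) for s in data]
json.dumps(data_to_dump)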
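A plain split(',') breaks as soon as a quoted value contains a comma, so here is an alternative sketch that lets the csv module handle the quoting. The helper names convert and parse_row are made up, and the type guessing (int first, then float, else string) is an assumption about what the columns should be; a quoted value that looks numeric would be converted too. Also keep in mind that JSON has no tuple type, so each row comes out as an array.

```python
import csv
import json


def convert(field):
    """Try int, then float; fall back to the original string."""
    for cast in (int, float):
        try:
            return cast(field)
        except ValueError:
            pass
    return field


def parse_row(row_text):
    # Drop the surrounding parentheses and let csv deal with quoted fields.
    inner = row_text.strip()[1:-1]
    fields = next(csv.reader([inner]))
    return [convert(field) for field in fields]


data = ['(1000,"test value",0,0.00,0,0)', '(1001,"Another test value",0,0.00,0,0)']
rows = [parse_row(s) for s in data]
json_str = json.dumps(rows)
print(json_str)
```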