How to correctly set repeated fields from json file


I have such json file:
[{
"datafiles": ["data.data"]
}]
Description in .proto file:
message Dataset {
repeated string datafiles = 1;
}
When I create a Dataset object with Dataset(datafiles=datafiles), the datafiles field is populated one character at a time:
datafiles: "d"\ndatafiles: "a"\ndatafiles: "t"\ndatafiles: "a"\ndatafiles: "."\ndatafiles: "d"\ndatafiles: "a"\ndatafiles: "t"\ndatafiles: "a"
How do I set it correctly, so that I get:
datafiles: "data.data"

It looks like your string ("data.data") is being iterated and added one character at a time.
This suggests that you are probably passing in a string by itself:
"data.data"
when you should really be passing in an iterable containing strings:
[ "data.data" ]
Try printing the value of datafiles right before your call to create the Dataset:
print(repr(datafiles))
... whatever ... Dataset(datafiles=datafiles)
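As a minimal sketch of that diagnosis, the snippet below shows the difference between handing over a string and handing over a list of strings. The protobuf class is left out so the snippet runs standalone, the JSON is inlined rather than read from a file, and the names raw, wrong and right are only illustrative.

```python
import json

# Same structure as the JSON file in the question, inlined for illustration.
raw = '[{"datafiles": ["data.data"]}]'
entry = json.loads(raw)[0]

wrong = entry["datafiles"][0]  # a single str: "data.data"
right = entry["datafiles"]     # a list of str: ["data.data"]

# Iterating over a str yields one character at a time, which matches
# the per-character output described in the question.
print(list(wrong))
print(list(right))

# Defensive normalisation (an assumption about what the caller wants):
# wrap a bare string in a list before passing it on.
datafiles = [wrong] if isinstance(wrong, str) else list(wrong)
print(datafiles)
```

If print(repr(datafiles)) shows a plain string rather than a list, a wrapping step like the last one is one way to get a single "data.data" entry.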

Related

How do I write each word into a single cell?

I am trying to write data from a JSON file to a CSV file using Python. My code looks like this:
CSVFile1 = open('Group_A_participant_1_1.csv', 'a')
writeCSV1 = csv.writer(CSVFile1)
for file in data['annotations'][3]['instances']:
    var = file['arguments'].get('argument1')
    writeCSV1.writerow(var)
CSVFile1.close()
My output is:
My problem is that I cannot see the whole word in one cell.
Thanks in advance for your help!
I expect to get each word in a single cell.

Change
writeCSV1.writerow(var)
to
writeCSV1.writerow([var])
so that you write a one-item list containing var, instead of having the csv module treat var, a string, as a sequence of separate characters.
For instance:
import csv
import sys
writeCSV1 = csv.writer(sys.stdout)
data = {
    "annotations": [
        {},
        {},
        {},
        {
            "instances": [
                {"arguments": {"argument1": "foo"}},
                {"arguments": {"argument1": "bar"}},
            ]
        },
    ],
}
for file in data["annotations"][3]["instances"]:
    var = file["arguments"].get("argument1")
    writeCSV1.writerow([var])
prints out
foo
bar
whereas taking the brackets out from around [var] results in
f,o,o
b,a,r
as you described.

Click on the first cell of the column where you want the converted names to appear (B2).
Type an equal sign (=), followed by the text “Prof. ”, followed by an ampersand (&).
Select the cell containing the first name (A2).
Press the Return key.
You will notice that the title “Prof.” is added before the first name in the list.

How to add string in specific positions in geojson object

I have the GeoJSON shown in the geojson_1 section below. I want to add "geometry":{ and } to it so that it appears as follows:
{"geometry":{"type":"Polygon","coordinates":[[[1216374.67364018,6563498.44078949],[1216387.86261675,6563523.87797899],[1216397.66970116,6563548.2905649],[1216424.17569103,6563588.32082324],[1216458.19258303,6563622.16452455],[1216498.32084288,6563648.42909789],[1216542.90943577,6563666.03380959],[1216590.12376481,6563674.25425166],[1216638.02117068,6563672.7521636],[1216684.63088244,6563661.58935797],[1216728.03512655,6563641.225175],[1216752.29181681,6563626.67066235],[1216787.17700448,6563601.12371718],[1216816.83970763,6563569.63465531],[1216831.39332728,6563551.03748989],[1216838.2508451,6563541.8226918],[1216897.47283376,6563458.0765492],[1216918.74007329,6563421.44644481],[1216933.156564,6563381.60258193],[1216940.26085904,6563339.82061228],[1216939.82562707,6563297.43819918],[1216931.86491641,6563255.81218836],[1216907.60647856,6563170.91644364],[1216887.20280767,6563121.46139137],[1216856.24799209,6563077.86160203],[1216821.48046704,6563039.0529759],[1216799.23490474,6563017.28929875],[1216753.95673639,6562978.48086898],[1216737.29066155,6562965.4435638],[1216673.22488836,6562919.79826372],[1216644.73178636,6562899.22061724],[1216601.13622245,6562874.31962206],[1216562.32695185,6562857.3410734],[1216556.56069412,6562854.90900462],[1216549.97837146,6562852.23502385],[1216545.77480552,6562849.58841453],[1216504.75306873,6562829.03095075],[1216487.0229317,6562822.21187019],[1216482.65368148,6562820.3796627],[1216478.79578194,6562814.49158384],[1216462.95127963,6562793.04723497],[1216450.44559886,6562777.97698661],[1216448.65520854,6562774.19751598],[1216429.39331353,6562740.84427663],[1216404.99213055,6562711.06486155],[1216382.3528849,6562687.61801865],[1216357.97638417,6562665.64947943],[1216339.38004804,6562651.09634054],[1216299.24217837,6562625.743469],[1216254.86196793,6562608.94292463],[1216208.02902037,6562601.37212065],[1216160.63182011,6562603.33631786],[1216114.58159971,6562614.75632195],[1216071.73529105,6562635.17167463],[1216033.82066317,6562663.7
5921023],[1216002.36666225,6562699.36623124],[1215978.64176027,6562740.55696848],[1215963.60279766,6562785.67045603],[1215957.85638339,6562832.88749107],[1215961.63441126,6562880.30398198],[1215963.25125904,6562889.19806131],[1215964.03213898,6562893.28933659],[1215968.1319511,6562913.79137984],[1215972.03222389,6562937.19708669],[1215977.28745991,6563016.79645952],[1215971.45390521,6563048.0427099],[1215969.39172682,6563061.09023781],[1215963.73069651,6563104.75131293],[1215962.06592533,6563123.22251112],[1215960.00953034,6563163.6421848],[1215954.35640427,6563213.80082903],[1215954.63230125,6563269.34572034],[1215960.29082704,6563315.43307246],[1215970.70253119,6563361.57391952],[1215982.82982632,6563397.95907391],[1216001.20120538,6563439.38870108],[1216027.10825421,6563476.54992802],[1216059.60193364,6563508.08124916],[1216097.4918555,6563532.82739871],[1216139.38989254,6563549.8816932],[1216183.76104002,6563558.61926572],[1216228.97966418,6563558.71997125],[1216273.38907516,6563550.18012236],[1216287.1346647,6563546.13717455],[1216332.92682121,6563527.24586381],[1216373.78586258,6563499.1986745],[1216374.67364018,6563498.44078949]]]}}
To simplify it even more: I want to add "geometry":{ right after the first curly bracket, and } at the very end.
I attempted the following:
asString = asString[:2] + "geometry:" + asString[2:]
asString = asString[:len(asString)] + "}" + asString[len(asString):]
but I am not getting the expected result.
geojson_1:
{"type":"Polygon","coordinates":[[[1216374.67364018,6563498.44078949],[1216387.86261675,6563523.87797899],[1216397.66970116,6563548.2905649],[1216424.17569103,6563588.32082324],[1216458.19258303,6563622.16452455],[1216498.32084288,6563648.42909789],[1216542.90943577,6563666.03380959],[1216590.12376481,6563674.25425166],[1216638.02117068,6563672.7521636],[1216684.63088244,6563661.58935797],[1216728.03512655,6563641.225175],[1216752.29181681,6563626.67066235],[1216787.17700448,6563601.12371718],[1216816.83970763,6563569.63465531],[1216831.39332728,6563551.03748989],[1216838.2508451,6563541.8226918],[1216897.47283376,6563458.0765492],[1216918.74007329,6563421.44644481],[1216933.156564,6563381.60258193],[1216940.26085904,6563339.82061228],[1216939.82562707,6563297.43819918],[1216931.86491641,6563255.81218836],[1216907.60647856,6563170.91644364],[1216887.20280767,6563121.46139137],[1216856.24799209,6563077.86160203],[1216821.48046704,6563039.0529759],[1216799.23490474,6563017.28929875],[1216753.95673639,6562978.48086898],[1216737.29066155,6562965.4435638],[1216673.22488836,6562919.79826372],[1216644.73178636,6562899.22061724],[1216601.13622245,6562874.31962206],[1216562.32695185,6562857.3410734],[1216556.56069412,6562854.90900462],[1216549.97837146,6562852.23502385],[1216545.77480552,6562849.58841453],[1216504.75306873,6562829.03095075],[1216487.0229317,6562822.21187019],[1216482.65368148,6562820.3796627],[1216478.79578194,6562814.49158384],[1216462.95127963,6562793.04723497],[1216450.44559886,6562777.97698661],[1216448.65520854,6562774.19751598],[1216429.39331353,6562740.84427663],[1216404.99213055,6562711.06486155],[1216382.3528849,6562687.61801865],[1216357.97638417,6562665.64947943],[1216339.38004804,6562651.09634054],[1216299.24217837,6562625.743469],[1216254.86196793,6562608.94292463],[1216208.02902037,6562601.37212065],[1216160.63182011,6562603.33631786],[1216114.58159971,6562614.75632195],[1216071.73529105,6562635.17167463],[1216033.82066317,6562663.75921023],[12
16002.36666225,6562699.36623124],[1215978.64176027,6562740.55696848],[1215963.60279766,6562785.67045603],[1215957.85638339,6562832.88749107],[1215961.63441126,6562880.30398198],[1215963.25125904,6562889.19806131],[1215964.03213898,6562893.28933659],[1215968.1319511,6562913.79137984],[1215972.03222389,6562937.19708669],[1215977.28745991,6563016.79645952],[1215971.45390521,6563048.0427099],[1215969.39172682,6563061.09023781],[1215963.73069651,6563104.75131293],[1215962.06592533,6563123.22251112],[1215960.00953034,6563163.6421848],[1215954.35640427,6563213.80082903],[1215954.63230125,6563269.34572034],[1215960.29082704,6563315.43307246],[1215970.70253119,6563361.57391952],[1215982.82982632,6563397.95907391],[1216001.20120538,6563439.38870108],[1216027.10825421,6563476.54992802],[1216059.60193364,6563508.08124916],[1216097.4918555,6563532.82739871],[1216139.38989254,6563549.8816932],[1216183.76104002,6563558.61926572],[1216228.97966418,6563558.71997125],[1216273.38907516,6563550.18012236],[1216287.1346647,6563546.13717455],[1216332.92682121,6563527.24586381],[1216373.78586258,6563499.1986745],[1216374.67364018,6563498.44078949]]]}

I'm going to assume that geojson_1 is available as a string in which case:
import json
output = {'geometry': json.loads(geojson_1)}
...will give you a dictionary with the structure you need.

It looks like plain JSON data, or the string representation of a dict (the two would not differ in this case). Did you consider wrapping the returned data in a new dict rather than manipulating it as a string?
import json
# Assume this returns the geojson as text
geojson = json.loads(get_geojson())
geojson = {"geometry": geojson}
print(json.dumps(geojson))

I get the expected result using the following:
'{"geometry":' + d + "}"
It prepends the string {"geometry": to the string d and appends } at the end.
The variable d is:
d = '{"type":"Polygon","co (rest of json) ,6563498.44078949]]]}'
Or you can use the json library for this:
import json
# d is the same string as above; it could also come from a file,
# in which case you can read it with json.load(FILE)
data = json.loads(d)
# Create your new object:
result = {'geometry': data}
# Print your new JSON:
print(json.dumps(result, indent=2))
Edit: if you instead build the string as
'"geometry":{' + d + "}"
you get a string that starts with "geometry":{ and is immediately followed by another { from your input JSON. That is neither a valid dictionary nor valid JSON.
Result:
'"geometry":{{"type":"Polygon", ... ,6563498.44078949]]]}}'
(The dots stand for the rest of your original JSON.)
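To make the two working approaches from these answers concrete, here is a small sketch that runs on a shortened stand-in polygon. The four-point geometry and the variable names are assumptions made for illustration, not the data from the question. It also hints at why the slicing attempt in the question misfires: asString[:2] cuts after the second character rather than the first, and "geometry:" carries neither the quotes nor the opening brace.

```python
import json

# Shortened stand-in for geojson_1; the real polygon has many more points.
geojson_1 = '{"type":"Polygon","coordinates":[[[0.0,0.0],[1.0,0.0],[1.0,1.0],[0.0,0.0]]]}'

# Approach 1: plain string concatenation around the original text.
wrapped_by_concat = '{"geometry":' + geojson_1 + '}'

# Approach 2: parse, wrap in a new dict, and serialise again.
wrapped_by_json = json.dumps({"geometry": json.loads(geojson_1)})

# Both results parse to the same nested structure.
print(json.loads(wrapped_by_concat) == json.loads(wrapped_by_json))
print(json.loads(wrapped_by_json)["geometry"]["type"])
```

The second approach is usually the safer of the two, because json.loads rejects malformed input instead of silently producing another malformed string.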

How to remove numbers from json in python

I have some JSON-like data such as
json= 5843080158430803{"name":"NAME", "age":"56",}
How do I get {"name":"NAME", "age":"56",} using regex or split (which is the best method for this) in Python?
Thanks in advance.

Split the string at the first occurrence of {, which gives a two-element list, and take the second element.
We also have to add the { back, because split removes the separator.
json = '5843080158430803{"name":"NAME", "age":"56",}'
json = '{' + json.split('{', 1)[1]
print(json)
Result: {"name":"NAME", "age":"56",}

Perhaps you could split at the first { and then remove the part before it.
I am assuming the json you have above is actually a string. Then you could do:
json_prefix = json.split("{", 1)[0]
json = json.replace(json_prefix, "", 1)
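For comparison, here is a sketch of two other ways to drop the numeric prefix, one with str.find and one with a regular expression. The variable names are illustrative, and the input is assumed to be a string that always starts with digits followed by the object.

```python
import re

raw = '5843080158430803{"name":"NAME", "age":"56",}'

# Option 1: keep everything from the first "{" onwards.
start = raw.find("{")
by_find = raw[start:] if start != -1 else raw

# Option 2: drop a leading run of digits with a regex.
by_regex = re.sub(r"^\d+", "", raw)

print(by_find)
print(by_find == by_regex)
```

Note that the remaining text still has a trailing comma before the closing brace, so json.loads would reject it as it stands; it would need further cleanup before being parsed as JSON.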

Convert a string with the name of a variable to a dictionary

I have a string that is a little complex, in that it has some objects embedded as values. I need to convert it to a proper dict or JSON.
foo = '{"Source": "my source", "Input": {"User_id": 18, "some_object": this_is_a_variable_actually}}'
Notice that the last key some_object has a value which is neither a string nor an int. Hence when I do a json.loads or ast.literal_eval, I am getting malformed string errors, and so Converting a String to Dictionary doesn't work.
I have no control over the source of the string.
Is it possible to convert such strings to a dictionary?
The result I need is a dict like this:
dict = {
"Source" : "Good",
"object1": variable1,
"object2": variable2
}
The catch is that I don't know in advance what variable1 or variable2 is; there is no pattern.
One more point: if I can turn the variables into plain strings, that is also fine.
For example,
dict = {
"Source" : "Good",
"object1": "variable1",
"object2": "variable2"
}
This will be good for my purpose. Thanks for all the answers.

It's a bit of a kludge, but the demjson module lets you parse most of a somewhat non-conforming JSON string and lists the errors it found. You can then use that list to find the invalid tokens and put quotes around them so the string parses correctly, e.g.:
import demjson
import re
foo = '{"Source": "my source", "Input": {"User_id": 18, "some_object": this_is_a_variable_actually}}'
def f(json_str):
    res = demjson.decode(json_str, strict=False, return_errors=True)
    if not res.errors:
        return res
    for err in res.errors:
        var = err.args[1]
        json_str = re.sub(r'\b{}\b'.format(var), '"{}"'.format(var), json_str)
    return demjson.decode(json_str, strict=False)
res = f(foo)
Gives you:
{'Input': {'User_id': 18, 'some_object': 'this_is_a_variable_actually'}, 'Source': 'my source'}
Note that while this should work in the example data presented, your mileage may vary if there's other nuisances in your input that require further munging.
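If installing demjson is not an option, a different and cruder technique is to quote bare identifiers with a regular expression before calling json.loads. The sketch below is not the demjson approach above; the pattern and the helper name quote_bare_values are assumptions for illustration, and the pattern can misfire if a string value itself contains text that looks like `: name,`.

```python
import json
import re

# A bare identifier that appears as a value: after a colon and
# before a comma or a closing bracket.
BARE_VALUE = re.compile(r'(:\s*)([A-Za-z_]\w*)(?=\s*[,}\]])')
JSON_LITERALS = {"true", "false", "null"}

def quote_bare_values(text):
    """Wrap bare identifiers used as values in double quotes."""
    def repl(match):
        name = match.group(2)
        if name in JSON_LITERALS:
            return match.group(0)  # leave real JSON literals alone
        return '{}"{}"'.format(match.group(1), name)
    return BARE_VALUE.sub(repl, text)

foo = '{"Source": "my source", "Input": {"User_id": 18, "some_object": this_is_a_variable_actually}}'
result = json.loads(quote_bare_values(foo))
print(result["Input"]["some_object"])
```

This gives the "variables as plain strings" form that the question says is acceptable.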

python: how do I parse a stream of json arrays with ijson library

The incoming data resembles the following:
[{
"foo": "bar"
}]
[{
"bar": "baz"
}]
[{
"baz": "foo"
}]
as you see, arrays of objects strung together. JSON-ish
ijson is able to handle the first array, and then I get:
ijson.common.JSONError: Additional data
when it hits the subsequent arrays. How do I get around this?

Here's a first cut at the problem that at least has a working regex substitution to turn the full string into valid JSON. It only works if you're OK with reading the full input stream before parsing it as JSON.
import json
import re
# read the whole stream into one string
raw = ''.join(inputStream)
# raw == '[{"foo": "bar"}][{"bar": "baz"}][{"baz": "foo"}]'
# wrap in [] and put a comma between each ][ (allowing whitespace in between)
sanitizedInput = re.sub(r"\]\s*\[", "],[", "[%s]" % raw)
# sanitizedInput == '[[{"foo": "bar"}],[{"bar": "baz"}],[{"baz": "foo"}]]'
# then parse sanitizedInput
parsed = json.loads(sanitizedInput)
print(parsed)  #=> [[{'foo': 'bar'}], [{'bar': 'baz'}], [{'baz': 'foo'}]]
Note: since you've read the whole thing as a string, you can use json instead of ijson.

You can use json.JSONDecoder.raw_decode to walk through the string. Its documentation indeed says:
This can be used to decode a JSON document from a string that may have extraneous data at the end.
The following code sample assumes all the JSON values are in one big string:
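import json
def json_elements(string):
    decoder = json.JSONDecoder()
    string = string.lstrip()
    while True:
        try:
            # raw_decode is an instance method and rejects leading whitespace
            (element, position) = decoder.raw_decode(string)
            yield element
            string = string[position:].lstrip()
        except ValueError:
            break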
To avoid dealing with raw_decode yourself and to be able to parse a stream chunk by chunk, I would recommend a library I made for this exact purpose: streamcat.
import json
import streamcat
def json_elements(stream):
    decoder = json.JSONDecoder()
    yield from streamcat.stream_to_iterator(stream, decoder)
This works for any concatenation of JSON values regardless of how many white-space characters are used within them or between them.
If you have control over how your input stream is encoded, you may want to consider using line-delimited JSON, which makes parsing easier.
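As a rough illustration of the line-delimited option, the sketch below assumes the producer writes exactly one complete JSON value per line. The helper name iter_json_lines and the in-memory stream are hypothetical stand-ins for a real input stream.

```python
import io
import json

def iter_json_lines(stream):
    """Yield one parsed JSON value per non-empty line of the stream."""
    for line in stream:
        line = line.strip()
        if line:  # tolerate blank lines between records
            yield json.loads(line)

# In-memory stand-in for the incoming stream, one array per line.
stream = io.StringIO('[{"foo": "bar"}]\n[{"bar": "baz"}]\n\n[{"baz": "foo"}]\n')
arrays = list(iter_json_lines(stream))
print(arrays)
```

Because each line is parsed on its own, the consumer never has to hold the whole stream in memory or guess where one value ends and the next begins.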
