How to correctly set repeated fields from json file - python
I have a JSON file like this:
[{
"datafiles": ["data.data"]
}]
Description in .proto file:
message Dataset {
repeated string datafiles = 1;
}
When I create a Dataset object with Dataset(datafiles=datafiles), the datafiles field is populated in a strange way:
datafiles: "d"\ndatafiles: "a"\ndatafiles: "t"\ndatafiles: "a"\ndatafiles: "."\ndatafiles: "d"\ndatafiles: "a"\ndatafiles: "t"\ndatafiles: "a"
How do I set it correctly, so that I get:
datafiles: "data.data"
It looks like your string ("data.data") is being iterated and added one character at a time.
This suggests that you are probably passing in a string by itself:
"data.data"
when you should really be passing in an iterable containing strings:
[ "data.data" ]
Try printing the value of datafiles right before your call to create the Dataset:
print(repr(datafiles))
... whatever ... Dataset(datafiles=datafiles)
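A minimal sketch of the difference, deliberately written without protobuf so that it runs anywhere (that substitution is an assumption, as is the helper name as_string_list). It relies only on the fact that iterating over a str yields single characters, which is what happens when a bare string is handed to something that expects an iterable of strings:

```python
import json

# The JSON from the question: a list containing one object.
raw = '[{"datafiles": ["data.data"]}]'
entries = json.loads(raw)

def as_string_list(value):
    """Hypothetical helper: wrap a bare string so it is not iterated per character."""
    if isinstance(value, str):
        return [value]
    return list(value)

good = entries[0]["datafiles"]     # a list of strings
bad = entries[0]["datafiles"][0]   # a bare string, the likely culprit

print(repr(good))                  # the list, as stored in the JSON
print(list(bad))                   # one character per element
print(as_string_list(bad))         # the bare string wrapped back into a list
```

If print(repr(datafiles)) in your own code shows a plain string rather than a list, wrapping it as above (or indexing one level less deeply into the parsed JSON) should give the single datafiles entry you expect.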
Related
How do I write each word into a single cell?
I am trying to write data from a JSON file to a CSV file using Python. My code looks like this:
    CSVFile1 = open('Group_A_participant_1_1.csv', 'a')
    writeCSV1 = csv.writer(CSVFile1)
    for file in data['annotations'][3]['instances']:
        var = file['arguments'].get('argument1')
        writeCSV1.writerow(var)
    CSVFile1.close()
My problem is that, in the output, I cannot see the whole word in one cell. I expect to get each word in a single cell. Thanks in advance for your help!
Change writeCSV1.writerow(var) to writeCSV1.writerow([var]) so that you're writing a one-item list containing var, instead of having the csv module interpret var, a string, as a sequence of separate characters. For instance:
    import csv
    import sys

    writeCSV1 = csv.writer(sys.stdout)
    data = {
        "annotations": [
            {},
            {},
            {},
            {
                "instances": [
                    {"arguments": {"argument1": "foo"}},
                    {"arguments": {"argument1": "bar"}},
                ]
            },
        ],
    }
    for file in data["annotations"][3]["instances"]:
        var = file["arguments"].get("argument1")
        writeCSV1.writerow([var])
prints out
    foo
    bar
whereas taking the brackets out from around [var] results in
    f,o,o
    b,a,r
as you described.
How to add string in specific positions in geojson object
I have the GeoJSON posted in the geojson_1 section below. I want to add "geometry":{ and } to it, so that it appears as follows:
{"geometry":{"type":"Polygon","coordinates":[[[1216374.67364018,6563498.44078949],[1216387.86261675,6563523.87797899],[1216397.66970116,6563548.2905649],[1216424.17569103,6563588.32082324],[1216458.19258303,6563622.16452455],[1216498.32084288,6563648.42909789],[1216542.90943577,6563666.03380959],[1216590.12376481,6563674.25425166],[1216638.02117068,6563672.7521636],[1216684.63088244,6563661.58935797],[1216728.03512655,6563641.225175],[1216752.29181681,6563626.67066235],[1216787.17700448,6563601.12371718],[1216816.83970763,6563569.63465531],[1216831.39332728,6563551.03748989],[1216838.2508451,6563541.8226918],[1216897.47283376,6563458.0765492],[1216918.74007329,6563421.44644481],[1216933.156564,6563381.60258193],[1216940.26085904,6563339.82061228],[1216939.82562707,6563297.43819918],[1216931.86491641,6563255.81218836],[1216907.60647856,6563170.91644364],[1216887.20280767,6563121.46139137],[1216856.24799209,6563077.86160203],[1216821.48046704,6563039.0529759],[1216799.23490474,6563017.28929875],[1216753.95673639,6562978.48086898],[1216737.29066155,6562965.4435638],[1216673.22488836,6562919.79826372],[1216644.73178636,6562899.22061724],[1216601.13622245,6562874.31962206],[1216562.32695185,6562857.3410734],[1216556.56069412,6562854.90900462],[1216549.97837146,6562852.23502385],[1216545.77480552,6562849.58841453],[1216504.75306873,6562829.03095075],[1216487.0229317,6562822.21187019],[1216482.65368148,6562820.3796627],[1216478.79578194,6562814.49158384],[1216462.95127963,6562793.04723497],[1216450.44559886,6562777.97698661],[1216448.65520854,6562774.19751598],[1216429.39331353,6562740.84427663],[1216404.99213055,6562711.06486155],[1216382.3528849,6562687.61801865],[1216357.97638417,6562665.64947943],[1216339.38004804,6562651.09634054],[1216299.24217837,6562625.743469],[1216254.86196793,6562608.94292463],[1216208.02902037,6562601.37212065],[1216160.63182011,6562603.33631786],[1216114.58159971,6562614.75632195],[1216071.73529105,6562635.17167463],[1216033.82066317,6562663.75921023],[1216002.36666225,6562699.36623124],[1215978.64176027,6562740.55696848],[1215963.60279766,6562785.67045603],[1215957.85638339,6562832.88749107],[1215961.63441126,6562880.30398198],[1215963.25125904,6562889.19806131],[1215964.03213898,6562893.28933659],[1215968.1319511,6562913.79137984],[1215972.03222389,6562937.19708669],[1215977.28745991,6563016.79645952],[1215971.45390521,6563048.0427099],[1215969.39172682,6563061.09023781],[1215963.73069651,6563104.75131293],[1215962.06592533,6563123.22251112],[1215960.00953034,6563163.6421848],[1215954.35640427,6563213.80082903],[1215954.63230125,6563269.34572034],[1215960.29082704,6563315.43307246],[1215970.70253119,6563361.57391952],[1215982.82982632,6563397.95907391],[1216001.20120538,6563439.38870108],[1216027.10825421,6563476.54992802],[1216059.60193364,6563508.08124916],[1216097.4918555,6563532.82739871],[1216139.38989254,6563549.8816932],[1216183.76104002,6563558.61926572],[1216228.97966418,6563558.71997125],[1216273.38907516,6563550.18012236],[1216287.1346647,6563546.13717455],[1216332.92682121,6563527.24586381],[1216373.78586258,6563499.1986745],[1216374.67364018,6563498.44078949]]]}}
To simplify it even more: I want to add "geometry":{ right after the first curly bracket, and } at the very end. I attempted the following:
    asString = asString[:2] + "geometry:" + asString[2:]
    asString = asString[:len(asString)] + "}" + asString[len(asString):]
but I am not getting the expected results.
geojson_1:
{"type":"Polygon","coordinates":[[[1216374.67364018,6563498.44078949],[1216387.86261675,6563523.87797899],[1216397.66970116,6563548.2905649],[1216424.17569103,6563588.32082324],[1216458.19258303,6563622.16452455],[1216498.32084288,6563648.42909789],[1216542.90943577,6563666.03380959],[1216590.12376481,6563674.25425166],[1216638.02117068,6563672.7521636],[1216684.63088244,6563661.58935797],[1216728.03512655,6563641.225175],[1216752.29181681,6563626.67066235],[1216787.17700448,6563601.12371718],[1216816.83970763,6563569.63465531],[1216831.39332728,6563551.03748989],[1216838.2508451,6563541.8226918],[1216897.47283376,6563458.0765492],[1216918.74007329,6563421.44644481],[1216933.156564,6563381.60258193],[1216940.26085904,6563339.82061228],[1216939.82562707,6563297.43819918],[1216931.86491641,6563255.81218836],[1216907.60647856,6563170.91644364],[1216887.20280767,6563121.46139137],[1216856.24799209,6563077.86160203],[1216821.48046704,6563039.0529759],[1216799.23490474,6563017.28929875],[1216753.95673639,6562978.48086898],[1216737.29066155,6562965.4435638],[1216673.22488836,6562919.79826372],[1216644.73178636,6562899.22061724],[1216601.13622245,6562874.31962206],[1216562.32695185,6562857.3410734],[1216556.56069412,6562854.90900462],[1216549.97837146,6562852.23502385],[1216545.77480552,6562849.58841453],[1216504.75306873,6562829.03095075],[1216487.0229317,6562822.21187019],[1216482.65368148,6562820.3796627],[1216478.79578194,6562814.49158384],[1216462.95127963,6562793.04723497],[1216450.44559886,6562777.97698661],[1216448.65520854,6562774.19751598],[1216429.39331353,6562740.84427663],[1216404.99213055,6562711.06486155],[1216382.3528849,6562687.61801865],[1216357.97638417,6562665.64947943],[1216339.38004804,6562651.09634054],[1216299.24217837,6562625.743469],[1216254.86196793,6562608.94292463],[1216208.02902037,6562601.37212065],[1216160.63182011,6562603.33631786],[1216114.58159971,6562614.75632195],[1216071.73529105,6562635.17167463],[1216033.82066317,6562663.75921023],[1216002.36666225,6562699.36623124],[1215978.64176027,6562740.55696848],[1215963.60279766,6562785.67045603],[1215957.85638339,6562832.88749107],[1215961.63441126,6562880.30398198],[1215963.25125904,6562889.19806131],[1215964.03213898,6562893.28933659],[1215968.1319511,6562913.79137984],[1215972.03222389,6562937.19708669],[1215977.28745991,6563016.79645952],[1215971.45390521,6563048.0427099],[1215969.39172682,6563061.09023781],[1215963.73069651,6563104.75131293],[1215962.06592533,6563123.22251112],[1215960.00953034,6563163.6421848],[1215954.35640427,6563213.80082903],[1215954.63230125,6563269.34572034],[1215960.29082704,6563315.43307246],[1215970.70253119,6563361.57391952],[1215982.82982632,6563397.95907391],[1216001.20120538,6563439.38870108],[1216027.10825421,6563476.54992802],[1216059.60193364,6563508.08124916],[1216097.4918555,6563532.82739871],[1216139.38989254,6563549.8816932],[1216183.76104002,6563558.61926572],[1216228.97966418,6563558.71997125],[1216273.38907516,6563550.18012236],[1216287.1346647,6563546.13717455],[1216332.92682121,6563527.24586381],[1216373.78586258,6563499.1986745],[1216374.67364018,6563498.44078949]]]}
I'm going to assume that geojson_1 is available as a string, in which case:
    import json
    output = {'geometry': json.loads(geojson_1)}
will give you a dictionary with the structure you need.
It looks like plain JSON data, or a string representation of a dict (they wouldn't be any different in this case). Did you consider wrapping the returned data in a new dict rather than manipulating it as a string?
    import json
    # Assume this returns the geojson as text
    geojson = json.loads(get_geojson())
    geojson = {"geometry": geojson}
    print(json.dumps(geojson))
I get the expected result using the following:
    '{"geometry":' + d + "}"
It prepends the string {"geometry": to the string d and appends } at the end. The variable d is:
    d = '{"type":"Polygon","co (rest of json) ,6563498.44078949]]]}'
Or you can use the json library for this:
    import json
    # d is the same string as above; it can also come from a file, read using json.load(FILE)
    data = json.loads(d)
    # Create your new object:
    result = {'geometry': data}
    # Print your new JSON:
    print(json.dumps(result, indent=2))
Edit: if you instead use
    '"geometry":{' + d + "}"
note that you get a string starting with "geometry": and a {, directly followed by another { from your input JSON. This is neither a correct dictionary nor proper JSON. Result:
    '"geometry":{{"type":"Polygon", ... ,6563498.44078949]]]}}'
(the dots are just the rest of your original JSON).
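To see why the slicing attempt from the question fails and how the wrapping approach compares, here is a small sketch. The polygon is a shortened stand-in with made-up coordinates (an assumption, not the asker's data), and wrap_geometry is a hypothetical helper name:

```python
import json

# Shortened stand-in for geojson_1 (hypothetical coordinates).
geojson_1 = '{"type":"Polygon","coordinates":[[[0.0,0.0],[1.0,0.0],[1.0,1.0],[0.0,0.0]]]}'

# The slicing attempt from the question: index 2 falls after the opening quote
# of "type", and the inserted text has no quotes or brace of its own.
attempt = geojson_1[:2] + "geometry:" + geojson_1[2:]
attempt = attempt[:len(attempt)] + "}" + attempt[len(attempt):]
print(attempt[:24])  # shows the key mangled into "geometry:type"

def wrap_geometry(text):
    """Parse the GeoJSON text and nest it under a "geometry" key."""
    # Compact separators keep the output free of extra spaces.
    return json.dumps({"geometry": json.loads(text)}, separators=(",", ":"))

wrapped = wrap_geometry(geojson_1)
print(wrapped)
```

Going through json.loads and json.dumps validates the input as a side effect, but it may also normalize how numbers are written; if the text must be preserved byte for byte, plain concatenation ('{"geometry":' + text + '}') avoids that.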
How to remove numbers from json in python
I have some JSON-like data in this format:
    json = 5843080158430803{"name":"NAME", "age":"56",}
How do I get
    {"name":"NAME", "age":"56",}
using regex or split in Python (and which is the better method for this)? Thanks in advance.
Split the string at the first occurrence of { and take the second element of the resulting list. We also have to add the { back, because the split function removes it:
    json = '5843080158430803{"name":"NAME", "age":"56",}'
    json = '{' + json.split('{', 1)[1]
    print(json)
Result:
    {"name":"NAME", "age":"56",}
Perhaps you could split at the first { and then remove the part before it. I am assuming the JSON you have above is actually a string. Then you could do:
    # split returns a list; element 0 is everything before the first {
    json_prefix = json.split("{", 1)[0]
    # remove only the leading occurrence of that prefix
    json = json.replace(json_prefix, "", 1)
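As a further sketch, the same result can be had with str.find and a slice, which leaves the { in place and makes the "no brace at all" case explicit. The function name strip_prefix is hypothetical. The snippet also checks something the question leaves open: the trailing comma before } means the remaining text is still not strictly valid JSON for Python's json module.

```python
import json

raw = '5843080158430803{"name":"NAME", "age":"56",}'

def strip_prefix(text):
    """Return text starting at the first '{'; raise if there is none."""
    start = text.find("{")
    if start == -1:
        raise ValueError("no '{' found in input")
    return text[start:]

cleaned = strip_prefix(raw)
print(cleaned)

# The trailing comma before '}' is not valid JSON, so json.loads rejects it.
try:
    json.loads(cleaned)
    is_valid_json = True
except json.JSONDecodeError:
    is_valid_json = False
print(is_valid_json)
```

For a fixed "digits then object" layout, split, find, and a regex are all workable; the slice above simply avoids building intermediate lists.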
Convert a string with the name of a variable to a dictionary
I have a string that is a little complex, in that it has some objects embedded as values. I need to convert it to a proper dict or JSON.
    foo = '{"Source": "my source", "Input": {"User_id": 18, "some_object": this_is_a_variable_actually}}'
Notice that the last key, some_object, has a value which is neither a string nor an int. Hence, when I call json.loads or ast.literal_eval, I get malformed string errors, and so "Converting a String to Dictionary" doesn't work. I have no control over the source of the string. Is it possible to convert such strings to a dictionary?
The result I need is a dict like this:
    dict = { "Source" : "Good", "object1": variable1, "object2": variable2 }
The thing is that I wouldn't know what variable1 or variable2 is. There is no pattern here.
One point I want to mention: if I can turn the variables into plain strings, that is also fine. For example:
    dict = { "Source" : "Good", "object1": "variable1", "object2": "variable2" }
This would be good for my purpose. Thanks for all the answers.
This is a bit of a kludge: it uses the demjson module, which lets you parse most of a somewhat non-conforming JSON string and lists the errors. You can then use those errors to find the invalid tokens and put quotes around them so that the string parses correctly, e.g.:
    import demjson
    import re

    foo = '{"Source": "my source", "Input": {"User_id": 18, "some_object": this_is_a_variable_actually}}'

    def f(json_str):
        res = demjson.decode(json_str, strict=False, return_errors=True)
        if not res.errors:
            return res
        # quote every token that demjson reported as an error
        for err in res.errors:
            var = err.args[1]
            json_str = re.sub(r'\b{}\b'.format(var), '"{}"'.format(var), json_str)
        return demjson.decode(json_str, strict=False)

    res = f(foo)
Gives you:
    {'Input': {'User_id': 18, 'some_object': 'this_is_a_variable_actually'}, 'Source': 'my source'}
Note that while this should work for the example data presented, your mileage may vary if there are other nuisances in your input that require further munging.
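If installing demjson is not an option, a standard-library-only sketch of the same idea (quote the bare tokens, then parse) might look like the following. This is a different technique from the demjson answer, not a reimplementation of it: it uses a regular expression that assumes a bare value is an identifier-like word sitting between a colon and a comma, } or ]. The names BARE_VALUE and quote_bare_values are hypothetical.

```python
import json
import re

foo = '{"Source": "my source", "Input": {"User_id": 18, "some_object": this_is_a_variable_actually}}'

# A colon, then an identifier-like word, then a delimiter that ends the value.
BARE_VALUE = re.compile(r'(:\s*)([A-Za-z_][A-Za-z0-9_]*)(\s*[,}\]])')

def quote_bare_values(text):
    """Wrap unquoted identifier-like values in double quotes."""
    def repl(match):
        word = match.group(2)
        # Leave the JSON literals alone.
        if word in ("true", "false", "null"):
            return match.group(0)
        return '{}"{}"{}'.format(match.group(1), word, match.group(3))
    return BARE_VALUE.sub(repl, text)

result = json.loads(quote_bare_values(foo))
print(result)
```

The obvious limitation is that the pattern does not know about string boundaries, so a quoted value that itself contains something like ": word," would be altered too. For input you do not control, treat this as a starting point rather than a general parser.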
python: how do I parse a stream of json arrays with ijson library
The incoming data resembles the following:
    [{ "foo": "bar" }] [{ "bar": "baz" }] [{ "baz": "foo" }]
As you can see, it is arrays of objects strung together: JSON-ish. ijson is able to handle the first array, and then I get:
    ijson.common.JSONError: Additional data
when it hits the subsequent arrays. How do I get around this?
Here's a first cut at the problem that at least has a working regex substitution to turn the full string into valid JSON. It only works if you're OK with reading the full input stream before parsing it as JSON.
    import json
    import re

    input = ''
    for line in inputStream:
        input = input + line
    # input == '[{"foo": "bar"}][{"bar": "baz"}][{"baz": "foo"}]'

    # wrap in [] and put commas between each ][ (allowing whitespace in between)
    sanitizedInput = re.sub(r"\]\s*\[", "],[", "[%s]" % input)
    # sanitizedInput == '[[{"foo": "bar"}],[{"bar": "baz"}],[{"baz": "foo"}]]'

    # then parse sanitizedInput
    parsed = json.loads(sanitizedInput)
    print(parsed)
    #=> [[{'foo': 'bar'}], [{'bar': 'baz'}], [{'baz': 'foo'}]]
Note: since you're reading the whole thing as a string, you can use json instead of ijson.
You can use json.JSONDecoder.raw_decode to walk through the string. Its documentation indeed says:
    This can be used to decode a JSON document from a string that may have extraneous data at the end.
The following code sample assumes all the JSON values are in one big string:
    import json

    def json_elements(string):
        # raw_decode is an instance method, so create a decoder first
        decoder = json.JSONDecoder()
        while True:
            try:
                # raw_decode does not skip leading whitespace itself
                string = string.lstrip()
                (element, position) = decoder.raw_decode(string)
                yield element
                string = string[position:]
            except ValueError:
                break
To avoid dealing with raw_decode yourself and to be able to parse a stream chunk by chunk, I would recommend a library I made for this exact purpose: streamcat.
    import json
    import streamcat

    def json_elements(stream):
        decoder = json.JSONDecoder()
        yield from streamcat.stream_to_iterator(stream, decoder)
This works for any concatenation of JSON values regardless of how many white-space characters are used within them or between them. If you have control over how your input stream is encoded, you may want to consider using line-delimited JSON, which makes parsing easier.
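For reference, here is a rough standard-library-only sketch of chunk-by-chunk parsing built on raw_decode. It is not how streamcat or ijson work internally; the function name iter_json_values and the chunk_size parameter are assumptions made for illustration. It also assumes every top-level value is an array or object, since a bare number split across two chunks could be decoded too early.

```python
import io
import json

def iter_json_values(stream, chunk_size=4096):
    """Yield each top-level JSON value from a text stream of concatenated values."""
    decoder = json.JSONDecoder()
    buffer = ""
    while True:
        chunk = stream.read(chunk_size)
        buffer += chunk
        # Decode as many complete values as the buffer currently holds.
        while True:
            buffer = buffer.lstrip()
            if not buffer:
                break
            try:
                value, end = decoder.raw_decode(buffer)
            except ValueError:
                # Probably an incomplete value: wait for more data.
                break
            yield value
            buffer = buffer[end:]
        if not chunk:
            # End of stream: anything left over could not be parsed.
            if buffer.strip():
                raise ValueError("trailing data is not valid JSON")
            return

# Usage with the data from the question, read in deliberately tiny chunks.
stream = io.StringIO('[{ "foo": "bar" }] [{ "bar": "baz" }]\n[{ "baz": "foo" }]')
values = list(iter_json_values(stream, chunk_size=5))
print(values)
```

Each failed attempt re-parses the buffer from its start, so a single very large array costs repeated work. For inputs like that, an incremental parser is the better fit.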