I'm trying to edit a meta file with new information generated with a python script and don't want to just append the information with a new JSON object, but rather update the read information.
As input I have something like this:
{
"foo1": [
{
"bar1": 0,
"bar2": 1337
},
...
}
So far my code reads the information and stores it in a dictionary. After that the information in this file is deleted and replaced with the updated dictionary. The Code is as shown below:
...
outputData = {"foo2": [{"bar3": True, "bar4": 123}]}
with open(metaFile, 'r+') as f:
    metaData = json.load(f)
    f.seek(0)
    f.truncate()
    metaData.update(outputData)
    f.write(json.dumps(metaData, indent=2))
    f.close()
...
As a result this comes out as expected:
{
"foo1": [
{
"bar1": 0,
"bar2": 1337
}
],
"foo2":[
{
"bar3": true,
"bar4": 123
}
]
}
Now to my exact question, is it possible to edit the file in such a way, that the content in the file doesn't get deleted at first and written again? Because if something happens with the metaData after the initialization, the information is just gone.
Changing the 'r+' argument to 'w+' (the + is optional) creates a new file instead of reading the existing one first, so all the data is gone at that point. With 'a', the outputData cannot be merged before it is added, because that would write the already existing information a second time. Without updating metaData it just appends a new object, and that's not what I had in mind.
In your case, if you are sure the content after your changes will always be the same size as or larger than what's currently in the file, you can call f.write(data) directly.
This way, you don't have to truncate (and lose) the file before writing it.
Also, when you open a file using the with syntax, it will be automatically closed once the with block ends.
In the end your code would look something like this:
outputData = {"foo2": [{"bar3": True, "bar4": 123}]}
with open(metaFile, 'r+') as f:
    metaData = json.load(f)
    f.seek(0)
    metaData.update(outputData)
    f.write(json.dumps(metaData, indent=2))
# rest of your code with the normal indentation level
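You can rewrite only the part of the file that changed, as follows, or alternatively use MongoDB.
outputData = {"foo2": [{"bar3": True, "bar4": 123}]}
with open(metaFile, 'r+') as fp:
    origin = fp.read()
    target = json.dumps(dict(json.loads(origin), **outputData), indent=2)
    # first position where old and new text differ (or the end of the shorter one)
    index = next((i for i, (a, b) in enumerate(zip(origin, target)) if a != b),
                 min(len(origin), len(target)))
    # assumes the character index equals the file offset (ASCII content, '\n' line endings)
    fp.seek(index)
    fp.truncate()
    fp.write(target[index:])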
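Neither answer above fully protects the file if the process dies in the middle of the write. A common alternative, sketched here under the assumption that the whole file fits in memory, is to write the merged data to a temporary file and then swap it into place with os.replace. The function name update_json_atomically is hypothetical and only used for illustration.

```python
import json
import os
import tempfile


def update_json_atomically(path, new_items):
    """Merge new_items into the JSON object stored at path without
    truncating the original file first."""
    with open(path, "r", encoding="utf-8") as f:
        meta = json.load(f)
    meta.update(new_items)

    # Create the temporary file in the same directory as the target,
    # so that os.replace() does not have to cross filesystems.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(meta, tmp, indent=2)
        # The original file stays untouched until this call succeeds.
        os.replace(tmp_path, path)
    except BaseException:
        os.remove(tmp_path)  # clean up the temporary file on failure
        raise
    return meta
```

If anything fails before os.replace, the old file is still complete and only the temporary file is discarded.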
I need to modify a YAML file and add several fields. I am using the ruamel.yaml package.
First I load the YAML file:
data = yaml.load(file_name)
I can easily add new simple fields, like:
data['prop1'] = "value1"
The problem I face is that I need to add a nested dictionary that contains an array:
prop2:
  prop3:
    - prop4:
        prop5: "Some title"
        prop6: "Some more data"
I tried to define:
record_to_add = dict(prop2 = dict(prop3 = ['prop4']))
This works, but when I try to add prop5 beneath it, it fails:
record_to_add = dict(prop2 = dict(prop3 = ['prop4'= dict(prop5 = "Value")]))
I get
SyntaxError: expression cannot contain assignment, perhaps you meant "=="?
What am I doing wrong?
The problem has little to do with ruamel.yaml. This:
['prop4'= dict(prop5 = "Value")]
is invalid Python as a list ([ ]) expects comma separated values. You would need to use something like:
record_to_add = dict(prop2 = dict(prop3 = dict(prop4= [dict(prop5 = "Some title"), dict(prop6='Some more data'),])))
As your program is incomplete I am not sure if you are using the old API or not. Make sure to use
import ruamel.yaml
yaml = ruamel.yaml.YAML()
and not
import ruamel.yaml as yaml
It's because of ['prop4'= <> ]. Instead, record_to_add = dict(prop2 = dict(prop3 = [dict(prop4 = dict(prop5 = "Value"))])) should work.
Another alternative would be:
import yaml
data = {
    "prop2": {
        "prop3": [
            {
                "prop4": {
                    "prop5": "some title",
                    "prop6": "some more data"
                }
            }
        ]
    }
}
with open(filename, 'w') as outfile:
    yaml.dump(data, outfile, default_flow_style=False)
I am using Python 3 to read this file and convert it to a dictionary.
I have this string from a file and would like to know how to create a dictionary from it.
[User]
Date=10/26/2003
Time=09:01:01 AM
User=teodor
UserText=Max Cor
UserTextUnicode=392039n9dj90j32
[System]
Type=Absolute
Dnumber=QS236
Software=1.1.1.2
BuildNr=0923875
Source=LAM
Column=OWKD
[Build]
StageX=12345
Spotter=2
ApertureX=0.0098743
ApertureY=0.2431899
ShiftXYZ=-4.234809e-002
[Text]
Text=Here is the Text files
DataBaseNumber=The database number is 918723
..... (There are more than 1000 lines per file) ...
In the text I have "Name=Something", and I would like to convert it as follows:
{'Date':'10/26/2003',
'Time':'09:01:01 AM',
'User':'teodor',
'UserText':'Max Cor',
'UserTextUnicode':'392039n9dj90j32'.......}
The word between [ ] can be removed, like [User], [System], [Build], [Text], etc...
In some fields there is only the first part of the string:
[Colors]
Red=
Blue=
Yellow=
DarkBlue=
What you have is an ordinary properties file. You can use this example to read the values into a map:
try (InputStream input = new FileInputStream("your_file_path")) {
    Properties prop = new Properties();
    prop.load(input);
    // prop.getProperty("User") returns "teodor"
} catch (IOException ex) {
    ex.printStackTrace();
}
EDIT:
For a Python solution, refer to the answered question.
You can use configparser to read .ini or .properties files (the format you have).
import configparser
config = configparser.ConfigParser()
config.read('your_file_path')
# dict(config['User']) == {'date': '10/26/2003', 'time': '09:01:01 AM', ...}  (keys are lowercased by default)
# config['User']['User'] == 'teodor'  (lookups are case-insensitive)
# config['System']['Type'] == 'Absolute'
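Since the question asks for one flat dictionary with the original key spelling, here is a minimal sketch of how that could be done with configparser. It assumes key names do not repeat across sections (later sections would overwrite earlier ones), and it uses a shortened inline sample in place of the real file.

```python
import configparser

# Shortened stand-in for the file from the question.
sample = """\
[User]
Date=10/26/2003
Time=09:01:01 AM
User=teodor
[Colors]
Red=
Blue=
"""

# interpolation=None keeps '%' characters in values literal.
parser = configparser.ConfigParser(interpolation=None)
parser.optionxform = str  # keep key case; the default lowercases keys
parser.read_string(sample)

# Drop the section names and collect every key/value pair in one dict.
flat = {}
for section in parser.sections():
    flat.update(parser[section])

print(flat)
```

For a real file, parser.read('your_file_path') would take the place of read_string. Keys with nothing after the = end up with an empty string as their value.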
This can easily be done in Python. Assuming your file is named test.txt.
This will also work for lines with nothing after the = as well as lines with multiple =.
d = {}
with open('test.txt', 'r') as f:
    for line in f:
        line = line.strip()  # Remove surrounding whitespace and the newline
        parts = line.split('=', 1)  # Split at the first `=` only
        if len(parts) > 1:
            d[parts[0]] = parts[1]
print(d)
Output:
{
"Date": "10/26/2003",
"Time": "09:01:01 AM",
"User": "teodor",
"UserText": "Max Cor",
"UserTextUnicode": "392039n9dj90j32",
"Type": "Absolute",
"Dnumber": "QS236",
"Software": "1.1.1.2",
"BuildNr": "0923875",
"Source": "LAM",
"Column": "OWKD",
"StageX": "12345",
"Spotter": "2",
"ApertureX": "0.0098743",
"ApertureY": "0.2431899",
"ShiftXYZ": "-4.234809e-002",
"Text": "Here is the Text files",
"DataBaseNumber": "The database number is 918723"
}
I would suggest first cleaning the input to get rid of the [] lines.
After that you can split the remaining lines at the "=" separator and convert the pairs to a dictionary.
How can I extract all the names from a big JSON file using Python 3?
with open('out.json', 'r') as f:
    data = f.read()
Here I'm opening the JSON file. After that I tried this:
a = json.dumps(data)
b= json.loads(a)
print (b)
Here is my data from JSON file.
{"data": [
{"errorCode":"E0000011","errorSummary":"Invalid token provided","errorLink":"E0000011","errorId":"oaeZ3PywqdMRWSQuA9_KML-ow","errorCauses":[]},
{"errorCode":"E0000011","errorSummary":"Invalid token provided","errorLink":"E0000011","errorId":"oaet_rFPO5bSkuEGKNI9a5vgQ","errorCauses":[]},
{"errorCode":"E0000011","errorSummary":"Invalid token provided","errorLink":"E0000011","errorId":"oaejsPt3fprRCOiYx-p7mbu5g","errorCauses":[]}]}
I need output like this
{"oaeZ3PywqdMRWSQuA9_KML-ow","oaet_rFPO5bSkuEGKNI9a5vgQ","oaejsPt3fprRCOiYx-p7mbu5g"}
I want all errorId.
Try it like this:
n = {b['name'] for b in data['movie']['people']['actors']}
If you want to read or process the JSON data, you have to load it first.
Here is an example:
from json import loads
with open('out.json', 'r') as f:
    data = f.read()
load = loads(data)  # parse the JSON text into Python objects
names = [i['name'] for i in load['movie']['people']['actors']]
Alternatively, replace the list comprehension with the set comprehension from Vikas P's answer.
Try using the json module for the above.
import json
with open('path_to_file/data.json') as f:
    data = json.load(f)
actor_names = {names['name'] for names in data['movie']['people']['actors']}
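The answers above use keys (movie, people, actors, name) that do not appear in the sample data posted in the question. For the posted sample, a sketch along the following lines may be closer to what was asked. The inline string is a shortened stand-in for out.json, and it assumes every entry under "data" has an "errorId" key.

```python
import json

# Shortened stand-in for the content of out.json from the question.
raw = """
{"data": [
  {"errorCode": "E0000011", "errorId": "oaeZ3PywqdMRWSQuA9_KML-ow", "errorCauses": []},
  {"errorCode": "E0000011", "errorId": "oaet_rFPO5bSkuEGKNI9a5vgQ", "errorCauses": []},
  {"errorCode": "E0000011", "errorId": "oaejsPt3fprRCOiYx-p7mbu5g", "errorCauses": []}
]}
"""

payload = json.loads(raw)  # with a real file: json.load(open_file_object)

# One errorId per entry, in file order.
error_ids = [entry["errorId"] for entry in payload["data"]]
print(error_ids)
```

Wrapping the result in set(error_ids) gives the brace-style collection shown in the question, at the cost of losing the original order.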
I am doing a task in Python (learning phase) in which I have a text file with a list of IPs, e.g.:
10.8.9.0
10.7.8.7
10.4.5.6 and so on, one per line.
I have to read its contents and create JSON from them as [{"ip":"10.8.9.0"},{"ip":"10.7.8.7"}..]
Code:
with open("filename.txt") as file:
    content = [x.strip('\n') for x in file.readlines()]
print content
print "content",type(content)
content_json=json.dumps(content)
print content_json
print type(content_json)
The output of content is ['ip address1','ip address2'], which is a list.
When I dump the list into content_json, the type shown is str.
However, I need it as JSON.
My concern is that my next task is to validate each IP and add an item to the existing JSON stating {"status":"valid/invalid"}.
I don't know how to do that, as the type of my JSON is shown as str.
Kindly let me know how to proceed and add a status for every IP in the existing JSON.
I would also like to know why the type of the JSON I dumped my list into is shown as str.
The desired output should be
[
{
"ip":"10.8.9.0",
"status":"valid"
},
{
"ip":"10.7.8.A",
"status":"invalid"
}, ..so on
]
First thing: The result is a list because you're building a list with
[x.strip('\n') for x in file.readlines()]. In case you're not sure what that means: take every line x in the file, remove the \n character from it, and then build a list of those results. You want something like [{"ip":x.strip('\n')} for x in file.readlines()].
Now, the function json.dumps takes a Python object and attempts to create a JSON representation of it. That representation is serialized as a string so if you ask for the type of content_json that's what you'll get.
You have to make the distinction between a python list/dictionary and a JSON string.
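The distinction can be seen in a short round trip. This is only an illustrative sketch with two hard-coded addresses standing in for the file contents.

```python
import json

# A Python list of dictionaries (easy to modify).
records = [{"ip": "10.8.9.0"}, {"ip": "10.7.8.7"}]

text = json.dumps(records)       # Python list -> JSON text
print(type(text).__name__)       # str

restored = json.loads(text)      # JSON text -> Python list again
restored[0]["status"] = "valid"  # change the Python object, not the string
print(json.dumps(restored))
```

So the usual pattern is to keep working with the list of dictionaries and call json.dumps only once, at the point where the text is actually needed.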
This
>>> with open('input.txt') as inp:
...     result = [dict(ip=ip.strip()) for ip in inp]
...
>>> result
[{'ip': '10.8.9.0'}, {'ip': '10.7.8.7'}, {'ip': '10.4.5.6'}]
will give you a list of dictionaries that is easy to mutate. When you are done with it, you can dump it as a JSON string:
>>> result[1]['status'] = 'valid'
>>> result
[{'ip': '10.8.9.0'}, {'status': 'valid', 'ip': '10.7.8.7'}, {'ip': '10.4.5.6'}]
>>> json.dumps(result)
'[{"ip": "10.8.9.0"}, {"status": "valid", "ip": "10.7.8.7"}, {"ip": "10.4.5.6"}]'
json.dumps always returns a string. To get key: value pairs in the output, pass it dictionaries; a list of bare values is serialized as a JSON array of strings.
Refer to:
https://docs.python.org/2/library/json.html
Maybe something like this?
import json
import socket
result = list()
with open("filename.txt") as file:
    for line in file:
        ip = line.strip()
        try:
            socket.inet_aton(ip)
            result.append({"ip": ip, "status": "valid"})
        except socket.error:
            result.append({"ip": ip, "status": "invalid"})
print(json.dumps(result))
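On many platforms socket.inet_aton also accepts shortened forms such as 10.8, which may not be what is wanted here. As an alternative sketch, the standard-library ipaddress module can do a stricter IPv4 check. The helper name check_ips is hypothetical, and the input lines are hard-coded instead of being read from filename.txt.

```python
import ipaddress
import json


def check_ips(lines):
    """Return a list of {"ip": ..., "status": ...} dicts, one per non-empty line."""
    result = []
    for line in lines:
        ip = line.strip()
        if not ip:
            continue  # skip blank lines
        try:
            ipaddress.IPv4Address(ip)  # raises ValueError for malformed input
            status = "valid"
        except ValueError:
            status = "invalid"
        result.append({"ip": ip, "status": status})
    return result


print(json.dumps(check_ips(["10.8.9.0\n", "10.7.8.A\n", "10.8\n"]), indent=2))
```

With a real file, check_ips can be called with the open file object directly, since iterating over it yields one line at a time.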
Finally, I got a fix:
import os
import sys
import json
from IPy import IP
filepath="E:/Work/"
filename="data.txt"
result = list()
with open(os.path.join(filepath, filename)) as file:
    for line in file:
        ip = line.strip()
        if ip.startswith("0"):
            result.append({"ip": ip, "status": "invalid"})
        else:
            try:
                ip_add=IP(ip)
                result.append({"ip": ip, "status": "valid"})
            except ValueError:
                result.append({"ip": ip, "status": "invalid"})
print(json.dumps(result))
I have N files with dictionaries of lists, saved as a.json, b.json, ...
{
"ELEC.GEN.OOG-AK-99.A": [
["2013", null],
["2012", 2.65844],
["2011", 2.7383]
],
"ELEC.GEN.AOR-AK-99.A": [
["2015", 217.30239],
["2014", 214.46868],
["2013", 197.32097]
],
"ELEC.GEN.HYC-AK-99.A": [
["2015", 1542.29841],
["2014", 1538.738],
["2013", 1345.665]
]}
I am unclear how to save them all to one large dictionary/json file, like so:
{
"a":
{
"ELEC.GEN.OOG-AK-99.A": [
["2013", null],
["2012", 2.65844],
["2011", 2.7383]
],
"ELEC.GEN.AOR-AK-99.A": [
["2015", 217.30239],
["2014", 214.46868],
["2013", 197.32097]
],
"ELEC.GEN.HYC-AK-99.A": [
["2015", 1542.29841],
["2014", 1538.738],
["2001", 1345.665]
]},
"b": {...},
...
}
This is data I requested that will be used in a javascript graph, and it is theoretically possible to preprocess it even more when streaming the requested data from its source, as well as maybe possible to work around the fact there are so many data files I need to request to get my graph working, but both those options seem very difficult.
I don't understand the best way to parse json-that-is-meant-for-javascript in python.
====
I have tried:
import json
from collections import defaultdict
# load into memory
data = defaultdict(dict)
filelist = ["a.json", "b.json", ...]
for fn in filelist:
    with open(fn, 'rb') as f:
        # this brings up TypeError
        data[fn] = json.loads(f)
# write
out = "out.json"
with open(out, 'wb') as f:
    json.dump(data, f)
===
For json.loads() I get TypeError: expected string or buffer. For json.load() it works!
Loading from string:
>>> with open("a.json", "r") as f:
...     json.loads(f.read())
...
{u'Player2': 4, u'Player3': 10, u'Player1': 3}
>>>
Loading from file object:
>>> with open("a.json", "r") as f:
...     json.load(f)
...
{u'Player2': 4, u'Player3': 10, u'Player1': 3}
>>>
You are using json.loads instead of json.load to load a file. You also need to open the file in text mode rather than bytes mode, so change this:
with open(fn, 'rb') as f:
    data[fn] = json.loads(f)
to this:
with open(fn, 'r') as f:  # 'r' instead of 'rb'
    data[fn] = json.load(f)  # load instead of loads
Likewise, further down, open the output file with 'w' instead of 'wb' when writing.
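Putting the corrections together, the whole merge could be sketched roughly as below. The function names merge_json_files and write_merged are hypothetical, and the sketch assumes each file's base name (a for a.json) should become the top-level key, as in the desired output in the question.

```python
import json
import os


def merge_json_files(paths):
    """Return one dict that maps each file's base name (without .json)
    to the parsed content of that file."""
    merged = {}
    for path in paths:
        key = os.path.splitext(os.path.basename(path))[0]  # "a.json" -> "a"
        with open(path, "r", encoding="utf-8") as f:
            merged[key] = json.load(f)  # JSON null becomes Python None
    return merged


def write_merged(paths, out_path):
    """Merge the given files and write the result as a single JSON file."""
    merged = merge_json_files(paths)
    with open(out_path, "w", encoding="utf-8") as f:
        json.dump(merged, f)
    return merged
```

A call such as write_merged(["a.json", "b.json"], "out.json") would then produce one file with "a" and "b" as its top-level keys. A plain dict is enough here, so defaultdict is not needed.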