I am trying to import a file which was saved using json.dumps and contains tweet coordinates:
{
"type": "Point",
"coordinates": [
-4.62352292,
55.44787441
]
}
My code is:
>>> import json
>>> data = json.loads('/Users/JoshuaHawley/clean1.txt')
But each time I get the error:
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
I want to end up extracting all the coordinates and saving them separately to a different file so they can then be mapped, but this seemingly simple problem is stopping me from doing so. I have looked at answers to similar errors but don't seem to be able to apply it to this. Any help would be appreciated as I am relatively new to python.
json.loads() takes a JSON encoded string, not a filename. You want to use json.load() (no s) instead and pass in an open file object:
with open('/Users/JoshuaHawley/clean1.txt') as jsonfile:
data = json.load(jsonfile)
The open() command produces a file object that json.load() can then read from, to produce the decoded Python object for you. The with statement ensures that the file is closed again when done.
The alternative is to read the data yourself and then pass it into json.loads().
It helped for me to add "myfile.seek(0)", move the pointer to the 0 character:
with open(storage_path, 'r') as myfile:
if len(myfile.readlines()) != 0:
myfile.seek(0)
Bank_0 = json.load(myfile)
I got same type of error after reading in a json file creating from python.
Same error occurred whether i read into a string and tried json.loads() or straight from file with json.load()
In my case, it turned out to be because I had written python booleans (False/True) straight out to the file.
Trying to read them back in again caused this error.
When i modified to valid json (true/false), json.load worked fine
Didnt see any SO questions with this as a possible cause for this error so adding here for reference.
You may use this function (it works with data):
def read_json_file(filename):
with open(filename, 'r') as f:
cache = f.read()
data = eval(cache)
return data
Or, you may put this in your code (it has the same effect):
def read_json_file(filename):
data = []
with open(filename, 'r') as f:
data = [json.loads(_.replace('}]}"},', '}]}"}')) for _ in f.readlines()]
return data
import json
file_path = "C:/Projects/Tryouts/books.json"
with open(file_path, 'r') as j:
contents = json.loads(j.read())
print(contents)
Related
I am trying to do something like this which uses reading , appending and writing at the same time.
with open("data.json", mode="a+") as file:
# 1.Reading old data
data = json.load(file)
# 2. Updating old data with new data
data.update(new_dict)
# 3.Writing into json file
json.dump(data,file,indent=4)
But it shows json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
First you need to open the file in mode="r+". Update the old data with new, then seek(0) to the beginning of the file, write your updated json data, and then truncate the rest:
with open("data.json", mode="r+") as file:
file.seek(0, 2)
if file.tell():
file.seek(0)
data = json.load(file)
data.update(new_dict)
else:
data = new_dict
file.seek(0)
json.dump(data, file, indent=4)
file.truncate()
Reason it doesn't work with a+ mode is that it will always write at the end of the file, irrespective of seek(0). So your updated json data just gets appended to the older one like a normal text data, but since its not a valid json syntax, it causes JSON Decode error.
Check here for more detailed info on how the different open modes work.
I have big size of json file to parse with python, however it's incomplete (i.e., missing parentheses in the end). The json file consist of one big json object which contains json objects inside. All json object in outer json object is complete, just finishing parenthese are missing.
for example, its structure is like this.
{bigger_json_head:value, another_key:[{small_complete_json1},{small_complete_json2}, ...,{small_complete_json_n},
So final "]}" are missing. however, each small json forms a single row so when I tried to print each line of the json file I have, I get each json object as a single string.
so I've tried to use:
with open("file.json","r",encoding="UTF-8") as f:
for line in f.readlines()
line_arr.append(line)
I expected to have a list with line of json object as its element
and then I tried below after the process:
for json_line in line_arr:
try:
json_str = json.loads(json_line)
print(json_str)
except json.decoder.JSONDecodeError:
continue
I expected from this code block, except first and last string, this code would print json string to console. However, it printed nothing and just got decode error.
Is there anyone who solved similar problem? please help. Thank you
If the faulty json file only miss the final "]}", then you can actually fix it before parse it.
Here is an example code to illustrate:
with open("file.json","r",encoding="UTF-8") as f:
faulty_json_str = f.read()
fixed_json_str = faulty_json_str + ']}'
json_obj = json.loads(fixed_json_str)
This question might have been asked many times but I am still unable to understand how to use json file. I use json.dump(data, filename). While dumping I get unnecessary {} at the end of the file. So json.load(data) gives me below error.
simplejson.scanner.JSONDecodeError: Extra data: line 1 column 1865 - line 1 column 1867 (char 1864 - 1866)
I read that there is no way to load a first or second dictionary. I have also read that there is a separater which can be used with json dump but I see no use here. Should I be using encoding, decoding here?
My json.dump file:
{
"deployCI2": ["094fd196-20f0-4e8d-b946-f74a56d2f319", "6a1ce382-98c6-4058-a929-95a7d2415fd0"],
"deployCI3": ["c8fff661-4482-4908-b722-4fac0227a8b0", "929cf1fa-3fa6-4f95-8464-d58e5490f4cf"],
"deployCI4": ["9f8ffa3c-460d-43a9-8113-58e891340e1b", "6e535e92-4da2-4228-a6ab-c8fc8d31adcd", "8e26a35e-7fb9-43b3-8026-d1283f7b678c", "f40e5c29-b4df-4cfb-9d7f-3bcc9c4dcf9f"],
"HeenaStackABC": [], "HeenaStackABC-DISK_VM1-mm55lkkvccej": ["cc2a89a2-3b27-4f88-af09-b3b0b1301056"]
}{}
Edited: I think the code is doing something here.
with open('stackList.json', 'a') as f:
for stack in stacks:
try:
hlist = hc.resources.list(stack_id=stack.id)
vlist = [o.physical_resource_id for o in hlist if o.resource_type =='OS::Cinder::Volume']
myDict[stack.stack_name] = vlist
except heatclient.exc.HTTPBadRequest as e:
pass
json.dump(myDict,f)
I edited the code like below. I hope this is valid. It removed the last braces
if len(myDict) != 0:
json.dump(myDict, f)
Your problem is here :
with open('stackList.json', 'a') as f:
You're opening the file in 'append' mode, so each time the code is executed it appends the dump to your file. The result you complain about comes from this and mydict being empty on the second run.
You either have to open the file in "w" ("write") mode which will overwrite the existing content (you can eventually create a new dump file for each call) or switch to the "jsonline" format (but your file will NOT be a valid json file anymore and any code reading it will have to know to parse it as jsonlines)
This is my first question here, I'm new to python and trying to figure some things out to set up an automatic 3D model processing chain that relies on data being stored in JSON files moving from one server to another.
The problem is that I need to store absolute paths to files that are being processed, but these absolute paths should be modified in the original JSON files upon the first time that they are processed.
Basically the JSON file comes in like this:
{
"normaldir": "D:\\Outgoing\\1621_1\\",
"projectdir": "D:\\Outgoing\\1622_2\\"
}
And I would like to rename the file paths to
{
"normaldir": "X:\\Incoming\\1621_1\\",
"projectdir": "X:\\Incoming\\1622_2\\",
}
What I've been trying to do is replace the first part of the path using this code, but it isn't working:
def processscan(scanfile):
configfile= MonitorDirectory + scanfile
with open(configfile, 'r+') as file:
content = file.read()
file.seek(0)
content.replace("D:\\Outgoing\\", "X:\\Incoming\\")
file.write(content)
However this was not working at all, so I tried interpreting the JSON file properly and replacing the key code from here:
def processscan(scanfile):
configfile= MonitorDirectory + scanfile
with open(configfile, 'r+') as settingsData:
settings = json.load(settingsData)
settings['normaldir'] = 'X:\\Incoming\\1621_1\\'
settings['projectdir'] = 'X:\\Incoming\\1622_2\\'
settingsData.seek(0) # rewind to beginning of file
settingsData.write(json.dumps(settings,indent=2,sort_keys=True)) #write the updated version
settingsData.truncate() #truncate the remainder of the data in the file
This works perfectly, however I'm replacing the whole path so it won't really work for every JSON file that I need to process. What I would really like to do is to take a JSON key corresponding to a file path, keep the last 8 characters and replace the rest of the patch with a new string, but I can't figure out how to do this using json in python, as far as I can tell I can't edit part of a key.
Does anyone have a workaround for this?
Thanks!
Your replace logic failed as you need to reassign content to the new string,str.replace is not an inplace operation, it creates a new string:
content = content.replace("D:\\Outgoing\\", "X:\\Incoming\\")
Using the json approach just do a replace too, using the current value:
settings['normaldir'] = settings['normaldir'].replace("D:\\Outgoing\\", "X:\\Incoming\\")
You also would want truncate() before you write or just reopen the file with w and dump/write the new value, if you really wanted to just keep the last 8 chars and prepend a string:
settings['normaldir'] = "X:\\Incoming\\" + settings['normaldir'][-8:]
Python come with a json library.
With this library, you can read and write JSON files (or JSON strings).
Parsed data is converted to Python objects and vice versa.
To use the json library, simply import it:
import json
Say your data is stored in input_data.json file.
input_data_path = "input_data.json"
You read the file like this:
import io
with io.open(input_data_path, mode="rb") as fd:
obj = json.load(fd)
or, alternatively:
with io.open(input_data_path, mode="rb") as fd:
content = fd.read()
obj = json.loads(content)
Your data is automatically converted into Python objects, here you get a dict:
print(repr(obj))
# {u'projectdir': u'D:\\Outgoing\\1622_2\\',
# u'normaldir': u'D:\\Outgoing\\1621_1\\'}
note: I'm using Python 2.7 so you get the unicode string prefixed by "u", like u'projectdir'.
It's now easy to change the values for normaldir and projectdir:
obj["normaldir"] = "X:\\Incoming\\1621_1\\"
obj["projectdir"] = "X:\\Incoming\\1622_2\\"
Since obj is a dict, you can also use the update method like this:
obj.update({'normaldir': "X:\\Incoming\\1621_1\\",
'projectdir': "X:\\Incoming\\1622_2\\"})
That way, you use a similar syntax like JSON.
Finally, you can write your Python object back to JSON file:
output_data_path = "output_data.json"
with io.open(output_data_path, mode="wb") as fd:
json.dump(obj, fd)
or, alternatively with indentation:
content = json.dumps(obj, indent=True)
with io.open(output_data_path, mode="wb") as fd:
fd.write(content)
Remarks: reading/writing JSON objects is faster with a buffer (the content variable).
.replace returns a new string, and don't change it. But you should not treat json-files as normal text files, so you can combine parsing json with replace:
def processscan(scanfile):
configfile= MonitorDirectory + scanfile
with open(configfile, 'rb') as settingsData:
settings = json.load(settingsData)
settings = {k: v.replace("D:\\Outgoing\\", "X:\\Incoming\\")
for k, v in settings.items()
}
with open(configfile, 'wb') as settingsData:
json.dump(settings, settingsData)
I am downloading Json files from an API, I use the following code to write the JSON. Each item the loop gives me a JSON file. I need to save it and extract entities from the appended JSON file using a loop.
for item in style_ls:
dat = get_json(api, item)
specs_dict[item] = dat
with open("specs_append.txt", "a") as myfile:
json.dump(dat, myfile)
myfile.close()
print item
with open ("specs_data.txt", "w") as my file:
json.dump(spec_dict, myfile)
myfile.close()
I know that I cannot get a valid JSON format from the specs_append.txt, but I can get one from the specs_data.txt. I am doing the first one just because my program needs atleast 3-4 days to complete and there are high chances that my system may shutdown. So is there anyway I can do this efficiently ?
If not is there anyway I can extract it from specs_append.txt <{JSON}{JSON}> format (which is not a valid JSON format)?
If not should I write specs_dict to a txt file every time in the loop, so that even if program gets terminated i can start if from that point in loop and still get a valid json format?
I suggest several possible solutions.
One solution is to write custom code to slurp in the input file. I would suggest putting a special line before each JSON object in the file, such as: ###
Then you could write code like this:
import json
def json_get_objects(f):
temp = ''
line = next(f) # pull first line
assert line == SPECIAL_LINE
for line in f:
if line != SPECIAL_LINE:
temp += line
else:
# found special marker, temp now contains a complete JSON object
j = json.loads(temp)
yield j
temp = ''
# after loop done, yield up last JSON object
if temp:
j = json.loads(temp)
yield j
with open("specs_data.txt", "r") as f:
for j in json_get_objects(f):
pass # do something with JSON object j
Two notes on this. First, I am simply appending to a string over and over; this used to be a very slow way to do this in Python, so if you are using a very old version of Python, don't do it this way unless your JSON objects are very small. Second, I wrote code to split the input and yield up JSON objects one at a time, but you could also use a guaranteed-unique string, slurp in all the data with a single call to f.read() and then split on your guaranteed-unique string using the str.split() method function.
Another solution would be to write the whole file as a valid JSON list of valid JSON objects. Write the file like this:
{"mylist":[
# first JSON object, followed by a comma
# second JSON object, followed by a comma
# third JSON object
]}
This would require your file appending code to open the file with writing permission, and seek to the last ] in the file before writing a comma plus newline, then the new JSON object on the end, and then finally writing ]} to close out the file. If you do it this way, you can use json.loads() to slurp the whole thing in and have a list of JSON objects.
Finally, I suggest that maybe you should just use a database. Use SQLite or something and just throw the JSON strings in to a table. If you choose this, I suggest using an ORM to make your life simple, rather than writing SQL commands by hand.
Personally, I favor the first suggestion: write in a special line like ###, then have custom code to split the input on those marks and then get the JSON objects.
EDIT: Okay, the first suggestion was sort of assuming that the JSON was formatted for human readability, with a bunch of short lines:
{
"foo": 0,
"bar": 1,
"baz": 2
}
But it's all run together as one big long line:
{"foo":0,"bar":1,"baz":2}
Here are three ways to fix this.
0) write a newline before the ### and after it, like so:
###
{"foo":0,"bar":1,"baz":2}
###
{"foo":0,"bar":1,"baz":2}
Then each input line will alternately be ### or a complete JSON object.
1) As long as SPECIAL_LINE is completely unique (never appears inside a string in the JSON) you can do this:
with open("specs_data.txt", "r") as f:
temp = f.read() # read entire file contents
lst = temp.split(SPECIAL_LINE)
json_objects = [json.loads(x) for x in lst]
for j in json_objects:
pass # do something with JSON object j
The .split() method function can split up the temp string into JSON objects for you.
2) If you are certain that each JSON object will never have a newline character inside it, you could simply write JSON objects to the file, one after another, putting a newline after each; then assume that each line is a JSON object:
import json
def json_get_objects(f):
for line in f:
if line.strip():
yield json.loads(line)
with open("specs_data.txt", "r") as f:
for j in json_get_objects(f):
pass # do something with JSON object j
I like the simplicity of option (2), but I like the reliability of option (0). If a newline ever got written in as part of a JSON object, option (0) would still work, but option (2) would error.
Again, you can also simply use an actual database (SQLite) with an ORM and let the database worry about the details.
Good luck.
Append json data to a dict on every loop.
In the end dump this dict as a json and write it to a file.
For getting you an idea for appending data to dict:
>>> d1 = {'suku':12}
>>> t1 = {'suku1':212}
>>> d1.update(t1)
>>> d1
{'suku1': 212, 'suku': 12}