I have code that needs to read a JSON file with multiple lines, i.e.:
{"c1-line1": "value", "c2-line1": "value"}
{"c1-line2": "value", "c2-line2": "value"}...
and, after changing the key values (already working), I need to write a new JSON file with these multiple lines, i.e.:
{"newc1-line1": "value", "newc2-line1": "value"}
{"newc1-line2": "value", "newc2-line2": "value"}...
My problem is that my code only writes the last value read:
{"newc1-line2": "value", "newc2-line2": "value"}
My code:
def main():
    ...  # changeKeyValueCode
    writeFile(data)

def writeFile(data):
    with open('new_file.json', 'w') as f:
        json.dump(data, f)
I already tried json.dumps, and also just f.write('') or f.write('\n').
I know that data arrives in writeFile() correctly with each line's value.
How can I resolve this, please?
def main():
    ...  # changeKeyValueCode
    writeFile(data)

def writeFile(data):
    with open('new_file.json', 'a') as f:
        json.dump(data, f)
with open('new_file.json', 'a')
Opening the file with 'a' (append mode) will append data to the end of the file if it exists; otherwise, it will create an empty file and then append the data.
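Note that with append mode you will also want a newline after each record, or all the objects will run together on one line. A minimal sketch of writing every record in one pass instead (the records list name is an assumption about the surrounding code):

import json

def writeFile(records):
    # open once in 'w' mode and write one JSON object per line
    with open('new_file.json', 'w') as f:
        for record in records:
            f.write(json.dumps(record) + '\n')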
I'm attempting to convert Yelp's data set from JSON to a CSV format. The new CSV file that is created is empty.
I've tried different ways to iterate through the JSON, but they all give me a zero-byte file.
The json file looks like this:
{"business_id":"1SWheh84yJXfytovILXOAQ","name":"Arizona Biltmore Golf Club","address":"2818 E Camino Acequia Drive","city":"Phoenix","state":"AZ","postal_code":"85016","latitude":33.5221425,"longitude":-112.0184807,"stars":3.0,"review_count":5,"is_open":0,"attributes":{"GoodForKids":"False"},"categories":"Golf, Active Life","hours":null}
import json
import csv

infile = open("business.json", "r")
outfile = open("business2.csv", "w")

data = json.load(infile)
infile.close()

out = csv.writer(outfile)
out.writerow(data[0].keys())
for row in data:
    out.writerow(row.values())
I get an "Extra data" error message when the code runs. The new business2.csv file is empty and its size is zero bytes.
If your JSON has only one row, then try this:
infile = open("business.json","r")
outfile = open("business2.csv","w")
data = json.load(infile)
infile.close()
out = csv.writer(outfile)
#print(data.keys())
out.writerow(data.keys())
out.writerow(data.values())
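If instead the file has one JSON object per line, which would explain the "Extra data" error from json.load, a minimal sketch for that case (assuming every line shares the same keys; Python 3):

import json
import csv

with open("business.json", "r") as infile:
    # parse each line as its own JSON object (JSON Lines format)
    rows = [json.loads(line) for line in infile if line.strip()]

with open("business2.csv", "w", newline="") as outfile:
    out = csv.writer(outfile)
    out.writerow(rows[0].keys())  # header from the first record's keys
    for row in rows:
        out.writerow(row.values())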
Please try the code below. By using the with statement, the file is automatically closed when control moves out of the scope of the with block.
infile = open("business.json", "r")
data = json.load(infile)
infile.close()

headers = list(data.keys())
values = list(data.values())

with open("business2.csv", "w") as outfile:
    out = csv.writer(outfile)
    out.writerow(headers)
    out.writerow(values)
You need to use with so that the file gets closed.
import json
import csv

infile = open("business.json", "r")
data = json.load(infile)
infile.close()

with open("business2.csv", "w") as outfile:
    out = csv.writer(outfile)
    out.writerow(list(data.keys()))
    out.writerow(list(data.values()))
I have this csv file:
89,Network activity,ip-dst,80.179.42.44,,1,20160929
89,Payload delivery,md5,4ad2924ced722ab65ff978f83a40448e,,1,20160929
89,Network activity,domain,alkamaihd.net,,1,20160929
90,Payload delivery,md5,197c018922237828683783654d3c632a,,1,20160929
90,Network activity,domain,dnsrecordsolver.tk,,1,20160929
90,Network activity,ip-dst,178.33.94.47,,1,20160929
90,Payload delivery,filename,Airline.xls,,1,20160929
91,Payload delivery,md5,23a9bbf8d64ae893db17777bedccdc05,,1,20160929
91,Payload delivery,md5,07e47f06c5ed05a062e674f8d11b01d8,,1,20160929
91,Payload delivery,md5,bd75af219f417413a4e0fae8cd89febd,,1,20160929
91,Payload delivery,md5,9f4023f2aefc8c4c261bfdd4bd911952,,1,20160929
91,Network activity,domain,mailsinfo.net,,1,20160929
91,Payload delivery,md5,1e4653631feebf507faeb9406664792f,,1,20160929
92,Payload delivery,md5,6fa869f17b703a1282b8f386d0d87bd4,,1,20160929
92,Payload delivery,md5,24befa319fd96dea587f82eb945f5d2a,,1,20160929
I need to divide this CSV file into 4 CSV files, where the condition is the event number at the beginning of every row. So far I have created a set that includes all the event numbers {89, 90, 91, 92}, and I know that I need to write a loop within a loop and copy each row to its dedicated CSV file.
data = {
    '89': [],
    '90': [],
    '91': [],
    '92': []
}

with open('yourfile.csv') as infile:
    for line in infile:
        prefix = line[:2]
        data[prefix].append(line)

for prefix in data.keys():
    with open('csv' + prefix + '.csv', 'w') as outfile:
        outfile.writelines(data[prefix])
However, if you are open to solutions other than Python, then this can easily be accomplished by running four commands:
grep ^89 file.csv > 89.csv
grep ^90 file.csv > 90.csv
Similarly for other values.
It would be best not to hardcode the event numbers in your code, so it isn't dependent on the values in the data. I also prefer to use the csv module, which has been optimized for reading and writing .csv files.
Here's a way to do that:
import csv

prefix = 'events'  # prefix of output csv file names

data = {}
with open('conditions.csv', 'rb') as conditions:
    reader = csv.reader(conditions)
    for row in reader:
        data.setdefault(row[0], []).append(row)

for event in sorted(data):
    csv_filename = '{}_{}.csv'.format(prefix, event)
    print(csv_filename)
    with open(csv_filename, 'wb') as csvfile:
        writer = csv.writer(csvfile)
        writer.writerows(data[event])
Update
The approach implemented above first reads the entire csv file into memory, and then writes all the rows associated with each event value into a separate output file, one event at a time.
A more memory-efficient approach would be to open multiple output files simultaneously and write each row to the proper destination file immediately after it has been read. Doing this requires keeping track of which files are already open. The file-managing code also needs to make sure all the files are closed when processing is complete.
In the code below, all of this is accomplished by defining and using a Python context manager class to centralize the handling of all the csv output files that might be generated, depending on how many different event values there are in the input file.
import csv
import sys

PY3 = sys.version_info.major > 2

class MultiCSVOutputFileManager(object):
    """Context manager to open and close multiple csv files and csv writers.
    """
    def __enter__(self):
        self.files = {}
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        for file, csv_writer in self.files.values():
            print('closing file: {}'.format(file.name))
            file.close()
        self.files.clear()
        return None

    def get_csv_writer(self, filename):
        if filename not in self.files:  # new file?
            open_kwargs = dict(mode='w', newline='') if PY3 else dict(mode='wb')
            print('opening file: {}'.format(filename))
            file = open(filename, **open_kwargs)
            self.files[filename] = file, csv.writer(file)
        return self.files[filename][1]  # return associated csv.writer object
And here's how to use it:
prefix = 'events'  # prefix added to the name of each csv output file

with open('conditions.csv', 'rb') as conditions:
    reader = csv.reader(conditions)
    with MultiCSVOutputFileManager() as file_manager:
        for row in reader:
            csv_filename = '{}_{}.csv'.format(prefix, row[0])  # row[0] is the event
            writer = file_manager.get_csv_writer(csv_filename)
            writer.writerow(row)
You can even create the resulting files dynamically whenever a first field that has not been encountered before appears, by keeping a mapping of that id and the associated file:
files = {}
with open('file.csv') as fd:
    for line in fd:
        if 0 == len(line.strip()):
            continue  # skip empty lines
        try:
            id_field = line.split(',', 1)[0]  # extract first field
            if id_field not in files:  # if not encountered, open a new result file
                files[id_field] = open(id_field + '.csv', 'w')
            files[id_field].write(line)  # write the line to the proper file
        except Exception as e:
            print('ERR', line, e)  # catchall in case of problems...
for f in files.values():
    f.close()  # close all the result files when done
I am unable to write the result of the following code to a file
import boto3

ACCESS_KEY = "XXX"
SECRET_KEY = "XXX"
regions = ['us-east-1', 'us-west-1', 'us-west-2', 'eu-west-1', 'sa-east-1',
           'ap-southeast-1', 'ap-southeast-2', 'ap-northeast-1']

for region in regions:
    client = boto3.client('ec2', aws_access_key_id=ACCESS_KEY,
                          aws_secret_access_key=SECRET_KEY, region_name=region)
    addresses_dict = client.describe_addresses()
    #f = open('/root/temps', 'w')
    for eip_dict in addresses_dict['Addresses']:
        with open('/root/temps', 'w') as f:
            if 'PrivateIpAddress' in eip_dict:
                print eip_dict['PublicIp']
                f.write(eip_dict['PublicIp'])
This prints the IPs, but nothing gets written to the file. The output of print is:
22.1.14.1
22.1.15.1
112.121.41.41
....
I just need to write the content in this exact format.
for eip_dict in addresses_dict['Addresses']:
    with open('/root/temps', 'w') as f:
        if 'PrivateIpAddress' in eip_dict:
            print eip_dict['PublicIp']
            f.write(eip_dict['PublicIp'])
You are re-opening the file for writing at each iteration of the loop. Perhaps the last iteration has no members with 'PrivateIpAddress' in its dict, so the file gets opened, truncated, and left empty. Write it this way instead:
with open('/root/temps', 'w') as f:
    for eip_dict in addresses_dict['Addresses']:
        if 'PrivateIpAddress' in eip_dict:
            print eip_dict['PublicIp']
            f.write(eip_dict['PublicIp'])
Open the file in append mode:
with open('/root/temps', 'a') as f:
or
declare the file outside the loop
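A minimal sketch of that second option, reusing the question's variable names and opening the file once before the loops:

f = open('/root/temps', 'w')  # opened once, outside the loops
for region in regions:
    client = boto3.client('ec2', aws_access_key_id=ACCESS_KEY,
                          aws_secret_access_key=SECRET_KEY, region_name=region)
    addresses_dict = client.describe_addresses()
    for eip_dict in addresses_dict['Addresses']:
        if 'PrivateIpAddress' in eip_dict:
            f.write(eip_dict['PublicIp'] + '\n')  # newline keeps one IP per line
f.close()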
I am trying to append values to a JSON file. How can I append the data? I have tried so many ways, but none are working.
Code:
def all(title, author, body, type):
    title = "hello"
    author = "njas"
    body = "vgbhn"
    data = {
        "id": id,
        "author": author,
        "body": body,
        "title": title,
        "type": type
    }
    data_json = json.dumps(data)
    #data = ast.literal_eval(data)
    #print data_json
    if(os.path.isfile("offline_post.json")):
        with open('offline_post.json', 'a') as f:
            new = json.loads(f)
            new.update(a_dict)
            json.dump(new, f)
    else:
        open('offline_post.json', 'a')
        with open('offline_post.json', 'a') as f:
            new = json.loads(f)
            new.update(a_dict)
            json.dump(new, f)
How can I append data to the JSON file when this function is called?
I suspect you left out that you're getting a TypeError in the blocks where you're trying to write the file. Here's where you're trying to write:
with open('offline_post.json', 'a') as f:
    new = json.loads(f)
    new.update(a_dict)
    json.dump(new, f)
There are a couple of problems here. First, you're passing a file object to the json.loads command, which expects a string. You probably meant to use json.load.
Second, you're opening the file in append mode, which places the pointer at the end of the file. When you run the json.load, you're not going to get anything because it's reading at the end of the file. You would need to seek to 0 before loading (edit: this would fail anyway, as append mode is not readable).
Third, when you json.dump the new data to the file, it's going to append it to the file in addition to the old data. From the structure, it appears you want to replace the contents of the file (as the new data contains the old data already).
You probably want to use r+ mode, seeking back to the start of the file between the read and the write, and truncating at the end just in case the size of the data structure ever shrinks.
with open('offline_post.json', 'r+') as f:
    new = json.load(f)
    new.update(a_dict)
    f.seek(0)
    json.dump(new, f)
    f.truncate()
Alternatively, you can open the file twice:
with open('offline_post.json', 'r') as f:
    new = json.load(f)
new.update(a_dict)
with open('offline_post.json', 'w') as f:
    json.dump(new, f)
This is a different approach; I just wanted to append without reloading all the data. I am running on a Raspberry Pi, so I want to look after memory. The test code:
import os

json_file_exists = 0
filename = "/home/pi/scratch_pad/test.json"

# remove the last run's json data
try:
    os.remove(filename)
except OSError:
    pass

count = 0
boiler = 90
tower = 78

while count < 10:
    if json_file_exists == 0:
        # create the json file
        with open(filename, mode='w') as fw:
            json_string = "[\n\t{'boiler':" + str(boiler) + ",'tower':" + str(tower) + "}\n]"
            fw.write(json_string)
        json_file_exists = 1
    else:
        # append to the json file
        char = ""
        boiler = boiler + .01
        tower = tower + .02
        while char <> "}":
            with open(filename, mode='rb+') as f:
                f.seek(-1, 2)
                size = f.tell()
                char = f.read()
                if char == "}":
                    break
                f.truncate(size - 1)
        with open(filename, mode='a') as fw:
            json_string = "\n\t,{'boiler':" + str(boiler) + ",'tower':" + str(tower) + "}\n]"
            fw.seek(-1, os.SEEK_END)
            fw.write(json_string)
    count = count + 1
I have a file with a class and its function:
import csv

class file:
    def __init__(self, rc):
        self.rc = rc

    def load(self):
        with open('airports.csv', newline='', encoding='utf-8') as file:
            for row in file:  # csv.reader(file):
                return(row)
which reads a CSV file with many (40,000) lines, and
with open('airports.csv', newline='', encoding='utf-8') as file:
    for row in file:  # csv.reader(file):
        return(row)
this code alone reads all the rows as intended. But when I use the first code above with this:
from testing import file

file1 = file("row")
print(file1.load())
it only returns the first row of the CSV file. Why does this occur, and how can I fix it?
You return the first row, which ends the load() method and ignores the rest of the file. You should yield the row instead, making a generator method. You can then call this to print out all lines:
print('\n'.join(str(l) for l in file1.load()))
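For reference, a minimal sketch of load() rewritten as a generator (only the return changes to yield):

def load(self):
    with open('airports.csv', newline='', encoding='utf-8') as f:
        for row in f:  # or csv.reader(f) for parsed columns
            yield row  # yield each row instead of ending on the first one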