How to append new data to pickle file using python - python

I am extracting the face embedding of an image and appending it to an existing pickle file. It doesn't seem to work: when I unpickle the file, it does not contain the newly added data. Below is the code:
file = client_dir + '\embeddings.pickle'
data = {"embeddings": known_embeddings, "names": known_names}
with open(file, 'ab+') as fp:
    pickle.dump(data, fp)
    fp.close()
log("[INFO] Data appended to embeddings.pickle ")
Current pickle file contains below data:
{'embeddings': [array([-0.03656099, 0.11354745, -0.00438912, 0.0367547 , 0.06391761,
0.18440282, 0.06150107, -0.17380905, 0.03094344, -0.00182147,
0.00969766, 0.06890091, 0.04974053, -0.0502388 , -0.03414046,
-0.13550822, -0.02251128, 0.14556041, -0.04045469, 0.06500552,
0.0726142 , -0.04139924, -0.04662199, 0.08869533, -0.00061307,
-0.11912274, 0.13141112, -0.00648551, 0.00296356, 0.03682912,
-0.15076959, 0.03989822, 0.02799555, 0.03429572, 0.09865954,
0.14113557, -0.08355764, 0.09193961, -0.00819231, -0.01184336,
-0.12519744, 0.00668721, 0.0816237 , 0.00464355, -0.00339399,
0.07501812, 0.11679655, -0.09211859, 0.06211261, -0.00543289,
0.10347278, 0.06651585, -0.01512023, 0.09477805, 0.09886038,
-0.03837246, 0.02265131, -0.14867221, 0.00781244, 0.04845129,
-0.0363168 , -0.00186919, -0.16163988, 0.09539618, 0.14983718,
0.09159472, -0.05315595, -0.05073383, 0.01501674, -0.03789762,
0.07116041, 0.07650694, -0.02975985], dtype=float32)], 'names': ['rock']}
New data which I am trying to append is below:
{'embeddings': [array([-0.03656099, 0.11354745, -0.00438912, 0.0367547 , 0.06391761,
0.18440282, 0.06150107, -0.17380905, 0.03094344, -0.00182147,
0.00969766, 0.06890091, 0.04974053, -0.0502388 , -0.03414046,
0.07501812, 0.11679655, -0.09211859, 0.06211261, -0.00543289,
-0.13550822, -0.02251128, 0.14556041, -0.04045469, 0.06500552,
0.0726142 , -0.04139924, -0.04662199, 0.08869533, -0.00061307,
-0.11912274, 0.13141112, -0.00648551, 0.00296356, 0.03682912,
-0.15076959, 0.03989822, 0.02799555, 0.03429572, 0.09865954,
0.14113557, -0.08355764, 0.09193961, -0.00819231, -0.01184336,
-0.12519744, 0.00668721, 0.0816237 , 0.00464355, -0.00339399,
0.10347278, 0.06651585, -0.01512023, 0.09477805, 0.09886038,
-0.03837246, 0.02265131, -0.14867221, 0.00781244, 0.04845129,
-0.0363168 , -0.00186919, -0.16163988, 0.09539618, 0.14983718,
0.09159472, -0.05315595, -0.05073383, 0.01501674, -0.03789762,
0.07116041, 0.07650694, -0.02975985], dtype=float32)], 'names': ['john']}
But when I unpickle the file, it only has the data for rock and not for john. Can anyone please tell me what I am doing wrong? Below is the code I am using to unpickle the file and check what data was added. Maybe the way I am unpickling the file is wrong, because when I append the data I can see the file size increasing.
import pickle
file = open('G:\\output\\embeddings.pickle', 'rb')
data = pickle.load(file)
file.close()
print(data)
Please help. Thanks
Updated code:
file_path = client_dir + '\embeddings.pickle'
file = open(file_path, 'rb')
old_data = pickle.load(file)
new_embeddings = old_data['embeddings']
new_names = old_data['names']
new_embeddings.append(known_embeddings[0])
new_names.append(known_names[0])
data1 = {"embeddings": new_embeddings, "names": new_names}
with open(file_path, 'ab+') as fp:
    pickle.dump(data1, fp)
    fp.close()
log.error("[INFO] Data appended to embeddings.pickle ")
In the above code, I first load the data from the pickle file into lists, then append the new data to those lists, and then write all the data (old + new) to the pickle file. Can anyone please tell me if this is the correct way of doing it?
Even after this change, when I unpickle the file, I am not getting all the data. Thanks

This looks pretty close to being correct to me. You successfully load the pickled data and add new elements to it. The problem appears to be the with open(file_path, 'ab+') as fp: call. If you open the file in "a" mode, the pickle data you write gets added to the end, after the old pickle data. Then, on subsequent executions of your program, a single pickle.load call reads only the first pickled object in the file, which is the old data.
Try overwriting the old pickle data completely with your new pickle data. You can do this by opening in "w" mode instead.
with open(file_path, 'wb') as fp:
    pickle.dump(data1, fp)
Incidentally, you don't need that fp.close() call. A with statement automatically closes the opened file at the end of the block.
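Putting the advice above together, a minimal sketch of the load, modify, overwrite cycle might look like this. The helper name add_embedding and the temporary demo path are illustrative assumptions, not part of the original code, and plain lists stand in for the NumPy arrays.

```python
import os
import pickle
import tempfile


def add_embedding(file_path, embedding, name):
    """Load the stored dict (if any), add one entry, and overwrite the file."""
    if os.path.exists(file_path):
        with open(file_path, "rb") as fp:
            data = pickle.load(fp)
    else:
        data = {"embeddings": [], "names": []}
    data["embeddings"].append(embedding)
    data["names"].append(name)
    # "wb" truncates the file, so it always holds exactly one pickled dict.
    with open(file_path, "wb") as fp:
        pickle.dump(data, fp)
    return data


# Hypothetical demo path and values.
demo_path = os.path.join(tempfile.mkdtemp(), "embeddings.pickle")
add_embedding(demo_path, [0.1, 0.2], "rock")
add_embedding(demo_path, [0.3, 0.4], "john")

with open(demo_path, "rb") as fp:
    loaded = pickle.load(fp)
print(loaded["names"])
```

Because the file is rewritten in full each time, a single pickle.load is enough to read everything back.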

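If you only want to add records, it can be done without loading the existing data first, which is faster.
Mode 'ab' creates the file if it doesn't exist, or appends to it if it does:
with open('data folder/' + filename2save + '.pkl', 'ab') as f:
    pickle.dump(data, f)
Each dump then adds a separate pickled object to the file, so when reading you must call pickle.load repeatedly (until it raises EOFError) to get all of them back.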
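As a rough sketch of the append-only approach, the snippet below writes two objects in 'ab' mode and reads them all back by calling pickle.load until EOFError. The function names append_record and load_all and the demo path are assumptions made for illustration.

```python
import os
import pickle
import tempfile


def append_record(file_path, record):
    """Append one pickled object to the end of the file."""
    with open(file_path, "ab") as fp:
        pickle.dump(record, fp)


def load_all(file_path):
    """Return every pickled object in the file, in the order written."""
    records = []
    with open(file_path, "rb") as fp:
        while True:
            try:
                records.append(pickle.load(fp))
            except EOFError:
                # Raised once there is nothing left to read.
                break
    return records


# Hypothetical demo path and values.
demo_path = os.path.join(tempfile.mkdtemp(), "embeddings.pkl")
append_record(demo_path, {"embeddings": [[0.1, 0.2]], "names": ["rock"]})
append_record(demo_path, {"embeddings": [[0.3, 0.4]], "names": ["john"]})

records = load_all(demo_path)
names = [name for record in records for name in record["names"]]
print(names)
```

A single pickle.load on this file would return only the first dict, which matches the symptom described in the question.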

Related

gzipped jsonlines file read and write in python

This code reads and writes a jsonlines file. How can I compress it? I tried using gzip.open directly, but I am getting various errors.
import json
def dump_jsonl(data, output_path, append=False):
    """
    Write list of objects to a JSON lines file.
    """
    mode = 'a+' if append else 'w'
    with open(output_path, mode, encoding='utf-8') as f:
        for line in data:
            json_record = json.dumps(line, ensure_ascii=False)
            f.write(json_record + '\n')
    print('Wrote {} records to {}'.format(len(data), output_path))
def load_jsonl(input_path) -> list:
    """
    Read list of objects from a JSON lines file.
    """
    data = []
    with open(input_path, 'r', encoding='utf-8') as f:
        for line in f:
            data.append(json.loads(line.rstrip('\n|\r')))
    print('Loaded {} records from {}'.format(len(data), input_path))
    return data
This is what I am doing to compress but I am unable to read it.
def dump_jsonl(data, output_path, append=False):
    with gzip.open(output_path, "a+") as f:
        for line in data:
            json_record = json.dumps(line, ensure_ascii = False)
            encoded = json_record.encode("utf-8") + ("\n").encode("utf-8")
            compressed = gzip.compress(encoded)
            f.write(compressed)
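Write the data through gzip.open. It compresses whatever you write to it, so there is no need to call gzip.compress as well (doing both compresses the data twice).
import gzip
with open('file.jsonl', 'rb') as f_in:
    with gzip.open('file.jsonl.gz', 'wb') as f_out:
        f_out.writelines(f_in)
gzip.open() is for opening gzipped files, not plain jsonl files.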
Read:
gzip a file in Python
Python support for Gzip
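For reading and writing the compressed file directly, one possible sketch is to open it with gzip.open in text mode ('wt', 'at', 'rt'), so that lines can be written and read as ordinary strings. The function names dump_jsonl_gz and load_jsonl_gz and the demo path are assumptions, modelled on the functions in the question.

```python
import gzip
import json
import os
import tempfile


def dump_jsonl_gz(data, output_path, append=False):
    """Write a list of objects to a gzip-compressed JSON lines file."""
    mode = "at" if append else "wt"
    with gzip.open(output_path, mode, encoding="utf-8") as f:
        for item in data:
            # gzip.open compresses on write; no gzip.compress call is needed.
            f.write(json.dumps(item, ensure_ascii=False) + "\n")


def load_jsonl_gz(input_path):
    """Read a list of objects from a gzip-compressed JSON lines file."""
    data = []
    with gzip.open(input_path, "rt", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                data.append(json.loads(line))
    return data


# Hypothetical demo path and records.
demo_path = os.path.join(tempfile.mkdtemp(), "file.jsonl.gz")
dump_jsonl_gz([{"id": 1}, {"id": 2}], demo_path)
dump_jsonl_gz([{"id": 3}], demo_path, append=True)
loaded = load_jsonl_gz(demo_path)
print(loaded)
```

Appending adds a second gzip member to the file; the gzip module reads the members back as one continuous stream.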

Converting JSON to CSV, CSV is empty

I'm attempting to convert Yelp's data set, which is in JSON, to CSV format. The new CSV file that is created is empty.
I've tried different ways to iterate through the JSON, but they all give me a zero-byte file.
The json file looks like this:
{"business_id":"1SWheh84yJXfytovILXOAQ","name":"Arizona Biltmore Golf Club","address":"2818 E Camino Acequia Drive","city":"Phoenix","state":"AZ","postal_code":"85016","latitude":33.5221425,"longitude":-112.0184807,"stars":3.0,"review_count":5,"is_open":0,"attributes":{"GoodForKids":"False"},"categories":"Golf, Active Life","hours":null}
import json
import csv
infile = open("business.json","r")
outfile = open("business2.csv","w")
data = json.load(infile)
infile.close()
out = csv.writer(outfile)
out.writerow(data[0].keys())
for row in data:
    out.writerow(row.values())
I get an "extra data" message when the code runs. The new business2 csv file is empty and the size is zero bytes.
If your JSON has only one row, then try this:
infile = open("business.json","r")
outfile = open("business2.csv","w")
data = json.load(infile)
infile.close()
out = csv.writer(outfile)
#print(data.keys())
out.writerow(data.keys())
out.writerow(data.values())
Please try the code below. With a with statement, the file is closed automatically when control leaves the with block.
import json
import csv
infile = open("business.json","r")
data = json.load(infile)
infile.close()
headers = list(data.keys())
values = list(data.values())
with open("business2.csv","w") as outfile:
    out = csv.writer(outfile)
    out.writerow(headers)
    out.writerow(values)
You need to use with so that the output file is closed (and its contents flushed to disk).
import json
import csv
infile = open("business.json","r")
data = json.load(infile)
infile.close()
with open("business2.csv","w") as outfile:
    out = csv.writer(outfile)
    out.writerow(list(data.keys()))
    out.writerow(list(data.values()))
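The "extra data" message is what json.load raises when a file contains more than one JSON document. Assuming business.json holds one JSON object per line (an assumption based on the sample shown in the question), a sketch that parses it line by line could look like this. The function name jsonl_to_csv and the sample rows are hypothetical.

```python
import csv
import json
import os
import tempfile


def jsonl_to_csv(json_path, csv_path):
    """Convert a file with one JSON object per line into a CSV file."""
    with open(json_path, "r", encoding="utf-8") as infile:
        rows = [json.loads(line) for line in infile if line.strip()]
    if not rows:
        return 0
    with open(csv_path, "w", newline="", encoding="utf-8") as outfile:
        # Columns come from the first row; extra keys in later rows are skipped.
        writer = csv.DictWriter(
            outfile, fieldnames=list(rows[0].keys()), extrasaction="ignore"
        )
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)


# Hypothetical sample data standing in for business.json.
workdir = tempfile.mkdtemp()
json_path = os.path.join(workdir, "business.json")
csv_path = os.path.join(workdir, "business2.csv")
with open(json_path, "w", encoding="utf-8") as f:
    f.write(json.dumps({"business_id": "a1", "name": "Golf Club", "stars": 3.0}) + "\n")
    f.write(json.dumps({"business_id": "b2", "name": "Cafe", "stars": 4.5}) + "\n")

count = jsonl_to_csv(json_path, csv_path)
with open(csv_path, newline="", encoding="utf-8") as f:
    csv_rows = list(csv.reader(f))
print(count, csv_rows[0])
```

Nested values such as the attributes dict are written as their string representation; flattening them would need extra handling.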

keep data in the file python

Hi everyone, I'm new to Python and I'm writing a program, but I don't know how to save data permanently in a file. I only know how to create the file; I don't know how to keep the data in the file after the program is closed, so that when I open it again I can add more data and keep that in the file too. I have also tried several methods to load the file in Python, but they didn't work for me. Can someone please help me?
This is my code:
file = open ('file.txt','w')
t = input ('name :')
p= input ('last name: ')
c = input ('nickname: ')
file.write('name :')
file.write(t)
file.write(' ')
file.write('last name: ')
file.write(p)
file.write('nickname: ')
file.write(c)
file.close()
with open('archivo.txt','w') as file:
    data = load(file)
    print(data)
Here is a demonstration of how file writing works, and the difference between w and a. The comments represent the text in the file that is written to the drive at each given point.
f1 = open('appending.txt', 'w')
f1.write('first string\n')
f1.close()
# first string
f2 = open('appending.txt', 'a')
f2.write('second string\n')
f2.close()
# first string
# second string
f3 = open('appending.txt', 'w')
f3.write('third string\n')
f3.close()
# third string
There are three file operation modes: read, write and append.
Read Mode: In this mode you are only able to read the file, for example:
#content in file.txt "Hi I am Python Developer"
with open('file.txt', 'r') as f:
    data = f.read()
print(data)
#output as : Hi I am Python Developer
Write Mode: In this mode you are able to write information into the file, but it always overwrites the existing content of the file, for example:
data = input('Enter string to insert into file:')
with open('file.txt', 'w') as f:
    f.write(data)
with open('file.txt', 'r') as f:
    data = f.read()
print('out_data:', data)
# Output : Enter string to insert into file: Hi, I am developer
# out_data: Hi, I am developer
When you open the file the next time and do the same write operation, it overwrites all the information in the file.
Append Mode: In this mode you are able to write into the file, and the new content is appended to the end of the file. For example:
data = input('Enter string to insert into file:')
with open('file.txt', 'a') as f:
    f.write(data)
with open('file.txt', 'r') as f:
    data = f.read()
print('out_data:', data)
# Output : Enter string to insert into file: Hi, I am developer
# out_data: Hi, I am developer
# Now perform same operation:
data = input('Enter string to insert into file:')
with open('file.txt', 'a') as f:
    f.write(data)
with open('file.txt', 'r') as f:
    data = f.read()
print('out_data:', data)
# Output : Enter string to insert into file: Hi, I am Python developer
# out_data: Hi, I am developerHi, I am Python developer
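Applied to the question, a minimal sketch could append one line per person so that earlier entries survive between runs. The function names save_person and load_people, the demo path, and the sample values are assumptions for illustration; in the real program the values would come from input().

```python
import os
import tempfile


def save_person(path, name, last_name, nickname):
    """Append one record as a single line; existing lines are kept."""
    with open(path, "a", encoding="utf-8") as f:
        f.write("name: {} last name: {} nickname: {}\n".format(name, last_name, nickname))


def load_people(path):
    """Return all saved records, one string per line."""
    with open(path, "r", encoding="utf-8") as f:
        return f.read().splitlines()


# Hypothetical demo path and values.
demo_path = os.path.join(tempfile.mkdtemp(), "file.txt")
save_person(demo_path, "Ada", "Lovelace", "ada")
save_person(demo_path, "Alan", "Turing", "al")

people = load_people(demo_path)
print(people)
```

Ending each record with a newline keeps the entries from running together, as happens in the append example above.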

How to write the list to file?

I am unable to write the result of the following code to a file
import boto3
ACCESS_KEY= "XXX"
SECRET_KEY= "XXX"
regions = ['us-east-1','us-west-1','us-west-2','eu-west-1','sa-east-1','ap-southeast-1','ap-southeast-2','ap-northeast-1']
for region in regions:
    client = boto3.client('ec2',aws_access_key_id=ACCESS_KEY,aws_secret_access_key=SECRET_KEY,region_name=region,)
    addresses_dict = client.describe_addresses()
    #f = open('/root/temps','w')
    for eip_dict in addresses_dict['Addresses']:
        with open('/root/temps', 'w') as f:
            if 'PrivateIpAddress' in eip_dict:
                print eip_dict['PublicIp']
                f.write(eip_dict['PublicIp'])
This results in printing the IPs, but nothing gets written to the file. The result of print is:
22.1.14.1
22.1.15.1
112.121.41.41
....
I just need to write the content to the file in this same format.
for eip_dict in addresses_dict['Addresses']:
    with open('/root/temps', 'w') as f:
        if 'PrivateIpAddress' in eip_dict:
            print eip_dict['PublicIp']
            f.write(eip_dict['PublicIp'])
You are re-opening the file for writing at each iteration of the loop. Perhaps the last iteration has no members with 'PrivateIpAddress' in its dict, so the file gets opened, truncated, and left empty. Write it this way instead:
with open('/root/temps', 'w') as f:
    for eip_dict in addresses_dict['Addresses']:
        if 'PrivateIpAddress' in eip_dict:
            print(eip_dict['PublicIp'])
            # add a newline so each IP ends up on its own line
            f.write(eip_dict['PublicIp'] + '\n')
Either open the file in append mode:
with open('/root/temps', 'a') as f:
or open the file once, outside the loop.
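Since the question loops over several regions, the file would need to be opened once before the region loop as well, otherwise each region overwrites the previous one. The sketch below shows that idea with hard-coded sample data in place of the boto3 calls. The sample_responses dict and the output path are hypothetical; only the 'Addresses', 'PublicIp' and 'PrivateIpAddress' keys come from the question.

```python
import os
import tempfile

# Hypothetical stand-in for client.describe_addresses() results per region.
sample_responses = {
    "us-east-1": {
        "Addresses": [
            {"PublicIp": "22.1.14.1", "PrivateIpAddress": "10.0.0.5"},
            {"PublicIp": "22.1.15.1"},  # no private IP, so it is skipped
        ]
    },
    "us-west-1": {
        "Addresses": [
            {"PublicIp": "112.121.41.41", "PrivateIpAddress": "10.0.1.7"},
        ]
    },
}

out_path = os.path.join(tempfile.mkdtemp(), "temps")

# Open the file once, before both loops, so nothing is truncated midway.
with open(out_path, "w") as f:
    for region, addresses_dict in sample_responses.items():
        for eip_dict in addresses_dict["Addresses"]:
            if "PrivateIpAddress" in eip_dict:
                f.write(eip_dict["PublicIp"] + "\n")

with open(out_path) as f:
    written = f.read().splitlines()
print(written)
```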

Append JSON to file

I am trying to append values to a JSON file. How can I append the data? I have tried many ways, but none of them work.
Code:
def all(title,author,body,type):
    title = "hello"
    author = "njas"
    body = "vgbhn"
    data = {
        "id" : id,
        "author": author,
        "body" : body,
        "title" : title,
        "type" : type
    }
    data_json = json.dumps(data)
    #data = ast.literal_eval(data)
    #print data_json
    if(os.path.isfile("offline_post.json")):
        with open('offline_post.json','a') as f:
            new = json.loads(f)
            new.update(a_dict)
            json.dump(new,f)
    else:
        open('offline_post.json', 'a')
        with open('offline_post.json','a') as f:
            new = json.loads(f)
            new.update(a_dict)
            json.dump(new,f)
How can I append data to json file when this function is called?
I suspect you left out that you're getting a TypeError in the blocks where you're trying to write the file. Here's where you're trying to write:
with open('offline_post.json','a') as f:
    new = json.loads(f)
    new.update(a_dict)
    json.dump(new,f)
There are a couple of problems here. First, you're passing a file object to the json.loads command, which expects a string. You probably meant to use json.load.
Second, you're opening the file in append mode, which places the pointer at the end of the file. When you run the json.load, you're not going to get anything because it's reading at the end of the file. You would need to seek to 0 before loading (edit: this would fail anyway, as append mode is not readable).
Third, when you json.dump the new data to the file, it's going to append it to the file in addition to the old data. From the structure, it appears you want to replace the contents of the file (as the new data contains the old data already).
You probably want to use r+ mode, seeking back to the start of the file between the read and write, and truncating at the end just in case the size of the data structure ever shrinks.
with open('offline_post.json', 'r+') as f:
    new = json.load(f)
    new.update(a_dict)
    f.seek(0)
    json.dump(new, f)
    f.truncate()
Alternatively, you can open the file twice:
with open('offline_post.json', 'r') as f:
    new = json.load(f)
new.update(a_dict)
with open('offline_post.json', 'w') as f:
    json.dump(new, f)
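For the function in the question, which also has to cope with the file not existing yet, one possible sketch is shown below. It assumes the file stores a JSON list of posts rather than a single dict, which is a design choice not taken from the question; the function name append_post, the demo path, and the sample posts are likewise hypothetical.

```python
import json
import os
import tempfile


def append_post(path, post):
    """Read the stored list (if the file exists), add a post, rewrite the file."""
    posts = []
    if os.path.isfile(path):
        with open(path, "r", encoding="utf-8") as f:
            posts = json.load(f)
    posts.append(post)
    # "w" replaces the old contents, so the file stays one valid JSON document.
    with open(path, "w", encoding="utf-8") as f:
        json.dump(posts, f)
    return posts


# Hypothetical demo path and posts.
demo_path = os.path.join(tempfile.mkdtemp(), "offline_post.json")
append_post(demo_path, {"title": "hello", "author": "njas", "body": "vgbhn"})
append_post(demo_path, {"title": "second", "author": "njas", "body": "more"})

with open(demo_path, "r", encoding="utf-8") as f:
    stored = json.load(f)
print(len(stored))
```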
This is a different approach: I just wanted to append without reloading all the data. It runs on a Raspberry Pi, so I want to look after memory. The test code:
import os
json_file_exists = 0
filename = "/home/pi/scratch_pad/test.json"
# remove the last run json data
try:
    os.remove(filename)
except OSError:
    pass
count = 0
boiler = 90
tower = 78
while count < 10:
    if json_file_exists == 0:
        # create the json file (JSON requires double-quoted keys)
        with open(filename, mode='w') as fw:
            json_string = '[\n\t{"boiler":' + str(boiler) + ',"tower":' + str(tower) + '}\n]'
            fw.write(json_string)
        json_file_exists = 1
    else:
        # append to the json file
        boiler = boiler + .01
        tower = tower + .02
        with open(filename, mode='rb+') as f:
            # drop trailing bytes until the file ends with "}"
            f.seek(0, os.SEEK_END)
            size = f.tell()
            while size > 0:
                f.seek(size - 1)
                if f.read(1) == b"}":
                    break
                size -= 1
            f.truncate(size)
        with open(filename, mode='a') as fw:
            # write the new object and close the list again
            json_string = '\n\t,{"boiler":' + str(boiler) + ',"tower":' + str(tower) + '}\n]'
            fw.write(json_string)
    count = count + 1
