Serializing a list of class instances in Python

In Python, I am trying to store a list to a file. I've tried pickle, json, etc., but none of them seem to support class instances inside those lists. I can't sacrifice the lists or the classes; I must keep both. How can I do it?
My current attempt:
try:
    with open('file.json', 'r') as file:
        allcards = json.load(file)
except:
    allcards = []

def saveData(list):
    with open('file.json', 'w') as file:
        print(list)
        json.dump(list, file, indent=2)
saveData is called elsewhere, and I've done all the testing I can and have determined the error comes from trying to save the list due to its inclusion of class instances. It throws me the error
Object of type Card is not JSON serializable
whenever I use the JSON method; the other methods don't even give errors, but they don't load the list when I reload the program.
Edit: As for the pickle method, here is what it looks like:
try:
    with open('allcards.dat', 'rb') as file:
        allcards = pickle.load(file)
        print(allcards)
except:
    allcards = []

class Card():
    def __init__(self, owner, name, rarity, img, pack):
        self.owner = str(owner)
        self.name = str(name)
        self.rarity = str(rarity)
        self.img = img
        self.pack = str(pack)

def saveData(list):
    with open('allcards.dat', 'wb') as file:
        pickle.dump(list, file)
When I do this, all that happens is the code runs as normal, but the list is not saved. The print(allcards) does not trigger either, which makes me believe it's somehow not detecting the file or hitting some other error and going straight to the exception. Also, img is supposed to always be a link, in case that changes anything.
I don't know what else I can provide to help solve this issue, but I can post more code if need be.
Please help, and thanks in advance.

Python's built-in pickle module does not support serializing a Python class, but there are libraries that extend the pickle module and provide this functionality. Dill and Cloudpickle both support serializing a Python class and have the exact same interface as the pickle module.
Dill: https://github.com/uqfoundation/dill
Cloudpickle: https://github.com/cloudpipe/cloudpickle
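For illustration, here is a minimal sketch (not from the original answer) of how the card list from the question could be saved and loaded with dill; cloudpickle works the same way, since both mirror the pickle API:

import dill

def saveData(cards):
    # dill exposes the same dump/load interface as pickle
    with open('allcards.dat', 'wb') as file:
        dill.dump(cards, file)

def loadData():
    try:
        with open('allcards.dat', 'rb') as file:
            return dill.load(file)
    except (FileNotFoundError, EOFError):
        # no saved data yet
        return []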

EDIT:
The article linked below is good, but my earlier example was a bad one.
This time I've created a new snippet from scratch -- sorry for making it more complicated than it needed to be.
import json

class Card(object):
    @classmethod
    def from_json(cls, data):
        return cls(**data)

    def __init__(self, figure, color):
        self.figure = figure
        self.color = color

    def __repr__(self):
        return f"<Card: [{self.figure} of {self.color}]>"

def save(cards):
    with open('file.json', 'w') as f:
        json.dump(cards, f, indent=4, default=lambda c: c.__dict__)

def load():
    with open('file.json', 'r') as f:
        obj_list = json.load(f)
        return [Card.from_json(obj) for obj in obj_list]

cards = []
cards.append(Card("1", "clubs"))
cards.append(Card("K", "spades"))

save(cards)
cards_from_file = load()
print(cards_from_file)
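Running this writes file.json and then prints the reconstructed list: [<Card: [1 of clubs]>, <Card: [K of spades]>]. The default=lambda c: c.__dict__ argument tells json.dump how to turn each Card into a plain dict, and from_json rebuilds the instances on load.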
Source

Related

How do I mock `csv.DictWriter.writeheader()`, which is called from within my custom function?

I have a custom function defined in
custom_file.py
import csv

def write_dict_to_csv(columns=None, file_name=None, data=None):
    try:
        with open(file_name, "w") as f:
            writer = csv.DictWriter(f, fieldnames=columns)
            writer.writeheader()
In test_file.py I want to return a fixed value when writer.writeheader() is called.
from custom_file import write_dict_to_csv

class TestMyFunctions(unittest.TestCase):
    @patch('custom_file.csv.DictWriter.writeheader')
    def test_write_dict_to_csv(self, mock_writeheader):
        custom_file.write_dict_to_csv(file_name='fileName')
        self.assertTrue(mock_writeheader.called)
But this returns TypeError: stat: path should be string, bytes, os.PathLike or integer, not NoneType
How do I mock csv.DictWriter.writeheader() when it's being imported from an external library into a custom_file, which I'm then testing from a separate test_file?
I figured this would be close to correct since we're meant to patch where a thing is looked up, not where it is defined.
The code you have provided does not run "as is" because of some missing imports, but after fixing those problems everything seems to work (the test passes).
Here is the code I ran. I hope it helps.
custom_file.py
import csv

def write_dict_to_csv(columns=None, file_name=None, data=None):
    with open(file_name, "w") as f:
        writer = csv.DictWriter(f, fieldnames=columns)
        writer.writeheader()
test_file.py
import unittest
from unittest.mock import patch

import custom_file

class TestMyFunctions(unittest.TestCase):
    @patch('custom_file.csv.DictWriter.writeheader')
    def test_write_dict_to_csv(self, mock_writeheader):
        print("Hello")
        custom_file.write_dict_to_csv(file_name='fileName')
        self.assertTrue(mock_writeheader.called)
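For reference, the test can be run with python -m unittest test_file from the directory containing both files; the patch decorator replaces writeheader with a mock only for the duration of the test.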

Loading multiple files with bonobo-etl

I'm new to bonobo-etl and I'm trying to write a job that loads multiple files at once, but I can't get the CsvReader to work with the @use_context_processor annotation. A snippet of my code:
def input_file(self, context):
    yield 'test1.csv'
    yield 'test2.csv'
    yield 'test3.csv'

@use_context_processor(input_file)
def extract(f):
    return bonobo.CsvReader(path=f, delimiter='|')

def load(*args):
    print(*args)

def get_graph(**options):
    graph = bonobo.Graph()
    graph.add_chain(extract, load)
    return graph
When I run the job I get something like <bonobo.nodes.io.csv.CsvReader object at 0x7f849678dc88> rather than the lines of the CSV.
If I hardcode the reader like graph.add_chain(bonobo.CsvReader(path='test1.csv',delimiter='|'),load), it works.
Any help would be appreciated.
Thank you.
As bonobo.CsvReader does not (yet) support reading file names from the input stream, you need to use a custom reader for that.
Here is a solution that works for me on a set of csvs I have:
import bonobo
import bonobo.config
import bonobo.util
import glob
import csv

@bonobo.config.use_context
def read_multi_csv(context, name):
    with open(name) as f:
        reader = csv.reader(f, delimiter=';')
        headers = next(reader)
        if not context.output_type:
            context.set_output_fields(headers)
        for row in reader:
            yield tuple(row)

def get_graph(**options):
    graph = bonobo.Graph()
    graph.add_chain(
        glob.glob('prenoms_*.csv'),
        read_multi_csv,
        bonobo.PrettyPrinter(),
    )
    return graph

if __name__ == '__main__':
    with bonobo.parse_args() as options:
        bonobo.run(get_graph(**options))
A few comments on this snippet, in reading order:
The use_context decorator injects the node execution context into the transformation call, which allows calling .set_output_fields(...) using the first CSV's headers.
The headers of the other CSVs are ignored; in my case they are all the same, but you may need slightly more complex logic for your own case.
Then, we just generate the filenames in a bonobo.Graph instance using glob.glob (in my case, the stream will contain: prenoms_2004.csv prenoms_2005.csv ... prenoms_2011.csv prenoms_2012.csv) and pass them to our custom reader, which will be called once for each file, open it, and yield its lines.
Hope that helps!

Load CSV file in processing.py

I am trying to load a CSV file in processing.py as a table. The Java environment lets me use the loadTable() function; however, I'm unable to find an equivalent function in the Python environment.
The missing functionality could be added as follows:
import csv

class Row(object):
    def __init__(self, dict_row):
        self.dict_row = dict_row

    def getFloat(self, key):
        return float(self.dict_row[key])

    def getString(self, key):
        return self.dict_row[key]

class loadTable(object):
    def __init__(self, csv_filename, header):
        with open(csv_filename, "rb") as f_input:
            csv_input = csv.DictReader(f_input)
            self.data = [Row(row) for row in csv_input]

    def rows(self):
        return self.data
This reads the csv file into memory using Python's csv.DictReader class. This treats each row in the csv file as a dictionary. For each row, it creates an instance of a Row class which then lets you retrieve entries in the format required. Currently I have just coded for getFloat() and getString() (which is the default format for all csv values).
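A possible usage sketch (the file name and column names here are hypothetical; substitute your own CSV and headers):

# assumes a CSV "data/cards.csv" with "name" and "price" columns
table = loadTable("data/cards.csv", header=True)
for row in table.rows():
    print(row.getString("name"), row.getFloat("price"))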
You could create an empty Table object with this:
from processing.data import Table
t = Table()
And then populate it as discussed at https://discourse.processing.org/t/creating-an-empty-table-object-in-python-mode-and-some-other-hidden-data-classes/25121
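A rough sketch of populating it, assuming the standard Processing Table API (addColumn, addRow, TableRow.setString) is accessible from Python mode:

from processing.data import Table

t = Table()
t.addColumn("name")                 # add a column by title
row = t.addRow()                    # append an empty row
row.setString("name", "Pikachu")    # fill in a cell (hypothetical value)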
But I think a Python dict as proposed by @martin-evans would be nice. You load it like this:
import csv
from codecs import open  # optional, to get the encoding="utf-8" argument in Python 2

with open("data/pokemon.csv", encoding="utf-8") as f:
    data = list(csv.DictReader(f))  # a list of dicts, col-headers as keys
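Each entry of data is then a dict keyed by the CSV's column headers, so (assuming the file has a "Name" column, a hypothetical header here) you could access it like:

for row in data[:3]:
    print(row["Name"])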

Wrapping Python's io.open Function

As you might know, file streams created by Python's io.open function in non-binary mode (e.g. io.open("myFile.txt", "w", encoding="utf-8")) do not accept non-unicode strings. For a reason that is kind of hard to explain, I want to sort of monkey-patch the write method of those file streams in order to convert the string to be written into a unicode before calling the actual write method. My code currently looks like this:
import io
import types

def write(self, text):
    if not isinstance(text, unicode):
        text = unicode(text)
    return super(self.__class__, self).write(text)

def wopen(*args, **kwargs):
    f = io.open(*args, **kwargs)
    f.write = types.MethodType(write, f)
    return f
However, it does not work:
>>> with wopen(p, "w", encoding = "utf-8") as f:
>>> f.write("test")
UnsupportedOperation: write
Since I cannot really look into those built-in methods, I have no idea how to analyse the problem.
Thanks in advance!

How to use JSON on Mac OS

I have encountered a very strange issue: I use json.dump to write a file and then use json.load to read the file back.
The same code runs successfully on Windows 7, but it fails on Mac OS X 10.7.
Below is the code:
class Result:
    def __init__(self, name, result):
        self.name = name
        self.result = result

    def __repr__(self):
        return 'Result name : %s , result : %s' % (self.name, self.result)

class MyEncoder(json.JSONEncoder):
    def default(self, obj):
        # convert object to a dict
        d = {'CaseResult': {}}
        d['CaseResult'][obj.name] = obj.result
        return d

def save(name, result):
    filename = 'basic.json'
    obj = Result(name, result)
    obj_json = MyEncoder().encode(obj)
    with open(filename, mode='ab+') as fp:
        json.dump(obj_json, fp)
        s = json.load(fp)

save('aaa', 'bbb')
On Mac OS it gives the error "ValueError: No JSON object could be decoded".
Can anyone tell me why this happens and how I can resolve it?
This problem is unrelated to being run on a Mac; this code should never work:
with open(filename, mode='ab+') as fp:
    json.dump(obj_json, fp)
    s = json.load(fp)
This is because after json.dump, your file pointer is at the end of the file. You must call fp.seek to move it back to where the dumped object begins, like this:
import os

with open(filename, mode='rb+') as fp:
    fp.seek(0, os.SEEK_END)
    pos = fp.tell()
    json.dump(obj_json, fp)
    fp.seek(pos)
    s = json.load(fp)
I'm not sure how this actually works on Windows, but you're missing a seek back to the beginning of the file before you read the object back. Change your save/load to:
with open(filename, mode='ab+') as fp:
    json.dump(obj_json, fp)
    fp.seek(0)
    s = json.load(fp)
and it runs just fine on MacOS too. Note that you're appending to the file, so only the first run succeeds in loading the object back; the next one will find extra data after the end of the object.
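As a side note (not part of either answer above): if appending isn't required, a simpler pattern is to write and read in two separate with blocks, so no seeking is needed at all:

# write the encoded object, then reopen the file to read it back
with open(filename, mode='w') as fp:
    json.dump(obj_json, fp)

with open(filename, mode='r') as fp:
    s = json.load(fp)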
