file IO with defaultdict

I'm attempting to:
load dictionary
update/change the dictionary
save
(repeat)
Problem: I want to work with just one dictionary (players_scores),
but the defaultdict expression creates a completely separate dictionary.
How do I load, update, and save to one dictionary?
Code:
from collections import defaultdict#for manipulating dict
players_scores = defaultdict(dict)
import ast #module for removing string from dict once it's called back
a = {}
open_file = open("scores", "w")
open_file.write(str(a))
open_file.close()
open_file2 = open("scores")
open_file2.readlines()
open_file2.seek(0)
i = input("Enter new player's name: ").upper()
players_scores[i]['GOLF'] = 0
players_scores[i]['MON DEAL'] = 0
print()
scores_str = open_file2.read()
players_scores = ast.literal_eval(scores_str)
open_file2.close()
print(players_scores)

You are wiping your changes; instead of writing out your file, you read it anew and the result is used to replace your players_scores dictionary. Your defaultdict worked just fine before that, even if you can't really use defaultdict here (ast.literal_eval() does not support collections.defaultdict, only standard python literal dict notation).
You can simplify your code by using the json module here:
import json
try:
    with open('scores', 'r') as f:
        players_scores = json.load(f)
except IOError:
    # no such file, create an empty dictionary
    players_scores = {}
name = input("Enter new player's name: ").upper()
# create a complete, new dictionary
players_scores[name] = {'GOLF': 0, 'MON DEAL': 0}
with open('scores', 'w') as f:
    json.dump(players_scores, f)
You don't need defaultdict here at all; you are only creating a new dictionary for every player name anyway.
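If you would still like the convenience of a defaultdict while keeping a single dictionary across load, update, and save, one possible sketch is below. The helper names load_scores, save_scores, and add_player are hypothetical (they do not appear in the question), and the sketch assumes the file holds plain JSON. It relies on json.dump accepting a defaultdict because it is a dict subclass, and on defaultdict(dict, mapping) copying the loaded data in.

```python
import json
from collections import defaultdict


def load_scores(path):
    """Load scores from a JSON file and return them as a defaultdict(dict)."""
    try:
        with open(path, "r") as f:
            return defaultdict(dict, json.load(f))
    except FileNotFoundError:
        # No file yet: start with an empty structure.
        return defaultdict(dict)


def save_scores(path, scores):
    """Write the scores back; a defaultdict serialises like a plain dict."""
    with open(path, "w") as f:
        json.dump(scores, f)


def add_player(path, name):
    """Load, update, and save in one pass, then return the updated scores."""
    scores = load_scores(path)
    player = scores[name.upper()]      # created on first access
    player.setdefault("GOLF", 0)       # keep any existing score
    player.setdefault("MON DEAL", 0)
    save_scores(path, scores)
    return scores
```

Calling add_player('scores', 'ann') repeatedly with different names keeps extending the same stored dictionary instead of replacing it.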

I think one problem is that, to index the data structure the way you want, what's really needed is something like a defaultdict(defaultdict(dict)). Unfortunately it's impossible to specify one directly like that, because the argument to defaultdict must be a callable factory rather than an instance. To work around that, all you need to do is define a simple intermediary factory function to pass to the upper-level defaultdict:
from collections import defaultdict
def defaultdict_factory(*args, **kwargs):
    """ Create and return a defaultdict(dict). """
    return defaultdict(dict, *args, **kwargs)
Then you can use players_scores = defaultdict(defaultdict_factory) to create one.
However, ast.literal_eval() won't work with one that's been converted to its string representation, because a defaultdict is not one of the simple literal data types the function supports. Instead, I would suggest you consider using Python's venerable pickle module, which can handle most of Python's built-in data types as well as custom classes like the one I'm describing. Here's an example of applying it to your code (in conjunction with the code above):
import pickle
try:
    with open('scores', 'rb') as input_file:
        players_scores = pickle.load(input_file)
except FileNotFoundError:
    print('new scores file will be created')
    players_scores = defaultdict(defaultdict_factory)
player_name = input("Enter new player's name: ").upper()
players_scores[player_name]['GOLF'] = 0
players_scores[player_name]['MON DEAL'] = 0
# below is a shorter way to do the initialization for a new player
# players_scores[player_name] = defaultdict_factory({'GOLF': 0, 'MON DEAL': 0})
# write new/updated data structure (back) to disk
with open('scores', 'wb') as output_file:
    pickle.dump(players_scores, output_file)
print(players_scores)

Related

Pulling Values from a JSON Dataset that Match a Keyword

Currently working on a school project based on a JSON dataset about a weapon list in a videogame. I've been trying to add functionality where the user can click a button that runs code filtering the dataset down to the forms containing the key-word. Below is the back end code for one of the functions, returning all forms where name = dagger
def wpn_dagger():
    with open('DSweapons.json', encoding='utf-8') as outfile:
        data = json.load(outfile)
    wpn_list = []
    for dict in data:
        if dict in ['name'] == 'dagger':
            wpn_list.append(data)
    print(wpn_list)
    return wpn_list
Whilst I do not get any errors when I run the code the only output to the terminal is an empty set of [] brackets. Any help on this issue would be much appreciated.

if dict in ['name'] == 'dagger' is the wrong syntax for what you want.
Written like that, ['name'] is a list containing the string 'name', so dict in ['name'] checks whether dict is in that list (always false here). Python also chains comparison operators, so the whole expression is evaluated as dict in ['name'] and ['name'] == 'dagger', which is always False.
Try this:
import json
def wpn_dagger():
    with open('DSweapons.json', encoding='utf-8') as outfile:
        data = json.load(outfile)
    wpn_list = []
    for weapon in data:
        # look up the 'name' key of each entry, then compare it
        if weapon['name'] == 'dagger':
            wpn_list.append(weapon)
    print(wpn_list)
    return wpn_list
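To see why the original condition never matched, here is a small self-contained sketch. The weapon dictionary is made-up sample data, not taken from DSweapons.json. It compares the chained expression from the question with its expanded form and with the intended key lookup.

```python
weapon = {"name": "dagger", "damage": 5}  # hypothetical sample entry

# `in` and `==` are both comparison operators, so Python chains them:
# a in b == c  means  (a in b) and (b == c)
chained = weapon in ["name"] == "dagger"
expanded = (weapon in ["name"]) and (["name"] == "dagger")

# What the question intended: look up the 'name' key, then compare.
intended = weapon["name"] == "dagger"

print(chained, expanded, intended)  # False False True
```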

Indentation in Python is very important. Your for loop needs to be indented so it operates inside the with context manager.
Secondly, dict is the name of a built-in type in Python (not a keyword, but reusing the name shadows the built-in), so use something different if you need to name a variable.
Finally, you can get a value out of your form using .get('name') on the dictionary, which returns None instead of raising KeyError when the key is missing; it's at least easier to read in my opinion.
In summary, something like this (not tested):
def wpn_dagger():
    with open('DSweapons.json', encoding='utf-8') as outfile:
        data = json.load(outfile)
        wpn_list = []
        for form in data:
            if form.get('name') == 'dagger':
                wpn_list.append(form)
    print(wpn_list)
    return wpn_list

How to copy a csv file into a dictionary?

I'm working on cs50's pset6, DNA, and I want to read a csv file that looks like this:
name,AGATC,AATG,TATC
Alice,2,8,3
Bob,4,1,5
Charlie,3,2,5
But the problem is that dictionaries only have a key, and a value, so I don't know how I could structure this. What I currently have is this piece of code:
import sys
with open(argv[1]) as data_file:
    data_reader = csv.DictReader(data_file)
And also, my csv file has multiple columns and rows, with a header and the first column indicating the name of the person. I don't know how to do this, and I will later need to access the individual amount of say, Alice's value of AATG.
Also, I'm using the module sys, to import DictReader and also reader

You can always try to create the function on your own.
You can use my code here:
def csv_to_dict(csv_file):
    # csv_file is the text of the file (e.g. from file.read()), not a file object
    key_list = csv_file[:csv_file.index('\n')].split(',')  # save the keys from the header line
    info = []  # list of dictionaries, one per row
    # for each line after the header
    for line in csv_file[csv_file.index('\n') + 1:].split('\n'):
        if not line:
            continue  # skip blank lines, such as the one left by a trailing newline
        data = {}  # the dictionary for this row
        # pair each value with the key at the same position in key_list
        for count, value in enumerate(line.split(',')):
            data[key_list[count]] = value
        info.append(data)  # after filling the row's dictionary, append it to the list
    print(info)  # print it
    return info
### Be aware that this function prints (and returns) a list of dictionaries, and every value is a string.
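Since the question already reaches for csv.DictReader, here is a minimal sketch of that route. It assumes the first column is always called name and that every other column holds an integer count. The function name load_str_counts is hypothetical, and io.StringIO stands in for the real file so the example is self-contained. Nesting one dictionary per person inside an outer dictionary keyed by name gets around the "only a key and a value" concern.

```python
import csv
import io


def load_str_counts(csv_source):
    """Return {name: {column: count}} from a file object or iterable of CSV lines."""
    reader = csv.DictReader(csv_source)
    table = {}
    for row in reader:
        name = row.pop("name")  # the first column identifies the person
        table[name] = {key: int(value) for key, value in row.items()}
    return table


# io.StringIO stands in for open(sys.argv[1], newline="") in this sketch.
sample = io.StringIO(
    "name,AGATC,AATG,TATC\n"
    "Alice,2,8,3\n"
    "Bob,4,1,5\n"
    "Charlie,3,2,5\n"
)
counts = load_str_counts(sample)
print(counts["Alice"]["AATG"])  # 8
```

Note that DictReader lives in the csv module, so the real script would need import csv as well as import sys (for sys.argv).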

How to store and load a Python dictionary with HDF5

I'm having issues loading (I think storing is working – a file is being created and contains data) a dictionary (string key and array/list value) from a HDF5 file. I'm receiving the following error:
ValueError: malformed node or string: < HDF5 dataset "dataset_1": shape (), type "|O" >
My code is:
import h5py
def store_table(self, filename):
    table = dict()
    table['test'] = list(np.zeros(7,dtype=int))
    with h5py.File(filename, "w") as file:
        file.create_dataset('dataset_1', data=str(table))
        file.close()
def load_table(self, filename):
    file = h5py.File(filename, "r")
    data = file.get('dataset_1')
    print(ast.literal_eval(data))
I've read online using the ast method literal_eval should work but it doesn't appear to help... How do I 'unpack' the HDF5 so it's a dictionary again?
Any ideas would be appreciated.

It's not clear to me what you really want to accomplish. (I suspect your dictionaries hold more than seven zeros; otherwise, HDF5 is overkill for storing your data.) If you have a lot of very large dictionaries, it would be better to convert the data to a NumPy array and then either 1) create and load the dataset with data=, or 2) create the dataset with an appropriate dtype and then populate it. You can create datasets with mixed datatypes, which is not addressed in the previous solution. If those situations don't apply, you might want to save the dictionary as attributes. Attributes can be associated with a group, a dataset, or the file object itself. Which is best depends on your requirements.
I wrote a short example to show how to load dictionary key/value pairs as attribute names/value pairs tagged to a group. For this example, I assumed the dictionary has a name key with the group name for association. The process is almost identical for a dataset or file object (just change the object reference).
import h5py
def load_dict_to_attr(h5f, thisdict) :
    if 'name' not in thisdict:
        print('Dictionary missing name key. Skipping function.')
        return
    dname = thisdict.get('name')
    if dname in h5f:
        print('Group:' + dname + ' exists. Skipping function.')
        return
    else:
        grp = h5f.create_group(dname)
        for key, val in thisdict.items():
            grp.attrs[key] = val
###########################################
def get_grp_attrs(name, node) :
    grp_dict = {}
    for k in node.attrs.keys():
        grp_dict[k]= node.attrs[k]
    print (grp_dict)
###########################################
car1 = dict( name='my_car', brand='Ford', model='Mustang', year=1964,
             engine='V6', disp=260, units='cu.in' )
car2 = dict( name='your_car', brand='Chevy', model='Camaro', year=1969,
             engine='I6', disp=250, units='cu.in' )
car3 = dict( name='dads_car', brand='Mercedes', model='350SL', year=1972,
             engine='V8', disp=4520, units='cc' )
car4 = dict( name='moms_car', brand='Plymouth', model='Voyager', year=1989,
             engine='V6', disp=289, units='cu.in' )
a_truck = dict( brand='Dodge', model='RAM', year=1984,
                engine='V8', disp=359, units='cu.in' )
garage = dict(my_car=car1,
              your_car=car2,
              dads_car=car3,
              moms_car=car4,
              a_truck=a_truck )
with h5py.File('SO_61226773.h5','w') as h5w:
    for car in garage:
        print ('\nLoading dictionary:', car)
        load_dict_to_attr(h5w, garage.get(car))
with h5py.File('SO_61226773.h5','r') as h5r:
    print ('\nReading dictionaries from Group attributes:')
    h5r.visititems (get_grp_attrs)

If I understand what you are trying to do, this should work:
import numpy as np
import ast
import h5py
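def store_table(filename):
    table = dict()
    # .tolist() yields plain Python ints, so str(table) stays parseable by ast.literal_eval
    table['test'] = np.zeros(7, dtype=int).tolist()
    with h5py.File(filename, "w") as file:
        file.create_dataset('dataset_1', data=str(table))
def load_table(filename):
    with h5py.File(filename, "r") as file:
        data = file.get('dataset_1')[...].tolist()
    # newer h5py versions return stored strings as bytes; decode before parsing
    if isinstance(data, bytes):
        data = data.decode('utf-8')
    return ast.literal_eval(data)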
filename = "file.h5"
store_table(filename)
data = load_table(filename)
print(data)

My preferred solution is to serialize the dictionary to JSON, encode it as ASCII, and store the resulting binary data.
import h5py
import json
import itertools
#generate a test dictionary
testDict={
"one":1,
"two":2,
"three":3,
"otherStuff":[{"A":"A"}]
}
testFile=h5py.File("test.h5","w")
#create a test data set containing the binary representation of my dictionary data
testFile.create_dataset(name="dictionary",shape=(len([i.encode("ascii","ignore") for i in json.dumps(testDict)]),1),dtype="S10",data=[i.encode("ascii","ignore") for i in json.dumps(testDict)])
testFile.close()
testFile=h5py.File("test.h5","r")
#load the test data back
dictionary=testFile["dictionary"][:].tolist()
dictionary=list(itertools.chain(*dictionary))
dictionary=json.loads(b''.join(dictionary))
The two key parts are:
testFile.create_dataset(name="dictionary",shape=(len([i.encode("ascii","ignore") for i in json.dumps(testDict)]),1),dtype="S10",data=[i.encode("ascii","ignore") for i in json.dumps(testDict)])
Where
data=[i.encode("ascii","ignore") for i in json.dumps(testDict)])
Converts the dictionary to a list of ASCII characters (the dataset shape can also be calculated from this list).
Decoding back from the hdf5 container is a little simpler:
dictionary=testFile["dictionary"][:].tolist()
dictionary=list(itertools.chain(*dictionary))
dictionary=json.loads(b''.join(dictionary))
All this does is load the data from the HDF5 container and convert it to a list of bytes. I then join these into a single bytes object, which json.loads converts back into a dictionary.
If you are OK with the extra library usage (json, itertools), I think this offers a somewhat more Pythonic solution (the extra imports weren't a problem in my case, since I was using them anyway).

Exporting a List is producing an empty CSV file

I have two lists that are zipped together. I am able to print the zipped list to view it, but when I try to export it to a CSV file, the file is created but is empty. I'm not sure why, as I'm using the same method to save the two lists separately and that works.
import csv
import random
import datetime
import calendar
with open("Duty Name List.csv") as CsvNameList:
    NameList = CsvNameList.read().split("\n")
date = datetime.datetime.now()
MaxNumofDays = calendar.monthrange(date.year,date.month)
print (NameList)
print(date.year)
print(date.month)
print(MaxNumofDays[1])
x = MaxNumofDays[1] + 1
daylist = list(range(1,x))
print(daylist)
ShuffledList = random.sample(NameList,len(daylist))
print(ShuffledList)
RemainderList = set(NameList) - set(ShuffledList)
print(RemainderList)
with open("remainder.csv","w") as f:
    wr = csv.writer(f,delimiter="\n")
    wr.writerow(RemainderList)
AssignedDutyList = zip(daylist,ShuffledList)
print(list(AssignedDutyList))
with open("AssignedDutyList.csv","w") as g:
    wr = csv.writer(g)
    wr.writerow(list(AssignedDutyList))
No error messages are produced.

In Python 3, this line
AssignedDutyList = zip(daylist,ShuffledList)
creates an iterator named AssignedDutyList.
This line
print(list(AssignedDutyList))
exhausts the iterator. When this line is executed
wr.writerow(list(AssignedDutyList))
the iterator has no further output, so nothing is written to the file.
The solution is to store the result of calling list on the iterator in a variable rather than the iterator itself, in cases where the content of an iterator must be reused.
AssignedDutyList = list(zip(daylist,ShuffledList))
print(AssignedDutyList)
with open("AssignedDutyList.csv","w") as g:
    wr = csv.writer(g)
    wr.writerow(AssignedDutyList)
As a bonus, the name AssignedDutyList now refers to an actual list, and so is less confusing for future readers of the code.
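The exhaustion described above is easy to reproduce in isolation. The following sketch uses made-up sample lists (days and names are not from the question) to show that a zip object yields its pairs only once, while a stored list can be reused.

```python
days = [1, 2, 3]
names = ["Ann", "Bob", "Cy"]

pairs = zip(days, names)      # a one-shot iterator in Python 3
first_pass = list(pairs)      # consumes every pair
second_pass = list(pairs)     # nothing left to yield

print(first_pass)   # [(1, 'Ann'), (2, 'Bob'), (3, 'Cy')]
print(second_pass)  # []

# Materialising once gives a list that can be read any number of times.
stored = list(zip(days, names))
print(stored == first_pass, list(stored) == list(stored))  # True True
```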

Objects/classes/lists Python

I am confused about classes in python. I don't want anyone to write down raw code but suggest methods of doing it. Right now I have the following code...
def main():
    lst = []
    filename = 'yob' + input('Enter year: ') + '.txt'
    for line in open(filename):
        line = line.strip()
        lst.append(line.split(','))
What this code does is take an input for a file based on a year. The program is placed in a folder with a bunch of text files, one for each year. Then, I made a class...
class Names():
    __slots__ = ('Name', 'Gender', 'Occurences')
This class just defines what objects I should make. The goal of the project is to build objects and create lists based off these objects. My main function returns a list containing several elements that look like the following:
[[jon, M, 190203], ...]
Each of these elements has a name at index [0], a gender (M or F) at [1], and an occurrence count at [2]. I'm trying to find the top 20 Male and Female candidates and print them out.
Goal-
There should be a function which creates a name entry, i.e. mkEntry. It should be passed the appropriate information, build a new object, populate the fields, and return it.

If all you want is a handy container class to hold your data in, I suggest using the namedtuple type factory from the collections module, which is designed for exactly this. You should probably also use the csv module to handle reading your file. Python comes with "batteries included", so learn to use the standard library!
from collections import namedtuple
import csv
Person = namedtuple('Person', ('name', 'gender', 'occurences')) # create our type
def main():
    filename = 'yob' + input('Enter year: ') + '.txt'
    with open(filename, newline="") as f: # parameters differ a bit in Python 2
        reader = csv.reader(f) # the reader handles splitting the lines for you
        lst = [Person(*row) for row in reader]
    return lst
Note: If you're using Python 2, the csv module needs you to open the file in binary mode (with a second argument of 'rb') rather than using the newline parameter.
If your file had just the single person you used in your example output, you'd get a list with one Person object (note that csv.reader yields every field as a string):
>>> print(lst)
[Person(name='jon', gender='M', occurences='190203')]
You can access the various values either by index (like a list or tuple) or by attribute name (like a custom object):
>>> jon = lst[0]
>>> print(jon[0])
jon
>>> print(jon.gender)
M
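Because csv.reader returns every field as a string, the count has to be converted to int before it can be sorted numerically. The sketch below is one way to do that and to pick the most frequent names. The helper names read_people and top_n are hypothetical, and the rows in sample are made up, with io.StringIO standing in for an opened yob file.

```python
import csv
import io
from collections import namedtuple

Person = namedtuple("Person", ("name", "gender", "occurrences"))


def read_people(lines):
    """Build Person records, converting the count column to int."""
    return [Person(name, gender, int(count))
            for name, gender, count in csv.reader(lines)]


def top_n(people, gender, n=20):
    """Return the n most frequent names for one gender, highest first."""
    matching = [p for p in people if p.gender == gender]
    return sorted(matching, key=lambda p: p.occurrences, reverse=True)[:n]


# io.StringIO stands in for an opened 'yob<year>.txt' file; the rows are made up.
sample = io.StringIO("jon,M,190203\nmary,F,250000\nal,M,5\nsue,F,90\n")
people = read_people(sample)
print(top_n(people, "M", n=1))  # [Person(name='jon', gender='M', occurrences=190203)]
```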

In your class, add an __init__ method, like this:
    def __init__(self, name, gender, occurrences):
        self.Name = name
        # etc.
Now you don't need a separate "make" method; just call the class itself as a constructor:
myname = Names(lst[0], etc.)
And that's all there is to it.
If you really want an mkEntry function anyway, it'll just be a one-liner: return Names(etc.)
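Put together, a minimal sketch of the class and the mkEntry function from the assignment might look like this. It keeps the slot names from the question (including its spelling of Occurences) and assumes the count arrives as a string that should be stored as an int. The sample values are only for illustration.

```python
class Names:
    # __slots__ limits instances to exactly these three attributes.
    __slots__ = ("Name", "Gender", "Occurences")

    def __init__(self, name, gender, occurrences):
        self.Name = name
        self.Gender = gender
        self.Occurences = occurrences


def mkEntry(name, gender, occurrences):
    """Build a Names object, populate its fields, and return it."""
    return Names(name, gender, int(occurrences))


entry = mkEntry("jon", "M", "190203")
print(entry.Name, entry.Gender, entry.Occurences)  # jon M 190203
```

With a row from the question's lst, the call would be mkEntry(*row).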

I know you said not to write out the code but it's just easier to explain it this way. You don't need to use slots - they're for a specialised optimisation purpose (and if you don't know what it is, you don't need it).
class Person(object):
    def __init__(self, name, gender, occurrences):
        self.name = name
        self.gender = gender
        self.occurrences = occurrences
def main():
    # read in the csv to create a list of Person objects
    people = []
    filename = 'yob' + input('Enter year: ') + '.txt'
    for line in open(filename):
        line = line.strip()
        fields = line.split(',')
        p = Person(fields[0], fields[1], int(fields[2]))
        people.append(p)
    # split into genders
    p_m = [p for p in people if p.gender == 'M']
    p_f = [p for p in people if p.gender == 'F']
    # sort each by occurrences descending
    p_m = sorted(p_m, key=lambda x: -x.occurrences)
    p_f = sorted(p_f, key=lambda x: -x.occurrences)
    # print out the first 20 of each
    for p in p_m[:20]:
        print(p.name, p.gender, p.occurrences)
    for p in p_f[:20]:
        print(p.name, p.gender, p.occurrences)
if __name__ == '__main__':
    main()
I've used a couple of features here that might look a little scary, but they're easy enough once you get used to them (and you'll see them all over python code). List comprehensions give us an easy way of filtering our list of people into genders. lambda gives you an anonymous function. The [:20] syntax says, give me the first 20 elements of this list - refer to list slicing.
Your case is quite simple and you probably don't even really need the class / objects but it should give you an idea of how you use them. There's also a csv reading library in python that will help you out if the csvs are more complex (quoted fields etc).
