I have a big list that I pulled in from a .csv:
CSV_PATH = 'myfile.csv'
CSV_OBJ = csv.DictReader(open(CSV_PATH, 'r'))
CSV_LIST = list(CSV_OBJ)
And I only want to keep some of the columns in it:
KEEP_COLS = ['Name', 'Year', 'Total Allocations', 'Enrollment']
It seems from "Removing multiple keys from a dictionary safely" that this ought to work:
BETTER = {k: v for k, v in CSV_LIST if k not in KEEP_COLS}
But I get an error:
ValueError: too many values to unpack
What am I missing here? I could write a loop that runs through CSV_LIST and produces BETTER by keeping only what I want, but I suspect that using a comprehension is more pythonic.
As requested, a chunk of CSV_LIST
{'EIN': '77-0000091',
'FR': '28.4',
'Name': 'Org A',
'Enrollment': '506',
'Total Allocations': '$34214',
'geo_latitude': '37.9381775755',
'geo_longitude': '-122.3146910612',
'Year': '2009'},
{'EIN': '77-0000091',
'FR': '28.4',
'Name': 'Org A',
'Enrollment': '506',
'Total Allocations': '$34214',
'geo_latitude': '37.9381775755',
'geo_longitude': '-122.3146910612',
'Year': '2010'}
At the commandline I can do csvcut -c 'Name','Year','Total Allocations','Enrollment' myfile.csv > better_myfile.csv but that's definitely not pythonic.
Your dictionary comprehension is fine, but since you have a list of dictionaries, you have to create a list comprehension using that dictionary comprehension for the individual list items. Also, since you want to keep those columns, I guess you should drop that not. Try this:
[{k: v for k, v in d.items() if k in KEEP_COLS} for d in CSV_LIST]
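To see the nested comprehension run end to end, here is a minimal, self-contained sketch. The two rows are copied from the sample in the question with a few keys left out, and the name `filtered` is only an illustrative choice.

```python
KEEP_COLS = ['Name', 'Year', 'Total Allocations', 'Enrollment']

# Two rows shaped like the sample in the question (geo_* keys omitted).
CSV_LIST = [
    {'EIN': '77-0000091', 'FR': '28.4', 'Name': 'Org A', 'Enrollment': '506',
     'Total Allocations': '$34214', 'Year': '2009'},
    {'EIN': '77-0000091', 'FR': '28.4', 'Name': 'Org A', 'Enrollment': '506',
     'Total Allocations': '$34214', 'Year': '2010'},
]

# The outer list comprehension walks the rows; the inner dict comprehension
# keeps only the wanted keys of each row.
filtered = [{k: v for k, v in d.items() if k in KEEP_COLS} for d in CSV_LIST]

for row in filtered:
    print(row)
```

Each resulting dict keeps the key order of its source row, not the order of KEEP_COLS.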
An alternative is to use
CSV_LIST = list(map(operator.itemgetter(*KEEP_COLS), CSV_OBJ))  # requires: import operator
This will create a list of tuples with the desired columns.
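A short sketch of what the itemgetter approach returns, using one row from the question's sample. The names `get_cols` and `as_tuples` are made up for illustration. Note that the keys are lost: each tuple holds only the values, in the order given by KEEP_COLS.

```python
import operator

KEEP_COLS = ['Name', 'Year', 'Total Allocations', 'Enrollment']

rows = [
    {'EIN': '77-0000091', 'FR': '28.4', 'Name': 'Org A', 'Enrollment': '506',
     'Total Allocations': '$34214', 'Year': '2009'},
]

# itemgetter with several keys returns a callable that yields a tuple
# of the corresponding values, in the order the keys were given.
get_cols = operator.itemgetter(*KEEP_COLS)
as_tuples = [get_cols(row) for row in rows]

print(as_tuples)
```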
The issue is that CSV_LIST is a list, not a single dict. @tobias explained how to unpack it correctly.
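The error can be reproduced with a tiny list, which may make the cause easier to see. This is only a sketch with a made-up three-key row; the exact wording of the message can differ between Python versions.

```python
rows = [{'EIN': '77-0000091', 'Name': 'Org A', 'Year': '2009'}]

try:
    # Iterating a list of dicts yields whole dicts. Unpacking one dict into
    # `k, v` iterates over its three keys, which is too many for two names.
    bad = {k: v for k, v in rows}
except ValueError as exc:
    error_message = str(exc)
    print(error_message)
```

A row with exactly two keys would unpack without an error and silently produce nonsense, so the exception is the lucky case.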
However, if you're worried about being Pythonic, why are you processing a DictReader into a list of dictionaries and then filtering out all but a few keys? Without knowing your use case I can't be sure, but it's likely that it would be cleaner and simpler to just use the DictReader row-by-row the way it was intended to be used:
with open(CSV_PATH, 'r') as f:
    for row in csv.DictReader(f):
        process(row['Name'], row['Year'], row['Total Allocations'], row['Enrollment'])
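If the goal is simply a trimmed copy of the file, like the csvcut command in the question, the same row-by-row style can feed a csv.DictWriter. The sketch below assumes that is the use case; it uses io.StringIO objects as stand-ins for the input and output files so it runs without anything on disk.

```python
import csv
import io

KEEP_COLS = ['Name', 'Year', 'Total Allocations', 'Enrollment']

# Stand-in for myfile.csv (columns taken from the question's sample).
source = io.StringIO(
    "EIN,FR,Name,Enrollment,Total Allocations,Year\n"
    "77-0000091,28.4,Org A,506,$34214,2009\n"
    "77-0000091,28.4,Org A,506,$34214,2010\n"
)
# Stand-in for better_myfile.csv.
target = io.StringIO()

# extrasaction='ignore' tells DictWriter to skip keys not in fieldnames
# instead of raising ValueError.
writer = csv.DictWriter(target, fieldnames=KEEP_COLS, extrasaction='ignore')
writer.writeheader()
for row in csv.DictReader(source):
    writer.writerow(row)

print(target.getvalue())
```

With real files, open the output with newline='' as the csv module documentation recommends.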
For example, for the txt file of
Math, Calculus, 5
Math, Vector, 3
Language, English, 4
Language, Spanish, 4
into the dictionary of:
data={'Math':{'name':[Calculus, Vector], 'score':[5,3]}, 'Language':{'name':[English, Spanish], 'score':[4,4]}}
I am having trouble with appending value to create list inside the smaller dict. I'm very new to this and I would not understand importing command. Thank you so much for all your help!
For each line, find the 3 values, then add them to a dict structure
from pathlib import Path
result = {}
for row in Path("test.txt").read_text().splitlines():
    subject_type, subject, score = row.split(", ")
    if subject_type not in result:
        result[subject_type] = {'name': [], 'score': []}
    result[subject_type]['name'].append(subject)
    result[subject_type]['score'].append(int(score))
You can simplify it with the use of a defaultdict that creates the mapping if the key isn't already present
from collections import defaultdict
from pathlib import Path
# Each missing key gets a fresh {'name': [], 'score': []} mapping on first access.
result = defaultdict(lambda: {'name': [], 'score': []})
for row in Path("test.txt").read_text().splitlines():
    subject_type, subject, score = row.split(", ")
    result[subject_type]['name'].append(subject)
    result[subject_type]['score'].append(int(score))
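For a quick check without creating test.txt, the same defaultdict logic can be run against an inline string. This is a sketch only: the variable `text` stands in for the file contents from the question, and `data` is just a plain-dict copy for printing.

```python
from collections import defaultdict

# Stand-in for the contents of the text file in the question.
text = """Math, Calculus, 5
Math, Vector, 3
Language, English, 4
Language, Spanish, 4"""

# The lambda builds a new inner dict (with its own two lists) per subject type.
result = defaultdict(lambda: {'name': [], 'score': []})
for row in text.splitlines():
    subject_type, subject, score = row.split(", ")
    result[subject_type]['name'].append(subject)
    result[subject_type]['score'].append(int(score))

# Convert to a plain dict so it prints without the defaultdict wrapper.
data = dict(result)
print(data)
```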
With pandas.DataFrame you can read the formatted data directly and output the structure you want
import pandas as pd
df = pd.read_csv("test.txt", sep=", ", engine="python", names=['key', 'name', 'score'])
df = df.groupby('key').agg(list)
result = df.to_dict(orient='index')
From your data:
data={'Math':{'name':['Calculus', 'Vector'], 'score':[5,3]},
'Language':{'name':['English', 'Spanish'], 'score':[4,4]}}
If you want to append to the list inside your dictionary, you can do:
data['Math']['name'].append('Algebra')
data['Math']['score'].append(4)
If you want to add a new dictionary, you can do:
data['Science'] = {'name':['Chemistry', 'Biology'], 'score':[2,3]}
I am not sure if that is what you wanted but I hope it helps!
import csv
def partytoyear():
    party_in_power = {}
    with open("presidents.txt") as f:
        reader = csv.reader(f)
        for row in reader:
            party = row[1]
            for year in row[2:]:
                party_in_power[year] = party
    print(party_in_power)
    return party_in_power
partytoyear()
def statistics():
    with open("BLS_private.csv") as f:
        statistics = {}
        reader = csv.DictReader(f)
        for row in reader:
            statistics = row
    print(statistics)
    return statistics
statistics()
These two functions return two dictionaries.
Here is a sample of the first dictionary:
'Democrat', '1981': 'Republican', '1982': 'Republican', '1983'
Sample of the second dictionary:
'2012', '110470', '110724', '110871', '110956', '111072', '111135', '111298', '111432', '111560', '111744'
The first dictionary associates a year and the political party. The next dictionary associates the year with job statistics.
I need to combine these two dictionaries, so I can have the party inside the dictionary with the job statistics.
I would like the dictionary to look like this:
'Democrat', '2012', '110470', '110724', '110871', '110956', '111072', '111135', '111298', '111432', '111560', '111744'
How would I go about doing this? I've looked at the syntax for update() but that didn't work for my program
You can't write a dictionary in that manner in Python; it's syntactically invalid. You can, however, make each value a collection such as a list. Dictionary keys must also be unique, so key the result by year rather than by party (otherwise each party could keep only one year). Here's a comprehension that does that using dict lookups:
first_dict = {'1981': 'Republican', '1982': 'Republican', '2012': 'Democrat'}  # year -> party (abridged)
second_dict = {'2012': ['110470', '110724', '110871', '110956', '111072', '111135', '111298', '111432', '111560', '111744']}  # year -> stats (abridged)
result = {year: [party, *second_dict[year]] for year, party in first_dict.items() if year in second_dict}
Pseudo result dict structure:
{'year': ['Party Name', stats, ...], ...}
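As a runnable illustration of combining the two lookups, here is a small sketch. It assumes both dictionaries are keyed by year strings, as the question's functions suggest. The names `party_by_year`, `stats_by_year`, `combined` and `by_party` are invented for the example, and the statistics list is shortened.

```python
# year -> party, as built by the question's first function (abridged)
party_by_year = {'1981': 'Republican', '1982': 'Republican', '2012': 'Democrat'}
# year -> job statistics (abridged to three values)
stats_by_year = {'2012': ['110470', '110724', '110871']}

# One entry per year: the party first, then that year's statistics.
# .get() avoids a KeyError for years missing from party_by_year.
combined = {
    year: [party_by_year.get(year, 'Unknown'), *stats]
    for year, stats in stats_by_year.items()
}

# If grouping by party is wanted, nest the years under each party so that
# several years of the same party do not overwrite each other.
by_party = {}
for year, stats in stats_by_year.items():
    party = party_by_year.get(year, 'Unknown')
    by_party.setdefault(party, {})[year] = stats

print(combined)
print(by_party)
```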
I'm trying to create a dictionary of dictionaries like this:
food = {"Broccoli": {"Taste": "Bad", "Smell": "Bad"},
"Strawberry": {"Taste": "Good", "Smell": "Good"}}
But I am populating it from an SQL table. So I've pulled the SQL table into an SQL object called "result". And then I got the column names like this:
nutCol = [i[0] for i in result.description]
The table has about 40 characteristics, so it is quite long.
I can do this...
foodList = {}
for id, food in enumerate(result):
    addMe = {str(food[1]): {nutCol[id + 2]: food[2], nutCol[id + 3]:
                            food[3] ...}}
    foodList.update(addMe)
But this of course would look horrible and take a while to write. And I'm still working out how I want to build this whole thing so it's possible I'll need to change it a few times...which could get extremely tedious.
Is there a DRY way of doing this?
To make the solution independent of column position, you can use dict1.update(dict2), which merges dict2 into dict1.
In our case, since we have a dict of dicts, we can use dict['key'] as dict1 and pass any additional key/value pairs as dict2.
Here is an example.
food = {"Broccoli": {"Taste": "Bad", "Smell": "Bad"},
"Strawberry": {"Taste": "Good", "Smell": "Good"}}
addthis = {'foo':'bar'}
Suppose you want to add the addthis dict to food["Strawberry"]; you can simply use:
food["Strawberry"].update(addthis)
Getting result:
>>> food
{'Strawberry': {'Taste': 'Good', 'foo': 'bar', 'Smell': 'Good'},'Broccoli': {'Taste': 'Bad', 'Smell': 'Bad'}}
>>>
Assuming that column 0 is what you wish to use as your key, and you do wish to build a dictionary of dictionaries, then it's:
detail_names = [col[0] for col in result.description[1:]]
foodList = {row[0]: dict(zip(detail_names, row[1:]))
            for row in result}
Generalising, if column k is your identity then it's:
foodList = {row[k]: {col[0]: row[i]
                     for i, col in enumerate(result.description) if i != k}
            for row in result}
(Here each sub dictionary is all columns other than column k)
addMe = {str(food[1]):dict(zip(nutCol[2:],food[2:]))}
zip will take two (or more) lists of items and pair the elements, then you can pass the result to dict to turn the pairs into a dictionary.
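The dict(zip(...)) idea can be tried against an in-memory SQLite table. Everything about the table here is an assumption made for the sketch (its name, its three columns, and the key being column 0); the question's real table has about 40 columns and a different layout.

```python
import sqlite3

# Hypothetical table, small enough to show the whole result.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE food (name TEXT, taste TEXT, smell TEXT)")
conn.executemany(
    "INSERT INTO food VALUES (?, ?, ?)",
    [("Broccoli", "Bad", "Bad"), ("Strawberry", "Good", "Good")],
)

# The aliases fix the spelling of the names reported in cursor.description.
result = conn.execute("SELECT name, taste AS Taste, smell AS Smell FROM food")

# Column 0 is the key; every other column name is paired with its value.
detail_names = [col[0] for col in result.description[1:]]
food = {row[0]: dict(zip(detail_names, row[1:])) for row in result}
conn.close()

print(food)
```

Adding or reordering columns in the query changes the inner dictionaries without touching the comprehension, which is the DRY property the question asks for.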
I'm trying to perform operations on a nested dictionary (data retrieved from a yaml file):
data = {'services': {'web': {'name': 'x'}}, 'networks': {'prod': 'value'}}
I'm trying to modify the above using the inputs like:
{'services.web.name': 'new'}
I converted the above to a list of indices ['services', 'web', 'name']. But I'm not able to/not sure how to perform the below operation in a loop:
data['services']['web']['name'] = 'new'
That way I can modify the data dict. There are other values I plan to change in this dictionary (it is an extensive one), so I need a solution that also works for deeper paths, e.g.:
data['services2']['web2']['networks']['local']
Is there an easy way to do this? Any help is appreciated.
You may iterate over the keys while moving a reference:
data = {'networks': {'prod': 'value'}, 'services': {'web': {'name': 'x'}}}
modification = {'services.web.name': 'new'}
for key, value in modification.items():
    keyparts = key.split('.')
    to_modify = data
    for keypart in keyparts[:-1]:
        to_modify = to_modify[keypart]
    to_modify[keyparts[-1]] = value
print(data)
Giving:
{'networks': {'prod': 'value'}, 'services': {'web': {'name': 'new'}}}
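The loop above can be wrapped in a small helper for reuse. The function name `set_by_path` and its `sep` parameter are hypothetical, and unlike the loop above this sketch creates missing intermediate dictionaries with setdefault instead of raising KeyError. That is an assumption about the desired behaviour, not something the question states.

```python
def set_by_path(data, dotted_key, value, sep='.'):
    """Set data[k1][k2]...[kn] = value for a key such as 'k1.k2.kn'."""
    *parents, last = dotted_key.split(sep)
    node = data
    for part in parents:
        # Walk down one level, creating an empty dict if the key is missing.
        node = node.setdefault(part, {})
    node[last] = value


data = {'services': {'web': {'name': 'x'}}, 'networks': {'prod': 'value'}}

set_by_path(data, 'services.web.name', 'new')
set_by_path(data, 'services2.web2.networks.local', 'value2')

print(data)
```

If an intermediate key already holds a non-dict value (for example 'networks.prod.x' here), the helper fails with an AttributeError, so real input may need validation first.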
I have a CSV with two columns, column one is the team dedicated to a particular building in our project.
The second column is the actual building number.
What I am looking for is a dictionary with the first column as the key and the buildings that belong to that team in the list.
I have tried various forms of csv.reader and csv.DictReader along with different for loops to rewrite the data to another dictionary, but I cannot get the structure I want.
CSV:
team,bldg,
3,204,
3,250,
3,1437,
2,1440,
1,1450,
The structure of the dictionary would be as follows:
dict["1"] = ["1450"]
dict["2"] = ["1440"]
dict["3"] = ["204", "250", "1437"]
This works:
import csv
result={}
with open('/tmp/test.csv','r') as f:
    red=csv.DictReader(f)
    for d in red:
        result.setdefault(d['team'],[]).append(d['bldg'])
#results={'1': ['1450'], '3': ['204', '250', '1437'], '2': ['1440']}
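The setdefault version can be tried without a file by feeding the sample CSV through io.StringIO. This is a sketch; `source` is a stand-in for the open file. The trailing commas in the sample create an extra column with an empty name, which DictReader tolerates and this code ignores.

```python
import csv
import io

# The sample CSV from the question, including its trailing commas.
source = io.StringIO(
    "team,bldg,\n"
    "3,204,\n"
    "3,250,\n"
    "3,1437,\n"
    "2,1440,\n"
    "1,1450,\n"
)

result = {}
for d in csv.DictReader(source):
    # setdefault returns the existing list for this team, or stores and
    # returns a new empty list the first time the team is seen.
    result.setdefault(d['team'], []).append(d['bldg'])

print(result)
```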
The useful collections.defaultdict in the standard library makes short work of this task:
import csv
import collections as co
dd = co.defaultdict(list)
with open('/path/to/your.csv', newline='') as fin:  # on Python 2: open(path, 'rb')
    dr = csv.DictReader(fin)
    for line in dr:
        dd[line['team']].append(line['bldg'])
# defaultdict(<class 'list'>, {'3': ['204', '250', '1437'], '2': ['1440'], '1': ['1450']})
http://docs.python.org/2/library/collections.html#collections.defaultdict
The first argument provides the initial value for the default_factory
attribute; it defaults to None.
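What the quoted sentence means in practice can be shown in a few lines. In this sketch, `dd` uses list as its default_factory, while `plain` is created without one and therefore behaves like an ordinary dict on a missing key; both names are illustrative.

```python
from collections import defaultdict

# default_factory is list: a missing key is filled with list() on first access.
dd = defaultdict(list)
dd['3'].append('204')
dd['3'].append('250')

# No argument: default_factory is None, so a missing key raises KeyError.
plain = defaultdict()
try:
    plain['3']
    raised = False
except KeyError:
    raised = True

print(dd['3'], dd.default_factory, raised)
```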