Create a dictionary from csv file?

I want to do the following in Python. The csv file is:
item1,item2,item2,item3
item2,item3,item4,item1
I want to make a dictionary with the unique keys item1, item2, item3 and item4:
dictionary = {item1: value1, item2: value2, ...}
The value is how many times the key appears in the csv file. How can I do this?

Obtain a list of all items from your csv:
with open('your.csv') as f:
    # splitlines() drops the trailing newlines that readlines() would keep
    content = f.read().splitlines()
items = ','.join(content).split(',')
Then build the mapping:
mapping = {}
for item in items:
    mapping[item] = mapping.get(item, 0) + 1
and you will get the following:
>>> mapping
{'item2': 3, 'item3': 2, 'item1': 2, 'item4': 1}

import csv
from collections import Counter
# Define a generator that yields field after field,
# ignoring newlines:
def iter_fields(filename):
    with open(filename, newline='') as f:
        reader = csv.reader(f)
        for row in reader:
            for field in row:
                yield field
# Now use collections.Counter to count your values:
counts = Counter(iter_fields('stackoverflow.csv'))
print(counts)
# Output (fields with a leading space are counted as separate keys):
# Counter({'item3': 2, 'item2': 2, 'item1': 1,
#          ' item1': 1, ' item2': 1, 'item4': 1})
See https://docs.python.org/2/library/collections.html#collections.Counter

import csv
temp = dict()
with open('stackoverflow.csv', newline='') as f:
    reader = csv.reader(f)
    for row in reader:
        for x in row:
            if x in temp:
                temp[x] += 1
            else:
                temp[x] = 1
print(temp)
The output looks like:
{'item2': 3, 'item3': 2, 'item1': 2, 'item4': 1}
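For Python 3, the counting approaches above can be sketched as follows. This is a minimal illustration rather than a definitive implementation: the sample data is inlined through io.StringIO instead of a real file, and the helper name count_fields is made up for this example. Fields are stripped so that 'item1' and ' item1' are counted together.

```python
import csv
import io
from collections import Counter

# Inline sample standing in for the CSV file (an assumption for illustration).
sample = io.StringIO("item1,item2,item2,item3\nitem2, item3,item4,item1\n")


def count_fields(lines):
    """Count every field across all rows, ignoring surrounding whitespace."""
    reader = csv.reader(lines)
    return Counter(field.strip() for row in reader for field in row)


counts = dict(count_fields(sample))
print(counts)
```

With a real file, pass the object returned by open('your.csv', newline='') instead of the StringIO buffer.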

Related

write dictionary of lists to a tab delimited file in python, with dictionary key values as columns without Pandas

The dictionary I am using is:
dict={'item': [1,2,3], 'id':['a','b','c'], 'car':['sedan','truck','moped'], 'color': ['r','b','g'], 'speed': [2,4,10]}
I am trying to produce tab-delimited output such as:
item id
1 a
2 b
3 c
The code I have written:
with open('file.txt', 'w') as tab_file:
    dict_writer = DictWriter(tab_file, dict.keys(), delimiter = '\t')
    dict_writer.writeheader()
    dict_writer.writerows(dict)
Specifically, I am struggling with writing to the file in a column-based manner, meaning that the dictionary keys populate the header and the dictionary values populate vertically underneath the associated header. Also, I do NOT have the luxury of using Pandas.

This solution will work for an arbitrary number of items and subitems in the dict:
d = {'item': [1, 2, 3], 'id': [4, 5, 6]}
for i in d:
    print(i + "\t", end="")
    numSubItems = len(d[i])
print()
for level in range(numSubItems):
    for i in d:
        print(str(d[i][level]) + "\t", end="")
    print()
EDIT:
To implement this with writing to a text file:
d = {'item': [1, 2, 3], 'id': [4, 5, 6], 'test': [6, 7, 8]}
with open('file.txt', 'w') as f:
    for i in d:
        f.write(i + "\t")
        numSubItems = len(d[i])
    f.write("\n")
    for level in range(numSubItems):
        for i in d:
            f.write(str(d[i][level]) + "\t")
        f.write("\n")

Here's a way to do this using a one-off function and zip:
d = {
    'item': [1, 2, 3],
    'id': ['a', 'b', 'c'],
    'car': ['sedan', 'truck', 'moped'],
    'color': ['r', 'b', 'g'],
    'speed': [2, 4, 10],
}
def row_printer(row):
    print(*row, sep='\t')
row_printer(d.keys())  # Print header
for t in zip(*d.values()):  # Print rows
    row_printer(t)
To print to a file, pass an open file object rather than a filename: print(..., file=f), where f comes from open('file.txt', 'w').
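The header-then-zipped-rows idea can also be sketched with the standard csv module, which handles the tab delimiter itself. This is only one possible approach: the function name write_columns is hypothetical, and an io.StringIO buffer stands in for the output file so the result can be inspected.

```python
import csv
import io

d = {
    'item': [1, 2, 3],
    'id': ['a', 'b', 'c'],
}


def write_columns(mapping, out):
    """Write the dict keys as a header row, then one row per list position."""
    writer = csv.writer(out, delimiter='\t', lineterminator='\n')
    writer.writerow(mapping.keys())
    # zip(*values) turns the column lists into row tuples.
    writer.writerows(zip(*mapping.values()))


buffer = io.StringIO()  # stand-in for open('file.txt', 'w', newline='')
write_columns(d, buffer)
text = buffer.getvalue()
print(text, end='')
```

Note that zip stops at the shortest list, so columns of unequal length would be silently truncated.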

You can use a simple loop with a zip:
d={'item': [1,2,3], 'id':["a","b","c"]}
print('item\tid')
for num, letter in zip(d['item'], d['id']):
    print('\t'.join((str(num), letter)))
item id
1 a
2 b
3 c
EDIT:
If you don't want to hard-code the column names you can use this:
d={'item': [1,2,3], 'id':["a","b","c"]}
print('\t'.join(d.keys()))
for row in zip(*d.values()):
    print('\t'.join(str(value) for value in row))
However, the order of the columns is only guaranteed in Python 3.7+ if you use a plain dictionary. On a lower Python version use an OrderedDict built from a list of pairs instead, like this:
from collections import OrderedDict
d=OrderedDict([('item', [1,2,3]), ('id', ["a","b","c"])])
print('\t'.join(d.keys()))
for row in zip(*d.values()):
    print('\t'.join(str(value) for value in row))

Instead of using csv.DictWriter you can also use a module like pandas for this:
import pandas as pd
df = pd.DataFrame.from_dict(d)
df.to_csv("test.csv", sep="\t", index=False)
You may have to install it first:
pip3 install pandas

Replace values in Python dict

I have 2 files. The first has only 2 columns:
A 2
B 5
C 6
And the second has the letters as a first column.
A cat
B dog
C house
I want to replace the letters in the second file with the numbers that correspond to them in the first file so I would get.
2 cat
5 dog
6 house
I created a dict from the first and read the second. I tried a few things but none worked. I can't seem to replace the values.
import csv
with open('filea.txt','rU') as f:
    reader = csv.reader(f, delimiter="\t")
    for i in reader:
        print i[0] #reads only first column
        a_data = (i[0])
dictList = []
with open('file2.txt', 'r') as d:
    for line in d:
        elements = line.rstrip().split("\t")[0:]
        dictList.append(dict(zip(elements[::1], elements[0::1])))
for key, value in dictList.items():
    if value == "A":
        dictList[key] = "cat"

The issue appears to be on your last lines:
for key, value in dictList.items():
    if value == "A":
        dictList[key] = "cat"
This should be:
for key, value in list(dictList.items()):
    if key in a_data:
        dictList[a_data[key]] = dictList[key]
        del dictList[key]
For example:
d1 = {'A': 2, 'B': 5, 'C': 6}
d2 = {'A': 'cat', 'B': 'dog', 'C': 'house', 'D': 'car'}
for key, value in list(d2.items()):  # iterate over a copy so d2 can be modified
    if key in d1:
        d2[d1[key]] = d2[key]
        del d2[key]
>>> d2
{'D': 'car', 2: 'cat', 5: 'dog', 6: 'house'}
Notice that this method allows for items in the second dictionary which don't have a key from the first dictionary.
Wrapped up in a conditional dictionary comprehension format (applied to the original d2):
>>> {d1[k] if k in d1 else k: d2[k] for k in d2}
{2: 'cat', 5: 'dog', 6: 'house', 'D': 'car'}
I believe this code will get you your desired result:
import csv
with open('filea.txt', newline='') as f:
    reader = csv.reader(f, delimiter="\t")
    d1 = {}
    for line in reader:
        if line[1] != "":
            d1[line[0]] = int(line[1])
with open('fileb.txt', newline='') as f:
    reader = csv.reader(f, delimiter="\t")
    next(reader)  # Skip header row.
    d2 = {}
    for line in reader:
        d2[line[0]] = [float(i) for i in line[1:]]
d3 = {d1[k] if k in d1 else k: d2[k] for k in d2}

You could use dictionary comprehension:
d1 = {'A':2,'B':5,'C':6}
d2 = {'A':'cat','B':'dog','C':'house'}
In [23]: {d1[k]:d2[k] for k in d1.keys()}
Out[23]: {2: 'cat', 5: 'dog', 6: 'house'}

If the two dictionaries are called a and b, you can construct a new dictionary this way:
composed_dict = {a[k]:b[k] for k in a}
This takes every key in a and uses the corresponding values from a and b as the key and value of a new dictionary.
Regarding your code:
The variable a_data has no purpose. You read the first file, print the first column, and do nothing else with the data in it.
zip(elements[::1], elements[0::1]) just pairs each element with itself, like [1,2,3] -> [(1,1),(2,2),(3,3)], which I think is not what you want.
In the end you have a list of dictionaries, and in the last lines you just put strings in that list. I think that is not intentional.
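Putting the composition idea together for the two files in the question, a minimal Python 3 sketch might look like this. The file contents are inlined through io.StringIO as an assumption for illustration, and the helper name read_pairs is hypothetical.

```python
import csv
import io

# Inline stand-ins for the two tab-delimited files (assumption for illustration).
file_a = io.StringIO("A\t2\nB\t5\nC\t6\n")
file_b = io.StringIO("A\tcat\nB\tdog\nC\thouse\n")


def read_pairs(lines):
    """Read a two-column tab-delimited source into a dict of strings."""
    reader = csv.reader(lines, delimiter='\t')
    return {row[0]: row[1] for row in reader if len(row) >= 2}


a = read_pairs(file_a)
b = read_pairs(file_b)

# Replace each letter with its number; letters missing from `a` are kept as-is.
replaced = [(a.get(letter, letter), word) for letter, word in b.items()]
for number, word in replaced:
    print(number, word, sep='\t')
```

The numbers stay strings here because they are only written back out; convert them with int() if they are needed for arithmetic.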

import re
d1 = dict()
with open('filea.txt', 'r') as fl:
    for f in fl:
        key, val = re.findall(r'\w+', f)
        d1[key] = val
d2 = dict()
with open('file2.txt', 'r') as fl:
    for f in fl:
        key, val = re.findall(r'\w+', f)
        d2[key] = val
# Open in text mode: the formatted lines written below are str, not bytes
with open('file3.txt', 'w') as f:
    for k, v in d1.items():
        f.write("{a}\t{b}\n".format(a=v, b=d2[k]))

convert csv file to list of dictionaries

I have a csv file
col1, col2, col3
1, 2, 3
4, 5, 6
I want to create a list of dictionaries from this csv.
Desired output:
a= [{'col1':1, 'col2':2, 'col3':3}, {'col1':4, 'col2':5, 'col3':6}]
How can I do this?

Use csv.DictReader:
import csv
with open('test.csv') as f:
    a = [{k: int(v) for k, v in row.items()}
         for row in csv.DictReader(f, skipinitialspace=True)]
This results in:
[{'col2': 2, 'col3': 3, 'col1': 1}, {'col2': 5, 'col3': 6, 'col1': 4}]

Another simpler answer:
import csv
with open("configure_column_mapping_logic.csv", "r") as f:
    reader = csv.DictReader(f)
    a = list(reader)
print(a)
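Keep in mind that csv.DictReader returns every value as a string. A hedged sketch of converting only the values that look like integers is shown below; the helper to_int_if_possible and the inline sample (with one non-numeric cell) are made up for this illustration.

```python
import csv
import io

# Inline sample with one non-numeric value (assumption for illustration).
sample = io.StringIO("col1,col2,col3\n1,2,x\n4,5,6\n")


def to_int_if_possible(text):
    """Return int(text) when it parses as an integer, else the original string."""
    try:
        return int(text)
    except ValueError:
        return text


rows = [
    {key: to_int_if_possible(value) for key, value in row.items()}
    for row in csv.DictReader(sample)
]
print(rows)
```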

Using the csv module and a list comprehension:
import csv
with open('foo.csv') as f:
    reader = csv.reader(f, skipinitialspace=True)
    header = next(reader)
    a = [dict(zip(header, map(int, row))) for row in reader]
print(a)
Output:
[{'col3': 3, 'col2': 2, 'col1': 1}, {'col3': 6, 'col2': 5, 'col1': 4}]

Answering here after a long time, as I don't see any up-to-date answers.
import pandas as pd
df = pd.read_csv('Your csv file path')
data = df.to_dict('records')
print(data)

# similar solution via namedtuple:
import csv
from collections import namedtuple
with open('foo.csv', newline='') as f:
    # skipinitialspace strips the space after each comma, so the
    # header fields are valid namedtuple field names
    fh = csv.reader(f, skipinitialspace=True)
    headers = next(fh)
    Row = namedtuple('Row', headers)
    list_of_dicts = [Row._make(i)._asdict() for i in fh]

Well, while other people were out doing it the smart way, I implemented it naively. I suppose my approach has the benefit of not needing any external modules, although it will probably fail with weird configurations of values. Here it is just for reference:
a = []
with open("csv.txt") as myfile:
    firstline = True
    for line in myfile:
        if firstline:
            mykeys = "".join(line.split()).split(',')
            firstline = False
        else:
            values = "".join(line.split()).split(',')
            a.append({mykeys[n]:values[n] for n in range(0,len(mykeys))})

Simple method to parse CSV into a list of dictionaries:
data = []
with open('/home/mitul/Desktop/OPENEBS/test.csv', 'r') as infile:
    header = infile.readline().split(",")
    for line in infile:
        fields = line.split(",")
        entry = {}
        for i,value in enumerate(fields):
            entry[header[i].strip()] = value.strip()
        data.append(entry)

Problems with Python Dictionarys and nested Lists

I am trying to create a dictionary that has a nested list inside of it.
The goal would be to have it be:
key : [x,y,z]
I am pulling the information from a csv file and counting the number of times a certain key shows up in each column. However, I am getting the error below:
> d[key][i] = 1
KeyError: 'owner'
Where owner is the title of my column.
if __name__ == '__main__':
    d = {}
    with open ('sample.csv','r') as f:
        reader = csv.reader(f)
        for i in range(0,3):
            for row in reader:
                key = row[0]
                if key in d:
                    d[key][i] +=1
                else:
                    d[key][i] = 1
    for key,value in d.iteritems():
        print key,value
What do I tweak in this loop to have it create a key if it doesn't exist and then add to it if it does?

The problem is that you index into a list (d[key][i]) before a list exists for that key.
So you have to replace
d[key][i] = 1
with
d[key] = [0,0,0]
d[key][i] = 1
This first creates the list with three entries (so you can use [0], [1] and [2] afterward without error) and then assigns 1 to the correct entry in the list.

You can use defaultdict:
from collections import defaultdict
ncols = 3
d = defaultdict(lambda: [0 for i in range(ncols)])
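To show how such a defaultdict would be used for the per-column counts, here is a small sketch. The rows are an in-memory stand-in for the csv.reader output (an assumption for illustration); each missing key gets a fresh list of zeros the first time it is touched.

```python
from collections import defaultdict

ncols = 3
# In-memory rows standing in for csv.reader output (assumption for illustration).
rows = [
    ['a', 'b', 'c'],
    ['a', 'x', 'c'],
    ['b', 'b', 'a'],
]

# Each new key starts with one zero counter per column.
counts = defaultdict(lambda: [0] * ncols)
for row in rows:
    for i, key in enumerate(row[:ncols]):
        counts[key][i] += 1

for key, value in counts.items():
    print(key, value)
```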

Use a try/except block to create the list for a new key, then increment as needed. The rows are read once and every column is handled per row, because a csv.reader can only be iterated a single time:
import csv
if __name__ == '__main__':
    d = {}
    with open('sample.csv', 'r') as f:
        reader = csv.reader(f)
        for row in reader:
            for i in range(0, 3):
                key = row[i]
                try:
                    d[key][i] += 1
                except KeyError:
                    d[key] = [0, 0, 0]
                    d[key][i] = 1
    for key, value in d.items():
        print(key, value)

Using defaultdict and Counter you can come up with a dict that allows you to easily measure how many times a key appeared in a position (in this case 1st, 2nd or 3rd, by the slice):
rows = [
    ['a','b','c','d'],
    ['e','f','g', 4 ],
    ['a','b','c','d']
]
from collections import Counter, defaultdict
d = defaultdict(Counter)
for row in rows:
    for idx, value in enumerate(row[0:3]):
        d[value][idx] += 1
Example usage:
print(d)
print(d['a'][0])  # number of times 'a' has been found in the 1st position
print(d['b'][2])  # number of times 'b' found in the 3rd position
print(d['f'][1])  # number of times 'f' found in 2nd position
print([d['a'][n] for n in range(3)])  # to match the format requested in your post
Output:
defaultdict(<class 'collections.Counter'>, {'a': Counter({0: 2}), 'b': Counter({1: 2}), 'c': Counter({2: 2}), 'e': Counter({0: 1}), 'f': Counter({1: 1}), 'g': Counter({2: 1})})
2
0
1
[2, 0, 0]
Or put into a function:
def occurrences(key):
    return [d[key][n] for n in range(3)]
print(occurrences('a'))  # [2, 0, 0]

Update dictionary while parsing CSV file

I have csv file like this:
item,#RGB
item1,#ffcc00
item1,#ffcc00
item1,#ff00cc
item2,#00ffcc
item2,#ffcc00
item2,#ffcc00
item2,#ffcc00
....
and I want to make a dictionary d with the item name as key and, as value, a list of (RGB value, count) tuples, like:
d[item] = [ (#RGB, count) ]
so for "item1" as in example, I would like to get:
d['item1'] = [ ('#ffcc00', 2), ('#ff00cc', 1) ]
I imagine some Pythonic iterator can do this in one line, but I can't understand how at this moment. So far I've made this:
d={}
with open('data.csv', 'rb') as f:
    reader = csv.reader(f)
    try:
        for row in reader:
            try:
                if d[(row[0], row[1])]:
                    i +=1
            except KeyError:
                i = 1
            d[(row[0], row[1])] = i
    except csv.Error, e:
        sys.exit('file %s, line %d: %s' % (filename, reader.line_num, e))
which gives me:
d[(item, #RGB)] = count
Any better way? Or am I doing this wrongly from start?

How about:
a = {}
for row in reader:
    a.setdefault(row[0], {}).setdefault(row[1], 0)
    a[row[0]][row[1]] += 1
This creates a dictionary like
{'item2': {'#00ffcc': 1, '#ffcc00': 3},
 'item1': {'#ffcc00': 2, '#ff00cc': 1}}
I find it more convenient than your structure, but you can convert it to lists of tuples if needed:
b = dict((k, list(v.items())) for k, v in a.items())

import csv
from collections import defaultdict, Counter
from itertools import islice
with open('infile.txt', newline='') as f:
    d = defaultdict(Counter)
    # islice(..., 1, None) skips the header row
    for k, v in islice(csv.reader(f), 1, None):
        d[k].update((v,))
print(d)
prints
defaultdict(<class 'collections.Counter'>, {'item1': Counter({'#ffcc00': 2, '#ff00cc': 1}), 'item2': Counter({'#ffcc00': 3, '#00ffcc': 1})})
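To get from a defaultdict of Counters to the list-of-tuples shape the question asked for, Counter.most_common() can be used. The following is a sketch under the assumption that the data looks like the sample in the question; the file is replaced by an inline io.StringIO buffer for illustration.

```python
import csv
import io
from collections import Counter, defaultdict

# Inline copy of the sample rows from the question (assumption for illustration).
sample = io.StringIO(
    "item,#RGB\n"
    "item1,#ffcc00\n"
    "item1,#ffcc00\n"
    "item1,#ff00cc\n"
    "item2,#00ffcc\n"
    "item2,#ffcc00\n"
    "item2,#ffcc00\n"
    "item2,#ffcc00\n"
)

reader = csv.reader(sample)
next(reader)  # skip the header row

counts = defaultdict(Counter)
for item, rgb in reader:
    counts[item][rgb] += 1

# most_common() returns (value, count) tuples sorted by descending count.
d = {item: counter.most_common() for item, counter in counts.items()}
print(d['item1'])
```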
