create dictionary from text file with headers - python

I have a text file test.txt
col1|col2|col3
1|a|123
2|b|456
I want to create a Python dictionary with the column names as keys and the corresponding column values as values.

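d = {}
with open("test.txt") as f:
    for line in f:
        # Assumes exactly two whitespace-separated fields per line, the first
        # one an integer; the pipe-delimited file above would instead need
        # line.split("|") and special handling of the header line.
        (key, val) = line.split()
        d[int(key)] = val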
Alternatively, if you are running Python 2.7 or later, you can use a dict comprehension.
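As a rough illustration of the dict-comprehension route for the pipe-delimited test.txt shown in the question, the sketch below keys each data row by its first field. The helper name rows_by_first_field and the in-memory sample are assumptions made for the example, not part of the original answer, and the code targets Python 3.

```python
import io

# In-memory stand-in for test.txt (assumed to match the sample in the question)
SAMPLE = "col1|col2|col3\n1|a|123\n2|b|456\n"


def rows_by_first_field(lines, delimiter="|"):
    """Hypothetical helper: map the first field of each data row to the rest."""
    rows = [line.rstrip("\n").split(delimiter) for line in lines if line.strip()]
    # rows[0] is the header line, so the comprehension starts at rows[1:]
    return {row[0]: row[1:] for row in rows[1:]}


result = rows_by_first_field(io.StringIO(SAMPLE))
print(result)
```

With a real file, passing the open file object instead of io.StringIO(SAMPLE) should behave the same way.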

dict = {}
read = open("test.txt")
for l in read:
    l = l.split("|")
    dict.update({l[0]:l[1:]})
print(dict)
it works

The code below worked for me:
read = open("test.txt")
a=[]
for z in read:
    a.append( z.split("|"))
dict = {}
for i in range(len(a)):
    if i == 0:
        for j in a[i]:
            dict[j] = []
    else :
        for j in range(len(a[i])):
            dict[a[0][j]].append( a[i][j])
print(dict)
output: {'col1': ['1', '2'], 'col2': ['a', 'b'], 'col3\n': ['123\n', '456']}
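The output above still carries the trailing newline in the last key and value ('col3\n', '123\n'). One possible way around that is to let the csv module do the splitting, since it strips line endings itself. This is only a sketch for Python 3; the function name columns_from_file and the in-memory sample are assumptions, not part of the answer above.

```python
import csv
import io

# In-memory stand-in for test.txt (assumed to match the sample in the question)
SAMPLE = "col1|col2|col3\n1|a|123\n2|b|456\n"


def columns_from_file(fileobj, delimiter="|"):
    """Hypothetical helper: return {header name: [values in that column]}."""
    reader = csv.reader(fileobj, delimiter=delimiter)
    header = next(reader)                      # first row holds the column names
    columns = {name: [] for name in header}
    for row in reader:
        for name, value in zip(header, row):   # pair each value with its column
            columns[name].append(value)
    return columns


columns = columns_from_file(io.StringIO(SAMPLE))
print(columns)
```

For a file on disk, the csv documentation recommends opening it with newline="", e.g. open("test.txt", newline="").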

Related

How to make a dictionary from a txt file?

Assume the following text file (dict.txt) contains:
1 2 3
aaa bbb ccc
The dictionary should be {1: aaa, 2: bbb, 3: ccc}.
I did:
d = {}
with open("dict.txt") as f:
    for line in f:
        (key, val) = line.split()
        d[int(key)] = val
print (d)
But it didn't work. I think it is because of the structure of the text file.
The data you want as keys is on the first line, and the data you want as values is on the second line.
So, do something like this:
with open(r"dict.txt") as f: data = f.readlines() # Read 'list' of all lines
keys = list(map(int, data[0].split())) # Data from first line
values = data[1].split() # Data from second line
d = dict(zip(keys, values)) # Zip them and make dictionary
print(d) # {1: 'aaa', 2: 'bbb', 3: 'ccc'}
Updated answer based on OP edit:
#Initialize dict
d = {}
#Read in file by newline splits & ignore blank lines
fobj = open("dict.txt","r")
lines = fobj.read().split("\n")
lines = [l for l in lines if not l.strip() == ""]
fobj.close()
#Get first line (keys)
key_list = lines[0].split()
#Convert keys to integers
key_list = list(map(int,key_list))
#Get second line (values)
val_list = lines[1].split()
#Store in dict going through zipped lists
for k,v in zip(key_list,val_list):
    d[k] = v
First create separate lists for the keys and the values, using a condition like:
if (idx % 2) == 0:
    keys = line.split()
    values = lines[idx + 1].split()
Then combine both lists:
d = {}
# Get all lines in list
with open("dict.txt") as f:
    lines = f.readlines()
for idx, line in enumerate(lines):
    if (idx % 2) == 0:
        # Get the key list
        keys = line.split()
        # Get the value list
        values = lines[idx + 1].split()
        # Combine both the lists in dictionary
        d.update({ keys[i] : values[i] for i in range(len(keys))})
print (d)
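The alternating key-line / value-line idea can also be sketched with slicing and zip, which avoids the lines[idx + 1] lookup (that lookup fails when the file has an odd number of lines). The helper name dict_from_line_pairs and the sample data are assumptions for illustration; keys stay strings here, so wrap them in int() if integer keys are wanted.

```python
def dict_from_line_pairs(lines):
    """Hypothetical helper: even-numbered lines hold keys, odd-numbered lines hold values."""
    lines = [line for line in lines if line.strip()]   # ignore blank lines
    result = {}
    # lines[0::2] are the key lines, lines[1::2] the value lines that follow them
    for key_line, value_line in zip(lines[0::2], lines[1::2]):
        result.update(zip(key_line.split(), value_line.split()))
    return result


# Made-up sample: two key/value line pairs separated by a blank line
sample = ["1 2 3\n", "aaa bbb ccc\n", "\n", "4 5\n", "ddd eee\n"]
d = dict_from_line_pairs(sample)
print(d)
```

A trailing key line without a matching value line is silently dropped by zip rather than raising an error.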

How to create a list of dictionaries from a csv file without list comprehension

The output must be like this:
[{'id': '1', 'first_name': 'Heidie','gender': 'Female'}, {'id': '2', 'first_name': 'Adaline', 'gender': 'Female'}, {...}
This code snippet works and meets the requirement:
import csv
with open('./test.csv', 'r') as file_read:
    reader = csv.DictReader(file_read, skipinitialspace=True)
    listDict = [{k: v for k, v in row.items()} for row in reader]
print(listDict)
However, I can't understand some points about the code above:
List comprehension: listDict = [{k: v for k, v in row.items()} for row in reader]
How does Python interpret this?
How does Python always assemble the list with the header names (id, first_name, gender) and their values?
How would this code be implemented with nested for loops?
I read these answers, but I still do not understand:
python list comprehension double for
convert csv file to list of dictionaries
My csv file:
id,first_name,last_name,email,gender
1,Heidie,Philimore,hphilimore0@msu.edu,Female
2,Adaline,Wapplington,awapplington1@icq.com,Female
3,Erin,Copland,ecopland2@google.co.uk,Female
4,Way,Buckthought,wbuckthought3@usa.gov,Male
5,Adan,McComiskey,amccomiskey4@theatlantic.com,Male
6,Kilian,Creane,kcreane5@hud.gov,Male
7,Mandy,McManamon,mmcmanamon6@omniture.com,Female
8,Cherish,Futcher,cfutcher7@accuweather.com,Female
9,Dave,Tosney,dtosney8@businesswire.com,Male
10,Torr,Kiebes,tkiebes9@dyndns.org,Male
Your list comprehension:
listDict = [{k: v for k, v in row.items()} for row in reader]
is equivalent to:
item_list = []
#go through every row
for row in reader:
    item_dict = {}
    #in every row go through each item
    for k,v in row.items():
        #add each item's k,v to the dict
        item_dict[k] = v
    #append every item_dict to item_list
    item_list.append(item_dict)
print(item_list)
EDIT (some more explanation):
#lets create a list
list_ = [x ** 2 for x in range(0,10)]
print(list_)
this returns:
[0,1,4,9,16,25,36,49,64,81]
You can write this as:
list_ = []
for x in range(0,10):
    list_.append(x ** 2)
So in that example, yes, you read it 'backwards'.
Now assume the next:
#lets create a list
list_ = [x ** 2 for x in range(0,10) if x % 2 == 0]
print(list_)
this returns:
[0,4,16,36,64]
You can write this as:
list_ = []
for x in range(0,10):
    if x % 2 == 0:
        list_.append(x ** 2)
So that's not 100% backwards, but it should be clear what's happening. Hope this helps you!
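On the question of where the header names come from: csv.DictReader takes its field names from the first row it reads (unless fieldnames is passed explicitly) and returns every later row as a mapping keyed by those names, so each row is already dictionary-like before the comprehension touches it. A small Python 3 sketch on an in-memory, shortened stand-in for the CSV (the three-column sample is an assumption made to keep the example short):

```python
import csv
import io

# Shortened stand-in for test.csv: only three of the five columns, two data rows
SAMPLE = "id,first_name,gender\n1,Heidie,Female\n2,Adaline,Female\n"

reader = csv.DictReader(io.StringIO(SAMPLE), skipinitialspace=True)
fieldnames = reader.fieldnames   # read from the first row of the input
print(fieldnames)

rows = []
for row in reader:               # every remaining row is keyed by the header names
    rows.append(dict(row))       # dict() gives a plain dict on any Python 3 version
print(rows)
```

Because each row is already a mapping, the inner {k: v for k, v in row.items()} only copies it; list(reader) would produce an equivalent list.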

Replace values in Python dict

I have two files. The first has only two columns:
A 2
B 5
C 6
And the second has the letters as a first column.
A cat
B dog
C house
I want to replace the letters in the second file with the numbers that correspond to them in the first file so I would get.
2 cat
5 dog
6 house
I created a dict from the first and read the second. I tried a few things but none worked. I can't seem to replace the values.
import csv
with open('filea.txt','rU') as f:
    reader = csv.reader(f, delimiter="\t")
    for i in reader:
        print i[0] #reads only first column
        a_data = (i[0])
dictList = []
with open('file2.txt', 'r') as d:
    for line in d:
        elements = line.rstrip().split("\t")[0:]
        dictList.append(dict(zip(elements[::1], elements[0::1])))
for key, value in dictList.items():
    if value == "A":
        dictList[key] = "cat"
The issue appears to be in your last lines:
for key, value in dictList.items():
    if value == "A":
        dictList[key] = "cat"
This should be:
# list() takes a snapshot, so the dict can be changed inside the loop
for key, value in list(dictList.items()):
    if key in a_data:
        dictList[a_data[key]] = dictList[key]
        del dictList[key]
d1 = {'A': 2, 'B': 5, 'C': 6}
d2 = {'A': 'cat', 'B': 'dog', 'C': 'house', 'D': 'car'}
for key, value in list(d2.items()):
    if key in d1:
        d2[d1[key]] = d2[key]
        del d2[key]
>>> d2
{2: 'cat', 5: 'dog', 6: 'house', 'D': 'car'}
Notice that this method allows for items in the second dictionary which don't have a key from the first dictionary.
Wrapped up in a conditional dictionary comprehension format:
>>> {d1[k] if k in d1 else k: d2[k] for k in d2}
{2: 'cat', 5: 'dog', 6: 'house', 'D': 'car'}
I believe this code will get you your desired result:
import csv
with open('filea.txt', 'r') as f:
    reader = csv.reader(f, delimiter="\t")
    d1 = {}
    for line in reader:
        if line[1] != "":
            d1[line[0]] = int(line[1])
with open('fileb.txt', 'r') as f:
    reader = csv.reader(f, delimiter="\t")
    next(reader) # Skip header row.
    d2 = {}
    for line in reader:
        d2[line[0]] = [float(i) for i in line[1:]]
d3 = {d1[k] if k in d1 else k: d2[k] for k in d2}
You could use dictionary comprehension:
d1 = {'A':2,'B':5,'C':6}
d2 = {'A':'cat','B':'dog','C':'house'}
In [23]: {d1[k]:d2[k] for k in d1.keys()}
Out[23]: {2: 'cat', 5: 'dog', 6: 'house'}
If the two dictionaries are called a and b, you can construct a new dictionary this way:
composed_dict = {a[k]:b[k] for k in a}
This will take all the keys in a, and read the corresponding values from a and b to construct a new dictionary.
Regarding your code:
The variable a_data has no purpose. You read the first file, print the first column, and do nothing else with the data in it.
zip(elements[::1], elements[0::1]) just pairs each element with itself, e.g. [1,2,3] -> [(1,1),(2,2),(3,3)]; I think that's not what you want.
In the end you have a list of dictionaries, and in the last lines you just put strings into that list. I think that is not intentional.
import re
d1 = dict()
with open('filea.txt', 'r') as fl:
    for f in fl:
        key, val = re.findall(r'\w+', f)
        d1[key] = val
d2 = dict()
with open('file2.txt', 'r') as fl:
    for f in fl:
        key, val = re.findall(r'\w+', f)
        d2[key] = val
# Text mode ('w'), because the lines written below are str, not bytes
with open('file3.txt', 'w') as f:
    for k, v in d1.items():
        f.write("{a}\t{b}\n".format(a=v, b=d2[k]))
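Putting the pieces from these answers together, the replacement can also be done line by line with dict.get, which leaves any first-column value without a match unchanged. This is a Python 3 sketch only; the helper names load_pairs and replace_first_column, the in-memory samples, and the extra "D car" row are assumptions for illustration.

```python
import io

# In-memory stand-ins for the two tab-separated files (assumed layout)
FILE_A = "A\t2\nB\t5\nC\t6\n"
FILE_B = "A\tcat\nB\tdog\nC\thouse\nD\tcar\n"


def load_pairs(fileobj):
    """Hypothetical helper: read 'key value' lines into a dict."""
    pairs = {}
    for line in fileobj:
        fields = line.split()
        if len(fields) >= 2:          # skip blank or malformed lines
            pairs[fields[0]] = fields[1]
    return pairs


def replace_first_column(mapping, fileobj):
    """Hypothetical helper: swap the first column for its mapped value, if any."""
    out = []
    for line in fileobj:
        fields = line.rstrip("\n").split(None, 1)
        if not fields:
            continue
        fields[0] = mapping.get(fields[0], fields[0])   # keep unknown keys as-is
        out.append("\t".join(fields))
    return out


mapping = load_pairs(io.StringIO(FILE_A))
replaced = replace_first_column(mapping, io.StringIO(FILE_B))
print("\n".join(replaced))
```

The values stay strings throughout, which is enough when the result is only written back to a text file.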

Python - make a dictionary from a csv file with multiple categories

I am trying to make a dictionary from a csv file in python, but I have multiple categories. I want the keys to be the ID numbers, and the values to be the name of the items. Here is the text file:
"ID#","name","quantity","price"
"1","hello kitty","4","9999"
"2","rilakkuma","3","999"
"3","keroppi","5","1000"
"4","korilakkuma","6","699"
and this is what I have so far:
txt = open("hk.txt","rU")
file_data = txt.read()
lst = [] #first make a list, and then convert it into a dictionary.
for key in file_data:
    k = key.split(",")
    lst.append((k[0],k[1]))
dic = dict(lst)
print(dic)
This just prints an empty list though. I want the keys to be the ID#, and then the values will be the names of the products. I will make another dictionary with the names as the keys and the ID#'s as the values, but I think it will be the same thing but the other way around.
Use the csv module to handle your data; it'll remove the quoting and handle the splitting:
import csv
results = {}
with open('hk.txt', 'r', newline='') as txt:
    reader = csv.reader(txt)
    next(reader, None) # skip the header line
    for row in reader:
        results[row[0]] = row[1]
For your sample input, this produces:
{'4': 'korilakkuma', '1': 'hello kitty', '3': 'keroppi', '2': 'rilakkuma'}
You can use csv DictReader:
import csv
result={}
with open('/tmp/test.csv', 'r', newline='') as f:
    for d in csv.DictReader(f):
        result[d['ID#']]=d['name']
print(result)
# {'1': 'hello kitty', '3': 'keroppi', '2': 'rilakkuma', '4': 'korilakkuma'}
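You can use a dictionary directly, reading from the file object instead of the string returned by read():
dictionary = {}
with open("hk.txt") as file_data:
    file_data.readline() # skip the first line
    for key in file_data:
        key = key.replace('"', '').strip()
        k = key.split(",")
        dictionary[k[0]] = k[1]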
Try this, or use a library to read the file:
txt = open("hk.txt","r")
file_data = txt.read()
file_lines = file_data.split("\n")
lst = [] #first make a list, and then convert it into a dictionary.
for linenumber in range(1,len(file_lines)):
    k = file_lines[linenumber].split(",")
    if len(k) < 2:
        continue # skip blank lines, such as the one after a trailing newline
    lst.append((k[0][1:len(k[0])-1],k[1][1:len(k[1])-1]))
dic = dict(lst)
print(dic)
You can also build the dict directly instead of going through a list.
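The question also mentions wanting a second dictionary with the names as keys and the ID numbers as values. Once the first dictionary exists, one way to get it is to invert it with a dict comprehension. A Python 3 sketch on an in-memory copy of the first rows of the sample (the variable names are assumptions):

```python
import csv
import io

# In-memory stand-in for hk.txt, using the first two data rows from the question
SAMPLE = (
    '"ID#","name","quantity","price"\n'
    '"1","hello kitty","4","9999"\n'
    '"2","rilakkuma","3","999"\n'
)

# DictReader removes the quoting and keys each row by the header names
id_to_name = {row["ID#"]: row["name"] for row in csv.DictReader(io.StringIO(SAMPLE))}

# Swap keys and values to get the reverse lookup
name_to_id = {name: id_ for id_, name in id_to_name.items()}

print(id_to_name)
print(name_to_id)
```

If two items share a name, the inverted dictionary keeps only the last ID seen for that name.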

make dictionary from csv file columns

I am new to the concept of dictionaries in Python.
I have a csv file with multiple columns, and I want to create a dictionary whose keys come from the first column and whose values come from the second, with one key:value pair for every row of those two columns.
The code is as follows:
if __name__=="__main__":
    reader = csv.reader(open("file.csv", "rb"))
    for rows in reader:
        k = rows[0]
        v = rows[1]
        mydict = {k:v}
    print (mydict)
Problem: the output contains only the last (bottom-most) row of the first two columns, i.e. {'12654':'18790'}. I want the dictionary to contain all 100 rows of the first two columns in this format. How can I do that? Can I run a loop over the row numbers for the first two columns? I don't know how.
if __name__=="__main__":
    mydict = {}
    reader = csv.reader(open("file.csv", "rb"))
    for rows in reader:
        k = rows[0]
        v = rows[1]
        mydict[k] = v
    print mydict
Here:
mydict = {k:v}
You were creating a new dictionary on every iteration, so the previous data was lost.
Update:
You can make something like this:
mydict = {}
L = [(1, 2), (2, 4), (1, 3), (3, 2), (3, 4)]
for el in L:
    k, v = el
    if not k in mydict:
        mydict[k] = [v]
    else:
        mydict[k].append(v)
print mydict
>>>
{1: [2, 3], 2: [4], 3: [2, 4]}
This way, each value of the same key will be stored
Your code will be:
if __name__=="__main__":
    mydict = {}
    reader = csv.reader(open("file.csv", "rb"))
    for i, rows in enumerate(reader):
        if i == 0: continue
        k = rows[0]
        v = rows[1]
        if not k in mydict:
            mydict[k] = [v]
        else:
            mydict[k].append(v)
    print mydict
Update2: You mean?
for k, v in mydict.items():
    print "%s: %s" % (k, v)
>>>
1: [2, 3]
2: [4]
3: [2, 4]
Update3:
This should work:
if __name__=="__main__":
    mydict = {}
    reader = csv.reader(open("file.csv", "rb"))
    for i, rows in enumerate(reader):
        if i == 0: continue
        k = rows[0]
        v = rows[1]
        if not k in mydict:
            mydict[k] = [v]
        else:
            mydict[k].append(v)
    print mydict
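The "if not k in mydict ... else append" pattern for collecting several values per key can also be written with collections.defaultdict. The sketch below is for Python 3 (unlike the Python 2 snippets in this thread) and runs on made-up in-memory data that mirrors the list L above; the header names key and value are assumptions.

```python
import csv
import io
from collections import defaultdict

# Made-up two-column data with a header row and repeated keys
SAMPLE = "key,value\n1,2\n2,4\n1,3\n3,2\n3,4\n"

grouped = defaultdict(list)        # missing keys start out as an empty list
reader = csv.reader(io.StringIO(SAMPLE))
next(reader, None)                 # skip the header row
for row in reader:
    if len(row) < 2:               # ignore blank or short rows
        continue
    grouped[row[0]].append(row[1])

print(dict(grouped))
```

On Python 3 a real file would be opened in text mode with newline="" rather than "rb".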
You are creating a new dict and overwriting the old one on each iteration. @develerx's answer fixes this problem. I just wanted to point out an easier way, using dict comprehensions:
Assuming the csv file contains two columns.
if __name__=="__main__":
    reader = csv.reader(open("file.csv", "rb"))
    my_dict = {k: v for k, v in reader}
    print my_dict
If you are using an older version (older than 2.7, I think), you can't use dict comprehensions; use the dict function instead:
my_dict = dict((k, v) for k, v in reader)
Edit: It just occurred to me that my_dict = dict(reader) could also work.
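Both dict(reader) and the "for k, v in reader" unpacking rely on every row having exactly two fields, as the answer assumes. Since the question mentions a file with more columns, the Python 3 sketch below shows what happens in each case. Only the pair 12654/18790 comes from the question; the other values are made up for the example.

```python
import csv
import io

TWO_COLUMNS = "12654,18790\n12655,18791\n"   # second row is made up
THREE_COLUMNS = "12654,18790,x\n"            # extra column is made up

# Works: every row is a two-item list, which dict() accepts as a key/value pair
my_dict = dict(csv.reader(io.StringIO(TWO_COLUMNS)))

# Fails: dict() rejects rows that do not have exactly two items
error = None
try:
    dict(csv.reader(io.StringIO(THREE_COLUMNS)))
except ValueError as exc:
    error = str(exc)

# Indexing the first two fields works regardless of how many columns follow
first_two = {row[0]: row[1] for row in csv.reader(io.StringIO(THREE_COLUMNS))}

print(my_dict)
print(error)
print(first_two)
```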
