Creating a dictionary of lists from a file - python

I have a list in the following format in a txt file :
Shoes, Nike, Addias, Puma,...other brand names
Pants, Dockers, Levis,...other brand names
Watches, Timex, Tiesto,...other brand names
How can I put these into a dictionary in this format:
dictionary={Shoes: [Nike, Addias, Puma,.....]
Pants: [Dockers, Levis.....]
Watches:[Timex, Tiesto,.....]
}
How can I do this in a for loop rather than by manual input?
I have tried:
clothes=open('clothes.txt').readlines()
clothing=[]
stuff=[]
for line in clothes:
    items=line.replace("\n","").split(',')
    clothing.append(items[0])
    stuff.append(items[1:])
Clothing:{}
for d in clothing:
    Clothing[d]= [f for f in stuff]

Here's a more concise way to do things, though you'll probably want to split it up a bit for readability
wordlines = [line.split(', ') for line in open('clothes.txt').read().split('\n')]
d = {w[0]:w[1:] for w in wordlines}

How about:
file = open('clothes.txt')
clothing = {}
for line in file:
    items = [item.strip() for item in line.split(",")]
    clothing[items[0]] = items[1:]

Try this. It is simple but effective, and stripping each line removes the need to replace line breaks by hand:
clothes = {}
with open('clothes.txt', 'r') as clothesfile:
    for line in clothesfile:
        fields = line.strip().split(',')
        key = fields[0]
        value = [brand.strip() for brand in fields[1:]]
        clothes[key] = value
The 'with' statement makes sure the file is closed once the block that builds the dictionary has finished. From there you can use the dictionary to your heart's content!
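Putting the pieces from these answers together, a minimal sketch might look like the following. The function name `read_brands` is made up for illustration, and the sketch assumes every non-empty line has the form `Category, brand, brand, ...`.

```python
def read_brands(path):
    """Map the first field of each line to a list of the remaining fields."""
    brands = {}
    # The with block closes the file even if an error occurs while parsing.
    with open(path) as f:
        for line in f:
            # Split on commas, then strip spaces and the trailing newline.
            items = [item.strip() for item in line.split(",")]
            if not items[0]:
                continue  # skip blank lines
            brands[items[0]] = items[1:]
    return brands
```

Calling `read_brands('clothes.txt')` on the file from the question should then give a dictionary keyed by `Shoes`, `Pants` and `Watches`.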

Using a list comprehension you could do:
clothes=[line.strip() for line in open('clothes.txt').readlines()]
clothingDict = {}
for line in clothes:
    arr = line.split(",")
    clothingDict[arr[0]] = [arr[i] for i in range(1,len(arr))]

Related

Adding words between lines to an array

This is the content of my file:
david C001 C002 C004 C005 C006 C007
* C008 C009 C010 C011 C016 C017 C018
* C019 C020 C021 C022 C023 C024 C025
anna C500 C521 C523 C547 C555 C556
* C557 C559 C562 C563 C566 C567 C568
* C569 C571 C572 C573 C574 C575 C576
* C578
charlie C701 C702 C704 C706 C707 C708
* C709 C712 C715 C716 C717 C718
I want my output to be:
david=[C001,C002,C004,C005,C006,C007,C008,C009,C010,C011,C016,C017,C018,C019,C020,C021,C022,C023,C024,C025]
anna=[C500,C521,C523,C547,C555,C556,C557,C559,C562,C563,C566,C567,C568,C569,C571,C572,C573,C574,C575,C576,C578]
charlie=[C701,C702,C704,C706,C707,C708,C709,C712,C715,C716,C717,C718]
I am able to create:
david=[C001,C002,C004,C005,C006,C007]
anna=[C500,C521,C523,C547,C555,C556]
charlie=[C701,C702,C704,C706,C707,C708]
counting the number of words in a line and using line[0] as the array name and adding the remaining words to the array.
However, I don't know how to take the continuation of words in the next lines starting with "*" to the array.
Can anyone help?
NOTE: This solution relies on dictionaries (and therefore defaultdict) preserving insertion order, which CPython does from version 3.6 and which the language guarantees from Python 3.7.
Somewhat naive approach:
from collections import defaultdict
# Create a dictionary of people
people = defaultdict(list)
# Open up your file in read-only mode
with open('your_file.txt', 'r') as f:
    # Iterate over all lines, stripping them and splitting them into words
    for line in filter(bool, map(str.split, map(str.strip, f))):
        # Retrieve the name of the person
        # either from the current line or use the name of the last person processed
        name, words = list(people)[-1] if line[0] == '*' else line[0], line[1:]
        # Add all remaining words to that person's record
        people[name].extend(words)
print(people['anna'])
# ['C500', 'C521', 'C523', 'C547', 'C555', 'C556', 'C557', 'C559', 'C562', 'C563', 'C566', 'C567', 'C568', 'C569', 'C571', 'C572', 'C573', 'C574', 'C575', 'C576', 'C578']
It also has the additional benefit of returning an empty list for unknown names:
print(people['matt'])
# []
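If relying on dictionary ordering feels fragile, one alternative is to remember the most recent name explicitly. This is only a sketch: it assumes continuation lines start with `*` and that the first non-empty line names a person, and `parse_records` is a hypothetical helper name.

```python
from collections import defaultdict

def parse_records(lines):
    """Group codes by person; lines starting with '*' continue the previous person."""
    people = defaultdict(list)
    current = None  # name seen on the most recent non-continuation line
    for line in lines:
        words = line.split()
        if not words:
            continue  # ignore blank lines
        if words[0] != "*":
            current = words[0]
        people[current].extend(words[1:])
    return people

sample = [
    "david C001 C002",
    "* C008 C009",
    "anna C500",
    "* C557",
]
records = parse_records(sample)
print(records["david"])  # ['C001', 'C002', 'C008', 'C009']
```

Keep in mind that looking up a missing name on a defaultdict inserts that name with an empty list; `records.get('matt', [])` avoids that side effect.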
You could read the lists into a dictionary using regular expressions:
import re
with open('file_name') as file:
    contents = file.read()
res_list = re.findall(r"[a-z]+\s+[^a-z]+",contents)
res_dict = {}
for p in res_list:
    elt = p.split()
    res_dict[elt[0]] = [e for e in elt[1:] if e != '*']
print(res_dict)
I figured out a way myself. Thanks to the ones who gave their own solution. It gave me new perspective.
Below is my code:
persons_library={}
persons=['david','anna','charlie']
for i,person in enumerate(persons,start=0):
    persons_library[person]=[]
with open('data.txt','r') as f:
    for line in f:
        line=line.replace('*',"")
        line=line.split()
        for i,val in enumerate(line,start=0):
            if val in persons_library:
                key=val
            else:
                persons_library[key].append(val)
print(persons_library)

Convert a text file into a dictionary

I have a text file in this format:
key:object,
key2:object2,
key3:object3
How can I convert this into a dictionary in Python for the following process?
Open it
Check if string s = any key in the dictionary
If it is, then string s = the object linked to the aforementioned key.
If not, nothing happens
File closes.
I've tried the following code for dividing them with commas, but the output was incorrect. It made the combination of key and object in the text file into a single key and single object, effectively duplicating it:
Code:
file = open("foo.txt","r")
dict = {}
for line in file:
    x = line.split(",")
    a = x[0]
    b = x[0]
    dict[a] = b
Incorrect output:
key:object, key:object
key2:object2, key2:object2
key3:object3, key3:object3
Thank you
m={}
with open("foo.txt") as file:
    for line in file:
        x = line.strip().replace(",","") # remove the newline and the comma if present
        y=x.split(':') #split key and value
        m[y[0]] = y[1]
# -*- coding:utf-8 -*-
key_dict={"key":'',"key5":'',"key10":''}
File=open('/home/wangxinshuo/KeyAndObject','r')
List=File.readlines()
File.close()
for i in range(0,len(List)):
    for j in range(0,len(List[i])):
        if(List[i][j]==':'):
            if(List[i][0:j] in key_dict):
                # take the text after the colon, dropping the trailing comma and newline
                key_dict[List[i][0:j]]=List[i][j+1:].strip().rstrip(',')
print(key_dict)
I am using your file at "/home/wangxinshuo/KeyAndObject".
You can convert the content of your file to a dictionary with a one-liner similar to the one below, where f is the open file object:
result = {k:v for k,v in [line.strip().replace(",","").split(":") for line in f if line.strip()]}
If you want the dictionary values to be stripped as well, just use v.strip() as the value expression.
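For the lookup steps listed in the question, a slightly more defensive sketch could look like this. It assumes lines shaped like `key:object,` as shown in the question; `parse_pairs` is an illustrative name, and the list of strings stands in for the lines of the real file.

```python
def parse_pairs(lines):
    """Build a dict from lines shaped like 'key:object,'."""
    result = {}
    for line in lines:
        # Drop surrounding whitespace and the trailing comma, if any.
        line = line.strip().rstrip(",")
        if not line:
            continue
        # partition splits on the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        if sep:
            result[key.strip()] = value.strip()
    return result

pairs = parse_pairs(["key:object,\n", "key2:object2,\n", "key3:object3\n"])
s = "key2"
# Replace s with the linked object if it is a key; otherwise leave it alone.
s = pairs.get(s, s)
print(s)  # object2
```

With a real file, the same function can be fed the file object directly inside a `with open(...)` block, since iterating over a file yields its lines.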

Python: dictionary to collection

I have a file with 2 columns:
Anzegem Anzegem
Gijzelbrechtegem Anzegem
Ingooigem Anzegem
Aalst Sint-Truiden
Aalter Aalter
The first column is a town and the second column is the district of that town.
I made a dictionary of that file like this:
def readTowns(text):
    input = open(text, 'r')
    file = input.readlines()
    dict = {}
    verzameling = set()
    for line in file:
        tmp = line.split()
        dict[tmp[0]] = tmp[1]
    return dict
If I set a variable 'writeTown' equal to readTowns(text) and evaluate writeTown['Anzegem'], I want to get the collection {'Anzegem', 'Gijzelbrechtegem', 'Ingooigem'}.
Does anybody know how to do this?
I think you can just write another function that builds the data structure you need. Otherwise you will end up writing code that manipulates the dictionary returned by readTowns to generate the data you require, so why not keep the code clean and create a separate function for it? Just build a dictionary that maps each district name to a list of towns and you are all set.
def writeTowns(text):
    input = open(text, 'r')
    file = input.readlines()
    dict = {}
    for line in file:
        tmp = line.split()
        dict[tmp[1]] = dict.get(tmp[1]) or []
        dict.get(tmp[1]).append(tmp[0])
    return dict
writeTown = writeTowns('file.txt')
print(writeTown['Anzegem'])
And if you are concerned about reading the same file twice, you can do something like this as well:
def readTowns(text):
    input = open(text, 'r')
    file = input.readlines()
    dict2town = {}
    town2dict = {}
    for line in file:
        tmp = line.split()
        dict2town[tmp[0]] = tmp[1]
        town2dict[tmp[1]] = town2dict.get(tmp[1]) or []
        town2dict.get(tmp[1]).append(tmp[0])
    return dict2town, town2dict
dict2town, town2dict = readTowns('file.txt')
print(town2dict['Anzegem'])
You could do something like this, although please have a look at @ubadub's answer, as there are better ways to organise your data.
[town for town, region in dic.items() if region == 'Anzegem']
It sounds like you want to make a dictionary where the keys are the districts and the values are a list of towns.
A basic way to do this is:
def readTowns(text):
    with open(text, 'r') as f:
        lines = f.readlines()
    my_dict = {}
    for line in lines:
        tmp = line.split()
        if tmp[1] in my_dict:
            my_dict[tmp[1]].append(tmp[0])
        else:
            my_dict[tmp[1]] = [tmp[0]]
    return my_dict
The if/else block can also be replaced by Python's defaultdict (a dict subclass in the collections module), but I've used if/else here for readability.
Some other points: dict is the name of a built-in type (and file was one in Python 2), so it is bad practice to overwrite these with your own local variables (notice I've changed dict to my_dict in the code above).
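For comparison, the defaultdict variant mentioned above might be sketched as follows. It uses a set per district because the question asks for a collection like {'Anzegem', ...}; `towns_by_district` is a hypothetical name, and the sketch assumes town and district names contain no spaces.

```python
from collections import defaultdict

def towns_by_district(lines):
    """Map each district to the set of towns listed under it."""
    districts = defaultdict(set)
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        town, district = parts
        districts[district].add(town)
    return districts

sample = [
    "Anzegem Anzegem",
    "Gijzelbrechtegem Anzegem",
    "Ingooigem Anzegem",
    "Aalst Sint-Truiden",
    "Aalter Aalter",
]
print(sorted(towns_by_district(sample)["Anzegem"]))
# ['Anzegem', 'Gijzelbrechtegem', 'Ingooigem']
```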
If you build your dictionary as {town: district}, so the town is the key and the district is the value, you can't do this easily*, because a dictionary is not meant to be used in that way. Dictionaries allow you to easily find the values associated with a given key. So if you want to find all the towns in a district, you are better off building your dictionary as:
{district: [list_of_towns]}
So for example the district Anzegem would appear as {'Anzegem': ['Anzegem', 'Gijzelbrechtegem', 'Ingooigem']}
And of course the value is your collection.
*you could probably do it by iterating through the entire dict and checking where your matches occur, but this isn't very efficient.

Python Replacing Words from Definitions in Text File

I've got an old informix database that was written for cobol. All the fields are in code so my SQL queries look like.
SELECT uu00012 FROM uu0001;
This is pretty hard to read.
I have a text file with the field definitions like
uu00012 client
uu00013 date
uu00014 f_name
uu00015 l_name
I would like to swap out the codes for the more readable English names, perhaps by running a Python script over the query and saving a file with the English names.
What's the best way to do this?
If each piece is definitely a separate word, re.sub is the way to go here:
import re
# create a mapping of old vars to new vars.
with open('definitions') as f:
    d = dict(x.split() for x in f)
def my_replace(match):
    # if the match is in the dictionary, replace it, otherwise, return the match unchanged.
    return d.get(match.group(), match.group())
with open('inquiry') as f:
    for line in f:
        # end='' because each line already carries its own newline
        print(re.sub(r'\w+', my_replace, line), end='')
Conceptually,
I would probably first build a mapping of codings -> english (in memory or o.
Then, for each coding in your map, scan your file and replace it with the code's mapped English equivalent.
infile = open('filename.txt','r')
namelist = []
for each in infile.readlines():
    parts = each.split()  # split() also drops the trailing newline
    namelist.append((parts[0], parts[1]))
This will give you a list of (key, value) pairs.
I don't know what you want to do with the results from there, though; you need to be more explicit.
dictionary = '''uu00012 client
uu00013 date
uu00014 f_name
uu00015 l_name'''
dictionary = dict(map(lambda x: (x[1], x[0]), [x.split() for x in dictionary.split('\n')]))
def process_sql(sql, d):
    for k, v in d.items():
        sql = sql.replace(k, v)
    return sql
sql = process_sql('SELECT f_name FROM client;', dictionary)
This builds the dictionary:
{'date': 'uu00013', 'l_name': 'uu00015', 'f_name': 'uu00014', 'client': 'uu00012'}
Then run through your SQL and replace the human-readable names with the codes. The result is:
SELECT uu00014 FROM uu00012;
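One thing to watch with plain `str.replace` is that it also matches inside longer identifiers. Below is a sketch of a whole-word, single-pass variant for the direction the question asks about (codes to English names), using only the standard `re` module. The `uu0001 clients` entry is a made-up table mapping added for illustration; it does not appear in the question's definitions file.

```python
import re

definitions = """uu0001 clients
uu00012 client
uu00013 date"""

# Map each code to its readable name.
names = dict(line.split() for line in definitions.splitlines())

# One pattern matching any known code as a whole word.
pattern = re.compile(r"\b(?:" + "|".join(map(re.escape, names)) + r")\b")

def translate(sql):
    """Replace every known code in sql with its readable name."""
    return pattern.sub(lambda m: names[m.group()], sql)

print(translate("SELECT uu00012 FROM uu0001;"))  # SELECT client FROM clients;
```

The `\b` anchors keep `uu0001` from being replaced inside `uu00012`, which a chain of `str.replace` calls could do depending on iteration order.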
import re
f = open("dictfile.txt")
d = {}
for mapping in f.readlines():
    l, r = mapping.split(" ")
    d[re.compile(l)] = r.strip("\n")
f.close()
sql = open("orig.sql")
# file() only exists in Python 2; open() works in both Python 2 and 3
out = open("translated.sql", "w")
for line in sql.readlines():
    for r in d.keys():
        line = r.sub(d[r], line)
    out.write(line)
sql.close()
out.close()

how to use string as list's indices in Python

for line in f.readlines():
    (addr, vlanid, videoid, reqs, area) = line.split()
    if vlanid not in dict:
        dict[vlanid] = []
    video_dict = dict[vlanid]
    if videoid not in video_dict:
        video_dict[videoid] = []
    video_dict[videoid].append((addr, vlanid, videoid, reqs, area))
Here is my code. I want to use videoid as an index to create a list. The real videoid values are different strings like this: FYFSYJDHSJ
I got this error message:
video_dict[videoid] = []
TypeError: list indices must be integers, not str
But how can I assign identifiers like 1, 2, 3, 4 to the different strings in this case?
Use a dictionary instead of a list:
if vlanid not in dict:
    dict[vlanid] = {}
P.S. I recommend that you call dict something else so that it doesn't shadow the built-in dict.
Don't use dict as a variable name. Try this (d instead of dict):
d = {}
for line in f.readlines():
    (addr, vlanid, videoid, reqs, area) = line.split()
    video_dict = d.setdefault(vlanid, {})
    video_dict.setdefault(videoid, []).append((addr, vlanid, videoid, reqs, area))
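The same nesting can also be expressed with `collections.defaultdict`, which creates the inner containers on first access. This is a small sketch with invented sample rows; the addresses, VLAN names and the second video ID are placeholders, not real data.

```python
from collections import defaultdict

lines = [
    "10.0.0.1 vlan1 FYFSYJDHSJ 3 north",
    "10.0.0.2 vlan1 FYFSYJDHSJ 1 south",
    "10.0.0.3 vlan2 ABCDEFGHIJ 2 north",
]

# Outer key: vlanid; inner key: videoid; value: list of full records.
by_vlan = defaultdict(lambda: defaultdict(list))
for line in lines:
    addr, vlanid, videoid, reqs, area = line.split()
    by_vlan[vlanid][videoid].append((addr, vlanid, videoid, reqs, area))

print(len(by_vlan["vlan1"]["FYFSYJDHSJ"]))  # 2
```

Whether this reads better than `setdefault` is a matter of taste; `setdefault` has the advantage that the result stays a plain dict.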
As suggested above, dictionaries are the best fit here (although you should avoid naming one dict, as that name already means something important to Python).
Your code may look something like what @aix has already posted above:
for line in f.readlines():
    d = dict(zip(("addr", "vlanid", "videoid", "reqs", "area"), tuple(line.split())))
You will be able to do something with the dictionary d later in your code. Just remember: d is reassigned on every iteration, so if you don't use d until after the loop is complete, you'll only get the values from the last line of the file.
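To keep every row rather than only the last one, the per-line dictionaries can be collected in a list. A minimal sketch, where `FIELDS`, `read_rows` and the sample rows are illustrative names and data rather than anything from the question:

```python
FIELDS = ("addr", "vlanid", "videoid", "reqs", "area")

def read_rows(lines):
    """Return one dict per line, keyed by field name."""
    return [dict(zip(FIELDS, line.split())) for line in lines]

rows = read_rows([
    "10.0.0.1 vlan1 FYFSYJDHSJ 3 north",
    "10.0.0.2 vlan2 ABCDEFGHIJ 1 south",
])
print(rows[0]["videoid"])  # FYFSYJDHSJ
```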
