I have a text file that looks like this:
A (four spaces) 16
B (four spaces) 25
etc.
I need to make a dictionary that looks like this: dic = { A:16, B:25, etc.}
I have already tried "split", but it didn't seem to work with the spaces.
Here is my code so far:
lines = []
with open("ABC.txt","r") as filein:
    for line in filein:
        lines.append(line.strip('\n'))
print lines
which for the moment gives me a list of strings: "A(four spaces)16", "B(four spaces)25", etc. I just need to separate the key from the value, using the four spaces as the delimiter.
Any advice?
The constructor dict() takes an iterable of key-value pairs. So the dictionary can be built directly like so:
with open("ABC.txt","r") as filein:
    mydict = dict(line.split() for line in filein)
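As a minimal sketch of why split() copes with the four spaces: called with no arguments, it splits on any run of whitespace and ignores the trailing newline. The sample lines below stand in for the contents of ABC.txt, and converting the values to int is an assumption about what the asker wants.

```python
# Sample lines standing in for ABC.txt (assumed contents).
sample_lines = ["A    16\n", "B    25\n"]

# split() with no arguments splits on any run of whitespace,
# so the four spaces and the trailing newline are both handled.
pairs = [line.split() for line in sample_lines]

# Convert the values to int while building the dictionary.
mydict = {key: int(value) for key, value in pairs}
print(mydict)
```

Passing an explicit separator such as split(" ") behaves differently: it splits on every single space and produces empty strings between consecutive spaces, which may be why split appeared not to work.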
I don't see why split shouldn't work... Try the following.
dic = {}
with open("ABC.txt",'r') as f:
    for l in f:
        key,value = l.split() #Set two temporary variables for easy access to data
        dic[key] = int(value) #Set the appropriate key of the dictionary to its respective value
print(dic)
lines is not a dictionary, it is a list. To make it a dictionary you should initialize it like this:
lines = {}
You could also do:
for line in filein:
    line = line.strip('\n')  # drop the newline so the last two characters are the value
    lines[line[0]] = line[-2:]
assuming the length of your keys and values is always the same, and also assuming each line looks something like
A 25
Related
So I have this file that contains 2 words each line. It looks like this.
[/lang:F </lang:foreign>
[lipmack] [lipsmack]
[Fang:foreign] <lang:foreign>
the first word is incorrectly formatted and the second one is correctly formatted. I am trying to put them in a dictionary. Below is my code.
textFile = open("hinditxt1.txt", "r+")
textFile = textFile.readlines()
flist = []
for word in textFile:
    flist.append(word.split())
fdict = dict()
for num in range(len(flist)):
    fdict[flist[num][0]] = flist[num][1]
First I split each line, then I try to put the pieces in a dictionary. But for some reason I get "IndexError: list index out of range" when trying to put them in the dictionary. What can I do to fix it? Thanks!
It is better in Python to iterate over the items of a list rather than over a range of indices. My guess is that the IndexError comes from a line in the input file that is blank or does not contain any spaces.
with open("input.txt", 'r') as f:
    flist = [line.split() for line in f]
fdict = {}
for k, v in flist:
    fdict[k] = v
print(fdict)
The code above avoids needing to access elements of the list using an index by simply iterating over the items of the list itself. We can further simplify this by using a dict comprehension:
with open("input.txt", 'r') as f:
    flist = [line.split() for line in f]
fdict = {k: v for k, v in flist}
print(fdict)
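Note that unpacking with "for k, v in flist" still fails (with a ValueError rather than an IndexError) if a line is blank or has only one word. A hedged sketch of one way to guard against that is below; the sample lines, the "orphan" row and the skipped list are all made up for illustration.

```python
# In-memory lines standing in for the input file; the blank line and
# the one-word line are the kind of rows that break the original loop.
sample_lines = [
    "[/lang:F </lang:foreign>\n",
    "\n",
    "[lipmack] [lipsmack]\n",
    "orphan\n",
]

fdict = {}
skipped = []
for line in sample_lines:
    parts = line.split()
    if len(parts) != 2:
        # Not exactly two words: remember the line instead of crashing.
        skipped.append(line)
        continue
    wrong, right = parts
    fdict[wrong] = right

print(fdict)
print(skipped)
```

Collecting the skipped lines rather than silently dropping them makes it easier to see which rows of the real file are malformed.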
With dictionaries you can also use the .update() method to add new key-value pairs, although plain assignment as in your code works just as well. It would look like:
for num in range(len(flist)):
    fdict.update({flist[num][0] : flist[num][1]})
A full example without file reading would look like:
in_words = ["[/lang:F </lang:foreign>",
            "[lipmack] [lipsmack]",
            "[Fang:foreign] <lang:foreign>"]
flist = []
for word in in_words:
    flist.append(word.split())
fdict = dict()
for num in range(len(flist)):
    fdict.update({flist[num][0]: flist[num][1]})
print(fdict)
Yielding:
{'[lipmack]': '[lipsmack]', '[Fang:foreign]': '<lang:foreign>', '[/lang:F': '</lang:foreign>'}
Although your output may vary, since dictionaries do not maintain order in Python versions before 3.7.
As @Alex points out, the IndexError is likely caused by improperly formatted input (i.e. a line with only 1 or 0 items on it). I suspect the most likely cause is a \n at the end of your file that makes the last line(s) blank.
I'm learning how to code and I've run into a problem I don't have an answer to. I have a text file from which I have to make three dictionaries:
Georgie Porgie
87%
$$$
Canadian, Pub Food
Queen St. Cafe
82%
$
Malaysian, Thai
For the purpose of this thread I just want to ask how to extract the first line of each text block and store it as a key, with the second line of each block as its value. I am supposed to write the code using nothing more than the very basic functions and loops.
Here is my code (once the file is opened):
d = {}
a = 0
for i in file:
    d[i] = i + 1
    a = i + 5
return(d)
Thank you.
First you have to read the file:
with open("data.txt") as file:
    lines = file.readlines()
The with clause ensures it is closed after it is read. Next, according to your description, a line contains a key if the index % 5 is 0. Then, the next line contains the value. With only "basic" elements of the language, you could construct your dictionary like this:
dic = {lines[idx].strip(): lines[idx + 1].strip()
       for idx in range(0, len(lines), 5)}
This is a dictionary comprehension, which can also be written unfolded.
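Written unfolded, the comprehension is an ordinary for loop over every fifth index. The sketch below uses an in-memory list in place of file.readlines() and assumes each block is four data lines followed by one blank separator line; the names name and rating are only illustrative.

```python
# Sample lines standing in for file.readlines(); each block is assumed
# to be four data lines followed by one blank separator line.
lines = [
    "Georgie Porgie\n", "87%\n", "$$$\n", "Canadian, Pub Food\n", "\n",
    "Queen St. Cafe\n", "82%\n", "$\n", "Malaysian, Thai\n",
]

dic = {}
for idx in range(0, len(lines), 5):
    name = lines[idx].strip()        # first line of the block: the key
    rating = lines[idx + 1].strip()  # second line of the block: the value
    dic[name] = rating

print(dic)
```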
Now you can also zip the keys and values first, so you can iterate them quite easily. This makes the dictionary comprehension more readable. The strip method is necessary though, since we want to get rid of the line breaks.
entries = zip(lines[::5], lines[1::5])
dic = {key.strip(): value.strip() for key, value in entries}
I have the following text file in the same folder as my Python Code.
78459581
Black Ballpoint Pen
12345670
Football
49585922
Perfume
83799715
Shampoo
I have written this Python code.
file = open("ProductDatabaseEdit.txt", "r")
d = {}
for line in file:
    x = line.split("\n")
    a=x[0]
    b=x[1]
    d[a]=b
print(d)
This is the result I receive.
b=x[1] # IndexError: list index out of range
My dictionary should appear as follows:
{"78459581" : "Black Ballpoint Pen"
"12345670" : "Football"
"49585922" : "Perfume"
"83799715" : "Shampoo"}
What am I doing wrong?
A line is terminated by a linebreak, thus line.split("\n") will never give you more than one line.
You could cheat and do:
for first_line in file:
    second_line = next(file)
You can simplify your solution by passing a generator expression to dict(); this is probably the most Pythonic solution I can think of:
>>> with open("in.txt") as f:
...     my_dict = dict((line.strip(), next(f).strip()) for line in f)
...
>>> my_dict
{'12345670': 'Football', '49585922': 'Perfume', '78459581': 'Black Ballpoint Pen', '83799715': 'Shampoo'}
Where in.txt contains the data as described in the problem. It is necessary to strip() each line otherwise you would be left with a trailing \n character for your keys and values.
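One caveat, as far as I can tell: if the file has an odd number of lines, next(f) has nothing left to return on the last key and the expression fails. A hedged alternative is to pair an iterator with itself using zip, which simply drops an unpaired final line. The sample list below stands in for the open file, and the names code and name are illustrative.

```python
# Sample lines standing in for the open file; the last code has no
# product name after it, so the number of lines is odd.
sample_lines = [
    "78459581\n", "Black Ballpoint Pen\n",
    "12345670\n", "Football\n",
    "49585922\n",
]

it = iter(sample_lines)
# zip(it, it) pulls two consecutive items from the same iterator per step
# and stops as soon as a second item is missing.
my_dict = {code.strip(): name.strip() for code, name in zip(it, it)}
print(my_dict)
```

With a real file, the file object itself can play the role of the iterator, i.e. zip(f, f).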
You need to strip the \n, not split
file = open("products.txt", "r")
d = {}
for line in file:
    a = line.strip()
    b = next(file).strip()  # next(file) works on Python 2.6+ and 3.x
    d[a]=b
print(d)
{'12345670': 'Football', '49585922': 'Perfume', '78459581': 'Black Ballpoint Pen', '83799715': 'Shampoo'}
What's going on
When you open a file you get an iterator, which will give you one line at a time when you use it in a for loop.
Your code iterates over the file and splits every line into a list with \n as the delimiter. That never gives you the following line: for a line that ends in a newline you get the line's text plus an empty string, and for a last line with no trailing newline you get a list with only one item. Trying to access the second item of that one-item list is what raises IndexError: list index out of range.
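A quick sketch of the difference between split("\n") and strip() on a single line. The two sample strings are assumptions about what the file iterator yields: a normal line ending in a newline, and a final line without one.

```python
line = "78459581\n"      # a typical line as yielded by the file iterator
last_line = "Shampoo"    # a final line with no trailing newline (assumed)

with_newline = line.split("\n")          # the text plus an empty string
without_newline = last_line.split("\n")  # a single item: index 1 does not exist
stripped = line.strip()                  # just the text, newline removed

print(with_newline)
print(without_newline)
print(stripped)
```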
How to fix it
What you need is this:
file = open('products.txt','r')
d = {}
for line in file:
    d[line.strip()] = next(file).strip()
In every loop iteration you add a new key to the dictionary (by assigning a value to a key that didn't exist yet) and assign the next line as the value. The next() function just tells the file iterator "please move on to the next line". So, to drive the point home: in the first iteration you set the first line as a key and assign the second line as the value; in the second iteration, you set the third line as a key and assign the fourth line as the value; and so on.
The reason you need to call .strip() every time is that each line read from the file still ends with a newline character (and your example file also had a space at the end of every line); .strip() removes that surrounding whitespace.
Or...
You can also get the same result using a dictionary comprehension:
file = open('products.txt','r')
d = {line.strip():next(file).strip() for line in file}
Basically, it is a shorter version of the same code above. It's shorter, but less readable: not necessarily something you want (a matter of taste).
In my solution i tried to not use any loops. Therefore, I first load the txt data with pandas:
import pandas as pd
file = pd.read_csv("test.txt", header = None)
Then I separate the keys and values for the dict like this:
keys, values = file[0::2].values, file[1::2].values
Then, we can directly zip these two as lists and create a dict:
result = dict(zip(list(keys.flatten()), list(values.flatten())))
To create this solution I used the information provided in the questions "How to remove every other element of an array in python? (The inverse of np.repeat()?)" and "Map two lists into a dictionary in Python".
You can loop over a list two items at a time:
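file = open("ProductDatabaseEdit.txt", "r")
data = file.readlines()
d = {}
# step by 2; stopping at len(data) - 1 avoids reading past an unpaired last line
for i in range(0, len(data) - 1, 2):
    d[data[i].strip()] = data[i + 1].strip()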
Try this code (where the data is in /tmp/tmp5.txt):
#!/usr/bin/env python3
d = dict()
iskey = True
with open("/tmp/tmp5.txt") as infile:
    for line in infile:
        if iskey:
            _key = line.strip()
        else:
            _value = line.strip()
            d[_key] = _value
        iskey = not iskey
print(d)
Which gives you:
{'12345670': 'Football', '49585922': 'Perfume', '78459581': 'Black Ballpoint Pen', '83799715': 'Shampoo'}
I'm trying to figure out how to open a file and then store its contents in a dictionary, using the Part no. as the key and the other information as the value. So I want it to look something like this:
{Part no.: "Description,Price", 453: "Sperving_Bearing,9900", 1342: "Panametric_Fan,23400",9480: "Converter_Exchange,93859"}
I was able to store the text from the file into a list, but I'm not sure how to assign more than one value to a key. I'm trying to do this without importing any modules. I've been using the basic str methods, list methods and dict methods.
For a txt file like so
453 Sperving_Bearing 9900
1342 Panametric_Fan 23400
9480 Converter_Exchange 93859
You can just do
>>> newDict = {}
>>> with open('testFile.txt', 'r') as f:
...     for line in f:
...         splitLine = line.split()
...         newDict[int(splitLine[0])] = ",".join(splitLine[1:])
...
>>> newDict
{9480: 'Converter_Exchange,93859', 453: 'Sperving_Bearing,9900', 1342: 'Panametric_Fan,23400'}
You can get rid of the ----... line by just checking if line.startswith('-----').
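A rough sketch of that check is below. The exact layout of the header and separator rows is an assumption (the question's file is not shown in full), and the isdigit() test for skipping the header is my own addition, not part of the answer above.

```python
# Sample lines standing in for the file; the header and separator
# rows are assumed to look roughly like this.
sample_lines = [
    "Part no.   Description        Price\n",
    "-----------------------------------\n",
    "453   Sperving_Bearing   9900\n",
    "1342  Panametric_Fan     23400\n",
]

newDict = {}
for line in sample_lines:
    if line.startswith("-----"):
        continue  # skip the separator row
    splitLine = line.split()
    if not splitLine or not splitLine[0].isdigit():
        continue  # skip blank lines and the header row
    newDict[int(splitLine[0])] = ",".join(splitLine[1:])

print(newDict)
```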
EDIT - If you are sure that the first two lines contain the same stuff, then you can just do
>>> testDict = {"Part no.": "Description,Price"}
>>> with open('testFile.txt', 'r') as f:
...     _ = next(f)
...     _ = next(f)
...     for line in f:
...         splitLine = line.split()
...         testDict[int(splitLine[0])] = ",".join(splitLine[1:])
...
>>> testDict
{9480: 'Converter_Exchange,93859', 'Part no.': 'Description,Price', 453: 'Sperving_Bearing,9900', 1342: 'Panametric_Fan,23400'}
This adds the first line to the testDict in the code and skips the first two lines and then continues on as normal.
You can read a file into a list of lines like this:
lines = thetextfile.readlines()
You can split a single line by spaces using:
items = somestring.split()
Here's a basic example of how to store a list in a dictionary:
>>> mylist = [1, 2, 3]
>>> mydict = {}
>>> mydict['hello'] = mylist
>>> mydict['world'] = [4,5,6]
>>> print(mydict)
Containers like a tuple, list and dictionary can be nested into each other as their items.
To iterate over a list, use a for statement like this:
for item in somelist:
    # do something with the item like printing it
    print(item)
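Putting those pieces together, one way to keep more than one value per key is to store the remaining fields of each line as a list. This is only a sketch: the sample lines stand in for thetextfile.readlines(), and the dictionary name parts is made up.

```python
# Sample lines standing in for thetextfile.readlines().
sample_lines = [
    "453 Sperving_Bearing 9900\n",
    "1342 Panametric_Fan 23400\n",
]

parts = {}
for line in sample_lines:
    items = line.split()
    # The first field is the key; the rest are kept together as a list.
    parts[items[0]] = items[1:]

for part_no in parts:
    description, price = parts[part_no]
    print(part_no, description, price)
```

Keeping the values as a list (rather than joining them into one string) means each field can still be read back individually later.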
Here's my stab at it, tested on Python 2.x/3.x:
import re

def str2dict(filename="temp.txt"):
    results = {}
    with open(filename, "r") as cache:
        # read file into a list of lines
        lines = cache.readlines()
        # loop through lines
        for line in lines:
            # skip lines starting with "--".
            if not line.startswith("--"):
                # replace runs of two or more whitespace characters (\s) with a tab (\t),
                # strip the trailing return (\n), split into list using
                # "\t" as the split pattern
                line = re.sub(r"\s\s+", "\t", line).strip().split("\t")
                # use first item in list for the key, join remaining list items
                # with ", " for the value.
                results[line[0]] = ", ".join(line[1:])
    return results

print (str2dict("temp.txt"))
You should store the values as a list or a tuple. Something like this:
textname = input("Enter a file name: ")
thetextfile = open(textname,'r')
print("The file has been successfully opened!")
file_s = thetextfile.read().splitlines()  # one string per line, not per word
thetextfile.close()
wordlist = {}
for c in file_s:
    fields = c.split()
    if fields:  # skip blank lines
        wordlist[fields[0]] = fields[1:]  # remaining fields kept as a list
Your file should look like this:
Part no.;Description,Price
453;Sperving_Bearing,9900
1342;Panametric_Fan,23400
9480;Converter_Exchange,93859
Then you just need to add a bit of code:
import collections
import csv

d = collections.OrderedDict()
with open('your_file.txt','r') as infile:
    reader = csv.reader(infile, delimiter=';')
    for row in reader:
        d[row[0]] = row[1].strip()  # fill the OrderedDict row by row
for x,y in d.items():
    print(x)
    print(y)
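If the description and price should stay separate rather than in one string, the value can be split once more on the comma. The sketch below is a variation on the code above, not the original answer: it uses io.StringIO as a stand-in for the real file and assumes Python 3.

```python
import csv
import io

# In-memory text standing in for your_file.txt (assumed contents).
sample = io.StringIO(
    "Part no.;Description,Price\n"
    "453;Sperving_Bearing,9900\n"
    "1342;Panametric_Fan,23400\n"
)

d = {}
for row in csv.reader(sample, delimiter=";"):
    # row[0] is the part number; row[1] holds "description,price".
    description, price = row[1].strip().split(",")
    d[row[0]] = (description, price)

print(d)
```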
I'm trying to analyze a 2-column (color, number_of_occurrences) .tsv file with a heading line, using a dictionary. I want to skip the heading line in the most generic way possible (I assume this means requiring the 2nd column to be of int type). The following is the best I've come up with, but it seems like there has to be something better:
filelist = []
color_dict = {}
with open('file1.tsv') as F:
    filelist = [line.strip('\n').split('\t') for line in F]
for item in filelist:
    try: #attempt to add values to existing dictionary entry
        x = color_dict[item[0]]
        x += int(item[1])
        color_dict[item[0]] = x
    except: #if color has not been observed yet (KeyError), or if non-convertable string(ValueError) create new entry
        try:
            color_dict[item[0]] = int(item[1])
        except(ValueError): #if item[1] can't convert to int
            pass
It seems like there should be a better way to handle the try/except blocks.
File excerpt by request:
color Observed
green 15
gold 20
green 35
Can't you just skip the first element in the list by slicing your list as [1:] like this:
filelist = [line.strip('\n').split('\t') for line in F][1:]
Now filelist won't contain the element for the first line, i.e. the heading line.
Or, as pointed out in a comment by @StevenRumbalski, you can simply call next(F, None) before your list comprehension, which avoids copying the list after its first element:
with open('file1.tsv') as F:
    next(F, None)
    filelist = [line.strip('\n').split('\t') for line in F]
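To see what next(F, None) does without touching a file, here is a small sketch using a plain iterator in place of the file object. The sample rows are made up to match the excerpt in the question.

```python
# An iterator over sample lines, standing in for the open file F.
sample_lines = iter(["color\tObserved\n", "green\t15\n", "gold\t20\n"])

# Consume the heading line; the None default avoids an exception
# if the input turns out to be empty.
header = next(sample_lines, None)

# The comprehension now only sees the remaining lines.
filelist = [line.strip("\n").split("\t") for line in sample_lines]

print(header)
print(filelist)
```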
Also, it would be better if you use a defaultdict here.
Use it like this:
from collections import defaultdict
color_dict = defaultdict(int)
And this way, you won't have to check for existence of key, before operating on it. So, you can simply do:
color_dict[item[0]] += int(item[1])
I would use defaultdict in this case, because when a key is encountered for the first time it is not yet in the mapping, so an entry is created automatically.
from collections import defaultdict
color_dict = defaultdict(int)
for item in filelist:
    color_dict[item[0]] += int(item[1])
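To also skip the heading line generically, as the question asks, one option is to combine defaultdict with a single try/except around the int conversion. This is a sketch under assumptions: the sample lines mirror the excerpt in the question, and treating any row whose second column is missing or not an integer as a heading is my reading of the requirement.

```python
from collections import defaultdict

# Sample lines standing in for file1.tsv (mirrors the excerpt above).
sample_lines = ["color\tObserved\n", "green\t15\n", "gold\t20\n", "green\t35\n"]

color_dict = defaultdict(int)
for line in sample_lines:
    fields = line.rstrip("\n").split("\t")
    try:
        count = int(fields[1])
    except (IndexError, ValueError):
        # Missing or non-integer second column: treat as a heading and skip.
        continue
    color_dict[fields[0]] += count

print(dict(color_dict))
```

Catching only IndexError and ValueError, instead of a bare except, keeps unrelated errors visible.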