Creating a dictionary in Python + working with that dictionary

Creating a dictionary in Python + working with that dictionary - python

I am quite new to Python and am just trying to get my head around some basics.
I was wondering if anyone could show me how to perform the following tasks. I have a text file with multiple lines, those lines are as follows:
name1, variable, variable, variable
name2, variable, variable, variable
name3, variable, variable, variable
I want to store these items in a dictionary so they can be easily called. I want the name to be the key. I then want to be able to call the variables like this: key[0] or key1
The code I have at the moment does not do this:
d = {}
with open("servers.txt") as f:
for line in f:
(key, val) = line.split()
d[int(key)] = val
Once this is done, I would like to be able to take an input from a user and then check the array to see if this item is present in the array. I have found a few threads on Stackoverflow however none seem to do what I require.
There is a Similar Question asked here.
Any assistance you can provide would be amazing. I am new to this but I hope to learn fast & start contributing to threads myself in the near future :)
Cheers!

You're nearly there. Assuming that .split() actually splits the lines correctly (which it wouldn't do if there are actual commas between the values), you just need an additional unpacking operator (*):
d = {}
with open("servers.txt") as f:
for line in f:
key, *val = line.split() # first element -> key, rest -> val[0], val[1] etc.
d[int(key)] = val
If you want to check if a user-entered key exists, you can do something like
ukey = int(input("Enter key number: "))
values = d.get(ukey)
if d is not None:
# do something
else:
print("That key doesn't exist.")

Suppose that your file my_file.csv looks like:
name1, variable, variable, variable
name2, variable, variable, variable
name3, variable, variable, variable
Use pandas to do the work:
import pandas as pd
result = pd.read_csv('my_file.csv', index_col=0, header=None)
print(result)
print(result.loc['name1'])
Notice that pandas is a 3rd party library, and you need to install it using pip or easy_install tools.

Related

Dynamically building a dictionary based on variables

I am trying to build a dictionary based on a larger input of text. From this input, I will create nested dictionaries which will need to be updated as the program runs. The structure ideally looks like this:
nodes = {}
node_name: {
inc_name: inc_capacity,
inc_name: inc_capacity,
inc_name: inc_capacity,
}
Because of the nature of this input, I would like to use variables to dynamically create dictionary keys (or access them if they already exist). But I get KeyError if the key doesn't already exist. I assume I could do a try/except, but was wondering if there was a 'cleaner' way to do this in python. The next best solution I found is illustrated below:
test_dict = {}
inc_color = 'light blue'
inc_cap = 2
test_dict[f'{inc_color}'] = inc_cap
# test_dict returns >>> {'light blue': 2}

Try this code, for Large Scale input. For example file input
Lemme give you an example for what I am aiming for, and I think, this what you want.
File.txt
Person1: 115.5
Person2: 128.87
Person3: 827.43
Person4:'18.9
Numerical Validation Function
def is_number(a):
try:
float (a)
except ValueError:
return False
else:
return True
Code for dictionary File.txt
adict = {}
with open("File.txt") as data:
adict = {line[:line.index(':')]: line[line.index(':')+1: ].strip(' \n') for line in data.readlines() if is_number(line[line.index(':')+1: ].strip('\n')) == True}
print(adict)
Output
{'Person1': '115.5', 'Person2': '128.87', 'Person3': '827.43'}
For more explanation, please follow this issue solution How to fix the errors in my code for making a dictionary from a file

As already mentioned in the comments sections, you can use setdefault.
Here's how I will implement it.
Assume I want to add values to dict : node_name and I have the keys and values in two lists. Keys are in inc_names and values are in inc_ccity. Then I will use the below code to load them. Note that inc_name2 key exists twice in the key list. So the second occurrence of it will be ignored from entry into the dictionary.
node_name = {}
inc_names = ['inc_name1','inc_name2','inc_name3','inc_name2']
inc_ccity = ['inc_capacity1','inc_capacity2','inc_capacity3','inc_capacity4']
for i,names in enumerate(inc_names):
node = node_name.setdefault(names, inc_ccity[i])
if node != inc_ccity[i]:
print ('Key=',names,'already exists with value',node, '. New value=', inc_ccity[i], 'skipped')
print ('\nThe final list of values in the dict node_name are :')
print (node_name)
The output of this will be:
Key= inc_name2 already exists with value inc_capacity2 . New value= inc_capacity4 skipped
The final list of values in the dict node_name are :
{'inc_name1': 'inc_capacity1', 'inc_name2': 'inc_capacity2', 'inc_name3': 'inc_capacity3'}
This way you can add values into a dictionary using variables.

My CSV files are not being assigned to the correct Key in a dictionary

def read_prices(tikrList):
#read each file and get the price list dictionary
def getPriceDict():
priceDict = {}
TLL = len(tikrList)
for x in range(0,TLL):
with open(tikrList[x] + '.csv','r') as csvFile:
csvReader = csv.reader(csvFile)
for column in csvReader:
priceDict[column[0]] = float(column[1])
return priceDict
#populate the final dictionary with the price dictionary from the previous function
def popDict():
combDict = {}
TLL = len(tikrList)
for x in range(0,TLL):
for y in tikrList:
combDict[y] = getPriceDict()
return combDict
return(popDict())
print(read_prices(['GOOG','XOM','FB']))
What is wrong with the code is that when I return the final dictionary the key for GOOG,XOM,FB is represnting the values for the FB dictionary only.
As you can see with this output:
{'GOOG': {'2015-12-31': 104.660004, '2015-12-30': 106.220001},
'XOM': {'2015-12-31': 104.660004, '2015-12-30': 106.220001},
'FB': {'2015-12-31': 104.660004, '2015-12-30': 106.220001}
I have 3 different CSV files but all of them are just reading the CSV file for FB.
I want to apologize ahead of time if my code is not easy to read or doesn't make sense. I think there is an issue with storing the values and returning the priceDict in the getPriceDict function but I cant seem to figure it out.
Any help is appreciated, thank you!

Since this is classwork I won't provide a solution but I'll point a few things out.
You have defined three functions - two are defined inside the third. While structuring functions like that can make sense for some problems/solutions I don't see any benefit in your solution. It seems to make it more complicated.
The two inner functions don't have any parameters, you might want to refactor them so that when they are called you pass them the information they need. One advantage of a function is to encapsulate an idea/process into a self-contained code block that doesn't rely on resources external to itself. This makes it easy to test so you know that the function works and you can concentrate on other parts of the code.
This piece of your code doesn't make much sense - it never uses x from the outer loop:
...
for x in range(0,TLL):
for y in tikrList:
combDict[y] = getPriceDict()
When you iterate over a list the iteration will stop after the last item and it will iterate over the items themselves - no need to iterate over numbers to access the items: don't do for i in range(thelist): print(thelist[i])
>>> tikrList = ['GOOG','XOM','FB']
>>> for name in tikrList:
... print(name)
GOOG
XOM
FB
>>>
When you read through a tutorial or the documentation, don't just look at the examples - read and understand the text .

Python3 dictionary values being overwritten

I’m having a problem with a dictionary. I"m using Python3. I’m sure there’s something easy that I’m just not seeing.
I’m reading lines from a file to create a dictionary. The first 3 characters of each line are used as keys (they are unique). From there, I create a list from the information in the rest of the line. Each 4 characters make up a member of the list. Once I’ve created the list, I write to the directory with the list being the value and the first three characters of the line being the key.
The problem is, each time I add a new key:value pair to the dictionary, it seems to overlay (or update) the values in the previously written dictionary entries. The keys are fine, just the values are changed. So, in the end, all of the keys have a value equivalent to the list made from the last line in the file.
I hope this is clear. Any thoughts would be greatly appreciated.
A snippet of the code is below
formatDict = dict()
sectionList = list()
for usableLine in formatFileHandle:
lineLen = len(usableLine)
section = usableLine[:3]
x = 3
sectionList.clear()
while x < lineLen:
sectionList.append(usableLine[x:x+4])
x += 4
formatDict[section] = sectionList
for k, v in formatDict.items():
print ("for key= ", k, "value =", v)
formatFileHandle.close()

You always clear, then append and then insert the same sectionList, that's why it always overwrites the entries - because you told the program it should.
Always remember: In Python assignment never makes a copy!
Simple fix
Just insert a copy:
formatDict[section] = sectionList.copy() # changed here
Instead of inserting a reference:
formatDict[section] = sectionList
Complicated fix
There are lots of things going on and you could make it "better" by using functions for subtasks like the grouping, also files should be opened with with so that the file is closed automatically even if an exception occurs and while loops where the end is known should be avoided.
Personally I would use code like this:
def groups(seq, width):
"""Group a sequence (seq) into width-sized blocks. The last block may be shorter."""
length = len(seq)
for i in range(0, length, width): # range supports a step argument!
yield seq[i:i+width]
# Printing the dictionary could be useful in other places as well -> so
# I also created a function for this.
def print_dict_line_by_line(dct):
"""Print dictionary where each key-value pair is on one line."""
for key, value in dct.items():
print("for key =", key, "value =", value)
def mytask(filename):
formatDict = {}
with open(filename) as formatFileHandle:
# I don't "strip" each line (remove leading and trailing whitespaces/newlines)
# but if you need that you could also use:
# for usableLine in (line.strip() for line in formatFileHandle):
# instead.
for usableLine in formatFileHandle:
section = usableLine[:3]
sectionList = list(groups(usableLine[3:]))
formatDict[section] = sectionList
# upon exiting the "with" scope the file is closed automatically!
print_dict_line_by_line(formatDict)
if __name__ == '__main__':
mytask('insert your filename here')

You could simplify your code here by using a with statement to auto close the file and chunk the remainder of the line into groups of four, avoiding the re-use of a single list.
from itertools import islice
with open('somefile') as fin:
stripped = (line.strip() for line in fin)
format_dict = {
line[:3]: list(iter(lambda it=iter(line[3:]): ''.join(islice(it, 4)), ''))
for line in stripped
}
for key, value in format_dict.items():
print('key=', key, 'value=', value)

Append to python dictionary

I have a csv file that has numbers integers and floats,
"5",7.30124705657363,2,12,7.45176205440562
"18",6.83169608190656,5,11,7.18118108407457
"20",6.40446470770985,4,10,6.70549470337383
"3",5.37498781178147,17,9,5.9902122724706
"10",5.12954203598201,8,8,5.58108702947798
"9",3.93496153596789,7,7,4.35751055597501
I am doing some arithmetic and then I am trying to add them into a dictionary but I am getting key error. Here is the code that I have,
global oldPriceCompRankDict
oldPriceCompRankDict = {}
def increaseQuantityByOne(self, fileLocation):
rows = csv.reader(open(fileLocation))
rows.next()
print "PricePercentage\t" + "OldQuantity\t" + "newQuantity\t" + "oldCompScore\t" + "newCompScore"
for row in rows:
newQuantity = float(row[2]) + 1.0
newCompetitiveScore = float(row[1]) + float(math.log(float(newQuantity), 100))
print row[1] + "\t", str(row[2])+"\t", str(newQuantity) + "\t", str(row[4]) + "\t", newCompetitiveScore
oldPriceCompRankDict[row[3]].append(row[4])
I have un-ordered key, and I didn't think the key has to be in an ordered format. I thought anything could be key.

No need to put in the global keyword, it's a no-op.
Use a defaultdict instead:
from collections import defaultdict
oldPriceCompRankDict = defaultdict(list)
What is happening is that you never define any keys for oldPriceCompRankDict, you just expect them to be lists by default. The defaultdict type gives you a dict that does just that; when a key is not yet found in oldPriceCompRankDict a new list() instance will be used as the starting value instead of raising a KeyError.

A Python dictionary type does not have an append() method. What you are doing is basically trying to call an append() method of the dictionary element accessible by key row[3]. You get a KeyError because you have nothing under key row[3].
You should substitute your code
oldPriceCompRankDict[row[3]].append(row[4])
for this:
oldPriceCompRankDict[row[3]] = row[4]
In addition, the global keyword is used inside functions to indicate that variable is a global one, you can read about it here: Using global variables in a function other than the one that created them
so the right way to declare a global dictionary would be just oldPriceCompRankDict = {}
Your function will start adding to the dictionary from the second row because you call rows.next() if it is a desirable behavior then it's OK, otherwise you don't need to call that method.
Hope this was helpful, happy coding!

Python Config Parser (Duplicate Key Support)

So I recently started writing a config parser for a Python project I'm working on. I initially avoided configparser and configobj, because I wanted to support a config file like so:
key=value
key2=anothervalue
food=burger
food=hotdog
food=cake icecream
In short, this config file is going to be edited via the command line over SSH often. So I don't want to tab or finicky about spacing (like YAML), but I also want avoid keys with multiple values (easily 10 or more) being line wrapped in vi. This is why I would like to support duplicate keys.
An my ideal world, when I ask the Python config object for food, it would give me a list back with ['burger', 'hotdog', 'cake', 'icecream']. If there wasn't a food value defined, it would look in a defaults config file and give me that/those values.
I have already implemented the above
However, my troubles started when I realized I wanted to support preserving inline comments and such. The way I handle reading and writing to the config files, is decoding the file into a dict in memory, read the values from the dict, or write values to the dict, and then dump that dict back out into a file. This isn't really nice for preserving line order and commenting and such and it's bugging the crap out of me.
A) ConfigObj looks like it has everything I need except support duplicate keys. Instead it wants me to make a list is going to be a pain to edit manually in vi over ssh due to line wrapping. Can I make configobj more ssh/vi friendly?
B) Is my homebrew solution wrong? Is there a better way of reading/writing/storing my config values? Is there any easy way to handle changing a key value in a config file by just modifying that line and rewriting the entire config file from memory?

Well I would certainly try to leverage what is in the standard library if I could.
The signature for the config parser classes look like this:
class ConfigParser.SafeConfigParser([defaults[, dict_type[, allow_no_value]]])
Notice the dict_type argument. When provided, this will be used to construct the dictionary objects for the list of sections, for the options within a section, and for the default values. It defaults to collections.OrderedDict. Perhaps you could pass something in there to get your desired multiple-key behavior, and then reap all the advantages of ConfigParser. You might have to write your own class to do this, or you could possibly find one written for you on PyPi or in the ActiveState recipes. Try looking for a bag or multiset class.
I'd either go that route or just suck it up and make a list:
foo = value1, value2, value3

Crazy idea: make your dictionary values as a list of 3-tuples with line number, col number and value itself and add special key for comment.
CommentSymbol = ';'
def readConfig(filename):
f = open(filename, 'r')
if not f:
return
def addValue(dict, key, lineIdx, colIdx, value):
if key in dict:
dict[key].append((lineIdx, colIdx, value))
else:
dict[key] = [(lineIdx, colIdx, value)]
res = {}
i = 0
for line in f.readlines():
idx = line.find(CommentSymbol)
if idx != -1:
comment = line[idx + 1:]
addValue(res, CommentSymbol, i, idx, comment)
line = line[:idx]
pair = [x.strip() for x in line.split('=')][:2]
if len(pair) == 2:
addValue(res, pair[0], i, 0, pair[1])
i += 1
return res
def writeConfig(dict, filename):
f = open(filename, 'w')
if not f:
return
index = sorted(dict.iteritems(), cmp = lambda x, y: cmp(x[1][:2], y[1][:2]))
i = 0
for k, V in index:
for v in V:
if v[0] > i:
f.write('\n' * (v[0] - i - 1))
if k == CommentSymbol:
f.write('{0}{1}'.format(CommentSymbol, str(v[2])))
else:
f.write('{0} = {1}'.format(str(k), str(v[2])))
i = v[0]
f.close()

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.