I've created a dictionary from the text file "PM.txt", where each key is a player and the value is that player's penalty minutes. I know dictionary keys have to be unique, so how do I add each new value to the key's current value when a player appears more than once in the file?
"PM.txt"
Neil,2
Paul,5
Neil,10
Santos,2
Neil,2
Santos,10
Paul,2
Alex,2
So far I have this which returns:
{'Alex': 2, 'Santos': 10, 'Paul': 2, 'Neil': 2}
def pm_dict(filename):
    f = open(filename, 'r')
    dict = {}
    for line in f:
        x = line.split(",")
        player = x[0]
        minutes = x[1]
        c = len(minutes)-2
        minutes = minutes[0:c]
        dict[player] = minutes
    return dict
But how would I create a function or a helper for it to return:
{'Alex': 2, 'Santos': 12, 'Paul': 7, 'Neil': 14}
Two standard-library modules make this problem straightforward. A defaultdict creates a missing key with the default value of its type, so D[key] += value works even when the key doesn't exist yet. The csv module parses comma-separated files; the default delimiter is a comma. Also, avoid using dict as a variable name, because it shadows the built-in dict type.
from collections import defaultdict
import csv
def pm_dict(filename):
    D = defaultdict(int)
    with open(filename, 'r', newline='') as f:
        r = csv.reader(f)
        for key,value in r:
            D[key] += int(value)
    return dict(D) # converts back to a standard dict, but not required.
print(pm_dict('PM.txt'))
Output:
{'Neil': 14, 'Paul': 7, 'Alex': 2, 'Santos': 12}
The values make more sense as numbers, but if you want value strings as in your example the last line of the function can be the following to convert values back to strings. This is a dictionary comprehension.
return {k:str(v) for k,v in D.items()}
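To see why D[key] += value works for a key that has not been seen yet, here is a minimal sketch that uses in-memory rows instead of the file. The `rows` list and the name `totals` are only for illustration; the values are copied from the sample PM.txt in the question.

```python
from collections import defaultdict

# Sample rows copied from the question's PM.txt, already split into fields.
rows = [("Neil", "2"), ("Paul", "5"), ("Neil", "10"), ("Santos", "2"),
        ("Neil", "2"), ("Santos", "10"), ("Paul", "2"), ("Alex", "2")]

totals = defaultdict(int)      # a missing key starts at int() == 0
for player, minutes in rows:
    totals[player] += int(minutes)

print(dict(totals))
# {'Neil': 14, 'Paul': 7, 'Santos': 12, 'Alex': 2}
```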
First convert your minute to int:
minutes = int(x[1])
Then you add it to your dictionary:
if player in dict:
    dict[player] += minutes
else:
    dict[player] = minutes
In your for loop, instead of dict[player] = minutes:
if player not in dict:
    dict[player] = 0
dict[player] += int(minutes)  # convert first: the file gives you a string
Also, instead of:
minutes = x[1]
c = len(minutes)-2
minutes = minutes[0:c]
You can do:
minutes = x[1].strip()
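Putting these pieces together without any imports beyond the standard library, a plain dict with get(key, 0) does the same job. This is only a sketch: the function here takes an iterable of lines rather than a filename so that it can be tried with an in-memory io.StringIO sample, which is an assumption made for the example.

```python
import io

def pm_dict(lines):
    """Sum penalty minutes per player from an iterable of 'name,minutes' lines."""
    totals = {}
    for line in lines:
        line = line.strip()          # drops the trailing newline
        if not line:                 # skip blank lines
            continue
        player, minutes = line.split(",")
        # get() returns 0 when the player has not been seen yet
        totals[player] = totals.get(player, 0) + int(minutes)
    return totals

sample = io.StringIO("Neil,2\nPaul,5\nNeil,10\nSantos,2\n")
print(pm_dict(sample))   # {'Neil': 12, 'Paul': 5, 'Santos': 2}
```

An open file object is also an iterable of lines, so the same function should work with `with open('PM.txt') as f: pm_dict(f)`.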
Just use defaultdict:
from collections import defaultdict

def pm_dict(filename):
    totals = defaultdict(int)  # avoid naming it dict, which shadows the built-in
    with open(filename, 'r') as f:
        for line in f:
            x = line.split(",")
            player = x[0]
            minutes = x[1].strip()  # strip the newline instead of slicing
            totals[player] += int(minutes)
    return totals
After refactoring:
import csv
from collections import defaultdict
...
def get_penalty_minutes_dict(filename):
    result_dict = defaultdict(int)
    with open(filename, 'r') as f:
        for player, minutes in csv.reader(f):
            result_dict[player] += int(minutes)
    return result_dict
Related
File contains student ID and ID of the solved problem.
Example:
1,2
1,4
1,3
2,1
2,2
2,3
2,4
The task is to write a function which will take a filename as an argument and return a dictionary with a student ID and amount of solved tasks.
Example output:
{1:3, 2:4}
My code below doesn't produce the correct output. Please help me find the mistake and a solution.
import collections
def solved_tasks(filename):
    with open(filename) as f:
        for line in f.readlines():
            key,value = line.strip().split(',')
            dictionary = {key: collections.Counter(str(value))}
    return dictionary
Since you only care about the count per student, not the individual problems, you can use a Counter on the first column:
import collections

def solved_tasks(filename):
    with open(filename) as in_stream:
        counts = collections.Counter(
            line.partition(',')[0]  # first column ...
            for line in in_stream if line.strip()  # ... of every non-empty row
        )
    return {int(key): value for key, value in counts.items()}
Assuming that you want to save the repeated instances of student id, you can use a defaultdict and save the problems solved by each student as a list in your dictionary:
import collections
dictionary = collections.defaultdict(list)
def solved_tasks(filename):
    with open(filename) as f:
        for line in f.readlines():
            key,value = line.strip().split(',')
            dictionary[key].append(value)
    return dictionary
Output:
defaultdict(<type 'list'>, {'1': ['2', '4', '3'], '2': ['1', '2', '3', '4']})
If you only want the count, use a defaultdict(int) instead:
dictionary = collections.defaultdict(int)
def solved_tasks(filename):
    with open(filename) as f:
        for line in f.readlines():
            key,value = line.strip().split(',')
            dictionary[key] += 1
    return dictionary
Output:
defaultdict(<type 'int'>, {'1': 3, '2': 4})
You can count how often a key appears:
marks = """1,2
1,4
1,3
2,1
2,2
2,3
2,4
2,4"""
counts = {}
for line in marks.split("\n"):
    key, value = line.strip().split(",")
    counts[key] = counts.get(key, []) + [value]
for key in counts:
    counts[key] = len(set(counts[key]))  # eliminate duplicates
The get(key, []) call returns an empty list as the default when the key doesn't exist in the dictionary yet.
Edit: You said the file may contain duplicates. This method eliminates all duplicates.
Edit: Added the multi-line string with """.
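The same idea (count each problem only once per student) can be sketched with a defaultdict(set), which avoids building a list and deduplicating it afterwards. The function name `count_distinct_solved` and the choice to pass lines rather than a filename are assumptions made for this example.

```python
from collections import defaultdict

def count_distinct_solved(lines):
    """Map each student ID to the number of distinct problems solved."""
    solved = defaultdict(set)            # student -> set of problem IDs
    for line in lines:
        line = line.strip()
        if not line:                     # skip blank lines
            continue
        student, problem = line.split(",")
        solved[student].add(problem)     # a set ignores repeated problems
    return {int(student): len(problems) for student, problems in solved.items()}

sample = ["1,2", "1,4", "1,3", "2,1", "2,2", "2,3", "2,4", "2,4"]
print(count_distinct_solved(sample))     # {1: 3, 2: 4}
```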
def solved_tasks(filename):
    res = {}
    values=""
    with open(filename, "r") as f:
        for line in f.readlines():
            values += line.strip()[0] #take only the first character and concatenate it with the values string (single-digit IDs only)
        value = values[0] #take the first value
        res[int(value)] = values.count(value) #put it in the dict
        for i in values: #loop the values
            if i != value: # if the value is not the first value, then the value is the new found value
                value = i
                res[int(value)] = values.count(value) #add the new value to the dict
    return res
I have two main dictionaries:
dict_main1 = {}
dict_main2 = {}
And then I open many dictionaries (below only 6 of 26 I have) which store the values of the main dictionaries depending on one particular string:
string1 = {}
string2 = {}
string3 = {}
string4 = {}
string5 = {}
string6 = {}
for key, value in dict_main1.items():
    if 'string1' in key:
        string1[key] = dict_main1[key]
    elif 'string2' in key:
        string2[key] = dict_main1[key]
    elif 'string3' in key:
        string3[key] = dict_main1[key]
    ......
for key, value in dict_main2.items():
    if 'string4' in key:
        string4[key] = dict_main2[key]
    elif 'string5' in key:
        string5[key] = dict_main2[key]
    elif 'string6' in key:
        string6[key] = dict_main2[key]
    ......
How can I open a file for each string# = {} dictionary in a Pythonic way? I would like to avoid doing it one by one, as in the example below:
FI = open('string1', 'w')
for key, value in string1.items():
    OUT = key + '\n' + value + '\n'
    FI.write(OUT)
First of all, you don't need 99999 dicts; just use one dict with dicts inside it.
For example:
from collections import defaultdict
my_keys = ['str1', 'str2', ....]
container_dict = defaultdict(dict)
for key, value in dict_main.items():
    for k in my_keys:
        if k in key:
            container_dict[k][key] = value
Now, for the files, just loop over the container:
for string, strings_dict in container_dict.items():
    with open(string, "w") as f:
        # format strings_dict... and save it
I didn't run this code, so there may be some bugs, but I think it is OK.
It might be useful to use a single dictionary data structure rather than maintaining 26 different dictionaries.
def split_dict(d, corpus):
    dicts = {w: {} for w in corpus}  # one empty dict per word
    for k, v in d.items():
        # first corpus word contained in the key; raises StopIteration if none matches
        word = next(filter(lambda w: w in k, corpus))
        dicts[word][k] = v
    return dicts
dict_main = {...}
corpus = [f'string{i}' for i in range(1, 4)]
dict_split = split_dict(dict_main, corpus)
Now, just loop through dict_split:
for word, d in dict_split.items():
    with open(word, 'w') as f:
        for key, value in d.items():
            f.write(f'{key}\n{value}\n')
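As a rough, self-contained sketch of the single-dictionary approach, the following groups the entries by substring and writes one file per group into a temporary directory. The sample `dict_main` contents and the helper names `split_by_substring` and `write_groups` are made up for illustration. Note that a plain substring test would also match 'string1' inside a key containing 'string10', so real keys may need a stricter match.

```python
import os
import tempfile

def split_by_substring(d, words):
    """Group the items of d by the first word in words found in each key."""
    groups = {w: {} for w in words}
    for key, value in d.items():
        for w in words:
            if w in key:
                groups[w][key] = value
                break                  # keys matching no word are skipped
    return groups

def write_groups(groups, directory):
    """Write each group to a file named after its word, one key/value per line pair."""
    for word, entries in groups.items():
        with open(os.path.join(directory, word), "w") as f:
            for key, value in entries.items():
                f.write(f"{key}\n{value}\n")

# Hypothetical sample data standing in for dict_main.
dict_main = {"a_string1": "x", "b_string2": "y", "c_string1": "z"}
groups = split_by_substring(dict_main, ["string1", "string2"])

with tempfile.TemporaryDirectory() as tmp:
    write_groups(groups, tmp)
    written = sorted(os.listdir(tmp))

print(groups)   # {'string1': {'a_string1': 'x', 'c_string1': 'z'}, 'string2': {'b_string2': 'y'}}
print(written)  # ['string1', 'string2']
```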
For some reason my code refuses to convert to uppercase and I can't figure out why. I'm then trying to write the dictionary to a file, with the uppercase dictionary values inserted into a sort of template file.
#!/usr/bin/env python3
import fileinput
from collections import Counter
#take every word from a file and put into dictionary
newDict = {}
dict2 = {}
with open('words.txt', 'r') as f:
    for line in f:
        k,v = line.strip().split(' ')
        newDict[k.strip()] = v.strip()
print(newDict)
choice = input('Enter 1 for all uppercase keys or 2 for all lowercase, 3 for capitalized case or 0 for unchanged \n')
print("Your choice was " + choice)
if choice == 1:
    for k,v in newDict.items():
        newDict.update({k.upper(): v.upper()})
if choice == 2:
    for k,v in newDict.items():
        dict2.update({k.lower(): v})
#find keys and replace with word
print(newDict)
with open("tester.txt", "rt") as fin:
    with open("outwords.txt", "wt") as fout:
        for line in fin:
            fout.write(line.replace('{PETNAME}', str(newDict['PETNAME:'])))
            fout.write(line.replace('{ACTIVITY}', str(newDict['ACTIVITY:'])))
myfile = open("outwords.txt")
txt = myfile.read()
print(txt)
myfile.close()
In Python 3 you cannot do this:
for k,v in newDict.items():
    newDict.update({k.upper(): v.upper()})
because it changes the dictionary while iterating over it, and Python doesn't allow that. (It doesn't happen in Python 2 because items() used to return a copy of the elements as a list.) Besides, even if it worked, it would keep the old keys. It is also slow to create a new dictionary at each iteration.
Instead, rebuild your dict with a dict comprehension:
newDict = {k.upper():v.upper() for k,v in newDict.items()}
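One more thing worth checking in the question's code: input() returns a string, so a comparison such as choice == 1 is never true and the conversion loop is skipped entirely. A minimal sketch of the menu logic, assuming both keys and values should change case (the function name `change_case` and the sample `words` dict are hypothetical):

```python
def change_case(d, choice):
    """Return a new dict with keys and values re-cased according to choice."""
    if choice == "1":                      # compare with strings, not ints
        return {k.upper(): v.upper() for k, v in d.items()}
    if choice == "2":
        return {k.lower(): v.lower() for k, v in d.items()}
    if choice == "3":
        return {k.capitalize(): v.capitalize() for k, v in d.items()}
    return dict(d)                         # "0" or anything else: unchanged copy

words = {"petname:": "rex", "activity:": "fetch"}
print(change_case(words, "1"))  # {'PETNAME:': 'REX', 'ACTIVITY:': 'FETCH'}
```

In the script this could be used as newDict = change_case(newDict, choice), keeping the later lookups such as newDict['PETNAME:'] valid only when the uppercase option is chosen or the file already uses uppercase keys.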
You should not change dictionary items as you iterate over them. The docs state:
Iterating views while adding or deleting entries in the dictionary may
raise a RuntimeError or fail to iterate over all entries.
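One way to update your dictionary as required is to pop values and reassign them in a for loop over a snapshot of the keys, so that the dictionary itself is not being iterated while it changes. For example:
d = {'abc': 'xyz', 'def': 'uvw', 'ghi': 'rst'}
for k in list(d):  # list(d) copies the keys before the loop starts
    d[k.upper()] = d.pop(k).upper()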
print(d)
{'ABC': 'XYZ', 'DEF': 'UVW', 'GHI': 'RST'}
An alternative is a dictionary comprehension, as shown by @Jean-FrançoisFabre.
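The RuntimeError mentioned in the docs is easy to reproduce. The sketch below first adds keys while iterating (which fails in CPython 3), then repeats the update over a copied key list; the variable names are only for illustration.

```python
d = {"abc": "xyz", "def": "uvw"}

raised = False
try:
    for k in d:
        d[k.upper()] = d[k].upper()    # adds a new key while iterating
except RuntimeError as exc:
    raised = True
    print("RuntimeError:", exc)

# Same update, but iterating over a snapshot of the keys.
d = {"abc": "xyz", "def": "uvw"}
for k in list(d):
    d[k.upper()] = d.pop(k).upper()

print(d)  # {'ABC': 'XYZ', 'DEF': 'UVW'}
```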
I have a dictionary with many, many key/value pairs.
The keys are dates and the values are worldwide top-level domains.
I want to output the dictionary to a text file so that, for each key, it counts identical values and sorts them alphabetically.
For example:
*key: value1:count value2:count*
date1: au:4 be:12 com:44
date2: az:4 com:14 net:5
Code:
with open('access_logshort.txt','rU') as f:
    for line in f:
        list1 = re.search(r'(?P<Date>[0-9]{2}/[a-zA-Z]{3}/[0-9]{4})(.+)(GET|POST)\s(http://|https://)([a-zA-Z.]+)(\.)(?P<tld>[a-zA-Z]+)(/).+?"\s200',line)
        if list1 != None:
            print list1.groupdict()
            one_tuple = list1.group(1,7)
            my_dict[one_tuple[0]]=one_tuple[1]
output:
print my_dict
{'09/Mar/2004': 'hu'}
{'09/Mar/2004': 'hu'}
{'09/Mar/2004': 'com'}
{'09/Mar/2004': 'ru'}
{'09/Mar/2004': 'ru'}
{'09/Mar/2004': 'com'}
This should suit your case.
from collections import defaultdict
from dateutil.parser import parse
import csv
import re
data = defaultdict(lambda: defaultdict(int))
with open('access_logshort.txt','rU') as f:
    for line in f:
        list1 = re.search(r'(?P<Date>[0-9]{2}/[a-zA-Z]{3}/[0-9]{4})(.+)(GET|POST)\s(http://|https://)([a-zA-Z.]+)(\.)(?P<tld>[a-zA-Z]+)(/).+?"\s200',line)
        if list1 is not None:
            date, domain = list1.group(1,7)
            data[date.lower()][domain.lower()] += 1
with open('my_data.csv', 'wb') as ofile:
    # add delimiter='\t' to the argument list of csv.writer if you want
    # tsv rather than csv
    writer = csv.writer(ofile)
    for key, value in sorted(data.iteritems(), key=lambda x: parse(x[0])):
        domains = sorted(value.iteritems())
        writer.writerow([key] + ['{}:{}'.format(*d) for d in domains])
Output:
10/Mar/2004,com:2,hu:2,ru:2
09/Mar/2004,com:2,hu:2,ru:2
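The code above is written for Python 2 (iteritems, 'wb' for csv). For Python 3, the counting and formatting step could be sketched as follows, using only the standard library. The `hits` list is made-up sample data standing in for the (date, tld) pairs extracted by the regex, and parsing 'Mar' with %b assumes an English or C locale.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical (date, tld) pairs, as the regex in the question would extract them.
hits = [("09/Mar/2004", "hu"), ("09/Mar/2004", "com"), ("09/Mar/2004", "hu"),
        ("08/Mar/2004", "ru"), ("09/Mar/2004", "com"), ("08/Mar/2004", "com")]

counts = defaultdict(Counter)          # date -> Counter of tlds
for date, tld in hits:
    counts[date][tld.lower()] += 1

def date_key(item):
    """Sort key: the real date rather than the date string."""
    return datetime.strptime(item[0], "%d/%b/%Y")

lines = []
for date, tlds in sorted(counts.items(), key=date_key):
    parts = " ".join(f"{tld}:{n}" for tld, n in sorted(tlds.items()))
    lines.append(f"{date}: {parts}")

print("\n".join(lines))
# 08/Mar/2004: com:1 ru:1
# 09/Mar/2004: com:2 hu:2
```

Writing `lines` to a text file is then a matter of opening it with mode 'w' and writing each line followed by a newline.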
I am trying to make a dictionary from a csv file in python, but I have multiple categories. I want the keys to be the ID numbers, and the values to be the name of the items. Here is the text file:
"ID#","name","quantity","price"
"1","hello kitty","4","9999"
"2","rilakkuma","3","999"
"3","keroppi","5","1000"
"4","korilakkuma","6","699"
and this is what I have so far:
txt = open("hk.txt","rU")
file_data = txt.read()
lst = [] #first make a list, and then convert it into a dictionary.
for key in file_data:
    k = key.split(",")
    lst.append((k[0],k[1]))
dic = dict(lst)
print(dic)
This just prints an empty list though. I want the keys to be the ID#, and then the values will be the names of the products. I will make another dictionary with the names as the keys and the ID#'s as the values, but I think it will be the same thing but the other way around.
Use the csv module to handle your data; it'll remove the quoting and handle the splitting:
import csv

results = {}
with open('hk.txt', 'r', newline='') as txt:
    reader = csv.reader(txt)
    next(reader, None) # skip the header line
    for row in reader:
        results[row[0]] = row[1]
For your sample input, this produces:
{'4': 'korilakkuma', '1': 'hello kitty', '3': 'keroppi', '2': 'rilakkuma'}
You can use csv DictReader:
import csv
result={}
with open('/tmp/test.csv', 'r', newline='') as f:
    for d in csv.DictReader(f):
        result[d['ID#']]=d['name']
print(result)
# {'1': 'hello kitty', '3': 'keroppi', '2': 'rilakkuma', '4': 'korilakkuma'}
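For the second dictionary mentioned in the question (names as keys, IDs as values), the first one can simply be inverted. Here is a small sketch that reads from an in-memory io.StringIO sample instead of a file, which is an assumption made so the example is self-contained; it also assumes the names are unique, since duplicate names would overwrite each other.

```python
import csv
import io

# In-memory stand-in for the first rows of hk.txt.
sample = io.StringIO(
    '"ID#","name","quantity","price"\n'
    '"1","hello kitty","4","9999"\n'
    '"2","rilakkuma","3","999"\n'
)

id_to_name = {}
for row in csv.DictReader(sample):     # the header row supplies the field names
    id_to_name[row["ID#"]] = row["name"]

# Swap keys and values to look items up by name.
name_to_id = {name: item_id for item_id, name in id_to_name.items()}

print(id_to_name)  # {'1': 'hello kitty', '2': 'rilakkuma'}
print(name_to_id)  # {'hello kitty': '1', 'rilakkuma': '2'}
```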
You can build the dictionary directly while reading the file line by line:
dictionary = {}
with open("hk.txt") as txt:
    txt.readline() # skip the header line
    for line in txt:
        line = line.replace('"', '').strip()
        k = line.split(",")
        dictionary[k[0]] = k[1]
Try this, or use a library such as csv to read the file:
txt = open("hk.txt", "r")
file_data = txt.read()
txt.close()
file_lines = file_data.splitlines() # no empty last element, unlike split("\n")
lst = [] # first make a list, and then convert it into a dictionary
for linenumber in range(1, len(file_lines)): # start at 1 to skip the header
    k = file_lines[linenumber].split(",")
    lst.append((k[0][1:len(k[0])-1], k[1][1:len(k[1])-1])) # drop the surrounding quotes
dic = dict(lst)
print(dic)
You can also fill the dict directly instead of building a list first.