Trying to create variables from elements in a list - python

This is my second attempt at asking this question; I hope it comes across more concisely this time.
I have a list of batsmen for a cricket score calculator I am trying to get up and running.
For example:
batsmen = ['S Ganguly', 'M Brown', 'R Uthappa', 'A Majumdar', 'S Smith', 'A Mathews', 'M Manhas', 'W Parnell', 'B Kumar', 'M Kartik', 'A Nehra']
I currently have a for loop that does not use the list; it only totals the runs for two teams and finds a winner.
for overs in range(overlimit):
    for balls in range(6):
        balls_team = balls_team + 1
        runtotal_team = input('Enter runs: ')
I want to use the list to keep score and to hold a value for each of the batsmen.
I'm hoping one of you can help. I assume a while loop could be used, but I can't figure out how to work the list into it.

Dictionary?
batsmenDict = {
    'A Majumdar': 0,
    'A Mathews': 0,
    'A Nehra': 0,
    'B Kumar': 0,
    'M Brown': 0,
    'M Kartik': 0,
    'M Manhas': 0,
    'R Uthappa': 0,
    'S Ganguly': 0,
    'S Smith': 0,
    'W Parnell': 0}
batsmenDict['M Manhas'] += 1
There is even a special collection type called a defaultdict that would let you make the default value 0 for each player:
from collections import defaultdict
batsmenDict = defaultdict(int)
print(batsmenDict['R Uthappa'])
# 0
batsmenDict['R Uthappa'] += 1
print(batsmenDict['R Uthappa'])
# 1

You want to use a dict: the batsmen's names become the keys, and their runs or scores the values.

Use a dictionary
>>> batsmen = [ ... ]
>>> d = dict.fromkeys(batsmen, 0)  # start every batsman on 0 rather than None
>>> d['A Mathews'] = 14
>>> d['A Mathews']
14
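To connect the dictionary with the loop from the question, here is a minimal sketch. The function name score_innings and the convention that None marks a dismissal are assumptions made for illustration. It ignores strike rotation and the second batsman at the crease. The runs come from a list rather than input() so the example runs unattended.

```python
def score_innings(batsmen, runs_per_ball):
    """Return (runs per batsman, team total).

    Assumption for this sketch: None in runs_per_ball means the current
    striker is out and the next batsman in the list comes in.
    """
    scores = dict.fromkeys(batsmen, 0)   # every batsman starts on 0
    order = iter(batsmen)
    striker = next(order)
    for runs in runs_per_ball:
        if runs is None:
            striker = next(order, None)  # next batsman, or None if all out
            if striker is None:
                break
            continue
        scores[striker] += runs
    return scores, sum(scores.values())


batsmen = ['S Ganguly', 'M Brown', 'R Uthappa']
scores, total = score_innings(batsmen, [4, 1, None, 6, 0, 2])
print(scores)  # {'S Ganguly': 5, 'M Brown': 8, 'R Uthappa': 0}
print(total)   # 13
```

In the real calculator the list of runs would be replaced by the nested overs/balls loop, converting each input() string with int() before adding it.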

Related

How to count word count in Data Frame using word list?

I have a question about counting words using Python.
The DataFrame has three columns (id, text, word).
Here is an example table:
df = pd.DataFrame({
"id":[
"100",
"200",
"300"
],
"text":[
"The best part of Zillow is you can search/view thousands of home within a click of a button without even stepping out of your door.At the comfort of your home you can get all the details such as the floor plan, tax history, neighborhood, mortgage calculator, school ratings etc. and also getting in touch with the contact realtor is just a click away and you are scheduled for the home tour!As a first time home buyer, this website greatly helped me to study the market before making the right choice.",
"I love all of the features of the Zillow app, especially the filtering options and the feature that allows you to save customized searches.",
"Data is not updated spontaneously. Listings are still shown as active while the Mls shows pending or closed."
],
"word":[
"[best, word, door, subway, rain]",
"[item, best, school, store, hospital]",
"[gym, mall, pool, playground]",
]
})
I have already split the text to build a dictionary.
Now I want to check each row's word list against that row's text.
This is the result I want:
| id | word dict |
| -- | ----------------------------------------------- |
| 100| {best: 1, word: 0, door: 1, subway: 0 , rain: 0} |
| 200| {item: 0, best: 0, school: 0, store: 0, hospital: 0} |
| 300| {gym: 0, mall: 0, pool: 0, playground: 0} |
Please take a look at this issue.
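We can use re to extract all of the words from the word column. Note that \w+ matches runs of letters, digits and underscores, so the brackets, commas and spaces are dropped.
Then we define a function that returns a dict with the count of each of those words in the text, and apply it row by row to create a new column in the df. Keep in mind that str.count counts substring occurrences, so a word is also counted when it appears inside a longer word.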
import re
def count_words(row):
    # pull the individual words out of the "[a, b, c]" string
    words = re.findall(r'\w+', row['word'])
    # count how often each word occurs in this row's text
    return {word: row['text'].count(word) for word in words}
df['word_counts'] = df.apply(count_words, axis=1)
Outputs
id ... word_counts
0 100 ... {'best': 1, 'word': 0, 'door': 1, 'subway': 0,...
1 200 ... {'item': 0, 'best': 0, 'school': 0, 'store': 0...
2 300 ... {'gym': 0, 'mall': 0, 'pool': 0, 'playground': 0}
[3 rows x 4 columns]
Since your word column is of type string, convert it to a list first:
df['word'] = df['word'].str[1:-1].str.split(',')
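Now you can use apply with axis=1 to count each word. Note that splitting on ',' leaves a leading space on every word after the first, as the output below shows; split on ', ' instead if you want clean keys: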
df[['text', 'word']].apply(lambda row: {item:row['text'].count(item) for item in row['word']}, axis=1)
OUTPUT:
Out[32]:
0 {'best': 1, ' word': 0, ' door': 1, ' subway':...
1 {'item': 0, ' best': 0, ' school': 0, ' store'...
2 {'gym': 0, ' mall': 0, ' pool': 0, ' playgroun...
dtype: object
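Both answers rely on str.count, which counts substrings and is case-sensitive. If whole-word counts are wanted instead, a minimal sketch could tokenise the text first. The function name count_whole_words, the sample sentence and the choice to lowercase everything are assumptions for illustration, not part of the answers above.

```python
import re
from collections import Counter


def count_whole_words(text, word_field):
    """Count whole-word, case-insensitive occurrences of each listed word."""
    wanted = re.findall(r"\w+", word_field)           # words from "[a, b, c]"
    seen = Counter(re.findall(r"\w+", text.lower()))  # token frequencies
    return {w: seen[w.lower()] for w in wanted}


text = "Data is not updated spontaneously. The best pool is the pool at the mall."
print(count_whole_words(text, "[gym, mall, pool, playground]"))
# {'gym': 0, 'mall': 1, 'pool': 2, 'playground': 0}
```

With the DataFrame from the question this could be applied per row, for example df.apply(lambda row: count_whole_words(row['text'], row['word']), axis=1).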

How to replace string with dictionary in Python? [duplicate]

How can I turn string into result using the dictionary d?
string = 'John and Mary are good friends'
d = {'John': 'Sam', 'Mary': 'Ann', 'are': 'are not'}
result = 'Sam and Ann are not good friends'
Thank you.
If every key in the dictionary is a single word, you can split the string, map each word with get, and join the result back together:
a = ' '.join(d.get(x, x) for x in string.split())
print(a)
Sam and Ann are not good friends
If keys can contain several words, use a regular expression with word boundaries so that substrings of other words are not replaced:
import re
string = 'John and Mary are good friends'
d = {'John and': 'Sam with', 'Mary': 'Ann', 'are good': 'are not'}
# escape each key so regex metacharacters in it are matched literally
pat = '|'.join(r"\b{}\b".format(re.escape(x)) for x in d)
a = re.sub(pat, lambda m: d[m.group(0)], string)
print(a)
Sam with Ann are not friends
You can do it like this:
string = 'John and Mary are good friends'
d = {'John': 'Sam', 'Mary': 'Ann', 'are': 'are not'}
result = string
for key, value in d.items():
    result = result.replace(key, value)
print(result)
Output:
Sam and Ann are not good friends
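One caveat with chained replace calls is that a later replacement can rewrite text produced by an earlier one. The sketch below contrasts the loop with a single-pass regular-expression substitution. The function names and the swap dictionary are made up for illustration.

```python
import re


def replace_chained(text, mapping):
    # Each replace sees the output of the previous one.
    for old, new in mapping.items():
        text = text.replace(old, new)
    return text


def replace_single_pass(text, mapping):
    # Longer keys first, so a key that is a prefix of another does not win.
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile("|".join(r"\b{}\b".format(re.escape(k)) for k in keys))
    # Every match is looked up in the original mapping exactly once.
    return pattern.sub(lambda m: mapping[m.group(0)], text)


swap = {'Mary': 'Ann', 'Ann': 'Mary'}
print(replace_chained('Mary and Ann', swap))      # Mary and Mary
print(replace_single_pass('Mary and Ann', swap))  # Ann and Mary
```

For the dictionary in the question both approaches give the same result; the difference only shows up when a replacement value is itself a key, or when a key occurs inside a longer word.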
1 - Iterate over each word of the string.
2 - Check whether the word exists in the dictionary's keys.
3 - If it does, append the value for that word to the result; if it does not, append the word itself.
Basic approach:
split the long string into a list of words
iterate through this list of words; if any word exists as a key in the given dictionary, replace that word with the corresponding value from the dictionary
join the list of words together using space
string = 'John and Mary are good friends'
d = {'John': 'Sam', 'Mary': 'Ann', 'are': 'are not'}
s = string.split()
for i, el in enumerate(s):
    if el in d:
        s[i] = d[el]
print(' '.join(s))

Accessing values in a sub-dictionary within a dictionary

Hello, I have a dictionary that looks like this:
dictionary = {'John': {'car':12, 'house':10, 'boat':3},
              'Mike': {'car':5, 'house':4, 'boat':6}}
I want to access the keys within the sub-dictionaries and assign their values to variables like this:
cars_total = dictionary['car']
house_total = dictionary['house']
boat_total = dictionary['boat']
When I run the assignments above I get a KeyError. That is understandable, because I first need to go through the outer dictionary. I would appreciate help with accessing the keys and values within the sub-dictionaries, as those are the values I want to use.
I would also like to create a new key. This may not be right, but something along these lines:
car = dictionary['car']
house = dictionary['house']
boat = dictionary['boat']
dictionary['total_assets'] = car + house + boat
I want to be able to access all those keys in the dictionary and create a new key. The outer keys such as 'John' and 'Mike' should both contain the newly made key at the end.
I know this throws an error, but it should give you an idea of what I want to achieve. Thanks for the help.
I would just use a Counter object to get the totals:
>>> from collections import Counter
>>> totals = Counter()
>>> for v in dictionary.values():
...     totals.update(v)
...
>>> totals
Counter({'car': 17, 'house': 14, 'boat': 9})
>>> totals['car']
17
>>> totals['house']
14
>>>
This has the added benefit of working nicely even if the keys aren't always present.
If you want the total assets, you can then simply sum the values:
>>> totals['total_assets'] = sum(totals.values())
>>> totals
Counter({'total_assets': 40, 'car': 17, 'house': 14, 'boat': 9})
>>>
To sum the total assets for each person and add it as a new key:
for person in dictionary:
    dictionary[person]['total_assets'] = sum(dictionary[person].values())
which will result in:
dictionary = {'John': {'car':12, 'house':10, 'boat':3, 'total_assets':25},
              'Mike': {'car':5, 'house':4, 'boat':6, 'total_assets':15}}
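Note that summing every value in a sub-dictionary would also include an existing total_assets entry if the loop were run a second time. A hedged variant that sums only a known set of asset names avoids this. The ASSET_KEYS tuple and the function name add_total_assets are assumptions for this sketch.

```python
ASSET_KEYS = ('car', 'house', 'boat')  # assumed fixed set of asset names


def add_total_assets(people):
    """Store the per-person sum of the known assets under 'total_assets'."""
    for assets in people.values():
        # .get(k, 0) tolerates a person who lacks one of the assets
        assets['total_assets'] = sum(assets.get(k, 0) for k in ASSET_KEYS)
    return people


dictionary = {'John': {'car': 12, 'house': 10, 'boat': 3},
              'Mike': {'car': 5, 'house': 4, 'boat': 6}}
add_total_assets(dictionary)
add_total_assets(dictionary)  # a second call leaves the totals unchanged
print(dictionary['John']['total_assets'])  # 25
print(dictionary['Mike']['total_assets'])  # 15
```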
dictionary doesn't have a key 'car', as you've seen, but dictionary['John'] does.
>>> dictionary['John']
{'car': 12, 'boat': 3, 'house': 10}
>>> dictionary['John']['car']
12
>>>
The value associated with each key in dictionary is, itself, another dictionary, which you index separately. There is no single object that contains, e.g., the car value for each subdictionary; you have to iterate over each value.
# Iterate over the dictionary once per aggregate
cars_total = sum(d['car'] for d in dictionary.values())
house_total = sum(d['house'] for d in dictionary.values())
boat_total = sum(d['boat'] for d in dictionary.values())
or
# Iterate over the dictionary once total
cars_total = 0
house_total = 0
boat_total = 0
for d in dictionary.values():
    cars_total += d['car']
    house_total += d['house']
    boat_total += d['boat']
dictionary = {'John': {'car':12, 'house':10, 'boat':3},'Mike': {'car':5, 'house':4, 'boat':6}}
total_cars=sum([dictionary[x]['car'] for x in dictionary ])
total_house=sum([dictionary[x]['house'] for x in dictionary ])
total_boats=sum([dictionary[x]['boat'] for x in dictionary ])
print(total_cars)
print(total_house)
print(total_boats)
Sample iteration method:
from collections import defaultdict
totals = defaultdict(int)
for person in dictionary:
    for element in dictionary[person]:
        totals[element] += dictionary[person][element]
print(totals)
Output:
defaultdict(<type 'int'>, {'car': 17, 'boat': 9, 'house': 14})

I need a way to map multiple keys to the same value in a dictionary

Admittedly, this question seems like it might be a popular one, but I couldn't really find it (perhaps I wasn't using the right search terms). Anyway, I need something of this sort:
tel = {}
tel['1 12'] = 1729
tel['9 10'] = 1729
tel['1 2'] = 9
tel['2 1'] = 9
tel['1 1'] = 2
print(tel)
{['1 1'] : 2, ['1 2', '2 1'] : 9, ['1 12', '9 10'] : 1729}
So whenever a key's value is already in the dict, append the key to the list of keys mapping to that value; else, add the key value pair to the dict.
EDIT
I'm sorry if I confused the lot of you, and I'm REALLY sorry if the following confuses you even more :)
This is the original problem I wanted to solve: Given the equation a^3 + b^3, produce a dictionary mapping all positive integer pair values for a, b less than 1000 to the value of the equation when evaluated. When two pairs evaluate to the same value, I want the two pairs to share the same value in the dictionary and be grouped together somehow. (I'm already aware that I can map different keys to the same value in a dict, but I need this grouping).
So a sample of my pseudocode would be given by:
for a in range(1, 1000):
    for b in range(1, 1000):
        map pair (a, b) to a^3 + b^3
For some integer pairs (a, b) and (p, q) where a != p, and b != q, a^3 + b^3 == p^3 + q^3. I want these pairs to be grouped together in some way. So for example, [(1, 12), (9, 10)] maps to 1729. I hope this makes it more clear what I want.
EDIT2
As many of you have pointed out, I shall switch the key value pairs if it means a faster lookup time. That would mean though that the values in the key:value pair need to be tuples.
As many of the comments have already pointed out, you seem to have your key/value structure inverted. I would recommend using your int values as the keys instead. This gives you efficient dictionary lookups by int value and a simpler, more elegant data design that uses a dictionary as intended.
Ex: {9: ('1 2', '2 1'), 2: ('1 1',), 1729: ('9 10', '1 12')}
That being said, the snippet below will do what you require. It first groups the data as shown above, then inverts the keys and values again.
tel = {}
tel['1 12'] = 1729
tel['9 10'] = 1729
tel['1 2'] = 9
tel['2 1'] = 9
tel['1 1'] = 2
#-----------------------------------------------------------
from collections import defaultdict
# group the original keys by the value they map to
new_tel = defaultdict(list)
for key, value in tel.items():
    new_tel[value].append(key)
# new_tel -> {1729: ['1 12', '9 10'], 9: ['1 2', '2 1'], 2: ['1 1']}
# invert again: each tuple of grouped keys maps to the shared value
print({tuple(keys): value for value, keys in new_tel.items()})
# {('1 12', '9 10'): 1729, ('1 2', '2 1'): 9, ('1 1',): 2}
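Applied to the original a^3 + b^3 problem, the same grouping idea can be sketched as follows. The function name group_by_cube_sum is hypothetical, and restricting b to be at least a (so that (a, b) and (b, a) are stored once) is an assumption about what the asker wants.

```python
from collections import defaultdict


def group_by_cube_sum(limit):
    """Map each value of a**3 + b**3 to the list of pairs that produce it."""
    groups = defaultdict(list)
    for a in range(1, limit):
        for b in range(a, limit):  # b >= a, so mirrored pairs count once
            groups[a**3 + b**3].append((a, b))
    return groups


groups = group_by_cube_sum(13)  # small limit to keep the demo fast
# keep only the sums reached by more than one pair
shared = {total: pairs for total, pairs in groups.items() if len(pairs) > 1}
print(shared)  # {1729: [(1, 12), (9, 10)]}
```

Calling group_by_cube_sum(1000) covers the range from the question; the lookup by sum stays a plain dictionary access.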
I know this thread is old, but I just came across this problem and here's how I solved it:
class PolyMap:
    def __init__(self):
        self._map = {}
    def __setitem__(self, key, value):
        self._map[key] = value
    def __getitem__(self, item):
        for keys in self._map:
            if item == keys:
                return self._map[item]
            if item in keys:
                return self._map[keys]
        return None
pm = PolyMap()
pm[("x", "y")] = "z"
print(pm["x"])
# >> z
print(pm["y"])
# >> z
print(pm[("x", "y")])
# >> z
print(pm["z"])
# >> None
This doesn't keep track of key components you've used, so if you use the same key component more than once, and expect consistent functionality of getitem, you'll need to pass it the tuple you created it with.
The good news is, if you know that the components of your key-tuples are all unique, then you can use this without fuss.
For those coming here via search:
First, write the mapping grouped by value
>>> tmp = {1729: ['1 12','9 10'], 9: ['1 2','2 1'], 2: ['1 1']}
Then invert it, so that every key maps to its value
>>> tel = {v: k for k,vs in tmp.items() for v in vs}
>>> print(tel)
{'1 12': 1729, '9 10': 1729, '1 2': 9, '2 1': 9, '1 1': 2}

Efficient way of frequency counting of continuous words?

I have a string like this:
inputString = "this is the first sentence in this book the first sentence is really the most interesting the first sentence is always first"
and a dictionary like this:
{
'always first': 0,
'book the': 0,
'first': 0,
'first sentence': 0,
'in this': 0,
'interesting the': 0,
'is always': 0,
'is really': 0,
'is the': 0,
'most interesting': 0,
'really the': 0,
'sentence in': 0,
'sentence is': 0,
'the first': 0,
'the first sentence': 0,
'the first sentence is': 0,
'the most': 0,
'this': 0,
'this book': 0,
'this is': 0
}
What is the most efficient way of updating the frequency counts of this dictionary in one pass of the input string (if it is possible)? I get a feeling that there must be a parser technique to do this but am not an expert in this area so am stuck. Any suggestions?
Check out the Aho-Corasick algorithm.
Aho–Corasick definitely seems the way to go, but if I needed a simple Python implementation, I'd write:
import collections
def consecutive_groups(seq, n):
    # windows of up to n words starting at every position; windows near the
    # end are shorter, so phrases at the tail of the text are not missed
    return (seq[i:i+n] for i in range(len(seq)))
def get_snippet_occurrences(snippets):
    split_snippets = [s.split() for s in snippets]
    max_snippet_length = max(len(sp) for sp in split_snippets)
    for group in consecutive_groups(inputString.split(), max_snippet_length):
        for lst in split_snippets:
            if group[:len(lst)] == lst:
                yield " ".join(lst)
# snippets is the dictionary from the question; iterating it yields the phrases
print(collections.Counter(get_snippet_occurrences(snippets)))
# the counts include 'first': 4, 'the first sentence': 3, 'the first sentence is': 2, 'sentence is': 2, 'this': 2, 'always first': 1
When confronted with this problem, I think, "I know, I'll use regular expressions".
Start off by making a list of all the patterns, sorted by decreasing length:
patterns = sorted(counts.keys(), key=len, reverse=True)
Now make that into a single massive regular expression which is an alternation between each of the patterns:
import re
allPatterns = re.compile("|".join(re.escape(p) for p in patterns))
Now run that pattern over the input string, and count up the number of hits on each pattern as you go:
pos = 0
while True:
    match = allPatterns.search(inputString, pos)
    if match is None:
        break
    # restart just after this match's start so overlapping matches are found
    pos = match.start() + 1
    counts[match.group()] += 1
You will end up with the counts of each of the strings.
(An aside: I believe most good regular expression libraries will compile a large alternation over fixed strings like this using the Aho-Corasick algorithm that e.dan mentioned. Using a regular expression library is probably the easiest way of applying this algorithm.)
There is one problem: where a pattern is a prefix of another pattern (e.g. 'first' and 'first sentence'), only the longer pattern will have been counted. This is by design: that's what the sort by length at the start was for.
We can deal with this as a postprocessing step; go through the counts, and whenever one pattern is a prefix of another, add the longer pattern's counts to the shorter pattern's. Be careful not to double-add. That's simply done as a nested loop:
correctedCounts = {}
for donor in counts:
    for recipient in counts:
        if donor.startswith(recipient):
            correctedCounts[recipient] = correctedCounts.get(recipient, 0) + counts[donor]
That dictionary now contains the actual counts.
Try with Suffix tree or Trie to store words instead of characters.
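A word-level trie could look roughly like the sketch below. The names build_trie, count_phrases and the END marker are made up for illustration, and the text is assumed to be split on whitespace. Each starting word walks down the trie once, so every phrase that begins there is counted without testing all phrases individually.

```python
END = ""  # marker key; str.split() never yields an empty string


def build_trie(phrases):
    """Nested dicts keyed by word; END stores the complete phrase."""
    root = {}
    for phrase in phrases:
        node = root
        for word in phrase.split():
            node = node.setdefault(word, {})
        node[END] = phrase
    return root


def count_phrases(text, phrases):
    trie = build_trie(phrases)
    counts = dict.fromkeys(phrases, 0)
    words = text.split()
    for start in range(len(words)):
        node = trie
        for word in words[start:]:
            node = node.get(word)
            if node is None:        # no phrase continues with this word
                break
            if END in node:         # a complete phrase ends here
                counts[node[END]] += 1
    return counts


phrases = ['first', 'the first', 'the first sentence', 'sentence is', 'is always']
print(count_phrases("the first sentence is the first", phrases))
# {'first': 2, 'the first': 2, 'the first sentence': 1, 'sentence is': 1, 'is always': 0}
```

The inner loop stops as soon as no phrase continues, so the work per starting word is bounded by the longest phrase rather than by the number of phrases.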
Just go through the string and use the dictionary as you normally would, incrementing the count for every occurrence. This is O(n), since a dictionary lookup is O(1) on average. I do this regularly, even for large word collections.
