Filtering sequences of events in Django

My Django app stores the actions that players take during a game. One of the models is called Event, and it contains a list of all actions by players. It has the following 4 columns: game_id, player_name, action, turn. Turn is the number of the turn in which the action takes place.
Now I want to count how often players behave in certain patterns. For example, I want to know how often a player takes decision A in turn 2 and that same player takes decision B in turn 3. I'm racking my brain over how to do this. Can someone help?
Note: Ideally the queries should be efficient, bearing in mind that some related queries will follow. For example, after the above query I would want to know how often a player takes decision C in turn 4, after doing A and B in turns 2 and 3. (The goal is to predict the likelihood of each action, given the actions in the past.)
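One possible approach, sketched here in plain Python rather than as a Django query, is to load the events once, group them per player and game, and then answer every pattern question from memory. The function names and the sample rows are made up for illustration; in Django the rows could presumably come from something like Event.objects.values_list("game_id", "player_name", "action", "turn"), which assumes the model fields are named exactly as the columns in the question. The sketch also assumes at most one action per player per turn.

```python
from collections import defaultdict


def build_histories(events):
    """Map (game_id, player_name) -> {turn: action}.

    Assumes at most one action per player per turn; a later row for the
    same turn overwrites an earlier one.
    """
    histories = defaultdict(dict)
    for game_id, player_name, action, turn in events:
        histories[(game_id, player_name)][turn] = action
    return histories


def count_pattern(histories, pattern):
    """Count player-games that match every (turn, action) pair in pattern."""
    return sum(
        all(history.get(turn) == action for turn, action in pattern.items())
        for history in histories.values()
    )


def conditional_probability(histories, given, then):
    """Estimate P(then | given); return None if `given` never occurred."""
    base = count_pattern(histories, given)
    if base == 0:
        return None
    return count_pattern(histories, {**given, **then}) / base


# Made-up rows in the column order from the question.
events = [
    (1, "ann", "A", 2), (1, "ann", "B", 3), (1, "ann", "C", 4),
    (1, "bob", "A", 2), (1, "bob", "B", 3), (1, "bob", "D", 4),
    (2, "ann", "A", 2), (2, "ann", "C", 3),
]
histories = build_histories(events)

# How often does a player take A in turn 2 and B in turn 3?
a_then_b = count_pattern(histories, {2: "A", 3: "B"})

# Given A and B in turns 2 and 3, how likely is C in turn 4?
p_c = conditional_probability(histories, {2: "A", 3: "B"}, {4: "C"})
```

Because the histories are built once, each follow-up question is only a scan over the in-memory dictionaries. Whether that beats a database-side self-join depends on how large the Event table is.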

Related

Add Solver Constraint For Non-Mutually Exclusive Ranges

I use or-tools to optimize my fantasy baseball team. My setup very much resembles the program described here. The only difference in my particular case is that players can actually be eligible for a number of different positions. So, I end up with 1 player in a list for a specific position type, and the same player in another list for another position type. I am trying to avoid having the solver select the same player for multiple positions (which wouldn't be realistic).
Is there any way to modify the aforementioned program to constrain the use of a player to a single position even while they are technically eligible for many? Please let me know if I can clarify any further & thanks for your input.

Which datatype should I choose for a units selection in an RTS game?

What is a good data type for a unit collection in an RTS?
I'm contributing to an API that lets you write bots for the strategy game StarCraft II in Python.
Right now there is a class units that inherits from list. Every frame, a new units object gets created and then selections of these units are made, creating new units objects, for example with a filter for all enemy units or all flying units.
We use these selections to find the closest enemy to control our units, select our units that can attack right now or need a different order and so on.
But this also means we do a lot of filtering by attributes of each unit in every frame, which takes a lot of time. The time for initializing one units object alone is 2e-5 to 5e-5 sec, and we do it millions of times per game, which can slow down the bot and the tests a lot, in addition to the filtering process with loops over each unit in the units object.
Is there a better data type for this?
Maybe something that does not need to be recreated every time for each selection in one frame, but just starts with the initial list of all units we get from the protocol buffer and then the selections and filters can be applied without recreating the object? What would be a good way to implement this so that filtering multiple times per frame is not that slow and/or complicated?

This doesn't sound like an ADT problem at all. This sounds like inefficient programming. It is impossible for us to tell you the correct message to construct to achieve what you're going for.
What you should probably be investigating is how to construct a UnitView if you don't actually need to modify the units data. Consider something similar to how dictionaries return views in Python 3. See here for more details.
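A minimal sketch of what such a view could look like is given below. UnitView, its methods, and the dict-based units are all hypothetical and not part of the API discussed in the question. The idea is that a selection only stores predicates and shares the one underlying list, so creating a sub-selection copies nothing.

```python
class UnitView:
    """Lazy, read-only selection over a shared list of units.

    Filters are stored as predicates and applied only on iteration,
    so creating a sub-selection does not copy the underlying list.
    """

    def __init__(self, units, predicates=()):
        self._units = units                  # shared, never copied
        self._predicates = tuple(predicates)

    def filter(self, predicate):
        """Return a new view with one more predicate; no units are touched."""
        return UnitView(self._units, self._predicates + (predicate,))

    def __iter__(self):
        for unit in self._units:
            if all(p(unit) for p in self._predicates):
                yield unit

    def closest_to(self, x, y):
        """Return the matching unit nearest to (x, y), or None if none match."""
        return min(
            self,
            key=lambda u: (u["x"] - x) ** 2 + (u["y"] - y) ** 2,
            default=None,
        )


# Hypothetical units, represented as plain dicts for the example.
units = [
    {"name": "marine", "enemy": False, "flying": False, "x": 0, "y": 0},
    {"name": "mutalisk", "enemy": True, "flying": True, "x": 5, "y": 5},
    {"name": "zergling", "enemy": True, "flying": False, "x": 1, "y": 1},
]
all_units = UnitView(units)
enemies = all_units.filter(lambda u: u["enemy"])
flying_enemies = enemies.filter(lambda u: u["flying"])
nearest_enemy = enemies.closest_to(0, 0)
```

Note the trade-off: every iteration re-applies the predicates, so a selection that is iterated many times per frame may still be cheaper to materialize once into a list.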

How to keep track of players' rankings?

I have a Player class with a score attribute:
    class Player(game_engine.Player):
        def __init__(self, id):
            super().__init__(id)
            self.score = 0
This score increases/decreases as the player succeeds/fails to do objectives. Now I need to tell the player his rank out of the total amount of players with something like
print('Your rank is {0} out of {1}')
First I thought of having a list of all the players, and whenever anything happens to a player:
1) I check if his score increased or decreased
2) find him in the list
3) move him until his score is in the correct place
But this would be extremely slow. There can be hundreds of thousands of players, and a player can reset his own score to 0 which would mean that I'd have to move everyone after him in the stack. Even finding the player would be O(n).
What I'm looking for is a high performance solution. RAM usage isn't quite as important, although common sense should be used. How could I improve the system to be a lot faster?
Updated info: I'm storing a player's data in a MySQL database with SQLAlchemy every time he leaves the game server, and I load it every time he joins the server. These are handled through 'player_join' and 'player_leave' events:
    @Event('player_join')
    def load_player(id):
        """Load player into the global players dict."""
        session = Session()
        query = session.query(Player).filter_by(id=id)
        players[id] = query.one_or_none() or Player(id=id)

    @Event('player_leave')
    def save_player(id):
        """Save player into the database."""
        session = Session()
        session.add(players[id])
        session.commit()
Also, the player's score is updated upon 'player_kill' event:
    @Event('player_kill')
    def update_score(id, target_id):
        """Update players' scores upon a kill."""
        players[id].score += 2
        players[target_id].score -= 2

Redis sorted sets help with this exact situation (the documentation uses leaderboards as the example usage): http://redis.io/topics/data-types-intro#redis-sorted-sets
The key commands you care about are ZADD (update a player's score, and with it the rank) and ZRANK (get the rank of a specific player; ZREVRANK gives the rank counted from the highest score down). Both operations are O(log(N)).
Redis can be used as a cache of player rankings. When your application starts, populate Redis from the SQL data. When updating player scores in MySQL, also update Redis.
If you have multiple server processes/threads that could trigger player score updates concurrently, then you should also account for the MySQL/Redis update race condition, e.g.:
only update redis from a DB trigger; or
serialise player score updates; or
let data get temporarily out of sync and do another cache update after a delay; or
let data get temporarily out of sync and do a full cache rebuild at fixed intervals

The problem you have is that you want real-time updates against a database, which requires a db query each time. If you instead maintain a list of scores in memory, and update it at a more reasonable frequency (say once an hour, or even once a minute, if your players are really concerned with their rank), then the players will still experience real-time progress vs a score rank, and they can't really tell if there is a short lag in the updates.
With a sorted list of scores in memory, you can instantly get the player's rank (where by instantly, I mean O(lg n) lookup in memory) at the cost of the memory to cache, and of course the time to update the cache when you want to. Compared to a db query of 100k records every time someone wants to glance at their rank, this is a much better option.
Elaborating on the sorted list, you must query the db to get it, but you can keep using it for a while. Maybe you store the last_update, and re-query the db only if this list is "too old". So you update quickly by not trying to update all the time, but rather just enough to feel like real-time.
In order to find someone's rank nearly instantaneously, you use the bisect module, which supports binary search in a sorted list. The scores are sorted when you get them.
    from bisect import bisect_left

    # Suppose the scores are 1 through 10, sorted in ascending order.
    scores = list(range(1, 11))
    # Get the insertion index for score 7 and subtract it from len(scores),
    # because bisect expects an ascending sort but you want a descending rank.
    print(len(scores) - bisect_left(scores, 7))
This says that a 7 score is rank 4, which is correct.
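The caching idea described above (keep a sorted list, remember last_update, and re-query only when the list is too old) could be sketched roughly as follows. RankCache, fetch_scores, and max_age are illustrative names, not part of any library; fetch_scores stands in for whatever database query returns all scores. Unlike the snippet above, this version uses bisect_right so that players with equal scores share the better rank, which is a design choice of the sketch.

```python
import time
from bisect import bisect_right


class RankCache:
    """Sorted in-memory copy of all scores, refreshed when it gets too old."""

    def __init__(self, fetch_scores, max_age=60.0, clock=time.monotonic):
        self._fetch_scores = fetch_scores  # hypothetical: returns all scores
        self._max_age = max_age            # seconds before a re-query
        self._clock = clock
        self._scores = []
        self._last_update = None

    def _refresh_if_stale(self):
        now = self._clock()
        if self._last_update is None or now - self._last_update > self._max_age:
            self._scores = sorted(self._fetch_scores())
            self._last_update = now

    def rank(self, score):
        """Return (rank, total); rank 1 is the highest score, ties share a rank."""
        self._refresh_if_stale()
        higher = len(self._scores) - bisect_right(self._scores, score)
        return higher + 1, len(self._scores)


# Stand-in for the database query; counts how often it is called.
calls = []

def fetch_scores():
    calls.append(1)
    return range(1, 11)

cache = RankCache(fetch_scores, max_age=60.0)
rank, total = cache.rank(7)
print('Your rank is {0} out of {1}'.format(rank, total))
top_rank, _ = cache.rank(10)   # served from the cache, no second fetch
```

The lookup itself is a binary search, so the cost per rank request stays O(log n) between refreshes.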

That kind of information can be pulled using SQLAlchemy's order_by method. If you perform a query like:
    leaderboard = session.query(Player).order_by(Player.score).all()
You will have the list of Players sorted by their score in ascending order (use Player.score.desc() to put the highest score first). Keep in mind that every time you do this you perform I/O with the database, which can be rather slow compared with keeping the data in Python variables.

Create balanced tennis rounds

We are a group of 20 people and we like to go play 2 vs 2 tennis matches. Each of us plays one match each round and we do 5 rounds in total, so everyone plays 5 matches. Matches have two restrictions:
Everyone has a different level (from 1 to 5), so the matches must be balanced: two players with levels 5 and 5 shouldn't be matched against two players with level 1. So the difference in level between the two teams must be lower than or equal to 1.5.
E.g.: level 1.5 and level 2 vs level 2 and level 2.5. The difference in level between the teams is 1, so the match is accepted.
If two players play together in one match, they must not play together again in the following rounds.
I managed to create a Python script that does the above, but it takes about 20 minutes to finish, depending on the levels of the people :/. What I do is shuffle the list with everyone in it, break it into 5 lists of 4 people, check whether the conditions are satisfied, and repeat for every round.
I tried modeling the problem to solve it with linear programming (LP) but I don't know which is my function to optimize to begin with... Any ideas on how to do this with or without LP?
Thanks in advance!

You could use a dummy objective or even try to minimize the max of the difference in levels.
My MIP model is not completely trivial, but it solves quite fast (about a second or so using a commercial solver).
The results look ok at first sight:
I assumed the restriction is that two players cannot be on the same team more than once, not that they cannot be in the same game more than once. That is, in my model you can play against another player more than once.
A more complex example can be found here.
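Whichever method produces the schedule, the two restrictions can be checked independently. The sketch below is not the MIP model mentioned above; it is only a validator under the same interpretation (a pair may not be teammates twice, and the difference between the two teams' summed levels must be at most 1.5, as in the worked example from the question). All names and the sample data are made up.

```python
def team_level(team, levels):
    """Sum of the levels of the players on one team."""
    return sum(levels[player] for player in team)


def is_balanced(match, levels, max_diff=1.5):
    """A match is a pair of teams; compare their summed levels."""
    team_a, team_b = match
    return abs(team_level(team_a, levels) - team_level(team_b, levels)) <= max_diff


def first_violation(rounds, levels, max_diff=1.5):
    """Return a description of the first broken rule, or None if all hold."""
    seen_teams = set()
    for round_no, matches in enumerate(rounds, start=1):
        for match in matches:
            if not is_balanced(match, levels, max_diff):
                return "round {}: unbalanced match".format(round_no)
            for team in match:
                pair = frozenset(team)
                if pair in seen_teams:
                    return "round {}: repeated team {}".format(
                        round_no, sorted(pair))
                seen_teams.add(pair)
    return None


# Four made-up players, using the levels from the example in the question.
levels = {"a": 1.5, "b": 2, "c": 2, "d": 2.5}
round_1 = [(("a", "b"), ("c", "d"))]   # 3.5 vs 4.5, difference 1
round_2 = [(("a", "c"), ("b", "d"))]   # 3.5 vs 4.5, difference 1

valid = first_violation([round_1, round_2], levels)
repeated = first_violation([round_1, round_1], levels)
```

A checker like this is also handy for testing the output of a solver-based model before trusting it.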

Python: How To Construct A Class With Many Parameters

I am in a bit of a jam in deciding how to structure my class. What I have is a baseball player class and necessary attributes of:
Player ID (a key from a DB)
Last name
First name
Team
Position
Opponent
about 10 or 11 stats (historical)
about 10 or 11 stats (projected)
pitcher matchup
weather
... and a few more
Some things to break this down a little:
1) put stats in dictionaries
2) make a team class that can hold general info common for all players on the team like weather and pitcher match up.
But, I still have 10 attributes after this.
Seeing this (Class with too many parameters: better design strategy?) has given me a couple of ideas, but I don't know if they're ideal.
1) Use a dictionary - But then wouldn't be able to use methods to calculate stats (or would have to use separate functions)
2) Use args/kwargs - But from what I can gather, those seem to be for variable amounts of parameters, and all of my parameters will be required.
3) Breaking up into smaller classes - I have broken it up a bit already, but don't know if I can any further.
Is there a better way to build this rather than having a class with a bunch of parameters listed out?

If you think about this from the perspective of database design, it would probably be odd to have a BaseballPlayer object that has the following parameters:
Team
Position
Opponent
about 10 or 11 stats (historical)
about 10 or 11 stats (projected)
pitcher matchup
weather
That is because certain things associated with a particular BaseballPlayer, such as the name, remain relatively fixed, while these other things are fluid and transitory.
If you were designing this as an application with various database tables, then it's possible that each of the things listed here would represent a separate table, and the BaseballPlayer's relationship with these other tables amounts to current and former Team, etc.
Thus, I would probably break up the problem into more classes, including a StatsClass and a Team class (which is probably what an Opponent really is...).
But it all depends what you would like to do. Usually when you are bending over backwards to cram data into a structure or doing the same to get it back out, the design could be reworked to make your job easier.
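As a rough illustration of that split, and only one of several reasonable layouts, the sketch below uses dataclasses. The class and field names (Stats, Team, BaseballPlayer, opposing_pitcher, and so on) and the sample values are assumptions made for the example, not a prescribed design.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Stats:
    """A named bag of numbers, with methods for derived values."""
    values: Dict[str, float] = field(default_factory=dict)

    def ratio(self, numerator, denominator):
        """Return values[numerator] / values[denominator], or 0.0 if undefined."""
        denom = self.values.get(denominator, 0)
        return self.values.get(numerator, 0) / denom if denom else 0.0


@dataclass
class Team:
    """Information shared by every player on the team."""
    name: str
    weather: Optional[str] = None
    opposing_pitcher: Optional[str] = None


@dataclass
class BaseballPlayer:
    """Fixed identity plus references to the more transitory objects."""
    player_id: int
    last_name: str
    first_name: str
    position: str
    team: Team
    opponent: Team
    historical: Stats = field(default_factory=Stats)
    projected: Stats = field(default_factory=Stats)


# Made-up sample data.
home = Team("Home", weather="windy", opposing_pitcher="J. Doe")
away = Team("Away")
player = BaseballPlayer(
    1, "Smith", "Al", "SS", home, away,
    historical=Stats({"hits": 30, "at_bats": 100}),
)
batting_average = player.historical.ratio("hits", "at_bats")
```

The constructor still takes several arguments, but each one is a meaningful object, and the stat calculations live on Stats rather than in free functions.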
