How to model a social news feed on Google App Engine - python

We want to implement a "news feed" where a user can see messages broadcast by her friends, sorted newest first. But the feed should reflect changes in her friends list: if she adds new friends, messages from those friends should be included in the feed, and if she removes friends, their messages should no longer be included. If we follow the pubsub-test example and attach a recipient list to each message, this means a lot of manipulation of each message's recipient list whenever users connect and disconnect friends.
We first modeled publish-subscribe "fan-out" using conventional RDBMS thinking. It seemed to work at first, but because of the way the IN operator works, we quickly realized we couldn't continue on that path. We found Brett Slatkin's presentation from last year's Google I/O and have now watched it a few times, but it isn't clear to us how to apply it to "dynamic" recipient lists.
What we need are some hints on how to "think" when modeling this.

Pasting the answer I got for this question in the Google App Engine Google Group, http://groups.google.com/group/google-appengine/browse_thread/thread/09a05c5f41163b4d#, by Ikai L (Google):
A couple of thoughts here:
Is removing friends a common event? Similarly, is adding friends a common event? (All relative to "reads" of the news feed.)
From what I remember, the only way to make heavy reads scale is to write the data multiple times into people's streams. Twitter does this, from what I remember, using an "eventually consistent" model. This is why your feed will not update for several minutes when they are under heavy load. The general consensus, though, is that a relational, normalized model simply will not work.
The Jaiku engine is open source for your study: http://code.google.com/p/jaikuengine. This runs on App Engine.
Hope these help when you're considering a design.
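A minimal in-memory sketch of the fan-out-on-write idea described above, with the "dynamic recipient list" handled at the feed index rather than on each message. The dicts stand in for datastore entities (a Message kind and a per-user feed index); the names `publish`, `unfriend`, and `read_feed` are illustrative, not App Engine API:

```python
from collections import defaultdict

# In-memory stand-ins for datastore entities.
messages = {}                 # message_id -> (author, text, timestamp)
feeds = defaultdict(list)     # user -> list of message_ids (the "inbox")
followers = defaultdict(set)  # author -> set of users following them

def publish(message_id, author, text, timestamp):
    """Write the message once, then fan its key out to each follower's feed."""
    messages[message_id] = (author, text, timestamp)
    for user in followers[author]:
        feeds[user].append(message_id)

def unfriend(user, ex_friend):
    """Dynamic recipient lists: drop the ex-friend's keys from this one feed,
    instead of editing the recipient list on every message."""
    followers[ex_friend].discard(user)
    feeds[user] = [m for m in feeds[user] if messages[m][0] != ex_friend]

def read_feed(user):
    """Reads stay cheap: fetch keys from the precomputed index, newest first."""
    return sorted(feeds[user], key=lambda m: messages[m][2], reverse=True)
```

The point of the design is that the expensive work (copying keys) happens on the rare writes, while the frequent reads are a single per-user lookup, which is the trade-off Ikai's answer describes.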

Related

How to generate mission test scenarios

I'm working on software that deals with drones.
My team introduced a server to allow command and control activities with multiple drones.
Now, I'd like to test its API and create a python module for automated testing.
The API includes actions like add marker, delete marker, and so on, mirroring what you can do in the app.
I've been researching whether there is a tool that lets me randomize these actions automatically, to create scenarios that imitate user actions.
For example:
check the license, add mission, add a marker, fly to position and delete Marker.
Each of those actions is a request sent to the server within the app, but I've already recreated those activities as functions in Python. The server actions have also been written in Python (the server is Tornado). Now I just need a way to randomize their activation (the data they send to the server is already generated randomly and validly, so that's not a problem).
So before spending a lot of time creating these scenarios by hand, I figured someone has already faced this kind of problem. I couldn't find it here, though; I searched for hours, but there are so many questions that I might have missed something related to my issue.
I can build such a tool myself and even share a git repository for it here if it comes to that; then it would be helpful to anyone else encountering this question.
I thought it would be worth asking anyway.
Let me know if there are any other details you need to know to answer this question.
Thanks!
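A minimal sketch of what such a randomizer could look like. The action functions below are hypothetical stand-ins for the real API wrapper functions mentioned in the question (add marker, delete marker, etc.); the key idea is seeding the generator so a failing scenario can be replayed exactly:

```python
import random

# Hypothetical stand-ins for the question's real request-sending functions.
def check_license(log): log.append("check_license")
def add_mission(log):   log.append("add_mission")
def add_marker(log):    log.append("add_marker")
def fly_to(log):        log.append("fly_to")
def delete_marker(log): log.append("delete_marker")

ACTIONS = [check_license, add_mission, add_marker, fly_to, delete_marker]

def random_scenario(length, seed=None):
    """Build a reproducible random sequence of actions; the same seed
    always yields the same scenario, so failures can be replayed."""
    rng = random.Random(seed)
    return [rng.choice(ACTIONS) for _ in range(length)]

def run(scenario):
    """Execute the scenario; in the real module each call hits the server."""
    log = []
    for action in scenario:
        action(log)
    return log
```

For heavier-duty randomized testing of stateful APIs, the Hypothesis library's stateful testing (`RuleBasedStateMachine`) automates exactly this kind of scenario generation, and also shrinks a failing action sequence to a minimal reproduction.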

How to build a real time recommendation engine with good performance?

I am a data analyst and was just assigned to build a real-time recommendation engine for our website.
I need to analyse visitor behavior and do real-time analysis of that input, so I have three questions about this project.
1) Users are not forced to sign up. Is there any methodology to capture user behavior such as search and visit history?
2) The recommendation models can be pre-trained, but the prediction process takes time. How can we improve the performance?
3) I only know how to write Python scripts. How can I implement the recommendation engine with my Python scripts?
Thanks.
===============
However, 90% of our customers purchase products during their first visit and will not come back soon,
so we cannot prepare a ready-made model for new visitors.
And they prefer to use item-based collaborative filtering (itemCF) for the recommendation engine.
It sounds like mission impossible now...
This is quite a broad question; however, I will do my best to answer:
Visit history can be tracked by enabling some form of analytics tracking on your domain. This can be a pre-built solution that you implement, which will provide a detailed overview of all visitors to your domain, usually with some form of dashboard. Most pre-built solutions provide a way to export the analytics that have been collected.
Another way would be to use browser cookies to store information pertaining to each visit to your domain and the user's search/page history. This information is available to the website whenever the user visits it from the same browser. When the user visits your website, you could send the information to a server/REST endpoint which could analyse it (IP/geolocation/number of visits/search/page history) and make recommendations based on that. Another common method is to track past purchases, etc.
To improve performance, one solution would be to always have the prediction model for a particular user ready for when they next visit the site. That way, there is no delay. However, the first time a user visits you likely won't have enough information to make detailed predictions, so you will have to resort to providing options based on geolocation (which shouldn't take too long and won't impact performance).
There is another approach. The above mainly concerns predictions based on a user's browsing behavior; content-based filtering instead recommends things that are similar to an item the user is currently viewing. This approach is generally easier, as it just requires that you query a database for items that are similar in category, purpose/use, etc.
There is no getting around using JavaScript for the client-side work, but your recommendation engine can be built in Python (it could be a simple REST API endpoint with access to the items database). Most people use Flask, Django, or Eve to implement REST APIs in Python.
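A minimal sketch of the content-based filtering idea from the answer: rank other items by how many attributes they share with the one being viewed. The catalog, its fields, and the route in the comment are all made up for illustration; a real catalog would live in a database:

```python
# Hypothetical item catalog; in practice this is a database table.
CATALOG = [
    {"id": 1, "category": "shoes", "use": "running"},
    {"id": 2, "category": "shoes", "use": "hiking"},
    {"id": 3, "category": "shoes", "use": "running"},
    {"id": 4, "category": "hats",  "use": "hiking"},
]

def similar_items(item_id, catalog=CATALOG, top_n=2):
    """Rank other items by how many attributes they share with item_id."""
    viewed = next(i for i in catalog if i["id"] == item_id)
    def score(other):
        return sum(other[k] == viewed[k] for k in ("category", "use"))
    candidates = [i for i in catalog if i["id"] != item_id]
    candidates.sort(key=score, reverse=True)
    return [i["id"] for i in candidates[:top_n]]

# Exposed via e.g. a Flask route (sketch only):
#   @app.route("/recommend/<int:item_id>")
#   def recommend(item_id):
#       return jsonify(similar_items(item_id))
```

Because this needs no per-user history, it also sidesteps the "90% are first-time visitors" problem raised in the question update.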

How to implement a Nadex autotrading bot?

I have been searching for a way to autotrade on Nadex
https://www.nadex.com
and came across this script https://github.com/FreeTheQuarks/NadexBot
It is an old script and I am not that experienced in Python.
Q1: Is this a good way to go about it, given that it is not an official API and is probably scraping data from the site, which would mean slower requests and trade execution?
There is also an unofficial API client: https://github.com/knoguchi/nadex/tree/master/nadex
But again, I'm not sure if it's good for live trading.
Q2: Are there better ways to go about this, and if so, where should I start?
I'm the author of the unofficial Nadex API Python client.
I still maintain it; streamer support was added recently.
However, I suggest that you study the JavaScript from nadex.com. It's always up to date and works just like the official web site, obviously.
The JS code is professionally written and very readable. There are about 100 JavaScript files, but only a handful are essential for API access.
Nadex is part of IG Group, hence the JS makes heavy use of the IG namespace. IG offers an API and documentation for developers. Nadex's message format is a little different from IG's, but the design is the same. Once you have read the documentation, all the JavaScript code is really easy to understand.
A1: Measure twice before one cut
Simply put, it is your money you trade with, so warnings like this one (from FreeTheQuarks):
(cit.:) "This was the first non-trivial program I wrote. It hasn't received any significant updates in years. I've only made minor readability updates after first putting this on git."
should give one sufficient reason to re-think the risks before putting a first dollar on the table.
This is a for-profit game, isn't it?
A2: Yes, there is a better way.
All quantitatively supported trading strategies need stable and consistent care, i.e. one needs to have:
rock-solid historical data
a stable API to work with (trade execution & management, market events)
a reliable & performant testbed for validating one's own quantitative model of the trading strategy
Having shared this piece of some 100+ man-years of experience, one may decide on one's own whether to rely on, or rather forget about, starting any serious work from mere reverse-engineering efforts on an "un-official API client".
In case the profit-generation capability of a trading strategy supports the case, one may safely order the technical implementation and integration as an outsourced, turn-key effort.
Epilogue: If there are quantitatively supported reasons to implement a trading strategy, the profit envelope thereof sets the ultimate economically viable model for having that strategy automated and operated in-vivo. Failure to decide in this order of precedence results in nothing but wasted time & money. Your money.
I was looking into the Nadex platform recently too. I wrote a small wrapper over the OANDA foreign-exchange broker API v1 in Python (they now have v2.0), so I have some experience.
Implementing an autotrading bot is a big question, but to try to answer it: you may either use a pre-existing wrapper for the Nadex API (it looks like Python or JavaScript are your choices) or write one yourself, in a language of your preference.
If you want to start from scratch: I believe Nadex offers a RESTful service, which basically means you can make GET, POST, DELETE, and other types of requests to specific URLs (most of the time there is a "base" URL from which the other endpoints spawn). I would first try to find the endpoints to the Nadex servers; Kenji's unofficial API should point in the right direction there, since he is using URL strings and has a class for making different requests. I was unsuccessful in finding any documentation for the Nadex API myself, but Kenji's wrapper and the JavaScript API both look promising.
Depending on the depth of the market and the number of requests, I think you are correct that you wouldn't want a web scraper for something like this. It would be very slow (and probably wasteful of time) compared to using an existing wrapper.
I would start by writing classes and/or functions that make simple requests to the Nadex RESTful endpoints, for example a function that logs in and accesses account data. The next step would be to retrieve market data and eventually stream it into a trade-logic algorithm that makes decisions for you.
If you want to build a trading bot easily, with most of the work already done for you, I would recommend one of the other answers here. That way, you can use their predefined classes/functions and have the "boring" API-access code written for you, ready to use.
Hope that helps or leads you in the right direction!
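A minimal sketch of the kind of wrapper described above. The base URL, endpoint paths, and field names are placeholders, not real Nadex routes; a real client would take them from the reverse-engineered JS or Kenji's wrapper. The transport is injectable so the client can be exercised without a network connection:

```python
import json
from urllib import request

class RestClient:
    """Skeleton REST wrapper for a broker API; all routes are hypothetical."""

    def __init__(self, base_url, transport=None):
        self.base_url = base_url.rstrip("/")
        self.token = None
        # Injectable transport makes the client testable without a network.
        self.transport = transport or self._http_transport

    def _http_transport(self, method, url, body):
        req = request.Request(url, data=body, method=method)
        req.add_header("Content-Type", "application/json")
        with request.urlopen(req) as resp:
            return json.loads(resp.read())

    def _call(self, method, path, payload=None):
        body = json.dumps(payload).encode() if payload is not None else None
        return self.transport(method, self.base_url + path, body)

    def login(self, username, password):
        # Placeholder route and fields; the real login endpoint differs.
        reply = self._call("POST", "/session", {"user": username, "pw": password})
        self.token = reply.get("token")
        return self.token

    def account(self):
        return self._call("GET", "/account")
```

From here, the next steps the answer suggests (market data, streaming into trade logic) are just more `_call` wrappers layered on the same skeleton.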

Big MySQL query versus an HTTP POST connection in terms of long-term speed

Right now I'm stuck between two main choices for grabbing a user's friends list.
The first is a direct connection with Facebook: pull the friends list and create a list of friend models from the JSON. (This takes quite a while whenever I try it out, around 2 seconds.)
The other is that whenever a user logs in, the program stores his or her entire friends list as friend models (note that even if two people have exactly the same friends, two sets are still stored; each friend model has an FK back to the person whose list it belongs to).
Whenever a user needs his or her friends list, I just use Django's filter to grab them.
Right now this is pretty fast, but that's because it hasn't been tested with many people yet.
Based on your experience, which of these two approaches makes the most sense long term?
Thank you
It depends a lot on what you plan on doing with the data. However, thinking long term you're going to have much more flexibility with breaking out the friends into distinct units than just storing them all together.
If the friend creation process is taking too long, you should consider off-loading it to a separate process that can finish it in the background, using something like Celery.
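A sketch of what that off-loaded step could look like. The parsing helper below is plain Python; the Celery wiring in the comment uses the standard `shared_task`/`delay` pattern, and the `Friend` model name and field names are assumptions based on the question:

```python
import json

def friends_to_rows(owner_id, friends_json):
    """Parse Facebook's friends payload into rows ready for a bulk insert."""
    payload = json.loads(friends_json)
    return [
        {"owner_id": owner_id, "fb_id": f["id"], "name": f["name"]}
        for f in payload.get("data", [])
    ]

# With Celery, the slow insert runs in the background after login:
#   from celery import shared_task
#
#   @shared_task
#   def import_friends(owner_id, friends_json):
#       rows = friends_to_rows(owner_id, friends_json)
#       Friend.objects.bulk_create(Friend(**r) for r in rows)
#
# and the login view just enqueues it:
#   import_friends.delay(user.id, facebook_response_body)
```

The login request then returns immediately while the worker fills in the friend models, and subsequent page loads hit the fast Django filter path.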

Psych Experiment in Python (w/Django) - how to port to interactive web app?

I'm writing a psychology experiment in Python, and I need to make it available as a web app. I've already got the Python basically working as a command-line program. On the recommendation of a CS buddy I'm using Django with a sqlite db. This is also working, my development server is up and the database tables are ready and waiting.
What I don't understand is how to glue these two pieces together. The Django tutorials I've found are all about building things like blogs, messaging systems, or polls: systems based on sending form data. I can't do that, because I'm timing responses to presented stimuli in milliseconds - I need to build an interactive app that doesn't rely (during the exercise) on form POST data or URL changes.
In short: I have no idea how to go from my simple command line program to a "real time" interactive web application.
Maximum kudos for links to relevant tutorials! I will also really appreciate a high-level explanation of the concept I'm missing here.
(FYI, I asked a previous question (choice of database) about this project here)
You are going to need to use HTML/JavaScript, and then you can collect and send the results to the server. The results can be gamed, though, since the code for the exercise is going to be client-side.
Edit: I recommend a Javascript library, jQuery: http://docs.jquery.com/Tutorials
Edit 2:
I'll be a bit more specific: you need at least two models in Django, Exercise and ExecutedExercise. Exercise will have fields for its name, number, etc., generic data for each exercise. ExecutedExercise will have two fields: a foreign key to Exercise, and a field storing how long the exercise took to finish.
Now in JavaScript you time the exercises and then post the timings to a Django view that handles the data storage. How to post them? You could use http://api.jquery.com/jQuery.post/ Create the data object, data = { e1: timingE1, e2: timingE2 }, and post it to the view. In that view you can read the POST parameters, create an ExecutedExercise object for each timing (you'll have the time it took for each exercise) and save them.
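A sketch of the server side of that flow. The parsing logic is shown as a plain function so the data shape is clear; the Django view in the comment is a rough outline, and the field name `duration_ms` is an assumption, not from the original answer:

```python
def parse_timings(post_params):
    """Turn jQuery-posted params like {'e1': '532', 'e2': '1204'} into
    (exercise_number, milliseconds) pairs, ignoring unrelated keys."""
    rows = []
    for key, value in sorted(post_params.items()):
        if key.startswith("e") and key[1:].isdigit():
            rows.append((int(key[1:]), int(value)))
    return rows

# In the Django view, roughly:
#   def record(request):
#       for number, ms in parse_timings(request.POST):
#           exercise = Exercise.objects.get(number=number)
#           ExecutedExercise.objects.create(exercise=exercise, duration_ms=ms)
#       return HttpResponse("ok")
```

Because only the final timings are posted, the exercise itself runs entirely client-side with no page reloads, which is what keeps the millisecond timing accurate.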
