How to count the number of links shared by a Facebook page? - python

I am working on a website for which it would be useful to know the number of links shared by a particular Facebook page (e.g., http://www.facebook.com/cocacola) so that the user can know whether they are 'liking' a firehose of information or a dribble of goodness. What is the best way to get the number of links/status updates that are shared by a particular page?
+1 for implementations that use Python (this is a Django website), but any solutions are welcome! I tried using fbconsole to accomplish this but have come up a little short.
For what it is worth, this unanswered question seems relevant. As does the fact that, as of 2012.04.18, you can export your data to CSV from the Insights management page on the Facebook site. The information is in there; I just don't know how to get it out...
Thanks for your help!

In the event that anyone else finds this useful, I thought I'd post my gist example here. fbconsole makes it fairly simple to extract data through the Facebook Graph API.
The caveat is that it was not terribly easy to programmatically extract data through fbconsole, so I wrote fbconsole.automatically_authenticate to make it much easier to access this information in a systematic way. This addition has not yet been incorporated into the master branch of fbconsole (it was just posted this morning), but it is available here in the meantime for those who are interested.
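To make the counting step concrete, here is a rough sketch of walking the Graph API's paged responses and totting up the entries. The response shape (a "data" list plus "paging.next") follows the Graph API convention, but fetch_page is a hypothetical stand-in for whatever call you actually use (e.g. fbconsole's get):

```python
def count_paged_items(first_page, fetch_page):
    """Walk Graph-API-style pagination, summing the entries in each page's "data"."""
    total = 0
    page = first_page
    while page:
        total += len(page.get("data", []))
        # Graph API responses carry the next page's URL under paging.next.
        next_url = page.get("paging", {}).get("next")
        page = fetch_page(next_url) if next_url else None
    return total

# Tiny in-memory stand-in for real API responses:
pages = {"page2": {"data": [{"id": "3"}]}}
first = {"data": [{"id": "1"}, {"id": "2"}], "paging": {"next": "page2"}}
print(count_paged_items(first, pages.get))  # 3
```

The same loop works whether the items are links, statuses, or posts; only the endpoint you start from changes.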

Related

Is there any way to know the frequency at which ads are uploaded to a webpage with python selenium?

I am doing some statistics on the rate at which people upload ads to a selling website, questions to a forum, videos to YouTube, etc. (in particular, I am mostly interested in the data related to ads). The thing is that I want to measure this rate myself (without relying on a database reported by someone else).
I have a script (in Python, also using Selenium) that performs the basic things: opening URLs and navigating.
So, is there any way to know the time at which a given advertisement, for example, was uploaded, or how many were uploaded in a certain time window? Does the webpage register any information related to this?
If the answer is "Yes", how can I access that data from Selenium?
I know it is kind of a general question, since I am not being specific about a particular webpage, so I apologize :)
I'd be glad if someone could help with that, thanks in advance. Any answer or comment would be appreciated.
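No general answer here, but assuming the site exposes a per-ad timestamp that Selenium can scrape (a `<time>` element's datetime attribute, for instance, which not every site has), counting uploads in a window is then plain Python. The timestamps below are made up for illustration:

```python
from datetime import datetime, timedelta

def ads_in_window(timestamps, end, minutes):
    """Count ISO-format timestamps falling in the `minutes` before `end`."""
    start = end - timedelta(minutes=minutes)
    stamps = [datetime.fromisoformat(t) for t in timestamps]
    return sum(start <= s <= end for s in stamps)

# Hypothetical values scraped from the page:
end = datetime(2023, 1, 1, 12, 0)
scraped = ["2023-01-01T11:50:00", "2023-01-01T11:00:00", "2023-01-01T11:59:30"]
print(ads_in_window(scraped, end, 30))  # 2
```

If the site only shows relative times ("3 hours ago"), you would first have to convert those strings to absolute timestamps yourself.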

Python -- Create a Webform

I'm working on a project to try and create a more streamlined process to enter data into our database.
Currently, we're just using raw_input("Question: "), but this seems very outdated and is prone to mistakes. Given that there are over 100 questions we need to answer, this can take quite some time, loading only one question at a time.
I was interested in creating a web-based form to do this instead, as individuals wouldn't need to install Python and various libraries on their own computers, but could instead connect to an IP address on our network.
However, they're going to be using this to input medical data into our database, so I need some sort of secure-login feature, and only after being "validated" is a technician redirected to our medical form. I saw that Flask-WTF might be able to solve this issue, but its documentation is a little confusing to me.
I'm wondering exactly how to do this. I've been looking at Flask-WTF (recommended on another post), but I don't need the ability to upload documents, only to get data from a text box, or to see if someone has selected multiple boxes (i.e., if the person indicates they have both cancer AND diabetes).
Likewise, I'm wondering if I can create a Google Form, download it, and host it on an internal server. However, I'm curious how I can retrieve the data with Python. I saw this post, but it's only for a one-question form, and I have at least 100 questions on this form.
If you think creating a webform is too difficult for someone who has no major Python web experience, nor experience with HTML/PHP (I've mostly worked with databases using Elasticsearch and some, albeit very basic, Python-based AI/chatterbots), would you recommend creating a form using Tkinter instead? If so, how do I save form inputs as variables, and how do I make it look a little prettier?
My apologies that I don't have code here, but rather a series of questions. As I continue to work on this project, I will probably post snippets of code to this site.
Best!
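Not a full answer, but as a minimal sketch of what a form-handling backend has to do, here is the core of reading a submitted form with only the standard library; the field names are hypothetical. Flask-WTF layers validation and CSRF protection over this same idea:

```python
from urllib.parse import parse_qs

def read_form(body):
    """Turn an application/x-www-form-urlencoded POST body into a dict,
    keeping multi-valued fields (e.g. checkbox groups) as lists."""
    fields = parse_qs(body, keep_blank_values=True)
    # Collapse single-valued fields to plain strings for convenience.
    return {k: v if len(v) > 1 else v[0] for k, v in fields.items()}

# Hypothetical submission with a checkbox group named "condition":
answers = read_form("notes=stable&condition=cancer&condition=diabetes")
print(answers["condition"])  # ['cancer', 'diabetes']
print(answers["notes"])      # stable
```

A hundred questions is just a hundred keys in that dict; the part that genuinely needs a framework is the secure login, which is where Flask-WTF (plus something like Flask-Login) earns its keep.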

how do I track how many users visit my website

I just deployed my first ever web app, and I am curious whether there is an easy way to track every time someone visits my website. I'm sure there is, but how?
Easy as pie: use Google Analytics. You just have to include a tiny script in your app's pages:
http://www.google.com/analytics/
PythonAnywhere dev here. You also have your access log; you can click through to this from your web app tab. It shows you the raw data about your visitors. I would personally also use something like Google Analytics. However, you don't need to do anything to be able to just see your raw visitor data; it's already there.
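As a rough illustration of what you can do with that raw access log yourself (assuming the common/combined log format, where the client IP is the first whitespace-separated field on each line), a few lines of Python give you a unique-visitor count:

```python
def unique_visitors(log_lines):
    """Count distinct client IPs in common-log-format lines."""
    ips = set()
    for line in log_lines:
        parts = line.split()
        if parts:
            ips.add(parts[0])  # first field is the client IP
    return len(ips)

sample = [
    '203.0.113.5 - - [10/Oct/2012:13:55:36 +0000] "GET / HTTP/1.1" 200 2326',
    '203.0.113.5 - - [10/Oct/2012:13:56:01 +0000] "GET /about HTTP/1.1" 200 881',
    '198.51.100.7 - - [10/Oct/2012:14:02:12 +0000] "GET / HTTP/1.1" 200 2326',
]
print(unique_visitors(sample))  # 2
```

Counting distinct IPs undercounts users behind shared NATs and overcounts users on changing addresses, but it is a serviceable first approximation straight from the log.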
I know from my own experience that people are obsessed with traffic and statistics, and with looking at other sites and tracking their stats. And where there is enough demand, of course, there are sites to satisfy you. I wanted to put those sites and tools together in one list, because this field was really unclear to me at first: I didn't know what Google PageRank, Alexa, Compete, or Technorati rankings meant. I must say these stats are not always precise, but they at least give an overview of how popular a certain page is and how many visitors it gets, and if you compare those stats with your own site's statistics, you can get pretty precise results.
http://www.stuffedweb.com/3-tools-to-track-your-website-visitors/
http://www.1stwebdesigner.com/design/10-ways-how-to-track-site-traffic-popularity-statistics/
I am a huge fan of Cloudflare's analytics. It is super easy to set up, and you don't have to worry about adding a JavaScript blurb to each page. Cloudflare is also able to track visits to your page that never load the JavaScript.
http://www.cloudflare.com

Code for web crawling with Python 2.7.3 in mac terminal?

I am a social scientist and a complete newbie/noob when it comes to coding. I have searched through the other questions/tutorials but am unable to get the gist of how to crawl a news website targeting the comments section specifically. Ideally, I'd like to tell python to crawl a number of pages and return all the comments as a .txt file. I've tried
from bs4 import BeautifulSoup
import urllib2
url="http://www.xxxxxx.com"
and that's as far as I can go before I get an error message saying bs4 is not a module. I'd appreciate any kind of help on this, and please, if you decide to respond, DUMB IT DOWN for me!
I can run wget on terminal and get all kinds of text from websites which is awesome IF I could actually figure out how to save the individual output html files into one big .txt file. I will take a response to either question.
Try Scrapy. It is a fast high-level screen scraping and web crawling framework, used to crawl websites and extract structured data from their pages. It can be used for a wide range of purposes, from data mining to monitoring and automated testing.
You will most likely encounter this as you go, but in some cases, if the site is employing 3rd party services for comments, like Disqus, you will find that you will not be able to pull the comments down in this manner. Just a heads up.
I've gone down this route before and have had to tailor the script to a particular site's layout/design/etc.
I've found libcurl to be extremely handy, if you don't mind doing the post-processing using Python's string handling functions.
If you don't need to implement it purely in Python, you can make use of wget's recursive mirroring option to handle the content pull, then write your python code to parse the downloaded files.
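As a sketch of that parsing step using only the standard library (so the bs4 import problem doesn't bite), here is one way to pull the text out of elements marked with a comment class and collect it for writing to a .txt file. The class name "comment" is a guess; inspect the real site's markup and adjust:

```python
from html.parser import HTMLParser

class CommentExtractor(HTMLParser):
    """Collect text found inside elements whose class list contains "comment".

    Assumes reasonably well-formed HTML (every opened tag is closed)."""
    def __init__(self):
        super().__init__()
        self.depth = 0       # > 0 while inside a comment element
        self.comments = []

    def handle_starttag(self, tag, attrs):
        classes = dict(attrs).get("class", "") or ""
        if self.depth or "comment" in classes.split():
            self.depth += 1

    def handle_endtag(self, tag):
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth and data.strip():
            self.comments.append(data.strip())

html = '<div class="comment"><p>First!</p></div><p>not a comment</p>'
p = CommentExtractor()
p.feed(html)
print(p.comments)  # ['First!']
```

Once wget has mirrored the pages, you would feed each downloaded file through an extractor like this and append `"\n".join(p.comments)` to your output .txt file.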
I'll add my two cents here as well.
The first things to check are that you installed Beautiful Soup and that it lives somewhere it can be found. There are all kinds of things that can go wrong here.
My experience is similar to yours: I work at a web startup, and we have a bunch of users who register, but give us no information about their job (which is actually important for us). So my idea was to scrape the homepage and the "About us" page from the domain in their email address, and try to put a learning algorithm around the data that I captured to predict their job. The results for each domain are stored as a text file.
Unfortunately (for you...sorry), the code I ended up with was a bit complicated. The problem is that you'll end up getting a lot of garbage when you do the scraping, and you'll have to filter it out. You'll also end up with encoding issues, and (assuming you want to do some learning here) you'll have to get rid of low-value words. The total code is about 1000 lines, and I'll post some important pieces that may help you out here, if you're interested.

How to get number of visitors of a page on GAE?

I need to get the number of unique visitors (say, for the last 5 minutes) that are currently looking at an article, so I can display that number and sort the articles by most popular.
e.g., similar to how most forums display 'There are n people viewing this thread'
How can I achieve this on Google App Engine? I am using Python 2.7.
Please try to explain in a simple way because I recently started learning programming and I am working on my first project. I don't have lots of experience. Thank you!
Create a counter (a property within an entity) and increase it transactionally for every page view. If you have more than a few pageviews a second, you need to look into sharded counters.
There is no way to tell when someone stops viewing a page unless you use Javascript to inform the server when that happens. Forums etc typically assume that someone has stopped viewing a page after n minutes of inactivity, and base their figures on that.
For minimal resource use, I would suggest using memcache exclusively here. If the value gets evicted, the count will be incorrect, but the consequences of that are minimal, and other solutions will use a lot more resources.
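To make the memcache suggestion concrete, here is the core bookkeeping, shown with a plain dict so it runs anywhere; on App Engine you would back last_seen with memcache entries that expire after the window, accepting that an eviction just makes the count temporarily low:

```python
import time

class RecentViewers:
    """Track 'n people viewing this' as visitors seen within a sliding window."""
    def __init__(self, window_seconds=300):     # 5 minutes
        self.window = window_seconds
        self.last_seen = {}                     # visitor id -> last hit time

    def record(self, visitor_id, now=None):
        """Call on every page view (visitor_id could be a session cookie)."""
        self.last_seen[visitor_id] = time.time() if now is None else now

    def count(self, now=None):
        """Number of distinct visitors seen within the window."""
        now = time.time() if now is None else now
        cutoff = now - self.window
        # Drop stale entries while counting (memcache would expire these itself).
        self.last_seen = {v: t for v, t in self.last_seen.items() if t >= cutoff}
        return len(self.last_seen)

rv = RecentViewers()
rv.record("alice", now=0)
rv.record("bob", now=100)
rv.record("alice", now=200)   # alice refreshes the page
print(rv.count(now=350))      # 2 (both seen within the last 300s)
```

This sidesteps the "when did they stop viewing?" problem the same way forums do: nobody is ever marked as gone, they simply age out of the window.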
Have you considered the Google Analytics service for getting statistics? Read this article about real-time monitoring using the service. Please note: a special script must be embedded on every page you want to monitor.
