Firstly, apologies for the very basic question. I have looked into other answers but they haven't quite answered what I'm after. I'm confident designing a site in HTML/CSS and have very very basic knowledge of Python.
I want to run a very basic Python script on my website. It analyses tweets about a specific topic, and then posts a sentiment analysis score. I want it to run this sentiment analysis every hour and cache the score.
I have a working Python script which does this in Jupyter Notebook. Could you give me an overview of how I would make this script function online and cache the results? I've read into using Python web frameworks, but from my limited understanding, they seem like overkill?
Thank you for your help!
Could you give me an overview of how I would make this script function online
The key thing would be to decouple the two parts of your system:
Producing the data
Displaying it on a website.
So the first thing to do is have your sentiment-analysis script push its value to a database. The database could be something as simple as a CSV file, or it could be a key/value store, or something like MySQL or CouchDB (or hundreds of other choices).
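For instance, here is a minimal sketch of the producer side, assuming SQLite as the store (any of the options above would do) and a hypothetical get_sentiment_score() standing in for your existing analysis:

    # Producer sketch: write the latest sentiment score to SQLite.
    # get_sentiment_score() is a placeholder for your Jupyter analysis.
    import sqlite3
    import time

    def get_sentiment_score():
        return 0.42  # placeholder value

    def store_score(db_path="scores.db"):
        conn = sqlite3.connect(db_path)
        conn.execute("CREATE TABLE IF NOT EXISTS scores (ts INTEGER, score REAL)")
        conn.execute("INSERT INTO scores (ts, score) VALUES (?, ?)",
                     (int(time.time()), get_sentiment_score()))
        conn.commit()
        conn.close()

    if __name__ == "__main__":
        store_score()

A crontab entry such as "0 * * * * python store_score.py" would then refresh the cached score every hour.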
Over on the website you have to make a decision between:
Server-side
Client-side
If the former, you could program in Python if that is what you are most familiar with. Whatever language/framework combination you go for, there will be an example tutorial showing how to read a value from a database and display it: it is just about the most fundamental thing.
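To give an idea of scale, here is a minimal server-side sketch, assuming Flask (one choice among many) and the SQLite store sketched earlier:

    # Server-side sketch with Flask: read the newest score and return it.
    import sqlite3
    from flask import Flask, jsonify

    app = Flask(__name__)

    @app.route("/score")
    def latest_score():
        conn = sqlite3.connect("scores.db")
        row = conn.execute(
            "SELECT ts, score FROM scores ORDER BY ts DESC LIMIT 1"
        ).fetchone()
        conn.close()
        if row is None:
            return jsonify(error="no score yet"), 404
        return jsonify(timestamp=row[0], score=row[1])

    if __name__ == "__main__":
        app.run()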
If client-side you will usually be programming in JavaScript. Again you need to choose a framework, but again you should easily be able to find a tutorial to follow.
(Unless you have a good reason to prefer server-side, such as familiarity with an existing framework, or security issues with accessing your database, I'd go with a client-side approach.)
I've read into using Python web frameworks... overkill?
Yes and no. You are going to need some kind of database, and some kind of framework. It would be good to understand the basics of web security, too. If the sentiment analysis is your major goal, all that is going to be a distraction, and it might be better to find a friend who already knows web programming to work with. Or just find a tutorial that is very close to what you want to do, and adapt that.
(P.S. I was going to flag your question as "too broad", but you did ask for an overview, so I hope this helps.)
I am doing some statistics on the rate at which people upload ads to a selling website, questions to a forum, videos to YouTube, etc. (in particular, I am mostly interested in the data related to ads). The thing is that I want to measure this rate myself (without relying on a database reported by someone else).
I have a script (in Python, using Selenium) that performs the basic tasks of opening URLs and navigating.
So, is there any way to know the time at which a given advertisement, for example, was uploaded, or how many were uploaded in a certain time window? Is any related information registered in the webpage?
If the answer is "Yes", how can I access that data from Selenium?
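For example, the kind of thing I imagine is reading an attribute off an element, along these lines (the URL and CSS selector below are made up; the real ones would depend entirely on the page):

    # Hypothetical sketch: if the page stored the upload time in a
    # <time> element, its datetime attribute could be read like this.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Firefox()
    driver.get("https://example.com/some-ad")  # stand-in URL

    # Made-up selector; inspect the real page to find the right one.
    element = driver.find_element(By.CSS_SELECTOR, "time.upload-date")
    print(element.get_attribute("datetime"))

    driver.quit()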
I know it is kind of a general question since I am not being specific about a particular webpage, so I apologize :)
I'd be glad if someone could help with that, thanks in advance. Any answer or comment would be appreciated.
I'm not a programmer (I'm starting to learn some coding, but I'm an absolute beginner).
Thing is that my work has lots of tasks that could be automated with scripting. I have to search lots of companies in LinkedIn Sales Navigator, which leads me to absolute boredom and alienation. I have to grab a spreadsheet and copy and paste every day, and I feel kind of stupid doing something that could be easily automated.
I have thought of automating this stuff, but I don't know where to start. A friend of mine told me that Stack Overflow could help me figure out where to begin. I did my research but didn't find anything useful; maybe this is due to my lack of knowledge.
Summarizing: I need guidance to create a script that copies a spreadsheet field into a browser field, extracts the results, and pastes them into the adjacent fields of the same spreadsheet, with the info extracted from LinkedIn. Where can I start? Thanks for your understanding.
Python is perfect for the task described above. I would recommend looking at the following:
Learn Python the Hard Way: http://learnpythonthehardway.org/
Automate the Boring Stuff with Python (book)
These should get you started.
Here is a description of a tool that automates the extraction of leads and accounts:
https://medium.com/p/exporting-leads-informations-from-linkedin-sales-navigator-d38d3602e5b3
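To give a flavour of the spreadsheet half of the task, here is a minimal sketch using openpyxl (one of several Python spreadsheet libraries). The file name, column layout, and look_up_company() helper are all made up; the LinkedIn lookup itself would go inside that function:

    # Spreadsheet sketch with openpyxl: read a company name from column A,
    # write the looked-up result into column B of the same row.
    from openpyxl import load_workbook

    def look_up_company(name):
        # Placeholder for the LinkedIn lookup (browser automation, API, ...).
        return "result for " + name

    wb = load_workbook("companies.xlsx")  # hypothetical file
    ws = wb.active

    for r in range(2, ws.max_row + 1):  # assume row 1 is a header
        name = ws.cell(row=r, column=1).value
        if name:
            ws.cell(row=r, column=2).value = look_up_company(name)

    wb.save("companies.xlsx")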
I am working on a website for which it would be useful to know the number of links shared by a particular Facebook page (e.g., http://www.facebook.com/cocacola) so that the user can know whether they are 'liking' a firehose of information or a dribble of goodness. What is the best way to get the number of links/status updates that are shared by a particular page?
+1 for implementations that use Python (this is a Django website), but any solutions are welcome! I tried using fbconsole to accomplish this but I have come up a little short.
For what it is worth, this unanswered question seems relevant. As does the fact that, as of 2012.04.18, you can export your data to CSV from the Insights management page on the Facebook site. The information is in there; I just don't know how to get it out...
Thanks for your help!
In the event that anyone else finds this useful, I thought I'd post my gist example here. fbconsole makes it fairly simple to extract data through the Facebook Graph API.
The caveat is that it was not terribly easy to programmatically extract data through fbconsole, so I wrote fbconsole.automatically_authenticate to make it much easier to access this information in a systematic way. This addition has not yet been incorporated into the master branch of fbconsole (it was just posted this morning), but it is available here in the meantime for those who are interested.
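For orientation, the underlying Graph API call looks roughly like the sketch below, written with plain requests instead of fbconsole. The access token is a placeholder, and the exact endpoints and fields have changed across Graph API versions:

    # Hedged sketch of a Graph API request using the requests library.
    # ACCESS_TOKEN is a placeholder; endpoints vary by API version.
    import requests

    ACCESS_TOKEN = "YOUR_ACCESS_TOKEN"
    PAGE = "cocacola"

    resp = requests.get(
        "https://graph.facebook.com/{}/posts".format(PAGE),
        params={"access_token": ACCESS_TOKEN},
    )
    posts = resp.json().get("data", [])
    print("{} recent posts".format(len(posts)))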
I've been meaning to learn a language other than Java, so I started to poke around with Python. I've gone over Dive Into Python, so I have a decent knowledge of Python now.
Where do you suggest I go from here? I don't want to go through another advanced book and would like to use my Python knowledge towards building 'something'.
I've heard that Python is good for web crawling; however, I did not see that covered in Dive Into Python. Can the community suggest how to use my Python knowledge towards web crawlers or spiders?
That really kind of depends on what you enjoy, or would like to build. Since you haven't said, I'll recommend something I enjoyed instead. Programming Collective Intelligence by Toby Segaran is a fun book, and the examples are all in Python. It might be more interesting to you -- if nothing else, it would give your web crawler something to do with the pages it gathers.
Edit: Fusspawn's suggestion of PyGame is very good, if you don't want any more books and just want to "dive in" to something.
You can try my Building Skills in OO Design.
http://homepage.mac.com/s_lott/books/oodesign.html
If you like math, try improving your Python by solving Project Euler problems. Each problem doesn't take much code, and working through them helped me increase my Python skills.
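For instance, the very first Project Euler problem (sum all the multiples of 3 or 5 below 1000) fits in one line of Python:

    # Project Euler, Problem 1: sum of all multiples of 3 or 5 below 1000.
    print(sum(n for n in range(1000) if n % 3 == 0 or n % 5 == 0))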
I always find making a small game is a nice way to learn a language.
PyGame makes it simple and could help you learn more about Python. I suggest giving it a go if you're that way inclined.
To get started with web crawling, consider the Scrapy framework.
http://scrapy.org/
"Scrapy is a high level scraping and web crawling framework for writing spiders to crawl and parse web pages for all kinds of purposes, from information retrieval to monitoring or testing web sites."
It's still edging towards a first release, but is usable and has decent documentation.
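To show the shape of a spider, here is a minimal sketch in the style of more recent Scrapy releases (the API has evolved since the pre-1.0 version described above); it just collects every link on a page:

    # Minimal Scrapy spider sketch: yield every link found on the page.
    # Run with: scrapy runspider link_spider.py -o links.json
    import scrapy

    class LinkSpider(scrapy.Spider):
        name = "links"
        start_urls = ["https://example.com"]  # stand-in URL

        def parse(self, response):
            for href in response.css("a::attr(href)").getall():
                yield {"link": href}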
For very basic web scraping, check out Mechanize (for basic web "browsing") and BeautifulSoup (for parsing "html soup"):
http://wwwsearch.sourceforge.net/mechanize/
http://www.crummy.com/software/BeautifulSoup/
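As a small taste, here is a hedged sketch that fetches a page and lists its links, using the newer bs4 package (the example.com URL is just a stand-in):

    # Fetch one page and print every link, using BeautifulSoup to parse.
    from urllib.request import urlopen
    from bs4 import BeautifulSoup

    html = urlopen("https://example.com").read()
    soup = BeautifulSoup(html, "html.parser")

    for link in soup.find_all("a"):
        print(link.get("href"), link.get_text(strip=True))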
One fun thing to do would be to combine these interests with some natural language processing projects. The NLTK book recently published by O'Reilly is available online as well:
http://www.nltk.org/book
Lots of fun to be had combining these interests. :-)
If you want to expand beyond web crawling and don't want to start your own project (or don't know what to do), check out The Python Challenge. It's a game where you have to solve puzzles with a bit of Python code. I really enjoyed it.
Is web crawling something you want to do, or just something you think you can accomplish? Python is a good tool for web crawling (see here and here), but if you really just want ANY project to work on to get more familiar with the language/APIs, I'd suggest you pick a project that you have a general interest in regardless. That way it'll be easier to stick with it to fruition, as you already have an interest in the project in addition to an interest in the language.
Find an interesting open source project to participate in. You could start looking on pythonsource or sourceforge.
The Tools/webchecker/ directory, which should be in your Python distribution (otherwise you can get it via the link I gave), is a start -- with lots of limitations (no threading except in wsgui.py, no async operation, ...), but removing some of them would be a great learning experience!
A vastly superior spidering system could be built on top of Twisted, e.g. starting with the snippet at the bottom of this mail (which only gets one page, but in the proper asynchronous way!) and adding the other functionality you see exemplified in webchecker (parse and respect robots.txt, get links from pages, etc, etc).
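As a hedged stand-in for that snippet, a one-page fetch with Twisted's classic getPage helper (deprecated in newer Twisted releases in favour of twisted.web.client.Agent) looks roughly like this:

    # Asynchronous fetch of a single page with Twisted's classic getPage
    # helper (newer Twisted versions prefer twisted.web.client.Agent).
    from twisted.web.client import getPage
    from twisted.internet import reactor

    def on_page(html):
        print("fetched %d bytes" % len(html))
        reactor.stop()

    def on_error(failure):
        print(failure)
        reactor.stop()

    d = getPage(b"http://example.com")  # stand-in URL
    d.addCallbacks(on_page, on_error)
    reactor.run()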
If you want an "advanced book", I recommend Alex's Python in a Nutshell, Second Edition (I learned quite a lot from that book) and Tarek's Expert Python Programming, which, as its title says, is an advanced book. :)
For reading some open source code, I recommend SQLAlchemy and Django.
Maybe starting your own project is the best way.
Others have said it but I'll repeat: work on something you are interested in or it won't be fun.
If you do decide that a crawler would be fun, take a look at google-kongulo, a web spider plugin for Google Desktop Search. The code is quite short and well written, so this might make a good base for when you decide what you want to crawl.
If you're specifically interested in crawling the Web, check out the three-part talk called "Scrape the Web" given at PyCon 2009. It's part of this RSS feed.
Read Dive Into Python again; it discusses HTML processing and HTTP web services in chapters 8 and 11.