I'm not a programmer (I'm starting to learn some coding, but I'm an absolute beginner).
The thing is that my work has lots of tasks that could be automated with scripting. I have to search for lots of companies in LinkedIn Sales Navigator, which leads me to absolute boredom and alienation. I have to grab a spreadsheet and copy and paste every day, and I feel kind of stupid doing something that could easily be automated.
I have thought about automating this, but I don't know where to start. A friend of mine told me that Stack Overflow could point me in the right direction. I did my research but didn't find anything useful, maybe because of my lack of knowledge.
To summarize: I need guidance on creating a script that copies a spreadsheet field into a browser field, extracts the results, and pastes the information extracted from LinkedIn into the adjacent fields of the same spreadsheet. Where can I start? Thanks for your understanding.
Python is perfect for the task described above. I would recommend looking at the following:
http://learnpythonthehardway.org/
Automate the Boring Stuff with Python (book)
These should get you started.
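To give an idea of what the spreadsheet half of such a script can look like, here is a minimal sketch. It assumes the spreadsheet has been saved as a CSV file with the company name in the first column. The names lookup_company and fill_spreadsheet are made up for illustration, and lookup_company is only a placeholder that returns dummy values: the browser or API part is not shown and would have to replace it.

```python
import csv


def lookup_company(name):
    """Hypothetical placeholder: return details for one company name.

    Replace the body with the real lookup you end up using.
    """
    return {"industry": "unknown", "employees": "unknown"}


def fill_spreadsheet(in_path, out_path, lookup=lookup_company):
    """Copy each company name and write the looked-up fields next to it."""
    with open(in_path, newline="", encoding="utf-8") as src, \
            open(out_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        for row in reader:
            if not row:
                continue  # skip blank lines
            info = lookup(row[0])
            # Original name first, results in the adjacent columns.
            writer.writerow([row[0], info["industry"], info["employees"]])
```

Calling fill_spreadsheet("companies.csv", "companies_filled.csv") would then produce a second file with the extra columns filled in (both file names are examples).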
Here is a description of a tool that automates the extraction of leads and accounts:
https://medium.com/p/exporting-leads-informations-from-linkedin-sales-navigator-d38d3602e5b3
Related
I apologise in advance for the vagueness of this question - I am quite new to all this.
I am making a website that allows you to donate to various charities through the website (i.e. rather than redirecting you to that charity's website, it allows you to donate straight from my website).
I am currently focusing on getting this working for the International Rescue Committee. Their donate page is here: https://help.rescue-uk.org/
I am struggling to figure out what they are actually using to process payments. They have MaxMind, but this seems to be an anti-fraud measure.
If anyone could give me some advice for where to start in terms of automating this process, I would be really grateful!
Again apologies, I am a newb
I realise I'll probably need to use Selenium and headless browsing, which I've done a bit of before. Any other help with getting started would still be very much appreciated (or tell me if you don't think this is the right approach).
Alternatively, is there an API that does this? I was looking at JustGiving, but it seemed to want to redirect.
Firstly, apologies for the very basic question. I have looked into other answers but they haven't quite answered what I'm after. I'm confident designing a site in HTML/CSS and have very very basic knowledge of Python.
I want to run a very basic Python script on my website. It analyses tweets about a specific topic, and then posts a sentiment analysis score. I want it to run this sentiment analysis every hour and cache the score.
I have a working Python script which does this in Jupyter Notebook. Could you give me an overview of how I would make this script function online and cache the results? I've read into using Python web frameworks, but from my limited understanding, they seem like overkill?
Thank you for your help!
Could you give me an overview of how I would make this script function online
The key thing would be to uncouple the two parts of your system:
Producing the data
Showing it in a website.
So the first thing to do is have your sentiment-analysis script push its value to a database. The database could be something as simple as a csv file, or it could be a key/value store, or something like MySQL or CouchDB (or hundreds of other choices).
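As a rough sketch of the "push its value to a database" step: the example below uses SQLite from Python's standard library, which is my choice for illustration and not one of the stores named above. The table name scores and the functions save_score and latest_score are made up for the example. Running the script every hour is assumed to be handled by something outside it, such as a cron job.

```python
import sqlite3
import time


def _connect(db_path):
    """Open the database and make sure the table exists."""
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS scores (ts REAL, score REAL)")
    return conn


def save_score(db_path, score, timestamp=None):
    """Append one sentiment score together with the time it was computed."""
    ts = time.time() if timestamp is None else timestamp
    conn = _connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO scores (ts, score) VALUES (?, ?)", (ts, score)
            )
    finally:
        conn.close()


def latest_score(db_path):
    """Return the most recent score, or None if nothing has been stored."""
    conn = _connect(db_path)
    try:
        row = conn.execute(
            "SELECT score FROM scores ORDER BY ts DESC LIMIT 1"
        ).fetchone()
    finally:
        conn.close()
    return None if row is None else row[0]
```

The sentiment script would call save_score once per run, and the website side only ever needs latest_score, so neither half has to know how the other works.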
Over on the website you have to make a decision between:
Server-side
Client-side
If the former, you could program in Python if that is what you are most familiar with. Whatever language/framework combination you go for, there will be an example tutorial on how to read a value from a database and display it: it is just about the most fundamental thing.
If client-side you will usually be programming in JavaScript. Again you need to choose a framework, but again you should easily be able to find a tutorial to follow.
(Unless you have a good reason to prefer server-side, such as familiarity with an existing framework, or security issues with accessing your database, I'd go with a client-side approach.)
I've read into using Python web frameworks... overkill?
Yes and no. You are going to need some kind of database, and some kind of framework. It would be good to understand the basics of web security, too. If the sentiment analysis is your major goal, all that is going to be a distraction, and it might be better to find a friend who already knows web programming to work with. Or just find a tutorial that is very close to what you want to do, and adapt that.
(P.S. I was going to flag your question as "too broad", but you did ask for an overview, so I hope this helps.)
I am a social scientist and a complete newbie/noob when it comes to coding. I have searched through the other questions/tutorials but am unable to get the gist of how to crawl a news website targeting the comments section specifically. Ideally, I'd like to tell python to crawl a number of pages and return all the comments as a .txt file. I've tried
from bs4 import BeautifulSoup
import urllib2
url="http://www.xxxxxx.com"
and that's as far as I can go before I get an error message saying bs4 is not a module. I'd appreciate any kind of help on this, and please, if you decide to respond, DUMB IT DOWN for me!
I can run wget on terminal and get all kinds of text from websites which is awesome IF I could actually figure out how to save the individual output html files into one big .txt file. I will take a response to either question.
Try Scrapy. It is a fast high-level screen scraping and web crawling framework, used to crawl websites and extract structured data from their pages. It can be used for a wide range of purposes, from data mining to monitoring and automated testing.
You will most likely encounter this as you go, but in some cases, if the site is employing 3rd party services for comments, like Disqus, you will find that you will not be able to pull the comments down in this manner. Just a heads up.
I've gone down this route before and have had to tailor the script to a particular site's layout/design/etc.
I've found libcurl to be extremely handy, if you don't mind doing the post-processing using Python's string handler functions.
If you don't need to implement it purely in Python, you can make use of wget's recursive mirroring option to handle the content pull, then write your python code to parse the downloaded files.
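For the "parse the downloaded files" half, here is a minimal sketch that walks a folder of already-downloaded pages and writes their visible text into one big .txt file. It is written for Python 3 and uses only the standard library. The names TextExtractor, html_to_text and merge_html_files are made up for the example, and it grabs all the text on each page, so picking out just the comments would still need site-specific filtering.

```python
import os
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip and data.strip():
            self.parts.append(data.strip())


def html_to_text(html):
    """Return the text of one HTML document, one chunk per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)


def merge_html_files(root_dir, out_path):
    """Write the text of every .html/.htm file under root_dir to out_path."""
    count = 0
    with open(out_path, "w", encoding="utf-8") as out:
        for folder, _dirs, files in os.walk(root_dir):
            for name in sorted(files):
                if not name.lower().endswith((".html", ".htm")):
                    continue
                path = os.path.join(folder, name)
                with open(path, encoding="utf-8", errors="replace") as f:
                    out.write(html_to_text(f.read()) + "\n\n")
                count += 1
    return count
```

After mirroring a site into a folder, merge_html_files("downloaded_site", "all_pages.txt") would produce the single text file (both names are examples).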
I'll add my two cents here as well.
The first things to check are that you installed Beautiful Soup, and that it lives somewhere it can be found. There are all kinds of things that can go wrong here.
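One quick way to run that check is sketched below. It is written for Python 3 (the code in the question uses the Python 2 module urllib2, where importlib.util is not available), and module_available is just a helper name made up for the example.

```python
import importlib.util
import sys


def module_available(name):
    """Return True if `import name` would succeed in this interpreter."""
    return importlib.util.find_spec(name) is not None


# Show which Python is running and whether it can see bs4.
print("Interpreter:", sys.executable)
print("bs4 importable:", module_available("bs4"))
```

If this reports that bs4 is not importable, running "python -m pip install beautifulsoup4" with that same interpreter is the usual fix. A common cause of the error is that the package was installed for a different Python installation than the one running the script.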
My experience is similar to yours: I work at a web startup, and we have a bunch of users who register, but give us no information about their job (which is actually important for us). So my idea was to scrape the homepage and the "About us" page from the domain in their email address, and try to put a learning algorithm around the data that I captured to predict their job. The results for each domain are stored as a text file.
Unfortunately (for you...sorry), the code I ended up with was a bit complicated. The problem is that you'll end up getting a lot of garbage when you do the scraping, and you'll have to filter it out. You'll also end up with encoding issues, and (assuming you want to do some learning here) you'll have to get rid of low-value words. The total code is about 1000 lines, and I'll post some important pieces that may help you out here, if you're interested.
I am working on a website for which it would be useful to know the number of links shared by a particular facebook page (e.g., http://www.facebook.com/cocacola) so that the user can know whether they are 'liking' a firehose of information or a dribble of goodness. What is the best way to get the number of links/status updates that are shared by a particular page?
+1 for implementations that use python (this is a django website) but any solutions are welcome! I tried using fbconsole to accomplish this but I have come up a little short.
For what it is worth, this unanswered question seems relevant. As does the fact that, as of 2012.04.18, you can export your data to csv from the insights management page on the facebook site. The information is in there I just don't know how to get it out...
Thanks for your help!
In the event that anyone else finds this useful, I thought I'd post my gist example here. fbconsole makes it fairly simple to extract data through the Facebook Graph API.
The caveat is that it was not terribly easy to programmatically extract data through fbconsole so I wrote the fbconsole.automatically_authenticate to make it much easier to access this information in a systematic way. This addition has not yet been incorporated into the master branch of fbconsole (it was just posted this morning), but it is available here in the meantime for those that are interested.
I am new to programming and to Python itself. I have no programming experience. I have managed to read up on Python and worked through some fairly basic Python tutorials, and now I am ready for my first project in Python.
I am basing my project around XBMC, I want to develop some addons for this awesome media center.
I have a few websites that I want to scrape and display in XBMC. One is a music website and one is a paid TV website which is only available to people who have accounts with them. I have managed to scrape a website with feedparser, but I have no idea how to output these titles and links to play in XBMC.
My question here is: where do I start, how do I construct the script for these websites, and what tools/libraries/modules do I need? And what do I need to do to include it in XBMC?
On the general topic that has been asked a ton of times regarding webpage scraping, the common answer is always Mechanize/Beautiful Soup for python. That would allow you to actually get your data.
Once you have your data, it's then just a matter of formatting it the way you want for your xbmc app: http://wiki.xbmc.org/index.php?title=HOW-TO:Write_Python_Scripts_for_XBMC
It's a two-step process:
Get your data from a source and format it into some common structure
Use the common structure to populate your elements in the xbmc script
What you actually want to do with your script will determine how you would use your data. If it's just simply providing information, then that link above would pretty much explain it.
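The two steps can be sketched roughly as follows. This assumes each scraped entry is a dict-like object with "title" and "link" keys; the function names to_common_structure and populate are made up for the example. No XBMC calls are shown: add_item stands in for whatever the add-on API provides for adding an entry to a list.

```python
def to_common_structure(raw_entries):
    """Step 1: normalise scraped entries into {'title', 'url'} dicts."""
    items = []
    for entry in raw_entries:
        title = (entry.get("title") or "").strip()
        url = (entry.get("link") or "").strip()
        if title and url:  # drop entries missing either field
            items.append({"title": title, "url": url})
    return items


def populate(items, add_item):
    """Step 2: hand each item to the UI layer through a callback."""
    for item in items:
        add_item(item["title"], item["url"])
    return len(items)
```

Keeping the scraping code and the display code apart like this means a change in a website's layout only touches step 1, and the add-on side can be tried out with hand-written test data.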