Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 years ago.
Could you recommend some ways to scrape data from a web page?
I have been trying to use Python, but I am stuck with my code. I was also thinking about using Octoparse. This is the web page (http://www.mlsa.am/?page_id=368): it contains drop-down lists where the selection made in one list determines the options available in the others.
You could use the Scrapy framework, which is built specifically for scraping.
As a starting point, work through the tutorial in the official documentation:
https://docs.scrapy.org/en/latest/intro/tutorial.html
Besides Scrapy, you can also use BeautifulSoup.
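To give an idea of what the extraction step looks like, here is a minimal sketch that reads the options of a drop-down list out of HTML. It deliberately uses only the standard library's `html.parser` rather than Scrapy or BeautifulSoup, which would do the same job more concisely. The sample HTML, the `region` field name, and the `SelectOptionParser` class are made up for illustration. The sketch also assumes the options are present in the static HTML; on pages where one list depends on another, the later lists are often filled in by JavaScript and will not be there.

```python
from html.parser import HTMLParser


class SelectOptionParser(HTMLParser):
    """Collect (value, label) pairs for each <option>, grouped by <select> name."""

    def __init__(self):
        super().__init__()
        self.options = {}      # select name -> list of (value, label)
        self._select = None    # name of the <select> currently open
        self._value = None     # value attribute of the <option> currently open
        self._label = []       # text fragments of the current <option>

    def _flush_option(self):
        # Store the open <option>, if any (also covers a missing </option>).
        if self._value is not None and self._select is not None:
            label = "".join(self._label).strip()
            self.options[self._select].append((self._value, label))
        self._value = None
        self._label = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "select":
            self._select = attrs.get("name", "")
            self.options.setdefault(self._select, [])
        elif tag == "option" and self._select is not None:
            self._flush_option()
            self._value = attrs.get("value", "")

    def handle_data(self, data):
        if self._value is not None:
            self._label.append(data)

    def handle_endtag(self, tag):
        if tag == "option":
            self._flush_option()
        elif tag == "select":
            self._flush_option()
            self._select = None


# Hypothetical markup standing in for the real page.
SAMPLE_HTML = """
<select name="region">
  <option value="1">North</option>
  <option value="2">South</option>
</select>
"""

parser = SelectOptionParser()
parser.feed(SAMPLE_HTML)
print(parser.options)
```

If the second drop-down turns out to be empty in the downloaded HTML, that is a hint the page requests its options separately, and that request is the thing to look for.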
Related
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 1 year ago.
I need to build a scraper in Python that fetches contact info from different websites with different structures. I have tried, but since the websites have different structures, my code doesn't work for all of them.
Is this doable, or do I need to write code for each website individually?
There is no general way to extract the same information from different websites, because each site is built with its own structure. Different websites rarely use the same class names and IDs, so selectors written for one site usually do not work on another.
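One partial workaround, offered only as a sketch and not as a complete solution, is to look for patterns in the raw text instead of relying on class names or IDs. The example below pulls e-mail addresses out of two differently structured snippets with a single regular expression. The pattern is a simplification that will miss obfuscated addresses and may pick up false positives, and the two sample pages are invented.

```python
import re

# Simplified e-mail pattern; an assumption for illustration, not a full validator.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def find_emails(html):
    """Return unique e-mail addresses (lower-cased) in order of first appearance."""
    found = []
    for match in EMAIL_RE.findall(html):
        address = match.lower()
        if address not in found:
            found.append(address)
    return found


# Two hypothetical pages with unrelated markup.
page_a = '<div class="contact"><a href="mailto:info@example.com">Email us</a></div>'
page_b = "<footer><p>Write to Sales@Example.org or call us.</p></footer>"

print(find_emails(page_a))
print(find_emails(page_b))
```

Anything more structured than this, such as names or postal addresses, will most likely still need per-site rules.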
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 1 year ago.
Essentially, I have a Python script that takes input from the user, scrapes the relevant data, and outputs it. I thought it'd be neat to turn this into a web app. Where should I start?
You need to use a web framework such as Flask or Django.
I wrote a small program which automates the process for simple scripts.
https://github.com/bkb3/py2webapp
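To show the general shape of wrapping a script in a web endpoint, here is a minimal sketch that uses only the standard library's WSGI support instead of Flask or Django. A framework would replace the manual query parsing and response building below. The `scrape` function is a hypothetical stand-in for the existing script's logic, and the `q` parameter name and port 8000 are arbitrary choices.

```python
from urllib.parse import parse_qs
from wsgiref.simple_server import make_server


def scrape(query):
    """Hypothetical stand-in for the existing script's scraping logic."""
    return f"results for {query!r}"


def app(environ, start_response):
    """WSGI app: read ?q=... from the URL and return the scraper's output."""
    params = parse_qs(environ.get("QUERY_STRING", ""))
    query = params.get("q", [""])[0]
    if not query:
        status, body = "400 Bad Request", "missing q parameter"
    else:
        status, body = "200 OK", scrape(query)
    payload = body.encode("utf-8")
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(payload))),
    ]
    start_response(status, headers)
    return [payload]


def serve(host="127.0.0.1", port=8000):
    """Run a development server; blocks until interrupted."""
    with make_server(host, port, app) as server:
        server.serve_forever()
```

Calling `serve()` and opening `http://127.0.0.1:8000/?q=python` in a browser would show the script's output. The `wsgiref` server is meant for local experiments only.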
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 2 years ago.
I am trying to scrape this website with Scrapy, and so far I have had to visit each link and extract the information from each one. I would like to know if the site has an API that I can use (I don't know how to find it).
I would also like to know how I can obtain the latitude and longitude. Currently the map is shown, but I do not know how to obtain the numbers.
I appreciate any suggestions.
The website may be loading the data dynamically using JavaScript. Open your browser's dev tools, go to the Network tab, and look for XHR calls that may be accessing an API. You can then request that API directly instead of scraping the rendered page.
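Once such a call has been found, reading it from Python could look roughly like the sketch below. The response layout (a `results` list with `name`, `lat` and `lng` fields) is purely an assumption for illustration; the real field names have to be read off the response shown in the Network tab. `fetch_json` is defined but not called here, since the actual endpoint URL is not known.

```python
import json
from urllib.request import Request, urlopen


def fetch_json(url):
    """Request a JSON endpoint found in the Network tab and decode the response."""
    request = Request(url, headers={"Accept": "application/json"})
    with urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


def extract_coordinates(payload):
    """Return (name, latitude, longitude) tuples from a decoded response.

    Assumes the hypothetical layout {"results": [{"name", "lat", "lng"}, ...]}.
    """
    return [
        (item["name"], float(item["lat"]), float(item["lng"]))
        for item in payload["results"]
    ]


# Invented sample standing in for what the endpoint might return.
SAMPLE = json.loads(
    '{"results": [{"name": "Office A", "lat": "40.18", "lng": "44.51"}]}'
)
print(extract_coordinates(SAMPLE))
```

Map widgets usually receive their marker positions this way, so the coordinates are often available in the same response even though the page only shows the rendered map.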
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 5 years ago.
I'm trying to capture a test case result in which the output of a table search/filter needs to be cross-checked each time the test runs. I have attached the table grid that I need to search/filter. I'm using a Python script for the automation.
Any suggestions?
You can use Selenium for this test. The table's inner HTML can be accessed using
table_content = element.get_attribute('innerHTML')
You can then parse that HTML to cross-check your results.
Have a look at this question for reference.
Get HTML Source of WebElement in Selenium WebDriver using Python
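The parsing and cross-checking step could be sketched as follows, assuming `table_content` already holds the string returned by `get_attribute('innerHTML')`. The sample rows, the column index, and the helper names `parse_rows` and `rows_match_filter` are invented for illustration. The parser is the standard library's `html.parser`, chosen so that the sketch runs without Selenium or a browser.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Turn <tr>/<td>/<th> markup into a list of rows of cell text."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None    # cells of the <tr> currently open
        self._cell = None   # text fragments of the cell currently open

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def parse_rows(inner_html):
    """Parse a table's inner HTML into a list of rows."""
    parser = TableRowParser()
    parser.feed(inner_html)
    return parser.rows


def rows_match_filter(rows, column, term):
    """True if every row contains term in the given column (case-insensitive)."""
    return all(term.lower() in row[column].lower() for row in rows)


# Stand-in for element.get_attribute('innerHTML') after filtering by "Paris".
table_content = (
    "<tr><th>Name</th><th>City</th></tr>"
    "<tr><td>Ann</td><td>Paris</td></tr>"
    "<tr><td>Bob</td><td>Paris</td></tr>"
)

rows = parse_rows(table_content)
print(rows)
print(rows_match_filter(rows[1:], column=1, term="paris"))
```

In a test, the final check would typically become an assertion that runs after each search, with the header row skipped as in `rows[1:]`.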
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
What is the difference between a spider and a crawler?
And which one should I use? (I mean, which one provides more functions?)
You extend the scrapy.spiders.CrawlSpider class (scrapy.contrib.spiders.CrawlSpider in older Scrapy releases) when you want to create a spider that uses rules and link extractors to specify how the crawling process will work, namely how it should follow links to other pages.
You extend the class scrapy.Spider when you don't need crawling or if you just want to handle it yourself.
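The distinction can be illustrated without Scrapy at all. The sketch below is a framework-free analogy, not Scrapy's implementation: `parse_page` handles a single page and follows nothing, while `crawl` walks a tiny in-memory site and follows only the links that match a pattern, which is roughly the job that rules and link extractors take over in a crawl spider. The `SITE` dictionary and both function names are hypothetical.

```python
import re
from collections import deque

# Hypothetical in-memory "site": URL path -> HTML.
SITE = {
    "/": '<a href="/items/1">one</a> <a href="/about">about</a>',
    "/items/1": '<h1>Item 1</h1> <a href="/items/2">next</a>',
    "/items/2": "<h1>Item 2</h1>",
    "/about": "<h1>About</h1>",
}

LINK_RE = re.compile(r'href="([^"]+)"')
TITLE_RE = re.compile(r"<h1>(.*?)</h1>")


def parse_page(url):
    """Single-page approach: extract data from one URL and follow nothing."""
    match = TITLE_RE.search(SITE[url])
    return match.group(1) if match else None


def crawl(start, follow_pattern):
    """Rule-driven approach: follow links matching follow_pattern, parse each page."""
    rule = re.compile(follow_pattern)
    queue, seen, titles = deque([start]), {start}, []
    while queue:
        url = queue.popleft()
        title = parse_page(url)
        if title:
            titles.append(title)
        for link in LINK_RE.findall(SITE[url]):
            if link in SITE and link not in seen and rule.search(link):
                seen.add(link)
                queue.append(link)
    return titles


print(parse_page("/items/1"))
print(crawl("/", r"^/items/"))
```

So the choice is less about which class offers more functions and more about whether link following should be declared through rules or written by hand.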