How to scrape without web browser driver? - python

I wrote a web-scraping program using Selenium.
The program opens a target URL and downloads a file.
After Chrome updated, the program stopped working because the installed chromedriver is too old for the new browser version.
How can I scrape the page and download the file without using chromedriver?
Thanks for reading.

I think it would be easier to update ChromeDriver as well, so that your program works again. Alternatively, you could reinstall the previous Chrome version.
If you don't want to do either, you can use GeckoDriver with Firefox.

You can use a headless scraper like Puppeteer. You can also update your ChromeDriver to match your browser version, which is the most recommended option.
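If the file is reachable with a plain HTTP request (no JavaScript or login needed), the browser and its driver can be skipped entirely. The following is a minimal sketch using only the Python standard library; the function name `download_file` and the User-Agent header are assumptions for illustration, not something from the question.

```python
import shutil
import urllib.request
from pathlib import Path


def download_file(url: str, dest: str, timeout: float = 30.0) -> Path:
    """Stream the resource at `url` into the local file `dest`."""
    target = Path(dest)
    # A browser-like User-Agent is an assumption; some sites reject the default one.
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(request, timeout=timeout) as response, \
            target.open("wb") as out:
        # Copy in chunks so large files are not held in memory all at once.
        shutil.copyfileobj(response, out)
    return target
```

A call might look like `download_file("https://example.com/report.csv", "report.csv")`, where the URL is hypothetical. If the page builds the download link with JavaScript, this approach will likely not be enough and a real browser is still needed.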

Related

How to connect selenium webdriver to already existing instance of firefox browser

In my code, I initially start the Firefox browser from the command prompt.
I want to connect the Firefox webdriver to this already open Firefox instance instead of opening a new browser.
How do I implement this in Selenium with Python?
Note: I was able to do this for other browsers such as Chrome and Edge, but I can't find a way to do it for Firefox.
I also looked at https://github.com/mozilla/geckodriver/issues/1669 , but I am not sure how --connect-existing works there; it didn't work for me.
Thanks in advance :)
Repeating the answers from GitHub: there is a flag
--connect-existing
that connects geckodriver to an existing Firefox instance.
https://helpmanual.io/help/geckodriver/
This here is a nice tutorial:
https://tarunlalwani.com/post/reusing-existing-browser-session-selenium/
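As a rough sketch of how the --connect-existing flag can be passed from Selenium 4 in Python: this assumes Firefox was started with Marionette enabled (for example `firefox -marionette`, which listens on port 2828 by default) and that geckodriver is on the PATH. The helper names below are hypothetical.

```python
def geckodriver_connect_args(marionette_port: int = 2828) -> list:
    """Build geckodriver arguments for attaching to a running Firefox."""
    return ["--connect-existing", "--marionette-port", str(marionette_port)]


def attach_to_firefox(marionette_port: int = 2828):
    """Return a WebDriver bound to an already running Firefox instance."""
    # Imported lazily so the helper above can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.firefox.service import Service

    # geckodriver receives the flags through the service arguments.
    service = Service(service_args=geckodriver_connect_args(marionette_port))
    return webdriver.Firefox(service=service)
```

Calling `attach_to_firefox()` should then drive the window that is already open rather than launching a new one, provided the port matches the one Firefox is listening on.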

What modules are best to imitate selenium style of being able to login into accounts and place orders? Without opening web browser [duplicate]

I have been trying to do web automation using Selenium. Is there any way to use a browser like Chrome or Firefox without actually installing it, for example through some alternative or a portable version? If I can use a portable version, how do I tell Selenium to use it?
To use the browsers like google-chrome and firefox you have to install the full-blown browser.
You can find a detailed discussion in Is Chrome installation needed or only chromedriver when using Selenium?
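As an alternative, you can use the headless PhantomJS browser as follows. Note that PhantomJS is no longer maintained and webdriver.PhantomJS was removed in Selenium 4, so this only runs on Selenium 3.x:
Code Block:
from selenium import webdriver
# Start PhantomJS, ignoring SSL errors and forcing TLS 1.0
driver = webdriver.PhantomJS(executable_path=r'C:\WebDrivers\phantomjs.exe', service_args=['--ignore-ssl-errors=true', '--ssl-protocol=tlsv1.0'])
# Headless browsers have no real window, so set an explicit viewport size
driver.set_window_size(1920, 1080)
driver.get("https://account.booking.com/register?op_token=EgVvYXV0aCJ7ChQ2Wjcyb0hPZDM2Tm43emszcGlyaBIJYXV0aG9yaXplGhpodHRwczovL2FkbWluLmJvb2tpbmcuY29tLyo2eyJwYWdlIjoiL3JlZGlyZWN0LXRvLWpvaW5hcHAtbHA_bGFuZz1pdCZhaWQ9MTE4NzM2MCJ9QgRjb2RlKg4QAToAQgBY5dGK8gVgAQ")
# Print the rendered HTML, then shut the browser down
print(driver.page_source)
driver.quit()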
You can find a detailed discussion in PhantomJS can't load correctly web page
References
A couple of relevant discussions:
Do headless web browser need selenium WebDriver?
Difference of Headless browsers for automation
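Install Selenium by running pip install selenium.
Recent Selenium releases (4.11 and later) include Selenium Manager, which can download a matching driver and, if no Chrome installation is found, a Chrome for Testing build, so you may not need to install a browser manually. Older releases do not come with a browser.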
Chrome shows this message to indicate that it is being remote controlled:
"Chrome is being controlled by automated test software"

Is there a way to get python scripts working on ubuntu server?

My problem is that I wrote some Python scripts that work fine, and now I have to get them running on an Ubuntu server. They need chromedriver (Selenium), and of course there can't be an open browser window on the server. Is there a way to use Selenium on a server?
What you need is called a 'headless' edition of a browser.
A headless browser doesn't open a window; it runs in the background so that your scripts can drive it.
Try searching for 'headless' plus the name of the browser driver you use.
Here is a quick tutorial to get you started: https://medium.com/@pyzzled/running-headless-chrome-with-selenium-in-python-3f42d1f5ff1d
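A minimal sketch of headless Chrome with Selenium 4 on a server could look like this. It assumes Chrome and a matching driver are available on the machine; the argument list and the function names are illustrative choices, and `--headless=new` needs a reasonably recent Chrome (older versions use plain `--headless`).

```python
HEADLESS_ARGS = [
    "--headless=new",           # run Chrome without a visible window
    "--no-sandbox",             # often needed when running as root or in containers
    "--disable-dev-shm-usage",  # avoid crashes caused by a small /dev/shm
    "--window-size=1920,1080",  # give pages a realistic viewport
]


def make_headless_chrome():
    """Create a Chrome WebDriver that runs without a display."""
    # Imported lazily so this module can be loaded without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    for arg in HEADLESS_ARGS:
        options.add_argument(arg)
    return webdriver.Chrome(options=options)


def fetch_title(url: str) -> str:
    """Open `url` in headless Chrome and return the page title."""
    driver = make_headless_chrome()
    try:
        driver.get(url)
        return driver.title
    finally:
        driver.quit()  # always release the browser process
```

Existing scripts would mostly only need to swap their driver construction for something like `make_headless_chrome()`; the rest of the Selenium calls stay the same.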

Why don't local Chrome URLs like chrome://downloads or chrome://apps work in headless mode?

I am trying to visit Chrome's local URLs, but it's not working. Does headless Chrome support local URLs?
I was looking for exactly this just today.
Found this:
Most chrome internal pages are not implemented in headless mode. This is a limitation of headless Chrome itself, and is not related to ChromeDriver. If you need a particular internal page available in headless Chrome, please file a feature request at https://crbug.com/.
:(
source

Chrome crashes when opened with selenium webdriver

When I launch the Chrome browser from the Python shell using Selenium webdriver, it works fine. But when I launch the browser using the same code from inside a Python script, it crashes. How can I solve this?
Okay, got it.
The reason was that I was not providing a URL after opening the browser.
We need to provide a URL after opening the browser so that it navigates to a specific page; then it works fine. Otherwise it crashes.
