Load local chrome user profile to Heroku to use it with selenium - python

I want to build a simple web scraper using Python and Selenium and deploy it on Heroku. I've already done the deployment with the chromedriver and Chrome buildpacks, and everything is working fine. But I still need to implement one thing: I want to use my local Chrome profile so that I don't have to sign in to Google. This works fine locally by using
from selenium import webdriver
options = webdriver.ChromeOptions()
# Raw string: single backslashes are enough for a Windows path
options.add_argument(r"user-data-dir=C:\Users\user\AppData\Local\Google\Chrome\User Data")
To access my Chrome profile on Heroku, I uploaded the whole folder to the same directory as the code. After the deployment I can access the folder under /app/User Data and can see all the files. However, if I pass
options.add_argument(r"user-data-dir=/app/User Data")
to the driver, it doesn't load the profile and the login process fails. I tried to get more information by printing the page source of "chrome://version", but that's just an empty page.
Do you have any suggestions for what I can try instead to get it working? Thank you!
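
One thing worth checking is how the path is split between the two switches Chrome uses for profiles: user-data-dir points at the folder that contains the profiles (here /app/User Data), and profile-directory names the profile inside it. Below is a minimal sketch, assuming headless Chrome on a Linux dyno. The helper names build_chrome_args and make_driver are hypothetical, the profile name "Default" is an assumption (it may be "Profile 1" or similar), and the sandbox and shared-memory switches are ones commonly used in containers, not something the question confirms. As far as I know, Chrome on Windows encrypts saved cookies with a key tied to the Windows account, so a profile copied to Linux may still come up logged out even when it loads correctly.

```python
def build_chrome_args(user_data_dir, profile_dir="Default", headless=True):
    """Return Chrome command-line switches for reusing an existing profile."""
    args = [
        # Folder that CONTAINS the profiles, not the profile folder itself.
        f"--user-data-dir={user_data_dir}",
        # Name of the profile sub-folder (assumption: "Default").
        f"--profile-directory={profile_dir}",
        # Commonly needed when Chrome runs inside a container.
        "--no-sandbox",
        "--disable-dev-shm-usage",
    ]
    if headless:
        # The "new" headless mode needs a reasonably recent Chrome.
        args.append("--headless=new")
    return args


def make_driver(user_data_dir):
    """Create a Chrome driver that uses the given user data directory."""
    from selenium import webdriver  # imported lazily so the helper above has no dependencies

    options = webdriver.ChromeOptions()
    for arg in build_chrome_args(user_data_dir):
        options.add_argument(arg)
    return webdriver.Chrome(options=options)
```

Calling make_driver("/app/User Data") would then start Chrome against the uploaded folder. Passing the path as a single argument keeps the space in "User Data" intact.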

If you just need to log in, you can use selenium-stealth (https://pypi.org/project/selenium-stealth/). It is still Selenium, but it is designed to avoid being detected when signing in.
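
A rough sketch of what that could look like, based on the usage shown in the selenium-stealth README. The settings values are the README's illustrative ones, the names STEALTH_SETTINGS and make_stealth_driver are hypothetical, and nothing here guarantees that a Google sign-in will go undetected.

```python
# Illustrative fingerprint values, as in the selenium-stealth README.
STEALTH_SETTINGS = {
    "languages": ["en-US", "en"],
    "vendor": "Google Inc.",
    "platform": "Win32",
    "webgl_vendor": "Intel Inc.",
    "renderer": "Intel Iris OpenGL Engine",
    "fix_hairline": True,
}


def make_stealth_driver():
    """Start Chrome and apply selenium-stealth's patches to the session."""
    # Imported lazily: both packages must be installed to call this function.
    from selenium import webdriver
    from selenium_stealth import stealth

    options = webdriver.ChromeOptions()
    # Hide the most obvious automation markers.
    options.add_experimental_option("excludeSwitches", ["enable-automation"])
    options.add_experimental_option("useAutomationExtension", False)

    driver = webdriver.Chrome(options=options)
    stealth(driver, **STEALTH_SETTINGS)
    return driver
```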

Related

Keep changes made to firefox extensions through Selenium

My goal is to use Selenium to configure an extension in Firefox programmatically.
What I did is the following:
1. Load my default Firefox profile, so that my Selenium session has access to my extensions.
2. Use Selenium to navigate to the extension's configuration page (moz-extension://*********************/options/).
3. Use Selenium to automate the configuration process and save.
My only issue is that my changes are lost. That is, upon starting Firefox again (like a "normal" user), none of the changes I made are kept.
Is there a way to ensure that Selenium saves the changes to my profile?
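
A likely cause is that Selenium works on a temporary copy of the profile, which is discarded when the session ends. One commonly suggested workaround is to hand Firefox the profile through its -profile command-line argument so that it is used in place. A minimal sketch under that assumption; the helper names and the example path are hypothetical, and Firefox must not already be running with the same profile.

```python
def firefox_profile_args(profile_path):
    """Return the Firefox arguments that open a profile in place."""
    # "-profile <path>" tells Firefox to use this directory directly
    # instead of a temporary copy.
    return ["-profile", profile_path]


def make_firefox(profile_path):
    """Start Firefox via Selenium using the given profile directory."""
    from selenium import webdriver  # imported lazily; requires selenium and geckodriver

    options = webdriver.FirefoxOptions()
    for arg in firefox_profile_args(profile_path):
        options.add_argument(arg)
    return webdriver.Firefox(options=options)
```

For example, make_firefox("/home/me/.mozilla/firefox/abcd1234.default") would open that (hypothetical) profile directly, so settings saved by the extension should still be there on the next normal start.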

Problem with deploying an automation script on heroku

I have an automation script that opens Google Chrome and does some stuff. It also visits some sites for which I need to be logged in. On my local machine, I logged in to those sites once and the session got stored in the browser profile, so I just pass that profile data to the webdriver and don't have to log in every time. Now I'm thinking of deploying it to Heroku. Everything else is fine, and Heroku also has support for Google Chrome and chromedriver, but how do I import my own profile into that Chrome? That's what I'm stuck on. Any help would be much appreciated. Thank you!
Reference to another old unanswered thread on this forum: How do I access my google user profile on Heroku using Ruby?

How to use headless chrome in python on vercel?

You may know that Heroku will stop offering free dynos, free Postgres, etc. from November, so I have been looking for an alternative to run my Python web apps. I have almost 10 web apps that I use regularly, such as a URL shortener, a keyword research tool, a Google Drive direct link generator, and more. All of them are hosted on Heroku, but I'm moving to Vercel now. I have set up all the projects on Vercel except the last one, which is more complicated: a Python Selenium bot that powers my keyword research web app. On Heroku I used buildpacks, namely Headless Chrome (https://github.com/heroku/heroku-buildpack-google-chrome) and Chromedriver (https://github.com/heroku/heroku-buildpack-chromedriver), to make this project run properly. The problem is that I could not find anything like a buildpack on Vercel to add Chrome and ChromeDriver.
Does anyone know about this?
Edit:
That was a bit of a story, and many people didn't understand what I was asking.
So, my project uses Selenium (Python). Selenium needs the Google Chrome browser installed and a ChromeDriver to run. Another option, without installing Chrome system-wide, is to set the Chrome binary location in webdriver.ChromeOptions(). I want to host this Selenium project on vercel.com, which is Linux-based.
So my question is: how can I install the Chrome browser and ChromeDriver on Vercel?
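
The binary-location option mentioned above can be sketched as follows. This only shows how to point Selenium at a Chrome binary that already exists on the machine; it does not show how to get Chrome onto Vercel, and whether that is possible at all is not something this sketch assumes. The environment variable name CHROME_BIN, the list of candidate executable names, and the helper names are all hypothetical.

```python
import os
import shutil

# Executable names to try on PATH (assumption: typical Linux names).
CANDIDATE_NAMES = (
    "google-chrome",
    "google-chrome-stable",
    "chromium",
    "chromium-browser",
)


def find_chrome_binary(env=None):
    """Return a path to a Chrome binary, or None if none is found."""
    env = os.environ if env is None else env
    # An explicitly configured path wins (hypothetical variable name).
    explicit = env.get("CHROME_BIN")
    if explicit:
        return explicit
    # Otherwise fall back to whatever is on PATH.
    for name in CANDIDATE_NAMES:
        path = shutil.which(name)
        if path:
            return path
    return None


def make_driver():
    """Start headless Chrome, using the located binary if there is one."""
    from selenium import webdriver  # imported lazily; requires selenium

    options = webdriver.ChromeOptions()
    binary = find_chrome_binary()
    if binary:
        options.binary_location = binary
    options.add_argument("--headless=new")
    return webdriver.Chrome(options=options)
```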

Deploying web scraping website with selenium chrome

I have created a website that scrapes multiple hockey websites for game scores. It runs perfectly on my local server, and I am now trying to deploy it. I have tried pythonanywhere.com, but Selenium does not seem to work on it. For anyone who has deployed a website that uses Selenium/webdriver: what is the easiest/best platform to deploy a website like this? (It does not have to be free like PythonAnywhere, as long as it is not too expensive, lol!) Thanks in advance.
Selenium does work on PythonAnywhere. If you use a free account, you'd have restricted internet access though. Also it's recommended to scrape outside of the web app, since it would slow the views down -- you should rather use a Schedule/Always-on task for that instead. You can also refer to those PythonAnywhere help pages:
Using Selenium
Async work in web apps
You can use an AWS, GCP, or DigitalOcean Linux server. In this case, you first have to install Chrome on the server and then put the matching version of ChromeDriver in your project directory. Make sure to check the installed Chrome version first and then download the ChromeDriver release that corresponds to it.
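
The version check described above can be sketched like this. It assumes the executables are called google-chrome and chromedriver and that both print a dotted version number for --version; the helper names are hypothetical.

```python
import re
import subprocess


def major_version(version_output):
    """Extract the major version from text like 'Google Chrome 114.0.5735.90'."""
    match = re.search(r"(\d+)\.\d+", version_output)
    if match is None:
        raise ValueError(f"no version number found in {version_output!r}")
    return int(match.group(1))


def installed_majors(chrome="google-chrome", driver="chromedriver"):
    """Return (chrome_major, driver_major) by asking both executables."""
    chrome_out = subprocess.run(
        [chrome, "--version"], capture_output=True, text=True, check=True
    ).stdout
    driver_out = subprocess.run(
        [driver, "--version"], capture_output=True, text=True, check=True
    ).stdout
    return major_version(chrome_out), major_version(driver_out)
```

If the two numbers returned by installed_majors() differ, the driver is probably the wrong release for the installed browser.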

How to use Selenium with the Wappalyzer browser plugin

I want to make a python CLI for Wappalyzer (https://www.wappalyzer.com), but it is a browser plugin. The plugin identifies programs/frameworks running on a webpage, and I want to be able to use/get that information from a python script. While they do have a paid API, I was wondering if it is possible to use Selenium and the ChromeDriver to visit the page with a chrome extension, and then retrieve the data generated by Wappalyzer.
