In my previous question which was answered greatly, I was told how to interact with a website with selenium but unfortunately interacting with the flash game itself is now the problem. I have tried researching how to do it in Selenium but it can't be done. I've seen FlashSelenium, Sikuli, and AutoIT.
I can only find documentation for FlashSelenium in Java, It's easier for me to use AutoIT rather than Sikuli as I'd have to learn to use Jpython to create the kind of script I want to, which I am not straying away from learning just trying to finish this as fast as possible. As for AutoIT, the only problem with it is that I don't undertsand how to use it with selenium
from selenium import webdriver
import autoit
from selenium.webdriver.common.keys import Keys
driver = webdriver.Firefox()
driver.get("http://na58.evony.com/s.html?loginid=747970653D74727947616D65&adv=index")
driver.maximize_window()
assert "Evony - Free forever " in driver.title
So far I have this and It's doing what is suppose to do which is create a new account using that "driver.get" but when I reach to the page, it is all flash and I can not interact with anything in the webpage so I have to use AutoIT but I don't know how to get it to "pick-up" from where selenium left off. I want it to interact with a button on the webpage and from viewing a previous post on stackoverflow I can use a (x,y) to specify the location but unfortunately that post didn't explain beyond that. Any and all information would be great, thanks.
Related
How can I bypass the Google CAPTCHA using Selenium and Python?
When I try to scrape something, Google give me a CAPTCHA. Can I bypass the Google CAPTCHA with Selenium Python?
As an example, it's Google reCAPTCHA. You can see this CAPTCHA via this link: https://www.google.com/recaptcha/api2/demo
To start with using Selenium's Python clients, you should avoid solving/bypass Google CAPTCHA.
Selenium
Selenium automates browsers. Now, what you want to achieve with that power is entirely up to individuals, but primarily it is for automating web applications through browser clients for testing purposes and of coarse it is certainly not limited to that.
CAPTCHA
On the other hand, CAPTCHA (the acronym being ...Completely Automated Public Turing test to tell Computers and Humans Apart...) is a type of challenge–response test used in computing to determine if the user is human.
So, Selenium and CAPTCHA serves two completely different purposes and ideally shouldn't be used to achieve any interrelated tasks.
Having said that, reCAPTCHA can easily detect the network traffic and identify your program as a Selenium driven bot.
Generic Solution
However, there are some generic approaches to avoid getting detected while web scraping:
The first and foremost attribute a website can determine your script/program by is through your monitor size. So it is recommended not to use the conventional Viewport.
If you need to send multiple requests to a website, keep on changing the User Agent on each request. Here you can find a detailed discussion on Way to change Google Chrome user agent in Selenium?
To simulate humanlike behavior, you may require to slow down the script execution even beyond WebDriverWait and expected_conditions inducing time.sleep(secs). Here you can find a detailed discussion on How to sleep Selenium WebDriver in Python for milliseconds
This use case
However, in a couple of use cases we were able to interact with the reCAPTCHA using Selenium and you can find more details in the following discussions:
How to click on the reCAPTCHA using Selenium and Java
CSS selector for reCAPTCHA checkbok using Selenium and VBA Excel
Find the reCAPTCHA element and click on it — Python + Selenium
References
You can find a couple of related discussion in:
How can I make a Selenium script undetectable using GeckoDriver and Firefox through Python?
Is there a version of Selenium WebDriver that is not detectable?
tl; dr
How does reCAPTCHA 3 know I'm using Selenium/chromedriver?
In order to bypass the CAPTCHA when scraping Google, you have to manually solve a CAPTCHA and export the cookies Google gives you. Now, every time you open a Selenium WebDriver, make sure you add the cookies you exported. The GOOGLE_ABUSE_EXEMPTION cookie is the one you're looking for, but I would save all cookies just to be on the safe side.
If you want an additional layer of stability in your scrapes, you should export several cookies and have your script randomly select one of them each time you ping Google.
These cookies have a long expiration date so you wouldn't need to get new cookies every day.
For help on saving and loading cookies in Python and Selenium, you should check out this answer: How to save and load cookies using Python + Selenium WebDriver
Clear Browsing History, cached data, cookies and other site data
First Create an Google Account while you are in browser window opened by selenium.
Sign in to your account
wd.get("https://accounts.google.com/signin/v2/identifier?hl=en&passive=true&continue=https%3A%2F%2Fwww.google.com%2F%3Fgws_rd%3Dssl&ec=GAZAmgQ&flowName=GlifWebSignIn&flowEntry=ServiceLogin");
Thread.sleep(2000);
wd.findElement(By.name("identifier")).sendKeys("Email"+Keys.ENTER);
Thread.sleep(3000);
wd.findElement(By.name("password")).sendKeys("Password"+Keys.ENTER);
Thread.sleep(5000);
Then Open any website that uses recaptcha tick on checkmark using this code
String framename=wd.findElement(By.tagName("iframe")).getAttribute("name");
wd.switchTo().frame(framename);
wd.findElement(By.xpath("//span[#id='recaptcha-anchor']")).click();
You won't find any Puzzles or anything.
Bypass as in solve it or bypass as in never get it at all?
To solve it:
sign up with 2captcha, capmonster cloud, deathbycaptcha, etc. and follow their instructions. They will give you a token that you pass with the form.
To never get it at all:
Make sure you have good IP reputation (most important for Cloudflare).
Make sure you have a good browser fingerprint (most important for Distil) - I recommend puppeteer + the stealth plugin.
Ok, so there is a simple python script to solve captcha for you.
It basically read the audio and then use google assistant to convert it into text and paste it.
It is only workable in audio captchas which is given the most case with imahe captcha V2
https://github.com/ohyicong/recaptcha_v2_solver
Disclaimer!
I do not write the script, i just get an idea of doing this but got this brother project so, thought to help others through this.
The simple solution is suspend the program for 10 seconds or more and then when the automated browser opens solve the reCAPTCHA on your own and then the program starts after 10 seconds and execute rest of the program like clicking submit button or other things
How can I bypass the Google CAPTCHA using Selenium and Python?
When I try to scrape something, Google give me a CAPTCHA. Can I bypass the Google CAPTCHA with Selenium Python?
As an example, it's Google reCAPTCHA. You can see this CAPTCHA via this link: https://www.google.com/recaptcha/api2/demo
To start with using Selenium's Python clients, you should avoid solving/bypass Google CAPTCHA.
Selenium
Selenium automates browsers. Now, what you want to achieve with that power is entirely up to individuals, but primarily it is for automating web applications through browser clients for testing purposes and of coarse it is certainly not limited to that.
CAPTCHA
On the other hand, CAPTCHA (the acronym being ...Completely Automated Public Turing test to tell Computers and Humans Apart...) is a type of challenge–response test used in computing to determine if the user is human.
So, Selenium and CAPTCHA serves two completely different purposes and ideally shouldn't be used to achieve any interrelated tasks.
Having said that, reCAPTCHA can easily detect the network traffic and identify your program as a Selenium driven bot.
Generic Solution
However, there are some generic approaches to avoid getting detected while web scraping:
The first and foremost attribute a website can determine your script/program by is through your monitor size. So it is recommended not to use the conventional Viewport.
If you need to send multiple requests to a website, keep on changing the User Agent on each request. Here you can find a detailed discussion on Way to change Google Chrome user agent in Selenium?
To simulate humanlike behavior, you may require to slow down the script execution even beyond WebDriverWait and expected_conditions inducing time.sleep(secs). Here you can find a detailed discussion on How to sleep Selenium WebDriver in Python for milliseconds
This use case
However, in a couple of use cases we were able to interact with the reCAPTCHA using Selenium and you can find more details in the following discussions:
How to click on the reCAPTCHA using Selenium and Java
CSS selector for reCAPTCHA checkbok using Selenium and VBA Excel
Find the reCAPTCHA element and click on it — Python + Selenium
References
You can find a couple of related discussion in:
How can I make a Selenium script undetectable using GeckoDriver and Firefox through Python?
Is there a version of Selenium WebDriver that is not detectable?
tl; dr
How does reCAPTCHA 3 know I'm using Selenium/chromedriver?
In order to bypass the CAPTCHA when scraping Google, you have to manually solve a CAPTCHA and export the cookies Google gives you. Now, every time you open a Selenium WebDriver, make sure you add the cookies you exported. The GOOGLE_ABUSE_EXEMPTION cookie is the one you're looking for, but I would save all cookies just to be on the safe side.
If you want an additional layer of stability in your scrapes, you should export several cookies and have your script randomly select one of them each time you ping Google.
These cookies have a long expiration date so you wouldn't need to get new cookies every day.
For help on saving and loading cookies in Python and Selenium, you should check out this answer: How to save and load cookies using Python + Selenium WebDriver
Clear Browsing History, cached data, cookies and other site data
First Create an Google Account while you are in browser window opened by selenium.
Sign in to your account
wd.get("https://accounts.google.com/signin/v2/identifier?hl=en&passive=true&continue=https%3A%2F%2Fwww.google.com%2F%3Fgws_rd%3Dssl&ec=GAZAmgQ&flowName=GlifWebSignIn&flowEntry=ServiceLogin");
Thread.sleep(2000);
wd.findElement(By.name("identifier")).sendKeys("Email"+Keys.ENTER);
Thread.sleep(3000);
wd.findElement(By.name("password")).sendKeys("Password"+Keys.ENTER);
Thread.sleep(5000);
Then Open any website that uses recaptcha tick on checkmark using this code
String framename=wd.findElement(By.tagName("iframe")).getAttribute("name");
wd.switchTo().frame(framename);
wd.findElement(By.xpath("//span[#id='recaptcha-anchor']")).click();
You won't find any Puzzles or anything.
Bypass as in solve it or bypass as in never get it at all?
To solve it:
sign up with 2captcha, capmonster cloud, deathbycaptcha, etc. and follow their instructions. They will give you a token that you pass with the form.
To never get it at all:
Make sure you have good IP reputation (most important for Cloudflare).
Make sure you have a good browser fingerprint (most important for Distil) - I recommend puppeteer + the stealth plugin.
Ok, so there is a simple python script to solve captcha for you.
It basically read the audio and then use google assistant to convert it into text and paste it.
It is only workable in audio captchas which is given the most case with imahe captcha V2
https://github.com/ohyicong/recaptcha_v2_solver
Disclaimer!
I do not write the script, i just get an idea of doing this but got this brother project so, thought to help others through this.
The simple solution is suspend the program for 10 seconds or more and then when the automated browser opens solve the reCAPTCHA on your own and then the program starts after 10 seconds and execute rest of the program like clicking submit button or other things
I know the webscraping and I have taken the data from different website and I am using python language and selenium webdriver chrome. But I call a website it is open front page and then I click or go any other page then website restrict me and website know that I am using automated chrome.
This may be because the website uses reCAPTCHA v3, which "allows you to verify if an interaction is legitimate without any user interaction". This means that they can identify if you are not a human without asking you to check the famous "I'm not a robot" box. That box is used in the former version of reCAPTCHA, v2.
Read more about reCAPTCHA here: https://developers.google.com/recaptcha/docs/versions
I don't think it's possible to work around this with Selenium. And, as was already mentioned, web scraping is often illegal.
These days, websites can detect your program as a BOT pretty easily. Currently Google have 4(four) reCAPTCHA to choose and implement from when creating a new site.
reCAPTCHA v3
reCAPTCHA v2 ("I'm not a robot" Checkbox)
reCAPTCHA v2 (Invisible reCAPTCHA badge)
reCAPTCHA v2 (Android)
Solution
However there are some generic approaches to avoid getting detected while web-scraping:
The first and foremost attribute a website can determine your script/program is through your monitor size. So it is recommended not to use the conventional Viewport.
If you need to send multiple requests to a website keep on changing the User Agent on each request. Here you can find a detailed discussion on Way to change Google Chrome user agent in Selenium?
To simulate human like behavior you may require to slow down the script execution even beyond WebDriverWait and expected_conditions inducing time.sleep(secs). Here you can find a detailed discussion on How to sleep webdriver in python for milliseconds
Outro
See:
Unable to use Selenium to automate Chase site login
Selenium webdriver: Modifying navigator.webdriver flag to prevent selenium detection
I have this python code, which accesses a website using the module webbrowser:
import webbrowser
webbrowser.open('kahoot.it')
How could I input information into a text box on this website?
I suggest you use Selenium for that matter.
Here is an example code:
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
# driver = webdriver.Firefox() # Use this if you prefer Firefox.
driver = webdriver.Chrome()
driver.get('http://www.google.com/')
search_input = driver.find_elements_by_css_selector('input.gLFyf.gsfi')[0]
search_input.send_keys('some search string' + Keys.RETURN)
You can use Selenium better if you know HTML and CSS well. Knowing Javascript/JQuery may help too.
You need the specific webdriver to run it properly:
GeckoDriver (Firefox)
Chrome
There are other webdrivers available, but one of the previous should be enough for you.
On Windows, you should have the executable on the same folder as your code. On Ubuntu, you should copy the webdriver file to /usr/local/bin/
You can use Selenium not only to input information, but also to a lot of other utilities.
I don't think that's doable with the webbrowser module, I suggest you take a look at Selenium
How to use Selenium with Python?
Depending on how complex (interactive, reliant on scripts, ...) your activity is, you can use requests or, as others have suggested, selenium.
Requests allows you to send and get basic data from websites, you would probably use this when automatically submitting an order form, querying an API, checking if a page has ben updated, ...
Selenium gives you programmatic control of a "normal" browser, this seems better for you specific use-case.
The webbrowser module is actually only (more or less) able to open a browser. You can use this if you want to open a link from inside your application.
Like for example first open Safari
os.system("open /Applications/Safari.app")
Then go to a user inputs something else and it will be inputed into the search bar and searched.
Also for this example
os.system("open /Applications/Messages.app")
Then send a message to someone.
And this example if possible:
os.system("open /Applications/Dictionary.app")
And search for a word?
Thank you so much :D
That's not how it works. In your example, you're just using python to execute a shell command. Once the program opens, it's not going to just give you information about what is happening unless it is specifically designed to do so or supports some sort of api.
If you want to automate web task - look at selenium
Selenium Python bindings provides a simple API to write functional/acceptance tests using Selenium WebDriver. Through Selenium Python API you can access all functionalities of Selenium WebDriver in an intuitive way.
Selenium Python bindings provide a convenient API to access Selenium WebDrivers like Firefox, Ie, Chrome, Remote etc.
Here is a link to what is selenium and webdrivers
see you can do this:
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
driver = webdriver.Firefox()
driver.get("http://www.python.org")
assert "Python" in driver.title
elem = driver.find_element_by_name("q")
elem.clear()
elem.send_keys("pycon")
elem.send_keys(Keys.RETURN)
assert "No results found." not in driver.page_source
driver.close()
see here for a breakdown of the code
For your message example you can use Skype4Py API
Those are all OS X apps, so I'm going to take a leap and assume you're using a Mac. The Automator app might be more appropriate for this task as it basically helps you easily write AppleScript applications.
For Web Automation:
Selenium Python BindingsI have a detailed example here along with some helpful links.
Splinter (simpler than selenium)
In order to do that you need to use the application on the macbook called automator. This way, you will have permission to access application's source code and edit it, essentially.