Catch response on send_keys - python

I am using Selenium and unittest to write automated tests for a web app.
I have a text field that works as a 'search engine': the API returns a response in JSON format as each character is entered into the text field.
For example, I get the search element and enter "Arrays" into it:
def test_search(self):
    driver = self.driver
    driver.get(URL)
    # find the text field
    element = driver.find_element_by_id("gsc-i-id2")
    # enter some text into the text field
    element.send_keys("Arrays")
    # --> API returns a response in JSON format
    # --> catch the response
Is it possible to get the result list? The idea is to extract the JSON from the response. Is that possible?
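Selenium itself doesn't expose network traffic, but with Chrome you can enable performance logging and read the browser's network events. Below is a minimal sketch, not a definitive implementation: it assumes Chrome with Selenium 3-style APIs (matching the question's find_element_by_id), and that the search endpoint's URL contains the substring "search" (check the DevTools Network tab for the real URL):

import json
import time
from selenium import webdriver

caps = webdriver.DesiredCapabilities.CHROME.copy()
caps["goog:loggingPrefs"] = {"performance": "ALL"}  # enable network event logging
driver = webdriver.Chrome(desired_capabilities=caps)

driver.get(URL)  # URL as in the question
driver.find_element_by_id("gsc-i-id2").send_keys("Arrays")
time.sleep(2)  # crude wait for the search XHR to complete

# scan the browser's network events for the search API response
for entry in driver.get_log("performance"):
    message = json.loads(entry["message"])["message"]
    if message["method"] == "Network.responseReceived":
        params = message["params"]
        if "search" in params["response"]["url"]:  # assumption: endpoint URL contains 'search'
            body = driver.execute_cdp_cmd(
                "Network.getResponseBody", {"requestId": params["requestId"]}
            )
            result_list = json.loads(body["body"])  # the JSON the API returned
            print(result_list)

Alternatively, the selenium-wire package wraps this up for you: after send_keys you can iterate driver.requests and read each request.response directly.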

How do I read the text within a <pre> in Python?

I'm trying to make a script that detects whether or not an Instagram username is taken. I found that the URL
https://www.instagram.com/{username}/?__a=1 fills with info about the account if the name exists, but if the name doesn't exist, the page will just have {} inside a <pre> and nothing else.
I'm using Requests and BeautifulSoup to scrape the page. Here is a script I wrote to test this out:
import requests
from bs4 import BeautifulSoup

username = input("Enter the username you would like to check:")
account_url = "https://www.instagram.com/" + username + "/?__a=1"
r = requests.get(account_url)
print(r.text)
Printing the text works, but even when I enter a username that doesn't exist, or a random jumble of letters, it always returns a bunch of HTML that I don't see in inspect element on the actual URL. How do I make it return just the text inside the <pre>? I just want to detect whether the site shows nothing, so I can determine whether or not the username is taken.
Also, when you load the Instagram ?__a=1 URL with a non-existent username, inspect element will say there was a 404 error, but checking the status code of the response in Python always comes back 200, which is success. I'm pretty inexperienced with Python because I haven't used it in a very long time, so some help would be greatly appreciated.
If you want a list of accounts which are not taken, you could use this:
import requests

not_taken = []
user_names = ["randomuser1", "randomuser2", "randomuser3", "etc..."]

for name in user_names:
    response = requests.get(f"https://www.instagram.com/{name}/?__a=1")
    if response.status_code == 404:
        not_taken.append(name)
Now you can use not_taken however you want, for example:
print(not_taken)
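To address the title question directly: the <pre> you see in the browser is usually just the browser's own rendering of a raw JSON response, so with requests there may be no <pre> in r.text at all; check the Content-Type header first. If the page really does wrap the payload in a <pre>, BeautifulSoup can pull it out. A minimal sketch (the username is a placeholder):

import requests
from bs4 import BeautifulSoup

r = requests.get("https://www.instagram.com/someusername/?__a=1")
print(r.headers.get("Content-Type"))  # application/json means there is no HTML to parse

soup = BeautifulSoup(r.text, "html.parser")
pre = soup.find("pre")
if pre is not None and pre.get_text(strip=True) == "{}":
    print("Empty response: the username probably doesn't exist")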

Read page source before POST

I want to know if there is a way to POST parameters after reading the page source, e.g. read a captcha before posting an ID number.
My current code:
import requests

id_number = "1"
url = "http://www.submitmyforum.com/page.php"
data = dict(id=id_number, name='Alex')
post = requests.post(url, data=data)
There is a captcha that changes after every request to http://submitforum.com/page.php (obviously not a real site). I would like to read that parameter and submit it in the data variable.
As discussed in the OP's comments, Selenium can be used; methods without browser emulation may also exist (see the requests-based sketch after the Selenium example).
Using Selenium (http://selenium-python.readthedocs.io/) instead of the requests module:
import re

from selenium import webdriver

regexCaptcha = "k=.*&co="
url = "http://submitforum.com/page.php"

# Go to the URL
browser = webdriver.Chrome()
browser.get(url)

# Example of getting page elements (using CSS selectors)
# In this example, I'm getting the Google reCAPTCHA ID if present on the current page
try:
    element = browser.find_element_by_css_selector('iframe[src*="https://www.google.com/recaptcha/api2/anchor?k"]')
    captchaID = re.findall(regexCaptcha, element.get_attribute("src"))[0].replace("k=", "").replace("&co=", "")
    captchaFound = True
    print("Captcha found!", captchaID)
except Exception:
    print("No captcha found!")
    captchaFound = False

# Treat captcha
# --> Your treatment code (should produce captcha_answer)

# Enter the captcha response on the page
captchaResponse = browser.find_element_by_id('captcha-response')
captchaResponse.send_keys(captcha_answer)

# Validate the form
validateButton = browser.find_element_by_id('submitButton')
validateButton.click()

# --> Analyse the returned page if needed
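For completeness, here is a browser-less sketch with requests and BeautifulSoup. It only works for simple captchas whose token is embedded in the page source, and the field names ('captcha_token', 'captcha_response') are assumptions; inspect the real form before relying on them:

import requests
from bs4 import BeautifulSoup

url = "http://www.submitmyforum.com/page.php"

with requests.Session() as session:  # a Session keeps cookies between the GET and the POST
    page = session.get(url)
    soup = BeautifulSoup(page.text, "html.parser")

    # assumption: the captcha token sits in a hidden input named 'captcha_token'
    token = soup.find("input", {"name": "captcha_token"})["value"]

    data = {
        "id": "1",
        "name": "Alex",
        "captcha_token": token,
        # 'captcha_response' would be the solved captcha text (assumption)
    }
    result = session.post(url, data=data)
    print(result.status_code)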

Python: No JSON object could be decoded

I'm trying to create a simple login for a "Kahoot!" quiz.
The first thing I'm trying to do is load JSON objects from "https://kahoot.it/#/" so I can fill the form in it (I tried to fill the form using 'mechanize', but it seems to support only HTML forms).
When I'm running the following script, I'm getting an exception that the JSON could not be decoded:
import urllib, json
url = "https://kahoot.it/#/"
response = urllib.urlopen(url)
data = json.loads(response.read())
print data
output:
ValueError: No JSON object could be decoded
Any ideas? Thanks.
type(response.read()) is str, representing the HTML of the page. Obviously that's not valid JSON, therefore you are getting that error.
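A quick way to confirm this is to look at the response's Content-Type before trying to decode it; here is a small sketch using requests instead of urllib (note also that the '#/' part of the URL is a fragment handled by the browser, so the server simply returns the page's HTML shell):

import requests

response = requests.get("https://kahoot.it/#/")
print(response.headers.get("Content-Type"))  # text/html..., not application/json
print(response.text[:80])                    # HTML markup, not JSON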
EDIT: If you are trying to log in to that page, it is possible with Selenium:
from selenium import webdriver

url = "https://kahoot.it/#/"
driver = webdriver.Chrome()  # or webdriver.Firefox()
driver.get(url)

# finding the text field and 'typing' the game pin
driver.find_element_by_xpath('//*[@id="inputSession"]').send_keys('your_game_pin')

# finding and clicking the submit button
driver.find_element_by_xpath('/html/body/div[3]/div/div/div/form/button').click()

Input html form data from python script

I am working on a project and I need to validate a piece of data using a third-party site. I wrote a Python script using the lxml package that successfully checks if a specific piece of data is valid.
Unfortunately, the site does not have a convenient URL scheme for its data, and therefore I cannot predict the specific URL that will contain the data for each unique request. Instead, the third-party site has a query page with a standard HTML text input that redirects to the proper URL.
My question is this: is there a way to input a value into the HTML input and submit it, all from my Python script?
Yes, there is.
Mechanize
Forms
List the forms
import mechanize

br = mechanize.Browser()
br.open(url)

for form in br.forms():
    print("Form name:", form.name)
    print(form)
Select a form
br.select_form("form1")
br.form = list(br.forms())[0]
Login form example
br.select_form("login")
br['login:loginUsernameField'] = user
br['login:password'] = password
br.method = "POST"
response = br.submit()
Selenium
Sending input
Given an element defined as:
<input type="text" name="passwd" id="passwd-id" />
you could find it using any of:
element = driver.find_element_by_id("passwd-id")
element = driver.find_element_by_name("passwd")
element = driver.find_element_by_xpath("//input[@id='passwd-id']")
You may want to enter some text into a text field:
element.send_keys("some text")
You can simulate pressing the arrow keys by using the Keys class:
from selenium.webdriver.common.keys import Keys
element.send_keys("and some", Keys.ARROW_DOWN)
These are the two packages I'm aware of that can do what you've asked.
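Putting the mechanize pieces together for your case, here is a sketch rather than a drop-in solution: the URL and the input's name attribute are placeholders you would read off the query page's form, and the result feeds straight into your existing lxml check:

import mechanize
from lxml import html

br = mechanize.Browser()
br.set_handle_robots(False)  # some sites disallow robots; use responsibly
br.open("http://thirdparty.example.com/query")  # placeholder URL

br.select_form(nr=0)               # assumption: the query form is the first on the page
br["query"] = "value-to-validate"  # 'query' is a placeholder input name
response = br.submit()             # follows the redirect to the data page

tree = html.fromstring(response.read())
# ... run your existing lxml validation against `tree`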

requests.get doesn't work in Python scraper

Hi, I am trying to make this basic scraper work. It should go to a website, fill in "city" and "area", search for restaurants, and return the HTML page.
This is the code I'm using:
import requests
from collections import OrderedDict
from bs4 import BeautifulSoup

payload = OrderedDict([('cityId', 'NewYork'), ('area', 'Centralpark')])
req = requests.get("http://www.somewebsite.com", params=payload)
f = req.content
soup = BeautifulSoup(f, "html.parser")
(The question included a screenshot of the page's source HTML showing the search form.)
When I check the resulting soup variable, it doesn't contain the search results; instead it contains only the data from the first page, which has the form for entering the city and area values (i.e. www.somewebsite.com; what I want is the results of www.somewebsite.com?cityId=NewYork&area=centralPark). Do I have to pass something along with the params to explicitly press the search button, or is there another way to make this work?
First, check whether you can visit the URL in a web browser and get the correct result.
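If a plain GET with those params returns the form page, the search form most likely submits via POST or uses different parameter names. Here is a sketch of both options; the endpoint path '/search' and the parameter names are assumptions that you should verify against the form's action and method attributes (or the browser's network tab):

import requests
from bs4 import BeautifulSoup

params = {"cityId": "NewYork", "area": "Centralpark"}

# if the form's method is GET, the parameters go in the query string
resp = requests.get("http://www.somewebsite.com/search", params=params)

# if the form's method is POST, send them as form data instead:
# resp = requests.post("http://www.somewebsite.com/search", data=params)

soup = BeautifulSoup(resp.content, "html.parser")
print(soup.title)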
