I need to test a few things from my Linux server. I wonder whether it is possible to perform actions on a web page without access to a browser. I only have command-line access to the server. The only tool I know for this is Selenium WebDriver, but that requires a browser.
What I want to do:
1) Input text into a textbox on a webpage using a Python script that lives on the Linux server
2) Click a button on the webpage
In general, is it possible to drive actions on a webpage from Linux using Python scripts?
If you input text into a form on a webpage and submit the form, the browser sends a POST or GET request to the server containing that information. The server then processes (for example, saves) the information. You don't need a browser to send HTTP requests; you can send them directly from Python.
An example can be found here: How to simulate HTTP post request using Python Requests module?
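For instance, a minimal sketch of posting form data with requests (the URL and field names are placeholders, not taken from any real page):

import requests

# Placeholder endpoint and form field; copy the real action URL and
# input names from the target page's <form> markup.
response = requests.post(
    "https://example.com/submit",
    data={"textbox": "some text"},
)
response.raise_for_status()  # fail loudly on HTTP errors
print(response.status_code)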
If the requests module isn't enough, try using Selenium with PhantomJS. PhantomJS is a headless WebKit browser scriptable with a JavaScript API.
PhantomJS: http://phantomjs.org
A great tutorial: https://realpython.com/headless-selenium-testing-with-python-and-phantomjs/
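A rough sketch of the PhantomJS route, assuming an older (pre-4.x) Selenium release that still ships PhantomJS support and a phantomjs binary on PATH; the URL and element ids are placeholders:

from selenium import webdriver

# Assumption: Selenium < 4 with PhantomJS support, phantomjs on PATH.
driver = webdriver.PhantomJS()
driver.get("https://example.com/form")  # placeholder URL

# Placeholder element ids; use the real ones from the page.
driver.find_element_by_id("textbox").send_keys("hello world")
driver.find_element_by_id("submit-button").click()

print(driver.page_source[:200])
driver.quit()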
Are there any alternatives to Selenium that don't require a web driver or browser to operate? I recently moved my code over to a Google Cloud VM instance, and when I run it there are multiple errors. I've been trying to get it to work for hours but just can't (no luck with PhantomJS, Chrome and GeckoDriver; I tried re-downloading browsers, editing the sources.list file, etc.).
The page I'm web scraping uses JavaScript to load in numbers, which is why I initially chose Selenium. Everything else works perfectly though!
You could simply use the requests library.
https://requests.readthedocs.io/en/master/
https://anaconda.org/anaconda/requests
You would then need to send a GET or POST request to the server.
If you do not know how to generate a proper POST request, simply try to "record" it.
If you have Chrome, go to the page you want to navigate, press F12, open the "Network" tab and type method:POST into the filter.
Further info here:
https://stackoverflow.com/a/39661536/11971785
At first it is a bit more confusing than Selenium, but once you understand it, it's way better in my opinion.
Also, the JavaScript-loaded values shown on the page can usually be read straight out of the JavaScript code returned by your request.
No web driver or anything else is required, and it is a lot more stable and customizable.
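Once you have recorded the request, replaying it from Python might look roughly like this (every header, cookie and field value below is a placeholder for what you captured in the "Network" tab):

import requests

# All values are placeholders; substitute the ones you recorded.
session = requests.Session()
session.headers.update({
    "User-Agent": "Mozilla/5.0",
    "Referer": "https://example.com/form",
})
response = session.post(
    "https://example.com/api/submit",
    data={"field": "value"},
)
print(response.status_code, response.text[:200])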
I am using Python with Selenium WebDriver to browse through a website.
Now I have the problem that I want to monitor the XHR AJAX calls fired on the current site.
For example: I am on a specific page and the Python Selenium script clicks a button on this page. The button triggers an AJAX request within the site.
Is it possible to monitor this XHR AJAX request and get hold of the called AJAX URL in my Python script?
Thanks!
UPDATE: I want to do exactly something like this (but in Python, obviously):
https://dmitripavlutin.com/catch-the-xmlhttp-request-in-plain-javascript/
You can use browser.execute_script to catch the calls, as explained in the link you mentioned. In addition, start a fake website using Django on a separate thread, and in the JavaScript handler (sendReplacement), replace the URL with that of your Django server.
On that server you will receive the AJAX call and be able to examine it.
You may be able to implement a simpler solution without the Django server by simply monitoring the calls directly from the JavaScript snippet and making it return the value you want. But if you need to monitor many calls and perform more complex examination of the requests, the former solution is more powerful.
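For the simpler, Django-free variant, here is a hedged sketch (Selenium 4 API assumed; the button id is a placeholder) that patches XMLHttpRequest from Python and reads the recorded calls back:

import time
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()  # assumes chromedriver is on PATH
driver.get("https://example.com")  # placeholder URL

# Monkey-patch XMLHttpRequest.open so every AJAX method/URL pair
# is recorded in a global array we can read back from Python.
driver.execute_script("""
    window._xhrLog = [];
    var origOpen = XMLHttpRequest.prototype.open;
    XMLHttpRequest.prototype.open = function(method, url) {
        window._xhrLog.push({method: method, url: url});
        return origOpen.apply(this, arguments);
    };
""")

# Placeholder button id; clicking it triggers the AJAX request.
driver.find_element(By.ID, "some-button").click()
time.sleep(2)  # crude wait; a WebDriverWait would be more robust

calls = driver.execute_script("return window._xhrLog;")
print(calls)  # e.g. [{'method': 'GET', 'url': '/api/data'}]
driver.quit()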
Is there any way to get network data from the browser's debug console or Firebug into a Python script automatically and simultaneously? The browser opens some websites and I see only a simple, short URL instead of the full URL with parameters. I can find all this data in the debug console manually, but how can I reach it using Python and Selenium?
I am a novice in Python (a C++ developer) trying to get some hands-on experience with web scraping on Windows IE.
The problem I am facing is that when I open a URL using the "requests" library, the server always sends me a login page. I figured out the problem: the server presumes you are coming through IE and tries to execute a function that uses information from the SSO (single sign-on) object running in the background on Windows since the first login to the web server (consider this a somewhat unusual setup).
On observing this I changed my strategy and started using the webbrowser lib.
Now, when I call webbrowser.open("url"), the browser opens the page properly, which is great!
But, my problems now are :
1) I do not want the opened browser page to be visible to the user (i.e., the browser should open in the background). I tried to use this:
ie = webbrowser.BackgroundBrowser(webbrowser.iexplore)
ie.Visible = 0
ie.open('url')
but with no success: the page still opens visibly to the user.
2) [This is the main activity] I want to scrape the page opened in the IE window above. How can I do that?
I tried to dig into this link but did not find any APIs for getting the data.
Kindly help.
PS: I tried using Beautiful Soup for scraping some other web pages with requests. It was successful and I got the data I wanted. But not in this case.
The webbrowser module doesn't let you do that. The get function you mentioned retrieves registered web browsers; it does not scrape the result of an HTTP GET request.
I don't know what is triggering the behavior you described with IE. Have you tried changing your User-Agent to an IE one? You can check this post for more details: Sending "User-agent" using Requests library in Python
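A brief sketch of that idea (the User-Agent string and URL here are illustrative, not guaranteed to match what the server checks):

import requests

# Illustrative IE 11 User-Agent string; adjust to whatever the
# server actually expects to see.
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Trident/7.0; rv:11.0) like Gecko",
}
response = requests.get("https://example.com/protected", headers=headers)
print(response.status_code)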
I am currently trying to write a small bot for a banking site that doesn't supply an API. Nevertheless, the security of the login page seems a little more ingenious than I'd have expected: even though I don't see any significant difference between Chrome and Python, it doesn't let requests made from Python through (I accounted for things such as headers and cookies).
I've been wondering if there is a tool to record requests in Firefox/Chrome/any browser and replicate them in Python (or any other language)? Think Selenium, but without the overhead of Selenium :p
You can use Selenium WebDriver to have an actual browser make the requests for you.
In such cases, I usually inspect the request made by Chrome in the DevTools "Network" tab. Then I right-click the request and copy it as cURL to run it on the command line and see if it works. If it does, I can be certain the same request can be made with Python's requests package.
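As a hedged illustration, a copied cURL command translates to requests fairly mechanically (the URL, header, cookie and body below are invented placeholders):

# Copied from DevTools as roughly:
#   curl 'https://example.com/api' -H 'Accept: application/json' \
#        -b 'session=abc123' --data 'q=test'
import requests

response = requests.post(
    "https://example.com/api",
    headers={"Accept": "application/json"},
    cookies={"session": "abc123"},
    data={"q": "test"},
)
print(response.status_code, response.text[:200])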
Look into PhantomJS or CasperJS. PhantomJS is a complete headless browser that can be programmed using JavaScript.