Retrieving all cookies from IE using Python

I've been writing automated tests with Selenium WebDriver 2.45 in Python. To get through some of the things I need to test, I must retrieve the various JSESSION cookies that are generated by the site. When I use WebDriver's get_cookies() function with Firefox or Chrome, all of the needed cookies are returned to me. When I do the same thing with IE11, I do not see the cookies that I need. Does anyone know how I can retrieve session cookies from IE?

What you describe sounds like an issue I ran into a few months ago. My tests ran fine with Chrome and Firefox but not in IE, and the problem was cookies. Upon investigation, I found that my web site had set its session cookies to be HTTP-only. When a cookie has this flag turned on, the browser will still send the cookie over HTTP(S) and allow the server to set it in responses, but it will make the cookie inaccessible to JavaScript. (Which is consistent with your comment that you cannot see the cookies you want in document.cookie.) It so happens that when you use Selenium with Chrome or Firefox, Selenium is able to ignore this flag and obtain the cookies from the browser anyway. However, it cannot do the same with IE.
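For reference, here is a minimal sketch of the retrieval call involved (the URL is hypothetical); with Chrome or Firefox, the HTTP-only JSESSION cookies show up in the list, while with the IE driver they are missing:

from selenium import webdriver

driver = webdriver.Chrome()  # or webdriver.Firefox() / webdriver.Ie()
driver.get("https://your-site.example.com/login")  # hypothetical URL
for cookie in driver.get_cookies():
    # 'httpOnly' is not present in every Selenium version, hence .get()
    print(cookie["name"], cookie.get("httpOnly"))
driver.quit()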
I worked around this issue by turning off the HTTP-only flag when running my site in testing mode. I use Django for my server, so I had to create a special test_settings.py file with SESSION_COOKIE_HTTPONLY = False in it.
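A minimal sketch of what that test_settings.py can look like; the base module name myproject.settings is from my setup, not something Django mandates:

# test_settings.py -- used only when running the test suite
from myproject.settings import *  # inherit everything from the normal settings

# Let JavaScript (and Selenium's IE driver) see the session cookie during tests
SESSION_COOKIE_HTTPONLY = False

You then point Django at it, for example with DJANGO_SETTINGS_MODULE=myproject.test_settings, when starting the server for the test run.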

There is an open issue with IE and Safari: those drivers will not return correct cookie information, or at least not the domain. See this

Related

Get cookies from a website using Python (Safari)

I'm trying to get cookies from a website. After reading a bit on this topic in other Stack Overflow posts, I came up with the code below, since the other snippets did not work either.
import requests
s = requests.Session()
print(s.get("https://instagram.com").cookies.get_dict())
Unfortunately, it returns an empty dictionary.
I already tried browser_cookie3, but it either did not work or does not support Safari.
Am I missing something important?
You are getting the cookies from your requests.Session(), and requests will not execute any JavaScript code, so cookies that are set client-side are never created. That is why you are getting an empty dictionary.
If the cookies were set by the server itself (via Set-Cookie response headers), you would be able to read them.
As for browser_cookie3: it currently supports Chrome, Firefox, Opera, Edge, and Chromium.
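Since Safari is not supported, a minimal sketch of the browser_cookie3 route with a supported browser (Chrome here, with the domain taken from the question) would look like this:

import browser_cookie3
import requests

# Read the Instagram cookies stored by the local Chrome profile
cookiejar = browser_cookie3.chrome(domain_name="instagram.com")
response = requests.get("https://instagram.com", cookies=cookiejar)
print(response.status_code)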

Selenium, PhantomJS & Puppeteer: Your browser does not support iframe

Long story short, all I am trying to do is scrape the contents of a certain page. Unfortunately, the specific info I need on that page is inside an iframe, and I have tried several headless browser options, all yielding the same response, which is HTML displaying:
<iframe>Your browser does not support iframe</iframe>
In Python I have tried both Selenium (even with the --web-security=no & --disable-web-security flags) and PhantomJS (so I know it's not JavaScript related), and in NodeJS I've tried Puppeteer; none of them are working...
Is there anything else out there I can try that may work?
Also, no, a direct GET request is useless, because the page detects that it's not a real user and loads nothing at all, regardless of user agent and so on, so I really need a browser solution, preferably a headless one.
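For anyone hitting the same fallback text, it is worth ruling out the simple case first: Selenium only exposes an iframe's content after you switch into it. A minimal sketch, with a hypothetical URL and the first iframe on the page:

from selenium import webdriver

driver = webdriver.Chrome()
driver.get("https://example.com/page-with-iframe")  # hypothetical URL
# Switch into the iframe before reading its contents
driver.switch_to.frame(driver.find_element_by_tag_name("iframe"))
print(driver.page_source)
driver.quit()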

Selenium Alternatives for Google Cloud VM instance?

Are there any alternatives to Selenium that don't require a web driver or browser to operate? I recently moved my code over to a Google Cloud VM instance, and when I run it there I get multiple errors. I've been trying to get it to work for hours with no luck (neither PhantomJS, Chrome, nor GeckoDriver worked; I tried re-downloading the browsers, editing the sources.list file, etc.).
The page I'm web scraping uses JavaScript to load in numbers, which is why I initially chose Selenium. Everything else works perfectly though!
You could simply use the requests library.
https://requests.readthedocs.io/en/master/
https://anaconda.org/anaconda/requests
You would then need to send a GET or POST request to the server.
If you do not know how to generate a proper POST request, simply try to "record" one.
If you have Chrome, go to the page you want to navigate to, press F12, open the "Network" section and type method:POST into the filter.
Further info here:
https://stackoverflow.com/a/39661536/11971785
At first it is a bit more confusing than Selenium, but once you understand it, it's way better in my opinion.
Also, the JavaScript-rendered values shown on the page can usually be read straight out of the JavaScript or JSON returned by your request, as sketched below.
No web driver or anything else is required, and it's a lot more stable and customizable.
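As a rough sketch of replaying a recorded POST request with requests (the URL, payload and headers below are placeholders, to be replaced with the values you recorded in the "Network" tab):

import requests

url = "https://example.com/api/numbers"   # placeholder URL
payload = {"page": 1}                     # copy the real form data from dev tools
headers = {"User-Agent": "Mozilla/5.0"}   # copy the real headers from dev tools

response = requests.post(url, data=payload, headers=headers)
response.raise_for_status()
print(response.text)  # often JSON that response.json() can parse directly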

How come the Chrome driver works, but Firefox, PhantomJS and HTMLUnit do not?

I am trying to scrape a dynamic-content (JavaScript) page with Python + Selenium + BS4, and the page blocks my requests at random (the WAF might be F5 ASM).
I managed to bypass this by changing the user agent for each of the browsers I specified. The thing is, only the Chrome driver can get past the rejection. The same code, adjusted for the PhantomJS or Firefox drivers, is blocked constantly, as if I were not changing the user agent at all.
I should mention that I am also multithreading, meaning that I start 4 browsers at the same time.
Why does this happen? What does the Chrome webdriver have to offer that gets past the firewall when the rest don't?
I really want to change to Firefox, so I need to make Firefox pass just as Chrome does.
Two words: browser fingerprinting. It's a huge topic in its own right and, as Tarun mentioned, it would take a decent amount of research to nail this issue down. But I believe it is possible.
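If the user agent is the part of the fingerprint you want to experiment with, the Firefox driver takes it as a profile preference. A minimal sketch (the UA string is just an example value, and fingerprinting checks far more than this one header):

from selenium import webdriver

profile = webdriver.FirefoxProfile()
# Override the user agent; the string below is only an example value
profile.set_preference("general.useragent.override",
                       "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Firefox/60.0")
driver = webdriver.Firefox(firefox_profile=profile)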

Easily replicate browser requests with python?

I am currently trying to write a small bot for a banking site that doesn't supply an API. Nevertheless, the security of the login page seems a little more ingenious than I'd have expected: even though I don't see any significant difference between Chrome and Python, it doesn't let requests made from Python through (I accounted for things such as headers and cookies).
I've been wondering: is there a tool to record requests in Firefox/Chrome/any browser and replicate them in Python (or any other language)? Think Selenium, but without the overhead of Selenium :p
You can use Selenium web drivers to actually use browsers to make the requests for you.
In such cases, I usually check out the request made by Chrome in my dev tools' "Network" tab. Then I right-click on the request and copy it as cURL to run it on the command line and see if it works. If it does, I can be fairly certain the same thing can be achieved using Python's requests package.
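As an illustration of that workflow, a copied cURL command translates into requests roughly like this (every value below is a placeholder for what you actually recorded):

import requests

session = requests.Session()  # a Session keeps cookies across the login flow
session.headers.update({
    "User-Agent": "Mozilla/5.0",                  # placeholder header
    "Referer": "https://bank.example.com/login",  # placeholder header
})
response = session.post(
    "https://bank.example.com/login",                # placeholder URL
    data={"username": "user", "password": "pass"},   # placeholder form fields
)
print(response.status_code)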
Look into PhantomJS or CasperJS. PhantomJS is a complete headless browser that can be programmed using JavaScript.
