Request processing time in Python

I'm trying to test a web application using Selenium with Python. I've written a script to mimic a user: it logs in to the server, generates some reports, and so on. It is working fine.
Now, I need to see how much time the server is taking to process a specific request.
Is there a way to find that from the same python code?
Any alternate method is acceptable.
Note:
The server is on the same LAN.
Also, I don't have privileges to do anything on the server side, so anything I do has to be from outside the server.
Any sort of help is appreciated. Thank you.
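For reference, measuring the client-observed time of a single request from Python is straightforward. Here is a minimal sketch using the requests library (the URL is a placeholder); note that this measures the full round trip, network latency included, not just the server's processing time:

import time
import requests

start = time.time()
resp = requests.get("http://server.example.lan/report")   # placeholder URL
print("round trip: %.3f s" % (time.time() - start))
print("requests' own measure:", resp.elapsed)             # time until response headers arrived

requests stores the time between sending the request and the arrival of the response headers in resp.elapsed, which on a LAN is usually close to the server's actual processing time.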

Have you considered the W3C HTTP access log field time-taken? It reports, for every single request, the time taken, typically at millisecond precision; on some platforms the reported precision is finer. For a web server, an application server with an HTTP access layer, or an enterprise service bus with an HTTP access layer (for SOAP and REST calls) to be fully W3C standards compliant, this log value must be available for inclusion in the HTTP access logs.
You will see every single request and the time required to process it, measured from the first byte received at the server to the last byte sent, minus the final TCP ACK.
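For illustration, pulling time-taken out of a W3C extended log is simple once you have the #Fields: directive that declares the layout. A minimal sketch with made-up, IIS-style log content:

fields_line = "#Fields: date time cs-method cs-uri-stem sc-status time-taken"
log_line = "2013-09-20 10:15:02 GET /reports/run 200 5321"

fields = fields_line.split()[1:]              # drop the "#Fields:" token
record = dict(zip(fields, log_line.split()))
print(record["time-taken"] + " ms")           # -> 5321 ms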

Related

I want to know about server communication principles

Watching the communication between the server and the client with Fiddler, I saw that a single click in Chrome produced dozens of exchanges with the server.
In some cases, data from the response to the first request is included in the second request, and I wonder how that information is extracted and carried into the next request.
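This is typically done by extracting a value (a session ID, a CSRF token, and so on) from the first response and sending it along with the next request. A minimal sketch of the pattern using the requests library; the URLs and the token field name are made up:

import re
import requests

session = requests.Session()                  # also carries cookies across requests
first = session.get("http://example.com/form")
token = re.search(r'name="csrf_token" value="([^"]+)"', first.text).group(1)
second = session.post("http://example.com/submit", data={"csrf_token": token})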

How to handle a heavy request on a server?

I don't know if this is the right place to ask, but I am desperate for an answer.
The problem here is not the number of requests but the amount of time a single request takes. For each request, the server has to query about 12 different sources, and it can take up to 6 hours to get the data. (Let's leave request timeouts out of this, because this server does not communicate with the client directly: it fetches messages from Kafka and then starts gathering data from the sources.) I am supposed to come up with a scalable solution. Can anyone help me with this?
The problem doesn't end there:
Once the server gets the data, it has to push it to Kafka for further computation using Spark; the Streaming API will be used in this part.
I am open to any web framework or any scaling solution in Python.
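One common shape for this, sketched with the kafka-python package (the topic names, broker address, pool size, and the fetch_all_sources function are assumptions, not from the question): consume jobs from Kafka, run the slow multi-source fetches in a worker pool so the consumer isn't blocked for hours, and publish results back to Kafka for Spark Streaming to pick up.

from concurrent.futures import ThreadPoolExecutor
from kafka import KafkaConsumer, KafkaProducer

consumer = KafkaConsumer("jobs", bootstrap_servers="localhost:9092",
                         group_id="fetchers")            # hypothetical topic/broker
producer = KafkaProducer(bootstrap_servers="localhost:9092")
pool = ThreadPoolExecutor(max_workers=12)                # e.g. one worker per source

def handle(job_bytes):
    data = fetch_all_sources(job_bytes)                  # hypothetical 6-hour fetch; must return bytes
    producer.send("results", data)                       # Spark Streaming reads this topic

for msg in consumer:                                     # dispatch each job and move on
    pool.submit(handle, msg.value)

Scaling out is then mostly a matter of running more consumer processes in the same consumer group; Kafka distributes the topic's partitions among them.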

Web History/Connection Logger

I'm thinking of writing an application that, while running, keeps track of all the websites I visit and the connections I make.
Basically like a browser history, but I want to do it in a way that utilizes network concepts.
I only have a rudimentary understanding of HTTP, but would I be able to listen in on HTTP GET requests from the browser and automatically pull information whenever a request is made? If anyone can give me a suggestion or an outline of how this can be done, so I can research implementing it, that would be very helpful. I'm thinking of implementing it in Python, and my operating system is Ubuntu.
Thank you very much.
You could do that by implementing a proxy.
In your case, basically, an agent that sits between your browser and the internet: the proxy receives the request from the client and forwards it to the remote server; when the remote server replies, the proxy sends the response back to the client.
To extract the information you want, reading the HTTP RFC will be helpful.
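A minimal sketch of such a logging proxy in Python (plain HTTP only, no HTTPS CONNECT handling, and it naively assumes each request arrives in a single recv; the listen port is arbitrary):

import socket
import threading

def handle(client):
    request = client.recv(65536)                  # naive: assumes one recv suffices
    if not request:
        client.close()
        return
    request_line = request.split(b"\r\n", 1)[0].decode("latin-1")
    print("LOG:", request_line)                   # e.g. "GET http://host/path HTTP/1.1"
    host = None
    for line in request.split(b"\r\n"):           # find the Host header to forward to
        if line.lower().startswith(b"host:"):
            host = line.split(b":", 1)[1].strip().decode("latin-1")
            break
    if host is None:
        client.close()
        return
    name, _, port = host.partition(":")
    upstream = socket.create_connection((name, int(port) if port else 80))
    upstream.sendall(request)
    while True:                                   # relay the response back verbatim
        chunk = upstream.recv(65536)
        if not chunk:
            break
        client.sendall(chunk)
    upstream.close()
    client.close()

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
server.bind(("127.0.0.1", 8888))                  # point the browser's proxy setting here
server.listen(5)
while True:
    conn, _ = server.accept()
    t = threading.Thread(target=handle, args=(conn,))
    t.daemon = True
    t.start()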

Python App Engine's urllib2: works locally but not when deployed to GAE

I have an app that worked well on both GAE and the test server until a few days ago. It connects to a remote site, logs in, browses pages, and inputs information automatically. The remote site uses dynamic URLs to track the session; each page gives the link for the next call.
The program is very basic: urllib2.urlopen, then a regexp to extract the next URL key, then a new call to urllib2.urlopen, and so on.
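In outline, something like this (a sketch only; the site, the URL-key format, and the regexp are made up):

import re
import urllib2

BASE = "http://remote.example.com"                    # hypothetical remote site

html = urllib2.urlopen(BASE + "/login?user=u&pass=p").read()
for _ in range(3):                                    # follow a few dynamically generated links
    m = re.search(r'href="(/page\?key=[0-9a-f]+)"', html)
    if m is None:
        raise RuntimeError("next URL key not found; session probably lost")
    html = urllib2.urlopen(BASE + m.group(1)).read()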
Now my app still works perfectly on the test server but fails when deployed on GAE: I have a series of calls to urllib2.urlopen, and most of the time the remote site says it has already lost the session on the second call, though about one time in ten I can get to the third call, and once GAE made it successfully to the fourth call.
This seems to indicate that it is not a security issue with the remote site (which has not changed), nor the redirect-and-cookie problem I have read about in other posts.
Users reported to me that it worked well until the 14th of Sept 13, and the failure was reported to me on the 20th. Was there a change in the handling of URLFetch in GAE recently?
I've just spent 2 days on the problem with no tangible clue.
Could it be a question of IP address? The remote server might track the session with the IP address and the dynamic URL together, and I can imagine that GAE does not guarantee that, within a single request to GAE, all urllib2 calls are handled by the same machine. This could explain why it sometimes works for two or three successive URLs. I don't know enough about GAE's internals to confirm.
Thank you in advance for your ideas.
We make no guarantees that urlfetch calls will all go out on the same IP address.

Google App Engine URL Fetch Doesn't Work on Production

I am using google app engine's urlfetch feature to remotely log into another web service. Everything works fine on development, but when I move to production the login procedure fails. Do you have any suggestions on how to debug production URL fetch?
I am using cookies and other headers in my URL fetch (I manually set up the cookies within the header). One of the cookies is a session cookie.
There is no error or exception. On production, posting the login returns the session cookies, but when you request a page using those session cookies, they are ignored and you are prompted for login information again. On development, once you get the session cookies, you can access the internal pages just fine. I thought the problem was related to saving the cookies, but they look correct, and the requests are nearly identical.
This is how I call it:
fetchresp = urlfetch.fetch(url=req.get_full_url(),
                           payload=req.get_data(),
                           method=method,
                           headers=all_headers,
                           allow_truncated=False,
                           follow_redirects=False,
                           deadline=10)
Here are some guesses as to the problem:
The distributed nature of google's url fetch implementation is messing things up.
On production, headers are sent in a different order than in development, perhaps confusing the server.
Some of google's servers are blacklisted by the target server.
Here are some hypotheses that I've ruled out:
Google caching is too aggressive. But I still get the problem after turning off cache by using the header Cache-Control: no-store.
Google's urlfetch is too fast for the target server. But I still get the problem after inserting delays between calls.
Google appends some data to the User-Agent header. But I have added that header to development and I don't get the problem.
What other differences are there between the production URL fetch and the development URL fetch? Do you have any ideas for debugging this?
UPDATE 2
(First update was incorporated above)
I don't know if it was something I did (maybe adding delays or disabling caches mentioned above) but now the production environment works about 50% of the time. This definitely looks like a race condition. Unfortunately, I have no idea if the problem is in my code, google's code, or the target server's code.
As others have mentioned, the key differences between dev and prod are the originating IP and how some of the request headers are handled. See here for a list of restricted headers. I don't know if this is documented, but in prod your app ID is appended to the end of your User-Agent. I had an issue once where requests in prod (but not in dev) were getting detected as a search engine spider, because my app ID contained the string "bot".
You mentioned that you're setting up cookies manually, including the session cookie. Does this mean that you established a session in dev and are now trying to re-use it in prod? Is it possible that the remote server logs the source IP that establishes a session and requires that subsequent requests come from the same IP?
You said that it doesn't work, but you don't get an exception. What exactly does this mean? You get an HTTP 200 and an empty response body? Another HTTP status? Your best bet may be to contact the owners of the remote service and see if they can tell you more specifically what was wrong with your request. Anything else is just speculation.
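For example, logging the full urlfetch response makes the failure mode visible in the App Engine logs (a sketch; the URL and payload are placeholders):

import logging
from google.appengine.api import urlfetch

login_url = "http://remote.example.com/login"         # placeholder
resp = urlfetch.fetch(url=login_url, payload="user=u&pass=p",
                      method=urlfetch.POST, follow_redirects=False)
logging.info("status=%s", resp.status_code)
logging.info("headers=%s", resp.headers)
logging.info("body starts: %r", resp.content[:200])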
Check your server's logs to see if GAE is chopping any headers off. I've noticed that GAE (though I think I've seen it on the dev server too) will drop headers it doesn't like.
Depending on the web service you're calling, it might also be less tolerant of being called from GAE than from your local machine.
I ran across this issue while making a webapp with an analogous problem: looking at urlfetch's documentation, it turns out that the maximum deadline for a fetch call is 60 seconds, but it defaults to 5 seconds.
Five seconds was long enough to fetch my URLs on my local machine, but on GAE the fetch consistently completed within 5 seconds only about 20% of the time.
I included the parameter deadline=60 and it has been working fine since.
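Concretely, raising the deadline looks like this (the URL is a placeholder for the slow endpoint):

from google.appengine.api import urlfetch

result = urlfetch.fetch(
    url="http://slow.example.com/report",     # placeholder
    deadline=60,                              # the maximum; the default is 5 seconds
)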
Hope this helps others!
