Cloud Run: how do I set check_same_thread=False?

My project, which runs on Cloud Run in Google Cloud Platform (GCP), generated this error for hours before going back to normal by itself: SQLite objects created in a thread can only be used in that same thread. The object was created in thread id 68387105408768 and this is thread id 68386614675200.
Our code is written in Python with Flask, and no SQLite is involved. I saw suggestions to set check_same_thread to False. May I know where I can set this in Cloud Run or GCP? Thanks.

That setting has nothing to do with your runtime environment; it is set when a sqlite connection is initialized (https://docs.python.org/3/library/sqlite3.html#module-functions). So if, as you say, you aren't creating a sqlite connection, it won't help you much.
That being said, I find it hard to believe that you are getting that error without using sqlite. More likely, you are using sqlite via some dependency.
Since sqlite3 is part of Python's standard library, it may not be trivial to figure out which dependency uses it.
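For reference, this is where the flag actually lives; a minimal sketch, assuming a direct sqlite3 connection does exist somewhere in your dependency tree:

    import sqlite3

    # check_same_thread is an argument to sqlite3.connect(), not a Cloud Run
    # or GCP setting. Passing False allows the returned connection to be used
    # from threads other than the one that created it; you then have to
    # serialize access to it yourself.
    conn = sqlite3.connect("app.db", check_same_thread=False)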

Related

Dataflow jobs fail due to beam-nuggets referencing sqlalchemy

We have created an ETL in GCP which reads data from MySQL and migrates it to BigQuery. To read data from MySQL, we use the beam-nuggets library. This library is passed as an extra package ('--extra_package=beam-nuggets-0.17.1.tar.gz') to the Dataflow job. Cloud Functions were used to create the Dataflow job. The code was working fine: the Dataflow job got created and the data migration was successful.
After the latest version of sqlalchemy (1.4) was released, we were unable to deploy the Cloud Function. The deployment failed with the exception mentioned below.
To fix this issue, we tried pinning the previous version of sqlalchemy (1.3.23) in the requirements.txt file of the Cloud Function. This resolved the issue and the Cloud Function deployed successfully. But when we triggered the Dataflow job from the Cloud Function, we got the same error as mentioned above.
This issue occurs because the beam-nuggets library internally references sqlalchemy at runtime, and the job fails with the same error. Is it possible to manually force beam-nuggets to pick a specific version of sqlalchemy?
Try passing a specific version of sqlalchemy via the extra_package flag as well.
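A sketch of what that could look like, assuming you have downloaded the pinned sdist first (e.g. with pip download sqlalchemy==1.3.23 --no-deps --no-binary :all:); the archive file names and any other pipeline options are placeholders:

    from apache_beam.options.pipeline_options import PipelineOptions

    # --extra_package can be repeated; ship a pinned sqlalchemy sdist to the
    # Dataflow workers alongside beam-nuggets so it takes precedence over 1.4.
    options = PipelineOptions(
        flags=[
            "--extra_package=beam-nuggets-0.17.1.tar.gz",
            "--extra_package=SQLAlchemy-1.3.23.tar.gz",
        ]
    )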

pgAdmin does not open in AWS EC2

I am trying to deploy a Django-based website on AWS EC2. I have successfully created an instance and installed my Python libraries there. I am using Postgres, and I have installed it along with pgAdmin, but for some reason pgAdmin does not open. It just displays that it's starting up the server, but it never opens at all.
I am new to this, so I do not know much about it. Can someone please help or explain why it does not open up?
You will need to check the log named 'pgadmin4.startup' in C:\Users\Administrator\AppData\Local.
A lot of the time, removing the instance and recreating it works, but without seeing the logs it's hard to tell what the issue might be. It could also be worth making the instance a bit beefier, as pgAdmin uses a good amount of CPU and memory.

Suppressing multi-threading in used libraries?

EDIT:
I ended up using a workaround to get the behaviour I wanted.
Disabling threading in the SSHTunnel, as suggested in the accepted answer, helped me pin down the problem.
I have a Python project that does a few things, mostly ETL.
It works fine when I run it locally, and works fine when I stuff it into a Docker container and run that locally, but it deadlocks 80% of the way in when I run that Docker container in the cloud.
When I manually kill the process I get the error linked below, suggesting it is a threading issue. I'm not explicitly using threading anywhere in my code (and am no expert on the subject) and assume one of the libraries I'm using employs threading internally.
The idea I had to resolve this problem is to somehow suppress all threading that is happening in the function calls of the libraries I use.
Is there a catch-all way to do that in Python?
Steps of the program include moving PostgreSQL data into Google BigQuery, then fetching data from BigQuery (including the new data), creating an Excel report out of that data, and emailing it out.
Pandas DataFrames are used for the internal representation and for easy upload to GBQ using the to_gbq method (a rough sketch of that call follows below).
sqlalchemy and sshtunnel are used to extract data from the PostgreSQL database.
Openpyxl is used for the Excel editing.
The whole thing takes less than a minute to run locally (either in- or outside of a docker container) and manually calling each of the steps separately on the server also works fine.
(The referenced cloud deployment is on a Google Cloud VM instance)
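For context, the BigQuery upload step looks roughly like this (a sketch; the destination table and project id are placeholders):

    import pandas as pd

    # Placeholder frame standing in for the rows extracted from PostgreSQL.
    df = pd.DataFrame({"id": [1, 2], "value": ["a", "b"]})

    # Upload via pandas-gbq; append new rows instead of replacing the table.
    df.to_gbq(
        destination_table="reporting.daily_metrics",
        project_id="my-gcp-project",
        if_exists="append",
    )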
I can't think of any way to globally disable threading; at least not without breaking every piece of code that would use it.
Judging by the traceback, I assume you are using SSHTunnelForwarder from the sshtunnel package. This class takes a boolean argument threaded with True as a default value.
Instantiating SSHTunnelForwarder with threaded=False will disable the use of the _ThreadingForwardServer in favor of the _ForwardServer. This forward server is not using the socketserver.ThreadingMixIn, which is where your block seems to be surfacing. So, that should fix your problem.
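Concretely, something like this (a sketch; hosts, ports, and credentials are placeholders):

    from sshtunnel import SSHTunnelForwarder

    tunnel = SSHTunnelForwarder(
        ("bastion.example.com", 22),
        ssh_username="etl-user",
        ssh_pkey="/path/to/private_key",
        remote_bind_address=("db.internal", 5432),
        threaded=False,  # use the single-threaded _ForwardServer
    )
    tunnel.start()
    try:
        # Point sqlalchemy/psycopg2 at the local end of the tunnel.
        print("tunnel listening on 127.0.0.1:%d" % tunnel.local_bind_port)
    finally:
        tunnel.stop()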
However, I'd be curious to know why your project blocks in the cloud context. Judging by the output in your screenshot, the whole thing seems to be almost complete and just hangs when shutting down the tunnel forwarder. The maintainers of the sshtunnel package surely made threading the default for a reason. I'd stick to that default if at all possible, but that's just me :)

Slow page loading on apache when using Flask

The Issue
I am using my laptop with Apache to act as a server for a local project involving TensorFlow and Python, which uses an API written in Flask to service GET and POST requests coming from an app and maybe another user on the local network. The problem is that the initial page keeps loading whenever I import tensorflow or the object detection folder (within the research folder of the TensorFlow GitHub repository), and it never seems to finish doing so, effectively getting stuck. I suspect the issue has to do with the packages being large, but I didn't have any such issue when running the application on the development server provided with Flask.
Are there any pointers I should look for when trying to solve this issue? I checked the memory usage and it doesn't seem to rise substantially, and neither does the CPU usage.
Debugging process
I am able to print a basic hello world to the root page quite quickly, and I have isolated the issue to the point where the import takes place and gets stuck.
The only thing I can think of is to limit the number of threads that are launched, but limiting the threads per child to 5 and the number of connections to 5 in the httpd-mpm.conf file didn't help.
The error/access logs don't provide much insight to the matter.
A few notes:
Thus far, I used Flask's development server with multi-threading enabled to serve those requests, but I found it prone to crashing after 5 minutes of continuous running, so I am now trying Apache with the wsgi interface in order to run the Python scripts.
I should also note that I am not serving HTML files, just basic GET and POST requests, which I view in the browser.
If it helps, I also don't use virtual environments.
I am using Windows 10, Apache 2.4 and mod_wsgi 4.5.24
The tensorflow module, being a C extension module, may not be implemented to work properly in Python sub-interpreters. To combat this, force your application to run in the main Python interpreter context. Details in:
http://modwsgi.readthedocs.io/en/develop/user-guides/application-issues.html#python-simplified-gil-state-api
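In practice that means adding the WSGIApplicationGroup directive to the Apache configuration for the app (a sketch; the script path is a placeholder):

    # Force the Flask app to run in the main (first) Python interpreter
    # rather than a sub-interpreter, so C extensions such as tensorflow
    # that don't support sub-interpreters can initialize properly.
    WSGIScriptAlias / "C:/path/to/app.wsgi"
    WSGIApplicationGroup %{GLOBAL}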

using mysql instead of sqlite3

I'm working on a Python application that requires database connections. I had developed my application with sqlite3, but it started showing the error "the database is locked", so I decided to use a MySQL database instead, and it works well with no errors.
The only problem is that I would need to ask every user of my application to install a MySQL server on their PC (AppServ, for example).
So can I make MySQL, like sqlite3, part of the Python libraries, so that I can produce a Python script that can be converted into an exe file by pyInstaller.exe, with no need for users to install a MySQL server?
Update:
After reviewing the code, I found an opened connection that was not closed correctly; it now works fine with sqlite3. Thank you, everybody.
It depends (more "depends" in the answer).
If you need to share the data between the users of your application, you need a MySQL database server set up somewhere that your application can access. Performance can really depend on the network and on how heavily the application uses the database. The application itself only needs to know how to "speak" with the database server, via a Python MySQL driver like MySQLdb or pymysql.
If you don't need to share the data between users, then sqlite may be an option. Or maybe not; it depends on what you want to store there, what for, and what you need to do with the data.
So, more questions than answers; this was probably more suitable as a comment. At least, think about what I've said.
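To illustrate the driver side, a minimal sketch with pymysql (host, credentials, and database name are placeholders):

    import pymysql

    # The application only needs the driver installed; the MySQL server
    # itself runs elsewhere and is shared by all users.
    conn = pymysql.connect(
        host="db.example.com",
        user="app_user",
        password="secret",
        database="app_db",
    )
    try:
        with conn.cursor() as cur:
            cur.execute("SELECT VERSION()")
            print(cur.fetchone())
    finally:
        conn.close()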
Also see:
https://stackoverflow.com/questions/1009438/which-database-should-i-use-for-my-desktop-application
Python Desktop Application Database
Python Framework for Desktop Database Application
Hope that helps.
If your application is a stand-alone system such that each user maintains their own private database, then you have no alternative but to install MySQL on each system that runs the application. You cannot bundle MySQL into your application such that it does not require a separate installation.
There is an embedded version of MySQL that you can build into your application (thanks, Carsten, in the comments, for pointing this out). More information is here: http://mysql-python.blogspot.com/. It may take some effort to get this working (on Windows you apparently need to build it from source code) and some more work to package it up when you generate your executable, but this might be a MySQL solution for you.
I've just finished updating a web application using SQLite which had begun reporting "database is locked" errors as the usage scaled up. By rewriting the database code with care, I was able to produce a system that can reliably handle moderate to heavy usage (in the context of a 15-person company) while still using SQLite. You have to be careful to keep your connections around for the minimum time necessary and always call .close() on them. If your application is really single-user, you should have no problem supporting it using SQLite, and that's doubly true if it's single-threaded.
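The pattern that made the difference is roughly the following (a sketch; the database path and query are placeholders):

    import sqlite3

    def fetch_rows(query, params=()):
        # Open the connection as late as possible and close it as early as
        # possible, so the file lock is held for the minimum time.
        conn = sqlite3.connect("app.db", timeout=10)
        try:
            return conn.execute(query, params).fetchall()
        finally:
            conn.close()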
