How to catch connection issues to a MinIO server? - Python

I'm trying to catch authentication errors on a Python client for minio (minio package):
from minio import Minio
from minio.error import MinioError, ResponseError

## Data Lake (Minio)
try:
    minioClient = Minio(endpoint=config['databases']['datalake']['hostip'] + ":" + config['databases']['datalake']['port'],
                        access_key=config['databases']['datalake']['accesskey'],
                        secret_key=config['databases']['datalake']['secretkey'],
                        secure=False)
    app.logger.info("MinIO Server Connected!")
except MinioError as e:
    app.logger.info("Could not connect to MinIO Server")
I can't seem to catch an authentication error when using fake (wrong) credentials. It always passes... Any ideas on how to catch this sort of issue?

Minio() only creates a client object; it does not connect to the server. That is why the object creation also succeeds with fake credentials or a fake URL: nothing is sent over the network at that point, so your exception handling only catches errors raised during plain Python object creation.
To check connectivity, I query a non-existing bucket. If I get a normal response, everything is fine; if there is a timeout, you can catch it and log it. (You could also check an existing bucket, but then you would have to distinguish errors in creating/reaching that bucket from errors reaching the storage itself.)
import logging

from minio import Minio

# create the client object (no connection is made yet)
client = Minio(
    host,
    access_key=user,
    secret_key=password,  # 'pass' is a reserved word in Python, so use another name
    # more_access_data=...
)

# reach the storage
try:
    if not client.bucket_exists("nonexistingbucket"):
        logging.debug("Object storage connected")
except Exception:
    # raised if the storage is not reachable
    logging.critical("Object storage not reachable")

As said above:
To check connectivity, I query a non-existing bucket.
I don't think that this is intuitive; why not use list_buckets() instead? For instance:
import logging

import urllib3
from urllib3.exceptions import MaxRetryError

from minio import Minio

self.config = {
    "endpoint": "localhost:9000",
    "access_key": "minioadmin",
    "secret_key": "minioadmin",
    "secure": False,
    "http_client": urllib3.PoolManager(
        num_pools=10,
    )
}

try:
    self.client = Minio(**self.config)
    self.client.list_buckets()
except MaxRetryError:
    logging.critical("Object storage not reachable")
Be aware that if MinIO isn't alive, the startup of your application will take a little longer than usual, depending on its context.
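If that startup delay is a problem, one option is to bound it by handing the client a pre-configured urllib3 pool with short timeouts and few retries. A minimal sketch, where the timeout and retry values are arbitrary assumptions rather than recommendations:

import logging

import urllib3
from urllib3.exceptions import MaxRetryError

from minio import Minio

# short connect/read timeouts and few retries, so an unreachable server fails fast
http_client = urllib3.PoolManager(
    timeout=urllib3.Timeout(connect=2, read=5),
    retries=urllib3.Retry(total=2, backoff_factor=0.2),
)

client = Minio(
    "localhost:9000",
    access_key="minioadmin",
    secret_key="minioadmin",
    secure=False,
    http_client=http_client,
)

try:
    client.list_buckets()  # forces a real request to the server
    logging.info("Object storage connected")
except MaxRetryError:
    logging.critical("Object storage not reachable")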

Related

Client is not allowed to use this service

When I try to fetch some data from a SOAP API, I receive an error:
Client [my ip] is not allowed to use this service
I found some solutions on the internet, but they amount to "set your IP to allowed and it works".
I don't know how to do that, but that's not my point.
Even if I allow my own IP, in the end I want to build a stand-alone app. What about that error then? Is there a universal solution?
Full (small) code:
import zeep
from zeep import xsd

client = zeep.Client(wsdl='https://api.mediaexpert.pl/?q=ws/wsdl/products', service_name='TergProductsDataService')

try:
    print(client.service.prodsCMS2USPGet(prodID=xsd.SkipValue, prodIndex=xsd.SkipValue, prodEAN='4210201242284',
                                         insiderFlag=xsd.SkipValue, http200=False))
except Exception as e:
    print(e)
English API docs:
https://api.mediaexpert.pl/?q=ws/doc
I use: Win10 or Ubuntu, Python 3, zeep, PyCharm.

Python: Force minio throw error on connection error while (f)putting

I want the minio-client to throw an error if a connection error occurs while putting data to the minio-server.
It correctly returns an error when setting up a connection while the server is down, but if I disconnect the network while uploading (putting), the Python process stays open 'forever'.
What I want is for it to return an error after, for example, a timeout of 60 seconds.
I just can't find anything like that in any documentation.
My (working) code, simplified:
from minio import Minio

client = None
bucket = 'testbucket'

def connect():
    print("Creating MinioClient")
    global client
    client = Minio(
        '10.0.0.1:9000',
        access_key='myaccesskey',
        secret_key='mysecretkey',
        secure=False
    )

def createBucket():
    global client
    global bucket
    print(f"Making bucket '{bucket}' if not exists")
    found = client.bucket_exists(bucket)
    if not found:
        client.make_bucket(bucket)
    else:
        print(f"Bucket '{bucket}' already exists")

def putFile():
    """ Upload data """
    print("Uploading the file with fput")
    global client
    global bucket
    objectname = 'ReallyBigFile'
    filepath = '/home/jake/4GBfile.bin'
    try:
        result = client.fput_object(
            bucket,
            objectname,
            filepath,
        )
        # inside the try block so it only runs when the upload succeeded
        print("Created {0} object; etag: {1}, version-id: {2}".format(result.object_name, result.etag, result.version_id))
    except IOError as err:
        print(err)

connect()
createBucket()
putFile()
Does anybody have an idea how I can force minio to abort the fput on a connection error?
I already tried setting up the connection like this, with no success:
import urllib3

from minio import Minio

client = Minio(
    '10.0.0.1:9000',
    access_key='myaccesskey',
    secret_key='mysecretkey',
    secure=False,
    http_client=urllib3.poolmanager.PoolManager(
        timeout=1,
        retries=urllib3.Retry(
            total=3,
            backoff_factor=0.2,
            status_forcelist=[500, 502, 503, 504]
        )
    )
)
Some research showed that MinIO itself is in fact capable of recognizing a connection loss while uploading, so it really has to do with the Python client or my use of it.
Test with the command-line tool mc:
The first session is a stable connection, the second one is disconnected during the upload.
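For what it's worth, one possible workaround (a sketch using only the standard library plus the fput_object call from above, not a feature of the minio package) is to run the upload in a worker thread and stop waiting for it after a deadline. This only hands control back to the caller; it does not kill the hung transfer itself:

from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

from minio import Minio

client = Minio('10.0.0.1:9000', access_key='myaccesskey', secret_key='mysecretkey', secure=False)

def upload(bucket, objectname, filepath):
    # plain fput_object, exactly as in the question
    return client.fput_object(bucket, objectname, filepath)

pool = ThreadPoolExecutor(max_workers=1)
future = pool.submit(upload, 'testbucket', 'ReallyBigFile', '/home/jake/4GBfile.bin')
try:
    result = future.result(timeout=60)  # stop waiting after 60 seconds
    print(f"Created {result.object_name}; etag: {result.etag}")
except FutureTimeout:
    print("Upload did not finish within 60 seconds")
finally:
    # does not terminate a hung worker thread, it only stops waiting for it
    pool.shutdown(wait=False)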

Does psycopg2.connect inherit the proxy set in this context manager?

I have a Django app below that uses a proxy to connect to an external Postgres database. I had to replace another package with psycopg2, and it works fine locally, but it doesn't work when I move it onto our production server, which is a Heroku app using QuotaguardStatic for proxy purposes. I'm not sure what's wrong here.
For some reason, the psycopg2.connect part returns an error with a different IP address. Is it not inheriting the proxy set in the context manager? What would be the correct approach?
import os

import psycopg2
import requests
from psycopg2.extras import RealDictCursor

from apps.proxy.socks import Socks5Proxy

PROXY_URL = os.environ['QUOTAGUARDSTATIC_URL']

with Socks5Proxy(url=PROXY_URL) as p:
    public_ip = requests.get("http://wtfismyip.com/text").text
    print(public_ip)  # prints the expected IP address
    print('end')
    try:
        connection = psycopg2.connect(user=EXTERNAL_DB_USERNAME,
                                      password=EXTERNAL_DB_PASSWORD,
                                      host=EXTERNAL_DB_HOSTNAME,
                                      port=EXTERNAL_DB_PORT,
                                      database=EXTERNAL_DB_DATABASE,
                                      cursor_factory=RealDictCursor  # to access query results like a dictionary
                                      )  # , ssl_context=True
    except psycopg2.DatabaseError as e:
        logger.error('Unable to connect to Illuminate database')
        raise e
Error is:
psycopg2.OperationalError: FATAL: no pg_hba.conf entry for host "12.345.678.910", user "username", database "databasename", SSL on
Basically, the IP address 12.345.678.910 does not match what was printed at the beginning of the context manager where the proxy is set. Do I need to set the proxy in another way so that the psycopg2 connection uses it?

How to fix problem "Unable to complete the operation against any hosts" in Cassandra?

I have a pretty simple AWS Lambda function in which I connect to an Amazon Keyspaces for Cassandra database. The Python code works, but from time to time I get the error below. How do I fix this strange behavior? My assumption is that additional settings are needed when initializing the cluster, for example set_max_connections_per_host. I would appreciate any help.
ERROR:
('Unable to complete the operation against any hosts', {<Host: X.XXX.XX.XXX:XXXX eu-central-1>: ConnectionShutdown('Connection to X.XXX.XX.XXX:XXXX was closed')})
lambda_function.py:
import sessions

cassandra_db_session = None
cassandra_db_username = 'your-username'
cassandra_db_password = 'your-password'
cassandra_db_endpoints = ['your-endpoint']
cassandra_db_port = 9142

def lambda_handler(event, context):
    global cassandra_db_session
    if not cassandra_db_session:
        cassandra_db_session = sessions.create_cassandra_session(
            cassandra_db_username,
            cassandra_db_password,
            cassandra_db_endpoints,
            cassandra_db_port
        )
    result = cassandra_db_session.execute('select * from "your-keyspace"."your-table";')
    return 'ok'
sessions.py:
from ssl import SSLContext
from ssl import CERT_REQUIRED
from ssl import PROTOCOL_TLSv1_2

from cassandra.cluster import Cluster
from cassandra.auth import PlainTextAuthProvider
from cassandra.policies import DCAwareRoundRobinPolicy

def create_cassandra_session(db_username, db_password, db_endpoints, db_port):
    ssl_context = SSLContext(PROTOCOL_TLSv1_2)
    ssl_context.load_verify_locations('your-path/AmazonRootCA1.pem')
    ssl_context.verify_mode = CERT_REQUIRED
    auth_provider = PlainTextAuthProvider(username=db_username, password=db_password)
    cluster = Cluster(
        db_endpoints,
        ssl_context=ssl_context,
        auth_provider=auth_provider,
        port=db_port,
        load_balancing_policy=DCAwareRoundRobinPolicy(local_dc='eu-central-1'),
        protocol_version=4,
        connect_timeout=60
    )
    session = cluster.connect()
    return session
There isn't much point setting the max connections on the client side, since AWS Lambdas are effectively "dead" between runs. For the same reason, the recommendation is to disable driver heartbeats (idle_heartbeat_interval = 0), since no activity occurs until the next time the function is called.
This doesn't necessarily cause the issue you are seeing, but there's a good chance the connection is being reused by the driver after it has been closed server-side.
Given the lack of public documentation on the inner workings of AWS Keyspaces, it's difficult to know what is happening on the cluster. I've always suspected that AWS Keyspaces puts a CQL-like API engine in front of something like DynamoDB, so there are quirks like the one you're seeing that are hard to track down, since doing so requires knowledge only available internally at AWS.
FWIW the DataStax drivers aren't tested against AWS Keyspaces.
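As a minimal sketch of the heartbeat suggestion, idle_heartbeat_interval is just one more keyword argument to Cluster; the rest of the setup is the same as in the question's sessions.py:

from cassandra.cluster import Cluster
from cassandra.policies import DCAwareRoundRobinPolicy

def create_cassandra_session(db_username, db_password, db_endpoints, db_port):
    # ... same ssl_context and auth_provider setup as in the question ...
    cluster = Cluster(
        db_endpoints,
        ssl_context=ssl_context,
        auth_provider=auth_provider,
        port=db_port,
        load_balancing_policy=DCAwareRoundRobinPolicy(local_dc='eu-central-1'),
        protocol_version=4,
        connect_timeout=60,
        idle_heartbeat_interval=0,  # disable idle-connection heartbeats between Lambda runs
    )
    return cluster.connect()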
This is the biggest issue which I see:
result = cassandra_db_session.execute('select * from "your-keyspace"."your-table";')
The code looks fine, but I don't see a WHERE clause. So if there's a lot of data, a single node (chosen as the coordinator) has to build the result set while pulling data from all the other nodes. That results in unpredictably bad performance, which could explain why it works sometimes and not others.
Pro-tip: All queries in Cassandra should have a WHERE clause.
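For illustration, a sketch of a filtered, prepared query; the column name id is a made-up placeholder, so use your table's actual partition key:

# hypothetical example: restrict the query to a single partition
# "id" stands in for your table's real partition key column
prepared = cassandra_db_session.prepare(
    'SELECT * FROM "your-keyspace"."your-table" WHERE id = ?'
)
rows = cassandra_db_session.execute(prepared, ['some-partition-key'])
for row in rows:
    print(row)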

SqlAlchemy - Fails to establish remote connection, connects to (nonexistant) local server instead

I've been having a big issue lately with Python (3.4) SQLAlchemy, in a project that very much depends on it.
sqlalchemy.exc.OperationalError: (_mysql_exceptions.OperationalError) (2002, "Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)")
It is preceded by the following error:
File "/home/dev/venv/project/lib/python3.4/site-packages/sqlalchemy/util/queue.py", line 145, in get
raise Empty
sqlalchemy.util.queue.Empty
I have literally no idea why this issue is happening, and it's becoming quite the burden. If I connect with an SQLite database, it works.
Here's the base string I'm using for connection:
mysql:///username:password#redacted:3306/inventory
Credentials removed.
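For comparison (this may or may not be the cause here), SQLAlchemy's documented URL form is dialect+driver://username:password@host:port/database, i.e. two slashes after the scheme and an @ before the host; if the host part ends up empty, the MySQL driver falls back to the local socket named in the error. A minimal sketch with placeholder values:

from sqlalchemy import create_engine, text

# placeholder values; general form: dialect+driver://user:password@host:port/database
url = "mysql://username:password@db.example.com:3306/inventory"
engine = create_engine(url)

with engine.connect() as conn:
    print(conn.execute(text("SELECT 1")).scalar())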
The database configuration is loaded from a YAML file on disk, and is verified to be as expected.
All unit tests surrounding the connection have passed, though I only test with SQLite.
Here is the code that's used to connect & create the database and its elements:
self.database_config = {
    'SQLALCHEMY_DATABASE_URI': self.config['databases']['source']['url'],
    # 'SQLALCHEMY_BINDS': {
    #     'discard': self.config['databases']['discarded-items']['url']
    # }
}
try:
    db.init(config=self.database_config, Model=Model)
    # todo check if db exists
    logger.debug("Database Initialized!")
    db.create_all()
    logger.debug("Database Created")
    return True
except SQLAlchemyError as ex:
    logger.error(ex)
    logger.info("Application config: {0}\nConfig Location: {1}\nDatabase Config: {2}".format(self.config,
                                                                                              self.config_location,
                                                                                              self.database_config))
    raise ex
Any help would be great! It's been quite the headache, and I would really like to resolve this!
