Pymongo OperationFailure: Index with name: timestamp_1 already exists with different options - python

I came across this problem when I tried some scraping code. I defined a class MongoCache to cache the html pages:
class MongoCache:
    def __init__(self, client=None, expires=timedelta(days=30)):
        self.client = MongoClient('localhost', 27017) if client is None else client
        self.db = self.client.cache
        self.db.webpage.create_index('timestamp1', expireAfterSeconds=expires.total_seconds())
When I build the object:
cache = MongoCache()
the following error is raised:
Traceback (most recent call last):
File "<input>", line 1, in <module>
File "F:\pythoncode\webscraping\mongo_cache.py", line 20, in __init__
File "D:\python27\lib\site-packages\pymongo\collection.py", line 1958, in create_index
self.__create_index(keys, kwargs, session, **cmd_options)
File "D:\python27\lib\site-packages\pymongo\collection.py", line 1860, in __create_index
session=session)
File "D:\python27\lib\site-packages\pymongo\collection.py", line 244, in _command
retryable_write=retryable_write)
File "D:\python27\lib\site-packages\pymongo\pool.py", line 579, in command
unacknowledged=unacknowledged)
File "D:\python27\lib\site-packages\pymongo\network.py", line 150, in command
parse_write_concern_error=parse_write_concern_error)
File "D:\python27\lib\site-packages\pymongo\helpers.py", line 155, in _check_command_response
raise OperationFailure(msg % errmsg, code, response)
OperationFailure: Index with name: timestamp_1 already exists with different options
I tried some solutions from Stack Overflow, but they are not for pymongo, and I cannot even use the drop_index() method.
I am using Windows 10 and Python 2.7 in PyCharm, and the MongoDB server version is 4.0.3.
I spent two days trying to figure out the problem and gave up.

I have now tried again and found that the problem may be in the timestamp used for the index.
If I create the object with no arguments, everything is OK:
cache = MongoCache()
But if I pass an expires value, the error comes back:
cache = MongoCache(expires=timedelta())
The function that saves the value for a URL is:
def __setitem__(self, url, result):
    record = {
        'result': Binary(zlib.compress(pickle.dumps(result))),
        'timestamp': datetime.utcnow()}
    self.db.webpage.update({'_id': url}, {'$set': record}, upsert=True)
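The error message suggests that an index named timestamp_1 already exists with a different expireAfterSeconds value than the one being requested, which would explain why the default expiry works and any other value fails. Below is a minimal sketch of one possible workaround, assuming the indexed field is 'timestamp' and that dropping and recreating the index is acceptable. The helper name ensure_ttl_index is hypothetical; index_information(), drop_index() and create_index() are pymongo Collection methods.

```python
from datetime import timedelta


def ensure_ttl_index(collection, field, expires):
    """Create a TTL index on `field`, replacing an existing one whose TTL differs.

    `collection` is expected to behave like a pymongo Collection.
    """
    ttl = int(expires.total_seconds())
    # Default name MongoDB gives an ascending single-field index.
    name = "{}_1".format(field)
    existing = collection.index_information().get(name)
    if existing is not None and existing.get("expireAfterSeconds") != ttl:
        # Same name, different options: drop it so create_index can succeed.
        collection.drop_index(name)
    return collection.create_index(field, expireAfterSeconds=ttl)


# Hypothetical usage inside MongoCache.__init__:
#     ensure_ttl_index(self.db.webpage, 'timestamp', expires)
```

Dropping the index is only one option; it is used here because it keeps the sketch short.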

Related

Querying on mysql docker container via python, throwing timeout error after few hours

I am inserting into a MySQL database, brought up in a Docker container, via the Debezium connector.
Querying works fine for some number of hours, but after that the same query throws the exception below.
export JAVA_HOME=/tmp/tests/artifacts/java-17/jdk-17; export PATH=$PATH:/tmp/tests/artifacts/java-17/jdk-17/bin; docker exec -i mysql_be1e6a mysql --user=demo --password=demo -D demo -e "select count(k) from test_cdc_f0bf84 where uuid = 'd1e5cd6d-8f7a-457c-b2ea-880c2be52f69'"
2023-01-02 16:27:43,812:ERROR: failed to execute query MySQL rows count by uuid:
Traceback (most recent call last):
File "/home/ubuntu/workspace/stress_tests/run_test_with_universe/src/env/lib/python3.11/site-packages/paramiko/channel.py", line 699, in recv
out = self.in_buffer.read(nbytes, self.timeout)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/workspace/stress_tests/run_test_with_universe/src/env/lib/python3.11/site-packages/paramiko/buffered_pipe.py", line 164, in read
raise PipeTimeout()
paramiko.buffered_pipe.PipeTimeout
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/ubuntu/workspace/stress_tests/run_test_with_universe/src/suites/cdc/abstract.py", line 667, in try_query
res = query_function()
^^^^^^^^^^^^^^^^
File "/home/ubuntu/workspace/stress_tests/run_test_with_universe/src/suites/cdc/test_cdc.py", line 635, in <lambda>
query = lambda: self.mysql_query(
^^^^^^^^^^^^^^^^^
File "/home/ubuntu/workspace/stress_tests/run_test_with_universe/src/suites/cdc/abstract.py", line 544, in mysql_query
result = self.ssh.exec_on_host(host, [
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/workspace/stress_tests/run_test_with_universe/src/main/connection.py", line 335, in exec_on_host
return self._exec_on_host(host, commands, fetch, timeout=timeout, limit_output=limit_output)[host]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/workspace/stress_tests/run_test_with_universe/src/main/connection.py", line 321, in _exec_on_host
res = list(out)
^^^^^^^^^
File "/home/ubuntu/workspace/stress_tests/run_test_with_universe/src/env/lib/python3.11/site-packages/paramiko/file.py", line 125, in __next__
line = self.readline()
^^^^^^^^^^^^^^^
File "/home/ubuntu/workspace/stress_tests/run_test_with_universe/src/env/lib/python3.11/site-packages/paramiko/file.py", line 291, in readline
new_data = self._read(n)
^^^^^^^^^^^^^
File "/home/ubuntu/workspace/stress_tests/run_test_with_universe/src/env/lib/python3.11/site-packages/paramiko/channel.py", line 1361, in _read
return self.channel.recv(size)
^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/workspace/stress_tests/run_test_with_universe/src/env/lib/python3.11/site-packages/paramiko/channel.py", line 701, in recv
raise socket.timeout()
TimeoutError
After some time, I logged in to the machine manually and ran the query, and it still reads fine. I am not sure what this issue means.
As explained, I tried querying the database via Python. I expected it to return the row count, which it did until a certain time, but after that it threw a timeout error and a socket error.
The default value for interactive_timeout and wait_timeout is 28800 seconds (8 hours). You can change this behavior by adjusting these system variables in your MySQL config.
source: Configuring session timeouts

Error on writing to Google cloud spanner using Google cloud functions

I am trying to insert data into a Cloud Spanner table using Cloud Functions, but it throws the error given below. Reading data from Cloud Spanner works properly, but writing fails with the same error whether I use Data Manipulation Language (DML) statements or the batch.insert method. I suspect it is some kind of permissions problem, but I don't know how to fix it.
Requirements file contains only google-cloud-spanner==1.7.1
Code running in cloud functions
import json
from google.cloud import spanner
INSTANCE_ID = 'AARISTA'
DATABASE_ID = 'main'
TABLE_NAME = 'userinfo'
dataDict = None
def new_user(request):
    dataDict = json.loads(request.data)  # Data is available in dict format
    if dataDict['USER_ID']==None:
        return "User id empty"
    elif dataDict['IMEI'] == None:
        return "Imei number empty"
    elif dataDict['DEVICE_ID'] == None:
        return "Device ID empty"
    elif dataDict['NAME'] == None:
        return "Name field is empty"
    elif dataDict['VIRTUAL_PRIVATE_KEY']== None:
        return "User's private key cant be empty"
    else:
        return insert_data(INSTANCE_ID,DATABASE_ID)
def insert_data(instance_id, database_id):
    spanner_client = spanner.Client()
    instance = spanner_client.instance(instance_id)
    database = instance.database(database_id)
    def insert_user(transcation):
        row_ct = transcation.execute_update(
            "INSERT userinfo (USER_ID,DEVICE_ID,IMEI,NAME,VIRTUAL_PRIVATE_KEY) VALUES"
            "(" + dataDict['USER_ID'] + ', ' + dataDict['DEVICE_ID'] + ', '
            + dataDict['IMEI'] + ', ' + dataDict['NAME'] + ', '
            + dataDict['VIRTUAL_PRIVATE_KEY'] + ")")
    database.run_in_transaction(insert_user)
    return 'Inserted data.'
Error logs on Cloud Functions
Traceback (most recent call last):
File "/env/local/lib/python3.7/site-packages/google/cloud/spanner_v1/pool.py", line 265, in get session = self._sessions.get_nowait()
File "/opt/python3.7/lib/python3.7/queue.py", line 198, in get_nowait return self.get(block=False)
File "/opt/python3.7/lib/python3.7/queue.py", line 167, in get raise Empty _queue.Empty
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/env/local/lib/python3.7/site-packages/google/api_core/grpc_helpers.py", line 57, in error_remapped_callable return callable_(*args, **kwargs)
File "/env/local/lib/python3.7/site-packages/grpc/_channel.py", line 547, in __call__ return _end_unary_response_blocking(state, call, False, None)
File "/env/local/lib/python3.7/site-packages/grpc/_channel.py", line 466, in _end_unary_response_blocking raise _Rendezvous(state, None, None, deadline)
grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with: status = StatusCode.INVALID_ARGUMENT details = "Invalid CreateSession request." debug_error_string = "{"created":"#1547373361.398535906","description":"Error received from peer","file":"src/core/lib/surface/call.cc","file_line":1036,"grpc_message":"Invalid> CreateSession request.","grpc_status":3}" >
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/env/local/lib/python3.7/site-packages/google/cloud/functions/worker.py", line 297, in run_http_function result = _function_handler.invoke_user_function(flask.request)
File "/env/local/lib/python3.7/site-packages/google/cloud/functions/worker.py", line 199, in invoke_user_function return call_user_function(request_or_event)
File "/env/local/lib/python3.7/site-packages/google/cloud/functions/worker.py", line 192, in call_user_function return self._user_function(request_or_event)
File "/user_code/main.py", line 21, in new_user return insert_data(INSTANCE_ID,DATABASE_ID)
File "/user_code/main.py", line 31, in insert_data database.run_in_transaction(insert_user)
File "/env/local/lib/python3.7/site-packages/google/cloud/spanner_v1/database.py", line 438, in run_in_transaction with SessionCheckout(self._pool) as session:
File "/env/local/lib/python3.7/site-packages/google/cloud/spanner_v1/pool.py", line 519, in __enter__ self._session = self._pool.get(**self._kwargs)
File "/env/local/lib/python3.7/site-packages/google/cloud/spanner_v1/pool.py", line 268, in get session.create()
File "/env/local/lib/python3.7/site-packages/google/cloud/spanner_v1/session.py", line 116, in create session_pb = api.create_session(self._database.name, metadata=metadata, **kw)
File "/env/local/lib/python3.7/site-packages/google/cloud/spanner_v1/gapic/spanner_client.py", line 276, in create_session request, retry=retry, timeout=timeout, metadata=metadata
File "/env/local/lib/python3.7/site-packages/google/api_core/gapic_v1/method.py", line 143, in __call__ return wrapped_func(*args, **kwargs)
File "/env/local/lib/python3.7/site-packages/google/api_core/retry.py", line 270, in retry_wrapped_func on_error=on_error,
File "/env/local/lib/python3.7/site-packages/google/api_core/retry.py", line 179, in retry_target return target()
File "/env/local/lib/python3.7/site-packages/google/api_core/timeout.py", line 214, in func_with_timeout return func(*args, **kwargs)
File "/env/local/lib/python3.7/site-packages/google/api_core/grpc_helpers.py", line 59, in error_remapped_callable six.raise_from(exceptions.from_grpc_error(exc), exc)
File "<string>", line 3, in raise_from
google.api_core.exceptions.InvalidArgument: 400 Invalid CreateSession request.
I tried to reproduce this but it seems to work for me as a Python 3.7 function. I used the latest google-cloud-spanner library in requirements.txt.
While I am unsure what is causing your error, I did notice a few other things.
It seemed odd to declare a global dataDict and then not pass along the one constructed in new_user. Instead, I added it as a parameter to the insert function.
The spacing of the query and the mix of single and double quotes made it hard to parse visually. As the function runs on Python 3.7, you can also use f-strings, which would likely make it more readable.
Here is the code I ran in a function that seemed to work.
import json
from google.cloud import spanner
INSTANCE_ID = 'testinstance'
DATABASE_ID = 'testdatabase'
TABLE_ID = 'userinfo'
def new_user(request):
    data = { 'USER_ID': '10', 'DEVICE_ID': '11' }
    return insert_data(INSTANCE_ID, DATABASE_ID, data)
def insert_data(instance_id, database_id, data):
    spanner_client = spanner.Client()
    instance = spanner_client.instance(instance_id)
    database = instance.database(database_id)
    def insert_user(transaction):
        query = f"INSERT {TABLE_ID} (USER_ID,DEVICE_ID) VALUES ({data['USER_ID']},{data['DEVICE_ID']})"
        row_ct = transaction.execute_update(query)
    database.run_in_transaction(insert_user)
    return 'Inserted data.'
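As a further readability step, the values could be supplied as query parameters rather than formatted into the SQL text, which also avoids quoting problems with string values. The following is only a sketch: build_insert is a hypothetical helper, and it assumes Spanner's @name placeholder syntax together with the params and param_types arguments of transaction.execute_update, so check both against the client library version you have installed.

```python
def build_insert(table, data, columns):
    """Return (sql, params) for a parameterized INSERT using @name placeholders."""
    placeholders = ", ".join("@{}".format(col) for col in columns)
    sql = "INSERT INTO {} ({}) VALUES ({})".format(
        table, ", ".join(columns), placeholders)
    # Only the listed columns are sent as parameters.
    params = {col: data[col] for col in columns}
    return sql, params


# Hypothetical usage inside insert_user(transaction), assuming STRING columns:
#     sql, params = build_insert(TABLE_ID, data, ['USER_ID', 'DEVICE_ID'])
#     types = {col: spanner.param_types.STRING for col in params}
#     transaction.execute_update(sql, params=params, param_types=types)
```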

Authentication failed to connect to mongodb using pymongo

We have written a piece of code in python script using pymongo that connects to mongodb.
username = 'abc'
password = 'xxxxxx'
server = 'dns name of that server'
port = 27017
In program, the code looks like:
import pymongo
from pymongo import MongoClient
client = MongoClient(url, serverSelectionTimeoutMS=300)
database = client.database_name
data_insert = database.collection_name.insert_one({'id': 1, 'name': xyz})
When I tried to do these operations, it raises an error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python2.7/dist-packages/pymongo/cursor.py", line 1114, in next
if len(self.__data) or self._refresh():
File "/usr/local/lib/python2.7/dist-packages/pymongo/cursor.py", line 1036, in _refresh
self.__collation))
File "/usr/local/lib/python2.7/dist-packages/pymongo/cursor.py", line 873, in __send_message
**kwargs)
File "/usr/local/lib/python2.7/dist-packages/pymongo/mongo_client.py", line 905, in _send_message_with_response
exhaust)
File "/usr/local/lib/python2.7/dist-packages/pymongo/mongo_client.py", line 916, in _reset_on_error
return func(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/pymongo/server.py", line 99, in send_message_with_response
with self.get_socket(all_credentials, exhaust) as sock_info:
File "/usr/lib/python2.7/contextlib.py", line 17, in __enter__
return self.gen.next()
File "/usr/local/lib/python2.7/dist-packages/pymongo/server.py", line 168, in get_socket
with self.pool.get_socket(all_credentials, checkout) as sock_info:
File "/usr/lib/python2.7/contextlib.py", line 17, in __enter__
return self.gen.next()
File "/usr/local/lib/python2.7/dist-packages/pymongo/pool.py", line 792, in get_socket
sock_info.check_auth(all_credentials)
File "/usr/local/lib/python2.7/dist-packages/pymongo/pool.py", line 512, in check_auth
auth.authenticate(credentials, self)
File "/usr/local/lib/python2.7/dist-packages/pymongo/auth.py", line 470, in authenticate
auth_func(credentials, sock_info)
File "/usr/local/lib/python2.7/dist-packages/pymongo/auth.py", line 450, in _authenticate_default
return _authenticate_scram_sha1(credentials, sock_info)
File "/usr/local/lib/python2.7/dist-packages/pymongo/auth.py", line 201, in _authenticate_scram_sha1
res = sock_info.command(source, cmd)
File "/usr/local/lib/python2.7/dist-packages/pymongo/pool.py", line 419, in command
collation=collation)
File "/usr/local/lib/python2.7/dist-packages/pymongo/network.py", line 116, in command
parse_write_concern_error=parse_write_concern_error)
File "/usr/local/lib/python2.7/dist-packages/pymongo/helpers.py", line 210, in _check_command_response
raise OperationFailure(msg % errmsg, code, response)
pymongo.errors.OperationFailure: Authentication failed.
In the MongoDB shell, the same queries return responses normally, without raising any errors.
Because the other answers to your question didn't work for me, I'm going to copy and paste my answer from a similar question.
If you've tried the above answers and you're still getting an error:
pymongo.errors.OperationFailure: Authentication failed.
There's a good chance you need to add ?authSource=admin to the end of your uri.
Here's a working solution that I'm using with MongoDB server version 4.2.6 and MongoDB shell version v3.6.9.
from pymongo import MongoClient
# Replace these with your server details
MONGO_HOST = "XX.XXX.XXX.XXX"
MONGO_PORT = "27017"
MONGO_DB = "database"
MONGO_USER = "admin"
MONGO_PASS = "pass"
uri = "mongodb://{}:{}@{}:{}/{}?authSource=admin".format(MONGO_USER, MONGO_PASS, MONGO_HOST, MONGO_PORT, MONGO_DB)
client = MongoClient(uri)
A similar fix for the command line is to add --authenticationDatabase admin.
Well, I was stuck with the same error for almost 3-4 hours. I came across a solution with the following steps:
From your shell, connect to MongoDB by typing: mongo
Afterwards, create a database: use test_database
Now create a user with readWrite and dbAdmin privileges using the following command.
db.createUser(
  {
    user: "test_user",
    pwd: "testing12345",
    roles: [ "readWrite", "dbAdmin" ]
  }
);
This will print Successfully added user: { "user" : "test_user", "roles" : [ "readWrite", "dbAdmin" ] }
You can check by typing: show users.
It will also show you the DB name you created before in the JSON.
Now you should be able to insert data into your database:
client = MongoClient("mongodb://test_user:testing12345@localhost:27017/test_database")
db = client.test_database
data = {"initial_test":"testing"}
db["my_collection"].insert_one(data).inserted_id
I ran into this error, and my problem was with the password.
I had what I believe to be a special character in the Master account. Changing the password to be only alphanumeric fixed it for me.
Code snippet
client = pymongo.MongoClient(
    'mongodb://username:alphaNumericPassword@localhost:27017/?ssl=true&ssl_ca_certs=rds-combined-ca-bundle.pem&replicaSet=rs0&readPreference=secondaryPreferred&retryWrites=false'
)
# Specify the database to be used
db = client['prod-db']
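If changing the password is not an option, special characters in the username or password can instead be percent-encoded before they are placed in the URI. A minimal sketch using only the standard library follows; build_mongo_uri is a hypothetical helper, and the authSource=admin default is an assumption carried over from the earlier answer.

```python
from urllib.parse import quote_plus


def build_mongo_uri(user, password, host, port, db, auth_source="admin"):
    """Build a MongoDB URI with the credentials percent-encoded."""
    return "mongodb://{}:{}@{}:{}/{}?authSource={}".format(
        quote_plus(user), quote_plus(password), host, port, db, auth_source)


# Characters such as '@', ':' and '/' in the password are escaped,
# so they can no longer be mistaken for URI delimiters.
uri = build_mongo_uri("admin", "p@ss:w/rd", "localhost", 27017, "database")
print(uri)
```

The resulting string can be passed to MongoClient(uri) as in the snippets above.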

Python - RuntimeError: working outside of request context

I am trying to get the GET parameters from the URL. I have it working in my __init__.py file, but in a different file it's not working.
I tried to use with app.app_context(): but I am still getting the same issue.
def log_entry(entity, type, entity_id, data, error):
    with app.app_context():
        zip_id = request.args.get('id')
RuntimeError: working outside of request context
Any suggestions?
Additional Info:
This is using Flask web framework which is setup as a service (API).
Example URL the user would hit http://website.com/api/endpoint?id=1
As mentioned above, using zip_id = request.args.get('id') works fine in the main file, but I am in runners.py (just another file with definitions in it).
Full traceback:
Debugging middleware caught exception in streamed response at a point where response headers were already sent.
Traceback (most recent call last):
File "/Users/ereeve/.virtualenvs/pi-automation-api/lib/python2.7/site-packages/werkzeug/wsgi.py", line 703, in __next__
return self._next()
File "/Users/ereeve/.virtualenvs/pi-automation-api/lib/python2.7/site-packages/werkzeug/wrappers.py", line 81, in _iter_encoded
for item in iterable:
File "/Users/ereeve/Documents/TechSol/pi-automation-api/automation_api/runners.py", line 341, in create_agencies
log_entry("test", "created", 1, "{'data':'hey'}", "")
File "/Users/ereeve/Documents/TechSol/pi-automation-api/automation_api/runners.py", line 315, in log_entry
zip_id = request.args.get('id')
File "/Users/ereeve/.virtualenvs/pi-automation-api/lib/python2.7/site-packages/werkzeug/local.py", line 343, in __getattr__
return getattr(self._get_current_object(), name)
File "/Users/ereeve/.virtualenvs/pi-automation-api/lib/python2.7/site-packages/werkzeug/local.py", line 302, in _get_current_object
return self.__local()
File "/Users/ereeve/.virtualenvs/pi-automation-api/lib/python2.7/site-packages/flask/globals.py", line 20, in _lookup_req_object
raise RuntimeError('working outside of request context')
RuntimeError: working outside of request context
The function in the same file that calls log_entry:
def create_agencies(country_code, DB, session):
    document = DB.find_one({'rb_account_id': RB_COUNTRIES_new[country_code]['rb_account_id']})
    t2 = new_t2(session)
    log_entry("test", "created", 1, "{'data':'hey'}", "")
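Note that app.app_context() pushes only an application context, not a request context, so request is still unavailable inside it. The traceback also mentions a streamed response, which would mean the generator runs after the request context has been torn down. One way around both, sketched below with plain functions and no Flask import, is to read the parameter in the view while the request is active and pass it down explicitly. The extra zip_id parameter and the simplified signatures are assumptions made for illustration.

```python
def log_entry(entity, action, entity_id, data, error, zip_id):
    """Build a log record; zip_id is passed in instead of read from flask.request."""
    return {"entity": entity, "action": action, "entity_id": entity_id,
            "data": data, "error": error, "zip_id": zip_id}


def create_agencies(country_code, zip_id):
    """Generator standing in for the streamed response body."""
    entry = log_entry("test", "created", 1, "{'data':'hey'}", "", zip_id)
    yield "{} {} zip_id={}\n".format(
        entry["action"], country_code, entry["zip_id"])


# In the view, while the request is still active, one might write:
#     zip_id = request.args.get('id')
#     return Response(create_agencies(country_code, zip_id))
```

Flask also provides stream_with_context for keeping the request context alive during streaming, which may be the smaller change if many helpers read from request.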

Internal Error returned from Softlayer DNSManager API

We are using Python 2.7 and the Python SoftLayer 3.0.1 package, calling the get_records method on the DNSManager class. This currently returns an Internal Server error:
2016-05-11T11:18:04.117406199Z Traceback (most recent call last):
2016-05-11T11:18:04.117715505Z File "/opt/**/**/***.py", line 745, in <module>
2016-05-11T11:18:04.117927757Z httpDnsRecords = dnsManager.get_records(httpDomainRecordId, data=dataspace, type="cname")
2016-05-11T11:18:04.118072183Z File "/usr/local/lib/python2.7/dist-packages/SoftLayer/managers/dns.py", line 152, in get_records
2016-05-11T11:18:04.118152705Z filter=_filter.to_dict(),
2016-05-11T11:18:04.118302389Z File "/usr/local/lib/python2.7/dist-packages/SoftLayer/API.py", line 347, in call_handler
2016-05-11T11:18:04.118398852Z return self(name, *args, **kwargs)
2016-05-11T11:18:04.118512777Z File "/usr/local/lib/python2.7/dist-packages/SoftLayer/API.py", line 316, in call
2016-05-11T11:18:04.118632422Z return self.client.call(self.name, name, *args, **kwargs)
2016-05-11T11:18:04.118814604Z File "/usr/local/lib/python2.7/dist-packages/SoftLayer/API.py", line 176, in call
2016-05-11T11:18:04.118907953Z timeout=self.timeout)
2016-05-11T11:18:04.118995360Z File "/usr/local/lib/python2.7/dist-packages/SoftLayer/transports.py", line 64, in make_xml_rpc_api_call
2016-05-11T11:18:04.119096993Z e.faultCode, e.faultString)
2016-05-11T11:18:04.119547899Z SoftLayer.exceptions.SoftLayerAPIError: SoftLayerAPIError(SOAP-ENV:Server): Internal Error
The httpDomainRecordId is the Id for the domain obtained from softlayer and dataspace is the string 'uk'.
Does anyone know why this would be returning an Internal Error from the server?
The error is likely due to the response containing a large amount of data; this error is documented here. You can try the following:
1.- Increase the timeout in the client.
2.- Add more filters to your request to limit the result. Currently you are using data and type; try adding host or ttl.
3.- You can try using limits, but the manager does not provide that option, so you need to use API calls, e.g.
import SoftLayer
client = SoftLayer.Client()
zoneId = 12345
objectMask = "id,expire,domainId,host,minimum,refresh,retry, mxPriority,ttl,type,data,responsiblePerson"
result = client['Dns_Domain'].getResourceRecords(id=zoneId, mask=objectMask, limit=200, offset=0)
print (result)
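To retrieve more than one page with the limit and offset arguments shown above, the call can be repeated until a short page comes back. This is a sketch only: fetch_all and fetch_page are hypothetical names, and it assumes the API returns a list containing fewer than limit items on the last page.

```python
def fetch_all(fetch_page, page_size=200):
    """Collect every record by calling fetch_page(limit=..., offset=...) repeatedly."""
    results = []
    offset = 0
    while True:
        page = fetch_page(limit=page_size, offset=offset)
        results.extend(page)
        if len(page) < page_size:
            # A short (or empty) page means there is nothing left to fetch.
            break
        offset += page_size
    return results


# Hypothetical usage with the client from the snippet above:
#     records = fetch_all(
#         lambda limit, offset: client['Dns_Domain'].getResourceRecords(
#             id=zoneId, mask=objectMask, limit=limit, offset=offset))
```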
