inserting timestamps with python cassandra prepared statements

Is it possible to insert a timestamp value into a Cassandra keyspace using prepared statements with the Python Cassandra driver? When I tried to do that, I got the following error message:
Expected: <class 'cassandra.cqltypes.DateType'>, Got: <type 'str'>
I see that this problem has been discussed before, but I'm not sure whether it was resolved. How can I do this? Doing the same with simple statements would be inefficient.

Yes, you can insert a timestamp value via prepared statements by binding a datetime object. I have tried it with success.

Like Aaron said, you need to use a datetime object. Given a simple table definition:
CREATE TABLE stackoverflow2.timestamps (
    bucket text,
    value timestamp,
    PRIMARY KEY (bucket, value)
) WITH CLUSTERING ORDER BY (value DESC);
This code will INSERT ten timestamps into the timestamps table, given a valid (connected) session:
import datetime

preparedInsert = session.prepare(
    """
    INSERT INTO stackoverflow2.timestamps (bucket, value) VALUES (?, ?);
    """
)
# end prepare statements
for counter in range(10):  # ten iterations; range(1, 10) would yield only nine
    currentTime = datetime.datetime.today()
    bucket = currentTime.strftime("%Y%m")
    session.execute(preparedInsert, [bucket, currentTime])
Essentially, the datetime.datetime.today() line creates a datetime object with the current time. The strftime creates a string time bucket from it, and then the preparedInsert puts them both into Cassandra.
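The error in the question came from binding a str where the driver expects a datetime. If the value starts out as text, it can be parsed first. A minimal sketch, assuming the strings look like "2015-03-01 12:30:00"; parse_timestamp and its default format are illustrative names, not part of the driver:

```python
import datetime

def parse_timestamp(text, fmt="%Y-%m-%d %H:%M:%S"):
    """Convert a timestamp string into a datetime that can be bound."""
    # fmt is an assumed layout; adjust it to match the incoming data.
    return datetime.datetime.strptime(text, fmt)

value = parse_timestamp("2015-03-01 12:30:00")
bucket = value.strftime("%Y%m")
# With a connected session and the prepared statement shown earlier:
# session.execute(preparedInsert, [bucket, value])
```

Parsing once at the edge keeps the bound value a real datetime, so the same prepared statement can be reused for every row.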

Related

I have a date_time field in a DynamoDB table. How can I query only the entries between two specific datetimes?

I'm using boto3. The table name is example_table. I want to get only the entries for a specific hour according to the date_time field.
So far I've tried this without success:
def read_from_dynamodb():
    now = datetime.datetime.now()
    one_hour_ago = now - datetime.timedelta(hours=1)
    now = timestamp = now.replace(tzinfo=timezone.utc).timestamp()
    now = int(now)
    one_hour_ago = one_hour_ago.replace(tzinfo=timezone.utc).timestamp()
    one_hour_ago = int(one_hour_ago)
    dynamodb = boto3.resource("dynamodb", aws_access_key_id=RnD_Credentials.aws_access_key_id,
                              aws_secret_access_key=RnD_Credentials.aws_secret_access_key,
                              region_name=RnD_Credentials.region
                              )
    example_table = dynamodb.Table('example_table')
    response = example_table.query(
        IndexName='date_time',
        KeyConditionExpression=Key('date_time').between(one_hour_ago, now)
    )
    return response
I'm getting the error:
botocore.exceptions.ClientError: An error occurred (ValidationException) when calling the Query operation: The table does not have the specified index: date_time

Queries that rely on ranges - including between - need the attribute (in your case date_time) to be a sort key in an index (primary or otherwise), and you'll also need to supply the partition key as part of the KeyConditionExpression. You can't query the entire table by sort key unless all items in the table have the same partition key.
If date_time is an attribute outside of the primary key you can add a GSI where date_time is the sort key but you'll still need to supply a partition key too.
Another idea if you're writing the data yourself is to create a new attribute with the start time of the hour for that item's date_time i.e. quantize it, then create a GSI hash key on that new hour attribute. Then you can query that specific hour rather than look for ranges. If you're working with existing data you could scan and refactor your table to add this attribute - not ideal but might be a solution depending on the size of the table and your use case.
If you're really stuck you could also scan and filter the table instead of query, but that is far less efficient as it will require reading every item every time you execute it.
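The quantizing idea above can be sketched as follows. This is only an illustration under assumptions: the attribute name hour_bucket and the index name hour_bucket-index are hypothetical, and the GSI is assumed to use hour_bucket as its partition key.

```python
import datetime

def hour_bucket(dt):
    """Return the epoch second at which dt's UTC hour starts.

    dt must be timezone-aware; a naive datetime would be read as local time.
    """
    utc = dt.astimezone(datetime.timezone.utc)
    return int(utc.replace(minute=0, second=0, microsecond=0).timestamp())

def query_hour(table, bucket, index_name="hour_bucket-index"):
    """Query one hour through a GSI keyed on hour_bucket (assumed names)."""
    # Imported here so hour_bucket stays usable without boto3 installed.
    from boto3.dynamodb.conditions import Key
    return table.query(
        IndexName=index_name,
        KeyConditionExpression=Key("hour_bucket").eq(bucket),
    )

example = datetime.datetime(2021, 1, 1, 12, 34, 56, tzinfo=datetime.timezone.utc)
bucket = hour_bucket(example)
```

The same hour_bucket function would be used when writing each item, so the stored attribute and the query key always agree.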

Python - Filtering SQL query based on dates

I am trying to build a SQL query that will filter based on system date (Query for all sales done in the last 7 days):
import datetime
import pandas as pd
import psycopg2
con = psycopg2.connect(db_details)
cur = con.cursor()
df = pd.read_sql("""select store_name,count(*) from sales
where created_at between datetime.datetime.now() - (datetime.today() - timedelta(7))""",con=con)
I get an error
psycopg2.NotSupportedError: cross-database references are not implemented: datetime.datetime.now

You are mixing Python syntax into your SQL query. SQL is parsed and executed by the database, not by Python, and the database knows nothing about datetime.datetime.now(), datetime.today() or timedelta(). The specific error appears because your Python code is interpreted as SQL: there, datetime.datetime.now is read as a qualified name that reaches into a database called datetime, which is a cross-database reference. PostgreSQL does not implement cross-database references, and psycopg2 simply relays that error.
Instead, use SQL parameters to pass in values from Python to the database. Use placeholders in the SQL to show the database driver where the values should go:
params = {
    # all rows after this timestamp, 7 days ago relative to 'now'
    'earliest': datetime.datetime.now() - datetime.timedelta(days=7),
    # if you must have a date *only* (no time component), use
    # 'earliest': datetime.date.today() - datetime.timedelta(days=7),
}
df = pd.read_sql("""
    select store_name, count(*) from sales
    where created_at >= %(earliest)s
    group by store_name""", params=params, con=con)
This uses placeholders as defined by the psycopg2 parameters documentation, where %(earliest)s refers to the earliest key in the params dictionary. datetime.datetime instances are directly supported by the driver.
Note that I also fixed your 7 days ago expression, and replaced your BETWEEN syntax with >=; without a second date you are not querying for values between two dates, so use >= to limit the column to dates at or after the given date.

datetime.datetime.now() is not valid SQL syntax, and thus cannot be executed by read_sql(). I suggest either using the correct SQL syntax to compute the current time, or creating variables for datetime.datetime.now() and datetime.today() - timedelta(7) and substituting them into your string.
edit: Do not follow the second suggestion. See comments below by Martijn Pieters.

Maybe you should remove that Python code inside your SQL, compute your dates in python and then use the strftime function to convert them to strings.
Then you'll be able to use them in your SQL query.

Actually, you do not necessarily need any params or computations in Python. Just use the corresponding SQL statement which should look like this:
select store_name,count(*)
from sales
where created_at >= now()::date - 7
group by store_name
Edit: I also added a group by which I think is missing.

How to modify the input value of a stored procedure with python

Currently, I am trying to modify two input values of the following stored procedure that I execute with Python.
country_cursor.execute(
"[DEV].[DevelopmentSchema].[QA_Extractor] #start_date='2017-05-05', #end_date='2017-05-11'")
I do not want to edit start_date and end_date in the code by hand every day I run this program. Instead, I am trying to create a prompt where I can type in the dates I want to retrieve.
So far, I have done the following:
end_date = str(datetime.now()).rpartition(' ')[0]
start_date = str(datetime.now() - timedelta(days=7)).rpartition(' ')[0]
country_cursor.execute(
"[DEV].[DevelopmentSchema].[QA_Extractor] #start_date='2017-05-05', #end_date= "+"'"+end_date+"'"+"\"")
I just replaced one input date with a variable but when I execute this program I encounter the following SQL error:
pypyodbc.ProgrammingError: ('42000', '[42000] [Microsoft][ODBC SQL Server
Driver][SQL Server]An object or column name is missing or empty. For SELECT
INTO statements, verify each column has a name. For other statements, look
for empty alias names. Aliases defined as "" or [] are not allowed. Change
the alias to a valid name.')
My point of view is that the Stored Procedure does not accept this variable as the end date, in consequence, the column to look for the retrieval does not exist. I also read in SQL Server query erroring with 'An object or column name is missing or empty' which supports my view. Am I right with my thinking or am I totally wrong?
How can I fix this problem? Any ideas, suggestions and improvements are welcome ;)

If I do this:
print("[DEV].[DevelopmentSchema].[QA_Extractor] #start_date='2017-05-05', #end_date= "+"'"+end_date+"'"+"\"")
I get this:
[DEV].[DevelopmentSchema].[QA_Extractor] #start_date='2017-05-05', #end_date= '2017-05-14'"
It seems to me that there is a stray " at the end of this query string.
Part of the problem is that you are working way too hard to format the dates as strings.
I am guessing that there is
from datetime import *
at the top of your code (ugly, but hardly your fault). If so, you can do
start_date = datetime.now() - timedelta(days=7)
end_date = datetime.now()
query_string = f"[DEV].[DevelopmentSchema].[QA_Extractor] #start_date='{start_date:%Y-%m-%d}', #end_date='{end_date:%Y-%m-%d}'"
country_cursor.execute(query_string)
which arguably makes it easier to see stray punctuation.
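Going one step further, the dates can be passed as bound parameters so that no quoting is needed at all. A minimal sketch, assuming the driver accepts ? placeholders with the ODBC call escape and that the procedure takes the start date first and the end date second; CALL_SQL and extractor_params are illustrative names:

```python
from datetime import date, timedelta

# Assumption: positional parameters, start date first, end date second.
CALL_SQL = "{CALL [DEV].[DevelopmentSchema].[QA_Extractor] (?, ?)}"

def extractor_params(end_text=None, days=7):
    """Build (start, end) ISO date strings; end_text may come from a prompt."""
    end = date.fromisoformat(end_text) if end_text else date.today()
    start = end - timedelta(days=days)
    return start.isoformat(), end.isoformat()

params = extractor_params("2017-05-11")
# With an open cursor: country_cursor.execute(CALL_SQL, params)
```

Calling extractor_params(input("End date (YYYY-MM-DD): ")) would cover the prompt the question asks about, and an empty answer falls back to today's date.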

Query to compare between date with time and date without time - python using access db

I need help writing a query that compares a date with a time component against a date without one. I am using Python with an Access database (pypyodbc).
In the database I have a column that contains date/time values (including the time), and in Python I have a datetime object (without a time).
I want to write a SQL query that compares just the date parts of the two.
For Example:
cur.execute("SELECT * FROM MDSSDB WHERE [ValidStartTime] = #2016-05-17#")
The ValidStartTime includes time so it doesn't work. I want just the date from the ValidStartTime.

Consider using MS Access' DateValue function that extracts only the date component (TimeValue being the time component counterpart).
Also, consider passing your date value as parameter to better integrate with your Python environment with no need to concatenate into Access' # form. Below passes a parameter as tuple of one item:
from datetime import datetime
...
cur.execute("SELECT * FROM MDSSDB WHERE DateValue([ValidStartTime]) = ?", (datetime(2016, 5, 17),))
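A different technique from DateValue, offered only as a sketch: compare against a half-open range that covers the whole day. This leaves the column unwrapped, which may let the engine use an index on it. day_bounds is an illustrative helper, not part of pypyodbc:

```python
from datetime import date, datetime, time, timedelta

def day_bounds(day):
    """Return (start, end) datetimes covering day as a half-open interval."""
    start = datetime.combine(day, time.min)
    return start, start + timedelta(days=1)

start, end = day_bounds(date(2016, 5, 17))
sql = "SELECT * FROM MDSSDB WHERE [ValidStartTime] >= ? AND [ValidStartTime] < ?"
# With an open cursor: cur.execute(sql, (start, end))
```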

Python: using pyodbc and replacing row field values

I'm trying to figure out if it's possible to replace record values in a Microsoft Access (either .accdb or .mdb) database using pyodbc. I've pored over the documentation and noted where it says that "Row Values Can Be Replaced" but I have not been able to make it work.
More specifically, I'm attempting to replace a row value from a Python variable. So far I have:
set the connection autocommit to "True"
made sure that it's not a data type issue
Here is a snippet of the code where I'm executing a SQL query, using fetchone() to grab just one record (I know with this script the query is only returning one record), then I am grabbing the existing value for a field (the field position integer is stored in the z variable), and then am getting the new value I want to write to the field by accessing it from an existing python dictionary created in the script.
pSQL = "SELECT * FROM %s WHERE %s = '%s'" % (reviewTBL, newID, basinID)
cursor.execute(pSQL)
record = cursor.fetchone()
if record:
    oldVal = record[z]
    val = codeCrosswalk[oldVal]
    record[z] = val
I've tried everything I can think of but cannot get it to work. Am I just misunderstanding the help documentation?
The script runs successfully but the newly assigned value never seems to commit. I even tried putting print str(record[z]) after the record[z] = val line to see whether the field has the new value, and the new value prints as if it worked. But when I check the table after the script has finished, the old values are still in the table field.
Much appreciate any insight into this...I was hoping this would work like how using VBA in MS Access databases you can use an ADO Recordset to loop through records in a table and assign values to a field from a variable.
thanks,
Tom

The "Row values can be replaced" from the pyodbc documentation refers to the fact that you can modify the values on the returned row objects, for example to perform some cleanup or conversion before you start using them. It does not mean that these changes will automatically be persisted in the database. You will have to use sql UPDATE statements for that.
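A minimal sketch of persisting the change with an UPDATE statement. It uses sqlite3 as a stand-in so that it runs without Access; sqlite3 accepts the same ? placeholders as pyodbc. The table and column names (review, basin_id, code) and the update_field helper are hypothetical:

```python
import sqlite3  # stand-in for pyodbc so the sketch runs anywhere

def update_field(cursor, table, id_col, field, row_id, crosswalk):
    """Translate one field through crosswalk and persist it with UPDATE."""
    # Identifiers cannot be bound as parameters, so they are interpolated;
    # only pass trusted table and column names here.
    cursor.execute(f"SELECT {field} FROM {table} WHERE {id_col} = ?", (row_id,))
    record = cursor.fetchone()
    if record is None:
        return False
    new_val = crosswalk[record[0]]
    cursor.execute(
        f"UPDATE {table} SET {field} = ? WHERE {id_col} = ?", (new_val, row_id)
    )
    return True

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE review (basin_id TEXT, code TEXT)")
cur.execute("INSERT INTO review VALUES ('B1', 'old')")
updated = update_field(cur, "review", "basin_id", "code", "B1", {"old": "new"})
conn.commit()
result = cur.execute("SELECT code FROM review").fetchone()[0]
```

With pyodbc the same pattern would apply to the Access connection, followed by a commit unless autocommit is enabled.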
