I'm trying to replace a hardcoded date in my sql queury using '{Variable}' but I can't make it work.
Here is what I used to do (which works fine):
conn_drm = pyodbc.connect(
)
query_drm = '''
SELECT *
FROM
WHERE base_file_date = '2022-04-25'
'''
DF = pd.read_sql_query(query_drm, conn_drm)
Now, I would like do to somethig like this:
Yesterday = date.today() - timedelta(days=1)
Yesterday = str(Yesterday)
conn_drm = pyodbc.connect(
)
query_drm = '''
SELECT *
FROM
WHERE base_file_date = '{Yesterday}'
'''
DF = pd.read_sql_query(query_drm, conn_drm)
but I get the error:
Conversion failed when converting date and/or time from character string. (241)
Could someone plz help?
You have to use a formatable string to insert values like this. For this you should add a f in front of the string :
query_drm = f'''
SELECT *
FROM
WHERE base_file_date = '{Yesterday}'
'''
However, please take note that if you have any other input here other than the date and where a user can interact with, this method is dangerous and you should prefer to use the method here : pyodbc - How to perform a select statement using a variable for a parameter
Related
I am pulling data from an online database using SQL/postgresql queries and converting it into a Python dataframe using Pandas. I want to be able to change the dates in the SQL query from one point in my Python script instead of having to manually go through every SQL query and change it one by one as there are many queries and many lines in each one.
This is what I have to begin with for example:
random_query = """
select *
from table_A as a
where date_trunc('day',a.created_at) >= date('2022-03-01')
and date_trunc('day',a.created_at) <= date('2022-03-31')
group by 1,2,3
"""
Then I will read the data into Pandas as follows:
df_random_query = pd.read_sql(random_query, conn)
The connection above is to the database - the issue is not there so I am excluding that portion of code here.
What I have attempted is the following:
start_date = '2022-03-01'
end_date = '2022-03-31'
I have set the above 2 dates as variables and then below I have tried to use them in the SQL query as follows:
attempted_solution = """
select *
from table_A as a
where date_trunc('day',a.created_at) >= date(
""" + start_date + """)
and date_trunc('day',a.created_at) <= date(
""" + end_date + """)
group by 1,2,3
"""
This does run but it gives me a dataframe with no data in it - i.e. no numbers. I am not sure what I am doing wrong - any assistance will really help.
try dropping date function and formatting:
my_query = f"... where date_trunc('day', a.created_at) >= {start_date}"
I was able to work it out as follows:
start_date = '2022-03-01'
end_date = '2022-03-31'
random_query = f"""
select *
from table_A as a
where date_trunc('day',a.created_at) >= date('start_date')
and date_trunc('day',a.created_at) <= date('end_date')
group by 1,2,3
"""
It was amusing to see that all I needed to do was put start_date and end_date in ' ' as well. I noticed this simply by printing what query was showing in the script. Key thing here is to know how to troubleshoot.
Another option was also to use the .format() at the end of the query and inside it say .format(start_date = '2022-03-01', end_date = '2022-03-31').
I'm trying to query a database using Python/Pandas. This will be a recurring request where I'd like to look back into a window of time that changes over time, so I'd like to use some smarts in how I do this.
In my SQLite query, if I say
WHERE table.date BETWEEN DATETIME('now', '-6 month') AND DATETIME('now')
I get the result I expect. But if I try to move those to variables, the resulting table comes up empty. I found out that the endDate variable does work but the startDate does not. Presumably I'm doing something wrong with the escapes around the apostrophes? Since the result is coming up empty it's like it's looking at DATETIME(\'now\') and not seeing the '-6 month' bit (comparing now vs. now which would be empty). Any ideas how I can pass this through to the query correctly using Python?
startDate = 'DATETIME(\'now\', \'-6 month\')'
endDate = 'DATETIME(\'now\')'
query = '''
SELECT some stuff
FROM table
WHERE table.date BETWEEN ? AND ?
'''
df = pd.read_sql_query(query, db, params=[startDate, endDate])
You can try with the string format as shown below,
startDate = "DATETIME('now', '-6 month')"
endDate = "DATETIME('now')"
query = '''
SELECT some stuff
FROM table
WHERE table.date BETWEEN {start_date} AND {end_data}
'''
df = pd.read_sql_query(query.format(start_date=startDate, end_data=endDate), db)
When you provide parameters to a query, they're treated as literals, not expressions that SQL should evaluate.
You can pass the function arguments rather than the function as a string.
startDate = 'now'
startOffset = '-6 month'
endDate = 'now'
endOffset = '+0 seconds'
query = '''
SELECT some stuff
FROM table
WHERE table.date BETWEEN DATETIME(?, ?) AND DATETIME(?, ?)
'''
df = pd.read_sql_query(query, db, params=[startDate, startOffset, endDate, endOffset])
I have the following query in which i'm trying to pass start dates and end dates in a sql query.
def get_data(start_date,end_date):
ic = Connector()
q = f"""
select * from example_table a
where a.date between {start_date} and {end_date}
"""
result = ic.query(q)
return result
df = pd.DataFrame(get_data('2021-01-01','2021-01-31'))
print(df)
which leads to the following error:
AnalysisException: Incompatible return types 'STRING' and 'BIGINT' of exprs 'a.date' and '2021 - 1 - 1'.\n (110) (SQLExecDirectW)")
I have also tried to parse the dates as follows:
import datetime
start_date = datetime.date(2021,1,1)
end_date = datetime.date(2021,5,13)
df = pd.DataFrame(get_data(start_date,end_date))
but i still get the same error.
Any help will be much appreciated.
It seems to me, that it is because how you inject your values into sql query they don't get recognized as date values. Database will likely 2021-01-01 interpret as mathematical expression with 2019 being the result.
You should try put parentheses around your values.
q = f"""
select * from example_table a
where a.date between '{start_date}' and '{end_date}'
"""
Or preferably if your db library allows it don't inject your values directly
q = """
select * from example_table a
where a.date between %s and %s
"""
result = ic.query(q, (start_date, end_date))
EDIT: Some database libraries may use place-holder with different format than %s. You should probably consult documentation of db library you are using.
I have a booking system and I save the booked daterange in a DATERANGE column:
booked_date = Column(DATERANGE(), nullable=False)
I already know that I can access the actual dates with booked_date.lower or booked_date.upper
For example I do this here:
for bdate in room.RoomObject_addresses_UserBooksRoom:
unaviable_ranges['ranges'].append([str(bdate.booked_date.lower),\
str(bdate.booked_date.upper)])
Now I need to filter my bookings by a given daterange. For example I want to see all bookings between 01.01.2018 and 10.01.2018.
Usually its simple, because dates can be compared like this: date <= other date
But if I do it with the DATERANGE:
the_daterange_lower = datetime.strptime(the_daterange[0], '%d.%m.%Y')
the_daterange_upper = datetime.strptime(the_daterange[1], '%d.%m.%Y')
bookings = UserBooks.query.filter(UserBooks.booked_date.lower >= the_daterange_lower,\
UserBooks.booked_date.upper <= the_daterange_upper).all()
I get an error:
AttributeError: Neither 'InstrumentedAttribute' object nor 'Comparator' object associated with UserBooks.booked_date has an attribute 'lower'
EDIT
I found a sheet with useful range operators and it looks like there are better options to do what I want to do, but for this I need somehow to create a range variable, but python cant do this. So I am still confused.
In my database my daterange column entries look like this:
[2018-11-26,2018-11-28)
EDIT
I am trying to use native SQL and not sqlalchemy, but I dont understand how to create a daterange object.
bookings = db_session.execute('SELECT * FROM usersbookrooms WHERE booked_date && [' + str(the_daterange_lower) + ',' + str(the_daterange_upper) + ')')
The query
the_daterange_lower = datetime.strptime(the_daterange[0], '%d.%m.%Y')
the_daterange_upper = datetime.strptime(the_daterange[1], '%d.%m.%Y')
bookings = UserBooks.query.\
filter(UserBooks.booked_date.lower >= the_daterange_lower,
UserBooks.booked_date.upper <= the_daterange_upper).\
all()
could be implemented using "range is contained by" operator <#. In order to pass the right operand you have to create an instance of psycopg2.extras.DateRange, which represents a Postgresql daterange value in Python:
the_daterange_lower = datetime.strptime(the_daterange[0], '%d.%m.%Y').date()
the_daterange_upper = datetime.strptime(the_daterange[1], '%d.%m.%Y').date()
the_daterange = DateRange(the_dateranger_lower, the_daterange_upper)
bookings = UserBooks.query.\
filter(UserBooks.booked_date.contained_by(the_daterange)).\
all()
Note that the attributes lower and upper are part of the psycopg2.extras.Range types. The SQLAlchemy range column types do not provide such, as your error states.
If you want to use raw SQL and pass date ranges, you can use the same DateRange objects to pass values as well:
bookings = db_session.execute(
'SELECT * FROM usersbookrooms WHERE booked_date && %s',
(DateRange(the_daterange_lower, the_daterange_upper),))
You can also build literals manually, if you want to:
bookings = db_session.execute(
'SELECT * FROM usersbookrooms WHERE booked_date && %s::daterange',
(f'[{the_daterange_lower}, {the_daterange_upper})',))
The trick is to build the literal in Python and pass it as a single value – using placeholders, as always. It should avoid any SQL injection possibilities; only thing that can happen is that the literal has invalid syntax for a daterange. Alternatively you can pass the bounds to a range constructor:
bookings = db_session.execute(
'SELECT * FROM usersbookrooms WHERE booked_date && daterange(%s, %s)',
(the_daterange_lower, the_daterange_upper))
All in all it is easier to just use the Psycopg2 Range types and let them handle the details.
I am trying to query a table in Bigquery via a Python script. However I have written the query as a standard sql query. For this I need to start my query with '#standardsql'. However when I do this it then comments out the rest of my query. I have tried to write the query using multiple lines but it does not allow me to do this either. Has anybody dealt with a problem like this and found out a solution? Below is my first code where the query becomes commented out.
client = bigquery.Client('dataworks-356fa')
query = ("#standardsql SELECT count(distinct serial) FROM `dataworks-356fa.FirebaseArchive.test2` Where (PeripheralType = 1 or PeripheralType = 2 or PeripheralType = 12) AND EXTRACT(WEEK FROM createdAt) = EXTRACT(WEEK FROM CURRENT_TIMESTAMP()) - 1 AND serial != 'null'")
dataset = client.dataset('FirebaseArchive')
table = dataset.table('test2')
tbl = dataset.table('Count_BB_Serial_weekly')
job = client.run_async_query(str(uuid.uuid4()), query)
job.destination = tbl
job.write_disposition= 'WRITE_TRUNCATE'
job.begin()
When I try to write the query like this python does not read anything past on the second line as the query.
query = ("#standardsql
SELECT count(distinct serial) FROM `dataworks-356fa.FirebaseArchive.test2` Where (PeripheralType = 1 or PeripheralType = 2 or PeripheralType = 12) AND EXTRACT(WEEK FROM createdAt) = EXTRACT(WEEK FROM CURRENT_TIMESTAMP()) - 1 AND serial != 'null'")
The query Im running selects values that have been produced within the last week. If there is a variation of this that would not be required to use standardsql I would be willing to switch my other queries as well but I have not been able to figure out how to do that. I would prefer for this to be the last resort though. Thank you for the help!
If you want to flag you'll be using Standard SQL inside the query itself, you can build it like:
query = """#standardSQL
SELECT count(distinct serial) FROM `dataworks-356fa.FirebaseArchive.test2` Where (PeripheralType = 1 or PeripheralType = 2 or PeripheralType = 12) AND EXTRACT(WEEK FROM createdAt) = EXTRACT(WEEK FROM CURRENT_TIMESTAMP()) - 1 AND serial != 'null'
"""
Another option you can use as well is setting the property use_legacy_sql of the job created to False, something like:
job = client.run_async_query(job_name, query)
job.use_legacy_sql = False # -->this also makes the API use Standard SQL
job.begin()