I have a database (called 'all_bookings') that has a column called 'date', in a table called 'bookings', which stores dates in the 'dd/mm/yyyy' form, for example '16/10/2017'. I want to write a function in Python that will search for all dates in a particular month and output them. For example, if the dates stored were '12/12/2017', '23/11/2018', and '19/12/2018', and I wanted to output all dates that are in December, how would I do this? I know how to search for a specific date, just not a particular month. Any help would be greatly appreciated. Thanks
The SUBSTR function will do what you want. Modified to do what was asked in the OP's comment.
import sqlite3
conn = sqlite3.connect(':memory:')
events = [
    ('12/12/2017', 'ev_1'),
    ('23/11/2018', 'ev_2'),
    ('19/12/2018', 'ev_3'),
]
conn.execute('CREATE TABLE bookings (date, event)')
conn.executemany('INSERT INTO bookings (date, event) values (?,?)', events)
validMonth = False
while not validMonth:
    lookForMonth = input('What month, please? (number from 1 through 12):')
    try:
        validMonth = 1<=int(lookForMonth)<=12
    except ValueError:
        # not a number, so ask again
        pass
# characters 4-5 of 'dd/mm/yyyy' hold the month
sqlCmd = "SELECT date FROM bookings WHERE SUBSTR(date,4,2)='%.2i'" % int(lookForMonth)
for row in conn.execute(sqlCmd):
    print (row)
Results are:
('12/12/2017',)
('19/12/2018',)
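For what it's worth, the same lookup can also be written with a bound parameter instead of formatting the month into the SQL string. This is only a sketch: the helper name dates_in_month is made up for illustration, and it assumes every date is stored as zero-padded 'dd/mm/yyyy' text.

```python
import sqlite3

def dates_in_month(conn, month):
    """Return all booking dates whose month part equals `month` (1-12).

    Hypothetical helper; assumes dates are stored as 'dd/mm/yyyy' text.
    """
    if not 1 <= month <= 12:
        raise ValueError('month must be between 1 and 12')
    # SUBSTR(date, 4, 2) picks the two-character month out of 'dd/mm/yyyy'.
    sql = 'SELECT date FROM bookings WHERE SUBSTR(date, 4, 2) = ? ORDER BY rowid'
    return [row[0] for row in conn.execute(sql, ('%02d' % month,))]

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE bookings (date, event)')
conn.executemany(
    'INSERT INTO bookings (date, event) VALUES (?, ?)',
    [('12/12/2017', 'ev_1'), ('23/11/2018', 'ev_2'), ('19/12/2018', 'ev_3')],
)
december_dates = dates_in_month(conn, 12)
print(december_dates)
```

Passing the month as a parameter means the user's input never becomes part of the SQL text.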
I'm trying to query a database using Python/Pandas. This will be a recurring request where I'd like to look back into a window of time that changes over time, so I'd like to use some smarts in how I do this.
In my SQLite query, if I say
WHERE table.date BETWEEN DATETIME('now', '-6 month') AND DATETIME('now')
I get the result I expect. But if I try to move those to variables, the resulting table comes up empty. I found out that the endDate variable does work but the startDate does not. Presumably I'm doing something wrong with the escapes around the apostrophes? Since the result is coming up empty it's like it's looking at DATETIME(\'now\') and not seeing the '-6 month' bit (comparing now vs. now which would be empty). Any ideas how I can pass this through to the query correctly using Python?
startDate = 'DATETIME(\'now\', \'-6 month\')'
endDate = 'DATETIME(\'now\')'
query = '''
SELECT some stuff
FROM table
WHERE table.date BETWEEN ? AND ?
'''
df = pd.read_sql_query(query, db, params=[startDate, endDate])
You can try string formatting, as shown below:
startDate = "DATETIME('now', '-6 month')"
endDate = "DATETIME('now')"
query = '''
SELECT some stuff
FROM table
WHERE table.date BETWEEN {start_date} AND {end_date}
'''
df = pd.read_sql_query(query.format(start_date=startDate, end_date=endDate), db)
When you provide parameters to a query, they're treated as literals, not expressions that SQL should evaluate.
You can pass the function arguments rather than the function as a string.
startDate = 'now'
startOffset = '-6 month'
endDate = 'now'
endOffset = '+0 seconds'
query = '''
SELECT some stuff
FROM table
WHERE table.date BETWEEN DATETIME(?, ?) AND DATETIME(?, ?)
'''
df = pd.read_sql_query(query, db, params=[startDate, startOffset, endDate, endOffset])
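To see the difference between a literal and an expression for yourself, here is a small sketch using only the standard sqlite3 module (no pandas, and an in-memory database rather than your real one). The first query binds the whole function call as a parameter, the second binds only the function's arguments.

```python
import sqlite3

conn = sqlite3.connect(':memory:')

# Bound as a parameter, the text comes back unchanged: SQLite never evaluates it.
literal = conn.execute('SELECT ?', ("DATETIME('now')",)).fetchone()[0]

# With DATETIME() written in the SQL itself, only its arguments are bound.
evaluated = conn.execute('SELECT DATETIME(?, ?)', ('now', '-6 month')).fetchone()[0]

print(literal)    # the string DATETIME('now'), not a timestamp
print(evaluated)  # a 'YYYY-MM-DD HH:MM:SS' timestamp six months back
```

This would explain the empty result in the question: the date column was being compared with two plain strings rather than with two timestamps.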
I have an SQLite database containing three columns: EmpName, theTime, and theDate. I'm trying to run a query that will return rows with EmpName and theTime that fit a specific range of dates. My goal is to add all these times up and give an end-of-week total time. However, when I run my code, it returns a list of 0.
I've tried queries such as:
SELECT * FROM Swipes WHERE theDate>=:starting<=:endof", {'starting': fstartingDate, 'endof': formatted}
and
SELECT EmpName AND theTime FROM Swipes WHERE theDate >=? <=?, (fstartingdate, formatted,)
and
SELECT EmpName AND theTime FROM Swipes WHERE thedate BETWEEN ? AND ?", (fstartingDate, formatted)
and other variations of that approach, but it feels like I'm running in circles.
This is my code:
def weekSummary(endofthatweek):
    formatted = dt.strptime(endofthatweek, '%Y-%m-%d')
    startingDate = formatted - td(days=7)
    fstartingDate = startingDate.date()
    con = sqlite3.connect(r'C:\Users\zstrickland.RANDSMACHINE\Documents\PymeClock\testTimeclock.db')
    cur = con.cursor()
    cur.execute("SELECT EmpName AND theTime FROM Swipes WHERE theDate>=:starting<=:endof", {'starting': fstartingDate, 'endof': formatted})
    tupes = cur.fetchall()
    con.close()
    detuped = [x[0] for x in tupes]
    print(detuped)
I hope to get a list (probably a list of tuples) in the following format:
[(EmpName, theTime), (EmpName, theTime), (EmpName, theTime), (EmpName, theTime)].
Any help or suggestions on how to make this calculation would be helpful. Thank you!
These conditions:
theDate>=:starting<=:endof
and:
theDate >=? <=?
are not valid range conditions: SQL does not chain comparisons the way Python does.
You need BETWEEN:
theDate BETWEEN :starting AND :endof
or:
theDate BETWEEN ? AND ?
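Two other things in the question's code look suspicious to me, although I can't test against your database. SELECT EmpName AND theTime asks for the logical AND of the two columns rather than both columns, which would explain a list of 0 values; the columns need a comma between them. The two bound values are also of different types (a date and a datetime). Below is a rough sketch that combines these fixes with BETWEEN. It assumes theDate is stored as 'YYYY-MM-DD' text and uses made-up sample rows in an in-memory database.

```python
import sqlite3
from datetime import datetime, timedelta

def week_summary(con, end_of_week):
    """Return (EmpName, theTime) rows from end_of_week minus 7 days
    through end_of_week, inclusive.

    Assumes theDate holds 'YYYY-MM-DD' text, so string order matches date order.
    """
    end = datetime.strptime(end_of_week, '%Y-%m-%d').date()
    start = end - timedelta(days=7)
    cur = con.execute(
        # A comma separates the columns; AND would combine them into one boolean.
        'SELECT EmpName, theTime FROM Swipes '
        'WHERE theDate BETWEEN :starting AND :endof ORDER BY theDate',
        {'starting': start.isoformat(), 'endof': end.isoformat()},
    )
    return cur.fetchall()

con = sqlite3.connect(':memory:')
con.execute('CREATE TABLE Swipes (EmpName TEXT, theTime REAL, theDate TEXT)')
con.executemany('INSERT INTO Swipes VALUES (?, ?, ?)', [
    ('Ann', 8.0, '2019-03-04'),
    ('Bob', 7.5, '2019-03-08'),
    ('Ann', 6.0, '2019-02-20'),  # outside the window
])
rows = week_summary(con, '2019-03-08')
print(rows)
```

If your dates are stored in another format, the string comparison inside BETWEEN will not follow calendar order and the query would need adjusting.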
I have two queries and I want their results to line up row for row, that is, I want both queries to return the same number of rows. The code below fetches the dates of the present month and the scores, but I have to change the row limit by hand every day, which is not practical.
cursor.execute("select TO_CHAR(i :: DATE, 'dd/mm/yyyy') from generate_series(date_trunc('month', current_date), current_date, '1 day'::interval) i ")
# data = cursor.fetchone()
rows = cursor.fetchall()
labels6 = list()
i = 0
for row in rows:
    labels6.append(row[i])
The code above fetches the dates of the current month.
cursor.execute("select score*100 from daily_stats1 where user_id=102")
rows = cursor.fetchall()
# Convert query to objects of key-value pairs
presentmonth1 = list()
i = 0
for row in rows[:28]:
    presentmonth1.append(row[i])
The code above fetches the present month's scores. The '28' is set manually and I would have to change it every day, which is not practical, so I want a solution where the number of date rows matches the number of score rows.
I assume the excess indentation in your code is a mistake.
If that is the case, I think this will solve your problem:
cursor.execute("select TO_CHAR(i :: DATE, 'dd/mm/yyyy') from "
"generate_series(date_trunc('month', current_date), current_date, '1 day'::interval) i ")
labels6 = cursor.fetchall()
cursor.execute("select score*100 from daily_stats1 where user_id=102")
presentmonth1 = cursor.fetchall()[:len(labels6)]
I removed some unneeded code, but the result should be correct.
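One thing to keep in mind: fetchall() gives back a list of one-element tuples, while the original loops built flat lists. If the rest of your code needs flat lists, something along these lines should work. The two lists at the top are made-up stand-ins for the cursor results, since I don't have your tables.

```python
# Stand-ins for the two cursor.fetchall() results; the values are made up.
date_rows = [('01/05/2019',), ('02/05/2019',), ('03/05/2019',)]
score_rows = [(91.0,), (87.5,), (78.0,), (66.0,), (59.5,)]

# Unpack the one-element tuples into flat lists.
labels6 = [row[0] for row in date_rows]
# Keep only as many scores as there are dates, instead of a hard-coded 28.
presentmonth1 = [row[0] for row in score_rows[:len(labels6)]]

# zip() pairs the two lists and stops at the shorter one.
paired = list(zip(labels6, presentmonth1))
print(paired)
```

Note that this only matches the row counts. It does not check that a given score actually belongs to the date it ends up next to.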
I have a database that has a bookings table in. One of the columns in the bookings table is 'incomes', and another one is 'date_of_booking,' which stores dates in 'DD/MM/YYYY' format. I am trying to write a feature that lets a user input a month, and from that will calculate all the incomes from that month. So far I have this:
validMonth = False
while not validMonth:
    lookForMonth = input('What month, please? (number from 1 through 12):')
    try:
        validMonth = 1<=int(lookForMonth)<=12
    except:
        pass
sqlCmd = 'SELECT date FROM bookings WHERE SUBSTR(date,4,2)="%.2i"' % int(lookForMonth)
for row in conn.execute(sqlCmd):
    print (row)
With this code, I am able to output the date of bookings for a particular month. However I want to output the total incomes for a particular month. What do I need to add so that this works out the total incomes for a particular month and outputs it? Any help would be gladly appreciated, thanks.
Replace one statement.
SELECT sum(income) FROM bookings where SUBSTR(date,4,2)='04'
As in:
import sqlite3
conn = sqlite3.connect(':memory:')
c = conn.cursor()
c.execute('CREATE TABLE bookings (date text, income real)')
c.execute('''INSERT INTO bookings VALUES ('01/04/2017', 19.22)''')
c.execute('''INSERT INTO bookings VALUES ('15/04/2017', 19.22)''')
c.execute('''INSERT INTO bookings VALUES ('22/04/2017', 19.22)''')
validMonth = False
while not validMonth:
    lookForMonth = input('What month, please? (number from 1 through 12):')
    try:
        validMonth = 1<=int(lookForMonth)<=12
    except ValueError:
        # not a number, so ask again
        pass
sql = '''SELECT sum(income) FROM bookings where SUBSTR(date,4,2)="%.2i"''' % int(lookForMonth)
for row in c.execute(sql):
    print (row)
Resulting output:
What month, please? (number from 1 through 12):4
(57.66,)
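Be aware that filtering on the month alone adds together the same month from every year in the table. If that is not what you want, a variation like the following may help. It is a sketch only: income_for_month is a hypothetical helper, the sample rows are made up, and it assumes zero-padded 'dd/mm/yyyy' text in the date column.

```python
import sqlite3

def income_for_month(conn, month, year):
    """Sum income for one month of one year (hypothetical helper).

    Assumes date is zero-padded 'dd/mm/yyyy' text, so characters 4-10 are 'mm/yyyy'.
    """
    key = '%02d/%04d' % (month, year)
    # COALESCE turns the NULL that SUM gives for zero rows into 0.
    sql = ('SELECT COALESCE(SUM(income), 0) FROM bookings '
           'WHERE SUBSTR(date, 4, 7) = ?')
    return conn.execute(sql, (key,)).fetchone()[0]

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE bookings (date text, income real)')
conn.executemany('INSERT INTO bookings VALUES (?, ?)', [
    ('01/04/2017', 19.22),
    ('15/04/2017', 19.22),
    ('22/04/2018', 5.0),  # same month, different year
])
april_2017 = income_for_month(conn, 4, 2017)
print(april_2017)
```

The month and year are passed as a bound parameter here, so no user input is formatted into the SQL string.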
First of all, you want to select both columns in your SQL statement.
sqlCmd = 'SELECT date_of_booking,incomes FROM bookings WHERE SUBSTR(date_of_booking,4,2)="%.2i"' % int(lookForMonth)
income_sum = 0
for (row_date, row_income) in conn.execute(sqlCmd):
    income_sum += row_income
    print(row_date)
print(income_sum)
Then you can unpack both the date and the income of each row in your loop, as above.
So I found a great script over at QuantState that had a great walk-through on setting up my own securities database and loading in historical pricing information. However, I'm now trying to modify the script so that I can run it daily and have the latest stock quotes added.
I adjusted the initial data load to just download 1 week's worth of historicals, but I've been having issues with writing the SQL statement to see if the row exists already before adding. Can anyone help me out with this? Here's what I have so far:
def insert_daily_data_into_db(data_vendor_id, symbol_id, daily_data):
    """Takes a list of tuples of daily data and adds it to the
    database. Appends the vendor ID and symbol ID to the data.
    daily_data: List of tuples of the OHLC data (with
    adj_close and volume)"""
    # Create the time now
    now = datetime.datetime.utcnow()
    # Amend the data to include the vendor ID and symbol ID
    daily_data = [(data_vendor_id, symbol_id, d[0], now, now,
                   d[1], d[2], d[3], d[4], d[5], d[6]) for d in daily_data]
    # Create the insert strings
    column_str = """data_vendor_id, symbol_id, price_date, created_date,
                 last_updated_date, open_price, high_price, low_price,
                 close_price, volume, adj_close_price"""
    insert_str = ("%s, " * 11)[:-2]
    final_str = "INSERT INTO daily_price (%s) VALUES (%s) WHERE NOT EXISTS (SELECT 1 FROM daily_price WHERE symbol_id = symbol_id AND price_date = insert_str[2])" % (column_str, insert_str)
    # Using the postgre connection, carry out an INSERT INTO for every symbol
    with con:
        cur = con.cursor()
        cur.executemany(final_str, daily_data)
Some notes regarding your code above:
It's generally easier to defer to now() in pure SQL than to try in Python whenever possible. It avoids lots of potential pitfalls with timezones, library differences, etc.
If you construct a list of columns, you can dynamically generate a string of %s's based on its size, and don't need to hardcode the length into a repeated string which is then sliced.
Since it appears that insert_daily_data_into_db is meant to be called from within a loop on a per-stock basis, I don't believe you want to use executemany here, which would require a list of tuples and is very different semantically.
You were comparing symbol_id to itself in the sub select, instead of a particular value (which would mean it's always true).
To prevent possible SQL injection, you should always pass values in the WHERE clause as query parameters, including in sub selects, rather than formatting them into the SQL string.
Note: I'm assuming that you're using psycopg2 to access Postgres, and that the primary key for the table is a tuple of (symbol_id, price_date). If not, the code below would need to be tweaked at least a bit.
With those points in mind, try something like this (untested, since I don't have your data, db, etc. but it is syntactically valid Python):
def insert_daily_data_into_db(data_vendor_id, symbol_id, daily_data):
    """Takes one tuple of daily data and adds it to the database
    unless a row for that symbol and price date already exists.
    daily_data: Tuple of the OHLC data (with adj_close and
    volume), price date first"""
    column_list = ["data_vendor_id", "symbol_id", "price_date", "created_date",
                   "last_updated_date", "open_price", "high_price", "low_price",
                   "close_price", "volume", "adj_close_price"]
    # now() has to appear in the SQL text; passed as a parameter it would be a plain string
    timestamp_columns = ("created_date", "last_updated_date")
    insert_list = ['now()' if col in timestamp_columns else '%s' for col in column_list]
    values_tuple = (data_vendor_id, symbol_id, daily_data[0], daily_data[1],
                    daily_data[2], daily_data[3], daily_data[4], daily_data[5], daily_data[6])
    # INSERT ... SELECT is used because VALUES cannot be followed by a WHERE clause
    final_str = """INSERT INTO daily_price ({0})
                   SELECT {1}
                   WHERE NOT EXISTS (SELECT 1
                                     FROM daily_price
                                     WHERE symbol_id = %s
                                     AND price_date = %s)""".format(', '.join(column_list), ', '.join(insert_list))
    # Using the postgre connection, carry out the INSERT INTO for this row
    with con:
        cur = con.cursor()
        # execute() takes the query plus one sequence holding all parameters
        cur.execute(final_str, values_tuple + (symbol_id, daily_data[0]))
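If you want to try the "insert only if missing" idea without a Postgres server, here is a cut-down sketch using the standard sqlite3 module. It is not your schema: the table is reduced to three columns, the helper insert_if_missing and the sample rows are made up, and sqlite3 uses ? placeholders where psycopg2 uses %s. The placeholder string is generated from the length of the column list, and the statement is written as INSERT ... SELECT ... WHERE NOT EXISTS.

```python
import sqlite3

columns = ['symbol_id', 'price_date', 'close_price']
# One placeholder per column, generated from the list length.
placeholders = ', '.join(['?'] * len(columns))

insert_sql = (
    'INSERT INTO daily_price ({0}) '
    'SELECT {1} '
    'WHERE NOT EXISTS (SELECT 1 FROM daily_price '
    'WHERE symbol_id = ? AND price_date = ?)'
).format(', '.join(columns), placeholders)

def insert_if_missing(con, row):
    """Insert row unless its (symbol_id, price_date) pair is already stored."""
    # The first three values fill the SELECT list, the last two the sub-select.
    con.execute(insert_sql, tuple(row) + (row[0], row[1]))

con = sqlite3.connect(':memory:')
con.execute('CREATE TABLE daily_price (symbol_id, price_date, close_price)')
insert_if_missing(con, (1, '2015-06-01', 10.5))
insert_if_missing(con, (1, '2015-06-01', 10.5))  # duplicate, skipped
insert_if_missing(con, (1, '2015-06-02', 10.7))
row_count = con.execute('SELECT COUNT(*) FROM daily_price').fetchone()[0]
print(row_count)
```

Running the same insert twice leaves a single row for that symbol and date, which is the behaviour a daily re-run needs.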