I have a database with text-formatted dates, and now I need to filter specific date ranges.
This query works for me:
SELECT CDate(field) AS df
FROM table
WHERE CDate(field)=Date();
Sadly, I haven't found out how I can build a SQLAlchemy query like this.
This works for me:
import sqlalchemy as sa
import sqlalchemy_access as sa_a
# …
tbl = sa.Table("so71529087", sa.MetaData(), sa.Column("field", sa_a.DateTime))
tbl.drop(engine, checkfirst=True)
tbl.create(engine)
qry = sa.select(sa.func.CDate(tbl.c.field).label("df")).where(
    sa.func.CDate(tbl.c.field) == sa.text("Date()")
)
print(qry)
"""
SELECT CDate(so71529087.field) AS df
FROM so71529087
WHERE CDate(so71529087.field) = Date()
"""
with engine.begin() as conn:
    results = conn.execute(qry).all()
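For an actual date range rather than a single day, the same approach should work with between(); a quick sketch reusing tbl and engine from above, with made-up start/end dates (assuming the Access ODBC driver accepts Python date objects as parameters):
import datetime

# Sketch only: range filter on the CDate() expression.
start = datetime.date(2022, 3, 1)   # hypothetical range bounds
end = datetime.date(2022, 3, 31)

range_qry = sa.select(sa.func.CDate(tbl.c.field).label("df")).where(
    sa.func.CDate(tbl.c.field).between(start, end)
)

with engine.begin() as conn:
    range_results = conn.execute(range_qry).all()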
Consider the following working code that copies a source SQLite database to a target SQLite database:
# Create two databases.
import sqlite3
import pandas as pd
import time
cn_src = sqlite3.connect('source.db')
df=pd.DataFrame({"x":[1,2],"y":[2.0,3.0]})
df.to_sql("A", cn_src, if_exists="replace", index=False)
cn_tgt = sqlite3.connect('target.db')
cn_src.close()
cn_tgt.close()
from sqlalchemy import create_engine, MetaData, event
from sqlalchemy.sql import sqltypes
# create sqlalchemy connection
src_engine = create_engine("sqlite:///source.db")
src_metadata = MetaData(bind=src_engine)
exclude_tables = ('sqlite_master', 'sqlite_sequence', 'sqlite_temp_master')
tgt_engine = create_engine("sqlite:///target.db")
tgt_metadata = MetaData(bind=tgt_engine)
@event.listens_for(src_metadata, "column_reflect")
def genericize_datatypes(inspector, tablename, column_dict):
    column_dict["type"] = column_dict["type"].as_generic(allow_nulltype=True)
tgt_conn = tgt_engine.connect()
tgt_metadata.reflect()
# delete tables in target database.
for table in reversed(tgt_metadata.sorted_tables):
    if table.name not in exclude_tables:
        print('dropping table =', table.name)
        table.drop()
tgt_metadata.clear()
tgt_metadata.reflect()
src_metadata.reflect()
# copy table
for table in src_metadata.sorted_tables:
    if table.name not in exclude_tables:
        table.create(bind=tgt_engine)
# Update meta information
tgt_metadata.clear()
tgt_metadata.reflect()
# Copy data
for table in tgt_metadata.sorted_tables:
    src_table = src_metadata.tables[table.name]
    stmt = table.insert()
    for index, row in enumerate(src_table.select().execute()):
        print("table =", table.name, "Inserting row", index)
        start=time.time()
        stmt.execute(row._asdict())
        end=time.time()
        print(end-start)
The code was mainly borrowed from another source. The problem is that the per-row time (end-start) is about 0.017 seconds on my computer, which is too slow. Is there any way to speed this up? I have tried setting isolation_level=None in create_engine, but no luck.
It seems that the Insert object has no executemany method, so we can't use bulk inserting.
SQLAlchemy does not implement separate execute() and executemany() methods. Its execute() method looks at the parameters it receives and
if they consist of a single dict object (i.e., a single row) then it calls execute() at the driver level, or
if they consist of a list of dict objects (i.e., multiple rows) then it calls executemany() at the driver level.
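A quick, self-contained sketch of that dispatch (the in-memory table t below is made up purely for illustration):
import sqlalchemy as sa

# Hypothetical table, only to show how the same conn.execute() call dispatches.
eng = sa.create_engine("sqlite://")
t = sa.Table("t", sa.MetaData(), sa.Column("x", sa.Integer))
t.create(eng)

with eng.begin() as conn:
    conn.execute(t.insert(), {"x": 1})              # single dict -> execute() at the driver level
    conn.execute(t.insert(), [{"x": 2}, {"x": 3}])  # list of dicts -> executemany() at the driver level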
Note also that you are using deprecated usage patterns, specifically MetaData(bind=…). You should be doing something more like this:
import sqlalchemy as sa
engine = sa.create_engine("sqlite://")
tbl = sa.Table(
    "tbl",
    sa.MetaData(),
    sa.Column("id", sa.Integer, primary_key=True, autoincrement=False),
    sa.Column("txt", sa.String),
)
tbl.create(engine)

with engine.begin() as conn:
    stmt = sa.insert(tbl)
    params = [
        dict(id=1, txt="foo"),
        dict(id=2, txt="bar"),
    ]
    conn.execute(stmt, params)

# check results
with engine.begin() as conn:
    print(conn.exec_driver_sql("SELECT * FROM tbl").all())
    # [(1, 'foo'), (2, 'bar')]
I came up with a solution using a transaction:
# Copy data
trans=tgt_conn.begin()
for table in tgt_metadata.sorted_tables:
    src_table = src_metadata.tables[table.name]
    stmt = table.insert().execution_options(autocommit=False)
    for index, row in enumerate(src_table.select().execute()):
        tgt_conn.execute(stmt, row._asdict())  # must use tgt_conn.execute(), not stmt.execute()
trans.commit()
tgt_conn.close()
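If looping row by row is still too slow even inside a transaction, a further variation (a sketch only, reusing src_engine, tgt_engine, src_metadata and tgt_metadata from the question) is to collect each table's rows into a list, so SQLAlchemy issues a single executemany-style INSERT per table:
# Sketch: one bulk insert per table instead of one insert per row.
with src_engine.connect() as s_conn, tgt_engine.begin() as t_conn:
    for table in tgt_metadata.sorted_tables:
        src_table = src_metadata.tables[table.name]
        rows = [row._asdict() for row in s_conn.execute(src_table.select())]
        if rows:
            t_conn.execute(table.insert(), rows)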
So I am fairly new to Flask and I am currently trying to create a Flask API for a project I am working on. However, there are a couple of issues I am facing.
For my 1st issue, I can't get the dataframe from the 1st function to work in my second function. I am just wondering how I can get data_1 to work in the second function.
Code:
from flask import Flask
from sqlalchemy import create_engine
import sqlite3 as sql
import pandas as pd
import datetime
import os
app = Flask(__name__)
@app.route('/', methods=['GET'])
def get_data():
    ...
    data_1 = ...
    #print(data_1.head(n=10))
    return "hello"

@app.route('/table1', methods=['GET'])
def store_table1_data_df():
    db_path = os.path.join(os.path.dirname(__file__),'table1.db')
    engine = create_engine('sqlite:///{}'.format(db_path), echo=True)
    sqlite_connection = engine.connect()
    sqlite_table = 'table1'
    data_1.to_sql(sqlite_table,sqlite_connection, if_exists='append')
    sqlite_connection.close()
    return "table1"
For my second issue, is there a better way of storing a dataframe within a Flask API using SQLAlchemy or sqlite3?
More context on what data_1 is: data_1 can only hold the past 15 days/records, e.g. from 6/15/2021-6/30/2021. However, tomorrow, if I fetch the newest data_1 it will contain 6/16/2021-7/01/2021. How can I just append 07/01/2021 to the old data_1 without creating duplicate records from 06/16/2021, without creating two more functions, and without an extra db file?
@app.route('/table1', methods=['GET'])
def store_table1_data_df():
    db_path = os.path.join(os.path.dirname(__file__),'table1.db')
    engine = create_engine('sqlite:///{}'.format(db_path), echo=True)
    sqlite_connection = engine.connect()
    sqlite_table = 'table1'
    data_1.to_sql(sqlite_table,sqlite_connection, if_exists='append')
    sqlite_connection.close()
    return "table1"

@app.route('/table2', methods=['GET'])
def store_table2_data_df():
    db_path2 = os.path.join(os.path.dirname(__file__),'table2.db')
    engine2 = create_engine('sqlite:///{}'.format(db_path2), echo=True)
    sqlite_connection2 = engine2.connect()
    sqlite_table2 = 'table2'
    data_1.to_sql(sqlite_table2,sqlite_connection2, if_exists='append')
    sqlite_connection2.close()
    return "table2"
# What I probably have down below is not the correct way to solve this problem
@app.route('/table1', methods=['GET'])
def combine_tables():  # hypothetical function name
    conn = sql.connect("table1.db")
    cur = conn.cursor()
    #cur.execute
    cur.execute("ATTACH 'table2.db' as 'table2' ")
    conn.commit()
    table_3 = pd.read_sql_query("SELECT DISTINCT date, value FROM table1 UNION SELECT DISTINCT date, value FROM table2 ORDER BY date", conn)
    cur.execute("SELECT DISTINCT date, value FROM table1 UNION SELECT DISTINCT date, value FROM table2 ORDER BY date")
    conn.commit()
    results3 = cur.fetchall()
    sqlite_table='table1'
    table_3.to_sql(sqlite_table, conn, if_exists='replace')
    cur.close()
    conn.close()
    return "work"
Any help is greatly appreciated.
For your 1st problem, you may do any of these:
If the size of data_1 is small (less than 200 KB), you may use flask-session to store the data and access it across routes.
You create a function that returns data_1. Call that function in any route you want. Hint:
def getdata1(val1, val2):
    #calculation here
    return data_1
Just call this wherever you need data_1 (see the sketch after these options).
Store the data frame in a DB and fetch it.
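To make the helper-function option concrete, here is a minimal sketch that calls it inside the route from the question; val1 and val2 are placeholders carried over from the hint, and the imports are the ones already shown in the question:
# Sketch only: fetch data_1 via the helper so it exists inside the route.
@app.route('/table1', methods=['GET'])
def store_table1_data_df():
    data_1 = getdata1(val1, val2)  # val1/val2 are placeholders
    db_path = os.path.join(os.path.dirname(__file__), 'table1.db')
    engine = create_engine('sqlite:///{}'.format(db_path), echo=True)
    data_1.to_sql('table1', engine, if_exists='append', index=False)
    return "table1"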
For the second part, a simple for loop will work. Hint on that:
sql_table = ["Fetch your sql table here with the dataframe. Considering dates in one column"]
data_1 = ["Your dataframe"]
for i in data_1['Dates']:
    if i not in sql_table['dates']:
        #insert this key:value pair into the sql table
If your data frame and SQL table are loaded in order by date, even better. You just need to check the last elements of each.
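A minimal pandas sketch of that idea, assuming the SQLite table is named table1 with a date column (as in the question), that data_1 keeps its dates in a Dates column (as in the hint), and that engine is the SQLAlchemy engine from the question:
# Sketch only: append just the rows whose dates are not already stored.
existing = pd.read_sql_query("SELECT DISTINCT date FROM table1", engine)
new_rows = data_1[~data_1['Dates'].isin(existing['date'])]
if not new_rows.empty:
    new_rows.to_sql('table1', engine, if_exists='append', index=False)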
After a query with Python's psycopg2
SELECT
    id,
    array_agg(еnty_pub_uuid) AS ptr_entity_public
FROM table
GROUP BY id
I get an array returned:
{a630e0a3-c544-11ea-9b8c-b73c488956ba,c2f03d24-2402-11eb-ab91-3f8e49eb63e7}
How can I parse this into a list in Python?
Is there a built-in function in psycopg2?
psycopg2 takes care of type conversions between Python and Postgres:
import psycopg2
conn = psycopg2.connect("...")
cur = conn.cursor()
cur.execute(
    "select user_id, array_agg(data_name) from user_circles where user_id = '81' group by user_id"
)
res = cur.fetchall()
print(res[0])
print(type(res[0][1]))
Out:
('81', ['f085b2e3-b943-429e-850f-4ecf358abcbc', '65546d63-be96-4711-a4c1-a09f48fbb7f0', '81d03c53-9d71-4b18-90c9-d33322b0d3c6', '00000000-0000-0000-0000-000000000000'])
<class 'list'>
You need to register the UUID type so that psycopg2 can infer the type conversion between Python and Postgres:
import psycopg2.extras
psycopg2.extras.register_uuid()
sql = """
SELECT
    id,
    array_agg(еnty_pub_uuid) AS ptr_entity_public
FROM table
GROUP BY id
"""
cursor = con.cursor()
cursor.execute(sql)
results = cursor.fetchall()
for r in results:
    print(type(r[1]))
OK, I have tried several kinds of solutions recommended by others on this site and other sites. However, I can't get it to work as I would like it to.
I get an XML response which I normalize and then save to a CSV. This first part works fine.
Instead of saving it to CSV, I would like to save it into an existing table in an Access database. Two issues with the second part below:
I would like to use an existing table instead of creating a new one.
The result is not separated by ";" into different columns; everything ends up in the same column, not separated.
response = requests.get(u,headers=h).json()
dp = pd.json_normalize(response,'Units')
response_list.append(dp)
export = pd.concat(response_list)
export.to_csv(r'C:\Users\username\Documents\Python Scripts\Test\Test2_'+str(now)+'.csv', index=False, sep=';',encoding='utf-8')
access_path = r"C:\Users\username\Documents\Python Scripts\Test\Test_db.accdb"
conn = pyodbc.connect("DRIVER={{Microsoft Access Driver (*.mdb, *.accdb)}};DBQ={};" \
    .format(access_path))
strSQL = "SELECT * INTO projects2 FROM [text;HDR=Yes;FMT=sep(;);" + \
    "Database=C:\\Users\\username\\Documents\\Python Scripts\\Test].Testdata.csv;"
cur = conn.cursor()
cur.execute(strSQL)
conn.commit()
conn.close()
If you already have the data in a well-formed pandas DataFrame then you don't really need to dump it to a CSV file; you can use the sqlalchemy-access dialect to push the data directly into an Access table using pandas' to_sql() method:
from pprint import pprint
import urllib
import pandas as pd
import sqlalchemy as sa
connection_string = (
    r"DRIVER={Microsoft Access Driver (*.mdb, *.accdb)};"
    r"DBQ=C:\Users\Public\Database1.accdb;"
    r"ExtendedAnsiSQL=1;"
)
connection_uri = f"access+pyodbc:///?odbc_connect={urllib.parse.quote_plus(connection_string)}"
engine = sa.create_engine(connection_uri)
with engine.begin() as conn:
    # existing data in table
    pprint(
        conn.execute(sa.text("SELECT * FROM user_table")).fetchall(), width=30
    )
    """
    [('gord', 'gord@example.com'),
     ('jennifer', 'jennifer@example.com')]
    """

# DataFrame to insert
df = pd.DataFrame(
    [
        ("newdev", "newdev@example.com"),
        ("newerdev", "newerdev@example.com"),
    ],
    columns=["username", "email"],
)
df.to_sql("user_table", engine, index=False, if_exists="append")

with engine.begin() as conn:
    # updated table
    pprint(
        conn.execute(sa.text("SELECT * FROM user_table")).fetchall(), width=30
    )
    """
    [('gord', 'gord@example.com'),
     ('jennifer', 'jennifer@example.com'),
     ('newdev', 'newdev@example.com'),
     ('newerdev', 'newerdev@example.com')]
    """
(Disclosure: I am currently the maintainer of the sqlalchemy-access dialect.)
Solved with the following code:
SE_export_Tuple = list(zip(SE_export.Name,SE_export.URL,SE_export.ImageUrl,......,SE_export.ID))
print(SE_export_Tuple)
access_path = r"C:\Users\username\Documents\Python Scripts\Test\Test_db.accdb"
conn = pyodbc.connect("DRIVER={{Microsoft Access Driver (*.mdb, *.accdb)}};DBQ={};" \
    .format(access_path))
cursor = conn.cursor()
mySql_insert_query="INSERT INTO Temp_table (UnitName,URL,ImageUrl,.......,ID) VALUES (?,?,?,......,?)"
cursor.executemany(mySql_insert_query,SE_export_Tuple)
conn.commit()
conn.close()
However, when I add many fields I get an error at "executemany", saying:
cursor.executemany(mySql_insert_query,SE_export_Tuple)
Error: ('HY004', '[HY004] [Microsoft][ODBC Microsoft Access Driver]Invalid SQL data type (67) (SQLBindParameter)')
I want to insert multiple rows using connection.execute, and one of the columns must be set to the result of the database's CURRENT_TIMESTAMP function.
For example, given this table:
import sqlalchemy as sa
metadata = sa.MetaData()
foo = sa.Table('foo', metadata,
               sa.Column('id', sa.Integer, primary_key=True),
               sa.Column('ts', sa.TIMESTAMP))
# I'm using Sqlite for this example, but this question
# is database-agnostic.
engine = sa.create_engine('sqlite://', echo=True)
metadata.create_all(engine)
I can insert a single row like this:
conn = engine.connect()
with conn.begin():
    ins = foo.insert().values(ts=sa.func.current_timestamp())
    conn.execute(ins)
However when I try to insert multiple rows:
with conn.begin():
    ins = foo.insert()
    conn.execute(ins, [{'ts': sa.func.current_timestamp()}])
a TypeError is raised:
sqlalchemy.exc.StatementError: (builtins.TypeError) SQLite DateTime type only accepts Python datetime and date objects as input.
[SQL: INSERT INTO foo (ts) VALUES (?)]
[parameters: [{'ts': <sqlalchemy.sql.functions.current_timestamp at 0x7f3607e21070; current_timestamp>}]
Replacing the function with the string "CURRENT_TIMESTAMP" results in a similar error.
Is there a way to get the database to set the column to CURRENT_TIMESTAMP using connection.execute?
I'm aware that I can work around this by querying for the value of CURRENT_TIMESTAMP within the same transaction and using that value in the INSERT values, or by executing an UPDATE after the INSERT. I'm specifically asking whether this can be done in connection.execute's *multiparams argument.
It's a hack for sure, but this appears to work for SQLite at least:
from datetime import datetime
from pprint import pprint
import sqlalchemy as sa
engine = sa.create_engine("sqlite:///:memory:")
metadata = sa.MetaData()
foo = sa.Table(
    "foo",
    metadata,
    sa.Column("id", sa.Integer, primary_key=True, autoincrement=True),
    sa.Column("ts", sa.TIMESTAMP),
    sa.Column("txt", sa.String(50)),
)
foo.create(engine)
with engine.begin() as conn:
    ins_query = str(foo.insert().compile()).replace(
        " :ts, ", " CURRENT_TIMESTAMP, "
    )
    print(ins_query)
    # INSERT INTO foo (id, ts, txt) VALUES (:id, CURRENT_TIMESTAMP, :txt)
    data = [{"id": None, "txt": "Alfa"}, {"id": None, "txt": "Bravo"}]
    conn.execute(sa.text(ins_query), data)
    print(datetime.now())
    # 2021-03-06 17:41:35.743452
    # (local time here is UTC-07:00)
    results = conn.execute(sa.text("SELECT * FROM foo")).fetchall()
    pprint(results, width=60)
    """
    [(1, '2021-03-07 00:41:35', 'Alfa'),
     (2, '2021-03-07 00:41:35', 'Bravo')]
    """