Hello, I'm getting this error: near "join": syntax error. Is there an obvious issue here that I'm not picking up on? I've changed the names in the query, but I've already gone over it and checked for spelling errors.
import pandas as pd
import sqlite3 as sql
path1 = r'C:\file.xlsx'
path2 = r'C:\file2.xlsx'
tenants = pd.read_excel(path1, sheet_name='1')
buildings = pd.read_excel(path2)
db = sql.connect('temp.db')
tenants.to_sql('tenantsdb', db)
buildings.to_sql('buildingsdb', db)
Query = pd.read_sql("select t.*, b.distance from tenantsdb t where city = 'city' join buildingsdb b on t.Address = b.Street_Address;", db)
db.close()
In SQL, the order of clauses is SELECT, FROM, JOIN, WHERE. Your query has the WHERE clause before the JOIN, so the parser stops at "join". Move the join ahead of the WHERE clause:
Query = pd.read_sql("""
select t.*, b.distance
from tenantsdb t
join buildingsdb b on t.Address = b.Street_Address
where city = 'city';""", db)
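To see the clause order matter without the Excel files, here is a minimal sketch using an in-memory SQLite database. The two small tables and their rows are made up for illustration; only the table and column names come from the question.

```python
import sqlite3

# Hypothetical miniature versions of the two tables from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tenantsdb (name TEXT, city TEXT, Address TEXT);
    CREATE TABLE buildingsdb (Street_Address TEXT, distance REAL);
    INSERT INTO tenantsdb VALUES ('Ann', 'city', '1 Main St'),
                                 ('Bob', 'elsewhere', '2 Oak Ave');
    INSERT INTO buildingsdb VALUES ('1 Main St', 3.5), ('2 Oak Ave', 7.0);
""")

# WHERE before JOIN: rejected by the parser.
bad = """select t.*, b.distance from tenantsdb t where city = 'city'
         join buildingsdb b on t.Address = b.Street_Address"""

# JOIN before WHERE: accepted.
good = """select t.*, b.distance
          from tenantsdb t
          join buildingsdb b on t.Address = b.Street_Address
          where t.city = 'city'"""

try:
    conn.execute(bad)
    bad_error = None
except sqlite3.OperationalError as exc:
    bad_error = str(exc)

rows = conn.execute(good).fetchall()
conn.close()
```

Here bad_error holds the syntax error raised for the first query, and rows holds the single matching tenant joined to its building.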
I am fairly new to Flask and I am trying to create a Flask API for a project I am working on. However, I am facing a couple of issues.
For my first issue, I can't get the DataFrame from the first function to work in my second function. How can I make data_1 available in the second function?
Code:
from flask import Flask
from sqlalchemy import create_engine
import sqlite3 as sql
import pandas as pd
import datetime
import os
app = Flask(__name__)
@app.route('/', methods=['GET'])
def get_data():
    ...
    data_1 = ...
    #print(data_1.head(n=10))
    return "hello"
@app.route('/table1', methods=['GET'])
def store_table1_data_df():
    db_path = os.path.join(os.path.dirname(__file__),'table1.db')
    engine = create_engine('sqlite:///{}'.format(db_path), echo=True)
    sqlite_connection = engine.connect()
    sqlite_table = 'table1'
    data_1.to_sql(sqlite_table,sqlite_connection, if_exists='append')
    sqlite_connection.close()
    return "table1"
For my second issue, is there a better way of storing a dataframe within flask api using sqlalchemy or sqlite3?
More context as to what kind of data_1 is: data_1 can only hold the past 15 days/records like from 6/15/2021-6/30/2021. However, tomorrow, if I fetch the newest data_1 it will contain 6/16/2021-7/01/2021. How can I just append 07/01/2021 to the old data_1 without creating duplicate records from 06/16/2021, creating two more functions, and an extra db file?
@app.route('/table1', methods=['GET'])
def store_table1_data_df():
    db_path = os.path.join(os.path.dirname(__file__),'table1.db')
    engine = create_engine('sqlite:///{}'.format(db_path), echo=True)
    sqlite_connection = engine.connect()
    sqlite_table = 'table1'
    data_1.to_sql(sqlite_table,sqlite_connection, if_exists='append')
    sqlite_connection.close()
    return "table1"
@app.route('/table2', methods=['GET'])
def store_table2_data_df():
    db_path2 = os.path.join(os.path.dirname(__file__),'table2.db')
    engine2 = create_engine('sqlite:///{}'.format(db_path2), echo=True)
    sqlite_connection2 = engine2.connect()
    sqlite_table2 = 'table2'
    data_1.to_sql(sqlite_table2,sqlite_connection2, if_exists='append')
    sqlite_connection2.close()
    return "table2"
# What I probably have down below is not the correct way to solve this problem
@app.route('/table1', methods=['GET'])
conn = sql.connect("table1.db")
cur = conn.cursor()
#cur.execute
cur.execute("ATTACH 'table2.db' as 'table2' ")
conn.commit()
table_3 = pd.read_sql_query("SELECT DISTINCT date, value FROM table1 UNION SELECT DISTINCT date, value from table2 ORDER BY date", conn)
cur.execute("SELECT DISTINCT date, value FROM table1 UNION SELECT DISTINCT date, value from table2 ORDER BY date")
conn.commit()
results3 = cur.fetchall()
sqlite_table='table1'
table_3.to_sql(sqlite_table, conn, if_exists='replace')
cur.close()
conn.close()
return "work"
Any help is greatly appreciated.
For your first problem, you can do any of these:
1. If data_1 is small (under 200 KB), you can use Flask-Session to store the data and access it across routes.
2. Create a function that returns data_1 and call it in any route that needs it. Hint:
def getdata1(val1, val2):
    # calculation here
    return data_1
Just call this wherever you need data_1.
3. Store the DataFrame in a database and fetch it when needed.
For the second part, a simple for loop will work. Hint:
sql_table = ...  # fetch your SQL table here as a DataFrame, with the dates in one column
data_1 = ...  # your DataFrame
existing_dates = set(sql_table['dates'])
for i in data_1['Dates']:
    if i not in existing_dates:
        # insert this row into the SQL table
        pass
If your DataFrame and SQL table are both loaded in date order, even better: you only need to compare the last element of each.
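As an alternative to checking dates in Python, the duplicate check can be left to the database. The sketch below is not the loop from the hint: it swaps in a different technique, a PRIMARY KEY on the date column plus SQLite's INSERT OR IGNORE. It assumes one record per date and a table named table1 with date and value columns, as in the question; the helper name append_new_rows and the sample rows are hypothetical.

```python
import sqlite3

def append_new_rows(conn, rows):
    """Insert (date, value) rows, skipping dates that are already stored."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS table1 (date TEXT PRIMARY KEY, value REAL)"
    )
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR IGNORE INTO table1 (date, value) VALUES (?, ?)", rows
        )

conn = sqlite3.connect(":memory:")

# Two overlapping fetches (made-up data): the second repeats 2021-06-30.
day1 = [("2021-06-29", 1.0), ("2021-06-30", 2.0)]
day2 = [("2021-06-30", 2.0), ("2021-07-01", 3.0)]

append_new_rows(conn, day1)
append_new_rows(conn, day2)

stored = conn.execute("SELECT date, value FROM table1 ORDER BY date").fetchall()
conn.close()
```

After both calls, stored contains each date once. With a DataFrame, the rows could be produced with data_1.itertuples(index=False, name=None), assuming its columns are in (date, value) order. This keeps everything in one table and one db file, at the cost of requiring the date to be unique.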
I have the following Python code:
import pandas as pd
from sqlalchemy import create_engine
import mysql.connector
# Give the location of the file
loc = ("C:\\Users\\27826\\Desktop\\11Sixteen\\Models and Reports\\Historical results files\\EPL 1993-94.csv")
df = pd.read_csv(loc)
# Remove empty columns then rows
df = df.dropna(axis=1, how='all')
df = df.dropna(axis=0, how='all')
# Create DataFrame and then import to db (new game results table)
engine = create_engine("mysql://root:xxx@localhost/11sixteen")
df.to_sql('new_game_results', con=engine, if_exists="replace")
# Move from new games results table to game results table
db = mysql.connector.connect(host="localhost",
                             user="root",
                             passwd="xxx",
                             database="11sixteen")
my_cursor = db.cursor()
my_cursor.execute("INSERT INTO 11sixteen.game_results "
"SELECT * FROM 11sixteen.new_game_results WHERE "
"NOT EXISTS (SELECT date, HomeTeam "
"FROM 11sixteen.game_results WHERE "
"11sixteen.game_results.date = 11sixteen.new_game_results.date AND "
"11sixteen.game_results.HomeTeam = 11sixteen.new_game_results.HomeTeam)")
print("complete")
Basically, the objective is to copy data from several Excel files into a SQL table (one at a time) and then transfer it from there to the fuller table where ALL the data will be aggregated (hopefully without duplicates).
Everything works 100% except the SQL query as below:
INSERT INTO 11sixteen.game_results
SELECT * FROM 11sixteen.new_game_results
WHERE NOT EXISTS ( SELECT date, HomeTeam
FROM 11sixteen.game_results WHERE
11sixteen.game_results.date = 11sixteen.new_game_results.date AND
11sixteen.game_results.HomeTeam = 11sixteen.new_game_results.HomeTeam)
If I run the same query in MySQL Workbench, it works perfectly. Any ideas why I can't get Python to execute the query as expected?
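Add a commit at the end. MySQL Connector/Python does not autocommit by default, so with a transactional storage engine the INSERT is discarded unless you commit it: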
db.commit()
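The effect of a missing commit can be reproduced without a MySQL server. The following is a sketch using Python's built-in sqlite3 module, which also opens a transaction implicitly before an INSERT under its default settings; it is an analogy, not the MySQL driver itself. The table name is borrowed from the question, and the inserted row and the helper count_rows are made up.

```python
import os
import sqlite3
import tempfile

# A throwaway database file so that separate connections see the same data.
path = os.path.join(tempfile.mkdtemp(), "demo.db")

def count_rows():
    """Open a fresh connection and count the rows in game_results."""
    conn = sqlite3.connect(path)
    try:
        return conn.execute("SELECT COUNT(*) FROM game_results").fetchone()[0]
    finally:
        conn.close()

conn = sqlite3.connect(path)
conn.execute("CREATE TABLE game_results (date TEXT, HomeTeam TEXT)")
conn.commit()

# Insert, then close without committing: the pending insert is rolled back.
conn.execute("INSERT INTO game_results VALUES ('1993-08-14', 'Team A')")
conn.close()
without_commit = count_rows()

# Same insert, this time committed before closing.
conn = sqlite3.connect(path)
conn.execute("INSERT INTO game_results VALUES ('1993-08-14', 'Team A')")
conn.commit()
conn.close()
with_commit = count_rows()
```

Only the committed insert survives: without_commit is 0 and with_commit is 1.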
When I tried to join two tables I got the following error:
sqlalchemy.exc.ObjectNotExecutableError: Not an executable object: sqlalchemy.sql.selectable.Join at 0x7f31a35b02e8; Join object on
chanel(139851192912136) and Device(139851192912864)
My code is:
import sqlalchemy as db
from sqlalchemy import and_,or_,not_,inspect,text,inspection
engine = db.create_engine("mssql+pymssql://sa:elnetsrv@192.108.55.95/ELNetDB")
metadata = db.MetaData()
Data1 = db.Table("chanel", metadata, autoload=True, autoload_with=engine)
Data2 = db.Table("Device", metadata, autoload=True, autoload_with=engine)
j = Data1.join(Data2,Data1.columns.No == Data2.columns.ID)
print(engine.execute(j))
Data1.join(Data2, Data1.columns.No == Data2.columns.ID) is not executable because it is a Join object (a FROM clause), not a query object.
You can wrap it in a select instead (assuming you want to select every column from Data1):
print(engine.execute(db.select([Data1]).select_from(j)))
See https://docs.sqlalchemy.org/en/13/core/metadata.html#sqlalchemy.schema.Table.join for reference.
I know I can do it manually using sqlalchemy and pandas
dbschema ='myschema'
engine = create_engine('postgresql://XX:YY@localhost:5432/DB',
                       connect_args={'options': '-csearch_path={}'.format(dbschema )})
df = psql.read_sql('Select * from myschema."df"', con = engine)
But is it possible to loop over all the tables and load each one?
I tried something like
tables = engine.table_names()
print(tables)
['A', 'B']
for table in tables :
    table = psql.read_sql('Select * from myschema."%(name)s"', con = engine, params={'name' : table})
I get this message:
LINE 1: Select * from myschema.'A'
I guess the problem is caused by my quotes but I am not so sure.
EDIT :
So I tried the example here : Passing table name as a parameter in psycopg2
import psycopg2
from psycopg2 import sql
try:
    conn = psycopg2.connect("dbname='DB' user='XX' host='localhost' password='YY'")
except:
    print ('I am unable to connect to the database')
print(conn)
cur = conn.cursor()
for table in tables :
    table = cur.execute(sql.SQL("Select * from myschema.{}").format(sql.Identifier(table)))
But my tables are 'None' so I am doing something wrong but I can't see what.
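A likely explanation, offered as a sketch rather than a confirmed diagnosis: query parameters bind values, not identifiers, which is why the params version rendered the table name as the string literal 'A'. Separately, in psycopg2 cur.execute(...) returns None, and the rows have to be read afterwards with cur.fetchall(). The example below shows the same loop with Python's built-in sqlite3 so that it runs without a server. The tables A and B mirror the names printed in the question, their contents are made up, and quote_ident is a hypothetical stand-in for psycopg2's sql.Identifier.

```python
import sqlite3

def quote_ident(name):
    """Double-quote an identifier (hypothetical stand-in for sql.Identifier)."""
    return '"' + name.replace('"', '""') + '"'

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE "A" (x INTEGER);
    CREATE TABLE "B" (y TEXT);
    INSERT INTO "A" VALUES (1), (2);
    INSERT INTO "B" VALUES ('hello');
""")

# Take the table names from the catalog rather than from user input.
names = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]

tables = {}
for name in names:
    cur = conn.cursor()
    # The table name is quoted into the SQL text, not passed as a parameter.
    cur.execute("SELECT * FROM " + quote_ident(name))
    # The rows come from fetchall(), not from the return value of execute().
    tables[name] = cur.fetchall()
    cur.close()
conn.close()
```

Storing the results in a dictionary keyed by table name also avoids overwriting the loop variable on every iteration, which the original loop does.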
In trying to replicate a MySQL query in SQL Alchemy, I've hit a snag in specifying which tables to select from.
The query that works is
SELECT c.*
FROM attacks AS a INNER JOIN hosts h ON a.host_id = h.id
INNER JOIN cities c ON h.city_id = c.id
GROUP BY c.id;
I try to accomplish this in SQLAlchemy using the following function
def all_cities():
    session = connection.globe.get_session()
    destination_city = aliased(City, name='destination_city')
    query = session.query(City). \
        select_from(Attack).\
        join((Host, Attack.host_id == Host.id)).\
        join((destination_city, Host.city_id == destination_city.id)).\
        group_by(destination_city.id)
    print query
    results = [result.serialize() for result in query]
    session.close()
    file(os.path.join(os.path.dirname(__file__), "servers.geojson"), 'a').write(geojson.feature_collection(results))
When printing the query, I end up with ALMOST the right query
SELECT
cities.id AS cities_id,
cities.country_id AS cities_country_id,
cities.province AS cities_province,
cities.latitude AS cities_latitude,
cities.longitude AS cities_longitude,
cities.name AS cities_name
FROM cities, attacks
INNER JOIN hosts ON attacks.host_id = hosts.id
INNER JOIN cities AS destination_city ON hosts.city_id = destination_city.id
GROUP BY destination_city.id
However, you will note that it is selecting from cities, attacks...
How can I get it to select only from the attacks table?
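The line here:
query = session.query(City)
selects from the unaliased City entity, while your join targets the alias destination_city. The cities table is therefore added to the FROM clause on its own, which is why you get
FROM cities, attacks
Querying the alias instead, session.query(destination_city), should leave only attacks and the joined tables in the FROM clause.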
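To illustrate why the extra table in FROM matters, here is a small sketch in plain SQL run through Python's built-in sqlite3 (not SQLAlchemy). The table and column names follow the question, the rows are made up, and GROUP BY is left out so that the row counts are visible. The unjoined cities table multiplies every joined row by the number of cities.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cities (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE hosts (id INTEGER PRIMARY KEY, city_id INTEGER);
    CREATE TABLE attacks (id INTEGER PRIMARY KEY, host_id INTEGER);
    INSERT INTO cities VALUES (1, 'c1'), (2, 'c2'), (3, 'c3');
    INSERT INTO hosts VALUES (1, 1), (2, 2);
    INSERT INTO attacks VALUES (1, 1), (2, 1), (3, 2);
""")

# Shape of the generated query: cities appears unjoined in FROM.
generated = """
    SELECT COUNT(*)
    FROM cities, attacks
    INNER JOIN hosts ON attacks.host_id = hosts.id
    INNER JOIN cities AS destination_city ON hosts.city_id = destination_city.id
"""

# Shape of the intended query: only attacks and its joins.
intended = """
    SELECT COUNT(*)
    FROM attacks
    INNER JOIN hosts ON attacks.host_id = hosts.id
    INNER JOIN cities AS destination_city ON hosts.city_id = destination_city.id
"""

generated_rows = conn.execute(generated).fetchone()[0]
intended_rows = conn.execute(intended).fetchone()[0]
conn.close()
```

With 3 cities and 3 attacks, the generated form yields 9 rows (a cross product), while the intended form yields 3.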