My query is:
engine = create_engine("postgres://")
conn = engine.connect()
conn.autocommit = True
In a Flask route I am using this query:
result = conn.execute("""UPDATE business_portal SET business_name ="""+str(business_name)+""", name_tag ="""+str(business_tag)+""",name_atr = """+str(business_attr)+""", address =""" +str(address)+""",address_tag =""" +str(address_tag)+""", address_atr = """+str(address_attr)+""", city = """+str(city)+""", city_tag ="""+str(city_tag)+""", city_atr =""" +str(city_attr)+""", state = """+str(state)+""", state_tag = """+str(state_tag)+""",state_atr = """+str(state_attr)+""",zip_code = """+str(zipcode)+""",zip_tag ="""+str(zip_tag)+""",zip_atr ="""+str(zip_attr)+""",contact_number ="""+str(contact_number)+""",num_tag = """+str(contact_tag)+""", num_atr ="""+str(contact_attr)+""",domain ="""+str(domain)+""", search_url = """+str(search_url)+""",category =""" +str(category)+""", logo_path =""" +str(logo_path)+""" WHERE id=%s """,(id))
The above query accepts data without spaces (e.g. abcd), but when the values contain spaces (e.g. abcd efgh ijkl) it raises a syntax error.
Can anyone help me?
The values in the SET clause need to be quoted in the same way as the values in the WHERE clause.
>>> cur = conn.cursor()
>>> stmt = "UPDATE tbl SET col = %s WHERE id = %s"
>>>
>>> # Observe that the SET value is three separate characters
>>> cur.mogrify(stmt % ('a b c', 37))
b'UPDATE tbl SET col = a b c WHERE id = 37'
>>>
>>> # Observe that the SET value is a single, quoted value
>>> cur.mogrify(stmt, ('a b c', 37))
b"UPDATE tbl SET col = 'a b c' WHERE id = 37"
NB: cursor.mogrify is a psycopg2 method that returns the query string that would be sent to the server by cursor.execute; it doesn't actually execute the query.
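Applied to the question, every value, not just the id, should be passed as a parameter so the driver quotes it. A minimal sketch showing only a few of the columns (the remaining columns follow the same pattern):
result = conn.execute(
    """UPDATE business_portal
       SET business_name = %s, name_tag = %s, address = %s
       WHERE id = %s""",
    (business_name, business_tag, address, id),
)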
I have a list of dictionaries where each dictionary has the following structure:
{
    "subject": "subjectValue",
    "object": "objectValue",
    "prediction": "predictionValue"
}
Currently I have three methods to filter all dictionaries in the list on each of the fields, like this:
def getSubjects(value, objectValue, predictionValue):
    return [stmt["subject"] for stmt in value if stmt["object"] == objectValue and stmt["prediction"] == predictionValue]
def getObjects(value, subjectValue, predictionValue):
    return [stmt["object"] for stmt in value if stmt["subject"] == subjectValue and stmt["prediction"] == predictionValue]
def getPredictions(value, objectValue, subjectValue):
    return [stmt["prediction"] for stmt in value if stmt["object"] == objectValue and stmt["subject"] == subjectValue]
I also have the following three methods to get just one of the dictionaries out of the list:
def getSubject(value, objectValue, predictionValue):
    return next(stmt["subject"] for stmt in value if stmt["object"] == objectValue and stmt["prediction"] == predictionValue)
def getObject(value, subjectValue, predictionValue):
    return next(stmt["object"] for stmt in value if stmt["subject"] == subjectValue and stmt["prediction"] == predictionValue)
def getPrediction(value, objectValue, subjectValue):
    return next(stmt["prediction"] for stmt in value if stmt["object"] == objectValue and stmt["subject"] == subjectValue)
Is there a better way to achieve this? Maybe in one single method, or in a more Pythonic way?
You can get rid of the next() functions by always returning an iterator instead of a list. Then you can either call next() or list() on the returned value. To generalize, you could require that the function is called with keyword arguments representing the keys you want to match along with the key you want to return. The signature would look like:
def pluckFilter(ds, key, **filters):
And you could call it like:
pluckFilter(listOfDicts, 'object', subject = 'subjectValue2', prediction = 'predictionValue' )
This would allow a pretty simple implementation (although you might want to validate the input):
listOfDicts = [
    {
        "subject": "subjectValue",
        "object": "objectValue",
        "prediction": "predictionValue"
    },
    {
        "subject": "subjectValue2",
        "object": "objectValue",
        "prediction": "predictionValue"
    }
]
def pluckFilter(ds, key, **filters):
    filtered = filter(lambda d: all(d[filt] == filters[filt] for filt in filters), ds)
    return map(lambda d: d[key], filtered)
list(pluckFilter(listOfDicts, 'subject', object = 'objectValue', prediction = 'predictionValue' ))
# ['subjectValue', 'subjectValue2']
next(pluckFilter(listOfDicts, 'subject', object = 'objectValue', prediction = 'predictionValue' ))
# 'subjectValue'
list(pluckFilter(listOfDicts, 'object', subject = 'subjectValue2', prediction = 'predictionValue' ))
# ['objectValue']
I'd recommend using a SQLite3 database or a pandas DataFrame.
import sqlite3
conn = sqlite3.connect("database.db")
c = conn.cursor()
c.execute("CREATE TABLE IF NOT EXISTS my_table(subject REAL, object REAL, prediction REAL)")
c.execute("INSERT INTO my_table(subject, object, prediction) VALUES (?, ?, ?)",
          (10, 20, 30))
conn.commit()  # persist the insert
c.execute("SELECT * FROM my_table")
data = c.fetchall()
for row in data:
    print(row)
c.close()
conn.close()
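With the rows in SQLite, the filter functions from the question become parameterized SELECTs. A hedged sketch reusing my_table and the cursor from above (run it before the cursor is closed; the column values are placeholders):
c.execute("SELECT subject FROM my_table WHERE object = ? AND prediction = ?",
          ("objectValue", "predictionValue"))
subjects = [row[0] for row in c.fetchall()]   # analogue of getSubjects
first = subjects[0] if subjects else None     # analogue of getSubject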
Why not use pandas?
You can convert the list of dictionaries into a pandas DataFrame and then use its filtering to get the same values as the functions you have:
import pandas as pd
df = pd.DataFrame(list_of_dict)
e.g. an analogue of getSubject:
subjects = df.loc[(df["object"] == 'objectValue') & (df["prediction"] == 'predictionValue'), 'subject']
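To mirror the single-value getSubject, take the first element of the filtered Series (a sketch; like next(), this fails when there is no match, here with an IndexError):
subject = subjects.iloc[0]        # analogue of getSubject
all_subjects = subjects.tolist()  # analogue of getSubjects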
The database (PostgreSQL) is already created.
There is a list:
data = [
    ['param_1', 0],
    ['amp', 0],
    ['voltage', 1],
    ['params', 1],
    ['antenna', 1],
    ['freq', 0.00011000000085914508]
]
I tried this:
import psycopg2
from psycopg2 import sql
with psycopg2.connect(dbname='db_name', user='postgres',
                      password='123', host='localhost') as conn:
    conn.autocommit = True
    with conn.cursor() as cur:
        query = "INSERT INTO table_name (%s) VALUES (%s);"
        cur.executemany(query, data)
I need to insert values into a table whose fields are named 'param_1', 'paRam_2', etc.
How do I generate the query string?
I will be happy for any help; thanks in advance.
Parameter substitution can only be used to pass column values, not column names, so we'll need to build a list of column names to insert into the SQL command text. Specifically, we'll need to:
- build a comma-separated string of column names
- build a comma-separated string of parameter placeholders
- create the INSERT command, including the two items above
- create a tuple of (numeric) parameter values
- execute the command
That would look something like this:
# create environment
# ~~~~~~~~~~~~~~~~~~
data = (
    ['param_1', 0],
    ['amp', 0],
    ['voltage', 1],
    ['params', 1],
    ['antenna', 1],
    ['freq', 0.00011000000085914508]
)
# example code
# ~~~~~~~~~~~~
columns = ','.join([f'"{x[0]}"' for x in data])
print(columns)
# "param_1","amp","voltage","params","antenna","freq"
param_placeholders = ','.join(['%s' for x in range(len(data))])
print(param_placeholders)
# %s,%s,%s,%s,%s,%s
sql = f"INSERT INTO table_name ({columns}) VALUES ({param_placeholders})"
print(sql)
# INSERT INTO table_name ("param_1","amp","voltage","params","antenna","freq") VALUES (%s,%s,%s,%s,%s,%s)
param_values = tuple(x[1] for x in data)
print(param_values)
# (0, 0, 1, 1, 1, 0.00011000000085914508)
cur.execute(sql, param_values)
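As an aside, if you'd rather not quote the column names yourself, the same INSERT can be composed with psycopg2's sql module, which handles identifier quoting for you. A sketch using the same data:
from psycopg2 import sql
insert = sql.SQL("INSERT INTO {table} ({cols}) VALUES ({vals})").format(
    table=sql.Identifier('table_name'),
    cols=sql.SQL(',').join(sql.Identifier(x[0]) for x in data),
    vals=sql.SQL(',').join(sql.Placeholder() for _ in data),
)
cur.execute(insert, param_values)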
Thanks @GordThompson for the response. It's been very helpful, and based on it I have created two functions:
- one for reading a row of values from the table
- a second to insert values into the table
I would also note that psycopg2 does not support dynamic substitution of the table name, which is why the table name is interpolated directly when generating the query string.
import psycopg2

def insert_to(self, table_name: str, data: dict):
    # build the query string
    columns = ','.join([f'"{x}"' for x in data])
    param_placeholders = ','.join(['%s' for x in range(len(data))])
    query = f'INSERT INTO "{table_name}" ({columns}) VALUES ({param_placeholders})'
    param_values = tuple(x for x in data.values())
    try:
        self.cur.execute(query, param_values)
    except Exception as e:
        log.exception(f'\r\nException: {e}')
    else:
        log.warning(f'INSERT INTO "{table_name}" {data}')

def read_from(self, table_name: str, dict_for_get_pk: dict) -> any:
    # build the query string
    columns = ','.join([f'"{x}"' for x in dict_for_get_pk])
    param_placeholders = ','.join(['%s' for x in range(len(dict_for_get_pk))])
    query = f'SELECT * FROM "{table_name}" WHERE ({columns}) = ({param_placeholders})'
    param_values = tuple(x for x in dict_for_get_pk.values())
    try:
        self.cur.execute(query, param_values)
    except Exception as e:
        log.exception(f'\r\nException: {e}')
    else:
        db_values = self.cur.fetchall()
        if db_values:
            return db_values[0]
        else:
            return None
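A hedged usage sketch, assuming the two functions are methods of a hypothetical wrapper class that holds an open cursor in self.cur and a configured log:
db = DbHandler()  # hypothetical class exposing insert_to / read_from
db.insert_to('table_name', {'param_1': 0, 'amp': 0, 'freq': 0.00011000000085914508})
row = db.read_from('table_name', {'param_1': 0})  # first matching row, or None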
I have to construct a dynamic UPDATE query for PostgreSQL.
It's dynamic because I have to determine beforehand which columns to update.
Given a sample table:
create table foo (id int, a int, b int, c int)
Then I programmatically construct the SET clause:
_set = {}
_set['a'] = 10
_set['c'] = None
After that I have to build the update query. And here I'm stuck.
I have to construct this SQL UPDATE command:
update foo set a = 10, c = NULL where id = 1
How can I do this with a psycopg2 parametrized command (i.e. loop through the dict if it is not empty and build the SET clause)?
UPDATE
While I was sleeping I found the solution myself. It is dynamic, exactly how I wanted it to be :-)
create table foo (id integer, a integer, b integer, c varchar)
updates = {}
updates['a'] = 10
updates['b'] = None
updates['c'] = 'blah blah blah'
sql = "update foo set %s where id = %s" % (', '.join("%s = %%s" % u for u in updates.keys()), 10)
params = list(updates.values())
print(cur.mogrify(sql, params))
cur.execute(sql, params)
And the result is exactly what I needed (note the NULL and the quoted column values):
"upgrade foo set a = 10, c = 'blah blah blah', b = NULL where id = 10"
There is actually a slightly cleaner way to make it, using the alternative column-list syntax:
sql_template = "UPDATE foo SET ({}) = %s WHERE id = {}"
sql = sql_template.format(', '.join(updates.keys()), 10)
params = (tuple(updates.values()),)
print(cur.mogrify(sql, params))
cur.execute(sql, params)
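This works because psycopg2 adapts a Python tuple to a parenthesized SQL row value, which is exactly what the multi-column SET (col1, col2, ...) = (val1, val2, ...) syntax expects. The mogrify call should print something like:
UPDATE foo SET (a, b, c) = (10, NULL, 'blah blah blah') WHERE id = 10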
Using psycopg2.sql – SQL string composition module
The module contains objects and functions useful to generate SQL dynamically, in a convenient and safe way.
from psycopg2 import connect, sql
conn = connect("dbname=test user=postgres")
upd = {'name': 'Peter', 'age': 35, 'city': 'London'}
ref_id = 12
sql_query = sql.SQL("UPDATE people SET {data} WHERE id = {id}").format(
    data=sql.SQL(', ').join(
        sql.Composed([sql.Identifier(k), sql.SQL(" = "), sql.Placeholder(k)]) for k in upd.keys()
    ),
    id=sql.Placeholder('id')
)
upd.update(id=ref_id)
with conn:
    with conn.cursor() as cur:
        cur.execute(sql_query, upd)
conn.close()
Running print(sql_query.as_string(conn)) before closing connection will reveal this output:
UPDATE people SET "name" = %(name)s, "age" = %(age)s, "city" = %(city)s WHERE id = %(id)s
No need for dynamic SQL, supposing a is not nullable and b is nullable. (PostgreSQL arrays are 1-based, so the (array[b, %(b)s])[%(b_update)s + 1] trick below keeps the old value of b when b_update = 0 and takes the new value when b_update = 1.)
If you want to update both a and b:
_set = dict(
    id = 1,
    a = 10,
    b = 20, b_update = 1
)
update = """
    update foo
    set
        a = coalesce(%(a)s, a), -- a is not nullable
        b = (array[b, %(b)s])[%(b_update)s + 1] -- b is nullable
    where id = %(id)s
"""
print(cur.mogrify(update, _set))
cur.execute(update, _set)
Output:
update foo
set
a = coalesce(10, a), -- a is not nullable
b = (array[b, 20])[1 + 1] -- b is nullable
where id = 1
If you want to update neither column:
_set = dict(
    id = 1,
    a = None,
    b = 20, b_update = 0
)
Output:
update foo
set
a = coalesce(NULL, a), -- a is not nullable
b = (array[b, 20])[0 + 1] -- b is nullable
where id = 1
An option without Python string formatting, using psycopg2's AsIs function for the column names (although that doesn't protect you from SQL injection via the column names). The dict is named data:
from psycopg2.extensions import AsIs
update_statement = 'UPDATE foo SET (%s) = %s WHERE id_column = %s'
columns = data.keys()
values = [data[column] for column in columns]
query = cur.mogrify(update_statement, (AsIs(','.join(columns)), tuple(values), id_value))
cur.execute(query)
Here's my solution that I have within a generic DatabaseHandler class that provides a lot of flexibility when using pd.DataFrame as your source.
def update_data(
    self,
    table: str,
    df: pd.DataFrame,
    indexes: Optional[list] = None,
    column_map: Optional[dict] = None,
    commit: Optional[bool] = False,
) -> int:
    """Update data in the media database

    Args:
        table (str): the "tablename" or "namespace.tablename"
        df (pandas.DataFrame): dataframe containing the data to update
        indexes (list): the list of columns in the table that will be in the WHERE clause of the update statement.
            If not provided, will use df indexes.
        column_map (dict): dictionary mapping the columns in df to the columns in the table.
            Columns in the column_map that are also in indexes will not be updated.
            Key = df column. Value = table column.
        commit (bool): if True, the transaction will be committed (default=False)

    Notes:
        If using a column_map, only the columns in the column_map will be updated or used as indexes.
        Order does not matter. If not using a column_map, all columns in df must exist in table.

    Returns:
        int: rows updated
    """
    try:
        if not indexes:
            # Use the dataframe index instead
            indexes = []
            for c in df.index.names:
                if not c:
                    raise Exception(
                        "Dataframe contains indexes without names. Unable to determine update where clause."
                    )
                indexes.append(c)
        update_strings = []
        tdf = df.reset_index()
        if column_map:
            target_columns = [c for c in column_map.keys() if c not in indexes]
        else:
            column_map = {c: c for c in tdf.columns}
            target_columns = [c for c in df.columns if c not in indexes]
        for i, r in tdf.iterrows():
            upd_params = ", ".join(
                [f"{column_map[c]} = %s" for c in target_columns]
            )
            upd_list = [r[c] if pd.notna(r[c]) else None for c in target_columns]
            upd_str = self._cur.mogrify(upd_params, upd_list).decode("utf-8")
            idx_params = " AND ".join([f"{column_map[c]} = %s" for c in indexes])
            idx_list = [r[c] if pd.notna(r[c]) else None for c in indexes]
            idx_str = self._cur.mogrify(idx_params, idx_list).decode("utf-8")
            update_strings.append(f"UPDATE {table} SET {upd_str} WHERE {idx_str};")
        full_update_string = "\n".join(update_strings)
        print(full_update_string)  # Debugging
        self._cur.execute(full_update_string)
        rowcount = self._cur.rowcount
        if commit:
            self.commit()
        return rowcount
    except Exception as e:
        self.rollback()
        raise e
Example usages:
>>> df = pd.DataFrame([
...     {'a':1,'b':'asdf','c':datetime.datetime.now()},
...     {'a':2,'b':'jklm','c':datetime.datetime.now()}
... ])
>>> cls.update_data('my_table', df, indexes = ['a'])
UPDATE my_table SET b = 'asdf', c = '2023-01-17T22:13:37.095245'::timestamp WHERE a = 1;
UPDATE my_table SET b = 'jklm', c = '2023-01-17T22:13:37.095250'::timestamp WHERE a = 2;
>>> cls.update_data('my_table', df, indexes = ['a','b'])
UPDATE my_table SET c = '2023-01-17T22:13:37.095245'::timestamp WHERE a = 1 AND b = 'asdf';
UPDATE my_table SET c = '2023-01-17T22:13:37.095250'::timestamp WHERE a = 2 AND b = 'jklm';
>>> cls.update_data('my_table', df.set_index('a'), column_map={'a':'db_a','b':'db_b','c':'db_c'} )
UPDATE my_table SET db_b = 'asdf', db_c = '2023-01-17T22:13:37.095245'::timestamp WHERE db_a = 1;
UPDATE my_table SET db_b = 'jklm', db_c = '2023-01-17T22:13:37.095250'::timestamp WHERE db_a = 2;
Note however that this is not safe from SQL injection due to the way it generates the where clause.
I have a database table as follows. The data is in the form of a tree:
CREATE TABLE IF NOT EXISTS DOMAIN_HIERARCHY (
COMPONENT_ID INT NOT NULL ,
LEVEL INT NOT NULL ,
COMPONENT_NAME VARCHAR(127) NOT NULL ,
PARENT INT NOT NULL ,
PRIMARY KEY ( COMPONENT_ID )
);
The following data is in the table
(1,1,'A',0)
(2,2,'AA',1)
(3,2,'AB',1)
(4,3,'AAA',2)
(5,3,'AAB',2)
(6,3,'ABA',3)
(7,3,'ABB',3)
I have to retrieve the data and store it in a Python dictionary.
I wrote the code below:
import sqlite3

conx = sqlite3.connect('nameofdatabase.db')
curs = conx.cursor()
curs.execute('SELECT COMPONENT_ID, LEVEL, COMPONENT_NAME, PARENT FROM DOMAIN_HIERARCHY')
rows = curs.fetchall()
cmap = {}
for row in rows:
    cmap[row[0]] = row[2]
maxl = max(row[1] for row in rows)  # deepest LEVEL present
hrcy = {}
for level in range(1, maxl + 1):
    for row in rows:
        if row[1] == level:
            if hrcy == {}:
                hrcy[row[2]] = []
                continue
            parent = cmap[row[3]]
            hrcy[parent].append({row[2]: []})
The problem I'm facing is that nodes deeper than the second level get added to the root instead of to their parent; what should I change in the code?
The problem is that you can't directly reach the second-level nodes after you insert them. Try this:
import sqlite3

conx = sqlite3.connect('nameofdatabase.db')
curs = conx.cursor()
curs.execute('SELECT COMPONENT_ID, LEVEL, COMPONENT_NAME, PARENT ' +
             'FROM DOMAIN_HIERARCHY')
rows = curs.fetchall()
cmap = {}
hrcy = None
for row in rows:
    entry = (row[2], {})
    cmap[row[0]] = entry
    if row[1] == 1:
        hrcy = {entry[0]: entry[1]}
# raise if hrcy is None
for row in rows:
    item = cmap[row[0]]
    parent = cmap.get(row[3], None)
    if parent is not None:
        parent[1][row[2]] = item[1]
print(hrcy)
By keeping each component's map of subcomponents in cmap, I can always reach each parent's map to add the next component to it. I tried it with the following test data:
rows = [(1,1,'A',0),
        (2,2,'AA',1),
        (3,2,'AB',1),
        (4,3,'AAA',2),
        (5,3,'AAB',2),
        (6,3,'ABA',3),
        (7,3,'ABB',3)]
The output was this:
{'A': {'AA': {'AAA': {}, 'AAB': {}}, 'AB': {'ABA': {}, 'ABB': {}}}}
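Since every level of the result is just a dict mapping a name to its children, walking it is a short recursion. A sketch that prints the tree depth-first, using the hrcy built above:
def walk(tree, depth=0):
    # print each component indented under its parent
    for name, children in tree.items():
        print('  ' * depth + name)
        walk(children, depth + 1)

walk(hrcy)
# A
#   AA
#     AAA
#     AAB
#   AB
#     ABA
#     ABB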