I have a problem inserting data into a nested column.
I use map_imperatively.
Columns of other types are filled; only the nested column stays empty.
My code:
import attr
from sqlalchemy import (
    create_engine, Column, MetaData, insert
)
from sqlalchemy.orm import registry
from clickhouse_sqlalchemy import (
    Table, make_session, types, engines,
)

uri = 'clickhouse+native://localhost/default'
engine = create_engine(uri)
session = make_session(engine)
metadata = MetaData(bind=engine)
mapper = registry()


@attr.dataclass
class NestedAttr:
    key1: int
    key2: int
    key3: int


@attr.dataclass
class NestedInObject:
    id: int
    name: str
    nested_attr: NestedAttr
nested_test = Table(
    'nested_test', metadata,
    Column(name='id', type_=types.Int8, primary_key=True),
    Column(name='name', type_=types.String),
    Column(
        name='nested_attr',
        type_=types.Nested(
            Column(name='key1', type_=types.Int8),
            Column(name='key2', type_=types.Int8),
            Column(name='key3', type_=types.Int8),
        )
    ),
    engines.Memory()
)

mapper.map_imperatively(
    NestedInObject,
    nested_test
)

nested_test.create()

values = [
    {
        'id': 1,
        'name': 'name',
        'nested_attr.key1': [1, 2],
        'nested_attr.key2': [1, 2],
        'nested_attr.key3': [1, 2],
    }
]

session.execute(insert(NestedInObject), values)
I don't get an error, but the nested columns stay empty. I have tried different data and checked the column types in the database, and I still don't understand why the nested columns are left empty.
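One thing I am considering (an assumption on my part, not a confirmed fix): the ORM-level insert(NestedInObject) may not pass through the dotted 'nested_attr.*' keys, since they are not mapped attributes, so a Core insert against the Table object could behave differently:

session.execute(nested_test.insert(), values)
session.commit()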
The following code results in None across the entire row in every attempt. The query.values() call below is just shortened to keep things less complicated. Additionally, I have problems inserting a dict as JSON into the address fields, but that's another question.
CREATE TABLE public.customers (
    id SERIAL,
    email character varying(255) NULL,
    name character varying(255) NULL,
    phone character varying(16) NULL,
    address jsonb NULL,
    shipping jsonb NULL,
    currency character varying(3) NULL,
    metadata jsonb[] NULL,
    created bigint NULL,
    uuid uuid DEFAULT uuid_generate_v4() NOT NULL,
    PRIMARY KEY (uuid)
);
from sqlalchemy import *
from sqlalchemy.orm import Session

# Create engine, metadata, & session
engine = create_engine('postgresql://postgres:password@database/db', future=True)
metadata = MetaData(bind=engine)
session = Session(engine)

# Create Table
customers = Table('customers', metadata, autoload_with=engine)

query = customers.insert()
query.values(email="test@test.com",
             name="testy testarosa",
             phone="+12125551212",
             address='{"city": "Cities", "street": "123 Main St", '
                     '"state": "CA", "zip": "10001"}')
session.execute(query)
session.commit()
session.close()

# Now to see results
stmt = text("SELECT * FROM customers")
response = session.execute(stmt)
for result in response:
    print(result)
# Results in None in the fields I explicitly attempted
(1, None, None, None, None, None, None, None, 1, None, None, None, None, UUID('9112a420-aa36-4498-bb56-d4129682681c'))
Calling query.values() returns a new insert instance, rather than modifying the existing instance in-place. This return value must be assigned to a variable otherwise it will have no effect.
You could build the insert iteratively
query = customers.insert()
query = query.values(...)
session.execute(query)
or chain the calls as Karolus K. suggests in their answer.
query = customers.insert().values(...)
Regarding the address column, you are inserting a dict already serialised as JSON. This value gets serialised again during insertion, so the value in the database ends up looking like this:
test# select address from customers;
address
══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════
"{\"city\": \"Cities\", \"street\": \"123 Main St\", \"state\": \"CA\", \"zip\": \"10001\"}"
(1 row)
and is not amenable to being queried as a JSON object (because it's a JSONified string)
test# select address->'state' AS state from customers;
state
═══════
¤
(1 row)
You might find it better to pass the raw dict instead, resulting in this value being stored in the database:
test# select address from customers;
address
════════════════════════════════════════════════════════════════════════════
{"zip": "10001", "city": "Cities", "state": "CA", "street": "123 Main St"}
(1 row)
which is amenable to being queried as a JSON object:
test# select address->'state' AS state from customers;
state
═══════
"CA"
(1 row)
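For illustration, a minimal sketch of passing the raw dict, reusing the names from the question (the jsonb column then serialises it exactly once):

address_dict = {"city": "Cities", "street": "123 Main St",
                "state": "CA", "zip": "10001"}

query = customers.insert().values(
    email="test@test.com",
    name="testy testarosa",
    phone="+12125551212",
    address=address_dict,  # raw dict, not a pre-serialised JSON string
)
session.execute(query)
session.commit()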
I am not sure what you mean by
The query.values() call below is just shortened to keep things less complicated.
so maybe I am not understanding the issue properly.
In any case, the problem here is that you execute the insert() and the values() separately, while they are meant to be chained.
Doing something like:
query = customers.insert().values(email="test@test.com", name="testy testarosa", phone="+12125551212", address='{"city": "Cities", "street": "123 Main St", "state": "CA", "zip": "10001"}')
will work.
Documentation: https://docs.sqlalchemy.org/en/14/core/selectable.html#sqlalchemy.sql.expression.TableClause.insert
PS: I did not face any issues with the JSON field either. Perhaps it is something to do with the PG version?
I need to write automated Python code to create a database table whose column names are the keys from a JSON file and whose column data are the values of those respective keys.
My JSON looks like this:
{
    "Table_1": [
        {
            "Name": "B"
        },
        {
            "BGE3": [
                "Itm2",
                "Itm1",
                "Glass"
            ]
        },
        {
            "Trans": []
        },
        {
            "Art": [
                "SYS"
            ]
        }
    ]
}
My table name should be Table_1, so my columns should look like: Name | BGE3 | Trans | Art, and the data should be their respective values.
Creation of the table and columns has to be dynamic because I need to run this code on multiple JSON files.
So far I have managed to connect to the PostgreSQL database using Python.
Please help me with a solution. Thank you.
Postgres version 13.
Existing code:
cur.execute("CREATE TABLE Table_1(Name varchar, BGE3 varchar, Trans varchar, Art varchar)")
for d in data: cur.execute("INSERT into B_Json_3(Name, BGE3, Trans , Art) VALUES (%s, %s, %s, %s,)", d)
Here data is a list of arrays I made, which only works for this JSON. I need a function that will work for any JSON I give it, where the value of any key can be a list of up to 100 elements.
The table creation portion, using the Python json module to convert the JSON into a Python dict and the psycopg2.sql module to dynamically build the CREATE TABLE:
import json
import psycopg2
from psycopg2 import sql
tbl_json = """{
"Table_1": [
{
"Name": "B"
},
{
"BGE3": [
"Itm2",
"Itm1",
"Glass"
]
},
{
"Trans": []
},
{
"Art": [
"SYS"
]
}]}
"""
# Transform JSON string into Python dict. Use json.load if pulling from file.
# Pull out table name and column names from dict.
tbl_dict = json.loads(tbl_json)
tbl_name = list(tbl_dict)[0]
tbl_name
'Table_1'
col_names = [list(col_dict)[0] for col_dict in tbl_dict[tbl_name]]
# Result of above.
col_names
['Name', 'BGE3', 'Trans', 'Art']
# Create list of types and then combine column names and column types into
# psycopg2 sql composed object. Warning: sql.SQL() does no escaping so potential
# injection risk.
type_list = ["varchar", "varchar", "varchar"]
col_type = []
for i in zip(map(sql.Identifier, col_names), map(sql.SQL,type_list)):
col_type.append(i[0] + i[1])
# The result of above.
col_type
[Composed([Identifier('Name'), SQL('varchar')]),
Composed([Identifier('BGE3'), SQL('varchar')]),
Composed([Identifier('Trans'), SQL('varchar')])]
# Build psycopg2 sql string using above.
sql_str = sql.SQL("CREATE table {} ({})").format(sql.Identifier(tbl_name), sql.SQL(',').join(col_type) )
con = psycopg2.connect("dbname=test host=localhost user=aklaver")
cur = con.cursor()
# Shows the CREATE statement that will be executed.
print(sql_str.as_string(con))
CREATE table "Table_1" ("Name"varchar,"BGE3"varchar,"Trans"varchar)
# Execute statement and commit.
cur.execute(sql_str)
con.commit()
# In psql client the result of the execute:
\d "Table_1"
Table "public.Table_1"
Column | Type | Collation | Nullable | Default
--------+-------------------+-----------+----------+---------
Name | character varying | | |
BGE3 | character varying | | |
Trans | character varying | | |
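For the insert portion, a hedged sketch along the same lines. Flattening list values into comma-separated strings is my assumption, made only so the values fit the varchar columns; adjust it to how you actually want lists stored:

# Pull the value out of each {"col": value} dict and flatten lists to strings.
row = []
for col_dict in tbl_dict[tbl_name]:
    value = list(col_dict.values())[0]
    row.append(','.join(value) if isinstance(value, list) else value)

# Build the INSERT dynamically, mirroring the CREATE TABLE above.
insert_str = sql.SQL("INSERT INTO {} ({}) VALUES ({})").format(
    sql.Identifier(tbl_name),
    sql.SQL(',').join(map(sql.Identifier, col_names)),
    sql.SQL(',').join(sql.Placeholder() for _ in col_names))

cur.execute(insert_str, row)
con.commit()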
I'm trying to get my result dictionary from SQLAlchemy automatically into the Pydantic output model for FastAPI using the from_orm method, but I always get a validation error.
File "pydantic\main.py", line 508, in pydantic.main.BaseModel.from_orm
pydantic.error_wrappers.ValidationError: 2 validation errors for Category
name
field required (type=value_error.missing)
id
field required (type=value_error.missing)
If I create the objects with the Pydantic schema myself and add them to the list, the method works.
What would I have to change for from_orm to work?
Did I possibly miss something in the documentation?
https://pydantic-docs.helpmanual.io/usage/models/#orm-mode-aka-arbitrary-class-instances
https://fastapi.tiangolo.com/tutorial/sql-databases/#use-pydantics-orm_mode
Or is there another/better way to turn the ResultProxy into Pydantic-capable output?
The output I get from the database method is the following:
[{'id': 1, 'name': 'games', 'parentid': None}, {'id': 2, 'name': 'computer', 'parentid': None}, {'id': 3, 'name': 'household', 'parentid': None}, {'id': 10, 'name': 'test', 'parentid': None}]
Models.py
from sqlalchemy import BigInteger, Column, DateTime, ForeignKey, Integer, Numeric, String, Text, text, Table
from sqlalchemy.orm import relationship, mapper
from sqlalchemy.ext.declarative import declarative_base
Base = declarative_base()
metadata = Base.metadata
category = Table('category', metadata,
                 Column('id', Integer, primary_key=True),
                 Column('name', String(200)),
                 Column('parentid', Integer),
                 )
class Category(object):
    def __init__(self, cat_id, name, parentid):
        self.id = cat_id
        self.name = name
        self.parentid = parentid

mapper(Category, category)
Schemas.py
from pydantic import BaseModel, Field
class Category(BaseModel):
    name: str
    parentid: int = None
    id: int

    class Config:
        orm_mode = True
main.py
def result_proxy_to_Dict(results: ResultProxy):
    d, a = {}, []
    for rowproxy in results:
        # rowproxy.items() returns an array like [(key0, value0), (key1, value1)]
        for column, value in rowproxy.items():
            # build up the dictionary
            d = {**d, **{column: value}}
        a.append(d)
    return a
def crud_read_cat(db: Session) -> dict:
    # records = db.query(models.Category).all()
    # query = db.query(models.Category).filter(models.Category.parentid == None)
    s = select([models.Category]). \
        where(models.Category.parentid == None)
    result = db.execute(s)
    # print(type(result))
    # print(result_proxy_to_Dict(result))
    # results = db.execute(query)
    # result_set = db.execute("SELECT id, name, parentid FROM public.category;")
    # rint(type(result_set))
    # for r in result_set:
    #     print(r)
    # return [{column: value for column, value in rowproxy.items()} for rowproxy in result_set]
    # return await databasehelper.database.fetch_all(query)
    return result_proxy_to_Dict(result)
    # return results
#router.get("/category/", response_model=List[schemas.Category], tags=["category"])
async def read_all_category(db: Session = Depends(get_db)):
categories = crud_read_cat(db)
context = []
print(categories)
co_model = schemas.Category.from_orm(categories)
# print(co_model)
for row in categories:
print(row)
print(row.get("id", None))
print(row.get("name", None))
print(row.get("parentid", None))
tempcat = schemas.Category(id=row.get("id", None), name=row.get("name", None),
parentid=row.get("parentid", None))
context.append(tempcat)
#for dic in [dict(r) for r in categories]:
# print(dic)
# print(dic.get("category_id", None))
# print(dic.get("category_name", None))
# print(dic.get("category_parentid", None))
# tempcat = schemas.Category(id=dic.get("category_id", None), name=dic.get("category_name", None),
# parentid=dic.get("category_parentid", None))
# context.append(tempcat)
return context
I'm new to this myself so I can't promise the best answer, but I noticed that if you simply make the fields Optional in the schema, it works.
from typing import Optional

class Category(BaseModel):
    name: Optional[str]
    parentid: int = None
    id: Optional[int]

    class Config:
        orm_mode = True
Still returning that info in the response:
[
    {
        "name": "games",
        "parentid": null,
        "id": 1
    },
    {
        "name": "computer",
        "parentid": null,
        "id": 2
    },
    {
        "name": "household",
        "parentid": null,
        "id": 3
    },
    {
        "name": "test",
        "parentid": null,
        "id": 10
    }
]
There is likely still some sort of validation error underneath, but it seems like a usable workaround for now.
I just had the same problem. I think it's related to Pydantic nonetheless. Please have a look at this link for more information: https://github.com/samuelcolvin/pydantic/issues/506.
But after changing my model:
class Student(BaseModel):
    id: Optional[int]       # changed to Optional
    name: Optional[str]
    surname: Optional[str]
    email: Optional[str]
the validation error goes away. It's a funny error, given that the entries in my database still updated with the values... I am new to FastAPI as well, so the workaround and the error do not really make sense to me for now, but yes, it worked. Thank you.
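For what it's worth, a hedged sketch of another way around it, based on my assumption about the root cause (from_orm expects a single object with attribute access, not a list of dicts), reusing the names from the question:

# Build one schema object per dict row instead of calling from_orm on the whole list.
context = [schemas.Category(**row) for row in categories]

# Or, if the rows are ORM instances (e.g. db.query(models.Category).all()):
# context = [schemas.Category.from_orm(obj) for obj in records]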
I got the error below when using SQLAlchemy to insert data into a Snowflake warehouse. Any idea?
Error:
Failed to rewrite multi-row insert [SQL: 'INSERT INTO widgets (id, name, type)
SELECT %(id)s AS anon_1, %(name)s AS anon_2, widgets.id \nFROM widgets \nWHERE widgets.type = %(card_id)s']
[parameters: ({'id': 2, 'name': 'Lychee', 'card_id': 1}, {'id': 3, 'name': 'testing', 'card_id': 2})]
Code:
from sqlalchemy import *
from snowflake.sqlalchemy import URL
# Helper function for local debugging
def createSQLAlchemyEngine():
url = URL(
account='',
user='',
password='',
database='db',
schema='',
warehouse='',
role='',
proxy_host='',
proxy_port=8099
)
engine = create_engine(url)
return engine
conn = createSQLAlchemyEngine()
# Construct database
metadata = MetaData()
widgetTypes = Table('widgetTypes', metadata,
Column('id', INTEGER(), nullable=True),
Column('type', VARCHAR(), nullable=True))
widgets = Table('widgets', metadata,
Column('id', INTEGER(), nullable=True),
Column('name', VARCHAR(), nullable=True),
Column('type', INTEGER(), nullable=True))
engine = conn
metadata.create_all(engine)
# Connect and populate db for testing
conn = engine.connect()
sel = select([bindparam('id'), bindparam('name'), widgets.c.id]).where(widgets.c.type == bindparam('card_id'))
ins = widgets.insert().from_select(['id', 'name', 'type'], sel)
conn.execute(ins, [
# {'name': 'Melon', 'type_name': 'Squidgy'},
{'id': 2, 'name': 'Lychee', 'card_id' : 1 },
{'id': 3, 'name': 'testing', 'card_id': 2}
])
conn.close()
Basically, what I'm trying to do is like the below, but Snowflake doesn't support this syntax.
insert into tableXX
values('Z',
(select max(b) +1 from tableXX a where a.CD = 'I'), 'val1', 'val2')
So I have to do something like the following instead, but I got the above error.
insert into tableXX
select 'val1', 'val2', ifnull(max(c), 0) + 1 from tableXX where a.CD = 'I',
select 'val1', 'val2', ifnull(max(c), 0) + 1 from tableXX where a.CD = 'I',
Target:
The logic behind the code is that I want to set the sequence id based on the max of the existing sequence ids of the records that have the same 'CD' as the new record.
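A hedged sketch of a workaround I have been considering (an assumption on my part, not verified against this snowflake-sqlalchemy version): executing the same INSERT ... FROM SELECT once per parameter set, so the connector never has to rewrite a multi-row insert.

# Assumption: the failure comes from the multi-row rewrite, so run the
# statement once per parameter set instead of passing a list.
for params in ({'id': 2, 'name': 'Lychee', 'card_id': 1},
               {'id': 3, 'name': 'testing', 'card_id': 2}):
    conn.execute(ins, params)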
Goal:
I want to allow the user to search for a document by ID, or allow other text-based queries.
Code:
l_search_results = list(
    cll_sips.find(
        {
            '$or': [
                {'_id': ObjectId(s_term)},
                {'s_text': re.compile(s_term, re.IGNORECASE)},
                {'choices': re.compile(s_term, re.IGNORECASE)}
            ]
        }
    ).limit(20)
)
Error:
<Whatever you searched for> is not a valid ObjectId
s_term needs to be a valid object ID (or at least in the right format) when you pass it to the ObjectId constructor. Since it's sometimes not an ID, that explains why you get the exception.
Try something like this instead:
from pymongo.errors import InvalidId
or_filter = [
    {'s_text': re.compile(s_term, re.IGNORECASE)},
    {'choices': re.compile(s_term, re.IGNORECASE)}
]

try:
    id = ObjectId(s_term)
    or_filter.append({'_id': id})
except InvalidId:
    pass

l_search_results = list(
    cll_sips.find({'$or': or_filter}).limit(20)
)