I have a table with ~80k rows of imported data. The table structure is as follows:
order_line_items
id
order_id
product_id
quantity
price
uuid
On import, the order_id, product_id, quantity, and price were imported, but the uuid field was left null.
Is there a way, using Python's uuid module, to add a UUID to each row of the table in bulk? I could use a script to cycle through each row and update it, but if there is a Python solution, that would be fastest.
You probably need to add a default UUID on the table/model and then save the values:
from uuid import uuid4

from sqlalchemy import Column, String
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class Table(Base):
    __tablename__ = 'table'
    # generate a fresh UUID string for every new row
    id = Column(String, primary_key=True, default=lambda: str(uuid4()))
    # add other columns

records = []  # records as a list of dicts
sess = session()  # session from a configured sessionmaker

# save all records to the db in one bulk operation
sess.bulk_insert_mappings(Table, records)
sess.commit()
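Note that bulk_insert_mappings creates new rows. Since the question is about filling in uuid on rows that already exist, bulk_update_mappings is the closer tool; a minimal sketch, assuming a hypothetical OrderLineItem model mirroring the question's schema (integer id primary key plus a uuid column) and a row_ids list of existing primary keys:

from uuid import uuid4

# one mapping per existing row: the primary key plus the column to change
mappings = [{"id": row_id, "uuid": str(uuid4())} for row_id in row_ids]

sess.bulk_update_mappings(OrderLineItem, mappings)
sess.commit()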
A more Pythonic way of adding or modifying values in a column is the map method. You can refer here for more details: https://pandas.pydata.org/docs/reference/api/pandas.Series.map.html.
Basically, map maps the values of a column according to a function.
Your function must return a value for this to work, and it can take the original column value as an argument.
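For example, a minimal sketch, assuming the table has been loaded into a DataFrame (engine here is a placeholder SQLAlchemy engine for the database):

import pandas
from uuid import uuid4

# hypothetical: load the table into a DataFrame first
df = pandas.read_sql("select * from order_line_items", engine)

# map ignores the old (null) value and returns a fresh UUID string per row
df["uuid"] = df["uuid"].map(lambda _: str(uuid4()))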
I'm fairly certain you can do this directly in MySQL using the UUID function.
UPDATE your_table_name SET uuid = UUID();
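If you want to fire that statement from Python rather than a MySQL shell, a minimal sketch (the connection URL is a placeholder):

from sqlalchemy import create_engine, text

engine = create_engine("mysql+pymysql://user:password@localhost/mydb")  # placeholder URL

with engine.begin() as conn:  # begin() commits on success
    conn.execute(text("UPDATE order_line_items SET uuid = UUID() WHERE uuid IS NULL"))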
I'm using DynamoDB. I have a simple Employee table with fields like id, name, salary, doj, etc. What is the equivalent of select max(salary) from employee in DynamoDB?
You can model your schema something like:
employee_id, as partition-key
salary, as sort-key of the table or local secondary index.
Then, query your table for the given employee_id, with ScanIndexForward as false, and pick the first returned entry. Since all rows for one employee_id are stored in sorted fashion, the first entry in descending order will be the one with the highest salary.
You can also keep Limit as 1, in which case DynamoDB will return only one record.
Relevant documentation here.
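With boto3, the same idea looks roughly like this (the table and attribute names are assumptions based on the schema above):

import boto3
from boto3.dynamodb.conditions import Key

table = boto3.resource("dynamodb").Table("employee")  # assumed table name

resp = table.query(
    KeyConditionExpression=Key("employee_id").eq("emp-1"),  # hypothetical employee id
    ScanIndexForward=False,  # descending by sort key (salary)
    Limit=1,                 # only the top entry is needed
)
max_salary = resp["Items"][0]["salary"]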
Not sure about boto3, but in boto it can be done this way:
from boto.dynamodb2.table import Table

table = Table("employee")
# query one partition in descending sort-key order, keeping only the top row
values = list(table.query_2(employee_id__eq="emp-1", reverse=True, limit=1))  # "emp-1" is a placeholder key
MAXVALUE = values[0]["salary"]
There is no cheap way to achieve this in DynamoDB. There is no built-in function to determine the max value of an attribute without retrieving all items and calculating it programmatically.
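If the table is not modeled with salary as a sort key, that fallback looks roughly like this in boto3 (table and attribute names assumed):

import boto3

table = boto3.resource("dynamodb").Table("employee")  # assumed table name

# full paginated scan, computing the max client-side (expensive on large tables)
items = []
resp = table.scan(ProjectionExpression="salary")
items.extend(resp["Items"])
while "LastEvaluatedKey" in resp:
    resp = table.scan(ProjectionExpression="salary",
                      ExclusiveStartKey=resp["LastEvaluatedKey"])
    items.extend(resp["Items"])

max_salary = max(item["salary"] for item in items)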
I would like to bulk update an ORM table in SQLAlchemy from a query for which I only have the text and a database connection. I cannot easily (I believe) reflect the source query in the ORM because it could come from an unlimited set of tables. The extra wrinkle is that I would like to update a key-value HSTORE column (postgres). I think I can figure out how to do this row-by-row, but would prefer a bulk UPDATE FROM-style operation.
To keep it simple:
class Table(Base):
    __tablename__ = 'table'
    id = Column(Integer, primary_key=True)
    hstore = Column(MutableDict.as_mutable(HSTORE))

query_to_update_from = 'select id, attr1, attr2 from source_table where id between 1 and 100'
I would like to update Table.hstore with {'attr1':attr1, 'attr2':attr2} where ids match. I want any columns not named id to update the hstore.
I know I can do session.execute('select id, attr1, attr2 from source_table where id between 1 and 100') and get a list of column names and row data easily. I can make a list of dictionaries from that, but can't figure out how to use that in a bulk update.
I have also tried making a query().subquery out of the raw text query to no avail, understandably since there isn't the required structure.
I am stumped at this point!
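One approach that seems to fit: run the raw query, build one parameter dict per row, and feed them to an executemany-style UPDATE with bindparam. A sketch, assuming SQLAlchemy 1.4+ and the Table model above; not tested against HSTORE specifics:

from sqlalchemy import bindparam, text, update

rows = session.execute(text(query_to_update_from)).mappings().all()

# every column except id becomes a key/value pair destined for the hstore
params = [
    {"row_id": row["id"],
     "kv": {k: str(v) for k, v in row.items() if k != "id"}}
    for row in rows
]

stmt = (
    update(Table)
    .where(Table.id == bindparam("row_id"))
    .values(hstore=bindparam("kv"))
)
session.execute(stmt, params)  # executemany-style bulk UPDATE
session.commit()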
I am accessing a Postgres database using SQLAlchemy models. In one of the models I have a Column with UUID type.
id = Column(UUID(as_uuid=True), default=uuid.uuid4, nullable=False, unique=True)
and it works when I try to insert a new row (it generates a new id).
The problem is when I try to fetch a Person by id; I try:
person = session.query(Person).filter(Person.id.like(some_id)).first()
some_id is a string received from the client, but then the LIKE comparison fails with: (ProgrammingError) operator does not exist: uuid ~~ unknown.
How do I fetch/compare a UUID column in the database through SQLAlchemy?
Don't use like; use = (not ==; in ISO-standard SQL, = means equality).
Keep in mind that UUIDs are stored in PostgreSQL as binary types, not as text strings, so LIKE makes no sense. You could probably do uuid::text LIKE ? but it would perform very poorly over large sets because you are effectively ensuring that indexes can't be used.
But = works, and is far preferable:
mydb=>select 'd796d940-687f-11e3-bbb6-88ae1de492b9'::uuid = 'd796d940-687f-11e3-bbb6-88ae1de492b9';
?column?
----------
t
(1 row)
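In SQLAlchemy terms that means filtering with == rather than like(); a sketch, assuming some_id holds a valid UUID string:

import uuid

person = (
    session.query(Person)
    .filter(Person.id == uuid.UUID(some_id))  # parse the client string into a UUID
    .first()
)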
I have two tables inherited from a base table (SQLAlchemy models):
class Base(object):
    def __tablename__(self):
        return self.__name__.lower()

    id = Column(Integer, primary_key=True, nullable=False)
    utc_time = Column(Integer, default=utc_time, onupdate=utc_time)
    datetime = Column(TIMESTAMP, server_default=func.now(), onupdate=func.current_timestamp())
and the inherited tables Person and Data.
How can I ensure that every Person and every Data row gets a different id, unique across the two tables? (When Person generates an id it should be aware of Data's ids, and vice versa.)
If you're using PostgreSQL, Firebird, or Oracle, use a sequence that's independent of both tables to generate primary key values. Otherwise, you need to roll some manual process like an "id" table or something like that, which can be tricky to do atomically.
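A sketch of the sequence approach on PostgreSQL (the sequence name and table layout here are schematic):

from sqlalchemy import Column, Integer, Sequence

class Person(Base):
    __tablename__ = 'person'
    # both tables draw their ids from the same database sequence
    id = Column(Integer, Sequence('shared_id_seq'), primary_key=True)

class Data(Base):
    __tablename__ = 'data'
    id = Column(Integer, Sequence('shared_id_seq'), primary_key=True)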
Basically, if I were given this problem, I'd ask why exactly two tables would need unique primary key values like that; if the primary key is an autoincrementing integer, that indicates it's meaningless. Its only purpose is to provide a unique key into a single table.
In a little script I'm writing using SQLAlchemy and Elixir, I need to get all the distinct values for a particular column. In ordinary SQL it'd be a simple matter of
SELECT DISTINCT `column` FROM `table`;
and I know I could just run that query "manually," but I'd rather stick to the SQLAlchemy declarative syntax (and/or Elixir) if I can. I'm sure it must be possible; I've even seen allusions to this sort of thing in the SQLAlchemy documentation, but I've been hunting through that documentation for hours (as well as that of Elixir) and I just can't seem to figure out how it would be done. So what am I missing?
You can query column properties of mapped classes and the Query class has a generative distinct() method:
for value in Session.query(Table.column).distinct():
    pass
For this class:
class Assurance(db.Model):
    name = Column(String)
you can do this:
assurances = []
for assurance in Assurance.query.distinct(Assurance.name):
    assurances.append(assurance.name)
and you will have the list of distinct values.
I wanted to count the distinct values, and chaining .distinct() and .count() would count first, resulting in a single value, then do the distinct. I had to do the following:
from sqlalchemy.sql import func

Session.query(func.count(func.distinct(Table.column)))
For this class:
class User(Base):
    name = Column(Text)
    id = Column(Integer, primary_key=True)
Method 1: Using load_only
from sqlalchemy.orm import load_only

records = db_session.query(User).options(load_only(User.name)).distinct().all()
values = [record.name for record in records]  # list of distinct values
Method 2: without any imports

records = db_session.query(User.name).distinct().all()
l_values = [record.name for record in records]
for user in session.query(users_table).distinct():
    print(user.posting_id)
SQLAlchemy version 2 encourages the use of the select() function. You can use an SQLAlchemy table to build a select statement that extracts unique values:
select(distinct(table.c.column_name))
From the SQLAlchemy 2.0 migration guide on ORM usage:
"The biggest visible change in SQLAlchemy 2.0 is the use of Session.execute() in conjunction with select() to run ORM queries, instead of using Session.query()."
Reproducible example using pandas to collect the unique values.
Define and insert the iris dataset
Define an ORM structure for the iris dataset, then use pandas to insert the data into an SQLite database. Pandas inserts with the if_exists="append" argument so that it keeps the structure defined in SQLAlchemy.
import seaborn
import pandas
from sqlalchemy import create_engine
from sqlalchemy import MetaData, Table, Column, Text, Float
from sqlalchemy.orm import Session
Define metadata and create the table
engine = create_engine('sqlite://')
meta = MetaData()

iris_table = Table('iris',
                   meta,
                   Column("sepal_length", Float),
                   Column("sepal_width", Float),
                   Column("petal_length", Float),
                   Column("petal_width", Float),
                   Column("species", Text))

iris_table.create(engine)
Load data into the table
iris = seaborn.load_dataset("iris")
iris.to_sql(name="iris",
            con=engine,
            if_exists="append",
            index=False,
            chunksize=10 ** 6,
            )
Select unique values
Reusing the iris_table from above.
from sqlalchemy import distinct, select
stmt = select(distinct(iris_table.c.species))
df = pandas.read_sql_query(stmt, engine)
df
# species
# 0 setosa
# 1 versicolor
# 2 virginica
The marked solution showed me an error, so I just specified the column and it worked. Here is the code:
for i in session.query(table_name.c.column_name).distinct():
    print(i)