Python - pandas to_sql

I'm trying to use https://pandas.pydata.org/pandas-docs/version/0.21/generated/pandas.DataFrame.to_sql.html
When I change the name argument, e.g. if I set
df.to_sql(name="testTable", con=constring)
the actual table name comes up as [UserName].[testTable] rather than just [testTable].
Is there a way to get rid of the [UserName] prefix, which is tied to the user who runs the script?

The [UserName] portion of the table name is the schema that the table is in. I don't know which database engine you're using, but the schema you're looking for might be "dbo".
According to the documentation, you can provide a schema argument:
df.to_sql(name="testTable", con=constring, schema="dbo")
Note that if the schema is left blank, it uses the DB user's default schema (as defined when the user was added to the database), which, in your case, appears to be the schema of the user.

Related

sqlalchemy to_sql if_exists = "replace" creates a table with different prefix

When I execute the following code, it drops the existing table in the database but doesn't recreate it in the same place. Instead of creating dbo.TableName, it creates username.TableName.
engine = sqlalchemy.create_engine(
    "mssql+pyodbc://server/dbname?driver=ODBC+Driver+13+for+SQL+Server")
df.to_sql("OTD_1_DELIVERY_TRACKING_F_IMPORT", con=engine, if_exists="replace", index=False)
Does anyone know how to fix this so it recreates the dbo.TableName table?

Does anyone know how to fix this so it recreates the dbo.TableName table
This is happening because the user has a default schema other than dbo. In very old versions of SQL Server each user had their own schema, and this is a vestige of that behavior.
So when that user runs
DROP TABLE OTD_1_DELIVERY_TRACKING_F_IMPORT
it looks in the user's default schema first and, not finding anything there, then looks in the dbo schema and drops that table. But when the user runs
CREATE TABLE OTD_1_DELIVERY_TRACKING_F_IMPORT ...
it's created in the user's default schema.
The easy fix is to change the user's default schema to dbo, e.g.:
ALTER USER [SomeUser] WITH DEFAULT_SCHEMA = dbo;
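If changing the user's default schema is not an option, another approach is to name the schema explicitly on the pandas side, so that both the DROP and the CREATE issued for if_exists="replace" should target dbo. A minimal sketch, assuming df is a pandas DataFrame and engine is a SQLAlchemy engine like the one in the question; the helper name replace_table is hypothetical:

```python
def replace_table(df, engine, table_name, schema="dbo"):
    """Drop and recreate `schema.table_name` from `df`.

    Assumes `df` is a pandas DataFrame and `engine` a SQLAlchemy engine.
    Passing `schema` explicitly avoids relying on the user's default schema.
    """
    df.to_sql(
        table_name,
        con=engine,
        schema=schema,
        if_exists="replace",
        index=False,
    )
```

It would be called as replace_table(df, engine, "OTD_1_DELIVERY_TRACKING_F_IMPORT").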

How to get base from existing sql DDL file?

I'm using SQLAlchemy for MySQL.
The common SQLAlchemy workflow is:
Define model classes matching the table structure. (class User(Base))
Migrate to the database with db.create_all (or alembic, etc.)
Import the model class and use it. (db.session.query(User))
But what if I want to use a raw SQL file instead of defined model classes?
I read that automap does something similar, but I want to get mapper objects from a raw SQL file, not from an already created database.
Is there a best practice for this?
This is an example of the DDL:
-- ddl.sql
-- This is just an example, so please ignore some issues related to a grammar
CREATE TABLE `card` (
`card_id` bigint(20) NOT NULL AUTO_INCREMENT COMMENT 'card',
`card_company_id` bigint(20) DEFAULT NULL COMMENT 'card_company_id',
PRIMARY KEY (`card_id`),
KEY `card_ix01` (`card_company_id`),
KEY `card_ix02` (`user_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COMMENT='card table'
And I want to do something like:
Base = raw_sql_base('ddl.sql') # Some kinda automap_base but from SQL file
# engine, suppose it has two tables 'user' and 'address' set up
engine = create_engine("mysql://user@localhost/program")
# reflect the tables
Base.prepare(engine)
# mapped classes are now created with names by sql file
Card = Base.classes.card
session = Session(engine)
session.add(Card(card_id=1, card_company_id=1))
session.commit() # Insert

SQLAlchemy is not an SQL parser, but the exact opposite; its reflection works against existing databases only. In other words you must execute your DDL and then use reflection / automap to create the necessary Python models:
from sqlalchemy import create_engine, text
from sqlalchemy.ext.automap import automap_base
from sqlalchemy.orm import Session

# engine for the target database (the connection URL is a placeholder)
engine = create_engine("mysql://user@localhost/program")
# execute the DDL in order to populate the DB
with open('ddl.sql') as ddl, engine.begin() as conn:
    conn.execute(text(ddl.read()))
Base = automap_base()
# reflect the tables (SQLAlchemy 1.4+; older versions used reflect=True)
Base.prepare(autoload_with=engine)
# mapped classes are now created with names taken from the reflected tables
Card = Base.classes.card
session = Session(engine)
session.add(Card(card_id=1, card_company_id=1))
session.commit()  # Insert
This of course may fail, if you have already executed the same DDL against your database, so you would have to handle that case as well. Another possible caveat is that some DB-API drivers may not like executing multiple statements at a time, if your ddl.sql happens to contain more than one CREATE TABLE statement etc.
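One way to work around drivers that reject multi-statement strings is to split the file and execute the statements one at a time. The sketch below is only an illustration under stated assumptions: it uses the standard library's sqlite3 module as a stand-in database, the helper name split_statements is hypothetical, and it relies on sqlite3.complete_statement, which follows SQLite's tokenizer rules rather than MySQL's. It also assumes every statement ends at the end of a line.

```python
import sqlite3


def split_statements(script: str) -> list[str]:
    """Split a SQL script into statements, one per terminating semicolon.

    Lines are accumulated until SQLite's tokenizer reports a complete
    statement, so semicolons inside string literals do not split early.
    """
    statements, buffer = [], ""
    for line in script.splitlines():
        buffer += line + "\n"
        if sqlite3.complete_statement(buffer):
            statements.append(buffer.strip())
            buffer = ""
    if buffer.strip():  # trailing statement without a final semicolon
        statements.append(buffer.strip())
    return statements


# Example DDL (simplified so that it is valid for SQLite).
ddl = """
-- two tables in one file
CREATE TABLE card (card_id INTEGER PRIMARY KEY, card_company_id INTEGER);
CREATE TABLE card_company (
    card_company_id INTEGER PRIMARY KEY,
    name TEXT DEFAULT 'n/a; unknown'
);
"""

statements = split_statements(ddl)
conn = sqlite3.connect(":memory:")
for stmt in statements:
    conn.execute(stmt)  # one statement per call

tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
```

For real MySQL DDL a dedicated splitter such as sqlparse.split may be a better fit, with the parsing caveats discussed below.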
...but I want to get mapper object from raw SQL file.
Ok, in that case what you need is the aforementioned parser. A cursory search produced two candidates:
sqlparse: Generic, but the issue tracker is a testament to how nontrivial parsing SQL is. It is often confused; for example, it parses ... COMMENT 'card', `card_company_id` ... as a keyword and an identifier list, not as a keyword, a literal, punctuation, and an identifier (or even better, the column definitions as their own nodes).
mysqlparse: A MySQL specific solution, but with limited support for just about anything, and it seems abandoned.
Parsing would be just the first step, though. You'd then have to convert the resulting trees to models.

CQL update query not working using Python

I am trying to write a new password to a Cassandra database after a password reset. This is the query I have written, where both the username and password fields are dynamic. Is this right?
def update_db(uname, pwd):
    query = session.prepare('update user.userdetails set "password"=%s where "username" = ? ALLOW FILTERING', pwd)
    session.execute(query, (uname,))
update_db(username, new_pwd)
I am calling this through an API. But it doesn't seem to update.

Alex is absolutely correct in that you need to provide the complete PRIMARY KEY for any write operation. Remove ALLOW FILTERING and your query should work as long as your primary key definition is: PRIMARY KEY (username).
Additionally, it's best practice to parameterize your entire prepared statement, instead of relying on string formatting for password.
query = session.prepare('update user.userdetails set "password"=? where "username"=?')
session.execute(query, [pwd, uname])
Note: If at any point you find yourself needing the ALLOW FILTERING directive, you're doing it wrong.

To update a record, you need to provide the complete primary key. It will not work with ALLOW FILTERING: you first need to get all the primary keys that you want to update, and then issue individual update commands. See the documentation for a more detailed description of the UPDATE command.
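The "one update per primary key" pattern can be sketched as follows. This assumes session is a connected cassandra-driver Session, that username is the complete primary key of user.userdetails, and that the list of usernames has already been collected; the helper name reset_passwords is hypothetical.

```python
def reset_passwords(session, usernames, new_pwd):
    """Issue one fully keyed UPDATE per username and return the count.

    `session` is assumed to be a cassandra-driver Session; the statement
    is prepared once and reused for every key.
    """
    update = session.prepare(
        'UPDATE user.userdetails SET "password" = ? WHERE "username" = ?'
    )
    count = 0
    for uname in usernames:
        # Every write names the complete primary key; no ALLOW FILTERING.
        session.execute(update, (new_pwd, uname))
        count += 1
    return count
```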
If you really want to specify the default value for some column - why not simply handle it with something like .get('column', 'default-value')?

Specifying the schema in Pandas to_sql

From the source of to_sql, I can see that it gets mapped to a MetaData object: meta = MetaData(con, schema=schema). However, I can't find SQLAlchemy docs that tell me how to define the schema for MySQL.
How do I specify the schema string?

The schema parameter in to_sql is confusing because the word "schema" here means something different from its general meaning of "table definitions". In some SQL flavors, notably PostgreSQL, a schema is effectively a namespace for a set of tables.
For example, you might have two schemas, one called test and one called prod. Each might contain a table called user_rankings generated in pandas and written using the to_sql command. You would specify the test schema when working on improvements to user rankings. When you are ready to deploy the new rankings, you would write to the prod schema.
As others have mentioned, when you call to_sql the table definition is generated from the type information for each column in the dataframe. If the table already exists in the database with exactly the same structure, you can use the append option to add new data to the table.
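The test/prod namespace idea can be illustrated without a PostgreSQL server. The sketch below uses the standard library's sqlite3 module, where attached databases are addressed with the same schema.table syntax; this is only a stand-in for real schemas, and the column names are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Attach two extra in-memory databases; SQLite addresses them like schemas.
conn.execute("ATTACH DATABASE ':memory:' AS test")
conn.execute("ATTACH DATABASE ':memory:' AS prod")

# The same table name can exist independently in each namespace.
for schema in ("test", "prod"):
    conn.execute(
        f"CREATE TABLE {schema}.user_rankings (user_id INTEGER, score INTEGER)"
    )

conn.execute("INSERT INTO test.user_rankings VALUES (1, 10)")
conn.execute("INSERT INTO prod.user_rankings VALUES (1, 3)")

test_rows = conn.execute(
    "SELECT user_id, score FROM test.user_rankings").fetchall()
prod_rows = conn.execute(
    "SELECT user_id, score FROM prod.user_rankings").fetchall()
```

With pandas, the equivalent choice would be made through the schema argument, e.g. schema="test" while experimenting and schema="prod" when deploying.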

DataFrame.to_sql(self, name, con, schema=None, if_exists='fail', index=True, index_label=None, chunksize=None, dtype=None, method=None)
Just use the schema parameter. Note, however, that the schema is not the ODBC driver.

Starting from the Dialects page of the SQLAlchemy documentation, select the documentation page for your dialect and search for create_engine to find an example of how to create an engine.
The Engine Configuration page gives an even more concise overview for all supported dialects.
Verbatim extract for MySQL:
# default
engine = create_engine('mysql://scott:tiger@localhost/foo')
# mysql-python
engine = create_engine('mysql+mysqldb://scott:tiger@localhost/foo')
# MySQL-connector-python
engine = create_engine('mysql+mysqlconnector://scott:tiger@localhost/foo')
# OurSQL
engine = create_engine('mysql+oursql://scott:tiger@localhost/foo')
Then pass this engine to the to_sql(...) of pandas' DataFrame.

What does the second parameter mean in dblite's open

As shown in the example at the link below, I am having trouble figuring out what the second parameter in open() is. Can anyone tell me about this? Thank you.
https://pypi.python.org/pypi/scrapy-dblite/0.2.5

The Storage() class constructor documents the second parameter as:
uri - URI to sqlite database, sqlite://<sqlite-database>:<table>
So you name the full path of the database file (sqlite stores a database in one file), and a table name for the items to be stored in.
If you use an absolute path, it should start with an extra slash:
sqlite:///some/path/to/database.db:foobar
will open /some/path/to/database.db (creating it if it doesn't yet exist), and use a table called foobar in that database (again, creating it if it doesn't yet exist).
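To make the documented format concrete, here is a small sketch that splits such a URI into its database path and table name. The helper split_dblite_uri is hypothetical and is not taken from scrapy-dblite; how the library parses the URI internally is not shown here.

```python
def split_dblite_uri(uri: str) -> tuple[str, str]:
    """Split 'sqlite://<sqlite-database>:<table>' into (database, table)."""
    prefix = "sqlite://"
    if not uri.startswith(prefix):
        raise ValueError(f"expected a URI starting with {prefix!r}: {uri!r}")
    # The table name follows the last colon; everything before it is the path.
    database, sep, table = uri[len(prefix):].rpartition(":")
    if not sep or not database or not table:
        raise ValueError(f"expected sqlite://<database>:<table>, got {uri!r}")
    return database, table


database, table = split_dblite_uri("sqlite:///some/path/to/database.db:foobar")
```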
