Insert Pandas dataframe into Cassandra Table - python

From the documentation, there is a way to insert data into a table:
session.execute(
    """
    INSERT INTO users (name, credits, user_id)
    VALUES (%s, %s, %s)
    """,
    ("John O'Reilly", 42, uuid.uuid1())
)
The column names have to be stated there. However, in my case, I have a dataframe which has only a header row and a single row of data, for example:
"sepal_length": 5.1, "sepal_width": 3.5, "petal_length": 1.4, "petal_width": 0.2, "species": "Iris".
The user provides the connection details for my API to connect to their particular Cassandra table, which contains the column names stored in the dataframe. How can I insert the dataframe's data, with each column header mapped to the corresponding table column, without hardcoding the column names as in the documentation, since the headers are not the same for different cases?
I am trying to achieve something like this:
def insert_table(df, table_name, ...):  # connection details
    # Set up connection and session
    session.execute(
        """
        INSERT INTO table_name (#df's column headers)
        VALUES (%s, %s, %s)
        """,
        (#df's data for the only row)
    )
I discovered this but I actually just need a simple insert operation.

You can get the DataFrame's column names with the following:
column_names = list(my_dataframe.columns.values)
You could rewrite insert_table(...) to accept the list of column names as an argument.
For example, string substitution can be used to form the CQL statement:
cql_query = """
INSERT INTO {table_name} ({col_names})
VALUES (%s, %s, %s)
""".format(table_name="my_table", col_names=','.join(map(str, column_names)))
...
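Putting those pieces together, a minimal sketch of insert_table (assuming an already-connected cassandra-driver session and a single-row DataFrame df; the function name and parameters here are illustrative, not part of the original answer) could look like this:
def insert_table(session, df, table_name):
    # Build the column list and a matching number of %s placeholders from the DataFrame itself
    column_names = list(df.columns.values)
    placeholders = ', '.join(['%s'] * len(column_names))  # one %s per column
    cql_query = "INSERT INTO {table} ({cols}) VALUES ({vals})".format(
        table=table_name,
        cols=', '.join(map(str, column_names)),
        vals=placeholders,
    )
    # df.iloc[0] is the single data row; .tolist() converts NumPy scalars to native Python types
    session.execute(cql_query, tuple(df.iloc[0].tolist()))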

Related

How to insert composite column as one single column in sql

for row in dfp.itertuples():
    cursor.execute('''
        INSERT INTO athletes(id, season, name)
        VALUES (?,?,?)
        ''',
        (
            row.id,
            row.season,
            row.name,
        )
    )
mysql.connector.errors.ProgrammingError: Not all parameters were used in the SQL statement.
I am getting this error because name is composed of first name and last name, so it is considering it as two different parameters when in reality it is only a single parameter. The same thing is happening in another table when I am using date (considering it as 3 parameters).
In pymysql, don't use ? as the parameter placeholders. Use %s.
cursor.execute('''INSERT INTO athletes(id, season, name) VALUES (%s,%s,%s)''',
(row.id, row.season, row.name,))
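Applied to the loop above, a sketch (assuming a mysql.connector or pymysql cursor named cursor and an open connection named conn, which is not shown in the question) would be:
for row in dfp.itertuples():
    cursor.execute(
        '''INSERT INTO athletes(id, season, name) VALUES (%s, %s, %s)''',
        (row.id, row.season, row.name),
    )
conn.commit()  # conn is assumed to be the open connection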

Insert values in table with two excecute commands

I am trying to insert values into one MySQL table using Python.
First, I insert values from a CSV file with:
sql = "INSERT INTO mydb.table (time, day, number) VALUES %r" % (tuple(values),)
cursor.execute(sql)
Then I insert another value into the same table and the same row:
sql = "INSERT INTO mydb.table(name) values(%s)"
cursor.execute(sql)
With this I get the inserts in two different rows.
But I need to insert it into the same row, without using sql = "INSERT INTO mydb.table (time, day, number, name) VALUES %r" % (tuple(values),).
Is there a way to insert values into the same row in two 'insert statements'?
INSERT will always add a new row. If you want to change values in this row, you have to specify a unique identifier (key) in the WHERE clause to access this row and use UPDATE or REPLACE instead.
When using REPLACE you need to be careful if your table contains an auto_increment column, since a new value will be generated.
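For example, a sketch of the two-step pattern (assuming number is the unique key; the table, column, and variable names are illustrative, using %s-style parameters):
cursor.execute(
    "INSERT INTO mydb.table (time, day, number) VALUES (%s, %s, %s)",
    (time_value, day_value, number_value),
)
# Later, fill in the remaining column of the same row, addressed by its key
cursor.execute(
    "UPDATE mydb.table SET name = %s WHERE number = %s",
    (name_value, number_value),
)
connection.commit()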

Use ON DUPLICATE KEY UPDATE from python

I'm trying to use this code to update a table in MySQL, but I'm getting an error with the update part:
table_name = 'my_table'
sql_select_Query = """
INSERT {0} (age, city, gender,UniqueId)
VALUES ({1},{2},{3},{4})
ON DUPLICATE KEY UPDATE
age=VALUES(age,city=VALUES(city),gender=VALUES(gender),height=VALUES(height)
""".format(table_name, '877','2','1','2898989')
cursor = mySQLconnection.cursor()
cursor.execute(sql_select_Query)
mySQLconnection.commit()
For example, to update the city I get:
Unknown column '877'
Hence it seems it is taking the value as a column name and searching for it in my_table.
The correct way to use parameters with Python is to pass the values in the execute() call, not to interpolate the values into the SQL query. The exception is identifiers, like the table name in your case.
sql_select_Query = """
INSERT `{0}` (age, city, gender, UniqueId)
VALUES (%s, %s, %s, %s)
ON DUPLICATE KEY UPDATE
age=VALUES(age), city=VALUES(city), gender=VALUES(gender)
""".format(table_name)
cursor.execute(sql_select_Query, ('877', '2', '1', '2898989'))
See also MySQL parameterized queries
You forgot the ) after VALUES(age. Maybe that was just a typo you made transcribing the question into Stack Overflow. I've fixed it in the example above.
Your INSERT statement sets the height column, but it's not part of the tuple you insert. I removed it in the example above. If you want height in the UPDATE clause, then you need to include it in the tuple and pass a value in the parameters.
Also I put back-quotes around the table name, just in case the table name is a reserved keyword or contains a space or something.

In MySql, using Python, how to INSERT INTO tables and columns, depending on variables?

I am building an interface in Python to get access to different queries in some big data tables (queries such as insert, search, predefined ones, etc.). The problem is that there are a few different tables, each containing a number of columns... So I would want to modularize my code and MySQL queries, so that depending on which table we want to insert data into and which columns the data concern, it knows which MySQL command to execute.
I saw that we can use variables for values, for example:
sql = "INSERT INTO table_name (col1, col2) VALUES (%s, %s)"
values = ("val1", "val2")
mycursor.execute(sql, values)
Is it possible to have something similar with table_name and columns? To have something like this, for example:
sql = "INSERT INTO (%s) (%s, %s) VALUES (%s, %s)"
table = "table_name"
columns = ("col1", "col2")
values = ("val1", "val2")
mycursor.execute(sql, table, columns, values)
With that, it would be far easier for me to initialize table, columns and values when needed (for example, when the user clicks a button, enters values in some text fields, etc.) than having a lot of such SQL queries, one for each table and each possible subset of columns.
I am not sure that it is all clear with my pretty random English; if you need some more information, feel free to ask!
Thank you in advance for your time,
Sanimys
For the few of you that will see this while looking for an answer: in fact this is pretty easy! You can do pretty much exactly what I proposed, for example:
sql = "INSERT INTO %s %s VALUES %s"
table = "table_name"
columns = "(col1, col2)"
values = "(val1, val2)"
mycursor.execute(sql % (table, columns, values))
Where every element is a string.
By doing that, you can have some nice dynamic computations, such as:
sql = "INSERT INTO %s %s VALUES %s"
table = "table_name"
values = "(val1, val2)"
// Some code
def compute_columns(list_of_columns_to_insert):
col = "("
for c in list_of_columns_to_insert:
col = col + "'" + c + "', "
col = col[:-2] + ")"
return col
// Some code generating a list of columns that we want to insert to
columns = compute_columns(list_of_columns_to_insert)
mycursor.execute(sql % (columns, values))
Here you go, hope that it could help someone that struggles like me !
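For reference, a safer variant of the same idea (just a sketch, not the original answer's code): interpolate only the identifiers (table and column names) into the statement, back-quoting them, and pass the values to execute() so the driver quotes and escapes them for you.
def insert_row(cursor, table, columns, values):
    # Back-quote identifiers and build one %s placeholder per value
    col_list = ', '.join('`{}`'.format(c) for c in columns)
    placeholders = ', '.join(['%s'] * len(values))
    sql = "INSERT INTO `{}` ({}) VALUES ({})".format(table, col_list, placeholders)
    cursor.execute(sql, tuple(values))

insert_row(mycursor, "table_name", ["col1", "col2"], ["val1", "val2"])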

inserting huge number of rows to mysql

I want to read from an MSSQL table and then insert into a MySQL table, but I couldn't format my MSSQL results to run executemany on them.
cursor.execute('select * from table') # MSSQL
rows = cursor.fetchall()
many_rows = []
for row in rows:
    many_rows.append((row))
sql = "insert into mysql.table VALUES (NULL, %s) on duplicate key update REFCOLUMN=REFCOLUMN" # MYSQL
mycursor.executemany(sql, many_rows)
mydb.commit()
This gives: Failed executing the operation; Could not process parameters.
The first NULL is for the id column and %s for the other 49 columns. It works one row at a time, but that takes ages over a remote connection.
EDIT
An example of the printed output of many_rows:
[
(49 columns' values, all string and separated by comma),
(49 columns' values, all string and separated by comma),
(49 columns' values, all string and separated by comma),
...
]
I was able to fix my issue with appending data like below:
many_rows.append((list(row)))
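A sketch of the full fix in context (assuming 49 data columns after the auto-increment id, as described above; the placeholder string is generated so it matches the column count):
placeholders = ', '.join(['%s'] * 49)  # one %s per data column
sql = ("insert into mysql.table VALUES (NULL, {}) "
       "on duplicate key update REFCOLUMN=REFCOLUMN").format(placeholders)

many_rows = [list(row) for row in rows]  # convert each MSSQL result row to a plain list
mycursor.executemany(sql, many_rows)
mydb.commit()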
