Writing DataFrame rows to different database Tables using re.match and SQLite - python

RESOLVED - SEE FINAL EDIT AT BOTTOM.
I have a DataFrame that looks the following way:
df_transpose=
Time Date Morning (5AM-9AM) Day (9AM-6PM) \
Area
D1_NY_1 01_05_2012 0.000000 0.000000
D2_NY_2 01_05_2012 0.000000 0.000000
D3_NJ_1 01_05_2012 1.000000 0.966667
...
I want to write this row by row to different tables in an SQLite database. I've set up the database Data.db, which contains a separate table for each Area, i.e. the table names contain the Area names as listed in the DataFrame above (e.g. "Table_D1-NY-1", etc.). I want to test whether there is a match between the Area (the index) in the DataFrame and the names of the tables in my database and, if there is, write the entire relevant row of the DataFrame to the table whose name contains that Area. Here is what I've written so far, along with the error I get:
CODE:
ii=0
for ii in range(0,row_count):
    df_area= df_transpose.index[ii]
    export_data= df_transpose.iloc[ii]
    cur.execute("SELECT name FROM sqlite_master WHERE type='table'")
    available_tables=(cur.fetchall())
    for iii in range (0, row_count):
        if re.match('\w*'+df_area, available_tables[iii]):
            relevant_table=available_tables[iii]
            export_data.to_sql(name=relevant_table, con=con, if_exists=append)
        iii=iii+1
ERROR: for the "if re.match..." line:
TypeError: expected string or buffer
After searching for solutions, I added the second (iii) loop so that re.match() would not receive a list object (available_tables) instead of a string. I still get the same error, though. Can anyone see my error or help me fix the code?
EDIT:
For information, df_area and available_tables outputs the following:
df_area=
u'D1_NY_1'
available_tables=
[(u'US_D1_NY_1',), (u'US_D2_NY_2',), (u'US_D3_NJ_1',), ...]
EDIT:
Have not been able to figure this out yet and would appreciate input. I've tried to play around with my code but the error remains the same.
FINAL EDIT:
Thought I would post how I got past this. The problem was that available_tables was a list of tuples rather than a list of strings: cur.fetchall() returns one tuple per row, even when only one column is selected. For the re.match() test to work, available_tables had to be a list of strings, so I changed it with the following:
cur.execute("SELECT name FROM sqlite_master WHERE type='table'")
available_tables=[item[0] for item in cur.fetchall()]

Can't comment, but maybe try r'\w*' instead of '\w*', since you're using a backslash.
Also, it seems from the output you gave that available_tables is a list of tuples. So you'd probably want to use:
re.match(r'\w*'+df_area, available_tables[iii][0])
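A minimal sketch of the fix, assuming an in-memory database with three tables named after the sample output above. The column layout and the helper tables_for_area are made up for illustration and are not from the original post. It shows that fetchall() yields one-element tuples, flattens them to strings, and then matches an Area against each table name:

```python
import re
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# Hypothetical tables, named like the sample output in the question.
for name in ("US_D1_NY_1", "US_D2_NY_2", "US_D3_NJ_1"):
    cur.execute(
        'CREATE TABLE "{}" (obs_date TEXT, morning REAL, day REAL)'.format(name)
    )

cur.execute("SELECT name FROM sqlite_master WHERE type='table'")
rows = cur.fetchall()                        # list of 1-tuples
available_tables = [row[0] for row in rows]  # list of plain strings

# Passing a tuple to re.match() reproduces the TypeError from the question.
try:
    re.match(r"\w*D1_NY_1", rows[0])
    raised = False
except TypeError:
    raised = True


def tables_for_area(area, tables):
    """Return the table names that end with the given area label."""
    pattern = re.compile(r"\w*" + re.escape(area) + r"$")
    return [t for t in tables if pattern.match(t)]


matches = tables_for_area("D1_NY_1", available_tables)

# Write one row to the matching table (values taken from the sample frame).
cur.execute(
    'INSERT INTO "{}" (obs_date, morning, day) VALUES (?, ?, ?)'.format(matches[0]),
    ("01_05_2012", 0.0, 0.0),
)
con.commit()
```

Anchoring the pattern with $ is an extra choice made here so that an area such as D1_NY_1 would not also match a hypothetical table named US_D1_NY_10.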

Related

How to make the database give me two rows

Hello, my database (SQLite3) is giving me just one row out of the whole table. I want it to give me all the rows, but I don't know why it doesn't. Here's what I tried:
c.execute("SELECT type FROM accounts")
acctype = c.fetchmany()
For some reason it gave me just one row!
By default, fetchmany() returns cursor.arraysize rows, and arraysize defaults to 1, so 1 is considered "many"! Use fetchmany(size=2) if you want to fetch the next 2 rows.
If you want all the rows, just use fetchall() instead.
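A short sketch of the difference, assuming an in-memory accounts table with three made-up rows:

```python
import sqlite3

con = sqlite3.connect(":memory:")
c = con.cursor()
c.execute("CREATE TABLE accounts (type TEXT)")
c.executemany(
    "INSERT INTO accounts (type) VALUES (?)",
    [("admin",), ("user",), ("guest",)],  # illustrative data
)

c.execute("SELECT type FROM accounts")
first = c.fetchmany()            # uses c.arraysize, which defaults to 1
next_two = c.fetchmany(size=2)   # the next two rows of the same result set

c.execute("SELECT type FROM accounts")
everything = c.fetchall()        # every remaining row

print(len(first), len(next_two), len(everything))
```

Each fetch call continues from where the previous one stopped, which is why the query is executed again before fetchall().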

MySQL Python Query matching row name with column name

I'm building software to combine some chemicals into different compounds (each compound can have 1, 2, 3 or 4 chemicals), but some chemicals cannot be combined with certain others.
I have a table in my mysql db that has the following columns:
chemical_id,chemicalName, and one column for each chemical in my list.
Each row holds one of the chemicals. The value in each field tells me whether the row's chemical and the column's chemical can go together in a compound (1) or not (0). So every chemical has both a row and a column, and they were created in the same order. Here is some dummy data: https://imgur.com/a/e2Fbq1K
I have a Python list of chemical ids, which I'm going to combine with each other to make compounds of 1, 2, 3 and 4 chemicals, but I need a function to determine whether any two of them are incompatible.
I was trying to mess around with INFORMATION_SCHEMA COLUMN_NAME, but I'm kind of lost.
A loop around something like this would work, but the syntax is wrong:
list_of_chemicals = ['ChemName1','ChemName2','ChemName3'] #etc
def verify_comp(a,b): #will be passed with chem names
mycursor.execute("SELECT chemicalName FROM chemical_compatibility WHERE chemical_id = 'ChemName1' AND 'ChemName2' = 0")
#etc
I have tried to use %s placeholders, but they seem to work only in certain parts of a MySQL query. I'm a beginner at both Python and SQL, so any light will be much appreciated.
Thanks!
I followed @Akina's suggestion and made a new table containing pairs of chemicals and a compatibility value for each pair.
I also learned that %s placeholders can only stand in for values in a Python cursor execute statement, not for table or column names. For those you can build the query string from a Python variable, like this:
mycursor.execute("SELECT * FROM "+variablename+" WHERE condition = 1")
I'm not worried about SQL Injection for this project nor do I know if what I say here is 100% correct, but maybe it can help people that are lost nevertheless.
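For anyone who wants to see the pair-table idea in code, here is a rough sketch. It uses sqlite3 as a stand-in for MySQL so that it runs on its own (sqlite3 uses ? where the MySQL drivers use %s), and the table name chemical_pairs, its columns, and the helper compound_ok are assumptions, not the schema actually used:

```python
import sqlite3
from itertools import combinations

con = sqlite3.connect(":memory:")
cur = con.cursor()

# Hypothetical pair table: one row per unordered pair, stored with chem_a < chem_b.
cur.execute(
    "CREATE TABLE chemical_pairs (chem_a TEXT, chem_b TEXT, compatible INTEGER)"
)
cur.executemany(
    "INSERT INTO chemical_pairs VALUES (?, ?, ?)",
    [
        ("ChemName1", "ChemName2", 1),
        ("ChemName1", "ChemName3", 0),
        ("ChemName2", "ChemName3", 1),
    ],
)


def verify_comp(a, b):
    """Return True if the pair is stored as compatible."""
    a, b = sorted((a, b))  # match the storage order chem_a < chem_b
    cur.execute(
        "SELECT compatible FROM chemical_pairs WHERE chem_a = ? AND chem_b = ?",
        (a, b),  # values go through placeholders, never string concatenation
    )
    row = cur.fetchone()
    return row is not None and row[0] == 1


def compound_ok(chemicals):
    """A compound is valid only if every pair inside it is compatible."""
    return all(verify_comp(a, b) for a, b in combinations(chemicals, 2))


print(compound_ok(["ChemName1", "ChemName2"]))
print(compound_ok(["ChemName1", "ChemName2", "ChemName3"]))
```

Because the chemical names are now values rather than column names, they can all be passed through placeholders.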

SELECT command won't find column name with a colon

I have a python script to retrieve a value (table ID) from a PostgreSQL database. The column name contains a colon though and I believe this is stopping it working. I've tested this on columns without colons and it does get the ID correctly.
The line in question is
cur.execute("SELECT tID from titles where name like 'METEOROLOG:WINDSPEED_F' order by structure, comp1, comp2")
rowswind=cur.fetchall()
When I print rowswind nothing is returned (just empty brackets)
I have also tried..
cur.execute('SELECT tID from titles where name like "METEOROLOGY:WINDSPEED_F" order by structure, comp1, comp2')
But that comes back with the error
psycopg2.ProgrammingError: column "METEOROLOGY:WINDSPEED_F" does not
exist
(it definitely does).
I've also tried escaping the colon any way I can think of (i.e. back slash) but nothing works, I just get syntax errors.
Any advice would be welcome. Thanks.
ADDITION 20190429
I've now tried parameterizing the query but also with no success.
wind=('METEOROLOGY:WINDSPEED_F')
sql="SELECT tID from titles where name like '{0}' order by structure, comp1, comp2".format(wind)
I've tried many different combinations of double and single quotes to try and escape the colon with no success.
psycopg2.ProgrammingError: column "METEOROLOGY:WINDSPEED_F" does not exist
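You're getting this error because you're using double quotes around the target value in your query's WHERE clause. In PostgreSQL, double quotes mark an identifier (such as a column name) and single quotes mark a string literal, so the value is read as a column name here: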
cur.execute('SELECT tID from titles where name like "METEOROLOGY:WINDSPEED_F" order by structure, comp1, comp2')
You're getting 0 results back here:
cur.execute("SELECT tID from titles where name like 'METEOROLOG:WINDSPEED_F' order by structure, comp1, comp2")
because 0 rows exist with the value "METEOROLOG:WINDSPEED_F" in the name column. This might just be because you're spelling METEOROLOGY wrong.
The way you're using LIKE, you might as well be using =. LIKE is useful when you add % as a wildcard to find other values that resemble that value.
Example:
SELECT *
FROM
TABLE
WHERE
UPPER(NAME) LIKE 'JOSH%'
This would return results for these values in name: JOSHUA, JoShUa, joshua, josh, JOSH. If I instead wrote NAME LIKE 'JOSH', I would only find rows with the exact value JOSH.
Since your search value is all caps, try adding UPPER() to your query, like this:
cur.execute("SELECT tID from titles where UPPER(name) like 'METEOROLOGY:WINDSPEED_F' order by structure, comp1, comp2")
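To illustrate that the colon itself needs no escaping, here is a small sketch that passes the value as a bound parameter. It uses sqlite3 as a stand-in so that it runs without a server; with psycopg2 the placeholder would be %s instead of ?. The sample rows are invented:

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE titles (tID INTEGER, name TEXT)")
cur.executemany(
    "INSERT INTO titles VALUES (?, ?)",
    [(1, "METEOROLOGY:WINDSPEED_F"), (2, "METEOROLOGY:TEMP_F")],  # made-up rows
)

# The value is sent separately from the SQL text, so no quoting is involved.
wind = "METEOROLOGY:WINDSPEED_F"
cur.execute("SELECT tID FROM titles WHERE name = ?", (wind,))
rowswind = cur.fetchall()

# The misspelled value from the question matches nothing.
cur.execute("SELECT tID FROM titles WHERE name = ?", ("METEOROLOG:WINDSPEED_F",))
misspelled = cur.fetchall()

print(rowswind, misspelled)
```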

Substituting column names in Python sqlite3 query [duplicate]

This question already has answers here:
How do you escape strings for SQLite table/column names in Python?
(8 answers)
Closed 7 years ago.
I have a wide table in a sqlite3 database, and I wish to dynamically query certain columns in a Python script. I know that it's bad to inject parameters by string concatenation, so I tried to use parameter substitution instead.
I find that, when I use parameter substitution to supply a column name, I get unexpected results. A minimal example:
import sqlite3 as lite
db = lite.connect("mre.sqlite")
c = db.cursor()
# Insert some dummy rows
c.execute("CREATE TABLE trouble (value real)")
c.execute("INSERT INTO trouble (value) VALUES (2)")
c.execute("INSERT INTO trouble (value) VALUES (4)")
db.commit()
# Column name written directly into the SQL
for row in c.execute("SELECT AVG(value) FROM trouble"):
    print(row)  # Returns 3
# Column name supplied as a bound parameter
for row in c.execute("SELECT AVG(:name) FROM trouble", {"name" : "value"}):
    print(row)  # Returns 0
db.close()
Is there a better way to accomplish this than simply injecting a column name into a string and running it?
As Rob just indicated in his comment, there was a related SO post that contains my answer. These substitution constructions are called "placeholders," which is why I did not find the answer on SO initially. There is no placeholder pattern for column names, because dynamically specifying columns is not a code safety issue:
It comes down to what "safe" means. The conventional wisdom is that
using normal python string manipulation to put values into your
queries is not "safe". This is because there are all sorts of things
that can go wrong if you do that, and such data very often comes from
the user and is not in your control. You need a 100% reliable way of
escaping these values properly so that a user cannot inject SQL in a
data value and have the database execute it. So the library writers do
this job; you never should.
If, however, you're writing generic helper code to operate on things
in databases, then these considerations don't apply as much. You are
implicitly giving anyone who can call such code access to everything
in the database; that's the point of the helper code. So now the
safety concern is making sure that user-generated data can never be
used in such code. This is a general security issue in coding, and is
just the same problem as blindly execing a user-input string. It's a
distinct issue from inserting values into your queries, because there
you want to be able to safely handle user-input data.
So, the solution is that there is no problem in the first place: inject the column names using string formatting, be happy, and move on with your life.
Why not use string formatting?
for row in c.execute("SELECT AVG({name}) FROM trouble".format(**{"name" : "value"})):
    print(row)  # => (3.0,)
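If the column name could ever come from outside your own code, one option is to check it against the table's real columns before formatting it into the query. This is an extra precaution that goes beyond what the answers above suggest. The sketch below assumes an in-memory copy of the trouble table with a second made-up column, and average_of is a hypothetical helper:

```python
import sqlite3

db = sqlite3.connect(":memory:")
c = db.cursor()
c.execute("CREATE TABLE trouble (value real, other real)")
c.executemany(
    "INSERT INTO trouble (value, other) VALUES (?, ?)",
    [(2, 10), (4, 20)],
)


def average_of(column):
    """Average one column, accepting only names that exist in the table."""
    c.execute("PRAGMA table_info(trouble)")
    allowed = {row[1] for row in c.fetchall()}  # row[1] is the column name
    if column not in allowed:
        raise ValueError("unknown column: {!r}".format(column))
    # Safe to format: the name was checked against the schema above.
    c.execute('SELECT AVG("{}") FROM trouble'.format(column))
    return c.fetchone()[0]


print(average_of("value"))
print(average_of("other"))
```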

Python, SQL: How to update multiple rows and columns in a single trip around the database?

Hello StackEx community.
I am implementing a relational database using SQLite interfaced with Python. My table consists of 5 attributes with around a million tuples.
To avoid large number of database queries, I wish to execute a single query that updates 2 attributes of multiple tuples. These updated values depend on the tuples' Primary Key value and so, are different for each tuple.
I am trying something like the following in Python 2.7:
stmt= 'UPDATE Users SET Userid (?,?), Neighbours (?,?) WHERE Username IN (?,?)'
cursor.execute(stmt, [(_id1, _Ngbr1, _name1), (_id2, _Ngbr2, _name2)])
In other words, I am trying to update the rows that have Primary Keys _name1 and _name2 by substituting the Neighbours and Userid columns with corresponding values. The execution of the two statements returns the following error:
OperationalError: near "(": syntax error
I am reluctant to use executemany() because I want to reduce the number of trips across the database.
I have been struggling with this issue for a couple of hours now but couldn't figure out either the error or an alternative on the web. Please help.
Thanks in advance.
If the column that is used to look up the row to update is properly indexed, then executing multiple UPDATE statements would be likely to be more efficient than a single statement, because in the latter case the database would probably need to scan all rows.
Anyway, if you really want to do this, you can use CASE expressions (and explicitly numbered parameters, to avoid duplicates):
UPDATE Users
SET Userid = CASE Username
                 WHEN ?5 THEN ?1
                 WHEN ?6 THEN ?2
             END,
    Neighbours = CASE Username
                     WHEN ?5 THEN ?3
                     WHEN ?6 THEN ?4
                 END
WHERE Username IN (?5, ?6);
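A runnable sketch of both approaches, assuming an in-memory Users table with three invented rows. The six values are passed in the order of the numbered parameters ?1 to ?6, and the second half shows the executemany() alternative for comparison:

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute(
    "CREATE TABLE Users (Username TEXT PRIMARY KEY, Userid INTEGER, Neighbours TEXT)"
)
cur.executemany(
    "INSERT INTO Users VALUES (?, ?, ?)",
    [("alice", 0, ""), ("bob", 0, ""), ("carol", 0, "")],  # illustrative rows
)

# Single statement: ?1/?2 are the ids, ?3/?4 the neighbours, ?5/?6 the usernames.
stmt = """
UPDATE Users
SET Userid = CASE Username WHEN ?5 THEN ?1 WHEN ?6 THEN ?2 END,
    Neighbours = CASE Username WHEN ?5 THEN ?3 WHEN ?6 THEN ?4 END
WHERE Username IN (?5, ?6)
"""
cur.execute(stmt, (1, 2, "bob", "alice", "alice", "bob"))
after_case = cur.execute(
    "SELECT Username, Userid, Neighbours FROM Users ORDER BY Username"
).fetchall()

# Alternative: one indexed UPDATE per row, all inside the same transaction.
cur.executemany(
    "UPDATE Users SET Userid = ?, Neighbours = ? WHERE Username = ?",
    [(10, "carol", "alice"), (30, "alice", "carol")],
)
con.commit()
after_many = cur.execute(
    "SELECT Username, Userid, Neighbours FROM Users ORDER BY Username"
).fetchall()

print(after_case)
print(after_many)
```

With SQLite the database runs inside the same process, so executemany() does not involve a network round trip per row.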
