I am running the following SQL code in Python:
SELECT
FIN AS 'LIN',
CUSIP,
Borrower_Name,
Alias,
DS_Maturity,
Spread,
Facility,
Facility_Size,
Log_date
FROM
[Main].[FacilityInformation]
WHERE
CUSIP IN ('00485GAC2', 'N1603LAD9')
OR (YEAR(DS_Maturity) in (2019,2024)
AND ((Borrower_Name LIKE 'Acosta Inc , Bright Bidco BV%'
OR Alias LIKE 'Acosta 9/14 (18.61) Non-Extended Cov-Lite, Lumileds 2/18 Cov-Lite%')))
It works perfectly when I have 3 or 4 borrower names, CUSIPs, or aliases, but I am trying to run this with dozens of possible values. I thought that following the same logic as IN ('{}') with LIKE '{}%' would work, but it doesn't. So I want efficient code, not something like:
SELECT * FROM table WHERE
column LIKE 'text1%'
OR column LIKE 'text2%'
.
.
.
OR column LIKE 'textn%'
This works if you know in advance exactly how many patterns you have to include, but writing it out 30 or more times is tedious, and it becomes unmanageable for a large number of borrower names or CUSIPs. It is not efficient. I hope it is clear what I am trying to ask.
You can join using VALUES and LIKE:
-- Test data
DECLARE @s TABLE (mytext NVARCHAR(20))
INSERT INTO @s VALUES ('abc'), ('def'), ('ghi')
-- Select
SELECT
s.mytext
FROM @s s
INNER JOIN (VALUES ('a%'),('%h%')) n(wildcard) ON s.mytext LIKE n.wildcard
Or you can do it by using a table:
DECLARE @s TABLE (mytext NVARCHAR(20))
DECLARE @t TABLE (wildcard NVARCHAR(20))
INSERT INTO @s VALUES ('abc'), ('def'), ('ghi')
INSERT INTO @t VALUES ('a%'), ('%h%')
SELECT s.mytext FROM @s s
INNER JOIN @t t ON s.mytext LIKE t.wildcard
Both give this result:
mytext
------
abc
ghi
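For completeness, here is a minimal sketch of how the wildcard list could be supplied from Python using the VALUES-join idea above. It assumes pyodbc against SQL Server; the DSN, the column list, and the pattern values are placeholders, not part of the original question.
import pyodbc

# A sketch only: the DSN and pattern values below are hypothetical.
patterns = ['Acosta Inc%', 'Bright Bidco BV%', 'Lumileds 2/18 Cov-Lite%']  # could be dozens

conn = pyodbc.connect('DSN=MyDsn')  # hypothetical connection string
cursor = conn.cursor()

# One "(?)" row per pattern keeps every value parameterized.
placeholders = ','.join('(?)' for _ in patterns)
sql = (
    "SELECT f.FIN AS LIN, f.CUSIP, f.Borrower_Name, f.Alias, f.DS_Maturity "
    "FROM [Main].[FacilityInformation] f "
    "INNER JOIN (VALUES " + placeholders + ") AS n(wildcard) "
    "ON f.Borrower_Name LIKE n.wildcard"
)
cursor.execute(sql, patterns)
rows = cursor.fetchall()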
How to query against values from different data frame columns with table.column_name combinations in SQLAlchemy using the or_ statement.
I'm working on an SQLAlchemy project where I pull down the valid columns of a dataframe and enter them all into SQLAlchemy's filter. I've successfully got it running where it enters all the entries of a column, using the head of the column, like this:
qry = qry.filter(or_(*[getattr(Query_Tbl,column_head).like(x) \
for x in (df[column_head].dropna().values)]))
This produced the pattern I was looking for: (tbl.column1 like a OR tbl.column1 like b ...) AND ... etc.
However, there are groups of dataframe columns that are different but still need to be placed together within the same or_ clause,
i.e. (The desired result)
(tbl1.col1 like a OR tbl.col1 like b OR tbl.col2 like c OR tbl.col2 like d OR tbl.col3 like e...) etc.
My latest attempt was to sub-group the columns I needed grouped together, then repeat the previous style inside those groups like:
qry = qry.filter(or_((*[getattr(Query_Tbl, set_id[0]).like(x) \
for x in (df[set_id[0]].dropna().values)]),
(*[getattr(Query_Tbl, set_id[1]).like(y) \
for y in (df[set_id[1]].dropna().values)]),
(*[getattr(Query_Tbl, set_id[2]).like(z) \
for z in (df[set_id[2]].dropna().values)])
))
Here set_id is a list of 3 strings corresponding to column1, column2, and column3, so I should get the designated result; however, this simply produces:
(What I'm actually getting)
(tbl.col1 like a OR tbl.col1 like b..) AND (tbl.col2 like c OR tbl.col2 like d...) AND (tbl.col3 like e OR...)
Is there a better way to go about this in SQLAlchemy to get the result I want, or would it be better to find a way of feeding column values from Pandas directly into getattr() to work it into my existing code?
Thank you for reading and in advance for your help!
It appears I was having issues with the way the dataframe was formatted, and I was reading column names into groups differently. This pattern works for anyone who wants to process multiple dataframe columns into the same OR statement.
I apologize for the confusion; if anyone has comments or questions on the subject, I am happy to help with this type of issue.
Alternatively, I found a much cleaner answer. Since SQLAlchemy's or_ function can be used with a variable column if you use Python's built-in getattr() function, you only need to create (column, value) pairs, which you can then unpack in a loop.
for group in [group_2, group_3]:
    set_id = list(set(df.columns.values) & set(group))
    if len(set_id) > 1:
        set_tuple = list()
        for column in set_id:
            for value in df[column].dropna().values:
                set_tuple.append((column, value))
        print(set_tuple)
        qry = qry.filter(or_(*[getattr(Query_Tbl, id).like(x) for id, x in set_tuple]))
        df = df.drop(group, axis=1)
If you know which columns need to be grouped in the or_ statement, you can put them into lists and iterate through them. Inside those, you create a list of tuples holding the (column, value) pairs you need. Then within the or_ function you unpack the columns and values in a loop and assign them accordingly. The code is much easier to read and far more compact. I found this to be a more robust solution than explicitly writing out cases for the group sizes.
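As a self-contained illustration of the (column, value) unpacking described above, here is a minimal sketch. The pairs are made up for the example; Query_Tbl and qry are the model and query from the question.
from sqlalchemy import or_

# Hypothetical (column, value) pairs; in practice these come from the dataframe groups.
set_tuple = [('col1', 'a%'), ('col1', 'b%'), ('col2', 'c%')]

# One or_() over every pair, regardless of which column each pattern targets.
clause = or_(*[getattr(Query_Tbl, col).like(val) for col, val in set_tuple])
qry = qry.filter(clause)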
I have a database table with 3 columns (A, B, C). I want to add some rows to the table, and for that I am going to take input from the user with a 'textentrydialog' like this: https://pastebin.com/0JYm5x6e. But the problem is that I want to insert multiple rows for multiple values of A while the values of B and C stay the same. For example:
B = Ram
C = Aam
A = s,t,k
So the rows should be inserted into the table like this:
(s,Ram,Aam)
(t,Ram,Aam)
(k,Ram,Aam)
Can someone please help me with how to do this insert?
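For reference, one way the expansion itself could look in Python, as a sketch assuming sqlite3 and an existing table named mytable (the actual dialog and table come from the question's code, not from here):
import sqlite3

conn = sqlite3.connect("example.db")
cur = conn.cursor()

a_values = "s,t,k"   # the comma-separated input for A
b, c = "Ram", "Aam"  # single values for B and C

# Build one (A, B, C) row per value of A and insert them all at once.
rows = [(a.strip(), b, c) for a in a_values.split(",")]
cur.executemany("INSERT INTO mytable (A, B, C) VALUES (?, ?, ?)", rows)
conn.commit()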
Here is a proposal that produces the output you have shown from the input you have shown.
Note that I assume you insist on that way of inputting, which implies using a single table.
If you can accept different input, I recommend using two tables instead:
one with (id, A, C) and one with (id, B), which you then query with a JOIN ... USING(id).
An MCVE is at the end of the answer. It contains some additional test cases, which I made up to demonstrate that it does not only give output for the given input; I tried to guess the obvious use cases.
Query:
select A, group_concat(B), C
from toy
group by A,C;
Output:
Mar|t,u|Aam
Ram|s,t,k|Aam
Ram|k,s,m|Maa
MCVE:
PRAGMA foreign_keys=OFF;
BEGIN TRANSACTION;
CREATE TABLE toy (A varchar(10), B varchar(10), C varchar(10));
INSERT INTO toy VALUES('Ram','s','Aam');
INSERT INTO toy VALUES('Ram','t','Aam');
INSERT INTO toy VALUES('Ram','k','Aam');
INSERT INTO toy VALUES('Mar','t','Aam');
INSERT INTO toy VALUES('Mar','u','Aam');
INSERT INTO toy VALUES('Ram','k','Maa');
INSERT INTO toy VALUES('Ram','s','Maa');
INSERT INTO toy VALUES('Ram','m','Maa');
COMMIT;
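For the alternative two-table layout mentioned above, a minimal sketch (assuming SQLite; the table and column names here are illustrative only) could look like this:
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE ac (id INTEGER, A varchar(10), C varchar(10));
CREATE TABLE b  (id INTEGER, B varchar(10));
INSERT INTO ac VALUES (1, 'Ram', 'Aam');
INSERT INTO b  VALUES (1, 's'), (1, 't'), (1, 'k');
""")

# JOIN ... USING(id) reproduces the (A, B, C) rows from the question.
for row in cur.execute("SELECT A, B, C FROM ac JOIN b USING(id)"):
    print(row)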
I am querying a MySQL database (fintech_16) through Python (pymysql) to get the UNIQUE values of a column (Trend). When I ran the following query:
cursor.execute ("SELECT DISTINCT `Trend` FROM `fintech_16` ")
cursor.fetchall()
I got the following result:
((u'Investments',),
(u'Expansion',),
(u'New Products',),
(u'Collaboration',),
(u'New Products,Investments',),
(u'New Products,Expansion',),
(u'Expansion,Investments',),
(u'New Products,Collaboration',),
(u'Regulations',),
(u'Investments,New Products',),
(u'Investments,Expansion',),
(u'Collaboration,Investments',),
(u'Expansion,New Products',),
(u'Collaboration,New Products',))
Now, since some of the ids have more than one trend, the DB counts each combination as a separate trend.
How should I tweak my query to get only the 5 trends (Investments, Expansion, New Products, Collaboration, Regulations) along with their counts?
Though there are only 5, I could use LIKE '%Investments%' to get each count manually, but I want the code/query to do it.
TIA
The first approach is to use the SET data type to define the exact values; you have 5 - ['New Products', 'Investments', 'Expansion'...].
Then you can use the FIND_IN_SET function to count the values you need:
SELECT COUNT(*) FROM `fintech_16` WHERE FIND_IN_SET('New Products', `Trend`) > 0;
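From the Python side, a minimal sketch of collecting all five counts with FIND_IN_SET, assuming pymysql and the table/column names from the question (the connection credentials are placeholders):
import pymysql

conn = pymysql.connect(host="localhost", user="user", password="pass", db="mydb")  # placeholder credentials
cursor = conn.cursor()

trends = ["Investments", "Expansion", "New Products", "Collaboration", "Regulations"]
counts = {}
for trend in trends:
    # FIND_IN_SET matches a whole element of the comma-separated Trend value.
    cursor.execute(
        "SELECT COUNT(*) FROM `fintech_16` WHERE FIND_IN_SET(%s, `Trend`) > 0",
        (trend,),
    )
    counts[trend] = cursor.fetchone()[0]

print(counts)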
I am facing a performance problem in my code. I make a DB connection, run a select query, and then insert into a table. Around 500 rows are populated by one select query. Before inserting I run the select query around 8-9 times, and then insert everything using cursor.executemany. But it is taking 2 minutes to insert, which is not good. Any ideas?
def insert1(id, state, cursor):
    cursor.execute("select * from qwert where asd_id = %s", [id])
    if some_condition:
        adding.append(rd[i])
    cursor.executemany(indata, adding)
where rd[i] is an array used to build the records and indata is an insert statement
# program starts here
cursor.execute("select * from assd")
for rows in cursor.fetchall():
    if rows[1] == 'aq':
        insert1(rows[1], rows[2], cursor)
    if rows[1] == 'qw':
        insert2(rows[1], rows[2], cursor)
I don't really understand why you're doing this.
It seems that you want to insert a subset of rows from "assd" into one table, and another subset into another table?
Why not just do it with two SQL statements, structured like this:
insert into tab1 select * from assd where asd_id = 42 and cond1 = 'set';
insert into tab2 select * from assd where asd_id = 42 and cond2 = 'set';
That'd dramatically reduce your number of roundtrips to the database and your client-server traffic. It'd also be an order of magnitude faster.
Of course, I'd also strongly recommend that you specify your column names in both the insert and select parts of the code.
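A rough sketch of how that could look from Python (the table names, conditions, and column lists below are placeholders taken from the example above, not a tested schema; an open DB-API connection conn is assumed):
# Two set-based statements instead of hundreds of per-row round trips.
cursor = conn.cursor()
cursor.execute(
    "INSERT INTO tab1 (col_a, col_b) "
    "SELECT col_a, col_b FROM assd WHERE asd_id = %s AND cond1 = 'set'",
    [42],
)
cursor.execute(
    "INSERT INTO tab2 (col_a, col_b) "
    "SELECT col_a, col_b FROM assd WHERE asd_id = %s AND cond2 = 'set'",
    [42],
)
conn.commit()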
I have to parse a very complex dump (whatever it is). I have done the parsing with Python. Since the parsed data is very large, I had to feed it into a database (SQL). I have also done this. Now I have to compare the data that is in the SQL database.
Actually, I have to compare the data of the 1st dump with the data of the 2nd dump. Both dumps have the same fields (attributes), but the values of those fields may differ. So I have to detect these changes, and for that I have to do the comparison. But I don't know how to do all this using Python as my front end.
If you don't have MINUS or EXCEPT, there is also this, which will show all non-matching rows using a UNION/GROUP BY trick
SELECT MAX(src), data1, data2
FROM (
SELECT 'foo1' AS src, foo1.data1, foo1.data2 FROM foo1
UNION ALL
SELECT 'foo2' AS src, foo2.data1, foo2.data2 FROM foo2
) AS X
GROUP BY data1, data2
HAVING COUNT(*) = 1
ORDER BY data1, data2
I have a general-purpose table-compare stored procedure that can also do a more complex table comparison, with left, right, and inner joins, a monetary threshold (or threshold percentage), and subset criteria.
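If the comparison has to be driven from Python anyway, a sketch of running the UNION/GROUP BY query above and reporting the non-matching rows might look like this (assuming an open DB-API connection conn and the foo1/foo2 tables from the example):
# Each row returned by the query exists in only one of the two dumps.
compare_sql = """
SELECT MAX(src) AS src, data1, data2
FROM (
    SELECT 'foo1' AS src, data1, data2 FROM foo1
    UNION ALL
    SELECT 'foo2' AS src, data1, data2 FROM foo2
) AS X
GROUP BY data1, data2
HAVING COUNT(*) = 1
ORDER BY data1, data2
"""
cursor = conn.cursor()
cursor.execute(compare_sql)
for src, data1, data2 in cursor.fetchall():
    print(f"only in {src}: {data1!r}, {data2!r}")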
Why not do the 'detect change' step in SQL? Something like:
select foo.data1, foo.data2 from foo where foo.id = 'dump1'
minus
select foo.data1, foo.data2 from foo where foo.id = 'dump2'