Create or populate a table in python based on other tables - python

I am new to python/pandas and have a fairly basic question.
I have 2 tables with numerous columns and "ID" as the primary key.
I want to create another table with conditions based on the 2 tables.
For example: Table A, Table B --> Table C
In SQL I would write something like this:
create table TableC as select
a.ID,
case when b.Field1=1000 and a.Field1=50 then 20 else 0 end as FieldA,
case when b.Field2=15 and a.Field2=100 then 100 else 0 end as FieldB
from TableA a, TableB b
where a.ID=b.ID
order by 1
I am struggling to put together a similar Table C using python.
I have tried to write a function, but I can't seem to include more than one table in the function, nor create a new table based on multiple tables.
Any help will be much appreciated.

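If I understand correctly:
import numpy as np
# After the merge, Field1_x / Field2_x come from TableA and Field1_y / Field2_y from TableB
TableC = TableA.merge(TableB, on='ID')
TableC['FieldA'] = np.where((TableC.Field1_y == 1000) & (TableC.Field1_x == 50), 20, 0)
TableC['FieldB'] = np.where((TableC.Field2_y == 15) & (TableC.Field2_x == 100), 100, 0)
TableC = TableC[['ID', 'FieldA', 'FieldB']].sort_values('ID')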
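A self-contained sketch of the SQL above, using made-up rows (the sample values are assumptions for illustration only). After the merge, pandas suffixes the overlapping column names: _x for the left frame (TableA, alias a in the SQL) and _y for the right frame (TableB, alias b).

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
TableA = pd.DataFrame({'ID': [1, 2, 3], 'Field1': [50, 50, 10], 'Field2': [100, 5, 100]})
TableB = pd.DataFrame({'ID': [3, 1, 2], 'Field1': [1000, 1000, 999], 'Field2': [15, 15, 15]})

# Inner join on ID, the equivalent of "where a.ID = b.ID".
merged = TableA.merge(TableB, on='ID')

# case when b.Field1 = 1000 and a.Field1 = 50 then 20 else 0 end
merged['FieldA'] = np.where((merged['Field1_y'] == 1000) & (merged['Field1_x'] == 50), 20, 0)
# case when b.Field2 = 15 and a.Field2 = 100 then 100 else 0 end
merged['FieldB'] = np.where((merged['Field2_y'] == 15) & (merged['Field2_x'] == 100), 100, 0)

# Keep only the selected columns and apply "order by 1".
TableC = merged[['ID', 'FieldA', 'FieldB']].sort_values('ID').reset_index(drop=True)
```

np.where evaluates the whole column at once, so no row-by-row function is needed.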

Related

Joining portions of a python dictionary using a reference dataframe

I have a dictionary of dataframes with keys that look like this. It's called frames1.
dict_keys(['TableA','TableB','TableC','TableD'])
I also have a 'master' dataframe that tells me how to join these dataframes.
Gold Table | Silver Table 1 | Silver Table 2 | Join Type | Left_Attr | Right_Attr
System     | Table A        | Table B        | left      | ID        | applic_id
System     | Table C        | Table A        | right     | fam       | famid
System     | Table A        | Table D        | left      | NameID    | name
The "System" gold table is the combination of all 3 rows. In other words, I need to join Table A to Table B on the attributes listed, and then use that output as my NEW Table A when I join Table C and Table A in row 2. Then I need to use that result as my NEW Table A to join to Table D. This creates the final "System" table.
What I've tried:
for i in range(len(master)):
    System = pd.merge(frames1[master.iloc[i,1]], frames1[master.iloc[i,2]], how=master.iloc[i,3], left_on=master.iloc[i,4], right_on=master.iloc[i,5])
This only gets me one row which will then over write the other rows as it goes on. How would I go about creating a for loop to join these together?
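One way to approach this, as a sketch rather than a definitive answer: keep a working copy of the dictionary and write each merge result back under the key of the table that carries the running result, so later rows pick it up. The function name build_gold, the anchor parameter, and the sample rows are all hypothetical, and the sketch assumes the names in the master table match the dictionary keys exactly (e.g. 'TableA', not 'Table A').

```python
import pandas as pd

def build_gold(frames, master, anchor):
    """Apply each master row in order, storing the running result under `anchor`."""
    work = dict(frames)  # shallow copy so the caller's dictionary is left untouched
    for row in master.itertuples(index=False):
        _, left_name, right_name, how, left_attr, right_attr = row
        # Overwrite the anchor entry so the next row joins against the new table.
        work[anchor] = pd.merge(
            work[left_name], work[right_name],
            how=how, left_on=left_attr, right_on=right_attr,
        )
    return work[anchor]

# Hypothetical sample frames; column names follow the master table in the question.
frames1 = {
    'TableA': pd.DataFrame({'ID': [1, 2], 'famid': [10, 20], 'NameID': [100, 200]}),
    'TableB': pd.DataFrame({'applic_id': [1, 2], 'b_val': ['b1', 'b2']}),
    'TableC': pd.DataFrame({'fam': [10, 20], 'c_val': ['c1', 'c2']}),
    'TableD': pd.DataFrame({'name': [100, 200], 'd_val': ['d1', 'd2']}),
}
master = pd.DataFrame(
    [['System', 'TableA', 'TableB', 'left', 'ID', 'applic_id'],
     ['System', 'TableC', 'TableA', 'right', 'fam', 'famid'],
     ['System', 'TableA', 'TableD', 'left', 'NameID', 'name']],
    columns=['Gold Table', 'Silver Table 1', 'Silver Table 2',
             'Join Type', 'Left_Attr', 'Right_Attr'],
)

system = build_gold(frames1, master, anchor='TableA')
```

Because the result is written back under the anchor key, it does not matter whether Table A appears on the left or the right side of a later row.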

Can I update existing rows in a MYSQL table with rows in a Pandas DataFrame based on an ID that exists in both?

I currently have a MYSQL table that I pull into a Pandas DF with the pd.read_sql_query function.
Example table/DF before added data:
ID
Value
1
2
3
4
I then add some data to an existing column in the MYSQL table in Pandas.
Example DF after added data:
ID | Value
1  | 5
2  | 6
3  | 2
4  | 9
I then want to update the rows in the MYSQL database based on the ID column.
I have tried the following:
df.to_sql(con=conn,name='table_name',if_exists='replace',index=False)
However, I specifically want to update the specific rows and not drop and replace the table itself.
I have tried using SQLAlchemy functions, but I don't have enough experience with SQL to implement it correctly.
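A minimal sketch of one possible approach: issue a parameterized UPDATE per row with executemany instead of replacing the table. To keep the example runnable, an in-memory SQLite database stands in for MySQL (an assumption for illustration); the table and column names follow the question. With a MySQL driver such as PyMySQL, the placeholder would be %s instead of ?.

```python
import sqlite3
import pandas as pd

# In-memory SQLite stands in for the MySQL database so the sketch runs anywhere.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (ID INTEGER PRIMARY KEY, Value INTEGER)")
conn.executemany("INSERT INTO table_name (ID) VALUES (?)", [(1,), (2,), (3,), (4,)])

# The DataFrame after the new values were added in pandas.
df = pd.DataFrame({"ID": [1, 2, 3, 4], "Value": [5, 6, 2, 9]})

# One (Value, ID) pair per row, converted to plain Python ints for the driver.
params = [(int(v), int(i)) for v, i in zip(df["Value"], df["ID"])]

# Update only the matching rows; the table itself is never dropped.
conn.executemany("UPDATE table_name SET Value = ? WHERE ID = ?", params)
conn.commit()

rows = conn.execute("SELECT ID, Value FROM table_name ORDER BY ID").fetchall()
```

Rows whose ID is not in the DataFrame are left as they are, which is the difference from if_exists='replace'.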

Pandas convert data from two tables into third table. Cross Referencing and converting unique rows to columns

I have the following tables:
Table A
listData = {'id':[1,2,3],'date':['06-05-2021','07-05-2021','17-05-2021']}
tableA = pd.DataFrame(listData,columns=['id','date'])
Table B
detailData = {'code':['D123','F268','A291','D123','F268','A291'],'id':['1','1','1','2','2','2'],'stock':[5,5,2,10,11,8]}
tableB = pd.DataFrame(detailData,columns=['code','id','stock'])
OUTPUT TABLE
output = {'code':['D123','F268','A291'],'06-05-2021':[5,5,2],'07-05-2021':[10,11,8]}
pd.DataFrame(output,columns=['code','06-05-2021','07-05-2021'])
Note: The code above hard-codes the output. I need to generate the output table from Table A and Table B.
Here is a brief explanation of how the output table is generated, in case it is not self-explanatory.
The id column needs to be cross-referenced from Table A to Table B, and the dates should replace the ids in Table B.
Then all the unique dates in Table B should be made into columns, and the corresponding stock values moved into the newly created date columns.
I am not sure where to start. I am new to pandas and have only ever used it for simple data manipulation. If anyone can suggest where to get started, it would be a great help.
Try:
tableA['id'] = tableA['id'].astype(str)
tableB.merge(tableA, on='id').pivot(index='code', columns='date', values='stock')
Output:
date  06-05-2021  07-05-2021
code
A291           2           8
D123           5          10
F268           5          11
Details:
First, merge on id, which works like a SQL join. The dtypes of the
key columns must match, hence the astype(str) on tableA.
Next, reshape the dataframe using pivot to get code by date.
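The question's sample data and the two steps above can be put together as a runnable sketch. It passes index, columns and values to pivot by keyword, which recent pandas versions require; the names result and flat are only illustrative.

```python
import pandas as pd

tableA = pd.DataFrame({'id': [1, 2, 3],
                       'date': ['06-05-2021', '07-05-2021', '17-05-2021']})
tableB = pd.DataFrame({'code': ['D123', 'F268', 'A291', 'D123', 'F268', 'A291'],
                       'id': ['1', '1', '1', '2', '2', '2'],
                       'stock': [5, 5, 2, 10, 11, 8]})

# Key dtypes must match before merging: tableB stores id as strings.
tableA['id'] = tableA['id'].astype(str)

# Inner join on id, then one row per code and one column per date.
result = (tableB.merge(tableA, on='id')
                .pivot(index='code', columns='date', values='stock'))

# Optional: turn 'code' back into a regular column, as in the requested output.
flat = result.reset_index()
flat.columns.name = None
```

The date 17-05-2021 does not appear as a column because no row in Table B refers to id 3 and merge defaults to an inner join.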

Is there an Excel function to look up & not stop at first match but look for all the values (dates) and return "Yes/No" if it matches certain date

I have two data sets in Excel, shown below as Table 1 and Table 2. For each row in Table 1, I want a Yes/No result indicating whether its date matches any date listed for the corresponding ID in Table 2 (see the Result column in Table 1). How can this be achieved using Excel formulas? Thanks
Table 1 (image not preserved)
Table 2 (image not preserved)
You could try this:
The formula I've used is:
=IF(COUNTIFS($G$3:$G$6;A3;$H$3:$H$6;B3)=0;"No";"Yes")
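For readers following along in pandas, the logic of the formula (count the rows in Table 2 where both the ID and the date match, and answer "No" when the count is zero) can be sketched as below. The sample rows and column names are assumptions, since the original tables were only shown as images.

```python
import pandas as pd

# Hypothetical data standing in for the two tables in the question.
table1 = pd.DataFrame({"ID": [1, 2, 3],
                       "Date": ["2021-01-01", "2021-01-02", "2021-01-03"]})
table2 = pd.DataFrame({"ID": [1, 1, 2, 3],
                       "Date": ["2021-01-05", "2021-01-01", "2021-01-09", "2021-01-03"]})

# Every (ID, Date) pair in Table 2; membership plays the role of COUNTIFS(...) > 0,
# so all rows for an ID are considered, not just the first match.
pairs = set(zip(table2["ID"], table2["Date"]))

table1["Result"] = ["Yes" if pair in pairs else "No"
                    for pair in zip(table1["ID"], table1["Date"])]
```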

How to add values from one column to another table using join?

I am having difficulty merging 2 tables. I would like to add a column from table B to table A based on one key.
Table A (632 rows) contains the following columns:
part_number / part_designation / AC / AC_program
Table B (4,674 rows) contains the following columns:
part_ref / supplier_id / supplier_name / ac_program
I would like to add the supplier_name values to Table A.
I have managed to build a left join on the condition tableA.part_number == tableB.part_ref
However, the resulting table contains additional rows: I now have 683 rows instead of the initial 632 rows in Table A. How do I keep the same number of rows while including the supplier_name values in Table A? Below is a diagram of my transformations:
Here is my code:
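Table B seems to contain duplicate part_ref values. A join produces one output row for every matching row in Table B, so each duplicated part_ref adds extra rows to the result. Compare the number of unique part_ref values with the row count:
import pandas as pd
# Number of distinct part_ref values
print(len(pd.unique(updated_ref_table.part_ref)))
# Total number of rows; if this is larger, part_ref contains duplicates
print(updated_ref_table.shape[0])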
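If the duplicates are confirmed, one possible fix is to reduce Table B to one row per part_ref before joining. The sketch below uses made-up rows and hypothetical variable names (table_a, table_b, suppliers); only the column names come from the question. Note that drop_duplicates keeps the first row per part_ref, so if the same part has several different suppliers you would need to decide which one to keep.

```python
import pandas as pd

# Hypothetical sample rows; P1 appears twice in Table B.
table_a = pd.DataFrame({"part_number": ["P1", "P2", "P3"],
                        "part_designation": ["bolt", "nut", "washer"]})
table_b = pd.DataFrame({"part_ref": ["P1", "P1", "P2"],
                        "supplier_name": ["Acme", "Acme", "Globex"]})

# Keep one row per part_ref so each Table A row can match at most once.
suppliers = table_b.drop_duplicates(subset="part_ref")[["part_ref", "supplier_name"]]

merged = table_a.merge(
    suppliers,
    how="left",
    left_on="part_number",
    right_on="part_ref",
    validate="many_to_one",  # raises MergeError if part_ref is still duplicated
).drop(columns="part_ref")
```

The validate argument is a cheap safeguard: the merge fails loudly instead of silently adding rows.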
