Selecting minimum and grouping - python

I have the following table:
id | product_id | quantity
--------------------------
1 | 222 | 25
2 | 222 | 35
3 | 223 | 10
Now I want to select the lowest quantities grouped by product_id. In SQL this works as
SELECT product_id, MIN(quantity) FROM my_table GROUP BY product_id
The result of this query is the following
product_id | MIN(quantity)
--------------------------
222 | 25
223 | 10
However, how can I use Django's database models to do the same?
I tried
myModel.objects.filter(product__in=product_ids).annotate(Min("quantity")).values_list("product", "quantity__min")
This returns the full table.

objects = (MyModel.objects
           .values('product')
           .annotate(Min('quantity'))
           # if you want to get values_list
           .values_list('product', 'quantity__min'))
https://docs.djangoproject.com/en/1.8/topics/db/aggregation/#values
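The key point is that .values('product') must come before .annotate(), so the aggregate is computed per group rather than per row; that is why the asker's original query returned the full table. A minimal sketch that applies the same product filter (model and field names are taken from the question; the min_quantity label is just an illustrative alias):

from django.db.models import Min

# Restrict to the asker's product_ids, group by product, then take the
# per-group minimum quantity.
lowest = (MyModel.objects
          .filter(product__in=product_ids)
          .values('product')                        # acts as GROUP BY product
          .annotate(min_quantity=Min('quantity'))
          .values_list('product', 'min_quantity'))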

Related

Find min max from two different tables

I have 3 tables in a MySQL DB
ORDER
order_id | order_date
-------------------------
1 | 2021-09-20
2 | 2021-09-21
PRODUCTS
product_id | product_price
------------------------------
1 | 30
2 | 34
3 | 39
4 | 25
ORDER_PRODUCTS
product_id | order_id | discount_price
------------------------------------------
1 | 1 | null
2 | 1 | 18
1 | 2 | null
4 | 2 | null
Now, given a specific product_id, I want to know the min and max prices of all products in every ORDER record that contains that product, grouped by order_id. I can get the required data for this, but here is the tricky part: the ORDER_PRODUCTS table holds the discount_price of a particular product in a specific ORDER.
So, when computing the MIN and MAX values I want discount_price to take priority over product_price; if a product has no discount_price, then product_price should be used.
EX:
order_id | min_price | max_price
------------------------------------------------------
1 | 18(p_id=2)(discount price) | 30(p_id=1)
2 | 25(p_id=4) | 30(p_id=1)
If I understand correctly, you are looking for the IFNULL() function; you can read about it here.
You can simply wrap IFNULL() in the appropriate aggregate function:
select o.order_id,
       min(ifnull(op.discount_price, p.product_price)) as min_price,
       max(ifnull(op.discount_price, p.product_price)) as max_price
from PRODUCTS p
inner join ORDER_PRODUCTS op on op.product_id = p.product_id
inner join `ORDER` o on o.order_id = op.order_id
group by o.order_id
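The question also asks to restrict the result to orders that contain a given product_id. One way to do that from Python is to add an EXISTS filter and pass the product as a query parameter. This is only a sketch: the connection details are placeholders, and the driver choice (mysql-connector-python) and the EXISTS subquery are assumptions, not part of the answer above.

import mysql.connector  # assumed driver; any DB-API connector works similarly

conn = mysql.connector.connect(user="user", password="pw", database="shop")  # placeholder credentials
cur = conn.cursor()

query = """
    SELECT o.order_id,
           MIN(IFNULL(op.discount_price, p.product_price)) AS min_price,
           MAX(IFNULL(op.discount_price, p.product_price)) AS max_price
    FROM PRODUCTS p
    INNER JOIN ORDER_PRODUCTS op ON op.product_id = p.product_id
    INNER JOIN `ORDER` o ON o.order_id = op.order_id
    WHERE EXISTS (SELECT 1 FROM ORDER_PRODUCTS x
                  WHERE x.order_id = o.order_id AND x.product_id = %s)
    GROUP BY o.order_id
"""
cur.execute(query, (1,))   # e.g. only orders that contain product_id = 1
for order_id, min_price, max_price in cur.fetchall():
    print(order_id, min_price, max_price)
cur.close()
conn.close()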

Python or Excel: How can you compare 2 columns and then write the value of a 3rd column in a new column?

I have 2 tables that look like this:
Table 1
| ID | Tel. | Name |
|:--:|:----:|:-------:|
| 1 | 1234 | Denis |
| 2 | 4567 | Michael |
| 3 | 3425 | Peter |
| 4 | 3242 | Mary |
Table 2
| ID | Contact Date |
|:--:|:------------:|
| 1 | 2014-05-01 |
| 2 | 2003-01-05 |
| 3 | 2020-01-10 |
| 4 | NULL |
Now I want to compare the first table with the second table on the ID column, to see whether a contact is already in the list of people who were contacted. After that, I want to write the contact date into the first table, so I can see the last contact date in the main table.
How would I do this?
Thanks for any answers!!
Here's a solution using MySQL that might interest you; implementing it in Python afterwards will be easy.
First, let's create table T1:
create table T1(
id integer primary key,
tel integer,
name varchar(100)
);
Second, create table T2:
create table T2(
id integer primary key,
contactDate date
);
Insert data into T1 and T2:
-- Table 'T1' Insertion
insert into T1(id, tel, name) VALUES
(1, 1234, "Denis"),
(2, 4567, "Michael"),
(3, 3425,"Peter"),
(4, 3242, "Mary");
-- Table 'T2' Insertion
insert into T2(id, contactDate) VALUES
(1, 20140501),
(2, 20030105),
(3, 20200110),
(4, Null);
Then, create T3 with a SELECT statement that joins both tables using an INNER JOIN:
CREATE TABLE T3 AS
SELECT T1.id, T1.name, T1.tel, T2.contactDate
FROM T1
INNER JOIN T2 ON T1.id=T2.id;
Then, SELECT to check the results
select * from T3;
OUTPUT
| id | name | tel | contactDate |
|:--:|:-------:|------|-------------|
| 1 | Denis | 1234 | 2014-05-01 |
| 2 | Michael | 4567 | 2003-01-05 |
| 3 | Peter | 3425 | 2020-01-10 |
| 4 | Mary | 3242 | NULL |
I hope this helps. I spent about 3 hours trying to merge the T3 contactDate back into T1, but it was process-heavy. I will attach links that could help you more.
Reference
INNER JOIN
SQL ALTER TABLE Statement
SQL Server INSERT Multiple Rows
INSERT INTO SELECT statement overview and examples
SQL INSERT INTO SELECT Statement
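Since the question also mentions Python: the same join can be done directly in pandas with a merge on the ID column. A sketch, with the DataFrames rebuilt here from the question's two tables:

import pandas as pd

# Table 1 and Table 2 from the question, as DataFrames
t1 = pd.DataFrame({"ID": [1, 2, 3, 4],
                   "Tel.": [1234, 4567, 3425, 3242],
                   "Name": ["Denis", "Michael", "Peter", "Mary"]})
t2 = pd.DataFrame({"ID": [1, 2, 3, 4],
                   "Contact Date": ["2014-05-01", "2003-01-05", "2020-01-10", None]})

# A left merge keeps every row of Table 1 and adds the matching contact date
merged = t1.merge(t2, on="ID", how="left")
print(merged)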

How to make func.sum and group_by output the sum of rows and merge duplicate rows using SQLAlchemy

I want to generate a table that sums the number of books sold and the total amount paid for each distinct book in a given period of time. I need it to show a report of the books that were sold.
My subquery is:
bp = db.session.query(CustomerPurchase.book_category_id,
                      func.sum(CustomerPurchase.amount).label('amount'),
                      func.sum(CustomerPurchase.total_price).label('total_price'))\
    .filter(CustomerPurchase.created_on >= start_date)\
    .filter(CustomerPurchase.created_on <= end_date)\
    .group_by(CustomerPurchase.book_category_id).subquery()
Combined query with a subquery:
cp = CustomerPurchase.query\
    .join(bp, bp.c.book_category_id == CustomerPurchase.book_category_id)\
    .distinct(bp.c.book_category_id)\
    .order_by(bp.c.book_category_id)
My CustomerPurchase table looks like this and the output of my query looks the same:
id | book_category_id | book_title | amount | total_price |
---+------------------+------------+--------+-------------+
1 | 1 | Book A | 10 | 35.00 |
2 | 1 | Book A | 20 | 70.00 |
3 | 2 | Book B | 40 | 45.00 |
Desired output after the query run should be like this:
id | book_category_id | book_title | amount | total_price |
---+------------------+------------+--------+-------------+
1 | 1 | Book A | 30 | 105.00 |
2 | 2 | Book B | 40 | 45.00 |
The above query displays all the books sold to customers from the CustomerPurchase table, but it doesn't SUM the amount and total_price, nor does it merge the duplicates.
I have seen many examples but none of them worked for me. Any help is greatly appreciated! Thanks in advance!
So after a lot of research and trial and error I came up with a query that solved my problem. Basically I used the add_column()/add_columns() methods in SQLAlchemy, which gave me exactly the rows I wanted to display in my report.
bp = db.session.query(CustomerPurchase.book_store_category_id,
                      func.sum(CustomerPurchase.quantity).label('quantity'),
                      func.sum(CustomerPurchase.total_price).label('total'))\
    .filter(CustomerPurchase.created_on >= start_date)\
    .filter(CustomerPurchase.created_on <= end_date)
bp = bp.add_column(BookStore.book_amount)\
    .filter(BookStore.category_id == CustomerPurchase.book_store_category_id)
bp = bp.add_columns(Category.category_name, Category.total_stock_amount)\
    .filter(Category.id == CustomerPurchase.book_store_category_id)
bp = bp.add_column(Category.unit_cost)\
    .filter(Category.id == CustomerPurchase.book_store_category_id)
bp = bp.add_column(Book.stock_amount)\
    .filter(Book.category_id == CustomerPurchase.book_store_category_id)\
    .group_by(BookStore.book_amount, CustomerPurchase.book_store_category_id,
              Category.category_name, Category.unit_cost,
              Category.total_stock_amount, Book.stock_amount)
bp = bp.all()
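For the simpler output shown in the question (one row per book with the summed amount and total_price), the grouped query alone should already be enough, without joining a subquery back to the full table. A minimal sketch using only the model and column names from the question (db, start_date and end_date as defined there):

from sqlalchemy import func

# Aggregate directly on CustomerPurchase, grouped per book
report = (db.session.query(
              CustomerPurchase.book_category_id,
              CustomerPurchase.book_title,
              func.sum(CustomerPurchase.amount).label('amount'),
              func.sum(CustomerPurchase.total_price).label('total_price'))
          .filter(CustomerPurchase.created_on >= start_date,
                  CustomerPurchase.created_on <= end_date)
          .group_by(CustomerPurchase.book_category_id, CustomerPurchase.book_title)
          .all())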

Query items that are not in the list, without having a unique id in one column

I'm looking to create a Python script that displays the items that are no longer available in the store (location 1) but are still available in the warehouse (location 99), i.e. a list of items that need to be restocked. The difficulty is that, instead of setting the item count to 0 for the store, the inventory management system just removes the item completely from the database. On top of that, items don't have a unique ID; they are identified by the combination of two columns, the serial number and the item's size.
The (PostgreSQL) database is called restock and has one table, inventory, which looks like this:
====================================================================
| serialnumber | size | location | itemcount | season |
--------------------------------------------------------------------
| 1120.10.0369 | 140 | 99 | 8 | 74 |
| 1120.10.0369 | 140 | 1 | 2 | 74 |
| 1120.10.4011 | 170/176 | 99 | 3 | 74 |
| 1120.10.4011 | 170/176 | 1 | 2 | 74 |
| 1120.10.4011 | 86/92 | 99 | 1 | 74 |
| 1120.10.8006 | 158 | 99 | 1 | 74 |
| 1120.10.8006 | 158 | 1 | 2 | 74 |
In the above example, the item with serial number 1120.10.4011 and size 86/92 is available in location 99 (the warehouse) but there is no row with serial number 1120.10.4011 and size 86/92 where location is 1, so that item I want to have on a list to be restocked.
I'm trying to show the items that don't exist in the data by using a count on a GROUP BY query:
getresult = "SELECT serialnumber, size, location, itemcount, season, COUNT(*) " + \
            "FROM inventory " + \
            "WHERE season = 74 " + \
            "AND location != 1 " + \
            "GROUP BY serialnumber, size " + \
            "HAVING COUNT(*) < 1"
However, this doesn't work as expected.
The question still stands:
How can I retrieve a list with rows missing from the database?
Or
How can I retrieve a list with rows where the serialnumber + size are there for location 99 but are not present with a location of 1?
You must use a subselect in a NOT EXISTS clause:
getresult = """SELECT serialnumber, size FROM inventory i99
WHERE season = 74 AND location = 99
AND NOT EXISTS (SELECT serialnumber FROM inventory i1
WHERE i1.serialnumber = i99.serialnumber
AND i1.size = i99.size
AND i1.location = 1)"""
That is the only way I know to find non-existent values directly at the SQL level, because IS NULL, even in an outer join, tests the value in the original table and not what will be returned by the select.
getresult = """
select i99.serialnumber, i99.size, i99.itemcount, i99.season
from
inventory i99
left join
inventory i1 on
(i99.serialnumber, i99.size, i99.season, i1.season, i99.location, i1.location) =
(i1.serianumber, i1.size, 74, 74, 99, 1)
where i1.size is null
"""

Turn row values into columns, and count repetitions for all possible values in MySQL

I have a table (from a log file) with emails and three other columns that contain states of that user's interaction with a system. An email (user) may have 100 or 1000 entries, and each entry contains a combination of those three values, which may repeat over and over for the same email and for others.
It looks something like this:
+-------+-----------+------+-------+
| email | val1      | val2 | val3  |
+-------+-----------+------+-------+
| jal#h | cast      | core | cam   |
| hal#b | little ja | qar  | ja sa |
| bam#t | cast      | core | cam   |
| jal#h | little ja | qar  | jaja  |
+-------+-----------+------+-------+
And so the emails repeat, all values repeat, and there are 40+ possible values for each column, all strings. So I want to list distinct emails and then put every possible value as a column name, with, under each, a count of how many times that value is repeated for a particular email, like so:
+-------+------+------+-----+-----------+-----+-------+--------+
| email | cast | core | cam | little ja | qar | ja sa | blabla |
+-------+------+------+-----+-----------+-----+-------+--------+
| jal#h |   55 |    2 |  44 |       244 |   1 |   200 |     12 |
| hal#b |  900 |  513 | 101 |       146 |   2 |   733 |    833 |
| bam#t | 1231 |   33 | 433 |       411 | 933 |   833 |     53 |
+-------+------+------+-----+-----------+-----+-------+--------+
I have tried MySQL, but I only managed to count the total occurrences of one particular value for each email, not counts of all possible values in all columns:
SELECT
distinct email,
count(val1) as "cast"
FROM table1
where val1 = 'cast'
group by email
This query clearly doesn't do it, as it outputs only the one value 'cast' from the first column val1. What I'm looking for is for all distinct values in the first, second, and third columns to be turned into column headers, with the row values being the total for that value for a certain email (user).
There is a pivot-table approach, but I couldn't get it to work.
I'm dealing with this data as a table in MySQL, but it is also available as a CSV file, so if it isn't possible with a query, Python would be a possible solution, preferred after SQL.
Update
In Python, is it possible to output the data as:
+-------+------------------+-------------+--------------------+
|       |       val1       |    val2     |        val3        |
+-------+------+-----------+------+------+-----+-------+------+
| email | cast | little ja | core | qar  | cam | ja sa | jaja |
+-------+------+-----------+------+------+-----+-------+------+
| jal#h |   55 |         2 |   44 |  244 |   1 |   200 |   12 |
| hal#b |  900 |       513 |  101 |  146 |   2 |   733 |  833 |
| bam#t | 1231 |        33 |  433 |  411 | 933 |   833 |   53 |
+-------+------+-----------+------+------+-----+-------+------+
I'm not very familiar with Python.
If you use pandas, you can do a value_counts after grouping your data frame by email and then unstack/pivot it to wide format:
(df.set_index("email").stack().groupby(level=0).value_counts()
.unstack(level=1).reset_index().fillna(0))
To get the updated result, you can group by both the email and val* columns after the stack:
(df.set_index("email").stack().groupby(level=[0, 1]).value_counts()
.unstack(level=[1, 2]).fillna(0).sort_index(axis=1))
I'd reconstruct the DataFrame, then groupby and unstack with value_counts:
v = df.values
s = pd.Series(v[:, 1:].ravel(), v[:, 0].repeat(3))
s.groupby(level=0).value_counts().unstack(fill_value=0)
cam cast core ja sa jaja little ja qar
bam#t 1 1 1 0 0 0 0
hal#b 0 0 0 1 0 1 1
jal#h 1 1 1 0 1 1 1
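The pandas answers above assume the data is already in a DataFrame; since the question says it is also available as a CSV file, loading it first is enough. A sketch, assuming a hypothetical file data.csv with columns email, val1, val2, val3:

import pandas as pd

df = pd.read_csv("data.csv")   # hypothetical file name; columns: email, val1, val2, val3

# Count how often each value occurs per email, across all three val columns
counts = (df.set_index("email").stack()
            .groupby(level=0).value_counts()
            .unstack(fill_value=0)
            .reset_index())
print(counts)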
If you know the list of values, you can calculate it using GROUP BY:
SELECT email,
sum(val1 = 'cast') as `cast`,
sum(val1 = 'core') as `core`,
sum(val1 = 'cam') as `cam`,
. . .
FROM table1
GROUP BY email;
The . . . is for you to fill in the remaining values.
You can use this query to generate a dynamic PREPARED statement from the values in val1-val3 of your table:
SELECT
CONCAT( "SELECT email,\n",
GROUP_CONCAT(
CONCAT (" SUM(IF('",val1,"' IN(val1,val2,val3),1,0)) AS '",val1,"'")
SEPARATOR ',\n'),
"\nFROM table1\nGROUP BY EMAIL\nORDER BY email") INTO #myquery
FROM (
SELECT val1 FROM table1
UNION SELECT val2 FROM table1
UNION SELECT val3 FROM table1
) AS vals
ORDER BY val1;
-- ONLY TO VERIFY QUERY
SELECT #myquery;
PREPARE stmt FROM #myquery;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
sample table
mysql> SELECT * FROM table1;
+----+-------+-----------+------+-------+
| id | email | val1 | val2 | val3 |
+----+-------+-----------+------+-------+
| 1 | jal#h | cast | core | cam |
| 2 | hal#b | little ja | qar | ja sa |
| 3 | bam#t | cast | core | cam |
| 4 | jal#h | little ja | qar | cast |
+----+-------+-----------+------+-------+
4 rows in set (0,00 sec)
generate query
mysql> SELECT
-> CONCAT( "SELECT email,\n",
-> GROUP_CONCAT(
-> CONCAT (" SUM(IF('",val1,"' IN(val1,val2,val3),1,0)) AS '",val1,"'")
-> SEPARATOR ',\n'),
-> "\nFROM table1\nGROUP BY EMAIL\nORDER BY email") INTO #myquery
-> FROM (
-> SELECT val1 FROM table1
-> UNION SELECT val2 FROM table1
-> UNION SELECT val3 FROM table1
-> ) AS vals
-> ORDER BY val1;
Query OK, 1 row affected (0,00 sec)
verify query
mysql> -- ONLY TO VERIFY QUERY
mysql> SELECT #myquery;
SELECT email,
SUM(IF('cast' IN(val1,val2,val3),1,0)) AS 'cast',
SUM(IF('little ja' IN(val1,val2,val3),1,0)) AS 'little ja',
SUM(IF('core' IN(val1,val2,val3),1,0)) AS 'core',
SUM(IF('qar' IN(val1,val2,val3),1,0)) AS 'qar',
SUM(IF('cam' IN(val1,val2,val3),1,0)) AS 'cam',
SUM(IF('ja sa' IN(val1,val2,val3),1,0)) AS 'ja sa'
FROM table1
GROUP BY EMAIL
ORDER BY email
1 row in set (0,00 sec)
execute query
mysql> PREPARE stmt FROM #myquery;
Query OK, 0 rows affected (0,00 sec)
Statement prepared
mysql> EXECUTE stmt;
+-------+------+-----------+------+------+------+-------+
| email | cast | little ja | core | qar | cam | ja sa |
+-------+------+-----------+------+------+------+-------+
| bam#t | 1 | 0 | 1 | 0 | 1 | 0 |
| hal#b | 0 | 1 | 0 | 1 | 0 | 1 |
| jal#h | 2 | 1 | 1 | 1 | 1 | 0 |
+-------+------+-----------+------+------+------+-------+
3 rows in set (0,00 sec)
mysql> DEALLOCATE PREPARE stmt;
Query OK, 0 rows affected (0,00 sec)
mysql>
