Django query - how to get the latest row during distinct - Python

This is my db RATING table, where I want the last row of the duplicate entries:
I did this:
bewertung = Rating.objects.filter(von_location=1).distinct('von_location')
but I am getting the first row, where bewertung=4. I want the last row with this property. How can I get bewertung=3, the last row of the von_location column? What is the best way to do this?

It looks as if you want the record with the largest id (among the records with the right value for von_location). That's the first record retrieved when you sort them in descending order by the id column:
Rating.objects.filter(von_location=1).order_by('-id')[0]
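If the queryset can be empty, a couple of alternatives (assuming the same Rating model) avoid the IndexError raised by indexing into an empty queryset:
# .first() returns None instead of raising when nothing matches
bewertung = Rating.objects.filter(von_location=1).order_by('-id').first()
# .latest('id') does the descending ordering for you, but raises Rating.DoesNotExist when empty
bewertung = Rating.objects.filter(von_location=1).latest('id')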

Related

How to get the whole row with column names while looking for a minimum value?

I have a data frame. There are four columns.
I can find a minimum number using this code:
df_temp=df_A2C.loc[ (df_A2C['TO_ID'] == 7)]
mini_value = df_temp['DURATION_H'].min()
print("minimum value in column 'TO_ID' is: " , mini_value)
Output:
minimum value in column 'TO_ID' is: 0.434833333333333
Now, I am trying to get the whole row with all column names while looking for a minimum value using TO_ID. Something like this.
How can we get the whole row with all column names while looking for a minimum value?
If you post the data as code or text, I would be able to share the exact result.
Assumption: you're searching for the minimum value for a specific TO_ID.
# as per your code, filter by TO_ID
# sort the result on duration and take the top row
df_A2C.loc[df_A2C['TO_ID'] == 7].sort_values('DURATION_H').head(1)
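An alternative sketch under the same assumption, using the index of the minimum value instead of sorting:
# filter to the TO_ID of interest, then pull the row at the minimum DURATION_H
df_temp = df_A2C.loc[df_A2C['TO_ID'] == 7]
min_row = df_temp.loc[df_temp['DURATION_H'].idxmin()]
print(min_row)  # a Series holding the whole row, labelled by column name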

How do I iterate through a pandas data frame to return a different column's value based on conditions?

Let's say I have a table:
tableName     parentPartitionKey    parentTableName
User          entityId              Entity
Employee      entityId              Entity
Customer      entityId              Entity
CreditType    creditId              Credit
Entity        None                  None
Essentially, for each table I want to look up the parent table's partition key value. For example, table User has a parent table 'Entity'; I want to return Entity's partition key value. How do I iterate through the data frame to find every table's parentPartitionKey?
I tried using np.where, but this didn't work. I am not trying to merge this data frame.
for idx, row in df.iterrows():
    print(row["parentPartitionKey"].where(row["parentTableName"] == row["tableName"]))
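One vectorized sketch that avoids iterrows, assuming the frame also carries each table's own key in a hypothetical partitionKey column (not shown above): build a tableName-to-key lookup and map each row's parentTableName through it.
# hypothetical 'partitionKey' column assumed; adjust to the real column name
lookup = df.set_index('tableName')['partitionKey']            # tableName -> its own key
df['parentPartitionKey'] = df['parentTableName'].map(lookup)  # parent's key for each row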

Pandas dataframe- How to count the number of distinct rows for a given ID

I have this dataframe, and I want to add a column to it with the total of distinct SalesOrderID values for a given CustomerID.
So, with what I am trying to do, there would be a new column with the value 3 for all of these rows.
How can I do it?
I am trying this way but I get an error
data['TotalOrders'] = data.groupby([['CustomerID','SalesOrderID']]).size().reset_index(name='count')
Try using transform:
data['TotalOrders'] = data.groupby('CustomerID')['SalesOrderID'].transform('nunique')
This will give you one value for each row in the group. (thanks @Rodalm)
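A small self-contained example with made-up data, showing what transform('nunique') produces:
import pandas as pd

# hypothetical data for illustration
data = pd.DataFrame({
    'CustomerID':   [1, 1, 1, 2],
    'SalesOrderID': [10, 11, 12, 20],
})
data['TotalOrders'] = data.groupby('CustomerID')['SalesOrderID'].transform('nunique')
print(data)
#    CustomerID  SalesOrderID  TotalOrders
# 0           1            10            3
# 1           1            11            3
# 2           1            12            3
# 3           2            20            1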

Extracting row based on row number

I have a data frame with 100 rows. Based on the user's input, I want to extract that particular row.
I have tried using if row == df.index but it gives this error: "The truth value of an array with more than one element is ambiguous".
If the user is inputting the row number, i.e. the integer location of the row, then you can use df.iloc[i].
If the user is inputting the index label of the row, then you can use df.loc[i].
If it is something else, then please update your question with more information.
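A minimal sketch of the integer-location case, with a hypothetical 100-row frame:
import pandas as pd

df = pd.DataFrame({'value': range(100)})   # hypothetical 100-row frame
i = int(input('Row number: '))             # user's input as an integer position
row = df.iloc[i]                           # the whole row as a Series
print(row)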

How to create a list of every item purchased by client X?

How can I create a list of every item purchased by the client ID that I specify?
This question is part of the recommendation system I am building.
My dataframe has 3 columns ['ClientID'],['Products'],['Ratings']
(A 'Rating' different from NaN represents a purchased product.)
So far I have written this code:
First, I created a pivot table with ['ClientID'] on the vertical index and ['Products'] on the horizontal index and ['Ratings'] as the values.
piv = datacf1.pivot_table(index=['ClientID'], columns=['Products'], values=['Ratings'])
#Drop all columns containing only zeros, representing users who did not rate/purchase
piv.fillna(0, inplace=True)
piv = piv.T
piv = piv.loc[:, (piv != 0).any(axis=0)]
The following code uses the pivot table to get the list of purchased items by client X
#Create a list of every product traded by user X
purchased = piv.T[piv.loc[5039595.0,:]>0].index.tolist()
purchased
5039595 is the ['ClientID'] for which I want to create my list of items purchased and I would like to apply a different ['ClientID'] on demand.
I get an error when running the code to create the list.
Why I believe the 'create a list' code gives me an error:
I believe this code reads the vertical index as ['0','1','2','3',...], so it expects to find the column '5039595'; however, as said previously, the vertical index represents the ['ClientID'] values, which are random.
Here is a snapshot of the vertical index ['ClientID'] of the pivot table:
How can I fix my code to look for the Client X that I want to create the list for?
Or is there another way to do it? Perhaps with my original dataset with 3 columns ['ClientID'],['Products'],['Ratings']
I think the best way to do this is to select all the rows containing purchases for a given customer, then take all the unique product values from those rows. It would look something like this:
desired_rows = data["ClientID"].isin([client_id]) & data["Ratings"].notnull()
product_list = data.loc[desired_rows, "Products"].unique().tolist()
print(product_list)
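A small sketch wrapping this into a reusable helper (assuming data is the original 3-column frame), so a different ClientID can be passed on demand as described above:
def purchased_products(data, client_id):
    # rows for this client where a rating exists, i.e. a purchase happened
    desired_rows = data["ClientID"].isin([client_id]) & data["Ratings"].notnull()
    return data.loc[desired_rows, "Products"].unique().tolist()

print(purchased_products(data, 5039595))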
