How to annotate strings in django objects - python

I want to concatenate first name + last name, but I'm getting 0 as the value of full name.
What I'm trying to do is this:
Customer.objects.annotate(full_name=F('first_name') + F('last_name')).filter(full_name='Filan Fisteku')

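Use Concat instead of +. With F() expressions, + is translated to SQL addition, which some databases (MySQL, for example) evaluate numerically even on strings, hence the 0.
from django.db.models import Value
from django.db.models.functions import Concat
ss = Customer.objects.annotate(full_name=Concat('first_name', Value(' '), 'last_name')).filter(full_name='Filan Fisteku')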

Related

Trying to add prefixes to url if not present in pandas df column

I am trying to add prefixes to URLs in my 'Website' column. I can't figure out how to keep each new assignment to the helper column from overwriting everything written by the previous one.
For example, say I have the following URLs in my column:
http://www.bakkersfinedrycleaning.com/
www.cbgi.org
barstoolsand.com
This would be the desired end state:
http://www.bakkersfinedrycleaning.com/
http://www.cbgi.org
http://www.barstoolsand.com
This is as close as I have been able to get:
def nan_to_zeros(df, col):
    # Copy the column, replacing NaN with a placeholder so .str methods work
    new_col = f"nanreplace{col}"
    df[new_col] = df[col].fillna('~')
    return df

df1 = nan_to_zeros(df1, 'Website')
df1['url_helper'] = df1.loc[~df1['nanreplaceWebsite'].str.startswith('http')| ~df1['nanreplaceWebsite'].str.startswith('www'), 'url_helper'] = 'https://www.'
df1['url_helper'] = df1.loc[df1['nanreplaceWebsite'].str.startswith('http'), 'url_helper'] = ""
df1['url_helper'] = df1.loc[df1['nanreplaceWebsite'].str.startswith('www'),'url_helper'] = 'www'
print(df1[['nanreplaceWebsite',"url_helper"]])
This just gives me a helper column of all 'www', because the last assignment overwrites every row.
Any direction appreciated.
Data:
{'Website': ['http://www.bakkersfinedrycleaning.com/',
'www.cbgi.org', 'barstoolsand.com']}
IIUC, there are 3 things to fix here:
The leading df1['url_helper'] = on each line shouldn't be there: in a chained assignment it sets the whole column to the right-hand value, not just the rows selected by .loc.
| should be & in the first condition, because 'https://www.' should be added to URLs that start with neither of the strings in the condition. The error becomes apparent if we check the first condition after the other two conditions.
The last condition should add "http://" instead of "www".
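The three fixes above can be sketched as follows. This is only an illustration built on the sample data from the question; it uses 'http://www.' rather than 'https://www.' as the prefix, on the assumption that the desired end state shown in the question is what is wanted.

```python
import pandas as pd

df1 = pd.DataFrame({'Website': ['http://www.bakkersfinedrycleaning.com/',
                                'www.cbgi.org', 'barstoolsand.com']})

# Replace NaN with a placeholder so the .str methods return plain booleans
s = df1['Website'].fillna('~')
starts_http = s.str.startswith('http')
starts_www = s.str.startswith('www')

# Start with an empty prefix, then fill in only the rows that need one
df1['url_helper'] = ''
df1.loc[~starts_http & ~starts_www, 'url_helper'] = 'http://www.'  # neither prefix present
df1.loc[starts_www, 'url_helper'] = 'http://'                      # only the scheme is missing

df1['fixed Website'] = df1['url_helper'] + s
print(df1[['Website', 'fixed Website']])
```

Each .loc assignment touches only its own rows, so later lines no longer overwrite earlier ones.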
Alternatively, your problem could be solved using np.select. Pass in the multiple conditions in the conditions list and their corresponding choice list and assign values accordingly:
import numpy as np

s = df1['Website'].fillna('~')
df1['fixed Website'] = np.select(
    [~(s.str.startswith('http') | ~s.str.contains('www')),  # has 'www' but no 'http'
     ~(s.str.startswith('http') | s.str.contains('www'))],  # has neither
    ['http://' + s, 'http://www.' + s],
    s)  # default: leave the value unchanged
Output:
   Website                                 fixed Website
0  http://www.bakkersfinedrycleaning.com/  http://www.bakkersfinedrycleaning.com/
1  www.cbgi.org                            http://www.cbgi.org
2  barstoolsand.com                        http://www.barstoolsand.com

how to extract only day from timestamp in Django

I want to get a specific part of a date, like "8" out of (2021-8-3), but my query returns the whole date.
How can I extract just that part?
usertime = User.objects.filter(groups__name = 'patient').values('date_joined').annotate(date_only=Cast('date_joined', DateField()))
Try this (based on the answer by @Yannics at https://stackoverflow.com/a/60924664/5804947):
from django.db.models import F, Func, Value, CharField
usertime = (User.objects.filter(groups__name='patient').values('date_joined')
            .annotate(date_only=Func(
                F('date_joined'),
                Value('MM'),  # format pattern passed to to_char
                function='to_char',
                output_field=CharField()
            ))
            .values('date_only'))
You can also pass 'YYYY' or 'DD' in Value(...) to get the year or the day respectively. This works fine when the PostgreSQL database is used.
ANOTHER METHOD
from django.db.models.functions import Extract
usertime = User.objects.filter(groups__name = 'patient').values('date_joined').annotate(date_only=Extract('date_joined', 'month'))

Pandas & Django - How to create a DataFrame filter from multiple user inputs without using string executable

I am building a website where users can graph data in a dataframe and apply filters to this data to only plot information they are interested in.
Right now I am having trouble figuring out a way to take the filters a user enters and use them to filter the dataframe. Since the user can create an unbounded number of filters, I decided to use a for loop to build an executable string that contains all of the filters in one variable, as shown below:
column = (value selected in "Select Parameter", which corresponds to a column in the dataframe)
boolean = (value selected in "Select Condition", e.g., >, <, >=, etc.)
user_input = (value the user enters into the field, e.g., 2019 and Diabetes)
executable = 'df = df[df[' + column1 + '] ' + boolean1 + user_input1 + ' and ' + 'df[' + column2 + '] ' + boolean2 + user_input2 + ' and '.....
exec(executable)
While this method works, it leaves my code very vulnerable to injection. Is there a better way of doing this?
You can use the operator module, which exposes the operators as functions. For example, operator.lt(a, b) is the same as a < b. Just map the user input to an operator.
For example:
import operator
operator_map = {'<': operator.lt, '<=': operator.le, '>': operator.gt}
Then you can create a filter like this:
import numpy as np
# Start with selecting all then `and` the filters
df_filter = np.ones((df.shape[0],), dtype=bool)
# Go through each user filter input
for ...:
    df_filter = df_filter & operator_map[boolean](df[column], user_input)
filtered_df = df[df_filter]
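A fuller version of the same idea, as a runnable sketch. The helper name apply_filters, the (column, symbol, value) tuple format and the sample dataframe are hypothetical; the point is that both the column name and the operator symbol are looked up in a whitelist instead of being executed.

```python
import operator

import pandas as pd

# Whitelist of comparison symbols the user may pick
OPERATOR_MAP = {
    '<': operator.lt, '<=': operator.le,
    '>': operator.gt, '>=': operator.ge,
    '==': operator.eq, '!=': operator.ne,
}


def apply_filters(df, filters):
    """Return the rows of df matching every (column, symbol, value) filter."""
    mask = pd.Series(True, index=df.index)  # start by selecting every row
    for column, symbol, value in filters:
        if column not in df.columns:
            raise ValueError(f"unknown column: {column!r}")
        if symbol not in OPERATOR_MAP:
            raise ValueError(f"unsupported operator: {symbol!r}")
        # AND the new condition into the running mask
        mask &= OPERATOR_MAP[symbol](df[column], value)
    return df[mask]


# Hypothetical sample data and user filters
df = pd.DataFrame({
    'year': [2018, 2019, 2019, 2020],
    'condition': ['Diabetes', 'Diabetes', 'Asthma', 'Diabetes'],
    'cases': [10, 20, 30, 40],
})
filtered = apply_filters(df, [('year', '>=', 2019), ('condition', '==', 'Diabetes')])
print(filtered)
```

Since nothing from the user is ever evaluated as code, an unexpected column or operator raises a ValueError instead of being executed.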

Pass dataframe column value to function

My situation is that I'm receiving transaction data from a vendor that has a datetime that is in local time but it has no offset. For example, the ModifiedDate column may have a value of
'2020-05-16T15:04:55.7429192+00:00'
I can get the local timezone by pulling some other data together about the store in which the transaction occurs
timezone_local = tz.timezone(tzDf[0]["COUNTRY"] + '/' + tzDf[0]["TIMEZONE"])
I then wrote a function to take those two values and give it the proper timezone:
from datetime import datetime
import dateutil.parser as parser
import pytz as tz

def convert_naive_to_aware(datetime_local_str, timezone_local):
    # Parse once and keep only the wall-clock fields (no fractional seconds, no offset)
    parsed = parser.parse(datetime_local_str)
    naive = datetime(parsed.year, parsed.month, parsed.day,
                     parsed.hour, parsed.minute, parsed.second)
    # pytz zones must be attached with localize(); passing them as tzinfo=
    # to the datetime constructor yields a wrong (local mean time) offset
    aware = timezone_local.localize(naive)
    return aware
It works fine when I send it the timestamp as a string in testing but balks when I try to apply it to a dataframe. I presume because I don't yet know the right way to pass the column value as a string. In this case, I'm trying to replace the current ModifiedTime value with the results of the call to the function.
from pyspark.sql import functions as F
.
.
.
ordersDf = ordersDf.withColumn("ModifiedTime", ( convert_naive_to_aware( F.substring( ordersDf.ModifiedTime, 1, 19 ), timezone_local)),)
Those of you more knowledgeable than I won't be surprised that I received the following error:
TypeError: 'Column' object is not callable
I admit, I'm a bit of a tyro at Python and dataframes and I may well be taking the long way 'round. I've attempted a few other things, such as ordersDf.ModifiedTime.cast("String"), but no luck. I'd be grateful for any suggestions.
We're using Azure Databricks, the cluster is Scala 2.11.
You need to convert the function into a UDF before you can apply it on a Spark dataframe:
from pyspark.sql import functions as F
from pyspark.sql.types import TimestampType
# I assume `tzDf` is a pandas dataframe... This syntax wouldn't work with spark.
timezone_local = tz.timezone(tzDf[0]["COUNTRY"] + '/' + tzDf[0]["TIMEZONE"])
# Convert function to UDF. The UDF receives plain Python values, so the zone
# arrives as its name (a string) and is turned back into a pytz timezone here.
time_udf = F.udf(
    lambda dt_str, zone_name: convert_naive_to_aware(dt_str, tz.timezone(zone_name)),
    TimestampType(),
)
# Avoid overwriting dataframe variables. Here I appended `2` to the new variable name.
ordersDf2 = ordersDf.withColumn(
    "ModifiedTime",
    time_udf(
        F.substring(ordersDf.ModifiedTime, 1, 19), F.lit(str(timezone_local))
    )
)
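Before wrapping anything in a UDF, it can help to check the per-value conversion on its own. The sketch below is a plain-Python variant that takes the zone as a name string, which is how it would reach a UDF. It swaps pytz and dateutil for the standard-library zoneinfo module (Python 3.9+), and both the function name and the 'America/Chicago' zone are hypothetical stand-ins for the store's real timezone.

```python
from datetime import datetime
from zoneinfo import ZoneInfo


def naive_str_to_aware(datetime_local_str, timezone_name):
    """Interpret the first 19 characters as local wall-clock time in timezone_name."""
    # Keep 'YYYY-MM-DDTHH:MM:SS' and drop fractional seconds and any bogus offset
    naive = datetime.fromisoformat(datetime_local_str[:19])
    # zoneinfo zones can be attached directly as tzinfo
    return naive.replace(tzinfo=ZoneInfo(timezone_name))


aware = naive_str_to_aware('2020-05-16T15:04:55.7429192+00:00', 'America/Chicago')
print(aware.isoformat())  # 2020-05-16T15:04:55-05:00
```

The wall-clock fields are left untouched; only the offset is replaced with the one that applies in the given zone on that date.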

Django, how to extract values list monthly from DateField?

I have the following models:
class Materiale(models.Model):
    sottocategoria = models.ForeignKey(Sottocategoria, on_delete=models.CASCADE, null=True)
    quantita = models.DecimalField()
    prezzo = models.DecimalField()
    data = models.DateField(default="GG/MM/YYYY")
I want to calculate the value of PREZZO*QUANTITA in a monthly view (in other words, the total sum of PREZZO*QUANTITA over all items in a single month), but my code does not work:
Monthly_views=Materiale.objects.filter(data__year='2020').values_list('month').annotate(totale_mensile=F(('quantita')*F('prezzo')))
Use values() method instead of values_list()
from django.db.models import F, Sum
result = (Materiale.objects.annotate(totale_mensile=F('quantita') * F('prezzo'))
          .values('data__month').annotate(totale_mensile_sum=Sum('totale_mensile')))
or simply
result = Materiale.objects.values('data__month').annotate(totale_mensile_sum=Sum(F('quantita') * F('prezzo')))
You can also filter by month:
Monthly_views = Materiale.objects.filter(data__year='2020').filter(data__month='4')
