I'm trying to create a simple daily time recording app that updates a table row upon submission.
Here's what I mean: suppose a staff member clocks in in the morning; my table row would then look like this:
id | time_in_am             | time_out_am | time_in_pm | time_out_pm | staff_id
1  | 2021-05-09 08:17:07.27 | NULL        | NULL       | NULL        | 223-8881
Upon submitting or scanning an ID again, it would update time_out_am, and so on until the last field of the day, time_out_pm.
My problem starts here: how would I know whether the staff member with ID no. 223-8881 has already clocked in today?
I've tried this:
today_dt = datetime(datetime.today().year, datetime.today().month, datetime.today().day)
# check if staff clocked in today
dtr_log = DailyTimeRecord.query.filter(DailyTimeRecord.time_in_am==today_dt, staff_id=staff.id).first()
# end check
Using the above code, I get the error: TypeError: filter() got an unexpected keyword argument 'staff_id'
and if I use filter_by(), I get this: filter_by() takes 1 positional argument but 2 were given
Here's my model, if it helps:
class DailyTimeRecord(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    time_in_am = db.Column(db.DateTime(timezone=True))
    time_out_am = db.Column(db.DateTime(timezone=True))
    time_in_pm = db.Column(db.DateTime(timezone=True))
    time_out_pm = db.Column(db.DateTime(timezone=True))
    staff_id = db.Column(db.Integer, db.ForeignKey('staff.id'))
When you're using filter() you need to qualify each column with the model and use proper comparison expressions, e.g. staff_id=staff.id should be DailyTimeRecord.staff_id==staff.id, so it would look like:
dtr_log = DailyTimeRecord.query.filter(
    DailyTimeRecord.time_in_am==today_dt,
    DailyTimeRecord.staff_id==staff.id
).first()
If you were using the filter_by helper, it would look like:
dtr_log = DailyTimeRecord.query.filter_by(
    time_in_am=today_dt,
    staff_id=staff.id
).first()
But you'll also run into a problem with the date comparison. time_in_am is a datetime, and you're building a datetime to compare against it, but your today_dt is a datetime at midnight (the hour, minute, and second default to zero because you didn't give them values), so it will only match rows clocked in at exactly midnight.
You really want a date-to-date comparison, so you should cast the database value to a date as well. Note that filter_by only supports simple keyword equality, so this needs filter:
from sqlalchemy import func
from datetime import date

today_dt = date.today()

dtr_log = DailyTimeRecord.query.filter(
    func.date(DailyTimeRecord.time_in_am)==today_dt,
    DailyTimeRecord.staff_id==staff.id
).first()
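With that lookup in place, the scan handler described in the question can decide which column to fill next. Here's a minimal sketch of that flow (the record_scan helper and the column ordering are my assumptions, not code from the question):
from datetime import datetime, date
from sqlalchemy import func

def record_scan(staff):
    # Hypothetical helper: fill in the next empty time column for today's row.
    now = datetime.now()
    dtr_log = DailyTimeRecord.query.filter(
        func.date(DailyTimeRecord.time_in_am) == date.today(),
        DailyTimeRecord.staff_id == staff.id
    ).first()
    if dtr_log is None:
        # First scan of the day: create the row and set time_in_am.
        db.session.add(DailyTimeRecord(time_in_am=now, staff_id=staff.id))
    else:
        # Otherwise fill the first still-empty column, in order.
        for column in ('time_out_am', 'time_in_pm', 'time_out_pm'):
            if getattr(dtr_log, column) is None:
                setattr(dtr_log, column, now)
                break
    db.session.commit()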
I have a model IncomingCorrespondence with an auto-incrementing field ID. I also have a field number, and I want two things for this field:
1. It will auto-increment its value, just like ID.
2. Every new year, its value will start over from 0 (or 1).
ID  | Number | Date
... | ...    | ...
285 | 285    | 2020-03-12
286 | 286    | 2020-04-19
287 | 1      | 2021-01-01
class IncomingCorrespondence(models.Model):
    ID = models.AutoField(primary_key=True)
    date = models.DateField(null=True)
    number = models.IntegerField(null=True)
How can I do that in the most efficient and reliable way?
You do not need to store the number: you can simply derive it from the number of items that have been stored in the database since the year started, with:
class IncomingCorrespondence(models.Model):
    date = models.DateField(null=True)
    created = models.DateTimeField(auto_now_add=True)

    @property
    def number(self):
        return IncomingCorrespondence._base_manager.filter(
            created__year=self.created.year,
            created__lt=self.created
        ).count() + 1
We thus have a created timestamp that stores the datetime at which the object was created, and we count the number of IncomingCorrespondence objects for that year created before that timestamp, plus one. You can in that case work with a package like django-softdelete [GitHub] to keep deleted objects in the database, and just filter these out when viewing the items.
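As a quick usage sketch (the field value here is hypothetical), the number is then derived on access rather than stored:
from datetime import date

obj = IncomingCorrespondence.objects.create(date=date(2021, 1, 1))
print(obj.number)  # 1 if this is the first object created (by its created timestamp) in 2021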
Another way might be to assign the maximum plus one to the field:
from django.db.models import Max
from django.utils.timezone import now

def next_number():
    data = IncomingCorrespondence._base_manager.filter(
        date__year=now().year
    ).aggregate(
        max_number=Max('number')
    )['max_number'] or 0
    return data + 1

class IncomingCorrespondence(models.Model):
    ID = models.AutoField(primary_key=True)
    date = models.DateField(auto_now_add=True)
    number = models.IntegerField(default=next_number, editable=False)
But here Django will dispatch numbers through a query. If multiple threads concurrently create an IncomingCorrespondence, this can fail: two of them can read the same maximum and thus get the same number. It also depends on the insertion time, not on the date of the IncomingCorrespondence object.
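If the number must be stored and race-free, one common pattern (a sketch under assumptions; the YearCounter model is hypothetical, not part of the question) is a per-year counter row locked with select_for_update():
from django.db import models, transaction

class YearCounter(models.Model):
    # One row per year, holding the last number handed out (hypothetical model).
    year = models.IntegerField(unique=True)
    last_number = models.IntegerField(default=0)

def next_number_for(year):
    # Lock the counter row so concurrent creators serialize on it.
    with transaction.atomic():
        counter, _ = YearCounter.objects.select_for_update().get_or_create(year=year)
        counter.last_number += 1
        counter.save(update_fields=['last_number'])
        return counter.last_number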
You should compute number with some method like querying the count of IncomingCorrespondence objects created this year. It should not be done any other way (a cronjob, for example), as that won't be stable: the crontab may fail and you will end up with anomalies (without even being able to notice them), or you will create an instance right before the crontab resets the sequence.
I have a model which contains a field birth_year, and in another model I have the user's registration date.
I have a list of user IDs for which I want to query whether their age belongs to a particular range. User age is calculated as registration date - birth_year.
I was able to calculate it from the current date as:
startAge=25
endAge=50
ageStartRange = (today - relativedelta(years=startAge)).year
ageEndRange = (today - relativedelta(years=endAge)).year
and I made the query as:
query.filter(profile_id__in=communityUsersIds, birth_year__lte=age_from, birth_year__gte=age_to).values('profile_id')
This way I am getting the user IDs whose age is between 25 and 50. Instead of today, how can I use the user's registration_date (it is a field in another model)?
You can use native DB functions. Works like a charm using Postgres.
from django.contrib.auth.models import User
from django.db.models import DurationField, IntegerField, F, Func

class Age(Func):
    function = 'AGE'
    output_field = DurationField()

class AgeYears(Func):
    template = 'EXTRACT (YEAR FROM %(function)s(%(expressions)s))'
    function = 'AGE'
    output_field = IntegerField()

users = User.objects.annotate(age=Age(F("dob")), age_years=AgeYears(F("dob"))).filter(age_years__gte=18)

for user in users:
    print(user.age, user.age_years)
# which will generate result like below
# 10611 days, 0:00:00 29
The "today" version of the query was easy to do, because the "today" date doesn't depend on the individual fields in the row.
F Expressions
You can explore Django's F expressions, as they allow you to reference the fields of the model in your queries (without pulling them into Python):
https://docs.djangoproject.com/en/1.7/topics/db/queries/#using-f-expressions-in-filters
e.g. for you, the age would be this F expression:
F('registration_date__year') - F('birth_year')
However, we don't really need to calculate that; to query for what you want, consider this query:
Model.objects.filter(birth_year__lte=F('registration_date__year') - 25)
From that you can add a:
birth_year__gte=F('registration_date__year') - 50,
or use birth_year__range=(F('registration_date__year') - 50, F('registration_date__year') - 25)
Alternative: precalculate age value
Otherwise you can precalculate that age, since the value is knowable at user registration time:
Model.objects.update(age=F('registration_date__year') - F('birth_year'))
Once that is saved, it's as simple as Model.objects.filter(age__range=(25, 50))
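For completeness, the same filter can also be written without a stored column by annotating the computed age. A sketch, where Profile and its field names stand in for the question's actual models:
from django.db.models import F, IntegerField, ExpressionWrapper
from django.db.models.functions import ExtractYear

profiles = (Profile.objects
            .filter(profile_id__in=communityUsersIds)
            .annotate(age=ExpressionWrapper(
                # age at registration time, as defined in the question
                ExtractYear('registration_date') - F('birth_year'),
                output_field=IntegerField(),
            ))
            .filter(age__range=(25, 50))
            .values('profile_id'))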
I need to execute a query which compares only the year and month values from a TIMESTAMP column, where the records look like this:
2015-01-01 08:33:06
The SQL query is very simple (the interesting parts are year(timestamp) and month(timestamp), which extract the year and the month so I can use them for comparison):
SELECT model, COUNT(model) AS count
FROM log.logs
WHERE SOURCE = "WEB"
AND year(timestamp) = 2015
AND month(timestamp) = 01
AND account = "TEST"
AND brand = "Nokia"
GROUP BY model
ORDER BY count DESC limit 10
Now the problem:
This is my SQLAlchemy Query:
devices = (db.session.query(Logs.model, Logs.timestamp,
                            func.count(Logs.model).label('count'))
           .filter_by(source=str(source))
           .filter_by(account=str(acc))
           .filter_by(brand=str(brand))
           .filter_by(year=year)
           .filter_by(month=month)
           .group_by(Logs.model)
           .order_by(func.count(Logs.model).desc()).all())
The part:
.filter_by(year=year)
.filter_by(month=month)
is not the same as
AND year(timestamp) = 2015
AND month(timestamp) = 01
and my SQLAlchemy query is not working. It seems like year and month are MySQL functions that extract those values from a timestamp column.
My DB Model looks like this:
class Logs(db.Model):
id = db.Column(db.Integer, primary_key=True)
timestamp = db.Column(db.TIMESTAMP, primary_key=False)
.... other attributes
It is interesting to mention that when I select and print Logs.timestamp, it comes back in the following format:
(datetime.datetime(2013, 7, 11, 12, 47, 28))
How should this part be written in SQLAlchemy if I want my query to compare by the year and month of the DB timestamp?
.filter_by(year=year) #MySQL - year(timestamp)
.filter_by(month=month) #MySQL- month(timestamp)
I tried .filter(Logs.timestamp == year(timestamp)) and similar variations, but no luck. Any help will be greatly appreciated.
Simply replace:
.filter_by(year=year)
.filter_by(month=month)
with:
from sqlalchemy.sql.expression import func
# ...
.filter(func.year(Logs.timestamp) == year)
.filter(func.month(Logs.timestamp) == month)
Read more on this in the SQL and Generic Functions section of the documentation.
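Applied to the query from the question, the whole thing would then look something like this (a sketch; func.year and func.month render as MySQL's YEAR() and MONTH(), and the .limit(10) mirrors the LIMIT 10 in the original SQL):
from sqlalchemy import func

devices = (db.session.query(Logs.model,
                            func.count(Logs.model).label('count'))
           .filter_by(source=str(source))
           .filter_by(account=str(acc))
           .filter_by(brand=str(brand))
           .filter(func.year(Logs.timestamp) == year)
           .filter(func.month(Logs.timestamp) == month)
           .group_by(Logs.model)
           .order_by(func.count(Logs.model).desc())
           .limit(10)
           .all())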
You can use custom constructs if you want to use functions that are specific to your database, like the year function you mention for MySQL. However, I don't use MySQL and cannot give you tested code (I did not even know about this function, by the way).
This is a simple (and admittedly not very useful) example for Oracle, which is tested. I hope you can quite easily deduce yours from this one.
from sqlalchemy.sql import expression
from sqlalchemy.ext.compiler import compiles
from sqlalchemy import Date

class get_current_date(expression.FunctionElement):
    type = Date()

@compiles(get_current_date, 'oracle')
def ora_get_current_date(element, compiler, **kw):
    return "CURRENT_DATE"

session = schema_mgr.get_session()
q = session.query(sch.Tweet).filter(sch.Tweet.created_at == get_current_date())
tweets_today = pd.read_sql(q.statement, session.bind)
However, it goes without saying that this way you make your highly portable SQLAlchemy code a bit less portable.
Hope it helps.
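For instance, the MySQL construct the answer alludes to might look like this (an untested sketch deduced from the Oracle example above; only the 'mysql' dialect is handled here):
from sqlalchemy import Integer
from sqlalchemy.sql import expression
from sqlalchemy.ext.compiler import compiles

class year_of(expression.FunctionElement):
    type = Integer()
    name = 'year_of'

@compiles(year_of, 'mysql')
def mysql_year_of(element, compiler, **kw):
    # Compile year_of(col) to MySQL's YEAR(col).
    return "YEAR(%s)" % compiler.process(element.clauses)

# usage: .filter(year_of(Logs.timestamp) == 2015)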
I've got a table set up like so:
{"String" : {uuid1 : "String", uuid1: "String"}, "String" : {uuid : "String"}}
Or...
Row_validation_class = UTF8Type
Default_validation_class = UTF8Type
Comparator = UUID
(It basically has the website as a row key, with dynamically generated columns based on datetime.datetime.now() using TimeUUIDType in Cassandra, and a string as the value.)
I'm looking to use Pycassa to retrieve slices of the data based on both the row and the columns. However, on other (smaller) tables I've done this by downloading the whole data set (or at least filtering down to one row), which gave me an ordered dictionary I could compare with datetime objects.
I'd like to be able to use something like the Pycassa multiget or get_indexed_slices function to pull certain columns and rows. Does something like this exist that allows filtering on datetimes? All my current attempts result in the following error message:
TypeError: can't compare datetime.datetime to UUID
The best I've managed to come up with so far is...
def get_number_of_visitors(site, start_date,
                           end_date=datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S:%f")):
    pool = ConnectionPool('Logs', timeout=2)
    col_fam = ColumnFamily(pool, 'sessions')
    result = col_fam.get(site)
    number_of_views = [(k, v) for k, v in col_fam.get(site).items()
                       if get_posixtime(k) > datetime.datetime.strptime(str(start_date), "%Y-%m-%d %H:%M:%S:%f")
                       and get_posixtime(k) < datetime.datetime.strptime(str(end_date), "%Y-%m-%d %H:%M:%S:%f")]
    total_unique_sessions = len(number_of_views)
    return total_unique_sessions
With get_posixtime being defined as:
def get_posixtime(uuid1):
    assert uuid1.version == 1, ValueError('only applies to type 1')
    t = uuid1.time
    t = (t - 0x01b21dd213814000L)
    t = t / 1e7
    return datetime.datetime.fromtimestamp(t)
This doesn't seem to work (it isn't returning the data I'd expect) and it also feels like it shouldn't be necessary. I'm creating the column timestamps using:
timestamp = datetime.datetime.now()
Does anybody have any ideas? It feels like this is the sort of thing that Pycassa (or another python library) would support but I can't figure out how to do it.
p.s. table schema as described by cqlsh:
CREATE COLUMNFAMILY sessions (
KEY text PRIMARY KEY
) WITH
comment='' AND
comparator='TimeUUIDType' AND
row_cache_provider='ConcurrentLinkedHashCacheProvider' AND
key_cache_size=200000.000000 AND
row_cache_size=0.000000 AND
read_repair_chance=1.000000 AND
gc_grace_seconds=864000 AND
default_validation=text AND
min_compaction_threshold=4 AND
max_compaction_threshold=32 AND
row_cache_save_period_in_seconds=0 AND
key_cache_save_period_in_seconds=14400 AND
replicate_on_write=True;
p.s. I know you can specify a column range in Pycassa, but I won't be able to guarantee that the start and end values of the range will have entries in each of the rows, and hence the columns may not exist.
You do want to request a "slice" of columns using the column_start and column_finish parameters to get(), multiget(), get_count(), get_range(), etc. For TimeUUIDType comparators, pycassa actually accepts datetime instances or timestamps for those two parameters; it will internally convert them to a TimeUUID-like form with a matching timestamp component. There's a section of the documentation dedicated to working with TimeUUIDs that provides more details.
For example, I would implement your function like this:
def get_number_of_visitors(site, start_date, end_date=None):
    """
    start_date and end_date should be datetime.datetime instances or
    timestamps like those returned from time.time().
    """
    if end_date is None:
        end_date = datetime.datetime.now()
    pool = ConnectionPool('Logs', timeout=2)
    col_fam = ColumnFamily(pool, 'sessions')
    return col_fam.get_count(site, column_start=start_date, column_finish=end_date)
You could use the same form with col_fam.get() or col_fam.xget() to get the actual list of visitors.
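For instance, a sketch of fetching the actual visitor values with xget(), which streams the columns instead of loading them all at once (names follow the function above):
def get_visitors(site, start_date, end_date=None):
    if end_date is None:
        end_date = datetime.datetime.now()
    pool = ConnectionPool('Logs', timeout=2)
    col_fam = ColumnFamily(pool, 'sessions')
    # xget() yields (TimeUUID, value) pairs for the columns within the slice.
    return [value for _, value in
            col_fam.xget(site, column_start=start_date, column_finish=end_date)]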
P.S. Try not to create a new ConnectionPool() for every request. If you have to, set a lower pool size.
I'm trying to search for some values within a date range for a specific type, but content for dates that exist in the database is not being returned by the query.
Here is an extract of the python code:
deltaDays = timedelta(days=20)
endDate = datetime.date.today()
startDate = endDate - deltaDays

result = db.GqlQuery(
    "SELECT * FROM myData WHERE mytype = :1 AND pubdate >= :2 AND pubdate <= :3",
    type, startDate, endDate
)
class myData(db.Model):
    mytype = db.StringProperty(required=True)
    value = db.FloatProperty(required=True)
    pubdate = db.DateTimeProperty(required=True)
The GQL returns data, but some rows that I am expecting are missing:
2009-03-18 00:00:00
(missing date in results: 2009-03-20; data exists in database)
2009-03-23 00:00:00
2009-03-24 00:00:00
2009-03-25 00:00:00
2009-03-26 00:00:00
(missing date in results: 2009-03-27; data exists in database)
2009-03-30 00:00:00
(missing date in results: 2009-03-31; data exists in database)
2009-04-01 00:00:00
2009-04-02 00:00:00
2009-04-03 00:00:00
2009-04-06 00:00:00
I uploaded the data via the bulkload script. I can only think of the indexes being corrupted or something similar. This same query used to work for another table I had, but I had to replace it with new content from another source, and this new content is not responding to the query in the same way. The table has around 700,000 rows, if that makes any difference.
I have done more research and it appears that it's a bug in the App Engine datastore.
For more information about the bug, check this link:
http://code.google.com/p/googleappengine/issues/detail?id=901
I have tried dropping the index and recreating it, with no luck.
thanks
Nothing looks wrong to me. Are you sure that the missing dates also have mytype == type?
I have observed some funny behaviour with indexes in the past. I recommend writing a handler to iterate through all of your records and just put() them back in the database. Maybe something with the bulk uploader isn't working properly.
Here's the type of handler I use to iterate through all the entities in a model class:
class PPIterator(BaseRequestHandler):
    def get(self):
        query = Model.gql('ORDER BY __key__')
        last_key_str = self.request.get('last')
        if last_key_str:
            # Resume from where the previous request left off.
            last_key = db.Key(last_key_str)
            query = Model.gql('WHERE __key__ > :1 ORDER BY __key__', last_key)
        # Fetch one extra entity to detect whether another batch remains.
        entities = query.fetch(11)
        new_last_key_str = None
        if len(entities) == 11:
            new_last_key_str = str(entities[9].key())
        for e in entities:
            e.put()
        if new_last_key_str:
            self.response.out.write(json.write(new_last_key_str))
        else:
            self.response.out.write(json.write('done'))
You can use whatever you want to iterate through the entities. I used to use Javascript in a browser window, but found that was a pig when making hundreds of thousands of requests. These days I find it more convenient to use a ruby script like this one:
require 'net/http'
require 'json'

last = nil
while last != 'done'
  url = 'your_url'
  path = '/your_path'
  path += "?last=#{last}" if last
  last = Net::HTTP.get(url, path)
  puts last
end
Ben
UPDATE: now that the remote API is working and reliable, I rarely write this type of handler anymore. The same ideas apply to the code you'd use there to iterate through the entities in the remote API console.
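For reference, the remote API equivalent can be a simple loop like this (a sketch, assuming the myData model from the question and an already-authenticated remote_api_shell session):
# Run inside remote_api_shell.py after authenticating.
batch = myData.gql('ORDER BY __key__').fetch(100)
while batch:
    db.put(batch)  # re-write the entities so their index rows are rebuilt
    last_key = batch[-1].key()
    batch = myData.gql('WHERE __key__ > :1 ORDER BY __key__', last_key).fetch(100)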