Index ¦ Archives ¦ Atom

Avoiding cached datetimes with Django querysets

In Django we often need to filter out objects from a queryset which shouldn't be visible to public users, a typical example of this would be a news post in a blog. A staff user could edit a news post to have a publish date in the future, allowing it to be automatically published by the site without having to log back in and publish it.

A simple model for such a news post could look like:

from django.db import models


class Post(models.Model):
    title = models.CharField(max_length=100)
    content = models.TextField()
    published_at = models.DateTimeField(db_index=True)

    class Meta:
        ordering = ('-published_at',)

    def __str__(self):
        return self.title

In this example, we're using a typical ListView, filtering out any posts which haven't yet been published:

from django.utils import timezone
from django.views.generic import ListView

from .models import Post


class PostListView(ListView):
    queryset = Post.objects.filter(published_at__lte=timezone.now())

Note - we could use an ArchiveIndexView instead, which by default excludes objects from the future. However for this example, we're sticking with ListView to show a simplified version of the problem for other use cases.

When we first load the page, looking through the SQL queries generated for the request, we can see the posts being filtered by their publish date:

SELECT "news_post"."id",
       "news_post"."title",
       "news_post"."content",
       "news_post"."published_at"
  FROM "news_post"
 WHERE "news_post"."published_at" <= '2018-07-21T11:02:12.998079+00:00'::timestamptz
 ORDER BY "news_post"."published_at" DESC

At first this all seems okay, however at some point later on you'll realise that new posts aren't being shown. Looking at the SQL queries generated for the following requests, we can see that the timestamp doesn't change between requests:

SELECT "news_post"."id",
       "news_post"."title",
       "news_post"."content",
       "news_post"."published_at"
  FROM "news_post"
 WHERE "news_post"."published_at" <= '2018-07-21T11:02:12.998079+00:00'::timestamptz
 ORDER BY "news_post"."published_at" DESC

Why? The queryset gets evaluated when the Django server starts, as the queryset is an attribute of the generic view.

One solution for this is to move the queryset into the get_queryset method for the generic view:

class PostListView(ListView):

    def get_queryset(self):
        return Post.objects.filter(published_at__lte=timezone.now())

By using a method we're creating a new queryset for every request - using the current timestamp when the request is generated. Problem solved!

However, since Django 1.9 there's a better way - let the database figure out the current time stamp.

Instead of using timezone.now(), we can switch the view code to the Now() database function:

from django.db.models.functions import Now
from django.views.generic import ListView

from .models import Post


class PostListView(ListView):
    queryset = Post.objects.filter(published_at__lte=Now())

Looking at the SQL queries generated for the request, we can see that STATEMENT_TIMESTAMP() is being used by Postgres to filter out any news posts

SELECT "news_post"."id",
       "news_post"."title",
       "news_post"."content",
       "news_post"."published_at"
  FROM "news_post"
 WHERE "news_post"."published_at" <= (STATEMENT_TIMESTAMP())
 ORDER BY "news_post"."published_at" DESC

The same SQL query will be used for every request, which will now work as the current timestamp gets evaluated by the database - and we don't need to create a get_queryset method for every generic view!

© Alex Tomkins.