Django ORM group by, and find latest item of each group (window functions)

Django 2.0 introduced window functions, which are designed for this kind of query. A simple answer to your question is:

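from django.db.models import F, Window
from django.db.models.functions import FirstValue, TruncDate

Cake.objects.annotate(
    # Earliest cake of each calendar day.
    first_cake=Window(
        expression=FirstValue('cake_name'),
        partition_by=[TruncDate('baked_on')],
        order_by=F('baked_on').asc(),
    ),
    # Latest cake of each day: first value under descending order.
    last_cake=Window(
        expression=FirstValue('cake_name'),
        partition_by=[TruncDate('baked_on')],
        order_by=F('baked_on').desc(),
    ),
    day=TruncDate('baked_on'),
).distinct().values_list('day', 'first_cake', 'last_cake')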

Why FirstValue in last_cake? Because by default a window's frame runs from the start of the partition up to the current row and does not look ahead, so for every row LastValue would simply return the current row (or its last peer in the ordering). Using FirstValue together with descending ordering works around that. Alternatively, you can explicitly define the frame over which the window function operates:

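from django.db.models import F, ValueRange, Window
from django.db.models.functions import FirstValue, LastValue, TruncDate

Cake.objects.annotate(
    first_cake=Window(
        expression=FirstValue('cake_name'),
        partition_by=[TruncDate('baked_on')],
        order_by=F('baked_on').asc(),
    ),
    last_cake=Window(
        expression=LastValue('cake_name'),
        partition_by=[TruncDate('baked_on')],
        order_by=F('baked_on').asc(),
        # No bounds: UNBOUNDED PRECEDING to UNBOUNDED FOLLOWING, i.e. the whole partition.
        frame=ValueRange(),
    ),
    day=TruncDate('baked_on'),
).distinct().values_list('day', 'first_cake', 'last_cake')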
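
The effect of the frame can also be checked outside Django. The sketch below uses Python's built-in sqlite3 module, assuming the bundled SQLite is 3.25 or newer (the release that added window functions). The cake table and its rows are hypothetical stand-ins for the Cake model, and the SQL is hand-written to illustrate the idea; it is not the exact query Django generates.

```python
import sqlite3

# Hypothetical table standing in for the Cake model; the rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cake (cake_name TEXT, baked_on TEXT)")
conn.executemany(
    "INSERT INTO cake VALUES (?, ?)",
    [
        ("sponge", "2024-01-01 08:00:00"),
        ("brownie", "2024-01-01 12:00:00"),
        ("tart", "2024-01-01 18:00:00"),
        ("muffin", "2024-01-02 09:00:00"),
    ],
)

QUERY = """
SELECT
    cake_name,
    -- Default frame: start of the partition up to the current row.
    LAST_VALUE(cake_name) OVER (
        PARTITION BY date(baked_on) ORDER BY baked_on
    ) AS last_default_frame,
    -- Explicit frame spanning the whole partition.
    LAST_VALUE(cake_name) OVER (
        PARTITION BY date(baked_on) ORDER BY baked_on
        RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
    ) AS last_full_frame
FROM cake
ORDER BY baked_on
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

With the default frame, the second column just repeats each row's own cake_name, while the full-partition frame yields the last cake of the day for every row in that day.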
