What are your thoughts on when to use LAST_VALUE() vs. MAX() KEEP(DENSE_RANK LAST ... )

Moops@lemmy.world · 3 months ago

What are your thoughts on when to use LAST_VALUE() vs. MAX() KEEP(DENSE_RANK LAST ... )?

anonymouse@sh.itjust.works · 3 months ago

First of all, thanks for this list. I didn’t know about the KEEP functionality, so I read up on it a bit and found this interesting read: https://rwijk.blogspot.com/2012/09/keep-clause.html?m=1

I can’t really play around with it at the moment, as it seems to be Oracle-only (I think) and I’m not using Oracle right now. That might also explain your colleague’s preference for LAST_VALUE/FIRST_VALUE, since those seem to be more generally available.

A question from my side: why is DENSE_RANK used in this case? It seems to me that RANK or ROW_NUMBER could perform better, since there is less to keep track of. Of course, ROW_NUMBER could give different results if there are duplicates in the ORDER BY column, but in that case the MAX is also sort of an arbitrary choice.

Moops@lemmy.world · 3 months ago

Yeah, I guess it’s a less-used analytic function (function? I think that’s the right term here).

I believe this will only accept DENSE_RANK as the ranking part of this method, which makes sense because you want a distinct list of ranks to identify the last one. You are right, though, that the record returned by the MAX() can be arbitrary if there are duplicate values in the ORDER BY field. Luckily, you can usually handle that by using multiple fields in the ORDER BY. The conditions in the query that the function operates on top of also play a role. You can also include a PARTITION BY, but I’ve only needed to use that a time or two.
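Since KEEP seems to be Oracle-only, here is a rough sketch of the same idea that runs in SQLite through Python's sqlite3 module. The emp table, its columns, and the data are made up for illustration, and the subquery is only an emulation of KEEP, not how Oracle implements it. It keeps every row tied for the last hire_date (DENSE_RANK gives them all rank 1) and then lets the aggregate pick among the ties, which is why MAX and MIN can disagree.

```python
import sqlite3

# Hypothetical table and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (dept TEXT, hire_date TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [
        ("A", "2020-01-01", 100),
        ("A", "2021-06-01", 300),  # tied for the latest hire_date in dept A
        ("A", "2021-06-01", 200),  # tied for the latest hire_date in dept A
        ("B", "2019-01-01", 50),
    ],
)

# Oracle form (not runnable in SQLite):
#   SELECT dept,
#          MAX(salary) KEEP (DENSE_RANK LAST ORDER BY hire_date),
#          MIN(salary) KEEP (DENSE_RANK LAST ORDER BY hire_date)
#   FROM emp GROUP BY dept;
#
# Emulation: rank rows per dept by hire_date descending, keep rank 1
# (all rows sharing the last hire_date), then aggregate over those rows.
keep_last = conn.execute(
    """
    SELECT dept, MAX(salary) AS max_last, MIN(salary) AS min_last
    FROM (
        SELECT dept, salary,
               DENSE_RANK() OVER (
                   PARTITION BY dept ORDER BY hire_date DESC
               ) AS dr
        FROM emp
    )
    WHERE dr = 1
    GROUP BY dept
    ORDER BY dept
    """
).fetchall()

print(keep_last)  # [('A', 300, 200), ('B', 50, 50)]
```

Adding a second column to the ORDER BY would break the tie in dept A, at which point MAX and MIN return the same value.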

I think one reason I like this more is that it’s more readable to me. For a lot of the other analytic functions you end up using partitions and additional keywords like “ROWS BETWEEN UNBOUNDED PRECEDING AND …etc.” in order to break out results the way you need. With this dense_rank method I typically accomplish that using regular ol’ WHERE conditions. Of course, that’s gonna be pretty subjective.

Replying on my phone now, so I’m limited in how detailed I can be. I can give better examples later if that doesn’t make sense.
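To illustrate the extra-keywords point, here is a small sketch, again in SQLite via Python's sqlite3 so it runs without Oracle. The readings table and its values are invented for the example. With only an ORDER BY in the window, the default frame stops at the current row, so LAST_VALUE just echoes each row's own value. You need an explicit frame to get the last value of the whole partition.

```python
import sqlite3

# Hypothetical table and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (sensor TEXT, ts INTEGER, val INTEGER)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?)",
    [("s1", 1, 10), ("s1", 2, 20), ("s1", 3, 30)],
)

rows = conn.execute(
    """
    SELECT ts,
           -- Default frame ends at the current row (and its peers).
           LAST_VALUE(val) OVER (
               PARTITION BY sensor ORDER BY ts
           ) AS default_frame,
           -- Explicit frame covering the whole partition.
           LAST_VALUE(val) OVER (
               PARTITION BY sensor ORDER BY ts
               ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
           ) AS full_frame
    FROM readings
    ORDER BY ts
    """
).fetchall()

print(rows)  # [(1, 10, 30), (2, 20, 30), (3, 30, 30)]
```

Only the full_frame column gives the "last value per sensor" that people usually want from LAST_VALUE.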

Edit: I think it is an Oracle thing. Analytics are one of the areas where I will unapologetically use DB-specific functions. Portability be damned! Generally they’ve been tuned very specifically to solve a problem with the specific engine in mind. Of course that won’t always be true, but that’s my general thought process.