How do I select every pair of consecutive events in Hive?
Imagine I have a Hive table T with consecutive events:
n
---
1
2
3
4
...
I need to write a query that selects every pair of consecutive events from
this table. Currently I have a solution like:
select t1.n, min(t2.n) from t t1 join t t2 where t1.n < t2.n group by t1.n;
This is very inefficient even for a relatively small table (thousands of
rows), because it produces a temporary Cartesian product of the table with
itself (i.e. O(n^2) complexity).
I would like to find a less expensive (hopefully linear) solution to the
same problem.
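One direction that might avoid the self-join is the LEAD windowing function, sketched below. This is an untested sketch, not a confirmed answer: it assumes a Hive version with windowing support (0.11 or later) and that the values of n are distinct; the alias next_n and the subquery name pairs are made up for illustration. The work is dominated by a sort on n, so it is closer to O(n log n) than strictly linear, and with no PARTITION BY clause all rows are ordered in a single partition.

```sql
-- Sketch only: assumes Hive 0.11+ (windowing functions) and the table t(n)
-- from the question, with distinct values of n.
SELECT n, next_n
FROM (
  SELECT n,
         LEAD(n) OVER (ORDER BY n) AS next_n  -- n from the following row in sorted order
  FROM t
) pairs
WHERE next_n IS NOT NULL;  -- the last event has no successor, so drop it
```

For the sample data above this should pair each event with the one after it (1 with 2, 2 with 3, and so on), which matches what the join query returns when n has no duplicates.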