Monday, May 31, 2010

Exponential Moving Average

There was a nice question on OTN today about whether there is a standard Oracle function to calculate the exponential moving average. The answer is that there is no such function, but with the model clause, you can calculate it very easy. And it's a great example of what I mean with "variable number of calculations based on calculated values", written in my third part of the model clause tutorial.

Before today, I didn't even know what an exponential moving average was exactly. You can read more about it here on Wikipedia or here with a nice example. From the first link:

An exponential moving average (EMA), applies weighting factors which decrease exponentially. The weighting for each older data point decreases exponentially, giving much more importance to recent observations while still not discarding older observations entirely.


From the second link:

The formula for calculating an Exponential Moving Average (EMA) is:

X = (K * (C - P)) + P

Where:

X = Current EMA (i.e. EMA to be calculated)

C = Current original data value

K = Smoothing Constant

P = Previous EMA

(The first EMA in the range to be calculated is arbitrary and can be the corresponding original data value or, often, a Simple Moving Average value.

Where:

K = Smoothing Constant = 2 / (1 + n)


And this formula is followed by an example which I extended a bit, using this table:

rwijk@TEST10> create table sales (product,month,amount)
2 as
3 select 'A', date '2009-01-01', 10 from dual union all
4 select 'A', date '2009-02-01', 15 from dual union all
5 select 'A', date '2009-03-01', 17 from dual union all
6 select 'A', date '2009-04-01', 20 from dual union all
7 select 'A', date '2009-05-01', 22 from dual union all
8 select 'A', date '2009-06-01', 20 from dual union all
9 select 'A', date '2009-07-01', 25 from dual union all
10 select 'A', date '2009-08-01', 27 from dual union all
11 select 'A', date '2009-09-01', 30 from dual union all
12 select 'A', date '2009-10-01', 35 from dual union all
13 select 'A', date '2009-11-01', 37 from dual union all
14 select 'A', date '2009-12-01', 40 from dual union all
15 select 'B', date '2009-01-01', 0 from dual union all
16 select 'B', date '2009-02-01', 50 from dual union all
17 select 'B', date '2009-03-01', 10 from dual union all
18 select 'B', date '2009-04-01', 40 from dual union all
19 select 'B', date '2009-05-01', 15 from dual union all
20 select 'B', date '2009-06-01', 35 from dual union all
21 select 'B', date '2009-07-01', 30 from dual union all
22 select 'B', date '2009-08-01', 30 from dual union all
23 select 'B', date '2009-09-01', 20 from dual union all
24 select 'B', date '2009-10-01', 20 from dual union all
25 select 'B', date '2009-11-01', 20 from dual union all
26 select 'B', date '2009-12-01', 20 from dual
27 /

Table created.


The records from product A match the example in the link. I made up the numbers from product B. Here is the model clause query that implements the formula. Note how the formula directly translate into the one and only rule of the model clause. The "smoothing constant" K is set to .5, based on a window of values (n) equalling 3.

rwijk@TEST10> select product
2 , month
3 , amount
4 , round(ema,3) "Exponential Moving Average"
5 from sales
6 model
7 partition by (product)
8 dimension by (month)
9 measures (amount,0 ema)
10 ( ema[any] order by month
11 = nvl2
12 ( ema[add_months(cv(),-1)]
13 , ( .5 * (amount[cv()] - ema[add_months(cv(),-1)])) + ema[add_months(cv(),-1)]
14 , amount[cv()]
15 )
16 )
17 order by product
18 , month
19 /

PRO MONTH AMOUNT Exponential Moving Average
--- ------------------- ---------- --------------------------
A 01-01-2009 00:00:00 10 10
A 01-02-2009 00:00:00 15 12,5
A 01-03-2009 00:00:00 17 14,75
A 01-04-2009 00:00:00 20 17,375
A 01-05-2009 00:00:00 22 19,688
A 01-06-2009 00:00:00 20 19,844
A 01-07-2009 00:00:00 25 22,422
A 01-08-2009 00:00:00 27 24,711
A 01-09-2009 00:00:00 30 27,355
A 01-10-2009 00:00:00 35 31,178
A 01-11-2009 00:00:00 37 34,089
A 01-12-2009 00:00:00 40 37,044
B 01-01-2009 00:00:00 0 0
B 01-02-2009 00:00:00 50 25
B 01-03-2009 00:00:00 10 17,5
B 01-04-2009 00:00:00 40 28,75
B 01-05-2009 00:00:00 15 21,875
B 01-06-2009 00:00:00 35 28,438
B 01-07-2009 00:00:00 30 29,219
B 01-08-2009 00:00:00 30 29,609
B 01-09-2009 00:00:00 20 24,805
B 01-10-2009 00:00:00 20 22,402
B 01-11-2009 00:00:00 20 21,201
B 01-12-2009 00:00:00 20 20,601

24 rows selected.


Challenge: try this without the model clause and see if you can come up with something more comprehensive ...