Abstract
Exponential moving averages (EMA) is a way to remove noise from data series. Unfortunately, useful straightforward spreadsheet support for EMA is absent. This article examines the problems involved and proposes a one line formula solution to add EMA to a spreadsheet.Keywords
Exponential Moving Average, EMA, Noise Removal, Filtering, Google Sheets, Spreadsheets.Introduction
Traditionally, implementing an exponential moving average has been done in spreadsheets using a recursive formula, i.e., an auto-regressive (AR) process or an infinite impulse response (IIR) filter. The formula is the following:\[y(n) = \alpha \cdot x(n) + (1 - \alpha) \cdot y(n-1)\]
There are some problems with this approach:
- Strictly speaking, this is not a moving average. Moving average assumes a finite length sliding window under which data is being weighted. There is a growing window, not a sliding window.
- The last item implies that the average takes into account every single data sample. The previous formula actually implements an IIR filter. This kind of averaging never forgets a value, although old values certainly get irrelevant over time. It would be nice to have control over the window where the exponential average takes place, especially when one intends to use a small window.
- It does not deal properly with missing values. In a spreadsheet, it is common to have missing data values and the previous approach does not allow them. Converting missing values to zero would unacceptably distort the average value. If one insists in applying this formula to the series by packing the original series, the weights applied to the values in the averaging process will not reflect the actual distance in time that these samples might have.
Proposed solution
The function SERIESSUM(a, n, m, x) is defined as\[SERIESSUM(a, n, m, x) = \sum_{i=0}^n x_i a^{n+m i}.\]
The proposed solution is to use this function twice, first to calculate the weighted sum, and then a second time to calculate the sum of the weights where data is not missing.
Assume that:
- Cell F1 contains the geometric progression ratio \((\alpha)\);
- Cell F2 contains the window size \((n)\);
- Column B contains the raw data;
- The current cell is B22.
\[\begin{array}{l}
=\\
SERIESSUM($F$1, $F$2, -1, \\
\qquad ARRAYFORMULA(N(OFFSET(B22, -$F$2 + 1, 0, $F$2, 1))))\\
/ \\
SERIESSUM($F$1, $F$2, -1, \\
\qquad ARRAYFORMULA(\\
\qquad \qquad N(ISNUMBER(OFFSET(B23, -$F$2 + 1, 0, $F$2, 1)))))
\end{array}\]
The details of this expression are as follows:
- \(OFFSET(current\_cell, -n+1, 0, n, 1)\) is used to produce a range of \(n\) cells, of which the current cell is the last one.
- \(ARRAYFORMULA(N(OFFSET(\dotsc)))\) will apply the \(N()\) function to each element of the argument range to generate a new range with zero values in the missing data cells. Without this trick, \(SERIESSUM()\) would use non-missing values as if they were contiguous.
- \(ARRAYFORMULA(N(ISNUMBER(OFFSET(\dotsc))))\) will generate a range composed of ones where data is not missing and zeros where data is missing.
Results
The formula has been tested against some weighting data. The resulting spreadsheet has a plot of the original data, along with the \(AVERAGE()\) data and EMA data for comparison.
Conclusion
A spreadsheet formula for the correct calculation of an exponential moving average has been derived and successfully tested on Google Sheets. The proposed formula deals with missing values in a way similar to the \(AVERAGE()\) function, avoiding the distortions that would be caused either by using zero in place of the missing values or by packing the original series.
Thank you! I was about to do the same spreadsheet. I suppose you read John Walker too?
ReplyDeleteHi Savio,
DeleteI did read it, indeed, but the diet I used to get those results is not what he advocates. I did not count calories, I just cut out all carbs and went ketogenic!
But the exponential moving average is a great tool, it captures details that the pure average does not.
Regards!
This comment has been removed by the author.
ReplyDeleteThis comment has been removed by the author.
ReplyDeleteI am using this formula to calculate EMA for Financial Instruments. BUT there is a problem.
ReplyDeleteThe geometrical progression ratio is not a constant. It depends on the period length:
alpha = 2 * (n + 1)
n is the period. In my case n is the number of days (9, 20, 50, 200).
When n = 200, the multiplier alpha = 0.009950248756
In this case, the Series Sum have a wrong return.
I am using the formula on Google Spreadsheet
Mathematical Reference: https://en.wikipedia.org/wiki/Moving_average#Exponential_moving_average
If I am right considering this a problem, how could it be managed?
Kind regards
G. Aloe
I have forgotten 1 detail. This is how I am using your formula by the use of GoogleFinance() in Arrayformula and not Offset on cells.
Delete=SERIESSUM($G$1, $G$2-1, -1, ARRAYFORMULA(N(QUERY(GoogleFinance( L22, "price", WORKDAY( TODAY(), -$G$2), $G$2), "select Col2 label Col2 ''", 1)))) / SERIESSUM($G$1, $G$2-1, -1, ARRAYFORMULA(N(ISNUMBER(QUERY(GoogleFinance( L22, "price", WORKDAY( TODAY(), -$G$2), $G$2), "select Col2 label Col2 ''", 1)))))
This comment has been removed by the author.
DeleteOne more thing. You have done a great job with the arrayformula to calculate EMA by Sum of Series.
ReplyDeleteI am in struggle to find a way to calculate the DOUBLE Exponential Moving Average (DEMA) on Google Spreadsheet.
The reason is that I am still not skilled enough on how to store the Series of n EMAs to use to calculate the EMAn(EMAn)
Here the formula:
DEMAn = 2∗EMAn - EMAn(EMAn)
where:
n=look-back period
Here the Mathematical Reference: https://www.investopedia.com/terms/d/double-exponential-moving-average.asp
Maybe you can do much better than me in the finding of a solution that can be used on Google Sheets, making the full calculation in only 1 cell OR developed by partial calculations that are all in the same row.
Thank you very much for your support and for this post that helped me.
Kind regards
G. Aloe
Nice tutorial and explanation.
ReplyDeleteCan you explain what value stored in F1 and F2 and how the values are derived , please.
ReplyDeleteHi, sorry for the question, I'm a total novice. I'm trying to create a spreadsheet that pulls historical price data and calculate the EMA values based on different time periods (10day EMA, 15day EMA, 50day EMA). I see that your example uses a list of values but is there a simpler way to use "array_constrain" to pull the data via "GoogleFinance" so the formula can be used in a single line without building a dataset?
ReplyDeleteAn SMA formula uses the following for a 50day (originally meant for a 200day): =average(index(array_constrain(sort(GoogleFinance(B1,"price",workday(today(),-220),today()),1,false),51,2),"",2))