Fractional Differencing Derivation Walkthrough (FD Part 2)

Just a quick warning before I start: this post is going to be math-heavy. Those who are not brave enough to traverse these waters, be forewarned! Let's get right to it.

To recap, last time I talked about a few basic statistical concepts regarding time series: stationarity, memory, and reconciling the two using an idea called fractional differencing. This post walks through how we do this mathematically and gets down to brass tacks.

Before starting, I need to explain a few intermediate steps to avoid confusion.

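1. The Taylor series expansion of the function f(x) = (1+x)^d about x = 0, for any complex number d, is the binomial series:

$$(1+x)^d = \sum_{k=0}^{\infty} \binom{d}{k} x^k, \qquad \binom{d}{k} = \frac{d(d-1)\cdots(d-k+1)}{k!}$$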

Don't worry too much about the complex number part. Essentially, this is saying the following: we have a function (1+x)^d that's a bit messy to deal with, so we use a mathematical technique called Taylor series expansion to see how the function can be built up as a sum of powers of x. We'll come back to this later. For now, just take the above as a given, since we're going to need it to derive our formula for fractional differencing.
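If you'd rather see the binomial series in action than take it on faith, here is a minimal Python sketch. The function names are my own (hypothetical, not from any library), and the check assumes |x| < 1 so that the series converges. It sums the first few dozen terms and compares them with (1+x)^d computed directly.

```python
def generalized_binom(d, k):
    """Generalized binomial coefficient d(d-1)...(d-k+1) / k! for real d."""
    coeff = 1.0
    for i in range(k):
        coeff *= (d - i) / (i + 1)
    return coeff


def binomial_series(x, d, n_terms=50):
    """Partial sum of the binomial series for (1 + x)^d (assumes |x| < 1)."""
    return sum(generalized_binom(d, k) * x**k for k in range(n_terms))


x, d = 0.3, 0.5
approx = binomial_series(x, d)   # truncated series
exact = (1 + x) ** d             # direct evaluation
print(approx, exact)
```

The two printed values should agree to many decimal places, and the agreement gets worse as |x| approaches 1 because the terms shrink more slowly.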

2. The backshift operator B has the following properties:

$$B X_t = X_{t-1}, \qquad B^k X_t = X_{t-k} \quad \text{for any integer } k \geq 0$$

A backshift operator (or lag operator, as it is otherwise known) is an operator much like any other mathematical operator (think addition, subtraction, multiplication, division), but instead of acting on a pair of numbers (e.g. 4-2 or 6+7), it acts on a single number. And not just any number, but an element of a time series. Essentially, applying this operator to a time series element produces the previous element. These properties are important to keep in mind for the derivation below.
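To make the backshift idea concrete, here is a small sketch on a plain Python list. The helper name `backshift` and the sample prices are made up for illustration, and I'm assuming that positions with no earlier value are simply marked as None.

```python
def backshift(series, k=1):
    """Apply the backshift operator k times: position t gets the value from t - k.

    Positions without enough history are filled with None.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    n = len(series)
    k = min(k, n)  # shifting past the start leaves nothing but None
    return [None] * k + list(series[: n - k])


prices = [100, 102, 101, 105]
print(backshift(prices))       # each element replaced by the previous one
print(backshift(prices, 2))    # same as applying backshift twice
```

Applying the operator twice gives the same result as a single shift by two, which is the property the derivation leans on.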

The general formula we have for getting the coefficients in differencing a time series is the following:

$$(1 - B)^d$$

where B is the backshift operator and the exponent d, our differencing factor, is allowed to be any real-valued number. For example, if we were to take first differences (which, applied to log prices, gives log returns), we would set d to 1. But since we are concerned with fractional differencing, we are really looking at values of d between 0 and 1.
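For whole-number values of d, the operator (1 - B)^d is just a short polynomial in the backshift operator: d = 1 gives weights 1, -1 and d = 2 gives weights 1, -2, 1. The sketch below applies such weights to a toy series; the helper name `apply_lag_polynomial` and the prices are made up for illustration.

```python
def apply_lag_polynomial(series, weights):
    """Compute sum_k weights[k] * X_{t-k} for every t with enough history."""
    n_lags = len(weights) - 1
    return [
        sum(w * series[t - k] for k, w in enumerate(weights))
        for t in range(n_lags, len(series))
    ]


prices = [100.0, 102.0, 101.0, 105.0, 104.0]

first_diff = apply_lag_polynomial(prices, [1, -1])       # d = 1
second_diff = apply_lag_polynomial(prices, [1, -2, 1])   # d = 2

# Differencing the first differences again should match d = 2 directly.
diff_of_diff = apply_lag_polynomial(first_diff, [1, -1])
print(first_diff, second_diff, diff_of_diff)
```

Fractional differencing uses exactly the same weighted sum over lags. The only change is that the weights no longer cut off after a couple of terms.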

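Using the 2 steps above, we can group some factors together, simplify, expand and derive the following expansion of the differencing operator (1 - B)^d, with B the backshift operator:

$$(1 - B)^d = \sum_{k=0}^{\infty} \binom{d}{k} (-B)^k = 1 - dB + \frac{d(d-1)}{2!} B^2 - \frac{d(d-1)(d-2)}{3!} B^3 + \cdots$$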
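Reading the coefficient on the k-th lag straight off the binomial expansion of (1 - B)^d gives w_k = (-1)^k d(d-1)...(d-k+1) / k!. Here is a rough sketch that evaluates that closed form directly. The function name is hypothetical, and this is the brute-force version rather than anything efficient.

```python
def fracdiff_weights_closed_form(d, n_weights):
    """First n_weights coefficients of (1 - B)^d from the binomial expansion."""
    weights = []
    for k in range(n_weights):
        binom = 1.0
        for i in range(k):
            binom *= (d - i) / (i + 1)  # builds d(d-1)...(d-k+1) / k!
        weights.append((-1) ** k * binom)
    return weights


print(fracdiff_weights_closed_form(1, 4))    # ordinary first differencing
print(fracdiff_weights_closed_form(0.5, 4))  # a fractional example
```

With d = 1 every weight after the second is zero, which recovers ordinary first differencing. With a fractional d the weights never hit zero exactly, they just keep shrinking.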

Phew! That wasn't too hard, was it? There are some interesting properties that come out of this derivation. One of them allows us to compute the weights (the coefficient w_k on each lag k) iteratively, without resorting to the complicated formula above (which is going to be SUPER useful for me when I implement this in the next post). More specifically:

$$w_0 = 1, \qquad w_k = -w_{k-1} \, \frac{d - k + 1}{k} \quad \text{for } k \geq 1$$
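As a quick sanity check on the recursion w_k = -w_{k-1} (d - k + 1) / k with w_0 = 1, here is a minimal sketch. The function name is my own placeholder, not necessarily what the implementation post will use.

```python
def fracdiff_weights(d, n_weights):
    """First n_weights weights of (1 - B)^d via w_k = -w_{k-1} * (d - k + 1) / k."""
    if n_weights <= 0:
        return []
    weights = [1.0]  # w_0 is always 1
    for k in range(1, n_weights):
        weights.append(-weights[-1] * (d - k + 1) / k)
    return weights


d = 0.5
weights = fracdiff_weights(d, 4)
# Explicit formula for the 4th weight, for comparison with the recursion.
w3_explicit = -d * (d - 1) * (d - 2) / 6
print(weights, w3_explicit)
```

Each new weight costs one multiplication and one division, so there is no need to recompute factorials or long products for every lag.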

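If it's kinda hard to wrap your head around, here's an example using the 3rd and 4th weights (w2 and w3 respectively):

$$w_2 = \frac{d(d-1)}{2!}, \qquad w_3 = -w_2 \, \frac{d - 3 + 1}{3} = -\frac{d(d-1)}{2!} \cdot \frac{d-2}{3} = -\frac{d(d-1)(d-2)}{3!}$$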

Which is the coefficient for the 4th term in the expansion that we worked out above! So, we're one step closer to being able to do this in practice. In the next post, I'll be taking what we've seen in the last two posts and bringing it all together in Python. It's going to be a long one, since I'll be walking through the implementation, running a common stationarity test (the augmented Dickey-Fuller test, or ADF) to see if it actually works, and talking a bit more about the concept as a whole.

Thanks for reading, till next time!


  1. Love it! Will be following along. Currently making my foray into financial mathematics as a young finance graduate myself. Curiously seeking to discover more about the wonderful world that we live in today.

    All the best,

    1. Thanks my man, really appreciate the sentiment. It's a journey filled with a lot of sh*t in the way, but if you find it as fascinating as I do, I'm sure you're in for a treat.

  2. This is great, can't wait to see part iii.

    1. It's up already, in case you didn't know.

