Intelligent Robot LabBrown University, Providence RI
The Fourier Basis
The Fourier basis is a simple, principled basis function scheme for linear value function approximation in reinforcement learning. It has performed well over a wide range of problems, despite its simplicity, and I now use it almost exclusively. My advice to fellow researchers who require function approximation is that the Fourier basis should be the first thing you try.
If a problem is small enough, the Fourier basis avoids the need for extensive problem-specific feature engineering. For example, we have achieved good results on the Acrobot problem using a Fourier basis of order between 5 and 9; contrast this with the two paragraphs describing a problem-specific tile-coding originally used to solve the Acrobot. For larger problems, the Fourier basis provides a generic but complete basis function scheme suitable for feature selection.
For a single continuous state variable, , (scaled to ), the th order Fourier basis is the set of basis functions defined as:
for . Thus, an order Fourier basis produces basis functions, the first having value 1 everywhere. The following pictures show a few example basis functions; note that, as increases, the basis functions become higher frequency.
For multivariate state spaces, where
where each cj is a length
A few example basis functions for a 2D domain are shown below. Each element of the
One caveat: when performing gradient-descent style value function approximation (such as TD()), we must deal with the fact that some Fourier basis functions have much higher frequencies than others, and scale their gradient descent step sizes appropriately. In practice, we've found that setting , where is the base learning rate assigned to the basis function with , performs reliably well.
An th order Fourier basis in a -dimensional space has basis functions, and thus suffers the combinatorial explosion in exhibited by all complete fixed basis methods. In a domain where is sufficiently small - perhaps less than 6 or 7 - we may simply pick an order and enumerate all basis functions. This amounts to an expectation that components of the value function with frequency or less are signal, and components with higher frequency are noise. For higher dimensional domains, we may reduce the size of the basis by placing restrictions on the vectors. For example, we can include only coefficient vectors with two or fewer non-zero elements; this is analogous to tile coding pairs of variables.
The paper describing the Fourier Basis is:
Here is a list of papers using the Fourier basis, courtesy of Google Scholar.