# Univariate

Univeriate curve-fitting, interpolation, polynomial, spline, Akima

**Contents**

## Overview

*Univariate*

*interpolation*is an area of curve-fitting which, as opposed to univariate regression analysis, finds the curve that provides an

*exact*fit to a series of two-dimensional data points. It is called univariate as the data points are supposed to be sampled from a one-variable function. Compare this to multivariate interpolation, which aims at fitting data points sampled from a function of several variables. Formally speaking, consider a series of data points and, for the sake of simplicity, consider that , i.e. the points are distinct and are in increasing order with respect to . By interpolating these data points we mean finding a function such that: To make things even clearer, consider the following example. Let the data points be , shown in red in the image below. As you may notice, we can choose to be the interpolation function, which is also shown in blue in the previous graph. It clearly satisfies the constraints of equation (1), but it's not necessarily the most straightforward choice. We might as well choose the constant function , which also satisfies the constraints of equation (1) and is therefore a valid interpolation function for the given data points. As a conclusion, we are allowed to choose any type of function as long as it provides an exact fit to the given data points. Our choice should be based on some prior knowledge of the phenomena which generated that series of points.

## Polynomial Interpolation

In the following, let us assume that the interpolation function is polynomial, i.e. is of the type where is called the*degree*of and are some real numbers, called the

*coefficients*of . In order to find the expression for , it suffices to find its coefficients. We are able to find them by writing the constraints of equation (1) in our particular case: This is a system of linear equations with unknowns, and equations. In order to have a unique solution, we need to require that , or ; that is, we require that the degree of should be equal to the number of data points minus one. However, this is a worst case scenario since, by solving the above linear system, one may obtain which will decrease the degree of by one, and so on. As a general rule, the degree of will be strictly less than the number of given data points. The function determined by solving the above system of linear equations is called the

*interpolation*

*polynomial*for the series of data points . Directly solving this system of linear equations comes down to finding the inverse of its corresponding matrix, which might become computationally expensive. Another easier way of finding the coefficients of is by writing it in the

*Lagrange*

*form*where is defined through for all . It can easily be shown that and , for all ; therefore , so satisfies the constraints of equation (1). This provides a method of computing the interpolation polynomial known as

*Lagrange*

*interpolation*, which is also implemented as the component Interpolation/Lagrange. Consider the function with in the interval and let us sample 12 equidistant points from this function. Applying Lagrange interpolation on this series of data points results in the following approximation (shown in red):

## Runge's Phenomenon

When trying to estimate the error between the original function, from which the series of data points has been sampled, and the polynomial interpolation function , one may notice the following phenomenon. Conside the Runge function: Now, consider a series of equidistant points between and , where for all . Runge proved that the polynomial interpolation function corresponding to this set of data oscillates toward the end points of the interval . More than that, the interpolation error at the ends of the interval tends to increase to infinity as you increase the number of equidistant data points. This is also known as*Runge's*

*phenomenon*. In the image below, the Runge function is depicted in red, while the 5-th degree (6 points) and 9-th degree (10 points) interpolation polynomials are shown in blue and green, correspondingly. Notice how the approximation error at the ends of the interpolation interval increases with the degree of , or with the number of given equidistant points. As a conclusion, polynomial interpolation in the case of a large number of equidistant points might generate large approximation errors toward the end points of the interpolation interval. This phenomenon can be avoided using spline interpolation, which is the subject of the next section.

## Spline Interpolation

Spline interpolation is somehow a generalization of polynomial interpolation, in that we do not necessarily have to find a single polynomial function to fit the data over the entire interval, but we rather try to find several polynomial functions to fit the data over each subinterval determined by two consecutive data points, while obeying some smoothness conditions. One of the advantages of this generalization is that the resulting interpolation function is less wiggly, as in the case of e.g. Lagrange interpolation. Formally, consider the series of data points in the above paragraphs, which are distinct and ordered increasingly with respect to . A*spline*

*interpolation*

*function*of degree for the given data points is a function which satisfies the following conditions

- , for all and all , where is some polynomial function with degree less than or equal to
- the derivatives of up to the order are all continuous in the given data points, which basically means that for all we require that:

*natural*

*cubic*

*spline*. Other kinds of cubic splines can be obtained using other pairs of conditions. In the case when , is piecewise linear and the method is called

*linear*

*interpolation*, which is also implemented as Interpolation/Linear. In the graph below you can see the way linear interpolation (in red) works in the case of 12 equidistant points sampled from the function , previously defined. In the case when , is comprised of cubic polynomial functions on each subinterval and the method is called

*cubic*

*interpolation*, which is also implemented as Interpolation/Cubic. The following graph shows the cubic spline (in red) for the function. Notice that higher degree spline interpolation results in smaller approximation errors. However, in order to escape from Runge's phenomenon, instead of increasing the degree of the spline, it is prefered to increase the number of data points. This can be achieved, for instance, by doing further experiments and obtaining further input data.

*Akima*

*interpolation*is a particular type of third-degree spline interpolation, also implemented as Interpolation/Akima. Its main advantage is that, as opposed to cubic spline interpolation, it is applicable on successive intervals, so it does not require solving large systems of linear equations. Apart from being more computationally efficient, it also provides a more natural interpolation curve, closer to human intuition. The graph below shows the Akima interpolation curve for the function.

## References

- George M. Philips, Interpolation and Approximation by Polynomials, Springer-Verlag, New York, 2003.
- http://en.wikipedia.org/wiki/Curve_fitting
- http://en.wikipedia.org/wiki/Interpolation
- http://en.wikipedia.org/wiki/Runge%27s_phenomenon
- http://en.wikipedia.org/wiki/Spline_interpolation
- Hiroshi Akima, A New Method of Interpolation and Smooth Curve Fitting Based on Local Procedures, Journal of the ACM, Vol. 17, No. 4, October 1970, pp. 589-602.
- http://www.iue.tuwien.ac.at/phd/rottinger/node60.html