Satellite images captured by optical remote sensors usually contain noise due to weather conditions and changing solar illumination throughout the year [25, 26]. When performing time-series smoothing, it is critical to maintain the original characteristics of the time-series profile [25]. As with other high spatial resolution imagery, the trade-off is that HJ-1 A/B time-series data are spaced irregularly in time. Signal processing techniques (e.g., wavelet and Fourier analysis) require that the time-series data have regular, equidistant spacing [27, 28]; hence, they may not work well on the non-equidistantly spaced time-series data derived from the HJ-1 A/B satellites.
Moreover, remote sensing VI products such as MODIS and SPOT-VGT are organized in fixed-day intervals (e.g., 8- or 10-day composites). The processing methods available in time-series software (e.g., TIMESAT) do not function for unevenly distributed time-series, and function fitting or Fourier-based filters may be problematic when applied to irregular VI time-series [29]. Therefore, in this section, we first introduce the Savitzky-Golay (S-G) smoothing method. Other smoothing methods were then tested for comparison to assess its relative performance. Finally, a missing-data interpolation is proposed to ensure regular spacing on a daily basis and to generate daily NDVI images. These steps were implemented in IDL (Interactive Data Language).
S-G smoothing method
Time-series smoothing is needed to retrieve the essential shape of a curve. In this paper, the Savitzky-Golay (S-G) smoothing method was employed to accommodate the irregular spacing in the HJ-1 A/B NDVI time-series. The S-G filter, also known as the least-squares or digital smoothing polynomial filter, can be used to smooth a noisy signal [30]. The algorithm is described below:
$$ {g}_i=\sum_{n=- nL}^{nR}{c}_n{f}_{i+n}/n $$
(1)
where \( f_i \) represents the original data value in the time-series, and \( g_i \) is the smoothed value, which is the linear combination of \( c_n \) and \( f_i \). Here, n is the width of the moving window to perform filtering, and nL and nR correspond to the left and right edge of the signal component. If \( c_n \) is a constant defined as \( c_n = 1/(nL + nR + 1) \), then the S-G filtering becomes a moving average smoothing. The idea of S-G filtering is to find filtering coefficients \( c_n \) that preserve higher moments. Therefore, in Eq. (2), \( c_n \) is not a constant but a polynomial fitting function, typically quadratic or quartic. Then a least-squares fit is solved, ranging from nL to nR to obtain \( c_n \). For a specific dataset of a time series in a moving window, we define the fitting function as a quadratic polynomial for fitting a specific range of \( f_i \):
$$ {c}_n(t)={c}_1+{c}_2t+{c}_3{t}^2 $$
(2)
where t corresponds to the day of the year in the NDVI time series. Therefore, the smoothed value \( g_i \) can be obtained via Eq. (1). The result of the S-G smoothing method is shown in Fig. 2.
Comparison between S-G and other methods
Several commonly used smoothing methods were tested as follows. The results indicate that the S-G filter performed better than these alternatives on the HJ-1 A/B time series.
(1) Global function fitting
The SPLINE-curve fitting is a global function fitting method for smoothing discrete data. By forming a polynomial equation, a smoothed function curve is obtained to represent the discrete data. Likewise, other function fitting methods, such as the asymmetric Gaussian model, have been adopted for fitting AVHRR-NDVI time-series data [1]. In this study, we applied a SPLINE-curve fitting, \( y = ax^3 + bx^2 + cx + d \), and a Gaussian function fitting, \( y = a\,e^{-\frac{(x-b)^2}{c}} \), to the HJ-1 A/B time-series data for smoothing (Fig. 3). The curve in Fig. 3 (a) appears to fit well but does not maintain the essential shape of the time-series trajectory; in Fig. 3 (b), the Gaussian function did not perform well in fitting a double-cropping area, which had two growth cycles over time. These results suggest that global function fitting methods are not suitable for unevenly spaced time-series data.
(2) Signal denoising
Viewing the time-series data as a signal, the fast Fourier transform (FFT) and the wavelet transform (WT) were adopted to handle the HJ-1 A/B time-series data. FFT and WT have already been applied to MODIS VI time series to retrieve a smoothed trajectory of the vegetation growth cycle [8]. We programmed IDL-based FFT and WT functions and applied them to the HJ-1 A/B time-series data. However, as mentioned before, signal denoising (with FFT and WT) did not perform well for unevenly spaced time-series data (Fig. 4); regardless of how the denoising parameters were set, these methods neither maintained the original shape nor preserved the original dates in the time series.
(3) HANTS method
HANTS (Harmonic Analysis of NDVI Time-Series) is a commonly used tool for smoothing time-series remote sensing data (http://www.un-spider.org/links-andresources/gis-rs-software/hants%C2%A0harmonic%C2%A0analysis-of%C2%A0time%C2%A0series-nlrgdsc). HANTS can be used to remove cloud effects, smooth the dataset, interpolate missing data, and compress the data. Although the HANTS method could generate a visually pleasing time series, it has the same problem as the FFT and WT signal denoising methods. For unevenly spaced HJ-1 A/B time-series data, HANTS tended to maintain the spatial completeness of a pixel profile (as in cloud removal) but to sacrifice the temporal characteristics. Moreover, the HANTS method did not preserve the original dates in the time series; temporal characteristics revealing critical phenological details were obscured (Fig. 5).
Generating daily NDVI images
S-G filtering was employed to smooth HJ-1 A/B NDVI time-series data, to ensure a continuous and complete time-series dataset. A feasible approach was then proposed to ensure regular spacing on a daily basis. Linear interpolation is a simple interpolation method commonly used in mathematics and computer science. This paper developed a locally adaptive linear interpolation to generate missing data throughout the NDVI time series. Missing data between two images can be generated by Eq.(3)–(4):
$$ \frac{NDVI-{NDVI}_0}{DOY-{DOY}_0}=\frac{NDVI_1-{NDVI}_0}{DOY_1-{DOY}_0} $$
(3)
$$ NDVI={NDVI}_0+\left({NDVI}_1-{NDVI}_0\right)\times \frac{DOY-{DOY}_0}{DOY_1-{DOY}_0} $$
(4)
where NDVI represents the value on the missing day DOY to be interpolated, and \( NDVI_0 \) and \( NDVI_1 \) represent the valid images, acquired on days \( DOY_0 \) and \( DOY_1 \), that are used for the interpolation. The NDVI between \( NDVI_0 \) and \( NDVI_1 \) can therefore be treated as following a linear relationship and generated according to Eq. (4).
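As an illustration of Eq. (4), the following minimal Python sketch fills every missing day between consecutive valid acquisitions for a single pixel. The function name `daily_ndvi` and the sample values are hypothetical, and the sketch assumes integer, strictly increasing day-of-year values; it is not the IDL code used in this paper.

```python
def daily_ndvi(doy, ndvi):
    """Linearly interpolate NDVI to every day between the first and last date."""
    days, values = [], []
    for k in range(len(doy) - 1):
        d0, d1 = doy[k], doy[k + 1]
        v0, v1 = ndvi[k], ndvi[k + 1]
        for d in range(d0, d1):
            # Eq. (4): NDVI = NDVI0 + (NDVI1 - NDVI0) * (DOY - DOY0) / (DOY1 - DOY0)
            days.append(d)
            values.append(v0 + (v1 - v0) * (d - d0) / (d1 - d0))
    # The last valid acquisition closes the series.
    days.append(doy[-1])
    values.append(ndvi[-1])
    return days, values


# Three hypothetical acquisitions on days 100, 104 and 110.
days, values = daily_ndvi([100, 104, 110], [0.2, 0.6, 0.3])
print(list(zip(days, values)))
```

Each interval is interpolated using only its two bounding acquisitions, so the values on the original dates are left unchanged.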
The smoothing performance and interpolation accuracy were evaluated by 1:1-line comparison, as shown in Fig. 6; both showed a good fit.
Code design
Most commonly used remote sensing software (e.g., ERDAS, ENVI) does not provide functionality for manipulating time-series data; at present, TIMESAT and SPIRITS are also not designed to handle time-series data of such relatively high spatial resolution. Additionally, we have not found an executable processing framework that allows researchers to obtain a high spatiotemporal resolution time-series dataset that meets their research demands. Although the methodology described above is easily implemented for a one-dimensional array on most programming platforms, the question raised here is how to perform smoothing and interpolation for time-series data held in a three-dimensional array. IDL, an array-oriented language with numerous mathematical analysis and graphical display techniques, is an ideal programming language for image data analysis, visualization, and cross-platform application development (http://www.harrisgeospatial.com/ProductsandTechnology/Software/ENVI.aspx). Anyone working with imagery or raster data has probably encountered ENVI software; its library routines are IDL-based functions and procedures. Because IDL programs can call ENVI functions that operate on remote sensing images, we developed a program to achieve filtering and interpolation of three-dimensional time-series data.
As introduced in the S-G smoothing method and Generating daily NDVI images sections, this program mainly consists of two steps: time-series filtering and image interpolation. The program was developed with the IDL function library, using its Savitzky-Golay filtering and interpolation routines. It treats a time-series dataset as a three-dimensional array, calling ENVI functions to manipulate individual images; the program loops pixel by pixel to extract the time dimension and perform filtering and interpolation.
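The pixel-by-pixel loop over a three-dimensional stack can be sketched as below. This is a hedged Python illustration of the control flow only, not the authors' IDL/ENVI program. The names `process_cube`, `linear_daily`, and `identity` are hypothetical, the stack is a plain nested list indexed as `cube[time][row][col]`, and an identity function stands in for the S-G filter to keep the example short.

```python
def linear_daily(doy, values):
    """Placeholder interpolation: one linearly interpolated value per day."""
    out = []
    for k in range(len(doy) - 1):
        span = doy[k + 1] - doy[k]
        for step in range(span):
            out.append(values[k] + (values[k + 1] - values[k]) * step / span)
    out.append(values[-1])
    return out


def identity(doy, profile):
    """Stand-in for the S-G filter; returns the profile unchanged."""
    return profile


def process_cube(doy, cube, smooth, interpolate):
    """Filter and interpolate each pixel profile of cube[time][row][col]."""
    n_rows, n_cols = len(cube[0]), len(cube[0][0])
    n_days = doy[-1] - doy[0] + 1
    daily = [[[0.0] * n_cols for _ in range(n_rows)] for _ in range(n_days)]
    for r in range(n_rows):
        for c in range(n_cols):
            # Extract the time dimension for this pixel.
            profile = [image[r][c] for image in cube]
            series = interpolate(doy, smooth(doy, profile))
            # Write the daily values back into the output stack.
            for d, value in enumerate(series):
                daily[d][r][c] = value
    return daily


# Hypothetical stack: 3 acquisition dates, 1 row, 2 columns.
doy = [1, 3, 6]
cube = [[[0.0, 1.0]], [[0.2, 1.0]], [[0.5, 0.4]]]
daily = process_cube(doy, cube, identity, linear_daily)
print(len(daily), daily[1][0][0], daily[3][0][1])
```

Passing the filter and the interpolator as arguments keeps the loop independent of the particular smoothing window or interpolation method chosen.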
As described in the S-G smoothing method section, the S-G filter tends to minimize overall noise in the NDVI time-series while preserving the original trajectory. The IDL program requires users to define the width of the moving window and the degree of the polynomial fit used in S-G filtering. The interpolation in the IDL program includes three commonly used methods: (1) simple linear interpolation, as described in Eq. (3)–(4); (2) a least-squares quadratic fit for each four-point neighborhood (x[i-1], x[i], x[i + 1], x[i + 2]) surrounding the interval; and (3) a SPLINE fit, which fits a polynomial function to the four surrounding points. Users can choose among the interpolation methods to achieve the best result. The overall schematic of the functionality of this program is shown in Fig. 7. Since the interpolation was applied locally in the time series within a defined interval, the results suggest that the essential shape of the NDVI trajectory was well maintained.
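Interpolation option (2) can be illustrated with the following minimal Python sketch of a least-squares quadratic fit over the four-point neighborhood surrounding the interval that contains the target day. The function names (`fit_quadratic`, `lsquadratic_interp`), the clamping of the neighborhood at the ends of the series, and the sample values are assumptions made for the example; the sketch requires at least four samples and is not the IDL routine itself.

```python
def fit_quadratic(xs, ys):
    """Least-squares coefficients [a, b, c] of y = a + b*x + c*x**2."""
    s = [sum(x ** k for x in xs) for k in range(5)]
    rhs = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    m = [[s[r + c] for c in range(3)] + [rhs[r]] for r in range(3)]
    # Gauss-Jordan elimination with partial pivoting on the normal equations.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [a - f * b for a, b in zip(m[r], m[col])]
    return [m[r][3] / m[r][r] for r in range(3)]


def lsquadratic_interp(x, y, u):
    """Interpolate at u from a quadratic fitted to the four surrounding samples."""
    # Find i such that x[i] <= u < x[i + 1], staying inside the series.
    i = 0
    while i + 1 < len(x) - 1 and x[i + 1] <= u:
        i += 1
    # Neighborhood x[i-1], x[i], x[i+1], x[i+2], clamped at the series ends.
    start = min(max(i - 1, 0), len(x) - 4)
    xs = [v - u for v in x[start:start + 4]]  # center on u: intercept = value at u
    ys = y[start:start + 4]
    return fit_quadratic(xs, ys)[0]


# Hypothetical acquisitions; estimate NDVI on day 120.
x = [100, 108, 115, 131, 140]
y = [0.30, 0.35, 0.42, 0.40, 0.33]
print(lsquadratic_interp(x, y, 120))
```

Because four points are fitted with three coefficients, the curve does not generally pass through the samples, which gives some additional local smoothing compared with the linear option.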