Remote sensing of burned areas via PCA, Part 2: SVD-based PCA using MODIS and Landsat data
Open Geospatial Data, Software and Standards volume 2, Article number: 21 (2017)
Abstract
Background
Singular value decomposition (SVD), as an alternative solution to principal components analysis (PCA), may enhance the spectral profile of burned areas in satellite image composites.
Methods
In this regard, we combine the preprocessing options of centering, non-centering, scaling, and non-scaling the input multispectral data prior to the matrix decomposition, and treat their combinations as four different SVD-based PCA versions. Using both uni-temporal and bi-temporal data sets, we test all four combinations to derive principal components. We assess the effects of the transformations based on multi-response permutation procedures and quantify the enhanced spectral separability between burned areas and other major land cover classes via the Jeffries-Matusita metric. Lastly, we evaluate all principal components visually and numerically and select a subset of interest.
Results
The best transformation for the subset of selected components is the uncentered-unscaled one.
Conclusions
The results indicate that an uncentered and unscaled SVD may improve the spectral separability of burned areas in some of the higher order components.
Background
In the article “Remote sensing of burned areas via PCA, Part 1: centering, scaling and EVD vs SVD” [1], we present in depth the concepts of PCA [2]; past scientific literature on PCA in remote sensing applications [3]; the link of PCA to burned area mapping [4]; and the implications of centering and scaling [5]. Finally, we suggest that the uncentered-unscaled SVD-based PCA variant may further improve the spectral enhancement of burned area clusters compared to the conventional centered and EVD^{1}-based PCA.
In multispectral imagery, burned areas build homogeneous clusters of low internal heterogeneity. Their mean spectral value is distanced from the composite’s overall mean and they present lower projections, in some dimensions, in both uni- and multi-temporal composites. In the latter case, it is worth noting that burned surfaces are absent from the pre-fire dimensions.
The preprocessing options to center and scale the image composites before the matrix decomposition can be combined in different ways [2]. Their application influences the transformation of the spectral properties of burned area clusters. The impact of the transformations is most evident in some of the higher order principal components. A non-centered SVD captures in the first component greater amounts of information around the mean value of the input composite [5]. This can be advantageous in isolating burned clusters in some of the higher order components. Not scaling the input data may also allow subtle, yet useful, transformations applied to the initial data set to be expressed in the restructured principal components. In this article, we demonstrate numerically the theoretical concepts of spectrally enhancing remotely sensed burned areas via SVD-based PCA. We apply and discuss the performance of four SVD versions. In addition, we go through an example-based quantitative discussion on the selection of the best principal components obtained via SVD.
Data
Within the first weeks after large wildfires cease, burn scars absorb higher amounts of solar energy. Compared to other surfaces, they present lower reflectance values in both Near-infrared (NIR) and Mid-infrared (MIR) bands (Fig. 1) and appear, as expected, darker than older burns. Therefore, post-fire multispectral imagery needs to be acquired promptly after the fires cease. Pre-fire imagery in multi-temporal data sets is best acquired within the same season as the post-fire images, in order to keep the inter-seasonal reflectance variation of landscape features as low as possible. Generally, all scenes should be as cloud-free as possible over large fire-affected regions in order to obtain more accurate results.
Based on the above, we analyse daily MODIS Terra L2G (MOD09GA)^{2} and Landsat-5 TM surface reflectance products (Figs. 2 and 3) over Peloponnese and Mt Parnitha in Greece, respectively (Fig. 4). The selected MODIS acquisitions are a post-fire scene in summer 2007 (Julian day 242)^{3} and a pre-fire scene in summer 2006 (Julian day 239)^{4}. MOD09 products are estimates of the surface spectral reflectance for each designated MODIS band and are already atmospherically corrected. Variance-covariance and correlation coefficients for the selected input surface reflectance bands are presented in Table 1.
It is worth mentioning that MODIS band 5 (1.240 μm) is a very good discriminator with respect to the spectral response of burned areas (see sampled burned areas in Fig. 5 and refer to [6, 7]). However, in the acquired scene, band 5 is striped, likely due to a calibration artefact causing anomalously high reflectance values [6]. Experimental transformations of data sets that included band 5 yielded noisy components. Therefore, this band has been excluded entirely from the analyses.
The Landsat-5 TM scenes^{5} were acquired in summer 2007 (Julian day 248, post-fire)^{6} and in summer 2003 (Julian day 237, pre-fire)^{7}. These are already preprocessed Level-1^{8} data, delivered as scaled digital numbers. Since we do not cross-compare data from different sensors, and burned areas feature distinct spectral profiles, no further preprocessing was performed.
The selected MODIS scenes (Fig. 2) cover the Peloponnese peninsula (South Greece) with a total surface of 22,068 km² (mainland of about 21,405 km² incl. surrounding islands to the east and south). The Landsat-5 TM products (Fig. 3) illustrate a region north of Athens, including Mt Parnitha, of about 1027 km². Both areas were severely damaged by large and uncontrolled wildland fires at the end of the summer of 2007.
Tools
The employed methods were performed using free and open source software. Geospatial processing was performed using GRASS GIS [8], QGIS [9] and FWTools [10]. The SVD-based PCA algorithm was applied via R’s function prcomp [11]. Multi-Response Permutation Procedures (MRPP) statistics were estimated using the mrpp and meandist functions, part of the R package vegan [12]. The J–M index was implemented via custom R functions.
Methods
In the context of spectrally enhancing burned area clusters, we present uni- and bi-temporal study data sets. We then label the four SVD-based PCA versions used to derive principal components. Next, we describe the use of multi-response permutation procedures to assess the effects of all transformations applied, namely centering, scaling and SVD itself. In addition, we refer to the Jeffries-Matusita spectral distance metric as a tool to quantify the separability between burned area and other major land cover class samples. Lastly, we overview an evaluation process for selecting principal components in which burned areas are spectrally enhanced. The complete workflow is visualised in Fig. 6.
Samples of burned areas and major land cover classes
Firstly, we delineated 42 samples of burned areas and numerous samples of vegetation and water bodies. Secondly, we extracted urban surfaces (greater than 200 ha) and bare ground samples from the CORINE 2000 land data map [13]. The samples, visualised in Fig. 7, are of both regular and irregular shape and consist of at least 17 pixels^{9}. We avoided digitising large and mixed samples that could result in high internal class heterogeneity.
Uni-temporal and bi-temporal composites
We define the following multispectral data sets:

1. Two uni-temporal post-fire sets: (a) a MODIS set built from bands 1, 2, 6, 7 (in Fig. 2) and (b) a Landsat-5 TM set composed of bands 1, 2, 3, 4, 5, 7 (in Fig. 3)

2. Two bi-temporal sets: (a) a MODIS composite built from pre- and post-fire bands 2, 6, 7 (in Fig. 2) and (b) a Landsat-5 TM composite using pre- and post-fire bands 2, 4, 7 (in Fig. 3)
The MODIS bands 1 and 2 were downscaled to 500 m to match the resolution of bands 6 and 7. The data sets will be cross-referenced as 1a, 1b, 2a and 2b hereafter. Scatterplot matrices for the samples in Fig. 7, extracted from both the uni-temporal and bi-temporal MODIS composites, are visualised in Figs. 12 and 13.
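As a rough illustration of how such composites might be assembled, the sketch below coarsens finer bands and stacks all bands into a pixels-by-bands matrix. It is written in Python with NumPy rather than the GRASS GIS and R tools used in this study. The 2 x 2 block averaging is an assumption, since the resampling method is not specified here, and the helper names `block_mean` and `stack_bands` as well as the tiny arrays are hypothetical.

```python
import numpy as np

def block_mean(band, factor=2):
    """Coarsen a 2-D band by averaging non-overlapping factor x factor blocks."""
    band = np.asarray(band, dtype=float)
    rows = band.shape[0] - band.shape[0] % factor   # trim to a multiple of factor
    cols = band.shape[1] - band.shape[1] % factor
    trimmed = band[:rows, :cols]
    return trimmed.reshape(rows // factor, factor, cols // factor, factor).mean(axis=(1, 3))

def stack_bands(bands):
    """Flatten equally sized 2-D bands into a (pixels x bands) matrix."""
    return np.column_stack([np.asarray(b, dtype=float).ravel() for b in bands])

# Synthetic stand-ins (not real imagery): two fine bands and two coarse bands.
fine = [np.arange(16).reshape(4, 4), np.arange(16, 32).reshape(4, 4)]
coarse = [np.full((2, 2), 7.0), np.full((2, 2), 9.0)]

# Rows are pixels, columns are bands: the layout expected by a PCA routine.
composite = stack_bands([block_mean(b) for b in fine] + coarse)
```

A bi-temporal composite would follow the same pattern, with the pre-fire and post-fire bands passed together to `stack_bands`.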
Four ways of extracting principal components via SVD
Employing SVD in burned area mapping applications is an intermediate enhancement step, meant to improve the performance of subsequent classification algorithms. Towards this end, we extract principal components via SVD from MODIS and Landsat-5 TM surface reflectance data.
We subject to SVD the following versions of the data sets defined in the subsection “Uni-temporal and bi-temporal composites”: (A) uncentered-unscaled, (B) uncentered-scaled, (C) centered-unscaled, (D) centered-scaled. Henceforth, the various versions will be referred to as A, B, C and D respectively. Scatterplot matrices for the samples in Fig. 7, extracted from the MODIS-derived transformed images, are visualised in Figs. 14, 15, 16, 17, 18, 19, 20 and 21.
Multi-response permutation procedures
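The four versions can be sketched as follows. This is a minimal Python/NumPy illustration, not the R `prcomp` call used in this study. The function name `svd_pca` and the synthetic four-band data are hypothetical. Scaling uncentered columns by their root mean square (divisor n − 1) is an assumption meant to mirror the convention of R's `scale` function.

```python
import numpy as np

def svd_pca(X, center=True, scale=True):
    """SVD-based PCA of an (observations x bands) matrix.

    Returns component standard deviations, loadings (columns) and scores.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    if center:
        X = X - X.mean(axis=0)
    if scale:
        # Equals the standard deviation for centered columns and the
        # root mean square for uncentered ones (assumed convention).
        X = X / np.sqrt((X ** 2).sum(axis=0) / (n - 1))
    _, s, Vt = np.linalg.svd(X, full_matrices=False)
    sdev = s / np.sqrt(n - 1)
    return sdev, Vt.T, X @ Vt.T

# Synthetic stand-in for a four-band composite (not real reflectance data).
rng = np.random.default_rng(0)
X = rng.normal(loc=[1500, 2500, 2000, 1200], scale=[200, 400, 300, 150], size=(500, 4))

# (center, scale) switches for versions A, B, C and D.
versions = {"A": (False, False), "B": (False, True), "C": (True, False), "D": (True, True)}
results = {name: svd_pca(X, center=c, scale=s) for name, (c, s) in versions.items()}
```

Each entry of `results` holds the standard deviations, loadings and scores for one version, so the same samples can be compared across A, B, C and D.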
Following multi-response permutation procedures (MRPP) [14], one can describe the composition and configuration of major land cover class samples extracted from both the original and the transformed composites (Tables 2 and 3).
The MRPP null hypothesis (H_{0}) assumes no differences among the sampled classes.^{10} This means that, under H_{0}, any possible combination of the data is equally likely. The procedures estimate the observed intra-class average distances (δ_{o}), weighted by their sample size (n), and compare them with the average distances (δ_{exp.}) derived from all possible combinations (permutations) of the sampled data expected under H_{0}. Essentially, they compare the dissimilarities within and among classes.
The significance of the test is reflected in the probability (P-value) of observing a mean distance δ as small as or smaller than the observed δ_{o} under H_{0}. In addition, a measure of the within-class homogeneity is provided by A = 1 − δ_{o}/δ_{exp.}. The extreme case of all within-class observations being identical corresponds to δ_{o} = 0 and A = 1. Since δ_{o} is expected to equal δ_{exp.} under H_{0}, giving A = 0, an A > 0 represents within-class homogeneity and an A < 0 signifies within-class heterogeneity. Lastly, the classification strength [15] is the difference between the average between-class and within-class dissimilarities.
The tests were performed using the complete set of observations sampled from the MODIS-based composites (in total 1085 pixels extracted from each band). However, due to the enormous number of permutations demanded by the high number of observations sampled from Landsat-5 TM data (in total 18865 pixels), we ran MRPP on 3000 randomly selected observations, independently for each Landsat-5 TM-based data set. The Euclidean distance metric was selected as the measure of dissimilarity between two observations.
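The statistics described above can be illustrated with a small permutation sketch in Python/NumPy. It is not the vegan `mrpp` implementation used in this study. Weighting each class by n_i/N and estimating δ_{exp.} as the mean over random permutations are assumptions, and the function names and synthetic samples are hypothetical.

```python
import numpy as np

def mean_within_distance(X, labels):
    """Weighted mean (weights n_i / N) of within-class average Euclidean distances."""
    delta = 0.0
    for cls in np.unique(labels):
        pts = X[labels == cls]
        n = len(pts)
        if n < 2:
            continue
        diff = pts[:, None, :] - pts[None, :, :]
        dist = np.sqrt((diff ** 2).sum(axis=-1))
        upper = np.triu_indices(n, k=1)          # each pair counted once
        delta += (n / len(labels)) * dist[upper].mean()
    return delta

def mrpp(X, labels, permutations=999, seed=0):
    """Return observed delta, expected delta, A statistic and P-value."""
    rng = np.random.default_rng(seed)
    observed = mean_within_distance(X, labels)
    permuted = np.array([mean_within_distance(X, rng.permutation(labels))
                         for _ in range(permutations)])
    expected = permuted.mean()
    a_stat = 1.0 - observed / expected
    p_value = (np.sum(permuted <= observed) + 1) / (permutations + 1)
    return observed, expected, a_stat, p_value

# Two synthetic, well separated classes (not real samples).
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0.0, 1.0, size=(30, 4)), rng.normal(5.0, 1.0, size=(30, 4))])
labels = np.array(["burned"] * 30 + ["other"] * 30)
delta_obs, delta_exp, a_stat, p_value = mrpp(X, labels, permutations=199)
```

Because the pairwise distance matrix grows quadratically with the number of observations, subsampling, as done for the Landsat-5 TM samples, keeps such a test tractable.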
Spectral distance metric
The MRPP test primarily assesses how distinct the sampled burned area class is from the rest of the classes. It also verifies numerically the effects of the preprocessing options, mean-centering and scaling, on the clusters of the sampled classes in terms of their configuration and composition. The procedures do not quantify precisely, however, the spectral enhancement of burned area samples after the application of SVD. To highlight how much the spectral separability between burned and other class samples increases or decreases, we rely on the Jeffries-Matusita (J–M) index.
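J–M is well established in remote sensing applications as a measure of spectral separability between classes. The index is a transformation of the Bhattacharyya distance (Eq. 2) and applies to multivariate normal spectral class models. It is bounded within [0, 2.0] as defined by [16]:
\[ JM_{ij} = 2\left(1 - e^{-B}\right) \qquad (1) \]
where
\[ B = \frac{1}{8}\,(i - j)^{T}\left(\frac{\Sigma_{i} + \Sigma_{j}}{2}\right)^{-1}(i - j) + \frac{1}{2}\ln\left(\frac{\left|\frac{\Sigma_{i} + \Sigma_{j}}{2}\right|}{\sqrt{|\Sigma_{i}|\,|\Sigma_{j}|}}\right) \qquad (2) \]
where
B = Bhattacharyya index; i = first spectral signature vector; j = second spectral signature vector; Σ_{i} = covariance matrix of sample i; and Σ_{j} = covariance matrix of sample j.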
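A possible implementation of the index is sketched below in Python/NumPy. It does not reproduce the custom R functions used in this study. It assumes the common form JM = 2(1 − e^{−B}), which is bounded within [0, 2], with B the Bhattacharyya distance between two multivariate normal class models. Function names and test samples are hypothetical.

```python
import numpy as np

def bhattacharyya(mu_i, mu_j, cov_i, cov_j):
    """Bhattacharyya distance between two multivariate normal class models."""
    mu_i = np.atleast_1d(np.asarray(mu_i, dtype=float))
    mu_j = np.atleast_1d(np.asarray(mu_j, dtype=float))
    cov_i = np.atleast_2d(np.asarray(cov_i, dtype=float))
    cov_j = np.atleast_2d(np.asarray(cov_j, dtype=float))
    cov_mean = (cov_i + cov_j) / 2.0
    diff = mu_i - mu_j
    mean_term = 0.125 * diff @ np.linalg.solve(cov_mean, diff)
    # Log-determinants are numerically safer than raw determinants.
    logdet_mean = np.linalg.slogdet(cov_mean)[1]
    logdet_i = np.linalg.slogdet(cov_i)[1]
    logdet_j = np.linalg.slogdet(cov_j)[1]
    cov_term = 0.5 * (logdet_mean - 0.5 * (logdet_i + logdet_j))
    return mean_term + cov_term

def jeffries_matusita(sample_i, sample_j):
    """J-M distance between two (observations x bands) samples."""
    sample_i = np.asarray(sample_i, dtype=float)
    sample_j = np.asarray(sample_j, dtype=float)
    if sample_i.ndim == 1:            # single component: make it a column
        sample_i = sample_i[:, None]
    if sample_j.ndim == 1:
        sample_j = sample_j[:, None]
    b = bhattacharyya(sample_i.mean(axis=0), sample_j.mean(axis=0),
                      np.cov(sample_i, rowvar=False), np.cov(sample_j, rowvar=False))
    return 2.0 * (1.0 - np.exp(-b))

# Synthetic samples (not real data): identical versus well separated classes.
rng = np.random.default_rng(7)
burned = rng.normal(0.0, 1.0, size=(200, 3))
water = rng.normal(10.0, 1.0, size=(200, 3))
jm_same = jeffries_matusita(burned, burned)
jm_far = jeffries_matusita(burned, water)
```

Passing a single column of component scores per class yields the per-component separabilities of the kind compared in the results.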
Evaluation of the principal components
Selecting the components in which burn scars are emphasized is important for any subsequent mapping attempt. The selection is rather a rejection scheme that filters out components dominated by information linked to unchanged landscape features, as well as components that consist mainly of noise.
In this sense, we evaluate the outcomes of SVD by considering in depth the effects of the preprocessing transformations, centering and scaling, via MRPP on samples of the land cover classes of interest; by visually inspecting the principal components; and by comparing the eigenvectors^{11, 12} and eigenvalues^{13, 14}.
Results and discussion
We discuss hereafter the results of the transformations and their impact on spatial distances within and between the sampled land cover classes. In addition, we compare the performance of the four SVD-based PCA versions in terms of the spectral enhancement of burned area clusters via the Jeffries-Matusita index. Next, we evaluate the principal components visually and numerically. Regarding the latter, we thoroughly review the case of the bi-temporal MODIS data set (2a) and how its variance is redistributed among the principal components. Finally, we justify the selection of the components that hold the highest separabilities.
Synopsis of preprocessing effects
Centering shifts the origin of the coordinate axes to the centre of gravity of the multidimensional data set. Scaling the centered dimensions forces unit variance before the analysis. In turn, this increases the influence of variables with low variance and decreases the influence of those with high variance. Scaling non-centered data, however, does not yield unit variance. It may even be mathematically questionable to do so; we nevertheless include this combination for experimental completeness. While a centered SVD equals the conventional EVD-based PCA, visual differences in terms of contrast may be perceived between components of the same order. These are attributed to the arbitrary sign of the eigenvectors.
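Two of these statements can be checked numerically. The Python/NumPy sketch below uses synthetic data and is not part of the original workflow. It compares a centered SVD with an eigendecomposition of the covariance matrix, and it shows that dividing uncentered columns by their root mean square (an assumed scaling convention) does not produce unit variance.

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic three-band data with means far from the origin (not real imagery).
X = rng.normal(loc=[10.0, 20.0, 30.0], scale=[1.0, 2.0, 3.0], size=(200, 3))
n = X.shape[0]
Xc = X - X.mean(axis=0)

# Centered SVD versus EVD of the covariance matrix.
_, s, Vt = np.linalg.svd(Xc, full_matrices=False)
evals, evecs = np.linalg.eigh(np.cov(X, rowvar=False))
order = np.argsort(evals)[::-1]                  # sort by decreasing variance
evals, evecs = evals[order], evecs[:, order]

same_variances = np.allclose(s ** 2 / (n - 1), evals)
# Axes agree only up to sign, hence the comparison of absolute values.
same_axes_up_to_sign = np.allclose(np.abs(Vt.T), np.abs(evecs))

# Scaling with and without prior centering.
rms = np.sqrt((X ** 2).sum(axis=0) / (n - 1))
var_uncentered_scaled = (X / rms).var(axis=0, ddof=1)
var_centered_scaled = (Xc / Xc.std(axis=0, ddof=1)).var(axis=0, ddof=1)
```

In this synthetic case the centered and scaled columns have unit variance, whereas the uncentered and scaled columns have variances far below one because the root mean square is dominated by the column means.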
Within vs between classes mean distances
We performed the MRPP test in order to diagnose the internal heterogeneity of burned area samples (low within-class dispersion around the mean) and to examine their distinctness from other sampled land cover features (between-classes heterogeneity).
The within-class homogeneity is described by the A statistic, which in general deviates little before and after the transformations: overall around 0.4 for MODIS data and around 0.6 for Landsat-5 TM data. Hence, the transformations do not operate destructively on the internal structure of the clusters of each class.
Before the transformations, the MRPP statistics show that burned area samples have a relatively small mean within-class distance, which reflects their low within-class heterogeneity. For example, the respective δ values for burned area samples extracted from the MODIS data sets 1a and 2a are 796.8 and 1079.0, lower than the observed δ_{o} for all observations, 979.8 and 1361.0 respectively (Table 2). In contrast, urban areas (and similarly mineral extraction sites in Landsat-5 TM data) present δ values similar to δ_{o} (i.e. 1004.0 and 1354.0 vs. 979.8 and 1361.0), yet higher than those of burned areas. Depending on the temporality of the samples extracted from the uni- or bi-temporal composites, burned area class distances δ are close to the ones of vegetation, sparse vegetation, and bare ground. A clear disjunction of water samples is present in all sampled data sets.
In the transformed data (Table 3), it is evident that centering does not alter the within- or between-classes spatial distances. The mean distances are identical for all MODIS-derived transformed composites (Table 3a, A and C of 1a and 2a) and practically equal for all Landsat-5 TM-derived transformed composites (Table 4, A and C of 1b and 2b).
Scaling effects on both the range and the shape of the original point scatters are evident in the statistics (A, δ_{o} and δ_{exp.} values). For MODIS-based transformations, nearly all scaled data sets result in higher A values than the unscaled data (Table 3, 1a: 0.439 (B) vs. 0.4282 (A); 0.4357 (D) vs. 0.4282 (C); and Table 3, 2a: 0.4038 (B) vs. 0.3851 (A)). An exception is the bi-temporal centered-scaled data set, which is practically the same as the centered data (Table 3, 2a: 0.3833 (D) vs. 0.3851 (C)). For the Landsat-5 TM-based transformations, scaled bi-temporal data have reduced A values, while for the scaled uni-temporal data they are close to the A values that correspond to the non-centered and centered data. Hence, low A and decreasing δ_{o} values, as observed for all scaled versions, reflect the suppression of fine intra-class variations in the transformed data.
Lastly, we consider the classification strength values. Overall, the mean between-classes distances are higher than the within-classes distances for all data sets. For uncentered and centered data, both before and after the transformations, they are identical for the MODIS data sets (991.80 and 1058.76 in Table 2) and practically equal for the Landsat-5 TM data sets (64.97 and 55.13 in Table 2 and 67.35, 66.23 and 54.69, 54.62 in Table 4). In contrast, they are suppressed to low values and differ for all scaled versions. This translates into lower differences between within- and between-classes dissimilarities.
Estimation of class separabilities
Separability estimations between samples of burned areas and major land cover classes quantify the magnitude of spectral enhancements. The indices are compared in a one-to-one manner, for all SVD versions, for each land cover class and principal component. Individual estimations and averages of the highest mean distances between samples of burned areas and major land cover classes can be extracted from Tables 5 and 6 for MODIS and Landsat data respectively.
In these tables, the row means correspond to the individual spectral separabilities between samples of burned areas and other major land cover classes, for each row-specific principal component. The column means correspond to the individual spectral separabilities between samples of burned areas and the column-specific land cover class for each version of SVD-based PCA. To exemplify, in Table 5, the average of the spectral separabilities between burned and other classes (first row) 0.959, 0.451, 1.178 and 1.005, extracted from principal component 1 derived from the uncentered and unscaled version of the uni-temporal MODIS data set, is 0.898. The average of the spectral separabilities exclusively between samples of burned and urban areas (first column) 0.959, 0.122, 1.215 and 0.181, for components 1, 2, 3 and 4, derived from the uncentered and unscaled version of the uni-temporal MODIS data set, is 0.619.
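The two averages quoted above can be verified with a few lines of Python. The values are copied from the text, and the variable names are illustrative only.

```python
# Separabilities quoted in the text for the uncentered-unscaled version
# of the uni-temporal MODIS data set (Table 5).
row_component_1 = [0.959, 0.451, 1.178, 1.005]   # burned vs. each other class, component 1
column_urban = [0.959, 0.122, 1.215, 0.181]      # burned vs. urban, components 1 to 4

row_mean = sum(row_component_1) / len(row_component_1)
column_mean = sum(column_urban) / len(column_urban)

print(round(row_mean, 3), round(column_mean, 3))  # 0.898 0.619
```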
Overall higher separabilities
For the uni-temporal MODIS data set 1a, the highest overall average separability, 0.722, is obtained in the uncentered-unscaled case (A). The bi-temporal set 2a attains its highest average, 0.695, when the data are centered and scaled (D), practically identical to the 0.694 obtained with uncentered-unscaled data (A). The corresponding average separation peaks for the Landsat-5 TM sets are 1.151 for the uni-temporal set (1b) with uncentered-unscaled data (A) and 1.109 for the bi-temporal set (2b) with uncentered but scaled data (B).
Cell-by-cell highest separabilities
Overall, when comparing the separability matrices in a cell-by-cell manner (per class and component comparison), most of the highest observed values are concentrated in the uncentered-scaled case (B), followed by the uncentered-unscaled (A), leaving behind the other two cases. Cases A and C share most of the highest separabilities for the uni-temporal sets, followed by the uncentered-scaled data (B), leaving behind the centered-scaled data (D). For the bi-temporal sets, uncentered-scaled data (B) count most of the highest separations, followed by uncentered-unscaled (A), centered-scaled (D) and lastly centered-unscaled (C).
Per-component and per-class highest separabilities
Centered and scaled data (D) produce the highest separations in components 1 and 2, while uncentered-scaled data (B) do so in components 3, 4 and 6. The 5th component contains the smallest number of separation peaks, most of them contributed by centered-unscaled data (C). Urban area samples are best separated from burned areas when using centered-unscaled data (C), while vegetated and bare ground samples are best separated with uncentered-unscaled data (A). Water surface samples peak their distance from burned areas twice in both uncentered-scaled (B) and centered-scaled (D) data. Mineral extraction sites peak once in uncentered-unscaled (A) and once in centered-scaled (D) data. In conclusion, the most critical classes are best separated by using uncentered-unscaled data.
Visual inspection of the components
Visual inspection of the transformed images serves as a quick check and is part of the complete evaluation process. At first sight, components 2, 3 and 4 are expected to be among the candidates for extracting burned areas.

1. MODIS uni-temporal data set
Burn scars are distinguished in all components derived from the uni-temporal MODIS data set (1a, Fig. 8). For all SVD versions, burned areas appear very poor in the first component and rather blurry in the fourth component. Only the centered (both unscaled and scaled) second component represents the scars sharply. The uncentered components 2 and 3 appear to contain similar amounts of information linked to burned areas.

2. MODIS bi-temporal data set
The bi-temporal MODIS composite (2a, Fig. 9) yields components in which we identify the burn scars within the 2nd, 3rd and 4th components. The 3rd component appears occasionally unclear. Fragments of burn scars also appear in the 6th component, though they are rather noisy and striped. In contrast, the 1st and the 5th components do not appear to hold distinguishable burned areas.

3. Landsat-5 TM uni-temporal data set
On inspecting the components derived from the uni-temporal post-fire Landsat-5 TM composite (1b, Fig. 10), the uncentered cases (A, B) distribute the scars over all components but the first. They are also barely visible in the 6th component. Conversely, in the centered but unscaled case (C) they appear more concentrated within components 2, 3 and 4, and noisy in components 5 and 6. Finally, the centered and scaled case (D) clearly displays the burnt signals in components 2 and 3, while the signal is rather weak in the remaining ones.

4. Landsat-5 TM bi-temporal data set
The outcomes based on the bi-temporal Landsat-5 TM composite (2b, Fig. 11) include, in all SVD versions, a 2nd component that holds a moderate burnt signal. Component 3 is weaker for the uncentered cases and even weaker for the centered cases (C, D). Component 4 is best in cases A, C and D, whereas in case B scars appear very weak, if visible at all. The 5th component holds recognisable scars only in cases B and D.
Visually comparing the outcomes of the transformations allows for a rough similarity grouping of the images into centered and uncentered. We also observe that the uncentered-scaled set of components deviates from the uncentered-unscaled components.
Using the bi-temporal MODIS and the uni-temporal Landsat-5 TM composites, uncentered data highlight the burn scars in the third and fourth components while they appear weaker in the 2nd component (Figs. 9 and 10 respectively). Centered data emphasize the large burned surfaces within the second component and slightly alter their presence in the fourth component. An exception is the 4th centered-scaled transformed image, which seems very poor for the features of our interest. Using the uni-temporal MODIS data (1a), burn scars are divided among the second and third components. Finally, regarding the bi-temporal Landsat-5 TM (2b) composite, uncentered-unscaled data spread the information in decreasing order of visual contrast against other features among the 4th, 2nd and 3rd components. The centered data, however, concentrate the scars in components 4 and 2 (Figs. 8 and 11).
Quantitative evaluation of the transformation matrices
Careful observation of the transformed variances expressed in percentage (%) reveals two groups of ranges for each component, depending on whether the input data matrix was centered or not (Table 7). This is expected, as the first uncentered component passes through the origin of the coordinate system and close to the centroid of the multidimensional point swarm. In the following subsections we discuss the effects of centering and scaling based on the transformation matrices derived from SVD on the bi-temporal MODIS composite 2a (Fig. 9). All numbers compared below are drawn from Table 8. The transformation matrices for composites 1a, 1b and 2b are presented in Tables 9, 10 and 11.
A subtlety that affects the numerical accuracy of calculations is the divisor N used for the covariance matrix in the princomp function (an EVD-based PCA implementation) versus the divisor N−1 used in the prcomp function (an SVD implementation) [17]. This should, however, make practically no difference for samples containing more than 30 observations.
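The size of this effect is easy to gauge. The Python/NumPy sketch below uses synthetic data with N = 30 and compares component variances computed with both divisors. It illustrates the arithmetic only and does not reproduce either R function.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(30, 3))          # synthetic sample with N = 30 observations
n = X.shape[0]
Xc = X - X.mean(axis=0)
s = np.linalg.svd(Xc, compute_uv=False)

var_div_n = s ** 2 / n                # divisor N (princomp-style)
var_div_n1 = s ** 2 / (n - 1)         # divisor N - 1 (prcomp-style)

ratio = var_div_n / var_div_n1        # constant factor (N - 1) / N for every component
pct_n = 100 * var_div_n / var_div_n.sum()
pct_n1 = 100 * var_div_n1 / var_div_n1.sum()
```

The ratio is the same constant, (N − 1)/N, for every component, so the variance percentages are identical under both divisors and only the absolute variances differ, by about 3% at N = 30.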
Variance
In general, uncentered data channel practically all of the original data’s variance into the 1st component (variances 98.5% and 98.3% for cases A, B respectively). On the other hand, centered data distribute significant amounts of information among higher order components (1st component variances 74.7% and 72.1% for cases C, D respectively).
For all cases, the variances of the last components (5th and 6th) are very low, while, as expected, the highest ones are identified in the major component (1st). In general, one can safely ignore these components since the former can be attributed to residual information and the latter mainly to unchanged features. Thus, we focus on the 2nd, 3rd and 4th components. The distribution of each original band in the transformed images is reflected in the eigenvectors, which act as weighting coefficients.
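The contrast between uncentered and centered variance percentages can be reproduced on synthetic data. The Python/NumPy sketch below is illustrative only: the function name `variance_percentages` is hypothetical and the six-band data are random numbers with means far from the origin, not the MODIS composite discussed here.

```python
import numpy as np

def variance_percentages(X, center):
    """Percentage of total (un)centered variation captured by each component."""
    X = np.asarray(X, dtype=float)
    if center:
        X = X - X.mean(axis=0)
    s = np.linalg.svd(X, compute_uv=False)
    return 100.0 * s ** 2 / (s ** 2).sum()

rng = np.random.default_rng(3)
# Synthetic six-band data: common mean of 2000, band-specific spread.
X = rng.normal(loc=2000.0, scale=[300, 250, 200, 150, 100, 50], size=(1000, 6))

pct_uncentered = variance_percentages(X, center=False)
pct_centered = variance_percentages(X, center=True)
```

With such data the uncentered first component absorbs nearly all of the variation, because it is dominated by the distance of the point swarm from the origin, while the centered decomposition spreads a larger share over components 2 to 4.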
Centering
Centering decreases the absolute standard deviations of the extracted components. Yet the variance percentages of the higher order components increase substantially. This signifies that important amounts of the initial variation are redistributed among the higher order components 2, 3 and 4. On the contrary, performing the analysis without centering results in higher absolute standard deviations. Nonetheless, the variance percentages of the higher order components are substantially reduced in comparison to the 1st component. We also observe that centering relocates much of the information included in post-fire band 2 to the 2nd component (the eigenvector loading increases from 0.53 in case A to 0.73 in case C).
Burned surfaces are recorded as lower reflectance values in most of the spectral bands. Assuming they form data clusters which are clearly separated from the mean, the biggest portion of spectral information channeled into the 1st uncentered component represents mostly features other than burned areas. Information sourced from post-fire band 7 increases in the 1st and 3rd centered components (respectively from 0.29 and 0.16 to 0.39 and 0.22) and decreases in the 2nd and 4th components (from 0.47 and 0.58 to 0.42 and 0.54), which might also be interpreted as a loss of useful information from the higher order components 2 and 4.
Scaling
While the effect of centering is obvious in both the eigenvalues (or singular values) and the eigenvectors, scaling the input data deals with finer details. Depending on whether the dimensions to be scaled are already centered or not, the influence on the variance percentages of the extracted components varies. The variance changes very little, and only for the first two components, when using uncentered input data. By contrast, using centered input data produces different percentages.
In general, scaling reduces the variance of the 1st component. The variance percentages for component 2 increase from 0.9% to 1% between cases A and B, and from 13% to 14.9% between cases C and D. In the higher order components 3 and 4, scaling of the uncentered input data does not alter the variance percentages, which remain 0.4% and 0.2% respectively in both cases A and B. The same is observed when using centered input data sets with respect to components 5 and 6, whose variances are 0.7% and 0.2% in both cases C and D. This does not hold true, however, for components 3 and 4, where the numbers increase from 7.5% to 7.9% and from 3.8% to 4.2% between cases C and D.
It is worth emphasising that scaling uncentered data prior to SVD relocates the biggest proportion of information originating from both the pre-fire and the post-fire band 2 to components other than the 3rd. For case B, the pre-fire band 2 loading in the 3rd component decreases from 0.52 to 0.18. Most of the pre-fire band 2 information is clearly channeled into the 4th component (loading −0.69). The post-fire band 2 loading in the 3rd component decreases as well, from 0.50 to 0.21. Thus, burned areas appear isolated in the 4th component (Fig. 9).
Selecting components with highest separabilities
Most of the highest per-class separability peaks exist within the uncentered-unscaled data, followed by the uncentered-scaled, the centered-scaled, and lastly the centered-unscaled data set. Yet, observations of the highest mean separabilities only, whether per SVD version or per class, do not suffice for selecting the best components. We know that the first and the last components are likely to be rejected: the first due to its highest variance, representing classes other than burned areas, and the last due to its near-zero variance, capturing mainly noise. Hence, we focus on some of the higher order components, ignoring the last ones.
The mean separabilities for the subset of components of interest (meaning components 2, 3 and 4) are summarised in Table 12. The overall best PCA version for these components is the uncentered-unscaled one. Even in cases where centered data present relatively higher mean separabilities (in Table 12, 0.907 in case D over 0.902 in A for set 2a), we need to consider that a centered PCA redistributes greater amounts of the original variance, including unchanged patterns, among the higher order components.
Conclusions
The statistical evaluation shows that centering and scaling, prior to the application of SVD, operate on the input multidimensional matrix generally in a non-destructive way. If performed, centering modifies the way that data clusters are intercepted by the transformed axes, effectively projecting spectral information related to unchanged patterns onto higher order components. This works rather against the spectral enhancement of burned area clusters. Scaling smooths out fine variations existing in the original data and may thus neutralise minor to moderate, but potentially useful, details.
Within the framework of burned area mapping, the spectral separability estimations between burned and major land cover samples point to the uncentered-unscaled SVD-based PCA version as the most suitable one. The uncentered-scaled version is, rather as expected, not useful as it appears to have random effects. The centered-unscaled and centered-scaled versions should be tested. Yet, we generally discourage scaling the original data if it is important to retain fine details after the transformations.
Since SVD is not optimised for class separability, the choice between centering and not centering the input data matrix should be examined carefully. Even small improvements might be significant in further analysing the transformed data.
Endnotes
^{1} eigenvector decomposition
^{2} Distributed by the Land Processes Distributed Active Archive Center (LP DAAC), located at USGS/EROS, Sioux Falls, SD. http://lpdaac.usgs.gov
^{3} Local Granule ID: MOD09GQ.A2007242.h19v05. 005.2007244231200.hdf
^{4} Local Granule ID: MOD09GQK.A2006239.h19v05. 004.2006241155630
^{5} Available from the U.S. Geological Survey, http://www.usgs.gov.
^{6} Scene ID: LT51830332007248MOR00
^{7} Scene ID: LT51830332003237MTI01
^{8} Landsat Processing Details, “USGS Landsat Missions,” https://landsat.usgs.gov/landsatprocessingdetails (accessed April 16, 2017)
^{9} Driven by the sample size restriction in the GRASS GIS i.smap module, an implementation of the SMAP algorithm [18] for supervised image classification
^{10} We use the term “class” in place of “group”, as used originally in the MRPP test
^{11} here actually singular vectors
^{12} vectors can be seen as loadings or weighting coefficients which determine the direction of the principal components
^{13} here actually singular values which are square roots of nonzero eigenvalues
^{14} eigenvalues represent the variance of the original data contained in the principal components
References
Alexandris N, Gupta S, Koutsias N. Remote sensing of burned areas via PCA. Part 1: centering, scaling and EVD vs SVD. Open Geospatial Data, Software and Standards. 2017. doi:10.1186/s4096501700281.
Jolliffe IT. Principal Component Analysis, 2nd edn. Springer; 2002. 28 illustrations. http://www.springer.com/statistics/statistical+theory+and+methods/book/9780387954424.
Lu D, Mausel P, Brondizio E, Moran E. Change detection techniques. Int J Remote Sensing. 2003; 25(12):2365. doi:10.1080/0143116031000139863.
Richards J, Milne A. Mapping fire burns and vegetation regeneration using principal components analysis. In: 1983 International Geoscience and Remote Sensing Symposium (IGARSS’83). San Francisco: 1983.
Cadima J, Jolliffe I. On relationships between uncentred and columncentred principal component analysis. Pak J Stat. 2009; 25(4):473–503.
Roy D, Lewis P, Justice C. Burned area mapping using multitemporal moderate spatial resolution data - a bidirectional reflectance modelbased expectation approach. Remote Sensing Environ. 2002; 83:263–86.
Roy D, Landmann T. Characterizing the surface heterogeneity of fire effects using multitemporal reflective wavelength data. Int J Remote Sensing. 2005; 26(19):4197–218.
GRASS Development Team. Geographic Resources Analysis Support System (GRASS GIS) Software. Open Source Geospatial Foundation; 2008. http://grass.osgeo.org. Accessed 28 June 2017.
QGIS Development Team. Quantum GIS Geographic Information System. Open Source Geospatial Foundation; 2009. http://qgis.osgeo.org. Accessed 28 June 2017.
Warmerdam F. FWTools: Open Source GIS Binary Kit for Windows and Linux. http://fwtools.maptools.org/. Accessed 28 June 2017.
R Core Team. R: A Language and Environment for Statistical Computing. Vienna: R Foundation for Statistical Computing; 2010. ISBN 3900051070. http://www.Rproject.org. Accessed 28 June 2017.
Oksanen J, Blanchet FG, Kindt R, Legendre P, O’Hara RB, Simpson GL, Solymos P, Stevens MHH, Wagner H. Vegan: Community Ecology Package. 2010. R package version 1.175. http://CRAN.Rproject.org/package=vegan. Accessed 28 June 2017.
Bossard M, Feranec J, Otahel J, Steenmans C. CORINE land cover technical guide – Addendum 2000. European Environment Agency, Kongens Nytorv 6, DK–1050 Copenhagen K, Denmark: EEA; 2000.
Mielke PWJ. The application of multivariate permutation methods based on distance functions in the earth sciences. Earth Science Rev. 1991; 31:55–71. doi:10.1016/00128252(91)90042E.
Sickle JV. Using mean similarity dendrograms to evaluate classifications. J Agric Biol Environ Stat. 1997; 2(4):370–88.
Richards J, Jia X. Remote Sensing Digital Image Analysis. An Introduction. Third, Revised and Enlarged Edition, 3rd edn: Springer; 1999, p. 363. Hard cover. ISBN 3540648607.
R Core Team. R: A Language and Environment for Statistical Computing. Vienna: R Foundation for Statistical Computing; 2016. https://www.Rproject.org/. Accessed 28 June 2017.
Bouman CA, Shapiro M. A multiscale random field model for bayesian image segmentation. IEEE Trans Image Process. 1994; 3(2):162–77. doi:10.1109/83.277898.
Acknowledgments
The authors thank Aniruddha Ghosh and Georgia Kakoulaki for reading the manuscript.
Authors’ contributions
All authors contributed equally to this article. All authors read and approved the final manuscript.
Competing interests
The authors declare that they have no competing interests.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Cite this article
Alexandris, N., Koutsias, N. & Gupta, S. Remote sensing of burned areas via PCA, Part 2: SVDbased PCA using MODIS and Landsat data. Open geospatial data, softw. stand. 2, 21 (2017). https://doi.org/10.1186/s4096501700290
DOI: https://doi.org/10.1186/s4096501700290