Download citation
Download citation
link to html
Principal component analysis (PCA) is a multivariate data analysis approach commonly used in X-ray absorption spectroscopy to estimate the number of pure compounds in multicomponent mixtures. This approach seeks to describe a large number of multicomponent spectra as weighted sums of a smaller number of component spectra. These component spectra are in turn considered to be linear combinations of the spectra from the actual species present in the system from which the experimental spectra were taken. The dimension of the experimental dataset is given by the number of meaningful abstract components, as estimated by the cascade or variance of the eigenvalues (EVs), the factor indicator function (IND), or the F-test on reduced EVs. It is shown on synthetic and real spectral mixtures that the performance of the IND and F-test critically depends on the amount of noise in the data, and may result in considerable underestimation or overestimation of the number of components even for a signal-to-noise (s/n) ratio of the order of 80 (σ = 20) in a XANES dataset. For a given s/n ratio, the accuracy of the component recovery from a random mixture depends on the size of the dataset and number of components, which is not known in advance, and deteriorates for larger datasets because the analysis picks up more noise components. The scree plot of the EVs for the components yields one or two values close to the significant number of components, but the result can be ambiguous and its uncertainty is unknown. A new estimator, NSS-stat, which includes the experimental error to XANES data analysis, is introduced and tested. It is shown that NSS-stat produces superior results compared with the three traditional forms of PCA-based component-number estimation. A graphical user-friendly interface for the calculation of EVs, IND, F-test and NSS-stat from a XANES dataset has been developed under LabVIEW for Windows and is supplied in the supporting information. Its possible application to EXAFS data is discussed, and several XANES and EXAFS datasets are also included for download.

Supporting information

pdf

Portable Document Format (PDF) file https://doi.org/10.1107/S1600577514013526/hf5263sup1.pdf
Supplementary figures S1, S2 and S3

pdf

Portable Document Format (PDF) file https://doi.org/10.1107/S1600577514013526/hf5263sup2.pdf
Information about the workings of the NSS analysis referred to in the main text and included in the PCA_Estimator.exe programme supplied in the zip file of the supplementary material

zip

Zip compressed file https://doi.org/10.1107/S1600577514013526/hf5263sup3.zip
Graphical data-analysis programme and user manual for the calculation of the number of pure component XAFS spectra from a dataset of multicomponent spectra. The datasets listed in Table 1 are also included


Follow J. Synchrotron Rad.
Sign up for e-alerts
Follow J. Synchrotron Rad. on Twitter
Follow us on facebook
Sign up for RSS feeds