Noise reduction for near-infrared spectroscopy data using extreme learning machines

doi:10.1016/j.engappai.2018.12.005

Engineering Applications of Artificial Intelligence

Volume 79, March 2019, Pages 13-22

https://doi.org/10.1016/j.engappai.2018.12.005

Highlights

• There are many pre-processing techniques for NIR data to choose from.
• We propose to avoid the pre-processing by using a novel algorithm called C-PL-ELM.
• C-PL-ELM uses two Lagrange multipliers as optimization constraints.
• Results for regression and classification tasks confirm the advantages of C-PL-ELM.

Abstract

Near-infrared (NIR) spectroscopy is an effective approach for predicting chemical properties and is typically applied in the petrochemical, agricultural, medical, and environmental sectors. NIR spectra are usually of very high dimension and contain large amounts of information, most of which is irrelevant to the target problem and some of which is simply noise. Thus, it is not an easy task to discover the relationship between NIR spectra and the predictive variable. Because this kind of regression analysis is one of the main topics of machine learning, machine learning techniques play a key role in NIR-based analytical approaches. Pre-processing of NIR spectral data has become an integral part of chemometrics modeling. The objective of the pre-processing is to remove physical phenomena (noise) in the spectra in order to improve the regression or classification model. In this work, we propose to reduce the noise using extreme learning machines, which have shown good predictive performance in regression applications as well as in large-dataset classification tasks. For this, we use a novel algorithm called C-PL-ELM, whose architecture places one non-linear layer in parallel with another non-linear layer. Using the soft margin loss function concept, we incorporate two Lagrange multipliers with the objective of including the noise of the spectral data. Six real-life datasets were analyzed to illustrate the performance of the developed models. The results for regression and classification problems confirm the advantages of using the proposed method in terms of root mean square error and accuracy.

Introduction

Near-infrared (NIR) spectroscopy (Pierna et al., 2011) is mainly used to measure the absorption of near-infrared light in order to identify and quantify various materials. Spectroscopy in combination with various multivariate algorithms has played an important role in fast and nondestructive analysis in the petrochemical, agricultural, medical, and environmental sectors (Liu et al., 2015a, Luypaert et al., 2007, Kim et al., 2010, Park et al., 2012).

According to the Beer–Lambert law, the absorption of light in a medium is proportional to the path length and the concentration of the absorbing agent. That is, there is a linear relationship between absorbance and concentration when the path length remains constant, which motivated the use of linear multivariate calibration techniques, such as multiple linear regression (MLR), principal components regression (PCR) (Keithley et al., 2009) and partial least squares regression (PLS) (Wilcox et al., 2016).
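As a brief illustration of this linear relationship, the sketch below simulates the absorbance A = ε·l·c of a single absorbing species at a few wavelengths and recovers the concentration by ordinary least squares. The absorptivity values, path length, and concentration are made-up numbers chosen for illustration only, and the function names are hypothetical; none of them come from the datasets analyzed in this work.

```python
import numpy as np

# Hypothetical absorptivities at five wavelengths (illustrative values only).
epsilon = np.array([0.8, 1.5, 2.1, 1.2, 0.4])
path_length = 1.0  # assumed constant path length


def absorbance(concentration):
    """Beer-Lambert law: A = epsilon * l * c at each wavelength."""
    return epsilon * path_length * concentration


def estimate_concentration(spectrum):
    """Least-squares estimate of c from a measured spectrum."""
    design = (epsilon * path_length).reshape(-1, 1)
    coef, *_ = np.linalg.lstsq(design, spectrum, rcond=None)
    return float(coef[0])


spectrum = absorbance(0.25)
c_hat = estimate_concentration(spectrum)
```

Because the simulated spectrum is exactly linear in the concentration, the least-squares estimate recovers the value used to generate it; the deviations discussed next are what break this idealized picture.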

However, the linearity of the Beer–Lambert law is limited by chemical and instrumental factors, such as deviations in absorptivity coefficients at high concentrations, non-symmetrical chemical equilibrium, intermolecular reactions, the presence of humidity inducing hydrogen bonding, changes in temperature, non-monochromatic radiation, scattering of light, fluorescence or phosphorescence of the sample, stray light, and nonlinear detector response (Despagne and Luc Massart, 1998). When the system exhibits strongly nonlinear behavior, classical linear methods may not completely identify the relationship between the spectra and the corresponding concentrations and thus may produce large errors in regression and classification problems. Many nonlinear techniques have been developed, such as artificial neural networks (ANN) (Hajnayeb et al., 2011), support vector machines (SVM) (Li et al., 2009), and nonlinear partial least squares (Rosipal and Trejo, 2001). These methods may perform well on nonlinear data but are computationally more complex than linear methods and are prone to overfitting (Peng et al., 2013).

When the experimental measurements deviate from the Beer–Lambert law, a suitable pre-processing must be considered to compensate for this nonlinear behavior. The disadvantage of including such additional factors is an increase in the complexity of the model, which in turn is likely to reduce the robustness of the model for future predictions. All pre-processing techniques aim to reduce the noise in the data in order to enhance the features sought in the spectra. However, there is always the danger of choosing the wrong type of pre-processing, or of applying one that is so severe that valuable information is (unintentionally) removed. This problem is described in detail by Rinnan et al. (2009).

Near-infrared spectroscopy data generally contain both linear and non-linear components in regression or classification problems. Linear methods may have difficulty achieving good performance, since they model the non-linearity only in a limited way; however, they are simpler and more stable. Non-linear methods can provide better performance than linear methods, but are more complex. Therefore, a simple, fast, precise and effective method is required.

In the early 1990s, different authors (Schmidt et al., 1992, Pao et al., 1994) independently proposed feedforward neural networks comprising randomly initialized and untrained connections between the input layer and a hidden layer of non-linear neurons. In the 2000s, this type of network was revisited under the name of Extreme Learning Machine (ELM) (Huang et al., 2004). Recently, Henríquez and Ruz (2018a) compared these neural networks with random weights on classification and regression problems, and Zhang and Suganthan (2016b) presented a survey on randomized algorithms for training neural networks. In this paper, we use the term ELM to refer to randomized feed-forward neural networks (Henríquez and Ruz, 2018b).
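To make the idea of a randomized feed-forward network concrete, the following is a minimal sketch of a basic single-hidden-layer network of this kind. It is not the C-PL-ELM algorithm proposed in this paper. It assumes a tanh activation, hidden weights and biases drawn uniformly from [-1, 1], and output weights obtained with the Moore–Penrose pseudo-inverse; the names `elm_fit` and `elm_predict` and the toy data are hypothetical.

```python
import numpy as np


def elm_fit(X, y, n_hidden=50, seed=0):
    """Fit a basic ELM: random, untrained hidden layer plus least-squares output weights."""
    rng = np.random.default_rng(seed)
    # Input-to-hidden weights and biases are drawn once and never trained.
    W = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    H = np.tanh(X @ W + b)        # hidden-layer output matrix
    beta = np.linalg.pinv(H) @ y  # output weights via the pseudo-inverse
    return W, b, beta


def elm_predict(X, model):
    """Apply the fixed random hidden layer, then the learned output weights."""
    W, b, beta = model
    return np.tanh(X @ W + b) @ beta


# Toy regression problem (synthetic data, for illustration only).
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=(200, 3))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2
model = elm_fit(X, y, n_hidden=50)
rmse = float(np.sqrt(np.mean((elm_predict(X, model) - y) ** 2)))
```

Only the output weights are learned, and they come from a single linear least-squares solve, which is why this family of networks trains quickly compared with iteratively trained networks.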

ELM has been successfully applied in many fields (Samat et al., 2014, Cao et al., 2016). For NIR data in particular, ELM combined with feature selection techniques has been used to determine amino acid nitrogen in soy sauce (Ouyang et al., 2013), total acid content in vinegar (Chen et al., 2012), and pear internal quality attributes (Jiang and Zhu, 2013), among others. Recently, Li et al. (2016) analyzed the feasibility of Fourier transform infrared transmission (FT-IR) spectroscopy for detecting talcum powder illegally added to tea. Yang and Sun (2016) compared the abilities of six popular multivariate classification techniques, including ELM. Bian et al. (2017) used ELM for the near-infrared spectral quantitative analysis of diesel fuel and edible blend oil samples.

A main difficulty when working with NIR data is that there are many pre-processing techniques from which to choose (more details in Section 2), and even combinations of them, and different researchers apply these methodologies arbitrarily in order to achieve good performance in classification and prediction problems. This motivates the need for a robust methodology. Therefore, we propose to reduce the noise in NIR data by using a novel algorithm called C-PL-ELM, which has a parallel architecture: one non-linear layer operates in parallel with another non-linear layer, generating a more powerful nonlinear mapping. Using the soft margin loss function concept (Bennett and Mangasarian, 1992), we incorporate two Lagrange multipliers as optimization constraints (similar to the concept of support vector regression (Drucker et al., 1997)) with the objective of including the noise in spectroscopy data, thus avoiding the pre-processing of the experimental measurements.
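For readers unfamiliar with the support vector regression analogy, the sketch below computes the two sets of slack variables of the standard ε-insensitive soft margin: one for targets lying above the ε-tube around the prediction and one for targets lying below it. In the standard SVR formulation each of these two constraint sets is associated with its own Lagrange multiplier. This is only an illustration of the general SVR concept, not the exact objective of C-PL-ELM; the function name and the value of ε are assumptions.

```python
import numpy as np


def epsilon_insensitive_slacks(y_true, y_pred, epsilon=0.1):
    """Return the slack vectors (xi, xi_star) of the standard SVR soft margin."""
    residual = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    xi = np.maximum(residual - epsilon, 0.0)        # target above the tube
    xi_star = np.maximum(-residual - epsilon, 0.0)  # target below the tube
    return xi, xi_star


# Residuals are -0.05 (inside the tube), 0.5 (above) and -0.4 (below).
xi, xi_star = epsilon_insensitive_slacks([1.0, 2.0, 3.0], [1.05, 1.5, 3.4], epsilon=0.1)
loss = float(np.sum(xi + xi_star))
```

Deviations smaller than ε incur no penalty, which is the sense in which a moderate amount of noise in the targets can be tolerated by the model instead of being removed beforehand.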

The remainder of the paper is organized as follows. In Section 2, we briefly review some of the most popular pre-processing techniques. The details of the proposed C-PL-ELM algorithm are presented in Section 3. Simulation results and comparisons are provided in Section 4. Section 5 discusses the performance of C-PL-ELM with respect to the different datasets. Conclusions are drawn in Section 6.

Section snippets

Brief review of Pre-processing techniques

The aim of signal pre-processing is to improve the data quality before modeling and to remove physical information from the spectra. Applying pre-processing can increase the repeatability/reproducibility of the method, model robustness and accuracy, although there are no guarantees this will actually work.

The most widely used pre-processing techniques in NIR spectroscopy can be divided into two groups: scatter-correction methods and spectral derivatives.
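As a rough illustration of one technique from each group, the sketch below applies the standard normal variate (SNV) transformation, a commonly used scatter-correction method, and a plain finite-difference first derivative. The spectra are synthetic values invented for this example, and the finite difference is a simplification: in practice, derivatives are usually computed with a smoothing filter.

```python
import numpy as np


def snv(spectra):
    """Standard normal variate: center and scale each spectrum (row) individually."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std


def first_derivative(spectra):
    """Finite-difference first derivative along the wavelength axis."""
    return np.diff(np.asarray(spectra, dtype=float), axis=1)


# Two synthetic spectra that differ only by a multiplicative gain and an additive offset.
base = np.array([0.2, 0.5, 0.9, 0.6, 0.3])
spectra = np.vstack([base, 1.5 * base + 0.4])
corrected = snv(spectra)
derivative = first_derivative(spectra)
```

After SNV the two rows coincide, since a positive gain and an offset are exactly what the per-spectrum centering and scaling remove. The derivative removes the offset but not the gain, and it shortens each spectrum by one point.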

The first group of scatter-corrective

Extreme learning machines

Extreme learning machine (ELM) (Huang et al., 2006b) is a unifying learning algorithm which can be used for several learning tasks. It was originally developed for the single hidden-layer feed forward neural networks (SLFNs), and then extended to the generalized SLFNs (Huang et al., 2012). In ELM, the feature mapping function is also called the activation function. The network structure of ELM is shown in Fig. 1. In accordance with the ELM’s universal approximation capability theorems (Huang et

Performance evaluation

In this section, the performance of the proposed C-PL-ELM learning algorithm is measured. All the simulations are carried out using the free R software environment for statistical computing, running on a computer with a 2.6 GHz Intel Core i5 and 8 GB of RAM. Regarding the scope of the random weights and biases, the random values are typically drawn from the range $[-1, 1]$ (as we do in this paper); some additional conditions are considered in Gorban et al., 2016, Li and Wang, 2017, Tyukin and

Discussions

The selection of an appropriate pre-processing technique for NIR data is a difficult problem. As mentioned in Rinnan et al. (2009), this can affect the robustness of the model. In the previous sections, we have proposed a novel method to include the noise from the data in the model. We use two Lagrange multipliers as optimization constraints (similar to the SVR concept proposed by Drucker et al. (1997)). We propose an algorithm that does not use pre-processing techniques. We believe that a robust

Conclusion

In this paper, we propose a novel algorithm for the analysis of near-infrared spectroscopy data. We use the C-PL-ELM algorithm, whose architecture places one non-linear layer in parallel with another non-linear layer. We incorporate two Lagrange multipliers as optimization constraints with the aim of avoiding the pre-processing of the spectra. The experimental results of this paper are promising and indicate that C-PL-ELM performs well in the presence of noisy spectra. More

Acknowledgment

The authors would like to thank CONICYT-Chile under grant CONICYT Doctoral scholarship (2015-21150790) (P.H.), Basal(CONICYT)-CMM (G.A.R), and the Research Center Millennium Nucleus Models of Crisis, Chile (NS130017) (G.A.R), for financially supporting this research.

References (64)

Al-Jowder, O. et al. (1997). Mid-infrared spectroscopy and authenticity problems in selected meats: a feasibility study. Food Chem.
Alamar, P.D. et al. (2016). Quality evaluation of frozen guava and yellow passion fruit pulps by NIR spectroscopy and chemometrics. Food Res. Int.
Alamprese, C. et al. (2016). Identification and quantification of turkey meat adulteration in fresh, frozen-thawed and cooked minced beef by FT-NIR spectroscopy and chemometrics. Meat Sci.
Bevilacqua, M. et al. (2012). Tracing the origin of extra virgin olive oils by infrared spectroscopy and chemometrics: A case study. Anal. Chim. Acta
Bevilacqua, M. et al. (2013). Application of near infrared (NIR) spectroscopy coupled to chemometrics for dried egg-pasta characterization and egg content quantification. Food Chem.
Cao, J. et al. (2016). Extreme learning machine and adaptive sparse representation for image classification. Neural Netw.
Chen, Q. et al. (2012). Rapid measurement of total acid content (TAC) in vinegar using near infrared spectroscopy based on efficient variables selection algorithm and nonlinear regression tools. Food Chem.
Chen, J. et al. (2017). Rapid determination of total protein and wet gluten in commercial wheat flour using sisvr-nir. Food Chem.
Ding, X. et al. (2015). NIR spectroscopy and chemometrics for the discrimination of pure, powdered, purple sweet potatoes and their samples adulterated with the white sweet potato flour. Chemometr. Intell. Lab. Syst.
Frizon, C.N. et al. (2015). Determination of total phenolic compounds in yerba mate (Ilex paraguariensis) combining near infrared spectroscopy (NIR) and multivariate analysis. LWT - Food Sci. Technol.
Gorban, A.N. et al. (2016). Approximation with random bases: Pro et contra. Inform. Sci.
Hajnayeb, A. et al. (2011). Application and comparison of an ANN-based feature selection method and the genetic algorithm in gearbox fault diagnosis. Expert Syst. Appl.
Henríquez, P.A. et al. (2017). Extreme learning machine with a deterministic assignment of hidden weights in two parallel layers. Neurocomputing
Henríquez, P.A. et al. (2018). A non-iterative method for pruning hidden neurons in neural networks with random weights. Appl. Soft Comput.
Huang, G.B. et al. (2006). Extreme learning machine: Theory and applications. Neurocomputing
Keithley, R. et al. (2009). Multivariate concentration determination using principal component regression with residual analysis. TrAC - Trends Anal. Chem.
Kim, S.B. et al. (2010). An effective classification procedure for diagnosis of prostate cancer in near infrared spectra. Expert Syst. Appl.
Li, H. et al. (2009). Support vector machines and its applications in chemistry. Chemometr. Intell. Lab. Syst.
Li, M. et al. (2017). Insights into randomized algorithms for neural networks: Practical issues and common pitfalls. Inform. Sci.
Liu, Y. et al. (2015). Predicting soil salinity with vis–NIR spectra after removing the effects of soil moisture using external parameter orthogonalization. PLoS One
Liu, C. et al. (2015). A comparative study for least angle regression on NIR spectra analysis to determine internal qualities of navel oranges. Expert Syst. Appl.
Lorente, D. et al. (2015). Visible–NIR reflectance spectroscopy and manifold learning methods applied to the detection of fungal infections on citrus fruit. J. Food Eng.
Luypaert, J. et al. (2007). Near-infrared spectroscopy applications in pharmaceutical analysis. Talanta
Mabood, F. et al. (2017). Detection and estimation of super premium 95 gasoline adulteration with premium 91 gasoline using new NIR spectroscopy combined with multivariate methods. Fuel
Pao, Y.H. et al. (1994). Learning and generalization characteristics of the random vector functional-link net. Neurocomputing
Park, J.I. et al. (2012). Improved prediction of biomass composition for switchgrass using reproducing kernel methods with wavelet compressed FT-NIR spectra. Expert Syst. Appl.
Peng, J. et al. (2013). Combination of activation functions in extreme learning machines for multivariate calibration. Chemometr. Intell. Lab. Syst.
Pierna, J.F. et al. (2011). Comparison of various chemometric approaches for large near infrared spectroscopic data of feed and feed products. Anal. Chim. Acta
Rinnan, A., van den Berg, F., Engelsen, S.B. (2009). Review of the most common pre-processing techniques for near-infrared spectra. TrAC - Trends Anal. Chem.
Xu, L. et al. (2015). Rapid and nondestructive detection of multiple adulterants in kudzu starch by near infrared (NIR) spectroscopy and chemometrics. LWT - Food Sci. Technol.
Ye, M. et al. (2016). Rapid detection of volatile compounds in apple wines using FT-NIR spectroscopy. Food Chem.
Zhang, L. et al. (2016). A comprehensive evaluation of random vector functional link networks. Inform. Sci.


^☆: No author associated with this paper has disclosed any potential or pertinent conflicts which may be perceived to have impending conflict with this work. For full disclosure statements refer to https://doi.org/10.1016/j.engappai.2018.12.005.
