Detection and characterization of adulterants in baru and soursop oils

The use of vegetable oils in the pharmaceutical and cosmetic industries boosts their production and commercialization and maximizes their economic potential. However, to increase profits, vegetable oils with high added value are targets of adulteration practices. This negatively affects the quality of the product and compromises the bioactive properties of these oils. The objective of this study was to build partial least squares regression models to identify baru oil and soursop oil samples adulterated with soybean oil. For data collection, the oil samples were characterized using a portable near-infrared spectrometer and a Fourier-transform infrared spectrometer. These techniques were chosen because both provide information about the functional groups of chemical compounds; however, the portable equipment, being cheaper, produces signals with lower spectral resolution. Thus, a further objective of the work was to verify whether the results obtained with the two instruments are similar. The developed regression models were effective; however, the near-infrared technique presented some limitations in the identification of soursop oil adulteration, because the construction of its model required a smaller number of variables and levels of adulteration. Nevertheless, high R² values and relatively low errors were obtained for all models, making it possible to identify the adulteration.


Introduction
The pharmaceutical and cosmetic industries require large amounts of vegetable oils to manufacture their products, which generates great demand and increases the economic potential of the flora (El-Hamidi & Zaher, 2018). Some oilseed species offer a low extraction yield, and their oils have high added value; thus, adulteration of these oils becomes advantageous and pervasive, as it results in higher profits from the commercialization of these products. In this context, various high-value oils (babassu, olive, canola and sunflower) are adulterated with oils of low commercial value but similar physicochemical characteristics, such as soybean oil (SO) (Bunaciu et al., 2022; Chen et al., 2018b; Silveira et al., 2017; Pereira et al., 2022). Because this practice negatively affects the quality of the product and compromises the functional and bioactive properties of these oils, the authenticity and quality of these products need to be strictly monitored (Al-Ahmed et al., 2008).
The baruzeiro (Dipteryx alata Vog.) is a fruit species rich in phenolic compounds, which has been obtained from local producers in Pirenópolis, Goiás, Brazil, and used for oil extraction in Mato Grosso, Brazil. It belongs to the Fabaceae family and has ovoid, brown fruits that contain only one almond. Its seed is usually consumed as food and has significant nutritional value (Santos et al., 2016; Lemos et al., 2012).
Another fruit-bearing species is Annona muricata, also known as soursop, which belongs to the Annonaceae family and has oval, greenish fruits with flexible spines on their skin (Okigbo & Obire, 2009). In the food industry, the seeds of this species are usually discarded after the pulp is extracted. However, the oil extracted from seeds collected in Ondo town and nearby villages in Nigeria is rich in proteins and contains only small amounts of toxic compounds, making it an attractive alternative to complement animal and human food (Fasakin et al., 2008).
Several instrumental methodologies can be used in the quality control of these products, such as infrared spectroscopy techniques combined with mathematical, statistical, and computational modeling capable of extracting the information they provide, a discipline known as chemometrics (Barros et al., 2006). These methodologies are highly advantageous, with short analysis times, easy-to-handle instrumentation, and minimal residue generation. Regression methods are used to analyze the relationship between a dependent variable (concentration or another property of interest) and a set of independent variables (instrumental signals), and thereby to quantify the important properties of a set of samples (Olivieri, 2018). One of the best-known regression methods is the partial least squares (PLS) method, which was developed in the mid-1960s by Herman Wold and is still widely used today in spectrometry and in the monitoring and control of industrial processes. It uses the obtained components and seeks to maximize the covariance between independent and dependent variables, combining the characteristics of multivariate regression, canonical correlation analysis, and principal component analysis (Lin et al., 2020).
Several studies have used near-infrared (NIR) and Fourier-transform infrared (FT-IR) spectroscopy techniques combined with chemometrics for quality control of vegetable oils. In the NIR region, PLS has been used to identify adulteration of palm oil with pure lard (Basri et al., 2017) and to verify the purity of black sesame oil mixed with five kinds of oils (corn, rice, peanut, rapeseed and blend) (Chen et al., 2018a). In addition, a multivariate calibration method was used to study adulteration of olive oil with soybean, cotton, corn, canola and sunflower oils (Öztürk et al., 2010). In the mid-infrared region, analyses of rapeseed oil adulterated with refined and purified residual cooking oil and of extra virgin olive oil adulterated with edible oil revealed that this is a promising technique to distinguish pure samples (Sun et al., 2015; Wu et al., 2015).
To contribute to studies on spectroscopic and chemometric methodologies, the present study aimed to determine the adulteration of baru oil (BO) and soursop oil (SSO) with soybean oil (SO) by analyzing the spectral data obtained from NIR and FT-IR measurements, using PLS models.

Oil extraction
Baru and soursop seeds were purchased from local producers in Gurupi, TO, Brazil, and the experiments were conducted at the Federal University of Tocantins in the Materials Chemistry and Environmental Analysis laboratories. The oils were extracted in a hydraulic press with a capacity of 15 tons, whose extractor cell was made of aluminum and was completely dismountable for better handling and easier cleaning. For oil extraction, 200 g of baru seed and 250 g of soursop seed were previously cleaned, dried and ground. Next, they were added to the extractor cell and subjected to a pressure of 10 tons for a period of 10 h, after which there was no more oil flow. After extraction, the oils were transferred to a centrifuge tube and centrifuged at 2500 rpm for 15 minutes to eliminate any solid residues present. Then, the residue-free oils were transferred to a glass flask and stored at -20 °C. The extraction yields for BO and SSO were 30.3 and 20.1%, respectively.

Sample preparation
Adulteration of pure BO and SSO was performed with different percentages (% v/v) of SO, covering low, medium, and high levels, in order to use the maximum number of adulteration levels in model building. Thus, samples were prepared with adulteration levels ranging from 0.0-30.0% and 70.0-100.0% (with intervals of 2.5%) and from 40.0-65.0% (with intervals of 5.0%) (Mendes et al., 2015). However, during the construction of the models, not all of the proposed adulteration levels could be separated. Thus, for the construction of the models, the levels that presented the best results were selected, as shown in Table 1. Adulterations of the extracted oils with SO were performed in quintuplicate.
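The adulteration design described above can be enumerated with a few lines of Python. This is only an illustrative sketch: the function name `level_grid` and the 10 mL total sample volume are assumptions that do not come from the study.

```python
def level_grid(start, stop, step):
    """Return adulteration levels (% v/v) from start to stop, inclusive."""
    n_steps = round((stop - start) / step)
    return [round(start + i * step, 1) for i in range(n_steps + 1)]

# Ranges and intervals as stated in the text.
low = level_grid(0.0, 30.0, 2.5)
medium = level_grid(40.0, 65.0, 5.0)
high = level_grid(70.0, 100.0, 2.5)
levels = low + medium + high

# Hypothetical total volume per sample (assumption, not from the study).
TOTAL_VOLUME_ML = 10.0

def volumes(level_percent, total_ml=TOTAL_VOLUME_ML):
    """Return (pure oil mL, soybean oil mL) for a given % v/v level."""
    adulterant_ml = total_ml * level_percent / 100.0
    return total_ml - adulterant_ml, adulterant_ml

for level in levels:
    oil_ml, so_ml = volumes(level)
    print(f"{level:5.1f}% v/v: {oil_ml:5.2f} mL oil + {so_ml:5.2f} mL SO")
```

Under these assumptions the full design contains 32 candidate levels, of which only a subset was retained for each model (Table 1).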

Vibrational spectroscopy
Eight measurements were performed on each sample; thus, 80 spectra were obtained using the portable NIR and FT-IR equipment. NIR analyses were performed using a Texas Instruments® portable spectrometer (DLP NIRscan Nano Evaluation Module EVM, USA) equipped with an indium gallium arsenide detector. The spectra were obtained in the spectral range of 900-1700 nm (11099 to 5880 cm⁻¹), with an average resolution of 3.51 nm (17 cm⁻¹) and 32 scans in the absorbance mode. The software used was NIRscanNano GUI version 1.1.9, as developed by Texas Instruments® (TX, USA).
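The correspondence between the wavelength and wavenumber scales quoted above follows from the relation wavenumber (cm⁻¹) = 10⁷ / wavelength (nm). The short sketch below applies it to the nominal range limits; the function names are illustrative only.

```python
def nm_to_wavenumber(wavelength_nm):
    """Convert a wavelength in nm to a wavenumber in cm^-1."""
    return 1e7 / wavelength_nm

def wavenumber_to_nm(wavenumber_cm):
    """Convert a wavenumber in cm^-1 to a wavelength in nm."""
    return 1e7 / wavenumber_cm

for wavelength in (900.0, 1700.0):
    print(f"{wavelength:.0f} nm = {nm_to_wavenumber(wavelength):.1f} cm^-1")
```

The nominal limits of 900 and 1700 nm give about 11111 and 5882 cm⁻¹, close to the reported 11099 and 5880 cm⁻¹; the small difference presumably reflects the exact limits of the instrument's wavelength axis.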
A Perkin Elmer FT-IR spectrometer (SP, Brazil) operating in the absorption mode was used for the analyses in the mid-infrared region. The spectra were obtained using a CsI pellet in the range of 400-4000 cm⁻¹ with a resolution of 4 cm⁻¹ and 32 scans. To measure the blank of the equipment, a tablet without a sample was used and properly cleaned with pure acetone. The following steps were performed to obtain the spectra: a drop of oil was placed on the surface of the tablet, and the residual oil was removed using cotton swabs. Subsequently, the tablet was positioned inside the device to measure the oil sample. For each measurement, the tablet was cleaned with acetone, and the blank measurements were repeated.

PLS regression
For the analysis of the spectra obtained and the construction of the PLS models, MATLAB software version 7.12 (R2011b) and the PLS Toolbox package version 6.5 were employed, respectively. PLS is a well-known methodology, and the objective of this study was to find a correlation between matrix X, containing the spectral data obtained using the IR methods, and vector y, which corresponds to the chosen adulteration levels. This method does not require complete knowledge of all the components present in the samples under study, making it possible to predict the parameters of interest even in the presence of interferences, as long as they are present in the construction of the model (Rosipal, 2011). The matrices in this method are decomposed according to Equations 1 and 2, and the main idea is to calculate the scores of X and Y and define a regression model between them:

$X = TP' + E$ (1)

$Y = UQ' + F$ (2)

where P' and Q' are the weights, T and U are the scores, and E and F are the error matrices of X and Y, respectively. The objective of the PLS algorithm is to minimize the F term and maintain a relationship between the X and Y matrices based on the internal relationship U = TD, where D is a diagonal matrix. The T scores (related to matrix X) are considered as linear combinations of the original variables $x_k$ and weights $w^*_{kl}$ (where k = 1, 2, ... p; l = 1, 2, ... a, in which a is the number of components in the model), as shown in Equation 3:

$t_l = \sum_{k=1}^{p} w^*_{kl} x_k$ (3)

Therefore, the PLS method can be used for constructing a matrix of latent variables as a linear transformation of X:

$T = XW^*$ (4)

Substituting U = TD and Equation 4 into Equation 2, where B (p × m) is referred to as the PLS regression coefficient, the following Equation 5 is obtained (Rosipal, 2011):

$Y = XB + F$, with $B = W^* D Q'$ (5)

The samples were divided into calibration and validation sets with 26 and 14 spectra, respectively. The optimal number of latent variables was determined using leave-one-out cross-validation. The effectiveness of the calibration was determined based on the values of R², calibration errors (root mean square error of calibration, RMSEC), and errors obtained by cross-validation (root mean square error of cross-validation, RMSECV). From the results of the validation set, R², the error corresponding to the model prediction (root mean square error of prediction, RMSEP), and the obtained statistical bias were analyzed.
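To make the procedure concrete, the following is a minimal pure-Python sketch of PLS regression for a single response (PLS1, NIPALS-style deflation) with leave-one-out cross-validation. It is not the PLS Toolbox implementation used in the study; the function names, the five-point "spectra" and the calibration levels are hypothetical and serve only to illustrate the idea on noise-free synthetic mixtures.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def pls1_fit(X, y, n_components):
    """Fit PLS1 by successive deflation; returns means and components."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centred X
    f = [v - y_mean for v in y]                                # centred y
    components = []
    for _ in range(n_components):
        # Weight vector: direction of maximum covariance between X and y.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = dot(w, w) ** 0.5
        if norm < 1e-12:          # nothing left to model
            break
        w = [v / norm for v in w]
        t = [dot(row, w) for row in E]                         # scores
        tt = dot(t, t)
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = dot(f, t) / tt                                     # y loading
        # Deflate X and y before extracting the next component.
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, load, q))
    return x_mean, y_mean, components

def pls1_predict(model, x):
    """Predict y for one spectrum by replaying the deflation steps."""
    x_mean, y_mean, components = model
    e = [a - b for a, b in zip(x, x_mean)]
    y_hat = y_mean
    for w, load, q in components:
        t = dot(e, w)
        y_hat += q * t
        e = [a - t * b for a, b in zip(e, load)]
    return y_hat

def rmsecv_loo(X, y, n_components):
    """Root mean square error of leave-one-out cross-validation."""
    squared = 0.0
    for i in range(len(X)):
        model = pls1_fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:], n_components)
        squared += (pls1_predict(model, X[i]) - y[i]) ** 2
    return (squared / len(X)) ** 0.5

# Hypothetical noise-free "spectra" of a pure oil and of the adulterant.
pure_oil = [0.10, 0.40, 0.90, 0.30, 0.20]
adulterant = [0.20, 0.30, 0.60, 0.50, 0.10]

def mix(level_percent):
    c = level_percent / 100.0
    return [(1 - c) * a + c * b for a, b in zip(pure_oil, adulterant)]

cal_levels = [0.0, 10.0, 20.0, 50.0, 100.0]
X_cal = [mix(level) for level in cal_levels]
model = pls1_fit(X_cal, cal_levels, n_components=1)
predicted = pls1_predict(model, mix(25.0))
cv_error = rmsecv_loo(X_cal, cal_levels, n_components=1)
print(f"predicted level: {predicted:.2f}% v/v, RMSECV: {cv_error:.2e}")
```

In practice the number of latent variables would be chosen by repeating `rmsecv_loo` for an increasing number of components and selecting the value beyond which the RMSECV stops decreasing appreciably.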

NIR spectroscopy analysis
Figure 1 shows the overlaid NIR spectra of pure SO, SSO, and BO, where it is possible to observe a similarity between the samples. The spectral range used in this method was 900-1700 nm, where the bands are wide and overlapping, which is a limiting factor in the identification of adulteration; thus, preprocessing was necessary to highlight the most advantageous bands for the models. The difference in the absorbance intensity of BO is significant when compared with SO and SSO, with the latter two presenting similar intensities. The spectra obtained in the NIR region have characteristic bands corresponding to vibrational modes that are common for vegetable oils. The shape of the soybean oil spectra is very similar to that reported in works that used NIR spectroscopy to determine adulterations and classify oils (Costa et al., 2016; Pereira et al., 2019; Su et al., 2021). However, the literature contains no description of the NIR spectrum of baru seed oil, only the spectrum of cross sections of the wood of its tree (Pan et al., 2021). Similarly, there is no NIR spectrum of soursop seed oil in the literature, only the FT-IR spectrum of the oil (Elagbar et al., 2016) and the NIR spectrum of the leaves of its tree (Wulandari et al., 2020). Figure 1 shows the two main absorption bands in the spectra of SO, SSO, and BO. The first band is in the range of 1100-1300 nm, where there is an increase in the intensity of the band related to the C-H bonds of the -CH₂, -CH₃, and -CH=CH- functional groups. The second main band, between 1350 and 1550 nm, represents a combination of C-H stretching vibrations. These bands are common in the spectra of oils and fats because these are formed mainly by fatty acids (Basri et al., 2017; Galtier et al., 2007).

FT-IR spectroscopy analysis
The FT-IR measurements were performed in the range of 400 to 4000 cm⁻¹, which corresponds to the mid-infrared region.
Figure 2 shows the FT-IR spectra of the pure BO, SSO, and SO samples. The transmittance intensity also suggests that SSO and SO are more similar because they have similar intensity bands compared with the spectrum of BO. The spectra show main bands in the ranges of 1160-720 cm⁻¹, 1400-1200 cm⁻¹, 2922-2800 cm⁻¹, and at 1750 cm⁻¹. Furthermore, the bands at 721 cm⁻¹ and 1160 cm⁻¹ were assigned to the -CH₂ and -CO groups, respectively (Jalkh et al., 2018). The angular deformations of CH₂ and CH₃ were responsible for the bands in the region between 1400 and 1200 cm⁻¹. The band observed at 1750 cm⁻¹ was due to the stretching of the ester carbonyl. The intensity of the bands observed in the range of 2922-2800 cm⁻¹ and at 1465 cm⁻¹ corresponds to the C-H stretching of the terminal methylene and methyl groups, respectively, in the fatty acid chains of triacylglycerols (Lastras et al., 2021).

PLS models
Table 1 presents the adulteration levels, spectral range, and number of variables selected for each model. For BO, the first method (NIR) uses 131 variables corresponding to the two main bands present in the spectra. The adulteration levels used in this model were 0, 10, 20, 50, 60, 85, 95, and 100% v/v. In the second method (FT-IR), a total of 1601 variables was used, corresponding to the two bands with the highest absorption between 3200.00 and 1601.02 cm⁻¹, with adulteration levels of 0, 14, 30, 40, 50, 60, 70, and 100% v/v. In both methods, leave-one-out cross-validation was used, and the preprocessing methods were baseline, multiplicative scatter correction (MSC), and mean center. R² was calculated for both the calibration and validation sets, in addition to the RMSEC and RMSECV for calibration as well as BIAS and RMSEP for validation.
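Two of the preprocessing steps named above can be sketched as follows. This is an illustrative pure-Python version, not the PLS Toolbox routines; it assumes that MSC uses the mean spectrum as reference, and it omits the baseline step because the text does not specify which baseline variant was applied. The example spectra are hypothetical.

```python
def mean_spectrum(spectra):
    n = len(spectra)
    return [sum(s[j] for s in spectra) / n for j in range(len(spectra[0]))]

def msc(spectra, reference=None):
    """Multiplicative scatter correction against a reference spectrum.

    Each spectrum s is regressed on the reference r (s = offset + slope * r)
    and corrected as (s - offset) / slope.
    """
    ref = mean_spectrum(spectra) if reference is None else reference
    r_bar = sum(ref) / len(ref)
    s_rr = sum((r - r_bar) ** 2 for r in ref)
    corrected = []
    for s in spectra:
        s_bar = sum(s) / len(s)
        slope = sum((r - r_bar) * (v - s_bar) for r, v in zip(ref, s)) / s_rr
        offset = s_bar - slope * r_bar
        corrected.append([(v - offset) / slope for v in s])
    return corrected

def mean_center(spectra):
    """Subtract the mean spectrum from every spectrum (column centering)."""
    m = mean_spectrum(spectra)
    return [[v - mj for v, mj in zip(s, m)] for s in spectra]

# Hypothetical spectra: one shape with different offsets and scale factors,
# mimicking pure scatter effects.
base = [0.1, 0.4, 0.9, 0.3]
raw = [[2.0 * v + 0.5 for v in base], [0.5 * v - 0.1 for v in base], base]
corrected = msc(raw)
centered = mean_center(corrected)
```

Because the three example spectra differ only by an offset and a scale factor, MSC maps them onto the same curve and the mean-centered result is essentially zero; in real data the remaining differences carry the chemical information used by the PLS model.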
Table 2 lists the parameters obtained from the two models using BO adulterated with SO. Figures 3A and 3B present the models obtained using NIR and FT-IR spectroscopies, respectively. The optimal number of latent variables, as determined from the cross-validation, was four for both methods, accounting for more than 98% of the spectral data used for the two models. Both cases presented R² values > 0.96, with FT-IR being the more efficient technique. The RMSEC and RMSECV errors from the calibration set were low in both cases, as shown in Table 2. The RMSEP, which measures the errors related to the model's prediction potential, and the BIAS, which reflects the systematic errors in the validation stage, were also low, attesting to the effectiveness of the model's prediction.
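The figures of merit discussed above can be computed as in the sketch below. RMSEC, RMSECV and RMSEP share the same formula and differ only in which predictions are supplied (calibration fit, cross-validation, or validation set). The sign convention for the bias (predicted minus reference) and the definition of R² as 1 - SS_res/SS_tot are assumptions, since the text does not state which conventions the software used; the numbers in the example are hypothetical.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error between reference and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / n)

def bias(y_true, y_pred):
    """Mean signed error (predicted minus reference); assumed convention."""
    return sum(p - t for t, p in zip(y_true, y_pred)) / len(y_true)

def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed form)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical validation results in % v/v (not data from the study).
reference = [0.0, 10.0, 20.0]
predicted = [1.0, 11.0, 21.0]
print(rmse(reference, predicted), bias(reference, predicted),
      r_squared(reference, predicted))
```

A constant overestimate of 1% v/v, as in this example, yields an RMSEP and a bias of the same magnitude, which is how a systematic error can be told apart from random scatter.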
Previous studies have proposed the use of FT-IR associated with multivariate modeling to obtain fast methods that can detect adulterations. In Brazil, partial least squares models built on spectral data of butter oil adulterated with soybean oil allowed the prediction of the percentage of adulteration with high precision, low relative errors, and coefficients of determination above 0.9 for all calibration and validation datasets (Pereira et al., 2019). A method for detecting adulteration of babassu oil, of Brazilian origin, with soybean oil using FT-IR and PLS showed good performance in detecting and quantifying this adulteration (sensitivity to detect 5% adulteration) (Pereira et al., 2022). The adulteration of passion fruit oil, of Brazilian origin, with sunflower oil was determined using FT-IR and PLS, and small amounts of the adulterant (less than 1%) were detected (Kiefer et al., 2019). Thus, considering the results already present in the literature, the present study demonstrates the high potential of FT-IR and PLS for the identification of adulteration of baru and soursop seed oils with soybean oil.
For the adulteration of SSO with SO, the model built from the NIR spectra (Figure 4A) was the one that used the fewest adulteration levels: only the band in the range of 1322.7-1584.1 nm was used, totaling 79 variables, with adulteration levels of 0, 20, 30, 40, 80, and 100% v/v (Table 1). This may be related to the similarity of the SSO and SO bands in the NIR spectra (Figure 1), which are not only fewer but also wider, making it difficult to differentiate the oils. However, even with this restriction, the adulteration levels used were successfully identified, forming a good model. The model obtained from the FT-IR data (Figure 4B) used 1601 variables from the two main bands (3200.00-1601.02 cm⁻¹), with adulteration levels of 0, 10, 30, 50, 60, 70, 90, and 100% v/v. In both cases, leave-one-out cross-validation was performed on the calibration sets. Furthermore, baseline, MSC, and mean center preprocessing were employed because they yielded the most effective models for the identification of adulterated samples. As for the BO models, the corresponding parameters were calculated and are listed in Table 3.
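Restricting a model to a spectral band, as done for the SSO NIR model, amounts to keeping only the variables whose wavelength falls inside the chosen window. The sketch below illustrates this with a hypothetical, evenly spaced wavelength axis (3.5 nm step); the real instrument grid is not given in the text, so the number of selected variables here will differ from the 79 reported.

```python
def select_band(wavelengths, spectrum, lower_nm, upper_nm):
    """Keep the points of a spectrum whose wavelength lies in [lower, upper]."""
    keep = [i for i, w in enumerate(wavelengths) if lower_nm <= w <= upper_nm]
    return [wavelengths[i] for i in keep], [spectrum[i] for i in keep]

# Hypothetical wavelength axis from 900 nm in 3.5 nm steps (assumption).
wavelengths = [900.0 + 3.5 * i for i in range(229)]
# Placeholder absorbance values, only to give the spectrum a shape.
spectrum = [0.001 * i for i in range(229)]

band_axis, band_values = select_band(wavelengths, spectrum, 1322.7, 1584.1)
print(len(band_axis), band_axis[0], band_axis[-1])
```

The same function could be applied to every spectrum before preprocessing and model building, so that only the selected band enters matrix X.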
The results obtained for SSO (Table 3) demonstrate the effectiveness of the PLS models based on the NIR and FT-IR spectra. The two methods obtained R² values above 0.98 in the calibration and validation sets, with three latent variables for both NIR and FT-IR, accounting for more than 99% of the spectral data used. The errors related to the calibration (RMSEC and RMSECV) and validation (RMSEP and BIAS) sets confirmed the quality of the PLS models.
In the literature, portable NIR spectroscopy combined with chemometrics has been used successfully to predict quality attributes and adulteration levels of different types of samples. A portable near-infrared spectrometer was evaluated for rapid detection of adulteration in butter oil produced in Brazil and for prediction of quality attributes (acidity and peroxide value) using PLS regression and other chemometric tools, demonstrating an excellent ability to predict the percentage of milk fat that had been replaced by other fats (Medeiros et al., 2023). In the United States of America, near-infrared spectroscopy using a portable instrument and PLS resulted in a good model to detect the level of adulteration of quinoa flour with buckwheat flour, cassava flour, corn starch, rice flour and wheat flour, with a coefficient of determination of 0.94 (Wang et al., 2022). In Brazil, the authenticity of coriander oil and its adulteration with other commercial vegetable oils such as palm olein, canola oil and soybean oil were studied using portable NIR and chemometric tools (PCA, LDA and PLS). The PLS regression models presented coefficients of determination of 0.98, 0.99 and 0.99 for coriander oil adulterated with palm oil, soybean oil and canola oil, respectively (Kaufmann et al., 2022). In this context, both the previous studies and the results presented here show that the portable NIR spectrometer, combined with chemometric tools, is effective for identifying and quantifying oil adulteration.

Conclusion
These results show the importance of spectroscopic and chemometric methodologies in the identification of adulterated oils. This study compares different techniques and addresses the need for preprocessing to minimize spectral noise and highlight the information contained in the spectra. Thus, models for the quantification of the adulterant (soybean oil) in baru and soursop oils were constructed using PLS regression and the following preprocessing methods: baseline, multiplicative scatter correction (MSC), and mean center. The regression models were successfully built from the NIR and FT-IR analyses, with relatively low errors and calibration and validation coefficients of determination > 0.95 in all cases. To build the soursop oil adulteration model using NIR, it was necessary to use a smaller number of adulteration levels, which resulted in a minimum detection limit of 20%, a higher value than that found using FT-IR. In contrast, for the baru oil models, the lower limit of detection of adulteration was lower using NIR. This indicates that both spectroscopic techniques can be used with the same objective; however, they provide models with different limits of quantification.
Finally, the results obtained indicate that FT-IR and portable NIR spectrometers are promising tools for identifying adulteration of baru and soursop oils, and could be used, for example, for fast authentication and quality control in the oil processing industry or in seizures by supervisory bodies.

Figure 1.NIR spectra of the BO, SSO, and SO samples.

Figure 2. FT-IR spectra of the BO, SSO, and SO samples.

Figure 3. Calibration and validation predictions of the percentage of BO adulteration by the PLS models based on the (A) NIR and (B) FT-IR spectra.

Figure 4. Calibration and validation predictions of the percentage of SSO adulteration by the PLS models based on the (A) NIR and (B) FT-IR spectra.

Table 1. Percentages of adulteration of oils, spectral regions and number of variables.

Table 2. Evaluation of the PLS models predicting the percentage of adulteration of BO by SO.

Table 3. Evaluation of the PLS models predicting the percentage of adulteration of SSO by SO.