Introduction

It is commonly accepted that the three-dimensional (3D) atomic structure of metallic nanoparticles determines their catalytic properties1,2,3,4,5. Indeed, the presence of highly undercoordinated atoms or stepped facets at the surface governs many catalytic reactions. A quantitative characterization of the atomic configuration at the surface is therefore essential to reveal the active sites of the nanoparticle where reactant molecules are preferentially adsorbed, to understand the mechanisms of the catalytic behavior, and to improve the performance of these systems. Atomic-resolution annular dark field scanning transmission electron microscopy (ADF STEM) has become an invaluable tool for imaging metallic nanostructures6,7,8,9,10. In this context, electron tomography has been used to provide insights into the 3D shape of nanostructures11,12,13, but this technique requires a significant electron dose for the multiple projections. Consequently, this approach is not feasible when investigating small beam-sensitive catalysts or dynamical processes. Therefore, alternative methods have been developed where the 3D atomic structure is reconstructed from a single ADF STEM projection8,14,15,16,17. For this purpose, the number of atoms contained in an atomic column along the third dimension is retrieved from an ADF STEM image. These atom counts are combined with prior knowledge about a material’s crystal structure and used to create an initial atomic model. This serves as an input for energy minimization to obtain a relaxed 3D reconstruction of a nanostructure. The validity of this atom-counting/energy minimization method has qualitatively been verified using electron tomography10,16 and is applied to study several systems5,8,10,16,18.

Nowadays, different approaches are available for energy minimization that incorporates experimental observations. Indeed, the aim is not to find the global energy minimum from a purely computational study but also to reconstruct the metastable configurations in which the nanoparticles often reside. Yu et al. refined the atomic structure by minimizing the particle energy while matching a forward linear model to the experimental ADF STEM images15. In this study, an approximate model for the STEM image simulation was required, limiting the accuracy of the reconstructed models. Other methodologies use atom-counting results as an input, and mainly, two different strategies can be distinguished here. Using the first approach, the energy is minimized by shifting the atomic columns up and down while keeping the number of atoms in a column fixed to the outcome of the atom-counting procedure8,18. The second approach consists of a full molecular dynamics simulation to relax the particle’s structure5,10,16. The first method is potentially too restrictive because it ignores the finite atom-counting precision, especially at lower doses. The inevitable counting imprecision in this case9 will likely result in slightly more roughness at the reconstructed atomic surface in the direction parallel to the beam direction8,16. On the other hand, the second method runs the risk of ending up in a global energy minimum and violating the physical constraints of the experimental observation. Both approaches hamper a reliable 3D reconstruction of the atomic configuration at the surface, especially for images acquired at lower doses.

Here, we propose a Bayesian genetic algorithm that includes the finite atom-counting precision via a Bayesian inference scheme to improve the 3D atomic models for small nanostructures. Bayesian methods are powerful tools in which a priori information is rationally combined with the observed data and have been successful in other fields of science including business, computer science, economics, educational research, environmental science, epidemiology, genetics, geography, imaging, law, medicine, political science, psychometrics, public policy, sociology, and sports19,20,21,22. Next to the finite atom-counting precision, the incorporation of additional prior knowledge from neighbor-mass relations will be beneficial when reconstructing atomic models from extremely low-dose ADF STEM images. This prior knowledge is fused into the genetic algorithm which uses atom-counting results as an input for reconstructing the 3D atomic structure. Genetic algorithms are typically used as efficient tools for solving large optimization problems where finding a direct solution is not possible. These evolutionary methods are widely applied for structure prediction of systems with increasing complexity, such as atomic clusters, crystals, interfaces, surfaces, thin films, nanoparticles, and defect structures15,21,23,24,25,26,27,28,29.

In this study, we focus on the 3D characterization of small, single-crystal, mono-metallic catalyst nanoparticles. The technologically relevant size of the catalytic particles considered here, for which the catalytic activity is peaking30, is around 3 nm in size, but the method will be applicable to structures up to 10 nm in diameter. The method will also be suitable for the different types of single-element crystalline nanostructures for which the model-based atom-counting method is designed31,32. This covers atomic-resolution ADF STEM images of nanostructures with a varying thickness, acquired along a main zone axis, which are defect-free along the beam direction. For more complex shapes, e.g., concave surfaces, it will depend on the direction of the surface with respect to the incident electron beam whether the specific feature can be reconstructed reliably from the atom-counting results of a single projection. In this work, the quality of the obtained reconstructions with our advanced Bayesian genetic algorithm is quantitatively evaluated via an extensive simulation study, in terms of the reliability with which the surface atoms can be reconstructed in 3D. In the last part, the algorithm is applied to retrieve 3D atomic models from an experimental time series of a Pt nanoparticle.

Results and discussion

Pt nanoparticle model and atom-counting

To count the number of atoms, a very well-established approach is utilized8,11,16,31,32. In the past years, the high accuracy and precision of this method have been demonstrated for nanocrystals of arbitrary shape, size, and atom type. For atom-counting, so-called scattering cross-sections (SCSs), corresponding to the total intensity of electrons scattered toward the ADF detector for every atomic column, have been introduced in ADF STEM. These SCSs can be measured using statistical parameter estimation theory33,34 or by integrating intensities over the probe positions in the vicinity of a single column of atoms35. For our simulation study, SCSs are determined from noise realizations at different doses of a simulated ADF STEM image of a Pt nanoparticle partially embedded in a carbon support, illustrated in Fig. 1a, b. The created particle largely resembles the Wulff construction solid for Pt₅₈₆ (four atoms along each edge) and was modified to include several diagnostic features of interest commonly observed for catalytic metallic nanoparticles including a surface adatom, a surface vacancy, a terrace edge, a small area of {110} face, and rounded corners. With these modifications, the particle model contains 587 atoms. The slab of amorphous carbon measures 5 nm × 5 nm × 2 nm and the geometry follows the work of reference36. Image simulations were performed for the Pt particle viewed along the [110] zone axis. The details of the simulations are described in the Methods section. Figure 1c shows the annular bright field (ABF) STEM image, illustrating the presence of the carbon support and Fig. 1d shows the ADF STEM image using an electron dose of 10⁴ e Å⁻². In Fig. 2a, b simulated ADF STEM images are shown using lower incident electron doses corresponding to 10³ and 10² e Å⁻², respectively. The simulated ADF STEM images can be modeled as a superposition of Gaussian functions using the StatSTEM software16.
The refined models for the simulated ADF STEM images are shown in Fig. 2c, d. From the estimated model parameters, the SCSs are determined for each atomic column and are represented in a histogram in Fig. 2g, h. In a subsequent analysis, the distribution of the SCSs of all atomic columns is decomposed into overlapping normal distributions, i.e., a Gaussian mixture model, as illustrated in Fig. 2g, h31,32. The locations of the normal distributions are matched to the expected SCS values from simulations for a column containing g atoms37 and their widths are determined. It should be noted that this procedure only depends on the values of the estimated SCSs and is independent of the subjective choice of bins in the histogram for visualization. The number of atoms in each projected atomic column in Fig. 2e, f is then obtained by assigning the SCS to the component that generates this SCS value with the highest probability. More details on the atom-counting methodology are provided in the Methods section. The width of the normal distributions reflects the finite precision of the atom-counting results and will be used as prior knowledge for reconstructing the 3D atomic structure.

Fig. 1: Simulated Pt nanoparticle partially embedded in an amorphous carbon support.

a 3D model and b cross-section, c simulated ABF STEM image, and d simulated ADF STEM image at 10⁴ e Å⁻². The scale bars in c, d are 1 nm.

Fig. 2: Statistical atom-counting analysis and construction of the probability matrix.

a, b Simulated ADF STEM images of the Pt nanoparticle embedded in the carbon support along the [110] zone axis for incident electron doses of 10³ and 10² e Å⁻², respectively. c, d Refined parametric models of the images shown in a, b. e, f Estimated number of atoms in each atomic column for the images shown in a, b. g, h Histograms of SCSs for the images shown in a, b. The decomposition into overlapping Gaussian components is shown in color, corresponding to the number of atoms. The black curve shows the estimated mixture model. i, j Probability matrices indicating the probability for a column n to contain a specific number of atoms g. The scale bars in a and b are 5 Å.

Bayesian inference: finite atom-counting precision and neighbor-mass relations

As an input, we need the probability that an atomic column contains a specific number of atoms, which can be defined from the decomposition into normal distributions, illustrated in Fig. 2g, h. Indeed, the normal distribution functions describe the probability that component g generates the nth SCS, i.e., \(p({\rm{SCS}}_{n}| g)\), and by using Bayes’ theorem, the probability that the nth SCS has g atoms, i.e., \(p(g| {\rm{SCS}}_{n})\), can be computed:

$$p(g| {{{{\rm{SCS}}}}}_{n})=\frac{p(g)p({{{{\rm{SCS}}}}}_{n}| g)}{p({{{{\rm{SCS}}}}}_{n})}=\frac{p(g)p({{{{\rm{SCS}}}}}_{n}| g)}{{\sum }_{g}p(g)p({{{{\rm{SCS}}}}}_{n}| g)}.$$
(1)

Equal probabilities are assigned to the probability of having g atoms in a column p(g). The value of p(g) depends on the maximum number of g that is considered. However, for the computation of the probability \(p(g| {\rm{SCS}}_{n})\), it is irrelevant since a constant p(g) cancels in Eq. (1). The probability \(p(g| {\rm{SCS}}_{n})\) is visualized by a probability matrix in Fig. 2i, j. From the representation of the Gaussian mixture model on top of the histogram, the relation between the probability matrix and the width of the normal distributions is clear.
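As an illustration of Eq. (1), the following minimal sketch evaluates the posterior for a single column under a uniform prior p(g). The function name `column_posterior` and the numerical values of the component locations and width are hypothetical and chosen only for demonstration; they are not values from this study. Because all components share one width, the Gaussian normalization constant cancels and only the exponents are needed.

```python
import math

def column_posterior(scs, mus, sigma):
    """Return p(g | SCS_n) for g = 1..G, assuming a uniform prior p(g).

    mus[g-1] is the expected SCS of a column with g atoms and sigma is the
    common width of the normal components (both hypothetical inputs here).
    """
    # Log-likelihood of each component up to a constant shared by all g.
    log_lik = [-0.5 * ((scs - mu) / sigma) ** 2 for mu in mus]
    # Subtract the maximum before exponentiating to avoid underflow.
    top = max(log_lik)
    weights = [math.exp(value - top) for value in log_lik]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical library of expected SCS values for 1..5 atoms.
mus = [0.05 * g for g in range(1, 6)]
sigma = 0.02
post = column_posterior(0.16, mus, sigma)

# Most probable number of atoms for this column (1-based).
best_g = max(range(len(post)), key=post.__getitem__) + 1
```

Stacking the returned lists for all columns row by row would give a probability matrix of the kind shown in Fig. 2i, j.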

To further improve the quality of the reconstructions at lower doses, we can include even more relevant prior knowledge in the form of neighbor-mass relations. A neighbor-mass matrix helps to predict the column mass based on the average mass of the neighboring columns. For the small spherical convex nanoparticles in this study, abrupt discontinuities are highly non-physical30. Therefore, we propose here a diagonal neighbor-mass matrix. The matrix is visualized in Supplementary Fig. S2. The probability profile is Gaussian and the width is chosen such that the interval ±1 atom contains 80% of the probability. One should keep in mind that for particles with significantly different surfaces, the neighbor-mass matrix should be designed differently. The normalized neighbor-mass probability, \(p(g| {\rm{NB}}_{n})\) with \({\rm{NB}}_{n}\) indicating the average neighbor-mass, is combined with the probability matrix accounting for the atom-counting reliability \(p(g| {\rm{SCS}}_{n})\), in order to take the two types of prior knowledge into account:

$$p\left(g| {{{{\rm{SCS}}}}}_{n}\cap {{{{\rm{NB}}}}}_{n}\right)=\frac{p(g| {{{{\rm{SCS}}}}}_{n})p(g| {{{{\rm{NB}}}}}_{n})}{{\sum }_{g}p(g| {{{{\rm{SCS}}}}}_{n})p(g| {{{{\rm{NB}}}}}_{n})}.$$
(2)
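The combination in Eq. (2) can be sketched as follows. This is only an illustration under stated assumptions: the neighbor-mass prior is taken as a Gaussian centered on the average neighbor mass and evaluated at integer g, with a width chosen so that a continuous normal distribution has 80% of its probability within ±1 atom. The exact discretization of the matrix in Supplementary Fig. S2 may differ, and the names `SIGMA_NB`, `neighbor_prior`, and `combine` are hypothetical.

```python
import math
from statistics import NormalDist

# Width such that P(|x| <= 1) = 0.8 for a zero-mean normal distribution:
# the 90th percentile of the standard normal must map onto +1 atom.
SIGMA_NB = 1.0 / NormalDist().inv_cdf(0.9)

def neighbor_prior(avg_neighbor_mass, g_max):
    """Normalized p(g | NB_n) for g = 1..g_max (assumed Gaussian profile)."""
    weights = [
        math.exp(-0.5 * ((g - avg_neighbor_mass) / SIGMA_NB) ** 2)
        for g in range(1, g_max + 1)
    ]
    total = sum(weights)
    return [w / total for w in weights]

def combine(p_scs, p_nb):
    """Eq. (2): normalized element-wise product of the two probabilities."""
    products = [a * b for a, b in zip(p_scs, p_nb)]
    total = sum(products)
    return [p / total for p in products]

# Hypothetical column: atom counting is ambiguous between 2 and 3 atoms,
# while the neighboring columns contain 2 atoms on average.
p_scs = [0.0, 0.45, 0.55, 0.0, 0.0]
p_nb = neighbor_prior(2.0, g_max=5)
p_comb = combine(p_scs, p_nb)
```

In this hypothetical case the atom-counting probability alone slightly favors three atoms, whereas the combined probability favors two atoms because of the neighbor-mass relation.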

Bayesian genetic algorithm for 3D atomic modeling

The introduced probability matrices are used as input of the prior knowledge for the genetic algorithm that we will use to reconstruct the 3D atomic structure of the nanoparticle, hence the name Bayesian genetic algorithm. A genetic algorithm is an iterative process where first a population of randomly generated individuals is created. In our algorithm, this initial population is generated by randomly modifying the number of atoms and the height offset of the atomic columns of a 3D starting configuration within a certain mutation range. This 3D starting configuration is obtained by positioning the atoms (i.e., the outcome of the atom-counting procedure) in each atomic column parallel to the beam direction and symmetrically around a central plane. Based on prior knowledge about the crystal structure, the atoms are separated at a fixed distance \(a/\sqrt{2}\) with a = 3.92 Å corresponding to the [110] zone orientation. For this orientation, the atomic columns at the even planes are shifted along the beam direction by \(a/(2\sqrt{2})\) with respect to the odd planes. A population size of 500 is used in all calculations, the count mutation range equals 1, and the height mutation range for the offset of a column equals a lattice step, i.e., the interatomic spacing between the Pt atoms along the [110] direction. In each iteration, i.e., a generation, the fitness of every individual in the population is evaluated by the cost-function of the optimization problem. The individuals with the best cost-function values are selected from the current population, and a new complete population of candidate solutions is formed by recombining and mutating the selected individuals. The fraction of the population that is used for the recombination step equals 50%. For each recombined member, two parents are randomly chosen from the selected individuals, and cross-over is performed by randomly selecting columns from both parents. 
A mutation density of 2% is included to avoid ending up immediately in a local minimum by randomly modifying the number of atoms and height offset for 2% of the atomic columns in each new member. In addition to the usual iterations over many breeding generations, in this work we introduce a second loop to provide for multiple unique starting initializations, specifically to reduce the risk of finding only local-minima solutions. It should be noted that in the population of candidate solutions the atoms are not relaxed in x-, y-, and z-direction in order to search a broad population with fast convergence. A geometry relaxation, performed using LAMMPS38, would not affect the ranking of the models and thus not influence the reliability of the reconstructed models as illustrated in Supplementary Table S2. The geometry relaxation only reduces the cost-function by a consistent amount as shown in Supplementary Fig. S3. More details of the genetic algorithm are provided in the Methods section. The cost-function χ that we use to evaluate the candidate solutions within the different generations of the genetic algorithm is given by:

$$\chi =\frac{\sum {E}_{{{{\rm{a}}}}}}{{\sum }_{n}{g}_{n}}\cdot \left(1+\root n \of {\mathop{\prod}\limits_{n}p({g}_{n}| {{{\rm{prior\,knowledge}}}})}\right),$$
(3)

where ∑Ea is the sum of the energies per atom given by the EAM potential39,40 and ∑ngn is the total particle mass. This cost-function consists of two factors where the first represents the average energy per atom that we wish to minimize. The second factor allows us to penalize the energy per atom with the probability of the candidate solution based on the prior knowledge (Eq. (1) or Eq. (2)). This form of the cost-function weights the penalty term with the average energy per atom. The model probability itself is based on the geometric mean of the probabilities of each individual column and needs to be maximized. Since the average energy per atom is negative, this cost-function is minimized.
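A minimal sketch of how Eq. (3) could be evaluated is given below. The per-atom energies would come from the EAM potential; here they are simply passed in as a list, and the function name `cost_function` and all numerical values are hypothetical. The geometric mean is computed through logarithms to avoid underflow when many column probabilities are multiplied.

```python
import math

def cost_function(atom_energies, column_counts, column_probs):
    """Eq. (3): average energy per atom times (1 + geometric-mean probability).

    atom_energies : per-atom energies (eV), e.g., from an EAM potential
    column_counts : number of atoms g_n assigned to each column
    column_probs  : p(g_n | prior knowledge) for each column
    """
    mean_energy = sum(atom_energies) / sum(column_counts)
    if any(p <= 0.0 for p in column_probs):
        # A single impossible column makes the whole model improbable.
        geometric_mean = 0.0
    else:
        log_sum = sum(math.log(p) for p in column_probs)
        geometric_mean = math.exp(log_sum / len(column_probs))
    return mean_energy * (1.0 + geometric_mean)

# Hypothetical two-column model with six atoms of equal energy.
energies = [-5.0] * 6
counts = [3, 3]
chi_likely = cost_function(energies, counts, [0.9, 0.9])
chi_unlikely = cost_function(energies, counts, [0.5, 0.5])
```

Because the average energy per atom is negative, the more probable model receives the lower (more negative) cost, which is the behavior described above.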

Simulation results

In order to evaluate the quality of the reconstructions using the Bayesian genetic algorithm, an extensive simulation study is carried out. For this purpose, the electron dose is varied between 10² and 10⁵ e Å⁻² and 30 noise realizations are generated at each electron dose. Supplementary Fig. S4 summarizes the atom-counting results obtained following the methodology illustrated in Fig. 2 and which are used as an input for our Bayesian genetic algorithm. The results of the reconstructions are quantitatively summarized by comparing the reconstructed 3D models with the ground truth model. As a criterion to evaluate the 3D atomic models, we used the fraction of surface atom positions, i.e., with coordination number less than 12, that are correctly defined in 3D. These are the atoms that are of interest for catalysis. Figure 3 shows this fraction for the reconstructed atomic models. As a reference, we also included the fraction following the approach where the number of atoms is fixed to the outcome of the atom-counting procedure and where during the reconstruction the atomic columns are only shifted up and down. A significant, targeted improvement for the reconstructed surface atoms is observed for the lower incident electron doses when including the finite atom-counting precision and the neighbor-mass relations. For the lowest incident electron dose of 10² e Å⁻², we see an improvement from 57% to 73%. It is important to note that only Poisson noise is included in the simulation study. In experiments, environmental distortions will introduce additional uncertainties in the atom-counting results41. Therefore, we expect that the Bayesian genetic algorithm will also improve the reliability of the reconstructed 3D models at higher incident electron doses for experiments. If the probability matrix of the finite atom-counting precision (Fig. 2i, j) shows very high probabilities for all the atomic columns, i.e., when there is no ambiguity in the assignment of the number of atoms, the less computationally demanding method, where the number of atoms in the atomic columns is fixed during the reconstruction, might be preferred as it will result in a reliable 3D atomic model faster. In Supplementary Fig. S5a, the results obtained when using the neighbor-mass relations only are also included, next to the results of Fig. 3. The quality of the neighbors-only reconstructions is dose-independent and significantly lower at the higher incident dose values. In this case, there are no constraints set by the experimental observations and the reconstruction is more determined by a lower energy solution. The high performance at low doses for the neighbors-only reconstructions is partially owing to the choice of a relatively well-faceted, low-energy particle. These results illustrate that prior knowledge about the finite atom-counting precision is essential in the Bayesian genetic algorithm and that it is the combination of both types of prior knowledge that leads to improved performance as illustrated in Fig. 3. Furthermore, the results in this figure demonstrate that the different contributions in the cost-function, i.e., the average energy per atom, the finite atom-counting precision, and the neighbor-mass relations, are properly balanced and that the reconstructed 3D model is not fully determined by one of the contributions.
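The evaluation criterion used above, i.e., the fraction of surface atoms (coordination number below 12) that are correctly positioned in 3D, can be sketched as follows. This is a simplified illustration and not the evaluation code of this study: the nearest-neighbor cutoff of 0.85a, the matching tolerance of 0.5 Å, and the function names are assumptions, and a small fcc block stands in for the ground truth particle.

```python
import math

A = 3.92            # Pt lattice parameter in angstrom
CUTOFF = 0.85 * A   # assumed cutoff between first (a/sqrt(2)) and second (a) shell

def fcc_block(n_cells):
    """Atom positions of an fcc block of n_cells^3 conventional unit cells."""
    basis = [(0.0, 0.0, 0.0), (0.5, 0.5, 0.0), (0.5, 0.0, 0.5), (0.0, 0.5, 0.5)]
    return [
        ((i + bx) * A, (j + by) * A, (k + bz) * A)
        for i in range(n_cells)
        for j in range(n_cells)
        for k in range(n_cells)
        for bx, by, bz in basis
    ]

def coordination_numbers(atoms, cutoff=CUTOFF):
    """Number of neighbors within the cutoff for every atom (O(N^2) search)."""
    return [
        sum(1 for other in atoms if other is not atom and math.dist(atom, other) <= cutoff)
        for atom in atoms
    ]

def surface_fraction_correct(truth, recon, tol=0.5):
    """Fraction of ground-truth surface atoms found in recon within tol (angstrom)."""
    cn = coordination_numbers(truth)
    surface = [atom for atom, c in zip(truth, cn) if c < 12]
    hits = sum(1 for s in surface if any(math.dist(s, r) <= tol for r in recon))
    return hits / len(surface)

# Toy example: the "reconstruction" misses one corner atom of the block.
truth = fcc_block(3)
cn = coordination_numbers(truth)
recon = [atom for atom in truth if atom != (0.0, 0.0, 0.0)]
frac = surface_fraction_correct(truth, recon)
```

The same coordination numbers are what the coloring of the atoms in the reconstructed models represents.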

Fig. 3: Quantitative evaluation of the reconstructed 3D models.

Fraction of the surface atoms that are correctly reconstructed in 3D when including the finite atom-counting precision and neighbor-mass relations as prior knowledge. As a reference, the results when using a fixed number of atoms in a column are also displayed. The error bars indicate 95% confidence intervals.

In order to evaluate the reconstructed 3D atomic models in more detail, the 4th worst and 4th best reconstructions of the 30 noise realizations at each dose can be visualized as an 80% prediction interval for the reconstructions. The 4th worst and 4th best reconstructions are selected based on the percentage of correctly reconstructed surface atoms. These intervals are shown in Fig. 4 and Supplementary Fig. S5b. The colors of the atoms correspond to the coordination number and the reconstruction in the box in Fig. 4c corresponds to the ground truth model. The coordination number serves as a powerful predictor for surface adsorption strength on Pt nanoparticles, and hence as a predictor of chemical activity3,42,43,44. Even for the lower doses in Fig. 4c, the shape is very well reconstructed and a vast improvement is observed when including relevant prior knowledge, resulting in less roughness from the finite atom-counting precision at the surface as compared to the low-dose reconstructions in Fig. 4a, where the number of atoms in an atomic column has been kept fixed. Some features will be more easily reconstructed unambiguously than others. Surface facets are typically well reconstructed. As expected, smaller details are more challenging. For example, an adatom will be easier to detect if it is located at the side of the nanoparticle in projection. However, in this case, it is more difficult to locate this atom correctly along the depth direction. On the other hand, an adatom on top of the nanoparticle will be more difficult to detect if the atom-counting precision is limited, but once detected, it can more easily be located along the depth direction. In general, if the direction of the incident electron beam is not immediately visible based on the reconstructed 3D atomic models, a trustworthy reconstruction is obtained.
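The 80% prediction interval mentioned at the start of this paragraph follows from simple order statistics: between the 4th worst and the 4th best of 30 reconstructions lie 24 of the 30 values. A minimal sketch, with a hypothetical function name and hypothetical scores, is:

```python
def prediction_interval(scores, k=4):
    """Return the k-th worst score, the k-th best score, and the covered fraction."""
    ordered = sorted(scores)
    lower = ordered[k - 1]   # k-th worst
    upper = ordered[-k]      # k-th best
    coverage = (len(ordered) - 2 * (k - 1)) / len(ordered)
    return lower, upper, coverage

# Hypothetical fractions of correctly reconstructed surface atoms (30 realizations).
scores = [0.60 + 0.01 * i for i in range(30)]
lower, upper, coverage = prediction_interval(scores)
```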

Fig. 4: Visualization of the reconstructed 3D atomic models.

A lower bound (red background) and upper bound (green background) of an 80% prediction interval are represented for the reconstructions when using a a fixed number of atoms in the atomic column during the reconstruction, b the finite atom-counting precision, and c the finite atom-counting precision and the neighbor-mass relations. The reconstruction in the box in the lower right corner corresponds to the ground truth model. The coloring of the atoms corresponds to the nearest-neighbor coordination, from 1 in red to 12 in dark blue.

Experimental results

As a last part of this work, we apply the Bayesian genetic algorithm to 25 frames of an experimental time series of a catalyst Pt nanoparticle30. The experimental details and corrections for scan noise and tilt are described in the Methods section. The incident electron dose is 1.38 × 10⁴ e Å⁻² per frame, i.e., higher than the doses where we observe an improved reconstruction from Fig. 3. However, as already mentioned, during experiments, environmental distortions introduce additional uncertainties and this can be observed from the estimated width of the overlapping normal distributions of the Gaussian mixture model. This estimated width σGMM reaches values up to 0.0228, and can be compared to the values shown in Supplementary Fig. S1. This experimental value corresponds to the width for a dose lower than 10³ e Å⁻², when considering Poisson noise only. Furthermore, the probability matrices for the finite atom-counting results from the experiment and the simulation study can be compared. Supplementary Fig. S6 illustrates that the experimental probability matrix corresponds more to the probability matrix of 5 × 10² e Å⁻² for which the Bayesian genetic algorithm clearly benefits.

To reliably count the number of atoms from the time series of images, we used a hidden Markov model which explicitly describes the possibility of structural changes over time45,46. The atom-counting results from every single frame have been used as an input for our Bayesian genetic algorithm in which we utilize the finite atom-counting precision and neighbor-mass relations. The reconstructed models are schematically represented in Fig. 5. The ADF STEM images and corresponding reconstructed models for all frames are shown in Supplementary Figs. S7 and S8. This approach enables a reliable 3D quantification of the structural changes of the Pt nanoparticle under the electron beam. From the evaluation of the coordination numbers (Supplementary Fig. S9), we can conclude that although each image has unique noise and the structure is moving under the electron beam, the number of atoms with the same coordination number is consistent throughout time. Since these coordination numbers are very important for relating the structure to the catalytic properties, it is important to point out here that the small changes clearly do not change the overall catalysis-relevant information that we can extract.

Fig. 5: Analysis of an experimental time series of a catalyst Pt nanoparticle.

a ADF STEM time series. b Corresponding reconstructed 3D atomic models for the time sequence viewed along the beam direction. c Rotated models to show the dominant surface facets. The coloring of the atoms corresponds to the nearest-neighbor coordination, from 1 in red to 12 in dark blue. The scale bars in a are 10 Å.

To summarize, we introduced a powerful alternative to the initially developed atom-counting/energy minimization method for the 3D reconstruction of nanoparticles from a single viewing projection. This Bayesian genetic algorithm takes advantage of the finite atom-counting precision and neighbor-mass relations during the reconstruction. This results in more reliable reconstructions of the 3D atomic structure, especially at lower incident electron doses below 10⁴ e Å⁻². The increased quality of the 3D atomic models has been validated by a quantitative evaluation of the reconstructed surface atoms. This result shows great promise to use these reconstructions to predict the adsorption properties of catalytic nanoparticles.

Methods

STEM simulations

Image simulations were performed using the MULTEM package47,48. An acceleration voltage of 200 kV, a semi-convergence angle of 22.48 mrad, and a pixel size of 0.124 Å were chosen, and averaging over 30 unique phonon configurations was performed. A source size having a Gaussian profile with an FWHM of 1.0 Å was added to further reflect experiments recorded under the same conditions. The full set of simulation settings is listed in Supplementary Table S1.

Atom-counting methodology

Using the StatSTEM software34, a parametric imaging model is fitted to the simulated ADF STEM images. This model consists of a superposition of N Gaussian functions and describes the intensity at the pixel (k, l) at the position (xk, yl) of the ADF STEM image:

$${f}_{kl}({{{\boldsymbol{\theta }}}})=\zeta +\mathop{\sum }\limits_{n=1}^{N}{\eta }_{n}\exp \left(-\frac{{\left({x}_{k}-{\beta }_{{x}_{n}}\right)}^{2}+{\left({y}_{l}-{\beta }_{{y}_{n}}\right)}^{2}}{2{\rho }^{2}}\right)$$
(4)

with ζ a constant background accounting for the amorphous carbon support, ρ the width of the Gaussian peak, ηn the column intensity of the nth Gaussian peak, and \({\beta }_{{x}_{n}}\) and \({\beta }_{{y}_{n}}\) the x- and y-coordinate of the nth atomic column, respectively. The unknown parameters of the imaging model are given by the parameter vector:

$${{{\boldsymbol{\theta }}}}=\left({\beta }_{{x}_{1}},\ldots ,{\beta }_{{x}_{N}},{\beta }_{{y}_{1}},\ldots ,{\beta }_{{y}_{N}},\rho ,{\eta }_{1},\ldots ,{\eta }_{N},\zeta \right)$$
(5)

and are estimated in the least-squares sense. From the obtained estimated parameters \(\hat{{{{\boldsymbol{\theta }}}}}\), the estimated SCSs can be calculated from the volumes under the Gaussian peaks above the background33,34:

$${{{{\rm{SCS}}}}}_{n}=2\pi {\hat{\eta }}_{n}{\hat{\rho }}^{2}.$$
(6)
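Equation (6) is the analytical volume under a single two-dimensional Gaussian peak of Eq. (4) above the background. As a quick numerical check, the sketch below sums a sampled peak over a pixel grid and compares the result with 2πηρ². The peak height and width are hypothetical values; only the pixel size of 0.124 Å is taken from the simulation settings.

```python
import math

def gaussian_peak(x, y, eta, rho, beta_x, beta_y):
    """One term of the sum in Eq. (4), without the background."""
    r2 = (x - beta_x) ** 2 + (y - beta_y) ** 2
    return eta * math.exp(-r2 / (2.0 * rho ** 2))

def scs_analytic(eta, rho):
    """Eq. (6): volume under the Gaussian peak."""
    return 2.0 * math.pi * eta * rho ** 2

def scs_numeric(eta, rho, pixel=0.124, half_width=4.0):
    """Sum of pixel intensities times pixel area around a peak at the origin."""
    n = int(round(half_width / pixel))
    total = 0.0
    for k in range(-n, n + 1):
        for l in range(-n, n + 1):
            total += gaussian_peak(k * pixel, l * pixel, eta, rho, 0.0, 0.0)
    return total * pixel ** 2

# Hypothetical peak: height 1.0 (arbitrary units) and width 0.5 angstrom.
eta, rho = 1.0, 0.5
relative_error = abs(scs_numeric(eta, rho) - scs_analytic(eta, rho)) / scs_analytic(eta, rho)
```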

The estimated SCSs are visualized in a histogram in Fig. 2g, h and are regarded as independent statistical draws from a so-called Gaussian mixture model. In a sense, the assumption of independent statistical draws implies that cross-talk between neighboring atomic columns is not significantly contributing49,50,51,52. The model is defined as a superposition of Gaussian components and describes the probability that a specific SCS value is observed. The probability density function of a mixture model with G components can parametrically be written as:

$${f}_{{{{\rm{mix}}}}}\left({{{{\rm{SCS}}}}}_{n};{{{{\mathbf{\Psi }}}}}_{G}\right)=\mathop{\sum }\limits_{g=1}^{G}{\pi }_{g}\frac{1}{\sqrt{2\pi }{\sigma }_{{{{\rm{GMM}}}}}}\exp \left(\frac{-{\left({{{{\rm{SCS}}}}}_{n}-{\mu }_{g}\right)}^{2}}{2{\sigma }_{{{{\rm{GMM}}}}}^{2}}\right).$$
(7)

The locations μg of the normal distributions are matched to the expected SCS values for a column containing g atoms obtained from image simulations performed for a Pt crystal in [110] orientation up to 30 atoms thickness. The settings for the frozen lattice multislice simulation are the same as for the simulation of the Pt nanoparticle embedded in the carbon support (listed in Supplementary Table S1). The symbol ΨG in Eq. (7) represents the vector containing all unknown parameters in the mixture model with G components:

$${{{{\mathbf{\Psi }}}}}_{G}=\left({\pi }_{1},\ldots ,{\pi }_{G-1},{\sigma }_{{{{\rm{GMM}}}}}\right).$$
(8)

The parameters πg and σGMM denote the mixing proportion of the gth component and the width of the components, respectively. The mixing proportions πg sum up to unity; therefore, the Gth mixing proportion is omitted in the parameter vector ΨG. The parameters ΨG are estimated from the measured SCSs using the maximum likelihood estimator for a given number of components G. Here, G equals the maximum thickness of the image simulations of the Pt crystal, i.e., 30 atoms, since mixing proportions of components that exceed the maximum thickness of the Pt nanoparticle are estimated to be zero.
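One standard way to obtain maximum likelihood estimates for a mixture such as Eq. (7), with fixed locations μg and one shared width, is the expectation-maximization (EM) iteration. The sketch below is an illustration under that assumption and is not necessarily the estimator implementation used in this work; the function name `fit_mixture`, the synthetic data, and all numerical values are hypothetical.

```python
import math
import random

def fit_mixture(scs, mus, sigma0, n_iter=100):
    """EM estimates of the mixing proportions and the shared width (fixed mus)."""
    n_comp = len(mus)
    pis = [1.0 / n_comp] * n_comp
    sigma = sigma0
    for _ in range(n_iter):
        # E-step: responsibility of component g for each SCS value.
        # The factor 1/(sqrt(2*pi)*sigma) is common to all g and cancels.
        resp = []
        for x in scs:
            w = [pis[g] * math.exp(-0.5 * ((x - mus[g]) / sigma) ** 2)
                 for g in range(n_comp)]
            total = sum(w)
            resp.append([wi / total for wi in w])
        # M-step: update mixing proportions and the common width.
        pis = [sum(r[g] for r in resp) / len(scs) for g in range(n_comp)]
        weighted_sq = sum(r[g] * (x - mus[g]) ** 2
                          for x, r in zip(scs, resp) for g in range(n_comp))
        sigma = math.sqrt(weighted_sq / len(scs))
    return pis, sigma

# Synthetic SCS values: 60, 150, and 90 draws around three hypothetical locations.
rng = random.Random(1)
mus = [1.0, 2.0, 3.0]
counts = [60, 150, 90]
scs = [rng.gauss(mu, 0.1) for mu, c in zip(mus, counts) for _ in range(c)]
pis, sigma = fit_mixture(scs, mus, sigma0=0.3)
```

For these well-separated synthetic components, the estimated mixing proportions stay close to the generating fractions 0.2, 0.5, and 0.3.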

In principle, the width σGMM reflects the finite atom-counting precision. However, it is known that the width of the Gaussian distributions might be underestimated for lower electron doses37. In order to counterbalance this underestimation, an effective width σeff will be used to describe the width of the normal components in the Gaussian mixture model. For this purpose, σGMM is evaluated with respect to the expected dose-dependent width σdose as explained in more detail in the Supplementary methods.

Genetic algorithm

The parameters of the genetic algorithm are specified for the reconstruction of small nanoparticles and balanced with the available RAM/CPU resources (desktop computer) (Table 1). A larger population size will capture a more diverse spread of solutions and result in a longer computation time. For small particles, a smaller population size can be used. Here, a population size of 500 particle configurations is used which is significantly larger than the number of atomic columns in the particle (≈100 atomic columns). A fraction of this population, i.e., the recombination fraction or cross-breeding fraction will survive from one generation to the next and will be used as “parents” for recombining or cross-breeding new solutions. A too small fraction would reduce the diversity in the solution. A too large fraction, on the other hand, reduces the amount of new children in each generation. Therefore, a 50% fraction is chosen which means that every solution with a quality above average will be preserved and selected for breeding the next generation. If only existing parents are used in the recombination step, there is a risk of ending up in a local minimum. For this reason, after each breeding operation, a small percentage of mutants for 2% of the atomic columns can be introduced to inject some randomness into the atomic models. If these traits are an improvement, they will persist, otherwise they will automatically disappear. A very large degree of mutation is undesirable. In such a case, the benefit from the history of the evolution might be lost. The degree of mutation is expressed in terms of the height mutation range and count mutation range. If these values are too large, very non-physical solutions are proposed. Here we consider a mutation range of ±1 atom in mass for a given atomic column and ±1 position height shift or lattice step. This choice is also justified by both the finite atom-counting precision and the expected finite surface roughness. 
It is important to note here that if a mutation is beneficial, then a cumulative evolution is possible. In this manner, a solution of ±2 or more atoms and/or ±2 or more height changes can be obtained over different generations. Next to the iterations throughout the different generations, a second loop feeds multiple unique starting initializations to the algorithm to reduce the risk of ending with local-minima solutions. For the simulation study, for each reconstruction, 25 unique random starting populations are used in the algorithm. It should be noted that a structure with a better cost-function might be found when increasing this number. For the experimental time series, we used 100 unique population initializations throughout the reconstruction procedure.
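The selection, recombination, and mutation steps described above can be sketched as follows. This is a simplified illustration and not the implementation used in this work: an individual is represented as a list of (atom count, height offset) pairs per column, mutation is applied to each column with a 2% probability rather than to exactly 2% of the columns, and `toy_cost` is a hypothetical stand-in for the cost-function of Eq. (3). All function names are assumptions.

```python
import random

def crossover(parent_a, parent_b, rng):
    """Build a child by picking each column at random from one of two parents."""
    return [rng.choice(pair) for pair in zip(parent_a, parent_b)]

def mutate(individual, rng, density=0.02, count_range=1, height_range=1):
    """Randomly change count and height offset of a small share of the columns."""
    mutated = []
    for count, offset in individual:
        if rng.random() < density:
            count = max(0, count + rng.randint(-count_range, count_range))
            offset = offset + rng.randint(-height_range, height_range)
        mutated.append((count, offset))
    return mutated

def evolve(start, cost, rng, pop_size=500, fraction=0.5, generations=50):
    """Run one initialization; returns the best individual and the cost history."""
    # Initial population: every column of the starting model is perturbed.
    population = [mutate(start, rng, density=1.0) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=cost)
        parents = population[: int(fraction * pop_size)]  # survivors
        history.append(cost(parents[0]))
        children = []
        while len(parents) + len(children) < pop_size:
            parent_a, parent_b = rng.sample(parents, 2)
            children.append(mutate(crossover(parent_a, parent_b, rng), rng))
        population = parents + children
    return min(population, key=cost), history

# Hypothetical target and stand-in cost: squared deviations in count and offset.
target = [3, 4, 5, 6, 5, 4, 3, 4, 5, 6, 5, 4, 3, 4, 5, 6, 5, 4, 3, 2]

def toy_cost(individual):
    return sum((c - t) ** 2 + o ** 2 for (c, o), t in zip(individual, target))

rng = random.Random(0)
start = [(t + 1, 0) for t in target]  # starting model with every count off by one
best, history = evolve(start, toy_cost, rng, pop_size=60, generations=40)
```

Because the selected parents survive unchanged into the next generation, the best cost in `history` can never increase. The second loop over unique starting populations would simply call `evolve` repeatedly with different random seeds and keep the overall best result.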

Table 1 Genetic algorithm parameters.

Experimental Pt series

The ADF STEM images were recorded on a JEOL ARM200CF fitted with a probe-aberration corrector using an acceleration voltage of 200 kV, a probe convergence angle of 22.48 mrad, an annular detector ranging from 52 to 248 mrad, a dwell time of 4 μs and an incident electron dose of 1.38 × 10⁴ e Å⁻² per frame. The images of the time series were corrected for drift and other distortions using non-rigid registration53. During the time series, the Pt nanoparticle tilts slightly away from zone axis orientation and back, which affects the SCSs54. Therefore, the SCSs of the individual frames need to be compensated for tilt as an input for the hidden Markov model45. This is done by applying a linear scaling (up to 10%) to the SCSs of the individual frames37,55, assuming that the total number of atoms in the nanoparticle remains constant throughout the time series. This assumption is valid since the threshold energy for sputtering Pt atoms from a convex surface with step sites is 379 keV56, well above the incident electron energy of 200 keV. We therefore do not expect sputtering of atoms from the surface, only surface diffusion41. This is also further confirmed directly from the experimental time series since no single Pt atoms are observed on the carbon around the Pt particle.