

Current Opinion in Solid State and Materials Science journal homepage: www.elsevier.com/locate/cossms

Informatics and data science in materials microscopy Paul M. Voyles Department of Materials Science and Engineering, University of Wisconsin, Madison, 1509 University Ave, Madison, WI, United States

Article info

Article history: Received 23 July 2016 Accepted 2 October 2016 Available online xxxx

Abstract

The breadth, complexity, and volume of data generated by materials characterization using various forms of microscopy have expanded significantly. Combined with increases in computing power, this has led to increased application of techniques from informatics and data science to materials microscopy data, both to improve the data quality and to improve the materials information extracted from the data. This review covers recent advances in data science applied to materials microscopy, including problems such as denoising, drift and distortion correction, spectral unmixing, and the use of simulated experiments to derive information about materials from microscopy data. Techniques covered include non-local patch-based methods, component analysis, clustering, optimization, and compressed sensing. Examples illustrate the need to combine several informatics approaches to solve problems and showcase recent advances in materials microscopy made possible by informatics. © 2016 Elsevier Ltd. All rights reserved.

1. Introduction

Recent increases in data volume and complexity, ubiquitous access to high-power computing, and the development of powerful mathematical methods and computer algorithms have created an explosion of interest in the application of informatics and data science tools in the sub-field of materials characterization. Large data sets come from fast pixelated detectors [1–3] for high-speed imaging of dynamic processes [4,5]; acquisition of large, densely sampled diffraction data sets and spectrum images [6–14]; and sophisticated experiments which measure an entire, multidimensional stimulus-response data set from many positions on a sample [15–19]. Mathematics and algorithms developed in diverse communities including machine learning, medical imaging, and computer vision are being adapted to materials characterization data.

One possible conceptual framework for the role of informatics in materials characterization is to divide the process of characterization into three steps: acquisition of data, development of materials information from the data, then the creation of scientific knowledge from the data. As a concrete example of the data → information → knowledge chain, consider mapping different phases present in a composite. First, one acquires a composition-sensitive spectrum image data set consisting of an energy dispersive spectrum (EDS) of characteristic x-rays at a grid of positions on the sample. Second, one analyzes the data to create information about the sample such as the spatial distribution of elements or composition of phases. Third and last, one connects the measured phase distribution to the processing history of the sample or its mechanical properties to create generalizable knowledge and advance the state of the art in the field.

Informatics tools are applied primarily in the first two steps of the chain. Various tools can improve the quality of the data either during acquisition, for example by drift-correction and registration, or after acquisition, for example by denoising. We could envision on-the-fly adaptive experiments that, for example, estimate signal-to-noise ratio (SNR) from data as it is being acquired, then terminate the experiment automatically when sufficient high-quality data has been obtained. That idea and significantly more sophisticated integration of data acquisition and informatics fall under the heading of computational imaging, an active field of research (see e.g. the SPIE conference series Computational Imaging) that has not yet had a large impact on materials science. Informatics tools applied at the information stage could include peak fitting, instrument response deconvolution, or machine learning approaches like component analysis or clustering to identify important features in complex data sets. The last step in the chain, the creation of knowledge, so far remains the purview of human scientists, although sophisticated expert systems eventually may encroach on this territory as well.

This review will focus on the application of informatics techniques and data science to materials microscopy, with a bias towards electron microscopy but a few examples drawn from scanning probe microscopy as well. Coverage includes electron microscopy and related spectroscopies (EDS and electron energy loss spectroscopy, EELS) and electron tomography; and scanned probe microscopies, related spectroscopies, and complex, multidimensional stimulus-response experiments such as piezoforce microscopy. The large, parallel, and complementary efforts in

http://dx.doi.org/10.1016/j.cossms.2016.10.001 1359-0286/© 2016 Elsevier Ltd. All rights reserved.

Please cite this article in press as: P.M. Voyles, Informatics and data science in materials microscopy, Curr. Opin. Solid State Mater. Sci. (2016), http://dx.doi. org/10.1016/j.cossms.2016.10.001


P.M. Voyles / Current Opinion in Solid State and Materials Science xxx (2016) xxx–xxx

X-ray scattering and neutron scattering are not covered here, nor does this review cover applications in biology. Finally, this review will focus on qualitative descriptions of classes of algorithms and on example applications to materials science and engineering problems. For mathematical details, the reader is largely directed to the references.

This review is organized into two main sections. The first deals with approaches to improve the quality of materials microscopy data, including drift and instability correction, especially for scanned image acquisition, and with denoising. The second section deals with approaches to improve materials information via improved processing of data to extract known signals and the use of informatics to discover new signals in large, complex data sets. Applications discussed include spectral unmixing, tomographic reconstruction, and the use of forward simulations of experiments to obtain materials information when the inverse problem of extracting information from the data directly is intractable. Each section includes an example application which emphasizes the need to combine a selection of techniques, rather than relying on just one, and concludes with a few comments on possible opportunities for future development. There follows a short section on software implementation and data management, which pose significant obstacles for informatics in materials characterization, and a summary.

2. Improved microscopy data

Improving data involves deriving an approximation of an ideal experiment from actual, imperfect experimental results. This section discusses various approaches to two types of problems: reducing noise in experimental data, with a particular focus on the Poisson noise that often dominates particle counting experiments like electron microscopy, and drift and distortion correction, especially of images acquired by scanning a probe across the sample.

2.1. Denoising

Experimental measurements of all types are corrupted by noise. If the experimental data are denoted $y_i$ for $i = 1 \ldots N$ and the "true" noiseless data are denoted $x_i$, the mathematical problem is to obtain from the $y_i$ an estimate $x_i^*$ that is as close to the $x_i$ as possible. Examples of possible types of noise in materials microscopy data include detector or electronic readout noise, which sometimes can be approximated as an additive random variable drawn from a Gaussian distribution $G$ with fixed mean $m$ and variance $\sigma$, and noise arising from counting a limited number of electrons or photons, which is drawn from a Poisson distribution $P$ with mean and variance both given by $x$, the true data value. Thus, a reasonably general representation of $y$ is given by

$$ y = P(x) + G(m, \sigma), \tag{1} $$

where $P$ and $G$ are both random variables. Data with high counts or that involve sensing a continuous quantity like a cantilever deflection may be dominated by detector noise. Detector noise can be non-Gaussian, such as Josephson noise or Fano noise. Data with low counts (small $x$) is often dominated by Poisson noise, even if some detector noise is present.

Denoising algorithms in the literature often assume that the data is dominated by additive Gaussian white noise (AGWN), $G$ in Eq. (1). AGWN is a good assumption for strong signals in, for example, photography and medical imaging, and has the advantage that Gaussian random variables have well-developed mathematical properties. It is not, however, a good model for many particle-based materials microscopy experiments, including electron imaging, EELS, and EDS, which tend to be dominated by Poisson noise. In particular, because the variance of the Poisson distribution is equal to the mean, the level of the noise is controlled by the strength of the signal, not uncorrelated to it. As a result, simple application of codes developed for AGWN to microscopy data may not yield good results.

There are two approaches to adapting methods developed for AGWN to Poisson noise. One is the use of the Anscombe transform [20] to modify Poisson-noise-dominated data into a form with constant variance, independent of $x$. The Anscombe transform replaces $y$ with $A(y) = 2\sqrt{y + 3/8}$. The data is then processed using an AGWN algorithm, then untransformed. Reversing the transform is straightforward for $y$ without noise, but estimating the reverse transform in the presence of noise in $y$ is more challenging [21] and becomes unreliable for very low intensity in the image (counts 3 or lower) [22].

The other approach is to modify the algorithm to take into account the mathematical properties of Poisson noise directly. For example, a common problem in data science is to estimate the similarity $S$ between two signals, $y_{1,i}$ and $y_{2,i}$. The most common approach is to compute the square difference or the L2 norm of the difference between the two data vectors,

$$ S = \frac{1}{N} \sum_{i=1}^{N} (y_{1,i} - y_{2,i})^2. \tag{2} $$

This measure of $S$ is not stable in the presence of large Poisson noise, since it emphasizes extreme values in the $y$'s. A better measure of $S$ is the logarithm of the Poisson maximum likelihood (pml) ratio [23],

$$ \mathrm{pml}(y_{1,i}, y_{2,i}) = \frac{\max_{\lambda \in (0,\infty)} \left[ P(y_{1,i}, \lambda)\, P(y_{2,i}, \lambda) \right]}{\max_{\lambda \in (0,\infty)} P(y_{1,i}, \lambda) \; \max_{\lambda \in (0,\infty)} P(y_{2,i}, \lambda)}, \tag{3} $$

where $\lambda$ is the variance of the Poisson distribution, and

$$ S = N^{-1} \sum_{i=1}^{N} \log\left[\mathrm{pml}(y_{1,i}, y_{2,i})\right] = -N^{-1} \sum_{i=1}^{N} \left[ y_{1,i} \log y_{1,i} + y_{2,i} \log y_{2,i} - (y_{1,i} + y_{2,i}) \log \frac{y_{1,i} + y_{2,i}}{2} \right]. \tag{4} $$
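The noise model of Eq. (1), the Anscombe transform, and the two similarity measures can be tried out numerically. The following is only a minimal sketch under stated assumptions: the function names are hypothetical, `sigma` is treated as a standard deviation, and $0 \log 0$ is taken as 0. With these definitions the mean log likelihood ratio is 0 for identical signals and negative otherwise.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_measurement(x, m=0.0, sigma=0.0):
    """Eq. (1): Poisson counting noise on x plus Gaussian detector noise.
    Assumption: sigma is passed as a standard deviation."""
    x = np.asarray(x, dtype=float)
    return rng.poisson(x) + rng.normal(m, sigma, size=x.shape)

def anscombe(y):
    """Anscombe transform A(y) = 2*sqrt(y + 3/8)."""
    return 2.0 * np.sqrt(np.asarray(y, dtype=float) + 3.0 / 8.0)

def l2_similarity(y1, y2):
    """Eq. (2): mean squared difference (smaller means more similar)."""
    y1, y2 = np.asarray(y1, dtype=float), np.asarray(y2, dtype=float)
    return np.mean((y1 - y2) ** 2)

def _xlogx(y):
    """y*log(y), using the convention 0*log(0) = 0."""
    y = np.asarray(y, dtype=float)
    safe = np.where(y > 0, y, 1.0)  # avoids log(0); log(1) = 0
    return np.where(y > 0, y * np.log(safe), 0.0)

def log_pml_similarity(y1, y2):
    """Mean log of the Poisson likelihood ratio of Eq. (3).
    The joint maximum is at lambda = (y1 + y2)/2, the single maxima at y1 and y2."""
    y1, y2 = np.asarray(y1, dtype=float), np.asarray(y2, dtype=float)
    s = y1 + y2
    return np.mean(2.0 * _xlogx(s / 2.0) - _xlogx(y1) - _xlogx(y2))

# Poisson-limited data: the raw variance tracks the mean,
# while the Anscombe-transformed variance is roughly constant (near 1).
counts = simulate_measurement(np.full(100_000, 20.0))
print("raw variance:", counts.var())
print("Anscombe variance:", anscombe(counts).var())
```

The closed form used in `log_pml_similarity` follows from maximizing the Poisson log-likelihood $y \log\lambda - \lambda - \log y!$ analytically; the factorial terms cancel in the ratio.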

These kinds of changes are more robust against high levels of noise but require modifications of both the algorithms and the implementation. In general, we must make some assumptions about the nature of $x$ or of the noise corrupting it in order to obtain $x^*$: there is no free lunch.

Perhaps the most common method for denoising is spatial smoothing, in which we combine the values of neighboring $y_i$, such as adjacent energy channels in a spectrum or neighboring pixels in an image, either by simple averaging, by convolution with a smoothing function like a Gaussian, or in the Fourier domain via Fourier, Wiener, or wavelet filtering. Smoothing to reduce noise assumes that the noise in neighboring points is uncorrelated, so that averaging will reduce noise, and that the underlying true data $x$ are smooth, so that variability from point to point is mostly noise, not signal. The smoothness requirement is often met by spatially oversampling in the data acquisition, so that the spacing between points is much smaller than the size of the instrument response function or the feature size in the object. Smoothing has the drawback of an unavoidable tradeoff between improvement in SNR and decrease in resolution.

2.1.1. Non-local, patch-based methods

Spatial or spectral adjacency is not the only way to choose data points to combine. Instead, we can combine groups of data points with similar values and patterns, called patches, as illustrated in


Fig. 1. Schematic of the process for patch-based, non-local denoising algorithms. The red patch is the part of the image being denoised, and the blue patches are other parts of the image containing similar intensity patterns. Taken from Ref. [24].

Fig. 1. The basic procedure for patch-based denoising is to identify the collection of patches similar, often in the sense of Eqs. (2) or (3), to the part of the data set being denoised, then combine them. The patches do not have to be contiguous or even near one another in the image, so these approaches are called "non-local", and the first such approach was called non-local means (NLM) [25]. The assumption underlying non-local, patch-based denoising is that the data contains a significant amount of redundancy, in the form of patches that are similar to one another. Atomically resolved images often contain such redundancy, since in many cases the images of all the atoms are similar and we are primarily concerned with their positions, but so do many other types of images. However, non-local approaches will provide the least denoising to unique features, so, for example, in a chemical map containing an impurity at only one position, the impurity signal will see the least improvement from non-local denoising.

Different non-local denoising algorithms are distinguished by the details of which patches are considered, how the similarity of patches is evaluated, and how similar patches are combined. Non-local means uses a weighted average of patches, with weights assigned by how similar the patches are [25]. Often, implementations of non-local means use only patches that are near the target patch, because the set of all possible overlapping patches in an image can be quite large, leading to large memory requirements and long processing times. Mevenkamp et al. have developed an NLM approach specifically for atomically-resolved images of crystals that exploits the periodic image structure to efficiently identify similar patches and compared it to a wavelet-based denoising approach modified for Poisson noise (PURE-LET) [26]. The current broadly-accepted state of the art in non-local denoising is the block-matching and 3D filtering (BM3D) algorithm [27].
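To make the weighted-average idea concrete, here is a minimal sketch of a basic non-local means filter. It is not the implementation of Refs. [25–27]: it assumes the squared-difference patch distance of Eq. (2), exponential weights with a hypothetical bandwidth `h`, a limited search window, and periodic (wrap-around) image boundaries via `np.roll`.

```python
import numpy as np

def box_sum(a, r):
    """Sum of a over a (2r+1) x (2r+1) window, with periodic wrap-around."""
    out = np.zeros_like(a)
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            out += np.roll(a, (dy, dx), axis=(0, 1))
    return out

def nlm_denoise(img, patch_radius=2, search_radius=5, h=1.0):
    """Basic non-local means.

    For every offset in the search window, the whole image is compared to a
    shifted copy of itself; the patch distance is the mean squared difference
    over the patch, and each shifted pixel is averaged in with weight
    exp(-distance / h**2).
    """
    img = np.asarray(img, dtype=float)
    n_patch = (2 * patch_radius + 1) ** 2
    weighted_sum = np.zeros_like(img)
    weight_total = np.zeros_like(img)
    for dy in range(-search_radius, search_radius + 1):
        for dx in range(-search_radius, search_radius + 1):
            shifted = np.roll(img, (dy, dx), axis=(0, 1))
            dist = box_sum((img - shifted) ** 2, patch_radius) / n_patch
            w = np.exp(-dist / h ** 2)
            weighted_sum += w * shifted
            weight_total += w
    return weighted_sum / weight_total

# Usage on a synthetic periodic "lattice" with Poisson noise (illustrative values).
rng = np.random.default_rng(1)
i, j = np.meshgrid(np.arange(32), np.arange(32), indexing="ij")
clean = 10 + 5 * np.cos(2 * np.pi * i / 8) * np.cos(2 * np.pi * j / 8)
noisy = rng.poisson(clean).astype(float)
denoised = nlm_denoise(noisy, patch_radius=2, search_radius=8, h=6.0)
print("MSE noisy:", np.mean((noisy - clean) ** 2))
print("MSE denoised:", np.mean((denoised - clean) ** 2))
```

Restricting `search_radius` is the memory and time compromise described above; a search matched to the lattice period would instead concentrate the effort on offsets where patches truly repeat.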
BM3D takes the stack of similar patches and denoises it in a two-step procedure, first by 3D wavelet denoising, then Wiener filtering [27]. Periodic block matching to exploit the spatial structure of atomically-resolved crystal images has also been applied to BM3D [28]. Fig. 2 shows the results of BM3D denoising applied to a high-resolution Z-contrast STEM image [28]. Fig. 2(a) is a simulated image of an edge dislocation in Si, including simulated scan distortion and drift, but not including noise. Simulated images were used

because access to the noiseless "ground truth" aids in quantitative assessment of the denoising performance using metrics such as the peak signal to noise ratio (PSNR). Fig. 2(b) and (c) are the image in (a) with simulated Poisson noise added. Fig. 2(b) has a peak intensity of 15 electrons in the most intense pixel, and Fig. 2(c) has a peak intensity of 6 electrons. Fig. 2(d) is a denoised version of (b) and (e) is a denoised version of (c). The BM3D algorithm used in both cases used the Poisson maximum likelihood (Eq. (3)) to measure the similarity of patches. For Fig. 2(d), patches were selected from the local environment around each patch. For Fig. 2(e), patches were selected uniformly from throughout the image. As in many atomic-resolution microscopy images, every atomic column in the image has the same pattern of intensity, so the images in Fig. 2 contain a high degree of redundancy and should be good targets for non-local denoising.

The results in Fig. 2 bear out this expectation. Fig. 2(d) is visually quite similar to the noiseless Fig. 2(a). Even starting from an image in which the atomic lattice is barely visible and the dislocation is quite hard to see (Fig. 2(c)), the denoised version in Fig. 2(e) shows clearly both the lattice and dislocation. Fig. 2(e) exhibits some blurring, however, which causes the atomic columns on either side of the Si dumbbells to be no longer resolved. These are the best results obtained for these images, selected from extensive testing varying the patch size and the patch search approach. Ref. [28] contains more comprehensive results and quantitative performance metrics for denoising. Fig. 2 shows that even extremely noisy images can be rendered useful for deriving microstructure using state-of-the-art denoising approaches.

2.1.2. Low-dimensional representation

Another approach to denoising rests on the concept of a low-dimensional representation of the data. The idea is illustrated very schematically in Fig. 3.
Each data set x is represented as a point in N-dimensional space. (For example, a spectrum with 2048 energy channels could be represented as a point in a 2048-dimensional space.) However, a particular collection of data often occupies only a subspace in that N-dimensional space. In Fig. 3, N = 3, but all of the data points sit on a plane, a subspace of dimension 2. If we can find basis vectors for the subspace, like the vectors a and b in Fig. 3, we can represent the data x as a linear combination of those basis vectors.
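As a small worked illustration of writing a point as a combination of basis vectors, the sketch below assumes two hypothetical basis vectors a and b in three dimensions and recovers the coefficients by least squares. The same operation also yields the orthogonal projection of a point that lies slightly off the plane.

```python
import numpy as np

def subspace_coefficients(points, basis):
    """Least-squares coefficients of each point in the given basis.

    points: (n_points, N) array; basis: (k, N) array of basis vectors, k < N.
    Returns an (n_points, k) array of coefficients.
    """
    coeffs, *_ = np.linalg.lstsq(basis.T, points.T, rcond=None)
    return coeffs.T

# Hypothetical basis vectors spanning a plane in 3D (N = 3, subspace dimension 2).
a = np.array([1.0, 0.0, 1.0])
b = np.array([0.0, 1.0, 1.0])
basis = np.vstack([a, b])

# Points built as i*a + j*b lie exactly in the plane.
ij = np.array([[2.0, -1.0], [0.5, 3.0]])
points = ij @ basis
print(subspace_coefficients(points, basis))  # recovers the (i, j) pairs

# A point pushed off the plane is projected back onto it.
off_plane = points[0] + np.array([0.3, -0.2, 0.1])
projected = subspace_coefficients(off_plane[None, :], basis) @ basis
print(projected[0])
```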


Fig. 2. Denoising of simulated Z-contrast images using variations of the BM3D method. (a) a noiseless simulated STEM image including modest scan distortion. (b) Poisson noise added to (a) such that the peak intensity in any pixel is 15 electrons. (c) Poisson noise added to (a) with peak intensity of 6 electrons. (d) Image (b) denoised using BM3D with Poisson maximum likelihood and local block matching. (e) image (c) denoised using Poisson BM3D and uniform block matching. Adapted from Ref. [28].

Fig. 3. Illustration of low-dimensional representation. Each point represents a measurement such as a spectrum. All of the points in the three-dimensional space actually sit on a two-dimensional plane, so instead of representing a point as ux + vy + wz, the point can be represented by ia + jb for suitable vectors a and b.

Low-dimensional representation is a very general concept, and it will appear again in Section 3 below. It is useful for denoising because the noisy data $y$ are often near, but not perfectly within, the low-dimensional subspace due to noise. If we then project $y$ into the subspace, throwing away the parts of $y$ that are outside the subspace, we get a denoised estimate $x^*$. There are various approaches to identifying the dimensionality of the subspace and finding the right set of basis vectors. The earliest and most widely adopted in materials microscopy is principal component analysis (PCA), which can be adapted for Poisson noise in the data either by weighted PCA (WPCA) [29] or by Poisson PCA (PPCA) [22]. Weighted PCA uses a transform of the data similar in purpose to the Anscombe transform described above, and Poisson PCA uses a modification of the similarity measure similar to Poisson maximum likelihood (Eq. (4)). WPCA for electron microscopy data is implemented in a variety of packages, including commercial software and open source software.

Determining the correct number of components for component-based denoising is a challenge. The fewer the components, the smoother the signal, but use too few components and important features of the signal will be excluded. Artifacts from too few components are particularly prevalent when the signal of interest is rare in the data set, such as an interface or defect state in an EEL spectrum image. The most common approach for PCA is a scree plot, which shows the variance of the component signals from largest to smallest. The number of components is selected where the variance falls below some threshold or where there is a discontinuity in the slope of the plot, sometimes on a linear scale, sometimes on a log scale. Automated approaches exist [30,31], but are not necessarily better in all cases than testing by the user. Lichtert and Verbeeck [32] and Cueva et al. [33] have additional discussion of the risks and pitfalls of PCA denoising of spectrum image data.
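The projection step and the scree values can be sketched with a truncated singular value decomposition. This is a minimal illustration using plain, unweighted PCA on a synthetic data set, so it deliberately omits the Poisson adaptations of WPCA and PPCA; the spectra, peak positions, and function names are all hypothetical.

```python
import numpy as np

def pca_denoise(data, n_components):
    """Project mean-centered data onto its leading principal components.

    data: (n_spectra, n_channels) array.
    Returns the rank-limited reconstruction and the scree values
    (variance carried by each component, largest first).
    """
    data = np.asarray(data, dtype=float)
    mean = data.mean(axis=0)
    u, s, vt = np.linalg.svd(data - mean, full_matrices=False)
    scree = s ** 2 / (data.shape[0] - 1)
    k = n_components
    denoised = (u[:, :k] * s[:k]) @ vt[:k] + mean
    return denoised, scree

def synthetic_spectrum_image(n_spectra=400, n_channels=128, seed=0):
    """Two Gaussian peaks mixed with independent random weights, plus Poisson noise."""
    rng = np.random.default_rng(seed)
    c = np.arange(n_channels)
    peak1 = 100 * np.exp(-((c - 40) ** 2) / (2 * 5.0 ** 2))
    peak2 = 100 * np.exp(-((c - 90) ** 2) / (2 * 8.0 ** 2))
    weights = rng.uniform(0, 1, size=(n_spectra, 2))
    clean = weights @ np.vstack([peak1, peak2]) + 5.0  # constant background
    noisy = rng.poisson(clean).astype(float)
    return clean, noisy

clean, noisy = synthetic_spectrum_image()
denoised, scree = pca_denoise(noisy, n_components=2)
print("leading scree values:", scree[:5])
print("MSE noisy:", np.mean((noisy - clean) ** 2))
print("MSE denoised:", np.mean((denoised - clean) ** 2))
```

In this synthetic case the scree values drop sharply after the second component, which is the kind of discontinuity a user would look for; real spectrum images rarely show so clean a break.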

2.1.3. An example combining both approaches Non-local PCA (NLPCA) combines both non-local, patch-based denoising and low-dimensional representation [22]. It follows the non-local approach of identifying sets of similar patches, then uses Poisson PCA to denoise each collection of similar patches. As implemented, NLPCA considers the set of all possible patches in a two- or three-dimensional data set, and groups of similar patches are


identified using a method called k-means clustering, described in Section 3.

Fig. 4 shows an example of denoising very low-count, Poisson-noise-limited EDS spectrum image data [24]. The sample is Ca-stabilized Nd2/3TiO3, and the data set was collected on an aberration-corrected Titan ChemiSTEM. The top panel shows the crystal structure, with mixed Nd/Ca columns, Ti/O columns, and pure O columns, superimposed on a Z-contrast STEM image of the crystal along the [100] zone axis. The top row shows composition maps created by integrating the O Kα, Ca Kα, Ti Kα, and Nd Lα spectral bands. Additional details of the experiments can be found in Ref. [24]. The bottom four rows of the figure show the original data after denoising by four different algorithms, WPCA [34], NLPCA [22], NLM [26], and BM3D [28]. (References are to the software implementations used here, not the original algorithms.) The WPCA

Fig. 4. Denoising of atomically-resolved EDS composition maps from Ca-stabilized Nd2/3TiO3. (top) An atomic model of the crystal structure showing the Nd/Ca, Ti/O, and O sites, and a HAADF STEM image of a similar region of the crystal. The hole in the upper right is deliberate beam damage used as a fiducial mark for registration. (first row) The original spectrum image, integrated over the O Kα, Ca Kα, Ti Kα, and Nd Lα spectral bands to produce composition-sensitive intensity maps. (lower rows) The raw data denoised using WPCA, NLPCA, NLM, and BM3D. Figure adapted from Ref. [24].


and NLPCA were performed on the SI data cube directly. The implementations of NLM and BM3D used accept only two-dimensional images, so each spectral band image was denoised separately. The NLM and BM3D implementations used a Poisson maximum likelihood measure to accommodate high levels of Poisson noise, and were tuned to exploit the spatial periodicity of atomic-resolution image data by using a periodic search method to identify similar patches. NLPCA considered the entire set of overlapping patches, and WPCA is not patch-based.

In the raw data, the atomic lattice is discernible in the Nd and Ti images, but not in the Ca or O images. NLPCA shows the best qualitative performance, with strong contrast in the Nd, Ti, and Ca images. There is even some regular contrast in the O image, but it does not have the periodicity of the O sublattice. Whether this is an artifact of the denoising or a consequence of the electron and X-ray scattering processes that give rise to the map [35,36] is unclear. WPCA denoising produces similar spatial structure in the Nd image, but noise obscures the contrast in the Ti image and the Ca image. WPCA with fewer components to produce greater denoising leads to all the elemental images simply copying the spatial structure of the Nd image, a clear artifact. NLM produces limited denoising and clear artifacts in the Nd and Ca images. BM3D is better for Nd and Ti, but shows no lattice in the Ca at all. The inability of the implementations of NLM and BM3D used here to make use of the spatial information from high-contrast images like Nd when denoising the lower-intensity images like Ca may play a role in their poorer denoising performance.

The qualitative trends in denoising shown in Fig. 4 are borne out by quantitative tests of PSNR on phantom data (data synthesized on the computer so the noiseless $x$ is known and can be compared to the denoised $x^*$) [24]. NLPCA creates the highest PSNR for every band across a wide range of input PSNR levels. The disadvantage of NLPCA is substantially higher computing time and memory requirements, which make the current implementation unsuitable for real-time or near real-time use by the microscope operator. NLM and BM3D, especially with the periodic block matching scheme to find similar patches, are less computationally demanding.
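For readers unfamiliar with the metric, the PSNR comparison on phantom data can be sketched as follows. Conventions for the peak value differ between communities; this sketch assumes the peak is the maximum of the noiseless reference, which may not match the definition used in Ref. [24].

```python
import numpy as np

def psnr(x_true, x_est):
    """Peak signal-to-noise ratio in dB.

    Assumption: the peak is the maximum of the noiseless reference x_true.
    Identical inputs give zero mean squared error and hence an infinite PSNR.
    """
    x_true = np.asarray(x_true, dtype=float)
    x_est = np.asarray(x_est, dtype=float)
    mse = np.mean((x_true - x_est) ** 2)
    return 10.0 * np.log10(x_true.max() ** 2 / mse)

# Phantom test: the noiseless x is known, so a noisy or denoised
# estimate can be scored directly against it.
rng = np.random.default_rng(0)
phantom = np.full((64, 64), 10.0)
noisy = rng.poisson(phantom).astype(float)
print("PSNR of noisy phantom (dB):", psnr(phantom, noisy))
```

A denoising algorithm is then judged by how much it raises this number relative to the noisy input, band by band.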

2.2. Drift and distortion correction

Experimental microscopy data is also often corrupted by drift and distortions. Drift can be of the sample, the microscope, or both, and distortions can be consistent in every measurement, or they can vary randomly from measurement to measurement. Drift and distortion are especially prevalent and problematic for scanning microscopies which acquire spatial samples sequentially. Scanning is often slower than imaging microscopies which acquire spatial samples in parallel, allowing more time for drift and distortions, and the scanning acquisition will translate sample drift into distortion of the image. Registration has a huge number of applications outside materials characterization in computer vision, medical imaging, and other fields; a survey of many classes of image registration problems and approaches is given in Ref. [37].

Fig. 5 illustrates the general drift and distortion problem and two types of solution. The simplest form of the problem is: given a set of i = 1 . . . N images (or spectrum images) of an identical object with drift and distortion, recover the best possible estimate of the true image. Fig. 5(a) illustrates the case with drift only, corrected by a shift vector fi applied to all the pixels in each image i. This case could be realized by a series of TEM images of the same object. Since the detector is fixed, the sampling grid of pixels is the same in every image. fi can be estimated either from the image data itself, which will be called "inline registration" here, or from images of a separate reference area in between the frames of Fig. 5(a), which is assumed to be rigidly connected to the imaged area ("reference area registration"). Reference area registration is useful if the time to acquire a single frame in the series is so long that drift inside a single frame is large, as can be the case, for example, for slow EDS spectrum images.
In that case, reference area registration can be used periodically throughout the acquisition of a single image. Inline registration may be more accurate, since it uses directly the images being registered. The most common means to obtain fi is to maximize the cross correlation between images, but there are many possible variants and which is best varies by application. For atomic-resolution lattice images, the phase cross correlation is more strongly peaked than the amplitude cross correlation, but for non-periodic images,

Fig. 5. (a) Rigid vs (b) non-rigid registration of a series of images. Rigid registration is depicted with drift only, without distortion in each image. Non-rigid registration is depicted with drift and distortion. Adapted from Ref. [38].
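The rigid case of Fig. 5(a) can be sketched by locating the peak of an FFT-based cross correlation between a reference frame and each later frame. This is a minimal sketch under simplifying assumptions: shifts are whole pixels, images wrap around periodically, and every frame is registered to the first. The function names are hypothetical, and practical codes typically add windowing and sub-pixel peak fitting.

```python
import numpy as np

def estimate_shift(ref, img, phase=False):
    """Integer shift f such that np.roll(img, f, axis=(0, 1)) best matches ref.

    phase=False uses the ordinary cross correlation; phase=True normalizes
    the Fourier amplitudes first (phase correlation).
    """
    spectrum = np.fft.fft2(ref) * np.conj(np.fft.fft2(img))
    if phase:
        spectrum /= np.abs(spectrum) + 1e-12  # keep only the phase
    cc = np.fft.ifft2(spectrum).real
    peak = np.unravel_index(np.argmax(cc), cc.shape)
    # Indices past the midpoint correspond to negative shifts.
    return tuple(int(p) if p <= n // 2 else int(p) - n
                 for p, n in zip(peak, cc.shape))

def align_and_average(frames, phase=False):
    """Register every frame to the first one, then average the aligned stack."""
    ref = frames[0]
    aligned = [np.roll(f, estimate_shift(ref, f, phase), axis=(0, 1))
               for f in frames]
    return np.mean(aligned, axis=0)

# Usage: a random test image and a drifted, noisy copy of it.
rng = np.random.default_rng(0)
image = rng.random((64, 64))
drifted = np.roll(image, (-5, 3), axis=(0, 1)) + 0.05 * rng.normal(size=(64, 64))
print(estimate_shift(image, drifted))  # shift that moves `drifted` back onto `image`
```

Averaging after alignment is what converts a series of short, noisy exposures into one higher-SNR image without the drift blur of a single long exposure.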


the amplitude cross correlation is better [39]. Sometimes it is better to register a data series piecewise, registering image i + 1 to i, then i + 2 to i + 1, etc., but sometimes a global approach, registering every image i > 1 to the first image, i = 1, is best. Those two approaches can be used sequentially, with an initial rough alignment done piecewise, then refined by global alignment. Cross correlation can be replaced by mean squared difference (Eq. (2)), Poisson maximum likelihood (Eq. (4)), or other measures like total variation. More complicated rigid registration approaches are also required for problems in which the object changes as well as the image, including registration of a tilt series for tomography to remove lateral motion superimposed on rotation (e.g. [40]) and registration of a focal series to remove drift superimposed on changes in focus.

Fig. 5(b) illustrates the case of drift and distortion corrected by non-rigid registration. In non-rigid registration, every pixel in every image is allowed a shift, fi(x, y). There is not nearly enough information in even a series of images to determine fi(x, y) if all possible shifts are allowed, so prior information about the image acquisition mode or the sample must be built into the algorithm. Prior knowledge about the sample, for example use of a reference image [41] or enforcing periodicity [42], restricts the range of problems addressable by the algorithm. Information about the imaging system is more general. Jones and Nellist have provided an extensive discussion of various forms of distortion in scanned images [43], including probe jitter, flagging (displacement of entire scan lines by an incorrect starting position), and the effects of sample drift. They have implemented corrections based on these distortions and applied them to series of STEM and STM images [44].

Yankovich et al. [38] have used a non-rigid registration method developed by Berkels et al. [45] on a series of STEM images to demonstrate the first sub-picometer precision electron microscopy images. The details of the Berkels method are presented elsewhere [45], so here we will focus on the prior information built into the algorithm. That prior information is quite general, but still yields excellent results. The primary assumption is that fi(x, y) is smooth, and that the degree of smoothness is connected to the magnitude of f. Large deformations that affect many pixels must have small derivatives with respect to x and y, since they arise from long length-scale, slowly time-varying processes like sample drift. Small deformations affecting just a few pixels may be less smooth, since they arise from faster, smaller length-scale instabilities like acoustic noise. The smallest, fastest instabilities arising from electronic probe jitter affect just one or two pixels but can change direction from one pixel to the next. This connection between smoothness and length scale is implemented by penalizing deformations with large Dirichlet energy (a measure of smoothness) inside a multigrid registration scheme that first registers a down-sampled version of the image, then uses that fi(x, y) as a starting estimate for the more finely sampled version. A secondary assumption is that distortions in the image (although not the drift) have zero mean over a series of many images. This assumption is implemented by a multistage alignment, in which each image is first aligned to the previous image in the series, then those fi(x, y) are used as starting estimates for aligning to a key frame. Once the entire series is aligned, it is averaged, and the series is aligned again to the average image.

Fig. 6 shows the results of non-rigid registration and averaging of Z-contrast STEM images of a test sample of a GaN single crystal viewed along a $\langle 1 1 \bar{2} 0 \rangle$ zone axis. Fig. 6(a) is a single Z-contrast STEM image out of the series of 512 images, which shows significant distortions in the shape of each atom and more subtle variations in the distances between the atoms. Fig. 6(b) shows the result of non-rigid registration and averaging. The field of view is reduced because some of the sample has drifted away, but in what remains the atoms are smooth, regular, and exhibit very high SNR. Fig. 6(c) and (d) show that the interatomic distances are highly regular, with a standard deviation of <1 pm in both cases. The standard deviation of repeated interatomic distances in a perfect crystal is a good measure of the statistical uncertainty in the atom positions, including distortions and drift [46], so the image in Fig. 6(b) demonstrates sub-pm precision in locating atomic columns.

Fig. 7 shows the result of non-rigid registration and averaging of a series of Z-contrast STEM images of a Pt nanocatalyst particle on a SiO2 support [38]. This image comes from a shorter series of only 56 images to limit beam-induced damage to the particle structure. As a result, it has a lower precision of 2 pm for the particle interior.
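The precision figures quoted for Figs. 6 and 7 are standard deviations of repeated interatomic separations. A minimal sketch of that calculation is given below, assuming the atomic column positions have already been fitted and arranged on a regular grid; the array layout, lattice spacing, and jitter level are hypothetical.

```python
import numpy as np

def separation_statistics(positions, axis=0):
    """Mean and standard deviation of nearest-neighbour separations.

    positions: (n_rows, n_cols, 2) array of fitted column coordinates,
    ordered so that neighbours along `axis` are lattice neighbours.
    """
    steps = np.diff(positions, axis=axis)        # vectors between neighbours
    separations = np.linalg.norm(steps, axis=-1)
    return separations.mean(), separations.std(ddof=1)

# Hypothetical perfect lattice (300 pm spacing) with random position errors.
rng = np.random.default_rng(0)
ii, jj = np.meshgrid(np.arange(20), np.arange(20), indexing="ij")
ideal = 300.0 * np.stack([ii, jj], axis=-1).astype(float)
measured = ideal + rng.normal(0.0, 0.5, size=ideal.shape)

mean_sep, std_sep = separation_statistics(measured, axis=0)
print("mean separation (pm):", mean_sep)
print("standard deviation (pm):", std_sep)
```

Because the test sample is assumed to be a perfect crystal, any spread in these separations is attributed to measurement error rather than to real structural variation.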
The surface and immediate sub-surface atomic columns are displaced from their ideal lattice positions as indicated by the arrows and reported quantitatively in the table. The corner between the two {1 1 1} facets in the image (positions G through L) is strongly contracted into the center of the particle, as observed for flat metal surfaces [47] and reported for various metal nanoparticles [48]. The flat {1 1 1} surface (A through E) bulges slightly outward, consistent with DFT calculations [49], and the smooth transition between bulge and contraction leaves position F with no measurable displacement at all. These results are an example of how improvements in materials science data quality from data science approaches can lead to new observations about materials structure. A web interface to the Berkels code is available through Nanohub.org [50], and the method has been tested for coarse pixel sampling to reduce the electron dose [51] and extended to registration of spectrum images [24]. A different non-rigid alignment procedure has been developed for electron tomography data [40], where it is intended to

Fig. 6. Sub-picometer precision imaging of GaN ⟨1 1 2̄ 0⟩. (a) A raw STEM image from the series. (b) The result after non-rigid registration and averaging. (c and d) Histograms of the x and y interatomic separations as indicated in (b), demonstrating sub-picometer precision in both directions. Taken from Ref. [38].

Please cite this article in press as: P.M. Voyles, Informatics and data science in materials microscopy, Curr. Opin. Solid State Mater. Sci. (2016), http://dx.doi. org/10.1016/j.cossms.2016.10.001


P.M. Voyles / Current Opinion in Solid State and Materials Science xxx (2016) xxx–xxx

Fig. 7. Non-rigid registration applied to a Z-contrast STEM image of a Pt nanocatalyst. The table reports the displacement of the surface sites indicated in the image away from their ideal lattice positions, in the directions indicated by the yellow arrows. The length of the arrows is exaggerated for clarity. Taken from Ref. [38].

overcome beam-related changes to the sample structure as well as scanning-image acquisition distortions. Another approach to drift and distortion correction is to introduce additional prior information into how the data are acquired by rotating the scan with respect to the sample over a series of images [39,52]. Distortions in the scan are tied to the scan reference frame and drift of the sample is tied to the sample reference frame, so the two effects can be separated from one another and corrected. Unlike Berkels or Jones non-rigid registration, the rotation method can also correct systematic scan distortions present in every image. The result is a precision of 1–2 pm [52] with a similar level of accuracy and, unlike the results in Figs. 6 and 7, without the need for an internal calibration reference.

2.3. Compressed sensing

When first encountered, compressed sensing can look like magic: typical examples include demonstrating that an image can be reconstructed from the values of only a tiny subset of its pixels [53], or that a camera with only one pixel can nonetheless obtain detailed images of a scene [54]. It is not magic, of course. The mathematics of compressed sensing are well established, and they have been reviewed elsewhere [53], so they will not be covered here. Instead, we will focus on the requirements and limitations of compressed sensing and its applications in materials microscopy. To succeed, compressed sensing requires prior information about both the nature of the underlying signal and the measurement of that signal. Although the prior information requirements are quite general, they are not universally true, and compressed sensing cannot be applied to signals for which the assumptions about prior information do not hold or the required prior information is unavailable. One requirement is that the underlying signal must be sparse. Sparsity has a rigorous quantitative definition [53], but qualitatively it means "mostly zero".
However, sparsity is a quality of the representation of the data as well as the data itself. For example, a high-resolution STEM image like Fig. 2(a) is not mostly zero, but its Fourier transform is. Thus, atomic-resolution images can be sparse in the Fourier domain. Similarly, a tomogram of nanoparticles in a matrix is not sparse, but a representation of just the edges of all the particles could be. The second requirement is that the measurements of the signal must be random, but known. In the image reconstruction example, the pixels in the subset used for reconstruction must be randomly

distributed across the image, but their positions must be known. Performance guarantees for the quality of compressed sensing reconstructions are best developed for measurement matrices that have the restricted isometry property [55]. The condition that the measurement matrix be known means, for example, that simply turning the beam current down until only a few pixels measure any intensity in a TEM image is not sufficient for compressed sensing. The potential of compressed sensing in materials microscopy is demonstrated in Fig. 8 [56]. Fig. 8(a) is a conventional STEM image of a NdGaO3 substrate coated with 6 atomic layers of SrTiO3, then 10 nm of LaSrMnO3. Fig. 8(b) shows images of the same sample with a fast beam blanker used to blank the beam for some of the pixels, selected at random. In the top image, the beam is blanked 50% of the time, and in the bottom image, the beam is blanked 80% of the time. Fig. 8(c) shows the images reconstructed from the sub-sampled images in (b) using a compressed sensing approach. The reconstructions are not perfect, but they are certainly sufficient to obtain significant materials information, such as the number of SrTiO3 layers and an estimate of the interface roughness, and to identify defects like interface dislocations. Although compressed sensing is not a form of denoising, the results in Fig. 8 look like a path towards acceptable imaging at low electron dose to the sample [57,58]. The partially-blanked images in Fig. 8 were acquired with the same probe current and the same pixel dwell time as the conventional STEM image, which means they were acquired with 50% or 80% lower electron dose. However, compressed sensing with Poisson noise is limited by physical sensing constraints and the nonnegativity of the intensities [59–61].
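The two requirements, a sparse representation and random but known sampling, can be illustrated with a small noise-free 1D sketch in Python/NumPy. It is not the reconstruction used for Fig. 8 [56]: the signal is assumed to be sparse in a discrete cosine basis, half of its samples are "blanked" at random but recorded positions, and the missing samples are recovered with iterative soft thresholding (ISTA), one common solver for the L1-regularized least-squares problem. All function names, the signal length, and the regularization weight `lam` are illustrative assumptions.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix D, so that coeffs = D @ signal."""
    k = np.arange(n)[:, None]
    t = np.arange(n)[None, :]
    d = np.sqrt(2.0 / n) * np.cos(np.pi * (t + 0.5) * k / n)
    d[0, :] = np.sqrt(1.0 / n)
    return d

def soft_threshold(v, thr):
    return np.sign(v) * np.maximum(np.abs(v) - thr, 0.0)

def ista(a, b, lam, n_iter):
    """Minimize 0.5*||a @ c - b||^2 + lam*||c||_1 with unit step size,
    which is valid here because the largest singular value of a is <= 1."""
    c = np.zeros(a.shape[1])
    for _ in range(n_iter):
        c = soft_threshold(c - a.T @ (a @ c - b), lam)
    return c

n = 128
d = dct_matrix(n)

# A signal that is dense in real space but has only 3 nonzero DCT coefficients.
coeffs = np.zeros(n)
coeffs[[3, 10, 25]] = [1.0, -0.7, 0.5]
signal = d.T @ coeffs

# Random but known sampling: keep half of the samples and record which ones.
rng = np.random.default_rng(0)
kept = np.sort(rng.choice(n, size=n // 2, replace=False))
measurement_matrix = d.T[kept, :]   # maps DCT coefficients to the kept samples
measured = signal[kept]

recovered_coeffs = ista(measurement_matrix, measured, lam=0.005, n_iter=2000)
reconstruction = d.T @ recovered_coeffs

zero_filled = np.zeros(n)
zero_filled[kept] = measured
err_recon = np.linalg.norm(reconstruction - signal) / np.linalg.norm(signal)
err_zero = np.linalg.norm(zero_filled - signal) / np.linalg.norm(signal)
print(f"relative error: reconstruction {err_recon:.3f}, zero-filled {err_zero:.3f}")
```

The reconstruction depends on knowing `kept`; if the sampled positions were not recorded, `measurement_matrix` could not be formed, which mirrors the point that merely lowering the beam current does not produce a compressed sensing measurement.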
These limits may imply that maintaining the SNR of the reconstructed image requires maintaining the total dose by increasing the dose at the subset of image pixels that are sampled, meaning that compressed sensing potentially offers no reduction in total dose to the entire sample at constant image quality. What acquiring a limited number of pixels and then reconstructing the image does offer is much greater flexibility in the distribution of the dose across the sample. Unevenly distributed, highly localized dose might be advantageous for materials that damage through a mechanism with slow relaxation, like local heating in a sample with low thermal conductivity. Acquisition of only a fraction of the pixels in an image can also enable much faster acquisition. Anderson et al. modified a scanning electron microscope to direct the probe to only a subset of positions on the sample, then used compressed sensing approaches


Fig. 8. Experimental realization of compressed sensing reconstruction from a randomly sampled STEM image of NdGaO3/SrTiO3/LaSrMnO3. (a) A conventional STEM image at full pixel sampling. (b) Images acquired with a beam blanker active for 50% or 80% of the pixels. (c) Compressed sensing reconstructions of the images in (b). Taken from Ref. [56].

to reconstruct the complete image [62]. The result was images acquired up to 10 times faster than a conventional raster scan. Similar improvements in speed using similar methods have been achieved recently in atomic force microscopy as well [63].

2.4. Opportunities for further development

The examples above deal primarily with post-acquisition processing of data. The exception is the work on compressed sensing, in which the application of new algorithms and computing, combined with the development of new hardware, creates new microscopy modalities. This is an example of computational imaging, the co-design of data processing and data acquisition. A few other examples include equally sloped tomography and non-square scans. Equally sloped tomography combines a tomography data acquisition scheme with a specifically tuned reconstruction algorithm [64]. Together, these approaches have recently pushed the spatial resolution of electron tomography down sufficiently to resolve individual atoms and crystallographic defects in 3D [65,66]. Non-square scans have a long history in scanned probe microscopy, in which spiral scans can be designed to maintain constant linear or angular velocity for the tip. Constant velocity reduces distortions from the inertia of the scan system, but requires modest computing to render human-interpretable images [67,68]. Non-square scans have recently been implemented for electron microscopy as well [69]. Here are a few simple ideas for the future. One is adaptive experiments. Now, we use drift correction during experiments, but we could build software to perform on-the-fly analysis, estimating SNR in the data or uncertainty in a final result like a composition, then terminating data acquisition once sufficient data quality is achieved. The result would be exactly as much dose to the sample as required, and not a bit more.
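A minimal sketch of such a stopping rule, in Python/NumPy, is given below. It assumes the simplest possible case: the quantity of interest is a single counted signal obeying Poisson statistics, so its relative uncertainty after N total counts is 1/sqrt(N). The function name `acquire_until_precise`, the count rate per frame, and the 1% target are hypothetical; a real instrument would replace the simulated frame with a detector read-out and the uncertainty estimate with one appropriate to the measurement.

```python
import numpy as np

def acquire_until_precise(mean_counts_per_frame, target_rel_error, rng,
                          max_frames=10_000):
    """Accumulate simulated Poisson-counting frames until the relative
    uncertainty, 1/sqrt(total counts), reaches the target."""
    total_counts = 0
    frames = 0
    while frames < max_frames:
        # Stand-in for reading one frame from the detector.
        total_counts += int(rng.poisson(mean_counts_per_frame))
        frames += 1
        if total_counts > 0 and 1.0 / np.sqrt(total_counts) <= target_rel_error:
            break  # sufficient data quality reached: stop dosing the sample
    return frames, total_counts

rng = np.random.default_rng(0)
frames_used, counts = acquire_until_precise(50.0, 0.01, rng)
print(f"stopped after {frames_used} frames with {counts} counts")
```

The `max_frames` cap is a design choice that keeps the loop from running indefinitely if the target cannot be reached, for example because the signal is absent.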
In addition, microscope operators would be made more productive by real-time or near real-time implementations of the denoising and drift/distortion

corrections discussed above. Even approximate implementations would provide invaluable feedback on the eventual data quality achievable, resulting in a much higher level of usability and productivity. Another idea under active development in electron microscopy is a variant of compressed sensing designed to increase the speed of acquisition of relatively slow pixelated detectors like CCD cameras. Computational imaging is an active field of research in its own right, so it seems likely that there are more opportunities for cross-pollination of ideas and techniques with materials characterization with greater impact than these.

3. Improved materials information

Improving the materials information obtained from microscopy data is largely the purview of techniques in machine learning. Various techniques enable experimenters to map known signals in large data sets and to discover new signals. Here we will discuss problems in spectral unmixing, clustering, and the use of optimization techniques to enable the solution of analytically intractable inverse problems like dynamical diffraction using forward models that simulate experiments from models. A large range of topics in image analysis, including feature identification, edge detection, and segmentation also fall in the category of improved information, but are not covered here.

3.1. Spectral unmixing

In a spectrum image or similar data set, often data from one spatial position (e.g. a single spectrum) contains a mixture of signals from more than one prototypical spectrum. As an example, an EDS spectrum image from a two-phase mixture could contain spectra which are a mixture of the spectrum from the pure phase A and the pure phase B. Given such a spectrum image, the spectral unmixing problem is to identify the spectral prototypes and assign a fraction of each prototype to every position [70,71]. One


approach is to apply prior information, like the spectra of phases A and B, using a technique like multiple linear least squares fitting [72], but a more general approach is to learn the prototypical spectra from the spectrum image itself. Spectral unmixing is an example of a broader class of problems in unsupervised machine learning: given a large data set of related measurements, find interesting signals and characterize them. "Interesting" can mean signals that occur frequently, or that occur bunched together in space or time, or signals that occur rarely but are not noise. Ideally, the extracted signals would also be physically meaningful: the spectrum of a phase or the physical response of a material to a stimulus like an electric field. No one algorithm provides a desirable answer in every case, so a large number of possible approaches exist. Only a few that have found application in materials microscopy will be covered here, divided into the two broad categories of component analysis and clustering.

3.1.1. Component analysis

Component analysis and low-dimensional representation are discussed in Section 2.1 as a means of denoising. For denoising, the nature of the basis vectors for the low-dimensional space (the components) does not matter, since the only goal is to remove aspects of the signal outside the low-dimensional subspace. For spectral unmixing, the components (vectors a and b in Fig. 3) are now the prototypical signals, the coefficients of each component (i and j in Fig. 3) for a given measurement give the fraction of each prototype it contains, and graphs or images of the weights across many points represent the distribution of the prototypes in the data set. Consideration of Fig. 3 should make clear that there are many possible choices for the components that form equally good bases for the low-dimensional subspace.
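Both points can be sketched in a few lines of Python/NumPy under stated assumptions: a synthetic two-phase "spectrum image" built from two hypothetical Gaussian-peak prototypes, known prototypes fitted by ordinary linear least squares (the simplest form of a multiple linear least squares fit), and an arbitrary invertible remixing of the prototypes to show that a different basis for the same subspace fits the data equally well. None of the spectra, peak positions, or noise levels come from the work cited here.

```python
import numpy as np

channels = np.arange(200.0)  # hypothetical energy-channel axis

def peak(center, width):
    return np.exp(-0.5 * ((channels - center) / width) ** 2)

# Two assumed prototype spectra (phases A and B), one per column.
prototypes = np.stack([peak(90.0, 8.0), peak(110.0, 8.0)], axis=1)

# Synthetic spectrum image: 100 pixels, each a mixture of A and B plus noise.
rng = np.random.default_rng(0)
n_pixels = 100
fraction_a = rng.uniform(0.0, 1.0, n_pixels)
true_weights = np.stack([fraction_a, 1.0 - fraction_a])        # (2, n_pixels)
data = prototypes @ true_weights
data += 0.01 * rng.standard_normal(data.shape)

# Known prototypes: least-squares fit of the weights at every pixel at once.
fit_weights, *_ = np.linalg.lstsq(prototypes, data, rcond=None)
max_weight_error = np.max(np.abs(fit_weights - true_weights))

# A remixed basis spans the same subspace and fits equally well, although
# its "spectra" and weights no longer correspond to physical phases.
mixing = np.array([[1.0, 0.5], [-0.5, 1.0]])
remixed = prototypes @ mixing
remixed_weights, *_ = np.linalg.lstsq(remixed, data, rcond=None)
residual_original = np.linalg.norm(prototypes @ fit_weights - data)
residual_remixed = np.linalg.norm(remixed @ remixed_weights - data)
print(max_weight_error, residual_original, residual_remixed)
```

The equal residuals are the reason additional restrictions on the components are needed before the components themselves can be interpreted physically.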
Different methods for component analysis enforce different restrictions on the components, which may make them more or less physically meaningful for a particular application. In PCA, the first component is always the mean of the set of measurements, and all the other components characterize deviation from the mean. They are selected to maximize the fraction of the variance in the original data that each component captures, subject to the requirement that the components are orthogonal vectors in the high-dimensional space. As a result, in materials microscopy, PCA prototypes typically have little physical meaning, since they are both positive and negative and have a Gaussian histogram regardless of the distribution of the underlying data. Sometimes they can be viewed qualitatively as difference spectra with respect to the mean. These limitations of PCA can be overcome by a variety of techniques, one of which, independent component analysis (ICA) [74–76], is illustrated in Fig. 9 [73]. Fig. 9 compares the first three PCA components (not counting the mean spectrum) to the first three ICA components for an EEL spectrum image data set from a SnO2–TiO2 solid solution thermally treated to create spinodal decomposition. ICA finds components which are maximally non-Gaussian but with minimum mutual information [74]. An EEL spectrum has a strongly non-Gaussian histogram, so ICA could be expected to perform better than PCA. The two phases in the sample have strongly overlapping EELS edges, with the Ti L2,3 edge starting at 456 eV, the Sn M4,5 edge starting at 485 eV inside the extended Ti fine structure, and O K-edges at 532 eV from both compounds, but with different fine structure. After spinodal decomposition the two phases are well-separated from one another, so the prototype spectra overlap very little in the spatial dimensions of the spectrum image. ICA should be sensitive to this spatial separation. The first two PCA components, Fig.
9(a) and (b), do not have the shape of EEL spectra at all. The third component, Fig. 9(c), does, but not only is it mixing spectral features from all three elements, as shown by the small hump near 485 eV, the Sn M4,5 edge onset,

but components 1 and 2 also have strong features around 456 eV, the Ti L2,3 edge onset. The ICA components in Fig. 9(d)–(f) do a better job of separating the contributions from the two phases. The first component, Fig. 9(d), is the power-law decaying background common to EEL spectra. The second component, Fig. 9(e), is very close to a reference spectrum for TiO2, and the third component, Fig. 9(f), contains signals from Sn and O without Ti. The corresponding spatial maps of the component weights are also more physically useful for ICA than PCA [73]. The PCA maps do not simply correspond to the sample microstructure and contain negative values which are difficult to interpret physically. The ICA maps are all positive and match the microstructure observed by other techniques very well. There are many other prototype-finding approaches and algorithms, some of which take the subspace approach of PCA and ICA, and some of which do not. Vertex component analysis is a computationally efficient approach that has had some success in EELS and EDS data sets [75,77,78]. Non-negative matrix factorization, which finds all-positive prototypes which are more likely to be physically interpretable, has also been used [79]. Bayesian linear unmixing (BLU) [80] starts with the assumption that the data set consists of linear combinations of the prototypes, then uses Bayesian inference to estimate the prototypes. It can enforce both positivity of the prototypes and normalized, positive weights.

3.1.2. Clustering

A second class of approaches to uncovering interesting signals in complex data sets involves clustering, or grouping together similar measurements in the high dimensional space of measurements. The measurements within a group can then be averaged together to find a prototype, the differences between groups can be quantified based on the distance separating them in the space of measurements, and the variability within each group can be analyzed.
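As a rough illustration of grouping measurements and averaging each group into a prototype, the Python/NumPy sketch below uses Lloyd's algorithm, a standard approximate solver for k-means clustering. It is a minimal sketch on synthetic two-dimensional "measurements", not the code used in any of the studies cited in this section; the function name `kmeans`, the random initialization, and the test data are assumptions.

```python
import numpy as np

def kmeans(points, k, n_iter=100, seed=0):
    """Lloyd's algorithm: alternate between assigning each point to its
    nearest center and moving each center to the mean of its points."""
    rng = np.random.default_rng(seed)
    # Assumed initialization: k distinct data points chosen at random.
    centers = points[rng.choice(len(points), size=k, replace=False)].copy()
    for _ in range(n_iter):
        sq_dist = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = sq_dist.argmin(axis=1)
        new_centers = centers.copy()
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:          # leave an empty cluster where it is
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Final assignment and cost (sum of squared distances to cluster centers).
    sq_dist = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    labels = sq_dist.argmin(axis=1)
    cost = sq_dist[np.arange(len(points)), labels].sum()
    return centers, labels, cost

# Synthetic measurements: two well-separated groups in a 2D measurement space.
rng = np.random.default_rng(1)
group_a = rng.normal([0.0, 0.0], 0.5, size=(100, 2))
group_b = rng.normal([10.0, 10.0], 0.5, size=(100, 2))
points = np.vstack([group_a, group_b])

centers, labels, cost = kmeans(points, k=2)
print(centers, cost)
```

Each returned center is the average of the measurements assigned to it, that is, the prototype of that group. Lloyd's algorithm only finds a local minimum of the cost, so in practice it is commonly restarted from several initializations.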
As with component analysis, there are a number of approaches to accomplish this basic task, and only a few will be reviewed here. The most common clustering technique in materials microscopy is k-means clustering, which is also one of the oldest clustering techniques [81,82]. k-means clustering tries to assign all of the measurements in a data set to k different clusters that minimize the sum of the squares of the distances of every point from the center of its cluster. The center of each cluster is the mean of the points assigned to it. k-means clustering is thus assigning all of the points in the data set to hyperspheres surrounding the center points. Finding the absolute best k-means clustering (the global minimum of the sum of the squared distances) is a computationally intensive problem, but various approximate algorithms exist. k-means requires the user to specify the number of clusters as an input, which is sometimes estimated from the underlying dimensionality of the data set determined by component analysis (see Section 2.1), or from a dendrogram, which displays graphically the distance between cluster centers as the clusters are recursively divided into smaller and smaller clusters. A sudden drop in the cluster separation, reminiscent of the knee in the scree plot for PCA, is an indication that the optimal number of clusters has been exceeded. k-means clustering is used as part of the NLPCA denoising algorithm in Section 2.1 [24], and has been used for feature identification in high-resolution electron images [83], CBED patterns [84], and complex SPM data [85,86]. Another approach to the same problem is density-based clustering. The prototypical algorithm is called Density-Based Spatial Clustering of Applications with Noise (DBSCAN) [87]. DBSCAN tries to find clusters of connected measurements in the space of measurements. Points representing single measurements are connected by being within some threshold distance ε of one another.
Collections of connected points larger than a specified threshold


Fig. 9. Component analysis of an EEL spectrum image from a SnO2 – TiO2 spinodal. (a–c) are the first three PCA components. (d–f) are the first three ICA components. Adapted from Ref. [73].

number of points correspond to regions of high density and are identified as clusters. Density-based clustering can therefore find any number of clusters of any shape, and not every measurement must belong to a cluster, since, for example, some measurements may be farther than ε from every other measurement. Various elaborations on the DBSCAN approach exist which estimate the internal parameters (ε and the minimum number of points in a cluster) from the data, rather than having them set by the user. Density-based clustering has been used in atom probe tomography to discover atomic clusters [88] and in biological fluorescence microscopy applications [89,90].

3.1.3. An example combining both approaches

No one technique suffices to solve every problem, or even every example of a class of problems, such as spectral unmixing for every EEL spectrum image. Strelcov et al. have provided an excellent example of applying multiple techniques to discover information within a complex microscopy dataset, then using that information to create materials knowledge [85]. The microscopy data come from first-order reversal curve current-voltage (FORC-IV) SPM data [15], in which a series of current-voltage loops at steadily increasing maximum positive bias are collected at every pixel in an image, creating a 4D data set of current as a function of position (x, y), voltage, and loop number which is sensitive to e.g. hysteretic and memristive effects from bias-induced changes in the sample [15]. The sample consists of CoFe2O4 (CFO) nanocolumns in a BiFeO3 (BFO) matrix. Their first analysis, shown in Fig. 10, was based on k-means clustering, using four clusters (Fig. 10(a)). Fig. 10(c) shows that the clustering algorithm discovered regions with increasing resistivity from cluster 1 to cluster 4, and noticeably higher hysteresis in clusters 1 and 2 compared to 3 and 4. Fig. 10(b) shows where the four data clusters occur in the sample. The low resistivity parts of the data set (clusters 3 and 4) correspond to the CFO pillars, and the highest resistivity is the BFO matrix (cluster 1). Cluster 2 is a distinct interfacial region surrounding every CFO pillar. Their second analysis was Bayesian linear unmixing (BLU) to find prototype signals and signal distributions. BLU is attractive in this case because the results are all positive and the weights sum to one, creating spatial maps that are readily interpretable. The BLU results are shown in Fig. 11. The top row shows the prototype signals, and the bottom row shows maps of the weights.
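The constraint that makes such weight maps easy to read, weights that are non-negative and sum to one at every pixel, can be illustrated without the full Bayesian machinery. The Python/NumPy sketch below is a deterministic stand-in, explicitly not BLU [80]: it assumes the prototype signals are already known and estimates the weights for one pixel by projected gradient descent onto the probability simplex. The two prototype curves, the function names, and the iteration count are assumptions made for illustration only.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]
    shifted_cumsum = np.cumsum(u) - 1.0
    counts = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - shifted_cumsum / counts > 0)[0][-1]
    theta = shifted_cumsum[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)

def unmix_on_simplex(prototypes, measurement, n_iter=2000):
    """Least-squares weights constrained to be non-negative and to sum to
    one, found by projected gradient descent with step 1/L."""
    step = 1.0 / np.linalg.norm(prototypes, 2) ** 2  # L = sigma_max^2
    n_proto = prototypes.shape[1]
    w = np.full(n_proto, 1.0 / n_proto)
    for _ in range(n_iter):
        gradient = prototypes.T @ (prototypes @ w - measurement)
        w = project_to_simplex(w - step * gradient)
    return w

# Two hypothetical prototype I-V curves on a normalized bias axis.
bias = np.linspace(0.0, 1.0, 101)
ohmic = bias                                                  # linear I-V
nonlinear = (np.exp(3.0 * bias) - 1.0) / (np.exp(3.0) - 1.0)  # strongly nonlinear I-V
prototypes = np.stack([ohmic, nonlinear], axis=1)

true_weights = np.array([0.3, 0.7])
measurement = prototypes @ true_weights     # noise-free synthetic pixel

weights = unmix_on_simplex(prototypes, measurement)
print(weights, weights.sum())
```

Because every pixel's weights lie on the simplex, each weight can be read directly as the fraction of that prototype at that position. BLU additionally estimates the prototypes themselves and provides posterior uncertainties, neither of which this sketch attempts.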


Fig. 10. k-means clustering of complex FORC-IV SPM current-voltage data from CoFe2O4 nanopillars in a BiFeO3 matrix. (a) A dendrogram used to determine that the data are well-clustered using four clusters. (b) The spatial distribution of the four clusters. (c) The average FORC-IV curve for each of the four clusters. Figure from Ref. [85].

The outstanding feature of this analysis is that the prototype signals have identifiable physical meaning [85]: prototype 1 is simple Ohmic behavior with linear I–V, and prototypes 2 and 3 are Fowler-Nordheim tunneling with low potential barriers. Prototype 4 has strong hysteresis, with the forward I–V curve well fit to Poole-Frenkel conduction and the reverse to Fowler-Nordheim tunneling. The weights show that prototype 1 occupies the edges of one CFO column and the center of two other columns. Prototype 2 is primarily interfacial, as is prototype 4, and prototype 3 describes the insulating BFO matrix. The authors speculate that the hysteresis in prototype 4 is a result of oxygen vacancy redistribution in the interface, activated under bias [85]. The complex I–V response of this sample would be extremely difficult to disentangle from bulk measurements, but it is also not obvious from the raw FORC-IV data. The application of a few different data science techniques discovered unexpected signals (new information) in the large data set, leading to new materials knowledge. Strelcov et al. later extended these techniques to characterize and understand the behavior of similar samples as a function of oxygen and water partial pressure, to derive local dopant concentrations and barrier heights, and to gain new insight into memristive switching [86].

3.2. Tomography

Computed tomography has a long history of innovation in mathematical methods to solve the problem of reconstructing the 3D structure of a sample from a series of 2D projections. Midgley and Weyland have reviewed applications of electron tomography to materials samples [91,92]. These and other reviews discuss the development of reconstruction approaches from simple back-projection, to weighted back-projection, through more sophisticated techniques such as the simultaneous iterative reconstruction technique (SIRT) and the algebraic reconstruction technique (ART), which are now standard approaches in the field. Here, we will summarize two recent innovations that have been adopted in materials electron tomography, discrete tomography and compressed sensing tomography. In-depth reviews comparing these techniques and related methods to standard approaches are available [93,94]. Discrete tomography uses the prior information that the 3D intensity of the object has only a few (potentially two) possible values to reconstruct the object from a small number of projections [95]. One of the situations where this assumption applies is in tomography of crystals at atomic resolution. If the crystal is monatomic, there either is or is not an atom, leading to a binary object. For a polyatomic crystal, the atoms still have discrete values of intensity. Reconstruction from just two or three projections requires additional prior information such as the structure of the crystal lattice and therefore cannot reconstruct defects or interfaces [96]. Reconstruction from more projections (10 or more) can reconstruct a crystal including defects without prior information about the lattice [97]. Van Aert, Bals and coworkers combined quantitative Z-contrast STEM imaging and discrete tomography [98] to demonstrate atomic-resolution tomograms of nanocrystals from a small

Fig. 11. Bayesian linear unmixing of the same data set as Fig. 10. (top row) The component I–V curves determined by unmixing. (bottom row) The spatial distribution of the four different components. Figure from Ref. [85].


number of projections [96,99]. Their first effort was imaging of Ag nanocrystals embedded in an Al matrix from just two projections [96]. The nanocrystal was assumed to be an fcc lattice with no off-site atoms or internal holes, and the reconstruction was validated against additional projection images of the same particle that were not used in the reconstruction. A follow-up demonstrated 3D imaging of a core-shell PbSe-CdSe nanorod with three atom types from a larger number of projections [99]. Compressed sensing has recently found application in tomography reconstruction (see [93] for a review). As noted in Section 2.3, the object for compressed sensing must be sparse. Tomography objects may be sparse in a simple pixel representation if the object is, for example, a discrete collection of nanoparticles [100], or it can be rendered sparse by reconstructing only edges of objects, in the Fourier or wavelet domain [101], or by assuming that all the objects are atoms so only the atom positions need to be reconstructed [102]. Tomographic measurements are not known to satisfy the restricted isometry property since they are usually not random, so methods to quantify the limitations this imposes on compressed sensing reconstructions are an active area of research [93]. Compressed sensing tomography has been applied to reconstructing nanoparticles [100] and other nanostructures [58], crystals at atomic resolution [102,103], and the plasmonic states at the edges and corners of a metallic nanocube [101].

3.3. Incorporating information from forward simulations

For several microscopies and spectroscopies, we have effective forward models that cannot be inverted. In other words, given a sufficiently detailed model of a material (its atomic structure, microstructure, electronic states, etc.) we can calculate the results of an experiment, but given the experimental data we cannot obtain unique information about the sample.
Electron imaging in TEM and STEM has this property: we can simulate dynamical diffraction from known structures using a variety of techniques [104], but we cannot analytically invert the dynamical data to obtain a structure. Core ionization and vibrational spectroscopies face similar problems, as does scanning tunneling microscopy. But even if the data can be inverted in principle, as with kinematic diffraction, it may be incomplete. One way to solve an inverse problem for which there is a good forward model is to compute the forward model for many possible solutions, then select the best one. Experimenters do this all the time, when they develop a model based on a combination of information derived from experimental data, intuition, and trial and error, then show that it agrees with the available experimental data. Optimization techniques in applied mathematics offer a systematic, automated way to approach the same problem. These techniques have mostly been applied in materials science to determining atomic structures, so let us consider the problem of determining an atomic structure a, which is a list of atom positions, from an experimental data set y. We define a cost function that captures the similarity between the forward simulation of y, Y(a), and y itself, such as

$$C(a) = \frac{1}{N}\sum_{i=1}^{N}\frac{\left[Y_i(a) - y_i\right]^2}{\sigma_i^2}, \tag{5}$$

where σi is the uncertainty in data point yi. Then we minimize C over the space of possible structures a. If the experimental data incompletely constrain the structure, we can include additional information, such as an estimate of the total system energy, E(a), from density functional theory or an empirical interatomic potential, additional experimental data sets z, etc., or more abstract constraints enforcing prior knowledge, K(a), such as penalties for short interatomic distances based on the hard-sphere radii of atomic


species or penalties on bond angles in covalent solids. These additional terms can be added to the cost function, as in

$$C(a) = \frac{1}{N}\sum_{i=1}^{N}\frac{\left[Y_i(a) - y_i\right]^2}{\sigma_i^2} + \alpha_1\,\frac{1}{N}\sum_{i=1}^{N}\frac{\left[Z_i(a) - z_i\right]^2}{\sigma_i^2} + \alpha_2 E(a) + \alpha_3 K(a). \tag{6}$$

This approach leaves the problem of determining the α's, which are the relative weights of the different terms. The α's can be determined in various ad hoc ways, for example by making the magnitude of all the terms roughly equal or making them all decrease at roughly the same rate, or in a more systematic manner by borrowing methods based on Bayesian statistics from biomacromolecule structure refinement [105,106]. Alternatively, methods in multi-objective optimization can find the Pareto optimum a, for which improving one term in Eq. (6) necessarily worsens one or more of the other terms. C(a) is in general a high-dimensional, non-convex function with a large number of local minima and derivatives that are not easy to calculate, which restricts the available optimization algorithms to the most general approaches. Optimizers that have been used in materials characterization include Monte Carlo [107–110] and genetic algorithms [111–113]. Most of this work has used diffraction data, but as TEM and STEM have become fully quantitative, it has become possible to apply structure optimization to microscopy data as well [111,113–115]. Fig. 12 shows the results of genetic algorithm optimization of a colloidal gold nanoparticle against Z-contrast STEM experimental data (Fig. 12(a)) and E from an embedded-atom model interatomic potential. The STEM data constrains the particle shape and thickness, the lack of surface faceting, and the presence of the twin boundary. The total energy constrains the three dimensional structure, including in the direction normal to the image plane, which is not constrained by the STEM image and which in turn determines the surface roughness. Fig. 12(c) shows the evolution of the two terms of the cost function over the course of the refinement, and Fig. 12(b) shows a simulated STEM image of the final result.
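The structure of such a refinement can be sketched on a toy problem in Python/NumPy. The example below is not the genetic algorithm refinement of Fig. 12; it assumes a one-dimensional "structure" of three atom positions, a hypothetical forward model that renders each atom as a Gaussian intensity profile, a cost with a data term in the form of Eq. (5) plus a weighted prior-knowledge penalty on short interatomic distances, and a plain Metropolis Monte Carlo optimizer. The weight `ALPHA`, the minimum distance `D_MIN`, the uncertainty `SIGMA`, and all function names are illustrative assumptions.

```python
import numpy as np

x = np.linspace(0.0, 10.0, 200)  # detector coordinate of the toy "image"
SIGMA = 0.05                     # assumed uncertainty of every data point
ALPHA = 100.0                    # assumed weight of the prior-knowledge term
D_MIN = 1.0                      # assumed minimum allowed interatomic distance

def forward(a, width=0.5):
    """Toy forward model Y(a): each atom contributes a Gaussian profile."""
    return np.exp(-0.5 * ((x[:, None] - a[None, :]) / width) ** 2).sum(axis=1)

def prior_penalty(a):
    """K(a): quadratic penalty on pairs of atoms closer than D_MIN."""
    dist = np.abs(a[:, None] - a[None, :])[np.triu_indices(len(a), k=1)]
    return np.sum(np.maximum(0.0, D_MIN - dist) ** 2)

def cost(a, y):
    chi2 = np.mean(((forward(a) - y) / SIGMA) ** 2)  # data term, as in Eq. (5)
    return chi2 + ALPHA * prior_penalty(a)

def metropolis(a0, y, n_steps=5000, step=0.2, temperature=1.0, seed=0):
    """Metropolis Monte Carlo over atom positions; returns the best state seen."""
    rng = np.random.default_rng(seed)
    a, c = a0.copy(), cost(a0, y)
    best_a, best_c = a.copy(), c
    for _ in range(n_steps):
        trial = a.copy()
        trial[rng.integers(len(a))] += step * rng.standard_normal()
        c_trial = cost(trial, y)
        # Always accept improvements; accept uphill moves with Boltzmann probability.
        if c_trial < c or rng.random() < np.exp(-(c_trial - c) / temperature):
            a, c = trial, c_trial
            if c < best_c:
                best_a, best_c = a.copy(), c
    return best_a, best_c

y_exp = forward(np.array([2.0, 5.0, 8.0]))   # synthetic, noise-free "experiment"
a_start = np.array([3.0, 4.0, 7.0])          # deliberately wrong starting model
c_start = cost(a_start, y_exp)
a_fit, c_fit = metropolis(a_start, y_exp)
print(np.sort(a_fit), c_start, c_fit)
```

Only `forward`, `prior_penalty`, and the weights would change for a realistic problem; the optimizer needs nothing beyond the ability to evaluate the cost, which is why such general methods remain usable when derivatives of C(a) are unavailable.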
The atomic columns in the simulated image (b) can be matched one-to-one to the experimental image (a), with an average displacement of 0.12 Å. The simulated image also contains a handful of atomic columns at the edge of the particle that correspond to diffuse intensity in the experiment that is not organized into discrete columns, potentially due to movement of surface atoms under the electron beam. The structural model created by GA refinement is at a local minimum of the interatomic potential, making it suitable for further simulations using, for example, molecular dynamics to derive structure-property relationships.

The refinement in Fig. 12 consumed 10,000 CPU hours on a computing cluster composed of AMD Opteron 2427 processors at 2.2 GHz, which corresponds to about a week running on 60 cores. However, during that time, it consumed only a nominal amount of human time and attention. As computers grow more powerful and optimization approaches for materials structure improve, we could imagine refining hundreds of nanoparticles in parallel, completely automatically, enabling the study of statistically significant populations of structures instead of single structures, as is often currently the case.

Fig. 12. Genetic algorithm refinement of a gold nanoparticle against Z-contrast STEM data. (a) The experimental Z-contrast STEM image of the nanoparticle and (b) a simulated image of the final structure at the end of the refinement. (c) The evolution of the cost function C and its two terms, the energy and the agreement with experiment, over the course of the simulation. (d) Histogram of the displacements of the atomic column positions in the experimental and final simulated STEM images. Figure taken from Ref. [111].

Another common method for deriving structures from forward models is to generate a large library of simulated data from the set of expected structures, then match experiments to the library. This approach has been used with Z-contrast STEM [116,117] and position-averaged convergent beam electron diffraction [118,119] experiments. A χ² measure applied directly to the measured data vs. simulated data is one approach [116,119] to matching experiments to simulations, and a wide variety of image feature recognition and dictionary learning approaches might also be useful.

One other interesting approach is the use of Bayesian statistics applied to quantities derived from the experimental and simulated images, like the mean intensity of an atom or the width of the atom image. The advantage of such an approach is that derived quantities can be less susceptible to noise than image intensities directly. Ishikawa et al. have applied Bayesian statistics to Z-contrast image data and simulations of Z-contrast images to determine the depth of Ce atoms in an AlN crystal [117]. In a simple linear imaging model, a STEM image in one projection contains information only about the position of atoms within the plane of the projection, not perpendicular to it. However, dynamical diffraction and probe channeling mean that the depth of an impurity perpendicular to the projection plane has subtle effects on the image, changing the intensity of the atomic columns and their shape. Ishikawa et al. calculate the probability of a Ce atom at depth d in a column of m Al atoms, given measured intensity characteristics {I_0 … I_N}, as

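$$P(d, m \mid \{I_0 \ldots I_N\}) = \frac{\left[\prod_{n=0}^{N} P(I_n \mid d, m)\right] P(d, m)}{\sum_{j=1}^{m}\sum_{i=1}^{d}\left[\prod_{n=0}^{N} P(I_n \mid d_i, m_j)\right] P(d_i, m_j)}. \qquad (7)$$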

P(I_n | d, m) is the probability of measuring a particular intensity characteristic given a particular Ce position and number of Al atoms in the column. That quantity can be evaluated from the library of simulations, given numerical uncertainty in the simulations and noise-limited uncertainty in the experiments. P(d, m), the probability of the Ce occupying position d in a column of m Al without any further information, is taken to be constant.
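The posterior in Eq. (7) can be sketched in a few lines of Python. This is a minimal illustration, not the implementation of Ref. [117]: it assumes a Gaussian form for each P(I_n | d, m) with a single combined uncertainty per intensity characteristic, and the names `log_likelihood`, `posterior`, and `toy_library`, along with all numbers, are hypothetical. With a constant prior P(d, m), the prior cancels between numerator and denominator, so only the likelihoods need to be normalized.

```python
import math
from typing import Dict, Sequence, Tuple


def log_likelihood(measured: Sequence[float], simulated: Sequence[float],
                   sigma: Sequence[float]) -> float:
    """log of prod_n P(I_n | d, m), assuming independent Gaussian errors."""
    return sum(
        -0.5 * ((i_meas - i_sim) / s) ** 2 - math.log(s * math.sqrt(2.0 * math.pi))
        for i_meas, i_sim, s in zip(measured, simulated, sigma)
    )


def posterior(measured: Sequence[float],
              library: Dict[Tuple[int, int], Sequence[float]],
              sigma: Sequence[float]) -> Dict[Tuple[int, int], float]:
    """P(d, m | {I_n}) over a simulation library keyed by (d, m), flat prior."""
    log_like = {key: log_likelihood(measured, sim, sigma)
                for key, sim in library.items()}
    # Subtract the largest log-likelihood before exponentiating to avoid underflow.
    peak = max(log_like.values())
    unnormalized = {key: math.exp(value - peak) for key, value in log_like.items()}
    evidence = sum(unnormalized.values())
    return {key: value / evidence for key, value in unnormalized.items()}


# Toy library: simulated intensity characteristics for three dopant depths d
# in a column of m = 4 atoms (made-up numbers).
toy_library = {
    (1, 4): [1.0, 2.0],
    (2, 4): [1.5, 2.5],
    (3, 4): [3.0, 4.0],
}
post = posterior([1.0, 2.0], toy_library, sigma=[0.5, 0.5])
most_probable = max(post, key=post.get)
```

Additional evidence, such as a simulated energy for each dopant site, would enter as extra factors in the likelihood or as a non-constant prior, which is the extensibility discussed next.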

For the particular problem of deriving the position of a high-Z dopant atom in a light-element matrix from Z-contrast STEM data, the χ² approach and the Bayesian approach yield similar performance, locating the dopant to within one or two atomic positions in all three dimensions [116,117]. However, the Bayesian model can be easily extended to include additional information, such as simulated energies for different dopant sites, X-ray absorption fine structure data and simulations, or almost anything else. It can also be automated to process large quantities of data. These qualities point to its power as a general data analysis and interpretation tool for materials characterization [120].
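For comparison with the Bayesian approach, matching an experiment to a simulation library by a χ² measure amounts to a nearest-neighbor search weighted by the measurement uncertainty. The sketch below is a minimal, hypothetical illustration: the function names, the library keys, and the numbers are assumptions, not taken from Refs. [116,119].

```python
from typing import Dict, Hashable, Sequence, Tuple


def chi_squared(measured: Sequence[float], simulated: Sequence[float],
                sigma: Sequence[float]) -> float:
    """Sum of squared, uncertainty-normalized residuals."""
    return sum(((m - s) / e) ** 2 for m, s, e in zip(measured, simulated, sigma))


def best_match(measured: Sequence[float],
               library: Dict[Hashable, Sequence[float]],
               sigma: Sequence[float]) -> Tuple[Hashable, float]:
    """Return the library key with the smallest chi-squared and its score."""
    scores = {key: chi_squared(measured, sim, sigma) for key, sim in library.items()}
    best = min(scores, key=scores.get)
    return best, scores[best]


# Toy library keyed by a hypothetical dopant depth label.
depth_library = {
    "depth_1": [1.0, 2.0],
    "depth_2": [1.5, 2.5],
    "depth_3": [3.0, 4.0],
}
label, score = best_match([1.1, 2.1], depth_library, sigma=[0.5, 0.5])
```

Unlike the posterior, the χ² match returns a single best entry rather than a probability for every entry, so ambiguity between neighboring depths is not reported unless the full set of scores is inspected.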

3.4. Opportunities for further development

One of the opportunities available to materials microscopy, and to materials science more broadly, is to adapt tools and approaches developed in other areas of science to materials data and materials problems. The breadth of techniques discussed in this section only scratches the surface of potentially useful tools already deployed in application areas as diverse as remote sensing, geoscience, and medical imaging, and from various branches of mathematics including variational methods, matrix completion, and optimization. The challenge is to educate ourselves well enough to make intelligent use of these methods. As the examples above show, different problems require different approaches, and often the best results come from a judicious combination of several approaches.

Another challenge is the growing prevalence of materials microscopy data sets that are much too large for human-directed
detailed analysis of individual measurements. Examples include the FORC-IV and other complex SPM data sets, large spectrum images from EELS and EDS, 4D microscopy data sets consisting of entire diffraction patterns from many positions on a sample, and even high-speed movies of HRTEM images comprising hundreds of thousands of frames. These large data sets require automated methods capable not just of calculating predetermined quantities set by the human analyst, but of recognizing important signals de novo and directing human attention toward them. Without such tools, discoveries present in already acquired data will go begging. Large-scale structure optimization fulfills a similar need, enabling detailed analysis of large data sets with reasonable human time and energy.

A related opportunity is to skip over the “improved data” step in Section 2 and proceed directly to improved materials information. While the denoising and registration methods presented in Section 2 produce beautiful images and happy experimenters, those images are not the fundamental information we seek from materials characterization. That information is the positions and elemental identities of atoms; the filled and empty electronic states; or the I–V response of a part of the sample. If data science methods can take us directly to the necessary information from noisy, corrupted data, so much the better. Cryo-electron microscopy determination of the 3D structure of biomacromolecules is an excellent example [121]. Starting from thousands or millions of incredibly noisy images of molecules in vitreous ice, computing delivers volumetric density maps for the molecule. Denoised, aligned, and undistorted images of the molecule may or may not exist at an intermediate stage in the computation, but they are not the result. They might be used for troubleshooting the data analysis, but the final structure is the result.
If materials microscopists identify parallel problems and develop approaches (and software) to solve them, we will enable ourselves to make new discoveries, even from noisy, distorted, and corrupted data from fragile, beam-sensitive samples and instruments pushed to the limits of spatial, spectral, and temporal resolution.

4. Software and computing

As all the approaches discussed above become more and more essential to materials microscopy, as well as to materials characterization and materials science more generally, we should spare some attention for the software that implements them and the computing that powers them. As in other areas of science, there is competition between open source and commercial software packages as platforms for informatics in materials microscopy. Just in EM, open source packages for various applications include hyperspy [122], EELSModel [123], the Cornell Spectrum Imager [124], ELMA [125], and Nion Swift [126]. At the risk of overgeneralizing, these packages tend to be cutting edge in methods, but without top-quality user interfaces and documentation, and they often suffer from limited access to experimental metadata. At the other extreme are closed source commercial packages, including, in EM, DigitalMicrograph from Gatan, TIA/Velox from FEI, Pathfinder from ThermoFisher, TEAM from EDAX, and many more. These systems are tightly coupled to data acquisition hardware, so they have excellent access to metadata, and some of them are user-extensible via scripting or plugins. However, they are typically expensive, and they can be limited in the computing architectures they will support. These limitations make it more difficult for researchers to take advantage of the centralized high-performance or high-throughput computing resources necessary to run computing-intensive algorithms on large data sets. Intermediate between these extremes is open source software written to run on commercial platforms like
MATLAB, IDL, or DigitalMicrograph [127]. These packages can be easier to develop and maintain than completely open source packages, but require costly licenses from users and still struggle to obtain full access to metadata.

Access to complete metadata for experiments is an economic and cultural problem, not a scientific one. Open file standards such as CDF and HDF5 can store arbitrarily complicated metadata and arbitrarily complicated data in the same file. The difficulty is that manufacturers of commercial characterization instruments have an incentive to lock metadata within proprietary data formats in order to sell additional copies of their in-house analysis software. Competition might discourage such behavior if a clearly superior open source alternative already existed, but no such alternative is likely to arise without access to metadata already in place. One way out of this chicken-and-egg problem is for government agencies to require instruments purchased with their funds to save their data only in fully disclosed, open file formats. In addition, such a policy would be consistent with the spirit of recent open science and data sharing initiatives.

Recently, web-enabled data analysis environments and applications have emerged as an alternative to end-user-installed software. Early examples from other areas of materials science include the Materials Project for dissemination and processing of ab initio calculation data [128], and Nanohub for all sorts of applications related to nanotechnology [129]. Within materials characterization, development of these environments is being led by the needs of large central facilities for X-ray and neutron scattering, and is making its way into materials microscopy.
The Bellerophon Environment for Analysis of Materials (BEAM) seeks to provide a central, web-delivered framework for all sorts of microscopy data analysis, ranging from simple smoothing and image visualization all the way to complex, computationally intensive spectral unmixing calculations on large data sets [130]. The non-rigid registration code in Section 2.2 is available as a standalone, single-purpose web app [50]. These web systems have the significant advantages that they do not require users to install new software, and they do not require developers to support multiple architectures. As a result, they have the potential to speed adoption of new techniques by reducing the barrier to entry for new users. They have the disadvantage of requiring the hosting institution to pay for computing resources and bandwidth to support researchers from other institutions, and the related disadvantage of leaving those external users at the mercy of the continued largesse of the host to continue their research. Academic users who lose access to their own data or lose the ability to reproduce previous results when an online analysis system shuts down may also find themselves in violation of increasingly strict data management, retention, and dissemination policies designed to improve openness and reproducibility in science. There are also significant potential obstacles to commercial use, including restrictions on funding used to support these systems and concerns by commercial users about the security of proprietary data. Perhaps the community will evolve towards a pay-as-you-go model akin to Amazon Web Services for scientific computing. Until then, convergence on a few extensible, community-supported, open source platforms for informatics in materials microscopy, supplemented by web services for initial trials of new techniques, seems to this author like the best option.
A similar approach supports much of the technology underpinning the internet, and has come to dominate some areas of purely computational materials science, such as molecular dynamics and density functional theory calculations. Only time will tell if the same approach can succeed in materials characterization as well.


5. Summary

Data science and informatics will continue to have an increasing impact on materials science in general, and on materials characterization and materials microscopy more specifically. The drivers for this trend are large, complex data sets made possible by new instrumentation and techniques, and the growing availability and steadily decreasing cost of large-scale computing. As a framework for understanding the application of these techniques, this review proposes that we view the process of materials characterization as a chain from data to information to knowledge. Data is what comes off the instrument, information is facts about a material that we derive from data, and knowledge is the generalizable contributions to science we develop from information. Data science and informatics have been applied (so far) to improve the quality of data and to improve the information extracted from it. This review focused on two problems in improving data, denoising and drift and distortion correction, and two problems in improving information, spectral unmixing and optimization over forward models to solve intractable inverse problems in structure determination. It also touched on recent developments in tomographic reconstruction. Several data science tools were described, including non-local patch-based methods, low-dimensional representation, clustering, and compressed sensing. Most of the data science tools appear as part of the solution to more than one problem. For example, low-dimensional representation has a part to play in both denoising and spectral unmixing. Similarly, most of the materials microscopy problems benefited from the application of more than one data science technique, such as non-local principal component analysis denoising and the analysis of complex scanning probe microscopy data.
This cross-cutting between problems and tools points to the need for practitioners to gain some general knowledge of the scope, application, and limitations of the available techniques, which is the purpose of this review.

Acknowledgements

Preparation of this manuscript was supported by the U.S. Department of Energy, Office of Basic Energy Sciences (DE-FG02-08ER46547) and by a University of Wisconsin Vilas Mid-Career Investigator Award. Research by PMV discussed herein was supported by the U.S. Department of Energy, Office of Basic Energy Sciences (DE-FG02-08ER46547, high precision STEM and non-local PCA denoising) and the U.S. National Science Foundation (DMR-1332851, nanoparticle structure optimization using genetic algorithms).

References

[1] P. Denes, J. Bussat, Z. Lee, V. Radmillovic, Active pixel sensors for electron microscopy, Nucl. Instrum. Methods Phys. Res. Sect. A 579 (2007) 891–894.
[2] M. Battaglia, D. Contarato, P. Denes, P. Giubilato, Cluster imaging with a direct detection CMOS pixel sensor in transmission electron microscopy, Nucl. Instrum. Methods Phys. Res. Sect. A 608 (2009) 363–365.
[3] T.A. Caswell et al., A high-speed area detector for novel imaging techniques in a scanning transmission electron microscope, Ultramicroscopy 109 (2009) 304–311.
[4] H.-G. Liao et al., Facet development during platinum nanocube growth, Science 345 (2014) 916–919.
[5] E. Sutter et al., Electron-beam induced transformations of layered tin dichalcogenides, Nano Lett. (2016), http://dx.doi.org/10.1021/acs.nanolett.6b01541.
[6] D.A. Muller et al., Atomic-scale chemical imaging of composition and bonding by aberration-corrected microscopy, Science 319 (2008) 1073–1076.
[7] H. Tan, S. Turner, E. Yücelen, J. Verbeeck, G. Van Tendeloo, 2D atomic mapping of oxidation states in transition metal oxides by scanning transmission electron microscopy and electron energy-loss spectroscopy, Phys. Rev. Lett. 107 (2011) 1–4.
[8] M.-W. Chu, S.C. Liou, C.-P. Chang, F.-S. Choa, C.H. Chen, Emergent chemical mapping at atomic-column resolution by energy-dispersive X-ray spectroscopy in an aberration-corrected electron microscope, Phys. Rev. Lett. 104 (2010) 196101.
[9] M. Watanabe et al., Improvements in the X-ray analytical capabilities of a scanning transmission electron microscope by spherical-aberration correction, Microsc. Microanal. 12 (2006) 515.
[10] C. Ophus, P. Ercius, M. Sarahan, C. Czarnik, J. Ciston, Recording and using 4D-STEM datasets in materials science, Microsc. Microanal. 20 (2014) 62–63.
[11] V.B. Ozdol et al., Strain mapping at nanometer resolution using advanced nano-beam electron diffraction, Appl. Phys. Lett. 106 (2015) 253107.
[12] C. Ophus et al., Efficient linear phase contrast in scanning transmission electron microscopy with matched illumination and detector interferometry, Nat. Commun. 7 (2016) 10719.
[13] H. Yang et al., 4D STEM: high efficiency phase contrast imaging using a fast pixelated detector, J. Phys.: Conf. Ser. 644 (2015) 012032.
[14] L. He, P. Zhang, M.F. Besser, M.J. Kramer, P.M. Voyles, Electron correlation microscopy: a new technique for studying local atom dynamics applied to a supercooled liquid, Microsc. Microanal. 21 (2015) 1026–1033.
[15] E. Strelcov et al., Probing local ionic dynamics in functional oxides at the nanoscale, Nano Lett. 13 (2013) 3455–3462.
[16] P. Sun, F.O. Laforge, M.V. Mirkin, Scanning electrochemical microscopy in the 21st century, Phys. Chem. Chem. Phys. 9 (2007) 802–823.
[17] A.J. Bard, F.-R.F. Fan, J. Kwak, O. Lev, Scanning electrochemical microscopy. Introduction and principles, Anal. Chem. 61 (1989) 132–138.
[18] S.V. Kalinin, A.N. Morozovska, L.Q. Chen, B.J. Rodriguez, Local polarization dynamics in ferroelectric materials, Rep. Prog. Phys. 73 (2010) 056502.
[19] P. Güthner, K. Dransfeld, Local poling of ferroelectric polymers by scanning force microscopy, Appl. Phys. Lett. 61 (1992) 1137–1139.
[20] F.J. Anscombe, The transformation of Poisson, binomial, and negative-binomial data, Biometrika 35 (1948) 246–254.
[21] M. Mäkitalo, A. Foi, Optimal inversion of the Anscombe transformation in low-count Poisson image denoising, IEEE Trans. Image Process. 20 (2011) 99–109.
[22] J. Salmon, Z. Harmany, C.-A. Deledalle, R. Willett, Poisson noise reduction with non-local PCA, J. Math. Imaging Vision (2012).
[23] C.A. Deledalle, F. Tupin, L. Denis, R. Willett, Poisson NL means: unsupervised non local means for Poisson noise, Proc. – Int. Conf. Image Process. (ICIP) (2010) 801–804, http://dx.doi.org/10.1109/ICIP.2010.5653394.
[24] A.B. Yankovich et al., Non-rigid registration and non-local principle component analysis to improve electron microscopy spectrum images, Nanotechnology 27 (2016) 364001.
[25] A. Buades, B. Coll, J.-M. Morel, A non-local algorithm for image denoising, in: IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit., vol. 2, 2005, pp. 60–65.
[26] N. Mevenkamp, A. Yankovich, P. Voyles, B. Berkels, Non-local means for scanning transmission electron microscopy images and Poisson noise based on adaptive periodic similarity search and patch regularization, in: J. Bender, A. Kuijper, T. von Landesberger, H. Theisel, P. Urban (Eds.), Vision, Modeling, and Visualization, Eurographics Association, 2014, pp. 63–70, http://dx.doi.org/10.2312/vmv.20141277.
[27] K. Dabov, A. Foi, V. Katkovnik, K. Egiazarian, Image denoising with block-matching and 3D filtering, in: E.R. Dougherty, J.T. Astola, K.O. Egiazarian, N.M. Nasrabadi, S. Rizvi (Eds.), Image Processing: Algorithms and Systems, Neural Networks, and Machine Learning, vol. 6064, 2006, 606414 (Proc. of SPIE-IS&T Electronic Imaging, SPIE vol. 6064, 2006).
[28] N. Mevenkamp et al., Poisson noise removal from high-resolution STEM images based on periodic block matching, Adv. Struct. Chem. Imaging 1 (2015) 3.
[29] M.R. Keenan, P.G. Kotula, Accounting for Poisson noise in the multivariate analysis of ToF-SIMS spectrum images, Surf. Interface Anal. 36 (2004) 203–212.
[30] J.-F. Cai, E.J. Candès, Z. Shen, A singular value thresholding algorithm for matrix completion, SIAM J. Optim. 20 (2010) 1956–1982.
[31] E.J. Candès, B. Recht, Exact matrix completion via convex optimization, Found. Comput. Math. 9 (2009) 717–772.
[32] S. Lichtert, J. Verbeeck, Statistical consequences of applying a PCA noise filter on EELS spectrum images, Ultramicroscopy 125C (2012) 35–42.
[33] P. Cueva, R. Hovden, J.A. Mundy, H.L. Xin, D.A. Muller, Data processing for atomic resolution electron energy loss spectroscopy, Microsc. Microanal. 18 (2012) 667–675.
[34] G. Lucas, P. Burdet, M. Cantoni, C. Hébert, Multivariate statistical analysis as a tool for the segmentation of 3D spectral data, Micron 52–53 (2013) 49–56.
[35] G. Kothleitner et al., Quantitative elemental mapping at atomic resolution using X-ray spectroscopy, Phys. Rev. Lett. 112 (2014) 085501.
[36] Z. Chen et al., Energy dispersive X-ray analysis on an absolute scale in scanning transmission electron microscopy, Ultramicroscopy 157 (2015) 21–26.
[37] L.G. Brown, A survey of image registration techniques, ACM Comput. Surv. 24 (1992) 325–376.
[38] A.B. Yankovich et al., Picometre-precision analysis of scanning transmission electron microscopy images of platinum nanocatalysts, Nat. Commun. 5 (2014) 4155.
[39] C. Ophus, J. Ciston, C.T. Nelson, Correcting nonlinear drift distortion of scanning probe and scanning transmission electron microscopies from image pairs with orthogonal scan directions, Ultramicroscopy 162 (2016) 1–9.
[40] T. Printemps, N. Bernier, P. Bleuet, G. Mula, L. Hervé, Non-rigid alignment in electron tomography in materials science, J. Microsc. 0 (2016) n/a–n/a.
[41] A. Recnik, G. Möbus, S. Sturm, IMAGE-WARP: a real-space restoration method for high-resolution STEM images using quantitative HRTEM analysis, Ultramicroscopy 103 (2005) 285–301.
[42] N. Braidy, Y. Le Bouar, S. Lazar, C. Ricolleau, Correcting scanning instabilities from images of periodic structures, Ultramicroscopy 118 (2012) 67–76.
[43] L. Jones, P.D. Nellist, Identifying and correcting scan noise and drift in the scanning transmission electron microscope, Microsc. Microanal. 19 (2013) 1050–1060.
[44] L. Jones et al., Smart Align—a new tool for robust non-rigid registration of scanning microscope data, Adv. Struct. Chem. Imaging 1 (2015) 8.
[45] B. Berkels et al., Optimized imaging using non-rigid registration, Ultramicroscopy 138 (2014) 46–56.
[46] S. Bals, S. Van Aert, G. Van Tendeloo, D. Ávila-Brande, Statistical estimation of atomic positions from exit wave reconstruction with a precision in the picometer range, Phys. Rev. Lett. 96 (2006) 096106.
[47] K.P. Bohnen, K.M. Ho, Structure and dynamics at metal surfaces, Surf. Sci. Rep. 19 (1993) 99–120.
[48] W.J. Huang et al., Coordination-dependent surface atomic contraction in nanocrystals revealed by coherent diffraction, Nat. Mater. 7 (2008) 308–313.
[49] L.Y. Chang, A.S. Barnard, L.C. Gontard, R.E. Dunin-Borkowski, Resolving the structure of active sites on platinum catalytic nanoparticles, Nano Lett. 10 (2010) 3073–3076.
[50] Non-rigid Registration for STEM. http://dx.doi.org/10.4231/D30R9M519.
[51] A.B. Yankovich, B. Berkels, W. Dahmen, P. Binev, P.M. Voyles, High-precision scanning transmission electron microscopy at coarse pixel sampling for reduced electron dose, Adv. Struct. Chem. Imaging 1 (2015) 2.
[52] X. Sang, J.M. Lebeau, Revolving scanning transmission electron microscopy: correcting sample drift distortion without prior knowledge, Ultramicroscopy 138 (2014) 28–35.
[53] E.J. Candès, M.B. Wakin, An introduction to compressive sampling, IEEE Signal Process. Mag. 25 (2008) 21–30.
[54] M.F. Duarte et al., Single-pixel imaging via compressive sampling, IEEE Signal Process. Mag. 25 (2008) 83–91.
[55] E.J. Candès, The restricted isometry property and its implications for compressed sensing, Comptes Rendus Mathematique 346 (2008) 589–592.
[56] A. Béché, B. Goris, B. Freitag, J. Verbeeck, Development of a fast electromagnetic beam blanker for compressed sensing in scanning transmission electron microscopy, Appl. Phys. Lett. 108 (2016) 0–5.
[57] A. Stevens, H. Yang, L. Carin, I. Arslan, N.D. Browning, The potential for Bayesian compressive sensing to significantly reduce electron dose in high-resolution STEM images, Microscopy 63 (2014) 41–51.
[58] Z. Saghi et al., Reduced-dose and high-speed acquisition strategies for multidimensional electron microscopy, Adv. Struct. Chem. Imaging 1 (2015).
[59] X. Jiang, G. Raskutti, R. Willett, Minimax optimal rates for Poisson inverse problems with physical constraints, IEEE Trans. Inf. Theory 1–30 (2014).
[60] M. Raginsky, R.M. Willett, Z.T. Harmany, R.F. Marcia, Compressed sensing performance bounds under Poisson noise, IEEE Trans. Signal Process. 58 (2010) 3990–4002.
[61] M. Raginsky et al., Performance bounds for expander-based compressed sensing in Poisson noise, IEEE Trans. Signal Process. 59 (2011) 4139–4153.
[62] H.S. Anderson, J. Ilic-Helms, B. Rohrer, J. Wheeler, K. Larson, Sparse imaging for fast electron microscopy, in: C.A. Bouman, I. Pollak, P.J. Wolfe (Eds.), Computational Imaging XI, 2013, 86570C. http://dx.doi.org/10.1117/12.2008313.
[63] T. Arildsen et al., Reconstruction algorithms in undersampled AFM imaging, IEEE J. Sel. Top. Signal Process. 10 (2016) 31–46.
[64] J. Miao, F. Förster, O. Levi, Equally sloped tomography with oversampling reconstruction, Phys. Rev. B – Condens. Matter Mater. Phys. 72 (2005) 3–6.
[65] M.C. Scott et al., Electron tomography at 2.4-ångström resolution, Nature 483 (2012) 444–447.
[66] C.-C. Chen et al., Three-dimensional imaging of dislocations in a nanoparticle at atomic resolution, Nature 496 (2013) 74–77.
[67] O.S. Ovchinnikov, S. Jesse, S.V. Kalinin, Adaptive probe trajectory scanning probe microscopy for multiresolution measurements of interface geometry, Nanotechnology 20 (2009) 255701.
[68] D. Ziegler et al., Improved accuracy and speed in scanning probe microscopy by image reconstruction from non-gridded position sensor data, Nanotechnology 24 (2013) 335703.
[69] X. Sang et al., Dynamic scan control in STEM: spiral scans, Adv. Struct. Chem. Imaging 2 (6) (2016).
[70] N. Keshava, A survey of spectral unmixing algorithms, Lincoln Lab. J. 14 (2003) 55–78.
[71] J.M. Bioucas-Dias et al., Hyperspectral unmixing overview: geometrical, statistical, and sparse regression-based approaches, IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 5 (2012) 354–379.
[72] J.A. Hunt, D.B. Williams, Electron energy-loss spectrum-imaging, Ultramicroscopy 38 (1991) 47–73.
[73] F. de la Peña et al., Mapping titanium and tin oxide phases using EELS: an application of independent component analysis, Ultramicroscopy 111 (2011) 169–176.
[74] C. Jutten, J. Herault, Blind separation of sources, part I: an adaptive algorithm based on neuromimetic architecture, Signal Process. 24 (1991) 1–10.
[75] N. Dobigeon, N. Brun, Spectral mixture analysis of EELS spectrum-images, Ultramicroscopy 120 (2012) 25–34.
[76] N. Bonnet, D. Nuzillard, Independent component analysis: a new possibility for analysing series of electron energy loss spectra, Ultramicroscopy 102 (2005) 327–337.
[77] J.M.P. Nascimento, J.M.B. Dias, Vertex component analysis: a fast algorithm to unmix hyperspectral data, IEEE Trans. Geosci. Remote Sens. 43 (2005) 898–910.
[78] M. Duchamp et al., Compositional study of defects in microcrystalline silicon solar cells using spectral decomposition in the scanning transmission electron microscope, Appl. Phys. Lett. 102 (2013) 133902.
[79] I.S. Dhillon, S. Sra, Generalized nonnegative matrix approximations with Bregman divergences, Adv. Neural Inf. Process. Syst. 19 (2005) 283–290.
[80] N. Dobigeon, S. Moussaoui, M. Coulon, J.Y. Tourneret, A.O. Hero, Joint Bayesian endmember extraction and linear unmixing for hyperspectral imagery, IEEE Trans. Signal Process. 57 (2009) 4355–4368.
[81] J. MacQueen, Some methods for classification and analysis of multivariate observations, in: L.M. Le Cam, J. Neyman (Eds.), Proc. Fifth Berkeley Sympos. Math. Statist. and Probability: Vol I: Statistics, University of California Press, 1967, pp. 281–297.
[82] J.A. Hartigan, M.A. Wong, Algorithm AS 136: a k-means clustering algorithm, J. R. Stat. Soc. Ser. C (Appl. Stat.) 28 (1979) 100–108.
[83] A. Belianinov et al., Identification of phases, symmetries and defects through local crystallography, Nat. Commun. 6 (2015) 7801.
[84] S. Jesse et al., Big data analytics for scanning transmission electron microscopy ptychography, Sci. Rep. 6 (2016) 26348.
[85] E. Strelcov et al., Deep data analysis of conductive phenomena on complex oxide interfaces: physics from data mining, ACS Nano 8 (2014) 6449–6457.
[86] E. Strelcov, A. Belianinov, Y.-H. Hsieh, Y.-H. Chu, S.V. Kalinin, Constraining data mining with physical models: voltage- and oxygen pressure-dependent transport in multiferroic nanostructures, Nano Lett. 15 (2015) 6650–6657.
[87] M. Ester, H. Kriegel, J. Sander, X. Xu, A density-based algorithm for discovering clusters in large spatial databases with noise, 1996, pp. 226–231. doi:citeulike-article-id:3509601.
[88] L.T. Stephenson, M.P. Moody, P.V. Liddicoat, S.P. Ringer, New techniques for the analysis of fine-scaled clustering phenomena within atom probe tomography (APT) data, Microsc. Microanal. 13 (2007) 448–463.
[89] H. Deschout, A. Shivanandan, P. Annibale, M. Scarselli, A. Radenovic, Progress in quantitative single-molecule localization microscopy, Histochem. Cell Biol. 142 (2014) 5–17.
[90] A. Mazouchi, J.N. Milstein, Fast optimized cluster algorithm for localizations (FOCAL): a spatial cluster analysis for super-resolved microscopy, Bioinformatics 32 (2015) 747–754.
[91] P.A. Midgley, M. Weyland, 3D electron microscopy in the physical sciences: the development of Z-contrast and EFTEM tomography, Ultramicroscopy 96 (2003) 413–431.
[92] M. Weyland, P.A. Midgley, Electron tomography, Mater. Today 7 (2004) 32–40.
[93] R. Leary, Z. Saghi, P.A. Midgley, D.J. Holland, Compressed sensing electron tomography, Ultramicroscopy 131 (2013) 70–91.
[94] B. Goris, T. Roelandts, K.J. Batenburg, H. Heidari Mezerji, S. Bals, Advanced reconstruction algorithms for electron tomography: from comparison to combination, Ultramicroscopy 127 (2013) 40–47.
[95] K.J. Batenburg et al., 3D imaging of nanomaterials by discrete tomography, Ultramicroscopy 109 (2009) 730–740.
[96] S. Van Aert, K.J. Batenburg, M.D. Rossell, R. Erni, G. Van Tendeloo, Three-dimensional atomic imaging of crystalline nanoparticles, Nature 470 (2011) 374–377.
[97] J.R. Jinschek et al., 3-D reconstruction of the atomic positions in a simulated gold nanocrystal based on discrete tomography: prospects of atomic resolution electron tomography, Ultramicroscopy 108 (2008) 589–604.
[98] K.J. Batenburg, A network flow algorithm for reconstructing binary images from discrete X-rays, J. Math. Imaging Vis. 27 (2007) 175–191.
[99] S. Bals et al., Three-dimensional atomic imaging of colloidal core-shell nanocrystals, Nano Lett. 11 (2011) 3420–3424.
[100] Z. Saghi et al., Three-dimensional morphology of iron oxide nanoparticles with reactive concave surfaces. A compressed sensing-electron tomography (CS-ET) approach, Nano Lett. 11 (2011) 4666–4673.
[101] O. Nicoletti et al., Three-dimensional imaging of localized surface plasmon resonances of metal nanoparticles, Nature 502 (2013) 80–84.
[102] B. Goris et al., Measuring lattice strain in three dimensions through electron microscopy, Nano Lett. 15 (2015) 6996–7001.
[103] B. Goris et al., Three-dimensional elemental mapping at the atomic scale in bimetallic nanocrystals, Nano Lett. 13 (2013) 4236–4241.
[104] E.J. Kirkland, Computation in electron microscopy, Acta Crystallogr. Sect. A Found. Adv. 72 (2016) 1–27.
[105] W. Rieping, M. Habeck, M. Nilges, Inferential structure determination, Science 309 (2005) 303–306.
[106] M. Habeck, W. Rieping, M. Nilges, Weighting of experimental evidence in macromolecular structure determination, Proc. Natl. Acad. Sci. USA 103 (2006) 1756–1761.
[107] D.A. Keen, R.L. McGreevy, Structural modeling of glasses using reverse Monte Carlo simulation, Nature 344 (1990) 423–425.
[108] R.L. McGreevy, Reverse Monte Carlo modelling, J. Phys.: Condens. Matter 13 (2001) R877–R913.
[109] T.C. Petersen, I. Yarovsky, I. Snook, D.G. McCulloch, G. Opletal, Structural analysis of carbonaceous solids using an adapted reverse Monte Carlo algorithm, Carbon 41 (2003) 2403–2411.

Please cite this article in press as: P.M. Voyles, Informatics and data science in materials microscopy, Curr. Opin. Solid State Mater. Sci. (2016), http://dx.doi.org/10.1016/j.cossms.2016.10.001


[110] J. Hwang et al., Nanoscale structure and structural relaxation in Zr50Cu45Al5 bulk metallic glass, Phys. Rev. Lett. 108 (2012) 195505.
[111] M. Yu, A.B. Yankovich, A. Kaczmarowski, D. Morgan, P.M. Voyles, Integrated computational and experimental structure refinement for nanoparticles, ACS Nano 10 (2016) 4031–4038.
[112] B. Meredig, C. Wolverton, A hybrid computational–experimental approach for automated crystal structure solution, Nat. Mater. 12 (2012) 123–127.
[113] A.J. Logsdail, Z.Y. Li, R.L. Johnston, Development and optimization of a novel genetic algorithm for identifying nanoclusters from scanning transmission electron microscopy images, J. Comput. Chem. 33 (2012) 391–400.
[114] L. Jones, K.E. MacArthur, V.T. Fauske, A.T.J. van Helvoort, P.D. Nellist, Rapid estimation of catalyst nanoparticle morphology and atomic-coordination by high-resolution z-contrast electron microscopy, Nano Lett. 14 (2014) 6336–6341.
[115] C.L. Jia et al., Determination of the 3D shape of a nanoscale crystal with atomic resolution from a single image, Nat. Mater. (2014), http://dx.doi.org/10.1038/nmat4087.
[116] J. Hwang, J.Y. Zhang, A.J. D’Alfonso, L.J. Allen, S. Stemmer, Three-dimensional imaging of individual dopant atoms in SrTiO3, Phys. Rev. Lett. 111 (2013) 266101.
[117] R. Ishikawa, A.R. Lupini, S.D. Findlay, T. Taniguchi, S.J. Pennycook, Three-dimensional location of a single dopant with atomic precision by aberration-corrected scanning transmission electron microscopy, Nano Lett. 14 (2014) 1903–1908.
[118] J.M. LeBeau, A.J. D’Alfonso, N.J. Wright, L.J. Allen, S. Stemmer, Determining ferroelectric polarity at the nanoscale, Appl. Phys. Lett. 98 (2011) 052904.
[119] J. Hwang, J.Y. Zhang, J. Son, S. Stemmer, Nanoscale quantification of octahedral tilts in perovskite films, Appl. Phys. Lett. 100 (2012) 191909.
[120] G. D’Agostini, Bayesian inference in processing experimental data: principles and basic applications, Rep. Prog. Phys. 66 (2003) 1383–1419.
[121] E. Callaway, The revolution will not be crystallized, Nature 525 (2015) 172–174.
[122] Hyperspy. Hyperspy. http://dx.doi.org/10.5281/zenodo.57882.
[123] EELSModel. Available at:
