PCA Image Processing

 

This article describes the analysis of a specific type of experiment, in which a sequence of images is acquired at regular steps in energy. The resulting data set is effectively an image in which each pixel contains a spectrum, from which spatially resolved quantitative information can be extracted. The advantages of such an experiment are:

  1. Given pulse-counted images, the resulting images are suitable for proper quantification.
  2. Chemical state images are possible.
  3. Background subtraction is performed in a fashion analogous to normal quantification of spectra, so image artefacts due to background variation are removed.
  4. Statistical techniques can be applied to the data set as a whole to offer alternative views of the data and enhance the signal-to-noise in resulting images.
  5. The background information can be used to offer layer information in image format.

 

Experiment (Kratos Vision 2.x)

 

The Acquisition Manager in Vision 2.x offers a means of defining a sequence of images, each acquired at a regular step in energy. The set of acquisition regions is defined using the Interpolate option, which is available for acquisition regions when in Imaging Mode. The procedure is as follows:

  1. Switch the menu to Imaging Mode. When in Imaging Mode, a panel below the Regions Definition table will be visible, offering a text-field for entering an increment and a button labelled Interpolate.
  2. In the Regions Table enter two acquisition regions. The first region defines the start energy for the sequence of images and the second region defines the end energy for the last image in the sequence. Ensure these two regions are entered with identical acquisition times.
  3. Ensure the highlight marker is over the second of the two regions and enter the desired energy interval into the text-field labelled Increment; then press the Interpolate button.

 

The result of these steps is a set of acquisition regions in the Regions Table. There is a limit on the number of images that can be defined in this fashion (about 200); if more images are required, a second acquisition object must be defined and entered into the Vision Manager flow chart.
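As a rough illustration of what the Interpolate option produces, the sketch below generates the list of image energies from a start energy, an end energy and an increment. The function name, the example energies and the constant MAX_IMAGES are hypothetical; the value 200 is only the approximate limit quoted above, and none of this is Vision 2.x code.

```python
def interpolate_energies(start_ev, end_ev, increment_ev):
    """Return the energy of every image in the sequence, end points included.

    If the span is not a whole number of increments, the number of steps is
    rounded to the nearest integer (an assumption made for this sketch).
    """
    if increment_ev <= 0:
        raise ValueError("increment must be positive")
    n_steps = int(round(abs(end_ev - start_ev) / increment_ev))
    sign = 1.0 if end_ev >= start_ev else -1.0
    return [start_ev + sign * i * increment_ev for i in range(n_steps + 1)]


MAX_IMAGES = 200  # approximate limit quoted in the text, not an exact figure

# Hypothetical example: a 10 eV window stepped at 0.5 eV.
energies = interpolate_energies(1190.0, 1200.0, 0.5)
needs_second_object = len(energies) > MAX_IMAGES
```

A check of this kind before acquisition shows whether the sequence fits in one acquisition object or must be split across two.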

 

Processing the Image Set

 

The signal-enhancing steps help to reveal the true information within the image set. These enhancements are achieved using PCA and PCA-related algorithms to partition the useful information from the noise and so create a set of images with reduced noise content. Figure 1 shows a set of images before and after processing using PCA. The two images in the first column are displayed after PCA has been applied, while the second column contains the same two images before processing.

 

There are two ways to approach enhancing a set of energy-stepped images. The PCA procedure can be applied either in the spatial domain (that is, treating each image as a vector) or, with the help of SVD Sort, in the energy domain, where the raw images are converted to spectra and the spectra are treated as the vectors. The data shown in Figure 1 have been processed in the spatial domain. In general, however, applying the noise partitioning in the energy domain offers a greater opportunity to suppress the noise in the data set.
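The difference between the two domains lies only in how the same stack of images is arranged into vectors. A minimal NumPy sketch, assuming the image set is held as an array indexed by (energy, row, column); the array names and the small synthetic stack are illustrative only:

```python
import numpy as np

rng = np.random.default_rng(0)
n_energy, n_rows, n_cols = 21, 8, 8

# Synthetic stand-in for an energy-stepped image set.
stack = rng.poisson(50.0, size=(n_energy, n_rows, n_cols)).astype(float)

# Spatial domain: one vector per image, each of length n_rows * n_cols.
images_as_vectors = stack.reshape(n_energy, n_rows * n_cols)

# Energy domain: one vector per pixel, each a spectrum of length n_energy.
spectra_as_vectors = stack.reshape(n_energy, -1).T
```

For a 128 by 128 pixel image set, the energy domain therefore supplies 16384 vectors, one per pixel, rather than one vector per image.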

 

 

 

Figure 1

 

 

SVD Sort and PCA Noise Reduction

 

The following describes how to reduce the noise in a data set using the energy domain. The steps described below are intended to provide an understanding of the method as well as the mechanics of performing the analysis. Once a data set has been assessed, the all-in-one buttons under the TFA Predict section greatly reduce the sequence of steps required for an analysis. See the section detailing the Image Processing dialog window.

 

The starting point for an analysis is a set of images as shown in the second column of Figure 1. These images represent a sweep of energies (kinetic energy appears as the experimental variable in the right-hand-side of the Experiment Frame), where in this case the step size is 0.5 eV. The images are converted to an equivalent set of spatially resolved spectra, from which a single image is created; the intensity at each pixel is determined by integrating the area between the signal and a background defined at each and every pixel in the image. In this example, the advantage of performing this type of experiment is that the final image is free of artefacts due to sample charging (albeit small shifts in energy) and the computed intensity is unaffected by variations in the background itself.
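The integration described above can be sketched as follows, assuming the image set is held as a NumPy array indexed by (energy, row, column) and that a straight-line background between the first and last energy channels is adequate. The background type, the function name and the synthetic test data are assumptions made for this example; in practice the background is whatever is defined in the quantification region.

```python
import numpy as np


def area_image(stack, step_ev):
    """Integrate every pixel spectrum above a straight-line background.

    stack has shape (n_energy, n_rows, n_cols); the result has shape
    (n_rows, n_cols) and units of counts times eV.
    """
    n_energy = stack.shape[0]
    weights = np.linspace(0.0, 1.0, n_energy).reshape(-1, 1, 1)
    # Linear background joining the first and last channel of each pixel.
    background = (1.0 - weights) * stack[0] + weights * stack[-1]
    return (stack - background).sum(axis=0) * step_ev


# Synthetic 2 by 2 pixel data: the same peak shape with a different amplitude
# and a different sloping background at each pixel.
energies = np.arange(0.0, 20.5, 0.5)
peak = np.exp(-0.5 * (energies - 10.0) ** 2)
amplitude = np.array([[1.0, 2.0], [3.0, 4.0]])
offset = np.array([[10.0, 50.0], [100.0, 5.0]])
slope = np.array([[0.0, 1.0], [-2.0, 0.5]])
stack = offset + slope * energies[:, None, None] + amplitude * peak[:, None, None]

image = area_image(stack, step_ev=0.5)
```

In this synthetic case the resulting image is proportional to the peak amplitude at each pixel, whatever the offset or slope of the background underneath it.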

 

The processing steps are as follows:

  1. Convert the images to an equivalent set of spectra. Overlay the set of raw images in the Active Display Tile. Click on the Active Tile to enable the toolbar buttons and select the Image Processing dialog window (Figure 2) from the Options menu.

 

 

  2. Press the button labelled Convert Images to Spectra on the Image Processing property page. The images in the Active Tile must all correspond to regular energy increments. Provided the images do correspond to a sequence of regular energy intervals, a new Experiment Frame is created containing a column of VAMAS blocks, one for each row in the original image, where each VAMAS block contains a spectrum for each column in the original image. Figure 3 shows the spectra from the 61st column (index starts at zero) overlaid in the Active Tile. To move through the columns of spectra, use the Control key together with the Page Up/Page Down/Home/End keys.

Figure 2 Image Processing Dialog Window

Figure 3

  3. Extracting the useful information from the entire set of 128 by 128 pixels requires the use of SVD Sort. SVD Sort is a method for combing large data sets (in this case 16384 spectra) for the useful information in the data; it results in a small number of vectors from which all the original data can be approximated in a least-squares sense. For SVD Sort to be successful, the underlying set of vectors must be over-specified by the original data, which is clearly the case for the spectra shown in Figure 3. To perform an SVD Sort on a set of spectra, overlay the data in the Active Tile (as shown in Figure 3), enter the number of SVD scans in the text-field on the Image Processing property page and press the SVD Sort button. The SVD Sort button performs the operation on each spectrum within a VAMAS block (in this example there are 128 spectra per VAMAS block) and then follows the process through each of the VAMAS blocks. After each invocation, the most significant vectors are moved to the top of the list and appear in the first corresponding variable in each VAMAS block (that is, the spectra visible when Ctrl Home is pressed). The SVD Sort button can be pressed multiple times, or the number of SVD scans can be set to the desired number of operations; the appropriate number of SVD scans depends on the number of vectors exhibiting information rather than noise. In this example, ten scans result in the information content shuffling into the top twelve VAMAS blocks (Figure 4).

 

Figure 4

  4. The next step is to extract the useful vectors into the same Experiment Frame and then finish the data organization by performing a PCA. In the right-hand-side of the Experiment Frame, select the top set of VAMAS blocks containing the useful information. These are essentially those VAMAS blocks for which the transformed data do not look like noise, plus a couple more for luck. Press the Copy Factors button on the Image Processing dialog window, check the list of VAMAS blocks to be copied and press OK. The Processing Data Only tick-box should be ticked. A new column of VAMAS blocks will be created in the Experiment Frame.
  5. Select the newly created VAMAS blocks and overlay these data in the Active Tile. Then press the PCA button. The data previously ordered by the SVD Sort will now be adjusted to ensure all the vectors are mutually orthogonal and in a state which can be used to recreate the raw data using only the most significant abstract factors.
  6. Return the original column of VAMAS blocks to their unsorted state. This is accomplished by overlaying the first column in the Active Tile, invoking the Spectrum Processing dialog window and pressing the Reset All button on the Processing History property page.
  7. Next, determine the number of factors required to represent the data. Typically this involves inspecting the abstract factors and rejecting those after the last abstract factor for which a recognizable shape is present. The rejected abstract factors are considered to be due to noise.
  8. Display the VAMAS blocks containing the abstract factors in the Active Tile and select the raw data VAMAS blocks in the right-hand-side. Enter the number of abstract factors considered to be representative of the data (in this example the chosen number is two) in the TFA Predict section and press the Predict button. The raw spectra are replaced by predicted spectra, where each predicted spectrum is a linear combination of the chosen number of most significant abstract factors displayed in the Active Tile. Figure 5 shows a comparison of the raw and the predicted spectrum for one pixel.

Figure 5

 

  9. To create the final processed image(s) (Figure 6), define quantification region(s) on each VAMAS block containing a spectrum from a pixel, overlay these spectra in the Active Tile and press the Convert Regions to Images button on the Image Processing property page.
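The PCA and Predict steps in the procedure above amount to finding a small set of orthonormal abstract factors and reproducing every raw spectrum as a least-squares combination of them. The sketch below illustrates that idea with NumPy. It is not the CasaXPS implementation: a full SVD stands in for the SVD Sort and PCA sequence, and the function names and synthetic two-component data are assumptions made for this example.

```python
import numpy as np


def abstract_factors(spectra, n_factors):
    """Return the leading n_factors orthonormal factors (one per row).

    spectra has one spectrum per row. No mean-centring is applied.
    """
    _, _, vt = np.linalg.svd(spectra, full_matrices=False)
    return vt[:n_factors]


def predict(spectra, factors):
    """Least-squares reproduction of each spectrum from orthonormal factors."""
    return (spectra @ factors.T) @ factors


# Synthetic data: 400 pixel spectra built from two overlapping peaks.
rng = np.random.default_rng(1)
energies = np.linspace(0.0, 20.0, 41)
components = np.vstack([
    np.exp(-0.5 * (energies - 8.0) ** 2),
    np.exp(-0.5 * (energies - 12.0) ** 2),
])
clean = rng.uniform(50.0, 200.0, size=(400, 2)) @ components
noisy = clean + rng.normal(0.0, 5.0, size=clean.shape)

factors = abstract_factors(noisy, n_factors=2)
predicted = predict(noisy, factors)

raw_error = np.linalg.norm(noisy - clean)
predicted_error = np.linalg.norm(predicted - clean)
```

Comparing raw_error with predicted_error gives a numerical counterpart to the visual comparison of raw and predicted spectra: the predicted spectra lie closer to the noise-free data because the noise carried by the rejected factors has been discarded.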

 

Figure 6

 

 

 

 

The Image Processing Dialog Window (Figure 2): A Quick Summary of Features

 

 The Convert Images to Spectra button creates a set of VAMAS blocks containing spectra corresponding to each pixel from a set of energy stepped images. Overlay a set of images in the Active Tile and press the Convert Images to Spectra button. The images must be assigned sequential energy values defined by the experimental variable and these energies must be part of a sequence defined using a common energy increment.

 

 

 The Convert Regions to Images button applies to an image data set in which pixel information is represented by spectra, that is, a column of VAMAS blocks, each holding one row of pixel spectra, so that the number of VAMAS blocks defines the number of rows within the image. Pressing the button creates new images from the Quantification Regions defined on these spectra. At least one Region must be defined on each VAMAS block, and those VAMAS blocks used to create the image must be overlaid in the Active Tile. The Tag field in the region determines the type of quantification information used to create the image. Keywords in the Tag field indicate the required image type (all in lower-case characters):

 

area        RSF adjusted peak area (CPSeV)
height      RSF adjusted peak height above background (CPS)
position    Position of the peak maximum relative to the background
centroid    Position of the peak centroid relative to the peak background

 

If the Tag field in the Quantification Region is none of the above keywords, the image is created from the RSF- and transmission-corrected area in CPSeV (wherever possible).
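The way a Tag keyword selects the quantity placed in each pixel can be sketched as a simple dispatch. The sketch below makes several assumptions that are not taken from CasaXPS: a straight-line background between the region end points, RSF adjustment interpreted as division by the RSF, no transmission correction, and "position" interpreted as the energy at which the background-subtracted signal is greatest. The function name is hypothetical.

```python
import numpy as np


def region_value(energies, counts, tag, rsf=1.0):
    """Return the quantity selected by tag for one pixel spectrum."""
    weights = (energies - energies[0]) / (energies[-1] - energies[0])
    background = (1.0 - weights) * counts[0] + weights * counts[-1]
    signal = counts - background
    step = energies[1] - energies[0]
    if tag == "height":
        return signal.max() / rsf
    if tag == "position":
        return energies[np.argmax(signal)]
    if tag == "centroid":
        # Assumes a non-zero net signal within the region.
        return (energies * signal).sum() / signal.sum()
    # "area" and any unrecognised tag fall back to the corrected area.
    return signal.sum() * step / rsf


# Synthetic spectrum: a peak of height 100 at 10 eV on a sloping background.
energies = np.arange(0.0, 20.5, 0.5)
counts = 30.0 + 2.0 * energies + 100.0 * np.exp(-0.5 * (energies - 10.0) ** 2)

values = {tag: region_value(energies, counts, tag)
          for tag in ("area", "height", "position", "centroid", "anything else")}
```

Applying such a function to every pixel spectrum, one tag at a time, yields one image per tagged region.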

 

 

 The Convert Components to Images button applies to an image data set in which pixel information is represented by spectra, that is, a column of VAMAS blocks, each holding one row of pixel spectra, so that the number of VAMAS blocks defines the number of rows within the image. New images are created from component intensities in CPSeV. A peak model must be defined on each VAMAS block and the VAMAS blocks displayed in the Active Tile. On pressing the Convert Components to Images button, the peak model is fitted to each spectrum in the image and an image is generated for each component in the peak model, plus a figure-of-merit image.

 

Keywords in the Tag field of a component indicate the required image type (all in lower-case characters):

 

position    Position of the peak maximum relative to the background
fwhm        The Full Width at Half Maximum of each component

 

If the Tag field in the Component is none of the above keywords, the image is created from the RSF- and transmission-corrected area in CPSeV (wherever possible).

 

 Given a set of images overlaid in the Active Tile, each image is divided by the sum of all the images and normalised to 100%. If the images are generated from quantification items such as Regions or Components, then the result of the Quantify Images button is a set of Atomic Concentration images displayed in a new Experiment Frame.
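The normalisation performed by the Quantify Images button can be sketched as below. The function name is hypothetical, and the treatment of pixels where the summed intensity is zero (set to 0%) is an assumption of this example rather than documented behaviour.

```python
import numpy as np


def quantify_images(images):
    """Divide each image by the pixel-wise sum of all images, as a percentage."""
    stack = np.asarray(images, dtype=float)
    total = stack.sum(axis=0)
    with np.errstate(divide="ignore", invalid="ignore"):
        # Pixels with no signal in any image are set to zero (assumption).
        percent = np.where(total > 0, 100.0 * stack / total, 0.0)
    return percent


image_a = np.array([[1.0, 3.0], [0.0, 2.0]])
image_b = np.array([[3.0, 1.0], [0.0, 2.0]])
percent_a, percent_b = quantify_images([image_a, image_b])
```

At every pixel with non-zero total intensity, the output images sum to 100%.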

 

 The Reduce Size button sums sets of four pixels in squares to produce images with half the original pixel dimension. Overlay a set of images in the Active Tile and press the Reduce Size button. A new Experiment Frame is created in which the reduced size images appear.
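The two-by-two summation can be written compactly with a reshape. This is a minimal sketch assuming even image dimensions; the function name is hypothetical.

```python
import numpy as np


def reduce_size(image):
    """Sum each 2 by 2 square of pixels, halving both image dimensions."""
    rows, cols = image.shape
    if rows % 2 or cols % 2:
        raise ValueError("image dimensions must be even")
    return image.reshape(rows // 2, 2, cols // 2, 2).sum(axis=(1, 3))


image = np.arange(16.0).reshape(4, 4)
reduced = reduce_size(image)
```

Because the pixels are summed rather than averaged, the total counts in the image are preserved, so a 128 by 128 image becomes a 64 by 64 image with four times the counts per pixel.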

 

 To create a line scan from one or more images, overlay the images in the Active Tile, mark out a line on the image display using the left-hand mouse button and a drag action, then press the Create Line Scan button. The dimensions of the line scan are in pixel coordinates. The X-axis will display distance when the display mode is not binding energy; if real distances are required, the distance dimensions must be calibrated elsewhere.
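A line scan in pixel coordinates can be sketched as sampling the image along the marked line. The nearest-pixel sampling, the choice of one sample per pixel step along the longer axis, and the function name are assumptions of this example; the interpolation actually used by the software is not stated.

```python
import numpy as np


def line_scan(image, start, end):
    """Sample image along the line from start to end, both (row, col) pixels.

    Returns (distance, intensity), with distance in pixel units.
    """
    (r0, c0), (r1, c1) = start, end
    n_points = int(max(abs(r1 - r0), abs(c1 - c0))) + 1
    rows = np.rint(np.linspace(r0, r1, n_points)).astype(int)
    cols = np.rint(np.linspace(c0, c1, n_points)).astype(int)
    distance = np.linspace(0.0, np.hypot(r1 - r0, c1 - c0), n_points)
    return distance, image[rows, cols]


image = np.arange(25.0).reshape(5, 5)
distance, intensity = line_scan(image, start=(0, 0), end=(4, 4))
```

Converting the pixel distances to physical units requires a separately determined pixel size, in line with the calibration remark above.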

 

 The Copy Factors button copies the chosen SVD-sorted vectors to a new column, in preparation for their use in TFA prediction of the original data.

 

 An alternative SVD Sort strategy in which the vectors are ordered using a ten dimensional subspace filter.

 An alternative SVD Sort strategy in which the vectors are ordered using a subspace with dimension n equal to twice the value of the text-field.

 

 A complete ordering of a data set using the SVD Sort algorithm, in which the set of n vectors is scanned n times. Generally, it is quicker and just as accurate to use the SVD Sort with fewer SVD scans.

 

 The text-field is used to input the number of SVD Sort scans required when the SVD Sort button is applied. The number of SVD Sort scans should be greater than the number of expected abstract factors for a given data set. If the expected number of factors is four (say), then a safe number of SVD scans might be ten. The important thing to monitor is the convergence of the vectors: all vectors excluded from the final PCA should have converged to exhibit only noise-like structure.

 

To perform an SVD Sort, enter the desired number of scans in the text-field, overlay the data in the Active Tile and press the SVD Sort button. When spectra are displayed in the Active Tile, the SVD Sort will work on all spectra stored in the corresponding variables.

 

 An extension of the SVD Sort option, where each scan through the set of vectors produces a vector and this vector is then projected out of all the vectors in the list before applying the next SVD Sort. The result is a set of orthogonal vectors spanning the original vector space. The number of vectors identified and projected out of the data set as a whole is specified in the No Scans text-field.

 

 The Nonlinear Iterative Partial Least Squares (NIPALS) method for computing the significant abstract factors is applied to the data overlaid in the Active Tile. The number of vectors sought is specified in the No Scans text-field. A one-stop enhancement button using the NIPALS method is also available on the TFA Predict panel.
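For readers unfamiliar with NIPALS, a minimal sketch of the textbook algorithm is given below. It is not the CasaXPS implementation: the function name, the choice of starting vector, the convergence test and the synthetic data are all choices made for this example, and the data are not mean-centred. Each factor is found by iterating between a score vector and a loading vector, after which its contribution is subtracted from the data before the next factor is sought.

```python
import numpy as np


def nipals(data, n_factors, n_iter=500, tol=1e-12):
    """Return (scores, loadings) for the leading n_factors of data.

    data has one vector per row; loadings has one orthonormal factor per row.
    n_factors must not exceed the rank of data.
    """
    residual = np.array(data, dtype=float)
    scores, loadings = [], []
    for _ in range(n_factors):
        # Start from the column with the largest sum of squares.
        t = residual[:, np.argmax((residual ** 2).sum(axis=0))].copy()
        for _ in range(n_iter):
            p = residual.T @ t
            p /= np.linalg.norm(p)
            t_new = residual @ p
            converged = np.linalg.norm(t_new - t) <= tol * np.linalg.norm(t_new)
            t = t_new
            if converged:
                break
        scores.append(t)
        loadings.append(p)
        # Deflate: remove this factor before searching for the next one.
        residual = residual - np.outer(t, p)
    return np.array(scores).T, np.array(loadings)


# Synthetic data with known, well-separated singular values.
rng = np.random.default_rng(2)
u, _ = np.linalg.qr(rng.normal(size=(30, 4)))
v, _ = np.linalg.qr(rng.normal(size=(12, 4)))
singular_values = np.array([10.0, 5.0, 1.0, 0.5])
data = (u * singular_values) @ v.T

scores, loadings = nipals(data, n_factors=2)
```

The appeal of the method for large image data sets is that only the requested number of factors is computed, rather than a full decomposition.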

 

 

 Perform a PCA, via a Singular Value Decomposition, on those data displayed in the Active Tile. The PCA is performed directly on the data channels, without the shifting or other adjustments available on the Spectrum Processing PCA property page.

 

 Data can be pre-processed prior to SVD Sorting and/or PCA using these options. Dividing data by the square root of the counts per bin suppresses the influence of noise on the structure within an image or spectrum. These buttons represent a simple yet effective means of reducing noise artefacts in the abstract factors by offering the forward and inverse transformation steps.

 

To operate on multiple VAMAS blocks, the data from the VAMAS blocks should be overlaid in the Active Tile before pressing either of these buttons.
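Dividing counts by the square root of the counts is the same as taking the square root, and squaring is the inverse step. The sketch below illustrates, for simulated pulse-counted (Poisson) data, why this helps: after the forward transformation the noise has roughly the same size whatever the count rate. The function names and the simulated count rates are assumptions made for this example.

```python
import numpy as np


def sqrt_forward(counts):
    """Forward step: counts / sqrt(counts), that is, sqrt(counts)."""
    return np.sqrt(np.clip(counts, 0.0, None))


def sqrt_inverse(transformed):
    """Inverse step: square the transformed data."""
    return np.square(transformed)


rng = np.random.default_rng(3)
low = rng.poisson(100.0, size=20000).astype(float)
high = rng.poisson(1000.0, size=20000).astype(float)

# Noise grows with signal in the raw data (roughly 10 and 32 counts here)...
raw_spread = (low.std(), high.std())
# ...but is close to 0.5 at both count rates after the transformation.
sqrt_spread = (sqrt_forward(low).std(), sqrt_forward(high).std())

round_trip = sqrt_inverse(sqrt_forward(low))
```

With the noise level equalised in this way, intense channels no longer dominate the decomposition purely because they carry more noise.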

 

 Display valid PCA abstract factors in the Active Tile and use the right-hand-side of the Experiment Frame to select those images/spectra to which the TFA prediction is to be applied. Enter the number of abstract factors to be used in the TFA prediction step in the text-field and press the Predict button.

 

 One Stop Data Enhancement

 

These buttons offer a means of enhancing the signal-to-noise in a data set at a single press of a button. Once the number of significant abstract factors has been determined and entered into the given text-field (No AFs), the data set currently overlaid in the Active Tile is processed into significant vectors and reconstructed from the number of significant abstract factors specified in the text-field. The buttons labelled with SQRT perform the operation using the recommended noise-suppressing SQRT/square transformation. The resulting spectra, calculated directly or indirectly from images, are ready for quantification.
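The chain of operations behind a one-stop SQRT button can be sketched as a single function: transform, decompose, reconstruct from the leading factors, and transform back. This is an illustration only, with a full SVD standing in for whichever factor-finding method the button uses; the function name, its parameters and the synthetic single-component Poisson data are assumptions of this example.

```python
import numpy as np


def enhance(counts, n_afs, use_sqrt=True):
    """Reconstruct counts (one spectrum per row) from n_afs abstract factors."""
    data = np.asarray(counts, dtype=float)
    if use_sqrt:
        data = np.sqrt(np.clip(data, 0.0, None))   # forward SQRT step
    _, _, vt = np.linalg.svd(data, full_matrices=False)
    factors = vt[:n_afs]
    predicted = (data @ factors.T) @ factors        # least-squares reconstruction
    return np.square(predicted) if use_sqrt else predicted


# Synthetic pulse-counted data: one spectral shape with varying intensity.
rng = np.random.default_rng(4)
energies = np.linspace(0.0, 20.0, 41)
shape = 20.0 + 200.0 * np.exp(-0.5 * (energies - 10.0) ** 2)
clean = np.outer(rng.uniform(0.5, 2.0, size=400), shape)
noisy = rng.poisson(clean).astype(float)

enhanced = enhance(noisy, n_afs=1)
raw_error = np.linalg.norm(noisy - clean)
enhanced_error = np.linalg.norm(enhanced - clean)
```

In this synthetic case enhanced_error is smaller than raw_error, which is the numerical counterpart of the improved signal-to-noise seen in the processed spectra.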

 

 These buttons represent direct computation on a pixel-by-pixel basis. The operands are defined by overlaying two images in the Active Tile. The first of the two images selected in the right-hand-side will correspond to a; the result of the computation appears in the VAMAS block corresponding to a. The Spectrum Processing dialog window can be used to reset the processed data to the original raw data.