CytoSpec - an APPLICATION FOR HYPERSPECTRAL IMAGING


 


 

PULL DOWN MENU "IMAGE MANIPULATION"

 

Image Menu
 
    Help is available for the following functions of the 'image manipulation' pull down menu:
     
      chemical maps (chemical imaging or functional group mapping)
      frequency maps (imaging based on band frequencies)
      3D deconvolution (image re-assembling based on 3D Fourier self-deconvolution)
      HCA maps (image re-assembling based on hierarchical clustering)
      PCA maps (image re-assembling based on principal component analysis)
      Synthon imaging (image re-assembling based on Synthon's NeuroDeveloper(TM) ANN simulator)
      ANN maps (image re-assembling based on artificial neural network analysis)

CHEMICAL IMAGING


     
    Chemical Imaging (or functional group mapping): This function produces chemical images, which display defined spectral parameters, such as the absorbance at a given frequency, as a function of spatial position. Spectral parameters are color encoded according to the selected Colormap. The CytoSpec program offers the following methods for chemical imaging:
     
      Method A: calculates the integrated area in the region P1, P2 (no baseline correction)
      Method B: obtains the absorbance value at a given frequency P1
      Method C: calculates the integrated area in the region P1, P2 (trapezoidal baseline correction)
      Method D: obtains a baseline corrected absorbance value at a given frequency P5. The baseline is obtained from the points P1 and P2
      Methods A-D/E-H: calculate ratios using any combination of methods for two regions or absorbances at a given frequency
chemical mapping
    1. First select the imaging method. You can choose between methods A-D by activating the appropriate radio button. If you wish to use ratios for chemical imaging, you additionally have to select the 'ratio' button and the method used in the denominator (methods E-H).
     
    2. Next, type the wavenumber values P1-P6 that are used as integration borders or as frequency points for constructing baselines. Note that the number of values depends on the integration method.
     
    3. Select the data block on which chemical imaging should be carried out. The following data blocks can be used for chemical imaging: original data (i), preprocessed data (ii), derivatives (iii), and 3D-FSD data (iv).
     
    4. Pressing the 'plot' button calculates the respective parameters from the selected spectral data block and plots these parameters as a function of (x,y) position within the map.
     
    5. Button 'cancel': The window for chemical imaging is closed.
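The four methods listed above can be sketched in a few lines of Python. This is only an illustration, not CytoSpec's own code: the function names, the linear interpolation at the borders, and the reading of 'trapezoidal baseline correction' as subtraction of the straight line through the spectrum at P1 and P2 are assumptions.

```python
def interpolate(x, y, x0):
    """Linearly interpolate the spectrum (x ascending) at position x0."""
    for i in range(len(x) - 1):
        if x[i] <= x0 <= x[i + 1]:
            t = (x0 - x[i]) / (x[i + 1] - x[i])
            return y[i] + t * (y[i + 1] - y[i])
    raise ValueError("position outside the spectral range")


def window(x, y, p1, p2):
    """Points between p1 and p2, with interpolated values at both borders."""
    lo, hi = min(p1, p2), max(p1, p2)
    xs = [lo] + [xi for xi in x if lo < xi < hi] + [hi]
    return xs, [interpolate(x, y, xi) for xi in xs]


def trapezoid_area(xs, ys):
    """Area under the polyline (xs, ys) by the trapezoidal rule."""
    return sum(0.5 * (ys[i] + ys[i + 1]) * (xs[i + 1] - xs[i])
               for i in range(len(xs) - 1))


def method_a(x, y, p1, p2):
    """Integrated area between P1 and P2, no baseline correction."""
    return trapezoid_area(*window(x, y, p1, p2))


def method_b(x, y, p1):
    """Absorbance at frequency P1."""
    return interpolate(x, y, p1)


def method_c(x, y, p1, p2):
    """Integrated area above the straight line through P1 and P2 (assumed)."""
    xs, ys = window(x, y, p1, p2)
    baseline_area = 0.5 * (ys[0] + ys[-1]) * (xs[-1] - xs[0])
    return trapezoid_area(xs, ys) - baseline_area


def method_d(x, y, p1, p2, p):
    """Absorbance at p minus the linear baseline through P1 and P2."""
    y1, y2 = interpolate(x, y, p1), interpolate(x, y, p2)
    baseline = y1 + (p - p1) * (y2 - y1) / (p2 - p1)
    return interpolate(x, y, p) - baseline


# Hypothetical 1 x 2 pixel map; the second pixel is a scaled copy of the first.
wavenumbers = [1000, 1010, 1020, 1030, 1040]
spectrum = [0.1, 0.2, 0.5, 0.2, 0.1]
cube = [[spectrum, [2 * v for v in spectrum]]]

# Ratio image: method C in the numerator, method B in the denominator.
ratio_map = [[method_c(wavenumbers, s, 1000, 1040) / method_b(wavenumbers, s, 1020)
              for s in row] for row in cube]
```

Because a ratio cancels any common scaling factor, both pixels of the example map receive the same value, which is one reason ratio images are often less sensitive to sample thickness than single-band images.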
     

FREQUENCY IMAGING / FREQUENCY MAPS


     
    Frequency imaging: This function permits visualization of peak positions, and their variations, within hyperspectral maps. Band positions - either maxima or minima - are obtained and plotted as a function of the spatial coordinates.
     
    Band positions can be obtained from all types of data blocks, including also derivative spectra (for details refer to chapter Internal Data Organization).
     
    Two different methods of the 'frequency map' routine are available: Peak maxima/minima can be obtained from spectra contained in one of the four data blocks, or for overlapping bands from second derivative data blocks (this may also be second derivatives from spectra of the derivative data block!).
    The use of second derivatives for peak picking may be useful for detecting peaks in the presence of strongly overlapping signals, e.g. when the band is only a small shoulder on a strong signal. To some extent, second derivatives also compensate for baseline effects. Derivatives should be used with care because noise is considerably amplified.
     
    The algorithm of the peak picking routine works as follows:
     
    A. If the option 'obtain peak positions from derivatives' was NOT selected:
     
      1. Spectra of the data block of your choice (see option 'use data block') are interpolated. For interpolation, the spline method is used. Interpolation is carried out in the frequency range indicated in the edit fields 'select spectral region for peak search'. Furthermore, a factor of interpolation can be selected. This factor indicates how many times the number of data points will be increased by interpolation (only in the spectral region selected for peak picking).
       
      2. If the option 'search maxima' is checked, the algorithm searches for the x-positions (frequencies) of the maxima within the spectral region indicated by the user. If minima are chosen, the program searches for minima. IMPORTANT: If band positions are obtained from the data block of derivative spectra, note that maxima appear in second derivative spectra as minima and vice versa. The program does not check the order of the derivative and consequently does not compensate for the inversion of maxima and minima!
       
      3. The frequency values of the maxima/minima are color scaled and plotted as a function of the spatial coordinates. If you wish to further analyze the band positions by other programs you can access the data matrix of frequency values by using the Export Maps function.
       
    B. The option 'obtain peak positions from derivatives' was checked:
     
      1. Second derivative spectra from the data block of your choice (see option 'use data block') are calculated by applying the Savitzky-Golay algorithm with 5 smoothing points (see also chapter Calculation of Derivative Spectra).
       
      2. Derivative spectra are interpolated. For interpolation, the spline method is used. Interpolation is carried out in the frequency range indicated in the edit fields 'select spectral region for peak search'. Furthermore, a factor of interpolation can be selected. This factor indicates how many times the number of data points will be increased upon interpolation in the spectral region selected for peak picking.
       
      3. If the option 'search maxima' is checked, the algorithm searches for the x-positions (frequencies) of the maxima within the spectral region indicated by the user. If minima are chosen, the program searches for minima. IMPORTANT: If band positions are obtained from the data block of derivative spectra, note that maxima appear in second derivative spectra as minima and vice versa. The program does not check the order of the derivative and consequently does not compensate for the inversion of maxima and minima!
       
      4. The frequency values of the maxima/minima are color scaled and plotted as a function of the spatial coordinates. If you wish to further analyze the band positions by other programs you can access the data matrix of frequency values by using the Export Maps function.
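The inversion of maxima and minima in second derivatives is easy to see in a small numerical sketch. The code below is an illustration under stated assumptions, not CytoSpec's routine: it uses the standard 5-point Savitzky-Golay second-derivative weights (2, -1, -2, -1, 2)/7, assumes equidistant data points, and replaces the spline interpolation step by a simple three-point parabolic refinement to keep the example short. All function names are hypothetical.

```python
import math


def sg_second_derivative(y):
    """5-point Savitzky-Golay second derivative (quadratic fit).

    Values are in units of 1/step^2; the two points at each end stay 0.0.
    """
    weights = (2, -1, -2, -1, 2)
    d = [0.0] * len(y)
    for i in range(2, len(y) - 2):
        d[i] = sum(w * y[i + k - 2] for k, w in enumerate(weights)) / 7.0
    return d


def peak_position(x, y, lo, hi, find_max=True):
    """Refined x-position of the maximum (or minimum) of y within [lo, hi].

    Stand-in for the spline step: a parabola through the extreme point and
    its two neighbours (assumes equidistant x values).
    """
    idx = [i for i, xi in enumerate(x) if lo <= xi <= hi]
    i = max(idx, key=lambda j: y[j]) if find_max else min(idx, key=lambda j: y[j])
    if 0 < i < len(x) - 1:
        denom = y[i - 1] - 2 * y[i] + y[i + 1]
        if denom != 0:
            shift = 0.5 * (y[i - 1] - y[i + 1]) / denom
            return x[i] + shift * (x[i + 1] - x[i])
    return x[i]


# Synthetic Gaussian band centred at 1652.3, sampled every 2 wavenumbers.
x = [1600 + 2 * i for i in range(51)]
y = [math.exp(-0.5 * ((xi - 1652.3) / 8.0) ** 2) for xi in x]
d2 = sg_second_derivative(y)

band_from_spectrum = peak_position(x, y, 1620, 1685, find_max=True)
band_from_derivative = peak_position(x, d2, 1620, 1685, find_max=False)
wrong_choice = peak_position(x, d2, 1620, 1685, find_max=True)
```

Searching the second derivative for a minimum recovers the band position, whereas searching it for a maximum lands on one of the positive side lobes several wavenumbers away from the band centre.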
       
frequency imaging
    search maxima or minima: indicate whether you want to search for peak maxima or minima.
     
    obtain peak positions from derivatives: if this checkbox is checked, second derivatives of the chosen data block will be calculated. In this case, maxima or minima will be obtained from second derivative spectra. Otherwise (if the box is NOT checked), peak positions are obtained directly from the chosen data block.
     
    interpolation factor: this factor indicates how many times the number of data points will be increased by the interpolation.
     
    select spectral region for peak search: the spectral region which will be used for peak picking.
     
    use data block: please choose the appropriate data block on which peak picking should be carried out.
     
    plot: the peak picking routine is started
     
    cancel: the 'frequency plot' window will be closed.
     

HCA Imaging


     
    Hierarchical Cluster Analysis (HCA): This imaging function performs hierarchical cluster analysis of hyperspectral data and can be used to display the results as HCA images and dendrograms. Spectra with negative Quality Test results, or unselected Regions of Interest are excluded from the analysis and appear in HCA images as black pixels. For each spectral class, or cluster, average spectra and standard deviation spectra are calculated which can be stored in an ASCII data format. Furthermore, it is also possible to store, or load, distance data and the results of hierarchical clustering.
     

 
window for hierarchical clustering


 

Distance measures:
1. D-Values
2. Euclidean distances
3. normalized Euclidean distances
4. Euclidean squared distances
5. city block

Clustering methods:
1. average linkage
2. single linkage
3. complete linkage
4. group average
5. centroid method
6. median algorithm
7. Ward's algorithm
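As a rough illustration of three of the distance measures named above (Euclidean, Euclidean squared, city block), the following sketch computes pairwise distances between spectra. The function names are hypothetical and the code is not taken from CytoSpec; D-values and normalized Euclidean distances are left out because their exact definition is not given here.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two spectra of equal length."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def squared_euclidean(a, b):
    """Euclidean squared distance."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def city_block(a, b):
    """City block (Manhattan) distance."""
    return sum(abs(p - q) for p, q in zip(a, b))


def distance_matrix(spectra, metric):
    """Symmetric matrix of pairwise distances, the input to clustering."""
    n = len(spectra)
    return [[metric(spectra[i], spectra[j]) for j in range(n)] for i in range(n)]


spectra = [[0.0, 0.0], [3.0, 4.0], [3.0, 0.0]]
matrix = distance_matrix(spectra, euclidean)
```

A hierarchical clustering routine would then repeatedly merge the two closest clusters, with the chosen clustering method deciding how the distance between merged clusters is updated.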

 

cluster images and cluster average spectra

PCA Imaging


     
    Principles of PCA: PCA is a linear transformation in which the (spectral) data are transferred into a new coordinate system. In this new coordinate system, the largest data variance points to the direction of the first coordinate, which is also called the first principal component (pc), the second largest variance on the second pc, and so forth. PCA is therefore a transformation that re-arranges the data according to the data's intrinsic variance: most of the variance is contained in the lower-order principal components while higher-order pc's are supposed to contain mainly noise. Reduction of dimensionality by PCA can be effectively achieved by omitting higher-order principal components.
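The idea that the first principal component points in the direction of the largest variance can be sketched with a small power-iteration example. This is a minimal illustration in plain Python, not the algorithm CytoSpec uses; the function name, the start vector, and the toy data are assumptions.

```python
import math


def first_principal_component(data, iterations=200):
    """Return (direction, variance, scores) of the first principal component.

    Uses power iteration on the covariance matrix. The start vector is an
    arbitrary choice; it must not be orthogonal to the first component.
    """
    n, m = len(data), len(data[0])
    mean = [sum(row[j] for row in data) / n for j in range(m)]
    centered = [[row[j] - mean[j] for j in range(m)] for row in data]
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(m)]
           for i in range(m)]
    v = [1.0 / (j + 1) for j in range(m)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(m)) for i in range(m)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    variance = sum(v[i] * sum(cov[i][j] * v[j] for j in range(m))
                   for i in range(m))
    scores = [sum(r[j] * v[j] for j in range(m)) for r in centered]
    total_variance = sum(cov[i][i] for i in range(m))
    return v, variance, scores, total_variance


# Toy data set: five "spectra" with two strongly correlated variables.
data = [[1, 1.1], [2, 1.9], [3, 3.2], [4, 3.8], [5, 5.1]]
direction, variance, scores, total = first_principal_component(data)
explained = variance / total
```

For this toy data set the first component lies close to the diagonal and carries nearly all of the variance, so the second component could be omitted with little loss of information.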
     
     

 
PCA imaging

 
    How to start PCA imaging: PCA imaging can be started either from the 'Image Manipulations --> PCA maps' pull down menu or from the menu 'Multivariate Statistics --> PC analysis (PCA)'. In the first case you have to load a PCA file (*.pca) obtained in an earlier program session. Please refer to the chapter Multivariate Statistics --> PCA maps if you want to produce PCA images directly from a hyperspectral data set.
     
    Define the number of dimensions: to specify which PCs should be used for imaging, check the appropriate checkboxes of the column 'use value'. In the example above the principal components one and two are activated. The CytoSpec program permits the use of the first 10 principal components.
     
    Definition of individual score coefficients: first, select the principal components to be displayed from the popup menus in the upper part of the PCA imaging window (PC x- or y-axis). Then, click into the score plot window to the right. The respective coordinates of this action are transferred to the edit fields indicated as 1st to 10th score. Alternatively, it is possible to manually type the respective coordinates into these boxes.
     
    Normalization of the score coefficients: checking the checkbox 'normalize scores' causes normalization of distances between score coefficients and 'mass centers' such that the maximum distance for all principal components equals one.
     
    save PCs: if this option is chosen, the first 10 principal components are stored (see description of the button 'save' below for details).
     
    What does 'fix value' mean? This option fixes the coordinates of a given mass center in the n-th dimension, irrespective of mouse manipulations in the plot to the right. This option may be useful when searching for an optimal contrast of a PCA image.
     
    Imaging: using the actual coordinates of the mass centers, PCA images are plotted into the lower right panel of the main window.
     
    load: opens a standard window for opening files of the format *.pca.
     
    save: saves score coefficients and principal components. The file extension will be *.pca. If the checkbox 'save PCs' is checked, the first 10 principal components are stored as separate double column ASCII files. These files are stored in the same directory as the *.pca file.
     
    cancel: closes the PCA imaging window. Data not stored are lost.
     
    Note that spectra with negative Quality Test results, or unselected Regions of Interest are excluded from the analysis and appear in PCA images as black pixels.
     
     

3D FOURIER SELF-DECONVOLUTION (FSD)


 
3D deconvolution

    Example: This example shows the results of 3D-FSD (data acquired with a 64 x 64 mid-infrared MCT focal plane array detector). For chemical imaging, the absorbance values at 1731 wavenumbers of the original spectra (upper plot) and of the FSD data block (lower panel) were color encoded and plotted as a function of (x,y) position. The panel to the left displays the original spectra (blue) and the FSD spectra (red).
     
    Important:
    3D Fourier self-deconvolution can be performed only on the original data (data block 1). This function is only suited for envelopes much broader than the spatial/spectral resolution. Avoid oscillatory patterns due to over-deconvolution!
     
     

ANN IMAGING


     
    This function of the CytoSpec program is designed to re-assemble hyperspectral images on the basis of classification results of artificial neural networks (ANN). In this function, result files (*.res) of the Stuttgart Neural Network Simulator (SNNS) are analyzed and directly converted into false-colored ANN maps. Furthermore, checks for activation thresholds and multiple activations can be carried out.
     
    The Network Simulator was developed at the "Institut für Parallele und Verteilte Höchstleistungsrechner" (IPVR) of the Universität Stuttgart (Germany). The SNNS can be downloaded for free at ftp://ftp.informatik.uni-stuttgart.de.
ANN Imaging (SNNS)

xdim in ANN map: the number of measurement points in x-direction.
 
ydim in ANN map: the number of measurement points in y-direction.
 
threshold for activations: threshold which defines the minimal allowed output activation. Activations below this threshold are set to zero and the black color is assigned to the corresponding pixel. If you want to omit this test then you can set this threshold to 0.
 
threshold for multiple activations: threshold which defines the maximum allowed activation of the second-highest activated neuron. If the second-highest activation of a spectrum is larger than this threshold, the spectrum fails the test and the black color is again assigned to the corresponding pixel. One can omit this test by setting the threshold to 1.
 
image: opens the standard windows file browser. Please select a path and a valid SNNS result file.
 
cancel: aborts the ANN imaging function
 

    IMPORTANT: Please check carefully the sequence of the input pattern (spectra) when compiling the SNNS pattern file. Note that only the pattern sequence will define the spatial (x,y) positions of individual spectra. Use the option 'include output patterns' when saving '*.res' files. SNNS result-files should have the following format (see example below).
     
    Format of SNNS result files:
     
      SNNS result file V1.4-3D
      generated at Tue Mar 25 11:24:47 1997
       
      No. of patterns     : 980
      No. of input units  : 76
      No. of output units : 4
      startpattern        : 1
      endpattern          : 980
      teaching output included
      #1.1
      1 0 0 0
      0.06227 0.82028 0.01888 0.00005
      #2.1
      1 0 0 0
      0.97587 0.35621 0.00001 0.00089
      #3.1
      1 0 0 0
      0.99435 0.28643 0.00001 0.00046
       
        .....
       
      #979.1
      1 0 0 0
      0 0.05502 0.98514 0.21558
      #980.1
      1 0 0 0
      0 0.14128 0.81746 0.61632
    In the example given above, 980 spectra obtained from a rectangular area (20 x 49 spectra) were analyzed. The number of pre-defined classes in the teaching phase of ANN model development was four.
     
    The first line of the first pattern (#1.1) shows the a priori class assignment (target pattern), while the second line displays the ANN test results for this particular spectrum. The maximum activation was found for the second output neuron (0.82028), indicating an a posteriori assignment of this individual spectrum to class 2. CytoSpec automatically analyzes the a posteriori assignments for all spectral sub-patterns contained in the *.res file and assigns a specific color to each class. Images are produced by combining colors with the spatial (pixel) positions of the spectra, assuming rectangular regions and equal distances between pixels in x- and y-direction, respectively.
     
    The following types of ANNs were tested to be compatible:
     
    • multilayer perceptron (MLP) networks consisting of three layers of neurons (input layer, hidden layer, output layer)
    • ANNs with feed-forward propagation of activations, shortcut connections are allowed
    • teaching functions: backpropagation, resilient backpropagation (rprop), quickpropagation (quickprop)
       
    CytoSpec also permits basic tests for multiple activations (i.e. whether the second-highest activation is larger than a defined threshold) and for a required minimum activation. If a spectrum fails one of these tests, the black color is assigned to the corresponding pixel. Additional information on the test results can be found in the command line window (and the log-file).
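How a result file of the format shown above could be turned into a class map can be sketched as follows. This is an illustration only, not CytoSpec's implementation: the function names are hypothetical, each target and output pattern is assumed to fit on one line, and the row-by-row assignment of patterns to (x,y) positions is an assumption.

```python
SAMPLE = """SNNS result file V1.4-3D
teaching output included
#1.1
1 0 0 0
0.06227 0.82028 0.01888 0.00005
#2.1
1 0 0 0
0.97587 0.35621 0.00001 0.00089
#3.1
1 0 0 0
0 0.05502 0.98514 0.21558
#4.1
1 0 0 0
0 0.14128 0.81746 0.61632
"""


def parse_res(text):
    """Return a list of (target, output) pairs; header lines are skipped."""
    lines = [ln.strip() for ln in text.splitlines()]
    patterns = []
    i = 0
    while i < len(lines):
        if lines[i].startswith("#") and i + 2 < len(lines):
            target = [float(v) for v in lines[i + 1].split()]
            output = [float(v) for v in lines[i + 2].split()]
            patterns.append((target, output))
            i += 3
        else:
            i += 1
    return patterns


def classify(output, min_activation=0.0, max_second=1.0):
    """Return the 1-based winner class, or 0 (black pixel) if a test fails."""
    ranked = sorted(output, reverse=True)
    second = ranked[1] if len(ranked) > 1 else 0.0
    if ranked[0] < min_activation or second > max_second:
        return 0
    return output.index(ranked[0]) + 1


def ann_map(classes, xdim, ydim):
    """Arrange the pattern sequence row by row (assumed ordering)."""
    if len(classes) != xdim * ydim:
        raise ValueError("number of patterns does not match xdim * ydim")
    return [classes[r * xdim:(r + 1) * xdim] for r in range(ydim)]


outputs = [out for _, out in parse_res(SAMPLE)]
no_tests = ann_map([classify(o) for o in outputs], xdim=2, ydim=2)
strict = ann_map([classify(o, min_activation=0.9, max_second=0.5)
                  for o in outputs], xdim=2, ydim=2)
```

With the default thresholds (0 and 1) both tests are effectively switched off and every pattern receives its winner class; with stricter thresholds some pixels turn black (class 0).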
     
     

Imaging by using Synthon's NeuroDeveloper(TM) network simulator


 
    The function called 'Synthon maps' is basically an interface between CytoSpec and Synthon's NeuroDeveloper (TM), a software package for teaching and validating artificial neural network models with spectra of various origins (e.g. IR, Raman, MS spectra). Based on neural network models, the interface can be used to re-assemble ANN images from CytoSpec's original data set. Spectral pre-processing, feature selection and ANN classification of a priori unknown, spatially resolved IR maps can easily be performed in one step. This is achieved by utilizing the runtime environment of the NeuroDeveloper software, which does not require a software license from Synthon. Spectral data can therefore be classified without the NeuroDeveloper software on the basis of predefined network libraries. The NeuroDeveloper is, however, required if you wish to create and validate your own neural network models.
Synthon GmbH, contact address:
Analytics and Pattern Recognition
Im Neuenheimer Feld
69120 Heidelberg
GERMANY
phone: +49 6221 50 257 900
fax: +49 6221 50 257 909
email: info@synthon-analytics.com
internet: http://www.synthon-analytics.de  

 
    To start the function select 'Synthon maps' from the 'image manipulation' pull down menu:
Imaging by using Synthon's NeuroDeveloper(TM) networks
 

load: allows you to browse the directory structure and load the NeuroDeveloper network library (*.snt) file. After loading, the 'image' and 'stats' buttons are activated.
 
image: displays the ANN map in CytoSpec's main window.
 
stats: gives an overview of the ANN classification statistics (see screenshot below).
 
use NeuroDeveloper winner assignments: if this checkbox is checked, CytoSpec uses the NeuroDeveloper settings for ANN classification. The NeuroDeveloper settings can be modified by unselecting this checkbox.
 
use ND WTA criterion: the NeuroDeveloper settings for the Winner Takes All (WTA) criterion are used. You can modify these values by deselecting this checkbox (see button 'define' and chapter below).
 
use ND 406040 criterion: the NeuroDeveloper settings for the 40 / 60 / 40 (406040) criterion are used. You can modify these values by deselecting this checkbox (see button 'define' and chapter below).
 
use ND extrapolation criterion: the NeuroDeveloper settings for the extrapolation criterion are used. This option can be deactivated.
 
define: opens a window for entering your own WTA and 406040 criteria (see screenshot to the left).
 
cancel: aborts the 'Synthon maps' function.
 


 
    Spectra that have failed the tests appear within the ANN maps as black pixels.
     
    The activations calculated by the network are evaluated with the analysis functions WTA and 40-20-40 (the latter is labeled '406040' in the CytoSpec dialog). This evaluation takes into account the scores and the distribution of the activations of the output neurons.
     
    WTA criteria:
     
    WTA stands for 'winner takes all', which means that the classification depends on the highest output activation. A spectrum is classified only if its highest output activation is greater than the defined minimum activation of the winner neuron (default 0.7) and its distance to the next-highest activation is greater than the defined minimum distance (default 0.3). Otherwise the classification is not considered correct and the spectrum remains unclassified.
     
    '406040' criteria
     
    The 40-20-40 function works differently. The activation of one neuron has to exceed 0.6 (default, above 60 percent of the activation range). All other activations of further classes have to be below 0.4 (below 40 percent of the activation range). Otherwise, the pattern remains unclassified.
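The two criteria can be written down compactly. The sketch below uses the default values quoted above; the function names are hypothetical, and treating values exactly on a threshold as failures is an assumption.

```python
def passes_wta(activations, min_winner=0.7, min_distance=0.3):
    """Winner-takes-all test: strong winner, clearly ahead of the runner-up."""
    ranked = sorted(activations, reverse=True)
    second = ranked[1] if len(ranked) > 1 else 0.0
    return ranked[0] > min_winner and (ranked[0] - second) > min_distance


def passes_406040(activations, upper=0.6, lower=0.4):
    """One activation above `upper`, all remaining activations below `lower`."""
    ranked = sorted(activations, reverse=True)
    return ranked[0] > upper and all(a < lower for a in ranked[1:])


clear_winner = [0.06227, 0.82028, 0.01888, 0.00005]
two_candidates = [0.0, 0.14128, 0.81746, 0.61632]
weak_winner = [0.65, 0.1, 0.1, 0.1]

results = {
    "clear_winner": (passes_wta(clear_winner), passes_406040(clear_winner)),
    "two_candidates": (passes_wta(two_candidates), passes_406040(two_candidates)),
    "weak_winner": (passes_wta(weak_winner), passes_406040(weak_winner)),
}
```

The third pattern shows that the two tests are not equivalent: an activation of 0.65 is too low for the WTA default of 0.7 but still satisfies the second criterion.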
     
    'extrapolation' criterion
     
    A general problem with classification methods is the potential for misclassification due to undesired or unexpected extrapolation. This occurs when the training and validation datasets do not comprise all classes, or the entire range of a feature, needed for a given classification problem. In this case, any classification method, including ANNs, would not be representative for the given problem. Data of this type should rather be termed not classified. The NeuroDeveloper uses a distance value derived from the training and validation datasets to detect an extrapolation problem. The maximum distance of a pattern to its corresponding class is calculated and set to 100. During the classification of a new pattern by the ANN, the distance of the new pattern is calculated and expressed relative to this maximum. If the calculated extrapolation value for the class identified by the neural network exceeds 100, an extrapolation occurs. The proposed default threshold above which patterns are treated as unclassified is 200. The threshold should be set to a value greater than 100; the smaller the value, the stricter the threshold.
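The scaling described above can be illustrated with a short sketch. The text does not say which distance measure the NeuroDeveloper uses, so the Euclidean distance to a class center is purely an assumption here, as are the function names.

```python
import math


def extrapolation_value(pattern, class_center, max_training_distance):
    """Distance of a new pattern to its class, scaled so that the largest
    distance seen in training/validation corresponds to 100.

    Assumption: Euclidean distance to a class center.
    """
    distance = math.dist(pattern, class_center)
    return 100.0 * distance / max_training_distance


def is_unclassified(value, threshold=200.0):
    """A pattern is rejected if its extrapolation value exceeds the threshold."""
    return value > threshold


center = [0.0, 0.0]
max_training_distance = 2.5

inside = extrapolation_value([1.5, 2.0], center, max_training_distance)
far_away = extrapolation_value([6.0, 8.0], center, max_training_distance)
```

In this toy case the first pattern is as far from the class center as the most distant training pattern (value 100) and is accepted, while the second pattern lies four times further out (value 400) and would be rejected with the default threshold of 200.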
     
    Please note
     
    If the checkbox 'use NeuroDeveloper winner assignments' was checked, CytoSpec does not perform an analysis of the WTA, 406040 and extrapolation results. The NeuroDeveloper software excludes spectra from further analysis in the following way:
     
    1. both the WTA and the 406040 criteria are failed, or
    2. the classification fails the extrapolation criterion.
     
    The definition of 'failed classifications' in CytoSpec is different. CytoSpec defines a spectrum as unclassified if it fails any one of the three criteria. For this reason, the classification statistics may depend on the program used for analysis.
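The difference between the two exclusion rules can be expressed as two small predicates. This sketch reads the NeuroDeveloper rule as "WTA and 406040 both failed, or extrapolation failed", which is an interpretation of the description above; the function names are hypothetical.

```python
def excluded_by_neurodeveloper(wta_ok, c406040_ok, extrapolation_ok):
    """Assumed reading: excluded if WTA and 406040 both fail,
    or if the extrapolation criterion fails."""
    return (not wta_ok and not c406040_ok) or not extrapolation_ok


def excluded_by_cytospec(wta_ok, c406040_ok, extrapolation_ok):
    """Excluded as soon as any one of the three criteria fails."""
    return not (wta_ok and c406040_ok and extrapolation_ok)


# A spectrum that fails only the WTA test.
only_wta_failed = (False, True, True)
nd_result = excluded_by_neurodeveloper(*only_wta_failed)
cs_result = excluded_by_cytospec(*only_wta_failed)
```

A spectrum that fails only one of the WTA and 406040 tests is therefore kept under the first rule but appears as a black pixel under the second, which explains why the classification statistics can differ.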
     
    Screenshot of the NeuroDeveloper(TM) classification statistics window:
     
NeuroDeveloper(TM) classification statistics

 
    A. Compilation of data sets for teaching and internal validation
1. Load an IR data set and produce an IR spectral map (e.g. chemical map, HCA map)
2. Obtain the context menu of IR maps by clicking with the right mouse button over the infrared spectral map.
3. Choose 'class 1' --> and 'start' if you want to assign spectra to class 1. Now, you are in the 'select spectra' mode. In this mode, the mouse cursor changes its appearance (arrow plus cross).
4. You can now select an unlimited number of spectra by left mouse clicks (in this mode, spectra will not be displayed). The spatial coordinates are printed in the command line window.
5. To stop the selection mode, choose 'selection mode off' from the context menu. Alternatively, you can immediately start to assign spectra to class 2 by selecting 'class 2' --> and 'start' from the context menu. In this way, spectra can be assigned to up to 10 distinct classes.
6. If all spectra are selected, stop the selection mode by 'selection mode off'. In the normal 'show spectra' mode the mouse pointer will regain its normal appearance (arrow).
7. In order to export spectra, select 'export' --> 'x,y ASCII' from the file pull down menu. A window with the title 'convert into a (x,y) ASCII data format' appears. Check the checkbox 'export selection'. Please use the default settings for all other options. Make sure that the data block of original absorbance spectra is exported.
8. Press the 'export' button and store the spectra in a folder of your choice. Spectra of class 1 can be identified by the extension '*_1.dat', spectra of class 2 are named '*_2.dat', and so forth.
9. Steps 1-8 should be repeated for a number of maps. It is recommended to use consistent class assignments for identical (histological) structures.
10. Split the spectral data into a subset for teaching (ca. 65% of the spectra) and a subset for internal validation (ca. 35%).
    Now you can load the spectral data into the NeuroDeveloper software. Perform class assignment and pre-processing. Teach and validate the ANNs and store the network (see Synthon's NeuroDeveloper software manual for details).
    The network file (*.snt) contains all relevant information for pre-processing and classification. This file and Synthon's run-time environment (NOT the NeuroDeveloper!) are required for classification.
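The split into teaching and validation subsets (step 10 above) could be scripted along the following lines. This is a sketch under assumptions: the file names are hypothetical examples of the '*_1.dat', '*_2.dat' naming scheme, and splitting each class separately (so that both subsets contain all classes) is a design choice that is not prescribed by the procedure.

```python
import random
import re
from collections import defaultdict


def class_of(filename):
    """Extract the class number from names such as 'map01_0007_2.dat'."""
    match = re.search(r"_(\d+)\.dat$", filename)
    if match is None:
        raise ValueError("no class suffix in file name: " + filename)
    return int(match.group(1))


def split_by_class(filenames, teaching_fraction=0.65, seed=0):
    """Return (teaching, validation) lists, split separately for each class."""
    groups = defaultdict(list)
    for name in filenames:
        groups[class_of(name)].append(name)
    rng = random.Random(seed)  # fixed seed makes the split reproducible
    teaching, validation = [], []
    for cls in sorted(groups):
        names = sorted(groups[cls])
        rng.shuffle(names)
        n_teach = round(len(names) * teaching_fraction)
        teaching += names[:n_teach]
        validation += names[n_teach:]
    return teaching, validation


# Hypothetical export: 20 spectra of class 1 and 20 spectra of class 2.
files = ["map01_%04d_%d.dat" % (i, cls) for cls in (1, 2) for i in range(20)]
teaching, validation = split_by_class(files)
```

With 20 spectra per class this yields 13 teaching and 7 validation spectra for each class, i.e. the 65 / 35 ratio recommended above.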

     
    B. Produce NeuroDeveloper maps (e.g. for external validation)
1. Load an IR data set of absorbance spectra. It is recommended to produce first an IR map (e.g. chemical map) from original spectra.
2. Select 'Synthon maps' from the 'Image manipulation' pull down menu. A window entitled 'create NeuroDeveloper maps' will appear. Press the 'load' button and select one of NeuroDeveloper's Network files (*.snt).
3. Spectra from the data block of original data are now written to a temporary file (in CytoSpec's root folder, please make sure that sufficient free disk space is present). The data are then pre-processed and classified by Synthon's run-time environment. After this, the classification results are automatically transferred back to CytoSpec. When finished, you can immediately press the 'image' button that causes CytoSpec to display the NeuroDeveloper map. If you wish to modify the NeuroDeveloper exclusion criteria such as WTA, 406040, or extrapolation, uncheck the respective checkboxes and press 'define'. Change the settings, close the window and press 'image'. Classification statistics are available by pressing the 'stats' button of the 'create NeuroDeveloper maps' window.

 
 


Copyright (c) 2000-2008 CytoSpec. All rights reserved.