CytoSpec - an APPLICATION FOR HYPERSPECTRAL IMAGING |
PULL DOWN MENU "MULTIVARIATE IMAGING" |
||||||||||||||||||||||||||||||
||||||||||||||||||||||||||||||
CREATE HCA MAPS FROM SPECTRA |
||||||||||||||||||||||||||||||
HCA imaging ('Multivariate Imaging' pull down menu).
Reformatting data sets for HCA imaging: The following data manipulations are performed to prepare data sets for HCA:
Important: Spectra with a negative Quality Test result, or unselected Regions of Interest, are automatically excluded from HCA imaging. Once data preparation has been finished, the HCA imaging dialog box will appear.
|
||||||||||||||||||||||||||||||
HCA Imaging |
||||||||||||||||||||||||||||||
Hierarchical Cluster Analysis (HCA): This imaging function performs hierarchical cluster analysis of hyperspectral data and can be used to display the results as HCA images and dendrograms. Spectra with negative Quality Test results, or unselected Regions of Interest, are excluded from the analysis and appear in HCA images as black pixels. For each spectral class, or cluster, average spectra and standard deviation spectra are calculated, which can be stored in an ASCII data format. Furthermore, it is also possible to store, or load, distance data and the results of hierarchical clustering.
Multivariate Imaging --> HCA imaging --> create HCA maps from spectra.
Part I - obtaining inter-spectral distances for clustering.
Part II - hierarchical clustering.
Part III - HCA imaging: reassembling HCA images, obtaining mean cluster spectra.
Part IV - an example of HCA imaging.
calc: starts the calculation of the distance matrix.
load: loads a distance matrix file (the file extension is '*.dis').
save: saves the distance matrix.
use shortcut: if this option is chosen, the calculation of the distance matrix AND hierarchical clustering are queued, that is, no further user input will be required. Please select a distance AND a cluster method before pressing the 'calc' button, even though the distance matrix has not yet been obtained. Note also that the distance matrix cannot be stored when this option is selected.
reduced HCA: This HCA imaging option is particularly useful when large data sets are analyzed. It is recommended to choose this option for data files containing more than 128 x 128 spectra. In reduced HCA, the calculation of the distance matrix and hierarchical clustering are carried out on the basis of randomly selected spectra. When finished, mean cluster spectra of the last 50 clusters are obtained. Then,
distance method: pop up menu that allows you to select one of the following methods for distance matrix calculation.
load: loads cluster analysis files. The file extension is '*.cls'.
save: saves cluster analysis results.
cluster method: here you can select a method for clustering:
image: displays the cluster image. The classes (clusters) are color encoded. The color sequence is determined by the active color map, usually the color map 'ann' (see function Display Spectra).
dendro: shows a dendrogram on the display. The dendrogram can be stored as a bitmap ('*.bmp') or as an encapsulated PostScript ('*.eps') data file. Both functions are available by activating the context menu of the dendrogram window. Note that dendrograms will show only the last 500 fusion steps.
spectra: this function produces and displays average spectra of each class. After clicking on the 'spectra' button, a dialog box for choosing the source data block comes up (see screenshot below). This data block is then used for averaging and for the creation of the respective standard deviation spectra.
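The averaging performed by the 'spectra' function can be illustrated with a minimal Python sketch. It is not CytoSpec code: the function name `cluster_statistics`, the use of `None` to mark excluded pixels, and the choice of the sample standard deviation (n - 1) are assumptions made for illustration only.

```python
import math
from collections import defaultdict


def cluster_statistics(spectra, labels):
    """Return {cluster: (mean_spectrum, std_spectrum)}.

    spectra: list of equally long lists of intensities, one per pixel.
    labels:  cluster index per spectrum; None marks an excluded pixel
             (hypothetical convention for failed quality test / ROI).
    """
    groups = defaultdict(list)
    for spectrum, label in zip(spectra, labels):
        if label is not None:  # excluded pixels do not contribute
            groups[label].append(spectrum)

    stats = {}
    for label, members in groups.items():
        n = len(members)
        columns = list(zip(*members))  # one tuple per spectral data point
        mean = [sum(col) / n for col in columns]
        # Assumption: sample standard deviation; 0.0 for a single spectrum.
        std = [
            math.sqrt(sum((v - m) ** 2 for v in col) / (n - 1)) if n > 1 else 0.0
            for col, m in zip(columns, mean)
        ]
        stats[label] = (mean, std)
    return stats
```

Each entry of the returned dictionary corresponds to one cluster: the first list is the average spectrum, the second the standard deviation spectrum of that cluster.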
disp averages: average spectra of the selected data block are calculated and plotted in the same color as in the cluster map.
save averages: average and standard deviation spectra of the selected data block are calculated. To store these spectra (double column ASCII), use the standard Windows dialog box to set the path and choose a file name. Average spectra and standard deviation spectra of the i-th cluster are stored by default in separate files (corename_i for the average spectra and corename_std_i for the standard deviation spectra).
Part IV - Example of HCA imaging:
Min-max normalized average spectra, which were encoded by the same color utilized for displaying the cluster in the HCA image (HCA was carried out on the basis of the file 'colon.cyt', which can be found in the /testdata/bin/CytoSpec/ directory).
Reference to the literature:
Lasch P, Haensch W, Naumann D, Diem M. Imaging of colorectal adenocarcinoma using FT-IR microspectroscopy and cluster analysis. Biochim Biophys Acta. 2004 1688(2):176-86.
Lasch P, Diem M, Hänsch W & Naumann D. Artificial neural networks as supervised techniques for FT-IR microspectroscopic imaging. Journal of Chemometrics 2007 Vol. 20(5):209-220. |
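The two calculation steps of HCA imaging (distance matrix, then hierarchical clustering) can be sketched in a few lines of Python. This is only an illustration and not the CytoSpec implementation: Euclidean distance and average linkage are assumed here because the distance and cluster methods offered by the dialog box are not listed on this page, and the function names are hypothetical.

```python
import math


def distance_matrix(spectra):
    """Part I: pairwise Euclidean distances (assumed distance method)."""
    n = len(spectra)
    return [[math.dist(spectra[i], spectra[j]) for j in range(n)] for i in range(n)]


def hierarchical_clusters(spectra, n_clusters):
    """Part II: agglomerative clustering with average linkage (assumed).

    Returns one class label (1..n_clusters) per spectrum.
    """
    dist = distance_matrix(spectra)
    clusters = [[i] for i in range(len(spectra))]  # start: one cluster per spectrum

    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Average linkage: mean distance over all member pairs.
                total = sum(dist[i][j] for i in clusters[a] for j in clusters[b])
                d = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # one fusion step
        del clusters[b]

    labels = [0] * len(spectra)
    for label, members in enumerate(clusters, start=1):
        for i in members:
            labels[i] = label
    return labels


if __name__ == "__main__":
    demo = [[0, 0], [0, 1], [10, 10], [10, 11], [5, 50]]
    print(hierarchical_clusters(demo, 3))
```

The naive search over all cluster pairs is slow for large maps, which illustrates why an option such as reduced HCA, working on a subset of the spectra, is useful for large data sets.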
||||||||||||||||||||||||||||||
KMC Imaging (k-means cluster imaging) |
||||||||||||||||||||||||||||||
In the CytoSpec implementation, MacQueen's k-means cluster algorithm is used. k-means clustering is a non-hierarchical clustering method which obtains a "hard" (crisp) class membership for each spectrum, that is, the class membership of an individual spectrum can take only the values zero or one. It uses an iterative algorithm to update randomly selected initial cluster centers and to obtain the class membership for each spectrum, assuming well-defined boundaries between the clusters.
MacQueen's iterative KMC algorithm can be described as follows: Spectra are represented as points in a p-dimensional space (p is the number of features of the spectra). In this space, k points are initially chosen, where each point represents a cluster to be formed. Then, distance values between these points and all objects (spectra) are calculated. Objects are assigned to a cluster on the basis of the minimal distance value. Next, centroids of the clusters are calculated and distance values between the centroids and each of the objects are re-calculated. If the closest centroid is not associated with the cluster to which the object currently belongs, the object switches its cluster membership to the cluster with the closest centroid. The centroid positions are re-calculated every time an object has changed its cluster membership. This continues until none of the objects is re-assigned.
When the function 'KMC imaging' is selected from the 'multivariate imaging' pull down menu, the following window comes up:
Reformatting data sets for KMC imaging: The following data manipulations are performed to prepare data sets for KMC:
Important: Spectra with a negative Quality Test result, or unselected Regions of Interest, are automatically excluded from KMC imaging. Once data preparation has been finished, the following KMC imaging dialog box will appear:
To start the KMC imaging function hit the 'image' button, to exit press 'cancel'. When the calculation is finished, the cluster map is immediately plotted in the axis of the preprocessed maps using the colormap 'ann'.
Button 'spectra': This option becomes available when the k-means clustering calculations are finished. When this button is pressed, a new window comes up that offers additional options such as the calculation of cluster mean spectra or sorting individual map spectra by their cluster membership.
save all spectra: all spectra used for KMC imaging are stored as double column ASCII data. Use the standard Windows dialog box to set the path and create the ASCII data files. File names: corenamex_i: x-th spectrum of the i-th cluster.
disp averages: average spectra of the selected data block are calculated and plotted in the same color as in the cluster map.
save averages: average and standard deviation spectra of the selected data block are calculated. To store these spectra (double column ASCII), use the standard Windows dialog box to set the path and choose a file name. Average spectra and standard deviation spectra of the i-th cluster are stored by default in separate files (corename_i for the average spectra and corename_std_i for the standard deviation spectra).
Reference to the literature: J.B. MacQueen. Some methods for classification and analysis of multivariate observations. In L.M. LeCam and J. Neyman (eds) Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability. 1967 281-297. |
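The iterative k-means procedure described in this section can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the CytoSpec code: squared Euclidean distance is assumed as the distance measure, the initial cluster centers are drawn at random from the spectra themselves, and the names `kmeans`, `seed` and `max_passes` are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two spectra (assumed measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(spectra, k, seed=0, max_passes=100):
    """Return (labels, centers) after MacQueen-style k-means clustering."""
    rng = random.Random(seed)
    # k randomly selected spectra serve as initial cluster centers.
    centers = [list(s) for s in rng.sample(spectra, k)]

    def nearest(s):
        return min(range(k), key=lambda c: sq_dist(s, centers[c]))

    def update(c):
        members = [s for s, lab in zip(spectra, labels) if lab == c]
        if members:  # keep the old center if a cluster became empty
            centers[c] = [sum(col) / len(members) for col in zip(*members)]

    # Initial assignment by minimal distance, then first centroids.
    labels = [nearest(s) for s in spectra]
    for c in range(k):
        update(c)

    for _ in range(max_passes):
        changed = False
        for i, s in enumerate(spectra):
            best = nearest(s)
            if best != labels[i]:
                old, labels[i] = labels[i], best
                # Centroids are re-calculated whenever an object switches.
                update(old)
                update(best)
                changed = True
        if not changed:  # no object was re-assigned: converged
            break
    return labels, centers
```

Because the initial centers are chosen at random, different seeds can lead to different cluster numbering and, for poorly separated data, to different partitions.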
||||||||||||||||||||||||||||||
FCM cluster imaging (fuzzy C-means cluster imaging) |
||||||||||||||||||||||||||||||
FCM clustering is a non-hierarchical clustering method. This clustering technique partitions objects into groups (clusters) whose members show a certain degree of similarity. Unlike k-means clustering, the output of FCM clustering is a membership function, which defines the degree of membership of a given spectrum to the clusters. The values of the membership function can vary between one (highest degree of cluster membership) and zero (no class membership), where the sum of the C cluster membership values for one object equals one. Thus, this method departs from the classical two-valued (0 or 1) logic and uses "soft" linguistic system variables and a continuous range of truth values in the interval [0,1].
FCM imaging uses a fuzzy iterative algorithm to calculate the class membership grade for each spectrum. The iterations in FCM clustering are based on minimizing an objective function, which represents the distance from any given data point (spectrum) to the actual cluster center, weighted by that data point's membership grade. The advantage of fuzzy C-means clustering over k-means clustering is that both outliers and data that display properties of more than one class can be characterized by assigning nonzero class membership values to several clusters. In the similarity maps assembled by FCM clustering, the membership values are encoded by the colormap, that is, by the color intensities in the case of single color colormaps.
When the function 'FCM imaging' is selected from the 'multivariate imaging' pull down menu, the following window comes up:
Reformatting data sets for FCM imaging: The following data manipulations are performed to prepare data sets for FCM:
Important: Spectra with a negative Quality Test result, or unselected Regions of Interest, are automatically excluded from FCM imaging. Once data preparation has been finished, the following FCM imaging dialog box will appear:
To start the FCM imaging function hit the 'FCM' button, to exit press 'cancel'. When the calculation is finished, select a cluster that should be used to reassemble the image ('display which cluster'). The single cluster image is plotted in the axis of the preprocessed maps using the colormap 'black'. A composite image can be created in a separate window by pressing the button 'composite image'.
Button 'spectra': This option becomes available when the fuzzy C-means clustering calculations are finished. When this button is pressed, a new window comes up that offers additional options such as the calculation of cluster mean spectra or sorting individual map spectra by their cluster membership. By pressing any of the buttons (except button 'done') a new FCM imaging approach is initiated by using the settings of the FCM imaging dialog box.
disp averages: average spectra of the selected data block are calculated and plotted.
save averages: average and standard deviation spectra of the selected data block are calculated. To store these spectra (double column ASCII), use the standard Windows dialog box to set the path and choose a file name. Average spectra and standard deviation spectra of the i-th cluster are stored by default in separate files (corename_i for the average spectra and corename_std_i for the standard deviation spectra).
Related web links: fuzzy C-means clustering (Wikipedia)
Reference to the literature: J.C. Bezdek. Pattern Recognition with Fuzzy Objective Function Algorithms, 1981 New York. Plenum Press. |
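The membership calculation of fuzzy C-means clustering described in this section can be sketched as follows. This is a textbook-style illustration and not the CytoSpec implementation: the fuzziness exponent m = 2, Euclidean distance, the initialization from randomly drawn spectra, and all function and parameter names are assumptions.

```python
import math
import random


def fuzzy_c_means(spectra, c, m=2.0, max_iter=200, tol=1e-6, seed=0):
    """Return (memberships, centers); each membership row sums to one."""
    rng = random.Random(seed)
    # Assumption: c randomly drawn spectra serve as initial cluster centers.
    centers = [list(s) for s in rng.sample(spectra, c)]
    power = 2.0 / (m - 1.0)
    dim = len(spectra[0])
    memberships = []

    for _ in range(max_iter):
        # Membership grades from the distances to the current centers.
        memberships = []
        for s in spectra:
            d = [math.dist(s, ctr) for ctr in centers]
            hits = [j for j, v in enumerate(d) if v == 0.0]
            if hits:  # spectrum coincides with one or more centers
                row = [1.0 / len(hits) if j in hits else 0.0 for j in range(c)]
            else:
                row = [
                    1.0 / sum((d[j] / d[k]) ** power for k in range(c))
                    for j in range(c)
                ]
            memberships.append(row)

        # New centers: means of all spectra weighted by membership ** m.
        new_centers = []
        for j in range(c):
            weights = [row[j] ** m for row in memberships]
            total = sum(weights)
            new_centers.append(
                [sum(w * s[x] for w, s in zip(weights, spectra)) / total
                 for x in range(dim)]
            )

        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:  # centers no longer move: converged
            break
    return memberships, centers
```

A single cluster image corresponds to one column of the returned membership table, reshaped to the map dimensions, so that each pixel intensity reflects the membership grade for the selected cluster.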
||||||||||||||||||||||||||||||
CREATE PCA MAPS FROM SPECTRA |
||||||||||||||||||||||||||||||
PCA imaging ('Multivariate Imaging' pull down menu).
Reformatting data sets for PCA imaging: The following data manipulations are performed to prepare data sets for PCA:
Important: Spectra with a negative Quality Test result, or unselected Regions of Interest, are automatically excluded from PCA imaging. Once data preparation has been finished, the PCA imaging dialog box will appear.
|
||||||||||||||||||||||||||||||
PCA Imaging |
||||||||||||||||||||||||||||||
Principles of PCA: PCA is a linear transformation in which the (spectral) data are transferred into a new coordinate system. In this new coordinate system, the direction of the largest data variance defines the first coordinate, which is also called the first principal component (pc), the direction of the second largest variance defines the second pc, and so forth. PCA is therefore a transformation that re-arranges the data according to the data's intrinsic variance: most of the variance is contained in the lower-order principal components, while higher-order pcs are supposed to contain mainly noise. Reduction of dimensionality by PCA can be effectively achieved by omitting higher-order principal components.
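The principle can be illustrated with a short Python sketch that extracts only the first principal component by power iteration on the covariance matrix. This is one simple way to obtain the direction of largest variance, chosen here for brevity; it is not a description of how CytoSpec computes PCA, and the function name and parameters are hypothetical.

```python
import math
import random


def first_principal_component(spectra, n_iter=200, seed=0):
    """Return (loading_vector, scores) of the first principal component."""
    n = len(spectra)
    dim = len(spectra[0])

    # Mean-center the data.
    mean = [sum(col) / n for col in zip(*spectra)]
    centered = [[v - m for v, m in zip(s, mean)] for s in spectra]

    # Covariance matrix of the spectral features.
    cov = [
        [sum(row[a] * row[b] for row in centered) / (n - 1) for b in range(dim)]
        for a in range(dim)
    ]

    # Power iteration: repeated multiplication converges towards the
    # eigenvector with the largest eigenvalue (largest variance).
    rng = random.Random(seed)
    vec = [rng.random() + 0.1 for _ in range(dim)]
    norm = math.sqrt(sum(v * v for v in vec))
    vec = [v / norm for v in vec]
    for _ in range(n_iter):
        nxt = [sum(cov[a][b] * vec[b] for b in range(dim)) for a in range(dim)]
        norm = math.sqrt(sum(v * v for v in nxt))
        if norm == 0.0:  # data without any variance
            break
        vec = [v / norm for v in nxt]

    # Scores: projection of each centered spectrum onto the component.
    scores = [sum(c * v for c, v in zip(row, vec)) for row in centered]
    return vec, scores
```

Reshaping the scores to the map dimensions gives an image for this component; higher components can be obtained in the same way after removing the contribution of the components already found.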
|
||||||||||||||||||||||||||||||
VCA Imaging (vertex component analysis imaging) |
||||||||||||||||||||||||||||||
Principles of VCA: VCA (vertex component analysis) is an unsupervised method to rapidly unmix hyperspectral data. The algorithm was initially developed by J. Nascimento and J. Dias. The idea of VCA can be summarized as follows: Given a set of mixed spectral (multispectral or hyperspectral) vectors, linear spectral mixture analysis, or linear unmixing, aims at estimating the number of reference substances, also called endmembers, their spectral signatures, and their abundance fractions. Unsupervised endmember extraction by VCA exploits the following facts: the endmembers are the vertices of a simplex, and the affine transformation of a simplex is also a simplex. Furthermore, VCA assumes the presence of pure pixels in the data. The algorithm iteratively projects data onto a direction orthogonal to the subspace spanned by the endmembers already determined. The new endmember signature corresponds to the extreme of the projection. The algorithm iterates until all endmembers are found.
Related links:
J. Nascimento and J. Dias, "Vertex Component Analysis: A fast algorithm to unmix hyperspectral data", IEEE Transactions on Geoscience and Remote Sensing 2005 vol. 43, no. 4, pp. 898-910
Yahya M. Masalmah, "Unsupervised unmixing of hyperspectral imagery using the constrained positive matrix factorization". Dissertation, University of Puerto Rico, July 2007. Chair: Miguel Vélez-Reyes, Major Department: Computing and Information Science and Engineering
The window for data preparation is the same as for other multivariate imaging approaches, such as HCA, KMC or FCM imaging.
Reformatting data sets for VCA imaging: The following data manipulations are performed to prepare data sets for VCA:
Important: Spectra with a negative Quality Test result, or unselected Regions of Interest, are automatically excluded from VCA imaging. Once data preparation has been finished, the following VCA imaging dialog box will appear:
|
||||||||||||||||||||||||||||||
ANN IMAGING |
||||||||||||||||||||||||||||||
This function of the CytoSpec program is designed to re-assemble hyperspectral images on the basis of classification results of artificial neural networks (ANN). In this function, result files (*.res) of the Stuttgart Neural Network Simulator (SNNS) are analyzed and directly converted into false-colored ANN maps. Furthermore, checks for activation thresholds and multiple activations can be carried out. The Network Simulator was developed at the "Institut für Parallele und Verteilte Höchstleistungsrechner" (IPVR) of the Universität Stuttgart (Germany). The SNNS can be downloaded for free at ftp://ftp.informatik.uni-stuttgart.de. A tutorial on how to use the CytoSpec-SNNS interface is given here
|
||||||||||||||||||||||||||||||
Imaging by using Synthon's NeuroDeveloper(TM) network simulator |
||||||||||||||||||||||||||||||
The evaluation of the activations calculated by the network is performed with the analysis functions WTA and 40-20-40. For this evaluation the scores and the distribution of the activations from the output neurons are taken into account.
WTA criteria: WTA stands for winner takes all, which means the classification depends on the highest output activation. A spectrum will only be classified if the activation of the winner neuron is greater than the defined minimum activation of the winner neuron (default 0.7) and its distance to the next highest activation is at least the minimum distance to next activation (default 0.3). Otherwise the classification is not accepted and the spectrum remains unclassified.
406040 criteria: The 40-20-40 function works differently. The activation of one neuron has to exceed 0.6 (default, above 60 percent of the activation range). The activations of all other classes have to be below 0.4 (below 40 percent of the activation range). Otherwise, the pattern remains unclassified.
extrapolation criterion: A general problem with different classification methodologies is the potential misclassification due to undesired or unexpected extrapolation. This occurs when the training and validation data sets do not comprise all classes or the entire range of a feature needed for a given classification problem. In this case, any classification method, including ANNs, would not be representative for the given problem. Data of this type should rather be termed not classified. The NeuroDeveloper uses a distance value derived from the training and validation data set to detect an extrapolation problem. The maximum distance of a pattern to its corresponding class is calculated and set to 100. During the classification of a new pattern by the ANN, the distance of the new pattern is calculated and related to this value. If the calculated extrapolation value for the class identified by the neural network exceeds 100, an extrapolation occurs.
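The WTA and 40-20-40 criteria described above can be written down as two small Python functions. This is an illustrative sketch and not NeuroDeveloper or CytoSpec code: the function names are hypothetical, and the handling of values that lie exactly on a threshold is an assumption.

```python
def wta(activations, min_winner=0.7, min_gap=0.3):
    """Winner takes all: return the winning class index or None (unclassified)."""
    ranked = sorted(range(len(activations)), key=lambda i: activations[i], reverse=True)
    winner = ranked[0]
    # Assumption: with a single output neuron the next activation counts as 0.
    runner_up = activations[ranked[1]] if len(ranked) > 1 else 0.0
    # Assumption: exact threshold behaviour (> versus >=) is not documented.
    if activations[winner] > min_winner and activations[winner] - runner_up >= min_gap:
        return winner
    return None


def rule_40_20_40(activations, upper=0.6, lower=0.4):
    """40-20-40: exactly one activation above `upper`, all others below `lower`."""
    above = [i for i, a in enumerate(activations) if a > upper]
    if len(above) != 1:
        return None
    winner = above[0]
    others_low = all(a < lower for i, a in enumerate(activations) if i != winner)
    return winner if others_low else None


if __name__ == "__main__":
    print(wta([0.9, 0.1, 0.05]))            # classified as class index 0
    print(wta([0.9, 0.8, 0.0]))             # unclassified: gap too small
    print(rule_40_20_40([0.1, 0.8, 0.5]))   # unclassified: 0.5 is not below 0.4
```

Both functions return `None` for a spectrum that remains unclassified, which in a map would correspond to a pixel without class assignment.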
The proposed default value above which patterns are treated as unclassified is 200. The value should be greater than 100; the smaller the value, the stricter the threshold. Please note: If the checkbox 'use NeuroDeveloper winner assignments' is checked, CytoSpec does not perform an analysis of the WTA, 406040 and extrapolation results. The NeuroDeveloper software excludes spectra from further analysis in the following way:
2. or classification based on the extrapolation criterion failed.
Screenshot of the NeuroDeveloper(TM) classification statistics window:
The network file (*.snt) contains all relevant information for pre-processing and classification. This file and Synthon's run-time environment (NOT the NeuroDeveloper!) are required for classification.
B. Produce NeuroDeveloper maps (e.g. for external validation) |
||||||||||||||||||||||||||||||
Create HCA maps from spectra
HCA imaging (image reassembling based on hierarchical clustering)
KMC imaging (image reassembling based on k-means clustering)
FCM cluster imaging (image reassembling based on fuzzy C-means clustering)
Create PCA maps from spectra
PCA imaging (image reassembling based on principal component analysis)
VCA imaging (image reassembling based on vertex component analysis)
ANN imaging (image reassembling based on artificial neural network analysis)
Synthon imaging (ANN imaging, requires Synthon's NeuroDeveloper(TM) ANN simulator)
Imaging with distance values (image reassembling based on interspectral distances)
Multivariate Statistics --> PCA imaging --> create PCA maps from spectra if you want to produce PCA images directly from a hyperspectral data set.
Spectra with negative Quality Test results, or unselected Regions of Interest, are excluded from the analysis and appear in PCA images as black pixels.
log-file) you can find additional information on the test results.
Lasch, P. & Naumann, D. FT-IR Microspectroscopic Imaging of Human Carcinoma Thin Sections Based on Pattern Recognition Techniques. Cellular and Molecular Biology 1998 44(1). pp. 189-202
Lasch P, Haensch W, Kidder L, Lewis EN, Naumann D. Colorectal Adenocarcinoma Characterization by Spatially Resolved FT-IR Microspectroscopy. Appl. Spectrosc. 2002 56 (1). 1-9 |
Synthon GmbH, contact address: Analytics and Pattern Recognition Im Neuenheimer Feld 69120 Heidelberg GERMANY phone: +49 6221 50 257 900 fax: +49 6221 50 257 909 email: info@synthon-analytics.com internet: http://www.synthon-analytics.de |
load: allows you to browse the directory structure and load the NeuroDeveloper network library (*.snt) file. After loading, the 'image' and 'stats' buttons are activated.
|
| 1. | Load an IR data set and produce an IR spectral map (e.g. chemical map, HCA map) |
| 2. | Obtain the context menu of IR maps by clicking with the right mouse button over the infrared spectral map. |
| 3. | Choose 'class 1' --> and 'start' if you want to assign spectra to class 1. Now, you are in the 'select spectra' mode. In this mode, the mouse cursor changes its appearance (arrow plus cross). |
| 4. | You can now select an unlimited number of spectra by left mouse clicks (in this mode, spectra will not be displayed). The spatial coordinates will be given in the command line window. |
| 5. | To stop the selection mode, choose 'selection mode off' from the context menu. Alternatively, you can immediately start to assign spectra to class 2 by selecting 'class 2' --> and 'start' from the context menu. In this way, spectra can be assigned to up to 10 distinct classes. |
| 6. | If all spectra are selected, stop the selection mode by 'selection mode off'. In the normal 'show spectra' mode the mouse pointer will regain its normal appearance (arrow). |
| 7. | In order to export spectra, select 'export' --> 'x,y ASCII' from the file pull down menu. A window with the title 'convert into a (x,y) ASCII data format' appears. Check the checkbox 'export selection'. Please use the default settings for all other options. Make sure that the data block of original absorbance spectra is exported. |
| 8. | Press the button 'export' and store the spectra in a folder of your choice. Spectra of class 1 can be identified by the extension '*_1.dat', spectra of class 2 are named '*_2.dat' and so forth. |
| 9. | Steps 1-8 should be repeated for a number of maps. It is recommended to use consistent class assignments for identical (histological) structures. |
| 10. | Split the spectral data into a subset for teaching (ca. 65 % of the spectra) and internal validation (35%) |
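The split in step 10 can be scripted outside CytoSpec, for example with the following Python sketch. It is only an illustration: the function name `split_by_class`, the dictionary layout and the example file names are hypothetical, and splitting each class separately (so that every class appears in both subsets) is an assumption that goes beyond what the step prescribes.

```python
import random


def split_by_class(files_by_class, train_fraction=0.65, seed=0):
    """Split exported spectra into teaching and internal validation subsets.

    files_by_class: {class number: list of exported file names},
                    e.g. {1: ["mapA_1.dat", ...], 2: ["mapA_2.dat", ...]}.
    Returns (teaching_files, validation_files).
    """
    rng = random.Random(seed)
    teaching, validation = [], []
    for cls in sorted(files_by_class):
        names = list(files_by_class[cls])
        rng.shuffle(names)  # random selection within each class
        cut = round(len(names) * train_fraction)
        teaching.extend(names[:cut])
        validation.extend(names[cut:])
    return teaching, validation
```

With 20 spectra of class 1 and 40 spectra of class 2, this yields 39 spectra for teaching and 21 for internal validation.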
| 1. | Load an IR data set of absorbance spectra. It is recommended to produce first an IR map (e.g. chemical map) from original spectra. |
| 2. | Select 'Synthon maps' from the 'Image manipulation' pull down menu. A window entitled 'create NeuroDeveloper maps' will appear. Press the 'load' button and select one of NeuroDeveloper's Network files (*.snt). |
| 3. | Spectra from the data block of original data are now written to a temporary file (in CytoSpec's root folder, please make sure that sufficient free disk space is present). The data are then pre-processed and classified by Synthon's run-time environment. After this, the classification results are automatically transferred back to CytoSpec. When finished, you can immediately press the 'image' button that causes CytoSpec to display the NeuroDeveloper map. If you wish to modify the NeuroDeveloper exclusion criteria such as WTA, 406040, or extrapolation, uncheck the respective checkboxes and press 'define'. Change the settings, close the window and press 'image'. Classification statistics are available by pressing the 'stats' button of the 'create NeuroDeveloper maps' window. |
Distance (Wikipedia) 
Important: Spectra with a negative Quality Test result, or unselected Regions of Interest, are automatically excluded from the function 'imaging with distance values'.
Copyright (c) 2000-2010 CytoSpec. All rights reserved.