Type of presentation: Oral

ID-9-O-2047 Mapped Textons for Tissue Microarray Classification in Digital Pathology

Fernández-Carrobles M. M.1, Bueno G.1, García-Rojo M.2, Déniz O.1, Salido J.1
1VISILAB, Universidad de Castilla-La Mancha, Spain, 2Dpto. Anatomía Patológica, Hospital General Universitario de Ciudad Real, Spain.
MMilagro.fernandez@uclm.es

Breast microscopy images are rich in texture, which can be used to discriminate between benign and malignant breast tissue. Choosing an informative texture descriptor is therefore essential for the subsequent tissue classification. In this study, frequential textons were selected to represent the texture of breast Tissue Microarray (TMA) images. Textons are textural descriptors belonging to the spectral model. They can be defined as the repetitive features of a texture that humans can distinguish, so the texton concept assumes that a single texture contains several different textural components. We used textons as discriminators to detect cancer and other tissue anomalies, and represented textures by texton maps.

A data set of 628 microscopic images acquired by a whole-slide imaging system at 10x was selected and divided into four classes (see Figure 1).

The images are filtered with an MR8 filter bank composed of a Gaussian filter, a Laplacian of Gaussian (LoG) filter, 18 edge filters and 18 bar filters; the edge and bar filters each cover 3 basic scales and 6 orientations. The filter bank is applied to the tissue images and 38 filter responses are extracted, so that each pixel is represented by a 38-dimensional vector. A k-means clustering algorithm is then applied to all the pixel vectors. Each cluster produced by k-means is characterized by a representative vector called a texton. Finally, 60 textons were selected for each class, giving a texton vocabulary of 240 textons in total. Once the texton vocabulary has been extracted, each tissue image is represented by its texton histogram and its texton map is created (see Figure 2).
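The clustering and mapping steps described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the filter-bank responses are replaced by random toy data, only 2 classes with 3 textons each are used instead of 4 classes with 60, and the helper names (`kmeans`, `nearest`, `texton_map_and_histogram`) are hypothetical.

```python
import numpy as np

def nearest(X, centres):
    """Index of the closest centre (squared Euclidean) for every row of X."""
    d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def kmeans(X, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means; the returned cluster centres play the role of textons."""
    rng = np.random.default_rng(seed)
    centres = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        labels = nearest(X, centres)
        for j in range(k):
            members = X[labels == j]
            if len(members):              # keep the old centre if a cluster empties
                centres[j] = members.mean(axis=0)
    return centres

def texton_map_and_histogram(responses, vocabulary):
    """responses: (H, W, D) filter responses; vocabulary: (K, D) textons."""
    h, w, d = responses.shape
    labels = nearest(responses.reshape(-1, d), vocabulary)
    hist = np.bincount(labels, minlength=len(vocabulary)) / labels.size
    return labels.reshape(h, w), hist

# Toy stand-in for real filter-bank output (assumption): one 16x16 image per
# class, 38 responses per pixel, classes separated by their mean response.
rng = np.random.default_rng(0)
class_responses = [rng.normal(loc=c, size=(16, 16, 38)) for c in (0.0, 5.0)]

# Per-class textons are stacked into a single vocabulary, as in the text.
vocabulary = np.vstack([kmeans(r.reshape(-1, 38), k=3) for r in class_responses])
tmap, hist = texton_map_and_histogram(class_responses[0], vocabulary)
```

Here `tmap` holds one texton index per pixel and `hist` is the normalised texton histogram of the same image; rendering `tmap` with one color per index would give a picture in the spirit of Figure 2.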

Features are extracted by calculating the Haralick coefficients on the texton maps. We also considered the influence of color on the TMA images, so each feature set was extracted from eight different color models: RGB, CMYK, HSV, Lab, Luv, SCT, Hb and Lb. A total of 241 features were obtained for each color model, that is, up to 1928 features (8 × 241) were handled. To cope with this number of features, we applied a dimensionality reduction based on a correlation threshold of 97%.
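As a rough illustration of the two steps in this paragraph, the sketch below computes a few co-occurrence (Haralick-style) coefficients on a label map and then prunes highly correlated feature columns. Several points are assumptions rather than details from the abstract: only three coefficients and a single horizontal offset are used, the 97% threshold is read as an absolute Pearson correlation of 0.97, and the greedy keep-first rule and all function names are hypothetical.

```python
import numpy as np

def glcm(label_map, n_levels, dx=1, dy=0):
    """Symmetric, normalised co-occurrence matrix for one non-negative offset."""
    h, w = label_map.shape
    a = label_map[:h - dy, :w - dx].ravel()
    b = label_map[dy:, dx:].ravel()
    m = np.zeros((n_levels, n_levels))
    np.add.at(m, (a, b), 1.0)             # count co-occurring label pairs
    m = m + m.T
    return m / m.sum()

def haralick_subset(p):
    """Three classic Haralick coefficients from a normalised co-occurrence matrix."""
    i, j = np.indices(p.shape)
    return {
        "contrast": float((p * (i - j) ** 2).sum()),
        "angular_second_moment": float((p ** 2).sum()),
        "homogeneity": float((p / (1.0 + (i - j) ** 2)).sum()),
    }

def drop_correlated(features, threshold=0.97):
    """Greedily keep columns whose |Pearson r| with every kept column is <= threshold."""
    corr = np.abs(np.corrcoef(features, rowvar=False))
    kept = []
    for j in range(features.shape[1]):
        if all(corr[j, i] <= threshold for i in kept):
            kept.append(j)
    return kept

# Tiny texton map with two labels, used only to exercise the functions.
toy_map = np.array([[0, 1], [0, 1]])
coeffs = haralick_subset(glcm(toy_map, n_levels=2))

# Synthetic feature matrix: column 3 is a near-copy of column 0.
rng = np.random.default_rng(0)
base = rng.normal(size=(200, 3))
X = np.column_stack([base, base[:, 0] + 0.01 * rng.normal(size=200)])
kept = drop_correlated(X)
X_reduced = X[:, kept]
```

The greedy rule depends on column order, so a different ordering of the color models could keep a different (equally uncorrelated) subset of features.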

The classification tests were carried out with 5 different classifiers: Fisher, SVM, random forest, bagging trees and AdaBoost. The best results were obtained with AdaBoost and the combination of all color models. The classification was evaluated by 10-fold cross-validation.
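The evaluation protocol can be sketched as a stratified 10-fold loop. To stay self-contained, the example below uses a nearest-class-mean classifier as a stand-in, which is explicitly not AdaBoost, and synthetic four-class data in place of the real feature vectors; the stratification itself and all function names are assumptions, since the abstract only states that 10-fold cross-validation was used.

```python
import numpy as np

def stratified_folds(y, n_folds=10, seed=0):
    """Assign each sample a fold id so every fold has a similar class mix."""
    rng = np.random.default_rng(seed)
    fold = np.empty(len(y), dtype=int)
    for c in np.unique(y):
        idx = rng.permutation(np.flatnonzero(y == c))
        fold[idx] = np.arange(len(idx)) % n_folds
    return fold

def fit_nearest_mean(X, y):
    """Stand-in classifier (NOT AdaBoost): one mean vector per class."""
    classes = np.unique(y)
    return classes, np.stack([X[y == c].mean(axis=0) for c in classes])

def predict_nearest_mean(model, X):
    """Predict the class whose mean vector is closest to each sample."""
    classes, means = model
    d2 = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
    return classes[d2.argmin(axis=1)]

def cross_validate(X, y, n_folds=10):
    """Per-fold accuracy: train on n_folds - 1 folds, test on the held-out one."""
    fold = stratified_folds(y, n_folds)
    scores = []
    for f in range(n_folds):
        test = fold == f
        model = fit_nearest_mean(X[~test], y[~test])
        scores.append(np.mean(predict_nearest_mean(model, X[test]) == y[test]))
    return np.array(scores)

# Synthetic, well-separated four-class data (assumption, for illustration only).
rng = np.random.default_rng(0)
y = np.repeat(np.arange(4), 50)
X = rng.normal(size=(200, 10)) + 3.0 * y[:, None]
scores = cross_validate(X, y)
print(f"mean accuracy over {len(scores)} folds: {scores.mean():.3f}")
```

Any classifier with a fit step and a predict step, such as a boosted ensemble from a machine-learning library, could replace the two stand-in functions without changing the fold logic.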

We obtained 95% accuracy with 92% precision, which shows that texton maps are suitable for classifying breast TMA images. The best results were obtained by using all color models and applying a feature correlation analysis (see Figure 3). Dimensionality reduction also lowered the computational time from 493 to 362 seconds.


The authors acknowledge partial financial support from the Spanish Research Ministry Project TIN2011-24367 and from the EU Marie Curie Actions, AIDPATH project (num. 612471).

Fig. 1: H&E TMA images divided into four classes. Class 1) benign stromal tissue with cellularity. Class 2) adipose tissue. Class 3) benign and benign anomalous structures: sclerosing and adenosis lesions, fibroadenomas, tubular adenomas, phyllodes tumors, columnar cell lesions and duct ectasia. Class 4) ductal and lobular carcinomas.

Fig. 2: Texton maps. The texton histogram is created once every image pixel has been assigned to its nearest texton. The texton map represents the image through these texton assignments: each pixel in the new image is represented by a texton, and each texton is represented by a color.

Fig. 3: AdaBoost classification error with and without dimensionality reduction. Combining the color models improves the classification results but also increases the size of the feature set. Feature reduction by correlation removed about 79% of the initial features (from 1928 to 409 features) and improved accuracy by up to 1.2%.