Application of SOM neural network in clustering.
Abstract:
The Self-Organizing Map (SOM) is an unsupervised neural network algorithm that projects high-dimensional data onto a two-dimensional map. The projection preserves the topology of the data, so that similar data items are mapped to nearby locations on the map. One application of the SOM neural network is the clustering of animals according to their features. In this paper we conduct an experiment to analyze the SOM in clustering different species of animals.
Keywords: SOM Neural Network; Feature; Clustering; Animal
Article Type:
Clinical report
Subject:
Neural circuitry (Research)
Computational neuroscience (Research)
Authors:
Behbahani, Soroor
Nasrabadi, Ali Moti
Pub Date:
12/01/2009
Publication:
Name: Journal of Biomedical Science and Engineering (JBiSE)
Publisher: Scientific Research Publishing, Inc.
Audience: Academic
Format: Magazine/Journal
Subject: Science and technology
Copyright: COPYRIGHT 2009 Scientific Research Publishing, Inc.
ISSN: 1937-6871
Issue:
Date: Dec, 2009
Source Volume: 2
Source Issue: 8
Topic:
Event Code: 310 Science & research
Geographic:
Geographic Scope: Iran Geographic Code: 7IRAN Iran
Accession Number:
228716899
Full Text:
1. INTRODUCTION
The Self-Organizing Map (SOM) is a well-known neural network and one of the most popular unsupervised learning algorithms. Since its invention by the Finnish professor Teuvo Kohonen in the early 1980s, more than 4000 research articles have been published on the algorithm, its visualization and its applications. The maps visualize natural groupings and relationships in the data and have been applied successfully in a broad spectrum of research areas, ranging from speech recognition to financial analysis. The SOM performs a non-linear projection of multidimensional data onto a two-dimensional display. The mapping is topology-preserving, meaning that the more alike two data samples are in the input space, the closer together they appear on the final map.
The SOM belongs to the class of neural network algorithms, a group of algorithms based on analogies to the neural structures of the brain. The SOM in particular was inspired by an interesting phenomenon: as physicians have discovered, some areas of brain tissue can be ordered according to an input signal. In essence, the SOM is a computer program that simulates this biological ordering process. Applied to electronic datasets, the algorithm produces a map on which similar input data items appear close to each other.
There are numerous applications of the SOM algorithm, but its most widespread use is the identification and visualization of natural groupings in the data. The process of finding similar items is generally referred to as clustering. Compared to the k-means clustering algorithm, the SOM exemplifies a robust and structured approach to clustering. Self-organizing neural networks are based on the principle of transforming a set of p-variate observations into a spatial representation of smaller dimensionality, which may allow a more effective visualization of correlations in the original data [4].
2. SELF-ORGANIZING MAP
The Self-Organizing Map belongs to the class of unsupervised and competitive learning algorithms. It is a sheet-like neural network whose nodes are arranged in a regular, usually two-dimensional, grid. In neural networks generally, the node connections are usually thought of as being associated with a vector of weights. In the case of the Self-Organizing Map, it is easier to think of each node as being directly associated with a weight vector.
The items in the input data set are assumed to be in a vector format. If n is the dimension of the input space, then every node on the map grid holds an n-dimensional vector of weights:
$m_i = [m_{i1}, m_{i2}, m_{i3}, \ldots, m_{in}]$ (1)
The basic principle of the Self-Organizing Map is to adjust these weight vectors until the map represents a picture of the input data set. Since the number of map nodes is significantly smaller than the number of items in the dataset, it is needless to say that it is impossible to represent every input item from the data space on the map. Rather, the objective is to achieve a configuration in which the distribution of the data is reflected and the most important metric relationships are preserved. In particular, we are interested in obtaining a correlation between the similarity of items in the dataset and the distance of their most alike representatives on the map. In other words, items that are similar in the input space should map to nearby nodes on the grid [4].
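The paper does not state how the weight vectors are adjusted. As a hedged illustration only, the standard Kohonen formulation can be written with the notation of Eq. (1). The symbols $x$ (input vector), $c$ (index of the best-matching node), $\alpha(t)$ (learning rate), $h_{ci}(t)$ (neighborhood function), $r_i$ (grid position of node $i$) and $\sigma(t)$ (neighborhood radius) are assumptions introduced here and do not appear in the original text; the Gaussian neighborhood is one common choice among several.

```latex
% Best-matching node c for an input vector x
c = \arg\min_{i} \lVert x - m_i \rVert

% Update applied to every node i at training step t
m_i(t+1) = m_i(t) + \alpha(t)\, h_{ci}(t)\, \bigl[ x(t) - m_i(t) \bigr]

% Assumed Gaussian neighborhood on the map grid
h_{ci}(t) = \exp\!\left( -\frac{\lVert r_c - r_i \rVert^{2}}{2\,\sigma(t)^{2}} \right)
```

Because $h_{ci}(t)$ is largest for nodes near the best-matching node, neighboring nodes are pulled toward the same inputs, which is what lets similar items end up on nearby nodes.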
2.1. Image Characteristics
To represent the 3D image of the 12-lead ECG, three axes are needed: a temporal axis, a spatial axis and a data axis. The temporal axis represents the time domain of the cardiac signal, and the spatial axis represents the locations of the limb and thoracic leads. The data axis represents two features extracted from the cardiac signal: amplitude and wavelet coefficients. Six of the 12 leads are used to represent the image obtained from the thoracic leads, and the other six are used to represent the image obtained from the limb leads.
To estimate the information between consecutive leads along the spatial axis, an interpolation technique was used, which makes the image more homogeneous.
3. AN EXAMPLE OF SOM NEURAL NETWORK APPLICATION
A great deal of research on applications of the SOM neural network has been carried out over the last two decades. One of the best-known examples is the clustering of animals according to their features.
This example uses general features taken from the Kohonen animal database (Table 1).
These features, however, are not sufficient to distinguish between different species of the same animal.
Previous experiments assumed that each animal had only one species, whereas a single animal may have more than 10 species. To analyze the ability of the SOM neural network, we therefore performed a new experiment in which we considered more than one species per animal and increased the number of features to obtain better separability. The added features include geographical distribution, diet and habitat, among others (Table 2).
The SOM in this research has 7 × 7 nodes, and the initial weights are selected randomly.
Although 3 animals did not settle in the right location on the SOM map and selected the wrong neurons, the results show that the extracted features can separate the different species of animals well. For these 3 animals, the selected features are not sufficient to separate them. This problem could be solved by adding extra features or by choosing the features more precisely. One of the most important issues in neural networks is the method of feature extraction, but increasing the number of features is not always the best way to improve the results, because adding features sometimes confuses the network. Another cause of poor results in neural networks is the number of inputs. Increasing the number of inputs (animal species) requires a larger SOM and can reduce its ability to separate them, because the inputs become more correlated and the error rate increases.
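A 7 × 7 map with random initial weights, as described in this section, can be sketched in a few lines of plain Python. This is a minimal sketch and not the authors' implementation: the training schedule (linear decay of the learning rate and radius), the Gaussian neighborhood, the number of epochs, the seed and the function names `train_som` and `best_matching_unit` are all assumptions made for illustration. The four input vectors are rows copied from Table 1.

```python
import math
import random

# Four animals from Table 1. Feature order: small, medium, big, two legs,
# four legs, hair, hooves, mane, feathers, hunt, run, fly, swim.
ANIMALS = {
    "Dove":  [1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0],
    "Duck":  [1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1],
    "Horse": [0, 0, 1, 0, 1, 1, 1, 1, 0, 0, 1, 0, 0],
    "Cow":   [0, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0],
}


def best_matching_unit(weights, x):
    """Return the grid coordinate whose weight vector is closest to x."""
    return min(
        weights,
        key=lambda node: sum((w - v) ** 2 for w, v in zip(weights[node], x)),
    )


def train_som(data, rows=7, cols=7, epochs=100, alpha0=0.5, seed=0):
    """Train a small SOM (assumed schedule, for illustration only)."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Random initial weights: one n-dimensional vector per grid node.
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    sigma0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        alpha = alpha0 * (1.0 - frac)             # learning rate decays linearly
        sigma = max(sigma0 * (1.0 - frac), 0.5)   # radius shrinks but stays positive
        for x in rng.sample(data, len(data)):     # one shuffled pass over the data
            bmu = best_matching_unit(weights, x)
            for (r, c), w in weights.items():
                dist2 = (r - bmu[0]) ** 2 + (c - bmu[1]) ** 2
                h = math.exp(-dist2 / (2.0 * sigma ** 2))  # Gaussian neighborhood
                for k in range(dim):
                    w[k] += alpha * h * (x[k] - w[k])
    return weights


weights = train_som(list(ANIMALS.values()))
for name, vector in ANIMALS.items():
    print(name, best_matching_unit(weights, vector))
```

The printed grid coordinates show which neuron each animal selects. Animals with identical feature vectors always select the same neuron, whatever the training settings.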
4. RESULTS
Choosing suitable features for separating animal species leads to good results from the SOM neural network. Some animals in the Kohonen database have very similar features. For example, in Table 1 the features of the owl and the hawk, as well as those of the horse and the zebra, are exactly the same, and this leads to wrong results when clustering these animals. Although there are some errors in the new experiment, they occurred between different species of one animal, not between different animals. Thus, the more similar the species of an animal are, the more errors occur.
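A quick way to see which animals Table 1 cannot distinguish is to group identical feature vectors. The sketch below encodes each column of Table 1 as a 13-character string in the order small, medium, big, two legs, four legs, hair, hooves, mane, feathers, hunt, run, fly, swim. The string encoding and the helper name `identical_groups` are choices made for this illustration only.

```python
from collections import defaultdict

# One 13-bit string per animal, copied column by column from Table 1.
TABLE1 = {
    "Dove":  "1001000010010",
    "Hen":   "1001000010000",
    "Duck":  "1001000010001",
    "Goose": "1001000010011",
    "Owl":   "1001000011010",
    "Hawk":  "1001000011010",
    "Eagle": "0101000011010",
    "Fox":   "0100110001000",
    "Dog":   "0100110000100",
    "Wolf":  "0100110001100",
    "Cat":   "1000110001000",
    "Tiger": "0010110001100",
    "Lion":  "0010110101100",
    "Horse": "0010111100100",
    "Zebra": "0010111100100",
    "Cow":   "0010111100000",
}


def identical_groups(table):
    """Return lists of animals that share exactly the same feature vector."""
    groups = defaultdict(list)
    for name, bits in table.items():
        groups[bits].append(name)
    return [names for names in groups.values() if len(names) > 1]


print(identical_groups(TABLE1))  # [['Owl', 'Hawk'], ['Horse', 'Zebra']]
```

No distance-based method, the SOM included, can separate the members of such a group, because their input vectors are the same point in feature space.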
[FIGURE 1 OMITTED]
5. CONCLUSIONS
The SOM is a highly useful multivariate visualization method that allows multidimensional data to be displayed as a two-dimensional map. This is its main advantage. The clustering of map units makes it easy to observe similarities in the data. Through our experiment, we demonstrated that visualizing a dataset with the SOM allows quick observation of the relationship between a component (feature) and the class, as well as the relationships among the different components (features) of the dataset. The SOM is also capable of handling several types of classification problems while providing a useful, interactive, and intelligible summary of the data.
However, the SOM also has some disadvantages. For example, adjacent map units point to adjacent input data vectors, so distortions are sometimes possible because high-dimensional topography cannot always be represented in 2D. To avoid this, the training rate and the neighborhood radius should not be reduced too quickly.
Hence, the SOM usually needs many training iterations. The SOM also does not provide an estimate of such map distortion. Alternatives to the SOM have been developed in order to overcome these theoretical problems and to enable probabilistic analysis.
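The effect of reducing the training rate and neighborhood radius too quickly can be illustrated with a simple decay schedule. The paper does not specify a schedule, so the exponential form, the starting values (0.5 and 3.5), the time constants and the function name `exponential_decay` below are all hypothetical.

```python
import math


def exponential_decay(initial, step, time_constant):
    """Value of a training parameter after `step` updates (assumed schedule)."""
    return initial * math.exp(-step / time_constant)


# Same starting values, two hypothetical decay speeds, inspected at step 200.
for label, tau in (("slow", 500.0), ("fast", 50.0)):
    alpha = exponential_decay(0.5, 200, tau)   # learning rate
    sigma = exponential_decay(3.5, 200, tau)   # neighborhood radius in grid units
    print(f"{label}: alpha={alpha:.4f}, sigma={sigma:.4f}")
```

With the fast schedule, the radius at step 200 is already far below one grid spacing, so in practice only the winning node is still being updated and the map has little chance to correct an early distortion. The slow schedule keeps several neighbors involved for much longer, at the cost of more training iterations.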
The current research showed a simple application of the SOM neural network in clustering. The method can be used in many applications that need classification, one of which could be disease clustering. In the current research, we used a fuzzy method to determine the features of each animal.
In the same way, we can determine the features of diseases, which could help physicians in their diagnoses. The signs of diseases can be used as the input of the SOM neural network. Because some classes of disease have similar symptoms, the SOM neural network can show a limited set of neighboring diseases that share those symptoms, so the physician can focus on them and diagnose the patient's disease more accurately. Fuzzy features can increase the ability of the SOM neural network if they are chosen carefully and accurately; of course, some trial and error is needed to find a rule that relates a membership function to each disease and its symptoms.
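The paper does not say which membership functions were used for the fuzzy features. As one possible sketch, a triangular membership function can turn a raw score into graded values between 0 and 1 that could then serve as SOM inputs. The 0 to 10 severity scale, the three fuzzy sets and their breakpoints, and the names `triangular` and `fuzzy_features` are hypothetical and are not taken from the paper.

```python
def triangular(x, left, peak, right):
    """Triangular membership: 0 outside (left, right), 1 at peak.

    Assumes left < peak < right.
    """
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Hypothetical fuzzy sets on a 0..10 symptom severity scale.
SYMPTOM_SETS = {
    "mild":     (0.0, 2.0, 5.0),
    "moderate": (2.0, 5.0, 8.0),
    "severe":   (5.0, 8.0, 10.0),
}


def fuzzy_features(score):
    """Membership of one severity score in each hypothetical fuzzy set."""
    return [triangular(score, *SYMPTOM_SETS[name])
            for name in ("mild", "moderate", "severe")]


print(fuzzy_features(6.5))  # [0.0, 0.5, 0.5]
```

Finding breakpoints that separate the classes well is the trial-and-error step mentioned in the text.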
doi: 10.4236/jbise.2009.28093
Received 11 June 2009; revised 29 June 2009; accepted 27 July 2009.
REFERENCES
[1] A. Forti and G. L. Foresti, (2006) Growing hierarchical tree SOM: An unsupervised neural network with dynamic topology, Neural Networks, 19, 1568-1580.
[2] S. Haykin, (1999) Neural networks: A comprehensive foundation (2nd ed.), Prentice Hall.
[3] R. G. Adams, K. Butchart and N. Davey, (1999) Hierarchical classification with a competitive evolutionary neural tree, Neural Networks, 12, 541-551.
[4] J. Li, Information visualization of self organizing maps. doi: 10.4236/jbise.2009.28094
Soroor Behbahani (1), Ali Moti Nasrabadi (2)
(1) Biomedical Engineering Department, Science and Research Branch, Islamic Azad University, Tehran, Iran;
(2) Biomedical Engineering Department, Faculty of Engineering, Shahed University, Tehran, Iran.
Email: soroor behbahani@yahoo.com; a m nasrabadi@yahoo.com
Table 1. The animal data set.
Animal          Dove   Hen  Duck Goose   Owl  Hawk Eagle   Fox   Dog  Wolf   Cat Tiger  Lion Horse Zebra   Cow
is small           1     1     1     1     1     1     0     0     0     0     1     0     0     0     0     0
is medium          0     0     0     0     0     0     1     1     1     1     0     0     0     0     0     0
is big             0     0     0     0     0     0     0     0     0     0     0     1     1     1     1     1
has two legs       1     1     1     1     1     1     1     0     0     0     0     0     0     0     0     0
has four legs      0     0     0     0     0     0     0     1     1     1     1     1     1     1     1     1
has hair           0     0     0     0     0     0     0     1     1     1     1     1     1     1     1     1
has hooves         0     0     0     0     0     0     0     0     0     0     0     0     0     1     1     1
has mane           0     0     0     0     0     0     0     0     0     0     0     0     1     1     1     1
has feathers       1     1     1     1     1     1     1     0     0     0     0     0     0     0     0     0
likes to hunt      0     0     0     0     1     1     1     1     0     1     1     1     1     0     0     0
likes to run       0     0     0     0     0     0     0     0     1     1     0     1     1     1     1     0
likes to fly       1     0     0     1     1     1     1     0     0     0     0     0     0     0     0     0
likes to swim      0     0     1     1     0     0     0     0     0     0     0     0     0     0     0     0
Table 2. Increased number of features and animal species.
Animal: Red Fox; Afghan Fox; Eagle Owl; Brown Fish Owl
Small 0 0 0 0
Medium 0.4 0.5 0.1 0.43
Big 0 0 0 0
2 leg 0 0 1 1
4 leg 1 1 0 0
Hair 1 1 0 0
Hoove 0 0 0 0
Mane 0 0 0 0
Feathers 0 0 1 1
Hunt 1 1 0.95 1
Run 1 1 0 0
Fly 0 0 1 1
Swim 0 0 0 0
Asia 1 1 1 1
Africa 1 0 0.4 0
Us 1 0 0.4 1
Europe 0 1 0 0
Mountainous 1 0 1 0
Plain 1 1 0 1
River 0 0 1 1
Jungle 1 1 0 0
Domestic 0 0 0 0
Carnivorous 1 1 1 1
Herbivorous 0 0 0 0
Frugivorous 1 0 0 0
Egg 0 0 1 1
Milk 1 1 0 0
Colour variation 0.5 0.45 0.2 0.25
Animal: Long-eared Owl; Short-eared Owl; Barn Owl; Saker Falcon
Small 0 0.875 0.475 0
Medium 0.21 0 0 0.083
Big 0 0 0 0
2 leg 1 1 1 1
4 leg 0 0 0 0
Hair 0 0 0 0
Hoove 0 0 0 0
Mane 0 0 0 0
Feathers 1 1 1 1
Hunt 1 1 1 1
Run 0 0 0 0
Fly 1 1 1 0.8
Swim 0 0 0 0
Asia 1 1 1 1
Africa 0 0 0.7 0.86
Us 1 0 1 0
Europe 0 1 0 1
Mountainous 0 0 0.47 0.5
Plain 0 0 0 1
River 0 1 0 0
Jungle 1 0 0 0
Domestic 0 0 1 0
Carnivorous 1 1 1 1
Herbivorous 0 0 0 0
Frugivorous 0 0 0 0
Egg 1 1 1 1
Milk 0 0 0 0
Colour variation 0.2 0.13 0.3 0.24
Animal: Lanner Falcon; Peregrine Falcon; Osprey Eagle; Booted Eagle
Small 0 0.95 0 0
Medium 0.33 0.13 0.08 0.08
Big 0 0 0 0
2 leg 1 1 1 1
4 leg 0 0 0 0
Hair 0 0 0 0
Hoove 0 0 0 0
Mane 0 0 0 0.066
Feathers 1 1 1 1
Hunt 0.9 0.91 1 1
Run 0 0 0 0
Fly 1 1 1 1
Swim 0 0 0 0
Asia 1 0.98 1 1
Africa 0 1 0 0.243
Us 0.5 1 0.5 0.7
Europe 0.95 0 0 1
Mountainous 0.27 1 1 0
Plain 1 0 0 0
River 0 0.96 0 0
Jungle 0.032 0 0 1
Domestic 0 0 0 0
Carnivorous 1 1 1 1
Herbivorous 0 0 0 0
Frugivorous 0 0 0 0
Egg 1 1 1 1
Milk 0 0 0 0
Colour variation 0.54 0.35 0.24 0.4
Animal: Bonelli Eagle; Spotted Eagle; Lesser Spotted Eagle; Imperial Eagle
Small 0 0 0 0
Medium 0.41 0.52 0.58 0.61
Big 0 0 0 0
2 leg 1 1 1 1
4 leg 0 0 0 0
Hair 0 0 0 0
Hoove 0 0 0 0
Mane 0 0 0 0
Feathers 1 1 1 1
Hunt 1 1 1 0.99
Run 0 0 0 0
Fly 1 1 0.95 1
Swim 0 0 0 0
Asia 1 0.91 1 0
Africa 0 1 0 1
Us 0.61 0 0.43 1
Europe 1 1 0.4 0
Mountainous 1 0.5 0 0.5
Plain 0.5 0 0.35 1
River 0 1 0 1
Jungle 0 1 0.9 0
Domestic 0 0 0 0
Carnivorous 0.95 0.89 1 1
Herbivorous 0 0 0 0
Frugivorous 0 0 0 0
Egg 0.92 1 1 1
Milk 0 0 0 0
Colour variation 0.33 0.21 0.4 0.16
Animal: Golden Eagle; Red-breasted Goose; Greylag Goose; White-fronted Goose
Small 0 0 0 0
Medium 0.71 0.23 0.68 0.31
Big 0 0 0 0
2 leg 1 1 1 1
4 leg 0 0 0 0
Hair 0 0 0 0
Hoove 0 0 0 0
Mane 0 0 0 0
Feathers 1 1 1 0.898
Hunt 1 0 0 0
Run 0 0 0.0363 0
Fly 1 1 1 1
Swim 0 1 0.99 0.85
Asia 0.87 1 0 0
Africa 1 0 0.9 0
Us 0 0 1 0
Europe 1 0.441 0 0
Mountainous 1 0 0.85 1
Plain 0 1 0 0.2
River 0 1 1 1
Jungle 0.03 0 0 0
Domestic 0 0 1 0.5
Carnivorous 1 0 0 0
Herbivorous 0 0 0 0
Frugivorous 0 0.98 1 1
Egg 1 1 1 1
Milk 0 0 0 0.021
Colour variation 0.13 0.5 0.1 0.15
Animal: Saint Bernard Dog; Pointer Dog; Wood Pigeon; Stock Dove
Small 0 0 1 0.8
Medium 0.76 0.31 0.211 0.02
Big 0 0 0 0
2 leg 0 0 1 1
4 leg 1 1 0 0
Hair 1 1 0 0
Hoove 0 0 0 0
Mane 0 0 0 0
Feathers 0 0 1 1
Hunt 0 0 0 0
Run 1 0.89 0 0
Fly 0 0 1 1
Swim 0 0.88 1 0
Asia 0 0 1 0.91
Africa 1 0.66 1 0.5
Us 0 1 0 1
Europe 0.515 0.79 1 1
Mountainous 0 0 0.2 0
Plain 0 0 1 1
River 0.9 0.78 0 1
Jungle 0 0.023 1 0
Domestic 0 0 1 1
Carnivorous 0.858 1 0 0
Herbivorous 0.02 0 0 0
Frugivorous 0.04 0.1 1 1
Egg 0 0 0 1
Milk 0.88 0 1 0
Colour variation 0.12 0.54 0.67 0.2
Animal: Rock Dove; Collared Dove; Wolf; Coyote
Small 0.8 0.675 0 0
Medium 0 0 0.48 0.825
Big 0 0 0 0
2 leg 1 1 0 0
4 leg 0 0 1 1
Hair 0 0 1 1
Hoove 0 0 0 0
Mane 0 0 0 0
Feathers 1 1 0 0
Hunt 0 0 1 1
Run 0 0 0.88 0.88
Fly 0.965 1 0 0
Swim 0 0 0 0
Asia 0.87 0.92 1 1
Africa 0.11 0.5 0 0
Us 1 0.19 0.17 1
Europe 0 1 1 0
Mountainous 0.95 0.21 0 0.4
Plain 0 0 1 1
River 1 0.2 0.11 0
Jungle 0 0 1 0
Domestic 1 0.95 0.032 0
Carnivorous 0 0 1 1
Herbivorous 0 0.1 0 0
Frugivorous 1 1 0 0
Egg 1 1 1 0
Milk 0 0 0 1
Colour variation 0.3 0.47 0.31 0.21
Iranian
Tiger Lion Horse Zebra
Small 0 0 0 0
Medium 0 0 0.85 0.62
Big 0.88 1 1 0
2 leg 0 0 0 0
4 leg 1 1 1 1
Hair 1 1 1 1
Hoove 0 0 1 1
Mane 0 1 1 1
Feathers 0 0 0 0
Hunt 1 0.89 0 0
Run 1 1 0.84 0.87
Fly 0 0 0 0
Swim 0 0 0 0
Asia 1 1 1 1
Africa 1 0 0.416 0
Us 0.9 0.3 0 0.91
Europe 1 0.78 0.445 0
Mountainous 1 0.95 0.985 0
Plain 0 1 0.95 0.35
River 0 0 0 0.019
Jungle 1 1 0.5 1
Domestic 0 0 1 0
Carnivorous 1 1 0 0
Herbivorous 0 0 1 1
Frugivorous 0.21 0.0033 0 0.1
Egg 0 0 0 0
Milk 1 1 1 1
Colour variation 0.3 0 0.92 0.54
Animal: Zebra; Marbled Cat; Chinchilla Cat; Asian Golden Cat
Small 0 0 0 0
Medium 0.967 0.45 0.16 0.55
Big 0 0 0 0.01
2 leg 0 0 0 0
4 leg 1 1 1 1
Hair 1 0 0 1
Hoove 1 0 0 0
Mane 1 0 0 0
Feathers 0 0 0 0
Hunt 0 1 0 1
Run 1 0.99 0.88 1
Fly 0 0 0.96 0
Swim 0 1 1 0
Asia 0.78 0 0 1
Africa 0 1 1 0
Us 1 0 0 0.33
Europe 0.93 0.35 0.032 0
Mountainous 0 1 1 0
Plain 0.64 0.033 0 1
River 0 0 1 1
Jungle 1 0 0 1
Domestic 0 0.91 1 1
Carnivorous 0 1 1 1
Herbivorous 0.8 0 0 0
Frugivorous 0 0.2 0 0
Egg 0 0 0 0
Milk 1 1 1 1
Colour variation 0.75 0.21 0.3 0.4
Animal: Blackear Cat; Caucasian Black Grouse; Mallard Duck; Gadwall Duck
Small 0.818 1 0 0
Medium 0.8 0.03 0.28 0.16
Big 0 0 0 0
2 leg 0 1 1 1
4 leg 1 0 0 0
Hair 1 0 0 0
Hoove 0 0 0 0
Mane 0 0 0 0
Feathers 0 1 1 1
Hunt 1 0 0 0
Run 1 0 0 0
Fly 0 1 1 1
Swim 0 0 0 0
Asia 1 1 1 0.91
Africa 1 0 0.8 0.4
Us 0 0.5 1 0
Europe 1 0 0 1
Mountainous 1 0.99 1 0.45
Plain 0 0.5 0 0
River 0 0 1 0.3
Jungle 1 0.5 0 0
Domestic 0 0.95 0 0.5
Carnivorous 1 0 0 0
Herbivorous 0 0 0 0
Frugivorous 0 1 1 1
Egg 0 1 1 1
Milk 1 0 0 0
Colour variation 0.7 0.1 0.21 0.3
Animal: Wigeon Duck; Pintail Duck; Garganey Duck
Small 0 0 0
Medium 0.083 0.25 0.191
Big 0 0 0
2 leg 1 1 0
4 leg 0 0 1
Hair 0 0 0
Hoove 0 0 0
Mane 0 0 0
Feathers 1 1 0
Hunt 0 0 1
Run 0 0 0
Fly 0.95 1 0
Swim 0 0 0
Asia 1 1 1
Africa 0 1 1
Us 1 0 0
Europe 0 0 1
Mountainous 0 0 0
Plain 1 0 0
River 1 1 1
Jungle 0 0 1
Domestic 0 0.75 0
Carnivorous 0 0 0
Herbivorous 0 0 0
Frugivorous 1 1 1
Egg 1 1 1
Milk 0 0 0
Colour variation 0.24 0.5 0.4
Animal: Marbled Teal Duck; Black Bear; Grizzly Bear
Small 0.95 0 0
Medium 0 0 0
Big 0 0.01 0.28
2 leg 1 0 0
4 leg 0 1 1
Hair 0 1 1
Hoove 0 0 0
Mane 0 0 0
Feathers 1 0 0
Hunt 0 1 1
Run 0 1 1
Fly 0.95 0 0
Swim 0 0 0
Asia 1 0 1
Africa 0.033 0 0
Us 0 1 1
Europe 1 0 1
Mountainous 0 0 0
Plain 0.5 0 1
River 1 1 1
Jungle 1 1 1
Domestic 0 0 0
Carnivorous 0 1 1
Herbivorous 0 0 1
Frugivorous 1 0 1
Egg 1 1 0
Milk 0 0 1
Colour variation 0.3 0.4 0.13
Animal: Panda Bear; Orangutan Monkey; Chimpanzee Monkey
Small 0 0 0
Medium 0 0 0.65
Big 0.17 0.45 0.23
2 leg 0 1 1
4 leg 1 0 0
Hair 1 1 1
Hoove 0 0 0
Mane 0 0 0
Feathers 0 0 0
Hunt 1 1 1
Run 1 0 0
Fly 0 0 0
Swim 0 0 0
Asia 0.85 0.95 0
Africa 0 1 0.89
Us 0 0 0.045
Europe 1 1 0
Mountainous 1 0.21 0
Plain 0 0 0.9
River 0 0.89 0
Jungle 1 1 0.8
Domestic 0 0 0
Carnivorous 1 0 0.14
Herbivorous 1 1 0
Frugivorous 0 1 1
Egg 0 0 0
Milk 1 0.87 1
Colour variation 0.36 0.04 0.351