Essay on Binary Classification

Classification is a supervised learning process in which data are grouped according to known class tags. The task is to discover information that can be used to predict the class of a record whose class is unknown. In mammography image classification, it is used to assign images to different class tags depending on the image characteristics. The class is discrete and unordered; a continuous, floating-point target would indicate a numerical rather than a categorical objective.

Classification is divided into two types: i) binary classification and ii) multi-class classification. In binary classification, the data fall into two classes or categories; in image classification, for example, each image is classified as normal or abnormal. In multi-class classification, the data are grouped into more than two categories, such as normal, abnormal or fat.

The data set used in the classification process is divided into two disjoint sets, a training set and a test set. The training set contains records whose attributes are paired with a predefined class tag, which is normally derived from previous experience. Each such record can be represented as (a1, a2, …, an; c), where ai is an attribute and c is the class. Even when the class tag of a record is unknown, the class to which it belongs can be predicted. As shown in Figure 5.1, a classification model can be thought of as a black box that automatically assigns a class tag when given the attribute set of a record of unknown class.

Classification in data mining consists of two phases: 1) the training phase and 2) the testing phase. The training phase is the learning phase in which a training model is built by the class...

...
fixed group of properties or attributes.
• Default classes: target class tags have discrete output values (Boolean or multiclass).
• Adequate data: enough training cases should be available to train the model.

• Internal node: specifies a test on a single attribute.
• Leaf node: designates the value of the target attribute.
• Branch (edge): one outcome of the test on an attribute, i.e. a subdivision of that attribute's values.
• Path: a conjunction of tests that leads to the final decision.
• Decision trees perform classification by starting at the root of the tree and moving through it until a leaf node is reached.
• Selecting an attribute to test at each node: choose the attribute that is most useful for classifying the examples.
• Information gain: measures how well a given attribute separates the training examples according to their target classification. This measure is used to select the best attribute at each stage of tree construction.
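
The information-gain measure in the last item can be illustrated with a short sketch. It assumes entropy as the impurity measure, which the text does not name, and it uses four invented toy records with hypothetical attributes ("density", "margin") and a binary "label" of normal or abnormal. The function names are illustrative only and are not taken from any library.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class tags."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * log2(n / total) for n in counts.values())


def information_gain(records, attribute, target):
    """Reduction in entropy obtained by splitting records on one attribute."""
    base = entropy([r[target] for r in records])
    # Collect the class tags that fall under each value of the attribute.
    groups = {}
    for r in records:
        groups.setdefault(r[attribute], []).append(r[target])
    # Weighted average entropy of the subsets produced by the split.
    remainder = sum(len(g) / len(records) * entropy(g) for g in groups.values())
    return base - remainder


def best_attribute(records, attributes, target):
    """Pick the attribute with the highest information gain."""
    return max(attributes, key=lambda a: information_gain(records, a, target))


# Invented toy data (assumption): not taken from the essay or any real data set.
records = [
    {"density": "high", "margin": "irregular", "label": "abnormal"},
    {"density": "high", "margin": "smooth", "label": "normal"},
    {"density": "low", "margin": "irregular", "label": "abnormal"},
    {"density": "low", "margin": "smooth", "label": "normal"},
]

if __name__ == "__main__":
    for name in ("density", "margin"):
        print(name, information_gain(records, name, "label"))
    print("test first:", best_attribute(records, ["density", "margin"], "label"))
```

In this toy data, "margin" separates the two classes perfectly while "density" does not separate them at all, so a tree built this way would test "margin" at the root.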