Classification Methods

What are classification tasks?

According to Wikipedia, classification is "identifying which of a set of categories (sub-populations) an observation (or observations) belongs to. Examples are assigning a given email to the 'spam' or 'non-spam' class, and assigning a diagnosis to a given patient based on observed characteristics of the patient (sex, blood pressure, presence or absence of certain symptoms, etc.)." Classification algorithms in machine learning learn from labeled training data in order to predict the probability that a new observation falls into each of the predetermined categories. One of the most common applications is filtering email into "spam" and "non-spam", as done by today's top email service providers.

Use cases in classification

Customer behavior prediction: Customers can be classified into categories based on their buying patterns, web store browsing patterns, and so on. For example, a classification model can be used to determine whether a customer is likely to purchase more items. If the model predicts a high likelihood of further purchases, you might send that customer promotional offers and discounts. If it predicts that the customer will probably stop purchasing soon, you can flag them for a later retention effort and keep their information readily available.
Spam filtering: An algorithm is trained to recognize spam by learning the characteristics that separate spam from non-spam email. The classification model can be seen as a function that maps an email's text to a label, either spam or non-spam. Algorithms such as Naive Bayes and Support Vector Machines can be used for this. Once the model is trained, it can be used to label new incoming emails as spam or non-spam. The picture below depicts the spam classification model as a spam classifier.
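
The spam filtering use case can be sketched in a few lines with scikit-learn. This is only an illustration under stated assumptions: the six training emails are invented, the bag-of-words features and `MultinomialNB` model are one reasonable choice among many, and nothing here reflects how any particular email provider or Xceed implements filtering.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Tiny invented training set; a real filter needs far more labeled emails.
train_texts = [
    "win a free prize now",
    "free money click now",
    "claim your free prize",
    "meeting agenda for monday",
    "project report attached",
    "lunch on monday with the team",
]
train_labels = ["spam", "spam", "spam", "non-spam", "non-spam", "non-spam"]

# Word counts as features, Naive Bayes as the classifier.
spam_classifier = make_pipeline(CountVectorizer(), MultinomialNB())
spam_classifier.fit(train_texts, train_labels)

# The trained model maps new email text to a label.
new_emails = ["free prize click now", "monday project meeting"]
predictions = spam_classifier.predict(new_emails)
print(list(predictions))
```

Calling `spam_classifier.predict_proba(new_emails)` instead returns the estimated probability of each class, which is useful when the spam threshold needs tuning.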

Algorithms/Estimators used in Classification

scikit-learn (sklearn) provides dozens of algorithms for supervised classification tasks. Practitioners across data science and ML tend to agree that only a few of them are production worthy. Production worthy here means that, when you deploy a model, you understand the underlying mechanisms of the algorithm you use. Most classification algorithms are quite easy to understand, but a few demand more technical depth. The list below shows the estimators that are widely used in industry and have had a good impact.

Estimators supported in Xceed ML

Xceed supports the widely used estimators and a few more. They can be used without in-depth knowledge of each algorithm, because the workflow stays the same across most of them.

The picture below lists all the classification estimators Xceed supports.

Evaluating a Classification Model

Accuracy

Accuracy measures how often the classifier predicts correctly. It is the ratio of the number of correct predictions to the total number of predictions.
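
As a small worked example, accuracy can be computed by hand and checked against scikit-learn's `accuracy_score`. The labels below are invented for illustration.

```python
from sklearn.metrics import accuracy_score

# Invented true labels and predictions for eight observations.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# Correct predictions divided by total predictions.
correct = sum(t == p for t, p in zip(y_true, y_pred))
manual_accuracy = correct / len(y_true)

sklearn_accuracy = accuracy_score(y_true, y_pred)
print(manual_accuracy, sklearn_accuracy)  # 0.75 0.75
```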

Log loss

Log loss can be used when the output of the classifier is a numeric probability instead of a class label. It is the negative average log-likelihood of the true labels under the predicted probabilities, so it penalizes confident wrong predictions heavily. Lower values are better.

F1 macro

The macro F1-score (short for macro-averaged F1 score) is used to assess models on problems with multiple classes or multiple binary labels. It computes the F1-score of each class separately and takes their unweighted mean, so every class counts equally regardless of its size. If you want to select a model based on a balance between precision and recall, F1-scores are worth checking.
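
A minimal sketch of macro averaging, using invented labels for an imbalanced three-class problem: the per-class F1-scores are computed first and then averaged with equal weight.

```python
from sklearn.metrics import f1_score

# Invented labels: class 0 is frequent, classes 1 and 2 are rare.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 0, 0, 1, 1, 0, 2, 2]

# average=None returns one F1-score per class, in sorted label order.
per_class_f1 = f1_score(y_true, y_pred, average=None)
manual_macro_f1 = per_class_f1.mean()

macro_f1 = f1_score(y_true, y_pred, average="macro")
print(per_class_f1, macro_f1)
```

Here the weak score on the rare class 1 pulls the macro average down as much as a weak score on the frequent class 0 would.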

F1 micro

The micro F1-score (short for micro-averaged F1 score) is used to assess multi-class and multi-label problems. It pools the true positives, false positives, and false negatives of all classes and computes a single F1-score from those totals, so frequent classes carry more weight.
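
Micro averaging can be illustrated by pooling the counts by hand. The labels are invented; the manual result is compared with scikit-learn's `f1_score(..., average="micro")`.

```python
from sklearn.metrics import f1_score

# Invented labels for a three-class, single-label problem.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 0, 0, 1, 1, 0, 2, 2]

# Pool true positives, false positives, and false negatives over all classes.
tp = fp = fn = 0
for label in sorted(set(y_true)):
    for t, p in zip(y_true, y_pred):
        tp += (t == label and p == label)
        fp += (t != label and p == label)
        fn += (t == label and p != label)

manual_micro_f1 = 2 * tp / (2 * tp + fp + fn)
micro_f1 = f1_score(y_true, y_pred, average="micro")
print(manual_micro_f1, micro_f1)  # 0.8 0.8
```

When every observation has exactly one label, as here, the micro F1-score equals plain accuracy (8 correct out of 10).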

ROC_AUC

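A receiver operating characteristic (ROC) curve is a graphical plot that illustrates the performance of a binary classifier as its discrimination threshold is varied. ROC AUC is the area under that curve: 1.0 means every positive is ranked above every negative, and 0.5 is what random scoring achieves.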
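
One way to read ROC AUC is as the fraction of (positive, negative) pairs in which the positive example receives the higher score. The sketch below checks that reading against scikit-learn's `roc_auc_score` on four invented scores.

```python
from sklearn.metrics import roc_auc_score

y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]  # invented classifier scores

auc = roc_auc_score(y_true, scores)

# Pairwise view: count pairs where the positive outranks the negative.
# (Ties would count as half a win; this toy data has none.)
positives = [s for t, s in zip(y_true, scores) if t == 1]
negatives = [s for t, s in zip(y_true, scores) if t == 0]
wins = sum(p > n for p in positives for n in negatives)
pairwise_auc = wins / (len(positives) * len(negatives))

print(auc, pairwise_auc)  # 0.75 0.75
```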

Xceed supports all of these evaluation criteria under the ML explainer tab.

Explaining a Classification Model

After creating the model and checking the evaluation criteria, you come to the explainability/interpretability part of the process, where the following charts are widely used.

Confusion Matrix

A confusion matrix is a tabular summary of the number of correct and incorrect predictions made by a classifier. It can be used to evaluate the performance of a classification model through the calculation of performance metrics like accuracy, precision, recall, and F1-score.
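
As an illustration with invented labels, the sketch below builds a binary confusion matrix with scikit-learn and derives accuracy, precision, recall, and F1-score from its four cells. For labels `[0, 1]`, `confusion_matrix` puts true classes in rows and predicted classes in columns.

```python
from sklearn.metrics import confusion_matrix

y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# Layout for labels [0, 1]: [[TN, FP], [FN, TP]]
matrix = confusion_matrix(y_true, y_pred, labels=[0, 1])
tn, fp, fn, tp = matrix.ravel()

accuracy = (tp + tn) / matrix.sum()
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(matrix)
print(accuracy, precision, recall, f1)
```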

ROC Curve

The ROC curve is created by plotting the TPR (true positive rate) against the FPR (false positive rate) at various threshold settings.
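
The points of the curve can be obtained with scikit-learn's `roc_curve`, which returns one (FPR, TPR) pair per threshold. The scores below are invented, and plotting is left out to keep the sketch short.

```python
from sklearn.metrics import roc_curve

y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]  # invented classifier scores

# Each threshold yields one point; an observation is predicted positive
# when its score is at or above the threshold.
fpr, tpr, thresholds = roc_curve(y_true, scores)

for f, t, th in zip(fpr, tpr, thresholds):
    print(f"threshold={th}: FPR={f}, TPR={t}")
```

Passing `fpr` and `tpr` to a plotting library as x and y values draws the curve; the diagonal from (0, 0) to (1, 1) corresponds to random scoring.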

Precision Vs Recall

A precision-recall curve shows the relationship between precision (= positive predictive value) and recall (= sensitivity) for every possible cut-off.
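
A minimal sketch of how the points of a precision-recall curve can be computed with scikit-learn's `precision_recall_curve`, again on invented scores. The exact number of points returned can differ between scikit-learn versions, so no output is shown.

```python
from sklearn.metrics import precision_recall_curve

y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]  # invented classifier scores

# precision[i] and recall[i] describe predictions with score >= thresholds[i].
# A final point (precision=1, recall=0) is appended without a threshold.
precision, recall, thresholds = precision_recall_curve(y_true, scores)

for p, r, th in zip(precision, recall, thresholds):
    print(f"cut-off={th}: precision={p:.2f}, recall={r:.2f}")
```

Raising the cut-off usually trades recall for precision, which is the relationship the curve makes visible.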

Cumulative Gain Curve

The cumulative gains curve assesses the performance of the model and compares it with random selection. It shows the percentage of targets reached when considering a certain percentage of the population, ranked from the highest to the lowest predicted probability of being a target.
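
The curve can be computed directly with NumPy. The function name `cumulative_gain` and the data are hypothetical, chosen only for illustration: observations are sorted by predicted probability, and the running share of targets found is compared with the share of the population considered.

```python
import numpy as np

def cumulative_gain(y_true, scores):
    """Return (fraction of population considered, fraction of targets reached)."""
    y_true = np.asarray(y_true)
    order = np.argsort(scores)[::-1]          # highest score first
    sorted_targets = y_true[order]
    gains = np.cumsum(sorted_targets) / sorted_targets.sum()
    population = np.arange(1, len(y_true) + 1) / len(y_true)
    return population, gains

# Invented data: 3 targets among 10 customers.
y_true = [1, 0, 1, 0, 0, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]

population, gains = cumulative_gain(y_true, scores)

# Targets reached in the top 30% of the population, by model ranking.
top_30_gain = gains[2]
print(round(top_30_gain, 3))
```

A random pick would reach about 30% of the targets in the top 30% of the population, so the gap between `gains` and `population` shows how much the model's ranking helps.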

Xceed supports all of these explainability methods under the ML explainer tab.
