Role of Confusion Matrix in Cyber Crime
The confusion matrix is a table used to evaluate the performance of a classification model on a given set of test data. It can only be computed when the true values for the test data are known. The matrix itself is easy to understand and easy to implement when testing an ML model.
What is Cyber Crime?
Cybercrime is criminal activity that either targets or uses a computer, a computer network or a networked device. Much of it is committed by cybercriminals or hackers who want to make money, and it is carried out by individuals as well as organizations. Some cybercriminals are organized, use advanced techniques and are highly technically skilled. Others are novice hackers.
What is a Confusion Matrix?
A confusion matrix is an N x N matrix used for evaluating the performance of a classification model, where N is the number of target classes. The matrix compares the actual target values with those predicted by the machine learning model. This gives us a holistic view of how well our classification model is performing and what kinds of errors it is making.
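To make the N x N idea concrete, here is a minimal sketch in plain Python. The function name `confusion_matrix`, the class labels and the sample data are all made up for illustration, and the layout assumes the convention used in this article (rows are predicted classes, columns are actual classes).

```python
from collections import Counter


def confusion_matrix(actual, predicted, labels):
    """Build an N x N matrix as a list of lists.

    Rows are predicted classes and columns are actual classes,
    both in the order given by `labels`.
    """
    # Count how often each (predicted, actual) pair occurs.
    counts = Counter(zip(predicted, actual))
    return [[counts[(p, a)] for a in labels] for p in labels]


# Hypothetical traffic labels, invented for this example only.
labels = ["benign", "malware", "phishing"]
actual = ["benign", "malware", "phishing", "benign", "malware", "benign"]
predicted = ["benign", "malware", "benign", "benign", "phishing", "malware"]

matrix = confusion_matrix(actual, predicted, labels)
for label, row in zip(labels, matrix):
    print(f"predicted {label:>8}: {row}")
```

The diagonal cells hold the correct predictions; every other cell is one specific kind of mistake.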
For a binary classification problem, we would have a 2 x 2 matrix with 4 values, laid out as follows:
The target variable has two values: Positive or Negative.
The columns represent the actual values of the target variable.
The rows represent the predicted values of the target variable.
Understanding TP, TN, FP and FN in a confusion matrix
True Positive (TP) :
- The predicted value matches the actual value.
- The actual value was positive and the model predicted a positive value.
True Negative (TN) :
- The predicted value matches the actual value.
- The actual value was negative and the model predicted a negative value.
False Positive (FP) :
- The predicted value does not match the actual value.
- The actual value was negative but the model predicted a positive value.
- Also known as a Type I error.
False Negative (FN) :
- The predicted value does not match the actual value.
- The actual value was positive but the model predicted a negative value.
- Also known as a Type II error.
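The four definitions above can be sketched as a simple counting loop. This is only an illustration: the function name `binary_counts`, the "attack"/"normal" labels and the sample data are assumptions, not part of any particular library.

```python
def binary_counts(actual, predicted, positive="attack"):
    """Count TP, TN, FP and FN, treating `positive` as the positive class."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1  # predicted positive, actually positive
        elif p != positive and a != positive:
            tn += 1  # predicted negative, actually negative
        elif p == positive:
            fp += 1  # predicted positive, actually negative (Type I error)
        else:
            fn += 1  # predicted negative, actually positive (Type II error)
    return {"TP": tp, "TN": tn, "FP": fp, "FN": fn}


# Hypothetical labels, invented for this example only.
actual = ["attack", "normal", "normal", "attack", "normal", "attack"]
predicted = ["attack", "normal", "attack", "normal", "normal", "attack"]

counts = binary_counts(actual, predicted)
print(counts)
```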
What to learn from a Confusion Matrix?
- There are two possible predicted classes: “yes” and “no”. If we were predicting the presence of a disease, for example, “yes” would mean the patient has the disease, and “no” would mean the patient doesn’t have the disease.
- The classifier made a total of 165 predictions (e.g., 165 patients were being tested for the presence of that disease).
- Out of those 165 cases, the classifier predicted “yes” 110 times, and “no” 55 times.
- In reality, 105 patients in the sample have the disease, and 60 patients do not.
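The totals in this example are row and column sums of the matrix, so they constrain the four cells but do not fix them on their own. The sketch below checks that the totals are consistent and shows how the remaining cells follow once one cell is known. The value `tp = 100` is purely an assumption chosen for illustration; it is not given in the example above.

```python
total = 165
predicted_yes, predicted_no = 110, 55
actual_yes, actual_no = 105, 60

# Both ways of splitting the 165 cases must add up to the same total.
assert predicted_yes + predicted_no == total
assert actual_yes + actual_no == total


def fill_cells(tp):
    """Derive FP, FN and TN from the totals above and a given TP count."""
    fp = predicted_yes - tp  # predicted "yes" but actually "no"
    fn = actual_yes - tp     # actually "yes" but predicted "no"
    tn = total - tp - fp - fn
    return {"TP": tp, "FP": fp, "FN": fn, "TN": tn}


# ASSUMPTION: tp = 100 is hypothetical, picked only to show the arithmetic.
cells = fill_cells(100)
print(cells)
```

Under that assumption the other cells come out as FP = 10, FN = 5 and TN = 50, which add back up to the 165 predictions.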
Why do we need a Confusion Matrix?
- A confusion matrix not only shows how many errors your classifier is making but also which types of errors they are.
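- Every column of the confusion matrix represents the instances of an actual class.
- Each row of the confusion matrix represents the instances of a predicted class. (Some tools, such as scikit-learn, use the transposed layout, with actual classes in rows and predicted classes in columns.)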
- It shows how any classification model is confused when it makes predictions.
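One way to see where a model is "confused" is to list the off-diagonal cells, largest first. The sketch below assumes the layout described earlier in this article (rows are predicted classes, columns are actual classes); the helper name `confused_pairs` and all the counts are hypothetical.

```python
def confused_pairs(matrix, labels):
    """Return (count, actual, predicted) for every non-zero off-diagonal cell."""
    pairs = []
    for i, predicted in enumerate(labels):   # rows: predicted class
        for j, actual in enumerate(labels):  # columns: actual class
            if i != j and matrix[i][j] > 0:
                pairs.append((matrix[i][j], actual, predicted))
    return sorted(pairs, reverse=True)


# Hypothetical counts, invented for this example only.
labels = ["benign", "malware", "phishing"]
matrix = [
    [50, 2, 7],   # predicted benign
    [3, 30, 1],   # predicted malware
    [0, 4, 20],   # predicted phishing
]

pairs = confused_pairs(matrix, labels)
for count, actual, predicted in pairs:
    print(f"{count} x actual {actual} predicted as {predicted}")
```

With these made-up numbers, the most common mistake is actual phishing being predicted as benign.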
Dealing with False Positive and False Negative
False positives are mislabeled security alerts that indicate a threat when, in actuality, there isn’t one. By default, most security teams are conditioned to ignore false positives. Unfortunately, this practice of ignoring security alerts can create alert fatigue and cause your team to miss actual threats. These false alarms account for roughly 40% of the alerts cybersecurity teams receive on a daily basis, and at large organizations they can be overwhelming and a huge waste of time.
False negatives are uncaught cyber threats that security tooling overlooks because they’re dormant or highly sophisticated, or because the security infrastructure in place lacks the technological ability to detect them.
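Both kinds of error can be turned into simple rates from the four confusion-matrix counts. The sketch below is only illustrative: the function name `alert_rates` and the counts passed to it are hypothetical, and it assumes none of the denominators is zero.

```python
def alert_rates(tp, tn, fp, fn):
    """Summarise alert quality from the four confusion-matrix counts.

    Assumes every denominator is non-zero.
    """
    return {
        # Share of raised alerts that turned out to be false alarms.
        "false_alarm_share": fp / (tp + fp),
        # Share of harmless events that still triggered an alert.
        "false_positive_rate": fp / (fp + tn),
        # Share of real threats that raised no alert.
        "false_negative_rate": fn / (fn + tp),
    }


# Hypothetical counts, invented for this example only.
rates = alert_rates(tp=60, tn=885, fp=40, fn=15)
for name, value in rates.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts, 40 of the 100 alerts are false alarms (a share of 0.4), while 15 of the 75 real threats go undetected (a false negative rate of 0.2). Which of the two matters more depends on the cost of a missed attack compared with the cost of analyst time.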
Conclusion
A confusion matrix is a tabular summary of the number of correct and incorrect predictions made by a classifier. It is used to measure the performance of a classification model.
No security is 100% secure, but having a solid backup is the best security!
Thank You !!!