Abstract:
Rated a high-risk cyber-attack type, the Advanced Persistent Threat (APT) has become a cause
for concern among cybersecurity experts. Detecting the presence of an APT in order to mitigate
the attack remains a major challenge, as successful attacks on large organizations still abound.
Our approach combines static rule-based anomaly detection through pattern recognition with a
machine learning-based classification technique to mitigate APT. (1) The pattern-based rules
are derived using statistical analysis, chiefly the Kruskal-Wallis test for association.
A Packet Capture (PCAP) dataset with 1,047,908 packet header records is analyzed to
identify malicious versus normal traffic patterns. 90% of the attack traffic
uses unassigned and dynamic/private ports and carries data sizes of between 0 and 58 bytes.
(2) The machine learning approach selects its algorithm by comparing the accuracy
of four candidates: K-Nearest Neighbor (KNN), Support Vector Machine (SVM),
Decision Tree and Random Forest, which achieve accuracies of 99.74, 87.11, 99.84 and 99.90
percent respectively. A load-balancing approach and a modified entropy formula were applied to
Random Forest. The model combines the two techniques, giving it a minimum accuracy of
99.95% with the added capability of detecting false positives. The results of both methods are
matched in order to make a final decision. This approach can be easily adopted, as the only data
required is packet header data, which is visible in every network, and it provides results with
commendable accuracy while greatly reducing the challenge of false positives.
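
As an illustration of how a static rule and a classifier verdict might be matched, the following minimal Python sketch may help. It is not the authors' implementation. The names `PacketHeader`, `rule_flags`, `final_decision` and the small `ASSIGNED_PORTS` set are hypothetical. Requiring both the port condition and the size condition, and sending disagreements to review, are assumptions made here for illustration. Only the 0 to 58 byte range and the port categories come from the abstract.

```python
from dataclasses import dataclass

# IANA dynamic/private port range (49152-65535).
DYNAMIC_PORTS = range(49152, 65536)

# Hypothetical stand-in for the IANA registry of assigned ports;
# a real deployment would load the full registry instead.
ASSIGNED_PORTS = frozenset({20, 21, 22, 25, 53, 80, 110, 143, 443, 3306, 3389})

# Upper bound of the data-size range reported for attack traffic.
MAX_SUSPICIOUS_PAYLOAD = 58


@dataclass(frozen=True)
class PacketHeader:
    dst_port: int      # destination port taken from the packet header
    payload_len: int   # data size in bytes


def rule_flags(pkt: PacketHeader) -> bool:
    """Static rule: unusual port AND small data size (assumed conjunction)."""
    unusual_port = pkt.dst_port in DYNAMIC_PORTS or pkt.dst_port not in ASSIGNED_PORTS
    small_payload = 0 <= pkt.payload_len <= MAX_SUSPICIOUS_PAYLOAD
    return unusual_port and small_payload


def final_decision(rule_hit: bool, model_hit: bool) -> str:
    """Match the two verdicts; disagreement is sent to review (assumption)."""
    if rule_hit and model_hit:
        return "attack"
    if not rule_hit and not model_hit:
        return "normal"
    return "review"
```

Here `model_hit` stands for the output of whichever trained classifier is used, for example a Random Forest. Treating a disagreement as a case for review is one simple way to surface possible false positives. The abstract does not specify the actual matching logic.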