Access log classifier for detecting cybersecurity incidents

Miguel Pérez del Castillo; Gastón Rial; Rafael Sotelo; Máximo Gurméndez

doi:10.36561/ING.18.7

Authors

Miguel Pérez del Castillo Universidad de Montevideo, Uruguay https://orcid.org/0000-0001-5500-8892
Gastón Rial Universidad de Montevideo, Uruguay https://orcid.org/0000-0001-9174-5937
Rafael Sotelo Universidad de Montevideo, Uruguay https://orcid.org/0000-0002-4034-3177
Máximo Gurméndez Universidad de Montevideo, Uruguay https://orcid.org/0000-0001-6435-0200

DOI:

https://doi.org/10.36561/ING.18.7

Keywords:

Filtering, Cybersecurity response, CLF, Machine learning

Abstract

The number of attacks on government websites has escalated in recent years. To assist cybersecurity analysts in the detection process, this document proposes applying machine learning techniques to web server access logs. The overall objective is to shorten detection time using a customized classifier that selects traces corresponding to anomalous activity. Specifically, web server access logs in combined log format (CLF), encoded as real-valued vectors, are the input to a weighted k-nearest neighbors (K-NN) model. The methodology was tested on datasets and premises provided by CERTuy (National Cybersecurity Event Response Team) and the SOC (Security Operations Center). According to the evaluations, 82% of cybersecurity offenses were detected, 80% of normal behavior was filtered out, and detection time was reduced from 13 hours to 15 minutes.
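The pipeline summarized in the abstract (log line, then real-valued vector, then weighted K-NN) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the four features in `encode`, the inverse-distance weighting, and the names `LOG_PATTERN`, `encode`, and `weighted_knn` are assumptions chosen for illustration, since the abstract does not specify the encoding or the weights.

```python
import math
import re
from collections import defaultdict

# Combined log format:
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def encode(line):
    """Map one access log line to a real-valued vector.

    The features are hypothetical, not taken from the paper:
    path length, count of unusual path characters, status class,
    and log-scaled response size.
    """
    match = LOG_PATTERN.match(line)
    if match is None:
        raise ValueError("line is not in combined log format")
    path = match.group("path")
    size = 0 if match.group("size") == "-" else int(match.group("size"))
    unusual = sum(1 for c in path if not c.isalnum() and c not in "/._-")
    return [
        float(len(path)),
        float(unusual),
        float(int(match.group("status")) // 100),
        math.log1p(size),
    ]


def weighted_knn(query, samples, labels, k=3):
    """Distance-weighted K-NN: each of the k nearest neighbors
    votes for its label with weight 1 / distance (assumed scheme)."""
    nearest = sorted(
        (math.dist(query, sample), label)
        for sample, label in zip(samples, labels)
    )[:k]
    votes = defaultdict(float)
    for distance, label in nearest:
        votes[label] += 1.0 / (distance + 1e-9)  # epsilon avoids division by zero
    return max(votes, key=votes.get)
```

With inverse-distance weights, a single very close neighbor can outvote several distant ones, which is what separates a weighted K-NN from a plain majority vote. In practice the features would also need scaling so that no single dimension dominates the Euclidean distance.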