ExtraHop's machine learning dataset helps detect and mitigate malware and botnet operations faster. The dataset, with 16 million rows, will soon be available as open source.
ExtraHop, a leader in cloud-native Network Detection and Response (NDR), today announced that it is open sourcing its massive 16 million-row dataset - one of the most robust on the market - to support the mitigation of domain generation algorithms (DGAs). The aim is to level the playing field for defenders and enable companies of all sizes to better secure their organizations by strengthening defenses against malware and botnets.
New challenges arise every day
With the cybersecurity skills gap widening (up 26% in the last year) and resources dwindling, the cyber landscape is evolving rapidly. As new threats emerge, open source research and datasets offer one way to address the challenges security teams face every day.
“The security challenges we face are vast and dynamic, and with this initiative we are democratizing the tools needed for threat detection for security teams of all sizes, backgrounds and industries,” said Raja Mukerji, Chief Scientist and co-founder of ExtraHop. “Collaboration in the cybersecurity community is invaluable – only by joining together to share our best work can we stay on the offensive and put attackers at a disadvantage. Our research will transform the community and we encourage other teams to publish their own findings that will benefit the entire industry.”
Goal: expand cooperation in the area of cybersecurity
In an effort to encourage collaboration across the industry, ExtraHop is publishing its DGA detector dataset, consisting of more than 16 million rows of data, on GitHub to help security teams detect malicious activity in their environments before it becomes a business problem.
DGAs are used by threat actors to maintain control of an organization's environment once they have penetrated a network, making attacks difficult to detect and stop. Originally developed for ExtraHop's award-winning NDR platform, Reveal(x), this research can now be used by any security researcher to build their own machine learning (ML) classification model that identifies DGAs, so that attacks can be repelled faster and more precisely. Since its implementation in Reveal(x), the ExtraHop DGA model has achieved greater than 98% accuracy.
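As a rough illustration of what building such a classifier can involve, the following Python sketch computes a few lexical features of a domain name that are commonly used as inputs to DGA detection models. It is a minimal sketch under stated assumptions, not ExtraHop's model: the function names `shannon_entropy` and `domain_features` and the choice of features are hypothetical, and nothing here assumes the format of the published dataset.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of a string in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def domain_features(domain: str) -> dict:
    """Compute simple lexical features of the left-most label of a domain.

    These features are illustrative assumptions; a real model would be
    trained and evaluated on labeled data.
    """
    label = domain.lower().strip(".").split(".")[0]
    length = len(label)
    if length == 0:
        return {"length": 0, "entropy": 0.0, "digit_ratio": 0.0, "vowel_ratio": 0.0}
    return {
        "length": length,
        "entropy": shannon_entropy(label),
        "digit_ratio": sum(c.isdigit() for c in label) / length,
        "vowel_ratio": sum(c in "aeiou" for c in label) / length,
    }


if __name__ == "__main__":
    # Compare a dictionary-like name with a random-looking one.
    for name in ("example.com", "x7k2q9.net"):
        print(name, domain_features(name))
```

Algorithmically generated names often show higher character entropy, more digits and fewer vowels than human-chosen names, so feature vectors like these could serve as input to a classifier trained on labeled rows.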
Access for everyone
“With threat actors having the ability to operate undetected and these types of attacks increasing, DGAs are now increasingly seen as a major threat to organizations,” says Todd Kemmerling, Director of Data Science at ExtraHop. “As we began developing a model to detect DGAs, it became clear that there was a lack of public datasets accessible to security teams with a wide range of resources. With this data set, we close that gap and give every security team access to the critical data they need to quickly detect DGAs.”
More at ExtraHop.com
About ExtraHop
ExtraHop is dedicated to helping businesses with security that cannot be undermined, outwitted or compromised. The dynamic cyber defense platform Reveal(x) 360 helps companies identify complex threats and react to them before they put the company at risk. We apply cloud-scale AI to petabytes of traffic per day and conduct line-rate decryption and behavioral analysis for all infrastructures, workloads and data on the fly. With the complete transparency of ExtraHop, companies can quickly identify malicious behavior, hunt down advanced threats and reliably conduct a forensic investigation of every incident.