Hunting down cybercriminals with AI and the ChatGPT algorithm



The further development of neural networks, AI and machine learning is becoming a real "game changer". The chatbot ChatGPT is currently making this more than clear. Sophos AI uses the advanced GPT technology to realize even better security applications.

Neural networks keep improving as they scale. ChatGPT is based on the GPT-3.5 language model, which also comes from OpenAI. Sophos AI combines this technology, at supercomputing scale, with machine learning to build better security applications. In cybersecurity in particular, the large models perform far better than earlier, smaller ones. This is shown by several successful test series that the Sophos experts ran with GPT-3 last year.

GPT-3 offers enormous potential for IT security

GPT-3 is a pre-trained, large-scale language model that is groundbreaking in its flexibility and accuracy. The creative task for the Sophos AI team is to work out where and how this technology can be used in the fight against cybercrime. As long as input and output data can be converted to text, the possible uses of GPT-3 in this area are nearly endless. For example, GPT-3 can be asked to write working Python code from a function description, or to build a classification application from just a few examples.

Sophos AI wants to tap the potential of GPT-3

Sophos AI experts see enormous potential in GPT-3. Obtaining unlabeled data is relatively easy; creating a labeled data set for training a traditional machine learning model, however, is usually difficult and very time-consuming. Traditional machine learning models that are trained on only a few examples often suffer from overfitting. In other words, they do not generalize well to previously unseen examples. With GPT-3 "few-shot learning", however, Sophos AI needs only a few annotated training examples and still outperforms conventional models. Because GPT-3 has been extensively pre-trained in a self-supervised way, it has been shown to perform well on several classification problems with just a few examples.

Two concrete application examples

Spam detection

Prompt for a spam classification task (Image: Sophos).

It is challenging to train a powerful spam classification model with only four legitimate (non-spam) and four spam samples. Traditional classification models often require a large training data set to learn enough signals. However, since GPT-3 is a language model trained on a large text data set, it can recognize the intent of a classification task and solve it with a few examples.

When learning with a few examples, an important step is prompt engineering, which involves designing the format of the input data for text completion tasks. The picture shows such a spam classification task.

The prompt consists of an instruction and a few examples with their labels, which together form the support set, followed by a query example in the last section. GPT-3 is then asked to complete the text, and its answer is taken as the label prediction for the query.
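The prompt layout described above can be sketched in a few lines of Python. This is only an illustration, not the prompt Sophos used: the function name `build_few_shot_prompt`, the `Message:`/`Label:` field names, the label strings, and the sample messages are all assumptions made for this example.

```python
from typing import List, Tuple


def build_few_shot_prompt(instruction: str,
                          support_set: List[Tuple[str, str]],
                          query: str) -> str:
    """Join an instruction, labeled examples, and an unlabeled query into one prompt."""
    parts = [instruction.strip(), ""]
    for text, label in support_set:
        parts.append(f"Message: {text}")
        parts.append(f"Label: {label}")
        parts.append("")
    # The query repeats the example layout but leaves the label empty,
    # so the model's text completion is read as the predicted label.
    parts.append(f"Message: {query}")
    parts.append("Label:")
    return "\n".join(parts)


# Invented placeholder messages, not data from the article.
support = [
    ("You won a free cruise, click here to claim", "spam"),
    ("Can we move our meeting to 3 pm?", "not spam"),
]
prompt = build_few_shot_prompt(
    "Classify each message as spam or not spam.",
    support,
    "Cheap loans approved instantly, reply now",
)
print(prompt)
```

The resulting string would be sent to a text completion model; the call to the model itself is deliberately left out because the article does not describe it.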

A comparison of the classification results shows that few-shot learning with GPT-3 clearly outperforms traditional ML models such as logistic regression and random forest. This is because few-shot learning uses the context information of the given examples and chooses the label of the most similar example as the output. As a result, GPT-3 needs no retraining: a powerful classification model can be built with simple prompt engineering alone.
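The intuition of "choosing the label of the most similar example" can be made concrete with a toy analogy. The sketch below is not how GPT-3 works internally; it simply picks the label of the support example whose bag of words is closest to the query by cosine similarity. All names and sample messages are assumptions made for illustration.

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def bag_of_words(text: str) -> Counter:
    """Count lowercase word tokens in a text."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def label_of_most_similar(query: str, support_set: List[Tuple[str, str]]) -> str:
    """Return the label of the support example most similar to the query."""
    query_vector = bag_of_words(query)
    _, best_label = max(
        support_set,
        key=lambda example: cosine_similarity(query_vector, bag_of_words(example[0])),
    )
    return best_label


# Invented placeholder messages, not data from the article.
support = [
    ("Claim your free prize now", "spam"),
    ("Lunch meeting moved to noon", "not spam"),
]
print(label_of_most_similar("Free prize waiting, claim it now", support))
```

A large language model compares examples using learned context rather than raw word overlap, which is why it can cope with only a handful of examples where a word-count approach like this one would quickly break down.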

Readable explanations for hard-to-decipher code

Reverse engineering command lines is a difficult and time-consuming task, even for security experts. "Living-off-the-land" commands are even harder to understand because they are long and contain strings that are difficult to parse. With this technique, attackers use standard apps and standard processes on their victims' computers to camouflage phishing activities, for example. Just as GPT-3 can write working Python or Java code from a given description of the code, it can also work in the other direction and translate a command line into an understandable description. It is also possible to ask GPT-3 to generate several descriptions for one command line; the output descriptions are returned with token-level probabilities that can be used to select the best candidate. Sophos AI's approach to selecting the best description from multiple variants is a back-translation method: it chooses the description from which the command line most similar to the input command line can be generated.
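The back-translation selection can be sketched as follows. This is a minimal illustration under stated assumptions, not Sophos AI's implementation: `back_translate` stands in for a model call that turns a description back into a command line, it is stubbed here with a hard-coded dictionary, and `difflib.SequenceMatcher` is used as a simple stand-in for whatever similarity measure is actually applied. The command line and descriptions are invented.

```python
from difflib import SequenceMatcher
from typing import Callable, List


def select_by_back_translation(command_line: str,
                               candidates: List[str],
                               back_translate: Callable[[str], str]) -> str:
    """Pick the description whose regenerated command line best matches the input."""
    def score(description: str) -> float:
        regenerated = back_translate(description)
        # Character-level similarity in [0, 1]; 1.0 means identical strings.
        return SequenceMatcher(None, command_line, regenerated).ratio()

    return max(candidates, key=score)


# Invented example; the dictionary is a hypothetical stub for a model that
# generates a command line from a natural-language description.
original = "certutil -urlcache -f http://example.com/a.exe a.exe"
stub_model = {
    "Downloads a file from a URL using certutil":
        "certutil -urlcache -f http://example.com/a.exe a.exe",
    "Lists the files in a directory":
        "dir /b",
}
best = select_by_back_translation(original, list(stub_model), stub_model.__getitem__)
print(best)
```

The design idea is that a description is only as good as the command line it can reproduce, so the round trip acts as a check that needs no human-labeled reference descriptions.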

GPT-3: Cybersecurity Milestone

"GPT-3 is a cybersecurity milestone because it can detect spam and analyze complex command lines with few examples," say the experts of the Sophos AI team. "The flexibility of GPT-3 is very well suited to combating ever-evolving cyber threats. We assume that even more difficult cybersecurity problems can soon be addressed with correspondingly larger neural network models."


 


About Sophos

More than 100 million users in 150 countries trust Sophos. We offer the best protection against complex IT threats and data loss. Our comprehensive security solutions are easy to deploy, use and manage. They offer the lowest total cost of ownership in the industry. Sophos offers award-winning encryption solutions as well as security solutions for endpoints, networks, mobile devices, email and the web. In addition, there is support from SophosLabs, our own worldwide network of analysis centers. The Sophos headquarters are in Boston, USA and Oxford, UK.


 
