Beyond ChatGPT: How AI Agents are Shaping the Future of Cyber Defense and Offense


Figure1

Sequoia - What's next for AI agentic workflows ft. Andrew Ng of AI Fund (youtube.com)

In the rapidly evolving landscape of artificial intelligence, the spotlight is shifting from traditional models like ChatGPT to the next frontier: AI agents.

These autonomous systems are not just smart—they can perform complex tasks with minimal human intervention, learning, adapting, and collaborating in ways that were once the realm of science fiction.

Today, we stand on the brink of a new era where Agentic AI holds the potential to transform cybersecurity. Inspired by the visionary work of Andrew Ng, a pioneer in AI and co-founder of Google Brain, this blog explores the profound impact and future potential of AI agents.

Let us dive into the intricacies of LLM-based agents versus agentic workflows and uncover why the future of AI is agentic.

First off, let us clarify the difference between LLM-based agents and agentic workflows.

Andrew Ng explained this at Sequoia Capital's AI Ascent conference in March this year. I will also leave a link to that session below.

Source: What's next for AI agentic workflows ft. Andrew Ng of AI Fund - YouTube

  • Large Language Model (LLM)-based agents typically generate responses to prompts in a single shot. For example, you might ask an LLM to write an essay, and it will produce a complete draft in one go. However, this approach often lacks depth and refinement.

    This form is also called zero-shot prompting, because it tries to "hit the basket" on the first shot and get a good result without iterations.

    It turns out that, like humans, language models also struggle to produce a perfect result on the first shot. That is why OpenAI built the GPT-3/4 models around a chat format, which allows us to get more out of the model through additional iterations on the same topic: giving comments, dividing the problem into sub-problems, asking the model to perform self-reflection, and so on.

  • In contrast, Agentic workflows involve AI agents that perform tasks iteratively, much like how humans would work on a project in stages. These agents can draft, review, revise, and enhance their outputs in multiple steps, leading to more refined and higher-quality results. This iterative process is more collaborative and adaptive, which is crucial in complex fields like cybersecurity (a minimal sketch of the two styles follows this list).
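To make the contrast concrete, here is a minimal sketch of the two styles in Python. The llm() function is a placeholder for whatever chat-completion call you use, not a real API; the loop structure is the point.

    # A minimal sketch contrasting single-shot prompting with an iterative,
    # agentic draft -> critique -> revise loop. llm() is a placeholder for
    # any chat-completion call; it is not a real API.
    def llm(prompt: str) -> str:
        raise NotImplementedError("plug in your chat model here")

    def zero_shot(task: str) -> str:
        # One attempt, no feedback: quality depends on a single pass.
        return llm(f"Complete this task in one pass:\n{task}")

    def agentic(task: str, rounds: int = 3) -> str:
        # Draft first, then repeatedly critique and revise the draft.
        draft = llm(f"Write a first draft for:\n{task}")
        for _ in range(rounds):
            critique = llm(f"Critique this draft and list concrete fixes:\n{draft}")
            draft = llm(f"Revise the draft to address this critique.\n"
                        f"Draft:\n{draft}\nCritique:\n{critique}")
        return draft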

Figure2

[Introducing Andrew Ng]

Before we dive deeper, let us talk about Andrew Ng. He is a renowned figure in artificial intelligence and machine learning. He founded Landing AI and deeplearning.ai, co-founded Coursera, and is an adjunct professor at Stanford. Andrew co-founded the Google Brain project, which developed large-scale deep learning algorithms and created the famous 'Google cat' experiment. He also served as Chief Scientist at Baidu, building a large AI team. His work democratizes education through platforms like Coursera, providing high-quality online courses from top universities to learners worldwide.

Andrew Ng | LinkedIn

Figure3

During the same conference, Andrew Ng presented fascinating research on the performance of language models in coding tasks. His team used a dataset called HumanEval, originally published by OpenAI, which contains 164 small code challenges. For instance, one challenge asks for a function that sums all the odd numbers that are in even positions in a list.
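For illustration, a straightforward solution to that style of challenge might look like this (the function name is mine, not the dataset's):

    # Sum the odd values sitting at even indices (0, 2, 4, ...) of a list.
    def sum_odd_at_even_positions(lst: list[int]) -> int:
        return sum(x for x in lst[::2] if x % 2 == 1)

    assert sum_odd_at_even_positions([5, 8, 7, 1]) == 12  # 5 (index 0) + 7 (index 2)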

Ng's team demonstrated that using an agent, or several agents, to tackle these problems iteratively yielded significantly better results than the zero-shot method applied directly to even the most sophisticated models. First place went to an agent called AgentCoder, which coordinates several agents working together iteratively; it scored 44% higher than GPT-4 in zero-shot mode.

This finding illustrates the potential of agentic workflows, where AI agents collaborate and iterate on tasks, leading to more refined outcomes compared to traditional, single-shot responses from more advanced models.

This example underscores the power of agentic AI and its potential to revolutionize complex fields like cybersecurity. By enabling AI agents to work together iteratively, we can achieve more robust and effective solutions, whether it is in coding, cybersecurity, or any other complex field.

Figure4

Min 2:54 - 3:26 - What's next for AI agentic workflows ft. Andrew Ng of AI Fund - YouTube

[Understanding Agentic AI in Cybersecurity]

So, we now know that Agentic AI refers to AI systems that operate autonomously, making decisions and taking actions without human intervention. Think of it as having a team of highly skilled security experts who never sleep and are always on the lookout for threats.

To put this in perspective, consider how a smart home system works. Imagine a smart thermostat that learns your daily routine, adjusts the temperature based on your preferences, and even detects unusual activities like a sudden temperature drop that might indicate a window left open. Similarly, in cybersecurity, Agentic AI continuously monitors network traffic, detects anomalies, and responds to threats in real-time.

Now, let us break down how AI agents work within Agentic AI, using a malicious code creation scenario as an example. Imagine attackers wanting to generate malicious code. They could theoretically use a set of AI agents with distinct roles to automate the process:

Architects of Code: These agents define the structure and requirements of the malicious code. They outline what the code needs to achieve, such as stealing data or disrupting services.

Coders: These agents write the actual code based on the architects' specifications. They use advanced algorithms to generate code that fulfills the defined requirements.

Quality Assurance (QA) Agents: These agents test the generated code to ensure it works as intended without errors. They simulate different environments to identify and fix bugs.

Cybersecurity Experts: These agents evaluate the code against defensive measures, checking whether it can bypass security controls and suggesting improvements to make it more effective.

In an iterative process, these agents work together until the task is completed: the architects set the task, the coders generate the code, the QA agents test and refine it, and the cybersecurity experts enhance its effectiveness. This automated, iterative approach allows for continuous improvement and adaptation; a benign skeleton of the coordination pattern is sketched below.
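The skeleton below shows only the coordination pattern, with the roles applied to a generic software task; the ask() helper is a stand-in for a role-conditioned LLM call, and none of the prompts are real.

    # Illustrative skeleton of a role-based, iterative multi-agent workflow.
    # ask() is a placeholder for any role-conditioned LLM call (an assumption,
    # not a real API); the roles mirror the architect/coder/QA/reviewer split.
    def ask(role: str, prompt: str) -> str:
        raise NotImplementedError("plug in your chat model here")

    def run_workflow(task: str, max_rounds: int = 5) -> str:
        spec = ask("architect", f"Write a precise specification for: {task}")
        code = ask("coder", f"Implement this specification:\n{spec}")
        for _ in range(max_rounds):
            report = ask("qa", f"Test this code against the spec and report defects:\n{code}\n{spec}")
            if "no defects" in report.lower():
                break  # the QA agent is satisfied; stop iterating
            review = ask("reviewer", f"Suggest concrete improvements given:\n{report}")
            code = ask("coder", f"Revise the code.\nCode:\n{code}\nFeedback:\n{review}")
        return code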

In the context of cybersecurity defense, Agentic AI can use similar processes to develop and refine protective measures. For instance, AI agents could continuously monitor network traffic, detect anomalies, and update security protocols to defend against new threats in real-time.
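As a minimal sketch of that defensive loop (the z-score threshold and the respond() hook are illustrative assumptions, not a product feature):

    # Watch a stream of traffic measurements, flag statistical anomalies,
    # and hand them to a responder. Thresholds are illustrative only.
    from statistics import mean, stdev

    def is_anomalous(history: list[float], value: float, z_threshold: float = 3.0) -> bool:
        if len(history) < 30:              # wait for a baseline before judging
            return False
        mu, sigma = mean(history), stdev(history)
        return sigma > 0 and abs(value - mu) / sigma > z_threshold

    def monitor(stream, respond) -> None:
        history: list[float] = []
        for value in stream:
            if is_anomalous(history, value):
                respond(value)             # e.g., open a ticket or tighten a rule
            history.append(value)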

Let us look at a few examples:

[LLM Agents Can Autonomously Exploit One-Day Vulnerabilities]

Researchers Richard Fang, Rohan Bindu, Akul Gupta, and Daniel Kang have demonstrated that large language models (LLMs) like GPT-4 can autonomously exploit one-day vulnerabilities: security flaws that have been publicly disclosed but not yet patched on affected systems.

In their study, the team tested GPT-4 against a dataset of 15 real-world vulnerabilities. GPT-4 successfully exploited 87% of these vulnerabilities, significantly outperforming other models and tools. This capability was achieved using the ReAct framework, allowing the AI to autonomously navigate and exploit security flaws.
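The paper names the ReAct (reason-and-act) framework; the generic loop below sketches that pattern only, with placeholder tools and model calls, and is in no way the researchers' actual setup:

    # A generic ReAct-style loop: the model reasons, then either finishes or
    # requests a tool; the observation is fed back into the transcript.
    # llm() and the tools dict are placeholders, not the paper's code.
    def llm(prompt: str) -> str:
        raise NotImplementedError("plug in your chat model here")

    def react_loop(goal: str, tools: dict, max_steps: int = 10) -> str:
        transcript = f"Goal: {goal}\n"
        for _ in range(max_steps):
            # Ask for reasoning plus a final line of either
            # "FINISH: <answer>" or "ACT: <tool> | <input>".
            reply = llm(transcript + "Reason step by step, then FINISH or ACT.")
            action = reply.strip().splitlines()[-1]
            if action.startswith("FINISH:"):
                return action.removeprefix("FINISH:").strip()
            _, _, rest = action.partition("ACT:")
            tool, _, tool_input = rest.partition("|")
            observation = tools[tool.strip()](tool_input.strip())
            transcript += f"{reply}\nObservation: {observation}\n"
        return "step budget exhausted"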

This research highlights the urgent need for proactive cybersecurity measures, as AI's potential in exploiting vulnerabilities poses a significant and evolving threat. Regular updates and timely application of security patches are crucial in mitigating these risks.

[2404.08144] LLM Agents can Autonomously Exploit One-day Vulnerabilities (arxiv.org)

Figure5

Figure6

Figure7

Figure8

[Project Voyager]

Imagine a world where AI agents can continuously explore, learn, and adapt without human intervention. That is exactly what Project Voyager is doing in Minecraft.

Researchers from Caltech, Stanford, the University of Texas, and NVIDIA have developed Voyager, the first large language model (LLM)-powered agent designed for lifelong learning in an open-ended environment.

Voyager operates in Minecraft, a game known for its endless possibilities and lack of a predefined end goal. This makes it the perfect playground for testing AI capabilities. Voyager consists of three key components: an automatic curriculum that maximizes exploration, a skill library for storing complex behaviors, and an iterative prompting mechanism for improving performance based on feedback.
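Reading those three components together, a toy rendition might look like the following; every name here is my own illustration, and execute_in_environment() stands in for the game interface rather than anything from the actual project:

    # Toy rendition of Voyager's loop as summarized above: reuse stored
    # skills as context, generate code for a new task, retry on feedback,
    # and store verified solutions. All names are illustrative assumptions.
    def llm(prompt: str) -> str:
        raise NotImplementedError("plug in your chat model here")

    def execute_in_environment(code: str) -> tuple[bool, str]:
        raise NotImplementedError("stand-in for the game interface")

    skills: dict[str, str] = {}  # task description -> code that solved it

    def attempt(task: str, max_tries: int = 4) -> bool:
        context = "\n".join(skills.values())  # reuse previously learned skills
        feedback = ""
        for _ in range(max_tries):
            code = llm(f"Known skills:\n{context}\nTask: {task}\n"
                       f"Feedback so far: {feedback}\nWrite code for the task:")
            ok, feedback = execute_in_environment(code)
            if ok:
                skills[task] = code  # store the verified skill for later reuse
                return True
        return False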

One widely shared account describes a scenario in which Voyager-style NPCs, tasked with building a village, ran out of nearby trees to chop for wood. Instead of getting stuck, these AI-powered NPCs got together to discuss a solution: they decided to move to a neighboring village, eliminate the inhabitants, and use the wood from their houses to continue their construction.

This decision showcases Voyager's advanced problem-solving and decision-making abilities, even when faced with unexpected challenges.

But how does Voyager achieve this? The AI uses GPT-4 to interact with the game environment. It processes feedback from its actions, learns from execution errors, and self-verifies its tasks. This iterative learning process allows Voyager to continuously improve and adapt its strategies.

The implications of this technology are profound. In cybersecurity, for instance, AI agents like Voyager can be used to simulate attacks, identify vulnerabilities, and develop defense strategies autonomously. This not only enhances our ability to respond to cyber threats but also allows us to stay ahead of potential attackers.

Project Voyager demonstrates the potential of AI in creating autonomous agents capable of lifelong learning and adaptation. Whether it is in gaming, robotics, or cybersecurity, the ability of these agents to learn and evolve continuously opens new possibilities for innovation and problem-solving.

Voyager | An Open-Ended Embodied Agent with Large Language Models (minedojo.org)

Figure9

[Insights from Andrew Ng on Agentic AI]

Andrew Ng's vision for Agentic AI is transformative. In his talks and articles, he discusses how AI agents can autonomously perform and refine tasks, similar to how a human would iterate and improve their work. This concept is known as agentic workflows. Instead of traditional, static AI interactions, agentic workflows involve iterative processes where AI models continuously improve their outputs.

Ng explains four key design patterns for agentic workflows:

  1. Reflection: AI models reflect on their own outputs and identify areas for improvement. For instance, an AI that generates code can review it for correctness and efficiency, then refine it based on self-critique.
  2. Tool Use: AI models use external tools to enhance their capabilities, such as code generation tools or web search tools for additional information (a minimal sketch follows this list).
  3. Planning: AI models break a complex task into a sequence of sub-steps and decide how to carry them out, rather than answering in a single pass.
  4. Multi-agent Collaboration: Multiple AI models work together, each specializing in different tasks. For example, one AI could generate code while another reviews and optimizes it.
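As one concrete illustration of the Tool Use pattern, here is a hedged sketch; the JSON convention, tool names, and llm() stub are all my assumptions rather than anything Ng prescribes:

    # Tool Use pattern: the model may answer directly or request a tool as
    # JSON; the dispatcher runs the tool and feeds the result back.
    import json

    def llm(prompt: str) -> str:
        raise NotImplementedError("plug in your chat model here")

    TOOLS = {
        "search": lambda q: f"(web results for {q!r} would appear here)",
        "calculator": lambda expr: str(eval(expr, {"__builtins__": {}})),  # demo only
    }

    def answer_with_tools(question: str) -> str:
        reply = llm('Answer directly, or reply with JSON {"tool": ..., "input": ...}: '
                    + question)
        try:
            call = json.loads(reply)
        except json.JSONDecodeError:
            return reply  # the model answered directly
        result = TOOLS[call["tool"]](call["input"])
        return llm(f"Question: {question}\nTool result: {result}\nFinal answer:")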

Ng believes that these agentic workflows will significantly enhance the capabilities of AI, moving beyond instant, one-shot results to more refined, iterative processes. This approach promises to unlock the full potential of large language models and AI systems in various applications, including cybersecurity.

[Conclusion]

As we have seen, Agentic AI can push the boundaries of what is possible in cybersecurity. The ability of AI agents to continuously learn, adapt, and make decisions autonomously is revolutionizing how we approach cyber defense. However, this also brings us closer to a future where AI might surpass human intelligence in certain areas. It is crucial to develop these technologies responsibly and ensure they are used to benefit society.

Yaniv Hoffman

Yaniv Hoffman brings more than 20 years of experience leading high-performance engineering and service teams specializing in networking, cybersecurity, and cloud operations. Mr. Hoffman is the Vice President of Technologies at Radware. In this role he is responsible for the APAC engineering teams (pre-sale, post-sale, architecture, and professional services) and drives innovation in technical solutions and delivery while leading sales activities across the region. Prior to this role, he managed global technical services at Radware, overseeing all customer engagements and customer success.
