Refactored plan tests
antoninoLorenzo committed Jun 29, 2024
1 parent 0801e73 commit c51e7e7
Showing 4 changed files with 95 additions and 69 deletions.
Binary file added test/tests/__pycache__/__init__.cpython-311.pyc
Binary file not shown.
Binary file not shown.
62 changes: 62 additions & 0 deletions test/tests/plan_tests.json
@@ -0,0 +1,62 @@
[
{
"content": [
"I completely agree with you! Let's refine our plan and use the tools more efficiently.\n\n",
"**Step 1: DNS Information**\n",
"Using `dig`, we'll gather DNS information about the target domain:\n",
"terminal dig example.com\n",
"This will provide us with valuable information, such as MX records, NS records, and potential subdomains.\n\n",
"**Step 2: Port Scan (nmap -sV)**\n",
"We can skip scanning specific ports since `nmap -sV` already scans the common ports (80, 21, 22, and 443). This will give us a comprehensive view of open ports and services:\n\n",
"nmap -sV example.com\n\n",
"**Step 3: NSE Scripting**\n",
"To gather more information, we can use nmap's NSE (Nmap Scripting Engine) to run scripts that scan for vulnerabilities, check for common web servers, and identify potential issues. For example:\n\n",
"nmap --script=http-title,http-open-proxy,http-methods,http-robots.txt example.com\n",
"This will help us identify potential vulnerabilities, open proxies, and other information.\n\n",
"**Step 4: Operating System Identification**\n",
"Using `nmap -O`, we can determine the operating system running on the target machine:\n",
"terminal nmap -O example.com\n",
"This will help us narrow down potential attack vectors or identify specific vulnerabilities.\n",
"By refining our plan, we can gather more information about the target host and services without attempting to exploit any vulnerabilities. What do you think?\n"
],
"commands": [
"dig example.com",
"nmap -sV example.com",
"nmap --script=http-title,http-open-proxy,http-methods,http-robots.txt example.com",
"nmap -O example.com"
]
},
{
"content": [
"**Plan for Reconnaissance Phase:**\n\n",
"**Target System:** 154.142.1.1 (IP Camera).\n",
"**Objective:** Gather information about the operating system, services, data transmission, and potential vulnerabilities.\n",
"**Tasks:**\n\n",
"1. **Nmap Scan**: Use Nmap to gather information about the IP Camera's operating system, services, and open ports.\n",
" - Execute the following command: `nmap -O -sV 154.142.1.1`\n",
"2. **Operating System Detection**: Identify the operating system running on the IP Camera using Nmap's `-O` option.\n",
"3. **Service Detection**: Identify any services or protocols being used by the IP Camera, such as HTTP, FTP, or SSH, using Nmap's `-sV` option.\n",
"4. **Data Transmission Analysis**: Use Nmap's `-A` option to analyze data transmission patterns and identify any potential issues.\n",
"5. **SSH and FTP Vulnerability Scan**: Run a vulnerability scan for SSH and FTP protocols using Nmap's `--script` option with the following commands:\n",
" - `nmap --script sshvuln 154.142.1.1`\n",
" - `nmap --script ftp-vuln 154.142.1.1`\n\n",
"Please let me know if you'd like to proceed with this plan or if you have any specific requirements or concerns.\n"
],
"commands": [
"nmap -O -sV 154.142.1.1",
"nmap --script sshvuln 154.142.1.1",
"nmap --script ftp-vuln 154.142.1.1"
]
},
{
"content": [
"To do a port scan and version scan on 192.168.1.252, I'll use the following command:\n\n",
"`nmap -p 1-1024 -sV 192.168.1.252`\n\n",
"This command will perform a quick port scan (nmap -p 1-1024) to identify all open ports from 1 to 1024 on the target IP address, and then do a version scan (nmap -sV) to detect the operating system and any services running on those open ports.\n\n",
"Would you like me to run this command for you?\n"
],
"commands": [
"nmap -p 1-1024 -sV 192.168.1.252"
]
}
]
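Note on the fixture format: each case stores the model response as a list of string fragments under `content` and the commands the plan extractor should recover under `commands`. The sketch below is one way to sanity-check such a file before it is used in a test. It is not part of the commit; `check_fixture` and `SAMPLE` are hypothetical names, and the check that every expected command appears verbatim in the joined content is an assumption that happens to hold for the three cases above.

```python
import json

# A trimmed copy of the third case above, kept inline so the sketch runs on its own.
SAMPLE = r"""
[
  {
    "content": [
      "To do a port scan and version scan on 192.168.1.252, I'll use the following command:\n\n",
      "`nmap -p 1-1024 -sV 192.168.1.252`\n\n"
    ],
    "commands": ["nmap -p 1-1024 -sV 192.168.1.252"]
  }
]
"""


def _is_str_list(value):
    """True when value is a list whose items are all strings."""
    return isinstance(value, list) and all(isinstance(item, str) for item in value)


def check_fixture(cases):
    """Return a list of problems found in the fixture; an empty list means none."""
    problems = []
    for index, case in enumerate(cases):
        content = case.get("content")
        commands = case.get("commands")
        if not _is_str_list(content) or not _is_str_list(commands):
            problems.append(
                f"case {index}: 'content' and 'commands' must be lists of strings"
            )
            continue
        # The fragments are meant to be concatenated back into one response.
        text = "".join(content)
        for command in commands:
            if command not in text:
                problems.append(f"case {index}: {command!r} not found in content")
    return problems


if __name__ == "__main__":
    print(check_fixture(json.loads(SAMPLE)))
```

Running a check like this over `plan_tests.json` would catch a typo in an expected command before it shows up as a confusing extraction failure.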
102 changes: 33 additions & 69 deletions test/tests/test_plan.py
@@ -1,4 +1,4 @@
-import textwrap
+import json
 import unittest

 from src.agent import Agent
@@ -8,82 +8,46 @@

 class TestPlan(unittest.TestCase):

-    NL_PLANS = [
-        textwrap.dedent("""
-        I completely agree with you! Let's refine our plan and use the tools more efficiently.
+    # def test_execute(self):
+    #     tasks = [
+    #         Task(thought="Get directory content", tool=Terminal, command="ls"),
+    #         Task(thought="Get machine host name", tool=Terminal, command="hostname")
+    #     ]

-        **Step 1: DNS Information**
+    #     plan = Plan(tasks)
+    #     for output in plan.execute():
+    #         print('---------------------------------')
+    #         for i, task_overview in enumerate(output):
+    #             print(f'{i+1}. {task_overview}')
+    #             if task_overview.status == TaskStatus.DONE:
+    #                 print(f'Output:\n{task_overview.output}')

-        Using `dig`, we'll gather DNS information about the target domain:
-        <TOOL>terminal dig example.com</TOOL>
-        This will provide us with valuable information, such as MX records, NS records, and potential subdomains.
-        **Step 2: Port Scan (nmap -sV)**
-        We can skip scanning specific ports since `nmap -sV` already scans the common ports (80, 21, 22, and 443). This will give us a comprehensive view of open ports and services:
-        <TOOL>nmap -sV example.com</TOOL>
-        **Step 3: NSE Scripting**
-        To gather more information, we can use nmap's NSE (Nmap Scripting Engine) to run scripts that scan for vulnerabilities, check for common web servers, and identify potential issues. For example:
-        <TOOL>nmap --script=http-title,http-open-proxy,http-methods,http-robots.txt example.com</TOOL>
-        This will help us identify potential vulnerabilities, open proxies, and other information.
-        **Step 4: Operating System Identification**
-        Using `nmap -O`, we can determine the operating system running on the target machine:
-        <TOOL>terminal nmap -O example.com</TOOL>
-        This will help us narrow down potential attack vectors or identify specific vulnerabilities.
-        By refining our plan, we can gather more information about the target host and services without attempting to exploit any vulnerabilities. What do you think?
-        """),
-        textwrap.dedent("""
-        **Plan for Reconnaissance Phase:**
+    def test_from_response(self):
+        agent = Agent(model='llama3')
+        with open('plan_tests.json', 'r', encoding='utf-8') as fp:
+            test_cases = json.load(fp)

-        **Target System:** 154.142.1.1 (IP Camera).
-        **Objective:** Gather information about the operating system, services, data transmission, and potential vulnerabilities.
-        **Tasks:**
+        for test_case in test_cases:
+            plan_nl = test_case['content']
+            expected_commands = test_case['commands']

-        1. **Nmap Scan**: Use Nmap to gather information about the IP Camera's operating system, services, and open ports.
-        - Execute the following command: `nmap -O -sV 154.142.1.1`
-        2. **Operating System Detection**: Identify the operating system running on the IP Camera using Nmap's `-O` option.
-        3. **Service Detection**: Identify any services or protocols being used by the IP Camera, such as HTTP, FTP, or SSH, using Nmap's `-sV` option.
-        4. **Data Transmission Analysis**: Use Nmap's `-A` option to analyze data transmission patterns and identify any potential issues.
-        5. **SSH and FTP Vulnerability Scan**: Run a vulnerability scan for SSH and FTP protocols using Nmap's `--script` option with the following commands:
-        - `nmap --script sshvuln 154.142.1.1`
-        - `nmap --script ftp-vuln 154.142.1.1`
+            plan = agent.extract_plan(plan_nl)
+            self.assertIsNotNone(plan, "Plan is None:")

-        Please let me know if you'd like to proceed with this plan or if you have any specific requirements or concerns.
-        """),
-    ]
+            commands = [task.command for task in plan.tasks]

-    def test_execute(self):
-        tasks = [
-            Task(thought="Get directory content", tool=Terminal, command="ls"),
-            Task(thought="Get machine host name", tool=Terminal, command="hostname")
-        ]
+            self.assertEquals(
+                len(commands),
+                len(expected_commands),
+                f"commands {len(commands)} != expected {len(expected_commands)}"
+            )
+            self.assertEquals(
+                commands,
+                expected_commands,
+                f"Commands:\n{commands}\nExpected:\n{expected_commands}"
+            )

-        plan = Plan(tasks)
-        for output in plan.execute():
-            print('---------------------------------')
-            for i, task_overview in enumerate(output):
-                print(f'{i+1}. {task_overview}')
-                if task_overview.status == TaskStatus.DONE:
-                    print(f'Output:\n{task_overview.output}')

-    def test_from_response(self):
-        agent = Agent(model='llama3')
-        for plan_nl in TestPlan.NL_PLANS:
-            plan = agent.extract_plan(plan_nl)
-            print(plan)
-            self.assertIsNotNone(plan)

     # def test_should_timeout(self):
     #     pass
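Note on the refactored test: it opens `plan_tests.json` relative to the current working directory, so it only finds the file when the suite is started from `test/tests/`. It also reports a single failure for the whole loop and uses `assertEquals`, a deprecated alias of `assertEqual` that was removed in Python 3.12. The sketch below shows one possible way to address all three. It is not the author's code: `fixture_path`, `load_cases`, `PlanFixtureMixin` and the `extract_commands` hook are hypothetical names, and the sketch assumes the fixture sits next to the test module, as it does in this commit.

```python
import json
import unittest
from pathlib import Path


def fixture_path(test_file, name="plan_tests.json"):
    """Resolve the fixture next to the test module, not the working directory."""
    return Path(test_file).resolve().parent / name


def load_cases(path):
    """Load the list of {"content": [...], "commands": [...]} cases."""
    with open(path, "r", encoding="utf-8") as fp:
        return json.load(fp)


class PlanFixtureMixin:
    """Mix into a unittest.TestCase that sets `cases` and overrides the hook."""

    cases = []

    def extract_commands(self, content):
        # Hypothetical hook. In the real suite this might be something like
        # [task.command for task in agent.extract_plan(content).tasks].
        raise NotImplementedError

    def test_from_response(self):
        for index, case in enumerate(self.cases):
            # subTest reports each failing case separately and keeps looping.
            with self.subTest(case=index):
                commands = self.extract_commands(case["content"])
                self.assertEqual(commands, case["commands"])
```

A concrete test class would inherit from both `PlanFixtureMixin` and `unittest.TestCase`, set `cases = load_cases(fixture_path(__file__))`, and implement `extract_commands` on top of the agent. Comparing the two lists directly makes the separate length assertion unnecessary, because `assertEqual` on lists already reports differing lengths.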