Prompt user on existing directory (#57)

* added tests and check_scan_directory
* added documentation; closes #30
2025-12-22 08:44:22 +01:00 · 2020-04-30 08:19:56 -07:00
parent f1c1868a6e
commit 2ecdf4319a
5 changed files with 282 additions and 78 deletions
--- a/README.md
+++ b/README.md
@@ -16,6 +16,7 @@ Table of Contents
 - [Installation](#installation)
 - [Defining Scope](#defining-a-scans-scope)
 - [Example Scan](#example-scan)
    - [Existing Results Directories](#existing-results-directories)
 - [Viewing Results](#viewing-results)
 - [Chaining Results w/ Commands](#chaining-results-w-commands)
 - [Choosing a Scheduler](#choosing-a-scheduler)
@@ -154,6 +155,30 @@ The same steps can be seen in realtime in the linked video below.
 [![asciicast](https://asciinema.org/a/318397.svg)](https://asciinema.org/a/318397)
 ### Existing Results Directories
 When running additional scans against the same target, you have two options.  You can either:
 - use a new directory
 - reuse the same directory
 If you use a new directory, the scan will start from the beginning.
 If you choose to reuse the same directory, `recon-pipeline` will resume the scan from its last successful point.  For instance, say your last scan failed while running nmap.  This means that the pipeline executed all upstream tasks (amass and masscan) successfully.  When you use the same results directory for another scan, the amass and masscan scans will be skipped, because they've already run successfully.
 **Note**: There is a gotcha that can occur when you scan a target but get no results.  For some scans, the pipeline may still mark the Task as complete (masscan does this).  In masscan's case, it's because it outputs a file to `results-dir/masscan-results/` whether it gets results or not.  Luigi interprets the file's presence to mean the scan is complete.
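The gotcha above can be illustrated with a minimal sketch. It assumes only what the note states, namely that an output file's presence is read as "task complete". The helper `task_is_complete` and the file name `masscan.json` are hypothetical and are not taken from the pipeline's code.

```python
import tempfile
from pathlib import Path


def task_is_complete(output_file: Path) -> bool:
    """Mimic a presence-based completeness check: done if the output file exists."""
    return output_file.exists()


with tempfile.TemporaryDirectory() as tmp:
    # hypothetical layout; the real file name inside masscan-results/ may differ
    output = Path(tmp) / "masscan-results" / "masscan.json"
    before = task_is_complete(output)

    # simulate a scan that found nothing but still wrote its (empty) output file
    output.parent.mkdir(parents=True)
    output.touch()
    after = task_is_complete(output)
    size = output.stat().st_size

print(before, after, size)
```

The check flips to complete even though the file is empty, which is why a rerun against the same directory may skip a scan that never produced results.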
 In order to reduce confusion, as of version 0.9.3, the pipeline will prompt you when reusing a results directory.
 ```
 [db-2] recon-pipeline> scan FullScan --results-dir testing-results --top-ports 1000 --rate 500 --target tesla.com
 [*] Your results-dir (testing-results) already exists. Subfolders/files may tell the pipeline that the associated Task is complete. This means that your scan may start from a point you don't expect. Your options are as follows:
   1. Resume existing scan (use any existing scan data & only attempt to scan what isn't already done)
   2. Remove existing directory (scan starts from the beginning & all existing results are removed)
   3. Save existing directory (your existing folder is renamed and your scan proceeds)
 Your choice?
 ```
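What each of the three choices does to the directory can be sketched roughly as follows. This is a simplified, non-interactive stand-in for the `check_scan_directory` method added in this commit. The function name `handle_existing_directory` is hypothetical, and the real method prompts through the shell instead of taking the choice as an argument.

```python
import shutil
import tempfile
import time
from pathlib import Path


def handle_existing_directory(directory: Path, choice: str) -> Path:
    """Apply a Resume/Remove/Save choice and return the directory the scan will use."""
    if not directory.exists() or choice == "Resume":
        return directory  # nothing to do; existing output is reused as-is

    if choice == "Remove":
        shutil.rmtree(directory)  # all previous results are deleted
    elif choice == "Save":
        stamp = time.strftime("%Y%m%d-%H%M%S")
        directory.rename(f"{directory}-{stamp}")  # keep old results under a timestamped name
    else:
        raise ValueError(f"unknown choice: {choice}")
    return directory


with tempfile.TemporaryDirectory() as tmp:
    results = Path(tmp) / "testing-results"
    results.mkdir()
    handle_existing_directory(results, "Save")
    saved = [p.name for p in Path(tmp).iterdir()]
    still_there = results.exists()

print(saved, still_there)
```

After "Save", the original path no longer exists, so the next scan starts from the beginning while the old data stays available under the renamed folder.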
 ## Viewing Results
 As of version 0.9.0, scan results are stored in a database located (by default) at `~/.local/recon-pipeline/databases`.  Databases themselves are managed through the [database command](https://recon-pipeline.readthedocs.io/en/latest/api/commands.html#database), while their contents are viewed via the [view command](https://recon-pipeline.readthedocs.io/en/latest/api/commands.html#view-command).
--- a/docs/overview/running_scans.rst
+++ b/docs/overview/running_scans.rst
@@ -88,3 +88,27 @@ Scan the target
    <script id="asciicast-318397" src="https://asciinema.org/a/318397.js" async></script>
 Existing Results Directories and You
 ####################################
 When running additional scans against the same target, you have two options.  You can either:
 - use a new directory
 - reuse the same directory
 If you use a new directory, the scan will start from the beginning.
 If you choose to reuse the same directory, ``recon-pipeline`` will resume the scan from its last successful point.  For instance, say your last scan failed while running nmap.  This means that the pipeline executed all upstream tasks (amass and masscan) successfully.  When you use the same results directory for another scan, the amass and masscan scans will be skipped, because they've already run successfully.
 **Note**: There is a gotcha that can occur when you scan a target but get no results.  For some scans, the pipeline may still mark the Task as complete (masscan does this).  In masscan's case, it's because it outputs a file to ``results-dir/masscan-results/`` whether it gets results or not.  Luigi interprets the file's presence to mean the scan is complete.
 In order to reduce confusion, as of version 0.9.3, the pipeline will prompt you when reusing a results directory.
 .. code-block:: console
    [db-2] recon-pipeline> scan FullScan --results-dir testing-results --top-ports 1000 --rate 500 --target tesla.com
    [*] Your results-dir (testing-results) already exists. Subfolders/files may tell the pipeline that the associated Task is complete. This means that your scan may start from a point you don't expect. Your options are as follows:
       1. Resume existing scan (use any existing scan data & only attempt to scan what isn't already done)
       2. Remove existing directory (scan starts from the beginning & all existing results are removed)
       3. Save existing directory (your existing folder is renamed and your scan proceeds)
    Your choice?
--- a/pipeline/recon-pipeline.py
+++ b/pipeline/recon-pipeline.py
@@ -2,11 +2,13 @@
 # stdlib imports
 import os
 import sys
 import time
 import shlex
 import shutil
 import pickle
-import selectors
 import tempfile
 import textwrap
+import selectors
 import threading
 import subprocess
 import webbrowser
@@ -116,10 +118,11 @@ class ReconShell(cmd2.Cmd):
        self.selectorloop = None
        self.continue_install = True
        self.prompt = DEFAULT_PROMPT
        self.tools_dir = Path(defaults.get("tools-dir"))
        self._initialize_parsers()
-        Path(defaults.get("tools-dir")).mkdir(parents=True, exist_ok=True)
+        self.tools_dir.mkdir(parents=True, exist_ok=True)
        Path(defaults.get("database-dir")).mkdir(parents=True, exist_ok=True)
        # register hooks to handle selector loop start and cleanup
@@ -207,6 +210,47 @@ class ReconShell(cmd2.Cmd):
            self.async_alert(style(f"[+] {words[5].split('_')[0]} complete!", fg="bright_green"))
    def check_scan_directory(self, directory):
        """ Determine whether or not the results-dir about to be used already exists and prompt the user accordingly.
        Args:
            directory: the directory passed to ``scan ... --results-dir``
        """
        directory = Path(directory)
        if directory.exists():
            term_width = shutil.get_terminal_size((80, 20)).columns
            warning_msg = (
                f"[*] Your results-dir ({str(directory)}) already exists. Subfolders/files may tell "
                f"the pipeline that the associated Task is complete. This means that your scan may start "
                f"from a point you don't expect. Your options are as follows:"
            )
            for line in textwrap.wrap(warning_msg, width=term_width, subsequent_indent="    "):
                self.poutput(style(line, fg="bright_yellow"))
            option_one = (
                "Resume existing scan (use any existing scan data & only attempt to scan what isn't already done)"
            )
            option_two = "Remove existing directory (scan starts from the beginning & all existing results are removed)"
            option_three = "Save existing directory (your existing folder is renamed and your scan proceeds)"
            answer = self.select([("Resume", option_one), ("Remove", option_two), ("Save", option_three)])
            if answer == "Resume":
                 self.poutput(style("[+] Resuming scan from last known good state.", fg="bright_green"))
             elif answer == "Remove":
                 shutil.rmtree(directory)
                 self.poutput(style("[+] Old directory removed, starting fresh scan.", fg="bright_green"))
            elif answer == "Save":
                current = time.strftime("%Y%m%d-%H%M%S")
                directory.rename(f"{directory}-{current}")
                self.poutput(
                    style(f"[+] Starting fresh scan.  Old data saved as {directory}-{current}", fg="bright_green")
                )
    @cmd2.with_argparser(scan_parser)
    def do_scan(self, args):
        """ Scan something.
@@ -221,6 +265,8 @@ class ReconShell(cmd2.Cmd):
                style(f"[!] You are not connected to a database; run database attach before scanning", fg="bright_red")
            )
        self.check_scan_directory(args.results_dir)
        self.poutput(
            style(
                "If anything goes wrong, rerun your command with --verbose to enable debug statements.",
@@ -234,7 +280,7 @@ class ReconShell(cmd2.Cmd):
        scans = get_scans()
        # command is a list that will end up looking something like what's below
-        # luigi --module pipeline.recon.web.webanalyze WebanalyzeScan --target-file tesla --top-ports 1000 --interface eth0
+        # luigi --module pipeline.recon.web.webanalyze WebanalyzeScan --target abc.com --top-ports 100 --interface eth0
        command = ["luigi", "--module", scans.get(args.scantype)[0]]
        tgt_file_path = None
@@ -281,7 +327,7 @@ class ReconShell(cmd2.Cmd):
        # imported tools variable is in global scope, and we reassign over it later
        global tools
-        persistent_tool_dict = Path(defaults.get("tools-dir")) / ".tool-dict.pkl"
+        persistent_tool_dict = self.tools_dir / ".tool-dict.pkl"
        if args.tool == "all":
            # show all tools have been queued for installation
@@ -299,6 +345,7 @@ class ReconShell(cmd2.Cmd):
        if persistent_tool_dict.exists():
            tools = pickle.loads(persistent_tool_dict.read_bytes())
        print(args.tool)
        if tools.get(args.tool).get("dependencies"):
            # get all of the requested tools dependencies
--- a/tests/test_shell/test_recon_pipeline_shell.py
+++ b/tests/test_shell/test_recon_pipeline_shell.py
@@ -1,5 +1,7 @@
 import re
 import sys
 import time
 import pickle
 import shutil
 import importlib
 from pathlib import Path
@@ -293,21 +295,163 @@ class TestReconShell:
    # ("all", "commands failed and may have not installed properly", 1)
    # after tools moved to DB, update this test
    @pytest.mark.parametrize("test_input, expected, return_code", [("all", "is already installed", 0)])
-    def test_do_install(self, test_input, expected, return_code, capsys, tmp_path, monkeypatch):
+    def test_do_install(self, test_input, expected, return_code, capsys, tmp_path):
        process_mock = MagicMock()
        attrs = {"communicate.return_value": (b"output", b"error"), "returncode": return_code}
        process_mock.configure_mock(**attrs)
-        def mockreturn():
+        tool_dict = {
-            return tmp_path
+            "tko-subs": {
                "installed": False,
                "dependencies": ["go"],
                "go": "/usr/local/go/bin/go",
                "commands": [
                    "/usr/local/go/bin/go get github.com/anshumanbh/tko-subs",
                    "(cd ~/go/src/github.com/anshumanbh/tko-subs &&  /usr/local/go/bin/go build &&  /usr/local/go/bin/go install)",
                ],
                "shell": True,
            },
            "recursive-gobuster": {
                "installed": False,
                "dependencies": ["go"],
                "recursive-parent": "/home/epi/.local/recon-pipeline/tools/recursive-gobuster",
                "commands": [
                    "bash -c 'if [ -d /home/epi/.local/recon-pipeline/tools/recursive-gobuster ]; then cd /home/epi/.local/recon-pipeline/tools/recursive-gobuster && git fetch --all && git pull; else git clone https://github.com/epi052/recursive-gobuster.git /home/epi/.local/recon-pipeline/tools/recursive-gobuster ; fi'"
                ],
                "shell": False,
            },
            "subjack": {
                "installed": False,
                "dependencies": ["go"],
                "go": "/usr/local/go/bin/go",
                "commands": [
                    "/usr/local/go/bin/go get github.com/haccer/subjack",
                    "(cd ~/go/src/github.com/haccer/subjack && /usr/local/go/bin/go install)",
                ],
                "shell": True,
            },
            "searchsploit": {
                "installed": False,
                "dependencies": None,
                "home": "/home/epi",
                "tools-dir": "/home/epi/.local/recon-pipeline/tools",
                "exploitdb-file": "/home/epi/.local/recon-pipeline/tools/exploitdb",
                "searchsploit-file": "/home/epi/.local/recon-pipeline/tools/exploitdb/searchsploit",
                "searchsploit-rc": "/home/epi/.local/recon-pipeline/tools/exploitdb/.searchsploit_rc",
                "homesploit": "/home/epi/.searchsploit_rc",
                "sed-command": "'s#/opt#/home/epi/.local/recon-pipeline/tools#g'",
                "commands": [
                    "bash -c 'if [ -d /usr/share/exploitdb ]; then ln -fs /usr/share/exploitdb /home/epi/.local/recon-pipeline/tools/exploitdb && sudo ln -fs $(which searchsploit) /home/epi/.local/recon-pipeline/tools/exploitdb/searchsploit ; elif [ -d /home/epi/.local/recon-pipeline/tools/exploitdb ]; then cd /home/epi/.local/recon-pipeline/tools/exploitdb && git fetch --all && git pull; else git clone https://github.com/offensive-security/exploitdb.git /home/epi/.local/recon-pipeline/tools/exploitdb ; fi'",
                    "bash -c 'if [ -f /home/epi/.local/recon-pipeline/tools/exploitdb/.searchsploit_rc ]; then cp -n /home/epi/.local/recon-pipeline/tools/exploitdb/.searchsploit_rc /home/epi ; fi'",
                    "bash -c 'if [ -f /home/epi/.searchsploit_rc ]; then sed -i 's#/opt#/home/epi/.local/recon-pipeline/tools#g' /home/epi/.searchsploit_rc ; fi'",
                ],
                "shell": False,
            },
            "luigi-service": {
                "installed": False,
                "dependencies": None,
                "service-file": "/home/epi/PycharmProjects/recon-pipeline/luigid.service",
                "commands": [
                    "sudo cp /home/epi/PycharmProjects/recon-pipeline/luigid.service /lib/systemd/system/luigid.service",
                    "sudo cp /home/epi/PycharmProjects/recon-pipeline/luigid.service $(which luigid) /usr/local/bin",
                    "sudo systemctl daemon-reload",
                    "sudo systemctl start luigid.service",
                    "sudo systemctl enable luigid.service",
                ],
                "shell": True,
            },
            "aquatone": {
                "installed": False,
                "dependencies": None,
                "aquatone": "/home/epi/.local/recon-pipeline/tools/aquatone",
                "commands": [
                    "mkdir /tmp/aquatone",
                    "wget -q https://github.com/michenriksen/aquatone/releases/download/v1.7.0/aquatone_linux_amd64_1.7.0.zip -O /tmp/aquatone/aquatone.zip",
                    "bash -c 'if [[ ! $(which unzip) ]]; then sudo apt install -y zip; fi'",
                    "unzip /tmp/aquatone/aquatone.zip -d /tmp/aquatone",
                    "mv /tmp/aquatone/aquatone /home/epi/.local/recon-pipeline/tools/aquatone",
                    "rm -rf /tmp/aquatone",
                    "bash -c 'found=false; for loc in {/usr/bin/google-chrome,/usr/bin/google-chrome-beta,/usr/bin/google-chrome-unstable,/usr/bin/chromium-browser,/usr/bin/chromium}; do if [[ $(which $loc) ]]; then found=true; break; fi ; done; if [[ $found = false ]]; then sudo apt install -y chromium-browser ; fi'",
                ],
                "shell": False,
            },
            "gobuster": {
                "installed": False,
                "dependencies": ["go", "seclists"],
                "go": "/usr/local/go/bin/go",
                "commands": [
                    "/usr/local/go/bin/go get github.com/OJ/gobuster",
                    "(cd ~/go/src/github.com/OJ/gobuster && /usr/local/go/bin/go build && /usr/local/go/bin/go install)",
                ],
                "shell": True,
            },
            "amass": {
                "installed": False,
                "dependencies": ["go"],
                "go": "/usr/local/go/bin/go",
                "amass": "/home/epi/.local/recon-pipeline/tools/amass",
                "commands": [
                    "/usr/local/go/bin/go get -u github.com/OWASP/Amass/v3/...",
                    "cp ~/go/bin/amass /home/epi/.local/recon-pipeline/tools/amass",
                ],
                "shell": True,
                "environ": {"GO111MODULE": "on"},
            },
            "masscan": {
                "installed": True,
                "dependencies": None,
                "masscan": "/home/epi/.local/recon-pipeline/tools/masscan",
                "commands": [
                    "git clone https://github.com/robertdavidgraham/masscan /tmp/masscan",
                    "make -s -j -C /tmp/masscan",
                    "mv /tmp/masscan/bin/masscan /home/epi/.local/recon-pipeline/tools/masscan",
                    "rm -rf /tmp/masscan",
                    "sudo setcap CAP_NET_RAW+ep /home/epi/.local/recon-pipeline/tools/masscan",
                ],
                "shell": True,
            },
            "go": {
                "installed": False,
                "dependencies": None,
                "go": "/usr/local/go/bin/go",
                "commands": [
                    "wget -q https://dl.google.com/go/go1.13.7.linux-amd64.tar.gz -O /tmp/go.tar.gz",
                    "sudo tar -C /usr/local -xvf /tmp/go.tar.gz",
                    "bash -c 'if [ ! $(echo ${PATH} | grep $(dirname /usr/local/go/bin/go )) ]; then echo PATH=${PATH}:/usr/local/go/bin >> ~/.bashrc; fi'",
                ],
                "shell": True,
            },
            "webanalyze": {
                "installed": False,
                "dependencies": ["go"],
                "go": "/usr/local/go/bin/go",
                "commands": [
                    "/usr/local/go/bin/go get github.com/rverton/webanalyze/...",
                    "(cd ~/go/src/github.com/rverton/webanalyze && /usr/local/go/bin/go build && /usr/local/go/bin/go install)",
                ],
                "shell": True,
             },
             "seclists": {
                 "installed": True,
                 "dependencies": None,
                "seclists-file": "/home/epi/.local/recon-pipeline/tools/seclists",
                "commands": [
                    "bash -c 'if [[ -d /usr/share/seclists ]]; then ln -s /usr/share/seclists /home/epi/.local/recon-pipeline/tools/seclists ; elif [[ -d /home/epi/.local/recon-pipeline/tools/seclists ]] ; then cd /home/epi/.local/recon-pipeline/tools/seclists && git fetch --all && git pull; else git clone https://github.com/danielmiessler/SecLists.git /home/epi/.local/recon-pipeline/tools/seclists ; fi'"
                ],
                "shell": True,
            },
        }
-        monkeypatch.setattr(Path, "home", mockreturn)
+        tooldir = tmp_path / ".local" / "recon-pipeline" / "tools"
        tooldir.mkdir(parents=True, exist_ok=True)
        pickle.dump(tool_dict, (tooldir / ".tool-dict.pkl").open("wb"))
        with patch("subprocess.Popen", autospec=True) as mocked_popen:
            mocked_popen.return_value = process_mock
            self.shell.tools_dir = tooldir
            self.shell.do_install(test_input)
-            out = capsys.readouterr().out
+            assert mocked_popen.called
            assert mocked_popen.called or expected in out
    @pytest.mark.parametrize(
        "test_input, expected, db_mgr",
@@ -333,8 +477,14 @@ class TestReconShell:
        with patch("subprocess.run", autospec=True) as mocked_popen, patch(
            "webbrowser.open", autospec=True
-        ) as mocked_web, patch("selectors.DefaultSelector.register", autospec=True) as mocked_selector:
+        ) as mocked_web, patch("selectors.DefaultSelector.register", autospec=True) as mocked_selector, patch(
            "cmd2.Cmd.select"
        ) as mocked_select:
            mocked_select.return_value = "Resume"
            mocked_popen.return_value = process_mock
            test_input += f" --results-dir {tmp_path / 'mostuff'}"
            if db_mgr is None:
                self.shell.do_scan(test_input)
                assert expected in capsys.readouterr().out
@@ -389,3 +539,26 @@ class TestReconShell:
                assert not file.exists()
            else:
                assert file.exists()
    @pytest.mark.parametrize(
        "test_input", [("1", "Resume", True, 1), ("2", "Remove", False, 0), ("3", "Save", False, 1)]
    )
    def test_check_scan_directory(self, test_input, tmp_path):
        user_input, answer, exists, numdirs = test_input
        new_tmp = tmp_path / f"check_scan_directory_test-{user_input}-{answer}"
        new_tmp.mkdir()
        with patch("cmd2.Cmd.select") as mocked_select:
            mocked_select.return_value = answer
            self.shell.check_scan_directory(str(new_tmp))
            assert new_tmp.exists() == exists
            assert len(list(tmp_path.iterdir())) == numdirs
            if answer == "Save":
                assert (
                    re.search(r"check_scan_directory_test-3-Save-[0-9]{6,8}-[0-9]+", str(list(tmp_path.iterdir())[0]))
                    is not None
                )
--- a/tests/utils.py
+++ b/tests/utils.py
@@ -1,65 +0,0 @@
 import sys
 import subprocess
 from pathlib import Path
 from contextlib import redirect_stdout, redirect_stderr
 from cmd2.utils import StdSim
 def is_kali():
    return any(
        [
            "kali" in x
            for x in subprocess.run("cat /etc/lsb-release".split(), stdout=subprocess.PIPE).stdout.decode().split()
        ]
    )
 def normalize(block):
    """ Normalize a block of text to perform comparison.
     Strip newlines from the very beginning and very end.  Then split into separate lines and strip trailing whitespace
    from each line.
    """
    assert isinstance(block, str)
    block = block.strip("\n")
    return [line.rstrip() for line in block.splitlines()]
 def run_cmd(app, cmd):
    """ Clear out and err StdSim buffers, run the command, and return out and err """
    saved_sysout = sys.stdout
    sys.stdout = app.stdout
    # This will be used to capture app.stdout and sys.stdout
    copy_cmd_stdout = StdSim(app.stdout)
    # This will be used to capture sys.stderr
    copy_stderr = StdSim(sys.stderr)
    try:
        app.stdout = copy_cmd_stdout
        with redirect_stdout(copy_cmd_stdout):
            with redirect_stderr(copy_stderr):
                app.onecmd_plus_hooks(cmd)
    finally:
        app.stdout = copy_cmd_stdout.inner_stream
        sys.stdout = saved_sysout
    out = copy_cmd_stdout.getvalue()
    err = copy_stderr.getvalue()
    return normalize(out), normalize(err)
 def setup_install_test(tool=None):
    tools = Path.home() / ".local" / "recon-pipeline" / "tools" / ".tool-dict.pkl"
    try:
        tools.unlink()
    except FileNotFoundError:
        pass
    if tool is not None:
        try:
            tool.unlink()
        except (FileNotFoundError, PermissionError):
            pass