The LLM-Based Threat Modeling Agent with CAPEC Retrieval is an interactive tool that generates a threat model from a textual system description. It uses an LLM (GPT-4o) to break the system down into its core components and identify relevant threats for each component, then maps each threat to the CAPEC knowledge base, stored in a Chroma vector database, for actionable insights. The final threat model is presented as an interactive tree visualization.
- System Decomposition: Utilizes an LLM (GPT-4o) to decompose a structured/unstructured system description into core components using Data-Flow Diagram (DFD) elements: (1) external entities, (2) processes, (3) data stores, and (4) data flows.
- Threat Identification: Utilizes an LLM (GPT-4o) to identify relevant threats for each system component.
- CAPEC Retrieval: Utilizes a vector database (Chroma) containing the CAPEC dataset to retrieve relevant attack patterns for each identified threat using semantic search.
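The retrieval step above can be illustrated with a small, dependency-free sketch. It is not the project's code: it swaps Chroma's learned embeddings for a toy bag-of-words vector and uses a three-entry, paraphrased CAPEC sample. The names `CAPEC_SAMPLE`, `embed`, `cosine`, and `retrieve` are hypothetical, chosen only to show the idea of ranking attack patterns by similarity to a threat description.

```python
import math
import re
from collections import Counter

# Hypothetical, heavily abbreviated CAPEC-style entries (paraphrased, illustration only).
CAPEC_SAMPLE = {
    "CAPEC-66": "SQL Injection: attacker crafts input that alters a database query",
    "CAPEC-63": "Cross-Site Scripting: attacker injects script into web page content viewed by a browser",
    "CAPEC-49": "Password Brute Forcing: attacker tries many passwords to guess valid credentials",
}


def embed(text):
    """Toy stand-in for an embedding: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(threat, k=2):
    """Return up to k CAPEC IDs ranked by similarity to the threat text."""
    query = embed(threat)
    scored = [(cosine(query, embed(desc)), capec_id)
              for capec_id, desc in CAPEC_SAMPLE.items()]
    scored.sort(reverse=True)
    # Drop entries that share no vocabulary with the query at all.
    return [capec_id for score, capec_id in scored[:k] if score > 0]


if __name__ == "__main__":
    print(retrieve("Attacker manipulates the database query through crafted input"))
```

A real vector database replaces `embed` with a learned embedding model, so that semantically related texts match even when they share no words.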
- Backend: Flask (Python)
- LLM API: OpenAI GPT-4o
- Vector Database: Chroma
- Frontend: HTML, CSS, JavaScript, and jsTree (for interactive visualization)
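As a rough illustration of how a backend could hand a threat model to jsTree, the sketch below converts a nested dictionary into jsTree's nested JSON node format (`id`, `text`, `children`, `state`). The input shape and the function name `to_jstree` are assumptions made for this example; the project's actual data structures may differ.

```python
import json


def to_jstree(model):
    """Convert {component: {threat: [capec_ids]}} into jsTree-style nested nodes.

    The input shape is an assumption for illustration, not the project's format.
    """
    nodes = []
    for c_idx, (component, threats) in enumerate(model.items()):
        threat_nodes = []
        for t_idx, (threat, patterns) in enumerate(threats.items()):
            threat_nodes.append({
                "id": f"c{c_idx}-t{t_idx}",
                "text": threat,
                # Leaf nodes: one per CAPEC attack pattern mapped to this threat.
                "children": [
                    {"id": f"c{c_idx}-t{t_idx}-p{p_idx}", "text": pattern}
                    for p_idx, pattern in enumerate(patterns)
                ],
            })
        nodes.append({
            "id": f"c{c_idx}",
            "text": component,
            "state": {"opened": True},  # show component nodes expanded
            "children": threat_nodes,
        })
    return nodes


if __name__ == "__main__":
    example = {"Login API (process)": {"Credential guessing": ["CAPEC-49"]}}
    print(json.dumps(to_jstree(example), indent=2))
```

The resulting list can be serialized to JSON and passed to jsTree as its `core.data` setting.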
- Obtain an OpenAI API key:
- Obtain an API key by signing up at OpenAI and creating a new secret key.
- This key is required for the application to interact with OpenAI's GPT-4o model. You will configure the key below.
Note: You will need to have funds in your OpenAI account. Without adequate funds, the tool will not be able to make calls to OpenAI's GPT-4o model. Check your account settings for more details.
- Install Python 3.12 or later (if not already installed):
Note: This project is designed to work best with Python 3.12. While earlier versions of Python may work, they are not officially supported and could result in unexpected behavior or performance issues.
- macOS: Python 3 is often pre-installed. Verify the version:
python3 --version
If Python 3.12 or later is not installed:
- Download Python from the official Python website.
- Follow the installation instructions for your OS.
- Linux: Most distributions have Python 3 pre-installed. Verify with:
python3 --version
To install Python from your distribution's repositories (Debian/Ubuntu shown; the packaged version may be older than 3.12):
sudo apt update
sudo apt install python3 python3-venv python3-pip
- Windows:
- Download the installer from the official Python website.
- During installation, make sure to check the box "Add Python to PATH".
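If you prefer to check the interpreter version from Python itself, a minimal sketch follows. The helper name `meets_minimum` is hypothetical, and the 3.12 threshold is taken from the note above.

```python
import sys

MIN_VERSION = (3, 12)  # minimum version recommended by this README


def meets_minimum(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the (major, minor) version is at least the minimum."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if meets_minimum():
        print(f"Python {current} meets the recommended minimum.")
    else:
        print(f"Python {current} is older than 3.12; consider upgrading.")
```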
- Install Git (if not already installed):
- Follow the instructions for your operating system at the official Git website.
- Clone the repository:
git clone https://github.com/karimsammouri/capec_threat_modeling.git
- Navigate to the project directory:
cd capec_threat_modeling
- Create and activate a virtual environment (optional but recommended):
- Create the virtual environment (named venv):
python3 -m venv venv
Note: Use python instead of python3 if it points to Python 3 on your system.
- Activate the virtual environment:
- On macOS/Linux:
source venv/bin/activate
- On Windows:
.\venv\Scripts\activate
- Install the required dependencies:
pip install -r requirements.txt
- Configure the OpenAI API key:
- Create a .env file in the root directory of the project:
touch .env
- Add the following line (your API key) to the .env file:
OPENAI_API_KEY=your_openai_api_key
Note: Replace your_openai_api_key with your actual OpenAI API key, and don't forget to save the file!
- The application will automatically read the key from the .env file when you run it.
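How the application reads the .env file is not shown here; Python projects commonly rely on a library such as python-dotenv for this. To make the mechanism concrete, the following is a standard-library-only sketch with a hypothetical helper, `load_env_file`, that parses KEY=VALUE lines and exposes them through `os.environ`. It is an illustration, not the project's implementation.

```python
import os


def load_env_file(path=".env"):
    """Parse KEY=VALUE lines from path and add them to os.environ.

    Variables that are already set in the environment are not overwritten.
    Returns a dict of the values found in the file.
    """
    loaded = {}
    if not os.path.exists(path):
        return loaded
    with open(path, encoding="utf-8") as fh:
        for raw in fh:
            line = raw.strip()
            # Skip blank lines, comments, and lines without an assignment.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            key = key.strip()
            value = value.strip().strip("'\"")  # drop optional surrounding quotes
            loaded[key] = value
            os.environ.setdefault(key, value)
    return loaded


if __name__ == "__main__":
    load_env_file()
    if os.environ.get("OPENAI_API_KEY"):
        print("OPENAI_API_KEY is set.")
    else:
        print("OPENAI_API_KEY is missing; check your .env file.")
```

Running this from the project root is a quick way to confirm the key is visible before launching the application.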
- Load CAPEC into the Chroma vector database:
- Run the chroma.py script to load the CAPEC data:
python3 chroma.py
Note: You only need to run the chroma.py script once. It creates a local Chroma vector database with the CAPEC dataset, which can be reused across sessions.
- Launch the Application:
- Run the Flask app:
python3 app.py
- Interact with the Application:
- Open your browser and navigate to http://127.0.0.1:5000/.
- Provide a textual description of your system.
- Generate and explore the threat model.
For contributions, please fork the repository, make changes, and submit a pull request.
This project is licensed under the MIT License. See the LICENSE file for details.
- MITRE: For developing and maintaining the publicly available CAPEC knowledge base.
- OpenAI: For providing the GPT-4o model.
- Chroma: An open-source vector database that enables semantic search.
- jsTree: An open-source JavaScript library (jQuery plugin) for creating interactive tree structures.
Developed by Karim Sammouri. Feel free to reach out for any questions or suggestions at [email protected].