LLM-Based Threat Modeling Agent with CAPEC Retrieval

Description

The LLM-Based Threat Modeling Agent with CAPEC Retrieval is an interactive tool that generates a threat model from a textual system description. It uses an LLM (GPT-4o) to break a system down into its core components and identify relevant threats for each component, then maps each threat to the CAPEC knowledge base through semantic search over a Chroma vector database. The final threat model is presented as an interactive tree visualization.

Key Features

System Decomposition: Uses an LLM (GPT-4o) to decompose a structured or unstructured system description into core components expressed as Data-Flow Diagram (DFD) elements: (1) external entities, (2) processes, (3) data stores, and (4) data flows.
Threat Identification: Utilizes an LLM (GPT-4o) to identify relevant threats for each system component.
CAPEC Retrieval: Utilizes a vector database (Chroma) containing the CAPEC dataset to retrieve relevant attack patterns for each identified threat using semantic search.
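
The three stages above can be pictured as a small data pipeline. The sketch below is illustrative only: the names `Component`, `Threat`, and `build_threat_model` are assumptions rather than the names used in app.py, and the GPT-4o and Chroma calls are replaced with plain stub functions so the example runs offline.

```python
from dataclasses import dataclass, field
from typing import Callable

# The four DFD element types named in the feature list.
DFD_TYPES = ("external entity", "process", "data store", "data flow")


@dataclass
class Threat:
    description: str
    capec_patterns: list[str] = field(default_factory=list)


@dataclass
class Component:
    name: str
    dfd_type: str
    threats: list[Threat] = field(default_factory=list)

    def __post_init__(self) -> None:
        if self.dfd_type not in DFD_TYPES:
            raise ValueError(f"unknown DFD element type: {self.dfd_type!r}")


def build_threat_model(
    components: list[Component],
    identify_threats: Callable[[Component], list[str]],
    retrieve_capec: Callable[[str], list[str]],
) -> list[Component]:
    """Attach threats to each component, then CAPEC patterns to each threat."""
    for component in components:
        component.threats = [
            Threat(description=text, capec_patterns=retrieve_capec(text))
            for text in identify_threats(component)
        ]
    return components


# Hypothetical stand-ins for the GPT-4o call and the Chroma query.
def fake_identify_threats(component: Component) -> list[str]:
    return [f"Tampering with {component.name}"]


def fake_retrieve_capec(threat: str) -> list[str]:
    return ["(CAPEC pattern placeholder)"]


model = build_threat_model(
    [Component("User database", "data store")],
    fake_identify_threats,
    fake_retrieve_capec,
)
print(model[0].threats[0].description)
```

Passing the two stages in as callables keeps the structure testable without an API key; in the real tool those roles are filled by the LLM and the vector database.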

Technologies Used

Backend: Flask (Python)
LLM API: OpenAI GPT-4o
Vector Database: Chroma
Frontend: HTML, CSS, JavaScript, and jsTree (for interactive visualization)
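
jsTree can be populated from nested JSON in which each node carries a `text` label and an optional `children` list. As a rough sketch of how a threat model might be shaped for it, the function below converts a plain nested dictionary into that structure. The input layout (component, then threat, then CAPEC entries) and the function name `to_jstree` are assumptions for illustration, and the sample strings are placeholders; the payload actually produced by app.py may differ.

```python
import json


def to_jstree(model: dict[str, dict[str, list[str]]]) -> list[dict]:
    """Convert {component: {threat: [capec, ...]}} into jsTree-style nodes."""
    nodes = []
    for component, threats in model.items():
        threat_nodes = []
        for threat, patterns in threats.items():
            threat_nodes.append({
                "text": threat,
                # Leaf nodes: one per retrieved attack pattern.
                "children": [{"text": pattern} for pattern in patterns],
            })
        nodes.append({
            "text": component,
            "state": {"opened": True},  # expand component nodes by default
            "children": threat_nodes,
        })
    return nodes


# Placeholder data; real entries would come from the LLM and CAPEC retrieval.
example = {
    "Login service (process)": {
        "Credential guessing": ["attack pattern A", "attack pattern B"],
    }
}
tree = to_jstree(example)
print(json.dumps(tree, indent=2))
```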

Installation

Prerequisites

Obtain an OpenAI API key:
- Sign up at OpenAI and create a new secret key.
- The application needs this key to call OpenAI's GPT-4o model. You will configure the key below.
Note: Your OpenAI account must have available funds; without them, the tool cannot make calls to the GPT-4o model. Check your account settings for details.
Install Python 3.12 or later (if not already installed):

Note: This project is designed to work best with Python 3.12. While earlier versions of Python may work, they are not officially supported and could result in unexpected behavior or performance issues.
- macOS: Python 3 is often pre-installed. Verify the version:
```
python3 --version
```
  If Python 3.12 or later is not installed:
  - Download Python from the official Python website.
  - Follow the installation instructions for your OS.
- Linux: Most distributions have Python 3 pre-installed. Verify with:
```
python3 --version
```
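  To install Python on Debian/Ubuntu-based distributions (the packaged version may be older than 3.12, so check the version again afterwards):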
```
sudo apt update
sudo apt install python3 python3-venv python3-pip
```
- Windows:
  - Download the installer from the official Python website.
  - During installation, make sure to check the box "Add Python to PATH".
Install Git (if not already installed):
- Follow the instructions for your operating system at the official Git website.
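
The Python version requirement above can also be checked from within Python itself, which is handy when several interpreters are installed. This is a small optional sketch; the helper name `meets_minimum` is hypothetical and not part of the project.

```python
import sys

MINIMUM = (3, 12)  # version recommended in the prerequisites


def meets_minimum(version: tuple[int, ...], minimum: tuple[int, int] = MINIMUM) -> bool:
    """Return True when the (major, minor, ...) tuple is at least `minimum`."""
    return tuple(version[:2]) >= minimum


if meets_minimum(sys.version_info):
    print("Python version OK:", sys.version.split()[0])
else:
    print("Python 3.12 or later is recommended; found", sys.version.split()[0])
```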

Project Setup

Clone the repository:

```
git clone https://github.com/karimsammouri/capec_threat_modeling.git
```

Navigate to the project directory:
```
cd capec_threat_modeling
```
Create and activate a virtual environment (optional but recommended):
- Create the virtual environment (named venv):
```
python3 -m venv venv
```
  Note: Use python instead of python3 if the python command points to Python 3 on your system.
- Activate the virtual environment:
  - On macOS/Linux:
```
source venv/bin/activate
```
  - On Windows:
```
.\venv\Scripts\activate
```
Install the required dependencies:
```
pip install -r requirements.txt
```
Configure the OpenAI API key:
- Create a .env file in the root directory of the project (on Windows, where touch is not available by default, create the file with a text editor instead):
```
touch .env
```
- Add the following line (your API key) to the .env file:
```
OPENAI_API_KEY=your_openai_api_key
```
  Note: Replace your_openai_api_key with your actual OpenAI API key and don't forget to save the file!
- The application will automatically read the key from the .env file when you run it.
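
For readers curious how a key in a .env file can reach the program, here is a minimal standard-library sketch. It is an assumption about the mechanism, not a copy of app.py: projects commonly rely on a helper package such as python-dotenv for this, and the function names `parse_env` and `load_env_file` are hypothetical.

```python
import os
from pathlib import Path


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, ignoring blanks and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Trim whitespace and optional surrounding quotes.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Read a .env file (if present) and export its values to os.environ."""
    env_path = Path(path)
    if not env_path.exists():
        return {}
    values = parse_env(env_path.read_text(encoding="utf-8"))
    for key, value in values.items():
        # Do not override variables already set in the real environment.
        os.environ.setdefault(key, value)
    return values


if __name__ == "__main__":
    load_env_file()
    if os.environ.get("OPENAI_API_KEY"):
        print("OPENAI_API_KEY found.")
    else:
        print("OPENAI_API_KEY is not set; check your .env file.")
```

Running a check like this before launching the app can help separate a missing-key problem from an account or network problem.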

Usage

Load CAPEC into the Chroma vector database:
- Run the chroma.py script to load the CAPEC data:
```
python3 chroma.py
```
Note: You only need to run the chroma.py script once. It creates a local Chroma vector database with the CAPEC dataset, which can be reused across sessions.
Launch the application:
- Run the Flask app:
```
python3 app.py
```
Interact with the application:
- Open your browser and navigate to http://127.0.0.1:5000/.
- Provide a textual description of your system.
- Generate and explore the threat model.
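
To give a feel for what the semantic search over the CAPEC database does when a threat model is generated, the toy example below ranks a few text snippets against a query by cosine similarity. It is deliberately simplified and is not how Chroma works internally: word counts stand in for learned embeddings, the function names are hypothetical, and the one-line pattern summaries are made up for this example rather than taken from the CAPEC dataset.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts (a real system uses a learned model)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_k(query: str, documents: dict[str, str], k: int = 2) -> list[str]:
    """Return the names of the k documents most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(
        documents,
        key=lambda name: cosine(query_vec, embed(documents[name])),
        reverse=True,
    )
    return ranked[:k]


# Made-up one-line summaries, written for this example only.
patterns = {
    "SQL Injection": "attacker injects sql commands through input to a database query",
    "Brute Force": "attacker repeatedly guesses a password or key until one works",
    "Cross-Site Scripting": "attacker injects script into a web page viewed by other users",
}

print(top_k("adversary guesses the admin password", patterns, k=1))
```

With learned embeddings, a query can match a pattern even when the two share no words, which is the main advantage of a vector database over keyword matching.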

Contributions

To contribute, fork the repository, make your changes, and submit a pull request.

License

This project is licensed under the MIT License. See the LICENSE file for details.

Acknowledgements

MITRE: For developing and maintaining the publicly available CAPEC knowledge base.
OpenAI: For providing the GPT-4o model.
Chroma: An open-source vector database that enables semantic search.
jsTree: An open-source JavaScript library (jQuery plugin) for creating interactive tree structures.

Contact

Developed by Karim Sammouri. Feel free to reach out with any questions or suggestions at [email protected].
