
Welcome to NPPAD

NPPAD: An Open Time-series Dataset Covering Various Accidents for Nuclear Power Plants


This repository contains:

  1. The background of this project

  2. Introduction to the dataset

  3. Related data processing scripts

We hope this project helps you obtain the accident data you need for nuclear power plants and develop new accident diagnosis algorithms and benchmarks.

  1. Background
  2. Introduction to the dataset
  3. Related scripts
  4. Installation
  5. Maintainers
  6. Contributing
  7. License
  8. Citing

Background

Nuclear energy plays an important role in the global energy supply, especially as a key low-carbon source of power. Safe operation is critical to the generation of nuclear energy, i.e. to the operation of nuclear power plants. Given the significant contribution of human error to the three serious nuclear accidents in history, artificial intelligence technologies are increasingly being used to assist plant operators in making decisions. Specifically, artificial intelligence algorithms are used to identify the presence of accidents and their root causes. A continuing challenge is the lack of an open dataset in the nuclear power plant domain against which to measure the performance of different algorithms. We present a first-of-its-kind public dataset created with the help of PCTRAN, a widely used simulation program for nuclear power plants. The dataset, NPPAD, covers most of the common types of accidents that can occur in pressurized water reactor nuclear power plants. It contains time-series data on the status and actions of various subsystems, together with the accident type and severity information. The dataset also includes other simulation outputs, such as the amount of radionuclides released, which enables users to conduct research beyond accident diagnosis.

Introduction to the dataset

Workflow overview

Fig. 1 Overall workflow of the simulation data generation

The overall workflow implemented in the script to generate the nuclear power plant accident dataset is shown in Fig. 1. First, the software is started by an automation script. Once the software is launched, the nuclear plant is initialized operating at 100% power. Then we select the operating condition. If the normal operating condition is selected, the simulator runs for the configured time and the data are output. For abnormal operating conditions, the accident type, accident parameters, and simulation time are configured, and the simulation data are then output. The accidents covered in this work are listed in Table 1, and the parameter selection screen is shown in Fig. 2.

Fig. 2 Accident type selection and parameter setting

After that, PCTRAN runs the simulation automatically. The detailed accident simulation process in PCTRAN is shown in Box 1. First, a set of input parameters is configured according to the selected operating condition; these parameters determine how the corresponding simulation proceeds. The output data are then collected over the configured time window. Finally, we obtain the NPPAD dataset covering the different conditions. Note: the dataset in this work does not include cases where mitigation system failures are superimposed on nuclear plant accidents, as such superimposed cases are too numerous to cover.
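
The automation script itself is not reproduced here, but its structure follows directly from the workflow in Fig. 1: launch the simulator, then iterate over accident types and severities and run each case. The sketch below is purely illustrative; launch_pctran and run_case are hypothetical placeholders for the GUI-automation calls, and the severity ranges and simulation time are assumptions, not values taken from the dataset.

    # Purely illustrative sketch of the batch loop behind Fig. 1.
    # launch_pctran() and run_case() are hypothetical placeholders for the
    # GUI-automation calls; severity ranges and duration are assumptions.
    ACCIDENTS = {
        "NORM": [None],           # normal operation, no severity parameter
        "LOCA": range(1, 101),    # severity in % of 100 cm2 (hot-leg break)
        "SGATR": range(1, 101),   # severity in % of one full tube rupture
    }

    def generate_all_cases(output_root, duration_s=1500):
        launch_pctran()                               # start the simulator at 100% power
        for accident, severities in ACCIDENTS.items():
            for severity in severities:
                run_case(accident_type=accident,      # configure accident type, severity and time
                         severity=severity,
                         duration_s=duration_s,
                         out_dir=output_root + "\\" + accident)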

Table 1 Accident sets covered by NPPAD

| Folder name | Accident | Type | Severity |
| ----------- | -------- | ---- | -------- |
| NORM | Normal operating | - | - |
| LOCA | Loss of Coolant Accident (Hot Leg) | Severity | % of 100 cm2 |
| LOCAC | Loss of Coolant Accident (Cold Leg) | Severity | % of 100 cm2 |
| SLBIC | Steam Line Break Inside Containment | Severity | % of 100 cm2 |
| SLBOC | Steam Line Break Outside Containment | Severity | % of 100 cm2 |
| SP | Spark Presence for Hydrogen Burn | Other | - |
| LACP | Loss of AC Power | Other | - |
| LOF | Loss of Flow (Locked Rotor) | Other | - |
| ATWS | Anticipated Transient Without Scram | Other | - |
| TT | Turbine Trip | Other | - |
| SGATR | Steam Generator A Tube Rupture | Severity | % of 1 full tube rupture |
| SGBTR | Steam Generator B Tube Rupture | Severity | % of 1 full tube rupture |
| RW | Rod Withdrawal | Severity | % (+/-) withdrawn |
| RI | Rod Insertion | Severity | % (+/-) insertion |
| FLB | Feedwater Line Break | Severity | % of 100 cm2 |
| MD | Moderator Dilution | Severity | % of unborated injection |
| LR | Load Rejection | Severity | % of full load rejected |
| LLB | Letdown Line Break in auxiliary buildings | Severity | % of nominal letdown flow |

Dataset structure

The NPPAD dataset covers 18 types of operating conditions, part of which is shown in Box 2. Each operating-condition sample contains three files, two in mdb format and one in txt format. The mdb files can be opened directly with Microsoft Access. For example, the content of 1.mdb (PlotData) is shown in Box 3: it is the time series of the status parameters for a LOCA with a break of 1% of 100 cm2, where PlotData is the sub-table within the 1.mdb file. Another useful sub-table is ListPlotVariables, shown in Box 6, which describes the parameters corresponding to the abbreviations in PlotData. In Box 4, 1Dose.mdb is the time series of the radionuclides in the nuclear power plant. In addition to the mdb format, we also provide CSV versions in the folders Operation_csv_data and Dose_csv_data. Finally, 1Transient Report.txt in Box 5 describes the actions of the plant subsystems over the simulation time for each accident, which helps the user understand the changes in plant status. The number at the front of each file name in the other operating conditions (e.g. 1.mdb, 2.mdb) corresponds to the severity of the accident; its exact meaning is given by the 'Severity' column of Table 1.
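
For quick inspection outside Microsoft Access, the CSV copies can be read directly with pandas. The snippet below is a minimal sketch; the file path is illustrative and assumes the Operation_csv_data layout described above.

    import pandas as pd

    # A minimal sketch of loading one converted sample; the path is illustrative.
    sample = pd.read_csv(r"Operation_csv_data\LOCA\1.csv")  # 1% of 100 cm2 hot-leg break
    print(sample.columns.tolist()[:10])   # abbreviated parameter names (see ListPlotVariables)
    print(sample["TIME"].max())           # last recorded simulation time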

Related scripts

The following three methods are provided in Data Processing.py:

  • Method mdbtocsv

Use this method to convert files from mdb format to csv format. The folders Dose_csv_data and Operation_csv_data in this project are the result of converting the original dataset DATA to csv format.

    def mdbtocsv(self):
        # Requires: import os, re, pandas as pd, pyodbc
        # (the Microsoft Access ODBC driver is only available on Windows)
        driver = '{Microsoft Access Driver (*.mdb, *.accdb)}'
        if not os.path.exists(self.operation_data_csv_path):
            os.makedirs(self.operation_data_csv_path)  # Create folder for operation parameters
        if not os.path.exists(self.dose_data_csv_path):
            os.makedirs(self.dose_data_csv_path)  # Create folder for dose parameters
        for accident in os.listdir(self.data_path):
            accident_path = self.data_path + '\\' + accident
            os.chdir(self.project_path)  # Make sure it is the project path
            for name in os.listdir(accident_path):
                if not re.search(r'Transient Report\.txt', name):
                    os.chdir(self.project_path)  # Make sure the database connects normally
                    mdb_file = accident_path + '\\' + name
                    print(mdb_file)
                    cnxn = pyodbc.connect(f'Driver={driver};DBQ={mdb_file}')
                    if re.search(r'\d+\.mdb', name, re.IGNORECASE):  # Operation data, e.g. 1.mdb
                        data_table = pd.read_sql('SELECT * FROM PlotData', cnxn)
                        data_table.sort_values(by=['TIME'], ascending=True,
                                               inplace=True)  # Some mdbs are not in time order
                        csv_accident_path = self.operation_data_csv_path + '\\' + accident
                        if not os.path.exists(csv_accident_path):
                            os.makedirs(csv_accident_path)
                        os.chdir(csv_accident_path)
                        csv_name = name.replace('mdb', 'csv')
                        data_table.to_csv(csv_name, header=True, index=False)
                    elif re.search(r'\d+dose\.mdb', name, re.IGNORECASE):  # Dose data, e.g. 1Dose.mdb
                        data_table = pd.read_sql('SELECT * FROM ListDS', cnxn)
                        data_table.sort_values(by=['TIME'], ascending=True,
                                               inplace=True)  # Some mdbs are not in time order
                        csv_accident_path = self.dose_data_csv_path + '\\' + accident
                        if not os.path.exists(csv_accident_path):
                            os.makedirs(csv_accident_path)
                        os.chdir(csv_accident_path)
                        csv_name = name.replace('mdb', 'csv')
                        data_table.to_csv(csv_name, header=True, index=False)
                    cnxn.close()  # Avoid leaking the ODBC connection
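
Only the method bodies are listed here, so the class that owns them is not shown. A hypothetical call might look like the following; the class name DataProcessor and the constructor arguments are assumptions about how the path attributes are set, not the actual API. Note that the conversion requires pyodbc and the Microsoft Access ODBC driver, so it must run on Windows.

    # Hypothetical usage sketch: DataProcessor and its constructor arguments are
    # assumed names for the class defined in Data Processing.py, not the actual API.
    processor = DataProcessor(
        project_path=r"C:\NPPAD",                                # repository root
        data_path=r"C:\NPPAD\DATA",                              # original mdb dataset
        operation_data_csv_path=r"C:\NPPAD\Operation_csv_data",  # csv output (operation data)
        dose_data_csv_path=r"C:\NPPAD\Dose_csv_data",            # csv output (dose data)
    )
    processor.mdbtocsv()  # writes one csv per mdb file, grouped by accident folder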
  • Method generate_dataset

Use this method to generate a standard dataset for supervised learning tasks.

    def generate_dataset(self, dataset_source_path):
        # Requires: import os, torch, pandas as pd; from itertools import chain;
        # from torch.utils.data import Dataset
        class Mydataset(Dataset):
            def __init__(self, dataset_path):  # was misspelled __int__ in the original
                self.dataset_path = dataset_path
                self.feature = []
                self.label = []
                # 1. Read all csv files in order
                # 2. Add labels (accident types) to self.label and
                #    features (operation data or dose data) to self.feature
                for accident in os.listdir(self.dataset_path):
                    accident_path = self.dataset_path + '\\' + accident  # was self.data_path, which is not defined here
                    for size_name in os.listdir(accident_path):
                        csv_data_path = accident_path + '\\' + size_name
                        self.label.append(accident)
                        sample_df = pd.read_csv(csv_data_path)
                        sample_value = (sample_df.iloc[:150, 1:]).values  # Take the first 150 rows, i.e. 1500 s of data
                        sample_list = list(chain.from_iterable(sample_value))  # Flatten the 2-D array to a 1-D list
                        self.feature.append(sample_list)
                self.label = (pd.Categorical(self.label)).codes  # Encode accident names as integer class codes
                assert len(self.label) == len(self.feature)
                self.length = len(self.feature)

            def __getitem__(self, index):
                x = self.feature[index]
                x = torch.Tensor(x)
                y = self.label[index]
                return {"x": x, "y": y}

            def __len__(self):
                return self.length

        return Mydataset(dataset_source_path)
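
The returned object is a standard PyTorch Dataset, so it can be passed straight to a DataLoader, assuming all samples share the same set of parameters so that the flattened feature vectors have equal length. A minimal usage sketch, continuing the hypothetical processor object from the mdbtocsv example above:

    # Minimal usage sketch; `processor` is the hypothetical object from above
    # and the csv path is illustrative.
    from torch.utils.data import DataLoader

    dataset = processor.generate_dataset(r"C:\NPPAD\Operation_csv_data")
    loader = DataLoader(dataset, batch_size=16, shuffle=True)
    for batch in loader:
        x, y = batch["x"], batch["y"]  # flattened time-series features and accident-type codes
        break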
  • Method show_parameters

Use this method to plot the variation of physical parameters.

    def show_parameters(self, variables, plot_data_path, figure_save_path):
        # Requires: import os, pandas as pd, matplotlib.pyplot as plt
        if not os.path.exists(figure_save_path):
            os.makedirs(figure_save_path)
        plot_data = pd.read_csv(plot_data_path)
        plot_df = plot_data[variables].set_index(variables[0])  # first variable (e.g. TIME) as x-axis
        fig_plot = plot_df.plot()
        plt.xlabel(variables[0])
        fig_name = '-'.join(variables[1:])  # e.g. 'P-TAVG'
        print(fig_name)
        fig_save = fig_plot.get_figure()
        fig_path = figure_save_path + "\\" + fig_name + ".png"
        fig_save.savefig(fig_path)  # save before show(), which may close the figure
        plt.show()
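
A hypothetical call, again using the processor object sketched earlier. The first entry of variables is used as the x-axis; the remaining entries are parameter abbreviations from the ListPlotVariables sub-table (the names below are placeholders, not guaranteed column names).

    # Hypothetical usage sketch; 'P' and 'TAVG' are placeholders for real
    # parameter abbreviations listed in the ListPlotVariables sub-table.
    processor.show_parameters(
        variables=["TIME", "P", "TAVG"],                           # x-axis first, then parameters to plot
        plot_data_path=r"C:\NPPAD\Operation_csv_data\LOCA\1.csv",
        figure_save_path=r"C:\NPPAD\figures",
    )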

Installation

First, make sure Python 3.6 or higher is installed.

To install NPPAD from the source code:

$ git clone https://github.com/thu-inet/NuclearPowerPlantAccidentData 
$ cd NuclearPowerPlantAccidentData/
$ pip install -r requirements.txt

Maintainers

Contributing

We appreciate all contributions. Please let us know if you encounter a bug by filing an issue.

License

NPPAD is released under the MIT License, as found in the LICENSE file.

Citing

If you use NPPAD in your research, please cite:

Qi, B., Xiao, X., Liang, J. et al. An open time-series simulated dataset covering various accidents
for nuclear power plants. Sci Data 9, 766 (2022). https://doi.org/10.1038/s41597-022-01879-1
