This repository contains:
- The background of this project
- Introduction to the dataset
- Related data processing scripts

We hope this project helps you obtain the nuclear power plant accident data you need and develop new accident diagnosis algorithms and benchmarks.
- Background
- Introduction to the dataset
- Related scripts
- Installation
- Maintainers
- Contributing
- License
- Citing
Nuclear energy plays an important role in the global energy supply, especially as a key low-carbon source of power. Safe operation is critical in nuclear power plants. Given the significant role of human error in three serious nuclear accidents in history, artificial intelligence technologies are increasingly being used to assist plant operators in making decisions. Specifically, artificial intelligence algorithms are used to detect accidents and identify their root causes. A continuing challenge is the lack of an open dataset in the nuclear power plant domain against which the performance of different algorithms can be measured. We present a first-of-its-kind public dataset created with PCTRAN, an existing and widely used simulation software for nuclear power plants. The dataset, NPPAD, covers most of the common types of accidents that can occur in pressurized water reactor nuclear power plants. It contains time-series data on the status or actions of various subsystems, together with accident type and severity information. The dataset also includes other simulation data, such as the amount of radionuclides released, which can support research beyond accident diagnosis.
Fig. 1 Overall workflow of the simulation data generation

The overall workflow implemented in the script to generate the nuclear power plant accident dataset is shown in Fig. 1. First, an automation script starts the software. Once the software is launched, a nuclear plant operating at 100% power is initialized. Then an operating condition is selected. For the normal operating condition, the simulator runs for a configured time and outputs the data. For abnormal operating conditions, the accident type, accident parameters and simulation time are configured before the simulation data is output. The accidents covered in this work are shown in Table 1. The parameter selection screen is shown in Fig. 2.
Fig. 2 Accident type selection and parameter setting
After that, PCTRAN runs the simulation automatically. The detailed process of accident simulation in PCTRAN is shown in Box 1. First, a set of input parameters is configured according to the selected operations; these parameters determine how the corresponding simulation runs. The output data is then collected after the configured simulation time. Finally, we obtain the NPPAD dataset covering the different conditions. Note: The dataset in this work does not include cases where mitigation system failures are superimposed on nuclear plant accidents, as such superimposed cases are too numerous to cover.
Table 1 Accident sets covered by NPPAD

| Folder name | Accident | Type | Severity |
|---|---|---|---|
| NORM | Normal operation | - | - |
| LOCA | Loss of Coolant Accident (Hot Leg) | Severity | % of 100 cm² |
| LOCAC | Loss of Coolant Accident (Cold Leg) | Severity | % of 100 cm² |
| SLBIC | Steam Line Break Inside Containment | Severity | % of 100 cm² |
| SLBOC | Steam Line Break Outside Containment | Severity | % of 100 cm² |
| SP | Spark Presence for Hydrogen Burn | Other | - |
| LACP | Loss of AC Power | Other | - |
| LOF | Loss of Flow (Locked Rotor) | Other | - |
| ATWS | Anticipated Transient Without Scram | Other | - |
| TT | Turbine Trip | Other | - |
| SGATR | Steam Generator A Tube Rupture | Severity | % of 1 full tube rupture |
| SGBTR | Steam Generator B Tube Rupture | Severity | % of 1 full tube rupture |
| RW | Rod Withdrawal | Severity | % (+/-) withdrawn |
| RI | Rod Insertion | Severity | % (+/-) insertion |
| FLB | Feedwater Line Break | Severity | % of 100 cm² |
| MD | Moderator Dilution | Severity | % of unborated injection |
| LR | Load Rejection | Severity | % of full load rejected |
| LLB | Letdown Line Break in auxiliary buildings | Severity | % of nominal letdown flow |
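
When working with the dataset in code, the folder names in Table 1 can be kept in a small lookup table. The sketch below is illustrative only: `ACCIDENT_NAMES` and `describe_accident` are hypothetical names that are not part of the repository, and the entries simply restate Table 1.

```python
# Hypothetical helper, not part of Data Processing.py.
# Maps each dataset folder name to the accident it contains (restating Table 1).
ACCIDENT_NAMES = {
    "NORM": "Normal operation",
    "LOCA": "Loss of Coolant Accident (Hot Leg)",
    "LOCAC": "Loss of Coolant Accident (Cold Leg)",
    "SLBIC": "Steam Line Break Inside Containment",
    "SLBOC": "Steam Line Break Outside Containment",
    "SP": "Spark Presence for Hydrogen Burn",
    "LACP": "Loss of AC Power",
    "LOF": "Loss of Flow (Locked Rotor)",
    "ATWS": "Anticipated Transient Without Scram",
    "TT": "Turbine Trip",
    "SGATR": "Steam Generator A Tube Rupture",
    "SGBTR": "Steam Generator B Tube Rupture",
    "RW": "Rod Withdrawal",
    "RI": "Rod Insertion",
    "FLB": "Feedwater Line Break",
    "MD": "Moderator Dilution",
    "LR": "Load Rejection",
    "LLB": "Letdown Line Break in auxiliary buildings",
}


def describe_accident(folder_name):
    """Return a readable description for a dataset folder name."""
    return f"{folder_name}: {ACCIDENT_NAMES[folder_name]}"


print(describe_accident("LOCA"))  # LOCA: Loss of Coolant Accident (Hot Leg)
```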
The following three methods are provided in Data Processing.py:
- Method mdbtocsv
Use this method to convert files from mdb format to csv format. Dose_csv_data and Operation_csv_data in this project are the result of converting the original dataset DATA into csv format.

    import os
    import re

    import pandas as pd
    import pyodbc

    def mdbtocsv(self):
        """Convert every PCTRAN .mdb file under self.data_path to csv."""
        driver = '{Microsoft Access Driver (*.mdb, *.accdb)}'
        # Create the output folders for operation and dose parameters
        os.makedirs(self.operation_data_csv_path, exist_ok=True)
        os.makedirs(self.dose_data_csv_path, exist_ok=True)
        for accident in os.listdir(self.data_path):
            accident_path = os.path.join(self.data_path, accident)
            for name in os.listdir(accident_path):
                if re.search(r'Transient Report\.txt', name):
                    continue  # Text reports are not converted
                if re.search(r'\d+dose\.mdb', name):  # Dose data
                    table, csv_root = 'ListDS', self.dose_data_csv_path
                elif re.search(r'\d+\.mdb', name):  # Operation data
                    table, csv_root = 'PlotData', self.operation_data_csv_path
                else:
                    continue
                mdb_file = os.path.join(accident_path, name)
                print(mdb_file)
                cnxn = pyodbc.connect(f'Driver={driver};DBQ={mdb_file}')
                data_table = pd.read_sql(f'SELECT * FROM {table}', cnxn)
                cnxn.close()
                # Some mdbs are not stored in time order
                data_table.sort_values(by=['TIME'], ascending=True, inplace=True)
                csv_accident_path = os.path.join(csv_root, accident)
                os.makedirs(csv_accident_path, exist_ok=True)
                csv_name = name.replace('mdb', 'csv')
                data_table.to_csv(os.path.join(csv_accident_path, csv_name),
                                  header=True, index=False)

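
Once converted, each csv can be read with any csv reader. The following is a minimal sketch using only the standard library. It assumes that the csv keeps the `TIME` column that `mdbtocsv` sorts by and that every column is numeric; `load_time_series` is a hypothetical helper, not part of the repository.

```python
import csv


def load_time_series(csv_path):
    """Read one converted csv into a list of rows (dicts of floats), ordered by TIME.

    Assumption: every column holds numeric values.
    """
    with open(csv_path, newline="") as handle:
        rows = [
            {key: float(value) for key, value in row.items()}
            for row in csv.DictReader(handle)
        ]
    # mdbtocsv already sorts by TIME; sorting again is only a safeguard
    rows.sort(key=lambda row: row["TIME"])
    return rows
```

Each returned row maps a column name to its value at one time step, so `rows[0]["TIME"]` is the earliest time in the file.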
- Method generate_dataset
Use this method to generate a standard dataset for supervised learning tasks.

    import os
    from itertools import chain

    import pandas as pd
    import torch
    from torch.utils.data import Dataset

    def generate_dataset(self, dataset_source_path):
        class Mydataset(Dataset):
            def __init__(self, dataset_path):
                """
                1. Read all csv files in order
                2. Add labels (accident types) to self.label,
                   add features (operation data or dose data) to self.feature
                """
                self.dataset_path = dataset_path
                self.feature = []
                self.label = []
                for accident in os.listdir(self.dataset_path):
                    accident_path = os.path.join(self.dataset_path, accident)
                    for size_name in os.listdir(accident_path):
                        csv_data_path = os.path.join(accident_path, size_name)
                        self.label.append(accident)
                        sample_df = pd.read_csv(csv_data_path)
                        # First 150 rows (the data of 1500 s), first column skipped
                        sample_value = sample_df.iloc[:150, 1:].values
                        # Flatten the 2-D array into a 1-D list
                        sample_list = list(chain.from_iterable(sample_value))
                        self.feature.append(sample_list)
                # Encode the accident folder names as integer class labels
                self.label = pd.Categorical(self.label).codes
                assert len(self.label) == len(self.feature)
                self.length = len(self.feature)

            def __getitem__(self, index):
                x = torch.Tensor(self.feature[index])
                y = self.label[index]
                return {"x": x, "y": y}

            def __len__(self):
                return self.length

        return Mydataset(dataset_source_path)

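
To make the sample layout concrete, the sketch below reproduces the two transformations used in `generate_dataset` with the standard library only: flattening a block of time steps into one feature vector, and turning folder names into integer labels. `flatten_sample` and `encode_labels` are hypothetical names that are not part of the repository. `encode_labels` assumes the default behaviour of `pd.Categorical`, which numbers the sorted unique values starting from 0.

```python
from itertools import chain


def flatten_sample(rows, max_steps=150):
    """Keep the first max_steps rows, skip the first column, flatten row by row."""
    return list(chain.from_iterable(row[1:] for row in rows[:max_steps]))


def encode_labels(labels):
    """Map each label to the index of its value among the sorted unique labels."""
    codes = {name: code for code, name in enumerate(sorted(set(labels)))}
    return [codes[name] for name in labels]


# Two time steps with columns (first column, parameter 1, parameter 2)
rows = [[0, 1.0, 2.0], [10, 3.0, 4.0]]
print(flatten_sample(rows))  # [1.0, 2.0, 3.0, 4.0]
print(encode_labels(["LOCA", "NORM", "LOCA", "ATWS"]))  # [1, 2, 1, 0]
```

Each feature vector therefore has (number of rows kept) × (number of columns minus one) entries, so the vectors only have equal length when every csv provides at least 150 rows and the same columns.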
- Method show_parameters
Use this method to plot the variation of physical parameters.

    import os

    import matplotlib.pyplot as plt
    import pandas as pd

    def show_parametes(self, variables, plot_data_path, figture_save_path):
        # variables[0] is used as the x-axis; the remaining entries are plotted against it
        os.makedirs(figture_save_path, exist_ok=True)
        plot_data = pd.read_csv(plot_data_path)
        plot_df = plot_data[variables].set_index(variables[0])
        fig_plot = plot_df.plot()
        plt.xlabel(variables[0])
        # Name the figure after the plotted variables
        fig_name = '-'.join(variables[1:])
        print(fig_name)
        # Save before plt.show(), since the figure may be released once the window is closed
        fig_path = os.path.join(figture_save_path, fig_name)
        fig_plot.get_figure().savefig(fig_path)
        plt.show()

First, make sure Python 3.6 or higher is installed.
To install NPPAD from the source code:
$ git clone https://github.com/thu-inet/NuclearPowerPlantAccidentData
$ cd NuclearPowerPlantAccidentData/
$ pip install -r requirements.txt
We appreciate all contributions. Please let us know if you encounter a bug by filing an issue.
NPPAD is released under the MIT license, as found in the LICENSE file.
Citing NPPAD in your research:
Qi, B., Xiao, X., Liang, J. et al. An open time-series simulated dataset covering various accidents for nuclear power plants. Sci Data 9, 766 (2022). https://doi.org/10.1038/s41597-022-01879-1