HTGR Micro-Core Quadrant Power

Input

theta_{1}: Angle of control drum in quadrant 1 (degrees)
theta_{2}: Angle of control drum in quadrant 1 (degrees)
theta_{3}: Angle of control drum in quadrant 2 (degrees)
theta_{4}: Angle of control drum in quadrant 2 (degrees)
theta_{5}: Angle of control drum in quadrant 3 (degrees)
theta_{6}: Angle of control drum in quadrant 3 (degrees)
theta_{7}: Angle of control drum in quadrant 4 (degrees)
theta_{8}: Angle of control drum in quadrant 4 (degrees)

Output

FluxQ1 : Neutron flux in quadrant 1 (\(\frac{neutrons}{cm^{2} s}\))
FluxQ2 : Neutron flux in quadrant 2 (\(\frac{neutrons}{cm^{2} s}\))
FluxQ3 : Neutron flux in quadrant 3 (\(\frac{neutrons}{cm^{2} s}\))
FluxQ4 : Neutron flux in quadrant 4 (\(\frac{neutrons}{cm^{2} s}\))

Overview

The data set consists of about 3000 data points with 8 inputs and 5 outputs. This data set increases in size from about 800 samples to about 3000 samples after producing more data points using reactor symmetries. The data set featured in this work was based around the HOLOS-Quad reactor design. This reactor implements modular construction where seperate units can be transported independently and assembled at the specified location. The HOLOS-Quad core is specifically a 22 MWt high-temperature gas-cooled microreactor (HTGR) which is controlled by 8 cylindrical control drums. It utilizes TRISO fuel particles contained in hexagonal graphite blocks used as a moderator. These graphite blocks have channels where helium gas can pass through for cooling. The image below shows a radial slice of the reactor. [1]

The main importance of this data set is the influence on the control drums on the neutron flux. The drums control reactivity by rotating to vary the proximity of \(B_{4} C\) on a portion of their outer edges to the fueled region of the core. Perturbations of the control drums in tern causes the core power shape to shift leading to complexe power distributions. Therefore, predictions of control drum reactivity worths for arbitrary configurations makes this problem nontrivial. [1]

Each sample in this dataset has 8 control drum angles where each angle was selected from a uniform random distribution spanning \([-180^\circ, 180^\circ]\) and the corresponding fluxes in each of the four quadrants is calculated using the Serpent code. The considerably larger dataset is needed because of the larger number of degrees of freedom associated with general statistical models such as deep neural networks (DNN). [1]

|microreactor\_imag.png|

[6]:

import pyMAISE as mai
import time
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.gridspec import GridSpec
from scipy.stats import uniform, randint
from sklearn.model_selection import ShuffleSplit

# Plot settings
matplotlib_settings = {
    "font.size": 14,
    "legend.fontsize": 12,
}
plt.rcParams.update(**matplotlib_settings)

pyMAISE Initialization

Starting any pyMAISE job requires initialization, this includes the definition of global settings used throughout pyMAISE. For this problem the pyMAISE defaults are used:

verbosity: 0 \(\rightarrow\) pyMAISE prints no outputs,
random_state: None \(\rightarrow\) No random seed is set,
test_size: 0.3 \(\rightarrow\) 30% of the data is used for testing,
num_configs_saved: 5 \(\rightarrow\) The top 5 hyper-parameter configurations are saved for each model.

The settings are initialized and the heat conduction preprocessor specific to this data set is retrieved.

[7]:

global_settings = mai.settings.init()
preprocessor = mai.load_qpower()

The data consists of 8 inputs:

[8]:

preprocessor.inputs.head()

[8]:

	theta1	theta2	theta3	theta4	theta5	theta6	theta7	theta8
0	5.919526	2.369503	2.923656	4.488987	3.683212	4.008905	4.970368	2.987966
1	2.162380	0.273624	0.927741	4.595586	2.598824	0.170167	2.124048	4.980209
2	0.450100	0.006301	2.512217	3.313864	1.913458	3.582252	0.280764	4.888595
3	0.461105	4.825628	3.771356	2.599278	2.056019	0.007332	1.106786	5.504671
4	5.248202	3.549416	3.333632	3.907310	2.095312	5.585145	3.774253	2.480120

and 4 outputs with 818 samples total. This data set is produced before preprocessing and added more samples using symmetry conditions.

[9]:

preprocessor.outputs.head()

[9]:

	fluxQ1	fluxQ2	fluxQ3	fluxQ4
0	2.580000e+19	2.590000e+19	2.670000e+19	2.560000e+19
1	2.550000e+19	2.530000e+19	2.510000e+19	2.510000e+19
2	2.570000e+19	2.580000e+19	2.520000e+19	2.520000e+19
3	2.570000e+19	2.580000e+19	2.520000e+19	2.560000e+19
4	2.540000e+19	2.620000e+19	2.580000e+19	2.520000e+19

The last step for preprocessing the data will be min-max scaling,

[10]:

data = preprocessor.min_max_scale()

Model Initialization

As this data set had a multi-dimensional output, we will initialize 6 models:

Linear regression: linear,
Lasso regression: lasso,
Decision tree regression: dtree,
Random forest regression: rforest,
K-nearest neighbors regression: knn,
Sequential Dense Neural Networks: nn.

All the classical models are initialized with the Scikit-learn defaults; however, the neural networks require some paramter definitions for the layers, optimizer, and training.

[227]:

model_settings = {
    "models": ["linear", "lasso", "dtree", "knn", "rforest", "nn"],
    "nn": {
        # Sequential
        "num_layers": 5,
        "dropout": True,
        "rate": 0.4,
        "validation_split": 0.20,
        "loss": "mean_absolute_error",
        "metrics": ["mean_absolute_error"],
        "batch_size": 8,
        "epochs": 160,
        "warm_start": True,
        "jit_compile": False,
        # Starting Layer
        "start_num_nodes": 400,
        "start_kernel_initializer": "normal",
        "start_activation": "relu",
        "input_dim": preprocessor.inputs.shape[1], # Number of inputs
        # Middle Layers
        "mid_num_node_strategy": "linear", # Middle layer nodes vary linearly from 'start_num_nodes' to 'end_num_nodes'
        "mid_kernel_initializer": "normal",
        "mid_activation": "relu",
        # Ending Layer
        "end_num_nodes": preprocessor.outputs.shape[1], # Number of outputs
        "end_activation": "linear",
        "end_kernel_initializer": "normal",
        # Optimizer
        "optimizer": "adam",
        "learning_rate": 9e-4,
    },
}
tuning = mai.Tuning(data=data, model_settings=model_settings)

Hyper-parameter Tuning

The classical models are hyper-paramter tunned with random search as they are inexpensive to train and the DNN is trained with Baysian search to limit training time.

[228]:

random_search_spaces = {
    "lasso": {
        "alpha": uniform(loc=0.0001, scale=0.0099), # 0.0001 - 0.01
    },
    "dtree": {
        "max_depth": randint(low=5, high=50), # 5 - 50
        "max_features": [None, "sqrt", "log2", 2, 4, 6],
        "min_samples_split": randint(low=2, high=20), # 2 - 20
        "min_samples_leaf": randint(low=1, high=20), # 1 - 20
    },
    "rforest": {
        "n_estimators": randint(low=20, high=200), # 20 - 200
        "criterion": ["squared_error", "absolute_error", "poisson"],
        "min_samples_split": randint(low=2, high=20), # 2 - 20
        "min_samples_leaf": randint(low=1, high=20), # 1 - 20
        "max_features": [None, "sqrt", "log2", 2, 4, 6],
    },
    "knn": {
        "n_neighbors": randint(low=1, high=20), # 1 - 20
        "weights": ["uniform", "distance"],
        "leaf_size": randint(low=1, high=40), # 1 - 40
        "p": randint(low=1, high=10), # 1 - 10
    },
}
bayesian_search_spaces = {
    "nn": {
        "mid_num_node_strategy": ["constant", "linear"],
        "batch_size": [8, 64],
        "learning_rate": [1e-5, 0.001],
        "num_layers": [2, 8],
        "start_num_nodes": [25, 500],
    },
}

start = time.time()
random_search_configs = tuning.random_search(
    param_spaces=random_search_spaces,
    models=["linear"] + list(random_search_spaces.keys()),
    n_iter=300,
    cv=ShuffleSplit(n_splits=5, test_size=0.20, random_state=global_settings.random_state),
)
bayesian_search_configs = tuning.bayesian_search(
    param_spaces=bayesian_search_spaces,
    models=bayesian_search_spaces.keys(),
    n_iter=50,
    cv=5,
)
stop = time.time()
print("Hyper-parameter tuning took " + str((stop - start) / 60) + " minutes to process.")

Hyper-parameter tuning search space was not provided for linear, doing manual fit
Hyper-parameter tuning took 54.83229525089264 minutes to process.

We can understand the hyper-parameter tuning of Bayesian search on the artificial neural network from the convergence plot below.

[229]:

fig, ax = plt.subplots(figsize=(8,8))
ax = tuning.convergence_plot(model_types="nn")

../_images/examples_HTGR_microreactor_15_0.png

Model Post-processing

With the top num_configs_saved, we can pass these parameter configurations to the PostProcessor for model comparison and analysis. For nn we define the epochs parameter to be 200 for better performance.

[230]:

new_model_settings = {
    "nn": {"epochs": 200}
}
postprocessor = mai.PostProcessor(
    data=data,
    models_list=[random_search_configs, bayesian_search_configs],
    new_model_settings=new_model_settings,
    yscaler=preprocessor.yscaler,
)

To compare the performance of these models we will compute performance metrics for both the training and testing data:

[231]:

postprocessor.metrics()

[231]:

	Model Types	Parameter Configurations	Train R2	Train MAE	Train MSE	Train RMSE	Test R2	Test MAE	Test MSE	Test RMSE
22	nn	{'batch_size': 8, 'learning_rate': 0.000768934...	0.723310	2.016330e+17	6.760209e+34	2.600040e+17	0.587191	2.451123e+17	9.540215e+34	3.088724e+17
25	nn	{'batch_size': 8, 'learning_rate': 0.000692753...	0.731648	1.987000e+17	6.557863e+34	2.560833e+17	0.583411	2.469003e+17	9.618025e+34	3.101294e+17
21	nn	{'batch_size': 8, 'learning_rate': 0.000571544...	0.729042	1.994119e+17	6.625076e+34	2.573922e+17	0.581055	2.466079e+17	9.671373e+34	3.109883e+17
23	nn	{'batch_size': 8, 'learning_rate': 0.000773010...	0.727562	1.987243e+17	6.657604e+34	2.580233e+17	0.578204	2.480338e+17	9.745415e+34	3.121765e+17
24	nn	{'batch_size': 8, 'learning_rate': 0.001, 'mid...	0.731647	1.973733e+17	6.555374e+34	2.560346e+17	0.574763	2.475927e+17	9.832405e+34	3.135667e+17
12	rforest	{'criterion': 'absolute_error', 'max_features'...	0.925803	1.071549e+17	1.810155e+34	1.345420e+17	0.459602	2.820568e+17	1.251527e+35	3.537692e+17
13	rforest	{'criterion': 'poisson', 'max_features': 'sqrt...	0.921264	1.087454e+17	1.920709e+34	1.385896e+17	0.452958	2.844069e+17	1.265573e+35	3.557490e+17
14	rforest	{'criterion': 'squared_error', 'max_features':...	0.852740	1.495281e+17	3.592513e+34	1.895393e+17	0.451911	2.842909e+17	1.269313e+35	3.562742e+17
11	rforest	{'criterion': 'squared_error', 'max_features':...	0.821266	1.652528e+17	4.360925e+34	2.088283e+17	0.450488	2.857992e+17	1.271526e+35	3.565846e+17
15	rforest	{'criterion': 'squared_error', 'max_features':...	0.836304	1.580934e+17	3.992762e+34	1.998190e+17	0.441457	2.872441e+17	1.294109e+35	3.597373e+17
20	knn	{'leaf_size': 28, 'n_neighbors': 10, 'p': 1, '...	1.000000	0.000000e+00	0.000000e+00	0.000000e+00	0.405802	2.949392e+17	1.372090e+35	3.704173e+17
16	knn	{'leaf_size': 24, 'n_neighbors': 11, 'p': 1, '...	1.000000	0.000000e+00	0.000000e+00	0.000000e+00	0.404322	2.951881e+17	1.375222e+35	3.708399e+17
18	knn	{'leaf_size': 21, 'n_neighbors': 12, 'p': 1, '...	1.000000	0.000000e+00	0.000000e+00	0.000000e+00	0.399835	2.967002e+17	1.385730e+35	3.722540e+17
19	knn	{'leaf_size': 5, 'n_neighbors': 12, 'p': 1, 'w...	1.000000	0.000000e+00	0.000000e+00	0.000000e+00	0.399835	2.967002e+17	1.385730e+35	3.722540e+17
17	knn	{'leaf_size': 35, 'n_neighbors': 12, 'p': 1, '...	1.000000	0.000000e+00	0.000000e+00	0.000000e+00	0.399835	2.967002e+17	1.385730e+35	3.722540e+17
10	dtree	{'max_depth': 41, 'max_features': 2, 'min_samp...	0.380137	3.085397e+17	1.512964e+35	3.889684e+17	0.209559	3.407998e+17	1.824329e+35	4.271217e+17
9	dtree	{'max_depth': 20, 'max_features': 6, 'min_samp...	0.425268	2.980873e+17	1.401759e+35	3.744008e+17	0.207876	3.396058e+17	1.832484e+35	4.280752e+17
8	dtree	{'max_depth': 12, 'max_features': None, 'min_s...	0.512901	2.738067e+17	1.188864e+35	3.447990e+17	0.192774	3.424470e+17	1.869418e+35	4.323676e+17
7	dtree	{'max_depth': 33, 'max_features': 6, 'min_samp...	0.502199	2.762003e+17	1.214969e+35	3.485641e+17	0.184884	3.431905e+17	1.892810e+35	4.350644e+17
6	dtree	{'max_depth': 19, 'max_features': 6, 'min_samp...	0.468714	2.876132e+17	1.295864e+35	3.599812e+17	0.154319	3.522359e+17	1.957617e+35	4.424497e+17
4	lasso	{'alpha': 0.00043275214209560715}	0.211484	3.518028e+17	1.926117e+35	4.388755e+17	0.152251	3.583597e+17	1.957925e+35	4.424844e+17
5	lasso	{'alpha': 0.00040540629093527964}	0.211550	3.517853e+17	1.925954e+35	4.388569e+17	0.152249	3.583499e+17	1.957915e+35	4.424833e+17
3	lasso	{'alpha': 0.0004634582859650019}	0.211403	3.518224e+17	1.926313e+35	4.388978e+17	0.152247	3.583736e+17	1.957948e+35	4.424871e+17
1	lasso	{'alpha': 0.0005387947145830145}	0.211186	3.518750e+17	1.926843e+35	4.389582e+17	0.152211	3.584119e+17	1.958070e+35	4.425009e+17
2	lasso	{'alpha': 0.0005814373880881531}	0.211049	3.519106e+17	1.927176e+35	4.389961e+17	0.152174	3.584365e+17	1.958178e+35	4.425130e+17
0	linear	{'copy_X': True, 'fit_intercept': True, 'n_job...	0.212029	3.517249e+17	1.924785e+35	4.387238e+17	0.151696	3.582060e+17	1.958993e+35	4.426052e+17

From each model we can see poor performance. The significant performer is the DNN which is still not acceptable enough to consider this model for application.

[232]:

for model in ["linear", "lasso", "dtree", "knn", "rforest", "nn"]:
    print(postprocessor.get_params(model_type=model), "\n")

  Model Types  copy_X  fit_intercept n_jobs   normalize  positive
0      linear    True           True   None  deprecated     False

  Model Types     alpha
0       lasso  0.000433

  Model Types  max_depth  max_features  min_samples_leaf  min_samples_split
0       dtree         41             2                14                  4

  Model Types  leaf_size  n_neighbors  p   weights
0         knn         28           10  1  distance

  Model Types       criterion  max_features  min_samples_leaf  \
0     rforest  absolute_error             4                 1

   min_samples_split  n_estimators
0                  2           174

  Model Types  batch_size  learning_rate mid_num_node_strategy  num_layers  \
0          nn           8       0.000769              constant           2

   start_num_nodes
0              445

[233]:

models = np.array([["linear", "lasso"], ["dtree", "dtree"], ["knn", "rforest"], ["nn", None]])

fig = plt.figure(constrained_layout=fig, figsize=(10,15))
gs = GridSpec(models.shape[0], models.shape[1], figure=fig)
for i in range(models.shape[0] - 1):
    for j in range(models.shape[1]):
        if models[i, j] != None:
            ax = fig.add_subplot(gs[i, j])
            ax = postprocessor.diagonal_validation_plot(model_type=models[i, j])
            ax.set_title(models[i, j])
ax = fig.add_subplot(gs[-1, :])
ax = postprocessor.diagonal_validation_plot(model_type=models[-1, 0])
_ = ax.set_title(models[-1, 0])

../_images/examples_HTGR_microreactor_22_0.png

Lastly we can see the nn traning curve. Due to loss being higher for validation one can assume this model cannot predict well.

[234]:

fig, ax = plt.subplots(figsize=(8,8))
ax = postprocessor.nn_learning_plot()

../_images/examples_HTGR_microreactor_24_0.png

Updated data using reactor symmetries to multiply samples

To obtain better results, higher levels of preprocessing is done using the qpower_preprocessing function from https://github.com/deanrp2/MicroControl/blob/main/pmdata/. This function applies the symmetry rules to multiply the samples, removes unnecessary columns, adjusts the drum angles to be in the correct cordinate system, and normalizes quadrant powers. This expands the data set from 818 samples to 3004 samples.

Additionally, the careful_split function from https://github.com/deanrp2/MicroControl/blob/main/pmdata/ is used to ensure samples deriving from the same sample fall in the same data set during the training and testing split.

pyMAISE Initialization

[11]:

global_settings = mai.settings.init()
preprocessor = mai.load_pqpower()

The data still consists of 8 inputs, but more samples.

[12]:

preprocessor.inputs.head()

[12]:

	theta1	theta2	theta3	theta4	theta5	theta6	theta7	theta8
0	2.237507	-1.698166	2.433453	1.253772	3.140399	-3.082829	2.754851	0.386800
1	1.253772	2.433453	-1.698166	2.237507	0.386800	2.754851	-3.082829	3.140399
2	-3.140399	3.082829	-2.754851	-0.386800	-2.237507	1.698166	-2.433453	-1.253772
3	-0.386800	-2.754851	3.082829	-3.140399	-1.253772	-2.433453	1.698166	-2.237507
4	-1.519638	2.489141	-1.853817	1.147173	-2.058398	0.755909	-0.091469	2.379043

The data still consists of the same outputs:

[13]:

preprocessor.outputs.head()

[13]:

	fluxQ1	fluxQ2	fluxQ3	fluxQ4
0	0.248175	0.249199	0.256734	0.245892
1	0.249199	0.248175	0.245892	0.256734
2	0.256734	0.245892	0.248175	0.249199
3	0.245892	0.256734	0.249199	0.248175
4	0.252723	0.250798	0.248167	0.248312

The careful_split function is used for train/test split [1]. The function definition can be seen below

[238]:

def careful_split(df, tstfrac = False, tstnum = False):
    """
    Returns a train and test split which does not allow samples coming from the same calculation
    to exist across the train/test line. In other words, samples generated from symmetry must stay
    with the sample they were made from.

    df: must be dataframe with the indices given from the mult_sym function. make sure to sort_index
    tstfrac: generate the split based on an approximate fraction to include in the TESTING set
    tstnum: generate the split based on an approximate total number to include in the TESTING set. The argument here
        will be approximately the total number of rows in the testing dataframe that is returned
    """
    tot_samples = df.shape[0]

    #determine size of final testing set divided by 4
    if tstnum:
        tstsize = int(tstnum//4)
    elif tstfrac:
        tstsize = int((tot_samples*tstfrac)//4)
    else:
        raise Exception("Please specify tstfrac or tstsize")

    #get sample names associate with original calcs
    samplenums = np.unique([a[:12] for a in df.index])

    #pick a set from the original sample names
    tidxr = list(np.random.choice(samplenums, tstsize, replace = False))

    #expand set to include multiplied sample names as well
    tidxf = sorted(tidxr + [a + "_h" for a in tidxr] + [a + "_r" for a in tidxr] + [a + "_v" for a in tidxr])

    #index original dataframe to get the sets
    train = df.drop(tidxf)
    test = df.loc[tidxf]

    return train, test

This splitting also was produced from https://github.com/deanrp2/MicroControl/blob/main/pmdata/.

[263]:

#load data
data = pd.read_csv("/home/connorcr//NERS491/pyMAISE/pyMAISE/data/qpower_processed.csv", index_col = 0)

#train and test split
train, test = careful_split(data, tstfrac = 0.2)

#define predictors and responses
pred = ["theta" + str(i) for i in range(1, 9)]
resp = ["fluxQ"+str(i) for i in range(1,5)]

#split into X/Y data arrays
xtrain=train[pred].values
xtest=test[pred].values

ytrain=train[resp].values
ytest=test[resp].values

# Putting data back into a pandas data frame for pyMAISE to use
data = (
    pd.DataFrame(xtrain, columns=['theta1', 'theta2', 'theta3', 'theta4', 'theta5', 'theta6', 'theta7', 'theta8']),
    pd.DataFrame(xtest, columns=['theta1', 'theta2', 'theta3', 'theta4', 'theta5', 'theta6', 'theta7', 'theta8']),
    pd.DataFrame(ytrain, columns=['fluxQ1', 'fluxQ2', 'fluxQ3', 'fluxQ4']),
    pd.DataFrame(ytest, columns=['fluxQ1', 'fluxQ2', 'fluxQ3', 'fluxQ4'])
)

Model Initialization

Models were initialized the same as before.

[240]:

model_settings = {
    "models": ["linear", "lasso", "dtree", "knn", "rforest", "nn"],
    "nn": {
        # Sequential
        "num_layers": 5,
        "dropout": False,
        "rate": 0.5,
        "validation_split": 0.20,
        "loss": "mean_absolute_error",
        "metrics": ["mean_absolute_error"],
        "batch_size": 8,
        "epochs": 200,
        "warm_start": True,
        "jit_compile": False,
        # Starting Layer
        "start_num_nodes": 437,
        "start_kernel_initializer": "normal",
        "start_activation": "relu",
        "input_dim": preprocessor.inputs.shape[1], # Number of inputs
        # Middle Layers
        "mid_num_node_strategy": "linear", # Middle layer nodes vary linearly from 'start_num_nodes' to 'end_num_nodes'
        "mid_kernel_initializer": "normal",
        "mid_activation": "relu",
        # Ending Layer
        "end_num_nodes": preprocessor.outputs.shape[1], # Number of outputs
        "end_activation": "linear",
        "end_kernel_initializer": "normal",
        # Optimizer
        "optimizer": "adam",
        "learning_rate": 9e-4,
    },
}
tuning = mai.Tuning(data=data, model_settings=model_settings)

Hyper-parameter Tuning

The hyper-parameter tuning spaces are defined below. Major differences are the slightly expanded search space along with Baysian search have double the amount of iterations and random search producing more models.

[241]:

random_search_spaces = {
    "lasso": {
        "alpha": uniform(loc=0.0001, scale=0.0099), # 0.0001 - 0.01
    },
    "dtree": {
        "max_depth": randint(low=5, high=50), # 5 - 50
        "max_features": [None, "sqrt", "log2", 2, 4, 6],
        "min_samples_split": randint(low=2, high=20), # 2 - 20
        "min_samples_leaf": randint(low=1, high=20), # 1 - 20
    },
    "rforest": {
        "n_estimators": randint(low=50, high=200), # 50 - 200
        "criterion": ["squared_error", "absolute_error", "poisson"],
        "min_samples_split": randint(low=2, high=20), # 2 - 20
        "min_samples_leaf": randint(low=1, high=20), # 1 - 20
        "max_features": [None, "sqrt", "log2", 2, 4, 6],
    },
    "knn": {
        "n_neighbors": randint(low=1, high=20), # 1 - 20
        "weights": ["uniform", "distance"],
        "leaf_size": randint(low=1, high=30), # 1 - 30
        "p": randint(low=1, high=10), # 1 - 10
    },
}
bayesian_search_spaces = {
    "nn": {
        "mid_num_node_strategy": ["constant", "linear"],
        "batch_size": [8, 64],
        "learning_rate": [1e-5, 0.001],
        "num_layers": [2, 8],
        "start_num_nodes": [25, 500],
    },
}

start = time.time()
random_search_configs = tuning.random_search(
    param_spaces=random_search_spaces,
    models=["linear"] + list(random_search_spaces.keys()),
    n_iter=400,
    cv=ShuffleSplit(n_splits=5, test_size=0.20, random_state=global_settings.random_state),
)
bayesian_search_configs = tuning.bayesian_search(
    param_spaces=bayesian_search_spaces,
    models=bayesian_search_spaces.keys(),
    n_iter=100,
    cv=5,
)
stop = time.time()
print("Hyper-parameter tuning took " + str((stop - start) / 60) + " minutes to process.")

Hyper-parameter tuning search space was not provided for linear, doing manual fit
Hyper-parameter tuning took 481.1029209295909 minutes to process.

Once again a plot of the Bayesian search nn convergence is shown.

[242]:

fig, ax = plt.subplots(figsize=(8,8))
ax = tuning.convergence_plot(model_types="nn")

new_model_settings = {
    "nn": {"epochs": 200}
}
postprocessor = mai.PostProcessor(
    data=data,
    models_list=[random_search_configs, bayesian_search_configs],
    new_model_settings=new_model_settings,
    yscaler=preprocessor.yscaler,
)

../_images/examples_HTGR_microreactor_41_0.png

The averaged performance matrics can be seen below.

[243]:

postprocessor.metrics()

[243]:

	Model Types	Parameter Configurations	Train R2	Train MAE	Train MSE	Train RMSE	Test R2	Test MAE	Test MSE	Test RMSE
21	nn	{'batch_size': 10, 'learning_rate': 0.00071587...	0.988009	0.000354	2.083540e-07	0.000456	0.980992	0.000465	3.594479e-07	0.000600
23	nn	{'batch_size': 8, 'learning_rate': 0.000469699...	0.960876	0.000653	6.797935e-07	0.000824	0.957822	0.000697	7.975876e-07	0.000893
22	nn	{'batch_size': 8, 'learning_rate': 0.000536184...	0.957795	0.000669	7.333210e-07	0.000856	0.955604	0.000718	8.395288e-07	0.000916
25	nn	{'batch_size': 8, 'learning_rate': 0.000557979...	0.951672	0.000726	8.397190e-07	0.000916	0.947478	0.000791	9.931777e-07	0.000997
24	nn	{'batch_size': 8, 'learning_rate': 0.000394115...	0.938714	0.000812	1.064867e-06	0.001032	0.936472	0.000861	1.201305e-06	0.001096
11	rforest	{'criterion': 'poisson', 'max_features': None,...	0.957186	0.000683	7.439166e-07	0.000863	0.744293	0.001761	4.835391e-06	0.002199
13	rforest	{'criterion': 'poisson', 'max_features': 6, 'm...	0.954779	0.000702	7.857387e-07	0.000886	0.736210	0.001796	4.988249e-06	0.002233
15	rforest	{'criterion': 'absolute_error', 'max_features'...	0.912795	0.000974	1.515221e-06	0.001231	0.735613	0.001795	4.999528e-06	0.002236
12	rforest	{'criterion': 'squared_error', 'max_features':...	0.965586	0.000614	5.979644e-07	0.000773	0.732073	0.001804	5.066466e-06	0.002251
14	rforest	{'criterion': 'squared_error', 'max_features':...	0.918623	0.000943	1.413963e-06	0.001189	0.724966	0.001827	5.200860e-06	0.002281
16	knn	{'leaf_size': 27, 'n_neighbors': 7, 'p': 1, 'w...	1.000000	0.000000	0.000000e+00	0.000000	0.715866	0.001842	5.372955e-06	0.002318
17	knn	{'leaf_size': 14, 'n_neighbors': 7, 'p': 1, 'w...	1.000000	0.000000	0.000000e+00	0.000000	0.715866	0.001842	5.372955e-06	0.002318
18	knn	{'leaf_size': 15, 'n_neighbors': 7, 'p': 1, 'w...	1.000000	0.000000	0.000000e+00	0.000000	0.715866	0.001842	5.372955e-06	0.002318
20	knn	{'leaf_size': 20, 'n_neighbors': 9, 'p': 1, 'w...	1.000000	0.000000	0.000000e+00	0.000000	0.707126	0.001868	5.538227e-06	0.002353
19	knn	{'leaf_size': 2, 'n_neighbors': 9, 'p': 1, 'we...	1.000000	0.000000	0.000000e+00	0.000000	0.707126	0.001868	5.538227e-06	0.002353
10	dtree	{'max_depth': 31, 'max_features': None, 'min_s...	0.637017	0.001983	6.306981e-06	0.002511	0.408778	0.002623	1.117995e-05	0.003344
8	dtree	{'max_depth': 40, 'max_features': 6, 'min_samp...	0.644997	0.001964	6.168320e-06	0.002484	0.408205	0.002665	1.119079e-05	0.003345
7	dtree	{'max_depth': 45, 'max_features': None, 'min_s...	0.704183	0.001790	5.139935e-06	0.002267	0.399901	0.002658	1.134782e-05	0.003369
9	dtree	{'max_depth': 29, 'max_features': None, 'min_s...	0.666184	0.001900	5.800178e-06	0.002408	0.392290	0.002663	1.149174e-05	0.003390
6	dtree	{'max_depth': 19, 'max_features': None, 'min_s...	0.700227	0.001806	5.208673e-06	0.002282	0.388429	0.002670	1.156475e-05	0.003401
0	linear	{'copy_X': True, 'fit_intercept': True, 'n_job...	0.010366	0.003334	1.719528e-05	0.004147	0.012866	0.003483	1.866662e-05	0.004320
5	lasso	{'alpha': 0.00012669007172725014}	0.008703	0.003333	1.722417e-05	0.004150	0.008889	0.003487	1.874183e-05	0.004329
4	lasso	{'alpha': 0.00015189471533303285}	0.008215	0.003334	1.723265e-05	0.004151	0.008310	0.003488	1.875277e-05	0.004330
3	lasso	{'alpha': 0.0001570349962680683}	0.008106	0.003334	1.723455e-05	0.004151	0.008185	0.003488	1.875514e-05	0.004331
2	lasso	{'alpha': 0.0001681918260943599}	0.007857	0.003334	1.723888e-05	0.004152	0.007903	0.003488	1.876047e-05	0.004331
1	lasso	{'alpha': 0.0001892404322211085}	0.007341	0.003334	1.724785e-05	0.004153	0.007331	0.003488	1.877127e-05	0.004333

From this we can see that all the models performed significantly better with neural networks doing the best. To visualize this further diagonal validation plots and validations plots were ploduced below.

[260]:

models = np.array([["linear", "lasso"], ["dtree", "dtree"], ["knn", "rforest"], ["nn", None]])

fig = plt.figure(constrained_layout=fig, figsize=(10,15))
gs = GridSpec(models.shape[0], models.shape[1], figure=fig)
for i in range(models.shape[0] - 1):
    for j in range(models.shape[1]):
        if models[i, j] != None:
            ax = fig.add_subplot(gs[i, j])
            ax = postprocessor.diagonal_validation_plot(model_type=models[i, j])
            ax.set_title(models[i, j])
ax = fig.add_subplot(gs[-1, :])
ax = postprocessor.diagonal_validation_plot(model_type=models[-1, 0])
_ = ax.set_title(models[-1, 0])

../_images/examples_HTGR_microreactor_45_0.png

Lastly the neural networks learning curve was produced to show there is no overfitting.

[261]:

fig, ax = plt.subplots(figsize=(8,8))
ax = postprocessor.nn_learning_plot()

../_images/examples_HTGR_microreactor_47_0.png

References

[1] Price, Dean & Radaideh, Majdi & Kochunas, Brendan. (2022). Multiobjective optimization of nuclear microreactor reactivity control system operation with swarm and evolutionary algorithms. Nuclear Engineering and Design. 393. 111776. 10.1016/j.nucengdes.2022.111776.