NEACRP C1 Rod Ejection Accident

Inputs

  • rod_worth: Reactivity worth of the rod being ejected

  • beta: Delayed neutron fraction

  • h_gap: Gap conductance (\(\frac{W}{m^2 \cdot K}\))

  • gamma_frac: Direct heating fraction

Outputs

  • max_power: Peak power reached during transient (\(\% FP\))

  • burst_width: Width of power burst (\(s\))

  • max_Tf: Max fuel centerline temperature (\(K\))

  • avg_Tcool: Average coolant temperature at outlet (\(K\))

The NEACRP C1 rod ejection accident (REA) data set is a benchmark for reactor transient analysis. It is used to relate the REA and reactor parameters to the power and thermal behavior of the system during and after the event. The data set was constructed by perturbing the inputs listed above and recording the resulting outputs, which are the quantities of interest in the safety analysis of the transient. The data were generated with deterministic simulations in the PARCS code and comprise 2000 samples [1]. The goal is to use pyMAISE to build, tune, and compare ML models that predict the transient outcomes from the REA properties.

[6]:
import pyMAISE as mai
import time
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.gridspec import GridSpec
from scipy.stats import uniform, randint, norm
from sklearn.model_selection import ShuffleSplit
from statistics import stdev, mean

# Plot settings
matplotlib_settings = {
    "font.size": 14,
    "legend.fontsize": 12,
}
plt.rcParams.update(**matplotlib_settings)

pyMAISE Initialization

First we initialize pyMAISE with the following 4 parameters:

  • verbosity: 0 \(\rightarrow\) pyMAISE prints no outputs,

  • random_state: None \(\rightarrow\) No random seed is set,

  • test_size: 0.3 \(\rightarrow\) 30% of the data is used for testing,

  • num_configs_saved: 5 \(\rightarrow\) The top 5 hyper-parameter configurations are saved for each model.

With pyMAISE initialized we can load the preprocessor for this data set using load_rea().

[7]:
global_settings = mai.settings.init()
preprocessor = mai.load_rea()

As stated, the data set consists of 4 inputs:

[8]:
preprocessor.inputs.head()
[8]:
rod_worth beta h_gap gamma_frac
0 0.008638 0.007576 13727.981902 0.023957
1 0.009255 0.007529 9370.218080 0.019707
2 0.008046 0.007647 9962.543845 0.020045
3 0.008463 0.007139 8569.910206 0.020072
4 0.008641 0.007575 12813.925869 0.011449

and 4 outputs with 2000 total data points:

[9]:
preprocessor.outputs.head()
[9]:
max_power burst_width max_Tf avg_Tcool
0 181.210 0.315 918.3 561.119081
1 474.590 0.250 965.2 562.030035
2 44.083 0.425 875.7 560.194700
3 270.500 0.290 938.2 561.241696
4 195.560 0.315 924.8 561.106714

Prior to constructing any models we can get a surface understanding of the data set with a correlation matrix.

[10]:
fig, ax = plt.subplots(figsize=(15,10))
fig, ax = preprocessor.correlation_matrix(fig=fig, ax=ax, annotations=True, colorbar=False)
../_images/examples_rod_ejection_9_0.png

There is a positive correlation between rod worth and maximum power, maximum fuel centerline temperature, and average coolant outlet temperature. Additionally, the delayed neutron fraction correlates with burst width.
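As a rough illustration of what such a matrix contains, the sketch below computes Pearson coefficients with pandas for the five sample rows printed above. It assumes Pearson correlation, which is an assumption about what correlation_matrix computes, and five rows are far too few to reproduce the full-data values; only the mechanics are shown.

```python
import pandas as pd

# The five sample rows printed by inputs.head() and outputs.head() above.
sample = pd.DataFrame(
    {
        "rod_worth": [0.008638, 0.009255, 0.008046, 0.008463, 0.008641],
        "beta": [0.007576, 0.007529, 0.007647, 0.007139, 0.007575],
        "h_gap": [13727.981902, 9370.218080, 9962.543845, 8569.910206, 12813.925869],
        "gamma_frac": [0.023957, 0.019707, 0.020045, 0.020072, 0.011449],
        "max_power": [181.210, 474.590, 44.083, 270.500, 195.560],
        "burst_width": [0.315, 0.250, 0.425, 0.290, 0.315],
        "max_Tf": [918.3, 965.2, 875.7, 938.2, 924.8],
        "avg_Tcool": [561.119081, 562.030035, 560.194700, 561.241696, 561.106714],
    }
)

# Pearson correlation between every pair of columns (8 x 8 matrix).
corr = sample.corr(method="pearson")

# Show only the input-versus-output block for two of the inputs.
outputs = ["max_power", "burst_width", "max_Tf", "avg_Tcool"]
print(corr.loc[["rod_worth", "beta"], outputs].round(2))
```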

The final step of the pyMAISE initialization process is data scaling. For this data set we will use min-max scaling.

[11]:
data = preprocessor.min_max_scale()
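For reference, min-max scaling maps each column to the range [0, 1] through \(x' = \frac{x - x_{min}}{x_{max} - x_{min}}\). The sketch below applies Scikit-learn's MinMaxScaler to two input columns from the sample rows above and checks it against the formula. It is an illustration of the technique only; how min_max_scale applies it internally is not shown here.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# rod_worth and h_gap from the five sample rows shown above.
X = np.array(
    [
        [0.008638, 13727.981902],
        [0.009255, 9370.218080],
        [0.008046, 9962.543845],
        [0.008463, 8569.910206],
        [0.008641, 12813.925869],
    ]
)

# Scale each column independently to [0, 1].
scaler = MinMaxScaler()
X_scaled = scaler.fit_transform(X)

# The same result from the formula x' = (x - min) / (max - min).
X_manual = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))

# The fitted scaler can map scaled values back to physical units.
X_back = scaler.inverse_transform(X_scaled)

print(X_scaled.round(3))
```

Scaling matters here because the inputs differ by several orders of magnitude (h_gap is of order \(10^4\) while rod_worth is of order \(10^{-2}\)), which would otherwise dominate distance-based models such as knn.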

Model Initialization

We will examine the performance of 6 models on this data set:

  • Linear regression: linear,

  • Lasso regression: lasso,

  • Decision tree regression: dtree,

  • Random forest regression: rforest,

  • K-nearest neighbors regression: knn,

  • Sequential dense neural networks: nn.

For hyper-parameter tuning, each model must be initialized. We will use the Scikit-learn defaults for the classical ML models (linear, lasso, dtree, rforest, and knn); therefore, they are only specified in the models parameter of the model_settings dictionary. However, we must specify nn model parameters that define the layers, optimizer, and training.

[12]:
model_settings = {
    "models": ["linear", "lasso", "dtree", "knn", "rforest", "nn"],
    "nn": {
        # Sequential
        "num_layers": 4,
        "dropout": True,
        "rate": 0.5,
        "validation_split": 0.15,
        "loss": "mean_absolute_error",
        "metrics": ["mean_absolute_error"],
        "batch_size": 8,
        "epochs": 75,
        "warm_start": True,
        "jit_compile": False,
        # Starting Layer
        "start_num_nodes": 100,
        "start_kernel_initializer": "normal",
        "start_activation": "relu",
        "input_dim": preprocessor.inputs.shape[1], # Number of inputs
        # Middle Layers
        "mid_num_node_strategy": "linear", # Middle layer nodes vary linearly from 'start_num_nodes' to 'end_num_nodes'
        "mid_kernel_initializer": "normal",
        "mid_activation": "relu",
        # Ending Layer
        "end_num_nodes": preprocessor.outputs.shape[1], # Number of outputs
        "end_activation": "linear",
        "end_kernel_initializer": "normal",
        # Optimizer
        "optimizer": "adam",
        "learning_rate": 5e-4,
    },
}
tuning = mai.Tuning(data=data, model_settings=model_settings)
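To make the "linear" node strategy concrete, the sketch below interpolates layer widths between start_num_nodes and end_num_nodes. The helper linear_layer_widths is hypothetical and not part of pyMAISE; how pyMAISE rounds the widths or counts layers is an assumption here.

```python
import numpy as np


def linear_layer_widths(start_num_nodes, end_num_nodes, num_layers):
    """Hypothetical helper: node counts interpolated linearly across layers."""
    widths = np.linspace(start_num_nodes, end_num_nodes, num_layers)
    return [int(round(w)) for w in widths]


# Values from the nn settings above: 100 starting nodes, 4 outputs, 4 layers.
widths = linear_layer_widths(start_num_nodes=100, end_num_nodes=4, num_layers=4)
print(widths)
```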

Hyper-parameter Tuning

We will use random search to tune the classical models (lasso, dtree, rforest, and knn) through the random_search function; linear is fit manually with the Scikit-learn defaults. For each classical model, 300 randomly sampled parameter configurations are evaluated. For nn, Bayesian search optimizes the hyper-parameters in 50 iterations through the bayesian_search function. Bayesian search is appealing for nn because training neural networks is computationally expensive. To further reduce the cost, nn is trained for a reduced number of epochs during tuning (the epochs value in model_settings), which produces less performant models but still identifies promising hyper-parameters. Both search methods use cross-validation to reduce the dependence of the results on any single split of the data. The hyper-parameter search spaces are defined in the random_search_spaces and bayesian_search_spaces dictionaries.
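The random search spaces rely on frozen SciPy distributions. As a small check of how they sample, the sketch below draws from two of them: scipy.stats.randint excludes its high bound, and scipy.stats.uniform spans [loc, loc + scale].

```python
from scipy.stats import randint, uniform

# Same constructors as in the search spaces for dtree and lasso.
depth_dist = randint(low=5, high=50)            # integers from 5 to 49
alpha_dist = uniform(loc=0.0001, scale=0.0099)  # floats from 0.0001 to 0.01

# Draw reproducible samples from each distribution.
depths = depth_dist.rvs(size=10_000, random_state=0)
alphas = alpha_dist.rvs(size=10_000, random_state=0)

print("max_depth range:", depths.min(), "to", depths.max())
print("alpha range:", alphas.min(), "to", alphas.max())
```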

[13]:
random_search_spaces = {
    "lasso": {
        "alpha": uniform(loc=0.0001, scale=0.0099), # 0.0001 - 0.01
    },
    "dtree": {
        "max_depth": randint(low=5, high=50), # 5 - 49 (high is exclusive)
        "max_features": [None, "sqrt", "log2", 2, 4, 6],
        "min_samples_split": randint(low=2, high=20), # 2 - 19
        "min_samples_leaf": randint(low=1, high=20), # 1 - 19
    },
    "rforest": {
        "n_estimators": randint(low=50, high=200), # 50 - 199 (high is exclusive)
        "criterion": ["squared_error", "absolute_error", "poisson"],
        "min_samples_split": randint(low=2, high=20), # 2 - 19
        "min_samples_leaf": randint(low=1, high=20), # 1 - 19
        "max_features": [None, "sqrt", "log2", 2, 4, 6],
    },
    "knn": {
        "n_neighbors": randint(low=1, high=20), # 1 - 19 (high is exclusive)
        "weights": ["uniform", "distance"],
        "leaf_size": randint(low=1, high=30), # 1 - 29
        "p": randint(low=1, high=10), # 1 - 9
    },
}
bayesian_search_spaces = {
    "nn": {
        "mid_num_node_strategy": ["constant", "linear"],
        "batch_size": [8, 64],
        "dropout": [True, False],
        "learning_rate": [1e-5, 0.01],
        "num_layers": [2, 6],
        "start_num_nodes": [25, 500],
    },
}

start = time.time()
random_search_configs = tuning.random_search(
    param_spaces=random_search_spaces,
    models=["linear"] + list(random_search_spaces.keys()),
    n_iter=300,
    cv=ShuffleSplit(n_splits=5, test_size=0.25, random_state=global_settings.random_state),
)
bayesian_search_configs = tuning.bayesian_search(
    param_spaces=bayesian_search_spaces,
    models=bayesian_search_spaces.keys(),
    n_iter=50,
    cv=ShuffleSplit(n_splits=5, test_size=0.25, random_state=global_settings.random_state),
)
stop = time.time()
print("Hyper-parameter tuning took " + str((stop - start) / 60) + " minutes to process.")
Hyper-parameter tuning search space was not provided for linear, doing manual fit
Hyper-parameter tuning took 108.24320970773697 minutes to process.
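For readers who want to see the idea behind random search with shuffle-split cross-validation outside of pyMAISE, the sketch below uses Scikit-learn's RandomizedSearchCV on synthetic data. It is not pyMAISE's implementation: the data are a made-up stand-in for the REA data set, and n_iter is reduced from 300 to 20 to keep it fast.

```python
import numpy as np
from scipy.stats import uniform
from sklearn.linear_model import Lasso
from sklearn.model_selection import RandomizedSearchCV, ShuffleSplit

# Synthetic stand-in data (not the REA data set): 4 inputs, 1 output.
rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 4))
y = X @ np.array([1.0, -2.0, 0.5, 0.0]) + 0.05 * rng.normal(size=200)

# Sample 20 alpha values and score each with 5 random 75/25 splits.
search = RandomizedSearchCV(
    estimator=Lasso(),
    param_distributions={"alpha": uniform(loc=0.0001, scale=0.0099)},
    n_iter=20,
    cv=ShuffleSplit(n_splits=5, test_size=0.25, random_state=0),
    scoring="r2",
    random_state=0,
)
search.fit(X, y)

# Keep the five best configurations, mirroring num_configs_saved = 5.
order = np.argsort(search.cv_results_["rank_test_score"])[:5]
top5 = [
    (search.cv_results_["params"][i], search.cv_results_["mean_test_score"][i])
    for i in order
]
for params, score in top5:
    print(params, round(score, 4))
```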

We can understand the hyper-parameter tuning of Bayesian search from the convergence plot.

[14]:
fig, ax = plt.subplots(figsize=(8,8))
ax = tuning.convergence_plot(model_types="nn")
../_images/examples_rod_ejection_17_0.png

Fewer than 30 iterations were required to converge to the optimal parameter configurations.

Model Post-processing

Now that the top num_configs_saved configurations are saved, we can pass these models to the PostProcessor for model comparison and analysis. To improve the nn performance, we can pass an updated epochs parameter. Using 500 epochs should improve fitting at a higher computational cost.

[15]:
new_model_settings = {
    "nn": {"epochs": 500}
}
postprocessor = mai.PostProcessor(
    data=data,
    models_list=[random_search_configs, bayesian_search_configs],
    new_model_settings=new_model_settings,
    yscaler=preprocessor.yscaler,
)

To compare the performance of these models we will compute 4 metrics for both the training and testing data:

  • mean squared error MSE \(=\frac{1}{n}\sum^n_{i = 1}(y_i - \hat{y}_i)^2\),

  • root mean squared error RMSE \(=\sqrt{\frac{1}{n}\sum^n_{i = 1}(y_i - \hat{y}_i)^2}\),

  • mean absolute error MAE \(=\frac{1}{n}\sum^n_{i = 1}|y_i - \hat{y}_i|\),

  • and r-squared R2 \(=1 - \frac{\sum^n_{i = 1}(y_i - \hat{y}_i)^2}{\sum^n_{i = 1}(y_i - \bar{y})^2}\),

where \(y\) is the actual outcome, \(\bar{y}\) is the average outcome, \(\hat{y}\) is the model predicted outcome, and \(n\) is the number of observations. The averaged performance metrics are shown below.
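As a quick check of the definitions above, the sketch below evaluates the four metrics with NumPy and compares them to the corresponding Scikit-learn functions. The actual values are the five max_power samples printed earlier; the predictions are hypothetical numbers chosen only for illustration.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

y_true = np.array([181.210, 474.590, 44.083, 270.500, 195.560])  # max_power (% FP)
y_pred = np.array([180.0, 470.0, 50.0, 275.0, 190.0])  # hypothetical predictions

n = y_true.size
residual = y_true - y_pred

# Direct translations of the four definitions.
mse = np.sum(residual**2) / n
rmse = np.sqrt(mse)
mae = np.sum(np.abs(residual)) / n
r2 = 1 - np.sum(residual**2) / np.sum((y_true - y_true.mean()) ** 2)

# Scikit-learn equivalents for comparison.
print("MSE ", mse, mean_squared_error(y_true, y_pred))
print("RMSE", rmse, np.sqrt(mean_squared_error(y_true, y_pred)))
print("MAE ", mae, mean_absolute_error(y_true, y_pred))
print("R2  ", r2, r2_score(y_true, y_pred))
```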

[16]:
postprocessor.metrics()[["Model Types", "Train R2", "Test R2"]]
[16]:
Model Types Train R2 Test R2
22 nn 0.996991 0.996563
25 nn 0.997542 0.996274
23 nn 0.993635 0.994332
21 nn 0.994222 0.993968
24 nn 0.990929 0.991919
12 rforest 0.995286 0.986470
11 rforest 0.996907 0.986128
13 rforest 0.993114 0.984042
15 rforest 0.990654 0.983169
14 rforest 0.990818 0.981383
7 dtree 0.998771 0.960748
6 dtree 1.000000 0.960440
10 dtree 0.995147 0.959554
9 dtree 0.997734 0.956914
8 dtree 0.999312 0.955705
19 knn 1.000000 0.949050
17 knn 1.000000 0.948966
16 knn 1.000000 0.947115
18 knn 1.000000 0.946237
20 knn 0.950356 0.939592
0 linear 0.854579 0.850833
4 lasso 0.854311 0.850303
1 lasso 0.854212 0.850150
2 lasso 0.854065 0.849899
3 lasso 0.854057 0.849886
5 lasso 0.853920 0.849656

The nn models perform best, with all five configurations above a test R2 of 0.99. They are followed by rforest (about 0.98), dtree (about 0.96), and knn (about 0.94 to 0.95). linear and lasso trail at about 0.85, which suggests that at least one output is not linear in the inputs. We can look specifically at the performance for each output:

[17]:
postprocessor.metrics(y="max_power")
[17]:
Model Types Parameter Configurations Train R2 Train MAE Train MSE Train RMSE Test R2 Test MAE Test MSE Test RMSE
21 nn {'batch_size': 8, 'dropout': 0, 'learning_rate... 0.999765 2.133878 10.063103 3.172239 0.999828 2.030514 7.801756 2.793162
22 nn {'batch_size': 8, 'dropout': 0, 'learning_rate... 0.999833 2.112485 7.136338 2.671393 0.999809 2.219746 8.629569 2.937613
23 nn {'batch_size': 13, 'dropout': 0, 'learning_rat... 0.999527 3.283176 20.276449 4.502938 0.999619 3.345278 17.225367 4.150345
25 nn {'batch_size': 21, 'dropout': 0, 'learning_rat... 0.999574 3.959537 18.271895 4.274564 0.999582 3.942873 18.888371 4.346075
24 nn {'batch_size': 8, 'dropout': 0, 'learning_rate... 0.999397 3.839394 25.836482 5.082960 0.999391 3.985900 27.552767 5.249073
12 rforest {'criterion': 'poisson', 'max_features': None,... 0.998844 3.694867 49.532222 7.037913 0.991611 10.021148 379.428960 19.478936
11 rforest {'criterion': 'squared_error', 'max_features':... 0.998732 4.066483 54.318439 7.370104 0.990279 10.491619 439.651007 20.967857
14 rforest {'criterion': 'absolute_error', 'max_features'... 0.996685 6.143280 142.022579 11.917323 0.989230 11.197654 487.118868 22.070770
15 rforest {'criterion': 'squared_error', 'max_features':... 0.997262 5.726082 117.322278 10.831541 0.988743 11.284735 509.128735 22.563881
13 rforest {'criterion': 'poisson', 'max_features': 6, 'm... 0.997948 4.804165 87.931383 9.377174 0.987344 11.140833 572.401047 23.924904
10 dtree {'max_depth': 23, 'max_features': None, 'min_s... 0.997487 6.092556 107.659430 10.375906 0.979565 19.109739 924.216361 30.400927
7 dtree {'max_depth': 30, 'max_features': 6, 'min_samp... 0.999503 2.354660 21.306076 4.615851 0.978051 20.263113 992.725335 31.507544
6 dtree {'max_depth': 31, 'max_features': None, 'min_s... 1.000000 0.000000 0.000000 0.000000 0.977857 20.057878 1001.494090 31.646391
9 dtree {'max_depth': 37, 'max_features': 6, 'min_samp... 0.998593 4.222973 60.297608 7.765153 0.975976 20.147429 1086.564282 32.963075
8 dtree {'max_depth': 11, 'max_features': 6, 'min_samp... 0.999548 2.004389 19.346764 4.398496 0.973210 21.389837 1211.670836 34.809063
17 knn {'leaf_size': 28, 'n_neighbors': 5, 'p': 2, 'w... 1.000000 0.000000 0.000000 0.000000 0.971296 21.422576 1298.217237 36.030782
19 knn {'leaf_size': 9, 'n_neighbors': 3, 'p': 2, 'we... 1.000000 0.000000 0.000000 0.000000 0.970830 22.624646 1319.297694 36.322138
16 knn {'leaf_size': 7, 'n_neighbors': 4, 'p': 2, 'we... 1.000000 0.000000 0.000000 0.000000 0.970633 22.012378 1328.196327 36.444428
18 knn {'leaf_size': 18, 'n_neighbors': 6, 'p': 2, 'w... 1.000000 0.000000 0.000000 0.000000 0.970355 21.527457 1340.782022 36.616690
20 knn {'leaf_size': 1, 'n_neighbors': 5, 'p': 4, 'we... 0.973073 18.921449 1153.640964 33.965291 0.964079 23.701504 1624.645752 40.306895
0 linear {'copy_X': True, 'fit_intercept': True, 'n_job... 0.883119 50.664327 5007.516923 70.763811 0.877886 52.341080 5522.919504 74.316347
4 lasso {'alpha': 0.00015499889632899263} 0.882972 50.615358 5013.822766 70.808352 0.877659 52.333832 5533.217815 74.385602
1 lasso {'alpha': 0.00018375666615060612} 0.882912 50.612206 5016.379778 70.826406 0.877585 52.341321 5536.563923 74.408090
2 lasso {'alpha': 0.00022582145708762807} 0.882806 50.612246 5020.901953 70.858323 0.877459 52.357448 5542.267588 74.446407
3 lasso {'alpha': 0.00022794412681753656} 0.882800 50.612303 5021.154770 70.860107 0.877452 52.358437 5542.580883 74.448512
5 lasso {'alpha': 0.0002627188109898158} 0.882696 50.615824 5025.633335 70.891701 0.877331 52.374636 5548.061975 74.485314

For max power, all models except linear and lasso performed well, with test R2 above 0.96.

[18]:
postprocessor.metrics(y="burst_width")
[18]:
Model Types Parameter Configurations Train R2 Train MAE Train MSE Train RMSE Test R2 Test MAE Test MSE Test RMSE
22 nn {'batch_size': 8, 'dropout': 0, 'learning_rate... 0.988799 0.005160 0.000172 0.013098 0.987108 0.005698 0.000239 0.015456
25 nn {'batch_size': 21, 'dropout': 0, 'learning_rat... 0.991330 0.005005 0.000133 0.011524 0.986333 0.005875 0.000253 0.015914
23 nn {'batch_size': 13, 'dropout': 0, 'learning_rat... 0.977841 0.005919 0.000339 0.018422 0.980524 0.006991 0.000361 0.018997
21 nn {'batch_size': 8, 'dropout': 0, 'learning_rate... 0.978727 0.006273 0.000326 0.018050 0.977655 0.007223 0.000414 0.020349
12 rforest {'criterion': 'poisson', 'max_features': None,... 0.985973 0.002829 0.000215 0.014657 0.974381 0.006673 0.000475 0.021788
11 rforest {'criterion': 'squared_error', 'max_features':... 0.992650 0.003202 0.000113 0.010610 0.973837 0.006776 0.000485 0.022018
13 rforest {'criterion': 'poisson', 'max_features': 6, 'm... 0.982523 0.004071 0.000268 0.016361 0.973693 0.006995 0.000487 0.022079
24 nn {'batch_size': 8, 'dropout': 0, 'learning_rate... 0.967319 0.007959 0.000501 0.022373 0.971284 0.008661 0.000532 0.023067
15 rforest {'criterion': 'squared_error', 'max_features':... 0.975306 0.004684 0.000378 0.019448 0.968105 0.007372 0.000591 0.024311
14 rforest {'criterion': 'absolute_error', 'max_features'... 0.978055 0.004783 0.000336 0.018333 0.961587 0.007456 0.000712 0.026679
6 dtree {'max_depth': 31, 'max_features': None, 'min_s... 1.000000 0.000000 0.000000 0.000000 0.915653 0.013175 0.001563 0.039534
7 dtree {'max_depth': 30, 'max_features': 6, 'min_samp... 0.997753 0.001886 0.000034 0.005866 0.915387 0.013296 0.001568 0.039597
10 dtree {'max_depth': 23, 'max_features': None, 'min_s... 0.990948 0.004050 0.000139 0.011774 0.908150 0.013236 0.001702 0.041255
8 dtree {'max_depth': 11, 'max_features': 6, 'min_samp... 0.999439 0.001429 0.000009 0.002932 0.906660 0.013048 0.001730 0.041589
9 dtree {'max_depth': 37, 'max_features': 6, 'min_samp... 0.997002 0.002823 0.000046 0.006776 0.902873 0.013235 0.001800 0.042424
19 knn {'leaf_size': 9, 'n_neighbors': 3, 'p': 2, 'we... 1.000000 0.000000 0.000000 0.000000 0.873178 0.015211 0.002350 0.048477
17 knn {'leaf_size': 28, 'n_neighbors': 5, 'p': 2, 'w... 1.000000 0.000000 0.000000 0.000000 0.868704 0.014066 0.002433 0.049325
16 knn {'leaf_size': 7, 'n_neighbors': 4, 'p': 2, 'we... 1.000000 0.000000 0.000000 0.000000 0.863291 0.014743 0.002533 0.050331
18 knn {'leaf_size': 18, 'n_neighbors': 6, 'p': 2, 'w... 1.000000 0.000000 0.000000 0.000000 0.860476 0.014414 0.002585 0.050847
20 knn {'leaf_size': 1, 'n_neighbors': 5, 'p': 4, 'we... 0.867813 0.012254 0.002025 0.044995 0.847149 0.015478 0.002832 0.053220
0 linear {'copy_X': True, 'fit_intercept': True, 'n_job... 0.568566 0.035066 0.006608 0.081288 0.561224 0.038123 0.008131 0.090170
4 lasso {'alpha': 0.00015499889632899263} 0.567971 0.033968 0.006617 0.081344 0.559685 0.037069 0.008159 0.090328
1 lasso {'alpha': 0.00018375666615060612} 0.567766 0.033770 0.006620 0.081363 0.559282 0.036893 0.008167 0.090369
2 lasso {'alpha': 0.00022582145708762807} 0.567522 0.033493 0.006624 0.081386 0.558636 0.036641 0.008179 0.090435
3 lasso {'alpha': 0.00022794412681753656} 0.567510 0.033479 0.006624 0.081387 0.558603 0.036628 0.008179 0.090439
5 lasso {'alpha': 0.0002627188109898158} 0.567299 0.033254 0.006627 0.081407 0.558036 0.036421 0.008190 0.090497

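For burst width, knn, dtree, lasso, and linear struggled to predict the testing data. Most knn configurations and one dtree configuration overfit the training data set, reaching a training R2 of 1.0 with a much lower test R2.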
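One likely contributor to the perfect knn training scores is distance weighting, which the get_params output later in this notebook reports for the best knn configuration. With weights="distance", each training point is its own nearest neighbor at zero distance, so Scikit-learn returns its stored target exactly. The sketch below demonstrates this on synthetic data, which is an assumption and not the REA data set.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor

# Synthetic nonlinear data with noise (not the REA data set).
rng = np.random.default_rng(0)
X = rng.uniform(size=(300, 4))
y = np.sin(3 * X[:, 0]) + X[:, 1] ** 2 + 0.1 * rng.normal(size=300)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Compare uniform and distance weighting with the same number of neighbors.
scores = {}
for weights in ("uniform", "distance"):
    knn = KNeighborsRegressor(n_neighbors=5, weights=weights)
    knn.fit(X_train, y_train)
    scores[weights] = (knn.score(X_train, y_train), knn.score(X_test, y_test))
    print(weights, "train R2:", scores[weights][0], "test R2:", scores[weights][1])
```

A training R2 of 1.0 under distance weighting therefore says little about generalization; the test R2 is the meaningful number.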

[19]:
postprocessor.metrics(y="max_Tf")
[19]:
Model Types Parameter Configurations Train R2 Train MAE Train MSE Train RMSE Test R2 Test MAE Test MSE Test RMSE
22 nn {'batch_size': 8, 'dropout': 0, 'learning_rate... 0.999606 0.476851 0.494479 0.703192 0.999593 0.514516 0.534188 0.730882
25 nn {'batch_size': 21, 'dropout': 0, 'learning_rat... 0.999654 0.531952 0.435127 0.659641 0.999570 0.561521 0.563811 0.750873
21 nn {'batch_size': 8, 'dropout': 0, 'learning_rate... 0.998736 0.796847 1.587985 1.260153 0.998745 0.810594 1.645989 1.282961
24 nn {'batch_size': 8, 'dropout': 0, 'learning_rate... 0.997993 1.064926 2.520588 1.587636 0.998061 1.124524 2.543935 1.594972
23 nn {'batch_size': 13, 'dropout': 0, 'learning_rat... 0.997786 1.214482 2.780910 1.667606 0.997780 1.280105 2.911773 1.706392
11 rforest {'criterion': 'squared_error', 'max_features':... 0.997825 1.095227 2.731355 1.652681 0.989698 2.461387 13.514313 3.676182
12 rforest {'criterion': 'poisson', 'max_features': None,... 0.997872 0.985909 2.672958 1.634918 0.989633 2.511683 13.599902 3.687805
15 rforest {'criterion': 'squared_error', 'max_features':... 0.993997 1.777071 7.538544 2.745641 0.987258 2.768031 16.714941 4.088391
13 rforest {'criterion': 'poisson', 'max_features': 6, 'm... 0.995041 1.633966 6.227993 2.495595 0.986598 2.791944 17.581126 4.192985
14 rforest {'criterion': 'absolute_error', 'max_features'... 0.993344 1.943850 8.359785 2.891329 0.986437 2.910056 17.792976 4.218172
0 linear {'copy_X': True, 'fit_intercept': True, 'n_job... 0.984094 2.737370 19.975873 4.469438 0.983467 2.860855 21.688595 4.657102
4 lasso {'alpha': 0.00015499889632899263} 0.983848 2.651807 20.285131 4.503902 0.983224 2.793339 22.007038 4.691166
1 lasso {'alpha': 0.00018375666615060612} 0.983748 2.643162 20.410535 4.517802 0.983129 2.785325 22.131543 4.704417
2 lasso {'alpha': 0.00022582145708762807} 0.983572 2.633487 20.632317 4.542281 0.982962 2.779027 22.350542 4.727636
3 lasso {'alpha': 0.00022794412681753656} 0.983562 2.633147 20.644716 4.543646 0.982953 2.778929 22.362755 4.728927
5 lasso {'alpha': 0.0002627188109898158} 0.983387 2.628571 20.864359 4.567752 0.982789 2.779341 22.578708 4.751706
17 knn {'leaf_size': 28, 'n_neighbors': 5, 'p': 2, 'w... 1.000000 0.000000 0.000000 0.000000 0.975539 3.576428 32.088355 5.664658
16 knn {'leaf_size': 7, 'n_neighbors': 4, 'p': 2, 'we... 1.000000 0.000000 0.000000 0.000000 0.974794 3.743922 33.066542 5.750352
18 knn {'leaf_size': 18, 'n_neighbors': 6, 'p': 2, 'w... 1.000000 0.000000 0.000000 0.000000 0.974527 3.621026 33.416091 5.780665
10 dtree {'max_depth': 23, 'max_features': None, 'min_s... 0.995139 1.741500 6.104337 2.470696 0.973796 4.253417 34.375232 5.863039
19 knn {'leaf_size': 9, 'n_neighbors': 3, 'p': 2, 'we... 1.000000 0.000000 0.000000 0.000000 0.973567 3.868463 34.675625 5.888601
7 dtree {'max_depth': 30, 'max_features': 6, 'min_samp... 0.998618 0.728214 1.735818 1.317504 0.972962 4.455750 35.469846 5.955657
6 dtree {'max_depth': 31, 'max_features': None, 'min_s... 1.000000 0.000000 0.000000 0.000000 0.971959 4.527000 36.785733 6.065124
9 dtree {'max_depth': 37, 'max_features': 6, 'min_samp... 0.997024 1.266952 3.737929 1.933372 0.971806 4.427972 36.986470 6.081650
20 knn {'leaf_size': 1, 'n_neighbors': 5, 'p': 4, 'we... 0.976595 3.308671 29.394203 5.421642 0.971088 3.946267 37.927531 6.158533
8 dtree {'max_depth': 11, 'max_features': 6, 'min_samp... 0.998946 0.623212 1.323978 1.150642 0.967522 4.624231 42.605800 6.527312

For max fuel temperature, all models predicted this output well, with test R2 above 0.96. The strong performance of linear and lasso indicates that the max fuel temperature is approximately linear in the inputs.

[20]:
postprocessor.metrics(y="avg_Tcool")
[20]:
Model Types Parameter Configurations Train R2 Train MAE Train MSE Train RMSE Test R2 Test MAE Test MSE Test RMSE
22 nn {'batch_size': 8, 'dropout': 0, 'learning_rate... 0.999727 0.010189 0.000152 0.012345 0.999743 0.010191 0.000151 0.012292
21 nn {'batch_size': 8, 'dropout': 0, 'learning_rate... 0.999660 0.010413 0.000190 0.013768 0.999643 0.010754 0.000210 0.014492
25 nn {'batch_size': 21, 'dropout': 0, 'learning_rat... 0.999613 0.013282 0.000216 0.014697 0.999611 0.013677 0.000229 0.015138
23 nn {'batch_size': 13, 'dropout': 0, 'learning_rat... 0.999386 0.014937 0.000343 0.018511 0.999404 0.015323 0.000351 0.018739
24 nn {'batch_size': 8, 'dropout': 0, 'learning_rate... 0.999008 0.017018 0.000554 0.023529 0.998942 0.017794 0.000623 0.024961
11 rforest {'criterion': 'squared_error', 'max_features':... 0.998422 0.019401 0.000881 0.029684 0.990699 0.046547 0.005477 0.074007
12 rforest {'criterion': 'poisson', 'max_features': None,... 0.998453 0.018792 0.000863 0.029384 0.990254 0.049577 0.005739 0.075758
15 rforest {'criterion': 'squared_error', 'max_features':... 0.996052 0.029694 0.002204 0.046951 0.988569 0.051719 0.006732 0.082046
13 rforest {'criterion': 'poisson', 'max_features': 6, 'm... 0.996944 0.026526 0.001706 0.041308 0.988534 0.051929 0.006752 0.082171
14 rforest {'criterion': 'absolute_error', 'max_features'... 0.995187 0.034239 0.002687 0.051836 0.988280 0.055895 0.006901 0.083074
0 linear {'copy_X': True, 'fit_intercept': True, 'n_job... 0.982538 0.069947 0.009749 0.098737 0.980755 0.074340 0.011333 0.106454
4 lasso {'alpha': 0.00015499889632899263} 0.982454 0.070464 0.009796 0.098974 0.980644 0.075362 0.011398 0.106763
1 lasso {'alpha': 0.00018375666615060612} 0.982420 0.070589 0.009815 0.099070 0.980605 0.075581 0.011421 0.106869
2 lasso {'alpha': 0.00022582145708762807} 0.982360 0.070788 0.009848 0.099239 0.980538 0.075904 0.011460 0.107053
3 lasso {'alpha': 0.00022794412681753656} 0.982357 0.070799 0.009850 0.099248 0.980535 0.075920 0.011462 0.107063
5 lasso {'alpha': 0.0002627188109898158} 0.982298 0.070972 0.009883 0.099415 0.980470 0.076199 0.011501 0.107241
17 knn {'leaf_size': 28, 'n_neighbors': 5, 'p': 2, 'w... 1.000000 0.000000 0.000000 0.000000 0.980326 0.074259 0.011585 0.107634
16 knn {'leaf_size': 7, 'n_neighbors': 4, 'p': 2, 'we... 1.000000 0.000000 0.000000 0.000000 0.979742 0.076561 0.011929 0.109221
18 knn {'leaf_size': 18, 'n_neighbors': 6, 'p': 2, 'w... 1.000000 0.000000 0.000000 0.000000 0.979589 0.075897 0.012019 0.109632
19 knn {'leaf_size': 9, 'n_neighbors': 3, 'p': 2, 'we... 1.000000 0.000000 0.000000 0.000000 0.978625 0.081296 0.012587 0.112191
9 dtree {'max_depth': 37, 'max_features': 6, 'min_samp... 0.998319 0.019348 0.000938 0.030634 0.977002 0.085462 0.013543 0.116373
10 dtree {'max_depth': 23, 'max_features': None, 'min_s... 0.997014 0.028430 0.001667 0.040833 0.976706 0.086039 0.013717 0.117119
7 dtree {'max_depth': 30, 'max_features': 6, 'min_samp... 0.999210 0.011467 0.000441 0.021007 0.976592 0.085899 0.013784 0.117405
6 dtree {'max_depth': 31, 'max_features': None, 'min_s... 1.000000 0.000000 0.000000 0.000000 0.976290 0.085863 0.013962 0.118160
20 knn {'leaf_size': 1, 'n_neighbors': 5, 'p': 4, 'we... 0.983943 0.066562 0.008965 0.094683 0.976052 0.082846 0.014102 0.118753
8 dtree {'max_depth': 11, 'max_features': 6, 'min_samp... 0.999315 0.010149 0.000382 0.019553 0.975428 0.087764 0.014469 0.120289

Average coolant temperature was also well predicted by all models.

We can see the parameters of the best configuration (by Test R2) for each model type with get_params.

[21]:
for model in model_settings["models"]:
    print(postprocessor.get_params(model_type=model), "\n")
  Model Types  copy_X  fit_intercept n_jobs   normalize  positive
0      linear    True           True   None  deprecated     False

  Model Types     alpha
0       lasso  0.000155

  Model Types  max_depth  max_features  min_samples_leaf  min_samples_split
0       dtree         37             6                 1                  4

  Model Types  leaf_size  n_neighbors  p   weights
0         knn         28            5  2  distance

  Model Types      criterion  max_features  min_samples_leaf  \
0     rforest  squared_error             4                 1

   min_samples_split  n_estimators
0                  3           186

  Model Types  batch_size  dropout  learning_rate mid_num_node_strategy  \
0          nn           8        0       0.002693              constant

   num_layers  start_num_nodes
0           2              310

We can visualize the performance of each model with diagonal validation plots. These plots show the predicted output against the actual output. For the plots below we will use burst width.

[27]:
models = np.array([["linear", "lasso"], ["dtree", "knn"], ["rforest", "nn"]])

output = ["burst_width"]

fig = plt.figure(constrained_layout=True, figsize=(10,15))
gs = GridSpec(models.shape[0], models.shape[1], figure=fig)
for i in range(models.shape[0]):
    for j in range(models.shape[1]):
        if models[i, j] is not None:
            ax = fig.add_subplot(gs[i, j])
            ax = postprocessor.diagonal_validation_plot(
                model_type=models[i, j],
                y=output,
            )
            ax.set_title(models[i, j])
../_images/examples_rod_ejection_33_0.png

We see that all models except linear and lasso do relatively well predicting burst width. nn has the best performance according to these diagonal validation plots.

Similarly, the validation_plot function produces validation plots that show the absolute relative error for each burst width prediction.

[26]:
fig = plt.figure(constrained_layout=True, figsize=(15,20))
gs = GridSpec(models.shape[0], models.shape[1], figure=fig)
for i in range(models.shape[0]):
    for j in range(models.shape[1]):
        if models[i, j] is not None:
            ax = fig.add_subplot(gs[i, j])
            ax = postprocessor.validation_plot(
                model_type=models[i, j],
                y=output,
            )
            ax.set_title(models[i, j])
../_images/examples_rod_ejection_35_0.png

The performance gap of the linear model to the others is evident in the magnitude of the relative error.
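The quantity on the vertical axis of these plots can be sketched as follows, assuming the absolute relative error is defined as \(\frac{|y_i - \hat{y}_i|}{|y_i|} \times 100\). That percentage definition is an assumption about validation_plot, and the predictions below are hypothetical.

```python
import numpy as np

y_true = np.array([0.315, 0.250, 0.425, 0.290, 0.315])  # burst_width samples (s)
y_pred = np.array([0.310, 0.260, 0.400, 0.295, 0.320])  # hypothetical predictions (s)

# Absolute relative error of each prediction, in percent.
abs_rel_err_pct = np.abs(y_pred - y_true) / np.abs(y_true) * 100.0

for actual, err in zip(y_true, abs_rel_err_pct):
    print(f"actual = {actual:.3f} s, error = {err:.2f} %")
```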

Finally, the learning curve of the most performant nn is shown by nn_learning_plot.

[24]:
fig, ax = plt.subplots(figsize=(8,8))
ax = postprocessor.nn_learning_plot()
../_images/examples_rod_ejection_37_0.png

The validation curve is below the training curve; therefore, the nn is not overfit.

References

    1. Finnemann and A. Galati, “NEACRP 3-D LWR Core Transient Benchmark,” NEACRP-L-335, Revision 1, 1992.
