Heat Conduction

Intputs

  • qprime: Linear heat generation rate (\(\frac{W}{m}\))

  • mdot: Mass flow rate (\(\frac{g}{s}\))

  • Tin: Temperature of the fuel boundary (\(K\))

  • R: Fuel radius (\(m\))

  • L: Fuel length (\(m\))

  • Cp: Heat capacity (\(\frac{J}{g \cdot K}\))

  • k: Thermal conductivity (\(\frac{W}{m \cdot K}\))

Output

  • T: Fuel centerline temperature (\(K\))

This data set consists of 1000 data points with 7 inputs and 1 output. The data set was constructed through Latin hypercube sampling of the 7 input parameters for heat conduction through a fuel rod. These samples were then used to solve for the fuel centerline temperature analytically. The geometry of the problem is illustrated in the figure below, and it is assumed volumetric heat generation is uniform radially. The problem is defined by

\begin{equation} \frac{1}{r}\frac{d}{dr}(kr\frac{dT}{dr}) + q''' = 0, \end{equation}

with two boundary conditions: \(\frac{dT}{dr}|_{r = 0} = 0\) and \(T(R) = T_{in}\). Therefore, the temperature profile in the fuel is

\begin{equation} T(r) = \frac{q'}{4\pi k}(1 - (r / R)^2) + T_{in}. \end{equation}

rod.png

[6]:
import pyMAISE as mai
import time
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.gridspec import GridSpec
from scipy.stats import uniform, randint
from sklearn.model_selection import ShuffleSplit

# Plot settings
matplotlib_settings = {
    "font.size": 14,
    "legend.fontsize": 12,
}
plt.rcParams.update(**matplotlib_settings)

pyMAISE Initialization

Starting any pyMAISE job requires initialization, this includes the definition of global settings used throughout pyMAISE. For this problem the pyMAISE defaults are used:

  • verbosity: 0 \(\rightarrow\) pyMAISE prints no outputs,

  • random_state: None \(\rightarrow\) No random seed is set,

  • test_size: 0.3 \(\rightarrow\) 30% of the data is used for testing,

  • num_configs_saved: 5 \(\rightarrow\) The top 5 hyper-parameter configurations are saved for each model.

The settings are initialized and the heat conduction preprocessor specific to this data set is retrieved.

[7]:
global_settings = mai.settings.init()
preprocessor = mai.load_heat()

The heat conduction data set has 7 inputs

[8]:
preprocessor.inputs.head()
[8]:
qprime mdot Tin R L Cp k
0 35987.992759 206.185816 573.151869 0.004901 3.448155 4.096140 0.960945
1 38481.055798 192.378974 573.150960 0.004966 3.436833 4.249182 1.011272
2 39143.292108 205.076928 573.153975 0.005210 3.681457 4.237540 0.994646
3 38687.579644 199.594924 573.150777 0.004864 3.624594 4.158921 1.028158
4 40469.561181 194.142084 573.150977 0.004914 3.731222 4.244278 1.018738

and 1 output with 1000 data points.

[9]:
preprocessor.outputs.head()
[9]:
T
0 1034.133784
1 1170.316042
2 1164.893565
3 1205.250040
4 1444.718666

To better understand the data here is a correlation matrix of the data.

[10]:
fig, ax = plt.subplots(figsize=(15,10))
fig, ax = preprocessor.correlation_matrix(fig=fig, ax=ax, colorbar=False, annotations=True)
../_images/examples_heat_conduction_9_0.png

As expected there is a strong correlation between linear heat generation rate, qprime, and the centerline fuel temperature, T.

Prior to model training the data is min-max scaled to make each feature’s effect size comperable. Additionaly, this can improve the performance of some models.

[12]:
data = preprocessor.min_max_scale()

Model Initialization

Given this data set has a 1-dimensional output we will compare the performance of 7 machine learning (ML) models:

  • Linear regression: linear,

  • Lasso regression: lasso,

  • Support vector regression: svr,

  • Decision tree regression: dtree,

  • Random forest regression: rforest,

  • K-nearest neighbors regression: knn,

  • Sequential dense neural networks: nn.

For hyper-parameter tuning each model must be initialized. We will use the Scikit-learn defaults for the classical ML models (linear, lasso, svr, dtree, rforest, and knn); therefore, they are only specified in the models parameter of the model_settings dictionary. However, we must specify many parameters for the nn model that define the layers, optimizer, and training.

[13]:
model_settings = {
    "models": ["linear", "lasso", "dtree", "svr", "knn", "rforest", "nn"],
    "nn": {
        # Sequential
        "num_layers": 4,
        "dropout": True,
        "rate": 0.5,
        "validation_split": 0.15,
        "loss": "mean_absolute_error",
        "metrics": ["mean_absolute_error"],
        "batch_size": 8,
        "epochs": 50,
        "warm_start": True,
        "jit_compile": False,
        # Starting Layer
        "start_num_nodes": 100,
        "start_kernel_initializer": "normal",
        "start_activation": "relu",
        "input_dim": preprocessor.inputs.shape[1], # Number of inputs
        # Middle Layers
        "mid_num_node_strategy": "linear", # Middle layer nodes vary linearly from 'start_num_nodes' to 'end_num_nodes'
        "mid_kernel_initializer": "normal",
        "mid_activation": "relu",
        # Ending Layer
        "end_num_nodes": preprocessor.outputs.shape[1], # Number of outputs
        "end_activation": "linear",
        "end_kernel_initializer": "normal",
        # Optimizer
        "optimizer": "adam",
        "learning_rate": 5e-4,
    },
}
tuning = mai.Tuning(data=data, model_settings=model_settings)

Hyper-parameter Tuning

While three hyper-parameter tuning functions are supported (grid_search, random_search, and bayesian_search), random_search and bayesian_search are used for the classical ML and nn models, respectively. random_search randomly samples a defined parameter space and the number of models generated is easily defined. A large number of classical models are generated through random_search as they are relatively quick to train. However, the prohibative time required to train neural networks makes bayesian_search more appealing as the search converges to the optimal hyper-parameters in relatively few iterations. The hyper-parameter search spaces are outlined below, many with Scipy uniform distributions. Cross validation is used in both search methods to eliminate bias from the data set.

[14]:
random_search_spaces = {
    "lasso": {
        "alpha": uniform(loc=0.0001, scale=0.0099), # 0.0001 - 0.01
    },
    "dtree": {
        "max_depth": randint(low=5, high=50), # 5 - 50
        "max_features": [None, "sqrt", "log2", 2, 4, 6],
        "min_samples_split": randint(low=2, high=20), # 2 - 20
        "min_samples_leaf": randint(low=1, high=20), # 1 - 20
    },
    "svr": {
        "kernel": ["linear", "poly", "rbf", "sigmoid"],
        "degree": randint(low=1, high=5),
        "gamma": ["scale", "auto"],
    },
    "rforest": {
        "n_estimators": randint(low=50, high=200), # 50 - 200
        "criterion": ["squared_error", "absolute_error", "poisson"],
        "min_samples_split": randint(low=2, high=20), # 2 - 20
        "min_samples_leaf": randint(low=1, high=20), # 1 - 20
        "max_features": [None, "sqrt", "log2", 2, 4, 6],
    },
    "knn": {
        "n_neighbors": randint(low=1, high=20), # 1 - 20
        "weights": ["uniform", "distance"],
        "leaf_size": randint(low=1, high=30), # 1 - 30
        "p": randint(low=1, high=10), # 1 - 10
    },
}
bayesian_search_spaces = {
    "nn": {
        "mid_num_node_strategy": ["constant", "linear"],
        "batch_size": [8, 64],
        "learning_rate": [1e-5, 0.001],
        "num_layers": [2, 6],
        "start_num_nodes": [25, 300],
    },
}

start = time.time()
random_search_configs = tuning.random_search(
    param_spaces=random_search_spaces,
    models=["linear"] + list(random_search_spaces.keys()),
    n_iter=300,
    cv=ShuffleSplit(n_splits=5, test_size=0.25, random_state=global_settings.random_state),
)
bayesian_search_configs = tuning.bayesian_search(
    param_spaces=bayesian_search_spaces,
    models=bayesian_search_spaces.keys(),
    n_iter=50,
    cv=5,
)
stop = time.time()
print("Hyper-parameter tuning took " + str((stop - start) / 60) + " minutes to process.")
Hyper-parameter tuning search space was not provided for linear, doing manual fit
Hyper-parameter tuning took 64.99195301135381 minutes to process.

With the conclusion of hyper-parameter tuning we can see the training results of each iteration of the bayesian search.

[15]:
fig, ax = plt.subplots(figsize=(8,8))
ax = tuning.convergence_plot(model_types="nn")
../_images/examples_heat_conduction_17_0.png

After about 35 iterations the bayesian search converges on the optimal solution for this parameter space.

Model Post-processing

With the models tuned and the top num_configs_saved saved, we can now pass these models to the PostProcessor for model comparison and analysis. We will increase the epochs of the nn models for better performance.

[16]:
new_model_settings = {
    "nn": {"epochs": 200}
}
postprocessor = mai.PostProcessor(
    data=data,
    models_list=[random_search_configs, bayesian_search_configs],
    new_model_settings=new_model_settings,
    yscaler=preprocessor.yscaler,
)

To compare the performance of these models we will compute 4 metrics for both the training and testing data:

  • mean squared error MSE \(=\frac{1}{n}\sum^n_{i = 1}(y_i - \hat{y_i})^2\),

  • root mean squared error RMSE \(=\sqrt{\frac{1}{n}\sum^n_{i = 1}(y_i - \hat{y_i})^2}\),

  • mean absolute error MAE = \(=\frac{1}{n}\sum^n_{i = 1}|y_i - \hat{y_i}|\),

  • and r-squared R2 \(=1 - \frac{\sum^n_{i = 1}(y_i - \hat{y_i})^2}{\sum^n_{i = 1}(y_i - \bar{y_i})^2}\),

where \(y\) is the actual outcome, \(\bar{y}\) is the average outcome, \(\bar{y}\) is the model predicted outcome, and \(n\) is the number of observations.

[17]:
postprocessor.metrics()
[17]:
Model Types Parameter Configurations Train R2 Train MAE Train MSE Train RMSE Test R2 Test MAE Test MSE Test RMSE
16 rforest {'criterion': 'poisson', 'max_features': 6, 'm... 0.998453 3.591531 37.184198 6.097885 0.990350 9.288734 207.756209 14.413751
20 rforest {'criterion': 'poisson', 'max_features': 6, 'm... 0.996445 6.080878 85.427674 9.242709 0.989620 10.205636 223.480141 14.949252
17 rforest {'criterion': 'poisson', 'max_features': None,... 0.997094 5.037790 69.824015 8.356077 0.988891 9.804428 239.182928 15.465540
19 rforest {'criterion': 'squared_error', 'max_features':... 0.995783 5.840237 101.341391 10.066846 0.988845 9.861571 240.177509 15.497661
18 rforest {'criterion': 'absolute_error', 'max_features'... 0.995097 5.859402 117.824900 10.854718 0.987887 10.476799 260.805013 16.149459
8 dtree {'max_depth': 9, 'max_features': None, 'min_sa... 0.999305 2.733475 16.705728 4.087264 0.978193 14.604683 469.513283 21.668255
7 dtree {'max_depth': 33, 'max_features': 6, 'min_samp... 0.997990 4.899830 48.294779 6.949445 0.977932 15.095519 475.123305 21.797323
6 dtree {'max_depth': 24, 'max_features': None, 'min_s... 0.996641 5.159630 80.723680 8.984636 0.977861 14.439304 476.653943 21.832406
9 dtree {'max_depth': 23, 'max_features': None, 'min_s... 0.997745 4.412194 54.189393 7.361345 0.976071 15.054146 515.199978 22.698017
10 dtree {'max_depth': 25, 'max_features': 6, 'min_samp... 0.995978 5.808797 96.664687 9.831820 0.974757 15.482882 543.477880 23.312612
26 nn {'batch_size': 8, 'learning_rate': 0.000721275... 0.900750 35.952520 2385.133483 48.837828 0.873662 39.635764 2720.085656 52.154440
28 nn {'batch_size': 14, 'learning_rate': 0.00089049... 0.866768 43.658858 3201.772436 56.584207 0.838317 45.660018 3481.078535 59.000666
30 nn {'batch_size': 14, 'learning_rate': 0.00089754... 0.859826 46.918987 3368.615820 58.039778 0.831708 48.253050 3623.369092 60.194427
5 lasso {'alpha': 0.0001824989270416765} 0.849727 52.206878 3611.305127 60.094136 0.829573 52.166466 3669.332827 60.575018
4 lasso {'alpha': 0.00016869826770270902} 0.849751 52.194496 3610.727957 60.089333 0.829515 52.168366 3670.576526 60.585283
3 lasso {'alpha': 0.00013742897968849972} 0.849798 52.166681 3609.588039 60.079847 0.829376 52.173664 3673.583780 60.610096
2 lasso {'alpha': 0.00012692803708103} 0.849812 52.157526 3609.257463 60.077096 0.829326 52.175443 3674.652602 60.618913
1 lasso {'alpha': 0.00011717208282240732} 0.849824 52.149020 3608.973873 60.074736 0.829279 52.178128 3675.672142 60.627322
15 svr {'degree': 1, 'gamma': 'scale', 'kernel': 'poly'} 0.848972 52.588788 3629.439049 60.244826 0.828916 52.476532 3683.489056 60.691754
14 svr {'degree': 1, 'gamma': 'scale', 'kernel': 'poly'} 0.848972 52.588788 3629.439049 60.244826 0.828916 52.476532 3683.489056 60.691754
13 svr {'degree': 1, 'gamma': 'scale', 'kernel': 'poly'} 0.848972 52.588788 3629.439049 60.244826 0.828916 52.476532 3683.489056 60.691754
12 svr {'degree': 1, 'gamma': 'scale', 'kernel': 'poly'} 0.848972 52.588788 3629.439049 60.244826 0.828916 52.476532 3683.489056 60.691754
11 svr {'degree': 1, 'gamma': 'scale', 'kernel': 'poly'} 0.848972 52.588788 3629.439049 60.244826 0.828916 52.476532 3683.489056 60.691754
0 linear {'copy_X': True, 'fit_intercept': True, 'n_job... 0.849893 52.047689 3607.316904 60.060943 0.828604 52.215410 3690.200299 60.747019
29 nn {'batch_size': 8, 'learning_rate': 0.000709612... 0.862323 43.002099 3308.608881 57.520508 0.827847 45.971629 3706.503882 60.881063
27 nn {'batch_size': 8, 'learning_rate': 0.000710736... 0.809784 52.969568 4571.188994 67.610569 0.763386 55.993587 5094.357125 71.374765
21 knn {'leaf_size': 21, 'n_neighbors': 10, 'p': 1, '... 1.000000 0.000000 0.000000 0.000000 0.737858 64.180097 5643.967965 75.126347
24 knn {'leaf_size': 7, 'n_neighbors': 8, 'p': 1, 'we... 1.000000 0.000000 0.000000 0.000000 0.728624 62.456548 5842.789462 76.438141
25 knn {'leaf_size': 20, 'n_neighbors': 8, 'p': 1, 'w... 1.000000 0.000000 0.000000 0.000000 0.728624 62.456548 5842.789462 76.438141
23 knn {'leaf_size': 26, 'n_neighbors': 12, 'p': 1, '... 1.000000 0.000000 0.000000 0.000000 0.725792 67.179263 5903.767667 76.835979
22 knn {'leaf_size': 17, 'n_neighbors': 7, 'p': 1, 'w... 1.000000 0.000000 0.000000 0.000000 0.720752 62.599642 6012.271535 77.538839

As this data set represents radial heat conduction, we do not expect linear models to perform well. This is shown by the relatively poor performance of linear and lasso. The best perfroming models were rforest and dtree with test and train r-squared above 0.99.

Using the get_params function we can see the optimal hyper-parameter configurations for each model.

[18]:
for model in ["linear", "lasso", "svr", "dtree", "knn", "rforest", "nn"]:
    print(postprocessor.get_params(model_type=model), "\n")
  Model Types  copy_X  fit_intercept n_jobs  positive
0      linear    True           True   None     False

  Model Types     alpha
0       lasso  0.000182

  Model Types  degree  gamma kernel
0         svr       1  scale   poly

  Model Types  max_depth max_features  min_samples_leaf  min_samples_split
0       dtree          9         None                 1                  3

  Model Types  leaf_size  n_neighbors  p   weights
0         knn         21           10  1  distance

  Model Types criterion  max_features  min_samples_leaf  min_samples_split  \
0     rforest   poisson             6                 1                  4

   n_estimators
0           136

  Model Types  batch_size  learning_rate mid_num_node_strategy  num_layers  \
0          nn           8       0.000721              constant           2

   start_num_nodes
0              300

To visualize the performance of these models we can use the diagonal_validation_plot functions to produce diagonal validation plots.

[19]:
models = np.array([["linear", "lasso"], ["dtree", "svr"], ["knn", "rforest"], ["nn", None]])

fig = plt.figure(constrained_layout=fig, figsize=(10,20))
gs = GridSpec(models.shape[0], models.shape[1], figure=fig)
for i in range(models.shape[0] - 1):
    for j in range(models.shape[1]):
        if models[i, j] != None:
            ax = fig.add_subplot(gs[i, j])
            ax = postprocessor.diagonal_validation_plot(model_type=models[i, j])
            ax.set_title(models[i, j])
ax = fig.add_subplot(gs[-1, :])
ax = postprocessor.diagonal_validation_plot(model_type=models[-1, 0])
_ = ax.set_title(models[-1, 0])
../_images/examples_heat_conduction_26_0.png

The performance differences between rforest/dtree with the other models is apparent along with the overfitting of knn. The predictions of rforest and dtree are closely spread along \(y=x\) while the knn test predictions are over-predicted at lower temperatures and under-predicted at higher temperatures.

Similarly, the validation_plot function produces validation plots that show the absolute relative error for each prediction.

[20]:
fig = plt.figure(constrained_layout=fig, figsize=(10,20))
gs = GridSpec(models.shape[0], models.shape[1], figure=fig)
for i in range(models.shape[0] - 1):
    for j in range(models.shape[1]):
        if models[i, j] != None:
            ax = fig.add_subplot(gs[i, j])
            ax = postprocessor.validation_plot(model_type=models[i, j])
            ax.set_title(models[i, j])
ax = fig.add_subplot(gs[-1, :])
ax = postprocessor.validation_plot(model_type=models[-1, 0])
_ = ax.set_title(models[-1, 0])
../_images/examples_heat_conduction_28_0.png

The performance of the models is best represented by the magnitudes observed on the y-axis; however, even rforest gets as high as \(>9.0\%\) error.

Finally, the learning curve of the most performant nn is shown by nn_learning_plot.

[21]:
fig, ax = plt.subplots(figsize=(8,8))
ax = postprocessor.nn_learning_plot()
../_images/examples_heat_conduction_30_0.png

The validation curve is below the training curve; therefore, the nn is not overfit.

pyMAISElogo.png