Test rig condition monitoring and predictive maintenance. Training

ivanokhotnikov/test_rig_forecast_training

Training pipeline

This repository contains the source code and pipeline configuration that automate retraining of the test rig forecaster. Cloud Functions triggers retraining on a Cloud Storage "Object finalized" event (see triggers). Cloud Build tests, builds and publishes the new Docker image to the Google Container Registry. Vertex AI Pipelines, defined with the Kubeflow Pipelines SDK, orchestrates execution of the pipeline steps. Vertex AI Experiments tracks parameters and metrics during training, and a TensorBoard callback in the training step records how the losses and metrics progress over the course of training. Finally, Vertex AI Model Registry stores the champion models.
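The trigger described above can be sketched roughly as follows. This is a minimal illustration and not the repository's function: the `raw/` prefix, the `trigger_object` parameter and the injected `submit` callable are all assumptions made so that the example runs without GCP credentials. Only the `bucket` and `name` keys of the Cloud Storage event payload are relied on.

```python
from typing import Any, Callable, Dict, Mapping, Optional

# Hypothetical prefix for new raw data files; the real bucket layout may differ.
RAW_PREFIX = "raw/"


def build_parameter_values(event: Mapping[str, Any]) -> Dict[str, str]:
    """Turn a Cloud Storage event payload into pipeline parameters.

    `raw_data_path` is a parameter name from the steps table; `trigger_object`
    is illustrative only.
    """
    bucket = event["bucket"]
    return {
        "raw_data_path": f"gs://{bucket}/{RAW_PREFIX}",
        "trigger_object": f"gs://{bucket}/{event['name']}",
    }


def on_object_finalized(
    event: Mapping[str, Any],
    context: Any = None,
    submit: Optional[Callable[[Dict[str, str]], Any]] = None,
) -> Optional[Dict[str, str]]:
    """Entry point shaped like a background function for an "Object finalized" event.

    `submit` is injected so the sketch stays runnable offline. In a deployment
    it might wrap a Vertex AI pipeline submission such as
    google.cloud.aiplatform.PipelineJob(...).submit().
    """
    if not event.get("name", "").startswith(RAW_PREFIX):
        return None  # ignore objects outside the (assumed) raw data prefix
    parameter_values = build_parameter_values(event)
    if submit is not None:
        submit(parameter_values)
    return parameter_values


if __name__ == "__main__":
    fake_event = {"bucket": "example-bucket", "name": "raw/run_001.csv"}
    on_object_finalized(fake_event, submit=print)
```

Passing the submission step in as a callable keeps the event parsing testable on its own; whether the actual function filters objects at all is not stated in this README.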

Pipeline diagram

Training pipeline

Pipeline steps

| Step | Description | Inputs | Outputs | Parameters |
| --- | --- | --- | --- | --- |
| `read-raw-data` | Reads the raw data files from the GCS raw data bucket and uploads the combined data frame to the interim data directory in the GCS bucket | | `interim_data`, `all_features` | `raw_data_path`, `features_path`, `interim_data_path` |
| `importer` | Imports the interim features | | `interim_features` | `artifact_uri` |
| `build-features` | Reads the interim data, builds features (downcasts floats, removes NaNs and the step zero data, adds the power and time features to the processed data) and saves the processed data | `interim_features`, `interim_data` | `processed_data`, `processed_features` | `features_path`, `processed_data_path` |
| `split-data` | Splits the processed data into train and test data | `processed_data` | `train_data`, `test_data` | `train_data_size` |
| `import-forecast-features` | Imports the forecast features | | `forecast_features` | `features_path` |
| `train` | Instantiates the RNN model and trains it on the train dataset. Saves the trained scaler and the Keras model to the metadata store, logs the training metrics and the TensorBoard event file | `train_data` | `scaler_model`, `keras_model`, `train_metrics`, `parameters` | `project_id`, `region`, `feature`, `lookback`, `lstm_units`, `learning_rate`, `epochs`, `batch_size`, `patience`, `timestamp`, `train_data_size`, `pipelines_path` |
| `evaluate` | Evaluates the trained Keras model and saves the evaluation metrics to the metadata store | `test_data`, `scaler_model`, `keras_model` | `eval_metrics` | `project_id`, `region`, `feature`, `lookback`, `batch_size`, `timestamp` |
| `import-champion-metrics` | Imports the champion metrics | | `champion_metrics` | `features_path` |
| `compare-models` | Compares the evaluation metrics of the trained (challenger) model with those of the champion (the model in the model registry) | `eval_metrics`, `champion_metrics` | | `evaluation_metric`, `absolute_difference` |
| `upload-model-to-registry` | Uploads the scaler and Keras models to the model registry, together with the parameters and metrics of the model | `parameters`, `scaler_model`, `keras_model`, `eval_metrics` | | `feature`, `project_id`, `region`, `deploy_image`, `models_path` |
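The `compare-models` step can be illustrated with a small sketch. The README does not spell out the decision rule, so everything here is an assumption: the metric is treated as loss-like (lower is better), `absolute_difference` is read as the minimum margin by which the challenger must beat the champion, and a missing champion means the challenger wins by default. The metric name `rmse` is hypothetical.

```python
from typing import Mapping, Optional


def challenger_wins(
    eval_metrics: Mapping[str, float],
    champion_metrics: Optional[Mapping[str, float]],
    evaluation_metric: str,
    absolute_difference: float = 0.0,
) -> bool:
    """Return True if the challenger should replace the champion.

    Assumed rule (not confirmed by the repository): the challenger wins when
    its metric is lower than the champion's by more than `absolute_difference`.
    """
    if not champion_metrics or evaluation_metric not in champion_metrics:
        # Assumption: with no champion to compare against, promote the challenger.
        return True
    challenger = eval_metrics[evaluation_metric]
    champion = champion_metrics[evaluation_metric]
    return (champion - challenger) > absolute_difference


if __name__ == "__main__":
    # Hypothetical metric values, for illustration only.
    print(challenger_wins({"rmse": 1.0}, {"rmse": 2.0}, "rmse", 0.5))
    print(challenger_wins({"rmse": 1.8}, {"rmse": 2.0}, "rmse", 0.5))
```

Requiring a margin rather than a strict improvement avoids replacing the champion over differences that may be noise; the actual step may use a different criterion.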

Directed acyclic graph

Training pipeline's DAG
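As a rough illustration of how such a graph follows from the steps, the sketch below derives step dependencies by matching each step's input artifacts to the step that produces them, then computes one valid execution order. The assignment of artifact names to inputs and outputs is an interpretation of the list under Pipeline steps, and the real pipeline may contain additional control-flow edges (for example, gating the upload on the comparison result) that artifact names alone do not reveal.

```python
from graphlib import TopologicalSorter
from typing import Dict, Set, Tuple

# step -> (input artifacts, output artifacts); an interpretation of the steps
# listed under "Pipeline steps", not the pipeline definition itself.
STEPS: Dict[str, Tuple[Tuple[str, ...], Tuple[str, ...]]] = {
    "read-raw-data": ((), ("interim_data", "all_features")),
    "importer": ((), ("interim_features",)),
    "build-features": (
        ("interim_features", "interim_data"),
        ("processed_data", "processed_features"),
    ),
    "split-data": (("processed_data",), ("train_data", "test_data")),
    "import-forecast-features": ((), ("forecast_features",)),
    "train": (
        ("train_data",),
        ("scaler_model", "keras_model", "train_metrics", "parameters"),
    ),
    "evaluate": (("test_data", "scaler_model", "keras_model"), ("eval_metrics",)),
    "import-champion-metrics": ((), ("champion_metrics",)),
    "compare-models": (("eval_metrics", "champion_metrics"), ()),
    "upload-model-to-registry": (
        ("parameters", "scaler_model", "keras_model", "eval_metrics"),
        (),
    ),
}


def build_dependencies(steps) -> Dict[str, Set[str]]:
    """Map each step to the set of steps that produce its input artifacts."""
    producer = {
        artifact: step
        for step, (_, outputs) in steps.items()
        for artifact in outputs
    }
    return {
        step: {producer[a] for a in inputs if a in producer}
        for step, (inputs, _) in steps.items()
    }


deps = build_dependencies(STEPS)
# One valid execution order; steps with no mutual dependency may run in parallel.
order = list(TopologicalSorter(deps).static_order())

if __name__ == "__main__":
    for step in order:
        print(step, "<-", sorted(deps[step]))
```

`graphlib` is part of the standard library from Python 3.9, so the sketch needs no Kubeflow or Vertex AI dependencies to run.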
