Why This Project?
Predicting a failure before it happens can save thousands in unplanned downtime and repairs. In this case study, I show how even simulated data can be used to build a predictive maintenance model for a typical industrial motor.
The Dataset
I asked ChatGPT to generate 3 months of sensor data (sampled every 10 seconds) from a three-phase electric motor. The data are randomly generated rather than measured, but they are still a good basis for training and testing an AI model.
Inputs
- 3-axis vibration sensors (X, Y, Z)
- Motor temperature
- Electrical current draw
Outputs
- Binary label: 0 = normal operation, 1 = failure
Before failure events, both vibration and temperature showed a noticeable increase — a signal the model can learn to detect.
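To make the dataset description concrete, here is a minimal sketch of how such a signal could be simulated with NumPy and pandas. It is not the generator ChatGPT actually produced: the function name `simulate_motor`, the column names, the baselines, the noise levels, and the size of the pre-failure drift are all assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

SAMPLE_PERIOD_S = 10  # from the article: one reading every 10 seconds


def simulate_motor(n_samples=5000, failure_starts=(2000, 4500),
                   ramp=200, failure_len=20, seed=0):
    """Toy motor data: noisy baselines plus a drift before each failure.

    All baselines, noise levels, and drift sizes are made-up placeholders.
    """
    rng = np.random.default_rng(seed)
    vib = rng.normal(0.5, 0.05, size=(n_samples, 3))  # X, Y, Z vibration
    temp = rng.normal(60.0, 1.0, size=n_samples)      # motor temperature
    current = rng.normal(12.0, 0.3, size=n_samples)   # current draw
    label = np.zeros(n_samples, dtype=int)            # 0 = normal operation

    drift = np.linspace(0.0, 1.0, ramp)               # rises from 0 to 1
    for start in failure_starts:
        # Vibration and temperature climb during the ramp before the failure.
        vib[start - ramp:start] += 0.3 * drift[:, None]
        temp[start - ramp:start] += 8.0 * drift
        label[start:start + failure_len] = 1          # 1 = failure

    return pd.DataFrame({
        "vib_x": vib[:, 0], "vib_y": vib[:, 1], "vib_z": vib[:, 2],
        "temperature": temp, "current": current, "failure": label,
    })


df = simulate_motor()

# Size of a full log, assuming "3 months" means 90 days.
samples_90_days = 90 * 24 * 3600 // SAMPLE_PERIOD_S
print(df.shape, int(df["failure"].sum()), samples_90_days)
```

The sketch uses only 5000 samples to stay fast. At one reading every 10 seconds, a 90-day log would hold 777,600 rows.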
The Technical Pipeline
- Preprocessing: rolling averages, temperature deltas, normalization
- Feature Engineering: mean vibration, temperature trend, fluctuations
- Model: Random Forest (scikit-learn)
- Evaluation: confusion matrix, feature importance, ROC curve, overall success score
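The pipeline above can be sketched end to end as follows. This is an illustrative outline under stated assumptions, not the code behind the report: the toy data, the column names, the 30-sample window, the 70/30 chronological split, and the hyperparameters are all hypothetical. Only the building blocks (rolling averages, temperature deltas, normalization, Random Forest from scikit-learn) come from the article.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix, recall_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Toy data standing in for the simulated motor log (all values made up).
rng = np.random.default_rng(0)
n = 3000
vib = rng.normal(0.5, 0.05, size=(n, 3))
temp = rng.normal(60.0, 1.0, size=n)
current = rng.normal(12.0, 0.3, size=n)
label = np.zeros(n, dtype=int)
for start in (500, 1500, 2500):           # three hypothetical failure events
    vib[start - 100:start + 50] += 0.2    # elevated vibration around the event
    temp[start - 100:start + 50] += 5.0   # elevated temperature around the event
    label[start:start + 50] = 1           # 1 = failure, 0 = normal operation

df = pd.DataFrame({
    "vib_x": vib[:, 0], "vib_y": vib[:, 1], "vib_z": vib[:, 2],
    "temperature": temp, "current": current, "failure": label,
})


def build_features(frame, window=30):
    """Rolling features per sample; `window` is a guessed value."""
    vib_mean = frame[["vib_x", "vib_y", "vib_z"]].mean(axis=1)
    feats = pd.DataFrame({
        "vib_roll_mean": vib_mean.rolling(window).mean(),      # mean vibration
        "vib_roll_std": vib_mean.rolling(window).std(),        # fluctuations
        "temp_roll_mean": frame["temperature"].rolling(window).mean(),
        "temp_delta": frame["temperature"].diff(window),       # temperature trend
        "current_roll_mean": frame["current"].rolling(window).mean(),
    })
    return feats.dropna()  # drop the first rows, where the window is incomplete


X = build_features(df)
y = df.loc[X.index, "failure"]

# Chronological split, so the model is tested on data later than it was trained on.
split = int(len(X) * 0.7)
X_train, X_test = X.iloc[:split], X.iloc[split:]
y_train, y_test = y.iloc[:split], y.iloc[split:]

# The scaler is fitted on the training part only, inside the pipeline.
model = make_pipeline(
    StandardScaler(),
    RandomForestClassifier(n_estimators=100, random_state=0),
)
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
cm = confusion_matrix(y_test, y_pred, labels=[0, 1])
recall = recall_score(y_test, y_pred)
print(cm)
print(f"recall on failures: {recall:.2f}")
```

A Random Forest does not strictly need normalized inputs. The scaler is kept here only because normalization is listed as a preprocessing step.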
Results
The model correctly detected ~66% of the failure cases (a recall of about 0.66) on a highly imbalanced dataset, a solid result for a first iteration. This could be improved with:
- More real-world data
- Additional time-series features
- Advanced algorithms (LSTM, Isolation Forest)
Report out
The code generated a complete report that is understandable even for non-specialists. The first plot is the confusion matrix (Figure 1), which shows actual versus predicted labels. The non-failure cases are predicted accurately, with both the predicted and the actual value equal to 0. Some of the failure cases are predicted correctly, but others are missed. The biggest error is in the false negatives: 4731 samples are predicted as 0 but are actually 1. Reducing this error will take work on both the inputs and the model. A clear example of success is reported at https://aisciencetalk.blog/2024/06/10/the-importance-of-data-quality-while-using-ai/, where the data are consistent and filtered for noise and the predictive model is very sophisticated. There the results reached 100% accuracy.

Figure 1 Confusion matrix
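As a small illustration of how the four cells of a confusion matrix turn into the detection rate quoted in the results, the sketch below uses ten invented labels. They are not the article's data and were picked only so the arithmetic is easy to follow.

```python
from sklearn.metrics import confusion_matrix

# Toy labels, invented for illustration (not the article's data).
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 0]
y_pred = [0, 0, 0, 0, 0, 1, 1, 1, 0, 0]

# Rows are actual classes, columns are predicted classes.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()

recall = tp / (tp + fn)      # share of real failures that were caught
precision = tp / (tp + fp)   # share of failure alarms that were real

print(f"TN={tn} FP={fp} FN={fn} TP={tp}")
print(f"recall={recall:.2f} precision={precision:.2f}")
```

In this toy case two of three failures are caught, so the recall is 2/3. The false-negative cell (actual 1, predicted 0) is the one that holds the 4731 missed samples in Figure 1.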
The feature importance chart (Figure 2) shows that the inputs have almost the same importance in the Random Forest model. This suggests that no unnecessary inputs are included in the model.

Figure 2 Feature importance chart
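For readers who want to reproduce a chart like this, a fitted scikit-learn Random Forest exposes its importances through the `feature_importances_` attribute. The sketch below trains on random toy data with hypothetical feature names, so its ranking says nothing about the real motor model.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Hypothetical feature names; the article does not list the exact columns.
feature_names = ["vib_x", "vib_y", "vib_z", "temperature", "current"]

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
# Toy label driven by the vibration and temperature columns only.
y = ((X[:, 0] + X[:, 1] + X[:, 2] + X[:, 3]) > 1.5).astype(int)

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Impurity-based importances, normalized by scikit-learn to sum to 1.
importances = dict(zip(feature_names, model.feature_importances_))
for name, score in sorted(importances.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:12s} {score:.3f}")
```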
The ROC (Receiver Operating Characteristic) curve (Figure 3), like the confusion matrix, shows that there are false positives and that the model can be improved with a more sophisticated algorithm and, most importantly, with more realistic data.

Figure 3 ROC curve
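A ROC curve is built from predicted failure probabilities rather than hard 0/1 predictions. The sketch below shows the two scikit-learn calls involved, using eight invented scores. With a real model the scores would typically come from `predict_proba(X_test)[:, 1]`.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

# Invented labels and failure probabilities, for illustration only.
y_true = np.array([0, 0, 0, 0, 1, 0, 1, 1])
y_score = np.array([0.05, 0.10, 0.20, 0.40, 0.35, 0.60, 0.70, 0.90])

# Each threshold yields one (false positive rate, true positive rate) point.
fpr, tpr, thresholds = roc_curve(y_true, y_score)

# Area under the curve: 0.5 is chance level, 1.0 is a perfect ranking.
auc = roc_auc_score(y_true, y_score)
print(f"AUC = {auc:.3f}")
```

Here 13 of the 15 failure/normal pairs are ranked correctly, so the AUC is 13/15. A curve that hugs the top-left corner corresponds to an AUC close to 1.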
There is no direct way to spot the failure by looking at the raw data (Figure 4, Figure 5, Figure 6), as there is no clear trace of something going wrong. However, AI can anticipate the failure that will occur in the motor.

Figure 4 Current signal

Figure 5 Temperature signal

Figure 6 Vibration signals
Need Something Similar?
This is a hands-on example of how artificial intelligence can be applied in real-world industrial engineering scenarios.
If you have sensor data (even raw) and want to extract value from it, let’s talk.
Contact us → aisciencetalkblog@gmail.com
Copyright
Author: Simone Togni
Platform: aisciencetalk.blog