The importance of data quality while using AI

Summary

This article shows why data quality matters when you apply a complex AI methodology. It draws on published results to make the impact of data quality concrete.

Why is data quality important

Why is data quality important? Whether you’re using data to train AI, to run a data analysis, or simply for everyday purposes, data quality is your key to success. Imagine, for example, that you’re trying to predict the fuel consumption of your car and you’re relying on your vehicle’s on-board computer.

Well, if the on-board computer is not well calibrated, your calculation will be wrong. In this case, you could either rely on a better system or cross-check the displayed consumption against the fuel volume measured at the filling station.
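The cross-check can be sketched in a few lines of Python. This is only an illustration: it assumes you fill the tank completely, drive, and fill it completely again, so that the refilled litres correspond to the kilometers driven. The function names and the example numbers are hypothetical, not measurements.

```python
def consumption_l_per_100km(litres, km):
    """Fuel consumption in litres per 100 km."""
    if km <= 0:
        raise ValueError("distance must be positive")
    return 100.0 * litres / km


def display_offset_pct(displayed, litres_refilled, km_driven):
    """Offset of the displayed consumption relative to the pump-based value, in percent."""
    reference = consumption_l_per_100km(litres_refilled, km_driven)
    return 100.0 * (displayed - reference) / reference


# Hypothetical example values (assumptions, not real readings):
# the display shows 6.2 l/100 km, the refill took 42.0 litres after 650 km.
offset = display_offset_pct(displayed=6.2, litres_refilled=42.0, km_driven=650.0)
print(f"Display is off by {offset:+.1f}% against the pump-based value")
```

A negative offset means the display reports less fuel than the pump-based calculation suggests.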

Have a try and let me know how far off your car’s computer is compared with the calculation based on the fuel volume. A little insight: in this second case there are also uncertainties, both in the volume measured at the filling station and in the kilometers measured by your car. What could be a solution here?

Data quality results

In this article, we analyze how data quality affects the results of failure diagnostics for a multi-shaft gas turbine. To do this, we refer to the results of a published paper [1].

The objective of the researchers in the cited paper [1] is to detect failures in a multi-shaft gas turbine: which component is affected, by how much, and by what type of failure. In particular, the paper addresses severity detection, which means how bad the failure is on a scale from 0 to 150, and classification, which means what type of failure the gas turbine component is subject to.

In the paper, the authors developed a methodology capable of detecting failure severity and failure type. To test it, they artificially inject noise on top of a clean (noise-free) signal.

The results go into detail on the various types of failure and differentiate between severity detection and failure type detection. The only thing that changes between the tests is the measurement noise added on top of the clean signal. When the measurements are noise-free (0% noise on top of the clean signal), all the results are correct. When the noise increases to 0.8%, the wrong detections increase to 5%. Finally, when the measurement noise grows to 2%, the wrong detections go up to 41%. With 2% noise, roughly four in ten detections are incorrect.
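To make "noise on top of a clean signal" more tangible, here is a minimal Python sketch of such an injection. It does not reproduce the methodology or the figures of the paper [1]. It assumes that an "x% noise" level means zero-mean Gaussian noise with a standard deviation of x% of each clean reading, and the clean signal itself is a made-up example.

```python
import math
import random


def add_measurement_noise(clean, noise_pct, rng):
    """Return a noisy copy of `clean`.

    Assumption: the noise is zero-mean Gaussian with a standard deviation
    equal to `noise_pct` percent of each clean reading.
    """
    scale = noise_pct / 100.0
    return [x + rng.gauss(0.0, abs(x) * scale) for x in clean]


def rms_relative_deviation_pct(clean, noisy):
    """Root-mean-square relative deviation of noisy from clean readings, in percent."""
    squared = [((n - c) / c) ** 2 for c, n in zip(clean, noisy)]
    return 100.0 * math.sqrt(sum(squared) / len(squared))


rng = random.Random(0)  # fixed seed so that runs are repeatable
clean_signal = [500.0 + 0.01 * i for i in range(5000)]  # hypothetical clean readings

# Noise levels taken from the text above: 0%, 0.8% and 2%.
for pct in (0.0, 0.8, 2.0):
    noisy_signal = add_measurement_noise(clean_signal, pct, rng)
    deviation = rms_relative_deviation_pct(clean_signal, noisy_signal)
    print(f"{pct}% noise -> RMS deviation of about {deviation:.2f}%")
```

Feeding the same diagnostic method with `clean_signal` and with each `noisy_signal` is the kind of comparison that isolates the effect of measurement noise from everything else.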

And this is due to measurement noise alone. Imagine that, on top of measurement noise, you also have wrong signals or uncalibrated measurement chains. In that case, AI won’t take you far and will instead lead you to wrong conclusions. The specific issue in predictive maintenance is that the AI may report either too few problems or too many. In the first case it is worthless because it misses real failures; in the second it is useless because it is unreliable. All in all, this translates into garbage in, garbage out.

Good data quality can be considered the foundation of every analysis, with or without AI. The recommendation, then: before you start any AI or data analysis, make sure the data you rely on is of good quality.

References

[1] Togni, Simone; Nikolaidis, Theoklis; Sampath, Suresh (2020). A combined technique of Kalman filter, artificial neural network and fuzzy logic for gas turbines and signal fault isolation. Chinese Journal of Aeronautics, 34. doi:10.1016/j.cja.2020.04.015

Copyright

Author: Simone Togni

Date: 10/06/2024

Platform: aisciencetalk.blog
