Conversations often start with AI but end with data quality
Learn why it’s so important to ensure your data is good enough, especially for use of AI in high-risk contexts such as in energy and maritime.
This episode answers key questions such as:
Why data quality is so important
How to manage data quality efficiently to get real value from AI
Transcript:
MARTINE HANNEVIK Welcome to the Trust in Industrial AI video series, where we explore how to implement AI with speed and confidence in safety-critical industries. Conversations about AI often end up with a discussion about data quality and data management. And in today's episode, we'll explore why it's so important to ensure you have good enough data, especially for use in high-risk contexts such as in energy and maritime. I'm your host, Martine, and joining me today are two data quality experts from DNV, Mette and Karl John. Welcome to both of you.
METTE RØNNING RAABEL Thank you so much. Very nice to be here, Martine.
KARL JOHN PEDERSEN Thank you for having us.
MARTINE HANNEVIK Yeah. So I think we'll start with the first question to you, Karl John. And of course, the question is, why is it so important for companies to think about data quality if they want to realise the potential benefits of their AI investments?
KARL JOHN PEDERSEN I think as you mentioned, the conversation often goes straight to AI. But if we want to have good AI, we have to ensure that we have good data. So data is the backbone of a good AI system. So we have to focus more on the data. If we have good data coming into the system, then we can trust the results of the AI. And for many years, organizations have been collecting a lot of data. It provides a great resource for AI. But it's not enough just to have lots of data. We also have to have good data. And a recent DNV survey of energy professionals shows that data quality and data management are very important enablers for digitalization. AI models rely on accurate and reliable data, and poor quality data really does not work with AI. So we have to focus on the data that's going into the AI, the data that is training the models and that we are then using to get good decisions.
MARTINE HANNEVIK And are organizations aware of the importance and do they prioritize it? I guess it's easy for people to kind of focus on the fun stuff and the models, right? But what about the backbone and spending time on that?
KARL JOHN PEDERSEN Yeah, I agree. Often we just want to get our hands on the AI. It's the fun bit that's important. But we see that many report that data quality would be a barrier to implementing good AI in organizations. And in the same survey that we mentioned earlier, 70% of the companies say that they are going to prioritize improving the data quality in their organizations for that exact reason. They see that they need good quality data. So it's promising to see that organizations are focusing on the data, and we do see that they need to prioritize that.
MARTINE HANNEVIK And you definitely focus on the data, Mette, and have a lot of experience with that from the maritime industry. What are some of the common issues that you see with data quality?
METTE RØNNING RAABEL Yeah, so I think what we see is that we are quite good at collecting data, like Karl John says, but we are not so good at checking the quality of this data. And it's important to see that each use case requires a different level of data quality than another one. And what we also see is that the same data can be used for many different use cases. And then it's important to pair that data with the right data quality for the specific use case. And we see some typical common issues with data quality. It's about data incompleteness. It's about data inconsistency, and it's also about the different formats of the data. And of course, this data is collected from a variety of sources. It can be manual input data, where of course human errors can come into the picture. It can be sensor data, where you can have sensors not working or sensors that are drifting. It can be old systems. It can be a lot of different things that you need to take into account to actually understand the data quality. And that is crucial for making good models, basically.
MARTINE HANNEVIK Yes, there are a lot of things that you have to consider, yeah. And do you have any examples of where data quality can have a direct impact on business outcomes?
METTE RØNNING RAABEL Yeah, so working now in the maritime industry, we know that shipping is highly competitive and we also know that it's safety critical. So it can be safety for people, it can be safety for the vessels and also installations and their surroundings, and of course also for the environment. So as we move more and more towards near-real-time monitoring of maritime business, we have to be sure that the data quality actually follows, so that where we have higher requirements for data, we also have good quality in that data.
MARTINE HANNEVIK So in the end, data quality can even be a competitive advantage for us.
METTE RØNNING RAABEL Absolutely, yeah, yeah.
MARTINE HANNEVIK Great. And can you give an example of how you work with data quality?
METTE RØNNING RAABEL Yeah. So currently I work with a data-driven service. It concerns emissions from vessels, and there we collect the data from the vessel. Every time we get data from a vessel into our system, we run an automatic pipeline where a data quality engine checks for inconsistency, incompleteness and all the nitty-gritty details that we need to check, and that goes automatically. We find issues and then we also have a feedback loop back to the vessels so that we have a continuous improvement of the data quality. And that is of course important for this service because it is used for monitoring and it is used for regulatory compliance. We use it in machine learning algorithms to predict the future behaviour of these vessels, and the customers also use it for actual settlement of contracts. So then it has commercial impact as well.
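Editorial illustration (not part of the spoken transcript): the automatic checks and feedback loop Mette describes could look roughly like the minimal Python sketch below. It is not DNV's pipeline. The field names, the two rules and the helper names (`check_report`, `feedback_by_vessel`) are all assumptions made up for this example.

```python
from dataclasses import dataclass

# Hypothetical required fields for one vessel report (illustrative only).
REQUIRED_FIELDS = ("vessel_id", "timestamp", "fuel_tonnes", "distance_nm", "hours_underway")


@dataclass
class Issue:
    vessel_id: str
    kind: str    # "incomplete" or "inconsistent"
    detail: str


def check_report(report: dict) -> list[Issue]:
    """Run completeness and consistency checks on a single report."""
    vessel_id = str(report.get("vessel_id") or "unknown")
    issues = []

    # Completeness: every required field must be present and non-empty.
    for name in REQUIRED_FIELDS:
        if report.get(name) in (None, ""):
            issues.append(Issue(vessel_id, "incomplete", f"missing {name}"))

    # Consistency: fuel consumption cannot be negative.
    fuel = report.get("fuel_tonnes")
    if isinstance(fuel, (int, float)) and fuel < 0:
        issues.append(Issue(vessel_id, "inconsistent", "negative fuel_tonnes"))

    # Consistency: distance covered requires some time underway.
    distance = report.get("distance_nm")
    hours = report.get("hours_underway")
    if isinstance(distance, (int, float)) and isinstance(hours, (int, float)):
        if hours == 0 and distance > 0:
            issues.append(Issue(vessel_id, "inconsistent", "distance without time underway"))

    return issues


def feedback_by_vessel(reports: list[dict]) -> dict[str, list[str]]:
    """Group issue details per vessel, as input to a feedback loop."""
    feedback: dict[str, list[str]] = {}
    for report in reports:
        for issue in check_report(report):
            feedback.setdefault(issue.vessel_id, []).append(issue.detail)
    return feedback
```

Grouping the findings per vessel mirrors the feedback loop in the interview: each vessel gets back only the issues found in its own reports, so the source of the data can correct them.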
MARTINE HANNEVIK So when you have high data quality, you ensure that you can actually use the same data for multiple different use cases.
METTE RØNNING RAABEL Exactly.
MARTINE HANNEVIK Yeah, great. And Karl John, you have experience from many other safety-critical industries. Can you give some examples of how you worked with data quality in other industries?
KARL JOHN PEDERSEN Yeah. One recent example we can talk about is that we were working with pipeline operators. They have a large network of onshore pipelines and they inspect those pipelines regularly. So they have inspection data. It's lots of data. It covers a huge geographic area. And what they really want to do is to look at potential issues with those pipelines. Are there dents in the pipelines? Is there any corrosion in the pipelines? With so much data you can't do it manually, and you have to be able to trust the results. So they came to us and asked us, can we use AI to automatically check this data, go through the data, find those potential issues, find those dents and find those areas of corrosion? And we started doing that. And the first thing we need to do is ensure that the data we are training our models with is good quality data. So we set up standardized data schemas to ensure that we had good data, because it's coming from a lot of different places, and got that data into a common standard. And then we could use that to check that we had good quality data, removing the bad quality data and improving the data. And then we could use that data to train the AI model. And we knew we could then trust the results coming from the AI as a result of that. It saved them a lot of time. They could do a lot more work in a shorter space of time and they could trust the results more. So just going back to basics, looking at the data and ensuring good quality data is very important.
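Editorial illustration (not part of the spoken transcript): the idea of mapping inspection data from different places onto one standardized schema, then removing bad records before training, can be sketched as follows. The source names (`vendor_a`, `vendor_b`), their field names and units, and the quality rules are hypothetical. The actual schemas used in the project are not described in the interview.

```python
from typing import Optional

# Hypothetical common schema: {"position_m": float, "feature": str, "depth_mm": float}
KNOWN_FEATURES = {"dent", "corrosion"}


def to_common_schema(record: dict, source: str) -> Optional[dict]:
    """Map one source-specific record onto the common schema, or None if it cannot be mapped."""
    try:
        if source == "vendor_a":      # assumed to report kilometres and millimetres
            position_m = float(record["pos_km"]) * 1000.0
            depth_mm = float(record["depth_mm"])
        elif source == "vendor_b":    # assumed to report metres and inches
            position_m = float(record["pos_m"])
            depth_mm = float(record["depth_in"]) * 25.4
        else:
            return None               # unknown source: no mapping defined
        feature = str(record["type"]).strip().lower()
    except (KeyError, TypeError, ValueError):
        return None                   # missing or non-numeric field
    return {"position_m": position_m, "feature": feature, "depth_mm": depth_mm}


def is_good(row: dict) -> bool:
    """Basic quality rules applied after standardization."""
    return row["feature"] in KNOWN_FEATURES and row["position_m"] >= 0 and row["depth_mm"] > 0


def prepare_training_rows(batches: dict[str, list[dict]]) -> list[dict]:
    """Standardize every record, then keep only rows that pass the quality rules."""
    rows = []
    for source, records in batches.items():
        for record in records:
            row = to_common_schema(record, source)
            if row is not None and is_good(row):
                rows.append(row)
    return rows
```

Standardizing first means the quality rules only need to be written once, against the common schema, rather than once per data source.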
MARTINE HANNEVIK And I think the keyword there is, you know, if you have high data quality, you can trust the results and make different kinds of decisions than if you're not really sure what you're looking at.
KARL JOHN PEDERSEN Yeah, no, exactly. To trust the AI, we have to trust the data as well. And going back to what was mentioned earlier on, we just want to go straight into technology and develop something. But really we have to take a step back. We have to consider whether the data that has been fed into the AI is of good quality, that it's the correct data, that it's appropriate data. And we can then trust the results that come out of the AI models in a much better way. So data is the backbone of good AI.
MARTINE HANNEVIK Yeah, that's some great advice to our audience. Any final advice from you, Mette?
METTE RØNNING RAABEL I think it's pretty much what Karl John has already mentioned. But I think for organizations now, the first point is maybe to look at which parts of your organization you think AI will actually benefit. It can be in preventive maintenance, it can be with simulations, it can be with process optimization, many different things. But in a way, try to find those areas where you think AI will benefit you. And then the next step is to start collecting data and have a proper management system around that data so that you also have the ability to check data quality properly. And then when you have all of that in control, it is about choosing the right AI platform and tools, and starting to build the algorithms.
MARTINE HANNEVIK But the key takeaway is start with the data. Always important. Yeah, great. So thank you very much for sharing your insights with us, both of you, Mette and Karl John, and thank you very much to our audience for tuning in. If you have any questions or want to learn more about how DNV can support you with safe application of industrial AI, then please visit our website. Thank you.
If we have good data coming into the system, we can trust the results of the AI.
Karl John Pedersen
Head of Digital Trust
DNV
About the speakers
Mette Rønning Raabel, Product Manager, DNV
Mette has extensive experience working with data-driven innovations in industrial contexts, particularly in the maritime industry. She has been part of the Norwegian government’s expert group for sharing of industrial data focused on developing guidelines for responsibility, ownership and user rights related to the sharing of these data.
Karl John Pedersen, Head of Digital Trust, DNV
Karl John has extensive experience in data management and quality across private and public sectors. By providing guidance on requirements, standards, solutions and training courses, he helps clients build trust in and increase the value of their digital solutions. He is also one of the authors of DNV’s recommended practice for data quality assurance.
Martine Hannevik, Head of Innovation and Portfolio Management, DNV
The video series is hosted by Martine Hannevik.
Martine leads the innovation portfolio at Digital Solutions in DNV, focusing on developing future-oriented products and services in sustainability, AI and digital assurance. Her work lies at the intersection of strategy, innovation and digital transformation.
AI can enhance safety, operational efficiency, innovation, and sustainability in industries such as maritime, energy, and healthcare. However, organizations must balance risk and reward. By implementing AI responsibly, you can fully exploit its potential, even in high-risk contexts.
Combining our industry domain knowledge with deep digital expertise, DNV is dedicated to supporting industries with the safe and responsible use of industrial AI.