Presenting a model based on machine learning to improve maintenance management of oil pipeline
Mohammad Reza Zamani (Ph.D. Candidate, Department of Industrial Management, Science and Research Branch, Islamic Azad University, Tehran, Iran)
Ahmad Ebrahimi (Assistant Professor, Department of Industrial Management and Technology, Science and Research Branch, Islamic Azad University, Tehran, Iran)
Alireza Rashidi Komijan (Associate Professor, Department of Industrial Engineering, Firouzkouh Branch, Islamic Azad University, Firouzkouh, Iran)
Abstract:
Maintenance costs usually account for more than a third of a manufacturing company's operating costs, so an optimal maintenance strategy and model should improve equipment performance. Predicting failures and their root causes is therefore a major concern for managers in various industries: maintenance managers want to know which factors can interrupt production and service continuity so that they can prevent them. This matters even more in process industries such as oil and gas, where equipment failures and stoppages in production and product transportation cause heavy losses. Predicting pipeline failures under data imbalance is one of the main challenges in the oil industry, and traditional models are unable to identify failures accurately. The k-nearest neighbors (KNN) algorithm, a machine learning method, has shown good performance in the presence of unbalanced class distributions. This paper attempts to improve prediction accuracy by using KNN weighting techniques. In the proposed method, the KNN algorithm is combined with two new weighting methods, and the results show that these methods increase prediction accuracy compared to simple KNN and other machine learning methods. The paper thus presents a solution for improving preventive maintenance systems for pipelines in the oil industry.
Keywords: machine learning, maintenance management, oil pipeline, weighted KNN algorithm
- Introduction
In the midst of the Fourth Industrial Revolution, industries are constantly looking for ways to optimize production lines while reducing their costs (Popov et al., 2021). Maintenance costs typically account for more than one-third of a manufacturing company's operating costs (Fu et al., 2020). Traditional maintenance techniques are based on two strategies: corrective maintenance and preventive maintenance. Corrective maintenance repairs faulty systems and equipment only when a failure occurs, thus reducing direct process costs (Kerf et al., 2020). Preventive maintenance, by contrast, is performed at regular intervals to prevent equipment and systems from failing. As a result, repairs are carried out on machines or components whose remaining useful life is uncertain, leading to both machine downtime and increased operating costs (Zheng et al., 2021). In the oil industry, pipes play an important role in transporting oil and petroleum products, so their efficient and safe operation is crucial to minimizing environmental risk and protecting company assets. Predicting pipeline failure is therefore essential, and studies on the topic have increased significantly in recent years. Machine learning techniques have been used in many studies for fault or failure detection and have shown good performance. Research has also been conducted on failure detection in the oil industry and on failure prediction in similar equipment, but in the Iranian oil industry, research that can predict the causes of failure in oil pipes is still rare.
In addition, combining metaheuristic algorithms with machine learning can yield more accurate predictions by reducing the prediction error, yet this methodological approach is also rarely used in research on failure detection in oil pipes (Zenisek et al., 2019). Given these shortcomings and the lack of research on oil pipe failure prediction, this article presents a model for identifying and detecting oil pipe failures, which is a serious need in the oil industry, and develops it with the help of machine learning algorithms. The article seeks to answer this key question: how can defects in oil pipes be detected by controlling the identified parameters with the help of machine learning algorithms?
- Literature Review
Oil and gas pipelines are one of the most critical industrial infrastructures around the world. Failure or leakage in these lines can lead to environmental problems, high repair costs, and even safety threats. Therefore, developing accurate predictive models that can identify failures before they occur is of great importance.
One of the challenges in predicting pipeline failures is data imbalance. In this study, various techniques were used to address this problem. The KNN model was implemented with weighting to increase the influence of closer samples, and the value of k was optimized so that the model could better identify minority class samples. This method improved the accuracy of the model in detecting failure samples compared to simple KNN (Otchere et al., 2020).
Traditional methods are based on statistical analysis and probabilistic models. Logistic regression, for example, has been used to predict the probability of failure, but its accuracy is low on nonlinear data. Linear discriminant analysis and quadratic discriminant analysis have also been used to classify the condition of pipes, but they perform poorly on complex data. The main drawbacks of traditional methods are their inability to process large volumes of industrial sensor data and to model complex, nonlinear patterns, as well as their sensitivity to noise and to a lack of training data (Magana-Mora et al., 2019).
Recent advances in machine learning have made it possible to analyze large data sets and discover complex patterns. Various studies have shown that machine learning-based methods have higher accuracy in failure prediction, some of the most important of which are:
- Random Forest: a model that combines multiple decision trees; it has high accuracy but overfits in some cases.
- Support Vector Machine (SVM): performs well, especially on data with clear boundaries, but requires a lot of processing time on large data volumes.
- XGBoost: a boosting method capable of modeling nonlinear and complex relationships; it has shown better performance than other models in many forecasting problems.
- Artificial neural networks: able to learn complex features from data, but they require a large amount of data for training and have low interpretability.
Therefore, the most important limitations of machine learning methods are the need for balanced data to prevent model bias and the high computational complexity of some models (Yang et al., 2019).
In recent years, various methods have been proposed for pipeline failure prediction. Past studies have mainly focused on the use of basic models such as Logistic Regression, LDA, and QDA. However, new research shows that ensemble learning methods such as Random Forest and Boosting can improve the prediction accuracy. This paper aims to combine the advantages of traditional algorithms and optimized KNN methods.
KNN is one of the simplest but most effective machine learning algorithms for classification and prediction problems. It makes decisions based on the distance between data points and generalizes well. Its most important strengths are simplicity of implementation, no need for complex training, and good performance on small and unbalanced datasets. To improve its accuracy and performance and to overcome its limitations, a combination of weighting schemes is used.
Previous methods face two challenges. The first is a lack of attention to data imbalance: many previous studies ran machine learning models on balanced data, whereas failures occur rarely in industrial data. The second is the lack of KNN optimization techniques: previous papers have usually used standard KNN and have not investigated improvements such as weighting (Lin et al., 2019).
In this paper, the KNN model is combined with two new weighting methods:
(a) Exponential Weighting increases the influence of closer samples and improves the accuracy of the model.
(b) Hyperbolic Tangent Weighting allows more distant samples to have an impact on the prediction, but their role is reduced.
Therefore, to investigate the topic further, the performance and accuracy of weighted KNN models are compared with other machine learning methods to determine whether the proposed method can be applied in real conditions.
To state the problem: traditional methods have many limitations and cannot process complex data, while machine learning methods achieve higher accuracy but require optimization for unbalanced data. The present paper therefore improves KNN with weighting techniques and compares it with other machine learning models.
- Methodology
The data collection tools in the present study are a database, from which the data values are extracted, and a Delphi questionnaire, which confirms the input variables. The Delphi questionnaire covers the 8 variables mentioned in the conceptual model and the variables section, and is scored on a scale of 1 to 10 from least important to most important. The validity of the questionnaire is assessed using the opinions of ten professors, and its reliability is assessed using Cronbach's alpha test. The statistical population includes all experts in oil projects in the Iranian Oil Company who are preferably familiar with oil pipe failures and have technical expertise. Because sampling is judgmental, a sample of 10 to 20 people is adequate, and in the present study at least 10 people are selected.
The graph of the Delphi results gives the final average of each variable. The most important variables are viscosity and sludge weight, followed by flow acceleration. The averages themselves are not very important in the present study; what matters is that the experts agree on the research variables, and this agreement was achieved.
Next, the study moves to the machine learning stage, which uses the eight final variables confirmed by the Delphi method as influencing pipe failure. Based on these variables, it was determined to what extent they are good predictors for detecting defects in oil pipes. To overcome the problem of data imbalance, oversampling techniques such as SMOTE and ADASYN have been used. These methods make the data distribution more balanced by generating artificial samples from the minority class, and they improve the accuracy of the model.
In this paper, the KNN model has been optimized in two ways: exponential weighting and hyperbolic tangent weighting. These methods have improved the prediction accuracy. To examine the impact of these techniques, the proposed models have been compared with simple KNN and other machine learning algorithms including Random Forest, SVM and XGBoost.
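The oversampling idea described above can be illustrated with a simplified SMOTE-style sketch in Python. It is not the authors' implementation: the function name `smote_like`, its parameters, and the toy points are hypothetical, and ADASYN's density-based adaptation is not shown. Each synthetic sample is placed at a random position on the line segment between a minority-class sample and one of its nearest minority-class neighbors.

```python
import math
import random


def smote_like(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points by interpolating between minority neighbors.

    Hypothetical helper for illustration only; a simplified SMOTE-style scheme.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority-class neighbors of the base point (excluding itself).
        others = [p for p in minority if p is not base]
        neighbors = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbor = rng.choice(neighbors)
        # Place the new point at a random position on the segment base -> neighbor.
        gap = rng.random()
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbor)))
    return synthetic


# Toy minority-class (failure) samples, invented for illustration.
failures = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.3)]
new_points = smote_like(failures, n_new=4)
print(new_points)
```

Because every synthetic point lies between two existing failure samples, the new samples stay inside the region already occupied by the minority class instead of duplicating existing rows.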
- Exponential Weighting in KNN
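The exact exponential weighting formula is not reproduced in this section. As a minimal sketch, assume the common form w = exp(-alpha * d), where d is the distance from the query to a neighbor and alpha is a hypothetical decay-rate parameter. The function names and the toy data below are also assumptions made for illustration. Each of the k nearest neighbors then votes for its class with weight w:

```python
import math
from collections import defaultdict


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def exp_weight(distance, alpha=1.0):
    """Assumed exponential weight exp(-alpha * d); alpha is a hypothetical decay rate."""
    return math.exp(-alpha * distance)


def weighted_knn_predict(train_X, train_y, query, k=3, weight_fn=exp_weight):
    """Classify `query` by a weighted vote among its k nearest training samples."""
    # Sort training samples by distance to the query and keep the k closest.
    neighbors = sorted(
        (euclidean(x, query), label) for x, label in zip(train_X, train_y)
    )[:k]
    # Each neighbor adds its weight to the score of its class.
    scores = defaultdict(float)
    for distance, label in neighbors:
        scores[label] += weight_fn(distance)
    # The class with the largest total weight wins.
    return max(scores, key=scores.get)


# Toy data, invented for illustration: 0 = no failure, 1 = failure (minority).
train_X = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (0.5, 0.5)]
train_y = [0, 0, 0, 1]
query = (0.6, 0.6)

unweighted = weighted_knn_predict(train_X, train_y, query, k=3, weight_fn=lambda d: 1.0)
weighted = weighted_knn_predict(train_X, train_y, query, k=3, weight_fn=exp_weight)
print(unweighted, weighted)
```

In this toy case the single nearby failure sample is outvoted two to one under uniform voting, but it dominates once closer samples carry more weight, which is the effect the weighting is meant to have on minority-class detection.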
- Hyperbolic tangent weighting in KNN
In this method, the weight of each neighbor is adjusted by the hyperbolic tangent function. The graphs of the exponential and hyperbolic tangent weights show that as the sample distance increases, the assigned weight decreases more sharply, which has a positive effect on the performance of the method.
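The hyperbolic tangent formula is likewise not reproduced here, so the following Python sketch rests on assumed forms. It compares exp(-alpha * d) with tanh(beta / d), a form chosen only because it decays more gently with distance, which matches the description that more distant samples still contribute with a reduced role. The parameters `alpha` and `beta` and both function names are hypothetical.

```python
import math


def exp_weight(distance, alpha=1.0):
    """Assumed exponential form: exp(-alpha * d)."""
    return math.exp(-alpha * distance)


def tanh_weight(distance, beta=1.0):
    """Assumed hyperbolic tangent form: tanh(beta / d), with weight 1 at d = 0."""
    if distance == 0.0:
        # tanh(x) tends to 1 as x grows, so a coincident sample gets full weight.
        return 1.0
    return math.tanh(beta / distance)


# Tabulate both weights over a range of distances.
distances = [0.0, 0.5, 1.0, 2.0, 4.0]
table = [(d, exp_weight(d), tanh_weight(d)) for d in distances]
for d, w_exp, w_tanh in table:
    print(f"d={d:.1f}  exp={w_exp:.4f}  tanh={w_tanh:.4f}")
```

Under these assumptions both weights fall from 1 toward 0 as the distance grows, but the hyperbolic tangent weight keeps a larger share for distant neighbors than the exponential weight does.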
- Result
In this article, a Python program has been used to select the best method for predicting oil pipeline failures based on the identified variables, by applying classification methods. The data, consisting of 8 variables and 319 samples, are classified using different algorithms, and the results are presented. After filters were applied to pre-process the data, various machine learning methods were implemented and compared in terms of evaluation indicators in Table 2.
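The evaluation indicators of Table 2 are not reproduced in this section. Assuming they include the usual accuracy, precision, recall, and F1 score, the following Python sketch shows how such indicators are computed from true and predicted labels. The function name and the toy labels are hypothetical and are not the study's data.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy imbalanced labels, invented for illustration: 1 = failure, 0 = no failure.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

On imbalanced data, accuracy alone can look high even when failures are missed, which is why precision and recall on the failure class are reported alongside it.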
- Discussion
In this paper, an optimized KNN model was proposed that includes exponential and hyperbolic tangent weighting techniques. These techniques increased the accuracy of the model, which indicates its practical applicability in the oil industry. The results showed that weighting techniques in KNN improve the accuracy of oil pipeline failure prediction. KNN with exponential weighting performed better than simple KNN because closer samples contribute more to the decision, while hyperbolic tangent weighting was superior on some indicators such as precision because its gentler weighting increased the stability of the model. In general, ensemble learning methods such as Random Forest and XGBoost had the best performance compared to other methods such as Logistic Regression, LDA, QDA, SVM, and KNN; they achieved the highest accuracy because they combine multiple models to reduce variance. SVM performed better than simple KNN but was weaker than the ensemble learning models because it considers only a linear decision boundary.
The research findings show that 80% of oil pipe failures can be accounted for by the 8 variables under study. The remaining 20% can also be investigated: future work can identify which other variables predict oil pipe failures and add them to the existing list. This should be done in consultation with experts and specialists in the field, from whose opinions the final variables are extracted.
In laboratory conditions, each of the variables can be changed and its effect on pipe failure investigated further. In addition, the present study covered only oil pipes; other pipes, such as gas transmission pipes, can also be examined using the existing variables.
The rate of pipe failure was not included in the present study. The study establishes the extent to which these variables can lead to pipe failure, but the level of failure and the type of failure (crack or corrosion) can also be examined and treated as variables. Their absence is a limitation of the present study.
Therefore, suggestions for future research are as follows.
1- Investigate the combination of weighting methods with ensemble learning and meta-heuristic optimization algorithms such as genetic algorithms.
2- Consider the rate or severity of oil pipe failure.
3- Change each of the variables and examine their effect in laboratory conditions on pipe failure.
4- Examine the effect of existing variables on other types of pipes such as gas transmission pipes.
- References
Popov, E., Cheremisin, A., Rafieepour, S., 2021. Prediction of dead oil viscosity: machine learning vs. classical correlations. Energies 14, 930.
Fu, H., Yang, L., Liang, H., Wang, S., Ling, K., 2020. Diagnosis of the single leakage in the fluid pipeline through experimental study and CFD simulation. J. Pet. Sci. Eng. 193.
Kerf, T.D., Gladines, J., Sels, S., Vanlanduit, S., 2020. Oil spill detection using machine learning and infrared images. Remote Sens. 12, 4090.
Zheng, J., Du, J., Liang, Y., Liao, Q., Li, Z., Zhang, H., Wu, Y., 2021. Deeppipe: a semi-supervised learning for operating condition recognition of multi-product pipelines. Process Saf. Environ. Prot. 150, 510–521.
Yang, L., Fu, H., Liang, H., Wang, Y., Han, G., Ling, K., 2019. Detection of pipeline blockage using lab experiment and computational fluid dynamic simulation. J. Pet. Sci. Eng. 183.
Otchere, D.A., Ganat, T.O.A., Gholami, R., Syahrir, R., 2020. Application of supervised machine learning paradigms in the prediction of petroleum reservoir properties: comparative analysis of ANN and SVM models. J. Pet. Sci. Eng.
Al-Dushaishi, M.F., Abbas, A.K., Alsaba, M., Abbas, H., Dawood, J., 2020. Data-driven stuck pipe prediction and remedies. Upstream Oil and Gas Technol. 6, 1–9. https://doi.org/10.1016/j.upstre.2020.100024
Alshaikh, A., Magana-Mora, A., Gharbi, S.A., Al-Yami, A., 2019. Machine learning for detecting stuck pipe incidents: data analytics and models evaluation. In: International Petroleum Technology Conference, Beijing, China. Society of Petroleum Engineers. https://doi.org/10.2523/IPTC-19394-MS
Magana-Mora, A., Gharbi, S., Alshaikh, A., Al-Yami, A., 2019. AccuPipePred: a framework for the accurate and early detection of stuck pipe for real-time drilling operations. In: SPE Middle East Oil and Gas Show and Conference, Manama, Bahrain. https://doi.org/10.2118/194980-MS
Abbas, A.K., Flori, A., Almubarak, H., Dawood, J., Abbas, H., Alsaedi, A., 2019. Intelligent prediction of stuck pipe remediation using machine learning algorithms. In: SPE Annual Technical Conference and Exhibition, Calgary, Alberta, Canada. https://doi.org/10.2118/196229-MS
Zenisek, J., Holzinger, F., Affenzeller, M., 2019. Machine learning based concept drift detection for predictive maintenance. Comput. Ind. Eng. 137.
Makhotin, I., Orlov, D., Koroteev, D., 2022. Machine learning to rate and predict the efficiency of waterflooding for oil production. Energies 15, 1199. https://doi.org/10.3390/en15031199
Toubeau, J.-F., Pardoen, L., Hubert, L., Marenne, N., Sprooten, J., De Grève, Z., Vallée, F., 2022. Machine learning-assisted outage planning for maintenance activities in power systems with renewables. Energy 238, 121993.