Using Data To Predict Failures In Rod Pump Artificial Lift Systems
University of Southern California professor Raghu Raghavenda explained to attendees of last year's Woodford Shale Production Optimization Congress in Oklahoma City how computer modelling can help prevent rod pump failures. Leah Mooney reports.
Addressing attendees at last year's Woodford Shale Production Optimization Congress in Oklahoma City, University of Southern California professor Raghu Raghavenda explained that computer science could be used to predict rod pump failures and provide the opportunity to intervene before they happen. Using historic data from a large number of wells, he created a database of rod pump performance information and applied computer algorithms that could ultimately detect "patterns that will tell us [about] impending failures and then predict failures. Some of the data is such that a person looking at it may not be able to see [anything irregular] because these changes are very subtle and the machine learning techniques are able to identify these things."
Machine Learning Techniques
Because these changes are too subtle for a person to spot, Raghavenda used a "machine learning algorithm" to analyze the data and learn which trends precede rod pump failures. The result was a model that classifies pumps with similar parameters as either working normally or likely to fail. Creating an individual model for each well would be time consuming, costly and impractical, so a global model was built from data on a wide range of wells and can be applied to multiple oil fields. The benefit of this system is that "one could run these kinds of predictions, let's say on a 10,000 well field, to get the results every morning and do whatever needs to be done to intervene with those".
The process of creating the models began with collecting real-world data, which has to be prepared before it can be used because it often contains missing values as well as incorrectly entered data. "We need to [normalize the data] before we apply the machine learning techniques. Then we use the data to do some statistical analysis to get some features [of the pump's performance] and then feed that to the machine learning algorithms."
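As a rough illustration of that preparation step, the sketch below drops missing and implausible readings, normalizes what remains and derives a few summary statistics. It is not the method presented at the congress: the function names, the z-score normalization, the chosen statistics and the sample runtime readings are all assumptions made for the example.

```python
from statistics import mean, pstdev


def clean(readings, low, high):
    """Drop missing (None) readings and values outside a plausible range."""
    return [r for r in readings if r is not None and low <= r <= high]


def z_normalize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant series carries no variation to rescale.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def summary_features(values):
    """Simple statistics over a window of daily readings (assumed feature set)."""
    return {
        "mean": mean(values),
        "std": pstdev(values),
        "trend": values[-1] - values[0],  # change from first to last day
    }


# Hypothetical daily runtime readings in hours: None marks a missing entry
# and 240.0 stands in for an incorrectly entered value.
raw_runtime_hours = [24.0, 23.5, None, 240.0, 22.0, 20.5]

cleaned = clean(raw_runtime_hours, low=0.0, high=24.0)
normalized = z_normalize(cleaned)
features = summary_features(cleaned)
```

Here the plausible range of 0 to 24 hours does the work of catching the mistyped reading; a real pipeline would need a rule of this kind for each measured quantity.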
The most reliable features Raghavenda found were card area, peak surface load, minimum surface load and daily runtime. Values for each of these features had to be labelled as either normal or pre-failure to enable the computer to learn how to classify each rod pump. "For training purposes, we need what we call labelled examples where we know exactly which day it failed and what the values were before it happened. These could be given by the subject matter experts." This is an example of "supervised learning", which is time consuming but necessary at the beginning of the process. After this stage, he said, "unsupervised learning" is the preferred method. Rod pump data is particularly suitable for this type of machine learning because the data clusters tend to be small when pumps are in a pre-failure condition and large when they are running normally.
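To make the idea of learning from labelled examples concrete, here is a minimal sketch that uses a nearest-centroid classifier over the four features named above. The article does not say which algorithm was used, so nearest-centroid is a stand-in chosen for brevity, and the feature values (assumed to be already normalized to a 0 to 1 scale) are invented for illustration.

```python
from math import dist

# Order of the values inside each feature vector.
FEATURES = ("card_area", "peak_surface_load", "min_surface_load", "daily_runtime")


def centroid(rows):
    """Average each feature across a list of equal-length feature vectors."""
    return tuple(sum(column) / len(rows) for column in zip(*rows))


def train(labelled_examples):
    """Build one centroid per label from (feature_vector, label) pairs."""
    by_label = {}
    for vector, label in labelled_examples:
        by_label.setdefault(label, []).append(vector)
    return {label: centroid(rows) for label, rows in by_label.items()}


def classify(model, vector):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(model, key=lambda label: dist(model[label], vector))


# Hypothetical labelled examples, as a subject matter expert might supply them.
labelled_examples = [
    ((0.90, 0.80, 0.30, 1.00), "normal"),
    ((0.85, 0.75, 0.35, 0.95), "normal"),
    ((0.40, 0.95, 0.10, 0.60), "pre-failure"),
    ((0.35, 0.90, 0.15, 0.55), "pre-failure"),
]

model = train(labelled_examples)
healthy_pump = classify(model, (0.88, 0.78, 0.32, 0.97))
at_risk_pump = classify(model, (0.38, 0.92, 0.12, 0.58))
```

Because the model is just one centroid per label, the same trained model can be applied to any well that reports these four features, which is the sense in which a global model avoids per-well training.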
Testing The System
Once the computer had been trained on which values to look for and how they should be categorized, the system had to be tested. To do this, Raghavenda said, "ground truths" needed to be established through real-life testing, to confirm that the predictions came true in the real world. However, "for rod pumps it is difficult to know the ground truths because it may be a 5,000ft or 10,000ft well. If I predict that it is going to fail or it has a rod cut or a tubing leak, how would we know the real truth unless we open it up? And that can be expensive".
Instead, he used historical data. This consisted of three months of data on 3,259 wells. During that time, 301 failures were recorded and 2,994 wells ran normally. The system correctly predicted 291 failures, missing only 10. It correctly predicted that 2,852 wells would run normally but gave 142 false alarms. A prediction was recorded as a false alarm if the pump did not fail within 100 days of the prediction being made. Raghavenda explained that some failures couldn't be predicted because "rod pumps fail for different reasons. Some of them have long-term trends and some have short-term trends. Some failures are very sudden. One sudden failure is rod string failure. It could be that yesterday everything was perfect and all the wells were showing as normal but then the next day it stops producing completely. That we cannot predict because there is no information in the data to indicate that it would happen."
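The four reported counts form a confusion matrix, and the usual summary rates can be worked out from them directly. The short calculation below uses only those counts; the variable names are ours, and "recall", "precision" and "false alarm rate" are standard terms rather than ones used in the talk.

```python
# Counts reported for the historical test.
true_positives = 291    # failures correctly predicted
false_negatives = 10    # failures the system missed
true_negatives = 2852   # wells correctly predicted to run normally
false_positives = 142   # false alarms

actual_failures = true_positives + false_negatives
actual_normal = true_negatives + false_positives
total_cases = actual_failures + actual_normal

# Share of real failures that were flagged in advance.
recall = true_positives / actual_failures

# Share of raised alerts that were followed by a real failure.
precision = true_positives / (true_positives + false_positives)

# Share of normally running wells that were wrongly flagged.
false_alarm_rate = false_positives / actual_normal

# Share of all cases classified correctly.
accuracy = (true_positives + true_negatives) / total_cases

print(f"recall:           {recall:.1%}")
print(f"precision:        {precision:.1%}")
print(f"false alarm rate: {false_alarm_rate:.1%}")
print(f"accuracy:         {accuracy:.1%}")
```

On these counts the system caught roughly 97% of failures, while about two in three alerts corresponded to a real failure. The second figure is the one that matters most to an engineer deciding whether to act on an alert.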
Real World Applications
One of the wells being monitored had a rod pump failure due to a tubing leak. The system began detecting an abnormality at the time when the rod began to cut the tubing. It then recorded a dramatic fall in production when the tubing started leaking. "In many cases, from a few days to a month, nobody noticed that the well may be down and not producing at all. So there is a significant amount of lost oil," Raghavenda said, before adding that this system could prevent failures like this from happening altogether.
"The issue is whether the production engineer/business units are willing to take some action. That depends on how confident you are on your prediction because there is a cost associated with opening it up and looking at it. If there was nothing, that would be a false alert and there would be a cost," Raghavenda said. To overcome this, he suggested that production engineers use their own prior knowledge of the wells alongside the data generated by the computer system when deciding whether an alert is worth investigating.
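One way to frame that trade-off is as an expected-cost comparison: act on an alert when the probability of a real failure, multiplied by what the failure would cost, exceeds the cost of intervening. The sketch below is only an illustration of that reasoning. The dollar figures are hypothetical, the probability is simply the share of alerts in the historical test that were followed by a failure (291 of 433), and it assumes that intervening avoids the failure cost entirely.

```python
def expected_loss_if_ignored(p_failure, failure_cost):
    """Expected cost of doing nothing when an alert is raised."""
    return p_failure * failure_cost


def should_investigate(p_failure, intervention_cost, failure_cost):
    """Act only when ignoring the alert is expected to cost more than acting.

    Assumes (for illustration) that intervening fully avoids the failure cost.
    """
    return expected_loss_if_ignored(p_failure, failure_cost) > intervention_cost


# Share of alerts in the historical test that were followed by a failure.
p_alert_is_real = 291 / (291 + 142)

# Hypothetical costs in dollars; real values depend on the well and the operator.
INTERVENTION_COST = 20_000        # opening the well up to inspect it
HIGH_VALUE_FAILURE_COST = 50_000  # repair plus lost oil on a productive well
LOW_VALUE_FAILURE_COST = 25_000   # repair plus lost oil on a marginal well

act_on_productive_well = should_investigate(
    p_alert_is_real, INTERVENTION_COST, HIGH_VALUE_FAILURE_COST
)
act_on_marginal_well = should_investigate(
    p_alert_is_real, INTERVENTION_COST, LOW_VALUE_FAILURE_COST
)
```

An engineer's prior knowledge of a particular well would enter through `p_failure`, raised for a well with a history of tubing leaks and lowered for one whose alerts have proved false before.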