New system offers solution to No Fault Found
The problem rejoices under the catch-all of NFF – No Fault Found – and covers intermittent faults that cannot be traced when the serious business of looking for them starts. Other names for the problem include 'Cannot Duplicate' or 'Cannot Reproduce Fault' or even 'No Trouble Found'. Whatever the name, it is a problem that affects operation and maintenance in every type of electronics equipment, and it also has an impact on design engineers.
One of the most critical environments is aerospace. Clearly, aircraft operators will err on the side of caution when it comes to dealing with faults, and as a consequence any problem can prove costly. A survey of the aircraft industry, conducted by test service company Copernicus Technology, showed the major consequences of NFF were: repeat occurrences; cost in man-hours; repair costs; downtime; flight cancellations; and overall financial impact.
However, while some respondents were able to assess the combined financial impact of these consequences – ranging from less than $1m to more than $10m, a figure claimed by 4% of respondents – two thirds of respondents did not know how much NFF cost their organisations.
A further aspect is where the fault originates when it can't be traced. Is it poor operation, low quality components or bad design? Without being able to identify the fault, it is impossible to tell.
According to James Martland, customer focus director of Copernicus Technology, part of the fault identification issue comes from the testing methodology. A typical test setup would use an environmental test rig, to simulate flight conditions, and a test solution in which standard equipment 'scans' one test line, then moves on to the next. "What that means," said Martland, "is that if there is a glitch as a consequence of your environmental stimulus, the chances are you are going to miss it. As the system grows in complexity, with a higher number of interconnectivities and weak points subject to intermittency, using scanning means that you have virtually no chance of picking up these intermittent glitches."
Copernicus was set up by a small group of air force engineers to examine the problem from first principles. In the same way that, in the 16th century, Nicolaus Copernicus took a step back, viewed the data available and concluded that the Earth must orbit the Sun, the company that took his name believes that problems are solved more easily if the right data is looked at in the right way.
Martland commented: "You can't deny the data – the facts. People can close their eyes to it and say 'we have always done it this way', but our philosophy is to go from first principles. People are so busy these days that often they don't have time to reflect on what the data is really telling them – they are on a treadmill. Data collection and data analysis are part of the same drive.
"A lot of the time, you have silos. Sometimes they collect good information, sometimes bad. What we decided we needed to do to get the proper picture was think about what data needed to be collected. Wherever we go, we find silos of legacy data that we can stitch together. We might have people who design equipment, people who repair and maintain it, people who operate it – they might talk to each other, but they do not necessarily share the right data."
All the founders of Copernicus experienced NFF waste in the military aerospace sector and knew it was difficult to find faults when the aircraft was on the ground without environmental stimuli like vibration and altitude. However, they remained committed to the approach that, while it was the symptoms that couldn't be found, the fault still existed. An initial approach to solving the problem came under the banner of 'knowledge management solutions'. This was a way of bringing together maintenance, operating and design data to give the customer a good understanding of the problem.
However, the key was to collect the right data and, to do this, Copernicus turned to US company Universal Synaptics, which had a piece of equipment designed to overcome the NFF problem. The original piece of equipment was the Ncompass, and Martland explained how it differed from using standard instruments. "Using an oscilloscope for one line does give you great granularity for that one line, but it is only looking at one line. With our setup, we are looking at all the lines all the time, so we can catch the glitch wherever and whenever it happens. Once we have identified where the problems are, we can switch to a different technology to better characterise them. So you go seamlessly from testing the whole system all of the time looking for system integrity – glitches – to a different mode in that test to characterise that line and understand it better."
Because the founders have all come from the military aircraft sector, that is where the company's focus has been until now. Martland gave the example of the F-16 fighter fleet in the US Air Force. Each F-16 has a radar box worth around $330,000 and some developed an NFF that resulted in 130 boxes being parked on depot shelves – apparently unfixable and definitely unusable. However, a project using Ncompass saw the problem diagnosed and units repaired, effectively rescuing more than $40m worth of product and bringing a dramatic increase in mean time between failures from 280hr of flying time to 850hr.
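The figures quoted for the F-16 radar boxes can be checked with simple arithmetic. The short sketch below assumes that all 130 shelved boxes were recovered at the quoted unit value of around $330,000; the article does not state this explicitly, and the variable names are illustrative only.

```python
# Figures as quoted in the article; the unit value is approximate.
shelved_boxes = 130
unit_value_usd = 330_000

# Assumption: every shelved box was returned to service at full value.
recovered_value_usd = shelved_boxes * unit_value_usd

# Mean time between failures before and after, in flying hours.
mtbf_before_hr = 280
mtbf_after_hr = 850
mtbf_improvement = mtbf_after_hr / mtbf_before_hr

print(f"Value of recovered boxes: ${recovered_value_usd:,}")
print(f"MTBF improvement factor: {mtbf_improvement:.2f}")
```

On that assumption the recovered value comes to about $42.9m, consistent with the 'more than $40m' quoted, and the mean time between failures improves roughly threefold.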
While this data collection and analysis process obviously has its place in the maintenance depot of high value military equipment, does it have relevance to an electronics design department in less critical environments? Very much so, according to Martland. "There are two ways it's adding value. There is conventional exploring – which is how you use it during the design phase to check integrity during the design and prototype stages – evaluating different components that might have the same spec, but which may perform differently. So there is that value in the design lab.
"But there is also the value of the knowledge of how things age and fail in service. Because components might have integrity now, they may suffer from intermittency a year, or two, or three after service commences – that knowledge is incredibly valuable to designers. By testing components repeatedly you are characterising those items over time and that is really valuable; you are now seeing individual failures and managing them through life," he concluded. "You can manage the repair to fit in with the platform's operating cycle."