Introduction / Motivation
I want to formalize an idea I've had about analyzing data. This is a work in progress. I'm interested in seeing whether machine learning can be applied to a specific engineering problem in the field of reliability. Specifically, I want to determine why a small percentage of parts are failing at customer sites. The objective is to prevent such parts from escaping our process.
So as not to divulge details of a specific product or process, I will try to be generic in my description. Consider the part in question to be a semiconductor device that displays an aging characteristic and has a finite lifetime. The problem I'm addressing arises when, say, 0.5% of parts show premature aging. As a classification problem this is binary (pass/fail) and severely imbalanced (99.5%/0.5%).
Outliers
My first analysis looks at a population of about 50 failed parts. These parts come from a population of about 60,000 for which I have 8 parameters (features) measured by the vendor. I applied Principal Component Analysis (PCA) to the 60,000 x 8 data matrix to remove the mean and normalize the data. The question is whether any of the eight features of the failed parts were outliers in the vendor population. If the parameter distributions were Gaussian, essentially all of the PCA-normalized data should fall within 4-sigma of zero. All failed parts, as-shipped, were within 3-sigma of the zero mean. My conclusion was that there were no outliers, hence a simple screening on any of the eight features before build would not be useful.
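The outlier check described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual analysis: the data are synthetic Gaussian stand-ins for the vendor measurements, the failed parts are picked at random, and the helper names pca_whiten and max_abs_sigma are made up for this example. "PCA normalization" is assumed here to mean centering the data and scaling each principal-component score to unit variance.

```python
import numpy as np

def pca_whiten(X):
    """Center X, rotate onto its principal components, and scale each
    component to unit variance, so every column is in units of sigma."""
    Xc = X - X.mean(axis=0)
    cov = np.cov(Xc, rowvar=False)          # 8 x 8 sample covariance
    eigvals, eigvecs = np.linalg.eigh(cov)  # symmetric eigendecomposition
    # Assumes no eigenvalue is (near) zero, i.e. no perfectly collinear features.
    return (Xc @ eigvecs) / np.sqrt(eigvals)

def max_abs_sigma(Z, idx):
    """Largest absolute normalized value per selected part, across all components."""
    return np.abs(Z[idx]).max(axis=1)

# Synthetic stand-in data (assumption): 60,000 parts x 8 features.
rng = np.random.default_rng(0)
scales = np.array([1.0, 2.0, 0.5, 3.0, 1.5, 0.1, 10.0, 4.0])
X = rng.normal(size=(60000, 8)) * scales + 5.0

# Hypothetical failed parts: 50 indices chosen at random for illustration.
failed_idx = rng.choice(X.shape[0], size=50, replace=False)

Z = pca_whiten(X)
worst = max_abs_sigma(Z, failed_idx)

# Failed parts with any component beyond 3 sigma would be outlier candidates.
flagged = failed_idx[worst > 3.0]
print("failed parts beyond 3-sigma:", len(flagged))
```

With real data, X would hold the vendor measurements and failed_idx the row indices of the returned parts. An empty flagged array would correspond to the "no outliers" result reported above.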
Clusters
My next step was to see whether the failed parts, as-shipped, are clustered in the 8-dimensional feature space. I performed a k-nearest-neighbor analysis (using k=3) of the 50 failed parts in the population of 60,000 and found that for each failed part the 3 nearest neighbors were all "passers".
My tentative conclusion is that there is nothing in the as-shipped data that would indicate a future premature failure. It could be that the devices are damaged during the build process and, as a result, are "walking wounded". Once in the field under normal operation, such parts could experience accelerated aging. The cause might be something such as electrostatic discharge (ESD) damage that is not detected before the product ships.
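A minimal sketch of this neighbor check, using a brute-force search in NumPy, is given below. The data and pass/fail labels are synthetic, and the function name neighbor_fail_counts is hypothetical. It assumes the features have already been normalized so that Euclidean distance is meaningful.

```python
import numpy as np

def neighbor_fail_counts(Z, is_failed, k=3):
    """For each failed part, count how many of its k nearest neighbors
    (Euclidean distance, excluding the part itself) are also failed."""
    failed_idx = np.flatnonzero(is_failed)
    counts = np.empty(len(failed_idx), dtype=int)
    for n, i in enumerate(failed_idx):
        d2 = np.sum((Z - Z[i]) ** 2, axis=1)  # squared distance to every part
        d2[i] = np.inf                        # never pick the part itself
        nearest = np.argpartition(d2, k)[:k]  # indices of the k smallest distances
        counts[n] = int(is_failed[nearest].sum())
    return counts

# Synthetic stand-in for normalized data (assumption): 60,000 parts x 8 features.
rng = np.random.default_rng(1)
Z = rng.normal(size=(60000, 8))

# Hypothetical labels: 50 failed parts chosen at random, so no real clustering.
is_failed = np.zeros(Z.shape[0], dtype=bool)
is_failed[rng.choice(Z.shape[0], size=50, replace=False)] = True

counts = neighbor_fail_counts(Z, is_failed, k=3)
print("failed parts with at least one failed neighbor:", int((counts > 0).sum()))

# Chance baseline: expected failed neighbors per failed part if labels are random.
expected = 3 * (is_failed.sum() - 1) / (Z.shape[0] - 1)
print("expected failed neighbors per part by chance:", expected)
```

With only 50 failures in 60,000 parts, the chance baseline is about 0.0025 failed neighbors per failed part. Seeing only passers among the neighbors is therefore what one would expect when the failures are not clustered, while even a handful of failed neighbors would stand out.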
I am currently working on an experiment to see if ESD can create "walking wounded" devices.