Can You Help?
This section contains questions asked in 2003. You can still post responses to these
questions as noted below; they have been split off only for efficiency.
Notice:
This technical discussion forum has been established by the SRE to assist all reliability engineers, not just SRE members. To post your technical problem, solution, question, or answer here, send an e-mail to the SRE webmaster. Your e-mail should follow the format of those already posted. Postings are accepted from anyone, as long as they relate to the reliability field. It works best if your posting contains contact information for possible follow-up. If necessary, postings can be anonymous; just say so in your e-mail. Also, let us know when you get a workable solution or answer. We want to post it so everybody will benefit.
Problem/Question
(Oct 13, 2003):
Does anyone have any experience with accelerated life testing of automotive lamps? The life testing is being done by applying overvoltage to determine the equivalent condition. I also don't know whether the accelerated life test formula defined for incandescent lamps (higher voltage) is the same for automotive lamps.
Hope you can help me with this.
Sincerely
Joel Trinidad
jt14000@hotmail.com
Solution/Answer:
Can you help?
Problem/Question
(Oct 13, 2003):
Could someone help me out, please? What is the difference between reliability testing and durability testing of mechanical parts?
Please provide me with an answer.
Hari.
hariprasad@tatamotors.com
Solution/Answer:
Can you help?
Problem/Question
(Oct 10, 2003):
I hope you can help. I am interested in vibration analysis. I am a reliability coordinator with a preventative maintenance program that needs to be upgraded. I hear that monitoring vibration is the way to go. Can you give some info on what I would need to purchase as far as monitoring equipment and so on... I would be monitoring gear boxes, bearings, and shafts.
Thanks a bunch
Karen
Trovinge@DIALCORP.com
Solution/Answer:
Can you help?
Problem/Question
(October 1, 2003):
I am a new reliability engineer working on electronic ballasts. I would like to get some guidance as to the reliability programs that might be applicable to it. I am very new to this field. Other than that, I am also tasked to investigate MOSFET failures in our product. Does anyone have any idea as to how I can get started? I mean, guide me in doing the failure analysis and simulating failure modes applicable to MOSFETS? I also would like to be guided as to how I can measure the MTBF of electronic components. Hope somebody out there can respond to me.
Thanks in advance.
Walter Cornelio Osorio
Product Reliability Test Engineer
Trilux Electronics and Luminaires, Inc.
email: wosorio@trilux.com.ph
Solution/Answer:
Can you help?
Reply:(October 1, 2003)
Collect failure data.
If you're lucky, you might get ages at failures, by failure mode. You'll also need survivors' ages. With this data, you can use standard statistics to estimate and analyze failure rate functions by failure mode (MOSFET or other) and make broom charts that indicate whether reliability has changed. Nobody in his right mind would track ballasts by serial number to get ages at failures. So you'll have to collect ships and failure or return counts, by accounting interval and failure mode, to estimate and analyze failure rate functions.
Larry George
Send data to pstlarry@comcast.net or enter it into
http://www.fieldreliability.com/Table.htm, and I will send back the estimates and the analysis, free of charge.
Problem/Question
(September 22 2003):
I am looking for a whitepaper that compares and contrasts PRISM versus MIL-STD-217. We recently repeated an old MIL-STD-217 MTBF calculation using PRISM. The difference was greater than an order of magnitude. Granted, we might have been a little conservative in our old 217 calculation, but 10 times greater MTBF has raised some eyebrows here. Please help me rationalize this difference.
William (Bill) Carson
MIDAS Field Applications Engineer
VMETRO Inc.
Houston, Tx 77077
Solution/Answer:
Can you help?
Reply:(22 September, 2003)
Concerning MIL-HDBK-217 and other predictive methods:
I have used these techniques for about 31 years. Around 1977 I learned that most of the MIL-HDBK solutions were useless except for one interesting fact:
When assigned to "troubled projects", the predicted values for failure rates were always entirely inaccurate and always higher than the actual failure rates. Predicted values were always higher, even for the "successful" products. We also determined that MIL-STD-871 is a dangerous thing to use.
Since 1977 we have been able to achieve more than 300 times better reliability than a MIL-217 S-Level prediction would indicate is possible (S-Level being the space quality level). We accomplished this while using commercial plastic devices and commercial high-volume manufacturing processes, and with very little or no inspection at all.
Here's how we did it:
1) We acknowledged that parts fail for several possible causes:
Electronic parts:
(a) the most common failure mechanism was design caused, misapplication etc.
(b) next was mishandling (ESD and damaged parts)
(c) not very often, but once in a while we had some bad parts. We purged these and took them back to the manufacturer.
2) We acknowledged that when something fails, it is usually a physics or chemistry of failure problem, and as such, each individual failure mechanism was inextricably tied to a specific time-to-failure distribution.
(a) For mechanical failures, it's simple (really!). Either there was too high a stress, or the part was too weak. The only exception was when lubricant was left out of some radial bearings. But even in this case, the Weibull slope remained the same for the failures; only the scale parameter changed. The time to fail for mechanical parts is not a function of the difference between the stress and strength, but is an exponential function of the ratio of stress to strength.
(b) Electronic part failures obeyed either a Log-Normal pdf or Two Parameter Weibull.
Capacitors .... strength distribution = smallest extreme value distribution ... time to failure Two parameter Weibull
Resistors ... chemical failure modes (corrosion, etching etc) always Log-Normal time to failure
(c) MOSFETs (power) most commonly failed because they were "kicked back" by inductive loads (a design problem). Time to fail: exponential (Weibull with shape parameter = 1). When used properly, the most common failure modes were ESD or mobile ion contamination. To check which one, burn the part in at 125°C under bias overnight. In the morning, remove the part and cool it while maintaining the bias. If the part is really bad, then bake it overnight without bias and test it again. If it heals entirely, then the failure is due to ionic contamination. Opening the part and performing SEM with EDX should find the contaminant. It is usually expedient in this case to burn in all the parts under bias, testing and throwing out all the bad ones. Return the others to the supplier and demand corrective action (follow up to make sure they found the source of the contaminant and removed it). If the parts did not heal entirely after the 125°C stabilization bake, but only partially healed, then they failed by ESD. You have to find out where in your process, or earlier, they are getting zapped, and fix it.
(d) MOS and CMOS transistors and ICs: either misapplication or ESD, although mobile ions are possible. You now know how to determine the difference.
I have software available which can help you to determine which failures are design problems and which are parts (supplier) problems. It's nice to have the tools of the trade.
Good luck,
Barry Schlund
Visit us at SYSTEMS-RELIABILITY.com
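EDITOR'S NOTE: The reply above leans on Weibull shape ("slope") and scale parameters. As a minimal sketch of what those parameters do, assuming the standard two-parameter Weibull form, the Python below shows that a shape of 1 gives a constant hazard (the exponential case) and that changing only the scale stretches the time axis without changing the shape. The function names are illustrative and do not come from the post.

```python
import math


def weibull_hazard(t, shape, scale):
    """Instantaneous failure rate of a two-parameter Weibull at time t > 0."""
    return (shape / scale) * (t / scale) ** (shape - 1.0)


def weibull_reliability(t, shape, scale):
    """Probability of surviving past time t for a two-parameter Weibull."""
    return math.exp(-((t / scale) ** shape))


if __name__ == "__main__":
    # Shape = 1: the hazard is the same at every age (exponential case).
    for t in (10.0, 100.0, 1000.0):
        print("shape=1", t, weibull_hazard(t, 1.0, 500.0))

    # Shape > 1: the hazard grows with age (wear-out behaviour).
    for t in (10.0, 100.0, 1000.0):
        print("shape=2", t, weibull_hazard(t, 2.0, 500.0))

    # Same shape, different scale: reliability at t equals the reliability
    # of the other population at a proportionally scaled time.
    print(weibull_reliability(100.0, 2.0, 500.0),
          weibull_reliability(200.0, 2.0, 1000.0))
```

The scale values used here are arbitrary placeholders chosen only to exercise the functions.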
Problem/Question
(August 12 2003):
Seeking assistance with a simple reliability question. I have been asked to define a sample size to test a quick connect valve used in the medical field that will give a 90% confidence level that a unit will survive for one year of operation. The quick connect is snapped together two times a day for a total of 730 cycles in a year.
We have never done any type of reliability tests with this quick connect valve in the past, therefore I have no historical data to establish any type of failure rate. If someone would be so kind as to give me some advice, I would greatly appreciate it.
Thanks in advance for your help
Robert McCoy, CQE
Quality Engineer
Solution/Answer:
Can you help?
Reply:(12 Aug, 2003)
I have always used the reliability engineer's 3x rule of thumb.
It provides 95% confidence. We never settled for less than 95%.
If you make 730 x 3 = 2,190 connections without failure, then you will have demonstrated a 95% confidence level on the 730.
For 90% confidence, you need to make 730 x 2.303 = 1,681 connections without failure.
Barry Schlund
Visit us at SYSTEMS-RELIABILITY.com
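EDITOR'S NOTE: The 3x rule of thumb above matches the usual zero-failure demonstration test, where the multiplier is -ln(1 - confidence). The sketch below assumes a constant failure rate (exponential model) and assumes test cycles can be pooled across units; under those assumptions it demonstrates a mean cycles-to-failure of at least the target at the stated confidence, which is not the same as proving every unit survives. The function names are illustrative only.

```python
import math


def zero_failure_multiplier(confidence):
    """Multiple of the target life to accumulate with zero failures.

    Assumes an exponential (constant failure rate) model.
    """
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    return -math.log(1.0 - confidence)


def required_cycles(target_cycles, confidence):
    """Total failure-free cycles needed, rounded up to a whole cycle."""
    return math.ceil(target_cycles * zero_failure_multiplier(confidence))


if __name__ == "__main__":
    for confidence in (0.90, 0.95):
        print(confidence,
              round(zero_failure_multiplier(confidence), 3),
              required_cycles(730, confidence))
```

The multiplier for 95% is just under 3, which is where the rule of thumb comes from; any failure during the test invalidates the zero-failure calculation and calls for a different plan.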
Problem/Question
(August 08 2003):
I have been in maintenance for a number of years. I have held many different positions and worked my way up, so to speak. I earned my Associate's degree in Mech. Eng. I have recently learned about "maintenance reliability-PdM", vibration in particular. I have realized this is the area I should have been in ten years ago. Now I am looking for help and advice on how I can get into this field. I have a great track record in customer service and understand the operations behind maintenance. So if anyone out there can point me in the right direction and tell me what doors to knock on, I would be very appreciative.
Thanks for your help,
Ray Muench
Email: ramon229@earthlink.net
Solution/Answer:
Can you help?
Problem/Question
(August 06, 2003):
What is the historical value for MTBF for a complex electronic product like a laptop computer? I have run the math for my product and now need to understand the answer. To reduce the error or uncertainty of my answer, is there a standard test setup for the MTBF for a product such as a laptop? More specifically, is there a standard quantity of units run? Is there a standard number of hours of operation used?
Thanks,
John Hennessy, Sr QE Itronix Corp
(509) 742-1745
Email: hennessy@itronix.com
Solution/Answer:
Can you help?
Reply:(August 06, 2003)
I don't have a computer MTBF, but I do have computer reliability estimates of various kinds. One reason for no MTBF is that such a small fraction of computers fail during their useful life that extrapolating from ~5% to MTBF is ridiculous. I don't recommend testing them to determine MTBF, but I do recommend ongoing reliability tests to do statistical process control to detect process defects. The attachment describes how. The attachment shows how to provide a tolerable MTBF estimate, if someone insists.
EDITOR'S NOTE: There was no attachment.
Larry George
pstlarry@comcast.net
Reply:(August 06, 2003)
As a matter of fact, there are many standards. You will find that the best companies (like DELL) use higher standards than the "El Cheapos".
First of all, you need to find out (from someone in the business) what a Pareto chart of field failures for laptops looks like:
Dropped from the second floor .... xx % ( every user will eventually drop their laptop.)
Battery failure (internal short etc)
CPU overheated because the cooling fan separated from the chip ... y%
Power supplies
Large capacitors exploding because they were installed backwards ...zz%
Decoupling capacitors shorted (smashed against the PWB) .... tt%
etc
Transistors/drivers......
etc etc
You need to eliminate each failure mode through concerted design and test verification.
Do you have wires? Tack them down and provide stress relief at both ends: common-sense things.
Have you seen DELL's Volkswagen tests, where they beat them to death until the wheels fall off?
I know of one company that shoots for a 3-year life, another that shoots for 5 years, and (guess who) one that shoots for 10. So it depends on where you want to be in the marketplace.
Barry Schlund
Visit us at SYSTEMS-RELIABILITY.com
Reply:(June 16, 2004)
A few years back there was a paper about using PCs in point-of-sale terminals that did accelerated life testing (one of the years RAMS was in LA). We've looked at PCs for use in industrial controls and found that they experience several wear-out failure modes.
* The display typically lasts 3-5 years (CRT is obvious, but the illumination behind an LCD has a similar life).
* A hard drive used constantly lasts about 3 years (this may have changed).
* The cooling fan has a similar life.
The secondary problem is that after 3-5 years none of the original parts are available, and you might not be able to get the operating system (Windows in particular) to run on new hardware.
William Hagen, CRE
whagen2@ford.com
Powertrain ME Plant Engineering Commonization
Cell (313) 575-9529 Fax (313) 390-7701
Problem/Question
(July 31, 2003):
I am working for an electronics company and I am interested in a procedure to calculate the MTBF and life FIT rate data for one of our new products. I am also interested in knowing more about MIL-HDBK-217 and the Bellcore Reliability Prediction Procedure TR-332, which I believe will help me in calculating the MTBF. Does anyone have these documents?
Thanks in Advance
Ken
Email: bw_quek.ost@olympus.com.sg
Solution/Answer:
Can you help?
Reply:(31 July, 2003)
Bellcore and MIL-HDBK-217 are both wrong.
We have been demonstrating 300 times better reliability than a 217 S-Level prediction would say the products were capable of if we used all space-level quality parts. But we did it with cheap commercial plastic parts and high-volume commercial processes (water-soluble fluxes etc.), and with very little or no inspection.
A Quality engineer's job is to work himself out of a job. The same for a Reliability engineer. If they actually do this, then they will actually have a job forever.
We found that most field failures were design issues, and that when there were parts failures (that weren't design issues) then they were due to supplier process, or in-house poor ESD handling. When these problems were eliminated, we had 300 x MIL-217 S-Level reliability.
We have software techniques and training on how to determine if failures are design related or supplier related.
Barry Schlund
Visit us at SYSTEMS-RELIABILITY.com
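EDITOR'S NOTE: For readers new to the terms in the question, a FIT is one failure per 10^9 device-hours, so MTBF and FIT rate are reciprocals of each other up to that factor. The sketch below shows only that arithmetic, plus the series-system sum that parts-count style predictions rely on. It assumes constant failure rates, and the part names and FIT values are made-up placeholders, not handbook data.

```python
FIT_HOURS = 1e9  # one FIT = one failure per 10^9 device-hours


def fit_to_mtbf_hours(fit):
    """Convert a failure rate in FIT to MTBF in hours (constant rate assumed)."""
    if fit <= 0:
        raise ValueError("FIT rate must be positive")
    return FIT_HOURS / fit


def mtbf_hours_to_fit(mtbf_hours):
    """Convert MTBF in hours to a failure rate in FIT."""
    if mtbf_hours <= 0:
        raise ValueError("MTBF must be positive")
    return FIT_HOURS / mtbf_hours


def series_fit(part_fits):
    """Failure rate of a series system: the sum of its parts' failure rates."""
    return sum(part_fits)


if __name__ == "__main__":
    # Hypothetical placeholder values for illustration only.
    example_part_fits = {
        "hypothetical_ic": 50.0,
        "hypothetical_capacitor": 5.0,
        "hypothetical_connector": 45.0,
    }
    total_fit = series_fit(example_part_fits.values())
    print("system FIT:", total_fit)
    print("system MTBF (hours):", fit_to_mtbf_hours(total_fit))
```

As the reply above argues, the per-part rates fed into such a sum are the weak point; the arithmetic itself does not make a prediction accurate.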
Problem/Question
(July 22, 2003):
Please advise how to go about doing a gage R&R test when the samples are destroyed by the test. In this case, it concerns measuring the CO2 values in beer bottles. Every time a bottle is tested, that specific sample is "destroyed", so the measurement cannot be repeated. I used three operators using 20 samples each, giving a total of 60 bottles destroyed. I know you would be concerned about wasting beer, but the payback from a proper R&R method by far outweighs the beer.
Please assist by giving sound advice.
Thanks,
Clive Smith
Solution/Answer:
Can you help?
Problem/Question
(June 29, 2003):
I am trying to evaluate the reliability of a Positive Temperature Coefficient (PTC) resettable fuse. Does anyone have any experience in using these types of devices? One of the engineers I talked to thought that they tend to fail often. Are they reliable devices? How would one be modeled to predict its reliability? What is the normal failure mode, open or closed? What type of qualification testing may be performed to ensure a high reliability or quality level?
I'd appreciate any advice you may have.
Thanks,
Rich K.
Email: richk@xetron.com
Solution/Answer:
Can you help?
Problem/Question
(June 06, 2003):
I am a very new reliability engineer. Does anyone know the complete set of tests that are supposed to be run in order to ensure the reliability of a product? The current product I'm working on is an LED. Does anyone have a list of the tests and the justification for why we need to run them? E.g., temperature cycling is to simulate hot and cold conditions and also to check for CTE mismatch.
Thanks in advance.
Kevin
Email: kevin-wl_yeoh@agilent.com
Solution/Answer:
Can you help?
Problem/Question
(May 13, 2003):
We are in the process of adding electronic components to military vehicles and wish to address electronic component reliability in terms of heat increases over time. What would be most helpful would be a graphic representation of the percent degradation of reliability as temperature increases from ambient to 200 degrees F.
Michael D. Barton
Abrams Tank Reliability Engineer
DSN 786-6761
COMM 586-574-6761
FAX 586-574-8188
Solution/Answer:
Can you help?
Reply:(13 May, 2003)
As a general rule do this:
Use an activation energy of 0.7 eV for all CMOS devices and ICs (ESD and other voltage breakdown phenomena).
Use an activation energy of 0.7 eV for all other silicon transistors (aluminum electromigration).
Make sure that no semiconductor supplier has given you any smashed gold bonds, which lead to purple plague and Kirkendall voiding (activation energy 1.0 eV). It happens to every supplier from time to time.
Use only burned-in parts (125°C for 96 hours minimum). Using an activation energy of 0.7 eV, compute the field failure rate from the burn-in results. If they are not better than S-Level predictions, then find out why (and fix it).
Use an activation energy of 0.43 eV for mechanical failures (bearings etc.).
If you know of any diffusion mechanisms, the activation energy would be in the range 1.0 to 1.4 eV. Eliminate these failure mechanisms through design (diffusion barriers etc.). These high activation energy mechanisms will kill you.
Barry Schlund
Visit us at SYSTEMS-RELIABILITY.com
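EDITOR'S NOTE: Activation energies like those above are normally used in the Arrhenius model, where the acceleration factor between two temperatures is exp((Ea/k)(1/T_use - 1/T_stress)) with temperatures in kelvin. The sketch below assumes that model; the 25°C ambient is an assumption (the question does not give one), and the function names are illustrative. It prints how much faster each mechanism runs at 200 degrees F than at ambient, and the ambient-equivalent hours of a 96-hour, 125°C burn-in.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant, eV per kelvin


def fahrenheit_to_celsius(deg_f):
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (deg_f - 32.0) * 5.0 / 9.0


def arrhenius_acceleration(ea_ev, t_use_c, t_stress_c):
    """Arrhenius acceleration factor of the stress temperature over use."""
    t_use_k = t_use_c + 273.15
    t_stress_k = t_stress_c + 273.15
    exponent = (ea_ev / BOLTZMANN_EV_PER_K) * (1.0 / t_use_k - 1.0 / t_stress_k)
    return math.exp(exponent)


if __name__ == "__main__":
    ambient_c = 25.0  # assumed ambient temperature, not given in the post
    hot_c = fahrenheit_to_celsius(200.0)

    # Relative failure-rate increase at 200 F for each activation energy.
    for ea_ev in (0.43, 0.7, 1.0, 1.4):
        factor = arrhenius_acceleration(ea_ev, ambient_c, hot_c)
        print("Ea = %.2f eV: rate multiplier at 200 F = %.1f" % (ea_ev, factor))

    # Ambient-equivalent hours represented by a 96 h burn-in at 125 C.
    burn_in_factor = arrhenius_acceleration(0.7, ambient_c, 125.0)
    print("96 h at 125 C is equivalent to %.0f h at ambient" % (96.0 * burn_in_factor))
```

Evaluating the factor over a range of temperatures gives the kind of degradation curve the question asks for, and it also shows why the higher activation energy mechanisms dominate as temperature rises.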