Burn-in strategy

Started by Don Y December 2, 2022
On Fri, 2 Dec 2022 10:53:51 -0700, Don Y <blockedofcourse@foo.invalid>
wrote:

>We're making devices that are typically built from 3+ "modules"
>(form factor is highly constrained so board space is similarly constrained).
>
>Presently looking at the cost-benefit assessment of burning-in
>the modules and/or assembled devices.
>
>*Modules* will be made offshore so any test/burnin has to be part
>of the quoted manufacturing cost.
>
>Modules will be "post processed" domestically to install final
>firmware, S/Ns, private keys and watermarks. This allows for
>all of that information to be tracked, here (instead of relying
>on an offshore vendor who may decide to copy IP).
>
>Aside from installing the above, the only real "mechanical"
>modifications are connecting modules together and packaging.
>
>So, a "final test" can just verify proper operation of
>the *device* (instead of its constituent modules).
>
>If modules are burned-in and tested prior to acceptance, domestically,
>the number of failures after assembly should be minimal. (Yet,
>you'd still want to verify proper operation as the cost to repair
>or replace far exceeds the price of the devices)
>
>OTOH, an assembly problem that could manifest during burn-in
>would want some assurances that it wouldn't sneak past final
>inspection in the absence of a post-assembly burn-in phase.
>
>At the very least, this sort of question would apply to anyone who
>installs firmware after offshore manufacturing. Or, assembles
>subsystems sourced offshore.
>
>So, what is the best practices guidance?
Not sure how the off-shore vendor is expected to burn in unprogrammed devices. Are they provided with 'do-loop' or burn-in programs for checking function?

Use of pre-stressed parts is intended to cut down on fall-out and rework from your own testing of the final unit. In-house fixed costs of burn-in, with your unique fit and function, will likely be larger on a per-unit basis than that for the higher-volume modules. Weighted labor, rent and energy costs, likewise.

'Assembled with burnt-in modules' is not 'burned-in', so it depends on what your client is expecting re any specified defect rate or quality level. A single failure in burn-in of pretested units should ring alarm bells, but can't be expected to detect 'assembly problems'.

Overnight is easy for you, but I think you'll find that 24hrs ( - set-up time ) is expected, at tmax, with dynamic power cycling, to allow for the system's thermal time constant. (~5 to 15 minutes off in the hour).

Anything vaguely mechanical requires testing of occasional 'off the line' samples, (~HALT) to verify integrity of the assembly process, as it was defined for original qualification. You could probably outsource that, but would need to own the mounting jig and connectors.

If the offshore vendor CAN test with do-loop or burn-in programming, why not the 'completed' unit under the same provisions? Then you test, reflash/reprogram, and test again. You'd have to take much more interest in the vendor's procedures, but you could then fill your facility with mannequins.

For consumer products, who bothers?

RL
On 12/3/2022 7:14 AM, legg wrote:
> Not sure how the off-shore vendor is expected to burn in
> unprogrammed devices. Are they provided with 'do-loop'
> or burn-in programs for checking function?
Because most "modules" don't have any programmable components! Think of them as "I/O boards". You'd exercise an I/O board with software running *in* the test fixture, not in the I/O board! If you build the test fixture from the same sets of modules that would eventually "host" that I/O board "module", you don't need to design a special test fixture *or* special tester software if the eventual device is already known to contain run-time diagnostic support -- to alert the user to failures before they become consequential: "Hmmm... I know that I just turned on the stovetop but I don't sense current flowing. Is my current sensor broken? Or, the driver?". I have a module that provides audio output (w/amplification) and input. What do you expect to find on that module in terms of programmable devices? But, if you mate it to a "CPU module" that has been PXE-booted with a test program that generates audio (with a suitable load to stress the amp) and digitizes the audio looped back to the input channel, then you have a test fixture that can exercise that particular type of module.
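The loopback idea described above (generate a tone on the CPU module, drive the amplifier into a load, digitize what comes back) can be sketched in software. This is only an illustration: the sample rate, tone frequency, expected gain, and the `simulated_loopback` stand-in for the real amplifier/load/input path are all assumptions, not details from the thread.

```python
import math

SAMPLE_RATE_HZ = 48_000  # assumed converter rate
TONE_HZ = 1_000          # assumed test tone
N_SAMPLES = 4_800        # 0.1 s, an exact number of tone cycles


def generate_tone(n_samples, amplitude=1.0):
    """Sine test signal that the fixture would play through the amplifier."""
    step = 2.0 * math.pi * TONE_HZ / SAMPLE_RATE_HZ
    return [amplitude * math.sin(step * i) for i in range(n_samples)]


def tone_amplitude(samples):
    """Estimate the amplitude of the TONE_HZ component by correlation."""
    step = 2.0 * math.pi * TONE_HZ / SAMPLE_RATE_HZ
    re = sum(s * math.cos(step * i) for i, s in enumerate(samples))
    im = sum(s * math.sin(step * i) for i, s in enumerate(samples))
    return 2.0 * math.hypot(re, im) / len(samples)


def simulated_loopback(samples, gain=0.5):
    """Hypothetical stand-in for amplifier -> load -> input channel."""
    return [gain * s for s in samples]


def loopback_passes(channel, expected_gain, tolerance=0.05):
    """True if the measured end-to-end gain is within tolerance."""
    sent = generate_tone(N_SAMPLES)
    received = channel(sent)
    measured_gain = tone_amplitude(received) / tone_amplitude(sent)
    return abs(measured_gain - expected_gain) <= tolerance


if __name__ == "__main__":
    print("module OK:", loopback_passes(simulated_loopback, expected_gain=0.5))
```

In a real fixture the `channel` callable would wrap whatever drives the DAC and reads the ADC; a dead amplifier would show up as a measured gain near zero.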
> Use of pre-stressed parts is intended to cut down on
> fall-out and rework from your own testing of the
> final unit. In-house fixed costs of burn-in, with your
> unique fit and function, will likely be larger on a
> per-unit basis than that for the higher-volume modules.
> Weighted labor, rent and energy costs, likewise.
The point of doing some/most of that offshore is to reduce the costs of doing it domestically. Does the offshore manufacturer care if the test fixture is applying power to a 30A load and sensing the current flowing to it? Does it really have to be a stovetop?? Will I *gain* any additional information if I subsequently test that "pre-burned-in" module with a genuine stovetop??
> 'Assembled with burnt-in modules' is not 'burned-in',
Exactly. But, the types of failures that should manifest should only be related to the assembly of devices out of modules and the installation of the deliverable firmware. E.g., the *device* that controls the stove will already have had the *module* that switches/senses the power AND the CPU module AND the "power supply" module already pre-qualified.
> so it depends on what your client is expecting re any
> specified defect rate or quality level. A single failure
> in burn-in of pretested units should ring alarm bells,
> but can't be expected to detect 'assembly problems'.
The same sorts of problems that can manifest in post-burnin *module* test can also be evident in post-assembly testing. But, a smaller *set* of potential problems because the majority of the (module) assembly has already been tested.
> Overnight is easy for you, but I think you'll find that
> 24hrs ( - set-up time ) is expected, at tmax, with dynamic
> power cycling, to allow for the system's thermal time
> constant. (~5 to 15 minutes off in the hour).
Yup. But, if you can produce test fixtures at will, you can choose to stress designs for multiple days at a time! It all depends on how you model the failure mode distribution over time.

Ideally, you want to catch all of the infant mortality *failures* before they get shipped to customers. So, you *expect* to see the predicted infant mortality rate (of your revised design/process) in the post-burnin testing. If you aren't seeing any failures, then you likely aren't "aging" the devices adequately to see those failures before shipping the product.

If you routinely sample your "product", you can watch to see how the current design/process is tracking your predictions when you "fixed" the design/process. If that accelerated testing shows infant mortalities happening later in service life than your initial prediction, you may want to tweak your burnin to better capture those. Otherwise, you've just "wasted" the initial service life of that unit and made its likely infant-mortality failure arrive sooner, after sale, than it would have had you not burned it in!
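One common way to model a failure distribution like the one discussed above is a Weibull distribution with a shape parameter below 1, which gives a failure rate that falls with age. The sketch below estimates what fraction of a weak subpopulation a given burn-in duration would catch. The shape and scale values are hypothetical placeholders, not figures from this thread; real values would have to come from your own life-test data.

```python
import math

# Hypothetical parameters for the weak ("infant mortality") subpopulation.
SHAPE = 0.5          # < 1 means the failure rate decreases with age
SCALE_HOURS = 200.0  # assumed characteristic life of the weak units


def weibull_cdf(t_hours, shape, scale_hours):
    """Fraction of the population that has failed by time t."""
    return 1.0 - math.exp(-((t_hours / scale_hours) ** shape))


def weibull_hazard(t_hours, shape, scale_hours):
    """Instantaneous failure rate at time t (t must be > 0)."""
    return (shape / scale_hours) * (t_hours / scale_hours) ** (shape - 1.0)


def captured_fraction(burn_in_hours, acceleration=1.0):
    """Fraction of weak units expected to fail during burn-in.

    `acceleration` is the assumed ratio of equivalent field hours to
    chamber hours (1.0 means no acceleration).
    """
    return weibull_cdf(burn_in_hours * acceleration, SHAPE, SCALE_HOURS)


if __name__ == "__main__":
    for hours in (24, 48, 72, 168):
        caught = captured_fraction(hours)
        print(f"{hours:4d} h burn-in: {caught:.1%} of weak units caught")
```

If the observed fallout during burn-in is well below what a model like this predicts, that suggests either the model or the stress level is off, which is the tracking loop described above.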
> Anything vaguely mechanical requires testing of occasional
> 'off the line' samples, (~HALT) to verify integrity of the
> assembly process, as it was defined for original qualification.
> You could probably outsource that, but would need to own the
> mounting jig and connectors.
There are no mechanisms involved. When I worked with hand tools, we would destructively test production samples to verify we were meeting our design goals. For the same reason; you don't want the item to break after the customer has received it! Owning jigs/connectors also applies to the abovementioned test fixtures. But, this is less important if the fixtures aren't really "special". If the contract manufacturer wants to hold them hostage, you just shrug and write them off... create new fixtures from new production modules! [You're not at the mercy of, for example, the manufacturer keeping your (costly) tooling!]
> If the offshore vendor CAN test with do-loop or burn-in
> programming, why not the 'completed' unit under the same
> provisions?
Because the mix of "devices" will vary with the market. Why not install all your 4K7 resistors in the circuit boards you have on hand... and *hope* that exactly as many of each board that you have actually get sold?! The final firmware installed in each *device* varies from device to device (S/Ns, private keys, watermarks) even if the basic functionality is the same. Ensuring that a module has no "personality" prior to FINAL assembly ensures you have better control over what is actually going out the door. [Installing firmware and final test/burnin are trivial compared to all of the testing and burnin that has to happen for the component modules]
> Then you test, reflash/reprogram, and test
> again. You'd have to take much more interest in the vendor's
> procedures, but you could then fill your facility with
> mannequins.
>
> For consumer products, who bothers?
These aren't the sort of devices you buy at your local XYZ store and set up on your own. They are built *into* your home; part of the design philosophy has been that they aren't visible. Would you like seeing wall-warts and little boxes of kit all around your home? [I have 240 such "devices" in our modest little home] As such, a fair bit of effort goes into prewiring the home, making places for the devices to reside *in* walls, ceilings, closets, etc. The most cost effective approach is to do this when the home is in the construction phase and the skeleton is plainly visible. So, the primary (volume) customer is the home builder/developer. He would be unhappy if he installed 240 (or more) devices and had to revisit the property shortly after "closing" because the device located behind the dishwasher shit the bed. And, even more annoyed because of the "bad press" that it would generate among other potential buyers (developers typically build "subdivisions" full of homes, not individual homes). Because these "features" would be a (pricey) selling point for the home! A *homeowner* (the traditional "consumer") would typically only need to be involved with *devices* in the event of a failure. He could either hire someone to make the repair/replacement or undertake it, himself. In either case, he'd not be happy if the replacement shit the bed in short order and he had to rehire the same installer/repairman! (now he'd have memories of TWO failures -- one at "wearout" and the other at "infant mortality") IME, "people" don't want to have to dick with things; they just want them to work. The time spent "correcting" something that shouldn't have needed correcting is seen as "a hassle"... And, a poor reflection on the supplier (home builder, equipment manufacturer, etc.) [We just purchased a new appliance. One silly bit of decorative plastic trim is broken. How many hours will I have to spend "on hold" trying to get someone to put a replacement in the mail, to me? 
Did I really *expect* to have to do that when I ordered the appliance??]
On 12/2/2022 19:53, Don Y wrote:
> So, what is the best practices guidance?
Apart from the burn-in done here I already told you about I remember one more example I know of. During the early 80-s, a friend of mine worked as a production engineer at a factory which made clones of the PDP-11, they were shipped to the USSR. Many boards, full of TTL chips, many of which Russian (they used to make things up to say 4 bit counters etc., like the 74193 etc. under names no one could repeat) so the failure rate was huge. Their standard procedure was - as I remember the stories, never witnessed these - some time in a 40C chamber (don't know how many hours) where most of the failures would manifest, then on some rattling machine for a vibration test, may be more I don't know of.
On Friday, December 2, 2022 at 11:28:06 PM UTC-5, John Larkin wrote:
> On Fri, 2 Dec 2022 15:38:08 -0800 (PST), Fred Bloggs
> <bloggs.fred...@gmail.com> wrote:
>
> >On Friday, December 2, 2022 at 6:22:48 PM UTC-5, John Larkin wrote:
> >> On Fri, 2 Dec 2022 15:09:26 -0800 (PST), Fred Bloggs
> >> <bloggs.fred...@gmail.com> wrote:
> >>
> >> >On Friday, December 2, 2022 at 1:08:08 PM UTC-5, John Larkin wrote:
> >> >> We were doing overnight burnin+test of our products, after automated
> >> >> test and cal, but the failure rate was zero. Useful burnin might take
> >> >> weeks and temperature cycling or something expensive like that.
> >> >>
> >> >> Temperature cycling is probably the biggest stressor of parts and
> >> >> solder joints and design margins. Shock+vibration next. Just benign
> >> >> burnin doesn't seem to do much.
> >> >
> >> >Infant Mortality--The Lesser Known Reliability Issue
> >> >https://ieeexplore.ieee.org/document/4274831
> >> Paywalled. What's a reasonable burn time to catch infant mortality? If
> >> it's months, it wouldn't be practical for commercial gear.
> >
> >It's not the only game in town. There's tons of literature on infant mortality with testing techniques dating to WW2.
> Military electronics used to require JAN/TX transistors, carefully
> assembled and tested and burned in and fabulously expensive. Then
> someone determined that regular equivalents were more reliable.
Their main strength was hermetic encapsulation. That was something they knew how to do. Now look at them, installing parts salvaged off junk consumer products from China, washed off in some dirty river, after being removed en masse from a circuit board with a gas torch, and then re-labeled with some JAN code. The main criterion for acceptance was that they looked shiny and new.
On Friday, December 2, 2022 at 6:22:48 PM UTC-5, John Larkin wrote:
> Paywalled. What's a reasonable burn time to catch infant mortality? If
> it's months, it wouldn't be practical for commercial gear.
Right. FedEx and UPS will put your gear through the shake, rattle and roll test. Maybe do some dummy shipments out and back and see how that works.
On Sat, 3 Dec 2022 13:09:52 -0800 (PST), Fred Bloggs
<bloggs.fredbloggs.fred@gmail.com> wrote:

>Their main strength was hermetic encapsulation. That was something they knew how to do. Now look at them, installing parts salvaged off junk consumer products from China, washed off in some dirty river, after being removed en masse from a circuit board with a gas torch, and then re-labeled with some JAN code. The main criteria for acceptance was they looked shiny and new.
Plastic packaged transistors turned out to be just as good as TO-can parts too. We buy real parts from authorized distributors. It's amazing how much more reliable parts are now, vs the earlier days of ICs. We rarely have a bad part on newly built boards. When we see a pattern of "bad parts" it usually turns out to be a design issue.
On 12/3/2022 11:07 AM, Dimiter_Popoff wrote:
> During the early 80-s, a friend of mine worked as a production
> engineer at a factory which made clones of the PDP-11, they
> were shipped to the USSR. Many boards, full of TTL chips, many of which
> Russian (they used to make things up to say 4 bit counters
> etc., like the 74193 etc. under names no one could repeat)
> so the failure rate was huge.
> Their standard procedure was - as I remember the stories, never
> witnessed these - some time in a 40C chamber (don't know how
> many hours) where most of the failures would manifest, then
> on some rattling machine for a vibration test, may be more I
> don't know of.
The point being to operate the device at "extreme" conditions to effectively "age" it faster than real time to a point in its useful life AFTER most of the infant mortality failures have surfaced. Presumably, it wasn't NORMALLY operated in a 40C environment *or* under high vibration. You can also play games with the power supplies to make the components (and design) "uncomfortable".

[Otherwise, you would have to age it at normal operating conditions for that full period of time -- a foolish strategy when you've got resources tied up for much longer than necessary! Imagine infant mortalities manifesting after weeks of normal operation... would you want to leave units running at "STP" for weeks just to be sure to capture all of those failures pre-sale?]

Done correctly, it's an ongoing *process* where you track failure rates (and modes) and revise your model so you continue to capture the failures of interest.

In shops where we've done this, there was a *department* dedicated to tracking product quality. Way too much "Profanity and Sadistics" for non-math types!
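For thermally activated failure mechanisms, the "aging faster than real time" described above is often estimated with the Arrhenius acceleration factor. The sketch below is a rough illustration only: the 0.7 eV activation energy is an assumed placeholder (the right value depends on the failure mechanism), and the model says nothing about vibration or supply-voltage stress.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def arrhenius_acceleration(t_use_c, t_stress_c, activation_energy_ev):
    """Ratio of aging rate at the stress temperature to the use temperature."""
    t_use_k = t_use_c + 273.15
    t_stress_k = t_stress_c + 273.15
    exponent = (activation_energy_ev / BOLTZMANN_EV_PER_K) * (
        1.0 / t_use_k - 1.0 / t_stress_k
    )
    return math.exp(exponent)


def equivalent_field_hours(chamber_hours, t_use_c, t_stress_c,
                           activation_energy_ev):
    """Field hours that a given number of chamber hours would represent."""
    factor = arrhenius_acceleration(t_use_c, t_stress_c, activation_energy_ev)
    return chamber_hours * factor


if __name__ == "__main__":
    ASSUMED_EA_EV = 0.7  # placeholder, not from the thread
    factor = arrhenius_acceleration(25.0, 40.0, ASSUMED_EA_EV)
    hours = equivalent_field_hours(24.0, 25.0, 40.0, ASSUMED_EA_EV)
    print(f"acceleration factor: {factor:.2f}")
    print(f"24 chamber hours ~ {hours:.0f} field hours")
```

Under these assumptions a 40C chamber gives only a modest speed-up over 25C operation, which is one reason the choice of stress temperature matters so much.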
On Sat, 3 Dec 2022 20:07:59 +0200, Dimiter_Popoff <dp@tgi-sci.com>
wrote:

>Their standard procedure was - as I remember the stories, never
>witnessed these - some time in a 40C chamber (don't know how
>many hours) where most of the failures would manifest, then
>on some rattling machine for a vibration test, may be more I
>don't know of.
I was doing much the same in the late 1970s. We had a number of new SEL 32/55 midi computers, with this brand new semiconductor RAM memory (replacing magnetic core memory), and were having lots of early failures.

So, I decided to give them some hot summer days: The computers were looping on a memory test, as before, but now with their air intakes partially blocked by cardboard, with a thermocouple in the core so we could adjust the cardboard to achieve the max allowed temperature.

Initially, delivered units would fail within a day. We would remove the cardboard et al and call the vendor, who would then find and replace the failed memory. Rinse and repeat.

Pretty soon, the vendor instituted a hot screening program before delivery, it being far cheaper to fix in factory than the field, and in a year or two semiconductor memory field reliability had improved greatly.

Joe Gwinn
On 12/4/2022 10:46 AM, Joe Gwinn wrote:
> I was doing much the same in the late 1970s. We had a number of new
> SEL 32/55 midi computers, with this brand new semiconductor RAM memory
> (replacing magnetic core memory), and were having lots of early
> failures.
>
> So, I decided to give them some hot summer days: The computers were
> looping on a memory test, as before, but now with their air intakes
> partially blocked by cardboard, with a thermocouple in the core so we
> could adjust the cardboard to achieve the max allowed temperature.
>
> Initially, delivered units would fail within a day. We would remove
> the cardboard et al and call the vendor, who would then find and
> replace the failed memory. Rinse and repeat.
>
> Pretty soon, the vendor instituted a hot screening program before
> delivery, it being far cheaper to fix in factory than the field, and
> in a year or two semiconductor memory field reliability had improved
> greatly.
But, the vendor likely didn't just "block the vents" and *hope* ALL the early faults would manifest in the first 24 hours.

Instead, he likely stressed a sample population over a longer period of time and recorded the failure rates, over time -- looking for the "knee" at which the failure rate leveled off. Longer burnin times would just needlessly shorten the useful life of the device; shorter would risk some number of infant mortality failures slipping through to manifest at the customer.

It seems that most folks have a naive understanding of how burnin is supposed to work. That "simply" plugging the unit in before sale is enough to catch the early failures. Unless you know where (in time) those failures are probabilistically going to manifest, how can you know that 24, 48, 72, 168 hours is "enough"? Or, that 60C is the best temperature to accelerate failures? (my residential devices have to *operate* at 60C. And, -40C.)

[If you're not going to approach it with a scientific basis, you're likely just looking to capitalize on your customers' ignorance: "We burn in our products for ## hours to ensure quality". Yeah. Right. "Then why did OUR unit shit the bed after two weeks?"]
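Looking for the "knee" can be sketched as: bin the failure times from a stressed sample population, compute a failure rate per bin, and find where the rate settles to the level seen at the end of the test. The code below is one simple way to do that, not an established procedure; the failure times, population size, bin width, and tolerance are all made-up illustration values.

```python
def binned_failure_rates(failure_hours, population, bin_hours, total_hours):
    """Return [(bin_start_hours, failures per unit-hour), ...]."""
    n_bins = int(total_hours // bin_hours)
    counts = [0] * n_bins
    for t in failure_hours:
        index = int(t // bin_hours)
        if 0 <= index < n_bins:
            counts[index] += 1
    rates = []
    surviving = population
    for i, failed in enumerate(counts):
        rate = failed / (surviving * bin_hours) if surviving > 0 else 0.0
        rates.append((i * bin_hours, rate))
        surviving -= failed  # failed units leave the population at risk
    return rates


def find_knee(rates, tail_bins=3, tolerance=1.5):
    """Start of the earliest bin from which every later bin's rate stays
    within `tolerance` times the mean rate of the last `tail_bins` bins.
    Returns None if the rate never levels off within the test."""
    tail = [rate for _, rate in rates[-tail_bins:]]
    limit = tolerance * sum(tail) / len(tail)
    knee = None
    for start, rate in reversed(rates):
        if rate > limit:
            break
        knee = start
    return knee


if __name__ == "__main__":
    # Made-up failure times (hours) for 1000 units stressed for one week.
    sample = [1] * 20 + [30] * 10 + [50] * 4 + [80, 100, 130, 150]
    rates = binned_failure_rates(sample, population=1000,
                                 bin_hours=24, total_hours=168)
    for start, rate in rates:
        print(f"{start:3d}-{start + 24:3d} h: {rate:.2e} failures/unit-hour")
    print("knee at about", find_knee(rates), "hours")
```

With real data the bins would be noisy, so a sample large enough to give several failures per bin matters more than the exact knee rule used.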
On Mon, 5 Dec 2022 05:09:48 -0700, Don Y <blockedofcourse@foo.invalid>
wrote:

>It seems that most folks have a naive understanding of how burnin is
>supposed to work. That "simply" plugging the unit in before sale
>is enough to catch the early failures. Unless you know where (in time)
>those failures are probabilistically going to manifest, how can
>you know that 24, 48, 72, 168 hours is "enough"? Or, that 60C is
>the best temperature to accelerate failures? (my residential
>devices have to *operate* at 60C. And, -40C.)
The best temperature to accelerate failures is the operating limit that the design is intended to address, under functioning conditions that produce the highest intended self-generated rise.

If you have access to early testing, you'll have some idea of the margins for functional operation that this limit condition provides, and the accompanying MTBF calculation for this previously-measured condition.

It is only when margins to the limits are actually exceeded that predicted life is possibly compromised.

Complete thermal cycling is impractical for simple burn-in. It is usually restricted in application to design verification or later sample process quality assurance.

Cold cycling tolerance is relevant to consumer products mainly to demonstrate air-shipment worthiness.

For burn in, simple on-off cycling to allow stress over self-generated temperature swings is considered adequate.

RL
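The simple on-off cycling mentioned here (and the "~5 to 15 minutes off in the hour" figure earlier in the thread) amounts to a fixed hourly schedule. A minimal sketch of generating one is below; the 24-hour duration and 10-minute off time are example values, and the function name is hypothetical. The off time would normally be chosen from the unit's measured thermal time constant.

```python
def power_cycle_schedule(total_hours=24, off_minutes=10):
    """Return [(start_minute, end_minute, state), ...] with one on/off
    cycle per hour: powered for (60 - off_minutes), then off."""
    if not 0 < off_minutes < 60:
        raise ValueError("off_minutes must be between 1 and 59")
    schedule = []
    for hour in range(total_hours):
        start = hour * 60
        on_end = start + 60 - off_minutes
        schedule.append((start, on_end, "on"))
        schedule.append((on_end, start + 60, "off"))
    return schedule


if __name__ == "__main__":
    for start, end, state in power_cycle_schedule(total_hours=2):
        print(f"{start:4d}-{end:4d} min: power {state}")
```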