Every Dynamic Random Access Memory (DRAM) module carries memory chips made up of numerous tiny integrated circuits (ICs) that are expected to perform reliably for long periods under demanding workloads and conditions. Whether at rest or in operation, these modules face hazards throughout their service life, and any malfunction can lead to system breakdowns that disrupt or delay business operations. These components should therefore be manufactured and packaged to very high standards, and their use should be managed in a way that addresses or minimizes known risks.
As ICs continue to diminish in size, they become more susceptible to damage from a host of environmental factors, especially for systems placed in inhospitable locations with severe hot or cold temperatures and airborne particles. Systems set up in outdoor remote areas where access to regular maintenance is limited are particularly vulnerable.
Uneven IC Quality. ICs within a wafer may not all have the same quality; a single wafer may contain both strong and weak ICs. When ICs of different quality are used on the same DRAM module, the module can cause unstable system operation.
Figure 1. Using ICs of different qualities in the same DRAM module can cause unstable system operation.
Thermal. High operating temperatures can cause DRAM chips to degrade quickly, especially when they are installed in systems and environments with poor air flow.
Environmental. Contaminants such as dust, harmful gases, aggressive chemicals, humidity, water and other airborne elements that penetrate the DRAM module can cause corrosion and damage. For example, when sulfur from the environment enters gaps in the resistors and comes into contact with silver, which is used as conductive material for sensitive electronic parts, the silver turns into a non-conductive compound called silver sulfide. The resulting corrosion causes the resistor to become an open circuit. It will no longer function properly, ultimately leading to module failure.
Electro-Mechanical. Damage to the contacts/gold fingers can weaken or adversely impact signal integrity. Static charge builds up when electrons transfer between surfaces, and electrostatic discharge (ESD) occurs when that charge is suddenly released. Friction, contact area and humidity influence the amount of static electricity generated. Simply unwinding tape from a roll or walking on a carpet can generate enough static electricity to damage electronic components, as even small parametric changes can cause adverse effects ranging from performance degradation to complete device failure.
A power surge or spike, which delivers a sudden voltage increase, can also harm the ICs or memory chips, causing the DRAM to malfunction or shortening its useful life. Voltage fluctuations and sudden power loss events can degrade components and circuits, reduce equipment life and cause data loss or corruption.
Reliability Solutions and Technologies
DRAM modules for the enterprise and industrial markets are typically installed in high-performance environments such as data centers where non-stop computing is done on massive scales and any disruption in operation can impact business heavily. Usage in IoT systems is also common. These systems may be situated in remote places that could be difficult to reach for regular maintenance. It is therefore important to make sure that the DRAM modules can be depended on for long stretches of time.
ATP's modules undergo two levels of testing to ensure maximum reliability:
1. Advanced IC-Level Testing screens for ICs with the best reliability and quality characteristics, suitable for applications that require wide-temperature operation.
2. Enhanced Module-Level Tests: Test During Burn-In (TDBI) and Automatic Test Equipment (ATE) guarantee that modules meet and even exceed qualifying parameters.
It is extremely important to determine the electrical and mechanical characteristics of the individual components used on ATP's DRAM modules to ensure the highest quality and reliability. IC-level testing is conducted to ensure the reliability of the new die, and the ICs are subjected to high and low temperature tests to screen out weak ICs.
Figure 2. IC-Level Testing
Figure 3. Weak ICs that fail in high-low temperature environments are screened out.
Functional Testing Using Automatic Test Equipment (ATE)
The ATE detects component defects and structural defects related to the DIMM assembly and screens out marginal timing and signal integrity (SI) sensitivities. ATE provides electrical testing patterns with various parameter settings, such as marginal voltage, signal frequency, clock, command timing and data timing, under continuous thermal cycling. Tailor-made electrical testing patterns can also be programmed and implemented in the ATE testing process at a customer's request. The ATE testing system can pinpoint individual defective ICs or DRAM PC boards, providing a more efficient failure analysis method for both the new product development and mass production stages.
Figure 4. Module-Level Testing using ATE
System-Level Failure Detection and Prevention via TDBI
The bathtub curve in Figure 5 illustrates the three key periods of product failure. Failure rates are high during early life, or infancy, then fall to a roughly constant rate during the product's normal usage life. Failure rates rise again as the product wears out and approaches the end-of-life stage.
Figure 5. The Bathtub Curve shows the three periods of product failure.
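The shape of the bathtub curve can be approximated numerically. The following is a minimal sketch, not ATP's model: it assumes the overall failure rate is the sum of a decreasing Weibull hazard (early life), a constant hazard (normal usage life) and an increasing Weibull hazard (wear-out). All shape, scale and rate values are hypothetical and chosen only to make the three periods visible.

```python
def weibull_hazard(t, shape, scale):
    """Weibull hazard rate h(t) = (shape / scale) * (t / scale) ** (shape - 1)."""
    if t <= 0:
        raise ValueError("t must be positive")
    return (shape / scale) * (t / scale) ** (shape - 1)


def bathtub_hazard(t_hours):
    """Illustrative bathtub-shaped failure rate (failures per hour).

    All parameters below are hypothetical, not measured DRAM data.
    """
    early_life = weibull_hazard(t_hours, shape=0.5, scale=2000.0)   # shape < 1: decreasing
    normal_life = 1e-4                                              # constant random failures
    wear_out = weibull_hazard(t_hours, shape=5.0, scale=60000.0)    # shape > 1: increasing
    return early_life + normal_life + wear_out


if __name__ == "__main__":
    for hours in (10, 100, 1000, 10000, 50000, 80000):
        print(f"{hours:>6} h  failure rate = {bathtub_hazard(hours):.6f} per hour")
```

With these assumed parameters the printed rate starts high, drops toward the constant floor, and climbs again at high operating hours, which is the pattern burn-in screening is meant to exploit by removing parts that would fail in the first period.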
Test During Burn-In (TDBI) at the mass production level detects early life failures (ELF) and effectively screens out weak ICs that could fail during early product life.
The ATP TDBI system applies extreme high and low temperatures, high and low voltages, and pattern testing to DRAM modules. The system consists of:
The miniature chamber, which confines temperature cycling to the module being tested so as not to thermally stress the rest of the testing system. This minimizes failures of other testing components, such as the motherboards. In conventional large thermal chambers, failures of non-DRAM-related testing components are a constant occurrence because the whole system is thermally stressed.
Module riser adapters from the motherboard, which allow easy module insertion in production-level volumes.
Multiple temperature sensors, which regulate temperature profiles across a wide testing range of -40°C to 95°C.
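To make the combination of temperature, voltage and pattern stress concrete, here is a minimal sketch of how a burn-in test matrix could be enumerated. It is not ATP's test program. Only the -40°C and 95°C endpoints come from the text above; the voltage corners (a 1.2 V nominal supply with a 5% margin), the pattern names and the function build_test_matrix are hypothetical and used purely for illustration.

```python
from itertools import product

# Temperature corners: the endpoints of the testing range stated in the text.
TEMPERATURES_C = (-40, 95)

# Hypothetical voltage corners: assumes a 1.2 V nominal supply with a +/-5% margin.
VOLTAGES_V = (1.14, 1.26)

# Hypothetical pattern names; a real test program defines its own patterns.
PATTERNS = ("checkerboard", "walking_ones", "march")


def build_test_matrix(temps=TEMPERATURES_C, volts=VOLTAGES_V, patterns=PATTERNS):
    """Return every temperature/voltage/pattern combination as a list of dicts."""
    return [
        {"temp_c": t, "vdd_v": v, "pattern": p}
        for t, v, p in product(temps, volts, patterns)
    ]


if __name__ == "__main__":
    for step in build_test_matrix():
        print(step)
```

Enumerating the full cross product means every pattern is exercised at every temperature and voltage corner, so a weakness that only appears under one specific combination is still covered.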
Through accelerated testing methods such as TDBI, ATP significantly lowers failure rates and extends product service life by making sure that only robust DRAM chips are on the module. Even just 0.01% error on a 99.99% effective device can increase failure rates at the module level and lead to failure in actual usage. TDBI detects and screens out up to 0.01% error to ensure the DRAM modules' reliability.
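A short calculation shows why a 0.01% chip-level error matters at the module level. The sketch below assumes that chips fail independently and that a module fails if any one of its chips fails. Both are simplifying assumptions, and the chip counts per module are illustrative rather than taken from ATP's product data.

```python
def module_failure_probability(chip_reliability, chips_per_module):
    """Probability that at least one chip on the module fails.

    Assumes independent chip failures and that one bad chip fails the module.
    """
    return 1.0 - chip_reliability ** chips_per_module


if __name__ == "__main__":
    chip_reliability = 0.9999  # the 99.99% effective device from the text
    for chips in (1, 8, 16, 36):  # illustrative chip counts per module
        p_fail = module_failure_probability(chip_reliability, chips)
        print(f"{chips:>2} chips: module failure probability = {p_fail:.4%}")
```

Under these assumptions the module-level failure probability grows almost linearly with the number of chips, so a residual error that looks negligible for one IC is multiplied across the module. This is the motivation for screening at both the IC level and the module level.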
Figure 6. The ATP TDBI system
Figure 7. The ATP mini chamber isolates temperature cycling only to the module being tested to make sure that the motherboard and the rest of the testing systems are not thermally stressed.
ATP DRAM Special Features
ATP DRAM modules with industrial temperature rating can endure extreme temperatures ranging from -40°C to 85°C, ensuring long-term stability for systems installed in telecom, industrial and military/aerospace operations where consistent availability and steadfast performance are of critical importance.
Sealed Against the Elements
Conformal coating is a method that coats electronic circuits and modules with a protective layer of the chemical compound parylene that "conforms" to the surface. The film shields against dust, chemicals, extreme temperatures, moisture and corrosion. The coating film is formed by the chemical vapor deposition (CVD) process and, unlike films applied by dipping and spraying techniques, is completely pinhole free because it conforms to any irregular shape, forming a vacuum-like environment that coats all components and points of failure. The parylene film provides exceptional rust prevention; resistance to solvents, acids and alkalis; excellent insulation and dielectric properties; and high tensile and yield strength. (Source: Parylene Engineering)
Protected from Corrosive Sulfur
Silver is the most widely used conductive material for sensitive electronic parts such as electrodes in chip resistors. Among all the metals, pure silver has the highest electrical and thermal conductivity. It is also resistant to oxidation, which is a form of metal corrosion.
When sulfur-containing particles from the environment diffuse into or enter gaps in the resistors and come into contact with silver, the chemical reaction that follows transforms the conductive silver-based electrode into a non-conductor called silver sulfide. When silver corrodes, it loses conductivity; the resistor then becomes an open circuit and will no longer function properly, ultimately leading to module failure.
Rapid climate change, rising global temperatures and continued environmental degradation are all causing a steady increase of contaminants in the air. High sulfur concentrations are found particularly in the following environments: mines, petrochemical plants and polluted industrial areas; places where oil, gas lubricants or fossil fuels are used; geothermal/volcanic sites and hot springs; and waste disposal sites. In some countries, especially where industrial pollution is rampant, the everyday atmosphere has high concentrations of sulfur.
ATP DRAM modules use anti-sulfur resistors (on a project basis) that resist the damaging effects of sulfur contamination, supporting dependable long-term performance and lowering the total cost of ownership by preventing unnecessary downtime and expensive component replacements.
Thicker Gold Fingers and PCB Assembly
Gold fingers are the gold-plated edge connector contacts on the DRAM printed circuit board (PCB) assembly. They are referred to as such because they are long and narrow, resembling fingers. They are gold-plated to help protect the PCB from wear and tear. To ensure the quality of signal transmission between the connector and the ATP DRAM module, ATP uses gold finger plating with a thickness of 30µ", compared with the less than 10µ" typical of competitors' DRAM modules. Additionally, the board itself consists of 6-10 layers of printed circuit board assemblies (PCBA), ensuring better signal quality as well as module durability for many years of reliable use.
Figure 8. Thicker gold fingers and up to 10 layers of PCB assemblies make ATP DRAM modules reliable and durable.
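For readers more used to metric units, the plating thicknesses above can be converted from microinches (µ") to micrometers. The sketch below relies only on the exact definition 1 inch = 25.4 mm; the function name is an illustrative choice.

```python
# 1 inch = 25.4 mm exactly, so 1 microinch = 0.0254 micrometers.
MICROMETERS_PER_MICROINCH = 0.0254


def microinch_to_micrometer(thickness_microinch):
    """Convert a plating thickness from microinches to micrometers."""
    return thickness_microinch * MICROMETERS_PER_MICROINCH


if __name__ == "__main__":
    for thickness in (10, 30):  # the two thickness figures quoted in the text
        um = microinch_to_micrometer(thickness)
        print(f"{thickness} microinches = {um:.3f} micrometers")
```

In metric terms, the comparison in the text is between 0.762 micrometers of gold plating and less than 0.254 micrometers.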
ATP has over 25 years of expertise in manufacturing industrial-grade memory and storage products. Visit the ATP website for more information on which DRAM solutions are best suited for your industrial and embedded requirements, or contact an ATP Distributor/Representative in your area.