Hi James, et al,
Thanks for the nice comments. This is going to be kind of a long reply, so if anyone would like to take this offline, I'd be happy to hear from you. I've been harboring these thoughts about the failings of standard drop tests for many years, but was able to put them into practice while working at Hewlett Packard, where I was responsible for packaging of DeskJet printers for 17 years. Throughout the 1990s and early 2000s, WalMart was one of the largest outlets for DeskJet printers. At one point, WalMart contacted HP saying that we had to pass ISTA 3A testing. Unfortunately, no DeskJet could pass that test, and in order to pass it, the size of the box would have needed to increase, costing millions of dollars annually in both material and shipping costs (all DeskJets were packaged to fit perfectly onto pallets, so any increase in size, even a few millimeters, would have decreased pallet density significantly). Luckily, we had lots of information leading us to believe the tests I helped set were far more effective than 3A.
Instead of using some number of data acquisition recorders to attempt measuring the distribution system, HP took extreme steps to collect every damaged and returned unit, all of which were serialized, and conduct extensive failure analysis. Those results helped guide us to a new way of testing, where we replicated consistent failures and avoided causing damage that rarely or never occurred in distribution. On top of that, we figured out that cushion curves were grossly conservative, and that the damage boundary method, which uses trapezoidal pulses to set the critical acceleration level that cushioning must protect against, was also about 60% off (i.e., if a bare product broke at 50 G's with a trapezoidal input, it wouldn't actually break until 80 G's in a cushioned package, due to the different shape of the shock pulse). Fortunately, WalMart and HP management allowed a more thorough review of the current damage levels within the WalMart supply chain, not only for DeskJets, but also for the printers of our two largest competitors, Epson and Canon. The testing we conducted was far more thorough than any standard, dropping onto every face, edge and corner, and employing 30 unique drop sequences. At the same time, we didn't over-stress the product or packaging since we allowed ourselves to change out packaging after every 4 drops. Ultimately, this was like a combination of HALT/HASS testing but at reasonable levels. The result: we used about half the amount of foam cushioning as either Epson or Canon (comparing foam weight to product weight on a per-pound basis), and yet WalMart confirmed DeskJets had the lowest damage rates of any printer sold. As a result, WalMart not only agreed not to force HP to pass 3A, but presented us an award for having the lowest damage rates of any printer manufacturer.
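For anyone who wants to experiment with the multiple-sequence idea, here is a minimal sketch of one way to lay out such a test plan. It is only an illustration and not the actual HP sequences: the face labels, the function names, the random draw, and the reading of "change out packaging after every 4 drops" as one fresh package per 4-drop sequence are all my assumptions.

```python
import itertools
import random

# Hypothetical labels for a rectangular box: 6 faces, 12 edges, 8 corners.
FACES = ["top", "bottom", "front", "back", "left", "right"]
OPPOSITE = {"top": "bottom", "bottom": "top", "front": "back",
            "back": "front", "left": "right", "right": "left"}


def orientations():
    """Return all 26 impact orientations as tuples of the faces that meet there."""
    faces = [(f,) for f in FACES]
    # An edge is a pair of faces that are not opposite each other.
    edges = [pair for pair in itertools.combinations(FACES, 2)
             if OPPOSITE[pair[0]] != pair[1]]
    # A corner is three faces, no two of which are opposite each other.
    corners = [trio for trio in itertools.combinations(FACES, 3)
               if all(OPPOSITE[a] != b
                      for a, b in itertools.combinations(trio, 2))]
    return faces + edges + corners


def make_sequences(n_sequences=30, drops_per_sequence=4, seed=0):
    """Draw unique drop sequences.

    Every orientation appears at least once as long as the total number of
    drops is at least 26 (it is 120 with the defaults).
    """
    rng = random.Random(seed)
    pool = orientations()
    total = n_sequences * drops_per_sequence
    while True:
        deck = []
        while len(deck) < total:
            deck.extend(rng.sample(pool, len(pool)))  # one shuffled pass over all 26
        deck = deck[:total]
        sequences = [tuple(deck[i:i + drops_per_sequence])
                     for i in range(0, total, drops_per_sequence)]
        if len(set(sequences)) == n_sequences:  # retry on the rare duplicate
            return sequences


if __name__ == "__main__":
    for number, sequence in enumerate(make_sequences(), start=1):
        print(number, ["-".join(o) for o in sequence])
```

Each printed row would be run on a fresh package, so no single package sees more than 4 drops while the product as a whole still sees every face, edge and corner.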
In all, I was able to benefit from using 400 million data acquisition recorders during my years at HP, all in the clever form of DeskJet printers, to discern exactly how best to test. Due to the serial numbers, we even had an idea of the type of supply chain the units passed through, let alone knowing where in the world specific failures occurred or didn't occur. Since leaving HP in 2005, I've refined this test and have successfully employed it for several Fortune 1000 companies. The test has proven itself to even be extremely effective for e-commerce trade.
In 2015, I gave a presentation at the ISTA Transpack conference on using this methodology. The client, a high-tech company, was shipping 10,000 units/month, 100% of which were shipped individually, exclusively by FedEx. This product had broad distribution in the US, Asia, the Middle East and Europe. They had a failure rate of 0.3% worldwide, consistent on every continent, though with slightly lower rates in Asia than the rest of the world. 85% of this damage, or about 0.255% in total, came from one component, but they had never broken this component in either package drops or bare product testing. The product was worth about $700 and even though the overall damage rates appeared very low, it was causing negative customer feedback and they wanted to resolve it. Only by using multiple unique drop sequences did we discover the damage occurred only if the 3rd of 4 drops was onto the bottom face, no matter what the first two orientations were. Also, we discovered 30" drops were needed for this damage to take place. This product weighed 55 lbs and this company, like so many others, followed guidelines that say there's some magical line for product weights, and that if a product steps over that line, then you can decrease the drop heights. As a result, this product had been qualified at 24", even though we all know that conveyor belts remain at the same height for a 15 lb product as they do for a 55 lb product, and that the waist heights of workers and table tops both remain at about 30". Using the multiple sequence concept, we then easily stepped up the drop heights and quickly discovered that components would break in the lab that had never broken in the field. Ultimately, we used high speed video and discovered the cushioning had not been designed effectively. Simply modifying the cushioning, without increasing the box size, eliminated the damage completely.
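To put the 24" versus 30" difference in rough numbers, here is a small sketch using ideal free-fall physics only (no air drag, no rotation, flat impact). The function names are mine, and this says nothing about what the cushion actually transmits to the product; it only compares what arrives at the floor.

```python
import math

G_IN_PER_S2 = 386.09  # standard gravity in inches per second squared


def impact_velocity(drop_height_in):
    """Ideal free-fall impact velocity in inches per second: v = sqrt(2*g*h)."""
    return math.sqrt(2 * G_IN_PER_S2 * drop_height_in)


def drop_energy(weight_lb, drop_height_in):
    """Potential energy at release in inch-pounds: E = W*h."""
    return weight_lb * drop_height_in


if __name__ == "__main__":
    weight = 55  # lb, the product weight from the example above
    for height in (24, 30):
        print(f'{height}" drop: {impact_velocity(height):.0f} in/s, '
              f'{drop_energy(weight, height):.0f} in-lb')
    print("energy ratio:", drop_energy(weight, 30) / drop_energy(weight, 24))
```

Under these assumptions, the 30" drop delivers 25% more energy than the 24" drop for the same package, which is a meaningful gap between the height the product was qualified at and the height it plausibly sees in handling.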
This client had attempted to measure the supply chain, placing expensive data acquisition recorders into their product and then sending the units from one side of the country to the other. After months of measurements and a few dozen trips, not one unit had arrived with the damage they were concerned about. The reason they didn't succeed comes down to statistics. Something called the Binomial Probability Theorem (written by an IBM engineer) predicts that this company would have needed to send close to 950 shipments just to get one unit to fail. No study has ever been that large, and even if it were, it would have been nearly impossible to discern the difference between that single damaged unit and the first 949 non-damaged units. In contrast, it took only 10 units, using 30 sequences, to perfectly replicate the damage that occurred at only 0.255%, and to do so without causing any other damage. This has been repeated for a variety of products and appears to be far more predictive as a qualification test than any test standard, let alone as an effective way to replicate known, consistent field failures. In fact, I would challenge any standards organization to find a product with a known, consistent type of damage and then see if their tests both replicate the failure and don't cause any other failure, let alone attempt to step up the input incrementally to identify the next weakest link in the product/package design.
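For those who want to check the statistics, here is a minimal sketch of the binomial reasoning. It assumes every shipment is an independent trial with the same 0.255% chance of showing this particular damage. The confidence level behind the 950 figure isn't stated above, so the 90% used here is my assumption, as are the function names and the choice of 36 trips to stand in for "a few dozen".

```python
import math


def prob_at_least_one(p, n):
    """Probability of seeing at least one failure in n independent shipments."""
    return 1 - (1 - p) ** n


def shipments_needed(p, confidence):
    """Smallest n for which P(at least one failure) reaches the given confidence."""
    return math.ceil(math.log(1 - confidence) / math.log(1 - p))


if __name__ == "__main__":
    p = 0.00255  # 0.255% field failure rate for the one component
    print("P(at least 1 failure in 36 shipments): ", round(prob_at_least_one(p, 36), 3))
    print("P(at least 1 failure in 950 shipments):", round(prob_at_least_one(p, 950), 3))
    print("Shipments needed for 90% confidence:   ", shipments_needed(p, 0.90))
```

With these assumptions, a few dozen instrumented trips have less than a 1 in 10 chance of capturing even one occurrence of the damage, while roughly 900 to 950 shipments are needed before seeing at least one becomes about 90% likely.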
I attended a webinar on the new Amazon ISTA test methods. I was surprised the presenter said that if you pass the tests, and Amazon later finds you have excessive damage, then you'll be required to redesign the packaging and then pass the same test again. It would seem logical that if the first design passed the tests, but ended up with excessive shipping damage, then the solution would be to change the test, not just modify the packaging. I do understand some of the historical reasons for the tests as they exist, but seeing this email is way too long as is, I won't go into that now.
Hope this helps. I welcome anyone's feedback and thoughts.
Thanks,
Kevin
Kevin Howard
Consultant
www.packnomics.com
www.linkedin.com/in/kevin7howard
cell: 360-606-0235
desk: 360-828-8822