Root-cause failure analysis can significantly reduce the risk of catastrophic equipment failures. K. Bloch: for equipment failure analysis to be effective, our beliefs must align with the facts. He says the evidence you need to solve the problem is most likely available but hidden.
Extreme failure analysis: never again a repeat failure
Apply root-cause failure analysis to recurring reliability problems
K. Bloch, Flint Hills Resources, L.P., Rosemount, Minnesota

The ultimate purpose of this article is to significantly reduce the risk of catastrophic equipment failures. Readers may believe that having been trained in root-cause failure analysis (RCFA) is enough. Why, then, is some equipment allowed to fail repeatedly? Are low-consequence repeat failures discretionary maintenance opportunities, or precursors to more serious reliability and safety problems? What really constitutes effective RCFA? Let's consider real-life experiences to answer these questions.

For equipment failure analysis to be effective, our beliefs (and even the most reasonable of assumptions) must align with the facts. Unfortunately, an extreme failure (an explosion, fire, wreck or crash) often complicates matters by compromising much of the information that we would normally use to determine an accident's cause. The issue with an extreme failure is that although limited physical evidence remains, its consequences are devastating. Indeed, the consequences are so severe that it is unthinkable to take action without being certain that the problem will be solved.

Determining causes with scant physical evidence. Without physical evidence it can be very difficult to look at an effect and determine its cause. In contrast, predicting the effect of an observed cause is a relatively simple task. For example, consider the simple mental experiment1 shown in Fig. 1. First predict the outcome of a melting ice cube on hot concrete. Then look at the photo under it and explain how the water stain got there. Note that you would be mistaken to believe that an ice cube left behind this stain. In situations where conclusive physical evidence has been compromised, it is sometimes easier to pass failures off as acts of sabotage or conspiracy. Worse yet, events leaving behind no physical evidence are often dismissed as an "act of God," and the case is closed.
Fig. 1 Melting ice cubes leave a stain on concrete, but what left the other stain behind?
In reality, the evidence you need to solve the problem is most likely available, but hidden from plain sight. Therefore, identifying a probable cause involves knowing where to find this evidence. Admittedly, resolving who or what left the water behind in Fig. 1 is hardly a matter of great consequence, but in extreme failures the stakes are infinitely higher. Moreover, since there is usually low confidence in the physical evidence left behind by extreme failures, we must turn our attention to their latent, or hidden, causes.

Latent cause identification. Hidden but powerful forces within our organizations allow incremental mistakes to negatively impact safety and reliability. We must identify these latent causes to develop an action plan toward assured failure prevention. Latent cause identification is simplified somewhat by recognizing that a specific sequence of events is shared by many different extreme failures. The "extreme failure life cycle" shown in Fig. 2 represents the relationship between a failure, a repeat failure and an extreme failure. Underlying maintenance and design defects can usually be detected as the probable cause of many controversial failures when this pattern is kept in mind.
Fig. 2 Extreme failure life cycle showing the process a failure goes through to become an extreme failure. Notice the repeat failure's position.

Fact-based conclusions ultimately add more value than unproductive conspiracy and sabotage theory debates. Assigning blame instead of confronting the latent cause is a certain prescription for repeating the same problem. The extreme failure life cycle indicates that when repeat reliability events are disregarded, they eventually become the catalyst for progressively more serious and potentially highly dangerous equipment failures.

Repeat failures tell an important story. The role that a "repeat failure" plays in the life cycle of an extreme failure is of great interest. In a "hindsight is 20/20" world, we often wish we had acted differently after suffering the painful consequences of a decision under our control. Since repeat failures are the likely intermediate step leading up to an extreme failure, they are also reliable warning signals that precede many catastrophic equipment failures. Taking control of repeat failures to consciously prevent a catastrophic accident reinforces the precept that we are in charge of equipment reliability, not victims of its "unpredictable" behavior.

A repeat failure is simply defined as a recurring difficulty that prevents equipment from achieving its anticipated life expectancy. Repeat failures exist because we have perhaps concluded that a particular failure mechanism is more economical to manage than to correct. If allowed to persist, a repeat failure will eventually be perceived as a discretionary, low-risk nuisance with no potential safety or environmental consequence. This defective risk assessment approach is also known as "normalization of deviance" and must be resisted.2 Repeat failures build a reactive work order history in our maintenance management systems.
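That reactive work order history can itself be screened for warning signals. Below is a minimal sketch of such a screen; the field names, the 24-month recurrence window and the sample records are illustrative assumptions, not any specific CMMS schema:

```python
# Sketch: flag candidate repeat failures in a work-order export.
# Field names (tag, failure_mode, date), the window, and the sample
# data are illustrative assumptions, not a real CMMS schema.
from collections import defaultdict
from datetime import date

WINDOW_DAYS = 730  # flag recurrences within ~24 months (assumed threshold)

work_orders = [
    {"tag": "P-101", "failure_mode": "bearing failure", "date": date(2006, 3, 1)},
    {"tag": "P-101", "failure_mode": "bearing failure", "date": date(2007, 1, 15)},
    {"tag": "C-201", "failure_mode": "seal leak",       "date": date(2006, 6, 10)},
]

def repeat_failures(orders, window_days=WINDOW_DAYS):
    """Return (tag, failure_mode) pairs that recur within the window."""
    history = defaultdict(list)
    for wo in sorted(orders, key=lambda wo: wo["date"]):
        history[(wo["tag"], wo["failure_mode"])].append(wo["date"])
    repeats = []
    for key, dates in history.items():
        for earlier, later in zip(dates, dates[1:]):
            if (later - earlier).days <= window_days:
                repeats.append(key)  # recurring difficulty: RCFA candidate
                break
    return repeats

print(repeat_failures(work_orders))  # → [('P-101', 'bearing failure')]
```

A list like this is only a trigger; each flagged pair still needs the "why did it fail?" question answered by an actual RCFA.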
More often than not, the entries abound with useless information such as "bearing replaced," when the entry "bearing failed due to oil starvation resulting from use of pressure-unbalanced constant-level lubricator" would have added real value. Regardless, repeat failure work orders tend to get buried under higher-priority items that represent a more immediate production constraint. Repeat failures are often addressed only as time allows, and without asking why the failure occurred. Knowing why the failure occurred may require a failure analysis, and performing a failure analysis on something viewed as a low-consequence risk takes time away from addressing the immediate production constraints that show up on the daily maintenance plan. In truth, this highly reactive "reliability strategy" is the trademark of a repair-focused organization. While such organizations might claim to be reliability-focused, they exhibit few, if any, of the requisite traits, or do so in name only.

Extreme failures. While we are obviously not condoning repeat failures, extreme failures are much more offensive. Extreme failures are "extreme" in every sense of the word and are differentiated by:

- Being of, or having the potential for, the most extreme consequences
- Leaving behind extremely little physical evidence to readily expose a probable cause
- Being, statistically, extremely improbable.

Also, because precursor repeat failures leave their tracks in the maintenance management system, extreme failures, in retrospect, always appear to be very predictable. Therefore, the maintenance management system contains not only evidence critical for investigating an extreme failure, but also reproof for not taking preventive action. The following examples illustrate the relationship between repeat and extreme failures.

The Hindenburg disaster: an extreme failure. The Hindenburg disaster is one of the most identifiable extreme failures in the history of modern machines.
The circumstances behind this failure still stir considerable controversy and debate, led by the various conspiracy and sabotage assertions that accompany most extreme failures. The purpose of examining it here is to demonstrate how the pattern shown in Fig. 2 applies to all extreme failures, no matter where they occur. Only by associating the extreme failure with its adjunct repeat failure can we determine a fact-based, credible scenario that moves us away from accepting theories fueled by speculation.

The Hindenburg airship was built with a lightweight metal airframe held rigid by a network of 0.125-in.-diameter steel bracing wires under tension. Its outer covering consisted of cotton linen painted with a metallic cellulose acetate butyrate "dope" to repel water and reflect sunlight. Sixteen inflatable bags were filled with 7 MMscf of hydrogen to lift the airship, since the preferred medium (helium) was not available.

Like every machine, the Hindenburg had an operating envelope, and violating its limits would greatly increase the mechanical failure risk. Operating procedures were used to mitigate these failure risks, and the Zeppelin Company's enviable safety record was evidence of an effective training program. Top among these procedures were strict rules governing landing maneuvers to avoid exceeding the bracing wires' 1,000-lb tensile force limit in the tail-to-fuselage section, which absorbs the energy produced while turning the massive airship. Regardless, the Hindenburg's maintenance records contain a history of bracing wire failures in the tail-to-fuselage section.3

The Hindenburg's otherwise perfect transatlantic flight was spoiled by unexpected headwinds that put it 12 hours behind schedule upon its arrival in Lakehurst, New Jersey. Eager to land the ship without further delay, the captain ordered a risky sharp left turn after the wind suddenly changed direction, to quickly reorient the airship's nose back into the wind.
This violated landing procedures that required aborting the landing attempt if the wind shifted direction. Following the procedure was necessary to safely point the airship's nose back into the wind without exceeding the bracing wires' stress limit. After making the sharp left turn, the captain noticed the Hindenburg suddenly becoming tail-heavy. Since procedures also required landing the airship horizontally to avoid damaging the tail fin, the captain released the remaining ton of water from the ship's rear ballast tanks (Fig. 3). Several minutes later, the captain ordered six crewmen to the front of the airship to counterbalance the tail section's continued downward slope. Next, he dropped the anchor ropes from the airship's nose.
Fig. 3 Rear ballast tanks are emptied to avoid hitting the ground after the Hindenburg unexpectedly becomes tail-heavy during landing maneuvers.

On the ground, everything appeared normal. The ground crew grabbed the anchor ropes and began walking the airship to the mooring mast. Before they were able to fasten the ropes to the mast, however, a fire broke out in front of the top tail fin, where evidence of a hydrogen leak (tail-heaviness) existed after the captain deviated from procedures by executing a sharp left turn after the wind changed direction. The entire airship burned from the tail forward, destroying all physical evidence within 32 seconds. Thirty-five of the 97 people on board were killed, along with one ground crew member.

In hindsight, knowing that a repeat failure is somehow involved makes it easy to understand that a bracing wire probably broke upon exceeding its stress limit, just as expected. While this failure had occurred previously, this time the unstable wire penetrated a hydrogen bag and the airship's outer skin, which set off a sequence of events that resulted in one of history's most famous disasters. The repeat failure became extreme through an unlikely combination of contributing factors:

- A very tight schedule, made even tighter by strong headwinds during the flight
- Procedure deviation
- Loss of hydrogen containment
- A failure occurring during a critical phase of the landing procedure
- Light rain, which made the anchor ropes capable of conducting an electrical charge after becoming adequately moistened.

Some may wonder why the Zeppelin Company did not address the Hindenburg's design risk with something more reliable than an administrative control (procedure), such as stress-resistant materials in the vulnerable tail-to-fuselage section. But it is important to consider how the Zeppelin Company's perfect safety record influenced its risk tolerance for bracing wire failures.
In hindsight, their maintenance records show that this repeat failure represented a discretionary maintenance nuisance that could be managed with little inconvenience. Living with the failure mechanism was, therefore, a more economical alternative. Would the choice to sacrifice a wire in the interest of preserving the airship's remaining turnaround time have been considered acceptable if the procedure deviation had not ended in an extreme failure? While the Zeppelin Company's safety record was indicative of a reliability-focused organization, it was, in fact, guilty of making decisions associated with a repair-focused organization.

Inherently safe technology advocates will argue that the use of hydrogen instead of helium is what caused the accident, while minimizing the impact of maintenance practices that led to a loss-of-containment scenario. Whether or not helium was available to Germany in the mid-1930s is not the issue here. In modern times we must operate responsibly, because it is not practical to make similar substitutions. To illustrate, let's turn our attention to industries where OSHA's Process Safety Management (PSM) Standard (29 CFR 1910.119) applies. The standard's purpose is to achieve safe and continuous containment of hazardous substances inherent to the manufacturing process.

Spent-caustic tank explosion. Refineries use caustic (sodium hydroxide) to purify liquefied petroleum gas (LPG). As the caustic reacts with LPG contaminants, its concentration decreases. In other words, it becomes "spent." To maintain the minimum caustic concentration needed to continue the reaction, spent caustic must be periodically removed and replaced with an equal volume of fresh caustic. In one refinery, spent caustic is batched into a 35,000-gallon intermediate cone-roof storage tank. From there the caustic slowly drains to the waste treatment facility (Fig. 4).
This disposal strategy absorbs large slugs of spent caustic that would otherwise upset the biological treatment system.
Fig. 4 A degassing vessel was installed to vent hydrocarbons from spent caustic before it enters the storage tank.

In 2004, a spent-caustic system hazard and operability (HAZOP) study concluded that operator error could result in sending a large volume of LPG directly into the spent-caustic storage tank. Upon entering the tank, the LPG would vaporize and release a propane vapor cloud into the refinery. The history of fugitive vapor releases in refineries is not comforting; vapor releases continue to be responsible for extensive equipment damage and fatalities upon ignition. Therefore, a HAZOP action item was assigned to mitigate the risk of a vapor cloud release from the atmospheric spent-caustic storage tank pressure relief system.

A degassing vessel was retrofitted in front of the spent-caustic storage tank and commissioned on day 1 (in 2005). This system satisfied the HAZOP action item's purpose of removing hydrocarbons from the spent caustic entering the tank. Most of the time, the system would operate in "fill" mode, where spent caustic from the upstream liquid/liquid LPG contact process would stagnate in the degassing vessel while venting hydrocarbons into the refinery flare header. After allowing sufficient time to pass, operators would perform a manual "dump" procedure by opening the discharge valve under nitrogen pressure to drain the vessel's degassed (vented) contents into the tank. Operators were expected to stand by the transfer valve during this manual procedure to verify that the liquid seal above the degassing vessel's discharge nozzle inlet remained intact.

On day 529 (in 2007) the spent-caustic storage tank failed a leak detection and repair (LDAR) test, with over 2,000 ppm hydrocarbon measured exiting the tank's atmospheric pressure relief device (PRD). In compliance with refinery policy, a work order was issued to repair the leaking PRD within 15 days of discovery.
The repair involved tightening the bolts around the PRD to stop the hydrocarbon leak. After the repair, a second LDAR test was performed to confirm that the repair was successful so that the work order could be closed. However, the LDAR test failed again, with over 2,000 ppm hydrocarbon measured exiting the tank after the repair. In response, the results of the failed repair attempt were logged in the maintenance management system and another repair was scheduled. For the second repair, the PRD's sealing gasket was replaced. The LDAR test failed again after the second repair attempt, with about 1,000 ppm hydrocarbon detected leaking out of the tank. The maintenance management system was again updated with the failure information, and a third repair attempt was scheduled. This repair was canceled, however, because a final LDAR test conducted before executing the work showed zero ppm hydrocarbon at the PRD.

On day 621 (in 2007), two contractors working near the tank prematurely shut down their jobs at the same time after a foul odor from an unidentified source invaded their work area. Operators were advised of the situation and immediately responded by investigating the problem. However, the source of the release was not positively identified, because the odor had dissipated by the time they entered the process unit to investigate the complaints. The contractors were allowed to resume working in the area, and the odor did not return.

On day 628 (in 2007), the spent-caustic storage tank exploded suddenly and without warning, shortly after operators initiated the procedure to drain spent caustic from the degassing vessel into the tank. Because the operator had left the valve to attend to another part of the process, there were no injuries or fatalities. However, the accident was severe. It caused the tank to become airborne, spread fire into the unit, and interrupted spent-caustic disposal operations. The damage imposed by the accident (Fig. 5) compromised any physical evidence that would expedite root-cause identification.
Fig. 5 Spent-caustic storage tank after explosion.
Only after the incident were the repetitive LDAR failures and odor complaints recognized as warning signals that hydrocarbon was leaking through the degassing vessel into the tank. Recalling the ignition triangle, this leak satisfied the fuel requirement for an explosion. Although the spent-caustic storage system had operated reliably for 50 years before the accident, the refinery was faced with compelling evidence that elements of a repair-focused culture existed. This culture allowed three repeat failures (hydrocarbon vapor emission events) to pass without investigating why hydrocarbons were entering the tank after the degassing vessel was commissioned.
Fig. 6 Minimum nozzle submergence requirements (feet) to prevent vapor entrainment when draining liquid without a vortex breaker.8
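The check implied by Fig. 6 can be sketched as a simple comparison of operating level against a required submergence. Note that the linear required-submergence rule and its coefficient below are hypothetical stand-ins for the published curve, used only to show the shape of the check:

```python
# Sketch: flag vapor-entrainment risk when draining without a vortex
# breaker. The required-submergence relation is a HYPOTHETICAL linear
# stand-in for the published chart (Fig. 6); the coefficient k is
# illustrative only, not an engineering correlation.
def required_submergence_ft(drain_velocity_fps, k=0.6):
    """Assumed rule: required submergence grows with drain velocity."""
    return k * drain_velocity_fps

def entrainment_risk(level_ft, nozzle_elevation_ft, drain_velocity_fps):
    """True if the liquid seal above the drain nozzle is too shallow."""
    submergence = level_ft - nozzle_elevation_ft
    return submergence < required_submergence_ft(drain_velocity_fps)

# A shallow seal at a high drain rate flags entrainment risk:
print(entrainment_risk(level_ft=2.0, nozzle_elevation_ft=0.5, drain_velocity_fps=4.0))  # → True
print(entrainment_risk(level_ft=5.0, nozzle_elevation_ft=0.5, drain_velocity_fps=4.0))  # → False
```

Run against archived level and flow data, a check of this form is what later revealed that the degassing vessel had been operating outside its submergence limits on every transfer.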
In the post-accident investigation, it was proven that the spent-caustic interface level did not drop below the degassing vessel's drain nozzle at the time of the accident. Therefore, attention shifted to alternative scenarios that would explain how hydrocarbons could penetrate the degassing vessel's liquid seal. By pursuing this thread, the investigation uncovered evidence that an unintended design condition existed, which allowed flare gas and LPG in the degassing vessel to contaminate the spent-caustic storage tank during the draining procedure. Since the degassing vessel was draining without a vortex breaker, it would have to operate according to the nozzle submergence requirements shown in Fig. 6 to avoid entraining hydrocarbon vapor in spent caustic. Archived process data provided evidence that the degassing vessel operated outside of these limits (Fig. 7). This means that hydrocarbon vapor was passing into the tank every time a transfer was made.

The investigation uncovered additional systemic defects that explain how the failure became extreme. These conditions produced an unlikely combination of contributing factors:

- A procedure deviation that made it possible for operators to transfer spent caustic without using nitrogen, which greatly increased the amount of hydrocarbon vapor in the degassing vessel headspace
- The formation of a pyrophoric iron sulfide ignition source on the internal tank roof surface
- Oxygen in the tank.

Both examples strongly reinforce repeat failures' involvement in extreme failures. In each case, a trustworthy and actionable cause emerges, based on evidence associated with a preceding repeat failure. Recall, however, that the goal of a reliability-based organization is to recognize the warning signals and take action before an extreme failure triggers an accident investigation. The final example shows how this can be accomplished by taking appropriate intervention steps upon detecting a repeat failure.
Extreme failure avoidance. A five-stage, barrel-type, hydrogen recycle centrifugal compressor similar to the one shown in Fig. 8 is in service in a large midwestern refinery's platformer unit. The compressor operates at 8,200 rpm and processes a recycle gas flow of about 97 MMscfd. The suction gas is contaminated with ammonium chloride. This situation is conducive to depositing salt on the rotor, which has been the presumed source for a series of recurring vibration events over the compressor's 30-year history.
Fifteen months into a stable run after overhaul, the compressor tripped offline and coasted to a stop without lubrication, following an unintended shutdown of both lube-oil supply pumps. After a warm restart, vibration appeared to be stable and generally very similar to conditions before the trip. Stable operation was interrupted a month later, when the outboard radial bearing vibration suddenly jumped to 1.7 mils. Vibration analysis indicated that subsynchronous vibration had developed due to a fluid instability problem that produced an "oil whirl" pattern. Two months later, the vibration profile deteriorated further into an "oil whip" pattern. This resulted in increasingly unstable and unpredictable vibration spikes exceeding 2 mils.

Reducing the frequency and severity of the vibration spikes was possible only by operating the compressor at speeds below 7,600 rpm. The speed curtailment resulted in a significant platformer unit rate cut. The economics favored shutting down the unit to repair the compressor rather than continuing to operate the machine below its normal running speed. The repair plan was limited to replacing the inboard and outboard floating-ring oil seals and tilt-pad radial bearings. These components were suspected to have been damaged by the accidental loss of lube oil. This diagnosis also provided a rationale for the type of vibration experienced soon after the trip, which indicated a fluid instability problem characterized by oil whip. When the machine was opened for inspection, the maintenance staff was pleased to find radial-bearing and floating-ring oil seal damage consistent with their diagnosis. The damaged components were replaced and the compressor restarted.

Unfortunately, the unstable subsynchronous vibration component remained at speeds above 7,600 rpm upon the compressor's return to service. A second repair, at considerable expense, was scheduled in response to this unfortunate turn of events.
Since the compressor barrel was to be opened for inspection, a complete overhaul was planned. A comprehensive vibration study was performed to narrow down the repair scope. An investigation was launched to determine if a repeat failure could explain this machine's long history of what appeared to be unrelated, but persistent, unstable vibration events at high speed. Although the compressor is equipped with an eddy-current type noncontacting shaft vibration monitoring and shutdown system, "unstable" and "high speed" are words that do not go well together in reliability- and safety-based organizations. Therefore, refinery staff wanted to determine if rotor fouling and other discrete events were somehow related. The most recent of these events was the one just described, in which replacing the damaged components did nothing to correct the unstable vibration.

The vibration study provided the evidence needed to determine both the probable cause and, ultimately, the means of avoiding a repeat failure. Fig. 9 shows how the subsynchronous component adjusts to maintain a constant fractional relationship with the rotor speed. It is "locked in" at a rotating frequency of 3,000 cpm that corresponds to the rotor's first natural fundamental frequency (critical speed). These characteristics apply to flexible rotors that operate above one or more shaft critical speeds.4

The compressor maintenance file contains a history of unstable vibration events at speeds above 7,600 rpm. These events date back to 1985 and consistently appeared within 18 to 24 months after overhaul. References document similar cases involving the aerodynamic excitation of a rotor's first natural fundamental frequency.5 This condition may be experienced with flexible rotors, due to the gradual deterioration of damping properties associated with normal operation after compressor overhaul.6
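The distinction the vibration study relied on can be sketched as a simple classifier. Oil whirl typically tracks roughly 0.40 to 0.48 times running speed, while oil whip locks onto the rotor's first critical regardless of speed; the ratio band and the 5% lock-in tolerance below are common rules of thumb, used here as assumptions rather than firm limits:

```python
# Sketch: classify a subsynchronous vibration component as "oil whirl"
# (tracks ~0.40-0.48x running speed) or "oil whip" (locked near the
# rotor's first critical regardless of speed). The ratio band and the
# 5% lock-in tolerance are assumed rules of thumb, not firm limits.
FIRST_CRITICAL_CPM = 3_000  # rotor first natural frequency, from the case study

def classify_subsynchronous(speed_rpm, subsync_cpm,
                            first_critical_cpm=FIRST_CRITICAL_CPM):
    ratio = subsync_cpm / speed_rpm
    locked_in = abs(subsync_cpm - first_critical_cpm) / first_critical_cpm < 0.05
    if locked_in:
        return "oil whip (locked at first critical)"
    if 0.40 <= ratio <= 0.48:
        return "oil whirl (tracking running speed)"
    return "other subsynchronous source"

# At 8,200 rpm, a component locked at 3,000 cpm behaves like oil whip:
print(classify_subsynchronous(8_200, 3_000))
# A component at ~0.45x of 7,600 rpm behaves like oil whirl:
print(classify_subsynchronous(7_600, 3_420))
```

The constant fractional relationship in Fig. 9 is exactly what the lock-in branch captures: the subsynchronous frequency stays pinned near 3,000 cpm as rotor speed changes.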
Fig. 7 Actual degassing vessel operation compared with minimum nozzle submergence requirements shows vapor entrainment occurring.
Fig. 9 Cascade plot showing a troublesome subsynchronous vibration component "locked in" at 3,000 cpm along with expected synchronous (1X) vibration.

Aerodynamic rotor instability was thus identified as the probable cause of the history of compressor vibration events. This fact-based explanation gave management the confidence needed to approve the investigation team's long-term recommendation, i.e., to address the inherent instability by either redesigning or replacing the compressor. Most importantly, it interrupted an extreme failure's life cycle that might have resulted in unacceptable consequences, no matter what their relative "improbability." Bottom line: Tolerating repeat failures is inconsistent with reliability-focused thinking.

The science of warning signals. As these examples illustrate, rarely will an extreme failure occur based on a single, isolated event. Rather, extreme failures are produced when an existing repeat failure combines with other factors that are statistically unlikely to coexist. By way of analogy, repeat failures keep reappearing like bars on a casino slot machine. Repeat failures are common, predictable events that independently represent low risk. But when all the bars line up, there is a payout. When certain deviations line up with repeat failures, you get a negative payout in the form of an extreme failure. This is the basis for the "coupling" argument introduced by Charles Perrow in his classic Normal Accidents text. Perrow's basic premise is that complex systems are uniquely suited for two or more independent and innocuous conditions to combine at once to produce an unexpected catastrophic event.7 This principle is best reflected in our compressor example, where a flexible rotor (the latent cause) is no problem at all until it interacts with the contributing factors that align within 18 to 24 months of normal operation.
Likewise, the normal deterioration from start-of-run conditions expected after 18 to 24 months would have little impact on the aerodynamic stability of a rigid rotor operating in this specific service. The benefit of recognizing and controlling a repeat failure is that eliminating only one of the coupling requirements can mitigate the risk of an extreme failure. For example, the accidents suffered in the cases of the Hindenburg and the spent-caustic storage tank could have been prevented had the repeat failures (snapped bracing wires and hydrocarbon leakage, respectively) been resolved. It is more rewarding to trigger an investigation that prevents an accident than to investigate the accident you could have prevented.

What can you do? Knowledge about the relationship between repeat failures and extreme failures adds value in two ways. First, it becomes possible to locate the facts we need to filter our beliefs, so that a credible probable cause can be identified when physical evidence has been compromised. Second, it promotes confidence that we control process reliability and safety and will not let it control us. By recognizing warning signals we can take deliberate actions to prevent extreme failures before suffering unacceptable consequences. Since failure and accident prevention are the reliability-based organization's trademark, here are a few suggestions:

Recognize repeat failures. Check reactive work orders and challenge the ones that pop up regularly. Ask yourself, "Do I know why I'm working on this again?" Perform an RCFA if the answer is no.

Follow and enforce procedures. Shortcuts tend to introduce risks that procedures mitigate. Follow procedure steps in order. Before deviating from a procedure, communicate openly when you think there may be a better way to execute it, or when the steps do not make sense or seem out of order.

Use good judgment.
When changing conditions or circumstances interfere with the plan, don't be afraid to enter a holding pattern or call a time out. Stopping a job makes more sense than executing it unsafely.

Operate a near-miss awareness, reporting and investigation program. Ask employees to report things that don't look, sound or smell right. Follow up on employee concerns about unresolved problems. Resolve the issue and communicate findings back to them. Look for trends that indicate a bigger problem looming.

Develop and apply internal RCFA skills. Our biggest opportunity lies with correcting small failures to avoid the bigger ones. Ultimately, no time will be saved unless RCFA is performed. RCFA triggers must be linked to repeat failures. Many organizations tier their RCFA levels according to safety, environmental and economic thresholds. Reserve a category for repeat failures and measure improvement (reduction) over time. The maintenance staff will appreciate the reduction in backlog and in their frustration over experiencing the same problems. You also benefit in knowing that you are systematically mitigating the risk of an improbable, yet far too costly, extreme failure (PSM incident).

Communicate and incorporate lessons learned. Lessons obtained by investigating repeat failures extend far beyond the equipment type on which they occur. They will benefit different units, areas, sites and even industries. Maximizing value from a single failure involves communicating lessons learned effectively throughout an organization. Lessons learned can also be obtained from numerous outside resources, such as the annual NPRA Safety Conference (www.npra.org), the semiannual API/NPRA Operating Practices Symposium (www.api.org), the AIChE Spring National Meeting (www.aiche.org), and the US Chemical Safety Board (www.csb.gov).

Above all, remember that the machines we build perform and respond exactly as expected under the conditions to which they are exposed.
Rarely, if ever, is the cause of a failure out of our control. Be convinced that answers and solutions will come to those who act on their responsibility to explain unacceptable equipment performance.

LITERATURE CITED
1 Taleb, N. N., The Black Swan, Random House, New York, New York, p. 196 (ISBN 978-1-4000-6351-2), 2007.
2 Bloch, K. and S. Williams, "Normalize Deviance at Your Peril," Chemical Engineering, 111, No. 5, pp. 52–56, 2004.
3 "The Hindenburg Airship," Seconds From Disaster, Yavar Abbas, The National Geographic Channel, November 15, 2005.
4 Eisenmann, R., Sr. and R. Eisenmann, Jr., Machinery Malfunction Diagnosis and Correction, Prentice-Hall, Inc., Upper Saddle River, New Jersey, p. 436 (ISBN 0-13-240946-1, out of print), 1998.
5 Nicholas, J. C. and J. Kocur, "Rotordynamic Design of Centrifugal Compressors in Accordance with New API Stability Specifications," Proceedings of the Thirty-Fourth Turbomachinery Symposium, Turbomachinery Laboratory, Texas A&M University, College Station, Texas, pp. 25–34, 2005.
6 Eisenmann, op. cit., p. 436.
7 Perrow, C., Normal Accidents: Living With High-Risk Technologies, Princeton University Press, Princeton, New Jersey, p. 7 (ISBN 0-691-00412-9), 1999.
8 Lieberman, N., Troubleshooting Refinery Processes, PennWell Publishing Co., Tulsa, Oklahoma, p. 272, 1981.
The author
Kenneth Bloch is lead process reliability engineer at Flint Hills Resources' Pine Bend Refinery in Rosemount, Minnesota. He is responsible for mitigating and investigating process-governed failures on refinery assets. A Certified API 510 Inspector, Mr. Bloch publishes articles on equipment failure analysis, life cycle extension, and reliability improvement in Hydrocarbon Processing and Chemical Engineering magazines, and is a regular participant and speaker at the semiannual API/NPRA Operating Practices Symposium and annual NPRA National Safety Conference. He holds a BS degree (honors) from Lamar University in Beaumont, Texas.