Managing Thermal Stress: Cooling System Maintenance for Longevity

If you’ve ever stood on a patch of sun-baked bluestone gravel and watched a four-million-dollar piece of infrastructure slowly roast itself to death while its cooling fans screamed at full tilt, you’ll understand why I don’t trust my ears anymore.


It was a Tuesday afternoon in mid-August, the kind of sweltering, airless day where the heat radiating off the gravel could send a person to hospital. It was a maintenance day, and I was doing a walk-through of Substation 4, a critical node in our grid, carrying a heavy, rubberized thermal imaging camera that kept sticking to my sweaty palms. There were no alarms ringing in the control room. The SCADA system was happily reporting that the 50 MVA transformer was operating within normal parameters. The massive banks of cooling fans were spinning so fast they looked ready to levitate, kicking up a storm of dust and dry grass.

By all conventional metrics, everything was fine.

But when I lifted the thermal imager to my eye, the screen told a terrifyingly different story. Instead of a smooth, even gradient of green and yellow moving across the radiator fins, the screen looked like a bruised plum.

The top half of the transformer’s main tank was a glaring, angry crimson. But the bottom half of the second radiator bank—the very fins the fans were so desperately trying to cool—was dark purple. It was dead cold.

I remember lowering the camera and just staring at the colossal grey steel box, listening to the deafening 60Hz hum vibrating in my chest. I felt a cold knot form in my stomach despite the 104-degree heat.

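The fans were doing their job flawlessly, blowing air across the radiators. But the hot oil wasn't flowing through those radiators, so there was no heat there for the air to carry away.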

We talk a lot in this industry about "thermal stress" and "longevity." We write thick, sterile manuals about it. But in that moment, the reality of thermal stress wasn't a graph on a PowerPoint slide; it was a living, breathing emergency happening right in front of me.

Here’s the reality of how a transformer dies: it doesn't usually happen in a spectacular lightning strike or a sudden explosion. It dies a slow, suffocating death from the inside out. The lifeblood of these massive machines is the mineral oil circulating inside them, and the heart is the cellulose paper wrapping the copper windings. When that paper gets too hot, it degrades. Think of an old paperback book left in a baking hot attic for twenty years—the pages turn yellow, they get brittle, and eventually, they crumble to dust.

Once that paper insulation crumbles inside a transformer, you get an internal short, and it's game over. You can’t "uncook" the paper.
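The paperback analogy has a standard numerical counterpart. As a rough sketch, the aging acceleration factor used in the IEEE C57.91 loading guide compares how fast thermally upgraded paper ages at a given winding hot-spot temperature against a 110 °C reference. The function name and the sample temperatures below are illustrative assumptions, not readings from Substation 4.

```python
import math

REFERENCE_HOT_SPOT_C = 110.0  # reference hot spot for thermally upgraded paper
AGING_CONSTANT = 15000.0      # constant of the Arrhenius-style aging model


def aging_acceleration_factor(hot_spot_c: float) -> float:
    """Relative insulation aging rate at hot_spot_c versus the 110 C reference."""
    reference_k = REFERENCE_HOT_SPOT_C + 273.0
    hot_spot_k = hot_spot_c + 273.0
    return math.exp(AGING_CONSTANT / reference_k - AGING_CONSTANT / hot_spot_k)


# Illustrative hot-spot temperatures only (assumed, not measured).
for temp_c in (98, 110, 120, 130):
    factor = aging_acceleration_factor(temp_c)
    print(f"{temp_c} C hot spot: paper ages {factor:.2f}x as fast as at 110 C")
```

Under this model, a hot spot running about 7 °C above the reference ages the paper roughly twice as fast, which is why trapped heat that never trips an alarm can still quietly eat years of insulation life.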

For months, our maintenance crew had been checking off boxes. Are the fan belts tight? Yes. Are the fan motors drawing the correct current? Yes. Did we power-wash the cottonwood seeds and dirt off the radiator fins? Yes. We were treating the cooling system like an air conditioner hanging out the window of a living room. We thought that as long as we were blowing air at the metal, we were managing the heat.

But standing there with my thermal camera, I had a massive "Aha!" moment that permanently changed how I view equipment maintenance. A transformer’s cooling system isn't an air conditioner; it’s a circulatory system.

If the arteries—in this case, the narrow channels inside the radiator fins—are clogged with oxidized oil sludge, it doesn't matter how hard the lungs are breathing. The heat stays trapped inside the core, silently cooking the life out of the insulation.

I called the crew in and we opened the transformer's breaker to take it out of service. We drained 70 litres of oil so we could remove the radiator from the top, then opened the top cover and lowered a long, insulated, thoroughly cleaned stick into the tank. It came back showing up to 2 inches of thick sludge at the base. We then drained all the oil, cleaned out the sludge, removed every radiator and flushed the inside of each one with pressurized oil. Finally, we reconnected the radiators and refilled the transformer with new, filtered oil.

I stopped looking at cooling systems as a collection of parts and started treating them as a fragile thermal ecosystem.

So, what did I actually take away from nearly losing an asset on a random Tuesday afternoon? How do you actually manage thermal stress for longevity, rather than just waiting for things to break?

Here are the three shifts in perspective that saved our grid, and frankly, they apply to managing any complex, high-stress system in your life or business:

1. Stop Trusting the Noise; Look for the Flow

Just because a system is making a lot of noise and expending a lot of energy doesn't mean it's actually doing the work. Our fans were screaming, but the heat wasn't moving. In your own operations, stop measuring the effort and start measuring the transfer. For a transformer, this means mandatory, routine thermal imaging under heavy load. Don't just look to see if the radiators are hot; look at the temperature differential (the Delta-T) between the top and the bottom. If the bottom is suspiciously cool while the top is baking, your flow is blocked.
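One way to turn that Delta-T check into a repeatable routine is to compare each radiator bank against its neighbours rather than against a fixed number. The following is a minimal sketch, assuming you record a top and bottom surface temperature per bank from the thermal image. The class name, the bank labels, the readings, and the 15 °C margin are all hypothetical and would need tuning for a real asset.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class RadiatorReading:
    bank: str
    top_c: float     # surface temperature near the top header
    bottom_c: float  # surface temperature near the bottom header

    @property
    def delta_t(self) -> float:
        return self.top_c - self.bottom_c


def flag_blocked_banks(readings, margin_c=15.0):
    """Return banks whose Delta-T exceeds the median Delta-T by more than margin_c.

    margin_c is an assumed threshold for illustration, not a standard value.
    """
    typical = median(r.delta_t for r in readings)
    return [r.bank for r in readings if r.delta_t - typical > margin_c]


# Made-up readings: bank-2 is hot at the top but suspiciously cool at the bottom.
readings = [
    RadiatorReading("bank-1", top_c=78.0, bottom_c=58.0),
    RadiatorReading("bank-2", top_c=80.0, bottom_c=41.0),
    RadiatorReading("bank-3", top_c=77.0, bottom_c=57.0),
]
print(flag_blocked_banks(readings))
```

Comparing against the median of the other banks means the check adapts to load and ambient temperature on the day, since a healthy Delta-T shifts with both.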

2. Maintain the Fluid, Not Just the Hardware

We spent all our time washing the outside of the radiators and ignoring the chemistry happening inside. The sludge that blocked our system didn't appear overnight; it was the result of years of degrading oil chemistry. You have to religiously monitor your Dissolved Gas Analysis (DGA) and oil quality metrics. If your oil is oxidizing, your cooling system is dying from the inside out, no matter how shiny the fans are on the outside. Treat the internal chemistry with the same urgency you treat a broken fan blade.
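Oxidizing oil usually announces itself in the lab reports long before sludge forms: acidity creeps up and interfacial tension falls. As a small sketch of trend-watching, the helper below counts how many of the most recent consecutive samples moved in the bad direction. The field names and sample values are invented for illustration, and any action threshold should come from your own oil maintenance guide.

```python
def worsening_streak(samples, key, rising_is_bad=True):
    """Count the most recent consecutive samples that moved in the bad direction."""
    values = [sample[key] for sample in samples]
    streak = 0
    # Walk backwards from the newest sample, comparing each to the one before it.
    for i in range(len(values) - 1, 0, -1):
        if rising_is_bad:
            worse = values[i] > values[i - 1]
        else:
            worse = values[i] < values[i - 1]
        if not worse:
            break
        streak += 1
    return streak


# Hypothetical annual oil test results, oldest first.
oil_history = [
    {"acidity_mg_koh_g": 0.03, "ift_mn_m": 40},
    {"acidity_mg_koh_g": 0.03, "ift_mn_m": 38},
    {"acidity_mg_koh_g": 0.05, "ift_mn_m": 33},
    {"acidity_mg_koh_g": 0.08, "ift_mn_m": 28},
    {"acidity_mg_koh_g": 0.12, "ift_mn_m": 24},
]

acid_streak = worsening_streak(oil_history, "acidity_mg_koh_g", rising_is_bad=True)
ift_streak = worsening_streak(oil_history, "ift_mn_m", rising_is_bad=False)
print(acid_streak, ift_streak)
```

A long streak on both measures at once is the kind of pattern worth escalating even when every individual result still sits inside its limit.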

3. Redefine What "Normal" Looks Like

The SCADA system didn't flag the issue because the average top-oil temperature hadn't yet crossed the critical alarm threshold. But average is a dangerous word. The core was experiencing severe, localized thermal stress that the sensors couldn't see. You have to establish a baseline of what healthy operation looks like for each specific asset under different loads, and investigate anything that deviates from that unique baseline, even if it hasn't tripped an alarm yet.
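A per-asset baseline can be as simple as remembering what top-oil rise over ambient this particular transformer normally shows at each load level, then flagging readings that sit well above that. The sketch below assumes a history of (load percent, top-oil rise in °C) pairs from known-healthy operation. The 10% load bands, the three-sigma rule, and the one-degree floor are illustrative choices, not an established method.

```python
from collections import defaultdict
from statistics import mean, stdev


def build_baseline(history, band_width=10):
    """Group healthy (load_pct, rise_c) samples into load bands: band -> (mean, stdev)."""
    bands = defaultdict(list)
    for load_pct, rise_c in history:
        bands[int(load_pct // band_width)].append(rise_c)
    # stdev needs at least two samples, so thinner bands are left out.
    return {b: (mean(v), stdev(v)) for b, v in bands.items() if len(v) >= 2}


def deviates(baseline, load_pct, rise_c, band_width=10, k=3.0, min_sigma=1.0):
    """True if rise_c is more than k sigma above this asset's norm for the load band.

    Returns None when there is no baseline for that load band yet.
    """
    band = int(load_pct // band_width)
    if band not in baseline:
        return None
    mu, sigma = baseline[band]
    return rise_c > mu + k * max(sigma, min_sigma)


# Made-up healthy history for one transformer, all in the 70-79% load band.
healthy = [(72, 31.0), (75, 32.5), (78, 33.0), (74, 31.8)]
baseline = build_baseline(healthy)

print(deviates(baseline, 76, 33.0))  # close to this asset's usual rise
print(deviates(baseline, 76, 39.0))  # well above it, though maybe below any alarm
print(deviates(baseline, 40, 20.0))  # no history at this load yet
```

The point of the design is that the comparison is against this asset's own history at a similar load, so a reading can be flagged long before it reaches a fleet-wide alarm setpoint.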

I still walk that same bluestone gravel at Substation 4. The 50 MVA transformer is still humming away, carrying a load of up to 2500 amps at 11 kV. But I don't listen to the fans anymore. I watch the heat.

We spend so much time in our professional lives reacting to the loud, screaming alarms of things that are already broken. But I have to ask you: what "perfectly functioning" system in your world is quietly burning itself out from the inside, just waiting for you to look a little closer?