Research Papers

Thermal Performance and Efficiency of a Mineral Oil Immersed Server Over Varied Environmental Operating Conditions OPEN ACCESS

Richard Eiland

Department of Mechanical and
Aerospace Engineering,
University of Texas at Arlington,
P.O. Box 19023,
Arlington, TX 76013
e-mail: richard.eiland@mavs.uta.edu

John Edward Fernandes

Department of Mechanical and
Aerospace Engineering,
University of Texas at Arlington,
P.O. Box 19023,
Arlington, TX 76013
e-mail: john.fernandes@mavs.uta.edu

Marianna Vallejo

CH2M,
2020 SW 4th Avenue, Street 300,
Portland, OR 97201
e-mail: marianna.vallejo@ch2m.com

Ashwin Siddarth

Department of Mechanical and
Aerospace Engineering,
University of Texas at Arlington,
P.O. Box 19023,
Arlington, TX 76013
e-mail: ashwin.siddarth@mavs.uta.edu

Dereje Agonafer

Fellow ASME
Department of Mechanical and
Aerospace Engineering,
University of Texas at Arlington,
P.O. Box 19023,
Arlington, TX 76013
e-mail: agonafer@uta.edu

Verrendra Mulay

Facebook Inc.,
Menlo Park, CA 425081
e-mail: vmulay@fb.com

1Corresponding author.

Contributed by the Electronic and Photonic Packaging Division of ASME for publication in the JOURNAL OF ELECTRONIC PACKAGING. Manuscript received February 16, 2017; final manuscript received August 1, 2017; published online September 7, 2017. Assoc. Editor: Baris Dogruoz.

J. Electron. Packag 139(4), 041005 (Sep 07, 2017) (9 pages) Paper No: EP-17-1020; doi: 10.1115/1.4037526 History: Received February 16, 2017; Revised August 01, 2017

Complete immersion of servers in dielectric mineral oil has recently become a promising technique for minimizing cooling energy consumption in data centers. However, a lack of sufficient published data and long-term documentation of oil immersion cooling performance make most data center operators hesitant to apply these approaches to their mission critical facilities. In this study, a single server was fully submerged horizontally in mineral oil. Experiments were conducted to observe the effects of varying the volumetric flow rate and oil inlet temperature on thermal performance and power consumption of the server. Specifically, temperature measurements of the central processing units (CPUs), motherboard (MB) components, and bulk fluid were recorded at steady-state conditions. These results provide an initial bounding envelope of environmental conditions suitable for an oil immersion data center. Comparing with results from baseline tests performed with traditional air cooling, the technology shows a 34.4% reduction in the thermal resistance of the system. Overall, the cooling loop was able to achieve partial power usage effectiveness (pPUECooling) values as low as 1.03. This server level study provides a preview of possible facility energy savings by utilizing high temperature, low flow rate oil for cooling. A discussion on additional opportunities for optimization of information technology (IT) hardware and implementation of oil cooling is also included.

Continually increasing demand for information technology (IT) applications and services has provided sustained growth and interest in data centers. The large amounts of energy consumed by data center facilities have placed a significant emphasis on the energy efficiency of the building’s overall operation. One area of importance is the cooling energy required. Data center cooling is in place to maintain and control safe operating temperatures for the IT equipment the facility houses. Traditional approaches to cooling data centers use air as the primary cooling medium. Heat rejected from IT hardware is absorbed by the air and either rejected to the outside ambient, mixed with incoming fresh air, or cooled through refrigeration processes. These techniques are mature and well documented, with safe environmental conditions established by ASHRAE TC 9.9 [1]. However, with heat densities continuing to rise at both the component and rack levels, we may be approaching the limits of air cooling, especially for high power racks, and as such, alternative methods are sought.

Both direct and indirect forms of liquid cooling offer many advantages over conventional air cooling, such as higher heat capacities and lower transport energy requirements. Indirect methods using water as a cooling medium through cold plates or rear door heat exchangers demonstrate the benefits of a liquid cooling strategy [2,3]. Water cooling may also allow increased efficiency through use of higher temperature fluids and possible use of waste heat for other applications [4]. Cold plates have been a long-standing method of bringing water cooling to high powered devices, as with the thermal conduction module of the early 1980s [5]. Even today, there is continued interest in such applications and in making this old approach more dynamic for the constantly changing requirements of next-generation components [6]. However, in most applications, the use of cold plates still requires air cooling for a portion of the components within servers. Significant developments have been made to bring water cooling to a complete cooling solution (i.e., remove 100% of the IT heat) for even the most powerful supercomputers [7]. Although additional piping infrastructure required within the server may increase the costs and system complexity of this technology, the cost per performance may still be equal to or better than that of alternatives. Ample data and guidelines for implementing water cooled data center environments are available from sources such as ASHRAE TC 9.9 [8].

Direct immersion of electronic equipment offers a singular cooling solution in which the entirety of a server may be cooled by a single medium. This may provide simplicity and ease in planning and implementation of a total solution. A legacy approach to full liquid immersion cooling is the use of dielectric fluorocarbon refrigerants in pool boiling applications. These fluids are extremely good electrical insulators and have moderate thermal characteristics [9]. Additionally, their low boiling points make them suitable for two-phase flow applications capable of removing large heat densities [10,11]. These high heat transfer rates can be further enhanced through special coatings that increase boiling rates [12,13]. However, this technology faces additional challenges, cost among them, that must be overcome before it will see wide application. Current research in this area seeks to best select fluids for these purposes [14].

Compared to air, many mineral oils have a volumetric heat capacity roughly 1200 times greater. These improved thermal properties, along with their dielectric nature, make mineral oils a possible alternative for data center applications. Mineral oils have long been used as heat transfer fluids, with especially large adoption in power delivery applications such as high voltage transformers [15]. The performance of transformer oils as heat transfer fluids may be significantly improved with the addition of nanoparticles, as recently shown in Ref. [16]. A small fraction (<0.100% by weight) of diamond nanoparticles added to mineral oil was shown to increase the thermal conductivity of the base fluid by 40–70% while having minimal detrimental impact on flow properties such as viscosity.

The literature on oil immersion as a cooling technology for data centers is limited. Recent media attention has provided general power usage effectiveness (PUE) values [17,18], but the details of the operation are absent. Prucnal [19] provides a helpful overview of general operating benefits of oil-based data centers, mentioning the possibility of using facility chilled water up to 30 °C compared to 7–13 °C for traditional air-based systems. A fairly extensive account is provided by Patterson and Best [20], which showed a 36% improvement in thermal resistance of an oil immersion system compared to an air-cooled counterpart with no adverse mechanical effects. Additionally, this work showed successful cooling with oil inlet temperatures up to 43 °C and cooling PUEs in the 1.02–1.03 range. However, no discussion regarding the volume flow rates used was provided. With only limited data available, a large knowledge gap remains in the industry regarding environmental requirements to operate an oil immersion cooling facility.

The primary goal of this study is to establish general operating conditions, in terms of volumetric flow rate per server and oil inlet temperature, which can be expected for safe operation of servers in an oil immersion cooling configuration. For this work, an Intel-based Open Compute server, Fig. 1, was experimentally tested and characterized. Initially, the server was operated in the standard air-cooled configuration with internal server fans to establish baseline operating performance and component temperatures. Next, the server was removed from its standard chassis and placed in an insulated acrylic container and submerged in white mineral oil. Data were recorded for the duration of testing; however, only steady-state data are reported here with some comments made regarding transient performance. Although the complete experimental setup is not entirely reflective of an actual data center facility implementation in fully built-out conditions, the data provides strong evidence to support continued research in this area.

Air Testing Setup.

To establish a baseline for comparison, the server is initially tested in the standard air-cooled configuration as shown in Fig. 1. Detailed descriptions of the Intel-based Open Compute server design can be found in Refs. [21] and [22]. The server motherboard (MB) contains two CPUs each with a rated thermal design power of 95 W. These components represent the primary heat sources in the system and are cooled by two extruded aluminum heat sinks. The key features, which enable efficient air cooling, are four 60 × 60 × 25.4 mm fans and an air duct that directs air flow over the temperature critical components (i.e., processors and memory). The internal server fans are controlled by a fan speed control algorithm native to the server. This fan speed control operates by adjusting the fan speeds using pulse width modulation (PWM) to achieve a target CPU die temperature. In this manner, the fan speeds and die temperatures typically oscillate with some over- and under-shoot of a targeted value. An average value over the duration of the test cycle is reported. To test the server in the standard configuration, it is allowed to draw air from the ambient laboratory environment for cooling. A synthetic computing workload is applied and internal monitoring tools are used for data collection, as discussed in the Results and Discussion section.

Oil Testing Setup.

The oil immersion test setup used in the present investigation consists of the major components shown in Fig. 2 and discussed in the following. A single Intel-based Open Compute server motherboard and power supply unit (PSU) are placed horizontally in an immersion tank. Some modifications are made to the server to enable testing in the available container and to enable operation in oil. The server motherboard and PSU printed circuit boards are removed from their respective metal chassis to reduce their size to fit in the available test container. In addition, the air duct is removed and the internal fans of both the server and PSU are disconnected and removed. The hard disk drive (HDD), incapable of operating when submerged, is placed outside the tank, although other options exist that leave the HDD intact. The HDD is cooled by natural convection only and represents a small portion of the total IT heat load that is not removed by oil (approximately 3.5%). Of additional note, the grease-based thermal interface materials applied on the CPUs and chipset are kept in place for the duration of testing. Some sources suggest that oils and greases conflict and may cause the thermal grease to dissolve, leading to contamination of the oil and possible fouling in the system. However, the authors’ background study found this information to be merely anecdotal, so this guidance was not followed in the present experiments.

The tank is made of one-half inch thick acrylic of inner dimensions 45.7 × 36.8 × 19.1 cm (18.0 × 14.5 × 7.5 in) and is wrapped in 2.54 cm (1.0 in) thick insulation. A total of four 1.27 cm (0.5 in) diameter ports are tapped into the container, two serving as inlets to the tank and two as outlets. The inlet and outlet ports are strategically placed with ball valves to allow for the impact of flow patterns through the system to be studied. One set of inlet and outlet is directly in-line with the center of the base of the CPU heat sinks. When only these two ports are open, it will likely represent the best case cooling scenario for the system since the highest velocity flow is directed straight across the CPU heat sinks. The second inlet/outlet port pair is in-line with the power supply unit, removed from the main motherboard components. It is predicted that this flow configuration will prompt higher component temperatures (especially CPUs) since their cooling will be primarily induced by bulk fluid motion. These two flow configurations are used independently during testing and are termed “MB flow path” and “PSU flow path,” respectively, as shown in Fig. 3. The rationale behind pursuing the PSU flow path is to investigate the system configuration where the inlet flow is confined to directed flow over the secondary heat sources, and the underlying heat transfer mode is a conduction dominated bulk transfer mode.

A small magnetically driven centrifugal pump with a 12 V direct current (DC) brushless motor located on the outlet side of the tank circulates fluid through the system. A DC power supply delivers a constant voltage signal to the pump. The voltage delivered and current drawn by the pump are logged by a workstation in 2 s intervals. The pump is equipped with a four-pin connector, which enables pulse width modulation for speed control from 1300 to 4500 rpm. A function generator is used to control the speed of the pump to achieve the desired volumetric flow rate to the inlet of the tank.

Heat is rejected from the oil to the laboratory environment via two 240 mm radiators constructed of two-pass, single row brass tubes with louvered copper fins. Each radiator is equipped with two 120 mm 12 V DC brushless motor fans. A DC power supply delivers a constant voltage signal to the fans. The voltage delivered and current drawn by the fans are logged by the workstation. The fans are equipped with four-pin connectors, which enable PWM speed control from 1650 to 5100 rpm. A function generator is used to control the speed of the fans to achieve the desired inlet temperature to the tank containing the server.

An Omega FLMH-1402AL in-line flow meter is used to record the volumetric flow rate of the oil. The flow meter has a scale accuracy of ±4%. The meter is rated for oils with a specific gravity of 0.873. This leads to a correction factor of 1.0% for the oil used in the current system. The flow meter is placed midway between the outlet and inlet sections of the immersion tank. As this is an analog device, it is visually monitored throughout a given test, with the values at the conclusion of the test reported. Although mineral oils have relatively high volumetric expansion rates (β = 0.0007 1/K), it is assumed that the volumetric flow rate is conserved from the point of flow measurement to inlet of the immersion tank. In actual testing, the fans on the second radiator in the loop were not needed to maintain inlet temperatures and it is assumed that temperature change between flow meter and tank inlet is minimal and any resulting change in density is negligible.

An Omega Universal Serial Bus data logger is used to record the laboratory ambient air conditions in 1 min intervals with an accuracy of ±1.0 °C.

The oil inlet temperature of the tank is measured using T-type thermocouples placed in the stream of the flow, 25 mm (1 in) upstream of the inlet ports to the tank. These thermocouples have an accuracy of ±0.5 °C and are used to maintain the tank inlet temperature to within ±0.5 °C of the desired value. Similar thermocouples are placed 25 mm (1 in) downstream of the outlet ports to measure the temperature difference of the oil across the server.

The material specifications of the technical grade white mineral oil are given in Table 1. In total, 11.4 L (3 gal) of oil are used to fill the system (tank, radiators, and tubing). This allows the tank to be filled to a height of 6.99 cm (2.75 in), completely submerging the server to just above the top of the extruded heat sinks. This leaves an air column height of 12.1 cm (4.75 in) between the top free surface of the oil and the lid of the container. The system piping consists of 1.27 cm (0.5 in) inner diameter vinyl tubing covered with 1.27 cm (0.5 in) foam insulation. This size tubing allows the system to maintain low velocity, laminar flow, reducing system pressure losses.

T-type thermocouples are attached with epoxy to the surface of 11 components across the motherboard. The components monitored represent a range of component types (chipsets, voltage regulators (VRDs), dual in line memory module (DIMM) chips) to provide a survey of the thermal performance of the two cooling methods being tested. Of these components, the voltage regulators have the most sensitive thermal requirements, with a maximum safe operating temperature of 85 °C. If VRDs exceed this limit, they begin to throttle and degrade the overall compute performance of the system. Thermocouples are also placed in three bulk fluid locations, as well as in two locations in the air gap above the oil to help establish when steady-state conditions are reached. These thermocouples have error limits of ±1.0 °C and are connected to a data acquisition system, which records their values in 5 s intervals.

The total server power consumption is measured using a Yokogawa CW121 power meter by connecting voltage and current clamps to the incoming power feed to the server. Power consumption data are recorded in 5 s intervals and logged on the workstation.

Compute Load Generation and Data Collection.

To generate a computational workload on the server, a synthetic load generation program is employed. The lookbusy software tool allows users to set predefined CPU, memory, I/O, and networking utilization targets [23]. For this study, a workload of 75% CPU utilization with 20% memory allocation is used as design conditions. This level of workload represents high activity that would be desired in operational service and generates near maximum heat output for the server. The native Linux operating system monitoring tools mpstat and free are used to record CPU and memory utilization levels, respectively. An internal diagnostic tool provided by the motherboard manufacturer reports data from dynamic temperature sensors in the processors, as well as rpm readings from Hall sensors in the internal server fans.

The primary heat-generating components and main driver for optimizing thermal management in the particular server under study are the CPUs. As such, CPU die temperature is the main metric of concern for this study; however, understanding and monitoring of other motherboard components is important to the overall health of the server system.

Test Procedure.

The system oil was set to a desired inlet volumetric flow rate via speed control of the pump. By adjusting the speed of the radiator fans, a desired oil inlet temperature could be achieved. This process was typically iterative and required several adjustments to achieve steady-state conditions. The values reported here are averages over the course of at least 1 h of steady-state conditions, defined as within ±0.1 lpm and ±0.5 °C of the targeted volumetric flow rate and oil inlet temperature, respectively. Since typical real-time workloads in data centers are not constant over long durations, the values presented here may represent worst-case conditions given the transient times required to achieve steady state. Figure 4 shows several temperatures over the course of a test. These are the oil MB inlet temperature, which was a defining parameter of the test case; the MB outlet temperature; the temperature in the bulk fluid near the center of the motherboard at a depth of about 2.54 cm (1 in) from the free surface of the oil; and the temperature of the air gap between the top surface of the oil and the lid of the test container. It can be seen that reaching steady-state values can take 30 min or more depending on the initial condition. Average steady-state values over a period of at least 1 h are reported henceforth.
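The steady-state criterion above can be expressed as a simple check on logged data. The following is a minimal sketch, not the authors' procedure: it assumes that both flow rate and inlet temperature are available as equally spaced samples at the 5 s logging interval (in the actual setup the flow meter was read visually), and the function and constant names are hypothetical.

```python
# Tolerances and window length taken from the text; the 5 s sampling of BOTH
# signals is an assumption made for illustration.
SAMPLE_PERIOD_S = 5
WINDOW_S = 3600          # at least 1 h of steady-state data
FLOW_TOL_LPM = 0.1
TEMP_TOL_C = 0.5


def steady_state_start(flow_lpm, inlet_c, flow_target, temp_target,
                       window_s=WINDOW_S, period_s=SAMPLE_PERIOD_S):
    """Return the index of the first sample that starts a full window in which
    every sample is within tolerance of both targets, or None if no such
    window exists in the data."""
    needed = window_s // period_s   # number of consecutive samples required
    run = 0                         # length of the current in-tolerance run
    for i, (flow, temp) in enumerate(zip(flow_lpm, inlet_c)):
        in_band = (abs(flow - flow_target) <= FLOW_TOL_LPM
                   and abs(temp - temp_target) <= TEMP_TOL_C)
        run = run + 1 if in_band else 0
        if run >= needed:
            return i - needed + 1
    return None


if __name__ == "__main__":
    # Synthetic log: 100 samples of warm-up followed by 1 h within tolerance.
    temps = [38.0] * 100 + [40.2] * 720
    flows = [1.0] * len(temps)
    print(steady_state_start(flows, temps, flow_target=1.0, temp_target=40.0))
```

Averages would then be taken over the samples from the returned index onward.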

The test range studied included oil inlet temperatures from 30 °C to 50 °C, in increments of 5 °C. The volumetric flow rate was varied from 0.5 lpm to 2.5 lpm in increments of 0.5 lpm. However, at 30 °C and 35 °C inlet temperature, a flow rate of 2.5 lpm was not achievable because the oil viscosity caused the pressure drop through the system to be prohibitively high, as will be discussed. The data points corresponding to the unachievable flow rates at 30 °C and 35 °C inlet temperature have been left blank, as will be apparent in the Results and Discussion section.
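For reference, the test matrix described above can be enumerated programmatically. This is an illustrative sketch only; the function name is hypothetical, and it assumes that the 2.5 lpm cases at 30 °C and 35 °C are the only combinations excluded, as stated in the text.

```python
def build_test_matrix():
    """Enumerate (inlet temperature in deg C, flow rate in lpm) test points."""
    inlet_temps_c = [30, 35, 40, 45, 50]          # 30-50 deg C in 5 deg C steps
    flow_rates_lpm = [0.5, 1.0, 1.5, 2.0, 2.5]    # 0.5-2.5 lpm in 0.5 lpm steps
    # Combinations reported as not achievable because of high oil viscosity.
    unreachable = {(30, 2.5), (35, 2.5)}
    return [(t, q) for t in inlet_temps_c for q in flow_rates_lpm
            if (t, q) not in unreachable]


if __name__ == "__main__":
    matrix = build_test_matrix()
    print(len(matrix), "test points")
```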

Air-Cooled Baseline Results.

The results of the preliminary air-cooled testing are used to establish typical operating temperatures of the server. Table 2 lists the average temperature of four components during steady-state conditions after three repeated trials at the design compute conditions. Data recorded showed that all three of these runs occurred with an ambient inlet temperature of 25 °C ± 1.0 °C. These components are the CPU0 die, the input/output hub (IOH) chip, a memory chip on the dual in line memory module, and a voltage regulator device located directly behind CPU1. These four components represent a variety of component types and are spaced around the motherboard. The CPU0 die temperature of 74.0 °C can be used as a basis for comparison in the oil cooled results. This serves to represent an upper limit of desirable operating temperature for the CPUs. As a point of reference, the average power consumption of the server (IT components + internal server fans) during this testing was 222.4 W. Roughly 4 W, or less than 2% of this power, can be attributed to the fans required for cooling.

Varying Oil Flow Rate and Inlet Temperature.

The resulting CPU die temperatures over the range of test conditions when using the MB flow path of the oil immersion setup are shown in Fig. 5. Ambient laboratory conditions were observed to be 25 °C ± 1.0 °C throughout the study. Quadratic curve fits are used to show the trends for increasing volumetric flow rates at given oil inlet temperatures. As should be expected, increasing the inlet flow rate at a given inlet temperature results in decreasing die temperatures. Increasing the flow rate from 0.5 lpm to 1.0 lpm produced the most significant impact on die temperature, resulting in a roughly 4.9 °C reduction when the inlet temperature was 30 °C. Figure 5 shows that increasing flow rate beyond 1.5 lpm begins to have diminishing returns, wherein the increase in pumping power does not correspond to a significant reduction of die temperature for the inlet temperatures studied. Using the 74.0 °C die temperature from the air cooling results as a benchmark, it seems feasible to use a range of these operating conditions to safely cool the server. All studied flow rates at inlet temperatures of 30 °C, 35 °C, and 40 °C provide suitable operating conditions. When the inlet temperature reaches 45 °C, flow rates beyond 1.5 lpm are needed to maintain CPU temperatures below the 74.0 °C threshold.

When the PSU flow path was utilized, in which only the PSU inlet and outlet ports were open, resulting CPU temperatures were higher for given flow rates and inlet temperatures as seen in Fig. 6. These higher temperatures were to be expected, since the flow of oil across the CPU heat sinks is not directly focused from the container inlet as in the MB flow path case. Figure 6 also uses quadratic curve fits to highlight the trends with increasing flow rate; however, some of these trends are fairly linear. For example, at an inlet temperature of 35 °C, each 0.5 lpm increase in flow rate from 0.5 lpm to 2.0 lpm results in a roughly 2.3 °C reduction in CPU die temperature.

Using the PSU flow path, inlet flow rate and inlet temperature conditions that result in 74.0 °C die temperature or less are greatly reduced. Here, flow rates of 1.0 lpm and above are needed at 35 °C inlet temperatures and flow rates of 2.0 lpm and above are needed at 40 °C inlet temperatures. These results indicate that ducting of flow over key components is of importance in an oil cooling system. Although the fluid velocities are low through heat sinks, bulk fluid motion of the system does not provide the same cooling performance as directed flow, particularly at lower flow rates.

Partial Power Usage Effectiveness and System Power Consumption.

Power usage effectiveness has been widely adopted as the standard efficiency metric for data centers. PUE, originated by Belady and others, determines data center efficiency by taking the total facility power consumed divided by IT load (useful work of the data center) [24]. Detailed description and development of the term can be found in Ref. [25]. Partial PUE (pPUE) metrics can be developed to understand efficiency of specific subsystems and subsets of the data center. pPUE of cooling systems can be expressed as follows:

$$\mathrm{pPUE}_{\mathrm{Cooling}} = \frac{\text{cooling power} + \text{IT load}}{\text{IT load}} \tag{1}$$

Equation (1) represents the efficiency of just the cooling system. For the present test setup, the entire cooling loop may be thought of as representing a “complete” data center loop in which the heat rejected by the IT equipment is eventually rejected to an ambient air of 25 °C (77 °F) from the air-cooled radiators to the laboratory environment. The caveats to this assumption are discussed in the Additional Discussion section. The cooling energy is the sum of the power drawn by the centrifugal pump and radiator fans. Table 3 compiles the pPUECooling values obtained at the various conditions studied for the MB flow path case. Over the range of conditions studied, pPUECooling values as low as 1.027 are achievable; however, this does not necessarily coincide with the minimum total system power operating point. In the range of desired CPU temperatures (i.e., CPU0 DT < 74 °C), pPUECooling values range from 1.036 to 1.170. Similar values can be seen in Table 4 for the PSU flow path case.
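As a worked illustration of Eq. (1) for this test loop, where the cooling power is the sum of pump and radiator fan power, a short sketch follows. The function names and the numerical inputs are hypothetical and are not measured values from the study.

```python
def ppue_cooling(it_load_w, pump_w, fan_w):
    """Eq. (1): (cooling power + IT load) / IT load, with the cooling power
    taken as pump power plus radiator fan power."""
    if it_load_w <= 0:
        raise ValueError("IT load must be positive")
    return (pump_w + fan_w + it_load_w) / it_load_w


def cooling_fraction(ppue):
    """Cooling power expressed as a fraction of the IT load."""
    return ppue - 1.0


if __name__ == "__main__":
    # Hypothetical inputs: 200 W of IT load, 3 W pump, 3 W radiator fans.
    print(ppue_cooling(200.0, 3.0, 3.0))
    # A pPUE of 1.027 corresponds to cooling power of about 2.7% of IT load.
    print(cooling_fraction(1.027))
```

Because the metric is a ratio, a low pPUECooling can coincide with a higher absolute system power if the IT load itself rises, which is consistent with the distinction drawn above between the best pPUECooling and the minimum total power operating point.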

Although server power consumption was recorded for the air-cooled baseline test, this datum does not lend itself to pPUECooling calculations. Rough extrapolations as to the energy required by a computer room air conditioner-based or economizer-based system to achieve the 25 °C inlet air temperature would need to be made and are beyond the scope of the present work.

These results can be expanded to begin to understand efficiency of an oil immersion system at different ambient conditions. A maximum system approach temperature can be developed by taking the difference between the oil outlet temperature and ambient laboratory temperature. In this case, the oil outlet temperature is calculated using the measured inlet temperature plus the temperature rise, ΔT, across the server using the standard steady-flow thermal energy equation

$$q = \dot{m} c_p \Delta T \tag{2}$$

where q is the measured IT power, cp is the oil’s specific heat, and ṁ is the mass flow rate, taken as the product of the oil’s density and measured system volume flow rate. The relation between the maximum system approach and system efficiency, shown in Fig. 7, is fit with an exponential curve as can be expected in heat exchanger analysis. These results are helpful to system designers to begin estimating the system efficiency over expanded ambient conditions. This cannot replace a detailed energy flow model of a facility but serves as a starting point for reference designs.
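The outlet temperature and approach temperature calculation described above can be sketched as follows. The oil density (850 kg/m^3), specific heat (1670 J/(kg K)), and the 215 W IT power are assumed, typical values chosen for illustration; the study's own values come from Table 1 and the measured power, which are not reproduced here. All function names are hypothetical.

```python
# ASSUMED properties for a light mineral oil; not the Table 1 values.
RHO_OIL = 850.0    # density, kg/m^3
CP_OIL = 1670.0    # specific heat, J/(kg K)


def lpm_to_m3s(flow_lpm):
    """Convert litres per minute to cubic metres per second."""
    return flow_lpm / 1000.0 / 60.0


def oil_temperature_rise(it_power_w, flow_lpm, rho=RHO_OIL, cp=CP_OIL):
    """Eq. (2) solved for the temperature rise: dT = q / (m_dot * cp)."""
    m_dot = rho * lpm_to_m3s(flow_lpm)   # mass flow rate, kg/s
    return it_power_w / (m_dot * cp)


def approach_temperature(inlet_c, it_power_w, flow_lpm, ambient_c=25.0):
    """Maximum system approach: oil outlet temperature minus ambient."""
    outlet_c = inlet_c + oil_temperature_rise(it_power_w, flow_lpm)
    return outlet_c - ambient_c


if __name__ == "__main__":
    # Hypothetical case: 215 W of IT power, 0.5 lpm, 40 deg C inlet.
    print(round(oil_temperature_rise(215.0, 0.5), 2))
    print(round(approach_temperature(40.0, 215.0, 0.5), 2))
```

Under these assumptions the temperature rise scales inversely with flow rate, so doubling the flow halves the rise across the server.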

PUE and pPUECooling alone do not provide a complete picture of the energy efficiency of a data center cooling scheme. With the goal of minimizing total facility power for a data center, all components’ energy consumption must be considered holistically. Figure 8 shows a surface plot of the minimization of total power consumption for the system for the MB flow path case. This includes the power to the server (IT load), the power consumed by the pump, and the power drawn by the radiator fans. It can be seen that the minimum power draw for the system occurs at 40 °C (104 °F) inlet temperature and 0.5 lpm flow rate. At this point, the CPU die temperature is 74.1 °C, just above the targeted upper limit. Although the best pPUECooling value occurred at an inlet flow rate of 0.5 lpm and 50 °C inlet temperature, the total system power at this point was 2% higher and had a CPU0 die temperature of 83.4 °C.

The somewhat conflicting results of pPUECooling and total system power can best be understood by analyzing the individual system components (IT, pump, and fans). The general trends are as follows:

  • IT power increases nonlinearly with CPU die temperature (and all component temperatures in general) due to leakage current effects in silicon devices. This trend is roughly quadratic as shown in Fig. 9, which shows the relation between CPU0 die temperature, the input/output hub, and one of CPU1’s VRDs with the total server power from the data collected in the MB flow path cases. All three of these components have similar trends with total server power. This is not a complete relation since the total server power depends on power consumption and leakage of all components on the MB and PSU, but it does provide a rough estimate of the general behavior. For example, increasing the CPU0 die temperature from 60 °C to 70 °C results in a roughly 2.3 W increase in total server power. Increasing the die temperature from 70 °C to 80 °C results in a 4.7 W increase in power consumption. There is certainly incentive to operate the server at lower component temperatures for this reason; however, this must be weighed against the power consumed by the rest of the system.

  • It is known that the viscosity of oils varies with temperature. This has a direct impact on the system pressure and, hence, the pumping power required of the system. A standard correlation relating viscosity and temperature for transformer oils (similar to the light mineral oil used in this study) is given in Ref. [15] as

    $$\mu = C_1 \exp\left[\frac{2797.3}{T + 273.2}\right] \tag{3}$$
    where μ is the dynamic viscosity in centipoise, T is the temperature in °C, and C1 is a coefficient for scaling (a value of 0.0013573 is provided in the reference for transformer oil). The interdependence of viscosity with Reynolds number (Re), Reynolds number with friction factor (f), and friction factor with pressure drop (Δp) for laminar flow [26,27]
    $$\mu \propto \frac{1}{\mathrm{Re}} \propto f \propto \Delta p \tag{4}$$
    should eventually manifest itself in the pumping power of the system by the relation
    $$P_{\mathrm{pump}} = \Delta p \, \dot{V} \tag{5}$$
    where V̇ is the volumetric flow rate. Figure 10 shows suitable agreement between the pumping power required to maintain 1.0 lpm of flow rate in the immersion system for the MB flow path and the viscosity trend predicted by Eq. (3). Over the range of inlet temperatures studied, a 43.5% reduction in viscosity is predicted and an average reduction in pumping power of 42.6% is observed for a given flow rate.

    Looking at the pumping power over the range of temperatures and flow rates studied, as in Fig. 11, it is easy to see that there is ample incentive to operate at higher temperatures. The curves in Fig. 11 are fit with cubic trend lines which should be expected from the pump affinity laws, which state that pump power is proportional to the cube of the impeller or shaft speed [28]. As stated earlier, increasing flow rate beyond 1.5 lpm did not correspond to a significant reduction of component temperatures while operating at oil inlet temperatures beyond 45 °C necessitates flow rates more than 1.5 lpm to maintain safe operating temperatures. In Fig. 12, for oil flow rates less than 1.5 lpm, the competing trends of nonlinear decrease in pumping power and nonlinear increase in IT power as oil temperature increases can be observed. These trends observed indicate a design opportunity in achieving an operating point with minimum power requirement.

    For all cases studied, the flow is distinctly laminar, well below the transition regime. At the lower end, with a flow rate of 0.5 lpm and oil temperature of 30 °C, the Reynolds number within the system piping is ∼50. At the other extreme, with a flow rate of 2.5 lpm and oil temperature of 50 °C, the Reynolds number is ∼440. These values will become substantially lower as the flow enters the larger volume of the tank and flows across the server.

  • At oil inlet temperatures of 35 °C and above, the radiator fans operate at idle speeds. At these low speeds, the radiator fans consume <1.2% of the total system power. Their impact on pPUECooling values over the operating conditions is relatively constant.
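
The viscosity correlation of Eq. (3) and the pumping power relation of Eq. (5) can be checked numerically. The following is a minimal Python sketch, assuming the 30 °C to 50 °C inlet temperature range quoted for the Reynolds number extremes; the function names are illustrative and not part of the original study, and no pressure drop values are assumed.

```python
import math

# Scaling coefficient quoted in the text for transformer oil (Ref. [15]).
C1_TRANSFORMER_OIL = 0.0013573


def dynamic_viscosity_cp(temp_c, c1=C1_TRANSFORMER_OIL):
    """Eq. (3): dynamic viscosity in centipoise for an oil temperature in deg C."""
    return c1 * math.exp(2797.3 / (temp_c + 273.2))


def pumping_power_w(pressure_drop_pa, flow_lpm):
    """Eq. (5): pumping power in watts from pressure drop (Pa) and flow rate (lpm)."""
    flow_m3_per_s = flow_lpm / 1000.0 / 60.0  # liters per minute to m^3 per second
    return pressure_drop_pa * flow_m3_per_s


# Assumed end points of the inlet temperature range (30 deg C and 50 deg C).
mu_30 = dynamic_viscosity_cp(30.0)
mu_50 = dynamic_viscosity_cp(50.0)
viscosity_reduction = 1.0 - mu_50 / mu_30

print(f"mu(30 C) = {mu_30:.2f} cP, mu(50 C) = {mu_50:.2f} cP")
print(f"predicted viscosity reduction: {viscosity_reduction:.1%}")  # 43.5%
```

Under these assumptions the correlation reproduces the 43.5% predicted reduction in viscosity. Because Eq. (4) ties pressure drop to viscosity in laminar flow, a comparable fractional drop in pumping power at a fixed flow rate is what Eq. (5) would suggest.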

Understanding these interrelated trends of system components is central to selecting the optimal operating conditions for an oil immersion cooled system. The results presented here for the MB flow path and PSU flow path cases provide some initial bounding operating conditions for an oil cooled system. Since the MB flow path can be considered a “better” case scenario for the cooling, it can utilize higher temperature fluids and lower flow rates.

Comparison to Air-Cooled Baseline Results.

Although a direct comparison of PUE is not possible between the air-cooled baseline tests and the oil immersion results, other performance parameters may be used. Thermal resistance to heat transfer between a device and coolant is given by

$$R_{th} = \frac{T_d - T_c}{q} \tag{6}$$

where Td is the device temperature, Tc is the incoming coolant temperature, and q is the heat dissipated by the device. In this way, a comparison between oil and air can be made by taking the difference between the temperature of the critical device, the CPU, and the server’s inlet coolant temperature, and dividing it by the total IT power of the system. From the air-cooled baseline tests, air provides a thermal resistance of 0.224 °C/W. Over all oil cases studied, oil provided a thermal resistance ranging from 0.128 to 0.175 °C/W, with an average resistance of 0.147 °C/W. This average value is a 34.4% improvement over the baseline air cooling case. This improved performance is consistent with the results reported in Ref. [20].
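
The thermal resistance comparison can be reproduced with a few lines of Python. This is a minimal sketch: the resistance values are those reported above, while the function names and the temperatures and power in the single-point example are hypothetical and chosen only to illustrate Eq. (6).

```python
def thermal_resistance(device_temp_c, coolant_inlet_temp_c, power_w):
    """Eq. (6): R_th = (T_d - T_c) / q, in deg C per watt."""
    if power_w <= 0:
        raise ValueError("power_w must be positive")
    return (device_temp_c - coolant_inlet_temp_c) / power_w


def relative_improvement(baseline, improved):
    """Fractional reduction of `improved` relative to `baseline`."""
    return (baseline - improved) / baseline


# Values reported in the text (deg C per watt).
R_AIR = 0.224      # air-cooled baseline
R_OIL_AVG = 0.147  # average over all oil cases

improvement = relative_improvement(R_AIR, R_OIL_AVG)
print(f"reduction in thermal resistance: {improvement:.1%}")  # 34.4%

# Hypothetical single operating point, for illustration only.
r_example = thermal_resistance(device_temp_c=70.0, coolant_inlet_temp_c=30.0, power_w=200.0)
print(f"example R_th = {r_example:.3f} C/W")
```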

Comparisons of surface temperature measurements show improvements in thermal performance across all components on the motherboard. Table 5 compares the rise in surface temperature of components, Td, over the inlet coolant temperature, Tc, with a lower value being better. The air results are taken from baseline tests and oil results are the average temperature rise over all the MB flow path test conditions. The most significant improvements are seen across all voltage regulating devices for the CPUs, which are critical for power delivery to the CPUs. In all conditions, including 50 °C oil inlet temperature at a flow rate of 0.5 lpm, the VRDs exhibited lower surface temperatures than in the air-cooled baseline case. From a thermal reliability perspective, this is significant.

Additional Discussion.

The initial results of this study are helpful in establishing a range of flow requirements that can be expected on a per-server or per-watt basis for a data center oil cooling strategy. These figures may be helpful in the design of a larger system and provide relative sizing requirements for pumps, heat exchangers, and system pressure drop. Rather than using the temperature difference across the server (ΔT) as the design criterion, which may be difficult to determine, these results establish flow requirements based on component temperatures, namely those of the CPUs, which are the critical elements of the system.

The efficiency values (pPUECooling) presented earlier are unique to the system being tested. Certain simplifications and modifications were required for laboratory testing that may not be in place in an actual data center implementation. A fully built oil immersion data center may contain additional heat exchanges (i.e., oil-to-water, water-to-air) before final rejection of heat to the environment. The additional exchanges and piping will require increased cooling energy consumption. However, larger components (pumps, fans, etc.) are generally more efficient than geometrically similar, smaller components. Because of these counteracting efficiency factors, the results here are expected to be indicative of what may be seen at larger scales.
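
As a rough illustration of how such an efficiency value is formed, the sketch below assumes that pPUECooling is the ratio of IT power plus cooling power (pump and radiator fans) to IT power, in line with the general partial PUE definition of The Green Grid [25]. The power values are hypothetical, not measured data from this study, and the function name is illustrative.

```python
def ppue_cooling(it_power_w, pump_power_w, fan_power_w):
    """Assumed definition: (IT power + cooling power) / IT power."""
    if it_power_w <= 0:
        raise ValueError("it_power_w must be positive")
    return (it_power_w + pump_power_w + fan_power_w) / it_power_w


# Hypothetical operating point (watts), for illustration only.
it_power = 200.0
pump_power = 4.0
fan_power = 2.0

ppue = ppue_cooling(it_power, pump_power, fan_power)
fan_share = fan_power / (it_power + pump_power + fan_power)

print(f"pPUE_cooling = {ppue:.2f}")            # 1.03
print(f"fan share of total = {fan_share:.2%}")
```

With these assumed numbers the fans draw less than 1.2% of total system power, so changes in pump power dominate any change in the ratio.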

As discussed in Ref. [21], this Open Compute server was strategically designed and optimized for air cooling. Many of the design aspects are beneficial in oil cooling, but it is expected that further optimization for oil cooling can be achieved. For example, the heat sinks may be better designed for oil flow conditions as fin efficiencies in liquid and air are markedly different. This modification may be especially helpful in the PSU flow path case in which heat transfer is mainly from bulk fluid motion. Modifications to the current system, such as the removal of the air duct and PSU chassis, present material savings that may not be achievable in standard air-cooled server designs.

Further enhancements to the system’s performance may be gained by more prudent selection of the oil used. Other oil types, such as vegetable oils and other mineral oils, may provide better heat transfer and fluid characteristics than the white mineral oil used for this study. In general, there is a wide range of topics that can be explored to better understand this promising cooling technique.

The purpose of this work was to establish the general operating trends that may be seen in an oil immersion cooled data center setup. By operating a single server fully immersed in mineral oil and varying volumetric flow rate and oil inlet temperature, bounding environmental operating conditions were established. From these results, it is possible to utilize oil inlet temperatures up to 45 °C for cooling. pPUECooling values ranging from 1.03 to 1.17 were achieved in the current experimental setup. Comparison with baseline air cooling tests showed 34.4% reduction in the thermal resistance of the system and significant reduction in the temperature difference between component surfaces and the inlet coolant. Improvements to system and hardware design may increase the efficiency and potential for use of oil cooling in data centers. Future work in this area could include optimizing heat sinks for oil cooling, constructing a larger experimental setup to understand how efficiency scales with size, understanding dynamic loading effects of IT equipment in mineral oil, exhaustive reliability studies, and understanding servicing challenges present in an oil immersed data center.

  • National Science Foundation (Grant No. 1134821).

Nomenclature

  • cp = specific heat
  • f = friction factor
  • ṁ = mass flow rate
  • Ppump = pumping power
  • q = total server power dissipated
  • Rth = thermal resistance
  • Re = Reynolds number
  • Tc = coolant inlet temperature
  • Td = device temperature
  • V̇ = volumetric flow rate
  • Δp = pressure drop
  • ΔT = temperature difference across server
  • μ = dynamic viscosity

References

[1] ASHRAE TC 9.9, 2015, Thermal Guidelines for Data Processing Environments, 4th ed., ASHRAE Datacom Series, ASHRAE, Atlanta, GA.
[2] Alkharabsheh, S., Fernandes, J., Gebrehiwot, B., Agonafer, D., Ghose, K., Ortega, A., Joshi, Y., and Sammakia, B., 2015, “A Brief Overview of Recent Developments in Thermal Management in Data Centers,” ASME J. Electron. Packag., 137(4), p. 040801.
[3] Ellsworth, M. J., Jr., and Iyengar, M. K., 2009, “Energy Efficiency Analyses and Comparison of Air and Water Cooled High Performance Servers,” ASME Paper No. InterPACK2009-89248.
[4] Carbó, A., Oró, E., Salom, J., Canuto, M., Macías, M., and Guitart, J., 2016, “Experimental and Numerical Analysis for Potential Heat Reuse in Liquid Cooled Data Centres,” Energy Convers. Manage., 112, pp. 135–145.
[5] Chu, R. C., Hwang, U. P., and Simons, R. E., 1982, “Conduction Cooling for an LSI Package: A One-Dimensional Approach,” IBM J. Res. Dev., 26(1), pp. 45–54.
[6] Fernandes, J., Ghalambor, S., Docca, A., Aldham, C., Agonafer, D., Chenelly, E., Chan, B., and Ellsworth, M., Jr., 2013, “Combining Computational Fluid Dynamics (CFD) and Flow Network Modeling (FNM) Design of a Multi-Chip Module (MCM) Cold Plate,” ASME Paper No. IPACK2013-73294.
[7] Ovaska, S. J., Dragseth, R. E., and Hanssen, S. A., 2016, “Direct-to-Chip Liquid Cooling for Reducing Power Consumption in a Subarctic Supercomputer Centre,” Int. J. High Perform. Comput. Networking, 9(3), pp. 242–249.
[8] ASHRAE TC 9.9, 2011, “2011 Thermal Guidelines for Liquid Cooled Data Processing Environments,” ASHRAE, Atlanta, GA.
[9] Incropera, F. P., 1999, Liquid Cooling of Electronic Devices by Single-Phase Convection, Wiley-Interscience Publication, New York.
[10] Anderson, T. M., and Mudawar, I., 1989, “Microelectronic Cooling by Enhanced Pool Boiling of a Dielectric Fluorocarbon Liquid,” ASME J. Heat Transfer, 111(3), pp. 752–759.
[11] Tuma, P., 2010, “The Merits of Open Bath Immersion Cooling of Datacom Equipment,” 26th Annual IEEE Semiconductor Thermal Measurement and Management Symposium (SEMI-THERM), Santa Clara, CA, Feb. 21–25, pp. 123–131.
[12] Arik, M., 2007, “Enhancement of Pool Boiling Critical Heat Flux in Dielectric Liquids by Microporous Coatings,” Int. J. Heat Mass Transfer, 50(5–6), pp. 997–1009.
[13] El-Genk, M. S., 2012, “Immersion Cooling Nucleate Boiling of High Power Computer Chips,” Energy Convers. Manage., 53(1), pp. 205–218.
[14] Warrier, P., Sathyanarayana, A., Patil, D. V., France, S., Joshi, Y., and Teja, A. S., 2012, “Novel Heat Transfer Fluids for Direct Immersion Phase Change Cooling of Electronic Systems,” Int. J. Heat Mass Transfer, 55(13–14), pp. 3379–3385.
[15] Pierce, L. W., 1992, “An Investigation of the Thermal Performance of an Oil Filled Transformer Winding,” IEEE Trans. Power Delivery, 7(3), pp. 1347–1358.
[16] Taha-Tijerina, J. J., Narayana, T., Tiwary, C. S., Lozano, K., Chipara, M., and Ajayan, P. M., 2014, “Nanodiamond-Based Thermal Fluids,” ACS Appl. Mater. Interfaces, 6(7), pp. 4778–4785.
[17] Jones, P., 2012, “Data Center Dynamics,” Data Center Dynamics Ltd., London, accessed Nov. 15, 2013, http://www.datacenterdynamics.com/focus/archive/2012/09/intel-says-liquid-and-servers-do-mix
[18] Fulton, S., III, 2017, “Data Center Knowledge,” Data Center Knowledge, San Francisco, CA, accessed June 2, 2017, http://www.datacenterknowledge.com/archives/2017/05/01/data-center-cooling-how-practical-is-dunking-servers-in-oil/
[19] Prucnal, D., 2013, “Doing More With Less: Cooling Computers With Oil Pays Off,” Next Wave, 20(2), pp. 20–29. http://www.grcooling.com/wp-content/uploads/2015/06/NSA-The-Next-Wave.pdf
[20] Patterson, M., and Best, C., 2012, “Oil Immersion Cooling and Reductions In Data Center Energy Use,” ASHRAE Summer Conference, San Antonio, TX, June 23–27.
[21] Frachtenberg, E., Heydari, A., Li, H., Michael, A., Na, J., Nisbet, A., and Sarti, P., 2011, “High-Efficiency Server Design,” International Conference for High Performance Computing, Networking, Storage, and Analysis (SC), Seattle, WA, Nov. 12–18, pp. 1–11.
[22] Li, H., and Michael, A., 2011, “Intel Motherboard Hardware v1.0,” Open Compute Project, accessed Nov. 22, 2013, http://www.opencompute.org/projects/intel-motherboard/
[23] Carraway, D., 2013, “Lookbusy: A Synthetic Load Generator,” Look Busy, accessed Aug. 18, 2017, http://www.devin.com/lookbusy
[24] The Green Grid, 2008, “The Green Grid Data Center Power Efficiency Metrics: PUE and DCiE,” The Green Grid Administration, Beaverton, OR, White Paper No. 6. http://www.premiersolutionsco.com/wp-content/uploads/TGG_Data_Center_Power_Efficiency_Metrics_PUE_and_DCiE.pdf
[25] The Green Grid, 2012, “PUE: A Comprehensive Examination of the Metric,” The Green Grid Administration, Beaverton, OR, White Paper No. 49. https://datacenters.lbl.gov/sites/all/files/WP49-PUE%20A%20Comprehensive%20Examination%20of%20the%20Metric_v6.pdf
[26] Godfrey, D., and Herguth, W. R., 1996, Physical and Chemical Properties of Industrial Mineral Oils Affecting Lubrication—Parts 1–5, Society of Tribologists and Lubrication Engineers, Park Ridge, IL.
[27] Eiland, R., Fernandes, J., Vallejo, M., Agonafer, D., and Mulay, V., 2014, “Flow Rate and Inlet Temperature Considerations for Direct Immersion of a Single Server in Mineral Oil,” IEEE Intersociety Conference on Thermal and Thermomechanical Phenomena in Electronic Systems (ITherm), Orlando, FL, May 27–30, pp. 706–714.
[28] White, F. M., 2011, Fluid Mechanics, 7th ed., McGraw-Hill, New York.
Copyright © 2017 by ASME

Figures

Fig. 1 Intel-based Open Compute server in standard air-cooled configuration with air duct removed for visual purposes

Fig. 2 Schematic of test setup and data collection equipment

Fig. 3 Diagram of the two flow configurations through the immersion tank. “MB flow path” occurs when only the MB inlet and outlet valves are open. “PSU flow path” occurs when only the PSU inlet and outlet valves are open.

Fig. 4 Typical test duration and establishing steady-state conditions for the server under test

Fig. 5 Impact of increasing oil volume flow rate on CPU die temperature along lines of constant oil inlet temperature for the MB flow path case

Fig. 6 Impact of increasing oil volume flow rate on CPU die temperature along lines of constant oil inlet temperature for the PSU flow path case

Fig. 7 Relation between system approach temperature and efficiency

Fig. 8 Surface contour of the normalized total system power consumption for the MB flow path case

Fig. 9 General relationship between component temperatures and total server power based on the data collected in the MB flow path case

Fig. 10 Comparison of trends for temperature-dependent pumping power for the current system and temperature-dependent viscosity of transformer oil as predicted by Eq. (3)

Fig. 11 Temperature-dependent oil flow rates and cubic relation of pumping power to flow rate

Fig. 12 Competing trends for pumping power and server IT power over a range of oil inlet temperatures at constant operating oil flow rates

Tables

Table 1 Mineral oil physical properties
Table 2 Baseline temperature results of air-cooled testing
Table 3 pPUECooling values at given tested operating conditions with the MB flow path
Table 4 pPUECooling values at given tested operating conditions with the PSU flow path
Table 5 Thermal performance comparison of motherboard component surface temperatures
