====== ICS Primary Data Center ======

== Vital Statistics: December 2024 Census ==

  * Size: 1600 sqft
  * HVAC: 3 Units
  * Cabinets: 42
  * Servers (~800)
    * Bare Metal: 598
    * Virtual Machines: 300
  * [[hardware:storage|Storage]]
    * Servers: 20
    * Capacity: 3.2 PB
  * [[services:network|Network]] (locally managed, 1/10/40Gb/s)
    * Devices: 83
    * Address space: 128.195.0.0-128.195.63.255
  * GPU
    * Clusters: 8
    * GPU hosts: 44
    * Distributed computing managed by [[services:slurm|Slurm]]

===== Requirements for Equipment inside Data Center =====

Please use the following checklist to make sure equipment can be hosted in the ICS Data Center:

  * [[https://docs.google.com/spreadsheets/d/1M4vffsZ8vYTY-SZuX3e6sGIbhb1mg_2hvY2rR2_fkSg/edit?usp=sharing|ICS Hardware Checklist]]

All ICS faculty and researchers are welcome to place their equipment in the ICS data center so long as the equipment has the following attributes:

  * **Rack mountable**: The equipment must rack neatly and securely into a standard 19" cabinet. A rack mount kit is required.
  * **Advanced Out of Band Management**: Often called BMC, iDRAC, IPMI, ILOM, or OOB, out-of-band management should provide complete remote administration with a full console.
  * **Trusted Platform Module (TPM)**: We encrypt Linux installs, and the TPM stores the passphrase so that the system can boot automatically.
  * **Airflow**: The ICS data center is split into hot and cold aisles. Any equipment in the data center must take cold air in from the cold aisle and exhaust into a hot aisle. Typically, this means front-to-back airflow.
  * **Proper Form Factor**: Equipment must actually fit in a standard rack, or a specialty rack must be provided. Typically, this means the equipment must have a depth of 30" or less.
    * **CAVEAT**: A server with a depth of 31.5" can work if placed in a rack where the PDUs are not adjacent to each other on the side of the rack.
  * **IPv4**: The equipment must be capable of using an IPv4 address. At this time we do not offer IPv6 addresses.
  * **Power Supply Cables**: The standard power cord is C13-C14. There are a limited number of C19-C20 connections in the data center.
    * Ask the vendor if a C19-C14 cord is a viable option for powering the server (ideally 4ft).

==== Additional Requirements for Managed Equipment ====

ICS Computing Support can manage equipment that comes assembled, is certified for the intended OS, and includes a minimal warranty. ICS researchers are welcome to house and self-support any system that meets the basic requirements, but we are unable to provide support for self-assembled or DIY builds.

==== Provided ====

The following resources are provided by ICS free of charge for all equipment in the ICS data center:

  * Up to 2 120V or 240V power connections per host.
  * Up to 2 1Gb/s network connections per node, plus 1 OOB connection.
  * Space in a cabinet, limited by availability and heat restrictions.

==== Recommended ====

The following are additional recommendations for servers in the ICS data center:

  * RAID 10/5/6 support
  * Redundant PSU
  * 10Gb/s network
  * 5-year NBD warranty

===== Required for ICS Managed Systems =====

[[https://swiki.ics.uci.edu/doku.php/services:supported_os|ICS Operating System Support]]

==== Equipment Purchase Workflow ====

  - ICS Support can help get server quotes after being provided system specifications.
    * If the client gets server quotes without ICS Support's assistance, sufficient time must be allowed for review (minimum of 1 week).
  - Servers are reviewed for:
    * Chassis depth and height.
    * Maximum power usage (especially for GPU servers).
    * Power input cord (C13 or C19). We have been surprised by C19 cords, for which we have fewer outlets.
      * {{:services:power_cord.jpg?200|}}
    * Network requirements (10Gb/s, 1Gb/s, fiber, etc.).
    * Minimum 3-year warranty, with 5 years preferred.
  - A helpdesk ticket should be opened with the quotes included for customer/ICS review and approval.
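
The review criteria above can be sketched as a small script. This is only an illustration, not a tool ICS uses: the ''Quote'' record, its field names, and the ''review_quote'' function are hypothetical, and the thresholds are simply the numbers quoted on this page (30" depth, the 31.5" caveat, 3-year minimum and 5-year preferred warranty).

```python
from dataclasses import dataclass

# Thresholds copied from the requirements on this page.
MAX_DEPTH_IN = 30.0           # typical depth limit for a standard rack
CAVEAT_DEPTH_IN = 31.5        # can work only where PDUs are not side by side
MIN_WARRANTY_YEARS = 3
PREFERRED_WARRANTY_YEARS = 5


@dataclass
class Quote:
    """Hypothetical record of the fields checked on a server quote."""
    name: str
    depth_in: float
    power_cord: str           # "C13" or "C19"
    warranty_years: int
    has_oob: bool             # BMC / iDRAC / IPMI / ILOM
    has_tpm: bool


def review_quote(q: Quote) -> list:
    """Return human-readable flags; an empty list means nothing to raise."""
    flags = []
    if q.depth_in > CAVEAT_DEPTH_IN:
        flags.append("depth over 31.5 in: specialty rack needed")
    elif q.depth_in > MAX_DEPTH_IN:
        flags.append("depth over 30 in: needs a rack without adjacent PDUs")
    if q.power_cord == "C19":
        flags.append("C19 cord: limited outlets, ask vendor about C19-C14")
    elif q.power_cord != "C13":
        flags.append("unrecognized power cord: " + q.power_cord)
    if q.warranty_years < MIN_WARRANTY_YEARS:
        flags.append("warranty below the 3-year minimum")
    elif q.warranty_years < PREFERRED_WARRANTY_YEARS:
        flags.append("warranty below the preferred 5 years")
    if not q.has_oob:
        flags.append("no out-of-band management")
    if not q.has_tpm:
        flags.append("no TPM")
    return flags


if __name__ == "__main__":
    example = Quote("gpu-node", 31.0, "C19", 3, has_oob=True, has_tpm=False)
    for flag in review_quote(example):
        print(example.name + ": " + flag)
```

The flags are meant as talking points for the helpdesk ticket, not as automatic rejections; height, maximum power draw, and network requirements still need a manual look.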
==== Installation Workflow ====

  - Create an IP address with the name chosen for the server.
  - Rack the server and install.
  - Test the maximum power consumption of the server.
    * Verify there is enough power capacity for the server.
    * Label power usage on the server and in DCIM.
  - Re-rack if necessary, or add more power to the rack if available.

===== Property Tags =====

We can look up purchase orders in KFS to determine what equipment a property tag is meant for:

  - Log in to https://portal.uci.edu
  - Click on the "Finances/KFS" menu link and then the KFS Homepage under the "Tools & Support" section.
    * {{:services:datacenter:proptag_1.jpg?200|}}
  - Click on "Custom Document Search" and then on "Search - Purchase Orders".
    * {{:services:datacenter:proptag_2.jpg?200|}}
  - Click on the heart icon to make it a favorite so you can see it from the Home link.
    * {{:services:datacenter:proptag_3.jpg?200|}}
  - For the search, just enter the purchase order number in the "Purchase Order #" field.
    * {{:services:datacenter:proptag_4.jpg?200|}}
  - The search results will be at the bottom of the page.

===== CRAH Units =====

The data center is cooled via three [[https://heinz-mech.com/hvac/crac-units/crac-vs-crah/#:~:text=The%20difference%20between%20a%20CRAC,control%20valve%20to%20provide%20cooling.|CRAH]] units.

[[https://drive.google.com/file/d/1Oq89I2glylvGNppb0I6F6b5X9PucGmRR/view?usp=sharing|Documentation]]

==== CRAH Maintenance ====

=== Motor Replacement ===

Motors on all three units were replaced by campus facilities in the first half of 2022. Lifetime on the motors is projected to be 12 years. We'll need to look at replacing them again in 2034.

  * Job ID 371862 for Special HVAC Request in Information and Computer Science has been completed.
  * Job ID 370331 for Hot/Cold Climate in Information and Computer Science has been completed.

==== Portable AC Units ====

ICS purchased 3 portable AC units with BSAS funding.
^Name^Original Name^MAC^Location^Last Known IP^Switch Port^Observium^
|Moe|poweralert-00066745cc57|00:06:67:45:cc:57|N06| |B12/3023-C2950S-085/3|[[https://observium.ics.uci.edu/device/device=114/|114]]|
|Larry|poweralert-00066745cc50|00:06:67:45:cc:50|A13|10.195.255.251|C1/3023-C2950G-046/7|[[https://observium.ics.uci.edu/device/device=114/|116]]|
|curly|poweralert-00066745cc51|00:06:67:45:cc:51|H04|10.195.255.252|A04/3023-C2950G-045/14|[[https://observium.ics.uci.edu/device/device=114/|115]]|
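
In the three rows above, the original name is ''poweralert-'' followed by the MAC address with the colons removed. The sketch below converts between the two forms, which may help when matching a unit seen on the network to its row in the table. The function names are hypothetical, and the naming pattern is inferred only from these three rows; other devices may not follow it.

```python
import re

PREFIX = "poweralert-"  # prefix seen in the "Original Name" column


def poweralert_name(mac: str) -> str:
    """Build the default-style name from a MAC such as 00:06:67:45:cc:57."""
    digits = re.sub(r"[^0-9A-Fa-f]", "", mac).lower()
    if len(digits) != 12:
        raise ValueError("not a 48-bit MAC address: " + repr(mac))
    return PREFIX + digits


def mac_from_name(name: str) -> str:
    """Recover the colon-separated MAC from a poweralert-style name."""
    if not name.startswith(PREFIX):
        raise ValueError("not a poweralert name: " + repr(name))
    digits = name[len(PREFIX):].lower()
    if not re.fullmatch(r"[0-9a-f]{12}", digits):
        raise ValueError("unexpected suffix in name: " + repr(name))
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))


if __name__ == "__main__":
    print(poweralert_name("00:06:67:45:cc:57"))
    print(mac_from_name("poweralert-00066745cc50"))
```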