Design IT room future state architecture

Design the target technical environment, application architecture, target shared services environment, and target disaster recovery environment. This may additionally include any temporary migration facilities and transformation activities, if planned.

Design considerations

Building an IT room requires detailed attention to several main design considerations.

Location

IT room location is the first consideration, even before the layout of the IT room’s contents. Most designers agree that, where possible, the computer room should not share a wall with the exterior of the building. Exterior walls can be damp and can contain water pipes that could burst and drench the equipment. Avoiding exterior windows removes a security risk and the risk of breakage. Avoiding basements reduces the risk of flooding, and avoiding top floors reduces the risk of roof leaks. If a centralized computer room is not feasible, server closets on each floor may be an option: computer, network and phone equipment is housed in closets that are stacked one above another, each on the floor it serves.

In addition to the hazards of exterior walls, designers need to evaluate any potential sources of interference in proximity to the computer room. Designing such a room means keeping clear of radio transmitters and of electrical interference from power plants or lift rooms.

Other physical design considerations range from room size, door sizes and access ramps (to get equipment in and out) to cable organization, physical security and maintenance access.

Air conditioning

Computer equipment generates heat and is sensitive to heat, humidity and dust; it also has very high resilience and failover requirements. Maintaining a stable temperature and humidity within tight tolerances is critical to IT system reliability. Server room temperature should be kept between 18 and 27°C (roughly 64 to 80°F), and relative humidity between 40% and 60%.

In most server rooms “close control air conditioning” systems, also known as PAC (precision air conditioning) systems, are installed. These systems control temperature, humidity and particle filtration within tight tolerances 24 hours a day and can be remotely monitored. They can have built-in automatic alerts when conditions within the server room move outside defined tolerances.
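As an illustration of the automatic alerts described above, the following is a minimal sketch of a tolerance check, assuming the temperature and humidity bands quoted earlier in this section. The Reading type, the sensor labels and the message format are hypothetical and do not correspond to any particular PAC product.

```python
from dataclasses import dataclass

# Tolerance bands taken from the figures quoted in the text; adjust to site policy.
TEMP_RANGE_C = (18.0, 27.0)
HUMIDITY_RANGE_RH = (40.0, 60.0)


@dataclass
class Reading:
    sensor: str         # hypothetical sensor label, e.g. "rack-a"
    temp_c: float       # temperature in degrees Celsius
    humidity_rh: float  # relative humidity in percent


def check_reading(reading: Reading) -> list[str]:
    """Return one alert message for each value outside its tolerance band."""
    alerts = []
    low, high = TEMP_RANGE_C
    if not low <= reading.temp_c <= high:
        alerts.append(
            f"{reading.sensor}: temperature {reading.temp_c:.1f} C "
            f"outside {low:.0f}-{high:.0f} C"
        )
    low, high = HUMIDITY_RANGE_RH
    if not low <= reading.humidity_rh <= high:
        alerts.append(
            f"{reading.sensor}: humidity {reading.humidity_rh:.0f}% rH "
            f"outside {low:.0f}-{high:.0f}% rH"
        )
    return alerts


if __name__ == "__main__":
    for message in check_reading(Reading("rack-b", 29.5, 35.0)):
        print(message)
```

A real monitoring system would also need to deal with sensor faults and brief transients, which this sketch ignores.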

Fire protection

The fire protection system’s main goal should be to detect fire and raise the alarm in the early stages, then bring the fire under control without disrupting the flow of business and without threatening the personnel in the facility. Server room fire suppression technology has been around for as long as there have been server rooms. Traditionally, most computer rooms used Halon gas, but this has been shown to be environmentally unfriendly and unsafe for humans. Modern computer rooms use combinations of inert gases such as nitrogen, argon and CO2. Other solutions include clean chemical agents such as FM200 and hypoxic air systems that keep oxygen levels down. To prevent fires from spreading due to heat generated by data cables and power cords, organizations have also used cables coated with FEP tubing. This plastic reduces heat generation and protects the metal conductors.

Future-proofing

The demands of server rooms are constantly changing as organizations evolve and grow and as technology changes. An essential part of computer room design is future-proofing, so that new requirements can be accommodated with minimal effort. As computing requirements grow, so will a server room’s power and cooling requirements. As a rough guide, for every additional 100 kW of equipment installed, a further 30 kW of power is required to cool it. As a result, air conditioning designs need to have scalability designed in from the outset.
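The rough guide above can be turned into a simple planning calculation. The sketch below assumes the ratio is linear (30 kW of cooling per 100 kW of equipment) at every scale, which is only an approximation; the function names are illustrative.

```python
# Rule of thumb from the text: 30 kW of cooling for every 100 kW of equipment.
COOLING_KW_PER_IT_KW = 30.0 / 100.0


def cooling_power_kw(it_load_kw: float) -> float:
    """Estimate the cooling power needed for a given equipment load."""
    if it_load_kw < 0:
        raise ValueError("equipment load cannot be negative")
    return it_load_kw * COOLING_KW_PER_IT_KW


def total_power_kw(it_load_kw: float) -> float:
    """Equipment load plus the estimated cooling load."""
    return it_load_kw + cooling_power_kw(it_load_kw)


if __name__ == "__main__":
    for load in (100.0, 250.0, 400.0):
        print(f"{load:.0f} kW of equipment: "
              f"{cooling_power_kw(load):.0f} kW cooling, "
              f"{total_power_kw(load):.0f} kW in total")
```

Running the estimate for both the current load and the projected load gives the headroom that the air conditioning design has to allow for.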

The choice of racks in a server room is usually the prime factor when determining space. Many organisations use telco racks or enclosed cabinets to make the most of the space they have. Today, with servers that are one-rack-unit (1U) high and new blade servers, a single 19- or 23-inch rack can accommodate anywhere from 42 to hundreds of servers.
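A rack-space estimate follows directly from rack height and server height. The sketch below assumes a 42U rack, consistent with the figure of 42 one-rack-unit servers mentioned above; the reserved space for switches or power distribution is a hypothetical planning allowance.

```python
import math


def servers_per_rack(rack_units: int, server_height_u: int, reserved_u: int = 0) -> int:
    """Number of servers of a given height that fit in one rack.

    reserved_u is space set aside for other equipment (an assumed allowance).
    """
    usable_u = rack_units - reserved_u
    if usable_u < 0 or server_height_u <= 0:
        raise ValueError("invalid rack or server dimensions")
    return usable_u // server_height_u


def racks_needed(server_count: int, per_rack: int) -> int:
    """Number of racks required to house server_count servers."""
    if per_rack <= 0:
        raise ValueError("a rack must hold at least one server")
    return math.ceil(server_count / per_rack)


if __name__ == "__main__":
    full = servers_per_rack(42, 1)                    # every unit used for 1U servers
    with_reserve = servers_per_rack(42, 1, reserved_u=4)
    print(full, with_reserve, racks_needed(100, with_reserve))
```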

Redundancy

If the computer systems in a server room are mission critical, removing single points of failure and common-mode failures may be of high importance. The level of redundancy required depends on factors such as whether the organisation can tolerate an interruption whilst failover systems are activated, or whether failover must be seamless, with no business impact. Other than computer hardware redundancy, the main consideration here is the provisioning of failover power supplies and cooling.
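One way to reason about failover power or cooling is to size the installation so that the load is still covered after a unit fails. The sketch below assumes identical units and a configurable number of spares; this sizing scheme and the figures used are illustrative assumptions, not something prescribed by this document.

```python
import math


def units_required(load_kw: float, unit_capacity_kw: float, spare_units: int = 1) -> int:
    """Units needed to carry the load, plus a number of spare units."""
    if load_kw < 0 or unit_capacity_kw <= 0 or spare_units < 0:
        raise ValueError("invalid sizing parameters")
    base_units = math.ceil(load_kw / unit_capacity_kw)
    return base_units + spare_units


def survives_failures(load_kw: float, unit_capacity_kw: float,
                      installed_units: int, failed_units: int) -> bool:
    """True if the remaining units can still carry the full load."""
    remaining_units = max(installed_units - failed_units, 0)
    return remaining_units * unit_capacity_kw >= load_kw


if __name__ == "__main__":
    # Hypothetical example: 75 kW cooling load, 30 kW per cooling unit.
    installed = units_required(75.0, 30.0)
    print(installed, survives_failures(75.0, 30.0, installed, failed_units=1))
```

The same check can be repeated with two failed units to see whether a second spare is needed for the desired level of redundancy.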

Security

Physical security also plays a large role. Physical access to the site is usually restricted to selected personnel, with controls including a layered security system often starting with fencing, bollards and mantraps. Video camera surveillance and permanent security guards are almost always present if the IT room is large or contains sensitive information on any of the systems within. The use of mantraps with fingerprint recognition is becoming commonplace.

Server farm

A server farm or server cluster is a collection of computer servers, usually maintained by an organization to supply server functionality far beyond the capability of a single machine. Server farms often consist of thousands of computers which require a large amount of power to run and to keep cool. At the optimum performance level, a server farm has enormous costs (both financial and environmental) associated with it. Server farms often have backup servers, which can take over the function of primary servers in the event of a primary server failure. Server farms are typically co-located with the network switches and/or routers which enable communication between the different parts of the cluster and the users of the cluster. Operators typically mount the computers, routers, power supplies, and related electronics on 19-inch racks in the IT room.

Applications

Server farms are commonly used for cluster computing. Many modern supercomputers comprise giant server farms of high-speed processors connected by either gigabit ethernet or custom interconnects. Server farms are increasingly being used instead of or in addition to mainframe computers by large enterprises, although server farms do not yet reach the same reliability levels as mainframes. Because of the sheer number of computers in large server farms, the failure of an individual machine is a commonplace event, and the management of large server farms needs to take this into account by providing support for redundancy, automatic failover, and rapid reconfiguration of the server cluster.

Performance

The performance of the largest server farms (thousands of processors and up) is typically limited by the performance of the IT room cooling systems and the total electricity cost rather than by the performance of the processors. Computers in server farms run 24/7 and consume large amounts of electricity. For this reason, the critical design parameter for both large and continuous systems tends to be performance per watt rather than cost of peak performance or (peak performance / (unit * initial cost)). Also, for high-availability systems that must run 24/7 (unlike supercomputers, which can be power-cycled to demand and also tend to run at much higher utilizations), more attention is placed on power-saving features such as variable clock speed and the ability to turn off computer parts, processor parts, and entire computers (WoL and virtualization) according to demand without bringing down services.
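To show how ranking by performance per watt can differ from ranking by peak performance, the sketch below compares two hypothetical server options. The names and figures are invented for illustration, and "performance" stands for any consistent throughput measure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerOption:
    name: str           # hypothetical model name
    performance: float  # throughput in any consistent unit
    power_w: float      # power draw in watts at that throughput

    def perf_per_watt(self) -> float:
        return self.performance / self.power_w


def best_by_perf_per_watt(options: list[ServerOption]) -> ServerOption:
    """Pick the option that delivers the most performance per watt."""
    return max(options, key=lambda option: option.perf_per_watt())


def best_by_peak(options: list[ServerOption]) -> ServerOption:
    """Pick the option with the highest peak performance, ignoring power."""
    return max(options, key=lambda option: option.performance)


if __name__ == "__main__":
    candidates = [
        ServerOption("option-a", performance=1000.0, power_w=500.0),
        ServerOption("option-b", performance=800.0, power_w=320.0),
    ]
    print(best_by_peak(candidates).name, best_by_perf_per_watt(candidates).name)
```

In this invented example the option with the higher peak performance is not the one with the better performance per watt, which is the trade-off the paragraph above describes.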

Network infrastructure

Communications in IT rooms today are most often based on networks running the IP protocol suite. IT rooms contain a set of routers and switches that transport traffic between the servers and to the outside world. Redundancy of the Internet connection is often provided by using two or more upstream service providers.

Some of the servers in the IT room are used for running the basic Internet and intranet services needed by internal users in the organization, e.g., e-mail servers, proxy servers, and DNS servers.

Network security elements are also usually deployed: firewalls, VPN gateways, intrusion detection systems, etc. Also common are monitoring systems for the network and some of the applications. Additional off-site monitoring systems are also typical, in case of a failure of communications inside the IT room.

IT room infrastructure management

IT room infrastructure management (IRIM) is the integration of information technology and facility management disciplines to centralize monitoring, management and intelligent capacity planning of an IT room’s critical systems. Achieved through the implementation of specialized software, hardware and sensors, IRIM enables a common, real-time monitoring and management platform for all interdependent systems across IT and facility infrastructures.

Depending on the type of implementation, IRIM products can help IT room managers identify and eliminate sources of risk to increase availability of critical IT systems. IRIM products also can be used to identify interdependencies between facility and IT infrastructures to alert the facility manager to gaps in system redundancy, and provide dynamic, holistic benchmarks on power consumption and efficiency to measure the effectiveness of “green IT” initiatives.
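The idea of alerting on gaps in system redundancy can be sketched as a check over an inventory that maps each IT device to the power feeds it depends on. The inventory structure, the device and feed names, and the minimum of two feeds are all assumptions made for illustration; they do not reflect the data model of any specific IRIM product.

```python
def find_redundancy_gaps(power_feeds: dict[str, set[str]], min_feeds: int = 2) -> list[str]:
    """Return the devices that depend on fewer than min_feeds distinct power feeds."""
    return sorted(
        device for device, feeds in power_feeds.items() if len(feeds) < min_feeds
    )


if __name__ == "__main__":
    # Hypothetical inventory: device name -> power feeds it is connected to.
    inventory = {
        "db-01": {"pdu-a", "pdu-b"},
        "web-01": {"pdu-a"},
        "san-01": set(),
    }
    for device in find_redundancy_gaps(inventory):
        print(f"redundancy gap: {device}")
```

The same pattern could be applied to cooling zones or network uplinks by swapping the inventory that is passed in.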

Tasks

  1. Identify the major building blocks of the infrastructure future state architecture.
  2. Produce macro and micro design.
  3. Peer review designs and obtain approval.
  4. Engage and manage third parties to produce delegated designs.
  5. Update IT room data repository.

Hints and tips

  • Reviewing source infrastructure and network architecture may inform or constrain the target design decisions. Do not attempt to redesign the application architecture and systems landscape. These higher level architectures will be migrated in their entirety as part of the IT room migration approach. The focus of this design work should be on the major elements of the target infrastructure.
  • Focus on the major common infrastructure such as the IT room LAN arrangements, SAN storage, VMware infrastructure, middleware components, database subsystems, etc.
  • In most cases the source application architecture will be replicated ‘like for like’; in this case, ensure that existing designs are suitable for deployment in the target IT room. Where remediation work is required, ensure that it is documented and reviewed.
  • Where re-platforming is to take place then full application design validation should be undertaken to ensure the target application stack will work.
  • Each shared service must be examined in line with the agreed treatment of source to target migration.
  • Each migration unit must be examined to ensure that it supports the planned Disaster Recovery approach.

Activity output
