Tuesday, August 21, 2012

Evolution of AIX

Those of you who have worked with Sun Solaris, HP-UX and/or other flavors of UNIX* prior to using AIX* probably wonder what took you so long. I know I did. I started working with AIX in the late 1990s, and although it took a lot longer for me to feel the same about IBM* UNIX technology-based hardware, I'm very glad I made the transition to IBM from Sun and HP. This article focuses on the power of the POWER* system and its evolution, as well as the evolution and history of the systems software that drives its architecture. Here, I'll discuss the Power Architecture*, AIX and Linux* in detail, and touch on their past, present and future.

History

POWER stands for "Performance Optimization With Enhanced RISC" and is the processor architecture used by many IBM systems today. It's descended from the 801 CPU and is a second-generation RISC-based processor. It was first introduced in 1990 to support UNIX technology-based RS/6000* systems. The instructions were fixed length (4 bytes) and had consistent formats. What made the architecture unique among existing RISC architectures was that it was functionally partitioned: program flow control, fixed-point computation and floating-point computation were handled by separate units.
The objective of most RISC architectures was to be extremely simple so that implementations would have an extremely short cycle time. This would result in processors that could execute instructions at the fastest possible clock rate. The Power Architecture designers chose instead to minimize the total time spent to complete a task. That time is the product of three components: the path length (the number of instructions executed), the number of cycles needed to complete an instruction, and the cycle time.
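To make that trade-off concrete, here is a minimal Python sketch of the three-component product. The function name and every number in it are my own illustrative assumptions, not measurements of any POWER or competing chip.

```python
def execution_time_ns(path_length, cycles_per_instruction, cycle_time_ns):
    """Total task time = instructions executed x cycles per instruction x time per cycle."""
    return path_length * cycles_per_instruction * cycle_time_ns


# Hypothetical "simple and fast" design: shortest cycle time, one cycle per
# instruction, but more instructions needed to finish the task.
simple_design = execution_time_ns(1_000_000, 1.0, 10.0)

# Hypothetical "richer" design: slightly slower clock and more cycles per
# instruction, but a much shorter path length for the same task.
richer_design = execution_time_ns(700_000, 1.2, 11.0)

print(f"simple design: {simple_design / 1e6:.2f} ms")
print(f"richer design: {richer_design / 1e6:.2f} ms")
```

With these made-up inputs the design with the slower clock still finishes first, which is the point of minimizing the whole product rather than cycle time alone.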
During the early '90s, five different RISC architectures were actively competing with one another. IBM partnered with Apple and Motorola to develop a common architecture, which would meet the standards of an alliance they would eventually form. IBM's first RISC design, the 801, was simple, and all of its instructions completed in one cycle, but it lacked floating-point and parallel-processing capabilities. The Power Architecture was a real attempt to correct this flaw. It consisted of more than 100 instructions and was known as a complex RISC system.
The first POWER chip consisted of 800,000 transistors per chip and was functionally partitioned. It had separate floating-point registers and could scale from low-end to the highest-end workstations. The first implementation actually spread several chips across a single motherboard, but was later refined into one RISC chip with more than 1 million transistors.
Released in 1993, the POWER2* chip was the standard-bearer for almost five years. It contained 15 million transistors per chip. It also added a second floating-point unit (FPU) and extra cache.
The POWER3* architecture, the first 64-bit symmetric multiprocessor, was designed to work on both scientific and technical computing applications. It included a data prefetch engine, dual floating-point execution units and a non-blocking interleaved data cache. It used copper interconnect, which delivered double the performance for the same price.
The POWER4* architecture was released in 2001 with 174 million transistors per processor. It incorporated 0.18-micron copper and silicon-on-insulator (SOI) technology. Each processor had 64-bit, 1 GHz PowerPC* cores and could execute as many as 200 instructions simultaneously. It became the driving force behind the IBM POWER4 servers, including the iSeries* and pSeries* lines, which allowed for logical partitioning. As wonderful as the POWER4 systems were, if you purchased one shortly before the System p5* platforms were released, you probably weren't a happy camper.

POWER5

There were many design objectives for the POWER5* technology. Some of them were:
  • Maintain binary compatibility with older POWER4 systems
  • Enhance and extend SMP scalability
  • Improve performance and reliability
  • Provide additional server flexibility
  • Improve power-efficiency
  • Provide virtualization capabilities
The POWER5 architecture, introduced in 2003, contained 276 million transistors per processor. It was based on the 130-nanometer copper/silicon-on-insulator (SOI) process and featured chip multiprocessing, a larger cache, a memory controller on the chip, simultaneous multithreading (SMT), advanced power management and improved Hypervisor technology.
POWER5 was created to allow up to 256 logical partitions and was available on both the System p* and System i* platforms. Each POWER5 core is designed to support both SMT and single-threaded modes. Software (the Hypervisor) switches the processor from SMT to single-threaded mode.
On the POWER5 chip image shown in the figure above, FXU refers to the fixed-point (integer) unit; ISU refers to the Instruction Sequencing Unit; LSU refers to the Load Store Unit; L2 refers to the Level 2 cache; and MC refers to the Memory Controller.

The figure above more clearly illustrates the interrelationships of the chip and SMT. The most powerful improvements on the actual core chip are:
  • Enhanced memory subsystem. The improved L1 cache design featured a 2-way set-associative i-cache, a 4-way set-associative d-cache and a new replacement algorithm (LRU vs. FIFO). The larger L2 cache was 1.9 MB and 10-way set associative. The improved L3 cache was 36 MB and 12-way set associative; with the L3 on the processor side of the fabric, it satisfied L2 cache misses more frequently and avoided traffic on the inter-chip fabric. Finally, an on-chip L3 directory and memory controller reduced off-chip delays after an L2 miss and lowered memory latency.
  • Improved pre-fetch algorithms
  • Enhanced performance
  • SMT
  • Hardware support for Micro-Partitioning* of servers
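The switch from FIFO to LRU replacement mentioned in the list above is easy to illustrate. What follows is a minimal Python sketch of a single 4-way cache set under each policy; the function names and the access pattern are my own assumptions for illustration, and real hardware typically implements an approximation of LRU rather than the exact bookkeeping shown here.

```python
from collections import OrderedDict, deque


def count_hits_fifo(accesses, ways):
    """Count hits in one cache set that evicts the line loaded longest ago."""
    resident = deque()  # load order, oldest line on the left
    hits = 0
    for line in accesses:
        if line in resident:
            hits += 1  # a hit does not change FIFO order
        else:
            if len(resident) == ways:
                resident.popleft()  # evict the oldest-loaded line
            resident.append(line)
    return hits


def count_hits_lru(accesses, ways):
    """Count hits in one cache set that evicts the least recently used line."""
    resident = OrderedDict()  # least recently used line first
    hits = 0
    for line in accesses:
        if line in resident:
            hits += 1
            resident.move_to_end(line)  # mark as most recently used
        else:
            if len(resident) == ways:
                resident.popitem(last=False)  # evict least recently used
            resident[line] = True
    return hits


# Hypothetical access pattern: line "A" is reused between streaming accesses.
pattern = ["A", "B", "A", "C", "A", "D", "A", "E", "A", "F"]
print("FIFO hits:", count_hits_fifo(pattern, ways=4))
print("LRU hits: ", count_hits_lru(pattern, ways=4))
```

In this pattern FIFO eventually evicts the heavily reused line simply because it was loaded first, while LRU keeps it resident, so LRU scores one more hit.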
As a result of its dual-core design and support for SMT, one POWER5 chip actually appears as a 4-way microprocessor to the OS. Processors using SMT can issue multiple instructions from different code paths, that is, from both hardware threads, during a single processor cycle.
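The arithmetic behind that "4-way" figure can be sketched in a few lines of Python. The function name and its defaults are assumptions of mine for illustration; they simply encode the two cores per chip and two hardware threads per core described above.

```python
def logical_processors(chips, cores_per_chip=2, threads_per_core=2):
    """Number of logical processors the OS sees for a given chip count."""
    return chips * cores_per_chip * threads_per_core


# One dual-core POWER5 chip with SMT enabled (two threads per core).
print(logical_processors(chips=1))

# The same chip in single-threaded mode (one thread per core).
print(logical_processors(chips=1, threads_per_core=1))
```

Switching a core to single-threaded mode halves the count the OS sees, which is why the SMT setting matters when sizing a partition.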

Hypervisor

 

Let's look at the Hypervisor in the figure above, without which there is no virtualization. As we examine the architecture more closely, the layers above the POWER Hypervisor* are similar, but their contents are characterized by the OS. The layers of code supporting AIX and Linux consist of system firmware and Run-Time Abstraction Services (RTAS). System firmware is composed of low-level firmware code that performs server-unique I/O configurations, and the open firmware that contains the boot-time drivers, boot manager and device drivers required to initialize adapters and hardware devices. RTAS consists of code that supplies platform-dependent accesses and can be called from the OS. These calls are all passed to the Hypervisor, which handles all I/O interrupts.
Open firmware and RTAS are both platform-specific firmware and both are tailored by the platform developer to manipulate the specific platform hardware. The POWER4 processor introduced support for LPAR with a new privileged processor state called POWER Hypervisor* mode. In the POWER5 processor, further design enhancements were introduced that enable the sharing of processors by multiple partitions. The POWER Hypervisor Decrementer (HDEC) is a new hardware facility in the POWER5 design programmed to provide the POWER Hypervisor with a timed interrupt independent of partition activity.
The POWER5 processor can be packaged either as a dual-chip module (DCM), with one dual-core chip per module, or as a multi-chip module (MCM), with four dual-core chips per module. The POWER5+* systems followed and introduced a quad-core module (QCM). Only recently, the POWER5+ chips were announced as shipping on IBM's high-end p590 and p595 servers. The roadmap for POWER is further illustrated in the figure below.

Year | Version | Innovations
1986-1992 | AIX 2 & 3 | RISC support, dynamic kernel, JFS, LVM, SMIT
1994-1996 | AIX 4.1 & 4.2 | 4-8 way SMP, NFS V3, NIM support, HACMP, PowerPC
1997-1999 | AIX 4.3 | 24-way SMP, 64-bit hardware support, IPSEC, Workload Manager, direct I/O, Alt.Disk.Install
2001 | AIX 5.1 | POWER4 support, logical partitioning, Linux affinity, JFS2, dynamic CPU, 64-bit kernel
2002 | AIX 5.2 | DLPAR, 16 TB filesystem support, concurrent I/O, multi-path I/O, CUoD
2004 | AIX 5.3 | SMT, APV (VIO, PLM, Micro-Partitioning), POWER5 support, 64-way SMP, NFS version 4, shrinking JFS2 f/s, SUMA, RAS
