
Copyright © 1998 by Fabian Pascal. All Rights Reserved

 

32 Bit Computing

By Fabian Pascal

Intel "Compatibility" Under Windows NT

Users usually know that when they upgrade hardware components such as disk or video controllers, the new components must be compatible with the operating system, which means having proper drivers. And if they don't know, NT will make it obvious, either by failing to boot (new disk controller) or by booting in VGA mode (new video controller). Neither case requires reinstalling NT.

But what about the system board and processor? A current issue of interest is how Super7 systems with Pentium-compatible CPUs from AMD and Cyrix compare with Pentium II Slot 1 systems. Given the common x86 architecture of Slot 1 and Super7 system boards and CPUs, shouldn't users be able to switch from one to the other transparently, if they keep all other components the same? That, after all, is what Intel compatibility is all about.

That is what I assumed when I benchmarked the two platforms -- system board and CPU combinations -- which requires keeping all other system elements constant. I ran the tests on a Slot 1 system, then migrated the components -- including the hard drive with NT -- to a Super7 system and reran the benchmarks, as per Listing 1. Note that except for system boards -- Abit's BX6 Slot 1 vs. AOpen's AX59Pro Super7 -- and CPUs -- Intel's PII/300 vs. Cyrix MII-333 -- the configurations are identical.

Slot 1 Platform
----------------
System Board: Abit BX6
chipset: Intel 440BX
BIOS: Award 4.51PG
CPU: Intel Pentium II 300 MHz

Super7 Platform
----------------
System Board: AOpen AX59Pro
chipset: VIA MVP3
BIOS: Award 4.51PG
CPU: Cyrix MII 333 MHz

Common Components
------------------
128MB PC100 CAS2 SDRAM
Eizo FX-E8 21" monitor
Matrox Millennium G200 AGP
Adaptec AHA2940UW Dual SCSI controller
Seagate Cheetah 9LP UW drive
SyQuest SyJet
Plextor UltraPleX 32X CDROM
Imation LS-120 SuperDisk

Listing 1: Slot 1 and Super7 System Configurations
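The "keep everything else constant" design of Listing 1 can be checked mechanically. The following is a minimal Python sketch, not part of the original tests: it encodes the platform-specific rows of Listing 1 as dictionaries and reports only the fields that differ. The dictionary keys and the function name platform_diff are illustrative assumptions.

```python
# Platform-specific rows of Listing 1; the key names are illustrative.
SLOT1 = {
    "system_board": "Abit BX6",
    "chipset": "Intel 440BX",
    "bios": "Award 4.51PG",
    "cpu": "Intel Pentium II 300 MHz",
}
SUPER7 = {
    "system_board": "AOpen AX59Pro",
    "chipset": "VIA MVP3",
    "bios": "Award 4.51PG",
    "cpu": "Cyrix MII 333 MHz",
}


def platform_diff(a, b):
    """Return {field: (a_value, b_value)} for every field whose value differs."""
    fields = sorted(set(a) | set(b))
    return {f: (a.get(f), b.get(f)) for f in fields if a.get(f) != b.get(f)}


if __name__ == "__main__":
    for field, (old, new) in platform_diff(SLOT1, SUPER7).items():
        print(f"{field}: {old} -> {new}")
```

The chipset shows up as a difference alongside the board and CPU because it is part of the system board; the BIOS is the same on both platforms.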

The Super7 system kept crashing, generating several types of blue screen STOP errors:

KMODE_EXCEPTION_NOT_HANDLED (0x1E)
UNEXPECTED_KERNEL_MODE_TRAP (0x7F)
PAGE_FAULT_IN_NONPAGED_AREA (0x50)
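When several crashes have to be tallied, it helps to decode the STOP line of each blue screen into its code and name. The sketch below is one way to do that in Python. The three code/name pairs come from the list above; the line layout matched by the regular expression ("*** STOP: 0x... (p1,p2,p3,p4)") is an assumption about how the blue screen prints the error, and the function name parse_stop is hypothetical.

```python
import re

# Bug-check codes and names exactly as listed in the text above.
STOP_NAMES = {
    0x1E: "KMODE_EXCEPTION_NOT_HANDLED",
    0x7F: "UNEXPECTED_KERNEL_MODE_TRAP",
    0x50: "PAGE_FAULT_IN_NONPAGED_AREA",
}

# Assumed blue-screen layout: "*** STOP: 0x0000001E (p1,p2,p3,p4)".
STOP_RE = re.compile(r"STOP:\s*0x([0-9A-Fa-f]+)\s*\(([^)]*)\)")


def parse_stop(line):
    """Return (code, name, params) for a STOP line, or None if it does not match."""
    match = STOP_RE.search(line)
    if match is None:
        return None
    code = int(match.group(1), 16)
    # Parameters are comma-separated hex values; skip empty fragments.
    params = [int(p.strip(), 16) for p in match.group(2).split(",") if p.strip()]
    return code, STOP_NAMES.get(code, "UNKNOWN"), params
```

Codes outside the three seen here are reported as "UNKNOWN" rather than guessed at.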

When a lengthy online investigation into the Microsoft Knowledge Base (MSKB) offered no insight into these errors (see sidebar), I contacted Microsoft technical support and was somewhat surprised by the following response:

"...we always recommend reinstalling Windows NT when you change a core component like a motherboard or CPU, especially in instances where you're changing vendors and chipsets; differences in how the CPUs address memory and how the chipsets handle addressing and enumeration can give NT serious headaches. In cases like this, the change is even more drastic than, say, changing between two vendors supplying BX motherboards. Resource differences between an Intel BX chipset system and, say, a VIA MVP3+ system can be dramatic and the only reliable way to find out if there is a problem with the software or the hardware is to install Windows NT cleanly, allowing it to run through its detection routine from the ground up, building a baseline snapshot of the system and its resources. There are third party programs that will clone an entire system, but even they recommend you do it on identical hardware. The only way we support cloning is through drive imaging before the GUI portion of setup, meaning just the files have been copied."

Furthermore, as officially stated in MSKB article Q162001, help from Microsoft in diagnosing problems in such cases may not be forthcoming:

"Microsoft does not provide support for systems that have been installed by duplicating fully installed copies of either Windows NT Workstation and Server..."

Consider now users who want to upgrade or change their system boards and/or CPUs, in either direction. Having to reinstall NT is not a desirable option, as most users who have had to do it have painfully learned. Windows systems tend to be highly customized in terms of GUI settings and applications. Such customizations are arrived at over extended periods of time and are quite a hassle to reconstruct. On the other hand, users certainly don't want to lose technical support for an element as important as their operating system.

NT Hardware Basics

NT's boot-time hardware recognizer, NTDETECT, together with NTOSKRNL, creates a registry database in which information about the installed hardware components is recorded at boot time:

Computer ID
Bus/Adapter Type
Video
Keyboard
Communication Ports
Parallel Ports
Floppy Disks
Mouse/Pointing Device

where Computer ID contains platform data. The database has four sections under the HKEY_LOCAL_MACHINE\HARDWARE key, as follows:

[HKEY_LOCAL_MACHINE\HARDWARE]
[HKEY_LOCAL_MACHINE\HARDWARE\DESCRIPTION]
[HKEY_LOCAL_MACHINE\HARDWARE\DEVICEMAP]
[HKEY_LOCAL_MACHINE\HARDWARE\OWNERMAP]
[HKEY_LOCAL_MACHINE\HARDWARE\RESOURCEMAP]

The information in these sections is accessible in user-readable form with NT Diagnostics, one of the Administrative Tools. Some system board and CPU information, for example, is recorded in the DESCRIPTION section of the hardware database. Additional platform information is scattered throughout the other sections in the form of components attached to the system board, e.g. buses, ports, etc.
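The same DESCRIPTION data can also be read programmatically. The following Python sketch uses the standard-library winreg module, which exists only on Windows, to fetch two CPU values. It is an illustration under stated assumptions, not something used for this article: the subkey System\CentralProcessor\0 and the value names Identifier and VendorIdentifier are assumed from the layout commonly found on NT-family systems, and the function name read_cpu_description is hypothetical.

```python
try:
    import winreg  # standard library, available on Windows only
except ImportError:
    winreg = None

# Subkey of HKEY_LOCAL_MACHINE; assumed layout, not taken from the article.
CPU_KEY = r"HARDWARE\DESCRIPTION\System\CentralProcessor\0"
CPU_VALUES = ("Identifier", "VendorIdentifier")


def read_cpu_description():
    """Return {value_name: data} for the first CPU, or {} when unavailable."""
    if winreg is None:
        return {}  # not running on Windows
    found = {}
    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, CPU_KEY) as key:
            for name in CPU_VALUES:
                try:
                    found[name], _value_type = winreg.QueryValueEx(key, name)
                except OSError:
                    pass  # this value is not present on this system
    except OSError:
        pass  # the key itself is missing or unreadable
    return found


if __name__ == "__main__":
    print(read_cpu_description())
```

Running it before and after a platform switch would show whether the boot-time database picked up the new processor, which is the behavior described for NTDETECT here.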

Because the hardware database is recreated each time NT boots, based on whatever hardware NTDETECT recognizes, one can get the impression that, insofar as the platform is concerned, NT behaves in a Plug and Play (PnP) fashion. Indeed, when an NT installation created on the Slot 1 system is used on the Super7 system, the registry hardware database contains the correctly updated platform information. But we know that, unlike Windows 98, NT is not PnP, so what gives?

The NT component that contains platform-specific code is actually the Hardware Abstraction Layer (HAL), described as:

"... a kernel-mode library of hardware manipulating routines provided by Microsoft or by the hardware manufacturer ... [which] lies ... between the hardware and the rest of the operating system ... It enables the same operating system to run on different platforms with different processors ... Different processor configurations often use different HAL drivers ... A HAL is installed during setup ..."

The decision as to which HAL to install for a given platform is, thus, made not at boot time but by the NT setup program during installation, based on the platform it detects at that point.

Note: It is not entirely clear whether the HAL drivers referenced are part of HAL, or external to it, which could make a difference in the context under consideration.

Suppose you replace one platform (say, Slot 1/PII) with another (Super7/MII), or vice-versa, keeping everything else the same, which is what a real upgrade would look like. Microsoft's official position requires you to reinstall NT so that, supposedly, the NT installation program will detect the new platform and install the HAL specific to it. This implies that different HALs exist for the two platforms! Thus, with the HAL for the Slot 1 platform installed, reinstalling NT on the Super7 system would detect the change and replace the HAL with the Super7 one. But note the qualification in Microsoft's official position on this:

"Changing the motherboard can mean a new HAL is required. Whether a different HAL is installed totally depends on the hardware. Some systems have specific HALs. Sometimes systems that seem identical use different HALs."

But as far as I can tell, there is only one HAL for Slot 1 and Super7 platforms, regardless of vendor, chipset, or CPU: the standard HAL for Intel systems. All the other I386 HALs documented in the Windows NT Resource Kit pertain to multiprocessor systems, or a few specific proprietary or customized platforms, as follows:

486 c Step Processor
AST SMP systems
Cbus Systems
MCA-based systems
Most Intel multiprocessor systems (and uniprocessor version)
NCR SMP computers
Olivetti SMP computers
Compaq Systempro
WYSE7 systems

Indeed, when I installed NT from scratch on either platform, the standard Intel HAL was installed in both cases.
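One way to repeat this check without relying on memory is to look at what Setup recorded. The sketch below assumes that the repair log (setup.log in the repair directory under the system root) contains an entry of the form \WINNT\system32\hal.dll = "source-file","checksum", where the quoted source file names the HAL that Setup copied. That layout is an assumption and was not verified as part of this article; the function name installed_hal_source and the sample text in the usage note are hypothetical.

```python
import re

# Assumed entry shape: \WINNT\system32\hal.dll = "hal.dll","b6b6"
HAL_RE = re.compile(r'\\system32\\hal\.dll\s*=\s*"([^"]+)"', re.IGNORECASE)


def installed_hal_source(setup_log_text):
    """Return the HAL source file name recorded in setup.log text, or None."""
    match = HAL_RE.search(setup_log_text)
    return match.group(1).lower() if match else None


def same_hal(log_text_a, log_text_b):
    """True when both logs name the same HAL source file (and both name one)."""
    a = installed_hal_source(log_text_a)
    b = installed_hal_source(log_text_b)
    return a is not None and a == b
```

Given the text of the log from each installation, same_hal(slot1_log, super7_log) returning True would correspond to the result reported above: the same HAL on both platforms.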

Note: Perhaps the standard Intel HAL has different routines for different platforms, which are selectively invoked depending on which platform is detected at boot time. Aside from the fact that this would require HALs to change as new platforms emerge (which, as far as I know, has not happened), if that were the case, then it would be the equivalent of PnP and, therefore, no NT reinstallation would be necessary anyway.

A Less Drastic Solution

To avoid the loss of GUI customization and applications that a reinstall would cause, a Microsoft technician recommended the following:

"Rerun Windows NT Setup for the new hardware platform. The software settings would remain, as long as the choice was an upgrade instead of a new install."

But the only Setup option that does not overwrite the registry is "Repair damaged NT installation"; the "upgrade" option overwrites customization. And according to the official Microsoft position: "An upgrade in place may fix things. A repair will not."

In the case of the two test configurations, the repair option did not change the HAL. Given that a full NT reinstallation did not change it either, there is no reason to suspect that the two platforms use different HALs, and the remarks about differences in chipsets and processors "confusing NT" do not apply to Slot 1 and Super7 systems. There appears to be no need for reinstallation.

Conclusion and Recommendations

My debugging experience in this (see sidebar) and other cases is that STOP crashes are extremely hard to diagnose. With the HAL evidently not the issue, it is unlikely that the crashes were caused by the platform switch. Further support for this is provided by the fact that 1E STOP errors continued to occur even after repairing and reinstalling NT. What seems to be happening here is that MS and its support people require an NT reinstall/repair simply because they cannot tell what is causing these STOP errors.

The good news is that users do not really need to reinstall or repair NT when upgrading platforms. The bad news is that if the upgrade happens to coincide with some other problem, Microsoft, unable to diagnose it, may push users to reinstall anyway.

There is an interesting aside to this. This article was rejected by a mainstream publication on the following grounds:

"It seems like this is a lot of trouble to go through just to basically recommend what MSFT said in the first place. Also, with NT5 on the way (even closer by the time this prints), and many people will be upgrading to that, is this a likely scenario? What is the current install base of Cyrix/AMD platforms, and how often is this scenario of switching platforms likely to occur? This is not an upgrade that corporate IT is likely to undertake, because the overhead is far too high. Upgrades of this type would be rip and replace, such as how organizations are moving from old Win16 platforms to NT. Is the recommendation that people should try this kind of upgrade? Are you saying the MSFT is wrong in what it is saying in response to your issues? Also, why would you want to encourage users to try something that MSFT specifically says it will not support? We end up the article without really answering the issue at hand. It opens with this huge problem, and then just says "try to fix it, and if it fails, reinstall it". Seems to be what MSFT said to begin with."

Of course, this (typically) misses the whole point of the article, as my rebuttal spells out:

"The whole point of the article is that a lot of trouble is caused by MSFT's reinstall requirement, which is unnecessary. In essence, the problem is that when crashes occur after an upgrade, MSFT will require a reinstall only because they cannot diagnose the problems, which have nothing to do with the upgrade. Moreover:

* NT5 is a MUCH more complicated system than NT4 and it is quite unclear that the upgrade will be en masse. Neither is it clear that it will resolve this specific issue.

* The problem exists in both upgrade directions; it is not limited to a PII to AMD upgrade.

* The switches to the Super7 platform have been increasing considerably and will intensify as forthcoming processors come out. AMD is giving Intel a run for its money and top vendors such as Compaq, HP and Dell are all selling AMD systems and will do so even more with the K6-3 and K7 CPUs.

* Corporate emphasis is fine, but it should not be exclusive.

* Rip and replace is not a cost-effective policy. In fact, one point of the Super7 platform is to enable users to upgrade from old PII/mb combos to newer ones at minimal cost.

* Such upgrades may be cost-effective in many instances, but if MSFT requires an NT reinstall on top of them, it actually discourages them."