Hacking Memory
A descent into madness…
The problem
My venerable Dell Precision 3630 hails from 2019. It has the good old Intel i7-8700K, a pretty solid CPU, even today. However, the one area where it’s sorely lacking is memory. It was only given 32GB of 2666MHz DDR4, which is certainly showing its limitations.
I’ve built plenty of computers and upgraded others, so I’m reasonably confident poking around inside a case. Installing a few sticks of memory and maybe poking some BIOS knobs is well within my capabilities. Or so I thought. With my four sticks of new 3200MHz DDR4, 16GB each (64GB, apparently the maximum supported on my motherboard), I got stuck into the replacement.
Physical installation of the memory was no problem, although the clearance between the slots and the CPU heat pipes is closer than seems ideal. Upon reboot, I was greeted with full (actually full, not just maximum load) fans and a few silent resets. Eventually, I got a screen saying that my system clock was incorrect. I suppose the system is nearly 6 years old at this point, and the real time clock battery has had enough.
In the BIOS setup, I checked out the new memory. Luckily, it recognised the full 64GB. Unluckily, it decided to run it at the JEDEC-approved 2133MHz. Maybe if the original memory had run at this speed, I would’ve been fine, but given that it’s usually just a click away to enable the XMP profile to run at its rated speed, I was chasing more. After desperately searching the entire BIOS setup, I was shocked to see that there are no settings at all for configuring memory! I’m sure you could pragmatically argue that Dell is focusing on giving you a stable workstation experience, and going into soft-overclocking territory isn’t congruent with that. Cynically, they probably just want to sell you their expensive “compatible” upgrades.
UEFI hacking
A somewhat desperate and cursory search revealed that there is quite a lot of activity around UEFI “hacking”, and Dell-specific tools exist. There was a Reddit post with somebody trying to achieve a similar goal: in their case, it was to increase the maximum supported memory. While also battling the issue of my OS not actually booting, I fiddled around with the UEFI hacking.
The first step is to extract the firmware image. I found that I was on UEFI version 2.31.0, so I downloaded the matching update executable from Dell and ran it through the Dell PFS BIOS Assembler. As far as I can tell, that tool was intended for downgrades, but it also extracts the individual component updates, which gives us a System BIOS with BIOS Guard v2.31.0.bin file to play with.
Next up, I used UEFITool to locate the Setup section, which is otherwise hard to spot in a sea of GUIDs. I searched for sections containing the string “XMP” and got a hit, which gave me some hope that this was a feasible endeavour. The hit took me directly to the right section, whose PE32 image I extracted as a bin file to dig deeper into.
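To illustrate what that string search amounts to, here is a minimal Python sketch. It assumes the module body has already been extracted and decompressed (UEFITool takes care of that; a naive search over the raw image may find nothing if the volume is compressed). The function name find_string_offsets and the sample bytes are placeholders of mine, not anything from the firmware. UEFI strings are usually stored as UTF-16LE, so both encodings are checked.

```python
def find_string_offsets(data: bytes, needle: str) -> dict:
    """Return the offsets of `needle` in `data` for ASCII and UTF-16LE encodings."""
    results = {}
    encodings = (
        ("ascii", needle.encode("ascii")),
        ("utf-16le", needle.encode("utf-16-le")),
    )
    for label, encoded in encodings:
        offsets = []
        start = data.find(encoded)
        while start != -1:
            offsets.append(start)
            # Continue searching just past the previous hit.
            start = data.find(encoded, start + 1)
        results[label] = offsets
    return results


# Synthetic sample data, not real firmware contents.
sample = b"\x00\x01XMP\x00" + "XMP profile 1".encode("utf-16-le")
print(find_string_offsets(sample, "XMP"))
```

On a real extracted body you would read the file with pathlib.Path(...).read_bytes() and pass the result in place of sample.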
To turn the binary file into something useful, we can use IFRExtractor-RS. The readme file gives the perfect description of what and why we need this:
UEFI Internal Form Representation (IFR) is a binary format that UEFI Human Interface Infrastructure (HII) subsystem uses to store strings, forms, images, animations and other things that eventually supposed to end up on BIOS Setup screen. In many cases there are multiple settings that are still present in IFR data, but not visible from BIOS Setup for various reasons, and IFR data can also help in finding which byte of which non-volatile storage available to UEFI corresponds to which firmware setting.
Searching through the resulting file leads to the following section, which is hidden inside a “Memory Overclocking” form that I’m sadly not presented with in my setup screen.
OneOf Prompt: "DIMM profile", Help: "Select DIMM timing profile that should be used.", QuestionFlags: 0x14, QuestionId: 0x2784, VarStoreId: 0x1, VarOffset: 0x91D, Flags: 0x10, Size: 8, Min: 0x0, Max: 0x3, Step: 0x0
    OneOfOption Option: "Default DIMM profile" Value: 0, Default, MfgDefault
    OneOfOption Option: "Custom profile" Value: 1
    SuppressIf
        EqIdVal QuestionId: 0xF63, Value: 0x0
        OneOfOption Option: "XMP profile 1" Value: 2
    End
    SuppressIf
        EqIdVal QuestionId: 0xF63, Value: 0x0
            EqIdVal QuestionId: 0xF63, Value: 0x1
            Or
        End
        OneOfOption Option: "XMP profile 2" Value: 3
    End
End
This is exactly what I want. Take note of three things here: the VarStoreId is 0x1, the VarOffset is 0x91D, and the values I’m after are 2 or 3, for XMP profile 1 or 2 respectively. Looking up VarStoreId 0x1 elsewhere in the same output shows that this variable lives in the Setup variable store, and that name is important for the next step!
VarStore Guid: EC87D643-EBA4-4BB5-A1E5-3F3E36B20DA9,
VarStoreId: 0x1, Size: 0x1951, Name: "Setup"
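Picking these numbers out by eye works for one setting, but for anyone hunting through a longer dump, a small script can do it. The following is only a sketch: it assumes the text layout shown in the excerpt above (other IFRExtractor-RS versions may format things differently), it ignores the SuppressIf conditions entirely, and parse_one_of is a name I made up.

```python
import re

# Condensed copy of the excerpt above (SuppressIf blocks omitted).
IFR_TEXT = '''\
OneOf Prompt: "DIMM profile", QuestionId: 0x2784, VarStoreId: 0x1, VarOffset: 0x91D, Size: 8, Min: 0x0, Max: 0x3
OneOfOption Option: "Default DIMM profile" Value: 0, Default, MfgDefault
OneOfOption Option: "Custom profile" Value: 1
OneOfOption Option: "XMP profile 1" Value: 2
OneOfOption Option: "XMP profile 2" Value: 3
'''


def parse_one_of(text: str) -> dict:
    """Extract the variable store id, byte offset and option values of one OneOf block."""
    store = re.search(r"VarStoreId:\s*(0x[0-9A-Fa-f]+)", text)
    offset = re.search(r"VarOffset:\s*(0x[0-9A-Fa-f]+)", text)
    if store is None or offset is None:
        raise ValueError("no VarStoreId/VarOffset found in the given text")
    options = re.findall(r'OneOfOption Option: "([^"]+)" Value: (\d+)', text)
    return {
        "var_store_id": int(store.group(1), 16),
        "var_offset": int(offset.group(1), 16),
        "options": {name: int(value) for name, value in options},
    }


setting = parse_one_of(IFR_TEXT)
print(hex(setting["var_offset"]), setting["options"])
```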
With everything in hand, the final step is to set this variable, and the setup_var.efi tool does the job. The invocation below writes the value 2 (XMP profile 1) to the byte at offset 0x91D in the Setup variable store. For good measure, perform a soft reboot after executing the command.
setup_var.efi -r Setup:0x91D=2
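Before (or after) writing, it can be reassuring to look at the byte from a running Linux system. Variables that the firmware exposes at runtime show up under /sys/firmware/efi/efivars as Name-GUID files, each prefixed with a 4-byte attribute field. Whether this particular firmware exposes Setup at runtime is an assumption, so treat the following as a read-only sketch; the helper names are hypothetical.

```python
from pathlib import Path

EFIVARS = Path("/sys/firmware/efi/efivars")
# GUID taken from the VarStore line of the IFR dump (efivarfs uses lowercase).
SETUP_GUID = "ec87d643-eba4-4bb5-a1e5-3f3e36b20da9"
ATTR_LEN = 4  # efivarfs stores a 32-bit attribute field before the payload


def setting_from_raw(raw: bytes, offset: int) -> int:
    """Return the byte at `offset` within the variable payload of an efivarfs file."""
    payload = raw[ATTR_LEN:]
    if not 0 <= offset < len(payload):
        raise IndexError(f"offset {offset:#x} is outside the {len(payload):#x}-byte payload")
    return payload[offset]


def read_setup_byte(offset: int) -> int:
    """Read one byte of the Setup variable. Assumes the variable is exposed at runtime."""
    raw = (EFIVARS / f"Setup-{SETUP_GUID}").read_bytes()
    return setting_from_raw(raw, offset)


# Synthetic stand-in for the real file: 4 attribute bytes plus a 0x1951-byte payload.
fake = bytes(ATTR_LEN) + bytes(0x91D) + b"\x02" + bytes(0x1951 - 0x91D - 1)
print(setting_from_raw(fake, 0x91D))
```

On the real machine, read_setup_byte(0x91D) would be the call to try. If the file does not exist, the variable is simply not visible from the OS and the UEFI shell route is the only option.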
To run it, though, I needed a UEFI shell, which (unsurprisingly) my Dell BIOS doesn’t provide. I did have GRUB available, both as my regular bootloader and on the Fedora live disk I was using for troubleshooting. Thankfully, there was enough space on the live disk’s EFI System Partition (after deleting the IA32 boot executable) to hold both setup_var.efi and Shell.efi from TianoCore EDK II. At the GRUB command line, the following sequence of commands gets us into the shell:
insmod part_gpt
insmod chain
set root='(hd0,gpt2)'
chainloader /EFI/BOOT/Shell.efi
boot
Then from the shell:
fs0:
cd \EFI\BOOT
setup_var.efi -r Setup:0x91D=2
Finally, sit around and wait for the memory training. Eventually, at the setup screen we can see the memory running at 2666MHz… Close enough! I tried using XMP Profile 2, but it didn’t seem to get through POST and my motivation to mess with this any more was waning. It’s a bit of a shame to not run at the full speed, but having a working system is more important.
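To put “close enough” into numbers, here is a back-of-the-envelope calculation of theoretical peak bandwidth. It assumes a dual-channel configuration with a 64-bit bus per channel, and it treats the quoted “MHz” figures as transfer rates in MT/s, which is what DDR4 speed grades actually describe. Real-world throughput will be lower, and timings matter too, so this is only a rough comparison.

```python
BUS_WIDTH_BYTES = 8  # one DDR4 channel is 64 bits wide


def peak_bandwidth_gbs(transfer_rate_mts: int, channels: int = 2) -> float:
    """Theoretical peak bandwidth in GB/s (dual channel assumed by default)."""
    return transfer_rate_mts * 1e6 * BUS_WIDTH_BYTES * channels / 1e9


for rate in (2133, 2666, 3200):
    bandwidth = peak_bandwidth_gbs(rate)
    print(f"DDR4-{rate}: {bandwidth:.1f} GB/s peak, {rate / 3200:.0%} of DDR4-3200")
```

By this measure, 2666 lands at roughly 83% of the 3200 rating, against about 67% for the 2133 fallback.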
Unexpected detour
The scariest, and perhaps most time-consuming part of this whole process was trying to get back to my OS. After installing the new memory and resetting the clock, the bootloader was found but would just hang indefinitely. Because everything wants to do graphical boot these days, I was either sitting at a blank screen, or watching a spinner, which gave me no indication of what was causing the issue. Obviously, since I’d been fiddling with memory, this is where I started my troubleshooting. I spent ages switching modules, dropping from all 4, to just 2, to reinstalling the original set.
A couple of clues were showing up: after deleting rhgb quiet from the boot command line and setting gfxpayload to text (maybe not required), I could see that systemd was hanging while trying to bring up my root disk. Yet when I had browsed around from the GRUB command line earlier, my root disk did show up, and I could list its files. In the Fedora live environment, however, it didn’t show up at all. I finally checked the kernel log, and saw the following message:
Found 1 remapped NVMe devices. Switch your BIOS from RAID to AHCI mode to use them
Couldn’t be clearer! I assume that a failed POST, or just the failed battery, flipped this setting. With a single NVMe SSD and one SATA drive, RAID mode is useless to me anyway. Curiously, this setting is under SATA, yet affects my NVMe drive? Back in AHCI mode my OS came back up instantly!
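In hindsight, checking the kernel log for that message first would have saved me hours. A minimal sketch of such a check, assuming the message wording quoted above (it could differ between kernel versions) and using a function name of my own invention:

```python
import re

# Wording taken from the kernel message quoted above.
REMAP_PATTERN = re.compile(r"Found (\d+) remapped NVMe devices")


def remapped_nvme_count(kernel_log: str) -> int:
    """Sum the device counts from any 'remapped NVMe devices' lines in the log text."""
    return sum(int(match.group(1)) for match in REMAP_PATTERN.finditer(kernel_log))


log = "Found 1 remapped NVMe devices. Switch your BIOS from RAID to AHCI mode to use them"
if remapped_nvme_count(log) > 0:
    print("NVMe drives are hidden behind RAID mode; switch the SATA operation setting to AHCI")
```

On a live system the log text could come from subprocess.run(["dmesg"], capture_output=True, text=True).stdout, which may need root depending on the distribution’s settings.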