Secondary Error Codes
Revision as of 17:52, 5 January 2023
SMC Errors
These errors are generated by the System Management Controller.
0001
Hex | Name | Description | Type | Repair Guide |
---|---|---|---|---|
0x01 | ERROR_V_12P0 | ANA_V12P0_PWRGD negated unexpectedly | EC_FATAL | 0001 |
ANA_V12P0_PWRGD is driven high by the ANA (later HANA) as long as the V_12P0 rail is within tolerance. If V_12P0 ever drops out of tolerance, the signal is de-asserted, causing the SMC to enter EC_FATAL and the 0001 code to be displayed on the front panel.
0002
Hex | Name | Description | Type | Repair Guide |
---|---|---|---|---|
0x02 | ERROR_V_CPUCORE | VREG_CPU_PWRGD negated unexpectedly | EC_FATAL | 0002 |
VREG_CPU_PWRGD is driven high by the CPU power controller as long as the V_CPUCORE rail is within tolerance. If V_CPUCORE ever drops out of tolerance, the signal is de-asserted, causing the SMC to enter EC_FATAL and the 0002 code to be displayed on the front panel.
0003
Hex | Name | Description | Type | Repair Guide |
---|---|---|---|---|
0x03 | ERROR_V_GPUCORE | VREG_GPU_PWRGD negated unexpectedly | EC_FATAL | 0003 |
VREG_GPU_PWRGD is driven high by the GPU power controller as long as the V_GPUCORE rail is within tolerance. If V_GPUCORE ever drops out of tolerance, the signal is de-asserted, causing the SMC to enter EC_FATAL and the 0003 code to be displayed on the front panel.
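The three power-good checks (0001 to 0003) follow the same pattern, which can be sketched as a lookup from signal to error. This is only an illustration, not SMC firmware: the signal names, hex values, and error names come from the tables above, while the `pins` dictionary, the `check_power_good` function, and the order in which signals are checked are assumptions.

```python
# Hypothetical sketch: signal, hex, and error names are taken from the tables;
# the dict-based pin interface and the check order are assumptions.
POWER_GOOD_ERRORS = {
    "ANA_V12P0_PWRGD": (0x01, "ERROR_V_12P0"),
    "VREG_CPU_PWRGD": (0x02, "ERROR_V_CPUCORE"),
    "VREG_GPU_PWRGD": (0x03, "ERROR_V_GPUCORE"),
}


def check_power_good(pins):
    """Return (hex, name) for the first de-asserted power-good signal, else None.

    `pins` maps each signal name to True (asserted) or False (de-asserted).
    """
    for signal, error in POWER_GOOD_ERRORS.items():
        if not pins[signal]:
            return error  # the SMC would enter EC_FATAL at this point
    return None
```

For example, passing `VREG_CPU_PWRGD` as `False` with the other two signals asserted yields the 0x02 entry, matching the 0002 row.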
0010
Hex | Name | Description | Type | Repair Guide |
---|---|---|---|---|
0x04 | ERROR_NO_ANA | ANA/HANA is not responding to reads or writes | EC_FATAL | 0010 |
The SMC communicates with the ANA/HANA via the SMBus. If communication is lost, the SMC enters EC_FATAL and the 0010 code is displayed on the front panel.
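Note that hex 0x04 pairs with the code 0010 rather than 0004. Every Hex and code pair listed in the tables on this page is consistent with the four-digit code being the hex value written in base 4. The sketch below encodes that observation; it is inferred from the listed values only, and the function name `front_panel_code` is hypothetical.

```python
def front_panel_code(hex_value: int) -> str:
    """Render an error byte as four base-4 digits.

    Assumption: the mapping is inferred from the Hex/code pairs on this page,
    not taken from any documented SMC rule.
    """
    if not 0 <= hex_value <= 0xFF:
        raise ValueError("error value must fit in one byte")
    digits = []
    for _ in range(4):
        digits.append(str(hex_value % 4))  # least significant base-4 digit first
        hex_value //= 4
    return "".join(reversed(digits))
```

Under this assumption, `front_panel_code(0x04)` gives `"0010"` and `front_panel_code(0x09)` gives `"0021"`, in agreement with the tables.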
0011
Hex | Name | Description | Type | Repair Guide |
---|---|---|---|---|
0x05 | ERROR_THERMAL_CPU | CPU thermal overload | EC_THERMAL | Thermal Overload |
The SMC monitors the CPU thermal diode as reported by the ANA/HANA. If the CPU temperature exceeds the Trip Temperature defined in the SMC Config, the SMC enters EC_THERMAL and the 0011 code is displayed on the front panel.
0012
Hex | Name | Description | Type | Repair Guide |
---|---|---|---|---|
0x06 | ERROR_THERMAL_GPU | GPU thermal overload | EC_THERMAL | Thermal Overload |
The SMC monitors the GPU thermal diode as reported by the ANA/HANA. If the GPU temperature exceeds the Trip Temperature defined in the SMC Config, the SMC enters EC_THERMAL and the 0012 code is displayed on the front panel.
0013
Hex | Name | Description | Type | Repair Guide |
---|---|---|---|---|
0x07 | ERROR_THERMAL_EDRAM | eDRAM thermal overload | EC_THERMAL | Thermal Overload |
The SMC monitors the eDRAM thermal diode as reported by the ANA/HANA. If the eDRAM temperature exceeds the Trip Temperature defined in the SMC Config, the SMC enters EC_THERMAL and the 0013 code is displayed on the front panel.
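The three thermal checks (0011 to 0013) compare a diode reading against a per-sensor trip temperature. A minimal sketch, assuming the readings and trip temperatures are available as plain dictionaries: the hex values and error names come from the tables above, while `check_thermal`, the dictionary keys, and the check order are hypothetical.

```python
# Hypothetical sketch: hex values and error names are from the tables;
# the sensor keys, data layout, and check order are assumptions.
THERMAL_ERRORS = {
    "CPU": (0x05, "ERROR_THERMAL_CPU"),
    "GPU": (0x06, "ERROR_THERMAL_GPU"),
    "EDRAM": (0x07, "ERROR_THERMAL_EDRAM"),
}


def check_thermal(temps, trip_temps):
    """Return (hex, name) for the first sensor above its trip temperature, else None.

    `temps` holds the readings reported by the ANA/HANA; `trip_temps` holds the
    Trip Temperature values from the SMC Config. Both are keyed by sensor name.
    """
    for sensor, error in THERMAL_ERRORS.items():
        if temps[sensor] > trip_temps[sensor]:  # "exceeds": strictly greater
            return error  # the SMC would enter EC_THERMAL at this point
    return None
```

A reading exactly equal to the trip temperature does not trigger an error in this sketch, since the text says the temperature must exceed the trip value.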
0020
Hex | Name | Description | Type | Repair Guide |
---|---|---|---|---|
0x08 | ERROR_GPU_RST_DONE | GPU_RST_DONE signal not asserted after seqUnReset time passed | EC_BOOT | 0020 |
On power up, once GPU power and clocking are available, the SMC starts seqUnReset and releases the GPU from reset. It then waits for the GPU to assert GPU_RST_DONE. If GPU_RST_DONE is not asserted within the allotted time, the SMC retries seqUnReset 4 more times. If GPU_RST_DONE is still not asserted, the SMC enters EC_BOOT and the 0020 code is displayed on the front panel.
0021
Hex | Name | Description | Type | Repair Guide |
---|---|---|---|---|
0x09 | ERROR_NO_PCIE | PCIe link did not enter L0 after seqUnReset time passed | EC_BOOT | 0021 |
After receiving GPU_RST_DONE, the SMC monitors the PCIe L0 status and waits for the link to enter the L0 state. If the link does not enter L0 within the allotted time, the SMC retries seqUnReset 4 more times. If the link still does not enter L0, the SMC enters EC_BOOT and the 0021 code is displayed on the front panel.
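The two boot checks behind 0020 and 0021 can be sketched as one retry loop: one initial seqUnReset attempt plus 4 retries. This is a rough model under stated assumptions, not the SMC's actual logic. The three callables are hypothetical stand-ins for the hardware steps, and both the shared retry budget across the two checks and reporting the failure from the last attempt are assumptions that the text does not confirm.

```python
RETRIES = 4  # "4 more times" after the first attempt


def boot_gpu(run_seq_unreset, wait_rst_done, wait_pcie_l0):
    """Return None on success, or (hex, name) of the check that failed last.

    All three arguments are hypothetical callables:
      run_seq_unreset() releases the GPU from reset,
      wait_rst_done()   returns True if GPU_RST_DONE was asserted in time,
      wait_pcie_l0()    returns True if the PCIe link reached L0 in time.
    Assumption: both checks share a single retry budget.
    """
    failure = None
    for _ in range(1 + RETRIES):
        run_seq_unreset()
        if not wait_rst_done():
            failure = (0x08, "ERROR_GPU_RST_DONE")
            continue  # retry seqUnReset
        if not wait_pcie_l0():
            failure = (0x09, "ERROR_NO_PCIE")
            continue  # retry seqUnReset
        return None  # both checks passed
    return failure  # the SMC would enter EC_BOOT at this point
```

With a GPU that never asserts GPU_RST_DONE, the sketch runs seqUnReset five times in total before returning the 0x08 entry.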