Secondary Error Codes

{{See also|Error Codes}}
__NOTOC__
{{Navbox
|name        = Error Database
|title      = [[Error Database]]
|listclass  = hlist
|state      = uncollapsed
|group1    = [[#SMC Errors|SMC Errors]]
|list1      = [[#0001|0001]]{{dot}}[[#0002|0002]]{{dot}}[[#0003|0003]]{{dot}}[[#0010|0010]]{{dot}}[[#0011|0011]]{{dot}}[[#0012|0012]]{{dot}}[[#0013|0013]]{{dot}}[[#0020|0020]]{{dot}}[[#0021|0021]]{{dot}}[[#0022|0022]]
|group2    = [[#XSS Errors|XSS Errors]]
|group3    = [[#UEM Errors|UEM Errors]]
}}
 
== SMC Errors ==
These errors are generated by the [[SMC|System Management Controller]].
 
=== 0001 ===
{|class="wikitable"
! Hex !! Name !! Description !! Type !! Repair Guide
|-
| 0x01 || ERROR_V_12P0 || ANA_V12P0_PWRGD negated unexpectedly || EC_FATAL || [[Repair Guides/0001|0001]]
|}

ANA_V12P0_PWRGD is driven high by the [[ANA]] (later [[HANA]]) as long as the [[Power Rails#V_12P0|V_12P0]] rail is within tolerance. If V_12P0 ever drops out of tolerance, the signal is de-asserted, causing the SMC to enter [[Error Codes#EC_FATAL|EC_FATAL]] and the 0001 code to be displayed on the front panel.
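
The Hex column in the tables of this section appears to be the four-digit front-panel code read as a base-4 number (for example, 0010 corresponds to 0x04 and 0022 to 0x0A). The following Python sketch illustrates that conversion under this assumption. The function names are hypothetical and are not taken from any SMC firmware.

```python
def code_to_value(code: str) -> int:
    """Interpret a four-digit front-panel code as a base-4 number."""
    if len(code) != 4 or any(ch not in "0123" for ch in code):
        raise ValueError(f"not a four-digit base-4 code: {code!r}")
    return int(code, 4)


def value_to_code(value: int) -> str:
    """Render a value in the range 0..255 as a four-digit base-4 code."""
    if not 0 <= value <= 0xFF:
        raise ValueError(f"value out of range: {value}")
    digits = []
    for _ in range(4):
        # Peel off the least significant base-4 digit each time.
        value, digit = divmod(value, 4)
        digits.append(str(digit))
    return "".join(reversed(digits))
```

For instance, <code>code_to_value("0021")</code> gives 9, matching the 0x09 listed for ERROR_NO_PCIE.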
 
=== 0002 ===
{|class="wikitable"
! Hex !! Name !! Description !! Type !! Repair Guide
|-
| 0x02 || ERROR_V_CPUCORE || VREG_CPU_PWRGD negated unexpectedly || EC_FATAL || [[Repair Guides/0002|0002]]
|}
 
VREG_CPU_PWRGD is driven high by the [[CPU]] power controller as long as the [[Power Rails#V_CPUCORE|V_CPUCORE]] rail is within tolerance. If V_CPUCORE ever drops out of tolerance, the signal is de-asserted, causing the SMC to enter [[Error Codes#EC_FATAL|EC_FATAL]] and the 0002 code to be displayed on the front panel.
 
=== 0003 ===
{|class="wikitable"
! Hex !! Name !! Description !! Type !! Repair Guide
|-
| 0x03 || ERROR_V_GPUCORE || VREG_GPU_PWRGD negated unexpectedly || EC_FATAL || [[Repair Guides/0003|0003]]
|}
 
VREG_GPU_PWRGD is driven high by the [[GPU]] power controller as long as the [[Power Rails#V_GPUCORE|V_GPUCORE]] rail is within tolerance. If V_GPUCORE ever drops out of tolerance, the signal is de-asserted, causing the SMC to enter [[Error Codes#EC_FATAL|EC_FATAL]] and the 0003 code to be displayed on the front panel.
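
The three power-good checks (0001 to 0003) follow the same pattern: a signal that should stay high is sampled, and a low reading maps to a fatal error. The sketch below models that pattern in Python. The signal names, error names and codes come from the tables above; the function name, the dictionary layout and the order in which the signals are checked are assumptions made for illustration, not the actual SMC firmware logic.

```python
from typing import Mapping, Optional, Tuple

# Power-good signal -> (error name, front-panel code), as listed in the tables.
PWRGD_ERRORS = {
    "ANA_V12P0_PWRGD": ("ERROR_V_12P0", "0001"),
    "VREG_CPU_PWRGD": ("ERROR_V_CPUCORE", "0002"),
    "VREG_GPU_PWRGD": ("ERROR_V_GPUCORE", "0003"),
}


def check_power_good(signals: Mapping[str, bool]) -> Optional[Tuple[str, str, str]]:
    """Return (type, name, code) for the first de-asserted signal, else None.

    `signals` maps each power-good signal name to its sampled level
    (True = asserted/high). The check order is an assumption.
    """
    for name, (error, code) in PWRGD_ERRORS.items():
        if not signals[name]:
            return ("EC_FATAL", error, code)
    return None
```

With all three signals high the function returns <code>None</code>; if only VREG_CPU_PWRGD reads low it returns <code>("EC_FATAL", "ERROR_V_CPUCORE", "0002")</code>.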
 
=== 0010 ===
{|class="wikitable"
! Hex !! Name !! Description !! Type !! Repair Guide
|-
| 0x04 || ERROR_NO_ANA || [[ANA]]/[[HANA]] is not responding to reads or writes || EC_FATAL || [[Repair Guides/0010|0010]]
|}
 
The [[SMC]] communicates with the [[ANA]]/[[HANA]] via the SMBus. If communication is lost, the SMC enters [[Error Codes#EC_FATAL|EC_FATAL]] and the 0010 code is displayed on the front panel.
 
=== 0011 ===
{|class="wikitable"
! Hex !! Name !! Description !! Type !! Repair Guide
|-
| 0x05 || ERROR_THERMAL_CPU || CPU thermal overload || EC_THERMAL || [[Repair Guides/Thermal Overload|Thermal Overload]]
|}
 
The [[SMC]] monitors the CPU thermal diode as reported by the [[ANA]]/[[HANA]]. If the CPU temperature exceeds the [[Thermal Algorithm#Trip Temperatures|Trip Temperature]] defined in the [[SMC Config]], the SMC enters [[Error Codes#EC_THERMAL|EC_THERMAL]] and the 0011 code is displayed on the front panel.
 
=== 0012 ===
{|class="wikitable"
! Hex !! Name !! Description !! Type !! Repair Guide
|-
| 0x06 || ERROR_THERMAL_GPU || GPU thermal overload || EC_THERMAL || [[Repair Guides/Thermal Overload|Thermal Overload]]
|}
 
The [[SMC]] monitors the GPU thermal diode as reported by the [[ANA]]/[[HANA]]. If the GPU temperature exceeds the [[Thermal Algorithm#Trip Temperatures|Trip Temperature]] defined in the [[SMC Config]], the SMC enters [[Error Codes#EC_THERMAL|EC_THERMAL]] and the 0012 code is displayed on the front panel.
 
=== 0013 ===
{|class="wikitable"
! Hex !! Name !! Description !! Type !! Repair Guide
|-
| 0x07 || ERROR_THERMAL_EDRAM || eDRAM thermal overload || EC_THERMAL || [[Repair Guides/Thermal Overload|Thermal Overload]]
|}
 
The [[SMC]] monitors the eDRAM thermal diode as reported by the [[ANA]]/[[HANA]]. If the eDRAM temperature exceeds the [[Thermal Algorithm#Trip Temperatures|Trip Temperature]] defined in the [[SMC Config]], the SMC enters [[Error Codes#EC_THERMAL|EC_THERMAL]] and the 0013 code is displayed on the front panel.
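
The three thermal checks (0011 to 0013) can be summarised as a comparison of each diode reading against its trip temperature. The Python sketch below shows that comparison. The error names and codes come from the tables above; the function name, the sensor keys and the check order are hypothetical, and the real trip temperatures live in the [[SMC Config]] and are not reproduced here.

```python
from typing import Mapping, Optional, Tuple

# Sensor -> (error name, front-panel code), as listed in the tables.
THERMAL_ERRORS = {
    "CPU": ("ERROR_THERMAL_CPU", "0011"),
    "GPU": ("ERROR_THERMAL_GPU", "0012"),
    "EDRAM": ("ERROR_THERMAL_EDRAM", "0013"),
}


def check_thermal(
    temps: Mapping[str, float], trip_temps: Mapping[str, float]
) -> Optional[Tuple[str, str, str]]:
    """Return (type, name, code) for the first sensor above its trip point.

    Both mappings are keyed by sensor name and use the same unit.
    A reading equal to the trip temperature does not trip, since the
    text says the temperature must exceed it.
    """
    for sensor, (error, code) in THERMAL_ERRORS.items():
        if temps[sensor] > trip_temps[sensor]:
            return ("EC_THERMAL", error, code)
    return None
```

The caller supplies both the readings and the trip points, so the sketch makes no claim about the values a particular console revision uses.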
 
=== 0020 ===
{|class="wikitable"
! Hex !! Name !! Description !! Type !! Repair Guide
|-
| 0x08 || ERROR_GPU_RST_DONE || GPU_RST_DONE signal not asserted after seqUnReset time passed || EC_BOOT || [[Repair Guides/0020|0020]]
|}
 
After [[GPU]] power and clocking are available, the [[SMC]] starts [[SMC#seqUnReset|seqUnReset]], which releases the GPU from reset. It then waits for the GPU to assert GPU_RST_DONE. If GPU_RST_DONE is not asserted in the time allotted, the SMC retries seqUnReset 4 more times. If GPU_RST_DONE is still not asserted, the SMC enters [[Error Codes#EC_BOOT|EC_BOOT]] and the 0020 code is displayed on the front panel.
 
=== 0021 ===
{|class="wikitable"
! Hex !! Name !! Description !! Type !! Repair Guide
|-
| 0x09 || ERROR_NO_PCIE || PCIe link did not enter L0 after seqUnReset time passed || EC_BOOT || [[Repair Guides/0021|0021]]
|}
 
After receiving GPU_RST_DONE during [[SMC#seqUnReset|seqUnReset]], the [[SMC]] monitors the [[PCIe]] L0 status and waits for the link to enter the L0 state. If the link does not enter the L0 state in the time allotted, the SMC will retry seqUnReset 4 more times. If the link still does not enter L0 state, the SMC enters [[Error Codes#EC_BOOT|EC_BOOT]] and the 0021 code is displayed on the front panel.
 
=== 0022 ===
{|class="wikitable"
! Hex !! Name !! Description !! Type !! Repair Guide
|-
| 0x0A || ERROR_NO_HANDSHAKE || CPU did not send GetPowerUpCause to SMC || EC_BOOT || [[Repair Guides/0022|0022]]
|}
 
After the [[PCIe]] link has entered the L0 state during [[SMC#seqUnReset|seqUnReset]], the SMC releases the [[CPU]] from reset. The CPU runs the [[Bootloaders]] and starts the [[XSS]]. When the XSS starts, it attempts to retrieve the power-up cause from the SMC. If the SMC does not receive GetPowerUpCause in the time allotted, the SMC retries seqUnReset 4 more times. If GetPowerUpCause is still not received, the SMC enters [[Error Codes#EC_BOOT|EC_BOOT]] and the 0022 code is displayed on the front panel.
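
The three boot checks (0020 to 0022) describe successive stages of seqUnReset, each with a timeout and up to 4 retries. The Python sketch below models one possible reading of that behaviour. It assumes that a retry restarts the whole sequence from the GPU reset release, that the retry budget is shared across stages, and that the reported code is the one for the stage that failed on the final attempt; none of these details is confirmed by the text. The function name, stage keys and callback signature are hypothetical.

```python
from typing import Callable, Optional, Tuple

MAX_ATTEMPTS = 5  # the initial attempt plus 4 retries

# Stage key -> error reported if the stage times out, in sequence order.
BOOT_STAGES = (
    ("gpu_rst_done", "ERROR_GPU_RST_DONE", "0020"),
    ("pcie_l0", "ERROR_NO_PCIE", "0021"),
    ("get_power_up_cause", "ERROR_NO_HANDSHAKE", "0022"),
)


def run_boot(
    attempt_stage: Callable[[str, int], bool]
) -> Optional[Tuple[str, str, str]]:
    """Run the staged boot sequence with retries.

    `attempt_stage(stage, attempt)` returns True if the stage completed
    within its allotted time. Returns None on a successful boot, or
    (type, name, code) for the stage that failed on the last attempt.
    """
    last_failure = None
    for attempt in range(MAX_ATTEMPTS):
        for stage, error, code in BOOT_STAGES:
            if not attempt_stage(stage, attempt):
                last_failure = ("EC_BOOT", error, code)
                break  # abandon this attempt and retry from the start
        else:
            return None  # every stage completed in time
    return last_failure
```

A callback that always fails the <code>pcie_l0</code> stage yields <code>("EC_BOOT", "ERROR_NO_PCIE", "0021")</code> after five attempts, while a callback that fails only on the first two attempts lets the third attempt succeed.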
 
{{Debug and Repair}}

Latest revision as of 19:53, 5 May 2023