Secondary Error Codes: Difference between revisions

Revision as of 17:36, 5 January 2023

SMC Errors

These errors are generated by the System Management Controller.

0001

Hex	Name	Description	Type	Repair Guide
0x01	ERROR_V_12P0	ANA_V12P0_PWRGD negated unexpectedly	EC_FATAL	0001

ANA_V12P0_PWRGD is driven high by the ANA (later HANA) as long as the V_12P0 rail is within tolerance. If V_12P0 ever drops out of tolerance, the signal is de-asserted, causing the SMC to enter EC_FATAL and the 0001 code to be displayed on the front panel.

0002

Hex	Name	Description	Type	Repair Guide
0x02	ERROR_V_CPUCORE	VREG_CPU_PWRGD negated unexpectedly	EC_FATAL	0002

VREG_CPU_PWRGD is driven high by the CPU power controller as long as the V_CPUCORE rail is within tolerance. If V_CPUCORE ever drops out of tolerance, the signal is de-asserted, causing the SMC to enter EC_FATAL and the 0002 code to be displayed on the front panel.

0003

Hex	Name	Description	Type	Repair Guide
0x03	ERROR_V_GPUCORE	VREG_GPU_PWRGD negated unexpectedly	EC_FATAL	0003

VREG_GPU_PWRGD is driven high by the GPU power controller as long as the V_GPUCORE rail is within tolerance. If V_GPUCORE ever drops out of tolerance, the signal is de-asserted, causing the SMC to enter EC_FATAL and the 0003 code to be displayed on the front panel.
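
The three power-good checks described above (0001 to 0003) follow the same pattern, which can be sketched as a lookup from signal to error. This is only an illustration in Python, not SMC firmware: the function `check_power_good`, the `signals` dictionary, and the order in which signals are checked are assumptions; only the signal names, error names, and hex values come from the tables above.

```python
from enum import Enum


class SmcError(Enum):
    # Names and hex values as listed in the tables above.
    ERROR_V_12P0 = 0x01
    ERROR_V_CPUCORE = 0x02
    ERROR_V_GPUCORE = 0x03


# Power-good signal -> error reported when that signal is negated.
PWRGD_ERRORS = {
    "ANA_V12P0_PWRGD": SmcError.ERROR_V_12P0,
    "VREG_CPU_PWRGD": SmcError.ERROR_V_CPUCORE,
    "VREG_GPU_PWRGD": SmcError.ERROR_V_GPUCORE,
}


def check_power_good(signals):
    """Return the error for the first negated power-good signal, else None.

    `signals` maps each signal name to its sampled level (True = high).
    The checking order is an assumption; it simply follows the table order.
    """
    for name, error in PWRGD_ERRORS.items():
        if not signals[name]:
            return error
    return None


# Example: the GPU core rail drops out of tolerance.
levels = {
    "ANA_V12P0_PWRGD": True,
    "VREG_CPU_PWRGD": True,
    "VREG_GPU_PWRGD": False,
}
error = check_power_good(levels)
```

In this sketch a returned error would correspond to the SMC entering EC_FATAL; a return value of `None` means all three rails are reported within tolerance.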

0010

Hex	Name	Description	Type	Repair Guide
0x04	ERROR_NO_ANA	ANA/HANA is not responding to reads or writes	EC_FATAL	0010

The SMC communicates with the ANA/HANA via the SMBus. If communication is lost, the SMC enters EC_FATAL and the 0010 code is displayed on the front panel.
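
Note that the hex value 0x04 is listed against the front-panel code 0010. Every hex/code pair in the tables on this page is consistent with the displayed code being the hex value written as four base-4 digits. That relationship is inferred from the listed pairs rather than stated in the text, and the helper name `displayed_code` is hypothetical. A minimal sketch of the conversion:

```python
def displayed_code(hex_value):
    """Format an error number as four base-4 digits (assumed encoding)."""
    if not 0 <= hex_value <= 0xFF:
        raise ValueError("value does not fit in four base-4 digits")
    digits = []
    for _ in range(4):
        digits.append(str(hex_value % 4))  # least significant digit first
        hex_value //= 4
    return "".join(reversed(digits))


# Pairs taken from the tables on this page.
pairs = {0x01: "0001", 0x02: "0002", 0x03: "0003", 0x04: "0010"}
matches = all(displayed_code(value) == code for value, code in pairs.items())
```

Here `matches` is `True` for the pairs shown, which is the only evidence this sketch relies on.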

0011

Hex	Name	Description	Type	Repair Guide
0x05	ERROR_THERMAL_CPU	CPU thermal overload	EC_THERMAL	Thermal Overload

The SMC monitors the CPU thermal diode as reported by the ANA/HANA. If the CPU temperature exceeds the Trip Temperature defined in the SMC Config, the SMC enters EC_THERMAL and the 0011 code is displayed on the front panel.

0012

Hex	Name	Description	Type	Repair Guide
0x06	ERROR_THERMAL_GPU	GPU thermal overload	EC_THERMAL	Thermal Overload

The SMC monitors the GPU thermal diode as reported by the ANA/HANA. If the GPU temperature exceeds the Trip Temperature defined in the SMC Config, the SMC enters EC_THERMAL and the 0012 code is displayed on the front panel.

0013

Hex	Name	Description	Type	Repair Guide
0x07	ERROR_THERMAL_EDRAM	eDRAM thermal overload	EC_THERMAL	Thermal Overload

The SMC monitors the eDRAM thermal diode as reported by the ANA/HANA. If the eDRAM temperature exceeds the Trip Temperature defined in the SMC Config, the SMC enters EC_THERMAL and the 0013 code is displayed on the front panel.
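
The three thermal checks (0011 to 0013) share one rule: a reading above the configured Trip Temperature raises the matching error. The sketch below illustrates that comparison only. The function `check_thermal`, the sensor keys, and the checking order are assumptions, and the temperatures in the example are placeholders, not values from any real SMC Config.

```python
# Sensor key (hypothetical) -> (error name, hex value) from the tables above.
THERMAL_ERRORS = {
    "cpu": ("ERROR_THERMAL_CPU", 0x05),
    "gpu": ("ERROR_THERMAL_GPU", 0x06),
    "edram": ("ERROR_THERMAL_EDRAM", 0x07),
}


def check_thermal(readings, trip_temps):
    """Return (name, hex) for the first sensor above its trip point, else None.

    "Exceeds" is taken to mean strictly greater than the trip temperature.
    """
    for sensor, error in THERMAL_ERRORS.items():
        if readings[sensor] > trip_temps[sensor]:
            return error
    return None


# Placeholder numbers for illustration only.
trip_temps = {"cpu": 100, "gpu": 110, "edram": 115}
readings = {"cpu": 85, "gpu": 112, "edram": 90}
result = check_thermal(readings, trip_temps)
```

With these placeholder values only the GPU reading is above its trip point, so `result` names the GPU error; in the flow described above that would correspond to the SMC entering EC_THERMAL.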

0020

Hex	Name	Description	Type	Repair Guide
0x08	ERROR_GPU_RST_DONE	GPU_RST_DONE signal not asserted after seqUnReset time passed	EC_BOOT	0020

On power-up, once GPU power and clocking are available, the SMC starts seqUnReset and releases the GPU from reset. It then waits for the GPU to assert GPU_RST_DONE. If GPU_RST_DONE is not asserted within the time allotted for seqUnReset, the SMC retries seqUnReset up to 4 more times. If GPU_RST_DONE is still not asserted, the SMC enters EC_BOOT and the 0020 code is displayed on the front panel.
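
The retry behaviour can be sketched as a bounded loop: one initial attempt plus 4 retries, for 5 attempts in total. This is an illustration under assumptions, not the SMC's actual code. The function `run_seq_unreset` and the callable `attempt_unreset` are hypothetical, and `attempt_unreset` stands in for one full seqUnReset pass that reports whether GPU_RST_DONE was asserted in time.

```python
MAX_RETRIES = 4  # "4 more times" after the first attempt


def run_seq_unreset(attempt_unreset):
    """Run seqUnReset until GPU_RST_DONE is seen or retries are exhausted.

    Returns (succeeded, attempts_made). `attempt_unreset` is a hypothetical
    callable that performs one attempt and returns True on success.
    """
    total_attempts = MAX_RETRIES + 1
    for attempt in range(1, total_attempts + 1):
        if attempt_unreset():
            return True, attempt
    return False, total_attempts


# Example: a GPU that never asserts GPU_RST_DONE.
calls = []


def never_done():
    calls.append(1)
    return False


outcome = run_seq_unreset(never_done)
```

In this sketch a failed outcome after the fifth attempt corresponds to the SMC entering EC_BOOT with error 0x08.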

