[PSA] Default settings on motherboards set RAM to low frequency & low voltage on VDDCR_SOC & other RAM settings may cause a false positive segfault!

suaefar commented on June 25, 2024
[PSA] Default settings on motherboards set RAM to low frequency & low voltage on VDDCR_SOC & other RAM settings may cause a false positive segfault!

from ryzen-test.

Comments (5)

suaefar commented on June 25, 2024

AMD asked me, and probably other clients, during the RMA to specifically try higher SOC voltages. It still crashed. So if you don't check it yourself in advance, AMD will probably ask you to rule out memory issues.

The test is sensitive to failures in different subsystems, including memory corruption.
Generally, I would propose to first get a stable system [1] and then look for the bug in your Ryzen processor.
However, raising the SOC voltage goes along with increased power consumption (probably beyond 65 W in the case of the R7 1700), which is a workaround and not a solution.
Running the memory outside the spec (i.e., overclocking the interconnect) to get a stable system does not seem like a good solution to me.

In my opinion, a system must be stable with the default settings, without any tuning or tweaking.
If it is not, there is a problem and the manufacturers should help you troubleshoot, be it a processor bug, defective memory, or wrong BIOS settings.

Feel free to propose an addition to the readme and send a pull request.

[1] https://wiki.archlinux.org/index.php/Stress_Test

protox commented on June 25, 2024

For AMD there seems to be a major disconnect between motherboard and RAM manufacturers. All I am saying is that running on the lowest possible specification perhaps isn't a good idea for this test, because it is extremely extensive.

I'm not an expert in this field, so I'll leave it for someone else to make pull requests and decide if such a suggestion is a good idea or not.

At the same time, it seems like this test is currently the primary way people are determining whether they have a faulty CPU. Could it be that a tiny bump to VDDCR_SOC is enough to make a post-week-25 system stable, avoiding both this issue and an RMA?

suaefar commented on June 25, 2024

You are completely right, and objectively it seems to be a good suggestion.
But subjectively, I have run out of patience.

I am utterly frustrated by AMD's communication on this issue and I am not willing to help them any further:

  • AMD should acknowledge (or deny) the problem, i.e., that many of their top-tier parts are defective.
  • AMD should recall all affected parts, if not from users then at least from retailers! WTF, they are still knowingly selling these...
  • AMD should provide a tool to test if your processor is affected (I only share the code that I used)

If they still(!) don't care, I don't mind unsettled users bugging them.
It's up to AMD to say something, to do something, to provide a tool!

protox commented on June 25, 2024

I completely agree with you, AMD needs to do FAR more and at least come out and tell us what the issue is and which CPUs are affected. People are jumping through hoops, paying for shipping in certain countries and getting long delays to get their RMA at the moment.

(unnamed user) commented on June 25, 2024

If anyone is looking for a solution that does not require an RMA, there seems to be one (I did not test it on many CPUs, just on my 1800X from week 22): I disabled the CPU micro-op cache (called uOpCache, Op Cache, ...). On ASUS X370 boards, for example, this requires a modded BIOS that enables the AMD CBS menu; some other boards may allow it on the stock BIOS.

After disabling uOpCache, the tests (both this one and the Windows 'bzip2 compiler' killer) stopped crashing and ran for hours without any crash until terminated manually. The performance loss is negligible (around 3% on 7-Zip and compilation times; WinRAR seemingly even benefits slightly, by about 1%, in multi-threaded mode). Given that some people have noticed slight performance drops on 'fixed' Ryzens, these probably just come with uOpCache internally disabled or limited.
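For anyone who wants to repeat this kind of before/after comparison with their own workload, here is a rough sketch of the idea: run a command over and over and stop at the first run that does not exit cleanly, reporting whether it was killed by a signal such as SIGSEGV. This is not the ryzen-test script itself. The function names (`describe_exit`, `run_until_failure`) and the placeholder workload are made up for illustration, and the signal handling assumes a POSIX system such as Linux.

```python
import signal
import subprocess
import sys


def describe_exit(returncode: int) -> str:
    """Classify a subprocess return code (POSIX semantics)."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # subprocess reports death by signal as the negated signal number.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = f"signal {-returncode}"
        return f"killed by {name}"
    return f"exit status {returncode}"


def run_until_failure(cmd: list[str], max_runs: int) -> tuple[int, str]:
    """Run cmd up to max_runs times; stop at the first non-clean exit."""
    for run in range(1, max_runs + 1):
        result = subprocess.run(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        status = describe_exit(result.returncode)
        if status != "ok":
            return run, status
    return max_runs, "ok"


if __name__ == "__main__":
    # Placeholder workload (an assumption): swap in a real build or
    # compression job to put meaningful load on the system.
    workload = [sys.executable, "-c", "print(sum(range(10**6)))"]
    runs, status = run_until_failure(workload, max_runs=5)
    print(f"{runs} run(s), last status: {status}")
```

A single process like this puts far less load on the machine than a parallel compile loop, so a clean result here says little; the point is only to show how a segfault can be told apart from an ordinary non-zero exit.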
