Coder Social home page Coder Social logo

Green Bars Issue about linux-on-pixel-c HOT 51 CLOSED

denysvitali avatar denysvitali commented on August 30, 2024
Green Bars Issue

from linux-on-pixel-c.

Comments (51)

thierryreding avatar thierryreding commented on August 30, 2024 3

Yeah, it's not terribly pretty, but we need the global for the early code anyway, so might as well reuse it here. The tegra_powergate_sequence_power_up() is really a legacy API that we should remove eventually, but we haven't fixed up all the drivers yet. I'm sure we'll get there some day, but in the meantime I think this is a good workaround.

Now it's up to Peter for finalizing the MBIST series, then we can merge it all into linux-next and get broader testing. Hopefully that will help you guys make progress on this interesting project.

from linux-on-pixel-c.

cyndis avatar cyndis commented on August 30, 2024 2

This could be related to missing DMA queue flushes on powergating, looking into that

from linux-on-pixel-c.

thierryreding avatar thierryreding commented on August 30, 2024 1

@vartom I suppose this could've been caused by the power domain support patches. Patch Nr. 1 that you've linked to seems to intend to reset the device prior to using it. Unfortunately that no longer works since commit 64230aa075864f7c8e270b583bca685304246b57 in mainline. That change is supposedly the right thing to do, but I could imagine that it might break things.

One thing you could try is take a recent mainline (or linux-next) and edit arch/arm64/boot/dts/nvidia/tegra210.dtsi by looking for the two dsi@... nodes and commenting out or removing the power-domains properties and at the same time removing the DSI resets from the SOR power domain (look for the label "pd_sor:" and remove the TEGRA210_CLK_DSI{A,B} entries from the resets property).

The effect of that should be that the code conditionalized on !pdev->dev.pm_domain in the DSI host driver will be enabled, meaning that the driver will take manual control of the reset (like it used to before the generic PM domain support patch).

You should be able to reproduce the effect of the ChromeOS patch by adding a reset_control_assert(dsi->rst) after the devm_reset_control_get() in tegra_dsi_probe().

This is all somewhat complicated, so if you're not in a hurry to test this out, I'll try to whip up a proper patch for you to test when I'm at my workstation tomorrow.

from linux-on-pixel-c.

marcan avatar marcan commented on August 30, 2024 1

This fixed it for me. It seems the powergate logic does not reset DC properly, so it comes up in a broken state.

--- a/drivers/gpu/drm/tegra/dc.c
+++ b/drivers/gpu/drm/tegra/dc.c
@@ -2011,7 +2011,7 @@ static const struct tegra_dc_soc_info tegra210_dc_soc_info = {
        .supports_block_linear = true,
        .supports_blending = true,
        .pitch_align = 64,
-       .has_powergate = true,
+       .has_powergate = false,

I wonder if the issue has something to do with the fact that the DC clock is dynamic and derived from the mode. Maybe it's not set up when it is powered up, and it needs a clock to reset properly? Or maybe it's a timing issue and the powergate code does not give DC enough time to reset, or something similar.

from linux-on-pixel-c.

vartom avatar vartom commented on August 30, 2024

Bars are obtained due to disruption to the DSI mode even-odd split.
a little about the differences you can see here

But this mode should not work if you believe this message
https://github.com/denysvitali/linux-smaug/blob/4.14-rc6/drivers/gpu/drm/panel/panel-sharp-lq101r1sx01.c#L227

Edit by @denysvitali: Fixed links

from linux-on-pixel-c.

denysvitali avatar denysvitali commented on August 30, 2024

Yeah, you probably are right. This may explain the odd behaviour that can be sometimes happen when using our kernel: the left half of the screen goes black, whilst the right side stays "on" with the content displayed correctly.
screenshot from 2017-12-11 08-34-46

I still don't understand what are the circumstances that cause the issue though

Edit: Btw, we're using the JDI LPM102A188A panel

from linux-on-pixel-c.

Samt43 avatar Samt43 commented on August 30, 2024

@vartom I have exactly the same problem on my device,
These green bars are always there and i couldn't find any way to get rid of them until now unfortunatly.
I think that yes, the problem come from DSI port configuration but i still can't understand why it works correctly with google stock kernel.
"To support a different mode a mechanism needs to be
put in place to communicate the configuration back to the DSI host
controller."
No idea how google did it, but they should have done it somehow

from linux-on-pixel-c.

denysvitali avatar denysvitali commented on August 30, 2024

Apparently w/ linux-4.8.y (refer to this comment for more information on how to run it) the bars don't appear. There are still some (minor) glitches that disappear after the screen is clicked.

from linux-on-pixel-c.

marcan avatar marcan commented on August 30, 2024

I don't think this has anything to do with DSI. I think this is an issue with the display engine fetching pixels from the framebuffer. Maybe @Gnurou could shed some light on what might cause this?

from linux-on-pixel-c.

vartom avatar vartom commented on August 30, 2024

I analyzed the code in kernel 3.18 and the one that was applied in 4.8 and came to the conclusion that the problem can be hidden only in power management. denysvitali/linux-smaug@87904c3
I also found that denysvitali/linux-smaug@f08c32f this is a message that in the case of our tablet is simply necessary.

from linux-on-pixel-c.

marcan avatar marcan commented on August 30, 2024

I'm seeing the green bars issue on another device and I'm 99% certain it has nothing whatsoever to do with DSI. I know this because the green bars do not affect the hardware cursor, and because they scale together with the framebuffer. If it were a DSI issue it would affect the entire screen regardless of where the pixels come from. It's possible we're looking at different problems, though, but the end result certainly looks similar...

from linux-on-pixel-c.

denysvitali avatar denysvitali commented on August 30, 2024

@marcan Is the device you're talking about Tegra (or just NVIDIA) based?
May I know what device is it, and which kernel version works / displays the bars?

from linux-on-pixel-c.

marcan avatar marcan commented on August 30, 2024

It's Tegra210, same as the Pixel C. I'm not doing much development, just testing, but I'll try to drag someone else who is working on it here. The kernel is mainline git (somewhere circa 4.15-rc9).

from linux-on-pixel-c.

dhewg avatar dhewg commented on August 30, 2024

I'm seeing similar effects on this cutie: https://imgur.com/a/ZZFDi
This is on mainline 5132ede0, somewhere between v4.15-rc9 and v4.15

While a reboot may work around that, this script does too (on this device there's a ~50% chance it'll work afterwards):

cat panel-reinit.sh
#!/bin/sh
echo 1 >> /sys/devices/platform/50000000.host1x/drm/graphics/fb0/blank
echo 0 >> /sys/devices/platform/50000000.host1x/drm/graphics/fb0/blank

from linux-on-pixel-c.

denysvitali avatar denysvitali commented on August 30, 2024

@dhewg
Yeah, I tried that too (I didn't documented it though), and I can confirm ot works 25% of the time for me.
Restarting lightdm (not sure why just lightdm) seems to fix the problem sometimes, and sometimes it triggers it

from linux-on-pixel-c.

denysvitali avatar denysvitali commented on August 30, 2024

@marcan Have you tried with an older kernel? 4.9.y seems to work fine on the Pixel C. I'm not sure at the moment if it's caused by the Nouveau driver.
What panel are you using? The Pixel C uses a JDI LPM102A188A

from linux-on-pixel-c.

dhewg avatar dhewg commented on August 30, 2024

We didn't yet try an older kernel.
And this is a JDI LPM062M326A panel.

from linux-on-pixel-c.

vartom avatar vartom commented on August 30, 2024

@dhewg The JDI LPM062M326A panel you have connected to two DSI channels? those running in ganged mode? Can you show the driver source for this panel?

from linux-on-pixel-c.

Gnurou avatar Gnurou commented on August 30, 2024

I have seen that issue, but don't know what the cause is. @thierryreding may have some insight.

from linux-on-pixel-c.

dhewg avatar dhewg commented on August 30, 2024

It's not running in ganged mode, but I don't think it matters.
Try this:

cat itsnotdsi.sh
#!/bin/sh
busybox devmem 0x54200100 32 5
busybox devmem 0x542013b0 32 0
busybox devmem 0x542010f8 32 0x3000000
busybox devmem 0x54201100 32 0x1000100
busybox devmem 0x54201008 32 0x20010000
busybox devmem 0x542013c4 32 0x1000000

This will set up a 256x256 black cursor at 256,256 on the screen. You'll get a black rectangle that punches right through all the green bars.
There'd be no punching through if this was a DSI issue.

from linux-on-pixel-c.

vartom avatar vartom commented on August 30, 2024

Why are you so sure that not DSI issue? How can you explain that after removing the PM, issue disappeared?
P.s. In the kernel 4.14-rc6, the described rollback of the PM did not lead to a positive result.

from linux-on-pixel-c.

marcan avatar marcan commented on August 30, 2024

A DSI issue would affect the entire screen. The panel doesn't know anything about the cursor. If the cursor is not affected by the green bars, it's guaranteed not to be a panel/DSI issue. If you try that shell script and you get a black square with no green bars, your issue isn't a DSI issue either. If you do get green bars through the cursor, then the problem you're seeing and the one we're seeing are different problems (that just look similar).

If the problem has to do with timing, power, clocking, bad internal state, or really any other "messy" nondeterministic origin (which it probably does, since the bars sometimes appear and sometimes don't), then a change in unrelated code could cause timing differences that change how often it happens, without having directly changed relevant code.

from linux-on-pixel-c.

vartom avatar vartom commented on August 30, 2024

In my case, if i use 4.14-rc6 kernel, I always see only the bars as on the first photo in the first post. They are green or later become blue. Only at 4.8/4.9 with my changes I saw a normal picture on the screen.
The script will try to apply later.

from linux-on-pixel-c.

Samt43 avatar Samt43 commented on August 30, 2024

I have always greens bars on my pixel c with recent kernels.
I could do some test some time ago on my pixel C and it seems that display is not setup in even-odd split. I can affirm this by this : if you modify mipi_dsi_dcs_set_page_address by changing the pixel column address for one of the DSI Lane, then only the right of the left part of the screen is modified.

https://github.com/denysvitali/linux-smaug/blob/4.15-rc8/drivers/gpu/drm/panel/panel-jdi-lpm102a188a.c

ret = mipi_dsi_dcs_set_page_address(jdi->dsi, 0,
			jdi->mode->vdisplay - 1);
if (ret < 0)
	DRM_ERROR("failed to set page address: %d\n", ret);

https://imgur.com/a/SGDyX

ret = mipi_dsi_dcs_set_page_address(jdi->slave, 0,
			jdi->mode->vdisplay - 1);
if (ret < 0)
	DRM_ERROR("failed to set page address: %d\n", ret);

https://imgur.com/a/xNTAz

So maybe the issue is not related to even odd mode, or maybe DSI is not in this mode while display is in this mode ?

from linux-on-pixel-c.

Samt43 avatar Samt43 commented on August 30, 2024

The noise observed on the screen when I did the column modification didn't contains any vertical line... I'm wondering if these lines are not really written on the framebuffer. So strange.

from linux-on-pixel-c.

marcan avatar marcan commented on August 30, 2024

The lines are not in the host framebuffer memory (you can confirm that by dumping /dev/fb0) and they are not introduced by the DSI block or the panel (you can confirm that with the cursor trick mentioned earlier). If you modify the DC window A to not cover the full display the non-covered portion is also free of lines. That means the problem is introduced somewhere in the pixel fetch pipeline. As far as I know that's roughly EMCโ†’MCโ†’SMMUโ†’DC(window A).

from linux-on-pixel-c.

vartom avatar vartom commented on August 30, 2024

del.

from linux-on-pixel-c.

marcan avatar marcan commented on August 30, 2024

I don't think the problem is a simple performance issue (memory subsystem not meeting streaming requirements of the display). The X1 TRM says DC handles underflow by repeating the last valid pixel. That's not what we're seeing here. Enabling underflow debug mode (should show a solid pixel) doesn't change the output. The underflow counters (DC_A_WINBUF_AD_UFLOW_STATUS_0) are zero.

from linux-on-pixel-c.

Samt43 avatar Samt43 commented on August 30, 2024

I could try the smaug kernel forked by denysvitali, and no green bars anymore !
lightdm and weston are now working without bars, only graphical acceleration seems to be missing due to :
MESA-LOADER: failed to retrieve device information
gbm: failed to open any driver (search paths /usr/lib/dri)
gbm: Last dlopen error: /usr/lib/dri/tegra_dri.so: cannot open shared object file: No such file or directory
https://ded1.denv.it/boot-linux-4.8.y-d4ed7e5e8657_20180207_122924.img.unsigned
Thanks !

from linux-on-pixel-c.

vartom avatar vartom commented on August 30, 2024

@Samt43 4.8 this is the kernel with the fix I found.

from linux-on-pixel-c.

Samt43 avatar Samt43 commented on August 30, 2024

@vartom This kernel works like a charm for me :)
So the fix doesn't work with more recent versions if i understand ?
How did you came to the conclusion 4.8 could work with this ?
Thanks again, i'm now investigating why gbm doesn't load the tegra driver correctly and load swrast instead.

from linux-on-pixel-c.

vartom avatar vartom commented on August 30, 2024

I tried similarly to roll back the kernel 4.14. It did not help. the kernel did not start. Maybe I'm just somewhere wrong and not all rolled. A similar patch helped run 4.9, I did not try other versions to run.
I analyzed the native 3.18 kernel and compared it to the mainline kernel. I checked two very early versions of 3.18 to test how the screen works with the minimum number of patches made by Google (chrome OS team). After that I found out that only one patch โ„–1( denysvitali/linux-smaug@f08c32f ) is needed for the correct operation of the screen. After that, I first tried to apply patch โ„–1 in the maneline kernel. As it turned out, new versions of the kernel contain changes in this portion of the code. Attempts to adapt the patch โ„–1 were unsuccessful. It was after this that I decided to roll back some of the kernel for the ability to apply that patch โ„–1. As it turned out, it immediately gave normal operation of the screen. While I was analyzing the cores, I managed to rewrite the panel driver, took as a basis panel-sharp-lq101r1sx01 from the mainline.

from linux-on-pixel-c.

marcan avatar marcan commented on August 30, 2024

@vartom Did you ever get a chance to run that cursor test? It would be good to know if your problem has the same behavior as the one we're seeing.

@thierryreding do you think issues with the DSI block could affect upstream logic in DC, or perhaps we're seeing two different problems with a similar manifestation here?

from linux-on-pixel-c.

vartom avatar vartom commented on August 30, 2024

@marcan I can not do this cursor test. Since when I see the strip on the screen the system does not boot. Only the kernel is started. Accordingly, the script is not executed.

from linux-on-pixel-c.

delroth avatar delroth commented on August 30, 2024

Confirmed that disabling powergating for DC seems to fix the issue. I could observe some pseudo-randomness in the DC reg state across reinits before, that has now disappeared.

Increasing the delays in the powergate sequence code to the values dc.c uses doesn't seem to help.

from linux-on-pixel-c.

vartom avatar vartom commented on August 30, 2024

@delroth Did you check this on a pixel ะก? on which version of the kernel?

from linux-on-pixel-c.

cyndis avatar cyndis commented on August 30, 2024

I have had similar coloured stripes on a Jetson TX1 on HDMI output - disabling powergating seems to fix this for me as well.

from linux-on-pixel-c.

vartom avatar vartom commented on August 30, 2024

@thierryreding Works great. Thanks for describing the solution. The variant you proposed worked together with the addition of has_powergate = false. Smaug. Kernel 4.15.2 mainline.

from linux-on-pixel-c.

thierryreding avatar thierryreding commented on August 30, 2024

@vartom @marcan Thanks for that hint. Disabling power gating unconditionally is likely going to break for devices where the bootloader did not power up the display partitions. However, the fact that it works with has_powergate = false makes me think, as @cyndis suggested, that we're running into one of those strange issues where the memory access queueing breaks. There's a set of patches that attempt to fix that here: https://github.com/thierryreding/linux/commits/for-4.17/work

But initial tests (by @cyndis) indicate that that doesn't fix the problem, at least for HDMI. Unfortunately I can't reproduce these bars on any of the devices that I have, so I can't test that these actually fix anything. At least they don't break on any of my setups so far.

Only the top 6 patches are relevant, but the tree is otherwise heavily modified, so they may not apply cleanly on top of v4.15.2. @cyndis was able to cherry-pick them to a recent linux-next, though. It's unlikely that this will fix DSI if it doesn't fix HDMI, but in case you want to give this a try anyway, that's where you'll find the code.

In the meantime, I'll keep investigating what else this could be.

from linux-on-pixel-c.

thierryreding avatar thierryreding commented on August 30, 2024

Oh, perhaps one interesting test would be to force a modeset and see if things are fixed after that. Maybe repeat a couple of times to see if things break again after subsequent modesets. That would give us a clue about whether the issue is limited to driver probe time or more generally to suspend/resume.

You should be able to force a modeset using the instructions from @dhewg's comment earlier:

echo 1 >> /sys/devices/platform/50000000.host1x/drm/graphics/fb0/blank
echo 0 >> /sys/devices/platform/50000000.host1x/drm/graphics/fb0/blank

which should be equivalent to the slightly shorter:

echo 4 > /sys/class/graphics/fb0/blank
echo 0 > /sys/class/graphics/fb0/blank

from linux-on-pixel-c.

marcan avatar marcan commented on August 30, 2024

@thierryreding your patches don't help... but I found something else that does. This patchset.

https://patchwork.ozlabs.org/project/linux-tegra/list/?series=25347

Unfortunately, it's broken on the legacy API path. tegra_powergate_power_up segfaults because pg->pmc is uninitialized in the path coming from tegra_powergate_sequence_power_up which uses a fake struct tegra_powergate. And of course, checking for NULL elides the workaround and doesn't fix the problem. However, hacking it to always do it does:

--- a/drivers/soc/tegra/pmc.c
+++ b/drivers/soc/tegra/pmc.c
@@ -432,7 +432,7 @@ static int tegra_powergate_power_up(struct tegra_powergate *pg,
 
        usleep_range(10, 20);
 
-       if (pg->pmc->soc->needs_mbist_war)
+//     if (pg->pmc->soc->needs_mbist_war)
                err = tegra210_clk_handle_mbist_war(pg->id);
        if (err)
                goto disable_clks;

from linux-on-pixel-c.

thierryreding avatar thierryreding commented on August 30, 2024

Excellent! I was just looking at that as well and have a proper fix for the segfault that your seeing. I'll upload a new tree in a second.

from linux-on-pixel-c.

thierryreding avatar thierryreding commented on August 30, 2024

Alright, here's a tree which should include everything that @marcan used to fix this:

https://github.com/thierryreding/linux/commits/for-4.17/work

Nevermind the top patch in that tree, it's not related to Tegra210.

@vartom, @delroth: can you confirm that the MBIST workaround + the PMC segfault make the green bars go away for you? Also, is this reliable in that the green bars don't come back on modesets?

from linux-on-pixel-c.

marcan avatar marcan commented on August 30, 2024

Works for me with thierryreding/linux@13eea63a70 on top of my existing tree which already had the MBIST workaround series merged in. (I hadn't realized pmc was a global available in that file, that's obviously the right solution, heh). Still works after repeated modesets.

from linux-on-pixel-c.

vartom avatar vartom commented on August 30, 2024

@thierryreding Yes, and now it works for me (MBIST).

from linux-on-pixel-c.

thierryreding avatar thierryreding commented on August 30, 2024

@marcan, @vartom: would you mind providing an official Tested-by: on the mailing list for Peter's MBIST patch series?

from linux-on-pixel-c.

vartom avatar vartom commented on August 30, 2024

I do not mind. I'll try what I can.

from linux-on-pixel-c.

thierryreding avatar thierryreding commented on August 30, 2024

Just in case you don't know how, you'd just reply to either all of Peter's patches with a line saying:

Tested-by: Your Name <your@email>

Or reply to a single patch (usually patch 0/X if there is one, or patch 1/X otherwise) with something like:

For the series:
Tested-by: Your Name <your@email>

That way I can easily pick that up when I apply to the Tegra tree and you get the credits that you deserve for testing this.

from linux-on-pixel-c.

marcan avatar marcan commented on August 30, 2024

@thierryreding Done (hope I got the In-Reply-To right, I wasn't previously subscribed so I had to snoop around archives).

from linux-on-pixel-c.

thierryreding avatar thierryreding commented on August 30, 2024

@marcan: Looks good. Thanks a lot!

from linux-on-pixel-c.

denysvitali avatar denysvitali commented on August 30, 2024

I'm closing the issue since apparently you guys fixed it ๐Ÿš€ !
Thanks for the help!

If someone else is experiencing the same problem, just leave a comment and I'll reopen the issue

from linux-on-pixel-c.

Related Issues (11)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.