
Comments (25)

Yannnnnnnnnnnn commented on June 1, 2024

@racinmat Thanks for your work on depth post-processing, but it is hard to follow for people who have no idea about rendering.
To simplify: we only have to use the following formula to convert the depth value gathered from the game to real depth in meters.
f = 10003.814, n = 0.15 (default), d_game is the z-buffer value, d_real is the real depth value in meters.
[image: conversion formula]
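A minimal Python sketch of that conversion, assuming the missing image showed the standard linearization of a reversed z-buffer (value 1 at the near plane, 0 at the far plane); the exact convention GTA V uses should be checked against racinmat's thesis linked in the next comment:

    import numpy as np

    NEAR = 0.15       # n, the default near clip in GTA V
    FAR = 10003.814   # f, the far clip value quoted above

    def game_depth_to_meters(d_game, n=NEAR, f=FAR):
        # Linearize a reversed z-buffer value (1 at the near plane, 0 at the far plane)
        # into depth in meters; d_game may be a scalar or a whole depth map as a numpy array.
        d_game = np.asarray(d_game, dtype=np.float64)
        return n * f / (d_game * (f - n) + n)

    # Because f >> n, this is close to n / d_game for everyday distances,
    # e.g. a buffer value of 0.015 comes out as roughly 10 m.
    print(game_depth_to_meters(0.015))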

racinmat commented on June 1, 2024

@wujiyoung depth is in NDC, so you need to recalculate it using the inverse of the projection matrix.
I describe it in my master's thesis, where I inspected the GTA V visualization pipeline: https://dspace.cvut.cz/bitstream/handle/10467/76430/F3-DP-2018-Racinsky-Matej-diplomka.pdf?sequence=-1&isAllowed=y
See part 3.6.3, where I describe the relation between NDC and camera space. Camera space is in meters, so after transforming a point from NDC to camera space you will have it in meters.
This is described in more detail in part 5.1, where I demonstrate the projection of points from meters to NDC and back.
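For readers who prefer the general recipe over a closed-form formula, a small sketch of that NDC-to-camera-space step (the 4x4 projection matrix itself has to come from the game or be constructed, as discussed later in this thread):

    import numpy as np

    def ndc_to_camera(ndc_xyz, proj_matrix):
        # Undo the projection: append w = 1, multiply by the inverse projection matrix,
        # then divide by the resulting w to leave homogeneous coordinates.
        p = np.append(np.asarray(ndc_xyz, dtype=np.float64), 1.0)
        cam = np.linalg.inv(proj_matrix) @ p
        cam = cam[:3] / cam[3]
        # The camera looks down the negative Z axis, so depth in meters is -cam[2].
        return cam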

barcharcraz commented on June 1, 2024

barcharcraz commented on June 1, 2024

barcharcraz commented on June 1, 2024

check out https://github.com/umautobots/gta-postprocessing for postprocessing code.

racinmat commented on June 1, 2024

Hopefully this article series could help a little bit.

racinmat commented on June 1, 2024

Thanks for the reply. I quite struggle with reading the data, since every page of this multipage tiff uses different flags.
I also struggle with the 2D bounding boxes from the PostgreSQL database. Are they up to date?
Since all 4 points are in the range [0, 1], I thought it would be sufficient to multiply the X coordinates by the width and the Y coordinates by the height, but that does not look right, and the bounding boxes are not positioned correctly when displayed over a screenshot.

racinmat commented on June 1, 2024

@barcharcraz I tried that, but unfortunately it contains only semantic segmentation for cars, not for other objects. And if I am not mistaken, it completely lacks depth data.
I wanted to use the ImageViewer you have as part of the solution with the managed plugin, but it does not seem to be working.

barcharcraz commented on June 1, 2024

racinmat commented on June 1, 2024

The postprocessing code would be really great.
I checked the bounding box, but it is stored as box (native Postgres structure), not box2d (PostGIS extension). And in the query building here, the coordinates are put there, not offset and extent, if I understand it correctly.

barcharcraz commented on June 1, 2024

racinmat commented on June 1, 2024

Oh, my bad with the bounding boxes.
I was confused because in the C# code there was new NpgsqlBox(detection.BBox.Max.Y, detection.BBox.Max.X, detection.BBox.Min.Y, detection.BBox.Min.X), but C# persists it to PostgreSQL in the form (MaxX, MaxY, MinX, MinY). Now I can display them correctly.
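For anyone hitting the same thing, a small sketch of reading such a box back in Python, assuming the (MaxX, MaxY), (MinX, MinY) ordering described above and coordinates relative to the image size (the function name and the example values are made up for illustration):

    import re

    def box_to_pixels(box_str, width, height):
        # Parse a PostgreSQL box literal such as '((0.71,0.64),(0.58,0.52))',
        # stored as ((max_x, max_y), (min_x, min_y)) in relative [0, 1] coordinates,
        # and scale it to pixel coordinates.
        max_x, max_y, min_x, min_y = map(float, re.findall(r'-?\d+(?:\.\d+)?', box_str))
        return min_x * width, min_y * height, max_x * width, max_y * height

    # box_to_pixels('((0.71,0.64),(0.58,0.52))', 1024, 768)
    #   -> (x_min, y_min, x_max, y_max) in pixels, ready to draw over the screenshot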

But the method you proposed in your paper uses much better bounding box refinement. I was a little bit disappointed that I could not find this post-processing code, because you did a really good job refining the data from both the depth and stencil buffers.
You were right about the coarseness of these native bounding boxes. I am really looking forward to it if you decide to upload the post-processing code. We wanted to use this repository at our university to replicate your research and to prepare our own dataset for some other machine learning tasks.

TommyAnqi commented on June 1, 2024

@JiamingSuen Hi, have you figured out how to decode the true depth value from the depth buffer?

JiamingSuen commented on June 1, 2024

@TommyAnqi As the author mentioned, the depth is already linearized. So no decoding is needed.

wujiyoung commented on June 1, 2024

@JiamingSuen I want to get the real depth value in a specific unit, such as meters. What should I do?

barcharcraz commented on June 1, 2024

Well, it's in "meters". Do be a little careful: while things like cars and people should be the right size, things like road lengths and building distances may be a little distorted, simply because it's a game.

wujiyoung commented on June 1, 2024

@racinmat thank you so much, your master's thesis is very useful to me. I have two more questions.

  1. We can get an entity position (x, y, z) by the native call ENTITY::GET_ENTITY_COORDS, but how can we get the value of w, which is needed in the transformation from world coordinates to camera coordinates in part 3.6.2?
  2. How can we get the distance values l, r, t, b in part 5.1?

racinmat commented on June 1, 2024

@wujiyoung you won't get a w value, since it is the coordinate in homogeneous coordinates. The usual way to treat points in homogeneous coordinates is to set w to 1, do all your calculations, and then divide each point by its w value, which normalizes it from homogeneous coordinates back to 3D.
I did not calculate l, r, t, b directly, since I need them only as fractions in the projection matrix, but they can be calculated from the field of view and the height/width ratio. The exact creation of the projection matrix from field of view, width, height and near clip is in this function: https://github.com/racinmat/GTAVisionExport-postprocessing/blob/master/gta_math.py#L159
It is part of my repo where I perform various postprocessing of data gathered from GTA.
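To make that concrete, here is a sketch of the whole round trip under the conventions discussed in this thread (vertical field of view, camera looking down -Z, reversed depth with 1 at the near plane). It is not a copy of the linked gta_math.py, just an illustration of how fov and the W/H ratio replace l, r, t, b:

    import numpy as np

    def make_proj_matrix(H, W, fov=50.0, near_clip=0.15, far_clip=10003.814):
        # Perspective projection built only from fov (assumed vertical, in degrees),
        # image size and clip planes; l, r, t, b never appear explicitly.
        c = 1.0 / np.tan(np.radians(fov) / 2.0)
        n, f = near_clip, far_clip
        return np.array([
            [c * H / W, 0.0, 0.0,          0.0],
            [0.0,       c,   0.0,          0.0],
            [0.0,       0.0, n / (f - n),  n * f / (f - n)],  # reversed depth: 1 at near, 0 at far (assumption)
            [0.0,       0.0, -1.0,         0.0],              # -1 because the camera looks down -Z
        ])

    # Homogeneous workflow: set w = 1, project, then divide by the resulting w.
    p_view = np.array([2.0, 1.0, -15.0, 1.0])   # a point 15 m in front of the camera
    p_clip = make_proj_matrix(768, 1024) @ p_view
    p_ndc = p_clip / p_clip[3]                  # under these assumptions, p_ndc[2] is the depth-buffer value
    print(p_ndc)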

bachw433 commented on June 1, 2024

@racinmat I used your code to transform NDC into meters, but I don't think the result is correct.
I just adjusted the input size to W=1024 & H=768:
construct_proj_matrix(H=768, W=1024, fov=50.0, near_clip=1.5)
Are there other things I have to change?
[screenshots p1, p2, p3]

racinmat commented on June 1, 2024

Are you sure the near_clip is correct? You need to obtain it from the camera parameters; the default near_clip is 0.15. I left 1.5 there because I was trying some things at the time of writing the code, and an incorrectly set near_clip messes things up a lot. The same goes for fov (the GTA camera field of view), but I think 50 is the default value.
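In other words, for the example above the call would presumably be the same resolution with the default near clip, with fov and near_clip ideally read from the camera rather than hard-coded:

    proj = construct_proj_matrix(H=768, W=1024, fov=50.0, near_clip=0.15)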

xiaofeng94 commented on June 1, 2024

Hi @racinmat, thanks for sharing. According to your thesis, it seems that the transformation from camera coordinates to NDC is a perspective projection. So, does it matter whether P_{3,2} (in the projection matrix) is 1 or -1? If not, may I use the standard perspective projection matrix (like the one provided by DirectX) to get the depth values in meters?

racinmat commented on June 1, 2024

It matters, because in RAGE (the engine used by GTA V) the camera in view space is heading in the direction of negative Z, so positive Z values are behind the camera and negative values are in front of it. AFAIK the -1 handles this negative Z coordinate. But that is just orientation, so if you use 1 instead of -1, it should work as long as you care only about the depth in meters and not about the whole transformation into camera view.

bachw433 commented on June 1, 2024

thanks, @racinmat
I misunderstood; I thought 0.15 was just another magic default number for the near clip, like your magic far clip. XD

Besides, I found an interesting thing.
From the function:
var data = GTAData.DumpData(Game.GameTime + ".tiff", new List<Weather>(wantedWeather));
you can get the projection matrix directly by calling data.ProjectionMatrix,
and with your code, NDC can be transformed into meters perfectly.

(data.ProjectionMatrix is different depending on whether there is sky (infinite depth) in the screen or not,
but only the matrix with sky can be used for the depth transformation perfectly.)

racinmat commented on June 1, 2024

Yes, you can use the projection matrix directly, but it's inaccurate.
If you look at how that projection matrix is calculated: the model_view_projection matrix and the model_view matrix are obtained, and the projection matrix is obtained by multiplying the former by the inverse of the latter. Because of the inversion of the model_view matrix and the matrix multiplication, you face numerical instabilities, and the resulting projection matrix is inaccurate.
Constructing the projection matrix from parameters avoids these numerical stability issues.
If you compare them, the constructed matrix is slightly more precise than the one obtained from code.
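A sketch of the two routes, assuming the column-vector convention used above (the matrix names are placeholders for whatever the exporter dumps):

    import numpy as np

    def recover_projection(model_view_projection, model_view):
        # MVP = P @ MV, so P can be recovered as MVP @ inv(MV); the inversion and the
        # extra multiplication are where the rounding error creeps in.
        return model_view_projection @ np.linalg.inv(model_view)

    # Comparing it against a matrix constructed from fov, W, H and near clip,
    # e.g. np.max(np.abs(recover_projection(MVP, MV) - constructed)), shows the
    # small numerical differences described above.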

vivasvan1 commented on June 1, 2024

@Yannnnnnnnnnnn Thank you for helping dummies like me 👍! I appreciate your comment! And thanks to the original genius @racinmat. <3
@racinmat I hope one day I will be able to understand your work in depth & in depth.
