According to the <a href="https://github.com/google-research-datasets/Objectron/blob/m

<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/us

How to project 3D points annotations to 2D with CAMERA INTRINSIC MATRIX (instead of PROJECTION MATRIX)?

Comments (6)

jinlinyi commented on May 17, 2024

Hi, I think I figured out how to use the intrinsic matrix to project the 3D points; see project_by_intrinsics() below. The difficult part is the frame transformation between the Objectron frame (x down, y right, z in) and the projection frame of the OpenCV/H-Z framework (x right, y down, z out). Additionally, the Objectron intrinsic matrix has px and py swapped because the image is in portrait mode (as pointed out here), so we have to swap x and y.

import hub
import matplotlib.pyplot as plt
import cv2
import numpy as np


def draw_box(img, arranged_points, save_path):
    """
    plot arranged_points on img and save to save_path.
    arranged_points is in image coordinate. [[x, y]]
    """
    RADIUS = 10
    COLOR = (255, 255, 255)
    EDGES = [
      [1, 5], [2, 6], [3, 7], [4, 8],  # lines along x-axis
      [1, 3], [5, 7], [2, 4], [6, 8],  # lines along y-axis
      [1, 2], [3, 4], [5, 6], [7, 8]   # lines along z-axis
    ] 
    for i in range(arranged_points.shape[0]):
        x, y = arranged_points[i]
        cv2.circle(
            img,
            (int(x), int(y)), 
            RADIUS,
            COLOR,
            -1  # any negative thickness draws a filled circle; -1 is the usual value
        )
    for edge in EDGES:
        start_points = arranged_points[edge[0]]
        start_x = int(start_points[0])
        start_y = int(start_points[1])
        end_points = arranged_points[edge[1]]
        end_x = int(end_points[0])
        end_y = int(end_points[1])
        cv2.line(img, (start_x, start_y), (end_x, end_y), COLOR, 2)

    plt.imshow(img)
    plt.savefig(save_path)
    plt.close()


def project_by_intrinsics(element):
    """
    Project using camera intrinsics. 
    reference
    https://github.com/google-research-datasets/Objectron/issues/39#issuecomment-835509430
    https://amytabb.com/ts/2019_06_28/

    Objectron frame (x down, y right, z in); 
    H-Z frame (x right, y down, z out); 
    Objectron intrinsics has px and py swapped;
    px and py are from original image size (1440, 1920);

    Approach 1:
    To transform objectron frame to H-Z frame,
    we need to z <-- -z and swap x and y;
    To modify intrinsics, we need swap px, py.

    Or alternatively, approach 2:
    we change the sign for z and swap x and y after projection.
    """
    # Copy so that the caller's arrays are not modified in place.
    vertices_3d = element['point_3d'].reshape(9,3).copy()
    # objectron frame to H-Z frame
    vertices_3d[:,2] = -vertices_3d[:,2]
    intr = element['camera_intrinsics'].reshape(3,3).copy()
    # scale intrinsics from (1920, 1440) to (640, 480)
    intr[:2, :] = intr[:2, :] / np.array([[1920],[1440]]) * np.array([[640],[480]])
    point_2d = intr @ vertices_3d.T 
    point_2d[:2,:] = point_2d[:2,:] / point_2d[2,:]
    # landscape to portrait swap x and y.
    point_2d[[0,1],:] = point_2d[[1,0],:]
    arranged_points = point_2d.T[:,:2]
    return arranged_points


def project_by_camera_projection(element):
    """
    Reference: https://github.com/google-research-datasets/Objectron/blob/master/notebooks/objectron-geometry-tutorial.ipynb
    http://www.songho.ca/opengl/gl_projectionmatrix.html
    function project_points
    """
    vertices_3d = element['point_3d'].reshape(9,3)
    vertices_3d_homg = np.concatenate((vertices_3d, np.ones_like(vertices_3d[:, :1])), axis=-1).T
    vertices_2d_proj = np.matmul(element['camera_projection'].reshape(4,4), vertices_3d_homg)
    # Project the points
    points2d_ndc = vertices_2d_proj[:-1, :] / vertices_2d_proj[-1, :]
    points2d_ndc = points2d_ndc.T
    # Convert the 2D Projected points from the normalized device coordinates to pixel values
    x = points2d_ndc[:, 1]
    y = points2d_ndc[:, 0]
    pt2d = np.copy(points2d_ndc)
    pt2d[:, 0] = (1 + x) / 2 * element['image_width']
    pt2d[:, 1] = (1 + y) / 2 * element['image_height']
    arranged_points = pt2d[:,:2]
    return arranged_points


def project_by_point2d(element):
    """
    Reference: https://app.activeloop.ai/google/bike
    function get_bbox
    """
    # Copy so that scaling to pixels does not modify element['point_2d'] in place.
    arranged_points = element['point_2d'].reshape(9,3).copy()
    arranged_points[:,0] = arranged_points[:,0] * element['image_width']
    arranged_points[:,1] = arranged_points[:,1] * element['image_height']
    return arranged_points[:,:2]

if __name__ == '__main__':
    frame_ids = [0]
    # Load the dataset once, outside the loop.
    bikes = hub.Dataset("google/bike")
    for idx in frame_ids:
        element=bikes[idx].compute()

        pt2d_point2d = project_by_point2d(element.copy())
        draw_box(element['image'].copy(), pt2d_point2d, 'point2d.png')
        
        pt2d_cam_proj = project_by_camera_projection(element.copy())
        draw_box(element['image'].copy(), pt2d_cam_proj, 'camera_projection.png')
        
        pt2d_intr = project_by_intrinsics(element.copy())
        draw_box(element['image'].copy(), pt2d_intr, 'camera_intrinsics.png')
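The docstring of project_by_intrinsics() describes two approaches, and only the second is implemented. Here is a minimal sketch that checks the two agree on synthetic data. The intrinsics and points are made-up values, not read from the dataset, and the function names are hypothetical. Note that for the results to match when fx and fy differ, approach 1 has to swap the focal lengths along with px and py.

```python
import numpy as np

# Made-up values for illustration only (not taken from Objectron).
K_objectron = np.array([[1500.0, 0.0, 720.0],
                        [0.0, 1480.0, 960.0],
                        [0.0, 0.0, 1.0]])
# Points in the Objectron camera frame (x down, y right, z towards the viewer),
# so points in front of the camera have negative z.
pts_objectron = np.array([[0.10, -0.20, -1.5],
                          [-0.30, 0.25, -2.0],
                          [0.00, 0.00, -1.0]])


def project_swap_after(K, pts):
    """Approach 2: negate z, project, then swap the two pixel coordinates."""
    p = pts.copy()
    p[:, 2] = -p[:, 2]
    uvw = (K @ p.T).T
    uv = uvw[:, :2] / uvw[:, 2:3]
    return uv[:, ::-1]


def project_swap_before(K, pts):
    """Approach 1: move the points and the intrinsics into the H-Z frame first."""
    # x_hz = y_obj, y_hz = x_obj, z_hz = -z_obj
    to_hz = np.array([[0.0, 1.0, 0.0],
                      [1.0, 0.0, 0.0],
                      [0.0, 0.0, -1.0]])
    p = pts @ to_hz.T
    # Swap the two image axes of K: both the focal lengths and the principal point.
    swap = np.array([[0.0, 1.0, 0.0],
                     [1.0, 0.0, 0.0],
                     [0.0, 0.0, 1.0]])
    K_hz = swap @ K @ swap
    uvw = (K_hz @ p.T).T
    return uvw[:, :2] / uvw[:, 2:3]


uv_a = project_swap_after(K_objectron, pts_objectron)
uv_b = project_swap_before(K_objectron, pts_objectron)
print(np.allclose(uv_a, uv_b))
```

Approach 1 is probably the more convenient one if the points are later passed to a solver that expects the OpenCV convention, since both the points and K are then already in that frame.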

lzhang57 commented on May 17, 2024

Hi DeriZSY, if you want to use the 3x3 camera intrinsic matrix, you can parse it from the a_r_capture_metadata_pb2 proto using the line below from the objectron-geometry-tutorial:
intrinsics = np.array(data.camera.intrinsics).reshape(3, 3)

For the 4x4 view_matrix, I am guessing you were referring to the projection_matrix that we use in our code to project points from 3D to 2D. This approach is common among computer graphics folks and is pretty much the counterpart of the camera intrinsic matrix that is popular in the 3D geometry community. For more details, see this OpenGL tutorial.

DeriZSY commented on May 17, 2024

Hi, thanks for the reply. You are right about the matrix part. I know that I can obtain the intrinsics from the data. In the annotation for each frame, the 3D box corner points (plus a center point) are provided in the camera frame and can be projected to 2D using projection_matrix.

My problem is this: how should I modify the 3D point annotations so that I can project them with the intrinsics instead of projection_matrix, in the traditional computer vision way, i.e. pts2d = K * pts3d?

To solve for the object pose, the usual way is to estimate the 2D corner and center points with a CNN, then solve PnP using the estimated 2D points, the 3D corner points, and the intrinsic matrix. According to the paper, Google's MediaPipe uses exactly this approach.

How would you make it work if the 3D point annotations can only be projected with projection_matrix instead of the intrinsics? (I'd be glad to know if there is a workaround.)
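To make the PnP step concrete, here is a minimal sketch of pose recovery from 2D-3D correspondences and K. It uses a plain Direct Linear Transform (DLT) as a stand-in for a production PnP solver, assumes an OpenCV-style camera (z forward), and runs on synthetic, noise-free data. The box size, pose, and intrinsics are made-up values, and dlt_pose is a hypothetical helper, not part of Objectron or MediaPipe.

```python
import numpy as np


def dlt_pose(points_3d, points_2d, K):
    """Estimate R, t with pixel ~ K @ (R @ X + t) from >= 6 non-coplanar points."""
    n = len(points_3d)
    ones = np.ones((n, 1))
    # Remove the intrinsics so that the unknown 3x4 matrix is just [R | t].
    normalized = np.hstack([points_2d, ones]) @ np.linalg.inv(K).T
    X_h = np.hstack([points_3d, ones])
    rows = []
    for X, (x, y, _) in zip(X_h, normalized):
        rows.append(np.concatenate([X, np.zeros(4), -x * X]))
        rows.append(np.concatenate([np.zeros(4), X, -y * X]))
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(np.array(rows))
    P = vt[-1].reshape(3, 4)
    # P equals s * [R | t] for an unknown scale s; recover |s| and the sign.
    U, S, Vt = np.linalg.svd(P[:, :3])
    R = U @ Vt
    t = P[:, 3] / S.mean()
    if np.linalg.det(R) < 0:
        R, t = -R, -t
    return R, t


def rot_x(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])


def rot_y(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])


# Synthetic box: center plus 8 corners in the object frame (made-up size).
signs = np.array([[sx, sy, sz] for sx in (-0.5, 0.5)
                  for sy in (-0.5, 0.5) for sz in (-0.5, 0.5)])
keypoints = np.vstack([np.zeros((1, 3)), signs * np.array([0.4, 0.6, 1.0])])

# Made-up ground-truth pose and intrinsics.
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
R_true = rot_x(-0.3) @ rot_y(0.4)
t_true = np.array([0.1, -0.2, 3.0])

# Project with pts2d ~ K @ (R @ X + t), then recover the pose from the pixels.
cam = keypoints @ R_true.T + t_true
uvw = cam @ K.T
pixels = uvw[:, :2] / uvw[:, 2:3]
R_est, t_est = dlt_pose(keypoints, pixels, K)
print(np.allclose(R_est, R_true, atol=1e-6), np.allclose(t_est, t_true, atol=1e-6))
```

With noisy CNN predictions a robust solver would be the better choice; the DLT here only illustrates why the points and K need to be in the same camera convention before solving.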

ahmadyan commented on May 17, 2024

The projection matrix and the intrinsic matrix are basically the same. The projection matrix maps a 3D point to an image plane spanning [-1, 1] with the principal point at 0, whereas the intrinsic matrix maps it to a [w, h] image with the principal point at the center [w/2, h/2]. So in the projection matrix, the principal point is set to 0 and the focal lengths are divided by the width/height. You can also use the intrinsic matrix directly, or grab the values from the projection matrix:

    fx, fy = proj[0, 0], proj[1, 1]
    cx, cy = proj[0, 2], proj[1, 2]
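As a rough illustration of that relationship, the sketch below converts the normalized values into pixel units and checks that both routes give the same pixel. It assumes a generic OpenGL-style matrix layout (clip w = -z) and the (1 + ndc) / 2 * size mapping used in project_by_camera_projection; it ignores Objectron's x/y swap, so the layout of the real matrix should be checked before reusing it. The matrix values and the function name are made up.

```python
import numpy as np


def ndc_projection_to_pixel_intrinsics(proj, width, height):
    """Pixel-space K equivalent to an OpenGL-style projection (assumed layout)."""
    a, b = proj[0, 0], proj[1, 1]
    c, d = proj[0, 2], proj[1, 2]
    # With w_clip = -z: ndc_x = a * x / (-z) - c,
    # and pixel_x = (1 + ndc_x) / 2 * width.
    fx = a * width / 2.0
    fy = b * height / 2.0
    cx = (1.0 - c) * width / 2.0
    cy = (1.0 - d) * height / 2.0
    return np.array([[fx, 0.0, cx], [0.0, fy, cy], [0.0, 0.0, 1.0]])


width, height = 640, 480
# Made-up projection matrix; the third row only affects depth, not the pixel.
proj = np.array([[1.6, 0.0, 0.02, 0.0],
                 [0.0, 2.1, -0.01, 0.0],
                 [0.0, 0.0, -1.0, -0.02],
                 [0.0, 0.0, -1.0, 0.0]])
point = np.array([0.2, -0.1, -1.5, 1.0])  # camera looks down -z

# Route 1: projection matrix, perspective divide, NDC to pixels.
clip = proj @ point
ndc = clip[:3] / clip[3]
pixel_from_proj = np.array([(1 + ndc[0]) / 2 * width,
                            (1 + ndc[1]) / 2 * height])

# Route 2: pixel-space intrinsics applied to the point with depth = -z.
K = ndc_projection_to_pixel_intrinsics(proj, width, height)
uvw = K @ np.array([point[0], point[1], -point[2]])
pixel_from_K = uvw[:2] / uvw[2]

print(np.allclose(pixel_from_proj, pixel_from_K))
```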

lzhang57 commented on May 17, 2024

It also depends on which camera coordinate convention you want to use. In our case, the x-axis points to the image bottom, the y-axis points to the image right, and the z-axis points away from the image. If you use a different convention, e.g. OpenCV's, then you will have to convert the 3D points to your own camera coordinates before projecting them to 2D.
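The same change of basis also applies to a whole object pose, which matters when a pose is estimated or evaluated in another convention. This is a minimal sketch, assuming "away from the image" means towards the viewer, so that the conversion to OpenCV's frame (x right, y down, z forward) is x_cv = y_obj, y_cv = x_obj, z_cv = -z_obj. The pose values and the name convert_pose are made up.

```python
import numpy as np

# Objectron camera frame -> OpenCV camera frame (assumed mapping, see above).
OBJECTRON_TO_OPENCV = np.array([[0.0, 1.0, 0.0],
                                [1.0, 0.0, 0.0],
                                [0.0, 0.0, -1.0]])


def convert_pose(R_obj, t_obj):
    """Re-express an object-to-camera pose in the OpenCV camera frame."""
    return OBJECTRON_TO_OPENCV @ R_obj, OBJECTRON_TO_OPENCV @ t_obj


# Made-up pose in the Objectron camera frame: 30 degree rotation about z.
angle = np.deg2rad(30.0)
R_obj = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                  [np.sin(angle), np.cos(angle), 0.0],
                  [0.0, 0.0, 1.0]])
t_obj = np.array([0.05, -0.10, -2.0])  # negative z: in front of the camera

R_cv, t_cv = convert_pose(R_obj, t_obj)

# A point converted after posing matches the point posed with the converted pose.
X = np.array([0.2, 0.1, -0.3])
direct = OBJECTRON_TO_OPENCV @ (R_obj @ X + t_obj)
via_pose = R_cv @ X + t_cv
print(np.allclose(direct, via_pose), np.linalg.det(OBJECTRON_TO_OPENCV))
```

The determinant of the conversion matrix is +1, so it is a proper rotation and the converted R_cv stays a valid rotation matrix.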

ahmadyan commented on May 17, 2024

@jinlinyi Thanks for the example. I would be happy to merge this in the repo, if you create a pull request.
