Comments (6)

jinlinyi commented on May 17, 2024

Hi, I think I figured out how to use the intrinsic matrix to project 3D points. See project_by_intrinsics(). The difficult part is the frame transformation between the Objectron frame (x down, y right, z in) and the projection frame used in the OpenCV/H-Z framework (x right, y down, z out). Additionally, the Objectron intrinsic matrix has px and py swapped because the image is in portrait mode (as pointed out here), so we have to swap x and y.

import hub
import matplotlib.pyplot as plt
import cv2
import numpy as np


def draw_box(img, arranged_points, save_path):
    """
    plot arranged_points on img and save to save_path.
    arranged_points is in image coordinate. [[x, y]]
    """
    RADIUS = 10
    COLOR = (255, 255, 255)
    EDGES = [
      [1, 5], [2, 6], [3, 7], [4, 8],  # lines along x-axis
      [1, 3], [5, 7], [2, 4], [6, 8],  # lines along y-axis
      [1, 2], [3, 4], [5, 6], [7, 8]   # lines along z-axis
    ] 
    for i in range(arranged_points.shape[0]):
        x, y = arranged_points[i]
        cv2.circle(
            img,
            (int(x), int(y)), 
            RADIUS,
            COLOR,
            -1  # any negative thickness draws a filled circle
        )
    for edge in EDGES:
        start_points = arranged_points[edge[0]]
        start_x = int(start_points[0])
        start_y = int(start_points[1])
        end_points = arranged_points[edge[1]]
        end_x = int(end_points[0])
        end_y = int(end_points[1])
        cv2.line(img, (start_x, start_y), (end_x, end_y), COLOR, 2)

    plt.imshow(img)
    plt.savefig(save_path)
    plt.close()


def project_by_intrinsics(element):
    """
    Project using camera intrinsics. 
    reference
    https://github.com/google-research-datasets/Objectron/issues/39#issuecomment-835509430
    https://amytabb.com/ts/2019_06_28/

    Objectron frame (x down, y right, z in); 
    H-Z frame (x right, y down, z out); 
    Objectron intrinsics has px and py swapped;
    px and py are from original image size (1440, 1920);

    Approach 1:
    To transform objectron frame to H-Z frame,
    we need to z <-- -z and swap x and y;
    To modify intrinsics, we need swap px, py.

    Or alternatively, approach 2:
    we change the sign for z and swap x and y after projection.
    """
    # copy so the arrays stored in element are not modified in place
    vertices_3d = element['point_3d'].reshape(9,3).copy()
    # objectron frame to H-Z frame
    vertices_3d[:,2] = -vertices_3d[:,2]
    intr = element['camera_intrinsics'].reshape(3,3).copy()
    # scale intrinsics from (1920, 1440) to (640, 480)
    intr[:2, :] = intr[:2, :] / np.array([[1920],[1440]]) * np.array([[640],[480]])
    point_2d = intr @ vertices_3d.T 
    point_2d[:2,:] = point_2d[:2,:] / point_2d[2,:]
    # landscape to portrait swap x and y.
    point_2d[[0,1],:] = point_2d[[1,0],:]
    arranged_points = point_2d.T[:,:2]
    return arranged_points


def project_by_camera_projection(element):
    """
    Reference: https://github.com/google-research-datasets/Objectron/blob/master/notebooks/objectron-geometry-tutorial.ipynb
    http://www.songho.ca/opengl/gl_projectionmatrix.html
    function project_points
    """
    vertices_3d = element['point_3d'].reshape(9,3)
    vertices_3d_homg = np.concatenate((vertices_3d, np.ones_like(vertices_3d[:, :1])), axis=-1).T
    vertices_2d_proj = np.matmul(element['camera_projection'].reshape(4,4), vertices_3d_homg)
    # Project the points
    points2d_ndc = vertices_2d_proj[:-1, :] / vertices_2d_proj[-1, :]
    points2d_ndc = points2d_ndc.T
    # Convert the 2D Projected points from the normalized device coordinates to pixel values
    x = points2d_ndc[:, 1]
    y = points2d_ndc[:, 0]
    pt2d = np.copy(points2d_ndc)
    pt2d[:, 0] = (1 + x) / 2 * element['image_width']
    pt2d[:, 1] = (1 + y) / 2 * element['image_height']
    arranged_points = pt2d[:,:2]
    return arranged_points


def project_by_point2d(element):
    """
    Reference: https://app.activeloop.ai/google/bike
    function get_bbox
    """
    # copy so the normalized point_2d stored in element is not rescaled in place
    arranged_points = element['point_2d'].reshape(9,3).copy()
    arranged_points[:,0] = arranged_points[:,0] * element['image_width']
    arranged_points[:,1] = arranged_points[:,1] * element['image_height']
    return arranged_points[:,:2]

if __name__ == '__main__':
    # load the dataset once, outside the per-frame loop
    bikes = hub.Dataset("google/bike")
    frame_ids = [0]
    for idx in frame_ids:
        element = bikes[idx].compute()

        pt2d_point2d = project_by_point2d(element.copy())
        draw_box(element['image'].copy(), pt2d_point2d, 'point2d.png')
        
        pt2d_cam_proj = project_by_camera_projection(element.copy())
        draw_box(element['image'].copy(), pt2d_cam_proj, 'camera_projection.png')
        
        pt2d_intr = project_by_intrinsics(element.copy())
        draw_box(element['image'].copy(), pt2d_intr, 'camera_intrinsics.png')

from objectron.

lzhang57 commented on May 17, 2024

Hi DeriZSY, if you want to use the 3x3 camera intrinsic matrix, you can parse it from the a_r_capture_metadata_pb2 proto using the line below from the objectron-geometry-tutorial:
intrinsics = np.array(data.camera.intrinsics).reshape(3, 3)

For the 4x4 view_matrix, I am guessing you were referring to the projection_matrix that we used for projecting the points from 3D to 2D in our code. It's an approach often used in computer graphics, and it is pretty much the counterpart of the camera intrinsic matrix that is popular in the 3D geometry community. For more details, see this OpenGL tutorial.

DeriZSY commented on May 17, 2024

Hi, thanks for the reply. You are right about the matrix part. I know that I can obtain the intrinsics from the data. In the annotation for each frame, the 3D box corner points (plus a center point) are provided in the camera frame and can be projected to 2D using projection_matrix.

My question is: how should I modify the 3D point annotations so that I can project them with the intrinsics instead of projection_matrix, in the traditional computer-vision way, i.e. pts2d = K * pts3d?
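
For reference, the pts2d = K * pts3d projection mentioned here could be sketched in NumPy roughly as follows. This is only a minimal sketch: it assumes the points are already expressed in an OpenCV-style camera frame (x right, y down, z forward) and that there is no lens distortion. The function name project_pinhole and the numbers in K are made up for illustration and are not taken from the dataset.

```python
import numpy as np

def project_pinhole(K, pts3d):
    """Project (N, 3) camera-frame points to (N, 2) pixel coordinates."""
    pts3d = np.asarray(pts3d, dtype=float)
    # Each row of uvw is K @ X for one 3D point X.
    uvw = pts3d @ np.asarray(K, dtype=float).T
    # Perspective divide by the third (depth) component.
    return uvw[:, :2] / uvw[:, 2:3]

# Illustrative intrinsics (assumed values, not from Objectron).
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
pts = np.array([[0.0, 0.0, 2.0],     # a point on the optical axis
                [0.2, -0.1, 2.0]])   # a point right of and above the axis
pixels = project_pinhole(K, pts)
```

A point on the optical axis lands on the principal point, which is a quick sanity check for whichever K and frame convention are in use.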

To solve for the object pose, the usual way is to estimate the 2D corner and center points with a CNN, then solve PnP using the estimated 2D points, the 3D corner points, and the intrinsic matrix. Google MediaPipe uses exactly this approach according to the paper.

How would you make it work if the 3D point annotations can only be projected with projection_matrix instead of the intrinsics? (I'd be glad to know if there is a workaround.)

ahmadyan commented on May 17, 2024

The projection matrix and the intrinsic matrix are basically the same. The projection matrix maps a 3D point to an image plane spanning [-1, 1] with the principal point at 0, whereas the intrinsic matrix maps it to [w, h] with the principal point at the center [w/2, h/2]. So in the projection matrix, the principal point is set to 0 and the focal lengths are divided by width/height. You can also use the intrinsic matrix directly, or grab the values from the projection matrix:

    fx, fy = proj[0, 0], proj[1, 1]
    cx, cy = proj[0, 2], proj[1, 2]
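
A rough sketch of how those values could be turned into a pixel-unit matrix is given below. It assumes the normalized coordinate is u_ndc = fx * X / Z + cx in [-1, 1] and that pixels are obtained as (1 + u_ndc) / 2 * size, which is the mapping used in project_by_camera_projection earlier in this thread. The helper name ndc_to_pixel_intrinsics is hypothetical, and the x/y swap and the sign of z discussed in this thread still have to be handled separately.

```python
import numpy as np

def ndc_to_pixel_intrinsics(fx, fy, cx, cy, width, height):
    """Build a pixel-unit 3x3 matrix from NDC-style focal/principal values.

    Assumption: u_pix = (1 + fx * X / Z + cx) / 2 * width, and likewise
    for v with fy, cy and height.
    """
    return np.array([
        [fx * width / 2.0, 0.0, (1.0 + cx) * width / 2.0],
        [0.0, fy * height / 2.0, (1.0 + cy) * height / 2.0],
        [0.0, 0.0, 1.0],
    ])

# Illustrative values only: a principal point of 0 in NDC maps to the image center.
K_pix = ndc_to_pixel_intrinsics(fx=1.5, fy=2.0, cx=0.0, cy=0.0,
                                width=640, height=480)
```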

lzhang57 commented on May 17, 2024

It also depends on which camera coordinate convention you want to use. In our case, the x-axis points to the image bottom, the y-axis points to the image right, and the z-axis points away from the image. If you use a different convention, e.g. OpenCV's, then you will have to transform the 3D points into your own camera coordinate frame before projecting them to 2D.
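
For the conventions described in this thread, that adjustment might look like the sketch below. It assumes the Objectron camera frame is x down, y right, with z pointing toward the viewer (so visible points have negative z), and that the target is the OpenCV frame (x right, y down, z forward). The name objectron_to_opencv is a hypothetical helper, not part of the Objectron codebase.

```python
import numpy as np

def objectron_to_opencv(pts3d):
    """Convert (N, 3) points from the assumed Objectron frame to OpenCV's.

    Mapping: (x_cv, y_cv, z_cv) = (y_obj, x_obj, -z_obj).
    """
    pts3d = np.asarray(pts3d, dtype=float)
    # Swap the first two axes, then flip the sign of z.
    return pts3d[:, [1, 0, 2]] * np.array([1.0, 1.0, -1.0])

# Illustrative point: 0.1 down, 0.2 right, 1.5 in front of the camera.
pts_obj = np.array([[0.1, 0.2, -1.5]])
pts_cv = objectron_to_opencv(pts_obj)
```

Swapping two axes and negating the third is a proper rotation, so handedness is preserved. Because the Objectron intrinsic matrix is reported in this thread to have px and py swapped, it would likely need the matching swap before being used with the converted points.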

ahmadyan commented on May 17, 2024

@jinlinyi Thanks for the example. I would be happy to merge this in the repo, if you create a pull request.
