Nuitrack and Azure Kinect: insights on skeletal tracking + pixel coordinates computation from normalized coordinates


I am using Nuitrack with an Azure Kinect to track the hand position in real time and show it on the screen of my application. I may use either the classic body tracking model or the AI model, depending on future analyses.

So, I have some general questions:

  • Where is the origin of the reference system when I use the “classic” model and the AI model? Is there a way to “force” the origin to be at the same point in both cases?

  • I have noticed that not all the joints defined in JointType are used for the skeletal data (here). Is this up to date, or is there any news on it?

  • To render a point on the screen at the hand's position, I was thinking about the solution below. Is it correct, or is there a more direct/optimized approach?

  1. Use ConvertRealToProjCoords() to get the normalized coordinates (or export the joint.Proj coordinates)
  2. Find the corresponding pixel by using such a formula: value_pixel = actualDimension * value_norm.
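As a minimal sketch of step 2, assuming the projective coordinates are normalized to the range [0, 1] and that the target is a width x height pixel grid: the function name norm_to_pixel and the clamping of out-of-range values are illustrative choices, not part of the Nuitrack API.

```python
def norm_to_pixel(x_norm, y_norm, width, height):
    """Map normalized [0, 1] coordinates to integer pixel indices."""
    # Clamp so a joint slightly outside the frame still maps to a valid pixel
    # (an assumption of this sketch; dropping such points is also reasonable).
    x_norm = min(max(x_norm, 0.0), 1.0)
    y_norm = min(max(y_norm, 0.0), 1.0)
    # value_pixel = actualDimension * value_norm, capped at the last valid index.
    px = min(int(x_norm * width), width - 1)
    py = min(int(y_norm * height), height - 1)
    return px, py


print(norm_to_pixel(0.5, 0.25, 1280, 720))  # (640, 180)
```

The cap at width - 1 and height - 1 only matters when a normalized value is exactly 1.0, which would otherwise index one pixel past the edge.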

Thanks for your help!

Hello @cg72, here are the short answers:

  1. origin in both cases is at the sensor/camera position
  2. the joint set was initially defined by the OpenNI 1.x specification, so some joints were included during initial development for compatibility and future expansion, but we aren’t limited to that set in any way
    Please let us know if any particular joints currently missing are crucial for your project.
  3. your formulae are correct, but the simplest option for this purpose is to use the HandTracker module, which implements a corresponding virtual coordinate system (normalized screen coordinates) for direct UI mapping
    Please refer to this tutorial for additional details.

As always we’re here to help with any other questions / issues.

Hello @TAG, thanks for your answer!
Anyway, I would like to expand briefly on the questions’ context:

  1. I understood that the classic Nuitrack model uses only depth data to compute body tracking, hence the origin is the depth sensor. Since the AI model uses both depth and color (and Depth2ColorRegistration must be set to true for it to work), I thought that the reference for this model would be the color camera. Am I correct, or can I assume the same reference for both?
  2. The current joint set is working for my project. I was just wondering whether the output of joints such as RightFoot and LeftFoot was valid for body tracking or not.
  3. Thanks for the confirmation. I also need to show other joints on screen, so I think using the SkeletonTracker data for all joints may be more convenient for me.