Add warp_perspective operator #5542

banasraf · 2024-06-28T10:55:31Z

Category:

New feature

Description:

Adds a new experimental.warp_perspective operator that is implemented on top of the corresponding CV-CUDA operator.

Additional information:

Affected modules and functionalities:

New operator

Key points relevant for the review:

Correctness of the handling of the parameters

Tests:

Checklist

Documentation

DALI team only

Requirements

Implements new requirements
Affects existing requirements
N/A

REQ IDs: N/A

JIRA TASK: N/A

Signed-off-by: Rafal Banas <[email protected]>

dali/test/python/operator_2/test_warp_perspective.py

+    elif dtype == np.int16 or dtype == np.uint16:
+        eps = 5
+
+    test_utils.compare_pipelines(pipe1, pipe2, batch_size=bs, N_iterations=10, eps=eps)


mzient · 2024-07-01T14:00:28Z

dali/operators/image/remap/cvcuda/warp_perspective.cc

+    .AddOptionalArg<float>("matrix",
+                           R"doc(
+  Perspective transform mapping of destination to source coordinates.
+  If `inverse_map` argument is set to true, the matrix is interpreted


Suggested change

-  If `inverse_map` argument is set to true, the matrix is interpreted
+  If `inverse_map` argument is set to false, the matrix is interpreted

?
At least that's what OpenCV's documentation says.
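To make the two possible interpretations concrete, here is a minimal sketch in plain Python. It calls neither DALI nor OpenCV; `apply_homography` and `invert_3x3` are illustrative helpers introduced for this note. It only shows that a source-to-destination matrix and its inverse (destination-to-source) describe the same warp, so the docstring has to say which of the two `inverse_map` selects.

```python
def apply_homography(m, x, y):
    """Map point (x, y) through a row-major 3x3 matrix given as 9 floats."""
    xh = m[0] * x + m[1] * y + m[2]
    yh = m[3] * x + m[4] * y + m[5]
    w = m[6] * x + m[7] * y + m[8]
    return xh / w, yh / w


def invert_3x3(m):
    """Invert a row-major 3x3 matrix via the adjugate (illustrative helper)."""
    a, b, c, d, e, f, g, h, i = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if det == 0:
        raise ValueError("matrix is singular")
    adj = [
        e * i - f * h, c * h - b * i, b * f - c * e,
        f * g - d * i, a * i - c * g, c * d - a * f,
        d * h - e * g, b * g - a * h, a * e - b * d,
    ]
    return [v / det for v in adj]


# Example matrix (made up for illustration): source -> destination coordinates.
src_to_dst = [2.0, 0.0, 10.0,
              0.0, 2.0, 20.0,
              0.0, 0.001, 1.0]
# The same warp expressed as destination -> source coordinates.
dst_to_src = invert_3x3(src_to_dst)

# Mapping a point forward and then back returns the original point.
dx, dy = apply_homography(src_to_dst, 3.0, 4.0)
sx, sy = apply_homography(dst_to_src, dx, dy)
```

Passing `src_to_dst` where `dst_to_src` is expected produces the opposite warp, which is why the wording of the flag matters.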

mzient · 2024-07-02T07:51:53Z

dali/test/python/operator_2/test_warp_perspective.py

+            cv2_warp_perspective(
+                dst[f, :, :, :],
+                img[f, :, :, :],
+                matrix,
+                layout,
+                border_mode,
+                interp_type,
+                inverse_map,
+                fill_value,
+            )


I have objections against using OpenCV as a reference. It has a different notion of pixel centers than we do. If we match OpenCV, it's actually a bad thing. Instead, when the perspective matrix is actually an affine transform matrix, WarpPerspective should match WarpAffine.
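The property proposed here can be sketched at the level of coordinates, in plain Python. The helper names below are hypothetical and no DALI operator is called; a real test would compare the outputs of the two operators rather than individual points. The sketch embeds a 2x3 affine matrix into a 3x3 perspective matrix and checks that both map a point identically.

```python
def affine_to_perspective(affine):
    """Embed a row-major 2x3 affine matrix (6 floats) into a 3x3 matrix."""
    if len(affine) != 6:
        raise ValueError("expected 6 elements for a 2x3 affine matrix")
    # The last row [0, 0, 1] makes the projective divisor equal to 1.
    return list(affine) + [0.0, 0.0, 1.0]


def apply_affine(a, x, y):
    """Map (x, y) through a row-major 2x3 affine matrix."""
    return a[0] * x + a[1] * y + a[2], a[3] * x + a[4] * y + a[5]


def apply_perspective(m, x, y):
    """Map (x, y) through a row-major 3x3 perspective matrix."""
    w = m[6] * x + m[7] * y + m[8]
    return ((m[0] * x + m[1] * y + m[2]) / w,
            (m[3] * x + m[4] * y + m[5]) / w)


# Example affine matrix (made up): scale, shear and translation.
affine = [1.5, 0.25, 10.0,
          -0.5, 2.0, -3.0]
perspective = affine_to_perspective(affine)

points = [(0.0, 0.0), (0.5, 0.5), (12.0, 7.0)]
matches = all(apply_affine(affine, x, y) == apply_perspective(perspective, x, y)
              for x, y in points)
```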

mzient · 2024-07-02T07:55:02Z

dali/operators/image/remap/cvcuda/warp_perspective.cc

+      matrix = AcquireTensorArgument(ws, scratchpad, matrix_arg_, TensorShape<1>(9),
+                                     nvcvop::GetDataType<float>(), "W");


I'm pretty sure we need to apply a fixup to the matrix to match WarpAffine. We can add OpenCV compatibility (here and in WarpAffine) as an option, but I think being self-consistent is far better than randomly matching a patchwork of common libraries.
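One possible shape of such a fixup, as a sketch only. It ASSUMES that the self-consistent convention places pixel centers at half-integer coordinates while the backend treats integer coordinates as pixel centers; the thread does not state the offsets, and `center_fixup` is a hypothetical name. Under that assumption the matrix is conjugated with a half-pixel translation.

```python
def matmul3(a, b):
    """Multiply two row-major 3x3 matrices stored as flat lists of 9 floats."""
    return [sum(a[3 * r + k] * b[3 * k + c] for k in range(3))
            for r in range(3) for c in range(3)]


def translation(t):
    """3x3 matrix translating both x and y by t."""
    return [1.0, 0.0, t,
            0.0, 1.0, t,
            0.0, 0.0, 1.0]


def center_fixup(m, offset=0.5):
    """Re-express m for a backend whose pixel centers sit at integers.

    ASSUMPTION: m is defined for coordinates where pixel i has its center
    at i + offset. An index is first shifted to its center, mapped with m,
    then shifted back to index coordinates.
    """
    return matmul3(translation(-offset), matmul3(m, translation(offset)))


identity = [1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0]
scale2 = [2.0, 0.0, 0.0, 0.0, 2.0, 0.0, 0.0, 0.0, 1.0]
fixed = center_fixup(scale2)
```

With this convention a pure 2x scale gains a half-pixel translation, which is the kind of difference that would show up when comparing against an OpenCV reference.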

szkarpinski · 2024-07-01T12:44:05Z

dali/operators/image/remap/cvcuda/warp_perspective.cc

+
+DALI_SCHEMA(experimental__WarpPerspective)
+    .DocStr(R"doc(
+TODO


This TODO should probably be gone before merging. It'd be nice to have a description of what this operator actually does.

szkarpinski · 2024-07-08T08:57:43Z

cmake/Dependencies.common.cmake

@@ -264,7 +264,7 @@ if (BUILD_CVCUDA)
  set(DALI_BUILD_PYTHON ${BUILD_PYTHON})
  set(BUILD_PYTHON OFF)
  # for now we use only median blur from CV-CUDA
-  set(CV_CUDA_SRC_PATERN medianblur median_blur morphology)
+  set(CV_CUDA_SRC_PATERN medianblur median_blur morphology warp)


Nitpick: I know it's not you but this should be PATTERN

szkarpinski · 2024-07-08T09:14:12Z

dali/test/python/operator_2/test_warp_perspective.py

@@ -0,0 +1,372 @@
+# Copyright (c) 2024, NVIDIA CORPORATION & AFFILIATES. All rights reserved.


We do a lot of parameter handling with tons of bug-prone ifs. I think we need to have some negative tests to check if we handle invalid argument combinations properly.
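A rough sketch of the pattern, in plain Python. `validate_warp_args` is a hypothetical stand-in for the operator's argument checks, and its two rules (a 9-element matrix, a `fill_value` length equal to the channel count) are assumptions loosely based on the diff, not the operator's confirmed behavior. In the real test file the call would build and run a pipeline instead.

```python
def validate_warp_args(matrix, fill_value, channels):
    """Hypothetical stand-in for the operator's argument validation."""
    if len(matrix) != 9:
        raise ValueError(f"matrix must have 9 elements, got {len(matrix)}")
    if fill_value and len(fill_value) != channels:
        raise ValueError(
            f"fill_value has {len(fill_value)} elements, expected {channels}")


# Each case: (matrix, fill_value, channels, fragment expected in the message).
INVALID_CASES = [
    ([1.0] * 6, [], 3, "9 elements"),
    ([1.0] * 9, [0.0, 0.0], 3, "fill_value"),
]


def run_negative_cases():
    """Return descriptions of cases that did not fail the expected way."""
    failures = []
    for matrix, fill_value, channels, fragment in INVALID_CASES:
        try:
            validate_warp_args(matrix, fill_value, channels)
        except ValueError as err:
            if fragment not in str(err):
                failures.append(f"wrong message: {err}")
        else:
            failures.append(f"no error for fragment {fragment!r}")
    return failures
```

Checking the message fragment, not just the exception type, helps catch the case where an invalid combination is rejected by the wrong branch.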

stiepan · 2024-07-08T10:00:07Z

dali/operators/image/remap/cvcuda/warp_perspective.cc

+              "Transformation matrix data. Should be used to pass the GPU data. "
+              "For CPU data, the `matrix` argument should be used.")


The _data suffix and the word "data" in the description give me the impression that this is somehow a lower-level thing than the matrix argument.

Suggested change

-              "Transformation matrix data. Should be used to pass the GPU data. "
-              "For CPU data, the `matrix` argument should be used.")
+              "Like ``matrix`` keyword argument, but accepts matrices placed in GPU memory.")

stiepan · 2024-07-08T11:09:44Z

dali/operators/image/remap/cvcuda/warp_perspective.cc

+  case the matrix can be placed on the GPU.)doc",
+                           std::vector<float>({}), true, true)
+    .AddOptionalArg("border_mode",
+                    "Border mode to be used when accessing elements outside input image.",


I'd enumerate the supported values here. It's a string, so I guess the user will have no link to an enumeration.

stiepan · 2024-07-08T11:09:57Z

dali/operators/image/remap/cvcuda/warp_perspective.cc

+    .AddOptionalArg("border_mode",
+                    "Border mode to be used when accessing elements outside input image.",
+                    "constant")
+    .AddOptionalArg("interp_type", "Interpolation method.", "nearest")


stiepan · 2024-07-08T11:12:56Z

dali/operators/image/remap/cvcuda/warp_perspective.cc

+                           "Value used to fill areas that are outside the source image when the "
+                           "\"constant\" border_mode is chosen.",
+                           std::vector<float>({}))
+    .AddOptionalArg<bool>("inverse_map", "Inverse perspective transform matrix", false);


Suggested change

-    .AddOptionalArg<bool>("inverse_map", "Inverse perspective transform matrix", false);
+    .AddOptionalArg<bool>("inverse_map", "If set to false (default), the ``matrix`` is interpreted as...", false);

stiepan · 2024-07-08T11:18:26Z

dali/operators/image/remap/cvcuda/warp_perspective.cc

+      if (channels > 0) {
+        if (channels == static_cast<int>(fill_value_arg_.size())) {
+          float4 fill_value{0, 0, 0, 0};
+          memcpy(&fill_value, fill_value_arg_.data(), fill_value_arg_.size() * sizeof(float));


paranoid nitpick (due to narrowing cast in the channels check):

Suggested change

-          memcpy(&fill_value, fill_value_arg_.data(), fill_value_arg_.size() * sizeof(float));
+          memcpy(&fill_value, fill_value_arg_.data(), sizeof(float4));

Suggested change

-          memcpy(&fill_value, fill_value_arg_.data(), fill_value_arg_.size() * sizeof(float));
+          memcpy(&fill_value, fill_value_arg_.data(), sizeof(decltype(fill_value)));

stiepan · 2024-07-08T11:29:35Z

dali/operators/image/remap/cvcuda/warp_perspective.cc

+      auto width = std::max<int>(std::roundf(shape_arg_[1]), 1);
+      auto out_sample_shape = (channels != -1) ? TensorShape<>({height, width, channels}) :
+                                                 TensorShape<>({height, width});
+      output_shape = TensorListShape<>::make_uniform(input.num_samples(), out_sample_shape);


Isn't the size argument marked per-sample?
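If the size is indeed per-sample, the output shape would have to be computed for each sample instead of with `make_uniform`. A minimal sketch of that idea in plain Python, with hypothetical helper names; it mirrors the rounding and clamping visible in the diff but is not the operator's code.

```python
import math


def clamp_dim(value):
    """Round half up and clamp to at least 1, like max(roundf(v), 1)."""
    return max(int(math.floor(value + 0.5)), 1)


def output_shapes(sizes, channels=-1):
    """Compute one output shape per sample.

    sizes: one (height, width) pair per sample (hypothetical input format).
    channels: -1 means the data has no channel dimension.
    """
    shapes = []
    for height, width in sizes:
        shape = [clamp_dim(height), clamp_dim(width)]
        if channels != -1:
            shape.append(channels)
        shapes.append(tuple(shape))
    return shapes


shapes = output_shapes([(480.4, 640.6), (0.2, 10.0)], channels=3)
```

A uniform shape is then just the special case where every sample receives the same pair.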

stiepan · 2024-07-08T11:34:09Z

dali/operators/nvcvop/nvcvop.cc

+  for (int d = shape.size() - 2; d >= 0; --d) {
+    inBuf.strides[d] = shape[d + 1] * inBuf.strides[d + 1];
+  }
+  TensorLayout out_layout = layout.empty() ? tensor.GetLayout() : layout;


Should we validate that the size of a non-empty layout matches the number of dims?
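A sketch of what such a check could look like, written in plain Python for brevity. The function names are hypothetical, and the innermost stride being the element size is an assumption, since the diff hunk does not show how it is set.

```python
def packed_strides(shape, elem_size):
    """Row-major byte strides, following the loop in the diff.

    ASSUMPTION: the innermost stride equals the element size in bytes.
    """
    strides = [0] * len(shape)
    if shape:
        strides[-1] = elem_size
        for d in range(len(shape) - 2, -1, -1):
            strides[d] = shape[d + 1] * strides[d + 1]
    return strides


def resolve_layout(layout, tensor_layout, ndim):
    """Pick the explicit layout if given, else the tensor's own layout.

    A non-empty layout must name exactly one axis per dimension.
    """
    out_layout = tensor_layout if not layout else layout
    if out_layout and len(out_layout) != ndim:
        raise ValueError(
            f"layout '{out_layout}' has {len(out_layout)} dims, expected {ndim}")
    return out_layout


strides = packed_strides([480, 640, 3], 4)  # float32 HWC image
layout = resolve_layout("", "HWC", 3)
```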

Add warp_perspective operator

a0f9683

Signed-off-by: Rafal Banas <[email protected]>

banasraf force-pushed the add-warp-perspective-operator branch from eacdc59 to a0f9683 Compare June 28, 2024 10:57

github-advanced-security bot found potential problems Jun 28, 2024

dali-automaton assigned szkarpinski and stiepan Jul 1, 2024

mzient reviewed Jul 1, 2024

mzient reviewed Jul 2, 2024

szkarpinski reviewed Jul 8, 2024

stiepan reviewed Jul 8, 2024
