Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix multus to support CNI plugin which does not create interface #1095

Merged

Conversation

s1061123
Copy link
Member

This change fixes multus to support CNI plugin which does not create any interface and return empty result. Some CNI plugin may do network configuration change only and does not create any interface and return empty CNI result. Current multus assumes that CNI config always creates some interfaces hence above CNI plugin is out of assumption and multus may not work with such plugins.

@s1061123 s1061123 changed the title Fix multus to support CNI plugin which does not create interface [WIP]Fix multus to support CNI plugin which does not create interface May 17, 2023
@s1061123 s1061123 changed the title [WIP]Fix multus to support CNI plugin which does not create interface Fix multus to support CNI plugin which does not create interface May 17, 2023
} else {
kubeClient.Eventf(pod, v1.EventTypeNormal, "AddedInterface", "Add %s %v", rt.IfName, ips)
// check Interfaces and IPs because some CNI plugin just return empty result
if res.Interfaces != nil || res.IPs != nil {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

checking interfaces should be enough? In which case the interfaces is equal to nil and IPs not nil?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case of CNI 0.2.0, there is no 'interface' in result object, so we also need to double check IPs as well.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but what happens if you run a regular cni like bridge version 0.2.0 without IPAM?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

0.2.0 does require IPAM. Please see CNI 0.2.0 SPEC.

// check Interfaces and IPs because some CNI plugin does not create any interface
// and just returns empty result
if res.Interfaces == nil && res.IPs == nil {
continue

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it will skip the CreateNetworkStatus of those type of plugins

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I expect that CNI plugin without interfaces and IPs, so I suppose we don't need to create network status for that because this CNI plugin does not create any network interface.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but it is a way to quickly know that the cni was run, That function will skip interfaces, ips and routes if they are nil, only name it's printed (and dns empty).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed.

@@ -636,6 +639,17 @@ func CmdAdd(args *skel.CmdArgs, exec invoke.Exec, kubeClient *k8s.ClientInfo) (c
return nil, cmdPluginErr(k8sArgs, netName, "error adding container to network %q: %v", netName, err)
}

res, err := cni100.NewResultFromResult(tmpResult)
if err != nil {
logging.Errorf("CmdAdd: failed to read result: %v, but proceed", err)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you don't return here, the res below is nil and it will panic

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This err is only guarantee that result cannot be converted and still now we could have result. So I believe that we still continue to process. It does not panic.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand, if there is an error converting, that function returns
if err != nil {
return nil, err
}

and res will be equal to nil. Below you use res.Interfaces and res could be nil

What am I missing?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cni100.NewResultFromResult(tmpResult) is just convert result to cni100. Let's imagine that tmpResult has valid CNI result, but failed to convert cni100. In such case, multus need to return tmpResult as valid result because NewResultFromResult() failure is just failed to convert 1.0.0.

Do you guarantee that all CNI version result object can be converted 1.0.0 forever?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand that.

But let me ask you something, what is the value of "res" if the function cni100.NewResultFromResult(tmpResult) returns an error?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no formal answer for that because there is no description in SPEC. Actual code is the answer. Please look the code. And keep in mind that it may be changed (on any purpose).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps we need another pair of eyes, @dougbtv, I'm claiming that the function cni100.NewResultFromResult will return (nil, err) if there is an error with convert. Therefore, with this "res, err := cni100.NewResultFromResult(tmpResult)", res will be equal to "nil" and then the code below will panic "if res.Interfaces != nil || res.IPs != nil"

func NewResultFromResult(result types.Result) (*Result, error) {
newResult, err := convert.Convert(result, ImplementedSpecVersion)
if err != nil {
return nil, err
}
return newResult.(*Result), nil
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see both sides. I think Tomo is trying to future proof based on the fact that this could change, and isn't codified by the spec.

Let's handle this as a follow-on, if we can get a unit test to show this happening, then let's look at it again.

@s1061123 s1061123 force-pushed the fix/plugin-without-interface branch from c507ffb to 8b0de1b Compare May 17, 2023 16:41
@coveralls
Copy link

coveralls commented May 17, 2023

Coverage Status

Changes unknown
when pulling 8b63421 on s1061123:fix/plugin-without-interface
into ** on k8snetworkplumbingwg:master**.

@s1061123 s1061123 force-pushed the fix/plugin-without-interface branch from 8b0de1b to 8b63421 Compare May 17, 2023 16:46
// check Interfaces and IPs because some CNI plugin just return empty result
if res.Interfaces != nil || res.IPs != nil {
// send kubernetes events
if delegate.Name != "" {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for covering this erroneous output in the current version

This change fixes multus to support CNI plugin which does not
create any interface and return empty result. Some CNI plugin
may do network configuration change only and does not create
any interface and return empty CNI result. Current multus assumes
that CNI config always creates some interfaces hence above CNI
plugin is out of assumption and multus may not work with such
plugins.
Copy link

@andreaskaris andreaskaris left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the sake of it, can we still add proper error handling to netutils.DeleteDefaultGw?

Because the DeleteDefaultGW method is still in itself erroneous. As in, I can write a unit test for that method alone and the unit test will fail:

// DeleteDefaultGW removes the default gateway from marked interfaces.
func DeleteDefaultGW(netnsPath string, ifName string) error {
	netns, err := ns.GetNS(netnsPath)
	if err != nil {
		return logging.Errorf("DeleteDefaultGW: Error getting namespace %v", err)
	}
	defer netns.Close()

	err = netns.Do(func(_ ns.NetNS) error {
		var err error
		link, _ := netlink.LinkByName(ifName)
		routes, _ := netlink.RouteList(link, netlink.FAMILY_ALL)
		for _, nlroute := range routes {
			if nlroute.Dst == nil {
				err = netlink.RouteDel(&nlroute)
			}
		}
		return err
	})
	return err
}

I don't understand why there's no error handling here:
https://github.com/k8snetworkplumbingwg/multus-cni/blob/master/pkg/netutils/netutils.go#L41

But in the SetDefaultGW method, there's error handling for exactly the same call:
https://github.com/k8snetworkplumbingwg/multus-cni/blob/master/pkg/netutils/netutils.go#L67

I don't understand the logic that's higher up the callstack, but if you must ignore the error, then I'd throw it away further up the stack, but I wouldn't do the following unless that's really intended to delete whatever default route if the link doesn't exists (and even if it's intended and behaves as designed, I think you should add a comment why). Also, the range routes expression throws away information

	err = netns.Do(func(_ ns.NetNS) error {
		var err error
		link, _ := netlink.LinkByName(ifName)
		routes, _ := netlink.RouteList(link, netlink.FAMILY_ALL)
		for _, nlroute := range routes {
			if nlroute.Dst == nil {
				err = netlink.RouteDel(&nlroute)
			}
		}
		return err
	})

Should be:

apierrors "k8s.io/apimachinery/pkg/util/errors"
(...)
	err = netns.Do(func(_ ns.NetNS) error {
		link, err := netlink.LinkByName(ifName)
                if err != nil {
                   return err
                }
		routes, err := netlink.RouteList(link, netlink.FAMILY_ALL)
                if err != nil {
                   return err
                }
                var errors []error
		for _, nlroute := range routes {
			if nlroute.Dst == nil {
				if err := netlink.RouteDel(&nlroute); err != nil {
                                   errors = append(errors, err)
                                }
			}
		}
		return apierrors.NewAggregate(errors)
	})

I hope that makes sense

return nil, cmdErr(k8sArgs, "error deleting default gateway in cache: %v", err)
// check Interfaces and IPs because some CNI plugin does not create any interface
// and just returns empty result
if res != nil && (res.Interfaces != nil || res.IPs != nil) {
Copy link

@andreaskaris andreaskaris May 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about moving the block's content to a helper function to make it easier to read? Or is that too convoluted?

if res != nil &&  (res.Interfaces != nil || res.IPs != nil) {
     if err := configureDefaultGateway(delegate, ifName, args, netName, rt, n, k8sArgs); err != nil {
           return nil, err
     }
}
func configureDefaultGateway(delegate *types.DelegateNetConf, ifName string, args *skel.CmdArgs, netName string,
        rt *libcni.RuntimeConf, n *types.NetConf, k8sArgs *types.K8sArgs) error {
        // Remove gateway from routing table if the gateway is not used
        deleteV4gateway := false
        deleteV6gateway := false
        adddefaultgateway := false
        if delegate.IsFilterV4Gateway {
                deleteV4gateway = true
                logging.Debugf("Marked interface %v for v4 gateway deletion", ifName)
        } else {
                // Otherwise, determine if this interface now gets our default route.
                // According to
                // https://docs.google.com/document/d/1Ny03h6IDVy_e_vmElOqR7UdTPAG_RNydhVE1Kx54kFQ (4.1.2.1.9)
                // the list can be empty; if it is, we'll assume the CNI's config for the default gateway holds,
                // else we'll update the defaultgateway to the one specified.
                if delegate.GatewayRequest != nil && len(*delegate.GatewayRequest) != 0 {
                        deleteV4gateway = true
                        adddefaultgateway = true
                        logging.Debugf("Detected gateway override on interface %v to %v", ifName, delegate.GatewayRequest)
                }
        }

        if delegate.IsFilterV6Gateway {
                deleteV6gateway = true
                logging.Debugf("Marked interface %v for v6 gateway deletion", ifName)
        } else {
                // Otherwise, determine if this interface now gets our default route.
                // According to
                // https://docs.google.com/document/d/1Ny03h6IDVy_e_vmElOqR7UdTPAG_RNydhVE1Kx54kFQ (4.1.2.1.9)
                // the list can be empty; if it is, we'll assume the CNI's config for the default gateway holds,
                // else we'll update the defaultgateway to the one specified.
                if delegate.GatewayRequest != nil && len(*delegate.GatewayRequest) != 0 {
                        deleteV6gateway = true
                        adddefaultgateway = true
                        logging.Debugf("Detected gateway override on interface %v to %v", ifName, delegate.GatewayRequest)
                }
        }

        // Remove gateway if `default-route` network selection is specified
        if deleteV4gateway || deleteV6gateway {
                logging.Debugf("akaris: CmdAdd: deleteV4gateway|deleteV6gateway: %t, %t, %v, %v, %v",
                        deleteV4gateway, deleteV6gateway, args.Netns, ifName, netName)
                if err := netutils.DeleteDefaultGW(args.Netns, ifName); err != nil {
                        return cmdErr(k8sArgs, "error deleting default gateway: %v", err)
                }
                if err := netutils.DeleteDefaultGWCache(n.CNIDir, rt, netName, ifName, deleteV4gateway, deleteV6gateway); err != nil {
                        return cmdErr(k8sArgs, "error deleting default gateway in cache: %v", err)
                }
        }

        // Here we'll set the default gateway which specified in `default-route` network selection
        if adddefaultgateway {
                logging.Debugf("akaris: CmdAdd: adddefaultgateway=true, %v, %v, %v", args.Netns, ifName, *delegate.GatewayRequest)
                if err := netutils.SetDefaultGW(args.Netns, ifName, *delegate.GatewayRequest); err != nil {
                        return cmdErr(k8sArgs, "error setting default gateway: %v", err)
                }
                if err := netutils.AddDefaultGWCache(n.CNIDir, rt, netName, ifName, *delegate.GatewayRequest); err != nil {
                        return cmdErr(k8sArgs, "error setting default gateway in cache: %v", err)
                }
        }
        return nil
}

@dougbtv
Copy link
Member

dougbtv commented May 18, 2023

Andreas -- thanks for the detailed input/review.

I think I'm OK with fixing this kind of at the handler level for now, I like how you dug a layer deeper. And agreed it sure looks like it's inconsistent error handling.

I do like how this exposes what's happening with a non-interface creating CNI plugin at the handler level, but that doesn't mean we shouldn't fix what's under the surface, so I've filed that in #1098 and I think it's worth looking at as well as a follow up.

@dougbtv dougbtv merged commit f037656 into k8snetworkplumbingwg:master May 18, 2023
24 checks passed
@s1061123 s1061123 deleted the fix/plugin-without-interface branch July 13, 2023 05:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants