Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrate from TPR to CRD for the Kubernetes backend #491

Merged
merged 1 commit into from
Aug 16, 2017

Conversation

gunjan5
Copy link
Contributor

@gunjan5 gunjan5 commented Aug 8, 2017

Description

Move to CustomResourceDefination from ThirdPartyResource for KDD backend

Todos

  • Tests
  • Documentation
  • Release note

Release Note

Migrate from TPR to CRD for the Kubernetes backend.
Users upgrading from Calico v2.4.x or older to v2.5.x or higher with Kubernetes datastore backend must follow the one-time configuration migration task to upgrade the cluster: https://github.com/projectcalico/calico/blob/master/upgrade/v2.5/README.md

@caseydavenport caseydavenport added this to the Calico v2.5.0 milestone Aug 8, 2017
@caseydavenport caseydavenport self-requested a review August 8, 2017 06:40
@gunjan5 gunjan5 self-assigned this Aug 8, 2017
@gunjan5 gunjan5 force-pushed the tpr-crd branch 2 times, most recently from 3b81e59 to 208d57e Compare August 9, 2017 21:04
@@ -0,0 +1,69 @@
kind: List
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just for the tests, right?

I think this should be in a subdirectory so it's out of the way.

e.g. test/crds.yaml

(and a test/README.md)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup, just for the tests. Will move it to tests/

Copy link
Member

@caseydavenport caseydavenport left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First pass looks fine, just some minor stuff.

I'll give it a closer look once it's done.

Spec GlobalBgpConfigSpec `json:"spec"`
type GlobalBgpConfigSpec struct {
// The reason we have Name field in Spec is because k8s metadata
// name field makes the string lowercase, so Name field in Spec is
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rather than "make the string lowercase", "requires the string to be lowercase" is more accurate :)

// EnsureInitialized performs client initialization required for a specific
// Custom K8s Resource type.
// EnsureInitialized is no-op since CRD should be initialized
// by applying the manifest.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what manifest? Someone reading the code might not know.

I'd just say "initialized in advance"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, nits:

  • "is a no-op" not "is no-op"
  • "the CRD" not "CRD"

return K8sErrorToCalico(err, res)
}
}
// no-op, should be created by applying the manifest.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You said this two lines earlier, so can delete this one.

Key: model.IPPoolKey{CIDR: v.CIDR},
Value: &v,
Key: model.IPPoolKey{CIDR: *cidr},
Value: &model.IPPool{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why don't we use the conversion methods that already exist for this?

See how it's done for example in BGP peers: https://github.com/projectcalico/libcalico-go/blob/master/lib/backend/k8s/resources/globalbgppeer.go#L99

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't have any common converter methods for IPPool, I can take the one in client/ and move it to converter/ and use it here

@gunjan5 gunjan5 force-pushed the tpr-crd branch 2 times, most recently from c784a85 to bcd0d8c Compare August 11, 2017 18:27
@gunjan5 gunjan5 closed this Aug 11, 2017
@gunjan5 gunjan5 reopened this Aug 11, 2017
@gunjan5 gunjan5 closed this Aug 11, 2017
@gunjan5 gunjan5 reopened this Aug 11, 2017
@gunjan5 gunjan5 force-pushed the tpr-crd branch 4 times, most recently from d6b9f67 to f596581 Compare August 11, 2017 21:10
Copy link
Member

@caseydavenport caseydavenport left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking pretty good. A few things we need to sort out.

@@ -22,7 +22,7 @@ import (
"k8s.io/apimachinery/pkg/runtime/schema"
)

// SystemNetworkPolicy is the ThirdPartyResource definition of a Calico Policy resource in
// SystemNetworkPolicy is the CustomResourceDefinition of a Calico Policy resource in
// the Kubernetes API.
type SystemNetworkPolicy struct {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe we want to change this to GlobalNetworkPolicy instead.

@@ -1196,7 +1189,7 @@ var _ = Describe("Test Syncer API for Kubernetes backend", func() {
CIDR: *cidr,
IPIPInterface: "tunl0",
Masquerade: true,
IPAM: true,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because you can't have both IPAM and Disabled true at the same time. They're ! of each other in the conversion code.

Body(resIn).
Name(name).
Do().Into(resOut)
if updateError == nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic is a bit different from what was here before.

If providedRV != "", then we don't want to retry on failure - we should just return right away.

We should only retry when:

  • No Revision was provided on the KVP, AND
  • The error we got back was due to a CAS error.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haven't looked elsewhere yet, but make sure we add a UT for this if there isn't one already.

// Attempt to Create and do an Update if the resource already exists.
// We only log debug here since the Create and Update will also log.
// Can't set Revision while creating a resource.
updated, err := c.Create(&model.KVPair{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need this change?

Does "Create" fail if a revision was provided? If so, what code is providing a revision?

Perhaps we want to skip "Create" if the revision is set?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Create fails if revision is provided. I haven't spotted a code that provides revision for create (but nothing for update either).

Perhaps we want to skip "Create" if the revision is set?

hmm, that sounds good in theory, but I wonder if there's any client that's setting something as the revision even for create and doing this will fail (but it's the right thing)

resource: GlobalBgpConfigResourceName,
description: "Calico Global BGP Configuration",
k8sResourceType: reflect.TypeOf(thirdparty.GlobalBgpConfig{}),
k8sListType: reflect.TypeOf(thirdparty.GlobalBgpConfigList{}),
k8sResourceType: reflect.TypeOf(custom.GlobalBgpConfig{}),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be GlobalBGPConfig now?

Metadata: metav1.ObjectMeta{
Name: name,
crd := custom.GlobalBgpConfig{
TypeMeta: metav1.TypeMeta{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't fully understand why we needed to add TypeMeta for GlobalBgpConfig but not for other resources. This needs more investigation.

@@ -132,70 +133,21 @@ func (h *ipPools) convertMetadataToListInterface(m unversioned.ResourceMetadata)
// convertMetadataToKey converts an IPPoolMetadata to an IPPoolKey
// This is part of the conversionHelper interface.
func (h *ipPools) convertMetadataToKey(m unversioned.ResourceMetadata) (model.Key, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(h *ipPools) - any idea why this is called h?

:p

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i believe historically it was called a "helper" (not a fan of that name), and they's why the h has been carried with it

@@ -173,7 +173,7 @@ func (c cb) ExpectExists(updates []api.Update) {
})

// Expect the key to have existed.
Expect(matches).To(Equal(true), fmt.Sprintf("Expected update not found: %s", update.Key))
// Expect(matches).To(Equal(true), fmt.Sprintf("Expected update not found: %s", update.Key))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sure you undo this change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup, that's why the commit message is "remove me" 😄

@gunjan5 gunjan5 changed the title [WIP] Migrate from TPR to CRD for the Kubernetes backend Migrate from TPR to CRD for the Kubernetes backend Aug 15, 2017
@gunjan5 gunjan5 assigned caseydavenport and unassigned gunjan5 Aug 15, 2017
SystemNetworkPolicyNamePrefix = "snp.projectcalico.org/"
GlobalNetworkPolicyResourceName = "GlobalNetworkPolicies"
GlobalNetworkPolicyCRDName = "globalnetworkpolicies.crd.projectcalico.org"
GlobalNetworkPolicyNamePrefix = "crd.projectcalico.org/"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think what we want to do is remove the GlobalNetworkPolicyNamePrefix altogether.

Here's what I'm thinking so far, and have discussed a bit with @robbrockbank

A NetworkPolicy with name "foo" should show up like this in calicoctl get policies:

  • knp.default.foo

A GlobalNetworkPolicy with name "foo" should show up like this in calicoctl get policies:

  • foo

@gunjan5 gunjan5 force-pushed the tpr-crd branch 4 times, most recently from da3a31a to ccbcfb3 Compare August 15, 2017 16:56
/hyperkube controller-manager \
--master=127.0.0.1:8080 \
--min-resync-period=3m \
--allocate-node-cidrs=true \
--cluster-cidr=10.10.0.0/16 \
--v=5

# Create CustomResourceDefinition (CRD) for Calico resources
# from the manifest crds.yaml
docker run \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should simplify this by mounting in the manifests directory into the st-apiserver container and just calling out to kubectl from within.

This is nice because it means fewer images to pull when running the tests.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gunjan5 you missed this one. Can you open a second PR to do this ASAP so we don't block this PR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah will do

@gunjan5 gunjan5 force-pushed the tpr-crd branch 2 times, most recently from cc0338c to 67a551d Compare August 15, 2017 17:31
// GlobalBgpPeerList is a list of Calico Global BGP Peer resources.
type GlobalBgpPeerList struct {
type BGPPeerSpec struct {
Spec api.BGPPeerSpec
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you need this to be an anonymous, inline struct. You don't want a Spec.Spec

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but then in the conversion code, we will have to assign individual fields from api Spec when we do ToKVPair, which will need an update every time we change the spec. https://github.com/projectcalico/libcalico-go/pull/491/files#diff-dd846d0ae3114b5d3ffe01b897c4e3c4R94


Spec IpPoolSpec `json:"spec"`
type IPPoolSpec struct {
Spec api.IPPoolSpec
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here - this should be anonymous.

@@ -325,8 +251,8 @@ func (c *KubeClient) Syncer(callbacks api.SyncerCallbacks) api.Syncer {
func (c *KubeClient) Create(d *model.KVPair) (*model.KVPair, error) {
log.Debugf("Performing 'Create' for %+v", d)
switch d.Key.(type) {
case model.GlobalConfigKey:
return c.globalConfigClient.Create(d)
case model.GlobalFelixConfigKey:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops, I don't think we want to rename the model.GlobalConfig stuff. That's a breaking change that requires corresponding Felix changes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh yes, find/replace mistake, ill fix it

@@ -704,7 +630,7 @@ func (c *KubeClient) getPolicy(k model.PolicyKey) (*model.KVPair, error) {
}

// Check to see if this is backed by a NetworkPolicy or a Namespace.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this isn't yours, but this comment is wrong now.

// This is not backed by a Kubernetes NetworkPolicy.
return "", "", fmt.Errorf("Policy %s not backed by a NetworkPolicy", name)
}

splits := strings.SplitN(strings.TrimPrefix(name, "np.projectcalico.org/"), ".", 2)
splits := strings.SplitN(strings.TrimPrefix(name, "knp.default."), ".", 2)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, this logic is actually already wrong. If there's a namespace or a NetworkPolicy with a . in it, this will break.

Some examples:

  • namespace="casey.davenport" -> casey.davenport.mypolicy
  • policy="my.policy" -> default.my.policy

We should add UTs for these and then fix up the code, but let's do it in a separate PR immediately after this one since it's not a regression caused by this PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can't we just use / as the separator for now since that is not a valid char for the namespace.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't think that fixes the problem I'm describing above at least.

kvp1a := &model.KVPair{
Key: model.PolicyKey{Name: kvp1Name},
Value: &calicoAllowPolicyModel,
}
kvp1b := &model.KVPair{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did we need to move these around?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because we use that one for an Update operation which needs a revision number which we only get after the previous Create operation, so I moved it after the Create to insert the Revision, and also because we should keep the variables close to where they're used

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or an Update operation which needs a revision number

I don't think that's true, now that Update does a Get first. Unless there's another reason I think we should keep these tests as they were, or at least remove the revision # so that we're not fundamentally changing the tests.

We can add another test that sets the revision on Update.

// Update the revision information from the response.
kvp.Revision = resOut.GetObjectMeta().GetResourceVersion()
return kvp, nil
// Failed to update the resource after 5 retries.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment isn't correct - it's either after 5 retries, or if an RV was passed.

Metadata: metav1.ObjectMeta{
Name: name,
crd := custom.GlobalBGPConfig{
TypeMeta: metav1.TypeMeta{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we ever answered why this is needed, @gunjan5. Any thoughts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

everything looks identical, I will play around with it a bit to see

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gunjan5 ping

@@ -123,9 +118,4 @@ var _ = Describe("Global BGP conversion methods", func() {
Expect(kvp.Value).To(BeAssignableToTypeOf(&model.BGPPeer{}))
Expect(kvp.Value).To(Equal(kvp1.Value))
})

It("should fail to convert an invalid Kuberenetes resource", func() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you delete this test case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because it was essentially testing ResourceNameToIP(t.Metadata.Name) in the conversion code https://github.com/projectcalico/libcalico-go/pull/491/files#diff-dd846d0ae3114b5d3ffe01b897c4e3c4L86 which is not needed now since we include the peerIP in the Spec

@@ -155,18 +152,13 @@ var _ = Describe("System Network Policies conversion methods", func() {
Expect(k).To(Equal(key2))
})

It("should fail to convert an invalid Key to a resource name", func() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And this one?

Copy link
Contributor Author

@gunjan5 gunjan5 Aug 15, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it was testing an invalid key based on it not having the prefix, which doesn't exist anymore

@@ -134,14 +123,4 @@ var _ = Describe("IP Pool conversion methods", func() {
Expect(kvp.Value).To(BeAssignableToTypeOf(&model.IPPool{}))
Expect(kvp.Value).To(Equal(kvp1.Value))
})

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And these

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above, we added CIDR to the Spec so we don't need the conversion this test was testing

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure that's true. We added it to the spec, but we're not changing the logic around the name at all.

@@ -28,8 +28,29 @@ import (

// IPToResourceName converts an IP address to a name used for a k8s resource.
func IPToResourceName(ip net.IP) string {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there are some corresponding changes / cleanup we can do here: https://github.com/projectcalico/libcalico-go/pull/491/files#diff-711892c034cb78da74e3f90f690c86b0R113

@gunjan5 gunjan5 force-pushed the tpr-crd branch 3 times, most recently from c39e445 to 31ecf7d Compare August 15, 2017 22:44
// GlobalBgpPeerList is a list of Calico Global BGP Peer resources.
type GlobalBgpPeerList struct {
type BGPPeerSpec struct {
BGPPeerSpec api.BGPPeerSpec
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is still not right.

It should be:

type BGPPeerSpec struct {
    api.BGPPeerSpec
    Scope  scope.Scope
    Node   string
    PeerIP net.IP
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Scope field is not required in the Spec - it was required in the Metadata to allow us to do queries for "get me the node peers". It can be inferred from whether the Node field is set or not.

By("Creating a System Network Policy", func() {
_, err := c.snpClient.Create(kvp1a)
By("Creating a Global Network Policy", func() {
var err error
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't think we need this.

By("Applying a non-existent System Network Policy", func() {
_, err := c.snpClient.Apply(kvp2a)
By("Applying a non-existent Global Network Policy", func() {
var err error
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't need this var err error

@@ -700,7 +677,8 @@ var _ = Describe("Test Syncer API for Kubernetes backend", func() {
}()

By("Creating a Global BGP Peer", func() {
_, err := c.Create(kvp1a)
var err error
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't need this

Expect(kvps[01].Value).To(Equal(kvp2b.Value))
keys := []model.Key{}
vals := []interface{}{}
for _, k := range kvps {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be better if we used a map so we can assert that a specific value exists for a given key.

Copy link
Member

@caseydavenport caseydavenport left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Squash 'em

@caseydavenport caseydavenport merged commit 25a8c37 into projectcalico:master Aug 16, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants