Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

v1.15 backports 2024-05-08 #32432

Merged
merged 5 commits into from
May 9, 2024

Conversation

christarazi
Copy link
Member

@christarazi christarazi commented May 9, 2024

Once this PR is merged, a GitHub action will update the labels of these PRs:

31605

[ upstream commit 03d8e73 ]

K8sEventHandover is no longer a flag so remove any remaining references
to it.

Signed-off-by: Chris Tarazi <chris@isovalent.com>
@christarazi christarazi requested review from a team as code owners May 9, 2024 03:02
@christarazi christarazi added the kind/backports This PR provides functionality previously merged into master. label May 9, 2024
@christarazi christarazi added backport/1.15 This PR represents a backport for Cilium 1.15.x of a PR that was merged to main. release-blocker/1.15 This issue will prevent the release of the next version of Cilium. labels May 9, 2024
@christarazi
Copy link
Member Author

/test-backport-1.15

@nebril nebril force-pushed the pr/v1.15-backport-2024-05-08 branch from 46065cc to dd551d5 Compare May 9, 2024 10:28
@nebril
Copy link
Member

nebril commented May 9, 2024

/test-backport-1.15

[ upstream commit b17a9c6 ]

Refactor the Kubernetes-fetched metadata into a consolidated struct type
for ease of adding a new type without increasing the function signatures
of the relevant functions.

Signed-off-by: Chris Tarazi <chris@isovalent.com>
[ upstream commit aa628ad ]

This commit causes the Kubernetes Pod watcher to trigger endpoint
identity updates if the Pod UID changes. This covers the third scenario
pointed out in [1], see issue for more details.

To summarize, with StatefulSet restarts, it is possible for the
apiserver to coalesce Pod events where the delete and add event are
replaced with a single update event that includes, for example a label
change. If Cilium indexes based on namespace/name of the pod, then it
will miss this update event and therefore, Cilium must use the UID of
the pod to disambiguate.

[1]: cilium#30409

Signed-off-by: Chris Tarazi <chris@isovalent.com>
[ upstream commit f6606c6 ]

Building off of the previous commit, This commit modifies if the CNI ADD
handling logic to pass the Kubernetes Pod UID in the ADD request to the
Agent. In addition, a extra condition checks whether the UID of the pod
corresponding to the CNI ADD request is the same UID which Cilium is
aware of (within its pod informer store) during endpoint creation. This
is key for handling StatefulSets properly, see [1].

If the UID is outdated, this means that Cilium has not yet received the
updated pod event that corresponds with the CNI ADD. In this case,
Cilium will attempt to query the pod store up for to 2 seconds. If the
pod store does not have the altest pod reference after 2 seconds, Cilium
will trigger a direct fetch from the apiserver, as to avoid a
potentially large window of time where an endpoint cannot be created. If
the direct fetch fails, the endpoint will be created with the 'init'
identity until the metadata resolver (controller) is able to fetch the
latest pod metadata.

The reason for these changes is due to the way StatefulSets are
implemented in Kubernetes. One significant property of StatefulSets is
that they have a fixed name. When a StatefulSet is restarted, Cilium is
unable to properly handle it because Cilium indexes based on
namespace/name[1]. Therefore, we must pass a UID through the CNI ADD in
order to ensure that when Cilium creates a new endpoint, it is created
with the most up-to-date corresponding Kubernetes metadata.

[1]: cilium#30409, see case 1.

Signed-off-by: Chris Tarazi <chris@isovalent.com>
[ upstream commit b048dbd ]

Previously, the logic in the pod watcher did not properly check for
changes in a pod's IPs, incorrectly relying on the length of the IP
strings changing, rather than the contents. This commit fixes that while
also simplifying the logic into a simpler one-liner.

Reported-by: Casey Callendrello <cdc@isovalent.com>
Signed-off-by: Chris Tarazi <chris@isovalent.com>
@nebril nebril force-pushed the pr/v1.15-backport-2024-05-08 branch from dd551d5 to ff21963 Compare May 9, 2024 10:40
@nebril
Copy link
Member

nebril commented May 9, 2024

/test-backport-1.15

Copy link
Member

@nathanjsweet nathanjsweet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM for API

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label May 9, 2024
@nathanjsweet nathanjsweet merged commit 67e80e3 into cilium:v1.15 May 9, 2024
59 checks passed
@christarazi christarazi deleted the pr/v1.15-backport-2024-05-08 branch May 9, 2024 15:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport/1.15 This PR represents a backport for Cilium 1.15.x of a PR that was merged to main. kind/backports This PR provides functionality previously merged into master. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-blocker/1.15 This issue will prevent the release of the next version of Cilium.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants