cmd, watchers: Populate ipcache in case of high-scale ipcache #31848

Merged: 3 commits into cilium:main from hs-ipcache-repopulated-ipcache, Apr 16, 2024

Conversation

pchaigno
Member

@pchaigno pchaigno commented Apr 8, 2024

The first commit contains the main changes (description below). The second commit fixes our end-to-end test and the third re-enables it.

The high-scale ipcache feature aims to enable policy enforcement across very large Kubernetes environments made up of many clusters that are not using Cilium Clustermesh.

The main issue with Cilium Clustermesh at that scale is that the ipcache on each node would need to be populated with all remote pods, and those entries would need to be constantly updated. It doesn't scale. Instead, the high-scale ipcache feature removes all pods from the ipcache and relies on other mechanisms to enforce policies.

This comes with a significant drawback. Ipcache being such a central component of Cilium, without ipcache entries for pods, many features (egress gateway, IPsec, etc.) cannot work anymore. For that reason, high-scale ipcache is today incompatible with those features.

This commit implements an intermediate solution found by Hemanth. Since the scalability concern comes from the very large number of remote clusters, we only need to keep pods from those clusters out of the ipcache. Hence, the ipcache can keep entries for all of the local cluster's pods; since we are not using Cilium Clustermesh, entries for other clusters will never be added.

As a consequence, all advanced features will keep working for the local cluster. Features that rely on the ipcache will of course not work for remote clusters, but that was never the goal of high-scale ipcache anyway.
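
To make the invariant described above concrete, here is a minimal Go sketch. It is not Cilium's code: `PodEntry`, `IPCache`, `NewIPCache`, `Upsert`, and `Lookup` are hypothetical names invented for illustration, and the explicit cluster check is an assumption added only to make the rule visible. In the actual change no such filter is needed, because without Clustermesh remote pod entries are never offered to the ipcache in the first place.

```go
package main

import "fmt"

// PodEntry is a hypothetical, simplified record for one pod IP.
type PodEntry struct {
	IP        string
	ClusterID uint32
	Identity  uint32 // security identity associated with the pod
}

// IPCache is a toy IP-to-identity map, not Cilium's real ipcache.
type IPCache struct {
	localClusterID uint32
	entries        map[string]uint32
}

// NewIPCache creates an empty cache bound to the local cluster ID.
func NewIPCache(localClusterID uint32) *IPCache {
	return &IPCache{
		localClusterID: localClusterID,
		entries:        make(map[string]uint32),
	}
}

// Upsert stores the entry only when the pod belongs to the local cluster.
// It reports whether the entry was stored.
func (c *IPCache) Upsert(e PodEntry) bool {
	if e.ClusterID != c.localClusterID {
		// Remote pods stay out of the cache; this is what keeps its size
		// bounded by the local cluster rather than the whole environment.
		return false
	}
	c.entries[e.IP] = e.Identity
	return true
}

// Lookup returns the identity stored for an IP, if any.
func (c *IPCache) Lookup(ip string) (uint32, bool) {
	id, ok := c.entries[ip]
	return id, ok
}

func main() {
	cache := NewIPCache(1)

	// A local pod is stored, so ipcache-dependent features can resolve it.
	cache.Upsert(PodEntry{IP: "10.0.0.5", ClusterID: 1, Identity: 1234})

	// A remote pod is never stored, so those features cannot resolve it.
	cache.Upsert(PodEntry{IP: "10.1.0.7", ClusterID: 2, Identity: 5678})

	_, localFound := cache.Lookup("10.0.0.5")
	_, remoteFound := cache.Lookup("10.1.0.7")
	fmt.Println("local pod found:", localFound)
	fmt.Println("remote pod found:", remoteFound)
}
```

In this sketch, a lookup for the local pod succeeds and a lookup for the remote pod fails, which mirrors the trade-off in the description: ipcache-dependent features keep working within the local cluster and remain unavailable for remote clusters.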

@pchaigno pchaigno added release-note/minor This PR changes functionality that users may find relevant to operating Cilium. feature/high-scale-ipcache Relates to the high-scale ipcache feature. labels Apr 8, 2024
@pchaigno pchaigno force-pushed the hs-ipcache-repopulated-ipcache branch 7 times, most recently from 10039ff to 2dc2d5f Compare April 10, 2024 09:56
The high-scale ipcache feature aims to enable policy enforcement across
very large Kubernetes environments comprised of many clusters which are
not using Cilium Clustermesh.

The main issue with Cilium Clustermesh at that scale is that the ipcache
on each node would need to be populated with all remote pods, and those
entries would need to be constantly updated. It doesn't scale. Instead,
the high-scale ipcache feature removes all pods from the ipcache and
relies on other mechanisms to enforce policies.

This comes with a significant drawback. Ipcache being such a central
component of Cilium, without ipcache entries for pods, many features
(egress gateway, IPsec, etc.) cannot work anymore. For that reason,
high-scale ipcache is today incompatible with those features.

This commit implements an intermediate solution found by Hemanth. Since
the scalability concern comes from the very large number of remote
clusters, we only need to keep pods from those clusters out of the
ipcache. Hence, the ipcache can keep all pod entries for the local
cluster's pods; since we are not using Cilium Clustermesh, entries for
other clusters will never be added.

As a consequence, all advanced features will keep working for the local
cluster. Features that rely on the ipcache will of course not work for
remote clusters, but that was never the goal of high-scale ipcache
anyway.

Suggested-by: Hemanth Malla <hemanth.malla@datadoghq.com>
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Commit 19e5fd2 ("datapath: hs-ipcache needs NodePort DSR with Geneve
dispatch") made DSR with Geneve dispatch a requirement for high-scale
ipcache mode when running with the kube-proxy replacement. However, it
didn't update the related test because that test was quarantined.

Fixes: 19e5fd2 ("datapath: hs-ipcache needs NodePort DSR with Geneve dispatch")
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
This test was quarantined because of a flake, but we can't debug it
because we don't have a way to collect data from quarantined tests.
The previous commit fixed some issues with the test, so let's
unquarantine it to be able to debug further if needed.

Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
@pchaigno pchaigno force-pushed the hs-ipcache-repopulated-ipcache branch from 2dc2d5f to dc9e71e Compare April 10, 2024 10:00
@pchaigno
Member Author

/test

@pchaigno pchaigno marked this pull request as ready for review April 10, 2024 10:07
@pchaigno pchaigno requested review from a team as code owners April 10, 2024 10:07
@pchaigno pchaigno added release-note/misc This PR makes changes that have no direct user impact. and removed release-note/minor This PR changes functionality that users may find relevant to operating Cilium. labels Apr 11, 2024
@pchaigno pchaigno enabled auto-merge April 11, 2024 08:39
@pchaigno pchaigno added this pull request to the merge queue Apr 16, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Apr 16, 2024
Merged via the queue into cilium:main with commit b8662a2 Apr 16, 2024
61 checks passed
@pchaigno pchaigno deleted the hs-ipcache-repopulated-ipcache branch April 16, 2024 04:36
Labels
feature/high-scale-ipcache Relates to the high-scale ipcache feature. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/misc This PR makes changes that have no direct user impact.

4 participants