Advanced audit as tech preview in origin #16128

soltysh · 2017-09-04T12:48:03Z

@sttts this enables the advanced auditing features in origin, ptal

@openshift/api-review for config changes

mfojtik · 2017-09-05T11:56:35Z

/approve

soltysh · 2017-09-19T11:13:12Z

@sttts mind double checking if we need more upstream PRs? I think I got the most important ones.

soltysh · 2017-09-19T11:14:20Z

pkg/cmd/server/kubernetes/master/master_config.go

 		}
 	}
 	// TODO: this should be done in config validation (along with the above) so we can provide
 	// proper errors
-	if err := cmdflags.Resolve(masterConfig.KubernetesMasterConfig.APIServerArguments, server.AddFlags); len(err) > 0 {
+	if err := cmdflags.Resolve(args, server.AddFlags); len(err) > 0 {


@deads2k look what I found? I don't know how I missed this in your PR 😨. Anyway, I've added an extended test to verify that from now on.

Yikes! Can't trust anyone these days :)

soltysh · 2017-09-19T11:15:53Z

@deads2k this is ready for review as is. I don't want to grow this one bigger. I'm working on a followup where I turn on audit in the remaining api servers we have.

deads2k · 2017-09-19T13:24:49Z

pkg/cmd/server/api/types.go

@@ -474,6 +473,17 @@ type AuditConfig struct {
 	MaximumRetainedFiles int
 	// Maximum size in megabytes of the log file before it gets rotated. Defaults to 100MB.
 	MaximumFileSizeMegabytes int
+
+	// Path to the file that defines the audit policy configuration.
+	PolicyFile string


Policy files don't contain any sensitive data, right? If you had a similar config for an apiserver upstream, would you still use an external file reference?

The upstream apiserver uses an external file, and so do we. See https://github.com/kubernetes/kubernetes/blob/v1.7.0/cluster/gce/gci/configure-helper.sh#L490 as a reference. There is no sensitive data; this is a list of rules defining which requests should be logged at what level.

At the point where it is wired up we use an external file. At the point where our config is described, what is the benefit to a cluster-admin of managing a separate file?

It might get big, and I think it's handier to manage a totally separate file, especially since this isn't the first one, based on what I've checked.

Others contain secrets. Of ones which don't contain secrets, I know of the scheduler file (used by a process we don't own) and some kubelet ones (also used by processes we don't own).

@smarterclayton convinced me that this could be a file given potential future direction, but it still seems like a shame. See this other page for a complete response: xxxx.

In that same way we don't own audit and we won't anymore.

That doesn't control our config file format. If we believed that, we'd have flags controlling our certs and flags for managing admission. The decision about how to expose the config for our server isn't based on how kube manages its config, but based upon how we'd like to manage it.

Owning the code backing audit does not mean that everyone must choose to expose it half in flags and half in files. You can see you aren't building that here.

deads2k · 2017-09-19T13:26:06Z

pkg/cmd/server/api/types.go

@@ -474,6 +473,17 @@ type AuditConfig struct {
 	MaximumRetainedFiles int
 	// Maximum size in megabytes of the log file before it gets rotated. Defaults to 100MB.
 	MaximumFileSizeMegabytes int
+
+	// Path to the file that defines the audit policy configuration.
+	PolicyFile string


If this isn't specified, the default behavior should remain consistent with what it was before, right? Log everything, as I recall? Either way, it should be described.

If this file isn't specified, we fall back to the old audit scheme, so that after an upgrade we work the same way as we did before. Advanced audit is turned on by this config value being present.

You need to document what it does. It seems weird that not specifying a file and specifying an empty file give two different outcomes.

An empty file means you wanted to provide a policy and just messed up its contents. Its absence, though, means we're working in the old audit mode, for backwards compatibility. I'll be updating our docs with the advanced bits.
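The behavior described in this thread can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the `AuditConfig` type is a trimmed-down stand-in, and `auditMode` and the example path are hypothetical names introduced here.

```go
package main

import "fmt"

// AuditConfig is a hypothetical, trimmed-down stand-in for the real config type.
type AuditConfig struct {
	Enabled    bool
	PolicyFile string
}

// auditMode (hypothetical helper) reports which audit scheme a config selects:
// no policy file keeps the legacy behavior, a policy file turns on advanced audit.
func auditMode(c AuditConfig) string {
	switch {
	case !c.Enabled:
		return "disabled"
	case c.PolicyFile == "":
		return "legacy"
	default:
		return "advanced"
	}
}

func main() {
	// After an upgrade with an untouched config, the legacy scheme stays in place.
	fmt.Println(auditMode(AuditConfig{Enabled: true}))
	// The path below is an assumed example, not a documented default.
	fmt.Println(auditMode(AuditConfig{Enabled: true, PolicyFile: "/path/to/audit-policy.yaml"}))
}
```

An empty or rule-less policy file is deliberately not handled here; per the discussion it is rejected during validation rather than treated as "no policy file".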

deads2k · 2017-09-19T13:26:17Z

pkg/cmd/server/api/types.go

+	// Format of saved audits (legacy or json).
+	LogFormat string
+
+	// Path to a kubeconfig formatted filpe that defines the audit webhook configuration.


typo: file

deads2k · 2017-09-19T13:28:11Z

pkg/cmd/server/api/validation/master.go

+		if err != nil {
+			validationResults.AddErrors(field.Invalid(fldPath.Child("policyFile"), config.PolicyFile, err.Error()))
+		}
+		if len(policy.Rules) == 0 {


Why shouldn't this be treated the same as "no policy file specified"?

That's how it's done upstream, I want to be consistent with that behavior.

deads2k · 2017-09-19T13:28:37Z

pkg/cmd/server/api/validation/master.go

+		}
+	}
+	if len(config.LogFormat) > 0 && config.LogFormat != auditlog.FormatLegacy && config.LogFormat != auditlog.FormatJson {
+		validationResults.AddErrors(field.Invalid(fldPath.Child("logFormat"), config.LogFormat,


field.NotSupported

deads2k · 2017-09-19T13:28:42Z

pkg/cmd/server/api/validation/master.go

+			fmt.Sprintf("invalid audit log format, allowed formats are %q", strings.Join(auditlog.AllowedFormats, ","))))
+	}
+	if len(config.WebhookMode) > 0 && config.WebhookMode != auditwebhook.ModeBatch && config.WebhookMode != auditwebhook.ModeBlocking {
+		validationResults.AddErrors(field.Invalid(fldPath.Child("logFormat"), config.WebhookMode,


field.NotSupported

deads2k · 2017-09-19T13:29:18Z

pkg/cmd/server/api/validation/master.go

+		validationResults.AddErrors(field.Invalid(fldPath.Child("logFormat"), config.LogFormat,
+			fmt.Sprintf("invalid audit log format, allowed formats are %q", strings.Join(auditlog.AllowedFormats, ","))))
+	}
+	if len(config.WebhookMode) > 0 && config.WebhookMode != auditwebhook.ModeBatch && config.WebhookMode != auditwebhook.ModeBlocking {


Is auditFilePath + webhook valid? What does it mean?

Yes. It means audit events are sent to both places. In the webhook config file you can have multiple backends and we'll send to all of them.

deads2k · 2017-09-19T13:29:48Z

pkg/cmd/server/api/validation/master.go

+		validationResults.AddErrors(field.Invalid(fldPath.Child("logFormat"), config.LogFormat,
+			fmt.Sprintf("invalid audit log format, allowed formats are %q", strings.Join(auditlog.AllowedFormats, ","))))
+	}
+	if len(config.WebhookMode) > 0 && config.WebhookMode != auditwebhook.ModeBatch && config.WebhookMode != auditwebhook.ModeBlocking {


webhook mode shouldn't be set without the webhookconfigfile.

Good point, I'll add another check.

I'll do a different check: I'll validate the mode only when the file is passed.

deads2k · 2017-09-19T13:30:02Z

pkg/cmd/server/api/validation/master.go

@@ -253,6 +259,24 @@ func ValidateAuditConfig(config api.AuditConfig, fldPath *field.Path) Validation
 		validationResults.AddErrors(field.Invalid(fldPath.Child("maximumFileSizeMegabytes"), config.MaximumFileSizeMegabytes, "must be greater than or equal to 0"))
 	}

+	if len(config.PolicyFile) > 0 {


missing validation on webhookconfigfile?

soltysh · 2017-09-19T20:06:53Z

/retest

sttts · 2017-09-20T07:59:49Z

pkg/cmd/server/api/v1/swagger_doc.go

@@ -90,6 +90,10 @@ var map_AuditConfig = map[string]string{
 	"maximumFileRetentionDays": "Maximum number of days to retain old log files based on the timestamp encoded in their filename.",
 	"maximumRetainedFiles":     "Maximum number of old log files to retain.",
 	"maximumFileSizeMegabytes": "Maximum size in megabytes of the log file before it gets rotated. Defaults to 100MB.",
+	"policyFile":               "Path to the file that defines the audit policy configuration.",
+	"logFormat":                "Format of saved audits (legacy or json).",
+	"webhookConfigFile":        "Path to a kubeconfig formatted file that defines the audit webhook configuration.",


webhook or webHook?

Yeah, we're using webHook in builds, I'll change it.

sttts · 2017-09-20T08:02:43Z

pkg/cmd/server/api/types.go

+	LogFormat string
+
+	// Path to a kubeconfig formatted file that defines the audit webhook configuration.
+	WebhookConfigFile string


WebhookKubeconfigFile would be clearer.

Our config uses KubeConfig, but as I recall there are special allowances for webhooks to set a termination url path, right?

Yeah, but the file format follows KubeConfig, I've already applied that change.

soltysh · 2017-09-21T08:33:09Z

@sttts @deads2k @liggitt comments addressed ptal

enj · 2017-09-21T10:52:27Z

@openshift/sig-security

soltysh · 2017-09-21T11:02:35Z

pkg/cmd/server/api/install/install.go

@@ -55,6 +57,10 @@ func addVersionsToScheme(externalVersions ...schema.GroupVersion) {
 			continue
 		}
 	}
+	// we additionally need to enable audit versions, since we embed the audit
+	// policy file inside master-config.yaml
+	audit.AddToScheme(configapi.Scheme)


@deads2k is this the right place to install audit types, or do you want to move it somewhere else?

Can you simplify the path leading here? I don't think we should pretend with externalVersions anymore. We have a hardcoded set of things we add and we simply add them.

deads2k · 2017-09-21T13:14:44Z

pkg/cmd/server/api/types.go

+	// Path to a .kubeconfig formatted file that defines the audit webhook configuration.
+	WebHookKubeConfig string
+	// Strategy for sending audit events (block or batch).
+	WebHookMode string


use a typed string
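A typed string, as suggested here, might look like the sketch below. The type name `WebHookModeType`, the constant names, and the literal values `"batch"` and `"blocking"` are assumptions made for illustration; the PR's final names and values are not shown in this thread.

```go
package main

import "fmt"

// WebHookModeType is a hypothetical name for the typed string.
type WebHookModeType string

const (
	// The literal values are assumed for this sketch.
	WebHookModeBatch    WebHookModeType = "batch"
	WebHookModeBlocking WebHookModeType = "blocking"
)

// IsValid reports whether m is one of the declared modes.
func (m WebHookModeType) IsValid() bool {
	switch m {
	case WebHookModeBatch, WebHookModeBlocking:
		return true
	}
	return false
}

func main() {
	fmt.Println(WebHookModeBatch.IsValid())
	fmt.Println(WebHookModeType("bogus").IsValid())
}
```

The benefit over a plain `string` field is that the allowed values live next to the type, and the compiler flags accidental assignment from an unrelated string variable.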

deads2k · 2017-09-21T13:15:34Z

pkg/cmd/server/api/validation/master.go

@@ -253,6 +262,52 @@ func ValidateAuditConfig(config api.AuditConfig, fldPath *field.Path) Validation
 		validationResults.AddErrors(field.Invalid(fldPath.Child("maximumFileSizeMegabytes"), config.MaximumFileSizeMegabytes, "must be greater than or equal to 0"))
 	}

+	// setting policy file will turn the advanced auditing on
+	if config.PolicyConfiguration != nil && len(config.PolicyFile) > 0 {
+		validationResults.AddWarnings(field.Forbidden(fldPath.Child("policyFile"), "both policyFile and policyConfiguration are specified, the latter will take precedence"))


Let's fail on this.
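A rough sketch of failing instead of warning, under simplifying assumptions: `AuditConfig` is a trimmed-down stand-in, `PolicyConfiguration` is reduced to a `*string`, and `validatePolicySource` is a hypothetical helper that returns a plain `error` rather than the project's validation results type.

```go
package main

import (
	"errors"
	"fmt"
)

// AuditConfig is a hypothetical, simplified stand-in for the config type under review.
type AuditConfig struct {
	PolicyFile          string
	PolicyConfiguration *string // simplified; the real field holds an embedded policy object
}

// validatePolicySource (hypothetical) rejects configs that set both the external
// policy file and the embedded policy, instead of silently preferring one.
func validatePolicySource(c AuditConfig) error {
	if c.PolicyConfiguration != nil && len(c.PolicyFile) > 0 {
		return errors.New("policyFile and policyConfiguration are mutually exclusive")
	}
	return nil
}

func main() {
	embedded := "rules: ..."
	fmt.Println(validatePolicySource(AuditConfig{PolicyFile: "policy.yaml", PolicyConfiguration: &embedded}))
	fmt.Println(validatePolicySource(AuditConfig{PolicyFile: "policy.yaml"}))
}
```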

deads2k · 2017-09-21T13:19:17Z

pkg/cmd/server/api/validation/master.go

+			validationResults.AddErrors(field.Required(fldPath.Child("auditFilePath"), "advanced audit requires a separate log file"))
+		}
+
+		if len(config.WebHookKubeConfig) > 0 {


Shouldn't we have an else preventing a webhookmode if there isn't a webhookkubeconfig? Either both or none, right?
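One way to express that pairing, as a minimal sketch: `validateWebHook` is a hypothetical helper, the mode values `"batch"` and `"blocking"` are assumed, and accepting an empty mode alongside a kubeconfig (so a default can apply) is also an assumption rather than something settled in this thread.

```go
package main

import "fmt"

// validateWebHook (hypothetical) returns validation messages for the
// webhook kubeconfig/mode pair.
func validateWebHook(kubeConfig, mode string) []string {
	var errs []string
	switch {
	case kubeConfig == "" && mode != "":
		// A mode without a kubeconfig has nothing to apply to.
		errs = append(errs, "webHookMode requires webHookKubeConfig")
	case kubeConfig != "" && mode != "" && mode != "batch" && mode != "blocking":
		// The mode is only checked when a kubeconfig is present.
		errs = append(errs, fmt.Sprintf("unsupported webHookMode %q", mode))
	}
	return errs
}

func main() {
	fmt.Println(validateWebHook("", "batch"))
	fmt.Println(validateWebHook("webhook.kubeconfig", "bogus"))
	fmt.Println(validateWebHook("webhook.kubeconfig", "batch"))
}
```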

deads2k · 2017-09-21T13:24:35Z

minor comments

deads2k · 2017-09-21T19:39:39Z

/lgtm

openshift-merge-robot · 2017-09-21T19:39:52Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: deads2k, mfojtik, soltysh

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

~~OWNERS~~ [deads2k,mfojtik]

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

openshift-merge-robot · 2017-09-22T00:01:43Z

Automatic merge from submit-queue (batch tested with PRs 16480, 16486, 16270, 16128, 16489)

soltysh · 2017-09-25T08:57:23Z

(8) [SCCFSI] audit log needs to capture modifications to role binds and SCC policies

soltysh · 2017-09-25T08:57:41Z

(3) [SCCFSI] audit log needs to capture all login events without turning on debug levels

legionus · 2017-09-25T09:54:07Z

test/extended/setup.sh

 	# put change there - only want this for extended tests
 	os::log::info "Turn on audit logging"
 	cp "${SERVER_CONFIG_DIR}/master/master-config.yaml" "${SERVER_CONFIG_DIR}/master/master-config.orig2.yaml"
-	openshift ex config patch "${SERVER_CONFIG_DIR}/master/master-config.orig2.yaml" --patch="{\"auditConfig\": {\"enabled\": true}}"  > "${SERVER_CONFIG_DIR}/master/master-config.yaml"
+	openshift ex config patch "${SERVER_CONFIG_DIR}/master/master-config.orig2.yaml" --patch="{\"auditConfig\": {\"enabled\": true, \"auditFilePath\": \"${LOG_DIR}/audit.log\"}}"  > "${SERVER_CONFIG_DIR}/master/master-config.yaml"
+	exit 1


really ? merged ?!

Fixed in #16534 /o\

@soltysh

Automatic merge from submit-queue remove bad exit @soltysh looks like you left an exit in there #16128 cc @mfojtik

openshift-ci-robot added the size/XL label (denotes a PR that changes 500-999 lines, ignoring generated files) Sep 4, 2017

openshift-merge-robot assigned mfojtik and bparees Sep 4, 2017

openshift-merge-robot added the needs-api-review label Sep 4, 2017

soltysh assigned sttts and unassigned mfojtik and bparees Sep 4, 2017

openshift-merge-robot added the approved (indicates a PR has been approved by an approver from all required OWNERS files) and needs-rebase (indicates a PR cannot be merged because it has merge conflicts with HEAD) labels Sep 5, 2017

soltysh mentioned this pull request Sep 18, 2017

Move audit filter before authn to log those failures as well #14535

Closed

soltysh force-pushed the advanced_audit branch from aad80e2 to 2fc9096 Compare September 19, 2017 11:12

openshift-ci-robot added the size/XXL label (denotes a PR that changes 1000+ lines, ignoring generated files) and removed the size/XL label (denotes a PR that changes 500-999 lines, ignoring generated files) Sep 19, 2017

openshift-merge-robot removed the needs-rebase label (indicates a PR cannot be merged because it has merge conflicts with HEAD) Sep 19, 2017

soltysh assigned deads2k Sep 19, 2017

soltysh commented Sep 19, 2017

soltysh force-pushed the advanced_audit branch from 2fc9096 to 933acd9 Compare September 19, 2017 12:19

deads2k reviewed Sep 19, 2017

soltysh force-pushed the advanced_audit branch from 933acd9 to b40bc9d Compare September 19, 2017 15:18

sttts reviewed Sep 20, 2017

soltysh added 4 commits September 21, 2017 10:30

UPSTREAM: 48605: support json output for log backend of advanced audit

d0395a7

UPSTREAM: 51119: Allow audit to log authorization failures

c41de61

UPSTREAM: 52030: Fill in creationtimestamp in audit events

e92cb2e

UPSTREAM: 51782: A policy with 0 rules should return an error

d27faa4

soltysh force-pushed the advanced_audit branch from b40bc9d to 61f4dfe Compare September 21, 2017 08:30

soltysh commented Sep 21, 2017

deads2k reviewed Sep 21, 2017

soltysh added 2 commits September 21, 2017 16:21

Enable full advanced audit in origin

49ef6df

Basic audit extended test

e8e6700

soltysh force-pushed the advanced_audit branch from 61f4dfe to e8e6700 Compare September 21, 2017 14:36

openshift-ci-robot added the lgtm label (indicates that a PR is ready to be merged) Sep 21, 2017

deads2k added the api-approved label Sep 21, 2017

openshift-merge-robot merged commit d2de881 into openshift:master Sep 22, 2017

soltysh deleted the advanced_audit branch September 22, 2017 09:02

legionus reviewed Sep 25, 2017

pweil- mentioned this pull request Sep 25, 2017

remove bad exit #16534

Merged

openshift-merge-robot added a commit that referenced this pull request Sep 25, 2017

Merge pull request #16534 from pweil-/remove-exti

4be0f20
