VerifyIDTokenAndCheckRevoked returning error: Could not find expiry time from HTTP headers #621

Closed
timelery opened this issue May 17, 2024 · 23 comments

Comments

@timelery

timelery commented May 17, 2024

Environment
Firebase SDK version: v4.14.0
Firebase Product: auth

Describe the problem
I am receiving the error Could not find expiry time from HTTP headers when calling the VerifyIDTokenAndCheckRevoked function.

This error first started occurring this morning during token validation. Looking at the module's source code, the error seems to originate in the findMaxAge function in token_verifier.go. I have verified that the token I am passing to this function is valid.

@othonrm

othonrm commented May 17, 2024

Same happening here.

@mstanleyjr

mstanleyjr commented May 17, 2024

Seeing the same for VerifyIDToken.

Edit: Same version, v4.14.0. We rolled back and still saw it on v4.13.0.

@timelery
Author

I also have new client errors coming in during phone auth against the iOS Firebase verifyPhoneNumber method. Error: "INVALID_APP_CREDENTIAL". These errors started occurring this morning as well.

jschaf added a commit to arryved/firebase-admin-go that referenced this issue May 17, 2024
Attempt to make it work to fix production incident.
firebase#621
@jschaf

jschaf commented May 17, 2024

We're still figuring out the details, but we had a multi-hour outage with this error message. We're not quite sure how things broke.

The band-aid fix was arryved#1 and using the replace directive in go.mod to point to our fork. That code hasn't changed in five years, so I'm not sure what suddenly caused the breakage.

// go.mod
replace firebase.google.com/go/v4 => github.com/arryved/firebase-admin-go/v4 v4.0.0-20240517153600-191d3ba33c12

@timelery
Author

timelery commented May 17, 2024

It looks like the issue resides in the backend Firebase API servers. The HTTP call that the Go SDK makes for token validation is no longer accepted. I tried multiple SDK versions with the same outcome.

@armando1793

We noticed that this has been happening only in our Cloud Run instances hosted in the us-west-2 region

We first noticed symptoms of this issue around May 17, 4am GMT+8, while refactoring some of our usage of the Firebase Auth Go SDK. We attributed the symptoms to developer error because the issue would disappear when we routed traffic on our staging Cloud Run instances to an older revision.

We noticed the issue again on May 17, 11am GMT+8, in a feature unrelated to the one mentioned above. By around May 17, 3pm GMT+8, the issue was happening on our Cloud Run instances in production even though no new revisions had been deployed for several days. That was when we started looking into the problem as a Firebase issue.

As of May 17, between 6 and 7pm GMT+8, we were able to use the SDK to validate client tokens on our local machines in the Philippines. When we tried the same function inside our Cloud Run instances, the code failed: client tokens would not validate inside a Cloud Run instance but were perfectly fine on our local machines.

Our findings:

When the SDK calls the URL:

https://www.googleapis.com/robot/v1/metadata/x509/securetoken@system.gserviceaccount.com

in the Philippines the headers are

"Cache-Control": [
"public, max-age=24584, must-revalidate, no-transform"
]

But inside our Cloud Run instance it is

"Cache-Control": [
"private"
]

As of writing, May 18, 1:27AM GMT+8, this is still the case: tokens can still be validated from our local machines, but not in our Cloud Run instances.

@jschaf

jschaf commented May 17, 2024

> We noticed that this has been happening only in our Cloud Run instances hosted in the us-west-2 region

We observed this on GKE nodes in us-west-4 with the GCP load balancer in front. Both happening in us-west is suspicious.

@timelery
Author

timelery commented May 17, 2024

I submitted a formal Firebase bug report. If any of you know of another way to notify the Firebase team, please let me know. I suspect this issue is affecting many others.

@myxomatos

I narrowed down the problem and made a fix by forking and modifying the Firebase client code. I found that the Go Firebase client code, including the very latest version (v4.14.0), relies on the "cache-control" response header returned by the HTTP call that fetches the public certificates. The client code makes this call to verify ID tokens. Specifically, it uses the "max-age" directive of the header to calculate the certificate expiration time. On May 16 at 5:45pm, the header value changed to "private", breaking the Firebase client code written in Go. (I'm not sure about client code written in other languages.)

More details:
Firebase Auth client code fetches certs from https://www.googleapis.com/robot/v1/metadata/x509/securetoken@system.gserviceaccount.com

This command can be used to get value of the "cache-control" header:
curl -v "https://www.googleapis.com/robot/v1/metadata/x509/securetoken@system.gserviceaccount.com" 2>&1 | grep "cache-control"

My fix is to return a default expiration value instead of an error:
dutchpet@5d4d7d0

@georgi0u

@myxomatos what's the status of the fix?

Are you suggesting affected clients hot-patch the existing package?
Or is the main-branch fix going out soon?

Also, I imagine this header change, given that only some clients are seeing it, is an ongoing incremental rollout by whoever is in charge of that cert URI. Has there been any luck coordinating with them to avoid breaking users of this client?

@janaaronlee

@georgi0u what @armando1793 and I did to work around the issue was pretty much exactly what @myxomatos did. It is definitely a bug in the Go Firebase Admin SDK: a reasonable fallback should have been in place instead of returning nil and an error.

// findMaxAge extracts the max-age directive from the Cache-Control response header.
func findMaxAge(resp *http.Response) (*time.Duration, error) {
	cc := resp.Header.Get("cache-control")
	for _, value := range strings.Split(cc, ",") {
		value = strings.TrimSpace(value)
		if strings.HasPrefix(value, "max-age=") {
			sep := strings.Index(value, "=")
			seconds, err := strconv.ParseInt(value[sep+1:], 10, 64)
			if err != nil {
				return nil, err
			}
			duration := time.Duration(seconds) * time.Second
			return &duration, nil
		}
	}
	// No max-age directive (for example "Cache-Control: private"):
	// the caller gets an error rather than a fallback value.
	return nil, errors.New("Could not find expiry time from HTTP headers")
}

For reference:

return nil, errors.New("Could not find expiry time from HTTP headers")
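
For illustration, the fallback idea could be sketched as follows. This is not the code from any of the forks or from the official fix; `maxAgeOrDefault` and `fallbackMaxAge` are hypothetical names, and the fallback of zero (do not cache) is an assumption made for this sketch.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"time"
)

// fallbackMaxAge is a hypothetical default chosen for this sketch, not a
// value taken from the SDK. Zero means "do not cache the certificates".
const fallbackMaxAge time.Duration = 0

// maxAgeOrDefault returns the max-age directive of a Cache-Control header
// value as a duration. If the directive is missing, malformed, or negative,
// it returns fallback instead of an error.
func maxAgeOrDefault(cacheControl string, fallback time.Duration) time.Duration {
	for _, directive := range strings.Split(cacheControl, ",") {
		directive = strings.ToLower(strings.TrimSpace(directive))
		if !strings.HasPrefix(directive, "max-age=") {
			continue
		}
		seconds, err := strconv.ParseInt(strings.TrimPrefix(directive, "max-age="), 10, 64)
		if err != nil || seconds < 0 {
			return fallback
		}
		return time.Duration(seconds) * time.Second
	}
	return fallback
}

func main() {
	// The two header values reported earlier in this thread.
	healthy := "public, max-age=24584, must-revalidate, no-transform"
	broken := "private"

	fmt.Println(maxAgeOrDefault(healthy, fallbackMaxAge))
	fmt.Println(maxAgeOrDefault(broken, fallbackMaxAge))
}
```

With a fallback in place, a response without max-age only affects how long the certificates are cached; token verification itself can continue.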

@georgi0u

Yup, appreciate the direction. I've patched a fork as well, and rebuilt/redeployed using that.

Now I'm curious what the plan for the official package is, and whether there's a plan to coordinate within Google so that other unpatched clients don't break.

@josephjoeljo

> We're still figuring out the details, but we had a multi-hour outage with this error message. We're not quite sure how things broke.
>
> The band-aid fix was arryved#1 and using the replace directive in go.mod to point to our fork. That code hasn't changed in five years, so I'm not sure what suddenly caused the breakage.
>
> // go.mod
> replace firebase.google.com/go/v4 => github.com/arryved/firebase-admin-go/v4 v4.0.0-20240517153600-191d3ba33c12

Using that fork temporarily as well. Thank you.

@JairoPanduro

+1 here

otakakot added a commit to bitkey-platform/firebase-admin-go that referenced this issue May 20, 2024
@ribrdb

ribrdb commented May 21, 2024

Google Cloud Support says the production issue is fixed, although it seems like a fix here would still be good to prevent this from reoccurring.

@jschaf

jschaf commented May 21, 2024

> Google Cloud Support says the production issue is fixed, although it seems like a fix here would still be good to prevent this from reoccurring.

Does anyone know how to verify? I'd rather not tempt another outage.

@ribrdb

ribrdb commented May 21, 2024

I think you could start a GCE micro instance in whatever region your app is running in and run the curl command from above:

curl -v "https://www.googleapis.com/robot/v1/metadata/x509/securetoken@system.gserviceaccount.com" 2>&1 | grep "cache-control"

You want to see public and max-age, not just 'private'
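
The same check can be sketched in Go for anyone who prefers to run it from inside the affected service. The helper names `hasMaxAge` and `fetchCacheControl` are hypothetical, and the example uses local stand-in servers that replay the two header values reported in this thread so that it runs without network access; pointing `fetchCacheControl` at the certificate URL above with `http.DefaultClient` would perform the real check.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// hasMaxAge reports whether a Cache-Control value carries a max-age directive.
func hasMaxAge(cacheControl string) bool {
	for _, d := range strings.Split(cacheControl, ",") {
		if strings.HasPrefix(strings.ToLower(strings.TrimSpace(d)), "max-age=") {
			return true
		}
	}
	return false
}

// fetchCacheControl issues a GET request and returns the Cache-Control
// response header.
func fetchCacheControl(client *http.Client, url string) (string, error) {
	resp, err := client.Get(url)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	return resp.Header.Get("Cache-Control"), nil
}

func main() {
	// Stand-in servers replaying the healthy and the broken header values.
	observed := []string{
		"public, max-age=24584, must-revalidate, no-transform",
		"private",
	}
	for _, cc := range observed {
		cc := cc // capture the per-iteration value for the handler
		srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			w.Header().Set("Cache-Control", cc)
		}))
		got, err := fetchCacheControl(srv.Client(), srv.URL)
		srv.Close()
		if err != nil {
			fmt.Println("request failed:", err)
			continue
		}
		fmt.Printf("%q has max-age: %v\n", got, hasMaxAge(got))
	}
}
```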

@armando1793

> > Google Cloud Support says the production issue is fixed, although it seems like a fix here would still be good to prevent this from reoccurring.
>
> Does anyone know how to verify? I'd rather not tempt another outage.

I just did what @ribrdb suggested on the staging version of our Cloud Run instance that was affected by the outage. I can confirm that the headers are available. I will attempt to repoint our package back to the official SDK in a few hours and see if the issue is resolved. Will update here when I do.

@otakakot

otakakot commented May 22, 2024

Why does it generate an error if max-age cannot be obtained? I think this value will be used later to determine if it should be refreshed or not. So why not return 0 if the value cannot be retrieved so that it is not cached?

Also, I checked the implementation of the node library (firebase-admin-node), and it seems that if the max-age value could not be obtained, an error is not generated, but the default value of 0 is set (i.e., not cached).
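
To make the "0 means not cached" point concrete, here is a minimal sketch of an expiry-based cache. The `certCache` type and `newCountingCache` helper are hypothetical stand-ins and are not the SDK's actual types; the sketch is also not safe for concurrent use.

```go
package main

import (
	"fmt"
	"time"
)

// certCache is a hypothetical stand-in for a key cache that refreshes its
// contents once the max-age reported by the last fetch has elapsed.
type certCache struct {
	certs     []string
	expiresAt time.Time
	fetch     func() ([]string, time.Duration, error)
}

// get returns the cached certificates, fetching them again when nothing is
// cached yet or the cached copy has expired.
func (c *certCache) get() ([]string, error) {
	if c.certs == nil || !time.Now().Before(c.expiresAt) {
		certs, maxAge, err := c.fetch()
		if err != nil {
			return nil, err
		}
		c.certs = certs
		c.expiresAt = time.Now().Add(maxAge)
	}
	return c.certs, nil
}

// newCountingCache builds a cache whose fetch function always reports the
// given max-age, and returns a pointer to the number of fetches performed.
func newCountingCache(maxAgeSeconds int64) (*certCache, *int) {
	calls := 0
	c := &certCache{fetch: func() ([]string, time.Duration, error) {
		calls++
		return []string{"cert"}, time.Duration(maxAgeSeconds) * time.Second, nil
	}}
	return c, &calls
}

func main() {
	uncached, uncachedFetches := newCountingCache(0)
	_, _ = uncached.get()
	_, _ = uncached.get()

	cached, cachedFetches := newCountingCache(3600)
	_, _ = cached.get()
	_, _ = cached.get()

	fmt.Println("fetches with max-age 0:", *uncachedFetches)
	fmt.Println("fetches with max-age 3600:", *cachedFetches)
}
```

A max-age of zero makes every lookup hit the network, which costs latency but keeps verification working when the header lacks a max-age directive.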

@lahirumaramba
Member

Hey folks, the backend issue should be addressed now. I agree with the comments above, we should update the SDK to handle this case gracefully without throwing and continue the token verification. We will submit a fix soon and this issue will track the progress. Thanks!

@lahirumaramba
Member

Addressed in #623

@jschaf

jschaf commented May 23, 2024

Thank you. Are you able to cut a new release so we can upgrade without using the dev branch?

@lahirumaramba
Member

> Thank you. Are you able to cut a new release so we can upgrade without using the dev branch?

Hey @jschaf, we will cut a new release this week. Thanks
