The Tale of the Malformed Token: An Autopilot Mystery
There are many ways for Windows Autopilot Hybrid Join to fail. But this particular failure was special. It presented the most useless error message in the history of Windows errors: “Something Went Wrong: 80004005”.
And the mystery? Pre-provisioning worked perfectly fine. Same tenant. Same device. Only the user-driven Hybrid path was throwing a tantrum.
The Scene of the Crime
The user had completed their Windows Autopilot journey, or so they thought. After the Device Preparation phase of the Enrollment Status Page (ESP), they reached the Hybrid Join moment. They entered their credentials. The progress indicator spun. And then…
80004005.
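Not a helpful message. Not an actionable message. Just… “something went wrong.” (For the record, 0x80004005 is E_FAIL, the generic “unspecified error” HRESULT, so the code itself says nothing about the cause.) Clicking “Try Again” was no more illuminating: sometimes it simply continued the flow as if nothing had happened, which was frankly more confusing than the error itself.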
Following the Trail of Clues
When traditional error messages fail you, it’s time to go digging. The first stop? The Shell-Core event log (Microsoft-Windows-Shell-Core/Operational in Event Viewer).
Inside, we found references to:
- A JSON parsing failure
- Something called “crackIdToken”
- The dreaded idTokenNotThreeParts
Now, an ID token in the world of OpenID Connect (the identity layer on top of OAuth 2.0) is a JWT, a JSON Web Token. And a proper JWT has three parts separated by dots: Header.Payload.Signature. Think of it like a three-course meal. You need all three parts to have a complete token.
When Windows couldn’t split the ID token into three parts, it wasn’t dealing with a “bad token.” It was dealing with something that wasn’t a token at all. Or something that got… mangled.
The OtaDomainJoin Culprit
The Shell Core event log pointed directly at OtaDomainJoin — a JavaScript component in the CloudExperienceHost web layer. This is the code that takes the ID token and extracts the values needed for the join flow to continue.
Its job is beautifully simple:
- Split the token on dots
- Base64url-decode the middle part (the payload)
- Parse that as JSON
Simple, right? But when the token isn’t actually a token, this basic operation crashes spectacularly.
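To make those three steps concrete, here is a minimal Python sketch of the same idea. It is not the actual OtaDomainJoin code (which is JavaScript inside CloudExperienceHost); the function name crack_id_token, the TokenShapeError class, and the sample claims are all made up for illustration, and no signature verification is attempted.

```python
import base64
import json


class TokenShapeError(ValueError):
    """Raised when the input does not have the three-part JWT layout."""


def crack_id_token(id_token: str) -> dict:
    """Return the decoded payload (claims) of a JWT-shaped ID token.

    This only reads the payload; it does not verify the signature.
    """
    # Step 1: split the token on dots.
    parts = id_token.split(".")
    if len(parts) != 3:
        # Same condition the event log names idTokenNotThreeParts.
        raise TokenShapeError(f"idTokenNotThreeParts: got {len(parts)} part(s)")

    # Step 2: base64url-decode the middle part.
    payload_b64 = parts[1]
    # JWT segments drop the '=' padding; restore it before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    payload_bytes = base64.urlsafe_b64decode(payload_b64)

    # Step 3: parse the decoded bytes as JSON.
    return json.loads(payload_bytes)


def _b64url(obj: dict) -> str:
    """Helper for the demo: encode a dict as an unpadded base64url segment."""
    raw = json.dumps(obj).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


# A made-up, unsigned token, purely for illustration.
sample_token = ".".join([
    _b64url({"alg": "none"}),
    _b64url({"upn": "user@example.com"}),
    "signature-placeholder",
])
claims = crack_id_token(sample_token)
```

Hand the same function anything that is not dot-separated into exactly three segments, such as a chunk of HTML, and it stops at the very first step, long before any JSON parsing gets a chance.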
The Token Handoff: A Bridge Too Far
Let’s trace what should happen when a user enters credentials during Hybrid Join:
- BeginAuth — Authentication starts
- EndAuth — MFA completes (maybe)
- ProcessAuth — Entra says “you’re good, here’s what you need”
The critical moment is ProcessAuth. In a healthy flow, it doesn’t return a full page. It returns a redirect using the ms-aadj-redir:// protocol. This redirect contains:
- A code representing the completed sign-in
- Session state
- Cloud instance hints
The redirect is the bridge between login.microsoftonline.com and the CloudExperienceHost OtaDomainJoin webapp. Clean. Simple. A proper handoff.
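As a rough illustration of what that handoff carries, the sketch below pulls the query parameters out of a redirect URL using Python's standard library. Only the ms-aadj-redir scheme comes from the flow described here; the host part, the parameter names (code, session_state, cloud_instance_name), and their values are assumptions invented for the example, and parse_handoff_redirect is a hypothetical helper.

```python
from urllib.parse import parse_qs, urlsplit

REDIRECT_SCHEME = "ms-aadj-redir"


def parse_handoff_redirect(location: str) -> dict:
    """Split a redirect URL into its query parameters.

    Raises ValueError when the URL does not use the expected scheme.
    """
    parts = urlsplit(location)
    if parts.scheme != REDIRECT_SCHEME:
        raise ValueError(f"unexpected scheme: {parts.scheme!r}")
    # parse_qs returns a list per key; keep the first value of each.
    return {key: values[0] for key, values in parse_qs(parts.query).items()}


# Hypothetical redirect: host, parameter names, and values are invented.
example_redirect = (
    "ms-aadj-redir://handoff"
    "?code=EXAMPLE_CODE"
    "&session_state=EXAMPLE_STATE"
    "&cloud_instance_name=example.invalid"
)
handoff = parse_handoff_redirect(example_redirect)
```

The point of the sketch is the shape: a short URL with a handful of named values, nothing that needs rendering or script execution before the next stage can pick it up.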
What Actually Went Wrong
In this case, instead of a clean redirect, Windows received a full HTML “Working…” page. You know the one — it has script bundles, telemetry config, loaders, and a navigation call to move to the next stage.
And buried inside that mess? The ID token was visible as part of the payload. Not malformed, technically. But delivered in entirely the wrong way.
The problem wasn’t that the token was bad. The problem was that the response should have been a simple redirect, but instead it was a full HTML delivery where everything was bundled into one blob.
When OtaDomainJoin tried to crack this token, it failed. When it failed, it crashed. When it crashed, you got 80004005.
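One way to picture the mismatch is a small shape check on whatever came back from the sign-in step. This is a sketch under assumptions, not how Windows does it: classify_handoff is a hypothetical helper, the response is assumed to be available as a plain string, and the HTML below is a stand-in rather than a real “Working…” page.

```python
def classify_handoff(body: str) -> str:
    """Return 'redirect', 'html', or 'unknown' for a sign-in response body."""
    text = body.lstrip().lower()
    if text.startswith("ms-aadj-redir://"):
        return "redirect"
    if text.startswith("<!doctype html") or text.startswith("<html"):
        return "html"
    return "unknown"


# Stand-in for a full HTML page with script bundles (not a real capture).
html_page = (
    "<html><body>Working..."
    "<script src='bundle.min.js'></script>"
    "</body></html>"
)

shape = classify_handoff(html_page)

# A naive split on dots does not yield the three segments a JWT would.
dot_parts = len(html_page.split("."))
```

A page like this contains dots in all the wrong places, so a parser that expects exactly Header.Payload.Signature has nothing sensible to work with.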
Why Pre-Provisioning Worked Fine
Here’s the twist: In the same tenant, Autopilot with pre-provisioning worked perfectly. Devices went through the technician flow. ESP progressed. Apps installed. Devices were sealed. No 80004005.
Why?
During the technician phase, the device isn’t running as the real end user yet. It uses a temporary placeholder user context — similar to the “FooUser” behavior we’ve seen before. In that flow, the device identity matters more than user identity. The flow leans on the device proving itself, not on who the user is.
The moment the real user signs in during the user-driven path, everything changes. That’s when OtaDomainJoin expects a clean ID token handoff. If that handoff is malformed, the flow collapses before it ever reaches the Offline Domain Join stage.
The “Try Again” Mystery Solved
Remember how clicking “Try Again” sometimes let the flow carry on? That’s because the second attempt took a different token path and returned a clean OAuth token response. Once the token could be parsed properly, the MDM URLs could be extracted, and the Hybrid Join continued normally.
The Lesson
This wasn’t a random Hybrid Join failure. It wasn’t “just MFA being MFA.”
Authentication succeeded. The device got far enough to request the ID token. But the handoff came back in the wrong shape — a full HTML response instead of a clean redirect. That broke the token parsing step, caused the JSON parse errors, and crashed the Cloud Domain Join web experience into the dreaded 80004005 error screen.
When troubleshooting Hybrid Join failures, remember: sometimes the error isn’t in the join itself. It’s in the token handoff that happens before the join even starts.
The more you know, the less you debug. Stay curious, dear reader.