This is the home of the whitepaper documenting Zoom's planned end-to-end encryption system. The latest released PDF will always be available here. This repository will be updated as we implement and iterate on our cryptographic design.
Timeline
17 June 2020: Version 2 was published. See the changelog for a summary of what changed. We still value feedback, recommendations, and corrections. Please continue to post them under Issues.
22 May 2020 - 5 June 2020: A comment period on the initial design
I know web client support is not part of phase 1, but many existing Zoom users (such as those on ChromeOS¹) will need it. Has there been any consideration of what the threat model looks like when deploying implementations of primitives like Ed25519 in the browser (since it's not supported in the WebCrypto API), either from a side-channel perspective or just a web security model perspective? Answers such as 'no, we haven't thought about it' and/or 'an E2EE design like this is fundamentally harder to secure on the web; use the non-web clients if your personal threat model requires it' are understandable.
¹ The current Chrome app (not to be confused with the Chrome extension) seems to just be a wrapper around the existing Zoom web app. I know Chrome apps are supposed to go away eventually, but the existing one could be turned into a full software bundle on its own and avoid some of the security dilemma of deploying secure JS on ~every page load.
How are you planning to handle devices that don't have hardware-based keychains and depend only on software-based keychains? Certain Android devices and Windows systems store keys on the file system.
Blum et al. recently published a white paper describing Zoom’s proposed End-to-End Encryption (E2EE) protocol and architecture [1], with a roadmap of work to be done in various phases. Perhaps the most important phase of this protocol is “Phase I: Client Key Management,” where the authors describe the key management protocol on which the encryption of the media content (audio/video/text) is based. This is a leader/host-driven protocol that relies on the public key of the leader: the symmetric “meeting key” (mk) is encrypted to each participant’s own public key, signed with the leader’s key, and distributed to each participant over the broadcast signaling channel (the “bulletin board” in the terminology of [1]). That is, when a participant receives the signed ciphertext, it decrypts it to learn the meeting key and, by verifying the signature against the leader’s public key, confirms that the meeting key was indeed generated by the leader.
All of this means that simply sending the public key of the leader over the signaling channel is not sufficient, as an attacker (a “man-in-the-middle” or MITM) can insert its own public key over the insecure signaling channel, thereby completely compromising the security of the entire protocol and of E2EE. This attack, although active, is extremely easy for the adversary to perform. Since Zoom’s server controls the signaling channel, it would be a cakewalk for an adversary who has compromised this server, or for Zoom itself under coercion from law enforcement, to replace the leader’s actual public key with the attacker’s own public key. Thus, it is extremely important to address this critical vulnerability. It is not an option; it is a must-have.
To counter this, it is essential to authenticate the public key of the leader. This is precisely what we call the “root of trust” in the proposed E2EE protocol, because if this authentication is not done right, you may lose all security. The game will be over.
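As an illustration of why the leader’s public key needs to be authenticated outside the signaling channel, here is a minimal sketch of a fingerprint comparison. It is not the mechanism from [1] or from the attached document: the byte strings stand in for real public keys, and `fingerprint`, `key_is_authentic`, the use of SHA-256, and the 16-character truncation are all assumptions made for this example.

```python
import hashlib
import hmac

def fingerprint(public_key: bytes) -> str:
    """Short hex digest of a public key, suitable for out-of-band comparison.

    SHA-256 and the 16-character truncation are illustrative choices only.
    """
    return hashlib.sha256(public_key).hexdigest()[:16]

def key_is_authentic(received_pk: bytes, expected_fingerprint: str) -> bool:
    """Check a key received over the signaling channel against a fingerprint
    the participant learned from the leader through some other channel."""
    return hmac.compare_digest(fingerprint(received_pk), expected_fingerprint)

# Placeholder 32-byte values standing in for real public keys (hypothetical).
leader_pk = hashlib.sha256(b"leader key placeholder").digest()
attacker_pk = hashlib.sha256(b"attacker key placeholder").digest()

# The fingerprint the leader communicates out of band (e.g. reads aloud).
expected = fingerprint(leader_pk)

# Case 1: the server relays the leader's real key.
honest = key_is_authentic(leader_pk, expected)

# Case 2: a compromised server substitutes the attacker's key.
substituted = key_is_authentic(attacker_pk, expected)

print("honest relay accepted:", honest)
print("substituted key accepted:", substituted)
```

Without the out-of-band comparison, a participant has no way to tell the two cases apart, which is the substitution attack described above.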
In the attached document, we review Zoom’s proposal to validate the authenticity of the leader’s public key, define some very fundamental and subtle security and usability problems with Zoom’s approach, and then introduce a new solution, the foundations of which have already been studied in our recent work (CCS 2014 and CCS 2017), to address many of these problems. We also list items for future work needed to transition this new solution into Zoom’s E2EE protocol in practice. The proposer and his research team are happy to work with Zoom’s researchers and engineers to make this transition possible. We would appreciate feedback from Zoom.
Will the Chinese version of Zoom receive the same security feature set slated for the International version? Also, if a Zoom (International) user calls a user in mainland China who is using the Chinese version of Zoom, which features apply?
The current document specifies the leader's long-term public verifying key, meaning that the 'meeting code' will be the same for every meeting this leader hosts. If you instead hash the leader's ephemeral per-meeting public key, the code will be different for every meeting this user hosts, but consistent for the whole duration of a meeting while it is hosted by this user, unless the ephemeral keypair changes or the leader role is handed over.
If this is the intended key, perhaps it's more accurate to call it a "Host Security Code" or even just "Host Fingerprint"? I know trust in the security of the meeting hinges on the leader/host being trusted, but having a 'meeting security code' be the same for every meeting a user hosts seems mis-named.
The meeting security code is supposed to be about proving safety from MITM, but it's also being used in a social context in the meeting: a mismatch is supposed to result in a re-key. A false positive (Abusive Andy confirming, in front of the whole meeting, a code they saw in a different meeting by the same host, when they aren't supposed to be there) looks bad and reduces trust in, and care about, the meeting security code altogether.
Renaming this code to a 'Host Security Code' or 'Host Fingerprint' makes it explicit what we are identifying and trusting: the host. If we don't trust the host of the meeting, we probably shouldn't trust the meeting either, both in current Zoom and after Phase I, so I think the intention of the name 'meeting security code' is effectively achieved.
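To make the distinction concrete, here is a small sketch of how a code derived from the long-term key repeats across meetings while a code derived from a per-meeting ephemeral key does not. The derivation is an assumption for illustration (SHA-256 reduced to 12 decimal digits); the whitepaper's actual encoding may differ, and the key values are placeholders rather than real keys.

```python
import hashlib

def security_code(public_key: bytes, digits: int = 12) -> str:
    """Derive a short decimal code from a public key.

    Hypothetical derivation: SHA-256 of the key, reduced modulo 10**digits
    and zero-padded. Not taken from the whitepaper.
    """
    digest = hashlib.sha256(public_key).digest()
    number = int.from_bytes(digest, "big") % 10**digits
    return f"{number:0{digits}d}"

# Placeholder for the host's long-term public verifying key.
long_term_pk = hashlib.sha256(b"host long-term key placeholder").digest()

# Placeholders for fresh ephemeral public keys, one per meeting.
ephemeral_pks = [
    hashlib.sha256(b"ephemeral key for meeting %d" % i).digest()
    for i in range(2)
]

# Hashing the long-term key: one code per host, repeated in every meeting.
codes_from_long_term = [security_code(long_term_pk) for _ in ephemeral_pks]

# Hashing the ephemeral key: one code per meeting.
codes_from_ephemeral = [security_code(pk) for pk in ephemeral_pks]

print("long-term key codes:", codes_from_long_term)
print("ephemeral key codes:", codes_from_ephemeral)
```

With the long-term key, a code seen in one meeting remains valid in the host's next meeting, which is what makes the Abusive Andy false positive possible.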