The Tale of Google’s Response to Reptar CPU Vulnerability
Just as Vulnerability Research is an important area of focus at Google, so is Vulnerability Response to critical and complex vulnerabilities. Vulnerability Response at Google not only helps secure Google’s products and users, but in certain cases, it affects millions of devices across the Internet. In this post, we'll share the story of one of those cases – Google’s Response to Reptar.
Let’s start with some context about the role of Google’s Vulnerability Coordination Center (VulnCC) in a response like this one. Composed of a core team of security experts, VulnCC ensures Google has a consistent, well-functioning process for vulnerability management across Alphabet.
When a critical and complex security vulnerability affecting Alphabet is identified by a Google researcher, an external researcher through Google’s Vulnerability Rewards Program (VRP), or by VulnCC’s monitoring of vulnerability intelligence sources, a VulnCC team member leads the mitigation effort for the vulnerability across Alphabet.
What’s Reptar?
Reptar is a CPU vulnerability (CVE-2023-23583) with a CVSS [1] Base Score of 8.8 (High) that affects certain Intel CPU models. It’s an architectural vulnerability: running a particular sequence of x86 instructions corrupts the instruction pointer, which results in unexpected behavior and allows the CPU's security boundaries to be bypassed.
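For readers unfamiliar with how a base score such as 8.8 is derived, the following is a minimal sketch of the CVSS v3.1 base-score arithmetic. The metric vector used here is an assumption chosen for illustration and is not stated in this post; Intel's Security Advisory is the authoritative source for the actual vector.

```python
import math

# CVSS v3.1 metric weights, as published in the FIRST.org specification.
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}


def roundup(value: float) -> float:
    """Round up to one decimal place, using the spec's integer-based method."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the base score for a metric string such as 'AV:L/AC:L/...'."""
    m = dict(part.split(":") for part in vector.split("/"))
    changed = m["S"] == "C"  # scope changed?
    iss = 1 - (1 - CIA[m["C"]]) * (1 - CIA[m["I"]]) * (1 - CIA[m["A"]])
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss
    pr = (PR_CHANGED if changed else PR_UNCHANGED)[m["PR"]]
    exploitability = 8.22 * AV[m["AV"]] * AC[m["AC"]] * pr * UI[m["UI"]]
    if impact <= 0:
        return 0.0
    total = impact + exploitability
    return roundup(min(1.08 * total if changed else total, 10))


# Assumed vector (not from this post): local attack, low privileges,
# no user interaction, changed scope, high C/I/A impact.
ASSUMED_VECTOR = "AV:L/AC:L/PR:L/UI:N/S:C/C:H/I:H/A:H"
print(base_score(ASSUMED_VECTOR))  # prints 8.8
```

The changed-scope branch matters here: an attack launched from a guest that affects the host crosses a security boundary, which is what the 1.08 multiplier and the separate privileges-required weights account for.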
The vulnerability was discovered in August 2023, when a validation pipeline used by the researchers reported unexpected results under certain conditions. During testing, triggering this bug in a CPU with multiple cores caused an MCE (Machine Check Exception) [2]. Moreover, this worked from an unprivileged guest virtual machine. Exploitation of this vulnerability can cause the machine to crash and can also lead to privilege escalation and information disclosure.
The vulnerability was discovered by security researchers at Google. See the dedicated blog post by Tavis Ormandy for more details about the vulnerability.
For additional information, see Intel's Security Advisory.
How Did Google Initiate the Response to Reptar?
Upon the discovery of the Reptar vulnerability and its escalation to VulnCC, I (Yousif Hussin) took on the role of leading a coordinated response for it. As a first step, the vulnerability was reported to Intel, in accordance with our reporting policy, with a disclosure deadline of 90 days. From there, Google partnered and collaborated with Intel to securely share the vulnerability mitigation information with other large industry players to ensure they too could respond and protect all users globally (not only Google users).
During coordination, we used the Traffic Light Protocol (TLP) [3]. The response was labeled TLP:RED, meaning “Not for disclosure, restricted to participants only”. Access to Reptar information was tightly controlled to ensure vulnerability details wouldn't be leaked. This was necessary, as a leak could be used by attackers against our users or others globally.
At the outset of the response, we conducted a rapid Google-wide impact assessment. We developed a response plan and assembled a response team with roles clearly assigned. We actively shared status updates with Google’s executives throughout the vulnerability response effort.
A key objective of the response was to create and execute a mitigation strategy, and ensure its timely deployment across critical areas of Google before the end of the embargo period, all while minimizing the chances of prematurely leaking any information regarding the vulnerability.
Now let's take a closer look at the different phases of the response.
Reptar Vulnerability Response Milestones Timeline
The timeline below highlights key vulnerability response milestones in relation to the embargo period.

Fig. 1. Reptar Vulnerability Response Timeline (2023)
The Google-Wide Impact Assessment for Reptar
To perform an effective impact assessment for Reptar, it was essential to have a thorough understanding of the vulnerability, how it’s exploited, its attack surface, and how Google's products and systems operate.
In the case of Reptar, we assessed the impact across Google to identify affected systems.
The key affected areas were:
- Google Compute Engine (GCE) [4] for Google Cloud
- Google’s Borg [5] Infrastructure (Non-Cloud)
- ChromeOS [6] Client Devices
Of the affected areas above, Reptar poses the highest risk to Google Cloud. This is due to a possible attack strategy in which an external attacker bypasses the CPU security controls in a multi-tenant Google Compute Engine environment. This could impact the host machine and subsequently other services or virtual machines running on the same host. In turn, this could cause a denial of service on victims’ (other customers’) virtual machines and services by causing an MCE crash on the host machine (see graphic below for an example). The vulnerability could also lead to privilege escalation and information disclosure.

Fig. 2. Cloud Reptar Exploit (VM DoS)
The Reptar Vulnerability Response Team
When the impact assessment identified the affected products and systems, the scale of the response became clear and area-specific response leads joined the response team.
When issues such as this one require extensive coordination, it can help significantly to use an Incident Management structure to handle them, even if they have not caused an incident. At Google, we use the Incident Management at Google (IMAG) framework, which was used in Reptar's response. IMAG is based on the Incident Command System (ICS) used by firefighters and medics, and it teaches how to utilize an Incident Commander (IC) to organize an emergency response by establishing a hierarchical structure with clear roles, tasks, and communication responsibilities.
In the case of Reptar, I took on the role of the IC, structured the team as shown below, and led it throughout the response. For team communications, in addition to regular meetings, the response team created and used chat channels, which included a general chat channel and function-specific (e.g., Operations, GCE, Communications) chat channels.

Fig. 3. Reptar Vulnerability Response Team
Reptar Mitigation Strategy for Key Products
For Google Cloud, two primary mitigation solutions were proposed by the response team:

- A chicken-bit mitigation, which we identified and tested internally. A chicken-bit is jargon for a hardware configuration setting that can be used to disable a certain feature on the chip, in this case by setting a bit in an MSR [7]. The vulnerability is exploitable on CPUs with the Fast-Strings feature enabled, so disabling this feature via the MSR mitigates the risk. However, the chicken-bit mitigation caused a significant performance impact, so it was discarded as a viable option.
- A long-term microcode fix from Intel. After Intel provided a candidate microcode update to Google and other participating companies, it was extensively tested in their environments and proven to remediate the risk without causing an unacceptable impact. With this verification, Intel promoted the microcode to Production-Validated (PV) status.
In general, the microcode update is the superior solution compared to the chicken-bit. This is because the microcode update is the official update supported by the vendor, and the fix has undergone extensive testing not only by Google and Intel, but also by other major industry partners.
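To make the chicken-bit option described above more concrete, here is a minimal sketch of inspecting and clearing a Fast-Strings enable bit. It assumes the bit in question is bit 0 of IA32_MISC_ENABLE (MSR address 0x1A0), as documented in the Intel Software Developer's Manual, and that the MSR is read through the Linux `msr` driver. This post does not name the register, bit, or tooling Google actually used, so treat those details as assumptions.

```python
import os
import struct

# Assumptions taken from the Intel SDM, not from this post:
IA32_MISC_ENABLE = 0x1A0      # MSR address
FAST_STRINGS_ENABLE_BIT = 0   # bit 0 = Fast-Strings Enable
MASK_64 = 0xFFFFFFFFFFFFFFFF


def fast_strings_enabled(msr_value: int) -> bool:
    """Return True if the Fast-Strings enable bit is set in the MSR value."""
    return bool((msr_value >> FAST_STRINGS_ENABLE_BIT) & 1)


def clear_fast_strings(msr_value: int) -> int:
    """Return the MSR value with the Fast-Strings bit cleared, other bits kept."""
    return msr_value & ~(1 << FAST_STRINGS_ENABLE_BIT) & MASK_64


def read_msr(cpu: int, address: int) -> int:
    """Read a 64-bit MSR on one logical CPU.

    Requires root and the Linux msr driver (/dev/cpu/<n>/msr) to be loaded.
    """
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_RDONLY)
    try:
        return struct.unpack("<Q", os.pread(fd, 8, address))[0]
    finally:
        os.close(fd)


# Example with a hypothetical register value (no hardware access needed).
example = 0x850089
print(fast_strings_enabled(example))    # prints True
print(hex(clear_fast_strings(example))) # prints 0x850088
```

Because MSRs are per logical CPU, a setting like this would need to be applied on every core and reapplied after each reboot, which is one more reason a vendor-supported microcode fix is easier to operate at scale.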
We also compared the possible deployment approaches for the long-term microcode fix. The options were:
- A disruptive BIOS update
- A non-disruptive hotload microcode update
Through thorough testing and comparison of the approaches, the team concluded that non-disruptively hotloading the microcode was the best course of action. A microcode rollout plan was developed, and a suite of monitoring tools was configured to ensure systems were carefully monitored for potential anomalies throughout the rollout.
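One simple check that can accompany a hotload rollout is confirming that every logical CPU reports at least the expected microcode revision. The sketch below assumes a Linux host, where `/proc/cpuinfo` exposes a per-CPU `microcode` field. The minimum revision shown is a hypothetical placeholder, and nothing here describes Google's actual rollout or monitoring tooling.

```python
from typing import Dict, List


def microcode_revisions(cpuinfo_text: str) -> Dict[int, int]:
    """Map each logical CPU number to its microcode revision."""
    revisions: Dict[int, int] = {}
    current = None
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "processor":
            current = int(value)
        elif key == "microcode" and current is not None:
            revisions[current] = int(value, 16)  # e.g. "0x2b"
    return revisions


def cpus_below(cpuinfo_text: str, minimum_revision: int) -> List[int]:
    """Return the logical CPUs still running a revision older than required."""
    revisions = microcode_revisions(cpuinfo_text)
    return sorted(cpu for cpu, rev in revisions.items() if rev < minimum_revision)


if __name__ == "__main__":
    HYPOTHETICAL_MINIMUM = 0x2C  # placeholder, not a real fixed revision
    with open("/proc/cpuinfo") as handle:
        stale = cpus_below(handle.read(), HYPOTHETICAL_MINIMUM)
    print("CPUs needing update:", stale)
```

Checking every logical CPU rather than just the first one matters for a hotload, since a partially applied update would leave some cores on the old revision.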
In the case of ChromeOS, the Intel-provided microcode fix was tested and incorporated into a forthcoming ChromeOS update scheduled to be made available to ChromeOS customers prior to the vulnerability disclosure date.
Execution of the Reptar Mitigation
Each area lead customized the microcode rollout for their respective environments. Subsequently, the non-disruptive hotload server microcode rollout was executed Google-wide across the entire fleet (Google Cloud + Borg Infrastructure). The mitigation was deployed successfully and transparently, with no impact to users or customers.
As for the ChromeOS mitigation, the update containing the fix was published for client devices as planned before the disclosure date.
In light of the smooth rollout without unexpected issues, and the speed at which it was completed, Reptar's mitigation experience across Google was further evidence that Google engineers maintain an infrastructure that can successfully and quickly deploy mitigations at Google's scale with ease.
Reptar Exploitation Detection
When leading a vulnerability response, we assess whether attempted exploitation can be detected and whether there’s evidence of exploitation attempts at Google. CPU vulnerabilities are typically harder for traditional host monitoring tools to detect than other types of vulnerabilities, but there are usually still some opportunities to provide visibility. To identify them, the response team works with partner security teams to develop signals that detect exploitation of the vulnerability. In the case of Reptar, researchers have so far only been able to demonstrate a DoS attack in practice, which is readily detectable by standard monitoring tools.
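As a rough illustration of why an MCE-based DoS is easy to spot, the sketch below counts machine-check messages per host in kernel log lines. It assumes Linux-style log lines tagged with `[Hardware Error]`. The host names and the threshold are hypothetical, and this is not a description of the signals Google's security teams actually built.

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple

# Assumed pattern: Linux kernel machine-check messages carry a
# "[Hardware Error]" tag followed by text mentioning "machine check".
MCE_PATTERN = re.compile(r"\[Hardware Error\].*machine check", re.IGNORECASE)


def count_mce_events(records: Iterable[Tuple[str, str]]) -> Counter:
    """Count matching log lines per host from (hostname, log line) pairs."""
    hits: Counter = Counter()
    for host, line in records:
        if MCE_PATTERN.search(line):
            hits[host] += 1
    return hits


def hosts_over_threshold(hits: Counter, threshold: int = 1) -> List[str]:
    """Return hosts whose MCE count meets or exceeds the alert threshold."""
    return sorted(host for host, count in hits.items() if count >= threshold)


# Illustrative input only; these are not real log records.
sample_records = [
    ("host-a", "mce: [Hardware Error]: Machine check events logged"),
    ("host-a", "mce: [Hardware Error]: Machine check events logged"),
    ("host-b", "usb 1-1: new high-speed USB device number 2"),
]
print(hosts_over_threshold(count_mce_events(sample_records), threshold=2))
# prints ['host-a']
```

Machine checks also occur for genuine hardware faults, so a signal like this would typically be correlated with other context, such as which guest was running on the host at the time.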
Response Communications
Communication is a critical component of the response. For example, once a vulnerability is disclosed, Google customers may have inquiries regarding the vulnerability, its mitigation and potential impact. In such a case, answers to anticipated questions should be documented in a Frequently Asked Questions (FAQ) document and then made available to customer-facing engineers to help respond to customer inquiries.
For Reptar, while some of the technical teams were engaged in mitigation activities, the communication leads ensured communication artifacts were developed. These artifacts included the Security Bulletin and an FAQ document.
In addition to communication artifacts, the communication leads established channels to facilitate any escalations that resulted from the vulnerability response. This included external escalations from customers and internal escalations from Googlers.
Finally, Reptar Disclosure Day
Upon Intel’s disclosure of the vulnerability, Google published a Security Bulletin. We ensured that monitoring for escalations was in place and that the team was prepared to respond as necessary; fortunately, we did not need to use those escalation channels. This marked the successful completion of the Reptar vulnerability response.
Concluding Remarks
Vulnerability Response is a critical and rapidly developing field, which is why Google has been investing in Vulnerability Response as a unique discipline. While the discovery of and response to Reptar demonstrate Google's ability to protect not only its own users but also computer users around the world from critical security threats, every new vulnerability provides an opportunity to further refine the response process.
Google’s response to Reptar served as evidence that a well-orchestrated vulnerability response to a critical vulnerability like this one, which includes successful internal and external collaboration, is very important for protecting the Internet as a whole.
References
[1] The Common Vulnerability Scoring System (CVSS) provides a way to capture the principal characteristics of a vulnerability and produce a numerical score reflecting its severity.
[2] Machine Check Exception (MCE) is a type of hardware error that occurs when a CPU detects a hardware problem.
[3] Traffic Light Protocol (TLP) is a set of designations used to ensure that sensitive information is shared only with the appropriate audience.
[4] Google Compute Engine (GCE) is a secure and customizable compute service that lets you create and run virtual machines on Google’s infrastructure.
[5] Borg is Google’s cluster management system, designed to manage jobs and machine resources on a massive scale.
[6] ChromeOS is the speedy, simple and secure operating system that powers every Chromebook.
[7] Model-Specific Register (MSR) is any of various control registers in the x86 system architecture used for toggling certain CPU features.