Under the Hood of AI Timelines: How the AI 2027 Forecast is Made

While forecasting how AI will impact my career in Consumer Lending Credit Risk Management (Article Link), I explored the timeline forecast methodology used in AI 2027.

Prerequisites

Time Horizon

The concept of time horizon, proposed by the Model Evaluation & Threat Research (METR) team, states that the "50%-task-completion time horizon is the time humans typically take to complete tasks that AI models can complete with a 50% success rate." A notable result from their research is that the length of tasks AI can perform is doubling approximately every 7 months.
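
As a back-of-the-envelope illustration (my numbers, not METR's): if the horizon doubles every 7 months, the months needed to go from a current horizon to a target horizon is just 7 times the number of doublings.

```python
import numpy as np

# Back-of-the-envelope extrapolation under a constant 7-month doubling time.
# Both horizon values below are hypothetical placeholders.
current_horizon = 15        # minutes
target_horizon = 167 * 60   # one 167-hour work-month, in minutes

n_doublings = np.log2(target_horizon / current_horizon)
months_needed = 7 * n_doublings
print(f"{n_doublings:.1f} doublings, ~{months_needed:.0f} months")
```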

Monte Carlo Simulation

Monte Carlo simulation was largely developed in the 1940s by mathematicians and physicists working on secret projects. They were dealing with incredibly complex problems, such as simulating neutron diffusion, which were too difficult to solve with traditional analytical mathematics. The fundamental idea was to determine probability distributions by running numerous trials and observing the outcomes. The core ideas involve using probability distributions to model variables (instead of single, deterministic values) and conducting repeated trials to ensure a sufficient number of random scenarios are explored for patterns to emerge. The name "Monte Carlo" is thought to have originated because one of its inventors, Stanislaw Ulam, enjoyed gambling at the Monte Carlo Casino, and the project required a secret codename.

The process of Monte Carlo simulation is as follows:

  1. Define the problem: Identify the key variables and their relationships.
  2. Create a model: Use probability distributions to represent the variables.
  3. Run many trials: Simulate the model multiple times to observe the probability distribution of the outcomes.
  4. Analyze the results: Look for patterns and make predictions.
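
As a minimal illustration of these four steps, here is a toy forecast (with made-up numbers unrelated to the AI 2027 calibration) where the outcome is doublings times an uncertain doubling time:

```python
import numpy as np

rng = np.random.default_rng(0)
n_trials = 100_000  # step 3: many repeated trials

# Steps 1-2: model a toy forecast where total time = doublings x doubling time,
# with the doubling time uncertain (lognormal). Parameters are illustrative.
n_doublings = 9.0
doubling_time = rng.lognormal(mean=np.log(7.0), sigma=0.3, size=n_trials)  # months

total_time = n_doublings * doubling_time  # one outcome per trial

# Step 4: analyze the resulting distribution of outcomes.
p10, p50, p90 = np.percentile(total_time, [10, 50, 90])
print(f"10th/50th/90th percentile: {p10:.0f} / {p50:.0f} / {p90:.0f} months")
```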

The variables used in the AI 2027 model are:

| Variable Name | Description | Distribution |
| --- | --- | --- |
| current_horizon | Current 80% time horizon. Sourced from METR's time horizon paper; specifically, Claude Sonnet 3.7's 80% time horizon. | Fixed |
| h_SC | Horizon length for a superhuman coder: the time horizon required for a superhuman coder, expressed as an 80% confidence interval. | Lognormal |
| T_t | Time horizon doubling time: time needed to double the time horizon as of March 2025, based on HCAST. | Lognormal |
| cost_speed | Cost and speed adjustment factor: additional time AI might take to meet stringent speed and cost efficiency targets. | Lognormal |
| announcement_delay | Gap between internal and external deployment: accounts for the potential gap between the capabilities of publicly announced models (used for current time horizon estimates) and more advanced internal models. | Lognormal |
| p_superexponential, p_subexponential, is_exponential (default case) | Probabilities that doubling times will accelerate (superexponential), decelerate (subexponential), or remain constant (exponential). | Fixed |
| present_prog_multiplier, SC_prog_multiplier | Present progress multiplier and progress multiplier for a superhuman coder: affect the number of calendar days required to reach the target capability level by modifying algorithmic progress (referred to as intermediate speedup). | Lognormal |

I was initially confused about the difference between "doubling time speedup" and "intermediate speedup". The doubling-time speedup changes the amount of work required: it shrinks (or stretches) the number of work-months, at current research capacity, needed to reach the target capability level. The intermediate speedup changes how fast that work happens in calendar time: as AI itself accelerates algorithmic progress, the same amount of work takes fewer calendar days.
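
A toy contrast may help (hypothetical numbers, not the model's calibration): suppose reaching the target takes \(n = 4\) doublings at a base doubling time \(T_t = 7\) months, i.e. \(4 \times 7 = 28\) months of work at today's research speed. A doubling-time speedup changes that sum itself; if each doubling is 10% faster than the last, the work required shrinks to

\[
\sum_{k=0}^{3} 7 \times 0.9^{k} = 7 \cdot \frac{1 - 0.9^{4}}{1 - 0.9} \approx 24.1 \text{ months}.
\]

An intermediate speedup of \(v = 2\), by contrast, leaves the 28 months of work untouched but halves the calendar time needed to perform it: \(28 / 2 = 14\) calendar months.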

Lognormal Distribution and Normal Distribution

Assuming familiarity with the normal distribution: a lognormal distribution arises when the logarithm of a random variable is normally distributed, i.e., the variable is the exponential of a normal one. First, a lognormal variable can only take positive values, since the exponential of any real number is positive. Second, the lognormal distribution is skewed to the right: most values cluster around a central point, but significantly larger values occur occasionally. An underlying data generation process that often leads to a lognormal distribution is the multiplicative accumulation of random effects; stock returns (\(r\)), which result from compounding, are a classic example.
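
A quick sketch of both properties, with arbitrary parameters of my own choosing:

```python
import numpy as np

rng = np.random.default_rng(0)

# exp of a normal variable is lognormal; mu and sigma here are arbitrary.
mu, sigma = 0.0, 0.5
y = np.exp(rng.normal(mu, sigma, size=100_000))  # same as rng.lognormal(mu, sigma, ...)

print(y.min() > 0)                # True: strictly positive support
print(np.mean(y) > np.median(y))  # True: right skew pulls the mean above the median

# Multiplicative accumulation: compounding many small random growth factors
# gives a lognormal total, because the logs add up (a CLT-on-logs argument).
daily_growth = np.exp(rng.normal(0.0005, 0.01, size=(100_000, 250)))
annual_return = daily_growth.prod(axis=1)  # distribution of compounded returns
```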

Gaussian Copula

To account for correlations between variables, a Gaussian Copula can be employed. The core concept is to define a multivariate joint distribution (for correlated variables) using their individual marginal distributions and a copula function, which captures the correlation structure. The process is as follows:

  1. Specify the marginal distributions for each variable.
  2. Specify the correlation matrix.
  3. Generate random samples from the multivariate normal distribution derived from the correlation matrix.
  4. Transform the normal samples to uniform samples.
  5. Transform the uniform samples to the desired marginal distributions.
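
Here is a minimal sketch of steps 2 through 5 for two lognormal variables. The correlation of 0.7 matches the pairing described below, but the lognormal shape and scale parameters are illustrative assumptions, not the AI 2027 calibration:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 100_000

# Step 2: the correlation matrix for two variables.
corr = np.array([[1.0, 0.7],
                 [0.7, 1.0]])

# Step 3: correlated standard-normal draws from the multivariate normal.
z = rng.multivariate_normal(mean=[0.0, 0.0], cov=corr, size=n)

# Step 4: the standard normal CDF maps each column to Uniform(0, 1).
u = stats.norm.cdf(z)

# Step 5: inverse CDFs impose the desired lognormal marginals.
T_t = stats.lognorm.ppf(u[:, 0], s=0.4, scale=5.0)         # doubling time, months
cost_speed = stats.lognorm.ppf(u[:, 1], s=0.7, scale=2.0)  # adjustment, months

# The marginals are lognormal, yet the rank correlation stays close to 0.7.
print(stats.spearmanr(T_t, cost_speed)[0])
```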

In this model, T_t, cost_speed, and the sampled growth type (superexponential, subexponential, or exponential; see p_superexponential and p_subexponential above) are treated as correlated variables with a correlation coefficient of 0.7. Similarly, present_prog_multiplier and SC_prog_multiplier are modeled with a 0.7 correlation.

The Process

The following table summarizes the process of each Monte Carlo simulation run.

| Step | Role | Input | Output | Formula |
| --- | --- | --- | --- | --- |
| 0 | Initialize the variable distributions, incorporating correlations via a Gaussian copula. | N/A | N/A | N/A |
| 1 | Calculate the number of doublings needed for the current AI capability (time horizon) to reach the superhuman coder (SC) target capability. | h_SC, current_horizon | n_doublings | n_doublings = np.log2(samples["h_SC"] / h_current), where h_current is a unit conversion of current_horizon |
| 2 | Calculate the total time (in months) for the current AI capability to reach the SC target capability. | n_doublings, T_t | total_time | total_time = n_doublings * T_t |
| 3 | Adjust total time for the growth type (superexponential, subexponential, or exponential). | Growth-type flags, plus the factors se_doubling_decay_fraction and sub_doubling_growth_fraction | total_time | If superexponential, each doubling is faster by se_doubling_decay_fraction; if subexponential, slower by sub_doubling_growth_fraction. A geometric series sum adjusts T_t. |
| 4 | Adjust total time for cost and speed requirements. | cost_speed | total_time | total_time = total_time + cost_speed |
| 5 | Shift the simulation's start date back from the current calendar date to account for the announcement delay. | announcement_delay | time | time = current_year - samples["announcement_delay"][i] / 12 |
| 6 | Convert work time into calendar days by factoring in algorithmic progress (intermediate speedup). | present_prog_multiplier, SC_prog_multiplier, progress_fraction (the proportion of total development effort toward SC completed so far) | v_algorithmic | v_algorithmic = (1 + samples["present_prog_multiplier"][i]) * ((1 + samples["SC_prog_multiplier"][i]) / (1 + samples["present_prog_multiplier"][i])) ** progress_fraction |
| 7 | Model compute effectiveness decreasing after a specified date. | compute_decrease_date | compute_rate | compute_rate = 0.5 if t >= compute_decrease_date else 1.0 |
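
To make the table concrete, here is a simplified, end-to-end sketch of one simulation run. Every number in it is an illustrative assumption of mine (the distribution parameters, the 0.1 decay/growth fractions, the 2029 compute cutoff), the copula correlations from the previous section are dropped, and the actual AI 2027 code integrates progress more carefully; treat this as a reading aid, not a reproduction.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_one(current_year=2025.25):
    # Step 0: sample the uncertain inputs (correlations omitted for brevity).
    h_current = 15.0                                      # current 80% horizon, minutes (assumed)
    h_SC = rng.lognormal(np.log(167 * 60), 1.0)           # target horizon, minutes
    T_t = rng.lognormal(np.log(5.0), 0.4)                 # doubling time, months
    cost_speed = rng.lognormal(np.log(2.0), 0.7)          # months
    announcement_delay = rng.lognormal(np.log(3.0), 0.5)  # months
    growth_type = rng.choice(["super", "sub", "exp"], p=[0.4, 0.1, 0.5])
    present_mult = rng.lognormal(np.log(0.1), 0.5)        # present progress multiplier
    SC_mult = rng.lognormal(np.log(1.0), 0.5)             # progress multiplier at SC

    # Steps 1-2: doublings needed, then baseline work time in months.
    n_doublings = np.log2(h_SC / h_current)
    total_time = n_doublings * T_t

    # Step 3: growth-type adjustment via a geometric series over doublings.
    if growth_type == "super":
        r = 1 - 0.1   # each doubling 10% faster (illustrative decay fraction)
        total_time = T_t * (1 - r**n_doublings) / (1 - r)
    elif growth_type == "sub":
        r = 1 + 0.1   # each doubling 10% slower (illustrative growth fraction)
        total_time = T_t * (r**n_doublings - 1) / (r - 1)

    # Step 4: stringent cost and speed requirements add time.
    total_time += cost_speed

    # Step 5: start the clock earlier, since internal models lead public ones.
    t = current_year - announcement_delay / 12

    # Steps 6-7: integrate day by day; algorithmic progress (v) compresses
    # calendar time, and compute effectiveness halves after a cutoff date.
    compute_decrease_date = 2029.0  # illustrative
    work_done, dt = 0.0, 1 / 365.25
    while work_done < total_time:
        frac = work_done / total_time
        v = (1 + present_mult) * ((1 + SC_mult) / (1 + present_mult)) ** frac
        rate = 0.5 if t >= compute_decrease_date else 1.0
        work_done += v * rate * dt * 12  # dt years -> months of work
        t += dt
    return t  # calendar year at which SC arrives in this trial

arrivals = np.array([simulate_one() for _ in range(2000)])
print("10/50/90th percentile arrival:", np.percentile(arrivals, [10, 50, 90]).round(1))
```

Running many trials and reading off percentiles of the arrival year is exactly the Monte Carlo pattern from the prerequisites section, just with the model's variables in place of toy ones.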
