Appendices to AI Tools for Existential Security

Released on 14th March 2025
Lizka Vaintrob
Owen Cotton-Barratt
We’re grateful to Max Dalton, Will MacAskill, Raymond Douglas, Lukas Finnveden, Tom Davidson, Joe Carlsmith, Vishal Maini, Adam Bales, Andreas Stuhlmüller, Fin Moorhouse, Davidad, Rose Hadshar, Nate Thomas, Toby Ord, Ryan Greenblatt, Eric Drexler, and many others for comments on earlier drafts and conversations that led to this work. Owen's work was supported by the Future of Life Foundation.
This page contains appendices for the piece AI Tools for Existential Security.

Appendix 1: On whether accelerating applications could be bad via speeding up AI progress in general

What about the concern that any acceleration of an AI application might shorten timelines to transformative AI, and thereby reduce the odds of successfully navigating it? (People hold different views on how seriously to take this concern, so it may be difficult to give a single answer that will satisfy all readers.)
In brief:
  • Some applications, like automated negotiation tools, seem to be peripheral to core AI progress
    • So the only impact they might have on timelines would be via:
      • Contributing to general hype and increasing investment (shortening timelines)
      • Pulling talented people onto them and away from other areas of AI (lengthening timelines)
    • In these cases, the net effect on timelines is likely to be very small, so for applications well selected for the benefits of acceleration, it is very likely that the bigger effects would come from something other than their effects on timelines
  • Other applications might, if successful, have nontrivial impacts on AI timelines (e.g. automation of research in one area may have spillover effects that help automation of research in another area)
    • In these cases, when considering whether to accelerate such applications, the ultimate comparison will be between:
      • (i) the expected benefit from having access to this application earlier on
      • (ii) the expected improvement in background variables that would result from the delay to overall AI timelines from not pursuing the application
    • If having a better version of the application available would shorten AI timelines, that shortening should be weighed as a cost.
      • Note: For any particular view on the undesirability of bringing forward AI timelines, there will likely be some circumstances in which the benefits of differential application development outweigh the costs
      • Moreover, these tradeoffs may change over time: as AI applications drive larger slices of what is important in the world, the relative costs of forgoing the benefits of pursuing AI applications will rise

Appendix 2: On dynamics that make it harder to meaningfully counterfactually accelerate an application (and when meaningful acceleration still looks achievable)

As a general rule, it’ll be hard to achieve very strong and long-lasting boosts in the trajectory of an application. Market forces or improving technology might swamp our investments, and the paradigms we’re relying on might evolve quickly enough to make our work futile. But investing significant resources towards accelerating some useful application might still be worthwhile if:
  • 1) The counterfactual acceleration arrives at a crucial time
    • Even if the acceleration washes out quickly, it may be vital if it boosts a key capability we need to better navigate a major challenge
  • 2) The counterfactual acceleration is reasonably persistent
    • An initial acceleration may compound, helping the capability advance more in the future; while this won’t be perpetual, it could widen the window for the intervention to have a long-lasting impact via one of the other routes
Still, both of these “impact cases” rely on believing that we can achieve at least some non-trivial acceleration, and “futility dynamics” may shrink the counterfactual acceleration achievable in different cases, so these concerns are worth discussing.

Swamping & substitution effects (via market forces and improving technology)

Three broad dynamics might make us concerned that our opportunity to make a real difference is limited.
  • There might be little scope to accelerate most applications initially, as market forces fill many of the best opportunities
  • Market forces might compensate for goal-driven investment (weakening compounding effects); if we invest in accelerating some application, we might just be displacing equivalent work that would have been incentivized by the market, so the market will just return the trajectory to equilibrium unless we continue pushing it up
  • Automation might make our investments trivially easy to replicate at a later point when costs have fallen
These dynamics might diminish the possible counterfactual impact, but shouldn’t be viewed as deal-breakers:
  • They might only have an effect slowly (e.g. perhaps the market compensates eventually, but takes a year or two to adjust)
  • And we might be able to target areas where these dynamics are significantly weaker:
    • We can invest in applications that are poorly incentivized by the market
    • We can invest in areas or interventions that are likely to be automated more slowly, like work that involves human input or experiments on biological time-scales
    • And we can leverage automation or market forces by demonstrating the value of some applications, shifting the resulting incentives
[Figure: Swamping]

Irrelevance of our work (via nearsightedness or rapidly evolving foundations)

We might also worry that our work is just irrelevant by the time it could start producing results. AI systems are improving and changing fairly rapidly. Some interventions we could invest in might quickly become irrelevant, either by no longer working with new systems or by becoming unnecessary. (For example, previous generations of automated translation tools were obsoleted by deep learning approaches.)
But interventions might still look highly promising if we properly account for this dynamic. For instance, this is a reason to aim for shorter-horizon accelerations (since predicting the near future is easier, and the paradigms will probably shift less), or to pursue more general interventions like the curation of high-quality datasets (which depend on more robust paradigms) over the development of model-specific scaffolding improvements.
[Figure: Irrelevance]

Appendix 3: How does this agenda relate to existing concepts, like differential technological development and def/acc?

Def/acc and differential technological development

Trying to accelerate some AI applications in the pursuit of existential security could be seen in terms of existing concepts. In particular, we might interpret it as:
  1. def/acc, specialized to AI (and more focused on existential security instead of broadly defending society)
  2. Differential technological development (DTD),1 specialized to accelerating (and not slowing down) safety-increasing AI applications
We can visualize the relationship between def/acc, “differential AI development” (AI-focused DTD), and our focus area:
[Figure: Areas related to AI tools for existential security]

More on “Differential AI development” (DAID)

We could see this project as the side of “differential AI development” (DAID), i.e. AI-specific differential technological development, that focuses specifically on accelerating beneficial applications of AI rather than on shaping the underlying technology. We can call this “differential application development.”
Other sides of DAID also seem quite promising.

Appendix 4: “Differential paradigm development” (a different side of DAID)

The “differential paradigm development” (DPD) side of DAID involves trying to help safer-seeming AI research paradigms succeed (e.g. preferring scaffolded LLM agents to end-to-end trained agents, for transparency reasons). It could also involve trying to shape other AI-related foundations — like the way key applications are structured and integrated with each other, or central institutions underlying AI development.
The line between differential application development and differential paradigm development is somewhat blurry. In some cases an application of one generation of technology becomes a paradigm that subsequent generations are built on. In other cases, the value of developing an application may be that it enables the boosting of a particular paradigm. But typically there will be a difference. Here we’ll quickly survey how we might think about DPD.
DPD may be high leverage because:
  • There’s a natural opportunity for path-dependency — what people build in the future depends on the foundations available to build on.
    • So the possibility for long-lasting change is substantial
  • Some paradigms look much safer / more robust than others
On the other hand, there are some reasons for skepticism about DPD:
  • There may be some degree of technological inevitability about which paradigms win (although we don’t think this inevitability is absolute)
  • Dominant paradigms might often be developed by deep-pocketed actors, so it might be very expensive to seriously compete by pushing rival paradigms
    • Perhaps the realistic interventions are more likely to be “make sure that promising early-stage paradigms get enough investment to have a serious shot” rather than “keep ploughing resources into things that would have taken off already if they were going to”
  • Pushing forward certain AI paradigms might have a greater effect on overall AI progress than focusing on specific AI applications does
Which paradigms might we focus on?
As a general rule, one paradigm might seem safer than others if:
  1. It is inherently safer or more robust
    • e.g. RL selecting for behaviour seems more likely to introduce subversive goals than something which doesn’t exert direct selection pressure on top-level planning
  2. It facilitates certain safety techniques
    • e.g. Scaffolded LLM agents naturally have a high level of transparency, which could help in various ways
  3. It facilitates some generally beneficial application
    • e.g. Paradigms that permit better data provenance may make it easier and more natural to trace the provenance of claims, which could support applications that help social epistemics and truth-seeking
The technological landscape may render some paradigms non-viable. It may be important to assess the feasibility of a paradigm so as not to waste effort on something that is ultimately a non-starter.
What can we do to differentially accelerate a paradigm?
  • Just work on the preferred paradigm
    • Accelerating the preferred paradigm to something more advanced, and showing people how to do impressive things within it, makes it more attractive for others to work within
      • If there is some path dependency in what route civilization takes, this could precipitate a trajectory shift — the favoured paradigm builds momentum until more and more people use it rather than alternatives
    • Working on a paradigm could mean building out systems within that paradigm, or could mean developing metrics which help to support it
  • Social interventions to make the preferred paradigm more popular
    • Among researchers
    • Among non-researchers (perhaps getting people to ask for some feature that the paradigm provides especially easily)
  • Changing the incentive landscape
    • Subsidies for the preferred paradigm
    • Taxes or increased regulatory burden for the dispreferred paradigm(s)

Appendix 5: Why we focus on accelerating risk-reducing applications rather than slowing down risky applications

Focusing on differential development could in principle mean speeding up the good things, or slowing down the bad things.
There are two big reasons why speeding things up often looks like the better plan:
  1. It’s more cooperative with the rest of the world
  2. It’s easier to implement
(Of course, these are related.)

Speeding things up is more cooperative

There are a lot of people in the world trying to do a lot of different things. Many are earnestly trying to accelerate all kinds of powerful technology, and there are often real costs to delaying those technologies. This means that working to slow things down is inherently somewhat adversarial — with all the downsides that entails.2
In contrast, pretty much everyone will be on board with speeding up beneficial applications (even if they have different priorities). This work could attract much broader coalitions, and build more momentum, without the same friction.

Speeding things up is easier

This might be surprising. Slowing things down seems in some sense simpler — the world just needs to not do anything! But technological progress is more unilateralist than consensus-based. If anyone makes the breakthrough, the breakthrough is made. And in a market economy where there are large financial prizes for success, it’s hard to seriously slow things down by local intervention — if you discourage certain investors or developers from pursuing an innovation, others may step in to fill the gap.3 AI developers seem strongly incentivized to push forward.
In contrast, the unilateralist nature of things works to your advantage if you’re trying to speed things up. You don’t need to build a consensus that it would be important to do X — you can just go and do it.4

Footnotes
