# The Wealth of Stagnation: Falling Growth, Rising Valuations

**Authors:** James D. Paron
**Date:** January 19, 2026
**Affiliation:** Stanford University Graduate School of Business
**PDF (definitive version):** https://jamesparon.github.io/papers/wealth-of-stagnation/paper.pdf

> Full text below is converted from the paper's LaTeX source, with mathematics in TeX math mode. The linked PDF is the definitive version.

## Abstract

Over the last half-century, economic growth stagnated but stock-market wealth boomed. I present evidence that declining innovation productivity reconciles these trends. At the macro level, I document that R&D spending has fallen relative to value, while M&A spending has doubled relative to R&D. At the micro level, most of the increase in aggregate valuation ratios is explained by a reallocation of sales shares toward high-valuation firms. Using a Schumpeterian model of growth and asset prices, I find that declining innovation productivity explains these facts. When innovation productivity falls, R&D falls and M&A rises. This concentrates production into the hands of the most efficient (high-valuation) incumbents, causing aggregate value to boom. Quantitatively, this explains most of the decline in growth and the rise in valuations. It also helps explain other salient trends, including declining firm entry, rising concentration, and falling interest rates. While stock-market wealth boomed, the present value of consumption (consumer welfare) stagnated with output.

## Full text

Firms have a greater incentive to innovate when the present value of new profits is high. Over the last fifty years, this relationship seems to have broken down. The value of the stock market has far outpaced output, suggesting that incentives to innovate have reached historic highs. And yet, over the same period, aggregate economic growth has fallen. How can we reconcile stagnating economic growth with a booming stock market?

I present evidence that one structural shift---declining innovation productivity---explains both of these trends. Intuitively, when innovation gets harder, firms invest less in R&D and economic growth slows. This accords with evidence from the growth literature that new ideas have gotten harder to find (Gordon 2016; Bloom et al. 2020).

Declining research productivity also causes the valuation of the stock market to rise relative to output. This rise comes from a reallocation of market share toward the most productive firms in the economy. The intuition is as follows. More productive firms earn higher profits, grow faster, and therefore have higher valuations relative to sales and cashflows. They also grow through M&A, which creates value by reallocating ideas to more efficient producers. As research productivity falls, high-productivity incumbents spend relatively more on M&A than R&D, concentrating production in their hands. Concurrently, the risk of creative destruction by entrants falls, so these incumbents become entrenched. The reallocation-driven growth of incumbents comes at the expense of labor income and new entrants, so stock-market wealth booms as the rest of the economy---including welfare---stagnates.

I present evidence of this mechanism from two sources. First, using firm-level micro data, I lay out empirical evidence of declining innovation productivity and a reallocation of sales toward high-valuation firms. Second, using a Schumpeterian growth model, I estimate the decline in innovation productivity and show that the estimated decline explains a substantial fraction of the decline in growth and the rise in valuations over the last fifty years.

I first document empirical evidence that firms have shifted away from innovation and toward reallocation. Two stylized facts point to this shift. First, over the last fifty years, aggregate R&D expenditures have fallen significantly relative to market value. R&D spending has not kept pace with the rising present value of profits, consistent with declining innovation productivity driving a wedge between these series.[^3] Second, total M&A spending has risen substantially relative to R&D spending. Over time, firms have been growing less and less through innovation and increasingly through reallocation.

I next show that most of the secular rise in the valuation of the aggregate market is explained by a reallocation of market share toward high-valuation firms. I arrive at this conclusion by decomposing changes in aggregate valuation ratios (value relative to sales or cashflows) into two components: within-firm changes, holding fixed the composition of the market (e.g., sales shares); and compositional changes, holding fixed firm-level valuations. Over the last fifty years, firm-level valuations have been relatively stable.[^4] The long-run increase in aggregate value is instead a consequence of firms with high valuation ratios gaining market share. Importantly, high-valuation firms have gained market share not only in terms of value, but also in terms of *sales*, so this is a reallocation of real economic activity.[^5] This new finding poses a challenge to representative-firm explanations of rising valuations, which necessarily imply that all of the valuation increase came from within firms.

I investigate and rule out alternative explanations for these stylized facts. The first alternative is that R&D has been outsourced to private firms, which are increasingly being acquired by public firms. I find no evidence of an increase in private-firm R&D intensity or of an increase in M&A spending by public acquirers on private targets. A second alternative is that the reallocation between firms was caused by shifting sectoral composition---for example, tech displacing construction. I find that almost all of the reallocation was not between sectors, but between firms within sectors. In fact, in support of the main mechanism, I find that those sectors with the largest decline in R&D-to-value also saw the largest increase in M&A and valuation ratios and the most reallocation to high-valuation firms.

I build a Schumpeterian growth model that incorporates the tradeoff between R&D and M&A and use it to study valuations. Firms compete in product markets and grow by inventing new ideas (R&D), in the spirit of Klette and Kortum (2004). I introduce three ingredients into this framework. The first new ingredient is the buying and selling of existing ideas (M&A). M&A makes possible the exchange of intellectual property between firms and leads to a tradeoff between innovation and acquisition. The second new ingredient is heterogeneous firm-level productivity: high-productivity firms can produce goods at a lower marginal cost, giving them a competitive advantage in product markets. In equilibrium, high-productivity firms buy low-productivity firms via M&A. The third ingredient is population growth and new-variety creation (Peters and Walsh 2022). This feature endogenizes the degree of creative destruction in the economy, which turns out to be critical for quantifying asset pricing implications, as firms' valuations rise when the risk of creative destruction falls.[^6]

The assumption of heterogeneous firm-level productivity yields two equilibrium results that are key for understanding reallocation. First, high-productivity firms have higher valuation ratios. This is because high-productivity firms are more profitable---and, specifically, have more profitable growth opportunities---so they choose to invest more in R&D and arise as high-growth firms. Second, high-productivity firms are the only firms that invest in M&A. Consequently, M&A transfers ideas from less productive sellers to more productive buyers, and the surplus from this exchange creates value for the aggregate market.

In the model, a decline in research productivity causes M&A spending to rise relative to R&D spending, reallocating market share toward high-valuation firms. Aggregate M&A-to-R&D rises, mainly because all firms reduce R&D, and also because high-productivity firms shift skilled labor from R&D to M&A. Consequently, high-productivity firms start to buy up low-productivity firms faster than low-productivity firms grow by inventing new ideas. This reallocation concentrates production into the hands of high-valuation firms.

I use the model to quantify how much of the long-run historical trends are explained by declining innovation productivity. I proceed in three steps. First, I estimate the model using micro data from before the 1980s, assuming historically balanced growth. Second, I estimate the path of innovation productivity from the time series of R&D-to-value. And third, I compare the model's predicted transition path for economic trends---including growth and valuations---with the actual trends in the data.

I find that the decline in R&D-to-value since 1975 implies an estimated 47% decline in innovation productivity---in other words, innovation productivity fell almost in half. The intuition for identification is as follows. Firms spend more on R&D if either innovation productivity is high (ideas are easy to find) or the present value of successful innovation is high (ideas are valuable). Dividing aggregate R&D by value identifies long-run changes in research productivity because it nets out the effect of changing valuations. In the model, any structural change in the economy that changes R&D through the present value of ideas---for example, falling discount rates or rising market power---will also change the value of the stock market roughly one-for-one, leaving R&D-to-value relatively unchanged. In contrast, a decline in innovation productivity causes R&D-to-value to fall. R&D-to-value therefore provides good identification for the level of innovation productivity.

The estimated decline in innovation productivity explains most of the fall in growth and the rise in valuations. To see this, I feed the estimates into the model and study the economy's transition path. The only exogenous change along the transition is the fall in innovation productivity; all other parameters are held fixed. Per-capita output growth falls by 1.1 percentage points in the model, virtually the same as in the data. The mechanism also explains the rise in aggregate M&A-to-R&D.[^7] Finally, the transition explains most of the rise in aggregate valuation ratios. The level of the value-to-sales ratio rises by 83% over fifty years, compared with the roughly 85% rise in the data.[^8] As in the data, virtually all of this increase comes from a similar increase in value-to-cashflows. Notably, the decline in innovation productivity does not explain the large decline and recovery of valuations around 1980, which coincided with a large rise and fall in discount rates (namely, real interest rates).[^9]

As in the data, the model implies that most of the long-run rise in aggregate valuation ratios came from a reallocation of market share toward high-valuation firms. Firm-level valuations are, on average, stable. Why does the model imply stable firm-level valuations? While declining innovation does reduce firm-level growth rates and, in turn, valuations, this is offset by three forces. First, equilibrium discount rates fall with aggregate growth. Second, the shift to M&A enables high-productivity firms to keep growing without innovating. And third, because competitors and entrants are also doing less R&D, the risk of creative destruction falls, decreasing the rate of entry and raising the expected growth of incumbents.

Declining innovation productivity can also help explain trends in other macro and micro series. I highlight four trends of interest. First, the model explains all of the decline in the rate of new-firm entry, which in the data fell from about 12% to about 8% since 1980. The reason is that entrepreneurs also have a harder time creating new firms. This results in not only less entry, but also less creative destruction. Second, and closely related to this fact, the model implies an increase in firm concentration. With less entry and more acquisitions of small, low-productivity firms, production becomes concentrated into high-productivity incumbents, which become large and entrenched. Third, model-implied interest rates fall with the rate of economic growth. As already stated, this cannot account for the dramatic rise and fall of rates around 1980, but it does account for much of the longer-run decline in real rates since the 1960s.[^10] And fourth, the model naturally explains the puzzling decline in physical capital investment (Farhi and Gourio 2018; Miller et al. 2026). Intuitively, capital exists to scale up production of new ideas, so fewer new ideas means less need for capital.[^11]

Despite rising stock-market wealth, households are still worse off.[^12] Consumer welfare depends on the present value of consumption. One might therefore infer from the stock-market boom that welfare has risen, because present values are high. However, in a disaggregated economy, the stock-market claim is not the same as the consumption claim. The stock market is only a claim to current incumbents' profits. The consumption claim, in contrast, also pays out the factor income of workers and the future profits of new entrants. Indeed, stock-market wealth is a very small fraction of total wealth.[^13] This distinction is crucial for welfare: all of the gains in the stock market came from a reallocation to incumbents, at the expense of wages and future firms, so the boom in stock-market value did not imply any such boom in the present value of consumption. The value of the consumption claim stagnated with output, implying a disconnect between the stock market and consumer welfare.

#### Related literature

This paper advances and bridges the literatures on economic growth, asset pricing, and firm dynamics.

First, I present new, well-identified evidence of declining innovation productivity within firms. To be sure, this is not the first paper to argue that new ideas are getting harder to find. Jones (1995) points out that stable economic growth, coupled with an exponential increase in the number of researchers, must mean innovation productivity per researcher is falling. In support of this intuition, Jones (2009, 2010) documents that, over time, innovation has required larger teams of more educated, older researchers. Gordon (2016) presents evidence that ideas have been getting harder to find in the U.S. for decades, and that this explains the recent growth slowdown. Bloom et al. (2020) compile micro-level evidence of declining innovation productivity, demonstrating that the macro-level argument of Jones (1995) is also true within product markets and firms.

While all of these studies are based on the relationship between research inputs and ex post growth, this paper identifies the decline in innovation productivity from a different source: R&D spending and market valuations. Because R&D and value are both forward-looking and contemporaneous, they can be used to estimate the historical time series of innovation productivity at a relatively high frequency.[^14] Moreover, the use of a micro-founded endogenous-growth framework allows me to quantify the effect of this decline on not only economic growth, but also a host of other micro and macro moments.

My findings connect the secular decline in growth with the literature on the secular rise in the value of the stock market (Farhi and Gourio 2018; Miller et al. 2026; Greenwald et al. 2025; Atkeson et al. 2024; Eggertsson et al. 2021; Cho et al. 2024). I make four contributions. First (and foremost), I present new micro-level evidence and a new mechanism to explain this rise. Second, while past studies have taken growth to be an exogenous input, I endogenize both growth and valuations to study their general-equilibrium relationship. Third, I depart from the representative-firm assumption in these papers and instead model the full distribution of individual firms. Because reallocation between firms is key to explaining the rise in valuations, a representative-firm model cannot possibly explain these dimensions of the data. And fourth, because I model individual firms, I distinguish the stock market from the consumption claim. This distinction turns out to have major welfare implications: the reallocation-driven growth in the stock market came at the expense of wages and future firms, implying that the present value of consumption (welfare) stagnated with output.[^15]

A separate asset-pricing literature studies the implications of firm heterogeneity and innovation for the equity risk premium (Gârleanu et al. 2012; Corhay et al. 2020; Loualiche Forthcoming) and aggregate return volatility (Cochrane et al. 2008; Martin 2013). While this literature has so far focused on high-frequency volatility and risk premia, I show that firm heterogeneity is important even for determining the very-long-run *level* of market valuations. Indeed, all of the findings of this paper remain true even in an economy without any systematic risks or non-standard preferences.

Finally, this work builds on and contributes to the literature on firm dynamics and endogenous growth, the foundations of which are laid in Romer (1990), Grossman and Helpman (1991), and Aghion and Howitt (1992). My framework builds primarily on the Schumpeterian growth model of Klette and Kortum (2004) and Lentz and Mortensen (2008). This class of endogenous-growth model has been extended to study numerous aspects of the economy, including markups and misallocation (Peters 2020) and optimal innovation policy (Acemoglu et al. 2018). My model studies a new tradeoff within firms---R&D versus M&A---in an economy with ex ante heterogeneous firm-level productivity (Hopenhayn 1992; Luttmer 2011). I use the model to study the economic transition resulting from a new structural change: new ideas getting harder to find.

Several firm-dynamics papers have also studied long-run structural change in the economy, with a particular interest in falling growth and rising concentration. Peters and Walsh (2022) show how declining population growth reduces firm entry and increases concentration.[^16] Aghion et al. (2023) and De Ridder (2024) propose that technological changes enabled some firms to scale up more easily, leading to rising concentration.[^17]$^{,}$[^18] Akcigit and Ates (2023) and Olmstead-Rumsey (2022) argue that a decline in knowledge diffusion between leaders and laggards explains these trends.[^19] This literature has by and large not considered asset prices. A notable exception is Liu et al. (2022), who argue that a decline in discount rates (i.e., a rise in firm-level valuations) allows leaders to escape competition by laggards, until eventually both firms stop innovating.

These alternative mechanisms may be important, and indeed complementary, over this period; however, they are insufficient to reconcile all of the motivating empirical facts. In the model estimation, I show that no other mechanism can explain the decline in R&D-to-value, because, ultimately, they all operate by suppressing the present value of growth opportunities. Moreover, along the transition, most of these mechanisms counterfactually imply a boom in both economic growth and firm-level valuations, the latter of which is at odds with my stock-market decomposition. There is also a key conceptual difference between these stories and mine. While others have posited that a rise in concentration and market power caused a decline in growth, I argue the reverse: innovation productivity fell, and this concentrated production into the firms with the most market power.

#### Roadmap

Section 1 lays out the empirical facts. I describe the model in Section 2 and its solution in Section 3. Section 4 reports the estimation results. Section 5 presents the main results and Section 6 the welfare implications. Finally, Section 7 discusses model extensions. Proofs and additional details are in the Online Appendix.

# Empirical evidence 

This section documents empirical evidence of declining innovation productivity and reallocation of market share to high-valuation firms. I first report and discuss trends in aggregate value, R&D, and M&A. I then examine valuations in firm-level micro data. After laying out these facts, I investigate and rule out alternative explanations.

The main data sources are the annual sample of U.S. CRSP/Compustat firms and the history of M&A transactions from SDC Platinum. Note that 1975 is the first year in which firms were required to report R&D spending. The baseline measure of firm value is market equity plus book debt, less cash; cashflows are the sum of dividend and interest payments. For more details about data sources, selection, and variable construction, see Appendix 9.
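
As a minimal sketch of these definitions, the following Python snippet builds firm value and cashflows from their components and then forms aggregate ratios as ratios of sums. The field names are hypothetical labels rather than actual CRSP/Compustat mnemonics, and the numbers are synthetic and not calibrated to the data.

```python
from dataclasses import dataclass


@dataclass
class FirmYear:
    # Field names are hypothetical labels, not actual CRSP/Compustat mnemonics.
    year: int
    market_equity: float
    book_debt: float
    cash: float
    dividends: float
    interest: float
    sales: float
    rd: float

    @property
    def value(self) -> float:
        # Baseline firm value: market equity plus book debt, less cash.
        return self.market_equity + self.book_debt - self.cash

    @property
    def cashflow(self) -> float:
        # Cashflows: dividend payments plus interest payments.
        return self.dividends + self.interest


def aggregate_ratios(rows, year):
    """Ratios of aggregate sums (not averages of firm-level ratios) for one year."""
    sample = [r for r in rows if r.year == year]
    value = sum(r.value for r in sample)
    sales = sum(r.sales for r in sample)
    return {
        "value_to_sales": value / sales,
        "rd_to_value": sum(r.rd for r in sample) / value,
        "cashflow_to_value": sum(r.cashflow for r in sample) / value,
    }


# Synthetic numbers, for illustration only.
rows = [
    FirmYear(1975, 80.0, 30.0, 10.0, 4.0, 2.0, 100.0, 5.0),
    FirmYear(1975, 40.0, 20.0, 10.0, 2.0, 1.0, 50.0, 2.0),
    FirmYear(2020, 300.0, 60.0, 40.0, 6.0, 3.0, 200.0, 8.0),
    FirmYear(2020, 90.0, 30.0, 20.0, 3.0, 1.0, 100.0, 2.0),
]
ratios_1975 = aggregate_ratios(rows, 1975)
ratios_2020 = aggregate_ratios(rows, 2020)
print(ratios_1975)
print(ratios_2020)
```

Taking ratios of sums means that large firms carry more weight in the aggregate.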

<figure id="fig:aggxrd" data-latex-placement="t">
<div class="center">
<p></p>
<table>
<tbody>
<tr>
<td style="text-align: center;"><strong>A. Aggregate value-to-sales</strong></td>
<td style="text-align: center;"><strong>B. Aggregate R&amp;D-to-value</strong></td>
</tr>
</tbody>
</table>
<img src="paperplots/valsale_xrdval" style="width:100.0%" />
</div>
<figcaption>The figure plots the ratio of aggregate market value to aggregate sales (Panel A) and the ratio of aggregate R&amp;D spending to aggregate market value (Panel B) for CRSP/Compustat firms from 1975–2020. Value is defined as market equity plus book debt, less cash.</figcaption>
</figure>

## Diverging trends in aggregate value and R&D 

Panel A of Figure 1 shows that, over the last fifty years, aggregate value has boomed relative to aggregate sales (gross output). The fact that the present value of profits is high relative to output suggests that the expected gains from successful innovation are high, so firms should have a greater incentive to do R&D. Panel B shows that this intuition has been far from true: aggregate R&D spending has fallen substantially relative to aggregate value. Something has driven a wedge between R&D and the present value of profits.

In Appendix 10.1.1, I show that a decline of equal magnitude can be seen in the patent value measure of Kogan et al. (2017). Thus, the decline shows up in innovation outputs as well. I discuss how their measure relates to R&D, and why R&D is my preferred measure.

## The rise of M&A over R&D 

Figure 2 shows that M&A spending has risen substantially relative to R&D spending. The figure plots the ratio of total M&A spending to total R&D spending from 1980--2020.[^20] M&A spending is defined as the total dollar amount paid by the acquirer in the deal. I consider two definitions of M&A, mergers (acquisitions of entire firms) and mergers plus acquisitions of assets, both of which show similar trends.[^21]

<figure id="fig:aggmaxrd" data-latex-placement="t">
<div class="center">
<p></p>
<table>
<tbody>
<tr>
<td style="text-align: center;"><img src="paperplots/ma_xrd_sm" style="width:70.0%" alt="image" /></td>
<td style="text-align: center;"></td>
</tr>
</tbody>
</table>
</div>
<figcaption>The figure plots the aggregate ratio of M&amp;A spending to R&amp;D spending for public firms in Compustat and SDC Platinum from 1980–2020. Each point is a two-sided moving average with windows of <span class="math inline">±</span> 5 years. M&amp;A is defined as the dollar amount spent by acquirers on either mergers (transactions in which all of the target is acquired) or mergers plus acquisitions of assets.</figcaption>
</figure>

The M&A-to-R&D ratio effectively doubled over this period. The fact that both M&A and R&D are measured in dollar values (as opposed to, say, number of deals or number of patents) is key for this comparison, because higher rates of spending will reflect expectations of larger value gains to firms. The doubling of M&A-to-R&D over time therefore suggests that firms are growing less and less through innovation and increasingly through reallocation.
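
The smoothing described in the caption of Figure 2 can be sketched as follows. This is an illustration under stated assumptions: the annual ratio is computed first and then smoothed (the source does not say whether the ratio or its components are averaged), the window is truncated at the sample edges, and the spending series are synthetic.

```python
def centered_moving_average(series, half_window=5):
    """Two-sided moving average over years t - half_window .. t + half_window.

    `series` maps year -> value. Near the sample edges the window is truncated
    to the years that are available (an assumption about edge handling).
    """
    smoothed = {}
    for year in sorted(series):
        window = [
            series[y]
            for y in range(year - half_window, year + half_window + 1)
            if y in series
        ]
        smoothed[year] = sum(window) / len(window)
    return smoothed


# Synthetic annual totals in arbitrary units, for illustration only.
years = range(1980, 2021)
rd_spending = {y: 100.0 + 4.0 * (y - 1980) for y in years}
ma_spending = {
    y: 50.0 + 6.0 * (y - 1980) + (30.0 if y in (1998, 1999, 2000) else 0.0)
    for y in years
}

# Annual ratio of aggregate M&A spending to aggregate R&D spending, then smoothed.
raw_ratio = {y: ma_spending[y] / rd_spending[y] for y in years}
smooth_ratio = centered_moving_average(raw_ratio, half_window=5)
print(round(smooth_ratio[1980], 3), round(smooth_ratio[2020], 3))
```

A window of plus or minus five years averages over short-lived merger waves, so the smoothed series emphasizes the long-run trend.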

## Reallocation accounts for the booming stock market 

Reallocation could increase the value of the stock market if it means that high-valuation firms are growing larger. Consider the aggregate value-to-sales ratio $V_t/Y_t$. This aggregate ratio can be written as a sales-weighted average of firm-level value-to-sales ratios $V_{it}/p_{it}Y_{it}$: $$
\frac{V_t}{Y_t} = \sum_{i=1}^{M_t}\frac{p_{it}Y_{it}}{Y_t}\times\frac{V_{it}}{p_{it}Y_{it}}.
$$ The aggregate ratio could rise either because firm-level value-to-sales ratios rise, or because the sales shares of high-valuation firms increase (reallocation).
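
To see the reallocation channel in isolation, consider a hypothetical two-firm example (the numbers are illustrative and not taken from the data). Firm A has a value-to-sales ratio of 1, firm B has a ratio of 3, and neither ratio changes. If firm B's sales share rises from one half to three quarters, the aggregate ratio rises from 2 to 2.5 even though no firm-level valuation has moved: $$
\frac{V_{t-1}}{Y_{t-1}} = \tfrac{1}{2}\times 1 + \tfrac{1}{2}\times 3 = 2,
\qquad
\frac{V_t}{Y_t} = \tfrac{1}{4}\times 1 + \tfrac{3}{4}\times 3 = 2.5.
$$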

I decompose these two channels using the following accounting identity. Let $\mathcal{I}$ denote the set of firms (ignoring entry and exit for now). The change in the aggregate ratio $V_t/Y_t$ from $t-1$ to $t$ can be written[^22] $$
\Delta\frac{V_t}{Y_t} = \underbrace{\sum_{i\in\mathcal{I}} \biggl(\overline{\frac{p_{it}Y_{it}}{Y_t}}\biggr) \times \biggl(\Delta\frac{V_{it}}{p_{it}Y_{it}}\biggr)}_{\textrm{within-firm change}} + \underbrace{\sum_{i\in\mathcal{I}}\biggl(\Delta\frac{p_{it}Y_{it}}{Y_t}\biggr) \times \biggl(\overline{\frac{V_{it}}{p_{it}Y_{it}}}\biggr)}_{\textrm{compositional change}}.
$$ The within-firm change measures the effect of changes in firm-level valuations, holding fixed the sales shares of firms; the compositional change measures the effect of changing sales shares, holding fixed valuations. The operator $(\overline{\cdot})$ denotes an index averaging over the two periods: for example, $$
\biggl(\overline{\frac{V_{it}}{p_{it}Y_{it}}}\biggr) \equiv \frac{1}{2}\left(\frac{V_{it}}{p_{it}Y_{it}} + \frac{V_{i,t-1}}{p_{i,t-1}Y_{i,t-1}}\right).
$$ Using these indices---as opposed to leads or lags---is critical, because changes in sales shares and changes in value-to-sales are mechanically (negatively) correlated in the cross-section. The indices neutralize this correlation. Appendix 10.2 lays out this intuition formally.
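
The identity rests on the fact that, for any sales share $s$ and valuation ratio $v$, $s_t v_t - s_{t-1}v_{t-1} = \bar{s}\,\Delta v + \Delta s\,\bar{v}$ when bars denote two-period averages. The Python sketch below is an illustration on synthetic data, not the paper's code. It computes both terms for a fixed set of firms and also reports the cross term $\sum_i \Delta s_i \Delta v_i$, which is what would be left unallocated if lagged weights were used instead.

```python
def decompose(sales_prev, value_prev, sales_curr, value_curr):
    """Split the change in aggregate value-to-sales into within-firm and
    compositional terms using two-period average (midpoint) weights.

    Each argument is a list with one entry per firm; the same firms appear in
    both periods (no entry or exit).
    """
    total_prev, total_curr = sum(sales_prev), sum(sales_curr)
    within = compositional = interaction = 0.0
    for s0, v0, s1, v1 in zip(sales_prev, value_prev, sales_curr, value_curr):
        share0, share1 = s0 / total_prev, s1 / total_curr  # sales shares
        ratio0, ratio1 = v0 / s0, v1 / s1                  # firm value-to-sales
        within += 0.5 * (share0 + share1) * (ratio1 - ratio0)
        compositional += (share1 - share0) * 0.5 * (ratio0 + ratio1)
        # Cross term that is left over if lagged (t-1) weights are used instead.
        interaction += (share1 - share0) * (ratio1 - ratio0)
    total = sum(value_curr) / total_curr - sum(value_prev) / total_prev
    return {
        "total": total,
        "within": within,
        "compositional": compositional,
        "interaction": interaction,
    }


# Synthetic two-firm example (illustrative numbers only).
result = decompose(
    sales_prev=[100.0, 100.0], value_prev=[100.0, 300.0],
    sales_curr=[100.0, 300.0], value_curr=[110.0, 840.0],
)
print(result)
```

With midpoint weights the within-firm and compositional terms sum exactly to the total change, because each term absorbs half of the cross term.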

Because there is entry and exit in the data, we must make two small adjustments to the decomposition above. First, for a continuer (a firm operating at both $t-1$ and $t$), the sales share is defined as the share of total *continuer* sales. Second, for an entering firm (operating at $t$ and not $t-1$), the time-$(t-1)$ sales share is set equal to the time-$t$ share of aggregate sales, while the time-$(t-1)$ value-to-sales ratio is set equal to the time-$t$ value-to-sales ratio of all continuers. Thus, for entrants, the "within-firm change" measures the extent to which the firm has a higher or lower valuation than the set of continuers. Likewise, for an exiting firm, the time-$t$ sales share equals the time-$(t-1)$ share and the time-$t$ value-to-sales ratio equals the time-$(t-1)$ ratio of all continuers. Empirically, the effects of entry and exit are very small---I show this in Appendix 10.3, which separates out these terms. Nevertheless, these adjustments make it so that, under the null hypothesis of a stationary firm distribution (and a constant aggregate value-to-sales ratio), both the within-firm and compositional changes will equal zero. See Appendix 10.2 for a detailed explanation and examples.[^23]

<figure id="fig:decomp_val" data-latex-placement="t">
<div class="center">
<p></p>
<table>
<tbody>
<tr>
<td style="text-align: center;"><strong>A. Value-to-sales</strong></td>
<td style="text-align: center;"><strong>B. Cashflow-to-value</strong></td>
</tr>
</tbody>
</table>
<img src="paperplots/decomp_valsale_cfval_combined" style="width:105.0%" />
</div>
<figcaption>The figure plots the decompositions of changes in aggregate valuations into within-firm and compositional components. The decomposition for value-to-sales is defined in (<a href="#eq:valsale_decomp" data-reference-type="ref" data-reference="eq:valsale_decomp">[eq:valsale_decomp]</a>) and the decomposition for cashflow-to-value in (<a href="#eq:cfval_decomp" data-reference-type="ref" data-reference="eq:cfval_decomp">[eq:cfval_decomp]</a>). See the main text for an explanation. Value is defined as market equity plus book debt, less cash; and cashflows are defined as dividends plus interest payments.</figcaption>
</figure>

Panel A of Figure 3 plots the cumulative change in aggregate value-to-sales since 1970 alongside the cumulative changes attributable to each of the two channels in the value-to-sales decomposition above. While most high-frequency changes in aggregate value-to-sales come from within-firm changes, the cumulative within-firm change has been small. Most of the long-run increase in the aggregate is instead explained by compositional change. In other words, there has been a sizable reallocation of revenues toward high-valuation firms. Reallocation explains the booming market.

Notably, the large and positive compositional change in the late 1990s coincided with the M&A wave (Figure 2). This is consistent with existing evidence that acquiring firms have significantly higher valuation ratios than target firms and that the merged value of the target and acquirer is significantly higher than their values before the merger (Betton et al. 2008). See Appendix 10.1.3 for a detailed survey of this evidence.

The rise in aggregate value-to-sales is not simply about higher profits; aggregate value has also risen relative to *cashflows*. Panel B of Figure 3 plots the analogous decomposition for aggregate cashflow-to-value $D_t/V_t$: $$
\Delta\frac{D_t}{V_t} = \underbrace{\sum_{i\in\mathcal{I}_{t-1,t}} \biggl(\overline{\frac{V_{it}}{V_t}}\biggr) \times \biggl(\Delta\frac{D_{it}}{V_{it}}\biggr) }_{\textrm{within-firm change}} + \underbrace{\sum_{i\in\mathcal{I}_{t-1,t}}\biggl(\Delta\frac{V_{it}}{V_t}\biggr) \times \biggl(\overline{\frac{D_{it}}{V_{it}}}\biggr)}_{\textrm{compositional change}},
$$ where $\mathcal{I}_{t-1,t}$ is the set of firms operating at $t-1$ or $t$. I decompose cashflow-to-value, instead of value-to-cashflow, simply because firm-level value-to-cashflow is undefined in periods with zero cashflows. The finding is the same: within firms, valuations have been relatively stable, while high-valuation (low cashflow-to-value) firms have gained substantial market share.

Appendix 10.3 shows that these decomposition results are robust to a number of possible concerns. They continue to hold if we separate out entry and exit (which are small), if we consider only equity values and dividends, and if we use alternative definitions of cashflows that include net share repurchases. Moreover, the appendix also studies the implications of the above facts for R&D-to-sales, a common measure of innovation in the literature. High-valuation firms also have high R&D-to-sales ratios, and there has been a corresponding reallocation of sales toward high-R&D-to-sales firms, which has increased the aggregate ratio.

## Examining alternative explanations 

The main hypothesis of this paper is that declining innovation productivity can explain the above micro and macro trends. Before evaluating this hypothesis, I consider two alternative explanations for these trends: selection among public firms and changing composition among sectors. Neither one of these alternatives explains the empirical evidence above.

#### Has R&D been outsourced to private firms?

One possible explanation of declining R&D is that R&D has been outsourced to private firms, which are then acquired by public firms. Under this view, the decline in R&D and the rise in M&A are driven simply by selection into public markets. Although the lack of valuation data for private firms limits how directly this explanation can be tested, data are available on the R&D spending and M&A exits of private firms. I use these data to test two predictions.

First, if R&D has been outsourced to private firms, then we should see that the R&D intensity of private firms has grown by more than that of public firms, offsetting the decline in the aggregate. Appendix Figure 19 plots the time series of R&D-to-output in the U.S., inclusive of all firms, both public and private. The trend in this total R&D intensity is remarkably similar to the R&D-to-sales ratio among public firms. In fact, the economy-wide measure increased by slightly *less* than the public-firm measure. There is, therefore, no evidence of a shift in R&D activity toward non-listed firms.

A second prediction of outsourcing to private firms is an increase in the share of M&A spending by public firms on acquisitions of private targets. Appendix Figure 19 reports the share of public firms' total M&A spending on public versus private targets over time. These shares exhibit no trend, which is evidence against the prediction that public firms are increasingly growing by buying up private ideas.[^24]

#### Was the reallocation within or between industries?

A possible explanation for the reallocation-driven stock-market boom is a reallocation of economic activity between industries. For example, perhaps the high-growth tech sector outgrew the lower-growth construction sector due to changing preferences or technologies. If this is the case, then declining innovation productivity within firms could be a separate phenomenon that has nothing to do with the reallocation to high-growth firms.

To test this hypothesis, I re-run the decompositions above at the industry level instead of the firm level. If the reallocation was mainly between industries, then we should see that most of the aggregate change is explained by compositional changes, as in the firm-level decompositions. If instead the reallocation was mainly between firms *within* industries, then most of the changes will come from industry-level changes. Appendix 10.5.1 reports the results. In all cases, only a small proportion of the aggregate change is explained by a reallocation between sectors, meaning that the reallocation was mainly *within* sectors.

The fact that the reallocation has been largely within sectors presents an opportunity to test this paper's main mechanism. If declining innovation productivity causes an increase in M&A activity and a reallocation to high-valuation firms, then industries that experienced a larger decline in R&D-to-value should also have seen a bigger rise in total valuations and M&A, and a larger compositional change in the valuation decompositions. Appendix 10.5.2 tests these predictions in the data and finds that they hold.

# Model 

I next present a Schumpeterian growth model in which heterogeneous firms compete in product markets and grow through R&D and M&A. The model is composed of two blocks: households and firms. The household block is standard, serving primarily to define the objectives of firms (including product-market structure) and close the economy. Most of the new and consequential assumptions underlie the firm block.

## Households

The economy is populated by a continuum of individuals of mass $L_t$ with population growth rate $\dot{L}_t/L_t = g_L$. A household is a collection of individuals that consumes, supplies labor, and trades in financial markets.

#### Intertemporal preferences

Individuals have identical log utility over consumption $c_t$ and constant-Frisch disutility over labor hours $l_t$. The lifetime utility of a single individual takes the form $$
U_t \equiv \mathbb{E}_t\left[\int_0^\infty e^{-\rho\tau}\left(\log{c_{t+\tau}}-\chi\frac{l_{t+\tau}^{1+1/\zeta}}{1+1/\zeta}\right)d\tau\right],

$$ where $\rho$ is the rate of time preference (impatience). Log utility corresponds to an elasticity of intertemporal substitution (EIS) of one. The parameter $\chi$ governs the overall disutility of working; the Frisch elasticity $\zeta$ governs the sensitivity of labor supply to changes in wages.

#### Demand for goods (market structure)

An individual's total consumption $c_t$ is a bundle of differentiated goods. In particular, firms produce a mass $N_t$ of varieties, indexed $j$. The individual's final demand for quantities $c_{jt}$ of these goods are then aggregated as $$
c_t \equiv \left(\int_0^{N_t}c_{jt}^{1-1/\eta} dj\right)^{1/(1-1/\eta)},

$$ subject to the budget constraint and taking each good's price as given. The parameter $\eta$ is the elasticity of substitution between varieties: the higher is $\eta$, the more substitutable.

Within variety $j$, $m_{jt}\in\mathbb{N}$ firms, indexed $i$, possess a blueprint to produce the good. The household views these blueprints as perfect substitutes, but with potentially heterogeneous qualities, denoted $q_{ijt}$. Hence, the total consumption of good $j$ is the sum $$
c_{jt} \equiv \sum_{i=1}^{m_{jt}}q_{ijt}c_{ijt},

$$ where $c_{ijt}$ is the quantity of firm $i$'s output consumed by the household. These two demand aggregators, across varieties and within each variety, will ultimately determine the competitive structure of product markets.

#### Labor supply

A fraction $\bar{s}$ of individuals are endowed with skilled labor. They can supply labor hours either to an incumbent firm for a wage $w_{{\rm x}t}$ or to a private effort to start a new firm (entrepreneurship). Entrepreneurship is detailed below. The household decides the mix of hours between these tasks. The remaining fraction $1-\bar{s}$ can supply production labor for a wage $w_{{\rm p}t}$. While the fraction of laborers $\bar{s}$ dedicated to skilled labor tasks is fixed, it is straightforward to endogenize this separation using a Roy model, as in Luttmer (2011).

#### Securities market

Households trade without constraints in equity and bond markets. Claims to firm profits (the stock market) constitute the total supply of marketable wealth. All other traded assets---in particular, riskfree bonds---are in zero net supply.

#### Household composition

A household is defined as a mass of individuals, a *family*, who jointly make consumption, labor, and portfolio decisions. This implies that each household comprises a representative sample of individuals, and so, without loss of generality, we can consider a single representative household composed of the full population.[^25]

## Firms

There is a continuum of firms, indexed $i$, with mass $M_t$. Each firm employs labor for production and expansion into new markets in order to maximize its market value.

#### Firm value

Let $\Pi_{it}$ denote the operating profits of the firm and $w_{{\rm x}t}X_{it}$ its total expenditure on skilled labor for R&D and M&A. Dividends are the total cashflows paid to owners. Before an acquisition, the cashflow is $$
D_{it} = \Pi_{it} - w_{{\rm x}t}X_{it}.
$$ Upon an acquisition, it is the deal price. The firm's objective is to maximize the present value of its dividends: $$
V_{it} \equiv \mathbb{E}_t\left[\int_0^\infty\frac{\xi_{t+\tau}}{\xi_t}D_{i,t+\tau}d\tau\right],

$$ where $\xi_t$ denotes the state-price density of the household. This value-maximization problem has two parts. The first is the static profit maximization, in which the firm produces its current set of products. The second is the dynamic investment decision, in which the firm allocates skilled labor to R&D and M&A.

#### Production and profits

Firm $i$ has blueprints to produce $n_{it}\in\mathbb{N}$ varieties. The firm employs labor to produce variety $j$ according to the linear production function[^26] $$
Y_{ijt} = a_iL_{ijt}, \quad j\in\{1,\dots,n_{it}\},

$$ where the idiosyncratic productivity type $a_i$ can take a low value $a_\ell$ or a high value $a_h$. A firm's productivity type is fixed over its life; Section 7 discusses implications of relaxing this assumption by allowing for stochastic switching between productivity types.

Each blueprint has associated with it a set of characteristics, including its quality $q_{ijt}$ and the qualities and productivities of competitors. Given these characteristics, firms internalize the demand curves of households and engage in Bertrand competition (i.e., choose prices $\{p_{ijt}\}$).[^27] All firms pay a common production wage $w_{{\rm p}t}$. Hence, a firm with productivity $a_i$ and $n_{it}$ products earns operating profits $$
\Pi_{it} = \max_{\{p_{ijt}\}}\sum_{j=1}^{n_{it}} \left(p_{ijt}Y_{ijt} - w_{{\rm p}t}L_{ijt}\right),

$$ where $Y_{ijt}$ and $L_{ijt}$ depend on prices and product characteristics due to household demand.

#### Innovation (R&D)

Firms can come up with new ideas in two ways, each corresponding to its own R&D technology (Peters and Walsh 2022). The first is new-variety creation, whereby the firm invents a blueprint for a new good and becomes a monopolist. The second is a quality improvement on an existing variety, in which case the firm enters the market and competes with incumbents. Firms choose how much labor to allocate to each technology at any given time. Allowing for both types of innovation is essential for two reasons. First, new-variety creation will be necessary for balanced growth with a growing population. Second, the relative frequencies of quality improvements and new varieties will determine the degree of creative destruction between firms, which will feed back into growth rates and valuations.

The R&D technologies combine skilled labor, hired at the wage $w_{{\rm x}t}$, with the firm's existing knowledge stock $n_{it}$ to probabilistically yield a new idea. Specifically, an $n$-product firm that allocates $X_{{\rm q}it}$ units of labor to quality innovation will successfully invent a blueprint with Poisson intensity $$
\alpha\varphi X_{{\rm q}it}^\varepsilon n_{it}^{1-\varepsilon}.

$$ Likewise, a firm allocating $X_{{\rm n}it}$ units to new-variety creation will succeed with intensity $$
(1-\alpha)\varphi X_{{\rm n}it}^\varepsilon n_{it}^{1-\varepsilon}.

$$ These idea-production functions capture the fact that R&D generates new ideas by building on existing ideas. The parameter $\varepsilon\in(0,1)$ governs the importance of researchers in this process. Economically, $\varepsilon<1$ implies decreasing marginal returns to research inputs. The parameter $\varphi$ represents *innovation productivity*, the total factor productivity of R&D. Higher $\varphi$ means that it is easier to come up with new ideas in any form. Finally, $\alpha\in(0,1)$ determines the relative productivities of quality improvements over new varieties. I will also refer to $\alpha$ as the creative destruction parameter, because it increases the extent to which innovation negatively affects incumbent firms' current profits.[^28]
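As a minimal numerical sketch of the two idea-production functions above, the following Python snippet computes both Poisson intensities. The function name and all parameter values are illustrative assumptions, not the paper's estimates. The snippet illustrates two properties of the functional form: doubling research labor alone less than doubles the arrival rate (decreasing marginal returns when $\varepsilon<1$), while doubling labor and the knowledge stock together exactly doubles it.

```python
def innovation_intensities(x_q, x_n, n, phi, alpha, eps):
    """Poisson arrival rates of quality and new-variety ideas.

    x_q, x_n: skilled labor allocated to each R&D technology
    n:        number of blueprints held by the firm (knowledge stock)
    """
    quality = alpha * phi * x_q**eps * n ** (1 - eps)
    variety = (1 - alpha) * phi * x_n**eps * n ** (1 - eps)
    return quality, variety


# Illustrative (assumed) parameter values
phi, alpha, eps = 0.5, 0.6, 0.5

base = innovation_intensities(1.0, 1.0, 1, phi, alpha, eps)
more_labor = innovation_intensities(2.0, 2.0, 1, phi, alpha, eps)  # labor doubled
scaled_up = innovation_intensities(2.0, 2.0, 2, phi, alpha, eps)   # labor and n doubled
```

Because intensities scale one-for-one with $n_{it}$ when labor per blueprint is held fixed, it is natural to express policies per blueprint, as the equilibrium conditions later do.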

Quality improvements are fixed increments $\lambda>1$ over the current quality of the target firm's blueprint: if firm $i$ improves on the blueprint for good $j$ owned by firm $i'$, then this new blueprint has quality $q_{ijt}=\lambda q_{i'jt}$. The firm identifies an existing blueprint on which to improve by random search, meaning that the probability of improving on a blueprint with a given set of characteristics equals the fraction of blueprints with those characteristics currently in production across the economy.[^29] Each new variety has a quality equal to $\lambda Q_t$, where $Q_t$ is the current aggregate (average) quality of blueprints, defined below.

#### Acquisition (M&A)

Firms can also hire from the pool of skilled laborers in order to identify and acquire blueprints from other firms.[^30] In particular, each firm has access to an M&A technology, analogous to the R&D technologies, with which they can hire $X_{{\rm m}it}$ units of labor to search for an existing idea. The investment in M&A search successfully finds a target with Poisson intensity[^31] $$
\psi X_{{\rm m}it}^\varepsilon n_{it}^{1-\varepsilon}.

$$ This M&A search function is purposely defined to have the same form as the R&D technologies above, in order to focus the comparison on the relative productivities $\varphi$ and $\psi$. It would, for example, be equivalent to have an M&A search technology that is linear in $X_{\rm m}$ but subject to an appropriately scaled adjustment cost.

Like quality innovations, acquisition is undirected and hence determined by the distribution of blueprints in the economy. Once a firm successfully identifies a target blueprint, it enters into a negotiation with the target firm. Specifically, the two parties engage in a Nash bargaining game in which they decide on a deal price. Letting $v_{\rm buyer}$ and $v_{\rm target}$ denote the private values of the idea to the buyer and seller, respectively, the deal price solves $$
v_{\rm m} = \underset{v}{\operatorname{argmax}} \left\{(v_{\rm buyer}-v)^\varrho(v-v_{\rm target})^{1-\varrho}\right\}.
$$ The bargaining power of the acquirer is increasing in the parameter $\varrho\in(0,1)$. If both parties are indifferent between making a deal and walking away, there is no deal, consistent with an infinitesimal fixed cost of transacting.

#### Entry and exit

Skilled laborers start new firms by creating their own new ideas. I assume that these entrepreneurs have access to the exact same idea-creation (R&D) technologies as a single-product incumbent ($n=1$). Before investing in this technology, the entrepreneur knows the productivity type $a_i$, which is high with probability $\omega_h=\omega$ or low with probability $\omega_\ell=1-\omega$. The opportunity cost of a skilled laborer doing entrepreneurial research is the foregone skilled wage that could be received by working for an incumbent.

If a single-product firm loses its last good, either because of creative destruction or because it was acquired, then that firm ceases to operate. For convenience, I assume that even if a firm ceases operations, its blueprint still exists and incumbents recognize the threat of re-entry. This is not strictly necessary, but it makes the model more tractable because it means we do not need to keep track of those blueprints that are owned by dead firms.

# Model equilibrium 

## Equilibrium definitions 

To define an equilibrium, we need a more precise notion of the distribution of ideas in the economy. Each blueprint can be summarized by a specific set of features, including its quality, its producer's productivity, and the characteristics of competitors. For now, let us denote by $Z$ the set of a product's characteristics and $\mathcal{Z}$ the full space of possible $Z$. For instance, firm $i$'s blueprint for good $j$ can be summarized by some $Z_{ij}\in\mathcal{Z}$.

Let us partition the aggregate blueprint stock across characteristics $Z$ (so $N_t = \sum_{Z\in\mathcal{Z}}N_{Zt}$) and let $f_{Zt}\equiv N_{Zt}/N_t$ represent the proportion of ideas currently in operation with characteristics $Z$. Thus, the set $\{f_{Zt}:Z\in\mathcal{Z}\}$ represents the full blueprint distribution.[^32] With this notation in hand, we can define the concept of equilibrium.[^33]

**Definition 1** (Equilibrium). *Fix an initial blueprint distribution $\{f_{Zt}:Z\in\mathcal{Z}\}$ and an initial population-to-blueprints ratio $L_t/N_t$. An *equilibrium* is a time path of goods prices $\{p_{ijt}\}$, wages $\{w_{{\rm p}t},w_{{\rm x}t}\}$, state-price densities $\{\xi_t\}$, goods consumption $\{c_{ijt}\}$, labor hours $\{l_{{\rm p}t},l_{{\rm x}t}\}$, production-labor allocations $\{L_{ijt}\}$, and skilled-labor allocations $\{X_{{\rm n}it},X_{{\rm q}it},X_{{\rm m}it}\}$, such that:*

1. *Optimization: Households and firms optimize according to the setting in Section 2.*

2. *Market clearing: Household labor supply equals firm labor demand, the consumption of each good equals the quantity produced, and household wealth equals total firm value.*

The definition fixes the initial distribution $\{f_{Zt}:Z\in\mathcal{Z}\}$ and ratio $L_t/N_t$ because these are slow-moving states that evolve forward in response to equilibrium policies. We next define a stricter notion of an equilibrium in which these slow-moving states are constant.

**Definition 2** (Balanced-growth equilibrium). *A *balanced-growth equilibrium (BGE)* is an equilibrium in which, for all time $t$, the blueprint distribution is stationary ($\dot{f}_{Zt} = 0$ for all $Z\in\mathcal{Z}$) and the population-to-blueprint ratio $L_t/N_t$ is constant.*

Intuitively, balanced growth means that all resources are growing at the same rate across firms (a stationary distribution) and that the factor inputs to innovation, $L_t$ and $N_t$, grow proportionately ($g_N=g_L$). It will turn out that every well-defined equilibrium converges to a BGE over time, so these conditions are natural long-run properties of the economy. The rest of this section characterizes a BGE; subsequent sections generalize to a transition path.

Before proceeding, I impose the parametric restriction that the size of quality innovations is larger than the productivity gap between firm types: $\lambda > a_h/a_\ell$. This means that, if a low-productivity firm innovates on a high-productivity blueprint, it will choose to enter and produce. If $\lambda<a_h/a_\ell$, then low-productivity firms never compete in product markets with high-productivity firms. This restriction is not strictly necessary, as low-productivity firms can still maintain positive market share via new-variety creation. However, $\lambda>a_h/a_\ell$ will prove true in the estimation, so I impose it up front to simplify exposition.

## Equilibrium conditions 

I next lay out the equilibrium conditions for a balanced-growth path. I start by describing static production and profits, then use these results to determine valuations, R&D and M&A policies, growth, and the blueprint distribution. Appendix 11 provides a detailed derivation of all of these conditions for any equilibrium (not just a BGE).

#### Production and markups

Household consumption allocations $\{c_{ijt}\}$ maximize utility given the consumption aggregators across and within varieties. Aggregating to total consumption (and production) $Y_{ijt}$, we have demand curves $$
p_{ijt} = q_{ijt}\left(\frac{Y_{jt}}{Y_t}\right)^{-1/\eta}p_t.

$$ Aggregate output is the numeraire, so the aggregate price index $p_t=1$. Note that this demand restriction means that any two firms $i$ and $i'$ producing a version of variety $j$ must equate *quality-adjusted* prices: $$
\frac{p_{ijt}}{q_{ijt}} = \frac{p_{i'jt}}{q_{i'jt}},

$$ consistent with the assumption of perfect substitution within product markets.

Firms choose prices subject to the two demand conditions above to maximize operating profits. Under Bertrand competition, only the firm with the lowest quality-adjusted marginal cost $w_{{\rm p}t}/(a_iq_{ijt})$ will produce variety $j$. We call this producer the *leader* and index it by $i_1$; we call the firm with the next lowest quality-adjusted marginal cost the *follower* with index $i_2$. The leader sets its price either to maximize its monopoly profits or, if necessary, to dissuade the follower from producing. Specifically, the leader's markup over marginal cost equals $$
\mu_{ijt} \equiv \frac{p_{ijt}}{w_{{\rm p}t}/a_i} = \min\left\{\frac{a_{i_1}q_{i_1jt}}{a_{i_2}q_{i_2jt}},\frac{\eta}{\eta-1}\right\}.
$$ The term $\eta/(\eta-1)$ is the monopoly markup: the optimal markup in the presence of no competitor. The term $(a_{i_1}q_{i_1jt})/(a_{i_2}q_{i_2jt})$ is the marginal-cost advantage of the leader: if the marginal-cost advantage is sufficiently small, then the leader must engage in limit pricing by setting its quality-adjusted price equal to the quality-adjusted marginal cost of the follower.

This solution implies six possible market structures, the combinations of two leader productivity types $i_1\in\{h,\ell\}$ and three follower types $i_2\in\{h,\ell,\varnothing\}$, where $i_2=\varnothing$ means that there is no follower (a monopoly product). Let $\mu_{i_1i_2}$ denote the corresponding markup. For a monopoly product ($i_2=\varnothing$), the marginal-cost advantage is infinite, so $\mu_{i_1\varnothing} =\eta/(\eta-1)$. For a market with a follower, the quality advantage is always $q_{i_1jt}/q_{i_2jt}=\lambda$, so the markup equals $\mu_{i_1i_2}=\min\{\lambda a_{i_1}/a_{i_2},\eta/(\eta-1)\}$.
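The markup rule can be tabulated for every combination of leader and follower type. The sketch below is illustrative only: the function name and the parameter values are assumptions, chosen so that the restriction $\lambda>a_h/a_\ell$ holds.

```python
def markup(a_leader, a_follower, lam, eta):
    """Leader's markup: the smaller of the limit-pricing and monopoly markups.

    a_follower=None denotes a monopoly product (no follower).
    """
    monopoly = eta / (eta - 1)
    if a_follower is None:
        return monopoly
    # With a follower, the leader's quality advantage is always lam.
    return min(lam * a_leader / a_follower, monopoly)


# Illustrative (assumed) values satisfying lam > a_h / a_l
a_h, a_l, lam, eta = 1.2, 1.0, 1.3, 3.0
leaders = {"h": a_h, "l": a_l}
followers = {"h": a_h, "l": a_l, "none": None}

markups = {
    (i1, i2): markup(leaders[i1], followers[i2], lam, eta)
    for i1 in leaders
    for i2 in followers
}
```

Under these assumed values, a high-productivity leader facing a low-productivity follower has a cost advantage large enough to charge the monopoly markup, whereas a low-productivity leader facing a high-productivity follower is constrained to limit pricing at $\lambda a_\ell/a_h$.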

#### Revenues and profits

The above conditions together imply that the revenue share of firm $i$ producing variety $j$ equals $$
\frac{p_{ijt}Y_{ijt}}{Y_t} = \frac{1}{N_t}\left(\frac{a_iq_{ijt}}{\bar{a}Q_t}\right)^{\eta-1}\left(\frac{\mu_{ijt}}{\mathcal{M}_{\eta-1}}\right)^{-(\eta-1)}.

$$ The aggregates $\{\bar{a},Q_t,\mathcal{M}_{\eta-1}\}$ are averages of $\{a_i,q_{ijt},\mu_{ijt}\}$, defined below. Because $\eta>1$, a firm generates more revenue for a product than average if it has higher productivity and quality. It earns lower revenues if it charges a higher markup, because high prices reduce demand. Operating profits can be expressed in terms of revenues and markups as $$
\Pi_{ijt} = \left(1-\frac{1}{\mu_{ijt}}\right)p_{ijt}Y_{ijt}.

$$

#### Aggregate output and wages

Aggregate output (and consumption) equals $$
Y_t = A_tL_{{\rm p}t},
$$ where total-factor productivity (TFP, also output per labor hour) is defined as $$
A_t \equiv \underbrace{N_t^{1/(\eta-1)}}_{\textrm{variety}} \times \underbrace{Q_t}_{\textrm{quality}} \times \underbrace{\bar{a}}_{\textrm{productivity}} \times \underbrace{\Omega}_{\textrm{static misalloc.}}.

$$ The first component of TFP is the love-of-variety effect, which is magnified if varieties are less substitutable ($\eta$ is closer to one). The second term is *aggregate quality*, which is defined as an average of blueprint qualities:[^34] $$
Q_t \equiv \left(\frac{1}{N_t}\int_0^{N_t}q_{jt}^{\eta-1}dj\right)^{1/(\eta-1)}.

$$ The latter two components capture the allocation of resources across producers. *Aggregate productivity* is[^35] $$
\bar{a} \equiv \left(\frac{1}{N_t}\int_0^{N_t}\left(\frac{q_{jt}}{Q_t}\right)^{\eta-1}a_{jt}^{\eta-1}dj\right)^{1/(\eta-1)},

$$ a quality-weighted average of firm-level productivity. The final term, *static misallocation* $\Omega$, is a wedge that summarizes the extent to which actual labor allocations $L_{ijt}/L_{{\rm p}t}$ differ from the optimal allocation under the current blueprint distribution. Specifically, defining $A_t^* \equiv \max_{\{L_{ijt}/L_{{\rm p}t}\}}\{Y_t/L_{{\rm p}t}\}$ as the maximum level of output per labor hour, static misallocation equals the ratio $\Omega = A_t/A_t^* \in (0,1]$. As shown in Appendix 11.1.5, this wedge attains its maximum ($\Omega=1$) if and only if there is no cross-product markup heterogeneity ($\mu_{jt}=\mu$, $\forall j$). As in Peters (2020), heterogeneous market power distorts demand and results in a misallocation of production labor across blueprints. This is why I refer to $\Omega$ as static misallocation.[^36] Appendix 11.1.3 provides a closed-form expression for $\Omega$ in terms of the distribution of blueprint characteristics.

The aggregate production wage equals $$
w_{{\rm p}t} = \Lambda\frac{Y_t}{L_{{\rm p}t}},

$$ where the (production) labor share of output $\Lambda = w_{{\rm p}t}L_{{\rm p}t}/Y_t$ is a sales-weighted average of product-level labor shares (inverse markups $\mu_{jt}^{-1}$):[^37] $$
\Lambda = \int_0^{N_t}\frac{p_{jt}Y_{jt}}{Y_t}\mu_{jt}^{-1}dj.

$$ Recall that there are $(1-\bar{s})L_t$ production laborers who each choose how many labor hours $l_{\rm p}$ to work. Combining the labor demand curve implied by the production wage above with the labor supply curve implied by household utility, we get that labor hours per production worker equal $$
l_{\rm p} = \frac{L_{{\rm p}t}}{(1-\bar{s})L_t} = \left(\frac{\Lambda}{\chi(1-\bar{s})}\right)^{\zeta/(1+\zeta)}.
$$ Each production worker exerts less effort if the aggregate markup $\Lambda^{-1}$ is higher, because higher markups reduce wages and therefore disincentivize work.

#### Rescaled firm's problem

A key step in solving the balanced-growth equilibrium is appropriately rescaling the equilibrium conditions. To do this, we can show that profits and firm values grow with output per capita $Y_t/L_t$. From the revenue-share and profit expressions above, the profits of any blueprint can be rewritten[^38] $$
\Pi_{ijt} = \pi_{i_1i_2\hat{q}}\frac{Y_t}{L_t},
$$ where $\pi_{i_1i_2\hat{q}}$ is a constant that depends on the current leader $i=i_1\in\{h,\ell\}$, follower $i_2\in\{h,\ell,\varnothing\}$, and relative quality $\hat{q}\equiv\log(q_{jt}/Q_t)$. From this fact, it can be shown that a firm's value is additively separable across blueprints: $$
V_{it} = \sum_{j=1}^{n_{it}}V_{ijt}, \quad\textrm{where}\quad V_{ijt} = v_{i_1i_2\hat{q}}\frac{Y_t}{L_t}.

$$ The constant $v_{i_1i_2\hat{q}}$ denotes the rescaled value of a blueprint with those characteristics. This separability means that the equilibrium can be solved in terms of blueprint values $\{v_{i_1i_2\hat{q}}\}$ instead of firm values, which dramatically reduces the dimensionality of the problem. Before characterizing the solution for these blueprint values, let us derive firms' optimal R&D and M&A policies, taking blueprint values as given.

#### R&D policies

The firm allocates $X_{{\rm n}it}$ skilled laborers to new-variety creation and $X_{{\rm q}it}$ to quality innovation at the skilled wage $w_{{\rm x}t}$. The optimal number of research hours employed for new-variety creation equates the marginal cost of research with its *marginal value product*: $$
w_{{\rm x}t}= \underbrace{\varepsilon(1-\alpha)\varphi (n_{it}/X_{{\rm n}it})^{1-\varepsilon}}_{\textrm{marginal product of research}} \times \underbrace{ \bar{v}_{{\rm n}it} }_{\textrm{marginal value}}.
$$ The rescaled blueprint values above and the assumption that new varieties have initial quality $q_{jt}=\lambda Q_t$ imply that the marginal value of successful new-variety innovation equals $$
\bar{v}_{{\rm n}it} = \bar{v}_{{\rm n}i_1}\frac{Y_t}{L_t}, \quad\textrm{where}\quad \bar{v}_{{\rm n}i_1} = v_{i_1\varnothing,\log\lambda}.
$$ Likewise, the skilled wage (solved below) grows with per-capita output: $w_{{\rm x}t} = \hat{w}_{\rm x}Y_t/L_t$. These facts imply the policy $$
x_{{\rm n}i_1} \equiv \frac{X_{{\rm n}it}}{n_{it}} = \left(\frac{\varepsilon(1-\alpha)\varphi\bar{v}_{{\rm n}i_1}}{\hat{w}_{\rm x}}\right)^{1/(1-\varepsilon)}.

$$ Fixing the wage $\hat{w}_{\rm x}$, the firm spends relatively more on new-variety creation if the technology is more productive ($(1-\alpha)\varphi$ is high) or the resulting blueprint is more valuable ($\bar{v}_{{\rm n}i_1}$ is high).
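As a check on the closed-form policy, the sketch below compares it with a brute-force maximization of the per-blueprint net payoff $(1-\alpha)\varphi x^\varepsilon\bar{v}_{{\rm n}i_1}-\hat{w}_{\rm x}x$, which is the new-variety term of the firm's problem net of its wage bill. Function names and all numerical values are illustrative assumptions, and the blueprint value and wage are held fixed rather than solved in equilibrium.

```python
def new_variety_policy(phi, alpha, eps, v_n, w_x):
    """Closed-form research hours per blueprint for new-variety creation."""
    return (eps * (1 - alpha) * phi * v_n / w_x) ** (1 / (1 - eps))


def net_payoff(x, phi, alpha, eps, v_n, w_x):
    """Expected flow value of new varieties minus the research wage bill."""
    return (1 - alpha) * phi * x**eps * v_n - w_x * x


# Illustrative (assumed) values; v_n and w_x are taken as given here
phi, alpha, eps, v_n, w_x = 0.5, 0.6, 0.5, 4.0, 1.0
x_star = new_variety_policy(phi, alpha, eps, v_n, w_x)

# Brute-force search on a grid from 0.5 * x_star to 1.5 * x_star
grid = [x_star * (0.5 + 0.01 * k) for k in range(101)]
x_best = max(grid, key=lambda x: net_payoff(x, phi, alpha, eps, v_n, w_x))
```

The grid maximizer coincides with the closed form up to the grid spacing, as expected for a concave objective with $\varepsilon<1$.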

By the exact same logic, the optimal allocation to quality innovations is $$
x_{{\rm q}i_1} \equiv \frac{X_{{\rm q}it}}{n_{it}} = \left(\frac{\varepsilon\alpha\varphi\bar{v}_{{\rm q}i_1}}{\hat{w}_{\rm x}}\right)^{1/(1-\varepsilon)},

$$ where the expected value of a blueprint from successful quality innovation equals $$
\bar{v}_{{\rm q}i_1} = \sum_{i'\in\{h,\ell\}}\int_{-\infty}^\infty v_{i_1i',\hat{q}+\log\lambda}(f_{i'h\hat{q}}+f_{i'\ell\hat{q}}+f_{i'\varnothing\hat{q}})d\hat{q}.
$$ The densities $f_{i_1i_2\hat{q}}\equiv N_{i_1i_2\hat{q}t}/N_t$ denote the share of blueprints with these characteristics; thus, in this context, $f_{i'i_2\hat{q}}$ represents the conditional probability that the firm finds and innovates on a blueprint with current leader $i_1=i'$, follower $i_2$, and relative quality $\hat{q}$.

The policies of entrepreneurs---skilled laborers working to come up with new ideas and firms---are identical to those of single-product firms. Specifically, each skilled laborer of prospective type $a_i$ dedicates $\{x_{{\rm En}i},x_{{\rm Eq}i}\}=\{x_{{\rm n}i},x_{{\rm q}i}\}$ labor hours to entrepreneurial efforts.

#### M&A policies

High-productivity firms are more profitable producers and thus ascribe a higher present value to each blueprint type. It is therefore mutually beneficial for high-productivity firms to buy blueprints from low-productivity firms. Suppose a high-productivity firm identifies a low-productivity target blueprint ($i_1=\ell$) with follower $i_2$ and relative quality $\hat{q}$. The deal price that solves the Nash bargaining problem is $$
v_{{\rm m}i_2\hat{q}} = \underset{v}{\operatorname{argmax}} \left\{(v_{hi_2\hat{q}}-v)^\varrho(v-v_{\ell i_2\hat{q}})^{1-\varrho}\right\} = \varrho v_{\ell i_2\hat{q}} + (1-\varrho)v_{hi_2\hat{q}}.
$$ The buyer gets a fraction $\varrho$ of the surplus $v_{hi_2\hat{q}}-v_{\ell i_2\hat{q}}$ and the seller gets the remaining $1-\varrho$.
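The closed-form deal price can be verified numerically. The sketch below maximizes the Nash product on a grid and compares the result with $\varrho v_{\ell}+(1-\varrho)v_{h}$. The names `nash_price`, `nash_product`, and `rho_b` (standing in for $\varrho$) and the blueprint values are illustrative assumptions.

```python
def nash_price(v_buyer, v_target, rho_b):
    """Closed-form deal price; rho_b is the acquirer's bargaining power."""
    return rho_b * v_target + (1 - rho_b) * v_buyer


def nash_product(v, v_buyer, v_target, rho_b):
    """Objective of the Nash bargaining problem at candidate price v."""
    return (v_buyer - v) ** rho_b * (v - v_target) ** (1 - rho_b)


# Illustrative (assumed) blueprint values and bargaining power
v_h, v_l, rho_b = 10.0, 6.0, 0.25
price = nash_price(v_h, v_l, rho_b)

# Grid search over interior prices between the two private values
grid = [v_l + (v_h - v_l) * k / 1000 for k in range(1, 1000)]
best = max(grid, key=lambda v: nash_product(v, v_h, v_l, rho_b))

# Share of the surplus v_h - v_l captured by the buyer
buyer_share = (v_h - price) / (v_h - v_l)
```

With low acquirer bargaining power, the price sits close to the buyer's valuation, so most of the surplus accrues to the low-productivity seller.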

In order for this deal to take place, the high-productivity firm must first successfully find a low-productivity blueprint. The skilled wage paid to M&A search is the same as that paid to researchers, so the optimal allocation to the M&A search technology is $$
x_{{\rm m}i_1} \equiv \frac{X_{{\rm m}it}}{n_{it}} = \left(\frac{\varepsilon\psi\bar{v}_{{\rm m}i_1}}{\hat{w}_{\rm x}}\right)^{1/(1-\varepsilon)},
$$ where $\bar{v}_{{\rm m}i_1}$ is the expected value of the acquired blueprint *net* of the deal price, $$
\bar{v}_{{\rm m}i_1} = \begin{cases}
\varrho f_\ell\sum_{i_2\in\{h,\ell\}}\int_{-\infty}^\infty(v_{hi_2\hat{q}} - v_{\ell i_2\hat{q}})\frac{f_{\ell i_2\hat{q}}}{f_\ell}d\hat{q} & \textrm{if}\quad i_1=h, \\
0 & \textrm{if}\quad i_1=\ell.
\end{cases}
$$ For low-productivity firms, the expected benefit is zero, because they would never buy any blueprint that they find; these firms therefore invest nothing in the M&A search technology ($x_{{\rm m}\ell}=0$). For high-productivity firms, the expected net value is higher if they have more bargaining power ($\varrho$) and if they are more likely to find a low-productivity blueprint ($f_\ell$).

#### Skilled wages

The skilled wage $w_{{\rm x}t}$ clears the labor market by setting the demand for R&D and M&A equal to the supply of skilled labor hours. Appendix 11.3.2 shows that $w_{{\rm x}t}=\hat{w}_{\rm x}Y_t/L_t$ and derives the expression for $\hat{w}_{\rm x}$. The rescaled wage $\hat{w}_{\rm x}$ is increasing in R&D and M&A productivity $\varphi$ and $\psi$ and in expected blueprint valuations, because these forces increase demand for skilled labor. Higher wages also increase the number of hours worked by each skilled laborer:[^39] $$
l_{\rm x} = \frac{L_{{\rm x}t}}{\bar{s}L_t} = \left(\frac{\hat{w}_{\rm x}}{\chi}\right)^\zeta.
$$ Thus, even for a fixed research population $\bar{s}L_t$, higher innovation incentives increase the equilibrium supply of research inputs through this intensive margin.

#### Economic growth

The equilibrium scales with per-capita output $y_t=Y_t/L_t$. Before deriving the growth rate, $g_y\equiv\dot{y}_t/y_t$, let us establish some notation. Define a firm's new-variety-creation intensity as $$
\Phi_{{\rm n}i} \equiv \frac{\varphi(1-\alpha)X_{{\rm n}it}^\varepsilon n_{it}^{1-\varepsilon}}{n_{it}} = \varphi(1-\alpha)x_{{\rm n}i_1}^\varepsilon.
$$ Define $\Phi_{{\rm q}i}$ and $\Psi_i$ analogously for quality innovation and M&A search, respectively. Lastly, define $\nu_i \equiv \omega_i\bar{s}L_t/N_t$ as the number of potential new type-$a_i$ entrants per existing blueprint (where, again, $\omega_h=\omega$ and $\omega_\ell=1-\omega$).

On the balanced-growth path, output per capita grows in proportion to variety $N_t^{1/(\eta-1)}$ and aggregate quality $Q_t$ (see the TFP decomposition above), so its growth rate equals[^40] $$
g_y = \frac{1}{\eta-1}g_N + g_Q.

$$ The growth rate of the number of varieties equals $$
g_N \equiv \frac{\dot{N}_t}{N_t} = \Phi_{{\rm n}h}(f_h+\nu_h) + \Phi_{{\rm n}\ell}(f_\ell+\nu_\ell) = g_L.
$$ The first equality captures the fact that new-variety creation is higher when there is a higher new-variety intensity across both incumbents and entrants. The second equality is a necessary condition for balanced growth: the ratio $L_t/N_t$ is constant.[^41] The growth rate of aggregate quality equals $$
g_Q \equiv \frac{\dot{Q}_t}{Q_t} = \frac{\lambda^{\eta-1}-1}{\eta-1}(g_N+\delta),

$$ where $$
\delta \equiv \Phi_{{\rm q}h}(f_h+\nu_h) + \Phi_{{\rm q}\ell}(f_\ell+\nu_\ell)
$$ is the rate of *creative destruction*, the intensity with which incumbents and entrants innovate upon existing blueprints. The rate $\delta$ is increasing in the creative destruction parameter $\alpha$. Quality growth is increasing in $g_N$ because new varieties have above-average quality. The larger the quality innovations $\lambda$, the faster the rate of quality growth.
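To see how these growth formulas combine, the sketch below evaluates $g_Q$ and $g_y$ for given values of $g_N$ and $\delta$. The function names and numbers are illustrative assumptions. In the model $\delta$ is an equilibrium object, whereas here it is simply taken as an input, and $g_N$ is set equal to an assumed population growth rate, as balanced growth requires.

```python
def quality_growth(g_n, delta, lam, eta):
    """Growth rate of aggregate quality Q_t."""
    return (lam ** (eta - 1) - 1) / (eta - 1) * (g_n + delta)


def output_growth(g_n, delta, lam, eta):
    """Growth rate of per-capita output: variety term plus quality term."""
    return g_n / (eta - 1) + quality_growth(g_n, delta, lam, eta)


# Illustrative (assumed) values; balanced growth imposes g_N = g_L
g_l, delta, lam, eta = 0.01, 0.05, 1.1, 3.0
g_q = quality_growth(g_l, delta, lam, eta)
g_y = output_growth(g_l, delta, lam, eta)

# A lower rate of creative destruction lowers growth, other inputs fixed
g_y_low_delta = output_growth(g_l, 0.03, lam, eta)
```

Because $g_N$ is pinned down by population growth on the balanced-growth path, changes in per-capita growth across balanced-growth paths operate through $\delta$ and $\lambda$ in this expression.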

#### Market valuations

The policies above depend on and determine the blueprint market values $\{v_{i_1i_2\hat{q}}\}$. Using the fact that profits are separable in quality, $\pi_{i_1i_2\hat{q}} = \pi_{i_1i_2}e^{(\eta-1)\hat{q}}$, we can further decompose the blueprint values into two parts: $$
v_{i_1i_2\hat{q}} = \underbrace{v_{i_1i_2}^{\rm P}e^{(\eta-1)\hat{q}}}_{\textrm{product value}} + \underbrace{ v_{i_1}^{\rm G} }_{\textrm{growth value}}.
$$ The product value represents the present value of profits directly generated by the blueprint; the growth value represents the contribution of the blueprint to generating future profits via R&D and M&A. Because we are in a riskless economy, every firm is discounted by the riskfree rate, $$
r_f = \rho + g_y.
$$ The coefficients $\{v_{i_1i_2}^{\rm P},v_{i_1}^{\rm G}\}$ are characterized by a system of Hamilton-Jacobi-Bellman (HJB) equations. The HJBs for product values are $$
\underbrace{\rho v_{i_1i_2}^{\rm P}}_{\textrm{discount}} = \underbrace{\pi_{i_1i_2}}_{\textrm{profits}} + \underbrace{\mathbb{I}_{i_1=\ell}\Psi_hf_h(1-\varrho)(v_{hi_2}^{\rm P}-v_{\ell i_2}^{\rm P})}_{\textrm{M\&A exit}} - \underbrace{ \delta v_{i_1i_2}^{\rm P} }_{\textrm{creat. destr.}} - \underbrace{(\eta-1)g_Qv_{i_1i_2}^{\rm P}}_{\textrm{quality depr.}}.
$$ Product values are increasing in profits and, for low-productivity firms, the possibility of being acquired at a premium; they are decreasing in the risk of creative destruction $\delta$ and in the rate of relative quality depreciation $(\eta-1)g_Q$. The HJBs for growth values are $$\begin{multline}
\underbrace{\rho v_{i_1}^{\rm G}}_{\textrm{discount}} = \underbrace{\Phi_{{\rm n}i_1}\bar{v}_{{\rm n}i_1}}_{\textrm{new variety}} +\underbrace{ \Phi_{{\rm q}i_1}\bar{v}_{{\rm q}i_1}}_{\textrm{quality innov.}} + \underbrace{\Psi_{i_1}\bar{v}_{{\rm m}i_1}}_{\textrm{acquisition}} - \underbrace{\hat{w}_{\rm x}(x_{{\rm n}i_1}+x_{{\rm q}i_1}+x_{{\rm m}i_1})}_{\textrm{expenditures}} \\
+ \underbrace{\mathbb{I}_{i_1=\ell}\Psi_hf_h(1-\varrho)(v_h^{\rm G}-v_\ell^{\rm G})}_{\textrm{M\&A exit}} - \underbrace{ \delta v_{i_1}^{\rm G} }_{\textrm{creat. destr.}}.
\end{multline}$$ Growth values are higher for firms that do more R&D and M&A---high-productivity firms, in equilibrium. Like product values, growth values are reduced by creative destruction and increased for low-productivity firms if it is more likely that they will get acquired.
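Because the product-value HJBs are linear in $v^{\rm P}$, they can be solved by hand for a fixed follower type $i_2$: the high type satisfies $v_{hi_2}^{\rm P}=\pi_{hi_2}/D$ with $D\equiv\rho+\delta+(\eta-1)g_Q$, and the low type satisfies $v_{\ell i_2}^{\rm P}=(\pi_{\ell i_2}+m\,v_{hi_2}^{\rm P})/(D+m)$ with $m\equiv\Psi_hf_h(1-\varrho)$. The Python sketch below evaluates these two expressions. The shorthand $D$ and $m$, the function name, and all numerical inputs are illustrative assumptions rather than equilibrium values.

```python
def product_values(pi_h, pi_l, rho, delta, g_Q, eta, Psi_h, f_h, varrho):
    """Solve the two linear product-value HJBs, holding the follower type fixed."""
    # Effective discount rate: time preference + creative destruction + quality depreciation.
    D = rho + delta + (eta - 1.0) * g_Q
    # Acquisition intensity faced by a low-type blueprint, scaled by the share
    # (1 - varrho) of the surplus that enters the low type's HJB.
    m = Psi_h * f_h * (1.0 - varrho)
    v_h = pi_h / D                    # high type: no M&A-exit term
    v_l = (pi_l + m * v_h) / (D + m)  # low type: profits plus expected acquisition premium
    return v_h, v_l


# Hypothetical inputs for illustration only (not the estimated equilibrium).
params = dict(rho=0.0174, delta=0.11, g_Q=0.018, eta=2.9,
              Psi_h=0.10, f_h=0.30, varrho=0.5)
v_h, v_l = product_values(pi_h=0.12, pi_l=0.10, **params)
print(f"v_h^P = {v_h:.3f}, v_l^P = {v_l:.3f}")
```

Under these inputs the low type's product value exceeds $\pi_{\ell i_2}/D$, the value it would have with no prospect of acquisition, which is the premium effect described above.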

#### Distribution

Like market values, the equilibrium blueprint distribution is summarized over the three product characteristics $\{i_1,i_2,\hat{q}\}$. Recall that $N_{i_1i_2\hat{q}t}$ denotes the mass of blueprints with these characteristics and $f_{i_1i_2\hat{q}}\equiv N_{i_1i_2\hat{q}t}/N_t$ the share. The evolution of this distribution is characterized by a system of Kolmogorov Forward equations (KFEs), which express flows into and out of product markets in terms of firms' R&D and M&A decisions.

The KFEs follow from the fact that the net flow of blueprints equals $$
\dot{N}_{i_1i_2\hat{q}t} = N_t\dot{f}_{i_1i_2\hat{q}t} + \dot{N}_tf_{i_1i_2\hat{q}t} = g_NN_tf_{i_1i_2\hat{q}},
$$ where the second equality imposes a stationary distribution. This net flow equals the sum of constituent flows from R&D and M&A. For monopoly products ($i_2=\varnothing$), the KFEs are $$
\underbrace{g_Nf_{i_1\varnothing\hat{q}}}_{\textrm{net flow}} = \underbrace{[\partial_{\hat{q}}f_{i_1\varnothing\hat{q}}]g_Q}_{\textrm{quality depr.}} + \underbrace{\Phi_{{\rm n}i_1}(f_{i_1}+\nu_{i_1})\mathbb{I}_{\hat{q}=\log\lambda}}_{\textrm{new varieties}}+ \underbrace{(\mathbb{I}_{i_1=h}-\mathbb{I}_{i_1=\ell})\Psi_hf_hf_{\ell\varnothing\hat{q}}}_{\textrm{acquisitions}} - \underbrace{ \delta f_{i_1\varnothing\hat{q}} }_{\textrm{creat.~destr.}}.
$$ Blueprints for monopoly products flow in and out due to quality depreciation, new-variety creation, and creative destruction. The term captioned "acquisitions" represents the flows from high-productivity firms buying low-productivity monopoly products---note the asymmetry in $i_1$. The KFEs for competitive markets ($i_2\in\{h,\ell\}$) are $$\begin{multline}
\underbrace{g_Nf_{i_1i_2\hat{q}}}_{\textrm{net flow}} = \underbrace{[\partial_{\hat{q}}f_{i_1i_2\hat{q}}]g_Q}_{\textrm{quality depr.}} + \underbrace{\Phi_{{\rm q}i_1}(f_{i_1}+\nu_{i_1})(f_{i_2h,\hat{q}-\log\lambda}+f_{i_2\ell,\hat{q}-\log\lambda}+f_{i_2\varnothing,\hat{q}-\log\lambda})}_{\textrm{quality innovations}} \\
+ \underbrace{(\mathbb{I}_{i_1=h}-\mathbb{I}_{i_1=\ell})\Psi_hf_hf_{\ell i_2\hat{q}}}_{\textrm{acquisitions}} - \underbrace{ \delta f_{i_1i_2\hat{q}} }_{\textrm{creat.~destr.}}.
\end{multline}$$ The only difference from the monopoly KFEs is that, instead of inflows from new varieties, there are inflows from quality innovations on incumbent firms' blueprints.

The total type-$a_{i_1}$ blueprint share $f_{i_1}=N_{i_1t}/N_t$ is the main determinant of total revenue and value shares in the economy. Aggregating the KFEs across $i_2$ and $\hat{q}$ implies that the total type-$a_{i_1}$ blueprint share satisfies $$
\underbrace{g_Nf_{i_1}}_{\textrm{net flow}} = \underbrace{\Phi_{{\rm n}{i_1}}(f_{i_1}+\nu_{i_1})}_{\textrm{new varieties}} + \underbrace{\Phi_{{\rm q}{i_1}}(f_{i_1}+\nu_{i_1}) - \delta f_{i_1}}_{\textrm{net quality innovations}} + \underbrace{(\mathbb{I}_{i_1=h}-\mathbb{I}_{i_1=\ell})\Psi_hf_hf_\ell}_{\textrm{acquisitions}}.
$$ This aggregated KFE displays the key forces driving the total high-productivity blueprint share $f_h$. On the one hand, $f_h$ will be driven up by the fact that these firms do more R&D ($\Phi_h>\Phi_\ell$) and that M&A is asymmetric (the last term of the aggregated KFE). On the other hand, $f_h$ will be lower if fewer entrepreneurs are high-productivity to begin with ($\omega_h<\omega_\ell$, so $\nu_h<\nu_\ell$).
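To see these forces at work, the aggregated KFE for $i_1=h$ can be rearranged as $(g_N+\delta)f_h = (\Phi_{{\rm n}h}+\Phi_{{\rm q}h})(f_h+\nu_h) + \Psi_hf_hf_\ell$, where $g_N+\delta=\sum_i(\Phi_{{\rm n}i}+\Phi_{{\rm q}i})(f_i+\nu_i)$ and $f_\ell=1-f_h$. The sketch below solves this single equation for $f_h$ by bisection. It is a partial illustration only: the intensities `a_h`, `a_l` (standing for $\Phi_{{\rm n}i}+\Phi_{{\rm q}i}$), the entrant masses, and $\Psi_h$ are treated as fixed, hypothetical numbers, whereas in the model they are equilibrium outcomes.

```python
def kfe_residual(f_h, a_h, a_l, nu_h, nu_l, Psi_h):
    """Inflows minus outflows of high-type blueprints, per existing blueprint."""
    f_l = 1.0 - f_h
    # g_N + delta: total arrival rate of new ideas across both types.
    total_arrivals = a_h * (f_h + nu_h) + a_l * (f_l + nu_l)
    # High-type inflows: own R&D (incumbents and entrants) plus acquisitions.
    inflow_h = a_h * (f_h + nu_h) + Psi_h * f_h * f_l
    return inflow_h - total_arrivals * f_h


def stationary_high_share(a_h, a_l, nu_h, nu_l, Psi_h, tol=1e-12):
    """Bisection on (0, 1); the residual is positive at 0 and negative at 1."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if kfe_residual(mid, a_h, a_l, nu_h, nu_l, Psi_h) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical inputs: high types innovate more but are rare among entrants.
f_no_ma = stationary_high_share(a_h=0.08, a_l=0.05, nu_h=0.01, nu_l=0.50, Psi_h=0.0)
f_ma = stationary_high_share(a_h=0.08, a_l=0.05, nu_h=0.01, nu_l=0.50, Psi_h=0.10)
print(f"f_h without M&A: {f_no_ma:.3f}; with M&A: {f_ma:.3f}")
```

Holding the R&D intensities fixed, a positive $\Psi_h$ shifts the residual up on $(0,1)$, so the stationary share $f_h$ is higher with M&A than without it.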

## Key equilibrium properties 

In general, the transition path between BGEs can look different from the initial and final BGEs themselves. This limits the usefulness of comparative statics. With that in mind, there are two key implications of the model that hold unambiguously both across BGEs and along the transition path, the first concerning the consequences of heterogeneous productivity and the second concerning the consequences of changing innovation productivity.

#### Equilibrium consequences of heterogeneous productivity 

In equilibrium, high-productivity firms endogenously arise as high-profit, high-growth, and high-valuation firms.

**Proposition 1** (Comparisons between productivity types). *Consider any blueprint with follower of type $a_{i_2}\in\{a_h,a_\ell\}$ and relative quality $\hat{q}\in\mathbb{R}$.*

1. *High-productivity producers earn higher revenues, higher markups, and higher profits: $$
 p_{hi_2\hat{q}}y_{hi_2\hat{q}} \geq p_{\ell i_2\hat{q}}y_{\ell i_2\hat{q}}, \quad \mu_{hi_2} \geq \mu_{\ell i_2}, \quad\textrm{and}\quad \pi_{hi_2\hat{q}} > \pi_{\ell i_2\hat{q}}.
 $$*

2. *Every high-productivity firm spends more per blueprint on R&D and M&A: $$
 w_{{\rm x}t}x_{{\rm k}h} > w_{{\rm x}t}x_{{\rm k}\ell} \quad\forall\:{\rm k}\in\{{\rm n},{\rm q},{\rm m}\}.
 $$*

3. *The product value and growth value of the blueprint are strictly higher for a high-productivity producer than for a low-productivity producer: $$
 v_{hi_2}^{\rm P} > v_{\ell i_2}^{\rm P} \quad\textrm{and}\quad v_h^{\rm G} > v_\ell^{\rm G}, \quad\textrm{so}\quad v_{hi_2\hat{q}} > v_{\ell i_2\hat{q}}.
 $$*

See Appendix 11.6 for a proof. The first statement simply follows from the fact that high-productivity firms are, by assumption, able to produce the same blueprint at a lower marginal cost. The second and third facts follow from the first. Because high-productivity firms earn higher profits, their future growth opportunities are also more profitable. They hence expect a higher present value from successfully innovating, and choose a higher R&D intensity in equilibrium (recall the first-order condition for R&D). High-productivity firms are therefore not just more profitable today; they also have higher expected *growth rates*, and, in turn, tend to have higher valuations relative to sales and cashflows. Put another way, static productivity differences $a_h/a_\ell$ are compounded in present value terms.

Only high-productivity firms will choose to invest in M&A search. A high-productivity firm ascribes a strictly higher valuation to every given blueprint, and is therefore willing to buy a low-productivity firm's blueprint at a premium relative to its current traded market value. The reverse is not true. This has two macroeconomic implications. First, from the aggregated KFE, one can see that M&A tends to reallocate market share toward high-productivity (hence high-valuation) firms. This reallocation matters more for the distribution when M&A is high *relative* to R&D. Second, because M&A transactions create a net surplus, they tend to increase the aggregate valuation of the market. These predictions of the model align well with empirical evidence on M&A, which I survey in Appendix 10.1.3.

#### Equilibrium consequences of changing innovation productivity $\varphi$ 

Figure 4 plots comparative statics with respect to $\varphi$ for three moments of interest. The first, plotted in Panel A, is the ratio of R&D-to-value. As will be explained in detail in the next section, lower innovation productivity decreases the R&D-to-value ratio because it drives a wedge between the present value of profits (which affects both the numerator and denominator) and R&D effort. Panel B shows that, as $\varphi$ falls, the M&A-to-R&D ratio rises. This is mainly because R&D falls, and also because M&A spending increases within high-productivity firms.

<figure id="fig:compstats_varphi" data-latex-placement="t">
<div class="center">
<p></p>
<table>
<tbody>
<tr>
<td style="text-align: left;"><strong>A. Agg. R&amp;D-to-value</strong></td>
<td style="text-align: left;"><strong>B. Agg. M&amp;A-to-R&amp;D</strong></td>
<td style="text-align: left;"><strong>C. High-productivity share</strong></td>
</tr>
</tbody>
</table>
<img src="paperplots/compstat_xrdval_maval_highshare_combined" style="width:100.0%" />
</div>
<figcaption>The figure plots comparative statics across balanced-growth equilibria with respect to innovation productivity <span class="math inline"><em>φ</em></span>. Panels A and B show the aggregate ratios of R&amp;D-to-value and M&amp;A-to-R&amp;D, respectively. Panel C plots the total share of blueprints, sales, and profits of high-productivity firms. Model parameters take the values estimated in Section <a href="#sec:transitionestimation" data-reference-type="ref" data-reference="sec:transitionestimation">4</a>. See Appendix <a href="#app:numerical_bge" data-reference-type="ref" data-reference="app:numerical_bge">13.1</a> for details of the numerical solution method.</figcaption>
</figure>

As just explained, a higher M&A-to-R&D ratio implies more net reallocation toward high-productivity firms. Panel C of Figure 4 plots the consequence of this for the total high-productivity market share. As innovation productivity $\varphi$ falls, high-productivity firms gain market share in terms of blueprints, sales, and profits. M&A is the essential mechanism here: were we to assume no M&A ($\psi=0$), then virtually nothing would happen to high-productivity market shares. One can see how the three predictions in Figure 4 qualitatively align with the motivating empirical facts in Section 1.

# Estimation of the model and transition path 

## Overview and methods 

The goal of this section is to quantify the decline in innovation productivity and its consequences for the U.S. economy over the last fifty years. To this end, I first estimate the model parameters using firm-level micro data from before the 1980s, assuming the U.S. was on a balanced-growth path. I then estimate the transition path of innovation productivity, $\varphi_t$, using the time series of R&D-to-value. Finally, in Section 5, I feed these estimates into the model and study its equilibrium implications along the transition path.

The numerical methods for solving and simulating the model are laid out in detail in Appendix 13, and estimation details are in Appendix 14. The solution methods build on the continuous-time iterative methods in Achdou et al. (2022) and related work. To solve for the balanced-growth equilibrium, I develop a numerical method that iterates over the distribution $\{f_{i_1i_2\hat{q}}\}$ and the ratio $L_t/N_t$. To solve the transition path, I use a forward-backward iteration scheme between balanced-growth paths. To simulate firms, I develop a new method that exploits analytical results for the distribution of size and age (derived in Appendix 12). The estimation is via simulated method of moments (SMM), using the numerical SMM procedure of Catherine et al. (2023).

## Estimation of the balanced-growth equilibrium 

  Parameter                                   Notation         Value     Source/Target/Assumption
  ------------------------------------------  ---------------  --------  ----------------------------
  **Population**
  Population growth                           $g_L$            0.0125    Historical average
  Skilled share of population                 $\bar{s}$        0.142     Skilled labor force
  **Preferences**
  Substitution elasticity between goods       $\eta$           2.9       Broda and Weinstein (2006)
  Elasticity of intertemporal substitution                     1         Log utility
  Labor disutility                            $\chi$           1         Normalized
  Frisch elasticity                           $\zeta$          0.5       Labor literature
  **Firms**
  Production-function elasticity              $\varepsilon$    0.5       Innovation literature
  Bargaining power                            $\varrho$        0.5       Equal bargaining power

 : The table reports the baseline model parameters calibrated prior to the estimation procedure. See the main text for details.

#### Externally calibrated parameters

Before estimating the model, I calibrate eight parameters externally. Two come directly from data. The population growth rate $g_L$ is $1.25\%$, the U.S. average from 1955--1985. The share of skilled workers in the labor force is $14.2\%$, which is the share of managers, engineers, and scientists computed by Acemoglu et al. (2018).

The elasticity of substitution between goods $\eta$ is set equal to $2.9$, the median estimate of Broda and Weinstein (2006). Households have log utility, meaning that they have a unit elasticity of intertemporal substitution (EIS). Because none of the moments of interest will depend on the labor disutility scalar $\chi$, I normalize it to 1. For the Frisch elasticity of labor supply, I assume a value of 0.5, in the range of most empirical estimates from the labor literature (see Keane (2011) and Chetty (2012) for summaries).

The elasticity $\varepsilon\in(0,1)$, which determines the relative importance of laborers versus existing ideas in the idea-production functions, is common in endogenous-growth models. Following Acemoglu et al. (2018), I set it to 0.5, in line with a large body of evidence from the microeconomic innovation literature (Blundell et al. 2002; Hall and Ziedonis 2001; Hall 1993; Bloom et al. 2002; Wilson 2009).[^42] Finally, I assume buyers and sellers have equal bargaining power in M&A negotiations ($\varrho=0.5$).[^43]

#### Estimated parameters

In the first stage of the estimation, I estimate seven model parameters in the initial balanced-growth path using micro and macro data: $$
\Theta = \biggl[\, \underbrace{ \rho }_{\textrm{preferences}},\; \underbrace{ \varphi_0,\ \lambda,\ \alpha }_{\textrm{innovation}},\; \underbrace{ \psi }_{\textrm{acquisition}},\; \underbrace{ a_h/a_\ell,\ \omega }_{\textrm{heterogeneity}} \,\biggr].
$$ Table 2 also lists these parameters with their descriptions. Note that the subscript on innovation productivity $\varphi_0$ indicates that this is the constant, pre-transition value.

In the rest of this section, I discuss the estimation of each parameter, focusing on how the parameter is identified by a moment in the data. For further details about moment computation and the SMM procedure, see Appendix 14.

+-------------------------------+--------------+----------+-----------------------------+----------+--------+
| Parameter                     | Notation     | Estimate | Identifying moment          | Data     | Model  |
+===============================+==============+==========+=============================+==========+========+
| **Preferences**               |              |          |                             |          |        |
+-------------------------------+--------------+----------+-----------------------------+----------+--------+
| Discount factor               | $\rho$       | 0.0174   | Agg. value-to-sales         | 1.0006   | 1.0006 |
+-------------------------------+--------------+----------+-----------------------------+----------+--------+
|                               |              | (0.0097) |                             | (0.0230) |        |
+-------------------------------+--------------+----------+-----------------------------+----------+--------+
| **Innovation (R&D)**          |              |          |                             |          |        |
+-------------------------------+--------------+----------+-----------------------------+----------+--------+
| Innovation productivity       | $\varphi_0$  | 0.0412   | Agg. R&D-to-value           | 0.0204   | 0.0204 |
+-------------------------------+--------------+----------+-----------------------------+----------+--------+
|                               |              | (0.0036) |                             | (0.0008) |        |
+-------------------------------+--------------+----------+-----------------------------+----------+--------+
| Quality innovation size       | $\lambda$    | 1.1380   | Agg. output growth          | 0.0248   | 0.0248 |
+-------------------------------+--------------+----------+-----------------------------+----------+--------+
|                               |              | (0.0204) |                             | (0.0040) |        |
+-------------------------------+--------------+----------+-----------------------------+----------+--------+
| Relative quality productivity | $\alpha$     | 0.7944   | Firm entry rate             | 0.1191   | 0.1191 |
+-------------------------------+--------------+----------+-----------------------------+----------+--------+
|                               |              | (0.0061) |                             | (0.0048) |        |
+-------------------------------+--------------+----------+-----------------------------+----------+--------+
| **Acquisition (M&A)**         |              |          |                             |          |        |
+-------------------------------+--------------+----------+-----------------------------+----------+--------+
| M&A search productivity       | $\psi$       | 0.0819   | Agg. M&A-to-value           | 0.0116   | 0.0116 |
+-------------------------------+--------------+----------+-----------------------------+----------+--------+
|                               |              | (0.0098) |                             | (0.0013) |        |
+-------------------------------+--------------+----------+-----------------------------+----------+--------+
| **Firm heterogeneity**        |              |          |                             |          |        |
+-------------------------------+--------------+----------+-----------------------------+----------+--------+
| Productivity gap              | $a_h/a_\ell$ | 1.1271   | St. dev. of value-to-sales  | 0.5806   | 0.5806 |
+-------------------------------+--------------+----------+-----------------------------+----------+--------+
|                               |              | (0.0284) |                             | (0.0303) |        |
+-------------------------------+--------------+----------+-----------------------------+----------+--------+
| High-prod. entrep. share      | $\omega$     | 0.0124   | St. dev. of profit-to-sales | 0.5463   | 0.5463 |
+-------------------------------+--------------+----------+-----------------------------+----------+--------+
|                               |              | (0.0051) |                             | (0.0162) |        |
+-------------------------------+--------------+----------+-----------------------------+----------+--------+

: The table reports the baseline estimates of parameters in the balanced-growth equilibrium before the transition. The subscript on innovation productivity $\varphi_0$ indicates that this is the value on the balanced-growth path, before the transition commences. Cross-sectional standard deviations are weighted by sales and divided by the corresponding aggregate moment. Standard errors are in parentheses. See the main text and Appendix 14 for further details on the SMM estimation procedure and how each moment is computed.

#### Discount factor $\rho$

The rate of time preference $\rho$ discounts all firms' cashflows by determining the growth-adjusted discount rate $r_f-g_y=\rho$. Market valuations are therefore higher (relative to sales and cashflows) when the discount rate $\rho$ is lower. Thus, we can identify $\rho$ using the aggregate value-to-sales ratio $V_t/Y_t$ of the market.[^44] In the data, I use the average from 1965--1975 and measure firm value as equity plus debt net of cash.

#### Initial innovation productivity $\varphi_0$

Identification of the initial $\varphi_0$ comes from the aggregate R&D-to-value ratio. I discuss the intuition for identification at length in Section 4.3 below when I estimate the transition path of $\{\varphi_t\}$. Because the R&D data only begin in 1975, I use the average ratio from 1975--1985 for this data moment.

#### Quality innovation step size $\lambda$

Larger quality-innovation increments $\lambda>1$ will mean that aggregate quality $Q_t$ grows faster with each new idea. Hence, $\lambda$ can be identified from the growth rate of aggregate output per capita $$
g_y = \frac{\lambda^{\eta-1}}{\eta-1}g_L + \frac{\lambda^{\eta-1}-1}{\eta-1}\delta.
$$ In the data, I use percentage growth in real output per capita from 1950--1985.
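The expression for $g_y$ follows from substituting $g_N=g_L$ and $g_Q=\frac{\lambda^{\eta-1}-1}{\eta-1}(g_N+\delta)$ into $g_y=\frac{1}{\eta-1}g_N+g_Q$. As a back-of-the-envelope check, the Python sketch below inverts it for the rate of creative destruction implied by the reported values of $\lambda$, $\eta$, $g_L$, and the output-growth moment. The function names are hypothetical, and the implied $\delta$ is an illustrative calculation, not a number reported in the estimation.

```python
def output_growth(lam, eta, g_L, delta):
    """g_y = lam^(eta-1)/(eta-1) * g_L + (lam^(eta-1) - 1)/(eta-1) * delta."""
    step = lam ** (eta - 1.0)
    return step / (eta - 1.0) * g_L + (step - 1.0) / (eta - 1.0) * delta


def implied_creative_destruction(g_y, lam, eta, g_L):
    """Invert the growth equation for delta, given the other three inputs."""
    step = lam ** (eta - 1.0)
    return (g_y - step / (eta - 1.0) * g_L) * (eta - 1.0) / (step - 1.0)


# Inputs taken from the calibration and estimation tables in the text.
delta_implied = implied_creative_destruction(g_y=0.0248, lam=1.1380, eta=2.9, g_L=0.0125)
print(f"implied creative-destruction rate: {delta_implied:.4f}")
```

With these inputs the implied rate is roughly 0.11 per year; plugging it back into `output_growth` recovers the targeted output growth of 0.0248.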

#### Relative quality productivity $\alpha$

The higher is $\alpha\in(0,1)$, the more likely it is that a new idea will be a quality innovation instead of a new variety. Thus, new ideas will tend to displace incumbents' existing ideas, increasing creative destruction $\delta$ and the rate of firm exit. Under balanced growth, the rates of firm entry and exit per existing firm are linked by the equation $(\textrm{entry rate} = g_L + \textrm{exit rate})$. Hence, both the entry and exit rates can identify $\alpha$. For the data moment, I use the rate of new-firm entry per existing firm in the Business Dynamics Statistics (BDS) from 1979--1985. Importantly, the BDS data include all firms, not just firms entering public markets.

#### M&A search productivity $\psi$

Let $V_{{\rm m}t}$ denote aggregate M&A deal flow: the total dollar amount spent on completed acquisitions of firms (not on M&A search labor). We can identify $\psi$ from the aggregate M&A-to-value ratio: $$
\frac{V_{{\rm m}t}}{V_t} = \underbrace{\Psi_hf_hf_\ell \vphantom{\frac{\bar{v}_{\rm m}}{V_t/N_t}}}_{\textrm{frequency}}\times\underbrace{\frac{\bar{v}_{\rm m}}{V_t/N_t}}_{\textrm{values}}.
$$ This ratio has two components. The first is the frequency with which targets are found and acquired, which is higher when $\psi$ (and hence $\Psi_h$) is higher. The second is the ratio of the average value of an acquired blueprint $\bar{v}_{\rm m}$ to the average value of an existing blueprint $V_t/N_t$. By scaling by aggregate value, we control for blueprint-level valuation changes (the second component), isolating the frequency component of M&A.

In the data, the numerator in the M&A-to-value ratio is the sum of the values of all mergers in a given year in the SDC Platinum database from 1978--1985. All acquirers are public firms, and acquisition targets include both public and private firms.

#### Firm heterogeneity parameters $a_h/a_\ell$ and $\omega$

The two parameters that govern ex ante firm heterogeneity are the productivity gap $a_h/a_\ell$ and the high-productivity share of entrepreneurs $\omega$. Identifying the efficiency advantage $a_h/a_\ell$ is especially important for the transition path, because this will determine the extent to which reallocation between firms affects macroeconomic aggregates. These parameters must be jointly identified, because both influence the (unobservable) market shares of high-productivity firms.[^45]

First, fix a value of $\omega$ and consider $a_h/a_\ell$. A larger gap has two effects. First, it increases the total market share of high-productivity firms. Second, it increases the cross-sectional difference between high- and low-productivity firms' profits and valuations. One moment that captures these effects is the *sales-weighted* cross-sectional variance of value-to-sales, defined as $$
\textrm{var}_{py}\left(\frac{V_{it}}{p_{it}y_{it}}\right) \equiv \int_0^{M_t} \frac{p_{it}y_{it}}{Y_t}\left(\frac{V_{it}}{p_{it}y_{it}} - \frac{V_t}{Y_t}\right)^2 di.
$$ By the law of total variance, this sales-weighted variance can be decomposed across types as $$
\textrm{var}_{py}\left(\frac{V_{it}}{p_{it}y_{it}}\right) = \underbrace{\sum\nolimits_{i'\in\{h,\ell\}}\frac{Y_{i't}}{Y_t}\textrm{var}_{py}\left(\frac{V_{it}}{p_{it}y_{it}}\biggl\rvert a_i=a_{i'}\right)}_{\textrm{within-type variance}} + \underbrace{\sum\nolimits_{i'\in\{h,\ell\}}\frac{Y_{i't}}{Y_t}\left(\frac{V_{i't}}{Y_{i't}} - \frac{V_t}{Y_t}\right)^2}_{\textrm{between-type variance}},
$$ where $Y_{it}$ is the total sales of all type-$a_i$ firms. The within-type term is largely invariant to $a_h/a_\ell$, as within-type dispersion in value-to-sales is all driven by heterogeneous markups and quality. The between-type term is increasing in $a_h/a_\ell$, because $a_h/a_\ell$ increases both the total sales share of high-productivity firms and the gap between total high and low value-to-sales ratios.[^46] This between-type variance is the main source of identification.

The challenge to identifying $a_h/a_\ell$ using the variance of value-to-sales is that this moment is *also* increasing in $\omega$. Specifically, raising $\omega$ will increase the total high-productivity sales share, increasing the between-type variance. Thus, in order to separately identify $a_h/a_\ell$ and $\omega$, one needs a moment that "controls" for the sales share. To do this, I use the sales-weighted cross-sectional variance of profit-to-sales. Intuitively, this moment depends on sales shares in the same way as the variance of value-to-sales; however, it is comparatively less sensitive to changes in $a_h/a_\ell$, because market valuations compound differences in profits (through the present value of growth opportunities).

Empirically, these sales-weighted variances are computed across firms in Compustat from 1965--1975. To standardize units, the moments reported in Table 2 are standard deviations, scaled by the respective aggregate ratios (see Appendix 14.2 for further discussion).
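The sales-weighted variance and its within/between decomposition can be sketched in a few lines of Python. The firm-level numbers and the type labels below are made up for illustration; in the data, types are unobservable, so the decomposition is a property of the model rather than something computed directly from Compustat.

```python
def weighted_var_decomposition(sales, values, types):
    """Sales-weighted variance of value-to-sales, split into within- and between-type parts."""
    total_sales = sum(sales)
    agg_ratio = sum(values) / total_sales  # aggregate V_t / Y_t
    total_var = sum(s / total_sales * (v / s - agg_ratio) ** 2
                    for s, v in zip(sales, values))

    within, between = 0.0, 0.0
    for k in sorted(set(types)):
        idx = [i for i, t in enumerate(types) if t == k]
        Y_k = sum(sales[i] for i in idx)   # total sales of type k
        V_k = sum(values[i] for i in idx)  # total value of type k
        ratio_k = V_k / Y_k
        var_k = sum(sales[i] / Y_k * (values[i] / sales[i] - ratio_k) ** 2 for i in idx)
        within += Y_k / total_sales * var_k
        between += Y_k / total_sales * (ratio_k - agg_ratio) ** 2
    return total_var, within, between


# Hypothetical firms: two low-productivity ("l") and two high-productivity ("h").
sales = [10.0, 20.0, 5.0, 15.0]
values = [12.0, 18.0, 15.0, 30.0]
types = ["l", "l", "h", "h"]
total_var, within, between = weighted_var_decomposition(sales, values, types)
print(f"total = {total_var:.3f}, within = {within:.3f}, between = {between:.3f}")
```

The decomposition is exact because the sales-weighted mean of firm-level value-to-sales equals the aggregate ratio, both overall and within each type.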

#### Discussion of estimation results

Table 2 reports the estimation results. The model matches each moment well (of course, it is exactly identified), and the relatively small standard errors validate parameter identification. Importantly, the estimate of $\varphi_0$ is very precise, suggesting that R&D-to-value is a good identifying moment (more on this in Section 4.3). Appendix 14.3 presents additional corroborating evidence of identification.

The estimated $\alpha$ is about $0.8$, meaning it is easier to innovate on existing blueprints than to invent entirely new goods. High-productivity firms are estimated to be about 13% more efficient, and the low value of $\omega$ implies that they are uncommon among entrepreneurs.[^47]

## Estimation of the decline in innovation productivity 

In the second stage of the estimation, I estimate the time series of innovation productivity $\{\varphi_t\}$ from 1975 to present, holding all other parameters fixed. I first discuss the intuition for identification from the R&D-to-value ratio, and then present the results.

#### Identification of innovation productivity $\varphi_t$

As in the balanced-growth path, innovation productivity is identified by the aggregate R&D-to-value ratio. The intuition is as follows. As we saw in the first-order conditions for R&D, a firm's R&D spending is increasing in innovation productivity $\varphi$, increasing in the present value of new ideas $\bar{v}_{{\rm n}i}$ and $\bar{v}_{{\rm q}i}$, and decreasing in the skilled wage $w_{{\rm x}t}$. The main challenge to identifying $\varphi$ from R&D spending is controlling for valuation and wage changes. The insight for identification is that one can do this by dividing by aggregate value.

It is most straightforward to see this in the case with inelastic labor supply ($\zeta=0$). Letting $X_{{\rm r}t} \equiv X_{{\rm n}t} + X_{{\rm q}t}$ denote the aggregate quantity of researchers employed by incumbents, the equilibrium R&D-to-value ratio at any point in time equals[^48] $$
\frac{w_{{\rm x}t}X_{{\rm r}t}}{V_t} = \underbrace{\varphi_t \vphantom{\biggr)^{-\varepsilon}} }_{\textrm{prod.}} \times \underbrace{\frac{\bar{v}_{{\rm r}t}}{V_t/N_t} \vphantom{\biggr)^{-\varepsilon}} }_{\textrm{values}} \times \underbrace{\varepsilon\biggl(1+\frac{N_t}{\bar{s}L_t}\biggl(1+\left(\frac{\psi}{\varphi_t}\frac{\bar{v}_{{\rm m}t}}{\bar{v}_{{\rm r}t}}\right)^{\frac{1}{1-\varepsilon}}\biggr)\biggr)^{-\varepsilon}}_{\textrm{substitution between factors and technologies}},
$$ where the average present value of a new idea from R&D equals $$
\bar{v}_{{\rm r}t} \equiv \left(\sum_{i\in\{h,\ell\}}f_{it}\left([(1-\alpha)\bar{v}_{{\rm n}it}]^{\frac{1}{1-\varepsilon}}+[\alpha\bar{v}_{{\rm q}it}]^{\frac{1}{1-\varepsilon}}\right)\right)^{1-\varepsilon}.
$$ The R&D-to-value ratio moves one-for-one (in percentage terms) with innovation productivity $\varphi_t$, the first term. The second term, which is the ratio of the average new blueprint value to the average existing blueprint value, is essentially a constant, because parameter changes that increase valuations (e.g., a lower $\rho$) affect the numerator and denominator roughly proportionately. This is why scaling R&D by value---and not, say, output---is key to identifying $\varphi_t$. The third term combines substitution effects between factors (researchers versus blueprints) and technologies (R&D versus M&A). For the latter tradeoff, if M&A search has comparatively higher returns, then it will increase the skilled wage and reduce R&D-to-value. Under the estimated parameters, this third term is quantitatively insensitive to all parameter changes.[^49]

Consequently, the (percent) changes in R&D-to-value and $\varphi_t$ over any interval $[t,t+\tau]$ are roughly the same: $$
\Delta\log\left(\frac{w_{{\rm x},t+\tau}X_{{\rm r},t+\tau}}{V_{t+\tau}}\right) \approx \Delta\log\varphi_{t+\tau}.
$$ In the actual case of elastic labor supply $\zeta>0$, there is also a small effect from the fact that skilled labor supply rises if $\varphi_t$ or $\bar{v}_{{\rm r}t}$ rises.[^50]
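One way to make the approximation explicit is to take logs of the three-factor expression for R&D-to-value (in the inelastic case $\zeta=0$) and difference over $[t,t+\tau]$. The shorthand $T^{\rm val}_t\equiv\bar{v}_{{\rm r}t}/(V_t/N_t)$ for the values factor and $T^{\rm sub}_t$ for the substitution factor is introduced here only for this illustration: $$
\Delta\log\left(\frac{w_{{\rm x},t+\tau}X_{{\rm r},t+\tau}}{V_{t+\tau}}\right) = \Delta\log\varphi_{t+\tau} + \underbrace{\Delta\log T^{\rm val}_{t+\tau}}_{\approx\,0} + \underbrace{\Delta\log T^{\rm sub}_{t+\tau}}_{\approx\,0} \approx \Delta\log\varphi_{t+\tau}.
$$ The approximation therefore rests on the two claims made in the text: that the values factor is essentially constant and that the substitution factor is quantitatively insensitive to parameter changes.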

Appendix 14.3 shows quantitatively that other parameter shifts in the model cannot explain the decline in R&D-to-value, verifying the above intuition. I show this by feeding other parameter transitions into the model and computing the resulting path of R&D-to-value. In fact, some plausible parameter shifts---like a decline in discount rates $\rho$, which increases valuations---actually cause R&D-to-value to *rise*, due to the positive labor-supply response. Declining innovation productivity is therefore essential to explaining this moment.

#### Estimates of innovation productivity $\varphi_t$

I estimate the time series $\{\varphi_t\}$ to fit the time series of R&D-to-value. To keep this transition parsimonious, and to smooth out higher-frequency movements in R&D-to-value that are potentially unrelated to innovation productivity, I assume that this path takes the functional form $$
\varphi_t = \varphi_0 + \frac{1}{1+\exp\{-\kappa_\varphi(t-t_{\rm mid})\}}(\varphi_\infty - \varphi_0),
$$ where $\varphi_0$ is the initial level of innovation productivity (estimated in the first stage), $\varphi_\infty$ is the terminal value, $t_{\rm mid}$ is the midpoint year of the transition (i.e., when $\varphi_t=\frac{1}{2}(\varphi_0+\varphi_\infty)$), and $\kappa_\varphi>0$ governs the speed of the transition. The estimated parameters minimize the sum of squared errors between R&D-to-value in the model and data along the transition.
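A minimal Python sketch of this logistic path follows. It assumes the transition starts near $\varphi_0$ and converges to $\varphi_\infty$ with $\kappa_\varphi>0$, so the exponent enters with a negative sign. The value of $\varphi_0$ is the first-stage estimate; the terminal value, speed, and midpoint year are hypothetical placeholders chosen only to illustrate the shape, not the second-stage estimates.

```python
import math


def innovation_productivity(t, phi_0, phi_inf, kappa, t_mid):
    """Logistic transition from phi_0 (early) to phi_inf (late), centered at t_mid."""
    weight = 1.0 / (1.0 + math.exp(-kappa * (t - t_mid)))  # rises from 0 to 1 when kappa > 0
    return phi_0 + weight * (phi_inf - phi_0)


phi_0 = 0.0412            # first-stage estimate from the text
phi_inf = 0.5 * phi_0     # hypothetical: productivity falls by half
kappa, t_mid = 0.2, 1995  # hypothetical speed and midpoint year

path = {year: innovation_productivity(year, phi_0, phi_inf, kappa, t_mid)
        for year in range(1975, 2026, 10)}
for year, phi in path.items():
    print(year, round(phi, 4))
```

At `t_mid` the path equals the average of the initial and terminal values, matching the definition of the midpoint year in the text.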

<figure id="fig:modelvsdata_aggxrdma" data-latex-placement="t">
<div class="center">
<p></p>
<table>
<tbody>
<tr>
<td style="text-align: left;"><strong>A. Estimated innovation productivity</strong></td>
<td style="text-align: left;"><strong>B. Aggregate R&amp;D-to-value</strong></td>
</tr>
</tbody>
</table>
<img src="paperplots/trans_varphi_xrdval" style="width:100.0%" />
</div>
<figcaption>The figure shows the estimated time series of innovation productivity <span class="math inline">{<em>φ</em><sub><em>t</em></sub>}</span> (Panel A) and the aggregate R&amp;D-to-value ratio in the model and data (Panel B). The function governing the transition of <span class="math inline"><em>φ</em><sub><em>t</em></sub></span> is given by (<a href="#eq:varphi_transfunction" data-reference-type="ref" data-reference="eq:varphi_transfunction">[eq:varphi_transfunction]</a>); the horizontal dashed lines in Panel A correspond to the estimated pre-transition value <span class="math inline"><em>φ</em><sub>0</sub></span> and terminal value <span class="math inline"><em>φ</em><sub>∞</sub></span>. See the main text and Appendix <a href="#app:estimation" data-reference-type="ref" data-reference="app:estimation">14</a> for further details on the SMM estimation procedure.</figcaption>
</figure>

Panel A of Figure 5 plots the estimated time series of $\varphi_t$.[^51] Innovation productivity is estimated to have fallen roughly in half since 1980, meaning the same quantity of research inputs in the twenty-first century is expected to yield half the number of new ideas. Panel B plots the R&D-to-value ratio in the data and model, from which this decline is estimated.

# Main results 

I now study the transition-path equilibrium of the model under the estimated decline in innovation productivity $\varphi_t$, holding all other parameters fixed.[^52] The main finding is that the estimated decline in $\varphi_t$ explains virtually all of the long-run decline in economic growth and the rise in aggregate value-to-sales. As in the empirical decompositions in Section 1.3, all of the rise in aggregate valuations is from a reallocation of sales to high-valuation firms.

## The decline in economic growth

As one might expect, a decline in innovation productivity implies falling economic growth. Figure 6 plots the rate of per-capita output growth in the data and model. The data line shows a ten-year, two-sided moving average (i.e., $\pm10$ years) of the annualized growth rate of per-capita real GDP. In the model, this growth rate falls by about 1.1 percentage points, which is virtually all of the long-run decline we see in the data.[^53]

<figure id="fig:results_growth" data-latex-placement="t">
<div class="center">
<p> <img src="paperplots/trans_growthrate_bea" style="width:70.0%" alt="image" /></p>
</div>
<figcaption>The figure shows the annualized percentage growth rate of real output per capita in the data and model. The data series is a ten-year, two-sided moving average (i.e., an equal-weighted average over a window of <span class="math inline">±10</span> years) of the annualized quarterly growth rate using real per-capita output from the BEA. The model line is the growth rate of per-capita output: <span class="math inline"><em>g</em><sub><em>y</em><em>t</em></sub> = <em>ẏ</em><sub><em>t</em></sub>/<em>y</em><sub><em>t</em></sub></span>, where <span class="math inline"><em>y</em><sub><em>t</em></sub> = <em>Y</em><sub><em>t</em></sub>/<em>L</em><sub><em>t</em></sub></span>.</figcaption>
</figure>
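
The two-sided moving average used for the data line can be sketched as follows. This is an illustrative implementation, not the paper's code: the function name, the toy series, and the treatment of the sample endpoints (the window is truncated to the available observations) are all assumptions.

```python
def two_sided_moving_average(values, half_window):
    """Equal-weighted average over [i - half_window, i + half_window].

    Near the ends of the sample the window is truncated to the available
    observations (an assumption; the paper does not state its edge rule).
    """
    smoothed = []
    n = len(values)
    for i in range(n):
        lo = max(0, i - half_window)
        hi = min(n, i + half_window + 1)
        window = values[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed

# Hypothetical growth rates in percent, for illustration only.
growth = [3.0, 1.0, 4.0, 2.0, 0.0, 2.0, 1.0]
smooth = two_sided_moving_average(growth, half_window=1)
```

With quarterly observations, a window of $\pm10$ years would correspond to `half_window=40`.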

Essentially all of this decline comes from the fact that there is less innovation output, but it is worth noting that there are two other channels coming from reallocation.[^54] On the one hand, growth increases because more productive firms are producing more of the products in the economy. On the other hand, because these high-productivity firms tend to have higher markups, the aggregate markup slightly rises, suppressing labor supply and reducing output growth. The net effect of these channels is positive but negligibly small.

## The reallocation-driven rise in aggregate valuations

The decline in innovation productivity causes a large rise in the aggregate valuation of the stock market. Figure 7 plots the aggregate value-to-sales ratio in the model and data since 1965. The model-implied value-to-sales ratio increases 83%, essentially all of the secular increase from the late 1960s and early 1970s to the twenty-first century. Notably, the decline in innovation productivity does not explain the dramatic fall and rise of valuations around 1980. This change coincided with a surge in real interest rates and unusually low profits and markups, neither of which is accounted for by this model.[^55]

<figure id="fig:results_valsale_agg" data-latex-placement="t">
<div class="center">
<p> <img src="paperplots/trans_valsale_agg" style="width:70.0%" alt="image" /></p>
</div>
<figcaption>The figure shows the ratio of aggregate value to aggregate sales in the model and data. The data line is the same as the line for value-to-sales in Panel A of Figure <a href="#fig:aggxrd" data-reference-type="ref" data-reference="fig:aggxrd">1</a>.</figcaption>
</figure>

What explains the stock-market boom? Figure 8 plots the decomposition of the aggregate value-to-sales ratio in the data and model. Recall that the within-firm change measures the cumulative effect of changing firm-level valuations, holding sales shares fixed, while the compositional change measures the effect of changing sales shares, holding firm-level valuations fixed. The data plot (Panel A) repeats Figure 3 and the model plot (Panel B) computes the decomposition on a simulated panel of firms along the model transition. The y-axes in both plots are normalized by the cumulative change in aggregate value-to-sales from 1970--2020, so that each point represents a percentage of this cumulative change.

As in the data, the model implies that all of the cumulative rise in aggregate value-to-sales came from the compositional change, a reallocation of sales shares toward high-valuation firms. Essential to this conclusion are the two main model ingredients---heterogeneous productivity and M&A---without which there would be no reallocation. I will discuss the role of M&A shortly. The within-firm change in Panel B also shows that, in the model, there is an initial rise in firm-level valuations that subsequently reverts. The reason for this is perfect foresight: investors anticipate the coming rise in M&A and value, and this is priced ex ante.

<figure id="fig:results_valsale_decomp" data-latex-placement="t">
<div class="center">
<p></p>
<table>
<tbody>
<tr>
<td style="text-align: left;"><strong>A. Data</strong></td>
<td style="text-align: left;"><strong>B. Model</strong></td>
</tr>
</tbody>
</table>
<img src="paperplots/decomp_valsale_sim_combined_pct" style="width:100.0%" />
</div>
<figcaption>The figure plots the decompositions of changes in aggregate value-to-sales into within-firm and compositional components in the data and model. The decomposition is defined in (<a href="#eq:valsale_decomp" data-reference-type="ref" data-reference="eq:valsale_decomp">[eq:valsale_decomp]</a>)—see the main text for an explanation. In both panels, the y-axes are rescaled by the cumulative change in the aggregate from 1970–2020. Panel A is the exact same as Panel A of Figure <a href="#fig:results_valsale_decomp" data-reference-type="ref" data-reference="fig:results_valsale_decomp">8</a> (except with a rescaled y-axis). Panel B is computed from a simulated panel of firms along the transition path in the model.</figcaption>
</figure>
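
To make the within-firm versus compositional split concrete, here is a minimal sketch of one standard shift-share decomposition for a balanced panel over two periods. It uses midpoint weights so that the two components sum exactly to the total change. This particular form is an assumption for illustration and may differ from the paper's exact definition (for example, in how entry and exit are handled); all names and numbers are hypothetical.

```python
def decompose_change(values_0, sales_0, values_1, sales_1):
    """Split the change in aggregate value-to-sales into two parts.

    Inputs are per-firm lists for the same firms in periods 0 and 1.
    Returns (within, compositional, total).
    """
    tot_s0, tot_s1 = sum(sales_0), sum(sales_1)
    within = 0.0
    compositional = 0.0
    for v0, s0, v1, s1 in zip(values_0, sales_0, values_1, sales_1):
        r0, r1 = v0 / s0, v1 / s1              # firm-level value-to-sales
        w0, w1 = s0 / tot_s0, s1 / tot_s1      # sales shares
        within += 0.5 * (w0 + w1) * (r1 - r0)  # shares held at their midpoint
        compositional += 0.5 * (r0 + r1) * (w1 - w0)  # ratios held at midpoint
    total = sum(values_1) / tot_s1 - sum(values_0) / tot_s0
    return within, compositional, total

# Two hypothetical firms: firm-level ratios are unchanged, but sales shift
# toward the high-valuation firm.
within, comp, total = decompose_change(
    values_0=[30.0, 10.0], sales_0=[10.0, 10.0],
    values_1=[60.0, 5.0], sales_1=[20.0, 5.0],
)
```

In this toy example the aggregate ratio rises even though no firm's own valuation ratio changes, so the entire increase is attributed to the compositional term.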

What explains the stability of within-firm valuations over time? Put another way, how were high-productivity firms able to maintain high growth rates despite falling R&D? There are two main reasons. First is the presence of M&A: high-productivity firms mitigated the effect of lower R&D by continuing to grow through the (relatively more productive) M&A technology. The second major reason is the decline in creative destruction by competitors and entrants. Because some innovations come in the form of quality improvements, and quality improvements have negative spillovers onto incumbents, a general decline in R&D reduces the likelihood that an incumbent will lose its products. This channel raises the expected growth rates of incumbents, translating into higher (firm-level) valuations relative to both sales and cashflows. This is where the estimated value of $\alpha$ matters for valuations: the larger is $\alpha$, the larger the decline in creative destruction and firm entry.

## The rise in M&A and reallocation

Figure 9 plots a key implication of the model: the rise in aggregate M&A-to-R&D. The data line is the same as the "mergers and acquisitions of assets" line from Figure 2. M&A-to-R&D rises for two reasons. The first (and main) reason is a denominator effect, which simply follows from the fact that all firms do less R&D. The second is a numerator effect, which follows from the fact that high-productivity firms do more M&A. When R&D gets harder, M&A becomes a relatively more attractive means of growth, so these firms invest relatively more in M&A. The consequence of rising M&A-to-R&D in the model is an increase in the total market share (share of sales, profits, etc.) of high-productivity firms. Specifically, the high-productivity sales share rises from about 15% to about 30% over this period. This explains the quantitatively large reallocation of sales toward high-valuation firms in Figure 8.

<figure id="fig:results_maxrd" data-latex-placement="t">
<div class="center">
<p> <img src="paperplots/trans_maxrd_sm" style="width:70.0%" alt="image" /></p>
</div>
<figcaption>The figure shows the ratio of aggregate M&amp;A spending to aggregate R&amp;D spending in the model and data. M&amp;A spending is the total amount spent by acquirers on targets in completed M&amp;A transactions. The data line is the same as the line for mergers and acquisition of assets in Figure <a href="#fig:aggmaxrd" data-reference-type="ref" data-reference="fig:aggmaxrd">2</a>.</figcaption>
</figure>

The timing of the model-implied M&A-to-R&D rise in Figure 9 is off: it misses the M&A wave of the late 1990s (which coincided with waves of deregulation) and it overshoots thereafter. The model could account for this by making M&A search productivity temporarily high in the 1990s.[^56] Either way, the *cumulative* amount of M&A activity over the full period from 1980 to 2020 is still very close in the model and data. Cumulative M&A activity is what ultimately matters when we consider the total amount of reallocation that has taken place across firms in the stock market.

## Implications for other series

We can also use the model to study secular change in other micro and macro series. Here I discuss two implications that have been of interest to the literature. The first is the decline in new-firm entry and the increase in concentration. Figure 10 plots the rate of new-firm entry in the data (from BDS) and the model, defined as the number of new firms entering the economy per existing firm. The decline in research productivity explains all of the decline in entry because entrepreneurs have a harder time coming up with new ideas for firms.[^57]

<figure id="fig:results_otherseries_entry" data-latex-placement="t">
<div class="center">
<p> <img src="paperplots/trans_entry" style="width:70.0%" alt="image" /></p>
</div>
<figcaption>The figure shows the rate of new-firm entry in the model and data, defined as the number of new firms entering the economy per existing firm. The data series is computed from the Business Dynamics Statistics.</figcaption>
</figure>

As just discussed above, the decline in firm entry means that incumbents face less risk of creative destruction and become entrenched. It also implies a rise in the concentration of blueprints, both because incumbents are able to grow without risk of displacement and because there is less entry of small firms into the economy. A second force that increases concentration is the rise in M&A, which reduces the number of small, low-productivity firms by reallocating their blueprints to larger, high-productivity firms. The increased concentration of blueprints translates into increased concentration of employment, sales, and profits.

Another trend this mechanism can help explain is the long-run secular decline in real interest rates. Figure 11 plots the change in the real interest rate since 1960.[^58] I show the change, instead of the level, because the model-implied interest rate does not match the data level, as discount rates $\rho$ were estimated from firm valuations, not interest rates---Section 7 discusses how to reconcile this. In the model, the interest rate falls by the same amount as per-capita output growth: about one percentage point. This is about half of the decline since 1960. Clearly, the mechanism does not explain the spike in real rates around 1980, in line with the discussion above about where the model also misses firm valuations.

<figure id="fig:results_otherseries_rates" data-latex-placement="t">
<div class="center">
<p> <img src="paperplots/trans_interestrate" style="width:70.0%" alt="image" /></p>
</div>
<figcaption>The figure shows the change in real interest rates since 1960 in the data and model. The data series is the annualized expected real return on a nominal bond, where the realized return is measured as the Fed Funds rate deflated by realized CPI inflation. I infer the series of expected real returns from this series using a procedure similar to that of <span class="citation" data-cites="beeler2012long">Beeler and Campbell (2012)</span>, described in the main text. Note that both the data and model series are shifted to start at zero in 1960, so that this is the change in real rates since 1960.</figcaption>
</figure>

# Implications for welfare 

What are the welfare implications of the divergence of output growth and stock-market wealth over the last half-century? It might seem obvious that declining output growth should mean welfare stagnates. However, what matters for total welfare is not just the level of consumption, but its *present value*. Thus, the fact that wealth boomed over this period could suggest that other forces have offset this negative growth effect.[^59] In this section, I show that, on average, household welfare did not boom with stock-market wealth. Rather, welfare stagnated with output, because all of the gains in the stock market came from a reallocation of output from workers and entrants toward incumbent firms.

To state this formally, note that the value function of the representative household equals $$
U_t = k(t) + \rho^{-1}\log\left(\frac{\overline{W}_t}{L_t}\right),
$$ where $k(t)$ is a deterministic function of time and $\overline{W}_t/L_t$ is *total wealth* per person.[^60] Total wealth is the present value of per-capita consumption $c_t$: $$
\frac{\overline{W}_t}{L_t} = \mathbb{E}_t\left[\int_0^\infty\frac{\xi_{t+\tau}}{\xi_t}c_{t+\tau}d\tau\right] = \rho^{-1}c_t.
$$ Because consumption equals output in equilibrium, we therefore have $$
\overline{W}_t = \rho^{-1}Y_t.
$$ Total wealth is always exactly proportional to output, even along the transition path. Moreover, because output is the sum of all dividends and labor income, $$
Y_t = D_t + w_{{\rm p}t}L_{{\rm p}t} + w_{{\rm x}t}X_t,
$$ we can decompose total wealth into three components: $$
\overline{W}_t = V_t + \textrm{PV}_t^{\rm entrants} + \textrm{PV}_t^{\rm labor}.
$$ That is, total wealth includes not only the value $V_t$ of the stock market (the claim to current incumbent firms' dividends), but also the present value $\textrm{PV}_t^{\rm entrants}$ of future entrants (firms not yet traded) and the present value $\textrm{PV}_t^{\rm labor}$ of labor income paid out to households.
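
The second equality in the total-wealth expression above can be verified in one line if the stochastic discount factor takes the standard log-utility form $\xi_t = e^{-\rho t}c_t^{-1}$. That form is an assumption here, since this section does not restate the SDF, but it is the usual one under a unit EIS: $$
\frac{\xi_{t+\tau}}{\xi_t}c_{t+\tau} = e^{-\rho\tau}\frac{c_t}{c_{t+\tau}}c_{t+\tau} = e^{-\rho\tau}c_t
\quad\Longrightarrow\quad
\int_0^\infty e^{-\rho\tau}c_t\,d\tau = \rho^{-1}c_t.
$$ Under this assumption, the discounted value of future consumption does not depend on its future path, which is why total wealth stays proportional to current output even along the transition.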

<figure id="fig:welfare_wealthcomp" data-latex-placement="t">
<div class="center">
<p> <img src="paperplots/trans_wealthgrowth" style="width:70.0%" alt="image" /></p>
</div>
<figcaption>The figure shows cumulative log change in stock-market wealth (<span class="math inline"><em>V</em><sub><em>t</em></sub></span>) and total wealth (<span class="math inline">$\overline{W}_t$</span>) from 1975–2020 in the model. Total wealth is defined as the present value of consumption; see the main text for a definition.</figcaption>
</figure>

This means that, while the value of the stock market boomed relative to output (see Figure 7), the present value of consumption stagnated with output. Figure 12 plots the cumulative change in the log of each of these wealth measures ($V_t$ and $\overline{W}_t$) since 1975. As output growth fell, total wealth stagnated; at the same time, the value of the stock market accelerated. The decomposition of total wealth accounts for this: all of the gains in stock-market wealth came at the expense of labor income and future entrants. The present value of labor income falls mainly because high-productivity firms have higher markups. The present value of entrants falls both because the rate of new-firm entry falls (see Figure 10) and because the average value of incumbent firms (which tend increasingly to be high-productivity) rises relative to the average value of entrants (which tend to be low-productivity).

# Discussion of model extensions 

Here I consider several model extensions and alternative assumptions. Some of the extensions yield new insights about the economic transition. Appendix 15 gives formal derivations.

#### Systematic risk and risk aversion

Perhaps the most unrealistic assumption of the baseline model is that there is no systematic risk, so firms are discounted at the riskless interest rate. We can generalize this along two dimensions, following Miller et al. (2026) and Farhi and Gourio (2018). First, let us generalize firms' production functions to include a common, *aggregate* productivity shock $z_t$, which follows a jump-diffusion: $$
Y_{ijt} = a_ie^{z_t}L_{ijt}, \quad\text{where}\quad dz_t = \kappa_z(\bar{z}-z_t)dt + \sigma_zdB_t - \zeta_zdJ_t,
$$ for $B_t$ a Brownian motion and $J_t$ a Poisson process with intensity $p_z$. For small $p_z$ and $\zeta_z>0$, this jump term represents a rare disaster, and makes it possible to match a sizable equity premium.[^61] Second, let us generalize household preferences to the recursive formulation of Duffie and Epstein (1992), which continues to assume a unit EIS and a constant Frisch elasticity, but allows for potentially higher risk aversion $\gamma\geq1$.[^62]

Appendix 15.1 re-solves the model under these assumptions and shows that the equilibrium conditions are *exactly the same* as in the case without systematic risk, except that the discount factor $\rho$ in the HJBs is now replaced by a *risk-adjusted* discount factor $$
\rho^* \equiv \rho - (\gamma-1)\frac{\kappa_z}{\kappa_z+\rho}\sigma_z - p_z[e^{-\zeta_z}(1-e^{-(\gamma-1)\frac{\kappa_z}{\kappa_z+\rho}\zeta_z})(1-e^{-\zeta_z})].
$$ The risk adjustment incorporates both the equity risk premium, which raises $\rho^*$, and the precautionary savings motive, which lowers $\rho^*$. Because the model estimation identified the discount factor using a valuation ratio (not the riskfree rate), the estimate is still appropriate in this setting, but should be reinterpreted as an estimate of $\rho^*$, not of $\rho$.

#### Physical capital

Another plausible generalization of the production function is to assume that it takes both labor and physical capital $K_t$ as inputs: $$
Y_{ijt} = a_iK_{ijt}^\beta L_{ijt}^{1-\beta}.
$$ For simplicity, suppose that the capital stock depreciates at constant rate $\delta_K$ and is produced by a competitive capital-goods sector that makes total investment $I_t$ and rents capital out to goods producers at an endogenous rental rate $r_{Kt}$.[^63] This implies total capital accumulation $$
\dot{K}_t = I_t - \delta_KK_t.
$$ Appendix 15.2 derives two results. First, the equilibrium conditions are all the same, except that the "labor share" $\Lambda$ is reinterpreted as the total factor share, consisting of the labor share $(1-\beta)\Lambda$ and the capital share $\beta\Lambda$. Because the estimation moments depend only on the profit share ($1-\Lambda$), the estimates remain valid. The second implication is that the equilibrium investment-capital ratio is given by $$
\frac{I_t}{K_t} = \delta_K + g_L + g_{yt} + g_{\Lambda t}.
$$ Hence, the decline in per-capita output growth $g_{yt}$ and the slight rise in the aggregate markup $\Lambda^{-1}$ in the model imply a decline in physical investment. This is a well-documented trend in the data and the subject of several studies (Farhi and Gourio 2018; Miller et al. 2026; Crouzet and Eberly 2023). In the model, investment stagnates because "average $Q$" booms but the "marginal $q$" of capital investment remains constant. Intuitively, capital exists to scale up production of new ideas, so fewer new ideas means less need for capital.
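
As a short check on how the investment-capital ratio connects to the accumulation equation, dividing $\dot{K}_t = I_t - \delta_KK_t$ by $K_t$ and comparing with the expression for $I_t/K_t$ gives $$
\frac{I_t}{K_t} = \delta_K + \frac{\dot{K}_t}{K_t}
\quad\Longrightarrow\quad
\frac{\dot{K}_t}{K_t} = g_L + g_{yt} + g_{\Lambda t}.
$$ Read this way, the result says that the capital stock grows at the same rate as $\Lambda_tY_t$ (population growth plus per-capita output growth plus the growth rate of the factor share). A plausible interpretation, offered here as an assumption rather than a statement from Appendix 15.2, is that capital income is a constant fraction $\beta$ of $\Lambda_tY_t$ and the rental rate is constant along the path.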

#### Cournot competition

The results also remain the same if we assume that firms compete à la Cournot (quantity competition) instead of à la Bertrand (price competition). The only difference in the Cournot model is that firms no longer engage in limit pricing. Consequently, there can arise product markets in which multiple firms choose to produce.

#### Life-cycle dynamics

As Luttmer (2011) explains, assuming that firms remain high-productivity indefinitely implies that the average age of large firms is too high. Akcigit and Kerr (2018) also show that younger firms tend to have higher R&D intensities and growth rates. Luttmer (2011) and Acemoglu et al. (2018) address these facts by assuming that high-productivity firms switch permanently to low-productivity firms at some random point in the life cycle.

Ultimately, allowing for type switching will not change the results above, because the original estimation moments do not rely on within-firm life-cycle dynamics (age) to identify parameters. Adding type-switching shocks simply shifts parameters: we would have a higher share of high-productivity entrants and a larger productivity gap, but the cross-sectional firm distribution would be the same (provided that we are not conditioning on age).

#### Own-quality innovations

A large class of Schumpeterian growth models assumes that, within product markets, firms compete for market share by improving the quality of their existing blueprints. In principle, this could occur in the baseline model if a quality innovation hits its own blueprint; however, this is a probability-zero event. Peters (2020) and Peters and Walsh (2022) propose a tractable way of incorporating own-quality innovations into this framework. As long as the own-quality innovation technology also becomes less productive as new ideas get harder to find, the main results will remain the same.

# Concluding remarks 

How do we reconcile stagnating economic growth with a booming stock market? This paper presents evidence that declining innovation productivity explains both trends. As research gets harder, firms do less R&D and growth stagnates. At the same time, M&A activity rises, which reallocates market share toward high-productivity incumbents. High-productivity firms have high present values, so the aggregate market value booms relative to output.

The mechanism is quantitatively important. From the decline in aggregate R&D-to-value, I estimate that innovation productivity fell in half since 1975. The decline in innovation productivity explains most of the secular decline in growth and the rise in aggregate valuations. It also implies, as I show in the data, that all of the rise in aggregate valuation ratios is explained by a reallocation of market share toward high-valuation firms. The mechanism helps explain other salient trends, including declining firm entry and falling interest rates.

While aggregate stock-market wealth boomed, the present value of consumption---which is what matters for consumer welfare---did not. Indeed, the present value of consumption stagnated with output. All of the reallocation-driven gains to incumbent firms came at the expense of future labor income and entrants, implying a disconnect between the stock market and the rest of the economy. This finding highlights the importance of studying both growth and the stock market at a disaggregated level.

This paper's findings concern trends over the last half-century; the next half-century may look different. For example, recent work has considered that the development of artificial intelligence (AI) may precipitate a revival of innovation productivity, if AI can replace researchers in generating new ideas (Aghion et al. 2019; Jones and Tonetti 2026). The framework in this paper is well-suited for thinking about what such a transformation could mean for the economy in the future.

# Data appendix 

This section details the main data sources, selection, and variable definitions used throughout the paper. For details about computation of data moments for the model estimation in Section 4, see Appendix 14 below.

## CRSP and Compustat 

The main data source is the annual CRSP/Compustat Merged Database from 1965 to 2020. I consider firms incorporated in the U.S. and exclude firms in utilities (SIC 2-digit code 49) and public administration (SIC 1-digit code 9), as is standard in the literature. I exclude firm-year observations with nonpositive revenues (`sale`) or market value of equity (defined below). Note that firms have only been required to report R&D expenditures (`xrd`) since 1975, so I do not use observations of this variable in any year prior to 1975.

A firm's equity value is defined as its total market capitalization at the fiscal year end: price per share (`prcc_f`) times number of shares outstanding (`csho`). The main measure of total firm value is then equity value plus book debt (`dlc + dltt`), less cash (`che`). Likewise, total dividends to equityholders equal dividends per share times shares (`dvpsx_f` $\times$ `csho`), and total firm cashflows are the sum of dividends and interest payments (`xint`). In robustness checks in Appendix 10.3 below, I also consider a broader definition of cashflows, which includes net repurchases of common and preferred shares (`prstkc - prstkpc - scstkc`).
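
The variable definitions above can be sketched in a few lines of Python. The Compustat item names are those given in the text; the function names, the dictionary representation of a firm-year, and the example numbers are hypothetical, and the sketch omits missing-value handling and the industry and incorporation filters.

```python
def firm_measures(row):
    """Build the valuation and cashflow measures for one firm-year.

    `row` maps Compustat item names to numbers (missing-value handling
    is omitted in this sketch).
    """
    equity = row["prcc_f"] * row["csho"]                    # market capitalization
    value = equity + row["dlc"] + row["dltt"] - row["che"]  # equity + book debt - cash
    dividends = row["dvpsx_f"] * row["csho"]                # dividends per share x shares
    cashflows = dividends + row["xint"]                     # dividends + interest payments
    return {"equity": equity, "value": value,
            "dividends": dividends, "cashflows": cashflows}

def keep_observation(row):
    """Sample filter from the text: positive sales and positive equity value."""
    return row["sale"] > 0 and row["prcc_f"] * row["csho"] > 0

# Hypothetical firm-year, for illustration only.
example = {"prcc_f": 20.0, "csho": 50.0, "dlc": 100.0, "dltt": 300.0,
           "che": 150.0, "dvpsx_f": 0.5, "xint": 15.0, "sale": 800.0}
measures = firm_measures(example)
```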

## SDC Platinum 

M&A transaction data come from the SDC Platinum database, which contains transaction-level data since 1976. I consider completed deals (`STATUS = "Completed"`) for which the acquirer is based in the U.S. (`ANATIONCODE = "US"`). Mergers (`FORM = "Merger"`) are transactions in which the acquirer purchases 100% of the target firm's stock, or businesses combine; acquisitions of assets (`FORM = "Acq. of Assets"`) are transactions in which the acquirer purchases some of the target's assets. The value of a transaction (i.e., the amount spent by the acquirer) is given by the variable `RANKVAL`, which includes the value of net debt. Because the total M&A value reported in SDC is extremely low in the first two years of data (1976--1977), I exclude these years from analyses.

Public and private acquirers are identified by the variable `APUBLIC` $\in$ `{"Public", "Priv."}`, and targets are likewise identified by `TPUBLIC`. To merge SDC into CRSP/Compustat at the acquirer level, I use the acquirer identifier mappings (`agvkey` in SDC matched to `gvkey`) provided by Ewens et al. (Forthcoming), who build on the work of Phillips and Zhdanov (2013). The target's market value before a merger announcement, which I use to compute the M&A premium, is given by the variable `MV`.
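
The deal selection described above can be sketched as a simple filter-and-aggregate step. The SDC field names and the values `"Completed"`, `"US"`, `"Merger"`, and `"Acq. of Assets"` are those given in the text; the `year` field, the function name, the non-qualifying codes `"Withdrawn"` and `"UK"`, and the toy records are hypothetical and used only for illustration.

```python
from collections import defaultdict

MA_FORMS = {"Merger", "Acq. of Assets"}

def annual_ma_spending(deals, forms=MA_FORMS):
    """Sum RANKVAL by year over completed deals with U.S. acquirers."""
    totals = defaultdict(float)
    for d in deals:
        if d["STATUS"] != "Completed" or d["ANATIONCODE"] != "US":
            continue
        if d["FORM"] not in forms:
            continue
        if d["year"] in (1976, 1977):  # sparse early coverage, dropped in the text
            continue
        totals[d["year"]] += d["RANKVAL"]
    return dict(totals)

# Hypothetical deal records, for illustration only.
deals = [
    {"year": 1977, "STATUS": "Completed", "ANATIONCODE": "US", "FORM": "Merger", "RANKVAL": 5.0},
    {"year": 1990, "STATUS": "Completed", "ANATIONCODE": "US", "FORM": "Merger", "RANKVAL": 100.0},
    {"year": 1990, "STATUS": "Completed", "ANATIONCODE": "US", "FORM": "Acq. of Assets", "RANKVAL": 40.0},
    {"year": 1990, "STATUS": "Withdrawn", "ANATIONCODE": "US", "FORM": "Merger", "RANKVAL": 70.0},
    {"year": 1991, "STATUS": "Completed", "ANATIONCODE": "UK", "FORM": "Merger", "RANKVAL": 30.0},
]
all_ma = annual_ma_spending(deals)
mergers_only = annual_ma_spending(deals, forms={"Merger"})
```

Dividing these annual totals by aggregate R&D spending (`xrd`) in the same year would give an M&A-to-R&D ratio of the kind plotted in the paper.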

## Other data sources 

For analysis of trends in patent value, I use the estimates from Kogan et al. (2017), updated to 2020. In analyses of public versus private firm R&D intensity, I use the NSF R&D-to-GDP ratio for R&D performed by businesses.[^64] The specific series comes from Table 1 of the National Patterns of R&D Resources (annual series) from the National Center for Science and Engineering Statistics. Data on firm entry and the employment distribution come from the Business Dynamics Statistics (BDS). Finally, per-capita real output and population growth data come from the BEA.

# Empirical evidence: Details and additional results 

This section details the data and empirical methods used in Section 1, and presents supplementary results.

## Additional evidence on aggregate R&D and M&A 

### Trends in patent values 

This paper argues that the ratio of R&D spending to value is an ideal moment for identifying innovation productivity. An alternative measure of the expected private value of innovation is the market value of patents. Kogan et al. (2017) estimate such a measure on the dates at which patents are issued. At the aggregate level, one would expect this measure to also reflect innovation productivity, because, all else equal, declining innovation productivity will reduce the total dollar flow of patents produced. Indeed, in the model in this paper, if one is willing to interpret "blueprints" as patents, then both the market value of patents and R&D spending should fall proportionately relative to market value as innovation productivity falls.

Figure 13 plots the ratio of aggregate patent value from Kogan et al. (2017) (the sum of individual patent values each year) to the aggregate value of firms over time. Panel A reports this ratio (in logs) on the dates at which the patents were issued; Panel B reports the ratio on the dates they were filed. The series is much noisier than R&D-to-value, but clearly shows a downward trend over time. Formally, the estimated time trend from a linear regression of the log ratio on the year implies a significant decrease ($t$-statistics of $-3.81$ for issue date and $-3.39$ for filing date). More importantly, the estimated *magnitude* of the decline in patent value implied by these regressions is nearly identical to the decline in the aggregate R&D-to-value ratio over the same period: both fell by about 0.5 log points.

<figure id="fig:patval" data-latex-placement="H">
<div class="center">
<p></p>
<table>
<tbody>
<tr>
<td style="text-align: center;"><strong>A. Patent issue date</strong></td>
<td style="text-align: center;"><strong>B. Patent filing date</strong></td>
</tr>
</tbody>
</table>
<img src="paperplots/pat_val" style="width:100.0%" />
</div>
<figcaption>The figure plots the log of the ratio of aggregate patent value from <span class="citation" data-cites="kogan2017technological">Kogan et al. (2017)</span> to aggregate firm value in Compustat over time. Panel A aggregates patent values based on the year in which the patent was issued, while Panel B aggregates based on the year in which the patent was filed. The lines in each panel are the fitted lines from a regression of each log series on the year.</figcaption>
</figure>
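
The time-trend regressions behind these $t$-statistics can be sketched with a plain OLS fit of a log series on the year. The sketch below uses conventional homoskedastic standard errors, which is an assumption: the text does not state which standard errors the reported $t$-statistics use. The function name and the synthetic series are hypothetical.

```python
import math

def time_trend(years, log_ratio):
    """OLS of log_ratio on year; returns (slope, t-statistic of the slope)."""
    n = len(years)
    x_bar = sum(years) / n
    y_bar = sum(log_ratio) / n
    sxx = sum((x - x_bar) ** 2 for x in years)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(years, log_ratio))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residual sum of squares and the conventional (homoskedastic) standard error.
    ssr = sum((y - intercept - slope * x) ** 2 for x, y in zip(years, log_ratio))
    se = math.sqrt(ssr / (n - 2) / sxx)
    return slope, slope / se

# Synthetic series for illustration: a decline of 0.5 log points over
# forty years plus a small alternating disturbance.
years = list(range(1980, 2021))
series = [-0.0125 * (y - 1980) + (0.05 if i % 2 == 0 else -0.05)
          for i, y in enumerate(years)]
slope, t_stat = time_trend(years, series)
```

Multiplying the estimated slope by the number of years in the sample gives the cumulative log change implied by the fitted trend, which is the magnitude compared across the patent-value and R&D-to-value series.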

This result is reassuring, but also raises the question: what is the advantage of using R&D-to-value instead of a patent-value measure to study innovation productivity? The main reason is *timing*.[^65] R&D and value are both forward-looking and contemporaneous. Valuations are higher today if the present value of future profits is higher. R&D spending is higher today if research is expected to yield a stream of future profits with a higher present value. In contrast, patent valuations are partly backward-looking, in that they reflect a combination of past research efforts and unexpected or unrelated noise. The average time between patent filing and patent issuance in the Kogan et al. (2017) data is 2.8 years (10th percentile 1.2 years, 90th percentile 4.8 years), and the time between R&D spending and patent filing is plausibly even longer. Thus, market conditions at the time at which a patent is issued are plausibly much different from the conditions at the time at which the research was initially conducted. This will lead to a disconnect between expected innovation value and the patent value. R&D spending, on the other hand, is entirely based on the firm's expectations at the time at which research is conducted. This is likely the reason that the series in Figure 13 is significantly more volatile than the R&D-to-value series in Figure 1. Consequently, R&D-to-value allows for higher-frequency identification of the decline in innovation productivity, which is key to the model estimation.

### Aggregate M&A-to-R&D 

Figure 2 plotted a two-sided, five-year moving average of the aggregate M&A-to-R&D ratio over time. Figure 14 plots the underlying, unfiltered time series (in logs). The figure also plots a time trend implied by fitting a linear regression of log M&A-to-R&D on the year. The coefficient estimates are significant ($t$-statistics of 3.34 and 3.36 for mergers and mergers plus acquisitions of assets, respectively), formally confirming the visible increase over time.
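
As a rough illustration of the trend regression described above, the following sketch fits an OLS line of a log ratio on the year and reports the slope's $t$-statistic. The function name `log_trend`, the synthetic series, and the use of classical (homoskedastic) standard errors are all assumptions made for illustration; the text does not state which standard errors underlie the reported $t$-statistics.

```python
import math

def log_trend(years, ratios):
    """OLS of log(ratio) on year: returns slope, intercept, and slope t-statistic.

    Assumes classical homoskedastic standard errors (an illustrative choice).
    """
    n = len(years)
    y = [math.log(r) for r in ratios]
    xbar = sum(years) / n
    ybar = sum(y) / n
    sxx = sum((x - xbar) ** 2 for x in years)
    slope = sum((x - xbar) * (yi - ybar) for x, yi in zip(years, y)) / sxx
    intercept = ybar - slope * xbar
    # Residual variance with n - 2 degrees of freedom
    resid = [yi - intercept - slope * x for x, yi in zip(years, y)]
    s2 = sum(e * e for e in resid) / (n - 2)
    se_slope = math.sqrt(s2 / sxx)
    return slope, intercept, slope / se_slope

# Synthetic (hypothetical) series: 2% log growth per year plus a bounded wiggle
years = list(range(1980, 2021))
ratios = [math.exp(0.02 * (yr - 1980) + 0.1 * math.sin(yr)) for yr in years]
slope, intercept, t_stat = log_trend(years, ratios)
print(f"slope = {slope:.4f}, t-statistic = {t_stat:.2f}")
```

A serially correlated annual series would typically call for a robust variance estimator in place of the classical one used here.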

<figure id="fig:aggmaxrd_unfiltered" data-latex-placement="H">
<div class="center">
<p></p>
<table>
<tbody>
<tr>
<td style="text-align: center;"><img src="paperplots/ma_xrd_alldata" style="width:80.0%" alt="image" /></td>
<td style="text-align: center;"></td>
</tr>
</tbody>
</table>
</div>
<figcaption>The figure plots the aggregate ratio of M&amp;A spending to R&amp;D spending for public firms in Compustat and SDC Platinum from 1980 to 2020. Note that the y-axis is transformed onto a log scale. M&amp;A is defined as the dollar amount spent by acquirers on either mergers (transactions in which all of the target is acquired) or mergers plus acquisitions of assets. The lines denoted “Time trend” are the fitted lines from a regression of each log series on the year.</figcaption>
</figure>

### Evidence of value-creation by M&A 

The model makes two main predictions about M&A, which link the trends in Figures 2 and 3. The first prediction is that M&A reallocates market share from low-valuation, less productive firms to high-valuation, more productive firms.[^66] The second prediction, which is crucial from a macro perspective, is that M&A creates surplus value for the aggregate market. Here I briefly survey the empirical literature on M&A and discuss the evidence for these predictions.

The literature broadly finds that most mergers constitute a reallocation from low-valuation targets to high-valuation buyers. Arikan and Stulz (2016) document that acquirers tend to have significantly higher market valuations (measured as Tobin's $Q$, their proxy for growth opportunities) than targets, which is the key cross-sectional prediction linking the rise in M&A (Figure 2) with the reallocation-driven stock-market boom (Figure 3). Beyond valuations, micro-level evidence supports the view that mergers reallocate assets to more productive owners, and that these more productive owners operate the assets with greater efficiency and profitability. Using plant-level data, Maksimovic and Phillips (2001, 2002) show that mergers are associated with systematic restructuring and the transfer of assets toward firms that can operate them more efficiently.

The literature also finds that M&A creates surplus value for the aggregate market, in that the market value of the combined firm is greater than the value of the two firms if they had not merged. Betton et al. (2008) show that average cumulative abnormal returns (CARs) for the combined acquirer and target are large (1.8%) and significantly positive over both the run-up and announcement.[^67] Older studies find the same (Jensen and Ruback 1983; Andrade et al. 2001). The fact that joint CARs are large and significant is supportive, but one should note that this is a limited measure of the value created through M&A. My model predicts that most of the anticipated gains from M&A should be priced into both acquirers and targets before any news of a transaction occurs; thus, announcement returns capture only the (potentially small) part of the wealth gain that is not already priced.

The data also suggest that M&A creates surplus by increasing the target's profits and productivity in particular. Healy et al. (1992) document improvements in post-merger asset productivity and operating cash flow performance relative to pre-merger benchmarks. As mentioned above, Maksimovic and Phillips (2001, 2002) document plant-level productivity improvements. These facts accord with my model mechanism, in which the high productivity of the acquirer yields both higher productivity and higher market power (profits) of the target post-merger.

It is essential to note that, from a macro perspective, what matters for measuring total surplus is the value created for *both* the acquirer and the target. Much of the literature is instead concerned with how much of this surplus value is captured by the owners of the acquirer versus the target. This transfer is irrelevant for an investor who owns the market. These studies tend to find that most, if not all, of the positive joint returns accrue to owners of the target firm. This is consistent both with the fact that acquirers tend to pay extremely large premia relative to the target's pre-announcement value and with the fact that announcement CARs are substantially larger for targets than for bidders. Betton et al. (2008) and Arikan and Stulz (2016) find that acquirer CARs tend not to be significantly positive, and may even be negative conditional on some characteristics (e.g., all-stock acquisitions of public targets by old acquirers).[^68] Note, however, that an insignificant abnormal return for an acquirer does not necessarily mean the deal did not create net value. First, it could simply mean that most of the surplus went to the owners of the target firm. This could be explained either by differential bargaining power or by overpayment due to agency or behavioral frictions. And second, as alluded to above, near-zero announcement returns for large acquirers are actually a prediction of the model in this paper, as M&A becomes increasingly predictable for large firms and hence becomes priced in well before announcements.[^69] All of this said, my model can easily accommodate forces that lead to overpayment by managers of acquirers (e.g., an agency or behavioral friction that leads the manager to perceive a higher surplus than investors) without altering the crucial fact that M&A creates aggregate surplus.

## Micro-level decompositions: Derivation and properties 

### Derivation of empirical decomposition 

To derive the value-to-sales and cashflow-to-value decompositions in the main text, let us consider an arbitrary aggregate variable $$
Z_t \equiv \sum_{i=1}^{M_t}s_{it}Z_{it},
$$ where $Z_{it}$ is some firm-level variable (e.g., value-to-sales) and $s_{it}$ are some arbitrary weights that sum to one (e.g., sales shares). Note that the weights $s_{it}$ need not have any relation to $Z_{it}$; we only assume that they are nonnegative and sum to one.

Before deriving the decomposition, let us define some useful notation. Let $\mathcal{I}_t$ represent the set of firms operating at time $t$. Let $\mathcal{I}_t^{\rm ent}$ denote the set of firms that entered the sample at time $t$, $\mathcal{I}_t^{\rm cont}$ the set of firms that continued to operate in both periods $t-1$ and $t$, and $\mathcal{I}_t^{\rm exit}$ the set of firms that exited in period $t$. Let the operator $\Delta$ denote the change from one period to the next (e.g., $\Delta Z_t\equiv Z_t-Z_{t-1}$). Define $s_t^{\rm ent}\equiv\sum_{i\in\mathcal{I}_t^{\rm ent}}s_{it}$ to be the total entrant share at $t$, and $s_{t-1}^{\rm exit}\equiv\sum_{i\in\mathcal{I}_t^{\rm exit}}s_{i,t-1}$ the total exiter share at $t-1$. Note that this implies $s_t^{\rm cont}=1-s_t^{\rm ent}$ and $s_{t-1}^{\rm cont}=1-s_{t-1}^{\rm exit}$. Finally, define the conditional continuer shares as $s_{it}^{\rm cont}\equiv s_{it}/s_t^{\rm cont}$ and $s_{i,t-1}^{\rm cont}\equiv s_{i,t-1}/s_{t-1}^{\rm cont}$, implying conditional aggregates $Z_t^{\rm cont}\equiv\sum_{i\in\mathcal{I}_t^{\rm cont}}s_{it}^{\rm cont}Z_{it}$.

Under this notation, we have $\mathcal{I}_t=\mathcal{I}_t^{\rm ent}\cup\mathcal{I}_t^{\rm cont}$, so the time-$t$ aggregate value is $$
Z_t = \sum_{i\in\mathcal{I}_t^{\rm ent}}s_{it}Z_{it} + \sum_{i\in\mathcal{I}_t^{\rm cont}}s_{it}Z_{it}.
$$ Likewise, $\mathcal{I}_{t-1}=\mathcal{I}_t^{\rm exit}\cup\mathcal{I}_t^{\rm cont}$, so the lagged aggregate can be written $$
Z_{t-1} = \sum_{i\in\mathcal{I}_t^{\rm exit}}s_{i,t-1}Z_{i,t-1} + \sum_{i\in\mathcal{I}_t^{\rm cont}}s_{i,t-1}Z_{i,t-1}.
$$ Differencing these equations, we have that the total change equals $$
\Delta Z_t = \sum_{i\in\mathcal{I}_t^{\rm ent}}s_{it}Z_{it} - \sum_{i\in\mathcal{I}_t^{\rm exit}}s_{i,t-1}Z_{i,t-1} + \sum_{i\in\mathcal{I}_t^{\rm cont}}(s_{it}Z_{it} - s_{i,t-1}Z_{i,t-1}).
$$ Adding and subtracting $s_t^{\rm ent}Z_t^{\rm cont}$ and $s_{t-1}^{\rm exit}Z_{t-1}^{\rm cont}$ on the right-hand side (and recalling that $s_t^{\rm ent}=1-s_t^{\rm cont}$ and $s_{t-1}^{\rm exit}=1-s_{t-1}^{\rm cont}$) then implies $$
\Delta Z_t = \sum_{i\in\mathcal{I}_t^{\rm ent}}s_{it}(Z_{it} - Z_t^{\rm cont}) - \sum_{i\in\mathcal{I}_t^{\rm exit}}s_{i,t-1}(Z_{i,t-1} - Z_{t-1}^{\rm cont}) + \Delta Z_t^{\rm cont}.
$$ The last term can be decomposed as $$\begin{align}
\Delta Z_t^{\rm cont} &= \sum_{i\in\mathcal{I}_t^{\rm cont}}(s_{it}^{\rm cont}Z_{it} - s_{i,t-1}^{\rm cont}Z_{i,t-1}), \\
&= \sum_{i\in\mathcal{I}_t^{\rm cont}}\frac{1}{2}(s_{it}^{\rm cont}+s_{i,t-1}^{\rm cont})\Delta Z_{it} + \sum_{i\in\mathcal{I}_t^{\rm cont}}\Delta s_{it}^{\rm cont}\frac{1}{2}(Z_{it} + Z_{i,t-1}).
\end{align}$$ Using the index notation $(\overline{\cdot})$ for the average of current and lagged values, $\overline{x_{it}}\equiv\frac{1}{2}(x_{it}+x_{i,t-1})$, we therefore have the total decomposition $$\begin{multline}
\Delta Z_t = \underbrace{\sum_{i\in\mathcal{I}_t^{\rm ent}}s_{it}(Z_{it} - Z_t^{\rm cont}) - \sum_{i\in\mathcal{I}_t^{\rm exit}}s_{i,t-1}(Z_{i,t-1} - Z_{t-1}^{\rm cont}) + \sum_{i\in\mathcal{I}_t^{\rm cont}}(\overline{s_{it}^{\rm cont}})\Delta Z_{it}}_{\textrm{within-firm change}} \\
+ \underbrace{\sum_{i\in\mathcal{I}_t^{\rm cont}}\Delta s_{it}^{\rm cont}(\overline{Z_{it}})}_{\textrm{compositional change}}.
\end{multline}$$ If we redefine the lags and leads of entrants as described in the main text, then this decomposition can be written more concisely in the form used for the main-text value-to-sales and cashflow-to-value decompositions.
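
The derivation above can be checked mechanically. The sketch below implements the two components for one pair of periods and confirms that they sum to $\Delta Z_t$. The function name `decompose` and the dictionary layout (firm id mapped to a `(share, Z)` pair) are hypothetical choices for illustration, and the three-firm data are made up.

```python
def decompose(prev, curr):
    """Split the change in a share-weighted aggregate into the within-firm
    change (including net entry) and the compositional change.

    prev, curr: dict mapping firm id -> (share, Z); shares sum to one per period.
    """
    entrants = [i for i in curr if i not in prev]
    continuers = [i for i in curr if i in prev]
    exiters = [i for i in prev if i not in curr]

    # Conditional continuer shares and continuer aggregates in each period
    s_cont_now = sum(curr[i][0] for i in continuers)
    s_cont_lag = sum(prev[i][0] for i in continuers)
    sc_now = {i: curr[i][0] / s_cont_now for i in continuers}
    sc_lag = {i: prev[i][0] / s_cont_lag for i in continuers}
    z_cont_now = sum(sc_now[i] * curr[i][1] for i in continuers)
    z_cont_lag = sum(sc_lag[i] * prev[i][1] for i in continuers)

    # Entrants and exiters are measured relative to same-period continuers
    entry = sum(curr[i][0] * (curr[i][1] - z_cont_now) for i in entrants)
    exit_ = sum(prev[i][0] * (prev[i][1] - z_cont_lag) for i in exiters)

    # Continuer terms use averages of current and lagged values as indices
    within_cont = sum(0.5 * (sc_now[i] + sc_lag[i]) * (curr[i][1] - prev[i][1])
                      for i in continuers)
    compositional = sum((sc_now[i] - sc_lag[i]) * 0.5 * (curr[i][1] + prev[i][1])
                        for i in continuers)
    return entry - exit_ + within_cont, compositional

# Made-up example: firm "c" exits and firm "d" enters
prev = {"a": (0.5, 2.0), "b": (0.3, 1.0), "c": (0.2, 4.0)}
curr = {"a": (0.6, 2.5), "b": (0.25, 0.8), "d": (0.15, 5.0)}
within, compositional = decompose(prev, curr)
delta_z = (sum(s * z for s, z in curr.values())
           - sum(s * z for s, z in prev.values()))
print(within, compositional, within + compositional, delta_z)
```

The two components add up to the total change by construction, so any discrepancy beyond floating-point error would indicate a coding mistake and not a property of the data.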

### Relation to other macro decompositions 

The decomposition I construct is closely related to a larger class of macroeconomic decompositions. In the notation above, the Haltiwanger (1997) decomposition has components $$\begin{multline}
\Delta Z_t = \underbrace{\sum_{i\in\mathcal{I}_t^{\rm cont}}s_{i,t-1}\Delta Z_{it}}_{\textrm{within}_{\rm H}} + \underbrace{\sum_{i\in\mathcal{I}_t^{\rm cont}}\Delta s_{it}\Delta Z_{it}}_{\textrm{cross}_{\rm H}} + \underbrace{\sum_{i\in\mathcal{I}_t^{\rm cont}}\Delta s_{it}(Z_{i,t-1}-Z_{t-1})}_{\textrm{between}_{\rm H}} \\
+ \underbrace{\sum_{i\in\mathcal{I}_t^{\rm ent}}s_{it}(Z_{it}-Z_{t-1})}_{\textrm{entry}_{\rm H}} - \underbrace{\sum_{i\in\mathcal{I}_t^{\rm exit}}s_{i,t-1}(Z_{i,t-1}-Z_{t-1})}_{\textrm{exit}_{\rm H}}.
\end{multline}$$ Haltiwanger (1997) and Foster et al. (2001) use this decomposition as a slight adjustment to that of Baily et al. (1992). Conventionally, these decompositions have been used to understand the components of productivity growth from plant-level data (Foster et al. 2001; Bartelsman and Doms 2000; Lentz and Mortensen 2008). More recently, the same decomposition has been used to understand the drivers of secular macro trends. For example, De Loecker et al. (2020) use it to decompose the rise in the sales-weighted aggregate markup.

While this decomposition may be ideal for studying the drivers of productivity growth, it is not suitable for studying secular change in aggregate valuations for two reasons. First, the cross term is mechanically very negative, because transitory shocks to sales shares tend to be highly negatively correlated with transitory shocks to value-to-sales ratios (through the denominator).[^70] The solution to this is to use the indices $(\overline{\cdot})$ for continuers. Without entry and exit, this indexing is equivalent to adding half of the cross term to the within effect and the other half to the between effect:[^71] $$\begin{align}
\sum_{i=1}^{M}\overline{s_{it}}\Delta Z_{it} &= \underbrace{\sum_{i=1}^{M}s_{i,t-1}\Delta Z_{it}}_{\textrm{within}_{\rm H}} + \frac{1}{2}\underbrace{\sum_{i=1}^{M}\Delta s_{it}\Delta Z_{it}}_{\textrm{cross}_{\rm H}}, \\
\sum_{i=1}^{M}\Delta s_{it}\overline{Z_{it}} &= \underbrace{\sum_{i=1}^{M}\Delta s_{it}Z_{i,t-1}}_{\textrm{between}_{\rm H}} + \frac{1}{2}\underbrace{\sum_{i=1}^{M}\Delta s_{it}\Delta Z_{it}}_{\textrm{cross}_{\rm H}}.
\end{align}$$ When we have entry and exit, these will no longer be quite the same, because I adjust shares $s_{it}$ to $s_{it}^{\rm cont}$. The second reason we cannot use the Haltiwanger decomposition to study secular change in valuations is that there may be life-cycle dynamics that bias the within, entry, and exit terms. For example, suppose firms tend to enter with high valuations and then these valuations tend to fall with age. The within term will be negative, the entry term positive, and the two will exactly offset if the distribution is stationary. We therefore (1) compare entrant valuations to time-$t$ continuer valuations (not time-$(t-1)$ valuations) and (2) sum the within and net entry terms. Next, I illustrate how these adjustments neutralize biases.
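
For the case without entry and exit, the equivalence between the indexed terms and the Haltiwanger terms can be verified numerically. The sketch below uses hypothetical function names and made-up numbers for three firms. It relies only on the fact that shares sum to one in both periods, so that subtracting the lagged aggregate in the between term is immaterial.

```python
def haltiwanger_terms(s_lag, s_now, z_lag, z_now):
    """Within, between, and cross terms for a balanced panel (no entry or exit)."""
    z_agg_lag = sum(sl * zl for sl, zl in zip(s_lag, z_lag))
    within_h = sum(sl * (zn - zl) for sl, zl, zn in zip(s_lag, z_lag, z_now))
    between_h = sum((sn - sl) * (zl - z_agg_lag)
                    for sl, sn, zl in zip(s_lag, s_now, z_lag))
    cross_h = sum((sn - sl) * (zn - zl)
                  for sl, sn, zl, zn in zip(s_lag, s_now, z_lag, z_now))
    return within_h, between_h, cross_h

def indexed_terms(s_lag, s_now, z_lag, z_now):
    """Within-firm and compositional terms using averaged indices."""
    within = sum(0.5 * (sl + sn) * (zn - zl)
                 for sl, sn, zl, zn in zip(s_lag, s_now, z_lag, z_now))
    compositional = sum((sn - sl) * 0.5 * (zl + zn)
                        for sl, sn, zl, zn in zip(s_lag, s_now, z_lag, z_now))
    return within, compositional

# Made-up balanced panel of three firms
s_lag, s_now = [0.5, 0.3, 0.2], [0.6, 0.25, 0.15]
z_lag, z_now = [2.0, 1.0, 4.0], [2.5, 0.8, 5.0]

within_h, between_h, cross_h = haltiwanger_terms(s_lag, s_now, z_lag, z_now)
within, compositional = indexed_terms(s_lag, s_now, z_lag, z_now)
print(within - (within_h + 0.5 * cross_h))            # approximately zero
print(compositional - (between_h + 0.5 * cross_h))    # approximately zero
```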

### Illustrative examples 

To illustrate the properties of the above decomposition, I consider three stylized economies. The first two show the advantage of using the indices for continuers in the presence of transitory and persistent idiosyncratic shocks, while the third illustrates why entry and exit are included in the within-firm change. I consider a continuum of firms in each case in order to abstract from sample noise.

#### Example 1: Transitory idiosyncratic shocks

Suppose there is no entry or exit, and firms are ex ante homogeneous. Firm $i$ has value $V_{it}$ and sales $Y_{it}$ given by $$
Y_{it} = \bar{Y}\epsilon_{{\rm Y}it} \quad\textrm{and}\quad V_{it} = \bar{V}\epsilon_{{\rm V}it},
$$ where the shocks $\epsilon_{{\rm Y}it}$ and $\epsilon_{{\rm V}it}$ are independent over time and of each other---that is, they are completely transitory. Without loss of generality, assume that $\mathbb{E}[\epsilon_{{\rm Y}it}] = \mathbb{E}[\epsilon_{{\rm V}it}] = 1$.

The aggregate value-to-sales ratio is then a constant $$
\frac{V_t}{Y_t} = \frac{\bar{V}}{\bar{Y}}.
$$ Now let us consider what happens to the decomposition terms. The valuation ratio and sales share of firm $i$ are $$
\frac{V_{it}}{Y_{it}} = \frac{\bar{V}}{\bar{Y}}\frac{\epsilon_{{\rm V}it}}{\epsilon_{{\rm Y}it}} \quad\textrm{and}\quad \frac{Y_{it}}{Y_t} = \epsilon_{{\rm Y}it}.
$$ Hence, the Haltiwanger terms from $t-1$ to $t$ equal $$\begin{align}
\textrm{within}_{\rm H} &\equiv \frac{\bar{V}}{\bar{Y}}\int_0^1\epsilon_{{\rm Y}i,t-1}\left(\frac{\epsilon_{{\rm V}it}}{\epsilon_{{\rm Y}it}} - \frac{\epsilon_{{\rm V}i,t-1}}{\epsilon_{{\rm Y}i,t-1}}\right)di \\
\textrm{between}_{\rm H} &\equiv \frac{\bar{V}}{\bar{Y}}\int_0^1(\epsilon_{{\rm Y}it}-\epsilon_{{\rm Y}i,t-1})\frac{\epsilon_{{\rm V}i,t-1}}{\epsilon_{{\rm Y}i,t-1}}di \\
\textrm{cross}_{\rm H} &\equiv \frac{\bar{V}}{\bar{Y}}\int_0^1(\epsilon_{{\rm Y}it}-\epsilon_{{\rm Y}i,t-1})\left(\frac{\epsilon_{{\rm V}it}}{\epsilon_{{\rm Y}it}} - \frac{\epsilon_{{\rm V}i,t-1}}{\epsilon_{{\rm Y}i,t-1}}\right)di.
\end{align}$$ Evaluating these expectations and using the fact that all shocks are independent implies $$\begin{align}
\textrm{within}_{\rm H} &= \frac{\bar{V}}{\bar{Y}}\left(\mathbb{E}[\epsilon_{{\rm Y}i}^{-1}]-1\right) > 0, \\
\textrm{between}_{\rm H} &= \frac{\bar{V}}{\bar{Y}}\left(\mathbb{E}[\epsilon_{{\rm Y}i}^{-1}]-1\right) > 0, \\
\textrm{cross}_{\rm H} &= \frac{\bar{V}}{\bar{Y}}2\left(1-\mathbb{E}[\epsilon_{{\rm Y}i}^{-1}]\right) < 0.
\end{align}$$ ($\mathbb{E}[\epsilon_{{\rm Y}i}^{-1}]>1$ follows from Jensen's inequality.) Intuitively, the within term is positive because transitory shocks to sales shares mean revert, creating a positive covariance between the lagged sales share and the subsequent change in valuation: firms whose sales shares are unusually high at $t-1$ tend to have unusually low valuations at $t-1$, and these valuations on average revert upward at $t$. Similar intuition holds for the between term.

The indexing adjustment corrects this bias: $$
\textrm{within}_{\rm H} + \frac{1}{2}\textrm{cross}_{\rm H} = \textrm{between}_{\rm H} + \frac{1}{2}\textrm{cross}_{\rm H} = 0.
$$ Consistent with a stationary firm distribution, both the within-firm change and compositional change are now zero in this economy.
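
A small Monte Carlo makes the bias and its correction concrete. The sketch below is an illustration under added assumptions: shocks are lognormal with mean one, $\bar{V}/\bar{Y}$ is normalized to one, and a large finite sample stands in for the continuum. Under lognormality $\mathbb{E}[\epsilon_{{\rm Y}i}^{-1}]=e^{\sigma^2}$, so the raw within and between terms should be near $e^{\sigma^2}-1$ and the cross term near $-2(e^{\sigma^2}-1)$.

```python
import math
import random

def simulate_example1(n_firms=100_000, sigma=0.3, seed=0):
    """Sample analogues of the Haltiwanger terms with iid mean-one shocks."""
    rng = random.Random(seed)

    def shock():
        # Lognormal draw with mean exactly one
        return math.exp(sigma * rng.gauss(0.0, 1.0) - 0.5 * sigma ** 2)

    within = between = cross = 0.0
    for _ in range(n_firms):
        y_lag, y_now, v_lag, v_now = shock(), shock(), shock(), shock()
        z_lag, z_now = v_lag / y_lag, v_now / y_now   # firm valuation ratios
        within += y_lag * (z_now - z_lag)
        between += (y_now - y_lag) * z_lag
        cross += (y_now - y_lag) * (z_now - z_lag)
    return within / n_firms, between / n_firms, cross / n_firms

within_h, between_h, cross_h = simulate_example1()
print("raw terms:     ", within_h, between_h, cross_h)
print("indexed terms: ", within_h + 0.5 * cross_h, between_h + 0.5 * cross_h)
```

The raw terms are individually far from zero even though the aggregate ratio is constant, while the indexed terms are zero up to simulation noise.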

#### Example 2: Persistent idiosyncratic shocks

Consider the setting in the previous example, but now, instead of independent transitory shocks, the numerator and denominator are subject to (potentially correlated) persistent shocks. Specifically, the log sales of firm $i$, denoted $y_{it}\equiv\log Y_{it}$, follow the autoregression $$
y_{it} = \log\bar{Y} + \phi_yy_{i,t-1} + \epsilon_{yit},
$$ where $\phi_y\leq1$ is the persistence of the shocks. Similarly, the log value-to-sales ratio, $v_{it}\equiv\log(V_{it}/Y_{it})$, follows the autoregression[^72] $$
v_{it} = \bar{v} + \phi_vv_{i,t-1} + \epsilon_{vit}.
$$ The shocks $[\epsilon_{yit},\epsilon_{vit}]$ are iid over time and, at each time $t$, jointly normal with variances $\sigma_y^2$ and $\sigma_v^2$ and covariance $\sigma_{yv}$. It is straightforward to compute the aggregate value-to-sales ratio $V_t/Y_t$ and see that it is a constant.

The compositional change (corrected with the index) equals $$\begin{align}
\textrm{between}_{\rm H} + \frac{1}{2}\textrm{cross}_{\rm H} &= \frac{1}{2}\int_0^1(e^{y_{it}}-e^{y_{i,t-1}})(e^{v_{it}} + e^{v_{i,t-1}})di \\
&= \frac{1}{2}\mathbb{E}\left[e^{y_{it}+v_{i,t-1}} - e^{y_{i,t-1}+v_{it}}\right].
\end{align}$$ Substituting in $y_{it} = \bar{y} + \sum_{j=0}^\infty\phi_y^j\epsilon_{yi,t-j}$ and $v_{it} = \bar{v} + \sum_{j=0}^\infty\phi_v^j\epsilon_{vi,t-j}$, the first term in the expectation equals $$\begin{align}
\mathbb{E}\left[e^{y_{it}+v_{i,t-1}}\right] &= \exp\left\{\frac{1}{2}\left(\textrm{var}(y_{it}) + \textrm{var}(v_{i,t-1}) + 2\textrm{cov}(y_{it},v_{i,t-1})\right)\right\} \\
&= \exp\left\{\frac{1}{2}\left(\sum_{j=0}^\infty\phi_y^{2j}\sigma_y^2 + \sum_{j=0}^\infty\phi_v^{2j}\sigma_v^2 + 2\sum_{j=1}^\infty(\phi_y\phi_v)^j\sigma_{yv}\right)\right\} \\
&= \exp\left\{\frac{1}{2}\left(\frac{\sigma_y^2}{1-\phi_y^2} + \frac{\sigma_v^2}{1-\phi_v^2} + 2\frac{\phi_y\phi_v}{1-\phi_y\phi_v}\sigma_{yv}\right)\right\}.
\end{align}$$ By the same algebra, one can see that $\mathbb{E}\left[e^{y_{i,t-1}+v_{it}}\right]$ is the exact same. Hence, the total compositional change (and, in turn, the within-firm change) is zero: $$
\textrm{between}_{\rm H} + \frac{1}{2}\textrm{cross}_{\rm H} = \textrm{within}_{\rm H} + \frac{1}{2}\textrm{cross}_{\rm H} = 0.
$$ Again, the indexing corrects the bias that would be present in the individual terms.

#### Example 3: Life-cycle trends and entry

This final example demonstrates how adding net entry to the within-firm change corrects for life-cycle trends in firm valuations. Consider an economy with a mass of firms producing aggregate output $Y_t$. At each time $t$, a mass of new entrants begins producing; the total output of this cohort is proportional to last period's total output: $$
Y_t^{(0)} = \theta Y_{t-1}.
$$ All other firms continue producing the same amount as before, so it follows that $$
Y_t = (1+\theta)Y_{t-1}.
$$ This means that the total sales share of firms aged $\tau$ equals $$
\frac{Y_t^{(\tau)}}{Y_t} = \frac{\theta}{1+\theta}(1+\theta)^{-\tau}.
$$ Furthermore, assume that every entering firm has the same value-to-sales ratio $$
\frac{V_{it}}{Y_{it}} = v^{(0)} = \bar{v}.
$$ As firms age, their value-to-sales ratios are multiplied by $(1+\delta)^{-1}$ each period: for a firm of age $\tau$, the valuation is therefore $$
\frac{V_{it}}{Y_{it}} = v^{(\tau)} = \bar{v}(1+\delta)^{-\tau}.
$$ All of these facts together imply that the aggregate value-to-sales ratio equals the constant $$
\frac{V_t}{Y_t} = \sum_{\tau=0}^\infty \frac{Y_t^{(\tau)}}{Y_t}v^{(\tau)} = \bar{v}\frac{\theta(1+\delta)}{\theta(1+\delta)+\delta}.
$$ The entrants offset the declining valuations of aging firms, yielding a stationary distribution.

Consider first the Haltiwanger decomposition terms from $t-1$ to $t$: $$\begin{align}
\textrm{entry}_{\rm H} &= \frac{Y_t^{(0)}}{Y_t}\left(\bar{v} - \frac{V_t}{Y_t}\right) > 0, \\
\textrm{within}_{\rm H} &= \sum_{\tau=0}^\infty \frac{Y_{t-1}^{(\tau)}}{Y_{t-1}}(v^{(\tau+1)}-v^{(\tau)}) < 0, \\
\textrm{between}_{\rm H} &= \sum_{\tau=0}^\infty \left(\frac{Y_t^{(\tau+1)}}{Y_t} - \frac{Y_{t-1}^{(\tau)}}{Y_{t-1}}\right)v^{(\tau)} < 0, \\
\textrm{cross}_{\rm H} &= \sum_{\tau=0}^\infty \left(\frac{Y_t^{(\tau+1)}}{Y_t} - \frac{Y_{t-1}^{(\tau)}}{Y_{t-1}}\right)(v^{(\tau+1)}-v^{(\tau)}) > 0.
\end{align}$$ The entry term is positive because entrants have above-average valuation ratios. The within term is negative because, conditional on being a continuer, valuations tend to fall over time. The between term is negative because sales shares of continuers fall as entrants crowd them out. The cross term is positive because of the interaction of the within and between effects.

The adjusted decomposition makes two important changes to the above terms. First, the sales shares among continuers are defined relative to total continuer output ($Y_t^{\rm cont} = \sum_{\tau=1}^\infty Y_t^{(\tau)}$), not aggregate output ($Y_t = Y_t^{\rm cont} + Y_t^{(0)}$). Consequently, the "between" term above is no longer negative, but zero: $$
\textrm{between}_{\rm H} = \sum_{\tau=0}^\infty \left(\frac{Y_t^{(\tau+1)}}{Y_t^{\rm cont}} - \frac{Y_{t-1}^{(\tau)}}{Y_{t-1}}\right)v^{(\tau)} = 0,
$$ and similarly for the "cross" term. Second, by adding the entry term to the within-firm term, we offset the deterministic life-cycle trend in valuations, so that it is now also zero: $$
\textrm{entry}_{\rm H} + \textrm{within}_{\rm H} = 0.
$$ This example works similarly if we allow firm death and add the resulting exit term to the within-firm component.
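
The following sketch evaluates this example numerically, using the adjusted terms from the derivation above: the entry term compares entrants with time-$t$ continuers, and continuer shares are conditional on continuing. The parameter values, the function name `example3`, and the truncation of the age distribution at `max_age` are illustrative assumptions.

```python
def example3(theta=0.05, delta=0.10, v_bar=1.0, max_age=2000):
    """Adjusted decomposition terms in the stationary life-cycle economy."""
    # Sales share of the age-a cohort (the same in every period)
    share = [theta / (1 + theta) * (1 + theta) ** (-a) for a in range(max_age)]
    # Value-to-sales ratio at each age
    v = [v_bar * (1 + delta) ** (-a) for a in range(max_age + 1)]
    aggregate = sum(share[a] * v[a] for a in range(max_age))

    s_ent = share[0]                        # entrant share at time t
    ages = range(max_age - 1)               # age at t-1 of each continuing cohort
    cont_now = [share[a + 1] / (1 - s_ent) for a in ages]  # conditional share at t
    cont_lag = [share[a] for a in ages]     # no exit, so nothing to condition on
    z_cont_now = sum(cont_now[a] * v[a + 1] for a in ages)

    entry = s_ent * (v_bar - z_cont_now)
    within = sum(0.5 * (cont_now[a] + cont_lag[a]) * (v[a + 1] - v[a]) for a in ages)
    compositional = sum((cont_now[a] - cont_lag[a]) * 0.5 * (v[a + 1] + v[a])
                        for a in ages)
    return aggregate, entry, within, compositional

theta, delta, v_bar = 0.05, 0.10, 1.0
aggregate, entry, within, compositional = example3(theta, delta, v_bar)
closed_form = v_bar * theta * (1 + delta) / (theta * (1 + delta) + delta)
print(aggregate, closed_form)       # aggregate ratio vs. closed-form expression
print(entry, within, entry + within, compositional)
```

Entry and within-continuer changes are individually nonzero but cancel, and the compositional term vanishes because conditional continuer shares do not change.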

### Details about computation in the data

In the examples above, we assumed a continuum of firms, which allowed us to apply a law of large numbers. In the data, there is a finite number of firms, and thus these decompositions can be influenced by extreme outliers. To mitigate this, I do two things. First, I winsorize firm-level ratios at the 10th and 90th percentiles within each year before conducting the decomposition. Winsorizing gives a more conservative estimate of the total amount of reallocation that took place over this period; the fact that there is still substantial reallocation confirms that the decomposition results are not driven by extremely high value-to-sales firms.[^73] Second, to stabilize changes in firm-level ratios, I smooth sales (the denominator) as an average of its current and lagged value. This has a minimal effect in the long run, but helps reduce outlier-driven noise in the short run.
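
The two adjustments can be sketched as follows. This is a minimal illustration and not the paper's code: the function names are hypothetical, and the percentile interpolation rule (the `"inclusive"` method of Python's `statistics.quantiles`) is an assumption, since the text does not specify one.

```python
from statistics import quantiles

def winsorize_10_90(ratios):
    """Clip one year's firm-level ratios at the 10th and 90th percentiles."""
    deciles = quantiles(ratios, n=10, method="inclusive")  # nine cut points
    low, high = deciles[0], deciles[-1]
    return [min(max(r, low), high) for r in ratios]

def smoothed_ratio(numerator, sales_now, sales_lag):
    """Ratio whose denominator is the average of current and lagged sales."""
    return numerator / (0.5 * (sales_now + sales_lag))

# Made-up cross-section for a single year with one extreme value
raw = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1000]
clipped = winsorize_10_90(raw)
print(clipped)
print(smoothed_ratio(10.0, 4.0, 6.0))
```

In a panel, `winsorize_10_90` would be applied separately to each year's cross-section before computing the decomposition terms.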

## Micro-level decompositions: Additional results 

### Separating out firm entry and exit

The main decompositions add the entry and exit margins into the within-firm change, for the reasons laid out above in Appendix 10.2. Figure 15 separates these two margins out and plots the cumulative contribution of the four decomposition terms for value-to-sales and cashflow-to-value. In both cases, both the entry and exit margins are very small, implying that the vast majority of aggregate variation is coming from continuers.

<figure id="fig:decomp_val_entryexit" data-latex-placement="H">
<div class="center">
<p></p>
<table>
<tbody>
<tr>
<td style="text-align: center;"><strong>A. Value-to-sales</strong></td>
<td style="text-align: center;"><strong>B. Cashflow-to-value</strong></td>
</tr>
</tbody>
</table>
<img src="paperplots/decomp_valsale_cfval_combined_entryexit" style="width:105.0%" />
</div>
<figcaption>The figure plots the decompositions of changes in aggregate valuations into within-firm, compositional, and entry and exit components. The decompositions for value-to-sales and cashflow-to-value are as in (<a href="#eq:valsale_decomp" data-reference-type="ref" data-reference="eq:valsale_decomp">[eq:valsale_decomp]</a>) and (<a href="#eq:cfval_decomp" data-reference-type="ref" data-reference="eq:cfval_decomp">[eq:cfval_decomp]</a>), respectively, except that now the entry and exit components are not included in the within-firm change. See the text for details. Value is defined as market equity plus book debt, less cash; and cashflows are defined as dividends plus interest payments.</figcaption>
</figure>

### Equity value and alternative cashflow measures

The baseline measure of valuations and cashflows uses total firm value (equity value plus net debt) and total cashflows (dividends plus interest payments). One might wonder whether these decomposition results also hold for the equity component of firms. Figure 16 plots the decompositions from the main text for equity only---that is, defining value as the market value of equity and cashflows as dividends. The results are virtually unchanged.

<figure id="fig:decomp_val_eq" data-latex-placement="H">
<div class="center">
<p></p>
<table>
<tbody>
<tr>
<td style="text-align: center;"><strong>A. Equity value to sales</strong></td>
<td style="text-align: center;"><strong>B. Dividend yield on equity</strong></td>
</tr>
</tbody>
</table>
<img src="paperplots/decomp_valsale_cfval_combined_eq" style="width:105.0%" />
</div>
<figcaption>The figure plots the decompositions of changes in aggregate equity valuations into within-firm and compositional components. The decomposition for value-to-sales is defined in (<a href="#eq:valsale_decomp" data-reference-type="ref" data-reference="eq:valsale_decomp">[eq:valsale_decomp]</a>) and the decomposition for cashflow-to-value in (<a href="#eq:cfval_decomp" data-reference-type="ref" data-reference="eq:cfval_decomp">[eq:cfval_decomp]</a>). See the main text for an explanation. Value is defined as market equity and cashflows are defined as dividends.</figcaption>
</figure>

Another question that might arise is whether the cashflow-to-value results are sensitive to the definition of payouts to investors. One could argue that, from an investor's perspective, what matters for valuation is the total payout, inclusive of net share repurchases. Figure 17 plots the decompositions of cashflow-to-value after adding net repurchases to the numerator. Panel A corresponds to total firm value and cashflows (i.e., including debt) and Panel B corresponds to equity only. While adding net buybacks does attenuate the total increase in aggregate valuations, the decomposition results remain the same: there has been a large reallocation of market share toward high-valuation (low cashflow-to-value) firms over time.

<figure id="fig:decomp_val_payout" data-latex-placement="H">
<div class="center">
<p></p>
<table>
<tbody>
<tr>
<td style="text-align: center;"><strong>A. Total firm</strong></td>
<td style="text-align: center;"><strong>B. Equity only</strong></td>
</tr>
</tbody>
</table>
<img src="paperplots/decomp_cfval_payouts_combined" style="width:105.0%" />
</div>
<figcaption>The figure plots the decompositions of changes in aggregate cashflow-to-value ratios where cashflows include net buybacks of equity. The within-firm and compositional components of the decompositions are defined in (<a href="#eq:cfval_decomp" data-reference-type="ref" data-reference="eq:cfval_decomp">[eq:cfval_decomp]</a>); see the main text for an explanation. In Panel A, value is defined as market equity plus book debt, less cash; and cashflows are defined as dividends and interest payments plus net equity buybacks. In Panel B, value is only market equity while cashflows are dividends and net buybacks.</figcaption>
</figure>

### Implications for R&D-to-sales

An alternative measure of innovation effort that is sometimes used in the literature is the ratio of aggregate R&D to aggregate sales (or output). The black line in Figure 18 shows that this aggregate ratio has increased since 1975. A main insight of this paper is that the ratio of R&D to *value*, not sales, is the ideal moment for identifying innovation productivity, because, holding innovation productivity fixed, higher present values will increase R&D spending relative to output. Still, it is informative to understand what happened to R&D-to-sales at the micro and macro levels.

The first fact that is key to understanding trends in R&D-to-sales is that high-valuation firms tend to be high-R&D-to-sales firms. Table 3 reports estimates of cross-sectional regressions of valuations relative to sales and cashflows on R&D-to-sales ratios. There is a significant positive relationship between valuations and R&D intensities within years (specifications (2) and (5)) and even within narrowly defined industries (specifications (3) and (6)). Perhaps most importantly, the $R^2$ values from these regressions are high (especially for value-to-sales), implying that R&D-to-sales explains a large proportion of the variation in firm-level valuations. This is intuitive, because firms with high valuations have a greater incentive to innovate, and firms with high rates of innovation will tend to have higher future cashflows. Put simply, high-valuation firms are high-growth firms.

| Dependent variable: | $\log(\textrm{value}_{it}/\textrm{sales}_{it})$ | $\log(\textrm{value}_{it}/\textrm{sales}_{it})$ | $\log(\textrm{value}_{it}/\textrm{sales}_{it})$ | $\log(\textrm{value}_{it}/\textrm{cashflow}_{it})$ | $\log(\textrm{value}_{it}/\textrm{cashflow}_{it})$ | $\log(\textrm{value}_{it}/\textrm{cashflow}_{it})$ |
|:---|:---:|:---:|:---:|:---:|:---:|:---:|
| | (1) | (2) | (3) | (4) | (5) | (6) |
| $\log(\textrm{R\&D}_{it}/\textrm{sales}_{it})$ | $0.508^{***}$ | $0.478^{***}$ | $0.511^{***}$ | $0.164^{***}$ | $0.133^{***}$ | $0.096^{***}$ |
| | $(0.005)$ | $(0.006)$ | $(0.008)$ | $(0.009)$ | $(0.009)$ | $(0.018)$ |
| Year FE | | Y | | | Y | |
| Year $\times$ NAICS-6 FE | | | Y | | | Y |
| $R^2$ | $0.457$ | $0.492$ | $0.569$ | $0.079$ | $0.252$ | $0.524$ |
| Within $R^2$ | $0.457$ | $0.421$ | $0.335$ | $0.079$ | $0.062$ | $0.018$ |

: The table reports cross-sectional regressions of firm-level valuation ratios (value-to-sales and value-to-cashflow) on firm-level R&D-to-sales ratios. Specifications (2) and (5) control for year fixed effects, while (3) and (6) control for year$\times$industry fixed effects, where the industry is defined at the NAICS 6-digit level. The within $R^2$ values represent the $R^2$ values after demeaning by fixed effects (i.e., the variation explained by R&D-to-sales alone). Standard errors are clustered at the firm level.

Second, because high-valuation firms have high R&D intensities, we should expect a similar reallocation toward firms with high R&D-to-sales ratios. Figure 18 decomposes aggregate R&D-to-sales into within-firm changes and composition changes (i.e., reallocation of sales), analogous to the valuation decompositions. On average, within-firm changes in R&D-to-sales have actually been negative. The rise in the aggregate is explained by a reallocation toward high-R&D-to-sales firms. The decline in R&D-to-sales within firms is consistent with relatively stable within-firm valuations (meaning incentives to innovate have not risen so much at the firm level) coupled with declining innovation productivity.

<figure id="fig:decomp_xrdsale" data-latex-placement="H">
<div class="center">
<p> <img src="paperplots/decomp_wide_xrd_sale" style="width:80.0%" alt="image" /></p>
</div>
<figcaption>The figure plots the decomposition of changes in the aggregate R&amp;D-to-sales ratio. See the main text for an explanation of the within-firm and compositional components.</figcaption>
</figure>

## Additional evidence on public versus private firms 

As explained in the main text, I establish two main facts about public versus private firms' R&D and M&A. The first, plotted in Panel A of Figure 19, is that there has been no detectable increase in private-firm R&D intensity compared to public-firm R&D intensity. In particular, the black line plots the aggregate R&D-to-sales ratio among public firms (i.e., in Compustat), while the green line plots the aggregate R&D-to-GDP ratio reported by the NSF, which includes both public and private firms. If the R&D intensity of private firms rose dramatically relative to that of public firms, then we would expect the latter measure to rise by much more than the former; however, the R&D-to-GDP ratio of all firms is even more stable than R&D-to-sales of public firms.

<figure id="fig:publicvsprivate" data-latex-placement="H">
<div class="center">
<p></p>
<table>
<tbody>
<tr>
<td style="text-align: center;"><strong>A. Public versus private R&amp;D intensity</strong></td>
<td style="text-align: center;"><strong>B. Public versus private acquisition shares</strong></td>
</tr>
</tbody>
</table>
<img src="paperplots/publicvsprivate" style="width:100.0%" />
</div>
<figcaption>The figure compares R&amp;D and M&amp;A across public and private firms. Panel A plots the R&amp;D intensities of public and private firms over time. Public firm R&amp;D intensity is the aggregate ratio of R&amp;D to sales across all firms in Compustat. Public and private R&amp;D intensity is the ratio of total R&amp;D spending to GDP in the U.S., as reported by the NSF. Panel B plots the share of M&amp;A spending by public acquirers on public versus private targets over time. M&amp;A is defined here as mergers (purchases of the entire target). Note that these lines do not necessarily add to one, because there are other categories of targets (e.g., government).</figcaption>
</figure>

The second fact, plotted in Panel B of Figure 19, is that there has been no increase in the share of M&A by public acquirers of private targets. Specifically, the figure plots the percent of all public firms' M&A spending (in dollars) on private targets and public targets in mergers (deals in which they acquired 100% of the target).[^74] If public firms were increasingly buying private targets to grow, the share of spending on private targets would rise over time. In the data, there is no such increase---in fact, there is a very slight decrease.

## Additional evidence across industries 

### Decompositions across industries 

Figure 20 plots the value-to-sales and cashflow-to-value decompositions at the industry level instead of the firm level. Panels A and B use NAICS 2-digit industry definitions, and Panels C and D use NAICS 3-digit definitions. In all plots, the amount of reallocation is much smaller than at the firm level, suggesting that most of the reallocation was a within-industry phenomenon.

<figure id="fig:decomp_val_industries" data-latex-placement="H">
<div class="center">
<p></p>
<table>
<tbody>
<tr>
<td style="text-align: center;"><strong>A. Value-to-sales, NAICS 2-digit</strong></td>
<td style="text-align: center;"><strong>B. Cashflow-to-value, NAICS 2-digit</strong></td>
</tr>
</tbody>
</table>
<img src="paperplots/decomp_valsale_cfval_combined_naics2" style="width:105.0%" />
<table>
<tbody>
<tr>
<td style="text-align: center;"><strong>C. Value-to-sales, NAICS 3-digit</strong></td>
<td style="text-align: center;"><strong>D. Cashflow-to-value, NAICS 3-digit</strong></td>
</tr>
</tbody>
</table>
<img src="paperplots/decomp_valsale_cfval_combined_naics3" style="width:105.0%" />
</div>
<figcaption>The figure plots the decompositions of changes in aggregate valuations into industry-level and compositional changes. The decompositions for value-to-sales and cashflow-to-value are as in (<a href="#eq:valsale_decomp" data-reference-type="ref" data-reference="eq:valsale_decomp">[eq:valsale_decomp]</a>) and (<a href="#eq:cfval_decomp" data-reference-type="ref" data-reference="eq:cfval_decomp">[eq:cfval_decomp]</a>), respectively, at the industry level (instead of the firm level). Panels A and B use NAICS 2-digit industry classifications while Panels C and D use NAICS 3-digit classifications. Value is defined as market equity plus book debt, less cash; and cashflows are defined as dividends plus interest payments. Industry-level valuation ratios are winsorized at the 1% and 99% levels.</figcaption>
</figure>

### Cross-industry regressions 

As discussed in the main text, we can exploit cross-industry variation over time to test whether the key macro facts are related in the cross-section. The first prediction I test is that the industries with the largest declines in R&D-to-value had the largest increases in market valuations and M&A intensity. Table 4 reports the results of four regressions across NAICS 5-digit industries. The first specification regresses the total (log) change in an industry's value-to-sales ratio from 1980 to 2020 on the total (log) change in the industry's R&D-to-value ratio over the same period. The negative coefficient estimate implies that the sectors with the largest declines in R&D-to-value had the largest increases in value-to-sales ratios. Similarly, the other three regressions imply that falling R&D-to-value is associated with rising value-to-cashflows, rising M&A-to-R&D, and rising M&A-to-value. These findings corroborate the first prediction.

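+------------------------------------------------+------------------------------------------------------------------------+------------------------------------------------------------------------+------------------------------------------------------------------------+------------------------------------------------------------------------+
| Dependent variable:                            | $\Delta\log\left(\dfrac{\textrm{value}_k}{\textrm{sales}_k}\right)$    | $\Delta\log\left(\dfrac{\textrm{value}_k}{\textrm{cashflow}_k}\right)$ | $\Delta\log\left(\dfrac{\textrm{M\&A}_k}{\textrm{R\&D}_k}\right)$      | $\Delta\log\left(\dfrac{\textrm{M\&A}_k}{\textrm{value}_k}\right)$     |
+------------------------------------------------+------------------------------------------------------------------------+------------------------------------------------------------------------+------------------------------------------------------------------------+------------------------------------------------------------------------+
|                                                | \(1\)                                                                  | \(2\)                                                                  | \(3\)                                                                  | \(4\)                                                                  |
+------------------------------------------------+------------------------------------------------------------------------+------------------------------------------------------------------------+------------------------------------------------------------------------+------------------------------------------------------------------------+
| $\Delta\log(\textrm{R\&D}_k/\textrm{value}_k)$ | $-0.189^{***}$                                                         | $-0.130^{***}$                                                         | $-0.424^{***}$                                                         | $-0.184^{***}$                                                         |
+------------------------------------------------+------------------------------------------------------------------------+------------------------------------------------------------------------+------------------------------------------------------------------------+------------------------------------------------------------------------+
|                                                | $(0.031)$                                                              | $(0.033)$                                                              | $(0.097)$                                                              | $(0.086)$                                                              |
+------------------------------------------------+------------------------------------------------------------------------+------------------------------------------------------------------------+------------------------------------------------------------------------+------------------------------------------------------------------------+
| Observations                                   | $515$                                                                  | $515$                                                                  | $236$                                                                  | $279$                                                                  |
+------------------------------------------------+------------------------------------------------------------------------+------------------------------------------------------------------------+------------------------------------------------------------------------+------------------------------------------------------------------------+
| $R^2$                                          | $0.069$                                                                | $0.030$                                                                | $0.076$                                                                | $0.016$                                                                |
+------------------------------------------------+------------------------------------------------------------------------+------------------------------------------------------------------------+------------------------------------------------------------------------+------------------------------------------------------------------------+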

 : The table reports the coefficient estimates for regressions of the cumulative change in industry-level valuations ((1) value-to-sales and (2) value-to-cashflow) and M&A intensity ((3) M&A-to-R&D and (4) M&A-to-value) on the cumulative change in industry-level R&D-to-value. Industries are NAICS 5-digit sectors, excluding any sector for which there are years with missing observations (i.e., zero firms) or for which there are not enough observations to compute the regression variables (i.e., insufficient consecutive R&D or M&A observations). Industry-level variables (value, sales, cashflow, M&A, and R&D) are defined as the sums across firms within the industry---for example, if sector $k$ is composed of $M_{kt}$ firms $i\in\{1,\dots,M_{kt}\}$, then $\textrm{value}_{kt} = \sum_{i=1}^{M_{kt}}\textrm{value}_{it}$. The regression variables are then the cumulative log change in each industry-level variable from 1980 to 2020 (so there is one observation per industry). M&A spending includes both mergers and acquisitions of assets. In all specifications, variables are winsorized at the 1% and 99% level.
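
The variable construction described in the table note can be sketched in a few lines. This is a minimal illustration on synthetic data, not the paper's code or data: the helper names (`winsorize`, `cum_log_change`, `ols_slope`), the number of industries, and the simulated industry totals are all assumptions, and the plain OLS below does not compute standard errors.

```python
import numpy as np

def winsorize(x, lower=0.01, upper=0.99):
    """Clip a 1-D array at its lower and upper quantiles."""
    lo, hi = np.quantile(x, [lower, upper])
    return np.clip(x, lo, hi)

def cum_log_change(num_start, den_start, num_end, den_end):
    """Cumulative log change in the ratio num/den between two dates."""
    return np.log(num_end / den_end) - np.log(num_start / den_start)

def ols_slope(x, y):
    """Slope and R^2 from a bivariate OLS of y on x with an intercept."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return beta[1], 1.0 - resid.var() / y.var()

# Synthetic industry totals (sums across firms); hypothetical, NOT the paper's data.
rng = np.random.default_rng(0)
K = 200                                                   # hypothetical number of industries
rd_0, val_0, sales_0 = rng.lognormal(size=(3, K))         # start-year totals
d_rd_val = rng.normal(-0.5, 1.0, K)                       # change in log(R&D/value)
d_val_sales = -0.2 * d_rd_val + rng.normal(0.0, 0.5, K)   # built-in negative relation
val_growth = np.exp(rng.normal(1.0, 0.3, K))
val_1 = val_0 * val_growth                                # end-year totals
rd_1 = rd_0 * val_growth * np.exp(d_rd_val)
sales_1 = sales_0 * val_growth / np.exp(d_val_sales)

# One observation per industry: winsorized cumulative log changes.
x = winsorize(cum_log_change(rd_0, val_0, rd_1, val_1))
y = winsorize(cum_log_change(val_0, sales_0, val_1, sales_1))
slope, r2 = ols_slope(x, y)
print(f"slope = {slope:.3f}, R^2 = {r2:.3f}")
```

Because the synthetic data are generated with a negative relation, the estimated slope is negative by construction; the sketch only illustrates the mechanics of summing to the industry level, taking cumulative log changes, and winsorizing before the regression.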

The second prediction I test is that the industries in which R&D-to-value fell the most are also the industries that saw the most reallocation toward high-valuation firms. To measure this latter effect, I run the value-to-sales and cashflow-to-value decompositions *within* each industry $k$ and sum up the cumulative compositional change from 1980 to 2020; I then regress this cumulative change on the cumulative change in R&D-to-value. Table 5 reports the results of these regressions. Specification (1) shows that declining R&D-to-value is associated with a larger positive compositional change in value-to-sales (i.e., more reallocation to high-valuation firms), while specification (3) shows that it is associated with more reallocation to low cashflow-to-value firms. Both validate the prediction. To see whether this result is sensitive to the sign of the R&D-to-value change, specifications (2) and (4) restrict the sample to industries for which the cumulative change in R&D-to-value was negative. The results are qualitatively the same.

+:----------------------------------------------------------------+:-:+:---------------:+:-------------:+:---------------:+:---------:+:--------------:+:--------------:+:--------------:+
| Dependent variable: | | Compositional change in valuation |
+-----------------------------------------------------------------+---+---------------------------------------------------+-----------+--------------------------------------------------+
| | | $\textrm{value}_k/\textrm{sales}_k$ | | $\textrm{cashflow}_k/\textrm{value}_k$ |
+-----------------------------------------------------------------+---+-----------------+---------------+-----------------+-----------+----------------+----------------+----------------+
| | | \(1\) | | \(2\) | | \(3\) | | \(4\) |
+-----------------------------------------------------------------+---+-----------------+---------------+-----------------+-----------+----------------+----------------+----------------+
| $\Delta\log(\textrm{R\&D}_k/\textrm{value}_k)$ | | $-0.1038^{***}$ | | $-0.3160^{***}$ | | $0.0020^{***}$ | | $0.0021^{**}$ |
+-----------------------------------------------------------------+---+-----------------+---------------+-----------------+-----------+----------------+----------------+----------------+
| | | $(0.0519)$ | | $(0.0980)$ | | $(0.0006)$ | | $(0.0013)$ |
+-----------------------------------------------------------------+---+-----------------+---------------+-----------------+-----------+----------------+----------------+----------------+
| Subsample with $\Delta\log(\textrm{R\&D}_k/\textrm{value}_k)<0$ | | | | Y | | | | Y |
+-----------------------------------------------------------------+---+-----------------+---------------+-----------------+-----------+----------------+----------------+----------------+
| Observations | | $515$ | | $307$ | | $515$ | | $307$ |
+-----------------------------------------------------------------+---+-----------------+---------------+-----------------+-----------+----------------+----------------+----------------+
| $R^2$ | | $0.008$ | | $0.033$ | | $0.019$ | | $0.009$ |
+-----------------------------------------------------------------+---+-----------------+---------------+-----------------+-----------+----------------+----------------+----------------+

: The table reports the coefficient estimates for regressions of the cumulative compositional change in industry-level valuations (value-to-sales and cashflow-to-value) on the cumulative change in industry-level R&D-to-value. Industries are NAICS 5-digit sectors, excluding any sector for which there are years with missing observations (i.e., zero firms) or for which there are not enough observations to compute the regression variables (i.e., insufficient consecutive R&D observations). Industry-level variables (value, sales, cashflow, and R&D) are defined as the sums across firms within the industry---for example, if sector $k$ is composed of $M_{kt}$ firms $i\in\{1,\dots,M_{kt}\}$, then $\textrm{value}_{kt} = \sum_{i=1}^{M_{kt}}\textrm{value}_{it}$. For value-to-sales, the compositional change for sector $k$ from time $t-1$ to time $t$ is defined as $\sum_{i=1}^{M_{kt}}\Delta\left(\frac{\textrm{sales}_{it}}{\textrm{sales}_{kt}^{\rm cont}}\right) \times \frac{1}{2}\left(\frac{\textrm{value}_{it}}{\textrm{sales}_{it}} + \frac{\textrm{value}_{i,t-1}}{\textrm{sales}_{i,t-1}}\right)$, where $\textrm{sales}_{kt}^{\rm cont}$ is the total sales of firms in sector $k$ that continued operating from $t-1$ to $t$. The cumulative change is then the sum over time (from 1980 to 2020) of all of these changes (so there is one observation per industry). The compositional change for cashflow-to-value is similar, but uses value weights instead of sales weights. Specifications (1) and (3) consider all sectors, while (2) and (4) consider only sectors for which the cumulative change in R&D-to-value was negative. In all specifications, variables are winsorized at the 5% and 95% levels.
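
The one-period compositional change defined in the table note can be sketched as follows. The function name `decompose` and the two-firm numbers are hypothetical. The compositional term follows the formula in the note; the within-firm term (average share times the change in the firm's ratio) is the standard complementary term and is an assumption here about how the main-text decomposition is completed.

```python
import numpy as np

def decompose(sales_prev, sales_now, value_prev, value_now):
    """Split the change in the aggregate value-to-sales ratio among continuing
    firms into a within-firm term and a compositional (reallocation) term."""
    share_prev = sales_prev / sales_prev.sum()   # sales shares among continuing firms
    share_now = sales_now / sales_now.sum()
    ratio_prev = value_prev / sales_prev         # firm-level value-to-sales
    ratio_now = value_now / sales_now
    composition = np.sum((share_now - share_prev) * 0.5 * (ratio_now + ratio_prev))
    within = np.sum(0.5 * (share_now + share_prev) * (ratio_now - ratio_prev))
    return within, composition

# Hypothetical sector with two continuing firms whose own ratios stay fixed
# (2 and 10) while sales shift toward the high-valuation firm.
sales_prev = np.array([80.0, 20.0])
sales_now = np.array([50.0, 50.0])
value_prev = np.array([160.0, 200.0])
value_now = np.array([100.0, 500.0])

within, composition = decompose(sales_prev, sales_now, value_prev, value_now)
aggregate_change = value_now.sum() / sales_now.sum() - value_prev.sum() / sales_prev.sum()
print(within, composition, aggregate_change)
```

In this toy example the aggregate ratio rises even though no firm's own ratio changes, so the whole change is compositional. The two terms add up exactly to the change in the aggregate ratio among continuing firms; for cashflow-to-value, the same function could be applied with value weights in place of sales weights.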

# Derivation of equilibrium conditions 

## Profits and production 

### Household final-good demand 

Consider any individual household in the economy. The household chooses quantities $\{y_{ijt}\}$ of each good in order to maximize its total consumption, given by the two consumption aggregators defined in the main text. The household takes the prices $\{p_{ijt}\}$ of these goods as given. Let $W_{ct}$ denote the total expenditure of the household on all consumption goods (to be determined in the household's dynamic consumption-saving problem). Its budget constraint is then $$
W_{ct} \geq \int_0^{N_t}\sum_{i=1}^{m_{jt}}p_{ijt}y_{ijt}dj.
$$ If we define $\hat{y}_{ij}\equiv q_{ij}y_{ij}$ and $\hat{p}_{ij}\equiv p_{ij}/q_{ij}$, then this optimization solves the static Lagrangian $$
\mathcal{L}_{ct} = \max_{\{\hat{y}_{ijt}\}}\left\{c(\{\hat{y}_{ijt}\}) + \lambda_{ct}\left(W_{ct} - \int_0^{N_t}\sum_{i=1}^{m_{jt}}\hat{p}_{ijt}\hat{y}_{ijt}dj\right)\right\},
$$ where $c(\{\hat{y}_{ijt}\})$ nests the two consumption aggregators. Stated this way, the problem is of the same form as the standard CES optimization without quality differences. The first-order conditions are $$
\left(\frac{y_{jt}}{c_t}\right)^{-1/\eta} = \lambda_{ct}\hat{p}_{ijt}.
$$ This implies that, within product market $j$, firms must equate quality-adjusted prices: $$
\hat{p}_{ijt} = \hat{p}_{i'jt},
$$ for all $i$ and $i'$ producing in the market. Denote by $p_j=\hat{p}_{ij}$ the price index for the whole product market, and note that $p_jy_j = \sum_{i=1}^{m_j}p_{ij}y_{ij}$.

Given this price index, the remainder of the solution follows standard CES algebra. First, the household's first-order condition above implies that, for any goods $j$ and $j'$, $$
p_{jt}^\eta y_{jt} = p_{j't}^\eta y_{j't}.
$$ This means that the household's total expense on consumption is $$
\int_0^{N_t}p_{j't}y_{j't}dj' = p_{jt}^\eta y_{jt}\int_0^{N_t}p_{j't}^{1-\eta}dj'.
$$ Now define the aggregate price index $p_t$ such that $p_tc_t=\int_0^{N_t}p_{jt}y_{jt}dj$. Substituting this into the left-hand side of the last equation, then aggregating the $y_{jt}$ up to $c_t$ implies $$
p_t =\left(\int_0^{N_t}p_{jt}^{1-\eta}dj\right)^{1/(1-\eta)}.
$$ This implies that good $j$'s price satisfies $$
p_{jt} = \left(\frac{y_{jt}}{c_t}\right)^{-1/\eta}p_t,
$$ and hence $$
p_{ijt} = q_{ijt}\left(\frac{y_{jt}}{c_t}\right)^{-1/\eta}p_t
$$ for every firm $i$ in variety $j$.

This is the solution for a single household; the final step is to aggregate across households. Note that the household's first-order condition above aggregates across households to $$
\left(\frac{Y_{jt}}{Y_t}\right)^{-1/\eta} = \lambda_{ct}\hat{p}_{ijt},
$$ where $Y_{jt}$ is aggregate consumption of good $j$ and $Y_t$ is aggregate consumption of all goods. Thus, all of the steps above hold in terms of aggregates, implying the aggregate demand curve $$
p_{ijt} = q_{ijt}\left(\frac{Y_{jt}}{Y_t}\right)^{-1/\eta}p_t.
$$ Aggregate consumption is the numeraire, so we normalize $p_t=1$.
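
As a numerical sanity check on the CES algebra above, the following sketch solves the household problem for a small number of product markets and verifies the price index, the budget constraint, and the inverse demand curve. It is a discrete analogue (sums instead of integrals) with hypothetical prices, expenditure, and elasticity; the function name `ces_demand` is an assumption, and prices are interpreted as the quality-adjusted market prices $p_{jt}$.

```python
import numpy as np

def ces_demand(prices, expenditure, eta):
    """Demands, consumption, and price index for a discrete CES aggregator
    c = (sum_j y_j^(1 - 1/eta))^(1 / (1 - 1/eta))."""
    price_index = np.sum(prices ** (1.0 - eta)) ** (1.0 / (1.0 - eta))
    c = expenditure / price_index                 # p_t c_t = total expenditure
    y = (prices / price_index) ** (-eta) * c      # demand for each product market
    return y, c, price_index

eta = 3.0                                         # hypothetical elasticity (> 1)
prices = np.array([0.8, 1.0, 1.5, 2.0])           # hypothetical quality-adjusted prices
y, c, p = ces_demand(prices, expenditure=10.0, eta=eta)

# Consumption implied by the aggregator, and prices implied by inverse demand.
aggregator = np.sum(y ** (1.0 - 1.0 / eta)) ** (1.0 / (1.0 - 1.0 / eta))
inverse_demand = (y / c) ** (-1.0 / eta) * p
spending = np.sum(prices * y)
print(aggregator, c, spending)
```

The checks confirm that the demands exhaust the budget, that aggregating them reproduces $c_t$, and that $p_{jt}=(y_{jt}/c_t)^{-1/\eta}p_t$ returns the original prices.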

### Firm profit maximization 

The firm's profits, and hence its pricing and production decisions, are separable across products, so it suffices to consider the profit maximization for a single product. Firms compete à la Bertrand, taking as given the aggregate demand curves derived above to determine their output. A standard result in Bertrand competition is that firms may be at a corner solution if competition is sufficiently intense. Substituting the demand curve into firm $i$'s profits for product $j$ implies profits $$\begin{align}
\Pi_{ijt} &= \max_{p_{ijt}}\left(p_{ijt}-\frac{w_{{\rm p}t}}{a_i}\right)Y_{ijt} \\
&= \max_{p_{jt}}\left(p_{jt}-\frac{w_{{\rm p}t}}{a_iq_{ijt}}\right)p_{jt}^{-\eta}\biggl(Y_{jt}-\sum_{-i\neq i}q_{-ijt}Y_{-ijt}\biggr).
\end{align}$$ The second equality imposes the constraint, derived above, that quality-adjusted prices are equated within a product market. The firm chooses its (quality-adjusted) price $p_{jt}=p_{ijt}/q_{ijt}$ given the qualities and quantities $\{q_{-ijt},Y_{-ijt}\}$ of competitors.

First, let us consider interior solutions: the first-order condition with respect to $p_{jt}$ is $$
(1-\eta)p_{jt}^{-\eta} + \eta\frac{w_{{\rm p}t}}{a_iq_{ijt}}p_{jt}^{-\eta-1} = 0.
$$ (Note that $Y_{jt}-\sum_{-i\neq i}q_{-ijt}Y_{-ijt}=q_{ijt}Y_{ijt}>0$ in an interior solution.) This implies $$
p_{ijt} = q_{ijt}p_{jt} = \frac{\eta}{\eta-1}\frac{w_{{\rm p}t}}{a_i}.
$$ Defining the firm's markup as its price relative to marginal cost, we have $$
\mu_{ijt} \equiv \frac{p_{ijt}}{w_{{\rm p}t}/a_i} = \frac{\eta}{\eta-1},
$$ the usual CES monopoly markup.

This interior solution will not be a Nash equilibrium if it is possible for a competitor to lower its price while still making a profit. Define a firm's quality-adjusted marginal cost as $$
mc_{ijt} \equiv \frac{w_{{\rm p}t}}{a_iq_{ijt}}.
$$ Now consider the two firms with the lowest marginal costs; denote by $i_1$ the firm with the lowest marginal cost (the "leader") and by $i_2$ that with the second lowest (the "follower"). Because quality-adjusted prices are equated through $p_j=p_{ij}/q_{ij}$, the markup of firm $i_2$ can be equivalently expressed as: $$
\mu_{i_2jt} = \frac{p_{jt}}{mc_{i_2jt}} = \frac{p_{i_1jt}}{q_{i_1jt}mc_{i_2jt}}.
$$ The lowest price at which firm $i_1$ can guarantee firm $i_2$ will be unwilling to further reduce prices is that for which $\mu_{i_2jt}=1$, or $$
p_{i_1jt} = q_{i_1jt}mc_{i_2jt}.
$$ At this price, the follower earns zero profits and the markup of the leader is $$
\mu_{i_1jt} = \frac{q_{i_1jt}mc_{i_2jt}}{w_{{\rm p}t}/a_{i_1}} = \frac{mc_{i_2jt}}{mc_{i_1jt}} = \frac{a_{i_1}q_{i_1jt}}{a_{i_2}q_{i_2jt}}.
$$ Combining these two solutions, the leader's markup equals $$
\mu_{i_1jt} = \min\left\{\frac{a_{i_1}q_{i_1jt}}{a_{i_2}q_{i_2jt}},\frac{\eta}{\eta-1}\right\}
$$ and the follower always has a markup of one. Because we have assumed that firms earning zero markups choose not to produce (Section 2), product $j$ is only produced by the leader.

The last piece is to determine the set of possible market structures and resulting markups. If a firm creates a new variety, it obviously has no follower $(i_2=\varnothing)$. Alternatively, a firm could enter an existing market by innovating on an existing blueprint or by buying the leader's blueprint. In either of these cases, the quality gap is $q_{i_1}/q_{i_2} = \lambda$. Hence, we have six potential market structures, which can be summarized by the pair $\{i_1,i_2\}\in\{h,\ell\}\times\{h,\ell,\varnothing\}$. Denoting by $\mu_{i_1i_2}$ the markup given this pair, the results above imply that monopolists get the CES markup, $$
\mu_{h\varnothing}=\mu_{\ell\varnothing}=\frac{\eta}{\eta-1},
$$ same-type leaders earn (at least) the quality advantage, $$
\mu_{hh}=\mu_{\ell\ell} = \min\left\{\lambda,\frac{\eta}{\eta-1}\right\},
$$ high types leading low types earn both the quality and productivity advantage, $$
\mu_{h\ell} = \min\left\{\lambda\frac{a_h}{a_\ell},\frac{\eta}{\eta-1}\right\},
$$ and low types leading high types earn $$
\mu_{\ell h} = \min\left\{\lambda\frac{a_\ell}{a_h},\frac{\eta}{\eta-1}\right\}.
$$ We have assumed from the outset that $\lambda>a_h/a_\ell$, so $\mu_{\ell h}>1$ and this market structure is indeed consistent with firm profit incentives. If instead we had $\lambda<a_h/a_\ell$, then this market structure would not exist and we would be left with only the other five.[^75]
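
The six market structures and their markups can be tabulated with a short sketch. The function name `leader_markup` and all parameter values ($a_h$, $a_\ell$, $\lambda$, $\eta$) are hypothetical and chosen only so that $\lambda>a_h/a_\ell$ holds; they are not the paper's calibration.

```python
def leader_markup(leader, follower, a, lam, eta):
    """Leader's markup: the limit price implied by the follower's
    quality-adjusted cost, capped at the CES monopoly markup."""
    monopoly = eta / (eta - 1.0)
    if follower is None:              # new variety: no follower
        return monopoly
    return min(lam * a[leader] / a[follower], monopoly)

a = {"h": 1.2, "l": 1.0}              # hypothetical productivities, a_h > a_l
lam, eta = 1.3, 3.0                   # hypothetical quality step and elasticity

markups = {
    (i1, i2): leader_markup(i1, i2, a, lam, eta)
    for i1 in ("h", "l")
    for i2 in ("h", "l", None)
}
for pair, mu in markups.items():
    print(pair, round(mu, 4))
```

With these illustrative values the monopoly cap binds only for a high type leading a low type, and the low-type leader facing a high-type follower still earns a markup above one because $\lambda>a_h/a_\ell$.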

### Aggregation 

Substituting the demand curve, $p_{jt}=(Y_{jt}/Y_t)^{-1/\eta}$, and the fact that only one firm produces each distinct product, $Y_{jt}=q_{ijt}Y_{ijt}$, into the definition of the markup, $\mu_{ijt}\equiv p_{ijt}/(w_{{\rm p}t}/a_i)$, we have $$
w_{{\rm p}t} = \left(\frac{L_{ijt}}{Y_t}\right)^{-1/\eta}(a_iq_{ijt})^{1-1/\eta}\mu_{ijt}^{-1}.
$$ Here $L_{ijt}=Y_{ijt}/a_i$ denotes the production labor the firm employs on product $j$, consistent with the marginal cost $w_{{\rm p}t}/a_i$. Because there is only one firm producing each variety $j$, we can replace $a_i$ with $a_{jt}$ and drop all other $i$ indices (e.g., $q_{ijt}=q_{jt}$). Before aggregating this expression, let us define some useful macroeconomic aggregates. Aggregate quality is defined as an average of blueprint qualities: $$
Q_t \equiv \left(\frac{1}{N_t}\int_0^{N_t}q_{jt}^{\eta-1}dj\right)^{1/(\eta-1)}.
$$ Aggregate productivity is defined as[^76] $$
\bar{a}_t \equiv \left(\frac{1}{N_t}\int_0^{N_t}\left(\frac{q_{jt}}{Q_t}\right)^{\eta-1}a_{jt}^{\eta-1}dj\right)^{1/(\eta-1)},
$$ a quality-weighted average of firm-level productivity. Finally, we define two aggregates of markups, $$
\mathcal{M}_{\eta t} \equiv \left(\frac{1}{N_t}\int_0^{N_t}\left(\frac{a_{jt}q_{jt}}{\bar{a}_tQ_t}\right)^{\eta-1}\mu_{jt}^{-\eta}dj\right)^{-1/\eta}
$$ and $$
\mathcal{M}_{\eta-1,t} \equiv \left(\frac{1}{N_t}\int_0^{N_t}\left(\frac{a_{jt}q_{jt}}{\bar{a}_tQ_t}\right)^{\eta-1}\mu_{jt}^{-(\eta-1)}dj\right)^{-1/(\eta-1)}.
$$ These are averages of markups across products, with higher weight placed on blueprints with higher productivity and quality.

With these objects defined, we can use the product-level wage expression above to derive two aggregate equations. First, solving that expression for $L_{jt}$, integrating over $j$, and imposing market clearing $L_{{\rm p}t}=\int_0^{N_t}L_{jt}dj$, we get $$
w_{{\rm p}t} = \left(\frac{L_{{\rm p}t}}{N_tY_t}\right)^{-1/\eta}(\bar{a}_tQ_t)^{1-1/\eta}\mathcal{M}_{\eta t}^{-1}.
$$ Second, solving for $Y_{jt}$ in the same expression and substituting it into the definition of total output $Y_t$, we get $$
w_{{\rm p}t} = N_t^{1/(\eta-1)}\bar{a}_tQ_t\mathcal{M}_{\eta-1,t}^{-1}.
$$ Now equating the right-hand sides of these expressions and solving for output, we have $$
Y_t = N_t^{1/(\eta-1)}\bar{a}_tQ_t\Omega_tL_{{\rm p}t},
$$ where the wedge term $\Omega_t$ is defined as $$
\Omega_t \equiv \frac{\mathcal{M}_{\eta t}^\eta}{\mathcal{M}_{\eta-1,t}^\eta}.
$$ As shown in Section 11.1.5 below, this wedge takes a value in $(0,1]$ and attains its maximum ($\Omega=1$) if and only if there is no cross-product markup heterogeneity ($\mu_{jt}=\mu_t$, $\forall j$). This is because heterogeneous market power distorts demand and results in a misallocation of production labor across blueprints, as in Peters (2020). I therefore refer to $\Omega$ as *static misallocation*.

To solve for aggregate wages and the labor share of output, note that combining the output expression above with either of the two aggregate wage conditions implies $$
w_{{\rm p}t} = \Lambda_t\frac{Y_t}{L_{{\rm p}t}},
$$ where the (production) labor share of output $\Lambda_t = w_{{\rm p}t}L_{{\rm p}t}/Y_t$ has solution $$
\Lambda_t = \frac{\mathcal{M}_{\eta-1, t}^{\eta-1}}{\mathcal{M}_{\eta t}^\eta}.
$$ Notice that, from the household demand curve, the sales share of good $j$ in total sales is $$
\frac{p_{jt}Y_{jt}}{Y_t} = p_{jt}^{1-\eta} = \left(\frac{\mu_{jt}w_{{\rm p}t}}{a_{jt}q_{jt}}\right)^{1-\eta},
$$ implying that $$
\Lambda_t = \int_0^{N_t}\frac{p_{jt}Y_{jt}}{Y_t}\mu_{jt}^{-1}dj.
$$ The aggregate labor share is a sales-weighted average of product-level labor shares (inverse markups $\mu_{jt}^{-1}$). A natural definition of the aggregate markup is therefore $\Lambda_t^{-1}$.
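
The aggregation results above can be checked numerically on a discrete set of products. The sketch below is an illustration under stated assumptions: integrals are replaced by sums (and $\frac{1}{N_t}\int$ by a mean), the function name `aggregates` is hypothetical, and productivities, qualities, and markups are drawn at random rather than taken from the model's equilibrium.

```python
import numpy as np

def aggregates(a, q, mu, eta):
    """Discrete analogues of Q, a_bar, the two markup aggregates,
    the wedge Omega, the labor share Lambda, and the production wage."""
    n = len(a)
    Q = np.mean(q ** (eta - 1)) ** (1 / (eta - 1))
    a_bar = np.mean((q / Q) ** (eta - 1) * a ** (eta - 1)) ** (1 / (eta - 1))
    weight = (a * q / (a_bar * Q)) ** (eta - 1)            # averages to one
    m_eta = np.mean(weight * mu ** (-eta)) ** (-1 / eta)
    m_eta1 = np.mean(weight * mu ** (-(eta - 1))) ** (-1 / (eta - 1))
    omega = (m_eta / m_eta1) ** eta
    labor_share = m_eta1 ** (eta - 1) / m_eta ** eta
    wage = n ** (1 / (eta - 1)) * a_bar * Q / m_eta1
    return omega, labor_share, wage

rng = np.random.default_rng(1)
n, eta = 1000, 3.0                                         # hypothetical values
a = rng.choice([1.0, 1.2], size=n)                         # two productivity types
q = rng.lognormal(0.0, 0.3, size=n)                        # blueprint qualities
mu = rng.uniform(1.05, 1.5, size=n)                        # heterogeneous markups

omega, labor_share, wage = aggregates(a, q, mu, eta)
sales_share = (mu * wage / (a * q)) ** (1 - eta)           # p_j^(1 - eta)
weighted_inverse_markup = np.sum(sales_share / mu)
omega_uniform, _, _ = aggregates(a, q, np.full(n, 1.3), eta)
print(omega, labor_share, weighted_inverse_markup, omega_uniform)
```

In this check the sales shares sum to one, the sales-weighted average of inverse markups coincides with $\Lambda_t$, the wedge lies strictly below one when markups are dispersed, and it equals one when all markups are identical.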

### Firm labor allocations, revenues, and profits 

Let us bring back the firm-product-level indices $\{i,j\}$. The product-level wage expression and the first aggregate wage condition (from labor-market clearing) together imply labor allocations $$
\frac{L_{ijt}}{L_{{\rm p}t}} = \frac{1}{N_t}\left(\frac{a_iq_{ijt}}{\bar{a}_tQ_t}\right)^{\eta-1}\left(\frac{\mu_{ijt}}{\mathcal{M}_{\eta t}}\right)^{-\eta}.
$$ A product receives a higher labor allocation if it has higher quality, its producer is more productive, or its markup is lower.[^77] Now using the fact that revenue $p_{ijt}Y_{ijt} = \mu_{ijt}w_{{\rm p}t}L_{ijt}$ and the wage $w_{{\rm p}t}=\Lambda_tY_t/L_{{\rm p}t}$, it follows that the firm receives revenues $$\begin{align}
p_{ijt}Y_{ijt} &= \mu_{ijt}\left(\frac{L_{ijt}}{L_{{\rm p}t}}\right)\Lambda_t Y_t \\
&= \frac{1}{N_t}\left(\frac{a_iq_{ijt}}{\bar{a}_tQ_t}\right)^{\eta-1}\left(\frac{\mu_{ijt}}{\mathcal{M}_{\eta-1,t}}\right)^{-(\eta-1)}Y_t.
\end{align}$$ Fixing relative allocations, the scale of production grows with aggregate output per product, $Y_t/N_t$. The corresponding profits are then $$
\Pi_{ijt} = \left(1-\frac{1}{\mu_{ijt}}\right)p_{ijt}Y_{ijt}.
$$
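
The step behind the second equality in the revenue expression can be spelled out as follows. The sketch uses only the labor allocation, the labor share $\Lambda_t=\mathcal{M}_{\eta-1,t}^{\eta-1}/\mathcal{M}_{\eta t}^{\eta}$, and the markup aggregates defined above; no additional assumptions are introduced. $$\begin{align}
\mu_{ijt}\left(\frac{L_{ijt}}{L_{{\rm p}t}}\right)\Lambda_t &= \mu_{ijt}\cdot\frac{1}{N_t}\left(\frac{a_iq_{ijt}}{\bar{a}_tQ_t}\right)^{\eta-1}\mu_{ijt}^{-\eta}\mathcal{M}_{\eta t}^{\eta}\cdot\frac{\mathcal{M}_{\eta-1,t}^{\eta-1}}{\mathcal{M}_{\eta t}^{\eta}} \\
&= \frac{1}{N_t}\left(\frac{a_iq_{ijt}}{\bar{a}_tQ_t}\right)^{\eta-1}\mu_{ijt}^{-(\eta-1)}\mathcal{M}_{\eta-1,t}^{\eta-1}.
\end{align}$$ Multiplying by $Y_t$ gives the revenue expression: the aggregate $\mathcal{M}_{\eta t}$ cancels, so revenues depend on the markup only relative to $\mathcal{M}_{\eta-1,t}$.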

### Quantification of static misallocation 

Before proceeding, we must verify the claim that the wedge term $\Omega_t$, defined above, is indeed in $(0,1]$ and equals one if and only if the labor allocations $L_{ijt}/L_{{\rm p}t}$ maximize aggregate TFP (output per worker $A_t=Y_t/L_{{\rm p}t}$). The maximum level of TFP can be written $$
A_t^* \equiv \max_{\{L_{ijt}/L_{{\rm p}t}\}}\frac{Y_t}{L_{{\rm p}t}} = \max_{\{\hat{L}_{ijt}\}}\left(\int_0^{N_t}\left(\sum_{i=1}^{m_{jt}}a_iq_{ijt}\hat{L}_{ijt}\right)^{1-1/\eta}dj\right)^{1/(1-1/\eta)},
$$ where $\hat{L}_{ijt}\equiv L_{ijt}/L_{{\rm p}t}$ denotes the share of production labor. Within variety $j$, it is clearly the case that all labor should be allocated to the blueprint with the highest value of $a_iq_{ijt}$, just as occurs in the decentralized equilibrium. Letting $\{a_{jt},q_{jt}\}$ denote the values for this blueprint, this means that $$
A_t^* = \max_{\{\hat{L}_{jt}\}}\left(\int_0^{N_t}(a_{jt}q_{jt}\hat{L}_{jt})^{1-1/\eta}dj\right)^{1/(1-1/\eta)},
$$ subject to the constraint that $$
1 \geq \int_0^{N_t}\hat{L}_{jt}dj.
$$ Letting $\lambda_{Lt}$ denote the Lagrange multiplier on this constraint, the first-order condition for this maximization is $$
Y_t^{1/\eta}(q_{jt}a_{jt})^{1-1/\eta}(\hat{L}_{jt}^*)^{-1/\eta} = \lambda_{Lt}.
$$ Aggregating this in the same way as we did the firms' first-order conditions, we have $$
Y_t^{1/\eta}N_t^{1/\eta}(\bar{a}_tQ_t)^{1-1/\eta} = \lambda_{Lt},
$$ where $\bar{a}_t$ and $Q_t$ are identical to their values in the decentralized equilibrium. Dividing these two conditions implies $$
\hat{L}_{jt}^* = \frac{1}{N_t}\left(\frac{a_{jt}q_{jt}}{\bar{a}_tQ_t}\right)^{\eta-1}.
$$ Substituting this solution back into the demand aggregator, we get $$
A_t^* = N_t^{1/(\eta-1)}Q_t\bar{a}_t.
$$ Because $\{N_t,Q_t,\bar{a}_t\}$ are the same as in the decentralized equilibrium, the ratio of actual (decentralized) aggregate TFP to this maximal TFP is therefore $$
\frac{A_t}{A_t^*} = \Omega_t.
$$ We will necessarily have $\Omega_t\in(0,1]$, because $A_t^*\geq A_t>0$ by definition.

To understand what $\Omega_t$ represents, note that $\Omega_t=1$ if and only if there is zero markup dispersion ($\mu_{jt}=\mu_t$, $\forall j$). This can be seen by noticing that the ratio of the actual (decentralized) labor allocation to the TFP-maximizing allocation derived above equals $$
\frac{\hat{L}_{jt}}{\hat{L}_{jt}^*} = \left(\frac{\mu_{jt}}{\mathcal{M}_{\eta t}}\right)^{-\eta}.
$$ This ratio equals one for all $j$ if and only if $\mu_{jt}=\mathcal{M}_{\eta t}$ for all $j$ --- that is, if and only if all products earn the same markup.
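
The claim that $A_t/A_t^*=\Omega_t$ can also be verified numerically. The sketch below is a discrete analogue with randomly drawn (hypothetical) productivities, qualities, and markups; the helper name `tfp` is an assumption. It compares output per production worker under the decentralized labor shares with output under the TFP-maximizing shares.

```python
import numpy as np

def tfp(a, q, labor, eta):
    """Output per production worker for given labor shares (discrete aggregator)."""
    return np.sum((a * q * labor) ** (1 - 1 / eta)) ** (1 / (1 - 1 / eta))

rng = np.random.default_rng(2)
n, eta = 500, 3.0                                   # hypothetical values
a = rng.choice([1.0, 1.2], size=n)
q = rng.lognormal(0.0, 0.3, size=n)
mu = rng.uniform(1.05, 1.5, size=n)

# (a_j q_j / (a_bar Q))^(eta - 1), using (a_bar Q)^(eta - 1) = mean((a q)^(eta - 1)).
weight = (a * q) ** (eta - 1) / np.mean((a * q) ** (eta - 1))
m_eta = np.mean(weight * mu ** (-eta)) ** (-1 / eta)
m_eta1 = np.mean(weight * mu ** (-(eta - 1))) ** (-1 / (eta - 1))
omega = (m_eta / m_eta1) ** eta

labor_eq = weight / n * (mu / m_eta) ** (-eta)      # decentralized allocation
labor_opt = weight / n                              # TFP-maximizing allocation
ratio = tfp(a, q, labor_eq, eta) / tfp(a, q, labor_opt, eta)
print(ratio, omega)
```

Both allocations use the full unit of labor, and the TFP ratio matches $\Omega_t$ up to floating-point error; with dispersed markups the ratio is strictly below one.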

## Dynamic firm problem 

This appendix derives the equilibrium conditions for the dynamic firm and household problems, taking as given the path of the blueprint distribution $f_{\mathcal{Z}t}$ and the ratio $L_t/N_t$. In what follows, statements that an equilibrium object is a "function of time" are more precisely interpreted as those objects depending on the full distribution $f_{\mathcal{Z}t}$ and the ratio $L_t/N_t$, which themselves will be deterministic functions of time.

### Rescaling the firm problem 

Firm $i$ maximizes the present value $V_{it}$ of its cash flows $D_{it}$. Because the profit maximization at the blueprint level is static (see Section 11.1.4), this objective amounts to choosing R&D and M&A policies given the optimal profits of each product derived above.

To solve this problem, it is helpful to first establish some facts about the state-price density $\xi_t$ which prices firms.[^78] The absence of arbitrage opportunities implies that the riskfree rate $r_{ft}$ equals $$
r_{ft} = -\mathbb{E}_t\left[\frac{d\xi_t}{\xi_t}\right]\frac{1}{dt},
$$ and the expected return on any firm $i$ equals $$
r_{it} \equiv \mathbb{E}_t[dR_{it}]\frac{1}{dt} = r_{ft} - \mathbb{E}_t\left[\frac{d[\xi,V_i]_t}{\xi_tV_{it^-}}\right]\frac{1}{dt}.
$$ Because we have assumed that markets are essentially complete, we will ultimately find that (i) $\xi_t$ is unique and given by the marginal utility of a representative household, and (ii) all expected returns $r_{it}$ equal the riskfree rate $r_{ft}$. But these intuitive results must be proven; for now, we work with the more general expected returns $r_{it}$.

We can rescale the firm's problem to de-trend its growth relative to the aggregate. Everything will be rescaled by per-capita aggregate output $y_t=Y_t/L_t$. Denote rescaled variables with hats: $\hat{Z}_t\equiv Z_t/y_t$. The rescaled dividend $\hat{D}_{it}$ equals rescaled profits, $$
\hat{\Pi}_{it} = \sum_{j=1}^{n_{it}}\hat{\Pi}_{ijt} = \sum_{j=1}^{n_{it}}(1-\mu_{ijt}^{-1})\frac{L_t}{N_t}\left(\frac{a_iq_{ijt}}{\bar{a}_tQ_t}\right)^{\eta-1}\left(\frac{\mu_{ijt}}{\mathcal{M}_{\eta-1,t}}\right)^{-(\eta-1)},
$$ minus rescaled R&D and M&A expenditures, $$
\hat{w}_{{\rm x}t}X_{it} = \hat{w}_{{\rm x}t}(X_{{\rm n}it}+X_{{\rm q}it}+X_{{\rm m}it}).
$$ The reason for scaling by per-capita output, as opposed to total output, is that, in a balanced-growth equilibrium, the number of firms $M_t$ and blueprints $N_t$ will grow with the population, so the distribution of rescaled firm sizes (e.g., of $\hat{D}_{it}$) will be stationary. The choice to rescale the skilled wage in the expenditure expression above, instead of the number of laborers $X_{it}$, has the same rationale.

It follows that the rescaled value of the firm equals $$
\hat{V}_{it} = \sup_{\{X_{{\rm n}it},X_{{\rm q}it},X_{{\rm m}it}\}}\mathbb{E}_t\left[\int_0^\infty \frac{\xi_{t+\tau}}{\xi_t}\frac{y_{t+\tau}}{y_t}\hat{D}_{i,t+\tau}d\tau\right].
$$ Let us derive an HJB equation for this value and express it in terms of scaled variables. First, note that the unscaled firm value can be written recursively as $$
V_{it} = \mathbb{E}_t\left[\int_t^T\frac{\xi_s}{\xi_t}D_{is}ds\right] + \mathbb{E}_t\left[\frac{\xi_T}{\xi_t}V_{iT}\right],
$$ for any $T>t$. Next, multiplying both sides of this expression by $\xi_t$ and adding $\int_{t_0}^t\xi_sD_{is}ds$, where $t_0<t$, implies: $$
\int_{t_0}^t\xi_sD_{is}ds + \xi_tV_{it} = \mathbb{E}_t\left[\int_{t_0}^T\xi_sD_{is}ds + \xi_TV_{iT}\right].
$$ Hence, the expression on the left-hand side is a martingale (and therefore also a local martingale), so its expected increment is zero: $$
\mathbb{E}_t\left[d\left(\int_{t_0}^t\xi_sD_{is}ds + \xi_tV_{it}\right)\right] = 0.
$$ Evaluating this differential and using the expected-return definition above, we have the unscaled HJB $$
r_{it}V_{it} = D_{it} + \mathbb{E}_t[dV_{it}]\frac{1}{dt}.
$$ To rescale this, note that per-capita output growth, $\dot{y}_t/y_t = g_{yt}$, is deterministic, so the covariation $d[y,\hat{V}_i]_t=0$; thus, we have the decomposition $$
dV_{it} = d(y_t\hat{V}_{it}) = g_{yt}y_t\hat{V}_{it}dt + y_td\hat{V}_{it}.
$$ Substituting in this decomposition and dividing everything by $y_t$, we get the rescaled HJB $$
(r_{it} - g_{yt})\hat{V}_{it} = \sup_{\{X_{{\rm n}it},X_{{\rm q}it},X_{{\rm m}it}\}}\left\{\hat{D}_{it} + \mathbb{E}_t[d\hat{V}_{it}]\frac{1}{dt}\right\}.

$$ Henceforth, let $\rho_{it}\equiv r_{it} - g_{yt}$, foreseeing that in equilibrium $\rho_{it}$ will equal the constant preference parameter $\rho$.

### Solving the firm problem 

To solve the firm's problem ), I first note two facts that dramatically simplify it. First, the rescaled profits of the firm are additively separable across products, and each product's profits depend only on (i) deterministic macroeconomic aggregates (i.e., time) and (ii) the set of blueprint-specific characteristics $Z_{ij} = \{i_1,i_2,\hat{q}\}$, where $\hat{q}\equiv \log(q/Q)$ represents *relative quality*. Letting $\mathcal{Z}_{it}\equiv\{Z_{ijt}\}_{j=1}^{n_{it}}$ denote the collection of blueprint characteristics, firm profits can be written $$
\hat{\Pi}_{it} = \sum_{j=1}^{n_{it}} \hat{\Pi}_{ijt} \equiv \sum_{\{i_1,i_2,\hat{q}\}\in\mathcal{Z}_{it}}\pi_{i_1i_2\hat{q}t}.

$$ The significance of this is that the profits $\pi_{i_1i_2\hat{q}t}$ are independent of each other and of firm characteristics (except $i_1$). The second useful fact, which will be proven below, is that the optimal R&D and M&A policies $\{X_{{\rm n}it},X_{{\rm q}it},X_{{\rm m}it}\}$ will be linear in the number of blueprints $n$ and, given $n$, will depend only on the firm's type $a_i=a_{i_1}$. Hence, we can define the allocations per blueprint as $x\equiv X/n$ and note that the R&D and M&A production functions can be written as linear functions of $n$. For example, for new-variety creation, we will have $x_{{\rm n}it} \equiv X_{{\rm n}it}/n_{it}$ and $$
(1-\alpha)\varphi X_{{\rm n}it}^\varepsilon n_{it}^{1-\varepsilon} = \Phi_{\rm n}(x_{{\rm n}it})n_{it}, \quad\textrm{where}\quad \Phi_{\rm n}(x) \equiv (1-\alpha)\varphi x^\varepsilon.
$$ Define $\{x_{{\rm q}it},x_{{\rm m}it}\}$ and $\{\Phi_{\rm q}(x),\Psi(x)\}$ analogously for quality innovation and M&A search. A consequence of linearity in $n$ is that total expenditures are linear in $n$: $$
\hat{w}_{{\rm x}t}X_{it} = \hat{w}_{{\rm x}t}x_{it}n_{it}.
$$ Combining these two facts implies that the rescaled dividend can be written $$
\hat{D}_{it} = \sum_{\{i_1,i_2,\hat{q}\}\in\mathcal{Z}_{it}}(\pi_{i_1i_2\hat{q}t} - \hat{w}_{{\rm x}t}x_{i_1t}).

$$ Like firm profits, the total firm dividend is separable across blueprint characteristics. Thus, a reasonable conjecture is that the value of the firm is also separable across blueprints: $$
\hat{V}_{it} = \sum_{\{i_1,i_2,\hat{q}\}\in\mathcal{Z}_{it}}v_{i_1i_2\hat{q}t},

$$ where $v_{i_1i_2\hat{q}t}$ represents the value of a single blueprint with a leader of type $a_{i_1}$, a follower of type $a_{i_2}$, and relative quality $\hat{q}$.

The separability of the value function ) means that, rather than solving a system of HJB equations across firms, we can solve a system of HJBs across blueprint types $\{i_1,i_2,\hat{q}\}$. Let us derive this from the firm-level HJBs ). Growth in the (rescaled) firm value equals the sum of six components: $$\begin{multline}
\mathbb{E}_t[d\hat{V}_{it}]\frac{1}{dt} = \underbrace{\Phi_{\rm n}(x_{{\rm n}it})n_{it}\bar{v}_{{\rm n}it}}_{\textrm{new variety}} +\underbrace{ \Phi_{\rm q}(x_{{\rm q}it})n_{it}\bar{v}_{{\rm q}it}}_{\textrm{quality innov.}} + \underbrace{\Psi(x_{{\rm m}it})n_{it}\bar{v}_{{\rm m}it}}_{\textrm{acquisition}} \\
- \underbrace{\sum_{j=1}^{n_{it}}\delta_{ijt}\left(\hat{V}_{it} - \hat{V}_{it}^{(-j)}\right)}_{\textrm{creative destruction}} + \underbrace{\sum_{j=1}^{n_{it}}\delta_{ijt}^\psi\left(\bar{v}_{{\rm m}ijt} - \left(\hat{V}_{it} - \hat{V}_{it}^{(-j)}\right)\right)}_{\textrm{M\&A exit}} - \underbrace{\sum_{j=1}^{n_{it}}\frac{\partial \hat{V}_{it}}{\partial\hat{q}_{ijt}}g_{Qt}}_{\textrm{quality depr.}} + \underbrace{\vphantom{\sum_{j=1}^{n_{it}}}\frac{\partial \hat{V}_{it}}{\partial t}}_{\textrm{time path}}.

\end{multline}$$ The first term is the expected increase in firm value from new-variety creation. If the firm succeeds, which occurs with intensity $\Phi_{\rm n}(x_{{\rm n}it})n_{it}dt$, then it expects its market value to increase by $\bar{v}_{{\rm n}it}$. The second term is analogous for quality innovation. The third term is the expected increase in value from finding and acquiring an M&A target. If the firm successfully finds a blueprint, then $\bar{v}_{{\rm m}it}$ is the surplus it expects: the value of the acquired blueprint net of the deal price it must pay. Note that the expectation $\bar{v}_{{\rm m}it}$ includes the set of blueprints that do not result in a deal, for which the surplus is zero. The fourth term is the expected loss in value from creative destruction. If a competitor improves on the quality of blueprint $j$, then the firm's value falls to some $\hat{V}_{it}^{(-j)}<\hat{V}_{it}$; each $\delta_{ijt}$ denotes the Poisson intensity with which a blueprint could be lost to creative destruction. The fifth term is the expected gain in firm value from selling a blueprint to an acquirer at a premium: the firm loses the blueprint, but receives a payout $\bar{v}_{{\rm m}ijt}$ determined in the Nash bargaining game. This occurs with intensity $\delta_{ijt}^\psi$. The sixth and final term is the depreciation in the relative quality of existing blueprints. Aggregate quality grows at a rate $\dot{Q}_t/Q_t=g_{Qt}$, so the relative quality of existing blueprints depreciates by $\dot{\hat{q}}_{ijt} = -g_{Qt}$.

Substituting ) into the firm-level HJB ) implies the first-order condition for new-variety R&D $$
\hat{w}_{{\rm x}t}= \varepsilon(1-\alpha)\varphi\left(\frac{X_{{\rm n}it}}{n_{it}}\right)^{\varepsilon-1}\bar{v}_{{\rm n}it}.
$$ Analogous first-order conditions hold for quality R&D and M&A inputs. These conditions imply the optimal policies $$\begin{align}

x_{{\rm n}it} &\equiv \frac{X_{{\rm n}it}}{n_{it}} = \left(\frac{\varepsilon(1-\alpha)\varphi\bar{v}_{{\rm n}it}}{\hat{w}_{{\rm x}t}}\right)^{1/(1-\varepsilon)}, \\

x_{{\rm q}it} &\equiv \frac{X_{{\rm q}it}}{n_{it}} = \left(\frac{\varepsilon\alpha\varphi\bar{v}_{{\rm q}it}}{\hat{w}_{{\rm x}t}}\right)^{1/(1-\varepsilon)}, \\

x_{{\rm m}it} &\equiv \frac{X_{{\rm m}it}}{n_{it}} = \left(\frac{\varepsilon\psi\bar{v}_{{\rm m}it}}{\hat{w}_{{\rm x}t}}\right)^{1/(1-\varepsilon)}.
\end{align}$$ Provided that the expected market values $\{\bar{v}_{{\rm n}it},\bar{v}_{{\rm q}it},\bar{v}_{{\rm m}it}\}$ are invariant to size $n_{it}$, these are independent of $n_{it}$, as conjectured.
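As a numerical illustration of the first-order condition for new-variety R&D, the following Python sketch computes the per-blueprint input $x_{{\rm n}it}$ from the closed-form policy and checks that the marginal benefit equals the rescaled wage. All parameter values and the helper name `optimal_rd_input` are hypothetical and chosen only for illustration; they are not the paper's calibration.

```python
def optimal_rd_input(coef, v_bar, wage, eps):
    """Per-blueprint input x solving wage = eps * coef * x**(eps - 1) * v_bar."""
    return (eps * coef * v_bar / wage) ** (1.0 / (1.0 - eps))

# Hypothetical values (illustration only, not the paper's calibration).
eps, alpha, phi = 0.5, 0.3, 1.2   # curvature, quality share, innovation productivity
wage, v_bar_n = 0.8, 2.0          # rescaled skilled wage, expected value of a new variety

coef_n = (1.0 - alpha) * phi      # coefficient in Phi_n(x) = (1 - alpha) * phi * x**eps
x_n = optimal_rd_input(coef_n, v_bar_n, wage, eps)

# First-order condition: marginal benefit of one more unit of input equals the wage.
marginal_benefit = eps * coef_n * x_n ** (eps - 1.0) * v_bar_n

# At the optimum, spending equals a share eps of the expected gain Phi_n(x) * v_bar.
spending = wage * x_n
expected_gain = coef_n * x_n ** eps * v_bar_n
```

The last two lines show why a factor $(1-\varepsilon)$ appears once the first-order conditions are substituted back into the value function: expenditure absorbs a fraction $\varepsilon$ of the expected gain, leaving $(1-\varepsilon)$ as net surplus.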

With these results in hand, we can implement and prove the conjecture ). Recall that, conditional on creating a new variety, the blueprint quality is drawn from the distribution $\Gamma_{\hat{q}}$. Thus, under ), the expected value of a new variety is $$
\bar{v}_{{\rm n}it} = \int_{-\infty}^\infty v_{i\varnothing\hat{q}t}\Gamma_{\hat{q}}d\hat{q}.

$$ Recall also that quality innovations are undirected (i.e., target blueprints are chosen at random). This means that the expected present value of a new blueprint obtained through quality innovation is $$
\bar{v}_{{\rm q}it} = \int_{-\infty}^\infty [v_{i\ell,\hat{q}+\log\lambda,t}f_{\ell\hat{q}t} + v_{ih,\hat{q}+\log\lambda,t}f_{h\hat{q}t}] d\hat{q},

$$ where $f_{i\hat{q}t}\equiv \sum_{i_2}f_{ii_2\hat{q}t}$. In words, the firm finds a blueprint with current relative quality $\hat{q}$ and creates a new one with relative quality $\hat{q}+\log\lambda$. Finally, to solve for the expected value of M&A, we need to pin down when a deal will take place and, if it does, the deal price. If a firm with type $i$ purchases a blueprint of type $\{i_1=i',i_2,\hat{q}\}$, then the Nash bargaining solution (should it exist) is $$
v_{{\rm m},ii'i_2\hat{q}t} = \underset{v_{\rm m}}{\operatorname{argmax}} \left\{(v_{ii_2\hat{q}t}-v_{\rm m})^\varrho(v_{\rm m}-v_{i'i_2\hat{q}t})^{1-\varrho}\right\} = \varrho v_{i'i_2\hat{q}t} + (1-\varrho)v_{ii_2\hat{q}t}.

$$ The deal is clearly mutually beneficial if and only if $v_{ii_2\hat{q}t}>v_{i'i_2\hat{q}t}$ (if they are equal, no deal occurs, consistent with an infinitesimal fixed cost of negotiating). This will ultimately be the case if and only if there is a high-type buyer ($i=h$) and a low-type target ($i'=\ell$), because high-productivity firms obtain strictly higher profits ($\pi_{hi_2\hat{q}t}>\pi_{\ell i_2\hat{q}t}$) and will face the same risk of creative destruction ($\delta_{hi_2\hat{q}t}=\delta_{\ell i_2\hat{q}t}$, shown below).[^79] Consequently, the expected value of a new blueprint for the acquirer (net of deal price) is $$\begin{align}
\bar{v}_{{\rm m}it} &= \varrho \int_{-\infty}^\infty\sum_{i'\in\{h,\ell\}}\sum_{i_2\in\{h,\ell,\varnothing\}}\mathbb{I}_{v_{ii_2\hat{q}t}>v_{i'i_2\hat{q}t}}(v_{ii_2\hat{q}t} - v_{i'i_2\hat{q}t})f_{i'i_2\hat{q}t}d\hat{q} \\
&= \mathbb{I}_{i=h}\varrho f_{\ell t} \int_{-\infty}^\infty\sum_{i_2\in\{h,\ell,\varnothing\}}(v_{hi_2\hat{q}t} - v_{\ell i_2\hat{q}t})\frac{f_{\ell i_2\hat{q}t}}{f_{\ell t}}d\hat{q}.

\end{align}$$ There is zero expected value to M&A if the acquirer is low-productivity, because no deal will ever go through. This implies $X_{{\rm m}\ell t}=0$. For a high-productivity firm, a deal will only take place if M&A search identifies a low-productivity blueprint (hence the multiplication by $f_{\ell t}=N_{\ell t}/N_t$); conditional on finding such a target, the acquirer gets a share $\varrho$ of the (expected) surplus, and the seller receives the remaining $1-\varrho$.
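The Nash bargaining split can be checked numerically. The sketch below, with hypothetical blueprint values and a hypothetical bargaining weight `rho_b` standing in for $\varrho$, maximizes the Nash product on a grid and compares the result with the closed-form price $\varrho v_{i'i_2\hat{q}t} + (1-\varrho)v_{ii_2\hat{q}t}$.

```python
def nash_price(v_buyer, v_seller, rho_b):
    """Closed-form deal price: rho_b * v_seller + (1 - rho_b) * v_buyer."""
    return rho_b * v_seller + (1.0 - rho_b) * v_buyer

def nash_objective(price, v_buyer, v_seller, rho_b):
    """Nash product: buyer surplus weighted by rho_b, seller surplus by 1 - rho_b."""
    return (v_buyer - price) ** rho_b * (price - v_seller) ** (1.0 - rho_b)

# Hypothetical values: the blueprint is worth 5 to the buyer and 3 to the seller.
v_buyer, v_seller, rho_b = 5.0, 3.0, 0.4

# Grid search over prices strictly between the two valuations.
grid = [v_seller + (v_buyer - v_seller) * k / 10000 for k in range(1, 10000)]
best_on_grid = max(grid, key=lambda p: nash_objective(p, v_buyer, v_seller, rho_b))

price = nash_price(v_buyer, v_seller, rho_b)
buyer_share = (v_buyer - price) / (v_buyer - v_seller)    # equals rho_b
seller_share = (price - v_seller) / (v_buyer - v_seller)  # equals 1 - rho_b
```

Under this Nash product, the buyer's share of the surplus equals the exponent on the buyer's term, which is why the acquirer's expected value carries the factor $\varrho$ and the seller's payout in the M&A exit term carries $1-\varrho$.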

The blueprint-loss intensities $\delta_{ijt}=\delta_{i_1i_2\hat{q}t}$ in ) are pinned down by the fact that the flows of blueprints out of existing markets via quality innovation (creative destruction) must equal flows in: $$
\delta_{i_1i_2\hat{q}t}N_{i_1i_2\hat{q}t} = \sum_{i\in\{h,\ell\}} \bigl(\underbrace{\Phi_{\rm q}(x_{{\rm q}it})N_{it}f_{i_1i_2\hat{q}t}}_{\textrm{incumbents}} + \underbrace{\Phi_{\rm q}(x_{{\rm Eq}it})\omega_i\bar{s}L_tf_{i_1i_2\hat{q}t}}_{\textrm{new firms}}\bigr) ,
$$ where the policies $x_{{\rm Eq}it}$ are the amount of labor each type-$a_i$ entrepreneur dedicates to quality innovation. In the next section, we will prove that these policies are identical to the incumbent firm policies $x_{{\rm q}it}$. Both terms on the right-hand side are multiplied by the densities $f_{i_1i_2\hat{q}t}=N_{i_1i_2\hat{q}t}/N_t$ because entry is random across blueprints and therefore proportional to the number of existing blueprints with those characteristics. It follows that the solution is $$
\delta_{i_1i_2\hat{q}t} = \delta_t = \sum_{i\in\{h,\ell\}}\left(f_{it} + \omega_i\frac{\bar{s}L_t}{N_t}\right)\Phi_{\rm q}(x_{{\rm q}it}),
$$ which is independent of blueprint characteristics. Only incumbents can do M&A, and only high-productivity incumbents will choose to do so, so similar logic implies the intensity of M&A exit satisfies $$
\delta_{i_1i_2\hat{q}t}^\psi N_{i_1i_2\hat{q}t} = \mathbb{I}_{i_1=\ell}\Psi(x_{{\rm m}ht})N_{ht}f_{i_1i_2\hat{q}t} ,
$$ yielding the solution $$
\delta_{i_1i_2\hat{q}t}^\psi = \delta_{i_1t}^\psi = \mathbb{I}_{i_1=\ell}\Psi(x_{{\rm m}ht})f_{ht},
$$ which depends on the producer's type (only low types will choose to sell blueprints), but is invariant to the other blueprint characteristics $i_2$ and $\hat{q}$.

Using these intensities along with the values ) through ), the HJB ) with ) can be separated into a system of $n_{it}$ HJB equations (with $i=i_1$) of the form $$\begin{multline}
\underbrace{\rho_{it}v_{ii_2\hat{q}t}}_{\textrm{discount}} = \underbrace{\pi_{ii_2\hat{q}t} - \hat{w}_{{\rm x}t}x_{it}}_{\textrm{cash flow}} + \underbrace{\Phi_{\rm n}(x_{{\rm n}it})\bar{v}_{{\rm n}it}}_{\textrm{new variety}} +\underbrace{ \Phi_{\rm q}(x_{{\rm q}it})\bar{v}_{{\rm q}it}}_{\textrm{quality innov.}} + \underbrace{\Psi(x_{{\rm m}it})\bar{v}_{{\rm m}it}}_{\textrm{acquisition}} \\
- \underbrace{\vphantom{\frac{\partial v_{ii_2\hat{q}t}}{\partial\hat{q}}} \delta_tv_{ii_2\hat{q}t}}_{\textrm{creative destr.}} + \underbrace{\vphantom{\frac{\partial v_{ii_2\hat{q}t}}{\partial\hat{q}}} \mathbb{I}_{i=\ell}\Psi(x_{{\rm m}ht})f_{ht}(1-\varrho)(v_{hi_2\hat{q}t}- v_{\ell i_2\hat{q}t})}_{\textrm{M\&A exit}} - \underbrace{\frac{\partial v_{ii_2\hat{q}t}}{\partial\hat{q}}g_{Qt}}_{\textrm{quality depr.}} + \underbrace{\vphantom{\frac{\partial v_{ii_2\hat{q}t}}{\partial\hat{q}}} \frac{\partial v_{ii_2\hat{q}t}}{\partial t}}_{\textrm{time path}}.

\end{multline}$$ Fixing time $t$, the whole system of HJBs for blueprint values therefore constitutes an exactly identified system across $\{i_1,i_2,\hat{q}\}\in\{h,\ell\}\times\{h,\ell,\varnothing\}\times\mathbb{R}$. This cross-section is still infinite-dimensional because the quality distribution spans $\mathbb{R}$. However, we can reduce it to a finite-dimensional system by using the fact that the profit coefficients $\pi_{i_1i_2\hat{q}t}$ are multiplicatively separable across market structure and quality: $$
\pi_{i_1i_2\hat{q}t} = \pi_{i_1i_2t}e^{(\eta-1)\hat{q}}.

$$ Thus, a reasonable conjecture is that the value of a blueprint takes the separable form $$
v_{i_1i_2\hat{q}t} = v_{i_1i_2t}^{\rm P}e^{(\eta-1)\hat{q}} + v_{i_1t}^{\rm G},

$$ where $v_{i_1i_2t}^{\rm P}e^{(\eta-1)\hat{q}}$ is naturally interpreted as the present value of the blueprint's future profits and $v_{i_1t}^{\rm G}$ the present value of growth opportunities. Substituting ) into ) and collecting terms in $e^{(\eta-1)\hat{q}}$ implies, for each pair $\{i_1,i_2\}$, the two sets of equations $$
(\rho_{it}+\delta_t+(\eta-1)g_{Qt})v_{i_1i_2t}^{\rm P} = \pi_{i_1i_2t} + \mathbb{I}_{i_1=\ell}\Psi(x_{{\rm m}ht})f_{ht}(1-\varrho)(v_{hi_2t}^{\rm P}- v_{\ell i_2t}^{\rm P}) + \frac{\partial v_{i_1i_2t}^{\rm P}}{\partial t}

$$ and, using the first-order conditions for R&D and M&A expenditures, $$\begin{multline}
(\rho_{it} + \delta_t)v_{i_1t}^{\rm G} = (1-\varepsilon)(\Phi_{\rm n}(x_{{\rm n}it})\bar{v}_{{\rm n}it} +\Phi_{\rm q}(x_{{\rm q}it})\bar{v}_{{\rm q}it} + \Psi(x_{{\rm m}it})\bar{v}_{{\rm m}it}) \\
+ \mathbb{I}_{i=\ell}\Psi(x_{{\rm m}ht})f_{ht}(1-\varrho)(v_{ht}^{\rm G}- v_{\ell t}^{\rm G}) + \frac{\partial v_{i_1t}^{\rm G}}{\partial t}.

\end{multline}$$ The expected blueprint values from R&D and M&A can also be separated as $$\begin{align}
\bar{v}_{{\rm n}it} &= \bar{v}_{{\rm n}it}^{\rm P} + v_{it}^{\rm G}, 
&& \bar{v}_{{\rm n}it}^{\rm P}\equiv v_{i\varnothing t}^{\rm P}\int_{-\infty}^\infty e^{(\eta-1)\hat{q}}\Gamma_{\hat{q}}d\hat{q}, \\
\bar{v}_{{\rm q}it} &= \bar{v}_{{\rm q}it}^{\rm P} + v_{it}^{\rm G}, 
&& \bar{v}_{{\rm q}it}^{\rm P}\equiv \sum_{i'\in\{h,\ell\}}v_{ii't}^{\rm P}\lambda^{\eta-1}\int_{-\infty}^\infty e^{(\eta-1)\hat{q}}f_{i'\hat{q}t}d\hat{q}, \\
\bar{v}_{{\rm m}it} &= \mathbb{I}_{i=h}\varrho f_{\ell t}(\bar{v}_{{\rm m}ht}^{\rm P}-\bar{v}_{{\rm m}\ell t}^{\rm P} + v_{ht}^{\rm G}-v_{\ell t}^{\rm G}), 
&& \bar{v}_{{\rm m}it}^{\rm P}\equiv \sum_{i'\in\{h,\ell,\varnothing\}}v_{ii't}^{\rm P}\int_{-\infty}^\infty e^{(\eta-1)\hat{q}}\frac{f_{\ell i'\hat{q}t}}{f_{\ell t}}d\hat{q}.
\end{align}$$ At any time $t$, this reduced system now comprises eight equations, determining the six product values $\{v_{i_1i_2t}^{\rm P}\}$ and the two growth values $\{v_{i_1t}^{\rm G}\}$. The only additional conditions we need in order to solve this are the terminal conditions $\lim_{t\to\infty}v_{i_1i_2t}^{\rm P}$ and $\lim_{t\to\infty}v_{i_1t}^{\rm G}$, which we will see below are the solutions to the balanced-growth equilibrium. This system verifies the conjecture.
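To illustrate how the product-value equations are solved, consider a stationary state in which the time derivatives vanish, and treat $\rho$, $\delta_t$, $g_{Qt}$, and the M&A exit intensity as given numbers. For a fixed follower type $i_2$, the two equations for $v_{hi_2}^{\rm P}$ and $v_{\ell i_2}^{\rm P}$ are then linear and can be solved in sequence. The Python sketch below does this under hypothetical parameter values; the function name and the argument `exit_rate`, which stands in for $\Psi(x_{{\rm m}ht})f_{ht}$, are assumptions made for illustration.

```python
def stationary_product_values(pi_h, pi_l, rho, delta, eta, g_q, exit_rate, rho_b):
    """Solve the stationary product-value equations for one follower type.

    exit_rate stands in for Psi(x_mh) * f_h; rho_b is the buyer's bargaining weight.
    """
    disc = rho + delta + (eta - 1.0) * g_q   # effective discount rate on profits
    m = exit_rate * (1.0 - rho_b)            # seller's surplus share times exit intensity
    v_h = pi_h / disc                        # high types never sell blueprints
    v_l = (pi_l + m * v_h) / (disc + m)      # low types earn an expected M&A premium
    return v_h, v_l, disc, m

# Hypothetical values (illustration only, not the paper's calibration).
pi_h, pi_l = 1.0, 0.6
v_h, v_l, disc, m = stationary_product_values(
    pi_h, pi_l, rho=0.04, delta=0.06, eta=3.0, g_q=0.01, exit_rate=0.05, rho_b=0.5
)

# Residual of the low-type equation and the value it would have without M&A.
residual_l = disc * v_l - (pi_l + m * (v_h - v_l))
v_l_no_mna = pi_l / disc
```

In this example the low-type value lies strictly between its no-M&A value and the high-type value, reflecting the premium that low-productivity producers expect from selling to a more efficient acquirer.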

### Aggregate economic growth 

We can combine firms' policies with the expression for aggregate output ) to compute aggregate economic growth and its components. The growth rate of aggregate output per capita equals $$
g_{yt} = \frac{1}{\eta-1}g_{Nt} + g_{Qt} + g_{\bar{a}t} + g_{\Omega t} + g_{l_pt}.
$$ It will turn out that only the first two terms, the growth rates of the number of varieties ($g_{Nt}=\dot{N}_t/N_t$) and aggregate quality ($g_{Qt}=\dot{Q}_t/Q_t$), are needed to characterize the equilibrium; thus, I defer discussion of the last three terms to Section  on growth accounting. Growth in new varieties is given by $$
g_{Nt} = \sum_{i\in\{h,\ell\}}\Phi_{\rm n}(x_{{\rm n}it})\left(f_{it}+\omega_i\bar{s}\frac{L_t}{N_t}\right).
$$ To compute growth in aggregate quality, note that the change in aggregate quality comes from two sources: new varieties, which have initial quality $\lambda Q_t$; and quality innovations, which improve the current quality of each target blueprint by a factor of $\lambda$. By the chain rule, the change in aggregate quality is $$
dQ_t = \frac{1}{\eta-1}Q_t^{2-\eta}d\left(\frac{1}{N_t}\int_0^{N_t}q_{jt}^{\eta-1}dj\right),
$$ where $$
d\left(\frac{1}{N_t}\int_0^{N_t}q_{jt}^{\eta-1}dj\right) = \underbrace{\frac{1}{N_t}Q_t^{\eta-1}\left(\lambda^{\eta-1} - 1\right)dN_t}_{\textrm{new varieties}} + \underbrace{\frac{1}{N_t}\int_0^{N_t}d\left(q_{jt}^{\eta-1}\right)dj}_{\textrm{quality innov.}}.
$$ The second term can be simplified using a law of large numbers: with intensity $\delta_tdt$, an individual blueprint's quality jumps ($d(q_{jt}^{\eta-1})=(\lambda^{\eta-1}-1)q_{jt}^{\eta-1}$), so we have $$
\frac{1}{N_t}\int_0^{N_t}d\left(q_{jt}^{\eta-1}\right)dj = \delta_tQ_t^{\eta-1}\left(\lambda^{\eta-1} - 1\right)dt.
$$ Putting all of this together, we have the growth rate $$
g_{Qt} = \frac{\lambda^{\eta-1}-1}{\eta-1}(g_{Nt} + \delta_t).
$$
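The growth formulas can be evaluated directly once the innovation intensities are known. The following Python sketch takes the intensities $\Phi_{\rm n}(x_{{\rm n}it})$ and $\Phi_{\rm q}(x_{{\rm q}it})$ as given numbers and computes $g_{Nt}$, $\delta_t$, and $g_{Qt}$. All values are hypothetical, and the dictionary `nu` is shorthand introduced here for $\omega_i\bar{s}L_t/N_t$.

```python
def growth_rates(phi_n, phi_q, f, nu, lam, eta):
    """Return (g_N, delta, g_Q) given innovation intensities by type.

    phi_n, phi_q: intensities Phi_n(x_n) and Phi_q(x_q) for types 'h' and 'l'
    f:            blueprint shares f_i held by incumbents of each type
    nu:           entrepreneurs of each type per blueprint, omega_i * s_bar * L / N
    """
    types = ("h", "l")
    g_n = sum(phi_n[i] * (f[i] + nu[i]) for i in types)     # variety growth
    delta = sum(phi_q[i] * (f[i] + nu[i]) for i in types)   # creative destruction
    g_q = (lam ** (eta - 1.0) - 1.0) / (eta - 1.0) * (g_n + delta)
    return g_n, delta, g_q

# Hypothetical values (illustration only, not the paper's calibration).
phi_n = {"h": 0.02, "l": 0.01}
phi_q = {"h": 0.05, "l": 0.03}
f = {"h": 0.4, "l": 0.6}
nu = {"h": 0.1, "l": 0.1}
lam, eta = 1.1, 3.0

g_n, delta, g_q = growth_rates(phi_n, phi_q, f, nu, lam, eta)

# Contribution of the first two terms to per-capita output growth.
g_y_core = g_n / (eta - 1.0) + g_q
```

Both incumbents and entrepreneurs enter each sum, so a fall in innovation intensities lowers variety growth and creative destruction together, and aggregate quality growth scales with their sum.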

## Dynamic household problem 

### Definition of a family 

I formalize the assumption that individuals within a household make decisions as a "family" as follows. A family is a random collection of individuals on a continuum of arbitrary mass $\varpi_t\in(0,L_t]$. Each newborn individual enters an existing family, and every individual is permanently a member of one family. Thus, family size grows with the population, $\dot{\varpi}_t/\varpi_t=g_L$, and the density of families $L_t/\varpi_t$ has a constant mass.

Let us index the individuals within a family by $k$. At any time $t$, we can partition the family into those individuals who are currently alive ($k\in[0,\varpi_t]$) and those that will be born in a future period ($k\in(\varpi_t,\infty)$). The flow utility of the family is the average utility per living member at time $t$: $$
u_t \equiv \frac{1}{\varpi_t}\int_0^{\varpi_t}u(c_{kt},l_{kt})dk,

$$ where the individual utility functions are given by ). The family is a dynasty, valuing the utility of future generations and discounting it at the same rate $\rho$ with which they discount their own. Hence, the family chooses individual consumption and labor policies $\{c_{kt},l_{pkt},l_{xkt},x_{{\rm En}kt},x_{{\rm Eq}kt}\}$ and portfolio policies $\{\alpha_t\}$ (defined below) for all times $t$ and all individuals $k$ (born and unborn) to maximize the total expected family utility $$
U_t \equiv \max_{\{c_{k,t+\tau},l_{k,t+\tau},x_{k,t+\tau}\}}\left\{\int_0^\infty e^{-\rho \tau}u_{t+\tau}d\tau\right\},

$$ where, again, family utility is defined by ).

A consequence of this setting is that, for any $\varpi_t\in(0,L_t]$, each family will represent a fully diversified mass of individuals in which the proportions of each labor and entrepreneurial type equal the overall population proportions ($\bar{s}$ and $\omega$). All families will therefore be symmetric and the equilibrium decisions at the individual level will be the same for all $\varpi_t$. Without loss of generality, then, we can set $\varpi_t=L_t$ and solve the problem of a representative family composed of all individuals in the economy.

### Optimal household policies 

Even in the presence of yet-nonexistent cohorts in the household's maximization, the derivation of the household's HJB equation is standard. Under $\varpi_t=L_t$, the utility maximand ) can be written recursively as $$
U_t = \int_t^Te^{-\rho(s-t)}u_sds + e^{-\rho(T-t)}U_T,
$$ for any $T\geq t$ (I suppress the maximization for convenience). Now, multiplying this equation by $e^{-\rho t}$ and adding $\int_{t_0}^te^{-\rho s}u_sds$ with $t_0<t$ to both sides, we get $$
\int_{t_0}^te^{-\rho s}u_sds + e^{-\rho t}U_t = \int_{t_0}^Te^{-\rho s}u_sds + e^{-\rho T}U_T.
$$ This equation means the expression on the left-hand side is a martingale; therefore, it is also a local martingale, satisfying the differential equation $$
\frac{d}{dt}\left(\int_{t_0}^te^{-\rho s}u_sds + e^{-\rho t}U_t\right) = 0.
$$ Evaluating this expression and bringing back the maximization implies the HJB equation $$
\rho U_t = \sup_{\alpha_t,\{c_{kt},l_{kt},x_{kt}\}}\left\{u_t + \dot{U}_t\right\}.

$$ To solve this, we must express $U_t$ as a function of the state variables of the household.

The first state variable of the household is its total financial wealth $W_t$. The household accumulates wealth through three endogenous sources of income. The first is labor income,[^80] $$
w_{{\rm p}t}L_{{\rm p}t} + w_{{\rm x}t}L_{{\rm x}t} = \int_0^{L_t}(w_{{\rm p}t}l_{pkt} + w_{{\rm x}t}(l_{xkt}-x_{{\rm En}kt}-x_{{\rm Eq}kt}))dk.
$$ The second source is new-firm creation from entrepreneurs. Let $dJ_{{\rm n}kt}\in\{0,1\}$ denote a Poisson jump that equals one if individual $k$ succeeds in starting a new firm via new-variety creation. The assumption that entrepreneurs have the same R&D technologies as single-product firms means $$
\mathbb{E}_tdJ_{{\rm n}kt} = \Phi_{\rm n}(x_{{\rm En}kt})dt = (1-\alpha)\varphi x_{{\rm En}kt}^\varepsilon dt.
$$ Define $dJ_{{\rm q}kt}$ analogously for a new firm formed via quality innovation. Let $v_{{\rm En}kt}$ and $v_{{\rm Eq}kt}$ denote the market values of firms newly created through each technology. Then the total flow of new-firm value equals $$
V_{{\rm E}t} \equiv \frac{1}{dt}\int_0^{L_t}(v_{{\rm En}kt}dJ_{{\rm n}kt} + v_{{\rm Eq}kt}dJ_{{\rm q}kt})dk.
$$ The fact that there is a continuum of entrepreneurs means two things. First, policies will be symmetric within each technology and type (high- versus low-productivity $a_i$), so there are only four unique policies $\{x_{{\rm En}ht},x_{{\rm Eq}ht},x_{{\rm En}\ell t},x_{{\rm Eq}\ell t}\}$. Second, we can apply a law of large numbers to express this value as a deterministic function of policies: $$
V_{{\rm E}t} = \bar{s}L_t\sum_{i\in\{h,\ell\}}\omega_i(\Phi_{\rm n}(x_{{\rm En}it})\bar{v}_{{\rm En}it} + \Phi_{\rm q}(x_{{\rm Eq}it})\bar{v}_{{\rm Eq}it}),
$$ where $\bar{v}_{{\rm En}it}$ and $\bar{v}_{{\rm Eq}it}$ are the expected values of a new firm. By the results in the previous section, we will have $\bar{v}_{{\rm En}it} = \bar{v}_{{\rm n}it}y_t$ and $\bar{v}_{{\rm Eq}it}=\bar{v}_{{\rm q}it}y_t$. Finally, the household receives capital income from trading in the stocks of incumbent firms and in the riskfree bond with return $r_{ft}$. Let $\alpha_t$ denote the share of wealth invested in the stock market[^81] and $$
r_{Vt} \equiv \frac{D_t}{V_t} + \frac{\dot{V}_t-V_{{\rm E}t}}{V_t}
$$ the return on all *incumbent* firms in the economy.[^82] Combining these three sources with consumption expenditures $C_t \equiv \int_0^{L_t}c_{kt}dk$, wealth grows according to $$
\dot{W}_t = W_t(r_{ft} + \alpha_t(r_{Vt}-r_{ft})) + V_{{\rm E}t} + w_{{\rm p}t}L_{{\rm p}t} + w_{{\rm x}t}L_{{\rm x}t} - C_t.
$$ As a sanity check, note that imposing market clearing for wealth $W_t = V_t$ (and $\alpha_t=1$) and consumption $C_t=Y_t$ implies $$
\dot{W}_t = (D_t + \dot{V}_t - V_{{\rm E}t}) + V_{{\rm E}t} + w_{{\rm p}t}L_{{\rm p}t} + w_{{\rm x}t}L_{{\rm x}t} - C_t = \dot{V}_t + Y_t - C_t = \dot{V}_t,
$$ consistent with the fact that all tradable wealth in the economy is held in claims to (existing) firms' cash flows (i.e., the stock market).

The second state variable of the household captures expected growth in off-balance-sheet wealth, such as future labor income. We will prove that aggregate output $Y_t$ and time $t$ are sufficient information to summarize this state. Consider, as an example, the present value of aggregate income from production labor: $$
H_{{\rm p}t} \equiv \int_0^\infty e^{-\int_0^\tau r_{f,t+\tau'}d\tau'}w_{p,t+\tau}L_{p,t+\tau}d\tau.
$$ We showed in Section 11.1.3 that total income $w_{{\rm p}t}L_{{\rm p}t}=\Lambda_tY_t$ for labor share $\Lambda_t$ a deterministic function of time. Thus, the present value of production-labor income can be stated as $$
H_{{\rm p}t} = Y_t \int_0^\infty e^{-\int_0^\tau (r_{f,t+\tau'}-g_{Y,t+\tau'})d\tau'}\Lambda_{t+\tau}d\tau = Y_th_p(t),
$$ the product of aggregate output and a deterministic function of time. Similar arguments can be made for human wealth from skilled labor and new-firm creation. Unlike wealth $W_t$, which has an endogenous accumulation equation, the household takes aggregate output and its growth rate $\dot{Y}_t/Y_t=g_{Yt}$ as given.

To solve the HJB equation ), I first conjecture that $U_t$ is a function of wealth, aggregate output, and time, implying that $$
\rho U_t = \sup_{\alpha_t,\{c_{kt},l_{kt},x_{kt}\}}\left\{u_t + \frac{\partial U_t}{\partial W_t}\dot{W}_t + \frac{\partial U_t}{\partial Y_t}\dot{Y}_t + \frac{\partial U_t}{\partial t}\right\}.

$$ Consistent with the absence of systematic risks, the first-order condition for the allocation $\alpha_t$ implies the return restriction $$
r_{Vt} = r_{ft},
$$ as otherwise there would be an arbitrage opportunity. The first-order conditions for consumption are $$
\frac{c_{kt}^{-1}}{L_t} = \frac{\partial U_t}{\partial W_t},

$$ which implies that all individuals consume the same amount $c_{kt}=c_t=C_t/L_t$. The first-order conditions for production laborers are $$
\frac{\chi l_{pkt}^{1/\zeta}}{L_t} = \frac{\partial U_t}{\partial W_t}w_{{\rm p}t},

$$ which, similarly, implies that $l_{pkt}=l_{{\rm p}t}$. Likewise, for skilled laborers, we have $$
\frac{\chi l_{xkt}^{1/\zeta}}{L_t} = \frac{\partial U_t}{\partial W_t}w_{{\rm x}t},

$$ so $l_{xkt}=l_{{\rm x}t}$. Finally, the first-order conditions for type-$a_i$ entrepreneurial effort in each technology are $$
w_{{\rm x}t} = \varepsilon(1-\alpha)\varphi x_{{\rm En}it}^{\varepsilon-1}\bar{v}_{{\rm n}it}y_t \quad\textrm{and}\quad w_{{\rm x}t} = \varepsilon\alpha\varphi x_{{\rm Eq}it}^{\varepsilon-1}\bar{v}_{{\rm q}it}y_t.

$$ Entrepreneurs weigh the expected market value of their investment against the foregone skilled wage they would have received if they had instead worked for an incumbent. Dividing these by $y_t$ implies that the entrepreneurs have exactly the same first-order conditions as incumbents, and therefore the exact same R&D policies conditional on type.

Next, conjecture that the solution to the value function is $$
U_t = k(t) + \rho^{-1}\log\left(\frac{W_t+Y_th(t)}{L_t}\right)

$$ for some functions $k(t)$ and $h(t)$ of time, for which we will solve. In line with the intuition above, the function $h(t)$ will be such that $Y_th(t)$ represents the present value of non-financial wealth. Foreseeing this, let us define total wealth as $$
\overline{W}_t \equiv W_t+Y_th(t).
$$ It will prove to be the case that $\overline{W}_t/L_t$ equals the present value of per-capita consumption. Now substitute the conjecture ) into the first-order conditions ) through ). We then have per capita consumption $$
c_t = \rho \frac{\overline{W}_t}{L_t},

$$ the standard result for a unit EIS. Combining this with the market-clearing condition $c_t=Y_t/L_t$, the labor supply first-order conditions ) and ) imply $$
l_{{\rm p}t} = \left(\chi^{-1}\frac{w_{{\rm p}t}}{Y_t/L_t}\right)^\zeta \quad\textrm{and}\quad l_{{\rm x}t} = \left(\chi^{-1}\frac{w_{{\rm x}t}}{Y_t/L_t}\right)^\zeta.
$$ These labor supply curves can be combined with the labor demand curves from firms to solve for labor hours and wages. For production labor, we have the aggregate demand curve ) with $L_{{\rm p}t}=(1-\bar{s})l_{{\rm p}t}L_t$, implying the solution $$
l_{{\rm p}t} = \left(\frac{\Lambda_t}{\chi(1-\bar{s})l_{{\rm p}t}}\right)^\zeta = \left(\frac{\Lambda_t}{\chi(1-\bar{s})}\right)^{\frac{\zeta}{1+\zeta}}.

$$ For skilled labor, the policies ) through ) imply total demand from incumbents $$
\int_0^{M_t}X_{it}di = \sum_{i\in\{h,\ell\}}f_{it}(x_{{\rm n}it}+x_{{\rm q}it}+x_{{\rm m}it})N_t = \hat{w}_{{\rm x}t}^{-1/(1-\varepsilon)}\kappa_t^{\rm inc}N_t,
$$ where, from ) to ), the object $$
\kappa_t^{\rm inc} \equiv \sum_{i\in\{h,\ell\}}f_{it}\hat{w}_{{\rm x}t}^{1/(1-\varepsilon)}(x_{{\rm n}it}+x_{{\rm q}it}+x_{{\rm m}it})
$$ is independent of the wage $\hat{w}_{{\rm x}t}$. Recall that labor hours $l_{{\rm x}t}$ include hours supplied to both incumbents and private entrepreneurial efforts. Thus, the total skilled labor supply that must meet this demand is total hours $L_{{\rm x}t}=\bar{s}l_{{\rm x}t}L_t$ minus entrepreneurial hours $$
\int_0^{L_t}(x_{{\rm En}kt}+x_{{\rm Eq}kt})dk = \sum_{i\in\{h,\ell\}}\omega_i(x_{{\rm n}it}+x_{{\rm q}it})\bar{s}L_t = \hat{w}_{{\rm x}t}^{-1/(1-\varepsilon)}\kappa_t^{\rm ent}\bar{s}L_t,
$$ where, analogous to $\kappa_t^{\rm inc}$, $$
\kappa_t^{\rm ent} \equiv \sum_{i\in\{h,\ell\}}\omega_i\hat{w}_{{\rm x}t}^{1/(1-\varepsilon)}(x_{{\rm n}it}+x_{{\rm q}it}).
$$ Now equate aggregate labor supply with demand, $$
(\chi^{-1}\hat{w}_{{\rm x}t})^\zeta\bar{s}L_t - \hat{w}_{{\rm x}t}^{-1/(1-\varepsilon)}\kappa_t^{\rm ent}\bar{s}L_t = \hat{w}_{{\rm x}t}^{-1/(1-\varepsilon)}\kappa_t^{\rm inc}N_t,
$$ and solve for the wage: $$
\hat{w}_{{\rm x}t} = \left(\chi^\zeta\left(\kappa_t^{\rm ent} + \kappa_t^{\rm inc}\frac{N_t}{\bar{s}L_t}\right)\right)^{\frac{1-\varepsilon}{1+\zeta(1-\varepsilon)}}.

$$ Substituting this back into the supply curve gives $$
l_{{\rm x}t} = (\chi^{-1}\hat{w}_{{\rm x}t})^\zeta.

$$
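The closed-form skilled wage can be verified against the market-clearing condition. In the Python sketch below, the demand shifters $\kappa_t^{\rm inc}$ and $\kappa_t^{\rm ent}$ are taken as given numbers, and the argument `n_per_skilled` is shorthand introduced here for $N_t/(\bar{s}L_t)$. All values are hypothetical and serve only to illustrate the algebra.

```python
def skilled_wage(kappa_inc, kappa_ent, n_per_skilled, chi, zeta, eps):
    """Closed-form rescaled skilled wage that clears the skilled labor market."""
    base = chi ** zeta * (kappa_ent + kappa_inc * n_per_skilled)
    return base ** ((1.0 - eps) / (1.0 + zeta * (1.0 - eps)))

def excess_supply(wage, kappa_inc, kappa_ent, n_per_skilled, chi, zeta, eps):
    """Hours supplied minus hours demanded, both per skilled individual."""
    supply = (wage / chi) ** zeta
    demand = wage ** (-1.0 / (1.0 - eps)) * (kappa_ent + kappa_inc * n_per_skilled)
    return supply - demand

# Hypothetical values (illustration only, not the paper's calibration).
params = dict(kappa_inc=0.5, kappa_ent=0.2, n_per_skilled=2.0,
              chi=1.5, zeta=0.5, eps=0.5)

w_hat = skilled_wage(**params)
gap_at_solution = excess_supply(w_hat, **params)
gap_above = excess_supply(1.1 * w_hat, **params)   # wage too high: excess supply
gap_below = excess_supply(0.9 * w_hat, **params)   # wage too low: excess demand

# Hours per skilled individual implied by the supply curve.
l_x = (w_hat / params["chi"]) ** params["zeta"]
```

Supply is increasing and demand is decreasing in the wage, so the excess-supply function crosses zero exactly once, at the closed-form solution.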

To verify the conjecture, substitute the optimal policies above and the conjecture ) into the HJB ). First, note that the utility flow becomes $$
u_t = \log\rho + \log\frac{\overline{W}_t}{L_t} - (1-\bar{s})\chi\frac{l_{{\rm p}t}^{1+1/\zeta}}{1+1/\zeta} - \bar{s}\chi\frac{l_{{\rm x}t}^{1+1/\zeta}}{1+1/\zeta},
$$ where the labor supply terms are functions of time only. Second, note that $W_t=V_t$. Thus, the HJB becomes $$\begin{multline}
\rho k(t) = \log\rho - (1-\bar{s})\chi\frac{l_{{\rm p}t}^{1+1/\zeta}}{1+1/\zeta} - \bar{s}\chi\frac{l_{{\rm x}t}^{1+1/\zeta}}{1+1/\zeta} \\
+ \rho^{-1}\frac{V_t}{\overline{W}_t}\frac{\dot{V}_t}{V_t} + \rho^{-1}\frac{Y_th(t)}{\overline{W}_t}\frac{\dot{Y}_t}{Y_t} + k'(t) + \rho^{-1}\frac{Y_t}{\overline{W}_t}h'(t)

\end{multline}$$ Because the ratio $V_t/Y_t$ is a function of time only (proven formally in the next section), so too are the ratios $$
\frac{V_t}{\overline{W}_t} = \frac{V_t/Y_t}{V_t/Y_t+h(t)} \quad\textrm{and}\quad \frac{Y_th(t)}{\overline{W}_t} = \frac{h(t)}{V_t/Y_t+h(t)}.
$$ This confirms the form of the conjecture ). However, the ordinary differential equation ) only characterizes $k(t)$ in terms of the function $h(t)$. The next section computes $h(t)$ analytically. In all parts of the paper, including the transition path, the specific functional form of $k(t)$ turns out to be irrelevant, because policies and resource allocations do not explicitly depend on $k(t)$. All that matters is that it exists, which ) confirms.

### State-price density and asset prices 

The state-price density of the household equals $$
\xi_t = e^{-\rho t}\frac{\partial u_t}{\partial c_t} = e^{-\rho t}c_t^{-1},
$$ where, again, per-capita consumption $c_t$ equals per-capita output $y_t$. This implies the law of motion $$
\frac{\dot{\xi}_t}{\xi_t} = -\rho - g_{yt}.
$$ By the absence of arbitrage, the riskfree rate is equal to $$
r_{ft} = -\frac{\dot{\xi}_t}{\xi_t} = \rho + g_{yt},
$$ and all risk premia are equal to zero. Hence, the growth-adjusted discount rate applied to each firm, described in Appendix 11.2.1, is $$
\rho_{it} = r_{ft} - g_{yt} = \rho.
$$

We can use this discount rate to compute $h(t)$ in ). Note that the present value of per-capita consumption equals $$
\int_0^\infty e^{-\int_0^\tau r_{f,t+\tau'}d\tau'}c_{t+\tau}d\tau = c_t\int_0^\infty e^{-\rho \tau}d\tau = \frac{c_t}{\rho} = \frac{\overline{W}_t}{L_t},
$$ where the last equality follows from the household's optimal consumption policy. Thus, per-capita total wealth $\overline{W}_t/L_t$ is the present value of consumption. Using the definition of total wealth, $\overline{W}_t=V_t+Y_th(t)$, together with $c_t=Y_t/L_t$, and solving for $h(t)$, we get $$
h(t) = \frac{1}{\rho} - \frac{V_t}{Y_t},
$$ where $$
\frac{V_t}{Y_t} = \frac{1}{L_t}\int_0^{M_t}\hat{V}_{it}di = \frac{N_t}{L_t}\int_{-\infty}^\infty\sum_{i_1\in\{h,\ell\}}\sum_{i_2\in\{h,\ell,\varnothing\}}v_{i_1i_2\hat{q}t}f_{i_1i_2\hat{q}t}d\hat{q},
$$ which is indeed a function of time only. Substituting these expressions for $h(t)$ and $V_t/Y_t$ back into the HJB above characterizes the solution for $k(t)$, with terminal condition $\lim_{t\to\infty}k(t)$ equal to the constant solution in the balanced-growth equilibrium.
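
As a numerical sanity check on the present-value identity, the following Python sketch discretizes the integral for an arbitrary growth path. The function name `pv_consumption`, the growth path, and all parameter values are illustrative assumptions and do not come from the paper's calibration.

```python
import math

def pv_consumption(c0, rho, growth, horizon=400.0, dt=0.01):
    """Riemann-sum approximation of the present value of consumption
    when r_f = rho + g_y and consumption grows at the rate g_y."""
    pv, c, discount, t = 0.0, c0, 1.0, 0.0
    while t < horizon:
        pv += discount * c * dt
        g = growth(t)                          # per-capita growth rate at time t
        c *= math.exp(g * dt)                  # consumption grows at g_y
        discount *= math.exp(-(rho + g) * dt)  # discounting at r_f = rho + g_y
        t += dt
    return pv

rho = 0.03
# Hypothetical, slowly declining growth path (an assumption for illustration).
pv = pv_consumption(1.0, rho, lambda t: 0.02 + 0.01 * math.exp(-0.1 * t))
h_example = 1.0 / rho - 12.0   # h(t) for a hypothetical V/Y ratio of 12
```

Because growth enters the discount rate and the consumption path with opposite signs, the result should be close to $c_t/\rho$ regardless of the growth path chosen.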

## Distribution 

This section determines the law of motion for the distribution $f_{\mathcal{Z}t}$ by deriving the Kolmogorov Forward Equations (KFEs) in terms of the R&D and M&A policies determined in the previous section.

### Blueprint distribution 

The final step in characterizing the equilibrium is to calculate the evolution of the joint distribution $f_{i_1i_2\hat{q}t}$ of leader-follower pairs $\{i_1,i_2\}$ and relative qualities $\hat{q}\equiv\log(q/Q)$. To characterize how the distribution evolves, we derive and solve the Kolmogorov Forward Equations (KFEs). Henceforth, we will use the shorthand notation $\{\Phi_{{\rm n}it},\Phi_{{\rm q}it},\Psi_{it}\}$ to denote the policy-dependent intensities $\{\Phi_{\rm n}(x_{{\rm n}it}),\Phi_{\rm q}(x_{{\rm q}it}),\Psi(x_{{\rm m}it})\}$ and will use the notation $$
\nu_{it} \equiv \omega_i\frac{\bar{s}L_t}{N_t}
$$ to denote the ratio of type-$a_i$ entrepreneurs to the blueprint stock $N_t$.

Note that relative quality $\hat{q}$ depreciates continuously over time. Because $\hat{q}$ is a continuous variable, it is most intuitive to think about how this affects the distribution using a discretized time grid, so relative quality declines in increments $\hat{q}_t-\hat{q}_{t-\Delta t}=-g_{Qt}\Delta t$, and then take the limit as the increments $\Delta t$ go to zero. Note that the change in blueprint types from solely this depreciation is equal to $$
\Delta N_{i_1i_2\hat{q}t} = N_{i_1i_2,\hat{q}+g_{Qt}\Delta t,t-\Delta t} - N_{i_1i_2\hat{q},t-\Delta t}.
$$ Dividing by $N_{t-\Delta t}$ implies $$
f_{i_1i_2\hat{q}t}(1+g_{Nt}\Delta t) - f_{i_1i_2\hat{q},t-\Delta t} = f_{i_1i_2,\hat{q}+g_{Qt}\Delta t,t-\Delta t} - f_{i_1i_2\hat{q},t-\Delta t}.
$$ Dividing by $\Delta t$ and taking the limit as the time interval becomes small implies $$
\dot{f}_{i_1i_2\hat{q}t} + g_{Nt}f_{i_1i_2\hat{q}t} = \lim_{\Delta t\to0}\frac{f_{i_1i_2,\hat{q}+g_{Qt}\Delta t,t-\Delta t} - f_{i_1i_2\hat{q},t-\Delta t}}{g_{Qt}\Delta t}\frac{g_{Qt}\Delta t}{\Delta t} = \frac{\partial f_{i_1i_2\hat{q}t}}{\partial \hat{q}}g_{Qt}.
$$ This is the standard KFE for a process with only negative drift.
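
The drift term can be illustrated with a minimal Python sketch of an explicit upwind finite-difference step. The upwind scheme is a standard numerical choice assumed here for illustration; it is not claimed to be the discretization used in the paper, and the grid and parameter values are hypothetical.

```python
import math

def drift_step(f, g_q, g_n, dq, dt):
    """One explicit upwind step of  df/dt = g_q * df/dq - g_n * f.
    Relative quality drifts down, so the derivative is taken from the right."""
    assert g_q * dt <= dq, "CFL condition for stability"
    new = []
    for j in range(len(f)):
        right = f[j + 1] if j + 1 < len(f) else 0.0  # no mass beyond the grid
        new.append(f[j] + dt * (g_q * (right - f[j]) / dq - g_n * f[j]))
    return new

dq, dt, g_q = 0.05, 0.01, 0.02
grid = [-2.0 + dq * j for j in range(81)]
f = [math.exp(-0.5 * (q / 0.2) ** 2) for q in grid]
total = sum(f)
f = [x / total for x in f]                     # normalize to unit mass
mean0 = sum(q * x for q, x in zip(grid, f))
for _ in range(1000):                          # simulate 10 units of time
    f = drift_step(f, g_q, 0.0, dq, dt)
mean1 = sum(q * x for q, x in zip(grid, f))
```

With $g_{Nt}$ set to zero, mass is conserved and the mean of $\hat{q}$ falls by $g_{Qt}$ per unit of time, which is the pure depreciation described above.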

It is straightforward to combine this drift with the jumps across characteristics $\{i_1,i_2\}$. First, consider monopoly products $i_2=\varnothing$ with leaders $i_1=i$. The change in the number of blueprints equals $$\begin{multline}
\dot{N}_{i\varnothing\hat{q}t} = \frac{\partial f_{i\varnothing\hat{q}t}}{\partial \hat{q}}g_{Qt}N_t + \Phi_{{\rm n}it}(N_{it} + \omega_i\bar{s}L_t)\Gamma_{\hat{q}} + \mathbb{I}_{i=h}\Psi_{ht}N_{ht}f_{\ell\varnothing\hat{q}t} \\
- \delta_tN_{i\varnothing\hat{q}t} - \mathbb{I}_{i=\ell}\Psi_{ht}f_{ht}N_{i\varnothing\hat{q}t}.
\end{multline}$$ Dividing by $N_t$ and using the fact that $\dot{f}_{\mathcal{Z}t} = \dot{N}_{\mathcal{Z}t}/N_t-(\dot{N}_t/N_t)f_{\mathcal{Z}t}$ implies the KFE $$
\dot{f}_{i\varnothing\hat{q}t} + g_{Nt}f_{i\varnothing\hat{q}t} = \frac{\partial f_{i\varnothing\hat{q}t}}{\partial \hat{q}}g_{Qt} + \Phi_{{\rm n}it}(f_{it} + \nu_{it})\Gamma_{\hat{q}} + (\mathbb{I}_{i=h}-\mathbb{I}_{i=\ell})\Psi_{ht}f_{ht}f_{\ell\varnothing\hat{q}t} - \delta_tf_{i\varnothing\hat{q}t}.
$$ Similarly, for a same-type leader-follower pair $i_1=i_2=i$, and letting $-i\equiv\{h,\ell\}\setminus\{i\}$, we have blueprint flows $$\begin{multline}
\dot{N}_{ii\hat{q}t} = \frac{\partial f_{ii\hat{q}t}}{\partial \hat{q}}g_{Qt}N_t + \Phi_{{\rm q}it}(N_{it} + \omega_i\bar{s}L_t)(f_{ii,\hat{q}-\log\lambda,t}+f_{i,-i,\hat{q}-\log\lambda,t}+f_{i\varnothing,\hat{q}-\log\lambda,t}) \\
+ \mathbb{I}_{i=h}\Psi_{ht}N_{ht}f_{\ell h\hat{q}t} - \delta_tN_{ii\hat{q}t} - \mathbb{I}_{i=\ell}\Psi_{ht}f_{ht}N_{\ell\ell\hat{q}t},
\end{multline}$$ implying the KFEs $$\begin{multline}
\dot{f}_{ii\hat{q}t} + g_{Nt}f_{ii\hat{q}t} = \frac{\partial f_{ii\hat{q}t}}{\partial \hat{q}}g_{Qt} + \Phi_{{\rm q}it}(f_{it} + \nu_{it})(f_{ii,\hat{q}-\log\lambda,t}+f_{i,-i,\hat{q}-\log\lambda,t}+f_{i\varnothing,\hat{q}-\log\lambda,t}) \\
+ \mathbb{I}_{i=h}\Psi_{ht}f_{ht}f_{\ell h\hat{q}t} - \delta_tf_{ii\hat{q}t} - \mathbb{I}_{i=\ell}\Psi_{ht}f_{ht}f_{\ell\ell\hat{q}t}.
\end{multline}$$ Finally, for an opposite-type pair $i_1=i$ and $i_2=-i$, we have $$\begin{multline}
\dot{N}_{i,-i,\hat{q}t} = \frac{\partial f_{i,-i,\hat{q}t}}{\partial \hat{q}}g_{Qt}N_t + \Phi_{{\rm q}it}(N_{it} + \omega_i\bar{s}L_t)(f_{-ii,\hat{q}-\log\lambda,t}+f_{-i,-i,\hat{q}-\log\lambda,t}+f_{-i\varnothing,\hat{q}-\log\lambda,t}) \\
+ \mathbb{I}_{i=h}\Psi_{ht}N_{ht}f_{\ell\ell\hat{q}t} - \delta_tN_{i,-i,\hat{q}t} - \mathbb{I}_{i=\ell}\Psi_{ht}f_{ht}N_{\ell h\hat{q}t},
\end{multline}$$ and thus $$\begin{multline}
\dot{f}_{i,-i,\hat{q}t} + g_{Nt}f_{i,-i,\hat{q}t} = \frac{\partial f_{i,-i,\hat{q}t}}{\partial \hat{q}}g_{Qt} + \Phi_{{\rm q}it}(f_{it} + \nu_{it})(f_{-ii,\hat{q}-\log\lambda,t}+f_{-i,-i,\hat{q}-\log\lambda,t}+f_{-i\varnothing,\hat{q}-\log\lambda,t}) \\
+ \mathbb{I}_{i=h}\Psi_{ht}f_{ht}f_{\ell\ell\hat{q}t} - \delta_tf_{i,-i,\hat{q}t} - \mathbb{I}_{i=\ell}\Psi_{ht}f_{ht}f_{\ell h\hat{q}t}.
\end{multline}$$ Given R&D and M&A policies, the three sets of KFEs above (for monopoly, same-type, and opposite-type products) fully characterize the time path of the blueprint distribution.

We can aggregate these KFEs to characterize the evolution of the total blueprint share across firm types $f_{it}$. Specifically, integrating the KFEs for $i_1=h$ across qualities and summing over follower types implies[^83] $$
\dot{f}_{ht} + g_{Nt}f_{ht} = \Phi_{{\rm n}ht}(f_{ht} + \nu_{ht}) + \Phi_{{\rm q}ht}(f_{ht} + \nu_{ht}) + \Psi_{ht}f_{ht}f_{\ell t} - \delta_tf_{ht}.
$$ Substituting in $g_{Nt}$ and $\delta_t$ implies $$
\dot{f}_{ht} = (\Phi_{{\rm n}ht} + \Phi_{{\rm q}ht})(f_{ht} + \nu_{ht})f_{\ell t} - (\Phi_{{\rm n}\ell t} + \Phi_{{\rm q}\ell t})(f_{\ell t} + \nu_{\ell t})f_{ht} + \Psi_{ht}f_{ht}f_{\ell t}.
$$ This expression gives some insight into why equilibria are stable and eventually converge to a balanced-growth path in which both types maintain positive market share. If low-productivity firms have all the market share ($f_{\ell t}=1$), then $$
\dot{f}_{ht} = (\Phi_{{\rm n}ht} + \Phi_{{\rm q}ht})\nu_{ht} > 0.
$$ The flow of new high-productivity entrants raises their market share. Conversely, if high-productivity firms produce all goods ($f_{ht}=1$), then $$
\dot{f}_{ht} = -(\Phi_{{\rm n}\ell t} + \Phi_{{\rm q}\ell t})\nu_{\ell t} < 0.
$$ In both of these extreme cases, there are no flows from M&A because, in the first case ($f_{\ell t}=1$), there are no buyers looking for targets; and in the second ($f_{ht}=1$), there are no targets to identify.
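
The stability argument can be checked numerically with a short Python sketch. It holds the intensities fixed at hypothetical constants (in the model they respond to the state), so it only illustrates the sign pattern at the two boundaries and the existence of an interior rest point; `phi_h` and `phi_l` stand for the sums $\Phi_{{\rm n}i}+\Phi_{{\rm q}i}$.

```python
def f_h_dot(f_h, phi_h, phi_l, nu_h, nu_l, psi_h):
    """Time derivative of the high-type blueprint share f_h,
    with phi_i = Phi_n,i + Phi_q,i held fixed (an assumption)."""
    f_l = 1.0 - f_h
    return (phi_h * (f_h + nu_h) * f_l
            - phi_l * (f_l + nu_l) * f_h
            + psi_h * f_h * f_l)

def interior_rest_point(params, tol=1e-12):
    """Bisection on [0, 1]: the derivative is positive at 0 and negative at 1."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f_h_dot(mid, *params) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical intensities, chosen only for illustration.
params = (0.08, 0.05, 0.1, 0.3, 0.04)   # phi_h, phi_l, nu_h, nu_l, psi_h
f_star = interior_rest_point(params)
```

At both boundaries the M&A term vanishes, so the sign of the derivative is driven entirely by entry, as the text explains.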

## Solution to the balanced-growth equilibrium 

In a balanced-growth equilibrium, we have a stationary distribution $\{f_{i_1i_2\hat{q}}\}$ and a constant ratio $L_t/N_t$. It is straightforward to see that, if we substitute these constants into the above equilibrium conditions, then all time dependence disappears. We are then left with the equilibrium conditions laid out in the main text.

## Proof of Proposition 1 

*Proof of Proposition 1.* For monopoly products ($i_2=\varnothing$), high- and low-productivity leaders have the same markup: $\mu_{h\varnothing}=\mu_{\ell\varnothing}=\eta/(\eta-1)$. For $i_2\in\{h,\ell\}$, $$
\frac{\mu_{hi_2}}{\mu_{\ell i_2}} = \frac{\min\{\lambda a_h/a_{i_2},\eta/(\eta-1)\}}{\min\{\lambda a_\ell/a_{i_2},\eta/(\eta-1)\}} \geq 1.
$$ Specifically, $\mu_{hi_2}>\mu_{\ell i_2}$ if $\lambda a_\ell/a_{i_2}<\eta/(\eta-1)$, and $\mu_{hi_2}=\mu_{\ell i_2}$ otherwise. This verifies the claim that $\mu_{hi_2}\geq\mu_{\ell i_2}$. Now consider revenues: $$
\frac{p_{hi_2\hat{q}}y_{hi_2\hat{q}}}{p_{\ell i_2\hat{q}}y_{\ell i_2\hat{q}}} = \left(\frac{a_h/\mu_{hi_2}}{a_\ell/\mu_{\ell i_2}}\right)^{\eta-1}.
$$ If $\mu_{hi_2}=\mu_{\ell i_2}$ (which we just showed occurs when $i_2=\varnothing$ or when $\lambda a_\ell/a_{i_2}\geq\eta/(\eta-1)$), then this ratio is clearly greater than one, so $p_{hi_2\hat{q}}y_{hi_2\hat{q}}>p_{\ell i_2\hat{q}}y_{\ell i_2\hat{q}}$. If instead $\mu_{hi_2}>\mu_{\ell i_2}$, then it must be that either $\mu_{hi_2}/\mu_{\ell i_2}=a_h/a_\ell$, so $p_{hi_2\hat{q}}y_{hi_2\hat{q}}=p_{\ell i_2\hat{q}}y_{\ell i_2\hat{q}}$; or $\mu_{hi_2}/\mu_{\ell i_2}=(\eta/(\eta-1))/(\lambda a_\ell/a_{i_2})\in(1,a_h/a_\ell)$, so $p_{hi_2\hat{q}}y_{hi_2\hat{q}}>p_{\ell i_2\hat{q}}y_{\ell i_2\hat{q}}$. This verifies the claim that $p_{hi_2\hat{q}}y_{hi_2\hat{q}}\geq p_{\ell i_2\hat{q}}y_{\ell i_2\hat{q}}$. Finally, to see that $\pi_{hi_2\hat{q}}>\pi_{\ell i_2\hat{q}}$, note that in all of the above cases, whenever $\mu_{hi_2}=\mu_{\ell i_2}$ we have $p_{hi_2\hat{q}}y_{hi_2\hat{q}}>p_{\ell i_2\hat{q}}y_{\ell i_2\hat{q}}$, and whenever $p_{hi_2\hat{q}}y_{hi_2\hat{q}}=p_{\ell i_2\hat{q}}y_{\ell i_2\hat{q}}$ we have $\mu_{hi_2}>\mu_{\ell i_2}$. Thus, $$
\frac{\pi_{hi_2\hat{q}}}{\pi_{\ell i_2\hat{q}}} = \frac{(1-\mu_{hi_2}^{-1})}{(1-\mu_{\ell i_2}^{-1})}\frac{p_{hi_2\hat{q}}y_{hi_2\hat{q}}}{p_{\ell i_2\hat{q}}y_{\ell i_2\hat{q}}} > 1,
$$ confirming that high-productivity firms earn strictly greater operating profits on a blueprint.

Next, let us prove the claim that $v_{hi_2\hat{q}}^{\rm P}>v_{\ell i_2\hat{q}}^{\rm P}$ and $v_h^{\rm G}>v_\ell^{\rm G}$. The proof is clearest if we first consider the case without M&A ($\psi=0$). In this case, the ratio of product values is $$
\frac{v_{hi_2\hat{q}}^{\rm P}}{v_{\ell i_2\hat{q}}^{\rm P}} = \frac{\pi_{hi_2\hat{q}}}{\pi_{\ell i_2\hat{q}}} > 1.
$$ This implies that the ratio of growth values is $$
\frac{v_h^{\rm G}}{v_\ell^{\rm G}} = \frac{\Phi_{{\rm n}h}\bar{v}_{{\rm n}h}^{\rm P} + \Phi_{{\rm q}h}\bar{v}_{{\rm q}h}^{\rm P}}{\Phi_{{\rm n}\ell}\bar{v}_{{\rm n}\ell}^{\rm P} + \Phi_{{\rm q}\ell}\bar{v}_{{\rm q}\ell}^{\rm P}} > 1,
$$ because $\bar{v}_{{\rm k}h}^{\rm P}>\bar{v}_{{\rm k}\ell}^{\rm P}$ means $\Phi_{{\rm k}h}>\Phi_{{\rm k}\ell}$. The intuition when we add M&A ($\psi>0$) is similar, but we now have to rule out the possibility that low-productivity firms will want to acquire high-productivity blueprints. Note that we can only have $v_{\ell i_2\hat{q}}^{\rm G}>v_{hi_2\hat{q}}^{\rm G}$ for some blueprint type if there exists at least one product for which $v_{\ell i_2\hat{q}}^{\rm P}>v_{hi_2\hat{q}}^{\rm P}$. Suppose this is the case. The ratio of product values for such a product will equal $$
\frac{v_{hi_2\hat{q}}^{\rm P}}{v_{\ell i_2\hat{q}}^{\rm P}} = (1-\theta_\ell^{\rm P})\frac{\pi_{hi_2\hat{q}}}{\pi_{\ell i_2\hat{q}}} + \theta_\ell^{\rm P},
$$ where $$
\theta_\ell^{\rm P} \equiv \frac{\Psi_\ell f_\ell(1-\varrho)}{\rho + \delta + (\eta-1)g_Q + \Psi_\ell f_\ell(1-\varrho)} \in [0,1).
$$ But then $\pi_{hi_2\hat{q}}>\pi_{\ell i_2\hat{q}}$ implies that $v_{hi_2\hat{q}}^{\rm P}>v_{\ell i_2\hat{q}}^{\rm P}$, because the ratio above is a weighted average of one and a number greater than one. This contradicts the assumption, meaning we cannot have $v_{\ell i_2\hat{q}}^{\rm G}>v_{hi_2\hat{q}}^{\rm G}$ for any blueprint type. Hence, it is always the case that $v_{h i_2\hat{q}}^{\rm G}>v_{\ell i_2\hat{q}}^{\rm G}$ for all blueprint types, which implies that $$
\frac{v_{\ell i_2\hat{q}}^{\rm P}}{v_{hi_2\hat{q}}^{\rm P}} = (1-\theta_h^{\rm P})\frac{\pi_{\ell i_2\hat{q}}}{\pi_{hi_2\hat{q}}} + \theta_h^{\rm P},
$$ where $$
\theta_h^{\rm P} \equiv \frac{\Psi_hf_h(1-\varrho)}{\rho + \delta + (\eta-1)g_Q + \Psi_hf_h(1-\varrho)} \in [0,1).
$$ Because $\pi_{\ell i_2\hat{q}}<\pi_{hi_2\hat{q}}$, the right-hand side is a weighted average of one and a number less than one, so it is strictly less than one. This implies that $v_{hi_2\hat{q}}^{\rm P}>v_{\ell i_2\hat{q}}^{\rm P}$ for all blueprint types, confirming the claim.

The second claim of the proposition follows from the fact that $x_{{\rm k}h}>x_{{\rm k}\ell}$ if and only if $\bar{v}_{{\rm k}h}>\bar{v}_{{\rm k}\ell}$, which is implied by $v_{hi_2\hat{q}}^{\rm P}>v_{\ell i_2\hat{q}}^{\rm P}$ and $v_h^{\rm G}>v_\ell^{\rm G}$. ◻
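
The profit ranking established in the first part of the proof can be spot-checked with a small Python sketch. It evaluates the markup and revenue expressions used above for each follower type; the function names and parameter values are hypothetical and chosen only so that every leader's markup exceeds one.

```python
def markup(a_lead, a_follow, lam, eta):
    """Limit-pricing markup, capped at the monopoly markup eta/(eta-1).
    a_follow=None denotes a monopoly product (no follower)."""
    monopoly = eta / (eta - 1.0)
    if a_follow is None:
        return monopoly
    return min(lam * a_lead / a_follow, monopoly)

def profit_ratio(a_h, a_l, a_follow, lam, eta):
    """pi_h / pi_l for a given follower type, using the expressions in the proof."""
    mu_h = markup(a_h, a_follow, lam, eta)
    mu_l = markup(a_l, a_follow, lam, eta)
    revenue = ((a_h / mu_h) / (a_l / mu_l)) ** (eta - 1.0)
    return (1.0 - 1.0 / mu_h) / (1.0 - 1.0 / mu_l) * revenue

# Hypothetical parameters (assumptions, not the paper's calibration).
a_h, a_l, lam, eta = 1.2, 1.0, 1.3, 3.0
ratios = [profit_ratio(a_h, a_l, follower, lam, eta)
          for follower in (None, a_l, a_h)]
```

The three cases cover equal markups (monopoly), a capped high-type markup (low-type follower), and equal revenues (high-type follower); the profit ratio exceeds one in each.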

# Derivations for distribution of firms 

The characteristics of a firm can be summarized by its productivity type $a_i\in\{a_h,a_\ell\}$, the number of blueprints it operates $n\in\mathbb{N}\setminus\{0\}$, and the characteristics of those blueprints $\{i_{2jt},\hat{q}_{jt}\}_{j=1}^n\in \{h,\ell,\varnothing\}^n\times\mathbb{R}^n$. Thus, the entire firm distribution is summarized over all possible combinations of these characteristics. This distribution is too complicated to express analytically; however, we can derive closed-form solutions for the simpler distribution of types $a_i$, blueprint counts $n$, and ages $\tau\in\mathbb{R}_+$. As explained in the main text and below, this lower-dimensional distribution yields several useful insights about the equilibrium.

## Notation 

Let $M_t(a_i,n,\tau)$ denote the measure of firms with characteristics $\{a_i,n,\tau\}$ and $m_{it}(n,\tau)\equiv M_t(a_i,n,\tau)/M_t$ the share of such firms. Marginal distributions omit arguments: for example, $m_{it}(\tau) = \sum_{n=1}^\infty m_{it}(n,\tau)$. Note that the mass $M_t$ only includes firms that are currently operating ($n\geq1$), not firms that have exited in the past. Another useful distribution will be the density of *firm-products* $p_{it}(n,\tau)\equiv nM_t(a_i,n,\tau)/N_t$, which represents the share of *blueprints* held by firms with the corresponding characteristics. The firm distribution and firm-product distribution are linked by the equation $$
m_{it}(n,\tau) = \frac{N_t}{M_t}\frac{p_{it}(n,\tau)}{n},
$$ which implies that the average number of blueprints per firm is $N_t/M_t = \sum_{i\in\{h,\ell\}}\sum_{n=1}^\infty nm_{it}(n)$.

It is convenient to define the firm-specific blueprint-gain and blueprint-loss intensities: $$\begin{align}
\bar{\Phi}_{it} &\equiv \Phi_{{\rm n}it} + \Phi_{{\rm q}it} + \mathbb{I}_{i=h}\Psi_{ht}f_{\ell t}, \\
\bar{\delta}_{it} &\equiv \delta_t + \mathbb{I}_{i=\ell}\Psi_{ht}f_{ht}.
\end{align}$$ The former is the intensity, per current blueprint, with which a firm is expected to generate new blueprints via R&D or M&A; the latter is the intensity with which a firm loses a blueprint via creative destruction or M&A exit.

## Derivation of KFEs 

The KFEs for the firm distribution are derived similarly to those for the blueprint distribution. First, to see how the distribution evolves with respect to age $\tau$, suppose there is no R&D or M&A ($\bar{\Phi}_{it} = \bar{\delta}_{it} = 0$) and that time evolves in discrete increments of length $\Delta t$. The change in the mass of firms $M_{it}(n,\tau)$ from time $t-\Delta t$ to time $t$ equals $$
\Delta M_{it}(n,\tau) = M_{i,t-\Delta t}(n,\tau-\Delta t) - M_{i,t-\Delta t}(n,\tau).
$$ Dividing by $M_{t-\Delta t}\Delta t$ and taking limits implies $$
\dot{m}_{it}(n,\tau) + g_{Mt}m_{it}(n,\tau) = \lim_{\Delta t\to0} -\frac{m_{i,t-\Delta t}(n,\tau) - m_{i,t-\Delta t}(n,\tau-\Delta t)}{\Delta t} = -\frac{\partial m_{it}(n,\tau)}{\partial \tau},
$$ where $g_{Mt}\equiv\dot{M}_t/M_t$ is the net growth rate of the number of firms.

Now consider the case in which firms do R&D and M&A. The mass of firms with $n\geq2$ blueprints increases when firms with $n-1$ products successfully gain a blueprint, increases when firms with $n+1$ products lose a blueprint, and decreases when firms with $n$ products gain or lose blueprints. Hence, we have the KFEs $$\begin{multline}
\dot{m}_{it}(n,\tau) + g_{Mt}m_{it}(n,\tau) = -\frac{\partial m_{it}(n,\tau)}{\partial \tau} + \bar{\Phi}_{it}[(n-1)m_{it}(n-1,\tau) - nm_{it}(n,\tau)] \\
+ \bar{\delta}_{it}[(n+1)m_{it}(n+1,\tau) - nm_{it}(n,\tau)].
\end{multline}$$ The KFE for firms with $n=1$ blueprint is similar, but there are now no inflows from firms with $n-1$ products and, at $\tau=0$, there is also entry by entrepreneurs: $$\begin{multline}
\dot{m}_{it}(1,\tau) + g_{Mt}m_{it}(1,\tau) = -\frac{\partial m_{it}(1,\tau)}{\partial \tau} + \mathbb{I}_{\tau=0}(\Phi_{{\rm n}it}+\Phi_{{\rm q}it})\nu_{it} \\
- \bar{\Phi}_{it}m_{it}(1,\tau) + \bar{\delta}_{it}[2m_{it}(2,\tau)-m_{it}(1,\tau)].
\end{multline}$$ In all that follows, we assume a balanced-growth equilibrium, which implies constant intensities $\bar{\Phi}_i$ and $\bar{\delta}_i$, a stationary firm distribution $m_i(n,\tau)$, and a constant growth rate of firms $g_M$ equal to the growth rate of blueprints $g_N$.

## Distribution of firm size and age 

There are three steps in deriving the full distribution $m_i(n,\tau)$: first, solve for the distribution of size given age and type $m(n|a_i,\tau)$; then, solve for the marginal distribution of age and type $m_i(\tau)$; and finally, use Bayes' rule to get the full distribution $m_i(n,\tau) = m(n|a_i,\tau)m_i(\tau)$.

### Distribution of size given age and type 

Every firm begins its life with one product, so the initial condition is $m(n|a_i,\tau=0) = \mathbb{I}_{n=1}$. From this initial condition, the conditional distribution evolves with firm age as $$\begin{multline}
\frac{\partial m(n|a_i,\tau)}{\partial \tau} = \bar{\Phi}_i[(n-1)m(n-1|a_i,\tau) - nm(n|a_i,\tau)] \\
+ \bar{\delta}_i[(n+1)m(n+1|a_i,\tau) - nm(n|a_i,\tau)].
\end{multline}$$ The solution to this differential-difference equation is the function $$
m(n|a_i,\tau) = \frac{(\bar{\Phi}_i-\bar{\delta}_i)e^{-(\bar{\Phi}_i-\bar{\delta}_i)\tau}}{\bar{\Phi}_i-\bar{\delta}_ie^{-(\bar{\Phi}_i - \bar{\delta}_i)\tau}}\left(\frac{\bar{\Phi}_i(1-e^{-(\bar{\Phi}_i-\bar{\delta}_i)\tau})}{\bar{\Phi}_i - \bar{\delta}_ie^{-(\bar{\Phi}_i-\bar{\delta}_i)\tau}}\right)^{n-1}.
$$ This solution can be derived via the probability-generating function of Kendall (1948), as laid out in Luttmer (2011) and Klette and Kortum (2004). Alternatively, one could verify this solution by simply substituting it into the conditional KFEs above.
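
A short Python sketch makes the structure of this solution concrete. The first factor equals one minus the bracketed ratio, so the distribution of size given age is geometric in $n$. The function name and the intensity values below are illustrative assumptions.

```python
import math

def size_given_age(n, tau, phi, delta):
    """m(n | a_i, tau), with phi and delta the per-blueprint gain and
    loss intensities (phi != delta is assumed)."""
    gamma = phi - delta
    e = math.exp(-gamma * tau)
    denom = phi - delta * e
    first = gamma * e / denom           # equals m(1 | a_i, tau)
    ratio = phi * (1.0 - e) / denom     # common ratio of the geometric series
    return first * ratio ** (n - 1)

# Hypothetical intensities for a type that grows on average.
phi, delta = 0.10, 0.06
total_mass = sum(size_given_age(n, 10.0, phi, delta) for n in range(1, 2000))
```

At age zero the function returns one for $n=1$ and zero otherwise, matching the initial condition, and the probabilities sum to one at every age.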

### Distribution of age and type 

Aggregating the KFEs for $m_i(n,\tau)$ over $n$ for any $\tau>0$, we have $$
g_Nm_i(\tau) = -\frac{\partial m_i(\tau)}{\partial \tau} - \bar{\delta}_i m_i(1,\tau).
$$ Dividing by $m_i(\tau)$, using $m_i(1,\tau)=m(1|a_i,\tau)m_i(\tau)$, and integrating over $\tau$ implies that the solution to this is $$
m_i(\tau) = \kappa_i\exp\left\{-\left(g_N\tau + \bar{\delta}_i\int_0^\tau m(1|a_i,\tau')d\tau'\right)\right\},
$$ for some constants of integration $\kappa_i$. Using the conditional size distribution above at $n=1$, $$
m(1|a_i,\tau) = \frac{(\bar{\Phi}_i-\bar{\delta}_i)e^{-(\bar{\Phi}_i-\bar{\delta}_i)\tau}}{\bar{\Phi}_i-\bar{\delta}_ie^{-(\bar{\Phi}_i-\bar{\delta}_i)\tau}},
$$ the integral in the exponent evaluates to $$
\bar{\delta}_i\int_0^\tau m(1|a_i,\tau')d\tau' = \log(|\bar{\Phi}_i - \bar{\delta}_ie^{-(\bar{\Phi}_i-\bar{\delta}_i)\tau'}|)\bigl\rvert_0^\tau = \log\left(\frac{\bar{\Phi}_i - \bar{\delta}_ie^{-(\bar{\Phi}_i-\bar{\delta}_i)\tau}}{\bar{\Phi}_i-\bar{\delta}_i}\right).
$$ If $\bar{\Phi}_i>\bar{\delta}_i$ (type-$a_i$ firms grow on average), then this converges to the constant $\log(\bar{\Phi}_i/(\bar{\Phi}_i-\bar{\delta}_i))$ as $\tau\to\infty$. If $\bar{\Phi}_i<\bar{\delta}_i$, then this can be written as $$
\bar{\delta}_i\int_0^\tau m(1|a_i,\tau')d\tau' = \log\left(\frac{\bar{\delta}_i e^{(\bar{\delta}_i-\bar{\Phi}_i)\tau} - \bar{\Phi}_i}{\bar{\delta}_i-\bar{\Phi}_i}\right) = (\bar{\delta}_i-\bar{\Phi}_i)\tau + \log\left(\frac{\bar{\delta}_i - \bar{\Phi}_ie^{-(\bar{\delta}_i-\bar{\Phi}_i)\tau}}{\bar{\delta}_i-\bar{\Phi}_i}\right),
$$ which asymptotically grows linearly with $\tau$.
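
The closed form of the integral can be verified against numerical quadrature. The Python sketch below does this with a midpoint rule for one growing and one shrinking type; the function names and the intensity values are hypothetical.

```python
import math

def m1(tau, phi, delta):
    """m(1 | a_i, tau): share of age-tau firms that hold one blueprint."""
    gamma = phi - delta
    e = math.exp(-gamma * tau)
    return gamma * e / (phi - delta * e)

def exit_integral_closed(tau, phi, delta):
    """Closed form of delta * integral_0^tau m(1 | a_i, s) ds."""
    gamma = phi - delta
    return math.log((phi - delta * math.exp(-gamma * tau)) / gamma)

def exit_integral_midpoint(tau, phi, delta, steps=20000):
    """Midpoint-rule approximation of the same integral."""
    h = tau / steps
    return delta * h * sum(m1((k + 0.5) * h, phi, delta) for k in range(steps))

growing = (0.10, 0.06)     # phi > delta
shrinking = (0.05, 0.08)   # phi < delta
gap_growing = exit_integral_midpoint(20.0, *growing) - exit_integral_closed(20.0, *growing)
gap_shrinking = exit_integral_midpoint(20.0, *shrinking) - exit_integral_closed(20.0, *shrinking)
```

The same closed-form expression covers both cases, because the numerator and denominator inside the logarithm change sign together when $\bar{\Phi}_i<\bar{\delta}_i$.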

The last step in solving for $m_i(\tau)$ is finding the constants $\kappa_i$. These expressions need to integrate to the total firm-type shares $m_i=\int_0^\infty m_i(\tau)d\tau$, so it must be that $$
\kappa_i = \frac{m_i}{\bar{m}_i}, \quad\textrm{where}\quad \bar{m}_i \equiv \int_0^\infty \exp\left\{-\left(g_N\tau + \bar{\delta}_i\int_0^\tau m(1|a_i,\tau')d\tau'\right)\right\} d\tau.
$$ To solve for $m_i$, and thus for $m_i(\tau)$, note first that, at $\tau=0$, we have $$
m_i(\tau=0) = \kappa_i = \frac{m_i}{\bar{m}_i},
$$ and hence the conditional share of type-$a_i$ firms at $\tau=0$ is $$
m(a_i|\tau=0) = \frac{m_i(\tau=0)}{m_h(\tau=0)+m_\ell(\tau=0)} = \frac{m_i/\bar{m}_i}{m_h/\bar{m}_h+m_\ell/\bar{m}_\ell}.
$$ Second, note that the conditional share at $\tau=0$ is also equal to the relative rates of entry: $$
m(a_i|\tau=0) = \frac{\omega_i(\Phi_{{\rm n}i}+\Phi_{{\rm q}i})}{\omega_h(\Phi_{{\rm n}h}+\Phi_{{\rm q}h})+\omega_\ell(\Phi_{{\rm n}\ell}+\Phi_{{\rm q}\ell})}.
$$ Equating these two expressions and solving for $m_i$ implies $$
m_i = \frac{\omega_i(\Phi_{{\rm n}i}+\Phi_{{\rm q}i})\bar{m}_i}{\omega_h(\Phi_{{\rm n}h}+\Phi_{{\rm q}h})\bar{m}_h+\omega_\ell(\Phi_{{\rm n}\ell}+\Phi_{{\rm q}\ell})\bar{m}_\ell}.
$$ Substituting this back into the expression for $\kappa_i$, we have $$
\kappa_i = \frac{m_i}{\bar{m}_i} = \frac{\omega_i(\Phi_{{\rm n}i}+\Phi_{{\rm q}i})}{\omega_h(\Phi_{{\rm n}h}+\Phi_{{\rm q}h})\bar{m}_h+\omega_\ell(\Phi_{{\rm n}\ell}+\Phi_{{\rm q}\ell})\bar{m}_\ell},
$$ and hence a closed-form solution for $m_i(\tau)$.

Finally, by Bayes' rule, we can multiply the conditional distribution $m(n|a_i,\tau)$ by the marginal distribution $m_i(\tau)$ to get the solution $m_i(n,\tau)$.
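
The steps for recovering the type shares $m_i$ and the constants $\kappa_i=m_i/\bar{m}_i$ can be sketched in Python as follows. The integral $\bar{m}_i$ is approximated with a midpoint rule on a truncated horizon, which is an assumption of this sketch, and all parameter values are hypothetical.

```python
import math

def survival_weight(tau, g_n, phi, delta):
    """exp{-(g_N tau + delta * int_0^tau m(1|a_i,s) ds)} in closed form."""
    gamma = phi - delta
    return math.exp(-g_n * tau) * gamma / (phi - delta * math.exp(-gamma * tau))

def m_bar(g_n, phi, delta, horizon=2000.0, steps=100000):
    """Midpoint-rule approximation of the integral defining m_bar_i."""
    h = horizon / steps
    return h * sum(survival_weight((k + 0.5) * h, g_n, phi, delta)
                   for k in range(steps))

def type_shares(entry, m_bars):
    """entry[i] = omega_i * (Phi_n,i + Phi_q,i). Returns (m_i, kappa_i) lists."""
    denom = sum(e * mb for e, mb in zip(entry, m_bars))
    shares = [e * mb / denom for e, mb in zip(entry, m_bars)]
    kappas = [e / denom for e in entry]
    return shares, kappas

g_n = 0.01
m_bars = [m_bar(g_n, 0.10, 0.06), m_bar(g_n, 0.05, 0.08)]   # high, low type
entry = [0.3 * 0.07, 0.7 * 0.05]                             # hypothetical entry flows
shares, kappas = type_shares(entry, m_bars)
```

In this illustration the high type survives longer on average, so its share among operating firms exceeds its share among entrants.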

## Derivation of firm entry and exit rates 

The firm entry rate $\mathcal{E}_{\rm entry}$ is defined as the flow of new firms being created relative to the number of incumbent firms: $$
\mathcal{E}_{{\rm entry},t} = \frac{\sum_{i\in\{h,\ell\}}(\Phi_{{\rm n}it}+\Phi_{{\rm q}it})\omega_i\bar{s}L_t}{M_t} = \frac{N_t}{M_t}\sum_{i\in\{h,\ell\}}(\Phi_{{\rm n}it}+\Phi_{{\rm q}it})\nu_{it}.
$$ The ratio $N_t/M_t$ can be computed from the marginal distribution $m_t(n)=\sum_i\int_0^\infty m_{it}(n,\tau)d\tau$: $$
\frac{N_t}{M_t} = \sum_{n=1}^\infty nm_t(n).
$$ The exit rate is likewise defined as the flow of firms exiting relative to the number of incumbent firms: $$
\mathcal{E}_{{\rm exit},t} = \frac{\sum_{i\in\{h,\ell\}}\bar{\delta}_{it}M_{it}(n=1)}{M_t} = \sum_{i\in\{h,\ell\}}\bar{\delta}_{it}m_{it}(n=1).
$$ In a balanced-growth equilibrium, the entry and exit rates are constant and are linked to each other by the relation $$
g_L = \mathcal{E}_{\rm entry} - \mathcal{E}_{\rm exit},
$$ because the growth rate in the number of firms $g_M = \dot{M}_t/M_t$ is equal to the growth rate of the population $g_L$.

## Derivation of Pareto tail coefficients 

As Luttmer (2011) shows, if $\bar{\Phi}_i>\bar{\delta}_i$, then the distribution of blueprints $m_i(n)$ exhibits a Pareto tail with tail index $\theta_i>0$. To represent the fact that the probability density function $m_i(n)$ behaves like $n^{-(1+\theta_i)}$, I use the notation $m_i(n)\stackrel{\rm lim}{\sim}n^{-(1+\theta_i)}$.[^84] To show this, it is easiest to first show that the firm-product distribution $p_i(n)\propto nm_i(n)\stackrel{\rm lim}{\sim}n^{-\theta_i}$. The tail index is defined as the limit of the sequence $$
\theta_i \equiv \lim_{n\to\infty} - n \frac{p_i(n+1) - p_i(n)}{p_i(n)}.
$$ Intuitively, this tail index behaves like the continuous elasticity $\lim_{n\to\infty}-\partial\log p_i(n)/\partial\log n$ and is finite if $\log p_i(n)$ falls at the same rate as (or more slowly than) $\log n$ rises. If the distribution exhibits a tail that is thinner than a Pareto distribution (which will occur for $\bar{\Phi}_i\leq\bar{\delta}_i$), then the tail index will be infinite.

Consider the case $\bar{\Phi}_i>\bar{\delta}_i$. To compute the Pareto tail coefficient, note that the KFE for $m_i(n)$ is $$
g_Nm_i(n) = \bar{\Phi}_i[(n-1)m_i(n-1) - nm_i(n)] + \bar{\delta}_i[(n+1)m_i(n+1) - nm_i(n)].
$$ Rescaling both sides by $M_t/N_t$ (recall that $p_i(n)=nm_i(n)M_t/N_t$), we can rewrite this in terms of the firm-product distribution $$
g_N\frac{1}{n}p_i(n) = \bar{\Phi}_i[p_i(n-1) - p_i(n)] + \bar{\delta}_i[p_i(n+1) - p_i(n)].
$$ Multiplying both sides by $n/p_i(n)$ and taking limits as $n\to\infty$, we then get the tail coefficient $$
\theta_i = \frac{g_N}{\bar{\Phi}_i-\bar{\delta}_i}.
$$ If instead $\bar{\Phi}_i<\bar{\delta}_i$, then the tail is thinner than Pareto. We instead have $p_i(n)\stackrel{\rm lim}{\sim}(\bar{\Phi}_i/\bar{\delta}_i)^n$. One can verify this by dividing the firm-product KFE above by $p_i(n)$, substituting this in, and taking $n\to\infty$. As shown in Luttmer (2011), the tail index for the entire distribution of firms (i.e., including both high and low types) is $$
\theta = \min\{\theta_h,\theta_\ell\} = \min\left\{\frac{g_N}{[\bar{\Phi}_h-\bar{\delta}_h]^+},\frac{g_N}{[\bar{\Phi}_\ell-\bar{\delta}_\ell]^+}\right\} = \theta_h.
$$ High-productivity firms grow faster than low-productivity firms on average, so they dominate the tail.
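
The tail-index formula translates directly into code. The Python sketch below applies the positive-part convention from the expression above, returning an infinite index for a type that shrinks on average; the function name and the numerical values are illustrative assumptions.

```python
import math

def tail_index(g_n, phi_bar, delta_bar):
    """theta_i = g_N / (phi_bar - delta_bar) when the type grows on average,
    and infinity (thinner-than-Pareto tail) otherwise."""
    gap = phi_bar - delta_bar
    return g_n / gap if gap > 0.0 else math.inf

# Hypothetical intensities: the high type grows on average, the low type shrinks.
theta_h = tail_index(0.01, 0.100, 0.092)
theta_l = tail_index(0.01, 0.070, 0.095)
theta = min(theta_h, theta_l)
```

A smaller gap between gain and loss intensities gives a larger index and hence a thinner tail.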

As a final step, let us prove that the tail index $\theta$ of the blueprint distribution also applies to the distribution of employment and the distributions of rescaled firm revenues, profits, and value. Firm $i$'s total employment equals $$
L_{it} = l_{\rm p}(1-\bar{s})\frac{L_t}{N_t}\left(\frac{a_i}{\bar{a}}\right)^{\eta-1}n_{it}\upsilon_{\eta it},
$$ where $$
\upsilon_{\eta it} \equiv \frac{1}{n_{it}}\sum_{j=1}^{n_{it}}\left(\frac{q_{ijt}}{Q_t}\right)^{\eta-1}\left(\frac{\mu_{ijt}}{\mathcal{M}_{\eta-1}}\right)^{-\eta}
$$ is an average of qualities and markups across products. Because the mix of product gains and losses is invariant to a firm's size, this average $\upsilon_{\eta it}$ converges in probability to a constant as $n\to\infty$: there exists a finite constant $\bar{\upsilon}_\eta(a_i)>0$ (which may depend on the type $a_i$) such that $$
\lim_{n\to\infty}\mathbb{P}(\upsilon_{\eta it} = \bar{\upsilon}_\eta(a_i)) = 1.
$$ I henceforth use the notation $\upsilon_{\eta it}\stackrel{\rm p}{\to}\bar{\upsilon}_\eta(a_i)$ to denote this probability limit. This means that $$
\frac{L_{it}}{n_{it}} \stackrel{\rm p}{\to} l_{\rm p}(1-\bar{s})\frac{L_t}{N_t}\left(\frac{a_i}{\bar{a}}\right)^{\eta-1}\bar{\upsilon}_\eta(a_i),
$$ and thus that $L\stackrel{\rm lim}{\sim}n$ (in a probabilistic sense). Among very large firms, product portfolios are well-diversified, so employment is linear in $n$. It follows that the Pareto tail of the employment distribution looks the same as that of the blueprint distribution. The exact same logic applies to the rescaled sales, profit, and value distributions: $$
\frac{p_{it}y_{it}}{Y_t/L_t} \stackrel{\rm lim}{\sim} n_{it}, \quad \hat{\Pi}_{it} \stackrel{\rm lim}{\sim} n_{it}, \quad\textrm{and}\quad \hat{V}_{it} \stackrel{\rm lim}{\sim} n_{it}.
$$

# Numerical solution and simulation methods 

## Balanced-growth path: Equilibrium solution method 

This section outlines the numerical method used to solve the system of equilibrium conditions in Section 3.2. The basic idea is two loops: an outer loop over the blueprint distribution $\{f_{i_1i_2\hat{q}}\}$ and an inner loop over the ratio $L_t/N_t$, which is constant in a BGE.

### State space

The states needed to calculate blueprint values $\{v_{i_1i_2}^{\rm P},v_{i_1}^{\rm G}\}$ are the pairs $\{i_1,i_2\}$ (recall that we solved out $\hat{q}$ analytically, so we do not need to consider this state), which form a six-point grid. To solve for the blueprint distribution, we need the states $\{i_1,i_2,\hat{q}\}$, where $\hat{q}$ is continuous. Thus, we discretize $\hat{q}$ on an evenly spaced grid with increments $\log\lambda/m$ for some positive integer $m$. The grid has $2K+1$ points centered around zero: $$\hat{q}\in\left\{-K\frac{\log\lambda}{m},-(K-1)\frac{\log\lambda}{m},\dots,-\frac{\log\lambda}{m},0,\frac{\log\lambda}{m},\dots,(K-1)\frac{\log\lambda}{m},K\frac{\log\lambda}{m}\right\}.$$ Thus, the full numerical state space has $6\times(2K+1)$ points.
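
A minimal Python sketch of the grid construction follows. The function name `qhat_grid` and the values of $\lambda$, $m$, and $K$ are hypothetical. One apparent advantage of the step $\log\lambda/m$ is that a quality jump of size $\log\lambda$ moves a product exactly $m$ grid points, so the shifted terms in the KFEs stay on the grid; this is my reading of the design and not a statement from the paper.

```python
import math

def qhat_grid(lam, m, K):
    """2K+1 evenly spaced points with step log(lam)/m, centered on zero."""
    step = math.log(lam) / m
    return [k * step for k in range(-K, K + 1)]

# Illustrative values only.
lam, m, K = 1.2, 4, 40
grid = qhat_grid(lam, m, K)
n_states = 6 * len(grid)      # six leader-follower pairs times the q-hat grid
```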

### Overview of solution algorithm

Ultimately, we are interested in the values $\{v_{i_1i_2}^{\rm P},v_i^{\rm G}\}$ and the distribution $\{f_{i_1i_2\hat{q}}\}$ that solve the HJBs and KFEs. These unknowns depend on, and jointly imply, macroeconomic aggregates, firm markups and profits, and R&D and M&A policies $\{x_{{\rm n}i},x_{{\rm q}i},x_{{\rm m}i}\}$. Starting from an initial guess for values and distributions (iteration 0), the algorithm at iteration $k$ is as follows:

1. Given the distribution $\{f_{i_1i_2\hat{q}}^{k-1}\}$ and policies $\{x_{{\rm n}i}^{k-1},x_{{\rm q}i}^{k-1},x_{{\rm m}i}^{k-1}\}$, compute the profit rates $\{\pi_{i_1i_2}^k\}$ and values $\{v_{i_1i_2}^{{\rm P},k}\}$ and $\{v_i^{{\rm G},k}\}$ from the HJBs.

2. Under the new values $\{v_{i_1i_2}^{{\rm P},k}\}$ and $\{v_i^{{\rm G},k}\}$ and the old distribution $\{f_{i_1i_2\hat{q}}^{k-1}\}$, use an inner loop to solve for policies $\{x_{{\rm n}i}^{k},x_{{\rm q}i}^{k},x_{{\rm m}i}^{k}\}$ and the aggregates $\{LN^k,\hat{w}_{\rm x}^k,f_h^k\}$ that are consistent with balanced growth.

3. Using the new values and policies, solve for the distribution $\{f_{i_1i_2\hat{q}}^k\}$; use this distribution to re-compute macroeconomic aggregates.

4. Check if $\{f_{i_1i_2\hat{q}}^k\}$ has converged; if not, return to step 1.

This iterative process exploits the stability of a stationary distribution: starting from an arbitrary distribution, the model will converge to the balanced-growth path.
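
The outer loop has the shape of a fixed-point iteration on the distribution. The Python sketch below shows only that skeleton: `toy_update` is a hypothetical stand-in for steps 1 to 3 (values, policies, and the distribution update) and bears no relation to the model's actual equations.

```python
def solve_fixed_point(update, guess, tol=1e-10, max_iter=10_000):
    """Iterate x_k = update(x_{k-1}) until the sup-norm change falls below tol
    (step 4 of the algorithm). Returns the solution and the iteration count."""
    x = list(guess)
    for k in range(1, max_iter + 1):
        x_new = update(x)
        if max(abs(a - b) for a, b in zip(x_new, x)) < tol:
            return x_new, k
        x = x_new
    raise RuntimeError("fixed-point iteration did not converge")

def toy_update(f):
    """Hypothetical two-state transition map, used purely for illustration."""
    return [0.9 * f[0] + 0.2 * f[1], 0.1 * f[0] + 0.8 * f[1]]

f_star, n_iter = solve_fixed_point(toy_update, [0.5, 0.5])
```

Starting the toy map from a different guess with the same total mass leads to the same limit, which mirrors the stability property the algorithm relies on.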

### Initialization

The initialization requires a guess for the distribution $\{f_{i_1i_2\hat{q}}^0\}$, policies $\{x_{{\rm n}i}^{0},x_{{\rm q}i}^{0},x_{{\rm m}i}^{0}\}$, and ratio $LN^0$. Given that the speed of convergence is insensitive to this initialization, I set the distribution to be normal across $\hat{q}$ (with standard deviation small relative to the grid) and uniform across pairs $\{i_1,i_2\}$. I set unit policies $x_{{\rm n}i}^{0}=x_{{\rm q}i}^{0}=1$ and $x_{{\rm m}h}^{0}=1$. I set $LN^0=10$.

Before starting the algorithm, we also need to compute some aggregate variables using these initial guesses. These include the aggregates $\{\bar{a}^0,\mathcal{M}_{\eta-1}^0\}$, which we need to calculate profits; and the growth rates $\{\delta^0,g_N^0\}$ and rescaled skilled wage $\hat{w}_{\rm x}^0$, which are needed to compute values and distributions. These computations are explained in further detail below.

### Solving HJBs

The markups of a product $\mu_{i_1i_2}$ are independent of product values and the distribution, so these can be computed with initialized values. The profit rate at iteration $k$ then equals $$
\pi_{i_1i_2}^k = \left(1-\frac{1}{\mu_{i_1i_2}}\right)\biggl(\frac{a_{i_1}/\mu_{i_1i_2}}{\bar{a}^{k-1}/\mathcal{M}_{\eta-1}^{k-1}}\biggr)^{\eta-1}LN^{k-1},
$$ where $\bar{a}^{k-1}$ and $\mathcal{M}_{\eta-1}^{k-1}$ are computed using $\{f_{i_1i_2\hat{q}}^{k-1}\}$.

Using the distribution $\{f_{i_1i_2\hat{q}}^{k-1}\}$, the policies $\{x_{{\rm n}i}^{k-1},x_{{\rm q}i}^{k-1},x_{{\rm m}i}^{k-1}\}$, and the ratio $LN^{k-1}$, we can compute product values as $$
v_{i_1i_2}^{{\rm P},k} = \frac{\pi_{i_1i_2}^k + \mathbb{I}_{i_1=\ell}\Psi_h^{k-1}f_h^{k-1}(1-\varrho)v_{hi_2}^{{\rm P},k}}{\rho + \delta^{k-1} + (\eta-1)g_Q^{k-1} + \mathbb{I}_{i_1=\ell}\Psi_h^{k-1}f_h^{k-1}(1-\varrho)}.
$$ Using these product values, we can compute expected product values, $\{\bar{v}_{{\rm n}i}^{k},\bar{v}_{{\rm q}i}^{k},\bar{v}_{{\rm m}i}^{k}\}$, with which we can compute the value of growth opportunities. Because of M&A, growth values are interdependent in the HJB, so we need to invert a two-by-two system: for $v^{\rm G} = [v_h^{\rm G}\quad v_\ell^{\rm G}]^\top$, $$
C^{{\rm G},k}v^{{\rm G},k} = \tilde{v}^{{\rm G},k},
$$ where (suppressing $k$ superscripts for the moment) $$
C^{\rm G} = \begin{bmatrix}
\rho+\delta-(1-\varepsilon)(\Phi_{{\rm n}h}+\Phi_{{\rm q}h}+\Psi_hf_\ell\varrho) & (1-\varepsilon)\Psi_hf_\ell\varrho \\
-\Psi_hf_h(1-\varrho) & \rho+\delta-(1-\varepsilon)(\Phi_{{\rm n}\ell}+\Phi_{{\rm q}\ell})+\Psi_hf_h(1-\varrho)
\end{bmatrix},
$$ and $$
\tilde{v}^{\rm G} = \begin{bmatrix}
(1-\varepsilon)(\Phi_{{\rm n}h}\bar{v}_{{\rm n}h}^{\rm P}+\Phi_{{\rm q}h}\bar{v}_{{\rm q}h}^{\rm P}+\Psi_hf_\ell\bar{v}_{{\rm m}h}^{\rm P}) \\
(1-\varepsilon)(\Phi_{{\rm n}\ell}\bar{v}_{{\rm n}\ell}^{\rm P}+\Phi_{{\rm q}\ell}\bar{v}_{{\rm q}\ell}^{\rm P})
\end{bmatrix}.
$$ Inverting this system gives us $\{v_{i_1}^{{\rm G},k}\}$.
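
As an illustration of this step, the two-by-two system can be solved in closed form by Cramer's rule. The following is a minimal sketch, not the paper's code: the function name, argument names, and the dictionary keys `"h"` and `"l"` are hypothetical, and the entries of $C^{\rm G}$ and $\tilde{v}^{\rm G}$ are copied term by term from the expressions above.

```python
def solve_growth_values(rho, delta, eps, varrho, Phi_n, Phi_q, Psi_h, f,
                        vbar_n, vbar_q, vbar_m_h):
    """Solve C^G v^G = v~^G for (v_h^G, v_l^G) by Cramer's rule.

    Phi_n, Phi_q, f, vbar_n, vbar_q are dicts keyed by "h" and "l"
    (hypothetical layout chosen for this sketch).
    """
    f_h, f_l = f["h"], f["l"]
    # Entries of C^G, as in the matrix displayed in the text.
    c_hh = rho + delta - (1 - eps) * (Phi_n["h"] + Phi_q["h"] + Psi_h * f_l * varrho)
    c_hl = (1 - eps) * Psi_h * f_l * varrho
    c_lh = -Psi_h * f_h * (1 - varrho)
    c_ll = (rho + delta - (1 - eps) * (Phi_n["l"] + Phi_q["l"])
            + Psi_h * f_h * (1 - varrho))
    # Right-hand side v~^G.
    b_h = (1 - eps) * (Phi_n["h"] * vbar_n["h"] + Phi_q["h"] * vbar_q["h"]
                       + Psi_h * f_l * vbar_m_h)
    b_l = (1 - eps) * (Phi_n["l"] * vbar_n["l"] + Phi_q["l"] * vbar_q["l"])
    det = c_hh * c_ll - c_hl * c_lh
    if abs(det) < 1e-14:
        raise ValueError("C^G is numerically singular")
    v_h = (b_h * c_ll - c_hl * b_l) / det
    v_l = (c_hh * b_l - c_lh * b_h) / det
    return v_h, v_l
```

When $\Psi_h=0$ the off-diagonal entries vanish and the two growth values decouple, which gives a convenient sanity check.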

### Inner loop: Solving for policies and growth rates

A challenge presented by the balanced-growth path is that we need $g_N=g_L$. This means that we need to simultaneously find the ratio $LN$, the policies $\{x_{{\rm n}i}^k,x_{{\rm q}i}^k,x_{{\rm m}i}^k\}$, and the total market share $f_h$ that are consistent with this condition and with each other. The relations among these objects are highly nonlinear, so I implement an iterative procedure.

Within iteration $k$, we re-initialize the procedure at iteration $l=0$, setting $\{LN^0,\hat{w}_{\rm x}^0,f_h^0\}$ equal to their values $\{LN^{k-1},\hat{w}_{\rm x}^{k-1},f_h^{k-1}\}$ from the previous outer-loop iteration. We can then calculate the new-variety R&D policies $$
x_{{\rm n}i}^l = \left(\frac{\varepsilon(1-\alpha)\varphi}{\hat{w}_{\rm x}^{l-1}}(\bar{v}_{{\rm n}i}^{{\rm P},k} + v_i^{{\rm G},k})\right)^{1/(1-\varepsilon)},
$$ and do the same for $x_{{\rm q}i}^{l}$ and $x_{{\rm m}i}^{l}$. These imply $\{\Phi_{{\rm n}i}^l,\Phi_{{\rm q}i}^l,\Psi_i^l\}$. Next, compute the $LN^l$ that solves $g_N=g_L$, or $$
LN^l = \frac{g_L - \sum_{i\in\{\ell,h\}}f_i^l\Phi_{{\rm n}i}^l}{\bar{s}\sum_{i\in\{\ell,h\}}\omega_i\Phi_{{\rm n}i}^l}
$$ Then find $f_h^l$ that solves $$
(g_L + \Phi_{{\rm q}h}^l(\nu_h^l+f_h^l) + \Phi_{{\rm q}\ell}^l(\nu_\ell^l+f_\ell^l))f_h^l = (\Phi_{{\rm n}h}^l+\Phi_{{\rm q}h}^l)(\nu_h^l + f_h^l) + \Psi_h^lf_h^l(1-f_h^l).
$$ Substituting $f_\ell^l=1-f_h^l$, this is a quadratic equation, $$\begin{multline}
\underbrace{-(\Phi_{{\rm n}h} + \Phi_{{\rm q}h})\nu_h}_{\phi_0^l} + \underbrace{(g_L + \Phi_{{\rm q}h}(\nu_h-1) +\Phi_{{\rm q}\ell}(\nu_\ell+1) - \Phi_{{\rm n}h} - \Psi_h)}_{\phi_1^l}f_h^l \\
+ \underbrace{(\Phi_{{\rm q}h} + \Psi_h - \Phi_{{\rm q}\ell})}_{\phi_2^l}(f_h^l)^2 = 0,
\end{multline}$$ the solution to which is the positive root $$\begin{align}
f_h^l = -\frac{\phi_1^l}{2\phi_2^l} + \sqrt{\left(\frac{\phi_1^l}{2\phi_2^l}\right)^2 - \frac{\phi_0^l}{\phi_2^l}}.
\end{align}$$ Finally, we can re-solve for the skilled wage $$
\hat{w}_{\rm x}^l = \left((\hat{w}_{\rm x}^{l-1})^{\frac{1}{1-\varepsilon}}\sum_{i\in\{\ell,h\}}\left(\omega_i(x_{{\rm n}i}^l+x_{{\rm q}i}^l)+\frac{1}{\bar{s}LN^l}f_i^l(x_{{\rm n}i}^l+x_{{\rm q}i}^l+x_{{\rm m}i}^l)\right)\right)^{\frac{1-\varepsilon}{1+\zeta(1-\varepsilon)}}
$$ The criterion for the convergence of this inner loop is that $LN$ is constant between iterations: $|LN^l-LN^{l-1}|<\epsilon_{\rm inner}^{\rm tol}$, where I use the threshold $\epsilon_{\rm inner}^{\rm tol}=10^{-8}$. At this final iteration $l$, we take these values for $\{x_{{\rm n}i}^k,x_{{\rm q}i}^k,x_{{\rm m}i}^k\}$, $\{\Phi_{{\rm n}i}^k,\Phi_{{\rm q}i}^k,\Psi_i^k\}$, $LN^k$, $\hat{w}_{\rm x}^k$, and $f_h^k$.
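
The quadratic step of the inner loop can be sketched as follows. This is an illustration under stated assumptions rather than the paper's implementation: the function names are hypothetical, the coefficients $\phi_0$, $\phi_1$, $\phi_2$ are built exactly as in the quadratic above, and the helper `kfe_residual` simply evaluates the original (pre-substitution) equation for $f_h$ so that the root can be checked.

```python
import math

def solve_f_h(g_L, Phi_n_h, Phi_q_h, Phi_q_l, Psi_h, nu_h, nu_l):
    """Return the root of phi2*f^2 + phi1*f + phi0 = 0 selected in the text."""
    phi0 = -(Phi_n_h + Phi_q_h) * nu_h
    phi1 = g_L + Phi_q_h * (nu_h - 1) + Phi_q_l * (nu_l + 1) - Phi_n_h - Psi_h
    phi2 = Phi_q_h + Psi_h - Phi_q_l
    if phi2 == 0:
        raise ValueError("phi2 = 0: the equation is linear, not quadratic")
    half = phi1 / (2 * phi2)
    disc = half ** 2 - phi0 / phi2
    if disc < 0:
        raise ValueError("no real root")
    return -half + math.sqrt(disc)

def kfe_residual(f_h, g_L, Phi_n_h, Phi_q_h, Phi_q_l, Psi_h, nu_h, nu_l):
    """Left-hand side minus right-hand side of the equation for f_h."""
    f_l = 1 - f_h
    lhs = (g_L + Phi_q_h * (nu_h + f_h) + Phi_q_l * (nu_l + f_l)) * f_h
    rhs = (Phi_n_h + Phi_q_h) * (nu_h + f_h) + Psi_h * f_h * (1 - f_h)
    return lhs - rhs
```

In a full inner loop, this function would be called once per iteration $l$, after the intensities have been updated and before the skilled wage is re-solved.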

### Solving KFEs

Given $f_h$, we can solve the KFEs as a $(6\times(2K+1))$-dimensional linear system. For notational simplicity, suppress the iteration index $k$ in what follows. Note that the derivative over relative quality $\hat{q}$ can be approximated by the discretization $$
\frac{\partial f_{i_1i_2\hat{q}}}{\partial\hat{q}} = \frac{f_{i_1i_2,\hat{q}+\log\lambda/m}-f_{i_1i_2\hat{q}}}{\log\lambda/m}
$$ We take a forward difference because $\hat{q}$ is falling over time.

We will use the following notation. Let $C_{ss'}$ denote a $(2K+1)\times(2K+1)$ matrix of the form $$
C_{ss'} = \begin{bmatrix}
c_{ss'}^{(0)} & c_{ss'}^{(1)} & c_{ss'}^{(2)} & c_{ss'}^{(3)} & \cdots \\
c_{ss'}^{(-1)} & c_{ss'}^{(0)} & c_{ss'}^{(1)} & c_{ss'}^{(2)} & \cdots \\
c_{ss'}^{(-2)} & c_{ss'}^{(-1)} & c_{ss'}^{(0)} & c_{ss'}^{(1)} & \cdots \\
\vdots & \vdots & \vdots & \vdots & \ddots
\end{bmatrix},
$$ for constants $c_{ss'}^{(j)}$ to be defined. The indices $s$ correspond to the states $\{i_1,i_2\}$ ordered from 1 to 6 as $\{\{h,\varnothing\},\{\ell,\varnothing\},\{h,h\},\{\ell,\ell\},\{h,\ell\},\{\ell,h\}\}$; the same is true of $s'$. The interpretation of element $c_{ss'}^{(j)}$ is thus the transition rate into the current state $s$ and quality $\hat{q}$ (the row) from the state $s'$ with quality $j$ increments higher than $\hat{q}$.

Now, let us concatenate these block matrices into the $6\times6$ block matrix $$
C = \begin{bmatrix}
C_{11} & C_{12} & \cdots & C_{16} \\
C_{21} & C_{22}& \cdots & C_{26} \\
\vdots & \vdots & \ddots& \vdots \\
C_{61} & C_{62} & \cdots & C_{66} \\
\end{bmatrix},
$$ which ultimately is a $(6(2K+1))\times(6(2K+1))$ matrix.

Letting $\mathbf{f}$ denote the $(6(2K+1))$-dimensional vector of the distribution $f_{i_1i_2\hat{q}}$ with element order corresponding to the order in $C$, we can write the KFEs as the linear system $$
C\mathbf{f} = \mathbf{v},
$$ where the right-hand side vector $\mathbf{v}$ is a concatenation of the new-variety quality distribution and a vector of zeros: $$
\mathbf{v} \equiv [\Phi_{{\rm n}h}(\nu_h+f_h)\Gamma^\top,\Phi_{{\rm n}\ell}(\nu_\ell+f_\ell)\Gamma^\top,0_{4(2K+1)}^\top]^\top.
$$ (Recall that we have assumed $\Gamma_{\hat{q}}$ is a point mass at $\hat{q}=\log\lambda$.) From this equation, it follows that the diagonal block matrices $C_{ss}$ have elements $$\begin{align}
c_{ss}^{(0)} &= g_N + \delta + \mathbb{I}_{s\in\{2,4,6\}}\Psi_hf_h + \frac{g_Q}{\log\lambda/m}, \\
c_{ss}^{(1)} &= - \frac{g_Q}{\log\lambda/m},
\end{align}$$ where $\mathbb{I}_{s\in\{2,4,6\}}=1$ if $s\in\{2,4,6\}$. All other elements $c_{ss}^{(j)}$ equal zero. The diagonals of $C_{12}$, $C_{36}$, and $C_{54}$ are $$
c_{12}^{(0)} = c_{36}^{(0)} = c_{54}^{(0)} = -\Psi_hf_h.
$$ For $\{h,h\}$ we have non-zero elements of $C_{31}$, $C_{33}$, and $C_{35}$ equal to $$
c_{31}^{(-m)} = c_{33}^{(-m)} = c_{35}^{(-m)} = -\Phi_{{\rm q}h}(\nu_h+f_h).
$$ Similarly, for $\{\ell,\ell\}$, $$
c_{42}^{(-m)} = c_{44}^{(-m)} = c_{46}^{(-m)} = -\Phi_{{\rm q}\ell}(\nu_\ell+f_\ell).
$$ And finally, for $\{h,\ell\}$, $$
c_{52}^{(-m)} = c_{54}^{(-m)} = c_{56}^{(-m)} = -\Phi_{{\rm q}h}(\nu_h+f_h),
$$ and, for $\{\ell,h\}$, $$
c_{61}^{(-m)} = c_{63}^{(-m)} = c_{65}^{(-m)} = -\Phi_{{\rm q}\ell}(\nu_\ell+f_\ell).
$$ All other elements of $C$ are set equal to zero.[^85]

Because we have discretized the distribution and therefore truncated the relative quality distribution, we need to adjust coefficients at the boundary to account for the (very small) number of blueprints that would exit the economy at the boundaries of the state space. These exits would occur in one of two cases: first, when a product with the minimum quality $\hat{q}_{\rm min}$ depreciates; and second, when a product with one of the $m$ highest possible qualities $\hat{q}\in\{\hat{q}_{\rm max}-(m-1)\log\lambda/m,\dots,\hat{q}_{\rm max}-\log\lambda/m,\hat{q}_{\rm max}\}$ gets hit with a quality innovation. The former is resolved by assuming there is no depreciation at this lower bound, meaning $g_Q/(\log\lambda/m)$ must be subtracted from the top-left element of every $C_{ss}$ matrix. The latter is resolved by assuming that the quality of maximal-quality blueprints cannot be further improved, which means $\delta$ must be subtracted from the $m$ bottom-right diagonal elements of every $C_{ss}$ matrix.
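
To make the structure of a diagonal block concrete, the sketch below assembles one $C_{ss}$ matrix with both boundary adjustments. It is a hedged illustration: the function and argument names are hypothetical, the matrix is stored as a plain list of lists, and rows are assumed to be ordered from $\hat{q}_{\rm min}$ (top) to $\hat{q}_{\rm max}$ (bottom), so that $c_{ss}^{(1)}$ sits on the superdiagonal.

```python
def diagonal_block(n, m, g_N, delta, g_Q, dq, psi_f=0.0):
    """Build one diagonal block C_ss of size n x n, where n = 2K + 1.

    dq    : grid spacing log(lambda) / m
    psi_f : Psi_h * f_h for states s in {2, 4, 6}, and 0 otherwise
    """
    drift = g_Q / dq
    C = [[0.0] * n for _ in range(n)]
    for r in range(n):
        C[r][r] = g_N + delta + psi_f + drift      # c_ss^(0)
        if r + 1 < n:
            C[r][r + 1] = -drift                   # c_ss^(1): inflow from above
    # Lower boundary: no depreciation out of the minimum quality.
    C[0][0] -= drift
    # Upper boundary: the m highest qualities cannot be improved upon.
    for r in range(max(n - m, 0), n):
        C[r][r] -= delta
    return C
```

The off-diagonal blocks would be filled analogously, with a single nonzero diagonal at offset $0$ (M&A) or $-m$ (quality innovation).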

### Convergence criteria and final computations

The convergence criterion is that the entire distribution $f_{i_1i_2\hat{q}}$ has converged between iterations. I implement this by checking whether $$
||\mathbf{f}^k-\mathbf{f}^{k-1}||<\epsilon_{\rm outer}^{\rm tol},
$$ where $||\cdot||$ is the Euclidean norm and $\epsilon_{\rm outer}^{\rm tol}$ is a very small tolerance threshold (I choose $\epsilon_{\rm outer}^{\rm tol}=10^{-10}$). If this condition is satisfied, then the iteration is complete. Otherwise, use the distribution $\mathbf{f}^k$ to recompute the aggregates $\{\bar{a}^k,\mathcal{M}_{\eta-1}^k\}$ and begin iteration $k+1$.

### Solution without population and variety growth

While this entire paper assumes positive population growth ($g_L>0$) and new-variety creation ($\alpha<1$), it nests the special case without population or variety growth ($g_L=1-\alpha=0$), which is the assumption in a large class of endogenous-growth models. The above solution method will fail in this case for two reasons, requiring two numerical changes:

1. The ratio $L_t/N_t$ will be indeterminate, so we need to set it to some exogenous initial value and omit the step of solving for $LN$ in the inner loop. Some papers assume $L_t/N_t=1$, but this need not be the case. This will imply $g_N=g_L=0$ and $g_Q>0$.

2. The matrix inversion that allows us to solve for $\mathbf{f}$ will no longer work numerically, because the right-hand-side vector will equal a vector of zeros and the matrix $C$ will be singular. The solution to this is proposed in Achdou et al. (2022): set an arbitrary element (say, element $j$) of the zero vector equal to 0.01, then set the $j$th diagonal of matrix $C$ equal to one and every other element in the $j$th row equal to zero (thus fixing $f(j)=0.01$). Then, after inverting this system, re-normalize the solution to this system so that it sums to one.
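
The second adjustment can be illustrated on a small example. The sketch below is not the paper's code: it uses a hypothetical pure-Python Gaussian elimination routine in place of a numerical library, and the function names are assumptions. It pins one element of the solution, solves the modified (nonsingular) system, and re-normalizes so that the solution sums to one.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-14:
            raise ValueError("matrix is numerically singular")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def stationary_from_singular(C, j=0, pin=0.01):
    """Solve C f = 0 with C singular by pinning f[j] = pin, then normalizing."""
    n = len(C)
    A = [row[:] for row in C]
    b = [0.0] * n
    A[j] = [0.0] * n        # overwrite row j ...
    A[j][j] = 1.0           # ... so that it reads f[j] = pin
    b[j] = pin
    f = solve_linear(A, b)
    total = sum(f)
    return [x / total for x in f]
```

For example, the singular matrix `[[-1, 0.5, 0], [1, -1, 1], [0, 0.5, -1]]` (the transpose of a three-state generator) yields the stationary distribution $(0.25, 0.5, 0.25)$ regardless of which element is pinned.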

## Balanced-growth path: Simulation of firm distribution 

The distribution of firm sales, profits, valuations, and other variables is computed via simulation. Specifically, using the fact that we know the distribution of type and age $m_i(\tau)$, moments can be calculated in two steps. First, simulate a single cohort of firms over their lifecycle from $\tau=0$ onward, computing conditional moments at each $\tau$. Then, use $m_i(\tau)$ to compute unconditional moments after the simulation is complete.

### Dimension reduction and state space

While Appendix 12 shows that we can analytically solve for the firm distribution over the states $\{a_i,n,\tau\}$, this distribution does not contain enough information to infer the distributions of sales, profits, or valuations. For that, we also need to know the distribution over possible combinations of blueprint characteristics $\{i_{2j},\hat{q}_{jt}\}_{j=1}^n$ for every $n$ and $a_i$. The dimensionality of this space is far too large to solve numerically: for a grid of $n$ with $n_{\rm max}$ points and a grid of $\hat{q}$ with $2K+1$ points, there are $\sum_{n=1}^{n_{\rm max}}2(3(2K+1))^n$ possibilities. To resolve this problem, I first show that we can collapse a firm's quality into just three sufficient statistics, then approximate the distribution of these statistics over the firm lifecycle via simulation.

Each firm has $n=\sum_{i_2\in\{h,\ell,\varnothing\}}n_{i_2}$ blueprints. Define the average quality of blueprints within the firm with followers $i_2$ as $$
\tilde{q}_{i_2} \equiv \frac{1}{n_{i_2}}\sum_{j=1}^{n_{i_2}}e^{(\eta-1)\hat{q}_j}.
$$ We will show that $\{n_{i_2},\tilde{q}_{i_2}\}_{i_2\in\{h,\ell,\varnothing\}}$ is sufficient information for characterizing the full distribution of sales, profits, R&D and M&A expenditures, cash flows, and market values. The rescaled revenues of firm $i$ are $$
\frac{p_{it}y_{it}}{Y_t/L_t} = \frac{L_t}{N_t}\left(\frac{a_{i_1}}{\bar{a}}\right)^{\eta-1}\sum_{i_2\in\{h,\ell,\varnothing\}}n_{i_2}\tilde{q}_{i_2}\left(\frac{\mu_{i_1i_2}}{\mathcal{M}_{\eta-1}}\right)^{-(\eta-1)}.
$$ The firm-level markup is $$
\mu_{it} = \left(\sum_{j}\frac{p_{ijt}y_{ijt}}{p_{it}y_{it}}\mu_{ijt}^{-1}\right)^{-1} = \frac{\sum_{i_2\in\{h,\ell,\varnothing\}}n_{i_2}\tilde{q}_{i_2}\mu_{i_1i_2}^{-(\eta-1)}}{\sum_{i_2\in\{h,\ell,\varnothing\}}n_{i_2}\tilde{q}_{i_2}\mu_{i_1i_2}^{-\eta}},
$$ and hence rescaled profits are $$
\hat{\Pi}_{it} = \left(1-\frac{1}{\mu_{it}}\right)\frac{p_{it}y_{it}}{Y_t/L_t}.
$$ From this it is straightforward to see that cash flows equal $\hat{D}_{it} = \hat{\Pi}_{it} - \hat{w}_{\rm x}x_i$, and that rescaled firm value equals $$
\hat{V}_{it} = \sum_{i_2\in\{h,\ell,\varnothing\}}n_{i_2}(v_{i_1i_2}^{\rm P}\tilde{q}_{i_2} + v_{i_1}^{\rm G}).
$$ These expressions confirm that the states $\{a_i,\{n_{i_2},\tilde{q}_{i_2}\}_{i_2\in\{h,\ell,\varnothing\}}\}$ are sufficient information for the variables in which we are interested.
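
The mapping from the sufficient statistics to firm outcomes can be sketched as below. This is an illustration only: the function name, the dictionary keys `"h"`, `"l"`, `"none"` (for $i_2=\varnothing$), and the arguments `a_rel` (for $a_{i_1}/\bar{a}$) and `M_agg` (for $\mathcal{M}_{\eta-1}$) are hypothetical, and skilled-labor costs are omitted, so the function returns profits rather than cash flows.

```python
def firm_outcomes(n, q_tilde, mu, eta, LN, a_rel, M_agg, vP, vG):
    """Rescaled sales, markup, profits, and value of one firm.

    n, q_tilde, mu, vP are dicts keyed by follower type ("h", "l", "none").
    """
    types = list(n)
    num = sum(n[k] * q_tilde[k] * mu[k] ** (-(eta - 1)) for k in types)
    den = sum(n[k] * q_tilde[k] * mu[k] ** (-eta) for k in types)
    if den == 0:
        raise ValueError("firm holds no blueprints")
    # (mu / M)^{-(eta-1)} = M^{eta-1} * mu^{-(eta-1)}
    sales = LN * a_rel ** (eta - 1) * M_agg ** (eta - 1) * num
    markup = num / den
    profits = (1 - 1 / markup) * sales
    value = sum(n[k] * (vP[k] * q_tilde[k] + vG) for k in types)
    return {"sales": sales, "markup": markup, "profits": profits, "value": value}
```

If every product of the firm carries the same markup, the firm-level markup collapses to that common value, which is a quick check on the weighting.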

The simulation exploits the fact that we know the distribution of ages and types $m_i(\tau)$ in closed form. This means that, for each $a_i$, we can simulate a single cohort of firms from inception ($\tau=0$) to some terminal age ($\tau=\tau_{\rm max}$), keeping track of moments conditional on $\{a_i,\tau\}$, and then use $m_i(\tau)$ to compute unconditional moments after simulating. I discretize the age grid with equal time increments $\Delta t$: $$
\tau \in \{0,\Delta t,\dots,\tau_{\rm max}-\Delta t,\tau_{\rm max}\},
$$ where the maximum age $\tau_{\rm max}$ is chosen such that the truncated mass of firms above $\tau_{\rm max}$, $\int_{\tau_{\rm max}}^\infty m(\tau)d\tau$, is close to zero. For convenience, I re-index age $\tau$ to time $t$ and let time start at $t=0$. The simulation begins with $M_{\rm sim}$ firms for each $a_i$. At each time step, I store the firms' characteristics $\{n_{i_2 t},\tilde{q}_{i_2 t}\}_{i_2\in\{h,\ell,\varnothing\}}$.

### Initialization

At the beginning of its lifecycle ($\tau=0$), each firm begins with a single product $n_{i_2}=1$ for some $i_2$ that has its own initial relative quality $\hat{q}_{i_20}$. A share $\Phi_{{\rm n}i}/(\Phi_{{\rm n}i}+\Phi_{{\rm q}i})$ of these entrants will begin with a blueprint of type $i_2=\varnothing$, for which the relative quality is $\hat{q}_{\varnothing0}=\log\lambda$. A share $\Phi_{{\rm q}i}f_h/(\Phi_{{\rm n}i}+\Phi_{{\rm q}i})$ will possess type $i_2=h$ with relative quality $\hat{q}_{h0}\sim f_{hi_2,\hat{q}-\log\lambda}/f_h$. And the remaining entrants will have type $i_2=\ell$ with analogous quality. Numerically, to be able to draw $\hat{q}_{i_20}$ from the given quality distributions, I create piecewise CDFs over the discretized support for $\hat{q}$ and invert simulated qualities from a set of uniform random variables. Each $\hat{q}_{i_20}$ implies a corresponding starting point $\{\tilde{q}_{i_20}\}$, with $\tilde{q}_{i_20}=0$ for those $i_2$ with $n_{i_2}=0$. For each firm, we store the values $\{\tilde{q}_{i_20},n_{i_20}\}_{i_2}$.
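
Inverse-CDF sampling from a discretized quality distribution can be sketched as follows. This is a minimal illustration with hypothetical function names; it treats the CDF as a step function over the grid points (one simple reading of "piecewise CDF") and is not necessarily the interpolation scheme used in the paper.

```python
import bisect
import random

def build_cdf(weights):
    """Cumulative distribution over grid points from nonnegative weights."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have positive mass")
    cdf, running = [], 0.0
    for w in weights:
        running += w
        cdf.append(running / total)
    cdf[-1] = 1.0  # guard against floating-point shortfall
    return cdf

def draw_quality(grid, cdf, rng):
    """Invert the step CDF at a uniform draw u in [0, 1)."""
    u = rng.random()
    idx = bisect.bisect_right(cdf, u)  # first index whose CDF value exceeds u
    return grid[min(idx, len(grid) - 1)]
```

Because `bisect_right` returns the first grid point whose cumulative mass strictly exceeds the uniform draw, grid points with zero weight are never selected.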

### Simulation of firm characteristics

At each time step $t>0$, we simulate the evolution of blueprint counts and quality statistics. At time $t+\Delta t$, the simulated number of monopoly products for a firm with $n_{\varnothing t}$ current blueprints is $$
n_{\varnothing,t+\Delta t} = \max\{n_{\varnothing t} + J_{\varnothing,t+\Delta t}^+ - J_{\varnothing,t+\Delta t}^-,0\},
$$ where $$
J_{\varnothing,t+\Delta t}^+\sim\mathrm{Poisson}((\Phi_{{\rm n}i}+\mathbb{I}_{i_1=h}\Psi_hf_{\ell\varnothing})n_t\Delta t),
$$ is the number of type-$\varnothing$ blueprints added and $$
J_{\varnothing,t+\Delta t}^-\sim\mathrm{Poisson}((\delta+\mathbb{I}_{i_1=\ell}\Psi_hf_h)n_{\varnothing t}\Delta t),
$$ is the number of type-$\varnothing$ blueprints lost. Changes in $n_h$ and $n_\ell$ are simulated accordingly. The average quality of the portfolio of monopoly products then evolves approximately according to the discretized law of motion $$
\tilde{q}_{\varnothing,t+\Delta t} = \frac{1}{n_{\varnothing,t+\Delta t}}\left(n_{\varnothing t}\tilde{q}_{\varnothing t}e^{ - (\eta-1)g_Q\Delta t} + \tilde{q}_{\varnothing t}^+J_{\varnothing,t+\Delta t}^+ - \tilde{q}_{\varnothing t}^-J_{\varnothing,t+\Delta t}^-\right).
$$ The first term on the right-hand side is the current average quality, after depreciation. The second term is the change in average quality from new blueprints: $$
\tilde{q}_{\varnothing t}^+J_{\varnothing,t+\Delta t}^+ \equiv \sum_{j=1}^{J_{\varnothing,t+\Delta t}^+}e^{(\eta-1)\hat{q}_{\varnothing jt}^{\rm new}},
$$ where the new blueprints' qualities are drawn independently from the distribution $$
\hat{q}_{\varnothing jt}^{\rm new} \sim \begin{cases}
\mathbb{I}_{\hat{q}=\log\lambda} & \quad\textrm{with probability }\Phi_{{\rm n}i}/(\Phi_{{\rm n}i}+\mathbb{I}_{i=h}\Psi_hf_{\ell\varnothing}), \\
f_{\ell\varnothing\hat{q}|\ell\varnothing} & \quad\textrm{with probability }\mathbb{I}_{i=h}\Psi_hf_{\ell\varnothing}/(\Phi_{{\rm n}i}+\mathbb{I}_{i=h}\Psi_hf_{\ell\varnothing}).
\end{cases}
$$ The third and final term in the law of motion above captures the change from lost blueprints, with $\tilde{q}_{\varnothing t}^-$ representing the average quality of those blueprints. The exact distribution from which $\tilde{q}_{\varnothing t}^-$ is drawn requires knowledge of the entire current quality distribution $\{\hat{q}_{\varnothing jt}\}_{j=1}^{n_{\varnothing t}}$. Because it is infeasible to keep track of all of these qualities (as explained above), I approximate the average quality of a lost blueprint with the average quality of all blueprints: $\tilde{q}_{\varnothing t}^-\approx \tilde{q}_{\varnothing t}$. This approximation has two appealing properties. First, it means that the change in total quality from product loss will on average be correct for all firms. Second, the approximation will be nearly exact for firms with large $n$ (and these large firms get the most weight in macroeconomic moments). I confirm that this approximation does not introduce significant error by verifying that simulation-implied moments that depend on the quality distribution do not differ significantly from the exact equilibrium moments computed before the simulation.

The procedure for simulating the blueprint counts and average qualities for the other two types $i_2\in\{h,\ell\}$ is analogous to this one, with the appropriate changes to the Poisson intensities and the quality distributions.
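
One time step of this simulation, for a single blueprint type, can be sketched as below. The sketch rests on stated assumptions: the function names are hypothetical, the Poisson sampler is Knuth's multiplication method (chosen here only because the Python standard library has no Poisson generator), and lost blueprints are valued at the current average quality, as in the approximation described above.

```python
import math
import random

def poisson(rate, rng):
    """Draw a Poisson variate (Knuth's method; fine for small rate * dt)."""
    if rate <= 0:
        return 0
    threshold = math.exp(-rate)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

def step_quality(n, q, J_plus, new_sum, J_minus, eta, g_Q, dt):
    """Update the count n and average quality q of one blueprint type.

    new_sum : sum of exp((eta - 1) * q_new) over the J_plus new blueprints
    """
    n_next = max(n + J_plus - J_minus, 0)
    if n_next == 0:
        return 0, 0.0
    total = (n * q * math.exp(-(eta - 1) * g_Q * dt)  # depreciated stock
             + new_sum                                # gained blueprints
             - q * J_minus)                           # lost at average quality
    return n_next, max(total, 0.0) / n_next
```

A caller would draw `J_plus` and `J_minus` with `poisson` using the intensities in the text, draw the new blueprints' qualities, and then pass the results to `step_quality`.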

### Computation of moments

As shown above, the set $\{a_i,\{n_{i_2},\tilde{q}_{i_2}\}_{i_2}\}$ is sufficient information to compute a firm's (rescaled) sales, profits, and value. Any moment (mean, covariance, etc.) of any function of these variables at time step $t$ (age $\tau$) can be expressed in terms of expected values (means). For example, to compute the cross-sectional variance of firm markups, we can use the fact that $$
\textrm{var}(\mu_{it}|a_i,\tau) = \mathbb{E}[\mu_{it}^2|a_i,\tau] - \mathbb{E}[\mu_{it}|a_i,\tau]^2,
$$ computing and storing each of these two expectations at each iteration of the simulation. Then, after the simulation, we can use the distribution $m_i(\tau)$ to compute $$
\mathbb{E}[\mu_{it}] = \sum_{i\in\{h,\ell\}}\int_0^\infty m_i(\tau)\mathbb{E}[\mu_{it}|a_i,\tau]d\tau,
$$ and $$
\textrm{var}(\mu_{it}) = \sum_{i\in\{h,\ell\}}\int_0^\infty m_i(\tau)\biggl(\textrm{var}(\mu_{it}|a_i,\tau) + \mathbb{E}[\mu_{it}-\mathbb{E}[\mu_{it}]|a_i,\tau]^2\biggr)d\tau.
$$ The same logic applies to any moment of the firm distribution in which we are interested.
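
The aggregation from conditional to unconditional moments is an application of the law of total variance. A minimal sketch, with a hypothetical function name and with the integral over $\tau$ replaced by a sum over discrete (type, age) cells whose weights stand in for $m_i(\tau)\Delta t$:

```python
def unconditional_moments(weights, cond_mean, cond_second):
    """Combine per-cell moments into an unconditional mean and variance.

    weights     : mass of each (type, age) cell; normalized internally
    cond_mean   : E[x | cell]
    cond_second : E[x^2 | cell]
    """
    total = sum(weights)
    w = [x / total for x in weights]
    mean = sum(wi * mi for wi, mi in zip(w, cond_mean))
    var = 0.0
    for wi, mi, si in zip(w, cond_mean, cond_second):
        cond_var = si - mi ** 2                    # within-cell variance
        var += wi * (cond_var + (mi - mean) ** 2)  # plus between-cell term
    return mean, var
```

Only the two conditional expectations per cell need to be stored during the simulation; the between-cell term is computed afterward, once the unconditional mean is known.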

## Transition path: Equilibrium solution method 

### State space and initial and terminal conditions

Starting at some initial time $t_0$, a transition-path equilibrium begins with a blueprint distribution $\{f_{i_1i_2\hat{q}t_0}\}$ and ratio $L_{t_0}/N_{t_0}$. After $t_0$, it may feature any arbitrary time path of model parameters, provided that those parameters eventually stabilize at some constant values (and are consistent with a well-defined solution). This appendix shows the numerical solution for declining innovation productivity $\varphi_t$: the economy transitions from an initially high innovation productivity $\varphi_{t_0}=\varphi^+$ to a steady state with low productivity $\lim_{t\to\infty}\varphi_t = \varphi^-$. Extending to other parameters just requires adding $t$ subscripts to those parameters. Because every transition-path equilibrium converges to a balanced-growth equilibrium, the terminal condition for the solution method is the terminal balanced-growth path.

The relevant state space for a transition path includes both the balanced-growth states $\{i_1,i_2,\hat{q}\}$ and the new time dimension $t\geq t_0$. Numerically, we can discretize this time grid into increments $\Delta t$ as $$
t\in\{t_{\rm min},t_{\rm min}+\Delta t,\dots,t_{\rm max}-\Delta t,t_{\rm max}\},
$$ where $t_{\rm min}=t_0$ and $t_{\rm max}$ is sufficiently large that the economy has converged near the new steady state. The solution method then takes as its terminal condition ($t=t_{\rm max}$) the balanced-growth equilibrium, solved as described in Appendix 13.1 above.

### Overview of solution algorithm

The algorithm is a backward-forward iteration scheme:

1. Starting from the final steady-state value functions $\{v_{i_1i_2\hat{q}t_{\rm max}}^{\rm P}\}$ and $\{v_{{i_1}t_{\rm max}}^{\rm G}\}$, solve the HJBs backward to $t=t_{\rm min}$, re-computing R&D and M&A policies at each time step.

2. Starting from the initial steady-state distribution $f_{i_1i_2\hat{q}t_{\rm min}}$, solve the KFEs forward to $t=t_{\rm max}$, re-solving for profits, aggregates (including $L_t/N_t$), and R&D and M&A policies at each time step.

3. Check if $\{f_{i_1i_2\hat{q}t}\}$ has converged at all points in the state space $\{i_1,i_2,\hat{q},t\}$; if not, return to step 1.

### Initialization

To initiate this algorithm, we need an initial guess for the path of the distribution over time $t$. The solution is not particularly sensitive to this choice. For simplicity, I start with the stationary distributions under $\varphi_{t_{\rm min}}$ and $\varphi_{t_{\rm max}}$ ($f_{i_1i_2\hat{q}t_{\rm min}}^0$ and $f_{i_1i_2\hat{q}t_{\rm max}}^0$, respectively), then interpolate them along the time grid as if they evolve in proportion to the parameter $\varphi_t$: $$
f_{i_1i_2\hat{q}t}^0 = f_{i_1i_2\hat{q}t_{\rm min}}^0 + \frac{\varphi_{t_{\rm min}}-\varphi_t}{\varphi_{t_{\rm min}}-\varphi_{t_{\rm max}}}(f_{i_1i_2\hat{q}t_{\rm max}}^0-f_{i_1i_2\hat{q}t_{\rm min}}^0)
$$ To solve the initial HJB equations, we need a sequence $\{\Phi_{{\rm n}i_1t},\Phi_{{\rm q}i_1t},\Psi_{i_1t},LN_t\}$ from $t=t_{\rm min}$ to $t=t_{\rm max}-\Delta t$. I therefore similarly interpolate $\{v_{i_1i_2\hat{q}t}^{\rm P}\}$ and $\{v_{i_1t}^{\rm G}\}$, then use these interpolated series to compute an initial sequence $\{\Phi_{{\rm n}i_1t}^0,\Phi_{{\rm q}i_1t}^0,\Psi_{i_1t}^0,LN_t^0\}$.
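
The interpolation of the initial guess can be sketched as follows. The function name and the representation of each distribution as a flat list are assumptions made for this illustration; the weight on the terminal distribution is the ratio $(\varphi_{t_{\rm min}}-\varphi_t)/(\varphi_{t_{\rm min}}-\varphi_{t_{\rm max}})$ from the formula above.

```python
def interpolate_guess(f_min, f_max, phi_path):
    """Initial guess for the distribution at each point of the time grid.

    f_min, f_max : stationary distributions at t_min and t_max (flat lists)
    phi_path     : innovation productivity at each grid point, t_min to t_max
    """
    phi_start, phi_end = phi_path[0], phi_path[-1]
    span = phi_start - phi_end
    if span == 0:
        raise ValueError("phi does not change along the path")
    path = []
    for phi_t in phi_path:
        weight = (phi_start - phi_t) / span  # 0 at t_min, 1 at t_max
        path.append([a + weight * (b - a) for a, b in zip(f_min, f_max)])
    return path
```

The same routine could be reused for the value functions, since they are interpolated in the same way.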

### Solving HJBs backwards

HJBs are solved backward, starting from the known terminal values $\{v_{i_1i_2\hat{q}t_{\rm max}}^{\rm P}\}$ and $\{v_{i_1t_{\rm max}}^{\rm G}\}$. The HJB equation for product values is $$
(\rho + \delta_t + (\eta-1)g_{Qt})v_{i_1i_2\hat{q}t}^{\rm P} = \pi_{i_1i_2\hat{q}t} + \mathbb{I}_{i_1=\ell}\Psi_{ht}f_{ht}(1-\varrho)(v_{hi_2\hat{q}t}^{\rm P}-v_{\ell i_2\hat{q}t}^{\rm P}) + \frac{\partial}{\partial t}v_{i_1i_2\hat{q}t}^{\rm P}.
$$ Approximating the time derivative with a forward difference implies, at iteration $k$, $$
v_{i_1i_2\hat{q}t}^{{\rm P},k} = \frac{v_{i_1i_2\hat{q},t+\Delta t}^{{\rm P},k}/\Delta t + \pi_{i_1i_2\hat{q}t}^{k-1} + \mathbb{I}_{i_1=\ell}\Psi_{ht}^{k-1}f_{ht}^{k-1}(1-\varrho)v_{hi_2\hat{q}t}^{{\rm P},k}}{1/\Delta t + \rho + \delta_t^{k-1} + (\eta-1)g_{Qt}^{k-1} + \mathbb{I}_{i_1=\ell}\Psi_{ht}^{k-1}f_{ht}^{k-1}(1-\varrho)}.
$$ As in Appendix 13.1, the value of growth opportunities $v_t^{\rm G} = [v_{ht}^{\rm G}\quad v_{\ell t}^{\rm G}]^\top$ solves the two-by-two linear system $$
C_t^{{\rm G},k}v_t^{{\rm G},k} = \tilde{v}_t^{{\rm G},k},
$$ where (suppressing $k$ superscripts for the moment) $$
\tilde{v}_t^{\rm G} = \begin{bmatrix}
v_{h,t+\Delta t}^{\rm G}/\Delta t + (1-\varepsilon)(\Phi_{{\rm n}ht}\bar{v}_{{\rm n}ht}^{\rm P}+\Phi_{{\rm q}ht}\bar{v}_{{\rm q}ht}^{\rm P}+\Psi_{ht}f_{\ell t}\bar{v}_{{\rm m}ht}^{\rm P}) \\
v_{\ell,t+\Delta t}^{\rm G}/\Delta t + (1-\varepsilon)(\Phi_{{\rm n}\ell t}\bar{v}_{{\rm n}\ell t}^{\rm P}+\Phi_{{\rm q}\ell t}\bar{v}_{{\rm q}\ell t}^{\rm P})
\end{bmatrix}
$$ and where $$
C_t^{\rm G} = \begin{bmatrix} c_{hht} & c_{h\ell t} \\ c_{\ell ht} & c_{\ell\ell t} \end{bmatrix},
$$ with entries $$\begin{align}
c_{hht} &= 1/\Delta t + \rho + \delta_t - (1-\varepsilon)(\Phi_{{\rm n}ht}+\Phi_{{\rm q}ht}+\Psi_{ht}f_{\ell t}\varrho), \\
c_{h\ell t} &= (1-\varepsilon)\Psi_{ht}f_{\ell t}\varrho, \\
c_{\ell ht} &= -\Psi_{ht}f_{ht}(1-\varrho), \\
c_{\ell\ell t} &= 1/\Delta t + \rho + \delta_t -(1-\varepsilon)(\Phi_{{\rm n}\ell t}+\Phi_{{\rm q}\ell t})+\Psi_{ht}f_{ht}(1-\varrho).
\end{align}$$ Inverting this system gives us $\{v_{i_1t}^{{\rm G},k}\}$.
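
A single backward step for product values can be sketched as below, for one follower type and one quality grid point. This is an illustration with hypothetical names: `acq_rate` stands for $\Psi_{ht}f_{ht}(1-\varrho)$, and the dictionaries are keyed by the owner's type. The high type is solved first because its updated value enters the low type's equation.

```python
def backward_step_product_values(v_next, pi, rho, delta, eta, g_Q, acq_rate, dt):
    """Step product values from t + dt back to t.

    v_next, pi : dicts keyed by owner type ("h", "l")
    acq_rate   : Psi_h * f_h * (1 - varrho), which applies only to type "l"
    """
    base = 1 / dt + rho + delta + (eta - 1) * g_Q
    v_h = (v_next["h"] / dt + pi["h"]) / base
    v_l = (v_next["l"] / dt + pi["l"] + acq_rate * v_h) / (base + acq_rate)
    return {"h": v_h, "l": v_l}
```

With constant inputs, the balanced-growth values are a fixed point of this step, which provides a simple check that the terminal condition is consistent with the discretization.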

### Solving KFEs forward

After solving HJBs backward, we solve forward from $t_{\rm min}$ (i.e., from $LN_{t_{\rm min}}$ and $f_{i_1i_2\hat{q}t_{\rm min}}$), taking the path of blueprint values as given. At each time step, compute the wage $\hat{w}_{{\rm x}t}^k$ and policies $\{x_{{\rm n}i_1t}^k,x_{{\rm q}i_1t}^k,x_{{\rm m}i_1t}^k\}$. These imply intensities $\{\Phi_{{\rm n}i_1t}^k,\Phi_{{\rm q}i_1t}^k,\Psi_{i_1t}^k\}$ and $\delta_t^k$ and growth rates $g_{Nt}^k$ and $g_{Qt}^k$. Using these growth rates, we have $$
LN_{t+\Delta t}^k = LN_t^k\exp\{(g_L-g_{Nt}^k)\Delta t\}
$$ Next, we solve forward for the total blueprint share $f_{h,t+\Delta t}^k$ using the aggregated KFE $$
\frac{\partial}{\partial t}f_{ht} + g_{Nt}f_{ht} = (\Phi_{{\rm n}ht} + \Phi_{{\rm q}ht})(\nu_{ht} + f_{ht}) + \Psi_{ht}f_{ht}f_{\ell t} - \delta_tf_{ht}.
$$ Discretizing this expression over time with a forward difference implies that the distribution at time $t+\Delta t$ in iteration $k$ equals $$
f_{h,t+\Delta t}^k = f_{ht}^k + \left((\Phi_{{\rm n}ht}^k+\Phi_{{\rm q}ht}^k)(\nu_{ht}^k + f_{ht}^k) + \Psi_{ht}^kf_{ht}^k(1-f_{ht}^k) - (\delta_t^k + g_{Nt}^k)f_{ht}^k\right)\Delta t.
$$
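
The two forward updates above, for $LN_t$ and for the total blueprint share $f_{ht}$, can be sketched together. The function and argument names are hypothetical; the step is an explicit (forward Euler) update, so in practice $\Delta t$ would need to be small enough for the share to stay within $[0,1]$.

```python
import math

def forward_step(LN, f_h, g_L, g_N, delta, Phi_n_h, Phi_q_h, Psi_h, nu_h, dt):
    """Advance the ratio LN and the high-type blueprint share f_h by dt."""
    LN_next = LN * math.exp((g_L - g_N) * dt)
    drift = ((Phi_n_h + Phi_q_h) * (nu_h + f_h)   # innovation by high types
             + Psi_h * f_h * (1 - f_h)            # acquisitions from low types
             - (delta + g_N) * f_h)               # destruction and dilution
    return LN_next, f_h + drift * dt
```

When $g_{Nt}=g_L$ the ratio $LN_t$ is unchanged, and when the drift term is zero the share is stationary, as on the balanced-growth path.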

Finally, by the same reasoning as for the total blueprint shares, we can discretize the full system of KFEs over the grids for $t$ and $\hat{q}$ and solve forward in similar fashion. As in the balanced-growth case, we must make two numerical adjustments to ensure products are not lost at the boundaries of the quality grid. First, we must eliminate the loss from quality depreciation at $\hat{q}_{\rm min}$; second, we must eliminate the loss from quality improvements in the uppermost set of qualities. See Appendix 13.1 for specifics.

For the distribution of monopoly products ($i_1=i$, $i_2=\varnothing$), we have $$\begin{multline}
f_{i\varnothing\hat{q},t+\Delta t}^k = f_{i\varnothing\hat{q}t}^k + \biggl((f_{i\varnothing,\hat{q}+\log\lambda/m,t}^k-\mathbb{I}_{\hat{q}>\hat{q}_{\rm min}}f_{i\varnothing\hat{q}t}^k)\frac{g_{Qt}}{\log\lambda/m} + \Phi_{{\rm n}it}^k(\nu_{it}^k + f_{it}^k)\mathbb{I}_{\hat{q}=\log\lambda} \\
+ \mathbb{I}_{i=h}\Psi_{ht}^kf_{ht}^kf_{\ell\varnothing\hat{q}t}^k - ((1 - \mathbb{I}_{\hat{q}>\hat{q}_{\rm max}-\log\lambda})\delta_t^k + \mathbb{I}_{i=\ell}\Psi_{ht}^kf_{ht}^k + g_{Nt}^k)f_{i\varnothing\hat{q}t}^k\biggr)\Delta t.
\end{multline}$$ For the distribution of same-type competitors ($i_1=i_2=i$), we have $$\begin{multline}
f_{ii\hat{q},t+\Delta t}^k = f_{ii\hat{q}t}^k + \biggl((f_{ii,\hat{q}+\log\lambda/m,t}^k-\mathbb{I}_{\hat{q}>\hat{q}_{\rm min}}f_{ii\hat{q}t}^k)\frac{g_{Qt}}{\log\lambda/m} \\
+ \Phi_{{\rm q}it}^k(\nu_{it}^k + f_{it}^k)[f_{ih,\hat{q}-\log\lambda,t}^k+f_{i\ell,\hat{q}-\log\lambda,t}^k+f_{i\varnothing,\hat{q}-\log\lambda,t}^k] \\
+ \mathbb{I}_{i=h}\Psi_{ht}^kf_{ht}^kf_{\ell h\hat{q}t}^k - ((1 - \mathbb{I}_{\hat{q}>\hat{q}_{\rm max}-\log\lambda})\delta_t^k+\mathbb{I}_{i=\ell}\Psi_{ht}^kf_{ht}^k + g_{Nt}^k)f_{ii\hat{q}t}^k\biggr)\Delta t.
\end{multline}$$ For $i_1=h$ and $i_2=\ell$, we have $$\begin{multline}
f_{h\ell\hat{q},t+\Delta t}^k = f_{h\ell\hat{q}t}^k + \biggl((f_{h\ell,\hat{q}+\log\lambda/m,t}^k-\mathbb{I}_{\hat{q}>\hat{q}_{\rm min}}f_{h\ell\hat{q}t}^k)\frac{g_{Qt}}{\log\lambda/m} \\
+ \Phi_{{\rm q}ht}^k(\nu_{ht}^k + f_{ht}^k)[f_{\ell\ell,\hat{q}-\log\lambda,t}^k+f_{\ell h,\hat{q}-\log\lambda,t}^k+f_{\ell\varnothing,\hat{q}-\log\lambda,t}^k] \\
+\Psi_{ht}^kf_{ht}^kf_{\ell\ell\hat{q}t}^k - ((1 - \mathbb{I}_{\hat{q}>\hat{q}_{\rm max}-\log\lambda})\delta_t^k + g_{Nt}^k)f_{h\ell\hat{q}t}^k\biggr)\Delta t.
\end{multline}$$ And for $i_1=\ell$ and $i_2=h$, we have $$\begin{multline}
f_{\ell h\hat{q},t+\Delta t}^k = f_{\ell h\hat{q}t}^k + \biggl((f_{\ell h,\hat{q}+\log\lambda/m,t}^k-\mathbb{I}_{\hat{q}>\hat{q}_{\rm min}}f_{\ell h\hat{q}t}^k)\frac{g_{Qt}}{\log\lambda/m} \\
+ \Phi_{{\rm q}\ell t}^k(\nu_{\ell t}^k + f_{\ell t}^k)[f_{hh,\hat{q}-\log\lambda,t}^k+f_{h\ell,\hat{q}-\log\lambda,t}^k+f_{h\varnothing,\hat{q}-\log\lambda,t}^k] \\
- ((1 - \mathbb{I}_{\hat{q}>\hat{q}_{\rm max}-\log\lambda})\delta_t^k+\Psi_{ht}^kf_{ht}^k + g_{Nt}^k)f_{\ell h\hat{q}t}^k\biggr)\Delta t.
\end{multline}$$ Finally, at each time step in this iteration, we can re-compute the macroeconomic aggregates that will be needed for the next iteration of HJBs: the aggregates $\{\bar{a}_t^k,\mathcal{M}_{\eta-1,t}^k\}$, from which we get profits $\pi_{i_1i_2 t}^k$.

### Convergence criteria and final computations

The convergence criterion is that the entire distribution $f_{i_1i_2\hat{q}t}$ has converged over the entire time path between iterations. I implement this by checking whether the average error per point in time satisfies $$
\frac{||\mathbf{f}^k-\mathbf{f}^{k-1}||}{\mathrm{dim}(t)}<\epsilon_{\rm outer}^{\rm tol},
$$ where $||\cdot||$ is the Euclidean norm, $\mathrm{dim}(t) \equiv 1+(t_{\rm max}-t_{\rm min})/\Delta t$ is the number of points in the time grid, and $\epsilon_{\rm outer}^{\rm tol}$ is the same (very small) tolerance threshold used in the balanced-growth solution algorithm. If this condition is satisfied, then the iteration is complete. Otherwise, we reinitiate at iteration $k+1$.

## Transition path: Simulation of firm distribution 

The simulation of firms along the transition path consists of two steps: first, simulate a panel of firms along the balanced-growth path; and then, simulate that panel of firms forward under the transition-path equilibrium, adding a new cohort of firms at each iteration.

In the first step, we simulate a cohort of $M_{\rm sim}$ firms using the simulation method in Appendix 13.2 above. Then, for each type $a_i$ and age $\tau\in\{0,\dots,\tau_{\rm max}-\Delta\tau,\tau_{\rm max}\}$, we keep $M_{\rm sim}(a_i,\tau) = \lfloor M_{\rm sim}\frac{m_i(\tau)}{m(\tau=0)}\rfloor$ of them, giving each a unique firm identifier, so that the distribution of firms in the panel reflects the true theoretical distribution $m_i(\tau)$.

In the second step, we simulate this panel forward in time $t\in\{t_0,\dots,t_{\rm end}-\Delta t,t_{\rm end}\}$. The method for simulating a firm forward and for drawing a cohort of new firms is, again, the same as in Appendix 13.2 above, the only difference being that at each point in time we now use the updated equilibrium path of the distribution, aggregates, and firm policies and values. To determine the number of new firms in each cohort, we need to compute the evolution of the entry rate. At the beginning of the transition ($t=t_0$), we have a constant ratio $N_t/M_t$ and constant entry and exit rates $\mathcal{E}_{\rm entry}$ and $\mathcal{E}_{\rm exit}$ from the balanced-growth path. Over the transition, the exit rate $\mathcal{E}_{{\rm exit},t+\Delta t}$ equals the number of exiting firms divided by the total number of firms (per unit of time $\Delta t$). The number of blueprints per firm evolves as $$
\frac{N_{t+\Delta t}}{M_{t+\Delta t}} = \frac{N_t}{M_t}\exp\{(g_{Nt}-(\mathcal{E}_{{\rm entry},t}-\mathcal{E}_{{\rm exit},t}))\Delta t\}.
$$ We can then use this ratio to compute the entry rate $\mathcal{E}_{{\rm entry},t+\Delta t}$, which determines the number of new firms that must be simulated.

# Details of model estimation 

## Estimation procedure 

I estimate the parameters by simulated method of moments (SMM). In the first step, I search for the parameters that minimize the distance between the model-implied moments and data moments. Recall that $\Theta$ denotes the vector of estimated parameters. Let $\hat{m}$ denote the vector of empirical moments and $m(\Theta)$ the corresponding moments in the model. Let $\Sigma$ denote the sample covariance matrix of moments $\hat{m}$. Then the SMM estimator is defined as $$
\hat{\Theta} = \mathop{\arg\!\min}\limits_{\Theta} \: \bigl\{(m(\Theta) - \hat{m})^\top \Sigma^{-1} (m(\Theta) - \hat{m}) \bigr\}.
$$ I find this minimum-distance estimate using the numerical approach recently proposed by Catherine et al. (2023), which has three steps.[^86] First, I solve and simulate the model for 3000 quasi-random combinations of parameters generated from a Halton sequence. Second, using the resulting dataset, I fit a cubic polynomial (with interactions) mapping parameters to moments. To increase precision around the correct estimate, I apply OLS weights proportional to the inverse squared error of each point. Third, using this fitted model, I apply the Nelder-Mead simplex algorithm to locally search for the minimum-distance estimate $\hat{\Theta}$.
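
The first of these three steps, drawing parameter combinations from a Halton sequence, can be sketched as below. This is an illustration under stated assumptions rather than the estimation code: the parameter bounds are hypothetical, and the polynomial fit and Nelder-Mead search are not shown.

```python
def radical_inverse(index, base):
    """Van der Corput radical inverse of a positive integer in the given base."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result

def halton_points(n_points, bounds, primes=(2, 3, 5, 7, 11, 13, 17)):
    """Return n_points parameter vectors spread over the box given by bounds."""
    if len(bounds) > len(primes):
        raise ValueError("need one prime base per parameter")
    points = []
    for i in range(1, n_points + 1):
        # Each coordinate uses its own prime base, then is rescaled to [lo, hi].
        point = [lo + (hi - lo) * radical_inverse(i, base)
                 for (lo, hi), base in zip(bounds, primes)]
        points.append(point)
    return points

# Hypothetical search box for two parameters (not the paper's bounds).
bounds = [(0.0, 0.05), (1.0, 1.5)]
draws = halton_points(3000, bounds)
```

Each element of `draws` would then be passed to the model solver to produce one row of the parameter-moment dataset used in the second step.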

I compute standard errors of the parameter estimates using the following asymptotic results. Let $J$ denote the Jacobian matrix of moments with respect to parameters: for moment $m_j\in m(\Theta)$ and parameter $\theta_k\in\Theta$, element $(j,k)$ of the Jacobian equals $$
J_{jk} = \frac{\partial m_j}{\partial \theta_k}.
$$ Standard asymptotic results imply that the (estimated) covariance matrix of parameters is given by $$
\hat{\Omega} = (J^\top \Sigma^{-1} J)^{-1}.
$$ Standard errors are then given by the square root of the diagonal elements of $\hat{\Omega}$: $$
\textrm{se}(\hat{\Theta}) = \sqrt{\textrm{diag}(\hat{\Omega})}.
$$ Numerically, I approximate the derivatives in $J$ using a double-sided finite difference with increments equal to 1% of each estimated parameter value (Judd 1998). I discuss the computation of the matrix $\Sigma$ next.
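
The standard-error calculation can be sketched in a few lines. The example below is a toy illustration, not the paper's code: `toy_moments` is a hypothetical linear model, the covariance matrix is made up, and the step rule assumes every parameter estimate is nonzero.

```python
import math

def central_jacobian(moment_fn, theta, rel_step=0.01):
    """J[j][k] = d m_j / d theta_k by two-sided differences with 1% steps."""
    n_moments = len(moment_fn(theta))
    jac = [[0.0] * len(theta) for _ in range(n_moments)]
    for k, value in enumerate(theta):
        h = rel_step * abs(value)  # assumes value != 0
        up, down = list(theta), list(theta)
        up[k], down[k] = value + h, value - h
        m_up, m_down = moment_fn(up), moment_fn(down)
        for j in range(n_moments):
            jac[j][k] = (m_up[j] - m_down[j]) / (2.0 * h)
    return jac

def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting for a small square matrix."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def smm_standard_errors(moment_fn, theta_hat, sigma):
    """Square roots of the diagonal of (J' Sigma^{-1} J)^{-1}."""
    jac = central_jacobian(moment_fn, theta_hat)
    omega = invert(matmul(matmul(transpose(jac), invert(sigma)), jac))
    return [math.sqrt(omega[k][k]) for k in range(len(theta_hat))]

# Hypothetical two-parameter, two-moment model.
def toy_moments(theta):
    return [2.0 * theta[0], theta[0] + 3.0 * theta[1]]

sigma = [[0.04, 0.0], [0.0, 0.09]]
se = smm_standard_errors(toy_moments, [1.0, 2.0], sigma)
```

In the exactly identified case, as here, $J$ is square and $\hat{\Omega}$ reduces to $J^{-1}\Sigma (J^{-1})^\top$.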

In the estimation of the balanced-growth path, $\Theta$ is the vector of seven balanced-growth parameters, and $\hat{m}$ and $m(\Theta)$ are the corresponding data and model moments. In the transition estimation, $\Theta$ is the vector of parameters $\{\kappa_\varphi,t_{\rm min},\varphi_\infty\}$ governing the path of innovation productivity $\varphi_t$, and the moments are the 45 values of aggregate R&D-to-value $\{w_{{\rm x}t}X_{{\rm r}t}/V_t\}$ from 1975--2020.

## Computation of empirical moments and standard errors 

As explained in Section 4.2, I estimate seven parameters on the balanced-growth path from seven moments in micro and macro data. The moments themselves are computed as described in the main text. The five moments depending on value, sales, profits, R&D, and M&A are computed using CRSP/Compustat merged with SDC Platinum. Profits in profit-to-sales ratios are defined as sales minus cost of goods sold. Value-to-sales and profit-to-sales ratios are computed using data from 1965--1975. R&D-to-value and M&A-to-value must be computed after this because, as explained in Appendix 9, R&D reporting begins in 1975 and M&A transaction data begin in 1978.

Output growth is the annualized percentage growth rate of real per capita quarterly GDP, as reported by the BEA. Specifically, from year $t$ to $t+0.25$ (i.e., across quarters), annualized growth equals $g_{y,t+0.25} = (y_{t+0.25}/y_t)^4 - 1$. The firm entry rate in year $t$ is computed from the Business Dynamics Statistics (BDS) as the number of entrants from year $t-1$ to year $t$ (computed as the net change in the number of firms from $t-1$ to $t$ plus the number of firm deaths from $t-1$ to $t$) divided by the number of firms in year $t-1$.
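
Both aggregate moments are simple transformations of the raw series. A minimal sketch follows; the function names and the input numbers are hypothetical, not BEA or BDS data.

```python
def annualized_growth(y_now, y_prev_quarter):
    """Quarter-over-quarter growth compounded to an annual rate."""
    return (y_now / y_prev_quarter) ** 4 - 1.0

def entry_rate(firms_prev, firms_now, deaths):
    """Entrants (net change in firms plus firm deaths) over last year's firm count."""
    entrants = (firms_now - firms_prev) + deaths
    return entrants / firms_prev

# Hypothetical inputs for illustration only.
g = annualized_growth(y_now=101.0, y_prev_quarter=100.0)
e = entry_rate(firms_prev=1000, firms_now=1020, deaths=100)
```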

Finally, for the sales-weighted standard deviations of value-to-sales and profit-to-sales, I rescale each variable by the corresponding aggregate value to remove level effects. For example, for value-to-sales, the moment (in the model notation) is equal to $$
\frac{\sqrt{\textrm{var}_{py}(V_{it}/(p_{it}y_{it}))}}{V_t/Y_t} = \sqrt{\int_0^{M_t} \frac{p_{it}y_{it}}{Y_t}\left(\frac{V_{it}/(p_{it}y_{it})}{V_t/Y_t} - 1\right)^2 di}.
$$ This rescaling removes the effect of the level of the aggregate ratios, which are mainly affected by other parameters. To ensure that the model captures the same amount of valuation dispersion as in the empirical decompositions, I do the same winsorizing of the cross-sectional data in each year (10th and 90th percentiles) as in the decompositions. Winsorizing also has the benefit of removing extreme dispersion in valuations that is caused by transitory shocks outside of the model (e.g., discount-rate shocks), which would lead to overestimates of the productivity gap $a_h/a_\ell$ and therefore overstate the effect of reallocation on the transition.
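
The rescaled dispersion moment can be computed directly from firm-level value and sales. The sketch below is an illustration only: the function name is hypothetical, and the winsorizing step is omitted.

```python
import math

def rescaled_weighted_sd(values, sales):
    """Sales-weighted st. dev. of value-to-sales, relative to the aggregate ratio."""
    total_sales = sum(sales)
    aggregate_ratio = sum(values) / total_sales  # V_t / Y_t
    variance = 0.0
    for v, s in zip(values, sales):
        share = s / total_sales                   # firm's sales share
        relative_ratio = (v / s) / aggregate_ratio
        variance += share * (relative_ratio - 1.0) ** 2
    return math.sqrt(variance)

# Two hypothetical firms with equal sales and value-to-sales ratios of 1 and 3.
dispersion = rescaled_weighted_sd(values=[1.0, 3.0], sales=[1.0, 1.0])
```

Because the aggregate ratio equals the sales-weighted mean of firm ratios, multiplying every firm's value by a common factor leaves the moment unchanged, which is the sense in which the rescaling removes level effects.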

Because some of the moments must use non-overlapping samples or time periods, I compute the empirical covariance matrix of moments $\Sigma$ in diagonal blocks, setting the unavailable off-diagonal elements to zero.[^87] For aggregate value-to-sales and the two cross-sectional standard deviations, I estimate the $3\times3$ covariance matrix by bootstrap. Specifically, I sample firms from Compustat in each year from 1965--1975, 1000 times, recomputing the moments each time, then compute the covariance of these sampled moments. Likewise, for aggregate R&D-to-value and M&A-to-value, I estimate a $2\times2$ matrix from 1975--1985. Finally, the diagonal elements corresponding to output growth and the firm entry rate (which are aggregated data) equal the respective squared standard errors.

## Additional details on identification of parameters 

### Identification of balanced-growth path

As further evidence of identification, Figure 21 plots the effect of changing model parameters on the corresponding identifying moments in the estimation of the balanced-growth equilibrium. In each case, the parameter on the x-axis is changed and all other parameters are held fixed at their estimated values. The direction of these local changes confirms the intuition for identification given in the main text.

<figure id="fig:momsversusparams" data-latex-placement="t">
<div class="center">
<p></p>
<table>
<tbody>
<tr>
<td style="text-align: center;"><img src="paperplots/est_momsvsparams" style="width:95.0%" alt="image" /></td>
<td style="text-align: center;"></td>
</tr>
</tbody>
</table>
</div>
<figcaption>The figure plots the effect of changing each parameter in the model on the corresponding identifying moment. The vertical red lines denote the estimated value. Each panel keeps all other parameters fixed at their estimated values and changes only the parameter on the x-axis.</figcaption>
</figure>

### Identification of transition path

Section 4.3 presents the equilibrium R&D-to-value ratio under the special case of inelastic labor supply ($\zeta=0$). Here I derive this in the more general case of elastic labor supply ($\zeta\geq0$). The total amount of R&D spending equals $$\begin{align}
w_{{\rm x}t}X_{{\rm r}t} &= w_{{\rm x}t}N_t\sum_{i\in\{h,\ell\}} f_{it}(x_{{\rm n}it}+x_{{\rm q}it}) \\
&= w_{{\rm x}t}^{-\frac{\varepsilon}{1-\varepsilon}}(\varepsilon\varphi_t)^{\frac{1}{1-\varepsilon}}N_t\biggl(\sum_{i\in\{h,\ell\}} f_{it}\left(((1-\alpha)\bar{v}_{{\rm n}it})^{\frac{1}{1-\varepsilon}}+(\alpha\bar{v}_{{\rm q}it})^{\frac{1}{1-\varepsilon}}\right)\biggr).
\end{align}$$ Substituting in the equilibrium wage and dividing by value then implies $$
\frac{w_{{\rm x}t}X_{{\rm r}t}}{V_t} = \underbrace{\varphi^{1+\frac{\zeta\varepsilon}{1+\zeta(1-\varepsilon)}} \vphantom{\biggr)^{-\varepsilon}} }_{\textrm{prod.}} \times \underbrace{\frac{\bar{v}_{\rm r}^{1+\frac{\zeta\varepsilon}{1+\zeta(1-\varepsilon)}}}{V_t/N_t} \vphantom{\biggr)^{-\varepsilon}} }_{\textrm{values}} \times \underbrace{\varepsilon^{1+\frac{\zeta\varepsilon}{1+\zeta(1-\varepsilon)}}\biggl(1+\frac{N_t}{\bar{s}L_t}\biggl(1+\left(\frac{\psi}{\varphi}\frac{\bar{v}_{\rm m}}{\bar{v}_{\rm r}}\right)^{\frac{1}{1-\varepsilon}}\biggr)\biggr)^{-\frac{\varepsilon}{1+\zeta(1-\varepsilon)}}}_{\textrm{substitution between factors and technologies}},
$$ which reduces to the equation in the main text if $\zeta=0$. Thus, if we again apply the intuition that $\Delta\log\bar{v}_{\rm r}\approx \Delta\log(V_t/N_t)$ and that the substitution term is insensitive to parameter changes, we have that $$\begin{align}
\Delta\log\frac{w_{{\rm x}t}X_{{\rm r}t}}{V_t} &\approx \left(\frac{1+\zeta}{1+\zeta(1-\varepsilon)}\right)\Delta\log\varphi_t + \left(\frac{\zeta\varepsilon}{1+\zeta(1-\varepsilon)}\right)\Delta\log\bar{v}_{{\rm r}t}, \\
&= 1.2\times\Delta\log\varphi_t + 0.2\times\Delta\log\bar{v}_{{\rm r}t},
\end{align}$$ where the second equality substitutes in the calibrated values $\zeta=\varepsilon=1/2$. There is now a labor-supply effect: if research productivity or valuations rise, then labor supply endogenously increases, so R&D spending goes up slightly more than value. For a decline in $\varphi_t$, net labor supply effects turn out to be close to zero, because $\varphi_t$ falls but $\bar{v}_{{\rm r}t}$ rises.
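
As a quick arithmetic check of the two coefficients, the following sketch evaluates them in exact rational arithmetic at the calibrated values. The function name is hypothetical.

```python
from fractions import Fraction

def rd_to_value_elasticities(zeta, eps):
    """Coefficients on dlog(phi) and dlog(v_r) in the approximation above."""
    denom = 1 + zeta * (1 - eps)
    return (1 + zeta) / denom, (zeta * eps) / denom

# Calibrated values from the text: zeta = eps = 1/2.
phi_coef, v_coef = rd_to_value_elasticities(Fraction(1, 2), Fraction(1, 2))
```

With $\zeta=0$ the same function returns coefficients of one and zero, matching the inelastic-labor-supply case in the main text.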

<figure id="fig_app:transest_sensitivity" data-latex-placement="t">
<div class="center">
<p></p>
<table>
<tbody>
<tr>
<td style="text-align: center;"><img src="paperplots/trans_xrdval_counterfactual" style="width:80.0%" alt="image" /></td>
<td style="text-align: center;"></td>
</tr>
</tbody>
</table>
</div>
<figcaption>The figure plots the model-implied change in the aggregate R&amp;D-to-value ratio in response to an exogenous change in various model parameters. In each line, only the one parameter changes; all others remain fixed at their pre-transition values. See the text for more details.</figcaption>
</figure>

Figure 22 illustrates this intuition. The figure plots the response of aggregate R&D-to-value to various alternative parameter shifts in the model. These shifts include a decline in discount rates $\rho$, an increase in M&A productivity $\psi$, a decline in the creative destruction rate $\alpha$, and a decline in the innovation step size $\lambda$. Some of these have been proposed as explanations for other moments in the data---for example, a decline in discount rates could cause a rise in aggregate valuations, and a decline in innovation size (ideas becoming more incremental) could explain a decline in growth. However, none of these shifts can explain the decline in R&D-to-value. The intuition is as laid out above. In fact, some of these shifts, like the decline in discount rates, even cause R&D-to-value to *increase*, because of the positive labor-supply effects.

## Robustness check: Alternative identification of $\omega$ 

Here I consider an alternative identification strategy for the firm-heterogeneity parameters $\{a_h/a_\ell,\omega\}$ that instead matches the amount of concentration in the firm distribution. This is useful both as a robustness check and as an illustration of how this model can quantitatively explain concentration. In the baseline estimation, these parameters were jointly identified by the sales-weighted cross-sectional variances of value-to-sales and profit-to-sales, under the argument that these moments were differentially sensitive to each parameter. An alternative strategy is to identify $\omega$ from the Pareto tail of the employment distribution. The right tail of the firm distribution is populated by high-productivity firms; the lower is $\omega$, the fewer high-productivity firms there are, and the greater the number of blueprints in any given high-productivity firm. In other words, a greater share of blueprints becomes concentrated in a smaller number of high-productivity firms.

The Pareto tail coefficient $\theta\in[1,\infty)$ is defined and derived in Appendix 12.5. For high levels of employment $L_i$, the distribution follows $\mathbb{P}(L_i>L)\propto L^{-\theta}$, which corresponds to a density $m(L)\propto L^{-(\theta+1)}$. A lower value of $\theta$ represents a thicker Pareto tail (more concentration). I estimate $\theta$ in the data from the share of employment in large firms. Consider two (large) values of employment, a lower bound $\underline{L}$ and an upper bound $\bar{L}>\underline{L}$. The share of employment held by firms with more than $\bar{L}$ employees, as a fraction of the total employment of all firms with more than $\underline{L}$ employees, is given by $$
\frac{s_L(\bar{L})}{s_L(\underline{L})} \equiv \frac{\int_{\bar{L}}^\infty Lm(L)dL}{\int_{\underline{L}}^\infty Lm(L)dL} = \frac{\int_{\bar{L}}^\infty L^{-\theta}dL}{\int_{\underline{L}}^\infty L^{-\theta}dL} = \left(\frac{\bar{L}}{\underline{L}}\right)^{-(\theta-1)}.
$$ Thus, we can infer the coefficient $\theta$ as $$
\theta = 1 - \frac{\log(s_L(\bar{L})/s_L(\underline{L}))}{\log(\bar{L}/\underline{L})}.
$$ I compute this in the Business Dynamics Statistics, which reports the employment shares for employment bins in each year; I use $\underline{L}=5000$ and $\bar{L}=10000$. The empirical moment used in the estimation is then the average of annual estimates $\hat{\theta}_t$ from 1978--1985.
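
The inversion from employment shares to the tail coefficient is a one-line calculation. The sketch below uses a hypothetical function name and synthetic shares generated from a known $\theta$, not BDS data.

```python
import math

def pareto_tail_from_shares(share_above_upper, share_above_lower, upper, lower):
    """Tail coefficient implied by employment shares above two size cutoffs."""
    share_ratio = share_above_upper / share_above_lower
    return 1.0 - math.log(share_ratio) / math.log(upper / lower)

# Synthetic check: build the share ratio implied by theta = 1.28, then recover it.
true_theta = 1.28
lower, upper = 5000.0, 10000.0
share_lower = 0.30                                   # hypothetical share above 5000
share_upper = share_lower * (upper / lower) ** (-(true_theta - 1.0))
theta_hat = pareto_tail_from_shares(share_upper, share_lower, upper, lower)
```

Only the ratio of the two shares matters, so the level chosen for `share_lower` does not affect the estimate.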

+:------------------------------+:-:+:------------:+:----:+:--------:+:-:+:---------------------------+:-----------:+:--------:+:-------:+:-------:+
| | | Parameters | | Identifying moments |
+-------------------------------+---+--------------+------+----------+---+----------------------------+-------------+----------+---------+---------+
| | | Notation | | Estimate | | Moment | | Data | | Model |
+-------------------------------+---+--------------+------+----------+---+----------------------------+-------------+----------+---------+---------+
| **Preferences** | | | | | | | | | | |
+-------------------------------+---+--------------+------+----------+---+----------------------------+-------------+----------+---------+---------+
| Discount factor | | $\rho$ | | 0.0085 | | Agg. value-to-sales | | 1.0006 | | 1.0006 |
+-------------------------------+---+--------------+------+----------+---+----------------------------+-------------+----------+---------+---------+
| | | | | (0.0169) | | | | (0.0230) | | |
+-------------------------------+---+--------------+------+----------+---+----------------------------+-------------+----------+---------+---------+
| **Innovation (R&D)** | | | | | | | | | | |
+-------------------------------+---+--------------+------+----------+---+----------------------------+-------------+----------+---------+---------+
| Innovation productivity | | $\varphi_0$ | | 0.0415 | | Agg. R&D-to-value | | 0.0204 | | 0.0204 |
+-------------------------------+---+--------------+------+----------+---+----------------------------+-------------+----------+---------+---------+
| | | | | (0.0041) | | | | (0.0008) | | |
+-------------------------------+---+--------------+------+----------+---+----------------------------+-------------+----------+---------+---------+
| Quality innovation size | | $\lambda$ | | 1.1296 | | Agg. output growth | | 0.0248 | | 0.0248 |
+-------------------------------+---+--------------+------+----------+---+----------------------------+-------------+----------+---------+---------+
| | | | | (0.0281) | | | | (0.0040) | | |
+-------------------------------+---+--------------+------+----------+---+----------------------------+-------------+----------+---------+---------+
| Relative quality productivity | | $\alpha$ | | 0.7992 | | Firm entry rate | | 0.1191 | | 0.1191 |
+-------------------------------+---+--------------+------+----------+---+----------------------------+-------------+----------+---------+---------+
| | | | | (0.0096) | | | | (0.0048) | | |
+-------------------------------+---+--------------+------+----------+---+----------------------------+-------------+----------+---------+---------+
| **Acquisition (M&A)** | | | | | | | | | | |
+-------------------------------+---+--------------+------+----------+---+----------------------------+-------------+----------+---------+---------+
| M&A search productivity | | $\psi$ | | 0.0795 | | Agg. M&A-to-value | | 0.0116 | | 0.0116 |
+-------------------------------+---+--------------+------+----------+---+----------------------------+-------------+----------+---------+---------+
| | | | | (0.0095) | | | | (0.0013) | | |
+-------------------------------+---+--------------+------+----------+---+----------------------------+-------------+----------+---------+---------+
| **Firm heterogeneity** | | | | | | | | | | |
+-------------------------------+---+--------------+------+----------+---+----------------------------+-------------+----------+---------+---------+
| Productivity gap | | $a_h/a_\ell$ | | 1.1283 | | St. dev. of value-to-sales | | 0.5806 | | 0.5806 |
+-------------------------------+---+--------------+------+----------+---+----------------------------+-------------+----------+---------+---------+
| | | | | (0.0509) | | | | (0.0303) | | |
+-------------------------------+---+--------------+------+----------+---+----------------------------+-------------+----------+---------+---------+
| High-prod. entrep. share | | $\omega$ | | 0.0011 | | Employment Pareto tail | | 1.2786 | | 1.2793 |
+-------------------------------+---+--------------+------+----------+---+----------------------------+-------------+----------+---------+---------+
| | | | | (0.0008) | | | | (0.0027) | | |
+-------------------------------+---+--------------+------+----------+---+----------------------------+-------------+----------+---------+---------+

: The table reports the estimates of parameters in the balanced-growth equilibrium before the transition. The estimation is the same as the baseline (see Table 2), except the high-productivity entrepreneur share $\omega$ is now instead identified by the Pareto tail of the employment distribution. See the text for details.

Table 6 reports the results of this alternative estimation.[^88] Most of the parameters take very similar values to the baseline estimation. The exception is $\omega$, which takes a much lower value. The reason is that there is substantial concentration in the data. Indeed, as the Pareto tail approaches the lower bound $\theta=1$ (Zipf's law), which represents maximal concentration, the estimated value of $\omega$ approaches zero.[^89]

This lower value of $\omega$ does not have a substantial impact on the main results. The estimated decline in innovation productivity is virtually identical, around 47% in total. The decline in per-capita output growth is also similar. The rise in aggregate value-to-sales is larger than in the baseline---about 96% overall, compared to 83% in the baseline. This is because there is even more reallocation when we start from a low value of $\omega$: the total high-productivity share of sales goes from about 10% to about 30%, whereas it goes from about 15% to about 30% in the baseline. Overall, the predictions are the same.

# Model extensions: Derivations and results 

## Systematic risk and risk aversion 

Consider the assumptions of the extension with systematic risk and risk aversion. Letting $Z_t=e^{z_t}$, we have $$
\frac{dZ_t}{Z_{t^-}} = \left(\kappa_z(\bar{z} - z_t) + \frac{1}{2}\sigma_z^2\right)dt + \sigma_zdB_t + (e^{-\zeta_z}-1)dJ_t.
$$ For all values of $Z_t$, the static production allocations across firms will be the same as in the case without systematic risk, and aggregate output will equal $$
Y_t = N_t^{1/(\eta-1)}Q_t\bar{a}_tZ_t\Omega_tL_{{\rm p}t}.
$$ Hence, if the distribution is stationary, then expected per-capita output growth equals $$
g_{yt} \equiv \mathbb{E}_t\left[\frac{dy_t}{y_{t^-}}\right]\frac{1}{dt} = \bar{g}_y - \kappa_zz_t,
$$ where $$
\bar{g}_y \equiv \frac{1}{\eta-1}g_N + g_Q + \kappa_z\bar{z} + \frac{1}{2}\sigma_z^2 + p_z(e^{-\zeta_z}-1).
$$ Expected growth is affine in $z$, so we should expect the same of asset prices.

Asset prices can be fully characterized by trading in a riskfree bond and the (per-capita) consumption claim. Let $V_{yt}$ denote the value of this claim. The return on the consumption claim is, by definition, its yield plus its capital gains: $$
dR_t = \frac{y_t}{V_{yt}}dt + \frac{dV_{yt}}{V_{yt^-}} = (r_{ft} + \nu_t)dt + \sigma_{Vt}dB_t + (e^{-\zeta_{Vt}}-1)dJ_t,
$$ for some risk premium $\nu$ and risk coefficients $\sigma_{V}$ and $\zeta_{V}$, for which we will solve. To simplify the proof, first conjecture that risk premia are constant ($\nu_t=\nu$) and the stock's risk coefficients are constants ($\sigma_{Vt}=\sigma_V$ and $\zeta_{Vt}=\zeta_V$), which follow from i.i.d. risks.

Letting $\alpha_t$ denote the share of the household's wealth $W_t$ invested in the consumption claim, the household's wealth evolves according to[^90] $$
\frac{dW_t}{W_{t^-}} = \left(r_{ft}+\alpha_t\nu-\frac{c_t}{W_t}\right)dt + \alpha_t\sigma_VdB_t + \alpha_t(e^{-\zeta_V}-1)dJ_t.
$$ The household's HJB equation (suppressing time subscripts) is then $$\begin{multline}
0 = \sup_{\{c,l,\alpha\}}\biggl\{U_W(W(r_f+\alpha\nu) - c) + \frac{1}{2}U_{WW}W^2\alpha^2\sigma_V^2 - U_z\kappa_zz + \frac{1}{2}U_{zz}\sigma_z^2 + U_{Wz}W\alpha\sigma_V\sigma_z \\
+ p_z[U(W(1+\alpha(e^{-\zeta_V}-1)),z-\zeta_z)-U(W,z)] + u(c,l,U) \biggr\}.
\end{multline}$$ The first-order condition for consumption is $$
U_W = u_c(c,l,U) = \rho(1-\gamma)\frac{U}{c},
$$ and that for the portfolio share is $$
0 = U_WW\nu + U_{WW}W^2\alpha\sigma_V^2 + U_{Wz}W\sigma_V\sigma_z + p_zU_W(W(1+\alpha(e^{-\zeta_V}-1)),z-\zeta_z)W(e^{-\zeta_V}-1).
$$ Conjecture that the solution to this system is $$
U(W,z) = \frac{W^{1-\gamma}}{1-\gamma}\exp\bigl\{(1-\gamma)(a_z+b_zz)\bigr\},
$$ for constants $a_z$ and $b_z$. Under this conjecture, we have $$
c = \rho W
$$ and $$
0 = \nu - \gamma\alpha\sigma_V^2 + (1-\gamma)b_z\sigma_V\sigma_z + p_z[(1+\alpha(e^{-\zeta_V}-1))^{-\gamma}(e^{-\zeta_V}-1)e^{-(1-\gamma)b_z\zeta_z}].
$$ Households consume a constant fraction of their wealth, the standard unit-EIS result. Their portfolios are governed by the classic Merton (1969) risk-return tradeoff and a hedging demand against shocks to aggregate productivity $z$. Market clearing requires that the bond be in zero net supply, so we can set $\alpha=1$ and $\sigma_V=\sigma_z$ and $\zeta_V=\zeta_z$ and solve for the risk premium $$
\nu = (\gamma+(\gamma-1)b_z)\sigma_z^2 + p_z[e^{(\gamma+(\gamma-1)b_z)\zeta_z}(1-e^{-\zeta_z})],
$$ which is indeed a constant, as conjectured. Standard asset-pricing results (see, e.g., Paron (2022)) and market clearing $c=y$ imply that the riskfree interest rate equals $$
r_{ft} = \rho + g_{yt} - \gamma\sigma_z^2 - p_z[e^{\gamma\zeta_z}(1-e^{-\zeta_z})],
$$ where $g_{yt}$ is defined above. Interest rates are increasing in impatience $\rho$ and growth $g_{yt}$ and decreasing in precautionary savings demand from risk and risk aversion. Putting these together, the total expected return on the stock is $$
r_{ft} + \nu = \rho + g_{yt} + (\gamma-1)b_z\sigma_z^2 + p_z[e^{\gamma\zeta_z}(e^{(\gamma-1)b_z\zeta_z}-1)(1-e^{-\zeta_z})].
$$ If $b_z\in(-1,0)$ (which I show below), then the risk premium is strictly increasing in risk aversion but the total expected return is decreasing in risk aversion, due to the dominance of the interest-rate decline.

Let us now verify the conjecture for the value function and, in particular, solve for $b_z$. Intuitively, we should expect $b_z<0$, since high aggregate productivity $z$ means low future consumption growth due to mean reversion. Substituting $U(W,z)$, $c=\rho W$, and market clearing into the HJB equation, then dividing by $(1-\gamma)U$, implies $$\begin{multline}
0 = r_f+\nu-\rho - \frac{1}{2}\gamma\sigma_z^2 - b_z\kappa_zz + \frac{1}{2}(1-\gamma)b_z^2\sigma_z^2 + (1-\gamma)b_z\sigma_z^2 \\
+ \frac{p_z}{1-\gamma}[e^{(\gamma-1)(1+b_z)\zeta_z}-1] + \rho(\log\rho - a_z - b_zz).
\end{multline}$$ Substituting in $r_f$, $g_y$, and $\nu$, then collecting coefficients on $z$, we get $$
b_z = -\frac{\kappa_z}{\kappa_z+\rho} \in (-1, 0).
$$ Collecting constants, we get $$\begin{multline}
a_z = \log{\rho} + \frac{1}{\rho}\biggl(\bar{g}_y - \frac{1}{2}(\gamma + (\gamma-1)b_z^2)\sigma_z^2 \\
+ p_z\left[e^{\gamma\zeta_z}(e^{(\gamma-1)b_z\zeta_z}-1)(1-e^{-\zeta_z}) + \frac{e^{(\gamma-1)(1+b_z)\zeta_z}-1}{1-\gamma}\right]\biggr).
\end{multline}$$ Substituting $b_z$ back into the risk premium expression implies $$
\nu = \left(1+\frac{\rho}{\kappa_z+\rho}(\gamma-1)\right)\sigma_z^2 + p_z\left[e^{\left(1+\frac{\rho}{\kappa_z+\rho}(\gamma-1)\right)\zeta_z}(1-e^{-\zeta_z})\right].
$$ This verifies the conjectured value function and completes the household's problem.

To price firms, it is useful to first derive an expression for the law of motion of the state-price density $\xi$ of the representative household. Using the results of Paron (2022), we have $$
\frac{d\xi_t}{\xi_t} = -r_{ft}dt - (\gamma+(\gamma-1)b_z)\sigma_zdB_t + (e^{(\gamma+(\gamma-1)b_z)\zeta_z}-1)(dJ_t-p_zdt).
$$ By the absence of arbitrage, it must be that $$
r_{ft}dt = -\mathbb{E}_t\frac{d\xi_t}{\xi_t},
$$ and that $$
\nu dt = -\mathbb{E}_t\frac{d[\xi,y]_t}{\xi_ty_t},
$$ both of which are consistent with the results above.

Finally, let us assess the consequences of risk and risk aversion for firms' valuations and decisions. Recall that the value of the firm equals $$
V_{it} = \mathbb{E}_t\left[\int_0^\infty\frac{\xi_{t+\tau}}{\xi_t}D_{i,t+\tau}d\tau\right],
$$ which we can scale by $y_t$ to get $$
\hat{V}_{it} = \mathbb{E}_t\left[\int_0^\infty\frac{\xi_{t+\tau}}{\xi_t}\frac{y_{t+\tau}}{y_t}\hat{D}_{i,t+\tau}d\tau\right],
$$ where the scaled dividend equals $$
\hat{D}_{it} = \sum_{j=1}^n\pi_{ijt} - \hat{w}_{{\rm x}t}x_{it}n_{it},
$$ and is thus independent of $y$ and $Z$. After some algebra, this expression can be rewritten $$
\int_{-\infty}^0(\xi_{t+\tau}y_{t+\tau})\hat{D}_{i,t+\tau}d\tau + (\xi_ty_t)\hat{V}_{it} = \mathbb{E}_t\left[\int_{-\infty}^{\bar{\tau}}(\xi_{t+\tau}y_{t+\tau})\hat{D}_{i,t+\tau}d\tau\right] + \mathbb{E}_t\left[(\xi_{t+\bar{\tau}}y_{t+\bar{\tau}})\hat{V}_{i,t+\bar{\tau}}\right],
$$ and so the expression on the left-hand side is a martingale. It is hence a local martingale, and has zero expected drift: $$
0 = \xi_ty_t\hat{D}_{it}dt + \mathbb{E}_t[d(\xi_ty_t\hat{V}_{it})].
$$ Because $\hat{V}$ is independent of $\xi$ and $y$, this expression can be rewritten $$
0 = \hat{D}_{it}dt + \mathbb{E}_t\left[\frac{d(\xi_ty_t)}{\xi_ty_t}\right]\hat{V}_{it} + \mathbb{E}_t[d\hat{V}_{it}].
$$ Using the household's state-price density along with no-arbitrage conditions, we know $$\begin{align}
\mathbb{E}_t\left[\frac{d(\xi_ty_t)}{\xi_ty_t}\right] &= \mathbb{E}_t\frac{d\xi_t}{\xi_t} + \mathbb{E}_t\frac{d[\xi,y]_t}{\xi_ty_t} + \mathbb{E}_t\frac{dy_t}{y_t} \\
&= -(r_{ft} + \nu - g_{yt})dt \\
&\equiv - \rho^*dt,
\end{align}$$ which is a constant independent of the state $z_t$ (and of the expected growth rate $g_{yt}$). Substituting this back in, we have the scaled, risk-adjusted HJB equation $$
\rho^*\hat{V}_{it}dt = \hat{D}_{it}dt + \mathbb{E}_t[d\hat{V}_{it}],
$$ where the firm's expected growth rate is of the exact same form as it was with log utility and no aggregate risk. This proves that $\hat{V}$ is independent of aggregate state variables.

The only difference between this model and the riskless model is that, where there was once discounting by the constant $\rho$, there is now discounting by the risk-adjusted constant $$
\rho^* = \rho - (\gamma-1)\frac{\kappa_z}{\kappa_z+\rho}\sigma_z^2 - p_z[e^{\gamma\zeta_z}(1-e^{-(\gamma-1)\frac{\kappa_z}{\kappa_z+\rho}\zeta_z})(1-e^{-\zeta_z})].
$$ If $\gamma=1$, as before, then $\rho^*=\rho$, so the model truly is identical with and without systematic risk. If $\gamma>1$, then the risk-adjusted discount factor is lower than the parameter $\rho$, meaning that, in order to match the same set of moments in the model with risk, we will need a higher value of $\rho$ to get the same $\rho^*$. Most importantly, this means that, from the firm's perspective, whether discount rates are driven by impatience or by risk is irrelevant to R&D decisions; all that matters is the total effect of these channels.
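
The risk-adjusted discount rate can be checked numerically by building it up from $r_{ft}+\nu-g_{yt}$ and comparing with the closed form above. This is a sketch with hypothetical parameter values, not the paper's calibration; the function names are illustrative.

```python
import math

def risk_adjusted_discount_rate(rho, gamma, kappa_z, sigma_z, p_z, zeta_z):
    """Return (b_z, nu, rho_star) with rho_star = r_f + nu - g_y."""
    b_z = -kappa_z / (kappa_z + rho)
    loading = gamma + (gamma - 1.0) * b_z          # exposure priced by the SDF
    jump = 1.0 - math.exp(-zeta_z)
    nu = loading * sigma_z**2 + p_z * math.exp(loading * zeta_z) * jump
    # r_f - g_y = rho minus the precautionary-savings terms
    rf_minus_g = rho - gamma * sigma_z**2 - p_z * math.exp(gamma * zeta_z) * jump
    return b_z, nu, rf_minus_g + nu

def rho_star_closed_form(rho, gamma, kappa_z, sigma_z, p_z, zeta_z):
    """The closed-form expression for rho_star stated in the text."""
    w = (gamma - 1.0) * kappa_z / (kappa_z + rho)
    jump = 1.0 - math.exp(-zeta_z)
    return (rho - w * sigma_z**2
            - p_z * math.exp(gamma * zeta_z) * (1.0 - math.exp(-w * zeta_z)) * jump)

# Hypothetical parameter values for illustration only.
params = dict(rho=0.02, kappa_z=0.1, sigma_z=0.02, p_z=0.02, zeta_z=0.2)
b_z, nu, rho_star = risk_adjusted_discount_rate(gamma=5.0, **params)
```

With $\gamma=1$ the two precautionary terms cancel against the risk premium and the function returns $\rho^*=\rho$; with $\gamma>1$ it returns $\rho^*<\rho$.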

## Physical capital 

Suppose instead the production function of firms takes as an input both labor and capital: $$
Y_{ijt} = a_iL_{ijt}^{1-\beta}K_{ijt}^\beta.
$$ Like labor, physical capital $K$ can be used in the production of any good. It follows that a firm owning capital stock $K_i$ will use what it needs of its capital stock and, if it needs more (less), rent from (to) another firm. Consequently, it is equivalent to assume that the total capital stock is owned by a single, competitive "capital sector" which produces capital and rents it out to goods-producing firms at a rental rate $r_{Kt}$.

Defining the capital-labor ratio as $k=K/L$, an incumbent firm's profits equal $$
\Pi_{ijt} = \max_{p_{jt},k_{ijt}}\left(p_{jt} - \frac{w_{{\rm p}t}+r_{Kt}k_{ijt}}{q_{ijt}a_ik_{ijt}^\beta}\right)p_{jt}^{-\eta}\left(Y_t - \sum_{-i\neq i}q_{-ijt}Y_{-ijt}\right)
$$ The first-order condition for the capital-labor ratio is $$
\beta w_{{\rm p}t} k_{ijt}^{-1-\beta} = (1-\beta)r_{Kt}k_{ijt}^{-\beta},
$$ and therefore equals $$
k_{ijt} = \frac{\beta}{1-\beta}\frac{w_{{\rm p}t}}{r_{Kt}}.
$$ This also means that $$
\frac{w_{{\rm p}t}+r_{Kt}k_{ijt}}{k_{ijt}^\beta} = \frac{r_{Kt}}{\beta}k_{ijt}^{1-\beta} = \left(\frac{w_{{\rm p}t}}{1-\beta}\right)^{1-\beta}\left(\frac{r_{Kt}}{\beta}\right)^\beta,
$$ so the quality-adjusted marginal cost of a firm is $$
mc_{ijt} = \frac{1}{q_{ijt}a_i}\left(\frac{w_{{\rm p}t}}{1-\beta}\right)^{1-\beta}\left(\frac{r_{Kt}}{\beta}\right)^\beta.
$$ Hence, the optimal markup for the leader $i_1$ is the same as in the case without capital: $$
\mu_{i_1jt} = \min\left\{\frac{a_{i_1}q_{i_1jt}}{a_{i_2}q_{i_2jt}},\frac{\eta}{\eta-1}\right\}.
$$ All of this implies that the labor and capital shares of revenue for product $j$ are $$
\frac{w_{{\rm p}t}L_{ijt}}{p_{ijt}Y_{ijt}} = \frac{1-\beta}{\mu_{ijt}} \quad\textrm{and}\quad \frac{r_{Kt}K_{ijt}}{p_{ijt}Y_{ijt}} = \frac{\beta}{\mu_{ijt}},
$$ respectively. Consequently, the aggregate expenditure shares of output will equal $$
\Lambda_{Lt} = \frac{w_{{\rm p}t}L_{{\rm p}t}}{Y_t} = (1-\beta)\Lambda_t \quad\textrm{and}\quad \Lambda_{Kt} = \frac{r_{Kt}K_t}{Y_t} = \beta\Lambda_t
$$ with $\Lambda_t$ equal to an inverse of the aggregate markup, as in the labor-only benchmark model.
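
The cost-minimizing capital-labor ratio and the resulting unit cost can be verified numerically. The sketch below uses hypothetical factor prices and a hypothetical capital share; the function names are illustrative.

```python
def unit_cost(k, wage, rental, beta):
    """Factor cost per unit of a_i * q_ijt output at capital-labor ratio k."""
    return (wage + rental * k) / k**beta

def optimal_capital_labor_ratio(wage, rental, beta):
    """Solution of the first-order condition for k."""
    return beta / (1.0 - beta) * wage / rental

def closed_form_unit_cost(wage, rental, beta):
    """Minimized unit cost: (w/(1-beta))^(1-beta) * (r/beta)^beta."""
    return (wage / (1.0 - beta)) ** (1.0 - beta) * (rental / beta) ** beta

# Hypothetical prices and capital share.
wage, rental, beta = 1.0, 0.1, 1.0 / 3.0
k_star = optimal_capital_labor_ratio(wage, rental, beta)
cost_at_optimum = unit_cost(k_star, wage, rental, beta)
```

Evaluating `unit_cost` slightly above or below `k_star` gives a higher cost, confirming that the first-order condition picks out a minimum.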

Capital is accumulated according to $$
\dot{K}_t = I_t-\delta K_t.
$$ The objective of the capital sector is to choose a capital investment policy $I_t$ to maximize the present value $V_{Kt}$ of its rental income less investment cost, $r_{Kt}K_t-I_t$, where the rental rate $r_{Kt}$ is taken as given since the capital sector is perfectly competitive. The total value of the sector thus satisfies the HJB equation $$
r_{ft}V_{Kt} = r_{Kt}K_t - I_t + \frac{\partial V_{Kt}}{\partial K_t}(I_t-\delta K_t).
$$ The investment first-order condition is $$
\frac{\partial V_{Kt}}{\partial K_t} = 1,
$$ which implies $$
V_{Kt} = \frac{r_{Kt} - \delta}{r_{ft}}K_t = K_t,
$$ and hence that $$
r_{Kt} = r_{ft}+\delta.
$$ Using the fact that the aggregate capital expenditure share equals $$
\Lambda_{Kt} = \frac{r_KK_t}{Y_t} = \beta\Lambda,
$$ it must be that the growth rate of capital equals $$
g_{Kt} = g_{Yt} + g_{\Lambda t},
$$ where $g_{\Lambda t}=\dot{\Lambda}_t/\Lambda_t$ is zero on a balanced-growth path.[^91] Thus, the equilibrium investment-capital ratio equals $$
\frac{I_t}{K_t} = \delta + g_{Yt} + g_{\Lambda t}.
$$ Higher total output growth, either through technological innovation or population growth, necessitates more capital, increasing the intensity of capital investment. If $\Lambda_t$ rises (markups fall), then firms increase investment to increase the capital share. Along the transition path, the investment rate will therefore fall both because the rate of economic growth falls and because the aggregate markup rises.

The only remaining change in the model is the fact that resources must now be allocated to the creation of new capital. Hence, total consumption will equal $$
C_t = Y_t - I_t = \left(1 - \frac{I_t}{Y_t}\right) Y_t.
$$ Because $I_t/Y_t$ is a constant in a stationary equilibrium, consumption growth and output growth are identical, preserving this characteristic of the baseline model.

This decline in investment intensity amid booming market valuations aligns with the literature studying the divergence of average $Q$ (the value of the stock market relative to capital) and marginal $q$ of investment (Crouzet and Eberly 2023). In this model, average $Q$ rises, for the reasons laid out in the paper, while marginal $q$ always equals $\partial V_K/\partial K=1$. The reason is that, in this model, most of the value of the market comes from intangible capital---blueprints, generated by R&D investment---and it is the present value of profits from intangible capital that drives average $Q$ up. Physical capital merely serves to scale up production once new ideas are in place, and therefore stagnates as output growth stagnates.

## Cournot competition 

Suppose firms choose quantities $Y_{ijt}$ instead of prices $p_{ijt}$ when they operate within a product market. The conditions for household demand remain the same, so $$
p_{ijt} = q_{ijt}\left(\frac{Y_{jt}}{Y_t}\right)^{-1/\eta} = q_{ijt}p_{jt}.
$$ The profit maximization problem of a firm becomes $$\begin{align}
\Pi_{ijt} &= \max_{Y_{ijt}}\left(p_{ijt}- \frac{w_{{\rm p}t}}{a_i}\right)Y_{ijt} \\
&= \max_{Y_{ijt}}\left(\left(\frac{\sum_{i'}q_{i'jt}Y_{i'jt}}{Y_t}\right)^{-1/\eta} - \frac{w_{{\rm p}t}}{a_iq_{ijt}}\right)q_{ijt}Y_{ijt}.
\end{align}$$ The first-order condition for an interior solution is $$
p_{jt} - \frac{w_{{\rm p}t}}{a_iq_{ijt}} = \frac{1}{\eta}p_{jt}^{\eta+1}\frac{q_{ijt}Y_{ijt}}{Y_t}.
$$ Summing over the $m_{jt}$ producing firms $i$ then implies $$
m_{jt}\left(p_{jt} - \frac{w_{{\rm p}t}}{\overline{aq}_{jt}}\right) = \frac{1}{\eta}p_{jt}^{\eta+1}\frac{Y_{jt}}{Y_t},
$$ where $$
\overline{aq}_{jt} \equiv \left(\frac{1}{m_{jt}}\sum_{i'=1}^{m_{jt}}(a_{i'}q_{i'jt})^{-1}\right)^{-1}
$$ is a harmonic average of quality-adjusted productivity in the product market. Substituting in the household demand curve then implies that the product-level price equals $$
p_{jt} = \underbrace{\frac{\eta m_{jt}}{\eta m_{jt}-1}}_{\equiv\mu_{jt}} \frac{w_{{\rm p}t}}{\overline{aq}_{jt}},
$$ where we have defined $\mu_{jt}$ as the markup over product-level marginal cost. Because all producers must set $p_{ijt}=q_{ijt}p_{jt}$, firm $i$'s markup over marginal cost equals $$
\mu_{ijt} \equiv \frac{p_{ijt}}{w_{{\rm p}t}/a_i} = \mu_{jt}\frac{a_iq_{ijt}}{\overline{aq}_{jt}} = \frac{\eta m_{jt}}{\eta m_{jt}-1}\frac{a_iq_{ijt}}{\overline{aq}_{jt}}.
$$ Among the $m_{jt}$ producing firms, those with higher quality-adjusted productivity relative to the average will earn higher markups.
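
The markup formulas above can be checked numerically. The sketch below takes the quality-adjusted productivities $a_iq_{ijt}$ of the producing firms as given and returns the product-level markup, each firm's markup, and each firm's share of quality-adjusted output. The share expression $q_{ijt}Y_{ijt}/Y_{jt} = \eta(1-1/\mu_{ijt})$ is not stated in the text; it is a rearrangement of the first-order condition using $p_{jt}^{\eta}Y_{jt}/Y_t=1$. The function name and the input values are hypothetical.

```python
def cournot_markups(aq, eta):
    """Markups and output shares for a given set of producing firms.

    aq  : list of quality-adjusted productivities a_i * q_ij (hypothetical values)
    eta : elasticity of substitution across products (hypothetical value)
    """
    m = len(aq)
    harmonic_mean = m / sum(1.0 / x for x in aq)   # \overline{aq}_j
    product_markup = eta * m / (eta * m - 1.0)     # mu_j
    firm_markups = [product_markup * x / harmonic_mean for x in aq]  # mu_ij
    # Rearranged first-order condition: share_i = eta * (1 - 1 / mu_ij).
    shares = [eta * (1.0 - 1.0 / mu) for mu in firm_markups]
    return product_markup, firm_markups, shares


mu_j, mu_i, s_i = cournot_markups([1.2, 1.0], eta=3.0)
print("product markup:", round(mu_j, 4))
print("firm markups:  ", [round(x, 4) for x in mu_i])
print("output shares: ", [round(x, 4) for x in s_i], "sum =", round(sum(s_i), 4))
```

In this two-firm example the firm with higher quality-adjusted productivity earns the higher markup and the larger share, and the shares sum to one, which serves as a consistency check on the product-level price.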

The question that remains is which firms will choose to produce and which will decide not to operate. That is, what determines $m_{jt}$? A firm will choose to produce ($Y_{ijt}>0$) if and only if it will earn positive profits, or, equivalently, a markup greater than one: $$
\frac{\eta m_{jt}}{\eta m_{jt}-1}\frac{a_iq_{ijt}}{\overline{aq}_{jt}} > 1.
$$ Under our parameterization $\lambda>a_h/a_\ell$, the markups of producers are ranked exactly by their quality rankings. Consequently, the equilibrium number of producers $m$ is the number such that (i) the $m$th highest-quality firm earns a markup above one and (ii) the $(m+1)$th highest-quality firm would earn a markup of at most one if it decided to produce.

This intuition can be used to find all possible equilibrium market structures. First, note that, given any set of potential producers, a unique equilibrium exists with a finite number of producing firms. Formally, there exists a finite $\bar{m}\in\mathbb{R}$ such that, for any combination of high- and low-productivity blueprint holders, no product market will ever have $m_{jt}>\bar{m}$ producing firms. This maximum occurs when the lowest-quality firm is high-productivity and all higher-quality blueprint holders are low-productivity (since, given $m$, this maximizes $a_iq_{ijt}/\overline{aq}_{jt}$ for the worst firm). In this case, the necessary condition for participation is $$
\frac{\eta\bar{m}}{\eta\bar{m}-1}\frac{1}{\bar{m}}\left(1 + \frac{a_h}{a_\ell}\sum_{k=1}^{\bar{m}-1}\lambda^{-k}\right) \geq 1,
$$ or, evaluating this sum, $$
\frac{a_h}{a_\ell}\frac{\lambda^{-1}-\lambda^{-\bar{m}}}{1-\lambda^{-1}} \geq \bar{m} - \frac{\eta+1}{\eta}.
$$ The maximum number of possible producers is then the value of $\bar{m}$ that makes this an equality.[^92] Second, note that, under the parameter estimates in the main model, we will have $\bar{m}>2$. To see this, substitute $m=2$ into the participation condition above: the condition for the follower ($a_i=a_{i_2}$) to produce is $$
\frac{\eta}{\eta-1} > \lambda\frac{a_{i_1}}{a_{i_2}},
$$ which always holds in the estimated Bertrand model.
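
The participation logic can be sketched in a few lines of Python. The first function adds firms in descending order of quality-adjusted productivity and stops at the first firm whose markup would not exceed one, treating indifferent firms as non-producers. The second builds the extreme configuration described above (lowest-quality firm high-productivity, all better firms low-productivity) and searches for the largest integer number of producers. It works directly from the markup formula rather than from the closed-form inequality, and it normalizes $a_\ell=1$ and the lowest quality to one. Function names and parameter values are hypothetical, not the paper's estimates.

```python
def equilibrium_producers(aq, eta):
    """Number of producing firms given quality-adjusted productivities aq."""
    ranked = sorted(aq, reverse=True)
    m, inv_sum = 0, 0.0
    for k, x in enumerate(ranked, start=1):
        trial_inv_sum = inv_sum + 1.0 / x
        # Markup of the k-th ranked firm if exactly the top k firms produce:
        # eta*k/(eta*k - 1) * x / harmonic_mean, with harmonic_mean = k / trial_inv_sum.
        markup = eta / (eta * k - 1.0) * x * trial_inv_sum
        if markup <= 1.0:   # firms that would not earn a markup above one stay out
            break
        m, inv_sum = k, trial_inv_sum
    return m


def max_producers(eta, lam, a_ratio, cap=1000):
    """Largest integer number of producers in the extreme configuration.

    a_ratio = a_h / a_l, with a_l and the lowest quality normalized to one.
    The k-th better firm is low-productivity with quality lam**k.
    """
    m = 1
    while m < cap:
        candidate = [a_ratio] + [lam ** k for k in range(1, m + 1)]
        if equilibrium_producers(candidate, eta) < m + 1:
            break           # the lowest-quality firm would not produce
        m += 1
    return m


# Hypothetical parameters satisfying lam > a_h / a_l.
print(equilibrium_producers([2.0, 1.0], eta=3.0))   # large gap: follower stays out
print(equilibrium_producers([1.2, 1.0], eta=3.0))   # small gap: both firms produce
print(max_producers(eta=3.0, lam=1.2, a_ratio=1.1))
```

The two-firm cases agree with the follower condition $\eta/(\eta-1) > \lambda a_{i_1}/a_{i_2}$: with $\eta=3$ the threshold is $1.5$, so a gap of $2.0$ excludes the follower while a gap of $1.2$ does not.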

Because the number of producers is finite ($m<\bar{m}$), we could, in principle, characterize all possible market structures and their evolution in response to successful R&D and M&A. Still, the dimensionality of this Cournot world is potentially much larger than that of the Bertrand world: for every new market structure we add to the model, we must add states to the HJBs and KFEs. This is feasible but does not add much to the model. Moreover, it adds a step to the estimation: for each set of parameters, one must re-solve for the set of possible market structures, which changes with $\{\eta,\lambda,a_h/a_\ell\}$.

## Life-cycle productivity dynamics 

Suppose that, with Poisson intensity $p_{h\ell}$, high-productivity firms permanently become low-productivity firms. In this case, blueprint characteristics are still summarized by $\{i_1,i_2,\hat{q}\}$, and we make the following alterations to the equilibrium conditions in Appendix 11. First, blueprints with high-type leaders ($i_1=h$) transition to low-type leaders ($i_1=\ell$) with intensity $p_{h\ell}$. Likewise, for any blueprint with a high-type follower ($i_2=h$), the follower becomes low-productivity ($i_2=\ell$) with intensity $p_{h\ell}$. These possibilities mean that the HJB equations become $$\begin{multline}
(\rho+\delta_t+(\eta-1)g_{Qt})v_{i_1i_2t}^{\rm P} = \pi_{i_1i_2t} + \mathbb{I}_{i_1=\ell}\Psi(x_{{\rm m}ht})f_{ht}(1-\varrho)(v_{hi_2t}^{\rm P}- v_{\ell i_2t}^{\rm P}) + \frac{\partial v_{i_1i_2t}^{\rm P}}{\partial t} \\
+ p_{h\ell}(\mathbb{I}_{i_2=h}(v_{i_1ht}^{\rm P}-v_{i_1\ell t}^{\rm P}) - \mathbb{I}_{i_1=h}(v_{hi_2t}^{\rm P}-v_{\ell i_2t}^{\rm P})),
\end{multline}$$ $$\begin{multline}
(\rho + \delta_t)v_{i_1t}^{\rm G} = (1-\varepsilon)(\Phi_{\rm n}(x_{{\rm n}it})\bar{v}_{{\rm n}it} +\Phi_{\rm q}(x_{{\rm q}it})\bar{v}_{{\rm q}it} + \Psi(x_{{\rm m}it})\bar{v}_{{\rm m}it}) \\
+ \mathbb{I}_{i=\ell}\Psi(x_{{\rm m}ht})f_{ht}(1-\varrho)(v_{ht}^{\rm G}- v_{\ell t}^{\rm G}) + \frac{\partial v_{i_1t}^{\rm G}}{\partial t} - p_{h\ell}\mathbb{I}_{i_1=h}(v_{ht}^{\rm G}-v_{\ell t}^{\rm G}).
\end{multline}$$ The second and only other alteration we must make is to the KFEs, which become $$\begin{multline}
\dot{f}_{i\varnothing\hat{q}t} + g_{Nt}f_{i\varnothing\hat{q}t} = \frac{\partial f_{i\varnothing\hat{q}t}}{\partial \hat{q}}g_{Qt} + \Phi_{{\rm n}it}(f_{it} + \nu_{it})\Gamma_{\hat{q}} \\
+ (\mathbb{I}_{i=h}-\mathbb{I}_{i=\ell})\Psi_{ht}f_{ht}f_{\ell\varnothing\hat{q}t} - (\delta_t+p_{h\ell}\mathbb{I}_{i=h})f_{i\varnothing\hat{q}t}+ p_{h\ell}\mathbb{I}_{i=\ell}f_{h\varnothing\hat{q}t},
\end{multline}$$ $$\begin{multline}
\dot{f}_{ii\hat{q}t} + g_{Nt}f_{ii\hat{q}t} = \frac{\partial f_{ii\hat{q}t}}{\partial \hat{q}}g_{Qt} + \Phi_{{\rm q}it}(f_{it} + \nu_{it})(f_{ii,\hat{q}-\log\lambda,t}+f_{i,-i,\hat{q}-\log\lambda,t}+f_{i\varnothing,\hat{q}-\log\lambda,t}) \\
+ \mathbb{I}_{i=h}\Psi_{ht}f_{ht}f_{\ell h\hat{q}t} - (\delta_t+2p_{h\ell}\mathbb{I}_{i=h})f_{ii\hat{q}t} - \mathbb{I}_{i=\ell}\Psi_{ht}f_{ht}f_{\ell\ell\hat{q}t} + p_{h\ell}\mathbb{I}_{i=\ell}(f_{\ell h\hat{q}t}+f_{h\ell\hat{q}t}),
\end{multline}$$ $$\begin{multline}
\dot{f}_{i,-i,\hat{q}t} + g_{Nt}f_{i,-i,\hat{q}t} = \frac{\partial f_{i,-i,\hat{q}t}}{\partial \hat{q}}g_{Qt} + \Phi_{{\rm q}it}(f_{it} + \nu_{it})(f_{-ii,\hat{q}-\log\lambda,t}+f_{-i,-i,\hat{q}-\log\lambda,t}+f_{-i\varnothing,\hat{q}-\log\lambda,t}) \\
+ \mathbb{I}_{i=h}\Psi_{ht}f_{ht}f_{\ell\ell\hat{q}t} - (\delta_t+p_{h\ell})f_{i,-i,\hat{q}t} - \mathbb{I}_{i=\ell}\Psi_{ht}f_{ht}f_{\ell h\hat{q}t} + p_{h\ell}f_{hh\hat{q}t}.
\end{multline}$$ All other equilibrium conditions remain the same.
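
To give a sense of magnitudes for the switching intensity, the sketch below uses two standard properties of a Poisson process with intensity $p_{h\ell}$: a firm that is high-productivity today remains so after $t$ years with probability $e^{-p_{h\ell}t}$, and the expected time spent as a high type is $1/p_{h\ell}$. It also tallies the type-switching outflow rate from each leader-follower state, which matches the $p_{h\ell}$ and $2p_{h\ell}$ terms in the KFEs above. The value of `p_hl` and the function names are hypothetical.

```python
import math


def prob_still_high(p_hl, years):
    """Probability that a high-productivity firm has not yet switched."""
    return math.exp(-p_hl * years)


def expected_years_high(p_hl):
    """Expected time a firm spends as a high-productivity type."""
    return 1.0 / p_hl


def switching_outflow_rate(leader_is_high, follower_is_high, p_hl):
    """Rate at which a blueprint leaves its (i_1, i_2) state via type switching.

    Each high-type firm attached to the blueprint switches independently at
    intensity p_hl, so the rates add up across leader and follower.
    """
    return p_hl * (int(leader_is_high) + int(follower_is_high))


p_hl = 0.05  # hypothetical intensity, not an estimate from the paper
print(round(prob_still_high(p_hl, 10), 4))
print(expected_years_high(p_hl))
print(switching_outflow_rate(True, True, p_hl))    # both high: 2 * p_hl
print(switching_outflow_rate(True, False, p_hl))   # one high:  p_hl
print(switching_outflow_rate(False, False, p_hl))  # both low:  0
```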

One can see by looking at these KFEs how $p_{h\ell}>0$ will affect the estimates. Type switching will, all else equal, reduce the total blueprint share $f_h$ of high-productivity firms. In order to match the other moments, then, the model will require a higher productivity gap $a_h/a_\ell$ and a higher share of high-productivity entrepreneurs $\omega$. In other words, a larger share of entrants will be high-productivity firms, which tend to grow exceedingly quickly over the early part of their life cycles before ultimately slowing down. This would allow the model to match the empirical fact that large firms tend to have had higher R&D intensities and growth rates earlier in life (Luttmer 2011; Acemoglu et al. 2018; Akcigit and Kerr 2018).

## References

Acemoglu, Daron, Ufuk Akcigit, Harun Alp, Nicholas Bloom, and William Kerr. 2018. "Innovation, Reallocation, and Growth." *American Economic Review* 108 (11): 3450--91.

Achdou, Yves, Jiequn Han, Jean-Michel Lasry, Pierre-Louis Lions, and Benjamin Moll. 2022. "Income and Wealth Distribution in Macroeconomics: A Continuous-Time Approach." *Review of Economic Studies* 89 (1): 45--86.

Aghion, Philippe, Antonin Bergeaud, Timo Boppart, Peter J. Klenow, and Huiyu Li. 2023. "A Theory of Falling Growth and Rising Rents." *The Review of Economic Studies* 90 (6): 2675--702.

Aghion, Philippe, and Peter Howitt. 1992. "A Model of Growth Through Creative Destruction." *Econometrica* 60 (2): 323--51.

Aghion, Philippe, Benjamin F. Jones, and Charles I. Jones. 2019. "Artificial Intelligence and Economic Growth." In *The Economics of Artificial Intelligence: An Agenda*. University of Chicago Press.

Akcigit, Ufuk, and Sina T. Ates. 2023. "What Happened to US Business Dynamism?" *Journal of Political Economy* 131 (8): 2059--124.

Akcigit, Ufuk, and William R. Kerr. 2018. "Growth Through Heterogeneous Innovations." *Journal of Political Economy* 126 (4): 1374--443.

Andrade, Gregor, Mark Mitchell, and Erik Stafford. 2001. "New Evidence and Perspectives on Mergers." *Journal of Economic Perspectives* 15 (2): 103--20.

Arikan, Asli M., and René M. Stulz. 2016. "Corporate Acquisitions, Diversification, and the Firm's Life Cycle." *The Journal of Finance* 71 (1): 139--93.

Atkeson, Andrew, Jonathan Heathcote, and Fabrizio Perri. 2024. "The End of Privilege: A Reexamination of the Net Foreign Asset Position of the United States." Unpublished manuscript. May.

Autor, David, David Dorn, Lawrence F. Katz, Christina Patterson, and John Van Reenen. 2020. "The Fall of the Labor Share and the Rise of Superstar Firms." *The Quarterly Journal of Economics* 135 (2): 645--709.

Baily, Martin Neil, Charles Hulten, and David Campbell. 1992. "Productivity Dynamics in Manufacturing Plants." *Brookings Papers on Economic Activity (Microeconomics)*, 187--249.

Bartelsman, Eric J., and Mark Doms. 2000. "Understanding Productivity: Lessons from Longitudinal Microdata." *Journal of Economic Literature* 38 (3): 569--94.

Beeler, Jason, and John Y. Campbell. 2012. "The Long-Run Risks Model and Aggregate Asset Prices: An Empirical Assessment." *Critical Finance Review* 1: 141--82.

Betton, Sandra, B. Espen Eckbo, and Karin S. Thorburn. 2008. "Corporate Takeovers." In *Handbook of Empirical Corporate Finance*, edited by B. Espen Eckbo. Handbooks in Finance. Elsevier.

Binsbergen, Jules van. 2021. "Duration-Based Stock Valuation: Reassessing Stock Market Performance and Volatility." Unpublished manuscript.

Bloom, Nicholas, Charles I. Jones, John Van Reenen, and Michael Webb. 2020. "Are Ideas Getting Harder to Find?" *American Economic Review* 110 (4): 1104--44.

Bloom, Nick, Rachel Griffith, and John Van Reenen. 2002. "Do R&D Tax Credits Work? Evidence from a Panel of Countries 1979--1997." *Journal of Public Economics* 85 (1): 1--31.

Blundell, Richard, Rachel Griffith, and Frank Windmeijer. 2002. "Individual Effects and Dynamics in Count Data Models." *Journal of Econometrics* 108 (1): 113--31.

Broda, Christian, and David E. Weinstein. 2006. "Globalization and the Gains from Variety." *The Quarterly Journal of Economics* 121 (2): 541--85.

Catherine, Sylvain, Mehran Ebrahimian, David Sraer, and David Thesmar. 2023. "Robustness Checks in Structural Analysis." Unpublished manuscript.

Chetty, Raj. 2012. "Bounds on Elasticities with Optimization Frictions: A Synthesis of Micro and Macro Evidence on Labor Supply." *Econometrica* 80 (3): 969--1018.

Cho, Thummim, Marco Grotteria, Lukas Kremens, and Howard Kung. 2024. "The Present Value of Future Market Power." Unpublished manuscript.

Cocci, Matthew D., and Mikkel Plagborg-Møller. Forthcoming. "Standard Errors for Calibrated Parameters." *Review of Economic Studies*.

Cochrane, John H., Francis A. Longstaff, and Pedro Santa-Clara. 2008. "Two Trees." *The Review of Financial Studies* 21 (1): 347--85.

Corhay, Alexandre, Howard Kung, and Lukas Schmid. 2020. "Competition, Markups, and Predictable Returns." *The Review of Financial Studies* 33 (12): 5906--39.

Crouzet, Nicolas, and Janice Eberly. 2023. "Rents and Intangible Capital: A q+ Framework." *The Journal of Finance* 78 (4): 1873--916.

David, Joel M. 2021. "The Aggregate Implications of Mergers and Acquisitions." *The Review of Economic Studies* 88 (4): 1796--830.

De Loecker, Jan, Jan Eeckhout, and Gabriel Unger. 2020. "The Rise of Market Power and the Macroeconomic Implications." *The Quarterly Journal of Economics* 135 (2): 561--644.

De Ridder, Maarten. 2024. "Market Power and Innovation in the Intangible Economy." *American Economic Review* 114 (1): 199--251.

Doidge, Craig, G. Andrew Karolyi, and René M. Stulz. 2017. "The U.S. Listing Gap." *Journal of Financial Economics* 123 (3): 464--87.

Duffie, Darrell, and Larry G. Epstein. 1992. "Asset Pricing with Stochastic Differential Utility." *The Review of Financial Studies* 5 (3): 411--36.

Eckbo, B. Espen, and Markus Lithell. 2023. "Merger-Driven Listing Dynamics." *Journal of Financial and Quantitative Analysis*, 1--49.

Edmond, Chris, Virgiliu Midrigan, and Daniel Yi Xu. 2023. "How Costly Are Markups?" *Journal of Political Economy* 131 (7): 1619--75.

Eggertsson, Gauti B., Jacob A. Robbins, and Ella Getz Wold. 2021. "Kaldor and Piketty's Facts: The Rise of Monopoly Power in the United States." *Journal of Monetary Economics* 124: S19--38.

Ewens, Michael, Ryan H. Peters, and Sean Wang. Forthcoming. "Measuring Intangible Capital with Market Prices." *Management Science*.

Farhi, Emmanuel, and François Gourio. 2018. "Accounting for Macro-Finance Trends: Market Power, Intangibles, and Risk Premia." *Brookings Papers on Economic Activity*, 147--250.

Fons-Rosen, Christian, Pau Roldan-Blanco, and Tom Schmitz. 2022. "The Effects of Startup Acquisitions on Innovation and Economic Growth." Unpublished manuscript.

Foster, Lucia, John C. Haltiwanger, and C. J. Krizan. 2001. "Aggregate Productivity Growth: Lessons from Microeconomic Evidence." In *New Developments in Productivity Analysis*. University of Chicago Press.

Gârleanu, Nicolae, Leonid Kogan, and Stavros Panageas. 2012. "Displacement Risk and Asset Returns." *Journal of Financial Economics* 105 (3): 491--510.

Gordon, Robert J. 2016. *The Rise and Fall of American Growth: The US Standard of Living Since the Civil War*. Princeton University Press.

Greenwald, Daniel L., Martin Lettau, and Sydney C. Ludvigson. 2025. "How the Wealth Was Won: Factor Shares as Market Fundamentals." *Journal of Political Economy* 133 (4): 1083--132.

Griliches, Zvi, and Haim Regev. 1995. "Firm Productivity in Israeli Industry 1979--1988." *Journal of Econometrics* 65: 175--203.

Grossman, Gene M., and Elhanan Helpman. 1991. "Quality Ladders in the Theory of Growth." *Review of Economic Studies* 58 (1): 43--61.

Hall, Bronwyn H. 1993. "R&D Tax Policy During the 1980s: Success or Failure?" *Tax Policy and the Economy* 7: 1--35.

Hall, Bronwyn H., and Rosemarie Ham Ziedonis. 2001. "The Patent Paradox Revisited: An Empirical Study of Patenting in the U.S. Semiconductor Industry, 1979--1995." *The RAND Journal of Economics* 32 (1): 101--28.

Haltiwanger, John C. 1997. "Measuring and Analyzing Aggregate Fluctuations: The Importance of Building from Microeconomic Evidence." *Federal Reserve Bank of St. Louis Review* 79 (3): 55--77.

Healy, Paul M., Krishna G. Palepu, and Richard S. Ruback. 1992. "Does Corporate Performance Improve After Mergers?" *Journal of Financial Economics* 31 (2): 135--75.

Hopenhayn, Hugo A. 1992. "Entry, Exit, and Firm Dynamics in Long Run Equilibrium." *Econometrica* 60 (5): 1127--50.

Hurst, Erik, and Benjamin Wild Pugsley. 2011. "What Do Small Businesses Do?" *Brookings Papers on Economic Activity*, no. 2: 73--118.

Jensen, Michael C., and Richard S. Ruback. 1983. "The Market for Corporate Control: The Scientific Evidence." *Journal of Financial Economics* 11 (1): 5--50.

Jones, Benjamin F. 2009. "The Burden of Knowledge and the 'Death of the Renaissance Man': Is Innovation Getting Harder?" *The Review of Economic Studies* 76 (1): 283--317.

Jones, Benjamin F. 2010. "Age and Great Invention." *The Review of Economics and Statistics* 92 (1): 1--14.

Jones, Charles I. 1995. "R&D-Based Models of Economic Growth." *Journal of Political Economy* 103 (4): 759--84.

Jones, Charles I. 2022a. "Taxing Top Incomes in a World of Ideas." *Journal of Political Economy*.

Jones, Charles I. 2022b. "The End of Economic Growth? Unintended Consequences of a Declining Population." *American Economic Review* 112 (11): 3489--527.

Jones, Charles I., and Christopher Tonetti. 2026. "Past Automation and Future A.I.: How Weak Links Tame the Growth Explosion." Unpublished manuscript.

Judd, Kenneth L. 1998. *Numerical Methods in Economics*. MIT Press.

Keane, Michael P. 2011. "Labor Supply and Taxes: A Survey." *Journal of Economic Literature* 49 (4): 961--1075.

Kehrig, Matthias, and Nicolas Vincent. 2021. "The Micro-Level Anatomy of the Labor Share Decline." *The Quarterly Journal of Economics* 136 (2): 1031--108.

Kendall, David G. 1948. "On the Generalized 'Birth-and-Death' Process." *The Annals of Mathematical Statistics* 19 (1): 1--15.

Klette, Tor Jakob, and Samuel Kortum. 2004. "Innovating Firms and Aggregate Innovation." *Journal of Political Economy* 112 (5): 986--1018.

Kogan, Leonid, Dimitris Papanikolaou, Amit Seru, and Noah Stoffman. 2017. "Technological Innovation, Resource Allocation, and Growth." *The Quarterly Journal of Economics* 132 (2): 665--712.

Lentz, Rasmus, and Dale T. Mortensen. 2008. "An Empirical Model of Growth Through Product Innovation." *Econometrica* 76 (6): 1317--73.

Liu, Ernest, Atif Mian, and Amir Sufi. 2022. "Low Interest Rates, Market Power, and Productivity Growth." *Econometrica* 90 (1): 193--221.

Loualiche, Erik. Forthcoming. "Asset Pricing with Entry and Imperfect Competition." *The Journal of Finance*.

Luttmer, Erzo G. J. 2011. "On the Mechanics of Firm Growth." *The Review of Economic Studies* 78 (3): 1042--68.

Maksimovic, Vojislav, and Gordon Phillips. 2001. "The Market for Corporate Assets: Who Engages in Mergers and Asset Sales and Are There Efficiency Gains?" *The Journal of Finance* 56 (6): 2019--65.

Maksimovic, Vojislav, and Gordon Phillips. 2002. "Do Conglomerate Firms Allocate Resources Inefficiently Across Industries? Theory and Evidence." *The Journal of Finance* 57 (2): 721--67.

Martin, Ian. 2013. "The Lucas Orchard." *Econometrica* 81 (1): 55--111.

Merton, Robert C. 1969. "Lifetime Portfolio Selection Under Uncertainty: The Continuous-Time Case." *The Review of Economics and Statistics* 51 (3): 247--57.

Miller, Max, James D. Paron, and Jessica A. Wachter. 2026. "Sovereign Default and the Decline in Interest Rates." *The Review of Financial Studies*.

Nakamura, Emi, Jón Steinsson, Robert Barro, and José Ursúa. 2013. "Crises and Recoveries in an Empirical Model of Consumption Disasters." *American Economic Journal: Macroeconomics* 5 (3): 35--74.

Olmstead-Rumsey, Jane. 2022. "Market Concentration and the Productivity Slowdown." Unpublished manuscript.

Paron, James D. 2022. "Heterogeneous-Agent Asset Pricing: Timing and Pricing Idiosyncratic Risks." Unpublished manuscript.

Peters, Michael. 2020. "Heterogeneous Markups, Growth, and Endogenous Misallocation." *Econometrica* 88 (5): 2037--73.

Peters, Michael, and Conor Walsh. 2022. "Population Growth and Firm-Product Dynamics." Unpublished manuscript.

Phillips, Gordon M., and Alexei Zhdanov. 2013. "R&D and the Incentives from Merger and Acquisition Activity." *The Review of Financial Studies* 26 (1): 34--78.

Romer, Paul M. 1990. "Endogenous Technological Change." *Journal of Political Economy* 98 (5): S71--102.

Wilson, Daniel J. 2009. "Beggar Thy Neighbor? The In-State, Out-of-State, and Aggregate Effects of R&D Tax Credits." *The Review of Economics and Statistics* 91 (2): 431--36.

[^1]: Stanford University, Graduate School of Business. [jparon@stanford.edu](mailto:jparon@stanford.edu).

[^2]: I am exceedingly grateful to Tom Winberry, Jules van Binsbergen, Jessica Wachter, Sylvain Catherine, and many others at Wharton for their extensive guidance. I also thank Ufuk Akcigit, Maarten De Ridder, Ben Hébert, Joachim Hubmer, Chad Jones, Erik Loualiche, Max Miller, Simon Mongey, Lukas Nord, Dimitris Papanikolaou, Monika Piazzesi, Tom Sargent, Chris Tonetti, Colin Ward, and seminar participants at Wharton, Cambridge, Harvard, Toronto, LBS, Duke, NYU, Stanford, Berkeley, Northwestern, MIT, UT Austin, USC, Columbia, Maryland, Notre Dame, HEC, SFS Cavalcade, WFA, and the Chicago Booth Asset Pricing Conference for helpful comments.

[^3]: I find a decline of similar magnitude in the aggregate market value of patents (Kogan et al. 2017), meaning that this fact is true of both research outputs and inputs.

[^4]: At higher frequencies, within-firm changes explain most aggregate movements. This includes the collapse and recovery of valuations around 1980 and the dot-com boom around 2000.

[^5]: In support of the hypothesis that this was also a reallocation toward high-growth firms, I document that high-valuation firms tend to be high-R&D firms in the cross-section. An analogous decomposition of aggregate R&D-to-sales reveals that within-firm R&D-to-sales has been falling, but high-R&D firms have been gaining market share, consistent with both declining innovation productivity and reallocation.

[^6]: Beyond valuations, Luttmer (2011) shows that this is essential for matching the firm-size distribution. It also implies a long-run link between population and economic growth (Jones 1995, 2022b).

[^7]: The transition does not explain the late-1990s M&A wave; however, the cumulative rise in M&A over this period is similar in the model and data, which is what matters for quantifying total reallocation.

[^8]: In explaining both the decline in R&D-to-value and the rise in value-to-sales, the model naturally explains the well-known fact that aggregate R&D-to-sales (and R&D-to-GDP) has not fallen over this period.

[^9]: This episode was mainly driven by a large "within-firm" change in the micro decompositions of aggregate valuations, further supporting the idea that this is explained by a common discount-rate shock.

[^10]: Farhi and Gourio (2018) and Miller et al. (2026) focus on the puzzle that interest rates have plummeted since 1980 but the market has not risen commensurately. Farhi and Gourio (2018) argue that the equity premium rose while Miller et al. (2026) argue that the true riskfree rate did not fall as much as real yields on government debt. Both of these stories are qualitatively consistent with my findings. Like Binsbergen (2021), I argue that a decline in growth can help reconcile the performance of stocks and bonds.

[^11]: This is consistent with the argument of Crouzet and Eberly (2023) that intangibles (in this case, incumbents' stock of ideas) drove a wedge between "average $Q$" and "marginal $q$."

[^12]: Of course, insofar as markets are incomplete, some households may have benefited more than others.

[^13]: The U.S. stock market is historically only about twice the size of one year of aggregate consumption. Even with extremely high discount rates, the consumption claim would have a much higher multiple than 2.

[^14]: My estimate of a roughly $50\%$ total decline in research productivity since 1975 is smaller in magnitude than, but still reasonably close to, most of the micro and macro estimates in Bloom et al. (2020).

[^15]: Greenwald et al. (2025) assume the stock market is a levered consumption claim, thus making the critical distinction between the profit claim and human capital. They do not, however, distinguish between incumbents' profits and future entrants' profits, which I find to be essential.

[^16]: They find that declining population growth initially boosts growth before lowering it. Feeding the population-growth decline into my model would imply both a short- and a long-run growth decline. This could also generate the more extreme prediction of Jones (2022b) of the "end of economic growth."

[^17]: In De Ridder (2024), growth falls because of decreasing returns to R&D within firms: rising concentration reduces the productivity of the average researcher, rendering new ideas effectively harder to find.

[^18]: Aghion et al. (2023) also seek to explain the empirical fact that markups have fallen within firms, yet aggregate markups have risen because of a reallocation to high-markup firms (Kehrig and Vincent 2021; De Loecker et al. 2020; Autor et al. 2020). My model also generates diverging micro and macro trends, because the high-productivity firms that gain market share are also high-markup firms, but have to compete more with each other as they gain market share.

[^19]: While declining knowledge diffusion could be a cause of declining innovation productivity---say, if knowledge about competitors' ideas inspires researchers to invent new ones---the key mechanism in these papers is the asymmetric effect between leaders and laggards, which is different from the uniform shift in my paper.

[^20]: The series is filtered with a five-year, two-sided moving average to help with visualization and, specifically, to smooth out the M&A wave of the late 1990s. Appendix 10.1.2 reports and analyzes the unfiltered series.

[^21]: Acquisitions of assets may include acquisitions of ideas and product lines---the focus of this paper---or purchases of resources like physical capital. These cannot be disentangled in the SDC data.

[^22]: See Appendix 10.2 for a derivation.

[^23]: I also show how the decomposition can be constructed from the Haltiwanger (1997) decomposition that is commonly used in macroeconomics, and how the adjustments I make are important for addressing econometric biases that emerge in this class of decomposition when studying secular change in valuations.

[^24]: This is consistent with the findings of Doidge et al. (2017) and Eckbo and Lithell (2023) that the U.S. listing gap is in part explained by the increasing M&A of public targets by public acquirers. In other words, the "missing firms" in public markets are still operating under the ownership of public acquirers.

[^25]: Note that this assumption is not the same as a decentralized economy with perfect consumption insurance (complete markets), because entrepreneurial decisions would be distorted by a moral hazard problem from redistributing private gains (Jones 2022a). One could consider an economy without consumption insurance, but then the distribution of households would become a state variable. The assumption of a family, in contrast, yields essentially complete markets without distorting private incentives for firm creation.

[^26]: It is common in endogenous-growth models to put the good-specific product quality $q_{ijt}$ into the production function instead of the demand function. These are isomorphic assumptions.

[^27]: To ensure a unique equilibrium, I assume that firms indifferent between producing and not producing (i.e., with zero markup) do not produce. This can be justified by an infinitesimal fixed operating cost.

[^28]: An alternative assumption would be to have a single R&D technology with undirected innovations ex post: researchers come up with an idea, and then that idea happens to be a quality innovation with probability $\alpha$ (Peters and Walsh 2022). Properly calibrated, this alternative would yield similar results.

[^29]: It would be straightforward to allow firms to direct search in quality innovations by allowing them to target blueprints with specific characteristics. Further, as Section 7 discusses, the assumption that firms only innovate on other firms' blueprints could be relaxed by introducing an own-innovation technology.

[^30]: I assume that firms acquire individual blueprints instead of entire firms because the latter would significantly increase the dimensionality of the state space. This is of minimal economic consequence, provided that the M&A search function is properly calibrated---having firms acquire single blueprints at a higher frequency has the same implication as having them acquire collections of blueprints at a lower frequency.

[^31]: David (2021) and Fons-Rosen et al. (2022) also model M&A as arising from endogenous search.

[^32]: Note that this distribution is not the same as the distribution across *firms*---which firms own which sets of blueprints---which we will see is sufficient but not necessary for defining and solving the equilibrium.

[^33]: The equilibrium is well-defined provided that, for every firm $i$ and time $t$, the expected return on the firm is greater than the expected growth rate of its value: $\mathbb{E}_t[dR_{it}] > \mathbb{E}_t[dV_{it}]/V_{it^-}$. This "$r$ minus $g$" condition is necessary for all firms to have finite market values; if it is violated, then the valuations of firms with high expected growth rates are infinite/undefined. An example of a parameterization that might violate this condition is a very low value of $\alpha$ and a high $a_h/a_\ell$, in which case incumbents face very little creative destruction, so high-productivity firms may grow more quickly than the returns demanded by investors.

[^34]: Note the re-indexing of $\{a_i,q_{ijt},\mu_{ijt}\}$ to $\{a_{jt},q_{jt},\mu_{jt}\}$, because each variety has only one producer.

[^35]: I call this "aggregate productivity" even though it puts weight on both qualities $q_j$ and productivities $a_j$. The reason is that, if there is no heterogeneity in productivity ($a_h=a_\ell=1$), then we get $\bar{a}=1$ for any quality distribution; but if there is no quality heterogeneity ($q_{jt}/Q_t=1$, $\forall j$), then $\bar{a}_t\in[a_\ell,a_h]$. In other words, $\bar{a}_t$ is principally a measure of how blueprints are distributed across productivity types $a_i$.

[^36]: Note that $\Omega$ is only a "static" misallocation measure because it takes as given the current blueprint distribution, which can only be changed over time by R&D and M&A technologies. It also corresponds to the allocation that maximizes output per labor hour ($Y_t/L_{{\rm p}t}$), not necessarily output per capita ($Y_t/L_t$).

[^37]: A natural definition of the aggregate markup is thus $\Lambda^{-1}$, as in Edmond et al. (2023).

[^38]: There is some abuse of notation here and in what follows: to be precise, we should say that $\Pi_{ijt} = \pi_{i_{1jt}i_{2jt}\hat{q}_{jt}}Y_t/L_t$, where $\{i_{1jt},i_{2jt},\hat{q}_{jt}\}$ is the set of variety $j$'s characteristics at time $t$.

[^39]: Labor hours $l_{\rm x}$ include hours worked for both incumbents and entrepreneurship.

[^40]: The constants $\{\bar{a},\Omega,l_{\rm p}\}$ affect the *level* of output, but not the growth rate; they will become relevant for growth only along a transition path in which there is net reallocation.

[^41]: Intuitively, if $g_{Nt}<g_L$, then $L_t/N_t$ will rise, increasing $g_{Nt}$ until it is equal to $g_L$.

[^42]: For models that assume an adjustment cost instead of an idea-production function, $\varepsilon=0.5$ corresponds to a quadratic cost function, which is also common (Klette and Kortum 2004; Peters 2020).

[^43]: One could instead estimate $\varrho$ from the average merger premium in SDC Platinum, which is roughly $1.4\times$ the target's pre-announcement value. Doing so yields an estimate close to $0.5$. The main results are in general not sensitive to $\varrho$ because total surplus, not the split of surplus, is what matters for aggregate value. For the same reason, results would not change if acquirers systematically "overpaid" due to agency frictions.

[^44]: In Section 7 below, I investigate what happens when stock returns are risky. In this case, $\rho$ can be replaced by a risk-adjusted discount rate that adds risk premia and subtracts precautionary savings demand. Thus, the estimation is still valid, but one should interpret the estimate for $\rho$ as also including a risk adjustment.

[^45]: As a robustness check, Appendix 14.4 re-estimates the model under alternative identifying moments for $a_h/a_\ell$ and $\omega$ and shows that results are unchanged.

[^46]: As $a_h/a_\ell$ becomes very large, this moment will begin to decrease as high-productivity firms accumulate 100% of the revenue share; however, quantitatively, these large values of $a_h/a_\ell$ are far outside of the relevant parameter space, so this is not a concern for identification. The same intuition holds for $\omega$.

[^47]: A low $\omega$ is also consistent with the amount of concentration in the data. If $\omega$ is low, the concentration of the firm-size distribution (i.e., the thickness of the Pareto tail) will be higher, because more blueprints will be owned by a smaller number of high-productivity firms. High values of $\omega$ will imply too little concentration. Appendix 14.4 shows this by re-estimating the model using the Pareto tail coefficient of the employment distribution as an identifying moment; it implies an even lower value of $\omega$ and yields similar main results.

[^48]: See Appendix 14.3 for a derivation of this and for a discussion of the general case with $\zeta>0$.

[^49]: For this reason, it is largely irrelevant whether skilled labor can be perfectly substituted between M&A and R&D (as in this model) or cannot be substituted between these tasks at all.

[^50]: Appendix 14.3 explains why these labor-supply effects are negligible in response to changes in $\varphi_t$.

[^51]: This time path corresponds to the estimates $\kappa_\varphi=0.3$, $t_{\rm mid}=1992.5$, and $\varphi_\infty=0.022$.

[^52]: I assume all agents can foresee the transition starting in 1965. Were we to depart from perfect foresight, we would need to make additional assumptions about learning. This would dramatically complicate the numerical solution, because we would need the full *expected* time path of the distribution at each time step.

[^53]: Jones (1995) points out that, in the very long run, the rate of economic growth is ultimately tied to the rate of population growth, because people are the source of ideas. In other words, very-long-run economic growth is not endogenous, but "semi-endogenous." My model also features semi-endogenous growth, as seen in the fact that the rate of new-variety creation $g_N$ will equal the rate of population growth $g_L$ in any balanced-growth equilibrium (BGE). However, along the transition between BGEs, this is no longer true. When innovation productivity falls, $g_{Nt}$ falls below $g_L$. This means that the number of researchers slowly begins to outgrow the number of blueprints, so the ratio $L_t/N_t$ begins to increase. Because researchers are an input into blueprint creation, this raises the rate of innovation until $g_N$ again equals $g_L$. This transition takes centuries to materialize as higher growth, and thus does not matter much over the period in Figure 6.[^93] If we assume population growth will also fall in the future (as current U.N. forecasts predict), then the decline in growth will be permanent, as in Peters and Walsh (2022) and Jones (2022b).
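
The adjustment described in this footnote can be illustrated with a minimal simulation. This is only a sketch, not the paper's model: it assumes a reduced-form rule $g_{Nt}=\phi\,(L_t/N_t)^{\varepsilon}$, and the function name and all parameter values (`phi`, `eps`, `ratio0`) are hypothetical placeholders chosen for illustration.

```python
import math

def simulate_variety_growth(g_L=0.01, phi=0.02, eps=0.5, ratio0=0.1, years=2000):
    """Return the yearly path of g_N under the ASSUMED rule g_N = phi * (L/N)**eps.

    The ratio L/N evolves by d ln(L/N)/dt = g_L - g_N, stepped one year at a time.
    All parameter values are hypothetical, not estimates from the paper.
    """
    ratio = ratio0
    path = []
    for _ in range(years):
        g_N = phi * ratio ** eps          # assumed reduced-form innovation rule
        path.append(g_N)
        ratio *= math.exp(g_L - g_N)      # L/N rises whenever g_N < g_L
    return path

path = simulate_variety_growth()
print(f"g_N at start: {path[0]:.5f}")
print(f"g_N after 100 years: {path[100]:.5f}")
print(f"g_N after 2000 years: {path[-1]:.5f}")
```

Under these placeholder values, $g_{Nt}$ starts below $g_L$ and climbs back toward it only slowly, which is the qualitative point of the footnote; the speed of convergence depends entirely on the assumed rule and parameters.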

[^54]: Changes in static misallocation $\Omega$ (due to markup dispersion) are close to zero.

[^55]: In support of this claim, recall that the empirical decomposition in Figure 3 showed that this episode is explained by a common within-firm change in valuations, not a compositional change in the market.

[^56]: Making M&A search temporarily easier would not only cause an M&A wave in the model, it would also reduce the amount of M&A spending after the 1990s. This is because, after having found and bought a large number of acquisition targets, it becomes harder for acquirers to find new targets.

[^57]: In reality, the entry rate is a weighted average of innovative firms (this model) and non-innovative firms (Hurst and Pugsley 2011). While the model cannot speak directly to the latter, one possibility is that fewer new ideas imply less demand for new non-innovative firms whose services are based around these ideas.

[^58]: In the data, I infer the real rate from monthly realized returns on short-term bonds, defined as the Fed Funds rate deflated by realized one-month-ahead CPI inflation. In the spirit of Beeler and Campbell (2012), I run time-series regressions of next month's realized return on the current nominal Fed Funds rate, last month's inflation, and last year's inflation. The expected return is then the predicted value from the regression. I use this approach because survey expectations (e.g., the SPF) do not go back far enough.
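
The regression step in this footnote can be sketched as follows. This is a minimal illustration on synthetic data, not the author's code: the series are random stand-ins for the Fed Funds rate and CPI inflation, the timing convention (`inflation[t]` is inflation realized over the month ending at `t`) is an assumption, and "last year's inflation" is taken to be the trailing 12-month average.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 240  # months of SYNTHETIC data (not the paper's sample)

# Synthetic stand-ins, in annualized decimals.
inflation = 0.03 + 0.01 * rng.standard_normal(T)
fed_funds = 0.02 + inflation + 0.005 * rng.standard_normal(T)

# Realized real return over the month following t:
# nominal rate at t deflated by inflation realized at t + 1.
realized = fed_funds[:-1] - inflation[1:]

# Predictors known at t: current nominal rate, most recent monthly inflation,
# and trailing 12-month average inflation (assumed definition).
t = np.arange(11, T - 1)
last_month = inflation[t]
last_year = np.array([inflation[s - 11:s + 1].mean() for s in t])
X = np.column_stack([np.ones(t.size), fed_funds[t], last_month, last_year])
y = realized[t]

# OLS; the fitted value is the expected real rate.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
expected_real_rate = X @ beta

print("coefficients:", np.round(beta, 4))
print("mean expected real rate:", round(float(expected_real_rate.mean()), 4))
```

Because the regressors are all observable at time $t$, the fitted values can be read as an ex-ante expected return rather than an ex-post realized one.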

[^59]: For example, if the valuation boom came from a decline in aggregate risk or increased savings demand from abroad (which would raise the wealth of U.S. investors relative to the rest of the world), then agents might still be better off from a welfare perspective.

[^60]: $k(t)$ includes the effect of current and future labor disutility on welfare, which I do not consider here.

[^61]: $\kappa_z>0$ implies that disasters are followed by recoveries (Nakamura et al. 2013).

[^62]: As Appendix 15.1 shows, keeping an EIS of one is essential for tractability with aggregate shocks. If the EIS is not one, then the blueprint distribution is no longer stationary, but depends on the history of $z_t$.

[^63]: Adding capital adjustment costs or assuming that goods producers choose investment is straightforward.

[^64]: Note that this measure is based on who performs the R&D, so it includes R&D funded by the federal government if the research is ultimately performed by businesses. The share of R&D performed by the federal government since the 1950s is both small and stable relative to GDP.

[^65]: A second reason is coverage: the set of firms that report R&D is larger than the set of firms that patent. Additionally, within firms, not all new ideas that are generated by R&D necessarily result in patents.

[^66]: In the corporate finance literature, this has sometimes been called the "neoclassical" view. It is often contrasted with agency or behavioral views, which posit that M&A arises from frictions like information asymmetry (mispricing) or overconfidence; however, as discussed below, these views are not always mutually exclusive.

[^67]: Specifically, they consider a value-weighted average of the CARs for the target and acquirer, meaning that this is the CAR for an investor who held both firms before the announcement, as in the model.

[^68]: These empirical results should be interpreted with caution for two major reasons. First, the gains from M&A must be measured relative to the counterfactual in which the transaction did not take place. Second, as Betton et al. (2008) point out, most of the large-bidder acquisitions with negative returns that drive the average results are concentrated in a few firms in 1999 and 2000, which may indicate a temporary period of overvaluation and also confounds acquirer-size and time fixed effects in empirical studies.

[^69]: Formally, if $\Delta n_{\rm m}$ represents the number of successful M&A searches (acquisitions) over a small time interval $\Delta t$, then the variance of acquisitions per blueprint behaves like $\mathrm{var}(\Delta n_{\rm m}/n) \propto (\psi X_{\rm m}^\varepsilon n^{1-\varepsilon}\Delta t)/n^2\propto1/n$. For large $n$, this percentage variance tends to zero, so M&A announcements yield little news about future firm cashflows; in contrast, for small $n$, news of a successful M&A deal is almost entirely unanticipated, so returns should jump up upon announcement. Recall that firm value $V$ grows with $n$ on average, so the unexpected return upon announcement, $(\Delta V - \mathbb{E}\Delta V)/V$, behaves like $(\Delta n_{\rm m} - \mathbb{E}\Delta n_{\rm m})/n$ on average.
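
The proportionality in this footnote can be unpacked in two steps. The sketch below assumes that acquisitions arrive as a Poisson process with intensity $\psi X_{\rm m}^\varepsilon n^{1-\varepsilon}$, and that search effort scales with firm size, $X_{\rm m}=x_{\rm m}n$ for a constant $x_{\rm m}$ (the symbol $x_{\rm m}$ is introduced here for illustration, as the scaling that the footnote's final $\propto 1/n$ step implicitly requires).

$$
\mathrm{var}(\Delta n_{\rm m}) \approx \psi X_{\rm m}^\varepsilon n^{1-\varepsilon}\Delta t = \psi x_{\rm m}^\varepsilon\, n\,\Delta t,
\qquad
\mathrm{var}\!\left(\frac{\Delta n_{\rm m}}{n}\right) = \frac{\mathrm{var}(\Delta n_{\rm m})}{n^2} \approx \frac{\psi x_{\rm m}^\varepsilon\,\Delta t}{n}.
$$

The first expression uses the fact that, over a short interval, the variance of a Poisson count is approximately its intensity times $\Delta t$; dividing by $n^2$ then gives the $1/n$ decay.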

[^70]: This is not a problem for De Loecker et al. (2020), who show that the cross term is close to zero for their markup series.

[^71]: Griliches and Regev (1995) propose a similar accounting for TFP. Foster et al. (2001) argue that the Haltiwanger (1997) decomposition is better suited to TFP accounting, because the cross term is meaningful in that context.

[^72]: Specifying stationary dynamics for firm-level value-to-sales ratios ensures a stationary distribution.

[^73]: Specifically, choosing smaller cutoffs (e.g., the 1st and 99th percentile) or larger cutoffs yields the same conclusion: most of the increase in the aggregate value-to-sales ratio comes from a compositional change. Under smaller cutoffs, the positive composition effect in the decomposition is even larger, which is why I say that this choice is "conservative."

[^74]: The two lines do not necessarily add to one, because there are alternative categories of targets (e.g., government entities).

[^75]: If $\lambda<a_h/a_\ell$, then we could, in principle, have markets in which a high type with a quality *disadvantage* leads a low type because of its productivity gap. However, this would never emerge in equilibrium because entry by R&D and M&A always implies that the new firm replaces the leader, not the follower.

[^76]: I call this "aggregate productivity" even though it puts weight on both qualities $q_j$ and productivities $a_j$. The reason is that, if there is no heterogeneity in productivity ($a_h=a_\ell=1$), then we get $\bar{a}=1$ for any quality distribution; but if there is no quality heterogeneity ($q_{jt}/Q_t=1$, $\forall j$), then $\bar{a}_t\in[a_\ell,a_h]$. In other words, $\bar{a}_t$ is principally a measure of how blueprints are distributed across productivity types $a_i$.

[^77]: The reason (relative) markups reduce labor allocations is the household demand curve: when firms charge a higher price (markup), households demand less of their output.

[^78]: At this point, $\{\xi_t\}$ may or may not be unique; but market completeness will imply that it is.

[^79]: The formal argument is given in the proof of Proposition 1 in Appendix 11.6.

[^80]: Note that $l_{{\rm x}t}$ represents hours supplied to both incumbent firms and private entrepreneurial efforts, so we must subtract out entrepreneurial labor hours $x_{{\rm En}kt}$ and $x_{{\rm Eq}kt}$.

[^81]: A more precise setup would be to let the household choose portfolio weight $\alpha_{it}$ in every firm $i\in[0,M_t]$ in the market, then clear the market with $\alpha_{it}=V_{it}/V_t$. This would ultimately yield the same solution.

[^82]: $\dot{V}_t$ denotes growth inclusive of entry and exit, so subtracting entry $V_{{\rm E}t}$ gives incumbent value growth.

[^83]: As a reality check, note that doing the same for $i_1=\ell$ and then adding it to this equation yields the expression $g_{Nt}=\sum_i\Phi_{{\rm n}it}(f_{it}+\nu_{it})$, which is true by definition.

[^84]: Specifically, I use the notation $f(n)\stackrel{\rm lim}{\sim}g(n)$ to represent the fact that there exists a finite constant $k>0$ such that $\lim_{n\to\infty}\frac{f(n)}{g(n)} = k$.

[^85]: The vast majority of elements in $C$ are zero, so it is stored numerically as a sparse matrix for efficiency.

[^86]: The benefit of this approach is that the computationally intensive step (step one) does not depend on the empirical moments, so robustness checks with respect to moments can be performed at low cost.

[^87]: An alternative approach could be to look at "worst-case" entries for the unknown elements of the covariance matrix, as recently proposed by Cocci and Plagborg-Møller (Forthcoming). This would have a minor effect on the main results, which turn out to be very robust to moments.

[^88]: In the SMM algorithm, I actually use the inverse of the Pareto coefficient $1/\theta$ as the moment, because $\theta=g_L/(\bar{\Phi}_h-\bar{\delta}_h)$ is discontinuous (i.e., changes from negative infinity to positive infinity) in any part of the parameter space over which we cross from $\bar{\Phi}_h>\bar{\delta}_h$ to $\bar{\Phi}_h<\bar{\delta}_h$.
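
A small numerical check shows why the inverse is the better-behaved moment. This is an illustrative sketch only: the function names are hypothetical and the values of $g_L$, $\bar{\Phi}_h$, and $\bar{\delta}_h$ are placeholders, not estimates from the paper.

```python
def pareto_theta(g_L, phi_h, delta_h):
    """Pareto coefficient theta = g_L / (phi_h - delta_h), as in the footnote."""
    return g_L / (phi_h - delta_h)

def inverse_pareto_theta(g_L, phi_h, delta_h):
    """Inverse coefficient 1/theta, which is linear in phi_h - delta_h."""
    return (phi_h - delta_h) / g_L

g_L = 0.01       # hypothetical population growth rate
delta_h = 0.05   # hypothetical value of delta_h
gaps = [-1e-4, -1e-6, 1e-6, 1e-4]  # phi_h - delta_h on either side of zero

rows = [
    (gap,
     pareto_theta(g_L, delta_h + gap, delta_h),
     inverse_pareto_theta(g_L, delta_h + gap, delta_h))
    for gap in gaps
]

for gap, theta, inv_theta in rows:
    print(f"gap={gap:+.0e}  theta={theta:+.3e}  1/theta={inv_theta:+.3e}")
```

As the gap crosses zero, `theta` flips from a large negative to a large positive number, while `1/theta` passes smoothly through zero, so a moment-matching objective built on $1/\theta$ has no jump at the crossing.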

[^89]: As discussed in Section 7, this is no longer true if we allow for type switching (i.e., high-productivity firms becoming low-productivity). In this case, we could still have a thick-tailed distribution with a higher value of $\omega$ and a higher productivity gap $a_h/a_\ell$ (so most growth occurs at the beginning of a firm's life).

[^90]: Technically, in the notation of the main model, this is the evolution of total wealth per person $\overline{W}_t/L_t$, not market wealth $W_t$. I use $W_t$ to simplify notation.

[^91]: If $g_K$ is too high, then there will be too much capital and $r_K$ will fall below $r_f+\delta$, defying the investment first-order condition.

[^92]: Intuitively, as $\bar{m}$ increases, the upper bound on the left-hand side approaches a constant and the lower bound on the right-hand side rises to infinity, so $\bar{m}$ must be finite.