Dynamic Stochastic Modeling: Active vs. Passive Learning

In the late 1960s and early 1970s, Rausser, together with a number of fellow PhD students at UC Davis, began a journey to master control theory in its richest versions and to determine its policy relevance, including dual control, open-loop feedback control, closed-loop control, and M-measurement feedback control formulations. Given this common interest, many of his fellow students selected him as their PhD director after Rausser became a faculty member following only two years of coursework toward his own PhD degree.[1] As a group, they recognized early that many agricultural and natural resource systems require stochastic and dynamic models. A host of publications emerged from their collaboration, including the first application of adaptive control to trade policy (Rausser and Freebairn, 1974) and the first application of M-measurement feedback control to environmental externalities (Rausser and Howitt, 1975). The former was based on a dissertation that won an AAEA Outstanding Dissertation Award, and the latter received the 1975 AAEA Outstanding Research Discovery Award. Rausser’s book with E. Hochman, Dynamic Agricultural Systems: Economic Prediction and Control, won the AAEA Enduring Quality Award.

Along the way, a stream of publications focused on agricultural and natural resource systems (Rausser and Lapan, 1979; Pekelman and Rausser, 1978; Rausser, 1978; Freebairn and Rausser, 1975; Rausser, 1975; Rausser and Pekelman, 1980; Rausser and Willis, 1976; Rausser, 1974; Rausser et al., 1972). All of these publications embedded learning, sometimes passively and other times actively. Two later publications, Yasur et al. (1981) and Rausser and Small (2000), addressed the critical role of information, measurement, and learning.

All of these contributions advanced a policy or optimal control dimension that required taking a stand on the treatment of evolving measurements and information. Among the various approaches, the two most common are open-loop-with-revision and feedback. The former sets as a benchmark a deterministic problem under the fiction that new information will never arrive, with the understanding that when information does emerge, it will be incorporated into a decision or policy revision. The latter formulations pose a stochastic problem with “anticipated but passive learning”: the decision maker chooses the current policy recognizing that subsequent policy will be adapted to information or data not currently available. In contrast to a deterministic formulation, the state of the system is stochastic, but the moments of the stochastic and dynamic process are generally presumed to be known.
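A stylized linear-quadratic example may make the distinction concrete (the notation and functional forms here are ours, not drawn from the cited papers). Suppose the decision maker solves

\[
\min_{\{u_t\}} \; \mathbb{E}\!\left[\sum_{t=0}^{T} \bigl(q\,x_t^{2} + r\,u_t^{2}\bigr)\right]
\quad \text{subject to} \quad
x_{t+1} = a\,x_t + b\,u_t + \varepsilon_t, \qquad \varepsilon_t \sim (0, \sigma^{2}),
\]

with the moments $(a, b, \sigma^{2})$ treated as known. Open-loop with revision sets future disturbances to their means, solves the resulting deterministic problem for a path $\{u_s^{*}\}_{s=t}^{T}$, implements only $u_t^{*}$, and re-solves at $t+1$ once $x_{t+1}$ is observed. Feedback instead solves backwards for a contingency rule $u_t = -G_t x_t$ that is optimal for every realization of the state; the learning is anticipated but passive because the rule adapts to each new observation of the state without ever revising the presumed moments.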

Both the open-loop (with revision) and the feedback approaches incorporate new measurements or information as they become available, but neither selects actions or decisions with the objective of acquiring better measurements of the causal impacts of implemented decisions. A third approach, “dual control” or “active learning,” recognizes that choices not only have direct effects on outcomes or payoffs but also have indirect effects on improved measurements of causal impacts, sometimes referred to as response impact curves (Judge et al., 1977; Rausser and Johnson, 1975; Rausser et al., 1979). Pekelman and Rausser (1978) and Rausser and Pekelman (1980) recognized that firms can learn about an unknown demand function by varying the prices that they charge. A particular pricing choice not only affects current revenues or profits but also provides information about the price elasticity of demand. More accurate measurements increase future profitability. The optimal pricing policy balances the effects on current profits and on the acquisition of information. Active learning recognizes the tradeoff between expected losses in the short run and the future gain arising from more accurate measurements of the uncertain demand function.
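A minimal computational sketch may be useful here (the linear demand curve, the parameter values, and the one-step value-of-information proxy are all our illustrative assumptions, not the specification in Pekelman and Rausser). Demand has a known intercept and an unknown slope with a normal prior; each period the firm scores candidate prices by expected current profit plus the discounted reduction in posterior variance that the price would buy:

    # Dual control / active learning in pricing: a minimal sketch.
    # Demand: q = alpha - beta * p + noise, with alpha known and beta unknown.
    # A normal prior on beta is updated by a conjugate Bayes rule each period.
    import numpy as np

    rng = np.random.default_rng(0)

    alpha, beta_true, noise_sd = 10.0, 1.5, 1.0   # data-generating process
    mu, var = 1.0, 1.0                            # prior mean and variance of beta
    discount, horizon = 0.95, 20
    prices = np.linspace(0.1, 6.0, 60)            # feasible price grid

    def posterior_update(mu, var, p, q):
        """Conjugate normal update of beta after observing demand q at price p."""
        # Rearranged observation: alpha - q = beta * p + noise (linear-Gaussian).
        y = alpha - q
        new_var = 1.0 / (1.0 / var + p**2 / noise_sd**2)
        new_mu = new_var * (mu / var + p * y / noise_sd**2)
        return new_mu, new_var

    def expected_profit(mu, p):
        # Expected revenue given the current point estimate of the slope.
        return p * (alpha - mu * p)

    def dual_control_price(mu, var):
        """One-step-lookahead dual control: current profit + value of information.
        kappa is a crude stand-in for the marginal value of information in the
        true dynamic program; it weights the posterior-variance reduction."""
        kappa = 2.0
        best_p, best_val = None, -np.inf
        for p in prices:
            new_var = 1.0 / (1.0 / var + p**2 / noise_sd**2)
            val = expected_profit(mu, p) + discount * kappa * (var - new_var)
            if val > best_val:
                best_p, best_val = p, val
        return best_p

    for t in range(horizon):
        p = dual_control_price(mu, var)
        q = alpha - beta_true * p + rng.normal(0.0, noise_sd)
        mu, var = posterior_update(mu, var, p, q)
        print(f"t={t:2d}  price={p:.2f}  posterior on slope: mean={mu:.3f}, var={var:.4f}")

Because a higher price is a more informative experiment about the slope in this setup, the dual-control rule initially prices away from the myopic optimum; as the posterior variance shrinks, the information premium vanishes and pricing converges to the myopic rule.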

The various policy control formulations described here were originally introduced and applied by electrical engineers. In the 1970s, the NBER attempted to integrate the work of electrical engineers with the economics profession. At several NBER conferences, Rausser presented his applications to agricultural and resource economics. Macroeconomists became intrigued with the methods’ applicability to monetary and fiscal policy. Following an NBER conference at the University of Chicago, Business Week wrote:

“Control theory has swept into the economics profession so rapidly in the past two or three years that most economists are only dimly aware that it is around. But for econometricians and mathematical economists, and for the companies and government agencies that use their skills, it promises an improved ability to manage short-run economic stabilization, long-run economic growth, investment portfolios, and corporate cash positions” (Business Week, May 19, 1973; quoted in Athans and Kendrick, 1974).

This quote shows the significant early interest in using active learning formulations to obtain more accurate measurements of economic agents’ behavioral responses. However, the electrical engineering formulations generally dealt with physical responses, not agents’ behavioral responses. Moreover, macroeconomics is perhaps one of the fields least suited to useful designs of active learning models. Designing fiscal and monetary policies to actively acquire information is potentially valuable, but the high cost of manipulating the system is typically unacceptable. Solving, implementing, and managing active learning models to more accurately estimate behavioral responses can also present technically insurmountable problems.[2]

The “Lucas critique” (Lucas, 1976) and the subsequent work by Kydland and Prescott (1977), which followed the development of rational expectations modeling, presented a conceptual rather than technical challenge to policy applications of control theory. This critique recognizes that a standard optimal control formulation leads to time inconsistency in a setting where a decision maker would like to announce a sequence of future policies (e.g. taxes) in order to influence other agents’ current decisions (e.g. investment), but where the decision maker’s “future self” would want to deviate from the announced sequence. Absent the ability to commit to this future policy sequence, agents with rational expectations would have no reason to believe that it will be carried out, so the announcement will not have its intended effect in influencing agents’ current decisions. The Lucas critique means that policy cannot be effective when it relies on repeatedly surprising people, e.g. by using inflationary shocks to increase effective demand, or by promising low future capital taxes to encourage investment. This critique, sometimes construed to imply that public policy is powerless (Mundlak, 1990), made optimal control methods appear less important or even irrelevant in macroeconomics.
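A two-period sketch (a stylized example of ours, not drawn from the cited papers) captures the mechanism. Suppose the government announces a period-2 capital tax $\tau$ before agents invest, and agents choose capital $k(\tau^{e})$, decreasing in the expected tax $\tau^{e}$. Ex ante, announcing a low $\tau$ stimulates investment; but once $k$ is sunk, taxing it is non-distortionary ex post, so the government’s optimal ex post tax is high. Agents with rational expectations foresee this, set $\tau^{e}$ at the high ex post level regardless of the announcement, and invest little: without commitment, the announced low tax is simply not credible.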

A different response, one that has gained widespread currency in both micro- and macroeconomics, starts with the assumption that the policymaker understands that agents have rational expectations (Klein et al., 2008). These agents make decisions (e.g. about investment) based on their rational expectations about future government policy (e.g. taxes). Individual agents are too small to influence future government policy by manipulating the aggregate stock of endogenously changing capital; they therefore behave non-strategically despite having rational expectations. However, the agents’ aggregate decisions do change the stock of capital (or some other payoff-relevant state variable). Moreover, the policymaker in the current period is unable to commit successors to a particular policy sequence. Commitment in this setting is implausible, and it would vacuously “solve” the time consistency problem by assumption.

The resulting model is formally a Stackelberg dynamic game, in which both the strategic policymaker and a large number of nonstrategic agents have rational expectations (Karp and Havenner, 1984). There are many types of equilibria in such a model, although a standard refinement uses Markov perfection, in which all agents condition their current decisions on a payoff-relevant state variable such as aggregate capital; the non-strategic agents also condition their decisions on their private stocks of capital. The solution to a standard optimal control problem requires finding the planner’s optimal decision rule. The type of game described here is more complicated, because it requires finding a pair of equilibrium decision rules, one for the policymaker and one that represents the behavior of the nonstrategic representative agent. The Nash condition requires that each decision rule is the best response to the other agent’s decision rule. Moreover, each rule is a best response to the equilibrium decision rules that agents’ “future selves” use. The last requirement guarantees time consistency. The Lucas critique vitiates the applicability to public policy of a particular naïve optimal control model, but not of dynamic modeling in general.
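A schematic statement of the equilibrium conditions may help fix ideas (the notation is ours, a stylized capital-tax illustration rather than any particular model from this literature). With aggregate capital $K_t$, a policymaker’s rule $\tau_t = F(K_t)$, and a representative agent’s rule $i_t = f(k_t, K_t)$ for private capital $k_t$, a Markov perfect equilibrium is a pair $(F, f)$ such that

\[
\begin{aligned}
& f \ \text{maximizes the agent's value, taking } F \text{ and the aggregate law of motion } K_{t+1} = (1-\delta)K_t + f(K_t, K_t) \text{ as given;}\\
& F \ \text{maximizes the policymaker's value, taking } f \text{ and the rule } F \text{ used by future selves as given;}\\
& \text{consistency: with } k_t = K_t, \text{ individual and aggregate behavior coincide.}
\end{aligned}
\]

Because each period’s $F$ is a best response to the same $F$ used by future selves, no announced deviation from the equilibrium rule would be believed or carried out.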

Most of these papers won awards, but unfortunately two of the most important appeared in the Annals of Economic and Social Measurement, a journal established by the NBER that, because of the NBER’s limited budget, was terminated after four years of publication.


[1] Three of these PhD students have had sterling careers: John Freebairn, Richard Howitt, and Cleve Willis.

[2] The curse of dimensionality is particularly important in active learning formulations. These formulations require at least one state variable for each unknown parameter, potentially leading to an unmanageable number of state variables. The use of conjugate priors leads to tractable equations of motion for these state variables. However, even for specifications where conjugate priors make sense, the curse of dimensionality of the resulting system often makes dynamic programming impractical. Improved algorithms (e.g. a judicious choice of grid points for approximating functions) and computing capacity have dramatically relaxed the constraints imposed by the curse of dimensionality.
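A back-of-envelope calculation illustrates the difficulty (all numbers are our illustrative assumptions, and the count below ignores posterior covariance terms, which would add still more state variables):

    # Rough illustration of the curse of dimensionality in active learning
    # (illustrative numbers only). Each unknown parameter contributes its
    # posterior mean and variance as state variables; a tabular dynamic
    # program with g grid points per dimension needs g**dims cells.
    grid_points = 20
    for n_params in (1, 2, 4, 8):
        dims = 1 + 2 * n_params  # one physical state + (mean, variance) per parameter
        print(f"{n_params} unknown parameter(s): {dims} state variables, "
              f"{float(grid_points) ** dims:.2e} grid cells")

Even eight unknown parameters, a modest model by econometric standards, already imply on the order of $10^{22}$ grid cells, which is why the improved algorithms mentioned above matter so much.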