Agency theory is one of the most important developments in microeconomics in the past 20 years. It has applications in accounting, industrial organization, and labor economics, and it has become the basis of the economic model of compensation. Agency theory studies incentives, risk, and the selection of employees.
An agent is a person who works for, or on behalf of, another. Thus, an employee is an agent of a company. But agency extends beyond employee relationships. Independent contractors are also agents. Advertising firms, lawyers, and accountants are agents of their clients. The CEO of a company is an agent of the board of directors of the company. A grocery store is an agent of the manufacturer of corn chips sold in the store. Thus, the agency relationship extends beyond the employee into many different economic relationships. The entity (a person or corporation) on whose behalf an agent works is called a principal.
Agency theory is the study of incentives provided to agents. Incentives are an issue because agents need not have the same interests and goals as the principal. Employees spend billions of hours every year browsing the Web, e-mailing friends, and playing computer games while they are supposedly working. Attorneys hired to defend a corporation in a lawsuit have an incentive not to settle, to keep the billing flowing. (Such behavior would violate the attorneys’ ethics requirements.) Automobile repair shops have been known to use substandard or used replacement parts and bill for new, high-quality parts. These are all examples of a conflict in the incentives of the agent and the goals of the principal.
Agency theory focuses on the cost of providing incentives. When you rent a car, an agency relationship is created. Even though a car rental company is called an agency, it is most useful to look at the renter as the agent because it is the renter’s behavior that is an issue. The company would like the agent to treat the car as if it were her own car. The renter, in contrast, knows it isn’t her own car and often drives accordingly.
(On how rented cars tend to be driven, see P. J. O'Rourke, Republican Party Reptile (Boston: Atlantic Monthly, 1987), 242.)

How can the car rental company ensure that you don’t put its car into reverse while going forward at a high rate of speed? It could monitor your behavior, perhaps by putting a company representative in the car with you. That would be a very expensive and unpleasant solution to the problem of incentives. Instead, the company uses outcomes: if damage is done, the driver has to pay for it. That is also an imperfect solution because some drivers who abuse the cars get off scot-free, and others who don’t abuse the car still have cars that break down and are then mired in paperwork while they try to prove their good behavior. That is, a rule that penalizes drivers based on outcomes imposes risk on the drivers. Modern technology is improving monitoring with GPS tracking.
To model the cost of providing incentives, we consider an agent like a door-to-door encyclopedia salesperson. The agent will visit houses and sell encyclopedias to some proportion of the households; the more work the agent does, the more sales are made. We let x represent the average dollar value of sales for a given level of effort; x is a choice the agent makes. However, x will come with risk to the agent, which we model using the variance δ².
The firm will pay the agent a share s of the money generated by the sales. In addition, the firm will pay the agent a salary y, which is fixed independently of sales. This scheme—a combination of salary and commission—covers many different situations. Real estate agents receive a mix of salary and commission. Authors receive an advance and a royalty, which works like a salary and commission.
The monetary compensation of the agent is sx + y. In addition, the agent has a cost of effort, which we take to be $\frac{x^2}{2a}$. Here, a represents the ability of the agent: more able agents, who have a higher value of a, have a lower cost of effort. Finally, there is a cost of risk. The actual risk imposed on the agent is proportional to the degree he shares in the proceeds. If s is small, the agent faces almost no monetary risk, but if s is high, most of the risk is imposed on the agent. We use the linear cost of risk model, developed earlier, to impose a cost of risk, which is sλδ². Here, δ² is the variance of the monetary risk, λ defines the agent’s attitude toward, or cost of, risk, and s is the share of the risk imposed on the agent. This results in a payoff to the agent of

$$u = sx + y - \frac{x^2}{2a} - s\lambda\delta^2.$$
The part of the equation represented by sx + y is the payments made to the agent. The next term is the cost of generating that level of x. The final term is the cost of risk imposed on the agent by the contract.
The agency game works as follows. First, the principal offers a contract, which involves a commission s and a salary y. The agent can either accept or reject the contract and accepts if he obtains at least u0 units of utility, the value of his next best offer. Then the agent decides how much effort to expend; that is, the agent chooses x.
As with all subgame perfect equilibria, we work backward to first figure out what x an agent would choose. Because our assumptions make u quadratic in x, this is a straightforward exercise, and we conclude x = sa. This can be embedded into u, and the agent’s optimized utility u* is

$$u^* = s^2 a + y - \frac{(sa)^2}{2a} - s\lambda\delta^2 = y + \tfrac{1}{2}s^2 a - s\lambda\delta^2.$$
The agent won’t accept employment unless u* ≥ u0, the reservation utility. The principal can minimize the cost of employing the agent by setting the salary such that u* = u0, which results in

$$y = u_0 - \tfrac{1}{2}s^2 a + s\lambda\delta^2.$$
Observe that the salary must be higher the greater the risk δ². That is, the principal has to cover the cost of risk in the salary term.
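The principal’s profit is the value of sales left over after paying the agent’s commission, minus the salary:

$$\pi = (1-s)x - y = (1-s)sa - \left(u_0 - \tfrac{1}{2}s^2 a + s\lambda\delta^2\right) = sa - u_0 - \tfrac{1}{2}s^2 a - s\lambda\delta^2.$$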
Note that the principal gets the entire output x = sa minus all the costs: the reservation utility of the agent u0, the cost of providing effort, and the risk cost on the agent. That is, the principal obtains the full gains from trade, the value of production minus the total cost of production. However, the fact that the principal obtains the full gains from trade doesn’t mean the principal induces the agent to work extremely hard, because there is no mechanism for the principal to induce the agent to work hard without imposing more risk on the agent, and this risk is costly to the principal. Agents are induced to work hard by tying their pay to their performance, and such a link necessarily imposes risk on the agent, and risk is costly. (There is a technical requirement that the principal’s return π must be positive; otherwise, the principal would rather not contract at all. This amounts to an assumption that u0 is not too large. Moreover, if s comes out less than zero, the model falls apart, and in this case, the actual solution is s = 0.)
We take the principal to be risk neutral. This is reasonable when the principal is economically large relative to the agent, so that the risks faced by the agent are small compared to those faced by the principal. For example, the risks associated with any one car are small to a car rental company. The principal who maximizes expected profits chooses s to maximize π, which yields

$$s = 1 - \frac{\lambda\delta^2}{a}.$$
This formula is interesting for several reasons. First, if the agent is neutral to risk, which means λ = 0, then s is 1. That is, the agent gets 100% of the marginal return to effort, and the principal just collects a lump sum. This is reminiscent of some tenancy contracts used by landlords and peasants; the peasant paid a lump sum for the right to farm the land and then kept all of the crops grown. Because these peasants were unlikely to be risk neutral, while the landlord was relatively neutral to risk, such a contract was unlikely to be optimal. The contract with s = 1 is known as selling the agency because the principal sells the agency to the agent for a lump sum payment. (Here, y will generally be negative—the principal gets a payment rather than paying a salary.) The more common contract, however, had the landowner and the tenant farmer share the proceeds of farming, which gives rise to the name sharecropper.
Second, more risk or more risk aversion on the part of the agent decreases the share of the proceeds accruing to the agent. Thus, when the cost of risk or the amount of risk is high, the best contract imposes less risk on the agent. Total output sa falls as the costs of risk rise.
Third, more able agents (higher a) get higher commissions. That is, the principal imposes more risk on the more able agent because the returns to imposition of risk—in the form of higher output—are greater and thus worth the cost in terms of added risk.
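The three observations above can be checked numerically. The following is a minimal sketch, assuming the payoff u = sx + y − x²/(2a) − sλδ² and the closed form s = 1 − λδ²/a; the function names and the parameter values are hypothetical and chosen only for illustration, and `var` stands for the variance δ².

```python
def agent_utility(x, s, y, a, lam, var):
    """Agent payoff: commission plus salary, minus effort cost and risk cost."""
    return s * x + y - x**2 / (2 * a) - s * lam * var

def best_effort(s, a):
    """Agent's choice of x from the first-order condition s - x/a = 0."""
    return s * a

def participation_salary(s, a, lam, var, u0):
    """Salary that leaves the agent with exactly the reservation utility u0."""
    return u0 - s**2 * a / 2 + s * lam * var

def principal_profit(s, a, lam, var, u0):
    """Principal keeps the share (1 - s) of sales and pays the salary."""
    x = best_effort(s, a)
    y = participation_salary(s, a, lam, var, u0)
    return (1 - s) * x - y

def optimal_share(a, lam, var):
    """Closed-form commission, truncated at zero as the text notes."""
    return max(0.0, 1 - lam * var / a)

# Hypothetical parameter values (assumptions, not from the text).
a, lam, var, u0 = 4.0, 0.5, 2.0, 1.0
s_star = optimal_share(a, lam, var)

# Brute-force check: search a grid of commissions for the profit maximizer.
grid = [i / 1000 for i in range(1001)]
s_grid = max(grid, key=lambda s: principal_profit(s, a, lam, var, u0))
```

With these values the grid search and the closed form should agree, and raising `a` or lowering `lam` in `optimal_share` should raise the commission, in line with the second and third observations.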
Most real estate agencies operate on a mix of salary and commission, with commissions paid to agents averaging about 50%. The agency RE/MAX, however, pays commissions close to 100%, collecting a fixed monthly fee that covers agency expenses from the agents. RE/MAX claims that their formula is appropriate for better agents. The theory developed suggests that more able agents should obtain higher commissions. But in addition, RE/MAX’s formula also tends to attract more able agents because able agents earn a higher wage under the high commission formula. (There is a potential downside to the RE/MAX formula: it discourages agency-wide cooperation.)
Consider what contracts attract what kinds of agents. For a fixed salary y and commission s, the agent’s utility, optimizing over x, is

$$u^* = y + \tfrac{1}{2}s^2 a - s\lambda\delta^2.$$
The agent’s utility is increasing in a and decreasing in λ. Thus, more able agents get higher utility, and less risk-averse agents get higher utility.
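How do the terms of the contract affect the pool of applicants? Let us suppose that two contracts are offered, one with a salary y1 and commission s1, the other with salary y2 and commission s2. We suppose y2 < y1 and s2 > s1. What kind of agent prefers Contract 2, the high-commission, low-salary contract, over Contract 1? An agent prefers Contract 2 when

$$y_2 + \tfrac{1}{2}s_2^2 a - s_2\lambda\delta^2 \ge y_1 + \tfrac{1}{2}s_1^2 a - s_1\lambda\delta^2,$$

or the equivalent:

$$\tfrac{1}{2}a\left(s_2^2 - s_1^2\right) - (s_2 - s_1)\lambda\delta^2 \ge y_1 - y_2.$$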
Thus, agents with high ability a or low level of risk aversion λ prefer the high-commission, low-salary contract. A company that puts more of the compensation in the form of commission tends to attract more able agents and agents less averse to risk. The former is a desirable feature of the incentive scheme because more able agents produce more. The latter, the attraction of less risk-averse agents, may or may not be desirable but is probably neutral overall.
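The selection effect can be illustrated with a short sketch. It assumes the optimized utility u* = y + s²a/2 − sλδ² and compares two hypothetical contracts; the contract terms and agent parameters below are invented for illustration, and `var` stands for δ².

```python
def optimized_utility(y, s, a, lam, var):
    """Agent's utility after choosing effort optimally (x = s*a)."""
    return y + s**2 * a / 2 - s * lam * var

def prefers_contract_2(a, lam, var, contract_1, contract_2):
    """True when the agent weakly prefers contract_2; each contract is (y, s)."""
    y1, s1 = contract_1
    y2, s2 = contract_2
    return optimized_utility(y2, s2, a, lam, var) >= optimized_utility(y1, s1, a, lam, var)

# Hypothetical contracts: Contract 2 has the lower salary and higher commission.
contract_1 = (2.0, 0.2)
contract_2 = (1.0, 0.8)
var = 1.0

baseline = prefers_contract_2(4.0, 0.5, var, contract_1, contract_2)
more_able = prefers_contract_2(6.0, 0.5, var, contract_1, contract_2)
less_risk_averse = prefers_contract_2(4.0, 0.1, var, contract_1, contract_2)
```

In this example the baseline agent stays with the high-salary contract, while either higher ability or lower risk aversion is enough to tip the choice toward the high-commission contract.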
One important consideration is that agents who overestimate their ability will react the same as people who have higher ability. Thus, the contract equally attracts those with high ability and those who overestimate their ability.
Agency theory provides a characterization of the cost of providing incentives. The source of the cost is the link between incentives and risk. Incentives link pay and performance; when performance is subject to random fluctuations, linking pay and performance also links pay and the random fluctuations. Thus, the provision of incentives necessarily imposes risk on the agent, and if the agent is risk averse, this is costly.
In addition, the extent to which pay is linked to performance tends to affect the type of agent who is willing to work for the principal. Thus, a principal must not only consider the incentive to work hard created by the commission and salary structure but also the type of agent who would choose to accept such a contract.
Multitasking refers to performing several activities simultaneously. All of us multitask. We study while drinking a caffeinated beverage; we think about things in the shower; we talk all too much on cell phones and eat french fries while driving. In the context of employees, an individual employee is assigned a variety of tasks and responsibilities, and the employee must divide her time and efforts among the tasks. Incentives provided to the employee must direct not only the total efforts of the employee, but also the allocation of time and effort across activities. An important aspect of multitasking is the interaction of incentives provided to an employee, and the effects of changes in one incentive on the behavior of the employee over many different dimensions. In this section, we will establish conditions under which the problem of an employer disaggregates; that is, the incentives for performing each individual task can be set independently of the incentives applied to the others.
This section is relatively challenging and involves a number of pieces. To simplify the presentation, some of the analyses are set aside as claims.
To begin the analysis, we consider a person who has n tasks or jobs. For convenience, we will index these activities with the natural numbers 1, 2, …, n. The level of activity, which may also be thought of as an action, in task i will be denoted by xi. It will prove convenient to denote the vector of actions by $x = (x_1, \ldots, x_n)$. We suppose the agent bears a cost c(x) of undertaking the vector of actions x. We make four assumptions on c: c is increasing in each xi, c has continuous derivatives, c is strictly convex, and c is homogeneous of degree r.
For example, if there are two tasks (n = 2), then all four of these assumptions are met by the cost function $c(x_1, x_2) = x_1^2 + x_2^2 + \tfrac{1}{2}x_1 x_2$. This function is increasing in x1 and x2, has continuous derivatives, is strictly convex (more about this below), and is homogeneous of degree 2.
It is assumed that c is increasing to identify the activities as costly. Continuity of derivatives is used for convenience. Convexity of c will ensure that a solution to the first-order conditions is actually an optimum for the employee. Formally, a function c is a convex function if, for any vectors x ≠ y and scalar α between zero and one (0 ≤ α ≤ 1),

$$\alpha c(x) + (1-\alpha)c(y) \ge c(\alpha x + (1-\alpha)y).$$
In other words, when x is a scalar, a convex function is any function that lies below the straight line segment connecting two points on the function, for any two points in the interval.
One way of interpreting this requirement is that it is less costly to do the average of two things than the average of the costs of the things. Intuitively, convexity requires that doing a medium thing is less costly than the average of two extremes. This is plausible when extremes tend to be very costly. It also means the set of vectors that cost less than a fixed amount, {x | c(x) ≤ b}, is a convex set. Thus, if two points cost less than a given budget, the line segment connecting them does, too. Convexity of the cost function ensures that the agent’s optimization problem is concave and thus that the first-order conditions describe a maximum. When the inequality is strict for α satisfying 0 < α < 1, we refer to convexity as strict convexity.
The assumption of homogeneity dictates that scale works in a particularly simple manner. Scaling up activities increases costs at a fixed rate r. Homogeneity has very strong implications that are probably unreasonable in many settings. Nevertheless, homogeneity leads to an elegant and useful theory, as we shall see. Recall the definition of a homogeneous function: c is homogeneous of degree r means that for any λ > 0,

$$c(\lambda x) = \lambda^r c(x).$$
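As a quick illustration of the convexity and homogeneity definitions, the sketch below spot-checks both properties at sample points. The two-task cost function x1² + x2² + x1x2/2 is used as an assumed example, and the helper names are hypothetical. A spot check at a few points is of course not a proof.

```python
def cost(x1, x2):
    """Assumed two-task cost function, homogeneous of degree 2."""
    return x1**2 + x2**2 + 0.5 * x1 * x2

def is_homogeneous_at(c, x, scale, r, tol=1e-9):
    """Check c(scale * x) == scale**r * c(x) at a single point x."""
    scaled = tuple(scale * xi for xi in x)
    return abs(c(*scaled) - scale**r * c(*x)) < tol

def satisfies_convexity_at(c, x, y, alpha, tol=1e-9):
    """Check alpha*c(x) + (1 - alpha)*c(y) >= c(alpha*x + (1 - alpha)*y)."""
    mix = tuple(alpha * xi + (1 - alpha) * yi for xi, yi in zip(x, y))
    return alpha * c(*x) + (1 - alpha) * c(*y) >= c(*mix) - tol

homogeneous_ok = is_homogeneous_at(cost, (1.0, 2.0), 3.0, 2)
convex_ok = satisfies_convexity_at(cost, (1.0, 2.0), (3.0, 0.0), 0.5)
```

Here tripling the activity vector (1, 2) multiplies the cost by 3² = 9, and the cost of the midpoint of (1, 2) and (3, 0) is below the average of the two costs.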
Claim: Strict convexity implies that r > 1.
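Proof of Claim: Fix any x and consider the two points x and λx, with λ ≠ 1. By strict convexity and homogeneity, for 0 < α < 1,

$$\left(\alpha + (1-\alpha)\lambda^r\right)c(x) = \alpha c(x) + (1-\alpha)c(\lambda x) > c(\alpha x + (1-\alpha)\lambda x) = \left(\alpha + (1-\alpha)\lambda\right)^r c(x),$$

which implies

$$\alpha + (1-\alpha)\lambda^r > \left(\alpha + (1-\alpha)\lambda\right)^r.$$

Define a function k that is the left-hand side minus the right-hand side:

$$k(\alpha) = \alpha + (1-\alpha)\lambda^r - \left(\alpha + (1-\alpha)\lambda\right)^r.$$

Note that k(0) = k(1) = 0. Moreover, $k''(\alpha) = -r(r-1)\left(\alpha + (1-\alpha)\lambda\right)^{r-2}(1-\lambda)^2$. It is readily checked that a twice-differentiable function of one variable is convex when its second derivative is nonnegative. If r ≤ 1 (recall that r > 0 because c is increasing), then $k''(\alpha) \ge 0$, implying that k is convex, and hence, if 0 < α < 1,

$$k(\alpha) = k\left((1-\alpha)\cdot 0 + \alpha\cdot 1\right) \le (1-\alpha)k(0) + \alpha k(1) = 0.$$

Similarly, if r > 1, k is concave and k(α) > 0. This completes the proof, showing that r ≤ 1 is not compatible with the strict convexity of c.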
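The sign pattern used in the proof can be checked numerically. The sketch below assumes k(α) = α + (1 − α)λ^r − (α + (1 − α)λ)^r, which is the gap between the two sides of the strict convexity inequality applied to x and λx; the sample values of λ and r are arbitrary.

```python
def k(alpha, lam, r):
    """Gap in the convexity inequality for a cost homogeneous of degree r."""
    return alpha + (1 - alpha) * lam**r - (alpha + (1 - alpha) * lam) ** r

alphas = [i / 10 for i in range(1, 10)]  # interior points 0.1, ..., 0.9

# r > 1: the gap is strictly positive, consistent with strict convexity.
positive_when_r_is_2 = all(k(al, 3.0, 2.0) > 0 for al in alphas)

# r < 1: the gap is negative, so strict convexity would fail.
negative_when_r_is_half = all(k(al, 3.0, 0.5) < 0 for al in alphas)

# r = 1: the gap vanishes, so convexity cannot be strict.
zero_when_r_is_1 = all(abs(k(al, 3.0, 1.0)) < 1e-12 for al in alphas)
```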
How should our person behave? Consider linear incentives, which are also known as piece rates. With piece rates, the employee gets a payment pi for each unit of xi produced. The person then chooses x to maximize

$$u = \sum_{i=1}^{n} p_i x_i - c(x) = p \bullet x - c(x).$$
Here • is the dot product, which is the sum of the products of the components.
The agent chooses x to maximize u, resulting in n first-order conditions

$$0 = \frac{\partial u}{\partial x_i} = p_i - c_i(x), \quad i = 1, \ldots, n,$$

where ci is the partial derivative of c with respect to the ith argument xi. This first-order condition can be expressed more compactly as

$$0 = p - c'(x),$$

where c′(x) is the vector of partial derivatives of c. Convexity of c ensures that any solution to this problem is a global utility maximum because the function u is concave, and strict convexity ensures that there is at most one solution to the first-order conditions. (This description is slightly inadequate because we haven’t considered boundary conditions. Often a requirement like xi ≥ 0 is also needed. In this case, the first-order conditions may not hold with equality for those choices where xi = 0 is optimal.)
One very useful implication of homogeneity is that incentives scale. Homogeneity has the effect of turning a very complicated optimization problem into a problem that is readily solved, thanks to this very scaling.
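Claim: If all incentives rise by a scalar factor α, then x rises by the factor $\alpha^{1/(r-1)}$.

Proof of Claim: Note that differentiating $c(\lambda x) = \lambda^r c(x)$ with respect to xi yields $\lambda c_i(\lambda x) = \lambda^r c_i(x)$, and thus $c_i(\lambda x) = \lambda^{r-1} c_i(x)$. That is, if c is homogeneous of degree r, c′ is homogeneous of degree r – 1. Consequently, if $0 = p - c'(x)$, then $0 = \alpha p - c'\left(\alpha^{1/(r-1)} x\right)$. Thus, if the incentives are scaled up by α, the efforts rise by the scalar factor $\alpha^{1/(r-1)}$.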
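The scaling claim can be illustrated with a cost function whose first-order conditions have a closed-form solution. The sketch below assumes the cost c(x) = (x • x)^(r/2), which is homogeneous of degree r; this functional form, the function names, and the numbers are illustrative assumptions rather than anything specified in the text.

```python
def marginal_cost(x, r):
    """Gradient of c(x) = (x . x)**(r/2): r * (x . x)**(r/2 - 1) * x."""
    S = sum(xi * xi for xi in x)
    return [r * S ** (r / 2 - 1) * xi for xi in x]

def best_response(p, r):
    """Effort vector solving p = c'(x) for the assumed cost, with r > 1."""
    pp = sum(pi * pi for pi in p)
    S = (pp / r**2) ** (1 / (r - 1))   # S = x . x implied by p . p = r^2 S^(r-1)
    scale = r * S ** (r / 2 - 1)       # p = scale * x
    return [pi / scale for pi in p]

r = 3
p = [3.0, 4.0]
alpha = 4.0

x = best_response(p, r)
x_scaled = best_response([alpha * pi for pi in p], r)

# The claim predicts that effort scales by alpha**(1/(r-1)), here 4**0.5 = 2.
predicted = [alpha ** (1 / (r - 1)) * xi for xi in x]
```

With r = 3, quadrupling every piece rate only doubles each effort level, which reflects how quickly costs rise with scale.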
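Now consider an employer with an agent engaging in n activities. The employer values the ith activity at vi and thus wishes to maximize

$$\pi = \sum_{i=1}^{n} (v_i - p_i)x_i = \sum_{i=1}^{n} \left(v_i - c_i(x)\right)x_i.$$

This equation embodies a standard trick in agency theory. Think of the principal (employer) not as choosing the incentives p, but instead as choosing the effort levels x, with the incentives as a constraint. That is, the principal can be thought of as choosing x and then choosing the p that implements this x. The principal’s expected profit is readily differentiated with respect to each xj, yielding

$$0 = \frac{\partial \pi}{\partial x_j} = v_j - c_j(x) - \sum_{i=1}^{n} c_{ij}(x)\,x_i,$$

where cij is the second partial derivative of c with respect to xi and xj. However, because cj(x) is homogeneous of degree r – 1,

$$\sum_{i=1}^{n} c_{ij}(x)\,x_i = \left.\frac{d}{d\lambda}c_j(\lambda x)\right|_{\lambda=1} = \left.\frac{d}{d\lambda}\lambda^{r-1}c_j(x)\right|_{\lambda=1} = (r-1)c_j(x),$$

and thus

$$0 = v_j - c_j(x) - (r-1)c_j(x) = v_j - r\,c_j(x).$$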
This expression proves the main result of this section. Under the maintained hypotheses (convexity and homogeneity), an employer of a multitasking agent uses incentives that are a constant proportion of value; that is, $p = v/r$, where r is the degree of homogeneity of the agent’s costs. Recalling that r > 1, the principal uses a sharing rule, sharing a fixed proportion of value with the agent.
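The rule p = v/r can be checked by brute force. The sketch below again assumes the illustrative cost c(x) = (x • x)^(r/2), lets the agent respond optimally to each candidate vector of piece rates, and searches a grid for the rates that maximize the principal’s profit (v − p) • x. The cost function, values, and grid are assumptions made for the example.

```python
def best_response(p, r):
    """Agent's effort solving p = c'(x) for c(x) = (x . x)**(r/2), r > 1."""
    pp = sum(pi * pi for pi in p)
    S = (pp / r**2) ** (1 / (r - 1))
    scale = r * S ** (r / 2 - 1)
    return [pi / scale for pi in p]

def principal_profit(p, v, r):
    """Profit (v - p) . x when the agent best-responds to piece rates p."""
    x = best_response(p, r)
    return sum((vi - pi) * xi for vi, pi, xi in zip(v, p, x))

r = 3
v = [3.0, 6.0]               # hypothetical values of the two activities
p_rule = [vi / r for vi in v]  # the sharing rule p = v / r

# Grid of strictly positive piece rates in steps of 0.05, up to 6.0.
grid = [(i / 20, j / 20) for i in range(1, 121) for j in range(1, 121)]
p_best = max(grid, key=lambda p: principal_profit(p, v, r))
```

The grid maximizer should coincide with `p_rule`, so that each activity is paid one third of its value when costs are homogeneous of degree 3.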
When agents have a homogeneous cost function, the principal has a very simple optimal incentive scheme, requiring quite limited knowledge of the agent’s cost function (just the degree of homogeneity). Moreover, the incentive scheme works through a somewhat surprising mechanism. Note that if the value of one activity, for example, Activity 1, rises, p1 rises and all the other payment rates stay constant. The agent responds by increasing x1, but the other activities may rise or fall depending on how complementary they are to Activity 1. Overall, the agent’s substitution across activities given the new incentive level on Activity 1 implements the desired effort levels on other activities. The remarkable implication of homogeneity is that, although the principal desires different effort levels for all activities, only the incentive on Activity 1 must change.
In the previous section we saw, for example, that if the agent has quadratic costs, the principal pays the agent half the value of each activity. Moreover, the more rapidly costs rise in scale, the lower are the payments to the agent.
This remarkable theorem has several limitations. The requirement of homogeneity is itself an important limitation, although this assumption is reasonable in some settings. More serious is the assumption that all of the incentives are set optimally for the employer. Suppose, instead, that one of the incentives is set too high, at least from the employer’s perspective. This might arise if, for example, the agent acquired all the benefits of one of the activities. An increase in the power of one incentive will then tend to spill over to the other actions, increasing for complements and decreasing for substitutes. When the efforts are substitutes, an increase in the power of one incentive causes others to optimally rise, to compensate for the reduced supply of efforts of that type. (Multitasking, and agency theory more generally, is a rich theory with many implications not discussed here. For a challenging and important analysis, see Bengt Holmstrom and Paul Milgrom, “The Firm as an Incentive System,” American Economic Review 84, no. 4 (September 1994): 972–991.)
We can illustrate the effects of cost functions that aren’t homogeneous in a relatively straightforward way. Suppose the cost depends on the sum of the squared activity levels:

$$c(x) = g\left(\sum_{i=1}^{n} x_i^2\right) = g(x \bullet x).$$
This is a situation where vector notation (dot-products) dramatically simplifies the expressions. You may find it useful to work through the notation on a separate sheet, or in the margin, using summation notation to verify each step. At the moment, we won’t be concerned with the exact specification of g, but instead we will use the first-order conditions to characterize the solution.
The agent maximizes

$$u = p \bullet x - g(x \bullet x).$$

This gives a first-order condition

$$0 = p - 2g'(x \bullet x)\,x.$$

It turns out that a sufficient condition for this equation to characterize the agent’s utility maximization is that g is both increasing and convex (that is, g′ is positive and increasing).
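This is a particularly simple expression because the vector of efforts, x, points in the same direction as the incentive payments p. The scalar that gives the overall effort levels, however, is not necessarily a constant, as occurs with homogeneous cost functions. Indeed, we can readily see that x • x is the solution to

$$p \bullet p = \left(2g'(x \bullet x)\right)^2 (x \bullet x).$$

Because x • x is a number, it is worth introducing notation for it: S = x • x. Then S is the solution to

$$p \bullet p = 4\,g'(S)^2 S.$$

The principal or employer chooses p to maximize

$$\pi = v \bullet x - p \bullet x = v \bullet x - 2g'(x \bullet x)(x \bullet x).$$

This gives the first-order condition

$$0 = v - 4\left(g'(x \bullet x) + (x \bullet x)\,g''(x \bullet x)\right)x.$$

Thus, the principal’s choice of p is such that x is proportional to v, with constant of proportionality $\frac{1}{4\left(g'(S) + S g''(S)\right)}$. Using the same trick (dotting each side of the first-order condition with itself), we obtain

$$v \bullet v = 16\left(g'(S^*) + S^* g''(S^*)\right)^2 S^*,$$

which gives the level of x • x = S* induced by the principal. Given S*, p is given by

$$p = 2g'(S^*)\,x = \frac{2g'(S^*)}{4\left(g'(S^*) + S^* g''(S^*)\right)}\,v = \frac{1}{2}\,\frac{1}{1 + \dfrac{S^* g''(S^*)}{g'(S^*)}}\,v.$$

Note that this expression gives the right answer when costs are homogeneous. In this case, g(S) must be in the form $S^{r/2}$, so that $S g''(S)/g'(S) = r/2 - 1$, and the formula gives $p = \frac{1}{2}\,\frac{1}{1 + r/2 - 1}\,v = \frac{v}{r}$, as we already established.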
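The steps above can be sketched numerically. The code assumes the condition v • v = 16(g′(S) + S g″(S))² S for the induced effort level S*, solves it by bisection, and then applies p = ½ v / (1 + S* g″(S*)/g′(S*)). The function names, the example g(S) = S + S², and the values of v are hypothetical choices for illustration; the bisection also assumes the left-hand side is increasing in S.

```python
def solve_increasing(f, target, lo=0.0, hi=1.0):
    """Bisection for f(S) = target, assuming f is continuous and increasing."""
    while f(hi) < target:
        hi *= 2
    for _ in range(200):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def principal_incentives(v, g1, g2):
    """Return (S_star, p), where g1 and g2 are the functions g' and g''."""
    vv = sum(vi * vi for vi in v)
    S = solve_increasing(lambda s: 16 * (g1(s) + s * g2(s)) ** 2 * s, vv)
    factor = 0.5 / (1 + S * g2(S) / g1(S))
    return S, [factor * vi for vi in v]

v = [3.0, 4.0]

# Non-homogeneous example (an assumption): g(S) = S + S**2.
S_a, p_a = principal_incentives(v, lambda s: 1 + 2 * s, lambda s: 2.0)

# Homogeneous example: g(S) = S**(r/2) with r = 3, where p should equal v / 3.
r = 3
S_h, p_h = principal_incentives(
    v,
    lambda s: (r / 2) * s ** (r / 2 - 1),
    lambda s: (r / 2) * (r / 2 - 1) * s ** (r / 2 - 2),
)
```

In the homogeneous case the computed piece rates should match v/r, and in the non-homogeneous case the resulting p should satisfy the agent’s condition p • p = 4g′(S*)² S*, so the agent indeed chooses total effort S*.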
The natural assumption to impose on the function g is that $\left(g'(S) + S g''(S)\right)^2 S$ is an increasing function of S. This assumption implies that as the value of effort rises, the total effort also rises.

Suppose $\frac{S g''(S)}{g'(S)}$ is increasing in S. Then an increase in vi increases S, decreasing pj for j ≠ i. That is, when one item becomes more valuable, the incentives for performing the others are reduced. Moreover, because $p \bullet p = 4\,g'(S)^2 S$, an increase in S only occurs if p • p increases.

These equations together imply that an increase in any one vi increases the total effort (as measured by S* = x • x), increases the total incentives as measured by p • p, and decreases the incentives for performing all activities other than activity i. In contrast, if $\frac{S g''(S)}{g'(S)}$ is a decreasing function of S, then an increase in any one vi causes all the incentives to rise. Intuitively, the increase in vi directly causes pi to rise because xi is more valuable. This causes the agent to substitute toward activity i. This causes the relative cost of total activity to fall (because $\frac{S g''(S)}{g'(S)}$ decreases), which induces a desire to increase the other activity levels. This is accomplished by an increase in the incentives for performing the other activities.
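Both cases can be illustrated with a small numerical sketch. It assumes the relations v • v = 16(g′(S) + S g″(S))² S and p = ½ v / (1 + S g″(S)/g′(S)), and uses two hypothetical specifications: g′(S) = 1 + 2S, for which S g″/g′ is increasing, and g′(S) = S/(1 + S), for which S g″/g′ = 1/(1 + S) is decreasing. The function names and the values of v are assumptions made for the example.

```python
def solve_increasing(f, target, lo=0.0, hi=1.0):
    """Bisection for f(S) = target, assuming f is continuous and increasing."""
    while f(hi) < target:
        hi *= 2
    for _ in range(200):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def principal_incentives(v, g1, g2):
    """Return (S_star, p), where g1 and g2 are the functions g' and g''."""
    vv = sum(vi * vi for vi in v)
    S = solve_increasing(lambda s: 16 * (g1(s) + s * g2(s)) ** 2 * s, vv)
    factor = 0.5 / (1 + S * g2(S) / g1(S))
    return S, [factor * vi for vi in v]

v_before = [3.0, 4.0]
v_after = [5.0, 4.0]   # only the value of activity 1 rises

# Case 1: S*g''/g' increasing (g' = 1 + 2S, g'' = 2).
inc = (lambda s: 1 + 2 * s, lambda s: 2.0)
S_inc_0, p_inc_0 = principal_incentives(v_before, *inc)
S_inc_1, p_inc_1 = principal_incentives(v_after, *inc)

# Case 2: S*g''/g' decreasing (g' = S/(1+S), g'' = 1/(1+S)**2).
dec = (lambda s: s / (1 + s), lambda s: 1 / (1 + s) ** 2)
S_dec_0, p_dec_0 = principal_incentives(v_before, *dec)
S_dec_1, p_dec_1 = principal_incentives(v_after, *dec)
```

In the first case total effort and p • p rise while the piece rate on activity 2 falls; in the second case the piece rate on activity 2 rises along with the one on activity 1.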
This conclusion generalizes readily and powerfully. Suppose that c(x) = g(h(x)), where h is homogeneous of degree r and g is increasing. In the case just considered, h(x) = x • x. Then the same conclusion, that the sign of $\frac{dp_i}{dv_j}$ is determined by the derivative of $\frac{S g''(S)}{g'(S)}$, holds. In the generalization, S now stands for h(x).