INSTITUTE OF PHYSICS PUBLISHING REPORTS ON PROGRESS IN PHYSICS

Rep. Prog. Phys. 64 (2001) 429–481 www.iop.org/Journals/rp PII: S0034-4885(01)83723-4

Nonlinear theory of diffusive acceleration of particles by shock waves

M A Malkov¹ and L O’C Drury²
¹ University of California at San Diego, 9500 Gilman Dr, La Jolla, CA 92093-0319, USA
² Dublin Institute for Advanced Studies, 5 Merrion Square, Dublin 2, Republic of Ireland

E-mail: mmalkov@ucsd.edu

Received 13 August 1999, in final form 29 November 2000

Abstract

Among the various acceleration mechanisms which have been suggested as responsible for
the nonthermal particle spectra and associated radiation observed in many astrophysical and
space physics environments, diffusive shock acceleration appears to be the most successful. We
review the current theoretical understanding of this process, from the basic ideas of how a shock
energizes a few reactionless particles to the advanced nonlinear approaches treating the shock
and accelerated particles as a symbiotic self-organizing system. By means of direct solution
of the nonlinear problem we set the limit to the test-particle approximation and demonstrate
the fundamental role of nonlinearity in shocks of astrophysical size and lifetime. We study
the bifurcation of this system, proceeding from the hydrodynamic to kinetic description under
a realistic condition of Bohm diffusivity. We emphasize the importance of collective plasma
phenomena for the global flow structure and acceleration efficiency by considering the injection
process, an initial stage of acceleration and, the related aspects of the physics of collisionless
shocks. We calculate the injection rate for different shock parameters and different species.
This, together with differential acceleration resulting from nonlinear large-scale modification,
determines the chemical composition of accelerated particles. The review concentrates on
theoretical and analytical aspects but our strategic goal is to link the fundamental theoretical
ideas with the rapidly growing wealth of observational data.
(Some figures in this article are in colour only in the electronic version; see www.iop.org)

0034-4885/01/040429+53$90.00 © 2001 IOP Publishing Ltd Printed in the UK 429



Contents

1. Introduction
2. Transport of energetic particles in nonuniform plasma flows
3. Test particle solution
3.1. ‘Box’ approximation
4. Related topics of the physics of collisionless shocks
4.1. Quasi-perpendicular shocks
4.2. Quasi-parallel shocks
5. Injection
5.1. Electron injection
6. Nonlinear fluid theories
6.1. Two-fluid model
6.2. Wave excitation
6.3. Limitations of the TFM: attempts at inclusion of the injection and losses
7. Nonlinear kinetic theories
7.1. An exact solution for momentum independent CR diffusivity
7.2. An exact solution for arbitrary κ(p)
7.3. Adjusting the flow
7.4. Asymptotic universality of acceleration in strong shocks
7.5. The method of integral equation
7.6. Quasi-phenomenological and numerical studies of strongly nonlinear acceleration
7.7. The role of turbulent heating
7.8. Simplified time-dependent treatment
7.9. Shock acceleration as self-organized critical phenomenon
8. Observations
9. Summary and outlook
Acknowledgments
References

1. Introduction

There exist many diverse astrophysical objects and systems where strong shocks are observed,
directly or indirectly, to be accompanied by swarms of energetic charged particles. Examples
range in scale from the bow-shock formed when the solar wind impinges on the Earth’s
magnetosphere to the hot spots seen in the lobes of giant radio galaxies. Generically the
energetic charged particles exhibit power-law energy spectra with rather similar exponents.
One particular mechanism, a form of first order Fermi acceleration, usually called diffusive
shock acceleration (DSA) has been extremely successful in explaining why this should be
so. The mechanism is basically simple and persuasive—particles gain energy by bouncing
between converging upstream and downstream regions of the flow. There are, however, three
conditions for it to work efficiently: (i) at least a few thermal particles downstream must be
able to return upstream; (ii) the accelerated particles must not propagate freely to keep crossing
the shock; and (iii) if many particles are accelerated to high energies, their pressure must not
smear out the shock completely. These three conditions constitute, in fact, three current issues
of the theory known as the particle injection, particle confinement, and shock robustness.
When the basic ideas of this mechanism were published more than 20 years ago by Krymsky [1],
Axford et al [2], Bell [3, 4], Blandford and Ostriker [5], the above three conditions were
usually implicitly assumed although there were important exceptions. The first condition
in essence states that there is no need for a distinct injection process (as in other Fermi
acceleration mechanisms); the accelerated particle distribution is extracted naturally from the
shock-heated thermal particle distribution. The second condition requires that the particles be
magnetically coupled to the bulk plasma (on both sides of the shock) and that this interaction
produces sufficient scattering to keep the angular distributions close to isotropy. This is usually
assumed to be fulfilled through the generation of magneto-hydrodynamic (MHD) waves by
the accelerated particles themselves making the acceleration a distinctively nonlinear bootstrap
process. The waves provide the necessary pitch-angle scattering for energetic particles so that
they propagate diffusively rather than freely and can recross the shock gaining energy. The
importance of the third condition above was emphasized in the paper by Axford et al [2],
where the pressure of accelerated particles or cosmic rays (CRs) was included in hydrodynamic
equations for the shock structure.
To some extent the difficulties arising from our limited understanding of these three
processes can be circumvented if one is prepared to assume that: (1) the injection can be
either parametrized and treated as an ad hoc black box process or it is not needed at all if there
is a pre-existing population of energetic particles upstream; (2) the particle diffusion is due
to pre-existing background turbulence or if the turbulence is excited by accelerated particles
then again its amplitude can be parametrized; and (3) the accelerated particles leave the system
before they reach such energies that their pressure becomes significant. Many studies have
followed various combinations of these strategies in attempting to understand this complicated
nonlinear process.
In reality we cannot switch the injection on and off at will, or prescribe its intensity in
advance. The main difficulty associated with (ii) is that the standard plasma physics approach to
wave generation by resonant particles, the so-called quasi-linear theory, predicts impossibly
high wave amplitudes, necessitating a nonlinear wave description. Note that this is a technical
problem that does not question the physics of acceleration in any respect. Regarding (iii) we
cannot terminate the acceleration arbitrarily before it enters the phase of nonlinear modification
of the global shock structure. Moreover, these three issues are strongly interrelated in the
nonlinear acceleration regime, and become critical elements of the same feedback loop, making
parametrization approaches insufficient. In this synergetic picture, the central part is played by
the maximum energy of accelerated particles. Within the linear approach it is easily estimated
from the shock size or acceleration time. With the onset of the nonlinear phase such estimates
can at best give perhaps an upper bound to the particle energy since their confinement and
acceleration timescale depend on the diffusion coefficient, which in turn is related to the level
of MHD turbulence. Being powered by efficiently accelerated particles, the turbulence must be,
as we emphasized, in a strongly nonlinear regime which is also a notorious problem (see, e.g. [6]
and references therein). Therefore, in general, the maximum energy of accelerated particles and
thus the other important characteristics of acceleration such as the particle spectrum, cannot be
determined without a self-consistent description of particle interaction with the self-generated
turbulence.
On observational grounds, rather conservative estimates show [7] that if the CR
background radiation is produced in supernova remnant (SNR) shocks, these must spend
between 10 and 30% of their energy for this purpose. These rather moderate figures might
suggest a perturbative approach to the problem of CR acceleration. There are, however,
serious arguments against it. First, the acceleration efficiency is unlikely to stay at the same
level throughout the lifetime of an SNR, so the peak efficiency can be substantially higher.
Second, and more important, the transition from the linear to the nonlinear acceleration regime
will be shown to be abrupt when the maximum particle energy exceeds a few hundred GeV.
This suggests that we should seek to obtain this 10–30% efficiency by mediating a strongly
nonlinear, efficient acceleration regime rather than by stretching the linear description. In the
most natural way, the efficiency reduction can be associated with the suppression of particle
injection and impairment of their confinement through the nonlinear blow up of the shock
front. Therefore, the above issues (i)–(iii) are indeed strongly coupled and should be central
to the shock acceleration theory.
Over the past two decades there has been significant progress in application of the DSA
theory to the modelling of concrete astrophysical shocks, mostly SNRs with the intention to
explain the origin of the bulk CR spectrum as well as the radiation coming from the SNR
shells. These studies, along with the basics of the DSA theory, became the subject of numerous
research articles, e.g., [8–19] (see [19] and the remainder of this review for further references),
and are also covered in a number of comprehensive reviews [20–34]. A further difficulty of SNR
modelling is the spherical geometry and time dependence. Yet the fundamental problem
of the distribution of shock energy between thermal and nonthermal (CRs) particles is not
resolved even in the simplest case of a plane steady shock. Moreover, we will argue that
an intrinsic variability and three-dimensionality of a plane, nonlinearly accelerating shock
are more important issues for calculating the spectra and acceleration efficiency. Therefore,
the present review concentrates mostly on this basic case of a plane shock. It is intended to
complement the review written by one of us [22], which we will refer to as part I for short, by
concentrating on the nonlinear aspects of the theory.
Our introductory discussion suggests organizing the review around the issues (i)–(iii) above.
This is also how the mechanism works: particles enter acceleration, then they must be bound to
the shock front to continue and, finally, their further energization modifies the shock structure
influencing their injection and confinement. The reader should be aware, however, that not all
these phenomena are equally well understood and the theory of this mechanism is still under
development. To begin, we introduce in the next section a mathematical tool commonly used
for describing the propagation of energetic particles near a shock front.

2. Transport of energetic particles in nonuniform plasma flows

A complete analytic description of CRs in the turbulent shock environment at the Vlasov–
Maxwell level is clearly impossible. Therefore it is necessary to simplify the system. A
standard plasma physics reduction scheme of the Vlasov–Maxwell system consists of the
following two steps (e.g., [35]). First, one derives a quasi-linear system, e.g., [36] under the
assumption that the wave–particle interaction is due to the excitation of MHD waves via the
cyclotron resonance with a slightly anisotropic energetic particle distribution. This interaction
leads to the pitch angle diffusion of particles which is assumed to be the fastest process in
the quasi-linear (here gyro-phase averaged) kinetic equation. Since the hydromagnetic waves
(scattering centres) propagate essentially at the Alfvén velocity vA ≪ U in the local plasma
frame, where U(x) is the bulk plasma speed, they are seen by energetic particles as frozen
into the flow. Thus, the particle momentum distribution must be almost isotropic in the local
plasma frame which suggests further reduction of the quasi-linear equation to an equation for
a pitch-angle averaged distribution f(x, p, t). The result is known as a diffusion–convection
equation and may be written in the following form (see e.g., part I):
$$
\frac{\partial f}{\partial t} + U\frac{\partial f}{\partial x} - \frac{\partial}{\partial x}\,\kappa\,\frac{\partial f}{\partial x} = \frac{1}{3}\,\frac{\partial U}{\partial x}\,p\,\frac{\partial f}{\partial p}. \tag{2.1}
$$
Here the coordinate x is directed along the shock normal and κ(p, x) is the spatial diffusion
coefficient originating from the pitch-angle scattering (wave–particle collisions). Strictly κ
is a second-order tensor relating the diffusive flux of energetic particles to the gradient of
the density, however, in one-dimensional problems (and shocks are basically one-dimensional
structures) it can be regarded as a scalar coefficient.
It is very important for a clear understanding of shock acceleration to note that a mixed
coordinate system has been used in formulating this equation. Spatial distances (the x values)
are measured with respect to a global observation frame (often taken to be that in which
the shock is at rest) whereas particle momenta (the p values) are measured in a local frame
moving with the bulk fluid velocity U (x). This choice is essential to obtain the relatively simple
form of the equation because it is only in this local fluid frame that the particle scattering is
magnetostatic and conserves p (in other words the magnetic fields change the direction of the
particle’s motion but not its energy). More generally, in the presence of waves travelling at
different speeds and in different directions, one can use a weighted mean of the wave speeds to
define a reference frame velocity in which, on average, the particle energy is not changed by
the scattering. In this case an additional momentum-space diffusion term should be added to
the equation to allow for the residual random changes. This term describes classical second-
order Fermi acceleration and is normally unimportant in the cases we want to consider (strong
shocks with U ≫ vA).
It is important to realize that the diffusion–convection equation (2.1) can be derived without
using quasi-linear theory as was done originally by Krymski [37], Parker [38], Gleeson and
Axford [39] and Jokipii [40]. Generally, the most systematic approaches to such derivations
are based on the decomposition of the full distribution function into a set of eigenfunctions
{fn } of the underlying pitch-angle scattering operator and the subsequent derivation of an
evolution equation for the isotropic component of this expansion, f0 (x, p, t) which is denoted
as f in (2.1) for short. For certain simple scattering operators, the Legendre polynomials
{Pn } are appropriate and have been used in [39]. More general cases have been considered by
Webb [41, 42].
Since the scale of the pitch-angle scattering (mean free path (m.f.p.), λ) is eliminated
from (2.1), all other scales (at least from the formal point of view) need to be larger
than the m.f.p. Indeed, the diffusion–convection equation has been derived and used for
describing particle transport in turbulent magnetic fields primarily in smooth gas flows [43–46].
Interestingly enough, its potential for studying also discontinuous environments (e.g., shocks)
has been realized and discussed already in [39], however, for the case λ ≪ Δx, where Δx is
the typical scale of U(x) in (2.1). The opposite case of a ‘true’ discontinuity λ ≫ Δx has
been considered in [20,47] and in part I. As we will see in the sequel, nonlinearly accelerating
shocks develop structures with disparate scales in which both cases occur. Some care should
be exercised in the derivation of matching conditions for equation (2.1) at a discontinuity of
U(x) (λ ≫ Δx) even in the simplest case of a quasiparallel shock (see section 4) in which
particles cross the discontinuity freely. The reason is that the function f is not the complete
phase space density but only its isotropic part and, as we emphasized, is related to different
reference frames. Fortunately, rigorous derivation of the matching conditions for f in (2.1)
leads to the result which can be recovered directly from (2.1) by its integration across the
discontinuity of U (x) (see e.g. part I). This will be used in the next section.

3. Test particle solution

We begin with the simplest exact solution of (2.1) in which the reaction of the accelerated
particles on the flow structure is ignored; this is usually called the test particle approximation.
We look for a steady solution with no time dependence and a flow profile in the shock frame
given by

$$
U(x) = \begin{cases} -u_1, & \text{if } x > 0; \\ -u_2, & \text{if } x \le 0 \end{cases} \tag{3.1}
$$
where u1 > u2 are the constant upstream and downstream flow speeds, respectively (note that
this is the opposite sign convention to that used in part I). We also assume that there are no
accelerated particles far upstream, at x = ∞, i.e., f (x = ∞) = 0. Then, the steady-state
solution of (2.1) in the upstream medium, x > 0, is trivially found to be
f = f0 (p) exp (−u1 x/κ). (3.2)
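
As a brief check of (3.2), the following sketch assumes a steady state, a uniform upstream speed and a κ that does not depend on x. For x > 0 the right-hand side of (2.1) vanishes because ∂U/∂x = 0, so that

$$
-u_1\frac{\partial f}{\partial x} = \frac{\partial}{\partial x}\left(\kappa\frac{\partial f}{\partial x}\right)
\quad\Longrightarrow\quad
\kappa\frac{\partial f}{\partial x} + u_1 f = C(p).
$$

The boundary condition f → 0 (together with a vanishing gradient) as x → ∞ gives C(p) = 0, and the remaining first-order equation integrates to the exponential in (3.2), with f0(p) the value of f at the shock.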

Note that if κ depends not only on p but on x as well one should replace x/κ → ∫ dx/κ in the
last solution. Downstream from the shock (x ≤ 0) the only bounded solution is f = f0(p).
We have used here the continuity of f at x = 0 assuming particles cross the shock ballistically
(λ ≫ Δx). Integrating (2.1) across the shock transition then formally gives a differential
equation for the particle spectrum at the shock with solution
$$
f_0 = Q_{\rm inj}\,p^{-q}, \qquad \text{where} \quad q = \frac{3r}{r-1}, \tag{3.3}
$$
r = u1 /u2 is the shock compression and Qinj is the normalization constant characterizing
the rate at which the high-energy population is supplied with particles from the thermal pool
(injection rate).
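
The step from (2.1) to (3.3) can be spelled out as follows. This is only a sketch of the standard argument under the assumptions already made (steady state, f continuous at x = 0, velocity profile (3.1)), not a substitute for the rigorous matching discussed in part I. Integrating (2.1) over a vanishingly thin layer around x = 0, the advective term contributes nothing because U and ∂f/∂x are bounded, while ∂U/∂x integrates to the jump U(0+) − U(0−) = −(u1 − u2):

$$
-\left[\kappa\frac{\partial f}{\partial x}\right]_{0^-}^{0^+} = -\frac{u_1-u_2}{3}\,p\,\frac{d f_0}{d p}.
$$

From (3.2) the upstream diffusive term is κ ∂f/∂x = −u1 f0 at x = 0+, and downstream the gradient vanishes, so

$$
u_1 f_0 = -\frac{u_1-u_2}{3}\,p\,\frac{d f_0}{d p}
\quad\Longrightarrow\quad
\frac{d\ln f_0}{d\ln p} = -\frac{3u_1}{u_1-u_2} = -\frac{3r}{r-1}.
$$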
Essentially the above approach was followed in the original papers by Krymsky [1], Axford
et al [2] and Blandford and Ostriker [5], whereas Bell [3, 4] gave a microscopic derivation of
this result based on the kinetics of individual particles near the shock front. These approaches
are summarized and discussed in detail in part I. The formal derivation is open to the criticism
that the validity of (2.1) at a velocity discontinuity is not clear while Bell’s approach, although
extremely physical, appears to place too much emphasis on individual particles. As an alternative,
which we feel has some advantages, we give here a derivation based on particle number
conservation which also leads to a useful model for shock acceleration, usually called the ‘box’
approximation (see [48] and references therein).

3.1. ‘Box’ approximation


The key to this approach is to identify the various fluxes of particles in and out of a region of
phase space and then write down a standard conservation equation. As noted in the discussion
of the diffusion–convection equation (2.1), it uses a ‘mixed’ system of coordinates where particle
momenta are measured in a local frame such that scattering changes their direction, but not
their magnitude. For the idealized unmodified shock with velocity profile (3.1) this means that
all upstream momenta are measured in a frame moving towards the shock at speed u1 and all
downstream ones in a frame moving away from the shock at velocity u2 . By assumption all the
scattering, in either of these frames, is magnetostatic and leaves the energy or momentum of
a particle, in that frame, unchanged: the only place where a particle’s momentum can change
is when it crosses from one region to the other because, by our convention, we are then forced
to change the reference frame.
Let us consider a particle of momentum p crossing the shock at an angle ϑ to the shock
normal from the downstream to the upstream. An elementary exercise in special relativity
shows that the upstream value of the momentum, p′, is related to that downstream, p, by the
exact formula
$$
\left(\frac{p'}{p}\right)^2 = \frac{1}{1-\beta^2}\left[1 + \frac{2\beta c}{v}\cos\vartheta + \frac{\beta^2 c^2}{v^2} - \beta^2\sin^2\vartheta\right] \tag{3.4}
$$
where β = (u1 − u2)/c is the dimensionless velocity change between frames and v is the
particle velocity. We are only interested in the case where the shock is nonrelativistic and
β ≪ 1 (see footnote 1). Expanding in powers of β we get
$$
p' = p\left[1 + \frac{\beta c\cos\vartheta}{v} + O(\beta^2)\right]. \tag{3.5}
$$
The result is perhaps more transparently expressed in vectorial notation as
$$
\Delta p = \frac{\mathbf{p}\cdot(\mathbf{U}_1 - \mathbf{U}_2)}{v} + O(\beta^2) \tag{3.6}
$$
where Δp is the increment in p. In this form it can be easily seen that this applies even to
oblique shocks where U1 and U2 are not parallel to the shock normal n.
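
The quality of the first-order expansion can be checked numerically. The sketch below compares the exact ratio p′/p from (3.4) with the expansion (3.5), written with the factor c made explicit as (βc/v) cos ϑ. The function names and the sample values of β and v/c are illustrative assumptions, not taken from the text.

```python
import math


def momentum_ratio_exact(beta, cos_theta, v_over_c):
    """Exact p'/p from the frame-change formula (3.4)."""
    sin2_theta = 1.0 - cos_theta ** 2
    bracket = (1.0
               + 2.0 * beta * cos_theta / v_over_c
               + (beta / v_over_c) ** 2
               - beta ** 2 * sin2_theta)
    return math.sqrt(bracket / (1.0 - beta ** 2))


def momentum_ratio_first_order(beta, cos_theta, v_over_c):
    """First-order expansion (3.5): p'/p = 1 + (beta c / v) cos(theta)."""
    return 1.0 + beta * cos_theta / v_over_c


if __name__ == "__main__":
    beta, v_over_c = 1.0e-3, 0.5  # illustrative values only
    for cos_theta in (-1.0, -0.5, 0.0, 0.5, 1.0):
        exact = momentum_ratio_exact(beta, cos_theta, v_over_c)
        approx = momentum_ratio_first_order(beta, cos_theta, v_over_c)
        print(f"cos(theta)={cos_theta:+.1f}  exact={exact:.9f}  "
              f"first order={approx:.9f}  difference={exact - approx:.2e}")
```

The difference is of second order in βc/v, so the expansion is adequate only while the particle speed is much larger than the velocity jump.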
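We can now easily calculate the flux of particles crossing the shock from a momentum
value less than p to a value greater than p as
$$
\Phi(p) = \int\!\!\int_{p-\Delta p}^{p} dp'\, f(\mathbf{p}')\,\mathbf{v}\cdot\mathbf{n}\; p'^2\, d\Omega \tag{3.7}
$$
$$
\approx \int \Delta p\, f(\mathbf{p})\,\mathbf{v}\cdot\mathbf{n}\; p^2\, d\Omega \tag{3.8}
$$
$$
\approx \int \frac{\mathbf{p}\cdot(\mathbf{U}_1-\mathbf{U}_2)}{v}\, f(\mathbf{p})\,\mathbf{v}\cdot\mathbf{n}\; p^2\, d\Omega \tag{3.9}
$$
where the solid angle integration is over all the directions of the momentum vector p (and the
velocity vector v). Note that we assume both Δp ≪ p, which requires v ≫ u1, and β ≪ 1
which requires u1 ≪ c. These are also the conditions for the distribution function to be almost
isotropic so that $f(\mathbf{p}) \approx f(p)$ and this simplifies to
$$
\Phi(p) = p^2 f(p)\int \frac{\mathbf{p}\cdot(\mathbf{U}_1-\mathbf{U}_2)}{v}\,\mathbf{v}\cdot\mathbf{n}\; d\Omega = \frac{4\pi p^3}{3}\, f(p)\,\mathbf{n}\cdot(\mathbf{U}_1-\mathbf{U}_2). \tag{3.10}
$$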
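
The angular integral behind (3.10) reduces to the identity ∫ (e·A)(e·n) dΩ = (4π/3) A·n for a unit vector e swept over the sphere, with A standing for U1 − U2. The sketch below verifies this by a simple midpoint quadrature. It is an illustrative check only, and the function name and the sample vectors are assumptions.

```python
import math


def angular_integral(a, n, n_theta=200, n_phi=400):
    """Midpoint-rule estimate of the integral of (e.a)(e.n) over all directions e."""
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        sin_t, cos_t = math.sin(theta), math.cos(theta)
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            e = (sin_t * math.cos(phi), sin_t * math.sin(phi), cos_t)
            e_dot_a = sum(x * y for x, y in zip(e, a))
            e_dot_n = sum(x * y for x, y in zip(e, n))
            # sin(theta) d_theta d_phi is the solid-angle element
            total += e_dot_a * e_dot_n * sin_t * d_theta * d_phi
    return total


if __name__ == "__main__":
    velocity_jump = (0.3, 0.0, -1.0)  # hypothetical oblique U1 - U2
    normal = (0.0, 0.0, -1.0)         # hypothetical shock normal
    numeric = angular_integral(velocity_jump, normal)
    analytic = 4.0 * math.pi / 3.0 * sum(
        x * y for x, y in zip(velocity_jump, normal))
    print(numeric, analytic)
```

Only the component of the velocity jump along the shock normal survives the integration, which is why (3.10) also holds for oblique shocks.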
This very general expression is the key element in shock acceleration; associated with the
sharp localized compression in the flow, that is the shock, there is a flux of energetic particles
directed upwards in momentum (or energy). The flux is proportional to the number density of
particles at the shock and the velocity jump in the shock. Note that (2.1) can be written in the
conservative form
$$
\frac{\partial f}{\partial t} + \frac{\partial}{\partial x}(Uf) = \frac{1}{4\pi p^2}\frac{\partial}{\partial p}\left(\frac{4\pi p^3}{3}\,f\,\frac{\partial U}{\partial x}\right) + \frac{\partial}{\partial x}\left(\kappa\frac{\partial f}{\partial x}\right) \tag{3.11}
$$
with a momentum flux term
$$
\Phi(p) = \frac{4\pi p^3}{3}\,f\,\frac{\partial U}{\partial x}. \tag{3.12}
$$
If we are prepared to formally apply this to a velocity profile with a step discontinuity or if we
smooth the velocity transition over a small scale and integrate over the transition region, we
recover the above result for the accelerated flux at a velocity discontinuity.

(Footnote 1: For a review of particle acceleration at relativistic shocks see [32].)

We can now write down a conservation equation for the particles associated with the shock.
Physically it is clear that the particles interacting with the shock are those located within about
one diffusion length of the shock. Indeed the steady solution of the diffusion equation in the
upstream region shows that the upstream particles have an exponential distribution with an
e-folding distance of κ1/u1 and a simple argument (given in part I) shows that the probability of
a downstream particle returning to the shock also falls off exponentially with e-folding length
κ2/u2. Thus the number of particles of momentum p interacting with the shock is simply
$$
4\pi p^2 f(p)\left(\frac{\kappa_1}{u_1} + \frac{\kappa_2}{u_2}\right). \tag{3.13}
$$
By assumption there are no particles far upstream, and thus the flux of particles carried in by
the flow from upstream is zero. However, downstream it is clear that there is a flux of particles
carried away from the shock by the flow. In the steady state this advective flux is simply
$$
4\pi p^2 f(p)\,u_2 \tag{3.14}
$$
and even in time-dependent situations this should be a good estimate of the advective loss from
the shock region.
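Particle conservation now requires that the time rate of change of the particle number be
balanced by the divergence of the momentum space acceleration flux and the loss of particles
from the system by downstream advection
$$
\frac{\partial}{\partial t}\left[4\pi p^2 f(p)\left(\frac{\kappa_1}{u_1} + \frac{\kappa_2}{u_2}\right)\right] + \frac{\partial\Phi(p)}{\partial p} = -4\pi p^2 f(p)\,u_2 \tag{3.15}
$$
or simplifying
$$
\left(\frac{\kappa_1}{u_1} + \frac{\kappa_2}{u_2}\right)\frac{\partial f}{\partial t} + \frac{u_1-u_2}{3}\,p\,\frac{\partial f}{\partial p} + u_1 f = 0. \tag{3.16}
$$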
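
The simplification from (3.15) to (3.16) takes only a few lines. The sketch below assumes that κ1, κ2, u1 and u2 do not depend on time, and uses the flux (3.10) in the form Φ = (4πp³/3)(u1 − u2) f, that is, with the shock normal chosen so that the flux is directed upwards in momentum:

$$
\frac{\partial\Phi}{\partial p} = 4\pi p^2 (u_1-u_2)\,f + \frac{4\pi p^3}{3}(u_1-u_2)\frac{\partial f}{\partial p}.
$$

Substituting into (3.15) and dividing by 4πp² gives

$$
\left(\frac{\kappa_1}{u_1} + \frac{\kappa_2}{u_2}\right)\frac{\partial f}{\partial t} + (u_1-u_2)\,f + \frac{u_1-u_2}{3}\,p\,\frac{\partial f}{\partial p} + u_2 f = 0,
$$

and the two terms proportional to f combine into u1 f.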
Clearly the steady solution for the spectrum is (3.3), a simple power law with exponent
determined, as in all Fermi processes, by the balance between acceleration and escape. The
remarkable thing about shock acceleration is that this balance is fixed by the velocity structure
of the shock and has no energy scale; thus the power law can, indeed must, extend over a very
large dynamic range in momentum with fixed exponent q = 3u1 /(u1 − u2 ).
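
As a quick numerical illustration of q = 3u1/(u1 − u2) = 3r/(r − 1), the sketch below evaluates the index for a given compression ratio. The helper `compression_ratio` uses the textbook Rankine–Hugoniot result for an ideal gas with adiabatic index γ, which is not written out in this review and is included here only as an assumption for illustration. All function names are hypothetical.

```python
def compression_ratio(mach, gamma=5.0 / 3.0):
    """Ideal-gas Rankine-Hugoniot compression r = u1/u2 (assumed, not from the text)."""
    if mach <= 1.0:
        raise ValueError("a shock requires a sonic Mach number above 1")
    m2 = mach * mach
    return (gamma + 1.0) * m2 / ((gamma - 1.0) * m2 + 2.0)


def spectral_index(r):
    """Test-particle index q = 3r/(r - 1) of f0 ~ p**(-q), equation (3.3)."""
    if r <= 1.0:
        raise ValueError("compression ratio must exceed 1")
    return 3.0 * r / (r - 1.0)


if __name__ == "__main__":
    for mach in (2.0, 5.0, 10.0, 100.0):
        r = compression_ratio(mach)
        print(f"M = {mach:6.1f}   r = {r:.4f}   q = {spectral_index(r):.4f}")
```

In this idealization q approaches 4 from above as the shock strengthens, and depends on nothing but the compression.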
The conservation equation is exact in the steady case, but as noted above should also be
a good approximation even in time-dependent situations. An immediate deduction from
equation (3.16) is that the acceleration timescale from some momentum p0 to p is
$$
t_{\rm acc}(p) = \frac{3}{u_1-u_2}\int_{p_0}^{p}\left(\frac{\kappa_1}{u_1} + \frac{\kappa_2}{u_2}\right)\frac{dp'}{p'}, \tag{3.17}
$$
a result confirmed by detailed analysis in part I (see also [108, 109]). One can think of this as
the time scale for the acceleration flux, Φ, to ‘fill’ the acceleration region. A mathematical way
of looking at (3.17) is that it actually represents a family of characteristics of (3.16) labelled
by p0.
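
The integral (3.17) is easy to evaluate for any assumed diffusivity. The sketch below uses a midpoint rule in ln p and, purely for illustration, a diffusion coefficient κ = κ0 p/p0 that is the same on both sides of the shock. That choice, the units and the function name are assumptions rather than anything prescribed here.

```python
import math


def acceleration_time(p, p0, u1, u2, kappa1, kappa2, steps=20000):
    """Evaluate (3.17) by midpoint quadrature in ln(p).

    kappa1 and kappa2 are callables returning the upstream and downstream
    diffusion coefficients at a given momentum.
    """
    log_lo, log_hi = math.log(p0), math.log(p)
    h = (log_hi - log_lo) / steps
    total = 0.0
    for i in range(steps):
        p_mid = math.exp(log_lo + (i + 0.5) * h)
        # dp'/p' = d(ln p'), so h is the measure of each cell
        total += (kappa1(p_mid) / u1 + kappa2(p_mid) / u2) * h
    return 3.0 / (u1 - u2) * total


if __name__ == "__main__":
    kappa0, p0 = 1.0, 1.0                 # arbitrary units

    def kappa(p):
        return kappa0 * p / p0            # assumed scaling, kappa proportional to p

    u1, u2 = 4.0, 1.0                     # compression ratio 4
    for p in (10.0, 100.0, 1000.0):
        print(p, acceleration_time(p, p0, u1, u2, kappa, kappa))
```

For κ growing with p the integral is dominated by its upper limit, so most of the time is spent on the last e-folding in momentum.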
Another immediate deduction from the ‘box’ approximation (3.16) is that in the general
time-dependent case the simple power-law spectrum (3.3) should extend from the point where
this simple analysis becomes valid (v ≫ u1, magnetostatic scattering and almost isotropic
angular distributions) up to a maximum energy determined either by the finite age of the
system (t ≈ tacc where t is the age or dynamical timescale of the shock) or the finite size of the
shock (L ≈ κ/u where L is a characteristic length, e.g., the radius of curvature of the shock
front).
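
The two cutoff estimates can be made concrete under a simple assumed diffusivity. The sketch below takes κ = κ0 p/p0 on both sides of the shock (an illustrative assumption), for which (3.17) integrates in closed form to t_acc = T0 (p/p0 − 1) with T0 = 3κ0 (1/u1 + 1/u2)/(u1 − u2). The size limit is taken as κ(p)/u1 = L. The function names and units are hypothetical.

```python
def p_max_age(age, p0, kappa0, u1, u2):
    """Momentum reached after time `age`, from t_acc(p) = age with kappa = kappa0*p/p0."""
    t0 = 3.0 * kappa0 / (u1 - u2) * (1.0 / u1 + 1.0 / u2)
    return p0 * (1.0 + age / t0)


def p_max_size(length, p0, kappa0, u1):
    """Momentum at which the upstream diffusion length kappa/u1 equals `length`."""
    return p0 * length * u1 / kappa0


def p_max(age, length, p0, kappa0, u1, u2):
    """The more restrictive of the age and size limits."""
    return min(p_max_age(age, p0, kappa0, u1, u2),
               p_max_size(length, p0, kappa0, u1))


if __name__ == "__main__":
    # arbitrary units; only the comparison of the two limits matters
    print(p_max(age=123.75, length=50.0, p0=1.0, kappa0=1.0, u1=4.0, u2=1.0))
```

Which limit applies depends on the parameters; in the nonlinear regime neither estimate can be trusted as more than an upper bound, since κ itself depends on the self-generated turbulence.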
The main defect of the ‘box’ approximation is that it lumps all the accelerated particles
together and assumes that they gain energy at the same rate. Physically it is clear that not all
particles gain energy at exactly the same rate, and in fact there is a long ‘tail’ to the distribution
caused by particles which spend long time periods diffusing at some distance from the shock
before eventually returning. This can be made precise through a detailed analysis [49] of the
time-dependent test-particle theory which yields closed expressions for the mean and variance
of the acceleration time distribution (the probability distribution function for the time taken
to accelerate a particle from one momentum p0 to another higher one p1) with κ(x, p) an
arbitrary (positive) function of x and p. The mean acceleration time from p0 to p1 (at the
shock) is found to be
$$
c_1 = \frac{3}{u_1-u_2}\int_{p_0}^{p_1}\frac{dp}{p}\int_{-\infty}^{+\infty} dx\,\exp\left[-\left|\int_0^x \frac{U(x')\,dx'}{\kappa(x',p)}\right|\right]. \tag{3.18}
$$
In agreement with the physical view expressed above this is equivalent to
$$
t_{\rm acc} = \frac{3}{u_1-u_2}\int\theta(x)\,dx \tag{3.19}
$$
where the ‘modulation factor’
$$
\theta(x) = \exp\left[-\left|\int_0^x \frac{U(x')}{\kappa(p,x')}\,dx'\right|\right] \tag{3.20}
$$
measures the penetration of the accelerated particles into the up- and downstream regions.
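
For the unmodified step profile (3.1) with uniform κ1 and κ2, the modulation factor is a two-sided exponential and its integral should reproduce the box length κ1/u1 + κ2/u2 that appears in (3.17). The sketch below checks this numerically. The function names, the truncation at 40 e-folding lengths and the sample values are illustrative assumptions.

```python
import math


def modulation_factor(x, u1, u2, kappa1, kappa2):
    """theta(x) of (3.20) for the step profile (3.1) with uniform diffusivities."""
    if x > 0.0:
        return math.exp(-u1 * x / kappa1)      # upstream side
    return math.exp(-u2 * abs(x) / kappa2)     # downstream side


def penetration_length(u1, u2, kappa1, kappa2, steps=100000, efolds=40.0):
    """Midpoint estimate of the integral of theta(x) over both sides of the shock."""
    total = 0.0
    for scale, sign in ((kappa1 / u1, 1.0), (kappa2 / u2, -1.0)):
        h = efolds * scale / steps
        for i in range(steps):
            x = sign * (i + 0.5) * h
            total += modulation_factor(x, u1, u2, kappa1, kappa2) * h
    return total


if __name__ == "__main__":
    u1, u2, kappa1, kappa2 = 4.0, 1.0, 2.0, 3.0   # arbitrary units
    length = penetration_length(u1, u2, kappa1, kappa2)
    print(length, kappa1 / u1 + kappa2 / u2)
    print("t_acc from (3.19):", 3.0 / (u1 - u2) * length)
```

With a spatially varying U(x) or κ(x, p) the same quadrature applies once θ(x) is built from the cumulative integral in (3.20).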
There is an analogous, but more complicated, expression for the variance of the acceleration
time
$$
c_2 = \frac{6}{u_1-u_2}\int_{p_0}^{p_1}\frac{dp}{p}\left\{\frac{1}{u_1}\left[2\int_0^{\infty} x\,\theta(x)\,dx - \left(\int_0^{\infty}\theta(x)\,dx\right)^2\right] + \frac{1}{u_2}\left[2\int_0^{-\infty} x\,\theta(x)\,dx - \left(\int_0^{-\infty}\theta(x)\,dx\right)^2\right]\right\}
$$
which relates in a similar way the variance of the acceleration time to the spatial variance in
the distribution of the particles relative to the shock. These quantities can then be used to
‘renormalize’ the exact special solution [20]
$$
\frac{1}{\sqrt{2\pi c_2}}\left(\frac{t}{c_1}\right)^{-3/2}\exp\left[-\frac{c_1 (t-c_1)^2}{2tc_2}\right] \tag{3.21}
$$
for the acceleration time distribution (t is the acceleration time) which holds in the case where
κ/U² is constant everywhere by changing the mean, c1, and variance, c2, to the correct values.
Comparison with numerical results [49] shows that this gives a good approximation in most
cases.
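
That c1 and c2 really are the mean and variance of the distribution (3.21) can be confirmed by direct quadrature. The sketch below does this for one arbitrary pair of values. The function names, the sample values of c1 and c2 and the integration cutoff are assumptions made for illustration.

```python
import math


def acceleration_time_pdf(t, c1, c2):
    """Acceleration-time distribution (3.21) with mean c1 and variance c2."""
    if t <= 0.0:
        return 0.0
    prefactor = (t / c1) ** -1.5 / math.sqrt(2.0 * math.pi * c2)
    return prefactor * math.exp(-c1 * (t - c1) ** 2 / (2.0 * t * c2))


def pdf_moments(c1, c2, t_max, steps=200000):
    """Midpoint-rule normalization, mean and variance of the distribution on (0, t_max)."""
    h = t_max / steps
    norm = first = second = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        weight = acceleration_time_pdf(t, c1, c2) * h
        norm += weight
        first += weight * t
        second += weight * t * t
    mean = first / norm
    variance = second / norm - mean ** 2
    return norm, mean, variance


if __name__ == "__main__":
    # arbitrary illustrative values of the mean and variance
    print(pdf_moments(c1=2.0, c2=0.5, t_max=40.0))
```

The distribution is skewed towards long times, which is the ‘tail’ of late-returning particles that the box approximation ignores.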

4. Related topics of the physics of collisionless shocks

While the test particle solution determines the form of the spectrum, it provides no
information about its amplitude. Any attempt to determine Qinj in (3.3), by matching this
solution with a thermal plasma out of which it should emerge, brings us into the realm of
collisionless shocks. This is a difficult branch of plasma physics which seeks to explain how
shocks form in supersonic flows with essentially no Coulomb collisions and how the flow
energy is then dissipated in these shocks. It developed very rapidly during the 1960s and
1970s and the collisionless shock phenomenon served as a testbed for many new ideas in the
studies of nonlinear collective interactions in plasmas. There are a number of excellent reviews
devoted to this topic, [50–52], just to mention three where many other references can also be
found (see also [53] for more recent discussion and references).
To appreciate the problem in general, the following gedanken experiment, which is actually
carried out in many numerical simulations of collisionless shocks, appears to be useful.
Suppose the supersonic gas flow hits a perfectly reflecting wall. In the next moment two
streams will appear in front of the wall, one the incident and another the reflected. If collisions
are strong, the two streams couple to form a flow which is at rest in the wall frame and the
energy of the bulk gas motion is converted into gas internal energy. This gas is separated from
the upstream flow by a shock propagating away from the wall instead of the reflected flow.
After passing a distance of a few m.f.p.s the shock ceases to communicate with the wall and
the parameters of the gas between the wall and the shock may be obtained from those ahead
of the shock on the grounds of conservation of mass, momentum and energy fluxes across the
shock (Rankine–Hugoniot (RH) relations, e.g., [54]).
In a collisionless plasma, by assumption, two-body collisions can be ignored, however,
the two-stream state is unstable and growing waves produce ‘effective’ collisions ensuring
momentum and energy exchange between the two streams and a necessary dissipation. In
this case the two-stream structure cannot disappear completely and must be maintained in the
vicinity of the shock at a level sufficient to generate enough turbulence to create the necessary
effective collisions. Further details of the plasma behaviour near the shock depend on the
orientation of the ambient magnetic field to the shock normal. One distinguishes between
quasi-parallel and quasi-perpendicular shocks in which the angle ΘnB between the magnetic
field and the shock normal is close to zero or to π/2, respectively. Although there is no distinct
boundary separating these two cases, the value ΘnB = π/4 is commonly regarded as such,
mostly on the grounds that for ΘnB ≳ π/4 a significant number of incident ions reflect off the
shock. This classification is not to be confused with another, completely unambiguous one that
is based on the speed at which the intersection point of a field line with the shock moves along the shock front
(e.g., part I). If it is subluminal all such shocks can be transformed to the so-called parallel
shock frame in which the flow is everywhere parallel to the magnetic field. If this speed is
superluminal, a special frame can be chosen in such a way that the magnetic field is strictly
perpendicular to the shock normal.

4.1. Quasi-perpendicular shocks


Plasma is not only a nonlinear medium which allows shocks to form but it is also a dispersive
medium which makes them rich and complex. Sagdeev [50] describes in an elegant way
a perpendicular magnetosonic shock wave as a dissipatively modified solitary wave. A
nondissipative solitary wave (soliton) is characterized by an exact balance of nonlinearity and
dispersion which makes possible the propagation of a symmetric magnetic pulse which neither
steepens nor spreads. Even the smallest amount of dissipation introduces irreversibility and
breaks the symmetry between the upstream and downstream states; the leading edge of the pulse
remains practically unchanged but its trailing edge becomes oscillatory. These oscillations
decay through this same dissipation and relax to a state different from that upstream. In this
way the soliton turns into a shock. Although this solution describes only weak, laminar shocks
it identifies the dissipation mechanism that must come into play for stronger shocks. That
is, when the Alfvén Mach number exceeds MA = 2, the soliton must reflect some small part
of incident ions so that they should form an unstable population ahead of the wavefront and
provide dissipation.
Even strong turbulent shocks, which are particularly interesting from an acceleration
point of view, but lack consistent analytical description, remain conceptually well understood
on the ground of the following two features. First, there exist reflected particles capable of
providing a dissipation mechanism via instability. And second, the hot downstream plasma
is isolated from the cold upstream plasma by a magnetic field and cannot penetrate upstream
farther than to its Larmor radius so that a distinct shock transition can be maintained. An
electric potential overshoot also plays a critical role in insulating the hot downstream plasma
in quasi-perpendicular shocks [52, 55, 56].

4.2. Quasi-parallel shocks


The first difficulty which one encounters when attempting to understand the structure of a
quasi-parallel shock is the following. Since the downstream ion temperature should follow
the RH jump relations at least approximately (which is proven by observations e.g., [57]) the
number of particles that can potentially return upstream is so large that, were they all to do
so, the density of particles injected upstream would be similar to the upstream background
density. This is an absurd result and if it were true such a strong leakage would smear out
the shock transition. Returning to the injection problem we emphasize that conceptually, the
crucial point is that the real problem is not how to achieve injection, but rather how to restrict
it to a sufficiently low level.
A good handle on this situation is provided by hybrid simulations in which electrons are
treated as a fluid but ions are described kinetically together with self-consistent fields. A
comprehensive review is given by Quest [58]. According to these results such a strong leakage
does not occur. Instead, the shocked plasma is trapped downstream despite an apparently
unfavourable field geometry. The details of the trapping mechanism are not understood
completely but it is clear that the hydromagnetic turbulence behind the shock plays a central
part.

4.2.1. Quasi-monochromatic shock model. One simple model largely based on the results of
hybrid simulations presented in [58] and capable of producing the number density of leaking
ions has been suggested in [59]. Its starting point is the assumption that some small fraction
of the downstream plasma leaks upstream to form a beam there. This beam drives MHD
waves via a cyclotron instability according to the resonance condition ω − k∥ v∥ = ±ωci ,
where ω is the frequency of the wave, k∥ and v∥ are the components of the wavevector and of the
beam particle velocity parallel to the ambient magnetic field, and ωci = eB0 /mi c is the
proton gyrofrequency. In the upstream frame, the waves propagate away from the shock nearly
along the magnetic field, k∥ ≈ k, with Alfvén velocity ω/k ≈ vA which is much smaller than
the upstream bulk velocity u1 . Therefore, they are continuously convected downstream with
the flow. The beam particles are pitch-angle scattered on these waves and eventually return
downstream, so that in a steady state the beam has a finite spatial extent upstream. A critical
approximation is that the beam is relatively narrow in its v∥ velocity, Δv∥ < u1 ≲ vb , where
vb and Δv∥ are the mean velocity and the rms width of the beam distribution along the field as seen
in the upstream frame. Therefore, the spectrum of the beam driven MHD turbulence may
also be assumed to be quasi-monochromatic, Δk < k1 ≃ ωci /vb , where k1 is the principal
mode wavenumber and Δk is the width of the spectrum. Put another way, the small parameter
here is not the wave amplitude (as e.g., in a quasi-linear approach) but the small width of the
spectrum.
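
To give a feeling for the scales involved, the resonance estimate k1 ≃ ωci /vb can be evaluated numerically. The sketch below is illustrative only: the field strength (3 μG) and beam speed (1000 km/s) are assumed example values, not parameters taken from [59], and the function names are hypothetical.

```python
import math

E_CHARGE = 4.803e-10   # elementary charge, esu
M_PROTON = 1.6726e-24  # proton mass, g
C_LIGHT = 2.998e10     # speed of light, cm/s


def proton_gyrofrequency(B0):
    """omega_ci = e B0 / (m_i c) in Gaussian units; B0 in gauss, result in rad/s."""
    return E_CHARGE * B0 / (M_PROTON * C_LIGHT)


def principal_wavenumber(B0, v_b):
    """k1 ~ omega_ci / v_b (cm^-1) for a beam of mean speed v_b (cm/s)."""
    return proton_gyrofrequency(B0) / v_b


def is_quasi_monochromatic(delta_k, k1):
    """The narrow-spectrum condition used in the text: delta_k < k1."""
    return delta_k < k1


B0 = 3e-6   # ASSUMED ambient field, gauss
v_b = 1e8   # ASSUMED beam speed, cm/s
omega_ci = proton_gyrofrequency(B0)
k1 = principal_wavenumber(B0, v_b)
wavelength = 2.0 * math.pi / k1  # cm
```

With these assumed inputs the principal wavelength is of order 10^10 cm, far larger than any electron-scale length, a point that becomes important for electron injection.
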
Upon crossing the shock the perpendicular component of the wave magnetic field increases
even further following the flow compression r = u1 /u2 , B⊥2 ≃ rB⊥1 , and becomes larger than
the conserved parallel component B0 . Although the wavenumber increases in the same way,
the majority of thermal ions downstream are magnetized, k2 ρ⊥ ≲ 1. Here k2 ≃ rk1 is the
wavenumber downstream and ρ⊥ is the Larmor radius of a thermal particle, ρ⊥ = VT2 /ω⊥ ,
where ω⊥ = ωci B⊥2 /B0 . These particles perceive the local magnetic field directed
quasi-perpendicularly to the flow and they are convected further downstream with the wave. In
other words they interact with the wave resonantly. In contrast, particles with higher energies,
k2 v/ω⊥ ≫ 1, interact with the wave adiabatically, i.e., they perceive a spatially averaged rather
than local field. The former is clearly equal to the unperturbed field B0 directed perpendicularly
to the shock front. Therefore, in the case of a favourable velocity direction, these particles can
freely recross the shock from the downstream side and leak upstream. Of course, their number
falls off sharply with energy so that the leakage is dramatically suppressed through trapping
of the bulk of low-energy particles by the turbulence behind the shock. Since the particle
dynamics in a monochromatic MHD wave is exactly integrable, one calculates the leakage for
arbitrary (in fact very large) wave amplitude. This latter is then determined by considering
the saturation of the instability upstream.
One obvious feature of this mechanism is its dependence upon the mass to charge ratio
of different species. As we argued above, the leakage upstream must be controlled by the
parameter k2 ρα , where ρα = (VTα /ω⊥ )(A/Z) is the Larmor radius of a species α in the wave
field, VTα is the corresponding thermal velocity, and A and Z are the mass and charge numbers,
respectively. It is clear that strongly magnetized particles (k2 ρα ≪ 1) cannot be injected,
whereas unmagnetized particles (k2 ρα ≫ 1) are injected as readily as in the case without
magnetic trapping. For protons this parameter is k2 ρp ≃ ε ≡ B0 /B⊥2 ≲ 1, which means that
protons are ‘marginally’ injected. Species with smaller A/Z (electrons) cannot be injected by
this mechanism (unless they are anomalously heated downstream by some other waves, see
section 5.1 below) whereas particles with higher A/Z can be injected with higher efficiency.
This is illustrated in figure 1; further details can be found in [59].
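
The scaling of the control parameter with A/Z can be made explicit. Combining ρα = (VTα /ω⊥ )(A/Z) with the proton value k2 ρp ≃ ε gives k2 ρα ≃ ε (VTα /VTp )(A/Z). The sketch below evaluates this; it does not compute the injection efficiency itself (for that see [59]). The helper name `k2_rho` and the default of equal thermal velocities for all ions are assumptions made purely for illustration.

```python
import math

ME_OVER_MP = 1.0 / 1836.15  # electron-to-proton mass ratio


def k2_rho(A_over_Z, eps=0.3, vT_ratio=1.0):
    """Magnetization parameter k2*rho_alpha = eps * (V_Talpha / V_Tp) * (A/Z).

    eps      : B0 / B_perp2 (0.3 is the strong-shock value quoted in the text)
    vT_ratio : V_Talpha / V_Tp; ASSUMED to be 1 for ions by default
    """
    return eps * vT_ratio * A_over_Z


protons = k2_rho(1.0)
helium = k2_rho(2.0)  # fully ionized He, A/Z = 2
# Electrons: A/Z = m_e/m_p in proton units; equal temperatures ASSUMED,
# so that V_Te / V_Tp = sqrt(m_p / m_e).
electrons = k2_rho(ME_OVER_MP, vT_ratio=math.sqrt(1.0 / ME_OVER_MP))
```

Under these assumptions the electron parameter is below 0.01, i.e. electrons are strongly magnetized in the ion-driven wave, whereas ions with larger A/Z sit closer to the unmagnetized regime.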
Besides the species sensitive regulation of the leakage, this mechanism has another
important aspect. When the cold, nearly monoenergetic upstream distribution begins to interact
with the strong downstream wave it slows down and becomes broader. This is what is actually
expected at a shock transition.
It is also important to emphasize here that the selective injection of different species by
trapping them in the downstream turbulence is not the only mechanism that may determine
the chemical composition of CRs presumably accelerated in strong astrophysical shocks such
as SNRs. The backreaction of accelerated particles themselves results in an extended shock
structure, which will be one of the main themes of this review. Returning to the chemical
composition of CRs, however, this structure also gives rise to preferential acceleration of
heavy elements. This phenomenon was first studied using Monte Carlo modelling
of nonlinear acceleration in [60]. That paper focused in particular on the acceleration of charged
grains, and compared the results both with the conventional first
ionization potential approach to the chemical composition and with observations. The authors
of [60] argued that nonlinear selective effects are needed to achieve better agreement with the
data. A different approach to the question of chemical composition based on the assumption
of acceleration of the fresh ejecta material in SN associations is given in [61]. From now on,
however, we concentrate on the acceleration of protons and electrons and we consider their
injection in some more detail in the next section.

[Figure 1: η/ηp plotted against A/Z (1 to 4) for ε = 0.2, 0.3 and 0.4.]

Figure 1. Injection efficiencies of different species normalized to proton efficiency as functions of
mass to charge ratio A/Z and for different amplitude parameters ε = B0 /B⊥2 . The self-consistent
magnitude of ε (see text) is calculated to be ε ≈ 0.3 for strong shocks. Two other curves are also
shown to demonstrate how the stronger (weaker) field compression would influence the abundances
of different species through the stronger (weaker) suppression of the injection of protons which are
more sensitive to this suppression mechanism (for ε ≲ 1) than species with A/Z > 1.

5. Injection

We have learned from the collisionless shock theory that a fast and cold flow upstream becomes
slower and hotter downstream, exactly as in ordinary shocks. The form of the downstream
distribution is difficult to calculate, but observations and the hybrid simulations show that
a Maxwellian can reasonably approximate the thermal core. On the other hand, from the
arguments of section 3, high-energy particles, should they be present at the shock, must develop
a power-law tail given by (3.3) with an unknown normalization constant Qinj . Theoretically,
these two parts of the whole particle distribution can be noncontiguous [62] which is actually
the case far upstream, where the low-energy particles cannot penetrate (see section 7.2) since
κ(p) is generally a growing function of the momentum. Thus, there is a gap between the
thermal upstream and high-energy distributions. However, there is no gap between the thermal
distribution and the nonthermal tail downstream as, e.g., hybrid simulations show. To determine
the constant Qinj , these two fairly different downstream distributions must be linked at some
energy. Clearly, we first need to identify a physical process whereby particles from the low-
energy part of the spectrum are accelerated to energies sufficient for describing them by the
means of section 3. In a broader sense, this is known as the injection problem and in any
instance it should be based on an interaction of low-energy particles with the shock and a self-
generated turbulence. It is intimately related to mechanisms of collisionless shock dissipation,
as e.g., discussed briefly in the preceding section. Perhaps due to our poor understanding
of such mechanisms, there is still no general consensus about concrete injection scenarios.
One obvious parameter that should discriminate between different possibilities is again the
angle θnB . That is, at sufficiently oblique shocks, reflected particles must be increasingly
important. For nearly parallel shocks, θnB ≪ 1, the thermal leakage concept [58] discussed
earlier is widely accepted.
There exists also a third group of particles which might play an important role in the
injection process but may not be directly involved in the shock dissipation process. These
are particles that after their arrival at the shock front neither cross it, nor are reflected, but
simply sit there. Based on hybrid simulations of quasi-parallel shocks, the authors of [63]
argued that some ions can oscillate at the shock front for up to ten ion gyroperiods gaining
energy from the resonant tangential electric field. These particles have also been observed
in strictly parallel shocks in [64]. A comprehensive description of this phenomenon is given
in [65]. Typically, these particles leave the shock front in the upstream direction and can
then be described in the same way as reflected and leaking particles. It is, however, not clear
whether these particles represent a genuine phenomenon or an artefact of one-dimensional
simulations [66]. Low-dimensional dynamics is usually less chaotic and such a quasi-coherent
interaction might disappear when two constraining integrals of motion, the two components
of canonical momentum in the shock plane, fail to exist.
Each of the three groups of particles eventually becomes subject to DSA, although
initially they appear upstream in the form of an anisotropic distribution which does not obey the
diffusion–convection equation (2.1). Therefore, the entire injection problem consists actually
of two different parts: (i) one identifies particles which are capable of returning upstream
after their first encounter with the shock and one determines their momentum distribution (first
generation of injected particles); (ii) one follows the (stochastic) trajectories of these particles
when they multiply recross the shock until they have achieved energies acceptable for the
standard description of the DSA via equation (2.1) (if not swept downstream before).
As discussed in the preceding section, the first task belongs to collisionless shock physics
and can at least formally be addressed independently of the DSA process. The second
constitutes the injection problem itself as a part of DSA theory and can be formulated in
more detail as follows. Suppose that task (i) is solved, e.g., as described in section 4.2.1
or by a direct computer simulation². Then, given the distribution of thermal particles that
are able to penetrate into the upstream region, one calculates the high-energy asymptotics
of their distribution. This provides the coefficient in the power-law solution of the standard
acceleration theory and thus the injection rate. The solution of the injection problem (ii) as
formulated above has been obtained analytically in [67]. The high-energy asymptotics of this
solution indeed matches the power law of the standard theory. At the lower energy end it
smoothly joins the downstream thermal distribution.
The strategy of solution can be outlined as follows (see [67] and [59] for details). We
assume that a parallel shock runs into the positive x-direction and the shock front is at x = 0.
For concreteness we consider the thermal leakage based injection. Suppose that some fraction
of the downstream plasma leaks upstream to form at x = 0⁺ the one-sided distribution³
F(v′), v′∥ = v′x > 0. Here v′ is the velocity in the shock frame and we reserve the notation
v for the downstream frame. Due to the pitch-angle scattering in the upstream medium these
particles turn around and eventually cross the shock in the downstream direction forming the
distribution F⁻(v′), v′x < 0, which can be written as F⁻ = L1 F , again at x = 0.

² The problem (i) is physically more complicated than (ii), but it is easier to address, e.g., within the hybrid simulations
since formally it deals only with the initial phase of interaction of the upstream plasma with the shock. On the other
hand, the absence of high-energy particles and their backreaction on the shock structure and turbulence is a serious
limitation of such simulations.
³ It should be noted that this distribution is unknown even if the task (i) mentioned above is solved since it also contains
the ‘higher generations’ of injected particles.

The linear
integral operator L1 , the upstream propagator, can be obtained from the solution of the kinetic
equation upstream. While penetrating further downstream these particles are still pitch-angle
scattered so that some of them acquire positive velocities and move back to the shock. We
denote their distribution within the turbulence zone downstream by F⁺. For F⁺ we thus have
\[ F^{+} = L_2 L_1 F + f_M . \tag{5.1} \]
Here L2 is the downstream counterpart of L1 ; fM is the distribution function of the downstream
thermal plasma that emerges upon the first crossing of the shock interface (without higher
generations), whereas the first term in (5.1) represents higher generations of injected particles.
We assume for simplicity that fM is a Maxwellian so that L2 fM ≈ fM because it is nearly
isotropic in the downstream frame. Now, the injection spectrum that appears just upstream
of the shock is determined by the fraction of the F⁺ distribution that will be able to penetrate
through the turbulence zone, i.e., by the transparency coefficient τ(v) of the turbulent region
downstream with respect to the particle leakage into the upstream medium. In other words,
the distribution of injected particles F is simply F = τ F⁺. Substituting F⁺ from (5.1), we
obtain the following equation for F :
\[ F = \tau L_2 L_1 F + \tau f_M . \tag{5.2} \]
The function τ(v) is calculated in [59], where the solution of equation (5.2) was also obtained
and compared with the broad dynamical range hybrid simulations [68] (figure 2). The resulting
spectra (a) are in good agreement with the simulations, (b) evolve into a standard power
law at higher energies and (c) may easily exceed in intensity the threshold of a nonlinear
acceleration regime (see section 7).
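
The structure of equation (5.2), a linear integral equation whose successive iterates add one ‘generation’ of injected particles each, can be illustrated with a toy discretization. Everything specific in the sketch below is hypothetical: the Gaussian smoothing kernels merely stand in for the propagators L1 and L2, and the form of τ(v) is invented for illustration (the actual τ(v) and propagators are those of [59, 67]). The sketch only shows that the generation-by-generation iteration converges to the solution of the linear system when τ < 1.

```python
import numpy as np


def smoothing_operator(v, width):
    """Toy stand-in for a propagator: row-normalized Gaussian kernel on the grid v."""
    kernel = np.exp(-0.5 * ((v[:, None] - v[None, :]) / width) ** 2)
    return kernel / kernel.sum(axis=1, keepdims=True)


def solve_injection(tau, L1, L2, f_M, tol=1e-12, max_iter=10_000):
    """Iterate F <- tau * (L2 L1 F) + tau * f_M, one 'generation' per step."""
    A = tau[:, None] * (L2 @ L1)
    b = tau * f_M
    F = b.copy()  # first generation only
    for _ in range(max_iter):
        F_new = A @ F + b
        if np.max(np.abs(F_new - F)) < tol:
            return F_new
        F = F_new
    raise RuntimeError("iteration did not converge")


v = np.linspace(0.0, 5.0, 200)   # velocity grid in thermal units (ASSUMED)
tau = v**2 / (1.0 + v**2)        # HYPOTHETICAL transparency, 0 <= tau < 1
f_M = np.exp(-0.5 * v**2)        # Maxwellian source term
L1 = smoothing_operator(v, 0.5)  # HYPOTHETICAL upstream propagator
L2 = smoothing_operator(v, 0.3)  # HYPOTHETICAL downstream propagator

F = solve_injection(tau, L1, L2, f_M)
# Cross-check against a direct solve of (I - tau L2 L1) F = tau f_M.
F_direct = np.linalg.solve(np.eye(v.size) - tau[:, None] * (L2 @ L1), tau * f_M)
```

Because every term of the iteration is non-negative, the converged F is nowhere smaller than the first-generation term τ fM; the higher generations can only add particles.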

5.1. Electron injection


Since protons carry most of the mass, momentum and kinetic energy of the plasma across the
shock they should play the main role in the collisionless shock formation and dissipation of the
flow energy. By the same token, the nonthermal (shock accelerated) part of their population
may be expected to receive more shock energy than that of the electrons. Ironically, we
have more evidence of electron acceleration in shocks of astrophysical scale. This does not
necessarily mean, of course, that the protons are injected or accelerated less efficiently than
the electrons but the latter are simply more visible because of their higher radiative efficiency.
The main difficulty with electron injection may be seen from the already discussed
cyclotron resonance condition ω − k∥ v∥α = ±ωcα where α = e, i. According to this
condition waves generated by thermally leaking protons with k ∼ ωci /u1 are clearly too
long to resonantly interact with electrons. Therefore, electrons require a separate injection
scenario. A few possibilities discussed in the literature may be grouped, again, depending on
the magnetic field angle θnB . In the parallel case the existing approaches are fundamentally
similar to the standard theory of shock acceleration of particles in a self-generated wave field,
as developed by Bell [3] and Lee [10]. Indeed, since the thermal velocity of electrons, in
contrast to the ions, can be much higher than the shock speed on both sides of it, one can
use the small pitch-angle anisotropy and apply the diffusion–convection equation to describe
the kinetics of thermal electrons exactly as was done for energetic ions. The main difficulty
with the above resonance condition can be overcome in two ways. One suggestion is that the
necessary scattering is due to the self-generated whistler waves [69], which have the required
short wavelengths. The electron distribution, however, must still be anisotropic enough in the
upstream co-moving frame to generate whistlers, which requires sufficiently strong shocks,
i.e., roughly, Vshock ≳ VTe or the Mach number M ≳ √(mi /me ). More accurate analysis of the
linear growth rate leads, however, to the following instability condition MA > √(mi /βe me ) [69].
Since the downstream distribution of thermal electrons cannot be inferred so easily as that of
ions (from RH relations) this injection model chiefly deals with the conditions under which
electrons may be extracted from the thermal pool rather than with calculation of the injection
rate. With the help of numerical simulation, however, the same author was able to estimate
the fraction of injected electrons relative to the protons at 1 GeV as 1–10% [70].

[Figure 2: log10 Flux [(cm² s sr keV)⁻¹] plotted against log10 Energy (keV); curves labelled ‘Standard Slope’ and ‘Maxwellian’.]

Figure 2. Particle spectra behind the shock, pitch-angle averaged in the shock frame. The squares
are from hybrid simulations [68]. Thin curve is a corresponding Maxwellian fit which is taken as a
source term fM in equation (5.2). The solution of this equation is shown with the heavy curve. The
dashed curve shows the solution of the same equation for τ ≡ 1. The dotted-dashed line indicates
the slope of the spectrum appropriate for high-energy particles and a shock compression of 4.

A different approach to overcome the difficulty of the lack of resonant waves was suggested
in [71]. It uses the same ion generated (long) waves of the standard theory to scatter
the electrons as well, by assuming that their amplitudes are much higher than the weak turbulence level,
as implied in the above resonant condition. The turbulent wave–particle collisions are thus no
longer assumed to be quasi-linear. Presumably, they result from the phase-space granulation
and vortices emerging in the high-amplitude magnetic waves inside the shock transition (note
that such phenomena have been extensively studied in plasma physics, e.g., [72, 73], not
surprisingly, in many cases with regard to the foundations of quasi-linear theory). The electron
dynamics remains, however, approximately diffusive and the authors of [71] further argue that
by invoking methods developed earlier in [74] it can be described by the diffusion–convection
equation supplemented with the second-order Fermi term (momentum diffusion) and properly
renormalized transport coefficients. The momentum dependence of the second-order Fermi
term still requires some parametrization, but the high-energy asymptotic behaviour of the
resulting spectra (injection rate) was found to be rather insensitive to the parametrization.
The calculated spectra are qualitatively similar to those shown in figure 2 (under comparable
conditions) and can also be characterized by a Maxwellian with the emerging power law of
an index prescribed by the shock compression. The amplitude of the power law (the electron
injection rate) is at least one order of magnitude higher than the ion injection rate, e.g., shown
in figure 2 [59, 68] or calculated recently in a nonlinear acceleration model [75] with the self-
consistent injection. The difference is clearly attributable to the trapping effect of the strong
downstream wave on ions, that was included in all these ion injection models. The Monte
Carlo simulations [76–79] as well as the analytic calculations of ion injection [67] that neglect
this trapping effect appear to be much closer to the high electron injection rate obtained in [71].
In this model, the requirement of pitch-angle isotropy also at thermal energies in the shock
transition clearly implies that the shock should not be too strong, M ≲ √(mi /me ), which makes
the two discussed models of electron injection complementary to each other. The common
remaining difficulty is the lack of our understanding of how electrons are initially thermalized
upon crossing a parallel shock. As a result, the first model is not firmly connected with the
thermal pool while the second one must rely on not well understood strong wave–particle
interaction mechanisms.
There exists also a rather sceptical view on the electron injection efficiency at quasi-
parallel shocks (see e.g., [80]). Indeed, if we assume that the quasi-parallel shock structure
is largely supported by the ion beam cyclotron instability as discussed earlier, it is difficult to
see how the electron adiabaticity can be violated in order to heat and accelerate them out of
the thermal pool since they are strongly magnetized in such a structure (kρe ≪ 1, ω ≪ ωce ).
Note that generally very high amplitudes of ion waves (δB ∼ B0 or even higher) do not break
adiabatic conditions, although they may indeed cause mirroring and additional heating. However,
the scepticism seems to be supported by the observations of energetic particles streaming ahead
of the Earth’s bow-shock, where energetic ions were found to come from the quasi-parallel
regions of the shock surface while electrons come from the quasi-perpendicular ones [81].
These arguments motivated studies of acceleration of thermal electrons and, in particular,
their injection into the diffusive acceleration at quasi-perpendicular shocks. The basic idea
is rooted (not unexpectedly) in the mechanism of shock dissipation itself which for the
quasi-perpendicular shocks is believed to be due to reflected ions [56, 82, 83]. The reflected ion
beam gyrates in the foot region of the shock carrying a sizeable fraction of its ram energy and
efficiently generates the so-called lower-hybrid waves (they belong to the already mentioned
whistler branch of plasma oscillations in the magnetic field). These waves have a frequency
ω ∼ ωLH ≃ √(ωce ωci ) and, propagating nearly perpendicularly to the magnetic field, can
interact with electrons of virtually arbitrary energy via a Cherenkov resonance ω = k∥ v∥ . In
addition, these waves being powered by (heavy) protons should be able to efficiently accelerate
(light) thermal electrons with v ∼ VTe to v ∼ c. There are two limitations, however, that
follow immediately from the linear theory of wave generation. The first one arose from the
observation [84] that the group velocity of these waves may be smaller than the flow speed
so that they are swept rapidly off the shock foot and there is no time for them to grow. This
issue has been addressed in [85] where the linear dispersion equation for these waves has been
numerically solved under reasonable assumptions about the velocity distribution of reflected
ions. The result is that for sufficiently fast shocks, Vshock ≳ 0.02c, the wave group velocity
Vgr ≳ Vshock and the waves can grow in the foot of the shock. It should be noted that while the
difficulty of slow wave propagation is justified in the special case of a strictly perpendicular
shock wave, θnB = π/2, in the more general case of strongly oblique shocks with θnB ≈ π/2, it
has probably been exaggerated. Indeed, if the waves are convected with the flow so rapidly that they
have no time to grow, the reflected ion beam does not relax either and propagates farther upstream
along the field line until the waves have enough space to reach amplitudes sufficient for the
beam relaxation before they are swept downstream. The beam relaxation length upstream is
L ≃ Vshock /γ , where γ is the wave growth rate, so that the wave growth time t = L/Vshock ∼ 1/γ
is independent of their group velocity. This is similar to the Bell–Lee [3, 10] theory of relaxation
of CRs on self-generated Alfvén waves upstream; the wave group velocity is also irrelevant to this theory
when Vgr = VA ≪ Vshock .
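
The frequency scale and the growth-time argument above can be checked numerically. The sketch assumes the estimate ωLH ≃ √(ωce ωci ) and uses example values (B = 3 μG, Vshock = 5000 km/s, γ = 0.01 s⁻¹) chosen purely for illustration; the function names are hypothetical.

```python
import math

E_CHARGE = 4.803e-10     # elementary charge, esu
M_PROTON = 1.6726e-24    # proton mass, g
M_ELECTRON = 9.1094e-28  # electron mass, g
C_LIGHT = 2.998e10       # speed of light, cm/s


def gyrofrequency(B, mass):
    """e B / (m c) in Gaussian units; B in gauss, result in rad/s."""
    return E_CHARGE * B / (mass * C_LIGHT)


def lower_hybrid_frequency(B):
    """omega_LH ~ sqrt(omega_ce * omega_ci), the estimate used in the text."""
    return math.sqrt(gyrofrequency(B, M_ELECTRON) * gyrofrequency(B, M_PROTON))


def relaxation_length(v_shock, growth_rate):
    """Beam relaxation length L ~ V_shock / gamma."""
    return v_shock / growth_rate


B = 3e-6            # ASSUMED field strength, gauss
v_shock = 5e8       # ASSUMED shock speed, cm/s
growth_rate = 1e-2  # ASSUMED wave growth rate, 1/s

omega_LH = lower_hybrid_frequency(B)
ratio_to_ci = omega_LH / gyrofrequency(B, M_PROTON)  # equals sqrt(m_p/m_e)
growth_time = relaxation_length(v_shock, growth_rate) / v_shock  # equals 1/gamma
```

The shock speed cancels in the growth time, and the group velocity never enters it, which is the point made in the text.
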
Another problem is related to the number of electrons that can be accelerated in this
way. The acceleration is inseparable from the wave damping which, being proportional to
∂fe /∂v∥ at v∥ = ωk /k∥ , exceeds the wave growth rate when the phase velocity ωk /k∥ ≲ V∗ , where
V∗ ∼ VTe ln(n0 /nb ) (for a Maxwellian electron distribution). Apart from the ratio of ion
beam to the background density nb /n0 , the critical velocity V∗ depends on a number of other
parameters that we omitted for simplicity. Unless the beam density is unusually high, the
critical velocity V∗ is unlikely to be lower than ∼3VTe , so that the fraction of accelerated
electrons is exponentially small, ∼ exp(−V∗² /2VTe² ).
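
As a quick illustration of how small this fraction is, the estimate exp(−V∗² /2VTe² ) can be evaluated for a few values of V∗ /VTe . The function name is hypothetical and the expression is only the order-of-magnitude estimate quoted above, not a full calculation.

```python
import math


def resonant_fraction(vstar_over_vTe):
    """Order-of-magnitude fraction of electrons above V*, exp(-V*^2 / (2 V_Te^2))."""
    return math.exp(-0.5 * vstar_over_vTe**2)


fraction_at_3 = resonant_fraction(3.0)  # about 1.1e-2
fraction_at_4 = resonant_fraction(4.0)  # about 3.4e-4
```

Raising V∗ from 3VTe to 4VTe already reduces the fraction by more than an order of magnitude.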
A mechanism for also accessing the thermal electrons was suggested in [86], where the
backreaction of the acceleration on the electrodynamics of the shock has been included.
According to this mechanism, the acceleration of an initially small number of suprathermal
electrons with v ≳ V∗ results in their escape from the acceleration region (i.e., from the
turbulent region) along the field lines. This builds up an electrostatic potential which is needed
to maintain quasi-neutrality and also pre-accelerates thermal particles from the nonresonant
region v < V∗ to the resonant one, v > V∗ . The solution for the shock structure is
characterized by the potential drop φ across the shock which raises the number of accelerated
electrons by the factor exp(eφ/Te ). Depending on the Mach number, this may be between 10
and 100, although the approximation used for this solution probably needs to be modified for
shocks of very high Mach numbers when the potential drop also becomes too high. For such
strong shocks current driven instabilities (such as the Buneman instability and ion acoustic,
if electrons are heated efficiently, so that Te > Ti ) should be expected and have been discussed
indeed in the literature both in the papers specifically devoted to acceleration of electrons
(e.g., [87–89] and references therein) as well as in those addressing the fundamental issue
of collisionless shock structure and dissipation, in particular the phenomenon of anomalous
resistivity (e.g., [52, 90–92]).

6. Nonlinear fluid theories

In test-particle (linear) theory the plasma flow, which enters the diffusion–convection
equation (2.1) through the velocity field U (x), is assumed to be unaffected by the high energy
distribution f . However, it is clear that if the number density of CRs nC is not vanishingly small,
and the particle spectrum extends to sufficiently high energies, the pressure exerted by these
particles on the inflowing gas can also be large enough to invalidate the test particle theory.
What is more important perhaps, from the point of view of applications, if the acceleration
operates efficiently and a significant part of the energy dissipated in the shock is transferred
to the high-energy population, the reaction of the accelerated particles on the flow and shock
structure must be included. Despite a generally very small nC , the slowing down of the
upstream flow may be very strong due to the following positive feedback: the effect of CR
pressure PC is to harden the spectrum through a stronger compression which further increases
the pressure. As a result, the system jumps from a nearly test-particle solution with small PC
to a solution in which almost all the flow ram pressure is converted into the CR pressure.
A general method for describing such bifurcation phenomena consists in successive
reductions in the dimensionality of an appropriate dynamical system, ideally down to a set
of algebraic equations. Their solutions can then be explicitly represented in the form of a
bifurcation diagram on which a suitable physical quantity such as the total shock compression
or CR pressure is given as a function of a governing parameter, such as the Mach number or
injection rate. In classical gas dynamics, starting from the Boltzmann equation (a nonlinear
integro-differential equation) after deriving the hydrodynamic moment equations (partial
differential equations) one finally arrives at RH jump relations (algebraic equations). An
analogue to this procedure in the theory of CR shocks is known as the two-fluid model (TFM).
The TFM equations were introduced, but only incompletely analysed, in one of the first papers
on the DSA by Axford et al [2]. A comprehensive solution and complete classification of
steady solutions was then given by Drury and Völk [93] using graphical methods while Axford
et al [94] gave an equivalent algebraic formulation.
The TFM was the first consistent approach to appreciate the CR nonlinearity and
bifurcation in shock acceleration. It is important to understand that, like ordinary hydrodynamics,
it is, under appropriate conditions, an exact reduction scheme and its limitations are entirely
due to our insufficient knowledge of its closure parameters. We shall discuss the limitations
of the TFM in sections 6.3 and 7.5.2.

6.1. Two-fluid model


The basic idea of the TFM, as the name implies, is to treat the accelerated particles as a second
fluid characterized by an energy density and pressure, but negligible mass density and inertia.
The energy density, EC, and pressure, PC, are defined by integrals over the distribution function
of the accelerated particles
\[
E_C = \int 4\pi p^2 f(p)\,T(p)\,dp, \qquad P_C = \int 4\pi p^2 f(p)\,\frac{pv}{3}\,dp \tag{6.1}
\]
where T (p) is the kinetic energy of a particle of momentum p (the subscript C is historical
and originally denoted ‘CR’). Formally, starting with the advection diffusion equation (2.1)
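we can derive an exact equation for the energy density of the accelerated particles,
\[
\begin{aligned}
\frac{\partial E_C}{\partial t} + U\frac{\partial E_C}{\partial x}
 &= \int 4\pi p^2 T\,\frac{\partial}{\partial x}\left(\kappa\frac{\partial f}{\partial x}\right)dp
  + \int 4\pi p^2 T\,\frac{1}{3}\frac{\partial U}{\partial x}\,p\frac{\partial f}{\partial p}\,dp \\
 &= \frac{\partial}{\partial x}\int 4\pi p^2 T\,\kappa\frac{\partial f}{\partial x}\,dp
  - (E_C + P_C)\frac{\partial U}{\partial x}
\end{aligned}
\tag{6.2}
\]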
where we have integrated by parts and used the result from special relativity, dT /dp = v. This
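can be written
\[
\frac{\partial E_C}{\partial t} + \frac{\partial (U E_C)}{\partial x}
 = -\frac{\partial Q}{\partial x} - P_C\frac{\partial U}{\partial x} \tag{6.3}
\]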
which simply states that the accelerated particle energy density is almost conserved, but with
a ‘heat flux’ like term Q associated with the diffusion and a ‘PdV’ type work term associated
with the divergence or convergence of the flow.
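The integration by parts that takes the first line of (6.2) to the second can be spelled out as follows. This is a sketch that assumes the boundary term $[4\pi p^3 T f]$ vanishes at both ends of the momentum range (the assumption that is revisited in section 6.3); the limits $p_{\min}$, $p_{\max}$ are notation introduced here for illustration.

```latex
\begin{align*}
\int 4\pi p^2 T\,\frac{p}{3}\frac{\partial f}{\partial p}\,dp
 &= \frac{1}{3}\Bigl[4\pi p^3 T f\Bigr]_{p_{\min}}^{p_{\max}}
   - \frac{1}{3}\int 4\pi f\,\frac{\partial}{\partial p}\bigl(p^3 T\bigr)\,dp \\
 &= -\frac{1}{3}\int 4\pi f\,\bigl(3p^2 T + p^3 v\bigr)\,dp
   \qquad \text{(boundary term dropped, } dT/dp = v\text{)} \\
 &= -\int 4\pi p^2 f\,T\,dp - \int 4\pi p^2 f\,\frac{pv}{3}\,dp
  = -(E_C + P_C).
\end{align*}
```

Multiplying by $\partial U/\partial x$ gives the last term of (6.2). Equation (6.3) then follows from $U\,\partial E_C/\partial x + E_C\,\partial U/\partial x = \partial(UE_C)/\partial x$, with $Q = -\int 4\pi p^2 T\kappa\,(\partial f/\partial x)\,dp$.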
Now in ordinary gas dynamics the gas thermal energy density, EG , and thermal pressure,
PG , satisfy an almost identical equation and it is tempting to set up a reduced system of
dynamics where, instead of the full distribution function f of the accelerated particles we
work with the dynamically important quantities PC and EC . This system (simplifying to one
dimension) has the form of mass conservation
\[
\frac{\partial \rho}{\partial t} + \frac{\partial}{\partial x}(\rho U) = 0 \tag{6.4}
\]
where ρ is the gas mass density, Newton’s law relating the acceleration of a fluid element to
the pressure forces,
\[
\frac{\partial U}{\partial t} + U\frac{\partial U}{\partial x}
 = -\frac{1}{\rho}\frac{\partial}{\partial x}(P_G + P_C) \tag{6.5}
\]
(note that both the gas and the accelerated particle pressures appear in this equation) and the
two energy equations
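\[
\begin{aligned}
\frac{\partial E_G}{\partial t} + \frac{\partial}{\partial x}(U E_G) &= -P_G\frac{\partial U}{\partial x} \\
\frac{\partial E_C}{\partial t} + \frac{\partial}{\partial x}(U E_C) &= -P_C\frac{\partial U}{\partial x} - \frac{\partial Q}{\partial x}.
\end{aligned}
\]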
Unfortunately we only have four equations for the seven unknowns ρ, U, PG , EG , PC , EC , Q
so there is a closure problem. The same situation occurs in ordinary gas dynamics where there
are three equations for four quantities and closure requires the specification of an equation of
state relating EG, PG and ρ, the simplest being a polytropic relation, PG = (γG − 1)EG. The
perfect monatomic gas (a reasonable model for a hydrogen plasma) corresponds to γG = 5/3.
This suggests two closure relations,
\[
P_C = (\gamma_C - 1)E_C, \qquad P_G = (\gamma_G - 1)E_G \tag{6.6}
\]
in terms of ‘adiabatic exponents’ for the gas and the accelerated particles which specify the
ratio between pressure and internal energy density. The remaining problem is to specify the
energy flux carried by diffusion and it is plausible to write
\[
Q = -\bar{\kappa}\,\frac{\partial E_C}{\partial x} \tag{6.7}
\]
which states that the energy flux carried by the diffusing particles is proportional to an effective
‘mean’ diffusion coefficient κ̄ times the gradient of the energy density. This closes the system
and defines the TFM.
The equation of state closure for the gas (strictly plasma) is absolutely standard and
noncontroversial. Furthermore it is physically clear that the energy equation for the accelerated
particles must hold under very general circumstances. Thus the TFM really only depends on
two assumptions. One is that PC can be unambiguously related to EC , the simplest possibility
being to assume a polytropic relation. The second is that the diffusive energy flux can be
represented as an effective diffusion coefficient, κ̄, times the gradient of the energy density.
The problem in relating PC to EC is that it depends crucially on the relative importance
of the nonrelativistic and relativistic parts of the distribution. It is trivial to show that for any
isotropic distribution of nonrelativistic particles the pressure is two-thirds the energy density
(corresponding to γ = 5/3) and that for relativistic particles it is one-third (corresponding to
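γ = 4/3). Thus 4/3 ≤ γC ≤ 5/3 ≈ γG. The simplest assumption, which has often been
made, is that the relativistic particles dominate, so that γC ≈ 4/3. Generally
\[
\begin{aligned}
\gamma_C = 1 + \frac{P_C}{E_C}
 &= 1 + \frac{\int 4\pi p^2 (pv/3) f\,dp}{\int 4\pi p^2 T f\,dp} \\
 &= \frac{\bigl[4\pi p^3 T f\bigr]_{p_{\min}}^{p_{\max}}}{3E_C}
  - \frac{1}{3}\,\frac{\int (p/f)(\partial f/\partial p)\,4\pi p^2 T f\,dp}{\int 4\pi p^2 T f\,dp}
\end{aligned}
\]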
so that if the contribution from the endpoints of the momentum integration can be ignored
(see below) γC is one-third of the energy weighted average of the logarithmic slope of the
spectrum. In particular we have the rather remarkable result, that if the spectrum is a power
law in momentum, $f(p) \propto p^{-q}$, with exponent q in the range 4 < q < 5, then γC = q/3.
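The result γC = q/3 can be checked numerically. The following is a minimal sketch, not part of the original analysis: it assumes momentum in units of mc, a pure power law $f \propto p^{-q}$, and a finite integration range (`p_min`, `p_max` are illustrative cut-offs chosen wide enough that the endpoint contributions are negligible for q well inside the interval 4 < q < 5).

```python
import math

def gamma_c(q, p_min=1e-10, p_max=1e10, n=40000):
    """Estimate gamma_C = 1 + P_C/E_C for f(p) ~ p**(-q), with p in units of mc.

    Trapezoidal rule on a logarithmic grid; the common factor 4*pi*h cancels
    in the ratio, so it is omitted from both sums.
    """
    h = (math.log(p_max) - math.log(p_min)) / n
    e_c = 0.0
    p_c = 0.0
    for i in range(n + 1):
        p = p_min * math.exp(i * h)
        w = 0.5 if i in (0, n) else 1.0          # trapezoid end weights
        root = math.sqrt(1.0 + p * p)
        kinetic = p * p / (root + 1.0)            # T/(mc^2) = sqrt(1+p^2) - 1, stable form
        v = p / root                              # v/c
        f = p ** (-q)
        # dp = p d(ln p), so p^2 ... dp becomes p^3 ... d(ln p)
        e_c += w * p**3 * f * kinetic
        p_c += w * p**3 * f * (p * v / 3.0)
    return 1.0 + p_c / e_c

if __name__ == "__main__":
    for q in (4.4, 4.5, 4.6):
        print(q, gamma_c(q), q / 3.0)
```

For q close to 4 or 5 the integrals converge only slowly, so the agreement degrades unless the cut-offs are pushed further out; this is the numerical counterpart of the endpoint term in the expression for γC.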
Similarly, the mean diffusion coefficient is formally given by
\[
\bar{\kappa} = \frac{\int 4\pi p^2 T\,\kappa\,(\partial f/\partial x)\,dp}{\int 4\pi p^2 T\,(\partial f/\partial x)\,dp}. \tag{6.8}
\]
While it is clearly a weighted mean of κ(p) (and in the case that κ is constant and momentum
independent κ̄ = κ) the weighting function, $p^2 T(\partial f/\partial x)$, is not simple and can, at least in
principle, change sign. However the TFM equations are only well defined if κ̄ is strictly positive
and physically it is hard to imagine a naturally occurring situation where the diffusive energy
flux transports energy from a region of low-energy density to one of high-energy density. It is
interesting to note that in the shock acceleration context we typically have
\[
\frac{\partial f}{\partial x} \approx \frac{fU}{\kappa} \tag{6.9}
\]
and if this holds then
\[
\bar{\kappa} \approx \frac{\int 4\pi p^2 T f\,dp}{\int 4\pi p^2 T\,(f/\kappa)\,dp} \tag{6.10}
\]
giving κ̄ as a weighted harmonic mean of κ with a positive definite weighting function (H Völk,
personal communication). This is perhaps the closest one can get to a formal proof that κ̄ > 0.
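The weighted harmonic mean (6.10) is easy to evaluate for a given spectrum. The sketch below is illustrative only: it assumes a power law $f \propto p^{-q}$ between two hypothetical cut-offs `p0` and `p1` (momentum in units of mc) and, by default, a Bohm-type κ(p) of the form used later in the text; the function names are not from the source.

```python
import math

def kappa_bohm(p, K=1.0):
    """Bohm-type diffusivity K p^2 / sqrt(1 + p^2); K is a reference value (assumed)."""
    return K * p * p / math.sqrt(1.0 + p * p)

def mean_kappa(q, p0, p1, kappa=kappa_bohm, n=20000):
    """Weighted harmonic mean of kappa, equation (6.10), for f ~ p**(-q) on [p0, p1]."""
    h = math.log(p1 / p0) / n
    num = 0.0
    den = 0.0
    for i in range(n + 1):
        p = p0 * math.exp(i * h)
        w = 0.5 if i in (0, n) else 1.0                  # trapezoid end weights
        kinetic = p * p / (math.sqrt(1.0 + p * p) + 1.0)  # T/(mc^2)
        weight = w * p**3 * kinetic * p ** (-q)           # p^2 T f dp on a log grid
        num += weight
        den += weight / kappa(p)
    return num / den

if __name__ == "__main__":
    print(mean_kappa(4.0, 1e-2, 1e4))
```

Because the weight $p^2 T f$ is positive, the result always lies between the smallest and largest values of κ on the interval, which is the positivity property noted in the text.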
The TFM, with prescribed values for the two closure parameters γC and κ̄, constitutes an
interesting dynamical system in its own right. Regarded as a model for particle acceleration it
is obvious that it correctly includes many of the important aspects of the interaction between
the accelerated particles and the flow (notably the acceleration through the work done by the
flow compression against the particle pressure and the reaction back on the flow dynamics
through the particle pressure gradient). It is sufficiently simple that detailed analytical and
numerical studies of even quite complex systems are feasible. Its first major success was
the systematic classification of all possible steady shock structures and the discovery of the
bifurcation between strongly and weakly modified shock structures beyond a critical Mach
number. More recently it has been used to demonstrate a very interesting instability of acoustic
modes in the upstream region of strongly modified shocks and to show that, in the case where
three shock solutions exist, the intermediate solution is unstable [95] according to the classical
criteria of Dyakov and Kontorovich [54].

6.2. Wave excitation


The importance of the resonant excitation of Alfvén waves in the upstream region for shock
acceleration was first pointed out by Bell [3]. The process is quite straightforward although
its nonlinear ‘bootstrap’ character makes it rather difficult to analyse. If we have a shock
which is efficiently accelerating energetic particles there is a strong gradient in the accelerated
particle pressure in the upstream region. Under these conditions Alfvén waves propagating
down the pressure gradient are strongly amplified giving enhanced scattering of the particles
and a reduced value of the diffusion coefficient.
Quasi-linear theory shows that the diffusion coefficient can be expressed in the form
\[
\kappa = \frac{\kappa_B}{I} \tag{6.11}
\]
where κB is the Bohm diffusion coefficient
\[
\kappa_B = \frac{r_g v}{3} \tag{6.12}
\]
corresponding to a random walk with mean free path (m.f.p.) equal to the particle gyroradius, and I is the
dimensionless power in the resonant waves. The total wave energy density is
\[
\frac{\langle \delta B^2 \rangle}{8\pi} = \frac{B^2}{8\pi}\int I(k)\,d\ln k \tag{6.13}
\]
integrating over all wavenumbers k. Strictly, particles interact resonantly with those waves
where the wavenumber, k, projected on the direction of the mean magnetic field B equals the
spatial frequency of the helical trajectory of the particle. For particles of momentum p and
pitch-angle cosine µ this condition is
\[
k_\parallel = \frac{eB}{\mu pc} = \frac{1}{\mu r_g} \tag{6.14}
\]
and thus the scattering of particles of momentum p involves waves on all lengthscales smaller
than the gyroradius rg . However it is clear that the bulk of the scattering depends on those waves
with scales close to rg and the discussion is greatly simplified if we ‘sharpen the resonance’
and assume that particles in a given logarithmic interval of p-space interact only with waves
in a logarithmic interval of k-space where k and p are related by kp = O(eB/c) (e.g., [35]).
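To give a feeling for the scales entering (6.12) and (6.14), the sketch below evaluates the gyroradius, the Bohm diffusion coefficient and the resonant wavenumber for a proton. It is an illustration under stated assumptions: SI units are used for convenience (the text uses Gaussian units), the particle is singly charged, and the example momentum and field strength are hypothetical values, not taken from the source.

```python
import math

C_LIGHT = 299_792_458.0       # speed of light, m/s
PROTON_REST_EV = 938.272e6    # proton rest energy, eV

def gyroradius_m(pc_ev, b_tesla):
    """r_g = p/(eB) for charge e; with pc expressed in eV this is pc/(c B)."""
    return pc_ev / (C_LIGHT * b_tesla)

def speed_m_s(pc_ev, rest_ev=PROTON_REST_EV):
    """v = c * pc / E with E^2 = (pc)^2 + (m c^2)^2."""
    return C_LIGHT * pc_ev / math.hypot(pc_ev, rest_ev)

def bohm_kappa_m2_s(pc_ev, b_tesla, rest_ev=PROTON_REST_EV):
    """Bohm diffusion coefficient kappa_B = r_g v / 3, equation (6.12)."""
    return gyroradius_m(pc_ev, b_tesla) * speed_m_s(pc_ev, rest_ev) / 3.0

def resonant_wavenumber_per_m(pc_ev, b_tesla, mu):
    """Parallel resonant wavenumber 1/(mu r_g), equation (6.14)."""
    return 1.0 / (mu * gyroradius_m(pc_ev, b_tesla))

if __name__ == "__main__":
    pc, b = 1e12, 3e-10   # hypothetical: pc = 1 TeV, B = 3 microgauss
    print(gyroradius_m(pc, b), bohm_kappa_m2_s(pc, b),
          resonant_wavenumber_per_m(pc, b, 0.5))
```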
Assuming the upstream scattering wave field to be dominated by waves travelling in the
positive x-direction at velocity V relative to the flow the diffusion advection equation for the
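particles is
\[
\frac{\partial f}{\partial t} + (U + V)\frac{\partial f}{\partial x}
 = \frac{\partial}{\partial x}\left(\kappa\frac{\partial f}{\partial x}\right) \tag{6.15}
\]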
(this is one case where we do need to distinguish between the mean fluid speed, U and that
of the scattering structures, U + V ). Neglecting modifications to U the (dimensional) energy
density I of the resonant waves is given by
\[
\frac{\partial I}{\partial t} + (U + V)\frac{\partial I}{\partial x}
 = V\frac{\partial P}{\partial x} - \gamma I \tag{6.16}
\]
where γ is a damping coefficient and P is the pressure in resonant particles (the case of varying
U and V would require a more general wave action equation). The energy transferred to the
waves is simply the difference between the work done by the particles, (U + V )∇P , and the
work done on the fluid, U ∇P . It is convenient to work with non-dimensional quantities. We
have already expressed through I (k) the wave energy per logarithmic interval in terms of the
background magnetic field energy density and we introduce by analogy
\[
\mathcal{P}(p) = \frac{1}{\rho U^2}\,4\pi p^3\,\frac{pv}{3}\,f(p) \tag{6.17}
\]
to represent the resonant particle pressure per logarithmic interval in terms of the ram pressure
of the upstream flow. The total accelerated particle pressure is
\[
P_C = \rho U^2 \int \mathcal{P}(p)\,d\ln p. \tag{6.18}
\]

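It is then easy to derive the nonlinear coupled system
\[
\frac{\partial \mathcal{P}}{\partial t} + (U + V)\frac{\partial \mathcal{P}}{\partial x}
 = \frac{\partial}{\partial x}\left(\frac{\kappa_B}{I}\frac{\partial \mathcal{P}}{\partial x}\right) \tag{6.19}
\]
\[
\frac{\partial I}{\partial t} + (U + V)\frac{\partial I}{\partial x}
 = \frac{2U^2}{V}\frac{\partial \mathcal{P}}{\partial x} - \gamma I \tag{6.20}
\]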
where we have assumed that V is the Alfvén speed and thus
\[
\frac{B^2}{8\pi} = \frac{1}{2}\rho V^2. \tag{6.21}
\]
One advantage of this nondimensional form is that we see at once that if, as is almost universally
the case in applications, U ≫ V, then the wave excitation is extremely efficient. Formally
the steady solutions of the above system can readily be determined if we introduce a new
independent variable, analogous to the optical depth used in radiative transfer problems
\[
\tau = \int_{x_0}^{x} \frac{U + V}{\kappa_B}\,I(x')\,dx' \tag{6.22}
\]
where x0 is some reference point. In the steady state there is obviously a first integral,
corresponding to the particle flux in the advection diffusion equation
\[
\phi = (U + V)\left(\mathcal{P} - \frac{d\mathcal{P}}{d\tau}\right) \tag{6.23}
\]
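and the solution is easily verified to be
\[
\mathcal{P} = \frac{\phi}{U + V} + \left(\mathcal{P}_0 - \frac{\phi}{U + V}\right)\exp\tau \tag{6.24}
\]
\[
I = I_0 - \frac{\kappa_B\gamma}{(U + V)^2}\,\tau
 + \frac{2U^2\,[\mathcal{P}_0 (U + V) - \phi]}{V (U + V)^2}\,(\exp\tau - 1). \tag{6.25}
\]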
When wave damping is included, at some distance upstream the damping exceeds the excitation.
If we assume that particles reaching this point can freely escape from the system it is easy to
show that the escaping flux is
\[
\phi = \frac{\kappa_B\gamma V}{2U^2} \tag{6.26}
\]
and this will significantly steepen the spectrum if it is comparable to that leaving the system by
downstream advection. This analysis is based on that in [96] where a fuller analysis of this
system of equations, including numerical studies of the time-dependent system, is given.
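The steady solution (6.24), (6.25) can be verified numerically. With dτ = (U + V) I dx/κB, the steady forms of (6.19) and (6.20) reduce to φ = (U + V)(𝒫 − d𝒫/dτ) and (U + V)² dI/dτ = (2U²/V)(U + V) d𝒫/dτ − γκB. The sketch below checks both by central differences; the parameter values are arbitrary test numbers, not physical estimates, and the function names are introduced here for illustration.

```python
import math

def steady_solution(tau, U, V, kappa_B, gamma, P0, I0, phi):
    """Return (P, I) from equations (6.24) and (6.25) at optical depth tau."""
    c = U + V
    P = phi / c + (P0 - phi / c) * math.exp(tau)
    I = (I0 - kappa_B * gamma * tau / c**2
         + 2.0 * U**2 * (P0 * c - phi) / (V * c**2) * (math.exp(tau) - 1.0))
    return P, I

def residuals(tau, params, d=1e-5):
    """Residuals of the steady flux and wave equations, using central differences."""
    U, V, kappa_B, gamma, P0, I0, phi = params
    c = U + V
    P_minus, I_minus = steady_solution(tau - d, *params)
    P_plus, I_plus = steady_solution(tau + d, *params)
    P, _ = steady_solution(tau, *params)
    dP = (P_plus - P_minus) / (2.0 * d)
    dI = (I_plus - I_minus) / (2.0 * d)
    flux_res = c * (P - dP) - phi                                   # (6.23)
    wave_res = c**2 * dI - (2.0 * U**2 / V) * c * dP + gamma * kappa_B
    return flux_res, wave_res

if __name__ == "__main__":
    test_params = (1.0, 0.1, 0.5, 0.2, 0.05, 1.0, 0.01)  # arbitrary values
    print(residuals(-1.3, test_params))
```

Both residuals should vanish to within the finite-difference error for any choice of parameters.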
More interesting, for our present purposes, is that when the damping is negligible the
steady excitation equation can be integrated to give
\[
I = \frac{2U^2}{V(U + V)}\,\mathcal{P} + \text{const.} \tag{6.27}
\]
Thus the dimensionless wave intensity is of order the Alfvén Mach number of the shock times
the dimensionless particle pressure, $I = O(U/V)\,\mathcal{P}$. Now, as U/V ≫ 1 and for efficient
acceleration $\mathcal{P} \approx 1/\ln(p_{\max}/p_{\min}) = O(10^{-1})$, there is a problem. The linear theory of wave
excitation predicts dimensionless wave amplitudes much larger than unity and is therefore
clearly invalid. Note that, as in many other areas of shock acceleration, this is a problem not
of the physics but of our ability to handle the situation mathematically. What is clear is that
the creation of strong scattering in the upstream region of the shock is not a problem; the
only problem is that the transfer of energy from the flow through the particles to the waves is
too efficient and is attempting to create a wave energy density which is the geometric mean
between the kinetic energy density of the flow and the background magnetic field energy
density. Although we do not understand what happens once the waves become strongly
nonlinear, it is plausible to assume that the scattering in the upstream region near the shock
is such that the diffusion coefficient is close to the Bohm value and has the Bohm scaling,
κ ∝ pv. Either the excess energy transferred to the waves is dissipated by a nonlinear process
and ends up as thermal energy of the incoming plasma, or nonlinear effects act to reduce the
wave excitation; in either case it is very hard to see why the wave amplitude should saturate
before reaching levels of order unity or, in other words, the only ‘natural’ value for the diffusion
coefficient is that implied by Bohm scaling.
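The size of the problem is easily illustrated from (6.27). The numbers below are hypothetical (a shock speed of 3000 km/s, an Alfvén speed of 10 km/s and a dimensionless particle pressure of 0.1) and serve only to show the order of magnitude that linear theory would predict; the integration constant in (6.27) is set to zero in this sketch.

```python
def linear_wave_intensity(U, V, P):
    """Dimensionless wave intensity from (6.27) with the integration constant set to zero."""
    return 2.0 * U**2 / (V * (U + V)) * P

if __name__ == "__main__":
    U = 3.0e8   # cm/s, hypothetical shock speed
    V = 1.0e6   # cm/s, hypothetical Alfven speed
    P = 0.1     # dimensionless particle pressure per logarithmic interval
    print(linear_wave_intensity(U, V, P))
```

For these values the predicted intensity exceeds unity by more than an order of magnitude, which is the sense in which the linear theory is invalid.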
The waves generated upstream will be advected through the shock and amplified on
compression so that, as long as they are not strongly damped in the downstream region, there
will also be strong scattering behind the shock. There is some evidence from numerical
studies [97] that the mean field strength can be significantly enhanced by nonlinear wave
interaction so that diffusion coefficients even smaller than the Bohm value might be achieved.
Thus the use of the Bohm value as an estimate of the diffusion coefficient in the shock
neighbourhood seems well justified.
It is important to note that it is the local value of the diffusion coefficient near the shock
which is crucial for the acceleration process. Apart from the theoretical arguments given
above there is direct observational evidence for reduced values of the diffusion coefficient
in the neighbourhood of heliospheric shocks and the sharpness of the rims of some SNRs
seen in radio observations has been interpreted as evidence for significantly reduced diffusion
coefficients upstream of SNR shocks [98]. Furthermore, the upstream turbulent layers of five
SNRs have been probed directly by VLBI observations of interstellar scattering of extragalactic
sources whose lines of sight pass just outside of the respective shells [99,100]. One particularly
solid result of this study concerns the SNR HB9. It was demonstrated that the turbulence zone
is definitely narrower than 3.5% (0.8 pc) of its radius. This indicates that the CR precursor
must be relatively thin i.e., the turbulence is indeed enhanced in the acceleration layer which
is consistent with the aforementioned interpretation of sharp radio-rims. Finally the recent
observations of x-ray synchrotron emission and TeV γ-rays from the remnant of SN1006 [101]
indicate electron acceleration to energies of $10^{14}$ eV, which is only possible if the diffusion
coefficient is indeed of the order of the Bohm value.

6.3. Limitations of the TFM: attempts at inclusion of the injection and losses
The limitations of the TFM are seen from its derivation. Integrating the diffusion–convection
equation in section 6.1 we dropped the terms coming from the momentum limits confining
the CR distribution in (6.2). The situation with the upper limit is the same as in ordinary
hydrodynamics. Clearly, f(p) must vanish at p = ∞ faster than $p^{-4}$ (for κ(p) = const and
even faster when κ grows with p). At p = 0 one may require that f is less singular than $p^{-5}$.
Then, the lower and upper limits can be taken to be 0 and ∞, respectively, and the limit terms
disappear causing no closure problem. This led the authors of [102] to the conclusion that
the predictions of the TFM are useful only if the postshock distribution satisfies the inequality
4 ≤ q ≡ −d ln f/d ln p ≤ 5. Although this is clearly a sufficient validity condition for the
stationary TFM, it should be borne in mind that the lower limit cannot be set to zero since
the diffusion–convection equation is invalid for small momenta. Instead, the input of thermal
particles into the CR population must occur at an injection momentum p = p0 separating (of
course, somewhat ambiguously, see e.g., [103]) the thermal plasma from CRs. The energy
injection from the thermal subshock to CRs has been incorporated into the TFM in a number
of papers (e.g., [104], part I, [105–107]). Our interest, however, will be mostly confined to
cases in which CRs tap energy not from the subshock during the injection phase but from the
whole shock when they are accelerated to very high energies. This is, of course, consistent
with the notion of efficient acceleration.
The high-energy limitation is more serious. At the same time, it need not necessarily
take the form of the constraint q > 4. For example, in time-dependent cases and if the
total compression is >4, a quasi-steady spectrum may form below a slowly advancing cut-
off momentum p1 (t) and be harder than q = 4. This may be deduced from an exact
solution of the time-dependent test-particle problem (see e.g., [108, 109] and part I where
also further references can be found). A behaviour consistent with this was also observed
in a number of direct numerical solutions of the full nonlinear problem (2.1), (6.4), (6.5),
e.g., [110, 111]. Another possibility is when there are strong losses at p ≥ p1 so that the
condition f (p > p1 ) ≡ 0 may be imposed and a steady-state solution exists. In this last
case the limit term should be retained, which causes an additional closure problem, since the
quantity f (p1 ) cannot be calculated if the kinetic solution is not obtained. Some attempts
to circumvent this difficulty were made in [112–114]. The essence of these approaches is
to express the limit term in (6.2) through the CR pressure PC simply on the ground that the
pressure integral in the case of a sufficiently hard spectrum (q < 4) is dominated by particles
near the cut-off p1
\[
\left. f(p)\,p^3\left(\sqrt{1 + p^2} - 1\right)\right|_{p_0}^{p_1}
 = 3P_C \begin{cases} 4 - q, & q < 4 \\ 0, & q \geq 4. \end{cases} \tag{6.28}
\]

One can attempt to estimate q in (6.28) from test-particle theory or its perturbative
modifications [113], expressing q through the (unknown) total and subshock compressions
which may then be obtained from the RH conditions similar to those derived in [94].
Unfortunately, as we shall see in section 7.6, in a strongly nonlinear acceleration regime such
estimates are not accurate enough to correctly include the energy losses. Thus, the model with
losses remains generally also unclosed as does the ordinary TFM considered in the preceding
section. Moreover, the impact of the energy escape (6.28) on the shock compression and thus
on the particle spectrum is generally more serious than that of the TFM closure parameters γC
and κ̄. This situation is similar to that of radiative shocks, where energy escape makes the shock
compression in principle unlimited. To demonstrate this, consider the usual RH relations
with the CR energy escape flux Qesc through the cut-off boundary p1 (e.g., [115])

\[
[\rho U] = 0 \tag{6.29}
\]
\[
[\rho U^2 + P] = 0 \tag{6.30}
\]
\[
\left[\tfrac{1}{2}\rho U^3 + U(P + E)\right] = Q_{\rm esc} \tag{6.31}
\]

where [A] = A1 − A2 and the indices 1, 2 refer to the upstream and downstream values of
the quantity A, respectively. The above three equations constitute conservation of the mass,
momentum, and energy fluxes across the shock. Assuming for simplicity the Mach number in
the upstream flow to be very high, i.e., neglecting the gas thermal pressure and energy density
upstream, we easily obtain
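\[
\frac{U_1}{U_2} = r = 1 + 2\frac{E_2}{P_2} + 2\frac{Q_{\rm esc}}{U_2 P_2} \tag{6.32}
\]
or alternatively, by using the relation (6.6), P = (γ − 1)E,
\[
r = \frac{\gamma + 1}{\gamma - \sqrt{1 + 2(\gamma^2 - 1)\,Q_{\rm esc}/\rho_1 u_1^3}} \tag{6.33}
\]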

where Qesc can be explicitly defined as
\[
Q_{\rm esc} = -\frac{4\pi c\,p_1}{3}\int_{-\infty}^{\infty} \frac{dU}{dx}\,p_1^3 f(p_1, x)\,dx. \tag{6.34}
\]

The message from (6.33) is clear: unless the escape flux Qesc is small it should be calculated
extremely accurately which means that both the flow profile U (x) and the particle distribution
must be obtained from the coupled equations (2.1), (6.4), (6.5), i.e., essentially at a kinetic level.
The perturbative test particle estimates based on the determination of q(p1 ) are intrinsically
inadequate. Another immediate observation from (6.33) is that if the whole available energy
flux is converted into the CR flux and ‘escapes’, i.e., $\rho_1 u_1^3/2 = Q_{\rm esc}$, the flow stagnates
downstream (r = ∞). There would be no internal energy in the gas in this case (if M = ∞)
and, if the shock is driven, e.g., by a piston, the gas would condense on its surface. As we
will see, this does not happen: the shock compression r may be very large indeed, but it is
always limited, even for M = ∞, if the CR density downstream is limited, nC < ∞.
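The sensitivity of the compression ratio to the escape flux is easy to see by evaluating (6.33) directly. In the sketch below `q_esc` denotes the dimensionless ratio $Q_{\rm esc}/\rho_1 u_1^3$ (a name introduced here for illustration). Eliminating P2 and U2 from (6.29)–(6.32) for a cold upstream flow gives the quadratic $(\gamma - 1)(1 - 2q_{\rm esc})r^2 - 2\gamma r + (\gamma + 1) = 0$, of which (6.33) is a root; the helper `quadratic_residual` checks this.

```python
import math

def compression_ratio(gamma, q_esc):
    """Total compression r from (6.33); q_esc = Q_esc / (rho_1 u_1^3), 0 <= q_esc < 1/2."""
    if not 0.0 <= q_esc < 0.5:
        raise ValueError("q_esc must satisfy 0 <= q_esc < 1/2")
    s = math.sqrt(1.0 + 2.0 * (gamma**2 - 1.0) * q_esc)
    return (gamma + 1.0) / (gamma - s)

def quadratic_residual(gamma, q_esc):
    """Residual of (gamma-1)(1-2q) r^2 - 2 gamma r + (gamma+1) = 0 at r from (6.33)."""
    r = compression_ratio(gamma, q_esc)
    return (gamma - 1.0) * (1.0 - 2.0 * q_esc) * r**2 - 2.0 * gamma * r + (gamma + 1.0)

if __name__ == "__main__":
    for q in (0.0, 0.1, 0.3, 0.45, 0.49):
        print(q, compression_ratio(4.0 / 3.0, q))
```

Without escape the familiar values r = 4 (γ = 5/3) and r = 7 (γ = 4/3) are recovered, while r grows without bound as `q_esc` approaches 1/2.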
7. Nonlinear kinetic theories

After having obtained in section 3 the steady test particle spectrum, we turned to the problem
of injection which led us to the conclusion that particles must be injected in numbers sufficient
for the onset of nonlinear acceleration. We began its study with a hydrodynamic approach
that provided essential information about the bifurcation and critical values of parameters.
Naturally, its predictions about particle spectra are at best rather indirect, being obtained only via the
preceding test particle results, given the overall and subshock compressions. Therefore, we
reconsider the problem at a kinetic level focusing first on the impact of two particular nonlinear
shock modifications on the particle spectrum which are the shock broadening and the increased
total compression. Fortunately, a complete exact solution is known for the case in which κ is
momentum independent, the gas is cold and the shock transition smooth. For convenience,
however, we give first a more general steady-state formulation that also includes a finite
subshock and the upper cut-off momentum.
Let a strong shock propagate in the positive x-direction. In its own frame the steady mass
flow profile is defined as follows: U(x) = −u(x) for x ≥ 0 and U(x) = −u2 for x < 0, where u2 > 0
is the (constant) downstream mass velocity, u(0+) = u0 ≥ u2, and u(∞) = u1 > u0. The
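steady-state equations read (see equations (2.1), (6.4), (6.5))
\[
\frac{\partial}{\partial x}\left(ug + \kappa(p)\frac{\partial g}{\partial x}\right)
 = \frac{1}{3}\frac{du}{dx}\,p\frac{\partial g}{\partial p}, \tag{7.1}
\]
\[
\rho u = \rho_1 u_1, \tag{7.2}
\]
\[
P_C + \rho u^2 = \rho_1 u_1^2. \tag{7.3}
\]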
Here $g = p^3 f$, i.e., the number density of CRs is normalized to 4πg dp/p, the particle
momentum p to mc, ρ(x) is the mass density, ρ1 = ρ(∞), PC is the CR pressure (6.1)
\[
P_C(x) = \frac{4\pi}{3}\,mc^2 \int_{p_0}^{p_1} \frac{p\,dp}{\sqrt{p^2 + 1}}\,g(p, x). \tag{7.4}
\]
The upper limit p1 stands for a boundary in the momentum space (cut-off) beyond which
particles are assumed to leave the system instantaneously (g ≡ 0, p > p1 ) unless the
integral converges and p1 may be set to ∞. Note that any separable x-dependence of the particle
diffusivity, κ(p, x) = κ(p)K(x), can be removed from equation (7.1) by the transformation
$x \to \int dx/K(x)$. If a subshock is present at x = 0, equation (7.3) is invalid in the region
x ≤ 0 where the contribution of the gas pressure (i.e., particles with 0 < p < p0) should be
retained. The subshock strength is then to be determined from a regular RH condition
\[
r_s \equiv \frac{u_0}{u_2} = \frac{\gamma_G + 1}{\gamma_G - 1 + 2M_0^{-2}}. \tag{7.5}
\]
Here M0 is the Mach number of the flow in front of the subshock. The last equation is coupled
with equations (7.1)–(7.3) through the gas deceleration and heating rates in the precursor. In
the case of a purely adiabatic heating
\[
M_0 = M R^{-(\gamma_G + 1)/2} \tag{7.6}
\]
with R = u1 /u0 . As stated before, we start with the particular case of a completely smooth
shock transition with M = ∞, ∀x. Therefore, using the above shock arrangement we set
u(x) = −U (x), ∀x and u0 = u2 = u(−∞).
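The coupling between precursor compression and subshock strength expressed by (7.5) and (7.6) can be sketched as follows. The function names and example numbers are illustrative assumptions; the relation for M0 holds only for purely adiabatic heating in the precursor, as stated in the text.

```python
def subshock_mach(M, R, gamma_g=5.0 / 3.0):
    """Mach number in front of the subshock for adiabatic precursor heating, equation (7.6)."""
    return M * R ** (-(gamma_g + 1.0) / 2.0)

def subshock_compression(M0, gamma_g=5.0 / 3.0):
    """Subshock compression ratio r_s = u0/u2 from the RH condition (7.5)."""
    return (gamma_g + 1.0) / (gamma_g - 1.0 + 2.0 / M0**2)

if __name__ == "__main__":
    M = 100.0                      # hypothetical far-upstream Mach number
    for R in (1.0, 2.0, 5.0, 10.0):  # precursor compression R = u1/u0
        M0 = subshock_mach(M, R)
        print(R, M0, subshock_compression(M0))
```

As the precursor compression R grows, M0 falls and the subshock weakens from the strong-shock value of 4 (for γG = 5/3) towards unity.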

7.1. An exact solution for momentum independent CR diffusivity


As discussed in part I section 4.3.1 [116, 117] it is possible to obtain an exact Green function
solution relating the far upstream and far downstream spectra for the steady-state diffusion
advection equation if the condition
\[
\kappa\frac{du}{dx} = \beta (u_1 - u)(u - u_2) \tag{7.7}
\]
dx
is satisfied where β is a constant. The Green function is then given by an infinite series of
power laws and at large momenta, where the leading term dominates, the asymptotic spectral
index is
\[
\frac{\partial \ln f}{\partial \ln p} \to -\frac{3u_1}{u_1 - u_2}\left(1 + \frac{1}{\beta}\,\frac{u_2}{u_1 - u_2}\right). \tag{7.8}
\]
The constant β can be thought of as a dimensionless measure of the size of the diffusion
coefficient. The above ansatz essentially requires the shock profile to be of hyperbolic tangent
form and if we substitute
\[
u(x) = \frac{u_1 + u_2}{2} + \frac{u_1 - u_2}{2}\tanh\frac{x}{L} \tag{7.9}
\]
we obtain
\[
\beta = \frac{2\kappa}{L(u_1 - u_2)} \tag{7.10}
\]
where L is the lengthscale of the transition from u1 to u2 . In the limit β → ∞ the diffusion
lengthscale is very much larger than L, the transition appears as a sharp jump, and we recover
the standard result for the spectral index. For finite values of β this solution shows explicitly
the effect of a finite width of the transition in causing the spectral index to steepen.
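The relation between β and the transition width can be checked by substituting the hyperbolic tangent profile (7.9) into the ansatz (7.7). The following short derivation is a worked check, assuming a constant κ:

```latex
\begin{align*}
\frac{du}{dx} &= \frac{u_1 - u_2}{2L}\,\mathrm{sech}^2\frac{x}{L}, \\
(u_1 - u)(u - u_2) &= \left(\frac{u_1 - u_2}{2}\right)^2\left(1 - \tanh^2\frac{x}{L}\right)
  = \frac{(u_1 - u_2)^2}{4}\,\mathrm{sech}^2\frac{x}{L}, \\
\kappa\,\frac{u_1 - u_2}{2L} &= \beta\,\frac{(u_1 - u_2)^2}{4}
  \quad\Longrightarrow\quad \beta = \frac{2\kappa}{L(u_1 - u_2)}.
\end{align*}
```

In the limit $\beta \to \infty$ the second term in (7.8) vanishes and the index tends to $3u_1/(u_1 - u_2) = 3r/(r - 1)$, the standard test-particle value for compression ratio $r = u_1/u_2$.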
The most remarkable thing about this solution is that in the special (and admittedly very
artificial) case of a globally constant κ and in the strong shock limit it can be made completely
consistent. The dynamical equations for the structure of a strong shock in the TFM (which of
course is exact for the case of constant κ) are those of mass conservation
\[
\rho u = A \tag{7.11}
\]
where A is the constant mass flux, momentum conservation
\[
Au + P_C = Au_1 \tag{7.12}
\]
ignoring the thermal pressure (strong shock limit) and the energy equation
\[
\frac{1}{2}Au^2 + \frac{\gamma_C}{\gamma_C - 1}\,uP_C + \frac{\kappa}{\gamma_C - 1}\frac{dP_C}{dx} = \frac{1}{2}Au_1^2. \tag{7.13}
\]
Eliminating PC the shock-structure equation reduces to
\[
\kappa\frac{du}{dx} - \frac{\gamma_C + 1}{2}(u_1 - u)(u - u_2) = 0 \tag{7.14}
\]
with
\[
u_2 = \frac{\gamma_C - 1}{\gamma_C + 1}\,u_1. \tag{7.15}
\]
Thus the specific ansatz required for the solution of the transport equation coincides with the
shock structure produced by the reaction of the accelerated particles if β = (γC + 1)/2 and the
compression ratio is r = u1 /u2 = (γC + 1)/(γC − 1). Note that in this case the shock thickness
L is proportional to the particle diffusivity κ, as is always the case in nonlinear solutions. The
corresponding spectral slope is simply 3γC which is of course consistent with the result that
for a power-law spectrum $f(p) \propto p^{-q}$ of exponent q the corresponding adiabatic index is
q/3. Formally therefore one can construct a one-parameter family of exact solutions with
4 ≤ q ≤ 5. However, the result γC = q/3 only holds exactly for infinitely extended power
laws. If the spectrum starts at some finite initial momentum and (after a sufficiently long time)
extends to very high momenta γC will be very slightly closer to 4/3 than q/3 and the solutions
will gradually drift towards the limiting case of q = 4, γC = 4/3 and r = 7 as the relativistic
part of the spectrum becomes more and more dominant.
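The self-consistency of the constant-κ solution can be confirmed by inserting β = (γC + 1)/2 and the compression ratio from (7.15) into the asymptotic index (7.8). The short sketch below does this; the function names are introduced here for illustration and u1 is set to unity without loss of generality.

```python
def asymptotic_index(u1, u2, beta):
    """Spectral index q = -d ln f / d ln p at large momenta, from equation (7.8)."""
    return 3.0 * u1 / (u1 - u2) * (1.0 + u2 / (beta * (u1 - u2)))

def self_consistent_index(gamma_c):
    """Index for the self-consistent shock structure (7.14), (7.15)."""
    u1 = 1.0
    u2 = (gamma_c - 1.0) / (gamma_c + 1.0) * u1   # (7.15)
    beta = (gamma_c + 1.0) / 2.0                  # from comparing (7.7) with (7.14)
    return asymptotic_index(u1, u2, beta)

if __name__ == "__main__":
    for gamma_c in (4.0 / 3.0, 1.5, 5.0 / 3.0):
        print(gamma_c, self_consistent_index(gamma_c), 3.0 * gamma_c)
```

The two printed columns agree, i.e. q = 3γC, and the limiting cases γC = 4/3 and γC = 5/3 give q = 4 and q = 5 respectively.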
To conclude this section we note that further nonlinear studies of the case of momentum
independent κ have been performed recently by Toptygin [118]. One separable analytical
solution of this equation for a special form of the function κ(x, p) (allowing the separation of
variables) and in a prescribed flow profile was found earlier in [119].

7.2. An exact solution for arbitrary κ(p)


While the nonlinear broadening of the shock transition softens the spectrum, the decrease of
the specific heat ratio γC , caused by acceleration, hardens it. These two factors are present
in the exact solution discussed above. Remarkably, for strong shocks, the net effect of these
two factors is to keep the CR pressure marginally convergent as in the simple (linear) test-
particle theory with γ = 5/3, M → ∞. As the story unfolds, we encounter further surprising
coincidences with test-particle theory.
The next step towards a more realistic treatment that includes the momentum-dependent
κ(p), encounters difficulties. First, the escape length of particles upstream is now momentum
dependent, κ(p)/u, which allows only the particles with highest momenta to sample the total
flow compression. Thus, the scale invariance seems broken and a power-law spectrum is no
longer to be expected. Furthermore, since the total compression should be >4, the highest
energy particles can make a pressure divergent contribution, at least if one naively estimates
their spectral slope using the test-particle formula (3.3). Therefore, a finite cut-off momentum
is required in a steady state and the energy escape discussed in section 6.3 is likely to drive
compression to even higher values which, in turn, should harden the spectrum and further
boost the escape. Eichler [120] called this regime ‘runaway acceleration’ and as we shall see
it is this acceleration regime that operates when both the injection rate and cut-off momentum
are sufficiently high.
Due to the above mentioned positive feedback, none of the terms in equations (7.1)–(7.3)
is small when acceleration is efficient. As in the previous section, an exact solution seems
necessary to find the net effect of the above discussed oppositely acting factors. A complete
exact solution is perhaps impossible to find. What can be found is a solution which tends to the
exact one if the system parameters tend to their extremes, remaining physically quite realistic.
That is, if the maximum energy is very high and, hence, the shock transition is very broad the
solution well inside the shock transition has a self-similar form which can be found exactly.
To obtain it we note that in the downstream medium, x < 0, the solution has the same
form as in the test-particle case g = G(p) ≡ g(p, x = 0). Introducing the flow potential φ,
such that u = dφ/dx, we seek the solution of equation (7.1) upstream in the form [103, 121]
\[
g = g_0(p)\exp\left[-\frac{1 + B}{\kappa(p)}\,\phi(x)\right], \qquad x > 0 \tag{7.16}
\]
where g0 is unknown and
\[
B(p) \equiv -\frac{1}{3}\frac{d\ln g_0}{d\ln p}. \tag{7.17}
\]
The B term in the exponent deserves a comment. Without it (7.16) trivially balances
the advection and diffusion terms on the left-hand side of (7.1) which would be a good
approximation well outside the shock transition where du/dx → 0. Note that it is also a
less important region since there are only a few particles there. With the B term (7.16) exactly
balances all the three terms inside the shock transition where they all are of the same order and,
what is particularly important as we will see, for a physically relevant self-consistent velocity
field u(x). To find this solution, we consider u as u(φ) and, substituting (7.16) in (7.1), after
separation of variables, we obtain
\[
\frac{du}{d\phi} = \lambda\frac{u}{\phi} \tag{7.18}
\]
\[
p\frac{dB}{dp} = (1 + B)\left(\frac{d\ln\kappa}{d\ln p} - \frac{3}{\lambda}B\right) \tag{7.19}
\]
where λ is a separation constant. It is important to realize that the full solution (7.16)
is essentially nonseparable; its spatial and momentum scales are strongly coupled.
Equation (7.18) may be readily integrated and yields for the flow potential
\[
\phi(x) = \phi_0^{-\lambda/(1-\lambda)}\,[(1 - \lambda)u_0 x + \phi_0]^{1/(1-\lambda)} \tag{7.20}
\]
where φ0 = φ(0) is another constant (it may be determined from the comparison of (7.20)
and (7.3)). It is straightforward to verify that the following expression is the first integral of
system (7.17), (7.19):
\[
g_0(p)\,\kappa^{\lambda}(1 + B)^{-\lambda} = \text{const.} \tag{7.21}
\]
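Denoting κ0 ≡ κ(p0) and B0 ≡ B(p0), for g0 we finally have
\[
g_0(p) = g_0(p_0)\left(\frac{p}{p_0}\right)^{3}
 \left[1 + 3\,\frac{B_0 + 1}{\lambda\kappa_0}\,p_0^{-3/\lambda}\int_{p_0}^{p}\kappa(p')\,p'^{\,3/\lambda - 1}\,dp'\right]^{-\lambda}. \tag{7.22}
\]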

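For p ≫ p0, more precisely for $(\kappa/\kappa_0)(p/p_0)^{3/\lambda} \gg 1$, the spectral slope is determined
merely by κ(p) and λ, i.e., it ‘forgets’ its behaviour at p ≃ p0:
\[
g_0(p) \propto \kappa^{-\lambda}(p) \quad\text{and}\quad B \simeq \frac{\lambda}{3}\frac{d\ln\kappa}{d\ln p}. \tag{7.23}
\]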
As we shall see, the parameter λ depends on the scaling of κ(p) as well, and the most surprising
consequence of this dependence is that the resulting slope of g0 (p) is, in fact, independent
of κ(p).
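The first integral (7.21) and the asymptotic behaviour (7.23) can be checked by integrating (7.17) and (7.19) numerically. The sketch below is an illustration under stated assumptions: a Bohm-type κ(p) ∝ p²/√(1 + p²) with momentum in units of mc, a fixed trial value of λ, an arbitrary starting slope B0, and a hand-written fourth-order Runge–Kutta step in ln p. None of the parameter values are taken from the source.

```python
import math

def dlnk_dlnp(p):
    """Logarithmic slope of kappa = K p^2 / sqrt(1 + p^2): 2 (nonrelativistic) to 1 (relativistic)."""
    return 2.0 - p * p / (1.0 + p * p)

def ln_kappa(p, K=1.0):
    return math.log(K) + 2.0 * math.log(p) - 0.5 * math.log1p(p * p)

def rhs(t, B, lam):
    """Right-hand sides in t = ln p: dB/dt from (7.19), d ln g0/dt from (7.17)."""
    p = math.exp(t)
    return (1.0 + B) * (dlnk_dlnp(p) - 3.0 * B / lam), -3.0 * B

def integrate(lam=0.5, B0=1.0, p0=1e-2, p1=1e3, n=20000):
    """Integrate from p0 to p1; return final B and the largest drift of the invariant (7.21)."""
    t = math.log(p0)
    h = (math.log(p1) - t) / n
    B, ln_g = B0, 0.0

    def invariant(t, B, ln_g):
        # ln of g0 * kappa^lam * (1 + B)^(-lam)
        return ln_g + lam * ln_kappa(math.exp(t)) - lam * math.log1p(B)

    c0 = invariant(t, B, ln_g)
    drift = 0.0
    for _ in range(n):
        k1 = rhs(t, B, lam)
        k2 = rhs(t + h / 2, B + h / 2 * k1[0], lam)
        k3 = rhs(t + h / 2, B + h / 2 * k2[0], lam)
        k4 = rhs(t + h, B + h * k3[0], lam)
        B += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        ln_g += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        t += h
        drift = max(drift, abs(invariant(t, B, ln_g) - c0))
    return B, drift

if __name__ == "__main__":
    print(integrate())
```

The invariant stays constant to within the integration error, and at large momenta B approaches (λ/3) d ln κ/d ln p = λ/3 regardless of the starting value B0, which is the ‘forgetting’ of the injection region described in the text.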
The region p ≲ p0 cannot be described within the present approach, which produces two integration constants in the solution (7.22): the magnitude g0(p0) and the slope B(p0) ≡ B0 of the particle distribution. They serve as external parameters provided by the injection theory (section 5), which operates on a distribution function that is anisotropic at the shock front and to which equation (7.1) does not apply. However, a consistent asymptotic theory must also be able to obtain the parameter B0 within the present approach, to ensure smooth matching of the spectrum at p ∼ p0 (see section 7.5).
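
The approach of B(p) to the limiting slope in (7.23) can be illustrated numerically. The following is only a sketch under stated assumptions: it takes the Bohm-type diffusivity κ = Kp²/√(1 + p²), sets λ = 1/2, and uses arbitrary illustrative values for K, p0 and B0 (the names `LAM`, `K`, `P0`, `B0` are ours, not the paper's). It relies on the fact that (7.19) is linear in 1/(1 + B), which gives κ/(1 + B) = p^(−3/λ) [κ0 p0^(3/λ)/(1 + B0) + (3/λ) ∫ κ(p′) p′^(3/λ−1) dp′], i.e. the bracket that appears in (7.22).

```python
import math

LAM = 0.5   # separation constant lambda (value assumed for illustration)
K = 1.0     # reference diffusivity, arbitrary units (assumed)
P0 = 1e-3   # injection momentum in units of mc (assumed)
B0 = 1.0    # slope at p0, normally supplied by injection theory (assumed)


def kappa(p):
    """Bohm-type diffusivity, kappa = K p^2 / sqrt(1 + p^2)."""
    return K * p * p / math.sqrt(1.0 + p * p)


def weighted_integral(p, n=20000):
    """Simpson estimate of the integral of kappa(p') p'^(3/lambda - 1) dp'
    from P0 to p, evaluated in the variable t = ln p' (n must be even)."""
    a, b = math.log(P0), math.log(p)
    h = (b - a) / n

    def f(t):
        q = math.exp(t)
        return kappa(q) * q ** (3.0 / LAM)  # extra factor q comes from dp' = q dt

    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def slope_B(p):
    """B(p) from the closed-form solution of (7.19) with B(P0) = B0."""
    c0 = kappa(P0) * P0 ** (3.0 / LAM) / (1.0 + B0)
    denom = c0 + (3.0 / LAM) * weighted_integral(p)
    return kappa(p) * p ** (3.0 / LAM) / denom - 1.0


if __name__ == "__main__":
    for p in (P0, 1e-2, 1.0, 1e2, 1e4):
        print(f"p = {p:8.1e}   B = {slope_B(p):.4f}")
```

Under these assumptions B starts at B0, relaxes to about 2λ/3 in the nonrelativistic range and to about λ/3 in the relativistic range, in line with (7.23).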

7.3. Adjusting the flow


We have obtained a one-parameter (λ) family of exact solutions to equation (7.1) for the special flow profiles u(φ). One parameter is, generally speaking, not enough to satisfy the functional relation (7.3). Remarkably, it can be done in most of the shock transition, so that the remaining inconsistency at its periphery poses a technical rather than a fundamental problem, which will be resolved in section 7.5. To demonstrate this we substitute (7.16) into (7.3), (7.4). Using equation (7.2), condition (7.3) can be rewritten as
\[ u(\phi) + \mu\int_{s_1}^{s_0}\frac{ds}{B(s)}\,s^{\lambda-1}e^{-\phi s}\,\frac{p^2(s)}{\sqrt{1+p^2(s)}} = u_1. \tag{7.24} \]
We have introduced a new variable s in place of p
s = (1 + B)/κ (7.25)
and the limits $s_{0,1} = s(p_{0,1})$. We have also used the first integral (7.21), $g_0 \propto s^{\lambda}$. The parameter $\mu = (\lambda/3p_0)\,\nu u_1 s_0^{-\lambda}$, where the injection rate ν is defined as
\[ \nu = \frac{4\pi}{3}\,\frac{mc^2}{\rho_1 u_1^2}\,p_0\,g_0(p_0) \tag{7.26} \]
and the function p(s) in equation (7.24) should be determined from equation (7.25). As we
argued in section 6.2 the most plausible κ(p) dependence is that of a Bohm-type
\[ \kappa(p) = Kp^2(1+p^2)^{-1/2}, \tag{7.27} \]
i.e., the mean free path of a particle is proportional to its Larmor radius (here K is a reference diffusivity). Then equation (7.24) becomes
\[ u(\phi) = u_1 - \frac{\mu}{K}\int_{s_1}^{s_0}\left[1 + \frac{1}{B(s)}\right]s^{\lambda-2}e^{-\phi s}\,ds. \tag{7.28} \]
According to equations (7.23), (7.25), B(s) is a very simple function, taking in most of its domain nearly constant and relatively close values, B ≃ λ/3 for Ks ≪ 1 and B ≃ 2λ/3 for Ks ≫ 1. It varies monotonically between these limiting values where Ks ∼ 1. Differentiating equation (7.28) with respect to φ, assuming 0 < λ < 1 and considering first the region 1/s0 ≪ φ ≪ 1/s1, we may replace the lower limit by zero and the upper one by infinity. From equation (7.28) we then obtain
\[ \frac{du}{d\phi} \simeq \frac{\mu}{K\phi^{\lambda}}\int_0^{\infty}\left[1 + \frac{1}{B(\tau/\phi)}\right]\tau^{\lambda-1}e^{-\tau}\,d\tau = \frac{\mu\,\Gamma(\lambda)}{K\phi^{\lambda}}\left[1 + \frac{1}{B(\bar\tau/\phi)}\right] \tag{7.29} \]
where Γ is the gamma function and B(τ/φ) is replaced in the last integral by its mean value at τ̄/φ with τ̄ ∼ 1. As we have already seen, the function B(τ̄/φ) varies slowly and is close to λ/3 for φ > K and to 2λ/3 for φ < K. Therefore, the φ dependence of du/dφ is determined by the factor $\phi^{-\lambda}$ and is indeed consistent with equation (7.18), i.e. with $u \propto \phi^{\lambda}$, provided that λ = 1/2. Equation (7.29) becomes invalid for φ ≳ 1/s1, since the lower limit in equation (7.28) cannot be replaced by zero in this case and the function du/dφ cuts off (see equation (7.28)). In section 7.5 we present a modification of the above solution which describes the entire shock structure on a universal basis.
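
The scaling behind (7.29) can be checked with a few lines of code. The sketch below treats B as constant, so that only the integral of s^(λ−1) e^(−φs) between s1 and s0 matters, and compares it with Γ(λ)φ^(−λ) in the intermediate range 1/s0 ≪ φ ≪ 1/s1. The limits `S1`, `S0` and the value λ = 1/2 are illustrative assumptions, not values taken from the text.

```python
import math

LAM = 0.5            # separation constant lambda (assumed)
S1, S0 = 1e-8, 1e8   # integration limits with s1 << s0 (illustrative)


def du_dphi_shape(phi, n=20000):
    """Simpson estimate of the integral of s^(lambda-1) exp(-phi s) ds
    from S1 to S0, evaluated in the variable t = ln s (n must be even)."""
    a, b = math.log(S1), math.log(S0)
    h = (b - a) / n

    def f(t):
        s = math.exp(t)
        return s ** LAM * math.exp(-phi * s)  # s^(lambda-1) ds = s^lambda dt

    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def local_exponent(phi, factor=2.0):
    """Logarithmic slope d ln(integral) / d ln(phi) from a centred difference."""
    lo = du_dphi_shape(phi / factor)
    hi = du_dphi_shape(phi * factor)
    return math.log(hi / lo) / math.log(factor * factor)


if __name__ == "__main__":
    phi = 1.0
    print("integral / (Gamma(lam) phi^-lam):",
          du_dphi_shape(phi) / (math.gamma(LAM) * phi ** -LAM))
    print("local exponent:", local_exponent(phi))
```

With du/dφ ∝ φ^(−λ) the profile is u ∝ φ^(1−λ), which agrees with u ∝ φ^λ from (7.18) only for λ = 1/2.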

7.4. Asymptotic universality of acceleration in strong shocks


The downstream particle spectrum given by equation (7.21) with B from equation (7.23), when expressed in terms of kinetic energy E rather than momentum, exhibits a fairly uniform behaviour throughout the entire energy range, relativistic and nonrelativistic. In a standard normalization F(E) dE this spectrum is close to $E^{-1.5}$ except for the injection energy (if rs < 4), the cut-off energy and the region E ∼ mc². It is natural to assume, however, that this 1.5 index is not universal but depends on the CR diffusivity κ(p), which we have specified as $\kappa \propto p^2/\sqrt{1+p^2}$. To examine this idea we replace κ by $\kappa' = \kappa^{\alpha}$ so that, unless α = 1, the spectral index 1.5 should change. Now the spectral slope B, which as we have shown may be written as B = (λ/3) d ln κ/d ln p, should be replaced by $B' = (\alpha\lambda'/3)\,d\ln\kappa/d\ln p$, where the new index λ′ is to be determined (see equation (7.23)). Recalculating du/dφ in equation (7.29) with this rescaled spectrum and CR diffusivity κ′ we obtain $du/d\phi \propto \phi^{1/\alpha-1-\lambda'}$. Since the formula $u \propto \phi^{\lambda'}$ (equation (7.18)) holds, we deduce for our new λ′ that $\lambda' = 1/2\alpha \equiv \lambda/\alpha$. Consequently, the spectral slope remains unchanged, B′ = B.
This surprising result means that the asymptotic spectral form is insensitive to the spectrum
of the underlying MHD turbulence (since the latter prescribes κ(p)). On the other hand, the
velocity profile does depend on κ(p). From the above analysis we obtain $u \propto x^{1/(2\alpha-1)}$, which also imposes the condition α > 1/2 on this solution. Remarkably, precisely the opposite condition α < 1/2 is required to produce a steady velocity jump without momentum cut-off and injection but with a secularly broadening CR precursor. This was demonstrated in section 4.5 of part I. It is perhaps more than a coincidence that the strictly stationary solution, which is essentially based on particle injection at p = p0 and energy escape through the upper cut-off, appears immediately beyond α = 1/2. Also, the time-saving numerical solutions with α ≲ 1/2 might be insufficient for modelling the more realistic case of α = 1.
Summarizing these results, when κ(p) rescales, so does the flow profile u(x) but B remains
invariant. In fact, it is not difficult to understand why this is so. As usual in the Fermi process
the spectral slope of course depends on the flow compression. But, since the flow is modified,
a particle with momentum p, bound diffusively to the shock front, samples not the total
compression but only the compression accessible to it. The latter is determined by the relation $\phi(x) \propto \kappa^{\alpha}(p)$ (equation (7.16)). As we have shown, $u(\phi) \propto \phi^{1/2\alpha}$. Therefore, the flow compression u/u2, as seen by this particle, scales as $u/u_2 \propto \phi^{1/2\alpha} \propto \sqrt{\kappa(p)}$. As this is independent of α the index B must also be. Remarkably, this universal power-law index of
1.5 coincides in a nonrelativistic region with the plain test-particle result for a strong shock
if it were to propagate in a purely thermal nonrelativistic gas, whereas in a relativistic region
it precisely coincides with that for a relativistic gas. This is true even when the flow profile
is strongly modified and the total compression ratio is much higher than the values of 4 or
7 occurring in nonrelativistic and relativistic gases, respectively. Moreover, the subshock
compression ratio rs may be significantly lower than 4 and cannot account for 1.5 spectrum
in the nonrelativistic energy range. In fact nonrelativistic and relativistic particles create their
own portions of the shock transition which they sample (see footnote 4). These observations demonstrate that comparisons of the strongly nonlinear (u1/u0 ≫ 1) particle spectra with test-particle formulae
are misleading, even though a formal coincidence is possible. It should be clear from the above
scaling analysis and from the form of the solution (7.16) that a particle with a momentum p
noticeably lower than p1 cannot reach the remote parts of the shock structure and thus does
not ‘see’ the total compression r. Moreover, even the slope at p = p1 , which one may naively
estimate from the total compression r (since these particles do see the total compression), does
not depend, in fact, on r or rs (see [103] for an explicit formula, and plots in the next section).
The reason is that the spectral index, besides the flow compression accessible to a particle with given momentum p, depends also on the length of that part of the shock structure which is sampled by this particle. As we have seen, these factors compensate each other so as to make the index independent of both. Similarly, as follows from equation (7.22), for particles with p ≫ p0 the index does not depend on rs.
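
The scaling argument of this section can be written out compactly. The block below is a worked sketch rather than part of the original derivation; it assumes f = p^{-3} g with g_0 \propto p^{-3B} locally, and the standard relation F(E)\,dE \propto p^2 f\,dp between the energy and momentum distributions. Primes denote the rescaled quantities.

```latex
% Worked sketch of the scaling argument (primes denote rescaled quantities)
\begin{align*}
\kappa' = \kappa^{\alpha}: \quad
  & u \propto \phi^{\lambda'} \;\Rightarrow\; \frac{du}{d\phi} \propto \phi^{\lambda'-1},
    \qquad\text{while (7.29) gives}\quad \frac{du}{d\phi} \propto \phi^{1/\alpha-1-\lambda'} \\
  & \Rightarrow\; \lambda' - 1 = \frac{1}{\alpha} - 1 - \lambda'
    \;\Rightarrow\; \lambda' = \frac{1}{2\alpha}, \\
  & B' = \frac{\alpha\lambda'}{3}\,\frac{d\ln\kappa}{d\ln p}
       = \frac{1}{6}\,\frac{d\ln\kappa}{d\ln p} = B . \\[4pt]
\text{Nonrelativistic } (\kappa \propto p^{2},\ E \propto p^{2}): \quad
  & B = \tfrac{1}{3},\quad f \propto p^{-3-3B} = p^{-4},\quad
    F(E) \propto p^{2} f\,\frac{dp}{dE} \propto p^{-3} \propto E^{-3/2}, \\
\text{Relativistic } (\kappa \propto p,\ E \propto p): \quad
  & B = \tfrac{1}{6},\quad f \propto p^{-3.5},\quad
    F(E) \propto p^{2} f\,\frac{dp}{dE} \propto p^{-3/2} \propto E^{-3/2}.
\end{align*}
```

Both limits give the same energy index 1.5, which is why the spectrum looks uniform when plotted against kinetic energy.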

7.5. The method of integral equation


What we obtained so far in the case of momentum-dependent diffusivity κ is a test-particle
solution on the one hand and a strongly nonlinear asymptotic solution given in sections 7.2–7.4
on the other. These two have different particle spectra and flow structures so that it is not clear
whether they are connected to each other in any relevant parameter space. If they are, the
theory should be able to describe them on a universal basis as two limits of the same global
solution when a governing parameter (such as the injection rate ν) takes appropriate values.

Footnote 4. Interestingly, the respective power-law indices can be obtained on the basis of their specific heat ratios γ (= 5/3 and 4/3) by formally expressing the spectral index directly through γ.

Evidently, it must tend to the test-particle solution as ν → 0. The key to obtaining the global
solution is the similarity of its general representation (7.16) in both the linear and nonlinear
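limits [103]. We start by rewriting it in a slightly different form
\[ g(x,p) = G(p)\exp\left[-\frac{1+\hat B}{\kappa}\,\Psi\right]. \tag{7.30} \]
Here
\[ \Psi = \phi - \phi_0 = \int_0^x u\,dx \quad\text{and}\quad \hat B = -\frac{1}{3}\,\frac{d\ln G}{d\ln p} \approx B. \tag{7.31} \]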
Once the functional dependence of the solution is suggested by (7.30), the next obvious
step is to generalize the test-particle procedure for obtaining the spectrum. That is, we
integrate equation (7.1) between 0− and +∞, cf [122]. A decisive step now is to use the
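substitution (7.30) with the unknown G and Ψ(x). The result reads (details can be found in [103])
\[ -\frac{1}{3}\,\frac{\partial\ln G}{\partial\ln p} = \frac{1}{\bar V}\left(u_2 + \frac{1}{3}\,\frac{\partial\bar V}{\partial\ln p}\right). \tag{7.32} \]
Here the function V̄ is the following integral transform of the flow profile u(Ψ):
\[ \bar V(p) = \int_{0-}^{\infty} e^{-\hat s(p)\Psi}\,du(\Psi) \tag{7.33} \]
with
\[ \hat s(p) = \frac{1}{\kappa(p)\bar V(p)}\left[u_2 + \bar V(p) + \frac{1}{3}\,\frac{\partial\bar V}{\partial\ln p}\right]. \tag{7.34} \]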
The function V̄(p) (which we term the spectral function) explicitly reflects the degree of shock modification. Put another way, a similar function û(p) ≡ V̄(p) + u2 is an effective flow velocity upstream that depends not on x but on momentum p in such a way that a particle with momentum p on average escapes upstream to a distance x at which it sees the flow speed û(p), that is u(x) = û(p). For example, in an unmodified shock (u0 = u1), V̄(p) ≡ Δu ≡ u0 − u2, since then du/dx = 0 in the upstream region and û(p) ≡ u1 ≡ u0; the spectral index q̂ = −d ln G/d ln p is just the conventional q̂ = 3u2/(u1 − u2) = q − 3 (see equation (7.32)). In general Δu ≤ V̄(p) ≤ u1 − u2 and V̄(p) → u1 − u2 as p → ∞. Even if the shock is appreciably modified, one may show that at small p ≳ p0, V̄(p) ≃ Δu. The spectral index then corresponds simply to the subshock compression ratio, and at lower momenta we have
\[ \hat q \simeq \hat q_0 = \frac{3u_2}{u_0 - u_2}, \qquad G(p) = Q_{\rm inj}\left(\frac{p}{p_0}\right)^{-\hat q_0}. \tag{7.35} \]
The injection solution described in section 5 produces essentially the same asymptotic result for p ≳ p0, thus yielding the injection rate Qinj. Thus, once V̄(p) is found, both the flow profile and the complete particle distribution (including the connection with its thermal part) can be determined by inverting transform (7.33) and integrating equation (7.32). Now, using the linearity of equation (7.3) (ρu = const), we derive the integral equation for V̄ by applying the transformation (7.33) to the x-derivative of equation (7.3) [103]. The result reads
\[ \bar V(p) = \frac{\nu u_1}{p_0}\int_{p_0}^{p_1}\frac{p'\,dp'}{\sqrt{1+p'^2}}\,\frac{\hat s(p')}{\hat s(p') + \hat s(p)}\,\frac{\bar V(p_0)}{\bar V(p')}\exp\left[-3u_2\int_{p_0}^{p'}\frac{d\ln p''}{\bar V(p'')}\right] + \Delta u. \tag{7.36} \]

The injection rate ν (defined as in (7.26) with G instead of g0) may be given by the injection solution (section 5). The two constants u0 and u2 are, however, unknown (we will use the pre-compression R ≡ u1/u0 and the subshock compression rs ≡ u0/u2 instead, along with the total compression r = u1/u2 ≡ rs R). Therefore we need two further algebraic equations for
them, one of which is already available. This is the RH relation for the subshock, rs = rs (R, M)
given by (7.5), (7.6). The second equation is for R(ν) which may be easily obtained by writing
equation (7.3) for x = Ψ = 0+. For convenience, we use ν = ν(R) since the function R(ν) is not always single-valued. Introducing also a dimensionless spectral function U(t) = V̄(t)/u1 with t = ln p, $t_{0,1} = \ln p_{0,1}$, all three equations can be written as
\[ U(t) = \frac{1}{R}\left(1 - \frac{1}{r_s}\right) + \frac{\nu}{Kp_0}\int_{t_0}^{t_1}dt'\left[\frac{1}{\kappa(t')} + \frac{1+\hat B(t)}{\kappa(t)\,[1+\hat B(t')]}\right]^{-1}\frac{U(t_0)}{U(t')}\exp\left[-\frac{3}{Rr_s}\int_{t_0}^{t'}\frac{dt''}{U(t'')}\right] \tag{7.37} \]
\[ r_s = \frac{4}{1 + 3M^{-2}R^{8/3}} \tag{7.38} \]
\[ \nu = Kp_0\left(1 - R^{-1}\right)\left\{\int_{t_0}^{t_1}\kappa(t)\,dt\,\frac{U(t_0)}{U(t)}\exp\left[-\frac{3}{Rr_s}\int_{t_0}^{t}\frac{dt'}{U(t')}\right]\right\}^{-1}. \tag{7.39} \]
The spectral index B̂(t) is related to U(t) as follows (see (7.32)):
\[ \hat B(t) = \frac{1}{3}\,\frac{d\ln U}{dt} + \frac{1}{r_sR}\,\frac{1}{U(t)}. \tag{7.40} \]
We have also set $\gamma_G = 5/3$ and assumed that the plasma in the CR precursor is heated adiabatically, i.e., $M_0^{-2} = M^{-2}R^{8/3}$.
The approximate solution to system (7.37)–(7.40) was obtained analytically in [103,123].
Here, we present the results of its numerical integration for κ(p) defined in (7.27). It is
convenient to discuss the parameter dependence of the flow profile first and then to turn to the
particle spectrum that is consistent with it.

7.5.1. The flow structure. The flow profile u(x) in the precursor is a featureless monotonic
function (section 7.2) so that it is sufficient to trace only two parameters—the precursor
(R = u1 /u0 ) and the subshock (rs = u0 /u2 ) compressions. In the simplest case of adiabatic
precursor heating rs is related to R by (7.38). The dependence of the pre-compression R on the
injection rate ν is demonstrated in figure 3 for various Mach numbers and maximum momenta
p1 (we show ν(R) instead, for convenience). One striking aspect of this solution is a very
sensitive, even nonunique, dependence R(ν). This is at least in qualitative agreement with the
TFM predictions. As the Mach number M and cut-off momentum p1 grow, the monotonic, test-particle-like dependence R(ν) bifurcates so that there appear three different shock solutions with different compressions R (and thus rs) for a given injection rate ν ∈ (ν1, ν2), where $\nu_{1,2}$ are the two local extrema of ν(R) at $R = R_{1,2}$ (see also figure 7 below for illustration). A substantial subshock reduction, however, occurs only when R approaches $M^{3/4}$ (see (7.38)). On the other hand, it can be shown (see [103, 123]) that when $R \to M^{3/4}$ (and thus rs → 1) the function ν(R) in (7.39) diverges. In other words rs never crosses unity (but may approach it, if the injection rate ν tends to infinity) so that at least a weak subshock must remain. This is what is actually observed in numerous computer simulations (e.g., [27]) but seems to be in conflict with the TFM. We explain this discrepancy later.
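
The subshock relation (7.38) is simple enough to tabulate directly. The sketch below is only an illustration for an adiabatically heated precursor; the function names are ours, and the sample values of M and R are arbitrary.

```python
def subshock_compression(R, M):
    """Subshock compression r_s from (7.38): adiabatic precursor heating,
    gas adiabatic index 5/3, far-upstream Mach number M, pre-compression R."""
    return 4.0 / (1.0 + 3.0 * R ** (8.0 / 3.0) / M ** 2)


def critical_precompression(M):
    """Pre-compression R = M^(3/4) at which (7.38) gives r_s = 1."""
    return M ** 0.75


if __name__ == "__main__":
    M = 150.0  # illustrative Mach number
    for R in (1.0, 5.0, 20.0, critical_precompression(M)):
        r_s = subshock_compression(R, M)
        print(f"R = {R:8.3f}   r_s = {r_s:6.3f}   r = r_s R = {r_s * R:8.3f}")
```

The table makes the point of the paragraph above explicit: r_s stays close to 4 until R becomes comparable to M^(3/4), and it reaches unity only at R = M^(3/4).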
Figure 3. The nonlinear response of an accelerating shock (characterized by the precursor compression R) to the thermal injection ν, given in the form of the function ν(R); both panels plot the injection rate ν against the precursor compression R. (a) The function ν(R) calculated for the fixed injection and cut-off momenta $p_0 = 10^{-3}$ and $p_1 = 10^5$ and for different Mach numbers M = 50; 80; 150; $10^5$. (b) The same as in (a) but for the fixed $p_0 = 10^{-3}$ and M = 150 for different $p_1$ = 100; $10^3$; $10^4$; $10^5$.

Another interesting issue that may be resolved on the basis of the ν(R) dependence (7.39) concerns the possibility of shocks with r ≡ rs R = ∞, which is frequently claimed to be the case in the literature (e.g., [124]) and seems to be indeed possible from equation (6.33). This problem was addressed, e.g. in [125], where the scaling $r \sim M^{3/4}$ was studied analytically and numerically (see footnote 5). If this scaling was correct for all M, the flow compression would be infinite at
M = ∞. The analytic solution of (7.37)–(7.39) shows that r(M) indeed grows as $M^{3/4}$ as long as M is limited by $1 \ll M < (\nu p_1/p_0)^{4/3}$, but it saturates at the level r ∼ νp1/p0 for $M > (\nu p_1/p_0)^{4/3}$ (see [103, 123] for a complete analytic solution). The difference arises simply because the injection rate ν itself in [125], as well as in the Monte Carlo studies [126, 128], scales as r (ν ∝ r), being normalized to the downstream (or subshock) plasma density n2, which scales as n2 ∝ r. The injection rate ν in the analytic solution was normalized to the far upstream (r-independent) plasma density, which formally complies with the usual practice of shock studies. If we set ν ∝ r, the saturation requirement $M > (\nu p_1/p_0)^{4/3}$ cannot be met and both approaches agree upon the $M^{3/4}$ law. On the other hand, we then hardly answer the question about the possibility of reaching r = ∞ at M = ∞ since the number density of CRs produced in this case is also infinite (ν ∝ r) and the flow stagnation simply results from this fact. Nevertheless, it would not be physically impossible for the accelerated CRs to balance the upstream ram pressure $\rho_1 u_1^2$ and stop the flow completely even if their density $n_{\rm CR}$ (injection rate ν) were finite. This is a delicate question, particularly because in the kinetic description (in contrast to the TFM) the CR density and pressure do not decouple, even if the density integral (injection rate ν) is determined by the low-energy part of the spectrum whereas the pressure comes from the high-energy end (when the acceleration is efficient). The aforementioned saturation of r(M) at the level ∼νp1/p0 for M → ∞ rules out the infinite compression for finite νp1/p0. Note that the numerical studies have not spanned enough of the parameter space to detect the saturation regime. What seems to be a more important task, however, is to determine the physically realistic behaviour of the injection rate ν(R) = νs(R) that is momentarily established at the subshock while the flow compression r increases in the course of acceleration. The intersection(s) of the curve νs(R) with the respective (for given M and p1) curve in figure 3 will give steady-state solution(s) without parametrization.

Footnote 5. See also [126,127] where this scaling was first obtained, roughly on the ground that the compression ‘felt’ by 1 GeV particles is fixed. One can also fix rs in equation (7.38) and obtain the $M^{3/4}$ law.

At first glance the assumption νs ∝ r seems to be well grounded since the injection, as we
discussed earlier, occurs at the subshock and its rate should be proportional to the local density,
quasi-independent of the global shock structure. There are, however, many other, oppositely
acting factors. Recently, an ab initio numerical solution of a diffusion–convection system
with the underlying gas dynamics complemented by a self-consistent injection model [75]
provided a clue to the possible behaviour of νs(r). The authors of [75] show that the injection, being fed by the high-energy tail of the Maxwellian distribution, is very sensitive to its temperature which, in turn, drops while the injected particles carry its energy away. This self-regulation of the injection, along with the change in the particle trapping properties of the compressed MHD waves, seems to compensate for the density compression, making the injection rate remarkably constant or even a slightly decreasing function of the flow compression r. Due to the computational complexity of this model only relatively low values of the maximum momentum pmax ≲ 1 and compression r ≈ 5 have been reached so far, which is clearly in the subcritical, test-particle regime in terms of the diagrams in figure 3.

7.5.2. Two-fluid limit of kinetic theory. There are two major aspects of the TFM that have
been debated and criticized in the literature mostly because of their contradiction to the kinetic
numerical results (e.g., [27]). These are the efficient acceleration solution with no injection
(ν = 0) and the complete smoothing of the subshock rs ≡ 1 for M ≳ 10. Having obtained
the analytic kinetic solution, it is a simple task to resolve these ‘paradoxes’.
It is sufficient to trace the deformation of the bifurcation curve ν(R) under the transition to
the two-fluid limit p1 → ∞. What essentially happens [123] is that the curve ν(R) approaches
the abscissa in the region $1 \le R < M^{3/4}$, preserving, however, its singularity at $R = M^{3/4}$. If we now let the actual injection rate approach zero from above, R will tend to some R = R0 that lies between R1 and $M^{3/4}$ instead of R = 1. Note that the magnitude of R0 may be rather large whereas in the TFM R0 ≤ 7, which is only because of the complete subshock smoothing (see below) and the formal neglect of particle losses at p = p1 in the TFM (Qesc = 0, see (6.33)). Thus, there is nothing surprising in this residual flow modification as ν → 0 in the two-fluid description. It stems from the order of limiting transitions p1 → ∞, ν → 0 and in our kinetic description it yields R → R0(M) ≫ 1 for M ≫ 1. On inverting the sequence,
i.e. letting ν → 0, p1 → ∞ we clearly get R → 1, but this cannot be done within the TFM
since the transition p1 → ∞ has already been performed during the derivation of the two-fluid
equations. The sensitivity to the order of the limiting transitions is a signature of the singular
dependence of the kinetic solution upon p1 at p1 = ∞. This phenomenon is generally well
known in the theory of singular perturbations (e.g., [129]) and it is not specific to the TFM.
Let us look now into the issue of the subshock smoothing. As the analytic solution [103] shows, the subshock may indeed be reduced significantly and its strength for sufficiently large M depends on only one parameter, $\Lambda \equiv \nu p_1/(p_0 M^{3/4})$. The subshock strength, when it is weak, scales as rs − 1 ∼ 1/Λ, Λ ≫ 1. Since the TFM implies that p1 = ∞, there is again nothing strange in its producing a completely smooth solution rs = 1. One should note that the limit p1 → ∞ is clearly a difficult one for numerical simulations and they were unable to identify the critical system parameter Λ. This explains the discrepancies.

7.5.3. The spectra. As seen from figure 3 a shock may develop vastly different structures for
very close or even the same injection rate ν. Therefore, the respective particle spectra should
also be different. These are shown in figure 4 for a few pairs (ν, R) on one of the curves ν(R)
given in figure 3. The first thing to notice is that at the low-energy end p ≃ p0, the spectral slope q = −∂ ln f/∂ ln p indeed obeys the subshock condition, i.e., q(p0) ≈ 3rs/(rs − 1).
But, for larger p and in the case of not too high Mach numbers, spectra do not stick to any
constant power-law value, contrary to what was claimed in section 7.2 (see also figure 5).
The agreement improves as M increases. This is seen in figure 5 where the solution indeed
approaches the universal form predicted in section 7.2. Note, that the situation here is very
different from what we had in the test-particle solution. While in the latter case there was practically no difference between a shock of, say, M = 10 and that of M = ∞ as regards the particle spectrum, in the nonlinear treatment the spectra produced by a shock of M = ∞ differ strongly even from the case of $M \sim 10^3$. In other words the transition to the universal
spectrum of section 7.2 with M is very slow and depends on the exact position of the shock
on the bifurcation curve ν(R). At really high Mach numbers the spectrum perfectly coincides
with the analytic solution described in section 7.2.
As may be seen from figures 4 and 5, the test-particle spectrum of section 3, relevant for low R ≈ 1, and the scale-invariant spectrum of section 7.2, expected at very high Mach numbers and R ≫ 1, both exhibit remarkable universality in the sense that they are fairly independent of any physical parameters (except, of course, for the low Mach number region where the test-particle
solution depends on M through the shock compression). These two solutions may be thought
of as extreme (in fact opposite with respect to the relevant asymptotic analysis) cases of the
same solution in an extended parameter space formed by M, ν, p0 , and p1 . In this parameter
space they occupy only small regions, perhaps of limited observational interest; in the first
one the injection rate ν is unrealistically small (at least for sufficiently high p1 ), the second
requires extremely high Mach numbers. The region between them is perhaps physically the
most interesting one. Needless to say, however, no theory can be considered complete if it does not recover the limiting cases regardless of their observational significance.

7.6. Quasi-phenomenological and numerical studies of strongly nonlinear acceleration


Many crucial results on nonlinear particle spectra were obtained numerically. A comprehensive
review is given in [27] although with an emphasis on the Monte Carlo steady-state simulations.
More recent time-dependent numerical solutions are presented in [75, 125, 130–132]. Steady-
state numerical solutions of the diffusion–convection equation were studied in [133].
There were considerable analytical efforts to retrieve some information about the particle
spectrum and shock compression without nonlinear solution of the diffusion–convection
equation (7.1). One straightforward result is due to Eichler [122] who integrated (7.1)
across the entire shock transition which led to an expression for the power-law index q(p)
through a functional of the unknown solution $f(x,p) = p^{-3}g(x,p)$ (integral over x). This relation bounds the postshock power-law index to the interval qt ≤ q(p) ≤ qs, where $q_{s,t} = 3r_{s,t}/(r_{s,t}-1)$ with rs = u0/u2 and rt = u1/u2, i.e., the subshock and total compression,
respectively. Unfortunately neither of these two quantities could be obtained in the nonlinear
case accurately without solving (7.1) for g(x, p). On a qualitative level, however, this result
made clear that at lower energies q must be close to qs whereas at higher energies the spectrum
should become harder as it is influenced by higher compression.

Figure 4. (a) Spectral indices q of the downstream particle distributions $f \propto p^{-q}$ as functions of particle momentum, log10(p/mc), in the log-normal format. Every curve corresponds to one particular point on the bifurcation curve ν(R) shown in (b). (b) The same bifurcation curve as in figure 3(b) for M = 150; $p_0 = 10^{-3}$; $p_1 = 10^5$, with marked points (1)–(4) corresponding to the flow compressions R = 1.2; 2.0; 10; 18 for which the spectra are drawn in (a).

In order to obtain more information from this approach Eichler [115, 122] and Ellison
and Eichler [128] used a Heaviside function in x variable with a momentum-dependent
jump position to approximate the solution of (7.1). Strictly speaking, this equation cannot
accommodate the step function in x (or any other solution varying on a scale smaller than
κ/u) because of the second-order derivative in (7.1). A matter of greater concern is that this approximation becomes increasingly inaccurate as κ grows with p, since the particle diffusion length κ(p)/u1 grows with it. Therefore, the most critical quantity, the escape CR flux Qesc (6.34), could be inferred only from the accompanying Monte Carlo simulations. Nevertheless, these studies shed new light on the complicated interrelations between injection and losses as well as their critical influence on the shock compression and particle spectrum. We believe that they delivered a qualitatively correct picture of this complicated process.
More recently, Berezhko [113] estimated the magnitude of the spectral index qm at the
maximum momentum p = pmax ≡ p1 by extending the perturbative approach (strictly valid for R − 1 ≪ 1) to a strongly nonlinear case (R ≫ 1). The result for qm still contained a fitting parameter k (equations (32)–(34) in [113]). However, assuming k = 1, the following expression is derived:
\[ q_m = 3.5 + \frac{3.5 - 0.5\,r_s}{2r - r_s - 1}. \tag{7.41} \]

Figure 5. Particle spectra (spectral index q versus log10(p/mc)) for shocks of different Mach numbers M and precursor compressions R: (1) M = 50, R = 12; (2) M = 150, R = 24.5; (3) M = 150, R = 18.5; (4) $M = 10^5$, $R = 5 \times 10^3$; (5) $M = 10^7$, $R = 1.5 \times 10^5$; (6) $M = 10^7$, $R = 5 \times 10^4$. The analytic solution (7.22) is shown by diamonds. The corrections needed for both the injection and cut-off regions as discussed in section 7.2 are not shown in this asymptotic solution.

This formula produces the correct test-particle slope qm ≈ 4 for r ≈ rs ≈ 4, as it should, which probably points to the proper choice of the constant k. Of course, it is an extremely appealing idea to relate the shock characteristics r and rs to the spectral index q since this is perhaps the best ‘observable’ and could thus serve as an indicator of acceleration efficiency, since the latter can indeed be expressed through r and rs. It is, however, not clear how far this expression can be extrapolated from the point r = rs = 4 (where it is strictly valid) towards larger r. For example, an asymptotic solution of (7.1), obtained specifically for r ≫ 1, predicts a rather different behaviour at p ≲ p1, namely $f \propto p^{-3.5}(1 + p/p_1)^{1/2}$ [103], yielding qm = 3.25, which is 0.25 lower than (7.41) would formally produce for r → ∞. However, the slope qm = 3.25 seems to match perfectly the Monte Carlo result presented in figure 5 of [124] as well as the numerical solutions of equation (7.37), figures 4 and 5 (see footnote 6).
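
The numbers quoted in this comparison are easy to reproduce. The sketch below evaluates (7.41) with k = 1 and the local index q = −d ln f/d ln p of the asymptotic form f ∝ p^(−3.5)(1 + p/p1)^(1/2); the function names are ours and serve only as an illustration.

```python
def q_max_perturbative(r, r_s):
    """Index at the maximum momentum from (7.41), fitting constant k = 1."""
    return 3.5 + (3.5 - 0.5 * r_s) / (2.0 * r - r_s - 1.0)


def q_asymptotic(p, p1):
    """Local index q = -d ln f / d ln p for f ~ p^-3.5 (1 + p/p1)^(1/2)."""
    x = p / p1
    return 3.5 - 0.5 * x / (1.0 + x)


if __name__ == "__main__":
    print("test-particle limit r = r_s = 4:", q_max_perturbative(4.0, 4.0))
    print("(7.41) for very large r:        ", q_max_perturbative(1e9, 1.0))
    print("asymptotic solution at p = p1:  ", q_asymptotic(1.0, 1.0))
```

The first line recovers q_m = 4, the second tends to 3.5, and the third gives 3.25, i.e. the 0.25 difference discussed above.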

Footnote 6. Note that some other Monte Carlo simulations by Ellison and collaborators (e.g., [19, 60]) utilize the configuration space spectrum truncation (free escape boundary, FEB) rather than the momentum cut-off employed in [124] and earlier in this section. The introduction of the FEB leads to somewhat steeper spectra at p ≲ p1 since particles can probabilistically reach the FEB and leave the system at p < p1.

It is important to note here that the determination of qm = q(p1) without accurate calculation of q(p) for p < p1, even if correct, is not physically significant because the true spectrum at p = p1 is dominated by losses and is extremely idealized by the introduction of the abrupt cut-off at p1. It is also clear from equation (7.1) (see also equation (7.16)) that the probability of a particle of the momentum p = p1/a (with some a > 1) to sample the total compression r scales with a as $e^{-a}$ for the Bohm diffusivity κ(p). Therefore, for momenta only a few times smaller than p1 the total shock compression cannot enter the index q algebraically as it does in equation (7.41). The same is true for rs for p exceeding a few p0 (see equations (7.22), (7.23)). Thus, the perturbative expression (7.41) cannot be extended to the lower momenta if r ≫ 1. This probably requires some further modification of a simplified model of nonlinear acceleration suggested recently in [124,134] and based upon equation (7.41)
along with a test-particle expression for the spectrum in the nonrelativistic region. Finally, a regular perturbation expansion in R − 1 ≪ 1 was obtained earlier in [135, 136] and discussed in detail in part I. Unfortunately, we know of no further perturbative studies addressing the issue of the maximum injection level ν or parameter R − 1 ≪ 1 that would still be valid for the test-particle theory. This appears to be important, because the test-particle theory is widely used in modelling emission from astrophysical objects. One immediate deduction from the ν(R) diagrams shown in figure 3 is that the convergence radius of the perturbation series in ν cannot exceed ν = ν2 = ν(R2), where ν′(R2) = 0, ν″(R2) < 0. This is simply because ν2 is a branch point of the response function R(ν) where the perturbation series
\[ R = 1 + \sum_{n=1}^{\infty} a_n\nu^n \]
clearly diverges. As a result, the two upper branches of R(ν) (that exist for even lower ν and where the strongly nonlinear regime R ≫ 1 occurs) are inaccessible to the perturbation theory.

7.7. The role of turbulent heating


The critical dependence of acceleration efficiency upon the injection rate ν, Mach number
M and maximum momentum p1 suggests that physical processes not included in the above
description may dramatically influence the acceleration process. The precursor heating by
Alfvén waves is perhaps the most obvious candidate for that.
There is, however, another mechanism responsible for the gas heating. This is the acoustic
instability driven by the pressure gradient of CRs [137–142]. Note that the first mechanism
contains a small factor ∼1/MA, where MA is the Alfvénic Mach number, in the conversion
efficiency of the CR energy into MHD energy [104], part I. The second mechanism seems to be
more suitable for this purpose, particularly because the acoustic waves steepen into shock trains
and thus heat protons very efficiently. At the present stage parametrization is needed in either
case and we modify equation (7.6) as follows:
$$ M_0^{-2} = M^{-2} R^{8/3} + \frac{\alpha}{3}\,(R - 1)\,p_1 . \qquad (7.42) $$
Here the second term on the right-hand side represents the turbulent heating, with an efficiency
∝ α, powered by the (normalized) pressure contrast of CRs (represented by R − 1) and assumed
to be proportional to the precursor length l ∼ κ(p1)/u1 ∝ p1.
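As a rough numerical illustration of equation (7.42), the following Python sketch evaluates the subshock Mach number M0 with and without the heating term. It assumes that the equation reads M0⁻² = M⁻² R^(8/3) + (α/3)(R − 1)p1. The function name `subshock_mach` and the sample compression R = 5 are illustrative choices; M = 150 and p1 = 10⁵ are the values quoted in the caption of figure 6.

```python
def subshock_mach(M, R, alpha, p1):
    """Subshock Mach number M0 from the assumed form of equation (7.42).

    M     : far-upstream Mach number
    R     : precursor compression
    alpha : turbulent heating rate (alpha = 0 leaves only adiabatic compression)
    p1    : cut-off momentum in units of mc
    """
    adiabatic = M ** -2 * R ** (8.0 / 3.0)      # first term: adiabatic compression
    heating = (alpha / 3.0) * (R - 1.0) * p1    # second term: turbulent heating
    return (adiabatic + heating) ** -0.5


if __name__ == "__main__":
    M, p1, R = 150.0, 1e5, 5.0   # R = 5 is an arbitrary illustrative compression
    for alpha in (0.0, 1e-8, 1e-7, 1e-6):
        print(f"alpha = {alpha:g}: M0 = {subshock_mach(M, R, alpha, p1):.2f}")
```

For these sample values the heating term at α = 10⁻⁶ is much larger than the adiabatic term, which is one way to see why the curves in figure 6 are so sensitive to α.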
We demonstrate the impact of turbulent heating on the acceleration process in figure 6,
given in the same format as figure 3. This result looks rather discouraging. Indeed, it is very
difficult to draw any conclusion about the acceleration efficiency (and thus, as we have seen, about
the spectra) without an accurate calculation of the heating rate α. The efficiency may be very high
(for α ≲ 10⁻⁸) or very low (for α ≳ 10⁻⁶), although α changes by only two orders of magnitude.
Very strong efficiency variation occurs already for an order of magnitude variation of α. Since
there is no reliable theory for determining α, the only exception to this uncertain situation
would be the region of low injection, if the actual injection parameter ν were below its critical
value ν2. For the cut-off momenta appropriate to the problem of CR acceleration in SNRs, i.e.,
p1 ≳ 10⁵, both critical injections ν1 and ν2 are very small (≲10⁻²) and the actual injection rate

Figure 6. Bifurcation diagrams ν(R) for a shock of Mach number M = 150, injection
momentum p0 = 0.001 and cut-off momentum p1 = 10⁵, shown in the same format as figure 3
but for different turbulent heating rates α = 0, 10⁻⁸, 10⁻⁷ and 10⁻⁶, as indicated at each curve.
(Axes: compression R, 0 to 15, horizontal; injection ν, 0 to 0.06, vertical.)

ν obtained analytically and inferred from the hybrid simulations (see section 5) or even from
the in situ observations of heliospheric shocks [10] is higher⁷. In other words, for sufficiently
high values of p1, the nonlinearity of the acceleration must always play a crucial role; either
the efficiency is high (large R) or the effect of turbulent heating is strong, as figure 6 suggests.

7.8. Simplified time-dependent treatment


In the test-particle theory the maximum momentum p1 grows monotonically on the timescale
tacc given by (3.17). One can easily determine also the nonlinear timescale assuming that p1
grows in time while the background flow u(x) and the particle distribution g(x, p) depend
only implicitly on time through p1 (t). Rewriting (7.1) for the time-dependent case yields
 
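$$ -\frac{\partial g}{\partial t} + \frac{\partial}{\partial x}\left( u g + \kappa(p)\,\frac{\partial g}{\partial x} \right) = \frac{1}{3}\,\frac{du}{dx}\,p\,\frac{\partial g}{\partial p} . \qquad (7.43) $$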
Based on the adiabatic approximation explained above, the solution g(x, p, p1), u(x, p1),
found earlier in this section under the assumption of a sharp steady cut-off at p = p1,
remains approximately valid for p < p1(t). We thus replace g → g H(p1(t) − p), where H
is the Heaviside function. Substituting this into (7.43) and extracting the most singular terms
(i.e., those with δ functions) we formally obtain
$$ \frac{dp_1}{dt} = \frac{1}{3}\,p_1\,\frac{du}{dx} . \qquad (7.44) $$
⁷ Note that the injection parameter ν ∼ (nCR/n1)(cp0/mu1²), i.e., it is typically much larger than nCR/n1.

Figure 7. Schematic representation of the bifurcation curve ν(R) with a limit cycle behaviour
of the system (curve with the arrows). The horizontal axis is R, with R = 1 and the critical
points R2 < R1 marked; the vertical axis is ν, with the critical values ν1 < ν2 marked.
The cycle starts at the critical point (R2, ν2(p1)) when the
actual injection rate exceeds ν2 as p1 increases. The transition to the efficient solution for given ν
(horizontal line) is in reality incomplete since, as R increases, the actual injection rate drops below
the critical value (ν < ν1) and the solution returns to the inefficient branch. Starting from that
point ν grows again and the cycle repeats. During the low–high transition the bifurcation curve
may straighten (not shown in the figure) when the critical injections ν1 and ν2 both increase and
merge because of decreasing p1 (see text).

This treatment seems to be inconsistent with our assumption p1 = p1(t), since ux ≡ du/dx depends on
x. Fortunately, in a strongly nonlinear regime, when u1 ≫ u0, u(x) is nearly linear in most
of the CR precursor, u(x) ≈ (x/l)u1, where l was calculated in [103], l ≈ 1.4κ1(p1)/u1, so
that ux in (7.44) is constant and we obtain for tacc
$$ t_{\rm acc} = 3 \cdot 1.4\,\frac{\kappa_1(p_1)}{u_1^2} . \qquad (7.45) $$
Interestingly, this result can also be obtained by taking the limit u1 ≫ u2 and dropping the
downstream residence time contribution in the linear formula (3.17) altogether. This is probably
justified because the acceleration takes place in the extended CR precursor. The remaining
numerical factor of 1.4 is not meaningful at all in view of the simplifications made here.
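To make the scaling in (7.44) and (7.45) concrete, here is a minimal Python sketch that integrates dp1/dt = p1/tacc(p1). It assumes a Bohm-like diffusivity κ1(p) = κ0 p, in line with the precursor length l ∝ p1 used above. The constants κ0 and u1, the number of steps and the function names are illustrative placeholders in arbitrary units, not values from the text. Under this assumption the growth rate dp1/dt = u1²/(4.2 κ0) is constant, so p1 grows linearly in time, and the forward-Euler loop can be compared with the closed form.

```python
def t_acc(p1, u1, kappa0):
    """Nonlinear acceleration time (7.45) for the assumed kappa_1(p) = kappa0 * p."""
    return 3.0 * 1.4 * kappa0 * p1 / u1 ** 2


def evolve_p1(p_start, t_end, u1, kappa0, n_steps=1000):
    """Forward-Euler integration of dp1/dt = p1 / t_acc(p1) from t = 0 to t_end."""
    dt = t_end / n_steps
    p1 = p_start
    for _ in range(n_steps):
        p1 += dt * p1 / t_acc(p1, u1, kappa0)
    return p1


if __name__ == "__main__":
    u1, kappa0 = 1.0, 1.0                              # arbitrary units, illustration only
    p_num = evolve_p1(1.0, 42.0, u1, kappa0)
    p_exact = 1.0 + u1 ** 2 * 42.0 / (4.2 * kappa0)    # closed-form linear growth
    print(p_num, p_exact)
```

A different momentum dependence of κ1 would change the growth law; only the function `t_acc` would need to be replaced.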

7.9. Shock acceleration as self-organized critical phenomenon


The analysis of section 7.5 shows that the acceleration to high energies must be essentially
nonlinear. This makes the acceleration process rather unpredictable. To see this, suppose it
starts from some low maximum momentum p1 with a monotonic dependence ν(R), such as the one
shown for p1 = 100 in figure 3. Initially, the acceleration is relatively inefficient (R ≈ 1). As
p1 grows, however, the ν(R) diagram flattens and a pair of local extrema emerge at R = R1
and R = R2 < R1 (figure 7). It may be shown [123] that both critical values ν1,2 ≡ ν(R1,2)
decrease monotonically with p1. Therefore, no matter how small the actual injection rate
ν is, the condition ν > ν2 is eventually met and the inefficient solution ceases to exist. The
system must ‘jump’ to an efficient regime with high R. One may think of a phenomenon
similar to a phase transition in which R is an order parameter, and ν is a control parameter.


This renders the deterministic approach very limited. First, the ‘jump’ will take a significant
time τtr (transition time) since it also changes the shock structure, which in turn requires
τtr ∼ l/u1 = κ(p1)/u1² ∼ tacc. Second, it is unlikely to be completed as prescribed by the
‘phase transition’ diagram in figure 7 since the ‘control parameter’ ν changes as R grows. Since
both the subshock strength rs and the downstream temperature decrease with R, the actual
injection rate may decrease as well, in spite of the increase in the downstream plasma density
ρ2 = R rs ρ1 (e.g., [67]). As we discussed earlier, the effect of increased ρ2 on the injection
rate was recently shown to be entirely compensated by the cooling of the thermal pool due to the
injection itself (although so far only for relatively small R [75]). The next mechanism of injection
limitation comes from the nonlinear wave compression and amplification of B⊥ (see section 5
and figure 6 in [59]). Finally, when R grows, so does the subshock angle θnB, which is known
to strongly reduce injection for kinematic reasons, e.g., [78]. Hence, the system must move
back to the inefficient mode. Once this has happened, the full strength subshock begins to
be restored, as does the injection rate ν. Consequently, the transition to the efficient branch
(R > R1) must start again and the process repeats itself (figure 7).
The other ‘control parameter’, namely the maximum momentum p1, is also subject to self-regulation.
As we know from both linear and nonlinear theories, it grows in time according
to dp1/dt = p1/tacc(p1). In the nonlinear theory, however, its growth is likely to be limited
intrinsically. First, particles are confined near the shock by MHD waves generated via the
cyclotron resonance kcp = ωci (here p is in units of mc). Thus, the maximum wavelength
in the MHD spectrum should be λmax = 2π(c/ωci)p1. Since these longest waves are created
first in the outermost part of the precursor at x ∼ κ(p1)/u1 and then convected to the
subshock with the decreasing flow speed u(x), they are compressed there by the factor R,
i.e., λmax(x = 0) = λmax/R, and these waves are no longer in resonance with the particles from
the momentum interval p1/R < p < p1. Although new long waves are generated all across
the precursor, they are also subject to continuous refraction (through the flow compression).
This clearly impairs particle confinement so that the overall particle spectrum should decay
more rapidly⁸ starting from some p∗ ≃ p1/R. It is important to emphasize that the ‘intrinsic
cut-off’ p∗ is in fact a ‘dynamical cut-off’ in the sense that the particles with p∗ < p < p1 are
still present in the spectrum but they are dynamically unimportant. From an application point
of view this may appear as a spectral break at p∗.
The next mechanism of the maximum momentum limitation is crucial for understanding
the remainder of this section. When ν becomes supercritical (as p1 grows), this does not
occur simultaneously throughout the entire shock front. Some part of it will go to the high-acceleration
mode earlier simply because of variations in ν caused, e.g., by local variation of
the θnB angle (due to the CR generated turbulence) or by the inhomogeneity of the interstellar
medium (ISM). Excess CRs with high momenta diffuse to neighbouring sites which are
barely stable to the above low–high transition; thus, the effective maximum momentum
in the efficient region decreases and, at the same time, the subcritical state nearby is upset
by the increasing maximum momentum. Thus, the low–high transition event will propagate
along the shock front as an avalanche until the injection suppression mechanisms, the decrease
in maximum momentum (particle losses) and in the Mach number (turbulent heating) bring the
system back to the subcritical state. This spontaneous breaking of translational symmetry
along the shock front⁹ must result in fluctuations on all spatial scales undamped by the CR
⁸ To mention a possible factor that might act in the opposite direction, we refer to recent studies [143–145] where the long-wave CR-driven fire-hose instability was considered.
⁹ Generally speaking, this makes the results of any one-dimensional theory, including the one described above, subject to three- or at least two-dimensional verification.
diffusivity, i.e., l > κ1(p1)/u1; in other words, the corresponding damping time l²/κ1(p1) must
exceed the precursor crossing time or, similarly, the acceleration time, κ1(p1)/u1². Since there
is essentially no internal characteristic lengthscale for these oscillations up to the global system
length, like the SNR radius, these fluctuations must have a featureless power-law spectrum. In
other words, avalanches of essentially all sizes can develop.
What we described above is rather similar to the concept of self-organized criticality
(SOC) [146, 147], extensively studied over the past decade. The example of pressure gradient
driven transport in a plasma close to marginal stability [148] is, perhaps, physically the most
relevant to our discussion. In the simplest SOC paradigm (the sandpile), however, the transport
events (avalanches) prevent the system driven by toppling sand from developing slopes that are
too steep and keep it at a nearly marginal stability state. In our case the transition to the efficient
acceleration is also inhibited by a number of phenomena. A fundamental one concerns the
above-discussed formation of the intrinsic energy cut-off. As we have seen, once the actual
transition has started, p∗ is likely to decrease, thus straightening the bifurcation curve. Also the
growth of p1(t) should slow down due to the decrease in particle number density (and thus
resonant waves) at p = p1. If p∗ drops so strongly that the curve returns to a monotonic
behaviour, as shown in figure 3 for p1 = 100 (which now should be replaced by p∗), with
the moderate injection rate yielding R ≈ 1, there is apparently no reason why p1 should not start
to grow again, rendering ν(R) nonmonotonic. It seems that the system cannot stably evolve to
higher p∗ in either the test-particle or the efficient mode, and a reasonable way to reconcile
them is to assume a marginal state in which the maximum and the minimum of ν(R) merge,
at least in the sense of an averaged (in time and space) process. The last requirement implies
ν′(R) = ν″(R) = 0 at some R = Rc. These two equations and the dependence ν(R) itself
not only determine Rc and νc ≡ ν(Rc) but also provide an additional relation that involves the other
parameters of the problem which clearly enter the function ν(R). These are the Mach number
M, the heating rate α and the maximum momentum pmax ∼ p∗. For example, given M and α
we can easily calculate pmax [149].
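The merging of the two extrema can be illustrated with a deliberately simple toy model, which is not the actual ν(R) of this section. The Python sketch below assumes the cubic normal form ν(R) = νc + (R − Rc)³ − a(R − Rc), where the hypothetical parameter a stands in for the combined effect of M, α and pmax, and where Rc = 2 and νc = 0.01 are arbitrary placeholders. For a > 0 the curve has a maximum at R2 and a minimum at R1 > R2; at a = 0 they merge into an inflection point with ν′(Rc) = ν″(Rc) = 0; for a < 0 the curve is monotonic.

```python
import math


def toy_nu(R, a, R_c=2.0, nu_c=0.01):
    """Toy bifurcation curve: cubic normal form around the point (R_c, nu_c)."""
    return nu_c + (R - R_c) ** 3 - a * (R - R_c)


def toy_extrema(a, R_c=2.0):
    """Return (R2, R1), the maximum and minimum of toy_nu, or None if monotonic.

    d(nu)/dR = 3 (R - R_c)**2 - a vanishes at R = R_c -/+ sqrt(a / 3).
    """
    if a < 0.0:
        return None                  # no real roots: the curve is monotonic
    d = math.sqrt(a / 3.0)
    return (R_c - d, R_c + d)        # the two points coincide when a == 0


if __name__ == "__main__":
    for a in (0.3, 0.03, 0.0, -0.03):
        print(a, toy_extrema(a))
```

In the real problem the role of a is played by the parameters entering ν(R), so imposing the merged state fixes one of them (for example pmax) in terms of the others.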
As we already discussed in section 3.1, within a linear or perturbative (in terms of the
shock modification by CRs) treatment, the maximum energy derives from the shock age or size,
e.g., [108]. Pessimistic (lower) estimates of Emax in SNRs by Lagage and Cesarsky [150] gave
Emax ≲ 10¹³–10¹⁴ eV for protons, whereas an optimistic one, performed by Berezhko [113],
led to Emax ≈ 10¹⁵ eV. There exists a broad consensus on the estimate Emax ∼ 10¹⁴ Z eV, where
Z is the charge number [28], which is derivable from both size- and age-based arguments. Indeed,
the acceleration is rooted in shock electrodynamics, and Emax estimated on the grounds of the
finite size (Rs, growing with time) of the shock may be translated into the energy of a charge e
acquired in the induced electric field E ∼ (u/c)B, i.e., Emax ∼ (u/c)eBRs(t) (e.g., [151, 152]),
which for SNRs yields Emax ∼ 3 × 10¹³ eV. Jokipii, however, observed that in strongly
oblique shocks particles do not need to make long diffusive excursions away from the front
and may recross it more frequently, gaining energy up to 100 or more times faster, at least in
the case of a favourable relation between the cross-field and parallel diffusivities [153, 154].
Further discussion of this suggestion may be found in, e.g., [155] and [156].
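The size-based estimate Emax ∼ (u/c)eBRs can be checked with a few lines of Python. In SI units the same expression reads Emax = e u B Rs, so the energy in eV is numerically u B Rs. The shock speed, field strength and radius used below are illustrative assumptions (they are not given in the text), chosen as round numbers of the order expected for a young SNR; the function name `emax_ev` is likewise hypothetical.

```python
PARSEC_M = 3.0857e16    # metres per parsec
GAUSS_T = 1.0e-4        # tesla per gauss


def emax_ev(u_km_s, b_microgauss, radius_pc, charge_number=1):
    """Size-limited maximum energy E_max ~ Z e u B R_s, returned in eV.

    In SI units e*u*B*R is an energy in joules; dividing by e gives eV,
    so the result is simply Z * u[m/s] * B[T] * R[m].
    """
    u = u_km_s * 1.0e3
    b = b_microgauss * 1.0e-6 * GAUSS_T
    r = radius_pc * PARSEC_M
    return charge_number * u * b * r


if __name__ == "__main__":
    # Assumed values: u = 3000 km/s, B = 3 microgauss, R_s = 1 pc.
    print(f"E_max ~ {emax_ev(3000.0, 3.0, 1.0):.1e} eV")
```

With these assumed inputs the estimate comes out close to 3 × 10¹³ eV for protons and scales linearly with the charge number Z.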
In the strongly nonlinear acceleration regime as opposed to the linear one, the particle
energy limitation may occur due to the response of the shock structure to the acceleration,
i.e., due to the growing pmax itself. This response may be understood from figures 3 and 7.
In essence, it appears to be too strong to allow unlimited acceleration due to energy and
confinement requirements. In very simple terms, strong shocks tend to develop universal
spectra with diverging pressure which must thus decay beyond some p = pmax . A widely
accepted alternative to this scenario would be a decrease of the spectrum amplitude at higher
energies through its softening at lower energies. This mechanism clearly works for moderate
Mach numbers, as the inspection of figure 5 shows, but becomes less and less efficient as M
grows. To see this, consider the extreme M = ∞ case. If the heating is adiabatic then rs = 4
and the spectrum is thus nowhere softer than p⁻⁴, which is already hard enough to diverge
(logarithmically). Through the nonlinear effects it becomes even harder at higher momenta so
the pressure diverges ∝ ν pmax^(1/2) (section 7.2). Thus either the injection must be suppressed, or
pmax must be constrained.
We have calculated pmax based on the above criticality requirement ν′(Rc) = ν″(Rc) = 0
with no turbulent heating (α = 0) for a few Mach numbers M = 50–80 for which a dramatic
decrease in pmax occurs (see also [149] for some further details). The corresponding bifurcation
curves and the spectra are shown in figures 8(a) and (b). As is generally the case at critical
states, the fluctuations may be quite significant, so that local deviations from the obtained
spectra and maximum energies can be expected. Note that the interesting momentum range
10⁵–10⁶ GeV c⁻¹ (the so-called knee region that presumably stems from the maximum energy
achievable in SNRs) corresponds to a fairly narrow Mach number range between M = 60 and
70. Weaker shocks do not limit the maximum momentum significantly whereas stronger shocks
cannot accelerate to the ‘knee’ energies without self-destructive effects caused by the pressure
of accelerated particles. In other words, weaker shocks spend energy for acceleration more
economically, producing higher energies with a lesser acceleration efficiency. Interestingly,
if we define the latter simply as PC/ρ1u1² ≈ 1 − 1/R (see (7.3)), in the critical state this
quantity turns out to be close to 50% at M = 50 and increases slowly with M. Note that the
‘equipartition’ idea was advocated by Eichler already in the early years of the DSA research
(see also our comment on this in part I). Another important aspect of the energetic efficiency
that is less appreciated in the literature is that it is substantially lower in the upstream frame
than in the shock frame (Axford, personal communication) in which the above figures have
been obtained. The upstream frame is clearly more relevant physically, particularly for the
problem of the CR production in SNRs.
Now consider the factors that can change this simple picture. First of all, an obvious
alternative to the critical regime would be the realization of the intermediate solution on
the decreasing part of the ν(R) diagram with unlimited p1. This could provide a moderate
compression R with a reduced subshock and injection rate. The hydrodynamic counterpart
of this solution, however, has been proven unstable [95]. This should also be true for the kinetic
solution, because the inefficient solution is known to be stable from numerical simulations and
stable and unstable branches usually alternate (see [123] for details). The next possibility is
strong bursting dynamics under a significantly bent, nonmonotonic bifurcation curve ν(R) (see
figure 7), so that no momentum cut-off would be needed. In terms of a relevant dynamical
system, one may consider large amplitude limit cycle oscillations instead of soft fluctuations
around a fixed point. This possibility is indeed difficult to rule out without a full dynamical
treatment of the three-dimensional problem and a careful study of the impact of turbulence
compression on the losses at the highest energies. Moreover, it may be potentially interesting for
low-dimensional modelling of the intrinsic variability of sources such as extragalactic jets or
γ-ray bursts. The onset of dynamical chaos (e.g., [157]) in these oscillations would perhaps be
only a matter of the dimensionality of an appropriate Galerkin system (≥3 required) and the choice
of parameters.
The strong turbulent heating would clearly drive the system to a deeply subcritical, test-
particle acceleration regime, as shown e.g., in figure 6. In this case p1 can also grow as long as
the shock size allows [113], but instead of the problem of shock energy distribution between
CRs and bulk/thermal motion downstream an even more difficult problem arises, i.e., the
problem of dissipation of the CR driven turbulence. The intrinsic, acceleration based breaking
of the one-dimensional symmetry discussed above, as well as an initial inhomogeneity [158]

Figure 8. (a) Bifurcation curves ν(R) in critical states ν′(Rc) = ν″(Rc) = 0 shown for a few
values of the Mach number M = 50 (1), 60 (2), 70 (3), 80 (4); the axes are the precursor
compression R (horizontal, 1 to 5) and the injection rate ν (vertical, 0 to 0.03). (b) The particle
spectra taken at the critical points R = Rc and νc = ν(Rc); the axes are log10(p/mc)
(horizontal, −4 to 8) and the spectral index q (vertical, 2 to 5). The cut-off momenta p1 calculated
from the criticality condition for M = 50, 60, 70, 80 are p1 = 2 × 10⁸, 10⁵, 5 × 10⁴ and 1.5 × 10⁴, respectively.

of the ISM make it even more complex. Even at a purely hydrodynamic level (without CRs)
shock symmetry breaking is a difficult problem. We refer the reader to the recent review by
Zabusky [159].
Finally, an obvious result of the sharp R(ν) dependence for a shock close to criticality
would be a substantially distorted subshock front. Clearly, the regions of the subshock with
less efficient acceleration (smaller R) will be well ahead of the regions where R is large.
This might serve as an intrinsic remnant shaping mechanism totally independent of the ISM
inhomogeneity, but sensitive to the angle between the shock normal and the ambient magnetic
field.

8. Observations

There has been a revolutionary improvement in the measurement of radiation from a variety
of objects in the Universe (e.g., EGRET [160, 161], BeppoSAX [162], Chandra [163], TeV
astronomy [164, 165]). It would be impossible even to list all the significant results in such
a brief review. Therefore, we limit our discussion to a few interpretative issues and to some
recent developments in the field, referring the reader to the cited papers for more information.
In most cases the primary source of the observed radiation is believed to be accelerated
particles, often of remarkably high energies, such as 10²⁰ eV or even higher [166]. A variety
of models are built upon secondary radiation processes like electron synchrotron emission
and inverse Compton (IC) scattering [17, 18], neutrino production in p–γ reactions, and
high-energy γ-ray emission from the decay of π⁰ mesons born in hadronic reactions [167–173].
Therefore, one needs to distinguish between observational signatures of accelerated
protons and electrons. There are well documented measurements of nonthermal electron
emission in radio, e.g., [174], x-rays [175–177] and possibly also in γ-rays [101] coming from
SNR shells. The γ-ray band must be directly related to ultrarelativistic particles. The first
object where this was detected is the shell-type SNR1006 [101] (see also [178] for a second
possible candidate), with the currently accepted interpretation of this γ radiation being the 2.7 K
cosmic microwave background photons up-scattered by shock accelerated electrons via the IC
process (e.g., [179, 180]). An ultimate interpretation would probably require the calculation
of injection rates of both protons and electrons. At the same time, even the possibility of
alternative, nucleonic interpretation would be crucial to the SN hypothesis of CR origin as
well as to the theory of shock acceleration. This motivated the authors of [181] to
scrutinize other scenarios carefully. They arrived at the conclusion that if a sufficient nonlinear increase
in shock compression and thus in target density for the p–p interaction (and magnetic field) is
allowed, then the detected radiation is due to protons rather than electrons. These authors also
discussed another critical test for distinguishing between the two scenarios, which is based on
the spatial distribution of emission. If this is from π⁰ mesons, it should be localized more in the
shocked gas area (high target density for p–p reactions), whereas electrons obviously interact with the 2.7 K
radiation in the entire area of their propagation, i.e., their emission region should be
significantly broader. A final judgement will be passed by the new generation of instruments
with dramatically enhanced resolution and sensitivity, such as Chandra, GLAST and
imaging atmospheric Cherenkov telescopes that are currently being built [164, 165]. In the
meantime we may confirm that based upon the results discussed earlier in this review, the
nonlinear compression ratios potentially achievable in this process may be indeed high enough
to favour nucleonic interpretation.
It should be noted, however, that when both species become relativistic and the
synchrotron cooling time is longer than the acceleration time, the form of the electron spectrum
should be identical to that of the protons. Therefore, any evidence for electron acceleration
implies (again, up to the injection efficiencies) the same for protons. Understanding the
injection process for different species is thus critical not only theoretically, as seen from the
bifurcation diagrams in figure 3, but also to the interpretation of the observed radiation.
Remarkably, new ideas might be in the pipeline: there is a broad initiative aimed at the
modelling of astrophysical explosions using high-energy lasers, e.g., [182]. Considerable
efforts have to date been devoted to the scaling of astrophysical objects, particularly SNRs,
to laboratory experiments with high-power lasers such as the Nova laser (e.g., [183]), to the
hydrodynamic similarity and difference between these two environments (e.g., [184]), as well
as to explosion–implosion duality [185]. At the same time there is an ambitious project, based,
however, on a careful analysis of laboratory capabilities and theoretical constraints [186], which
is specifically aimed at particle acceleration in collisionless shocks. It is planned to drive a
shock at a speed of 200 km s⁻¹ into an initially cold plasma of density 10¹³ cm⁻³ in a
magnetic field of 200 gauss, all within a 9 × 6 m chamber, fully equipped with the necessary
diagnostics. This will deliver the global shock structure, the characteristics of MHD and
plasma turbulence as well as ion and electron spectra. The shock crossing time will be 100
ion gyroperiods, which is clearly too short for it to develop into a strongly modified CR-shock
but should be sufficient for gaining critical insight into many injection mechanism issues for
both electrons and ions, as well as of the physics of collisionless shocks discussed earlier in this
review.
However, nonlinear shock modification is also becoming a subject of observational
astrophysics. The unprecedented spatial resolution of the Chandra x-ray observatory allowed
the authors of [163] to simultaneously measure the shock speed and the postshock electron
temperature of the SNR E0102.2-7219. They further argue that since this temperature is so
much lower than RH relations would prescribe, these measurements present evidence of the
CR modification of the shock structure. There are indeed mechanisms of anomalous electron
heating in the collisionless shock environment which under certain circumstances should be
even more efficient than those for ions [91]. Nevertheless, perhaps we really need to wait for
the results of the above described laboratory experiments to understand what cold postshock
electrons really mean. This may still result from the adiabatic interaction of electrons with an
ion-dominated electromagnetic shock structure.
Further evidence for the nonlinearity of acceleration should perhaps be looked for in the
morphology of radio synchrotron images of SNRs. This is often of a ‘barrel’ type, suggesting
that significantly stronger emission comes from the ‘quasi-perpendicular’ part of the shock
surface, i.e., where the shock obliquity θnB ≃ π/2. Since the emissivity is proportional
to Pc B⊥^(3/2), where Pc is the pressure of CR electrons and B⊥ is the component of the B-field
perpendicular to the line of sight, these two should be responsible for the observed anisotropy.
Indeed, only the component of the magnetic field perpendicular to the shock normal is compressed,
leading to the enhanced emission from the equatorial part of the remnant. Also the dependence
of injection efficiency on the shock obliquity θnB (see section 5.1) should contribute to
the anisotropy of synchrotron images. These factors have been included in the model by
Fullbright and Reynolds [187]. Note that the dependence of the acceleration timescale on
θnB is unimportant here since the emission is due to GeV electrons, as pointed out by Axford
et al [188].
Ratkiewicz et al [189] argued, however, that a better agreement with observations may be
achieved by also including nonlinear effects. Indeed, the shock compression (and thus that of
the magnetic field) is stronger in the nonlinear regime. The authors of [189] therefore included
the magnetic field dynamics in the Chevalier [190] self-similar spherically symmetric
solution for a blast wave with the CR pressure Pc. They were able to synthesize emission
images labelled by the parameter of nonlinearity w = Pc/(Pc + Pth), where Pth is the pressure
of the thermal gas. Images with w ≈ 1, rather than with w ≪ 1, turned out to be consistent
with the observations.

9. Summary and outlook

The focus of this review has been on what analytical methods can contribute to our
understanding of the shock acceleration process. Among the results, the greatest observational
significance is attached to the spectra of the accelerated particles. We believe that the current
theory can describe them reasonably well from thermal to high-energy particles for a fairly
arbitrary momentum dependence of CR diffusivity and in all practically interesting parameter
regimes. This represents a very significant development compared to the situation when part I
was written 17 years ago.
The most attractive property of the test-particle spectrum is that it is a simple power law,
f(p) ∝ p^(−q), with an exponent which is, in the case of strong shocks, q = 4, intriguingly close
to what is usually inferred from the observation of galactic CRs (4.1 for propagation models
with little or no reacceleration, e.g., [156]; models with reacceleration yield steeper inferred
source spectra with exponents closer to 4.4, e.g., [152, 170]) and similar to the electron spectral
indices inferred in many radio sources. Although not usually emphasized much in the literature,
we feel that an equally attractive and significant feature is that this power-law spectrum can
naturally extend over a very large dynamic range in energy; it is hard to think of another
process which can operate in essentially the same way from subrelativistic energies all the way
to 10¹⁴ eV and beyond.
The strict mathematical requirement for the test-particle approach to be valid is that
the CR energy and number density be infinitesimal. These, however, turned out to be quite
significant. Therefore, we determined the tolerable injection level by examining the nonlinear
reaction of CRs on the flow. A strong warning came from the TFM, according to which the
injection may be literally turned off but the nonlinearity still dominates the acceleration in strong
shocks. We then attempted to find kinetic prototypes of the efficient TFM solution. One exact
solution characterized by a marginally converging pressure has been found for the momentum-independent
diffusion coefficient, in accord with the TFM. In the momentum-dependent case,
a self-similar solution was found that essentially required an upper momentum cut-off. We also
derived an integral equation for this case whose solution recovers the strongly nonlinear and the
test-particle regimes simply as limiting cases. This enabled us to explore the parameter region
between these two extremes. Although this region is rather narrow (in injection rate ν, e.g.,
figure 3), a variety of CR spectra that substantially differ from these two limiting cases have
been found. We argued that it is this region into which the acceleration process must evolve
to be capable of self-regulation. Therefore, the approaches based on the parametrization of
critical physical quantities such as the injection rate and maximum energy may be in many
cases inadequate. In particular, there are indications that the maximum energy may be limited
through the nonlinearly enhanced particle losses and breaking of the one-dimensional shock
structure.
These findings should have significant observational consequences. First of all ‘directly
observed’ spectra such as the synchrotron electron spectra should not necessarily be exact power
laws. The efficient acceleration might appear in ‘hot spots’ on the shock surface so that the
spectra may significantly vary along it. They may even become locally convex in momentum
due to the energy-dependent losses from the hot spots despite their generally concave character
in the plane nonlinear shock model. Another mechanism of spectral steepening, based on wave
damping and operating in a purely one-dimensional case, has been discussed in section 6.2. These
spectrum modifications and the possibility of the nonlinear, intrinsic cut-off in high Mach
number shocks make the predictions of the high-energy spectra based on their low-energy
measurements (e.g., [167–169]) very difficult.
It is important to recognize that the simple test-particle theory often seems to be in better
agreement with observations than more consistent nonlinear theories. We feel, however, that
this is because the latter are still incomplete. Even the simplest modifications mentioned above
and in the preceding section make the spectra look more like the test-particle spectra (see e.g.,
figure 8(b)): they become steeper and, to a significant extent, lose their concave nonlinear
signature. Nonetheless it is obvious (see e.g., figure 8(a)) that the underlying acceleration
regime is by no means linear.
It would not be surprising if the integral spectra produced by a more complete three-
dimensional nonlinear theory were close to the test-particle power laws (as some of the SNR
observations seem to suggest, e.g., [191]). We have already encountered apparent returns to
test-particle theory in our nonlinear treatment of this acceleration mechanism. One good reason
for this could be the spatial (and/or temporal) scale invariance of fluctuations of acceleration
efficiency along the shock front. By the very nature of this acceleration process spatial scale
(l) is coupled with momentum (p) (through κ(p) ∼ ul) so that the scale invariant spatial
distribution of the hot spots must translate into a power-law integral particle spectrum even
though local spectra may bear their flatter nonlinear form which is not an exact power law.
Notwithstanding these remarks some observational indications of flatter spectra similar to
the asymptotic, pure power law q = 3.5 spectrum of section 7.2 do exist even in shell-type
SNR [31, 174]. Still better sites to look for such spectra are probably extragalactic jets and
accretion shocks of clusters of galaxies.
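The scale-invariance argument above, in which the coupling κ(p) ∼ ul turns a scale-free distribution of hot spots into a power-law integral spectrum, can be made explicit in a short sketch. The power-law form of κ(p), the exponents α and s, and the normalizations κ_0 and p_0 are illustrative assumptions introduced here and are not taken from the text; α = 1 would correspond to Bohm diffusion of relativistic particles.

```latex
% Illustrative sketch only: alpha, s, kappa_0 and p_0 are assumed symbols.
% Assumed diffusion coefficient (alpha = 1 for relativistic Bohm diffusion):
\kappa(p) = \kappa_0 \left( \frac{p}{p_0} \right)^{\alpha}
% Coupling between spatial scale and momentum, from \kappa(p) \sim u l:
l(p) \sim \frac{\kappa(p)}{u} = \frac{\kappa_0}{u} \left( \frac{p}{p_0} \right)^{\alpha}
% Assumed scale-invariant contribution of hot spots of size l:
\frac{dN}{d\ln l} \propto l^{-s}
% Changing variables with d\ln l = \alpha \, d\ln p gives a pure power law in p:
\frac{dN}{d\ln p} = \alpha \, \frac{dN}{d\ln l} \propto \left( \frac{p}{p_0} \right)^{-\alpha s}
```

Under these assumptions the integral spectrum is an exact power law with index αs, whatever the shape of the local spectra. This holds only over the momentum range in which the hot-spot distribution stays scale-free.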
The curse of symmetry breaking can also be turned into a blessing when dealing with the
problem of efficiency which is usually too low in the test-particle, linear regime and too high
in the nonlinear one, at least if no turbulent heating is introduced. This is typical for bistable
systems and the spatial and time splitting between the two states is a usual way in which such
systems accommodate their environments (e.g., phase transitions).
Although there is some reason for optimism coming from the comparison of a simplified
one-dimensional theory with selected observations, great care should be exercised in doing
so. We have seen that simple test-particle theory can falsify the results of nonlinear theory in
regimes where the former is clearly inapplicable. What is usually compared to observations
is that same power-law index (a number) by means of models equipped with quite a few
‘free parameters’, such as the particle m.f.p., the turbulent heating rate, injection rate and
the cut-off momentum about which we are fairly uncertain. It is not surprising then that
‘good agreement’ can be reached. The fact that the computed spectrum is a power-law was
convincing enough in the early days of DSA research, implying that a correct functional
form of the spectrum is captured by the model. Power-law spectra, however, are the rule
rather than the exception in dynamical systems lacking internal scales (e.g., [146, 147]). These
remarks do not invalidate the reviewed mechanism whose physical viability and universality
we have attempted to demonstrate. They rather necessitate a more adequate three-dimensional
time-dependent implementation of it with a substantially improved description of the turbulent
particle transport in shocks and (not independently) their injection and losses. This should
provide a more consistent comparison with observations. The forthcoming γ-ray detectors
such as GLAST (e.g., [192]) and stereoscopic Cherenkov arrays (e.g., [164, 165]) that
should essentially gaplessly cover the critical energy interval from 0.1 GeV to 10 TeV with
dramatically improved sensitivity and spatial resolution will set new standards for theoretical
predictions.

Acknowledgments

We would like to thank W I Axford, R D Blandford, P H Diamond, U D J Gieseler, T W Jones,
H J Völk and V D Shapiro for useful discussions. The work of MM was supported by the
SFB-328 of the Deutsche Forschungsgemeinschaft and by US DOE under grant no FG03-
88ER53275. The work of LD was partially supported by the TMR program of the EU under
contract no FMRX-CT98-0168.

References

[1] Krymsky G F 1977 Dokl. Akad. Nauk 234 1306 (Engl. Transl. Sov. Phys.–Dokl. 23 327–8)
[2] Axford W I, Leer E and Skadron G 1977 Proc. 15th Int. Cosmic Ray Conf. (Plovdiv) 11 132–7
[3] Bell A R 1978 Mon. Not. R. Astron. Soc. 182 147–56
[4] Bell A R 1978 Mon. Not. R. Astron. Soc. 182 443–55
[5] Blandford R D and Ostriker J P 1978 Astrophys. J. 221 L29
[6] Goldreich P and Sridhar S 1997 Astrophys. J. 485 680
[7] Drury L O’C, Markiewicz W J and Völk H J 1989 Astron. Astrophys. 225 179
[8] Blandford R D and Ostriker J P 1980 Astrophys. J. 237 793–808
[9] Achterberg A 1981 Astron. Astrophys. 98 161–72
[10] Lee M A 1982 J. Geophys. Res. 87 5063
[11] Webb G M, Drury L O and Biermann P 1984 Astron. Astrophys. 137 185
[12] Webb G M, Forman M A and Axford W I 1985 Astrophys. J. 298 684–709
[13] Völk H J and Biermann P L 1988 Astrophys. J. 333 L65–8
[14] Schlickeiser R 1989 Astrophys. J. 336 264
[15] Dorfi 1991 Astron. Astrophys. 251 597–610
[16] Jones T W and Kang H 1992 Astrophys. J. 396 575–86
[17] Sturner S J, Skibo J G, Dermer C D and Mattox J R 1997 Astrophys. J. 490 619
[18] Reynolds S P 1998 Astrophys. J. 493 375
[19] Baring M G, Ellison D C, Reynolds S P, Grenier I A and Goret P 1999 Astrophys. J. 513 311–38
[20] Toptygin I N 1980 Space Sci. Rev. 26 157
[21] Axford W I 1981 Proc. IAU Symp. ‘Origin of Cosmic Rays’ ed G Setti, G Spada and A W Wolfendale (Boston:
Reidel) pp 339–58
[22] Drury L O’C 1983 Rep. Prog. Phys. 46 973
[23] Forman M A and Webb G M 1985 A Tutorial Review (Geophys. Monogr. Ser.) vol 34, ed R G Stone and
B T Tsurutani (Washington, DC: American Geophysical Union) p 91
[24] Blandford R D and Eichler D 1987 Phys. Rep. 154 1–75
[25] Berezhko E G and Krymsky G F 1988 Usp. Fiz. Nauk. 154 49 (Engl. Transl. Sov. Phys.–Usp. 31 27)
[26] Berezinskii V S, Bulanov S V, Ginzburg V L, Dogiel V A and Ptuskin V S 1990 Astrophysics of Cosmic Rays
(Amsterdam: Elsevier)
[27] Jones F C and Ellison D C 1991 Space Sci. Rev. 58 259
[28] Axford W I 1994 Astrophys. J. Suppl. 90 937–42
[29] Blandford R D 1994 Astrophys. J. Suppl. 90 515–20
[30] Biermann P L 1997 J. Phys. G: Nucl. Part. Phys. 23 1–27
[31] Jones T W et al 1998 Publ. Astron. Soc. Pac. 110 125–51
[32] Kirk J G and Duffy P 1999 J. Phys. G: Nucl. Part. Phys. 25 R163–94
[33] Bhattacharjee P and Sigl G 2000 Phys. Rep. 327 110
[34] Nagano M and Watson A A 2000 Rev. Mod. Phys. 72 689–732
[35] Skilling 1975 Mon. Not. R. Astron. Soc. 172 557–66
[36] Kennel C F and Engelmann F 1966 Phys. Fluids 9 2377
[37] Krymski 1964 Geomagn. Aeron. 977 763
[38] Parker E N 1965 Planet Space Sci. 13 9
[39] Gleeson L J and Axford W I 1967 Astrophys. J. 149 L115
[40] Jokipii J R 1968 Astrophys. J. 152 997–03
[41] Webb G M 1985 Astrophys. J. 296 319
[42] Webb G M 1989 Astrophys. J. 340 1112
[43] Hasselmann K and Wibberenz G 1970 Astrophys. J. 162 1049
[44] Dorman L I and Kats M E 1977 Space Sci. Rev. 20 529
[45] Fisk L A 1976 J. Geophys. Res. 81 4646
[46] Dolginov A Z and Toptygin I N 1968 Icarus 8 54
[47] Webb G M 1983 Astron. Astrophys. 127 97
[48] Drury L O’C, Duffy P, Eichler D and Mastichiadis A 1999 Astron. Astrophys. 347 370–4
[49] Drury L O’C 1991 Mon. Not. R. Astron. Soc. 251 340–50
[50] Sagdeev R Z 1966 Cooperative phenomena and shock waves in collisionless plasmas Review of Plasma Physics
vol 4 (New York: Consultant Bureau)
[51] Kennel C F, Edmiston J P and Hada T 1985 A Tutorial Review (Geophys. Monogr. Ser.) vol 34, ed R G Stone
and B T Tsurutani (Washington, DC: American Geophysical Union) p 1
[52] Papadopoulos K 1985 A Tutorial Review (Geophys. Monogr. Ser.) vol 34, ed R G Stone and B T Tsurutani
(Washington, DC: American Geophysical Union) p 59
[53] Krall N A 1997 Adv. Space Res. 20 715–24
[54] Landau L D and Lifshitz E M 1987 Fluid Mechanics (Oxford: Pergamon)
[55] Scudder J D 1995 Adv. Space Res. 15 181–223
[56] Leroy M M, Winske D, Goodrich C C, Wu C S and Papadopoulos K 1982 J. Geophys. Res. 87 5081
[57] Kennel C F et al 1984 J. Geophys. Res. 89 5436
[58] Quest K B 1988 J. Geophys. Res. 93 9649
[59] Malkov M A 1998 Phys. Rev. E 58 4911
[60] Ellison D C, Drury L O’C and Meyer J P 1997 Astrophys. J. 487 197
[61] Lingenfelter R E, Ramaty R and Kozlovsky B 1998 Astrophys. J. 500 L153–6
[62] Bogdan T J and Webb G M 1987 Mon. Not. R. Astron. Soc. 229 41
[63] Kucharek H and Scholer M 1991 J. Geophys. Res. 96 22195
[64] Giacalone J, Burgess D, Schwartz S J and Ellison D C 1992 Geophys. Res. Lett. 19 433
[65] Scholer M, Kucharek H and Trattner K J 1998 Adv. Space Res. 21 533
[66] Jones F C, Jokipii J R and Baring M G 1998 Astrophys. J. 509 238
[67] Malkov M A and Völk H J 1995 Astron. Astrophys. 300 605–26
[68] Bennett L and Ellison D C 1995 J. Geophys. Res. 100 3439
[69] Levinson A 1992 Astrophys. J. 401 73
[70] Levinson A 1994 Astrophys. J. 426 327
[71] Bykov A M and Uvarov Yu A 1999 Sov. Phys.–JETP 88 465
[72] Boutros-Ghali T and Dupree T H 1981 Phys. Fluids 24 1939–858
[73] Völk H J 1984 High Energy Astrophysics ed J Tran Thanh Van (Gif-sur-Yvette: Editions Frontieres) p 281
[74] Bykov A M and Toptygin I N 1993 Phys. Usp. 36 1020
[75] Gieseler U D J, Jones T W and Hyesung Kang 2000 Astron. Astrophys. 364 911–22
[76] Ellison D C 1985 J. Geophys. Res. 90 29
[77] Giacalone J, Burgess D, Schwartz S J and Ellison D C 1993 Astrophys. J. 402 550
[78] Baring M G, Ellison D C and Jones F C 1993 Astrophys. J. 409 327
[79] Ellison D C, Giacalone J, Burgess D and Schwartz S J 1993 J. Geophys. Res. 98 221085
[80] Galeev A A 1984 Sov. Phys.–JETP 86 1655
[81] Anderson K A, Parks G K, Eastman T E, Garnett D A and Frank L A 1981 J. Geophys. Res. 86 4493
[82] Papadopoulos K 1981 Plasma Astrophys. ESA-161 145
[83] Vaisberg O L, Galeev A A, Zastenker G N, Klimov S I, Nozdrachev M N, Sagdeev R Z, Sokolov A Yu and
Shapiro V D 1983 Sov. Phys.–JETP 85 716
[84] Krasnoselskikh V V 1992 Collective Acceleration in Collisionless Plasmas ed D Le Queau, A Roux and
D Gresilon (Les Ulis: Les Editions de Physique) p 297
[85] McClements K G, Dendy R O, Bingham R, Kirk J G and Drury L O 1997 Mon. Not. R. Astron. Soc. 291 241
[86] Galeev A A, Malkov M A and Völk H J 1995 J. Plasma Phys. 54 59–76
[87] Shapiro V D, Bingham R, Dawson J M, Dobe Z, Kellett B J and Mendis D A 1999 J. Geophys. Res. 104 2537
[88] Bingham R, Kellett B J, Dawson J M, Shapiro V D and Mendis D A 2000 Astrophys. J. Suppl. 127 233
[89] Dieckmann M E, Chapman S C, McClements K G, Dendy R O and Drury L O 2000 Astron. Astrophys. 356
377
[90] Sagdeev R Z and Galeev A A 1969 Nonlinear Plasma Theory ed T M O’Neil and D L Book (New York:
Benjamin)
[91] Sagdeev R Z 1979 Rev. Mod. Phys. 51 1–20
[92] Coroniti F 1970 J. Plasma Phys. 4 265
[93] Drury L O’C and Völk H J 1981 Astrophys. J. 248 344–51
[94] Axford W I, Leer E and McKenzie J F 1982 Astron. Astrophys. 111 317
[95] Mond M and Drury L O’C 1998 Astron. Astrophys. 332 385
[96] Drury L O’C, Duffy P and Kirk J K 1996 Astron. Astrophys. 309 1002
[97] Lucek S G and Bell A R 2000 Astrophys. Space Sci. 272 255–62
[98] Achterberg A, Blandford R D and Reynolds S P 1994 Astron. Astrophys. 281 220–30
[99] Spangler S R, Mutel R L, Benson J M and Cordes J M 1986 Astrophys. J. 301 312–9
[100] Spangler S R, Fey A L and Cordes J M 1987 Astrophys. J. 322 909–16
[101] Tanimori T et al 1998 Astrophys. J. Lett. 497 L25
[102] Achterberg A, Blandford R D and Perival V 1984 Astron. Astrophys. 132 97–104
[103] Malkov M A 1997 Astrophys. J. 485 638
[104] McKenzie J E and Völk H J 1982 Astron. Astrophys. 116 191
[105] Kang H and Jones T W 1990 Astrophys. J. 353 149
[106] Zank G P, Webb G M and Donohue D J 1993 Astrophys. J. 406 67–91
[107] Ko M C, Chan K W and Webb G M 1997 J. Plasma Phys. 57 677–94
[108] Axford W I 1981 Proc. 10th Texas Symp. on Relativistic Astrophysics (Baltimore: Ann. NY Acad. Sci) vol 375
pp 297–313
[109] Axford W I 1981 Proc. Int. School and Workshop on Plasma Astrophysics vol. SP-161 (Varenna: European
Space Agency) pp 425–9
[110] Bell A R 1987 Mon. Not. R. Astron. Soc. 225 615
[111] Falle S A E G and Giddings J R 1987 Mon. Not. R. Astron. Soc. 225 399
[112] Malkov M A and Völk H J 1995 Proc. 24th Int. Cosmic Ray Conf. (Rome) 3 269–72
[113] Berezhko E G 1996 Astropart. Phys. 5 367
[114] Malkov M A and Völk H J 1996 Astrophys. J. 473 347
[115] Eichler D 1984 Astrophys. J. 277 429
[116] Drury L O’C, Axford W I and Summers D 1982 Mon. Not. R. Astron. Soc. 198 833–41
[117] Blandford R D and Payne D G 1981 Mon. Not. R. Astron. Soc. 194 1041–55
[118] Toptygin I N 1997 Sov. Phys.–JETP 85 862–72
[119] Webb G M, Bogdan T J, Lee M A and Lerche I 1985 Mon. Not. R. Astron. Soc. 215 341–52
[120] Eichler D 1985 Astron. Astrophys. 294 40
[121] Malkov M A 1999 Astrophys. J. Lett. 511 L53–6
[122] Eichler D 1979 Astrophys. J. 229 419
[123] Malkov M A 1997 Astrophys. J. 491 584
[124] Berezhko E G and Ellison D C 1999 Astrophys. J. 526 385
[125] Berezhko E G, Yelshin V and Ksenofontov L 1996 Sov. Phys.–JETP 82 1
[126] Ellison D C and Eichler D 1985 Phys. Rev. Lett. 55 2735
[127] Kazanas D and Ellison D C 1986 Astrophys. J. 304 178–87
[128] Ellison D C and Eichler D 1984 Astrophys. J. 286 691
[129] Nayfeh A H 1973 Perturbation Methods (New York: Wiley)
[130] Duffy P, Drury L O’C and Völk H J 1994 Astron. Astrophys. 291 613
[131] Kang H and Jones T W 1995 Astrophys. J. 447 994
[132] Berezhko E G and Völk H J 1997 Astropart. Phys. 7 183
[133] Achterberg A 1987 Astron. Astrophys. 174 329
[134] Ellison D C, Berezhko E G and Baring M G 2000 Astrophys. J. 540 292
[135] Blandford R D 1980 Astrophys. J. 238 410
[136] Heavens A F 1983 Mon. Not. R. Astron. Soc. 204 699
[137] Drury L O’C 1984 Adv. Space Res. 4 185
[138] Drury L O’C and Falle S A E G 1986 Mon. Not. R. Astron. Soc. 223 353
[139] Berezhko E G 1987 Sov. Astron. Lett. 12 352
[140] Chalov S V 1988 Sov. Astron. Lett. 14 114
[141] Zank G P, Axford W I and McKenzie J F 1990 Astron. Astrophys. 233 275–84
[142] Kang H, Jones T W and Ryu D 1992 Astrophys. J. 385 193–204
[143] Bell A R 1995 Energetic Particles in Astrophysical and Space Plasmas 4th Workshop ed R O Dendy
(Netherlands: Hoenderloo)
[144] Quest K B and Shapiro V D 1996 J. Geophys. Res. 101 24
Quest K B and Shapiro V D 1996 J. Geophys. Res. 101 457–69
[145] Shapiro V D, Quest K B and Okolicsanyi 1998 Geophys. Res. Lett. 25 845–8
[146] Bak P, Tang C and Wiesenfeld K 1987 Phys. Rev. Lett. 59 381
[147] Bak P 1996 How Nature Works: The Science of Self-Organized Criticality (New York: Springer)
[148] Diamond P H and Hahm T S 1995 Phys. Plasmas 2 3640
[149] Malkov M A, Diamond P H and Völk H J 2000 Astrophys. J. 533 L171–4
[150] Lagage P O and Cesarsky C J 1983 Astron. Astrophys. 125 249
[151] Blandford R D 2000 Phys. Scr. T 85 191
[152] Ptuskin V S 1997 Adv. Space Res. 19 697–705
[153] Jokipii J R 1987 Astrophys. J. 313 842–6
[154] Jokipii J R and Morfill G E 1985 Astrophys. J. 290 L1
[155] Völk H J 1987 Particle acceleration in astrophysical shock waves Proc. 20th Int. Cosmic Ray Conf. (Moscow)
(Moscow: NAUKA) 7 157–200
[156] Axford W I 1991 Astrophysical Aspects of the Most Energetic Cosmic Rays ed M Nagano and G Takahara
(Singapore: World Scientific) p 402
[157] Sagdeev R Z, Usikov D A and Zaslavsky G M 1988 Nonlinear Physics: From the Pendulum to Turbulence and
Chaos (Chur: Harwood)
[158] McKee C F 1982 Supernovae: A Survey of Current Research ed M J Rees and R J Stoneham (Dordrecht:
Reidel) p 433
[159] Zabusky N J 1999 Annu. Rev. Fluid Mech. 31 495–536
[160] Esposito J A, Hunter S D, Kanbach G and Sreekumar P 1996 Astrophys. J. 461 461–820
[161] Hartman R C et al 1999 Astrophys. J. Suppl. 123 79–202
[162] Fossati G, Celotti A, Chiaberge M and Zhang Y H 1999 BeppoSAX Observations of Mkn 421: clues on
the particle acceleration Proc. 5th Compton Symp. (Portsmouth NH) ed. M L McConnell and J M Ryan
(Melville, N.Y.: AIP) at press
Fossati G, Celotti A, Chiaberge M and Zhang Y H 1999 BeppoSAX Observations of Mkn 421: clues on the
particle acceleration Proc. 5th Compton Symp. (Portsmouth NH) Preprint astro-ph/9912055
[163] Hughes J P, Rakowski C E and Decourchelle A 2000 Astrophys. J. 543 L61–5
[164] Aharonian F A 1999 Astropart. Phys. 11 225–34
[165] Weekes T C 2000 Phys. Scr. T 85 195
[166] Cronin J 1999 Rev. Mod. Phys. 71 S165
[167] Drury L O’C, Aharonian F A and Völk H J 1994 Astron. Astrophys. 287 959–71
[168] Aharonian F A, Drury L O’C and Völk H J 1994 Astron. Astrophys. 285 645–7
[169] Naito T and Takahara F 1994 J. Phys. G: Nucl. Part. Phys. 20 477
[170] Seo E S and Ptuskin V S 1994 Astrophys. J. 431 705
[171] Mannheim K 1998 Science 279 684
[172] Waxman E and Bahcall J 1999 Phys. Rev. D 59 023002
[173] Mannheim K, Protheroe R J and Rachen J P 1998 Phys. Rev. D 63 023003
[174] Green D A 1991 Publ. Astron. Soc. Pac. 103 209
[175] Koyama K, Petre R, Gotthelf E V, Hwang U, Matsura M, Ozaki M and Holt S S 1995 Nature 378 255
[176] Keohane J W, Petre R, Gotthelf E V, Ozaki M and Koyama K 1997 Astrophys. J. 484 350
[177] Allen G E et al 1997 Astrophys. J. Lett. 487 L97
[178] Muraishi H et al 2000 Astron. Astrophys. 354 L57–61
[179] Pohl M 1996 Astron. Astrophys. 307 L57
[180] Mastichiadis A and de Jager O C 1996 Astron. Astrophys. 311 L5
[181] Aharonian F A and Atoyan A M 1999 Astron. Astrophys. 351 330–40
[182] Remington B A, Drake R P, Takabe H and Arnett D 2000 Phys. Plasmas 7 1641–52
[183] Drake R, Carroll J J, Estabrook K, Glendinning S G, Remington B A, Wallace R and McCray R 1998 Astrophys.
J. Lett. 500 L157
[184] Ryutov D, Drake R P, Kane J, Liang E, Remington B A and Wood-Vasey W M 1999 Astrophys. J. 518 821
[185] Drury L O’C and Mendonca T J 2000 Phys. Plasmas 7 5148–52
[186] Drake R P 2000 Phys. Plasmas 7 4690–8
[187] Fulbright M S and Reynolds S P 1990 Astrophys. J. 357 591
[188] Axford A I, Fisk L A and Lee M A 1987 Proc. 20th Int. Cosmic Ray Conf. (Moscow) vol 2 187
[189] Ratkiewicz R, Axford W I and McKenzie J F 1994 Astron. Astrophys. 291 935–42
[190] Chevalier R 1983 Astrophys. J. 272 765
[191] Liszt H and Lukas R 1999 Astron. Astrophys. 347 258–65
[192] Kamae T, Ohsugi T, Thompson D J and Watanabe K 1999 Cospar 98 Symp. Adv. Space Res. at press
(Kamae T, Ohsugi T, Thompson D J and Watanabe K 1999 Preprint astro-ph/9901187)
