
First Edition, 2012

ISBN 978-81-323-2072-2

All rights reserved.

Published by:
Library Press
4735/22 Prakashdeep Bldg,
Ansari Road, Darya Ganj,
Delhi - 110002
Email: info@wtbooks.com
Table of Contents

Chapter 1 - Introduction to Function

Chapter 2 - Inverse Function

Chapter 3 - Special Functions & Implicit and Explicit Functions

Chapter 4 - Function Composition

Chapter 5 - Continuous Function

Chapter 6 - Additive Function

Chapter 7 - Algebraic Function

Chapter 8 - Analytic Function

Chapter 9 - Completely Multiplicative Function and Concave Function

Chapter 10 - Convex Function

Chapter 11 - Differentiable Function

Chapter 12 - Elementary Function and Entire Function

Chapter 13 - Even and Odd Functions

Chapter 14 - Harmonic Function

Chapter 15 - Holomorphic Function

Chapter 16 - Homogeneous Function

Chapter 17 - Indicator Function

Chapter 18 - Injective Function

Chapter 19 - Measurable Function


Chapter 1

Introduction to Function

Graph of an example function. Both the domain and the range in the picture are the set of real numbers between −1 and 1.5.

The mathematical concept of a function expresses the intuitive idea that one quantity (the
argument of the function, also known as the input) completely determines another
quantity (the value, or the output). A function assigns a unique value to each input of a
specified type. The argument and the value may be real numbers, but they can also be
elements from any given sets: the domain and the codomain of the function. An example
of a function with the real numbers as both its domain and codomain is the function f(x) =
2x, which assigns to every real number the real number with twice its value. In this case,
it is written that f(5) = 10.
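The doubling function just described can be rendered in a few lines of Python; this is an illustrative sketch, not part of the original text:

```python
# A sketch of the example function f(x) = 2x, which assigns to every
# real number the real number with twice its value.
def f(x):
    return 2 * x

print(f(5))  # prints 10, matching f(5) = 10 in the text
```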

In addition to elementary functions on numbers, functions include maps between
algebraic structures like groups and maps between geometric objects like manifolds. In
the abstract set-theoretic approach, a function is a relation between the domain and the
codomain that associates each element in the domain with exactly one element in the
codomain. An example of a function with domain {A,B,C} and codomain {1,2,3}
associates A with 1, B with 2, and C with 3.

There are many ways to describe or represent functions: by a formula, by an algorithm
that computes it, by a plot or a graph. A table of values is a common way to specify a
function in statistics, physics, chemistry, and other sciences. A function may also be
described through its relationship to other functions, for example, as the inverse function
or a solution of a differential equation. There are uncountably many different functions
from the set of natural numbers to itself, most of which cannot be expressed with a
formula or an algorithm.

In a setting where they have numerical outputs, functions may be added and multiplied,
yielding new functions. Collections of functions with certain properties, such as
continuous functions and differentiable functions, usually required to be closed under
certain operations, are called function spaces and are studied as objects in their own right,
in such disciplines as real analysis and complex analysis. An important operation on
functions, which distinguishes them from numbers, is the composition of functions.

Overview
Because functions are so widely used, many traditions have grown up around their use.
The symbol for the input to a function is often called the independent variable or
argument and is often represented by the letter x or, if the input is a particular time, by
the letter t. The symbol for the output is called the dependent variable or value and is
often represented by the letter y. The function itself is most often called f, and thus the
notation y = f(x) indicates that a function named f has an input named x and an output
named y.

A function takes an input, x, and returns an output, ƒ(x). One metaphor describes the
function as a "machine" or "black box" that converts the input into the output.
The set of all permitted inputs to a given function is called the domain of the function.
The set of all resulting outputs is called the image or range of the function. The image is
often a subset of some larger set, called the codomain of a function. Thus, for example,
the function f(x) = x² could take as its domain the set of all real numbers, as its image the
set of all non-negative real numbers, and as its codomain the set of all real numbers. In
that case, we would describe f as a real-valued function of a real variable. Sometimes,
especially in computer science, the term "range" refers to the codomain rather than the
image, so care needs to be taken when using the word.
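The distinction between image and codomain can be made concrete with a small sketch; the sample domain below is an illustrative choice:

```python
# For f(x) = x^2, the image (outputs actually attained) is a subset of
# the codomain (all real numbers): only non-negative values occur.
def f(x):
    return x ** 2

domain_sample = [-2, -1, 0, 1, 2]
image = {f(x) for x in domain_sample}
print(image)  # {0, 1, 4}
```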

It is usual practice in mathematics to introduce functions with temporary names like ƒ.
For example, ƒ(x) = 2x+1 implies ƒ(3) = 7; when a name for the function is not needed,
the form y = 2x+1 may be used. If a function is often used, it may be given a more
permanent name as, for example, Square(x) = x².

Functions need not act on numbers: the domain and codomain of a function may be
arbitrary sets. One example of a function that acts on non-numeric inputs takes English
words as inputs and returns the first letter of the input word as output. Furthermore,
functions need not be described by any expression, rule or algorithm: indeed, in some
cases it may be impossible to define such a rule. For example, the association between
inputs and outputs in a choice function often lacks any fixed rule, although each input
element is still associated to one and only one output.

A function of two or more variables is considered in formal mathematics as having a
domain consisting of ordered pairs or tuples of the argument values. For example,
Sum(x,y) = x+y operating on integers is the function Sum with a domain consisting of
pairs of integers. Sum then has a domain consisting of elements like (3,4), a codomain of
integers, and an association between the two that can be described by a set of ordered
pairs like ((3,4), 7). Evaluating Sum(3,4) then gives the value 7 associated with the pair
(3,4).
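A minimal sketch of Sum as a function on ordered pairs, following the view above; modeling the pair as a Python tuple is an assumption of this sketch:

```python
# Sum takes a single argument -- an ordered pair of integers -- and
# returns the value associated with that pair.
def Sum(pair):
    x, y = pair
    return x + y

print(Sum((3, 4)))  # prints 7, the value associated with (3, 4)
```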

A family of objects indexed by a set is equivalent to a function. For example, the
sequence 1, 1/2, 1/3, ..., 1/n, ... can be written as the ordered sequence ⟨1/n⟩ where n is a
natural number, or as a function f(n) = 1/n from the set of natural numbers into the set of
rational numbers.

Dually, a surjective function partitions its domain into disjoint sets indexed by the
codomain. This partition is known as the kernel of the function, and the parts are called
the fibers or level sets of the function at each element of the codomain. (A non-surjective
function divides its domain into disjoint and possibly-empty subsets).
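For a finite domain the partition into fibers can be computed directly; the helper name `fibers` is an illustrative choice, not standard library terminology:

```python
from collections import defaultdict

# Group a finite domain into the fibers (level sets) of a function:
# each fiber collects all inputs sharing one output value.
def fibers(f, domain):
    parts = defaultdict(set)
    for x in domain:
        parts[f(x)].add(x)
    return dict(parts)

# The fibers of x mod 3 over 0..8 partition the domain into three
# disjoint sets, indexed by the possible remainders.
print(fibers(lambda x: x % 3, range(9)))
```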

Definition
One precise definition of a function is that it consists of an ordered triple of sets, which
may be written as (X, Y, F). X is the domain of the function, Y is the codomain, and F is a
set of ordered pairs. In each of these ordered pairs (a, b), the first element a is from the
domain, the second element b is from the codomain, and every element in the domain is
the first element in one and only one ordered pair. The set of all b is known as the image
of the function. Some authors use the term "range" to mean the image, others to mean the
codomain.

The notation ƒ: X → Y indicates that ƒ is a function with domain X and codomain Y.

In most practical situations, the domain and codomain are understood from context, and
only the relationship between the input and output is given. Thus

ƒ: X → Y, x ↦ ƒ(x)

is usually written as

y = ƒ(x)

The graph of a function is its set of ordered pairs. Such a set can be plotted on a pair of
coordinate axes; for example, (3, 9) is the point of intersection of the lines x = 3 and
y = 9.

A function is a special case of a more general mathematical concept, the relation, for
which the restriction that each element of the domain appear as the first element in one
and only one ordered pair is removed (or, in other words, the restriction that each input be
associated to exactly one output). A relation is "single-valued" or "functional" when for
each element of the domain set, the graph contains at most one ordered pair (and possibly
none) with it as a first element. A relation is called "left-total" or simply "total" when for
each element of the domain, the graph contains at least one ordered pair with it as a first
element (and possibly more than one). A relation that is both left-total and single-valued
is a function.

In some parts of mathematics, including recursion theory and functional analysis, it is


convenient to study partial functions in which some values of the domain have no
association in the graph; i.e., single-valued relations. For example, the function f such that
f(x) = 1/x does not define a value for x = 0, and so is only a partial function from the real
line to the real line. The term total function can be used to stress the fact that every
element of the domain does appear as the first element of an ordered pair in the graph. In
other parts of mathematics, non-single-valued relations are similarly conflated with
functions: these are called multivalued functions, with the corresponding term single-
valued function for ordinary functions.
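The partial function ƒ(x) = 1/x can be sketched as follows; representing "no value" by `None` is an assumption of this sketch, one of several reasonable conventions:

```python
# 1/x as a partial function on the reals: x = 0 has no associated value.
def reciprocal(x):
    if x == 0:
        return None  # 0 lies outside the domain of definition
    return 1 / x

print(reciprocal(4))  # 0.25
print(reciprocal(0))  # None
```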

Some authors (especially in set theory) define a function as simply its graph f, with the
restriction that the graph should not contain two distinct ordered pairs with the same first
element. Indeed, given such a graph, one can construct a suitable triple by taking the set
of all first elements as the domain and the set of all second elements as the codomain: this
automatically causes the function to be total and surjective. However, most authors in
advanced mathematics outside of set theory prefer the greater power of expression
afforded by defining a function as an ordered triple of sets.

Many operations in set theory, such as the power set, have the class of all sets as their
domain; therefore, although they are informally described as functions, they do not fit the
set-theoretical definition outlined above.

Vocabulary
A specific input in a function is called an argument of the function. For each argument
value x, the corresponding unique y in the codomain is called the function value at x, the
output of ƒ for an argument x, or the image of x under ƒ. The image of x may be written
as ƒ(x) or as y.

The graph of a function ƒ is the set of all ordered pairs (x, ƒ(x)), for all x in the domain X.
If X and Y are subsets of R, the real numbers, then this definition coincides with the
familiar sense of "graph" as a picture or plot of the function, with the ordered pairs being
the Cartesian coordinates of points.

A function can also be called a map or a mapping. Some authors, however, use the terms
"function" and "map" to refer to different types of functions. Other specific types of
functions include functionals and operators.

Notation

Formal description of a function typically involves the function's name, its domain, its
codomain, and a rule of correspondence. Thus we frequently see a two-part notation, an
example being

ƒ: N → R, n ↦ n/π

where the first part is read:

"ƒ is a function from N to R" (one often writes informally "Let ƒ: X → Y" to mean
"Let ƒ be a function from X to Y"), or
"ƒ is a function on N into R", or
"ƒ is an R-valued function of an N-valued variable",

and the second part is read:

ƒ maps n to n/π.

Here the function named "ƒ" has the natural numbers as domain, the real numbers as
codomain, and maps n to itself divided by π. Less formally, this long form might be
abbreviated

ƒ(n) = n/π,

where f(n) is read as "f as function of n" or "f of n". There is some loss of information: we
no longer are explicitly given the domain N and codomain R.

It is common to omit the parentheses around the argument when there is little chance of
confusion, thus: sin x; this is known as prefix notation. Writing the function after its
argument, as in x ƒ, is known as postfix notation; for example, the factorial function is
customarily written n!, even though its generalization, the gamma function, is written
Γ(n). Parentheses are still used to resolve ambiguities and denote precedence, though in
some formal settings the consistent use of either prefix or postfix notation eliminates the
need for any parentheses.

Functions with multiple inputs and outputs

The concept of function can be extended to an object that takes a combination of two (or
more) argument values to a single result. This intuitive concept is formalized by a
function whose domain is the Cartesian product of two or more sets.

For example, consider the function that associates two integers to their product: ƒ(x, y) =
x·y. This function can be defined formally as having domain Z × Z, the set of all integer
pairs; codomain Z; and, for graph, the set of all pairs ((x, y), x·y). Note that the first
component of any such pair is itself a pair (of integers), while the second component is a
single integer.

The function value of the pair (x, y) is ƒ((x, y)). However, it is customary to drop one set of
parentheses and consider ƒ(x, y) a function of two variables, x and y. Functions of two
variables may be plotted on three-dimensional Cartesian space as ordered triples of the form
(x, y, f(x, y)).

The concept can be extended still further by considering a function that also produces
output expressed as several variables. For example, consider the function swap(x,
y) = (y, x) with domain R × R and codomain R × R as well. The pair (y, x) is a single value
in the codomain seen as a Cartesian product.

Currying

An alternative approach to handling functions with multiple arguments is to transform
them into a chain of functions that each takes a single argument. For instance, one can
interpret Add(3,5) to mean "first produce a function that adds 3 to its argument, and then
apply the 'Add 3' function to 5". This transformation is called currying: Add 3 is
curry(Add) applied to 3. There is a bijection between the function spaces C^(A×B) and (C^B)^A.
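Currying can be sketched directly; the helper `curry` below is illustrative and handles only the two-argument case:

```python
# curry transforms a two-argument function into a chain of
# one-argument functions: curry(Add)(3) is the "Add 3" function.
def curry(f):
    return lambda a: lambda b: f(a, b)

def Add(x, y):
    return x + y

add3 = curry(Add)(3)  # a function that adds 3 to its argument
print(add3(5))        # 8, the same value as Add(3, 5)
```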

When working with curried functions it is customary to use prefix notation with function
application considered left-associative, since juxtaposition of multiple arguments, as in
(ƒ x y), naturally maps to evaluation of a curried function.

Binary operations

The familiar binary operations of arithmetic, addition and multiplication, can be viewed
as functions from R × R to R. This view is generalized in abstract algebra, where n-ary
functions are used to model the operations of arbitrary algebraic structures. For example,
an abstract group is defined as a set X and a function from X × X to X that satisfies
certain properties.

Traditionally, addition and multiplication are written in infix notation: x + y and x·y
instead of +(x, y) and ·(x, y).

Injective and surjective functions

Three important kinds of function are the injections (or one-to-one functions), which
have the property that if ƒ(a) = ƒ(b) then a must equal b; the surjections (or onto
functions), which have the property that for every y in the codomain there is an x in the
domain such that ƒ(x) = y; and the bijections, which are both one-to-one and onto. This
nomenclature was introduced by the Bourbaki group.

When the definition of a function by its graph only is used, since the codomain is not
defined, the "surjection" must be accompanied with a statement about the set the function
maps onto. For example, we might say ƒ maps onto the set of all real numbers.
Function composition

A composite function g(f(x)) can be visualized as the combination of two "machines".
The first takes input x and outputs f(x). The second takes f(x) and outputs g(f(x)).

The function composition of two or more functions takes the output of one or more
functions as the input of others. The functions ƒ: X → Y and g: Y → Z can be composed
by first applying ƒ to an argument x to obtain y = ƒ(x) and then applying g to y to obtain z
= g(y). The composite function formed in this way from general ƒ and g may be written

g ∘ ƒ

This notation follows the form such that

(g ∘ ƒ)(x) = g(ƒ(x)).

The function on the right acts first and the function on the left acts second, reversing
English reading order. We remember the order by reading the notation as "g of ƒ". The
order is important, because rarely do we get the same result both ways. For example,
suppose ƒ(x) = x² and g(x) = x+1. Then g(ƒ(x)) = x²+1, while ƒ(g(x)) = (x+1)², which is
x²+2x+1, a different function.
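The example above can be checked with a small sketch; the helper name `compose` is an illustrative choice:

```python
# compose(g, f) builds the composite "g of f": f acts first, g second.
def compose(g, f):
    return lambda x: g(f(x))

f = lambda x: x ** 2   # f(x) = x^2
g = lambda x: x + 1    # g(x) = x + 1

print(compose(g, f)(3))  # g(f(3)) = 3^2 + 1 = 10
print(compose(f, g)(3))  # f(g(3)) = (3 + 1)^2 = 16 -- order matters
```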
In a similar way, the function given above by the formula y = 5x − 20x³ + 16x⁵ can be
obtained by composing several functions, namely the addition, negation, and
multiplication of real numbers.

An alternative to the colon notation, convenient when functions are being composed,
writes the function name above the arrow. For example, if ƒ is followed by g, where g
produces the complex number e^(ix), we may write
A more elaborate form of this is the commutative diagram.

Identity function

The unique function over a set X that maps each element to itself is called the identity
function for X, and typically denoted by idX. Each set has its own identity function, so the
subscript cannot be omitted unless the set can be inferred from context. Under
composition, an identity function is "neutral": if ƒ is any function from X to Y, then

ƒ ∘ idX = ƒ and idY ∘ ƒ = ƒ.
Restrictions and extensions

Informally, a restriction of a function is the result of trimming its domain.

More precisely, if ƒ is a function from a set X to Y, and S is any subset of X, the restriction
of ƒ to S is the function ƒ|S from S to Y such that ƒ|S(s) = ƒ(s) for all s in S.

If g is a restriction of ƒ, then it is said that ƒ is an extension of g.

The overriding of f: X → Y by g: W → Y (also called overriding union) is an extension
of g denoted as (f ⊕ g): (X ∪ W) → Y. Its graph is the set-theoretical union of the graphs
of g and f|X \ W. Thus, it relates any element of the domain of g to its image under g, and
any other element of the domain of f to its image under f. Overriding is an associative
operation; it has the empty function as an identity element. If f|X ∩ W and g|X ∩ W are
pointwise equal (e.g., if the domains of f and g are disjoint), then the union of f and g is
defined and is equal to their overriding union. This definition agrees with the definition
of union for binary relations.
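Overriding can be sketched with functions represented by their graphs as Python dicts; this representation is an assumption of the sketch:

```python
# The overriding of f by g: g wins wherever the domains overlap,
# and f supplies the values on the rest of its domain.
def override(f, g):
    result = dict(f)   # start from the graph of f
    result.update(g)   # g overrides f on the shared inputs
    return result

f = {1: 'a', 2: 'b'}
g = {2: 'B', 3: 'c'}
print(override(f, g))  # {1: 'a', 2: 'B', 3: 'c'}
```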

Image of a set

The concept of the image can be extended from the image of a point to the image of a set.
If A is any subset of the domain, then ƒ(A) is the subset of im ƒ consisting of all images
of elements of A. We say that ƒ(A) is the image of A under f.

Use of ƒ(A) to denote the image of a subset A ⊆ X is consistent so long as no subset of the
domain is also an element of the domain. In some fields (e.g., in set theory, where
ordinals are also sets of ordinals) it is convenient or even necessary to distinguish the two
concepts; the customary notation is ƒ[A] for the set { ƒ(x): x ∈ A }; some authors write
ƒ`x instead of ƒ(x), and ƒ``A instead of ƒ[A].

Notice that the image of ƒ is the image ƒ(X) of its domain, and that the image of ƒ is a
subset of its codomain.

Inverse image

The inverse image (or preimage, or more precisely, complete inverse image) of a
subset B of the codomain Y under a function ƒ is the subset of the domain X defined by

ƒ⁻¹(B) = { x ∈ X : ƒ(x) ∈ B }.

So, for example, the preimage of {4, 9} under the squaring function is the set
{−3, −2, 2, 3}.

In general, the preimage of a singleton set (a set with exactly one element) may contain
any number of elements. For example, if ƒ(x) = 7 (a constant function), then the preimage of {5} is the empty
set but the preimage of {7} is the entire domain. Thus the preimage of an element in the
codomain is a subset of the domain. The usual convention about the preimage of an
element is that ƒ⁻¹(b) means ƒ⁻¹({b}), i.e.

ƒ⁻¹(b) = { x ∈ X : ƒ(x) = b }.

In the same way as for the image, some authors use square brackets to avoid confusion
between the inverse image and the inverse function. Thus they would write ƒ⁻¹[B] and
ƒ⁻¹[b] for the preimage of a set and a singleton.

The preimage of a singleton set is sometimes called a fiber. The term kernel can refer to a
number of related concepts.
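For a finite domain the preimage can be computed by direct search; the helper name `preimage` is an illustrative choice:

```python
# Preimage of a set B under f, restricted to an explicit finite domain.
def preimage(f, domain, B):
    return {x for x in domain if f(x) in B}

square = lambda x: x * x
print(preimage(square, range(-5, 6), {4, 9}))  # {-3, -2, 2, 3}
```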

Specifying a function
A function can be defined by any mathematical condition relating each argument to the
corresponding output value. If the domain is finite, a function ƒ may be defined by
simply tabulating all the arguments x and their corresponding function values ƒ(x). More
commonly, a function is defined by a formula, or (more generally) an algorithm: a
recipe that tells how to compute the value of ƒ(x) given any x in the domain.

There are many other ways of defining functions. Examples include piecewise
definitions, induction or recursion, algebraic or analytic closure, limits, analytic
continuation, infinite series, and as solutions to integral and differential equations. The
lambda calculus provides a powerful and flexible syntax for defining and combining
functions of several variables.

Computability

Functions that send integers to integers, or finite strings to finite strings, can sometimes
be defined by an algorithm, which gives a precise description of a set of steps for
computing the output of the function from its input. Functions definable by an algorithm
are called computable functions. For example, the Euclidean algorithm gives a precise
process to compute the greatest common divisor of two positive integers. Many of the
functions studied in the context of number theory are computable.
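The Euclidean algorithm mentioned above is short enough to state in full; this is the standard remainder form of the algorithm:

```python
# Greatest common divisor of two positive integers by repeatedly
# replacing (a, b) with (b, a mod b) until the remainder is 0.
def gcd(a, b):
    while b:
        a, b = b, a % b
    return a

print(gcd(252, 105))  # 21
```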

Fundamental results of computability theory show that there are functions that can be
precisely defined but are not computable. Moreover, in the sense of cardinality, almost all
functions from the integers to integers are not computable. The number of computable
functions from integers to integers is countable, because the number of possible
algorithms is. The number of all functions from integers to integers is higher: the same as
the cardinality of the real numbers. Thus most functions from integers to integers are not
computable. Specific examples of uncomputable functions are known, including the busy
beaver function and functions related to the halting problem and other undecidable
problems.

Function spaces
The set of all functions from a set X to a set Y is denoted by X → Y, by [X → Y], or by Y^X.

The latter notation is motivated by the fact that, when X and Y are finite and of size |X|
and |Y|, then the number of functions X → Y is |Y^X| = |Y|^|X|. This is an example of the
convention from enumerative combinatorics that provides notations for sets based on
their cardinalities. Other examples are the multiplication sign X × Y, used for the Cartesian
product, where |X × Y| = |X| · |Y|; the factorial sign X!, used for the set of permutations, where
|X!| = |X|!; and the binomial coefficient sign (X choose n), used for the set of n-element subsets,
where |(X choose n)| = (|X| choose n).

If ƒ: X → Y, it may reasonably be concluded that ƒ ∈ [X → Y].
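The counting rule |Y^X| = |Y|^|X| can be verified by enumeration for small finite sets; representing each function as a dict is an illustrative choice:

```python
from itertools import product

X = ['a', 'b']   # |X| = 2
Y = [0, 1, 2]    # |Y| = 3

# Each function X -> Y is one assignment of a Y-value to every element
# of X; product(Y, repeat=|X|) enumerates all such assignments.
functions = [dict(zip(X, values)) for values in product(Y, repeat=len(X))]
print(len(functions))    # 9
print(len(Y) ** len(X))  # 3^2 = 9, matching |Y|^|X|
```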

Pointwise operations

If ƒ: X → R and g: X → R are functions with a common domain of X and common
codomain of a ring R, then the sum function ƒ + g: X → R and the product function
ƒ·g: X → R can be defined as follows:

(ƒ + g)(x) = ƒ(x) + g(x) and (ƒ·g)(x) = ƒ(x)·g(x)

for all x in X.

This turns the set of all such functions into a ring. The binary operations in that ring have
as domain ordered pairs of functions, and as codomain functions. This is an example of
climbing up in abstraction, to functions of more complex types.

By taking some other algebraic structure A in the place of R, we can turn the set of all
functions from X to A into an algebraic structure of the same type in an analogous way.
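The pointwise definitions can be sketched as higher-order functions; the helper names here are illustrative:

```python
# Pointwise sum and product of functions sharing a domain and a
# ring-valued codomain: operate on outputs, argument by argument.
def pointwise_add(f, g):
    return lambda x: f(x) + g(x)

def pointwise_mul(f, g):
    return lambda x: f(x) * g(x)

f = lambda x: x + 1
g = lambda x: 2 * x

print(pointwise_add(f, g)(3))  # (3 + 1) + (2 * 3) = 10
print(pointwise_mul(f, g)(3))  # (3 + 1) * (2 * 3) = 24
```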

Other properties
There are many other special classes of functions that are important to particular branches
of mathematics, or particular applications. Here is a partial list:

bijection, injection and surjection, or singularly:
  o injective,
  o surjective, and
  o bijective function
continuous
differentiable, integrable
linear, polynomial, rational
algebraic, transcendental
trigonometric
fractal
odd or even
convex, monotonic, unimodal
holomorphic, meromorphic, entire
vector-valued
computable

History
Functions prior to Leibniz

Historically, some mathematicians can be regarded as having foreseen and come close to
a modern formulation of the concept of function. Among them is Oresme (1323-1382) . . .
In his theory, some general ideas about independent and dependent variable quantities
seem to be present.

Ponte further notes that "The emergence of a notion of function as an individualized
mathematical entity can be traced to the beginnings of infinitesimal calculus".
The notion of "function" in analysis

As a mathematical term, "function" was coined by Gottfried Leibniz, in a 1673 letter, to
describe a quantity related to a curve, such as a curve's slope at a specific point. The
functions Leibniz considered are today called differentiable functions. For this type of
function, one can talk about limits and derivatives; both are measurements of the output
or the change in the output as it depends on the input or the change in the input. Such
functions are the basis of calculus.

Johann Bernoulli "by 1718, had come to regard a function as any expression made up of a
variable and some constants", and Leonhard Euler during the mid-18th century used the
word to describe an expression or formula involving variables and constants, e.g.,
x² + 3x + 2.

Alexis Claude Clairaut (in approximately 1734) and Euler introduced the familiar
notation " f(x) ".

At first, the idea of a function was rather limited. Joseph Fourier, for example, claimed
that every function had a Fourier series, something no mathematician would claim today.
By broadening the definition of functions, mathematicians were able to study "strange"
mathematical objects such as continuous functions that are nowhere differentiable. These
functions were first thought to be only theoretical curiosities, and they were collectively
called "monsters" as late as the turn of the 20th century. However, powerful techniques
from functional analysis have shown that these functions are, in a precise sense, more
common than differentiable functions. Such functions have since been applied to the
modeling of physical phenomena such as Brownian motion.

During the 19th century, mathematicians started to formalize all the different branches of
mathematics. Weierstrass advocated building calculus on arithmetic rather than on
geometry, which favoured Euler's definition over Leibniz's.

Dirichlet and Lobachevsky are traditionally credited with independently giving the
modern "formal" definition of a function as a relation in which every first element has a
unique second element. Eves asserts that "the student of mathematics usually meets the
Dirichlet definition of function in his introductory course in calculus", but Dirichlet's
claim to this formalization is disputed by Imre Lakatos:

There is no such definition in Dirichlet's works at all. But there is ample evidence that he
had no idea of this concept. In his [1837], for instance, when he discusses piecewise
continuous functions, he says that at points of discontinuity the function has two values:
...
(Proofs and Refutations, 151, Cambridge University Press 1976.)

In the context of "the Differential Calculus" George Boole defined (circa 1849) the notion
of a function as follows:
"That quantity whose variation is uniform . . . is called the independent variable. That
quantity whose variation is referred to the variation of the former is said to be a function
of it. The Differential calculus enables us in every case to pass from the function to the
limit. This it does by a certain Operation. But in the very Idea of an Operation is . . . the
idea of an inverse operation. To effect that inverse operation in the present instance is the
business of the Int[egral] Calculus."

The logician's "function" prior to 1850

Logicians of this time were primarily involved with analyzing syllogisms (the 2000 year-
old Aristotelian forms and otherwise), or as Augustus De Morgan (1847) stated it: "the
examination of that part of reasoning which depends upon the manner in which
inferences are formed, and the investigation of general maxims and rules for constructing
arguments". At this time the notion of (logical) "function" is not explicit, but at least in
the work of De Morgan and George Boole it is implied: we see abstraction of the
argument forms, the introduction of variables, the introduction of a symbolic algebra with
respect to these variables, and some of the notions of set theory.

De Morgan's 1847 "FORMAL LOGIC OR, The Calculus of Inference, Necessary and
Probable" observes that "[a] logical truth depends upon the structure of the statement, and
not upon the particular matters spoken of"; he wastes no time (preface page i) abstracting:
"In the form of the proposition, the copula is made as abstract as the terms". He
immediately (p. 1) casts what he calls "the proposition" (present-day propositional
function or relation) into a form such as "X is Y", where the symbols X, "is", and Y
represent, respectively, the subject, copula, and predicate. While the word "function"
does not appear, the notion of "abstraction" is there, "variables" are there, the notion of
inclusion in his symbolism ("all of the … is in the …", p. 9) is there, and lastly a new
symbolism for logical analysis of the notion of "relation" (he uses the word with respect
to this example " X)Y " (p. 75)) is there:

" A1 X)Y To take an X it is necessary to take a Y" [or To be an X it is necessary to be a Y]
" A1 Y)X To take a Y it is sufficient to take an X" [or To be a Y it is sufficient to be an X], etc.

In his 1848 The Nature of Logic Boole asserts that "logic . . . is in a more especial sense
the science of reasoning by signs", and he briefly discusses the notions of "belonging to"
and "class": "An individual may possess a great variety of attributes and thus belonging
to a great variety of different classes". Like De Morgan he uses the notion of "variable"
drawn from analysis; he gives an example of "represent[ing] the class oxen by x and that
of horses by y and the conjunction and by the sign + . . . we might represent the aggregate
class oxen and horses by x + y".
The logicians' "function" 1850-1950

Eves observes "that logicians have endeavored to push down further the starting level of
the definitional development of mathematics and to derive the theory of sets, or classes,
from a foundation in the logic of propositions and propositional functions". But by the
late 19th century the logicians' research into the foundations of mathematics was
undergoing a major split. The direction of the first group, the Logicists, can probably be
summed up best by Bertrand Russell 1903:9 -- "to fulfil two objects, first, to show that all
mathematics follows from symbolic logic, and secondly to discover, as far as possible,
what are the principles of symbolic logic itself."

The second group of logicians, the set-theorists, emerged with Georg Cantor's "set
theory" (1870-1890) but were driven forward partly as a result of Russell's discovery of a
paradox that could be derived from Frege's conception of "function", but also as a
reaction against Russell's proposed solution. Zermelo's set-theoretic response was his
1908 Investigations in the foundations of set theory I -- the first axiomatic set theory; here
too the notion of "propositional function" plays a role.

George Boole's The Laws of Thought 1854; John Venn's Symbolic Logic 1881

In his An Investigation into the laws of thought Boole now defined a function in terms of
a symbol x as follows:

"8. Definition.-- Any algebraic expression involving symbol x is termed a function of x,
and may be represented by the abbreviated form f(x)"

Boole then used algebraic expressions to define both algebraic and logical notions, e.g.,
1 − x is logical NOT(x), xy is the logical AND(x, y), x + y is the logical OR(x, y), x(x + y) is
xx + xy, and "the special law" x·x = x² = x.

In his 1881 Symbolic Logic Venn was using the words "logical function" and the
contemporary symbolism (x = f(y), y = f⁻¹(x), cf page xxi) plus the circle-diagrams
historically associated with Venn to describe "class relations", the notions "'quantifying'
our predicate", "propositions in respect of their extension", "the relation of inclusion and
exclusion of two classes to one another", and "propositional function" (all on p. 10), the
bar over a variable to indicate not-x (page 43), etc. Indeed he equated unequivocally the
notion of "logical function" with "class" [modern "set"]: "... on the view adopted in this
book, f(x) never stands for anything but a logical class. It may be a compound class
aggregated of many simple classes; it may be a class indicated by certain inverse logical
operations, it may be composed of two groups of classes equal to one another, or what is
the same thing, their difference declared equal to zero, that is, a logical equation. But
however composed or derived, f(x) with us will never be anything else than a general
expression for such logical classes of things as may fairly find a place in ordinary Logic".

Frege's Begriffsschrift 1879

Gottlob Frege's Begriffsschrift (1879) preceded Giuseppe Peano (1889), but Peano had
no knowledge of Frege 1879 until after he had published his 1889. Both writers strongly
influenced Bertrand Russell (1903). Russell in turn influenced much of 20th-century
mathematics and logic through his Principia Mathematica (1913) jointly authored with
Alfred North Whitehead.

At the outset Frege abandons the traditional "concepts subject and predicate", replacing
them with argument and function respectively, which he believes "will stand the test of
time. It is easy to see how regarding a content as a function of an argument leads to the
formation of concepts. Furthermore, the demonstration of the connection between the
meanings of the words if, and, not, or, there is, some, all, and so forth, deserves
attention".

Frege begins his discussion of "function" with an example: Begin with the expression
"Hydrogen is lighter than carbon dioxide". Now remove the sign for hydrogen (i.e., the
word "hydrogen") and replace it with the sign for oxygen (i.e., the word "oxygen"); this
makes a second statement. Do this again (using either statement) and substitute the sign
for nitrogen (i.e., the word "nitrogen") and note that "This changes the meaning in such a
way that "oxygen" or "nitrogen" enters into the relations in which "hydrogen" stood
before". There are three statements:

"Hydrogen is lighter than carbon dioxide."


"Oxygen is lighter than carbon dioxide."
"Nitrogen is lighter than carbon dioxide."

Now observe in all three a "stable component, representing the totality of [the] relations";
call this the function, i.e.,

"... is lighter than carbon dioxide", is the function.

Frege calls the argument of the function "[t]he sign [e.g., hydrogen, oxygen, or
nitrogen], regarded as replaceable by others that denotes the object standing in these
relations". He notes that we could have derived the function as "Hydrogen is lighter than .
. .." as well, with an argument position on the right; the exact observation is made by
Peano. Finally, Frege allows for the case of two (or more) arguments. For example,
remove "carbon dioxide" to yield the invariant part (the function) as:

"... is lighter than ... "

The one-argument function Frege generalizes into the form Φ(A), where A is the
argument and Φ( ) represents the function, whereas the two-argument function he
symbolizes as Ψ(A, B), with A and B the arguments and Ψ( , ) the function, and cautions
that "in general Ψ(A, B) differs from Ψ(B, A)". Using his unique symbolism he translates
for the reader the following symbolism:

"We can read |--- Φ(A) as "A has the property Φ". |--- Ψ(A, B) can be translated by "B
stands in the relation Ψ to A" or "B is a result of an application of the procedure Ψ to the
object A".

Peano's The Principles of Arithmetic 1889

Peano defined the notion of "function" in a manner somewhat similar to Frege, but
without the precision. First Peano defines the sign "K means class, or aggregate of
objects", the objects of which satisfy three simple equality-conditions: a = a, (a = b) = (b
= a), IF ((a = b) AND (b = c)) THEN (a = c). He then introduces φ, "a sign or an
aggregate of signs such that if x is an object of the class s, the expression φx denotes a
new object". Peano adds two conditions on these new objects: first, that the three
equality-conditions hold for the objects φx; secondly, that "if x and y are objects of class s
and if x = y, we assume it is possible to deduce φx = φy". Given all these conditions are
met, φ is a "function presign". Likewise he identifies a "function postsign". For example,
if φ is the function presign a+, then φx yields a+x; or if φ is the function postsign +a, then
xφ yields x+a.

Bertrand Russell's The Principles of Mathematics 1903

While the influence of Cantor and Peano was paramount, in Appendix A "The Logical
and Arithmetical Doctrines of Frege" of The Principles of Mathematics, Russell arrives at
a discussion of Frege's notion of function, "...a point in which Frege's work is very
important, and requires careful examination". In response to his 1902 exchange of letters
with Frege about the contradiction he discovered in Frege's Begriffsschrift Russell tacked
this section on at the last moment.

For Russell the bedeviling notion is that of "variable": "6. Mathematical propositions are
not only characterized by the fact that they assert implications, but also by the fact that
they contain variables. The notion of the variable is one of the most difficult with which
logic has to deal. For the present, I openly wish to make it plain that there are variables in
all mathematical propositions, even where at first sight they might seem to be absent. . . .
We shall find always, in all mathematical propositions, that the words any or some occur;
and these words are the marks of a variable and a formal implication".

As expressed by Russell, "the process of transforming constants in a proposition into
variables leads to what is called generalization, and gives us, as it were, the formal
essence of a proposition ... So long as any term in our proposition can be turned into a
variable, our proposition can be generalized; and so long as this is possible, it is the
business of mathematics to do it"; these generalizations Russell named "propositional
functions". Indeed he cites and quotes from Frege's Begriffsschrift and presents a vivid
example from Frege's 1891 Function und Begriff: that "the essence of the arithmetical
function 2·x³ + x is what is left when the x is taken away, i.e., in the above instance 2·( )³
+ ( ). The argument x does not belong to the function but the two taken together make the
whole". Russell agreed with Frege's notion of "function" in one sense: "He regards
functions -- and in this I agree with him -- as more fundamental than predicates and
relations" but Russell rejected Frege's "theory of subject and assertion", in particular "he
thinks that, if a term a occurs in a proposition, the proposition can always be analysed
into a and an assertion about a".

Evolution of Russell's notion of "function" 1908-1913

Russell would carry his ideas forward in his 1908 Mathematical logic as based on the
theory of types and into his and Whitehead's 1910-1913 Principia Mathematica. By the
time of Principia Mathematica Russell, like Frege, considered the propositional function
fundamental: "Propositional functions are the fundamental kind from which the more
usual kinds of function, such as sin x or log x or "the father of x", are derived. These
derivative functions ... are called descriptive functions. The functions of propositions ...
are a particular case of propositional functions".

Propositional functions: Because his terminology is different from the contemporary,
the reader may be confused by Russell's "propositional function". An example may help.
Russell writes a propositional function in its raw form, e.g., as φŷ: "ŷ is hurt". (Observe
the circumflex or "hat" over the variable ŷ). For our example, we will assign just 4 values
to the variable ŷ: "Bob", "This bird", "Emily the rabbit", and "y". Substitution of one of
these values for the variable ŷ yields a proposition; this proposition is called a "value" of the
propositional function. In our example there are four values of the propositional function,
e.g., "Bob is hurt", "This bird is hurt", "Emily the rabbit is hurt" and "y is hurt." A
proposition, if it is significant, i.e., if its truth is determinate, has a truth-value of
truth or falsity. If a proposition's truth value is "truth" then the variable's value is said to
satisfy the propositional function. Finally, per Russell's definition, "a class [set] is all
objects satisfying some propositional function" (p. 23). Note the word "all" -- this is how
the contemporary notions of "For all x" and "there exists at least one x" enter
the treatment (p. 15).

To continue the example: Suppose (from outside the mathematics/logic) one determines
that the propositions "Bob is hurt" has a truth value of "falsity", "This bird is hurt" has a
truth value of "truth", "Emily the rabbit is hurt" has an indeterminate truth value because
"Emily the rabbit" doesn't exist, and "y is hurt" is ambiguous as to its truth value because
the argument y itself is ambiguous. While the two propositions "Bob is hurt" and "This
bird is hurt" are significant (both have truth values), only the value "This bird" of the
variable ŷ satisfies the propositional function φŷ: "ŷ is hurt". When one goes to form the
class determined by φŷ: "ŷ is hurt", only "This bird" is included, given the four values "Bob", "This
bird", "Emily the rabbit" and "y" for the variable ŷ and their respective truth-values: falsity,
truth, indeterminate, ambiguous.
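Russell's class-formation can be mimicked in modern terms. The Python sketch below is
not Russell's notation: the propositional function becomes a predicate, the class becomes
the set of satisfying values, and the truth assignments are the chapter's example values,
hard-coded as an assumption.

```python
# An illustrative rendering of the propositional function "ŷ is hurt"
# as a predicate over a small universe of values.

INDETERMINATE = "indeterminate"

# The example's truth assignments, hard-coded (an assumption for the sketch):
truth_value = {
    "Bob": False,        # "Bob is hurt" has truth value falsity
    "This bird": True,   # "This bird is hurt" has truth value truth
    # "Emily the rabbit" does not exist, so its proposition is not significant
}

def is_hurt(y):
    """The propositional function: a truth value, or INDETERMINATE when the
    proposition is not significant."""
    return truth_value.get(y, INDETERMINATE)

values = ["Bob", "This bird", "Emily the rabbit"]

# Russell's class: all objects satisfying the propositional function.
the_class = {y for y in values if is_hurt(y) is True}
print(the_class)  # {'This bird'}
```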

Russell defines functions of propositions with arguments, and truth-functions f(p).
For example, suppose one were to form the "function of propositions with arguments" p1:
"NOT(p) AND q" and assign its variables the values of p: "Bob is hurt" and q: "This bird
is hurt". (We are restricted to the logical linkages NOT, AND, OR and IMPLIES, and we
can only assign "significant" propositions to the variables p and q.) Then the "function of
propositions with arguments" is p1: NOT("Bob is hurt") AND "This bird is hurt". To
determine the truth value of this "function of propositions with arguments" we submit it
to a "truth function", e.g., f(p1): f(NOT("Bob is hurt") AND "This bird is hurt"), which
yields a truth value of "truth".

The notion of a "many-one" functional relation: Russell first discusses the notion of
"identity", then defines a descriptive function (pages 30ff) as the unique value x that
satisfies the (two-variable) propositional function (i.e., "relation") x R y.

N.B. The reader should be warned here that the order of the variables is reversed! y is
the independent variable and x is the dependent variable, e.g., x = sin(y).

Russell symbolizes the descriptive function as "the object standing in relation to y": R'y
=DEF (ιx)(x R y). Russell repeats that "R'y is a function of y, but not a propositional
function [sic]; we shall call it a descriptive function. All the ordinary functions of
mathematics are of this kind. Thus in our notation "sin y" would be written "sin'y", and
"sin" would stand for the relation which sin'y has to y".

Hardy 1908

Hardy 1908, pp. 26–28, defined a function as a relation between two variables x and y
such that "to some values of x at any rate correspond values of y." He neither required the
function to be defined for all values of x nor to associate each value of x to a single value
of y. This broad definition of a function encompasses more relations than are ordinarily
considered functions in contemporary mathematics.

The Formalist's "function": David Hilbert's axiomatization of mathematics (1904-1927)

David Hilbert set himself the goal of "formalizing" classical mathematics "as a formal
axiomatic theory, and this theory shall be proved to be consistent, i.e., free from
contradiction". In his 1927 The Foundations of Mathematics Hilbert frames the notion of
function in terms of the existence of an "object":

13. A(a) → A(ε(A)). Here ε(A) stands for an object of which the proposition A(a)
certainly holds if it holds of any object at all; let us call ε the logical ε-function". [The
arrow indicates "implies".]

Hilbert then illustrates the three ways in which the ε-function is to be used: firstly as the
"for all" and "there exists" notions, secondly to represent the "object of which [a
proposition] holds", and lastly how to cast it into the choice function.

Recursion theory and computability: But the unexpected outcome of Hilbert's and his
student Bernays's effort was failure. At about the same time, in an effort to solve Hilbert's
Entscheidungsproblem, mathematicians set about to define what was meant by an
"effectively calculable function" (Alonzo Church 1936), i.e., "effective method" or
"algorithm", that is, an explicit, step-by-step procedure that would succeed in computing
a function. Various models for algorithms appeared, in rapid succession, including
Church's lambda calculus (1936), Stephen Kleene's μ-recursive functions (1936) and
Alan Turing's (1936-7) notion of replacing human "computers" with utterly-mechanical
"computing machines". It was shown that all of these models could compute the same
class of computable functions. Church's thesis holds that this class of functions exhausts
all the number-theoretic functions that can be calculated by an algorithm. The outcomes
of these efforts were vivid demonstrations that, in Turing's words, "there can be no
general process for determining whether a given formula U of the functional calculus K
[Principia Mathematica] is provable".

Development of the set-theoretic definition of "function"

Set theory began with the work of the logicians with the notion of "class" (modern "set"),
for example De Morgan (1847), Jevons (1880), Venn (1881), Frege (1879) and Peano
(1889). It was given a push by Georg Cantor's attempt to define the infinite in set-
theoretic treatment (1870–1890) and a subsequent discovery of an antinomy
(contradiction, paradox) in this treatment (Cantor's paradox), by Russell's discovery
(1902) of an antinomy in Frege's 1879 (Russell's paradox), by the discovery of more
antinomies in the early 20th century (e.g., the 1897 Burali-Forti paradox and the 1905
Richard paradox), and by resistance to Russell's complex treatment of logic and dislike of
his axiom of reducibility (1908, 1910–1913) that he proposed as a means to evade the
antinomies.

Russell's paradox 1902

In 1902 Russell sent a letter to Frege pointing out that Frege's 1879 Begriffsschrift
allowed a function to be an argument of itself: "On the other hand, it may also be that the
argument is determinate and the function indeterminate . . .." From this unconstrained
situation Russell was able to form a paradox:

"You state ... that a function, too, can act as the indeterminate element. This I formerly
believed, but now this view seems doubtful to me because of the following contradiction.
Let w be the predicate: to be a predicate that cannot be predicated of itself. Can w be
predicated of itself?"

Frege responded promptly that "Your discovery of the contradiction caused me the
greatest surprise and, I would almost say, consternation, since it has shaken the basis on
which I intended to build arithmetic".

From this point forward development of the foundations of mathematics became an
exercise in how to dodge "Russell's paradox", framed as it was in "the bare [set-theoretic]
notions of set and element".

Zermelo's set theory (1908) modified by Skolem (1922)

The notion of "function" appears as Zermelo's axiom III, the Axiom of Separation
(Axiom der Aussonderung). This axiom constrains us to use a propositional function
Φ(x) to "separate" a subset M_Φ from a previously formed set M:

"AXIOM III. (Axiom of separation). Whenever the propositional function Φ(x) is definite
for all elements of a set M, M possesses a subset M_Φ containing as elements precisely
those elements x of M for which Φ(x) is true".

As there is no universal set -- sets originate by way of Axiom II from elements of the
(non-set) domain B -- "...this disposes of the Russell antinomy so far as we are
concerned". But Zermelo's "definite criterion" is imprecise, and was later fixed by Weyl,
Fraenkel, Skolem, and von Neumann.

In fact Skolem in his 1922 referred to this "definite criterion" or "property" as a "definite
proposition":

"... a finite expression constructed from elementary propositions of the form a ∈ b or a = b
by means of the five operations [logical conjunction, disjunction, negation, universal
quantification, and existential quantification]".

van Heijenoort summarizes:

"A property is definite in Skolem's sense if it is expressed ... by a well-formed formula
in the simple predicate calculus of first order in which the sole predicate constants are ∈
and possibly =. ... Today an axiomatization of set theory is usually embedded in a logical
calculus, and it is Weyl's and Skolem's approach to the formulation of the axiom of
separation that is generally adopted".

In this quote the reader may observe a shift in terminology: nowhere is mentioned the
notion of "propositional function", but rather one sees the words "formula", "predicate
calculus", "predicate", and "logical calculus." This shift in terminology is discussed more
in the section that covers "function" in contemporary set theory.

The Wiener–Hausdorff–Kuratowski "ordered pair" definition 1914–1921

The history of the notion of "ordered pair" is not clear. As noted above, Frege (1879)
proposed an intuitive ordering in his definition of a two-argument function Ψ(A, B).
Norbert Wiener in his 1914 (see below) observes that his own treatment essentially
"revert(s) to Schröder's treatment of a relation as a class of ordered couples". Russell
(1903) considered the definition of a relation (such as Ψ(A, B)) as a "class of couples"
but rejected it:

"There is a temptation to regard a relation as definable in extension as a class of couples.
This has the formal advantage that it avoids the necessity for the primitive proposition
asserting that every couple has a relation holding between no other pairs of terms. But it
is necessary to give sense to the couple, to distinguish the referent [domain] from the
relatum [converse domain]: thus a couple becomes essentially distinct from a class of two
terms, and must itself be introduced as a primitive idea. ... It seems therefore more
correct to take an intensional view of relations, and to identify them rather with class-
concepts than with classes."

By 1910-1913 and Principia Mathematica Russell had given up on the requirement for
an intensional definition of a relation, stating that "mathematics is always concerned with
extensions rather than intensions" and "Relations, like classes, are to be taken in
extension". To demonstrate the notion of a relation in extension Russell now embraced
the notion of ordered couple: "We may regard a relation ... as a class of couples ... the
relation determined by φ(x, y) is the class of couples (x, y) for which φ(x, y) is true". In a
footnote he clarified his notion and arrived at this definition:

"Such a couple has a sense, i.e., the couple (x, y) is different from the couple (y, x) unless
x = y. We shall call it a "couple with sense" ... it may also be called an ordered couple".

But he goes on to say that he would not introduce the ordered couples further into his
"symbolic treatment"; he proposes his "matrix" and his unpopular axiom of reducibility
in their place.

An attempt to solve the problem of the antinomies led Russell to propose his "doctrine of
types" in an appendix B of his 1903 The Principles of Mathematics. In a few years he
would refine this notion and propose in his 1908 The Theory of Types two axioms of
reducibility, the purpose of which were to reduce (single-variable) propositional
functions and (dual-variable) relations to a "lower" form (and ultimately into a
completely extensional form); he and Alfred North Whitehead would carry this treatment
over to Principia Mathematica 1910-1913 with a further refinement called "a matrix".
The first axiom is *12.1; the second is *12.11. To quote Wiener the second axiom *12.11
"is involved only in the theory of relations". Both axioms, however, were met with
skepticism and resistance. By 1914 Norbert Wiener, using Whitehead and Russell's
symbolism, eliminated axiom *12.11 (the "two-variable" (relational) version of the axiom
of reducibility) by expressing a relation as an ordered pair, "using the null set". At
approximately the same time, Hausdorff (1914, p. 32) gave the definition of the ordered
pair (a, b) as { {a, 1}, {b, 2} }. A few years later Kuratowski (1921) offered a definition
that has been widely used ever since, namely { {a, b}, {a} }. As noted by Suppes (1960),
"This definition ... was historically important in reducing the theory of relations to the
theory of sets".
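Kuratowski's definition can be sketched directly in Python, using frozensets so that sets
may contain sets; the `kuratowski_pair` and `first` helpers are hypothetical names added
for illustration.

```python
# A sketch of the Kuratowski ordered pair: (a, b) = { {a, b}, {a} }.

def kuratowski_pair(a, b):
    return frozenset({frozenset({a, b}), frozenset({a})})

# The pair has a "sense": (a, b) differs from (b, a) unless a = b.
assert kuratowski_pair(1, 2) != kuratowski_pair(2, 1)
assert kuratowski_pair(3, 3) == kuratowski_pair(3, 3)

def first(pair):
    """Recover the first coordinate: the element of the singleton member."""
    for member in pair:
        if len(member) == 1:
            return next(iter(member))

assert first(kuratowski_pair(1, 2)) == 1
```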

Observe that while Wiener "reduced" the relational *12.11 form of the axiom of
reducibility he did not reduce nor otherwise change the propositional-function form
*12.1; indeed he declared this "essential to the treatment of identity, descriptions, classes
and relations".

Schönfinkel's notion of "function" as a many-one "correspondence" 1924

Where exactly the general notion of "function" as a many-one relationship derives from
is unclear. Russell in his 1920 Introduction to Mathematical Philosophy states that "It
should be observed that all mathematical functions result from one-many [sic --
contemporary usage is many-one] relations ... Functions in this sense are descriptive
functions". A reasonable possibility is the Principia Mathematica notion of "descriptive
function" -- R'y =DEF (ιx)(x R y): "the singular object that has a relation R to y". Whatever
the case, by 1924, Moses Schönfinkel expressed the notion, claiming it to be "well
known":

"As is well known, by function we mean in the simplest case a correspondence between
the elements of some domain of quantities, the argument domain, and those of a domain
of function values ... such that to each argument value there corresponds at most one
function value".

According to Willard Quine, Schönfinkel's 1924 "provide[s] for ... the whole sweep of
abstract set theory. The crux of the matter is that Schönfinkel lets functions stand as
arguments. For Schönfinkel, substantially as for Frege, classes are special sorts of
functions. They are propositional functions, functions whose values are truth values. All
functions, propositional and otherwise, are for Schönfinkel one-place functions".
Remarkably, Schönfinkel reduces all mathematics to an extremely compact functional
calculus consisting of only three functions: constancy, fusion (i.e., composition), and
mutual exclusivity. Quine notes that Haskell Curry (1958) carried this work forward
"under the head of combinatory logic".
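Schönfinkel's reduction of many-place functions to one-place functions is what is now
called currying. The Python sketch below illustrates it; the `curry` helper and the weights
table (echoing Frege's "lighter than" relation) are assumptions for the example, not from
the source.

```python
# A sketch of currying: a two-place function f(x, y) becomes a one-place
# function whose value is another one-place function, (curried f)(x)(y).

def curry(f):
    """Turn a two-argument function into a chain of one-argument functions."""
    return lambda x: lambda y: f(x, y)

# An illustrative two-place relation, with made-up relative weights:
def lighter_than(x, y):
    weights = {"hydrogen": 1, "nitrogen": 14, "oxygen": 16, "carbon dioxide": 44}
    return weights[x] < weights[y]

lighter = curry(lighter_than)

# Each application consumes one argument and yields a one-place function:
assert lighter("hydrogen")("carbon dioxide") is True
assert lighter("carbon dioxide")("oxygen") is False
```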

von Neumann's set theory 1925

By 1925 Abraham Fraenkel (1922) and Thoralf Skolem (1922) had amended Zermelo's
set theory of 1908. But von Neumann was not convinced that this axiomatization could
not lead to the antinomies. So he proposed his own theory, his 1925 An axiomatization of
set theory. It explicitly contains a "contemporary", set-theoretic version of the notion of
"function":

"[Unlike Zermelo's set theory] [w]e prefer, however, to axiomatize not "set" but
"function". The latter notion certainly includes the former. (More precisely, the two
notions are completely equivalent, since a function can be regarded as a set of pairs, and a
set as a function that can take two values.)".

His axiomatization creates two "domains of objects" called "arguments" (I-objects) and
"functions" (II-objects); where they overlap are the "argument functions" (I-II objects).
He introduces two "universal two-variable operations" -- (i) the operation [x, y] (read
"the value of the function x for the argument y") and (ii) the operation (x, y) (read "the
ordered pair x, y"), whose variables x and y must both be arguments and which itself
produces an argument (x, y). To clarify the function pair he notes that "Instead of f(x) we
write [f, x] to indicate that f, just like x, is to be regarded as a variable in this procedure".
And to avoid the "antinomies of naive set theory, in Russell's first of all ... we must
forgo treating certain functions as arguments". He adopts a notion from Zermelo to
restrict these "certain functions".

Since 1950

Notion of "function" in contemporary set theory

Both axiomatic and naive forms of Zermelo's set theory as modified by Fraenkel (1922)
and Skolem (1922) define "function" as a relation, define a relation as a set of ordered
pairs, and define an ordered pair as a set of two "dissymmetric" sets.

While the reader of Suppes (1960) Axiomatic Set Theory or Halmos (1970) Naive Set
Theory observes the use of function-symbolism in the axiom of separation, e.g., φ(x) (in
Suppes) and S(x) (in Halmos), they will see no mention of "proposition" or even "first
order predicate calculus". In their place are "expressions of the object language", "atomic
formulae", "primitive formulae", and "atomic sentences".

Kleene 1952 defines the words as follows: "In word languages, a proposition is expressed
by a sentence. Then a 'predicate' is expressed by an incomplete sentence or sentence
skeleton containing an open place. For example, "___ is a man" expresses a predicate ...
The predicate is a propositional function of one variable. Predicates are often called
'properties' ... The predicate calculus will treat of the logic of predicates in this general
sense of 'predicate', i.e., as propositional function".

The reason for the disappearance of the words "propositional function", e.g., in Suppes
(1960) and Halmos (1970), is explained by Alfred Tarski 1946 together with further
explanation of the terminology:

"An expression such as x is an integer, which contains variables and, on replacement of
these variables by constants, becomes a sentence, is called a SENTENTIAL [i.e.,
propositional, cf his index] FUNCTION. But mathematicians, by the way, are not very
fond of this expression, because they use the term "function" with a different meaning. ...
sentential functions and sentences composed entirely of mathematical symbols (and not
words of everyday language), such as: x + y = 5, are usually referred to by
mathematicians as FORMULAE. In place of "sentential function" we shall sometimes
simply say "sentence" -- but only in cases where there is no danger of any
misunderstanding".

For his part Tarski calls the relational form of function a "FUNCTIONAL RELATION or
simply a FUNCTION". After a discussion of this "functional relation" he asserts that:

"The concept of a function which we are considering now differs essentially from the
concepts of a sentential [propositional] and of a designatory function .... Strictly speaking
... [these] do not belong to the domain of logic or mathematics; they denote certain
categories of expressions which serve to compose logical and mathematical statements,
but they do not denote things treated of in those statements... . The term "function" in its
new sense, on the other hand, is an expression of a purely logical character; it designates
a certain type of things dealt with in logic and mathematics."

Further developments

The idea of structure-preserving functions, or homomorphisms, led to the abstract notion
of morphism, the key concept of category theory. More recently, the concept of functor
has been used as an analogue of a function in category theory.
Chapter 2

Inverse Function

A function f and its inverse f⁻¹. Because f maps a to 3, the inverse f⁻¹ maps 3 back to a.

In mathematics, if f is a function from a set A to a set B, then an inverse function for f is
a function from B to A, with the property that a round trip (a composition) from A to B to
A (or from B to A to B) returns each element of the initial set to itself. Thus, if an input x
into the function f produces an output y, then inputting y into the inverse function
produces the output x, and vice versa.

A function f that has an inverse is called invertible; the inverse function is then uniquely
determined by f and is denoted by f⁻¹ (read "f inverse", not to be confused with
exponentiation).

Definitions

If f maps X to Y, then f⁻¹ maps Y back to X.

Let f be a function whose domain is the set X, and whose codomain is the set Y. Then, if
it exists, the inverse of f is the function f⁻¹ with domain Y and codomain X, with the
property:

f(x) = y if and only if f⁻¹(y) = x.

Stated otherwise, a function is invertible if and only if its inverse relation is a function, in
which case the inverse relation is the inverse function.

Not all functions have an inverse. For this rule to be applicable, each element y ∈ Y must
correspond to exactly one element x ∈ X. This is generally stated as two conditions:

Every y ∈ Y corresponds to no more than one x ∈ X; a function f with this
property is called one-to-one, or information-preserving, or an injection.
Every y ∈ Y corresponds to at least one x ∈ X; a function f with this property
is called onto, or a surjection.

A function with both of these properties is called a bijection, so the above is often stated
as "a function is bijective if and only if it has an inverse function".
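For a finite function, the two conditions can be checked directly. The Python sketch
below represents a function as a dict from domain to codomain; `inverse` is a
hypothetical helper name introduced for illustration, not standard library code.

```python
# A sketch: a finite function, given as a dict, is invertible exactly when
# it is one-to-one (injective) and onto (surjective).

def inverse(f, codomain):
    """Return the inverse dict of f, or None if f is not a bijection onto codomain."""
    values = list(f.values())
    if len(set(values)) != len(values):   # not one-to-one: some y has two x's
        return None
    if set(values) != set(codomain):      # not onto: some y has no x
        return None
    return {y: x for x, y in f.items()}

f = {"a": 3, "b": 1, "c": 2}
assert inverse(f, {1, 2, 3}) == {3: "a", 1: "b", 2: "c"}
assert inverse({"a": 1, "b": 1}, {1}) is None      # fails injectivity
assert inverse({"a": 1}, {1, 2}) is None           # fails surjectivity
```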

In elementary mathematics, the domain is often assumed to be the real numbers, if not
otherwise specified, and the codomain is assumed to be the image. Most functions
encountered in elementary calculus do not have an inverse.
Example: squaring and square root functions

The function f(x) = x² may or may not be invertible, depending on the domain and
codomain.

If the domain is the real numbers, then each element in Y would correspond to two
different elements in X (±x), and therefore f would not be invertible. More precisely, the
square of x is not invertible because it is impossible to deduce from its output the sign of
its input. Such a function is called non-injective or information-losing. Notice that neither
the square root nor the principal square root function is the inverse of x² because the first
is not single-valued, and the second returns −x when x is negative.

If the domain and codomain are both the non-negative numbers, then the function is
injective and invertible, with the principal square root as its inverse; if the domain is the
non-positive numbers, it is again injective, and its inverse is the negation of the principal
square root.
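A minimal sketch of the round trip, assuming the standard `math.sqrt` as the principal
square root:

```python
import math

# On all the reals, x -> x*x loses the sign of its input; restricted to
# x >= 0, the principal square root inverts it exactly.

def square(x):
    return x * x

def principal_sqrt(y):
    return math.sqrt(y)   # inverse of square on the non-negative reals only

assert principal_sqrt(square(3.0)) == 3.0    # round trip succeeds for x >= 0
assert principal_sqrt(square(-3.0)) == 3.0   # sign lost: returns -x for x < 0
```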

Inverses in higher mathematics

The definition given above is commonly adopted in calculus. In higher mathematics, the
notation

f: X → Y

means "f is a function mapping elements of a set X to elements of a set Y". The source, X,
is called the domain of f, and the target, Y, is called the codomain. The codomain
contains the range of f as a subset, and is considered part of the definition of f.

When using codomains, the inverse of a function f: X → Y is required to have domain Y
and codomain X. For the inverse to be defined on all of Y, every element of Y must lie in
the range of the function f. A function with this property is called onto or a surjection.
Thus, a function with a codomain is invertible if and only if it is both one-to-one and
onto. Such a function is called a one-to-one correspondence or a bijection, and has the
property that every element y ∈ Y corresponds to exactly one element x ∈ X.

Inverses and composition

If f is an invertible function with domain X and range Y, then

f⁻¹(f(x)) = x, for every x ∈ X.

This statement is equivalent to the first of the above-given definitions of the inverse, and
it becomes equivalent to the second definition if Y coincides with the codomain of f.
Using the composition of functions we can rewrite this statement as follows:

f⁻¹ ∘ f = idX,

where idX is the identity function on the set X. In category theory, this statement is used
as the definition of an inverse morphism.

If we think of composition as a kind of multiplication of functions, this identity says that
the inverse of a function is analogous to a multiplicative inverse. This explains the origin
of the notation f⁻¹.

Note on notation

The superscript notation for inverses can sometimes be confused with other uses of
superscripts, especially when dealing with trigonometric and hyperbolic functions. To
avoid this confusion, the notations f[−1], or f with the "−1" written above the f, are
sometimes used.

It is important to realize that ƒ⁻¹(x) is not the same as ƒ(x)⁻¹. In ƒ⁻¹(x), the superscript
"−1" is not an exponent. A similar notation is used in dynamical systems for iterated
functions. For example, ƒ² denotes two iterations of the function ƒ; if ƒ(x) = x + 1, then
ƒ²(x) = (x + 1) + 1, or x + 2. In symbols:

ƒ²(x) = ƒ(ƒ(x)) = x + 2.

In calculus, ƒ⁽ⁿ⁾, with parentheses, denotes the nth derivative of a function ƒ. For instance:

ƒ⁽²⁾(x) = (d²/dx²) ƒ(x).
In trigonometry, for historical reasons, sin²(x) usually does mean the square of sin(x):

sin²(x) = (sin x)².
However, the expression sin⁻¹(x) does not represent the multiplicative inverse of
sin(x); it denotes the inverse function of sin(x) (actually a partial inverse; see below). To
avoid confusion, an inverse trigonometric function is often indicated by the prefix "arc".
For instance, the inverse sine is typically called the arcsine:

sin⁻¹(x) = arcsin(x).

The function (sin x)⁻¹ is the multiplicative inverse of the sine, and is called the cosecant.
It is usually denoted csc x:

csc x = 1 / sin x.

Hyperbolic functions behave similarly, using the prefix "ar", as in arsinh(x) for the
inverse function of sinh(x), and csch(x) for the multiplicative inverse of sinh(x).

Properties
Uniqueness

If an inverse function exists for a given function ƒ, it is unique: it must be the inverse
relation.

Symmetry

There is a symmetry between a function and its inverse. Specifically, if the inverse of ƒ is
ƒ⁻¹, then the inverse of ƒ⁻¹ is the original function ƒ. This follows because inversion of
relations is an involution.

This statement is an obvious consequence of the deduction that, for ƒ to be invertible, it
must be injective (first definition of the inverse) or bijective (second definition). The
property of symmetry can be concisely expressed by the following formula:

(ƒ⁻¹)⁻¹ = ƒ.

Inverse of a composition

The inverse of g ∘ ƒ is ƒ⁻¹ ∘ g⁻¹.

The inverse of a composition of functions is given by the formula

(g ∘ ƒ)⁻¹ = ƒ⁻¹ ∘ g⁻¹.

Notice that the order of g and ƒ has been reversed; to undo ƒ followed by g, we must
first undo g, and then undo ƒ.

For example, let ƒ(x) = x + 5, and let g(x) = 3x. Then the composition ƒ ∘ g is the function
that first multiplies by three and then adds five:

(ƒ ∘ g)(x) = 3x + 5.

To reverse this process, we must first subtract five, and then divide by three:

(ƒ ∘ g)⁻¹(y) = (y − 5)/3.

This is the composition (g⁻¹ ∘ ƒ⁻¹)(y).
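The arithmetic above can be checked directly; here is a minimal Python sketch (the `compose` helper and the sample input are our own choices, for illustration):

```python
# Verifying (f o g)^-1 = g^-1 o f^-1 for the example in the text.
def f(x):        # f(x) = x + 5
    return x + 5

def g(x):        # g(x) = 3x
    return 3 * x

def f_inv(y):    # undo f: subtract five
    return y - 5

def g_inv(y):    # undo g: divide by three
    return y / 3

def compose(outer, inner):
    return lambda x: outer(inner(x))

f_after_g = compose(f, g)          # first multiply by three, then add five
inverse = compose(g_inv, f_inv)    # first subtract five, then divide by three

assert f_after_g(4) == 17          # 3*4 + 5
assert inverse(17) == 4.0          # (17 - 5) / 3
```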

Self-inverses

If X is a set, then the identity function on X is its own inverse:

idX⁻¹ = idX.

More generally, a function ƒ: X → X is equal to its own inverse if and only if the
composition ƒ ∘ ƒ is equal to idX. Such a function is called an involution.
Inverses in calculus
Single-variable calculus is primarily concerned with functions that map real numbers to
real numbers. Such functions are often defined through formulas, such as:

ƒ(x) = (2x + 8)³.

A function from the real numbers to the real numbers possesses an inverse as long as it
is one-to-one, i.e. as long as the graph of the function passes the horizontal line test.

The following table shows several standard functions and their inverses:

Function ƒ(x)              Inverse ƒ⁻¹(y)                     Notes
x + a                      y − a
a − x                      a − y
mx                         y/m                                m ≠ 0
1/x                        1/y                                x, y ≠ 0
x²                         √y                                 x, y ≥ 0 only
x³                         ∛y                                 no restriction on x and y
x^p                        y^(1/p) (i.e. the pth root of y)   x, y ≥ 0 in general, p ≠ 0
e^x                        ln y                               y > 0
a^x                        log_a y                            y > 0 and a > 0
trigonometric functions    inverse trigonometric functions    various restrictions (see table below)
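A few rows of this table can be spot-checked in Python by verifying the round trip ƒ⁻¹(ƒ(x)) = x on each function's restricted domain (the constants a, m, p and the test points are our choices):

```python
import math

# Round-tripping a few rows of the table: f_inv(f(x)) == x
# on each function's restricted domain.
a, m, p = 2.0, 3.0, 4.0
pairs = [
    (lambda x: x + a,       lambda y: y - a),         # x + a  <->  y - a
    (lambda x: m * x,       lambda y: y / m),         # mx     <->  y/m,  m != 0
    (lambda x: 1 / x,       lambda y: 1 / y),         # 1/x is its own inverse
    (lambda x: x ** 2,      lambda y: math.sqrt(y)),  # x^2, restricted to x >= 0
    (lambda x: x ** p,      lambda y: y ** (1 / p)),  # x^p, x >= 0
    (lambda x: math.exp(x), lambda y: math.log(y)),   # e^x  <->  ln y
]
for f, f_inv in pairs:
    for x in (0.5, 1.0, 2.5):   # points inside every restricted domain
        assert math.isclose(f_inv(f(x)), x)
```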

Formula for the inverse

One approach to finding a formula for ƒ⁻¹, if it exists, is to solve the equation y = ƒ(x) for
x. For example, if ƒ is the function

ƒ(x) = (2x + 8)³,

then we must solve the equation y = (2x + 8)³ for x:

y = (2x + 8)³
y^(1/3) = 2x + 8
y^(1/3) − 8 = 2x
(y^(1/3) − 8)/2 = x.

Thus the inverse function ƒ⁻¹ is given by the formula

ƒ⁻¹(y) = (y^(1/3) − 8)/2.
Sometimes the inverse of a function cannot be expressed by a formula. For example, if ƒ
is the function

ƒ(x) = x + sin x,

then ƒ is one-to-one, and therefore possesses an inverse function ƒ⁻¹. There is no simple
formula for this inverse, since the equation y = x + sin x cannot be solved algebraically
for x.
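Although the inverse has no algebraic formula, it can still be evaluated numerically; a sketch using bisection (the search interval and tolerance are our choices), valid because ƒ is strictly increasing:

```python
import math

# f(x) = x + sin x has no algebraic inverse, but since it is
# one-to-one we can invert it numerically, here by bisection.
def f(x):
    return x + math.sin(x)

def f_inv(y, lo=-10.0, hi=10.0, tol=1e-12):
    # f is increasing, so bisect on the sign of f(mid) - y
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

x = 1.2345
assert math.isclose(f_inv(f(x)), x, abs_tol=1e-9)
```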

Graph of the inverse

The graphs of y = ƒ(x) and y = ƒ⁻¹(x). The dotted line is y = x.

If ƒ and ƒ⁻¹ are inverses, then the graph of the function

y = ƒ⁻¹(x)

is the same as the graph of the equation

x = ƒ(y).

This is identical to the equation y = ƒ(x) that defines the graph of ƒ, except that the roles
of x and y have been reversed. Thus the graph of ƒ⁻¹ can be obtained from the graph of ƒ
by switching the positions of the x and y axes. This is equivalent to reflecting the graph
across the line y = x.

Inverses and derivatives

A continuous function ƒ is one-to-one (and hence invertible) if and only if it is either
strictly increasing or strictly decreasing (with no local maxima or minima). For example,
the function

ƒ(x) = x³ + x

is invertible, since the derivative ƒ′(x) = 3x² + 1 is always positive.

If the function ƒ is differentiable, then the inverse ƒ⁻¹ will be differentiable as long as
ƒ′(x) ≠ 0. The derivative of the inverse is given by the inverse function theorem:

(ƒ⁻¹)′(y) = 1 / ƒ′(ƒ⁻¹(y)).

If we set x = ƒ⁻¹(y), then the formula above can be written

dx/dy = 1 / (dy/dx).

This result follows from the chain rule.
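The theorem can be illustrated numerically for ƒ(x) = x³ + x, comparing a finite-difference estimate of (ƒ⁻¹)′(y) with 1/ƒ′(ƒ⁻¹(y)); the bisection inverse, step size, and test point are our own choices:

```python
# Checking (f^-1)'(y) = 1 / f'(f^-1(y)) for f(x) = x^3 + x,
# using a finite difference for the left-hand side.
def f(x):
    return x**3 + x

def f_prime(x):
    return 3 * x**2 + 1

def f_inv(y, lo=-10.0, hi=10.0):
    for _ in range(200):        # bisection: f is strictly increasing
        mid = (lo + hi) / 2
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

x = 1.5
y = f(x)
h = 1e-6
lhs = (f_inv(y + h) - f_inv(y - h)) / (2 * h)   # (f^-1)'(y), numerically
rhs = 1 / f_prime(x)                            # 1 / f'(x), with x = f^-1(y)
assert abs(lhs - rhs) < 1e-4
```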

The inverse function theorem can be generalized to functions of several variables.


Specifically, a differentiable function ƒ: Rⁿ → Rⁿ is invertible in a neighborhood of a
point p as long as the Jacobian matrix of ƒ at p is invertible. In this case, the Jacobian of
ƒ⁻¹ at ƒ(p) is the matrix inverse of the Jacobian of ƒ at p.

Real-world examples
For example, let ƒ be the function that converts a temperature in degrees Celsius to a
temperature in degrees Fahrenheit:

F = ƒ(C) = (9/5)C + 32;

then its inverse function converts degrees Fahrenheit to degrees Celsius:

C = ƒ⁻¹(F) = (5/9)(F − 32).
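In code, the temperature example and its round trip look like this:

```python
# Celsius -> Fahrenheit and its inverse.
def c_to_f(c):
    return 9 * c / 5 + 32

def f_to_c(f):
    return (f - 32) * 5 / 9

assert c_to_f(100) == 212
assert f_to_c(212) == 100
assert f_to_c(c_to_f(-40)) == -40   # round trip (and the fixed point -40)
```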

Or, suppose ƒ assigns each child in a family its birth year. An inverse function would
output which child was born in a given year. However, if the family has twins (or triplets),
then the output cannot be known when the input is the common birth year. As well, if a
year is given in which no child was born, then a child cannot be named. But if each child
was born in a separate year, and if we restrict attention to the three years in which a child
was born, then we do have an inverse function, mapping each of those birth years to the
child born in that year.

Generalizations
Partial inverses

The square root of x is a partial inverse to ƒ(x) = x².

Even if a function ƒ is not one-to-one, it may be possible to define a partial inverse of ƒ
by restricting the domain. For example, the function

ƒ(x) = x²

is not one-to-one, since x² = (−x)². However, the function becomes one-to-one if we
restrict to the domain x ≥ 0, in which case

ƒ⁻¹(y) = √y.

(If we instead restrict to the domain x ≤ 0, then the inverse is the negative of the square
root of y.) Alternatively, there is no need to restrict the domain if we are content with the
inverse being a multivalued function:

ƒ⁻¹(y) = ±√y.

The inverse of this cubic function has three branches.

Sometimes this multivalued inverse is called the full inverse of ƒ, and the portions (such
as √x and −√x) are called branches. The most important branch of a multivalued function
(e.g. the positive square root) is called the principal branch, and its value at y is called
the principal value of ƒ⁻¹(y).

For a continuous function on the real line, one branch is required between each pair of
local extrema. For example, the inverse of a cubic function with a local maximum and a
local minimum has three branches (see the picture above).
The arcsine is a partial inverse of the sine function.

These considerations are particularly important for defining the inverses of trigonometric
functions. For example, the sine function is not one-to-one, since

sin(x + 2π) = sin(x)

for every real x (and more generally sin(x + 2πn) = sin(x) for every integer n). However,
the sine is one-to-one on the interval [−π/2, π/2], and the corresponding partial inverse is
called the arcsine. This is considered the principal branch of the inverse sine, so the
principal value of the inverse sine is always between −π/2 and π/2. The following table
describes the principal branch of each inverse trigonometric function:

Function   Range of usual principal value

sin⁻¹      −π/2 ≤ sin⁻¹(x) ≤ π/2
cos⁻¹      0 ≤ cos⁻¹(x) ≤ π
tan⁻¹      −π/2 < tan⁻¹(x) < π/2
cot⁻¹      0 < cot⁻¹(x) < π
sec⁻¹      0 ≤ sec⁻¹(x) < π/2 or π/2 < sec⁻¹(x) ≤ π
csc⁻¹      −π/2 ≤ csc⁻¹(x) < 0 or 0 < csc⁻¹(x) ≤ π/2
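Python's `math` module returns these principal values, which can be checked directly; note also that asin undoes sin only on the principal branch:

```python
import math

# The built-in inverse trig functions return principal values.
assert -math.pi / 2 <= math.asin(-0.5) <= math.pi / 2
assert 0 <= math.acos(-0.5) <= math.pi
assert -math.pi / 2 < math.atan(100.0) < math.pi / 2

# sin(asin(x)) = x on [-1, 1], but asin(sin(x)) need not return x
# for x outside the principal branch:
assert math.isclose(math.sin(math.asin(0.5)), 0.5)
assert not math.isclose(math.asin(math.sin(3.0)), 3.0)
```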

Left and right inverses

If ƒ: X → Y, a left inverse for ƒ (or retraction of ƒ) is a function g: Y → X such that

g ∘ ƒ = idX.

That is, the function g satisfies the rule

if ƒ(x) = y, then g(y) = x.

Thus, g must equal the inverse of ƒ on the range of ƒ, but may take any values for
elements of Y not in the range. A function ƒ has a left inverse if and only if it is injective.

A right inverse for ƒ (or section of ƒ) is a function h: Y → X such that

ƒ ∘ h = idY.

That is, the function h satisfies the rule

ƒ(h(y)) = y for all y ∈ Y.

Thus, h(y) may be any of the elements of X that map to y under ƒ. A function ƒ has a right
inverse if and only if it is surjective (though constructing such an inverse in general
requires the axiom of choice).

An inverse that is both a left and a right inverse must be unique; a one-sided inverse need
not be. Likewise, if g is a left inverse for ƒ, then g may or may not be a right inverse for ƒ;
and if g is a right inverse for ƒ, then g is not necessarily a left inverse for ƒ. For example,
let ƒ: R → [0, ∞) denote the squaring map, such that ƒ(x) = x² for all x in R, and let
g: [0, ∞) → R denote the square root map, such that g(x) = √x for all x ≥ 0. Then
ƒ(g(x)) = x for all x in [0, ∞); that is, g is a right inverse to ƒ. However, g is not a left
inverse to ƒ, since, e.g., g(ƒ(−1)) = 1 ≠ −1.
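The squaring example can be verified in a few lines:

```python
import math

# g(x) = sqrt(x) is a right inverse of the squaring map f,
# but not a left inverse.
def f(x):
    return x * x

def g(y):
    return math.sqrt(y)

assert f(g(9.0)) == 9.0     # f o g = id on [0, inf): right inverse
assert g(f(-1.0)) == 1.0    # but g(f(-1)) = 1 != -1: not a left inverse
```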

Preimages

If ƒ: X → Y is any function (not necessarily invertible), the preimage (or inverse image)
of an element y ∈ Y is the set of all elements of X that map to y:

ƒ⁻¹(y) = {x ∈ X : ƒ(x) = y}.

The preimage of y can be thought of as the image of y under the (multivalued) full
inverse of the function ƒ.

Similarly, if S is any subset of Y, the preimage of S is the set of all elements of X that map
into S:

ƒ⁻¹(S) = {x ∈ X : ƒ(x) ∈ S}.

For example, take the function ƒ: R → R, ƒ: x ↦ x². This function is not invertible
for reasons discussed above. Yet preimages may be defined for subsets of the codomain:

ƒ⁻¹({1, 4, 9, 16}) = {−4, −3, −2, −1, 1, 2, 3, 4}.

The preimage of a single element y ∈ Y (a singleton set {y}) is sometimes called the
fiber of y. When Y is the set of real numbers, it is common to refer to ƒ⁻¹(y) as a level set.
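Preimages are easy to compute over a finite sample domain; a sketch (the finite domain standing in for R is our simplification):

```python
# Preimages of f(x) = x^2 over a finite sample domain,
# computed as the set of inputs mapping into S.
def preimage(f, domain, S):
    return {x for x in domain if f(x) in S}

domain = range(-3, 4)          # {-3, ..., 3} as a stand-in for R
square = lambda x: x * x

assert preimage(square, domain, {4}) == {-2, 2}           # the fiber of 4
assert preimage(square, domain, {1, 4}) == {-2, -1, 1, 2}
assert preimage(square, domain, {5}) == set()             # empty preimage
```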
Chapter 3

Special Functions & Implicit and Explicit


Functions

Special functions
Special functions are particular mathematical functions which have more or less
established names and notations due to their importance in mathematical analysis,
functional analysis, physics, or other applications.

There is no general formal definition, but the list of mathematical functions contains
functions which are commonly accepted as special. In particular, elementary functions
are also considered as special functions.

Tables of special functions


Many special functions appear as solutions of differential equations or integrals of
elementary functions. Therefore, tables of integrals usually include descriptions of special
functions, and tables of special functions include most important integrals; at least, the
integral representation of special functions. Because symmetries of differential equations
are essential to both physics and mathematics, the theory of special functions is closely
related to the theory of Lie groups and Lie algebras, as well as certain topics in
mathematical physics.

Symbolic computation engines usually recognize the majority of special functions. Not
all such systems have efficient algorithms for the evaluation, especially in the complex
plane.

Notations used in special functions

In most cases, the standard notation is used for indicating a special function: the name of
the function, subscripts (if any), an open parenthesis, the arguments separated by
commas, and a close parenthesis. Such a notation allows easy translation of the
expressions into algorithmic languages, avoiding ambiguities. Functions with established
international notations are sin, cos, exp, erf, and erfc.
Sometimes, a special function has several names. The natural logarithm can be called
Log, log, or ln, depending on the context. Similarly, the tangent function may be
denoted Tan, tan, or tg (especially in Russian literature); arctangent may be called atan,
arctg, or tan⁻¹. Bessel functions may be written in several ways, such as Jn(x); usually,
the different notations refer to the same function.

Subscripts are often used to indicate arguments, typically integers. In a few cases, the
semicolon (;) or even backslash (\) is used as a separator. In this case, the translation to
algorithmic languages admits ambiguity and may lead to confusion.

Superscripts may indicate not only exponentiation, but modification of a function.


Examples include:

    cos³(x) usually indicates (cos(x))³;
    cos²(x) is typically (cos(x))², but never cos(cos(x));
    cos⁻¹(x) usually means arccos(x), and not (cos(x))⁻¹; this one typically
    causes the most confusion, as it is inconsistent with the others.

Evaluation of special functions

Most special functions are considered as functions of a complex variable. They are
analytic; their singularities and branch cuts are described; their differential and integral
representations are known; and expansions into Taylor or asymptotic series are
available. In addition, relations with other special functions sometimes exist; a
complicated special function can be expressed in terms of simpler functions. Various
representations can be used for evaluation; the simplest way to evaluate a function is
to expand it into a Taylor series. However, such a representation may converge slowly, if
at all. In algorithmic languages, rational approximations are typically used, although they
may behave badly in the case of complex argument(s).

History of special functions


Classical theory

While trigonometry can be codified, as was clear already to expert mathematicians of the
eighteenth century (if not before), the search for a complete and unified theory of special
functions has continued since the nineteenth century. The high point of special function
theory in the period 1850-1900 was the theory of elliptic functions; treatises that were
essentially complete, such as that of Tannery and Molk, could be written as handbooks to
all the basic identities of the theory. They were based on techniques from complex
analysis.
From that time onwards it would be assumed that analytic function theory, which had
already unified the trigonometric and exponential functions, was a fundamental tool. The
end of the century also saw a very detailed discussion of spherical harmonics.

Changing and fixed motivations

Of course the wish for a broad theory including as many as possible of the known special
functions has its intellectual appeal, but it is worth noting other motivations. For a long
time, the special functions were in the particular province of applied mathematics;
applications to the physical sciences and engineering determined the relative importance
of functions. In the days before the electronic computer, the ultimate compliment to a
special function was the computation, by hand, of extended tables of its values. This was
a capital-intensive process, intended to make the function available by look-up, as for the
familiar logarithm tables. The aspects of the theory that then mattered might then be two:

    for numerical analysis, discovery of infinite series or other analytical expressions
    allowing rapid calculation; and
    reduction of as many functions as possible to the given function.

In contrast, one might say, there are approaches typical of the interests of pure
mathematics: asymptotic analysis, analytic continuation and monodromy in the complex
plane, and the discovery of symmetry principles and other structure behind the façade of
endless formulae in rows. There is no real conflict between these approaches, in fact.

Twentieth century

The twentieth century saw several waves of interest in special function theory. The
classic Whittaker and Watson textbook sought to unify the theory by using complex
variables; the G. N. Watson tome A Treatise on the Theory of Bessel Functions pushed
the techniques as far as possible for one important type that particularly admitted
asymptotics to be studied.

The later Bateman manuscript project, under the editorship of Arthur Erdélyi, attempted
to be encyclopedic, and came around the time when electronic computation was coming
to the fore and tabulation ceased to be the main issue.

Contemporary theories

The modern theory of orthogonal polynomials is of a definite but limited scope.


Hypergeometric series became an intricate theory, in need of later conceptual
arrangement. Lie groups, and in particular their representation theory, explain what a
spherical function can be in general; from 1950 onwards substantial parts of classical
theory could be recast in terms of Lie groups. Further, work on algebraic combinatorics
also revived interest in older parts of the theory. Conjectures of Ian G. Macdonald helped
to open up large and active new fields with the typical special function flavour.
Difference equations have begun to take their place besides differential equations as a
source for special functions.

Special functions in number theory


In number theory, certain special functions have traditionally been studied, such as
particular Dirichlet series and modular forms. Almost all aspects of special function
theory are reflected there, as well as some new ones, such as came out of the monstrous
moonshine theory.

Implicit and explicit functions


In mathematics, an implicit function is a function in which the dependent variable has
not been given "explicitly" in terms of the independent variable. To give a function f
explicitly is to provide a prescription for determining the output value of the function y in
terms of the input value x:

y = f(x).

By contrast, the function is implicit if the value of y is obtained from x by solving an
equation of the form:

R(x,y) = 0.

That is, y is defined as a level set of a function in two variables: one variable may
determine the other, but an explicit formula for one in terms of the other is not given.

Implicit functions can often be useful in situations where it is inconvenient to solve
explicitly an equation of the form R(x,y) = 0 for y in terms of x. Even if it is possible to
rearrange this equation to obtain y as an explicit function f(x), it may not be desirable to
do so since the expression of f may be much more complicated than the expression of R.
In other situations, the equation R(x,y) = 0 may fail to define a function at all, and rather
defines a kind of multiple-valued function. Nevertheless, in many situations, it is still
possible to work with implicit functions. Some techniques from calculus, such as
differentiation, can be performed with relative ease using implicit differentiation.

The implicit function theorem provides a link between implicit and explicit functions. It
states that if the equation R(x, y) = 0 satisfies some mild conditions on its partial
derivatives, then one can in principle solve this equation for y, at least over some small
interval. Geometrically, the graph defined by R(x,y) = 0 will overlap locally with the
graph of a function y = f(x).
Various numerical methods exist for solving the equation R(x,y)=0 to find an
approximation to the implicit function y. Many of these methods are iterative in that they
produce successively better approximations, so that a prescribed accuracy can be
achieved. Many of these iterative methods are based on some form of Newton's method.

Examples
Inverse functions

Implicit functions commonly arise as one way of describing the notion of an inverse
function. If f is a function, then the inverse function of f is a solution of the equation

x = f(y)

for y in terms of x. Intuitively, an inverse function is obtained from f by interchanging the
roles of the dependent and independent variables. Stated another way, the inverse
function is the solution y of the equation

R(x,y) = x − f(y) = 0.

Examples.

1. The natural logarithm y = ln(x) is the solution of the equation x − e^y = 0.


2. The product log is an implicit function given by x − y e^y = 0.
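Since the product log is defined only implicitly, it must be evaluated numerically; a sketch using Newton's method on the defining equation (the iteration count and starting point are our choices, assuming x > 0):

```python
import math

# The product log W(x) is defined implicitly by x - y*e^y = 0.
# A few Newton iterations solve this for y, given x > 0.
def product_log(x, y=1.0):
    for _ in range(50):
        g = y * math.exp(y) - x        # residual of the defining equation
        dg = math.exp(y) * (1 + y)     # d/dy of y*e^y
        y -= g / dg
    return y

y = product_log(1.0)                   # W(1), the omega constant
assert math.isclose(y * math.exp(y), 1.0, abs_tol=1e-9)
```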

Algebraic functions

An algebraic function is a solution y of an equation R(x,y) = 0, where R is a polynomial
of two variables. Algebraic functions play an important role in mathematical analysis and
algebraic geometry. A simple example of an algebraic function is given by the unit circle:

x² + y² − 1 = 0.

Solving for y gives

y = ±√(1 − x²).

Note that there are two "branches" to the implicit function: one where the sign is positive
and the other where it is negative.

Caveats
Not every equation R(x, y) = 0 has a graph that is the graph of a function, the circle
equation being one prominent example. Another example is an implicit function given by
x C(y) = 0 where C is a cubic polynomial having a "hump" in its graph. Thus, for an
implicit function to be a true function it might be necessary to use just part of the graph.
An implicit function can sometimes be successfully defined as a true function only after
"zooming in" on some part of the x-axis and "cutting away" some unwanted function
branches. A resulting formula may only then qualify as a legitimate explicit function.

The defining equation R = 0 can also have other pathologies. For example, the implicit
equation x = 0 does not define a function at all; it is a vertical line. In order to avoid a
problem like this, various constraints are frequently imposed on the allowable sorts of
equations or on the domain. The implicit function theorem provides a uniform way of
handling these sorts of pathologies.

Implicit differentiation
In calculus, a method called implicit differentiation makes use of the chain rule to
differentiate implicitly defined functions.

As explained in the introduction, y can be given as a function of x implicitly rather than
explicitly. When we have an equation R(x, y) = 0, we may be able to solve it for y and
then differentiate. However, sometimes it is simpler to differentiate R(x, y) with respect to
x and then solve for dy/dx.

Examples

1. Consider for example

y + x + 5 = 0.

This equation normally can be manipulated using algebra to give an explicit function:

y = −x − 5.

Differentiation then gives dy/dx = −1. Alternatively, one can differentiate the equation
term by term:

dy/dx + 1 = 0.

Solving for dy/dx again gives dy/dx = −1.
2. An example of an implicit function, for which implicit differentiation might be easier
than attempting to use explicit differentiation, is the circle

x² + y² = r².

In order to differentiate this explicitly with respect to x, one would first have to obtain (via
algebra)

y = ±√(r² − x²),

and then differentiate this function. This creates two derivatives: one for y > 0 and
another for y < 0.

One might find it substantially easier to differentiate the implicit equation directly:

2x + 2y(dy/dx) = 0;

thus,

dy/dx = −x/y.

3. Sometimes standard explicit differentiation cannot be used and, in order to obtain the
derivative, another method such as implicit differentiation must be employed. An
example of such a case is the implicit function y⁵ − y = x. It is impossible to express y
explicitly as a function of x, and dy/dx therefore cannot be found by explicit
differentiation. Using the implicit method, differentiating both sides with respect to x
gives

5y⁴(dy/dx) − dy/dx = 1;

factoring out dy/dx shows that

(dy/dx)(5y⁴ − 1) = 1,

which yields the final answer

dy/dx = 1/(5y⁴ − 1),

which is defined wherever 5y⁴ − 1 ≠ 0.
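The result dy/dx = 1/(5y⁴ − 1) can be checked numerically by parametrizing points of the curve y⁵ − y = x by y and differencing (the step size and test point are our choices):

```python
# Verify dy/dx = 1/(5y^4 - 1) on the curve y^5 - y = x,
# by parametrizing points of the curve by y and differencing.
def x_of_y(y):
    return y**5 - y

y0, h = 1.5, 1e-6
# dy/dx ~ (delta y) / (delta x) along the curve near (x_of_y(y0), y0)
num = (2 * h) / (x_of_y(y0 + h) - x_of_y(y0 - h))
formula = 1 / (5 * y0**4 - 1)
assert abs(num - formula) < 1e-8
```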
Formula for two variables

The implicit function theorem states that if F is defined on an open disk containing (a,b),
where F(a,b) = 0, F_y(a,b) ≠ 0, and F_x and F_y are continuous on the disk, then
the equation F(x,y) = 0 defines y as a function of x near the point (a,b), and the derivative
of this function is given by

dy/dx = −F_x / F_y,

where F_x and F_y denote the partial derivatives of F with respect to x and y.

The above formula comes from using the generalized chain rule to obtain the total
derivative, with respect to x, of both sides of F(x, y) = 0:

F_x + F_y (dy/dx) = 0.

Marginal rate of substitution

In economics, when the level set is an indifference curve, the implicit derivative (or
rather, 1 times the implicit derivative) is interpreted as the marginal rate of substitution
of the two variables: how much more of y one must receive in order to be indifferent to a
loss of 1 unit of x.

Implicit function theorem


It can be shown that if the zero set of R(x,y) is a smooth submanifold M in R², and (a,b) is
a point of this submanifold such that the tangent space there is not vertical (that is,
∂R/∂y ≠ 0), then M in some small enough neighbourhood of (a,b) is given by a
parametrization (x, f(x)) where f is a smooth function. In less technical language, implicit
functions exist and can be differentiated, unless the tangent to the supposed graph would
be vertical. In the standard case where we are given an equation

F(x,y) = 0,

the condition on F can be checked by means of partial derivatives.


Chapter 4

Function Composition

g ∘ f, the composition of f and g. For example, (g ∘ f)(c) = #.

In mathematics, function composition is the application of one function to the results of
another. For instance, the functions f: X → Y and g: Y → Z can be composed by
computing the output of g when it has an argument of f(x) instead of x. Intuitively, if z is
a function g of y and y is a function f of x, then z is a function of x.

Thus one obtains a composite function g ∘ f: X → Z defined by (g ∘ f)(x) = g(f(x)) for all x
in X. The notation g ∘ f is read as "g circle f", or "g composed with f", "g after f", "g
following f", or just "g of f".

The composition of functions is always associative. That is, if f, g, and h are three
functions with suitably chosen domains and codomains, then f ∘ (g ∘ h) = (f ∘ g) ∘ h,
where the parentheses serve to indicate that composition is to be performed first for the
parenthesized functions. Since there is no distinction between the choices of placement of
parentheses, they may be safely left off.

The functions g and f are said to commute with each other if g ∘ f = f ∘ g. In general,
composition of functions will not be commutative. Commutativity is a special property,
attained only by particular functions, and often in special circumstances. For example,
|x| + 3 = |x + 3| only when x ≥ 0. But a function always commutes with its inverse
to produce the identity mapping.

Considering functions as special cases of relations (namely functional relations), one can
analogously define composition of relations, which gives a formula for the composite
g ∘ f in terms of the relations corresponding to f and g.

Derivatives of compositions involving differentiable functions can be found using the
chain rule. Higher derivatives of such functions are given by Faà di Bruno's formula.

The structures given by composition are axiomatized and generalized in category theory.

Example
As an example, suppose that an airplane's elevation at time t is given by the function h(t)
and that the oxygen concentration at elevation x is given by the function c(x). Then
(c ∘ h)(t) describes the oxygen concentration around the plane at time t.
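In code, the example is just composition of two single-argument functions; the particular formulas for h and c below are invented for illustration:

```python
import math

def h(t):                 # elevation (m) at time t (s): hypothetical climb
    return 100.0 * t

def c(x):                 # O2 fraction at elevation x: hypothetical decay
    return 0.21 * math.exp(-x / 8000.0)

def compose(outer, inner):
    return lambda t: outer(inner(t))

c_of_h = compose(c, h)    # (c o h)(t): concentration around the plane
assert c_of_h(0) == 0.21
assert c_of_h(10) < c_of_h(5) < 0.21   # concentration drops as it climbs
```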

Functional powers
If f maps a set into itself (f: X → X), then f may be composed with itself; this is
sometimes denoted f². Thus:

f ∘ f = f², f ∘ f ∘ f = f³, and so on.

Repeated composition of a function with itself is called function iteration.

The functional powers fⁿ for natural n follow immediately.

By convention, f⁰ is defined as the identity map on the domain of f.


If f admits an inverse function, negative functional powers
are defined as the opposite power of the inverse function: f⁻ⁿ = (f⁻¹)ⁿ.

Note: If f takes its values in a ring (in particular for real or complex-valued f), there is a
risk of confusion, as fⁿ could also stand for the n-fold product of f, e.g. f²(x) = f(x) · f(x).

(For trigonometric functions, usually the latter is meant, at least for positive exponents.
For example, in trigonometry, this superscript notation represents standard
exponentiation when used with trigonometric functions: sin²(x) = sin(x) · sin(x).
However, for negative exponents (especially −1), it nevertheless usually refers to the
inverse function, e.g., tan⁻¹ = arctan ≠ 1/tan.)

In some cases, an expression for f in g(x) = f^r(x) can be derived from the rule for g given
non-integer values of r. This is called fractional iteration. For instance, a half iterate of a
function f is a function g satisfying g(g(x)) = f(x). Another example would be where f
is the successor function, so that f^r(x) = x + r. This idea can be generalized so that the
iteration count becomes a continuous parameter; in this case, such a system is called a flow.
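For the successor-function example, the half iterate can be written down explicitly and checked:

```python
# For the successor function f(x) = x + 1 the iterates are
# f^r(x) = x + r, so the half iterate g = f^(1/2) is g(x) = x + 0.5.
def f(x):
    return x + 1

def g(x):                 # half iterate: g o g = f
    return x + 0.5

assert g(g(3.0)) == f(3.0)            # g(g(x)) = f(x)
assert g(g(g(g(3.0)))) == f(f(3.0))   # four half-steps = two full steps
```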

Iterated functions and flows occur naturally in the study of fractals and dynamical
systems.

Composition monoids
Suppose one has two (or more) functions f: X → X, g: X → X having the same domain
and codomain. Then one can form long, potentially complicated chains of these functions
composed together, such as f ∘ f ∘ g ∘ f. Such long chains have the algebraic structure of a
monoid, called a transformation monoid or composition monoid. In general, composition
monoids can have remarkably complicated structure. One particular notable example is
the de Rham curve. The set of all functions f: X → X is called the full transformation
semigroup on X.

If the functions are bijective, then the set of all possible combinations of these functions
forms a transformation group; and one says that the group is generated by these
functions.

The set of all bijective functions f: X → X forms a group with respect to the composition
operator. This is the symmetric group, also sometimes called the composition group.

Alternative notations
Many mathematicians omit the composition symbol, writing gf for g ∘ f.

In the mid-20th century, some mathematicians decided that writing "g ∘ f" to
mean "first apply f, then apply g" was too confusing and decided to change
notations. They write "xf" for "f(x)" and "(xf)g" for "g(f(x))". This can be more
natural and seem simpler than writing functions on the left in some areas; in
linear algebra, for instance, where x is a row vector and f and g denote matrices
and the composition is by matrix multiplication. This alternative notation is called
postfix notation. The order is important because matrix multiplication is
non-commutative. Successive transformations applying and composing to the right
agree with the left-to-right reading sequence.

Mathematicians who use postfix notation may write "fg", meaning first do f then
do g, in keeping with the order the symbols occur in postfix notation, thus making
the notation "fg" ambiguous. Computer scientists may write "f;g" for this, thereby
disambiguating the order of composition. To distinguish the left composition
operator from a text semicolon, in the Z notation a fat semicolon (U+2A1F) is
used for left relation composition. Since all functions are binary relations, it is
correct to use the fat semicolon for function composition as well.

Composition operator
Given a function g, the composition operator Cg is defined as the operator which maps
functions to functions as

Cg(f) = f ∘ g.

Composition operators are studied in the field of operator theory.
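As a sketch, the composition operator is simply a higher-order function: C_g takes a function f and returns f ∘ g (the sample functions below are our own):

```python
# The composition operator C_g maps a function f to f o g.
def C(g):
    def operator(f):
        return lambda x: f(g(x))
    return operator

double = lambda x: 2 * x
square = lambda x: x * x

Cg = C(double)                  # C_g with g = doubling
assert Cg(square)(3) == 36      # (square o double)(3) = (2*3)^2 = 36
```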


Chapter 5

Continuous Function

In mathematics, a continuous function is a function for which, intuitively, small changes
in the input result in small changes in the output. Otherwise, a function is said to be
"discontinuous". A continuous function with a continuous inverse function is called
"bicontinuous". An intuitive (though imprecise) idea of continuity is given by the
common statement that a continuous function is a function whose graph can be drawn
without lifting the chalk from the blackboard.

Continuity of functions is one of the core concepts of topology, which is treated in full
generality below. The introductory portion here focuses on the special case where the
inputs and outputs of functions are real numbers. In addition, we discuss the
definition for the more general case of functions between two metric spaces. In order
theory, especially in domain theory, one considers a notion of continuity known as Scott
continuity.

As an example, consider the function h(t) which describes the height of a growing flower
at time t. This function is continuous. In fact, there is a dictum of classical physics which
states that in nature everything is continuous. By contrast, if M(t) denotes the amount of
money in a bank account at time t, then the function jumps whenever money is deposited
or withdrawn, so the function M(t) is discontinuous. (However, if one assumes a discrete
set as the domain of the function M, for instance the set of points in time at 4:00 PM on
business days, then M becomes a continuous function, as is every function whose domain
is a discrete subset of the reals.)

Real-valued continuous functions


Historical infinitesimal definition

Cauchy defined continuity of a function in the following intuitive terms: an infinitesimal
change in the independent variable corresponds to an infinitesimal change of the
dependent variable.
Definition in terms of limits

Suppose we have a function that maps real numbers to real numbers and whose domain is
some interval, like the functions h and M above. Such a function can be represented by a
graph in the Cartesian plane; the function is continuous if, roughly speaking, the graph is
a single unbroken curve with no "holes" or "jumps".

In general, we say that the function f is continuous at some point c of its domain if, and
only if, the following holds:

The limit of f(x) as x approaches c through the domain of f exists and is equal to
f(c); in mathematical notation, lim_{x→c} f(x) = f(c). If the point c in the domain of
f is not a limit point of the domain, then this condition is vacuously true, since x
cannot approach c through values not equal to c. Thus, for example, every function
whose domain is the set of all integers is continuous.

We call a function continuous if and only if it is continuous at every point of its domain.
More generally, we say that a function is continuous on some subset of its domain if it is
continuous at every point of that subset.

The notation C(Ω) or C0(Ω) is sometimes used to denote the set of all continuous
functions with domain Ω. Similarly, C1(Ω) is used to denote the set of differentiable
functions whose derivative is continuous, C2(Ω) for the twice-differentiable functions
whose second derivative is continuous, and so on. In the field of computer graphics, these
three levels are sometimes called G0 (continuity of position), G1 (continuity of tangency),
and G2 (continuity of curvature). The notation C(n, α)(Ω) occurs in the definition of a more
subtle concept, that of Hölder continuity.

Weierstrass definition (epsilon-delta) of continuous functions

Without resorting to limits, one can define continuity of real functions as follows.

Again consider a function f that maps a set of real numbers to another set of real
numbers, and suppose c is an element of the domain of f. The function f is said to be
continuous at the point c if the following holds: for any number ε > 0, however small,
there exists some number δ > 0 such that for all x in the domain of f with
c − δ < x < c + δ, the value of f(x) satisfies

f(c) − ε < f(x) < f(c) + ε.

Alternatively written: given subsets I, D of R, continuity of f : I → D at c ∈ I means
that for every ε > 0 there exists a δ > 0 such that for all x ∈ I,

|x − c| < δ implies |f(x) − f(c)| < ε.

A form of this epsilon-delta definition of continuity was first given by Bernard Bolzano
in 1817. Preliminary forms of a related definition of the limit were given by Cauchy,
though the formal definition and the distinction between pointwise continuity and
uniform continuity were first given by Karl Weierstrass.
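The epsilon-delta definition can be illustrated numerically. The sketch below (function names and the sample-grid approach are ours, not from the text, and checking finitely many points is an illustration, not a proof) uses f(x) = 3x + 2, for which δ = ε/3 always works, since |f(x) − f(c)| = 3|x − c|.

```python
# Numeric illustration of the epsilon-delta definition for f(x) = 3x + 2.
# Here |f(x) - f(c)| = 3|x - c|, so delta = epsilon / 3 suffices.

def f(x):
    return 3 * x + 2

def check_epsilon_delta(f, c, epsilon, delta, samples=10_000):
    """Check |f(x) - f(c)| < epsilon on a grid of points with |x - c| < delta."""
    for i in range(1, samples):
        x = c - delta + (2 * delta) * i / samples  # grid over (c - delta, c + delta)
        if not abs(f(x) - f(c)) < epsilon:
            return False
    return True

c = 1.0
for epsilon in (1.0, 0.1, 0.001):
    assert check_epsilon_delta(f, c, epsilon, delta=epsilon / 3)
```

For a nonlinear function the right δ depends on both ε and c, which is exactly the distinction between pointwise and uniform continuity mentioned above.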

More intuitively, we can say that if we want to get all the f(x) values to stay in some
small neighborhood around f(c), we simply need to choose a small enough neighborhood
for the x values around c, and we can do that no matter how small the f(x) neighborhood
is; f is then continuous at c.

In modern terms, this is generalized by the definition of continuity of a function with
respect to a basis for the topology, here the metric topology.

Heine definition of continuity

The following definition of continuity is due to Heine.

A real function f is continuous if for any sequence (xn) such that

lim_{n→∞} xn = L,

it holds that

lim_{n→∞} f(xn) = f(L).

(We assume that all the points xn, as well as L, belong to the domain of f.)

One can say, briefly, that a function is continuous if, and only if, it preserves limits.

Weierstrass's and Heine's definitions of continuity are equivalent on the reals. The usual
(easier) proof makes use of the axiom of choice, but in the case of global continuity of
real functions it was proved by Wacław Sierpiński that the axiom of choice is not actually
needed.
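Heine's "preserves limits" criterion is easy to probe numerically. The sketch below (our own illustration, not from the text) follows a sequence xn → L and checks that f(xn) → f(L) for f(x) = x², using the bound |f(x) − f(L)| = |x + L|·|x − L|.

```python
# Heine-style check: along xn = L + 1/n -> L, the values f(xn) approach f(L).

def f(x):
    return x * x

L = 3.0
xs = [L + 1.0 / n for n in range(1, 10_000)]

# the tail of the image sequence is close to f(L) = 9
assert abs(f(xs[-1]) - f(L)) < 1e-2
# once |x - L| < 1e-3, we have |f(x) - f(L)| < (2L + 1)|x - L| < 7e-3
assert all(abs(f(x) - f(L)) < 7e-3 for x in xs if abs(x - L) < 1e-3)
```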

In the more general setting of topological spaces, the concept analogous to the Heine
definition of continuity is called sequential continuity. In general, the condition of
sequential continuity is weaker than the analogue of Cauchy continuity, which is just called
continuity. However, if instead of sequences one uses nets (sets indexed by a directed set,
not only the natural numbers), then the resulting concept is equivalent to the general
notion of continuity in topology. Sequences are sufficient on metric spaces because they
are first-countable spaces (every point has a countable neighborhood basis, hence
representative points in each neighborhood are enough to ensure continuity), but general
topological spaces are not first-countable, hence sequences do not suffice, and nets must
be used.
Definition using oscillation

The failure of a function to be continuous at a point is quantified by its oscillation.

Continuity can also be defined in terms of oscillation: a function f is continuous at a
point x0 if and only if its oscillation at that point is zero; in symbols, ωf(x0) = 0. A
benefit of this definition is that it quantifies discontinuity: the oscillation gives how
much the function is discontinuous at a point.

This definition is useful in descriptive set theory to study the set of discontinuities and
the set of continuous points: the continuous points are the intersection of the sets where the
oscillation is less than ε (hence a Gδ set). It also gives a very quick proof of one direction
of the Lebesgue integrability condition.

The oscillation is equivalent to the ε-δ definition by a simple re-arrangement, and by
using a limit (lim sup, lim inf) to define oscillation: if (at a given point) for a given ε0
there is no δ that satisfies the ε-δ definition, then the oscillation is at least ε0; and
conversely, if for every ε there is a desired δ, the oscillation is 0. The oscillation definition
can be naturally generalized to maps from a topological space to a metric space.
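The oscillation of a jump discontinuity can be estimated directly. The sketch below (an illustration of ours, not from the text) samples the step function f(x) = 1 for x > 0, f(x) = 0 otherwise, on shrinking intervals around 0; the sup-inf gap stays 1 however small the interval, so the oscillation at 0 is 1.

```python
# Estimate the oscillation of a step function at x0 by sampling on [x0-eps, x0+eps].

def f(x):
    return 1.0 if x > 0 else 0.0

def oscillation(f, x0, eps, samples=1000):
    """Max minus min of f over a sample grid on [x0 - eps, x0 + eps]."""
    ys = [f(x0 - eps + 2 * eps * i / samples) for i in range(samples + 1)]
    return max(ys) - min(ys)

# the gap does not shrink with the interval: the oscillation at 0 is 1
for eps in (1.0, 0.1, 0.001):
    assert oscillation(f, 0.0, eps) == 1.0
```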

Definition using the hyperreals

Non-standard analysis is a way of making Newton-Leibniz-style infinitesimals
mathematically rigorous. The real line is augmented by the addition of infinite and
infinitesimal numbers to form the hyperreal numbers. In nonstandard analysis, continuity
can be defined as follows.

A function f from the reals to the reals is continuous if its natural extension to the
hyperreals has the property that for real x and infinitesimal dx, f(x + dx) − f(x) is
infinitesimal.

In other words, an infinitesimal increment of the independent variable corresponds to an
infinitesimal change of the dependent variable, giving a modern expression to
Augustin-Louis Cauchy's definition of continuity.

Examples

All polynomial functions are continuous.


If a function has a domain which is not an interval, the notion of a continuous
function as one whose graph you can draw without taking your pencil off the
paper is not quite correct. Consider the functions f(x) = 1/x and g(x) = (sin x)/x.
Neither function is defined at x = 0, so each has as its domain the set R \ {0} of
real numbers other than 0, and each function is continuous. The question of continuity at x = 0 does
not arise, since x = 0 is neither in the domain of f nor in the domain of g. The
function f cannot be extended to a continuous function whose domain is R, since
no matter what value is assigned at 0, the resulting function will not be
continuous. On the other hand, since the limit of g at 0 is 1, g can be extended
continuously to R by defining its value at 0 to be 1.
The exponential functions, logarithms, square root function, trigonometric
functions and absolute value function are continuous. Rational functions,
however, are not necessarily continuous on all of R.
An example of a continuous rational function is f(x) = 1/(x − 2). The question of
continuity at x = 2 does not arise, since x = 2 is not in the domain of f.
An example of a discontinuous function is the function f defined by f(x) = 1 if x >
0, f(x) = 0 if x ≤ 0. Pick for instance ε = 1/2. There is no δ-neighborhood around x =
0 that will force all the f(x) values to be within ε of f(0). Intuitively we can think
of this type of discontinuity as a sudden jump in function values.
Another example of a discontinuous function is the signum or sign function.
A more complicated example of a discontinuous function is Thomae's function.
Dirichlet's function, the indicator function of the rationals (equal to 1 at every
rational number and 0 at every irrational number), is nowhere continuous.
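The removable discontinuity of g(x) = (sin x)/x from the list above can be checked numerically. The sketch below (our illustration; the bound used is the standard Taylor estimate |sin(x)/x − 1| ≤ x²/6) defines the continuous extension with value 1 at 0.

```python
import math

# g(x) = sin(x)/x has limit 1 at 0, so it extends continuously with g(0) = 1.

def g_extended(x):
    return math.sin(x) / x if x != 0 else 1.0

# values approach 1 as x -> 0; |sin(x)/x - 1| <= x^2/6 < x^2
for x in (0.1, 0.01, 0.001):
    assert abs(g_extended(x) - 1.0) < x * x

assert g_extended(0) == 1.0
```

No such repair is possible for f(x) = 1/x, which is unbounded near 0.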

Facts about continuous functions

If two functions f and g are continuous, then f + g, fg, and f/g are continuous. (Note: the
only possible points x of discontinuity of f/g are the solutions of the equation g(x) = 0; but
any such x does not belong to the domain of the function f/g. Hence f/g is continuous
on its entire domain, or, in other words, is continuous.)

The composition f ∘ g of two continuous functions is continuous.

If a function is differentiable at some point c of its domain, then it is also continuous at c.
The converse is not true: a function that is continuous at c need not be differentiable
there. Consider for instance the absolute value function at c = 0.

Intermediate value theorem

The intermediate value theorem is an existence theorem, based on the real number
property of completeness, and states:

If the real-valued function f is continuous on the closed interval [a, b] and k is some
number between f(a) and f(b), then there is some number c in [a, b] such that f(c) = k.

For example, if a child grows from 1 m to 1.5 m between the ages of two and six years,
then, at some time between two and six years of age, the child's height must have been
1.25 m.

As a consequence, if f is continuous on [a, b] and f(a) and f(b) differ in sign, then, at
some point c in [a, b], f(c) must equal zero.
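This sign-change consequence is exactly what the bisection method exploits. The sketch below (a minimal implementation of ours, not from the text) repeatedly halves an interval on which a continuous function changes sign, keeping the half where the sign change persists.

```python
# Bisection: the intermediate value theorem guarantees a zero of a continuous f
# on [a, b] whenever f(a) and f(b) differ in sign.

def bisect(f, a, b, tol=1e-12):
    fa, fb = f(a), f(b)
    assert fa * fb < 0, "f(a) and f(b) must differ in sign"
    while b - a > tol:
        m = (a + b) / 2
        fm = f(m)
        if fm == 0:
            return m
        if fa * fm < 0:      # sign change persists in [a, m]
            b, fb = m, fm
        else:                # sign change persists in [m, b]
            a, fa = m, fm
    return (a + b) / 2

# x^2 - 2 changes sign on [1, 2], so it has a root there, namely sqrt(2)
root = bisect(lambda x: x * x - 2, 1.0, 2.0)
assert abs(root * root - 2) < 1e-9
```

Continuity is essential: for a discontinuous function the sign change may occur at a jump, with no actual zero.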

Extreme value theorem

The extreme value theorem states that if a function f is defined on a closed interval [a,b]
(or any closed and bounded set) and is continuous there, then the function attains its
maximum, i.e. there exists c ∈ [a,b] with f(c) ≥ f(x) for all x ∈ [a,b]. The same is true of
the minimum of f. These statements are not, in general, true if the function is defined on
an open interval (a,b) (or any set that is not both closed and bounded), as, for example,
the continuous function f(x) = 1/x, defined on the open interval (0,1), does not attain a
maximum, being unbounded above.
Directional continuity

Figures: a right-continuous function; a left-continuous function.

A function may happen to be continuous in only one direction, either from the "left" or
from the "right". A right-continuous function is a function which is continuous at all
points when approached from the right. Technically, the formal definition is similar to the
definition above for a continuous function but modified as follows:

The function f is said to be right-continuous at the point c if the following holds: for any
number ε > 0, however small, there exists some number δ > 0 such that for all x in the
domain with c < x < c + δ, the value of f(x) will satisfy

|f(x) − f(c)| < ε.

Notice that x must be larger than c, that is on the right of c. If x were also allowed to take
values less than c, this would be the definition of continuity. This restriction makes it
possible for the function to have a discontinuity at c, but still be right continuous at c, as
pictured.

Likewise, a left-continuous function is a function which is continuous at all points when
approached from the left, that is, with c − δ < x < c in place of c < x < c + δ.

A function is continuous if and only if it is both right-continuous and left-continuous.
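A standard example is the floor function, which is right-continuous but not left-continuous at each integer. The sketch below (our illustration, not from the text) checks this numerically at c = 1: values just to the right of c agree with floor(c), values just to the left do not.

```python
import math

# floor is right-continuous at the integers: approaching c = 1 from the right,
# floor(x) = 1 = floor(1); from the left, floor(x) = 0.

c = 1
for h in (0.1, 0.01, 0.001):
    assert math.floor(c + h) == math.floor(c)       # right limit equals the value
    assert math.floor(c - h) == math.floor(c) - 1   # left limit differs
```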

Continuous functions between metric spaces


Now consider a function f from one metric space (X, dX) to another metric space (Y, dY).
Then f is continuous at the point c in X if for any positive real number ε, there exists a
positive real number δ such that all x in X satisfying dX(x, c) < δ will also satisfy dY(f(x),
f(c)) < ε.

This can also be formulated in terms of sequences and limits: the function f is continuous
at the point c if for every sequence (xn) in X with limit lim xn = c, we have lim f(xn) = f(c).
Continuous functions transform limits into limits.
This latter condition can be weakened as follows: f is continuous at the point c if and only
if for every convergent sequence (xn) in X with limit c, the sequence (f(xn)) is a Cauchy
sequence, and c is in the domain of f. Continuous functions transform convergent
sequences into Cauchy sequences.

The set of points at which a function between metric spaces is continuous is a Gδ set;
this follows from the ε-δ definition of continuity.

Continuous functions between topological spaces

Continuity of a function at a point

The above definitions of continuous functions can be generalized to functions from one
topological space to another in a natural way: a function f : X → Y, where X and Y are
topological spaces, is continuous if and only if for every open set V ⊆ Y, the inverse
image

f⁻¹(V) = {x ∈ X | f(x) ∈ V}

is open.

However, this definition is often difficult to use directly. Instead, suppose we have a
function f from X to Y, where X, Y are topological spaces. We say f is continuous at x for
some x ∈ X if for any neighborhood V of f(x), there is a neighborhood U of x such that
f(U) ⊆ V. Although this definition appears complex, the intuition is that no matter how
"small" V becomes, we can always find a U containing x that will map inside it. If f is
continuous at every x ∈ X, then we simply say f is continuous.

In a metric space, it is equivalent to consider the neighbourhood system of open balls
centered at x and f(x) instead of all neighborhoods. This leads to the standard ε-δ
definition of a continuous function from real analysis, which says roughly that a function
is continuous if all points close to x map to points close to f(x). This only really makes
sense in a metric space, however, which has a notion of distance.
Note, however, that if the target space is Hausdorff, it is still true that f is continuous at a
if and only if the limit of f as x approaches a is f(a). At an isolated point, every function is
continuous.

Definitions

Several equivalent definitions for a topological structure exist and thus there are several
equivalent ways to define a continuous function.

Open and closed set definition

The most common notion of continuity in topology defines continuous functions as those
functions for which the preimages(or inverse images) of open sets are open. Similar to
the open set formulation is the closed set formulation, which says that preimages (or
inverse images) of closed sets are closed.

Neighborhood definition

Definitions based on preimages are often difficult to use directly. Instead, suppose we
have a function f : X → Y, where X and Y are topological spaces. We say f is continuous
at x for some x ∈ X if for any neighborhood V of f(x), there is a neighborhood U of x
such that f(U) ⊆ V. Although this definition appears complicated, the intuition is that no
matter how "small" V becomes, we can always find a U containing x that will map inside
it. If f is continuous at every x ∈ X, then we simply say f is continuous.

In a metric space, it is equivalent to consider the neighbourhood system of open balls
centered at x and f(x) instead of all neighborhoods. This leads to the standard ε-δ
definition of a continuous function from real analysis, which says roughly that a function
is continuous if all points close to x map to points close to f(x). This only really makes
sense in a metric space, however, which has a notion of distance.
Note, however, that if the target space is Hausdorff, it is still true that f is continuous at a
if and only if the limit of f as x approaches a is f(a). At an isolated point, every function is
continuous.

Sequences and nets

In several contexts, the topology of a space is conveniently specified in terms of limit
points. In many instances, this is accomplished by specifying when a point is the limit of
a sequence, but for some spaces that are too large in some sense, one specifies also when
a point is the limit of more general sets of points indexed by a directed set, known as nets.
A function is continuous only if it takes limits of sequences to limits of sequences. In the
former case, preservation of limits is also sufficient; in the latter, a function may preserve
all limits of sequences yet still fail to be continuous, and preservation of nets is a
necessary and sufficient condition.

In detail, a function f : X → Y is sequentially continuous if whenever a sequence (xn) in
X converges to a limit x, the sequence (f(xn)) converges to f(x). Thus sequentially
continuous functions "preserve sequential limits". Every continuous function is
sequentially continuous. If X is a first-countable space, then the converse also holds: any
function preserving sequential limits is continuous. In particular, if X is a metric space,
sequential continuity and continuity are equivalent. For non first-countable spaces,
sequential continuity might be strictly weaker than continuity. (The spaces for which the
two properties are equivalent are called sequential spaces.) This motivates the
consideration of nets instead of sequences in general topological spaces. Continuous
functions preserve limits of nets, and in fact this property characterizes continuous
functions.

Closure operator definition

Given two topological spaces (X, cl) and (X′, cl′), where cl and cl′ are two closure
operators, a function

f : (X, cl) → (X′, cl′)

is continuous if for all subsets A of X

f(cl(A)) ⊆ cl′(f(A)).

One might therefore suspect that, given two topological spaces (X, int) and (X′, int′),
where int and int′ are two interior operators, a function

f : (X, int) → (X′, int′)

is continuous if for all subsets A of X

f(int(A)) ⊆ int′(f(A)),

or perhaps if

int′(f(A)) ⊆ f(int(A));

however, neither of these conditions is either necessary or sufficient for continuity.

Instead, we must resort to inverse images: given two topological spaces (X, int) and
(X′, int′), where int and int′ are two interior operators, a function

f : (X, int) → (X′, int′)

is continuous if for all subsets A of X′

f⁻¹(int′(A)) ⊆ int(f⁻¹(A)).

We can also write that, given two topological spaces (X, cl) and (X′, cl′), where cl and
cl′ are two closure operators, a function

f : (X, cl) → (X′, cl′)

is continuous if for all subsets A of X′

cl(f⁻¹(A)) ⊆ f⁻¹(cl′(A)).

Closeness relation definition

Given two topological spaces (X, δ) and (X′, δ′), where δ and δ′ are two closeness
relations, a function

f : (X, δ) → (X′, δ′)

is continuous if for all points x of X and all subsets A of X,

x δ A implies f(x) δ′ f(A).

This is another way of writing the closure operator definition.

Useful properties of continuous maps

Some facts about continuous maps between topological spaces:

If f : X → Y and g : Y → Z are continuous, then so is the composition g ∘ f : X → Z.
If f : X → Y is continuous and
o X is compact, then f(X) is compact.
o X is connected, then f(X) is connected.
o X is path-connected, then f(X) is path-connected.
o X is Lindelöf, then f(X) is Lindelöf.
o X is separable, then f(X) is separable.
The identity map idX : (X, τ2) → (X, τ1) is continuous if and only if τ1 ⊆ τ2.

Other notes

If a set is given the discrete topology, all functions with that space as a domain are
continuous. If the domain set is given the indiscrete topology and the range set is at least
T0, then the only continuous functions are the constant functions. Conversely, any
function whose range is indiscrete is continuous.

Given a set X, a partial ordering can be defined on the possible topologies on X. A
continuous function between two topological spaces stays continuous if we strengthen the
topology of the domain space or weaken the topology of the codomain space. Thus we
can consider the continuity of a given function a topological property, depending only on
the topologies of its domain and codomain spaces.

For a function f from a topological space X to a set S, one defines the final topology on S
by letting the open sets of S be those subsets A of S for which f⁻¹(A) is open in X. If S has
an existing topology, f is continuous with respect to this topology if and only if the
existing topology is coarser than the final topology on S. Thus the final topology can be
characterized as the finest topology on S which makes f continuous. If f is surjective, this
topology is canonically identified with the quotient topology under the equivalence
relation defined by f. This construction can be generalized to an arbitrary family of
functions X → S.

Dually, for a function f from a set S to a topological space X, one defines the initial
topology on S by letting the open sets of S be the subsets of the form f⁻¹(A), where A is
open in X. If S has an existing topology, f is continuous with respect to this topology if and
only if the existing topology is finer than the initial topology on S. Thus the initial
topology can be characterized as the coarsest topology on S which makes f continuous. If
f is injective, this topology is canonically identified with the subspace topology of S,
viewed as a subset of X. This construction can be generalized to an arbitrary family of
functions S → X.

Symmetric to the concept of a continuous map is an open map, for which images of open
sets are open. In fact, if an open map f has an inverse, that inverse is continuous, and if a
continuous map g has an inverse, that inverse is open.

If a function is a bijection, then it has an inverse function. The inverse of a continuous
bijection is open, but need not be continuous. If it is, this special function is called a
homeomorphism. If a continuous bijection has as its domain a compact space and its
codomain is Hausdorff, then it is automatically a homeomorphism.
Continuous functions between partially ordered sets

In order theory, continuity of a function between posets is Scott continuity. Let X be a
complete lattice; then a function f : X → X is continuous if, for each subset Y of X, we
have sup f(Y) = f(sup Y).
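The sup-preservation condition can be verified exhaustively on a small complete lattice. The sketch below (an illustration of ours, not from the text) uses the lattice of subsets of {0, 1, 2} ordered by inclusion, where sup is union, and the map f(A) = A ∩ S for a fixed set S, which preserves sups by distributivity of intersection over union.

```python
# Scott-style sup-preservation check on the powerset lattice of {0, 1, 2}.

from itertools import combinations

BASE = frozenset({0, 1, 2})
S = frozenset({0, 2})  # a fixed subset; f intersects with it

def powerset(s):
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]

def sup(family):
    """Supremum in the subset lattice is the union (empty family -> bottom)."""
    return frozenset().union(*family) if family else frozenset()

def f(a):
    return a & S

# check f(sup Y) = sup f(Y) for every two-element family Y
subsets = powerset(BASE)
for a in subsets:
    for b in subsets:
        family = [a, b]
        assert f(sup(family)) == sup([f(x) for x in family])
```

By contrast, a map such as A ↦ A ∪ {0} fails the check on the empty family, since it does not send the bottom element to the bottom element.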

Continuous binary relation


A binary relation R on A is continuous if R(a, b) holds whenever there are sequences (ak)
and (bk) in A which converge to a and b respectively and for which R(ak, bk) holds for all
k. Clearly, if one treats R as a characteristic function in two variables, this definition of
continuity is identical to that for continuous functions.

Continuity space
A continuity space is a generalization of metric spaces and posets, which uses the
concept of quantales, and that can be used to unify the notions of metric spaces and
domains.
Chapter 6

Additive Function

In mathematics the term additive function has two different definitions, depending on
the specific field of application.

In algebra an additive function (or additive map) is a function that preserves the
addition operation:

f(x + y) = f(x) + f(y)

for any two elements x and y in the domain. For example, any linear map is additive.
When the domain is the real numbers, this is Cauchy's functional equation.

In number theory, an additive function is an arithmetic function f(n) of the positive
integer n such that whenever a and b are coprime, the function of the product is the sum
of the functions:

f(ab) = f(a) + f(b).

Completely additive
An additive function f(n) is said to be completely additive if f(ab) = f(a) + f(b) holds for
all positive integers a and b, even when they are not co-prime. Totally additive is also
used in this sense by analogy with totally multiplicative functions. If f is a completely
additive function then f(1) = 0.

Every completely additive function is additive, but not vice versa.

Examples
Examples of arithmetic functions which are completely additive are:

The restriction of the logarithmic function to N.

The multiplicity of a prime factor p in n, that is, the largest exponent m for which
p^m divides n.
a0(n) - the sum of primes dividing n counting multiplicity, sometimes called
sopfr(n), the potency of n or the integer logarithm of n (sequence A001414 in
OEIS). For example:

a0(4) = 2 + 2 = 4
a0(20) = a0(2^2 · 5) = 2 + 2 + 5 = 9
a0(27) = 3 + 3 + 3 = 9
a0(144) = a0(2^4 · 3^2) = a0(2^4) + a0(3^2) = 8 + 6 = 14
a0(2,000) = a0(2^4 · 5^3) = a0(2^4) + a0(5^3) = 8 + 15 = 23
a0(2,003) = 2003
a0(54,032,858,972,279) = 1240658
a0(54,032,858,972,302) = 1780417
a0(20,802,650,704,327,415) = 1240681

The function Ω(n), defined as the total number of prime factors of n, counting
multiple factors multiple times, sometimes called the "Big Omega function"
(sequence A001222 in OEIS). For example:

Ω(1) = 0, since 1 has no prime factors
Ω(20) = Ω(2^2 · 5) = 3
Ω(4) = 2
Ω(27) = 3
Ω(144) = Ω(2^4 · 3^2) = Ω(2^4) + Ω(3^2) = 4 + 2 = 6
Ω(2,000) = Ω(2^4 · 5^3) = Ω(2^4) + Ω(5^3) = 4 + 3 = 7
Ω(2,001) = 3
Ω(2,002) = 4
Ω(2,003) = 1
Ω(54,032,858,972,279) = 3
Ω(54,032,858,972,302) = 6
Ω(20,802,650,704,327,415) = 7

Examples of arithmetic functions which are additive but not completely additive are:

ω(n), defined as the total number of different prime factors of n (sequence
A001221 in OEIS). For example:

ω(4) = 1
ω(20) = ω(2^2 · 5) = 2
ω(27) = 1
ω(144) = ω(2^4 · 3^2) = ω(2^4) + ω(3^2) = 1 + 1 = 2
ω(2,000) = ω(2^4 · 5^3) = ω(2^4) + ω(5^3) = 1 + 1 = 2
ω(2,001) = 3
ω(2,002) = 4
ω(2,003) = 1
ω(54,032,858,972,279) = 3
ω(54,032,858,972,302) = 5
ω(20,802,650,704,327,415) = 5

a1(n) - the sum of the distinct primes dividing n, sometimes called sopf(n)
(sequence A008472 in OEIS). For example:

a1(1) = 0
a1(4) = 2
a1(20) = 2 + 5 = 7
a1(27) = 3
a1(144) = a1(2^4 · 3^2) = a1(2^4) + a1(3^2) = 2 + 3 = 5
a1(2,000) = a1(2^4 · 5^3) = a1(2^4) + a1(5^3) = 2 + 5 = 7
a1(2,001) = 55
a1(2,002) = 33
a1(2,003) = 2003
a1(54,032,858,972,279) = 1238665
a1(54,032,858,972,302) = 1780410
a1(20,802,650,704,327,415) = 1238677
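The additive functions above can all be computed from a prime factorization. The sketch below (a trial-division implementation of ours, fine for the small values in the tables but not for the very large ones) reproduces several of the listed values of a0 (sopfr), Ω, ω, and a1 (sopf).

```python
# Compute sopfr (a0), big omega, small omega, and sopf (a1) via trial division.

def factorize(n):
    """Return a list of (prime, exponent) pairs for n >= 2."""
    factors, d = [], 2
    while d * d <= n:
        e = 0
        while n % d == 0:
            n //= d
            e += 1
        if e:
            factors.append((d, e))
        d += 1
    if n > 1:
        factors.append((n, 1))
    return factors

def sopfr(n):        # a0: sum of prime factors with multiplicity
    return sum(p * e for p, e in factorize(n))

def big_omega(n):    # Omega: number of prime factors with multiplicity
    return sum(e for p, e in factorize(n))

def small_omega(n):  # omega: number of distinct prime factors
    return len(factorize(n))

def sopf(n):         # a1: sum of distinct prime factors
    return sum(p for p, e in factorize(n))

assert sopfr(20) == 9 and sopfr(144) == 14 and sopfr(2000) == 23
assert big_omega(144) == 6 and big_omega(2000) == 7
assert small_omega(2000) == 2 and small_omega(2001) == 3
assert sopf(2000) == 7 and sopf(2001) == 55
```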

Multiplicative functions

From any additive function f(n) it is easy to create a related multiplicative function g(n),
i.e. a function with the property that whenever a and b are coprime we have:

g(ab) = g(a) g(b).

One such example is g(n) = 2^f(n).
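This construction can be checked directly. The sketch below (our illustration) takes f = ω, the number of distinct prime factors, and verifies that g(n) = 2^ω(n) is multiplicative over a range of coprime pairs.

```python
from math import gcd

# g(n) = 2^omega(n) is multiplicative: g(ab) = g(a) g(b) when gcd(a, b) = 1,
# because omega(ab) = omega(a) + omega(b) for coprime a and b.

def small_omega(n):
    """Number of distinct prime factors of n, by trial division."""
    count, d = 0, 2
    while d * d <= n:
        if n % d == 0:
            count += 1
            while n % d == 0:
                n //= d
        d += 1
    return count + (1 if n > 1 else 0)

def g(n):
    return 2 ** small_omega(n)

for a in range(2, 40):
    for b in range(2, 40):
        if gcd(a, b) == 1:
            assert g(a * b) == g(a) * g(b)
```

The check fails for non-coprime pairs (e.g. a = b = 2), which is exactly the difference between multiplicative and completely multiplicative.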


Chapter 7

Algebraic Function

In mathematics, an algebraic function is informally a function that satisfies a polynomial
equation whose coefficients are themselves polynomials. For example, an algebraic
function in one variable x is a solution y of an equation

an(x)·y^n + an−1(x)·y^(n−1) + ... + a0(x) = 0,

where the coefficients ai(x) are polynomial functions of x. A function which is not
algebraic is called a transcendental function.

In more precise terms, an algebraic function may not be a function at all, at least not in
the conventional sense. Consider for example the equation of a circle:

y^2 + x^2 = 1.

This determines y, except only up to an overall sign:

y = ±√(1 − x^2).

However, both branches are thought of as belonging to the "function" determined by the
polynomial equation. Thus an algebraic function is most naturally considered as a
multiple-valued function.

An algebraic function in n variables is similarly defined as a function y which solves a
polynomial equation in n + 1 variables:

p(y, x1, x2, ..., xn) = 0.

It is normally assumed that p should be an irreducible polynomial. The existence of an
algebraic function is then guaranteed by the implicit function theorem.

Formally, an algebraic function in n variables over the field K is an element of the
algebraic closure of the field of rational functions K(x1, ..., xn). In order to understand
algebraic functions as functions, it becomes necessary to introduce ideas relating to
Riemann surfaces or more generally algebraic varieties, and sheaf theory.
Algebraic functions in one variable
Introduction and overview

The informal definition of an algebraic function provides a number of clues about the
properties of algebraic functions. To gain an intuitive understanding, it may be helpful to
regard algebraic functions as functions which can be formed by the usual algebraic
operations: addition, multiplication, division, and taking an nth root. Of course, this is
something of an oversimplification; because of casus irreducibilis (and more generally
the fundamental theorem of Galois theory), algebraic functions need not be expressible
by radicals.

First, note that any polynomial p(x) is an algebraic function, since polynomials are simply
the solutions for y of the equation

y − p(x) = 0.

More generally, any rational function p(x)/q(x) is algebraic, being the solution of

q(x)·y − p(x) = 0.

Moreover, the nth root of any polynomial p(x) is an algebraic function, solving the
equation

y^n − p(x) = 0.
Surprisingly, the inverse function of an algebraic function is an algebraic function. For
supposing that y is a solution of

an(x)·y^n + ... + a0(x) = 0

for each value of x, then x is also a solution of this equation for each value of y. Indeed,
interchanging the roles of x and y and gathering terms,

bm(y)·x^m + bm−1(y)·x^(m−1) + ... + b0(y) = 0.

Writing x as a function of y gives the inverse function, also an algebraic function.

However, not every function has an inverse. For example, y = x^2 fails the horizontal line
test: it fails to be one-to-one. The inverse is the algebraic "function" x = ±√y. In this
sense, algebraic functions are often not true functions at all, but instead are multiple-
valued functions.
The role of complex numbers

From an algebraic perspective, complex numbers enter quite naturally into the study of
algebraic functions. First of all, by the fundamental theorem of algebra, the complex
numbers are an algebraically closed field. Hence any polynomial relation

p(y, x) = 0

is guaranteed to have at least one solution (and in general a number of solutions not
exceeding the degree of p in x) for y at each point x, provided we allow y to assume
complex as well as real values. Thus, problems to do with the domain of an algebraic
function can safely be minimized.

A graph of three branches of the algebraic function y, where y^3 − xy + 1 = 0, over the
domain 3/2^(2/3) < x < 50.

Furthermore, even if one is ultimately interested in real algebraic functions, there may be
no adequate means to express the function in a simple manner without resorting to
complex numbers. For example, consider the algebraic function determined by the
equation

y^3 − xy + 1 = 0.

Using the cubic formula, one solution is (the red curve in the accompanying image)

y = ∛(−1/2 + √(1/4 − x^3/27)) + ∛(−1/2 − √(1/4 − x^3/27)).

There is no way to express this function in terms of real numbers only, even though the
resulting function is real-valued on the domain of the graph shown.

On a more significant theoretical level, using complex numbers allows one to use the
powerful techniques of complex analysis to discuss algebraic functions. In particular, the
argument principle can be used to show that any algebraic function is in fact an analytic
function, at least in the multiple-valued sense.

Formally, let p(x, y) be a complex polynomial in the complex variables x and y. Suppose
that x0 ∈ C is such that the polynomial p(x0, y) of y has n distinct zeros. We shall show
that the algebraic function is analytic in a neighborhood of x0. Choose a system of n non-
overlapping discs Δi containing each of these zeros. Then by the argument principle

(1/2πi) ∮_{∂Δi} (∂p/∂y)(x0, y) / p(x0, y) dy = 1.

By continuity, this also holds for all x in a neighborhood of x0. In particular, p(x, y) has
only one root in Δi, given by the residue theorem:

fi(x) = (1/2πi) ∮_{∂Δi} y (∂p/∂y)(x, y) / p(x, y) dy,

which is an analytic function.

Monodromy

Note that the foregoing proof of analyticity derived an expression for a system of n
different function elements fi(x), provided that x is not a critical point of p(x, y). A
critical point is a point where the number of distinct zeros is smaller than the degree of p,
and this occurs only where the highest degree term of p vanishes, and where the
discriminant vanishes. Hence there are only finitely many such points c1, ..., cm.

A close analysis of the properties of the function elements fi near the critical points can be
used to show that the monodromy cover is ramified over the critical points (and possibly
the point at infinity). Thus the entire function associated to the fi has at worst algebraic
poles and ordinary algebraic branchings over the critical points.

Note that, away from the critical points, we have

p(x, y) = an(x)·(y − f1(x))·(y − f2(x)) ⋯ (y − fn(x)),

since the fi are by definition the distinct zeros of p. The monodromy group acts by
permuting the factors, and thus forms the monodromy representation of the Galois
group of p. (The monodromy action on the universal covering space is a related but
different notion in the theory of Riemann surfaces.)

History
The ideas surrounding algebraic functions go back at least as far as René Descartes. The
first discussion of algebraic functions appears to have been in Edward Waring's 1794 An
Essay on the Principles of Human Knowledge, in which he writes:
let a quantity denoting the ordinate, be an algebraic function of the abscissa x, by the
common methods of division and extraction of roots, reduce it into an infinite series
ascending or descending according to the dimensions of x, and then find the integral of
each of the resulting terms.
Chapter 8

Analytic Function

In mathematics, an analytic function is a function that is locally given by a convergent
power series. There exist both real analytic functions and complex analytic functions,
categories that are similar in some ways, but different in others. Functions of each type
are infinitely differentiable, but complex analytic functions exhibit properties that do not
hold generally for real analytic functions. A function is analytic if and only if it is equal
to its Taylor series in some neighborhood of every point.

Definitions
Formally, a function f is real analytic on an open set D in the real line if for any x0 in D
one can write

    f(x) = a0 + a1 (x − x0) + a2 (x − x0)^2 + a3 (x − x0)^3 + ···

in which the coefficients a0, a1, ... are real numbers and the series is convergent to f(x)
for x in a neighborhood of x0.

Alternatively, an analytic function is an infinitely differentiable function such that the
Taylor series at any point x0 in its domain

    T(x) = Σ_{n=0}^∞ f^(n)(x0) (x − x0)^n / n!

converges to f(x) for x in a neighborhood of x0. The set of all real analytic functions on a
given set D is often denoted by C^ω(D).

A function f defined on some subset of the real line is said to be real analytic at a point x
if there is a neighborhood D of x on which f is real analytic.
The definition of a complex analytic function is obtained by replacing, in the definitions
above, "real" with "complex" and "real line" with "complex plane."

Examples
Most special functions are analytic (at least in some range of the complex plane). Typical
examples of analytic functions are:

Any polynomial (real or complex) is an analytic function. This is because if a
polynomial has degree n, any terms of degree larger than n in its Taylor series
expansion will vanish, and so this series will be trivially convergent. Furthermore,
every polynomial is its own Maclaurin series.

The exponential function is analytic. Any Taylor series for this function
converges not only for x close enough to x0 (as in the definition) but for all values
of x (real or complex).

The trigonometric functions, logarithm, and the power functions are analytic on
any open set of their domain.
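As a quick numeric illustration of the convergence claim for the exponential function, the following sketch (Python, with a hypothetical helper name) sums a truncated Maclaurin series for e^x and compares it with the library value:

```python
import math

def taylor_exp(x, terms=30):
    # Partial sum of the Maclaurin series of e^x: sum of x^k / k! for k < terms.
    return sum(x**k / math.factorial(k) for k in range(terms))

# The series converges for every x, not just near the expansion point,
# in line with the remark above.
for x in (-2.0, 0.5, 3.0):
    assert abs(taylor_exp(x) - math.exp(x)) < 1e-9
```

With 30 terms the remainder is far below the tolerance for these inputs; more terms would be needed for larger |x|.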

Typical examples of functions that are not analytic are:

The absolute value function when defined on the set of real numbers or complex
numbers is not everywhere analytic because it is not differentiable at 0. Piecewise
defined functions (functions given by different formulas in different regions) are
typically not analytic where the pieces meet.

The complex conjugate function z ↦ z* is not complex analytic, although its
restriction to the real line is the identity function and therefore real analytic.

Alternate characterizations
If f is an infinitely differentiable function defined on an open set D ⊂ R, then the
following conditions are equivalent.

1) f is real analytic.
2) There is a complex analytic extension of f to an open set G ⊂ C which contains D.
3) For every compact set K ⊂ D there exists a constant C such that for every x in K
and every non-negative integer k the following estimate holds:

    |d^k f / dx^k (x)| ≤ C^(k+1) k!

The real analyticity of a function at a given point x can be characterized using the FBI
transform.
Complex analytic functions are exactly equivalent to holomorphic functions, and are thus
much more easily characterized.

Properties of analytic functions


The sums, products, and compositions of analytic functions are analytic.
The reciprocal of an analytic function that is nowhere zero is analytic, as is the
inverse of an invertible analytic function whose derivative is nowhere zero.
Any analytic function is smooth, that is, infinitely differentiable. The converse is
not true; in fact, in a certain sense, the analytic functions are sparse compared to
all infinitely differentiable functions.
For any open set Ω ⊆ C, the set A(Ω) of all analytic functions u : Ω → C is a
Fréchet space with respect to the uniform convergence on compact sets. The fact
that uniform limits on compact sets of analytic functions are analytic is an easy
consequence of Morera's theorem. The set H^∞(Ω) of all bounded analytic
functions with the supremum norm is a Banach space.

A polynomial cannot be zero at too many points unless it is the zero polynomial (more
precisely, the number of zeros is at most the degree of the polynomial). A similar but
weaker statement holds for analytic functions. If the set of zeros of an analytic function f
has an accumulation point inside its domain, then f is zero everywhere on the connected
component containing the accumulation point. In other words, if (rn) is a sequence of
distinct numbers such that f(rn) = 0 for all n and this sequence converges to a point r in
the domain D, then f is identically zero on the connected component of D containing r.

Also, if all the derivatives of an analytic function at a point are zero, the function is
constant on the corresponding connected component.

These statements imply that while analytic functions do have more degrees of freedom
than polynomials, they are still quite rigid.

Analyticity and differentiability


As noted above, any analytic function (real or complex) is infinitely differentiable (also
known as smooth, or C^∞). (Note that this differentiability is in the sense of real variables;
compare complex derivatives below.) There exist smooth real functions which are not
analytic. In fact there are many such functions, and the space of real analytic functions is
a proper subspace of the space of smooth functions.
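The gap between smooth and analytic can be seen with the classic function exp(−1/x^2), extended by 0 at the origin: every derivative at 0 vanishes, so its Taylor series at 0 is identically zero, yet the function is positive away from 0. A minimal numeric sketch:

```python
import math

def f(x):
    # Classic smooth-but-not-analytic function: exp(-1/x^2) for x != 0, f(0) = 0.
    return math.exp(-1.0 / x**2) if x != 0 else 0.0

# Every derivative of f at 0 is 0, so the Taylor series of f at 0 is the zero
# series; it converges everywhere, but not to f on any neighborhood of 0.
h = 1e-3
central_diff = (f(h) - f(-h)) / (2 * h)   # numerical estimate of f'(0)
assert abs(central_diff) < 1e-12           # consistent with f'(0) = 0
assert f(0.5) > 0.0                        # yet f is not identically zero
```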

The situation is quite different when one considers complex analytic functions and
complex derivatives. It can be proved that any complex function differentiable (in the
complex sense) in an open set is analytic. Consequently, in complex analysis, the term
analytic function is synonymous with holomorphic function.
Real versus complex analytic functions
Real and complex analytic functions have important differences (one could notice that
even from their different relationship with differentiability). Analyticity of complex
functions is a more restrictive property, as it has more restrictive necessary conditions
and complex analytic functions have more structure than their real-line counterparts.

According to Liouville's theorem, any bounded complex analytic function defined on the
whole complex plane is constant. The corresponding statement for real analytic functions,
with the complex plane replaced by the real line, is clearly false; this is illustrated by

    f(x) = 1 / (x^2 + 1),

which is bounded and real analytic on the whole real line.
Also, if a complex analytic function is defined in an open ball around a point x0, its power
series expansion at x0 is convergent in the whole ball. This statement for real analytic
functions (with open ball meaning an open interval of the real line rather than an open
disk of the complex plane) is not true in general; the function of the example above gives
an example for x0 = 0 and a ball of radius exceeding 1, since the power series
1 − x^2 + x^4 − x^6 + ··· diverges for |x| > 1.
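The divergence for |x| > 1 is easy to observe numerically; a small sketch (helper name assumed) computes partial sums of the series 1 − x^2 + x^4 − x^6 + ···:

```python
def partial_sum(x, n):
    # Partial sum of 1 - x^2 + x^4 - x^6 + ..., the Maclaurin series of 1/(1+x^2).
    return sum((-1)**k * x**(2 * k) for k in range(n))

# Inside |x| < 1 the partial sums converge to 1/(1+x^2)...
assert abs(partial_sum(0.5, 50) - 1 / (1 + 0.5**2)) < 1e-12
# ...but for |x| > 1 the terms grow without bound, so the series diverges.
assert abs(partial_sum(1.5, 50)) > 1e6
```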

Any real analytic function on some open set on the real line can be extended to a complex
analytic function on some open set of the complex plane. However, not every real
analytic function defined on the whole real line can be extended to a complex function
defined on the whole complex plane. The function f(x) defined in the paragraph above is
a counterexample, as it is not defined for x = ±i.

Analytic functions of several variables


One can define analytic functions in several variables by means of power series in those
variables. Analytic functions of several variables have some of the same properties as
analytic functions of one variable. However, especially for complex analytic functions,
new and interesting phenomena show up when working in 2 or more dimensions. For
instance, zero sets of complex analytic functions in more than one variable are never
discrete.
Chapter 9

Completely Multiplicative Function and Concave Function

Completely multiplicative function


In number theory, functions of positive integers which respect products are important and
are called completely multiplicative functions or totally multiplicative functions.
Especially in number theory, a weaker condition is also important, respecting only
products of coprime numbers, and such functions are called multiplicative functions.
Outside of number theory, the term "multiplicative function" is often taken to be
synonymous with "completely multiplicative function" as defined here.

Definition
A completely multiplicative function (or totally multiplicative function) is an
arithmetic function (that is, a function whose domain is the natural numbers), such that
f(1) = 1 and f(ab) = f(a) f(b) holds for all positive integers a and b.

Without the requirement that f(1) = 1, one could still have f(1) = 0, but then f(a) = 0 for
all positive integers a, so this is not a very strong restriction.

Examples
The easiest example of a completely multiplicative function is a monomial: for any
particular positive integer n, define f(a) = a^n.

Properties
A completely multiplicative function is completely determined by its values at the prime
numbers, a consequence of the fundamental theorem of arithmetic. Thus, if n is a product
of powers of distinct primes, say n = p^a q^b ..., then f(n) = f(p)^a f(q)^b ...
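This determination-by-primes can be sketched in code. Assuming a hypothetical trial-division helper for the factorization, a completely multiplicative f is rebuilt from its values at the primes alone:

```python
def prime_factorization(n):
    # Return the factorization of n as a dict {prime: exponent} (trial division).
    factors, p = {}, 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def extend_completely_multiplicative(f_at_primes):
    # Build f(n) = f(p)^a f(q)^b ... over the factorization n = p^a q^b ...
    def f(n):
        result = 1
        for p, a in prime_factorization(n).items():
            result *= f_at_primes(p) ** a
        return result
    return f

# Example: f(p) = p^2 at the primes extends to f(n) = n^2 for every n,
# and the extension is completely multiplicative by construction.
f = extend_completely_multiplicative(lambda p: p * p)
assert f(12) == 144 and f(1) == 1
assert f(6 * 35) == f(6) * f(35)
```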
Concave function
In mathematics, a concave function is the negative of a convex function. A concave
function is also synonymously called concave downwards, concave down, convex cap
or upper convex.

Definition
A real-valued function f defined on an interval (or on any convex set C of some vector
space) is called concave if, for any two points x and y in its domain C and any t in [0,1],
we have

    f(tx + (1 − t)y) ≥ t f(x) + (1 − t) f(y).

A function is called strictly concave if

    f(tx + (1 − t)y) > t f(x) + (1 − t) f(y)

for any t in (0,1) and x ≠ y.

For a function f : R → R, this definition merely states that for every z between x and y,
the point (z, f(z)) on the graph of f is above the straight line joining the points (x, f(x))
and (y, f(y)).

A function f(x) is quasiconcave if the upper contour sets of the function,

    Sa = {x : f(x) ≥ a},

are convex sets.
Properties
A function f(x) is concave over a convex set if and only if the function −f(x) is a convex
function over the set.

A differentiable function f is concave on an interval if its derivative function f' is
monotonically decreasing on that interval: a concave function has a decreasing slope.
("Decreasing" here means "non-increasing", rather than "strictly decreasing", and thus
allows zero slopes.)

For a twice-differentiable function f, if the second derivative, f''(x), is positive (or, if the
acceleration is positive), then the graph is convex; if f''(x) is negative, then the graph is
concave. Points where concavity changes are inflection points.

If a convex (i.e., concave upward) function has a "bottom", any point at the bottom is a
minimal extremum. If a concave (i.e., concave downward) function has an "apex", any
point at the apex is a maximal extremum.

If f(x) is twice-differentiable, then f(x) is concave if and only if f''(x) is non-positive. If its
second derivative is negative then it is strictly concave, but the converse is not true, as
shown by f(x) = −x^4.

If f is concave and differentiable, then

    f(y) ≤ f(x) + f'(x) (y − x)

for all x and y in its domain.

A continuous function on C is concave if and only if for any x and y in C

    f((x + y) / 2) ≥ (f(x) + f(y)) / 2.

If a function f is concave and f(0) ≥ 0, then f is subadditive. Proof: since f is concave,
taking y = 0 in the defining inequality gives

    f(tx) = f(tx + (1 − t)·0) ≥ t f(x) + (1 − t) f(0) ≥ t f(x).

Hence

    f(a) + f(b) = f((a + b) · a/(a + b)) + f((a + b) · b/(a + b))
                ≥ a/(a + b) · f(a + b) + b/(a + b) · f(a + b) = f(a + b).

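A quick numeric spot-check of subadditivity, using √x (which is concave on [0, ∞) with √0 = 0 ≥ 0); this is only an empirical sanity check, not a proof:

```python
import math, random

# sqrt is concave with sqrt(0) = 0 >= 0, so it should satisfy
# sqrt(a + b) <= sqrt(a) + sqrt(b) for all a, b >= 0.
random.seed(0)
for _ in range(1000):
    a, b = random.uniform(0, 100), random.uniform(0, 100)
    assert math.sqrt(a + b) <= math.sqrt(a) + math.sqrt(b) + 1e-12
```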
Examples
The functions f(x) = −x^2 and f(x) = √x are concave, as the second derivative
is always negative (on their respective domains).
Any linear function f(x) = ax + b is both concave and convex.
The function f(x) = sin(x) is concave on the interval [0, π].
The function log |B|, where |B| is the determinant of a nonnegative-definite
matrix B, is concave.
Practical application: rays bending in the computation of radiowave attenuation in
the atmosphere.
Chapter 10

Convex Function

Convex function on an interval

A function (in black) is convex if and only if the region above its graph (in green) is a
convex set.
In mathematics, a real-valued function f(x) defined on an interval (or on any convex
subset of some vector space) is called convex, concave upwards, concave up or convex
cup, if for any two points x1 and x2 in its domain X and any t in [0,1],

    f(t x1 + (1 − t) x2) ≤ t f(x1) + (1 − t) f(x2).

A function is called strictly convex if

    f(t x1 + (1 − t) x2) < t f(x1) + (1 − t) f(x2)

for every t in (0,1) and x1 ≠ x2.

Note that the function must be defined over a convex set, since otherwise the point
t x1 + (1 − t) x2 may not lie in the function domain.

A function f is said to be (strictly) concave if −f is (strictly) convex.

Pictorially, a function is called 'convex' if the function lies below or on the straight line
segment connecting two points, for any two points in the interval.

Sometimes an alternative definition is used:

A function is convex if its epigraph (the set of points lying on or above the graph) is a
convex set.

These two definitions are equivalent, i.e., one holds if and only if the other one is true.

Properties
Suppose f is a function of one real variable defined on an interval, and let

    R(x, y) = (f(y) − f(x)) / (y − x)

(note that R(x, y) is the slope of the chord of f between x and y; note also that the
function R is symmetric in x, y). f is convex if and only if R(x, y) is monotonically non-
decreasing in x, for y fixed (or vice versa). This characterization of convexity is quite
useful to prove the following results.
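The slope characterization can be spot-checked numerically; a small sketch with f(x) = x^2, where the helper name R is chosen to match the notation above:

```python
def R(f, x, y):
    # Slope of the chord of f between x and y.
    return (f(y) - f(x)) / (y - x)

f = lambda x: x * x                      # a convex function
xs = [-2.0, -1.0, 0.5, 1.3, 4.0]         # increasing sample points
y = 10.0                                 # fixed second argument

# For convex f, R(x, y) should be non-decreasing in x for fixed y.
slopes = [R(f, x, y) for x in xs]
assert all(s1 <= s2 for s1, s2 in zip(slopes, slopes[1:]))
```

For f(x) = x^2 the chord slope is R(x, y) = x + y, which is visibly increasing in x.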

A convex function f defined on some open interval C is continuous on C and Lipschitz
continuous on any closed subinterval. f admits left and right derivatives, and these are
monotonically non-decreasing. As a consequence, f is differentiable at all but at most
countably many points. If C is closed, then f may fail to be continuous at the endpoints
of C (an example is shown in the examples section).
A function is midpoint convex on an interval C if

    f((x + y) / 2) ≤ (f(x) + f(y)) / 2

for all x and y in C. This condition is only slightly weaker than convexity. For example, a
real-valued Lebesgue measurable function that is midpoint convex will be convex. In
particular, a continuous function that is midpoint convex will be convex.

A differentiable function of one variable is convex on an interval if and only if its
derivative is monotonically non-decreasing on that interval. If a function is differentiable
and convex then it is also continuously differentiable.

A continuously differentiable function of one variable is convex on an interval if and only
if the function lies above all of its tangents:

    f(y) ≥ f(x) + f'(x) (y − x)

for all x and y in the interval. In particular, if f'(c) = 0, then c is a global minimum of f(x).

A twice differentiable function of one variable is convex on an interval if and only if its
second derivative is non-negative there; this gives a practical test for convexity. If its
second derivative is positive then it is strictly convex, but the converse does not hold. For
example, the second derivative of f(x) = x^4 is f''(x) = 12x^2, which is zero for x = 0, but
x^4 is strictly convex.

More generally, a continuous, twice differentiable function of several variables is convex
on a convex set if and only if its Hessian matrix is positive semidefinite on the interior of
the convex set.

Any local minimum of a convex function is also a global minimum. A strictly convex
function will have at most one global minimum.

For a convex function f, the sublevel sets {x | f(x) < a} and {x | f(x) ≤ a} with a ∈ R are
convex sets. However, a function whose sublevel sets are convex sets may fail to be a
convex function. A function whose sublevel sets are convex is called a quasiconvex
function.

Jensen's inequality applies to every convex function f. If X is a random variable taking
values in the domain of f, then

    E(f(X)) ≥ f(E(X)).

(Here E denotes the mathematical expectation.)
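Jensen's inequality can be spot-checked by Monte Carlo sampling; a rough sketch with the convex function f(x) = x^2 and standard normal samples:

```python
import random

random.seed(1)
xs = [random.gauss(0.0, 1.0) for _ in range(100_000)]
f = lambda x: x * x                           # a convex function

mean_of_f = sum(f(x) for x in xs) / len(xs)   # estimate of E(f(X))
f_of_mean = f(sum(xs) / len(xs))              # f of the sample mean

# Jensen: E(f(X)) >= f(E(X)); here E(X^2) is near 1 and f(E(X)) is near 0.
assert mean_of_f >= f_of_mean
```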

If a function f is convex and f(0) ≤ 0, then f is superadditive on the positive half-axis.
Proof: since f is convex, taking y = 0 gives

    f(tx) = f(tx + (1 − t)·0) ≤ t f(x) + (1 − t) f(0) ≤ t f(x)

for every t in [0,1]. Hence

    f(a) + f(b) = f((a + b) · a/(a + b)) + f((a + b) · b/(a + b))
                ≤ a/(a + b) · f(a + b) + b/(a + b) · f(a + b) = f(a + b).

Convex function calculus


If f and g are convex functions, then so are m(x) = max{f(x),g(x)} and h(x) = f(x) +
g(x).
If f and g are convex functions and g is non-decreasing, then h(x) = g(f(x)) is
convex.
If f is concave and g is convex and non-increasing, then h(x) = g(f(x)) is convex.
Convexity is invariant under affine maps: that is, if f(x) is convex with x ∈ R^n,
then so is g(y) = f(Ay + b) with y ∈ R^m, where A ∈ R^(n×m) and b ∈ R^n.
If f(x, y) is convex in x, then g(x) = sup_{y ∈ C} f(x, y) is convex in x, provided
g(x) > −∞ for some x.
If f(x) is convex, then its perspective function g(x, t) = t f(x/t) (whose domain is
{(x, t) : x/t in the domain of f, t > 0}) is convex.

Strongly convex functions


The concept of strong convexity extends and parametrizes the notion of strict convexity.
A strongly convex function is also strictly convex, but not vice-versa.

A differentiable function f is called strongly convex with parameter m > 0 if the following
inequality holds for all points x, y in its domain:

    (∇f(x) − ∇f(y))^T (x − y) ≥ m ||x − y||^2.

This is equivalent to the following:

    f(y) ≥ f(x) + ∇f(x)^T (y − x) + (m/2) ||y − x||^2.
It is not necessary for a function to be differentiable in order to be strongly convex. A
third definition for a strongly convex function, with parameter m, is that, for all x, y in the
domain and t in [0,1],

    f(t x + (1 − t) y) ≤ t f(x) + (1 − t) f(y) − (m/2) t (1 − t) ||x − y||^2.

Notice that this definition approaches the definition for strict convexity as m → 0, and
is identical to the definition of a convex function when m = 0. Despite this, functions
exist that are strictly convex but are not strongly convex for any m > 0 (see example
below).

If the function f is twice continuously differentiable, then f is strongly convex with
parameter m if and only if ∇²f(x) ⪰ mI for all x in the domain, where I is the identity
and ∇²f is the Hessian matrix, and the inequality ⪰ means that ∇²f(x) − mI is
positive semidefinite. This is equivalent to requiring that the minimum eigenvalue of
∇²f(x) be at least m for all x. If the domain is just the real line, then ∇²f(x) is just the
second derivative f''(x), so the condition becomes f''(x) ≥ m. If m = 0, then this
means the Hessian is positive semidefinite (or if the domain is the real line, it means that
f''(x) ≥ 0), which implies the function is convex, and perhaps strictly convex, but not
strongly convex.

Assuming still that the function is twice continuously differentiable, we show that the
lower bound of ∇²f implies that it is strongly convex. Start by using Taylor's
theorem:

    f(y) = f(x) + ∇f(x)^T (y − x) + (1/2) (y − x)^T ∇²f(z) (y − x)

for some (unknown) z on the segment between x and y. Then

    (y − x)^T ∇²f(z) (y − x) ≥ m ||y − x||^2

by the assumption about the eigenvalues, and hence we recover the second strong
convexity inequality above.

The distinction between convex, strictly convex, and strongly convex can be subtle at
first glimpse. If f is twice continuously differentiable and the domain is the real line, then
we can characterize it as follows:

f convex if and only if f''(x) ≥ 0 for all x
f strictly convex if f''(x) > 0 for all x (note: this is sufficient, but not necessary)
f strongly convex if and only if f''(x) ≥ m > 0 for all x

For example, consider a function f that is strictly convex, and suppose there is a sequence
of points (xn) such that f''(xn) → 0. Even though f''(x) > 0 at every point, the function is
not strongly convex because f''(x) will become arbitrarily small.
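The one-variable tests above can be illustrated with e^x, which is strictly but not strongly convex; a minimal numeric sketch (the variable f2 stands for the second derivative):

```python
import math

# One-variable characterization (twice continuously differentiable f):
#   convex          <=> f''(x) >= 0 everywhere
#   strongly convex <=> f''(x) >= m > 0 everywhere
# For f(x) = e^x, f''(x) = e^x > 0, so f is strictly convex; but
# f''(x) -> 0 as x -> -infinity, so no uniform bound m > 0 exists.
f2 = lambda x: math.exp(x)                     # second derivative of e^x

assert all(f2(x) > 0 for x in range(-10, 11))  # strictly positive on a grid
assert f2(-50) < 1e-6                          # but arbitrarily small
```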

Strongly convex functions are in general easier to work with than convex or strictly
convex functions, since they are a smaller class. Like strictly convex functions, strongly
convex functions have unique minima.
Examples
The function f(x) = x2 has f''(x) = 2 > 0 at all points, so f is a convex function. It is
also strongly convex (and hence strictly convex too), with strong convexity
constant 2.
The function f(x) = x^4 has f''(x) = 12x^2 ≥ 0, so f is a convex function. It is
strictly convex, even though the second derivative is not strictly positive at all
points. It is not strongly convex.
The absolute value function f(x) = | x | is convex, even though it does not have a
derivative at the point x = 0. It is not strictly convex.
The function f(x) = |x|^p for 1 ≤ p is convex.
The exponential function f(x) = ex is convex. It is also strictly convex, since f''(x) =
ex > 0, but it is not strongly convex since the second derivative can be arbitrarily
close to zero. More generally, the function g(x) = ef(x) is logarithmically convex if
f is a convex function.
The function f with domain [0,1] defined by f(0) = f(1) = 1, f(x) = 0 for 0 < x < 1 is
convex; it is continuous on the open interval (0, 1), but not continuous at 0 and 1.
The function x^3 has second derivative 6x; thus it is convex on the set where x ≥ 0
and concave on the set where x ≤ 0.
Every linear transformation taking values in R is convex but not strictly convex,
since if f is linear, then f(a + b) = f(a) + f(b). This statement also holds if we
replace "convex" by "concave".
Every affine function taking values in R, i.e., each function of the form f(x) = a^T x
+ b, is simultaneously convex and concave.
Every norm is a convex function, by the triangle inequality and positive
homogeneity.
Examples of functions that are monotonically increasing but not convex include
f(x) = √x and g(x) = log(x).
Examples of functions that are convex but not monotonically increasing include
h(x) = x^2 and k(x) = −x.

The function f(x) = 1/x has f''(x) = 2/x^3, which is greater than 0 if x > 0, so f(x)
is convex on the interval (0, +∞). It is concave on the interval (−∞, 0).
The function f(x) = 1/x^2, with f(0) = +∞, is convex on the interval (0, +∞) and
convex on the interval (−∞, 0), but not convex on the interval (−∞, +∞), because of
the singularity at x = 0.
Chapter 11

Differentiable Function

A differentiable function
The absolute value function is not differentiable at x = 0.

In calculus (a branch of mathematics), a differentiable function is a function whose
derivative exists at each point in its domain. The graph of a differentiable function must
have a non-vertical tangent line at each point in its domain. As a result, the graph of a
differentiable function must be relatively smooth, and cannot contain any breaks, bends,
or cusps, or any points with a vertical tangent.

More generally, if x0 is a point in the domain of a function f, then f is said to be
differentiable at x0 if the derivative f'(x0) is defined. This means that the graph of f has
a non-vertical tangent line at the point (x0, f(x0)), and therefore cannot have a break,
bend, or cusp at this point.
Differentiability and continuity

The Weierstrass function is continuous, but is not differentiable at any point.

If f is differentiable at a point x0, then f must also be continuous at x0. In particular, any
differentiable function must be continuous at every point in its domain. The converse
does not hold: a continuous function need not be differentiable. For example, a function
with a bend, cusp, or vertical tangent may be continuous, but fails to be differentiable at
the location of the anomaly.

Most functions which occur in practice have derivatives at all points or at almost every
point. However, a result of Stefan Banach states that the set of functions which have a
derivative at some point is a meager set in the space of all continuous functions.
Informally, this means that differentiable functions are very atypical among continuous
functions. The first known example of a function that is continuous everywhere but
differentiable nowhere is the Weierstrass function.

Differentiability classes
A function f is said to be continuously differentiable if the derivative f'(x) exists and is
itself a continuous function. Though the derivative of a differentiable function never has a
jump discontinuity, it is possible for the derivative to have an essential discontinuity. For
example, the function

    f(x) = x^2 sin(1/x)  if x ≠ 0,    f(0) = 0

is differentiable at 0 (with the derivative being 0), but the derivative is not continuous at
this point.
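This behavior can be observed numerically; a sketch using the standard example f(x) = x^2 sin(1/x) (with f(0) = 0), whose derivative away from 0 is f'(x) = 2x sin(1/x) − cos(1/x):

```python
import math

def f(x):
    # Differentiable everywhere, but f' has an essential discontinuity at 0.
    return x * x * math.sin(1.0 / x) if x != 0 else 0.0

# f'(0) = 0, since |f(h)/h| = |h sin(1/h)| <= |h|:
h = 1e-6
assert abs(f(h) / h) <= abs(h)

# Yet f' oscillates near 0: at x = 1/(2*pi*n) the derivative is close to -1.
fprime = lambda x: 2 * x * math.sin(1 / x) - math.cos(1 / x)
assert abs(fprime(1 / (2 * math.pi * 1000)) + 1.0) < 1e-3
```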

Sometimes continuously differentiable functions are said to be of class C^1. A function is
of class C^2 if the first and second derivative of the function both exist and are continuous.
More generally, a function is said to be of class C^k if the first k derivatives f'(x), f''(x), ...,
f^(k)(x) all exist and are continuous.

Differentiability in higher dimensions


A function f : R^m → R^n is said to be differentiable at a point x0 if there exists a linear map
J : R^m → R^n such that

    lim_{h → 0} ||f(x0 + h) − f(x0) − J(h)|| / ||h|| = 0.

If a function is differentiable at x0, then all of the partial derivatives must exist at x0, in
which case the linear map J is given by the Jacobian matrix.

Note that existence of the partial derivatives (or even all of the directional derivatives)
does not guarantee that a function is differentiable at a point. For example, the function
f : R^2 → R defined by

    f(x, y) = x  if y ≠ x^2,    f(x, y) = 0  if y = x^2

is not differentiable at (0, 0), but all of the partial derivatives and directional derivatives
exist at this point. For a continuous example, the function

    f(x, y) = y^3 / (x^2 + y^2)  if (x, y) ≠ (0, 0),    f(0, 0) = 0

is not differentiable at (0, 0), but again all of the partial derivatives and directional
derivatives exist.
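One standard continuous function with this behavior is f(x, y) = y^3/(x^2 + y^2), extended by 0 at the origin; a numeric sketch of why no single linear map can reproduce all its directional derivatives there:

```python
import math

def f(x, y):
    # Continuous at the origin; all directional derivatives exist at (0, 0),
    # yet f is not differentiable there.
    return y**3 / (x**2 + y**2) if (x, y) != (0, 0) else 0.0

def directional_derivative(theta, t=1e-8):
    # Derivative of f at the origin along the unit direction (cos t, sin t);
    # algebraically this equals sin(theta)**3 exactly.
    u, v = math.cos(theta), math.sin(theta)
    return f(t * u, t * v) / t

# The partials are 0 (x-direction) and 1 (y-direction), so a derivative would
# have to be the linear map (h, k) -> k, giving sin(theta) along each direction.
# But the actual directional derivative is sin(theta)^3, which differs:
th = math.pi / 4
assert abs(directional_derivative(th) - math.sin(th)**3) < 1e-6
assert abs(math.sin(th)**3 - math.sin(th)) > 0.3
```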

It is known that if the partial derivatives of a function all exist and are continuous in a
neighborhood of a point, then the function must be differentiable at that point, and is in
fact of class C1.
Differentiability in complex analysis
In complex analysis, any function that is complex-differentiable in a neighborhood of a
point is called holomorphic. Such a function is necessarily infinitely differentiable, and in
fact analytic.

Differentiable functions on manifolds


If M is a differentiable manifold, a real or complex-valued function on M is said to be
differentiable at a point p if it is differentiable with respect to some (or any) coordinate
chart defined around p. More generally, if M and N are differentiable manifolds, a
function f : M → N is said to be differentiable at a point p if it is differentiable with
respect to some (or any) coordinate charts defined around p and f(p).
Chapter 12

Elementary Function and Entire Function

Elementary function
In mathematics, an elementary function is a function built from a finite number of
exponentials, logarithms, constants, one variable, and nth roots through composition and
combinations using the four elementary operations (+, −, ×, ÷). By allowing these
functions (and constants) to be complex numbers, the trigonometric functions and their
inverses become included in the elementary functions.

The roots of equations are the functions implicitly defined as solving a polynomial
equation with constant coefficients. For polynomials of degree four and smaller there are
explicit formulae for the roots (the formulae are elementary functions).

Elementary functions were introduced by Joseph Liouville in a series of papers from
1833 to 1841. An algebraic treatment of elementary functions was started by Joseph Fels
Ritt in the 1930s.

Examples
Examples of elementary functions include polynomials, rational functions, the
exponential and logarithm, and compositions of these with roots, such as

    f(x) = −i ln(x + i √(1 − x^2)).

This last function is equal to the inverse cosine trigonometric function arccos(x) in the
entire complex domain. Hence, arccos(x) is an elementary function. An example of a
function that is not elementary is the error function

    erf(x) = (2/√π) ∫_0^x e^(−t^2) dt,

a fact that cannot be seen directly from the definition of elementary function but can be
proven using the Risch algorithm.

Differential algebra
The mathematical definition of an elementary function, or a function in elementary
form, is considered in the context of differential algebra. A differential algebra is an
algebra with the extra operation of derivation (algebraic version of differentiation). Using
the derivation operation new equations can be written and their solutions used in
extensions of the algebra. By starting with the field of rational functions, two special
types of transcendental extensions (the logarithm and the exponential) can be added to
the field building a tower containing elementary functions.

A differential field F is a field F0 (rational functions over the rationals Q, for example)
together with a derivation map u ↦ ∂u. (Here ∂u is a new function. Sometimes the
notation u′ is used.) The derivation captures the properties of differentiation, so that for
any two elements of the base field, the derivation is linear,

    ∂(u + v) = ∂u + ∂v,

and satisfies the Leibniz product rule,

    ∂(u · v) = u ∂v + v ∂u.

An element h is a constant if ∂h = 0. If the base field is over the rationals, care must be
taken when extending the field to add the needed transcendental constants.

A function u of a differential extension F[u] of a differential field F is an elementary
function over F if the function u

is algebraic over F, or
is an exponential, that is, ∂u = u ∂a for a ∈ F, or
is a logarithm, that is, ∂u = ∂a / a for a ∈ F

(see also Liouville's theorem).


Entire function
In complex analysis, an entire function, also called an integral function, is a complex-
valued function that is holomorphic over the whole complex plane. Typical examples of
entire functions are the polynomials and the exponential function, and any sums, products
and compositions of these, including the error function and the trigonometric functions
sine and cosine and their hyperbolic counterparts the hyperbolic sine and hyperbolic
cosine functions. Neither the natural logarithm nor the square root functions can be
continued analytically to an entire function.

A transcendental entire function is an entire function that is not a polynomial.

Properties
Every entire function can be represented as a power series which converges uniformly on
compact sets. The Weierstrass factorization theorem asserts that any entire function can
be represented by a product involving its zeroes.

The entire functions on the complex plane form an integral domain (in fact a Prüfer
domain).

Liouville's theorem states that any bounded entire function must be constant. Liouville's
theorem may be used to elegantly prove the fundamental theorem of algebra.

As a consequence of Liouville's theorem, any function which is entire on the whole
Riemann sphere (complex plane and the point at infinity) is constant. Thus any non-
constant entire function must have a singularity at the complex point at infinity, either a
pole for a polynomial or an essential singularity for a transcendental entire function.
Specifically, by the Casorati-Weierstrass theorem, for any transcendental entire function
f and any complex w there is a sequence (zm), m ∈ N, with |zm| → ∞ and
f(zm) → w.

Picard's little theorem is a much stronger result: any non-constant entire function takes on
every complex number as value, except possibly one. The latter exception is illustrated
by the exponential function, which never takes on the value 0.

Liouville's theorem is a special case of the following statement: any entire function f
satisfying the inequality |f(z)| ≤ M |z|^n for all z with |z| ≥ R, with n a natural
number and M and R positive constants, is necessarily a polynomial, of degree at most n.
Conversely, any entire function f satisfying the inequality |f(z)| ≥ M |z|^n for all z
with |z| ≥ R, with n a natural number and M and R positive constants, is necessarily a
polynomial, of degree at least n.
Order and growth
The order (at infinity) of an entire function f(z) is defined using the limit superior as:

    ρ = limsup_{r → ∞} ln ln ||f||_{∞,Br} / ln r,

where Br is the disk of radius r and ||f||_{∞,Br} denotes the supremum norm of f(z) on Br. If
0 < ρ < ∞, one can also define the type:

    σ = limsup_{r → ∞} ln ||f||_{∞,Br} / r^ρ.

In other words, the order of f(z) is the infimum of all m such that f(z) = O(exp(|z|^m))
as z → ∞. The order need not be finite.

Entire functions may grow as fast as any increasing function: for any increasing function
g : [0, ∞) → [0, ∞) there exists an entire function f(z) such that f(x) > g(|x|) for all real
x. Such a function f may be easily found of the form:

    f(z) = c + Σ_{k=1}^∞ (z/k)^{n_k}

for a constant c and a conveniently chosen strictly increasing sequence of positive integers nk. Any such
sequence defines an entire function f(z); and if it is conveniently chosen, the inequality f(x)
> g(|x|) also holds, for all real x.

Other examples
J. E. Littlewood chose the Weierstrass sigma function as a 'typical' entire function in one
of his books. Other examples include the Fresnel integrals, the Jacobi theta function, and
the reciprocal Gamma function. The exponential function and the error function are
special cases of the Mittag-Leffler function.
Chapter 13

Even and Odd Functions

In mathematics, even functions and odd functions are functions which satisfy particular
symmetry relations, with respect to taking additive inverses. They are important in many
areas of mathematical analysis, especially the theory of power series and Fourier series.
They are named for the parity of the powers of the power functions which satisfy each
condition: the function f(x) = xn is an even function if n is an even integer, and it is an odd
function if n is an odd integer.

Even functions

f(x) = x^2 is an example of an even function.

Let f(x) be a real-valued function of a real variable. Then f is even if the following
equation holds for all x in the domain of f:

    f(x) = f(−x).

Geometrically, the graph of an even function is symmetric with respect to the y-axis,
meaning that its graph remains unchanged after reflection about the y-axis.

Examples of even functions are |x|, x^2, x^4, cos(x), and cosh(x).

Odd functions

f(x) = x^3 is an example of an odd function.

Again, let f(x) be a real-valued function of a real variable. Then f is odd if the following equation holds for all x in the domain of f:

−f(x) = f(−x),

or equivalently

f(x) + f(−x) = 0.

Geometrically, the graph of an odd function has rotational symmetry with respect to the
origin, meaning that its graph remains unchanged after rotation of 180 degrees about the
origin.

Examples of odd functions are x, x^3, sin(x), sinh(x), and erf(x).
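As a quick numerical illustration (a sketch of my own, not from the original text), the defining symmetries can be spot-checked on sample points:

```python
import math

def is_even(f, xs, tol=1e-12):
    """Numerically check f(-x) == f(x) on sample points (not a proof)."""
    return all(abs(f(-x) - f(x)) <= tol for x in xs)

def is_odd(f, xs, tol=1e-12):
    """Numerically check f(-x) == -f(x) on sample points."""
    return all(abs(f(-x) + f(x)) <= tol for x in xs)

xs = [0.1 * k for k in range(1, 20)]
print(is_even(abs, xs), is_even(math.cos, xs))           # True True
print(is_odd(lambda x: x**3, xs), is_odd(math.sin, xs))  # True True
print(is_even(lambda x: x**3 + 1, xs), is_odd(lambda x: x**3 + 1, xs))  # False False
```

Such a finite spot check cannot prove evenness or oddness, but it quickly exposes a function like x^3 + 1 that is neither.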

Some facts

f(x) = x^3 + 1 is neither even nor odd.


A function's being odd or even does not imply differentiability, or even continuity. For
example, the Dirichlet function is even, but is nowhere continuous. Properties involving
Fourier series, Taylor series, derivatives and so on may only be used when they can be
assumed to exist.

Basic properties

The only function which is both even and odd is the constant function which is
identically zero (i.e., f(x) = 0 for all x).
The sum of an even and odd function is neither even nor odd, unless one of the
functions is identically zero.
The sum of two even functions is even, and any constant multiple of an even
function is even.
The sum of two odd functions is odd, and any constant multiple of an odd
function is odd.
The product of two even functions is an even function.
The product of two odd functions is an even function.
The product of an even function and an odd function is an odd function.
The quotient of two even functions is an even function.
The quotient of two odd functions is an even function.
The quotient of an even function and an odd function is an odd function.
The derivative of an even function is odd.
The derivative of an odd function is even.
The composition of two even functions is even, and the composition of two odd
functions is odd.
The composition of an even function and an odd function is even.
The composition of any function with an even function is even (but not vice
versa).
The integral of an odd function from −A to +A is zero (where A is finite, and the
function has no vertical asymptotes between −A and A).
The integral of an even function from −A to +A is twice the integral from 0 to +A
(where A is finite, and the function has no vertical asymptotes between −A and A).
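Several of the properties above are easy to verify numerically. The sketch below (my own illustration; the midpoint-rule integrator is a hypothetical helper, not from the text) checks that the product of two odd functions is even and that symmetric integrals behave as stated:

```python
import math

def integrate(fn, a, b, n=100000):
    """Midpoint-rule approximation of the integral of fn over [a, b]."""
    dx = (b - a) / n
    return sum(fn(a + (i + 0.5) * dx) for i in range(n)) * dx

# Product of two odd functions (x^3 and sin) is even: h(-x) == h(x).
h = lambda x: x**3 * math.sin(x)
print(all(abs(h(-x) - h(x)) < 1e-12 for x in (0.3, 1.1, 2.5)))  # True

# Integral of an odd function over [-A, A] vanishes.
print(abs(integrate(math.sin, -2.0, 2.0)) < 1e-8)  # True

# Integral of an even function over [-A, A] is twice that over [0, A].
print(abs(integrate(math.cos, -2.0, 2.0)
          - 2 * integrate(math.cos, 0.0, 2.0)) < 1e-8)  # True
```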

Series

The Maclaurin series of an even function includes only even powers.


The Maclaurin series of an odd function includes only odd powers.
The Fourier series of a periodic even function includes only cosine terms.
The Fourier series of a periodic odd function includes only sine terms.

Algebraic structure

Any linear combination of even functions is even, and the even functions form a
vector space over the reals. Similarly, any linear combination of odd functions is
odd, and the odd functions also form a vector space over the reals. In fact, the
vector space of all real-valued functions is the direct sum of the subspaces of even
and odd functions. In other words, every function f(x) can be written uniquely as
the sum of an even function and an odd function:

f(x) = f_e(x) + f_o(x),

where

f_e(x) = (f(x) + f(−x))/2

is even and

f_o(x) = (f(x) − f(−x))/2

is odd. For example, if f is exp, then f_e is cosh and f_o is sinh.
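The decomposition into even and odd parts is easy to exercise in code. In this sketch (an illustration of my own; even_part/odd_part are hypothetical helper names), the half-sum and half-difference recover cosh and sinh from exp:

```python
import math

def even_part(f):
    """Even part f_e(x) = (f(x) + f(-x)) / 2."""
    return lambda x: (f(x) + f(-x)) / 2

def odd_part(f):
    """Odd part f_o(x) = (f(x) - f(-x)) / 2."""
    return lambda x: (f(x) - f(-x)) / 2

fe, fo = even_part(math.exp), odd_part(math.exp)
for x in (0.0, 0.5, 1.7):
    print(fe(x) - math.cosh(x), fo(x) - math.sinh(x))  # both ~0
    assert abs(fe(x) + fo(x) - math.exp(x)) < 1e-12    # f = f_e + f_o
```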

The even functions form a commutative algebra over the reals. However, the odd
functions do not form an algebra over the reals.

Harmonics

In signal processing, harmonic distortion occurs when a sine wave signal is sent through
a memoryless nonlinear system, that is, a system whose output at time t only depends on
the input at time t and does not depend on the input at any previous times. Such a system
is described by a response function Vout(t) = f(Vin(t)). The type of harmonics produced depends on the response function f:

When the response function is even, the resulting signal will consist of only even
harmonics of the input sine wave;
o The fundamental is also an odd harmonic, so will not be present.
o A simple example is a full-wave rectifier.
When it is odd, the resulting signal will consist of only odd harmonics of the input
sine wave;
o The output signal will be half-wave symmetric.
o A simple example is clipping in a symmetric push-pull amplifier.
When it is asymmetric, the resulting signal may contain either even or odd
harmonics;
o Simple examples are a half-wave rectifier, and clipping in an
asymmetrical class A amplifier.
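The even-response case can be demonstrated with a discrete Fourier transform. In the sketch below (my own illustration, using only the standard library), a full-wave rectifier — the even response |v| — is applied to a sine wave, and the odd harmonics of the output vanish:

```python
import cmath
import math

N = 1024
vin = [math.sin(2 * math.pi * k / N) for k in range(N)]  # one period of the fundamental

# Even response (full-wave rectifier): vout = |vin|.
vout = [abs(v) for v in vin]

def harmonic(signal, n):
    """Magnitude of the n-th DFT harmonic over one period of `signal`."""
    N = len(signal)
    return abs(sum(s * cmath.exp(-2j * math.pi * n * k / N)
                   for k, s in enumerate(signal)) / N)

print(harmonic(vout, 1))  # odd harmonic: essentially zero (fundamental absent)
print(harmonic(vout, 2))  # even harmonic: clearly present
print(harmonic(vout, 3))  # odd harmonic: essentially zero
```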
Chapter 14

Harmonic Function

A harmonic function defined on an annulus.

In mathematics, mathematical physics and the theory of stochastic processes, a harmonic function is a twice continuously differentiable function f : U → R (where U is an open subset of R^n) which satisfies Laplace's equation, i.e.

∂²f/∂x_1² + ∂²f/∂x_2² + ... + ∂²f/∂x_n² = 0

everywhere on U. This is usually written as

∇²f = 0 or Δf = 0.

Examples
Examples of harmonic functions of two variables are:

The real and imaginary part of any holomorphic function


The function f(x, y) = ln(x² + y²),

defined on R² \ {0} (e.g. the electric potential due to a line charge, and the
gravity potential due to a long cylindrical mass)

The function

Examples of harmonic functions of n variables are:

The constant, linear and affine functions on all of R^n (for example, the electric
potential between the plates of a capacitor, and the gravity potential of a slab)
The function f(x) = |x|^(2−n) on R^n \ {0} for n > 2.
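A harmonic function can be spot-checked with a five-point finite-difference Laplacian. The sketch below (an illustration of my own; the step size h and the sample point are arbitrary choices) tests the line-charge potential ln(x² + y²) away from the origin:

```python
import math

def u(x, y):
    """Line-charge potential ln(x^2 + y^2), harmonic away from the origin."""
    return math.log(x * x + y * y)

def laplacian(f, x, y, h=1e-4):
    """Five-point finite-difference approximation of f_xx + f_yy."""
    return (f(x + h, y) + f(x - h, y) + f(x, y + h) + f(x, y - h)
            - 4.0 * f(x, y)) / (h * h)

print(laplacian(u, 1.3, -0.7))                           # ~0: u is harmonic here
print(laplacian(lambda x, y: x * x + y * y, 1.3, -0.7))  # ~4: x^2 + y^2 is not harmonic
```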

Examples of harmonic functions of three variables are given in the table below with r2 =
x2 + y2 + z2. Harmonic functions are determined by their singularities. The singular points
of the harmonic functions below are expressed as "charges" and "charge densities" using
the terminology of electrostatics, and so the corresponding harmonic function will be
proportional to the electrostatic potential due to these charge distributions. Each function
below will yield another harmonic function when multiplied by a constant, rotated, and/or
has a constant added. The inversion of each function will yield another harmonic function
which has singularities which are the images of the original singularities in a spherical
"mirror". Also, the sum of any two harmonic functions will yield another harmonic
function.

Function            Singularity

1/r                 Unit point charge at origin

x/r³                x-directed dipole at origin

−ln(r² − z²)        Line of unit charge density on entire z-axis

−ln(r + z)          Line of unit charge density on negative z-axis

x/(r² − z²)         Line of x-directed dipoles on entire z-axis

x/(r(r + z))        Line of x-directed dipoles on negative z-axis


Remarks
The set of harmonic functions on a given open set U can be seen as the kernel of the
Laplace operator and is therefore a vector space over R: sums, differences and scalar
multiples of harmonic functions are again harmonic.

If f is a harmonic function on U, then all partial derivatives of f are also harmonic functions on U. The Laplace operator Δ and the partial derivative operator will commute on this class of functions.

In several ways, the harmonic functions are real analogues to holomorphic functions. All
harmonic functions are analytic, i.e. they can be locally expressed as power series. This is
a general fact about elliptic operators, of which the Laplacian is a major example.

The uniform limit of a convergent sequence of harmonic functions is still harmonic. This
is true because any continuous function satisfying the mean value property is harmonic.
Consider the sequence on (−∞, 0) × R defined by f_n(x, y) = (1/n) exp(nx) cos(ny). This
sequence is harmonic and converges uniformly to the zero function; however note that
the partial derivatives are not uniformly convergent to the zero function (the derivative of
the zero function). This example shows the importance of relying on the mean value
property and continuity to argue that the limit is harmonic.

Connections with complex function theory


The real and imaginary part of any holomorphic function yield harmonic functions on R2
(these are said to be a pair of harmonic conjugate functions). Conversely, any harmonic
function u on an open set Ω ⊂ R² is locally the real part of a holomorphic function. This is immediately seen observing that, writing z = x + iy, the complex function g(z) := u_x − i u_y is holomorphic in Ω, because it satisfies the Cauchy–Riemann equations. Therefore, g locally has a primitive f, and u is the real part of f up to a constant, as u_x is the real part of f′ = g.

Although the above correspondence with holomorphic functions only holds for functions
of two real variables, still harmonic functions in n variables enjoy a number of properties
typical of holomorphic functions. They are (real) analytic; they have a maximum principle and a mean-value principle; a theorem on removal of singularities as well as a Liouville-type theorem hold for them, in analogy with the corresponding theorems in complex function theory.

Properties of harmonic functions


Some important properties of harmonic functions can be deduced from Laplace's
equation.
Regularity theorem for harmonic functions

Harmonic functions are infinitely differentiable. In fact, harmonic functions are real
analytic.

Maximum principle

Harmonic functions satisfy the following maximum principle: if K is any compact subset
of U, then f, restricted to K, attains its maximum and minimum on the boundary of K. If
U is connected, this means that f cannot have local maxima or minima, other than the
exceptional case where f is constant. Similar properties can be shown for subharmonic
functions.

Mean value property

If B(x, r) is a ball with center x and radius r which is completely contained in the open set Ω ⊂ R^n, then the value u(x) of a harmonic function u : Ω → R at the center of the ball is given by the average value of u on the surface of the ball; this average value is also equal to the average value of u in the interior of the ball. In other words,

u(x) = (1/(n ω_n r^(n−1))) ∫_{∂B(x,r)} u dσ = (1/(ω_n r^n)) ∫_{B(x,r)} u dV,

where ω_n is the volume of the unit ball in n dimensions and σ is the (n−1)-dimensional surface measure. The mean value theorem follows by verifying that the spherical mean of u is constant:

(d/dr) [ (1/(n ω_n r^(n−1))) ∫_{∂B(x,r)} u dσ ] = 0,

which in turn follows by making a change of variable and then applying Green's theorem.
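The mean value property is easy to verify numerically for a simple harmonic function such as u(x, y) = x² − y². This sketch (my own illustration) averages u over a circle and compares the result with the value at the center:

```python
import math

def u(x, y):
    return x * x - y * y   # harmonic: u_xx + u_yy = 2 - 2 = 0

def circle_mean(f, cx, cy, r, n=10000):
    """Average of f over the circle of radius r centred at (cx, cy)."""
    total = 0.0
    for k in range(n):
        th = 2 * math.pi * (k + 0.5) / n
        total += f(cx + r * math.cos(th), cy + r * math.sin(th))
    return total / n

print(circle_mean(u, 0.4, -1.2, 0.8))  # ~ u(0.4, -1.2) = -1.28
print(u(0.4, -1.2))
```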

As a consequence of the mean value theorem, u is preserved by the convolution of a harmonic function u with any radial function η with total integral one. More precisely, if η is an integrable radial function supported in B(0, ε) and ∫η = 1, then

u(x) = (u ∗ η)(x)

provided that B(x, ε) ⊂ Ω. In particular, by taking η to be a C^∞ function, the convolution η ∗ u is also smooth, and therefore harmonic functions are smooth throughout their
domains (in fact, real analytic, by the Poisson integral representation). Similar arguments
also show that harmonic distributions are, in fact, (smooth) harmonic functions (Weyl's
lemma).
The converse to the mean value theorem also holds: all locally integrable functions
satisfying the (volume) mean-value property are infinitely differentiable and harmonic
functions as well. This follows for C2 functions again by the method of spherical means.
For locally integrable functions, it follows since the mean value property implies that u is
unchanged when convolved with any radial mollifier of total integral one, but
convolutions with mollifiers are smooth and so the C2 result can still be applied.

Harnack's inequality

Let u be a non-negative harmonic function in a bounded domain Ω. Then for every connected set V ⊂⊂ Ω, Harnack's inequality

sup_V u ≤ C inf_V u

holds for some constant C that depends only on V and Ω.

Removal of singularities

The following principle of removal of singularities holds for harmonic functions. If f is a harmonic function defined on a dotted open subset Ω \ {x_0} of R^n, which is less singular at x_0 than the fundamental solution, that is

f(x) = o(|x − x_0|^(2−n)) as x → x_0 (for n > 2; for n = 2, f(x) = o(ln|x − x_0|)),

then f extends to a harmonic function on Ω (compare Riemann's theorem for functions of a complex variable).

Liouville's theorem

If f is a harmonic function defined on all of R^n which is bounded above or bounded below, then f is constant (compare Liouville's theorem for functions of a complex variable).

Generalizations
Weakly harmonic function

A function (or, more generally, a distribution) is weakly harmonic if it satisfies Laplace's equation

Δf = 0

in a weak sense (or, equivalently, in the sense of distributions). A weakly harmonic
function coincides almost everywhere with a strongly harmonic function, and is in
particular smooth. A weakly harmonic distribution is precisely the distribution associated
to a strongly harmonic function, and so also is smooth. This is Weyl's lemma.

There are other weak formulations of Laplace's equation that are often useful. One of these is Dirichlet's principle, representing harmonic functions in the Sobolev space H¹(Ω) as the minimizers of the Dirichlet energy integral

J(u) = ∫_Ω |∇u|² dx

with respect to local variations, that is, all functions u ∈ H¹(Ω) such that J(u) ≤ J(u + v) holds for all v ∈ C_c^∞(Ω), or equivalently, for all v ∈ H¹_0(Ω).

Harmonic functions on manifolds

Harmonic functions can be defined on an arbitrary Riemannian manifold, using the Laplace–Beltrami operator Δ. In this context, a function f is called harmonic if

Δf = 0.
Many of the properties of harmonic functions on domains in Euclidean space carry over
to this more general setting, including the mean value theorem (over geodesic balls), the
maximum principle, and the Harnack inequality. With the exception of the mean value
theorem, these are easy consequences of the corresponding results for general linear
elliptic partial differential equations of the second order.

Subharmonic functions

A C² function that satisfies Δf ≥ 0 is called subharmonic. This condition guarantees that the maximum principle will hold, although other properties of harmonic functions
may fail. More generally, a function is subharmonic if and only if, in the interior of any
ball in its domain, its graph lies below that of the harmonic function interpolating its
boundary values on the ball.

Harmonic forms

One generalization of the study of harmonic functions is the study of harmonic forms on
Riemannian manifolds, and it is related to the study of cohomology. Also, it is possible to
define harmonic vector-valued functions, or harmonic maps of two Riemannian
manifolds, which are critical points of a generalized Dirichlet energy functional (this
includes harmonic functions as a special case, a result known as Dirichlet principle).
These kinds of harmonic maps appear in the theory of minimal surfaces. For example, a
curve, that is, a map from an interval in R to a Riemannian manifold, is a harmonic map
if and only if it is a geodesic.

Harmonic maps between manifolds

If M and N are two Riemannian manifolds, then a harmonic map u : M → N is defined to be a stationary point of the Dirichlet energy

D(u) = (1/2) ∫_M |du|² dVol,

in which du : TM → TN is the differential of u, and the norm is that induced by the metric on M and that on N on the tensor product bundle T*M ⊗ u^(−1)TN.

Important special cases of harmonic maps between manifolds include minimal surfaces,
which are precisely the harmonic immersions of a surface into three-dimensional
Euclidean space. More generally, minimal submanifolds are harmonic immersions of one
manifold in another. Harmonic coordinates are a harmonic diffeomorphism from a
manifold to an open subset of a Euclidean space of the same dimension.
Chapter 15

Holomorphic Function

A rectangular grid (top) and its image under a holomorphic function f (bottom).

In mathematics, holomorphic functions are the central objects of study in complex analysis. A holomorphic function is a complex-valued function of one or more complex variables that is complex-differentiable in a neighborhood of every point in its domain. The existence of a complex derivative is a very strong condition, for it implies that any holomorphic function is actually infinitely differentiable and equal to its own Taylor series.

The term analytic function is often used interchangeably with holomorphic function,
although the word analytic is also used in a broader sense to describe any function
(real, complex, or of more general type) that is equal to its Taylor series in a
neighborhood of each point in its domain. The fact that the class of complex analytic
functions coincides with the class of holomorphic functions is a major theorem in
complex analysis.

Holomorphic functions are also sometimes referred to as regular functions or as conformal maps. A holomorphic function whose domain is the whole complex plane is called an entire function. The phrase "holomorphic at a point z0" means not just differentiable at z0, but differentiable everywhere within some neighbourhood of z0 in the complex plane.

Definition
Given a complex-valued function f of a single complex variable, the derivative of f at a point z0 in its domain is defined by the limit

f′(z0) = lim_{z → z0} (f(z) − f(z0)) / (z − z0).
This is the same as the definition of the derivative for real functions, except that all of the
quantities are complex. In particular, the limit is taken as the complex number z
approaches z0, and must have the same value for any sequence of complex values for z
that approach z0 on the complex plane. If the limit exists, we say that f is differentiable
at the point z0. This concept of complex differentiability shares several properties with
real differentiability: it is linear and obeys the product rule, quotient rule, and chain rule.

If f is complex differentiable at every point z0 in U, we say that f is holomorphic on U. We say that f is holomorphic at the point z0 if it is holomorphic on some neighborhood of z0. We say that f is holomorphic on some non-open set A if it is holomorphic in an open set containing A.

The relationship between real differentiability and complex differentiability is the following. If a complex function f(x + iy) = u(x, y) + iv(x, y) is holomorphic, then u and v have first partial derivatives with respect to x and y, and satisfy the Cauchy–Riemann equations:

∂u/∂x = ∂v/∂y and ∂u/∂y = −∂v/∂x.

If continuity is not a given, the converse is not necessarily true. A simple converse is that if u and v have continuous first partial derivatives and satisfy the Cauchy–Riemann equations, then f is holomorphic. A more satisfying converse, which is much harder to prove, is the Looman–Menchoff theorem: if f is continuous, u and v have first partial derivatives, and they satisfy the Cauchy–Riemann equations, then f is holomorphic.
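The Cauchy–Riemann equations can be checked numerically for a concrete holomorphic function. In the sketch below (my own illustration; for f(z) = z², u = x² − y² and v = 2xy), central differences approximate the partial derivatives:

```python
def u(x, y): return x * x - y * y   # real part of z^2
def v(x, y): return 2 * x * y       # imaginary part of z^2

def d_dx(f, x, y, h=1e-6):
    """Central-difference approximation of the partial derivative in x."""
    return (f(x + h, y) - f(x - h, y)) / (2 * h)

def d_dy(f, x, y, h=1e-6):
    """Central-difference approximation of the partial derivative in y."""
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

x0, y0 = 0.9, -0.4
print(d_dx(u, x0, y0), d_dy(v, x0, y0))    # u_x = v_y  (both ~2x = 1.8)
print(d_dy(u, x0, y0), -d_dx(v, x0, y0))   # u_y = -v_x (both ~-2y = 0.8)
```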

Terminology
The word "holomorphic" was introduced by two of Cauchy's students, Briot (1817–1882) and Bouquet (1819–1895), and derives from the Greek ὅλος (holos) meaning "entire", and μορφή (morphē) meaning "form" or "appearance".

Today, the term "holomorphic function" is sometimes preferred to "analytic function", as the latter is a more general concept. This is also because an important result in complex analysis is that every holomorphic function is complex analytic, a fact that does not follow directly from the definitions. The term "analytic" is however also in wide use.

Properties
Because complex differentiation is linear and obeys the product, quotient, and chain
rules, the sums, products and compositions of holomorphic functions are holomorphic,
and the quotient of two holomorphic functions is holomorphic wherever the denominator
is not zero.

The derivative f′(a) can be written as a contour integral using Cauchy's differentiation formula:

f′(a) = (1/(2πi)) ∮_γ f(z)/(z − a)² dz

for any simple loop γ positively winding once around a, and

f′(a) = lim_{γ → a} (i/(2A(γ))) ∮_γ f(z) dz̄

for infinitesimal positive loops γ around a, where A(γ) denotes the enclosed area.

If one identifies C with R2, then the holomorphic functions coincide with those functions
of two real variables with continuous first derivatives which solve the Cauchy-Riemann
equations, a set of two partial differential equations.

Every holomorphic function can be separated into its real and imaginary parts, and each
of these is a solution of Laplace's equation on R2. In other words, if we express a
holomorphic function f(z) as u(x, y) + iv(x, y) both u and v are harmonic functions.
In regions where the first derivative is not zero, holomorphic functions are conformal in
the sense that they preserve angles and the shape (but not size) of small figures.

Cauchy's integral formula states that every function holomorphic inside a disk is
completely determined by its values on the disk's boundary.

Every holomorphic function is analytic. That is, a holomorphic function f has derivatives
of every order at each point a in its domain, and it coincides with its own Taylor series at
a in a neighborhood of a. In fact, f coincides with its Taylor series at a in any disk
centered at that point and lying within the domain of the function.

From an algebraic point of view, the set of holomorphic functions on an open set is a
commutative ring and a complex vector space. In fact, it is a locally convex topological
vector space, with the seminorms being the suprema on compact subsets.

From a geometrical perspective, a function f is holomorphic at z0 if and only if its exterior derivative df in a neighborhood U of z0 is equal to f′(z) dz for some continuous function f′. It follows from

0 = d(df) = d(f′ dz) = df′ ∧ dz

that df′ is also proportional to dz, implying that the derivative f′ is itself holomorphic and thus that f is infinitely differentiable. Similarly, the fact that d(f dz) = f′ dz ∧ dz = 0 implies that any function f that is holomorphic on the simply connected region U is also integrable on U. (For a path γ from z0 to z lying entirely in U, define

F(z) = F_0 + ∫_γ f dz;

in light of the Jordan curve theorem and the generalized Stokes' theorem, F(z) is independent of the particular choice of path γ, and thus F(z) is a well-defined function on U having F(z0) = F_0 and dF = f dz.)

Examples
All polynomial functions in z with complex coefficients are holomorphic on C, and so are
sine, cosine and the exponential function. (The trigonometric functions are in fact closely
related to and can be defined via the exponential function using Euler's formula). The
principal branch of the complex logarithm function is holomorphic on the set C \ {z ∈ R : z ≤ 0}. The square root function can be defined as

√z = exp(½ log z)

and is therefore holomorphic wherever the logarithm log(z) is. The function 1/z is holomorphic on {z : z ≠ 0}.
As a consequence of the Cauchy–Riemann equations, a real-valued holomorphic function
must be constant. Therefore, the absolute value of z, the argument of z, the real part of z
and the imaginary part of z are not holomorphic. Another typical example of a continuous
function which is not holomorphic is complex conjugation.
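That conjugation is not complex-differentiable can be seen directly from the difference quotient: its value depends on the direction of approach. A minimal sketch (my own illustration):

```python
def f(z):
    return z.conjugate()   # complex conjugation

z0 = 1.0 + 2.0j
h = 1e-8
along_real = (f(z0 + h) - f(z0)) / h               # quotient along the real axis
along_imag = (f(z0 + 1j * h) - f(z0)) / (1j * h)   # quotient along the imaginary axis
print(along_real)  # ~ 1
print(along_imag)  # ~ -1: the limit depends on direction, so f'(z0) does not exist
```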

Several variables
A complex analytic function of several complex variables is defined to be analytic and
holomorphic at a point if it is locally expandable (within a polydisk, a Cartesian product
of disks, centered at that point) as a convergent power series in the variables. This
condition is stronger than the Cauchy–Riemann equations; in fact it can be stated as follows:

A function of several complex variables is holomorphic if and only if it satisfies the Cauchy–Riemann equations and is locally square-integrable.

Extension to functional analysis


The concept of a holomorphic function can be extended to the infinite-dimensional
spaces of functional analysis. For instance, the Fréchet or Gâteaux derivative can be used
to define a notion of a holomorphic function on a Banach space over the field of complex
numbers.
Chapter 16

Homogeneous Function

In mathematics, an homogeneous function is a function with multiplicative scaling behaviour: if the argument is multiplied by a factor, then the result is multiplied by some power of this factor. More precisely, if f : V → W is a function between two vector spaces over a field F, then f is said to be homogeneous of degree k ∈ F if

f(αv) = α^k f(v)    (1)

for all nonzero α ∈ F and v ∈ V. When the vector spaces involved are over the real numbers, a slightly more general form of homogeneity is often used, requiring only that (1) hold for all α > 0.

Homogeneous functions can also be defined for vector spaces with the origin deleted, a
fact that is used in the definition of sheaves on projective space in algebraic geometry.
More generally, if S ⊆ V is any subset that is invariant under scalar multiplication by elements of the field (a "cone"), then an homogeneous function from S to W can still be defined by (1).
Examples

A homogeneous function is not necessarily continuous, as shown by this example. This is the function f defined by f(x, y) = x if xy > 0 and f(x, y) = 0 if xy ≤ 0. This function is homogeneous of degree 1, i.e. f(αx, αy) = αf(x, y) for any real numbers α, x, y. It is discontinuous at y = 0, x ≠ 0.

Linear functions

Any linear function f : V → W is homogeneous of degree 1, since by the definition of linearity

f(αv) = αf(v)

for all α ∈ F and v ∈ V. Similarly, any multilinear function f : V_1 × V_2 × ... × V_n → W is homogeneous of degree n, since by the definition of multilinearity

f(αv_1, αv_2, ..., αv_n) = α^n f(v_1, v_2, ..., v_n)

for all α ∈ F and v_1 ∈ V_1, v_2 ∈ V_2, ..., v_n ∈ V_n. It follows that the n-th differential of a function f : X → Y between two Banach spaces X and Y is homogeneous of degree n.

Homogeneous polynomials

Monomials in n variables define homogeneous functions f : F^n → F. For example,

f(x, y, z) = x^5 y^2 z^3

is homogeneous of degree 10 since

f(αx, αy, αz) = (αx)^5 (αy)^2 (αz)^3 = α^10 f(x, y, z).

The degree is the sum of the exponents on the variables; in this example, 10 = 5 + 2 + 3.
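This scaling behaviour is straightforward to confirm numerically. The sketch below (an illustration of my own, with arbitrarily chosen sample values) checks the degree-10 homogeneity of the monomial x^5 y^2 z^3:

```python
def f(x, y, z):
    return x**5 * y**2 * z**3   # monomial of degree 5 + 2 + 3 = 10

alpha = 1.7
x, y, z = 0.3, -1.2, 2.5        # arbitrary sample point
lhs = f(alpha * x, alpha * y, alpha * z)
rhs = alpha**10 * f(x, y, z)
print(lhs, rhs)   # equal (up to rounding): f is homogeneous of degree 10
```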

A homogeneous polynomial is a polynomial made up of a sum of monomials of the same degree. For example,

x^5 + 2x^3 y^2 + 9x y^4

is a homogeneous polynomial of degree 5. Homogeneous polynomials also define homogeneous functions.

Polarization

A multilinear function g : V × V × ... × V → F from the n-th Cartesian product of V with itself to the ground field F gives rise to an homogeneous function f : V → F by evaluating on the diagonal:

f(v) = g(v, v, ..., v).

The resulting function f is a polynomial on the vector space V.

Conversely, if F has characteristic zero, then given an homogeneous polynomial f of degree n on V, the polarization of f is a multilinear function g : V × V × ... × V → F on the n-th Cartesian product of V. The polarization is defined by

g(v_1, v_2, ..., v_n) = (1/n!) ∂/∂t_1 ∂/∂t_2 ... ∂/∂t_n f(t_1 v_1 + t_2 v_2 + ... + t_n v_n).
These two constructions, one of an homogeneous polynomial from a multilinear form and
the other of a multilinear form from an homogeneous polynomial, are mutually inverse to
one another. In finite dimensions, they establish an isomorphism of graded vector spaces
from the symmetric algebra of V to the algebra of homogeneous polynomials on V.

Rational functions

Rational functions formed as the ratio of two homogeneous polynomials are homogeneous functions off of the affine cone cut out by the zero locus of the denominator. Thus, if f is homogeneous of degree m and g is homogeneous of degree n, then f/g is homogeneous of degree m − n away from the zeros of g.
Non-Examples
Logarithms

The natural logarithm f(x) = ln x scales additively and so is not homogeneous.

This can be proved by noting that f(5x) = ln 5x = ln 5 + f(x), f(10x) = ln 10 + f(x), and f(15x) = ln 15 + f(x). Therefore there is no k such that f(αx) = α^k f(x) for all α.

Affine functions

The function f(x) = x + 5 does not scale multiplicatively.

Positive homogeneity
In the special case of vector spaces over the real numbers, the notion of positive homogeneity often plays a more important role than homogeneity in the above sense. A function f : V \ {0} → R is positive homogeneous of degree k if

f(αv) = α^k f(v)

for all α > 0. Here k can be any complex number. A (nonzero) continuous function homogeneous of degree k on R^n \ {0} extends continuously to R^n if and only if Re{k} > 0.

Positive homogeneous functions are characterized by Euler's homogeneous function theorem. Suppose that the function f : R^n \ {0} → R is continuously differentiable. Then f is positive homogeneous of degree k if and only if

x · ∇f(x) = k f(x).

This result follows at once by differentiating both sides of the equation f(αy) = α^k f(y) with respect to α and applying the chain rule. The converse holds by integrating.

As a consequence, suppose that f : R^n → R is differentiable and homogeneous of degree k. Then its first-order partial derivatives ∂f/∂x_i are homogeneous of degree k − 1. The result follows from Euler's theorem by commuting the operator x · ∇ with the partial derivative.

Homogeneous distributions
A compactly supported continuous function f on R^n is homogeneous of degree k if and only if

∫ f(tx) φ(x) dx = t^k ∫ f(x) φ(x) dx

for all compactly supported test functions φ and nonzero real t. Equivalently, making a change of variable y = tx, f is homogeneous of degree k if and only if

t^(−n) ∫ f(y) φ(y/t) dy = t^k ∫ f(y) φ(y) dy

for all t and all test functions φ. The last display makes it possible to define homogeneity of distributions. A distribution S is homogeneous of degree k if

t^(−n) ⟨S, φ ∘ μ_(1/t)⟩ = t^k ⟨S, φ⟩

for all nonzero real t and all test functions φ. Here the angle brackets denote the pairing between distributions and test functions, and μ_t : R^n → R^n is the mapping of scalar multiplication by the real number t.

Application to differential equations


The substitution v = y/x converts the ordinary differential equation

I(x, y) dy/dx + J(x, y) = 0,

where I and J are homogeneous functions of the same degree, into the separable differential equation

x dv/dx = −J(1, v)/I(1, v) − v.
Chapter 17

Indicator Function

The graph of the indicator function of a two-dimensional subset of a square.

In mathematics, an indicator function or a characteristic function is a function defined on a set X that indicates membership of an element in a subset A of X, having the value 1 for all elements of A and the value 0 for all elements of X not in A.

Definition
The indicator function of a subset A of a set X is a function

1_A : X → {0, 1}

defined as

1_A(x) = 1 if x ∈ A, and 1_A(x) = 0 if x ∉ A.

The Iverson bracket allows the equivalent notation [x ∈ A] to be used instead of 1_A(x).

The indicator function of A is sometimes denoted χ_A(x) or I_A(x) or even A(x).

(The Greek letter χ appears because it is the initial letter of the Greek etymon of the word characteristic.)

Remark on notation and terminology


The notation 1_A may signify the identity function.
The notation χ_A may signify the characteristic function in convex analysis.

A related concept in statistics is that of a dummy variable (this must not be confused with
"dummy variables" as that term is usually used in mathematics, also called a bound
variable).

The term "characteristic function" has an unrelated meaning in probability theory. For
this reason, probabilists use the term indicator function for the function defined here
almost exclusively, while mathematicians in other fields are more likely to use the term
characteristic function to describe the function which indicates membership in a set.

Basic properties
The indicator or characteristic function of a subset A of some set X, maps elements of X
to the range {0,1}.

This mapping is surjective only when A is a non-empty proper subset of X. If A = X, then 1_A ≡ 1. By a similar argument, if A = ∅ then 1_A ≡ 0.

In the following, the dot represents multiplication, 1·1 = 1, 1·0 = 0, etc. "+" and "−" represent addition and subtraction. "∩" and "∪" denote intersection and union, respectively.

If A and B are two subsets of X, then

1_{A∩B} = min{1_A, 1_B} = 1_A · 1_B and
1_{A∪B} = max{1_A, 1_B} = 1_A + 1_B − 1_A · 1_B,

and the "complement" of the indicator function of A, i.e. the indicator of A^C, is:

1_{A^C} = 1 − 1_A.

More generally, suppose A_1, ..., A_n is a collection of subsets of X. For any x ∈ X,

∏_k (1 − 1_{A_k}(x))

is clearly a product of 0s and 1s. This product has the value 1 at precisely those x ∈ X which belong to none of the sets A_k and is 0 otherwise. That is

∏_k (1 − 1_{A_k}) = 1_{X − ∪_k A_k} = 1 − 1_{∪_k A_k}.

Expanding the product on the left hand side,

1_{∪_k A_k} = 1 − ∑_{F ⊆ {1,...,n}} (−1)^{|F|} 1_{∩_{k∈F} A_k} = ∑_{∅ ≠ F ⊆ {1,...,n}} (−1)^{|F|+1} 1_{∩_{k∈F} A_k},

where |F| is the cardinality of F. This is one form of the principle of inclusion-exclusion.
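The pointwise identities for intersection, union, and complement are easy to confirm on a small finite example. A minimal sketch (my own illustration; `ind` is a hypothetical helper name):

```python
X = set(range(10))
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}

def ind(S):
    """Indicator function of the subset S (of X)."""
    return lambda x: 1 if x in S else 0

iA, iB = ind(A), ind(B)
for x in X:
    assert ind(A & B)(x) == iA(x) * iB(x)                  # intersection
    assert ind(A | B)(x) == iA(x) + iB(x) - iA(x) * iB(x)  # union
    assert ind(X - A)(x) == 1 - iA(x)                      # complement
print("all identities hold")
```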

As suggested by the previous example, the indicator function is a useful notational device
in combinatorics. The notation is used in other places as well, for instance in probability theory: if X is a probability space with probability measure P and A is a measurable set, then 1_A becomes a random variable whose expected value is equal to the probability of A:

E(1_A) = ∫_X 1_A(x) dP = P(A).

This identity is used in a simple proof of Markov's inequality.

In many cases, such as order theory, the inverse of the indicator function may be defined. This is commonly called the generalized Möbius function, as a generalization of the inverse of the indicator function in elementary number theory, the Möbius function.

Mean, variance and covariance


Given a probability space (Ω, F, P) with A ∈ F, the indicator random variable 1_A : Ω → R is defined by 1_A(ω) = 1 if ω ∈ A, and 1_A(ω) = 0 otherwise.

Mean: E(1_A) = P(A).
Variance: Var(1_A) = P(A)(1 − P(A)).
Covariance: Cov(1_A, 1_B) = P(A ∩ B) − P(A)P(B).
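These formulas can be verified exactly on a finite probability space using rational arithmetic. A minimal sketch (my own illustration, modelling a fair die):

```python
from fractions import Fraction

# A fair die as a finite probability space.
omega = range(1, 7)
P = {w: Fraction(1, 6) for w in omega}
A = {2, 4, 6}   # "even roll"
B = {4, 5, 6}   # "roll at least 4"

def ind(S, w):
    return 1 if w in S else 0

def E(f):
    """Expected value of f on the finite space (omega, P)."""
    return sum(P[w] * f(w) for w in omega)

pA, pB = E(lambda w: ind(A, w)), E(lambda w: ind(B, w))
var = E(lambda w: ind(A, w) ** 2) - pA ** 2
cov = E(lambda w: ind(A, w) * ind(B, w)) - pA * pB

print(pA)                  # P(A) = 1/2
print(var, pA * (1 - pA))  # both 1/4
print(cov)                 # P(A n B) - P(A)P(B) = 1/3 - 1/4 = 1/12
```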

Characteristic function in recursion theory, Gödel's and Kleene's representing function
Kurt Gödel described the representing function in his 1934 paper "On Undecidable Propositions of Formal Mathematical Systems" (the paper appears on pp. 41-74 in Martin Davis ed. The Undecidable):

"There shall correspond to each class or relation R a representing function φ(x_1, . . ., x_n) = 0 if R(x_1, . . ., x_n) and φ(x_1, . . ., x_n) = 1 if ~R(x_1, . . ., x_n)." (p. 42; the "~" indicates logical inversion, i.e. "NOT")

Stephen Kleene (1952) (p. 227) offers up the same definition in the context of the primitive recursive functions as a function φ of a predicate P: φ takes on the value 0 if the predicate is true and the value 1 if the predicate is false.

For example, because the product of characteristic functions φ1*φ2* . . . *φn = 0
whenever any one of the functions equals 0, it plays the role of logical OR: IF φ1 = 0
OR φ2 = 0 OR . . . OR φn = 0 THEN their product is 0. What appears to the modern
reader as the representing function's logical inversion, i.e. the representing function is 0
when the function R is "true" or "satisfied", plays a useful role in Kleene's definition of
the logical functions OR, AND, and IMPLY (p. 228), the bounded (p. 228) and
unbounded (p. 279ff) mu operators (Kleene (1952)) and the CASE function (p. 229).
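The convention is easy to mirror in code. A minimal Python sketch (example predicates chosen arbitrarily) builds representing functions and shows how their product behaves as logical OR.

```python
# A sketch of Goedel's/Kleene's convention: the representing function is
# 0 when the predicate holds ("true") and 1 when it does not ("false").
def representing(pred):
    """Turn a Boolean predicate into its representing function (0 = true, 1 = false)."""
    return lambda *args: 0 if pred(*args) else 1

is_even = representing(lambda n: n % 2 == 0)   # hypothetical example predicates
is_small = representing(lambda n: n < 10)

def or_via_product(*fns):
    """The product of representing functions: 0 (true) as soon as any factor is 0."""
    def h(*args):
        prod = 1
        for f in fns:
            prod *= f(*args)
        return prod
    return h

even_or_small = or_via_product(is_even, is_small)
assert even_or_small(4) == 0    # even, so "true"
assert even_or_small(7) == 0    # odd but small, so "true"
assert even_or_small(11) == 1   # neither, so "false"
```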

Characteristic function in fuzzy set theory


In classical mathematics, characteristic functions of sets only take values 1 (members) or
0 (non-members). In fuzzy set theory, characteristic functions are generalized to take
values in the real unit interval [0, 1], or more generally, in some algebra or structure
(usually required to be at least a poset or lattice). Such generalized characteristic
functions are more usually called membership functions, and the corresponding "sets" are
called fuzzy sets. Fuzzy sets model the gradual change in the membership degree seen in
many real-world predicates like "tall", "warm", etc.
Chapter 18

Injective Function

An injective function (not a bijection)

Another injective function (is a bijection)


A non-injective function (this one happens to be a surjection)

In mathematics, an injective function is a function that preserves distinctness: it never


maps distinct elements of its domain to the same element of its codomain. In other words,
every element of the function's codomain is mapped to by at most one element of its
domain. If in addition all of the elements in the codomain are in fact mapped to by some
element of the domain, then the function is said to be bijective.

An injective function is called an injection, and is also said to be a one-to-one function


(not to be confused with one-to-one correspondence, i.e. a bijective function).
Occasionally, an injective function from X to Y is denoted f : X ↣ Y, using an arrow with
a barbed tail. Alternately, it may be denoted using a notation derived from that used for
falling factorial powers, since if X and Y are finite sets with respectively x and y elements,
the number of injections X → Y is the falling factorial power y(y − 1)⋯(y − x + 1).
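The falling-factorial count can be verified by brute force for small sets, since each injection from an x-element set into a y-element set is an ordered choice of x distinct values from the codomain. A Python sketch:

```python
from itertools import permutations

def count_injections(x, y):
    """Brute-force count of injections from an x-element set into a y-element set:
    each injection corresponds to an ordered x-tuple of distinct codomain values."""
    return sum(1 for _ in permutations(range(y), x))

def falling_factorial(y, x):
    """y(y - 1)...(y - x + 1), the falling factorial power."""
    result = 1
    for k in range(x):
        result *= y - k
    return result

assert count_injections(3, 5) == falling_factorial(5, 3) == 60
assert count_injections(4, 4) == falling_factorial(4, 4) == 24
```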

A function f that is not injective is sometimes called many-to-one. (However, this


terminology is also sometimes used to mean "single-valued", i.e., each argument is
mapped to at most one value; this is the case for any function, but is used to stress the
opposition with multi-valued functions, which are not true functions.)

A monomorphism is a generalization of an injective function in category theory.

Definition
Let f be a function whose domain is a set A. The function f is injective if for all a and b in
A, if f(a) = f(b), then a = b; that is, f(a) = f(b) implies a = b. Equivalently, if a ≠ b, then
f(a) ≠ f(b).

Examples
For any set X and any subset S of X the inclusion map S → X (which sends any
element s of S to itself) is injective. In particular the identity function X → X is
always injective (and in fact bijective).
The function f : R → R defined by f(x) = 2x + 1 is injective.
The function g : R → R defined by g(x) = x^2 is not injective, because (for
example) g(−1) = 1 = g(1). However, if g is redefined so that its domain is the
non-negative real numbers [0, +∞), then g is injective.
The exponential function exp : R → R defined by exp(x) = e^x is injective (but not
surjective, as no value maps to a negative number).
The natural logarithm function ln : (0, ∞) → R defined by x ↦ ln x is injective.
The function g : R → R defined by g(x) = x^n − x is not injective, since, for
example, g(0) = g(1).
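The definition can be tested directly on a finite domain. The following Python sketch checks a few of the examples above, with domains restricted to small integer ranges purely for illustration.

```python
def is_injective(f, domain):
    """Check the definition on a finite domain: f(a) == f(b) must imply a == b."""
    seen = {}
    for a in domain:
        value = f(a)
        if value in seen and seen[value] != a:
            return False  # two distinct arguments share a value
        seen[value] = a
    return True

domain = range(-5, 6)
assert is_injective(lambda x: 2 * x + 1, domain)    # f(x) = 2x + 1 is injective
assert not is_injective(lambda x: x * x, domain)    # g(x) = x^2 is not: g(-1) == g(1)
assert is_injective(lambda x: x * x, range(0, 6))   # restricted to non-negatives, it is
```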

More generally, when X and Y are both the real line R, then an injective function
f : R → R is one whose graph is never intersected by any horizontal line more than once.
This principle is referred to as the horizontal line test.

Injections can be undone


Functions with left inverses are always injections. That is, given f : X → Y, if there is a
function g : Y → X such that, for every x ∈ X,

g(f(x)) = x (f can be undone by g)

then f is injective. In this case, f is called a section of g and g is called a retraction of f.

Conversely, every injection f with non-empty domain has a left inverse g (in conventional
mathematics). Note that g may not be a complete inverse of f because the composition in
the other order, f ∘ g, may not be the identity on Y. In other words, a function that can be
undone or "reversed", such as f, is not necessarily invertible (bijective). Injections are
"reversible" but not always invertible.
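On a finite domain the left inverse can be built explicitly from a lookup table. A minimal Python sketch (the function f and the default value are arbitrary illustrative choices):

```python
def left_inverse(f, domain, default):
    """Build g with g(f(x)) == x for all x in a finite domain (f assumed injective).
    Points of Y outside the image of f are sent to an arbitrary default element."""
    table = {f(x): x for x in domain}
    return lambda y: table.get(y, default)

f = lambda x: 2 * x + 1     # injective on the integers
X = range(10)
g = left_inverse(f, X, default=0)

assert all(g(f(x)) == x for x in X)   # g undoes f (g is a retraction of f)
assert f(g(4)) != 4                   # but f composed with g need not be the identity on Y
```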

Although it is impossible to reverse a non-injective (and therefore information-losing)


function, one can at least obtain a "quasi-inverse" of it, that is a multiple-valued function.

Injections may be made invertible


In fact, to turn an injective function f : X → Y into a bijective (hence invertible) function,
it suffices to replace its codomain Y by its actual range J = f(X). That is, let g : X → J
such that g(x) = f(x) for all x in X; then g is bijective. Indeed, f can be factored as
incl_{J,Y} ∘ g, where incl_{J,Y} is the inclusion function from J into Y.

Other properties
If f and g are both injective, then f ∘ g is injective (the composition of two
injective functions is injective).

If g ∘ f is injective, then f is injective (but g need not be).


f : X → Y is injective if and only if, given any functions g, h : W → X, whenever
f ∘ g = f ∘ h, then g = h. In other words, injective functions are precisely the
monomorphisms in the category Set of sets.
If f : X → Y is injective and A is a subset of X, then f⁻¹(f(A)) = A. Thus, A can be
recovered from its image f(A).
If f : X → Y is injective and A and B are both subsets of X, then f(A ∩ B) =
f(A) ∩ f(B).
Every function h : W → Y can be decomposed as h = f ∘ g for a suitable injection f
and surjection g. This decomposition is unique up to isomorphism, and f may be
thought of as the inclusion function of the range h(W) of h as a subset of the
codomain Y of h.
If f : X → Y is an injective function, then Y has at least as many elements as X, in
the sense of cardinal numbers. In particular, if, in addition, there is an injection
from Y to X, then X and Y have the same cardinal number. (This is known as the
Cantor-Bernstein-Schroeder theorem.)
If both X and Y are finite with the same number of elements, then f : X → Y is
injective if and only if f is surjective (in which case they are bijective).
An injective function which is a homomorphism between two algebraic structures
is an embedding.
Chapter 19

Measurable Function

In mathematics, particularly in measure theory, measurable functions are
structure-preserving functions between measurable spaces; as such, they form a natural
context for
the theory of integration. Specifically, a function between measurable spaces is said to be
measurable if the preimage of each measurable set is measurable, analogous to the
situation of continuous functions between topological spaces.

This definition can be deceptively simple, however, as special care must be taken
regarding the σ-algebras involved. In particular, when a function f : R → R is said to
be Lebesgue measurable, what is actually meant is that f : (R, L) → (R, B) is a
measurable function; that is, the domain and range represent different σ-algebras on the
same underlying set (here L is the sigma algebra of Lebesgue measurable sets, and B is
the Borel algebra on R). As a result, the composition of Lebesgue-measurable functions
need not be Lebesgue-measurable.

By convention a topological space is assumed to be equipped with the Borel algebra


generated by its open subsets unless otherwise specified. Most commonly this space will
be the real or complex numbers. For instance, a real-valued measurable function is a
function for which the preimage of each Borel set is measurable. A complex-valued
measurable function is defined analogously. In practice, some authors use measurable
functions to refer only to real-valued measurable functions with respect to the Borel
algebra. If the values of the function lie in an infinite-dimensional vector space instead of
R or C, usually other definitions of measurability are used, such as weak measurability
and Bochner measurability.

In probability theory, the sigma algebra often represents the set of available information,
and a function (in this context a random variable) is measurable if and only if it
represents an outcome that is knowable based on the available information. In contrast,
functions that are not Lebesgue measurable are generally considered pathological, at least
in the field of analysis.

Formal definition
Let (X, Σ) and (Y, Τ) be measurable spaces, meaning that X and Y are sets equipped with
respective sigma algebras Σ and Τ. A function f : X → Y
is said to be measurable if f⁻¹(E) ∈ Σ for every E ∈ Τ. The notion of measurability
depends on the sigma algebras Σ and Τ. To emphasize this dependency, if
f : X → Y is a measurable function, we will write

f : (X, Σ) → (Y, Τ).
Special measurable functions


If (X, Σ) and (Y, Τ) are Borel spaces, a measurable function
f : (X, Σ) → (Y, Τ) is also called a Borel function. Continuous functions are
Borel functions but not all Borel functions are continuous. However, a measurable
function is nearly a continuous function. If a Borel function happens to be a
section of some map π : Y → X, it is called a Borel section.

A Lebesgue measurable function is a measurable function
f : (R, L) → (C, B_C), where L is the sigma algebra of Lebesgue measurable
sets, and B_C is the Borel algebra on the complex numbers C. Lebesgue
measurable functions are of interest in mathematical analysis because they can be
integrated.

Random variables are by definition measurable functions defined on sample


spaces.

Properties of measurable functions


The sum and product of two complex-valued measurable functions are
measurable. So is the quotient, so long as there is no division by zero.

The composition of measurable functions is measurable; i.e., if
f : (X, Σ1) → (Y, Σ2) and g : (Y, Σ2) → (Z, Σ3) are measurable
functions, then so is g ∘ f : (X, Σ1) → (Z, Σ3). But see the caveat
regarding Lebesgue-measurable functions in the introduction.

The (pointwise) supremum, infimum, limit superior, and limit inferior of a


sequence (viz., countably many) of real-valued measurable functions are all
measurable as well.

The pointwise limit of a sequence of measurable functions is measurable; note


that the corresponding statement for continuous functions requires stronger
conditions than pointwise convergence, such as uniform convergence.
Non-measurable functions
Real-valued functions encountered in applications tend to be measurable; however, it is
not difficult to find non-measurable functions.

So long as there are non-measurable sets in a measure space, there are
non-measurable functions from that space. If (X, Σ) is some measurable space and
A ⊂ X is a non-measurable set, i.e. if A ∉ Σ, then the indicator function
1_A : (X, Σ) → R is non-measurable (where R is equipped with the Borel
algebra as usual), since the preimage of the measurable set {1} is the
non-measurable set A. Here 1_A is given by

1_A(x) = 1 if x ∈ A, and 1_A(x) = 0 otherwise.

Any non-constant function can be made non-measurable by equipping the domain
and range with appropriate σ-algebras. If f : X → R is an arbitrary
non-constant, real-valued function, then f is non-measurable if X is equipped with
the indiscrete algebra Σ = {∅, X}, since the preimage of any point in the range is
some proper, nonempty subset of X, and therefore does not lie in Σ.
