
Reasoning under Uncertainty

Uncertainty
The real world is uncertain and ambiguous: an agent can never be certain about the state of the world and its domain. This uncertainty has several sources:

True uncertainty: the rules are probabilistic in nature
- e.g., rolling dice, flipping a coin

Laziness: it is too hard to determine exception-less rules
- it takes too much work to determine all of the relevant factors, and the enormous rules that result are too hard to use

Theoretical ignorance: we don't know all the rules
- the problem domain has no complete, consistent theory (e.g., medical diagnosis)

Practical ignorance: we do know all the rules, BUT
- we haven't collected all the relevant information for a particular case

Uncertainty (cont'd)
Plausible/probabilistic inference: "I've got this evidence; what's the chance that this conclusion is true?"

Suppose you wake up with a headache. Do you have the flu?
- Logical inference: if headache then flu. But not all patients with a headache have the flu.
- A better inference rule would be: if headache then the problem is flu with probability 0.8, or P(flu | headache) = 0.8, i.e., the probability of flu is 0.8 given that a headache is observed.

Uncertainty (cont'd)
Probability theory
A formal language for representing and reasoning with uncertain knowledge: compute the probability of an event or decision given the evidence or observations.

Rather than reasoning about the truth or falsity of a proposition, we reason about the degree of belief that a proposition or event is true or false.

Source of Probabilities
Relative frequency: probability is the fraction that would be observed in the limit of a large number of samples
- if 10 of 100 people tested have a cavity, then P(cavity) = 0.1

Objective-based probabilities are real aspects of the world: objects have a tendency to behave in certain ways
- a coin has a tendency to come up heads with probability 0.5

Subjective-based probabilities characterize an agent's belief, i.e., the experience and judgment of the person making the estimates
- the probability that you'll pass the final exam can be based on your own subjective evaluation of your hard work and understanding of the material

Sample Space
A space of events/outcomes to which we assign probabilities. Events can be binary, multi-valued, or continuous, and events are mutually exclusive.

Examples:
- Coin flip: {head, tail}
- Die roll: {1, 2, 3, 4, 5, 6}
- English words: a dictionary
- Temperature tomorrow: R+ (Kelvin)

Random Variable
A variable, X, whose domain is a sample space and whose value is (somewhat) uncertain. A random variable assigns a number to every possible outcome of an experiment.

Examples:
- X = coin flip outcome
- X = first word in tomorrow's headline news
- X = tomorrow's temperature

For a given task, the user defines a set of random variables for describing the world.

Random Variable (cont'd)

Random variables refer to attributes of the world whose "status" is unknown. They have one and only one value at a time, and a domain of values that are the possible states of the world:
- Boolean: domain = <true, false>
- Discrete: domain is countable, and values are mutually exclusive and exhaustive, e.g., Sky domain = <clear, partly_cloudy, overcast>
- Continuous: domain is the real numbers

Probability for Discrete Events


An agent's uncertainty is represented by P(A=a), or simply P(a):
- the agent's degree of belief that variable A takes on value a, given no other information relating to A
- a single probability, called an unconditional or prior probability

Examples:
- P(head) = P(tail) = 0.5 for a fair coin
- P(A = head or tail) = 0.5 + 0.5 = 1
- P(A = even number) = 1/6 + 1/6 + 1/6 = 0.5 for a fair 6-sided die
- P(two dice rolls sum to 2) = 1/6 * 1/6 = 1/36 (both dice must show 1)

Probability Distributions
Given A is a RV taking values in <a1, a2, ..., an>, e.g., if A is Sky, then a is one of <clear, partly_cloudy, overcast>:
- P(a) represents a single probability where A=a, e.g., if A is Sky, then P(a) means any one of P(clear), P(partly_cloudy), P(overcast)
- P(A) represents a probability distribution: the set of all possible values of a random variable and their associated probabilities, <P(a1), P(a2), ..., P(an)>, e.g., if A is Sky, then P(Sky) is the set of probabilities <P(clear), P(partly_cloudy), P(overcast)>
- the sum over all values in the domain of variable A is 1:

P(a1) + P(a2) + ... + P(an) = 1
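To make the distinction between P(a) and P(A) concrete, here is a minimal Python sketch; the Sky probabilities are assumed values chosen only for illustration:

```python
# A discrete probability distribution P(Sky), represented as a dict
# mapping each value in the domain to its probability.
# The numbers are illustrative assumptions, not from the slides.
sky_dist = {"clear": 0.5, "partly_cloudy": 0.3, "overcast": 0.2}

# The probabilities must sum to 1 over the whole domain.
assert abs(sum(sky_dist.values()) - 1.0) < 1e-9

print(sky_dist["clear"])  # P(Sky=clear): a single probability P(a)
print(sky_dist)           # P(Sky): the full distribution P(A)
```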

Useful Probability Distributions


- Binomial distribution: describes the number of successes in independent trials of a Bernoulli process (a process with two outcomes)
- Normal distribution: bell-shaped distribution that is a function of two parameters, the mean and the standard deviation
- Exponential distribution: used in dealing with queuing problems; often used to describe the time required to service a customer
- Poisson distribution: describes the number of customer arrivals during a certain time interval
- F distribution: helpful in testing hypotheses about variances
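The first four of these are available in scipy.stats; the sketch below freezes each one with assumed parameter values chosen only for illustration:

```python
# Sketch: the four distributions above via scipy.stats (assumed parameters).
from scipy import stats

binom = stats.binom(n=10, p=0.5)    # successes in 10 Bernoulli(0.5) trials
norm = stats.norm(loc=0, scale=1)   # mean 0, standard deviation 1
expon = stats.expon(scale=2.0)      # mean service time of 2 time units
poisson = stats.poisson(mu=3.0)     # mean of 3 arrivals per interval

print(binom.pmf(5))    # P(exactly 5 successes)
print(norm.pdf(0.0))   # density at the mean
print(expon.cdf(1.0))  # P(service time <= 1)
print(poisson.pmf(2))  # P(exactly 2 arrivals)
```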

Useful Probability Distributions (cont'd)


[Figure: example plots of the Binomial, Normal, Exponential, and Poisson distributions]

Axioms of Probability
- 0 <= P(A=a) <= 1 for all a in the sample space of A
- P(True) = 1, P(False) = 0
- P(A v B) = P(A) + P(B) - P(A ^ B)

Derived properties:
- P(~A) = 1 - P(A)
- if A can take k different values a1, ..., ak: P(A=a1) + ... + P(A=ak) = 1
- P(A) = P(A ^ B) + P(A ^ ~B) if B is a binary event
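The inclusion-exclusion axiom can be checked numerically; this sketch uses a fair die and two assumed events (an even roll, and a roll greater than 3):

```python
# Verify P(A v B) = P(A) + P(B) - P(A ^ B) on a fair six-sided die.
from fractions import Fraction

p = Fraction(1, 6)            # each face is equally likely
A = {2, 4, 6}                 # event A: roll is even
B = {4, 5, 6}                 # event B: roll is greater than 3

P_A, P_B = p * len(A), p * len(B)
P_A_and_B = p * len(A & B)    # P(A ^ B) = 2/6
P_A_or_B = p * len(A | B)     # P(A v B) = 4/6

assert P_A_or_B == P_A + P_B - P_A_and_B   # 2/3 == 1/2 + 1/2 - 1/3
```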

Joint Probabilities
Joint probabilities specify the probabilities for a conjunction of events
Bird   Flier   Young   Probability
T      T       T       0
T      T       F       0.2
T      F       T       0.04
T      F       F       0.01
F      T       T       0.01
F      T       F       0.01
F      F       T       0.23
F      F       F       0.5

Joint Probabilities
With n Boolean variables, the table will be of size 2^n. And if n variables each had k possible values, then the table would be of size k^n.

Example:
- P(Bird=T) = P(bird) = 0.0 + 0.2 + 0.04 + 0.01 = 0.25
- P(bird, ~flier) = 0.04 + 0.01 = 0.05
- P(bird v flier) = 0.0 + 0.2 + 0.04 + 0.01 + 0.01 + 0.01 = 0.27
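These marginals fall out of the table mechanically; here is a minimal sketch with the joint table keyed by (bird, flier, young) truth values:

```python
# The Bird/Flier/Young joint table from the slide, as a Python dict.
joint = {
    (True, True, True): 0.0,    (True, True, False): 0.2,
    (True, False, True): 0.04,  (True, False, False): 0.01,
    (False, True, True): 0.01,  (False, True, False): 0.01,
    (False, False, True): 0.23, (False, False, False): 0.5,
}

# P(bird): sum all rows where Bird=T.
print(sum(p for (b, f, y), p in joint.items() if b))            # 0.25

# P(bird, ~flier): rows where Bird=T and Flier=F.
print(sum(p for (b, f, y), p in joint.items() if b and not f))  # ~0.05

# P(bird v flier): rows where Bird=T or Flier=T.
print(sum(p for (b, f, y), p in joint.items() if b or f))       # ~0.27
```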

Conditional Probabilities
Conditional probabilities formalize the process of accumulating evidence and updating probabilities based on new evidence:

P(A|B) = P(A ^ B)/P(B) = P(A,B)/P(B)

Example:
- P(~B|F) = P(~B,F) / P(F) = (P(~B,F,Y) + P(~B,F,~Y)) / P(F) = (0.01 + 0.01) / P(F)
- P(B|F) = P(B,F) / P(F) = (P(B,F,Y) + P(B,F,~Y)) / P(F) = (0.0 + 0.2) / P(F)
- P(~B|F) + P(B|F) = 1, so substituting and solving for P(F) gives P(F) = 0.22
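The same table yields P(F) directly by marginalization, confirming the value solved for above; this sketch reuses the `joint` dict from the previous snippet:

```python
# P(F): marginalize the joint table over Bird and Young.
p_flier = sum(p for (b, f, y), p in joint.items() if f)
print(p_flier)  # ~0.22, matching the value solved for on the slide

# P(B|F) = P(B,F) / P(F), straight from the definition.
p_bird_and_flier = sum(p for (b, f, y), p in joint.items() if b and f)
print(p_bird_and_flier / p_flier)  # ~0.909
```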

Conditional Probabilities (cont'd)

- Product rule: P(A,B) = P(A|B)P(B)
- Chain rule: P(A,B,C,D) = P(A|B,C,D)P(B|C,D)P(C|D)P(D)
- Conditionalized version of the chain rule: P(A,B|C) = P(A|B,C)P(B|C)
- Bayes' rule: P(A|B) = P(A)P(B|A) / P(B)
- Conditionalized version of Bayes' rule: P(A|B,C) = P(B|A,C)P(A|C) / P(B|C)
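The product rule can be verified numerically against the same table; a sketch, again reusing `joint` from the earlier snippet:

```python
def marginal(pred):
    """Sum the probabilities of all joint-table rows matching pred."""
    return sum(p for row, p in joint.items() if pred(row))

p_f = marginal(lambda r: r[1])                 # P(F)
p_b_and_f = marginal(lambda r: r[0] and r[1])  # P(B,F)
p_b_given_f = p_b_and_f / p_f                  # P(B|F)

# Product rule: P(B,F) = P(B|F) * P(F), up to float rounding.
assert abs(p_b_and_f - p_b_given_f * p_f) < 1e-12
```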

Conditional Probabilities (cont'd)

- Conditioning/addition rule: P(A) = Σ_b P(A|B=b)P(B=b), where the sum is over all possible values b in the sample space of B
- P(~B|A) = 1 - P(B|A)

Example:
P(~Bird | Flier, ~Young) = P(~B,F,~Y) / P(F,~Y)
  = P(~B,F,~Y) / (P(~B,F,~Y) + P(B,F,~Y))
  = 0.01 / (0.01 + 0.2) = 0.048
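The addition rule likewise checks out on the table: conditioning P(Bird) on Flier and summing out recovers the marginal computed earlier. A sketch reusing `joint`:

```python
p_f = sum(p for (b, f, y), p in joint.items() if f)   # P(F)
p_not_f = 1 - p_f                                     # P(~F)

p_b_given_f = sum(p for (b, f, y), p in joint.items() if b and f) / p_f
p_b_given_not_f = sum(p for (b, f, y), p in joint.items() if b and not f) / p_not_f

# P(B) = P(B|F)P(F) + P(B|~F)P(~F), the conditioning/addition rule.
print(p_b_given_f * p_f + p_b_given_not_f * p_not_f)  # ~0.25, as before
```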

Bayes Rule
Given a prior model of the world P(A) and new evidence B, Bayes' rule says how this piece of evidence decreases our ignorance about the world:
- initially we know the prior P(A)
- after observing B, we update to the posterior P(A|B) = P(B|A)P(A) / P(B)

Generalizing Bayes' rule to two pieces of evidence, B and C:
P(A|C,B) = P(C,B|A) P(A) / P(C,B)
         = P(C|B,A) P(B|A) P(A) / [P(C|B) P(B)]
         = P(A) * [P(B|A)/P(B)] * [P(C|B,A)/P(C|B)]
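A minimal sketch of one Bayes-rule update, reusing the headache/flu posterior of 0.8 from the earlier slide; the prior, likelihood, and evidence values are assumptions chosen only to be consistent with that posterior:

```python
p_flu = 0.02                 # assumed prior P(flu)
p_headache_given_flu = 0.9   # assumed likelihood P(headache | flu)
p_headache = 0.0225          # assumed evidence probability P(headache)

# Posterior: P(flu | headache) = P(headache | flu) P(flu) / P(headache)
print(p_headache_given_flu * p_flu / p_headache)  # 0.8
```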

Independence
RVs A and B are independent if:
- P(A|B) = P(A)
- P(B|A) = P(B)
- P(A,B) = P(A)P(B)

RVs A and B are conditionally independent given C if:
- P(A|B,C) = P(A|C)
- P(B|A,C) = P(B|C)
- P(A,B|C) = P(A|C)P(B|C)
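Independence can be checked directly on a joint distribution; this sketch uses two fair coin flips, an assumed example:

```python
import itertools

# Joint distribution over (flip1, flip2): all four outcomes equally likely.
coin_joint = {pair: 0.25 for pair in itertools.product("HT", repeat=2)}

p_first_h = sum(p for (a, b), p in coin_joint.items() if a == "H")
p_second_h = sum(p for (a, b), p in coin_joint.items() if b == "H")

# P(A,B) = P(A) P(B) holds, so the flips are independent.
assert coin_joint[("H", "H")] == p_first_h * p_second_h  # 0.25 == 0.5 * 0.5
```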

Independence (cont'd)
Bayes' rule with multiple, independent evidence: assuming B and C are conditionally independent given A (the step replacing P(C|B) with P(C) additionally assumes B and C are marginally independent), Bayes' rule can be simplified as:
P(A|B,C) = P(A) P(B,C|A) / P(B,C)
         = P(A) P(B|A)P(C|A) / [P(B) P(C|B)]
         = P(A) P(B|A)P(C|A) / [P(B) P(C)]
         = P(A) * [P(B|A)/P(B)] * [P(C|A)/P(C)]

Bayes Rule Example


RVs: P = PickledLiver (disease), J = Jaundice (symptom), B = EyesBloodshot (symptom).
Given P(P) = 2^-17, P(J) = 2^-10, P(J|P) = 2^-3, P(B) = 2^-6, P(B|P) = 2^-1, and that J and B are independent, determine the likelihood that the patient has a PickledLiver:
- P(PickledLiver | Jaundice) = P(J|P) P(P) / P(J) = (2^-17 * 2^-3) / 2^-10 = 2^-10
- P(PickledLiver | Jaundice, Bloodshot) = P(P) P(J|P) P(B|P) / [P(J) P(B)] = 2^-10 * [2^-1 / 2^-6] = 2^-5
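Since every quantity is a power of two, the arithmetic can be verified exactly with Python's fractions module:

```python
from fractions import Fraction

p_P = Fraction(1, 2**17)          # prior P(PickledLiver)
p_J, p_J_given_P = Fraction(1, 2**10), Fraction(1, 2**3)
p_B, p_B_given_P = Fraction(1, 2**6), Fraction(1, 2**1)

# One piece of evidence: the posterior rises from 2^-17 to 2^-10.
assert p_J_given_P * p_P / p_J == Fraction(1, 2**10)

# Two independent pieces of evidence: the posterior rises to 2^-5.
assert p_P * p_J_given_P * p_B_given_P / (p_J * p_B) == Fraction(1, 2**5)
```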

Naive Bayes Classifier


Say we have a class/diagnosis/decision variable A. The goal is to find the value of A that is most likely given the evidence B, C, D, i.e., find a such that P(A=a | B,C,D) is maximized:

argmax_a P(A=a) P(B|A=a) P(C|A=a) P(D|A=a) / P(B,C,D)

P(B,C,D) is constant for all a, so it can be ignored. In general, for a class variable V and evidence variables X1, ..., Xn:

argmax_v P(V=v) Π_{i=1..n} P(Xi=xi | V=v)
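A minimal sketch of this decision rule; the classes (flu, cold), the symptoms, and all the probabilities below are assumed numbers chosen only to illustrate the argmax:

```python
prior = {"flu": 0.3, "cold": 0.7}           # P(V=v), assumed
likelihood = {                              # P(Xi=True | V=v), assumed
    "flu": {"headache": 0.8, "fever": 0.9},
    "cold": {"headache": 0.6, "fever": 0.2},
}

def classify(symptoms):
    """Return argmax_v P(V=v) * prod_i P(Xi=xi | V=v)."""
    def score(v):
        s = prior[v]
        for name, present in symptoms.items():
            p = likelihood[v][name]
            s *= p if present else (1 - p)
        return s
    return max(prior, key=score)

print(classify({"headache": True, "fever": True}))   # -> flu
print(classify({"headache": True, "fever": False}))  # -> cold
```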

Homework
Discrete Mathematics and Its Applications: study Chapter 1 and solve related examples/problems.
Quantitative Analysis for Management: study Chapters 1 & 2 and solve related examples/problems.
