
Okay.

So, we have given this algorithm for the problem of RNA secondary structure, and as I said, it fits within the category of algorithms that we call dynamic programming. Okay?
The term "programming" here does not have to do with programming in real programming languages; it is just the name of this class of algorithms. And dynamic programming, as I said, solves certain optimization problems where we observe that the optimal solution to the big problem can be obtained from optimal solutions to sub-problems.
Like in this case here: the solution to the entire problem, OPT(1,n), can be obtained from OPT(1, n-1), or, when position n is paired with some position t, from OPT(1, t-1) and OPT(t+1, n-1) plus the pair (t, n) itself. So the solution to the bigger problem on (i,j) can be obtained from solutions to smaller sub-problems.
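
Written out in full, the recurrence being described is the standard Nussinov-style formulation (the minimum separation of 4 between paired positions is an assumption here, inferred from the "n minus 5" loop bounds discussed below, with the convention that OPT(i, t-1) = 0 when t = i):

```latex
\mathrm{OPT}(i,j) \;=\; \max\!\left(
  \mathrm{OPT}(i,\,j-1),\;
  \max_{\substack{i \le t < j-4,\\ s_t \text{ pairs with } s_j}}
  \bigl(1 + \mathrm{OPT}(i,\,t-1) + \mathrm{OPT}(t+1,\,j-1)\bigr)
\right),
\qquad \mathrm{OPT}(i,j) = 0 \ \text{ if } j - i \le 4.
```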
Okay, so, when we look at dynamic programming, we have to have this property: the solution to the big problem is obtained from solutions to sub-problems. And when we have this property, as we compute the solutions to sub-problems we store them, usually in a matrix. Then, when we come to solve the big problem and encounter these sub-problems, all we need to do, because they have already been computed, is look them up. So looking up this value, this value, this value, is an O(1) operation for each one of them.
Of course, the gain from dynamic programming comes in cases where the same sub-problem would need to be computed over and over and over if we did a plain recursive algorithm, for example.

So, if I get to a situation where the problem is solved from sub-problems, but these sub-problems never overlap, in the sense that OPT(7,19) is solved once and OPT(7,20) is solved, or needed, only once, then I'm not going to gain much. The gain I get from this comes when the same sub-problem needs to be solved over and over and over. With dynamic programming, the way we do that is by solving it only once, storing it in a matrix, and every time we need to solve it again, all we do is just look up the value in the matrix, okay?
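
To make this concrete, here is a minimal bottom-up sketch of this algorithm in Python. The pairing rule (Watson-Crick pairs only) and the minimum loop length of 4 are assumptions based on the usual formulation, and the names can_pair and max_pairs are illustrative, not the lecturer's:

```python
def can_pair(a, b):
    # Allowed base pairs -- an assumption: Watson-Crick pairs only,
    # ignoring G-U wobble pairs.
    return {a, b} in ({"A", "U"}, {"C", "G"})

def max_pairs(s):
    # OPT[i][j] = max number of pairs in s[i..j]; stays 0 when j - i <= 4.
    n = len(s)
    OPT = [[0] * n for _ in range(n)]
    for k in range(5, n):                 # interval length j - i
        for i in range(n - k):
            j = i + k
            best = OPT[i][j - 1]          # case 1: position j is unpaired
            for t in range(i, j - 4):     # case 2: j pairs with some t
                if can_pair(s[t], s[j]):
                    left = OPT[i][t - 1] if t > i else 0
                    best = max(best, 1 + left + OPT[t + 1][j - 1])
            OPT[i][j] = best
    return OPT[0][n - 1] if n else 0
```

Each entry is computed exactly once; after that, every time the algorithm needs OPT(i,j) again it is just an O(1) table lookup, which is exactly the point being made here.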
The other thing that we have to keep in mind is that the fact that I have written this kind of formula and this algorithm, or that I have dynamic programming, does not necessarily mean I have an efficient algorithm, by which we mean a polynomial-time algorithm. For this to be polynomial, the number of sub-problems that we divide into and solve has to be polynomial as well, okay?
Because if the matrix has exponential size, there's not much you can do about it, since you are filling the entire matrix. So usually the running time of a dynamic programming algorithm is dominated by the size of the matrix, because at the very least we have to fill out all the entries in the matrix. Even if we do the simplest amount of computation, a constant number of operations per entry, the number of entries in the matrix is going to give us a lower bound on the running time.
Now, what is the running time of this algorithm? It's very simple to analyze, actually. This loop here is on the order of n: it runs n minus 5 times, which is order of n. This loop here is also on the order of n, because in the worst case, when the interval length k is 5, it is going to go from 1 to n minus 5. This is O(1). Here, this is going to be O(1).

Of course this is O(1), this is O(1), and the addition is O(1), but there is this t that is ranging. If I am looking, again in the worst case, at position n, t could be any of these positions here. So in the worst case it is O(n): we might be looking at O(n) values of t. So this can be O(n), and this O(n) loop is nested within this one, which is nested within this one. So this algorithm is on the order of n cubed, O(n^3).
Okay?
And this is much better than trying a brute-force approach that would have, for example, looked at every possible way of pairing indices in the RNA secondary structure, then checked feasibility, and then found the one with the maximum number of pairs, okay?
And if you want to think about whether this is faster than the recursive algorithm, I highly encourage you to implement this as dynamic programming, using a matrix, and also to implement the recursive version, where OPT is the name of a function instead of a matrix, so that you are calling the function here every time. Then run them, look at the running times of these two algorithms, and see which one is going to be faster.
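
As a sketch of that experiment (reusing the hypothetical can_pair and max_pairs from the sketch above), the recursive version and a rough timing comparison might look like this:

```python
import time

def max_pairs_rec(s, i, j):
    # Plain recursion: OPT is a function, not a matrix, so overlapping
    # sub-problems get recomputed many times.
    if j - i <= 4:
        return 0
    best = max_pairs_rec(s, i, j - 1)               # j unpaired
    for t in range(i, j - 4):                       # j pairs with t
        if can_pair(s[t], s[j]):
            left = max_pairs_rec(s, i, t - 1) if t > i else 0
            best = max(best, 1 + left + max_pairs_rec(s, t + 1, j - 1))
    return best

s = "GGGAAAUCC" * 2                                 # an arbitrary test string
start = time.perf_counter()
print(max_pairs(s), time.perf_counter() - start)    # dynamic programming
start = time.perf_counter()
print(max_pairs_rec(s, 0, len(s) - 1),
      time.perf_counter() - start)                  # plain recursion
```

The gap grows quickly with the string length, because the plain recursion solves the same intervals over and over while the table version solves each one exactly once.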
Of course, with a dynamic programming implementation, there is always an issue with memory, because we have to build that matrix, okay? Usually the memory is not a big deal when we are looking at small strings, because in this case it's going to be on the order of n squared. But if n gets very, very large, dynamic programming approaches start suffering because of the memory issue. They are still efficient in terms of time, but the amount of memory they need becomes very large, and that becomes the bottleneck, okay.
One final word about this algorithm for RNA secondary structure, with respect to dynamic programming: remember that the biologist was not interested in the number. The biologist didn't ask me to give them the number of pairs that are going to match; the biologist wanted the actual pairs, because they wanted the actual structure. So how would I get the actual pairs themselves, rather than just the number?
Because when I say return OPT(1,n), this can be, say, the number 5. But the biologist is interested not in the number 5; the biologist is interested in the five pairs that produced it.
That's very easy to obtain if you think about this algorithm carefully. When you look at OPT(1,n), it's the max of this case or this case. If this first one, OPT(1, n-1), is the value that's giving us the max, then we know that n is not part of the solution, and we repeat the exercise now on 1 to n-1. If this second part is what gives us the solution, then we find the t that is maximizing this value. If t is, let's say, 7 and n is 100, then we tell the biologists that 7 and 100 are going to be one of the pairs. Then we repeat the exercise on the remaining sub-intervals and find the other pairs.
Okay?
So this is a technique that's called backtracking, or traceback. When we compute answers using dynamic programming, usually we are computing a number that is related to the solution, which is usually the optimality score; here it's the maximum number of pairs, and in other algorithms it's something similar. So dynamic programming algorithms are usually implemented in such a way that the matrix entries contain the optimality score, but using the traceback technique, we can find the actual solution, not just the score of it, okay?
And again, remember that in the matrix we built, OPT(1,n) is this entry here. You start from this entry and start doing a traceback, to find the actual pairs that form the RNA secondary structure.
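
As an illustrative sketch of that traceback (assuming the earlier max_pairs sketch is changed to also return its filled OPT table, and reusing the hypothetical can_pair helper):

```python
def traceback(s, OPT, i, j, pairs):
    # Recover one optimal set of pairs from a filled OPT table.
    if j - i <= 4:
        return
    if OPT[i][j] == OPT[i][j - 1]:
        traceback(s, OPT, i, j - 1, pairs)          # case 1: j was unpaired
        return
    for t in range(i, j - 4):                       # case 2: find the t paired with j
        if can_pair(s[t], s[j]):
            left = OPT[i][t - 1] if t > i else 0
            if OPT[i][j] == 1 + left + OPT[t + 1][j - 1]:
                pairs.append((t, j))                # report (t, j) as a pair
                if t > i:
                    traceback(s, OPT, i, t - 1, pairs)      # left sub-interval
                traceback(s, OPT, t + 1, j - 1, pairs)      # enclosed sub-interval
                return

# Usage: pairs = []; traceback(s, OPT, 0, len(s) - 1, pairs)
```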
