24.1
Since the twiddle factors are completely dictated by the data elements involved in
each butterfly computation, the twiddle factors required by each processor are easily
identified from the data it owns, which differ from one mapping to another. Assuming
that naturally ordered input data are transformed by DIF FFT without inter-processor
permutations, an example using a consecutive block mapping is given in Figure 24.1,
and another example using a cyclic mapping is given in Figure 24.2.
Observe that in either case, one processor may need to use more twiddle factors
than another. For comparison, the distribution of the twiddle factors is tabulated
in Table 24.1 for the consecutive block map in Figure 24.1, and in Table 24.2 for the
cyclic map in Figure 24.2. In the former case, the twiddle factors are clearly not
evenly distributed among the processors, whereas in the latter case, a more balanced
(but still not fully balanced) distribution results from using the cyclic data map in
parallelizing the DIF FFT algorithm.
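The comparison between the two mappings can be explored with a small enumeration. The following is a minimal sketch, not the book's own procedure: it assumes a radix-2 DIF FFT with N = 32 and p = 4, in which stage s pairs elements whose addresses differ in bit (5 - s), and it further assumes that the twiddle factor of a butterfly is needed only by the processor that owns the element being multiplied by it (the element whose pivot bit is 1). The names `twiddle_exponents`, `block_owner`, `cyclic_owner`, and `distinct_counts` are hypothetical helpers introduced for illustration.

```python
def twiddle_exponents(n_bits, owner, num_procs):
    """Per-processor, per-stage sets of exponents k of w_N^k (N = 2**n_bits)."""
    need = [[set() for _ in range(n_bits)] for _ in range(num_procs)]
    for stage in range(1, n_bits + 1):
        pivot = n_bits - stage           # address bit paired at this stage
        low_mask = (1 << pivot) - 1      # bits to the right of the pivot bit
        for m in range(1 << n_bits):
            if (m >> pivot) & 1:         # element that is multiplied by w_N^k
                # Exponent pattern: low bits shifted left, e.g. i2 i1 i0 0 at stage 2.
                k = (m & low_mask) << (stage - 1)
                # ASSUMPTION: the owner of this element applies the twiddle factor.
                need[owner(m)][stage - 1].add(k)
    return need


N_BITS, P = 5, 4                         # N = 32, p = 4


def block_owner(m):
    # Consecutive block map: each processor owns N/p neighbouring elements.
    return m // ((1 << N_BITS) // P)


def cyclic_owner(m):
    # Cyclic map: element m is assigned to processor m mod p.
    return m % P


def distinct_counts(need):
    # Number of distinct twiddle factors each processor uses over all stages.
    return [len(set().union(*stages)) for stages in need]


block = twiddle_exponents(N_BITS, block_owner, P)
cyclic = twiddle_exponents(N_BITS, cyclic_owner, P)
print("block :", distinct_counts(block))
print("cyclic:", distinct_counts(cyclic))
```

Under these assumptions the block map yields 4, 8, 10, and 12 distinct twiddle factors for P0 through P3, while the cyclic map yields 4, 8, 8, and 9, which is consistent with the observation that the cyclic map is more balanced but still not fully balanced. A different convention for which processor performs the multiplication would change the counts.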
Figure 24.1 DIFNR FFT twiddle factors required if a consecutive block map is used.
Table 24.1 DIFNR FFT twiddle factors required by each processor in Figure 24.1.
[Table 24.1 body: rows P0 to P3; columns Stage 1 to Stage 5, whose twiddle factors have the forms $w_N^{i_3 i_2 i_1 i_0}$, $w_N^{i_2 i_1 i_0 0}$, $w_N^{i_1 i_0 00}$, $w_N^{i_0 000}$, and $w_N^{0}$, respectively. Individual entries lost in extraction.]
Figure 24.2 DIFNR FFT twiddle factors required if a cyclic map is used.
Table 24.2 DIFNR FFT twiddle factors required by each processor in Figure 24.2.
[Table 24.2 body: rows P0 to P3; columns Stage 1 to Stage 5, whose twiddle factors have the forms $w_N^{i_3 i_2 i_1 i_0}$, $w_N^{i_2 i_1 i_0 0}$, $w_N^{i_1 i_0 00}$, $w_N^{i_0 000}$, and $w_N^{0}$, respectively. Individual entries lost in extraction.]
24.2
Referring to Figures 20.1, 20.2, 20.3, 20.4, 20.5, and 20.6 in Chapter 20 on parallel FFTs with inter-processor permutations, one can tabulate the twiddle factors
$w_N^{i_3 i_2 i_1 i_0}$, $w_N^{i_2 i_1 i_0 0}$, $w_N^{i_1 i_0 00}$, $w_N^{i_0 000}$, and $w_N^{0}$
(inferred from the global index $m = i_4 i_3 i_2 i_1 i_0$) required
by each processor, as shown in Table 24.3 below. (Note that p = 4 and N = 32 in the
example.) Again, it is assumed that a DIFNR FFT is used. Observe that in this case
each processor needs almost all N/2 twiddle factors, which it can either compute in
advance or compute on the fly (to save storage).
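The two options, computing in advance or computing on the fly, can be contrasted in a few lines. This is only an illustrative sketch: it assumes the common convention $w_N = e^{-2\pi i/N}$ (the sign of the exponent is an assumption here), uses N = 32 as in the example, and the function names are hypothetical.

```python
import cmath


def twiddle(k, n):
    """w_N^k under the ASSUMED convention w_N = exp(-2*pi*i/N)."""
    return cmath.exp(-2j * cmath.pi * k / n)


N = 32

# Option 1: compute in advance. Each processor stores all N/2 complex values.
table = [twiddle(k, N) for k in range(N // 2)]


def twiddle_from_table(k):
    return table[k]


# Option 2: compute on the fly. No table is stored, at the cost of one
# complex exponential per use.
def twiddle_on_the_fly(k):
    return twiddle(k, N)
```

The table costs N/2 complex numbers of storage per processor and makes each lookup a single memory access, whereas the on-the-fly version trades that storage for repeated evaluation of the exponential.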
Table 24.3 DIFNR FFT twiddle factors required by each processor in Figures 20.1,
20.2, 20.3, 20.4, 20.5, and 20.6. (p = 4 and N = 32)
[Table 24.3 body: twiddle factors $w_N^{k}$ listed by processor for Stage 1 through Stage 5. Individual entries lost in extraction.]