FASTF: Fast Fourier transform with arbitrary factors

FASTF: Fast Fourier transform with arbitrary factors

Computer Physics Communications 30 (1983) 397—402 North-Holland Publishing Company 397 FASTF: FAST FOURIER TRANSFORM WITH ARBITRARY FACTORS Eduardo ...

373KB Sizes 3 Downloads 179 Views

Computer Physics Communications 30 (1983) 397—402 North-Holland Publishing Company

397

FASTF: FAST FOURIER TRANSFORM WITH ARBITRARY FACTORS Eduardo GARCIA-TORANO Metrologia Ed:ficio 3, Junta de Energia Nuclear, Avenida Complutense 22, Madrid -3, Spain Received in final form 15 June 1982

PROGRAM SUMMARY Title of program: FASTF Catalogue number: ACUH Program obtainablefrom: CPC Program Library, Queen’s University of Belfast, N. Ireland (see application form in this issue) Computer: UNIVAC 1100/81; Installation: Centro de Câlculo, Junta de Energia Nuclear Operating system: EXEC 8 Programming language used: FORTRAN IV High speed storage required: 8 kwords for the test run No. of bits in word: 36 Peripherals used: card reader, line printer No. of cards in combined program and test deck: 376

Nature of the problem FASTF performs a discrete Fourier transform (direct/inverse) of a complex vector of length N. N should be factorisable as N= r 1r2...r~(r, integers). Method of solution The fast Fourier transform algorithm as described in ref. [11. Restrictions on the complexity of the problem The lowest r, should be less than 181 for storage limitations. This condition may be modified by increasing the size of the IPRIM array which contains a list of the 40 first prime numbers. Typical running time The execution time depends on N. A set of 1024 points is transformed in 0.544 s. Reference [1] W.T. Cohram et al., IEEE Trans. on Audio and Electroacoustics AU-iS (1967) 45.

Keywords: fast Fourier transform, Fourier, time series analysis

OO1O-4655/83/0000—OOIJO/$03.OO © 1983 North-Holland

398

/ FA STF: fast

F. Garcia - Toraño

Fourier transform

LONG WRITE-UP 1. Introduction The Fourier transform is a very important tool for the analysis of many physical problems. The Fast Fourier Transform algorithm (FFT) allows us io compute the Discrete Fourier Transform (DFT) efficiently and accurately. The bibliography contains several programs to compute the FFT [2—4].Generally, the assumption is made that the number of points to be transformed is a power of 2. For an arbitrary number of points, it is usual to augment the set of points with sufficient zeros so that the condition N = can be satisfied. In many cases, as in nuclear spectroscopy, this method can be inadequate for different reasons: First, because of the Gibbs phenomenon [5] associated with the discontinuities that appear when the zeros are introduced, and second, for numbers such that 2’
Assuming that B are the coefficients obtained in the DFT of the basic sequences, we find the following expression for the final coefficients:

c

=

i

~

p

‘~‘“

=

o

,~

~



1

psi

=

o

p

1



‘‘‘

r= 0 N/p 1, (3) where W= exp(— 2’rri/N). If N has additional factors, the process can be continued until N/p = 1. —

2mnteger

3. Program structure 3.1. Introduction FASTF consists of a set of 6 subroutines. A simplified block diagram is shown in fig. 1 and a brief description of the routines is given. In order to optimize the computation of the ~s~lL

. .

takes a factor

2. Description of the method The DFT of a set defined [1] by Cr = ~

Xk

exp(



{ } ~.

TRANS

of N complex points is

2~irk/N) r

=

0, 1,.

. . ,

N



1

(1)

~

NP=2

no

yes

NPr 3

no

y~

NP~5

no

COMP 3

and its inverse COMPS

N—i

X~= ~ Ck/Nexp(2lrirk/N),

(2)

COMPO

k=O

where { C } are the complex coefficients obtained in the DFT of (X~}and ~= If N has a factor p, we form p basic sequences from the original points as follows: Z~= ~

i

=

0,.. .,p



1, k

=

0,... ,N/p



1.

no last factor yes users caLL Fig. i. Block diagram of the FASTF set of routines.

next factor

E. Garcia - Toraño

trigonometric functions, we defined ‘rr = double precision arc cosine(



/

FA STF: fast Fourier transform

as:

399

the transformation. They compute NPARO sets of NELEMO elements from sets of NELEMI elements.

1)

according to the internal value in the mathematical library of our computer. Users may redefine this number depending on their own systems. The real and imaginary parts of the input and output data are stored separately in double precision arrays because some FORTRAN versions do not have double precision complex arithmetic, Since computation is not “made in place”, the input data are lost during the process. 3.2. Subroutine FASTF This is the main routine of the set. It computes the factors of N, rearranges the input data and calls the auxiliary routine TRANS. The calling sequence is:

Since the computation of the trigonometrical functions is the slowest operation involved in the transformation, it is usual to generate these functions without the use of the mathematical library of the computer by approximate expressions. But, from the point of view of accuracy it is preferable to employ this library and we tried to reduce the number of angles which is necessary to calculate. To evaluate the transform according to the value of p in expression (3), we developed four specialized subroutines. Subroutine COMP2 works for p = 2. A considerable saving in time may be obtained if expression (3) is reduced to: Cr = B,°+ B~(cos2~ir/N i sin 2lTr/N), —

CALL FASTF (N, CR, CI, XR, XI, INDEX)

Cr+N/

inwhich N number of points to be transformed and dimension of the input and output arrays; CR real parts ofparts the complex coefficients; CI imaginary of the complex coefficients; XR real parts of the complex vector to be transformed; XI imaginary parts of the complex vector to be transformed; INDEX = 1 for direct transform, = 1 for inverse transform. The subroutine permits a minimum prime factor smaller than 181, but this condition may be modified by increasing the list of the 40 first prime factors which is contained in the IPRIM array.

r=0,1,...,~N—1. Subroutine COMP3 works for p = 3. In this case, we can express the transform as:



2

=

Br°+ B~(—cos2iir/N+

i sin 2iir/N),

2ix,’2r

~-~r— L~r+ — nO

cr+N/3



rliijr VV + Dr o

P’~’

B° BluI’~”/3

p2~2(r+N/3)

-‘-

Cr+

2~~/3 + B,~W2+2N/3),

B~+0 B,~WT+ ‘N 1 r , We can then reduce the computation of some Wk to others by trigonometric expressions. A different approach to the problem is employed in the case of p = 5 (subroutine COMP5). Eq. (3) may be rearranged as: 2N/3

=

— —

~.

,...,

~



C,+mN/t=B,0+ WT+m~T/5[B,1+ W1+mN/s 3.3.

Subroutine

TRANS

This subroutine calls the other subroutines according to the value of p in expression (3). 3.4. Subroutines COMF2, COMPG

COMP3,

COMP5,

These subroutines compute eq. (3) or its inverse to obtain the intermediate and final coefficients of

x[B,2+ ~ We verified that there was a very small loss of accuracy due to the accumulation of rounding errors in this expression. Subroutine COMPG works for p = 2, 3, 5. It follows from expression (3) that the angles to be considered in the transformation are: a= N

r + m1~ , p

m = 0, 1,... ‘p



1.

400

E. Garcia-Toraño

/

FASTF: fast Fourier transform

If i, r are fixed numbers, it may be seen that these angles are equally spaced according to the values of m. We note moreover that they are such that: + ampj = 4’rrir/N + 2’rr

general-purpose subroutine COMPG. Accuracy was evaluated in the following form: Suppose that {Ck}=FFT{Xj} and

=4iiir/N=/3. We compute only the trigonometric functions corresponding to the first angle (a1) and we obtain

{Ek}=FFT~{C}, we form the sums

those of the other angle by using the trigonometric expressions: sin(a~~)=sin($—a 1)

s;

sin /3 cos a~ sin a1 cos /3,

S,’

=

cos(a~



N

~

=

2/N,

(Re( x~) Re( E.)) —

11 =

2/N,

~ (Im( X,)



Im( E, ) )

1)=cos($—a1) =

cos /3 cos a~ sin a~sin /3. —

and compute ~

4. Testing the program

In order to estimate the time required by the program, it was tested for several values of the number of points N. We simulated a set { X, } of random numbers, normally distributed, with zero mean. Then, we calculated the FFT of this set. The measured cpu times for the transformation are given in table 1. From table 1 it may be seen that numbers which are divisible by 2, 3 or 5 are favoured in the transformation, because the special subroutines designed for these factors work faster than the

2. s1=(s;)~’ were very similar in real and imag-

The results mary components. Accuracy as defined before varies between 1017 and 10—19 for the number of points considered in table 1.

5. Input and output description The main program is to be written by the user. It should contain a calling sequence as described in section 3.2. The results are not printed by the FASTF routine, so the user must perform this in the main program.

Table 1 Values of the cpu time for the FASTF routine (UNIVAC 1100/81)

6. Test run

Number of points

Time (s)

5i2=2~ 1024=2b0

0.252 0.544 0.439 0.405 0.751 0.651 1.7 2.71

transform of the following data: X,(i)=float(i),

729=36

625=S~ 900 = 22.32.52 1000 ~ 529 = 232 897= 3’~13’~23’

A test program has been written to compute the

X,(i)

=

float(N— i + 1). .

.

The main program prints out the input arrays Xr, X, and the output arrays: FFT(X) and FFT1(FFT(X)). The number of points was N = 50.

E. Garcia -Toraño

/

FASTF: fast Fourier transform

References

[31J.P. Christiansen

[1] W.T. Cohran et al., IEEE Trans. on Audio and Electro. acoustics AU.15 (1967) 45. [2] J.A. Maruhn, Comput. Phys. Commun. 2 (1976) 147.

[41 R.C. Singleton, IEEE Trans. on

401

and R.W. Hockney, Comput. Phys. Commun. 2 (1971) 127.

Audio and Electroacoustics AU45 (1967) 91. [5] H.P. Hsu, Fourier Analysis (Simon—Schuster, New York, 1970).

E. Garcia - Toraño

402

/

FA STF: fast Fourier transform

TEST RUN OUTPUT

INPUT XR XR XR XR XR XR XR XI XI XI XI XI XI XI

• ‘ ‘ ‘

• •

• • •

• ‘ ‘ ‘



.i80009+081 .908900÷001 1’~0000+002 .250000÷002 .330090÷802 .410009+002 .490009÷002 .500000÷002 .420000+002 .340000÷002 .268000+082 • 180000+802 .100080+002 .200000+001

.200000+001 100000÷002 • 180089÷002 .260000+002 .340800+082 .420089÷002 .500800+002 .490090+802 .418000+002 .330000÷002 .250000+002 • 110000+002 .900000+001 .100000+001

.300000+091 • 110000+802 • 190000+002 .210000÷082 .350008+082 .430008+002

.400000+001 120000÷082 .208000+002 .280000+002 .360000÷002 .440000+002

.500000+001 • 130080+002 .210900÷002 .290000÷002 .310000÷002 .450000+002

.600000+001 • 140008+002 .220080+002 .308000+902 .389000÷092 .468000+092

.~00800+081 • 150008÷002 .238808+002 .310008+002 .398008+002 .410098÷002

.890090÷001 • 160000+002 .249000÷082 .320008+002 .480008÷082 .480088+002

.480000+802 .400000+002 .320000+002 .240000+002 • 168880+002 .800800+001

.410000÷002 .390000+002 .310000+082 .230900+002 • 150000÷802 .100098÷001

.468000÷002 .388800+002 .300008÷802 .228000+002 • 140000+002 .600000+081

.450800+002 .310000÷002 .290809+002 .210009÷082 • 130000+092 .500000+001

.440000÷002 .360000÷002 .288000+802 .298009÷002 • 120008+002 .400800÷001

.430009+082 .350000÷902 .210900÷002 • 199000÷002 • 110000+002 .300000÷001

OUTPUT CR CR CR CR CR CR CR CI CI CI CI CI CI CI

u • ‘ ‘

• • ‘



• •

u • • •

121500÷004 .312364+003 • 112895÷003 • 106055+003 .123686÷002 .519421+002 .381428+002 .281211+002 .204148÷002 • 143931÷002 .940955+001 .521981+801 • 162230+001 —o 152344+801 —.431820+001 —.683644+001 —.913452+001 —. 112561+002 —. 132359+002 —. 151018+002 —. 168710+002 —. 185811+002 —.202310+082 —.218418+082 —.234211+002 —.250000+002 —.265129÷002—.281582+802 -.291690+002 —.314189+002 —.331238+002 —.348982÷002 —.361641+002 —.381439÷002—.408655+002 —.431636+082 -.456818+002 —.484’166÷002—.516223+002 —.552180+002 —.594095+802 —.643931+002 —.104148÷002 -.181271+002 —.881428+002 —. 101942+003 —. 122369+083 —. 156055÷083 —.222895+003 -.422364+003 . 121500+084 .422364+003 .222895+003 • 156055+003 • 122369+003 • 101942+003 .881428+002 .181271÷002 .104148÷002 .643931÷092 .594095+002 .552198+002 .516223+802 .484166+002 .456818+002 .431636+082 .408655+002 .381439+982 .361641+002 .348982+002 .331230+002 .314189+002 .291690÷002 .281582+002 .265129÷092 .250000+002 .234211+802 .218418+002 .202310+002 • 185811+802 • 168730÷002 • 151018+082 . 132359+802 • 112561÷002 .913452+001 .883644÷001 .431820+001 • 152344+001 —. 162238+001 —.~i21901+001 -.940955+081 —. 143931+092 —.204148÷002—.281273+002 -.381428+082 —.519421+002 —.123686+002 —.186055+803 —. 112895÷003—.312364+003 •

OUTPUT XR XR XR XR XR XR XR XI XI XI XI XI XI XI

• • • ‘ ‘

u • • • • • • ‘



.100000+001 .900080+08 1 . 110000+002 .250000+002 .330000+082 .410000+002 .490000+002 .500000÷002 .420000+092 .340000+002 .260000+002 • 180000+002 . 189090÷002 .200000+001

.200000+001 • 108090+002 • 180000÷002 .260080+002 .340000÷002 .420008+002 .500000÷002 .490080÷002 .410980+002 .339000÷002 .250000÷002 • 110000+002 .900000÷091 .100000÷001

.300000+001 .110000÷082 • 190008+082 .210000+002 .350080÷082 .430080+002

.400800+081 • 120000+002 .200000+002 .288880+802 .360000÷002 .440000+002

.500080+001 • 130080+002 .210880+002 .290000+002 .318809÷882 .458090÷002

.608009+001 • 140990+802 .228080÷002 .300000+002 .380080+002 .460000+002

.180080÷001 • 150909+002 .230909+002 .310000+082 .399800+082 .410000+002

.800000+001 • 160800+002 .240800+802 .320000+002 .400900÷002 .488000÷082

.480800+002 .409000÷002 .320000÷082 .240000÷002 • 160000÷002 .800080÷001

.410000+002 .398090+002 .310000÷002 .230080÷002 • 150008÷002 .100080+001

.460000+002 .388000÷002 .308000+002 .220000+092 • 140000+092 .688000+801

.450000+002 .310000+002 .290002÷082 .210800÷002 • 130000+002 • 500009+801

.440900÷002 .360080+082 .289008+002 .209080+002 • 120800÷002 • 480009+001

.430008÷002 .350090+002 .210090+002 • 190090+002 • 110090+002 .390000÷801