Least square procedures to improve the results of the three ‐ parameter sine ‐ fitting algorithm

The paper presents two approaches to improve the three parameter sine fitting algorithm and attain accurate estimates of the parameters of a sinusoidal signal corrupted by noise. As the four parameter sine fitting algorithm, which is usually adopted to improve the estimates of the three parameter algorithm, in theory, the proposed ones can be inserted into iterative schemes and repeated until a target precision is gained. Anyway, the use of the proposed algorithms is particularly suggested for those applications in which the results must


INTRODUCTION
Many powerful methods aimed at measuring the parameters of digital signals corrupted by noise rely on model fitting approaches [1].Fitting consists in adjusting the parameters of the model in order to minimize a distance measurement between the model and the digital signal.When the distance measurement consists in the root mean square difference, the fitting is recognized as the best fit in a least square sense [2]- [3].
Sine fitting is aimed at accurately estimating the frequency, amplitude, phase and DC value of a sinusoidal signal corrupted by noise.Also, the estimation of the variance of the noise can be gained from the residue of the fit [4]- [7].For instance, displacement sensors [8] and digital impedance meters [9] exploit sine fitting to estimate the amplitude of the signal, whereas digital instruments devoted to system monitoring and control use sine fitting to obtain accurate frequency [10]- [12] or phase measurements [13]- [16].In metrological applications the chief purpose of the fitting is the accurate estimation of the variance of the noise [17].
The most common sine-fitting algorithm is the three parameter algorithm [17], which is based on a linear algebra framework, and it is utilized when the frequency parameter is known and stable.The accuracy characterizing the parameters estimates offered by the three parameter algorithm is mainly limited by the exact knowledge of the frequency of the signal.
If the fitting result is unsatisfactory a four parameter algorithm can be adopted to improve the estimates.The four parameter algorithm starts from the results of a pre-fit, obtained through the three parameter algorithm, and runs a least square procedure to (i) update the estimates of amplitude, phase, and dc value and (ii) identify a frequency correction term to improve the frequency estimate.In order to further improve the results, the four-parameter algorithm can be recursively invoked until the target accuracy is gained, using the estimates gained at the previous execution as inputs for the next execution.
In the paper, starting form an analysis of the effects produced by a frequency error in the use of the three parameter sine fitting algorithm, two least square methods capable of complementing the three parameter sine fitting algorithm and improve its results are proposed as alternative to the four parameter algorithm.The proposed approaches quantifies both the frequency error, to offer an accurate estimate of the frequency of the input signal, and the variance of the noise.
Their performance in terms of accuracy and running times are thoroughly analysed and compared to that offered by the four parameter algorithm.Although both the proposed algorithms and the four parameter algorithm can be inserted into iterative schemes and repeated until a target precision is gained, the single shot results are considered in the comparison.The proposed algorithms are in fact intended for those applications in which the estimates of the frequency and of the variance of the noise must be gained in very short times, which are in contrast to long iterative approaches.

Three parameter sine fitting
The standard three parameter sine fitting algorithm estimates the amplitude, phase and dc value of the sinusoidal signal that best fits in a least square sense a digital signal y k .Named the value of the digital frequency of the input data (i.e. the frequency in hertz normalized to the sample frequency), A, , and C, respectively, the amplitude, phase and DC value, n k the noise, and k the digital time ranging from 0 up to M-1, the following model can be adopted: The three-parameters sine-fitting algorithm requires as input parameter the value of the digital frequency or an estimate   of it.If the value   coincides with , the values of the parameters A, and C can be gained considering the equation: which is equivalent to equation (1) but linear in the parameters A 0 , B 0 and C 0 , respectively defined as: A 0 = Acos, B 0 = -Asin, and C 0 = C. Discarding the noise n k and ranging k from 0 up to M-1, equation (2) produces a linear system made up of M equations, which can be represented in matrix form by: D 0 x 0 = y (3) where x 0 is a column array with the unknown components A 0 , B 0 and C 0 , y is a column array containing the samples of the input digital signal, y k , and D 0 is the matrix of the coefficients: As it is well known the solution represented by equation ( 5) minimizes the mean square value of the residue of the fitting, which should coincide with the noise.The parameters of the model given in (1) are then estimated through: in which atan2( ., . ) represents a four quadrant inverse tangent function.

Four parameter sine fitting
The four parameter sine fitting algorithm works on the hypothesis that the estimated value of the digital frequency   approaches the exact one , but it is biased by a small amount .It linearizes the model given in (2), which is non-linear with respect to , by approximating it with a first order Taylor expansion centred in  i-1 =  0 .Thus it considers for the acquired digital signal the representation:   (7) in which there are four unknown parameters, namely A i , B i , C i , and  i .The term  i represents a correction term that added to  i-1 produces an improved estimate of the digital frequency, Equation (7) represents an iterative model that includes the values of the parameters estimated at step (i-1)-th and the unknown parameters to be estimated at step i-th.In order to start the iterations, the four parameter algorithm uses at step i = 1 the results of a three-parameter pre-fit to initialize for the next iterations, the parameters estimated at step i-th are reused at step (i+1)-th.
The method to solve ( 7) is formally identical to that discussed for the three-parameter algorithm.In fact, discarding the noise n k and ranging k from 0 up to M-1, equation ( 7) produces a linear system made up of M equations.The estimates of the parameters A i , B i , C i and  i that minimize the mean square value of the residue in (7) can thus be attained solving the system by means of the pseudo-inverse method.In matrix form, collecting the parameters A i , B i , C i and  i , in the column array where D i is defined by: The four-parameter algorithm is usually run until the distance between successive estimates is smaller than a target threshold.It typically converges in less than ten iterations, but can diverge if the initialization step is coarse.

ANALYSIS OF THE EFFECTS OF A FREQUENCY ERROR
In order to improve the results of the fitting, an appropriate model is proposed to analytically describe the effects produced by a frequency error.In particular, the digital signal can be represented according to: (10) in which I 0 satisfies the constraint 2 0 I 0 =  + n2 for an unknown integer n, and ∆ is much smaller than   .From equation (10) it follows that: (11) and, in the hypothesis that the frequency error  is small: According to (12), the residue (13) does not coincide with the noise but includes an additional contribution, which is described by linearly damped oscillations characterized by frequency  0 ; the parameters  and I 0 characterize the envelope of the damped oscillations.This contribution is often visible in the typical graph of the  k data resulting in real tests: an example of a possible scenario is given in Figure 1.

PROPOSED METHODS
The removal of the deterministic contribution present in (13) can be obtained by least square estimation procedures.In particular, two methods can be considered.

Method 'A'
A first method consists in finding the values of the parameters  and I 0 that produce the least square residue in (13).The values can be obtained by solving through the pseudo-inverse method the over-determined system: where . Specifically, named a the column array with components  and -I 0 : where  is the column array containing the values  k , and A is the matrix in (14).It is worth noting that ( 15) is formally the solution of a straight line fit problem.The estimate of the frequency error can be therefore straightforwardly gained by applying the formula: In order to measure the random noise n k , the deterministic contribution present in (13), identified through the estimation of  and I 0 , has to be subtracted from the residue  k .

Method 'B'
Alternatively the frequency correction term  and the variance 2 n  of the noise corrupting the signal can be estimated from the squared values 2 k  of the residue throughout a further least square procedure.In this case a simple and linear model can be utilized to fit the data.Specifically, taking the square values of equation ( 13), it follows: The squared values of the noise 2 k n include both a constant term equal to 2 n  , which is the variance of the original noise n k , and a zero mean random term m k .Grouping all the contributions that have zero mean value in (17), both if characterized by deterministic or random nature, in the term l k : it can be stated that: To gain estimates of , I 0 , and 2 n  , and get rid of l k , equation ( 19) is first rewritten in a more compact form: and the parameters a, b, and c are estimated by solving through the pseudo-inverse method the system: In order to estimate the term that compensates the frequency error, the sign of the correction has to be gained throughout further considerations.For instance, if the initial frequency estimate  0 has been obtained by applying a peak location algorithm to the results Y(m), m = 0, …, M-1, of a M points Discrete Fourier Transform (M-DFT), according to: the correction can be given without sign ambiguity as: which is valid for rectangular window at large values of j (typically more than 10 cycles in the measurement interval).The high signal to noise ratio in the neighbouring of the peak value should assure the minimum sensitivity to the effects of the noise.

PERFORMANCE EVALUATION
The performance of the proposed methods has been assessed by means of several tests.The main goal of the tests is evaluating the accuracy of the methods at estimating both the frequency of the input signal and the variance of the superimposed noise.In this case the accuracy can be decomposed into two terms, i.e. bias and repeatability, and can be quantified as square root of their quadratic summation.
Each test consists in repeating N times two fundamental steps: -running a three-parameter sine-fitting algorithm to attain initial estimates of the parameters of the sine model; -applying the proposed methods and the four-parameter algorithm to attain improved estimates of the parameters of interest.The difference between the mean value of the repeated estimates and the reference value represents the experimental bias, while the experimental standard deviation of the repeated estimates represents the repeatability.
The  are positive and equal to 10%, 20%, 30% 40% and 50% of RBW; it has been observed that positive and negative frequency errors produce the same statistics.For each method a set of 15 estimates has been considered to put in evidence bias and repeatability.The results highlight that methods A and B exhibit comparable performance, both in terms of bias and repeatability, with respect to the four-parameter sine-fitting algorithm in single run mode.Note that the results produced by the peak location applied to the M-DFT results are characterized by utmost repeatability, which is explained by the minimum sensitivity to the effects of the noise in the neighbouring of the peak value of the signal spectrum, as well as the high SNR equal to 37 dB.
Figure 4 shows the typical results related to the variance estimates obtained in the same test conditions.Specifically, the results are given by means of a scatterplot that makes visible both bias and repeatability.The peak-to-peak amplitude of the simulated signal is 1 V, the noise variance is 0.005 V 2 .In this case marker '*' is associated to the results offered by the three parameter sine fitting algorithm.Methods A and B exhibit almost comparable performance with respect to the four parameter sine fitting algorithm utilized in single run mode.
Similar tests have been repeated for signals characterized by reduced SNR = 17 dB; Figure 5 and Figure 6 show the results.Concerning the digital frequency estimations, methods A and B still exhibit comparable performance in terms of bias and repeatability with respect to the four parameter sine fitting algorithm.Further tests have been performed in order to verify any influence of the length of the processed data on the results.In particular, the estimates of the digital frequency and noise variance have been performed upon records made up of 100, 250, 500, and 1000 samples.The results show that the accuracy characterizing the digital frequency estimates improves upon the increasing of the record size M.The performance in terms of bias and repeatability related to the variance estimates produced by the proposed methods show the same sensitivity on the record size with respect to that offered by the four parameter algorithm.Figure 7 and Figure 8 show a typical case: they are related to a signal corrupted by noise characterized by SNR = 17 dB.It is worth noting that the resolution RBW needed to measure the exact digital frequency is 0.001, thus for the record characterized by size 100, 250 and 500, the M-DFT approach offers a digital frequency estimate which is shifted from the real one, because the available resolutions are respectively equal to 0.01, 0.004, 0.002.The results obtained for M = 1000 are very accurate because the digital frequency of the signal is exactly estimated by the M-DFT approach.

CONCLUSIONS
The paper has presented two methods to improve the three parameter sine fitting algorithm.The performance of the two methods, intended as an additional step of the three parameter     sine fitting algorithm, has been compared to that of a four parameter sine fitting algorithm used in single run mode.The proposed methods represent nice alternatives to the four parameter sine fitting algorithm for the identification of the parameters of the sinusoidal model and the estimation of the variance of the superimposed noise.In addition, they are suitable to function in a large variety of noise conditions.
It is worth noting that one of the simplest ways to improve the results of a three parameter sine fitting algorithm is preliminary computing an accurate initial estimate of the frequency of the input signal.To this end, high-accuracy spectrum analysis techniques, capable of providing analytical leakage compensation and improving frequency estimation, can be employed; one of these simply requires interpolating the DFT results related to the signal under test.In several practical applications the proposed methods are likely rejected in favor of interpolated DFT approaches, while in particular cases can be considered for their low computational burden.Anyhow, the proposed methods represent two different lines of attack to refine the three-parameter sine fitting results, to be valued from a conceptual and methodological point of view.

APPENDIX A
The computational burden of the different approaches that have been considered can be evaluated by counting the number of the most expensive calculus operations, i.e. the nonlinear operations and multiplications, which are required to obtain the results.
The overall computational burden of the proposed methods and the four parameter algorithm considered for estimating the digital frequency and the variance of the noise includes that of the three parameter algorithm.
The three parameter sine fitting algorithm requires the definition of matrix D 0 that involves the calculation of cos(2  k) and sin(2  k), with k ranging from 0 up to M-1, and has a cost equal to 2M.Then a least square minimization has to be performed by processing M equations to gain the values of P parameters.In detail, MP(P+1)/2 multiplications are needed to compute the symmetrical square matrix D 0 T D 0 and MP multiplications are required to condition the input data by computing D 0 T y.The other calculations have a negligible cost.In fact, the number of multiplications needed to perform the P-square matrix inversion, i.e. (D 0 T D 0 ) -1 , in the case that the inversion is attained by calculating the algebraic complements, which is the most expensive method and requires P! multiplications, is negligible since P is equal to 3; similarly, the final step that consists in calculating the inner product between the inverted matrix and the conditioned data, counts only P 2 multiplications, which is negligible.
The four parameter algorithm adds the computational burden of including in matrix D i a fourth column, i.e. 2A 0 kcos(2 0 k) -2B 0 ksin(2 0 k), which requires 4M multiplications, aside the three already present in D 0 .Moreover it demands least square minimization upon M equations to gain Q = 4 parameters.
Method A instead involves the computational burden of defining matrix A, which consists in calculating the values s k , which have a cost equal to M, and performing the M multiplications k•s k , and, furthermore, the computational burden of performing least square minimization upon M equations to gain R = 2 parameters.Nonetheless, to estimate the variance of the noise, the parameters  and I 0 have to be used to build up the fitted model of the residue  k , extract the noise n k , and calculate its variance.
Method B is the best one from a computational point of view because it does not add any computational burden for defining matrix B, the values of which are independent from unknown parameters: to define matrix B only the size M of the acquired data is needed.Since the square matrix (B T B) -1 can be calculated and stored previously, the remaining computational burden is just that of data conditioning, which consists in calculating B T y and requires 3M multiplications.Moreover, the method makes ready both an estimate of the frequency error and of the variance of the noise without requiring building up models.
The running times of the three and four parameter algorithms as well as of the methods A and B have also been measured.In particular, the algorithms have been run on a COMEX XP.520 core 2DUO T7200 machine, which is a windows based multitasking processor.Due to the multitasking functioning of the operative system, the running times have shown different values in repeated tests.To clear the effects of interruptions and holding-ups due to the operative system, the same test signals have been processed a thousand times and the average running time observed for each algorithm has been considered as a reasonable measurement.The test results highlight that the four parameter sine fitting algorithm lengthens the running time of the three parameter algorithm.In particular, Table 1 gives in the first column the results related to the average processing time of a record of 1000 points in milliseconds.It also summarizes in the second column the approximate estimations of the computational burden of the four different approaches, expressed in terms of number of most expensive calculus operations.
parameter algorithm determines x 0 by means of the pseudoinverse method, namely: 21) Specifically, named b the column array with components a, b, and c it results: b = (B T B) -1 B T    (22) where   is a column array containing the values 2 k  and B is the matrix in (21).The typical results obtained fitting the 2 k  data are given in Figure 2.

Figure 1 .
Figure 1.Residue characterized by linearly damped oscillations masked by noise.

Figure 1 .
Figure 1.Squared values of the residue and polynomial fitting results.

Finally, taking into 2 n
account that a, b, and c are related to , I 0 and  by: interest can be gained from: signals under test are made up of M samples that represent a sinusoid corrupted by noise.The digital frequency of the test signals  is obtained adding to a basic value  0 , expressed with finite resolution RBW = 1/M, an offset , i.e.  =  0 + .The basic value  0 is given in input to the initial three-parameter sine-fitting algorithm to obtain an initial estimate of the signals parameters.The offset  makes the methods under test work in the presence of a frequency error.

Figure 3
Figure 3 shows the typical results related to digital frequency estimates, obtained for a signal under test characterized by M = 100,  0 = 0.310, and signal to noise ratio (SNR) equal to 37 dB.To highlight both the bias and repeatability contributions to the overall accuracy the results are shown by means of scatterplots; different markers are associated to the different methods.In particular, the results offered by the peak location algorithm applied to the outcomes of the M-DFT are represented by marker '*', those offered by method A by marker '+', those offered by method B by marker 'o', and those offered by the four-parameter sine fitting algorithm in single run mode by marker 'x'.The considered frequency offsets 

Figure 3 .
Figure 3. Results related to the estimates of the digital frequency of sinusoidal signals corrupted by noise when the SNR is equal to 37 dB.

Figure 5 .
Figure 5. Results related to the estimates of the digital frequency of sinusoidal signals corrupted by noise when SNR is equal to 17 dB.

Figure 6 .
Figure 6.Results related to the estimates of the digital frequency of sinusoidal signals corrupted by noise when SNR is equal to 20 dB.

Figure 7 .
Figure 7. Estimates of the digital frequency of the signal corrupted by noise: the accuracy improves upon the increasing of the record size M.

Figure 8 .
Figure 8.Typical estimates of the variance of the noise: the performance of the proposed methods are comparable to those offered by the four parameter algorithm.

Figure 4 .
Figure 4. Results related to the estimates of the noise variance carried out when the digital frequency of the signal is affected by error.

Table 1 .
Average running times determined upon 1000 runs and theoretical computational burden.