
This is the accompanying page for the article “Regularized autoregressive modeling and its application to audio signal declipping” by Ondřej Mokrý and Pavel Rajmic, submitted to IEEE/ACM Transactions on Audio, Speech, and Language Processing.

Autoregressive (AR) modeling is invaluable in signal processing, in particular in the speech and audio fields. Attempts can be found in the literature that regularize or constrain either the time-domain signal values or the AR coefficients, for various reasons including the incorporation of prior information or numerical stabilization. Although these attempts are appealing, an encompassing, generic modeling framework has been missing. We propose such a framework, together with the related optimization problem and algorithm. We discuss the computational demands of the algorithm and explore the effects of various improvements on its convergence speed. In the experimental part, we demonstrate the usefulness of our approach on the audio declipping problem. We compare its performance against state-of-the-art methods and demonstrate the competitiveness of the proposed method, especially for mildly clipped signals. The evaluation is extended by considering a heuristic algorithm of generalized linear prediction (GLP), a strong competitor which has so far been presented only as a patent and is thus new to the scientific community.
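For context, the core of plain (unregularized) AR modeling can be sketched in a few lines: the coefficients are estimated by least squares from delayed copies of the signal, and each sample is predicted as a linear combination of the previous ones. This is only an illustrative sketch of the classical model, not the regularized framework proposed in the article; NumPy and the function names are assumed here.

```python
import numpy as np

def fit_ar(x, p):
    """Least-squares estimate of AR(p) coefficients a such that
    x[n] ~ a[0]*x[n-1] + ... + a[p-1]*x[n-p] (covariance method)."""
    # Regression matrix: column k holds the signal delayed by k+1 samples.
    X = np.column_stack([x[p - k - 1 : len(x) - k - 1] for k in range(p)])
    y = x[p:]
    a, *_ = np.linalg.lstsq(X, y, rcond=None)
    return a

def predict_ar(x, a):
    """One-step-ahead prediction of x[p:] from the p preceding samples."""
    p = len(a)
    return np.array([a @ x[n - p : n][::-1] for n in range(p, len(x))])
```

Regularized variants, as discussed in the article, would add penalty terms on the signal and/or on the coefficients to this least-squares objective.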

The preprint is available at arXiv.

Comparison of the methods

Below, we present the comparison of the regularized AR model with the methods from the survey [1]. This is a reproduction of Fig. 4 from the article.

Results of the methods based on the AR model, treated in the article, are indicated with a background color. For PEMO-Q and PEAQ, results using the “replace reliable” strategy from [2] are shown as stacked bars. For each algorithm that may produce signals inconsistent with the reliable samples, the cross-faded strategy is applied and the updated result is shown in a lighter shade.

Legend for the AR-based methods (shared for all the plots):

Legend for the methods taken from the declipping survey [1]:

Comparison in terms of ∆SDR

Note that ∆SDR denotes the improvement in SDR over the clipped signal, i.e., over the input SDR.
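As a sketch, ∆SDR can be computed as follows, assuming the common definition SDR = 10 log₁₀(‖x‖² / ‖x − x̂‖²) with the clean signal x as reference (the survey may restrict the evaluation to the clipped samples; the function names are illustrative):

```python
import numpy as np

def sdr(reference, estimate):
    """Signal-to-distortion ratio in dB of `estimate` w.r.t. the clean `reference`."""
    return 10 * np.log10(np.sum(reference**2) / np.sum((reference - estimate)**2))

def delta_sdr(clean, clipped, restored):
    """Improvement of the restored signal's SDR over the input (clipped) SDR."""
    return sdr(clean, restored) - sdr(clean, clipped)
```

A positive ∆SDR thus means the reconstruction is closer to the clean signal than the clipped input was.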

Comparison in terms of PEMO-Q ODG

Comparison in terms of PEAQ ODG

Audio examples

This section includes only the signals reconstructed using the AR-based methods discussed in the article (and without the “replace reliable” step).

To listen to the reference reconstructions, please see the accompanying webpages of [1] and [2]. Since the experimental protocol was identical, the examples are directly comparable.

Please note that the differences between the individual reconstructions may be very subtle. There might be no audible clipping present in the reconstructed audio, only a subtle change in timbre. If you struggle to hear any differences, you can download the two WAV files you want to compare and merge them into a single stereo file as separate (left/right) channels. This way, the differences become easier to recognize.
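The merging step can be done, for example, with the Python standard library alone; the following sketch interleaves two mono WAV files into one stereo file (the function name is illustrative, and the inputs are assumed to be mono files with matching sample rate and sample width, as is the case for the examples here):

```python
import wave

def merge_to_stereo(left_path, right_path, out_path):
    """Interleave two mono WAV files into one stereo file (left/right channels)."""
    with wave.open(left_path, "rb") as wl, wave.open(right_path, "rb") as wr:
        assert wl.getnchannels() == wr.getnchannels() == 1, "inputs must be mono"
        assert wl.getframerate() == wr.getframerate(), "sample rates must match"
        assert wl.getsampwidth() == wr.getsampwidth(), "sample widths must match"
        n = min(wl.getnframes(), wr.getnframes())
        width = wl.getsampwidth()
        left, right = wl.readframes(n), wr.readframes(n)
        # Interleave sample by sample: L0 R0 L1 R1 ...
        frames = b"".join(left[i:i + width] + right[i:i + width]
                          for i in range(0, n * width, width))
        with wave.open(out_path, "wb") as out:
            out.setnchannels(2)
            out.setsampwidth(width)
            out.setframerate(wl.getframerate())
            out.writeframes(frames)
```

Listening to such a stereo file over headphones makes per-channel differences between two reconstructions easy to localize.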

Chosen audio file: a08_violin

Chosen input SDR: 5 dB

Audio players are provided for the clean and clipped signals and for the reconstructions by inpainting, GLP, and declipping with \(\lambda_S = 10\) or \(\lambda_S = \infty\), each for \(\lambda_C \in \{0, 10^{-5}, 10^{-3}\}\).

Remarks

References

  1. P. Záviška, P. Rajmic, A. Ozerov, and L. Rencker, “A survey and an extensive evaluation of popular audio declipping methods,” IEEE Journal of Selected Topics in Signal Processing, vol. 15, no. 1, pp. 5–24, 2021. DOI: 10.1109/JSTSP.2020.3042071
  2. P. Záviška, P. Rajmic, and O. Mokrý, “Audio declipping performance enhancement via crossfading,” Signal Processing, vol. 192, 2022. DOI: 10.1016/j.sigpro.2021.108365
  3. L. Atlas and P. Clark, “Clipped-waveform repair in acoustic signals using generalized linear prediction,” U.S. Patent US 8 126 578 B2, 2012. [Online]. Available: https://patents.google.com/patent/US8126578
  4. A. J. E. M. Janssen, R. N. J. Veldhuis, and L. B. Vries, “Adaptive interpolation of discrete-time signals that can be modeled as autoregressive processes,” IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 34, no. 2, pp. 317–330, 1986. DOI: 10.1109/TASSP.1986.1164824