New architecture of DDSP-Piano.
The main differences from the first version are:
Audio sampling rate increased from 16kHz to 24kHz.
Inharmonicity and detuning parameters are estimated from isolated single piano notes, using a signal-based extraction method.
A new differentiable reverberation module based on Feedback-Delay Networks.
Slightly deeper context and monophonic networks.
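To make two of the items above more concrete, here is a minimal NumPy sketch. It is not the code of DDSP-Piano v2: the function names, the detuning expressed in cents, the delay lengths, the feedback gain and the Householder feedback matrix are all illustrative assumptions, and the loop below is a plain time-domain FDN rather than the differentiable module used in the model. The first function applies the standard stiff-string formula f_k = k * f0 * sqrt(1 + B * k^2), where B is the inharmonicity coefficient. The second runs a small feedback delay network on an input signal and returns the wet output only.

```python
import numpy as np


def inharmonic_partials(f0, n_partials, B, detune_cents=0.0):
    """Partial frequencies of a stiff string: f_k = k * f0 * sqrt(1 + B * k^2).

    `detune_cents` is a hypothetical parameterization of the detuning,
    chosen here for illustration only.
    """
    k = np.arange(1, n_partials + 1)
    f0_detuned = f0 * 2.0 ** (detune_cents / 1200.0)
    return k * f0_detuned * np.sqrt(1.0 + B * k ** 2)


def fdn_reverb(x, delays=(149, 211, 263, 293), gain=0.9):
    """Minimal feedback delay network (wet signal only).

    Delay lengths (in samples) and the feedback gain are assumed values.
    """
    n = len(delays)
    # Householder matrix: orthogonal, so the feedback loop is lossless
    # before the gain (< 1) is applied, which keeps the network stable.
    A = np.eye(n) - (2.0 / n) * np.ones((n, n))
    buffers = [np.zeros(d) for d in delays]  # one circular buffer per delay line
    idx = [0] * n
    y = np.zeros(len(x))
    for t in range(len(x)):
        # Read the sample written `delays[i]` steps ago on each line.
        outs = np.array([buffers[i][idx[i]] for i in range(n)])
        y[t] = outs.sum() / n
        # Mix the delay-line outputs and feed them back with the input.
        fb = gain * (A @ outs)
        for i in range(n):
            buffers[i][idx[i]] = x[t] + fb[i]
            idx[i] = (idx[i] + 1) % delays[i]
    return y


# Usage: partials of A4 with a small inharmonicity, and an impulse response.
partials = inharmonic_partials(440.0, 8, B=4e-4)
impulse = np.zeros(24000)  # one second at 24 kHz
impulse[0] = 1.0
ir = fdn_reverb(impulse)
```

With B = 0 the partials reduce to exact harmonics of f0, and a positive B stretches the upper partials progressively sharper. In the FDN, the first non-zero output sample appears after the shortest delay line, and the tail decays because every pass through the feedback matrix is scaled by the gain.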
Synthesis examples
Besides the presented DDSP-Piano v2, the compared systems are:
Original recording: the real audio recording of the performance.
Piano-TTS v2: the improved neural-based synthesis model inspired by text-to-speech techniques.
DDSP-Piano v1: the first version of DDSP-Piano, retrained for synthesis with a 24kHz sampling rate.
DDSP-Piano v2 Regularized: an ablated variant of DDSP-Piano v2, with a regularization on the early reflections of the reverberation module.
A. Scriabin - Etude, Op.42 No.4
Piano year: 2009
Model
Audio sample
Original recording
Piano-TTS v2
DDSP-Piano v1
DDSP-Piano v2 Regularized
DDSP-Piano v2
C. Debussy - Etude, No.7 “Study in Chromatic Steps”
Piano year: 2004
Model
Audio sample
Original recording
Piano-TTS v2
DDSP-Piano v1
DDSP-Piano v2 Regularized
DDSP-Piano v2
D. Scarlatti - Sonata in D Major, K.118
Piano year: 2014
Model
Audio sample
Original recording
Piano-TTS v2
DDSP-Piano v1
DDSP-Piano v2 Regularized
DDSP-Piano v2
F. Mendelssohn - Fantasy in F-sharp minor, Op.28
Piano year: 2017
Model
Audio sample
Original recording
Piano-TTS v2
DDSP-Piano v1
DDSP-Piano v2 Regularized
DDSP-Piano v2
F. Liszt - Hungarian Rhapsody No.9 in E-Flat Major, S.244
Piano year: 2015
Model
Audio sample
Original recording
Piano-TTS v2
DDSP-Piano v1
DDSP-Piano v2 Regularized
DDSP-Piano v2
F. Schubert - Impromptu Op.142 No.4, in F minor, D935
Piano year: 2011
Model
Audio sample
Original recording
Piano-TTS v2
DDSP-Piano v1
DDSP-Piano v2 Regularized
DDSP-Piano v2
F. Chopin - Nocturne in B Major, Op.9 No.3
Piano year: 2009
Model
Audio sample
Original recording
Piano-TTS v2
DDSP-Piano v1
DDSP-Piano v2 Regularized
DDSP-Piano v2
J.S. Bach - Prelude & Fugue in G-Sharp Minor, WTC I BWV.863
Piano year: 2013
Model
Audio sample
Original recording
Piano-TTS v2
DDSP-Piano v1
DDSP-Piano v2 Regularized
DDSP-Piano v2
L. van Beethoven - Rondo a Capriccioso “Rage over a Lost Penny”, Op.129
Piano year: 2018
Model
Audio sample
Original recording
Piano-TTS v2
DDSP-Piano v1
DDSP-Piano v2 Regularized
DDSP-Piano v2
S. Rachmaninoff - Etudes-Tableaux, Op.39 No.9
Piano year: 2006
Model
Audio sample
Original recording
Piano-TTS v2
DDSP-Piano v1
DDSP-Piano v2 Regularized
DDSP-Piano v2
W.A. Mozart - Sonata in B-Flat Major (1st movement), K333
Piano year: 2008
Model
Audio sample
Original recording
Piano-TTS v2
DDSP-Piano v1
DDSP-Piano v2 Regularized
DDSP-Piano v2
Disentanglement of sound components
Here, we can compare the outputs of the individual DDSP synthesizers of the presented models.
Note that the audio samples are not set to a common loudness, mainly to highlight the differences when the early reflections of the FDN reverb are regularized.
Component
DDSP-Piano v1
DDSP-Piano v2 Regularized
DDSP-Piano v2
Additive
Residual Noise
Dry Output
Reverb
Bonus: Training on the MAPS dataset
MAPS is a piano dataset, older than MAESTRO, that also provides aligned pairs of MIDI and audio performances.
The ENSTDkCl (ENST Disklavier Close) subset contains 2 hours of MIDI/audio recordings of an upright Disklavier piano.
We trained a DDSP-Piano v2 model on this dataset.
Below are some examples synthesized from MAESTRO performances, compared with the best-sounding model trained on MAESTRO (year 2015):