Abstract: Musician and instrument make up an indispensable duo in the musical experience. Inseparable, they are the key actors of the musical performance, transforming a composition into an emotional auditory experience. To this end, the instrument is a sound device that leverages a physical vibratory phenomenon, which the musician controls to transcribe and share their understanding of a musical work. Access to the sound of such instruments, often the result of advanced craftsmanship, and to the mastery of playing them, can require extensive resources that limit the creative exploration of composers. This thesis explores the use of deep neural networks to reproduce the subtleties introduced by the musician's playing and the instrument's sound, making the music sound realistic and alive. Focusing on piano music, the work conducted has led to a sound synthesis model for the piano, as well as an expressive performance rendering model. DDSP-Piano, the piano model, is built upon the hybrid approach of Differentiable Digital Signal Processing (DDSP), which enables the inclusion of traditional signal processing tools within a deep learning framework. The model takes symbolic performances as input and explicitly includes instrument-specific knowledge, such as inharmonicity, tuning, and polyphony. This modular, lightweight, and interpretable approach synthesizes sounds of realistic quality, while separating the various components that make up the instrument's sound. As for the performance rendering model, the developed approach transforms MIDI compositions into symbolic expressive interpretations. In particular, thanks to unsupervised adversarial training, it stands out from previous works by not relying on aligned score-performance training pairs to reproduce expressive qualities. Combining the sound synthesis and performance rendering models would enable the synthesis of expressive audio interpretations of scores, while allowing the generated interpretations to be modified in the symbolic domain.