improving musical timekeeping


improving musical timekeeping

HuBandiT@gmail.com
Here's a thought:

Currently, samples start playing from the very beginning of their attack
phase at the moment a NoteOn is due, right? This means instruments with
meaningfully long attack phases will drag (be late musically); and to
make things worse, they drag by varying amounts per note (pitch), as the
sample is resampled for different pitches. This unavoidably makes
musical timekeeping poor, and requires manual, sample- and note-specific
correction to arrive at good timekeeping.

But it does not have to be this way when playing from a score. Real
musicians, when playing from a score, know ahead of time what notes they
will have to play and when, so they play their notes slightly ahead of
time, just enough to give the instrument time to complete the attack
phase, arriving at the meaningful musical onset of the note exactly on
time.

Why don't samplers/synthesizers do that when playing back a score? Why
don't we designate in our samples a musical onset point (much like we do
with loop points), and then play the sample early, just enough so that
this designated musical onset arrives on the "tick"?
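
To sketch the arithmetic I have in mind (hypothetical names, not
FluidSynth code): if a sample carried a marked onset offset, the voice
would simply be started early by that offset divided by the resampling
ratio, so the marked point lands on the scheduled tick:

    # Sketch only -- hypothetical names, not FluidSynth API.
    def pitch_ratio(note, root_key):
        """Resampling ratio when a sample rooted at root_key plays MIDI note `note`."""
        return 2.0 ** ((note - root_key) / 12.0)

    def voice_start_frame(beat_frame, onset_offset_frames, note, root_key):
        """Output frame at which the sample must start so that its marked
        musical onset coincides with beat_frame."""
        ratio = pitch_ratio(note, root_key)
        # Higher pitches read the sample faster, so the onset arrives in
        # fewer output frames; lower pitches stretch it, so it drags more.
        lead_in = round(onset_offset_frames / ratio)
        return beat_frame - lead_in

    # Example: onset marked 4410 frames into a 44.1 kHz sample, played a
    # fifth above its root key -> start about 2943 output frames (~67 ms)
    # before the tick.
    start = voice_start_frame(441000, 4410, note=67, root_key=60)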

Sample makers wouldn't have to artificially shorten their attack phases
just so that their soundfonts play "on time", and/or composers would no
longer be forced to compensate for sample/soundfont-specific and
note-specific attack phase lengths by tediously moving their notes
earlier in their scores by hand. This would automatically be "on time".



Re: improving musical timekeeping

HuBandiT@gmail.com
On 2020-02-07 at 12:58, [hidden email] wrote:
> This would automatically be "on time".

_Things_ would automatically be "on time".



Re: improving musical timekeeping

Tom M.
Here are some thoughts of a software engineer. Not sure what a musician
would say.

First of all, you are right. A "meaningfully long" attack phase will
"delay" the note-on and thus shorten the note. My question: what would
be the use-case of such a "meaningfully long" attack? The only use-case
I can think of is to play a crescendo. And for this, one should use the
Expression CC instead.

Why don't samplers/synthesizers compensate for that? I think because
it doesn't work for real-time performance. One would need to delay all
notes by the length of the attack phase. But what if the musician
suddenly switches to another instrument which has an even longer
attack? Even worse: what if the musician uses a custom CC that extends
the attack phase even more? The only way to correct this would be to
figure out the longest attack phase that can ever be produced and
delay all notes and all sound accordingly. But most musicians use
monitors. They want to hear what they play, when they play it. I think
they would be very confused if what they're playing is delayed by a
few seconds.
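
To put numbers on that worst-case compensation (made-up values, just to
show the arithmetic): every event gets pushed back by the longest attack
the setup can ever produce, and each note is then advanced by its own
attack so the onsets still line up.

    # Made-up per-preset attack lengths, in milliseconds.
    attack_ms = {"piano": 8, "strings": 250, "choir": 400}
    worst_case_ms = max(attack_ms.values())        # 400 ms of added latency

    def output_time_ms(event_time_ms, preset):
        # Delay everything by the worst case, then start each note early
        # by its own attack length so the musical onsets stay aligned.
        return event_time_ms + worst_case_ms - attack_ms[preset]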

It is (a) well known (problem), that MIDI sheets only work well with
the soundbank they have been designed for. This is due to various
degrees of freedom a soundbank provides (volume, ADSR env, cut-off
filters, custom CC automation). So, if you really have instruments
with "meaningfully long" attacks, I'm afraid you're required to adjust
your MIDI sheet(s) manually.


Tom


Re: improving musical timekeeping

Marcus Weseloh
In reply to this post by HuBandiT@gmail.com
Hi,

I think the answer is hidden in your question. You talk about "the meaningful musical onset of the note exactly on time".

The thing is that what is "musically meaningful" depends very heavily on the context. In some musical contexts it might be correct to say that you want the end of the attack phase exactly on the beat. In other musical contexts you might want the beginning of the attack phase on the beat. Yet another context might want the middle of the attack phase on beat.

So the musician or composer is the only person that can decide what the correct meaning is for their musical performance. And the only way in which they can do that is if synthesizers don't add meaning to the musical input, but simply execute the commands and leave "meaning" to the musician or composer. As soon as a synth forces meaning onto the input, you take away control from the musician.

So what you describe as a flaw - that synths ignore the meaning of the note onset of samples - is actually a feature. It gives musicians a consistent and predictable system that they can use to add meaning themselves.

Cheers
Marcus




Re: improving musical timekeeping

Reinhold Hoffmann

Hi,

I totally agree with what Marcus says. It is up to the musician/composer and the style of music. Good, professionally recorded MIDI files already encode the recommended behaviour using the existing MIDI protocol elements. Those MIDI files are, of course, complex. I think that the required feature can only be created by a musician or a composer (e.g. by recording and using the necessary protocol elements) rather than by a synthesizer “afterwards”.

Reinhold

Re: improving musical timekeeping

HuBandiT@gmail.com
In reply to this post by Tom M.
Thank you for your time. I've written my comments inline.

On 2020-02-08 at 15:41, Tom M. wrote:
> Here are some thoughts of a software engineer. Not sure what a musician
> would say.
>
> First of all, you are right. A "meaningfully long" attack phase will
> "delay" note on and thus shorten it. My question: what would be
> the use-case of such a "meaningfully long" attack? The only use-case I
> can think of is to play a crescendo. And for this, one should use the
> Expression CC instead.
I am referring to attacks on the order of 10 ms to a few hundred ms
(milliseconds) - be it the short attack of a piano note or a drum hit,
or longer attacks of woodwind instruments, strings, choirs, etc.
> Why samplers/synthesizers don't compensate for that? I think because
> it doesn't work for real-time performance. One would need to delay all
> notes by the length of the attack phase. But what if the musician
> suddenly switches to another instrument which has an even longer
> attack?
My suggested use case was for scored (non-interactive, known-in-advance)
music - MIDI playback if you will. Obviously in an interactive scenario
(someone playing a keyboard real-time) this function could not work. But
in the age of DAWs and music composition software (I'm coming from
MuseScore) playback of scored music is a common use-case, which makes me
think it might be worth adding this functionality to the underlying
synthesizer.
>   Even worse: what if the musician uses a custom CC that extends
> the attack phase even more? The only way to correct this would be to
> figure out the longest attack phase that can ever be produced and
> delay all notes and all sound accordingly. But most musicians use
> monitors. They want to hear what they play, when they play it. I think
> they would be very confused if what they're playing is delayed by a
> few seconds.
Let's keep the two use cases separate. For interactive playback, this
could not work well, I agree. But for rendering scored music (which I
feel is a quite common use case), it would.
> It is (a) well known (problem), that MIDI sheets only work well with
> the soundbank they have been designed for.
I acknowledge this is the status quo (the situation currently). But are
we also saying we are not interested in pushing the boundaries of
technology to improve the situation? Are we going to be frozen into
SoundFont 2.04 forever?
>   This is due to various
> degrees of freedom a soundbank provides (volume, ADSR env, cut-off
> filters, custom CC automation). So, if you really have instruments
> with "meaningfully long" attacks, I'm afraid you're required to adjust
> your MIDI sheet(s) manually.
So if I am scoring orchestral/choir music and would like to play it
back with reasonable musical timekeeping, you are sentencing me to
eternally toiling with manual adjustment of note times, just to arrive
at a baseline that seems to me quite possible to achieve automatically,
given infrastructure developed to support it. I am not familiar with
the customization possibilities you mention - I haven't read the
SoundFont specification yet - but my use case would be
acoustic/orchestral instrument sounds, where the composer would intend
to add little if any artificial manipulation. Everything should sound
as natural as possible instead.

- HuBandiT



Re: improving musical timekeeping

Tom M.
> Let's keep the two use cases separate.

No, sry. We cannot keep them separate. Soundfont2 is a real-time synth
model based on MIDI, a real-time protocol. That's what fluidsynth has
been designed for. That's what it works for.

> Are we going to be frozen into SoundFont 2.04 forever?

"FluidSynth is a real-time software synthesizer based on the Soundfont
2 specification"

This sets our scope. If you need a more advanced synth model, have a
look at SFZ.

Thanks for your clarification. But given your very unique and personal
use-case, I do not see how it can be implemented in a synthesizer like
fluidsynth.

(Pls. note that you had two other replies, which you didn't quote, so
you might have missed them.)

Tom


Re: improving musical timekeeping

HuBandiT@gmail.com
On 2020-02-09 at 9:26, Tom M. wrote:
> "FluidSynth is a real-time software synthesizer based on the Soundfont
> 2 specification"
>
> This sets our scope. If you need a more advanced synth model, have a
> look at SFZ.

Thank you. I do not see the conceptual reasoning behind limiting FluidSynth to SF2 (I do understand there could be a resource/effort issue), but if that is cast in stone, then it looks like that's what I'll have to do.

Which then makes everything I write below academic.

> Thanks for your clarification. But given your very unique and personal
> use-case, I do not see how it can be implemented in a synthesizer like
> fluidsynth.

I am not sure my use case would be very unique and personal - perhaps I'm just not describing it properly.

I would like you to consider NotePerformer, which was created to achieve this result (and a lot more). Listen to its demos, all achieved automatically. It is even noted that while most sample libraries require laborious manual adjustment of timing and sample switching, this software does most of it automatically, getting very close in most cases.

Or consider the humanize feature of many scoring applications, intended to automatically get closer to how human musicians would render a piece from the mathematically perfect grid of a score.

Those are commercial products, so I think it is reasonable to assume that they fulfill an existing need.

SoundFont - as it stands today - is quite tone-deaf to these issues, it seems to me from all of your responses. But SoundFont is not an actively maintained standard (or shall I say, it is an actively unmaintained one), so its future is open and it could be enhanced. Twenty or so years have passed; with the improvements in technology, much more could be afforded in computer music today. I campaign for moving forward.

There is an insane amount of time and effort going into fighting the limitations of the SoundFont2 standard, trying to shoehorn a reasonable "Klang" (sound) into it (there are no musically meaningful note phases, no natural decay of overtones unless full-length samples are used, stochastic elements are not modelled properly, timbre changes due to velocity are not modelled, note releases are not modelled, there is no proper legato, etc.), and the results are often still unsatisfactory. Yet the suffering is perpetuated in a vicious circle by SoundFont2 being the most widespread and supported standard, and hence the lowest common denominator people aim for. Commercial alternatives try to overcome these limits by sheer force (huge sample libraries) and/or by advancing the synthesis models. But they are commercial, so they will never become universally adopted.

In the meantime, popular/commercial music continues to dive into a pit of mud, because the masses trying their hands at making music end up having to fight the tools ("It is (a) well known (problem), that MIDI sheets only work well with the soundbank they have been designed for." - quote from Tom), instead of being able to play with and write music to educate themselves. So unless they intend to base a career on it, they either give up, or go to EDM (electronic dance music), a genre invented to market computer music despite its musical flaws. And since EDM got popular and established an industry, now everyone is stuck in that tarpit, and non-EDM music made on a computer usually sounds bad.

One could argue "oh, but you could always go and study classical, no one forces you to do computer music and suffer its consequences" - yes, but that is a lot of time and effort (and money for instruments, tuition, etc.), so it is not for the masses, and hence it will not raise the bar of the masses' musical taste.

Hence we are back to no good, self-made, self-taught acoustic music for the masses. Because we are still stuck with (inadequate) 90s sampling technology, and it is still everywhere, keeping people from enjoying studying and writing (non-EDM) music.

This is why I think there is a need to raise the bar.

> (Pls. note that you had two other replies, which you didn't quote, so
> you might have missed them.)

Thank you, indeed I did miss them, as they don't seem to have arrived in my inbox.

Marcus Weseloh writes:

> In some musical contexts it might be correct to say that you want the end of the attack phase exactly on the beat. In other musical contexts you might want the beginning of the attack phase on the beat. Yet another context might want the middle of the attack phase on beat.

I don't consider this a conceptual obstacle. Indeed, playing pizzicato, for example, should be timed differently (hit on the beat) than playing long notes (more like start on the beat). So let's create separate samples for pizzicato and for long notes (this would need to be done for a realistic sound anyway), mark their musical onset positions accordingly (or invent and add multiple such markers if needed), and then let some controller (program change or other) switch or interpolate between these timing points. In my mind these points are as tightly coupled to the sample as loop points are; hence they, too, belong in the realm of the soundfont creator artist to set, and of the sampler/synthesizer to reproduce. But if you don't have an attack phase marked at all (because the technology is limited and does not allow it to be marked and used), then you as a composer are prevented from expressing such meaningful musical intents (which are already expressible in scoring/notation systems, so the computer would just have to act on them) - instead, you are relegated to manually fiddling with "milliseconds" or "ticks".
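
To illustrate how tightly I imagine these markers being coupled to the sample, here is a rough sketch (invented field names, not part of the SoundFont spec) of sample metadata carrying onset markers next to the loop points, with the active one chosen by articulation:

    # Invented layout -- not SF2 -- onset markers stored like loop points.
    from dataclasses import dataclass, field

    @dataclass
    class SampleTiming:
        loop_start: int                    # existing loop points, in sample frames
        loop_end: int
        onset_markers: dict = field(default_factory=dict)   # articulation -> frame

    violin_c4 = SampleTiming(
        loop_start=52000,
        loop_end=96000,
        onset_markers={"pizzicato": 120,     # the hit is the onset, almost immediate
                       "sustained": 6600},   # slow bow attack, ~150 ms at 44.1 kHz
    )

    def lead_in_frames(timing, articulation, pitch_ratio):
        """Output frames before the tick at which the voice must start."""
        return round(timing.onset_markers[articulation] / pitch_ratio)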

> So what you describe as a flaw - that synths ignore the meaning of the note onset of samples - is actually a feature. It gives musicians a consistent and predictable system that they can use to add meaning themselves.

Considering how these timings could change from soundfont to soundfont, I do not consider them consistent, nor predictable. To exaggerate the point: why don't we apply the same logic to loop points, and consider those to be the responsibility of the composer too? After all, the repeating loop length gives a rhythm/beat to the sound of the sample, which ideally should be coordinated with the tempo and events in the musical piece in order not to interfere/clash with them (let alone the fact that when a sample is used for multiple pitches and gets transposed during playback, it necessarily ends up changing the beat of the loop, further muddying the sound).

Reinhold Hoffmann writes:

> I totally agree with what Marcus says. It is up to the musician/composer and the style of music.

So sure, make it controllable by the composer. But just as you do not expect the composer to specify sample rates in Hz for resampling samples to the desired musical pitch - you abstract that away into the SoundFont and let the composer specify the desired musical note directly, while the synthesizer calculates the necessary sample rate conversions automatically - why do you expect them to manually worry about such a low-level detail coupled so tightly to individual samples?

> I think that the required feature can only be created by a musician or a composer (e.g. by recording and using the necessary protocol elements) rather than by a synthesizer “afterwards”.

Just like notes (pitches) and rhythm - but you don't force them to specify those in hertz or milliseconds or samples. Why do you want to force them to here? And then why don't you force them to specify loop points as well? Why do you consider one to be inherently coupled to the sample, but not the other? I fail to see the distinction. Yes, loop points are something the synthesizer is conceptually forced to act on automatically (lest a note end prematurely) and I guess are not alterable by the composer, while the timing points I suggest could be switched or interpolated between under the composer's control - but where those timing points lie along the sample is as inherently coupled to that sample as the position of the loop points.

- HuBandiT




Re: improving musical timekeeping

Marcus Weseloh
Hi,

first of all: this is a very interesting topic, thank you for bringing it up!

I do sympathise with your idea and I don't think it is a very unique use-case at all. I also use MuseScore, but mostly for rehearsing. So I let MuseScore play some voices of a piece while I play another voice on top. And I usually choose sounds with a very short attack phase in MuseScore, so that the attack phases don't mess with the timing. I have used the OnTime offset shifting of MuseScore 2 from time to time, but as MuseScore 3 has crippled the user-interface for that feature (you can only affect a single note at a time), it is now way too much hassle for me.

So I agree: it would be great to have a good solution to this problem.

Let's assume that SoundFonts would be extended so that they contain information about the length of the attack phase and the position of the first "meaningful" sample point, i.e. the sample point that should be "on beat". Let's ignore the fact that there are MIDI messages that affect the sample offset. And let's also ignore that the choice of which sample is heard can also depend on oscillators. And that for a single note-on, many different samples (with different attack phases) could be started simultaneously.

Then the synth would have to start playing the sample *before* the beat, in order to play it on beat. In other words, the sampler would have to act on a note-on event before this note-on event is actually due. And in order to decide which sample to play and how much attack phase time needs to be compensated, it would have to examine all CC and other messages that could influence sample choice and sample offsets leading up to that note-on event. And assuming we don't want to put an artificial limit on how long an attack phase of a sample is allowed to be, that effectively means that the synth would need to know and analyse (or even simulate) the complete MIDI stream right until the last event before starting the playback.
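
As a thought experiment (simplified, invented names; a real implementation would have to model SF2 generators and modulators properly), such a pre-pass over a *complete* event list might look roughly like this: walk the events once, track the controller state that influences the attack, and compute the required lead-in for every note-on before rendering any audio:

    # Simplified thought experiment, not a design proposal.
    # effective_attack_frames() stands in for whatever the synth model
    # would compute from the preset and the current CC state.
    def plan_note_starts(events, effective_attack_frames):
        cc_state = {}
        planned = []
        for time, kind, data in events:            # events sorted by time
            if kind == "cc":
                cc_state[data["number"]] = data["value"]
            elif kind == "note_on":
                lead_in = effective_attack_frames(data["preset"], data["key"],
                                                  data["velocity"], dict(cc_state))
                planned.append((max(0, time - lead_in), time, data))
        return planned    # (actual start frame, nominal beat frame, note data)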

That is obviously impossible for live playback, you say that yourself. But it also doesn't fit how synthesizers like FluidSynth are used by MuseScore, Ardour and similar programs. Because as far as I know, those programs are MIDI sequencers themselves. In other words: *they* control which MIDI messages to send and - most importantly - when to send them. They don't pass a complete MIDI file to Fluidsynth to be played, but rather send MIDI messages in bursts or as a continuous stream. And they have very good reasons to do it that way.

So in my opinion, if we wanted to implement a system like you propose, it would have to be implemented in the MIDI sequencer. In other words: in MuseScore, Ardour and all the other programs that use MIDI events to control synthesizers (which also includes FluidSynth's internal sequencer used to play MIDI files).

So maybe all that is needed is better sequencer support for shifting OnTime offsets for notes, tracks and scores. MuseScore is definitely lacking in that regard: it needs better user interfaces for selecting multiple notes and adjusting their OnTime offset. Maybe even support for some database of popular soundfonts that lists the OnTime offset for each note of each sample. MuseScore could then read that database and adjust all notes in a track automatically, if the user decides that it would make musical sense.
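
A rough sketch of what such a database could look like (invented layout and invented offset values, purely illustrative):

    # Invented sidecar layout: per-soundfont onset offsets, in ms, that a
    # sequencer could subtract from a note's OnTime value.
    onset_db = {
        "FluidR3_GM.sf2": {
            ("Strings", 60): 180,     # (preset name, MIDI key) -> offset in ms
            ("Strings", 72): 140,
            ("Choir Aahs", 60): 220,
        }
    }

    def ontime_shift_ms(soundfont, preset, key):
        """Offset to subtract from a note's OnTime; 0 if the note isn't listed."""
        return onset_db.get(soundfont, {}).get((preset, key), 0)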

Cheers
Marcus


Re: improving musical timekeeping

HuBandiT@gmail.com
On 2020-02-09 at 20:35, Marcus Weseloh wrote:
> Hi,
>
> first of all: this is a very interesting topic, thank you for bringing
> it up!
My pleasure; and I thank you for the moral support.

>
> I do sympathise with your idea and I don't think it is a very unique
> use-case at all. I also use MuseScore, but mostly for rehearsing. So I
> let MuseScore play some voices of a piece while I play another voice
> on top. And I usually choose sounds with a very short attack phase in
> MuseScore, so that the attack phases don't mess with the timing. I
> have used the OnTime offset shifting of MuseScore 2 from time to time,
> but as MuseScore 3 has crippled the user-interface for that feature
> (you can only affect a single note at a time), it is now way too much
> hassle for me.
Digression: In this scenario (interactive playing on top of scored
music), my idea is that the playback can be performed "on time" via the
features under discussion, while with the interactive play - this is an
area we did not touch on before - a compromise could be achieved (again
using the musical onset markers) by delaying each interactively played
note/sample just enough that any and all notes played are delayed
by the same consistent amount. I have a strong hunch that with a short
but consistent delay between keypress and note onset, the interactively
playing musician's brain will be able to compensate for that delay by
pressing keys earlier, so that the interactive notes sound in time with
the notes the computer performs from the score. The system can be made
aware of the musical onset moments of the interactively played notes,
and notate those moments instead of the moments of the MIDI keypresses.
With this arrangement even interactively recording a new track would be
synchronized to the score without kludges - provided the delay can be
kept short enough to fall within the brain's ability to comfortably
compensate for it.

One option for this is letting the user choose the maximum delay, with
the system truncating the attack phases of interactively played notes
down to that limit. It would be a tradeoff between completeness of the
interactively played notes on the one end (e.g. for a performance - in
effect reverting to the current behaviour of no attack phase
compensation), and completely omitted attack phases but best
interactive playability on the other (which would be a "best effort"
full attack phase compensation for the case where we only learn about
the note at the moment it should be on).
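
A sketch of that compromise (hypothetical names, numbers only for
illustration): the user picks a maximum acceptable delay, every live
note is padded so the total keypress-to-onset latency is constant, and
attacks longer than the budget are truncated:

    # Hypothetical sketch of the interactive compromise described above.
    def live_note_plan(attack_ms, max_delay_ms):
        """Return (extra delay before starting the voice, ms of attack to skip).

        The total latency from keypress to musical onset is always
        max_delay_ms, so the player only has to adapt to one constant value."""
        if attack_ms <= max_delay_ms:
            return max_delay_ms - attack_ms, 0.0    # full attack, padded delay
        return 0.0, attack_ms - max_delay_ms        # budget exceeded: truncate attack

    # max_delay_ms = 0   -> no added latency; attacks are skipped entirely
    # max_delay_ms large -> attacks play in full, at the cost of latency
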
>
> So I agree: it would be great to have a good solution to this problem.
>
Let me add in advance - instead of pointing this out at each point -
that in my opinion this compensation logic is essentially already in
place today, with the compensated attack offset hardcoded at zero: the
synthesizer already computes the destination moment - the output audio
stream frame (sample) number - at which it has to start playing the
sample from the sample's origin (the first sample = sample offset
zero). What would change is merely aligning a new origin point of the
sample (with a potentially non-zero offset) with the destination
moment. (Granted, my knowledge of the standard is currently limited.)
> Let's assume that SoundFonts would be extended so that they contain
> information about the length of the attack phase and the position of
> the first "meaningful" sample point, i.e. the sample point that should
> be "on beat". Lets ignore the fact that there are MIDI messages that
> affect the sample offset. And lets also ignore that the choice of
> which sample is heard can also depend on oscillators.
This, however, can be - and indeed already is being - precalculated, so
it is known (assuming that by "sample" in the above sentence you mean a
time series of measured audio values, and not just a single value of
that series; afaik the English word "sample" can mean both in this
context).
> And that for a single note-on, many different samples (with different
> attack phases) could be started simultaneously.

It would be a meaningful musical discussion to hash out why this needs
to be done at all, and whether in those cases the sound resulting from
the ensemble of those samples played together ends up having a musical
note onset moment instead of the individual samples: if one tries to
reproduce the sound of a preexisting musical instrument, the sound of
that natural instrument when played has a musically meaningful onset -
so by extension, reproducing the sound of that instrument should have
conceptually the same onset, even if for whatever reason the
reproduction of the sound of the instrument ends up getting actually
constructed from multiple "samples" played in parallel. The sound still
represents the same instrument musically, and therefore ought to be
considered the same musically.

(Whether all instrument types have such musically meaningful moments in
their sounds or not - for the above, assume an instrument that does -
and which ones there are altogether, is a meaningful musical
discussion; maybe some instruments will not have any such phases, or
some instruments will have certain ones while other instruments have
others. As a side note, this is kind of where I think ADSR originally
came from: trying to capture some musically meaningful and perceptually
pleasant/necessary phases of an instrument's sound.)

>
> Then the synth would have to start playing the sample *before* the
> beat, in order to play it on beat. In other words, the sampler would
> have to act on a note-on event before this note-on event is actually
> due. And in order to decide which sample to play and how much attack
> phase time needs to be compensated, it would have to examine all CC
> and other messages that could influence sample choice and sample
> offsets leading up to that note-on event. And assuming we don't want
> to put an artificial limit on how long an attack phase of a sample is
> allowed to be, that effectively means that the synth would need to
> know and analyse (or even simulate) the complete MIDI stream right
> until the last event before starting the playback.
Correct. For rendering
preexisting/non-interactive/non-realtime/score-based music, the notes
would be known ahead of time, so everything could be
precalculated/simulated. Although an artificial limit on how long an
attack phase is allowed to be would not be good, it could be reasonable
to lessen the burden by allowing the software driving the synthesizer
to specify an upper bound, beyond which attack phases would get
truncated. The synthesizer could maybe provide a suggestion for this to
its client by precalculating some reasonable value for the maximum
attack length, say over the entire set of samples the music will be
played with. The ideal solution would be to hide it all, ask for the
entire score in advance and calculate everything perfectly internally.
But for cases where this is inconvenient, a lookahead window could be
used, where the synthesizer is continuously fed events far ahead of the
current audio output time, so that there is a reasonable assumption
that almost all of those events will still be known to the synthesizer
early enough to synthesize them on time. If not, then it's best effort,
and some attacks will end up getting truncated somewhat: if you whip up
an instrument with a two-minute attack phase, you had better not send a
note-on event before two minutes into the piece; think of polyphony and
note stealing - another area where users can expect degradation in
output if they ask for unreasonable things.
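
A rough sketch of that best-effort behaviour (invented names): each note
gets as much attack compensation as its actual headroom allows, and
anything beyond that is lost:

    # Invented sketch of best-effort compensation under a lookahead window.
    def compensated_start_ms(nominal_ms, attack_ms, audio_clock_ms):
        headroom_ms = nominal_ms - audio_clock_ms   # how far ahead we learned of the note
        lead_in_ms = min(attack_ms, max(0.0, headroom_ms))
        # If lead_in_ms < attack_ms, the attack is effectively truncated:
        # either skip into the sample or let the onset land a little late.
        return nominal_ms - lead_in_ms
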
> That is obviously impossible for live playback, you say that yourself.
Well, on second thought, yes and no - see digression above.
> But it also doesn't fit how synthesizers like FluidSynth are used by
> MuseScore, Ardour and similar programs. Because as far as I know,
> those programs are MIDI sequencers themselves. In other words: *they*
> control which MIDI messages to send and - most importantly - when to
> send them. They don't pass a complete MIDI file to Fluidsynth to be
> played, but rather send MIDI messages in bursts or as a continuous
> stream.
I agree, but I don't consider this a conceptual issue, merely an
API/implementation issue. I don't see any conceptual reason why
MuseScore couldn't send the entire score to the synthesizer in advance,
or at least ahead of time with a reasonable look-ahead window. (And we
can open up a discussion about making this more efficient by
introducing a caching mechanism, so that when an edit is made, not all
instruments/tracks/MIDI channels have to be transferred to - and parsed
and precalculated by - the synthesizer anew.) Maybe it is possible to
edit the score in MuseScore during playback (but does anyone really do
that)? If it is, that could still be supported by a sequencer API that
allows adding events with timestamps and returns an identifier for each
event, so that the client can later change or remove events.
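
A sketch of the kind of client-facing API I have in mind (invented; not
FluidSynth's actual sequencer interface):

    # Invented API sketch: events are scheduled ahead of time and identified
    # by handles, so an editor can change or remove them during playback.
    import itertools

    class LookaheadSequencer:
        def __init__(self):
            self._events = {}              # handle -> (timestamp_ms, event)
            self._ids = itertools.count()

        def add(self, timestamp_ms, event):
            handle = next(self._ids)
            self._events[handle] = (timestamp_ms, event)
            return handle                  # client keeps this to edit the event later

        def reschedule(self, handle, new_timestamp_ms):
            _, event = self._events[handle]
            self._events[handle] = (new_timestamp_ms, event)

        def remove(self, handle):
            del self._events[handle]

        def due(self, up_to_ms):
            """Events whose (attack-compensated) start time has been reached."""
            ready = [(t, e) for t, e in self._events.values() if t <= up_to_ms]
            return sorted(ready, key=lambda te: te[0])
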
> And they have very good reasons to do it that way.
So you do consider this a conceptual issue? What would be those good
reasons? I can't think of any for MuseScore - but I am not an expert
user of MuseScore.
>
> So in my opinion, if we wanted to implement a system like you propose,
> it would have to be implemented in the MIDI sequencer. In other words:
> in MuseScore, Ardour and all the other programs that use MIDI events
> to control synthesizers (which also includes FluidSynths internal
> sequencer used to play MIDI files).
In light of the above, do you still think this is the case? And if so,
which parts would end up in the client (MuseScore, Ardour, etc.), and
which parts (if any) would end up in the synthesizer? Would the client
query the synthesizer about the attack lengths of each and every note,
just to then turn around and advance (bring earlier) the corresponding
note-on event before sending it to the synthesizer? Wouldn't that logic
be better done by the synthesizer itself then?
>
> So maybe all that is needed is better sequencer support for shifting
> OnTime offsets for notes, tracks and scores.
I don't know what OnTime offsets are, but if you don't have information
on the per-sample (or per-note/per-pitch) attack lengths of each note,
how much do you shift by? Furthermore, if different pitches end up
having different attack lengths (think resampled samples), there is no
single correct value to shift by... so I'm probably missing some part of
your logic?
> MuseScore is definitely lacking in that regard, it needs better
> user-interfaces for selecting multiple notes and affecting their
> OnTime offset. Maybe even support for some database of popular
> soundfonts that lists the OnTime offset for each note of each sample.
> MuseScore could then read that database and adjust all notes in a
> track automatically, if the user decides that it would make musical sense.
This sidecar file to soundfonts feels to me like quite a workaround. It
could achieve the timing in the short term, yes (but maybe still not
perfectly? - see above). But I could not imagine it as a long-term
solution: it would be specific to MuseScore, and the sidecar files
could get separated from the SoundFonts (misplaced, renamed, mixed up,
not found - "oh, I have downloaded this soundfont, but cannot find the
sidecar file and now my music is all badly timed, does anyone know
where to download it from?"). And how do you know what value to write
into the sidecar file unless you have a soundfont editor to open up the
font (or you are the soundfont author)? At which point it becomes much
easier to mark the moment in the soundfont editor and save it with the
soundfont than to manually write it into an external file, taking care
not to make a mistake (different note, different velocity layer, etc.),
and so on.
>
> Cheers
> Marcus
- HuBandiT


Re: improving musical timekeeping

Marcus Weseloh
Hi,

On Mon, 10 Feb 2020 at 01:50, [hidden email] <[hidden email]> wrote:
> [...] with the interactive play - this is an
> area we did not touch on before - a compromise could be achieved (again
> using the musical onset markers) by delaying each interactively played
> note/sample just enough that any and all notes played are delayed
> by the same consistent amount [...]

I doubt that this idea would be very popular for real-time control... :-) People have put enormous effort into bringing sound output latency down to low and acceptable levels.
 
>> And that for a single note-on, many different samples (with different
>> attack phases) could be started simultaneously.
>
> It would be a meaningful musical discussion to hash out why this needs
> to be done at all, and whether in those cases the sound resulting from
> the ensemble of those samples played together ends up having a musical
> note onset moment instead of the individual samples:

In SoundFonts (and many other sample bank formats), the sound you hear through the speakers after a note-on event can be (and often is) the combination of many different samples. Those samples are mixed together depending on values passed through MIDI messages. How those samples are mixed together can depend on the note value, the note-on velocity, the current time (oscillators) and many other MIDI messages. All of these parameters can affect many different aspects of this mixing process: static volume, pitch, note-on delay, length of the attack phase, sample offset, speed of oscillators, to name just a few. And before and/or after they are mixed together, the resulting sound can then pass through different filters that shape the sound even more. And the reason for this complicated and expressive system is exactly the point you raise below:
 
> if one tries to
> reproduce the sound of a preexisting musical instrument, the sound of
> that natural instrument when played has a musically meaningful onset -
> so by extension, reproducing the sound of that instrument should have
> conceptually the same onset, even if for whatever reason the
> reproduction of the sound of the instrument ends up getting actually
> constructed from multiple "samples" played in parallel [...]

Yes, this is exactly it. But one important point you don't mention here is: most musical instruments do not have fixed sound characteristics. Their sound and - most importantly for this discussion - their attack phase and the shape and length of their onset transients depend on how you play the instrument. And your playing style and many other aspects also affect whether and how well defined the border between transient phase and "musically meaningful sound" is. I'm sure I could have long debates with fellow musicians about when the "musically meaningful sound" of a particular instrument actually starts.

So in my opinion, your initial premise for this discussion - that there is one or a limited number of musically meaningful note-onset durations measured in sample offsets that could easily be compensated - is flawed. For very simple cases and a very narrow musical style it might be ok. But I can't imagine a general system we could implement that achieves this "just-do-the-musically-meaningful-correct-thing" effect you are after when it is applied to the wide range of sounds and music styles that SoundFonts, MuseScore and similar software is used to create today.

Yes, there are tools like NotePerformer that attempt to solve this problem, and they seem to do quite a good job at it. But NotePerformer is not a general sample-bank format like SF2. It's not even a synthesizer like FluidSynth. It is a synthesizer fused with a sequencer fused to a specific(!) set of samples and a predetermined set of performance rules. It has its own opinion on how the MIDI notes that you write in your notation editor should be articulated and performed. It takes your input, adds its own musical playing style and performs your music in a certain way, similar to a human musician. They talk about having analysed lots of classical pieces to train their model. I like the idea and I imagine it being a really useful tool if you compose classical music, orchestral film scores or similar music.

But how does NotePerformer sound if you use it to play rock music, reggae, an Irish jig, a Scottish hornpipe, a French two-time bourree or a Swedish polska? Or something not rooted in the western musical world? I haven't tried it, but my guess is that the output would not be "musically meaningful" at all in those contexts. I imagine it being like listening to a classically trained violin player performing an Irish jig - it just doesn't sound right. The phrasing, the rhythm, the groove (i.e. tiny shifts in note onset timings) you need are totally different from classical music.

Don't get me wrong: I would love to have an open-source NotePerformer. Ideally one you could train yourself so that it can learn many different performance styles. I would even be interested in participating in creating one, it sounds like a really interesting project! But you are barking up the wrong tree here. FluidSynth is not the right software to implement your idea, and I don't think you will get close to your goal by creating an extension to the SoundFont format. If I have understood your goal correctly, then you need to start from a blank slate.

Cheers
Marcus

