Hey @Kingpin,
@Celona, thanks for the ping. Happy to weigh in on this one because there is a real engineering question buried in it that is worth separating from the marketing wrapper. And, being a nerd obsessed with physics, maths and whatnot, I will do it in my usual OCD manner.
A bit on terminology first
“Power of 2” is not quite the property you are after. 705.6 kHz is 16 x 44.1, and 16 happens to also be 2^4, so it is both. But what actually matters for the math is “integer multiple of the source rate”, not “power of two”. You could upsample 44.1 by 3 to 132.3, or by 5 to 220.5, and those would be just as clean mathematically as 705.6. The reason the “power of 2” framing comes up so often is historical: old converter chips implemented oversampling as cascaded 2x stages because half-band FIR filters are cheap to build that way, and DSP textbooks lean on power-of-two FFT block sizes. It is an implementation convenience, not a quality property.
So the request is really “always upsample to the nearest integer multiple of the source rate” - 44.1 source goes to 705.6, 48 source goes to 768, and so on. That is a legitimate and implementable feature, and worth discussing on its own merits.
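To make the selection rule concrete, here is a minimal Python sketch of how such a preference could pick its target. It is not Volumio's actual code: the function name and the `SUPPORTED_RATES` list are hypothetical, and the list simply assumes a DAC that accepts the usual 44.1 and 48 kHz families up to 768 kHz. The sketch picks from a list rather than taking the largest multiple under a cap, because a plain cap of 768 kHz would give 17 x 44.1 = 749.7 kHz, which is not a rate any DAC advertises.

```python
# Hypothetical list of rates the DAC accepts (an assumption, not a real device query).
SUPPORTED_RATES = [
    44_100, 48_000, 88_200, 96_000, 176_400,
    192_000, 352_800, 384_000, 705_600, 768_000,
]


def pick_integer_target(source_hz: int, supported=SUPPORTED_RATES) -> int:
    """Return the highest supported rate that is an integer multiple of source_hz.

    Falls back to the source rate itself (no resampling) when no supported
    rate is an integer multiple of it.
    """
    if source_hz <= 0:
        raise ValueError("source rate must be positive")
    candidates = [r for r in supported if r >= source_hz and r % source_hz == 0]
    return max(candidates) if candidates else source_hz


if __name__ == "__main__":
    for src in (44_100, 48_000, 88_200, 96_000):
        target = pick_integer_target(src)
        print(f"{src} Hz -> {target} Hz (x{target // src})")
```

With the assumed list, a 44.1 kHz source lands on 705.6 kHz and a 48 kHz source on 768 kHz, matching the examples above.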
The “introduces errors in the digital domain” claim
This is the bit that needs evidence rather than assertion, and the SRC4392 datasheet that was linked earlier in the thread is itself a useful piece of evidence. It lets us compare integer and non-integer conversion in measurable numbers from one device:
| Conversion (fSIN:fSOUT, kHz) | Measured performance | Ratio type |
| --- | --- | --- |
| 44.1 → 44.1 | THD+N -140 dB, DR 141 dB | (1:1) |
| 44.1 → 48 | THD+N -140 dB, DR 141 dB | (non-integer, 147:160) |
| 44.1 → 96 | THD+N -140 dB, DR 141 dB | (non-integer) |
| 44.1 → 192 | THD+N -137 dB, DR 138 dB | (non-integer) |
| 96 → 192 | THD+N -137 dB, DR 138 dB | (integer 1:2) |
[TI SBFS029D, pages 4-5]
The non-integer 44.1 → 48 conversion measures identically to the synchronous 44.1 → 44.1 pass-through at -140 dB THD+N and 141 dB dynamic range. The integer 96 → 192 conversion measures slightly worse (-137 dB) than the non-integer 44.1 → 96. So on this hardware ASRC chip the integer vs non-integer distinction does not produce a measurable advantage in either THD+N or dynamic range.
These artefacts sit roughly 140 dB below full scale. For context, that is about 40 to 50 dB below the noise floor of even very good DAC analog stages, and 60+ dB below the threshold of human hearing in a quiet room. Modern software resamplers (SoX VHQ, r8brain, the better Linux ones) measure even cleaner than this 2007 chip. The “rounding errors and phase distortion associated with fractional conversions” exist mathematically, but they are not audible artefacts in any modern chain.
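For anyone who wants to check which of the conversions in the table are integer ratios, here is a small Python sketch using only the standard library. The helper name `describe_ratio` is made up for illustration. It reduces output rate over input rate to lowest terms, which is the interpolate-by-L, decimate-by-M pair a rational resampler would work with.

```python
from fractions import Fraction


def describe_ratio(f_in_hz: int, f_out_hz: int) -> str:
    """Reduce f_out/f_in to lowest terms and say whether it is an integer ratio."""
    ratio = Fraction(f_out_hz, f_in_hz)  # L/M: upsample by L, downsample by M
    kind = "integer" if ratio.denominator == 1 else "non-integer"
    return (f"{f_in_hz} -> {f_out_hz}: in:out = "
            f"{ratio.denominator}:{ratio.numerator} ({kind})")


if __name__ == "__main__":
    pairs = [
        (44_100, 44_100),
        (44_100, 48_000),
        (44_100, 96_000),
        (44_100, 192_000),
        (96_000, 192_000),
    ]
    for f_in, f_out in pairs:
        print(describe_ratio(f_in, f_out))
```

The 44.1 → 48 case reduces to 147:160, as noted in the table, and 96 → 192 is the only true upsampling row that reduces to an integer ratio.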
The Musical Fidelity DAC linked as evidence
Worth flagging this because it cuts against the original argument rather than supporting it. The M3x DAC manual states explicitly on page 7:
“Upsampling is always on for PCM data up to 192kHz on any input.
Incoming sample rates up to 192kHz are resampled to 192kHz.”
And page 10 confirms the sample rate converter is the SRC4392.
So whatever you send into this DAC at 44.1, it is non-integer resampled to 192 kHz internally using the exact chip whose measurements appear above. The DAC sounding different at 96 vs 192 input is real, but it is the result of the DAC selecting a different input filter profile (see page 8, where the front-panel filter button is documented), not the math of the conversion ratio. If your chain ends in this DAC or anything architecturally similar, the integer-ratio Volumio output is being undone milliseconds later by the DAC itself.
What you might actually be hearing
To be clear, I am not dismissing the impression. Reiss 2016 (JAES 64:6, open access) is a meta-analysis of 18 studies covering 400+ participants and 12,500+ trials, and it found a small but statistically significant ability of trained listeners to discriminate hi-res from CD-quality audio. So “audible difference exists at all” is on the table, in trained ears, on the right material. Meyer & Moran 2007 (JAES 55:9) did 554 ABX trials on high-end systems and found chance-level performance, which is also data.
If you are reliably hearing a difference between 705.6 and 768 output from a 44.1 source, the candidate explanations in order of likelihood are:
- The DAC selects a different reconstruction filter at different rates. Many DACs do this without exposing it to the user. The SRC4392 itself has selectable steep vs slow roll-off filters with different ringing behaviour.
- Level mismatch. Even 0.2 dB is reliably perceived as “better”.
- Expectation bias. Sighted listening is famously susceptible.
- A genuine resampler implementation difference in the Rivo+ pipeline (worth investigating if measurable, but distinct from the ratio question).
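On the level-mismatch point, here is a minimal Python sketch of how the offset between two captures could be measured. It assumes both captures are already loaded as lists of floating-point samples normalised to ±1.0 and cover the same stretch of programme. The function names are illustrative and the sketch does no file handling. Because RMS is an average, the two captures can be at different sample rates and lengths.

```python
import math


def rms_dbfs(samples) -> float:
    """RMS level of normalised samples in dB relative to full scale (1.0)."""
    if not samples:
        raise ValueError("no samples")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0.0:
        raise ValueError("capture is digital silence")
    return 10.0 * math.log10(mean_square)


def level_offset_db(capture_a, capture_b) -> float:
    """How many dB louder capture_a is than capture_b (negative if quieter)."""
    return rms_dbfs(capture_a) - rms_dbfs(capture_b)


def matching_gain(capture_a, capture_b) -> float:
    """Linear gain to apply to capture_b so its RMS level equals capture_a's."""
    return 10.0 ** (level_offset_db(capture_a, capture_b) / 20.0)


if __name__ == "__main__":
    # Synthetic check: a 1 kHz tone and a copy attenuated by 0.2 dB.
    tone = [math.sin(2 * math.pi * 1000 * n / 48_000) for n in range(48_000)]
    quieter = [s * 10 ** (-0.2 / 20) for s in tone]
    print(f"offset: {level_offset_db(tone, quieter):.3f} dB")
```

Any offset larger than the matching tolerance you have chosen should be corrected with `matching_gain` before the listening test starts.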
If you want to know which of these is doing the work, the protocol is well established: capture both outputs, level-match to within 0.05 dB, ABX with foobar2000 or Lacinato, minimum 16 trials, and look for 13/16 or better (one-sided binomial p ≈ 0.011; 12/16 already clears p < 0.05 at about 0.038). It is the same kind of protocol the published studies used. Below that bar, the difference is plausibly noise.
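The significance threshold is easy to check yourself. This short Python sketch computes the one-sided binomial p-value for an ABX run, that is, the chance of getting at least that many correct by pure guessing at 50% per trial. The function name is my own and this is not part of any ABX tool.

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """Probability of scoring `correct` or more out of `trials` by guessing."""
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


if __name__ == "__main__":
    for score in (11, 12, 13, 14):
        print(f"{score}/16: p = {abx_p_value(score, 16):.4f}")
```

More trials tighten the test. With 16 trials, 11 correct is still consistent with guessing, while 12 or more clears the conventional 0.05 threshold.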
On the feature itself
I am open to “upsample to nearest integer multiple of source rate” as a user-selectable preference. It is a sensible option to offer alongside the existing modes, and the implementation is not unreasonable. What I would push back on is framing the current behaviour as a fault or as introducing audible errors, because the measurement evidence (including the very datasheet linked above) does not support that framing. Let us discuss it as a preference rather than a defect, and we can look at where it fits the pipeline.
References for anyone who wants to dig in
- TI SRC4392 datasheet, SBFS029D, pages 4-5 (THD+N and dynamic range across all ratios), page 8 (filter characteristics), page 35 (group delay options).
- Musical Fidelity M3x DAC manual, pages 7, 8, 10 (always-on upsampling to 192, filter button, SRC4392 specified).
- Meyer, E. B. and Moran, D. R. (2007). “Audibility of a CD-Standard A/DA/A Loop Inserted into High-Resolution Audio Playback”. JAES 55:9, 775-779. https://www.aes.org/e-lib/browse.cfm?elib=14195
- Reiss, J. D. (2016). “A Meta-Analysis of High Resolution Audio Perceptual Evaluation”. JAES 64:6, 364-379. Open access: https://aes2.org/publications/elibrary-page/?id=18296
- Infinite Wave resampler comparisons (sweep, impulse, stopband measurements for SoX, SRC, r8brain and others): https://src.infinitewave.ca/
- Smith, J. O., “Digital Audio Resampling Home Page”, CCRMA, Stanford University.
@Darmur, Any further thoughts?
Kind Regards,