RFC: Proposed new penalty/distance: tracktotal

This penalty shows up when the tracktotal doesn’t match up. Either
missing_tracks or unmatched_tracks is likely to also trigger in
such a case, but it’s useful to distinguish the case where we’re
matching an album where I don’t have the whole thing (tracktotal
matches) from ones where it’s matching what’s probably the wrong album
release (tracktotal doesn’t match).

This would only take effect when the items have a tracktotal and
the candidate’s track info has a medium_total, and would check against both
the current disc’s medium_total and the total across all discs, since we
don’t know whether the imported tracks use per-disc numbering.
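To make the proposed check concrete, here is a rough sketch of the comparison. The function name, its arguments, and the 0.0/1.0 return values are made up for illustration and are not existing beets code; it only assumes we have the item's tracktotal and a list of per-disc totals for the candidate.

```python
from typing import Optional, Sequence


def tracktotal_penalty(
    item_tracktotal: Optional[int],
    medium_totals: Sequence[int],
    medium_index: int,
) -> Optional[float]:
    """Hypothetical helper: 0.0 on match, 1.0 on mismatch, None if not applicable.

    medium_totals holds the candidate's track count for each disc;
    medium_index is the 1-based disc number of the track being compared.
    """
    # No effect unless both sides actually carry a total.
    if not item_tracktotal or not medium_totals:
        return None

    # Total across all discs (album-wide numbering).
    acceptable = {sum(medium_totals)}

    # Total for the current disc (per-disc numbering), if the index is valid.
    if 1 <= medium_index <= len(medium_totals):
        acceptable.add(medium_totals[medium_index - 1])

    # We don't know which numbering the files use, so either total counts as a match.
    return 0.0 if item_tracktotal in acceptable else 1.0
```

For a two-disc candidate with 12 and 10 tracks, an item on disc 1 tagged with a tracktotal of 12 or 22 would get no penalty, while one tagged 13 would.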

This would let me bump the distance weight to move albums with
mismatched track totals down the candidates list for cases where it’s likely
not the same album release, while letting incomplete albums with the
same total stay farther up the list.

Thoughts? Too much complexity? Could it be optional, or allow plugins to
extend distance calculations? I’m playing around with a prototype now.

It’s a good idea! However, the bad news is that I think it is surprisingly hard to implement. The track matching mechanism uses the “Hungarian algorithm” for the weighted bipartite matching problem (implemented in the Munkres Python module, if you’re looking).

This setup sort of bakes in the assumption that the match will use all of the tracks, if possible. To set up the ability to leave tracks unmatched even when matches exist, we would need to modify the problem formulation, possibly with additional “dummy” graph vertices that signify the lack of a match.
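As a toy illustration of the dummy-vertex idea (not how the matching code works today), the cost matrix can be padded with extra columns that stand for "leave this item unmatched" at a fixed cost. The sketch below uses a brute-force search over assignments as a stand-in for the Hungarian algorithm so that it runs without extra dependencies; the function name and the unmatched_cost parameter are hypothetical.

```python
from itertools import permutations
from math import inf


def match_with_dummies(costs, unmatched_cost):
    """Assign each item (row) to a track (column) or to None.

    Brute-force stand-in for the Hungarian algorithm; only suitable
    for tiny matrices. Each item gets its own dummy column, so any
    item may stay unmatched at a cost of unmatched_cost.
    """
    n_items = len(costs)
    n_real = len(costs[0])

    # Pad every row with one dummy column per item.
    padded = [list(row) + [unmatched_cost] * n_items for row in costs]
    n_cols = n_real + n_items

    best_cols, best_total = None, inf
    # Try every way of giving each item a distinct column.
    for cols in permutations(range(n_cols), n_items):
        total = sum(padded[i][c] for i, c in enumerate(cols))
        if total < best_total:
            best_cols, best_total = cols, total

    # Columns beyond the real ones are dummies, i.e. "no match".
    return [c if c < n_real else None for c in best_cols]


costs = [[0.1, 0.9],
         [0.8, 0.7]]
print(match_with_dummies(costs, unmatched_cost=0.5))
```

With an unmatched cost of 0.5, the second item is cheaper to leave unmatched than to pair with either track; raising the unmatched cost to 1.0 makes the full matching win again.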

This is all to say: it’s definitely possible, but I don’t know how to do it easily. If you’re motivated, I recommend you jump into the matching code to see if you can sort out how it all works currently.

Hmm, I’m okay with it using all the available tracks. I just don’t want to apply any candidate that’ll change my tracktotal fields, since it’s probably not the right release (e.g. bonus-track editions from iTunes or Amazon are often missing from MusicBrainz). I think what you’re discussing is worthwhile, but it’s independent of what I’m proposing and could be a future enhancement.

Oh wow, I’m sorry about that! I read your original post too quickly and thought this was about something else entirely.

Yeah, this makes sense to me. To clarify, the tracktotal for the penalty would come from the metadata, right? Not from the actual number of files we have?

This sounds perfectly reasonable and sounds like a pretty straightforward thing to do. Sorry again for not getting it the first time around!

Exactly, yes. tracktotal in the item metadata compared against medium_total from the track info, with a bit of finagling to deal with per_disc_numbering, basically.
