Hmm. That is extremely unfortunate, and quite confusing for a user (she wouldn’t know that a field shown in info or a display format is not in the DB and thus not queryable).
So if I understand correctly, there is an abstraction layer for tag data but not for database content? The docs show a lot of nice example queries for ‘added’ and ‘mtime’, I think these should absolutely also work for release and original dates.
You say you store a ‘date’ as separate y/m/d columns in the DB. I wonder if that originally stems from the idea to be able to represent incomplete dates (which SQLite apparently can’t handle in date columns). The real tags in the files are always one tag only, of the form YYYY[-MM[-DD]]
, so MediaFile must actually deconstruct them into separate fields for the DB.
For one of the files in question, I can see beet info
shows only original_year
and no original_month
or original_day
, resulting in original_date: 2019-01-01
.
It’s nice that one can perform date queries against ‘added’ and ‘mtime’, but from a user’s perspective, it makes much more sense to be able to search for release dates and/or original dates.
I wonder how complicated it would be to add that, without too much performance penalty.
Or—possibly—let MediaFile construct one ‘date’ field that allowed zeroes for non-existant date parts and store it in the DB as a 10-char text field. The only problem then would be date calculations, because these couldn’t be done without either the DB handling it (like MySQL) or a Python class wrapper.
Maybe something like Django’s PartialDate
might help. It is of course also a kludge, insofar as it stores into a DateTime field and uses the seconds as a precision indicator. Oh well, but it works and doesn’t break the DB or datetime calculations.
ID3v2 frames like TDRC, TDRL and TDOR allow exactly the same content (YYYY, YYYY-MM, YYYY-MM-DD, and even some more), so this might be a viable solution to read these (possibly incomplete) tags and even write them back correctly. Same applies of course for the date fields in other file formats.
If stored in the DB as either DateTime or even VARCHAR(10), it would also be easy doing date queries and comparisons. The gut feeling says VARCHAR(10) (sortable, human readable) but from a performance standpoint I’d probably go for DateTime fields (less conversions to/from Python, and DateTime as a text field in SQLite could still be used).
As opposed to storing 3 different columns per date in the DB, I think this would make easy queries possible and performant, and still allow easy resolving to year/month/day variables, if needed.
Just thinking out loud …