Like it or not, over the years we end up copying and moving things around and making backups of backups, and before you know it you have two or more copies of the same album lying around in your library (by "same" I mean the audio streams are identical, regardless of file size and metadata).
For my own use I wrote some SQLite queries that identify folders in my library whose FLAC files, taken together, have exactly the same audio streams as the FLAC files in another folder. The analysis generates three SQLite tables:
- a list of distinct folders
- a list of folders with the same FLAC content
- a list of folders which can be deleted (leaving behind only one copy of the FLAC files)
The code does not modify or delete your music; it only analyses table contents. A few friends and I have used and tested it extensively, and it has no known issues.
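To give a rough idea of the approach, here is a minimal sketch in Python using the standard `sqlite3` module. It is not my actual code, and the names are made up for illustration: it assumes a hypothetical `tracks(folder, md5)` table holding one row per FLAC file, and it builds three tables called `folders`, `duplicate_folders` and `deletable_folders`. Each folder gets a signature made of its sorted audio md5sums, and folders sharing a signature are treated as copies of the same album.

```python
import sqlite3
from collections import defaultdict


def find_duplicate_folders(conn):
    """Build the three analysis tables and return the deletable folders.

    Assumes a hypothetical table tracks(folder TEXT, md5 TEXT) with one
    row per FLAC file. Nothing on disk is touched; only tables change.
    """
    # Collect the audio md5sums per folder.
    by_folder = defaultdict(list)
    for folder, md5 in conn.execute("SELECT folder, md5 FROM tracks"):
        by_folder[folder].append(md5)

    conn.executescript("""
        DROP TABLE IF EXISTS deletable_folders;
        DROP TABLE IF EXISTS duplicate_folders;
        DROP TABLE IF EXISTS folders;
        CREATE TABLE folders (folder TEXT PRIMARY KEY, signature TEXT NOT NULL);
    """)

    # Table 1: distinct folders, each with an order-independent signature
    # (the sorted md5sums joined together).
    conn.executemany(
        "INSERT INTO folders (folder, signature) VALUES (?, ?)",
        [(f, "|".join(sorted(m))) for f, m in by_folder.items()],
    )

    conn.executescript("""
        -- Table 2: folders whose signature occurs more than once.
        CREATE TABLE duplicate_folders AS
        SELECT folder, signature FROM folders
        WHERE signature IN (
            SELECT signature FROM folders
            GROUP BY signature HAVING COUNT(*) > 1
        );

        -- Table 3: every duplicate except the alphabetically first one,
        -- so exactly one copy of each album is left behind.
        CREATE TABLE deletable_folders AS
        SELECT d.folder AS folder FROM duplicate_folders AS d
        WHERE d.folder > (
            SELECT MIN(d2.folder) FROM duplicate_folders AS d2
            WHERE d2.signature = d.signature
        );
    """)
    conn.commit()
    return [row[0] for row in conn.execute(
        "SELECT folder FROM deletable_folders ORDER BY folder")]
```

Keeping the alphabetically first folder is an arbitrary choice for the sketch; a real implementation would probably let the user decide which copy to keep.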
I think it’d be a neat addition to beets. It would require only a few additional tables, created whenever a user wants to check their library for duplicate albums, and perhaps some code to act on the results.
Beets’ import code would need to read and store the audio md5sum of each underlying FLAC file (the one describing the audio stream, not a checksum of the whole file).
I’d be happy to share the code if there’s interest in incorporating it into beets.
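For anyone wondering how the md5sum could be obtained at import time, here is a small sketch in plain Python. It is an illustration, not existing beets code, and `read_flac_md5` is a name I made up. It relies on the FLAC format storing an MD5 of the unencoded audio in the last 16 bytes of the STREAMINFO block, which is always the first metadata block after the `fLaC` marker. It assumes the file starts directly with that marker (no ID3 tag prepended).

```python
def read_flac_md5(path):
    """Return the audio MD5 stored in a FLAC file's STREAMINFO block as hex.

    Hypothetical helper: assumes the file begins with the 'fLaC' marker
    and that STREAMINFO is the first metadata block, as the format requires.
    """
    with open(path, "rb") as f:
        # 4-byte marker + 4-byte block header + 34-byte STREAMINFO body.
        header = f.read(42)

    if len(header) < 42 or header[:4] != b"fLaC":
        raise ValueError(f"{path!r} does not look like a FLAC file")

    block_type = header[4] & 0x7F            # low 7 bits: block type
    block_len = int.from_bytes(header[5:8], "big")
    if block_type != 0 or block_len != 34:   # type 0 = STREAMINFO
        raise ValueError(f"{path!r} has no leading STREAMINFO block")

    # The MD5 signature is the final 16 bytes of the 34-byte block.
    return header[26:42].hex()
```

One thing to watch for: an MD5 of all zeros means the encoder did not set it, so such files should be excluded from any duplicate comparison rather than treated as matching each other.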