Modify not working for paths that contain accents?

I made a new custom field to help me keep track of the source of my files, for example whether they came from a CD rip or whether I purchased them at a particular online store. I’m having trouble setting this field on files whose path contains an accent (at least I think I’ve narrowed the problem down to that).

This works fine:
$ beet modify mysource=mycd path:/Volumes/Music/Music/Gustav\ Holst\;\ Boston\ Symphony\ Orchestra,\ New\ England\ Conservatory\ Chorus,\ 小澤征爾/

This doesn’t work:
$ beet modify mysource=mycd path:/Volumes/Music/Music/Ludwig\ van\ Beethoven\;\ Jenő\ Jandó/
error: No matching items found.

I’m running beets from git (b63b66a3912c000ee961b3da7efa4bb1ddfaecb1) on macOS 10.15.7, and my library is on an NFS mount. I’ve set ‘nfs.client.mount.options = nfc’.

Encoding problems like this are extraordinarily hard to diagnose. The command-line argument is passed to beets in one encoding, and beets will use exactly those bytes to look for files in its database, but the encoding stored there may differ subtly, for example in its Unicode normalization. It may be possible to diagnose by looking directly at the database, but in the meantime, maybe try a non-path query? Would something like artist:beethoven album:jand match what you want, for example?
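
To make the normalization point concrete, here is a minimal Python sketch. It is not beets code, and it only assumes that the mismatch is between the composed (NFC) and decomposed (NFD) forms, which this thread has not confirmed. It shows that the same visible name can be two different byte strings, and that the characters 小澤征爾 have no decomposed form, which might be why the first path matches.

```python
import unicodedata

# "Jenő Jandó", written with escapes so the starting form is unambiguous (NFC).
name = "Jen\u0151 Jand\u00f3"

nfc = unicodedata.normalize("NFC", name)  # accents stored as single code points
nfd = unicodedata.normalize("NFD", name)  # base letter + combining accent

nfc_bytes = nfc.encode("utf-8")
nfd_bytes = nfd.encode("utf-8")

# Both forms render identically, but a byte-for-byte comparison fails.
print(len(nfc), len(nfd))      # 10 12
print(nfc_bytes == nfd_bytes)  # False

# Normalizing both sides to the same form makes them comparable again.
same_after_normalizing = unicodedata.normalize("NFC", nfd) == nfc

# The CJK name from the working example is identical in NFC and NFD.
cjk = "\u5c0f\u6fa4\u5f81\u723e"  # 小澤征爾
cjk_unchanged = unicodedata.normalize("NFD", cjk) == cjk
```

If the path typed in the shell and the path stored in the database are in different forms, an exact path: match would fail even though both look the same on screen.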

This works some of the time, but not all of the time, because sometimes I’m not able to figure out the right tags to use. For example, above it seems like Jenő Jandó should be in the artist field, but this returns nothing:

$ beet ls artist:jand
$

Is there a way to print the values of all the fields? I could use that to see why the above doesn’t work and that would help me write better searches perhaps.

Sure! The info plugin (in library mode) can list out all the fields on a given item/album.

One workaround is to use the regex matching support, replacing the problematic characters with period (.) characters:

beet modify mysource=mycd 'path::/Volumes/Music/Music/Ludwig van Beethoven; Jen. Jand./'

Note the two-colon prefix: path::<pattern>.
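
If you have many such paths, the substitution can be scripted. The following is only a sketch: regex_path_query is a hypothetical helper, not part of beets. It composes the path to NFC first so that each accented letter is a single character, then swaps every non-ASCII character for a period. It does not escape regex metacharacters that may already be in the path.

```python
import unicodedata


def regex_path_query(path: str) -> str:
    """Build a beets-style regex path query (hypothetical helper).

    Each non-ASCII character is replaced by '.', which matches any
    single character. Regex metacharacters in the path are NOT escaped.
    """
    # Compose first, so an accented letter is one character, not two.
    composed = unicodedata.normalize("NFC", path)
    pattern = "".join(ch if ch.isascii() else "." for ch in composed)
    return "path::" + pattern


query = regex_path_query(
    "/Volumes/Music/Music/Ludwig van Beethoven; Jen\u0151 Jand\u00f3/"
)
print(query)
```

The printed string can be passed to beet as a single quoted argument, as in the command above. Note that a single period would not match a decomposed accent (letter plus combining mark) if that is how the path is stored.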

You can also use the asciify_paths setting to avoid filenames with encoding problems like this.

Thanks! This is a good tip!

Thanks, this is a very reasonable workaround!
