Why do I keep getting apostrophes converted to ? when I import


#1

(new to beets)

I’m on Windows running the latest version that pip can install. When I try to import albums with tracks with a ' in the title, beets wants to turn them into ?. I’ve been all over the config options and FAQ and can’t figure this out. I only see documentation about filenames; this seems to be converting the metadata itself. Please help me stop this madness.

e.g. (UNC)

\\foo\music\Album Rips\A Tribe Called Quest\[1999] The Anthology (19 items)
Tagging:
A Tribe Called Quest - The Anthology
URL:
https://musicbrainz.org/release/73ebe2f8-40dd-45fb-8f28-0f1c0885aae4
(Similarity: 99.1%) (tracks) (CD, 1999, US, Jive)

  • Check The Rhime -> Check the Rhime
  • Buggin' Out -> Buggin? Out
  • If The Papes Come -> If the Papes Come
  • Jazz (We've Got) -> Jazz (We?ve Got)
  • I Left My Wallet In El Segundo -> I Left My Wallet in El Segundo
  • Luck Of Lucien -> Luck of Lucien
  • Description Of A Fool -> Description of a Fool
  • Find A Way -> Find a Way
  • Vivrant Thing - Violator -> Vivrant Thing (title)
    [A]pply, More candidates, Skip, Use as-is, as Tracks, Group albums,
    Enter search, enter Id, aBort, eDit, edit Candidates? M

#2

Hello—what you’re seeing here actually isn’t literal ? characters appearing; instead, your terminal doesn’t support Unicode (or at least beets thinks it doesn’t). It’s trying to print a ’ but needs to substitute the character to accommodate your console.

If you look around on the issue tracker and such, you can see that other Windows users sometimes have this trouble. Sometimes the right solution is switching to PowerShell; other times, it seems to be forcing beets to use UTF-8. (There is a config option for forcing the terminal encoding when automatic detection doesn’t work.)
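
As a rough illustration of where the ? comes from (a minimal sketch, not beets's actual printing code): when Python encodes text for a console whose code page has no curly apostrophe and is told to replace unencodable characters, each one comes out as ?. The cp437 code page here is an assumption standing in for a typical legacy Windows console, and `as_console_would_show` is a hypothetical helper name.

```python
def as_console_would_show(text: str, encoding: str) -> str:
    """Round-trip text through a console encoding, substituting '?' for
    any character that the encoding cannot represent."""
    return text.encode(encoding, errors="replace").decode(encoding)


title = "Buggin\u2019 Out"  # U+2019 is the curly apostrophe

# cp437 (assumed legacy console code page) has no U+2019, so it is replaced.
print(as_console_would_show(title, "cp437"))

# UTF-8 can represent every character, so nothing is substituted.
print(as_console_would_show(title, "utf-8") == title)
```

Only the displayed text changes in this sketch; the original string still contains the curly apostrophe.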


#3

Wait, so we have a ' and a Unicode ’? Why is beets asking me to approve the change? Either do this silently or don’t prompt me at all. I’m fine with the Windows weirdness, but I want to import a large library; asking me about trivial conversions is a complete waste of my time and likely to send me to another tool. (Because unfortunately I have a long Windows background and am not really willing to dive into what I see as yet another stupid Python problem with Unicode.)


#4

The autotagger wants to change a straight ASCII “prime” symbol to a curly right single quote/apostrophe. The similarity is actually quite high—greater than 99%. In its standard configuration, beets will indeed accept such a high-similarity change automatically. Is there any chance you’re running in “timid” mode, for example?

As an aside, can I please ask you to reconsider your tone in the previous message in future posts to this board? No one has to use beets if they don’t like it, of course, but the language you used would be more appropriate if this were a commercial enterprise where you’re a customer. In reality, of course, this is an open-source project, so it’s a community—not a business. If there’s something that you don’t like about beets, please try to make your first reaction “here’s how I might be able to help beets improve!” instead of “if it doesn’t do what I want, I’m taking my business elsewhere.” I know it sounds like a small thing, but because we’re all volunteers just doing this as a passion project, it makes a huge difference to hear positive, solution-seeking comments when there’s work to be done.


#5

Hehe, you snatched these words from my mouth, Adrian. The guy’s sense of entitlement struck me as inappropriate, more so toward such a benevolently managed project. I was preparing to type a rebuke when I saw your post. You’ve said it all.


#6

Sorry guys. You should see me when I get really mad and frustrated. It just hit me on the very first album i tried to import and so was blocked trying to figure out if/how beets works for me until I realized what was happening. And as a tool advertised at the very top as being for obsessive-compulsive music geeks I thought what was happening would be more clear.

I’ll think through it a little more and maybe dig into the auto-tagger code. My python is weak, but if i can find a way to add a config option to either opt-in to smart quotes, or opt-out of smart quotes I’ll send it along.


#7

So I’m only casually skimming the code, but in autotag/hooks.py there is a mechanism for replacing & with and while computing the string match, right?

# Replacements to use before testing distance.
SD_REPLACE = [
    (r'&', 'and'),
]

Shouldn’t mappings from curly quotes to straight quotes just be added here to prevent ever prompting on this trivial match?

Hypothetically this should be an option in case the user prefers to stick to a certain kind. But i’ll get to that later…
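
A minimal sketch of the idea in this post, assuming a replacement list shaped like SD_REPLACE above. The extra quote entries and the `normalize_for_distance` helper are hypothetical additions for illustration, not code that exists in beets.

```python
import re

# Same shape as SD_REPLACE; the quote entries are hypothetical additions.
SD_REPLACE_SKETCH = [
    (r"&", "and"),
    (r"[\u2018\u2019]", "'"),  # curly single quotes -> straight apostrophe
    (r"[\u201c\u201d]", '"'),  # curly double quotes -> straight double quote
]


def normalize_for_distance(text: str) -> str:
    """Apply each (pattern, replacement) pair before comparing strings."""
    for pattern, replacement in SD_REPLACE_SKETCH:
        text = re.sub(pattern, replacement, text)
    return text


# Both spellings normalize to the same string, so they would compare equal.
print(normalize_for_distance("Buggin\u2019 Out") == normalize_for_distance("Buggin' Out"))
```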


#8

Is there a reason you don’t want beets to correct the names of your tracks? It’s trying to replace the normal apostrophes with the unicode ones, as those are the ones in the “official” track title from MusicBrainz. The reason they appear as ? in your command prompt is because they are not supported and beets replaces them when outputting them on the command line. The actual tracks themselves will be tagged with the unicode apostrophe as they should be.

Edit: I’ve realised I’ve pretty much just repeated what @adrian said, oops! I don’t think beets should be prompting you for what to do there, as you’ve got a high similarity value. Could you post the output of beet config here?


#9

In fact, the autotagger already discounts Unicode-related differences by normalizing everything to ASCII before doing the comparison. (Take a look at the references to unidecode in the code there if you’re curious.) The only penalty being applied here is for this change:

  • Vivrant Thing - Violator -> Vivrant Thing (title)

You can tell because of the parenthesized “(title)” after the track listing. We added those markers to explain where the similarity penalties come from. We still display the penalty-free differences so you can see what’s going to change.
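
To make the distinction concrete, here is a rough sketch using only the standard library. It is not the autotagger's real distance function: `to_ascii` is a hypothetical stand-in for unidecode that handles only curly quotes, and difflib's ratio is used merely as a convenient similarity measure.

```python
from difflib import SequenceMatcher

# Hypothetical stand-in for unidecode: maps only curly quotes to ASCII.
QUOTE_TABLE = str.maketrans(
    {"\u2018": "'", "\u2019": "'", "\u201c": '"', "\u201d": '"'}
)


def to_ascii(text: str) -> str:
    """Replace curly quotes with their straight ASCII counterparts."""
    return text.translate(QUOTE_TABLE)


def similarity(old: str, new: str) -> float:
    """Similarity in [0, 1] after lowercasing and quote normalization."""
    a, b = to_ascii(old).lower(), to_ascii(new).lower()
    return SequenceMatcher(None, a, b).ratio()


# Quote style and capitalization alone leave the titles identical.
print(similarity("Buggin' Out", "Buggin\u2019 Out"))
print(similarity("Check The Rhime", "Check the Rhime"))

# A genuinely different title scores below 1.0, the kind of change "(title)" flags.
print(similarity("Vivrant Thing - Violator", "Vivrant Thing"))
```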


#10

I see on both counts - i was also pretty concerned about the (title) thing because the output doesn’t really distinguish between what is an actual change and what is the description of the change detected. ( i still might keep a mapping there for ‘…’) but now i understand what the colorized output meant.

(and yes i have timid set to yes because i’m still evaluating whether i really want to do something like this across my entire collection – that one time i let picard run wild took me probably a year or so of manual re-editing to get back to how i wanted to curate my metadata)

I still like the idea of saying please don’t touch my quotes because i’ve never really been a fan of the curly quotes, but i’m warming up to the idea. i just really want to look at what will happen without seeing a lot of noise in the output. I’m experimenting with real albums but ultimately don’t want all the bootlegs i have to get trampled on either.

(just wait until i start ranting about how track numbers vary from “1” to “1/x” for no apparent reason and about all the duplicate artist/album artist/albumartist fields when they aren’t necessary)


#11

I’m sure I continue to fall on deaf ears, but my keyboards all produce ' by default, not ’, so simple things become harder if I need to work with curly quotes:

C:\Users\CaptFrankSolo\Music\beets>beet ls Buggin'

C:\Users\CaptFrankSolo\Music\beets>beet ls Buggin’
[1999] The Anthology\06 Buggin? Out

Note also that my Windows terminal (ConEmu and cmd both!) is completely capable of displaying a curly quote. Either Python or beets doesn’t know how to talk to it properly. :(


#12

You’ve got two options with regard to the filename: ASCII-fying the entire filename, or replacing just the Unicode quote with a normal one.

You can ASCII-fy the entire filename using the asciify_paths configuration option, or if you’d like to replace just the quotes in the filename, you can use the replace configuration option instead:

replace:
    # beets defaults
    '[\\/]': _
    '^\.': _
    '[\x00-\x1f]': _
    '[<>:"\?\*\|]': _
    '\.$': _
    '\s+$': ''
    '^\s+': ''
    # no special quotes thank you
    '[\u2018\u2019]': ''''
    '[\u201c\u201d]': '"'
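
Roughly speaking, each key in replace is a regular expression and each value is the text substituted into a path component. A small sketch of that behaviour, under the assumption that the rules are applied in order with re.sub (the `apply_replacements` helper and the trimmed-down rule list are illustrative, not beets internals):

```python
import re

# A trimmed-down subset of the rules above, as (regex, replacement) pairs.
RULES = [
    (r'[<>:"\?\*\|]', "_"),    # characters that are illegal on some filesystems
    (r"\s+$", ""),             # trailing whitespace
    (r"[\u2018\u2019]", "'"),  # curly single quotes -> straight apostrophe
    (r"[\u201c\u201d]", '"'),  # curly double quotes -> straight double quote
]


def apply_replacements(component: str) -> str:
    """Run every rule over one path component, in order."""
    for pattern, replacement in RULES:
        component = re.sub(pattern, replacement, component)
    return component


# The curly apostrophe in the file name becomes a straight one.
print(apply_replacements("06 Buggin\u2019 Out"))
```

This only models the file name; the tag written to the file is a separate matter.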

#13

I’m completely fine with the file name, it’s the metadata itself that is my problem/obsession


#14

ASCIIfying tags has definitely come up before, and it would be good to have the ability to do that. A good entry point would be the proposed “formatted modify” feature (which can be found by searching for that phrase in the bug tracker). That would let you apply arbitrary functions to metadata like you can to path formats.


#15

Hello, I’m having some troubles with quote replacement.
I am using an expanded version of @jackwilsdon’s snippet to replace whatever needs replacing.
Here it is:

replace:
  # beets defaults
  '[\\/]': _
  '^\.': _
  '[\x00-\x1f]': _
  '[<>:"\?\*\|]': _
  '\.$': _
  '\s+$': ''
  '^\s+': ''
  # custom
  '[\u275C\u02BC]': ''''
  '[\u2018\u2019\u201a\u201b\u2039\u203a]': ''''
  '[\u201c\u201d\u201e\u201f\u00ab\u00bb]': '"'

And this is the path section of the configuration:

paths:
    default: $albumartist/$album%ifdef{albumdisambig, - %title{$albumdisambig}} ($original_year)%if{$tags, - $tags} [$format%ifdef{profile, $profile}]]/%if{$multidisc,Disc $disc/}$track - $title
    singleton: Non-Album/$artist/$title
    comp: Compilations/$album%aunique{}/$track $title
    albumtype_soundtrack: Soundtracks/$album/$track $title
    ext:log: $albumpath/$artist - $album
    ext:cue: $albumpath/$artist - $album
item_fields:
    multidisc: 1 if disctotal > 1 else 0
album_fields:
    profile: "total = 0\nfor item in items:\n    total += item.bitrate\nabr = total / len(items) / 1000  \nif abr > 480:\n    return None\nelif abr < 480 and abr >= 320:\n    return '320'\nelif abr < 320 and abr >= 220:\n    return 'V0'  \nelif abr < 215 and abr >= 170 and abr != 192:\n    return 'V2'\nelif abr == 192:\n    return '192'\nelif abr < 170:\n    return round(abr, 0)\n"
    tags: "import re\nmatch = re.search(r'(Live|EP|Remix)', albumtype)\nif match:\n    return match.expand('\\\\1')\n"

However, this doesn’t really work for me. For example, this release: The Beatles - “The White Album” (1968) [MP3 V0] gets moved into a folder named either The Beatles - _The White Album_ (1968) [MP3 V0] or, with slight variations to the above config, exactly as shown.
What should I try to make this work?


#16

It might be a good idea to reexamine this line in the default replacements:

  '[<>:"\?\*\|]': _

That outlaws the " character, replacing it with _. That replacement is necessary on some filesystems where quotes are illegal (looking at you, FAT) but probably not on your filesystem, so you might want to remove " from the character class.
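
A quick way to see the effect, as a sketch with plain re.sub rather than beets itself (the folder name is taken from the post above; the second pattern is simply the default class with the double quote removed):

```python
import re

DEFAULT_CLASS = r'[<>:"\?\*\|]'  # default rule: includes the double quote
WITHOUT_QUOTE = r'[<>:\?\*\|]'   # same class with " removed

name = 'The Beatles - "The White Album" (1968) [MP3 V0]'

# With the default class, both double quotes turn into underscores.
print(re.sub(DEFAULT_CLASS, "_", name))

# Without " in the class, the name passes through unchanged.
print(re.sub(WITHOUT_QUOTE, "_", name))
```

Note that Windows itself rejects " in file names, so dropping it from the class only makes sense on systems that allow that character.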


#17

I definitely didn’t notice that. Thanks for pointing that out.

On a side note, is it possible to use Unicode on the replacement side? For example:

replace:
  # beets defaults
  ':': "\u2236" # ∶ (ratio symbol)
  '\?': "\uff1f" # ？ (fullwidth question mark)