Importing From Python

Hello,

I’d like to manage beets imports from another Python script. I’ve got the overall workflow working with subprocess and capturing the output, but I’d rather import the beets code, call the import function directly, and capture the results in a structured way. I found this old thread on Google Groups with a potentially useful breadcrumb, but I’m curious whether anyone has documented the process in more depth or has an example they can share.

Thanks

Hello! We don’t currently have great documentation about how to do that, although it should certainly be possible. This recent thread got things started—maybe this would be a good starting point?

That is super helpful and gives me a great place to start. Unfortunately, I’ve spent a bit of time poking at the problem since and I’m struggling with two problems: a runtime error, and difficulty manipulating the config object. Here’s my super simple autobeet.py:

from beets import config
from beets import importer
from beets.ui import _open_library
from beets.autotag import Recommendation

class Autobeets(object):

    class AutoImportSession(importer.ImportSession):

        def should_resume(self, path):
            return True

        def choose_match(self, task):
            if task.rec == Recommendation.strong:
                return importer.action.APPLY
            else:
                return importer.action.SKIP

        def resolve_duplicate(self, task, found_duplicates):
            return importer.action.SKIP

        def choose_item(self, task):
            if task.rec == Recommendation.strong:
                return importer.action.APPLY
            else:
                return importer.action.SKIP

    def __init__(self, config_file, music_db):
        config.set_file(config_file)
        config.resolve()
        self.lib = _open_library(config)

    def import_directory(self, import_dir):
        query = None
        logger = None
        self.session = Autobeets.AutoImportSession(self.lib, logger, [import_dir], query)
        self.session.run()

And here’s what is calling it:

#!/usr/bin/python3

import os
import sys
import autobeet

def main():
    config_file = 'autobeet.yaml'
    library_path = 'autobeet.blb'
    beets = autobeet.Autobeets(config_file, library_path)
    beets.import_directory('inbox/Artist - Album (2011) [FLAC]')

if __name__ == "__main__":
    main()

First up, here’s the error:

Traceback (most recent call last):
  File "./import.py", line 14, in <module>
    main()
  File "./import.py", line 11, in main
    beets.import_directory('inbox/Gang Gang Dance - Eye Contact (2011) [FLAC]')
  File "/home/michaelmacleod/autobeet/autobeet.py", line 40, in import_directory
    self.session.run()
  File "/usr/local/lib/python3.4/dist-packages/beets/importer.py", line 325, in run
    pl.run_parallel(QUEUE_SIZE)
  File "/usr/local/lib/python3.4/dist-packages/beets/util/pipeline.py", line 445, in run_parallel
    six.reraise(exc_info[0], exc_info[1], exc_info[2])
  File "/usr/local/lib/python3.4/dist-packages/six.py", line 693, in reraise
    raise value
  File "/usr/local/lib/python3.4/dist-packages/beets/util/pipeline.py", line 312, in run
    out = self.coro.send(msg)
  File "/usr/local/lib/python3.4/dist-packages/beets/util/pipeline.py", line 171, in coro
    task = func(*(args + (task,)))
  File "/usr/local/lib/python3.4/dist-packages/beets/importer.py", line 1317, in user_query
    task.choose_match(session)
  File "/usr/local/lib/python3.4/dist-packages/beets/importer.py", line 800, in choose_match
    self.set_choice(choice)
  File "/usr/local/lib/python3.4/dist-packages/beets/importer.py", line 454, in set_choice
    assert choice != action.APPLY  # Only used internally.
AssertionError

The second problem is that I can’t figure out how to manipulate the config object without overwriting the whole thing. I’ve been looking through the beets ui code, and I see many examples where the config object is modified like a regular dict, but if I try to overwrite any setting as if it were a dict, the whole config gets replaced with only my override.

I’d appreciate any ideas on either subject.

Thanks

Good question. The problem is that choose_match (and choose_item) can’t just return APPLY, because that doesn’t tell the system which match to apply. Instead, return one of the matches from the task—probably the highest-ranked one.
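
As a rough sketch of that decision in isolation: the Recommendation and Action enums and the pick_match function below are stand-ins written for illustration, not the beets classes, and the sketch assumes the candidate list is already sorted best-first.

```python
from enum import Enum, IntEnum


class Recommendation(IntEnum):
    """Stand-in for the importer's match-confidence levels (illustrative only)."""
    none = 0
    low = 1
    medium = 2
    strong = 3


class Action(Enum):
    """Stand-in for the importer's non-match choices (illustrative only)."""
    SKIP = "skip"


def pick_match(rec, candidates):
    """Return the top candidate for a strong recommendation, else SKIP.

    Hypothetical helper, not part of beets. Assumes `candidates` is sorted
    best-first.
    """
    if rec == Recommendation.strong and candidates:
        return candidates[0]
    return Action.SKIP


print(pick_match(Recommendation.strong, ["best match", "runner-up"]))
print(pick_match(Recommendation.low, ["best match"]))
```

The point is only that the return value is either a concrete match object or a skip marker, never a bare "apply".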

About the configuration: it should be possible to do things like config['foo']['bar'] = baz. That should create a new overlay, which may make things look confusing, but it should not remove other settings beyond what was changed. Do you have examples of what you tried?

Wherever possible, however, settings for the importer can be controlled by the ImportSession, which just uses a plain dictionary.
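
To illustrate the overlay idea, here is an analogy using the standard library's ChainMap. It is not the beets configuration code, ChainMap is flat rather than nested, and the keys and values are made up, but it shows how a write can land in a new top layer while the other settings stay visible.

```python
from collections import ChainMap

# Values as they might come from a config file (made-up keys for illustration).
file_layer = {"library": "library.blb", "directory": "~/Music"}

# new_child() puts an empty dict on top; writes land there, reads fall through.
config = ChainMap(file_layer).new_child()
config["library"] = "autobeet.blb"

print(config["library"])      # the override wins
print(config["directory"])    # untouched settings are still visible
print(file_layer["library"])  # the lower layer itself is unchanged
```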

Huzzah! I have some success! I’ve now got a wrapper class that can load the config.yaml file, load plugins, and successfully import a directory. In my use case I’m going to have metadata about the import, such as the original media format, so I’m passing that to the import function, which uses it to dynamically adjust the match requirements. Here’s the code:

from beets import config
from beets import plugins
from beets import importer
from beets.ui import _open_library
from beets.ui import _load_plugins
from beets.autotag import Recommendation

class Autobeets(object):

    class AutoImportSession(importer.ImportSession):

        def should_resume(self, path):
            return True

        def choose_match(self, task):
            if task.rec == Recommendation.strong:
                return task.candidates[0]
            else:
                return importer.action.SKIP

        def resolve_duplicate(self, task, found_duplicates):
            return importer.action.SKIP

        def choose_item(self, task):
            if task.rec == Recommendation.strong:
                return task.candidates[0]
            else:
                return importer.action.SKIP

    def __init__(self, config_file, music_db):
        config.set_file(config_file)
        config.resolve()
        config['library'] = music_db
        config['import']['flat'] = True
        self.lib = _open_library(config)
        self.plugins = _load_plugins(config)

    def import_directory(self, import_dir, media):
        query = None
        logger = None
        config['import']['set_fields'] = {'import_dir': import_dir}
        config['import']['match']['required']['media'] = [media]
        self.session = Autobeets.AutoImportSession(self.lib, logger, [import_dir], query)
        results = self.session.run()

The last thing I need is to grab the results. There’s that logger option that I’m bypassing with None right now which should work, but I’m curious if there’s a way to get an object or dict that represents the import? I can peer into the candidate during the choose_match, but is there a way to grab it after the import is done?

Cool! The ordinary way to observe when an import task is finished would be to hook into one of the relevant plugin events. That seems a little roundabout for your purposes, but perhaps listening for the appropriate event would work regardless?

Thanks for the tip Adrian. I’ve made some progress, and figure I’ll share what I have so far in case anyone else wants to do something similar. There’s still some room for improvement (nothing gets written to logs yet, for example), but it’s functional. You can use the code below to hook into beets from other python apps and automatically import albums (it should work for singletons as well, though I haven’t tested that). It works with your existing library and config file (though you need to supply the paths), and will do the full plugin pipeline.

Here’s the wrapper class, which is what you’ll work with:

#!/usr/bin/python3

import logging
from beets import config
from beets import plugins
from beets import importer
from beets.ui import _open_library
from beets.ui import _load_plugins
from beets.autotag import Recommendation
from beetsplug import importdata

class Autobeets(object):

    class AutoImportSession(importer.ImportSession):

        def should_resume(self, path):
            return False

        def choose_match(self, task):
            if task.rec == Recommendation.strong:
                return task.candidates[0]
            else:
                return importer.action.SKIP

        def resolve_duplicate(self, task, found_duplicates):
            return importer.action.SKIP

        def choose_item(self, task):
            if task.rec == Recommendation.strong:
                return task.candidates[0]
            else:
                return importer.action.SKIP

    def __init__(self, config_file, music_db):
        config.set_file(config_file)
        config.resolve()
        config['library'] = music_db
        autoplugins = []
        autoplugins.append('importdata')
        for p in config['plugins']:
            autoplugins.append(str(p))
        config['plugins'] = autoplugins
        config['import']['flat'] = True
        config['import']['resume'] = False
        config['import']['quiet'] = True
        self.lib = _open_library(config)
        self.plugins = _load_plugins(config)

    def import_directory(self, import_dir, media=None, fields=None):
        query = None
        logger = None  # no log handler is attached yet
        if fields:
            config['import']['set_fields'] = fields
        if media:
            config['import']['match']['required']['media'] = [media]
        self.session = Autobeets.AutoImportSession(self.lib, logger, [import_dir], query)
        self.session.run()
        if importdata.imported:
            return importdata.imported[0]
        else:
            return None

As per Adrian’s suggestion, you’ll need to include a custom plugin. Follow the advice about setting up the paths and __init__.py from the docs for creating plugins (http://beets.readthedocs.io/en/v1.4.6/dev/plugins.html). For the plugin itself, here’s what you need:

from beets.plugins import BeetsPlugin

imported = []

class ImportData(BeetsPlugin):

    def __init__(self):
        super(ImportData, self).__init__()
        self.register_listener('album_imported', self.album_imported)
        self.register_listener('item_imported', self.item_imported)

    def album_imported(self, lib, album):
        imported.append(album)

    def item_imported(self, lib, item):
        imported.append(item)

Example script for using the Autobeets importer:

#!/usr/bin/python3

import os
import sys
import autobeet

def main():
    config_file = 'autobeet.yaml'
    library_path = 'autobeet.blb'
    import_dir = 'Artist - Album (2018) [FLAC]'
    beets = autobeet.Autobeets(config_file, library_path)
    imported = beets.import_directory('inbox/%s' % import_dir, 'CD', {'import_dir': import_dir})
    print('imported: %s/[%s] %s' % (imported['albumartist_sort'], imported['original_year'], imported['album']))

if __name__ == "__main__":
    main()

That should do it. Some things to note:

  • The Autobeets class adds the importdata plugin to the plugin list itself, so there’s no need to include it in your config.yaml.
  • I set the resume, flat, and quiet import options to reasonable values for a single unattended directory import.
  • Speaking of, this whole thing assumes you’re only importing a single directory at a time. There’s a couple of places where that assumption shows through. If you’re importing multiple albums per session, you’ll need to tweak a few things.
  • You can pass a media type to the import_directory() call, and it’ll be required for the match. If you’re importing a CD rip, this’ll prevent it matching a vinyl release.
  • You can pass a dictionary of fields to the import_directory() call, and they’ll be used to set custom fields, as if you’d passed them to the --set command line option. In the importer example above, I’m setting a field to track the original import directory.

Room for improvement:

  • Logging
  • Test the returned object (imported) for Album or Item.
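
For the logging item, one possible approach is a small handler that collects messages in memory. This is a sketch under the assumption that the session's second argument (currently None in the wrapper) accepts a standard logging.Handler; that is not confirmed in this thread. ListHandler and the "autobeet.demo" logger name are hypothetical.

```python
import logging


class ListHandler(logging.Handler):
    """Collect formatted log messages in a list (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(self.format(record))


handler = ListHandler()
handler.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))

# Stand-in logger for the demo. With beets, the handler would instead be
# passed where the wrapper currently passes None (an assumption).
log = logging.getLogger("autobeet.demo")
log.setLevel(logging.INFO)
log.propagate = False
log.addHandler(handler)

log.info("skip %s", "inbox/Artist - Album")
print(handler.messages)
```

The calling script can then read handler.messages after the import instead of parsing console output.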

Hopefully someone else finds this useful.
Cheers,
Mike

That’s very cool! As you can see in the thread above, we really need better documentation for how to do this kind of thing. So if you end up drawing any useful conclusions, perhaps we can get your help integrating them into the official docs.

I’d be happy to help out with some docs, though a lot of my solution was arrived at by trial and error rather than a deep understanding of the beets code base. What would you like to see in terms of documentation?

Cool! I think even just a few examples of working code would be useful—for instance, just a minimal skeleton of a program that “boots up” a beets import session could be really informative.

hi Mike, following your lead ~2 years later! did you ever wrap this up into a beets plugin?

Sorta, but only good enough for my specific use case. And it stopped working properly in the spring…

I’ve been meaning to take another look at it and resolve my issue and improve it enough to share it. I was going to make it my winter hobby project.

If you’re looking to do something similar though, I can take a look and share what I have. I’ll need a few nights to re-acquaint myself with my work though.

@rik, I’ve spent some time this last week refreshing my tool and dockerizing it (my local python environment was declared a superfund site). For the most part, it’s the same as I documented above. There’s three separate parts to my tool:

  • beet-importd: Daemon process that polls for messages on an AWS SQS queue that detail new directories to import, and goes about importing them.
  • Autobeets: The custom import session used by beet-importd. Functionally similar to just calling ‘beet import -q --flat’ on the directory supplied to beet-importd, but without dropping to a shell to call it.
  • Importdata: A custom plugin that serves just to record information about the import for beet-importd to access. It records the import choice (action.SKIP vs. action.APPLY, etc), as well as the album and item objects imported.

The only part necessary for integrating with other python tools is autobeets. One thing I’ve noticed in my testing this week is my duplicate checking is busted, as duplicate imports aren’t returning with action.SKIP. Here’s my current autobeets.py with the few lines related to the importdata plugin removed:

import logging
import itertools
from beets import config
from beets import plugins
from beets import importer
from beets.ui import _open_library
from beets.ui import _load_plugins
from beets.autotag import Recommendation

class Autobeets(object):
    class AutoImportSession(importer.ImportSession):
        def should_resume(self, path):
            return False

        def choose_match(self, task):
            if task.rec == Recommendation.strong:
                self.logger.debug('Found match: %s' % str(task.candidates[0]))
                return task.candidates[0]
            else:
                self.logger.info('No strong candidate, skipping')
                return importer.action.SKIP

        def resolve_duplicate(self, task, found_duplicates):
            self.logger.debug('Skipping duplicate')
            return importer.action.SKIP

        def choose_item(self, task):
            if task.rec == Recommendation.strong:
                self.logger.debug('Found match: %s' % str(task.candidates[0]))
                return task.candidates[0]
            else:
                self.logger.info('No strong candidate, skipping')
                return importer.action.SKIP

    def _merge_config(self, a, b):
        # Recursively merge b over a: dicts by key, lists by position.
        if isinstance(a, dict) and isinstance(b, dict):
            d = dict(a)
            d.update({k: self._merge_config(a.get(k, None), b[k]) for k in b})
            return d
        if isinstance(a, list) and isinstance(b, list):
            return [self._merge_config(x, y) for x, y in itertools.zip_longest(a, b)]
        return a if b is None else b

    def __init__(self, music_db, config_file, config_opts=None):
        self.logger = logging.getLogger(__name__)
        self.logger.debug('Initializing an instance of Autobeets')
        config.set_file(config_file)
        config['library'] = music_db
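        # Work on a copy so that a missing config_opts (None) doesn't break .pop().
        config_opts = dict(config_opts or {})
        plugins = config_opts.pop('plugins', None)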
        if plugins:
            autoplugins = []
            for p in plugins:
                autoplugins.append(str(p))
            for p in config['plugins']:
                autoplugins.append(str(p))
            config['plugins'] = autoplugins
        import_opts = config_opts.pop('import', None)
        if import_opts:
            for k, v in import_opts.items():
                config['import'][k] = v
        for k, v in config_opts.items():
            config[k] = v
        config['import']['flat'] = True
        config['import']['resume'] = False
        config['import']['quiet'] = True
        config['import']['timid'] = False
        config['import']['incremental'] = False
        config['import']['duplicate_action'] = 'skip'
        self.lib = _open_library(config)
        self.plugins = _load_plugins(config)

    def import_directory(self, import_dir, media=None, fields=None):
        query = None
        self.logger.debug('Importing %s' % import_dir)
        logger = self.logger
        if fields:
            config['import']['set_fields'] = fields
            self.logger.debug('Setting fields %s' % fields)
        if media:
            config['import']['match']['preferred']['media'] = [media]
            self.logger.debug('Preferring %s matches' % media)
        self.session = Autobeets.AutoImportSession(self.lib, logger, [import_dir], query)
        self.session.run()

That should get anyone started.
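
The _merge_config helper in the class above is not called anywhere in the code as posted. As a standalone sketch of what such a recursive merge does under Python 3 (the function name merge and the sample option values are made up for illustration):

```python
from itertools import zip_longest


def merge(a, b):
    """Recursively merge b over a: dicts by key, lists by position."""
    if isinstance(a, dict) and isinstance(b, dict):
        merged = dict(a)
        merged.update({k: merge(a.get(k), b[k]) for k in b})
        return merged
    if isinstance(a, list) and isinstance(b, list):
        return [merge(x, y) for x, y in zip_longest(a, b)]
    # Scalars (or mismatched types): b wins unless it is None.
    return a if b is None else b


# Made-up defaults and overrides, shaped like beets options.
defaults = {"import": {"flat": False, "quiet": False}, "plugins": ["fetchart"]}
overrides = {"import": {"quiet": True}, "plugins": ["importdata", "lastgenre"]}
result = merge(defaults, overrides)
print(result)
```

Note that lists are merged by position rather than concatenated, so an override list replaces entries at the same index; for plugin lists, the explicit append loop in __init__ is probably closer to what is wanted.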

Good as your word, thanks @macleodmike ! i find your new version of autobeet.py in your post; are you sticking all the bits in some repo somewhere? i’ll try to get back to poking at this task asap.

@rik, I do, but it’s also bound up with the parts of the code that are specific to my use case and not generalized.