Resuming interrupted import starts all over

In beets 1.4.5 on CentOS 7, after an interrupted import (abort or crash), the next import always starts from the beginning, even though I pass the -p command-line option and the message “Resuming interrupted import of …” shows up. Any idea how to fix this?

beets version 1.4.5
Python version 2.7.5
plugins: fetchart, fromfilename, lyrics

Strange! Is there a way we can reproduce the problem? (Aborting an import and starting it again works as expected here.)

Or, perhaps a verbose log would reveal something enlightening?

It has to be something strange. Here are the first lines of the log:

[media@gurnemanz ~]$ beet -vv import -p /data/media/audio/flac/
user configuration: /home/media/.config/beets/config.yaml
data directory: /home/media/.config/beets
plugin paths:
Sending event: pluginload
lyrics: Disabling google source: no API key configured.
library database: /data/media/audio/library/beets.blb
library directory: /data/media/audio/library
Sending event: library_opened
Sending event: import_begin
Resuming interrupted import of /data/media/audio/flac
Sending event: import_task_created
Sending event: import_task_start
Looking up: /data/media/audio/flac/!!!- Strange Weather, Isn’t It (Japanese Edition) 2010
Tagging !!! - Strange Weather, Isn’t It? (Japanese Edition)
No album ID found.
Search terms: !!! - Strange Weather, Isn’t It? (Japanese Edition)
Album might be VA: False
Searching for MusicBrainz releases with: {'release': u"strange weather, isn't it? (japanese edition)", 'tracks': u'10', 'artist': u'!!!'}
Requesting MusicBrainz release 885c73f6-b7ea-49aa-92d8-4a5e5535b46f
Sending event: import_task_created
Sending event: import_task_created
primary MB release type: album
Sending event: albuminfo_received
Candidate: !!! - Strange Weather, Isn’t It? (885c73f6-b7ea-49aa-92d8-4a5e5535b46f)
Computing track assignment…
…done.
Success. Distance: 0.07
Requesting MusicBrainz release eb29cef7-cc52-4a49-ad75-10b9a61d45d6
Sending event: import_task_created
Sending event: import_task_created
primary MB release type: album
Sending event: albuminfo_received
Candidate: !!! - Strange Weather, Isn’t It? (eb29cef7-cc52-4a49-ad75-10b9a61d45d6)
Computing track assignment…
…done.
Success. Distance: 0.01
Requesting MusicBrainz release eb2e6f79-8b97-4565-8ea8-d8d0f2814cb2
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
primary MB release type: album
Sending event: albuminfo_received
Candidate: !!! - Strange Weather, Isn’t It? (eb2e6f79-8b97-4565-8ea8-d8d0f2814cb2)
Computing track assignment…

Strange! I don’t see any obvious evidence there. I’m not sure what else to try, really—maybe there’s some sort of small test that we can run here to reproduce the problem “from scratch”?

It could be because I fiddled around with some of the files beets uses. How is
the starting point of an interrupted import determined? Maybe that gives me
a clue where to start looking.

The short answer is that the importer stores all the directories it imports in a file called state.pickle under ~/.config/beets. It also records an indication when an import finishes.

Then, when you start an import, it checks whether an import of that directory has previously started but not finished. If so, it skips over all the subdirectories recorded in the file for that directory.
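To make the description above concrete, here is a toy model of that bookkeeping. It is only a sketch, not the code from importer.py: the function names, the flat dictionary layout, and the choice to represent "finished" by dropping the entry for the directory are all assumptions made for illustration.

```python
import pickle


def load_state(path):
    """Return the saved state dict, or an empty dict if the file is
    missing or cannot be unpickled."""
    try:
        with open(path, "rb") as f:
            return pickle.load(f)
    except (IOError, OSError, EOFError, pickle.UnpicklingError):
        return {}


def save_state(path, state):
    """Write the state dict back to disk."""
    with open(path, "wb") as f:
        pickle.dump(state, f)


def record_imported(state, toppath, subdir):
    """Remember that `subdir` was handled during an import of `toppath`."""
    done = state.setdefault(toppath, [])
    if subdir not in done:
        done.append(subdir)


def mark_finished(state, toppath):
    """Hypothetical 'finished' marker: forget the progress for `toppath`,
    so the next import of it starts from scratch."""
    state.pop(toppath, None)


def dirs_to_import(state, toppath, subdirs):
    """Return the subdirectories that still need importing when resuming."""
    done = set(state.get(toppath, []))
    return [d for d in subdirs if d not in done]
```

In this model, a resumed import that starts from the beginning would mean that the lookup for the top-level directory finds no recorded subdirectories, for example because the recorded paths do not match the paths seen on the next run.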

If you’re interested in more mechanical details, I recommend reading importer.py in the source. Good places to start are the _open_state and _save_state functions, which read and write this file.
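If you want to check what has actually been recorded, something like the following sketch can load the file and summarize its top-level entries. The path is the one mentioned above; the internal layout of the file is not described in this thread, so the code only assumes the file unpickles to a dictionary, and the helper names read_state and summarize are made up for this example. Run it with the same Python interpreter that beets uses, and only on your own state file, since unpickling can execute code.

```python
import os
import pickle

# Location mentioned in this thread; adjust it if your setup differs.
STATE_FILE = os.path.expanduser("~/.config/beets/state.pickle")


def read_state(path=STATE_FILE):
    """Unpickle the state file and return its contents."""
    with open(path, "rb") as f:
        return pickle.load(f)


def summarize(state):
    """Map each top-level key to (type name, number of entries).

    Assumes `state` is a dict; the size is None for values without a length.
    """
    summary = {}
    for key, value in state.items():
        try:
            size = len(value)
        except TypeError:
            size = None
        summary[key] = (type(value).__name__, size)
    return summary
```

Printing summarize(read_state()) before and after an aborted import should show whether anything is being recorded for the directory at all.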