What does beets do with already existing .lrc files?

Hi, I’ve got a bunch of albums (over 1000) that all have their own individual synced “songname.lrc” (exact name match to the .FLAC files except for extension) in the folders.

In the past with albums that don’t have their own files, I have noticed that sometimes slightly out-of-sync lyrics are imported.

In my config I have:
lyrics:
    force: no
    synced: yes
    source: lrclib genius

I have 1 problem that I really want to clarify before I potentially mess up my library.
The .FLAC files have unsynced lyrics embedded, but the .lrc files contain the synced lyrics for music players to use.

Since the .lrc files are separate, does beets embed the contents of the .lrc files to the .FLAC file, or does it simply copy it to the new folder, or does it disregard the .lrc files and sync new lyrics from the web?

Once that is cleared up, would I need to change my lyrics configuration? I just want my synced lyrics kept; I really don’t want to lose them or have new ones imported from the internet unless the .lrc files are missing.

beets won’t import other files that sit alongside your tracks (except maybe cover art?). There are a couple of solutions in this thread for copying additional files into your library during import: “Extrafiles vs. copyartifacts plugin or something else for lyrics?”

Thanks for the quick response.

I ended up using a workaround that I came up with to recursively embed the lyrics into all .FLAC files’ metadata with ffmpeg, then delete all the .lrc files.

Once that’s done, I run “beet import”, and now I can enjoy my properly organized music library with properly synced lyrics.

It’s pretty easy to do in Linux:

#!/bin/bash
shopt -s globstar

for file in **/*.flac; do
    base="${file%.*}"
    lrc_file="$base.lrc"

    if [[ -f "$lrc_file" ]]; then
        # Only replace the original and delete the .lrc if ffmpeg succeeded
        if ffmpeg -i "$file" -map 0 -c copy -metadata LYRICS="$(cat "$lrc_file")" "${base}_with_lyrics.flac"; then
            mv "${base}_with_lyrics.flac" "$file"
            rm "$lrc_file"
            echo "Lyrics embedded into $file"
        fi
    else
        echo "No .lrc file found for $file"
    fi
done

Hope this helps someone out one day. Cheers!

Since @jackwilsdon linked to my old thread I might as well add how I solved a good portion of my lyrics troubles.

While this approach might work if you only have flac files and all your .lrc files match them, it doesn’t cover other formats or cases where the .lrc file name does not match the song.

Initially I only wrote a script that recursively checks for unmatched .lrc files (so I could fix them manually). After helping another user (who had trouble with my initial, rough script), I started expanding it a bit. I kept adding features, and the result is lyrict, which can check for unmatched lyrics files, import and export lyrics (synced and unsynced) from and to multiple audio formats (and tags), and standardize the timestamp format of the lyrics, for example from [0:00.000]TEXT to [00:00.000]TEXT.
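To illustrate the timestamp standardization mentioned above, here is a minimal Python sketch. It is not lyrict’s actual code; the names TIMESTAMP and normalize_timestamps are made up for this example, and it assumes timestamps look like [m:ss.xxx] with one to three digits of minutes and an optional fraction.

```python
import re

# Matches an LRC timestamp such as [0:00.000], [1:02.5] or [12:34.56].
# Assumption: minutes are 1-3 digits and the fractional part is optional.
TIMESTAMP = re.compile(r"\[(\d{1,3}):(\d{2})(?:\.(\d{1,3}))?\]")


def normalize_timestamps(text):
    """Rewrite every timestamp as [mm:ss.xxx] with zero-padded minutes."""
    def fix(match):
        minutes, seconds, fraction = match.groups()
        # Pad the fraction on the right: ".5" means 500 ms, not 5 ms.
        fraction = (fraction or "0").ljust(3, "0")
        return f"[{int(minutes):02d}:{seconds}.{fraction}]"

    return TIMESTAMP.sub(fix, text)


if __name__ == "__main__":
    print(normalize_timestamps("[0:00.000]TEXT"))
```

Metadata lines such as [ar:Artist] are left untouched because the pattern only matches brackets that start with digits.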

Feel free to check it out and report any errors you find.

Hey! Thanks for your input. I’ll definitely give your work a look because it certainly seems to align with what I’m looking for.

You’re right about that; my approach really only applies to someone with a somewhat “managed” library. I did adapt this to all of the other formats I’ve used in my library, and I’ll share it here:

#!/bin/bash
shopt -s globstar #nocaseglob (todo: ignore capitalization)

# Commonly used Extensions
extensions=("flac" "alac" "wav" "mp3" "m4a" "ogg" "opus")

for ext in "${extensions[@]}"; do
    for file in **/*."$ext"; do
        base="${file%.*}"
        lrc_file="$base.lrc"
    
        if [[ -f "$lrc_file" ]]; then
            # Keep basename and match extension of original file
            temp_file="${base}_temp.${file##*.}"
            # Embed lyrics; only clean up if ffmpeg succeeded
            if ffmpeg -i "$file" -map 0 -c copy -metadata LYRICS="$(cat "$lrc_file")" "$temp_file"; then
                mv "$temp_file" "$file"
                rm "$lrc_file"
                echo "Lyrics embedded into $file"
            fi
        else
            echo "No .lrc file found for $file"
        fi
    done
done

As you can see at the top of the code, I was considering .lrc files whose capitalization doesn’t match the audio file.
I know my current library and put it together personally, so I don’t need nocaseglob right now, but I wanted to make it easier for future additions.

Basically, it would compare filenames case-insensitively and embed the lyrics if the basenames of the .lrc file and the audio file match.
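The case-insensitive matching idea can be sketched in Python as follows. This is only an illustration under assumptions: the function name pair_lyrics and the AUDIO_EXTENSIONS set are hypothetical, and the input is a plain list of file names from a single directory.

```python
from pathlib import PurePath

# Assumption: the audio extensions worth considering (lowercase, with dot).
AUDIO_EXTENSIONS = {".flac", ".mp3", ".m4a", ".ogg", ".opus", ".wav"}


def pair_lyrics(filenames):
    """Pair each .lrc name with the audio names sharing its stem, ignoring case."""
    audio_by_stem = {}
    lyric_names = []
    for name in filenames:
        path = PurePath(name)
        suffix = path.suffix.lower()
        if suffix == ".lrc":
            lyric_names.append(name)
        elif suffix in AUDIO_EXTENSIONS:
            # casefold() gives a case-insensitive key for the basename.
            audio_by_stem.setdefault(path.stem.casefold(), []).append(name)
    return {
        lrc: audio_by_stem.get(PurePath(lrc).stem.casefold(), [])
        for lrc in lyric_names
    }


if __name__ == "__main__":
    print(pair_lyrics(["Song.FLAC", "song.lrc", "Other.mp3"]))
```

An empty list for a given .lrc name means no audio file with that basename was found.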

I have experience with Python, but much more with bash, so that’s my approach for the time being. As I said at the beginning, I’ll definitely look into your work as well.

A smarter approach on my end as well would be to find the .lrc file and then look for a matching file regardless of extension in the same directory.

That was the sole purpose of my script in the beginning. I specified the formats it should check, it then found all .lrc files and looked for matching audio files of these formats. If it did not find any, it added the .lrc file path to a list of unmatched files to be fixed.
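A rough Python sketch of that first step (finding .lrc files that have no matching audio file) might look like the following. It is not the lyrict implementation; find_unmatched_lrc and AUDIO_EXTENSIONS are hypothetical names, and matching here is by exact basename within the same directory.

```python
import os

# Assumption: the audio formats to check against (lowercase, with dot).
AUDIO_EXTENSIONS = (".flac", ".mp3", ".m4a")


def find_unmatched_lrc(root):
    """Return sorted paths of .lrc files with no same-named audio file beside them."""
    unmatched = []
    for dirpath, _dirnames, filenames in os.walk(root):
        # Basenames of every audio file in this directory.
        audio_stems = {
            os.path.splitext(name)[0]
            for name in filenames
            if name.lower().endswith(AUDIO_EXTENSIONS)
        }
        for name in filenames:
            stem, ext = os.path.splitext(name)
            if ext.lower() == ".lrc" and stem not in audio_stems:
                unmatched.append(os.path.join(dirpath, name))
    return sorted(unmatched)


if __name__ == "__main__":
    for path in find_unmatched_lrc("."):
        print(path)
```

The returned list can be written to a file and worked through manually, which is the workflow described above.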

Later on that became creating multiple lists based on found/not found and which format the lyrics are associated with to create meaningful lists which are then further used for importing .lrc files efficiently (why try to embed a .lrc file for hundreds of thousands of songs if you already have a list of those that do have matching .lrc files?). I put quite a lot of thought into my script and tried to keep it as flexible and efficient as I could.

I’m always looking for feedback.

Just thought I’d give an update on my solution. This does exactly what I need for synced lyrics specifically. It uses bash on Linux (my library is stored on a machine running Ubuntu Server).

I’ve changed it to match media to lyrics instead of lyrics to media: it finds the .lrc files first, then looks for media files that match each one.

#!/bin/bash
# Allow matching entire directory tree, case insensitive matching.
shopt -s globstar nocaseglob

# Loop through all .lrc files in the directory tree
for lyric in **/*.lrc; do
    # Filepath without extension.
    base="${lyric%.*}"
    embedded=0
    # Find all files with same basename
    for media in "$base".*; do
        # Ensure identified file is not the synced lyrics file.
        if [[ "$media" != "$lyric" ]]; then
            # Add "_temp" to filename for output.
            temp_file="${media%.*}_temp.${media##*.}"
            # Embed lyrics to media file; replace the original only on success.
            if ffmpeg -i "$media" -map 0 -c copy -metadata LYRICS="$(cat "$lyric")" "$temp_file"; then
                mv "$temp_file" "$media"
                embedded=1
                # Lets you know which file has synced lyrics now.
                echo "Lyrics embedded into $media"
            fi
        fi
    done
    # Remove the .lrc file only after every matching media file has been processed
    if (( embedded )); then rm "$lyric"; fi
done

I’ll keep updating here as I get closer to what I’m looking for. (The script needs ffmpeg and ffprobe; it can easily be adapted for non-bash shells, which I’ll do later.) I have beets set up in a virtual environment, so the activate/deactivate lines can be removed if you have it installed on your system and added to your PATH.

I also have Plex Media Server set up (which unfortunately doesn’t support ID3 tags very well), so for lyrics I’m dependent on the .lrc files.

Quick overview of what this script does:

  1. Scans media files based on the list of extensions.
  2. Reads the lyrics embedded in each file.
  3. Branches on what it finds:
  • If the lyrics are synced, leave the file alone and delete the .lrc if it exists (my import and library folders are separate, which is why the second part is done).
  • Otherwise, if they’re unsynced, look for synced lyrics (a .lrc file) and embed them.
  • If they’re unsynced and no .lrc file exists, strip the lyrics from the file so beets can autotag it with synced lyrics. If the file has no lyrics and no .lrc file, leave it alone.
  4. Runs beet update/import/convert.
  5. Extracts the lyrics (.lrc and .txt) from all media files in the library (and exports a list of songs with unsynced lyrics to the library’s parent folder) for Plex Media Server to read.
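The branching in step 3 can be sketched as a small Python function. This is only an illustration of the decision logic, not part of the script: the names is_synced and plan_action and the returned action strings are made up here, and the pattern assumes synced lines start with a timestamp like [mm:ss.xx].

```python
import re

# Assumption: a synced line starts with a timestamp such as [00:12.34].
SYNCED_LINE = re.compile(r"^\[\d{1,2}:\d{2}(?:\.\d{1,3})?\]", re.MULTILINE)


def is_synced(lyrics):
    """True if any line of the lyrics starts with an LRC timestamp."""
    return bool(SYNCED_LINE.search(lyrics))


def plan_action(embedded_lyrics, lrc_exists):
    """Decide what to do with one media file before importing it."""
    if embedded_lyrics and is_synced(embedded_lyrics):
        # Synced lyrics are already embedded; the .lrc is redundant.
        return "delete_lrc" if lrc_exists else "keep"
    if lrc_exists:
        # Unsynced or missing lyrics, but a .lrc file is available.
        return "embed_lrc"
    if embedded_lyrics:
        # Unsynced lyrics and nothing better: strip so beets can fetch synced ones.
        return "strip"
    return "skip"


if __name__ == "__main__":
    print(plan_action("Plain text lyrics", lrc_exists=True))
```

Keeping the decision separate from the ffmpeg calls makes it easy to check each case before any file is modified.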

ROOT_DIR is just where the script is (the /home/$USER directory in my case)

#!/bin/bash
shopt -s globstar nocaseglob

# Get the directory of the script
ROOT_DIR="$(dirname "$(realpath "$0")")"
# State Import/Library folders
IMPORT_DIR="/home/${USER}/Downloads/Music"
LIBRARY_DIR="/home/${USER}/Music/Library"

# List of extensions of media files.
extensions=("flac" "m4a" "mp3")

### Embed Synced Lyrics in music to be imported
# Go to Import Directory
cd "$IMPORT_DIR" || exit 1
find . -type f -name '*.txt' -delete

for ext in "${extensions[@]}"; do
	if [[ "$ext" == "flac" ]]; then
		tags="LYRICS"
	else
		tags="lyrics"
	fi

	for file in **/*."$ext"; do
		# Get lyrics from metadata
		lyrics=$(ffprobe -loglevel error -show_entries format_tags="$tags" -of default=noprint_wrappers=1:nokey=1 "$file")
		# If not empty, check if it's synced and try to embed if not.
		if ! [ -z "$lyrics" ]; then
			# If synced, leave untouched
			if echo "$lyrics" | grep -qE '^\[[0-9]{2}:[0-9]{2}\.[0-9]{2}\]'; then
				echo "Synced lyrics already embedded in '$file'."
				if [ -e "${file%.*}.lrc" ]; then rm "${file%.*}.lrc"; fi
			# If not synced, look for .lrc file and embed
			elif [ -e "${file%.*}.lrc" ]; then
				lyrics="${file%.*}.lrc"
				temp_file="${file%.*}_temp.${file##*.}"
				ffmpeg -i "$file" -map 0 -c copy -metadata "$tags"="$(cat "$lyrics")" "$temp_file"
				mv "$temp_file" "$file"
				rm "$lyrics"
			# Otherwise, strip for autotagging in beets
			else
				echo "Stripping lyrics from '$file' for autotagging."
				temp_file="${file%.*}_temp.${file##*.}"
				ffmpeg -i "$file" -map 0 -c copy -metadata "$tags"="" "$temp_file"
				mv "$temp_file" "$file"
			fi
		# Embed synced lyrics if available
		elif [ -e "${file%.*}.lrc" ]; then
			lyrics="${file%.*}.lrc"
			temp_file="${file%.*}_temp.${file##*.}"
			ffmpeg -i "$file" -map 0 -c copy -metadata "$tags"="$(cat "$lyrics")" "$temp_file"
			mv "$temp_file" "$file"
			rm "$lyrics"
		# Otherwise, just state it's empty
		else
			echo "'$file' has no lyrics"
		fi
	done
done

### Import all Music to Library
source "$ROOT_DIR/beets/bin/activate"
beet update
beet import "$IMPORT_DIR"
beet convert -y
deactivate

### Export (hopefully now Synced) Lyrics to .lrc file for Plex Media Server
cd "$LIBRARY_DIR" || exit 1

for ext in "${extensions[@]}"; do
	if [[ "$ext" == "flac" ]]; then
		tags="LYRICS"
	else
		tags="lyrics"
	fi
	
	for file in **/*."$ext"; do
		# Check whether exported lyrics exist; delete the unsynced copy if both exist
		if [ -e "${file%.*}.lrc" ] && [ -e "${file%.*}.txt" ]; then
			rm "${file%.*}.txt"
		# If only unsynced exists
		elif [ -e "${file%.*}.txt" ]; then
			# Read Lyrics from file
			lyrics=$(ffprobe -loglevel error -show_entries format_tags="$tags" -of default=noprint_wrappers=1:nokey=1 "$file")		
			# Ensure lyrics aren't empty
			if ! [ -z "$lyrics" ]; then
				# If synced, delete unsynced and save synced
				if echo "$lyrics" | grep -qE '^\[[0-9]{2}:[0-9]{2}\.[0-9]{2}\]'; then
					echo "Synced lyrics for '$file' found, removing unsynced."
					rm "${file%.*}.txt"
					echo "$lyrics" > "${file%.*}.lrc"
				else
					echo "No synced lyrics found, leaving files intact."
					echo "$file" >> "unsynced.txt"
				fi
			fi
		# If no exported lyrics exist yet.
		elif [ ! -e "${file%.*}.txt" ] && [ ! -e "${file%.*}.lrc" ]; then
			# Read Lyrics from file
			lyrics=$(ffprobe -loglevel error -show_entries format_tags="$tags" -of default=noprint_wrappers=1:nokey=1 "$file")		
			# Ensure lyrics aren't empty
			if ! [ -z "$lyrics" ]; then
				# If synced, export as .lrc; otherwise export as .txt
				if echo "$lyrics" | grep -qE '^\[[0-9]{2}:[0-9]{2}\.[0-9]{2}\]'; then
					echo "Exporting synced lyrics for '$file'."
					echo "$lyrics" > "${file%.*}.lrc"
				else
					echo "Exporting unsynced lyrics for '$file'."
					echo "$lyrics" > "${file%.*}.txt"
					echo "$file" >> "unsynced.txt"
				fi
			fi
		else
			echo "Lyrics already exist for '$file'"
		fi
	done
done

Back again with another update, transitioning to Python (still learning, eventually [hopefully] I’ll get to the point where this can be a plugin).

I’m leaving the old one up just in case anyone wants something that’ll run without python (though you’d still need ffmpeg and ffprobe).

Now I’m at a point where I have an import.sh file which calls an import.py script.

import.sh:

#!/bin/bash
# Get the directory of the script (the beets virtual environment lives next to it)
ROOT_DIR="$(dirname "$(realpath "$0")")"
IMPORT="$HOME/Downloads/Music"
LIBRARY="$HOME/Music"

# Embed synced lyrics into files before importing with beets
# Strip unsynced lyrics from files
python3 ./import.py -em -dir "$IMPORT"

# Import all Music to Library
source "$ROOT_DIR/beets/bin/activate"
beet update
beet import "$IMPORT"
beet convert -y
deactivate

# Export lyrics from media files now that they're tagged
python3 ./import.py -ex -dir "$LIBRARY"

import.py:

import os, glob, re, argparse
from mutagen import File, MutagenError

extensions = ['flac', 'mp3', 'm4a']
lyrics_tags = ['LYRICS', 'lyrics', '\xa9lyr']
extension_to_tag = {
    'flac': 'LYRICS',
    'mp3': 'lyrics',
    'm4a': '\xa9lyr'
}

def get_tag_for_extension(extension):
    return extension_to_tag.get(extension, None)

def find_tag(audio, lyrics_tags):
    for tag in lyrics_tags:
        try:
            if tag in audio:
                return tag
        except ValueError:
            continue  # Ignore the error and continue checking other tags
    return None

def find_audio(folder):
    print(f"Scanning '{folder}'")
    audio_files = []
    for ext in extensions:
        audio_files.extend(glob.glob(f'{folder}/**/*.{ext}', recursive=True))
    return audio_files
    
def embed_lyrics_from_file(filepath, audio, lrc_filename, tag):
    if os.path.isfile(lrc_filename):
        with open(lrc_filename, 'r', encoding='utf-8') as lrc_file:
            lyrics = lrc_file.read()
        if lyrics:
            audio[tag] = lyrics
            audio.save()
            print(f"Embedded synced lyrics in '{filepath}'")
        os.remove(lrc_filename)
        print(f"Deleted '{lrc_filename}' after embedding lyrics.")

def embed_lyrics(importdir):
    audio_files = find_audio(importdir)
    for filepath in audio_files:
        try:
            audio = File(filepath)
            if audio:
                lrc_filename = f"{os.path.splitext(filepath)[0]}.lrc"
                tag = find_tag(audio, lyrics_tags)
                if tag:
                    lyrics = audio.get(tag)
                    if lyrics:
                        lyrics_str = lyrics[0] if isinstance(lyrics, list) else lyrics
                        if bool(re.search(r'\[\d{2}:\d{2}\.\d{2}\]|\[\d{1,2}\.\d{2}\]', lyrics_str)):
                            print(f"Synced lyrics already in '{filepath}'")
                        else:
                            print(f"Stripping lyrics from '{filepath}'")
                            del audio[tag]
                            audio.save()
                            print(f"Stripped lyrics from '{filepath}'")
                    # No harm in embedding it
                    embed_lyrics_from_file(filepath, audio, lrc_filename, tag)
                else:
                    # Get the extension without the dot
                    ext = os.path.splitext(filepath)[1][1:]
                    tag = get_tag_for_extension(ext)
                    if tag:
                        embed_lyrics_from_file(filepath, audio, lrc_filename, tag)
                    else:
                        print(f"No tag found for '{filepath}' and no extension map available")
            else:
                print(f"Audio file not readable: '{filepath}'")
        except MutagenError as e:
            print(f"Error processing '{filepath}': {e}")

def extract_lyrics(musicdir):
    audio_files = find_audio(musicdir)
    unsynced = []
    nolyrics = []
  
    for filepath in audio_files:
        try:
            audio = File(filepath)
            if audio:
                tag = find_tag(audio, lyrics_tags)
                if tag:
                    lyrics = audio.get(tag)
                    if lyrics:
                        lyrics_str = lyrics[0] if isinstance(lyrics, list) else lyrics
                        if bool(re.search(r'\[\d{2}:\d{2}\.\d{2}\]|\[\d{1,2}\.\d{2}\]', lyrics_str)):
                            lyrics_type = 'lrc'
                        else:
                            lyrics_type = 'txt'
                            unsynced.append(filepath)
                        # Write the export for both synced (.lrc) and unsynced (.txt) lyrics
                        output_filename = f"{os.path.splitext(filepath)[0]}.{lyrics_type}"
                        if not os.path.isfile(output_filename):
                            with open(output_filename, 'w', encoding='utf-8') as output:
                                output.write(f"{lyrics_str}\n")
                    else:
                        print(f"No lyrics found for '{filepath}'")
                        nolyrics.append(filepath)
                else:
                    print(f"No lyrics tag found for '{filepath}'")
                    nolyrics.append(filepath)
            else:
                print(f"Could not read metadata for '{filepath}'")
        except MutagenError as e:
            print(f"Error processing '{filepath}': {e}")
    if unsynced:
        print("There are some files with unsynced lyrics \nCheck 'unsynced.txt' for more.")
        
        with open('unsynced.txt', 'w') as output:
            for item in unsynced:
                output.write(f"{item}\n")
    if nolyrics:
        print("There are some files with no lyrics \nCheck 'nolyrics.txt' for more.")
        with open('nolyrics.txt', 'w') as output:
            for item in nolyrics:
                output.write(f"{item}\n")

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('-em', action='store_true')
    parser.add_argument('-ex', action='store_true')
    parser.add_argument('-dir', type=str, required=True)
    args = parser.parse_args()
    
    directory = os.path.expanduser(args.dir)
  
    if args.em:
        print(f"Embedding lyrics in '{directory}'")
        embed_lyrics(directory)
    elif args.ex:
        print(f"Extracting lyrics in '{directory}'")
        extract_lyrics(directory)
    else:
        print("No valid option provided.")
    
if __name__ == "__main__":
    main()

The script either embeds or extracts lyrics from audio files using the mutagen library, and can process multiple audio file types (flac, m4a, mp3). It’s driven by command-line arguments: -em triggers embedding lyrics, -ex triggers extracting lyrics, and -dir "string" specifies the directory to use.

It also writes lists of files with unsynced lyrics (unsynced.txt) and with no lyrics (nolyrics.txt) to the current working directory, which is next to the script itself when it is run through import.sh from the script’s folder.