I'm looking for a resource in python or bash that will make it easy to take, for example, mp3 file X and m4a file Y and say "copy X's tags to Y".

Python's "mutagen" module is great for manipulating tags in general, but it has no abstract concept of an "artist field" that spans different tag formats; I want a library that handles all the fiddly bits and knows which field names are equivalent. For things not all tag systems can express, I'm okay with information being lost or best-guessed.

(Use case: I encode lossless files to mp3, then go use the mp3s for listening. Every month or so, I want to be able to update the 'master' lossless files with whatever tag changes I've made to the mp3s. I'm tired of stubbing my toes on implementation differences among formats.)

A: 

You can just write a simple app with a mapping of each tag name in each format to an "abstract tag" type, and then it's easy to convert from one to the other. You don't even have to know all available types - just those that you are interested in.
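
As a rough illustration of that mapping idea, here is a minimal sketch. It assumes plain dicts stand in for real tag objects (with mutagen, ID3 values are frame objects rather than strings, so real code needs more care). The names FIELD_MAP, to_abstract, from_abstract and convert are made up for this example; the native keys shown are the usual ID3v2.4 frame IDs, MP4 atom names and Vorbis comment names for these four fields.

```python
# Abstract field name -> native key in each tag system.
FIELD_MAP = {
    "artist": {"id3": "TPE1", "mp4": "\xa9ART", "vorbis": "artist"},
    "album": {"id3": "TALB", "mp4": "\xa9alb", "vorbis": "album"},
    "title": {"id3": "TIT2", "mp4": "\xa9nam", "vorbis": "title"},
    "date": {"id3": "TDRC", "mp4": "\xa9day", "vorbis": "date"},
}


def to_abstract(native_tags, fmt):
    """Translate native keys to abstract names, dropping unknown keys."""
    reverse = {names[fmt]: field for field, names in FIELD_MAP.items()}
    return {reverse[k]: v for k, v in native_tags.items() if k in reverse}


def from_abstract(abstract_tags, fmt):
    """Translate abstract names to the native keys of the given format."""
    return {FIELD_MAP[f][fmt]: v
            for f, v in abstract_tags.items() if f in FIELD_MAP}


def convert(native_tags, src_fmt, dst_fmt):
    """Convert a dict of native tags from one format's keys to another's."""
    return from_abstract(to_abstract(native_tags, src_fmt), dst_fmt)


# Keys without an entry in FIELD_MAP (here a custom TXXX frame) are lost.
mp3_tags = {"TPE1": "Some Artist", "TIT2": "Some Title", "TXXX:custom": "x"}
m4a_tags = convert(mp3_tags, "id3", "mp4")
print(m4a_tags)
```

Only the fields listed in the table survive a conversion, which matches the "information being lost" trade-off the question accepts.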

Seems to me like a weekend-project type of time investment, possibly less. Have fun, and I won't mind taking a peek at your implementation and even using it - if you won't mind releasing it of course :-) .

Guss
+3  A: 

I needed this exact thing, and I, too, quickly realized that mutagen is not a high-level enough abstraction for this kind of thing. Fortunately, the authors of mutagen needed one for their media player, QuodLibet.

I had to dig through the QuodLibet source to find out how to use it, but once I understood it, I wrote a utility called sequitur which is intended to be a command line equivalent to ExFalso (QuodLibet's tagging component). It uses this abstraction mechanism and provides some added abstraction and functionality.

If you want to check out the source, here's a link to the latest tarball. The package is actually a set of three command line scripts and a module for interfacing with QL. If you want to install the whole thing, you can use:

easy_install QLCLI

One thing to keep in mind about exfalso/quodlibet (and consequently sequitur) is that they actually implement audio metadata properly, which means that all tags support multiple values (unless the file type prohibits it, and few do). So, doing something like:

print qllib.AudioFile('foo.mp3')['artist']

will not output a single string, but a list of strings like:

[u'The First Artist', u'The Second Artist']
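
Because every value is a list, code that wants a single display string has to join the values itself. A minimal sketch, assuming a plain dict as a stand-in for the tag object (display_value is a made-up helper for this example, not part of qllib):

```python
def display_value(tags, key, sep=", "):
    """Join all values stored under key into one string ('' if missing)."""
    values = tags.get(key, [])
    if isinstance(values, str):
        # Tolerate a bare string where a list was expected.
        values = [values]
    return sep.join(values)


# A plain dict standing in for a tag object with multi-valued fields.
tags = {"artist": ["The First Artist", "The Second Artist"],
        "album": ["Foo"]}
print(display_value(tags, "artist"))
```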

The way you might use it to copy tags would be something like:

import os.path
import qllib  # this is the module that comes with QLCLI

def update_tags(mp3_fn, flac_fn):
    mp3 = qllib.AudioFile(mp3_fn)
    flac = qllib.AudioFile(flac_fn)
    # you can iterate over the tag names
    # they will be the same for all file types
    for tag_name in mp3:
        flac[tag_name] = mp3[tag_name]
    flac.write()

mp3_filenames = ['foo.mp3', 'bar.mp3', 'baz.mp3']

for mp3_fn in mp3_filenames:
    flac_fn = os.path.splitext(mp3_fn)[0] + '.flac'
    # only update when the mp3 was modified after the flac
    if os.path.getmtime(mp3_fn) > os.path.getmtime(flac_fn):
        update_tags(mp3_fn, flac_fn)
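
The timestamp check can also be pulled out into a small helper that reads as "the mp3 is newer than the flac". This is only a sketch: needs_update and flac_name are made-up names, and the demo uses empty temporary files with hand-set timestamps rather than real audio.

```python
import os
import tempfile


def flac_name(mp3_fn):
    """Return the .flac filename that corresponds to an .mp3 filename."""
    return os.path.splitext(mp3_fn)[0] + ".flac"


def needs_update(src_fn, dest_fn):
    """Return True if src_fn was modified more recently than dest_fn."""
    return os.path.getmtime(src_fn) > os.path.getmtime(dest_fn)


# Demo: two empty files whose modification times are set explicitly.
tmp_dir = tempfile.mkdtemp()
mp3_fn = os.path.join(tmp_dir, "foo.mp3")
flac_fn = flac_name(mp3_fn)
for fn in (mp3_fn, flac_fn):
    open(fn, "w").close()
os.utime(flac_fn, (1000, 1000))  # flac last touched earlier
os.utime(mp3_fn, (2000, 2000))   # mp3 retagged later
print(needs_update(mp3_fn, flac_fn))
```

Writing tags to the flac makes it the newer file, so a "newer than" test skips pairs that are already in sync, whereas a "not equal" test would copy them again on every run.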
Jeremy Cantrell
A: 

There's also tagpy, which seems to work well.

gatoatigrado
+1  A: 

Here's some example code: a script that I wrote to copy tags between files using Quod Libet's music format classes (not mutagen's!). To run it, just do copytags.py src1 dest1 src2 dest2 src3 dest3, and it will copy the tags in src1 to dest1 (after deleting any existing tags on dest1!), and so on. Note the blacklist, which you should tweak to your own preference. The blacklist not only prevents certain tags from being copied, it also prevents them from being clobbered in the destination file.

To be clear, Quod Libet's format-agnostic tagging is not a feature of mutagen; it is implemented on top of mutagen. So if you want format-agnostic tagging, you need to use quodlibet.formats.MusicFile to open your files instead of mutagen.File.

One critical detail for me was that Quod Libet's music format classes expect QL's configuration to be loaded, hence the config.init line in my script. Without that, I get all sorts of errors when loading or saving files.

#!/usr/bin/python

import os, sys, re, UserDict
from warnings import warn

from quodlibet import config
from quodlibet.formats import MusicFile

# Pass the path components separately so os.path.join inserts the separators
config.init(os.path.join(os.getenv("HOME"), ".quodlibet", "config"))

class AudioFile(UserDict.DictMixin):
    """A simple class just for tag editing.

    No internal mutagen tags are exposed, or filenames or anything. So
    calling clear() won't destroy the filename field or things like
    that. Use it like a dict, then .write() it to commit the changes.

    Optional argument blacklist is a list of regexps matching
    non-transferable tags. They will effectively be hidden, neither
    settable nor gettable.

    Or grab the actual underlying quodlibet format object from the
    .data field and get your hands dirty."""
    def __init__(self, filename, blacklist=()):
        self.data = MusicFile(filename)
        # Also exclude mutagen's internal tags
        # list() lets the default empty tuple be concatenated with the list
        self.blacklist = [ re.compile("^~") ] + list(blacklist)
    def __getitem__(self, item):
        if self.blacklisted(item):
            warn("%s is a blacklisted key." % item)
        else:
            return self.data.__getitem__(item)
    def __setitem__(self, item, value):
        if self.blacklisted(item):
            warn("%s is a blacklisted key." % item)
        else:
            return self.data.__setitem__(item, value)
    def __delitem__(self, item):
        if self.blacklisted(item):
            warn("%s is a blacklisted key." % item)
        else:
            return self.data.__delitem__(item)
    def blacklisted(self, item):
        """Return True if tag is blacklisted.

        Blacklist automatically includes internal mutagen tags (those
        beginning with a tilde)."""
        for regex in self.blacklist:
            if re.search(regex, item):
                return True
        else:
            return False
    def keys(self):
        return [ key for key in self.data.keys() if not self.blacklisted(key) ]
    def write(self):
        return self.data.write()

# A list of regexps matching non-transferrable tags, like file format
# info and replaygain info. This will not be transferred from source,
# nor deleted from destination.
blacklist_regexes = [ re.compile(s) for s in (
        'encoded',
        'replaygain',
        ) ]

def copy_tags (src, dest):
    m_src = AudioFile(src, blacklist = blacklist_regexes)
    m_dest = AudioFile(dest, blacklist = m_src.blacklist)
    m_dest.clear()
    m_dest.update(m_src)
    m_dest.write()

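if __name__ == '__main__':
    args = sys.argv[1:]
    # Exit with an error message instead of carrying on with bad arguments
    if len(args) == 0:
        sys.exit("No files specified.")
    if len(args) % 2 != 0:
        sys.exit("Need an even number of files.")
    # Pair up arguments as (src, dest); a list keeps command-line order
    # and allows the same source to be given more than once
    file_pairs = zip(args[0::2], args[1::2])
    for pair in file_pairs:
        print """Copying tags from "%s" to "%s" """ % pair
        copy_tags(pair[0], pair[1])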

I have tested this script for copying between flac, ogg, and mp3, with "standard" tags, as well as arbitrary tags. It has worked perfectly so far.
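
The blacklist behaviour can be tried out without Quod Libet installed. The following is only a sketch of the same semantics on plain dicts, not the script itself; BLACKLIST, blacklisted and this standalone copy_tags are illustrative names, and the tag values are invented.

```python
import re

# Patterns for tags that must be neither copied nor clobbered.
BLACKLIST = [re.compile(p) for p in ("^~", "encoded", "replaygain")]


def blacklisted(key, blacklist=BLACKLIST):
    """Return True if key matches any blacklist pattern."""
    return any(regex.search(key) for regex in blacklist)


def copy_tags(src, dest, blacklist=BLACKLIST):
    """Return the destination tags after copying src over dest.

    Blacklisted keys are never taken from src and never removed from
    dest; every other key in dest is replaced by what src provides.
    """
    result = {k: v for k, v in dest.items() if blacklisted(k, blacklist)}
    result.update((k, v) for k, v in src.items()
                  if not blacklisted(k, blacklist))
    return result


src = {"artist": ["New Artist"], "replaygain_track_gain": ["-3.2 dB"]}
dest = {"artist": ["Old Artist"], "genre": ["Rock"],
        "replaygain_track_gain": ["-7.1 dB"]}
result = copy_tags(src, dest)
print(result)
```

In this run the destination keeps its own replaygain value, takes the artist from the source, and loses the genre tag, since the source has none.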

Edit: I should mention that my class is called AudioFile, the same as QLLib's, because I started out using QLLib and then switched to my own implementation, and I was too lazy to change the class name. The point is that I haven't actually used any of QLLib's code.

As for the reason that I abandoned QLLib, it didn't work for me. I suspect it was getting the same config-related errors as I was, but was silently ignoring them and simply failing to write tags.

Ryan Thompson
And, as predicted by Murphy's law, QLLib started working perfectly for me as soon as I finished writing this script.
Ryan Thompson