EsoErik

Sunday, July 26, 2009

 

retrieving original iPod music filenames from an iPodDB

I noticed that when iTunes writes a track to an iPod, it hashes the track filename. For example, an MP3 on an iPod might be called HQXA.mp3 and live in one of many directories with short numeric names. Incidentally, this naming style resembles the one used by ccache, Squid, and other applications that store a large number of files.
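
To illustrate the general idea of spreading many files over a fixed set of numbered directories, here is a minimal sketch. It is not iTunes's actual algorithm, which I have not examined; the `bucketed_path` helper, the MD5 digest, the bucket count of 50, and the `F00`-style directory prefix are all assumptions made for the example.

```ruby
require 'digest/md5'

# Hypothetical scheme for illustration only; not what iTunes actually does.
# Maps a descriptive filename to "<bucket directory>/<short name><extension>".
def bucketed_path(original_name, bucket_count = 50)
  digest = Digest::MD5.hexdigest(original_name)
  bucket = digest[0, 8].to_i(16) % bucket_count   # pick one of the numbered directories
  short  = digest[8, 4].upcase                    # four-character stand-in for the hashed name
  ext    = File.extname(original_name)            # keep the extension so the file type is visible
  format("F%02d/%s%s", bucket, short, ext)
end

puts bucketed_path("01 - So What.mp3")
```

The same input always maps to the same path, and files end up spread across the buckets, which keeps any single directory from growing too large.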

I wanted to import a large number of such files from an iPod into iTunes. iTunes can automatically organize imported tracks by the information stored in each track (such as artist name, album name, track number, disc number). Thus, for tracks imported into iTunes, filenames are irrelevant - provided that those tracks include tag information. Some of the tracks that I wanted to import included no useful information in their tags. The original track filenames were descriptive and obviously stored somewhere on the iPod - when the iPod played these tracks, it showed their original filenames (minus extension). I browsed through the data files on the iPod and quickly found that iPod_Control/iTunes/iTunesDB appeared to contain track names in little-endian UCS-2 or UTF-16 format, in addition to a significant amount of other data.
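
As a small sketch of what such strings look like at the byte level, the following encodes a made-up title as UTF-16LE and decodes it again. The title and the variable names are assumptions for the example; nothing here is read from a real iTunesDB.

```ruby
# A made-up title, encoded the way the names appear to be stored (UTF-16LE).
title = "So What"
bytes = title.encode("UTF-16LE").force_encoding("BINARY")

# For plain ASCII text, every second byte is zero; this is what stands out in a hex editor.
ascii_like = bytes.bytes.each_slice(2).all? { |_low, high| high == 0 }

# Decoding reverses the process.
decoded = bytes.dup.force_encoding("UTF-16LE").encode("UTF-8")

# UCS-2 and UTF-16 differ only outside the Basic Multilingual Plane:
# UTF-16 needs two 16-bit units (a surrogate pair) for such a character.
clef = "\u{1D11E}".encode("UTF-16LE")

puts bytes.bytesize   # prints 14
puts ascii_like       # prints true
puts decoded          # prints So What
puts clef.bytesize    # prints 4
```

For names made only of characters in the Basic Multilingual Plane, the two encodings produce identical bytes, which is why inspection alone cannot tell them apart.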

On a flight from Boston to Reykjavik, I wore down my laptop battery exploring the iTunesDB file from an iPod containing the music I wished to recover. Looking at the file in a hex editor, I quickly recognized that the iTunesDB is a database composed of a hierarchical record structure. Records are delimited by plain ASCII strings beginning with "mh"; offsets and sizes are specified as 32-bit little-endian integers that are probably unsigned (I would need a 2 GiB+ iTunesDB to verify this - mine was just 500 KiB, and the DB does not use virtual addressing).
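
The record layout described above can be sketched as follows: read a four-byte ASCII tag, then the 32-bit little-endian integer after it. The sample bytes are synthetic, the value 104 is an arbitrary number chosen for the demo, and `read_record_header` is a hypothetical helper rather than part of the module listed below.

```ruby
require 'stringio'

# Reads a 4-byte ASCII tag and the 32-bit little-endian integer that follows it.
def read_record_header(io)
  tag = io.read(4)
  raise "not a record tag: #{tag.inspect}" unless tag && tag.start_with?("mh")
  size = io.read(4).unpack("V")[0]   # "V" = 32-bit unsigned, little-endian
  [tag, size]
end

# Synthetic 8-byte sample: the tag "mhbd" followed by the value 104.
sample = ["mhbd", 104].pack("a4V")
tag, size = read_record_header(StringIO.new(sample))
puts "#{tag} #{size}"      # prints "mhbd 104"

# Signedness only matters once the top bit is set, i.e. for values of 2 GiB and up.
big = [0x80000000].pack("V")
puts big.unpack("V")[0]    # prints 2147483648
puts big.unpack("l<")[0]   # prints -2147483648
```

The last two lines show why a small database cannot settle the signed-versus-unsigned question: both interpretations agree for every value below 2^31.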

To keep myself occupied as I overcame jetlag upon reaching Norway, I wrote a Ruby module and a front-end application that retrieve the original filenames from an iTunesDB and rename the hashed files accordingly. It worked well with my data; that said, I haven't tested it with any other data, so be careful if you use my application.


fix_iPod_filenames.rb (the frontend application)

#!/usr/bin/env ruby
#
# (c) Erik Hvatum, 2009
#
# Requires Ruby 1.9 or later (the iTunesDB module uses String#force_encoding).

require 'pathname'
require 'fileutils'
require_relative 'iTunesDB'

if ARGV.size != 2
  abort "Usage: fix_iPod_filenames.rb <directory containing iPod_Control> <directory in which to place renamed music files>"
end

iPodRoot = Pathname.new(File.expand_path(ARGV[0]))
dbPath = iPodRoot + "iPod_Control/iTunes/iTunesDB"
if !dbPath.exist?
  abort "iPod database \"#{dbPath}\" does not exist or is inaccessible."
end
destPath = Pathname.new(File.expand_path(ARGV[1]))
if !destPath.exist?
  abort "Destination path \"#{destPath}\" does not exist or is inaccessible."
end

# Load every track record into memory, then release the database.
db = MiTunesDB::CiTunesDB.new
db.open(dbPath)
tracks = db.tracks
db.close
db = nil

# Tracks without a location string cannot be found on disk, so skip them.
tracks.reject! { |track| track.location.nil? }
tracks.sort! {|l, r| l.location <=> r.location}
tracks.each do |track|
  # Join whichever of these fields are present with "-" to form the new name.
  fields = [:album, :discNumber, :trackNumber, :artist, :title]
  dstFn = fields.map { |field| track[field] }.compact.map { |v| v.to_s }.join("-")
  # Replace characters that are unsafe in filenames.
  dstFn = dstFn.gsub(/[*!?\\\/:%]/, "_")
  # Locations use ":" as the path separator, so convert them to "/".
  src = iPodRoot.to_s + track.location.gsub(/:/, "/")
  dst = destPath.to_s + "/" + dstFn[0, 250] + "." + track.format.downcase.gsub(/ $/, "")
  FileUtils.mv(src, dst)
end
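
To make the renaming rule concrete, here is a small self-contained sketch of the two string transformations the frontend performs. The helper names `destination_name` and `source_path`, the sample track values, the mount point, and the `F00` directory are all assumptions for illustration; only the transformations themselves come from the script above.

```ruby
# Hypothetical helpers; the logic mirrors the frontend's steps.

# Joins the non-nil fields with "-", replaces unsafe characters, truncates to
# 250 characters, and appends the lower-cased format as the extension.
def destination_name(fields, format)
  base = fields.compact.map { |v| v.to_s }.join("-").gsub(/[*!?\\\/:%]/, "_")
  base[0, 250] + "." + format.downcase.gsub(/ $/, "")
end

# Converts a ":"-separated location into a path below the iPod's root.
def source_path(ipod_root, location)
  ipod_root + location.gsub(/:/, "/")
end

# Fields in the frontend's order: album, disc number, track number, artist, title.
puts destination_name(["Kind of Blue", nil, 1, "Miles Davis", "So What?"], "MP3 ")
# prints "Kind of Blue-1-Miles Davis-So What_.mp3"

puts source_path("/mnt/ipod", ":iPod_Control:Music:F00:HQXA.mp3")
# prints "/mnt/ipod/iPod_Control/Music/F00/HQXA.mp3"
```

Missing fields (here the disc number) simply drop out of the name, so tracks with sparse metadata still get the most descriptive filename available.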

iTunesDB.rb (the module used by the frontend application)

# (c) Erik Hvatum, 2009

module MiTunesDB

  # Reads the binary DB written by iTunes Windows to an iPod circa late 2008
  class CiTunesDB
    Track = Struct.new(:title,
                       :location,
                       :album,
                       :artist,
                       :genre,
                       :fileType,
                       :comment,
                       :composer,
                       :grouping,
                       :description,
                       :albumArtist,
                       :format,
                       :trackNumber,
                       :discNumber)
    attr :mhbd
    attr :file

    def initialize
      @mhbd = nil
      @file = nil
    end

    # Opens the iTunesDB specified by filename, loading its contents into memory
    def open(fileName)
      @file = File.open(fileName, "rb")
      @mhbd = Cmhbd.new(self)
    end

    # Closes the iTunesDB and releases the underlying file handle
    def close()
      @mhbd = nil
      @file.close if @file
      @file = nil
    end

    # Returns all known tracks as an array of Track structures (defined above)
    def tracks()
      ts = Array.new
      if @mhbd
        @mhbd.children.each do |mhsd|
          if mhsd.children
            mhsd.children.each do |sdChild|
              if sdChild.is_a? Cmhlt
                sdChild.children.each do |mhit|
                  t = Track.new
                  ts << t
                  # Copy each recognized string record into the matching Track field.
                  mhit.children.each do |mhod|
                    if Cmhod::Types.has_key? mhod.type
                      t[Cmhod::Types[mhod.type]] = mhod.str
                    end
                  end
                  t.format = mhit.format
                  t.trackNumber = mhit.trackNumber
                  t.discNumber = mhit.discNumber
                end
              end
            end
          end
        end
      end
      return ts
    end

    # Base class for objects in the iTunes DB
    class Cmh_base
      attr :addr
      attr :depth
      attr :len
      attr :recordName
      attr :children

      def initialize(addr, db, depth)
        @addr = addr
        @depth = depth
        @len = nil
        @db = db
        @children = nil
        load()
      end

      def to_s()
        indent = "\t" * @depth
        return "#{indent}#{@recordName}:\n#{indent} len: #@len\n"
      end

      protected

      def load(loadLen = true)
        if @db.file.tell != @addr
          @db.file.seek(@addr)
        end
        recordName = @db.file.read(@recordName.length())
        if recordName != @recordName
          raise "Invalid record identifier in DB: expected \"#@recordName\" but read \"#{recordName}\"."
        end
        @len = readUInt32() if loadLen
      end

      def loadChildren()
        @children = []
      end

      def readUInt32()
        return @db.file.read(4).unpack("V")[0]
      end

      def children_to_s()
        ret = String.new
        if @children
          @children.each { |child| ret << child.to_s() }
        end
        return ret
      end
    end

    # Record describing the database. This is the first record in the database and is located at the beginning
    # of the database file.
    class Cmhbd < Cmh_base
      attr :dbVersion
      attr :dbLen

      def initialize(db)
        @dbVersion = nil
        @dbLen = nil
        @recordName = "mhbd"
        super(0, db, 0)
      end

      def to_s()
        indent = "\t" * @depth
        return super() << "#{indent} dbVersion: #@dbVersion\n#{indent} dbLen: #@dbLen\n" << children_to_s()
      end

      protected

      def load()
        super
        seekTo = @addr + @recordName.length() + 4
        if @db.file.tell != seekTo
          @db.file.seek(seekTo)
        end
        @dbLen = readUInt32()
        @dbVersion = Array.new(3) { readUInt32() }
        loadChildren()
      end

      def loadChildren()
        super
        childAddr = @addr + @len
        while childAddr < @dbLen
          @children << Cmhsd.new(childAddr, @db, @depth + 1)
          childAddr += @children[-1].childSize
        end
      end
    end

    # Record describing a dataset.
    class Cmhsd < Cmh_base
      # Adding childSize to @addr gives the addr of the next mhsd record
      attr :childSize
      attr :childType

      def initialize(addr, db, depth)
        @recordName = "mhsd"
        @childSize = nil
        @childType = nil
        super
      end

      def to_s()
        indent = "\t" * @depth
        return super() << "#{indent} childSize: #@childSize\n#{indent} childType: #@childType\n" << children_to_s()
      end

      protected

      def load()
        super
        seekTo = @addr + @recordName.length() + 4
        if @db.file.tell != seekTo
          @db.file.seek(seekTo)
        end
        @childSize = readUInt32()
        @childType = readUInt32()
        loadChildren()
      end

      def loadChildren()
        super
        if @childType == 1
          @children << Cmhlt.new(@addr + @len, @db, @depth + 1)
        end
      end
    end

    # Record describing a track list
    class Cmhlt < Cmh_base
      attr :numChildren

      def initialize(addr, db, depth)
        @numChildren = nil
        @recordName = "mhlt"
        super
      end

      def to_s()
        indent = "\t" * @depth
        return super() << "#{indent} numChildren: #@numChildren\n" << children_to_s()
      end

      protected

      def load()
        super
        seekTo = @addr + @recordName.length() + 4
        @db.file.seek(seekTo)
        @numChildren = readUInt32()
        loadChildren()
      end

      def loadChildren()
        super
        addr = @addr + @len
        @numChildren.times do
          mhit = Cmhit.new(addr, @db, depth + 1)
          @children << mhit
          addr += mhit.totalLen
        end
      end
    end

    # Record describing a track item
    class Cmhit < Cmh_base
      attr :totalLen
      attr :numStrMhods
      attr :id
      attr :format
      attr :trackNumber
      attr :discNumber

      def initialize(addr, db, depth)
        @totalLen = nil
        @numStrMhods = nil
        @id = nil
        @format = nil
        @recordName = "mhit"
        @trackNumber = nil
        @discNumber = nil
        super
      end

      def to_s()
        indent = "\t" * @depth
        str = super
        str << "#{indent} totalLen: #@totalLen\n"
        str << "#{indent} numStrMhods: #@numStrMhods\n"
        str << "#{indent} id: #@id\n"
        str << "#{indent} format: #@format\n"
        str << "#{indent} trackNumber: #@trackNumber\n"
        str << "#{indent} discNumber: #@discNumber\n"
        str << children_to_s()
        return str
      end

      protected

      def load()
        super
        seekTo = @addr + @recordName.length() + 4
        @db.file.seek(seekTo)
        @totalLen = readUInt32()
        @numStrMhods = readUInt32()
        @id = readUInt32()
        @db.file.seek(4, IO::SEEK_CUR)
        # The format is stored byte-reversed (e.g. " 3PM"), so reverse it to get "MP3 ".
        @format = @db.file.read(4).reverse()
        @db.file.seek(@addr + 44)
        trackNumber = readUInt32()
        @trackNumber = trackNumber if trackNumber > 0
        # Unverified: other descriptions of this format (e.g. libgpod) appear to place the total
        # track count at offset 48 and the disc number at offset 92. Check this against your data.
        discNumber = readUInt32()
        @discNumber = discNumber if discNumber > 0
        loadChildren()
      end

      def loadChildren()
        super
        addr = @addr + @len
        numStrMhods.times do
          mhod = Cmhod.new(addr, @db, depth + 1)
          children << mhod
          addr += mhod.len
        end
      end
    end

    # Record describing a data object
    class Cmhod < Cmh_base
      Types = {1 => :title,
               2 => :location,
               3 => :album,
               4 => :artist,
               5 => :genre,
               6 => :fileType,
               8 => :comment,
               12 => :composer,
               13 => :grouping,
               14 => :description,
               22 => :albumArtist}.freeze
      attr :headerLen
      attr :type
      attr :strLen
      attr :str

      def initialize(addr, db, depth)
        @headerLen = nil
        @type = nil
        @str = nil
        @strLen = nil
        @recordName = "mhod"
        super
      end

      def to_s()
        indent = "\t" * @depth
        str = super
        str << "#{indent} headerLen: #@headerLen\n"
        str << "#{indent} type: #@type\n"
        str << "#{indent} strLen: #@strLen\n"
        str << "#{indent} str: #@str\n" if @str
        return str
      end

      protected

      def load()
        # An mhod stores its header length first and its total length second,
        # so skip the base class's length read and read both here.
        super(false)
        @db.file.seek(@addr + @recordName.length())
        @headerLen = readUInt32()
        @len = readUInt32()
        @type = readUInt32()
        @db.file.seek(12, IO::SEEK_CUR)
        @strLen = readUInt32()
        if @strLen > 0 && @strLen % 2 == 0
          @db.file.seek(8, IO::SEEK_CUR)
          begin
            # Strings are stored as UTF-16LE. Transcode to UTF-8 so that non-ASCII names survive;
            # a payload that is not valid UTF-16LE leaves @str as nil.
            @str = @db.file.read(@strLen).force_encoding("UTF-16LE").encode("UTF-8")
          rescue
            @str = nil
          end
        end
      end
    end
  end

end
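
To show the string-record layout that Cmhod#load walks through, here is a self-contained sketch that builds a synthetic "mhod" record and parses it back using the same offsets. The helpers `build_mhod` and `parse_mhod` are hypothetical, the header length of 24 is a placeholder that the parser ignores, and the sample title is made up; this has not been checked against a real iTunesDB.

```ruby
require 'stringio'

# Builds a synthetic string record using the offsets that Cmhod#load reads:
# tag (4 bytes), header length, total length, type, 12 skipped bytes,
# string length, 8 skipped bytes, then the UTF-16LE string itself.
def build_mhod(type, text)
  payload = text.encode("UTF-16LE").force_encoding("BINARY")
  header_len = 24                       # placeholder; parse_mhod never uses it
  total_len = 40 + payload.bytesize     # 40 bytes precede the string
  ["mhod", header_len, total_len, type, "", payload.bytesize, "", payload]
    .pack("a4 V3 a12 V a8 a*")          # "a12" and "a8" pad with NUL bytes
end

# Parses one string record from io, mirroring the seeks in Cmhod#load.
def parse_mhod(io)
  tag = io.read(4)
  raise "expected mhod, got #{tag.inspect}" unless tag == "mhod"
  _header_len, total_len, type = io.read(12).unpack("V3")
  io.seek(12, IO::SEEK_CUR)
  str_len = io.read(4).unpack("V")[0]
  io.seek(8, IO::SEEK_CUR)
  text = io.read(str_len).force_encoding("UTF-16LE").encode("UTF-8")
  { :type => type, :total_len => total_len, :text => text }
end

record = build_mhod(1, "Abbey Road")    # type 1 is :title in Cmhod::Types
parsed = parse_mhod(StringIO.new(record))
puts parsed[:text]        # prints "Abbey Road"
puts parsed[:total_len]   # prints 60
```

Because each record carries its own total length, a reader can step from one record to the next by adding that length to the current address, which is how the loadChildren methods above walk the file.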
