Extending MusicBrainz to hold audio checksums


Eric Shattow
How can we extend MusicBrainz so that people ripping digital media may verify their audio content is correctly media-shifted from audio CD sources?

There is a harder problem underneath this. Reading CD audio is an inexact science:

- Labels may release multiple pressings of the same content, leading to the same audio at slightly differing sample offsets
- Consumer CD drives are not consistent about the sample offset at which they start reading
- No drive on the market can reliably detect all errors when reading audio from CD media

If we can store file audio data checksums in a meaningful way, should we?

Possible approach:

1) No "zero" reference. Cover all possible read offsets for the drives available on the market today by computing a large number of SHA-1 checksums (3072) per track, plus the same number for the album as a whole. This amounts to 33KiB * (num tracks + 1) of checksum data per release. The benefit is that it works without drive calibration, since most CD pressings contain no useful data in the missing samples (at most about 5 sectors, i.e. 5/75 of a second of audio). The drawbacks are the time required to compute the checksums and the nearly 400KiB of checksum data per release.

2) As above, but with a "zero" reference. Maintain a list of approved "zero offset" drives (I own such a drive, the Plextor PX-712SA). This reference differs from the one used by the AccurateRip(TM) method by 30 samples. Checksums stored in the database would be moderated and voted on only by people submitting from approved hardware. This reduces the data storage requirement to less than 5KiB per release. Client verification software still carries the heavy computational load of generating all possible checksums, as described in method #1.

3) Guess what the actual audio content of the release is, and cover all possible read offsets for that audio content as a whole. It would also have some kind of "inner checksum" calculated at an offset from the start and end of the useful audio. Required storage is less than 70KiB per release.
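
To make the "cover all possible read offsets" idea in the approaches above concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the scheme AccurateRip or any existing ripper uses: the function names are hypothetical, a read offset is modelled as shifting the 16-bit stereo PCM stream by whole samples and padding the gap with silence, and `max_offset` is a free parameter rather than the 3072-checksum range from the post.

```python
import hashlib
from typing import Dict, Optional

BYTES_PER_SAMPLE = 4  # one stereo sample of 16-bit CD audio (2 channels x 2 bytes)


def shift_pcm(pcm: bytes, offset: int) -> bytes:
    """Simulate a drive read offset (assumption: the gap is filled with silence).

    A positive offset drops samples from the start and pads the end;
    a negative offset pads the start and drops samples from the end.
    The length of the data is preserved.
    """
    n = abs(offset) * BYTES_PER_SAMPLE
    if n == 0:
        return pcm
    if n >= len(pcm):
        return bytes(len(pcm))
    pad = bytes(n)
    if offset > 0:
        return pcm[n:] + pad
    return pad + pcm[:-n]


def offset_checksums(pcm: bytes, max_offset: int) -> Dict[int, str]:
    """SHA-1 of the track as it would look at every offset in [-max_offset, max_offset]."""
    return {
        k: hashlib.sha1(shift_pcm(pcm, k)).hexdigest()
        for k in range(-max_offset, max_offset + 1)
    }


def find_offset(rip: bytes, table: Dict[int, str]) -> Optional[int]:
    """Return the offset whose stored checksum matches the rip, or None if none does."""
    digest = hashlib.sha1(rip).hexdigest()
    for k, stored in table.items():
        if stored == digest:
            return k
    return None
```

A client would compute one SHA-1 over its own rip and look it up with `find_offset`; a match both verifies the audio and reveals the offset relative to the reference. The cost is visible in `offset_checksums`: the whole track is hashed once per candidate offset, which is the computational load the post warns about.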

Verifying CD audio rips does walk the line between what is and is not within the purpose of the MusicBrainz database.

Thoughts? Comments welcomed.
_______________________________________________
MusicBrainz-users mailing list
[hidden email]
http://lists.musicbrainz.org/mailman/listinfo/musicbrainz-users

Re: Extending MusicBrainz to hold audio checksums

Per Øyvind Øygard
On Fri, Jun 5, 2009 at 12:32 PM, Eric Shattow <[hidden email]> wrote:
> How can we extend MusicBrainz so that people ripping digital media may verify their audio content is correctly media-shifted from audio CD sources?

It's an interesting technical question, but I'm not sure I see the value of this in an MBz context. For one, AccurateRip already does this reasonably well, and will likely do it even better in the near future. See http://forum.dbpoweramp.com/showpost.php?p=87227&postcount=5

It's also a very niche thing, and would likely only be interesting to a small group of MBz users (hardcore flac pirates). If Picard had ripping capabilities it would be somewhat different, but I don't see that happening anytime soon, if ever.

-- 
Per / Wizzcat 


Re: Extending MusicBrainz to hold audio checksums

Nikolai Prokoschenko
Administrator


Per Øyvind Øygard wrote:
>
> It's also a very niche thing, and would likely only be interesting to a
> small group of MBz users (hardcore flac pirates). If Picard had ripping
> capabilities it would be somewhat different, but I don't see that happening
> anytime soon, if ever.
>

I'd be pretty interested in some checksum on the audio to check whether some
small bits of my music collection have been corrupted by a datasystem error
or something like that. Not sure though if that's the scope...

Nikolai

Re: Extending MusicBrainz to hold audio checksums

Nikolai Prokoschenko
Administrator



Nikolai Prokoschenko wrote:
>
> ... corrupted by a datasystem error ...
>

That's "filesystem" of course, I need sleep :(

Nikolai.

Re: Extending MusicBrainz to hold audio checksums

Per Øyvind Øygard
In reply to this post by Nikolai Prokoschenko


On Mon, Jun 8, 2009 at 1:09 AM, Nikolai Prokoschenko <[hidden email]> wrote:

> Per Øyvind Øygard wrote:
>>
>> It's also a very niche thing, and would likely only be interesting to a
>> small group of MBz users (hardcore flac pirates). If Picard had ripping
>> capabilities it would be somewhat different, but I don't see that happening
>> anytime soon, if ever.
>>
>
> I'd be pretty interested in some checksum on the audio to check whether some
> small bits of my music collection have been corrupted by a datasystem error
> or something like that. Not sure though if that's the scope...

Sure, though the OP was talking about offset issues specifically, which are a non-issue if you do the rips yourself.

Checksumming your music is indeed quite useful, but this can be done to great effect with AccurateRip (try CUE Tools, it can verify and tag your collection in one pass). Duplicating the work AR has done will be incredibly difficult, since MBz has no ripper behind it, and you pretty much need one to reach a critical mass of checksums.
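
For the narrower case of spotting filesystem corruption in files you already own, no shared database is needed. The following is a minimal local sketch in Python, not how AccurateRip or CUE Tools work: it hashes whole files with SHA-256 and compares them against a previously saved manifest. The function names are hypothetical, and because the whole file is hashed, retagging a file will also change its digest.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """SHA-256 of a file, read in chunks so large audio files are not loaded at once."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root: Path) -> Dict[str, str]:
    """Map each file under root (by relative path) to its digest."""
    return {
        str(p.relative_to(root)): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(root: Path, manifest: Dict[str, str]) -> List[str]:
    """List files that are missing or whose content no longer matches the manifest."""
    bad = []
    for rel, digest in manifest.items():
        p = root / rel
        if not p.is_file() or file_digest(p) != digest:
            bad.append(rel)
    return bad
```

The manifest would be built once while the collection is known to be good, stored somewhere safe (for example as JSON), and checked periodically with `changed_files`.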

-- 
Per / Wizzcat 
