Initial working version

2022-12-22 20:22:22 +11:00
parent ce9675a1cc
commit ced7fa5092
902 changed files with 150252 additions and 0 deletions
@@ -0,0 +1,177 @@
+# audio-metadata
+
+[![Build Status](https://travis-ci.org/tmont/audio-metadata.png)](https://travis-ci.org/tmont/audio-metadata)
+[![NPM version](https://badge.fury.io/js/audio-metadata.png)](http://badge.fury.io/js/audio-metadata)
+
+This is a tinyish (2.1K gzipped) library to extract metadata from audio files.
+Specifically, it can extract [ID3v1](http://en.wikipedia.org/wiki/ID3#ID3v1),
+[ID3v2](http://en.wikipedia.org/wiki/ID3#ID3v2) and
+[Vorbis comments](http://www.xiph.org/vorbis/doc/v-comment.html)
+(i.e. metadata in [OGG containers](http://en.wikipedia.org/wiki/Ogg)).
+
+Licensed under the [WTFPL](http://www.wtfpl.net/).
+
+## What is this good for?
+The purpose of this library is to be very fast and small. It's suitable
+for server-side or client-side use: really, any platform that supports
+`ArrayBuffer` and its ilk (`Uint8Array`, etc.).
+
+I wrote it because the other libraries were large and very robust; I just
+needed something that could extract the metadata out without requiring
+30KB of JavaScript. `audio-metadata.min.js` comes in at 6.1K/2.1K
+minified/gzipped.
+
+To accomplish the small size and speed, it sacrifices several things.
+
+1. It's very naive. For example, the OGG format stipulates that the comment
+   header must come second, after the identification header. This library
+   assumes that's always true and ignores the header type byte.
+2. Text encoding is for losers. ID3v2 in particular has a lot of flexibility in
+   terms of the encoding of text for ID3 frames. This library will handle UTF8
+   properly, but everything else is just spit out as ASCII.
+3. It assumes that ID3v2 tags are always the very first thing in the file (as they
+   should be). The spec is mum on whether that's *required*, but this library
+   assumes it is.
+4. ID3v1 extended tags (the "TAG+" block) are not supported; Wikipedia suggests they
+   aren't really well-supported in media players anyway.
+
+As such, the code is a bit abstruse, in that you'll see some magic numbers, like
+`offset += 94` where it's ignoring a bunch of header data to get to the good stuff.
+Don't judge me based on this code. It works and it's tested; it's just hard to
+read.
+
+Of course, since this isn't an actual parser, invalid files will also work. This
+means, for example, you could read only the first couple hundred bytes of an MP3
+file and still extract the metadata from it, rather than requiring actual valid
+MP3 data.
+
+## Usage
+The library operates solely on `ArrayBuffer`s, or `Buffer`s for Node's convenience.
+So you'll need to preload your audio data before using this library.
+
+The library defines three methods:
+
+```javascript
+// extract comments from OGG container
+AudioMetadata.ogg(buffer);
+
+// extract ID3v2 tags
+AudioMetadata.id3v2(buffer);
+
+// extract ID3v1 tags
+AudioMetadata.id3v1(buffer);
+```
+
+The result is an object with the metadata. It attempts to normalize common keys:
+
+* *title* (`TIT1` and `TIT2` in id3v2)
+* *artist* (`TPE1` in id3v2)
+* *composer* (`TCOM` in id3v2)
+* *album* (`TALB` in id3v2)
+* *track* (`TRCK` in id3v2, commonly `TRACKNUMBER` in vorbis comments)
+* *year* (`TDRC` (date recorded) is used in id3v2)
+* *encoder* (`TSSE` in id3v2)
+* *genre* (`TCON` in id3v2)
+
+Everything else will be keyed by its original name. For id3v2,
+anything that is not a text identifier (i.e. a frame that starts with a
+"T") is ignored. This includes comments (`COMM`).
+
+### Node
+Install it using NPM: `npm install audio-metadata` or `npm install -g audio-metadata`
+if you want to use it from the shell.
+
+```javascript
+var audioMetaData = require('audio-metadata'),
+	fs = require('fs');
+
+var oggData = fs.readFileSync('/path/to/my.ogg');
+var metadata = audioMetaData.ogg(oggData);
+/*
+{
+  "title": "Contra Base Snippet",
+  "artist": "Konami",
+  "album": "Bill and Lance's Excellent Adventure",
+  "year": "1988",
+  "tracknumber": "1",
+  "track": "1",
+  "encoder": "Lavf53.21.1"
+}
+*/
+```
+
+#### From the Shell
+```
+Extract metadata from audio files
+
+USAGE
+audio-metadata --type <type> [options] file1 [file2...]
+
+OPTIONS
+--help,-h
+  This help
+--type,-t <type>
+  One of "id3v1", "id3v2" or "ogg"
+--chunk-size,-c <size>
+  Read the file in chunks of <size>; default is 512
+--quit-after,-q <length>
+  Stop searching for metadata if nothing is found after
+  <length> bytes; default is 512
+--no-colors,-z
+  Don't colorize the output
+
+EXAMPLE
+Search for metadata in the first 300 bytes in 100 byte increments
+ audio-metadata -t id3v2 -c 100 -q 300 keepitoffmy.wav
+```
+
+### Browser
+This library has been tested on current versions of Firefox and Chrome. IE
+might work, since it apparently supports `ArrayBuffer`. Safari/Opera are
+probably okayish since they're webkit. Your mileage may vary.
+
+Loading `audio-metadata.min.js` will define the `AudioMetadata` global variable.
+
+```html
+<script type="text/javascript" src="audio-metadata.min.js"></script>
+<script type="text/javascript">
+	var req = new XMLHttpRequest();
+	req.open('GET', 'http://example.com/sofine.mp3', true);
+	req.responseType = 'arraybuffer';
+
+	req.onload = function() {
+		var metadata = AudioMetadata.id3v2(req.response);
+		/*
+			{
+				"TIT2": "Foobar",
+				"title": "Foobar",
+				"TPE1": "The Foobars",
+				"artist": "The Foobars",
+				"TALB": "FUBAR",
+				"album": "FUBAR",
+				"year": "2014",
+				"TRCK": "9",
+				"track": "9",
+				"TSSE": "Lavf53.21.1",
+				"encoder": "Lavf53.21.1"
+			}
+		*/
+	};
+
+	req.send(null);
+</script>
+```
+
+## Development
+```bash
+git clone git@github.com:tmont/audio-metadata.git
+cd audio-metadata
+npm install
+npm test
+```
+
+There's a "test" (yeah, yeah) for browsers, which you can view
+by running `npm start` and then pointing your browser at
+[http://localhost:24578/tests/browser/](http://localhost:24578/tests/browser/).
+
+To build the minified browserified file, run `npm run minify`.
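The README's point that only the first couple hundred bytes are needed can be illustrated with a small Node helper. This is a minimal sketch, not part of the package: `readHead` is a hypothetical name, and it assumes a Node version that provides `Buffer.alloc`.

```javascript
// Hypothetical helper (not part of audio-metadata): read at most `maxBytes`
// from the start of a file, so a parser never has to see the audio payload.
const fs = require('fs');

function readHead(path, maxBytes) {
	const fd = fs.openSync(path, 'r');
	try {
		const buffer = Buffer.alloc(maxBytes);
		// The last argument (0) is the file position: always read from the start.
		const bytesRead = fs.readSync(fd, buffer, 0, maxBytes, 0);
		// Trim the buffer when the file is shorter than maxBytes.
		return buffer.subarray(0, bytesRead);
	} finally {
		fs.closeSync(fd);
	}
}
```

The returned `Buffer` could then be handed to one of the three parser functions, for example `audioMetadata.id3v2(readHead('/path/to/my.mp3', 512))`, where `audioMetadata` is the module required as in the Node example and the path is a placeholder.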
@@ -0,0 +1,140 @@
+#!/usr/bin/env node
+
+var fs = require('fs'),
+	audioMetadata = require('../'),
+	util = require('util'),
+	args = process.argv.slice(2),
+	type = 'id3v2',
+	chunkSize = 512,
+	quitAfter = chunkSize,
+	colorize = true,
+	files = [],
+	i;
+
+function usage() {
+	console.log('Extract metadata from audio files');
+	console.log();
+	console.log('USAGE');
+	console.log('audio-metadata --type <type> [options] file1 [file2...]');
+	console.log();
+	console.log('OPTIONS');
+	console.log('--help,-h');
+	console.log('  This help');
+	console.log('--type,-t <type>');
+	console.log('  One of "id3v1", "id3v2" or "ogg"');
+	console.log('--chunk-size,-c <size>');
+	console.log('  Read the file in chunks of <size>; default is 512');
+	console.log('--quit-after,-q <length>');
+	console.log('  Stop searching for metadata if nothing is found after ');
+	console.log('  <length> bytes; default is 512');
+	console.log('--no-colors,-z');
+	console.log('  Don\'t colorize the output');
+	console.log();
+	console.log('EXAMPLE');
+	console.log('Search for metadata in the first 300 bytes in 100 byte increments');
+	console.log(' audio-metadata -t id3v2 -c 100 -q 300 keepitoffmy.wav');
+}
+
+for (i = 0; i < args.length; i++) {
+	switch (args[i]) {
+		case '-t':
+		case '--type':
+			type = args[++i];
+			break;
+		case '-h':
+		case '--help':
+			usage();
+			process.exit(0);
+			break;
+		case '-c':
+		case '--chunk-size':
+			chunkSize = parseInt(args[++i], 10);
+			break;
+		case '-q':
+		case '--quit-after':
+			quitAfter = parseInt(args[++i], 10);
+			break;
+		case '-z':
+		case '--no-colors':
+			colorize = false;
+			break;
+		default:
+			files.push(args[i]);
+			break;
+	}
+}
+
+if (!type) {
+	console.error('--type is required');
+	process.exit(1);
+}
+if (!(type in { ogg: 1, id3v1: 1, id3v2: 1 })) {
+	console.error('Unrecognized type: ' + type);
+	process.exit(1);
+}
+
+if (!files.length) {
+	console.error('At least one file must be specified');
+	process.exit(1);
+}
+if (isNaN(chunkSize) || chunkSize < 64) {
+	console.error('Invalid chunk size');
+	process.exit(1);
+}
+if (isNaN(quitAfter)) {
+	console.error('Invalid --quit-after value');
+	process.exit(1);
+}
+if (chunkSize > quitAfter) {
+	console.error('chunk size cannot be greater than quit after value');
+	process.exit(1);
+}
+
+try {
+	for (i = 0; i < files.length; i++) {
+		//everything's done synchronously so things are printed in the expected order
+		var fd = fs.openSync(files[i], 'r'),
+			buffer = Buffer.alloc(quitAfter),
+			metadata = null,
+			offset = 0;
+
+		while (!metadata) {
+			var toRead = offset + chunkSize > quitAfter ? quitAfter - offset : chunkSize;
+			if (!toRead) {
+				break;
+			}
+
+			var bytesRead = fs.readSync(fd, buffer, offset, toRead, offset);
+			if (bytesRead === 0) {
+				//EOF
+				break;
+			}
+
+			offset += bytesRead;
+			metadata = audioMetadata[type](buffer);
+		}
+
+		fs.closeSync(fd);
+
+		if (files.length > 1) {
+			console.log(files[i] + ':');
+		}
+		if (metadata) {
+			if (colorize) {
+				console.log(util.inspect(metadata, false, null, true));
+			} else {
+				console.log(JSON.stringify(metadata, null, '  '));
+			}
+		} else {
+			console.log('no metadata found');
+		}
+
+		console.log();
+	}
+
+	process.exit(0);
+} catch (e) {
+	console.error('An error occurred trying to read from a file');
+	console.error('  ' + e.message);
+	process.exit(1);
+}
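The script above needs a `--type` value to pick a parser. One way to guess it from the data itself is to look at magic bytes: `ID3` at the start for ID3v2, the Ogg page capture pattern `OggS` at the start, and `TAG` 128 bytes before the end for ID3v1. The sketch below is an assumption about how such detection could look; `hasAscii` and `detectType` are hypothetical names, not something the package provides.

```javascript
// Hypothetical helpers: guess which parser applies from magic bytes.
// `bytes` is anything indexable with a length (Uint8Array, Buffer).
function hasAscii(bytes, offset, text) {
	if (offset < 0 || offset + text.length > bytes.length) {
		return false;
	}
	for (let i = 0; i < text.length; i++) {
		if (bytes[offset + i] !== text.charCodeAt(i)) {
			return false;
		}
	}
	return true;
}

function detectType(bytes) {
	if (hasAscii(bytes, 0, 'ID3')) {
		return 'id3v2'; // ID3v2 header at the very start
	}
	if (hasAscii(bytes, 0, 'OggS')) {
		return 'ogg'; // Ogg page capture pattern
	}
	if (hasAscii(bytes, bytes.length - 128, 'TAG')) {
		return 'id3v1'; // ID3v1 tag occupies the last 128 bytes
	}
	return null;
}
```

The `id3v1` check is only meaningful when `bytes` reaches the end of the file, since that tag is located relative to the end rather than the start.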
@@ -0,0 +1,5 @@
+module.exports = {
+	ogg: require('./src/ogg'),
+	id3v1: require('./src/id3v1'),
+	id3v2: require('./src/id3v2')
+};
@@ -0,0 +1,76 @@
+{
+  "_from": "audio-metadata@^0.3.0",
+  "_id": "audio-metadata@0.3.0",
+  "_inBundle": false,
+  "_integrity": "sha1-fVVAMfDCRO4pYjGhpV4A/3iNbOs=",
+  "_location": "/audio-metadata",
+  "_phantomChildren": {},
+  "_requested": {
+    "type": "range",
+    "registry": true,
+    "raw": "audio-metadata@^0.3.0",
+    "name": "audio-metadata",
+    "escapedName": "audio-metadata",
+    "rawSpec": "^0.3.0",
+    "saveSpec": null,
+    "fetchSpec": "^0.3.0"
+  },
+  "_requiredBy": [
+    "#USER",
+    "/"
+  ],
+  "_resolved": "https://registry.npmjs.org/audio-metadata/-/audio-metadata-0.3.0.tgz",
+  "_shasum": "7d554031f0c244ee296231a1a55e00ff788d6ceb",
+  "_spec": "audio-metadata@^0.3.0",
+  "_where": "C:\\Users\\Maspenguin\\Documents\\Programming\\MasSite",
+  "author": {
+    "name": "Tommy Montgomery",
+    "email": "tmont@tmont.com",
+    "url": "http://tmont.com/"
+  },
+  "bin": {
+    "audio-metadata": "bin/audio-metadata.js"
+  },
+  "bugs": {
+    "url": "https://github.com/tmont/audio-metadata/issues"
+  },
+  "bundleDependencies": false,
+  "deprecated": false,
+  "description": "Extract metadata from audio files",
+  "devDependencies": {
+    "browserify": "3.19.1",
+    "mocha": "1.16.2",
+    "serve": "1.3.0",
+    "should": "2.1.1",
+    "uglify-js": "2.4.8"
+  },
+  "files": [
+    "index.js",
+    "audio-metadata.min.js",
+    "src",
+    "bin",
+    "README.md"
+  ],
+  "homepage": "https://github.com/tmont/audio-metadata#readme",
+  "keywords": [
+    "id3",
+    "metadata",
+    "mp3",
+    "ogg",
+    "wav",
+    "audio"
+  ],
+  "license": "WTFPL",
+  "name": "audio-metadata",
+  "repository": {
+    "type": "git",
+    "url": "git+https://github.com/tmont/audio-metadata.git"
+  },
+  "scripts": {
+    "build": "browserify -s AudioMetadata -e index.js --bare > audio-metadata.js",
+    "minify": "npm run build && ./node_modules/.bin/uglifyjs audio-metadata.js > audio-metadata.min.js && rm audio-metadata.js",
+    "start": "serve -p 24578 .",
+    "test": "mocha -R spec tests"
+  },
+  "version": "0.3.0"
+}
@@ -0,0 +1,54 @@
+var utils = require('./utils');
+
+function checkMagicId3v1(view) {
+	var id3Magic = utils.readBytes(view, view.byteLength - 128, 3);
+	//"TAG"
+	return id3Magic[0] === 84 && id3Magic[1] === 65 && id3Magic[2] === 71;
+}
+
+module.exports = function(buffer) {
+	//read last 128 bytes
+	var view = utils.createView(buffer);
+	if (!checkMagicId3v1(view)) {
+		return null;
+	}
+
+	function trim(value) {
+		return value.replace(/[\s\u0000]+$/, '');
+	}
+
+	try {
+		var offset = view.byteLength - 128 + 3,
+			readAscii = utils.readAscii;
+		var title = readAscii(view, offset, 30),
+			artist = readAscii(view, offset + 30, 30),
+			album = readAscii(view, offset + 60, 30),
+			year = readAscii(view, offset + 90, 4);
+
+		offset += 94;
+
+		var comment = readAscii(view, offset, 28),
+			track = null;
+		offset += 28;
+		if (view.getUint8(offset) === 0) {
+			//next byte is the track
+			track = view.getUint8(offset + 1);
+		} else {
+			comment += readAscii(view, offset, 2);
+		}
+
+		offset += 2;
+		var genre = view.getUint8(offset);
+		return {
+			title: trim(title),
+			artist: trim(artist),
+			album: trim(album),
+			year: trim(year),
+			comment: trim(comment),
+			track: track,
+			genre: genre
+		};
+	} catch (e) {
+		return null;
+	}
+};
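The offsets used in the file above (`+ 30`, `+ 60`, `+= 94` and so on) follow the fixed 128-byte ID3v1 layout. The sketch below spells that layout out as a table and runs it against a synthetic tag. It is an independent illustration, not the package's code: `ID3V1_FIELDS`, `readLatin`, `parseId3v1` and `writeAscii` are names made up here, and it works on a plain `Uint8Array` rather than a `DataView`.

```javascript
// Fixed layout of the 128-byte ID3v1 tag: [field, offset from "TAG", length].
const ID3V1_FIELDS = [
	['title', 3, 30],
	['artist', 33, 30],
	['album', 63, 30],
	['year', 93, 4],
	['comment', 97, 28]
];

function readLatin(bytes, offset, length) {
	let s = '';
	for (let i = 0; i < length; i++) {
		s += String.fromCharCode(bytes[offset + i]);
	}
	return s;
}

function parseId3v1(bytes) {
	const start = bytes.length - 128;
	if (start < 0 || readLatin(bytes, start, 3) !== 'TAG') {
		return null;
	}
	const tag = {};
	ID3V1_FIELDS.forEach(function(field) {
		tag[field[0]] = readLatin(bytes, start + field[1], field[2]);
	});
	if (bytes[start + 125] === 0) {
		// ID3v1.1: a zero byte followed by a one-byte track number
		tag.track = bytes[start + 126];
	} else {
		// plain ID3v1: the comment uses all 30 bytes
		tag.track = null;
		tag.comment += readLatin(bytes, start + 125, 2);
	}
	tag.genre = bytes[start + 127];
	ID3V1_FIELDS.forEach(function(field) {
		// strip the padding (spaces or NULs) that fills unused bytes
		tag[field[0]] = tag[field[0]].replace(/[\s\u0000]+$/, '');
	});
	return tag;
}

// Build a synthetic tag to exercise the parser.
function writeAscii(bytes, offset, text) {
	for (let i = 0; i < text.length; i++) {
		bytes[offset + i] = text.charCodeAt(i);
	}
}

const sampleTag = new Uint8Array(128); // zero-filled
writeAscii(sampleTag, 0, 'TAG');
writeAscii(sampleTag, 3, 'Contra Base Snippet');
writeAscii(sampleTag, 33, 'Konami');
writeAscii(sampleTag, 93, '1988');
sampleTag[126] = 1;  // track number (byte 125 stays 0)
sampleTag[127] = 36; // genre index
console.log(parseId3v1(sampleTag));
```

Keeping the layout in a table makes the magic numbers easier to audit, at the cost of a few more bytes than the inline offsets the library uses.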
@@ -0,0 +1,124 @@
+var utils = require('./utils');
+
+function checkMagicId3(view, offset) {
+	var id3Magic = utils.readBytes(view, offset, 3);
+	//"ID3"
+	return id3Magic[0] === 73 && id3Magic[1] === 68 && id3Magic[2] === 51;
+}
+
+function getUint28(view, offset) {
+	var sizeBytes = utils.readBytes(view, offset, 4);
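+	//bytes never exceed 0xff, so this mask clears nothing; a strict synchsafe
+	//decode would mask each byte with 0x7f
+	var mask = 0xfffffff;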
+	return ((sizeBytes[0] & mask) << 21) |
+		((sizeBytes[1] & mask) << 14) |
+		((sizeBytes[2] & mask) << 7) |
+		(sizeBytes[3] & mask);
+}
+
+//http://id3.org/id3v2.3.0
+//http://id3.org/id3v2.4.0-structure
+//http://id3.org/id3v2.4.0-frames
+module.exports = function(buffer) {
+	var view = utils.createView(buffer);
+	if (!checkMagicId3(view, 0)) {
+		return null;
+	}
+
+	var offset = 3;
+	//var majorVersion = view.getUint8(offset);
+	offset += 2;
+	var flags = view.getUint8(offset);
+	offset++;
+	var size = getUint28(view, offset);
+	offset += 4;
+
+	var extendedHeader = (flags & 128) > 0;
+
+	if (extendedHeader) {
+		offset += getUint28(view, offset);
+	}
+
+	function readFrame(offset) {
+		try {
+			var id = utils.readAscii(view, offset, 4);
+			var size = getUint28(view, offset + 4);
+			offset += 10; //+2 more for flags we don't care about
+
+			if (id[0] !== 'T') {
+				return {
+					id: id,
+					size: size + 10
+				};
+			}
+
+			var encoding = view.getUint8(offset),
+				data = '';
+
+			if (encoding <= 3) {
+				offset++;
+				if (encoding === 3) {
+					//UTF8 - null terminated
+					data = utils.readUtf8(view, offset, size - 1);
+				} else {
+					//ISO-8859-1, UTF-16, UTF-16BE
+					//UTF-16 and UTF-16BE are $00 $00 terminated
+					//ISO is null terminated
+
+					//screw these encodings, read it as ascii
+					data = utils.readAscii(view, offset, size - 1);
+				}
+			} else {
+				//no encoding info, read it as ascii
+				data = utils.readAscii(view, offset, size);
+			}
+
+			//id3v2.4 is supposed to have encoding terminations, but sometimes
+			//they don't? meh.
+			data = utils.trimNull(data);
+
+			return {
+				id: id,
+				size: size + 10,
+				content: data
+			};
+		} catch (e) {
+			return null;
+		}
+	}
+
+	var idMap = {
+		TALB: 'album',
+		TCOM: 'composer',
+		TIT1: 'title',
+		TIT2: 'title',
+		TPE1: 'artist',
+		TRCK: 'track',
+		TSSE: 'encoder',
+		TDRC: 'year',
+		TCON: 'genre'
+	};
+
+	var endOfTags = offset + size,
+		frames = {};
+	while (offset < endOfTags) {
+		var frame = readFrame(offset);
+		if (!frame) {
+			break;
+		}
+
+		offset += frame.size;
+		if (!frame.content) {
+			continue;
+		}
+		var id = idMap[frame.id] || frame.id;
+		if (id === 'TXXX') {
+			var nullByte = frame.content.indexOf('\u0000');
+			id = frame.content.substring(0, nullByte);
+			frames[id] = frame.content.substring(nullByte + 1);
+		} else {
+			frames[id] = frames[frame.id] = frame.content;
+		}
+	}
+
+	return frames;
+};
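`getUint28` in the file above reads ID3v2 "synchsafe" sizes: four bytes, seven useful bits each, most significant first. The following sketch shows a strict version of that decoding next to its inverse and a plain big-endian read. The function names are hypothetical, and the code is an illustration rather than a drop-in replacement, since the library applies the same routine to every size field regardless of tag version.

```javascript
// Decode a 4-byte synchsafe integer (7 useful bits per byte, big-endian).
function decodeSynchsafe(bytes, offset) {
	return ((bytes[offset] & 0x7f) << 21) |
		((bytes[offset + 1] & 0x7f) << 14) |
		((bytes[offset + 2] & 0x7f) << 7) |
		(bytes[offset + 3] & 0x7f);
}

// Encode a value below 2^28 as four synchsafe bytes.
function encodeSynchsafe(value) {
	return [
		(value >> 21) & 0x7f,
		(value >> 14) & 0x7f,
		(value >> 7) & 0x7f,
		value & 0x7f
	];
}

// Plain big-endian 32-bit integer, for comparison.
function decodeUint32BE(bytes, offset) {
	return ((bytes[offset] << 24) |
		(bytes[offset + 1] << 16) |
		(bytes[offset + 2] << 8) |
		bytes[offset + 3]) >>> 0;
}

console.log(decodeSynchsafe([0, 0, 2, 1], 0)); // 257
console.log(encodeSynchsafe(257));             // [ 0, 0, 2, 1 ]
console.log(decodeUint32BE([0, 0, 2, 1], 0));  // 513
```

The comparison matters because the tag header size is synchsafe in both ID3v2.3 and ID3v2.4, while frame sizes are synchsafe only in ID3v2.4; ID3v2.3 frames store a plain 32-bit size.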
@@ -0,0 +1,79 @@
+var utils = require('./utils');
+
+/**
+ * See http://www.ietf.org/rfc/rfc3533.txt
+ * @param {Buffer|ArrayBuffer} buffer
+ */
+module.exports = function(buffer) {
+	var view = utils.createView(buffer);
+
+	function parsePage(offset, withPacket) {
+		if (view.byteLength < offset + 27) {
+			return null;
+		}
+
+		var numPageSegments = view.getUint8(offset + 26),
+			segmentTable = utils.readBytes(view, offset + 27, numPageSegments),
+			headerSize = 27 + numPageSegments;
+
+		if (!segmentTable.length) {
+			return null;
+		}
+
+		var
+			pageSize = headerSize + segmentTable.reduce(function(cur, next) {
+				return cur + next;
+			}),
+			length = headerSize + 1 + 'vorbis'.length,
+			packetView = null;
+
+		if (withPacket) {
+			packetView = utils.createView(new ArrayBuffer(pageSize - length));
+			utils.readBytes(view, offset + length, pageSize - length, packetView);
+		}
+
+		return {
+			pageSize: pageSize,
+			packet: packetView
+		};
+	}
+
+	function parseComments(packet) {
+		try {
+			var vendorLength = packet.getUint32(0, true),
+				commentListLength = packet.getUint32(4 + vendorLength, true),
+				comments = {},
+				offset = 8 + vendorLength,
+				map = {
+					tracknumber: 'track'
+				};
+
+			for (var i = 0; i < commentListLength; i++) {
+				var commentLength = packet.getUint32(offset, true),
+					comment = utils.readUtf8(packet, offset + 4, commentLength),
+					equals = comment.indexOf('='),
+					key = comment.substring(0, equals).toLowerCase();
+
+				comments[map[key] || key] = comments[key] = utils.trimNull(comment.substring(equals + 1));
+				offset += 4 + commentLength;
+			}
+
+			return comments;
+		} catch (e) {
+			//all exceptions are just malformed/truncated data, so we just ignore them
+			return null;
+		}
+	}
+
+	var id = parsePage(0);
+	if (!id) {
+		return null;
+	}
+
+	var commentHeader = parsePage(id.pageSize, true);
+	if (!commentHeader) {
+		return null;
+	}
+
+	return parseComments(commentHeader.packet);
+};
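`parseComments` above walks the Vorbis comment header: a little-endian 32-bit vendor length, the vendor string, a 32-bit comment count, then length-prefixed `KEY=value` strings. The sketch below builds such a block in memory and parses it back. It assumes a runtime with global `TextEncoder` and `TextDecoder`, starts at the vendor length (after the packet type byte and the `vorbis` marker, like the library's packet view), and uses hypothetical names (`buildCommentBlock`, `parseVorbisComments`).

```javascript
// Build the comment block: vendor string, comment count, then the entries.
function buildCommentBlock(vendor, entries) {
	const encoder = new TextEncoder();
	const parts = [encoder.encode(vendor)].concat(entries.map(function(entry) {
		return encoder.encode(entry);
	}));
	// 4 bytes per length prefix, plus 4 for the comment count
	const total = parts.reduce(function(sum, part) {
		return sum + 4 + part.length;
	}, 4);
	const view = new DataView(new ArrayBuffer(total));
	let offset = 0;
	parts.forEach(function(part, index) {
		view.setUint32(offset, part.length, true); // little-endian
		offset += 4;
		new Uint8Array(view.buffer, offset, part.length).set(part);
		offset += part.length;
		if (index === 0) {
			// the comment count follows the vendor string
			view.setUint32(offset, entries.length, true);
			offset += 4;
		}
	});
	return view;
}

function parseVorbisComments(view) {
	const decoder = new TextDecoder('utf-8');
	function readString(offset, length) {
		return decoder.decode(new Uint8Array(view.buffer, view.byteOffset + offset, length));
	}

	const vendorLength = view.getUint32(0, true);
	const vendor = readString(4, vendorLength);
	const count = view.getUint32(4 + vendorLength, true);
	const comments = {};
	let offset = 8 + vendorLength;

	for (let i = 0; i < count; i++) {
		const length = view.getUint32(offset, true);
		const text = readString(offset + 4, length);
		offset += 4 + length;
		const equals = text.indexOf('=');
		if (equals === -1) {
			continue; // not a KEY=value pair, skip it
		}
		// field names are case-insensitive, so normalize to lowercase
		comments[text.slice(0, equals).toLowerCase()] = text.slice(equals + 1);
	}

	return { vendor: vendor, comments: comments };
}

const block = buildCommentBlock('Lavf53.21.1', [
	'TITLE=Contra Base Snippet',
	'TRACKNUMBER=1'
]);
console.log(parseVorbisComments(block));
```

Unlike the library, this sketch skips entries without an `=` instead of storing them under an odd key, and it does not add the `track` alias for `tracknumber`.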
@@ -0,0 +1,69 @@
+function toArrayBuffer(buffer) {
+	var arrayBuffer = new ArrayBuffer(buffer.length);
+	var view = new Uint8Array(arrayBuffer);
+	for (var i = 0; i < buffer.length; ++i) {
+		view[i] = buffer[i];
+	}
+	return arrayBuffer;
+}
+
+module.exports = {
+	trimNull: function(s) {
+		return s.replace(/\u0000+$/, '');
+	},
+
+	createView: function(buffer) {
+		if (typeof(Buffer) !== 'undefined' && buffer instanceof Buffer) {
+			//convert nodejs buffers to ArrayBuffer
+			buffer = toArrayBuffer(buffer);
+		}
+
+		if (!(buffer instanceof ArrayBuffer)) {
+			throw new Error('Expected instance of Buffer or ArrayBuffer');
+		}
+
+		return new DataView(buffer);
+	},
+
+	readBytes: function(view, offset, length, target) {
+		//a negative offset (buffer shorter than expected) would make getUint8 throw
+		if (offset < 0 || length < 0) {
+			return [];
+		}
+
+		var bytes = [];
+		var max = Math.min(offset + length, view.byteLength);
+		for (var i = offset; i < max; i++) {
+			var value = view.getUint8(i);
+			bytes.push(value);
+			if (target) {
+				target.setUint8(i - offset, value);
+			}
+		}
+
+		return bytes;
+	},
+
+	readAscii: function(view, offset, length) {
+		if (view.byteLength < offset + length) {
+			return '';
+		}
+		var s = '';
+		for (var i = 0; i < length; i++) {
+			s += String.fromCharCode(view.getUint8(offset + i));
+		}
+
+		return s;
+	},
+
+	readUtf8: function(view, offset, length) {
+		if (view.byteLength < offset + length) {
+			return '';
+		}
+
+		var buffer = view.buffer.slice(offset, offset + length);
+
+		//http://stackoverflow.com/a/17192845 - convert byte array to UTF8 string
+		var encodedString = String.fromCharCode.apply(null, new Uint8Array(buffer));
+		return decodeURIComponent(escape(encodedString));
+	}
+};
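The two conversions in the file above (Node `Buffer` to `ArrayBuffer`, and bytes to a UTF-8 string) could also be written with built-ins on newer runtimes. The sketch below is one possible variant under that assumption: it needs a global `TextDecoder`, and `bufferToArrayBuffer` and `readUtf8Text` are hypothetical names rather than the library's API.

```javascript
// Copy exactly the bytes a Node Buffer refers to. A Buffer can be a window
// into a larger pooled ArrayBuffer, hence the byteOffset arithmetic.
function bufferToArrayBuffer(buf) {
	return buf.buffer.slice(buf.byteOffset, buf.byteOffset + buf.byteLength);
}

// Decode `length` bytes at `offset` as UTF-8, or return '' when out of range.
function readUtf8Text(view, offset, length) {
	if (offset < 0 || length < 0 || view.byteLength < offset + length) {
		return '';
	}
	const bytes = new Uint8Array(view.buffer, view.byteOffset + offset, length);
	return new TextDecoder('utf-8').decode(bytes);
}

const arrayBuffer = bufferToArrayBuffer(Buffer.from('héllo', 'utf8'));
const view = new DataView(arrayBuffer);
console.log(readUtf8Text(view, 0, view.byteLength)); // héllo
```

One behavioral difference from the `escape`/`decodeURIComponent` approach: `TextDecoder` replaces malformed byte sequences with U+FFFD by default, whereas `decodeURIComponent` throws a `URIError`.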