Saturday, May 03, 2008

Transcoding to h.264 Using FFmpeg

(Geek alert: this article is not for the technically faint of heart!)

One of the basic tasks to be done in transforming a TiVo recording into a video that plays on an Apple TV is the transcoding of the MPEG-2 of the original TiVo recording (after it has been duly decrypted) into a file that uses the MPEG-4 codec. Specifically, the h.264 variety of that codec is what Apple TV must have. Apple TV can't use MPEG-2.

If you, like me, use VisualHub for this transcoding purpose, you may (again, like me) run into situations where you need more control over what VisualHub is doing ... more control, that is, than the "ordinary" Advanced Settings in VisualHub seem to give you.

You can gain extra control by checkmarking "Force: FFmpeg Decoding" and entering appropriate FFmpeg options and arguments into the "Extra FFmpeg flags" field.

So, what is FFmpeg? FFmpeg is a standalone software tool that VisualHub "contains" in the form of a Unix executable file named vh131ffmpeg, located in the user folder ~/Library/Application Support/Techspansion/. The exact name of the executable could change in future releases of VisualHub; that's the name as of version 1.3.1 of VisualHub.

That Unix executable can be directly invoked in Terminal, if you want to play around with it without using VisualHub as an intermediary. Here's how I get to it, under username dalekhound:

$ cd /Users/dalekhound/Library/Application\ Support/Techspansion
$ ./vh131ffmpeg

The $ stands for the prompt in Terminal, which in my case actually reads:

eric-stewarts-computer:Techspansion dalekhound$

I'll leave that boilerplate out in what follows, and just use $.

Notice that I have to precede the name of the executable with ./ so as to get $ ./vh131ffmpeg. If I don't do that, the shell looks for vh131ffmpeg only in the directories on its search path, which does not include the current directory, the one I set using cd.
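If you want to see this for yourself, you can ask the shell whether it is able to resolve a name with no path in front of it. What follows is only a rough sketch: it relies on the standard command -v built-in, and the helper name found_on_path is made up for illustration.

```shell
# found_on_path NAME: succeed if the shell can resolve NAME without a
# leading path. (This also matches built-ins, functions, and aliases.)
# The helper name is hypothetical, chosen just for this example.
found_on_path() {
    command -v "$1" >/dev/null 2>&1
}

if found_on_path vh131ffmpeg; then
    echo "vh131ffmpeg can be run by its bare name"
else
    echo "vh131ffmpeg is not on the search path; use ./vh131ffmpeg from its folder"
fi
```

On a Mac set up like the one described here, the second message is the one you should expect to see.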

Just entering ./vh131ffmpeg with no options or arguments causes it to produce a voluminous list of the options you can choose ... as if you had entered ./vh131ffmpeg -help.

Another way to learn about FFmpeg's options is to visit http://ffmpeg.mplayerhq.hu/ffmpeg-doc.html.

You can also find a well-written overview of FFmpeg's usage at http://howto-pages.org/ffmpeg/.

The basic function of VisualHub's "Extra FFmpeg flags" field is to allow you to input various options and arguments to FFmpeg, just as you would if you were using the command-line interface in Terminal. Here's a very brief primer on the options and arguments.

If I enter, in Terminal:

$ ./vh131ffmpeg -i ...

and replace the ... with the path to an input file, the options I want, and the path to an output file, FFmpeg will transcode the input file and produce the output file according to those options.

For purposes of this tutorial I'm using an input file whose full path can be given as:

/Volumes/My\ Book-3/TiVo/TiVo\ Shows/Needing\ Export/From\ VideoReDo/Hello,\ Dolly\!.mpg

Notice two things. First, the levels of the folder hierarchy are separated from each other by a forward slash /. Second, each space, the comma, and the exclamation point in the pathname is preceded by an "escape" character: a backward slash \. The escapes tell the shell to take those characters literally rather than, say, splitting the pathname at each space.

Alternatively, the whole pathname, or any part thereof that would otherwise have to contain internal escape characters, can be enclosed in single quotes ('). For example:

'/Volumes/My Book-3/TiVo/TiVo Shows/Needing Export/From VideoReDo/Hello, Dolly!'.mpg
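To convince yourself that the two spellings really name the same file, you can have the shell compare them. This is just a small sketch; the variable names escaped and quoted are invented for the example.

```shell
# The same pathname written two ways: with backslash escapes, and with
# single quotes around the part that contains special characters.
escaped=/Volumes/My\ Book-3/TiVo/TiVo\ Shows/Needing\ Export/From\ VideoReDo/Hello,\ Dolly\!.mpg
quoted='/Volumes/My Book-3/TiVo/TiVo Shows/Needing Export/From VideoReDo/Hello, Dolly!'.mpg

# After the shell has processed the escapes and quotes, both variables
# should hold the identical string.
if [ "$escaped" = "$quoted" ]; then
    echo "same"
else
    echo "different"
fi
```

It prints "same": by the time a program such as FFmpeg sees the pathname, the backslashes and quote marks are gone.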

The output file's full path name can be exactly the same as that for the input file, except that .mp4 is substituted for .mpg. This is appropriate for two reasons. One, you don't want to overwrite the input file. Two, the output file (in this case) will be in a format appropriate to the MPEG-4/h.264 codec, and as such should have the extension .mp4.
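If you would rather not retype the name, the shell can swap the extension for you. Here is a minimal sketch using ordinary parameter expansion; the function name mp4_name_for is hypothetical.

```shell
# mp4_name_for PATH: print PATH with a trailing .mpg replaced by .mp4.
# (Hypothetical helper; a path that does not end in .mpg simply gains
# .mp4 at the end.)
mp4_name_for() {
    printf '%s\n' "${1%.mpg}.mp4"
}

mp4_name_for '/Volumes/My Book-3/TiVo/TiVo Shows/Needing Export/From VideoReDo/Hello, Dolly!.mpg'
```

The ${1%.mpg} part strips the old extension from the end of the first argument, and .mp4 is then tacked on.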

Rather than laboriously type in the whole path name, try typing just ./vh131ffmpeg -i , with a trailing space character, after which you can simply drag the Finder icon for the input file into the Terminal window. The properly specified path name will magically appear, with all the required escape characters. Then type in the options you want to replace the ellipsis ( ... ) with, in the above template. Finally, again drag the Finder icon for the input file into Terminal, but this time go on to manually edit the output file name (if desired) and definitely change the extension to .mp4. Once you have done all that, hit the return key, and the command will execute.

Another trick you can use is to build the command line in a TextEdit document and copy it in its entirety to Terminal. You can use the drag-the-Finder-icon shortcut there too. But now, the pathname you get in TextEdit won't have the necessary escape characters, so you have to manually enclose the pathname in single quotes.
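Adding the single quotes by hand is easy to get wrong if the name itself happens to contain an apostrophe. The sketch below shows one common way to do the wrapping in the shell; the function name single_quote is made up for this example.

```shell
# single_quote TEXT: print TEXT wrapped in single quotes, ready to paste
# into Terminal. An embedded ' is rewritten as '\'' (close the quote,
# add an escaped quote, reopen the quote). Hypothetical helper name.
single_quote() {
    printf "'%s'\n" "$(printf '%s' "$1" | sed "s/'/'\\\\''/g")"
}

single_quote '/Volumes/My Book-3/TiVo/TiVo Shows/Needing Export/From VideoReDo/Hello, Dolly!.mpg'
```

For the sample file, which has no apostrophe in its name, the result is simply the pathname with a single quote at each end.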

If I use TextEdit to build:

./vh131ffmpeg -i '/Volumes/My Book-3/TiVo/TiVo Shows/Needing Export/From VideoReDo/Hello, Dolly!'.mpg

and copy that into Terminal (hitting return), I see:

FFmpeg version SVN-r9226, Copyright (c) 2000-2007 Fabrice Bellard, et al. libavutil: 49.4.0 libavcodec: 51.40.4 libavformat: 51.12.1 built: Feb 12 2008 19:58:15, gcc: 4.0.1 (Apple Computer, Inc. build 5367), i386

Input #0, mpeg, from '/Volumes/My Book-3/TiVo/TiVo Shows/Needing Export/From VideoReDo/Hello, Dolly!.mpg':

Duration-8906 start-0.200000 bitrate-2312

0.0,,dvd1e0,,,Video,mpeg2video,yuv420p,704,480,29.97

0.1,,dvd80,,,Audio,ac3,48000,2,192

The first part is information about this particular version of FFmpeg, followed by important information about the input file. (Notice that as yet there are no further FFmpeg options in the command line, and no output file is specified.)

The duration of the input file is 8,906 seconds, or about 148.43 minutes ... or about 2.47 hours. Its actual start time is for some reason offset from the nominal beginning by 0.2 seconds. The bitrate of the file, including both its audio and video streams, is 2,312 kilobits per second.
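A duration in raw seconds is easier to read as hours, minutes, and seconds. The shell's integer arithmetic is enough for the conversion; the function name hms below is just an illustrative choice.

```shell
# hms SECONDS: print a whole number of seconds as HH:MM:SS.
# (Hypothetical helper name.)
hms() {
    printf '%02d:%02d:%02d\n' \
        $(( $1 / 3600 )) $(( ($1 % 3600) / 60 )) $(( $1 % 60 ))
}

hms 8906    # the duration reported for the sample file
```

For the 8,906-second recording this prints 02:28:26, which agrees with the "about 2.47 hours" figure.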

There are two streams, a video stream, designated 0.0, and an audio stream, 0.1. These designations use "0" before the "." to represent the first (and in this case only) input file to FFmpeg, and the "0" and "1" after the "." to represent, respectively, the first and second streams in that single input file. The first (or "0.0") stream is video; the second (or "0.1") stream is audio.

I'm not sure what some of the other stuff means, but:

  • mpeg2video is the video codec that will be used by FFmpeg to decode the video stream
  • yuv420p tells how the video's pixels are "put together": the luminance (black and white component) is specified as y, and the color as two components (u and v); "420" refers to 4:2:0 subsampling, in which the color information is stored at reduced resolution with respect to the luminance, and the "p" stands for planar, meaning the y, u, and v values are stored in separate blocks
  • The video frame has 704 pixels horizontally and 480 pixels vertically
  • There are 29.97 video frames per second
  • The audio is encoded using the ac3 (i.e., Dolby Digital) codec, with 48,000 audio samples per second for each channel; there are 2 channels (i.e., it is stereo); the audio bitrate is 192 kilobits per second
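The comma-separated stream lines can also be picked apart by the shell. The following is only a sketch: the field positions are an assumption read off the two sample lines above, not taken from any FFmpeg documentation, and the function name describe_stream is invented here.

```shell
# describe_stream LINE: split one comma-separated stream line and print
# the fields discussed above. Field positions are assumed from the
# sample output only. Hypothetical helper name.
describe_stream() {
    # A comma is a non-whitespace separator, so empty fields are kept.
    IFS=, read -r id skip1 tag skip2 skip3 kind codec f8 f9 f10 f11 <<EOF
$1
EOF
    file=${id%%.*}      # digits before the ".": which input file
    stream=${id#*.}     # digits after the ".": which stream in that file
    case "$kind" in
        Video) echo "file $file, stream $stream: video, $codec, $f8, ${f9}x${f10}, $f11 fps" ;;
        Audio) echo "file $file, stream $stream: audio, $codec, $f8 Hz, $f9 channels, $f10 kb/s" ;;
        *)     echo "file $file, stream $stream: unrecognized line" ;;
    esac
}

describe_stream '0.0,,dvd1e0,,,Video,mpeg2video,yuv420p,704,480,29.97'
describe_stream '0.1,,dvd80,,,Audio,ac3,48000,2,192'
```

The fields the sketch skips over (the empty ones and the dvd1e0/dvd80 tags) are left uninterpreted.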

What the output does not mention is that this input file plays back in (say) the VLC media player with an aspect ratio of 4:3. That "display aspect ratio" is usual with standard definition NTSC fare such as this recording.

Also notice that the "nominal aspect ratio" is 704:480; that is, there are 704 pixels horizontally in the frame and 480 pixels vertically. For a 4:3 frame with 480 "lines," the number of pixels in each line "ought to be" 640, not 704. 640:480 is the same as 4:3. (To see this, divide the X in X:Y by the Y. For both 640:480 and 4:3, the result is 1.33333 ... .)

But 704:480 is 1.4666666 ... , not 1.33333 ... .

So the pixels in my input file are not the usual "square" pixels whose height and width are equal. Instead, the width is only 640/704 of the height. 640/704 equals 0.909090909 ... . These pixels are roughly 90% as wide as they "ought to be."
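The same arithmetic can be checked in Terminal. This is a small sketch with made-up function names (square_width and pixel_aspect); awk handles the division because the shell's own arithmetic is integer-only.

```shell
# square_width HEIGHT DAR_W DAR_H: width, in square pixels, of a frame
# HEIGHT lines tall with display aspect ratio DAR_W:DAR_H.
square_width() {
    echo $(( $1 * $2 / $3 ))
}

# pixel_aspect STORED_W HEIGHT DAR_W DAR_H: width of one stored pixel
# relative to its height, to four decimal places.
pixel_aspect() {
    awk -v w="$1" -v h="$2" -v dw="$3" -v dh="$4" \
        'BEGIN { printf "%.4f\n", (h * dw / dh) / w }'
}

square_width 480 4 3        # prints 640
pixel_aspect 704 480 4 3    # prints 0.9091
```

Both helper names are hypothetical; the numbers are the ones worked out in the text.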

That's not a bad thing: cramming 704 pixels into a line that "ought to" contain only 640 pixels allows greater horizontal resolution.

Actually, though, I believe the non-square pixels are but an artifact of how the TiVo captures standard-definition analog TV, which has a display aspect ratio of 4:3. Instead of encoding it as 480 lines of 640 square pixels per line, it uses non-square pixels in a 704x480 grid.

This is akin to how 4:3 video is recorded in MPEG-2 on a non-anamorphic DVD. By emulating a DVD format, a TiVo makes it relatively easy to use TiVoToGo to transfer a recording to a computer and then burn it (after decrypting) to a DVD.

But FFmpeg, at least when its output file is h.264, uses only square pixels. If you transcode a 704x480 file in FFmpeg (or in VisualHub with FFmpeg as its decoder), the result will have a 640x480 grid. Whatever extra horizontal resolution (above 640 pixels per line) that may have been in the input file will be lost. Fortunately, I don't believe actual standard-def 4:3 channels have even 640 pixels per line of discernible detail, so nothing is really lost here.

[More to come ...]
