Thursday, August 14, 2014

How to Convert DVD to mp4: Part 2- Handling Audio

The last conversion script works for video but creates terrible audio. The reason is easy to see. Let's run ffprobe on the first VOB.

ffprobe VTS_01_1.VOB

Here are the results:

Input #0, mpeg, from 'VTS_01_1.VOB':
Duration: 00:26:01.26, start: 0.280633, bitrate: 5501 kb/s
Stream #0:0[0x1bf]: Data: dvd_nav_packet
Stream #0:1[0x1e0]: Video: mpeg2video (Main), yuv420p(tv), 720x480 [SAR 32:27 DAR 16:9], max. 7000 kb/s, 33.33 fps, 59.94 tbr, 90k tbn, 59.94 tbc
Stream #0:2[0x20]: Subtitle: dvd_subtitle
Stream #0:3[0x21]: Subtitle: dvd_subtitle
Stream #0:4[0x22]: Subtitle: dvd_subtitle
Stream #0:5[0x23]: Subtitle: dvd_subtitle
Stream #0:6[0x24]: Subtitle: dvd_subtitle
Stream #0:7[0x80]: Audio: ac3, 48000 Hz, stereo, fltp, 192 kb/s
Stream #0:8[0x89]: Audio: dts (DTS), 48000 Hz, 5.1(side), fltp, 768 kb/s
Stream #0:9[0x82]: Audio: ac3, 48000 Hz, 5.1(side), fltp, 448 kb/s
Unsupported codec with id 1145979222 for input stream 0

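Clearly, there are multiple audio streams: a stereo AC-3 stream, a DTS 5.1 stream and an AC-3 5.1 stream. If no particular stream is specified, ffmpeg chooses the audio for the conversion chain on its own (by default it prefers the stream with the most channels), and in my case this produced muck.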
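Reading the stream list by eye works, but the stereo stream can also be picked out mechanically. The following is only a sketch: first_stereo_stream is a hypothetical helper name, and it assumes that ffprobe's human-readable report keeps the "Stream #0:7[0x80]: Audio: ..." layout shown above.

```shell
#!/bin/sh
# Hypothetical helper: read ffprobe's report on stdin and print the
# specifier (for example 0:7) of the first audio stream marked "stereo".
first_stereo_stream() {
    sed -n 's/^ *Stream #\([0-9]*:[0-9]*\).*Audio:.*stereo.*/\1/p' | head -n 1
}

# Usage (ffprobe writes its report to stderr, hence the 2>&1):
#   ffprobe VTS_01_1.VOB 2>&1 | first_stereo_stream
```

Fed the report above, this would print 0:7. If a disc has no stereo stream it prints nothing, so the result should be checked before it is used.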

To specify a particular stream, let's map it through a filter. In my experience this is the most reliable way to ensure that only the required stream is mapped. We will use the "anull" filter, which passes the audio through untouched. In this example we use the stereo stream, that is stream 0:7, so the input pad given to anull is [0:7].
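For comparison, ffmpeg's -map option also accepts a stream specifier directly, so the same stream can be selected without a filter graph. This is a sketch and not what the scripts in this post use: extract_audio and the FFMPEG variable are names made up for the example, and 0:7 assumes the stream layout reported above.

```shell
#!/bin/sh
# Set FFMPEG=echo to print the command line instead of running it.
FFMPEG=${FFMPEG:-ffmpeg}

# extract_audio INPUT STREAM OUTPUT.wav  (hypothetical helper)
# Copies a 50 second sample of one audio stream to a 16-bit wav file.
extract_audio() {
    "$FFMPEG" -i "$1" -y -ss 10 -t 50 -map "$2" -acodec pcm_s16le -f wav "$3"
}

# Usage:
#   extract_audio VTS_01_1.VOB 0:7 test.wav
```

I have not compared the two approaches on this DVD, so treat this as an alternative to try rather than a replacement for the anull mapping.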

To check the quality of the audio I ran short tests with both wav and aac output:

ffmpeg -i VTS_01_1.VOB -y -ss 10 -t 50 -filter_complex '[0:7]anull[ao]' -map '[ao]' -acodec pcm_s16le -b:a 900k -f wav test.wav


ffmpeg -i VTS_01_1.VOB -y -ss 10 -t 50 -filter_complex '[0:7]anull[ao]' -map '[ao]' -acodec aac -strict -2 -b:a 256k test.m4a

Results of both these tests were fine. So the final audio conversion portion was modified and inserted into the older mp4 conversion script. This now becomes:

#!/bin/sh

cat VTS_01_[1234567].VOB | ffmpeg -i - -y -strict -2 -acodec aac -ab 256k -pass 1 -vcodec libx264 -filter_complex "yadif,hqdn3d=5:3:8:8,unsharp=5:5:0.5:3:3:0.3,curves=master='0.2/0.28';[0:7]anull[ao]" -map '[ao]' -threads 4 -b:v 1200k -pix_fmt yuv420p -flags +loop -cmp chroma -partitions +parti4x4+partp8x8+partb8x8 -subq 1 -trellis 0 -refs 1 -bf 3 -b_strategy 2 -coder 1 -me_range 16 -g 250 -keyint_min 75 -sc_threshold 40 -i_qfactor 0.71 -rc_eq 'blurCplx^(1-qComp)' -qcomp 0.6 -qmin 10 -qmax 51 -qdiff 4 -f mp4 /dev/null

cat VTS_01_[1234567].VOB | ffmpeg -i - -y -strict -2 -acodec aac -ab 256k -pass 2 -vcodec libx264 -filter_complex "yadif,hqdn3d=5:3:8:8,unsharp=5:5:0.5:3:3:0.3,curves=master='0.2/0.28';[0:7]anull[ao]" -map '[ao]' -threads 4 -b:v 1200k -pix_fmt yuv420p -flags +loop -cmp chroma -partitions +parti4x4+partp8x8+partb8x8 -mixed-refs 1 -subq 6 -trellis 1 -refs 5 -bf 3 -b_strategy 2 -coder 1 -me_range 16 -g 250 -keyint_min 75 -sc_threshold 40 -i_qfactor 0.71 -rc_eq 'blurCplx^(1-qComp)' -qcomp 0.6 -qmin 10 -qmax 51 -qdiff 4 -movflags +faststart outfile.mp4 

Monday, August 11, 2014

How to Convert DVD to mp4: De-grain, De-noise, Unsharp Mask Filters, x264 Segfaults and Solution

DVDs contain Video Objects (VOBs), which are essentially MPEG program streams plus other data in a container. Streams of this type can be concatenated directly, just like regular text files. You may have noticed that there is no discernible gap or glitch when playing back a series of VOBs.

Typically, the VOBs are numbered in this way:

VTS_01_1.VOB
VTS_01_2.VOB
...
VTS_01_n.VOB

VTS_01_0.VOB is typically the DVD menu, so it does not concern us at this point.

To concatenate the VOBs you can say:

cat VTS_01_1.VOB VTS_01_2.VOB ... VTS_01_n.VOB

Let's say you have four VOBs. You can then also use a shell glob pattern, "VTS_01_[1234].VOB", to concatenate VTS_01_1.VOB through VTS_01_4.VOB. The shell expands the pattern to the matching file names in sorted order before cat runs.

cat VTS_01_[1234].VOB 
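To see what the pattern matches without touching real VOBs, here is a small sketch that uses dummy files in a temporary directory. The file contents and the demo_dir name are made up for illustration. It uses the range form [1-4], which matches the same files as [1234].

```shell
#!/bin/sh
# Create five small dummy files standing in for the menu and four VOBs.
demo_dir=$(mktemp -d)
for n in 0 1 2 3 4; do
    printf 'part%s\n' "$n" > "$demo_dir/VTS_01_$n.VOB"
done

# The glob expands in sorted order and skips VTS_01_0.VOB (the menu).
joined=$(cat "$demo_dir"/VTS_01_[1-4].VOB)
printf '%s\n' "$joined"

rm -rf "$demo_dir"
```

This prints part1 to part4, one per line, with no part0, which is the order in which the real VOBs would reach the pipe.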

And if we want to pass this straight to ffmpeg, we can pipe the concatenated VOBs to ffmpeg in this way:

cat VTS_01_[1234].VOB | ffmpeg -i - -y -pass 1 -an -vcodec libx264 -vf "yadif,crop=720:412:0:85" -threads 4 -b:v 1200k -pix_fmt yuv420p -flags +loop -cmp chroma -partitions +parti4x4+partp8x8+partb8x8 -subq 1 -trellis 0 -refs 1 -bf 3 -b_strategy 2 -coder 1 -me_range 16 -g 250 -keyint_min 75 -sc_threshold 40 -i_qfactor 0.71 -rc_eq 'blurCplx^(1-qComp)' -qcomp 0.6 -qmin 10 -qmax 51 -qdiff 4 -f mp4 /dev/null

cat VTS_01_[1234].VOB | ffmpeg -i - -y -strict -2 -acodec aac -ab 128k -pass 2 -vcodec libx264 -vf "yadif,crop=720:412:0:85" -threads 4 -b:v 1200k -pix_fmt yuv420p -flags +loop -cmp chroma -partitions +parti4x4+partp8x8+partb8x8 -mixed-refs 1 -subq 6 -trellis 1 -refs 5 -bf 3 -b_strategy 2 -coder 1 -me_range 16 -g 250 -keyint_min 75 -sc_threshold 40 -i_qfactor 0.71 -rc_eq 'blurCplx^(1-qComp)' -qcomp 0.6 -qmin 10 -qmax 51 -qdiff 4 outputfile.mp4


The above is a 2-pass conversion with the x264 codec, plus de-interlacing (yadif) and cropping. Of course, it is not suitable for iDevices (iPad, iPhone, iPod).

Note the input parameter of ffmpeg, that is, the lone hyphen after the input option:

-i -

This hyphen stands for the standard input: so ffmpeg reads from the standard input, which in this case is the piped output.

Next, as a reference I took a DVD of a film- "Bring on the Night" (Yep that's right- Sting's first concert after The Police). The film has grain and artifacts, especially noticeable in the sky areas. So I added a de-noising filter, specifically, the hqdn3d filter, and did some preliminary experiments on a small portion of the video.

#!/bin/sh

ffmpeg -i VTS_01_1.VOB -y -ss 10 -t 30 -pass 1 -an -vcodec libx264 -vf "yadif,hqdn3d=3:3:6:6" -threads 4 -b:v 1200k -pix_fmt yuv420p -flags +loop -cmp chroma -partitions +parti4x4+partp8x8+partb8x8 -subq 1 -trellis 0 -refs 1 -bf 3 -b_strategy 2 -coder 1 -me_range 16 -g 250 -keyint_min 75 -sc_threshold 40 -i_qfactor 0.71 -rc_eq 'blurCplx^(1-qComp)' -qcomp 0.6 -qmin 10 -qmax 51 -qdiff 4 -f mp4 /dev/null

ffmpeg -i VTS_01_1.VOB -y -ss 10 -t 30 -strict -2 -acodec aac -ab 128k -pass 2 -vcodec libx264 -vf "yadif,hqdn3d=3:3:6:6" -threads 4 -b:v 1200k -pix_fmt yuv420p -flags +loop -cmp chroma -partitions +parti4x4+partp8x8+partb8x8 -mixed-refs 1 -subq 6 -trellis 1 -refs 5 -bf 3 -b_strategy 2 -coder 1 -me_range 16 -g 250 -keyint_min 75 -sc_threshold 40 -i_qfactor 0.71 -rc_eq 'blurCplx^(1-qComp)' -qcomp 0.6 -qmin 10 -qmax 51 -qdiff 4 outputfile.mp4

This did indeed clean up many of the artifacts; the sky especially looked much cleaner.

The parameters used:

hqdn3d=3:3:6:6

The meanings of the parameters are:

hqdn3d=luma_spatial:chroma_spatial:luma_tmp:chroma_tmp

Next I thought I would add an unsharp mask filter to the same filter chain, after the de-noising.

unsharp=5:5:1.0:5:5:0.0

The parameters for the filter are:

unsharp=luma_msize_x:luma_msize_y:luma_amount:chroma_msize_x:chroma_msize_y:chroma_amount

Here's the script:

#!/bin/sh

ffmpeg -i VTS_01_1.VOB -y -ss 10 -t 30 -pass 1 -an -vcodec libx264 -vf "yadif,hqdn3d=3:3:6:6,unsharp=5:5:1.0:5:5:0.0" -threads 4 -b:v 1200k -pix_fmt yuv420p -flags +loop -cmp chroma -partitions +parti4x4+partp8x8+partb8x8 -subq 1 -trellis 0 -refs 1 -bf 3 -b_strategy 2 -coder 1 -me_range 16 -g 250 -keyint_min 75 -sc_threshold 40 -i_qfactor 0.71 -rc_eq 'blurCplx^(1-qComp)' -qcomp 0.6 -qmin 10 -qmax 51 -qdiff 4 -f mp4 /dev/null


ffmpeg -i VTS_01_1.VOB -y -ss 10 -t 30 -strict -2 -acodec aac -ab 128k -pass 2 -vcodec libx264 -vf "yadif,hqdn3d=3:3:6:6,unsharp=5:5:1.0:5:5:0.0" -threads 4 -b:v 1200k -pix_fmt yuv420p -flags +loop -cmp chroma -partitions +parti4x4+partp8x8+partb8x8 -mixed-refs 1 -subq 6 -trellis 1 -refs 5 -bf 3 -b_strategy 2 -coder 1 -me_range 16 -g 250 -keyint_min 75 -sc_threshold 40 -i_qfactor 0.71 -rc_eq 'blurCplx^(1-qComp)' -qcomp 0.6 -qmin 10 -qmax 51 -qdiff 4 outputfile.mp4

But the default values produced ghosting and fringing on high-contrast edges, although they added more pizzazz to the image on the whole.
To reduce the fringing I stepped down both the matrix size and the amount:

unsharp=3:3:0.5:3:3:0.0

Note that until now I had not touched the chroma values at all. I then added 0.3 of unsharp in chroma. Some artifacts came back, but the results look much better at default sizes (that is, not blown up to full screen display). After trying out many combinations I arrived at the best trade-off.

Final values used:

hqdn3d=5:3:8:8
unsharp=5:5:0.5:3:3:0.3
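Positional values like these are easy to mix up. Both filters also accept named options, so the same final settings can be spelled out as below. This is only a sketch: the variable names are mine, and it assumes an ffmpeg build that accepts the named form of these options.

```shell
#!/bin/sh
# Final values from above, written with the option names listed earlier.
denoise="hqdn3d=luma_spatial=5:chroma_spatial=3:luma_tmp=8:chroma_tmp=8"
sharpen="unsharp=luma_msize_x=5:luma_msize_y=5:luma_amount=0.5"
sharpen="$sharpen:chroma_msize_x=3:chroma_msize_y=3:chroma_amount=0.3"

# De-interlace first, then de-noise, then sharpen.
vf="yadif,$denoise,$sharpen"
printf '%s\n' "$vf"
```

The printed string can be passed to ffmpeg as -vf "$vf" in place of the positional chain.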

Next, I wanted to boost the luma in the low range. "Bring on the Night" is a fairly moody film, often dark and "noirish". Nevertheless I wanted to try bringing up the lows a little, using a curves filter. The curves filter can also import a Photoshop curves setting from an .acv file. But in this particular instance I simply boosted position "0.2" to 0.28. This lifted the lows beautifully. So my final settings in the test script were:

#!/bin/sh

ffmpeg -i VTS_01_1.VOB -y -ss 10 -t 30 -pass 1 -an -vcodec libx264 -vf "yadif,hqdn3d=5:3:8:8,unsharp=5:5:0.5:3:3:0.3,curves=master='0.2/0.28'" -threads 4 -b:v 1200k -pix_fmt yuv420p -flags +loop -cmp chroma -partitions +parti4x4+partp8x8+partb8x8 -subq 1 -trellis 0 -refs 1 -bf 3 -b_strategy 2 -coder 1 -me_range 16 -g 250 -keyint_min 75 -sc_threshold 40 -i_qfactor 0.71 -rc_eq 'blurCplx^(1-qComp)' -qcomp 0.6 -qmin 10 -qmax 51 -qdiff 4 -f mp4 /dev/null


ffmpeg -i VTS_01_1.VOB -y -ss 10 -t 30 -strict -2 -acodec aac -ab 128k -pass 2 -vcodec libx264 -vf "yadif,hqdn3d=5:3:8:8,unsharp=5:5:0.5:3:3:0.3,curves=master='0.2/0.28'" -threads 4 -b:v 1200k -pix_fmt yuv420p -flags +loop -cmp chroma -partitions +parti4x4+partp8x8+partb8x8 -mixed-refs 1 -subq 6 -trellis 1 -refs 5 -bf 3 -b_strategy 2 -coder 1 -me_range 16 -g 250 -keyint_min 75 -sc_threshold 40 -i_qfactor 0.71 -rc_eq 'blurCplx^(1-qComp)' -qcomp 0.6 -qmin 10 -qmax 51 -qdiff 4 outputfile.mp4
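The curves setting used here defines a single key point: input level 0.2 is mapped to output level 0.28, and the filter supplies the end points (0,0) and (1,1) itself when none are given. As a rough illustration, assuming 8-bit values that run from 0 to 255, the key point works out as follows (to_8bit is a hypothetical helper name).

```shell
#!/bin/sh
# Convert a normalized level (0..1) to the 8-bit scale (0..255).
to_8bit() {
    LC_ALL=C awk -v level="$1" 'BEGIN { printf "%.1f\n", level * 255 }'
}

in_level=$(to_8bit 0.2)
out_level=$(to_8bit 0.28)
echo "$in_level -> $out_level"
```

This prints 51.0 -> 71.4, so values around 51 are lifted by about 20 steps, while pure black and pure white stay where they are.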

Finally, for streaming purposes, I thought of shifting the MOOV atom to the beginning of the file. This is achieved by using the "-movflags +faststart" option.
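A quick way to check whether the option took effect is to compare where the "moov" and "mdat" markers first appear in the output file. The sketch below is a heuristic only: moov_first is a hypothetical helper, it searches for the raw atom names instead of parsing the mp4 structure, and it assumes GNU grep for the -a, -b and -o options.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the first "moov" marker in the file
# comes before the first "mdat" marker.
moov_first() {
    # grep -a -b -o prints "OFFSET:match"; keep the first byte offset only.
    moov=$(LC_ALL=C grep -a -b -o 'moov' "$1" | head -n 1 | cut -d: -f1)
    mdat=$(LC_ALL=C grep -a -b -o 'mdat' "$1" | head -n 1 | cut -d: -f1)
    [ -n "$moov" ] && [ -n "$mdat" ] && [ "$moov" -lt "$mdat" ]
}

# Usage:
#   moov_first outfile.mp4 && echo "moov atom is at the front"
```

Because the same four bytes could occur by chance inside the media data, a positive result is a strong hint rather than proof.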

Then I went back to the concatenation script and added the filters. So the script then became:

#!/bin/sh

cat VTS_01_[1234567].VOB | ffmpeg -i - -y -pass 1 -an -vcodec libx264 -vf "yadif,hqdn3d=5:3:8:8,unsharp=5:5:0.5:3:3:0.3,curves=master='0.2/0.28'" -threads 4 -b:v 1200k -pix_fmt yuv420p -flags +loop -cmp chroma -partitions +parti4x4+partp8x8+partb8x8 -subq 1 -trellis 0 -refs 1 -bf 3 -b_strategy 2 -coder 1 -me_range 16 -g 250 -keyint_min 75 -sc_threshold 40 -i_qfactor 0.71 -rc_eq 'blurCplx^(1-qComp)' -qcomp 0.6 -qmin 10 -qmax 51 -qdiff 4 -f mp4 /dev/null


cat VTS_01_[1234567].VOB | ffmpeg -i - -y -strict -2 -acodec aac -ab 128k -pass 2 -vcodec libx264 -vf "yadif,hqdn3d=5:3:8:8,unsharp=5:5:0.5:3:3:0.3,curves=master='0.2/0.28'" -threads 4 -b:v 1200k -pix_fmt yuv420p -flags +loop -cmp chroma -partitions +parti4x4+partp8x8+partb8x8 -mixed-refs 1 -subq 6 -trellis 1 -refs 5 -bf 3 -b_strategy 2 -coder 1 -me_range 16 -g 250 -keyint_min 75 -sc_threshold 40 -i_qfactor 0.71 -rc_eq 'blurCplx^(1-qComp)' -qcomp 0.6 -qmin 10 -qmax 51 -qdiff 4 -movflags +faststart outfile.mp4

Unfortunately, this results in a segfault at the end of the process:

[libx264 @ 0x9a9d0e0] 2nd pass has more frames than 1st pass (225787)
[libx264 @ 0x9a9d0e0] continuing anyway, at constant QP=25
[libx264 @ 0x9a9d0e0] disabling adaptive B-frames
[libx264 @ 0x9a9d0e0] specified frame type is not compatible with max B-frames

The solution is to encode the audio in pass 1 as well. This seems to be a bug in x264: when the input is concatenated, x264 throws this error if pass 1 runs without audio. Audio is of no use in pass 1, because that pass only gathers statistics for pass 2. But we have to live with it, and the tradeoff is a slightly slower pass 1. The final script:

#!/bin/sh

cat VTS_01_[1234567].VOB | ffmpeg -i - -y -strict -2 -acodec aac -ab 128k -pass 1 -vcodec libx264 -vf "yadif,hqdn3d=5:3:8:8,unsharp=5:5:0.5:3:3:0.3,curves=master='0.2/0.28'" -threads 4 -b:v 1200k -pix_fmt yuv420p -flags +loop -cmp chroma -partitions +parti4x4+partp8x8+partb8x8 -subq 1 -trellis 0 -refs 1 -bf 3 -b_strategy 2 -coder 1 -me_range 16 -g 250 -keyint_min 75 -sc_threshold 40 -i_qfactor 0.71 -rc_eq 'blurCplx^(1-qComp)' -qcomp 0.6 -qmin 10 -qmax 51 -qdiff 4 -f mp4 /dev/null

cat VTS_01_[1234567].VOB | ffmpeg -i - -y -strict -2 -acodec aac -ab 128k -pass 2 -vcodec libx264 -vf "yadif,hqdn3d=5:3:8:8,unsharp=5:5:0.5:3:3:0.3,curves=master='0.2/0.28'" -threads 4 -b:v 1200k -pix_fmt yuv420p -flags +loop -cmp chroma -partitions +parti4x4+partp8x8+partb8x8 -mixed-refs 1 -subq 6 -trellis 1 -refs 5 -bf 3 -b_strategy 2 -coder 1 -me_range 16 -g 250 -keyint_min 75 -sc_threshold 40 -i_qfactor 0.71 -rc_eq 'blurCplx^(1-qComp)' -qcomp 0.6 -qmin 10 -qmax 51 -qdiff 4 -movflags +faststart outfile.mp4
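Since the two passes share most of their options, it is easy for them to drift apart while experimenting. One way to keep them in step is to put the shared options into a shell function. This is a sketch only: encode_pass and the FFMPEG variable are names I made up, and the options that differ between the passes are handed in as extra arguments.

```shell
#!/bin/sh
# Set FFMPEG=echo to print the command line instead of encoding.
FFMPEG=${FFMPEG:-ffmpeg}

# encode_pass PASS [pass-specific options...] OUTPUT
# Reads the concatenated VOBs from stdin, like the commands above.
encode_pass() {
    pass=$1
    shift
    "$FFMPEG" -i - -y -strict -2 -acodec aac -ab 128k -pass "$pass" \
        -vcodec libx264 \
        -vf "yadif,hqdn3d=5:3:8:8,unsharp=5:5:0.5:3:3:0.3,curves=master='0.2/0.28'" \
        -threads 4 -b:v 1200k -pix_fmt yuv420p -flags +loop -cmp chroma \
        -partitions +parti4x4+partp8x8+partb8x8 -bf 3 -b_strategy 2 -coder 1 \
        -me_range 16 -g 250 -keyint_min 75 -sc_threshold 40 -i_qfactor 0.71 \
        -rc_eq 'blurCplx^(1-qComp)' -qcomp 0.6 -qmin 10 -qmax 51 -qdiff 4 "$@"
}

# Usage, mirroring the two commands of the final script:
#   cat VTS_01_[1234567].VOB | encode_pass 1 -subq 1 -trellis 0 -refs 1 -f mp4 /dev/null
#   cat VTS_01_[1234567].VOB | encode_pass 2 -mixed-refs 1 -subq 6 -trellis 1 -refs 5 -movflags +faststart outfile.mp4
```

The pass-specific options end up after the shared ones, which should make no difference to ffmpeg as long as they all come before the output file name.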

That's it for this segment. But audio is not handled well in the script's current form. More on that later.