Capturing Video: Technical Considerations

Jeffrey La Favre
6/5/99

Video capture can quickly exhaust storage space on your computer. Therefore, it is helpful to know how to estimate the file size of a video prior to capture. Armed with this knowledge, you will be able to make informed choices in video format in order to control file size.

There are six factors that determine the file size of a digital video:

1. Screen size

2. Color depth (e.g. 8 bit, 16 bit or 24 bit)

3. Number of frames per second (fps)

4. Sound sampling (e.g. 44,100 Hz, 16 bit, stereo)

5. Play length

6. CODEC (type of compression employed)

The first five factors can be used to calculate the exact file size of an uncompressed video. For compressed video, only an estimate of file size is possible prior to capture.

At this point we should review the units of digital data:

• Bit - the smallest unit of digital data

• Byte - the unit usually used for digital storage, 1 byte = 8 bits

• Kilobyte (KB) - 1 KB = 1024 bytes

• Megabyte (MB) - 1 MB = 1024 KB = 1,048,576 bytes

• Gigabyte (GB) - 1 GB = 1024 MB = 1,048,576 KB = 1,073,741,824 bytes
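The unit definitions above can be written down as constants for use in later calculations. This is only a small sketch; the names KB, MB, GB and bytes_to_mb are chosen here for illustration and do not come from any particular program.

```python
# Binary units as defined in the list above (1 KB = 1024 bytes).
KB = 1024
MB = 1024 * KB
GB = 1024 * MB


def bytes_to_mb(n_bytes):
    """Convert a byte count to megabytes (1 MB = 1,048,576 bytes)."""
    return n_bytes / MB


print(MB)                      # 1048576
print(GB)                      # 1073741824
print(bytes_to_mb(3_145_728))  # 3.0
```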

Calculating the file size of an uncompressed video

1. Calculate the number of pixels on the video screen. For example, suppose we plan to format the video with a screen that is 320 pixels wide and 240 pixels high. The number of pixels on this screen is calculated by multiplying the width and height.

320 x 240 = 76,800 pixels (the number of pixels in the screen)

2. Calculate the number of bytes required to store one frame. In order to do this we must know the number of bytes required to store one pixel, which is determined by the color depth. Suppose we plan to use 24 bit color. Remember that 1 byte = 8 bits, so 24 bits = 3 bytes. Therefore, each pixel requires 3 bytes of storage for 24 bit color.

3 bytes/pixel x 76,800 pixels = 230,400 bytes
(the number of bytes required to store one frame of video)

3. Calculate the number of bytes required to store one second of video. In order to do this we must know the number of frames per second. Suppose we plan to use 15 fps, a common value for computer video.

230,400 bytes/frame x 15 frames/sec = 3,456,000 bytes/sec
(number of bytes for one second of video)

4. Calculate the number of bytes required to store one second of sound (skip steps 4 and 5 if the video is to be captured without sound). In order to make this calculation you must know three things: the sampling rate (Hz), the sample depth (bits) and the number of channels (1 for mono, 2 for stereo). Suppose we want to capture the sound at CD quality (44,100 Hz, 16 bit, stereo). This means the sound will be sampled 44,100 times in one second, each sample will be 16 bits (2 bytes), and there will be two channels.

44,100 samples/sec x 2 bytes/sample x 2 channels = 176,400 bytes/sec
(number of bytes for one second of sound)

5. Calculate the number of bytes required to store one second of video with sound.

3,456,000 bytes/sec for video + 176,400 bytes/sec for sound = 3,632,400 bytes/sec
(number of bytes for one second, video + sound)

6. Calculate the number of bytes required to store the complete video. Suppose our video is exactly one minute long.

60 sec x 3,632,400 bytes/sec = 217,944,000 bytes or about 208 MB
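The six steps above can be gathered into one short function. The following is a sketch only: the function name and its parameters are made up for this illustration, and it assumes that the color depth and sample depth are whole numbers of bytes (8, 16 or 24 bits).

```python
MB = 1024 * 1024  # 1 MB = 1,048,576 bytes


def uncompressed_size_bytes(width, height, color_bits, fps, seconds,
                            sample_rate=0, sample_bits=0, channels=0):
    """Return the size in bytes of an uncompressed video.

    Leave the three sound arguments at 0 for a video captured without sound.
    """
    pixels = width * height                                       # step 1
    bytes_per_frame = pixels * (color_bits // 8)                  # step 2
    video_per_sec = bytes_per_frame * fps                         # step 3
    sound_per_sec = sample_rate * (sample_bits // 8) * channels   # step 4
    total_per_sec = video_per_sec + sound_per_sec                 # step 5
    return total_per_sec * seconds                                # step 6


# The worked example: 320 x 240, 24 bit color, 15 fps, CD quality sound, 60 sec.
size = uncompressed_size_bytes(320, 240, 24, 15, 60,
                               sample_rate=44_100, sample_bits=16, channels=2)
print(size)              # 217944000
print(round(size / MB))  # 208
```

Changing any one argument shows at once how that factor affects the final file size.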

Compressed Video

The calculations above reveal the problem of video file size. Nearly all digital video for computers is compressed in final form to control the file size. During the editing stage, it is desirable to work with an uncompressed video if you have enough disk space. After all editing is complete, the video can be compressed to yield the final form for distribution.

Video compression should not be confused with screen size. When a video is compressed, the screen size is not reduced. Video compression is technology that reduces the number of bytes required to store each frame of video. The subject is too complex to discuss fully in this paper. For our purposes it is sufficient to know that various CODECs (short for compression/decompression) can be used to compress video. The same CODEC is also required for playback: if a video is compressed with a CODEC, the computer used to play the video must have that CODEC installed.

CODECs come in two basic types: hardware and software. Some are available in both types. Hardware CODECs are commonly incorporated into video capture cards. These CODECs allow you to compress video on the fly (as you capture the video from video tape, etc.). Unless you have a very fast computer, it is usually not possible to compress video during capture with a software CODEC. However, a software CODEC may function just fine for playback. This is because many CODECs are asymmetrical (more processing is required to compress the video during capture than to decompress it during playback).

Computers running Windows 95 or 98 have a media player for video that includes a selection of CODECs, for example several versions of Intel Indeo, Cinepak, MPEG-1 and Microsoft Video 1. Therefore, videos compressed with one of these CODECs will play on most computers running Windows (provided they have a current version of Windows Media Player installed, a free download from Microsoft).

There are a number of CODECs that are more proprietary in nature. A good example is the CODEC we will be using in this workshop, WNV1. WNV1 is a proprietary CODEC used with our Winnov video capture cards. Video compressed with WNV1 will not play on the average computer because that computer will not have this CODEC installed. In order to prepare for this workshop, it was necessary to obtain a software version of the WNV1 CODEC from Winnov Corp. This CODEC was installed on all workshop computers to facilitate video editing on machines that do not have the video capture card installed.

Why did we decide to have you capture video in compressed form? While it would be desirable to capture in uncompressed form, your 10 minute videos would be approximately 2 GB in size (assuming a 320 x 240 screen size, 15 fps, 24 bit color). We only have 1 GB of available disk space to work with on workshop computers. And there must be enough room to accommodate the unedited version of the video and the edited version. Therefore, the video file size must not exceed 500 MB for our purposes. This can be achieved if the WNV1 CODEC is employed during video capture. In this compromise we have sacrificed a small amount of video quality (when a video is compressed, there is usually some loss in image quality -- with the better CODECs the reduction in quality may only be noticed by the trained eye).
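The disk space reasoning above can be checked with a few lines of arithmetic. This sketch counts image data only (no sound), as in the figures quoted in the paragraph, and the variable names are chosen here for illustration.

```python
MB = 1024 * 1024
GB = 1024 * MB

# Image data only: 320 x 240 screen, 24 bit color (3 bytes/pixel), 15 fps.
bytes_per_second = 320 * 240 * 3 * 15        # 3,456,000 bytes/sec
uncompressed = bytes_per_second * 10 * 60    # ten minutes of video
limit = 500 * MB                             # per-file budget from the text

# Smallest compression ratio that brings the file under the budget.
min_ratio = uncompressed / limit

print(round(uncompressed / GB, 2))  # 1.93
print(round(min_ratio, 2))          # 3.96
```

Under these assumptions, a CODEC that delivers a compression ratio of about 4:1 or better keeps a 10 minute video within the 500 MB budget.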

Calculating the file size of a compressed video

As mentioned previously, it is not possible to calculate the exact file size of a compressed video prior to capture. The exact amount of compression applied by a CODEC depends on many factors. Nevertheless, with experience you can estimate the amount of compression that will be applied. For example, in a test run prior to the workshop, a sample video captured uncompressed was 200 MB. The same video was captured in compressed form with WNV1, which yielded a 40 MB file. Therefore, we can estimate the compression performance of WNV1 at 5:1 (the CODEC settings can be adjusted prior to capture, which will vary the amount of compression).

After we have an estimate of the CODEC compression ratio, we can apply this to the file size calculated for an uncompressed video.

208 MB/minute of uncompressed video ÷ 5 = 42 MB/minute of WNV1 compressed video
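The same estimate can be sketched in code. The two helper functions are hypothetical names introduced for this example, and the result is only as good as the measured ratio, which will vary with the content of the video and the CODEC settings.

```python
MB = 1024 * 1024


def compression_ratio(uncompressed_size, compressed_size):
    """Ratio measured from a test capture (both sizes in the same unit)."""
    return uncompressed_size / compressed_size


def estimate_compressed_size(uncompressed_size, ratio):
    """Estimate the compressed size from the uncompressed size and a ratio."""
    return uncompressed_size / ratio


# Test capture from the text: 200 MB uncompressed, 40 MB with WNV1.
ratio = compression_ratio(200, 40)
print(ratio)  # 5.0

# One minute of uncompressed video with sound is 217,944,000 bytes.
estimate = estimate_compressed_size(217_944_000, ratio)
print(round(estimate / MB))  # 42
```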

Controlling video file size

The most significant factor in controlling file size is the choice of CODEC (very little video is distributed in uncompressed form). Some CODECs can achieve compression ratios of 100:1 or more. The CODEC employed with RealSystem can achieve compression ratios higher than 200:1. This extreme level of compression provides video that can be delivered in real time over networks.

While the CODEC is important, you should also consider the other factors that determine file size. The screen size of the video can have a strong influence on file size. Suppose you want to create video at full screen size (640 x 480 pixels, the equivalent of TV standard NTSC). Video formatted for this screen size has four times as many pixels per frame as the 320 x 240 screen size, so it requires roughly four times the storage space (e.g. about 801 MB for one minute of video with CD quality sound vs. 208 MB; the total is slightly less than four times as large because the sound portion does not grow). This is why full screen video is usually rejected when making format decisions for computer video. Try formatting your video at different screen sizes and choose the smallest size you are willing to accept.

Color depth also influences file size, but I prefer to use 24 bit color. Reducing color depth to 16 bits will reduce the image data to two-thirds that of 24 bit color. If this level of color is acceptable to you, capture in 16 bit color to reduce file size. I would not recommend capturing video in 256 colors (8 bit).

The number of frames per second is a factor that can be manipulated to a small degree. Lowering the fps has an adverse effect on the quality of movement in a video. The standard for TV (NTSC) is just under 30 fps and for film the standard is 24 fps. These frame rates are used because they deliver smooth movement. For computer video, a rate of 15 fps is often used to control file size. Movement at 15 fps can be slightly "jerky" but is considered a good trade-off to control file size (file size is half that of video at 30 fps). In general it is desirable to capture video at a frame rate that divides evenly into the frame rate of the original format (e.g. capture NTSC video at 15 or 7.5 fps).
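To compare the effect of screen size, color depth and frame rate, it can help to express each option relative to a baseline. The sketch below covers uncompressed image data only (no sound, no CODEC) and uses 320 x 240, 24 bit color, 15 fps as the baseline; the function name is an illustrative choice.

```python
def image_bytes_per_second(width, height, color_bits, fps):
    """Bytes of uncompressed image data for one second of video."""
    return width * height * (color_bits // 8) * fps


baseline = image_bytes_per_second(320, 240, 24, 15)  # 3,456,000 bytes/sec

options = {
    "640 x 480 screen": image_bytes_per_second(640, 480, 24, 15),
    "16 bit color": image_bytes_per_second(320, 240, 16, 15),
    "30 fps": image_bytes_per_second(320, 240, 24, 30),
    "7.5 fps": image_bytes_per_second(320, 240, 24, 7.5),
}

for name, rate in options.items():
    print(f"{name}: {rate / baseline:.2f} x baseline")
# 640 x 480 screen: 4.00 x baseline
# 16 bit color: 0.67 x baseline
# 30 fps: 2.00 x baseline
# 7.5 fps: 0.50 x baseline
```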

The portion of a video file devoted to sound is usually small compared to the portion containing the images. Nevertheless, attention to this detail can trim the size of a video file. Capture your video with mono sound if you don't need stereo. If your video does not contain music, do not capture the sound at a high sampling rate. A sampling rate of 11,025 Hz is more than adequate for a sound track that contains only voice. In general I prefer to capture sound at 16 bit rather than 8 bit if the capture is done in uncompressed format. The quality of uncompressed 8 bit sound is clearly less than 16 bit sound. If you are using a CODEC during capture, the sound will probably be captured in compressed format, which may limit your choices in setting parameters.
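The savings from the sound settings discussed above can be estimated the same way. This sketch applies to uncompressed sound only, and the function and variable names are made up for the example.

```python
MB = 1024 * 1024


def sound_bytes_per_second(sample_rate, sample_bits, channels):
    """Bytes of uncompressed sound for one second."""
    return sample_rate * (sample_bits // 8) * channels


cd_quality = sound_bytes_per_second(44_100, 16, 2)  # 176,400 bytes/sec
voice_16 = sound_bytes_per_second(11_025, 16, 1)    # 22,050 bytes/sec
voice_8 = sound_bytes_per_second(11_025, 8, 1)      # 11,025 bytes/sec

# Size of one minute of sound, in MB.
for name, rate in [("CD quality", cd_quality),
                   ("voice, 16 bit mono", voice_16),
                   ("voice, 8 bit mono", voice_8)]:
    print(f"{name}: {rate * 60 / MB:.2f} MB per minute")

print(cd_quality // voice_16)  # 8
```

Under these assumptions, a 16 bit mono voice track at 11,025 Hz takes one eighth of the space of CD quality stereo.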