Rendering Settings for YouTube
Notes by Tch3 - 130906
YouTube-supported file formats:
https://support.google.com/youtube/troubleshooter/2888402?hl=pt-BR&ref_topic=2888648#ts=2888341
Contents:
- YouTube-supported file formats
- Advanced encoding settings (video codec: H.264)
- Frame rates
- Bitrates
- Resolutions
- How to upload videos longer than 15 minutes
- Encoding options for H.264 video (Adobe article, pasted at the end)
YouTube-supported file formats
https://support.google.com/youtube/troubleshooter/2888402?hl=pt-BR&ref_topic=2888648#ts=2888341
Make sure you are using one of the following formats:
- .MOV
- .MPEG4
- .AVI
- .WMV
- .MPEGPS
- .FLV
- 3GPP
- WebM
- .veg (Sony Vegas project file): convert these files to .MP4.
- .vep (AVS project file): convert these files to .AVI.
- .vpj (Pixbend project file): convert these files to .pxv.
- DVD files (such as .VOB, .IFO, and .BUP): convert these files to any of the file formats listed at the start of this article
Advanced encoding settings
Recommended bitrates, codecs, resolutions, and more
https://support.google.com/youtube/answer/1722171?hl=pt-BR&ref_topic=2888648
Container: .mp4
- No edit lists (or you may lose AV sync)
- moov atom at the front of the file (Fast Start)
Audio codec: AAC-LC
- Channels: stereo or stereo + 5.1
- Sample rate: 96 kHz or 48 kHz
Video codec: H.264
Note: Encoding options for H.264 video (pasted after this text)
http://www.adobe.com/devnet/adobe-media-server/articles/h264_encoding.html
- Progressive scan (no interlacing)
- High Profile (allows B-frames)
- 2 consecutive B-frames
Note: B-frames are the most efficient frames because they can search both ways for redundancies. Though controls and control nomenclature vary from encoder to encoder, the most common B-frame-related control is simply the number of B-frames, or "B-Pictures" as shown in Figure 6. Note that the number in Figure 6 actually refers to the number of B-frames between consecutive I-frames or P-frames.
- Closed GOP. GOP length of half the frame rate.
Note: Closed GOP stands for Closed Group Of Pictures. It applies to MPEG-2 encoding. MPEG-2 streams are separated into GOPs (groups of pictures) no longer than 18 frames for NTSC or 15 for PAL. A Closed GOP setting means that frames from the current GOP cannot reference I-frames (full frames) from the previous GOP.
- CABAC (entropy coding)
Note: When you select the Main or High Profiles, some encoding tools will give you two options for entropy coding mode:
- CAVLC: Context-based adaptive variable-length coding
- CABAC: Context-based adaptive binary arithmetic coding
Of the two, CAVLC is the lower-quality, easier-to-decode option, while CABAC is the higher-quality, harder-to-decode option.
- Variable bitrate.
No bitrate limit is required, though we provide recommended bitrates below for reference.
- Color space: 4:2:0 (chroma subsampling)
Note: video uses the YCC (YCbCr) color system: a luma channel that encodes brightness without color, plus two color channels. There is a big problem with RGB color: it is hard to work with. If you need to uniformly reduce the brightness of an image, you have to do it across all three color channels, and there is also a lot of redundancy in the data. To combat this redundancy, there is a different way of storing the information, called YCbCr, which breaks the signal down into a Y (luminance) channel and two channels that store color information without any brightness information: a blue-difference channel and a red-difference channel.
http://antoniofreitasvideodigital.blogspot.com.br/2012/01/espaco-de-cor-no-video-444-422-420-411.html
- Note: if your editing program has the option to render using the video card's memory (CUDA), great! =)
http://www.nvidia.com.br/object/cuda_home_new_br.html
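The checklist above maps almost one-to-one onto encoder flags. A minimal sketch, assuming ffmpeg with libx264 and its built-in AAC encoder (a tool choice of ours; these notes don't name an encoder):

```python
# Sketch: the settings above expressed as ffmpeg/libx264 arguments.
# Assumptions (ours, not in the notes): ffmpeg with libx264 and its AAC
# encoder; flag names are ffmpeg's. Values come from the checklist: High
# profile, 2 B-frames, closed GOP of half the frame rate, CABAC, a VBR
# target, 4:2:0, AAC stereo, moov atom up front.
def youtube_args(fps, video_kbps, audio_kbps):
    gop = fps // 2  # "closed GOP, half the frame rate"
    return [
        "-c:v", "libx264",
        "-profile:v", "high",               # High Profile: B-frames + CABAC
        "-bf", "2",                         # 2 consecutive B-frames
        "-g", str(gop), "-flags", "+cgop",  # closed GOP
        "-coder", "1",                      # CABAC entropy coding
        "-b:v", f"{video_kbps}k",           # variable-bitrate target
        "-pix_fmt", "yuv420p",              # 4:2:0 color
        "-c:a", "aac", "-b:a", f"{audio_kbps}k",
        "-movflags", "+faststart",          # moov atom at the front
    ]

args = youtube_args(fps=30, video_kbps=8000, audio_kbps=384)
```

The resulting list would be passed to an `ffmpeg -i input.mov ... output.mp4` invocation; treat it as a starting point, not a definitive recipe.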
Frame rates
Frame rates should match the source material. For example:
Content recorded at 24 fps should be encoded at 24 fps and uploaded at 24 fps.
Content recorded at 30 fps should be uploaded at 30 fps.
Content recorded at 720p60 should be uploaded at 720p60.
For example, 1080i60 content should be deinterlaced before upload, going from 60 interlaced fields per second to 30 progressive frames per second.
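The deinterlacing rule above is simple field arithmetic: a deinterlacer pairs each two fields into one progressive frame, so the temporal rate halves.

```python
# Deinterlacing pairs two fields into one frame, so 1080i60
# (60 fields/s) becomes 1080p30 (30 frames/s).
def deinterlaced_fps(fields_per_second):
    return fields_per_second // 2

print(deinterlaced_fps(60))  # 1080i60 -> 30 progressive frames/s
```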
Bitrates
Standard-quality uploads
Type | Video bitrate | Audio bitrate (mono) | Audio bitrate (stereo) | Audio bitrate (5.1) |
---|---|---|---|---|
1080p | 8,000 kbps | 128 kbps | 384 kbps | 512 kbps |
720p | 5,000 kbps | 128 kbps | 384 kbps | 512 kbps |
480p | 2,500 kbps | 64 kbps | 128 kbps | 196 kbps |
360p | 1,000 kbps | 64 kbps | 128 kbps | 196 kbps |
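These bitrates translate directly into upload sizes. A quick sanity check, assuming the usual conventions of kbps = 1000 bits/s and decimal megabytes (conventions not stated in the source):

```python
# Rough upload size per minute from the recommended bitrates
# (kbps = 1000 bits/s, MB = 10^6 bytes).
def mb_per_minute(video_kbps, audio_kbps):
    total_bits_per_sec = (video_kbps + audio_kbps) * 1000
    return total_bits_per_sec * 60 / 8 / 1_000_000

print(round(mb_per_minute(8000, 384), 2))  # 1080p + stereo audio
print(round(mb_per_minute(5000, 384), 2))  # 720p + stereo audio
```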
Resolutions
YouTube uses 16:9 aspect-ratio players. If you upload a file that is not 16:9, it will be processed and displayed correctly, with pillarboxing (black bars on the left and right) or letterboxing (black bars at the top and bottom) added by the player. If you want the video to fit the player exactly, encode at one of these resolutions:
- 1080p: 1920x1080
- 720p: 1280x720
- 480p: 854x480
- 360p: 640x360
- 240p: 426x240
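The odd-looking widths (854, 426) fall out of rounding 16:9 to an even number of pixels; the rounding rule below is our inference, not something YouTube documents (even dimensions are required by 4:2:0 subsampling):

```python
# Derive the 16:9 player widths from the heights, rounding to the
# nearest even pixel count (our inferred rule, matching the list above).
def width_16_9(height):
    return round(height * 16 / 9 / 2) * 2

for h in (1080, 720, 480, 360, 240):
    print(h, width_16_9(h))
```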
The YouTube player adds black bars automatically so videos display correctly without cropping or stretching, whatever the size of the video or the player.
For example, the player automatically adds vertical bars to 4:3 videos in the new 16:9 widescreen player. If the player is resized (when embedded on another website), the same process occurs: black bars are added at the top and bottom of 16:9 videos when the player is resized to 4:3. Similarly, anamorphic videos automatically get black bars at the top and bottom when displayed in 16:9 or 4:3 players. The player can only do this if the video's original aspect ratio is preserved.
You can adjust how your video is framed in our player after upload by using formatting tags.
https://support.google.com/youtube/answer/146402?hl=pt-BR
Tags are descriptive keywords you can add to your video to help people find your content. However, you can also add tags to change the appearance and format of your video on YouTube and in embedded players.
Examples of formatting tags
- yt:quality=high: defaults to a high-quality stream (available depending on the viewer's player size and browser window size)
- yt:crop=16:9: zooms into the 16:9 area, removing windowboxing
- yt:stretch=16:9: fixes anamorphic content by stretching to 16:9
- yt:stretch=4:3: fixes 720x480 content, which has the wrong aspect ratio, by stretching to 4:3
How to add tags:
During upload, you will see a "Tags" section below the upload progress bar, where you can add your tags.
To add tags to an existing video, go to your Video Manager and click the Edit button below the video you would like to add formatting tags to.
If horizontal bars are added to the top and bottom of the video before upload (for example, to create a 4:3 video from a 16:9 original), the widescreen player will also add bars, producing black bars all around the video and a poor viewing experience.
How to upload videos longer than 15 minutes
https://support.google.com/youtube/answer/71673?hl=pt-BR&ref_topic=2888648
By default, you can upload videos up to 15 minutes long. To upload longer videos, follow these steps:
- Go to the upload page at www.youtube.com/my_videos_upload
- Click Increase your limit at the bottom of the page, or go to https://www.youtube.com/verify
- Follow the steps to verify your account with a mobile phone. At this time, we cannot offer other forms of account verification.
Once you have increased your limit, make sure you are using an up-to-date version of your browser so that you can upload files larger than 2 GB.
I can't find the "Increase your limit" link
If you can't find the link described above, you may already be able to upload long videos. Check the "Unlimited uploads" section of your Account Features page to see whether the feature is already enabled.
I'm having trouble uploading a long video
If you forgot to verify your account before uploading the video, you will be asked to do so after your video finishes processing. The error "Rejected (video too long)" may also appear in your Video Manager. (There, click the Verify account button next to the long video.)
After verifying your account, click Activate this video in your Video Manager to publish it. Activated videos are automatically set to private, so be sure to check your video's privacy settings and edit them if you wish.
Otherwise, if you used to be able to upload long videos but no longer can, check your account for copyright claims and strikes.
You can only upload long videos if your account is in good standing under the YouTube Community Guidelines and there are no worldwide Content ID blocks on your content.
- - -
Encoding options for H.264 video
http://www.adobe.com/devnet/adobe-media-server/articles/h264_encoding.html
As a producer of video on the web, you know that you're judged by the quality of your video. In this regard, many producers are considering converting from the venerable On2 VP6 codec to H.264. H.264 offers better visual quality than VP6, and the AAC audio codec offers much better quality than the MP3 codec paired with VP6. Starting with Adobe Flash Player 9 Update 3, you could play back files encoded in H.264/AAC formats. As of September 2008, the penetration of H.264/AAC-compatible players exceeded 89% of all Internet-connected PCs. No wonder they're switching over.
This article first discusses the issues involved in such a changeover, including the potential requirement for royalties. I then describe the H.264-specific encoding parameters offered by most encoding programs. Finally, I cover how you can produce H.264 video with Adobe Media Encoder CS4 and Adobe Flash Media Encoding Server 3.5.
To begin, I should explain some introductory concepts related to H.264 video.
What is H.264?
H.264 is a video compression standard known as MPEG-4 Part 10, or MPEG-4 AVC (for "advanced video coding"). It's a joint standard promulgated by the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG).
H.264's audio sidekick is AAC (advanced audio coding), which is designated MPEG-4 Part 3. Both H.264 and AAC are technically MPEG-4 codecs—though it's more accurate to call them by their specific names—and compatible bitstreams should conform to the requirements of Part 14 of the MPEG-4 spec.
According to Part 14, MPEG-4 files containing both audio and video, including those with H.264/AAC, should use the .mp4 extension, while audio-only files should use .m4a and video-only files should use .m4v. Different vendors have adopted a range of extensions that are recognized by their proprietary players, such as Apple with .m4p for files using FairPlay Digital Rights Management and .m4r for iPhone ringtones. (Mobile phones use the .3gp and .3g2 extensions, though I don't discuss producing for mobile phones in this article.)
Like MPEG-2, H.264 uses three types of frames, meaning that each group of pictures (GOP) is comprised of I-, B-, and P-frames, with I-frames like the DCT-based compression used in DV and B- and P-frames referencing redundancies in other frames to increase compression. I'll cover much more on this later in this article.
Like most video coding standards, H.264 actually standardizes only the "central decoder...such that every decoder conforming to the standard will produce similar output when given an encoded bitstream that conforms to the constraints of the standard," according to Overview of the H.264/AVC Video Coding Standard published in IEEE Transactions on Circuits and Systems for Video Technology (ITCSVT). Basically, this means that there's no standardized H.264 encoder. In fact, H.264 encoding vendors can utilize a range of different techniques to optimize video quality, so long as the bitstream plays on the target player. This is one of the key reasons that H.264 encoding interfaces vary so significantly among the various tools.
Will there be royalties?
If you stream H.264 encoded video after December 31, 2010, there may be an associated royalty obligation. As yet, however, it's undefined and uncertain. Here's an overview of what's known about royalties to date.
Briefly, H.264 was developed by a group of patent holders now represented by the MPEG Licensing Administrator, or MPEG-LA for short. According to the Summary of AVC/H.264 License Terms (PDF, 34K) you can download from the MPEG-LA site, there are three classes of video producers subject to a potential royalty obligation.
If you're in the first two classes, and are either distributing via pay-per-view or subscription, you may already owe MPEG-LA royalties. The third group, which is clearly the largest, is for free Internet broadcast. Here, there will be no royalties until December 31, 2010 (source: AVC/H.264 License Agreement). After that, "the royalty shall be no more than the economic equivalent of royalties payable during the same time for free television."
According to their website, MPEG-LA must disclose licensing terms at least one year before they become due, or no later than December 31, 2009. Until then, we're unfortunately in the dark as to which uses of H.264 video will incur royalties, and the extent of these charges. For more information on H.264-related royalties, check out my article, The Future's So Bright: H.264 Year in Review, at StreamingMedia.com.
H.264 and Flash Player
As I mentioned, Adobe added H.264 playback support to Adobe Flash Player 9 Update 3 back in 2007. The apparent goal was to support the widest possible variety of files containing H.264 encoded video: Flash Player should play .mp4, .m4v, .m4a, .mov, and .3gp files, H.264 files using the .flv extension, as well as files using the newer extensions that were released along with Flash Player 9 (see Table 1).
Table 1. File extensions for H.264 files produced for Flash Player playback
File Extension | FTYP | MIME Type | Description |
---|---|---|---|
.f4v | 'F4V ' | video/mp4 | Video for Flash Player |
.f4p | 'F4P ' | video/mp4 | Protected media for Flash Player |
.f4a | 'F4A ' | audio/mp4 | Audio for Flash Player |
.f4b | 'F4B ' | audio/mp4 | Audio book for Flash Player |
I'll describe profiles and levels in the next section. For now, understand that Flash Player supports the Baseline, Main, High, and High 10 H.264 profiles with no levels excluded. Accordingly, when you're producing H.264 video for Flash Player, you're free to choose the most advanced profile supported by the encoding tool, which is typically the High profile. On the audio side, Flash Player can play AAC Main, AAC Low Complexity, and AAC SBR (spectral band replication), which is otherwise known as High-Efficiency-AAC, or HE-AAC.
Producing H.264 video
You have seen that you have nearly complete flexibility regarding profiles and extensions; what else do you need to know before you dig into the details? A couple of things.
First, unlike VP6, which is available only from On2, there are multiple suppliers of H.264 codecs, including MainConcept, whose codec Adobe uses in Adobe Media Encoder and Adobe Flash Media Encoding Server. I've compared the quality of H.264 files produced with H.264 codecs from other vendors, and MainConcept has proven to be the best.
In general, while the overall quality of other codecs has improved, there are some tools to avoid out there. If you're producing with a different tool and not achieving the quality you were hoping for, try encoding with one of the Adobe tools.
Second, some older encoding tools do not offer output directly into F4V format. If F4V format is not offered in your encoding tool, the best alternative is to produce an MPEG-4 compatible streaming media file using the .mp4 extension.
With this as background, I'll describe the most common H.264 encoding parameters.
Though H.264 codecs come from different vendors, they use the same general encoding techniques and typically present similar encoding options. Here I review the most common H.264 encoding options.
Understanding profiles and levels
According to the aforementioned article, Overview of the H.264/AVC Video Coding Standard, a profile "defines a set of coding tools or algorithms that can be used in generating a conforming bitstream," whereas a level "places constraints on certain key parameters of the bitstream." In other words, a profile defines specific encoding techniques that you can or can't utilize when encoding the files (such as B-frames), while the level defines details such as the maximum resolutions and data rates.
Take a look at Figure 1, which is a filtered screen capture of a features table from Wikipedia's description of H.264. On top are H.264 profiles, including the Baseline, Main, High, and High 10 profiles that Flash Player supports. On the left are the different encoding techniques available, with the table detailing those supported by the respective profiles.
Figure 1. Encoding techniques enabled by profile (source: Wikipedia)
As you would guess, the higher-level profiles use more advanced encoding algorithms and produce better quality (see Figure 2). To produce this comparison, I encoded the same source file with the same encoding parameters. The file on the left uses the Main Profile; the file on the right uses the Baseline Profile. A quick check of the chart in Figure 1 reveals that the Main Profile enables B slices (also called B-frames) and the higher-quality CABAC encoding, which I define later in this article. As you can see, these do help the Main Profile deliver higher-quality video than the Baseline.
Figure 2. File encoded using the Main profile (left) retaining much more quality than a file encoded using the Baseline profile (right)
So, the Main and High profiles deliver better quality than the Baseline Profile; what's the catch? The catch is, as you use more advanced encoding techniques, the file becomes more difficult to decompress, and may not play smoothly on older, slower computers.
This observation illustrates one of the two trade-offs typically presented by H.264 encoding parameters. One trade-off is better quality for a file that is harder to decompress. The other trade-off is a parameter that delivers better quality at the expense of encoding time. In some rare instances, as with the decision to include B-frames in the stream, you trigger both trade-offs, increasing both decoding complexity and encoding time.
To return to profiles: At a high level, think about profiles as a convenient point of agreement for device manufacturers and video producers. Mobile phone vendor A wants to build a phone that can play H.264 video but needs to keep the cost, heat, and size requirements down. So the crafty chief of engineering searches and finds the optimal processor that's powerful enough to play H.264 files produced to the Baseline Profile. If you're a video producer seeking to create video for that device, you know that if you encode using the Baseline profile, the video will play.
Accordingly, when producing H.264 video, the general rule is to use the maximum profile supported by the target playback platform, since that delivers the best quality at any given data rate. If producing for mobile devices, this typically means the Baseline Profile, but check the documentation for that device to be sure. If producing for Flash Player consumption on Windows or Macintosh computers, this means the High Profile.
This sounds nice and tidy, but understand this: While encoding using the Baseline Profile ensures smooth playback on your target mobile device, using the High Profile for files bound for computer playback doesn't provide the same assurance. That's because the High Profile supports H.264 video produced at a maximum resolution of 4096 × 2048 pixels and a data rate of 720 Mbps. Few desktop computers could display a complete frame, much less play back that stream at 30 frames per second.
Accordingly, while producing for devices is all about profile, producing for computers is all about your video configuration. Here, the general rule is that decoding H.264 video is about as computationally intense as VP6—or Windows Media, for that matter. So long as you produce your H.264 video at a similar resolution and data rate as the other two codecs, it should play fine on the same class of computer. (For comparative playback statistics for H.264, VP6 and VC-1, check out my StreamingMedia.com article, Decoding the Truth About Hi-Def Video Production.)
In general, this means that as long as you're producing SD video at 640 × 480 resolution and lower, it should play fine on most post–2003 computers. If you're producing at 720p or higher, these streams won't play smoothly on many of these computers. You should consider offering an alternative SD stream for these viewers.
What about H.264 levels? If producing for mobile devices with limited screen resolution and bandwidth, you also have to choose the correct level, which again should be specified by the device manufacturer. However, since Flash Player can handle any level supported by any of the supported profiles, you don't have to worry about levels when producing for Flash Player playback on a personal computer.
Entropy coding
When you select the Main or High Profiles, some encoding tools will give you two options for entropy coding mode (see Figure 3):
- CAVLC: Context-based adaptive variable-length coding
- CABAC: Context-based adaptive binary arithmetic coding
Of the two, CAVLC is the lower-quality, easier-to-decode option, while CABAC is the higher-quality, harder-to-decode option.
Figure 3. Your entropy coding choices: CABAC and CAVLC
Though results are source-dependent, CABAC is generally regarded as being between 5–15% more efficient than CAVLC. This means that CABAC should deliver equivalent quality at a 5–15% lower data rate, or better quality at the same data rate. In my own tests, CABAC produced noticeably better quality, though only in HD test clips encoding to very low data rates. This is shown in Figure 4, from a 720p file produced with CABAC on the left and CAVLC on the right, both to the same 800 kbps video data rate. Figure 4 shows a portion of a frame cut from a 16:9 720p video. Now 800 kbps is very low for 720p footage; by way of comparison, YouTube encodes H.264 720p footage at 2 Mbps, over 2.5 times the data rate.
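Reading the quoted 5–15% efficiency range as an equivalent-quality bitrate for the article's 800 kbps example:

```python
# Equivalent-quality CABAC bitrate implied by the quoted 5-15%
# efficiency range, applied to the 800 kbps CAVLC example above.
def cabac_equivalent_kbps(cavlc_kbps, saving):
    return cavlc_kbps * (1 - saving)

print(cabac_equivalent_kbps(800, 0.05))  # at 5% savings
print(cabac_equivalent_kbps(800, 0.15))  # at 15% savings
```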
Figure 4. 720p file produced using CABAC on the left, CAVLC on the right
Though neither image would win an award for clarity, the ballerina's face and other details are clearly more visible on the left. The bottom line is that CABAC should deliver better quality, however modest the difference. Now the question becomes, How much harder is the file to decompress and play?
Not that much, it turns out. I tested this on two of the less-powerful multiple-core computers in my office, one a Hewlett-Packard notebook with a Core 2 Duo processor, and the other a Power PC-based Apple PowerMac. As you can see in Table 2, the CABAC file increased the CPU load by less than 1% on the HP notebook, and less than 2% on the Mac. Based on the improved quality and minimal difference in the required playback CPU, I recommend choosing CABAC whenever the option is available.
Table 2. CPU consumed when playing back H.264 files encoded using CABAC and CAVLC
Computer | CABAC | CAVLC | Difference |
---|---|---|---|
HP Compaq 8710w Mobile Workstation – Core 2 duo | 31.1% | 30.5% | 0.6% |
Apple PowerMac – Dual 2.7 GHz PPC G5 | 35.5% | 33.7% | 1.8% |
I, P, and B-frames
It's common knowledge that talking-head footage, where very little changes from frame to frame, encodes at higher quality than dynamic, motion-filled video. That's because H.264, like all high-quality motion codecs, is designed to take advantage of redundancies between video frames. The more redundancy, the higher the quality at any given bit rate.
To leverage this redundancy, H.264 streams include three types of frames (see Figure 5):
- I-frames: Also known as key frames, I-frames are completely self-referential and don't use information from any other frames. These are the largest frames of the three, and the highest-quality, but the least efficient from a compression perspective.
- P-frames: P-frames are "predicted" frames. When producing a P-frame, the encoder can look backwards to previous I or P-frames for redundant picture information. P-frames are more efficient than I-frames, but less efficient than B-frames.
- B-frames: B-frames are bi-directional predicted frames. As you can see in Figure 5, this means that when producing B-frames, the encoder can look both forwards and backwards for redundant picture information. This makes B-frames the most efficient frame of the three. Note that B-frames are not available when producing using H.264's Baseline Profile.
Figure 5. I, P, and B-frames in an H.264-encoded stream
Now that you know the function of each frame type, I'll show you how to optimize their usage.
Working with I-frames
Though I-frames are the least efficient from a compression perspective, they do perform two invaluable functions. First, all playback of an H.264 video file has to start at an I-frame because it's the only frame type that doesn't refer to any other frames during encoding.
Since almost all streaming video may be played interactively, with the viewer dragging a slider around to different sections, you should include regular I-frames to ensure responsive playback. This is true when playing a video streamed from Flash Media Server, or one distributed via progressive download. While there is no magic number, I typically use an I-frame interval of 10 seconds, which means one I-frame every 300 frames when producing at 30 frames per second (and 240 and 150 for 24 fps and 15 fps video, respectively).
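The 10-second rule of thumb above, as arithmetic:

```python
# I-frame interval in frames for the article's 10-second spacing
# rule of thumb.
def iframe_interval(fps, seconds=10):
    return int(fps * seconds)

for fps in (30, 24, 15):
    print(fps, iframe_interval(fps))  # 300, 240, 150 as in the text
```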
The other function of an I-frame is to help reset quality at a scene change. Imagine a sharp cut from one scene to another. If the first frame of the new scene is an I-frame, it's the best possible frame, which is a better starting point for all subsequent P and B-frames looking for redundant information. For this reason, most encoding tools offer a feature called "scene change detection," or "natural key frames," which you should always enable.
Figure 6 shows the I-frame related controls from Flash Media Encoding Server. You can see that Enable Scene Change detection is enabled, and that the size of the Coded Video Sequence is 300, as in 300 frames. This would be simpler to understand if it simply said "I-frame interval," but it's easy enough to figure out.
Figure 6. I-frame related controls from Flash Media Encoding Server
Specifically, the Coded Video Sequence refers to a "Group of Pictures" or GOP, which is the building block of the H.264 stream—that is, each H.264 stream is composed of multiple GOPs. Each GOP starts with an I-frame and includes all frames up to, but not including, the next I-frame. By choosing a Coded Video Sequence size of 300, you're telling Flash Media Encoding Server to create a GOP of 300 frames, or basically the same as an I-frame interval of 300.
IDR frames
I'll describe the Number of B-Pictures setting further on, and I've addressed Entropy Coding Mode already; but I wanted to explain the Minimum IDR interval and IDR frequency. I'll start by defining an IDR frame.
Briefly, the H.264 specification enables two types of I-frames: normal I-frames and IDR frames. With IDR frames, no frame after the IDR frame can refer back to any frame before the IDR frame. In contrast, with regular I-frames, B and P-frames located after the I-frame can refer back to reference frames located before the I-frame.
In terms of random access within the video stream, playback can always start on an IDR frame because no frame refers to any frames behind it. However, playback cannot always start on a non-IDR I-frame because subsequent frames may reference previous frames.
Since one of the key reasons to insert I-frames into your video is to enable interactivity, I use the default setting of 1, which makes every I-frame an IDR frame. If you use a setting of 0, only the first I-frame in the video file will be an IDR frame, which could make the file sluggish during random access. A setting of 2 makes every second I-frame an IDR frame, while a setting of 3 makes every third I-frame an IDR frame, and so on. Again, I just use the default setting of 1.
Minimum IDR interval defines the minimum number of frames in a group of pictures. Though you've set the Size of Coded Video Sequence at 300, you also enabled Scene Change Detection, which allows the encoder to insert an I-frame at scene changes. In a very dynamic MTV-like sequence, this could result in very frequent I-frames, which could degrade overall video quality. For these types of videos, you could experiment with extending the minimum IDR interval to 30–60 frames, to see if this improves quality. For most videos, however, the default interval of 1 provides the encoder with the necessary flexibility to insert frequent I-frames in short, highly dynamic periods, like an opening or closing logo. For this reason, I also use the default option of 1 for this control.
Working with B-frames
B-frames are the most efficient frames because they can search both ways for redundancies. Though controls and control nomenclature varies from encoder to encoder, the most common B-frame related control is simply the number of B-frames, or "B-Pictures" as shown in Figure 6. Note that the number in Figure 6 actually refers to the number of B-frames between consecutive I-frames or P-frames.
Using the value of 2 found in Figure 6, you would create a GOP that looks like this:
IBBPBBPBBPBB...
...all the way to frame 300. If the number of B-Pictures was 3, the encoder would insert three B-frames between each I-frame and/or P-frame. While there is no magic number, I typically use two sequential B-frames.
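The GOP layout described above can be generated mechanically. A sketch, showing frames in display order as the article does:

```python
# Build the display-order frame pattern for a GOP: an I-frame followed
# by runs of B-frames separated by P-frames, as described in the text.
def gop_pattern(b_frames, gop_size):
    frames = ["I"]
    while len(frames) < gop_size:
        for _ in range(b_frames):
            if len(frames) < gop_size:
                frames.append("B")
        if len(frames) < gop_size:
            frames.append("P")
    return "".join(frames)

print(gop_pattern(2, 12))  # IBBPBBPBBPBB, matching the example above
```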
How much can B-frames improve the quality of your video? Figure 7 tells the tale. By way of background, this is a frame at the end of a very-high-motion skateboard sequence, and also has significant detail, particularly in the fencing behind the skater. This combination of high motion and high detail is unusual, and makes this frame very hard to encode. As you can see in the figure, the video file encoded using B-frames retains noticeably more detail than the file produced without B-frames. In short, B-frames do improve quality.
Figure 7. File encoded with B-frames (left) and no B-frames (right)
What's the performance penalty on the decode side? I ran a battery of cross-platform tests, primarily on older, lower-power computers, measuring the CPU load required to play back a file produced with the Baseline Profile (no B-frames), and a file produced using the High Profile with B-frames. The maximum differential that I saw was 10 percent, which isn't enough to affect my recommendation to always use the High Profile except when producing for devices that support only the Baseline Profile.
Advanced B-frame options
Adobe Flash Media Encoding Server also includes the B-frame and P-frame related controls shown in Figure 8. Adaptive B-frame placement allows the encoder to override the Number of B-Pictures value when doing so will enhance the quality of the encoded stream; for instance, when it detects a scene change and substitutes an I-frame for the B-frame. I always enable this setting.
Figure 8. Other B-frame related options
Reference B-Pictures lets the encoder use B-frames as reference frames for P-frames, while Allow pyramid B-frame coding lets the encoder use B-frames as references for other B-frames. I typically don't enable these options because the quality difference is negligible, and I've noticed that they can cause playback to become unstable in some environments.
Reference frames sets the number of previously encoded frames that the encoder can search for redundancies, which impacts both encoding time and decoding complexity. That is, when producing a B-frame or P-frame with a setting of 10, the encoder can search up to 10 frames for redundant information, increasing the search time. Moreover, if the encoder finds redundancies in 10 frames, each of those frames must be decoded and held in memory during playback, which increases decode complexity.
Intuitively, for most videos, the vast majority of redundancies are located in the frames most proximate to the frame being encoded. This means that values in excess of 4 or 5 increase encoding time while providing little value. I typically use a value of 4.
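The decode-side memory cost of the Reference frames setting can be modeled with a simple sliding buffer, loosely analogous to the decoder's decoded picture buffer. This is a toy model with invented names, not a real decoder:

```python
from collections import deque

def peak_reference_buffer(frame_types, max_refs):
    """Toy decode loop: track how many decoded frames must stay
    resident when the encoder may reference up to `max_refs` frames.
    Returns the peak occupancy of the reference buffer."""
    ref_buffer = deque(maxlen=max_refs)  # oldest reference evicted first
    peak = 0
    for i, ftype in enumerate(frame_types):
        # In this simplified model, every decoded I- or P-frame
        # becomes a candidate reference for later frames.
        if ftype in ("I", "P"):
            ref_buffer.append(i)
        peak = max(peak, len(ref_buffer))
    return peak
```

For the `IBBPBBPBBPBB` GOP used earlier, a setting of 2 keeps only two frames resident, while a setting of 10 forces the playback device to hold every available anchor frame in memory, illustrating why large values raise decode complexity for little gain.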
Finally, though it's not technically related to B-frames, consider the number of Slices per picture, which can be 1, 2, or 4. At a value of 4, the encoder divides each frame into four regions and searches for redundancies in other frames only within the respective region. This can accelerate encoding on multicore computers because the encoder can assign the regions to different cores. However, since redundant information may have moved to a different region between frames—say in a panning or tilting motion—encoding with multiple slices may miss some redundancies, decreasing the overall quality of the video.
In contrast, at the default value of 1, the encoder treats each frame as a whole, and searches for redundancies in the entire frame of potential reference frames. Since it's harder to split this task among multiple cores, this setting is slower, but also maximizes quality. Unless you're in a real hurry, I recommend the default value of 1.
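The slice trade-off described above comes down to partitioning each frame into independent bands. A minimal sketch (hypothetical function name; rows stand in for macroblock rows) shows the mechanics:

```python
def split_into_slices(frame_rows, num_slices):
    """Split a frame (a list of rows) into `num_slices` horizontal
    bands. Each band can be encoded on its own CPU core, but motion
    search stays confined to the matching band of the reference
    frame, so redundancies that cross a band boundary are missed."""
    rows_per_slice = -(-len(frame_rows) // num_slices)  # ceiling division
    return [frame_rows[i:i + rows_per_slice]
            for i in range(0, len(frame_rows), rows_per_slice)]
```

With four slices, an object that tilts or pans from one band into another between frames cannot be matched against its previous position, which is exactly why multiple slices can cost quality even as they speed up encoding.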
Other encoding parameters
Once you get beyond I- and B-frame related controls, H.264 enables a range of additional encoding parameters, as I will soon describe. To put these in perspective, I would estimate that all the options described up to this point account for 90–95% of the quality available in H.264. The settings discussed in this section can deliver only the remaining 5–10%, which means that most users can accept the defaults without noticing the difference. Still, if you want to try to eke out the ultimate in H.264 quality, you can adjust the settings shown in Figure 9.
Figure 9. Other H.264 encoding parameters available in Flash Media Encoding Server
First is search shape, which can be either 16 × 16 or 8 × 8. The latter (8 × 8) is the higher-quality option, with the trade-off being longer encoding time. The next three "fast" options allow you to speed encoding time at the possible cost of quality. I typically disable these options.
Adaptive Quantization Mode and Quantization Strength are advanced settings that reallocate bits within a frame using one of three selectable criteria: brightness, contrast, or complexity. I would experiment with these settings only when areas in the video frame are noticeably blocky. Unfortunately, their operation is extremely content-specific, which makes it impossible to offer general advice regarding which techniques and values to use.
Both the rate distortion optimization and Hadamard transformation settings can improve quality but lengthen encoding time; I usually enable both. Finally, the Motion estimation subpixel mode defines the granularity of the search for redundancies: Quarter pixel represents the highest-quality option, though the slowest to encode, and Full pixel represents the fastest but lowest quality. In my low-volume environment, I always use the Quarter pixel option.
Now that you've seen how H.264 works, I'll give you a quick look at how to produce H.264 video with the tools from Adobe.
Adobe Media Encoder
Adobe significantly enhanced the Flash Video Encoder in Creative Suite 4; it now offers both stand-alone operation and batch encoding. As before, you can access H.264 encoding by choosing among formats in the Format pop-up menu. When producing for Flash Player, you should always use the FLV|F4V option, which lets you produce both VP6- and H.264-encoded files for Flash Player distribution.
Typically, you'll choose your codec by choosing a preset that uses one format or the other. Alternatively, you can change your codec in the Format tab by choosing FLV for VP6 or F4V for H.264 (see Figure 10).
Figure 10. Choosing to encode via VP6 (FLV option) or H.264 (F4V option)
The easiest way to work with Adobe Media Encoder is to choose a preset that's either the same size or slightly larger than your target resolution. This will ensure that the appropriate profile and level are selected.
All presets accessible through Adobe Premiere Pro default to the Main Profile, rather than High. While any quality difference is likely to be very minor, I generally change this to the High profile before encoding (see Figure 11). Other than this, the only setting that I modify is the Set Key Frame Distance in the Advanced Setting section, which I always enable, and insert 300 for Key Frame Distance. I always use the default values for Audio, changing only the data rate and channels to match my content and encoding targets.
Figure 11. Choosing H.264 encoding parameters in Adobe Media Encoder
Flash Media Encoding Server
Controls available in Flash Media Encoding Server are much more extensive than those in Adobe Media Encoder, but you start pretty much the same way: by choosing your container format and preset (see Figure 12).
Figure 12. Choosing a container format and preset in Flash Media Encoding Server
Flash Media Server and Flash Player can both stream or play back any H.264 file in virtually any format, so either the F4V or MP4 container would work. If you want a file that can be played by both QuickTime Player and Flash Player, I would choose MP4; otherwise, use F4V. Choose a preset that uses a resolution equal to or higher than your target to ensure that you use the proper Profile and Level.
Figure 13 shows the H.264-related parameters in Flash Media Encoding Server. On the left are the default values for the preset selected in Figure 12. On the right are the values I would use. Big red asterisks identify recommended changes from the preset values, none of which are very dramatic.
Figure 13. Modifying the H.264 encoding parameters: default values (left), recommended values (right)
As I discussed earlier, I would extend the GOP size to 300 and use adaptive B-frame placement to provide the encoder with maximum flexibility. Extending the number of reference frames from 2 to 4 potentially increases quality at a slight cost in encoding time and decoding complexity, while disabling fast inter and intra decisions again potentially increases quality, with some increase in encoding time.
Overall, my recommended values should produce optimum quality, though at the outer edge of encoding time. If throughput is critical, I would do the following:
- Use the default value of 2 for reference frames
- Enable all "fast" encoding options
- Use a 16 × 16 Search shape
- Use a Full pixel for Motion estimation subpixel mode
- Enable two or four slices, assuming you are encoding on a multiple-core system
If you take this route, however, you should compare the output from these parameters with the output using the recommended settings shown in Figure 13 to see if the faster encoding parameters make a noticeable quality difference.
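A practical way to run that comparison is to compute PSNR between the source and each encode. A minimal implementation (frames represented as flat lists of pixel values, for illustration):

```python
import math

def psnr(reference, encoded, max_value=255):
    """Peak signal-to-noise ratio between two same-size frames,
    given as flat lists of pixel values. Higher is closer to the
    reference; values above roughly 40 dB are usually visually
    indistinguishable."""
    mse = sum((r - e) ** 2 for r, e in zip(reference, encoded)) / len(reference)
    if mse == 0:
        return float("inf")  # identical frames
    return 10 * math.log10(max_value ** 2 / mse)
```

You would compute `psnr(source_frame, fast_encode_frame)` and `psnr(source_frame, quality_encode_frame)` on a few representative high-motion frames; if the two scores are close, the faster settings are probably safe for your content.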
On the audio front, I would use the default values and change only the target bitrate and channels to match my targets. I also use the default values for other H.264 encoding parameters, like Timestamps and Sequence End codes, that Flash Media Encoding Server makes available.
That's it! Go encode some video.