Using the GPU to render video

Category: Linux. Published: Sunday, 31 October 2021. Written by Helio Loureiro.

During the last recording of Unix Load On, Ingo mentioned hardware acceleration via VAAPI.  I had assumed this was only possible with NVIDIA GPUs, but after he mentioned VAAPI running on Intel, I went looking into how it works.

So I ran a test: creating a video from the pictures I took while swapping my bicycle's tires for the winter ones.

The first run used the CPU.

  $ rm -f out.mp4; time ffmpeg -r 60 -i image-%03d.jpg -c:v libx264 -vf fps=60 -pix_fmt yuv420p out.mp4
  ffmpeg version 3.4.8-0ubuntu0.2 Copyright (c) 2000-2020 the FFmpeg developers
    built with gcc 7 (Ubuntu 7.5.0-3ubuntu1~18.04)
    configuration: --prefix=/usr --extra-version=0ubuntu0.2
    --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu
    --incdir=/usr/include/x86_64-linux-gnu --enable-gpl --disable-stripping
    --enable-avresample --enable-avisynth --enable-gnutls --enable-ladspa
    --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca
    --enable-libcdio --enable-libflite --enable-libfontconfig
    --enable-libfreetype --enable-libfribidi --enable-libgme
    --enable-libgsm --enable-libmp3lame --enable-libmysofa
    --enable-libopenjpeg --enable-libopenmpt --enable-libopus
    --enable-libpulse --enable-librubberband --enable-librsvg
    --enable-libshine --enable-libsnappy --enable-libsoxr
    --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame
    --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp
    --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq
    --enable-libzvbi --enable-omx --enable-openal --enable-opengl
    --enable-sdl2 --enable-libdc1394 --enable-libdrm --enable-libiec61883
    --enable-chromaprint --enable-frei0r --enable-libopencv --enable-libx264
    --enable-shared
    libavutil      55. 78.100 / 55. 78.100
    libavcodec     57.107.100 / 57.107.100
    libavformat    57. 83.100 / 57. 83.100
    libavdevice    57. 10.100 / 57. 10.100
    libavfilter     6.107.100 /  6.107.100
    libavresample   3.  7.  0 /  3.  7.  0
    libswscale      4.  8.100 /  4.  8.100
    libswresample   2.  9.100 /  2.  9.100
    libpostproc    54.  7.100 / 54.  7.100
  Input #0, image2, from 'image-%03d.jpg':
    Duration: 00:00:25.24, start: 0.000000, bitrate: N/A
      Stream #0:0: Video: mjpeg, yuvj420p(pc, bt470bg/unknown/unknown),
      4000x3000 [SAR 1:1 DAR 4:3], 25 fps, 25 tbr, 25 tbn, 25 tbc
  Stream mapping:
    Stream #0:0 -> #0:0 (mjpeg (native) -> h264 (libx264))
  Press [q] to stop, [?] for help
  [swscaler @ 0x56448c3106c0] deprecated pixel format used, make sure you did set range correctly
  [libx264 @ 0x56448c268ee0] using SAR=1/1
  [libx264 @ 0x56448c268ee0] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
  [libx264 @ 0x56448c268ee0] profile High, level 6.0
  [libx264 @ 0x56448c268ee0] 264 - core 152 r2854 e9a5903 - H.264/MPEG-4
  AVC codec - Copyleft 2003-2017 - http://www.videolan.org/x264.html -
  options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7
  psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1
  8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=6
  lookahead_threads=1 sliced_threads=0 nr=0 decimate=1 interlaced=0
  bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1
  b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=25
  scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0
  qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
  Output #0, mp4, to 'out.mp4':
    Metadata:
      encoder         : Lavf57.83.100
      Stream #0:0: Video: h264 (libx264) (avc1 / 0x31637661), yuv420p,
      4000x3000 [SAR 1:1 DAR 4:3], q=-1--1, 60 fps, 15360 tbn, 60 tbc
      Metadata:
        encoder         : Lavc57.107.100 libx264
      Side data:
        cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: -1
  frame=  631 fps=1.1 q=-1.0 Lsize=   64381kB time=00:00:10.46 bitrate=50388.8kbits/s speed=0.0188x
  video:64372kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.013232%
  [libx264 @ 0x56448c268ee0] frame I:11    Avg QP:23.13  size:216485
  [libx264 @ 0x56448c268ee0] frame P:236   Avg QP:25.43  size:133712
  [libx264 @ 0x56448c268ee0] frame B:384   Avg QP:25.38  size: 83278
  [libx264 @ 0x56448c268ee0] consecutive B-frames: 14.1% 11.7%  7.6% 66.6%
  [libx264 @ 0x56448c268ee0] mb I  I16..4: 16.8% 81.8%  1.4%
  [libx264 @ 0x56448c268ee0] mb P  I16..4: 24.8% 44.9%  0.4%  P16..4:
  16.1%  1.8%  0.9%  0.0%  0.0%    skip:11.1%
  [libx264 @ 0x56448c268ee0] mb B  I16..4: 12.5% 18.4%  0.1%  B16..8:
  21.0%  2.2%  0.3%  direct: 2.5%  skip:43.1%  L0:50.8% L1:46.8% BI: 2.4%
  [libx264 @ 0x56448c268ee0] 8x8 transform intra:62.8% inter:92.5%
  [libx264 @ 0x56448c268ee0] coded y,uvDC,uvAC intra: 29.1% 18.4% 0.7% inter: 11.4% 16.9% 0.1%
  [libx264 @ 0x56448c268ee0] i16 v,h,dc,p: 41% 27% 23%  9%
  [libx264 @ 0x56448c268ee0] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 29% 22% 42%  2%  1%  1%  1%  1%  1%
  [libx264 @ 0x56448c268ee0] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 31% 17% 19%  6%  6%  6%  6%  5%  3%
  [libx264 @ 0x56448c268ee0] i8c dc,h,v,p: 66% 16% 18%  1%
  [libx264 @ 0x56448c268ee0] Weighted P-Frames: Y:33.5% UV:16.5%
  [libx264 @ 0x56448c268ee0] ref P L0: 40.1% 13.1% 23.2% 17.1%  6.5%
  [libx264 @ 0x56448c268ee0] ref B L0: 59.5% 30.9%  9.6%
  [libx264 @ 0x56448c268ee0] ref B L1: 83.0% 17.0%
  [libx264 @ 0x56448c268ee0] kb/s:50142.36
  1190.99user 25.61system 9:17.29elapsed 218%CPU (0avgtext+0avgdata 2714684maxresident)k
  117280inputs+128944outputs (45major+630483minor)pagefaults 0swaps

9 minutes and 17 seconds, using 218% CPU on a laptop with 4 CPUs.  That is the normal way of doing it.
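
As a small aside, the CPU percentage reported by time is simply the user plus system CPU seconds divided by the elapsed wall-clock seconds. The sketch below redoes that arithmetic with the numbers from the log above; the helper names to_seconds and cpu_percent are made up for this illustration.

```shell
#!/bin/sh
# Convert a "minutes:seconds" elapsed value, as printed by time, into seconds.
to_seconds() {
  echo "$1" | awk -F: '{ printf "%.2f\n", $1 * 60 + $2 }'
}

# CPU percentage = (user + system) / elapsed * 100
cpu_percent() {
  awk -v u="$1" -v s="$2" -v e="$3" 'BEGIN { printf "%.0f\n", (u + s) / e * 100 }'
}

# Values taken from the libx264 run: 1190.99user 25.61system 9:17.29elapsed
elapsed=$(to_seconds 9:17.29)
echo "elapsed: ${elapsed}s"
echo "cpu: $(cpu_percent 1190.99 25.61 "$elapsed")%"
```

This prints an elapsed time of 557.29s and a CPU figure of 218%, which means a bit more than two cores were kept busy on average during the encode.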

And using the GPU for this?  It took me a while to get the VAAPI parameters right, but in the end it worked.
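
Before trying the hardware encoder it may help to check that the pieces are in place. The sketch below is only a rough check, not part of the original test: it assumes the render node is /dev/dri/renderD128 (the path used in this post, it may differ on other machines) and that the ffmpeg build lists its encoders with -encoders. The has_encoder helper and the RENDER_NODE variable are names invented for this example. If the vainfo tool is installed, it can additionally show which profiles the driver exposes.

```shell
#!/bin/sh
# Sketch: check the prerequisites for a VAAPI encode.
# Assumption: the render node path below is the one used in this post.
RENDER_NODE="${RENDER_NODE:-/dev/dri/renderD128}"

# Succeeds if the encoder listing read from stdin mentions the given name.
has_encoder() {
  grep -qw -- "$1"
}

if [ -e "$RENDER_NODE" ]; then
  echo "render node found: $RENDER_NODE"
else
  echo "render node missing: $RENDER_NODE"
fi

if command -v ffmpeg >/dev/null 2>&1; then
  if ffmpeg -hide_banner -encoders 2>/dev/null | has_encoder h264_vaapi; then
    echo "ffmpeg offers h264_vaapi"
  else
    echo "ffmpeg has no h264_vaapi encoder"
  fi
else
  echo "ffmpeg not installed"
fi
```

A positive result here only says the encoder was compiled in and a render node exists; whether the GPU driver actually supports H.264 encoding still shows up only when the real command runs.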

  $ rm -f out.mp4; time ffmpeg -vaapi_device /dev/dri/renderD128 -r 60 -i image-%03d.jpg -vf 'format=nv12,hwupload,fps=60' -c:v h264_vaapi  -pix_fmt vaapi_vld out.mp4
  ffmpeg version 3.4.8-0ubuntu0.2 Copyright (c) 2000-2020 the FFmpeg developers
    built with gcc 7 (Ubuntu 7.5.0-3ubuntu1~18.04)
    configuration: --prefix=/usr --extra-version=0ubuntu0.2
    --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu
    --incdir=/usr/include/x86_64-linux-gnu --enable-gpl --disable-stripping
    --enable-avresample --enable-avisynth --enable-gnutls --enable-ladspa
    --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca
    --enable-libcdio --enable-libflite --enable-libfontconfig
    --enable-libfreetype --enable-libfribidi --enable-libgme
    --enable-libgsm --enable-libmp3lame --enable-libmysofa
    --enable-libopenjpeg --enable-libopenmpt --enable-libopus
    --enable-libpulse --enable-librubberband --enable-librsvg
    --enable-libshine --enable-libsnappy --enable-libsoxr
    --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame
    --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp
    --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq
    --enable-libzvbi --enable-omx --enable-openal --enable-opengl
    --enable-sdl2 --enable-libdc1394 --enable-libdrm --enable-libiec61883
    --enable-chromaprint --enable-frei0r --enable-libopencv --enable-libx264
    --enable-shared
    libavutil      55. 78.100 / 55. 78.100
    libavcodec     57.107.100 / 57.107.100
    libavformat    57. 83.100 / 57. 83.100
    libavdevice    57. 10.100 / 57. 10.100
    libavfilter     6.107.100 /  6.107.100
    libavresample   3.  7.  0 /  3.  7.  0
    libswscale      4.  8.100 /  4.  8.100
    libswresample   2.  9.100 /  2.  9.100
    libpostproc    54.  7.100 / 54.  7.100
  Input #0, image2, from 'image-%03d.jpg':
    Duration: 00:00:25.24, start: 0.000000, bitrate: N/A
      Stream #0:0: Video: mjpeg, yuvj420p(pc, bt470bg/unknown/unknown),
      4000x3000 [SAR 1:1 DAR 4:3], 25 fps, 25 tbr, 25 tbn, 25 tbc
  Stream mapping:
    Stream #0:0 -> #0:0 (mjpeg (native) -> h264 (h264_vaapi))
  Press [q] to stop, [?] for help
  [swscaler @ 0x5577c4cfb3c0] deprecated pixel format used, make sure you did set range correctly
  Output #0, mp4, to 'out.mp4':
    Metadata:
      encoder         : Lavf57.83.100
      Stream #0:0: Video: h264 (h264_vaapi) (High) (avc1 / 0x31637661),
      vaapi_vld, 4000x3000 [SAR 1:1 DAR 4:3], q=0-31, 60 fps, 15360 tbn,
      60 tbc
      Metadata:
        encoder         : Lavc57.107.100 h264_vaapi
  frame=  631 fps=9.3 q=-0.0 Lsize=  133604kB time=00:00:10.48
  bitrate=104401.7kbits/s speed=0.154x
  video:133596kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB
  muxing overhead: 0.006032%
  59.94user 1.03system 1:08.18elapsed 89%CPU (0avgtext+0avgdata 120480maxresident)k
  3504inputs+267264outputs (0major+44766minor)pagefaults 0swaps

The result?  1 minute and 08 seconds.  Incredible.  It used 89% CPU, which means that of the 4 CPUs, only one was busy.  And not even at 100%, only about 90%.
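
To put the two runs side by side, the sketch below computes the speedup and the output size ratio from the figures in the two logs (elapsed times converted to seconds, and the Lsize values in kB). The ratio helper and the variable names are invented for this illustration.

```shell
#!/bin/sh
# ratio A B: print A / B with two decimals.
ratio() {
  awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

cpu_elapsed=557.29     # 9:17.29 elapsed with libx264
gpu_elapsed=68.18      # 1:08.18 elapsed with h264_vaapi
cpu_size_kb=64381      # Lsize of the libx264 output
gpu_size_kb=133604     # Lsize of the h264_vaapi output

echo "speedup: $(ratio "$cpu_elapsed" "$gpu_elapsed")x"
echo "size ratio: $(ratio "$gpu_size_kb" "$cpu_size_kb")x"
```

This prints a speedup of 8.17x and a size ratio of 2.08x. The hardware encode was roughly eight times faster, but its file came out about twice as large (around 104 Mbit/s against around 50 Mbit/s in the logs), so this compares speed with each encoder's default settings rather than at matched quality or bitrate.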

Truly incredible.

Not every video format is supported, but for this kind of performance I think it is worth giving up the other formats to generate a simple mp4.

 UPDATE: 2021-11-01.  The video I rendered for the tests.

 
