In multi-view sports broadcasting, extracting high-quality highlights and selecting the optimal camera angle from concurrent video streams are traditionally labor-intensive tasks that also carry significant computational overhead. This thesis explores the feasibility of automating the production of multi-view sports highlights using Multimodal Large Language Models (MLLMs). To manage the high computatio
