Tag: Vision-language models (VLMs)