Image Prompt Reverse-Engineering with JoyCaption, the Strongest Image-to-Prompt Model

Sample Output

This photograph captures the Palace of Heavenly Purity in the Forbidden City, Beijing, China, under a bright blue sky with a large, fluffy white cloud. The foreground features a wide, empty courtyard with a curved white stone wall. The midground showcases a large, red-roofed building with intricate gold-tiled roofs and white trim, adorned with archways. The building’s architectural style is traditional Chinese, with layered eaves and ornate details. The background includes more red-roofed structures and a distant white building. The image includes a watermark in the bottom center with Chinese characters and a small house icon. The overall scene is serene and majestic.

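JoyCaption Image-to-Prompt Model: User Guide

1. Introduction

  • Model overview
    JoyCaption combines a SigLIP vision encoder with Meta's Llama 3.1 language model in a fused architecture designed for high-precision image captioning. It can generate prompts for platforms such as Stable Diffusion and MidJourney, and it supports several tag styles (e.g. Danbooru, e621).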

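  • Key strengths

    • Captioning accuracy: outperforms Florence2 in testing and beats MiniCPM in some scenes, with especially strong recovery of character attributes (hairstyle, age)
    • Multimodal support: beyond reverse-engineering prompts, it also supports text expansion, video understanding, and batch processing
    • Quantized version: an NF4-quantized model is available that runs on 8 GB of VRAM and works with 50-series GPUs
  • Performance comparison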

    Model          Processing time   Portrait fidelity   Scene detail
    JoyCaption     50-55 s           ★★★★☆               ★★★★★
    Florence2      24 s              ★★☆☆☆               ★★★☆☆
    MiniCPM-V2.6   52 s              ★★★☆☆               ★★★★☆
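
The 8 GB VRAM figure for the NF4 build can be sanity-checked with a rough weight-memory estimate. The sketch below is only illustrative: the parameter counts (an 8B Llama 3.1 language model and a vision encoder of roughly 0.4B parameters kept in 16-bit) are assumptions not stated in this article, and it ignores quantization constants, activations, and the KV cache.

```python
# Rough weight-memory estimate for a vision encoder + LLM captioner.
# Both parameter counts are ASSUMPTIONS for illustration, not figures from the article.
LLM_PARAMS = 8.0e9      # assumed: an 8B-parameter Llama 3.1 variant
VISION_PARAMS = 0.4e9   # assumed: SigLIP-class encoder, left unquantized at 16-bit

# Bits stored per LLM weight under each precision.
BITS = {"fp16/bf16": 16, "int8": 8, "nf4": 4}


def weight_gib(params: float, bits: int) -> float:
    """Memory needed for the weights alone, in GiB."""
    return params * bits / 8 / 2**30


def estimate(llm_bits: int) -> float:
    """Total weight memory: quantized LLM plus a 16-bit vision encoder."""
    return weight_gib(LLM_PARAMS, llm_bits) + weight_gib(VISION_PARAMS, 16)


if __name__ == "__main__":
    for name, bits in BITS.items():
        print(f"{name:>9}: ~{estimate(bits):.1f} GiB of weights")
```

Under these assumptions the NF4 weights come to roughly 4.5 GiB, versus about 15.6 GiB at 16-bit, which leaves headroom on an 8 GB card for activations and the KV cache. Actual usage depends on the loader, the image resolution, and the caption length.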
