• blank_sl8@lemmy.ml
    link
    fedilink
    English
    arrow-up
    1
    ·
    2 years ago

    Google especially is working on multimodal models that do both language and image, audio, etc understanding in the same model. Their latest work, PaLM-E, demonstrates that learning in one domain (eg images) can indirectly benefit the model’s performance in other domains (eg text) without additional training in the other domain.