The big change from GPT-3.5 is that OpenAI's fourth-generation language model is multimodal, which means it can process text, images, and audio. You can show it an image and it will respond to it alongside a text prompt – an early example of this, noted by The New York Times