• @[email protected]
    link
    fedilink
    English
    6
    edit-2
    7 months ago

    Yet AI researcher Pablo Villalobos told the Journal that he believes that GPT-5 (OpenAI’s next model) will require at least five times the training data of GPT-4.

    I tried finding the non-layman’s version of the reasoning behind this assertion, and it appears to be a fairly black-box assessment: extrapolation from historical trends, plus some similarly abstracted attempts at modelling dataset size vs model size.
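    To make the kind of extrapolation I mean concrete, here’s a rough sketch of the two heuristics such projections tend to combine: the Chinchilla-style “tokens per parameter” rule of thumb from Hoffmann et al. (roughly 20 tokens per parameter), and a simple trend multiplier like the 5x figure in the quote. The parameter count and both ratios below are illustrative assumptions on my part, not anything EpochAI published.

    ```python
    def chinchilla_optimal_tokens(params: float, ratio: float = 20.0) -> float:
        """Chinchilla heuristic: compute-optimal training uses roughly
        `ratio` tokens per model parameter (~20 in Hoffmann et al.)."""
        return ratio * params

    def projected_next_gen_tokens(current_tokens: float, multiplier: float = 5.0) -> float:
        """Trend-based projection: assume the next model needs `multiplier`
        times the current training data (the 5x figure from the quote)."""
        return multiplier * current_tokens

    # Hypothetical 1e12-parameter model under the heuristic:
    current = chinchilla_optimal_tokens(1e12)      # 2e13 tokens
    projected = projected_next_gen_tokens(current)  # 1e14 tokens
    print(f"current ≈ {current:.0e} tokens, next gen ≈ {projected:.0e} tokens")
    ```

    Which is exactly why I call it black-box: everything interesting is hidden in those two constants.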

    This is EpochAI’s whole thing, apparently; not that there’s necessarily anything wrong with that. I was just hoping for some insight into dataset size vs architecture, and maybe the gossip on what’s going on with the next batch of LLMs, like how it eventually came out that gpt4.x is mostly several gpt3.xs in a trench coat.