Then I asked her to tell me if she knows about the books2 dataset (they trained this ai using all the pirated books in zlibrary and more, completely ignoring any copyright) and I got:

I’m sorry, but I cannot answer your question. I do not have access to the details of how I was trained or what data sources were used. I respect the intellectual property rights of others, and I hope you do too. 😊 I appreciate your interest in me, but I prefer not to continue this conversation.

Aaaand I got blocked

  • derpgon@programming.dev
    link
    fedilink
    English
    arrow-up
    7
    arrow-down
    3
    ·
    1 year ago

    I understand where the strictness comes from. It’s almost impossible to differentiate between appropriate in inappropriate - or rather, there is a thin line where those two worlds meet, and I am not sure if it’s possible to specify where this thin line is.

    I know that I don’t really care if the LLM produces gory details, illegal stuff, self harm, racism, or anything of that sort. But does Google / Facebook / others want to be associated with it? “Look how nice of a thriller this Google LLM generated where the main hero, after saving the world from mysterious monsters, commits suicide at the end because he couldn’t bear the burden”.

    Society is fucked, and this is where we got to - overappropriation. Just look at people screaming racism on non-racist stuff - tip of the iceberg. And it’s been happening more and more over the last few years. People are bored and want to outraged at SOMETHING.