The scraped data of 2.6 million DuoLingo users was leaked on a hacking forum, allowing threat actors to conduct targeted phishing attacks using the exposed information.

  • RanchOnPancakes@lemmy.world
    English
    arrow-up
    93
    arrow-down
    6
    ·
    1 year ago

    Oh no. Now they know the aliased email address, unique password, and that I didn’t try very hard to learn Spanish.

    (please note: this is a joke, I don’t see anything about them getting passwords)

    • stevedidWHAT@lemmy.world
      English
      arrow-up
      29
      ·
      1 year ago

      Something to note here - with AI, if you’re using any sort of heuristic for your password, it’s pretty simple to work out a pretty good set of possibilities which makes brute force even easier and puts you at risk across the board.

      Always come up with random passwords that are as random as possible. If there’s a path you took to get to a password, in theory it can be worked backward.

      For example, I know some people who change only a single letter when changing their passwords, which is trivial to guess if the old password was compromised (hence the need to change the password, or to proactively guard against this possibility).
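To put a rough number on the single-letter point above, here is a minimal sketch that enumerates every password differing from a leaked one in exactly one position. The alphabet and the sample password `Summer2023!` are assumptions made up for illustration, not anything from the leak.

```python
import string

# Assumed attacker alphabet: ASCII letters, digits, and punctuation.
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def single_char_variants(old_password: str) -> set[str]:
    """Return every password that differs from old_password in exactly one position."""
    variants = set()
    for i, original in enumerate(old_password):
        for ch in ALPHABET:
            if ch != original:
                variants.add(old_password[:i] + ch + old_password[i + 1:])
    return variants


# Hypothetical leaked password, used only as an example.
candidates = single_char_variants("Summer2023!")
print(len(candidates))
```

The candidate count grows only linearly with the password length (length times alphabet size minus one), so an attacker holding the old password has very little to try.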

      • I_Has_A_Hat@lemmy.ml
        English
        arrow-up
        39
        ·
        1 year ago

        I wish more websites allowed random words as passwords instead of forcing numbers and special characters (but not THAT special character, you have to use one of the ones on this list).

        People change their passwords by one letter or digit because they’re tied to these restrictive formats. If 5-6 random words were the norm, people would update more than just one character when needing to change passwords.

        “poison navy series ruler handshake papaya” is a fantastic password.

        “Ilovemygrandkids!123” is a horrible password.
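A random-word passphrase like the one above can be generated with Python's `secrets` module. This is only a sketch: the eight-word `WORDS` list is a placeholder invented for illustration, and a real generator would draw from a list of several thousand words.

```python
import secrets

# Placeholder word list (assumption); far too small for real use.
WORDS = ["poison", "navy", "series", "ruler", "handshake", "papaya", "orbit", "velvet"]


def passphrase(n_words: int = 6, sep: str = " ") -> str:
    """Pick n_words independently and uniformly at random, joined by sep."""
    return sep.join(secrets.choice(WORDS) for _ in range(n_words))


print(passphrase())
```

The strength comes from the words being chosen by a random source rather than by the user; a phrase the user makes up is much easier to guess.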

        • hatter@lemmy.world
          English
          arrow-up
          25
          ·
          1 year ago

          Just use a password manager and a unique, long, randomly generated password for every site. There’s no need or reason to know the password to anything other than your password manager and your primary email.

          • deft@ttrpg.network
            English
            arrow-up
            9
            arrow-down
            4
            ·
            1 year ago

            in like a decade the use of a password manager will be a bad idea. i don’t know how but it will be.

            • demlet@lemmy.world
              English
              arrow-up
              12
              arrow-down
              1
              ·
              1 year ago

              Hmm, a single point of access for every password you have? I don’t see the problem…

              • SleveMcDichael@programming.dev
                English
                arrow-up
                17
                ·
                edit-2
                1 year ago

                The thing is the average person either can’t or can’t be bothered to remember even a dozen actually secure passwords, so they fall back to a couple of simple derivations of a common password, meaning each and every site a user signs up on represents an additional single point of failure.

              • Chriskmee@lemm.ee
                English
                arrow-up
                10
                ·
                1 year ago

                Luckily, until we get actual quantum computing, it’s not worth the years on a supercomputer to crack a single stolen set of encrypted passwords.

        • stevedidWHAT@lemmy.world
          English
          arrow-up
          6
          ·
          1 year ago

          Agreed! I also think the next step would be getting rid of the need for users to even know their own password, replacing it with other safeguards like biometrics (with enough possible permutations to match or exceed passwords) plus a physical device, or something else entirely that removes the need to let the user in on what the exact password is.

        • JJROKCZ@lemmy.world
          English
          arrow-up
          2
          ·
          1 year ago

          Tools like Bitwarden let you customize the randomly generated passwords they make. You can tailor one to avoid certain characters for sites that don’t allow them. Each vault entry can be customized independently, so you don’t weaken all your passwords just because one site disallows _ or (. You can also have it generate passphrases like the example you gave.
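The per-site character exclusion described above can be sketched in a few lines of Python. This is not Bitwarden's implementation; `random_password` and its `excluded` parameter are hypothetical names used only to illustrate the idea.

```python
import secrets
import string


def random_password(length: int = 64, excluded: str = "") -> str:
    """Build a password from letters, digits, and punctuation, skipping excluded characters."""
    alphabet = [
        c
        for c in string.ascii_letters + string.digits + string.punctuation
        if c not in excluded
    ]
    if not alphabet:
        raise ValueError("every character was excluded")
    return "".join(secrets.choice(alphabet) for _ in range(length))


# Example: a site that rejects "_" and "(".
print(random_password(32, excluded="_("))
```

Dropping a couple of characters from a 94-character alphabet barely changes the strength of a long password, which is why tailoring one entry for a picky site costs almost nothing.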

      • lobut@lemmy.ca
        English
        arrow-up
        4
        ·
        1 year ago

        I use a heuristic to update my main passwords. It’s not a character but easily guessable if you see it in plaintext and now you’ve made me facepalm my actions.

        I only use that for certain things because I use Google Oauth or Bitwarden for most things and you’ve just woken me up about what could be exposed.

        • stevedidWHAT@lemmy.world
          English
          arrow-up
          2
          ·
          1 year ago

          The goal should usually be as random as possible; if it takes a series of steps to create, they can be traced backward.

          Now, the trick I’m not telling you is that randomness is hard to get: you need a sufficient amount of entropy (basically randomness or chaos; formally, how much uncertainty there is in the system) to make the password strong enough, which can be challenging sometimes. For example, if your password is only 3 characters long and has 10 possibilities for each spot in the string, there are only 10^3 = 1,000 possibilities to guess, which is nothing to PCs and people with time on their hands haha
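The arithmetic in the example above generalizes: a secret made of `length` independent, uniformly random picks from `alphabet_size` options has `alphabet_size ** length` possibilities, or `length * log2(alphabet_size)` bits of entropy. A minimal sketch follows; the function names are made up for illustration, and the formula only holds when every pick is truly random.

```python
import math


def search_space(alphabet_size: int, length: int) -> int:
    """Number of equally likely secrets an attacker would have to try."""
    return alphabet_size ** length


def entropy_bits(alphabet_size: int, length: int) -> float:
    """Entropy in bits of a uniformly random secret."""
    return length * math.log2(alphabet_size)


# The 3-character, 10-options-per-spot example from the comment above.
print(search_space(10, 3), round(entropy_bits(10, 3), 2))

# Assumed comparison: six words drawn from a 7,776-word list.
print(round(entropy_bits(7776, 6), 2))
```

The same functions cover both characters and words, since a passphrase is just a short string over a very large alphabet.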

      • qaz@lemmy.world
        English
        arrow-up
        1
        ·
        1 year ago

        That’s why I let Bitwarden generate a random 64 character password with special characters and numbers

      • redw04@lemmy.ca
        English
        arrow-up
        1
        ·
        1 year ago

        That’s why correcthorsebatterystaple is the best way to do passwords imo, just 4 random words with a random special character dividing them and a random number tacked onto the end. Good luck brute forcing that or using AI to guess 4 randomly generated words in the correct order.

  • chulo_sinhatche@lemmy.world
    English
    arrow-up
    55
    arrow-down
    2
    ·
    1 year ago

    Do the people that release these get paid somehow? Or do they just do it for hacker cred and say fuck these 2.6M people?

    • Dasnap@lemmy.world
      English
      arrow-up
      48
      ·
      1 year ago

      In January 2023, someone was selling the scraped data of 2.6 million DuoLingo users on the now-shutdown Breached hacking forum for $1,500.

      As first spotted by VX-Underground, the scraped 2.6 million user dataset was released yesterday on a new version of the Breached hacking forum for 8 site credits, worth only $2.13.

      “Today I have uploaded the Duolingo Scrape for you to download, thanks for reading and enjoy!,” reads a post on the hacking forum.

      • snorkbubs@fedia.io
        arrow-up
        20
        ·
        1 year ago

        This part is also, ummm, interesting…

        BleepingComputer has confirmed that this API is still openly available to anyone on the web, even after its abuse was reported to DuoLingo in January.

    • ChaoticNeutralCzech@feddit.de
      English
      arrow-up
      31
      ·
      1 year ago

      They’ll send fake emails where the green owl comes to collect “late fees” for your 216-day streak of missed Spanish lessons.

  • no banana@lemmy.world
    English
    arrow-up
    43
    arrow-down
    3
    ·
    1 year ago

    Damn, they’ll know I didn’t finish that Spanish lesson the bird bothered me about!

  • circuitfarmer@lemmy.sdf.org
    English
    arrow-up
    31
    ·
    edit-2
    1 year ago

    “Scraped” data suggests that it’s data available on public profile pages. However, the article also says the dump is a mix of public and non-public info. So which is it, scraped or not? It’s an important distinction, because data collection by scraping is technically not a breach.

  • SpicaNucifera@lemm.ee
    English
    arrow-up
    11
    ·
    1 year ago

    Oh no, not my German and Japanese scores!!!

    I guess the email could become a spam target?? Gmail does a good job sorting that for me.

    • AToM.exe@lemmy.world
      arrow-up
      1
      ·
      1 year ago

      I only see this comment, but it says 53 comments. I just want to know why they didn’t tell their userbase.

      • stopthatgirl7@kbin.socialOP
        arrow-up
        1
        ·
        1 year ago

        Lemmy and kbin have been having some federation issues lately, which might be why you’re only seeing one comment.

    • ansik@kbin.social
      arrow-up
      11
      ·
      edit-2
      1 year ago

      However, Duolingo did not address the fact that email addresses were also listed in the data, which is not public information.

      From the article, emphasis mine.

  • z4x15@lemmy.world
    English
    arrow-up
    9
    arrow-down
    1
    ·
    1 year ago

    I’m so glad I switched to duck email. Might as well change it again and block the old email.