SourceHut continues to face disruptions due to aggressive LLM crawlers. We are continuously working to deploy mitigations. We have deployed a number of mitigations which are keeping the problem contained for now. However, some of our mitigations may impact end-users.

  • Roguelazer@lemmy.world
    9 days ago

    The companies that run these residential proxy networks are sketchy as shit and in a better world would be criminally prosecuted. They’re tricking random low-information users into installing VPNs and other software with backdoors that turn them into a veritable botnet.

  • Treczoks@lemmy.world
    9 days ago

    I wonder how much of the load problems I observe with lemmy.world are due to AI crawlers.

    • mesa@lemmy.world
      9 days ago

      I had the same issue. OpenAI was just slamming my tiny little server, ignoring robots.txt. I had to install an LLM black hole and put very basic password protection around my git server frontend, since it kept getting slammed by the crawler.

      As much as I don't like Google, I did see them come in, look at robots.txt, and then make no other calls for a week. That's how it should work.
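
For anyone curious what "how it should work" looks like from the crawler's side, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt contents, the `example.org` URLs, and the `allowed` helper are assumptions made up for illustration, not anyone's actual config. A well-behaved crawler runs a check like this before every request and skips anything disallowed.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a small git server (illustrative only):
# block one named crawler entirely, keep everyone else out of /git/.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /git/
"""


def allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent, url in [
        ("GPTBot", "https://example.org/"),
        ("Googlebot", "https://example.org/git/repo"),
        ("Googlebot", "https://example.org/about"),
    ]:
        print(agent, url, allowed(agent, url))
```

Of course, robots.txt is only a request. It does nothing against a crawler that ignores it, which is why server-side measures like the password protection above end up being necessary.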