• OwOarchist@pawb.social
      English
      arrow-up
      22
      ·
      2 days ago

      Why wouldn’t they? You don’t even have to be logged in to view them.

      You should never assume anything you post publicly online is at all private or hidden from any search engine/AI.

      • Rhoeri@piefed.social
        English
        arrow-up
        12
        arrow-down
        3
        ·
        2 days ago

        Could you imagine someone legitimately looking some shit up and having trash from lemmy.ml be the result?

        The world isn’t really ready for that level of misinformation.

      • Chamomile 🐑@furry.engineer
        arrow-up
        7
        ·
        2 days ago

        @OwOarchist @Rhoeri Unlike AI crawlers, search engines generally respect robots.txt and noindex tags, which will tell them not to index or surface those pages in search results. This is how fediverse profiles which have chosen to opt out of internet search indexes do so.

        You should still assume things you post in public with no auth required are public, of course.
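As a rough illustration of the two opt-out signals mentioned above, here is a minimal sketch of what a well-behaved crawler might check before indexing a page. The robots.txt rules, the bot name `ExampleBot`, the host `example.social`, and the helper `may_index` are all made up for the example; real instances and real crawlers will differ.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an instance that hides user profiles.
ROBOTS_TXT = """\
User-agent: *
Disallow: /u/
"""


class RobotsMetaParser(HTMLParser):
    """Records whether a <meta name="robots"> tag contains "noindex"."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() != "robots":
            return
        content = attributes.get("content") or ""
        directives = [d.strip().lower() for d in content.split(",")]
        if "noindex" in directives:
            self.noindex = True


def may_index(robots_txt, user_agent, url, html):
    """Return True only if robots.txt allows the fetch and the page has no noindex tag."""
    robots = RobotFileParser()
    robots.parse(robots_txt.splitlines())
    if not robots.can_fetch(user_agent, url):
        return False  # the crawler should not even request this URL
    meta = RobotsMetaParser()
    meta.feed(html)
    return not meta.noindex
```

Neither signal is enforced by anything; a crawler that ignores them can still fetch the page, which is why the "assume it's public" advice still holds.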

        • cron@feddit.org
          arrow-up
          3
          ·
          2 days ago

          Does robots.txt really work in the fediverse? At least on Lemmy, the same content can be retrieved from different hosts, each of which has its own robots.txt file, unless this is somehow “baked” into the protocol.
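To make the concern concrete, here is a small sketch, assuming two invented instances that both serve a copy of the same federated post but publish different robots.txt rules. The hostnames, paths, rules, bot name, and the helper `crawlable_copies` are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical copies of one federated post: (host, local path, that host's robots.txt).
MIRRORS = [
    ("instance-a.example", "/post/101", "User-agent: *\nDisallow: /\n"),
    ("instance-b.example", "/post/9042", "User-agent: *\nAllow: /\n"),
]


def crawlable_copies(mirrors, user_agent="ExampleBot"):
    """Return the hosts whose robots.txt lets user_agent fetch their copy of the post."""
    allowed_hosts = []
    for host, path, robots_txt in mirrors:
        robots = RobotFileParser()
        robots.parse(robots_txt.splitlines())
        if robots.can_fetch(user_agent, f"https://{host}{path}"):
            allowed_hosts.append(host)
    return allowed_hosts
```

In this setup the post stays reachable through `instance-b.example` even though `instance-a.example` blocks all crawlers, because each robots.txt only covers its own host.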

          • pkjqpg1h@lemmy.zip
            English
            arrow-up
            2
            ·
            7 hours ago

            Major search engines respect robots.txt, but as you said, the same content is served by other instances, and some of those allow crawlers, so relying on each instance’s robots.txt doesn’t scale.