• OwOarchist@pawb.social
    English
    arrow-up
    22
    ·
    2 days ago

    Why wouldn’t they? You don’t even have to be logged in to view them.

    You should never assume anything you post publicly online is at all private or hidden from any search engine/AI.

    • Rhoeri@piefed.social
      English
      arrow-up
      12
      arrow-down
      3
      ·
      2 days ago

      Could you imagine someone legitimately looking some shit up and having trash from lemmy.ml be the result?

      The world isn’t really ready for that level of misinformation.

    • Chamomile 🐑@furry.engineer
      arrow-up
      7
      ·
      2 days ago

      @OwOarchist @Rhoeri Unlike AI crawlers, search engines generally respect robots.txt and noindex tags. robots.txt tells a crawler which paths it should not fetch, and a noindex tag tells it not to surface a page in search results. This is how fediverse profiles that have opted out of internet search indexes do so.

      You should still assume that anything you post in public with no auth required is public, of course.
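
A minimal sketch of the two opt-out signals mentioned above, using only Python's standard library. The robots.txt rules, the `ExampleBot` user agent, the `example.social` URLs, and the helper names are all made up for illustration; this is roughly what a well-behaved crawler checks, not how any particular search engine implements it.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that hides user profiles from every crawler.
ROBOTS_TXT = """\
User-agent: *
Disallow: /u/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


class NoindexFinder(HTMLParser):
    """Records whether a page carries <meta name="robots" content="...noindex...">."""

    def __init__(self) -> None:
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        content = (attr.get("content") or "").lower()
        if name == "robots" and "noindex" in content:
            self.noindex = True


def has_noindex(html: str) -> bool:
    """Return True if the HTML asks search engines not to index the page."""
    finder = NoindexFinder()
    finder.feed(html)
    return finder.noindex


if __name__ == "__main__":
    print(is_allowed(ROBOTS_TXT, "ExampleBot", "https://example.social/u/alice"))
    print(is_allowed(ROBOTS_TXT, "ExampleBot", "https://example.social/post/1"))
    print(has_noindex('<head><meta name="robots" content="noindex"></head>'))
```

Both signals are advisory: nothing stops a crawler from skipping these checks, which is the point about AI crawlers above.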

      • cron@feddit.org
        arrow-up
        3
        ·
        2 days ago

        Does robots.txt really work in the fediverse? At least on Lemmy, the same content can be retrieved from different hosts, each of which has its own robots.txt file, unless the opt-out is somehow “baked” into the protocol.
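
To make the concern concrete, here is a small sketch assuming two hypothetical instances that both serve a copy of the same federated post but publish different robots.txt files. The host names, rules, path, and `ExampleBot` user agent are invented, and the shared path is a simplification (a real federated copy may live at a different local URL on each instance).

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, keyed by the instance that serves them.
ROBOTS_BY_HOST = {
    "instance-a.example": "User-agent: *\nDisallow: /\n",  # blocks all crawling
    "instance-b.example": "User-agent: *\nDisallow:\n",    # allows everything
}


def hosts_allowing(path: str, user_agent: str = "ExampleBot") -> list[str]:
    """Return the hosts whose robots.txt lets user_agent fetch path."""
    allowed = []
    for host, robots_txt in ROBOTS_BY_HOST.items():
        parser = RobotFileParser()
        parser.parse(robots_txt.splitlines())
        if parser.can_fetch(user_agent, f"https://{host}{path}"):
            allowed.append(host)
    return allowed


if __name__ == "__main__":
    # The copy on instance-b.example stays crawlable even though
    # instance-a.example opted out.
    print(hosts_allowing("/post/123"))
```

In this toy setup the post stays reachable through the second host, so the opt-out only covers the instance that published it.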

        • pkjqpg1h@lemmy.zip
          English
          arrow-up
          2
          ·
          13 hours ago

          Major search engines respect robots.txt, but as you said, each instance sets its own rules and some of them allow crawlers, so robots.txt is not a scalable way to opt out.