what bugs me the most is the thought that if lemmy became as relevant as reddit, servers built to spam the whole threadiverse would be spun up nonstop

does anyone have some resources on solutions for spam in stuff like email? i’d like to check whether having a few giant gatekeepers like microsoft and google is inevitable

  • poVoqMA
    2 months ago

    There is a project working on shared & real-time updated block-lists (and a chain of trust model) that originated from within the Lemmy community: https://fediseer.com/

    It was set up pre-emptively last year when automated bot accounts were exploding, but somehow that never turned into much spam. The infrastructure is there now, though, and is basically waiting for a real-world test case.

    There are also several auto-moderation bot frameworks, such as Threativore, which we have running here on SLRPNK and which is planned to be integrated with Fediseer as well.

    XMPP also has a similar concept with https://xmppbl.org/ and IRC has had a similar system for spam IPs for a long time.

    • ex_06OP
      2 months ago

      Thank you! I knew about fediseer but wanted to know about more battle-tested and technical solutions (if there are any), like, idk, some kind of fail2ban but not so low level

      Didn’t know about threativore, nice

      Still weird we don’t yet have an automod built directly into lemmy lol

      (And I still have a negative post count because of the deleted ones :D) (and they still keep some cache, as if users should be able to restore their stuff; they shouldn’t)

  • Admiral Patrick@dubvee.org
    2 months ago

    (Just thinking out loud here) I wonder if it’s possible to pull posts via the API, format them into a text file in email format, run them through SpamAssassin, and then use the results to trigger an automod action?

    You’d need to curate a good list of ham/spam posts to train it on, but the built-in heuristics may catch some low-hanging fruit out of the box.

    If anyone has any thoughts or has tried this, lemme know. I’ve been kicking around the idea for a few months but haven’t gone any further than that.
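    A rough sketch of that pipeline, untested against a live instance. Assumptions: the instance exposes `GET /api/v3/post/list` returning `{"posts": [...]}` with `post` and `creator` objects, a SpamAssassin daemon is running so `spamc -c` can print `score/threshold`, and the function names plus the `margin` tiering are made up for illustration.

```python
import json
import subprocess
from email.message import EmailMessage
from urllib.parse import urlencode, urlparse
from urllib.request import urlopen


def fetch_new_posts(instance: str, limit: int = 20) -> list:
    """Pull the newest posts from a Lemmy instance (assumed v3 HTTP API)."""
    query = urlencode({"sort": "New", "limit": limit})
    with urlopen(f"{instance}/api/v3/post/list?{query}", timeout=10) as resp:
        return json.load(resp)["posts"]


def post_to_email(post_view: dict) -> EmailMessage:
    """Wrap one post in an email-shaped message so a mail filter can read it."""
    post = post_view["post"]
    creator = post_view["creator"]
    # Assumed fields: post.id/name/body/url and creator.name/actor_id.
    host = urlparse(creator["actor_id"]).hostname or "unknown.invalid"
    msg = EmailMessage()
    msg["From"] = f"{creator['name']}@{host}"
    # Collapse whitespace: header values must not contain newlines.
    msg["Subject"] = " ".join(post["name"].split())
    msg["Message-ID"] = f"<post-{post['id']}@{host}>"
    body = post.get("body") or ""
    if post.get("url"):
        body += f"\n\n{post['url']}"
    msg.set_content(body)
    return msg


def parse_score(output: str) -> tuple[float, float]:
    """Parse the 'score/threshold' line that `spamc -c` prints."""
    score, threshold = output.strip().split("/")
    return float(score), float(threshold)


def score_with_spamc(msg: EmailMessage) -> tuple[float, float]:
    """Pipe the message to spamc; requires spamd to be running."""
    # No check=True: in -c mode a non-zero exit status signals spam.
    proc = subprocess.run(["spamc", "-c"], input=msg.as_bytes(),
                          capture_output=True)
    return parse_score(proc.stdout.decode())


def decide(score: float, threshold: float, margin: float = 3.0) -> str:
    """Map a score to a hypothetical automod action."""
    if threshold <= 0:
        return "ignore"  # spamc reports 0/0 when it could not score the message
    if score >= threshold + margin:
        return "remove"
    if score >= threshold:
        return "report"  # borderline: leave it to a human moderator
    return "ignore"
```

    Wiring it together would be something like `decide(*score_with_spamc(post_to_email(pv)))` for each `pv` in `fetch_new_posts(...)`. Actually removing or reporting the post is left out, since that needs an authenticated mod account.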

    • ex_06OP
      2 months ago

      honestly i don’t like automated systems based on the content of posts rather than on their number

      it’s too easy for them to become biased against non-native speakers, and occasional promotion is healthy in a community

      p.s. i’m also thinking out loud; the web is too empty of actual discussions like the one we’re having right now, and not everything has to be published only when it’s finished :D (even though commercial social media has the opposite problem of too much noise lol)
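      For comparison, a count-based check in the fail2ban spirit could be as small as the sketch below. Everything here is hypothetical (the class name, the limits, the idea of keying on an account string); it only looks at how often an account posts, never at what it says, and it assumes timestamps arrive in non-decreasing order.

```python
from collections import defaultdict, deque


class PostRateTracker:
    """Flag accounts that post more than max_posts times within a sliding window."""

    def __init__(self, max_posts: int = 5, window_seconds: float = 600.0):
        self.max_posts = max_posts
        self.window_seconds = window_seconds
        self._seen = defaultdict(deque)  # account -> recent post timestamps

    def record(self, account: str, timestamp: float) -> bool:
        """Record one post; return True if the account is now over the limit."""
        recent = self._seen[account]
        recent.append(timestamp)
        # Drop timestamps that have fallen out of the window.
        while recent and timestamp - recent[0] > self.window_seconds:
            recent.popleft()
        return len(recent) > self.max_posts
```

      A flagged account could then be held for review instead of banned outright, so a burst of legitimate posts costs a delay rather than a ban.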

      • Admiral Patrick@dubvee.org
        2 months ago

        If you haven’t seen walls of pill spam from, typically, Kbin, then thank your instance admins. Picking those out in email is not quite a solved problem, but is routine and accurate enough that I only see that kind of spam once or twice a year. That’s the content I’m targeting.

        Plus, I estimate about 10-15% of the people I work with (and/or interact with professionally) are non-native English speakers. I never have to dig their emails out of spam.