

Yes, ~4 million CHF against (mostly big donations and organisations) vs. 400’000 CHF for (mostly small private donations).


This is the way. I also have rules for hits on URLs that should never be requested without a referer, with some threshold to account for a user hitting F5, plus a whitelist of real users (ones that got a 200 on a login endpoint). Mostly the Huawei and Tencent crawlers have fake user agents and no referer. Another thing crawlers don’t do is caching. A user would never download the same .js file hundreds of times in an hour; all their devices’ browsers would have cached it. There are quite a lot of these kinds of patterns that can be used to block bots. It just takes watching the logs a bit to spot them.
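A rough sketch of the no-referer rule in Python, assuming the access log has already been parsed into `(ip, path, referer)` tuples. The function name, the protected prefixes and the threshold are all made up for illustration, so tune them against your own logs:

```python
from collections import Counter

# Hypothetical values, not taken from any real setup.
PROTECTED_PREFIXES = ("/settings", "/api/")  # URLs a real user only reaches via a link
MAX_NO_REFERER_HITS = 3                      # leaves room for a few F5 reloads


def suspicious_ips(entries, known_users):
    """Return IPs that hit protected URLs without a referer too often.

    entries: iterable of (ip, path, referer) tuples from an access log.
    known_users: set of IPs that got a 200 on the login endpoint.
    """
    hits = Counter()
    for ip, path, referer in entries:
        if ip in known_users:
            continue  # whitelisted real user
        # nginx logs a missing referer as "-"
        if referer in ("", "-") and path.startswith(PROTECTED_PREFIXES):
            hits[ip] += 1
    return {ip for ip, n in hits.items() if n > MAX_NO_REFERER_HITS}
```

The same counting approach should work for the caching pattern: count downloads of the same static file per IP per hour and flag anything far above what a browser with a cache would do.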
Then there’s rate limiting and banning IPs that hit the rate limit regularly. Use nginx as a reverse proxy, set rate limits for URLs where it makes sense, with some burst set, and ban IPs that got rate-limited more than x times in the past y hours, based on the rate limit message in the nginx error.log. It might need some fine-tuning to get the thresholds right, but it can catch some very spammy bots. It doesn’t help with those that crawl from hundreds of IPs but only use each IP once an hour, though.
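Roughly what the "more than x times in the past y hours" check could look like as a standalone Python script. It assumes the usual `limiting requests, excess: ... by zone "...", client: ...` line that nginx's `limit_req` writes to error.log; the function name and the default thresholds are placeholders. In practice a fail2ban filter with `maxretry`/`findtime` does the same job:

```python
import re
from collections import Counter
from datetime import datetime, timedelta

# Matches the timestamp and client IP of a limit_req message in error.log.
LIMIT_RE = re.compile(
    r"^(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) \[\w+\] .*?"
    r"limiting requests, excess: [\d.]+ by zone \"[^\"]+\", "
    r"client: (?P<ip>[0-9a-fA-F.:]+)"
)


def ips_to_ban(log_lines, now, max_hits=10, window_hours=6):
    """Return IPs rate-limited more than max_hits times in the last window_hours."""
    cutoff = now - timedelta(hours=window_hours)
    counts = Counter()
    for line in log_lines:
        m = LIMIT_RE.search(line)
        if not m:
            continue  # some other error.log message
        ts = datetime.strptime(m.group("ts"), "%Y/%m/%d %H:%M:%S")
        if ts >= cutoff:
            counts[m.group("ip")] += 1
    return sorted(ip for ip, n in counts.items() if n > max_hits)
```

Usage would be something like `ips_to_ban(open("/var/log/nginx/error.log"), datetime.now())`, with the result fed into whatever does the actual banning.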
Ban based on the bot user agents, for those that set one. Sure, theoretically robots.txt should be the way to deal with that for well-behaved crawlers, but if it’s your homelab and you just don’t want any crawlers, you might as well block them in the firewall the first time you see them.
Download abuse IP lists nightly and ban those; that’s around 60k abusive IPs gone. At that point you probably need to use nftables directly, though, instead of iptables or going through ufw, for the sets, as having 60k individual rules would be a bad idea.
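A minimal sketch of turning a downloaded list into one nftables set update instead of thousands of rules. The table and set names (`inet filter`, `blocklist`) are assumptions: the set has to exist already, be of type `ipv4_addr` with `flags interval` if the list contains prefixes, and be referenced by a drop rule. Only IPv4 entries are handled here:

```python
import ipaddress


def nft_blocklist_script(raw_lines, table="inet filter", set_name="blocklist"):
    """Build nft commands that replace the contents of an existing set."""
    nets = []
    for raw in raw_lines:
        entry = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not entry:
            continue
        try:
            net = ipaddress.ip_network(entry, strict=False)
        except ValueError:
            continue  # skip anything that is not an address or prefix
        if net.version == 4:
            nets.append(net)
    # Merge overlapping and adjacent ranges so the set stays small.
    merged = list(ipaddress.collapse_addresses(nets))
    script = f"flush set {table} {set_name}\n"
    if merged:
        elements = ", ".join(
            str(n.network_address) if n.prefixlen == 32 else str(n) for n in merged
        )
        script += f"add element {table} {set_name} {{ {elements} }}\n"
    return script
```

Writing the output to a file and loading it with `nft -f` applies the flush and the add in one go.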
There are lists of all datacenter IP ranges out there, so you could block those as well, though that’s a pretty nuclear option, so better make sure the traffic you want is whitelisted. E.g. for Lemmy, you can get a list of the IPs of all other instances nightly, so you don’t accidentally block them. Lemmy traffic is very spammy…
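One way to keep wanted traffic out of such a blocklist is to cut the whitelisted addresses out of the ranges before loading them. A small sketch with Python's `ipaddress` module; the function name is made up, and where the datacenter ranges and instance IPs come from is up to you:

```python
import ipaddress


def carve_out(blocked, allowed):
    """Remove every allowed address or prefix from the blocked networks."""
    result = [ipaddress.ip_network(b) for b in blocked]
    for a in allowed:
        hole = ipaddress.ip_network(a)
        remaining = []
        for net in result:
            if net.version != hole.version or not net.overlaps(hole):
                remaining.append(net)  # untouched by this whitelist entry
            elif hole.subnet_of(net):
                # Split the blocked range into the pieces around the hole.
                remaining.extend(net.address_exclude(hole))
            # Otherwise the blocked range lies inside the allowed one: drop it.
        result = remaining
    # Sort by IP version first so IPv4 and IPv6 are never compared directly.
    return sorted(result, key=lambda n: (n.version, n))
```

An accept rule for the whitelist placed ahead of the drop rule would likely achieve the same thing in the firewall itself; carving the ranges just keeps everything in one set.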
There’s so much that can be done with fail2ban (f2b) and a bit of scripting/writing filters.


It’s pretty normal in Taiwan; everything is presented in a cute way. E.g. the signs against sexual harassment on public transport: https://www-ws.gov.taipei/001/Upload/405/relpic/18288/136717/69e3b002-46fe-4880-a801-3a88b9d5c5a2.jpg
In a perfect world, yes.
In reality, I knew what I did and why I did it two years ago, after which I never had to touch it again until now, and it takes me 2 hours of searching/fiddling until I remember that weird thing I did 2 years ago…
And it’s still totally worth it.
Oh, or e.g. random env vars in .profile that I’m sure were needed for nvidia on wayland at some point. No clue if they’re still necessary, but I won’t touch them unless something breaks. And half of them were probably not necessary to begin with, but trying all the different combinations is tedious…
The funniest one was the example of a rich guy leaving Germany because of inheritance tax being used to prove this.
Only that he left Germany for Switzerland, and there aren’t really any other countries around with no inheritance tax.