Digital bits, one zero
I’m !RoundSparrow@Lemmy.ml and RocketDerp on Github. I’m trying to test and identify Lemmy platform scalability and performance issues.
I think this is bullshit.
I think it is exactly how people are behaving. And I can even recall witnessing many people first hand who flip a newspaper to the sports section. Never learning anything about science news, medical news, unless it’s some kind of social column about a diet.
People wanting to cut out and block things they don’t want to read in a newspaper is what I consider the “default behavior” of most of humanity. No surprise they do not care about the news their friends share. An intelligent computer system that filters out (based on topic/content study) what they don’t want to see before-hand is always going to be popular with such people.
“One of the effects of living with electric information is that we live habitually in a state of information overload. There’s always more than you can cope with.” — Marshall McLuhan.
I almost never see someone link to a Discord past conversation on fixing an issue/problem solving. It’s a one-way black hole.
Since the 1990s I’ve known people who ride around in old VWs because they were cool. They are not overbuilt and rust is a real problem, but I don’t see people worry when 4 people are in the car. I see 6 people in the back of a pickup truck on a regular basis. People will do very risky things on wheeled transportation with high death rates. But when it comes to airplanes, they are hyper concerned about safety. I think people worry too much about the way they are going to die or the details of a person’s last week alive. It makes them ignore the risks in front of them, like car wrecks.
Do you think I’m opening a bug about Beehaw or something?
This is a technology topic. People discuss when Facebook and Reddit and Twitter have major outages.
I just posted a posting there that this needs testing, sounds like it is still buggy as of right this moment?
While yes, we should be able to delete our content if we want, but it’s a bit naive to think there could be true privacy in any decentralised social media platform.
Especially in email or “reddit” threaded conversation systems, where quoting of messages is routine. Here I am, quoting you.
You are putting a billboard up in public, on a bulletin board in the center of the Internet, the assumption should be that anyone can photograph it.
Given the beta status of Lemmy, I don’t even think it’s a great idea to give the appearance of privacy. I think the core purpose of a webapp like Lemmy is public messages.
I think it’s a can of worms for server operators to get into the business of thinking they can safely hold private messages between users/strangers. None of the Lemmy instances I’ve joined have had a “terms of service” or anything like that on Sign Up. I really think the message should be sent far and wide that Lemmy is about posting IN PUBLIC and that messages are being FEDERATED to peers; even people that you don’t know could be collecting the data for a search engine.
With small-time server operators opening up hundreds of Lemmy instances, without giving away their experience or human identity, how can you have any confidence that someone is properly securing a server they only have part-time hours to update and operate? Major corporations are having their databases stolen: Valve, Sony, Nintendo, health care companies, mobile network companies (AT&T)… you think a low-budget shoestring server by a hobbyist running Lemmy should be held to the same standards as a corporation that has an entire team and services to defend their data?
For example on this comment page there are 9 domains trying to connect directly to me according to ublock origin.
ublock origin isn’t a firewall. They aren’t connecting inbound to your system, you are loading content from those servers.
There is a new SvelteKit javascript project for a lemmy-ui alternative: https://lemmy.world/comment/258368
Obviously there are plenty of bugs and QoL features that could dramatically improve the usage of Lemmy
Federation is not reliably delivering comments and other Lemmy content between servers. People need to be looking for such problems; so far there isn’t any tool to observe or track this problem.
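A rough idea of what such a check could look like, as a minimal sketch only. It assumes you have already collected the set of comment identifiers from the origin server and from a subscribed server by some means, and that the identifiers are globally unique across instances. The function names and the `example.invalid` URLs are made up for illustration and are not part of Lemmy.

```python
def missing_comments(origin_ids, replica_ids):
    """Return identifiers present on the origin but absent on the replica."""
    return sorted(set(origin_ids) - set(replica_ids))


def delivery_ratio(origin_ids, replica_ids):
    """Fraction of the origin's comments that reached the replica."""
    origin = set(origin_ids)
    if not origin:
        return 1.0  # nothing to deliver, so nothing was lost
    return len(origin & set(replica_ids)) / len(origin)


# Hypothetical data: four comments on the origin, three arrived on the replica.
origin = [
    "https://origin.example.invalid/comment/1",
    "https://origin.example.invalid/comment/2",
    "https://origin.example.invalid/comment/3",
    "https://origin.example.invalid/comment/4",
]
replica = [
    "https://origin.example.invalid/comment/1",
    "https://origin.example.invalid/comment/2",
    "https://origin.example.invalid/comment/4",
]

missing = missing_comments(origin, replica)
ratio = delivery_ratio(origin, replica)
print(missing, ratio)
```

Run on a schedule against a handful of busy postings, something like this would at least show whether delivery is getting better or worse over time.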
One more thing I forgot to mention. The nginx 500 errors people are getting on multiple Lemmy sites could improve shortly with the release of 0.18 that stops using websockets. Right now Lemmy webapp is passing those through nginx for every web browser client.
something like Apache Kafka
Not that I see. A database like PostgreSQL can work, but you have to be really careful how new data flows into the database, as writing to the database involves record locking and invalidates the cache for output.
Or changing to something that can be scaled, like cockroach db or neondb?
Taking the bulk data, comments and postings, outside PostgreSQL would help. Especially since what most people are reading on a Reddit-like website is content from the last 48 hours… and your caching potential dies way down as people move on to the newer content.
The comments alone are the primary problem. There are a lot of them on each posting and they are bulky data. Also comments are unique data.
I doubt it is anything that level. The problem is the data itself, in the database.
A reddit-like website is like email, every load from the database has unique content. You really have to be very careful when designing for scalability when almost all the data is unique.
As opposed to a site like Amazon where the listing for a toothbrush is not unique on every page load. There aren’t new comments and new votes altering the toothbrush listing every time a user refreshes the page. And people aren’t switching brands of toothbrush every 24 hours like the front page of Reddit abandons old data and starts with fresh data.
The problems I see with Lemmy performance all point to SQL being poorly optimized. In particular, federation is doing database inserts of new content from other servers - and many servers can be incoming at the same time with their new postings, comments, votes. Priority is not given to interactive webapp/API users.
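To illustrate what giving priority to interactive users could mean, here is a minimal sketch, not how Lemmy schedules work today. It assumes all database work passes through one in-process priority queue; the names `INTERACTIVE`, `FEDERATION`, `submit`, and `next_job` are hypothetical.

```python
import heapq
import itertools

INTERACTIVE, FEDERATION = 0, 1   # lower number is served first
_counter = itertools.count()     # tie-breaker keeps FIFO order within a class


def submit(heap, priority, job):
    """Add a job; the counter keeps tuples comparable and ordered by arrival."""
    heapq.heappush(heap, (priority, next(_counter), job))


def next_job(heap):
    """Pop the most urgent job: interactive first, then federation."""
    return heapq.heappop(heap)[2]


heap = []
submit(heap, FEDERATION, "insert remote comment")
submit(heap, FEDERATION, "insert remote vote")
submit(heap, INTERACTIVE, "load front page")

order = [next_job(heap) for _ in range(3)]
print(order)
```

The page load is served before the two federation inserts even though it arrived last. A real system would also need some guard so federation work is never starved completely.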
Using a SQL database for a backend of a website with unique data all over the place is very tricky. You really have to program the app to avoid touching the database and create queues and such when you can. Reddit (at least 9 years ago when they open sourced it) is also based on PostgreSQL - and you will see they do not do live inserts into comments like Lemmy does - they queue them using something other than the main database then insert them in batch.
Email MTA apps I’ve seen do the same thing: they queue files to disk before putting them into the main database.
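The queue-then-batch idea can be sketched like this. It is only an illustration under stated assumptions: `sqlite3` stands in for PostgreSQL so it runs anywhere, an in-memory `queue.Queue` stands in for whatever durable queue a real site would use, and the `comment` table and the `drain_batch` and `flush` helpers are hypothetical, not Lemmy or Reddit code.

```python
import queue
import sqlite3


def drain_batch(pending, max_items):
    """Pull up to max_items queued comments without blocking."""
    batch = []
    while len(batch) < max_items:
        try:
            batch.append(pending.get_nowait())
        except queue.Empty:
            break
    return batch


def flush(conn, pending, max_items=500):
    """Write one batch in a single transaction; return the rows written."""
    batch = drain_batch(pending, max_items)
    if batch:
        with conn:  # one commit for the whole batch instead of one per comment
            conn.executemany(
                "INSERT INTO comment (post_id, body) VALUES (?, ?)", batch
            )
    return len(batch)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE comment (id INTEGER PRIMARY KEY, post_id INTEGER, body TEXT)"
)

# Incoming comments land in the queue, not in the database.
pending = queue.Queue()
for i in range(1200):
    pending.put((i % 10, f"comment {i}"))

# A background worker would call flush() on a timer; here we loop until empty.
written = []
while True:
    n = flush(conn, pending)
    if n == 0:
        break
    written.append(n)

total = conn.execute("SELECT COUNT(*) FROM comment").fetchone()[0]
print(written, total)
```

The database sees three transactions instead of 1200, which is the point: fewer lock acquisitions and fewer cache invalidations for the same data.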
I don’t think nginx is the problem, the bottleneck is the backend of the backend, PostgreSQL doing all that I/O and record locking.
There was a posting I saved about some people saying they were going to code on front-end: https://lemmy.ml/post/1199330
Does anyone know what’s going on with Lemmy.ml?
Serious scaling problems with the database in Lemmy. The code was not really tested and tuned for the quantity of federation peers to replicate with, comments, votes, postings. A lot of big communities over there to replicate.
I’m seeing pending on all my remote Join to communities hosted there.
Community: snoocalypse
The constant crashes of Lemmy from performance issues have really been hard on me, because I just don’t like seeing it happen to people. It’s honestly been the worst web site in terms of stability I’ve used in over a decade.
Lots of good comments here on this discussion.