cross-posted from: https://feddit.de/post/5294605
Youtube, for so many years, was just too good. Yes, they changed the 5 star rating system to likes and dislikes and a few years later disabled dislikes altogether, but their algorithm mostly digs up interesting content and it just works for creators and viewers.
This might change soon. Their new strategy to disallow ad-blockers will frustrate a certain kind of viewer. Those who dislike surveillance and like open-source tech, those who use uBlock Origin and know why.
Just as Mastodon suddenly became popular a few years ago when Twitter had its first big fuckup, maybe PeerTube is next. It certainly is the most polished decentralized solution that doesn’t use a blockchain. Creators or fans could easily host their own videos, and fans could watch them without ads.
PeerTube will not replace youtube. it cannot compete in either scale or creator compensation.
i don’t think people realize just how insane your infrastructure has to be to handle 30,000 hours of video being uploaded every hour.
Taking some simple napkin math, I have a 1min 1080p video downloaded from YT. It clocks in at 15MB.
So, Gamer’s Nexus has 2.6k videos. (That’s insane, btw, but fairly large channel, not even LTT size though).
Assuming just 1080p, and let’s say about 10min average per video (some are less, some are 40+), that’s 150MB per 10min video, which means 390,000MB (or 380.86GB) for their collection. Even if I’m wrong and the average GN video is only half that, 5 minutes, that’s still 190GB. And that isn’t counting 4k, or the multiple other formats to optimize streaming (720, 480, 360, misc bitrates, etc).
And that’s just storage, not even taking into account compute! (Or egress, or transcriptions, or scaling, or…)
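For anyone who wants to rerun the napkin math with their own numbers, here is a minimal sketch. The inputs are just the rough figures from this comment (15 MB per minute of 1080p, 2.6k videos, 10 or 5 minutes average); the function names are made up for illustration, and 1 GB is taken as 1024 MB because that is what gives the 380.86 figure.

```python
# Napkin math only: every input is a rough figure from the comment, not measured data.
MB_PER_MINUTE_1080P = 15   # one 1-minute 1080p download came in at about 15 MB
VIDEO_COUNT = 2600         # "2.6k videos"


def library_size_mb(video_count, avg_minutes, mb_per_minute=MB_PER_MINUTE_1080P):
    """Total size in MB for a single 1080p copy of every video."""
    return video_count * avg_minutes * mb_per_minute


def mb_to_gb(megabytes):
    """Convert MB to GB using 1 GB = 1024 MB."""
    return megabytes / 1024


ten_min_total_mb = library_size_mb(VIDEO_COUNT, 10)
five_min_total_mb = library_size_mb(VIDEO_COUNT, 5)

print(f"10 min average: {ten_min_total_mb} MB = {mb_to_gb(ten_min_total_mb):.2f} GB")
print(f" 5 min average: {five_min_total_mb} MB = {mb_to_gb(five_min_total_mb):.2f} GB")
```

Extra renditions (4k, 720, 480, 360) would each add their own term on top of this; they are left out because the comment gives no per-minute sizes for them.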
Really for something like Peertube to take off it will require each channel to spin up their own instance, which honestly is just another expense for them, one that Youtube does for them for free, plus Youtube offers to pay them. Which, would cut down on some of the chaff (only people who want to do it would do it), but yeah, I don’t think it’s going to replace YT at any point. Smaller channels can combine for sure, but there is definitely a threshold where it becomes extremely costly.
I’m all for the fediverse, but video streaming is freaking expensive. There’s definitely a reason YouTube has a monopoly on it. This isn’t meant to discourage, it’s more for anyone who may be thinking "yeah, why doesn’t PeerTube just replace it?"
Your overall point is fair, but your math here is off by a factor of 1000 - it would be around 380 GB.
Oh damn, forgot GB. So stupid, good catch. Fixing it
It could be done if peertube used a scheme like BitTorrent. We are approaching a time where enough users have sufficient upstream bandwidth for video.
But then, even without hosting costs, creating videos takes much more time and effort than writing a short text.
Peertube does allow downloading from peers like bittorrent. But you still need to host the whole video, it only would alleviate data transfer. And I don’t think you’d want to not host the video and rely entirely on people sharing your video and continuing to seed it for it to be available. So for running a channel or sharing videos that you have produced you will still need to host the files somewhere.
This is something PeerTube already does. Viewers of a video act as peers, and other PeerTube servers can also act as peers for each other’s videos.
Bandwidth isn’t the biggest issue. Storage is. The videos need to be stored somewhere and storage is expensive.
We need something like Siacoin, that’s easy to use and easy to donate or sell cheap storage.
Nice idea, but then everytime a video that contains anything licensed by the content mafia is uploaded (even partly), the user in question breaks that license opening themselves up to lawsuits.
In a perfect world where only properly free content is shared, that model would work. But that is not what most content shared on YouTube looks like.
not a word.
A long time ago I read a paper how to mitigate this. Without remembering the details, the idea was: 1. One peer never holds a complete file, only parts of it. 2. You need a key to find all parts of the file and get them in the right order. So Disney can only accuse you of having an incomplete and unusable part of their movie.
But copyrighted material is only one issue. Do you want your hardware to be used for distributing depictions of sexual abuse, or inciting hatred and violence? Any YouTube replacement will need strong moderation tools.
The concept is that this will only happen if you have watched that video depicting sexual abuse, because your PeerTube client (the website) won’t download videos you didn’t watch.
I was thinking of a hypothetical system where peers provide storage for creators independently of what they are watching (in response to the ‘videos take too much storage for individuals to host’ comment). For PeerTube, you are right.
That is essentially how BitTorrent works anyway. In Germany people lost in court over this. Portions of a copyrighted file are also a problem. If they can “prove” that they got a relevant portion from you (more than the typical fair use seconds), you are still on the hook.
‘Landgericht Hamburg’ proofing will be hard, admittedly. But doesn’t BT just split up a file in x parts, so each part is watchable? What if you sliced differently, like every 100th byte of a file? Or even bitwise slicing? Not one 600 s snippet but 60000 10 ms snippets from throughout the movie.
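To make the "every Nth byte" idea concrete, here is a minimal sketch of that kind of interleaved slicing. It only illustrates the byte layout being described; it is not how BitTorrent or PeerTube split files, the function names are hypothetical, and it says nothing about whether a court would treat such a slice differently.

```python
def stripe(data, peers):
    """Split data so that slice i holds bytes i, i + peers, i + 2 * peers, ..."""
    if peers < 1:
        raise ValueError("peers must be at least 1")
    return [data[i::peers] for i in range(peers)]


def unstripe(slices):
    """Rebuild the original bytes from all slices, given in their original order."""
    peers = len(slices)
    total = sum(len(s) for s in slices)
    out = bytearray(total)
    for i, piece in enumerate(slices):
        # Put each slice back into the positions it was taken from.
        out[i::peers] = piece
    return bytes(out)


movie = bytes(range(10))      # stand-in for a real file
parts = stripe(movie, 3)
print(parts)                  # no slice contains two neighbouring bytes
print(unstripe(parts) == movie)
```

With 100 peers, each slice would hold every 100th byte, so no single slice contains a contiguous run of the file. All slices, in the right order, are needed to rebuild it.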
That could help, but if a file is not shared that much (yet) or not many people are online at the moment, a single peer will still share many more parts, likely ending up with having shared significant amounts.
You are vastly overestimating the amount of storage you need since you are looking at some download which itself has to choose the encoding (which is independent of whatever youtube does: youtube absolutely crushes the quality).
Most estimates assume that youtube has 1 exabyte of storage. Let’s say we buy this in bulk from retail (which we wouldn’t do: you wait as long as possible since storage prices are going down, and retail stores would give you the finger if you ordered an exabyte’s worth).
Let’s take that number and run with it:
Buying retail, you can get Seagate Exos X20 20TB drives for 280€. 1 exabyte is 1 million terabytes, meaning we have 1_000_000/20 * 280 = 14 million € (you’d need machines to put those into, but you also wouldn’t buy the entire thing upfront or pay retail prices).
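The same calculation as a small sketch, in case someone wants to plug in other drive sizes or prices. The 20 TB and 280 € figures are the ones quoted above; the function name is made up, decimal units (1 EB = 1,000,000 TB) are assumed, and it counts raw drives only, with no redundancy, servers, or power.

```python
import math

EXABYTE_IN_TB = 1_000_000   # decimal units: 1 EB = 1,000,000 TB
DRIVE_TB = 20               # capacity per drive, as quoted above
DRIVE_PRICE_EUR = 280       # retail price per drive, as quoted above


def raw_drive_cost(capacity_tb, drive_tb=DRIVE_TB, price_eur=DRIVE_PRICE_EUR):
    """Return (number of drives, total price in EUR) for raw capacity only."""
    drives = math.ceil(capacity_tb / drive_tb)
    return drives, drives * price_eur


drive_count, total_eur = raw_drive_cost(EXABYTE_IN_TB)
print(f"{drive_count} drives, {total_eur} EUR")
```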
Compute also isn’t that big of a deal if you do it correctly: the expensive part in video hosting is usually video encoding since to get small video sizes you need to spend compute beforehand to compress it.
However, you can shift a significant part of this to the user by implementing the transcoding in WASM and running it client-side (see e.g. https://www.w3.org/2021/03/media-production-workshop/talks/qiang-fu-video-transcoding.html). In that case users would compress locally in the browser before uploading (this presumably wouldn’t even take longer than normal uploads for most people, since you trade off transcoding time against upload time).
There are still other compute expenses but those are much more limited.
These mechanisms don’t (at least to my knowledge) exist in peertube yet, but would be possible.
The actually expensive part is always the actual networking: Networking is one of the few things that actually get more expensive at scale due to the complexity explosion, rather than cheaper (e.g. having dedicated transcoding hardware drops in price per user since you have higher utilization).
Networking quickly runs into bottlenecks where you have to account for all the covariances between datasets in the network.
Basically, to increase the amount of e.g. storage available, everything in the network needs to be scaled up (from the local machines’ connections, through the cables and switches, up to the routers and outgoing connections): because you increase the density at one point, you have to upgrade the network everywhere.
That’s why networking dwarfs everything: you just get crushed by networking being the bottleneck between your increasingly dense devices.
The clue behind peertube is that this is not as extreme of an effect due to
The latter is the important part: instead of network cost rising (super)linearly with the number of users, it rises linearly with the number of simultaneous unique videos.
This is a much smaller number which means you do not need to compete in that space, which is the dominant cost factor. (if you have a method where one user can retain the video and share it without actively watching that same video, you can probably get real-world sublinear scaling)
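A toy model of that scaling claim, under a deliberately idealized assumption: with peer sharing, the origin server sends each video that is currently being watched exactly once, and viewers pass it on among themselves. Real swarms will not offload that perfectly, so treat this as the best case; the function name and the sample numbers are made up.

```python
def origin_streams(viewers_per_video, peer_assisted):
    """Count how many full streams the origin server has to send.

    viewers_per_video maps a video id to its number of concurrent viewers.
    Without peer assistance, every viewer is served by the origin.
    With (idealized) peer assistance, each video being watched is sent once.
    """
    if peer_assisted:
        return sum(1 for count in viewers_per_video.values() if count > 0)
    return sum(viewers_per_video.values())


# Hypothetical snapshot of concurrent viewers on one instance.
snapshot = {"video-a": 1000, "video-b": 50, "video-c": 1, "video-d": 0}

print(origin_streams(snapshot, peer_assisted=False))  # grows with viewers
print(origin_streams(snapshot, peer_assisted=True))   # grows with unique videos
```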
Mind you, the costs involved here are still large, but not insurmountably large, especially considering there is not one unique organisation that would have to pay for the entire thing and it’s not an upfront expense. Fundamentally though, the system is built such that it won’t be crushed as users flood into the network.
That’s why it needs to be an international project. Paid by every country together. Sure some will initially have to pay more but sooner or later everyone wants to be part of it and pay their part.
Yeah, storage and bandwidth are massive considerations and there’s no way Peertube can handle it. And each channel running their own instance actually makes it worse, since you’re going to have smaller entities who can’t take advantage of deals that larger companies can make for hardware, data centers, bandwidth, etc. Plus, if you’re having to run your own instance to have a channel, then you’re not just focusing on creating videos for the channel, now you’re also a system architect, sysadmin, etc. It makes it a massive barrier to entry, and one that only tech enthusiasts will even consider tackling.
But even say that happens: a bunch of people running their own instances for their channel. Where are they hosting it? Are they purchasing their own hardware? Running their own data centers? They’re most certainly not running it out of their home. The overhead for that kind of operation is massive. What you’ll end up with is a bunch of people running their instances on AWS or some other PaaS provider. And then you’re right back to the problem you’re trying to solve with a distributed service: that the service is consolidated on one platform (even if it doesn’t appear that way to the end user). Sure, AWS et al aren’t dictating the terms of service for your Peertube instance, but the instance is dependent on that platform.
On top of all that, you have the issue of monetization. How are you going to make money from your channel? Peertube doesn’t have the kind of infrastructure of advertising etc. that YT has.
You also have another massive issue: legal. YT spent over a decade going through the courts with the MPAA, RIAA, et al fighting about copyright issues. Google has massive amounts of money and was able to weather that fight. But its competitors didn’t. Which is why you don’t have Vimeo stars, for example.
Running a YT channel is a massive time, energy, and money sink. Add all of these other considerations to it, and it’s an impossible task. It’s hard to think someone could see PT as a viable alternative. Google destroyed all of the competition (or let attrition do it for them), and pulled the ladder up behind them.
I didn’t even think about the personal risk, which I do know because I run a lemmy instance. You hit the nail on the head, I either see it as:
I love the fediverse, but I was a professional in the video world too, and video is heavy. Everything about it is crazy, take all the scaling problems and quadruple them. I hope peertube can find something that works
380 GB in storage for multiple years of contents is really not much. I archive that amount every 2 months.
The real problem is serving all that content to the viewers, and the first bottleneck is usually the upload bandwidth.
I think the more interesting number would be how much data it takes to upload an average-sized video to every one of its viewers.
Using your example of a 15 MB video, serving that to 300,000 viewers means uploading roughly 4.5 TB of data, plus some protocol overhead (TCP/IP and HTTP headers and such). For every (average) video! Now that’s a lot!
Fortunately PeerTube helps with that: viewers will automatically upload their downloaded chunks of the video to the others currently viewing it, so in the end the server needs somewhat less bandwidth usage.
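The arithmetic above as a small sketch, using the 15 MB and 300,000 viewer figures from this comment. The `peer_share` parameter is an assumption added for illustration: it stands for the fraction of traffic that viewers serve to each other, which nobody in this thread has measured. Decimal units (1 TB = 1,000,000 MB) are used and protocol overhead is ignored.

```python
def origin_upload_tb(video_mb, viewers, peer_share=0.0):
    """Data the server itself uploads, in TB, for one video.

    peer_share is a hypothetical fraction (0.0 to 1.0) of the traffic that
    viewers serve to each other instead of fetching it from the server.
    """
    if not 0.0 <= peer_share <= 1.0:
        raise ValueError("peer_share must be between 0.0 and 1.0")
    total_mb = video_mb * viewers
    return total_mb * (1.0 - peer_share) / 1_000_000


server_only_tb = origin_upload_tb(15, 300_000)
half_offloaded_tb = origin_upload_tb(15, 300_000, peer_share=0.5)

print(f"server serves everything: {server_only_tb} TB")
print(f"peers cover half:         {half_offloaded_tb} TB")
```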
Other than that, it would be the perfect place where channels could team up to host shared instances for themselves, or every channel their own one but with redundancy set up, so that their friend channels could also chip in with the bandwidth when needed.
Who said it needs to compete in scale as a single entity? PeerTube was never planned to be run by a single large provider
to be run
Oh yes, sorry, and also not complete but compete
For the scale that is needed it will inevitably be a handful of hosts at best.
I didn’t say it would. Mastodon looked vastly different when it had its first wave of users. Peertube will look very different in the future as well.