Archiving RoosterTeeth.com to my own local Plex server

This is a repost of what I wrote on the r/morningsomewhere subreddit on November 15th, 2024.


In episode 2024.11.11: Package Stimulation, Burnie talked about archiving user manuals. Listening to that, I was thinking of how I have a shared iCloud Drive with my housemate holding the user manuals for any equipment we buy.

But it also made me think of something I archived earlier in the year.

When the news broke that Rooster Teeth was closing down, I wanted to save some of the shows that were special to me. I don't know what flicked in my head, but I soon changed to wanting to get everything, so I went out and purchased a NAS and a bunch of HDDs (4x 22TB HDDs to be exact). I had never undertaken something like this before, so on the storage side there was a bunch of stuff I had to learn, like what ZFS is and why I should or shouldn't use it for this purpose.

I had already played with the RT API in the past (I was making an unofficial LG TV app for funsies that didn't get far), so I used that knowledge and started working on a tool called RTArchiver which would be able to log in, read episode data, parse it all, and then start downloading shows. This was all in .NET so that I could run it in Docker containers on various PCs without much trouble.

Because things were time sensitive, I had a second internet connection added to my house (the standard 1Gbps that I already had + a new 250Mbps connection). At first I was in a rush, so I started pulling in videos and just dumping them with their ID, and what I ended up with was a folder of files named things like "12345678.mp4". As time went on I started looking at organising the data. At the time cover art was also saved elsewhere, in locations with no relation to the final videos.

I had come across the Archive of Pimps and looked at how they were organising their content. I ended up sorting things into channels/shows/episodes by their slug name. It was human readable, but most of all I could almost guarantee it was always valid characters for files and folders. Then came the next bit: how would I watch it all? I figured mounting a network drive and opening VLC would be boring.
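The channel/show/episode layout described above can be sketched roughly like this. This is a minimal Python illustration, not the RTArchiver code (which was .NET). The manifest shape, the field names "channel", "show" and "episode", and the helper names are all assumptions made up for the example.

```python
from pathlib import PurePosixPath


def episode_path(root, channel_slug, show_slug, episode_slug, ext=".mp4"):
    """Build root/channel/show/episode.ext from slug names."""
    return PurePosixPath(root) / channel_slug / show_slug / f"{episode_slug}{ext}"


def plan_moves(id_named_files, manifest, root):
    """Map flat ID-named files (e.g. "12345678.mp4") to slug-based paths.

    `manifest` is a hypothetical dict keyed by video ID whose values hold the
    channel, show and episode slugs. Files with no manifest entry are skipped
    so they can be looked at by hand later.
    """
    moves = {}
    for name in id_named_files:
        source = PurePosixPath(name)
        entry = manifest.get(source.stem)
        if entry is None:
            continue
        target = episode_path(
            root, entry["channel"], entry["show"], entry["episode"], source.suffix
        )
        moves[name] = str(target)
    return moves
```

Returning a plan instead of moving files straight away makes it possible to review the mapping before touching terabytes of video.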

I had a Plex server already, so I wondered if I could ingest the RT content into it. Playing with some things, it became clear very quickly that none of the RT site is listed on the external sites Plex pulls its metadata from. Looking for a solution to this, I happened to come across the Plex Local Metadata Files plugin/agent. It would allow me to place custom .info files around my hard drive with data that Plex would be able to associate with a given channel/show/episode. So I made a parser which would dump the manifest data into .info files. When placed in the right location, these should allow Plex to read the data, and after several days of fighting with it, eventually everything just worked.

Now images were in the right location, videos were in the right location, info files were in the right location. And just like that I had a copy of the Rooster Teeth site (sans community, sans a few shows that were deleted, sans a few thumbnails that no longer load, etc) that I can stream from my own mini giant local Plex server.

While all of this was happening I also made another tool called the RTPodcastArchiver which would download all of the old RT podcasts from their respective RSS feeds, rename and reorder the files, and upload them to the Internet Archive. They should be accessible from here, here. Internet Archive has a player so you can just hit play right on the site. Any podcasts that continued after the final shutdown were not archived as it felt like it was crossing into piracy territory.
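The "read the feed, rename and reorder" step can be sketched as below. This is a minimal Python example using only the standard library, not the actual RTPodcastArchiver. It assumes a plain RSS 2.0 feed where each item has a title, a pubDate and an enclosure URL, and the zero-padded numbering scheme is invented for illustration.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime


def ordered_episodes(rss_xml):
    """Return (numbered_name, audio_url) pairs, oldest episode first."""
    root = ET.fromstring(rss_xml)
    items = []
    for item in root.iter("item"):
        title = item.findtext("title", default="").strip()
        pub_date = item.findtext("pubDate")
        enclosure = item.find("enclosure")
        # Skip entries that have no date or no audio file attached.
        if pub_date is None or enclosure is None:
            continue
        items.append((parsedate_to_datetime(pub_date), title, enclosure.get("url")))
    # Feeds usually list newest first, so sort by publication date.
    items.sort(key=lambda entry: entry[0])
    return [
        (f"{number:04d} - {title}", url)
        for number, (_, title, url) in enumerate(items, start=1)
    ]
```

Prefixing each name with a padded number keeps the files in release order wherever they are listed alphabetically.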

I learned a whole bunch doing this. It was very interesting to take a publicly accessible API and reverse engineer it into something useful with zero documentation. There were also so many pain points I came across; for example, special characters in filenames will break things. 21 years of content and so much of it is organised differently, so trying to keep it sorted on my side was difficult. There are so many little things that tell a story that is just not seen by us. Like how amazingly organised and kept in the same structure Jack's content is, and how often Geoff's wasn't. I had also come across this previously, but I am sure somewhere someone has a story of why RvB Season 4 (Complete edition maybe?) only had a 720p upload while the others are 1080p. And I know it's on a lot of people's minds, but I am sorry to report that no, there were no 4K copies of Day5 hidden in the public data.
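On the special-characters pain point, one common way to deal with it when a title (rather than a slug) has to become a filename is to replace anything a filesystem might reject. The sketch below is a Python illustration and not what RTArchiver does. The set of characters treated as unsafe is an assumption based on the characters Windows disallows in filenames.

```python
import re
import unicodedata

# Characters Windows rejects in filenames, plus ASCII control characters.
_UNSAFE = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def safe_filename(title, replacement="_"):
    """Turn an arbitrary title into a name that is safe to use as a file name."""
    name = unicodedata.normalize("NFC", title)
    name = _UNSAFE.sub(replacement, name)
    # Collapse runs of whitespace and drop leading/trailing spaces and dots.
    name = re.sub(r"\s+", " ", name).strip(" .")
    return name or "untitled"
```

For example, safe_filename("Red vs. Blue: Season 4?") gives "Red vs. Blue_ Season 4_".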

I won't have this data forever. These disks will someday die. And when they do I will be sad. Until then I can continue to browse this collection and find shows that I never knew existed and enjoy them for the first time like it's 2012 all over again.