• author: Matthew Berman

The Meltdown of Reddit and the Future of the Open Internet

Last week's meltdown on Reddit was not just another controversial incident on the platform; it was an important sign of the beginning of the end of the open and free internet that we know today. The reason for this shift? Artificial intelligence, or more specifically, large language models that require vast amounts of high-quality, structured data to be trained. The more unique and high-quality the data, the better the model will be. And with the proliferation of these models, data has become increasingly valuable.

Reddit's management recently decided to start charging for its API, which had previously been free, causing third-party developers to stop building amazing applications on top of its data. For instance, one of the biggest applications, Apollo, which is a better alternative to Reddit's native app, had to shut down due to the new pricing. The developer of Apollo claims that the app would require 20 million dollars per year to continue running.

Twitter implemented the same measure a few months ago, charging for its API. As more and more companies that own proprietary data follow suit, websites with unique data sets like Reddit, Stack Overflow, and Twitter are shutting down their APIs and becoming highly siloed.

The Value of Data

Large language models are ingesting data and becoming hypervaluable very quickly, which is why companies like OpenAI, co-founded and run by Sam Altman, are scrambling to get their hands on better and better data, especially unique data. OpenAI sees Reddit's data set as the ultimate competitive advantage, which is why Reddit might already be sharing its data with OpenAI.

Data has always been valuable, but now we're talking about exponential increases in its value due to large language models. Most large language models are built on the same corpus of open-source data sets, making unique data sets even more valuable.

The Future of the Internet

Over the past 15 years, companies began to open their APIs and allow third-party developers to build on top of their data sets, resulting in the rise of incredible ecosystems that users got a ton of value from. But now, as large language models gain more attention and ingesting data becomes an even more competitive market, websites with unique data sets are choosing to shut down their APIs and become highly siloed.

This is a pivotal moment for the internet as we know it. The open and free internet that we've enjoyed may be closed down as access to data becomes more valuable and companies become less incentivized to share their data with third-party developers who build amazing functionality on top of it.

What Should We Do?

This moment presents an opportunity for entrepreneurs to come along and build business models that reward users for their content contributions. Users should own their data completely, but these companies are not incentivized to legally allow that. As soon as users post something on Reddit or Twitter, these companies own the data, which is disadvantageous to users. Platforms like YouTube share a percentage of their ad revenue with creators, but many companies still don't follow suit, which makes the time ripe for another platform to disrupt Reddit and other companies that are ingesting data from their users without giving them a cut of the value it generates.

Reddit's blackout was a great move, but the chance of it making a significant difference is slim. Having a thriving open-source ecosystem to compete against closed-source artificial intelligence companies is critical. The future of the internet is in the hands of the users who create the data that these companies use.

Let's hope for significant change and user benefit in the end, because the transformation of the open and free internet we know today could lead to a more controlled and potentially dangerous future.

