sooo... has anyone scraped #reddit yet?
yeah. OpenAI.
oh... right :/
Wasn't GPT-3 based partially on a reddit scrape? (because if a site gets five upvotes on reddit that's Verified Reasonable And Probably True)
Yes. Pushshift has an archive of June 2005 through March 2023 (they stopped because they were one of the first casualties of Reddit's new policies).

https://old.reddit.com/r/DataHoarder/comments/1479c7b/historic_reddit_archives_ongoing_archival_effort/

thanks! As grateful as I am for the openai answers, this is actually what I was asking for :)