Search API Connector Documentation
Import Reddit Data to Google Sheets
In this guide, we’ll walk through how to pull data from the Reddit API directly into Google Sheets, using the API Connector add-on for Sheets.
We’ll be accessing public data so we don’t need any API keys or other authorization. Let’s jump right in!
Contents
- Before You Begin
- Part 1: Create your API Request URL
- Part 2: Pull Reddit API Data into Sheets
- Part 3: Get More Reddit API Data
- Part 4: Handle Pagination
- Part 5: API Documentation and Resources
Before You Begin
Click here to install the API Connector add-on from the Google Marketplace.
Part 1: Create Your API Request URL
We’ll first access pull in top posts from a specific subreddit for the past year.
- API Base URL: https://www.reddit.com
- Endpoint: /r/Wallstreetbets/top.json
- Query String: ?limit=10&t=year
Putting it together, we get the full API Request URL:
https://www.reddit.com/r/Wallstreetbets/top.json?limit=10&t=year
Substitute in your subreddit of interest where it says Wallstreetbets.
Part 2: Pull Reddit API Data into Google Sheets
We can now enter those values into API Connector.
- Open up Google Sheets and click Extensions > API Connector > Open.
- In the Create screen, enter the Request URL we just created
- We don’t need OAuth authentication so just leave that set to None. We don’t need to enter any headers either, so just skip the Headers section.
- Create a new tab and click Set current to use that tab as your data destination.
- (Optional) The Reddit API sends back dozens or even hundreds of columns by default, including details about the awards received, links to thumbnails of different sizes, and various other minor data points. To filter these out and speed up your request, let’s add the following JMESPath expression. Just copy/paste this block into the Output options > JMESPath field:
data.children[].data.{subreddit_name_prefixed:subreddit_name_prefixed,permalink:permalink,url:url,title:title,selftext:selftext,upvote_ratio:upvote_ratio,total_awards_received:total_awards_received,num_comments:num_comments,score:score,created_utc:created_utc}
- Name your request and click Run. A minute or two later you’ll see data about the top 10 posts from the subreddit populate your Sheet:
Part 3: More Example API Requests
You can check the documentation for the full set of endpoints and parameters, but if you just want to check it out, try the following examples:
- Search a subreddit for a keyword (keep the same JMESPath shown above to avoid pulling in excessive columns)
https://www.reddit.com/r/Wallstreetbets/search.json?limit=10&q=amc
- Search all subreddit titles for a keyword
https://www.reddit.com/subreddits/search?q=robots
- Get a list of popular subreddits
https://www.reddit.com/subreddits/popular.json
Part 4: Handle Pagination
Note that the Reddit limits the number of records returned in each request. By default, only 100 records will be returned unless you use the limit
parameter, like this:
https://www.reddit.com/r/Wallstreetbets/search.json?limit=100&q=amc
The limit parameter has a maximum value of 100 per page, so if you still need to retrieve more records, you’ll need to paginate through as described in their documentation.
With API Connector, you can handle this automatically through pagination handling, like this:
- API URL: enter your request URL, including limit=100
- Pagination: cursor
- Next token parameter: after
- Next token path: data.after
- Run until: choose when to stop fetching data
Part 5: API Documentation and Resources
- Official API documentation: https://www.reddit.com/dev/api/
- Generally you can access any of the endpoints marked “rss support” without further authorization. Endpoints returning private data about your account will require an OAuth connection.
- Pushshift.io is an alternative to the official Reddit API. It provides additional functionality like the ability to search by date: https://pushshift.io/api-parameters/