Import Wikipedia Data to Google Sheets
- Before You Begin
- Part 1: Create your API Request URL
- Part 2: Pull Wikipedia API Data into Sheets
- Part 3: More Example API URLs
BEFORE YOU BEGIN
Install the API Connector add-on from the Google Marketplace.
PART 1: CREATE YOUR API REQUEST URL
We’ll first follow the Wikimedia REST API documentation to pull in information about a random Wikipedia article, then show how to extend this to other API requests.
- Base URL: https://en.wikipedia.org/api/rest_v1
- Endpoint: /page/random/summary
Putting it together, we get the full API Request URL: https://en.wikipedia.org/api/rest_v1/page/random/summary
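The same concatenation can be sketched in Python, which is handy for checking a URL before pasting it into API Connector. The helper name `build_request_url` is purely illustrative and is not part of API Connector or the Wikimedia API.

```python
# Base URL and endpoint exactly as listed in Part 1.
BASE_URL = "https://en.wikipedia.org/api/rest_v1"
ENDPOINT = "/page/random/summary"


def build_request_url(base_url: str, endpoint: str) -> str:
    """Join a base URL and an endpoint with exactly one slash between them."""
    return base_url.rstrip("/") + "/" + endpoint.lstrip("/")


request_url = build_request_url(BASE_URL, ENDPOINT)
print(request_url)
```

Stripping the slashes on both sides before joining avoids the doubled or missing slash that is easy to introduce when assembling URLs by hand.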
PART 2: PULL WIKIPEDIA API DATA INTO SHEETS
Now let’s enter our URL into API Connector and import Wikipedia data into Google Sheets.
- Open up Google Sheets and click Add-ons > API Connector > Open.
- In the Create screen, enter the Request URL we just created.
- Under Headers, enter a key-value pair with User-Agent as the key and your contact information as the value.
While this request will work without a header, Wikipedia’s global API rules ask that you set a unique User-Agent that lets them contact you quickly. You may use an email address or the URL of a contact page as the value.
- Leave authorization set to None as we don’t need extra authorization here. Create a new tab and click ‘Set current’ to use that tab as your data destination.
- Name your request and click Run. A moment later you’ll see information about a random article populate your Google Sheet. If you’d like more random articles, switch to Append mode, and hit Run several times.
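For readers who want to see what the request looks like outside of Sheets, here is a minimal Python sketch of the same steps using only the standard library. It is not how API Connector works internally. The `USER_AGENT` value is a placeholder, the function names are hypothetical, and the `title` and `extract` fields are assumed to be present in the summary response.

```python
import json
import urllib.request

REQUEST_URL = "https://en.wikipedia.org/api/rest_v1/page/random/summary"
# Placeholder contact value (an assumption); replace it with your own details.
USER_AGENT = "my-sheets-import (contact@example.com)"


def build_request(url, user_agent):
    """Create a GET request that carries the User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_json(request):
    """Send the request and decode the JSON body of the response."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


def fetch_random_summaries(count):
    """Rough analogue of Append mode: run the request several times."""
    request = build_request(REQUEST_URL, USER_AGENT)
    return [fetch_json(request) for _ in range(count)]


def main():
    # Print one line per random article, similar to one row per run in Sheets.
    for summary in fetch_random_summaries(3):
        print(summary.get("title"), "-", summary.get("extract", "")[:80])
```

Calling `main()` sends three requests and prints one line per article, which roughly mirrors hitting Run three times in Append mode.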
PART 3: MORE EXAMPLE API URLS
Experiment with the endpoints described in the Wikimedia REST API documentation, as well as the documentation on pageview analytics, to see other types of Wikipedia responses. If you just want to get started, you can try out the following requests, one at a time:
- Get metadata and an abstract about a specific page:
- List of births on a specific day:
- Pageviews per day in April 2020 for the Wikipedia article on Tiger King:
- Top 1000 articles in March 2020 (Tip: switch to the Compact report style to avoid timing out):
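As a rough guide to what URLs for the four requests above might look like, the following Python sketch builds one per request. The path patterns reflect one reading of the Wikimedia REST and pageview API documentation and are assumptions rather than the original article's examples, so verify them against the documentation before relying on them. All function names are hypothetical.

```python
from urllib.parse import quote

WIKIPEDIA_API = "https://en.wikipedia.org/api/rest_v1"
WIKIMEDIA_API = "https://wikimedia.org/api/rest_v1"


def _article(title):
    """Use underscores for spaces, then percent-encode the rest of the title."""
    return quote(title.replace(" ", "_"), safe="")


def page_summary_url(title):
    """Metadata and an abstract for a specific page."""
    return f"{WIKIPEDIA_API}/page/summary/{_article(title)}"


def births_on_day_url(month, day):
    """Births on a given month and day (zero-padded in the path)."""
    return f"{WIKIPEDIA_API}/feed/onthisday/births/{month:02d}/{day:02d}"


def daily_pageviews_url(title, start, end, project="en.wikipedia"):
    """Daily pageviews for one article; start and end are YYYYMMDD strings."""
    return (
        f"{WIKIMEDIA_API}/metrics/pageviews/per-article/{project}"
        f"/all-access/all-agents/{_article(title)}/daily/{start}/{end}"
    )


def top_articles_url(year, month, project="en.wikipedia"):
    """Most-viewed articles for a whole month."""
    return (
        f"{WIKIMEDIA_API}/metrics/pageviews/top/{project}"
        f"/all-access/{year}/{month:02d}/all-days"
    )


print(page_summary_url("Tiger King"))
print(births_on_day_url(4, 1))
print(daily_pageviews_url("Tiger King", "20200401", "20200430"))
print(top_articles_url(2020, 3))
```

Keeping the date and title as function parameters makes it easy to generate a batch of URLs for different days or articles and paste them into API Connector one at a time.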
The article Create API Request Based on a Cell describes how you can point to a cell to dynamically change the date or endpoint in the URL, which can be very convenient for constructing requests.