Import Wikipedia Data to Google Sheets

In this guide, we’ll show how to pull data from Wikipedia directly into Google Sheets, using the API Connector add-on for Sheets.

PART 1: CREATE YOUR API REQUEST URL

We’ll first follow the Wikimedia REST API documentation to pull in information about a random Wikipedia article, then show how to extend this to other API requests.

  • Base URL: https://en.wikipedia.org/api/rest_v1
  • Endpoint: /page/random/summary

Putting it together, we get the full API Request URL:

https://en.wikipedia.org/api/rest_v1/page/random/summary
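
For readers who want to see the same composition outside of Sheets, here is a minimal Python sketch. The helper name `build_request_url` is hypothetical (it is not part of API Connector or the Wikimedia API); it only shows that the Request URL is the base URL joined to the endpoint path.

```python
BASE_URL = "https://en.wikipedia.org/api/rest_v1"
ENDPOINT = "/page/random/summary"


def build_request_url(base_url: str, endpoint: str) -> str:
    """Join a base URL and an endpoint path with exactly one slash between them.

    Hypothetical helper for illustration only.
    """
    return base_url.rstrip("/") + "/" + endpoint.lstrip("/")


request_url = build_request_url(BASE_URL, ENDPOINT)
print(request_url)
```

Trimming the slashes on both sides means the result is the same whether or not the base URL ends with `/`.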

PART 2: PULL WIKIPEDIA API DATA INTO SHEETS

We can now enter our URL into API Connector and import Wikipedia data into Google Sheets.

  1. Open up Google Sheets and click Add-ons > API Connector > Open.
  2. In the Create screen, enter the Request URL we just created.
  3. Under Headers, enter a key-value pair like this:
    Key: User-Agent
    Value: YOUR_CONTACT_INFO

    While this request will work without a header, Wikipedia’s global API rules ask that you set a unique User-Agent that lets them contact you quickly. You may use an email address or the URL of a contact page.

  4. Leave authorization set to None as we don’t need extra authorization here. Create a new tab and click ‘Set current’ to use that tab as your data destination.
  5. Name your request and click Run. A moment later you’ll see information about a random article populate your Google Sheet. If you’d like more random articles, switch to Append mode and hit Run several times.
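
If it helps to see what the steps above amount to, here is a rough Python equivalent using only the standard library. It is a sketch, not how API Connector works internally: `build_request`, `flatten`, and `fetch_rows` are hypothetical names, and the dotted column names are just one possible way to turn nested JSON into a single row. `YOUR_CONTACT_INFO` is a placeholder you would replace with your own email address or contact URL.

```python
import json
import urllib.request

REQUEST_URL = "https://en.wikipedia.org/api/rest_v1/page/random/summary"
CONTACT_INFO = "YOUR_CONTACT_INFO"  # placeholder: your email or contact-page URL


def build_request(url: str, contact_info: str) -> urllib.request.Request:
    """Create a GET request that carries the User-Agent header (no authorization)."""
    return urllib.request.Request(url, headers={"User-Agent": contact_info})


def flatten(obj: dict, prefix: str = "") -> dict:
    """Flatten nested JSON objects into one row keyed by dotted column names."""
    row = {}
    for key, value in obj.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            row.update(flatten(value, name))
        else:
            row[name] = value
    return row


def fetch_rows(url: str, contact_info: str, runs: int = 1) -> list:
    """Send the request `runs` times and collect one row per response,
    similar in spirit to pressing Run several times in Append mode."""
    rows = []
    for _ in range(runs):
        request = build_request(url, contact_info)
        with urllib.request.urlopen(request, timeout=30) as response:
            rows.append(flatten(json.load(response)))
    return rows
```

Calling `fetch_rows(REQUEST_URL, CONTACT_INFO, runs=3)` sends three requests and returns three rows; each row should include keys such as `title` and `extract` from the summary response.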

PART 3: MORE EXAMPLE API URLS

Experiment with the endpoints described in the Wikimedia REST API documentation, as well as the pageview analytics documentation, to see other types of Wikipedia responses. If you just want to get started, you can try out the following requests, one at a time:

  • Get metadata and an abstract about a specific page:
    https://en.wikipedia.org/api/rest_v1/page/summary/Google_Sheets
  • List of births on a specific day:
    https://en.wikipedia.org/api/rest_v1/feed/onthisday/births/06/02
  • Pageviews per day in April 2020 for the Wikipedia article on Tiger King:
    https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/en.wikipedia/all-access/all-agents/Tiger_King/daily/20200401/20200430
  • Top 1000 articles on April 10, 2020 (Tip: switch to the Compact report style to avoid timing out):
    https://wikimedia.org/api/rest_v1/metrics/pageviews/top/en.wikipedia/all-access/2020/04/10
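
The pageview URLs above follow a fixed pattern of path segments, which the sketch below spells out in Python. The function names `per_article_url` and `top_url` are hypothetical, and the default values are simply the ones that appear in the example URLs; check the pageview analytics documentation for the other values each segment accepts.

```python
from urllib.parse import quote

PAGEVIEWS_BASE = "https://wikimedia.org/api/rest_v1/metrics/pageviews"


def per_article_url(article: str, start: str, end: str,
                    project: str = "en.wikipedia",
                    access: str = "all-access",
                    agent: str = "all-agents",
                    granularity: str = "daily") -> str:
    """Build a per-article pageviews URL; start and end are YYYYMMDD strings."""
    # Wikipedia titles use underscores for spaces; escape any other special characters.
    title = quote(article.replace(" ", "_"), safe="")
    return (f"{PAGEVIEWS_BASE}/per-article/{project}/{access}/{agent}/"
            f"{title}/{granularity}/{start}/{end}")


def top_url(year: int, month: int, day: int,
            project: str = "en.wikipedia",
            access: str = "all-access") -> str:
    """Build the URL for the most-viewed articles on a single day."""
    return f"{PAGEVIEWS_BASE}/top/{project}/{access}/{year:04d}/{month:02d}/{day:02d}"


print(per_article_url("Tiger King", "20200401", "20200430"))
print(top_url(2020, 4, 10))
```

The two printed URLs match the Tiger King and top-articles examples in the list above.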

The article Create API Request Based on a Cell describes how you can point to a cell to dynamically change the date or endpoint in the URL, which can be very convenient for constructing requests.
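
The idea of driving a URL from a cell can be sketched in Python as well. Here plain variables stand in for cell values, and `summary_url` and `births_url` are hypothetical helper names; this only illustrates the concept and does not use API Connector's own cell-reference syntax.

```python
from datetime import date
from urllib.parse import quote

SUMMARY_BASE = "https://en.wikipedia.org/api/rest_v1/page/summary"
BIRTHS_BASE = "https://en.wikipedia.org/api/rest_v1/feed/onthisday/births"


def summary_url(title_cell: str) -> str:
    """Turn a page title, as it might be typed in a cell, into a summary URL."""
    title = quote(title_cell.strip().replace(" ", "_"), safe="")
    return f"{SUMMARY_BASE}/{title}"


def births_url(date_cell: date) -> str:
    """Turn a date value into an 'on this day' births URL (only month and day are used)."""
    return f"{BIRTHS_BASE}/{date_cell:%m}/{date_cell:%d}"


# Stand-ins for values read from cells.
title_cells = ["Google Sheets", "Tiger King"]
for cell in title_cells:
    print(summary_url(cell))

# The year is arbitrary here; the endpoint takes month and day only.
print(births_url(date(2020, 6, 2)))
```

Changing a value in `title_cells`, or the date passed to `births_url`, changes the resulting URL without touching the rest of the request.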
