Skip to content
wiki

Resource URIs

Use wiki as a database/sql-style driver so a host program can address Wikipedia pages and categories as wikipedia:// URIs.

wiki is a command line, but the wiki Go package is also a small driver that makes Wikipedia addressable as a resource URI. A host program registers it the way a program registers a database driver with database/sql, then dereferences wikipedia:// URIs without knowing anything about the MediaWiki API.

The host that does this today is ant, a single binary that puts one URI namespace over a family of site tools. The examples below use ant; any program that links the package gets the same behaviour.

Mounting the driver

A host enables the driver with one blank import, exactly like import _ "github.com/lib/pq":

import _ "github.com/tamnd/wikipedia-cli/wiki"

The package's init registers a domain with the scheme wikipedia (alias wiki) for the hosts en.wikipedia.org and wikipedia.org. The standalone wiki binary does not use any of this, so the CLI is unchanged.

Addressing pages and categories

A URI is scheme://authority/id. The driver serves two record types:

URI What it is
wikipedia://page/<Title> an article summary (id is the title)
wikipedia://category/<Name> a category page (id is the bare name, no Category: prefix)
ant get wikipedia://page/Alan_Turing         # the article summary record
ant cat wikipedia://page/Alan_Turing         # just the extract text (the body)
ant url wikipedia://page/Alan_Turing         # the live https URL
ant get wikipedia://category/Computability    # a category's metadata

A title with spaces works either way; Alan_Turing and Alan Turing name the same page, and a pasted URL resolves too:

ant get "wikipedia://page/Alan Turing"
ant resolve https://en.wikipedia.org/wiki/Alan_Turing   # -> wikipedia://page/Alan%20Turing

Walking the graph

ls lists the members of a collection, and every member is itself an addressable page URI, so a host can follow the graph:

ant ls wikipedia://page/Alan_Turing           # the articles this page links to
ant ls wikipedia://category/Computability      # the articles in the category
ant export wikipedia://page/Alan_Turing --follow 1 --to ./corpus

Page links are article-namespace links; category members are the articles in the category (not its subcategories or files). Both come back as page summaries keyed by title.

One wiki

The driver speaks the default English Wikipedia, where a bare title is unambiguous. To read another language or project, or to query Wikidata, use the wiki binary and its -l/--lang, --project, and --site flags as described in Multiple wikis. The driver and the binary share the same on-disk cache, so a page fetched one way is warm for the other.