Getting Wikipedia IDs in MQL


The Freebase <a href="http://download.freebase.com/wex/" rel="nofollow">WEX dumps</a> include a <a href="http://wiki.freebase.com/wiki/WEX/Documentation#freebase_wpid" rel="nofollow">freebase_wpid</a> table whose wpid column corresponds to the page_id from the source MediaWiki database. This table maps Wikipedia numeric article/redirect IDs to Freebase GUIDs (Global Unique IDs).

Using guids as foreign keys has been deprecated in favor of mids for <a href="http://wiki.freebase.com/wiki/Mid" rel="nofollow">lots of good reasons</a>, but guids are still used at the system level, so I'm going to call mid an accessor from here on.

The mid accessor is flexible in MQL: one can query with "mid": null or with "mid": [], depending on whether one needs the current mid or every mid.
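To make the difference concrete, here is a minimal sketch that builds both query shapes as Python structures and serializes them to JSON. The helper name `build_mid_query` is hypothetical and not part of any Freebase client; the sketch only constructs the query text and does not send it anywhere.

```python
import json


def build_mid_query(all_mids=False):
    """Return an MQL query asking for one mid (null) or every mid ([])."""
    # In MQL, null asks for a single value and [] asks for a list of values.
    mid_slot = [] if all_mids else None
    return [{"mid": mid_slot}]


# Serialize both forms; None becomes JSON null.
current_mid_query = json.dumps(build_mid_query())
every_mid_query = json.dumps(build_mid_query(all_mids=True))
print(current_mid_query)
print(every_mid_query)
```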

Finding a list of wpid values per mid is straightforward in MQL:

[{ "mid": null, "key": [{ "namespace": "/wikipedia/en_id", "value": null }] }]

But if all is well in the universe, each current mid should have only one current wpid, so is there a way to do something like "wpid": null, as one can with the mid accessor?


If you only want one wpid value per mid, you could do something like this:

[{ "mid": null, "key": { "namespace": "/wikipedia/en_id", "value": null, "limit": 1 } }]

<a href="http://tinyurl.com/6ct72sb" rel="nofollow">Try it out</a>

Bear in mind that it is entirely possible for a Freebase topic to have more than one wpid. This happens whenever we need to merge duplicate topics that we've imported from Wikipedia, or if we import them before they get merged in Wikipedia.
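If you need to know which topics carry more than one wpid, a rough sketch like the following could post-process the result of the list-valued form of the query. The function name `wpids_by_mid` and the sample data are made up for illustration; the sample only mimics the shape of an MQL result and does not contain real Freebase ids.

```python
def wpids_by_mid(result):
    """Map each mid to the list of /wikipedia/en_id key values in an MQL result."""
    mapping = {}
    for topic in result:
        keys = topic.get("key") or []
        # A single-object "key" clause yields a dict; a list clause yields a list.
        if isinstance(keys, dict):
            keys = [keys]
        mapping[topic["mid"]] = [
            k["value"] for k in keys if k.get("namespace") == "/wikipedia/en_id"
        ]
    return mapping


# Illustrative sample only: these mids and wpids are invented.
sample = [
    {"mid": "/m/0aaa", "key": [
        {"namespace": "/wikipedia/en_id", "value": "111"},
        {"namespace": "/wikipedia/en_id", "value": "222"},
    ]},
    {"mid": "/m/0bbb", "key": {"namespace": "/wikipedia/en_id", "value": "333"}},
]

# Keep only the topics that have more than one wpid (e.g. merged duplicates).
merged_topics = {m: ids for m, ids in wpids_by_mid(sample).items() if len(ids) > 1}
print(merged_topics)
```

With "limit": 1 you simply get whichever key comes back first, so a check like this is only useful when you query without the limit.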

If you're looking for links to Wikipedia pages, you might also be interested in the /wikipedia/en_title namespace:

[{ "mid": null, "key": { "namespace": "/wikipedia/en_title", "value": null, "limit": 1 } }]

<a href="http://tinyurl.com/6yt8eod" rel="nofollow">Try it out</a>
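Turning those key values into actual links might look roughly like the sketch below. It rests on two assumptions that you should verify against your own data: that en_title keys escape special characters as `$` followed by four uppercase hex digits, and that the standard MediaWiki `curid` parameter resolves an en_id to its page. The helper names (`decode_key`, `title_url`, `id_url`) are hypothetical.

```python
import re
from urllib.parse import quote

# Assumption: keys encode special characters as "$" + 4 uppercase hex digits.
_ESCAPE = re.compile(r"\$([0-9A-F]{4})")


def decode_key(key):
    """Replace each $XXXX escape in a key with the character it stands for."""
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), key)


def title_url(en_title_key):
    """Build an English Wikipedia URL from a /wikipedia/en_title key value."""
    return "https://en.wikipedia.org/wiki/" + quote(decode_key(en_title_key))


def id_url(en_id_value):
    """Build an English Wikipedia URL from a /wikipedia/en_id key value (page id)."""
    return "https://en.wikipedia.org/w/index.php?curid=" + str(int(en_id_value))


print(title_url("Bob_Dylan"))
print(id_url("4637590"))  # arbitrary number used only to show the URL shape
```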


