Lab: Linked Open Data & SPARQL
Author
- Stéphane Derrode & Lamia Derrode, Centrale Lyon, Mathematics & Computer Sciences Dpt
Objectives
- Answer 10 queries on the Linked Open Data (LOD) database called DBPedia.
- Come up with an original query on the Nobel Prize LOD database.
Practice with DBPedia
1. Setup
First, install sparqlkernel:
pip install sparqlkernel
jupyter sparqlkernel install --user
and use VSCode to execute your queries. Here is a template notebook you can use to test your installation and complete later with your own queries.
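As a quick check that the kernel works, a first notebook cell might look like the sketch below. It assumes the public DBpedia endpoint at https://dbpedia.org/sparql and relies on the %endpoint magic of sparqlkernel; the query itself is deliberately trivial and is only there to confirm that something comes back.

```sparql
# Assumption: the public DBpedia endpoint; replace it with the endpoint you want to query
%endpoint https://dbpedia.org/sparql

# Trivial query: return any three triples to confirm that the kernel and the endpoint respond
SELECT ?s ?p ?o
WHERE {
  ?s ?p ?o .
}
LIMIT 3
```

If the cell returns a small table of three rows, the installation is usable for the rest of the lab.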
2. Some useful URIs and prefixes
URI | Type | Description |
foaf:Person | Class | Class of persons |
dbo:City | Class | Class of cities |
dbo:Country | Class | Class of countries |
foaf:name | Property | Name of a person (among other things) |
dbo:birthDate | Property | Birth date of a person |
dbo:birthPlace | Property | Birthplace of a person |
dbo:deathDate | Property | Death date of a person |
dbo:deathPlace | Property | Death place of a person |
dbo:city | Property | City |
dbo:country | Property | Country to which a place belongs |
dbo:mayor | Property | Mayor of a city |
dbr:Lyon | Instance | The city of Lyon |
dbr:France | Instance | France |
Generally, to find a prefix, visit this site.
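A prefix is only a shorthand for the beginning of a URI, declared once at the top of a query. The sketch below uses the dbo: and dbr: prefixes from this lab to ask for the country recorded for Lyon; it is an illustration built on the table above, and the result depends on what DBPedia currently contains.

```sparql
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbr: <http://dbpedia.org/resource/>

# dbr:Lyon is shorthand for <http://dbpedia.org/resource/Lyon>
# dbo:country is shorthand for <http://dbpedia.org/ontology/country>
SELECT ?country
WHERE {
  dbr:Lyon dbo:country ?country .
}
```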
3. Ten queries to write
TIP: To check the syntax of a query, use the SPARQL query validator. As a bonus, it re-indents your code and makes it easier to read!
- Show the URLs of people born in Lyon (called Lyonnais from here on). Here is the expected query:
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbr: <http://dbpedia.org/resource/>
SELECT ?p
WHERE {
  ?p a foaf:Person ;
     dbo:birthPlace dbr:Lyon .
}
Check that declaring ?p as an instance of the Person class, or omitting that declaration, does not change the result (the property dbo:birthPlace applies only to people in DBPedia, so it is not necessary to specify that we are looking for a person).
- Show the URLs and names (foaf:name) of Lyonnais, in alphabetical order. Why do some Lyonnais have an empty name? (TIP: check their profile.) Solving this issue is not required.
- Show the names and birthdates of Lyonnais.
- Show the names and birthdates of Lyonnais born after 1900 (FILTER(year(?date)>1900)), including their death date if applicable.
- Show the names of all Lyonnais who died in Lyon.
- Show the names of all Lyonnais who died outside of France.
We need to distinguish between the case where the death place is a country or a city. If the death place is a country, simply check that it is not France. If the death place is a city, check that the city is not in France, meaning there is no link between the city and France.
?deathPlace a ?deathPlaceType.
FILTER ((?deathPlaceType = dbo:Country && ?deathPlace != dbr:France) ||
(?deathPlaceType = dbo:City && NOT EXISTS { ?deathPlace dbo:country dbr:France }))
- Show all French cities whose mayor is a native.
- Show all French mayors (i.e., mayors of French cities) born outside France.
- Show the number of French mayors born outside France.
- Show the 10 cities with the most natives listed in DBPedia, sorted in descending order of native count.
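Several of the queries above involve information that may be missing for some resources, such as a death date. The sketch below illustrates the OPTIONAL pattern on a different question, using only classes and properties from the table in section 2: it lists French cities together with their mayor when one is recorded. The LIMIT value is arbitrary, and the query illustrates the technique only; it is not the expected answer to any of the ten exercises.

```sparql
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbr: <http://dbpedia.org/resource/>

SELECT ?city ?mayor
WHERE {
  # Mandatory part: the resource must be a city located in France
  ?city a dbo:City ;
        dbo:country dbr:France .
  # Optional part: cities without a recorded mayor are kept, with ?mayor left unbound
  OPTIONAL { ?city dbo:mayor ?mayor . }
}
ORDER BY ?city
LIMIT 20
```

Without OPTIONAL, every city lacking a dbo:mayor triple would silently disappear from the result.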
Querying the Nobel Prize database
Assignment: Come up with one or two queries for this database.
Nobel Prize endpoint access.
The data structure description can be found here. It is well done and relatively easy to understand!
Example
PREFIX nobel: <http://data.nobelprize.org/terms/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT DISTINCT ?name
WHERE {
  ?person a nobel:Laureate ;
          rdfs:label ?name .
}
LIMIT 10
Other LOD databases
There are search engines for databases on a user-chosen topic. However, be aware that many of the databases proposed by these engines are inaccessible (obsolete). You will need to try several before finding a valid one!
Below are examples of recently tested databases, along with a sample query. For others, feel free to consult The Linked Open Data Cloud!
TIP: When you don’t know a database, you can list its classes, most used first, with this query:
SELECT ?obj
WHERE {
  ?sub a ?obj .
}
GROUP BY ?obj
ORDER BY DESC(COUNT(?sub))
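A variant of the query above, given here as a sketch, also returns the number of instances per class and caps the output, which is convenient on large endpoints. The alias ?n and the limit of 20 are arbitrary choices, not part of the original tip.

```sparql
SELECT ?obj (COUNT(?sub) AS ?n)
WHERE {
  ?sub a ?obj .
}
GROUP BY ?obj
# Sort on the aggregated count, largest classes first
ORDER BY DESC(?n)
LIMIT 20
```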
1. WikiData endpoint access
I advise against using this database because its queries are hard to read (all classes and properties are identified by codes, like wd:Q6256). Yago may be a better choice, see below.
Example
This query lists the identifiers and labels of the countries in the database that are members of the European Union.
PREFIX bd: <http://www.bigdata.com/rdf#>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
SELECT ?object ?objectLabel
WHERE {
  ?object wdt:P31 wd:Q6256 .   # instance of: country
  ?object wdt:P463 wd:Q458 .   # member of: European Union
  # Labels in French, falling back to English
  SERVICE wikibase:label { bd:serviceParam wikibase:language "fr,en" . }
}
2. Yago endpoint access
Example
This query lists the properties associated with Elvis Presley.
PREFIX yago: <http://yago-knowledge.org/resource/>
SELECT ?property (GROUP_CONCAT(DISTINCT ?valueOrObject; separator=", ") AS ?values)
WHERE {
  yago:Elvis_Presley ?property ?valueOrObject .
}
GROUP BY ?property
3. European Patent Ontology endpoint access
The description of the data structure can be found here.
Example
PREFIX patent: <http://data.epo.org/linked-data/def/patent/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX st3: <http://data.epo.org/linked-data/def/st3/>
PREFIX text: <http://jena.apache.org/text#>
PREFIX vcard: <http://www.w3.org/2006/vcard/ns#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?application ?appNum ?filingDate ?authority
WHERE {
  ?application rdf:type patent:Application ;
               patent:applicationNumber ?appNum ;
               patent:filingDate ?filingDate ;
               patent:applicationAuthority ?authority .
}
LIMIT 10
4. Bibliothèque nationale de France endpoint access
Example
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX bio: <http://vocab.org/bio/0.1/>
SELECT ?auteur ?jour ?date1 ?date2 ?nom
WHERE {
  ?auteur foaf:birthday ?jour .
  ?auteur bio:birth ?date1 .
  ?auteur bio:death ?date2 .
  OPTIONAL { ?auteur foaf:name ?nom . }
}
ORDER BY (?jour)
LIMIT 100
An author's URI can be obtained by running the following query:
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?auteur
WHERE {
  ?auteur a foaf:Person .
  ?auteur foaf:name "Honoré de Balzac" .
}
5. Muziekweb endpoint access
Example #1
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX vocab: <https://data.muziekweb.nl/vocab/>
SELECT *
WHERE {
  ?artist a vocab:Performer .
  ?artist rdfs:label ?artistlabel .
}
LIMIT 10
Example #2
PREFIX vocab: <https://data.muziekweb.nl/vocab/>
SELECT DISTINCT ?prop
WHERE {
  ?artist a vocab:Performer ;
          ?prop ?val .
}
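6. OpenStreetMap endpoint access
Example
# The osmt: and osmm: prefixes are not declared here; the query assumes the endpoint predefines them.
SELECT ?ecole ?site ?niveau ?coord
WHERE {
  ?ecole osmt:amenity "school" ;
         osmt:website ?site ;
         osmt:grades ?niveau ;
         osmm:loc ?coord .
}
LIMIT 10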
7. DBLP endpoint access
Example
This query lists the number of theses by university in dblp (the human-readable version of dblp is available at https://dblp.org).
PREFIX dblp: <https://dblp.org/rdf/schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?school (COUNT(DISTINCT ?thesis) AS ?count)
WHERE {
  ?thesis rdf:type dblp:Book .
  ?thesis dblp:thesisAcceptedBySchool ?school .
}
GROUP BY ?school
ORDER BY DESC(?count)
8. Others
- CIA World Factbook endpoint access
- Bio2Rdf endpoint access