Lab: Linked Open Data & SPARQL

SPARQL

Objectives

  • Answer 10 queries on the Linked Open Data (LOD) database called DBpedia.
  • Come up with an original query on the Nobel Prize LOD database.



Practice with DBpedia

1. Setup

First, install sparqlkernel:

pip install sparqlkernel
jupyter sparqlkernel install --user bob

and use VS Code to execute your queries. Here is a template notebook you can use to test your installation and complete later with your own queries.

2. Some useful URIs and prefixes

URI              Type      Description
foaf:Person      Class     Class of persons
dbo:City         Class     Class of cities
dbo:Country      Class     Class of countries
foaf:name        Property  Name of a person (among other things)
dbo:birthDate    Property  Birth date of a person
dbo:birthPlace   Property  Birthplace of a person
dbo:deathDate    Property  Death date of a person
dbo:deathPlace   Property  Death place of a person
dbo:city         Property  City
dbo:country      Property  Country to which a place belongs
dbo:mayor        Property  Mayor of a city
dbr:Lyon         Instance  The city of Lyon
dbr:France       Instance  France

Generally, to find a prefix, visit this site.
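
A prefix is only shorthand for the beginning of a URI. As a minimal sketch in Python, the helper below expands a prefixed name into its full URI, using the three prefixes declared in the queries of this lab. The function name `expand` is made up for this illustration and is not part of any library.

```python
# Prefixes used in this lab, as declared in the PREFIX lines of the queries.
PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "dbo": "http://dbpedia.org/ontology/",
    "dbr": "http://dbpedia.org/resource/",
}


def expand(prefixed_name, prefixes=PREFIXES):
    """Expand a prefixed name such as 'dbr:Lyon' into its full URI."""
    prefix, sep, local = prefixed_name.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"unknown or missing prefix in {prefixed_name!r}")
    return prefixes[prefix] + local


print(expand("dbr:Lyon"))        # http://dbpedia.org/resource/Lyon
print(expand("dbo:birthPlace"))  # http://dbpedia.org/ontology/birthPlace
```

Writing `dbr:Lyon` in a query is therefore equivalent to writing `<http://dbpedia.org/resource/Lyon>` in full.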

3. Ten queries to write…

TIP To check the syntax of a query, use the SPARQL query validator. As a bonus, it re-indents your code and makes it more readable!

  1. Show the URLs of people born in Lyon (hereafter called Lyonnais). Here is the expected query:
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbr: <http://dbpedia.org/resource/>

SELECT ?p
WHERE {
  ?p a foaf:Person ;
     dbo:birthPlace dbr:Lyon .
}

Check that the result is the same whether or not ?p is declared as a foaf:Person: in DBpedia, the property dbo:birthPlace applies only to people, so it is not necessary to state that we are looking for a person.

  2. Show the URLs and names (foaf:name) of Lyonnais, in alphabetical order. Why do some Lyonnais have an empty name? (TIP: check their profile.) You are not required to fix this.
  3. Show the names and birth dates of Lyonnais.
  4. Show the names and birth dates of Lyonnais born after 1900 (FILTER(year(?date)>1900)), including their death date if applicable.
  5. Show the names of all Lyonnais who died in Lyon.
  6. Show the names of all Lyonnais who died outside of France.

We need to distinguish between the case where the death place is a country or a city. If the death place is a country, simply check that it is not France. If the death place is a city, check that the city is not in France, meaning there is no link between the city and France.

?deathPlace a ?deathPlaceType .
FILTER (
  (?deathPlaceType = dbo:Country && ?deathPlace != dbr:France) ||
  (?deathPlaceType = dbo:City && NOT EXISTS { ?deathPlace dbo:country dbr:France })
)

  7. Show all French cities whose mayor is a native.
  8. Show all French mayors (i.e., mayors of French cities) born outside France.
  9. Show the number of French mayors born outside France.
  10. Show the 10 cities with the most natives listed in DBpedia, sorted in descending order of native count.
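
Several of these queries return rows in which a variable has no value, such as a missing name or a death date that does not exist yet. As a hedged illustration, the Python sketch below flattens a response in the standard SPARQL JSON results format into plain rows, with `None` for unbound variables. The function name `rows_from_results` and the sample response are made up for this example; the sample is not real DBpedia output.

```python
import json

# Made-up sample in the SPARQL 1.1 JSON results format (not real DBpedia data).
SAMPLE = """
{
  "head": {"vars": ["name", "birth", "death"]},
  "results": {"bindings": [
    {"name":  {"type": "literal", "xml:lang": "en", "value": "Person A"},
     "birth": {"type": "literal", "value": "1901-02-03"},
     "death": {"type": "literal", "value": "1980-05-06"}},
    {"name":  {"type": "literal", "xml:lang": "en", "value": "Person B"},
     "birth": {"type": "literal", "value": "1950-07-08"}}
  ]}
}
"""


def rows_from_results(payload):
    """Return one dict per result row; unbound variables map to None."""
    variables = payload["head"]["vars"]
    rows = []
    for binding in payload["results"]["bindings"]:
        # A variable left unbound (e.g. by OPTIONAL) is absent from the binding.
        rows.append({v: binding[v]["value"] if v in binding else None
                     for v in variables})
    return rows


for row in rows_from_results(json.loads(SAMPLE)):
    print(row)
```

In the second row, `death` is `None` because the binding simply omits that variable, which is how an unmatched optional pattern shows up in the results.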

Querying the Nobel Prize database

Assignment Come up with one or two queries for this database.

Nobel Prize endpoint access.

The data structure description can be found here. It is well done and relatively easy to understand!

Example
PREFIX nobel: <http://data.nobelprize.org/terms/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT DISTINCT ?name WHERE {
    ?person a nobel:Laureate ;
        rdfs:label ?name .
} LIMIT 10



Other LOD databases

There are search engines for databases on a user-chosen topic. However, be aware that many of the databases proposed by these engines are inaccessible (obsolete). You will need to try several before finding a valid one!
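
Since many endpoints turn out to be unreachable, it can save time to probe one before writing queries for it. The Python sketch below is one possible way to do so, using only the standard library: it sends a trivial `ASK` query over the SPARQL protocol and reports whether a JSON boolean comes back. The names `build_probe_request` and `is_alive` are hypothetical, and the endpoint URL in the usage comment is an assumption to replace with the endpoint you want to test.

```python
import json
import urllib.parse
import urllib.request

PROBE_QUERY = "ASK { ?s ?p ?o }"  # true as soon as the store holds one triple


def build_probe_request(endpoint):
    """Build a GET request following the SPARQL protocol (query=... parameter)."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": PROBE_QUERY})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"})


def is_alive(endpoint, opener=urllib.request.urlopen, timeout=10.0):
    """Return True if the endpoint answers the probe with a true JSON boolean."""
    try:
        with opener(build_probe_request(endpoint), timeout=timeout) as response:
            return bool(json.load(response).get("boolean", False))
    except (OSError, ValueError, AttributeError):
        # Covers network errors, timeouts, HTTP errors and unexpected answers.
        return False


# Usage (needs network access; the URL is an assumption):
#   is_alive("https://dbpedia.org/sparql")
```

The `opener` parameter exists only so that the function can be tried without network access by passing a stand-in for `urlopen`.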

Below are examples of recently tested databases, each with a sample query. For others, feel free to consult The Linked Open Data Cloud!

TIP: When you don’t know a database, you can easily find its most used classes with this query:

SELECT ?obj (COUNT(?sub) AS ?count)
WHERE {
  ?sub a ?obj .
}
GROUP BY ?obj
ORDER BY DESC(?count)
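
To make explicit what this query computes, here is a small Python sketch of the same grouping on an in-memory list of triples: keep the typing triples, count the subjects per class, and sort by decreasing count. The triples and the `ex:` names are invented for the illustration, and `most_used_classes` is a hypothetical helper.

```python
from collections import Counter

RDF_TYPE = "rdf:type"  # what the keyword `a` abbreviates in SPARQL

# Made-up triples standing in for an unknown database.
TRIPLES = [
    ("ex:alice", RDF_TYPE, "ex:Person"),
    ("ex:bob", RDF_TYPE, "ex:Person"),
    ("ex:carol", RDF_TYPE, "ex:Person"),
    ("ex:lyon", RDF_TYPE, "ex:City"),
    ("ex:paris", RDF_TYPE, "ex:City"),
    ("ex:france", RDF_TYPE, "ex:Country"),
    ("ex:alice", "ex:bornIn", "ex:lyon"),  # not a typing triple, ignored
]


def most_used_classes(triples):
    """Count instances per class and sort by decreasing count."""
    counts = Counter(obj for _sub, pred, obj in triples if pred == RDF_TYPE)
    return counts.most_common()


print(most_used_classes(TRIPLES))
# [('ex:Person', 3), ('ex:City', 2), ('ex:Country', 1)]
```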


1. WikiData endpoint access.

I advise against using this database because its queries are hard to read: all classes and properties are identified by opaque codes, like wd:Q6256. Yago (see below) may be a better choice.

Example This query lists the identifiers and labels of the countries that are members of the European Union.
PREFIX bd: <http://www.bigdata.com/rdf#> 
PREFIX wikibase: <http://wikiba.se/ontology#> 
PREFIX wd: <http://www.wikidata.org/entity/> 
PREFIX wdt: <http://www.wikidata.org/prop/direct/> 

SELECT ?object ?objectLabel
WHERE {
  ?object wdt:P31 wd:Q6256 .   # instance of: country
  ?object wdt:P463 wd:Q458 .   # member of: European Union

  # Labels in French, falling back to English
  SERVICE wikibase:label { bd:serviceParam wikibase:language "fr,en" . }
}


2. Yago endpoint access.

Example This query lists the properties associated with Elvis Presley.
PREFIX yago: <http://yago-knowledge.org/resource/>

SELECT ?property (GROUP_CONCAT(DISTINCT ?valueOrObject; separator=", ") AS ?values)
WHERE {
  yago:Elvis_Presley ?property ?valueOrObject .
}
GROUP BY ?property


3. European Patent Ontology endpoint access

The description of the data structure can be found here.

Example
PREFIX patent: <http://data.epo.org/linked-data/def/patent/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX st3: <http://data.epo.org/linked-data/def/st3/>
PREFIX text: <http://jena.apache.org/text#>
PREFIX vcard: <http://www.w3.org/2006/vcard/ns#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT ?application ?appNum ?filingDate ?authority {
  ?application rdf:type patent:Application ;
      patent:applicationNumber ?appNum ;
      patent:filingDate        ?filingDate ; 
      patent:applicationAuthority ?authority.
} LIMIT 10


4. Bibliothèque nationale de France endpoint access

Example
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX bio: <http://vocab.org/bio/0.1/>

SELECT ?auteur ?jour ?date1 ?date2 ?nom
WHERE {
  ?auteur foaf:birthday ?jour.
  ?auteur bio:birth ?date1.
  ?auteur bio:death ?date2.
  OPTIONAL {?auteur foaf:name ?nom.}
}
ORDER BY (?jour)
LIMIT 100

An author's URI can be obtained by running the following query:
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?auteur 
WHERE {
  ?auteur a foaf:Person.
  ?auteur foaf:name "Honoré de Balzac".
}


5. Muziekweb endpoint access

Example #1
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX vocab: <https://data.muziekweb.nl/vocab/>
SELECT *
WHERE{
  ?artist a vocab:Performer.
  ?artist rdfs:label ?artistlabel
}
LIMIT 10


Example #2
PREFIX vocab: <https://data.muziekweb.nl/vocab/>

SELECT DISTINCT ?prop
WHERE{
  ?artist a vocab:Performer;
          ?prop ?val.
}


6. OpenStreetMap endpoint access

Example
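# Assumption: these are the prefixes the endpoint predefines;
# they are declared here so that the query is self-contained.
PREFIX osmt: <https://wiki.openstreetmap.org/wiki/Key:>
PREFIX osmm: <https://www.openstreetmap.org/meta/>

SELECT ?ecole ?site ?niveau ?coord
WHERE {
  ?ecole osmt:amenity "school" ;
         osmt:website ?site ;
         osmt:grades  ?niveau ;
         osmm:loc     ?coord .
}
LIMIT 10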


7. DBLP endpoint access

Example This query lists the number of theses per university in dblp (the human-readable version of dblp is available at https://dblp.org).
PREFIX dblp: <https://dblp.org/rdf/schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?school (COUNT(DISTINCT ?thesis) as ?count) WHERE {
  ?thesis rdf:type dblp:Book .
  ?thesis dblp:thesisAcceptedBySchool ?school .
}
GROUP BY ?school 
ORDER BY DESC(?count)


8. Others