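I am trying to learn how to use RDF and, as a learning exercise, am attempting to pull a set of facts out of DBpedia. The following code sample works, but for properties such as spouse it always pulls out the person themselves as well.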
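Questions:
- get_name_from_uri() pulls out the last part of the URI and removes the underscores - there must be a better way to get a person's name
- The spouse results pull back not only the spouse but also the data subject - I am not sure what is going on there
- Some results return data both as URIs and as text items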
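For the first and third questions, one possible tidy-up is sketched below using only the standard library. It is a minimal sketch under stated assumptions, not a definitive fix: `name_from_uri` and `display_value` are hypothetical helper names, the `http://dbpedia.org/resource/` prefix check is an assumption based on the URIs in this question, and Python 3 is assumed. `urllib.parse.unquote` decodes percent-escapes such as `%C3%A9`.

```python
from urllib.parse import unquote

# Assumed prefix for DBpedia resource URIs (taken from the URIs in the question).
RESOURCE_PREFIX = "http://dbpedia.org/resource/"


def name_from_uri(uri):
    """Return a readable name from the last path segment of a URI."""
    last_segment = uri.rstrip("/").rsplit("/", 1)[-1]
    # Decode percent-escapes, then swap underscores for spaces.
    return unquote(last_segment).replace("_", " ")


def display_value(value):
    """Render a resource URI as a name and leave plain text untouched."""
    text = str(value)
    if text.startswith(RESOURCE_PREFIX):
        return name_from_uri(text)
    return text


print(name_from_uri("http://dbpedia.org/resource/Jos%C3%A9phine_de_Beauharnais"))
print(display_value("http://dbpedia.org/resource/Emperor_of_the_French"))
print(display_value("First Consul of France"))
```

Passing every value through one function like `display_value` would make the title output consistent whether the stored value is a URI or plain text.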
This is the output from the code block, showing some of the odd results I am getting (see the mixed output for the properties).
Accessing facts for Napoleon held at http://dbpedia.org/resource/Napoleon
There are 800 facts about Napoleon stored at the URI
http://dbpedia.org/resource/Napoleon
Here are a few:-
Ontology:deathdate
Napoleon died on 1821-05-05
Ontology:birthdate
Napoleon was born on 1769-08-15
Property:spouse returns the person themselves twice!
Napoleon was married to Marie Louise, Duchess of Parma
Napoleon was married to Napoleon
Napoleon was married to Jos%C3%A9phine de Beauharnais
Napoleon was married to Napoleon
Property:title returns text and URIs
Napoleon Held the title: "The Death of Napoleon"
Napoleon Held the title: http://dbpedia.org/resource/Emperor_of_the_French
Napoleon Held the title: http://dbpedia.org/resource/King_of_Italy
Napoleon Held the title: First Consul of France
Napoleon Held the title: Provisional Consul of France
Napoleon Held the title: http://dbpedia.org/resource/Napoleon
Napoleon Held the title: Emperor of the French
Napoleon Held the title: http://dbpedia.org/resource/Co-Princes_of_Andorra
Napoleon Held the title: from the Memoirs of Bourrienne, 1831
Napoleon Held the title: Protector of the Confederation of the Rhine
Ontology birth place returns three records
Napoleon was born in Ajaccio
Napoleon was born in Corsica
Napoleon was born in Early modern France
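One possible explanation for the repeated "Napoleon was married to Napoleon" lines, offered as an assumption rather than something the output confirms: the data fetched for a resource may also contain triples in which that resource is the object (for example, a spouse's own spouse statement pointing back at Napoleon), and a lookup by predicate alone returns every such pair. The sketch below models triples as plain Python tuples instead of an rdflib graph. The triples are made up for illustration, and both function names are hypothetical.

```python
# Hypothetical (subject, predicate, object) triples; not real DBpedia data.
NAPOLEON = "http://dbpedia.org/resource/Napoleon"
MARIE_LOUISE = "http://dbpedia.org/resource/Marie_Louise,_Duchess_of_Parma"
JOSEPHINE = "http://dbpedia.org/resource/Jos%C3%A9phine_de_Beauharnais"
SPOUSE = "http://dbpedia.org/property/spouse"

triples = [
    (NAPOLEON, SPOUSE, MARIE_LOUISE),
    (MARIE_LOUISE, SPOUSE, NAPOLEON),  # incoming link: Napoleon is the object
    (NAPOLEON, SPOUSE, JOSEPHINE),
    (JOSEPHINE, SPOUSE, NAPOLEON),     # incoming link: Napoleon is the object
]


def objects_for_predicate(triples, predicate):
    """Match on the predicate only, whatever the subject is."""
    return [o for (s, p, o) in triples if p == predicate]


def objects_for_subject(triples, subject, predicate):
    """Match on both the subject and the predicate."""
    return [o for (s, p, o) in triples if s == subject and p == predicate]


# Predicate-only matching also picks up the two incoming links.
print(objects_for_predicate(triples, SPOUSE))
# Fixing the subject leaves only Napoleon's own spouse statements.
print(objects_for_subject(triples, NAPOLEON, SPOUSE))
```

In rdflib, `Graph.subject_objects(predicate)` matches on the predicate alone, whereas `Graph.objects(subject, predicate)` restricts the match to one subject, so swapping the calls in the script may be worth trying.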
This is the Python that produces the output above. It requires rdflib and is very much a work in progress.
import rdflib
from rdflib import Graph, URIRef, RDF

######################################
# A quick test of the Python library rdflib to get data from an RDF graph
# D Moore 15/3/2014
# needs rdflib > version 3.0
# CHANGE THE URI BELOW TO A DIFFERENT PERSON AND SEE WHAT HAPPENS
# COULD DO WITH A WEB FORM
# NOTES:
#
#URI_ref = 'http://dbpedia.org/resource/Richard_Nixon'
#URI_ref = 'http://dbpedia.org/resource/Margaret_Thatcher'
#URI_ref = 'http://dbpedia.org/resource/Isaac_Newton'
URI_ref = 'http://dbpedia.org/resource/Napoleon'
#URI_ref = 'http://dbpedia.org/resource/apple'
##########################################################


def get_name_from_uri(dbpedia_uri):
    # pulls the last part of a uri out and removes underscores
    # got to be an easier way but it works
    output_string = ""
    s = dbpedia_uri
    # chop the url into bits divided by the /
    tokens = s.split("/")
    # because the name of our person is in the last section, iterate through
    # each token and replace the underscore with a space
    for token in tokens:
        output_string = token.replace('_', ' ')
    # returns the name of the person without underscores
    return output_string


def is_person(uri):
    ##### SPARQL way to do this
    uri = URIRef(uri)
    person = URIRef('http://dbpedia.org/ontology/Person')
    g = Graph()
    g.parse(uri)
    resp = g.query(
        "ASK {?uri a ?person}",
        initBindings={'uri': uri, 'person': person}
    )
    print(uri, "is a person?", resp.askAnswer)
    return resp.askAnswer


URI_NAME = get_name_from_uri(URI_ref)
NAME_LABEL = ''

if is_person(URI_ref):
    print("Accessing facts for", URI_NAME, " held at ", URI_ref)
    g = Graph()
    g.parse(URI_ref)
    print("Person Extract for", URI_NAME)
    print("There are ", len(g), " facts about", URI_NAME, "stored at the URI ", URI_ref)
    print("Here are a few:-")
    # Ok so lets get some facts for our person
    for stmt in g.subject_objects(URIRef("http://dbpedia.org/ontology/birthName")):
        print(URI_NAME, "was born " + str(stmt[1]))
    for stmt in g.subject_objects(URIRef("http://dbpedia.org/ontology/deathDate")):
        print(URI_NAME, "died on", str(stmt[1]))
    for stmt in g.subject_objects(URIRef("http://dbpedia.org/ontology/birthDate")):
        print(URI_NAME, "was born on", str(stmt[1]))
    for stmt in g.subject_objects(URIRef("http://dbpedia.org/ontology/eyeColor")):
        print(URI_NAME, "had eyes coloured", str(stmt[1]))
    for stmt in g.subject_objects(URIRef("http://dbpedia.org/property/spouse")):
        print(URI_NAME, "was married to ", get_name_from_uri(str(stmt[1])))
    for stmt in g.subject_objects(URIRef("http://dbpedia.org/ontology/reigned")):
        print(URI_NAME, "reigned ", get_name_from_uri(str(stmt[1])))
    for stmt in g.subject_objects(URIRef("http://dbpedia.org/ontology/children")):
        print(URI_NAME, "had a child called ", get_name_from_uri(str(stmt[1])))
    for stmt in g.subject_objects(URIRef("http://dbpedia.org/property/profession")):
        print(URI_NAME, "(PROPERTY profession) was trained as a ", get_name_from_uri(str(stmt[1])))
    for stmt in g.subject_objects(URIRef("http://dbpedia.org/property/child")):
        print(URI_NAME, "PROPERTY child ", get_name_from_uri(str(stmt[1])))
    for stmt in g.subject_objects(URIRef("http://dbpedia.org/property/deathplace")):
        print(URI_NAME, "(PROPERTY death place) died at: ", str(stmt[1]))
    for stmt in g.subject_objects(URIRef("http://dbpedia.org/property/title")):
        print(URI_NAME, "(PROPERTY title) Held the title: ", str(stmt[1]))
    for stmt in g.subject_objects(URIRef("http://dbpedia.org/ontology/sex")):
        print(URI_NAME, "was a ", str(stmt[1]))
    for stmt in g.subject_objects(URIRef("http://dbpedia.org/ontology/knownfor")):
        print(URI_NAME, "was known for ", str(stmt[1]))
    for stmt in g.subject_objects(URIRef("http://dbpedia.org/ontology/birthPlace")):
        print(URI_NAME, "was born in ", get_name_from_uri(str(stmt[1])))
else:
    print("ERROR - ")
    print("Resource", URI_ref, 'does not look to be a person or there is no record in dbpedia')