NewsReader Data Model ontology 0.2

Namespace Document 09 Jan 2014

Document URL:
http://dkm.fbk.eu/ontologies/newsreader.html (HTML, PDF)
Vocabulary URL:
http://dkm.fbk.eu/ontologies/newsreader.owl (RDF/XML, TURTLE, Manchester Syntax)
Authors:
Marco Rospocher, Luciano Serafini (FBK-Irst)
Francesco Corcoglioniti


Abstract

This specification describes the NewsReader Data Model Ontology, an OWL 2 ontology that formalizes the data model of the KnowledgeStore instance for the NewsReader project. The ontology provides a specialization of the KnowledgeStore Core Data Model ontology with respect to the three Resource, Mention and Entity representation layers. The ontology is based on the annotation guidelines of NewsReader WP3 (to be described in Deliverable D3.3.1: Annotated Data) and on the specification of the NLP Annotation Format (NAF) of NewsReader WP2 (to be described in Deliverable D2.1: System Design).

Status of This Document

The NewsReader Data Model ontology is a work in progress. This document describes the latest version of the ontology as currently used in the NewsReader KnowledgeStore instance.

Overview

The following UML class diagram informally presents an overview of the ontology. Classes are rendered as UML classes, datatype properties as attributes and object properties as UML relations; minimum and maximum cardinalities and expected datatypes are also shown. The components of the ontology are detailed in the following sections.

Class diagram showing ontology classes and properties

The vocabulary namespace is http://dkm.fbk.eu/ontologies/newsreader. The suggested prefix for referencing the vocabulary is nwr:.
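
As a minimal illustration of how the nwr: prefix maps onto the namespace above, the following Python sketch expands prefixed names into full IRIs. The '#' separator between the namespace and the term name is an assumption (this document does not state it), and the function name expand is hypothetical.

```python
# Namespace IRI and suggested prefix as stated in this document.
NWR_NAMESPACE = "http://dkm.fbk.eu/ontologies/newsreader"

# ASSUMPTION: term names are appended after a '#' separator; the document
# gives the namespace IRI but does not spell out the separator.
SEPARATOR = "#"

PREFIXES = {"nwr": NWR_NAMESPACE + SEPARATOR}


def expand(prefixed_name: str) -> str:
    """Expand a prefixed name such as 'nwr:News' into a full IRI."""
    prefix, _, local_name = prefixed_name.partition(":")
    if prefix not in PREFIXES or not local_name:
        raise ValueError(f"cannot expand {prefixed_name!r}")
    return PREFIXES[prefix] + local_name
```

For instance, expand("nwr:News") yields the namespace IRI followed by the separator and News; unknown prefixes raise ValueError rather than being passed through silently.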

A list of classes and properties is reported below, with links to their reference documentation:

Resource layer

For each processed news article, two resources are stored in the KnowledgeStore: (i) a News resource for the article itself, containing its metadata and, optionally, its textual content (depending on availability and copyright agreements); and (ii) a NAFDocument resource storing the NAF document generated for the article.

News resources are described using metadata from the Dublin Core Metadata Terms vocabulary (dct:* attributes), augmented with NewsReader-specific attributes that keep track of the external source document from which the news article was imported: originalFileName, originalFileFormat, originalPages (as defined in NAF).

NAF documents are described with the subset of metadata from the NAF header that is most relevant for selecting NAF documents in the KnowledgeStore. This subset comprises: the NAF version (attribute version); the publicId of the NAF document (attribute dct:identifier); the NAF layers available in the NAF document, e.g., text, terms, deps (attribute layer); the NAF processors used (attribute dct:creator); and the language of the processed document (attribute dct:language). Complete metadata and all the produced linguistic annotations are available in the stored XML content of the NAF document.
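
To make the role of this metadata subset concrete, the sketch below models it as a plain Python record and selects NAF documents by language and available layers. It is only an illustration: the class name NAFDocumentRecord, the function select_naf_documents and all sample values (identifiers, version string, processor names) are hypothetical, and the sketch does not reflect the actual KnowledgeStore API.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class NAFDocumentRecord:
    """Hypothetical in-memory view of the NAFDocument metadata listed above."""
    public_id: str              # dct:identifier (publicId of the NAF document)
    version: str                # nwr:version (NAF version)
    language: str               # dct:language
    layers: FrozenSet[str]      # nwr:layer, e.g. text, terms, deps
    processors: FrozenSet[str]  # dct:creator (NAF processors used)
    annotation_of: str          # nwr:annotationOf (identifier of the News resource)


def select_naf_documents(
    records: Iterable[NAFDocumentRecord],
    language: str,
    required_layers: Iterable[str],
) -> List[NAFDocumentRecord]:
    """Keep records in the given language that provide all required layers."""
    required = frozenset(required_layers)
    return [r for r in records if r.language == language and required <= r.layers]


# Illustrative sample data only; none of these values come from the project.
records = [
    NAFDocumentRecord("naf-1", "v1", "en", frozenset({"text", "terms", "deps"}),
                      frozenset({"example-tokenizer"}), "news-1"),
    NAFDocumentRecord("naf-2", "v1", "en", frozenset({"text"}),
                      frozenset({"example-tokenizer"}), "news-2"),
]
selected = select_naf_documents(records, "en", ["text", "terms"])
```

With the sample data, only the first record is selected, since the second one lacks the terms layer.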

Mention layer

Based on the markables of the WP3 annotation guidelines, four main types of mentions are formalized in this ontology: Entity mentions, Relation mentions, Signal and CSignal mentions, and Value mentions. For all of them, the position of the mention in the news article is encoded with numerical character offsets based on the NLP Interchange Format (NIF) vocabulary (attributes nif:beginIndex, nif:endIndex, nif:anchorOf), so as to enable interoperability with tools consuming NIF data.
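
The offset-based encoding can be sketched as follows. This Python example assumes zero-based character offsets with an exclusive end index (the convention that matches Python slicing); the function name make_mention and the sample sentence are hypothetical.

```python
from typing import Dict, Union


def make_mention(text: str, begin: int, end: int) -> Dict[str, Union[int, str]]:
    """Build the NIF positional attributes of a mention.

    ASSUMPTION: offsets are zero-based and the end index is exclusive,
    so the anchored string is text[begin:end].
    """
    if not 0 <= begin < end <= len(text):
        raise ValueError(f"invalid offsets {begin}-{end} for text of length {len(text)}")
    return {
        "nif:beginIndex": begin,
        "nif:endIndex": end,
        "nif:anchorOf": text[begin:end],
    }


# Illustrative sentence, not taken from the NewsReader corpus.
sentence = "Apple hired a new CEO."
mention = make_mention(sentence, 18, 21)
```

Storing nif:anchorOf alongside the offsets is redundant but lets a consumer verify that the offsets still point at the expected string.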

Entity mentions (class EntityMention) denote entities in the domain of discourse (linked with ks:refersTo). An optional localCorefID attribute can be used to group mentions coreferring within a document (intra-document coreference). Entity mentions are further characterized based on the type of entity:

Relation mentions (class RelationMention) express relations between two entities, whose mentions are identified by source and target links. Different kinds of relation mentions are stored:

Signal and CSignal mentions (respectively, classes SignalMention and CSignalMention) identify pieces of text supporting the existence of a temporal or causal relation, to which they are linked by relations signal and csignal.

Value mentions (class ValueMention) are numerical expressions used for quantities (cardinal numbers in general), percentages and monetary expressions; the type of value is expressed by attribute valueType, whose values are drawn from enumeration ValueType.

Entity layer

Different kinds of entities are stored, including persons, organizations, geo-political entities or locations, events, points and intervals in time extracted from text; the type of entity is conveyed by an rdf:type axiom.

The context in which an axiom holds is described and identified in terms of temporal validity (attribute sem:hasTimeValidity) and time-referenced point of view (attribute sem:hasPointOfView), e.g., the "Financial Times" point of view expressed on 2013/12/15; the Simple Event Model (SEM) and the OWL Time vocabularies are used for that purpose.

Axiom metadata consists of a confidence value (attribute confidence), a provenance indication (attribute dct:source) and a crystallized flag (attribute crystallized). Confidence is represented on a 0.0 to 1.0 scale and quantifies how reliable an extracted statement is. Provenance is stored for background knowledge axioms and denotes the external sources they have been imported from, e.g., DBpedia (note that the adoption of a provenance model to track sources, authority and tool processing activities is still under definition at project level). The crystallized flag is set for axioms belonging to background knowledge or assimilated to it after repeated extraction of the conveyed information, according to some crystallization algorithm.
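
The three metadata attributes can be pictured with a small Python record that enforces the 0.0 to 1.0 confidence range. This is a sketch under stated assumptions: the class AxiomMetadata, the helper is_reliable and its 0.8 threshold are hypothetical and are not part of the ontology or of any crystallization algorithm used in the project.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AxiomMetadata:
    """Hypothetical container for the axiom metadata described above."""
    confidence: Optional[float] = None  # nwr:confidence, on a 0.0 to 1.0 scale
    source: Optional[str] = None        # dct:source, e.g. an external dataset
    crystallized: bool = False          # nwr:crystallized

    def __post_init__(self) -> None:
        # Reject confidence values outside the documented scale.
        if self.confidence is not None and not 0.0 <= self.confidence <= 1.0:
            raise ValueError(f"confidence out of range: {self.confidence}")


def is_reliable(meta: AxiomMetadata, threshold: float = 0.8) -> bool:
    """Illustrative policy only: accept crystallized axioms, or extracted
    axioms whose confidence reaches a (hypothetical) threshold."""
    if meta.crystallized:
        return True
    return meta.confidence is not None and meta.confidence >= threshold
```

A consumer could use such a policy to filter extracted statements, while background knowledge axioms pass through because their crystallized flag is set.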

Terms reference

Classes: | Aspect | CLink | CSignalMention | Certainty | EntityClass | EntityMention | EntityType | EventClass | EventMention | Factuality | FunctionInDocument | GLink | NAFDocument | NAFLayer | NAFProcessor | News | ObjectMention | PartOfSpeech | Participation | Polarity | RelationMention | SLink | SignalMention | SyntacticType | TIMEX3Modifier | TIMEX3Type | TLink | TLinkType | Tense | TimeMention | TimeOrEventMention | ValueMention | ValueType |

Properties: | anchorTime | annotatedWith | annotationOf | aspect | beginPoint | certainty | confidence | crystallized | csignal | endPoint | entityClass | entityType | eventClass | factuality | factualityConfidence | framenetRef | freq | functionInDocument | layer | localCorefID | mod | modality | nombankRef | originalFileFormat | originalFileName | originalPages | polarity | pos | pred | propbankRef | quant | relType | signal | source | syntacticHead | syntacticType | target | temporalFunction | tense | termID | thematicRole | timeType | value | valueFromFunction | valueType | verbnetRef | version |

Classes and Properties (full detail)


Classes

Class: nwr:Aspect

aspect Enumeration of verb aspects.
Used with: nwr:aspect

Class: nwr:CSignalMention

causal signal mention A piece of text supporting the existence of a causal (CLink) relation among events.
Used with: nwr:csignal
Sub class of ks:Mention
Restriction(s): The property ks:refersTo must have at most 0 value(s)

Class: nwr:Certainty

certainty Enumeration of possible types of certainty.
Used with: nwr:certainty

Class: nwr:EntityClass

entity class Enumeration of entity classes.
Used with: nwr:entityClass

Class: nwr:EntityMention

entity mention A piece of text denoting an entity in the domain of discourse (identified by relation ks:refersTo), such as a person, organization or location.
Properties include: nwr:localCorefID
Used with: nwr:source nwr:target
Sub class of ks:Mention
Has sub class nwr:ObjectMention nwr:TimeOrEventMention

Class: nwr:EntityType

entity type Enumeration of entity types.
Used with: nwr:entityType

Class: nwr:EventClass

event class Enumeration of event classes.
Used with: nwr:eventClass

Class: nwr:EventMention

event mention A mention of an event.
Properties include: nwr:eventClass nwr:factuality nwr:pos nwr:pred nwr:tense nwr:factualityConfidence nwr:modality nwr:polarity nwr:certainty nwr:aspect
Sub class of nwr:TimeOrEventMention

Class: nwr:Factuality

factuality Enumeration of possible types of factuality.
Used with: nwr:factuality

Class: nwr:FunctionInDocument

function in document Enumeration of possible functions of a time mention in a document.
Used with: nwr:functionInDocument

Class: nwr:NAFDocument

NAF annotation The annotation of a news article according to the NAF format, consisting of one or more layers of NLP annotations encoded in a standoff, XML-based format.
Properties include: nwr:layer nwr:annotationOf
Sub class of nfo:TextDocument ks:Resource
Restriction(s): The property nwr:layer must have some nwr:NAFLayer value(s)
The property dcterms:creator must have some nwr:NAFProcessor value(s)
The property nwr:annotationOf must have some nwr:News value(s)

Class: nwr:NAFLayer

NAF layer A NAF layer. Currently defined layers include text, terms, dependencies (deps), chunks, entities, coreferences, opinions, events and timex3 expressions.
Used with: nwr:layer

Class: nwr:NAFProcessor

NAF processor An NLP module able to produce (and possibly consume) NAF contents, characterized by a name and version.

Class: nwr:News

news A news article, consisting of the plain text of the news and associated metadata.
Used with: nwr:annotationOf
Sub class of nfo:TextDocument ks:Resource

Class: nwr:ObjectMention

object mention A mention of an endurant object (in KR literature), such as a person, organization or location (known as 'entities' in the NLP literature).
Properties include: nwr:entityType nwr:syntacticType nwr:syntacticHead nwr:entityClass
Sub class of nwr:EntityMention

Class: nwr:PartOfSpeech

part-of-speech Enumeration of possible parts of speech.
Used with: nwr:pos

Class: nwr:Participation

participation A mention denoting the participation of an object (e.g., a person) in a certain event, further characterized by the role played by that object.
Properties include: nwr:thematicRole
Sub class of nwr:RelationMention
Restriction(s): The property nwr:target must have some nwr:ObjectMention value(s)
The property nwr:source must have some nwr:EventMention value(s)
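
The two restrictions on nwr:Participation can be read as simple data checks, as in the Python sketch below. It treats the restrictions as closed-world validation rules, which is a simplification of OWL semantics; the function name and the mention identifiers are hypothetical.

```python
from typing import Dict, List, Set


def satisfies_participation_restrictions(
    participation: Dict[str, List[str]],
    types_of: Dict[str, Set[str]],
) -> bool:
    """Check that a participation has some nwr:EventMention as nwr:source
    and some nwr:ObjectMention as nwr:target.

    `participation` maps property names to lists of mention identifiers;
    `types_of` maps each mention identifier to the set of its classes.
    """
    def has_some(prop: str, required_class: str) -> bool:
        values = participation.get(prop, [])
        return any(required_class in types_of.get(v, set()) for v in values)

    return (has_some("nwr:source", "nwr:EventMention")
            and has_some("nwr:target", "nwr:ObjectMention"))


# Hypothetical mentions used only for illustration.
types_of = {
    "m-event": {"nwr:EventMention"},
    "m-person": {"nwr:ObjectMention"},
}
valid = {"nwr:source": ["m-event"], "nwr:target": ["m-person"]}
swapped = {"nwr:source": ["m-person"], "nwr:target": ["m-event"]}
```

Here valid passes the check, while swapped fails because its source is not an event mention and its target is not an object mention.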

Class: nwr:Polarity

polarity Enumeration of event polarities (either positive or negative).
Used with: nwr:polarity

Class: nwr:RelationMention

relation mention A piece of text expressing a relation between two entities, whose mentions are identified by nwr:source and nwr:target links.
Properties include: nwr:target nwr:source
Sub class of ks:Mention
Restriction(s): The property ks:refersTo must have at most 0 value(s)
Has sub class nwr:TLink nwr:CLink nwr:Participation nwr:GLink nwr:SLink

Class: nwr:SignalMention

signal mention A piece of text supporting the existence of a temporal (TLink) relation among events and/or time expressions.
Used with: nwr:signal
Sub class of ks:Mention
Restriction(s): The property ks:refersTo must have at most 0 value(s)

Class: nwr:SyntacticType

syntactic type Enumeration of syntactic types, such as proper name (nwr:syn_nam), pronoun (nwr:syn_pro), ...
Used with: nwr:syntacticType

Class: nwr:TIMEX3Modifier

TIMEX3 modifier Enumeration of possible TIMEX3 modifiers.
Used with: nwr:mod

Class: nwr:TIMEX3Type

TIMEX3 type Enumeration of TIMEX3 temporal expression types.
Used with: nwr:timeType

Class: nwr:TLinkType

TLink type Enumeration of TLink types.
Used with: nwr:relType

Class: nwr:Tense

tense Enumeration of verb tenses.
Used with: nwr:tense

Class: nwr:TimeOrEventMention

time or event mention Utility concept aggregating mentions of events and mentions of time expressions.
Sub class of nwr:EntityMention
Has sub class nwr:TimeMention nwr:EventMention
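
The entity mention hierarchy stated in the class entries above can be captured in a small lookup table. The following Python sketch computes all (direct and indirect) superclasses of a class; the table only encodes subclass links given in this document, and the function name superclasses is hypothetical.

```python
from typing import Dict, Set

# Direct subclass links, as stated in the class entries of this document.
SUBCLASS_OF: Dict[str, Set[str]] = {
    "nwr:EntityMention": {"ks:Mention"},
    "nwr:ObjectMention": {"nwr:EntityMention"},
    "nwr:TimeOrEventMention": {"nwr:EntityMention"},
    "nwr:TimeMention": {"nwr:TimeOrEventMention"},
    "nwr:EventMention": {"nwr:TimeOrEventMention"},
}


def superclasses(cls: str) -> Set[str]:
    """Return every class reachable from `cls` by following subclass links."""
    seen: Set[str] = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in SUBCLASS_OF.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen
```

For example, an event mention is also a time-or-event mention, an entity mention and a ks:Mention, so properties with domain nwr:EntityMention (such as nwr:localCorefID) apply to it as well.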

Class: nwr:ValueMention

value mention A numerical expression denoting either a quantity (cardinal numbers in general), a percentage or a monetary value.
Properties include: nwr:valueType
Sub class of ks:Mention
Restriction(s): The property ks:refersTo must have at most 0 value(s)
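
The "at most 0 value(s)" restriction on ks:refersTo, shared by relation, signal, causal signal and value mentions, means that such mentions never refer to an entity. A possible closed-world check is sketched below in Python; the function name is hypothetical, and the caller is assumed to pass the full set of classes of a mention, including inherited ones.

```python
from typing import Iterable, Set

# Classes for which this document states that ks:refersTo must have
# at most 0 values.
NO_REFERENT_CLASSES: Set[str] = {
    "nwr:CSignalMention",
    "nwr:RelationMention",
    "nwr:SignalMention",
    "nwr:ValueMention",
}


def violates_no_referent(mention_types: Iterable[str], refers_to: Iterable[str]) -> bool:
    """Return True if a mention of a restricted class has any ks:refersTo value.

    ASSUMPTION: `mention_types` already contains superclasses, e.g. a
    participation is listed as both nwr:Participation and nwr:RelationMention.
    """
    restricted = bool(NO_REFERENT_CLASSES & set(mention_types))
    return restricted and any(True for _ in refers_to)
```

Entity mentions are not in the restricted set, so they may carry ks:refersTo values without triggering the check.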

Class: nwr:ValueType

value type Enumeration of value types.
Used with: nwr:valueType

Properties

Property: nwr:anchorTime

anchor time Links a time mention whose time value cannot be independently determined to an anchoring mention that allows its value to be resolved.
Domain: nwr:TimeMention
Range: nwr:TimeMention

Property: nwr:annotatedWith

annotated with Specifies the NAF annotation(s) associated with a news resource.
Inverse property of nwr:annotationOf

Property: nwr:annotationOf

annotation of Specifies the news resource a NAF annotation resource is associated with.
Domain: nwr:NAFDocument
Range: nwr:News
Has inverse property nwr:annotatedWith
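
Since nwr:annotatedWith is the inverse of nwr:annotationOf, its values can be derived rather than stored. The Python sketch below builds the inverse index from a list of (NAF document, news) pairs; the function name and the identifiers are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def annotated_with(annotation_of: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Invert (naf_document, news) pairs into a news -> NAF documents index."""
    inverse: Dict[str, Set[str]] = defaultdict(set)
    for naf_id, news_id in annotation_of:
        inverse[news_id].add(naf_id)
    return dict(inverse)


# Hypothetical identifiers used only for illustration.
pairs = [("naf-1", "news-1"), ("naf-2", "news-1"), ("naf-3", "news-2")]
index = annotated_with(pairs)
```

In the sample data, news-1 ends up annotated with two NAF documents and news-2 with one.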

Property: nwr:aspect

aspect Specifies the aspect of the verb conveying the mentioned event.
Domain: nwr:EventMention
Range: nwr:Aspect

Property: nwr:beginPoint

begin point Links a time mention denoting a time interval to the time mention denoting the beginning of that interval.
Domain: nwr:TimeMention
Range: nwr:TimeMention

Property: nwr:certainty

certainty Specifies whether and how a mentioned event is certain.
Domain: nwr:EventMention
Range: nwr:Certainty

Property: nwr:confidence

confidence Specifies a confidence value on a 0-1 scale.
Range: xsd:decimal

Property: nwr:crystallized

crystallized Specifies whether an axiom has been crystallized (i.e., it can be considered as background knowledge).
Domain: ks:Axiom
Range: xsd:boolean

Property: nwr:csignal

csignal Associates a CLink mention to the CSignal mention denoting the existence of the relation.
Domain: nwr:CLink
Range: nwr:CSignalMention

Property: nwr:endPoint

end point Links a time mention denoting a time interval to the time mention denoting the end of that interval.
Domain: nwr:TimeMention
Range: nwr:TimeMention

Property: nwr:entityClass

entity class Specifies the definiteness of the mentioned entity.
Domain: nwr:ObjectMention
Range: nwr:EntityClass

Property: nwr:entityType

entity type Specifies the semantic type of the mentioned entity.
Domain: nwr:ObjectMention
Range: nwr:EntityType

Property: nwr:eventClass

event class Specifies the semantic type of the mentioned event.
Domain: nwr:EventMention
Range: nwr:EventClass

Property: nwr:factuality

factuality Specifies whether and how a mentioned event is factual.
Domain: nwr:EventMention
Range: nwr:Factuality

Property: nwr:factualityConfidence

factuality confidence Specifies the degree of confidence in a factuality prediction.
Domain: nwr:EventMention
Range: xsd:double

Property: nwr:framenetRef

FrameNet reference Encodes a link to a FrameNet object.

Property: nwr:freq

freq Used for specifying sets that denote quantified times. It contains an integer value and a time granularity to represent any frequency contained in the set. Usual values are '2X' (twice-a-month), '3D' (three-days), etc.
Domain: nwr:TimeMention
Range: xsd:string
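
A nwr:freq value combines an integer with a time granularity, as in the examples '2X' and '3D' above. The Python sketch below splits such a string into its two parts; it assumes the value is a run of digits followed by a run of letters, and it does not attempt to interpret the granularity code. The function name parse_freq is hypothetical.

```python
import re
from typing import Tuple

# ASSUMPTION: a freq value is one or more digits followed by one or more
# letters, which covers the examples given in this document.
_FREQ_PATTERN = re.compile(r"(\d+)([A-Za-z]+)")


def parse_freq(freq: str) -> Tuple[int, str]:
    """Split a freq string such as '3D' into (3, 'D')."""
    match = _FREQ_PATTERN.fullmatch(freq)
    if match is None:
        raise ValueError(f"unrecognized freq value: {freq!r}")
    return int(match.group(1)), match.group(2)
```

Values that do not fit the assumed pattern raise ValueError, so unexpected encodings surface early instead of being silently misread.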

Property: nwr:functionInDocument

function in document Specifies the function of a time mention within the containing document (e.g., document creation date).
Domain: nwr:TimeMention
Range: nwr:FunctionInDocument

Property: nwr:layer

available layer Specifies the NAF layers available in a NAF annotation resource.
Domain: nwr:NAFDocument
Range: nwr:NAFLayer

Property: nwr:localCorefID

local coreference ID Specifies the ID of the intra-document coreference cluster an entity mention belongs to.
Domain: nwr:EntityMention
Range: xsd:string
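
Intra-document coreference clusters can be recovered by grouping entity mentions on their nwr:localCorefID value. The following Python sketch does so for mentions held in a dictionary; the function name, the mention identifiers and the cluster IDs are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def coreference_clusters(mentions: Dict[str, Dict[str, str]]) -> Dict[str, List[str]]:
    """Group mention identifiers by their nwr:localCorefID value.

    Mentions without a localCorefID (the attribute is optional) are skipped.
    """
    clusters: Dict[str, List[str]] = defaultdict(list)
    for mention_id, attributes in mentions.items():
        coref_id = attributes.get("nwr:localCorefID")
        if coref_id is not None:
            clusters[coref_id].append(mention_id)
    return dict(clusters)


# Hypothetical mentions from a single document.
mentions = {
    "m1": {"nif:anchorOf": "Apple", "nwr:localCorefID": "c1"},
    "m2": {"nif:anchorOf": "the company", "nwr:localCorefID": "c1"},
    "m3": {"nif:anchorOf": "a new CEO"},
}
clusters = coreference_clusters(mentions)
```

Because the identifier is local, clusters from different documents should not be merged on the basis of equal localCorefID values alone.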

Property: nwr:mod

mod Used for temporal modifiers that cannot be expressed either within value proper, or via links or temporal functions.
Domain: nwr:TimeMention
Range: nwr:TIMEX3Modifier

Property: nwr:modality

modality Conveys different degrees of modality of an event. Its value is the lemma of the modal verb modifying the main event, e.g., may (English), potere (Italian), poder (Spanish).
Domain: nwr:EventMention
Range: xsd:string

Property: nwr:nombankRef

NomBank reference Encodes a link to a NomBank object.

Property: nwr:originalFileFormat

original file format The file format of the original document from which a News was imported (NAF property).
Range: xsd:string

Property: nwr:originalFileName

original file name The file name of the original document from which a News was imported (NAF property).
Range: xsd:string

Property: nwr:originalPages

original pages The number of pages of the original document from which a News was imported (NAF property).
Range: xsd:int

Property: nwr:polarity

polarity Specifies the polarity of the mentioned event.
Domain: nwr:EventMention
Range: nwr:Polarity

Property: nwr:pos

pos Specifies the part-of-speech for the event mention.
Domain: nwr:EventMention
Range: nwr:PartOfSpeech

Property: nwr:pred

pred Specifies the lemma of the token describing the event.
Domain: nwr:EventMention
Range: xsd:string

Property: nwr:propbankRef

PropBank reference Encodes a link to a PropBank object.

Property: nwr:quant

quant Used for specifying sets that denote quantified times. Generally a literal from the text that quantifies over the expression. Usual values are 'EVERY', 'SOME', etc.
Domain: nwr:TimeMention
Range: xsd:string

Property: nwr:relType

relation type Specifies the type of TLink relation.
Domain: nwr:TLink
Range: nwr:TLinkType

Property: nwr:signal

signal Associates a TLink mention to the Signal mention denoting the existence of the relation.
Domain: nwr:TLink
Range: nwr:SignalMention

Property: nwr:source

source Specifies the first argument of a relation mention.
Domain: nwr:RelationMention
Range: nwr:EntityMention

Property: nwr:syntacticHead

syntactic head Specifies the syntactic head of a mention, which is a string contained in the mention extent.
Domain: nwr:ObjectMention
Range: xsd:string

Property: nwr:syntacticType

syntactic type Specifies the syntactic category of the mention.
Domain: nwr:ObjectMention
Range: nwr:SyntacticType

Property: nwr:target

target Specifies the second argument of a relation mention.
Domain: nwr:RelationMention
Range: nwr:EntityMention

Property: nwr:temporalFunction

temporal function Specifies whether a time mention is used as a temporal function.
Domain: nwr:TimeMention
Range: xsd:boolean

Property: nwr:tense

tense Specifies the tense of the verb conveying the mentioned event.
Domain: nwr:EventMention
Range: nwr:Tense

Property: nwr:termID

term ID Specifies the term ID(s) that constitute a mention extent.
Domain: ks:Mention
Range: xsd:string

Property: nwr:thematicRole

thematic role Specifies the thematic role of an object in an event.
Domain: nwr:Participation

Property: nwr:timeType

time type Specifies the type of time expressed by a time mention.
Domain: nwr:TimeMention
Range: nwr:TIMEX3Type

Property: nwr:value

value Specifies the normalized value of a temporal expression using the ISO-8601 standard.
Domain: nwr:TimeMention
Range: xsd:string
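
Because nwr:value is typed as a plain string, consumers need to interpret it themselves. The Python sketch below handles only the simplest case, a full calendar date in the extended YYYY-MM-DD form, and returns None for anything else (durations, partial dates and other encodings are deliberately left uninterpreted). The function name parse_calendar_date is hypothetical.

```python
import re
from datetime import date
from typing import Optional

# Only the extended calendar-date form YYYY-MM-DD is recognized here.
_CALENDAR_DATE = re.compile(r"(\d{4})-(\d{2})-(\d{2})")


def parse_calendar_date(value: str) -> Optional[date]:
    """Return a date for values like '2013-12-15', or None otherwise."""
    match = _CALENDAR_DATE.fullmatch(value)
    if match is None:
        return None
    year, month, day = (int(group) for group in match.groups())
    try:
        return date(year, month, day)
    except ValueError:
        # Well-formed pattern but impossible date, e.g. day 30 of February.
        return None
```

Returning None instead of raising keeps the original string usable for values that this simple parser does not cover.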

Property: nwr:valueFromFunction

value from function Used when the value is taken from a TIMEX3 expression acting as a temporal function.
Domain: nwr:TimeMention
Range: nwr:TimeMention

Property: nwr:valueType

value type Specifies the type of value expressed by a value mention.
Domain: nwr:ValueMention
Range: nwr:ValueType

Property: nwr:verbnetRef

VerbNet reference Encodes a link to a VerbNet object.

Property: nwr:version

version Specifies the version of an artefact.
Range: xsd:string