Maxine is a PhD student in sociology. She is searching for the most recent surveys that cover questions on migration, and for publications that analyse these datasets. As a newcomer, she is unsure which academic search portal to use for finding such interconnected information. The GESIS KG contains links between social science research data and publications. This data is integrated into the GESIS Search portal, so Maxine can use GESIS Search to find the information she needs.
You can find an example SPARQL query here.
Will is a senior researcher in information science. In his research, he is investigating citation behaviour and usage of data in different scientific disciplines across time. To analyse data citation and usage behaviour in the social science domain, Will can query the necessary data from the GESIS KG via its provided SPARQL endpoint. Alternatively, he can download the KG as a dump file to analyse it offline with other preferred tools. The provided documentation of the underlying schema of the GESIS KG helps him to understand the structure of the data.
You can find an example SPARQL query here.
Nancy is a research data engineer at a research infrastructure organization. In her current project, she needs to integrate metadata about scientific resources from GESIS and other organizations into another search system. To do so, she needs to access and harvest metadata from the GESIS KG via an OAI-PMH API. The OAI-PMH API of the GESIS KG allows Nancy to harvest the GESIS metadata she needs in a standardized format. The webpage of the API also documents how to use the API and how to harvest the data.
The GESIS Knowledge Graph contains content from the GESIS Search, which comprises information about social science research data, publications on research data, and open access publications. Detailed information about the content of the GESIS Search can be found here. GESIS Search aggregates information from different data collections of GESIS. Additionally, the GESIS KG comprises links between scientific resources, such as links between research data and publications or between publications and instruments; these links are also integrated and available in the GESIS Search.
Detailed provenance information about the original data sources and the source of the links is captured in the GESIS KG and described in the section Provenance.
The GESIS Knowledge Graph comprises several types of scientific resources, entities and the direct relationships without provenance between them. The latest version of the GESIS Knowledge Graph ontology can be found in the Download section.
The figure above illustrates the four main scientific resources currently contained in the GESIS KG. We distinguish between scientific resources and entities. The main resources and entities of the GESIS Knowledge Graph are listed and described below:
Resource name | Class |
---|---|
Dataset | schema:Dataset |
Publication | schema:ScholarlyArticle |
Instrument | ddi:Instrument |
Variable | ddi:Variable |
Entity name | Class |
---|---|
Person | schema:Person |
Organization | schema:Organization |
Location | schema:Place |
Keyword, Concept, Topic | schema:DefinedTerm |
Scientific resources may occur as part of a group of resources. The following classes are used in the data model to reflect such groups.
Class |
---|
schema:DataCatalog |
schema:CreativeWorkSeries |
schema:Collection |
schema:Periodical |
This section describes the relationships between scientific resources in the GESIS Knowledge Graph and how they are represented in the data model.
The figure above depicts indirect relationships with provenance. In the GESIS KG Ontology, we provide both direct relationships without provenance and indirect relationships with provenance: the former allow easy querying, while the latter capture additional metadata about how the links were created. Since the GESIS KG represents many-to-many (m:n) links between scientific resources and includes specific data about these links, additional classes for references and link metadata are incorporated into the data model.
Class | Subclass of |
---|---|
gesiskg:DatasetReference | gesiskg:Reference |
gesiskg:PublicationReference | gesiskg:Reference |
gesiskg:VariableReference | gesiskg:Reference |
gesiskg:LinkMetadata | gesiskg:ReferenceMetadata |
gesiskg:DuplicateMetadata | gesiskg:ReferenceMetadata |
Links between resources in the GESIS KG are either manually curated or automatically generated. Manual links are created by GESIS staff or derived from research data bibliographies curated for specific research data programs. These manually linked resources are clearly marked as such and typically point to unique research datasets. The curation of these links is handled by the Manual Link Curation Pipeline.
For automatically generated links between publications and research data, GESIS has developed the Dataset Citation Detection Pipeline. This pipeline, an extension of the InfoLink tool, is designed to detect and disambiguate dataset citations. It identifies mentions of research data within full texts and automatically links them to the corresponding datasets. These links are marked as automatic and may not always refer to a single, unique dataset, but rather to potential datasets used in the publication. Note that the automatically generated links have not yet been evaluated by domain experts; this evaluation is planned as future work.
Two further pipelines generate links automatically. The Variable Detection Pipeline, developed as part of the DFG-funded VADIS project, identifies links between publications and variables. The Publication Citation Detection Pipeline, developed under the DFG-funded Outcite project, identifies citation links between publications and their referenced works.
Metadata name | Description | Property |
---|---|---|
Link context | Text snippet or annotation marking the reason why a link has been detected | gesiskg:linkContext |
Link score | Computed confidence score of the automatically generated link | gesiskg:linkScore |
Linking method | Specifies whether a link is manually curated, automatically generated, or a search link | gesiskg:linkingMethod |
Link type | Specifies whether a link is a citation or marks a methodological usage of a dataset | gesiskg:linkType |
Link source | Specifies information about the source of a link, e.g., naming the pipeline by which a link has been generated or the project in which a manual link has been identified | gesiskg:linkSource |
The following table describes how provenance information is reflected in the GESIS KG. Separate properties capture which data source within GESIS Search a resource originates from, where a dataset mentioned in a publication originates from, which link detection pipeline or manual effort produced a link, and versioning information.
Metadata name | Description | Property |
---|---|---|
Source info | Specifies information about the original data source of a scientific resource | gesiskg:sourceInfo |
Data source | Specifies information about the source of a dataset mentioned in a publication | gesiskg:dataSource |
Link source | Specifies information about the source of a link, e.g., naming the pipeline by which a link has been generated or the project in which a manual link has been identified | gesiskg:linkSource |
Version | Specifies the versioning information of a resource if available | schema:version |
Resources and entities in the GESIS Knowledge Graph hold several kinds of identifiers. First, there are persistent identifiers such as DOIs, assigned by PID authorities. Second, there are identifiers assigned to resources by the authority of the data source. Third, a dereferenceable URI within the namespace of the GESIS Knowledge Graph has been assigned to every resource and entity in the graph. The table below gives an overview of all identifiers present in the GESIS Knowledge Graph.
Identifier | Description |
---|---|
DOI | Digital Object Identifier |
URN | Uniform Resource Name |
ORCID | Open Researcher and Contributor ID |
ISSN | International Standard Serial Number |
ISBN | International Standard Book Number |
Internal GESIS ID | Internal ID used within GESIS |
GESIS Study Number | Number used for research data archived at GESIS |
GESIS KG URI | Uniform Resource Identifier defined for the GESIS KG, reusing the Internal GESIS ID |
The GESIS KG uses the same IDs for its resources as the IDs in the URLs of the GESIS Search. The URI of a resource in the GESIS KG can therefore be constructed easily if the URL, or just the ID, of that resource in the GESIS Search is known.
Examples
GESIS Search URL of a resource: https://search.gesis.org/research_data/ZA5280
GESIS KG URI of the same resource: https://data.gesis.org/gesiskg/resource/ZA5280
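The mapping shown in the examples above can be sketched in a few lines of Python. This is only an illustration: the function name `search_url_to_kg_uri` is hypothetical, and the sketch assumes that the resource ID is always the last path segment of the GESIS Search URL, as it is in the example.

```python
from urllib.parse import urlparse

# Base of GESIS KG resource URIs, taken from the example above.
KG_RESOURCE_BASE = "https://data.gesis.org/gesiskg/resource/"


def search_url_to_kg_uri(search_url: str) -> str:
    """Build a GESIS KG URI from a GESIS Search URL (hypothetical helper).

    Assumption: the resource ID is the last path segment of the URL.
    """
    path = urlparse(search_url).path.rstrip("/")
    resource_id = path.rsplit("/", 1)[-1]
    if not resource_id:
        raise ValueError(f"No resource ID found in {search_url!r}")
    return KG_RESOURCE_BASE + resource_id


print(search_url_to_kg_uri("https://search.gesis.org/research_data/ZA5280"))
# prints https://data.gesis.org/gesiskg/resource/ZA5280
```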
Total number of RDF triples | 97133374 |
Total number of scientific resources | 1986662 |
Publications | 583085 |
Datasets | 7546 |
Instruments | 532 |
Variables | 1395499 |
Persons | 466265 |
Organizations | 20533 |
Locations | 28314 |
Keywords, Concepts, Topics | 18297 |
Total number of classes | 33 |
Reused classes | 26 |
Newly defined classes | 7 |
Total number of object properties | 34 |
Reused object properties | 19 |
Newly defined object properties | 15 |
Total number of datatype properties | 114 |
Reused datatype properties | 30 |
Newly defined datatype properties | 84 |
Total number of links | 1861964 |
Automatically generated links between publications and datasets | 78733 |
Manually curated links between publications and datasets | 49509 |
Links between publications and datasets | 145382 |
Links between publications and instruments | 5813 |
Links between datasets and instruments | 74 |
Links between datasets and variables | 1393842 |
Links between publications | 313899 |
Links between publications and variables | 2954 |
The GESIS Knowledge Graph is available through various access points: via public APIs, as download, and integrated into the GESIS Search portal.
We provide an OAI-PMH API, available at https://data.gesis.org/gesiskg/oai/, which allows you to access and harvest metadata provided by the GESIS KG in the DataCite and OpenAIRE formats.
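As a rough sketch, harvesting requests for this API could be assembled as follows. The verbs `ListMetadataFormats` and `ListRecords` and the `metadataPrefix` argument come from the OAI-PMH protocol itself, and `oai_dc` is the prefix that the protocol requires every repository to support. The helper name `build_oai_url` is hypothetical, and the sketch assumes that requests are sent directly to the base URL given above. The prefixes for the DataCite and OpenAIRE formats are not stated here, so the first request asks the API to list them.

```python
from urllib.parse import urlencode

# Base URL of the OAI-PMH API as given in the text above.
OAI_BASE_URL = "https://data.gesis.org/gesiskg/oai/"


def build_oai_url(verb: str, **arguments: str) -> str:
    """Assemble an OAI-PMH request URL (hypothetical helper)."""
    params = {"verb": verb, **arguments}
    return OAI_BASE_URL + "?" + urlencode(params)


# Ask the repository which metadata formats (prefixes) it offers.
print(build_oai_url("ListMetadataFormats"))

# Harvest records in Dublin Core, the format every OAI-PMH repository supports.
print(build_oai_url("ListRecords", metadataPrefix="oai_dc"))
```

The resulting URLs can be fetched with any HTTP client; large result lists are paged by OAI-PMH via resumption tokens, which this sketch does not handle.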
You can explore the data within the GESIS Knowledge Graph using SPARQL queries at the following SPARQL endpoint: https://data.gesis.org/gesiskg/sparql
Below you can find some example SPARQL queries.
The following query lists all publications which are included in the GESIS KG (up to a limit of 10000 resources). (Result)
SELECT ?id ?title
WHERE {?id ?p <https://schema.org/ScholarlyArticle>.
?id <https://schema.org/name> ?title.
}
LIMIT 10000
To retrieve resources of a different type, replace <https://schema.org/ScholarlyArticle> with <https://schema.org/Dataset>, <http://rdf-vocabulary.ddialliance.org/lifecycle#Variable> or <http://rdf-vocabulary.ddialliance.org/lifecycle#Instrument>.
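If you prefer to run this listing query from a script, a minimal sketch using only the Python standard library could look like the following. The names `CLASS_IRIS`, `listing_query` and `build_request` are hypothetical. The sketch assumes that the endpoint accepts queries via HTTP GET with a `query` parameter and can return JSON results, as defined by the SPARQL 1.1 Protocol; this is not confirmed here.

```python
from urllib.parse import urlencode
from urllib.request import Request

# SPARQL endpoint as given in the text above.
SPARQL_ENDPOINT = "https://data.gesis.org/gesiskg/sparql"

# Class IRIs of the four main resource types, as listed above.
CLASS_IRIS = {
    "Publication": "https://schema.org/ScholarlyArticle",
    "Dataset": "https://schema.org/Dataset",
    "Variable": "http://rdf-vocabulary.ddialliance.org/lifecycle#Variable",
    "Instrument": "http://rdf-vocabulary.ddialliance.org/lifecycle#Instrument",
}


def listing_query(resource_type: str, limit: int = 10000) -> str:
    """Return the listing query from above for the given resource type."""
    class_iri = CLASS_IRIS[resource_type]
    return (
        "SELECT ?id ?title\n"
        f"WHERE {{?id ?p <{class_iri}>.\n"
        "?id <https://schema.org/name> ?title.\n"
        "}\n"
        f"LIMIT {limit}"
    )


def build_request(query: str) -> Request:
    """Prepare (but do not send) a GET request for the endpoint.

    Assumption: the endpoint follows the SPARQL 1.1 Protocol.
    """
    url = SPARQL_ENDPOINT + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_request(listing_query("Dataset", limit=10))
print(request.full_url)
```

To execute the query, the prepared request can be passed to `urllib.request.urlopen` and the response body decoded with `json.load`.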
The following query lists all information which is available for a particular resource in the GESIS KG. (Result)
SELECT *
WHERE {<https://data.gesis.org/gesiskg/resource/ZA5282> ?p ?o}
The following query lists 100 datasets and the publications that cite them for the topic "Migration" (reflecting the user story of Maxine). (Result)
SELECT ?publication ?publication_title ?dataset ?dataset_title
WHERE {?publication <https://schema.org/about> ?topic.
?topic <https://schema.org/name> "Migration"@en.
?publication ?p <https://schema.org/ScholarlyArticle>.
?publication <https://schema.org/citation> ?dataset.
?dataset ?p <https://schema.org/Dataset>.
?publication <https://schema.org/name> ?publication_title.
?dataset <https://schema.org/name> ?dataset_title.
}
LIMIT 100
This query can easily be adjusted and explored by changing the string and language tag in line 3 from "Migration"@en to, e.g., "Germany"@en, "Gesundheit"@de, or "Politik"@de. Please note that the retrieved results depend on whether resources have been originally indexed with German or English keywords or in both languages.
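The adjustment described above can also be scripted. The following sketch builds the same query for an arbitrary topic label and language tag. The function name `topic_query` is hypothetical, and the simple check on the language tag is an assumption made for illustration rather than a complete validation.

```python
import re


def topic_query(label: str, lang: str = "en", limit: int = 100) -> str:
    """Build the topic query from above for a given label (hypothetical helper)."""
    # Simplified check of the language tag, e.g. "en", "de" or "en-GB".
    if not re.fullmatch(r"[A-Za-z]+(-[A-Za-z0-9]+)*", lang):
        raise ValueError(f"Unexpected language tag: {lang!r}")
    # Escape backslashes and double quotes so the label stays one string literal.
    escaped = label.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "SELECT ?publication ?publication_title ?dataset ?dataset_title\n"
        "WHERE {?publication <https://schema.org/about> ?topic.\n"
        f'?topic <https://schema.org/name> "{escaped}"@{lang}.\n'
        "?publication ?p <https://schema.org/ScholarlyArticle>.\n"
        "?publication <https://schema.org/citation> ?dataset.\n"
        "?dataset ?p <https://schema.org/Dataset>.\n"
        "?publication <https://schema.org/name> ?publication_title.\n"
        "?dataset <https://schema.org/name> ?dataset_title.\n"
        "}\n"
        f"LIMIT {limit}"
    )


print(topic_query("Gesundheit", lang="de"))
```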
The following query retrieves a year-wise count of publications citing datasets focusing on the topic "Migration" (reflecting the user story of Will). (Result)
SELECT ?year (COUNT(?publication) AS ?count)
WHERE {?publication ?p <https://schema.org/ScholarlyArticle>.
?publication <https://schema.org/citation> ?dataset.
?dataset ?p <https://schema.org/Dataset>.
?dataset <https://schema.org/about> ?topic.
?topic <https://schema.org/name> "Migration"@en.
?publication <https://schema.org/datePublished> ?year .
}
GROUP BY ?year
ORDER BY ?year
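For offline analysis, the result of this query can be turned into a simple year-to-count mapping. The sketch below parses a response in the standard SPARQL 1.1 JSON results format. The sample response is invented purely to show the structure and does not contain real counts from the GESIS KG. Whether `datePublished` holds bare years or full dates is not stated here, so the sketch keeps only the first four characters and sums the counts per year; the function name `counts_by_year` is hypothetical.

```python
import json

# Invented sample in the SPARQL 1.1 JSON results format (NOT real GESIS KG data).
SAMPLE_RESPONSE = """
{
  "head": {"vars": ["year", "count"]},
  "results": {"bindings": [
    {"year": {"type": "literal", "value": "2001"},
     "count": {"type": "literal", "value": "3"}},
    {"year": {"type": "literal", "value": "2001-06-15"},
     "count": {"type": "literal", "value": "1"}},
    {"year": {"type": "literal", "value": "2002"},
     "count": {"type": "literal", "value": "5"}}
  ]}
}
"""


def counts_by_year(response_text: str) -> dict:
    """Sum the ?count values per four-digit year (hypothetical helper)."""
    data = json.loads(response_text)
    counts = {}
    for binding in data["results"]["bindings"]:
        # Assumption: the value starts with a four-digit year.
        year = binding["year"]["value"][:4]
        counts[year] = counts.get(year, 0) + int(binding["count"]["value"])
    return counts


print(counts_by_year(SAMPLE_RESPONSE))
```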
You can download the current version of the GESIS Knowledge Graph as a full RDF dump (JSON-LD and Turtle) as well as its underlying ontology at: https://doi.org/10.7802/2878
Older version: v0.1.0-beta: https://doi.org/10.5281/zenodo.14229945
The GESIS Knowledge Graph is integrated in the GESIS Search. Links between scientific resources are included in the result list and detailed views of search results.
The GESIS Knowledge Graph is available for access, download, and reuse under a Creative Commons Attribution 4.0 license since the license of some input sources is CC-BY as well.
If you are using the GESIS Knowledge Graph or parts from it, please cite the GESIS KG as follows:
Biswas, Debanjali, & Zapilko, Benjamin (2025). GESIS Knowledge Graph (GESIS KG). GESIS, Köln. Datenfile Version 1.0.0, https://doi.org/10.7802/2878.
This section documents all changes for each version of the GESIS Knowledge Graph.
v1.0.0 - 12.05.2025
Benjamin Zapilko, GESIS - Leibniz Institute for the Social Sciences (Germany), https://www.gesis.org/
Debanjali Biswas, GESIS - Leibniz Institute for the Social Sciences, https://www.gesis.org/
Daniel Hienert, GESIS - Leibniz Institute for the Social Sciences, https://www.gesis.org/
Dagmar Kern, GESIS - Leibniz Institute for the Social Sciences, https://www.gesis.org/
Benjamin Zapilko, GESIS - Leibniz Institute for the Social Sciences, https://www.gesis.org/
Yudong Zhang, GESIS - Leibniz Institute for the Social Sciences, https://www.gesis.org/
Knowledge Technologies for the Social Sciences: https://www.gesis.org/en/institute/about-us/departments/knowledge-technologies-for-the-social-sciences