This website presents a browsable version of an analysis of graph measures conducted on RDF datasets that were part of the LOD Cloud 2017 (22nd August 2017). It is a case study for a software framework that can acquire, efficiently prepare, and perform a graph-based analysis on large-scale RDF graphs.
Both collections, i.e., all 280 datasets analyzed and the results for 56 graph measures, are part of a resource published with a paper at ESWC 2019. The paper won the best student paper award.
The results are presented per dataset. To the left you can see the domains introduced by the LOD Cloud. Per dataset, you can download (a) the original metadata package acquired and (b) a serialized binary object that represents the graph structure at the time of analysis. The main benefit of this collection is that each RDF dataset is already prepared. This makes it possible to reproduce the results and to perform further analyses of graph measures on the graphs without any additional preparation.
The framework is available for reuse. The source code is maintained on GitHub.
You can download a CSV file export of all the results from our GitHub repository.
Below are some basic descriptive statistics about all of the analyzed datasets.
Domain | Max. # of Vertices | Max. # of Edges | Avg. # of Vertices | Avg. # of Edges |
---|---|---|---|---|
Cross Domain | 614,448,283 | 2,656,226,986 | 57,827,358 | 218,930,066 |
Geography | 47,541,174 | 340,880,391 | 9,763,721 | 61,049,429 |
Government | 131,634,287 | 1,489,689,235 | 7,491,531 | 71,263,878 |
Life Sciences | 356,837,444 | 722,889,087 | 25,550,646 | 85,262,882 |
Linguistics | 120,683,397 | 291,314,466 | 1,260,455 | 3,347,268 |
Media | 48,318,259 | 161,749,815 | 9,504,622 | 31,100,859 |
Publications | 218,757,266 | 720,668,819 | 9,036,204 | 28,017,502 |
Social Networking | 331,647 | 1,600,499 | 237,003 | 1,062,986 |
User Generated | 2,961,628 | 4,932,352 | 967,798 | 1,992,069 |
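
Per-domain figures like those in the table above could be recomputed from a CSV export of the results. The following is a minimal sketch only: the column names (`domain`, `name`, `n_vertices`, `n_edges`) and the toy rows are assumptions made for illustration, not the actual layout or content of the published file, so adjust them to the header of the file you download.

```python
import csv
import io
from collections import defaultdict

# Toy rows for illustration only; these are NOT real results.
# The column names are assumed and may differ from the published CSV export.
SAMPLE_CSV = """domain,name,n_vertices,n_edges
Media,toy-a,100,300
Media,toy-b,300,500
Geography,toy-c,50,200
"""


def summarize_by_domain(csv_file):
    """Return max and mean vertex/edge counts per domain.

    csv_file is any iterable of CSV lines with a header row, e.g. an open file.
    """
    groups = defaultdict(list)
    for row in csv.DictReader(csv_file):
        groups[row["domain"]].append((int(row["n_vertices"]), int(row["n_edges"])))

    summary = {}
    for domain, sizes in groups.items():
        vertices = [v for v, _ in sizes]
        edges = [e for _, e in sizes]
        summary[domain] = {
            "max_vertices": max(vertices),
            "max_edges": max(edges),
            "avg_vertices": sum(vertices) / len(vertices),
            "avg_edges": sum(edges) / len(edges),
        }
    return summary


if __name__ == "__main__":
    result = summarize_by_domain(io.StringIO(SAMPLE_CSV))
    for domain, stats in sorted(result.items()):
        print(domain, stats)
```

To run this against a downloaded file, pass an open file handle instead of the in-memory sample, for example `summarize_by_domain(open(path, newline=""))`, where `path` is wherever you saved the export.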