Dataset links

Links to some useful dataset resources that I have used or am interested in.

  • GDELT dataset: World events cataloged and classified.
  • Google Dataset Search: Search interface by Google.
  • Quandl: Repository of useful datasets.
  • SNAP Dataset: Useful network datasets.
  • Datasets on GitHub: Compiled list of publicly available datasets on GitHub.
  • 911 Dataset: Crowdsourced dataset about 911.
  • Indiana University Network Data: A set of very large data sets, including some non-network data sets, compiled by the School of Library and Information Science at Indiana University. Network data sets include the NBER data set of US patent citations and a data set of links between articles in the on-line encyclopedia Wikipedia.
  • UCI Network Data: Links to many important network datasets or lists of network datasets.
  • List of datasets: Organized on a GitHub page.
  • Google Public Datasets: Search many publicly available datasets.
  • Common Crawl: Crawled data from the internet.
  • WebDataCommons: Data parsed from Common Crawl.
  • UCI ML: UCI datasets for Machine Learning tasks.
  • Amazon datasets: Large-scale datasets for commercial-level analysis.
  • KDNuggets: Large collection of datasets by KDNuggets.
  • Data is Plural: A Google spreadsheet with links to and descriptions of some amazing datasets.

SAIL 2018 Resources

School network evolution over grades
Documentation about Gephi networks
NetLogo Web