St@tServ Links to Dataset Libraries


  • David Rosen's data sources, with over a dozen pointers to other data sets.
  • Delve -- a site with Data for Evaluating Learning in Valid Experiments
  • Elena datasets: 3 artificial databases ('Gaussian', 'Clouds' and 'Concentric') and 4 real databases ('Satimage', 'Texture', 'Iris' and 'Phoneme'). Here is additional documentation.
  • The Financial Data Finder at OSU, a large catalog of financial data sets
  • Information Exploration Shootout and Network Intrusion Dataset
  • National Space Science Data Center (NSSDC) WWW homepage, with a significant amount of information about all of NASA's data sets from planetary exploration, space and solar physics, life sciences, and astrophysics, including many links to other sites.
  • Neural Networks Benchmarking homepage. The homepage of the very successful NIPS*95 workshop.
  • SGI Adult Datasets: Census-Income (101MB) and Census-Year (47MB), based on two years of real US census data. The files are in the standard UCI/C4.5 format with some documentation on the attributes.
  • STATLOG project datasets. This project did comparative studies of different machine learning, neural and statistical classification algorithms. About 20 different algorithms were evaluated on more than 20 different datasets.
  • Synthetic Classification Data Sets (SCDS) program, developed by Gabor Melli to generate synthetic data sets that are particularly useful for testing Knowledge Discovery in Databases (KDD) algorithms.
  • UCLA Statistics textbook data sets, from Jan de Leeuw's textbook
  • United States Census Bureau
  • Handbook of Small Data Sets by D. J. Hand et al. (1994)
    Abstract and Data
  • Data from the book Data by Andrews and Herzberg
  • DASL: The Data and Story Library
    (from Statlib, Carnegie Mellon University;
    contains numerous datasets and their "stories", which describe the problem, the statistical methods applied, and the results)
  • Data Sets from Bavarian Joint Research Public Health
  • Dr. B's Wide World of Web Data
    (from Arizona State University)
  • JASA Data Sets
    (Journal of the American Statistical Association)
  • Statlib Data Sets
    (Carnegie Mellon University, with about 50 well-documented data sets, e.g. Longley and Cardata)
  • Laplace
    (University of California, Los Angeles,
    with the data sets from Cox and Snell, Andrews and Herzberg, Hand et al.)
  • Data ZOO
    (Center for Coastal Studies at the University of California, San Diego)
  • JSE Datasets
    (Journal of Statistics Education,
    with about 20 well-documented data sets)
  • JBES Datasets
    (Journal of Business & Economic Statistics)
  • Datasets at the University of Southern California
  • The Panel Study of Income Dynamics



    Copyright St@tServ 1997 - 1999, All rights reserved