  • * at the end of a keyword allows wildcard searches
  • " quotes can be used for searching phrases
  • + represents an AND search (default)
  • | represents an OR search
  • - represents a NOT operation
  • ( and ) imply priority
  • ~N after a word specifies the desired edit distance (fuzziness)
  • ~N after a phrase specifies the desired slop amount
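The operators listed above can be combined into a single query string. The following is a minimal sketch, assuming the search box accepts the operators exactly as listed; the helper names (`phrase`, `wildcard`, `fuzzy`, `sloppy`, `any_of`, `build_query`) and the spacing around operators are illustrative assumptions, not part of the site.

```python
# Hypothetical helpers that assemble query strings from the operators listed above.
# The function names and the whitespace conventions are assumptions for illustration.

def phrase(text):
    """Quote text so it is searched as a phrase."""
    return f'"{text}"'

def wildcard(prefix):
    """Append * to allow a wildcard search on the end of a keyword."""
    return f"{prefix}*"

def fuzzy(word, distance):
    """Append ~N to a word to set the desired edit distance (fuzziness)."""
    return f"{word}~{distance}"

def sloppy(text, slop):
    """Append ~N to a quoted phrase to set the desired slop amount."""
    return f'"{text}"~{slop}'

def any_of(*terms):
    """Join terms with | (OR) and group them with parentheses for priority."""
    return "(" + " | ".join(terms) + ")"

def build_query(required, excluded=()):
    """Join required terms with + (AND) and prefix excluded terms with - (NOT)."""
    parts = [" + ".join(required)]
    parts += [f"-{term}" for term in excluded]
    return " ".join(part for part in parts if part)

# Example: the phrase "gene expression" AND (genom* OR a fuzzy match on
# "proteomics"), excluding results that mention "clinical".
example_query = build_query(
    [phrase("gene expression"), any_of(wildcard("genom"), fuzzy("proteomics", 1))],
    excluded=["clinical"],
)
print(example_query)
```

Because + is the default, the explicit AND could likely be dropped; it is written out here only to make the grouping visible.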
Found 241 result(s)
Kaggle is a platform for predictive modelling and analytics competitions in which statisticians and data miners compete to produce the best models for predicting and describing the datasets uploaded by companies and users. This crowdsourcing approach relies on the fact that there are countless strategies that can be applied to any predictive modelling task and it is impossible to know beforehand which technique or analyst will be most effective.
The Polinsky Language Sciences Lab at Harvard University is a linguistics lab that examines questions of language structure and its effect on the ways in which people use and process language in real time. We engage in linguistic and interdisciplinary research projects ourselves; offer linguistic research capabilities for undergraduate and graduate students, faculty, and visitors; and build relationships with the linguistic communities in which we do our research. We are interested in a broad range of issues pertaining to syntax, interfaces, and cross-linguistic variation. We place a particular emphasis on novel experimental evidence that facilitates the construction of linguistic theory. We have a strong cross-linguistic focus, drawing upon English, Russian, Chinese, Korean, Mayan languages, Basque, Austronesian languages, languages of the Caucasus, and others. We believe that challenging existing theories with data from as broad a range of languages as possible is a crucial component of the successful development of linguistic theory. We investigate both fluent speakers and heritage speakers—those who grew up hearing or speaking a particular language but who are now more fluent in a different, societally dominant language. Heritage languages, a novel field of linguistic inquiry, are important because they provide new insights into processes of linguistic development and attrition in general, thus increasing our understanding of the human capacity to maintain and acquire language. Understanding language use and processing in real time and how children acquire language helps us improve language study and pedagogy, which in turn improves communication across the globe. Although our lab does not specialize in language acquisition, we have conducted some studies of acquisition of lesser-studied languages and heritage languages, with the purpose of comparing heritage speakers to adults.
The Henry A. Murray Research Archive is Harvard's endowed, permanent repository for quantitative and qualitative research data at the Institute for Quantitative Social Science, and provides physical storage for the entire IQSS Dataverse Network. Our collection comprises over 100 terabytes of data, audio, and video. We preserve in perpetuity all types of data of interest to the research community, including numerical, video, audio, interview notes, and other data. We accept data deposits through this web site, which is powered by our Dataverse Network software.
The National Archives and Records Administration (NARA) is the nation's record keeper. Of all documents and materials created in the course of business conducted by the United States Federal government, only 1%-3% are so important for legal or historical reasons that they are kept by us forever. Those valuable records are preserved and are available to you, whether you want to see if they contain clues about your family’s history, need to prove a veteran’s military service, or are researching an historical topic that interests you.
As one of the cornerstones of the U.S. Geological Survey's (USGS) National Geospatial Program, The National Map is a collaborative effort among the USGS and other Federal, State, and local partners to improve and deliver topographic information for the Nation. It has many uses ranging from recreation to scientific analysis to emergency response. The National Map is easily accessible for display on the Web, as products and services, and as downloadable data. The geographic information available from The National Map includes orthoimagery (aerial photographs), elevation, geographic names, hydrography, boundaries, transportation, structures, and land cover. Other types of geographic information can be added within the viewer or brought in with The National Map data into a Geographic Information System to create specific types of maps or map views.
VertNet is an NSF-funded collaborative project that makes biodiversity data free and available on the web. VertNet is a tool designed to help people discover, capture, and publish biodiversity data. It is also the core of a collaboration between hundreds of biocollections that contribute biodiversity data and work together to improve it. VertNet is an engine for training current and future professionals to use and build upon best practices in data quality, curation, research, and data publishing. Yet, VertNet is still the aggregate of all of the information that it mobilizes. To us, VertNet is all of these things and more.
Gemma is a database for the meta-analysis, re-use and sharing of genomics data, currently primarily targeted at the analysis of gene expression profiles. Gemma contains data from thousands of public studies, referencing thousands of published papers. Users can search, access and visualize co-expression and differential expression results.
The NCBI database of Genotypes and Phenotypes (dbGaP) archives and distributes the results of studies that have investigated the interaction of genotype and phenotype, including genome-wide association studies, medical sequencing, molecular diagnostic assays, and association between genotype and non-clinical traits. The database provides summaries of studies, the contents of measured variables, and original study document text. dbGaP provides two types of access for users, open and controlled. Through the controlled access, users may access individual-level data such as phenotypic data tables and genotypes.
The New York Brain Bank (NYBB) at Columbia University was established to collect postmortem human brains to meet the needs of neuroscientists investigating specific psychiatric and neurological disorders.
LAADS DAAC is the web interface to the Level 1 and Atmosphere Archive and Distribution System (LAADS). The mission of LAADS is to provide quick and easy access to MODIS Level 1, Atmosphere and Land data products, VIIRS Level 1 and Land data products, and MAS and MERIS data products. MODIS (or Moderate Resolution Imaging Spectroradiometer) is a key instrument aboard the Terra (EOS AM) and Aqua (EOS PM) satellites.
SuperDARN is an international HF radar network designed to measure global-scale magnetospheric convection by observing plasma motion in the Earth’s upper atmosphere. This network consists of more than 20 radars operating on frequencies between 8 and 20 MHz that look into the polar regions of Earth. These radars can measure the position and velocity of charged particles in our ionosphere, the highest layer of the Earth's atmosphere, and provide scientists with information regarding Earth's interaction with the space environment.
Greenland Environmental Observatory (GEOSummit) provides long-term, year-round data on core atmospheric measurements, spatial phenomena, ice sheets, and the Arctic environment. These data are available to researchers through the National Science Foundation's Science Coordination Office (SCO), which coordinates all research at GEOSummit. Currently there is no central platform for multi-collaborator data distribution. For specific information related to research, it is recommended to contact investigators directly.
Geochron is a global database that hosts geochronologic and thermochronologic information from detrital minerals. Information included with each sample consists of a table with the essential isotopic information and ages, a table with basic geologic metadata (e.g., location, collector, publication, etc.), a Pb/U Concordia diagram, and a relative age probability diagram. This information can be accessed and viewed with any web browser, and depending on the level of access desired, can be designated as either private or public. Loading information into Geochron requires the use of U-Pb_Redux, a Java-based program that also provides enhanced capabilities for data reduction, plotting, and analysis. Instructions are provided for three different levels of interaction with Geochron: 1. Accessing samples that are already in the Geochron database. 2. Preparation of information for new samples, and then transfer to Arizona LaserChron Center personnel for uploading to Geochron. 3. Preparation of information and uploading to Geochron using U-Pb_Redux.
Brainlife promotes engagement and education in reproducible neuroscience. We do this by providing an online platform where users can publish code (Apps) and data, and make them "alive" by integrating various HPC and cloud computing resources to run those Apps. Brainlife also provides mechanisms to publish all research assets associated with a scientific project (data and analyses) embedded in a cloud computing environment and referenced by a single digital-object-identifier (DOI). The platform is unique because of its focus on supporting scientific reproducibility beyond open code and open data, by providing fundamental smart mechanisms for what we refer to as “Open Services.”
The Fragile Families and Child Wellbeing Study changed its name to The Future of Families and Child Wellbeing Study (FFCWS). Note that all documentation issued prior to January 2023 contains the study’s former name. Any further reference to FFCWS should kindly observe this name change. The Fragile Families & Child Wellbeing Study is following a cohort of nearly 5,000 children born in large U.S. cities between 1998 and 2000 (roughly three-quarters of whom were born to unmarried parents). We refer to unmarried parents and their children as “fragile families” to underscore that they are families and that they are at greater risk of breaking up and living in poverty than more traditional families. The core Study was originally designed to primarily address four questions of great interest to researchers and policy makers: (1) What are the conditions and capabilities of unmarried parents, especially fathers?; (2) What is the nature of the relationships between unmarried parents?; (3) How do children born into these families fare?; and (4) How do policies and environmental conditions affect families and children?
The twin GRACE satellites were launched on March 17, 2002. Since that time, the GRACE Science Data System (SDS) has produced and distributed estimates of the Earth gravity field on an ongoing basis. These estimates, in conjunction with other data and models, have provided observations of terrestrial water storage changes, ice-mass variations, ocean bottom pressure changes and sea-level variations. This portal, together with PODAAC, is responsible for the distribution of the data and documentation for the GRACE project.
The Growing Up Today Study is a collaborative study between clinicians, researchers, and thousands of participants across the US and beyond. The aim of this study is to gain a deeper understanding of the factors that affect health throughout life. Together we are working to build one of the most powerful resources for fighting cancer, obesity, heart disease, depression, and so much more.
The Biodiversity Research Program (PPBio) was created in 2004 with the aims of furthering biodiversity studies in Brazil, decentralizing scientific production from already-developed academic centers, integrating research activities and disseminating results across a variety of purposes, including environmental management and education. PPBio contributes its data to the DataONE network as a member node: https://search.dataone.org/#profile/PPBIO
The Earth System Grid Federation (ESGF) is an international collaboration with a current focus on serving the World Climate Research Programme's (WCRP) Coupled Model Intercomparison Project (CMIP) and supporting climate and environmental science in general. Data is searchable and available for download at the Federated ESGF-CoG Nodes https://esgf.llnl.gov/nodes.html
>>>>!!!!<<<< The Cancer Genomics Hub mission is now completed. The Cancer Genomics Hub was established in August 2011 to provide a repository to The Cancer Genome Atlas, the childhood cancer initiative Therapeutically Applicable Research to Generate Effective Treatments and the Cancer Genome Characterization Initiative. CGHub rapidly grew to be the largest database of cancer genomes in the world, storing more than 2.5 petabytes of data and serving downloads of nearly 3 petabytes per month. As the central repository for the foundational genome files, CGHub streamlined team science efforts as data became as easy to obtain as downloading from a hard drive. The convenient access to Big Data, and the collaborations that CGHub made possible, are now essential to cancer research. That work continues at the NCI's Genomic Data Commons. All files previously stored at CGHub can be found there. The Website for the Genomic Data Commons is here: https://gdc.nci.nih.gov/ >>>>!!!!<<<< The Cancer Genomics Hub (CGHub) is a secure repository for storing, cataloging, and accessing cancer genome sequences, alignments, and mutation information from the Cancer Genome Atlas (TCGA) consortium and related projects. Access to CGHub Data: All researchers using CGHub must meet the access and use criteria established by the National Institutes of Health (NIH) to ensure the privacy, security, and integrity of participant data. CGHub also hosts some publicly available data, in particular data from the Cancer Cell Line Encyclopedia. All metadata is publicly available and the catalog of metadata and associated BAMs can be explored using the CGHub Data Browser.
A premier source for United States cancer statistics, SEER gathers information related to incidence, prevalence, and survival from specific geographic areas that represent 28 percent of the population, and compiles related reports as well as reports on national cancer mortality rates. Its aim is to provide information related to cancer statistics and decrease the burden of cancer in the national population. SEER has been collecting data from cancer cases since 1973.
The CDHA helps researchers create, document, and distribute public use microdata on health and aging for secondary analysis. Major research themes include: midlife development and aging; economics of population aging; inequalities in health and aging; international comparative studies of health and aging; and the investigation of linkages between social-demographic and biomedical research in population aging. The CDHA is one of fourteen demography centers on aging sponsored by the National Institute on Aging.
The Human Protein Reference Database (HPRD) has been established by a team of biologists, bioinformaticists and software engineers. This is a joint project between the PandeyLab at Johns Hopkins University and the Institute of Bioinformatics, Bangalore. HPRD is a definitive repository of human proteins. This database should serve as a ready reckoner for researchers in their quest for drug discovery and identification of disease markers, and should promote biomedical research in general. Human Proteinpedia (www.humanproteinpedia.org) is its associated data portal.
CDC.gov is the Centers for Disease Control and Prevention's primary online communication channel. CDC.gov provides users with credible, reliable health information on Data and Statistics, Diseases and Conditions, Emergencies and Disasters, Environmental Health, Healthy Living, Injury, Violence and Safety, Life Stages and Populations, Travelers' Health, and Workplace Safety and Health.