Filter
Reset all

Subjects

Content Types

Countries

AID systems

API

Certificates

Data access

Data access restrictions

Database access

Database licenses

Data licenses

Data upload

Data upload restrictions

Enhanced publication

Institution responsibility type

Institution type

Keywords

Metadata standards

PID systems

Provider types

Quality management

Repository languages

Software

Syndications

Repository types

Versioning

  • * at the end of a keyword allows wildcard searches
  • " quotes can be used for searching phrases
  • + represents an AND search (default)
  • | represents an OR search
  • - represents a NOT operation
  • ( and ) implies priority
  • ~N after a word specifies the desired edit distance (fuzziness)
  • ~N after a phrase specifies the desired slop amount
Found 28 result(s)
The Eurac Research CLARIN Centre (ERCC) is a dedicated repository for language data. It is hosted by the Institute for Applied Linguistics (IAL) at Eurac Research, a private research centre based in Bolzano, South Tyrol. The Centre is part of the Europe-wide CLARIN infrastructure, which means that it follows well-defined international standards for (meta)data and procedures and is well-embedded in the wider European Linguistics infrastructure. The repository hosts data collected at the IAL, but is also open for data deposits from external collaborators.
CLARIN is a European Research Infrastructure for the Humanities and Social Sciences, focusing on language resources (data and tools). It is being implemented and constantly improved at leading institutions in a large and growing number of European countries, aiming at improving Europe's multi-linguality competence. CLARIN provides several services, such as access to language data and tools to analyze data, and offers to deposit research data, as well as direct access to knowledge about relevant topics in relation to (research on and with) language resources. The main tool is the 'Virtual Language Observatory' providing metadata and access to the different national CLARIN centers and their data.
Cocoon "COllections de COrpus Oraux Numériques" is a technical platform that accompanies the oral resource producers, create, organize and archive their corpus; a corpus can consist of records (usually audio) possibly accompanied by annotations of these records. The resources registered are first cataloged and stored while, and then, secondly archived in the archive of the TGIR Huma-Num. The author and his institution are responsible for filings and may benefit from a restricted and secure access to their data for a defined period, if the content of the information is considered sensitive. The COCOON platform is jointly operated by two joint research units: Laboratoire de Langues et civilisations à tradition orale (LACITO - UMR7107 - Université Paris3 / INALCO / CNRS) and Laboratoire Ligérien de Linguistique (LLL - UMR7270 - Universités d'Orléans et de Tours, BnF, CNRS).
Lithuania became a full member of CLARIN ERIC in January of 2015 and soon CLARIN-LT consortium was founded by three partner universities: Vytautas Magnus University, Kaunas Technology University and Vilnius University. The main goal of the consortium is to become a CLARIN B centre, which will be able to serve language users in Lithuania and Europe for storing and accessing language resources.
The ESO/ST-ECF science archive is a joint collaboration of the European Organisation for Astronomical Research in the Southern Hemisphere (ESO) and the Space Telescope - European Coordinating Facility (ST-ECF). ESO observational data can be requested after the proprietary period by the astronomical community.
The EUDAT project aims to contribute to the production of a Collaborative Data Infrastructure (CDI). The project´s target is to provide a pan-European solution to the challenge of data proliferation in Europe's scientific and research communities. The EUDAT vision is to support a Collaborative Data Infrastructure which will allow researchers to share data within and between communities and enable them to carry out their research effectively. EUDAT aims to provide a solution that will be affordable, trustworthy, robust, persistent and easy to use. EUDAT comprises 26 European partners, including data centres, technology providers, research communities and funding agencies from 13 countries. B2FIND is the EUDAT metadata service allowing users to discover what kind of data is stored through the B2SAFE and B2SHARE services which collect a large number of datasets from various disciplines. EUDAT will also harvest metadata from communities that have stable metadata providers to create a comprehensive joint catalogue to help researchers find interesting data objects and collections.
ANPERSANA is the digital library of IKER (UMR 5478), a research centre specialized in Basque language and texts. The online library platform receives and disseminates primary sources of data issued from research in Basque language and culture. As of today, two corpora of documents have been published. The first one, is a collection of private letters written in an 18th century variety of Basque, documented in and transcribed to modern standard Basque. The discovery of the collection, named Le Dauphin, has enabled the emerging of new questions about the history and sociology of writing in the domain of minority languages, not only in France, but also among the whole Atlantic Arc. The second of the two corpora is a selection of sound recordings about monodic chant in the Basque Country. The documents were collected as part of a PhD thesis research work that took place between 2003 and 2012. It's a total of 50 hours of interviews with francophone and bascophone cultural representatives carried out at either their workplace of the informers or in public areas. ANPERSANA is bundled with an advanced search engine. The documents have been indexed and geo-localized on an interactive map. The platform is engaged with open access and all the resources can be uploaded freely under the different Creative Commons (CC) licenses.
As part of the Copernicus Space Component programme, ESA manages the coordinated access to the data procured from the various Contributing Missions and the Sentinels, in response to the Copernicus users requirements. The Data Access Portfolio documents the data offer and the access rights per user category. The CSCDA portal is the access point to all data, including Sentinel missions, for Copernicus Core Users as defined in the EU Copernicus Programme Regulation (e.g. Copernicus Services).The Copernicus Space Component (CSC) Data Access system is the interface for accessing the Earth Observation products from the Copernicus Space Component. The system overall space capacity relies on several EO missions contributing to Copernicus, and it is continuously evolving, with new missions becoming available along time and others ending and/or being replaced.
The European Environment Agency (EEA) is an agency of the European Union. Our task is to provide sound, independent information on the environment. We are a major information source for those involved in developing, adopting, implementing and evaluating environmental policy, and also the general public. Currently, the EEA has 33 member countries. EEA's mandate is: To help the Community and member countries make informed decisions about improving the environment, integrating environmental considerations into economic policies and moving towards sustainability To coordinate the European environment information and observation network (Eionet)
The Language Bank features text and speech corpora with different kinds of annotations in over 60 languages. There is also a selection of tools for working with them, from linguistic analyzers to programming environments. Corpora are also available via web interfaces, and users can be allowed to download some of them. The IP holders can monitor the use of their resources and view user statistics.
EMSC collects real time parametric data (source parmaters and phase pickings) provided by 65 seismological networks of the Euro-Med region. These data are provided to the EMSC either by email or via QWIDS (Quake Watch Information Distribution System, developped by ISTI). The collected data are automatically archived in a database, made available via an autoDRM, and displayed on the web site. The collected data are automatically merged to produce automatic locations which are sent to several seismological institutes in order to perform quick moment tensors determination.
ArrayExpress is one of the major international repositories for high-throughput functional genomics data from both microarray and high-throughput sequencing studies, many of which are supported by peer-reviewed publications. Data sets are submitted directly to ArrayExpress and curated by a team of specialist biological curators. In the past (until 2018) datasets from the NCBI Gene Expression Omnibus database were imported on a weekly basis. Data is collected to MIAME and MINSEQE standards.
Cryo electron microscopy enables the determination of 3D structures of macromolecular complexes and cells from 2 to 100 Å resolution. EMDataResource is the unified global portal for one-stop deposition and retrieval of 3DEM density maps, atomic models and associated metadata, and is a joint effort among investigators of the Stanford/SLAC CryoEM Facility and the Research Collaboratory for Structural Bioinformatics (RCSB) at Rutgers, in collaboration with the EMDB team at the European Bioinformatics Institute. EMDataResource also serves as a resource for news, events, software tools, data standards, and validation methods for the 3DEM community. The major goal of the EMDataResource project in the current funding period is to work with the 3DEM community to (1) establish data-validation methods that can be used in the process of structure determination, (2) define the key indicators of a well-determined structure that should accompany every deposition, and (3) implement appropriate validation procedures for maps and map-derived models into a 3DEM validation pipeline.
PDBe is the European resource for the collection, organisation and dissemination of data on biological macromolecular structures. In collaboration with the other worldwide Protein Data Bank (wwPDB) partners - the Research Collaboratory for Structural Bioinformatics (RCSB) and BioMagResBank (BMRB) in the USA and the Protein Data Bank of Japan (PDBj) - we work to collate, maintain and provide access to the global repository of macromolecular structure data. We develop tools, services and resources to make structure-related data more accessible to the biomedical community.
LINDAT/CLARIN is designed as a Czech “node” of Clarin ERIC (Common Language Resources and Technology Infrastructure). It also supports the goals of the META-NET language technology network. Both networks aim at collection, annotation, development and free sharing of language data and basic technologies between institutions and individuals both in science and in all types of research. The Clarin ERIC infrastructural project is more focused on humanities, while META-NET aims at the development of language technologies and applications. The data stored in the repository are already being used in scientific publications in the Czech Republic. In 2019 LINDAT/CLARIAH-CZ was established as a unification of two research infrastructures, LINDAT/CLARIN and DARIAH-CZ.
IMGT/GENE-DB is the IMGT genome database for IG and TR genes from human, mouse and other vertebrates. IMGT/GENE-DB provides a full characterization of the genes and of their alleles: IMGT gene name and definition, chromosomal localization, number of alleles, and for each allele, the IMGT allele functionality, and the IMGT reference sequences and other sequences from the literature. IMGT/GENE-DB allele reference sequences are available in FASTA format (nucleotide and amino acid sequences with IMGT gaps according to the IMGT unique numbering, or without gaps).
IMGT/mAb-DB provides a unique expertised resource on monoclonal antibodies (mAbs) with diagnostic or therapeutic indications, fusion proteins for immune applications (FPIA), composite proteins for clinical applications (CPCA) and relative proteins of the immune system (RPI) with clinical indications.
CLARINO Bergen Center repository is the repository of CLARINO, the Norwegian infrastructure project . Its goal is to implement the Norwegian part of CLARIN. The ultimate aim is to make existing and future language resources easily accessible for researchers and to bring eScience to humanities disciplines. The repository includes INESS the Norwegian Infrastructure for the Exploration of Syntax and Semantics. This infrastructure provides access to treebanks, which are databases of syntactically and semantically annotated sentences.
The 1000 Genomes Project is an international collaboration to produce an extensive public catalog of human genetic variation, including SNPs and structural variants, and their haplotype contexts. This resource will support genome-wide association studies and other medical research studies. The genomes of about 2500 unidentified people from about 25 populations around the world will be sequenced using next-generation sequencing technologies. The results of the study will be freely and publicly accessible to researchers worldwide. The International Genome Sample Resource (IGSR) has been established at EMBL-EBI to continue supporting data generated by the 1000 Genomes Project, supplemented with new data and new analysis.
CLARIN.SI is the Slovenian node of the European CLARIN (Common Language Resources and Technology Infrastructure) Centers. The CLARIN.SI repository is hosted at the Jožef Stefan Institute and offers long-term preservation of deposited linguistic resources, along with their descriptive metadata. The integration of the repository with the CLARIN infrastructure gives the deposited resources wide exposure, so that they can be known, used and further developed beyond the lifetime of the projects in which they were produced. Among the resources currently available in the CLARIN.SI repository are the multilingual MULTEXT-East resources, the CC version of Slovenian reference corpus Gigafida, the morphological lexicon Sloleks, the IMP corpora and lexicons of historical Slovenian, as well as many other resources for a variety of languages. Furthermore, several REST-based web services are provided for different corpus-linguistic and NLP tasks.
>>>!!!<<< stated 13.02.2020: the repository is offline >>>!!!<<< Data.DURAARK provides a unique collection of real world datasets from the architectural profession. The repository is unique, as it provides several different datatypes, such as 3d scans, 3d models and classifying Metadata and Geodata, to real world physical buildings.domain. Many of the datasets stem from architectural stakeholders and provide the community in this way with insights into the range of working methods, which the practice employs on large and complex building data.
The focus of PolMine is on texts published by public institutions in Germany. Corpora of parliamentary protocols are at the heart of the project: Parliamentary proceedings are available for long stretches of time, cover a broad set of public policies and are in the public domain, making them a valuable text resource for political science. The project develops repositories of textual data in a sustainable fashion to suit the research needs of political science. Concerning data, the focus is on converting text issued by public institutions into a sustainable digital format (TEI/XML).