Thursday, January 12, 2017

API IMPORT in Zenodo, Zenodo Github. Research data repository and open access archives. ORCID and DataCite Metadata


Zenodo is a research data repository. It was created by OpenAIRE and CERN to provide a place for researchers to deposit datasets.
https://home.cern/about/updates/2013/05/cern-and-openaireplus-launch-european-research-repository

some examples

an example of a .zip containing many PDFs

https://zenodo.org/record/168580#.WIeylGrNzdQ

The record page has 7 blocks:

  1. Title
  2. Author
  3. Abstract
  4. Acknowledgments
  5. Frame with PDF or zip...
  6. Files
  7. References




DOI





many export solutions

  1. BibTeX Export
  2. Citation Style Language JSON Export
  3. DataCite XML Export
  4. Dublin Core Export
  5. JSON Export
  6. MARC21 XML Export
  7. a link to Mendeley:
    https://www.mendeley.com/sign/in/?acw=&utt=


If you select JSON, for example, the record is displayed as JSON directly in the browser window.


another example, communities COAR

Publications and outputs from or related to the Confederation of Open Access Repositories (COAR). Topics on open access repositories, interoperability, usage data, vocabularies, training, licenses and more.
https://zenodo.org/communities/coar

another example, an article

https://zenodo.org/communities/2249-0205/?page=1&size=20
And a Google search ranks it in 2nd position.

a web service

Zenodo, a CERN service, is an open dependable home for the long-tail of science, enabling researchers to share and preserve any research outputs in any size, any format and from any science.

DOI

Zenodo assigns all publicly available uploads a Digital Object Identifier (DOI) to make the upload easily and uniquely citeable. Zenodo further supports harvesting of all content via the OAI-PMH protocol.
Withdrawal of data and revocation of DOIs:
Content not considered to fall under the scope of the repository will be removed and associated DOIs issued by Zenodo revoked. Please signal promptly, ideally no later than 24 hours from upload, any suspected policy violation. Alternatively, content found to already have an external DOI will have the Zenodo DOI invalidated and the record updated to indicate the original external DOI. User access may be revoked on violation of Terms of Use.

Zenodo DOIs are registered with DataCite, not Crossref, so you cannot use Crossref's services with them.
http://stephane-mottin.blogspot.fr/2017/02/tous-les-doi-noffrent-pas-des-services.html

log in

You can log in with:
  • ORCID iD and ORCID password
  • GitHub username and password
  • email and password

Upload

What can I upload?

All research outputs from all fields of science are welcome. In the upload form you can choose between types of files: publications (book, book section, conference paper, journal article, patent, preprint, report, thesis, technical note, working paper, etc.), posters, presentations, datasets, images (figures, plots, drawings, diagrams, photos), software, videos/audio and interactive materials such as lessons. We do check every piece of content being uploaded to ensure it is research related.

The "description" field has a rich text editor, but you cannot even copy/paste HTML into it, for example from a PLOS article. You cannot even insert a link.
You can enter an equation in TeX, for example in this form (between {}):
x = {-b \pm \sqrt{b^2-4ac} \over 2a}

community

Zenodo allows you to create your own collection and accept or reject uploads submitted to it. Creating a space for your next workshop or project has never been easier. Plus, everything is citeable and discoverable!

Want your own community?
It's easy. Just click the button to get started.
  • Curate — accept/reject what goes in your community collection.
  • Export — your community collection is automatically exported via OAI-PMH
  • Upload — get custom upload link to send to people
We currently accept up to 50GB per dataset (you can have multiple datasets); there is no size limit on communities.
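
Since each community collection is exported via OAI-PMH, a harvester only needs a request URL. Here is a minimal sketch of how such a URL could be built. The base URL `https://zenodo.org/oai2d` and the `user-` prefix for community set names are assumptions to check against the current Zenodo documentation; `community_harvest_url` is a hypothetical helper, and nothing is sent over the network.

```python
from urllib.parse import urlencode

# Assumption: Zenodo's OAI-PMH base URL (verify in the current docs).
OAI_BASE_URL = "https://zenodo.org/oai2d"


def community_harvest_url(community, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords URL for one community collection."""
    params = {
        "verb": "ListRecords",              # standard OAI-PMH verb
        "metadataPrefix": metadata_prefix,  # oai_dc is required by OAI-PMH
        "set": "user-" + community,         # assumed set naming scheme
    }
    return OAI_BASE_URL + "?" + urlencode(params)


print(community_harvest_url("coar"))
```

The resulting URL can be opened in a browser or passed to any OAI-PMH harvester.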

Metadata types and sources

All metadata is stored internally in MARC according to the schema defined in http://inveniosoftware.org/wiki/Project/OpenAIREplus/DevelopmentRecordMarkup.
Metadata is exported in several standard formats such as MARCXML, Dublin Core, and DataCite Metadata Schema according to OpenAIRE Guidelines.


Open source

Powered by Invenio
Zenodo is a small layer on top of Invenio http://github.com/inveniosoftware/invenio, a free software suite enabling you to run your own digital library or document repository on the web.

code:
https://github.com/zenodo/zenodo


GitHub

Zenodo has integration with GitHub to make code hosted in GitHub citable.
  • Select the repository you want to preserve, and toggle the switch below to turn on automatic preservation of your software.
  • Go to GitHub and create a release. Zenodo will automatically download a .zip-ball of each new release and register a DOI.
  • After your first release, a DOI badge that you can include in GitHub README will appear next to your repository below.

https://zenodo.org/account/settings/github/

---

Ref.

https://en.wikipedia.org/wiki/Zenodo
https://en.wikipedia.org/wiki/Category:Open-access_archives

IMPORT in zenodo

resources

Invenio

Invenio is a free software suite enabling you to run your own digital library or document repository on the web. The technology offered by the software covers all aspects of digital library management, from document ingestion through classification, indexing, and curation up to document dissemination. Invenio complies with standards such as the Open Archives Initiative and uses MARC 21 as its underlying bibliographic format. The flexibility and performance of Invenio make it a comprehensive solution for management of document repositories of moderate to large sizes.

Invenio was originally developed at CERN to run the CERN document server, which has managed over 1,000,000 bibliographic records in high-energy physics since 2002, covering articles, books, journals, photos, videos, and more. Invenio is now co-developed by an international collaboration comprising institutes such as CERN, DESY, EPFL, FNAL and SLAC, and is used by many more scientific institutions worldwide.

zenodo interface

For a manual upload, there are 11 categories of fields:

  1. Upload type
    1. Book section
    2. ... Journal article, etc.
  2. Basic info
    1. Date
    2. Title
    3. Authors (entered one by one!)
    4. Description (text and math formulas only, no links!)
    5. Keywords
    6. Additional notes, for example a table of contents
  3. License
    1. Open
    2. CC 4.0; you must specify its category
  4. Communities
    1. integrations (for example)
  5. Funding
    1. CNRS (for example)
  6. Related/alternate identifiers
    1. ISSN, ISBN, URL
  7. Contributors, for example the director of the collection
  8. References
  9. Journal
  10. c
  11. Book
    1. Publisher
    2. Place
    3. ISBN
    4. Book title
    5. Pages (of this book)

zenodo API

The process

an example:
Similar to figshare, Zenodo can store your data and give you a DOI to make it citable.
We have started to deposit all Brain Catalogue’s data at Zenodo, and soon you should be able to cite your favourite brains in your works.
Initially, we uploaded the data manually, but that became tedious very soon. Luckily, Zenodo has a very simple to use and well documented API. In just 3 lines of code using curl you can easily deposit a data file and make it citable (Full information is available at https://zenodo.org/dev).

Before starting anything you need to obtain a token, which is a random alphanumeric string that identifies your queries. You only need to do this once. With your token safely stored (I keep it in the $token variable), data uploading takes just 3 steps:

1. Create a new deposit and obtain a deposit ID:

curl -i -H "Content-Type: application/json" -X POST --data '{"metadata":{"access_right": "open","creators": [{"affiliation": "Brain Catalogue", "name": "Toro, Roberto"}],"description": "Brain MRI","keywords": ["MRI", "Brain"],"license": "cc-by-nc-4.0", "title": "Brain MRI", "upload_type": "dataset"}}' "https://zenodo.org/api/deposit/depositions/?access_token=$token" | tee zenodo.json

Zenodo responds with a JSON document, which here I’m saving to zenodo.json. Now you can use awk to parse that file and recover the deposit ID. I do that like this:
zid=$(tr , '\n' < zenodo.json | awk '/"id"/{printf "%i", $2}')
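
The awk one-liner depends on where commas fall in the response. As an alternative, here is a minimal Python sketch that parses the saved response as JSON. It assumes the file may start with HTTP headers (because of `curl -i`) and that the deposit ID is the top-level `id` field; `deposit_id_from_response` is a hypothetical helper and the sample text is made up, not real Zenodo output.

```python
import json


def deposit_id_from_response(text):
    """Return the top-level "id" of the first JSON object found in text."""
    start = text.index("{")  # skip any HTTP headers written by `curl -i`
    obj, _ = json.JSONDecoder().raw_decode(text[start:])
    return int(obj["id"])


# Made-up sample response, for illustration only.
sample = (
    "HTTP/1.1 201 CREATED\r\n"
    "Content-Type: application/json\r\n"
    "\r\n"
    '{"id": 1234, "files": [], "title": "Brain MRI"}'
)
print(deposit_id_from_response(sample))  # → 1234
```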

With your deposit ID in hand, you are ready to upload your data file

2. Upload data file:

curl -i -F name=MRI.nii.gz -F file=@/path/to/the/data/file/MRI.nii.gz "https://zenodo.org/api/deposit/depositions/$zid/files?access_token=$token"

The server will respond with an HTTP 100 ‘Continue’ message, and depending on the size of your file you’ll have to wait some time. Once the upload is finished you are ready to

3. Publish your dataset:

curl -i -X POST "https://zenodo.org/api/deposit/depositions/$zid/actions/publish?access_token=$token"

And that’s it. You can now go to Zenodo and view the web page for your data
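
To script the three steps from Python rather than curl, the same URLs and JSON body can be assembled first and then handed to whatever HTTP client you prefer. The sketch below only builds them and sends nothing. The endpoint paths and metadata fields are copied from the curl commands above; `deposit_metadata` and `step_urls` are hypothetical helper names.

```python
import json
from urllib.parse import urlencode

# Endpoint root taken from the curl commands above.
API_ROOT = "https://zenodo.org/api/deposit/depositions"


def deposit_metadata(title, description, creator, affiliation, keywords):
    """Build the JSON body posted in step 1 (create a new deposit)."""
    return json.dumps({
        "metadata": {
            "upload_type": "dataset",
            "access_right": "open",
            "license": "cc-by-nc-4.0",
            "title": title,
            "description": description,
            "keywords": list(keywords),
            "creators": [{"name": creator, "affiliation": affiliation}],
        }
    })


def step_urls(token, deposit_id):
    """Return the URL used by each of the three steps."""
    query = "?" + urlencode({"access_token": token})
    return {
        "create": f"{API_ROOT}/{query}",
        "upload": f"{API_ROOT}/{deposit_id}/files{query}",
        "publish": f"{API_ROOT}/{deposit_id}/actions/publish{query}",
    }


body = deposit_metadata("Brain MRI", "Brain MRI", "Toro, Roberto",
                        "Brain Catalogue", ["MRI", "Brain"])
urls = step_urls("TOKEN", 1234)  # placeholder token and deposit ID
print(urls["publish"])
```

Building the query string with `urlencode` keeps the token safely escaped, which matters if it is ever pasted with unexpected characters.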


Ref.
http://siphonophore.org/blog/2016/01/16/at-brain-catalogue-we-love-zenodo/

---
A bug in a JSON object
https://github.com/zenodo/zenodo/issues/865
In the web documentation, API Documentation for developers (https://zenodo.org/dev):
Resources > Representations > Deposition metadata > subjects

The example JSON object given for subjects is:
[{"term": "Astronomy",
"id": "http://id.loc.gov/authorities/subjects/sh85009003",
"scheme": "url"}]
but "id" is not supported and the JSON is rejected.
The field must be named 'identifier'.
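
If you already have subject lists written in the documented (rejected) form, the key can be renamed before the metadata is sent. This is a minimal sketch; `fix_subjects` is a hypothetical helper, and it assumes that the only change needed is renaming `id` to `identifier`.

```python
def fix_subjects(subjects):
    """Rename the rejected "id" key to "identifier" in each subject."""
    fixed = []
    for subject in subjects:
        subject = dict(subject)  # copy, so the caller's data is untouched
        if "id" in subject and "identifier" not in subject:
            subject["identifier"] = subject.pop("id")
        fixed.append(subject)
    return fixed


documented = [{
    "term": "Astronomy",
    "id": "http://id.loc.gov/authorities/subjects/sh85009003",
    "scheme": "url",
}]
print(fix_subjects(documented))
```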

resources

http://developers.zenodo.org/
(The Zenodo REST API documentation uses Slate.)

not great: https://zenodo.readthedocs.io/



Ref. https://indico.cern.ch/event/533421/contributions/2330179/attachments/1378438/2094268/kumasi2016-practical-exercises-rest-api.pdf

blog zenodo

http://blog.zenodo.org/
Zenodo docs have landed!
by  Krzysztof Nowak on January 23, 2017

wiki zenodo

https://github.com/zenodo/zenodo/wiki/What's-new%3F

YAML Github
Zenodio is a Python package we’re building to interact with Zenodo. For our various doc/technote/publishing projects we want to use YAML files (embedded in a Git repository, for example) to maintain deposition metadata so that the upload process itself can be automated.
The zenodio.metadata sub package provides a Python representation of Zenodo metadata (but not File or Zenodo deposition metadata).
Zenodio is a simple Python interface for getting data into and out of Zenodo, the digital archive developed by CERN. Zenodo is an awesome tool for scientists to archive the products of research, including datasets, codes, and documents. Zenodio adds a layer of mechanization to Zenodo, allowing you to grab metadata about records in a Zenodo collection, or upload new artifacts to Zenodo with a smart Python API.
We’re still designing the upload API, but metadata harvesting is ready to go.
Zenodio is built by SQuaRE for the Large Synoptic Survey Telescope.
https://github.com/lsst-sqre/zenodio/tree/metadata_api
http://zenodio.lsst.io/en/latest/
https://jira.lsstcorp.org/browse/DM-4852

Differences between ORCID and DataCite (DOI) Metadata

THOR is a 30 month project funded by the European Commission under the Horizon 2020 programme. It will establish seamless integration between articles, data, and researchers across the research lifecycle. This will create a wealth of open resources and foster a sustainable international e-infrastructure.

Differences between ORCID and DataCite Metadata
One of the first tasks for DataCite in the European Commission-funded THOR project, which started in June 2015, was to contribute to a comparison of the ORCID and DataCite metadata standards. Together with ORCID, CERN, the British Library and Dryad we looked at how contributors, organizations and artefacts - and the relations between them - are described in the respective metadata schemata, and how they are implemented in two example data repositories, Archaeology Data Service and Dryad Digital Repository. The focus of our work was on identifying major gaps. Our report was finished and made publicly available in September 2015. The key findings are on these topics:
  • Common Approach to Personal Names
  • Standardized Contributor Roles
  • Standardized Relation Types
  • Metadata for Organisations
  • Persistent Identifiers for Projects
  • Harmonization of ORCID and DataCite Metadata

https://project-thor.readme.io/docs/differences-between-orcid-and-datacite-metadata

This document identifies gaps in existing PID infrastructures, with a focus on ORCID and DataCite Metadata and links between contributors, organizations and artefacts. What prevents us from establishing interoperability and overcoming barriers between PID platforms for contributors, artefacts and organisations, and research solutions for federated attribution, claiming, publishing and direct data access? It goes on to propose strategies to overcome these gaps.
https://zenodo.org/record/30799#.WIi5DmrNzdQ
