InvenioRDM field | # values | Procedure | CodeMeta fields | CFF fields | GitHub release JSON fields | GitHub repo JSON fields

additional_descriptions
# values: Multiple
Procedure: Add separate items as follows:
 • If the CodeMeta releaseNotes is set and it’s not a URL and we didn’t use it as the value of the main description, add it with the InvenioRDM CV value “other”.
 • If the CodeMeta description is set and we didn’t use it as the value of the main description, add it with the InvenioRDM CV value “other”.
 • If the CFF abstract is set and we didn’t use it as the value of the main description, add it with the InvenioRDM CV value “other”.
 • If the GitHub repo description is set and we didn’t use it as the value of the main description, add it with the InvenioRDM CV value “other”.
 • If the CodeMeta readme is set and it’s not a URL, add it with the InvenioRDM CV value “technical-info”. (If the value is a URL, create a string of the form “Additional information is available at {URL}” and add that instead.)
Remove duplicate values from the resulting list of descriptions.
CodeMeta fields: description, readme, releaseNotes
CFF fields: abstract
GitHub release JSON fields: (None)
GitHub repo JSON fields: description

additional_titles
# values: Multiple
Procedure: Add separate items as follows:
 • If the CodeMeta name is set, add it with InvenioRDM CV type “alternate-title”.
 • If the CFF title is set, add it with InvenioRDM CV type “alternate-title”.
Remove duplicate values from the resulting list of titles.
CodeMeta fields: name
CFF fields: title
GitHub release JSON fields: (Not used here; see title below)
GitHub repo JSON fields: (Not used here; see title below)

contributors
# values: Multiple
Procedure: Add separate items as follows:
 • If the CFF contact is set, add the (single) identity with an InvenioRDM role CV value of “contactperson”.
 • If the CodeMeta maintainer is set, add each identity in the list with an InvenioRDM role CV value of “other”.
 • If the CodeMeta sponsor is set, add each identity in the list with an InvenioRDM role CV value of “sponsor”.
 • If the CodeMeta producer is set, add each identity in the list with an InvenioRDM role CV value of “producer”.
 • If the CodeMeta editor is set, add each identity in the list with an InvenioRDM role CV value of “editor”.
 • If the CodeMeta copyrightHolder is set, add each identity in the list with an InvenioRDM role CV value of “rightsholder”.
 • If the CodeMeta provider is set, add each identity in the list with an InvenioRDM role CV value of “other”.
 • If the CodeMeta contributor is set, add each identity in the list with an InvenioRDM role CV value of “other”; otherwise, use the GitHub repo contributors field to build a list of contributors, using the GitHub API to look up people’s names, and add each of them with an InvenioRDM role CV value of “other”.
Remove identities that have a role of “other” and are also listed in creators.
CodeMeta fields: sponsor, producer, editor, copyrightHolder, maintainer, provider, contributor
CFF fields: contact, contributors
GitHub release JSON fields: (None)
GitHub repo JSON fields: contributors
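The final filtering step for contributors can be sketched as follows. This is a minimal illustration, not the tool's actual code: it assumes each identity has already been reduced to a plain dictionary with hypothetical "name" and "role" keys, and that matching on the name string is good enough. Real InvenioRDM records nest this information more deeply.

```python
def drop_redundant_contributors(contributors, creators):
    """Drop contributors whose role is "other" and who also appear in creators.

    Assumption: identities are dicts with hypothetical "name" and "role" keys,
    and two identities are the same person when their names are equal.
    """
    creator_names = {person["name"] for person in creators}
    return [
        person
        for person in contributors
        if not (person["role"] == "other" and person["name"] in creator_names)
    ]


creators = [{"name": "Doe, Jane"}]
contributors = [
    {"name": "Doe, Jane", "role": "other"},   # removed: also a creator
    {"name": "Doe, Jane", "role": "editor"},  # kept: role is not "other"
    {"name": "Roe, Sam", "role": "other"},    # kept: not a creator
]
filtered = drop_redundant_contributors(contributors, creators)
```

Only the “other” role is filtered, so a creator who is also listed as, say, an editor keeps that more specific contributor entry.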

creators
# values: Multiple
Procedure: Add a separate item for each identity in the list of values from CodeMeta author or CFF author (but not both) if any are present; else, use the (single) GitHub release author if present; else, use the (single) GitHub repo owner. If only ORCID iDs are given, names are looked up using ORCID; if names are given as single strings, multiple NLP methods are used to split them into given and family name parts.
CodeMeta fields: author
CFF fields: author
GitHub release JSON fields: author
GitHub repo JSON fields: owner

dates
# values: Multiple
Procedure: Add separate items as follows:
 • An item with InvenioRDM date CV type “created” using the value of CodeMeta dateCreated (if set) or the GitHub repo created_at.
 • An item with InvenioRDM date CV type “updated” using the value of CodeMeta dateModified (if set) or the GitHub repo updated_at.
 • An item with InvenioRDM date CV type “available” using the value of the GitHub release published_at.
 • If CodeMeta copyrightYear is set, an item with InvenioRDM date CV type “copyrighted” using the value of CodeMeta copyrightYear.
CodeMeta fields: dateCreated, dateModified, copyrightYear
CFF fields: (None)
GitHub release JSON fields: published_at
GitHub repo JSON fields: created_at, updated_at
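As a rough sketch of the dates procedure, the function below builds the list from already-parsed metadata dictionaries. The function name, the dictionary arguments, and the choice to truncate GitHub timestamps to their date part are all assumptions made for illustration; the item shape follows the InvenioRDM convention of a "date" string plus a "type" object with an "id".

```python
def build_dates(codemeta, release, repo):
    """Build an InvenioRDM-style dates list (illustrative sketch only).

    Assumption: inputs are plain dicts parsed from codemeta.json and the
    GitHub release/repo JSON; timestamps are cut to YYYY-MM-DD.
    """
    def item(value, type_id):
        return {"date": str(value)[:10], "type": {"id": type_id}}

    dates = []
    created = codemeta.get("dateCreated") or repo.get("created_at")
    if created:
        dates.append(item(created, "created"))
    updated = codemeta.get("dateModified") or repo.get("updated_at")
    if updated:
        dates.append(item(updated, "updated"))
    if release.get("published_at"):
        dates.append(item(release["published_at"], "available"))
    if codemeta.get("copyrightYear"):
        dates.append(item(codemeta["copyrightYear"], "copyrighted"))
    return dates


example_dates = build_dates(
    {"dateCreated": "2020-01-15", "copyrightYear": 2020},
    {"published_at": "2023-03-01T10:00:00Z"},
    {"created_at": "2019-12-01T08:00:00Z", "updated_at": "2023-02-28T09:30:00Z"},
)
```

Here the CodeMeta dateCreated wins over the repo created_at, while the “updated” item falls back to the repo updated_at because dateModified is absent.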

description
# values: One
Procedure: If the GitHub release body is not empty, use that; else, if the CodeMeta releaseNotes is not empty and not a URL, use that; else, try the CodeMeta description, the CFF abstract, and the GitHub repo’s description field, in that order.
CodeMeta fields: releaseNotes, description
CFF fields: abstract
GitHub release JSON fields: body
GitHub repo JSON fields: description
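The fallback chain for description can be sketched like this. It is a minimal illustration under stated assumptions: the function names are hypothetical, the inputs are plain dictionaries, the URL test is a simple regular expression, and the third source is read as the CodeMeta description field, which is what the field columns for this row list.

```python
import re


def is_url(text):
    """Crude URL check (assumption: a single http(s) URL and nothing else)."""
    return re.match(r"^https?://\S+$", text.strip()) is not None


def choose_description(release, codemeta, cff, repo):
    """Return the first usable description, following the order in the table."""
    body = (release.get("body") or "").strip()
    if body:
        return body
    notes = (codemeta.get("releaseNotes") or "").strip()
    if notes and not is_url(notes):
        return notes
    candidates = (
        codemeta.get("description"),
        cff.get("abstract"),
        repo.get("description"),
    )
    for candidate in candidates:
        text = (candidate or "").strip()
        if text:
            return text
    return ""


chosen = choose_description(
    {"body": ""},
    {"releaseNotes": "https://example.org/notes"},
    {"abstract": "A tool."},
    {"description": "Repo text"},
)
```

In this example the release body is empty and the release notes are a URL, so the CFF abstract is selected ahead of the repo description.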

formats
# values: Multiple
Procedure: If the GitHub release has a value for tarball_url, add “application/x-tar-gz”. If the GitHub release has a value for zipball_url, add “application/zip”. If there are values in the GitHub release assets list, infer additional MIME types based on file extensions.
CodeMeta fields: (fileFormat – not used)
CFF fields: (None)
GitHub release JSON fields: If tarball_url set ⟹ tgz; if zipball_url set ⟹ zip; values in assets may imply additional types.
GitHub repo JSON fields: (None)
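One way to picture the formats procedure is the sketch below, which uses Python's standard mimetypes module to guess types from asset file names. Whether the real implementation uses mimetypes is not stated in the table, so treat that choice, and the function name, as assumptions.

```python
import mimetypes


def infer_formats(release):
    """Collect MIME types implied by a GitHub release JSON object (sketch)."""
    formats = []
    if release.get("tarball_url"):
        formats.append("application/x-tar-gz")
    if release.get("zipball_url"):
        formats.append("application/zip")
    for asset in release.get("assets") or []:
        # Guess from the file extension of the asset's "name" field.
        mime, _encoding = mimetypes.guess_type(asset.get("name", ""))
        if mime and mime not in formats:
            formats.append(mime)
    return formats


example_formats = infer_formats(
    {
        "tarball_url": "https://example.org/archive.tar.gz",
        "zipball_url": "https://example.org/archive.zip",
        "assets": [{"name": "manual.pdf"}, {"name": "data.zip"}],
    }
)
```

The second asset adds nothing new here because “application/zip” is already present from zipball_url.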

funding
# values: Multiple
Procedure: Use CodeMeta funding and funder values, constructing InvenioRDM funding objects with the names of funders (looking up ROR identifiers in ROR.org if necessary).
CodeMeta fields: funding, funder
CFF fields: (None)
GitHub release JSON fields: (None)
GitHub repo JSON fields: (None)

identifiers
# values: Multiple
Procedure: For every item in CodeMeta identifier and CFF identifiers, detect recognizable identifiers of type ARXIV, DOI, GND, ISBN, ISNI, ORCID, PMCID, PMID, ROR, and SWH, and add InvenioRDM objects with scheme based on InvenioRDM identifier-types CV terms.
CodeMeta fields: identifier
CFF fields: identifiers
GitHub release JSON fields: (None)
GitHub repo JSON fields: (None)
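
languages
# values: Multiple
Procedure: Hardwired to the value representing “English”.
CodeMeta fields: (None)
CFF fields: (None)
GitHub release JSON fields: (None)
GitHub repo JSON fields: (None)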
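
locations
# values: Multiple
Procedure: Hardwired to an empty list.
CodeMeta fields: (None)
CFF fields: (None)
GitHub release JSON fields: (None)
GitHub repo JSON fields: (None)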
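
publication_date
# values: One
Procedure: Use CodeMeta datePublished, CFF date-released, or the GitHub release published_at, tried in that order.
CodeMeta fields: datePublished
CFF fields: date-released
GitHub release JSON fields: published_at
GitHub repo JSON fields: (None)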
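
publisher
# values: One
Procedure: Set to the name of the InvenioRDM server.
CodeMeta fields: (Not used)
CFF fields: (Not used)
GitHub release JSON fields: (None)
GitHub repo JSON fields: (None)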

references
# values: Multiple
Procedure: Look at each item in CodeMeta referencePublication and CFF preferred-citation and references, and collect identifiers of type DOI, ARXIV, ISBN, PMCID, and PMID. Use a combination of Crossref and Python’s isbnlib to get the corresponding reference metadata, then generate plain-text references in APA format, and finally add each item to the InvenioRDM references field.
CodeMeta fields: referencePublication
CFF fields: preferred-citation, references
GitHub release JSON fields: (None)
GitHub repo JSON fields: (None)

related_identifiers
# values: Multiple
Procedure: Add separate items as follows:
 • The GitHub release html_url field value, with InvenioRDM relation CV term “isidenticalto” and scheme “url”.
 • The value of one of the fields CodeMeta codeRepository, CFF repository-code, or the GitHub repo html_url (whichever has a value first), with InvenioRDM relation CV term “isderivedfrom” and scheme “url”.
 • If the CodeMeta releaseNotes is a URL, add it with the InvenioRDM relation CV term “isdescribedby”.
 • The value of one of the fields CodeMeta url, CFF url, or the GitHub repo homepage field (whichever has a value first), with InvenioRDM relation CV term “isdescribedby” and scheme “url”.
 • The value of CodeMeta sameAs, with InvenioRDM relation CV term “isversionof” and scheme “url”.
 • The value of CodeMeta downloadUrl or CFF repository-artifact (whichever has a value first), with InvenioRDM relation CV term “isvariantformof” and scheme “url”.
 • The value of CodeMeta installUrl, with InvenioRDM relation CV term “isvariantformof” and scheme “url”.
 • If CodeMeta softwareHelp is set, or if the GitHub repo has an associated GitHub Pages URL, add one of them with InvenioRDM relation CV term “isdocumentedby” and scheme “url”.
 • If the CodeMeta issueTracker is set, add it with the InvenioRDM relation CV term “issupplementedby”; else, if the GitHub repo issues_url is set, add it instead.
 • The value(s) of CodeMeta relatedLink, with InvenioRDM relation CV term “references” and scheme “url”.
 • For each value in the CodeMeta referencePublication and CFF preferred-citation and references that has not already been added as a related identifier, add the identifier with InvenioRDM relation CV term “isreferencedby” and scheme according to the identifier type.
CodeMeta fields: codeRepository, downloadUrl, installUrl, issueTracker, referencePublication, relatedLink, releaseNotes, sameAs, softwareHelp, url
CFF fields: preferred-citation, references, repository-artifact, repository-code, url
GitHub release JSON fields: html_url
GitHub repo JSON fields: html_url, homepage, has_pages, issues_url

resource_type
# values: One
Procedure: If the CFF field type is set to “dataset”, use InvenioRDM CV value “dataset”; otherwise, use “software”.
CodeMeta fields: (None)
CFF fields: type
GitHub release JSON fields: (None)
GitHub repo JSON fields: (None)

rights
# values: Multiple
Procedure: Look for CodeMeta license, CFF license, and CFF license-url, in that order; if none are available, look for the GitHub repo license field value; if that is not set, look in the GitHub repository’s files for a file named “LICENSE”, “License”, “COPYING”, or similar. If the information found includes a name or a URL, match it against known SPDX licenses and use the identifier (e.g., “bsd-1-clause”) as the value of the rights object’s “id” field, with the title of the license as the “title” value and the URL of the license as the “link” value. If only a license file is found in the repo, create a value of the form {"title": {"en": "License"}, "link": URL}.
CodeMeta fields: license
CFF fields: license, license-url
GitHub release JSON fields: (None)
GitHub repo JSON fields: license

sizes
# values: Multiple
Procedure: Set to the sizes of the file(s) uploaded to the InvenioRDM server. Value is a list of strings, with each value given in the same order as the values of the formats field.
CodeMeta fields: (fileSize – not used)
CFF fields: (None)
GitHub release JSON fields: tarball_url, zipball_url, assets
GitHub repo JSON fields: (None)
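
subjects
# values: Multiple
Procedure: Create a union of all terms found in the GitHub repo topics field, CodeMeta keywords, CFF keywords, CodeMeta programmingLanguage, and the languages listed at the GitHub repo languages_url.
CodeMeta fields: keywords, programmingLanguage
CFF fields: keywords
GitHub release JSON fields: (None)
GitHub repo JSON fields: topics, languages_url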

title
# values: One
Procedure: Construct a string of the form “title_part – version_part”, using an en-dash instead of a colon to separate the parts in order to avoid accidentally introducing two colons into the string.
 • For title_part, use the CodeMeta name; if that’s not set, use the CFF title; and if that’s not set, use the GitHub repository full_name.
 • For version_part, use the GitHub release name, or if that’s not set, the GitHub release tag_name.
CodeMeta fields: name
CFF fields: title
GitHub release JSON fields: name or tag_name (if name is empty)
GitHub repo JSON fields: full_name
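The title construction can be sketched in a few lines. The function name and the dictionary arguments are hypothetical; only the field names and the en-dash separator come from the table.

```python
def build_title(codemeta, cff, release, repo):
    """Join a title part and a version part with an en-dash (sketch)."""
    title_part = codemeta.get("name") or cff.get("title") or repo.get("full_name") or ""
    version_part = release.get("name") or release.get("tag_name") or ""
    return f"{title_part} \u2013 {version_part}"  # \u2013 is the en-dash


example_title = build_title(
    {},                                  # no CodeMeta name
    {"title": "My Tool"},                # CFF title is used
    {"name": "", "tag_name": "v1.2.0"},  # empty release name, so tag_name is used
    {"full_name": "org/my-tool"},
)
```

Using `or` treats empty strings the same as missing values, which matches the “if name is empty” note in the GitHub release column.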
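
version
# values: One
Procedure: Use the GitHub release tag_name, first removing any leading text of “v” or “version” if it appears as part of the tag name.
CodeMeta fields: (Not used)
CFF fields: (Not used)
GitHub release JSON fields: tag_name
GitHub repo JSON fields: (None)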
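A possible way to strip the leading “v” or “version” from a tag name is shown below. The exact matching rules are not given in the table, so this sketch makes two assumptions: the match is case-insensitive, and the prefix is removed only when a digit follows (optionally after a separator), so that a tag such as “vanilla” is left alone. The function name is hypothetical.

```python
import re

# Assumption: strip "v" or "version" plus optional separators, but only
# when a digit follows, so ordinary words starting with "v" are untouched.
_PREFIX = re.compile(r"^(?:version|v)[\s._-]*(?=\d)", flags=re.IGNORECASE)


def clean_version(tag_name):
    """Return the tag name without a leading "v" or "version" prefix."""
    return _PREFIX.sub("", tag_name.strip())


cleaned = [clean_version(tag) for tag in ("v1.2.0", "version-2.0", "Version 3", "vanilla")]
```

Tags that do not start with one of the two prefixes, such as “release-1.0”, pass through unchanged.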