A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | M4ML | INESData | ||||||||||||||||||||||||
2 | Source | Property | Range | Description | Property | Range | Description | Comments LJ | Comments Jenifer T | Comments Nelson Q | Comments Rohitha | Comments Daniel G | ||||||||||||||
3 | FAIR4ML | deployedAt | Thing | Platform, website, webservice or similar where this ML model has been deployed. There could be deployments that this ML model is not aware of (e.g., done by third-parties). | While this may be interesting, it is very difficult to maintain this information (at least in a model card). | |||||||||||||||||||||
4 | FAIR4ML | ethicalLegalSocial | Text | Considerations wrt ethical, legal and social aspects. | (+1) | (+1) | ||||||||||||||||||||
5 | FAIR4ML | evaluationMetricValue | PropertyValue | Evaluation metric values obtained when creating this ML model. There should be a correspondence with the evaluation metrics declared by the ML software used to create this ML model. | evaluationMetrics | Text | Description of the metrics used for evaluating the ML model | Proposal: include in 0.0.1. To cover m4ml and InesData (metrics plus results) Name: evaluationResults Range: Text or PropertyValue Example text: ["Precision: 0.8", "Mean: 0.9"] Example PropertyValue: [ { minValue: 0.0, maxValue: 1.0, value: 0.8, measurementTechnique: "Precision" } ... ] | I also think that the metric and the value should both be stored. | Equivalent to ind:evaluationResults. But I think that having ind:evaluationMetrics and ind:evaluationResults can result in repeated information. Connecting this property to an evaluation dataset like the ind:evaluatedOn would be nice. | I also think that the metric and the value should be stored together using evaluationMetric. | This may need further discussion. On the one hand, having a text value to add a textual description is easy to do. On the other hand, it may be worth discussing a property-value pair <metric, result> to be able to rank results | ||||||||||||||
6 | FAIR4ML | externalValidation | m4ml:MLModelValidationAction | A validation action using this ML model with an external validation dataset (e.g., when new datasets are produced from experiments and were not used as part of the learning process of this ML model). There could be external validations that this ML model is not aware of (e.g., done by third-parties). | I think here we could use the m4ml:testedOn. I don't think making an explicit separation is necessary. | Pointing to an action may bring in additional complexity that may not be easy to capture in my opinion. I am ok pointing to the validation dataset directly | ||||||||||||||||||||
7 | FAIR4ML | fineTunedBy | SoftwareApplication or SoftwareSourceCode | ML Software fine-tuning this ML model. | Proposal: Leave it out for now. The model is created by a software that does all the training. That training can include some fine-tuning. Still, we are talking about the same software. What would be more interesting is knowing what existing MLModel was used for fine-tuning, so the name would be fineTuneOf and the range MLModel... To be honest, I am not so sure on what and how to model here. Fine-tuning and re-training are important and should be captured but not sure how... | (+1) | ||||||||||||||||||||
8 | FAIR4ML | generatedBy | m4ml:MLOptimizationAction | Optimization action on an ML software used to create this ML model. | Proposal: An MLModel is created/generated by a specific run of a software. To simplify it we can go for MLModel - generatedBy - SoftwareSourceCode or SoftwareApplication Name: generatedBy Range: SoftwareSourceCode or SoftwareApplication Description: Software used to generate this model | I don't understand this one | Inesdata schema has the ind:ModelTraining Class to represent that event. | What does optimization action mean here? What does a particular execution of an MLSoftware imply? If it implies the training of the model, then a trainedBy property could be added to reference the dataset. Or it could be linked to the optimizedFor property. | I think this may be too granular a level, at least in the first iteration of the model | |||||||||||||||||
9 | FAIR4ML | hyperparameterValue | PropertyValue | Hyperparameter values used to create this ML model. There should be a correspondence with the hyperparameters declared by the ML software used to create this ML model. | Proposal: Leave it out for now. For 7B weights we could use a URL to the file. But, I do not really see how to get this one from the model cards. It is considered in DOME but not sure how to get them easily so we can leave it for the next round. | This only works for regular ML models but not neural networks, right? Or if so, storing 7B weights would be a bit expensive | ||||||||||||||||||||
10 | FAIR4ML | intendedUse | Text or DefinedTerm or URL | Purpose and intended use stated to enable users to make a decision as to the suitability of this creative work (e.g., lab protocol, machine learning model, software) to their experimental problem or own use case. | ||||||||||||||||||||||
11 | FAIR4ML | mlAlgorithm | Text or DefinedTerm | ML algorithm used to solve the task. For instance logistic regression or random forests. | modelCategory | Text | Category of the model (e.g., SVM, Transformer, Supervised, etc.) | SVM is supervised. I suggest splitting into algorithm (SVM, CNN) and category (supervised, reinforcement, unsupervised) | (+1) | |||||||||||||||||
12 | FAIR4ML | mlTask | Text or DefinedTerm | ML task addressed by this ML software or model. For instance binary classification. | task | Text | Task for which the model was trained or fine tuned. E.g., image classification, sentiment analysis, etc. | (+1) | ||||||||||||||||||
13 | FAIR4ML | optimizedFor | Dataset | AI-ready dataset (after pre-processing) used by the ML software for the training and optimization of this ML model. | ind:trainedOn | ind:MLModel | Link to the dataset(s) used for training the model. | Equivalent to ind:trainedOn | ||||||||||||||||||
14 | FAIR4ML | retrainedBy | SoftwareApplication or SoftwareSourceCode | ML software used to re-train this ML model. | Will an MLModel know if it is retrained by a software? I guess yes when it was created by a retraining process. But if it is the "big" model that will be used by others to retrain their own models, then not necessarily. Do we need an inverse property for the second case? | I think there should be a retrained entity, that specifies if the model was retrained completely or partially (fine-tuned), that gives information about the base model used (Or multiple models with techniques like Mixture of Experts), which algorithm/technique was used for the retraining (Reinforcement-learning, MoE, Low-rank adaptation, IA3, etc), what dataset was used for the retraining, what task was optimized in the retraining, etc. | ||||||||||||||||||||
15 | FAIR4ML | developmentLibrary | Text or URL | In the diagram it is inesdata but in the JSON-LD it is codemeta. It is not described in either of them | ||||||||||||||||||||||
16 | FAIR4ML | evaluationResults | Text | Description of the evaluation results obtained from the model (comparison, metric tables, etc.) | See proposal for evaluationMetrics. Not sure what it means. Are these the actual values corresponding to the evaluation metrics? Or e.g., comparison tables across results from different models? If the former, it will be messy to do the one to one. If the latter, would you not need a link to the other models you are comparing this one to? It sounds then more complex than a property with only text, maybe a Dataset (table) would work in that case but still it would miss the info on the compared models | It could be combined in the evaluationMetric if we apply the changes mentioned there | Could be combined with evaluationMetric. | We need to figure out how to combine with evaluation metric | ||||||||||||||||||
17 | FAIR4ML | GPURequirements | Text | Description of the GPU requirements needed to run the model | Why is this different from schema:ProcessorRequirements? A GPU is a processor, is it not? | This makes sense to me because some models (e.g., scikit-learn) run on CPU | Better to keep schema:ProcessorRequirements separate. | The rationale for having it separate is that GPU is a special type of processor. | ||||||||||||||||||
18 | FAIR4ML | hasCO2eEmissions | Text | Amount of CO2 equivalent emissions produced by the model. The unit should be included in the field (e.g., 10 tonnes) | (+1) There is also schema:emissionsCO2. To keep it compatible I suggest adding Number to range. Or, use the same as you do with schema:distribution where you have changed domain and range | (+1) | we can merge it with schema.org +1 to Jael's suggestion | |||||||||||||||||||
19 | FAIR4ML | modelRisks | Text | Description of the risks and biases of the model, in a human-readable manner | (+1) | I do not understand this. Does it refer to malware and any viruses that could be caused by the model? | May need to clarify the definition | |||||||||||||||||||
20 | FAIR4ML | parameterSize | Text | Brief description on the parameter size used to train the model (e.g., 7B). The unit (e.g., billions) must be included in the description | Is this about the number of data points used to train the model? I think the name is confusing. For me parameters would be those used in the algorithm, e.g., window size in embeddings | Does this refer to the weights (also called parameters in terms of NNs)? The word 'size' is a bit misleading to me, 'Number' maybe? | I do not understand what parameterSize refers to. Are they the number of parameters used to train the model? Or all the permutations of hyperparameters possible? | |||||||||||||||||||
21 | FAIR4ML | schema:distribution | ind:MLModelDownload | (+1) for property and ind:MLModelType | ||||||||||||||||||||||
22 | FAIR4ML | usageInstructions | Text | Description of the instructions needed to run the model (e.g., to do inference on a task). Code snippets may be used for illustration | (+1) schema:usageInfo is similar, I would make the range compatible or name it the same so easy to integrate later | Agreed | ||||||||||||||||||||
23 | codemeta | buildInstructions | URL | Link to installation instructions/documentation. | Is this property not inside ind:usageInstructions? | |||||||||||||||||||||
24 | codemeta | contIntegration | URL | Link to continuous integration service. | I don't think that knowing the continuous integration approach relates too much to the Model itself. | (+1 to Nelson's comment) | ||||||||||||||||||||
25 | codemeta | developmentStatus | Text | Description of development status, e.g. Active, inactive, suspended. See [repostatus.org](http://www.repostatus.org/) | SAME | It is difficult to keep track of this property. | Very much needed. | |||||||||||||||||||
26 | codemeta | embargoDate | Date | Software may be embargoed from public access until a specified date (e.g. pending publication, 1 year from publication). | I think this property is too specific. | |||||||||||||||||||||
27 | codemeta | issueTracker | URL | Link to software bug reporting or issue tracking system. | Needed to track bugs in different versions of the same model (if different versions exist). But this could also be conveyed by releaseNotes. | |||||||||||||||||||||
28 | codemeta | readme | URL | Link to software Readme file. | Proposal: keep it as it is. For clarity MLModel (domain) - readme - URL (range) Description: Link to the readme file of this creative work (e.g., software or MLModel) Ideally, all the elements from the readme would go to different properties but that also happens with software, right? So, we need to keep it? | (+1) | Exactly to what software are we referring? The software used to train the model? To run the model? Of the model itself? | (+1 to Nelson's comment) | ||||||||||||||||||
29 | codemeta | referencePublication | ScholarlyArticle | An academic publication related to the software. | SAME | But we will need to change the domain to CreativeWork or to "SoftwareSourceCode or SoftwareApplication or MLModel" to cover what codemeta has and the new case we are introducing | (+1) | (+1) | ||||||||||||||||||
30 | schema:CreativeWork | archivedAt | URL or WebPage | Indicates a page or other link involved in archival of a [[CreativeWork]]. In the case of [[MediaReview]], the items in a [[MediaReviewItem]] may often become inaccessible, but be archived by archival, journalistic, activist, or law enforcement organizations. In such cases, the referenced page may not directly publish the content. | Proposal: keep it as it is, it is very important for ZB MED case. The idea is having here to which registry (MLentory for ZB MED case) the model has been added | I am not sure what this means. Model archival is not a thing yet. | ||||||||||||||||||||
31 | schema:CreativeWork | author | Organization or Person | The author of this content or rating. Please note that author is special in that HTML 5 provides a special mechanism for indicating authorship via the rel tag. That is equivalent to this and may be used interchangeably. | (+1) | |||||||||||||||||||||
32 | schema:CreativeWork | citation | CreativeWork or Text | A citation or reference to another creative work, such as another publication, web page, scholarly article, etc. | SAME | |||||||||||||||||||||
33 | schema:CreativeWork | conditionsOfAccess | Text | Conditions that affect the availability of, or method(s) of access to, an item. Typically used for real world items such as an [[ArchiveComponent]] held by an [[ArchiveOrganization]]. This property is not suitable for use as a general Web access control mechanism. It is expressed only in natural language. For example "Available by appointment from the Reading Room" or "Accessible only from logged-in accounts". | ||||||||||||||||||||||
34 | schema:CreativeWork | contributor | Organization or Person | A secondary contributor to the CreativeWork or Event. | What is the difference between this and the author? | |||||||||||||||||||||
35 | schema:CreativeWork | copyrightHolder | Organization or Person | The party holding the legal copyright to the CreativeWork. | ||||||||||||||||||||||
36 | schema:CreativeWork | dateCreated | Date or DateTime | The date on which the CreativeWork was created or the item was added to a DataFeed. | (+1) | |||||||||||||||||||||
37 | schema:CreativeWork | dateModified | Date or DateTime | The date on which the CreativeWork was most recently modified or when the item's entry was modified within a DataFeed. | SAME | |||||||||||||||||||||
38 | schema:CreativeWork | datePublished | Date or DateTime | Date of first broadcast/publication. | ||||||||||||||||||||||
39 | schema:CreativeWork | discussionUrl | URL | A link to the page containing the comments of the CreativeWork. | ||||||||||||||||||||||
40 | schema:CreativeWork | funding | Grant | A Grant that directly or indirectly provide funding or sponsorship for this item. See also ownershipFundingInfo. Inverse property: fundedItem | ||||||||||||||||||||||
41 | schema:CreativeWork | isAccessibleForFree | Boolean | A flag to signal that the item, event, or place is accessible for free. | (+1) | |||||||||||||||||||||
42 | schema:CreativeWork | keywords | DefinedTerm or Text or URL | Keywords or tags used to describe this content. Multiple entries in a keywords list are typically delimited by commas. | SAME | |||||||||||||||||||||
43 | schema:CreativeWork | license | CreativeWork or URL | A license document that applies to this content, typically indicated by URL. | SAME | |||||||||||||||||||||
44 | schema:CreativeWork | maintainer | Organization or Person | A maintainer of a [[Dataset]], software package ([[SoftwareApplication]]), or other [[Project]]. A maintainer is a [[Person]] or [[Organization]] that manages contributions to, and/or publication of, some (typically complex) artifact. It is common for distributions of software and data to be based on "upstream" sources. When [[maintainer]] is applied to a specific version of something e.g. a particular version or packaging of a [[Dataset]], it is always possible that the upstream source has a different maintainer. The [[isBasedOn]] property can be used to indicate such relationships between datasets to make the different maintenance roles clear. Similarly in the case of software, a package may have dedicated maintainers working on integration into software distributions such as Ubuntu, as well as upstream maintainers of the underlying work. | ||||||||||||||||||||||
45 | schema:CreativeWork | headline | Text | Headline of the article. | (+1) What would be the purpose wrt ML models? Is the name not enough? Would this be an alternate name? Or a subtitle/motto phrase? | Should it not be enough to have references to the articles related to the model in referencePublication? What if there are multiple articles mentioning the model? | Should it not be enough with citation? | Headline has nothing to do with citation. It's a short description of the model | ||||||||||||||||||
46 | schema:CreativeWork | inLanguage | Language or Text | The language of the content or performance or used in an action. Please use one of the language codes from the IETF BCP 47 standard. See also availableLanguage. | (+1) Just to make sure, this is the natural language and not the programming one, right? | |||||||||||||||||||||
47 | schema:SoftwareApplication | installUrl | URL | URL at which the app may be installed, if different from the URL of the item. | Isn't the installUrl the same as the readme for many models? I wonder if this is needed | |||||||||||||||||||||
48 | schema:SoftwareApplication | memoryRequirements | Text or URL | Minimum memory requirements. | SAME | (+1) | ||||||||||||||||||||
49 | schema:SoftwareApplication | operatingSystem | Text | Operating systems supported (Windows 7, OSX 10.6, Android 1.6). | SAME | (+1) | ||||||||||||||||||||
50 | schema:SoftwareApplication | processorRequirements | Text | Processor architecture required to run the application (e.g. IA64). | SAME | I am not sure if the definition of "processor architecture required to run the application" is good enough. When I read the name "processorRequirement" I think about what processor hardware is required to run the model, a GPU, CPU or TPU and a particular model associated with it. It could be nice to have different attributes for the hardware needed to run and to train the model. | ||||||||||||||||||||
51 | schema:SoftwareApplication | releaseNotes | Text or URL | Description of what changed in this version. | ||||||||||||||||||||||
52 | schema:SoftwareApplication | softwareHelp | CreativeWork | Software application help. | ||||||||||||||||||||||
53 | schema:SoftwareApplication | softwareRequirements | Text or URL | Component dependency requirements for application. This includes runtime environments and shared libraries that are not included in the application distribution package, but required to run the application (Examples: DirectX, Java or .NET runtime). | SAME | |||||||||||||||||||||
54 | schema:SoftwareApplication | softwareVersion | Text | Version of the software instance. | schema:version | Proposal: use schema:version Why: it is better to extend from CreativeWork so that one is the one available there. It could be that versioning models is not much of a thing now but it should be. For instance, if only the dataset changes and nothing else, yes, it is a new model but it would be more correct to say that it is a new version... Not sure, I like versioning but we can keep it or leave it for later | Is the idea to keep track of the version of the model? Usually when a new version of the model appears it is launched as a new model. | |||||||||||||||||||
55 | schema:SoftwareApplication | storageRequirements | Text or URL | Storage requirements (free space required). | SAME | |||||||||||||||||||||
56 | schema:SoftwareSourceCode | codeRepository | URL | Link to the repository where the un-compiled, human readable code and related code is located (SVN, GitHub, CodePlex). | Would this be the code repository for the software that created the model? If not, what is the role wrt ML models? | It is not clear what related code means in regards to the model. | ||||||||||||||||||||
57 | schema:Thing | description | Text or TextObject | A description of the item. | SAME | |||||||||||||||||||||
58 | schema:Thing | identifier | PropertyValue or Text or URL | The identifier property represents any kind of identifier for any kind of [[Thing]], such as ISBNs, GTIN codes, UUIDs etc. Schema.org provides dedicated properties for representing many of these, either as textual strings or as URL (URI) links. See [background notes](/docs/datamodel.html#identifierBg) for more details. | SAME | |||||||||||||||||||||
59 | schema:Thing | name | Text | The name of the item. | SAME | |||||||||||||||||||||
60 | schema:Thing | sameAs | URL | URL of a reference Web page that unambiguously indicates the item's identity. E.g. the URL of the item's Wikipedia page, Wikidata entry, or official website. | Why do we need same as here? | |||||||||||||||||||||
61 | schema:Thing | url | URL | URL of the item. | I don't understand with regard to what we are storing this URL. | |||||||||||||||||||||
62 | ||||||||||||||||||||||||||
63 | ||||||||||||||||||||||||||
64 | Legend: | |||||||||||||||||||||||||
65 | Agreement (part of FAIR4ML core) | |||||||||||||||||||||||||
66 | Initial agreement, but more discussion is needed to be in core FAIR4ML vocab | |||||||||||||||||||||||||
67 | Disagreement (not part of FAIR4ML core at this time, but may be introduced in a future version) | |||||||||||||||||||||||||
68 | ||||||||||||||||||||||||||
69 | In column H | Revise if it is possible to keep it, justification and proposal on how to keep it included in column H | ||||||||||||||||||||||||
70 | In column H | Undecided so we can leave it for next time | ||||||||||||||||||||||||
71 | In column H | Undecided so we maybe leave it for next time | ||||||||||||||||||||||||
72 | In column H | Indeed, better to leave it for next time, more discussion needed | ||||||||||||||||||||||||
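
The proposal in row 5 (a single `evaluationResults` property with range Text or PropertyValue) can be illustrated with a minimal Python sketch. It converts the text form from the proposal into schema.org `PropertyValue` objects and ranks them, as suggested in the comments. The `ex:` prefix, its IRI, the `ex:MLModel` type, the model name, and both helper functions are hypothetical placeholders, not part of any agreed vocabulary; only `PropertyValue`, `value`, `minValue`, `maxValue` and `measurementTechnique` come from schema.org.

```python
import json


def parse_text_result(text):
    """Split a text entry such as 'Precision: 0.8' into (metric name, value)."""
    name, _, raw = text.partition(":")
    return name.strip(), float(raw)


def metric_to_property_value(name, value, min_value=0.0, max_value=1.0):
    """Build a schema.org PropertyValue-shaped dict for one <metric, result> pair.

    The default 0.0-1.0 bounds are an assumption taken from the example in the sheet.
    """
    return {
        "@type": "PropertyValue",
        "measurementTechnique": name,
        "value": value,
        "minValue": min_value,
        "maxValue": max_value,
    }


# Text form of the results, copied from the proposal in row 5.
text_results = ["Precision: 0.8", "Mean: 0.9"]

results = [metric_to_property_value(*parse_text_result(t)) for t in text_results]

# Structured values make it possible to rank results, unlike free text.
ranked = sorted(results, key=lambda r: r["value"], reverse=True)

model = {
    "@context": {
        "@vocab": "https://schema.org/",
        "ex": "https://example.org/vocab#",  # hypothetical namespace
    },
    "@type": "ex:MLModel",  # hypothetical type name
    "name": "example-model",  # hypothetical model name
    "ex:evaluationResults": results,
}

print(json.dumps(model, indent=2))
```

Keeping the metric name and its value in the same object avoids the one-to-one matching problem raised for separate `evaluationMetrics` and `evaluationResults` properties.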