ABCDEFGHIJK
1
Component | Legal Framework | Legal Document | Arctic Document Analysis | Freedoms defined: OSAID v. 0.0.8 | Notes
2
Component definitions: Model Openness Framework | For each component (source: OSAID v. 0.0.8) | Paste link to each component's legal document below | Use for any purpose and without having to ask for permission | Study how the system works and inspect its components | Modification for any purpose, including to change its output | Sharing for others to use, with or without modifications, for any purpose | Challenges in analyzing this component? Need for clarification? Let us know! (Use the comment function to annotate specific cells.)
3
Version reviewed: snowflake-arctic-instruct and snowflake-arctic-base (both seem to have the same characteristics) https://huggingface.co/Snowflake/snowflake-arctic-instruct https://huggingface.co/Snowflake/snowflake-arctic-base
4
Required
5
Data Information
6
Training methodologies and techniques | Available under OSD-compliant license | https://medium.com/snowflake/snowflake-arctic-cookbook-series-arctics-approach-to-data-b81a8a0958bd | Allowed | Allowed | All of the information for the "Data Information" rows is provided in a set of Medium posts with no specific license attached. Therefore, I understand the information can be used and studied, but not modified or shared. However, this could be hindered (at least with respect to use) by patents, which are not mentioned. The collection of posts is available at https://www.snowflake.com/en/data-cloud/arctic/cookbook/ . Not all posts have been published yet, only the first ones, but they seem to include most, if not all, of the information about the data. In any case, the final dataset used for training is also not available (or, to be more precise, I could not find it).
7
Training data scope and characteristics | Available under OSD-compliant license | https://medium.com/snowflake/snowflake-arctic-cookbook-series-arctics-approach-to-data-b81a8a0958bd | Allowed | Allowed | Idem
8
Training data provenance (including how data was obtained and selected) | Available under OSD-compliant license | https://medium.com/snowflake/snowflake-arctic-cookbook-series-arctics-approach-to-data-b81a8a0958bd | Allowed | Allowed | Idem
9
Training data labeling procedures, if used | Available under OSD-compliant license | https://medium.com/snowflake/snowflake-arctic-cookbook-series-arctics-approach-to-data-b81a8a0958bd | Allowed | Allowed | Idem
10
Training data cleaning methodology | Available under OSD-compliant license | https://medium.com/snowflake/snowflake-arctic-cookbook-series-arctics-approach-to-data-b81a8a0958bd | Allowed | Allowed | Idem
11
Code
12
Data pre-processing | Available under OSI-approved license | Something Snowflake is willing to share, but it has not been published anywhere yet because no one has asked so far. | I could not find it. Maybe it is disclosed in some of the pending Medium posts?
13
Training, validation and testing | Available under OSI-approved license | https://github.com/Snowflake-Labs/snowflake-arctic/tree/main/training | I could not find it. Maybe it is disclosed in some of the pending Medium posts?
14
Inference | Available under OSI-approved license | https://github.com/Snowflake-Labs/snowflake-arctic/blob/main/LICENSE | Allowed | Allowed | Allowed | Allowed | The repository with the inference code (really, very short examples of how to use the model) is under Apache-2.0, but it amounts to barely a few dozen lines of Python. The real action is in the libraries: transformers and torch for one of the examples, vllm for the other.
15
Supporting libraries and tools | Available under OSI-approved license | Allowed | Allowed | Allowed | Allowed | Many components are involved. There are two different examples, based on two different modules (vllm and transformers). All of them seem to be under OSI-compliant licenses anyway (see comment in legal document cell).
16
Model
17
Model architecture | Available under OSI-approved license | https://huggingface.co/Snowflake/snowflake-arctic-instruct | Allowed | Allowed | Allowed | Allowed | Described in the blog posts and in the code
18
Model parameters | Available under OSD-conformant terms | https://huggingface.co/Snowflake/snowflake-arctic-instruct | Allowed | Allowed | Allowed | Allowed | Apache 2.0 license, as stated in the model card
19
Optional
20
Data Information All data sets, including:
21
Training data sets | Available under OSD-compliant license | The datasets used are referenced in the Medium post on the subject. There are detailed descriptions of how they are processed to produce the training dataset, but I couldn't find the training dataset itself.
22
Testing data sets | Available under OSD-compliant license
23
Validation data sets | Available under OSD-compliant license
24
Benchmarking data sets | Available under OSD-compliant license
25
Data card | Available under OSD-compliant license
26
Evaluation data | Available under OSD-compliant license | Allowed | Allowed | Allowed | Allowed
27
Evaluation results | Available under OSD-compliant license
28
Other data documentation | Available under OSD-compliant license
29
Code
30
Code used to perform inference for benchmark tests | Available under OSI-approved license | https://github.com/Snowflake-Labs/snowflake-arctic/blob/main/LICENSE | At least some benchmarking scripts seem to be available under Apache-2.0: https://github.com/Snowflake-Labs/snowflake-arctic/tree/main/inference/vllm/benchmarks
31
Evaluation code | Available under OSI-approved license
32
Model All model elements, including:
33
Model card | Available under OSD-compliant license | https://huggingface.co/Snowflake/snowflake-arctic-instruct | Allowed | Allowed | Allowed | Allowed | I'm linking the model card as the "legal document", since it states that the license is Apache-2.0. However, it is not completely clear whether the license applies to the model card itself: I would say it does, but I see no specific reference, since all that is available is the usual "License:" field in the model card. In any case, the freedoms analysis is based on what Apache-2.0 allows.
34
Sample model outputs | Available under OSD-compliant license
35
Model metadata | Available under OSD-compliant license
36
Other
37
Research papers | Available under OSD-compliant license
38
Technical report | Available under OSD-compliant license | Allowed | Allowed | I guess the collection of cookbooks could be considered a detailed technical report for Arctic (see reference above).