Measuring Creative Commons at Ten Years
ACKNOWLEDGEMENTS
The CC metrics project is led by Anna Daniel under direction from Cathy Casserly, and with contributions from Creative Commons Staff and Affiliates worldwide. Special acknowledgements:
- Technical expertise and server data: Nathan Kinkade and Greg Grossmeier
- Global coordination: Jessica Coates
- Platform data sourcing: Eric Steuer
- Coding: Puneet Kishor
- Additional CC network input: Iolanda Pensa
- Copy edit and social media input: Elliot Harmon
- Editorial feedback: Cable Green, Tim Vollmer, Iris Brest, Mike Linksvayer, Kat Walsh, Jane Park, Heidi Chen, Paul Stacey, Aurelia Schultz and Sarah Pearson
- Full and final edit of the metrics presentation: Sara Crouse
With thanks.
CREDITS
- Cover photo by Leon Brocard (acme) is licensed under CC BY
- Website presentation design was influenced by the Mendeley Global Report with advice from Andrew Officer of Mendeley
- Website presentation design and build by Jon Phillips and Christopher Adams at Fabricatorz.com
- First copy edit by David T. Kindler
- Mapping advice from Rebecca Shaply (via Google Fusion Table forums) and maps based on code by Chris Keller available on GitHub
Nearly all tools for analysis and preparation of this report were open source and/or available at zero cost.
Unless otherwise specified, this work is licensed by Creative Commons under a Creative Commons Attribution 3.0 Unported License. To view a copy of this license, please visit http://creativecommons.org/licenses/by/3.0/ or send a letter to Creative Commons, 444 Castro Street, Suite 900, Mountain View, California, 94041, USA.
Contents
Metrics about CC marked objects are important
Table 1: CC marked objects by general format
Table 2: Use of CC marked objects
Figure 1: Use of CC marked objects by country: the complexity users do not see
Figure 2: How free is the legal tool you use?
Table 4: Most frequently used CC legal tool by country
Table 5: Platform data approach
Figure 5: Identified sites and platforms
Figure 6: Direct hits to CC license deeds
Table 6: Direct visits to license/CC0 Public Domain Dedication deed pages
What is the impact of CC activities?
Table 7: CreativeCommons.org website popularity in context
Table 8: Which pages within the Creative Commons domain are viewed the most?
Figure 7: Location of visitors to the Creative Commons domain
Figure 8: Views to the Creative Commons website
Table 9: Direct visits to Korean translation license pages stood out in 2012
The use of CC marked objects and innovation
Figure 9: Global Innovation Index
Figure 10: Does use of CC materials correlate with Innovation?
Table 10: Nations with high Innovation Scores use ‘Free’ CC marked objects
What is the CC ecosystem and what is the influence of CC?
When people write about CC they refer to ‘free’ and ‘open’
Figure 12: The current public perception of CC is:
Figure 13: CC BY = $0 cost to reuse
Introduction
Creative Commons’ vision is nothing less than realizing the full potential of the Internet – universal access to research, education, and full participation in culture - for driving a new era of development, growth, and productivity. Achieving that vision requires an understanding of the impact of CC tools and activities.
How many CC licensed items exist? When you use a Creative Commons license, how does your work contribute to global innovation? We’ve explored these questions and think you’ll be surprised at what we’ve found.
As the pool of CC-licensed content increases in size and diversity, studying its reach is critical to understanding CC's contribution and decisions on future initiatives and investments. This metrics report is a first step towards quantifying the impact of Creative Commons.
Cathy Casserly, CEO
Anna Daniel, Analyst
On behalf of the Creative Commons team
What we saw in 2012, the year of our tenth birthday
Please read on for more details.
Our licenses form the infrastructure of the Commons and our users create the content of the vast and growing digital commons. We informally define the Commons as a pool of content resources accessible to all members of a society. In the Commons individuals may hold copyright over content, and may have reserved some rights (such as the right of attribution) but they have contributed certain other rights (for example, the right to copy, distribute, or edit) to the commons. They may also have waived all rights and dedicated the content to the public domain.
It is important to identify the size and impact of CC marked objects because:
These issues are critical to everyone interested in understanding the value of openness, not just participants in the Commons. For example, proof of the impact of open licensing may assist policymakers and legislators considering open policies, thus ensuring publicly funded resources are openly licensed so the public can access them. Because licenses[4] are at the core of CC and the open community, these data may reflect the value of investment in CC and guide future CC initiatives.
The research questions we asked and will continue to ask are:
A breakdown of these data by CC license type, jurisdiction and version is also important.
We initiated a systematic process of gathering and analyzing data, with the aim of capturing metrics before and after the launch of a new version of CC licenses (4.0 - slated for early 2013), because it represents a major intervention by CC within the Commons. The introduction of Creative Commons licenses (and new versions) into jurisdictions has historically influenced the volume of CC marked objects. We intend to measure any change after 4.0 is released. For example:
CC marked objects are materials that have been a) licensed with a CC license, b) released into the public domain using the CC0 Public Domain Dedication, or c) identified as already being in the public domain using the Public Domain Mark. Typically a CC marked object will have a license icon (badge) at the bottom of the first page, often where copyright information is also placed. When a digital work is properly marked, the license icon links to the relevant CC license deed page. This enables users to locate the work; it is also the way we have counted CC-licensed works for this project. But not all marked objects can be located this way. For instance, print works don’t contain links, and some users simply write “CC BY” or other license indicators on the work without including a link. Those works are necessarily omitted from our count.
While we cannot identify the exact number of CC marked objects with certainty, we do know that 410 million have been identified via a combination of manual counting and a sample of 11 days of data from a CC server (representative dates and dates of interest in 2012). This is just a starting point, and many platforms have not been counted, for example platforms in non-Roman character based languages (including content from China, Korea and Russia), large platforms such as the Internet Archive, and blogs.
CC licenses may be applied to any copyrightable format, material, or object, including educational resources, music, photographs, datasets, government and public sector information and many other types of creative content. The only category of works for which CC does not recommend its licenses is computer software. As can be seen below, the works we’ve identified are in a wide variety of formats, ranging from a color scheme and 3D visualizations to hardcopy books and metadata. Categorizing them has involved imperfect subjective judgments, on a best-fit basis. We were less interested in a ‘total number of objects’ than in the utility of the various licenses to different users. Some typologies emerged from the analysis, for example:
The count includes not only objects marked with a CC license but also those in the public domain (CC0 Public Domain Dedication and the Public Domain Mark). The approach and method used to obtain these numbers are described in the section ‘Background and methodology to identify the number and use of CC marked objects’.
(n=411,480,319 November 2012)
Format | Number of objects | Format | Number of objects |
Articles / Documents / Reports / Pages | 45,452,984 | Educational Courses | 18,828 |
Audio Files | 6,219,350 | Educational Resources | 2,941,899 |
Blogposts | 6,977,239 | Images | 262,638,027 |
Books | 640,965 | Journals | 1,384 |
Data Files | 66,899,129 | Mixed Media | 12,756,909 |
Datasets | 284,845 | Videos | 6,648,760 |
Source: websites, personal contacts at platforms, CC server data sample. A link to data is provided at the end of this report.
Note: to the best of our ability, and unless otherwise specified, Journals counted are not included in the Article count. Educational Resources are not included in the Educational Courses count, and an Educational Course may comprise many resources. Mixed media includes PowerPoint files and sites with a mix of media.
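As a quick consistency check, the per-format counts in the table above can be summed and compared with the stated sample size (n=411,480,319). The sketch below is illustrative only: the dictionary and variable names are assumptions introduced here, and the figures are copied from the table.

```python
# Counts copied from the table of CC marked objects by general format (November 2012).
# The names used here are illustrative, not part of the original analysis.
counts_by_format = {
    "Articles / Documents / Reports / Pages": 45_452_984,
    "Audio Files": 6_219_350,
    "Blogposts": 6_977_239,
    "Books": 640_965,
    "Data Files": 66_899_129,
    "Datasets": 284_845,
    "Educational Courses": 18_828,
    "Educational Resources": 2_941_899,
    "Images": 262_638_027,
    "Journals": 1_384,
    "Mixed Media": 12_756_909,
    "Videos": 6_648_760,
}

# Sum every format to reproduce the reported n.
total = sum(counts_by_format.values())

# Identify the largest format and its share of all identified objects.
largest_format = max(counts_by_format, key=counts_by_format.get)
largest_share = counts_by_format[largest_format] / total

print(f"total identified objects: {total:,}")
print(f"largest format: {largest_format} ({largest_share:.1%} of total)")
```

The sum matches the reported n, and images account for close to two thirds of the identified objects, which is consistent with the weight of Flickr in the sample.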
When a user correctly marks their work with CC[5], they place a ‘license icon’[6] on the page, which links to the Creative Commons server. The CC server for icons receives an average of 12-18 million requests per day, each indicating that a page with an embedded CC license, CC0 Public Domain Dedication or Public Domain Mark was accessed. This suggests roughly 5.5 billion views of pages with a CC icon per annum, although popular pages and frequent users may account for a large number of these hits. That is, some pages with a CC icon may be accessed multiple times, so this number does not represent the ‘total number of CC marked objects’ but rather the total use of pages that have been correctly marked with a CC license or CC0 Public Domain Dedication. Actual use is likely much larger, because many sites are marked manually and some CC licensed materials are used offline, so neither is counted by this method. For the purpose of this report, hits to the license deeds provide an indication of use of CC marked objects. From a metrics perspective, use is defined as viewing, reading, listening to, redistributing or linking to a CC marked object. Using the numbers above, we estimate that each object marked with a CC license, Public Domain Mark or CC0 Public Domain Dedication is used on average 13 times per annum, based on the following assumptions:
Requests for icons per day (12-18 million, est. 15 million) | 15,000,000 |
x Days per annum | 365 |
= Total requests per annum | 5,475,000,000 |
÷ Identified CC marked objects | 411,000,000 |
= Average use of an item per annum | 13 |
There are also cases of extraordinary use:
Platform | PLOS (Public Library of Science) | Khan Academy | Minecraft Wiki |
License | CC BY | CC BY-NC-SA | CC BY-NC-SA |
Total number | 50,000 articles. | 3,305 videos. | 23,831 pages including 2,390 articles using CC licenses. |
Use cases: | Each article is viewed on average 2,400 times. PLOS was a very early platform adopter of CC licenses in 2003. They initially chose CC BY and continue to use it. | Has delivered 170 million lessons, and nearly 2 million exercises are completed daily by students worldwide. Videos may be downloaded but not remixed. | Each article is viewed 44,000 times and each page has been edited on average 17 times. |
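The average-use estimate given earlier in this section (about 13 uses per object per annum) can be reproduced with a few lines of arithmetic. The sketch below is a minimal illustration, not the project's own code: the function and constant names are assumptions, and the low and high cases simply apply the same formula to the 12 million and 18 million ends of the reported daily range.

```python
DAYS_PER_ANNUM = 365
IDENTIFIED_OBJECTS = 411_000_000  # rounded count of identified CC marked objects


def average_uses_per_object(icon_requests_per_day: int) -> float:
    """Annual icon requests divided by the number of identified objects."""
    requests_per_annum = icon_requests_per_day * DAYS_PER_ANNUM
    return requests_per_annum / IDENTIFIED_OBJECTS


# Central estimate used in the report: 15 million icon requests per day.
central = average_uses_per_object(15_000_000)

# Sensitivity to the reported range of 12-18 million requests per day.
low = average_uses_per_object(12_000_000)
high = average_uses_per_object(18_000_000)

print(f"central estimate: {central:.1f} uses per object per annum")
print(f"range: {low:.1f} to {high:.1f} uses per object per annum")
```

Because the denominator only includes objects that have been identified, and the numerator only includes requests for linked icons, the result is best read as an order-of-magnitude indicator rather than a precise rate.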
Managing the Creative Commons suite of license types, versions and jurisdictions is complex, but doing so in a way that is low cost (time and funds) and straightforward for licensors and licensees is critical to the Commons. For example, someone in Thailand may view a site that includes an icon to the French deed of a CC BY-SA license and that communicates how the content on that site may be used. To the user this is a simple everyday transaction, but behind it is a complex web of porting and internationalization factors that the Creative Commons Legal, Policy and Affiliate teams actively manage, as well as issues regarding compatibility with other open licenses and much more. CC manages unported versions and 57 ported[7] local adaptations of CC licenses and hosts 550+ unique licenses across all versions. Our translation teams around the globe have completed, or are working on, translating our license deeds into 72 languages. To illustrate the complexity that CC actively manages, the chart below of views per country (yellow circles) to CC license pages (blue circles) suggests that the most popular works are marked with version 3.0 of the CC BY, CC BY-SA, CC BY-NC-ND and CC BY-NC-SA licenses, yet we see more than 700 combinations of use in 2012. This is a core element of the value CC offers: enabling ease of content sharing within a complex legal environment. Most users of CC licensed materials never see this complexity; to them, use is a simple, zero cost activity.
(January 1 - December 2012, n=17,459,627)
Source: Google Analytics and Fusion Tables
Creative Commons licenses underpin collaboration and remixing of materials, and below are some large scale examples (as of November 2012):
has since been peer reviewed, adopted over 50 times by educators, edited/revised 40 times (to remain current), and remixed by at least 16 authors. Students can download for $0 current versions of the textbook, sometimes modified to their specific course. Printed versions can be arranged for about $20 per book. Because it is public, it is open to peer review from a broad spectrum of users.
When CC licensed works are remixed and/or reused, how well are they attributed? At this time we are unable to tell what percentage of materials are correctly attributed. However, two entities that we consider to reflect best practice in attribution are:
However, these are sites with automated attribution. Individuals can see examples of correct attribution on the CC website, and CC is working towards an easier process for attribution.
Degrees of freedom in CC licenses
All CC licenses (as well as the CC0 Public Domain Dedication and the Public Domain Mark) enable legal sharing (redistribution/reproduction) of content; they are all open and all have zero financial cost. However, some licenses place more limits on sharing than others. The CC BY and CC BY-SA licenses, and the Public Domain Mark and CC0 Public Domain Dedication, are generally considered ‘free’ or ‘libre’ because they give users maximum freedom in how the content may be used (for example, commercial use is permitted and derivatives may be made). Under the 'Definition of Free Cultural Works' the other four licenses, which contain more restrictive conditions, are considered non-free. For the purpose of this report, the CC BY, CC BY-SA, Public Domain Mark and CC0 Public Domain Dedication are collectively referred to as ‘free’, and CC BY-NC, CC BY-NC-SA, CC BY-ND and CC BY-NC-ND are described as least free or non-free. ‘Free’ in this context has no relevance to financial cost: all CC legal tools are zero cost to use.
Of the major platforms we’ve identified, few elect to use a ‘ND’ license. However the volume of overall objects marked with the ‘ND’ license is comparable with the free CC licenses. We also see a high current use of non-free licensed objects across versions. If we cannot find major platforms that use blanket non-free licenses, who is using them? To explore this, platforms where users self select a license were identified: Data Basin.org, Europeana, Flickr, Freesound, Slideshare, and Vimeo. However the sample was skewed because of the inclusion of Flickr (243M) versus the total of the other five entities (3.2M). While we have only small numbers from the other platforms, they show a greater use of the CC0 Public Domain Dedication and CC BY than the Flickr results. To be conclusive however, more data is required from sites where users self-select licenses. Platforms that allow self selection of license types may help CC with future analysis by enabling visibility to a breakdown of license types (and CC0 Public Domain Dedications and Public Domain Marks) chosen by users.
Below is a case study of user self-selection in Flickr, which is the only major platform we’ve identified that allows users to select a CC license. The sample suggests normative use of licenses is towards the least free licenses.
(2006-2012)
License | Mar. 17 2006 (number) | Jun. 23 2008 (number) | Feb. 25 2010 (number) | Dec. 3 2012 (number) | Mar. 17 2006 (% of total) | Jun. 23 2008 (% of total) | Feb. 25 2010 (% of total) | Dec. 3 2012 (% of total) |
CC BY | 1,085,582 | 8,419,516 | 17,961,963 | 37,270,619 | 10.8% | 11.7% | 13.2% | 15.2% |
CC BY-SA | 801,211 | 5,740,973 | 11,761,829 | 22,171,640 | 8.0% | 8.0% | 8.7% | 9.0% |
CC BY-NC | 1,468,755 | 10,318,038 | 18,660,010 | 32,743,604 | 14.6% | 14.3% | 13.8% | 13.3% |
CC BY-ND | 317,345 | 2,796,090 | 6,137,718 | 13,404,566 | 3.2% | 3.9% | 4.5% | 5.5% |
CC BY-NC-SA | 3,241,697 | 24,315,688 | 41,621,048 | 68,743,893 | 32.2% | 33.8% | 30.7% | 28.0% |
CC BY-NC-ND | 3,169,502 | 20,373,003 | 39,507,645 | 71,425,061 | 31.4% | 28.3% | 29.1% | 29.1% |
Total | 10,084,092 | 71,963,308 | 135,650,213 | 245,759,383 |
Source: Flickr and Creative Commons
Note: unevenly spaced time series
The chart below reflects the rapid growth of Flickr as a platform over the last seven years and highlights the weight of least-free licenses in the breakdown by license type:
(volume, 2006-2012)
Source: Flickr and Creative Commons
If we look more closely at the breakdown of license types over the four periods, we see a slight pattern:
(percent, by license type for four dates)
Source: Flickr and Creative Commons
Figure 4 shows that 24% of CC licensed Flickr photos currently carry free CC licenses (CC BY and CC BY-SA[8]), compared with just under 20% in June 2008. This suggests a slow shift towards marking with free licenses, as also seen in a 2010 Flickr study. We hypothesize that users may transition to free CC licenses by initially choosing least free licenses and over time progressing towards CC BY. This is not currently discernible from the data because we cannot distinguish between established users and new adopters on platforms where users have a choice of license.
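The share of free licenses (CC BY plus CC BY-SA) at each of the four dates can be recomputed directly from the Flickr counts in the table above. The following is a minimal sketch: the data structure and names are assumptions made for illustration, and the counts are copied from the table.

```python
# Flickr counts per license, copied from the table above (four unevenly spaced dates).
flickr_counts = {
    "Mar. 17 2006": {
        "CC BY": 1_085_582, "CC BY-SA": 801_211, "CC BY-NC": 1_468_755,
        "CC BY-ND": 317_345, "CC BY-NC-SA": 3_241_697, "CC BY-NC-ND": 3_169_502,
    },
    "Jun. 23 2008": {
        "CC BY": 8_419_516, "CC BY-SA": 5_740_973, "CC BY-NC": 10_318_038,
        "CC BY-ND": 2_796_090, "CC BY-NC-SA": 24_315_688, "CC BY-NC-ND": 20_373_003,
    },
    "Feb. 25 2010": {
        "CC BY": 17_961_963, "CC BY-SA": 11_761_829, "CC BY-NC": 18_660_010,
        "CC BY-ND": 6_137_718, "CC BY-NC-SA": 41_621_048, "CC BY-NC-ND": 39_507_645,
    },
    "Dec. 3 2012": {
        "CC BY": 37_270_619, "CC BY-SA": 22_171_640, "CC BY-NC": 32_743_604,
        "CC BY-ND": 13_404_566, "CC BY-NC-SA": 68_743_893, "CC BY-NC-ND": 71_425_061,
    },
}

# Licenses this report treats as 'free' among those offered on Flickr.
FREE_LICENSES = {"CC BY", "CC BY-SA"}


def free_share(counts: dict[str, int]) -> float:
    """Fraction of CC licensed items that carry a free license."""
    free = sum(n for lic, n in counts.items() if lic in FREE_LICENSES)
    return free / sum(counts.values())


free_shares = {day: free_share(counts) for day, counts in flickr_counts.items()}

for day, share in free_shares.items():
    print(f"{day}: {share:.1%} free")
```

Computed this way, the free share rises at every one of the four dates, which is the slow shift described above; the series is unevenly spaced, so the steps should not be read as annual changes.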
To further understand where the least free licenses are being used, we identified countries where their use is prominent. Of the top 36 countries where CC licensed materials were used (by volume, January 1 - December 31 2012), the most frequently used CC marked objects were:
(volume of use per country, 1 January 2012 - 31 December 2012, n=21,124,584)
Source: Google Analytics and CC Analysis
The above table will be discussed in more detail in the section titled ‘The Use of CC marked objects and innovation’.
Background and methodology to identify the number and use of CC marked objects
We approached this using two methods, a top-down and a bottom-up investigation:
1. nominating key platforms / sites that contain CC marked materials for deeper analysis; and
2. using data from CC servers that measured hits (requests) for images of CC icons. A hit occurs each time a page with an icon is opened, regardless of whether the viewer clicks the icon.
Research question | Indication | Using data |
What’s the number and growth rate of CC marked objects on identified platforms? | Marking - applying a CC license (or CC0 Public Domain Dedication) to content | Time series of CC marked works by Jurisdiction, License type (& CC0 Public Domain Dedication), License version, Domain (site) |
What’s the use of CC marked materials on the site? | Use | Time series of use of CC marked works by Jurisdiction, License type (& CC0 Public Domain Dedication), License version, Domain (site) |
We aspire to answer: What’s the re-use of CC marked materials on the site? | Re-use, Remix | Time series of remix of CC marked works by Jurisdiction, License type (& CC0 Public Domain Dedication), License version, Domain (site) |
We aspire to explore how well the terms of CC licenses are complied with. Of re-use we identify, how much comes with proper attribution? | Best practice attribution | To be explored |
How many people are impacted by CC licenses? | Number of users of CC marked content | Number of users of largest sites holding CC content - as indicators |
Initial nominated sites included:
The CC content directories wiki and staff recommendations provided a starting point for identifying major platforms of CC marked content. The initial intention was that only 20 major sites would be analyzed, but the list grew to 289 as more platforms and then sites of interest were identified. The collation and reporting of metrics will be dynamic and ongoing, so we expect this number to grow as we continue to identify sites. At time of writing platforms were still being added to the list, and identifying, reviewing and collating materials via platforms was the most time consuming aspect of this project:
Sources: Wordle and CC Analysis
This is merely a start towards identifying platforms and sites that include CC marked objects, and there is still much work to do. Many large repositories of Creative Commons materials have not been included, for example platforms in non-Roman character based languages (including content from China, Korea and Russia) and large platforms such as the Internet Archive[9]. The total number of OER platforms and resources, articles, journals and blogs has also not been adequately captured. The reasons for this are discussed in the notes to this report.
Data on the number of materials were manually obtained within each site or via professional contacts at those sites (where not publicly available). In roughly 22 percent of sites identified, a count of CC marked objects could not be found or estimated with confidence so they were left blank. Once obtained, the data was aggregated by format (e.g. total number of CC marked photos across Flickr, Wikimedia Commons, CAPL etc., total number of data files where identified within datasets), although educational resources (in various formats) were counted separately because they tend to exist within OER specific platforms. Aggregation is challenging when dealing with a count of more than 410 million items from disparate sources, and the result is imperfect.
Key sites for deeper analysis were selected on the basis that: they contain predominantly CC marked materials; are large; obtaining license data is relatively straightforward; and are new/ anticipated fast growth sites. Where data is available it was recorded at the highest granularity (article, blog post, image, video, song, data entity) possible. Some sites provide a clean, deeper level of data, for example, CCMixter data clearly indicates remixing and PLOS articles are all CC BY and provide rich data on article use.
CC does not control how CC licenses are used; they are used in many different ways and sometimes not marked correctly. Sites where objects were marked incorrectly were included in the count where the intention appeared to be to use a Creative Commons license. For example sometimes sites would be marked with a CC license but also carry an All Rights Reserved Copyright notice. However we omitted marked computer software from this count, because CC does not recommend its licenses for software. We will continue to work closely with key sites to educate users on marking content with CC licenses.
Objects marked with a CC0 Public Domain Dedication were included in the total count, as they have been in previous CC metrics efforts. These include roughly 17 million datasets, 3.5 million data files and half a million images. Note that these are not CC licensed objects; they are objects that have been dedicated to the public domain by a waiver of copyright.
Treatment of bibliometric records
We identified more than 280 million bibliometric records and resource description frameworks (for example in library catalogs) but they were counted as 17 CC marked objects. For example, Europeana this year dedicated a dataset of 20 million bibliometric records to the public domain (CC0 Public Domain Dedication) and it has been included as one in the count. Europeana also released - with a CC0 Public Domain Dedication - a set of 3.68 million objects (images, audio files and videos) and these were included as 3.68 million in the count. When the items were simply descriptive of books or other components of a collection, we treated the collection as a database[10] and counted it as a single object, but when the objects themselves were made available, we counted each of them.
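The counting rule described above can be expressed as a small function: a release that only describes items (metadata records) counts as one object, while a release of the objects themselves counts each object. This is a hedged sketch of that rule using the two Europeana examples from the text; the `Release` class, its field names and the release labels are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Release:
    """A CC marked release as treated in the count (hypothetical structure)."""
    name: str
    items: int
    objects_available: bool  # False when the release only describes items (metadata records)


def counted_objects(release: Release) -> int:
    """Apply the rule: descriptive collections count once, available objects count individually."""
    return release.items if release.objects_available else 1


# The two Europeana releases mentioned in the text, both under CC0.
releases = [
    Release("Europeana bibliometric records", 20_000_000, objects_available=False),
    Release("Europeana images, audio files and videos", 3_680_000, objects_available=True),
]

for release in releases:
    print(f"{release.name}: counted as {counted_objects(release):,}")

europeana_total = sum(counted_objects(r) for r in releases)
print(f"total counted: {europeana_total:,}")
```

Treating a descriptive collection as a single database keeps very large catalogs from dominating the total, at the cost of understating the number of individual records released.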
Server data
We sought to measure hits to the Creative Commons server whenever a page with a license (or CC0 Public Domain Dedication or Public Domain Mark) icon was accessed. This does not identify ‘new’ works or a newly marked object; it identifies works via use of those works. The sample from 11 days (of hits to our server whenever a page with an icon was opened) provided valuable insights.
The sample data was useful for three reasons:
The sample set of dates was chosen taking into account cycles of CC use. We see cycles in the volume of views to the website via Google Analytics, although this is a different kind of use from requests for license icons from the CC server. Google Analytics indicates the CC website has a strong weekly cycle of high use (Monday to Friday) that drops off on weekends. Use also drops off dramatically on December 25-31, although other holidays throughout the year do not seem to have a notable effect (about 23% of hits to the website come from the U.S. and we also assessed holidays in major regions worldwide). CC website views trend down in May-July and then upwards again in August. This pattern may be typical of most Internet sites, and we selected the sample dates after consideration of these cycles. We used: the last day of each month from January to September 2012 (nine days); 13 and 14 July 2012; and 15 July 2012 until midday. The mid-July dates were included because during this time there was an anomalous spike in hits not seen in prior years that warranted examination.
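For readers who want to reproduce the sample window, the dates described above can be generated programmatically. This is an illustrative sketch using only the Python standard library; the variable names are assumptions, and the partial day (15 July, sampled until midday) is flagged only in a comment.

```python
import calendar
from datetime import date

YEAR = 2012

# Last day of each month, January to September 2012 (nine days).
# calendar.monthrange returns (weekday of first day, number of days in month).
month_ends = [
    date(YEAR, month, calendar.monthrange(YEAR, month)[1])
    for month in range(1, 10)
]

# Mid-July dates of interest; 15 July was sampled only until midday.
spike_days = [date(YEAR, 7, day) for day in (13, 14, 15)]

sample_dates = sorted(month_ends + spike_days)

# Print each date with its weekday, since website use follows a weekly cycle.
for d in sample_dates:
    print(d.isoformat(), d.strftime("%A"))
```

Listing the weekday next to each date makes it easy to see which sample days fall on weekends, when website use is lower.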
Despite fluctuations from cycles of use, hits to the CC license (and CC0 Public Domain Dedication and Public Domain Mark) deed pages and the Creative Commons website are trending up. As an indication, the chart below shows hits to the top four license deed pages (CC BY, CC BY-SA, CC BY-NC-ND and CC BY-NC-SA) by volume. Hits to the free licenses (CC BY and CC BY-SA) are trending steadily higher. Note that this represents use of CC marked objects, as opposed to the number of CC marked objects. For example, this figure includes duplicate views of the same object.
(monthly January 2009 - December 2012, n=56,792,672)
Source: Google Analytics sample.
Note: use trends down annually over the December holiday season.
In the table below, ‘direct visits’ suggest viewers clicked on an attribution link that led them directly to that license page (reflecting the use of CC licenses) and then left the domain. They did not click through to the legal code.
(total, 1 January - 31 December 2012, s=30,462,293)
Legal tool | Page views | Landing pages (direct visits) | Direct % of Total (%) |
CC BY-SA | 6,950,815 | 5,961,711 | 86 |
CC BY | 7,064,929 | 5,681,872 | 80 |
CC BY-NC-ND | 5,929,470 | 4,984,796 | 84 |
CC BY-NC-SA | 6,614,704 | 4,655,874 | 70 |
CC BY-NC | 1,729,813 | 1,469,498 | 85 |
CC BY-ND | 970,051 | 809,403 | 83 |
CC0 Public Domain Dedication | 392,238 | 339,336 | 87 |
Public Domain Mark | 810,273 | 726,254 | 90 |
Source: Google Analytics
Note: Includes duplicate hits from same referring URI.
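The ‘Direct % of Total’ column in the table above is the number of landing-page (direct) visits divided by total page views for each legal tool. The sketch below recomputes that column and the sample size from the table's figures; the dictionary layout and names are assumptions made for illustration.

```python
# (page views, landing pages / direct visits) per legal tool, copied from the table above.
deed_stats = {
    "CC BY-SA": (6_950_815, 5_961_711),
    "CC BY": (7_064_929, 5_681_872),
    "CC BY-NC-ND": (5_929_470, 4_984_796),
    "CC BY-NC-SA": (6_614_704, 4_655_874),
    "CC BY-NC": (1_729_813, 1_469_498),
    "CC BY-ND": (970_051, 809_403),
    "CC0 Public Domain Dedication": (392_238, 339_336),
    "Public Domain Mark": (810_273, 726_254),
}

# Direct visits as a whole-number percentage of page views.
direct_pct = {
    tool: round(100 * landing / views)
    for tool, (views, landing) in deed_stats.items()
}

# Total page views across all deed pages (the table's sample size).
total_views = sum(views for views, _ in deed_stats.values())

for tool, pct in direct_pct.items():
    print(f"{tool}: {pct}% direct")
print(f"total page views: {total_views:,}")
```

The recomputed percentages and the total agree with the published table, which suggests the column was derived in this way.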
Impact, or added value, is difficult to quantify because it is a long-term measure and may result from a variety of causal factors. However we assume that a byproduct of our activities is increased views to the CC website. Website use may reflect: increased interest and awareness of CC; outcomes from campaigns; and longer-term growth in the CC profile and use of CC marked objects. The Creative Commons website continued to enjoy stable but cyclical growth in use through 2012. This next section explores use of the CC website.
CC does not pay to direct traffic to our website (e.g. via advertising, SEO etc.), yet the CC website is ranked among the 3,000 most popular websites worldwide (Alexa 2012). According to Alexa it is the number one ranked site worldwide in the category of intellectual property (December 2012). To provide context, below are some randomly nominated sites from the fields of intellectual property and creative industries:
(we’re much bigger than Bieber)
Site | Rank | Site | Rank | Site | Rank |
YouTube | 3 | Creative Commons | 2,887 | WIPO | 12,071 |
Wikipedia | 6 | AllKPop.com | 4,666 | Electronic Frontier Foundation | 28,728 |
Internet Archive | 219 | GNU.org | 5,922 | OpenSource.org | 40,041 |
McGraw-Hill | 2,809 | Chilling Effects | 7,690 | Justin Bieber Zone | 59,104 |
Source: Alexa.com Dec. 12 2012
Views to the license deed pages consistently dominate use of our website, and this reflects use of CC marked objects described previously. An upgraded version of the CC Chooser was released in late 2012 and time spent on the Chooser decreased in the last two months of 2012 from about 2 minutes to 1.24 minutes. This suggests it has streamlined the process of choosing a CC license. The CC10 site was specially created as part of the Creative Commons tenth birthday celebrations.
(total, January 1 - December 31 2012, s= 54,189,261)
Section of the CC domain | Total page views | Average time on page (minutes) |
License pages | 26,362,293 | 2.37 |
Chooser | 12,710,149 | 1.24 |
Search | 10,849,322 | 3.08 |
Home page | 4,061,776 | 1.57 |
Support | 205,721 | 0.38 |
CC10 special site |
Source: Google Analytics
Most website managers aim to keep viewers on a site for as long as possible and minimize ‘bounce rates’ (people leaving the domain immediately after clicking a page). Creative Commons is more nuanced in that (a) our license deeds are the highest use pages and (b) a goal is to make the license deeds simple to understand in plain language (‘human readable’) and the use of our license chooser tool is simple and relatively quick. We aim to minimize the transactional time cost of CC licensing.
As discussed previously, in 2012 viewers spent on average 1.24 minutes on the license chooser page and about two and a half minutes on a license deed page, and then typically left the site. The site has an average bounce rate of 77 percent (Google Analytics). Creative Commons staff debated this result, which suggests the need for user experience studies. For example:
- do users read the license, understand it (did not click to the legal code or elsewhere) then leave? Is the process fast and simple to understand (user friendly)? or
- do users click on it, think ‘what is this?’ and then their attention quickly moves elsewhere?
The result highlights that the search page is used heavily, and perhaps CC should focus on improving the ability for users to find CC licensed materials via the search page. For example, there are now hundreds of sites containing CC licensed materials, and the ability for users to meta-search across sites (for example search across sites of open educational resources, or of audio files) may improve the search experience and cut the time to find CC licensed materials.
In the 2012 calendar year the Creative Commons domain (including license deed pages) was viewed more than 27 million times. Use is worldwide: by country, 23 percent of views came from the United States, followed by Germany, Spain and South Korea with five percent each (Google Analytics). On a city basis, the highest volume of hits to the CC domain during the year came from Seoul (475,459 hits) and London (315,888 hits). It is no surprise that use of CC marked objects is high in Seoul, given its high internet density.
(total, January 1 - December 31 2012, n= 27,256,878)
Source: Google Analytics
Note: sample number differs because Table 8 includes other pages that are measured separately.
What causes spikes in use?
Events and activities typically drive spikes in views to our website. As mentioned, the license pages score the highest number of hits because they are tied to use of CC materials. On July 13-15 there was a significant spike in hits to the website, which we attribute to views of the Korean translation of the CC BY-NC-ND license deed. We hypothesize that an event (an online game or media broadcast?) on those dates drove people to open a page with a CC license icon, but we cannot confirm this.
(January 1 - December 31 2012, n=43,799,001)
Source: Google Analytics
(January 1 - December 31 2012, s= 2,341,901)
License type | Direct views | URL |
CC BY-NC-SA 3.0 in Korean | 1,132,302 | http://creativecommons.org/licenses/by-nc-sa/3.0/deed.ko |
CC BY-NC-ND 2.0 Korea | 808,335 | http://creativecommons.org/licenses/by-nc-nd/2.0/kr/ |
CC BY 2.0 Korea | 201,588 | http://creativecommons.org/licenses/by/2.0/kr/ |
CC BY-NC-SA 2.0 Korea | 199,676 | http://creativecommons.org/licenses/by-nc-sa/2.0/kr/ |
Source: Google Analytics
What is the impact of the use of CC marked objects?
The value of CC licenses includes:
This metrics report cannot canvass all values at once, and a body of evidence already supports points 1-6. It will therefore focus on innovation, because substantiating the broader impact of CC licensing and activities ties in with the aims of this report, namely:
and because innovation[11] is a driver of economic and social impact.
The World Economic Forum (2012) nominates capacity for innovation - especially in the field of knowledge - as a key pillar of global competitiveness. Experts at the Brookings Institution (2012) also identify conditions that promote innovation - such as improved knowledge transmission by increased use of CC licenses - as being key to stronger economies. Enabling the inexpensive dissemination of knowledge addresses a fundamental issue of economics identified by Arrow (1962, 614-15): that knowledge is a market with a high degree of uncertainty. If the cost of transmitting information were zero, “then optimal allocation would obviously call for unlimited distribution of the information without cost.” With digital dissemination and Creative Commons licenses, information is no longer an ‘indivisible commodity’, the problems of inefficient allocation are removed and owners no longer need to exercise monopoly rights. Benefits include reduced risk and optimal resource allocation due to less uncertainty and greater efficiency.
INSEAD (a leading Business School) publishes the Global Innovation Index (GII) and 2012 marked the first year of collaboration on the report with the World Intellectual Property Organization (WIPO). The GII recognizes the key role of innovation as a driver of economic growth and prosperity and acknowledges the need for a broad horizontal vision of innovation that is applicable to both developed and emerging economies, with the inclusion of indicators that go beyond the traditional measures of innovation (such as the level of research and development in a given country). It has been reported annually for five years and has evolved into a valuable benchmarking tool whereby policymakers, business leaders and other stakeholders can evaluate progress (Dutta 2012). This year it noted a new metric is being considered for their input index indicators: “the size of the public domain and the availability of materials where transaction costs are near zero — such as works licensed under Creative Commons” (Dutta 2012, 158). While it is currently impossible to identify the total number of works per country, we can see use of CC marked objects within a country.
The index scores countries on total innovation which comprises innovation inputs and innovation outputs. Factors are summarized below:
Source: Global Innovation Index
The innovation output sub-index measures elements that are the result of innovation within an economy, such as knowledge impact and diffusion, creative goods and services, and creative intangibles. Efforts made on enabling environments result in increased innovation outputs (INSEAD 2012, 16). For example, open licenses enable an environment of sharing, reuse and remixing of creative goods and services. The innovation output sub-index may therefore appropriately be compared with use of CC marked objects within countries to explore any correlation between use of CC marked objects and innovation.
There is a correlation between high use of CC marked objects and Internet density per country. Clearly as Internet use increases so should use of CC marked objects. But does high use of CC marked objects in a country correlate with high innovation outputs? If so, can this index be used as a benchmark to gauge longitudinally the impact of Creative Commons?
To answer that, we identified the top 25 countries for CC use in terms of volume of hits to our website, which is the best sample set available to represent volume of use of CC marked objects. The total number of uses was then divided by the number of internet users in each country to create a per capita ratio. Note the ratio is small: between 0% and 3.5%. It was then correlated with each country’s Innovation Output score. The Innovation Output score is a number between 0 and 100, and the 2012 spread was from 10.3 (Sudan) to 68.5 (Switzerland). In the correlation (see below) there are clearly outliers, but the results suggest use of CC marked objects is in some way in sync with innovation outputs. Of the top 25 countries recording the highest use of CC marked objects, 17 are amongst the most innovative countries in the world, and of those the most prevalent license type used is a free license (CC BY or CC BY-SA) or the CC0 Public Domain Dedication. However, this is tentative and we cannot clearly identify the causal factor.
(Use of CC marked objects per Internet user (January 1 - December 31 2012) versus Innovation Index Output score (2012) per country)
Sources: Google Analytics, Wikipedia and 2012 Global Innovation Index
Note: data link available in Sources section.
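The per-user ratio and correlation described above can be illustrated with a minimal sketch. The report does not state which correlation statistic was used, so the Pearson coefficient here is an assumption, and every figure in `sample` is a hypothetical placeholder rather than data from this report. The names `per_user_ratio` and `pearson` are illustrative only.

```python
from math import sqrt

# Hypothetical placeholder figures, NOT values from this report:
# (country label, hits to CC pages, internet users, Innovation Output score 0-100)
sample = [
    ("Country A", 900_000, 30_000_000, 60.0),
    ("Country B", 400_000, 40_000_000, 45.0),
    ("Country C", 150_000, 50_000_000, 30.0),
    ("Country D", 700_000, 35_000_000, 50.0),
]


def per_user_ratio(hits, internet_users):
    """Hits to CC pages divided by the number of internet users."""
    return hits / internet_users


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences.

    Assumption: the report does not name its correlation method.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


ratios = [per_user_ratio(hits, users) for _, hits, users, _ in sample]
scores = [score for _, _, _, score in sample]
r = pearson(ratios, scores)
print(f"Pearson r = {r:.3f}")
```

A coefficient near 1 would indicate that the two measures rise together; as the text notes, it would not identify a causal factor.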
The proposed inclusion in the Innovation Index of a new indicator, ‘the number of open licensed materials per country’, has merit, although use per capita may be more feasible to measure. The number of CC marked objects is one indicator, as are other open licenses such as open source software licenses. To some extent this is already being captured by the inclusion of Wikipedia edits and YouTube uploads among the input indicators, but these do not fully capture the use of open licenses within creative industries. Capturing the number of open policies within countries may be another useful indicator. Creative Commons and partners are taking steps to develop a global network around open policies, and part of this effort may include identifying existing policies across institutions, governments and other entities. Note we are not inferring any causal relationship between open licenses and innovation - merely a correlation - and that it merits further analysis, as described in the next section.
Does the type of CC license correlate with innovation in countries?
To begin to answer this question, the total number of uses of CC marked objects (via volume of hits to license deed pages) was collated for January 1 - December 31 2012. From the list, the most prevalent license type used in each country was identified and categorized as either free or non-free: CC BY and CC BY-SA licenses and the CC0 Public Domain Dedication were categorized as free, whereas the four other licenses place more restrictive conditions upon use so were categorized as non-free. See Table 4 previously for details.
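As a rough illustration of this categorization step, the sketch below picks the most-viewed license per country and labels it free or non-free. The free/non-free split follows the text; the row layout `(country, license_type, hit_count)`, the function names and the sample rows are hypothetical and not taken from the report's data.

```python
from collections import Counter

# Free / non-free split as described in the text.
FREE = {"CC BY", "CC BY-SA", "CC0"}
NON_FREE = {"CC BY-ND", "CC BY-NC", "CC BY-NC-SA", "CC BY-NC-ND"}


def categorize(license_type):
    """Label a license type as 'Free' or 'Non-Free'."""
    if license_type in FREE:
        return "Free"
    if license_type in NON_FREE:
        return "Non-Free"
    raise ValueError(f"Unknown license type: {license_type}")


def most_used_by_country(rows):
    """Return {country: most-viewed license type}.

    rows: iterable of (country, license_type, hit_count) tuples
    (a hypothetical layout chosen for this sketch).
    """
    totals = {}
    for country, license_type, hits in rows:
        totals.setdefault(country, Counter())[license_type] += hits
    return {c: counter.most_common(1)[0][0] for c, counter in totals.items()}


# Hypothetical placeholder rows, not figures from this report.
rows = [
    ("Country A", "CC BY", 500),
    ("Country A", "CC BY-NC-ND", 200),
    ("Country B", "CC BY-NC-SA", 900),
    ("Country B", "CC BY-SA", 300),
]
prevalent = most_used_by_country(rows)
labels = {country: categorize(lic) for country, lic in prevalent.items()}
print(labels)
```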
Taking the top 25 countries where CC licensed materials are used, we created three tiers - those with high, medium and low Innovation Scores. We then identified the prevalent CC license type (or CC0 Public Domain Dedication) and relative rate of CC licensed materials use. With the exception of the Republic of Korea (see note), use of free licenses correlates with innovative countries:
High Innovation Scores | Australia | Korea (Rep.) | Germany | Canada | United States | Netherlands | United Kingdom | Sweden |
Most used CC mark is: | Free | Non-Free | Free | Free | Free | Free | Free | Free |
Hits to CC deed pages per internet user is: | High | High | Mid | High | High | High | Mid | High |
Sources: Google Analytics, Wikipedia and 2012 Global Innovation Index
Notes:
1. In 2012 measurement changes impacted the rankings of Japan and the Republic of Korea and the report advises further analysis is required. Secondly, there was an anomalous spike in hits to the Korean CC license deeds this year, as described earlier in this report.
2. Between July 13-15 2012 there was a spike in use in relation to the translation of the Korean version of CC license deeds. This anomaly may have influenced these results.
3. Taiwan is one of the top 25 ‘countries’ for use of CC marked objects, however the Global Innovation Index does not separately measure Taiwan, so it could not be included.
If each individual, group, or region of the world existed in a silo, without the ability to build on the ideas, creations, knowledge and discoveries of others, what would the world look like? How would we learn, create, and collaborate without connecting these silos? For one, the Internet would not exist as we know it today. Each silo would produce only replicas of its first works – copies. Creative Commons ambitiously aims to open silos, and cultivate an online culture that is rich, diverse, and open for the benefit of all. We believe that progress, innovation, and good things happen when creators – those who grow and add to the global Commons – share, comment on and remix each other’s work to yield even better returns. They then feed back in new ways into the culture and ecosystem from which they came.
Tim O'Reilly spoke about the Clothesline Paradox at an expert panel in April 2012. The clothesline paradox highlights that certain activities are hard to measure and count in the economy, but have a major impact nonetheless. We believe our professional networks and associated activities within those networks generate intangible value, but to date they have not been identified or measured. The next section describes the effort undertaken in 2012 to explicitly show our networks, and ultimately we aim to explore the impact of these networks on our activities and our influence within them. Every individual, institution, and enterprise has the opportunity to enrich the first-ever globally shared public space – the Internet – and to be recognized for his or her contribution to this open ecosystem.
We initially define the ecosystem as the network in which CC operates. CC works within an environment over which we have little control. Events arise from the fields of technology, society and non-users of CC licenses, and economic, regulatory and environmental influences. For example: new regulation may be proposed, new technologies may be introduced, and economic downturns may turn attention to the wealth of quality zero cost open content. CC may have some influence over: licensing of digital content; users of CC licenses; our Affiliates and the digital commons; and the technical infrastructure we use. We have a high degree of control over: our internal processes; our activities; how we communicate and promote our work; and our suppliers.
We sought - and continue to seek - to explore “how would the Open ecosystem look if CC didn’t exist? How would the ecosystem be affected without CC?” Secondly, explicitness about the CC network may provide a foundation for ongoing analysis and assist policy making by:
Ultimately we may show how CC licenses and activities have facilitated growth in the value of the global Commons.
As a first step towards exploring our influence in the open ecosystem we needed to identify the reach of our networks. Creative Commons has 100+ affiliates working in over 70 legal jurisdictions to support and promote CC activities around the world. As seen in Figure 1, our licenses are applied to content that is used worldwide, and CC teams have completed, or are working on translating our license deeds into 72 languages. To begin this project, CC staff and Affiliates brainstormed to build a list of 1200 professional contact entities and the entities were informally grouped into categories. This process was voluntary so results are not wholly representative. Several categories were nominated but they were refined into: Education, Funding Partners, GLAM (Galleries, Libraries, Archives and Museums), Legal, Policy, Science, and Technology. From this we identified physical addresses of the entities and mapped the network. The map below of links between CC teams and entities shows the digital world is neither constrained by geography, nor evenly connected. The map may be sorted by category (what contacts do we have in the education field?) and by team (how extensive is the reach of the CC Italy Affiliate team?).
It is a start, and may be used to formulate ongoing analysis concerning:
The map will develop iteratively according to need. Ideally it is best viewed online but below are some screenshots of the map by category:
Funding Partner | Policy |
Galleries, Libraries, Archives and Museums | Science and Data |
Education | Technology |
Also highlighted is a large project underway amongst Affiliates in Italy and Africa[12]:
WikiAfrica Share Your Knowledge Project
CC aimed to explore how the public perceives us to ensure we are communicating clearly our mission. Over time we have seen the term ‘Creative Commons’ informally used in at least three different ways within the context of: licenses; concepts of open and the Commons; and the entity Creative Commons. But to what extent is the current public perception of Creative Commons about licenses versus ‘open’ or the entity itself? Does one perception dominate others? Are these terms used interchangeably? Or is there another perception of CC? Ultimately public perception influences the (intangible) brand value of CC, and is a measurable indicator.
The challenge in identifying public opinion about Creative Commons is to distinguish opinion from license notices and attributions whenever Creative Commons is mentioned. Keyword searches of news, blogs and other sources return results that include licenses and attributions, which need to be removed. Twitter, however, does not include licenses, so it provides a clean sample set for analysis. To identify current opinion we harvested roughly 800 tweets posted during November 1-20 2012 that included the hashtag #CreativeCommons. We filtered these to identify the descriptive contexts in which CC is mentioned. The predominant descriptive words associated with #CreativeCommons were free, open and license.
(n=800)
Note: the size of word reflects volume of use.
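The filtering behind a word cloud of this kind can be sketched as a simple word-frequency count. This is only an illustration under stated assumptions: the report does not describe its filtering rules, so the stopword list, the removal of URLs and @mentions, and the three sample tweets below are all hypothetical.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not the report's list).
STOPWORDS = {"a", "an", "and", "the", "for", "to", "is", "this",
             "under", "of", "in", "on"}


def descriptive_words(tweets, hashtag="#creativecommons"):
    """Count the words that accompany a hashtag across a list of tweets."""
    counts = Counter()
    for tweet in tweets:
        text = tweet.lower().replace(hashtag, " ")
        text = re.sub(r"https?://\S+", " ", text)  # drop links
        text = re.sub(r"@\w+", " ", text)          # drop @mentions
        # [^\W\d_]+ matches runs of letters, including accented ones.
        for word in re.findall(r"[^\W\d_]+", text):
            if word not in STOPWORDS:
                counts[word] += 1
    return counts


# Hypothetical sample tweets, not taken from the harvested set.
tweets = [
    "Free and open music under a #CreativeCommons license http://example.org/a",
    "Thanks @someone for the open textbook! #CreativeCommons",
    "Is this photo free to reuse? #creativecommons license says yes",
]
counts = descriptive_words(tweets)
print(counts.most_common(3))
```

In a word cloud, the size of each word would then be scaled by its count.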
This doesn't answer definitively the question: when people refer to CC do they generally mean the entity, licenses or 'open'? There is no one dominant signal, but ‘free’ was the most frequently used word. The word 'free' carried three meanings in equal numbers of mentions: open (non-proprietary, libre), zero cost and 'Free Bassel'. This may be tracked periodically to measure and manage changes in public perception.
In a small way this finding is in sync with theories of self-regulating governance of the commons (Ostrom 1990, Hess & Ostrom 2003). Frequently used keywords in the tweets included: ‘please’, ‘thanks’, ‘help’, ‘support’, ‘aportes’ (Spanish for ‘contributions’) and other expressions that suggest peer pressure incentivizes self regulation, or “ordinary people are capable of creating rules and institutions that allow for the sustainable and equitable management of shared resources.” Caterina Fake in 2010 noted the “Internet was premised on this culture of generosity,” and as the web grew, it became clear that the traditional rules of copyright did not transfer well to new ways of content use in the digital realm. Somewhere along the way, this culture of generosity got “lost in lockdown.” It is reassuring to see that within the context of Creative Commons the spirit of generosity, openness, helping, contributing and support persists.
We use a combination of methods to measure our reach and influence over social media, including the Hootsuite and Thinkup social analytics tools, and Google Analytics. Creative Commons’ online reach is continually expanding, with our Twitter followers and Facebook Likes each increasing by approximately 3000 a month.
A list of our most influential Twitter followers (measured by a combination of each user’s influence and her consistent engagement with Creative Commons on Twitter) reveals both an engagement with influencers in fields we work in and an engaged international staff.
1. Ari Juliano Gema (lawyer, CC Indonesia lead)
2. Donatella Della Ratta (journalist, CC Arab World coordinator)
3. Joi Ito (MIT Media Lab director, CC board member)
4. Free Music Archive
5. Bad Panda Records
6. Stephanie Terroir
7. Carolina Botero (CC Latin America coordinator)
8. Peer 2 Peer University
A list of our most retweeted tweets of 2012 demonstrates our audience’s engagement in our key program areas[13].
Storytelling is often an effective way of conveying impact. CC has heard hundreds of stories of how CC has helped people, communities and economies. Below is a small selection we’ve heard through the year.
Textbooks may be printed and distributed at nominal cost
In Poland and South Africa students may download textbooks (that include videos and presentations) for free. In Poland the government will provide free digital textbooks to all primary schools as part of a ‘Digital Schools’ program. The project is currently being piloted amongst 380 schools and also aims to create a national repository of training materials. The South African government has printed and distributed more than 2.4 million copies of Siyavula textbooks at a nominal cost of roughly 40 rand, or 26 percent of the price of a traditional textbook. Both initiatives use CC licenses. Similar initiatives are underway in British Columbia (Canada), California (USA) and Latin America.
| “Free and open textbooks are what parents and teachers demanded for years, now we will be able to observe how they will work in practice.” Kamil Śliwowski, education lead of Creative Commons Poland |
CC marked objects cost zero to reuse
The U.S. Department of Labor TAACCCT program is a high profile four year effort to create new Community College professional training programs. Materials created via TAACCCT funding will be CC BY licensed and so will be available for reuse and remixing by anyone anywhere at $0 cost, with attribution to original authors. Potential impact indicators include: increased availability of courseware worldwide; cost savings for students; reduced teacher/faculty preparation time; enhanced course quality; collaborative innovations in course creation and learning. Creative Commons has established a framework to measure the impact of CC BY licensing on content created via this program. We aim to identify global OER benchmarks for comparison, although nothing on this scale has been implemented before.
51 Community College consortia across the United States are participating in the first wave of grants. Many consortia comprise multi-state collaborations of Community Colleges. The second wave of funding (in October 2012) comprised 27 awards to Community College and University consortia, covering 297 schools and totaling $359,237,048, and another 27 awards to individual institutions totaling $78,262,952. A further 25 grants are to be announced in the second wave of funding.
Source: Twitter
A quality textbook can be produced within 48 hours.
Collaborative hackathons to create content quickly are no longer the domain of coders and geeks. In Finland a team of 30 enthusiasts, mathematicians and Professors got together in September 2012 to write a maths textbook in an intense weekend of collaboration. Using the principles of a ‘hackathon’ they called the event Oppikirjamaraton (“textbook marathon”). How did the weekend go? Organizer Joonas Mäkinen described it as: “the immediate physiological response after finishing the marathon on Sunday was euphoria. Everyone agreed immediately to organize another sprint.” As at January 2013, the textbook is in use in Finnish secondary schools and plans for further textbooks are underway. The ‘hackathon’ approach to content creation is increasingly being used in other fields. The ‘textbook’ was then made openly available at zero cost on Github for others to review, use and remix.
“We haven’t come close to tapping the full potential of OER. We need to help more people understand that these materials are not just free, they can also create communities of teachers and learners who collaborate on their continuous improvement, and that’s the real magic – in the actual reuse and remix.” Cathy Casserly, CEO of Creative Commons |
In government and philanthropy, where tax dollars and charitable contributions support the public good, open licensing ensures the results of investments are made available to the broader public and can be continually built upon.
Share Your Knowledge is a structured procedure for institutions to enhance the use and visibility of their content by using Creative Commons CC BY-SA licenses and the Wikimedia Foundation websites. It involves strong collaboration and engagement between entities in Africa (WikiAfrica) and Italy (lettera27 Foundation). After one year more than 100 institutions are involved worldwide. For example: entities in Italy helped to openly license music for use in the Pacific Islands and Southern Africa.
In 2010 GlaxoSmithKline released its malarial dataset (of 13,500 chemical compounds known to be active against malaria) with the CC0 Public Domain Dedication. Teams of scientists worldwide are now using it in open source projects, including in Switzerland, Australia, India, USA and Israel, and the world is closer to a cure for malaria. "We all stepped up and took a risk to put our data out into the public domain, ... it’s demonstrably working," (R. Kiplin Guy, a Medicinal Chemist in Cressey 2012a). Other positive consequences include the World Health Organization adopting a similar approach in 2012 for neglected tropical diseases in partnership with 11 pharmaceutical companies who will share their intellectual property.
“What’s unique about today is getting everybody on the same page,” Bill Gates, Gates Foundation (funder of the neglected tropical diseases project). “What we’ve seen over the last year or so is a real coming together of industry,” Andrew Witty, chief executive of GlaxoSmithKline. “We’re starting to work together in partnership in an unprecedented way,” Christopher Viehbacher, chief executive of pharmaceutical corporation Sanofi (Cressey 2012b). |
Zero cost enables open business models that have a healthy effect on creative industries.
Changes in the digital landscape - such as zero distribution costs and new funding processes - are enabling business models that are innovative (INSEAD 2012, 398) and competitive. Some business models are grounded in open licensing, sharing and collaboration within creative industries. According to Kickstarter:
traditional funding models are dissolving, new forms of expressing ownership have arisen to accommodate for remix culture, and artists are finding ways to connect physical art experiences and traditions to the Internet. In the digital era, the experience of art from the perspective of the artist and the art audience is shifting rapidly, and bringing more people into the creative process.
Kickstarter provides a marketplace for creators to propose projects, and for supporters to fund them. While there is no one answer for an optimal business model, this PBS video is an interesting case study of the impact of Creative Commons: http://youtu.be/024vLBBJf4I.
CC licenses don’t just travel online.
Free Culture principles actively build real world communities. The Libre Bus is a project inspired by Free Culture principles. In 2012, 51 passengers rode the bus 8000 kilometres over 35 days through Chile, Argentina, Paraguay and Uruguay. Along the way they shared open knowledge and practices via content production, talks, workshops and press promotion at more than 40 events. The Libre Bus visited over 1,000 people in 25 cities.
Similarly, a real world community-designed car participated in a cross-country internet freedom tour of the United States during 2012. The aim of the tour was to promote the benefits of an open internet. With a 6.2 liter V8 engine and 430 horsepower, the Local Motors Rally Fighter is “basically like driving a Corvette on steroids” (Patterson 2012) and the crowdsourced design is licensed with CC BY-NC-ND, so anyone can build one.
This report marks a start towards capturing and reporting the impact and value of Creative Commons. We hope you enjoyed reading it. It aims to stimulate questions and conversations and we acknowledge there are many gaps and more work to do. Future analysis will be guided by feedback from our community. Secondly we have made available the data used in this study and encourage others to examine, reuse or visualize it. Please let Creative Commons know about any resulting works or feedback.
We’ve identified more than 410 million CC marked objects but this is only a subset of what exists. The challenge in 2013 is to identify more comprehensively the number of CC marked objects, their use and derivative works. We hope to expand and maintain the list of platforms identified and aim to seek input from our global Affiliate network to identify a broader range of platforms, including those with character based languages. Work has already begun on this in the Africa region. We may reach out to some platforms that offer little visibility into their metrics and seek assistance to improve our coverage. In 2013 we hope to generate time series data for longitudinal measurement going forward and, if possible, historically, to reflect more clearly the impact of Creative Commons and the Commons in general. Possible sources for analysis include the CC server data and/or search engines.
At the same time we hope to build a technical solution to identify the number of Creative Commons marked objects from either server or search engine data. Our aspiration is for a real time automated tool for tracking the growth and use of CC marked objects. We are watching with excitement the innovative developments in Altmetrics, and ideally similar tools that measure use of articles may be applied to other formats over the next two years.
This foundation of metrics may be analyzed by anyone anywhere, and please suggest corrections, improvements and ideas to guide our ongoing analysis. We welcome all feedback. The questions raised from this initial exploration will drive our research agenda for 2013. For example:
We need to strengthen our understanding of the CC network and ecosystem by diving deeper into the representation of our network:
The public perceives CC as ‘free, open licenses’:
Further exploration on these and other aspects of the value of Creative Commons is needed to substantiate the value of openness and sharing through metrics and data. Via these methods (and others as recommended by our community) we intend in 2013 to broaden and strengthen our understanding of the economic and social value of Creative Commons.
Glossary
For the purpose of this report the following definitions were used:
Icons - license buttons (badges) that mark an object as CC licensed (or CC0 Public Domain Dedication). When correctly embedded on a website they ‘ping’ the CC server each time a page with one is accessed, and it is recorded by the server. Examples can be seen at http://creativecommons.org/about/downloads (buttons).
CC - Creative Commons
CC0 Public Domain Dedication - “no rights reserved”. The CC0 Public Domain Dedication empowers the choice to opt out of copyright and database protection, and the exclusive rights automatically granted to creators – the “no rights reserved” alternative to our licenses.
The Commons - For this report we informally define the Commons as a pool of content resources accessible to all members of a society. Individuals may hold copyright in the content, and may have reserved some rights, such as the right of attribution, but they have contributed certain other rights -- for example, the right to copy, distribute, or edit -- to the commons.
Data files - collections of data objects.
Free - "free" is often split into "gratis" and "libre." Gratis refers to no cost access. Libre refers to freedoms to reuse, revise, remix and redistribute.
GLAM - Galleries, Libraries, Archives and Museums.
Libre - libre refers to freedoms to reuse, revise, remix and redistribute. Within the CC context it applies to the CC BY and CC BY-SA licenses and the CC0 Public Domain Dedication.
License Type - CC BY; CC BY-SA; CC BY-ND; CC BY-NC; CC BY-NC-SA; CC BY-NC-ND. Please note the links are to version 3.0.
License Version - version 1.0 (Dec 2002), version 2.0 (May 2004), version 2.5 (June 2005), version 3.0 (Feb 2007).
Marking - applying a CC license (or CC0 Public Domain Dedication) to content. This is a once off event that occurs usually when the object is made available. More detailed description is available at the Creative Commons Wiki.
Mixed media - platforms where content - audio files, images, videos, documents - is mixed and a breakdown was unavailable.
Objects - items. Ideally it may be ‘anything with a CC license on it’ as represented by a license icon or hyperlink to the CC license deed, although using the platform approach it meant objects as expressed within that platform - for example: videos, images, books.
OER - Open Educational Resources. OER are defined by the Hewlett Foundation as teaching, learning, and research resources that reside in the public domain or have been released under an intellectual property license that permits their free use and re-purposing by others. For the purpose of this report an education resource is a discrete educational object, it may be a: course unit, 3D visualization, test, exercise, textbook etc. Educational courses were treated differently because they are collections of objects.
Pages - in the manual count of platforms that use CC licenses, the number of CC marked pages was used as a measure where no other could be used (or to best describe the volume of content). For example some sites include an entry per page, such as wikis.
Ports - generally stated, porting involves the translation and legal adaptation of CC's core license suite (the international suite, formerly known as the "unported" or "generic" license suite) to the languages and copyright laws of individual jurisdictions.
Public Domain Mark - the Public Domain Mark is recommended for works that are free of known copyright around the world. These will typically be very old works. It operates as a tag or a label, allowing institutions, as well as others with such knowledge, to communicate that a work is no longer restricted by copyright and can be freely used by others.
Re-use/Remix - actively taking parts of the CC licensed object/s and merging (mixing) them into other objects to create a new object (e.g. video and music remixes, remixed open textbooks or mixed open courseware).
Site - domain/platform e.g. Flickr, Wikipedia, YouTube, ccMixter, Vimeo, Bandcamp, SoundCloud, WordPress, Blogger, Public Library of Science, Directory of Open Access Journals, Khan Academy, Science Commons, Government sites.
Use - the viewing/reading/listening/linking to a CC marked object.
Methodology notes on the licensed materials work
Consistency is a major challenge with the data. This work uses materials from a wide variety of sources that use different collation techniques and so may not be consistent. This includes data across different time periods: where accurate data is unavailable, best estimates have been made if there is some indication of volume. A decision was made to obtain the highest possible granularity in each object of measurement, but each object differs. For example, one photo does not equal one educational course or one journal article as an object of measurement. Every effort has been made to standardize the data for consistency, however inaccuracies may result, especially in this early draft phase. To address this, sources have been made available wherever practical for the reader to investigate calculation methods within each source. Data used may also have been cleaned.
Multiple sites contain multiple versions of a work. This is a positive outcome of open licensing, because Creative Commons licenses enable wide dissemination and remixing. Each variant (remix) of a work was counted as a separate object, because a remix creates a new work that the original author does not control (but for which they are attributed). Where identified, direct copies of the same work were removed from the count. For example, many sites act as directories (referatories) to content, but they were mostly not counted because the original content resides elsewhere; and some sites host their content on repositories, for example staff images from Wired.com are hosted on Flickr and so are included in the Flickr count. Given that we have not encompassed the totality of objects, we still believe these numbers are less than what exists. Where translations of works were found they were counted as separate objects; for example, Wu Ming provide 23 books with multiple translations, resulting in 46 objects.
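The counting rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure actually used for this report: the record fields (work, variant, site) and the sample records are hypothetical. Direct copies of the same work found on different sites collapse to one object, while each remix or translation is counted as its own object.

```python
from typing import NamedTuple


class Record(NamedTuple):
    """One sighting of a CC marked object on a site (hypothetical schema)."""
    work: str     # identifier of the underlying work
    variant: str  # "original", or a label for a remix or translation
    site: str     # where the sighting was found


def count_objects(records: list[Record]) -> int:
    """Count distinct (work, variant) pairs.

    Direct copies hosted on several sites share the same pair and are
    counted once; each remix or translation has its own pair.
    """
    return len({(r.work, r.variant) for r in records})


# Hypothetical sample: one book, a mirrored copy of it, a translation
# and a remix.
sample = [
    Record("book-1", "original", "publisher-site"),
    Record("book-1", "original", "mirror-site"),  # direct copy, not counted again
    Record("book-1", "translation-it", "publisher-site"),
    Record("book-1", "remix-a", "remix-site"),
]

object_count = count_objects(sample)
print(object_count)
```

Keying on the pair rather than on the hosting site is what keeps a mirrored copy from inflating the total while still letting translations and remixes add to it.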
When creating the count of platform data, we were less interested in a ‘total number of materials’ than in the utility of the various licenses to different users. The variety of CC licensed media and materials generated debate about what qualifies as a unit of measurement. CC licenses are applied to datasets and metadata, complete educational courses, books, code, articles, images, video, 3D simulations, audio, blog and forum posts, presentations, referatories and more. The notes at the end of this report describe the different units of measurement used. The approach is imperfect: one educational course that includes many CC licensed objects is counted as one object, as is one photo. Because courses may contain a large number of CC licensed materials, they should arguably be weighted as perhaps ten units each (as a conservative minimum) to represent them more accurately in any total number of licensed materials.
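As a rough illustration of the weighting suggested above, the sketch below applies a weight of ten to courses and one to every other unit of measurement. The counts and the function name are invented for the example and are not figures from this report; only the factor of ten comes from the text.

```python
# Weight per unit of measurement. Only the course weight of 10 comes from
# the text above; every other unit is assumed to count as 1.
COURSE_WEIGHT = 10
DEFAULT_WEIGHT = 1


def weighted_total(counts: dict[str, int]) -> int:
    """Sum object counts, weighting each course as COURSE_WEIGHT units."""
    total = 0
    for unit, n in counts.items():
        weight = COURSE_WEIGHT if unit == "course" else DEFAULT_WEIGHT
        total += n * weight
    return total


# Hypothetical counts, for illustration only.
example_counts = {"photo": 1000, "course": 20, "article": 300}

unweighted = sum(example_counts.values())
weighted = weighted_total(example_counts)
print(unweighted, weighted)
```

With these made-up counts the 20 courses contribute 200 units instead of 20, which shows how sensitive any grand total is to the choice of unit.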
The bulk of the data used was obtained from publicly available sources (websites); however, some data were obtained via personal contacts within key platforms. We have little influence over how those figures are calculated by others, and we are very grateful to these sources for going beyond their remit to supply us with any data. In many instances we have no visibility into detailed breakdowns of the data (by license type or jurisdiction), but data are supplied to the fullest extent of availability. This is why we provide breakdowns by license type or jurisdiction for some platforms but not all: it depends upon data availability and reliability.
An intention was to build upon the time series of prior work undertaken by Creative Commons concerning license statistics; however, that work was heavily caveated with reliability concerns. For example, we saw volatility in the estimation algorithm and differences between Google and Yahoo results relating to license types (although the results were positively correlated by jurisdiction and volume), and the method used can no longer be replicated. Because of this we advise that the two time series be kept separate. Secondly, server data and platform data should be kept separate and not summed, because there may be double counting between the two sources. Giorgos Cheliotis, from our Singapore Affiliate team, worked closely with us on prior metrics work. He has been undertaking an advanced analysis of the historical data from search engines and plans to release a more definitive final report on this effort in early 2013.
Methodology notes on the CC impact work
The data in this report have been cleaned. For example, where Google Analytics lists language types - en, en-US, en-IN (language by region) – they were merged into English, and where variants of pages were listed they were merged (e.g. /video/, /videos/, /videos). Raw and cleaned data can be found in the 30June12 metrics workbook. As a result, numbers in this report do not directly match Google Analytics exports.
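The kind of cleaning described above could look something like the following Python sketch. It is an assumption about how such merging might be done, not the workbook's actual procedure: the function names, the alias table and the visit counts are all hypothetical. Language-by-region codes are reduced to their base language (so en, en-US and en-IN all fall under en, i.e. English), and page variants are mapped to one canonical path.

```python
from collections import Counter
from typing import Callable

# Hypothetical alias table for page variants that differ by more than a
# trailing slash (e.g. singular versus plural).
PAGE_ALIASES = {"/video": "/videos"}


def language_group(code: str) -> str:
    """Map a language-by-region code such as 'en-US' to its base language."""
    return code.split("-")[0].lower()


def normalize_path(path: str) -> str:
    """Drop trailing slashes, then apply the alias table."""
    trimmed = "/" + path.strip("/")
    return PAGE_ALIASES.get(trimmed, trimmed)


def merge_counts(rows: list[tuple[str, int]],
                 key: Callable[[str], str]) -> dict[str, int]:
    """Sum the counts of all rows whose labels map to the same key."""
    merged: Counter[str] = Counter()
    for label, visits in rows:
        merged[key(label)] += visits
    return dict(merged)


# Made-up visit counts, for illustration only.
language_rows = [("en", 50), ("en-US", 30), ("en-IN", 5), ("es", 10)]
page_rows = [("/video/", 4), ("/videos/", 6), ("/videos", 2), ("/about", 1)]

language_totals = merge_counts(language_rows, language_group)
page_totals = merge_counts(page_rows, normalize_path)
print(language_totals)
print(page_totals)
```

Because rows are summed after merging, totals produced this way will not line up row for row with a raw analytics export.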
Methodology notes on the public perception work
The Twitter hashtag #CreativeCommons is used across many languages. Frequently used words were translated to English, except for two words, aportes and tranmision, which were kept because they outweighed their English versions in frequency. The search did not retrieve character-based translations of #CreativeCommons, although this could be done in future analysis. Variants of words were aggregated (e.g. licensing, license, licenses).
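Aggregating word variants, as described above, can be sketched as a simple lookup before counting. The variant table below contains only the example given in the text; the function name and the sample words are hypothetical, and the tools actually used for this analysis may have worked differently.

```python
from collections import Counter

# Variant table: each variant maps to one canonical keyword. Only the
# licensing/license/licenses example comes from the text above.
VARIANTS = {"licensing": "license", "licenses": "license"}


def keyword_counts(words: list[str]) -> Counter:
    """Count keywords case-insensitively, folding known variants together."""
    counts: Counter = Counter()
    for word in words:
        key = word.lower()
        counts[VARIANTS.get(key, key)] += 1
    return counts


# Made-up sample of words taken from tweets, for illustration only.
sample_words = ["License", "licensing", "licenses", "OER", "music"]

counts = keyword_counts(sample_words)
print(counts.most_common())
```

An explicit table is used here instead of automatic stemming so that every merge is visible and can be reviewed by hand.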
The keyword 'check' was used in the context of 'check out'. The word 'creative' appears separately from 'Creative Commons' or 'creativecommons'. 'OER' is more prominent than 'educational', probably because it's Twitter.
#CC, an informal hashtag for Creative Commons, was not used because it retrieves a lot of noise. The current perception of CC is predominantly positive despite four mentions of negative words, being: 'harmful' (referring to a CC blogpost 'WIPO's Broadcasting Treaty: Still Harmful, Still Unnecessary' so not attributable to CC) and 'stabby' (someone tweeted a #ccmusic song made them feel stabby).
General study notes
This study is not exhaustive, nor does it reflect the totality of Creative Commons licensed materials, so results should be viewed as conservative indicators. Google Analytics data were heavily relied upon, but Google is not exhaustive: for example, there are parts of Asia (e.g. China, Korea) where Google is not the major ‘web indexer’. A possible way to address this may be to seek linkback or similar analytical data from other search engines, for example Baidu, Naver and Yandex.
To the best of our abilities, we have a high degree of confidence in the number of objects identified, albeit as a minimum starting point. When dealing with millions of objects manually there are bound to be inaccuracies; this is balanced by the knowledge that we have not identified the number of CC licensed objects on platforms we know contain large volumes of CC licensed materials.
All care has been taken to compile these data, but Creative Commons accepts no responsibility for the ways in which they may be used by others. This is a first draft and we are still in the initial phase of this project, so inaccuracies and corrections are expected. If you would like to suggest another major platform or source of CC licensed materials, or have any questions about these data, we encourage you to contact anna@creativecommons.org directly.
Data
We encourage you to reuse or visualize our data, which is released into the public domain with a CC0 Public Domain Dedication:
CC Server data: http://labs.creativecommons.org/metrics/2012_license_badge_sample_data/
Platform data: http://bit.ly/VtYJ0I
The Network data of CC Staff and Affiliates: https://sites.google.com/site/ccmetrics/home/maps
CC and Innovation calculations and inputs: http://bit.ly/VtYJ0I
Country use of licenses / CC0 Public Domain Dedication may be viewed at: http://bit.ly/VtYJ0I
CC Metrics wiki: http://wiki.creativecommons.org/Metrics
Should you have any comments or corrections on the data please send them to anna@creativecommons.org.
Images
‘Cathy Casserly’ 2012, Creative Commons, Mountain View USA, accessed January 20 2013, https://creativecommons.org/staff is licensed under CC BY
‘DSC00003’ (Cover photo) 2006, Leon Brocard (acme) via Flickr, accessed January 20 2013, http://www.flickr.com/photos/acme/151392454/ is licensed under CC BY
‘Kamil Śliwowski’ 2012, Zespół Creative Commons Polska, accessed January 20 2013, http://creativecommons.pl/o-nas/ is licensed under CC BY
‘Libre Bus’ 2011, LibreBus el Documental, Universidad Luterana Salvadoreña, San Salvador, accessed January 20 2013, http://uls.edu.sv/libre/index.php?option=com_content&view=article&id=97:bus&catid=5:inicio&Itemid=11 is licensed under CC BY-ND
‘Malaria’ 2012, Bill and Melinda Gates Foundation, Seattle USA, accessed January 20 2013, http://www.gatesfoundation.org/grantee-profiles/Pages/who-global-malaria-programme.aspx
‘Rally Fighter and 2012 Internet tour bus’ 2012, Internet Bus Tour 2012, The Internet (and Reddit), accessed 20 January 2013, http://internet2012bustour.com/
Software and databases
The analysis for this report used free open tools, including Gephi, R and SQL, Google Fusion Tables, Google Forms and Google Analytics, academic literature from open repositories, and Wordle. Proprietary social media tools used were Hootsuite and ThinkUp social analytics.
Sources used and further reading
Alexa Internet 2012, Site info: Creative Commons, accessed 4 December 2012, http://www.alexa.com/siteinfo/creativecommons.org
Arrow, K. 1962, ‘Economic Welfare and the Allocation of Resources for Innovation’, in The Rate and Direction of Inventive Activity: Economic and Social Factors, Universities-National Bureau, UMI, 1962, pp. 609-626, accessed 20 November, http://www.nber.org/chapters/c2144
Bole, K. 2012, UCSF Implements Policy to Make Research Papers Freely Accessible to Public, University of California San Francisco, 23 May, accessed 20 November, http://www.ucsf.edu/news/2012/05/12056/ucsf-implements-policy-make-research-papers-freely-accessible-public
Bollier, D. & Helfrich, S. (Eds.) 2012, The Wealth of the Commons. A World Beyond Market and State. Levellers Press, Massachusetts, accessed 20 November http://www.wealthofthecommons.org/contents
Brierley, C. 2012, Wellcome Trust strengthens its open access policy, Wellcome Trust, London, 28 June, accessed 20 November 2012, http://www.wellcome.ac.uk/News/Media-office/Press-releases/2012/WTVM055745.htm
Brown G.O. 2003, Out of the Way, PLoS Biol, vol.1, no.1, e9, accessed 20 November, http://www.plosbiology.org/article/info:doi/10.1371/journal.pbio.0000009
Creative Commons website http://www.creativecommons.org
Cressey, D. 2012a, Data Sharing Aids the Fight Against Malaria, Nature.Com, 14 February, accessed 20 January 2013, http://www.nature.com/news/data-sharing-aids-the-fight-against-malaria-1.10018
Cressey, D. 2012b, Road map unveiled to tackle neglected diseases, Nature.Com, 30 January, accessed 20 January 2013, http://www.nature.com/news/road-map-unveiled-to-tackle-neglected-diseases-1.9938
Dutta, S. (ed.), 2012 The Global Innovation Index, INSEAD & WIPO, Fontainebleau France, accessed January 29 2013, http://www.globalinnovationindex.org/gii/main/fullreport/index.html
European Commission 2012, Towards better access to scientific information: Boosting the benefits of public investments in research, Communication from the Commission, Brussels, 17 July, accessed November 20 2012, http://ec.europa.eu/research/science-society/document_library/pdf_06/era-communication-towards-better-access-to-scientific-information_en.pdf
Faculty Advisory Council 2012 Memorandum on Journal Pricing Harvard University, Cambridge, 17 April, accessed 20 November, http://isites.harvard.edu/icb/icb.do?keyword=k77982&tabgroupid=icb.tabgroup143448
Finch, J. 2012, Accessibility, sustainability, excellence: how to expand access to research publications. Report of the Working Group on Expanding Access to Published Research Findings, Research Information Network, London, June, accessed November 20 2012, http://www.researchinfonet.org/wp-content/uploads/2012/06/Finch-Group-report-FINAL-VERSION.pdf
Fuster Morell, M. 2010, The Governance of online creation communities for the building of digital commons. (Unpublished doctoral dissertation), European University Institute, Florence, accessed December 11 2012, http://www.onlinecreation.info/?page_id=338
Green, C. 2012, US Department of Labor Invests in Open Educational Resources, Creative Commons News, 2 October, accessed 20 November 2012, http://creativecommons.org/weblog/entry/34328
Hacking Society 2012, [Roundtable conversations at Hacking Society event], New York, 24 April, accessed 20 November 2012, http://hackingsociety.us/#video
Hacking Society 2012, Visualizing the web's hidden economies, New York, 24 April, accessed 20 November 2012, http://hackingsociety.us/videos/hidden-economies
Hess, C. & Ostrom, E. 2009, ‘Ideas, Artifacts, and Facilities: Information as a Common-Pool Resource’, in Law and Contemporary Problems, vol. 66, pp. 111-146, accessed 20 November, http://scholarship.law.duke.edu/lcp/vol66/iss1/5/
Laakso, M. & Björk, B. 2012 Anatomy of open access publishing: a study of longitudinal development and internal structure, in BMC Medicine, 10:124, accessed 20 November, http://www.biomedcentral.com/1741-7015/10/124
Lerner, J. & Tirole, J. 2004, The Economics of Technology Sharing: Open Source and Beyond, NBER Working Paper No. 10956, December, accessed 20 November, http://www.nber.org/papers/w10956
Lerner, J. & Tirole, J. 2000, The Simple Economics of Open Source, NBER Working Paper No. 7600, March, accessed 20 November, http://www.nber.org/papers/w7600
Malone, P. 2011 Foundation Funding: Open Licenses, Greater Impact, Berkman Center for Internet & Society at Harvard University, Cambridge, accessed 20 November, http://cyber.law.harvard.edu/publications/2011/foundation_funding
Moe, M. et al. 2012, American Revolution 2.0: How Education Innovation is Going to Revitalize America and Transform the U.S. Economy, GSV Asset Management and GSV Advisors, Chicago, July 4, accessed 20 November, http://gsvadvisors.com/
Neylon, C. 2012, Free and Open Data as a Worldwide Economic Engine, Reuters blog, October 22, accessed 20 November, http://blogs.reuters.com/great-debate-uk/2012/10/22/free-and-open-data-as-a-worldwide-economic-engine/
OpenData White Paper: Unleashing the Potential. Presented to Parliament by the Minister of State for the Cabinet Office and Paymaster General by Command of Her Majesty. The Stationery Office, London, 28 June 2012, accessed 20 November, http://www.official-documents.gov.uk/document/cm83/8353/8353.asp
Ostrom, E. 1990, Governing the Commons: The Evolution of Institutions for Collective Action, Cambridge University Press, Cambridge.
Patterson, S. 2012, Crowd-Sourced Rally Car Driver Joins Reddit on its Cross-Country Internet Freedom Tour, WebProNews, 11 October, accessed 20 January 2013, http://www.webpronews.com/crowd-sourced-rally-car-driver-joins-reddit-on-its-cross-country-internet-freedom-tour-2012-10
Perez, C. 2004, Finance and Technical Change: A Neo-Schumpeterian Perspective, in H. Hanusch & A. Pyka (eds.), The Elgar Companion to Neo-Schumpeterian Economics, Elgar, Cheltenham U.K., 217-242
Peters, D. 2012, World Bank stakes leadership position by announcing Open Access Policy and launching Open Knowledge Repository under Creative Commons, Creative Commons News, 10 April, accessed 20 November, http://creativecommons.org/weblog/entry/32335
Royal Society 2012, Science as an Open Enterprise, Royal Society Science Policy Centre, London, June, accessed 20 November, http://royalsociety.org/uploadedFiles/Royal_Society_Content/policy/projects/sape/2012-06-20-SAOE.pdf
Scientists, Foundations, Libraries, Universities, and Advocates Unite and Issue New Recommendations to Make Research Freely Available to All Online, http://www.opensocietyfoundations.org/press-releases/scientists-foundations-libraries-universities-and-advocates-unite-and-issue-new
Schwab, K. & Sala-i-Martín, X. 2012, Global Competitiveness Report 2012-13. World Economic Forum, Geneva.
Slater, D. and Wruuck, P., 2012 ‘We Are All Content Creators Now: Measuring Creativity and Innovation in the Digital Economy’, in The Global Innovation Index 2012, INSEAD & WIPO, Fontainebleau, France, pp. 165-167, accessed 20 November http://www.globalinnovationindex.org/gii/main/fullreport/index.html
Twenty Million Minds Foundation, 2012, Embracing the Future: Free College Textbooks, accessed 20 November, http://20mm.org/infographic-open-source-impact.html
Twitter 2012, hashtag search #CreativeCommons
UNESCO 2012, Paris OER Declaration, Commonwealth of Learning, Paris, June, accessed 20 November, http://www.unesco.org/new/fileadmin/MULTIMEDIA/HQ/CI/CI/pdf/Events/Paris%20OER%20Declaration_01.pdf
We the People 2012, Require free access over the Internet to scientific journal articles arising from taxpayer-funded research (online petition). White House, accessed 20 November, https://petitions.whitehouse.gov/petition/require-free-access-over-internet-scientific-journal-articles-arising-taxpayer-funded-research/wDX82FLQ?utm_source=wh.gov&utm_medium=shorturl&utm_campaign=shorturl
West, D., Friedman, A. & Valdivia, W. 2012. Building an Innovation-Based Economy, Governance Studies at Brookings, Brookings Institute, Cambridge, Nov., p.14, accessed 20 November, http://www.brookings.edu/research/papers/2012/11/13-innovation-technology-west-friedman-valdivia
[1] Items marked with a CC license, CC0 Public Domain Dedication or the CC Public Domain Mark. This is defined in more detail in the section titled ‘CC marked objects’.
[2] In a metrics context use is defined as viewing / reading / listening / redistributing / linking to a CC marked object.
[3] Taking parts of it and revising and/or mixing them into new works, referencing them in new works, etc.
[4] and the CC0 Public Domain Dedication and Public Domain Mark
[5] For example see the Creative Commons License Chooser at http://creativecommons.org/choose/ which provides a license icon in the results for users to place on their site
[6] see Figure 2 for examples.
[7] Generally stated, porting involves the translation and legal adaptation of CC's core license suite (the international suite, formerly known as the "unported" or "generic" license suite) to the languages and copyright laws of individual jurisdictions.
[8] Flickr does not provide an option for the CC0 Public Domain Dedication
[9] We reached out to our friends at the Internet Archive but they do not currently track this metric, which is understandable given the vast scale of their content.
[10] A general way to distinguish between a dataset and data is that the framework in which the data resides carries the CC0 Public Domain Dedication, not the data itself.
[11] Perez (2004) noted the distinction Schumpeter’s 1947 theory of creative destruction made between invention and innovation, notably that ‘inventions’ (such as new technologies, new products or methods) require stewardship in order to gain mainstream traction and innovation (economic and/or social impact).
[12] more details, including participants, at http://outreach.wikimedia.org/wiki/GLAM/Case_studies/WikiAfrica/Share_Your_Knowledge
[13] Note that retweet volumes are influenced by many factors, including the timing of tweets, so this is a general finding.
[14] a resource allocation decision