How to win friends and influence decisions

A statistical framework for analysing public libraries

A paper for


Libraries Plus: Adding Value in the Cultural Community


8th Northumbria International Conference on Performance Measurement in Libraries and Information Services (PM8)





Tord Høivik
Associate professor
Oslo University College


Oslo, 2009


Direct at:
http://tinyurl.com/m3jekr
With background at: http://pliny.wordpress.com/events/florence-2009/

Summary

As professionals we want our data to speak to us. But at the moment, public library statistics remain less than fully utilized. The full knowledge potential of the data is not realized. To realize this potential, at least three conditions must be fulfilled:

  1. statistical agencies must collaborate with the library community in developing and refining concepts, indicators and data collection methods
  2. all statistical data must be made freely available in convenient digital formats
  3. the library community must integrate statistical reasoning into their own daily practice 

The paper is intended as an invitation to systematic statistical reasoning based on existing data. The next step - using the numerical information - is up to the libraries themselves.


Statistics is a form of knowledge production. Tables and diagrams are the results of production processes. Collecting, processing and presenting statistical data demand hard work as well as technical and statistical competence. The introduction of computers and web-based systems makes many of the routine tasks much lighter than before, however.
A new generation of statistical systems is creating much better possibilities of documenting and understanding what is happening in the library sector. The old "comparisons with last year" can be replaced with systematic studies of particular libraries in the context of other libraries.

Statistical production is carried out by statistical agencies. One of their tasks ought to be the systematic mapping of library landscapes for comparative purposes.

KOSTRA is an innovative data collection programme run by Statistics Norway, which is the English name of our Central Bureau of Statistics. KOSTRA provides standardized data on all public services, in all Norwegian municipalities, on an annual basis. The purpose of KOSTRA is to produce comparative data for benchmarking, policy making and public sector research.
In this paper, we show - step by step - how such a mapping can be carried out with the data that are now available from the KOSTRA system.

All KOSTRA data are published in an open data base. The 2007 revision of KOSTRA includes seventeen variables that describe the public library sector. The KOSTRA variables were selected by Statistics Norway from a much larger data set, with more than 200 variables, collected by the Norwegian Archive, Library and Museum Authority. At the moment only the KOSTRA variables are available for further digital processing, however. The remaining data can only be accessed from a series of predetermined tables. These are published on paper and, since 2002, in the PDF format on the web. The full data set is only available inside the Authority. 

Variables and indicators

Here we use the seventeen KOSTRA variables to develop two sets of indicators. The first set, consisting of six indicators, is aimed at managers and politicians outside the library sector. The second set, which comprises twenty indicators, is intended for library managers and other people interested in the operational details of library management. We show, with concrete examples, how the indicator values for any particular library (the library profile)

  1. may be compared or ranked relative to all the other libraries
  2. may be used for benchmarking or pairwise comparisons

The corresponding data sets, with information from all Norwegian public libraries in 2007 and 2008, have been published as spreadsheets on Google Docs (see Bibliography).

















From industry to knowledge

Half a million visitors


Let me start with a singular fact: In 2007 Asker Public Library reported 440 thousand visitors.


But what does this number mean? What is its purpose? How was it measured? How can we use it? 

Asker itself is an affluent suburban community some fifteen miles south-west of Oslo. The municipality has an excellent library with a dynamic staff. Since Asker has some 53 thousand inhabitants, the number of visits corresponds to about 8.3 visits per capita. Is this much or little?

Library statistics are intermediate goods. They are the end results of complex data production processes. But data as such have no value unless they are used as inputs - for the production of library services. The purpose of statistics is to improve services. To interpret statistical data, we must place them in a practical operational context.

This is easier said than done. Libraries are changing under the impact of digital technology. But our statistical systems are conservative. The way we produce and utilize data is still shaped by print and paper rather than by digital technology. We see this most concretely in the way statistics are published. In a digital world, all public data should be made available in convenient digital formats. In that way, professional users can choose how to present and analyze the data, and are not restricted to the variables, indicators and cross-tabulations selected by library authorities.

Library statistics are also used in library research. Researchers use data files to perform customized data analysis not available in the web tools and publications. For example, publications and web tools may not make available an analysis using the particular variables the researcher needs.

But the power of tradition is strong. Most countries still stick to paper, to paper-oriented digital formats (like PDF), or to preselected combinations of digital variables when they publish their data. This traditional orientation impedes the use and reduces the value of the data that library organizations strenuously collect.


Cultural change


In library statistics, our most urgent task is to shift our statistical systems from a paper-based - or industrial - model to a knowledge-based, digital model. This is a major undertaking, which may take a generation or more to be completed. Like all big development efforts, it will require new systems, new skills and new organizational structures.

Technology should not be a major issue, however. The tools we need to produce and study library statistics are coming fast. Cheap digital devices, both portable and stationary, are spreading rapidly in most countries of the world. 

The barriers to change are social rather than technical. Organizations and individuals must develop attitudes, skills and ways of working that are appropriate to a digital rather than an industrial environment. The next step in library statistics involves, as Bourdieu would say, changes in our routinized responses - or habitus.

Most change is gradual. Big conversions are rare. We tend to move step by step from one mode of behavior to another. I assume this will be the case for library statistics as well. The established systems must be reformed slowly from within rather than replaced en bloc from without.

I also assume that our statistical authorities are willing to change. They are technicians rather than theologians. They provide the library community with useful tools - and should be happy to make the tools more useful.

The biggest challenge is cultural. In the industrial world, we needed complex vertical organizations - traditional, paper-pushing bureaucracies - to collect, process and distribute information. In the digital world, information can be produced, processed and shared by anybody through the web.

Statistics are information goods. They must, like books and time tables, be understood in order to be used. If we apply 2.0 principles to library statistics, we can construct systems that are more transparent, more horizontal and more productive - in terms of knowledge - than those we have today. But the new systems will require a different balance of power in order to work. The relationships between the stake-holders must be redefined.

This means that ordinary libraries and librarians need to develop their skills and increase their power. At the same time central authorities must delegate responsibility - offering training and support rather than decisions ex cathedra.

Such changes within a social field (Bourdieu) affect the way we talk and listen to each other. Digital technology flattens both social and economic pyramids (Friedman). In a networked society, power and influence are distributed among many independent actors.

Our current elites rose to power in the industrial world. They often find it hard to understand the consequences of digital technology. The knowledge game - one might say - differs from the industrial game. The rules of effective behaviour are different. In digital environments, formal positions carry less weight, while professional insight carries more. Knowledge is like science: it grows by being shared rather than by being monopolized and controlled.

This changes the way we confront and discuss library issues. Distances between ranks and roles are reduced. Conversations become more egalitarian and more professional. In the knowledge economy, what you know is more important than who you know. Knowledge is the new capital. It resides in persons rather than positions.

These are general statements. They apply to all fields that are deeply affected by digital technology. This is definitely the case for the library sector - and even more true for library statistics. I therefore take it for granted that this tiny corner of the world must be rethought and reshaped on digital principles.


Systematic and critical


In library discussions, data turn up in many guises. We often argue with reference to single cases. This is handy, but not scientific. Evidence-based librarianship requires systematic and critical analysis of empirical data. Professionals are required to doubt in order to know.

Statistical data and methods play an important role in all empirical science and professions. All meaningful statistics is based on systematic data collection. In order to judge the data, we must also describe - and evaluate - the methods used to collect the data.

In social statistics, data collection is usually based on a two-dimensional scheme: a rectangular matrix. The units being studied (libraries, books, users) are placed on one axis and a set of variables (properties, characteristics) on the other.

Official library statistics generally use a matrix with libraries as units and library properties as variables. Since I want to improve existing systems rather than proposing new ones, I follow this convention.

Norwegian library statistics divide libraries into six categories: public libraries (folkebibliotek), county libraries (fylkesbibliotek), primary school libraries, secondary school libraries, mobile libraries, academic and special libraries. Each category has its own data set - in other words its own matrix with its own set of variables. The data for Asker Public Library derive, of course, from the public library matrix.

In Norway, every municipality has its own public library. This is mandated by the Norwegian Library Code (Bibliotekloven). Municipalities are allowed to set up joint libraries, but only two communities - Tønsberg and Nøtterøy - have chosen this option. Since Norway has 432 municipalities, we currently have 431 independent public library systems.

Comparing libraries

All libraries are asked to report the number of visits. Not all of them do, however. In 2007 we had data on visits from 408 libraries. But is this context relevant? Should Asker, with fifty thousand inhabitants, be compared with Utsira, with fewer than 250? Comparisons are only useful when the units we compare are roughly similar. We must divide our public library systems into meaningful groups or categories before we can proceed to comparative analyses.

Grouping "like with like" is a fundamental operation in social research. Categorization involves two separate decisions. We must first select the basic characteristic or dimension that is used to differentiate units from each other. At the second stage we must choose specific cut-off points that separate one group from the next.

In the case of public libraries, the municipal population is widely used as a dimension. The cut-off points may, however, differ from country to country. In Norway, ABM (the Norwegian Archive, Library and Museum Authority) uses a scale with twelve different categories:


This scale is far too detailed. It provides too much information. I have never seen it used for practical purposes. Effective tables condense and summarize data to make them more readable. If we are lucky, tables will reveal patterns and relationships between variables. The twelve-step ladder should be replaced by a simpler scale. The one I prefer reduces the twelve categories to five:


This means - in practice - that I choose to compare Asker with the largest libraries in Norway. I choose to compare Hamar, an inland town with 28 thousand inhabitants, with other libraries in the 20 to 50 thousand range, and so on.
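The two decisions involved in categorization (choosing a dimension, then choosing cut-off points) can be sketched in a few lines of Python. This is only an illustration. The cut-offs at 10,000, 20,000 and 50,000 inhabitants are mentioned in the text; the cut-off at 5,000, the group labels, the treatment of boundary values and the population figure used for Utsira are assumptions made to complete the example.

```python
from bisect import bisect_right

# Cut-off points in inhabitants. 10,000, 20,000 and 50,000 appear in the text;
# 5,000 is a hypothetical value used only to obtain five groups.
CUTOFFS = [5_000, 10_000, 20_000, 50_000]
LABELS = ["group 1 (smallest)", "group 2", "group 3", "group 4", "group 5 (largest)"]


def size_group(population: int) -> str:
    """Return the size category for a municipality of the given population.

    A population exactly on a cut-off is placed in the larger group
    (an assumption; the paper does not specify boundary handling).
    """
    return LABELS[bisect_right(CUTOFFS, population)]


# Hamar and Asker populations are rounded figures from the text;
# 240 for Utsira is an assumed value below the stated 250.
for name, pop in [("Utsira", 240), ("Hamar", 28_000), ("Asker", 53_000)]:
    print(f"{name}: {size_group(pop)}")
```

Keeping the cut-offs in a single list makes it easy to try out alternative scales without touching the rest of the analysis.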

In 2007, the universe - or statistical population - of Norwegian public libraries was distributed by size as follows:


Visits per capita


In 2007, the median number of visits per capita in each of these five groups was


The general pattern is clear: more visits in bigger communities. But this is only a statistical tendency, not a general rule. In 2007 the national champion of visits was Vegårshei, a tiny community on the south coast with only two thousand inhabitants. Vegårshei reached fifteen visits per capita. But I have already decided not to compare big with small - they are too different to be of interest. Asker, with more than fifty thousand inhabitants, does not play in the junior league.

In 2007 Asker outperformed all other libraries in its own category:

Visits per capita in municipalities with more than 50,000 inhabitants. Norway 2007.

  1. 8.28 - Asker
  2. 8.04 - Tromsø
  3. 7.45 - Trondheim
  4. 7.41 - Kristiansand
  5. 6.46 - Stavanger
  6. 5.70 - Bærum
  7. 5.48 - Tønsberg/Nøtterøy
  8. 5.21 - Skien
  9. 5.18 - Bergen
  10. 4.46 - Oslo
  11. 3.96 - Drammen
  12. 3.89 - Sandnes
  13. 2.92 - Fredrikstad
  14. 2.74 - Sarpsborg
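
A ranking of this kind, together with the group median, can be reproduced with a short script. The sketch below uses the fourteen values from the list above; the variable names are my own and nothing else is assumed.

```python
from statistics import median

# Visits per capita, 2007, municipalities with more than 50,000 inhabitants
# (values copied from the list in the paper).
visits_per_capita = {
    "Asker": 8.28, "Tromsø": 8.04, "Trondheim": 7.45, "Kristiansand": 7.41,
    "Stavanger": 6.46, "Bærum": 5.70, "Tønsberg/Nøtterøy": 5.48, "Skien": 5.21,
    "Bergen": 5.18, "Oslo": 4.46, "Drammen": 3.96, "Sandnes": 3.89,
    "Fredrikstad": 2.92, "Sarpsborg": 2.74,
}

# Sort from highest to lowest value and print a numbered ranking.
ranked = sorted(visits_per_capita.items(), key=lambda kv: kv[1], reverse=True)
for rank, (name, value) in enumerate(ranked, start=1):
    print(f"{rank:2d}. {value:.2f} - {name}")

# With an even number of libraries the median is the mean of the two middle values.
group_median = median(visits_per_capita.values())
print(f"Median for the group: {group_median:.3f}")
```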

But can we trust this information? I am not worried about the population data. In Norway, these are exceptionally accurate. But how are visits defined - and how are they actually measured in practice? These are practical issues in statistical methodology.

When people work with official library statistics, they often take such questions lightly. Researchers who conduct experiments and social surveys tend to pay great attention to methodological issues. They are forced to by academic tradition and by the peer review process. But when we deal with official data, we often assume that the government has taken care of the necessary quality control.

This, however, is seldom the case. After forty years as a statistician - and fifteen as a library teacher - I am struck by the rough and ready nature of most library statistics produced by government bodies. Governments are willing to fund development work in large and important fields, like health and education statistics. But libraries are not so high on the political agenda. Library statistics represent a very small cog in a very large machine. Very few people combine deep knowledge of libraries with statistical expertise at a professional level.

Library scholars know how to scrutinize research-based statistics. We should be equally observant with regard to official statistics. I turn to visitor statistics with this in mind. 

Calibration

Statistical variables represent social concepts. When we speak about visits or visitors, we imagine people that enter library premises in order to use library facilities. Dogs, staff members, plumbers and bicycles may be registered when they pass through our electronic portals, but they are not library visitors or users in the ordinary sense of the word. Such situations are quite common in the world of statistics. There are gaps between our mental concepts, on the one hand, and the actual counting, on the other.

We can sometimes reduce such discrepancies. But we can seldom avoid them completely. Direct observation would solve the problem, but is far too expensive for an ordinary library. Electronic counters are simply too convenient to be replaced by live observers. We could, however, try to adjust the electronic counts. Counters tend to overestimate the number of real visitors. Dogs are a minor problem. Children who run in and out are a bigger one. If they start playing with the counter, it is Armageddon.

An electronic turnstile is a tool for data collection. In science, researchers calibrate their instruments. Since the field of librarianship is becoming more professional, we might well do the same with our instruments. The normal way to calibrate an existing counter would be to


I do not expect large numbers of libraries to follow this route. Calibration takes time and effort. Staff members cannot serve the public and count visitors at the same time. But our understanding of visitor statistics would improve substantially if a few libraries were willing to compare direct and electronic counting - preferably with some economic support from central authorities.
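
As a rough illustration of what a comparison between direct and electronic counting could yield, the sketch below derives a single correction factor from a few parallel counting sessions. It is a simplified assumption about how such a comparison might be summarized, not a description of an established procedure, and all the numbers are invented.

```python
# Hypothetical calibration sessions: for each session, the manual count of
# real visitors and the electronic counter reading for the same period.
# All numbers are invented for illustration.
sessions = [(182, 205), (97, 110), (240, 263)]

manual_total = sum(manual for manual, _ in sessions)
counter_total = sum(counter for _, counter in sessions)

# Share of counter registrations that correspond to real visitors.
correction_factor = manual_total / counter_total


def adjusted_visits(raw_counter_total: float) -> float:
    """Scale a raw electronic count by the estimated correction factor."""
    return raw_counter_total * correction_factor


print(f"Correction factor: {correction_factor:.3f}")
print(f"Adjusted count for 100,000 registrations: {adjusted_visits(100_000):.0f}")
```

A single factor assumes that the overcount is stable across days and seasons. A library that suspects otherwise would need separate sessions for, say, school days and holidays.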

Time sampling

Our visitor statistics are rather approximate. Their quality differs from library to library. Small libraries without electronic counters are especially prone to errors. Estimates based on manual counting during one or two weeks - usually in "high season" - will have a definite upward bias. It would be better to gather data during the same number of days - but disperse the counting days (the time sample) throughout the year.
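
One way of dispersing the counting days is to cut the year into equal blocks and draw one day at random from each block. The sketch below shows this design under simplifying assumptions: the function names are hypothetical, closed days are ignored, and the annual estimate simply scales the mean daily count by the number of opening days.

```python
import random
from datetime import date, timedelta


def dispersed_sample(year: int, n_days: int, seed: int = 0) -> list[date]:
    """Pick n_days counting days spread over the whole year.

    The year is cut into n_days blocks of equal length and one day is drawn
    at random from each block (a simple stratified time sample).
    Assumes n_days is not larger than the number of days in the year.
    """
    rng = random.Random(seed)
    start = date(year, 1, 1)
    days_in_year = (date(year + 1, 1, 1) - start).days
    block = days_in_year / n_days
    sample = []
    for i in range(n_days):
        first = int(i * block)           # first day offset in this block
        last = int((i + 1) * block) - 1  # last day offset in this block
        sample.append(start + timedelta(days=rng.randint(first, last)))
    return sample


def annual_estimate(daily_counts: list[int], open_days: int) -> float:
    """Scale the mean of the sampled daily counts to a full year."""
    return sum(daily_counts) / len(daily_counts) * open_days


counting_days = dispersed_sample(2007, 12)
print([d.isoformat() for d in counting_days])
```

With twelve counting days this gives roughly one day per month, which removes the "high season" bias at no extra counting cost.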

There are additional techniques and methods that could be used to improve visitor data. But there is little point in describing them unless libraries are ready to try them out and library authorities are eager to disseminate the results.


Profiles and distributions

The data matrix has two dimensions. Read vertically the observations constitute a set of statistical distributions. Read horizontally they describe a set of library profiles. Effective statistical reasoning combines the horizontal and the vertical approach. When we are interested in a particular library, we focus on profiles. When we are interested in the library world as a whole, we study distributions.
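
The two ways of reading the matrix can be made concrete with a toy example. The library names and all the figures below are invented; only the structure (libraries as rows, variables as columns) follows the text.

```python
# A tiny data matrix: libraries as rows (units), variables as columns.
# All names and numbers are invented for illustration.
variables = ["visits", "loans", "population"]
matrix = {
    "Library A": [120_000, 90_000, 20_000],
    "Library B": [45_000, 60_000, 12_000],
    "Library C": [300_000, 210_000, 55_000],
}


def profile(library: str) -> dict[str, int]:
    """Horizontal reading: all variable values for one library."""
    return dict(zip(variables, matrix[library]))


def distribution(variable: str) -> dict[str, int]:
    """Vertical reading: one variable across all libraries."""
    col = variables.index(variable)
    return {name: row[col] for name, row in matrix.items()}


print(profile("Library B"))
print(distribution("visits"))
```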

When we start from scratch, we must start by studying distributions, however. We cannot interpret profiles without context. What does 8.3 visits per capita mean? The empirical distributions provide that context. To show this approach in action, we return to Asker.

The population consists of the fourteen largest public libraries in Norway. KOSTRA provides a convenient set of variables. After a revision in 2007 KOSTRA covers fifteen variables drawn from the public library sector:

  1. the number of library visits
  2. loans of children's books
  3. stock of children's books
  4. loans of fiction books (adults)
  5. stock of fiction books
  6. loans of non-fiction books (adults)
  7. stock of non-fiction books
  8. loans of audio books
  9. loans of videos
  10. loans of music
  11. loans of other media ("non-book loans")
  12. number of accessions
  13. total media expenses
  14. total salary expenses
  15. total staff (FTEs - full time equivalents)

In addition, two demographic variables are included:

  16. total population served
  17. population 0-13 years served

These seventeen variables describe seventeen different characteristics or properties of each library organization.

A biased selection

The particular selection of variables is not very balanced. Only one variable - the number of visits - really describes the user, while eleven variables (2-12) describe properties related to the collection. Only two economic variables - the media and the salary expenses - are included. The staffing variable (15) makes no distinction between professional and non-professional staff.  The library data in KOSTRA are based on information that is easy to collect rather than on information that is needed to understand library activities. The biggest weakness, in Norway as in other countries, is the lack of user data. We know lots about our collections but almost nothing about our customers.

The fifteen KOSTRA variables were selected by Statistics Norway from a much larger set collected by the Norwegian Archive, Library and Museum Authority. The full set contains some of the data we would like to study - for instance on the physical size of the library, on opening hours, on the number of reference queries and on the size of the professional staff.  But at the moment only the KOSTRA variables are available in a convenient digital format. The Authority does not publish the data set as such - only a set of tables derived from the more than two hundred variables collected.

During the data collection process, the nineteen county libraries assist libraries in their region with statistical advice and quality control. A few county libraries have also started to publish the full data sets, in the form of spreadsheets. I would, in fact, urge all counties to do so. Our library statistics contain much useful information that is effectively lost as long as the data sets themselves remain out of reach.

I must add that the full set of variables is also quite unbalanced - in Norway as elsewhere. Collecting good descriptive data takes time and effort. Numbers as such are meaningless. Useful statistics must be based on relevant indicators derived from meaningful concepts. In economics, researchers spend much time in developing the chains of argument that lead from concepts to data and back again. Indicators are constantly discussed, tested and refined. In the social and cultural field, methodological work is less developed. Systematic collection and use of cultural statistics, in particular, is a recent phenomenon. There is a widespread lack of systematic and professional methodological debate.

Standardized data collection

A second problem must also be mentioned. As library systems become digitized, libraries are in fact collecting and storing vast amounts of data on their collections and customers. Amazon and Google analyze this type of information all the time - and use the results to develop better services, systems and interfaces.  But libraries have no tradition of analyzing and utilizing data on lending behaviour. Individual studies are carried out from time to time. They may add to the knowledge base in librarianship and information science. But for operational purposes we need repeated, regular, routine and standardized data collection. The results of ad hoc studies tend to be read, discussed and forgotten. They flare like fireworks - and vanish as quickly.

The great advantage of KOSTRA is its established nature. Professional understanding and use of statistics among librarians require regular production and discussion of significant indicators.  I know that KOSTRA will deliver roughly the same public library data, in the same open format, year by year - for the foreseeable future. A set of indicators based on the seventeen KOSTRA variables can be calculated without collecting additional data from more than four hundred municipalities. That is a blessing.

In the next section of the paper I propose two sets of indicators:

  1. a smaller set of six indicators aimed at people outside the library field
  2. a bigger set of twenty aimed at professional librarians and library staff

The "external" set is also of interest to librarians, of course. The second set could probably be expanded. But I am not trying to list all possible indicators. The goal is to present a way of working with a given set of available statistical variables.

From variables to indicators

Indicators are new variables that can be calculated from the original data matrix. The simplest and most common indicators are fractions - or ratios between two original variables. For instance:


When we choose indicators, we should try to find properties that are intuitively meaningful. The number of visits per capita tells us about the intensity of use. Loans per visit reveals something about the pattern of use. Media costs per accession indicates the average price of new items. More complex indicators can also be constructed. I believe, like Finnish librarians, that the sum of loans and visits per capita is a useful indicator. It can be used as a substitute or proxy for the total library activity or general service output.

Formally speaking, this sum is an unweighted additive indicator. Loans and visits count the same. The German library indicator BIX is a much more complex construction. BIX is a weighted additive indicator, which combines a total of seventeen different properties into a single quality index.

Indicators may also be logically related to each other. For instance:

  1. loans per capita = loans per visit * visits per capita
  2. visits per capita = visits per active user * percentage of active users in population

Ratios, additive indexes and formal relationships between indicators are some of the standard tools we may use when we study libraries by means of indicators.
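
The first of the formal relationships above, and the unweighted additive indicator, can be checked numerically. The figures in this sketch are invented for a single hypothetical library; only the formulas come from the text.

```python
import math

# Invented figures for a single hypothetical library.
loans, visits, population = 150_000, 100_000, 20_000

loans_per_visit = loans / visits          # pattern of use
visits_per_capita = visits / population   # intensity of use
loans_per_capita = loans / population

# The product of the two ratios reproduces loans per capita,
# because the number of visits cancels out.
assert math.isclose(loans_per_capita, loans_per_visit * visits_per_capita)

# Unweighted additive indicator: loans and visits count the same.
activity_level = loans_per_capita + visits_per_capita

print(loans_per_visit, visits_per_capita, loans_per_capita, activity_level)
```

Such identities are useful as consistency checks: if the published indicators for a library do not satisfy them, at least one of the underlying variables has been misreported or miscalculated.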




External indicators


In our case, indicators must be built from the variables available in KOSTRA. I have, as an illustration, constructed twenty-six indicators based on the KOSTRA data matrix. I think six of them will be of particular interest to people outside the library field, such as managers and politicians in the municipal sector. These indicators are particularly relevant for advocacy.

A. Three output indicators

  1. visits per capita
  2. loans per capita
  3. level of activity, defined as the sum of visits and loans per capita

B. Two input indicators

  1. staff per 10,000 inhabitants
  2. operational costs, defined as the sum of media and salary expenses per capita

C. One productivity indicator

  1. level of productivity, defined as the ratio between the level of activity and the operational costs

The level of productivity can also be written as:

(no. of visits + no. of loans)/(media expenses+ salary expenses)
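
The six external indicators can be computed from a handful of input variables. The sketch below is one possible implementation. The class and field names are hypothetical, the total number of loans is taken as a single given figure rather than summed from the separate KOSTRA loan variables, and the example values are invented.

```python
from dataclasses import dataclass


@dataclass
class LibraryYear:
    """Input variables for one library and one year (hypothetical field names)."""
    visits: float
    loans: float            # total loans, all media (assumed to be given)
    staff_fte: float
    media_expenses: float
    salary_expenses: float
    population: float


def external_indicators(lib: LibraryYear) -> dict[str, float]:
    """Return the six external indicators defined in sections A, B and C."""
    visits_pc = lib.visits / lib.population
    loans_pc = lib.loans / lib.population
    activity = visits_pc + loans_pc
    costs_pc = (lib.media_expenses + lib.salary_expenses) / lib.population
    return {
        "visits per capita": visits_pc,
        "loans per capita": loans_pc,
        "level of activity": activity,
        "staff per 10,000 inhabitants": lib.staff_fte / lib.population * 10_000,
        "operational costs per capita": costs_pc,
        # Equal to (visits + loans) / (media + salary expenses):
        # the population cancels out.
        "level of productivity": activity / costs_pc,
    }


# Invented example values.
example = LibraryYear(visits=100_000, loans=150_000, staff_fte=10,
                      media_expenses=1_000_000, salary_expenses=4_000_000,
                      population=20_000)
indicators = external_indicators(example)
for name, value in indicators.items():
    print(f"{name}: {value:.3f}")
```

Note that the productivity value depends on the currency unit, so it is meaningful for comparisons between libraries within one country and year rather than as an absolute number.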

Internal indicators

The remaining twenty indicators are aimed at the library staff and hence more technical in nature:

D. Expenses (3)

  1. salary expenses per FTE
  2. media expenses per accession
  3. media expenses as a percentage of (salary + media) expenses

E. Turnover indicators (3)

  1. turnover of non-fiction = loans of non-fiction / stock of non-fiction
  2. turnover of fiction  = loans of fiction / stock of fiction
  3. turnover of children's books = loans of children's books / stock of children's books

F. Output indicators (5)

  1. loans per visit
  2. loans of non-book media per inhabitant
  3. loans of non-fiction books per adult
  4. loans of fiction books per adult
  5. loans of children's books per child

G. Other loan indicators (3)

  1. loans of non-book media as a percentage of all loans
  2. loans of non-fiction books as a percentage of all adult book loans
  3. loans of children's books as a percentage of all book loans

H. Stock indicators (6)

  1. accessions per capita
  2. stock of non-fiction books per adult
  3. stock of fiction books per adult
  4. stock of children's books per child
  5. stock of non-fiction books as a percentage of adult book stock
  6. stock of children's books as a percentage of total book stock

The full set of data for 2007 and 2008 has been published on Plinius Data (see Bibliography). Here we take a brief look at the six external indicators.

Visits per capita

The frequency of visits per capita, as measured by the median, falls when we go from bigger to smaller municipalities. The reduction is substantial. More than a third of the visits "disappear" as we go from the big cities to the small communities in the periphery.
The result is interesting, but generates - as findings usually do - new questions:


Investigating the patterns and exploring the causes of traffic to the physical library could obviously become a substantial research project in its own right. Here I only take a brief look at the time dimension. One particular result - for 2007 - does not prove anything. The numbers suggest a positive correlation between size and traffic. But the relationship might be a fluke. In large data sets, random patterns emerge and disappear like ocean waves all the time. But if the relationship is permanent, we should find it in earlier years as well.

Median number of visits, by population size, 2006-2008 


The 2007 pattern is confirmed for 2006 and 2008. To explore this further, I also spent some hours studying the visitor statistics from 2002 to 2005. The KOSTRA database does not include visits before 2006, so I had to work from the published tables. Here I had to use averages (mean values) rather than medians. Calculating the median values from scratch was not a practical option. It would involve entering visit and population data by hand for all municipalities in my own spreadsheet - and several days of extremely boring work. Fortunately the statistical yearbook includes a table that shows the average number of visits by size. If the pattern is real and strong, the choice of central tendency parameter - mean or median - should not matter. Going from the twelve categories used by the Authority to the five categories I prefer was just a matter of calculating weighted averages.
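
Merging fine-grained categories by weighted averaging can be sketched as follows. The numbers are invented, and the choice of weight is an assumption: the sketch weights each category mean by its number of municipalities, but total population could be used instead, depending on how the published averages were computed.

```python
def merge_categories(means: list[float], weights: list[float]) -> float:
    """Weighted average of category means for one merged size group."""
    if len(means) != len(weights) or not means:
        raise ValueError("means and weights must be non-empty and of equal length")
    return sum(m * w for m, w in zip(means, weights)) / sum(weights)


# Hypothetical example: three fine-grained categories merged into one group.
# Means are average visits per capita; weights are numbers of municipalities.
means = [4.0, 5.0, 6.0]
weights = [30, 20, 10]
merged_mean = merge_categories(means, weights)
print(f"Merged group average: {merged_mean:.2f}")
```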

Average number of visits per capita, by population size, 2002-2005



The results are very clear. I find the same decrease in traffic from large to small municipalities for all five years.  We can take this as a well-established fact: in Norway the number of library visits per capita tends to decrease as we move from bigger to smaller municipalities.

Loans per capita

The number of loans per capita follows a U-curve rather than a linear trend. It is highest in the biggest and in the smallest libraries - and lowest in mid-size municipalities - between ten and twenty thousand inhabitants.

Median loans per capita, by population size 2007



We find the same U-type relationship in earlier years. But what can explain the relationship?

Money is the first suspect. Do mid-size libraries get less money from the local government? The answer is yes, as data on operational costs show:

Median operational costs per capita, by population size 2007


The number of staff hours (full time equivalents) is also lowest in the middle group:

Median FTEs per capita, by population size 2007


If we compare the mid-size communities with the largest ones, we find that the former had

  1. 27% fewer visits per capita
  2. 22% fewer loans per capita
  3. 14% lower operational budgets per capita
  4. 20% fewer staff hours per capita

Productivity

Libraries are service organizations. As such they receive certain amounts of public money which they convert into public services. When we speak about productivity, we refer to the efficiency of the conversion process. How effectively is the library able to utilize its funds?

Before its revision in 2007, KOSTRA used loans per FTE as its main indicator. Statistics Norway did not describe this ratio as an indicator of productivity. But its logical form - an output divided by an input - and its prominence in the system suggest productivity measurement.

In an earlier paper I criticized this indicator as far too simplistic (Høivik, 2007). Libraries have several different inputs and many different outputs. Picking just one of each is bound to mislead in very many cases. As an improvement I have suggested the ratio between the activity level and the operational costs. This relates two outputs to two inputs. The indicator is still rather basic, but may be worth trying out.

Productivity values for all Norwegian libraries in 2007 are listed in the Google document referred to above. The median values for each size category are:

Median productivity by population size 2007



The general tendency is clear: smaller libraries are less efficient than large libraries. This is hardly surprising. Most organizations in the service sector, from schools to hairdressers, experience economies of scale. The surprise occurs near the top. The median value for the MTL libraries breaks the pattern - "sticking out like a sore thumb". As a group, it seems, it is the medium-to-large municipalities that run their operations most efficiently.

A second research paper, on the high productivity of MTL libraries, seems called for - had we but world enough and time. Instead we must return to our starting point - Asker Public Library, which set the whole distribution machinery in motion.

Time series

To understand what the number of visitors per capita - or any other indicator value - means, we must place it within a relevant context. The most common way of creating context is to look at changes over time. We compare, in other words, the current situation with the situation in earlier time periods. In the case of Asker, visitor levels from 2000 to 2008 fluctuated as follows:


In libraries, such comparisons are usually made on an annual basis. Under normal circumstances, this can be done without much effort. We only need a handful of past data from our own library to set up the relevant time series.

Comparing a library with itself over time gives some useful information. But a real understanding of the data requires comparisons with other libraries. Two types of questions are common:

  1. how does my library compare with all the libraries in the same category
  2. how does my library compare with specific libraries in the same category

In the first case we find the relative position of the library within a distribution. In the second case we carry out a benchmarking exercise.

Relative positions


The relative position of Asker Public Library - among the fourteen biggest library systems in Norway - on the six external indicators is given in Table 1 below.

Table 1. Asker compared with all other large libraries. Six external indicators. 2007



                     Visits      Loans       Activity  FTE per 10K  Operational costs  Productivity
                     per capita  per capita  level     population   per capita
Asker                8.28        7.14        15.42     5.14         241                6.41
Rank                 1           1           1         1            2                  3
Relative to median*  155%        128%        137%      131%         121%               129%
Median               5.34        5.6         11.25     3.93         200                4.95

*The Asker value as a percentage of the median among the fourteen libraries
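
The "Relative to median" row can be recomputed as a simple ratio. The Python sketch below uses only the Asker and median values from Table 1. The dictionary keys and the function name are hypothetical labels chosen for illustration.

```python
# Values from Table 1 (2007)
asker = {"visits": 8.28, "loans": 7.14, "activity": 15.42,
         "fte_per_10k": 5.14, "costs": 241, "productivity": 6.41}
median = {"visits": 5.34, "loans": 5.6, "activity": 11.25,
          "fte_per_10k": 3.93, "costs": 200, "productivity": 4.95}


def percent_of_median(value, median_value):
    """Express a library's value as a percentage of the group median."""
    return 100 * value / median_value


relative = {name: percent_of_median(asker[name], median[name]) for name in asker}

for name, pct in relative.items():
    print(f"{name}: {pct:.1f}%")
```

The unrounded results agree with the published percentages to within one percentage point.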

In terms of these indicators, Asker is clearly a highly successful library. It has the highest service output in its group. The municipality supports the output by providing - in comparative terms - solid budgets (rank=2) and a high level of staffing (rank=1). But the last column shows that the library also contributes - by utilizing its inputs in an effective way (rank=3).

The number of visits per capita is exceptionally high. To explain this fact we need, I suspect, a bit of local information. Asker is a rich suburban community with a single substantial center, the Asker township. The main library is part of the Cultural Center, where you also find halls for live performances, the local cinema, an art gallery and a cafeteria. The whole complex lies in the middle of the shopping area and just a hundred yards from the bus and railway station. The geographic location is ideal for attracting a high level of traffic.

This illustrates a general point. Comparative statistics help us formulate interesting questions by indicating features that need to be explained. The data may also provide some possible answers. But these answers tend to be partial. In statistics we work with a small number of standardized variables and indicators. In any concrete situation, there will always be many aspects, conditions and relationships that are not represented in the tables. The map is not the territory.

But we should of course build on rather than abandon mapping. The quantitative statistical description of a library may be seen as a first step towards understanding the situation. It provides a framework or matrix for local analysis and discussion. Decisions in the library field should preferably be based on a combination of statistical and local information. 

Benchmarking

The purpose of benchmarking is improvement. When we benchmark we try to find libraries that do better than ourselves in one or several areas. Statistical comparisons allow us to identify such libraries. The next step goes beyond statistics. It is qualitative rather than quantitative. In order to learn from "our betters" we must study the way they operate - through documents, discussions and actual visits.

In the table immediately below, we compare Asker with its geographical neighbours (in the same size category). Here, Asker represents the champion or benchmark rather than the challenger:


Table 2. Asker compared with two large library neighbours. Six external indicators. 2007


       Visits      Loans       Activity  FTE per 10K  Operational costs  Productivity
       per capita  per capita  level     population   per capita
Asker  8.28        7.14        15.42     5.14         241                6.41
Bærum  5.70        5.64        11.33     4.67         233                4.87
Oslo   4.46        4.07        8.53      3.05         172                4.95

Oslo, the capital, has about half a million inhabitants. Suburban Bærum, with about one hundred thousand, is located between Oslo and Asker. In terms of social structure, Asker and Bærum are quite similar. The local newspaper - Asker og Bærums Budstikke - serves both communities. But Bærum is polycentric rather than unicentric. Bærum library has one very large and two smaller branches in addition to the main library at Bekkestua. This makes the production of services more costly (lower productivity) than in Asker.

In the following table we compare Asker with the top scorers on operational costs per capita and on productivity.


Table 3. Asker compared with two large library top scorers. Six external indicators. 2007


                   Visits      Loans       Activity  FTE per 10K  Operational costs  Productivity
                   per capita  per capita  level     population   per capita
Asker              8.28        7.14        15.42     5.14         241                6.41
Tønsberg/Nøtterøy  5.48        7.04        12.52     4.44         254                4.92
Trondheim          7.45        6.63        14.08     2.74         153                9.21

Tønsberg/Nøtterøy has a slightly better budget than Asker. Trondheim has the worst budget among the fourteen libraries - but an amazing level of productivity. To learn anything useful from such numbers, however, we must combine them with local and qualitative knowledge.
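
The statistical first step of a benchmarking exercise, finding out who scores higher on which indicator, can be sketched in Python. The data are the 2007 values from Tables 2 and 3. The short indicator keys and the function name are hypothetical. Note that "higher" is not always "better": higher costs or staffing indicate more generous inputs rather than better performance.

```python
INDICATORS = ["visits", "loans", "activity", "fte_per_10k", "costs", "productivity"]

# 2007 values from Tables 2 and 3, in the order given by INDICATORS
libraries = {
    "Asker": [8.28, 7.14, 15.42, 5.14, 241, 6.41],
    "Bærum": [5.70, 5.64, 11.33, 4.67, 233, 4.87],
    "Oslo": [4.46, 4.07, 8.53, 3.05, 172, 4.95],
    "Tønsberg/Nøtterøy": [5.48, 7.04, 12.52, 4.44, 254, 4.92],
    "Trondheim": [7.45, 6.63, 14.08, 2.74, 153, 9.21],
}


def higher_scorers(reference, data):
    """For each indicator, list the libraries with a higher value than the reference."""
    ref_values = data[reference]
    result = {}
    for i, indicator in enumerate(INDICATORS):
        result[indicator] = [
            name for name, values in data.items()
            if name != reference and values[i] > ref_values[i]
        ]
    return result


ahead_of_asker = higher_scorers("Asker", libraries)
for indicator, names in ahead_of_asker.items():
    print(indicator, names)
```

Among these five libraries, only Tønsberg/Nøtterøy (operational costs) and Trondheim (productivity) come out ahead of Asker, which matches the reading of Tables 2 and 3. These are the libraries whose practice would then be studied qualitatively.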

Tønsberg is a medium-sized city on the coast sixty miles southwest of Oslo. It is an important commercial centre, with a strong position in maritime transport and technology. Nøtterøy is a suburban community to the west of Tønsberg. Tønsberg Public Library serves both municipalities. Local politics is dominated by the conservatives, but in Tønsberg they have a tradition of supporting the library. The striking glass-walled building from 1992 is one of the best examples of new library architecture in Norway. The budget for 2007 shows that local political support remains strong.

Trondheim is the third largest city - and also a former capital - of Norway. It is located in the county of Sør-Trøndelag, three hundred miles to the north of Oslo. The ability of the library to deliver a high level of output on an extremely low budget is admirable - but hardly an example to be emulated. The main library is efficient, but the facilities are cramped. Trondheim is home to the Norwegian University of Science and Technology and has a big student population. Thus, many of the library users are students.

New production of knowledge

The present paper is not meant as a statistical report on Asker Public Library in 2007, but as an invitation to systematic statistical reasoning, based on official data that we know will be available, year by year, in a convenient digital form.

Statistics is a form of knowledge production. Tables and diagrams are the results of production processes. Collecting, processing and presenting statistical data require both technical competence and lots of hard work. The introduction of computers and web-based systems makes many of the specific tasks much easier than before. The KOSTRA system demands a digital working environment.

But old habits die hard. The way the library community defines, develops and applies statistics is still shaped by past traditions. This is bound to change. Governments, businesses and voluntary organizations are all turning to the web. Traditional forms of data collection, processing and printing will not survive the digital transformation.

I see library statistics as a very small corner in a very large world. But it happens to be our corner. We face the same choice as everybody else: should we resist, tolerate or promote change in our personal back-garden?

This paper wants to promote change in our statistical environment. A new generation of statistical systems is creating much better possibilities of documenting and understanding what is happening in the library sector. The old "comparisons with last year" can be replaced with systematic studies of libraries in context. Judging a particular library in the context of other libraries involves more work than just looking at our local time series. Rather than documenting our own trail, we have to fit the trail into a larger landscape.

But we now have the tools to do so without being overwhelmed. That is why we need statistical agencies. One of their tasks ought to be the systematic mapping of library landscapes for comparative purposes. In Norway, as in most other countries, this goal is only partly achieved. The Norwegian Archive, Library and Museum Authority does collect a wide range of library statistics. I have already mentioned that the set of variables gives a lopsided view of library activities. But that is true in all countries. My main concern is rather that the data we do collect - with great effort - remain in cold storage. The full knowledge potential of the data is not realized. The numerical maps we create are loose sketches rather than technical tools and professional instruments.

To realize this potential, at least three conditions must be fulfilled:

  1. statistical agencies must collaborate with the library community in developing and refining concepts, indicators and data collection methods
  2. all statistical data must be made freely available in convenient digital formats
  3. the library community must integrate statistical reasoning into their own regular practice

In Norway, this is starting to happen. But the pace, I must admit, is slow.





Bibliography



Data sources