This is the fourth post in a series that looks at steps toward building a digital archive. After an introduction to building a digital archive we looked at the Scoping process that aligns your goal of building a digital archive with the vision, mission and strategic objectives of your organisation, and the Screening process that asks probing questions about your collections to establish which collections should be prioritised for digitisation, or digital processing, for incorporation into the digital archive. Once you have established which collections should have priority, the Selecting process then forces you to engage closely with the collection and actually select which items in your analogue collection should be digitised and incorporated into your digital archive or which born digital files should be selected out, processed and ingested into the digital archive.

Selecting is a controversial activity. The argument is often made that if we select what is to be digitised and what is not, we distort access to our full collection. That is true, and a point well made. The point, however, fails to take into account that your analogue collection is itself not a collection of everything. It is a selection from all that could possibly have been collected. You did not have the space, time or resources to collect everything, so you collected what had meaning for the body of work as a whole: items that carry a weight of meaning and perspective that is important to communicate to your audience. The same is true of a born digital collection. Take a digital photographic collection, for instance. The photographer did not take a picture of everything. The photographer made decisions about what to capture and what not to capture. Those are selection decisions. There is not the time or resource to capture every moment.
So too, in building a digital archive, you do not have the time and resources (and digital space) to digitise or digitally process everything. You need to identify what carries a weight of meaning in terms of your whole collection and what doesn’t. Professional photographers do this all the time. So do newspaper editors. And so do you and I in everyday life.

Selection is, in fact, an activity fundamental to the way in which we make sense of the world. Good historians and bad historians alike select certain facts and moments from all the facts of history in order to draw out meaning and conclusions. What separates a good historian from a bad one is not the fact of selection, but the openness, honesty and integrity with which the decisions are made.
Does that open us to the accusation of presenting a partisan view? Absolutely. We can do no other. We are all a product of history, of a particular time and cultural milieu, and to claim to be uninfluenced by our presuppositions and the emphases of our times would be dishonest. Getting back to openness, honesty and integrity, then, it is important that when embarking on the tough process of selecting we make our presuppositions known. That will help future generations understand our methodology, and hopefully they won’t judge us too harshly for it.
One of the toughest aspects of selecting, I find, is scope creep: not sticking to the boundaries of what one has decided to select. One always wants to include more. It is not the items that should obviously be included or obviously be excluded that present the problem; it is the borderline items that one agonises over. So a useful exercise, I find, is to firm up the boundaries by using what I call the Selection Funnel.

Let’s say, for example, that my institution is set up to showcase the best of South African English literature, and the Scoping process has identified that as an institution we need to emphasise struggle literature of the 1960s to tie in with a significant anniversary in our nation’s history. And perhaps in the Screening exercise we identified a particular need to support the school curricula with poetry in English relating to the Struggle. We might therefore create a number of positive statements of what we want selected, such as:
- Select works from our collection published by South African authors
- Select works from our collection published in English
- Select works from our collection published in the 1960s
- Select works from our collection that are directly related to the Struggle
- Select works from our collection that are poetry
One would then want to create the opposite of each statement to reinforce it in the mind of the selector:
- Do not select works from our collection not published by South African authors
- Do not select works from our collection not published in English
- Do not select works from our collection not published in the 1960s
- Do not select works from our collection that are not directly related to the Struggle
- Do not select works from our collection that are not poetry
Then one creates a joint statement from each pair:
- Select works from our collection published by South African authors; Do not select works from our collection not published by South African authors
- Select works from our collection published in English; Do not select works from our collection not published in English
- Select works from our collection published in the 1960s; Do not select works from our collection not published in the 1960s
- Select works from our collection that are directly related to the Struggle; Do not select works from our collection that are not directly related to the Struggle
- Select works from our collection that are poetry; Do not select works from our collection that are not poetry
Now you have a bunch of statements that may seem to be all over the place. What you want is to arrange them in order from broadest to narrowest statement, creating a funnel that makes it more and more difficult for any piece of literature to qualify in terms of all the statements as one proceeds down the funnel.

The order in which I place these statements is also going to be influenced by the arrangement of my collection. So if my collection is arranged at the broadest level by form (poetry, novels, etc.), then that is going to be at the top of my selection funnel. If, however, my collection was arranged at the broadest level by language, then that would be at the top of the funnel. Assuming the former scenario, I might arrange the statements like this:
- Select works from our collection that are poetry; Do not select works from our collection that are not poetry
- Select works from our collection published in English; Do not select works from our collection not published in English
- Select works from our collection published by South African authors; Do not select works from our collection not published by South African authors
- Select works from our collection published in the 1960s; Do not select works from our collection not published in the 1960s
- Select works from our collection that are directly related to the Struggle; Do not select works from our collection that are not directly related to the Struggle
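
For readers who keep their catalogue in a spreadsheet or database, the funnel above could be sketched in a few lines of Python. This is only an illustration under my own assumptions: the `Work` record, its field names and the sample titles are hypothetical and not taken from any real catalogue system. Each statement becomes one test, applied in funnel order, and the first statement an item fails is recorded.

```python
from dataclasses import dataclass


@dataclass
class Work:
    # Hypothetical catalogue record; the field names are assumptions.
    title: str
    form: str
    language: str
    author_nationality: str
    year: int
    struggle_related: bool


# The funnel: (statement, test) pairs ordered from broadest to narrowest.
FUNNEL = [
    ("is poetry", lambda w: w.form == "poetry"),
    ("published in English", lambda w: w.language == "English"),
    ("by a South African author",
     lambda w: w.author_nationality == "South African"),
    ("published in the 1960s", lambda w: 1960 <= w.year <= 1969),
    ("directly related to the Struggle", lambda w: w.struggle_related),
]


def run_funnel(work, funnel=FUNNEL):
    """Return (selected, failed_statement) for a single work.

    The work passes down the funnel one statement at a time and is
    rejected at the first statement it does not satisfy.
    """
    for statement, test in funnel:
        if not test(work):
            return False, statement
    return True, None


# Made-up sample items, purely for illustration.
poem = Work("Example Poem A", "poetry", "English", "South African", 1964, True)
novel = Work("Example Novel B", "novel", "English", "South African", 1966, True)
late_poem = Work("Example Poem C", "poetry", "English", "South African", 1975, True)

for item in (poem, novel, late_poem):
    selected, failed_at = run_funnel(item)
    print(item.title, "selected" if selected else f"not selected: not {failed_at}")
```

Recording which statement an item failed is the useful part: it gives the selector a one-line reason for every borderline exclusion, which is the kind of documentation future users will need.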

So one can see that the exercise of creating a Selection Funnel both helps to firm up decision making about what is in and what is out of our selection, and documents our approach to selecting so that future users may understand the method in our madness!