I’m starting to think that next big innovation online, other than ‘Big Data’ (the relation of massively huge sets of data to one another to make deeper insights possible), is ‘Meaningful Data’. 1
Note: This quote by Jordan Louis is a more accurate description of Web 2.0 than we realise. Huge quantities of data files are being uploaded to the internet every day. The latest statistics from YouTube say 60 hours of video are uploaded every minute. With cloud storage being offered at low prices, we are gathering digital data at an incredible rate, which raises two questions: do we need to store all this data, and how are we going to find anything ever again? Do a search on Google these days and you rarely get what you are looking for, unless you know how to search. I’ve become an expert at searching. I thought it was because I was just becoming more savvy, until I realised I was unknowingly becoming more metadata aware. I have been trying to get round to learning more about metadata and how it can be used for my research, but then I realised it should not be just for research. Metadata is for Life!
My Aha, Eureka moment came when my back door wouldn’t open. Naturally I turned to Google. Instead of typing in ‘how to fix a back door’ and getting a gazillion useless results, I searched for types of doors (mine’s a uPVC) and types of hinge (I found out mine was a flat hinge), then I put the following in the search engine: ‘uPVC door, flat hinge, won’t open’. I got a few forums and video tutorials on what to do when the lock in your uPVC door isn’t unlocking correctly. Result.
At Digitalis 2.0, a conference organised by the students of Digital Humanities at University College Cork in 2016, one of the keynote speakers was Deirdre Ní Luasaigh, co-founder of Culture Ark. Culture Ark is a digital archiving and preservation company based in Ireland. Ní Luasaigh spoke about the importance of metadata for accessing and future-proofing data, in particular important archival material. With new technologies coming to the fore every day, the current formats, media, software and even the hardware we use could face obsolescence in around ten years. Since most of our data, in particular cultural and personal heritage, is now saved in digital format, losing this information would be a great loss to future society. She also mentioned emerging standards for metadata such as Dublin Core. Her talk was very interesting because, although I was aware of metadata, it was not a strategy I had thought of integrating into my research.
What is Metadata?
Metadata is data that provides information about other data 2. It provides information about an item or its content. For example, common metadata properties for an image could include: creator, description, title, headline, keywords, location, licence, size, and resolution. A text document’s metadata may contain information such as how long the document is, the author, the date written, a summary and the date published.
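As a rough illustration, the image properties listed above can be pictured as a small record that sits alongside the image. The sketch below is in Python; the class name `ImageMetadata`, its field names and all the sample values are made up for this example and do not follow any particular standard.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class ImageMetadata:
    """Hypothetical record holding common properties of an image."""
    title: str
    creator: str
    description: str = ""
    keywords: list[str] = field(default_factory=list)
    location: str = ""
    licence: str = ""
    size_bytes: int = 0
    resolution: tuple[int, int] = (0, 0)  # width, height in pixels


# Sample values are invented for illustration only.
photo = ImageMetadata(
    title="Back door hinge",
    creator="A. Blogger",
    description="Close-up of a flat hinge on a uPVC door",
    keywords=["uPVC", "door", "flat hinge"],
    licence="CC BY 4.0",
    size_bytes=2_400_000,
    resolution=(4000, 3000),
)

# The record describes the image without containing the image itself.
for name, value in asdict(photo).items():
    print(f"{name}: {value}")
```

The point of the sketch is only that metadata is structured: each property has a name and a value, so a program can search or sort on it.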
[Metadata combines] “data” with “meta-,” which means “transcending” and is often used to describe a new but related discipline designed to deal critically with the original one. “Metadata” takes the “transcending” aspect a step further, applying it to the concept of pure information instead of a discipline. “Metadata” is a fairly new word (it first appeared in print in 1983), whereas “data” can be traced back to the middle of the 17th century. 3
Why use Metadata:
A few reasons to use metadata are mentioned above, such as future-proofing your archives or data. However, there are other reasons why a metadata strategy should be implemented for any kind of digital data you create. If there were a standard metadata description across document types created by different organisations, there would be much easier access, improved retrieval of information and better interoperability across systems, from governments to non-profits, businesses to museums. All organisations already need to be compliant with certain standards, for example in record keeping and revenue. A standard of compliance for the management of digital data is just as important.
We are all aware of digital technologies and use them daily without thinking. We gather data with great speed and ease, yet we seem to have no regard for its long-term preservation. In an article in the Telegraph, internet pioneer Vint Cerf, Vice-President of Google, says it is time to start preserving the vast quantities of digital data before they are lost forever. He warned that the 21st century could become a second Dark Ages, due to the amount of data now kept in digital format. 4
Why everyone needs to know about Metadata
Many of us use metadata without even realising it. Search for music on Spotify, and you can search by artist, album or song title, because metadata is embedded in the audio files. Search for a book on Amazon, and you have a range of criteria you can select from. These are not just tags added on top of the image associated with the file. The information can be embedded within the file itself. This is called Embedded Metadata. Photographs we take with our camera phones have embedded metadata too: the date we took the photo, the size of the file.
Knowledge about metadata is not just the reserve of archivists and researchers. Anyone who uses the internet, who uploads any kind of content, from photographs to videos, blogs to websites, should know about metadata. Photographers posting their photos, artists posting images of their art work, writers posting their work in online journals or magazines, should all have a basic grounding in metadata creation. These digital forms can have a metadata file attached or embedded with the information you choose to add. It will only take a small step of awareness to move from using metadata unconsciously to recording it consciously. Understanding what metadata is, and creating simpler tutorials and tools, is the key. It should be taught not just at university level but at school level, as it affects everything we do as we become a more digitized world.
Metadata Challenge
While data curators, and increasingly researchers, know that good metadata is fundamental, the challenge of creating a metadata strategy that meets personal and future users’ needs feels like a complex task to the majority.
Taking for granted that cultural heritage digital applications create digital libraries, every project have to face the challenge of choosing a framework of metadata to guarantee the sound management of its life-cycle, form creation to preservation passing through data delivery. The right choice of on or more metadata application profiles depends both on the current state of the art of metadata standards and on each specific scenario of application. Some misunderstandings for example have been done in applying administrative schemes with the goal of building retrieval base for digital collection [Feliciati, 2010]
The main challenge in creating a metadata strategy is the range of different vocabularies and standards that can be used. Fortunately, with the development of organisations such as the Dublin Core Metadata Initiative (DCMI), we are closer to creating basic standards, though no doubt these will need constant upgrading as technologies progress. DCMI is an open organisation that began in the 1990s from an informal workshop series and has now attracted the participation of a world-wide community. The mission of the DCMI is to create a Dublin Core metadata standard that makes it easier to find resources using the Internet, through the following activities:
- Developing metadata standards for discovery across domains;
- Defining frameworks for the interoperation of metadata sets;
- Facilitating the development of community or discipline-specific metadata sets that work within the frameworks of cross-domain discovery and metadata interoperability [Weibel & Koch 2000].
Metadata Terms
The DCMI supports shared innovation in metadata design and best practices across a broad range of purposes and business models. It has compiled a list of Metadata Terms including properties, classes, definitions and elements.
A full list of DCMI metadata terms can be found on the Dublin Core Metadata Initiative website: http://dublincore.org/documents/dcmi-terms/#H1
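To give a feel for what Dublin Core terms look like in practice, here is a minimal sketch in Python that writes a few of them out as XML. It uses only the standard library. The record describes the Weibel and Koch article from the bibliography; the `metadata` wrapper element and the helper name `dublin_core_record` are choices made for this sketch, not something DCMI prescribes.

```python
import xml.etree.ElementTree as ET

# Namespace of the classic fifteen-element Dublin Core element set.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields: dict[str, list[str]]) -> ET.Element:
    """Build a flat record; each value becomes one <dc:...> element."""
    # "metadata" is just a wrapper element chosen for this sketch.
    record = ET.Element("metadata")
    for term, values in fields.items():
        for value in values:
            ET.SubElement(record, f"{{{DC_NS}}}{term}").text = value
    return record


record = dublin_core_record({
    "title": ["The Dublin Core Metadata Initiative: Mission, "
              "Current Activities, and Future Directions"],
    "creator": ["Weibel, Stuart L.", "Koch, Traugott"],  # repeatable element
    "date": ["2000"],
    "identifier": ["http://dx.doi.org/10.1045/december2000-weibel"],
})

print(ET.tostring(record, encoding="unicode"))
```

Because every organisation using these terms agrees on what `dc:creator` or `dc:date` means, a search tool can read records from many different sources in the same way.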
Breakdown of Metadata:
As a visual learner, I am always searching for visualizations of what I am researching. Among the thousands of boring charts and graphs, I came across these diagrams of Metadata in an article on BrainTraffic.com, An Intro to Metadata and Taxonomies, written by Christine Benson. Finally, a diagram that makes sense! (They don’t hold all the relevant information, however; if I get the time, I will draw up my own diagrams and revise this article.)
Metadata records are composed of three pieces:
Syntax: encoding standards; a container for the structure and semantics, and a set of rules by which the contents should be interpreted, eg RDF, EML, XHTML, DHML. It gives the computer system the instructions on how to interpret the record as a whole.
Structure: fits inside the syntax and is composed of a metadata scheme. Examples of structure include Dublin Core, Encoded Archival Description (EAD), Visual Resources Association (VRA) Core 4.0. This gives humans instructions on how to interpret the information within individual tags (eg creator, title, subject).
Semantics: content standards; they define how the information within the tags should be formatted, for example where periods and commas should go, or the order of names (last, first). This ensures uniformity of the information within the tags.
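The semantics layer is the easiest of the three to show in code. The sketch below, in Python, applies two content rules before a value goes into a tag: names are written "last, first" and dates are written year-month-day. Both rules and both function names are assumptions picked for illustration; a real content standard would spell out its own rules, and the naive surname handling here would not cope with every name.

```python
from datetime import datetime


def format_creator(full_name: str) -> str:
    """Rewrite 'First Last' as 'Last, First'.

    Assumes the final word is the surname, which is not true of every name.
    """
    parts = full_name.strip().split()
    if len(parts) < 2:
        return full_name.strip()
    return f"{parts[-1]}, {' '.join(parts[:-1])}"


def format_date(text: str) -> str:
    """Rewrite a date written like 'Aug. 25, 2014' as '2014-08-25'."""
    return datetime.strptime(text, "%b. %d, %Y").date().isoformat()


print(format_creator("Christine Benson"))  # Benson, Christine
print(format_date("Aug. 25, 2014"))        # 2014-08-25
```

With rules like these applied consistently, two records that mean the same thing also look the same, which is what makes them sortable and searchable.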
The Wendler taxonomy divides metadata into three functional categories: Descriptive, Administrative and Structural. To show how this categorisation works, and to relieve the eyes from text for just one moment, I will introduce a picture of a cute kitten.
1. Structural Metadata | 2. Administrative Metadata | 3. Descriptive Metadata |
Title: | Title: | Title: Kitten |
Tags: | Tags: | Tags: kitty, cat, kitten, pet, animal, cute |
Description: | Description: | Description: image of a very cute kitten |
Created by: | Created by: Ty Swartz | Created by: Ty Swartz |
Date: | Date: Aug. 25, 2014 | Date: Aug. 25, 2014 |
Source: | Source: https://pixabay.com/en/kitty-cat-kitten-pet-animal-cute-551554/ | Source: https://pixabay.com/en/kitty-cat-kitten-pet-animal-cute-551554/ |
Usage Rights: | Usage Rights: CC0 Public Domain | Usage Rights: CC0 Public Domain |
Above is a breakdown of some possible elements of the metadata I could write about the image.
- Structural: In the first column, the red type describes the internal structure of the document: the headings or elements of the photo, eg Title, Tags, Description, Created by, Date, Source, Usage Rights.
- Administrative: In the second column, in blue, we add the technical information about the digital image’s creation: copyright and licensing information, the source, the date it was created, and who created it.
- Descriptive: In the third column, we add the information used to identify and recover digital objects: the name of the image, the description of the image content, and tags about the content.
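The same breakdown can be sketched as a small Python structure. The values come from the kitten table above; the dictionary layout and the helper names `field_names` and `matches` are only one possible way of arranging them, chosen for this example.

```python
# Values from the kitten table; the grouping into two dictionaries
# is an illustrative choice, not part of any standard.
kitten_metadata = {
    "descriptive": {
        "title": "Kitten",
        "tags": ["kitty", "cat", "kitten", "pet", "animal", "cute"],
        "description": "image of a very cute kitten",
    },
    "administrative": {
        "created_by": "Ty Swartz",
        "date": "Aug. 25, 2014",
        "source": "https://pixabay.com/en/kitty-cat-kitten-pet-animal-cute-551554/",
        "usage_rights": "CC0 Public Domain",
    },
}


def field_names(record: dict) -> list[str]:
    """Structural view: just the headings, with no values filled in."""
    return [name for group in record.values() for name in group]


def matches(record: dict, term: str) -> bool:
    """Search only the descriptive fields, as a search box typically would."""
    d = record["descriptive"]
    haystack = [d["title"], d["description"], *d["tags"]]
    return any(term.lower() in text.lower() for text in haystack)


print(field_names(kitten_metadata))
print(matches(kitten_metadata, "pet"))      # True: "pet" is one of the tags
print(matches(kitten_metadata, "pixabay"))  # False: source is administrative
```

The last two lines show why the categories matter: the descriptive fields are what help someone find the image, while the administrative fields tell them where it came from and whether they may use it.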
Metadata Taxonomy:
To explain about taxonomy, I must return to the article by Christine Benson, and replicate another of her diagrams.
[Benson 2012]
Taxonomy is the process or system of describing the way in which different living things are related by putting them in groups 5.
At its simplest, a taxonomy organizes information, and metadata describes it. For the taxonomy to be able to organize the information, terms need to be stored as metadata. It all works together to make the content findable, recognizable, and useful. [Benson 2012]
Common types of taxonomy include:
- Term List: preferred language
- Thesauri: this makes associations between the term lists. It translates conceptual relationships between the content, often made naturally by humans, into something a computer can understand. Thesauri typically address three types of relationships: equivalent (synonyms), hierarchical (broad-to-narrow terms), and/or associative (related terms). [Benson 2012]
- Hierarchies: structural framework, which we know to be the elements of metadata as in: Title, Tags, Description, Created By, Date, Source
- Schema: A key component of metadata is the schema. Metadata schemes are the overall structure for the metadata. It describes how the metadata is set up, and usually addresses standards for common components of metadata like dates, names, and places. There are also discipline-specific schemas used to address specific elements needed by a discipline. [Hayslett]
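A thesaurus of the kind described in the list above can be sketched as a lookup table with the three relationship types. The Python below uses a toy vocabulary; every term and relationship in `THESAURUS`, and the function name `expand_query`, is invented for illustration.

```python
# A toy thesaurus. All terms and relationships are made up for this sketch.
THESAURUS = {
    "cat": {
        "equivalent": ["kitty", "feline"],  # synonyms
        "narrower": ["kitten"],             # hierarchical, broad to narrow
        "related": ["pet"],                 # associative
    },
    "pet": {
        "equivalent": ["companion animal"],
        "narrower": ["cat", "dog"],
        "related": ["vet"],
    },
}


def expand_query(term: str, thesaurus: dict) -> list[str]:
    """Return the term plus its synonyms and narrower terms.

    Related terms are left out so the search does not drift off topic.
    """
    entry = thesaurus.get(term, {})
    expanded = [term]
    for relation in ("equivalent", "narrower"):
        for other in entry.get(relation, []):
            if other not in expanded:
                expanded.append(other)
    return expanded


print(expand_query("cat", THESAURUS))  # ['cat', 'kitty', 'feline', 'kitten']
```

This is the sort of association a human makes without thinking: someone searching for "cat" would be happy to find a photo tagged "kitten", and the thesaurus lets the computer make the same leap.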
Conclusion:
Good metadata can make up for the mistakes we make. With our computers becoming more and more filled with digital data, a good filing process is essential. In today’s economy, with temporary or zero-hours contracts, and even voluntary organisations, the digital footfall of people through data has never been greater. Implementing a metadata strategy improves workflow and makes information easier to access.
There are several possibilities for metadata generation:
- metadata created by an expert or authority,
- hybrid systems that combine automatic metadata extraction with collaborative metadata generation, where the user may input metadata and it is then assessed by an expert,
- or the new phenomenon of social tagging, which we are all familiar with: free-form tags, not edited by professionals. They tend to be descriptive collections of keywords, or tags, used for navigation, filtering and searching. Sites such as Flickr or YouTube allow for social tagging; the users create the metacontent by attaching tags to their uploaded digital web content. It tends to be more user-friendly but perhaps not as uniform.
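The "not as uniform" problem with social tagging is easy to see, and partly fix, in a few lines of Python. In this sketch three imaginary users have tagged the same photo; the tags and the clean-up rule (trim spaces, lower-case) are assumptions made for the example.

```python
from collections import Counter

# Hypothetical tags typed by three different users for the same photo.
user_tags = [
    ["Kitten", "cute", "cat "],
    ["kitten", "Pet", "CUTE"],
    ["cat", "kitten", "animal"],
]


def normalise(tag: str) -> str:
    """Apply a simple clean-up rule: trim spaces and lower-case the tag."""
    return tag.strip().lower()


# Count how many times each cleaned-up tag was used across all users.
tag_counts = Counter(normalise(tag) for tags in user_tags for tag in tags)

for tag, count in tag_counts.most_common():
    print(f"{tag}: {count}")
```

Without the clean-up step, "Kitten" and "kitten" would count as two different tags. With it, the most popular tags rise to the top, which is roughly how user tagging can produce something usable without an expert editing every entry.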
In further articles I will be discussing more about metadata, including tutorials on how to create metadata for various programs, links to online guides, and its relevance as a digital tool for community participation. By demystifying digital language and its potential uses, we can create a conversation about digital systems that are user-friendly: a ‘bottom-up’, grassroots approach to the evolution of digital technologies, rather than a ‘top-down’, academic approach. An example of this is the term Folksonomy, coined by Thomas Vander Wal in 2004.
Folksonomy is the result of personal free tagging of information and objects (anything with a URL) for one’s own retrieval. The tagging is done in a social environment (usually shared and open to others). Folksonomy is created from the act of tagging by the person consuming the information. [Vander Wal 2007]
Constantly, the words of Heidegger keep coming back to haunt me. It is essential that, throughout the development of the web, of information and of the digital world we live in, we keep the human element at the back of our minds at all times. Technology should be used to free us from the toils of this world, not make it harder, make us work harder, or make us feel we don’t or won’t understand. That is what religion is for, the mysteries. We need to put more folk back in, but without the long hair and flared trousers.
The essence of technology is nothing technological, that is to say technology cannot be understood through its functionality, but only through our specifically technological engagement in the world [Heidegger 1977].
References:
- Jordan Louis, ‘The Importance of Web Semantics’, Cardinal Path – Web Analytics and Data Driven Marketing <http://www.cardinalpath.com/the-importance-of-web-semantics/>
- ‘Definition of METADATA’, Merriam-Webster Dictionary. Web <http://www.merriam-webster.com/dictionary/metadata>
- ‘Definition of METADATA’, Merriam-Webster Dictionary. Web <http://www.merriam-webster.com/dictionary/metadata>
- Sarah Knapton, ‘Print out Digital Photos or Risk Losing Them, Google Boss Warns’, 2015 <http://www.telegraph.co.uk/news/science/science-news/11410506/Print-out-digital-photos-or-risk-losing-them-Google-boss-warns.html>
- ‘Definition of TAXONOMY’ <http://www.merriam-webster.com/dictionary/taxonomy>
Bibliography:
Benson, Christine, ‘An Intro to Metadata and Taxonomies « Brain Traffic Blog’, An Intro to Metadata and Taxonomies, 2012 <http://blog.braintraffic.com/2012/03/an-intro-to-metadata-and-taxonomies/>
Merriam-Webster, ‘Definition of METADATA’ <http://www.merriam-webster.com/dictionary/metadata>
Merriam-Webster,‘Definition of TAXONOMY’ <http://www.merriam-webster.com/dictionary/taxonomy>
‘DMP_Checklist_FINAL – DMP_Checklist_2013.pdf’ <http://www.dcc.ac.uk/sites/default/files/documents/resource/DMP/DMP_Checklist_2013.pdf>
Feliciati, Pierluigi, ‘Towards a Sound Management of Digital Culture. Metadata Schemes and Application Profiles for Digital Repositories’, 2010 <http://eprints.rclis.org/handle/10760/14667>
Hayslett, Michele, ‘LibGuides: Metadata for Data Management: A Tutorial: Why Do I Need It?’ <http://guides.lib.unc.edu/c.php?g=8749&p=44500>
Hedden, Heather, ‘Taxonomies and Controlled Vocabularies Best Practices for Metadata’, Journal of Digital Asset Management, 6 (2010), 279–84 <http://dx.doi.org/10.1057/dam.2010.29>
Heidegger, Martin. (1977). The Question Concerning Technology. The Question Concerning Technology. Garland Science. Germany
Kiorgaard, Deirdre, ‘Resource Description and Access | National Library of Australia’, National Library of Australia, 2009 <https://www.nla.gov.au/content/resource-description-and-access>
Knapton, Sarah, ‘Print out Digital Photos or Risk Losing Them, Google Boss Warns’, 2015 <http://www.telegraph.co.uk/news/science/science-news/11410506/Print-out-digital-photos-or-risk-losing-them-Google-boss-warns.html>
Louis, Jordan, ‘The Importance of Web Semantics’, Cardinal Path – Web Analytics and Data Driven Marketing <http://www.cardinalpath.com/the-importance-of-web-semantics/>
Lubas, R., A. Jackson, and I. Schneider, The Metadata Manual: A Practical Workbook (Elsevier Science, 2013) <https://books.google.ie/books?id=kgBEAgAAQBAJ>
RDA, ‘Arts and Humanities Metadata Standards’, RDA / Metadata Directory, 2016 <http://rd-alliance.github.io/metadata-directory/standards/>
Reser, Greg, ‘Metadata Deluxe / Introduction to Embedded Metadata’, Metadata Deluxe, 2013 <http://metadatadeluxe.pbworks.com/w/page/20792243/Introduction%20to%20Embedded%20Metadata>
‘Spatial_Heritage_and_Archaeological_Research_Environment_I.T._Final_Report_08.pdf’ <http://www.heritagecouncil.ie/fileadmin/user_upload/INSTAR_Database/Spatial_Heritage_and_Archaeological_Research_Environment_I.T._Final_Report_08.pdf>
Vander Wal, Thomas, ‘Folksonomy’, Vanderwal, 2007 <http://www.vanderwal.net/folksonomy.html>
Weibel, Stuart L., and Traugott Koch, ‘The Dublin Core Metadata Initiative: Mission, Current Activities, and Future Directions’, D-Lib Magazine, 6 (2000) <http://dx.doi.org/10.1045/december2000-weibel>
So, what do you think?