A Step Ahead: From Acts to Aggregates — Record-ness and Data-ness in Practice

Introduction 

What is the difference between records and data? What differentiates records managers from data managers? Do these distinctions still matter as organizations take the plunge into artificial intelligence? Discussions that attempt to distinguish between records and data frequently articulate a heuristic for differentiation: “These items are records; those items are data.” Many organizations have sensible legal, compliance, and business reasons for making this kind of distinction. It is often difficult, however, to classify assets logically and consistently as strictly either data or records. One can nonetheless highlight important and useful differences between records and data by thinking of them as characteristics rather than distinct object classes. When grappling with the difference between records and data, we should explore the “record-ness” and “data-ness” of all assets, considering what they do, not just what they are.

What is the essential “ness” of records and data? This question prompts endless debate. Here, I ask the reader to consider that for data, the essentialness is the aggregate. The magic of data is their ability to represent facts about the world and facilitate aggregation for analysis to illuminate larger fact patterns that can support decisions and actions.[i] Data-ness, therefore, is the ability of any type of asset to represent facts and facilitate aggregation. The essentialness of records is the particular. The magic of records is their ability to execute and memorialize individual acts and decisions and convey representations of those acts and decisions into the future for others to engage and understand.[ii] Record-ness is thus the ability of any type of asset to enable and document acts and continue to represent those acts beyond their occurrences. Many information assets have characteristics of both data-ness and record-ness. Conceptualizing records and data as characteristics rather than object classes provides a pathway for thinking about the roles of data managers and records managers and fostering fruitful partnerships between them.

What Data and Records Are: The Path Leading to the Thicket 

Many discussions about data and records focus on characteristics of assets that distinguish them as either records or data. A commonly articulated distinction is that data are structured assets and that records are unstructured assets. While this distinction may make sense at first glance, it does not withstand scrutiny. Daragh O Brien notes in a 2024 TDAN article that assets commonly referred to as unstructured data, such as letters, memos, emails, and books, are, in fact, highly structured artifacts that are framed by the conventions of their documentary form.[iii] Handwritten letters, for example, are internally structured assets typically containing elements such as the date, salutation, body, and signature. Definitions of records commonly note that records have context, content, and structure.[iv]

Digital technologies have made it essentially untenable to keep a clear and stable distinction between structured and unstructured data. Digital content transforms between what people refer to as “structured” and “unstructured” states every day. For example, websites deliver assets commonly thought of as unstructured data, such as articles and blog posts, from multiple infrastructural layers of structured encoded texts and databases. Virtually all information assets possess some form of structure unless they are a random collection of text, numbers, or other symbols.   

Many organizations define records as a distinct asset class within a broader universe of information assets, often asserting that information assets only become records when they are finalized, checked into recordkeeping systems, and formally declared as records. This characterization of records is driven more by business needs and compliance obligations to exert careful control over records than by an abstract consideration of the definitions of records and data. These are reasonable choices to make to establish sustainable and effective controls over records within an organization. This approach, however, does not offer the final word on the distinction between records and data. Do records come from data or do people extract data from records? Do records contain data? Can assets be data and records at the same time, or are these mutually exclusive states of being? If an asset can only be a record once it is fixed and finalized, does that mean data can be alterable? Can long-term records, such as personnel records and medical records, be updated over time?

While individual data and records managers may have to make categorical distinctions between records and data that make business sense at their institutions (“These are records; those are data.”), these distinctions are unlikely to withstand the intellectual rigors of universality. At a conceptual level, rather than going through complex analysis to parse out which types of assets are records and which are data, it is more fruitful to focus on what records and data do and the affordances they provide. Engaging in this conceptual work provides records and data managers with firm grounding to make sound policy and process decisions for their organizations.        

What Data and Records Do: The Path to Manageable Clarity  

A cornerstone of a record, the thing that gives it its record-ness, is its relationship to acts and events and its ability to represent those acts and events beyond their occurrence. Definitions of records typically feature this relationship with acts and decisions. International Standard 15489-1:2016 observes that records are “created, received, and maintained as evidence…in pursuit of legal obligations or in the transaction of business.”[v] In defining federal records, the US government notes that federal records are recorded information “made or received by a Federal agency under Federal law or in connection with the transaction of public business” and preserved as “evidence of the organization, functions, policies, decisions, procedures, operations, or other activities of the United States Government or because of the informational value of data in them.”[vi]

There is a rich body of archival literature on the complex and contested nature of the relationship between records and the acts and events they represent. An important concept to highlight is that records are created as part of the process of, or in close proximity to, the acts they represent. Records can either be of the acts they represent (dispositive records) or about the acts they represent (probative records). A contract is the embodiment of an agreement between two or more parties (dispositive). A birth certificate is not the actual birth of a child, but it is documentation of that birth (probative). Both the contract and the birth certificate continue to represent those acts long after they have occurred, conveying representations of those acts across both time and context. Thus, records play an essential role in people knowing and accepting the reality of acts, decisions, and events occurring. “Even though I was not there, I accept that this person was born in this hospital, in this city, on this date.”

Because records gain their record-ness through the representation of particular acts, decisions, and events, record-ness does not, on its own, provide users with the affordance of scalability. By itself, a birth record tells us a lot about the person it documents, but it does not tell us much about that person’s society. A public health official may, however, gather that birth record along with thousands of other birth records and extract structured information from them to gain demographic and public health insights. Here, the compiled information has strong data-ness because it facilitates analysis that enables insight not just into a particular birth, but into larger patterns of births. It is through strong record-ness and data-ness of assets that individuals, organizations, and societies tackle their most difficult challenges. The recognition of the presence of an opioid crisis in the United States, for example, came from data analysis of thousands of records documenting individual overdoses that revealed a nationwide pattern of addiction.

While using the term “structured assets” as a shorthand to distinguish data from records is problematic, the term does capture an essential characteristic of data-ness — the ability to aggregate multiple representations of facts, acts, and events into a patterned view of many facts, acts, and events. Aggregation places a premium on well-structured assets that allow accurate insights to be consistently derived from multiple assets. One can think of the term “structured asset” as not a shorthand for “data,” but a shorthand for “a characteristic an asset needs to possess to facilitate robust analysis with other data, thereby maximizing the core affordance of its data-ness.” 
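The birth-record example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `BirthRecord` fields, the sample values, and the `births_per_city_year` helper are all hypothetical and are not drawn from any real recordkeeping system. The sketch shows that the same assets can be read one at a time (record-ness) and, because they share a consistent structure, counted together (data-ness).

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)  # frozen: a stand-in for the fixity expected of a record
class BirthRecord:
    """Hypothetical record documenting one particular birth."""
    child_name: str
    born_on: date
    hospital: str
    city: str


def births_per_city_year(records):
    """Aggregate many particular records into counts per (city, year)."""
    return Counter((r.city, r.born_on.year) for r in records)


# Invented sample values, for illustration only.
records = [
    BirthRecord("Child A", date(2023, 3, 14), "General Hospital", "Springfield"),
    BirthRecord("Child B", date(2023, 7, 2), "General Hospital", "Springfield"),
    BirthRecord("Child C", date(2024, 1, 9), "Riverside Clinic", "Shelbyville"),
]

# Record-ness: each item still documents a single, particular event.
first = records[0]

# Data-ness: shared structure lets the same items be analyzed as a group.
counts = births_per_city_year(records)
```

The aggregation works only because every record carries the same fields in the same form; if some records lacked a city or stored dates inconsistently, the individual records would still document their births, but the counts would no longer be reliable.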

Implications and Conclusion 

By focusing on what records and data do rather than on what they are or how they are structured, data managers, records managers, and other information management professionals can harness a nimble, humble, and confident approach to grappling with the complexities, constraints, contradictions, and challenges of records and data. While many data and records managers may have to identify assets as either data or records for practical reasons at their institutions, they should not limit their understanding of their responsibilities to this binary view. Assigning each role to narrowly “manage the data” or “manage the records” constrains their potential and limits organizational effectiveness. 

A data-ness and record-ness mindset can facilitate the development of information management and governance policies that do not get hung up on definitions but focus on what people can do with information assets. This approach to records and data can provide organizations with the conceptual flexibility needed to govern the draft documents, ad-hoc datasets, and other working files whose informality defies clear categorization. It can also directly shape how organizations construct their data and records lifecycle models and carry out the retention and disposition of their information assets. This mindset can also inform the composition and work of data and information governance teams. 

Data-ness and record-ness also provide organizations the grounding for managing assets as records and data at the same time. A hospital’s patient records, for example, have strong record-ness qualities, documenting the care of patients and being governed by records retention laws and regulations, but also have data-ness characteristics, existing in the form of highly structured data to facilitate information exchange that supports the timely delivery of high-quality care. Doctors may aggregate data from the patient records to conduct quality control analysis. That quality control dataset has strong data-ness characteristics, but also has record-ness considerations, as that dataset provides evidence of the hospital carrying out quality control work. Focusing on what data and records do, rather than what they are, helps data and records managers navigate the complex realities of our dynamic digital world.

Record-ness suggests that records managers should primarily concern themselves with documenting the acts and decisions of their institutions and conveying authentic representations of those acts and decisions. Data-ness suggests that data managers should primarily concern themselves with ensuring assets are well-structured to facilitate sound data analysis that can generate insights that are creative, non-obvious, and true to the physical and social world. This should be seen as a spectrum of responsibilities, not a distinct cleaving of tasks. Rather than focus on divisions of labor, data managers and records managers should concentrate on collaboration. Both should ask shared questions across three key domains: sustaining institutional memory, facilitating effective governance, and enabling analytics.

Sustaining institutional memory:  

  • How well are the acts and decisions of our organization documented?  
  • How well can representations of those acts and decisions be conveyed into the future to those who were not present at those acts and decisions? How well does our organization preserve and protect those representations so others can understand them and generally accept them as authentic — that they are what they purport to be? 

Facilitating effective governance:  

  • Does our organization have controls in place to enable appropriate access to assets? 
  • Does our organization have effective and systematic processes in place to identify and remove assets that no longer support any obligations or meet business needs? 

Enabling analytics:

  • Does our organization have the assets it needs to make smart business decisions, sustain itself, and meet legal, contractual, fiduciary, and ethical obligations? 
  • Does our organization have well-structured assets that facilitate new insights and observations that scale?   

Addressing these questions in a collaborative spirit, in alliance with other stakeholders, affords data managers and records managers the best opportunities to meet the challenges of their work and thrive.  


References 

[i] DAMA International, Data Management Body of Knowledge, Second Edition (Technics Publications: Basking Ridge, NJ), 2017, pgs. 18-19.

[ii] Archival scholar Geoffrey Yeo provides the characterization of records that I find to be the most persuasive. He describes records as “persistent representations of activities, created by participants or observers or their authorized proxies.” “Concepts of Record (1): Evidence, Information, and Persistent Representations,” American Archivist, Vol 70, No 2 (2007), pg. 342. He also describes records as “persistent representations through which social acts are performed.” Records, Information and Data: Exploring the role of record-keeping in an information culture, (Facet Publishing: London), 2018, pg. 191.

[iii] “Data is Risky Business: Structured, Unstructured (Who Cares?),” TDAN: The Data Administrator Newsletter, (December 4, 2024), https://tdan.com/data-is-risky-business-structured-unstructured-who-cares/32252.

[iv] International Standard 15489-1, Information and Documentation - Records Management - Part 1: Concepts and Principles, Second Edition, (International Organization for Standardization: Switzerland), 2016.

[v] International Standard 15489-1:2016, clause 3.14.

[vi] 44 U.S. Code § 3301.


Approved for Public Release; Distribution Unlimited. Public Release Case Number 25-2048. The author’s affiliation with The MITRE Corporation is provided for identification purposes only and is not intended to convey or imply MITRE’s concurrence with, or support for, the positions, opinions, or viewpoints expressed by the author. ©2025 THE MITRE CORPORATION. ALL RIGHTS RESERVED.

About the Author 

Dr. Eliot Wilczek is a principal records management engineer at the MITRE Corporation, whose work focuses on federal records management and disclosure. He previously worked as an archivist and records manager at Tufts University, Brandeis University, and Bowdoin College. He has a Ph.D. in library and information science from Simmons University.  

The MITRE Corporation

MITRE administers several federally funded research and development centers (FFRDCs) - public-private partnerships that conduct research and development for the United States Government. Through FFRDCs, MITRE provides thought-leadership in a number of evolving technical areas, including many related to data.
