
DBAzine.com

 

Questions Meta Data Can Answer

by Robert S. Seiner

The world of information technology has "grown-up" dramatically in the last fifteen years -- the term of my comparably short career. From the days of punching cards and feeding deck readers at midnight at the university computer lab to the world of dot-coms, electronic business, and business intelligence, one might believe that they have seen it all.

But they're not even close. One can only imagine what the next 15 years have in store for us. Post-Y2K and for the foreseeable future, the need and speed to manage data, information, and knowledge will (if it hasn't already) become the business driver.

"Managing data, information, and knowledge will be the business driver."
That is a phrase worth repeating several times, if you don't already know it. A company's ability to manage data, information, and knowledge will determine how successful that company can be, or whether it can be successful at all.

To manage data, information, and knowledge, companies need to know what data they have. Companies need to know precisely how their data is being used and how that data can be used to create competitive advantage. To know these things, a company needs to manage and use its meta data.

Meta data is information, documented in IT tools, that improves both business and technical understanding of data and data-related processes. This definition is significantly longer than the "data about data" definition that is overused by folks in our industry. When you break this definition into pieces, it tells us what meta data is, where it can be found, how it can be helpful, and who it will help.

Over the next 15 years, meta data will become increasingly important. Meta data will no longer be the "Wednesday's child of information processing systems," as stated by the father of data warehousing, Bill Inmon, in Data Management Review.

Every company has meta data. There is no question about that. Databases are built on meta data. Data models are built on meta data. Programs, screens, reports, queries, data movement, all of the components of information systems are built using meta data. This, on its own, should make it obvious that managing meta data is important. But it doesn't.

Meta Data Questions

Questions are still raised about meta data. What exactly is meta data? How much will it cost to manage meta data? How do I justify the "investment" in meta data? Who uses meta data? How does one get started managing meta data? These are all very important questions. And the answers are a key determinant of whether or not a company will proceed with a meta data management strategy and implementation plan.

These questions aren't always easy to answer, particularly if the person asking the questions doesn't participate in the daily building of the data and technical architectures to support the enterprise. That is the person most likely to foot the bill to pay for the effort. Experts have already written volumes that answer these questions, so they won't be addressed in this article.

Instead of focusing on "answers" to meta data questions, this article will focus on the "questions" that meta data can answer.

Question Categories

The "questions meta data can answer" fall into 10 categories. I selected these 10 categories because this is a logical breakdown of meta data that I have used before. If these categories do not suit your needs, organize your own according to the requirements of your organization. The 10 categories I selected are:

      • Database Meta Data
      • Data Model Meta Data
      • Data Movement Meta Data
      • Business Rule Meta Data
      • Data Stewardship Meta Data
      • Application Component Meta Data
      • Data Access/Reporting Meta Data
      • Rationalization Meta Data
      • Data Quality Meta Data
      • Computer Operations Meta Data

Reading The Questions

While you are reading the list of "questions meta data can answer", ask yourself three simple questions. In your current environment: Can my company answer these questions? What is it costing my company to answer these questions? What are the results when we are not able to answer these questions?

You will be surprised at how easy it is to justify meta data management once you look at your answers to the three questions listed above regarding the Questions Meta Data Can Answer.

Many of the questions fall under multiple categories. During data movement, for example, data flows from source to target. The action that is taken (the value assigned) for the target may come from a map list (or conversion table), depending on the source or several sources. The action that is taken when source data is missing or a source value does not have an assigned target value (sometimes known as a missing rule) can be considered data movement meta data or data quality meta data.
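The map list and missing rule described above can be sketched in a few lines of Python. This is only an illustration: the status codes, the UNKNOWN default, and the names STATUS_MAP, MISSING_RULE_DEFAULT, and map_status are assumptions made for the example, not part of any particular data movement tool.

```python
from dataclasses import dataclass

# Hypothetical map list (conversion table): source code -> target value.
STATUS_MAP = {"A": "ACTIVE", "I": "INACTIVE", "P": "PENDING"}

# Hypothetical missing rule: the target value assigned when the source
# value is absent or has no entry in the map list.
MISSING_RULE_DEFAULT = "UNKNOWN"


@dataclass
class MappingResult:
    target_value: str
    rule_applied: str  # "mapped", "missing_source", or "unmapped_source"


def map_status(source_value):
    """Translate one source value and record which rule produced the target."""
    if source_value is None or source_value.strip() == "":
        return MappingResult(MISSING_RULE_DEFAULT, "missing_source")
    key = source_value.strip().upper()
    if key in STATUS_MAP:
        return MappingResult(STATUS_MAP[key], "mapped")
    return MappingResult(MISSING_RULE_DEFAULT, "unmapped_source")


for raw in ["a", None, "Z"]:
    print(repr(raw), "->", map_status(raw))
```

Recording rule_applied next to the target value is what lets the same record serve both purposes: it documents the movement, and counting the "missing_source" and "unmapped_source" outcomes gives a data quality measure.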

The questions should not be considered all encompassing. Rather, consider the meta data questions as a "starter kit" that can assist your company to understand that:

      • The answers to these questions are important.
      • The answers to these questions are NOT always available.
      • The IT division will "perform" better if they have access to this information.
      • "Cost savings" and "competitive advantage" are associated with managing data through meta data.

Database Meta Data

Database meta data describes the physical data. It is typically stored in the database catalog or in copybook/segment definitions and is accessed by developers and database administrators using database or file-aid type tools.

      • Does the data exist in a database (or a flat/sequential file)?
      • What databases exist?
      • What is the physical name of the database where the data is stored?
      • Where is the data located?
      • What are the names of the tables in the database?
      • What columns are on the tables?
      • What is the primary key?
      • What other indexes exist?
      • How is this table related to other tables?
      • Is this table part of any views?
      • When was the database last updated?
      • Who last updated the data?
      • What flat and sequential files exist?
      • What is the physical name of the dataset where my data is stored?
      • Where is the data located (mainframe, region, dataset name, ...)?
      • How many generations of the data exist?
      • Do the datasets exist on tape or on storage?
      • What copybooks represent the data in the file?
      • What programs use the copybook?
      • What job streams execute the programs?
      • How is the data processed, combined, sorted?
      • and much more...
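Several of the questions above can be answered directly from a database catalog. The sketch below uses SQLite (through Python's standard sqlite3 module) purely as a convenient stand-in; other database products expose their catalogs differently. The tables, index, and view are hypothetical and exist only to give the catalog something to describe.

```python
import sqlite3

# Build a small in-memory database so the catalog has content.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    CREATE TABLE customer_order (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
        order_date  TEXT
    );
    CREATE INDEX ix_order_customer ON customer_order(customer_id);
    CREATE VIEW v_customer_orders AS
        SELECT c.name, o.order_date
        FROM customer c JOIN customer_order o ON o.customer_id = c.customer_id;
""")


def list_objects(conn, object_type):
    """What tables (or views, or indexes) exist? Read from sqlite_master."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = ? ORDER BY name",
        (object_type,),
    )
    return [row[0] for row in rows]


def describe_table(conn, table):
    """What columns, primary key, indexes, and relationships does a table have?"""
    # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
    columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
    # PRAGMA index_list rows: (seq, name, unique, origin, partial)
    indexes = conn.execute(f'PRAGMA index_list("{table}")').fetchall()
    # PRAGMA foreign_key_list rows: (id, seq, table, from, to, ...)
    foreign_keys = conn.execute(f'PRAGMA foreign_key_list("{table}")').fetchall()
    return {
        "columns": [col[1] for col in columns],
        "primary_key": [col[1] for col in columns if col[5] > 0],
        "indexes": [idx[1] for idx in indexes],
        "foreign_keys": [(fk[3], fk[2], fk[4]) for fk in foreign_keys],
    }


print(list_objects(conn, "table"))
print(list_objects(conn, "view"))
print(describe_table(conn, "customer_order"))
```

Questions such as "who last updated the data?" are not answered by this catalog; they depend on audit or logging facilities that vary by product.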

Data Model Meta Data

Data Model meta data describes the logical design of the data and the mapping from the logical design to physical data. Data model meta data can also include business rules, entity relationships, domain values, ... Data model meta data is typically found in data modeling and CASE tools although some may still track this information in diagram and spreadsheet tools.

      • What data models exist?
      • Where can the models be found?
      • Is there an enterprise data model?
      • Who created the models and for what purpose, project/database?
      • Who is responsible for keeping the models up to date?
      • What business entities have been defined and what models do they exist on?
      • Where are the business entities represented in databases-tables, systems-files?
      • What are the definitions of the business entities?
      • What attributes make up these entities?
      • What is the business definition of the attributes?
      • Do the attributes have restrictive domains?
      • What are the allowable values for the attributes?
      • What is the relationship between the logical data model and the physical data model?
      • Is the physical data model in synch with the logical data model?
      • Is the physical data model in synch with the physical database?
      • What maps exist between entities and tables, attributes and columns, ...?
      • and much more...
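The "in synch" questions above amount to comparing three pieces of meta data: the logical model, the attribute-to-column map, and the physical catalog. A minimal sketch follows, assuming all three have already been exported into plain Python structures; the entity, table, and column names are invented for illustration.

```python
# Hypothetical logical model: entity -> list of attributes.
LOGICAL_MODEL = {
    "Customer": ["Customer Identifier", "Customer Name", "Birth Date"],
}

# Hypothetical map: (entity, attribute) -> (table, column).
ATTRIBUTE_MAP = {
    ("Customer", "Customer Identifier"): ("CUST", "CUST_ID"),
    ("Customer", "Customer Name"): ("CUST", "CUST_NM"),
}

# Hypothetical physical catalog extract: table -> list of columns.
PHYSICAL_CATALOG = {
    "CUST": ["CUST_ID", "CUST_NM", "LAST_UPD_TS"],
}


def find_model_drift(logical, attribute_map, physical):
    """Report where the logical model, the map, and the database disagree."""
    mapped_columns = set(attribute_map.values())
    return {
        # Attributes in the logical model with no physical mapping.
        "unmapped_attributes": [
            (entity, attribute)
            for entity, attributes in logical.items()
            for attribute in attributes
            if (entity, attribute) not in attribute_map
        ],
        # Mapped columns that do not exist in the physical database.
        "missing_columns": sorted(
            (table, column)
            for table, column in mapped_columns
            if column not in physical.get(table, [])
        ),
        # Physical columns that no logical attribute maps to.
        "unmodeled_columns": sorted(
            (table, column)
            for table, columns in physical.items()
            for column in columns
            if (table, column) not in mapped_columns
        ),
    }


print(find_model_drift(LOGICAL_MODEL, ATTRIBUTE_MAP, PHYSICAL_CATALOG))
```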

Data Movement Meta Data

Data movement meta data describes the movement of data from source to target. Data movement meta data includes information about the selection and extraction of data, mapping, transformation, and loading of data. Data movement meta data can be found in ETL or data movement tools, spreadsheets, desktop databases, or in the logic of the code written to perform the data movement.

      • Where did my data originate? What system or database did it come from?
      • What field was used to populate this data or was the field derived?
      • How was the data derived? Using calculation, conditionals, both, ...?
      • In the derivation, what other data was used?
      • Is the value of this data dependent on the values of other data? What data and how?
      • Is the target data allowed to be null?
      • What was done when data was missing?
      • What action was taken when source data did not fall within quality guidelines?
      • What action was taken when the source value was not assigned a mapped target value?
      • What values can the target data take on?
      • How do these values map to the previous values?
      • When is the data moved?
      • Has the data always "moved" this way or is there a history of changes over time?
      • When did those changes take place?
      • and much more...
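The origin and derivation questions above become simple lookups once each movement is recorded as a source-to-target entry. The following is a sketch under that assumption; the Movement record, the field names, and the dates are hypothetical and do not reflect the format of any specific ETL tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Movement:
    target: str           # fully qualified target field
    sources: tuple        # fields used to populate or derive the target
    derivation: str       # how the target value is produced
    effective_from: str   # ISO date on which this rule took effect


# Hypothetical data movement meta data.
MOVEMENTS = [
    Movement("dw.sales.net_amount",
             ("stg.sales.gross_amount", "stg.sales.discount"),
             "gross_amount - discount", "2004-01-01"),
    Movement("stg.sales.gross_amount",
             ("orders.order_line.amount",),
             "direct copy", "2003-06-01"),
    Movement("stg.sales.discount",
             ("orders.order_line.discount",),
             "NULL replaced with 0", "2003-06-01"),
]


def trace_origins(field, movements):
    """Walk the movements upstream until fields with no recorded source remain."""
    by_target = {m.target: m for m in movements}
    origins, seen, stack = set(), set(), [field]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        movement = by_target.get(current)
        if movement is None:
            origins.add(current)  # nothing feeds this field: treat as an origin
        else:
            stack.extend(movement.sources)
    return sorted(origins)


print(trace_origins("dw.sales.net_amount", MOVEMENTS))
```

A field with no recorded movement is reported as its own origin. Keeping more than one Movement per target, each with its own effective_from date, would be one way to answer the history-of-changes questions; this sketch keeps only one per target.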

Business Rule Meta Data

Business Rule meta data describes how the business operates through the use of its data. Business Rule meta data describes entity relationships, cardinality, domain rules, ... that define the use of data. Business Rule meta data typically exists in data modeling or CASE tools, or in other forms of documentation maintained outside of such tools (word processing documents, spreadsheets, etc.).

      • What is the relationship between two entities of data in the logical data model?
      • What is the cardinality between those same entities?
      • What are the conditions under which a piece of data can take on certain values?
      • What values can a piece of data take on? What are the values' meanings?
      • How is data created, updated, deleted, ...?
      • When are rules established? By whom?
      • and much more...

Data Stewardship Meta Data

Data Stewardship meta data describes who in the organization is accountable for actions taken using data. Data Stewardship meta data defines who in the organization defines the data, who in the organization creates, maintains, and eliminates data, and who consumes the data or directly uses the data or information in their jobs. Data Stewardship meta data is not maintained by a lot of companies (yet!) but those that do manage this type of meta data use desktop databases and spreadsheets.

      • Who do you call if you have a question about the data?
      • Who is responsible for defining, creating, reading, updating, and deleting the data?
      • What accountabilities go along with the actions that individuals can take with the data?
      • Who are the data "consumers" who use the data as part of their job?
      • What information can be shared within the company? Outside the company?
      • Who has to approve reports that are being distributed outside the company?
      • Who is responsible for assigning acceptable values for the data?
      • How does the stewardship program relate to the company information policy?
      • What information exists in the information policy?
      • Where can I find the information policy?
      • and much more...

Application Component Meta Data

Application Component meta data describes all objects of an application, from data files or tables, to programs, to scripts and jobs, to screens, ... Application Component meta data is a giant cross reference of all of the components that make up a system and how the components are shared and reused. This information is often stored in mainframe cross-reference tools and desktop tools with repositories.

      • What application components are considered standard reusable objects?
      • How was this "reusable object" determination made?
      • How were these objects tested and who maintains these objects?
      • What programs (& data & screens, ... ) are part of a system (or process or function) ?
      • What jobs (or procs or scripts) execute the programs?
      • What data is used by the programs and jobs? How is the data used?
      • How is the data passed from program to program, job to job, system to system?
      • What system is the data dependent on? What system is dependent on the data?
      • What programs and jobs are reused? Where are they reused?
      • What changes have been made to the programs and jobs over time?
      • Who wrote the programs and jobs?
      • Who is responsible for supporting and maintaining the programs and jobs?
      • What programs update the data?
      • What reports display the data? What screens report the data?
      • What programs use
      • and much more...
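The "giant cross reference" can be pictured as a set of usage records indexed in more than one direction. The sketch below assumes the records have already been harvested into (job, program, data, access) tuples; every job, program, and data name in it is made up for the example.

```python
from collections import defaultdict

# Hypothetical usage records: (job, program, data, access).
USAGE = [
    ("JOB_NIGHTLY", "PGM_LOAD", "CUSTOMER", "update"),
    ("JOB_NIGHTLY", "PGM_RPT", "CUSTOMER", "read"),
    ("JOB_MONTHLY", "PGM_RPT", "ORDERS", "read"),
]


def build_cross_reference(usage):
    """Index the usage records by data-and-access and by program."""
    programs_by_data = defaultdict(set)
    jobs_by_program = defaultdict(set)
    for job, program, data, access in usage:
        programs_by_data[(data, access)].add(program)
        jobs_by_program[program].add(job)
    return programs_by_data, jobs_by_program


programs_by_data, jobs_by_program = build_cross_reference(USAGE)

# What programs update the data?
print(sorted(programs_by_data[("CUSTOMER", "update")]))
# What jobs execute the program? A program run by several jobs is reused.
print(sorted(jobs_by_program["PGM_RPT"]))
```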

Data Access/Reporting Meta Data

Data Access and Reporting meta data describes how to access the data and which reports have already been created that can be read or recreated. It can also describe the steps that must be taken to get authorization to read the data, the description of how the data can be interpreted, available tools, descriptions of reports, etc.

Data Access and Reporting meta data typically is found within reporting tools and in traditional types of documentation (e.g., desktop databases, word processing documents, and spreadsheets).

      • What reports have been written that use the data?
      • What is the description of a report?
      • How do I access the reports?
      • What steps should be taken to get authorization to use the data?
      • How do the reports select, organize/sort, group, total and display the data?
      • What data was used by my report?
      • What reports use my data?
      • When were the reports last updated?
      • Do I have to execute the report myself or are the results already available?
      • Where will I find the results?
      • and much more...

Rationalization Meta Data

Rationalization meta data describes standard "corporate accepted" pieces of information and how those pieces of information are represented or mapped to data captured in the systems. The standard pieces of information can be a select list of data elements that have accepted meanings, histories, and values and/or the standard pieces of information can come from an enterprise data model.

The Rationalization meta data can describe the degree to which the data elements are the same piece of information and the differences. Rationalization meta data is often stored in repositories and traditional types of documentation.

      • What are the standard (core) elements that exist in the company?
      • What are the business names and definitions of these elements?
      • How were the standard elements chosen? By whom?
      • Are the standard elements verified for reuse?
      • Where do the standard elements map to existing data?
      • How should the standard elements be used?
      • and much more...

Data Quality Meta Data

Data Quality meta data describes the quality of the data. It describes the accuracy confidence level, the change management, the history of the data values and definitions and how changes over time affect how data can be understood. Data Quality describes what actions are taken when data is "bad", missing or duplicated. Data quality meta data is tracked using data quality tools, repositories, and traditional documentation types.

      • How have the accepted values of the data changed over time?
      • When did the accepted values change?
      • How has the definition of the data changed over time?
      • When did the definition of the data change?
      • What constitutes "bad" data?
      • What quality checks were performed against my data?
      • What are the quality check procedures? Who wrote and executed them?
      • Who analyzed the results?
      • With what level of confidence can I trust my data?
      • What is the accepted level of confidence before the data is considered "low quality" data?
      • and much more...
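As a rough illustration of the quality-check and confidence questions above, the sketch below profiles one column against its accepted values and compares the result with a threshold. The accepted values, the sample data, and the 0.95 threshold are assumptions chosen for the example; a real threshold would come from the company's own data quality rules.

```python
# Hypothetical accepted values and accepted confidence threshold.
ACCEPTED_VALUES = {"A", "I", "P"}
MINIMUM_CONFIDENCE = 0.95


def profile_column(values, accepted_values):
    """Count missing, invalid, and valid values and derive a confidence ratio."""
    total = len(values)
    missing = sum(1 for v in values if v is None or v == "")
    invalid = sum(
        1 for v in values if v not in (None, "") and v not in accepted_values
    )
    valid = total - missing - invalid
    confidence = valid / total if total else 0.0
    return {
        "total": total,
        "missing": missing,
        "invalid": invalid,
        "valid": valid,
        "confidence": confidence,
    }


def is_low_quality(profile, minimum_confidence):
    """The data is considered "low quality" below the accepted confidence level."""
    return profile["confidence"] < minimum_confidence


sample = ["A", "I", "X", None, "A", "", "P", "A"]
result = profile_column(sample, ACCEPTED_VALUES)
print(result, is_low_quality(result, MINIMUM_CONFIDENCE))
```

Storing each profile with the date it was run, and with the accepted values in force at that time, is what would let the meta data also answer the "changed over time" questions.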

Computer Operations Meta Data

Computer Operations meta data describes the activities of the data and scheduling center. It describes data storage, tape usage, job operations, server operations, scheduling dependencies, abend procedures, backup and restore procedures. Computer Operations meta data can be found through scheduling systems, storage systems, operating and server systems, etc.

      • What operations / jobs are scheduled to run against my data?
      • What types of data backup and recovery are available?
      • When was the last time my data was backed up, restored, verified?
      • What is the process for backing up and restoring data?
      • Who is responsible for backup and recovery?
      • Who has security privileges to use my data?
      • When is the best time to run a program/report against specific data?
      • What operations are dependent on data from another process?
      • What are the actions taken when a job or system fails or abends?
      • Who should be called when a job or system fails?
      • What version of the software are we running?
      • If licensed, how many licenses do we have, who is using them?
      • When are the licenses scheduled to expire?
      • When is the next release of the software due to be installed?
      • What changes/enhancements are being made to the software with the new release?
      • How much disk storage is available?
      • How much disk storage is being used? At what rate is the data growing?
      • Who allocates storage and should be contacted for questions about disk storage?
      • How are the tape storage headers defined?
      • and much more...
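Scheduling dependencies are one kind of operations meta data that lends itself to a short example. The sketch below assumes the dependencies have been exported as a mapping from each job to the jobs it waits for; the job names are hypothetical. It uses the standard library's graphlib module, which requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter

# Hypothetical scheduling meta data: job -> jobs it depends on.
DEPENDS_ON = {
    "LOAD_WAREHOUSE": {"EXTRACT_ORDERS", "EXTRACT_CUSTOMERS"},
    "BUILD_REPORTS": {"LOAD_WAREHOUSE"},
    "BACKUP_WAREHOUSE": {"LOAD_WAREHOUSE"},
}


def run_order(depends_on):
    """Return one valid order in which every job runs after its prerequisites."""
    return list(TopologicalSorter(depends_on).static_order())


def downstream_of(job, depends_on):
    """What operations depend, directly or indirectly, on this job?"""
    affected, frontier = set(), [job]
    while frontier:
        current = frontier.pop()
        for candidate, prerequisites in depends_on.items():
            if current in prerequisites and candidate not in affected:
                affected.add(candidate)
                frontier.append(candidate)
    return affected


print(run_order(DEPENDS_ON))
print(sorted(downstream_of("EXTRACT_ORDERS", DEPENDS_ON)))
```

The downstream set is also a starting point for the failure questions: when a job abends, it lists the operations that should be held.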

Please feel free to send in any additional Questions Meta Data Can Answer, and I will be glad to add them to the article. Almost every question asked during the system development life cycle or by end-users of data can be answered by meta data, if the meta data is captured and made available.

Keep in mind that meta data is very important to running a successful IT "shop". It becomes just as essential to the business community as that community grows closer to, and more dependent on, the services of the IT "shop".

These questions offer a new way to look at a constantly revisited topic (meta data) without getting bogged down by political mumbo-jumbo.

---

Robert (Bob) S. Seiner is recognized as the publisher of The Data Administration Newsletter (TDAN.com), an award-winning electronic publication that focuses on sharing information about data, information, content and knowledge management disciplines. Mr. Seiner speaks often at major conferences and user group meetings across the U.S. He can be reached at the newsletter at rseiner@tdan.com or 412-220-9643 (fax 9644).

Mr. Seiner is the owner and principal of KIK Consulting Services, a company that focuses on Consultative Mentoring or, simply stated, teaching a company's employees how to better manage and leverage their data, information, content, and knowledge assets. Mr. Seiner's firm focuses on data governance/stewardship, meta-data management, business intelligence and knowledge management. KIK has developed a 4-Step Method© for Consultative Mentoring that involves customizing industry best practices to work in your environment.

For more information about Mr. Seiner, KIK Consulting Services and The Data Administration Newsletter (TDAN.com), please visit www.tdan.com and www.tdan.com/kik.htm.


Contributors : Robert S. Seiner
Last modified 2005-04-12 06:21 AM
 
 
