跳到主要内容

The Basics of 元数据管理: Examples, Tools, and Best Practices

为什么它对数据工程师、科学家和分析师至关重要.

Every time you run an online search, you get a list of page titles and short descriptions. 

The titles and descriptions are an example of metadata because they’re data 关于 每个网站页面的内容. 

搜索引擎使用元数据来帮助您发现所需的数据. Ecommerce websites use metadata to enable you to filter or sort products by price, color, 品牌, 和更多的. 

同样的,元数据对你的网络体验也是至关重要的, 这对数据工程师也很重要, 科学家们, 分析师则规划数据架构, 工程师 数据管道,并使用数据分析.

什么是元数据管理?

To understand what metadata management is, we first have to be clear on the definition of metadata. 

经典的解释是元数据是关于数据的数据. 数据汇总了其他数据的基本信息. File names, last saved dates, and number of values in a row are all examples of metadata. 

为了理解这个概念, look at the two-column table below that has a list of customer orders with two columns named “Customer Name” and “Order Number”. 

元数据管理的例子

Compared to the metadata data 工程师s deal with, these examples of metadata are painfully simple. Metadata in the context of data 工程师ing exists on much larger scales and in far more complex forms. 

In data 工程师ing, metadata describes how data flows through an organization. It also describes a data assets’ origin and how that asset has changed over time. 

Given this definition, let’s return to our question: What is metadata management? 

简单地说,元数据管理就是元数据的管理. 元数据管理是确保创建元数据的工作, 存储, and maintained in a standardized way that’s aligned with the goals and processes of the business. 元数据管理的工作正在进行中, 它涉及所有数据用户和商业领袖的输入.

元数据管理的例子

To expand on the definition of metadata management, let’s take a look at a couple of examples. 

A common metadata management challenge is when one company acquires another. In this scenario, the acquiring company’s metadata admin team must determine:

  • 被收购公司的元数据存在于何处.
  • 被收购公司的元数据将如何被访问和存储.
  • To what extent the acquired company’s metadata must be standardized to establish uniformity with the acquiring company’s metadata. 

Another common metadata management task is to create, update, and maintain a data catalog. 数据目录 整理数据资产清单 在企业. 使用数据目录, 元数据管理员可以访问, 组织, 收集, 更新元数据以支持数据发现和数据治理. 

Before we get to the challenges of machine learning and how to improve data, 让电子游戏网址大全更深入地看看机器学习是如何工作的.

类型的元数据

一般来说, 有四种类型 元数据:技术元数据、业务元数据、操作元数据和使用元数据. 

  1. 技术元数据 描述存储数据的规则、结构和格式. Examples of technical metadata include data models, data lineage, and backup rules.
  2. 业务元数据 描述数据的业务定义、规则和上下文. Examples of business metadata include wikis, data quality rules, report annotations, and glossaries. 
  3. 操作元数据 包括信息 how and when data was created or transformed. Examples include data such as time stamps, location, job execution logs, and data owners. 
  4. 使用元数据 包括信息 数据是如何使用的. Examples of usage metadata include user ratings, access-pattern metadata, and comments.

数据操作的权威指南

为什么元数据管理很重要

Organizations lose tens of thousands of hours of productivity to time spent 搜索和访问数据. That in itself is a strong business case for metadata management since good metadata is so key to quickly accessing data. 

但是由于许多其他原因,元数据管理也很重要. 它帮助数据专业人士:

  • 建立共同的商业语言
  • 评估数据沿袭
  • 确定数据质量
  • 捕捉机构知识
  • 重用和重用数据资产
  • 准备、排序和筛选数据

元数据管理还可以帮助您回答以下问题:

  • 对我的分析工作来说,最好的客户成功数据集是什么?
  • 执行客户订单涉及哪些系统?
  • 个人或其他敏感资料在何处储存和处理?

Metadata management enables data 工程师s to be more efficient because it clarifies relationships between data and maps out data flows. Metadata also provides mechanisms data 工程师s can use for critical data processing techniques, 就像 变化数据捕获.

In short, metadata management helps companies improve data quality, comply with regulations, and 推进数据操作

元数据管理框架

A metadata management framework is the basic structure of an organization’s approach to creating and maintaining metadata. Essentially your metadata management framework is the guiding principles, people, processes, and 用于管理元数据的工具. 

对组织有用, the MMF must be developed from both a strategic and tactical perspective. 从战略的角度来看, executives must lay out clear business visions and data strategies with which the creators of the framework must align their MMF. 在战术层面也是如此, MMF必须指导人们和算法如何以及何时捕获, 集成, 和发布元数据. 

批判性的, metadata management tools must work within the framework to facilitate (or automate) the work of metadata management.

Data governance is also tightly coupled with metadata management frameworks. mmf通知 数据治理工作 关于数据质量、遵从性和可访问性的信息. 与此同时, data governance initiatives determine how and when metadata is delivered and whether metadata is available. 

元数据管理最佳实践

不幸的是, we can’t give you a step-by-step list detailing exactly how to tackle metadata management. 但是您可以使用最佳实践来加速您的进程. 

以下是三个普遍适用的最佳实践.

  1. 分配元数据管理的职责许多企业都有专门负责管理元数据的人员. 而您的元数据项目可能不需要全职员工, metadata management must at least be part of someone’s responsibilities.  在你开始行动的时候, you’ll need a team that works with executive stakeholders to formulate a metadata strategy, 开发管理流程, 选择有效的元数据管理工具, 和实施过程. 
  2. 元数据标准化When using metadata to document your data, standards for vocabularies, schemas, 等. 确保系统之间的互操作性很重要吗. 元数据标准还可以帮助您查找和访问所需的数据.  There are dozens of existing metadata standards for various industries – library science, 生物学, 艺术, 还有更多的和一般的用法. 一种常见且易于使用的通用元数据标准是 都柏林核心.
  3. 部署数据目录(使用正确的成分)- We touched on data catalogs earlier, but they’re worth another mention given their fundamental role. 数据目录 essential to effective metadata management, but they must be done right. 选择并部署具有以下特征的数据目录:
    • 灵活的搜索
    • 从不同来源获取元数据的能力.e.、内部系统、对象存储等.
    • 元数据收集和发现自动化
    • 业务术语表编辑和集成功能

DataOps平台,方便数据集成

使用streamset自动化元数据管理

电子游戏厅 shares metadata with other security and governance tools through an open metadata sharing model. Metadata produced by running pipelines can be used across the entire solution stack (Collibra, Cloudera经理, 哨兵, 亚马逊的秘密, 等.)它使您能够设置警报 模式变化, automate column and table creation, and detect and react to both semantic and structural changes. 

In short, 电子游戏厅 makes the lives of both your metadata managers and your data 工程师s easier. 

+, 作为一个单独的, 全面管理, 智能数据管道端到端平台, 电子游戏厅 facilitates clear visibility into data flows from source to destination. 

尝试电子游戏厅免费

回到顶部

电子游戏网址大全使用cookie来改善您对电子游戏网址大全网站的体验. 单击“允许所有人同意”并继续访问电子游戏网址大全的网站. 隐私政策