Apache Atlas

Agile enterprise compliance through metadata

Atlas is designed to exchange metadata with other tools and processes within and outside of the Hadoop stack, enabling platform-agnostic governance controls that address compliance requirements.

 

What Atlas Does

Apache Atlas provides scalable, metadata-driven governance for Enterprise Hadoop. At its core, Atlas is built around a flexible type system that makes it easy to model new business processes and data assets. This type system allows metadata to be exchanged with other tools and processes within and outside of the Hadoop stack, enabling platform-agnostic governance controls that address compliance requirements.

Apache Atlas is developed around two guiding principles:

  • Metadata Truth in Hadoop: Atlas provides true visibility in Hadoop. By using native connectors to Hadoop components, Atlas provides technical and operational tracking enriched by business taxonomical metadata. Atlas facilitates easy exchange of metadata by letting metadata consumers share a common metadata store, which provides interoperability across many metadata producers.
  • Developed in the Open: Engineers from Aetna, Merck, SAS, Schlumberger, and Target are working together to help ensure Atlas is purposely built to solve real data governance problems across a wide range of industries that use Hadoop. This approach is an example of open source community innovation that helps accelerate product maturity and time-to-value for the data-first enterprise.

Apache Atlas empowers enterprises to effectively and efficiently address their compliance requirements through a scalable set of core governance services. These services include:

  • Data Lineage: Captures lineage across Hadoop components at the platform level
  • Agile Data Modeling: Type system allows custom metadata structures in a hierarchical taxonomy
  • REST API: Modern, flexible access to Atlas services, HDP components, UI & external tools
  • Metadata Exchange: Leverage existing metadata and models by importing them from current tools. Export metadata to downstream systems

 

 

 

How Atlas Works

Apache Atlas is designed to effectively exchange metadata within Hadoop and the broader data ecosystem. Atlas’s adaptive model reduces enterprise time to compliance by leveraging existing metadata and industry-specific taxonomy. With Atlas, data administrators and stewards also have the ability to define, annotate and automate the capture of relationships between data sets and underlying elements including source, target and derivation processes.

Atlas also ensures downstream metadata consistency across the ecosystem by enabling enterprises to easily export metadata to third-party systems.
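
The relationship between source data sets, target data sets and the processes that derive one from the other can be pictured as a directed graph. The following is a minimal, self-contained sketch of that idea, not Atlas's actual implementation: the LineageGraph class, its methods and the data set names are all hypothetical and exist only to illustrate how upstream lineage can be answered once processes record their inputs and outputs.

```python
from collections import defaultdict


class LineageGraph:
    """Toy lineage store: each process links input data sets to output data sets."""

    def __init__(self):
        # output data set -> list of (process name, tuple of input data sets)
        self._produced_by = defaultdict(list)

    def record_process(self, process, inputs, outputs):
        """Register one derivation step with its sources and targets."""
        for out in outputs:
            self._produced_by[out].append((process, tuple(inputs)))

    def upstream(self, dataset):
        """Return every data set that `dataset` derives from, directly or indirectly."""
        seen = set()
        stack = [dataset]
        while stack:
            current = stack.pop()
            for _process, inputs in self._produced_by.get(current, []):
                for src in inputs:
                    if src not in seen:
                        seen.add(src)
                        stack.append(src)
        return seen


graph = LineageGraph()
graph.record_process("load_raw", inputs=["hdfs:/landing/claims"], outputs=["raw.claims"])
graph.record_process("clean", inputs=["raw.claims"], outputs=["curated.claims"])
graph.record_process(
    "report",
    inputs=["curated.claims", "ref.members"],
    outputs=["mart.claims_by_member"],
)

print(sorted(graph.upstream("mart.claims_by_member")))
```

Walking the graph from a report table back to its landing files is the kind of question lineage capture is meant to answer for compliance reviews.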

 

[Figure: Atlas architecture]

Technical Preview

Business Taxonomy (Catalog)

Big data democratizes access to information and makes it easier to share information across the enterprise. However, unexpected growth can lead to "data swamps" full of content that is not properly tagged or cataloged. A business taxonomy can be the missing link here. The word derives from the Greek "taxis", meaning "order" or "arrangement". Taxonomies use a hierarchy of terms to classify and arrange concepts or physical/logical objects, which makes them an ideal means of capturing the structure of all of an enterprise's content.

Consistent classification and tagging across the enterprise using taxonomies supports system/platform interoperability and value generation from structured and unstructured data sources by mapping them to a common shared vocabulary. This authoritative reference taxonomy improves both data confidence and time to insight.

Requirements for a Big Data Business Catalog

  • Purpose-Built Platform Solution: In order to make sense of big data and provide users with the ability to find the right information, enterprises need a data governance solution that is designed for Hadoop and operates at the platform level, so that it consistently classifies data across all the engines used by the organization to move and analyze data.
  • A purpose-built platform solution can serve as a single source of metadata truth in Hadoop by automatically tracking the activities of multiple users and applications in Hadoop components through native connectors. Other data governance solutions, by contrast, operate only at the application level and require a proprietary solution path that ultimately just produces more data silos.
  • Faster Data Discovery: The business catalog enables data officers and stewards to search for data and metadata quickly and in a number of different ways to reduce time to value. This includes the ability to search by:
    • Asset Type: Search for a Hive table, Storm Topology or any connected component.
    • Tags: Search for all columns or tables that have a specific tag such as PII
    • Business Language: Aligned with compliance standards & policies

By combining these search capabilities, data stewards can build a model of their enterprise and its business processes. This includes the ability to model an enterprise by linking logical and physical data entities to deepen understanding.
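
As a rough illustration of combining the search dimensions listed above (asset type, tags and business terms), the sketch below filters a small in-memory catalog. It is a toy model under stated assumptions and not the Atlas search implementation: the Asset class, the search function and all asset names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    """Hypothetical catalog entry with a type, tags and business terms."""
    name: str
    asset_type: str
    tags: set = field(default_factory=set)
    terms: set = field(default_factory=set)


def search(assets, asset_type=None, tag=None, term=None):
    """Return names of assets matching every criterion that was supplied."""
    results = []
    for asset in assets:
        if asset_type is not None and asset.asset_type != asset_type:
            continue
        if tag is not None and tag not in asset.tags:
            continue
        if term is not None and term not in asset.terms:
            continue
        results.append(asset.name)
    return results


catalog = [
    Asset("sales.customers", "hive_table", {"PII"}, {"Customer"}),
    Asset("sales.customers.email", "hive_column", {"PII"}, {"Customer"}),
    Asset("sales.orders", "hive_table", set(), {"Order"}),
    Asset("clickstream", "storm_topology", set(), {"Web Analytics"}),
]

# Combine criteria: only Hive tables that carry the PII tag.
pii_tables = search(catalog, asset_type="hive_table", tag="PII")
# Search by business language alone.
customer_assets = search(catalog, term="Customer")
print(pii_tables, customer_assets)
```

Each criterion narrows the result set independently, which is what lets a steward start from a business term and drill down to specific tables or columns.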

What's New in HDP 2.6

Cloud

  • Shared enterprise services for governance

Component Coverage

  • Tag-based policy support for HDFS, Kafka and HBase
  • Knox SSO for Atlas UI

Ease of Use

  • API revamp
  • Simplified UI for basic search
  • Manual entity creation – support for HDFS, HBase, Kafka & custom entity types etc.
  • Performance and scalability improvements
  • SmartSense metrics

Recent Progress with Atlas

The Atlas/Ranger integration represents a paradigm shift for big data governance and security. By integrating Atlas with Ranger, enterprises can now implement dynamic classification-based security policies in addition to role-based security. Ranger’s centralized platform lets data administrators define security policy based on Atlas metadata tags or attributes and apply that policy in real time to the entire hierarchy of data assets, including databases, tables and columns.
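
The idea of a tag-based policy that applies across a hierarchy of databases, tables and columns can be sketched as follows. This is a simplified model and not Ranger's policy engine: it assumes that tags set on a database or table are inherited by everything beneath it, that a tagged resource is accessible only to groups named in that tag's policy, and that untagged resources fall through to role-based rules that are not modeled here. All function names, tags, groups and resource names are hypothetical.

```python
def parent_chain(resource):
    """Yield the resource and each ancestor: db.table.column, db.table, db."""
    parts = resource.split(".")
    for end in range(len(parts), 0, -1):
        yield ".".join(parts[:end])


def effective_tags(resource, tags_by_resource):
    """Collect tags set on the resource itself or on any of its ancestors."""
    tags = set()
    for ancestor in parent_chain(resource):
        tags |= tags_by_resource.get(ancestor, set())
    return tags


def is_allowed(user_groups, resource, tags_by_resource, policies):
    """Allow access only if every effective tag permits one of the user's groups.

    A tag without any policy denies access; a resource without tags is allowed
    here because role-based checks are outside the scope of this sketch.
    """
    for tag in effective_tags(resource, tags_by_resource):
        allowed_groups = policies.get(tag, set())
        if not (allowed_groups & user_groups):
            return False
    return True


# Hypothetical metadata tags and tag-based policies.
tags_by_resource = {
    "finance": {"CONFIDENTIAL"},
    "finance.payroll.ssn": {"PII"},
}
policies = {
    "CONFIDENTIAL": {"finance_analysts", "auditors"},
    "PII": {"auditors"},
}

analyst = {"finance_analysts"}
print(is_allowed(analyst, "finance.payroll.salary", tags_by_resource, policies))
print(is_allowed(analyst, "finance.payroll.ssn", tags_by_resource, policies))
```

Because the policy is keyed on the tag rather than on individual resources, tagging a new column is enough to bring it under the existing rule.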

The latest release of Apache Atlas focuses on delivering scalable metadata services that can model any business process, enhanced with industry-specific terminology, as well as the ability to import and export metadata from other systems and tools.

Apache Atlas Version Progress
Apache Atlas 0.7
  • Enterprise deployment
    • Performance enhancements
    • HA, DR and BC support
    • AD integration
  • Component lineage
    • Kafka/Storm
    • Sqoop
    • Falcon
  • Security
    • Support for Kerberos
    • Atlas/Ranger integration for dynamic tag-based security
  • User interface
    • Improved GUI
    • Business catalog (Technical Preview)
  • Governance-ready partner ecosystem
Apache Atlas 0.6
  • Built-in types for HDFS
  • Metadata tag management
  • Expanded support for Apache Hive
Apache Atlas 0.5
  • Scalable metadata service
    • Enterprise/Business unit level modeling with industry-specific vocabulary
    • Extend visibility into HDFS Path, Hive DB, table, columns
    • Flexible access to Atlas services
  • Hive integration leverages existing metadata
    • Leverage existing metadata with import / export capability
    • Capture SQL runtime metrics directly
  • UI driven Hive table lineage and domain-specific search
    • Support for keyword, faceted and free text searches

Governance Ready Certification

To address enterprise requirements for Hadoop application integration, Atlas strives to foster a vibrant ecosystem based on a centralized metadata store. The Governance Ready program aims to create a curated group of partners that contribute a rich set of data management features focusing on data preparation, integration, cleansing, tagging, ETL visualization and collaboration areas.

 

Certified partners will help define a set of standards to exchange metadata and contribute conforming data integration features to the metadata store. Customers can then subscribe to desired features with low switching costs and faster deployment time.
