02.09.2021|News

From metadata analysis to data mining

GRAU DATA Metadata-Hub extracts metadata tags from over 320 file formats in large data pools

Schwaebisch Gmuend, September 2, 2021 – With its Metadata-Hub, GRAU DATA presents a new solution for reading and capturing metadata. It lets companies search and analyze their unstructured data precisely, integrate it into big data projects and make lasting use of the potential in large volumes of unstructured data.

“More than 80 percent of all data in companies is unstructured, and most companies have not yet been able to make lasting use of that data, its content and, above all, its value. Without a detailed metadata analysis, the data becomes worthless after a short time because its content can no longer be traced. With the Metadata-Hub, the potential of large numbers of files can be exploited quickly and easily,” explains Herbert Grau, CEO of GRAU DATA GmbH.

The Metadata-Hub recognizes, analyzes and processes “embedded” metadata from unstructured data on file systems of any size. It can process over 320 file formats and read out more than 50,000 different metadata tags in a very short time. “Embedded” metadata contains much more extensive information than standard file system metadata. According to GRAU DATA, the Metadata-Hub is far more powerful than solutions that are mostly limited to certain file formats and do not allow cross-company and cross-departmental analysis of all file formats.

Universally applicable and scalable as required

The Metadata-Hub is platform-independent and can be integrated easily and quickly into almost any IT infrastructure. It is controlled via a browser-based web interface. The Metadata-Hub scales by installing multiple hubs in parallel and administering them via the central WebUI. This means that it can be used by companies of any size and with any number of files, from classic medium-sized companies to corporations or large research organizations with billions of files.

The core component of the Metadata-Hub is the intelligent file system Crawler & Harvester (metadata collector), which continuously extracts the embedded metadata from the files. The Crawler & Harvester accesses all “embedded” metadata via NFS or SMB and extracts millions of tags in a very short time. The tags are saved in a specially designed database immediately after extraction. The metadata is then available in structured form, for example for evaluations or queries. A GraphQL-based API, a native Python SDK and a comprehensive command line interface also offer seamless integration into solutions from other providers for automated big data processing.
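
For illustration only, the crawl, extract and store flow described above can be sketched in a few lines of Python. This is not the Metadata-Hub API, SDK or command line interface, none of which are documented in this release. The function names `extract_tags`, `harvest` and `find_paths`, the `tags` table layout and the use of SQLite are all assumptions made for the example. The stand-in extractor reads only basic file system attributes, whereas the product is described as reading embedded tags from the files themselves.

```python
import mimetypes
import os
import sqlite3
import tempfile


def extract_tags(path):
    """Hypothetical stand-in extractor: returns (tag, value) pairs for one file.

    A real harvester would parse embedded metadata here; this sketch
    only reads file system attributes and guesses the MIME type.
    """
    info = os.stat(path)
    mime, _encoding = mimetypes.guess_type(path)
    return [
        ("size_bytes", str(info.st_size)),
        ("modified_epoch", str(int(info.st_mtime))),
        ("mime_type", mime or "unknown"),
    ]


def harvest(root, conn):
    """Walk the tree under root and store one row per (file, tag) pair."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tags (path TEXT, tag TEXT, value TEXT)"
    )
    file_count = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            rows = [(path, tag, value) for tag, value in extract_tags(path)]
            conn.executemany("INSERT INTO tags VALUES (?, ?, ?)", rows)
            file_count += 1
    conn.commit()
    return file_count


def find_paths(conn, tag, value):
    """Return the sorted paths of all files whose tag has the given value."""
    cursor = conn.execute(
        "SELECT path FROM tags WHERE tag = ? AND value = ? ORDER BY path",
        (tag, value),
    )
    return [row[0] for row in cursor]


if __name__ == "__main__":
    # Demo on a throwaway directory; an NFS or SMB mount point would be
    # passed to harvest() in the same way as any other local path.
    with tempfile.TemporaryDirectory() as root:
        with open(os.path.join(root, "report.txt"), "w") as handle:
            handle.write("hello")
        connection = sqlite3.connect(":memory:")
        print(harvest(root, connection), "file(s) harvested")
        print(find_paths(connection, "size_bytes", "5"))
        connection.close()
```

Storing one row per file and tag keeps the schema independent of the file format, so new tag names need no schema change. A production system would likely add indexes and incremental re-crawling, which this sketch omits.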