Data integration refers to the processes for taking data from different sources and making it usable. There are four broad subcategories of data integration, each with its own specific tools:

  1. Data warehousing
  2. Data migration
  3. Enterprise Application Integration
  4. Master Data Management

For a more comprehensive look at the topics above, be sure to check out our data integration blog.

In this article, you can find out more about these different categories and browse over 40 vendors across them.

Data Warehousing Tools 

Data warehousing consists of aggregating structured data from one or more sources for use in Business Intelligence (BI) endeavors. It also provides a view on the overall health and performance of a business because of the wide range of data available for use in analysis. This further enables a historical context through a long-term view of data over time.

When choosing a data warehouse, there are a few functionalities that should be included:

  • Support of both relational and multidimensional databases, including built-in readiness for star and snowflake schema database designs and query optimization
  • Online analytical processing (OLAP) functionality so that developers and end users can code less complex queries
  • Data movement capabilities such as simple load and unload or replication
  • Optimization of queries from an operational or transactional database management system (DBMS)
  • In-memory functionality to improve performance
  • Support for zone maps so that queries can be optimized via pruning data blocks
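To make the zone map item more concrete, here is a minimal sketch of block pruning, assuming a single integer column split into fixed-size blocks. The names `Block`, `build_blocks`, and `range_scan` are hypothetical and do not come from any particular warehouse product; real platforms implement this inside their storage engines.

```python
from dataclasses import dataclass


@dataclass
class Block:
    rows: list        # values of one column stored in this block
    min_value: int    # zone-map entry: smallest value in the block
    max_value: int    # zone-map entry: largest value in the block


def build_blocks(values, block_size):
    """Split a column into fixed-size blocks and record each block's min/max."""
    blocks = []
    for start in range(0, len(values), block_size):
        chunk = values[start:start + block_size]
        blocks.append(Block(chunk, min(chunk), max(chunk)))
    return blocks


def range_scan(blocks, low, high):
    """Return the matching values and the number of blocks actually read."""
    matches, blocks_read = [], 0
    for block in blocks:
        # Zone-map check: skip the block if its range cannot overlap [low, high].
        if block.max_value < low or block.min_value > high:
            continue
        blocks_read += 1
        matches.extend(v for v in block.rows if low <= v <= high)
    return matches, blocks_read


# Hypothetical column: order days 1..100, stored 10 values per block.
order_days = list(range(1, 101))
blocks = build_blocks(order_days, block_size=10)
matches, blocks_read = range_scan(blocks, low=42, high=47)
print(matches, blocks_read, len(blocks))
```

In this sketch the query touches only one of the ten blocks. Pruning works best when the data is roughly sorted on the filtered column, because each block then covers a narrow range of values.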

Data warehouse platforms today come in a range of formats, so choosing the right one can feel intimidating. Some of the most common options are relational database management systems (RDBMS), analytical DBMS, data warehouse as a service (DWaaS), and appliances. Criteria for evaluating these options include:

  • Cloud vs on-premises
  • Performance
  • Reliability
  • Usability, integration
  • Scalability
  • Security
  • Supported data types
  • Ecosystem
  • Backup and recovery

 A few major data warehouse platforms include:

Name | Founded | Status | Number of Employees
Google BigQuery | 2010 | Public | 10,001+
Amazon Redshift | 2012 | Public | 10,001+
Cloudera | 2008 | Public | 1,001-5,000
Panoply | 2005 | Private | 11-50
Ab Initio | 1995 | Private | 501-1,000
AnalytiX DS | 2006 | Private | 51-200
DATAllegro | 2003 | Private | 51-200
Teradata | 1979 | Private | 10,001+
Informatica | 1993 | Public | 1,001-5,000

Data Migration 

Data migration refers to the movement of data between locations, formats, or applications. It can be caused by the introduction of a new system or location for the data, such as the change from on-premises to cloud-based options.

Different tools offer different functionality depending on the specific needs of your migration. Some common types of data migration and their associated tools include:

Database migration: Also known as schema migration, this refers to managing incremental and reversible changes to relational database schemas, which allows for fixing mistakes and adapting data to new requirements. This type of migration is generally done when it’s time to upgrade or replace existing hard disks and servers, perform server maintenance, relocate a data center, or consolidate assets.
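As a rough illustration of incremental and reversible schema changes, the sketch below pairs each "up" statement with a "down" statement that undoes it, using Python's built-in `sqlite3` module. The `MIGRATIONS` list, the `migrate` helper, and the table names are hypothetical, and tracking the version with SQLite's `PRAGMA user_version` is an assumption made for brevity; dedicated migration tools keep their own version history.

```python
import sqlite3

# Each migration pairs an "up" statement with the "down" statement that reverses it.
MIGRATIONS = [
    ("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)",
     "DROP TABLE customer"),
    ("CREATE TABLE customer_email (customer_id INTEGER, email TEXT)",
     "DROP TABLE customer_email"),
]


def current_version(conn):
    """Read the schema version stored in the database file header."""
    return conn.execute("PRAGMA user_version").fetchone()[0]


def migrate(conn, target):
    """Apply 'up' steps or 'down' steps one at a time until the target version."""
    version = current_version(conn)
    while version < target:
        conn.execute(MIGRATIONS[version][0])
        version += 1
        conn.execute(f"PRAGMA user_version = {version}")
    while version > target:
        conn.execute(MIGRATIONS[version - 1][1])
        version -= 1
        conn.execute(f"PRAGMA user_version = {version}")
    conn.commit()


def table_names(conn):
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
migrate(conn, 2)                 # apply both changes
after_upgrade = table_names(conn)
migrate(conn, 1)                 # roll the second change back
after_rollback = table_names(conn)
print(after_upgrade, after_rollback)
```

Keeping every change small and paired with its reversal is what makes it practical to back out a mistake without rebuilding the database.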

Some tools that specialize in database migration are:

Name | Founded | Status | Number of Employees
AWS DMS and Schema Conversion | 2015 | Public | 10,001+
Attunity Database Migration | 1998 | Public | 201-500
Flyway Database Migration by Boxfuse | 2010 | Private |
FlySpeed by Active Database Software | 2005 | Private | 2-10
SAP HANA | 1972 | Public | 10,001+
Scribe Software | 1996 | Private | 51-200

Database migration also often includes storage migration, where volume data from an older storage system is moved to a new storage system with minimal disruption to ongoing daily processes.

Application migration: As the name suggests, this method focuses on moving an application from one environment to another. This can often mean moving from an on-premises to a cloud location. Such changes can be challenging because applications are often built around the specifics of their initial environment. Consequently, many vendors that support applications in multiple types of environments provide migration guides and tools to assist in the transition.

Some tools that have been developed specifically to help with application migration include:

Name | Founded | Status | Number of Employees
1E | 1997 | Private | 201-500
CloudSwitch | 2008 | Private | 11-50
Altoros | 2001 | Private | 201-500
CloudAtlas Inc | 2015 | Private | 11-50
Red Hat Application Migration Toolkit | 1993 | Public | 5,001-10,000
CloudEndure | 2012 | Private | 11-50

Enterprise Application Integration 

Enterprise application integration (EAI) is a category of approaches to obtaining interoperability between different business systems. Specifically, it requires approaching problems related to the modular architecture of the organization. The end goal of EAI is to minimize the number of single point-to-point connectors between services and applications through the use of different middleware.

Some functionalities that any EAI solution should help users to achieve include:

  • Activity monitoring and real-time analytics
  • Transformation of data
  • Process orchestration
  • Storage, routing, filtering

 Perspectives on EAI 

There are two common methodologies for achieving effective EAI: with an enterprise service bus (ESB) or via the ‘hub and spoke’ (broker) system.


An ESB works by enabling different applications to be connected via a ‘bus’ with which each application can communicate. This means that every application only needs to be able to communicate with the bus, not with every other application. Such a system allows for easier scaling and less dependency than point-to-point integration.
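The scaling argument can be illustrated with a small sketch. With point-to-point integration, n applications can need up to n(n-1)/2 connectors, whereas with a bus each application needs only its one connection to the bus. The `Bus` class below is a hypothetical in-memory stand-in for illustration only, and the topic name `order.created` is made up; it is not the API of any ESB product listed in this article.

```python
from collections import defaultdict


def point_to_point_links(n_apps):
    """Connectors needed if every application talks directly to every other one."""
    return n_apps * (n_apps - 1) // 2


class Bus:
    """Hypothetical in-memory bus: applications only know the bus, not each other."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, message):
        # Deliver the message to every handler registered for the topic.
        for handler in self._subscribers[topic]:
            handler(message)


bus = Bus()
billing_inbox, shipping_inbox = [], []
bus.subscribe("order.created", billing_inbox.append)
bus.subscribe("order.created", shipping_inbox.append)

# The ordering application publishes once; it has no reference to billing or shipping.
bus.publish("order.created", {"order_id": 1})
print(point_to_point_links(10), billing_inbox, shipping_inbox)
```

For ten applications the sketch counts 45 possible point-to-point connectors, compared with ten connections to a bus. Adding an eleventh application means one new connection rather than up to ten.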

Some ESB tools that can assist in the creation of the ideal EAI architecture include:

Name | Founded | Status | Number of Employees
Red Hat JBoss Fuse | 1993 | Public | 5,001-10,000
MuleSoft ESB | 2006 | Public | 1,001-5,000
Microsoft BizTalk | 2000 | Public | 10,001+
IBM WebSphere ESB | 1911 | Public | 10,001+
Oracle ESB | 1977 | Public | 10,001+
Talend Open Source ESB | 2005 | Public | 1,001-5,000
Fiorano | 1995 | Private | 51-200
Software AG webMethods | 1969 | Public | 1,001-5,000
WSO2 Carbon | 2005 | Private | 501-1,000
TIBCO ActiveMatrix Service Bus | 1997 | Private | 1,001-5,000

In a hub and spoke arrangement, applications do not communicate over a shared messaging bus as they do with an ESB. Instead, a central ‘hub’ distributes the right information to each of its ‘spokes’. This hub translates and routes all of the messages across services and operations.

Master Data Management 

Master Data Management (MDM) is an integrative method of linking all key data within an organization through a common point of reference. It can also help in enabling connectivity between differing system platforms, applications, and architectures. For an effective MDM strategy, members of the organization must learn how data is to be formatted, described, and accessed.

The capabilities that you require for your MDM platform will heavily influence the criteria and functionality by which tools are evaluated. However, there are some features to look out for in order to meet some of the most common tasks undertaken by MDM:

  • Multi-domain MDM support
  • Data model flexibility
  • Data standardization inclusive of matching, cleansing, merge, and unmerge
  • Support for data governance and related workflows
  • Matching and survivorship strategies
  • On-premises or cloud deployment
  • Integrations and data connectivity
  • Performance and scalability

Other features may also matter, depending on the specific needs of your business.
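To show what matching and survivorship mean in practice, here is a minimal sketch under two stated assumptions: records are matched when their normalized email addresses are equal, and for each field the most recently updated non-empty value survives. Both rules, the field names, and the sample records are hypothetical; MDM platforms typically let you configure much richer strategies.

```python
from collections import defaultdict


def match_key(record):
    """Matching rule (assumed): same normalized email means same entity."""
    return record["email"].strip().lower()


def survive(records):
    """Survivorship rule (assumed): per field, keep the newest non-empty value."""
    golden = {}
    # Process oldest first so newer non-empty values overwrite older ones.
    for record in sorted(records, key=lambda r: r["updated"]):
        for field, value in record.items():
            if value not in ("", None):
                golden[field] = value
    return golden


def build_golden_records(records):
    """Group matching records, then merge each group into one golden record."""
    groups = defaultdict(list)
    for record in records:
        groups[match_key(record)].append(record)
    golden_records = []
    for key, group in groups.items():
        golden = survive(group)
        golden["email"] = key  # store the standardized form of the email
        golden_records.append(golden)
    return golden_records


# Two hypothetical source systems describing the same customer.
crm = {"email": "Ann@Example.com ", "name": "Ann Lee", "phone": "", "updated": "2019-01-05"}
erp = {"email": "ann@example.com", "name": "A. Lee", "phone": "555-0100", "updated": "2018-06-01"}

golden = build_golden_records([crm, erp])
print(golden)
```

Here the newer CRM name wins, while the phone number survives from the older ERP record because the CRM value is empty.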

Some vendors in the MDM space include:

Name | Founded | Status | Number of Employees
Orchestra Networks EBX | 2000 | Private | 51-200
Dell Boomi | 1984 | Public | 10,001+
Stibo Systems STEP | 1976 | Private | 501-1,000
Profisee MDM | 2007 | Private | 51-200
Ataccama MDM | 2007 | Private | 51-200
Semarchy Intelligent MDM | 2011 | Private | 11-50
EnterWorks Enable | 1996 | Private | 51-200
Riversand MDM | 2001 | Private | 201-500
Information Builders Data Management Platform | 1975 | Private | 1,001-5,000
SAP Master Data Governance | 1972 | Public | 10,001+

Challenges with Data Integration

As with any major technical endeavor, there are a few challenges (and solutions) associated with data integration.

Challenge 1: A disjointed initiative, with data integration viewed largely as a technical effort that does not need business involvement.

SOLUTION: Appoint a champion who understands the data assets of the organization and will lead discussions regarding long-term integration plans. This will help to demonstrate the benefits of the initiative.

Challenge 2: Achieving an accurate analysis of requirements.

SOLUTION: Ask the following questions:

  1. What is the goal of the data integration?
  2. What are the deliverables and objectives?
  3. What are the business rules?
  4. Where will the data be sourced from?

Challenge 3: Achieving an accurate analysis of source systems.

SOLUTION: Ask the following questions:

  1. What are the extraction options?
  2. How is the data quality?
  3. What are the data volumes being processed?
  4. What is the frequency of extraction?

As with any analysis carried out before a new data integration effort, these are just a few questions to get you started.

Want to learn more about data integration, management, and other related tasks? Be sure to see our blog full of posts on these topics. Need a vendor to fulfill some of these tasks? Our directory of over 3000 vendors might be just the tool you need.
