Overview of supported data sources

Memgraph supports multiple data sources for importing data into a running instance. Whether your data is structured in files, relational databases, or other graph databases, Memgraph provides the flexibility to integrate and analyze your data efficiently.

Memgraph supports file-based imports, such as CSV files, for structured data ingestion. To migrate directly from another data source, use the migrate module from Memgraph MAGE, which removes the need to convert your data into intermediate files.

To learn the prerequisites for importing data into Memgraph, see the import best practices.

File types

CSV files

CSV files provide a simple and efficient way to import tabular data into Memgraph using the LOAD CSV clause.

By following the LOAD CSV import guide, you can implement your own CSV import through Memgraph Lab or a simple database driver.
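As a rough illustration of the driver route, the Python sketch below renders a small CSV and builds a matching LOAD CSV query. The path /import/people.csv, the Person label, and the column names are assumptions made for this example, and run_import assumes the third-party neo4j Python driver connecting over Bolt to a local instance without authentication. Values read by LOAD CSV arrive as strings, hence the toInteger calls.

```python
import csv
import io


def render_people_csv(people):
    """Render rows as CSV text with a header line (hypothetical columns)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["id", "name", "age"])
    writer.writeheader()
    writer.writerows(people)
    return buffer.getvalue()


def build_load_csv_query(csv_path):
    """Return a LOAD CSV query that creates one :Person node per CSV row."""
    return (
        f'LOAD CSV FROM "{csv_path}" WITH HEADER AS row '
        "CREATE (:Person {id: toInteger(row.id), name: row.name, "
        "age: toInteger(row.age)});"
    )


def run_import(query, uri="bolt://localhost:7687"):
    """Send the query to a running instance (assumes the neo4j driver package)."""
    from neo4j import GraphDatabase  # third-party, imported lazily

    with GraphDatabase.driver(uri, auth=("", "")) as driver:
        with driver.session() as session:
            session.run(query).consume()


csv_text = render_people_csv(
    [
        {"id": 1, "name": "John", "age": 18},
        {"id": 2, "name": "Mark", "age": 25},
    ]
)
# The path is an assumption; the file must be readable by the Memgraph server.
query = build_load_csv_query("/import/people.csv")
print(query)
```

Saving csv_text to a location the Memgraph server can read and then calling run_import(query) would perform the import; the same query string can also be pasted into Memgraph Lab.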

JSON files

Memgraph can import structured and semi-structured data from JSON files using the json_util and import_util modules.

Check out the JSON import guide.

Cypherl files

A Cypherl file is a simple format that contains one Cypher command per line. It is usually divided into two parts:

  • CREATE statements for nodes, followed by
  • MATCH ... MATCH ... CREATE statements for relationships

Node creation in a Cypherl file looks similar to this:

CREATE (:Person {id: 1, name: "John", age: 18});
CREATE (:Person {id: 2, name: "Mark", age: 25});

Relationship creation in a Cypherl file looks similar to this:

MATCH (p1:Person {id: 1}) MATCH (p2:Person {id: 2}) CREATE (p1)-[:KNOWS]->(p2);

If you have existing Cypherl files defining your graph structure, Memgraph can execute them directly to populate your database. See the Cypherl import guide.
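If your data does not yet exist as a Cypherl file, one can be generated with a short script. The Python sketch below reproduces the two-part layout described above: node CREATE lines first, then relationship lines. The helper names (cypher_literal, node_line, relationship_line, build_cypherl) are hypothetical, and the literal rendering only covers simple values such as integers, floats, booleans, and strings.

```python
import json


def cypher_literal(value):
    """Render a simple Python value as a Cypher literal."""
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return repr(value)
    # json.dumps produces a double-quoted, escaped string literal.
    return json.dumps(value)


def node_line(label, props):
    """Build one CREATE statement for a node."""
    body = ", ".join(f"{key}: {cypher_literal(val)}" for key, val in props.items())
    return f"CREATE (:{label} {{{body}}});"


def relationship_line(label, start_id, end_id, rel_type):
    """Build one MATCH ... MATCH ... CREATE statement for a relationship."""
    return (
        f"MATCH (p1:{label} {{id: {cypher_literal(start_id)}}}) "
        f"MATCH (p2:{label} {{id: {cypher_literal(end_id)}}}) "
        f"CREATE (p1)-[:{rel_type}]->(p2);"
    )


def build_cypherl(people, knows):
    """Return Cypherl lines: all nodes first, then all relationships."""
    lines = [node_line("Person", person) for person in people]
    lines += [relationship_line("Person", a, b, "KNOWS") for a, b in knows]
    return lines


people = [
    {"id": 1, "name": "John", "age": 18},
    {"id": 2, "name": "Mark", "age": 25},
]
lines = build_cypherl(people, [(1, 2)])
print("\n".join(lines))
```

Writing the joined lines to a file gives a Cypherl file with the same statements as the examples above. Creating the nodes before the relationships matters because each relationship line has to MATCH nodes that already exist.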

Database management systems

DuckDB

Memgraph is fully compatible with all data source systems supported by DuckDB. Using our DuckDB migration module, you can load data from any DuckDB-supported source, run analytical queries directly on it, and import the results into Memgraph.

MySQL

Memgraph can integrate with MySQL databases, enabling seamless migration and real-time synchronization. Learn more about MySQL data import.

PostgreSQL

Memgraph can integrate with PostgreSQL databases using the PostgreSQL migration module.

SQL Server

For enterprises using Microsoft SQL Server, Memgraph supports data import using ETL processes or direct queries. Read the SQL Server import documentation.

StarRocks

Memgraph supports migration from StarRocks through its compatibility with the MySQL migration module. Since StarRocks implements a MySQL-compatible protocol, you can use the same module to connect and migrate data into Memgraph.

OracleDB

Memgraph can extract and import data from OracleDB, leveraging SQL queries or external connectors. Find more details in the OracleDB import guide.

Graph databases

Neo4j

Memgraph offers a dedicated Neo4j migration module that allows users to easily transition from Neo4j. By connecting to an existing Neo4j instance, you can migrate your entire graph to Memgraph using a single Cypher query.

Memgraph

The fastest way to migrate from one Memgraph instance to another is by using snapshot recovery, which leverages Memgraph’s proprietary durability files for efficient data backup and restore. This process can be parallelized during startup for even faster recovery.

If you only want to migrate a portion of your data, Memgraph also provides a migration module that allows selective migration between Memgraph instances.

RPC protocols

Arrow Flight

Memgraph supports migration from any system that uses Arrow Flight as its RPC protocol for fast and efficient data transfer. The Arrow Flight migration module enables you to stream data directly into Memgraph.

Streaming sources

Apache Kafka

Memgraph offers native integration with Apache Kafka, enabling real-time data streaming directly into the graph. It includes built-in Kafka consumers for easy setup and processing of incoming data streams.

For more advanced setups, users can integrate with Kafka Connect or build custom consumers using Memgraph client libraries available in multiple programming languages.
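Whichever consumer you use, each incoming message has to be turned into a Cypher query before it can change the graph. The Python sketch below shows only that conversion step, under an assumed message schema: a JSON payload with hypothetical "from" and "to" fields describing a KNOWS relationship between two Person nodes. It is plain Python and does not use Memgraph's own API; in a real setup this logic would be wrapped in whatever your consumer or transformation module expects.

```python
import json


def message_to_query(payload):
    """Turn one JSON message (bytes) into a parameterized Cypher query.

    The "from"/"to" fields are an assumed schema for this example.
    """
    event = json.loads(payload.decode("utf-8"))
    query = (
        "MERGE (a:Person {id: $from_id}) "
        "MERGE (b:Person {id: $to_id}) "
        "MERGE (a)-[:KNOWS]->(b)"
    )
    parameters = {"from_id": event["from"], "to_id": event["to"]}
    return query, parameters


query, parameters = message_to_query(b'{"from": 1, "to": 2}')
print(query)
print(parameters)
```

MERGE is used instead of CREATE so that a replayed or duplicated message does not create duplicate nodes or relationships, and passing values as parameters avoids building queries through string concatenation.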

Apache Pulsar

Memgraph offers native integration with Apache Pulsar, enabling real-time data streaming directly into the graph. It includes built-in Pulsar consumers for easy setup and processing of incoming data streams.

For more advanced setups, users can build custom consumers using Memgraph client libraries available in multiple programming languages.

Apache Spark

Memgraph is compatible with the Neo4j Spark Connector, which makes it easy to integrate Spark-powered pipelines with Memgraph. You can use this connector to process large-scale data in Spark and ingest the results directly into Memgraph for real-time graph analytics.

Dremio

Memgraph supports migration from the Dremio query engine using the Arrow Flight RPC protocol, which enables high-performance data transfer from data lakes and lakehouses. You can use Dremio to query formats like Apache Iceberg and stream the results directly into Memgraph.

Data platforms

ServiceNow

Memgraph supports migration from ServiceNow through its dedicated migration module. By connecting to the ServiceNow REST API, the module enables you to retrieve and ingest data directly into Memgraph for further analysis and visualization.

Storage services

Amazon S3

Memgraph supports data migration from Amazon S3 and other S3-compatible storage services. Using the S3 migration module, you can load structured data stored in cloud object storage directly into Memgraph for graph-based processing and analysis.

If you’re unsure about the best way to import your data or need assistance, reach out on Discord for help.