Data migration

Memgraph Lab provides multiple data migration options, including predefined datasets for exploration and tools for importing custom data. Users can import CSV files with the CSV file import tool, import a dataset as Cypher queries in the CYPHERL format, or export their data to a CYPHERL file. They can also stream real-time data from Kafka, Redpanda and Pulsar. The following sections explain how to use each approach efficiently.

Datasets

Memgraph Lab has a dedicated Datasets section that offers a collection of preloaded datasets covering different topics and sizes. It’s designed to help you:

  • Explore the Cypher query language: Run queries on real-world data and practice Cypher.
  • Learn Memgraph Lab features: Understand visualization, query execution and database exploration tools.
  • Test and experiment: Experiment with different graph structures before working with your own data.

You can inspect the structure of a dataset by viewing its graph schema and reading the explanations of its entities and their properties.

Exploring datasets

Once you have Memgraph Lab up and running, select a dataset to work with in the Datasets section of the sidebar. After importing a dataset, you can:

By leveraging the Datasets section in Memgraph Lab, users can quickly gain hands-on experience with graph databases, making it easier to transition into solving real-world graph problems with Memgraph.

Data migration tools

Memgraph Lab provides multiple tools for importing data into the database, each suited to different use cases. The CSV file import tool loads data from CSV files, making it easier to structure nodes and relationships. For real-time data ingestion, Memgraph supports streaming from Kafka, Redpanda and Pulsar sources. Additionally, the CYPHERL format enables importing data using Cypher queries, allowing direct creation of nodes and relationships. Each method offers specific advantages depending on the dataset size, structure and import requirements.

CSV file import

Memgraph Lab provides the CSV file import tool, which loads data row by row from CSV files and simplifies adding nodes and relationships to the graph database. Users can merge the new data with the existing dataset or reset the database entirely before importing. The CSV file import tool is best suited to small datasets and quick imports; larger datasets should be handled using the LOAD CSV command directly. Check out import best practices for more details on how to do that.
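As a rough illustration of the LOAD CSV route mentioned above, the following sketch loads nodes and relationships from two files. The file paths, column names (id, name, age, city, from_id, to_id) and the Person label are assumptions made for this example; the paths must be readable by the Memgraph server itself.

```cypher
// Create the index first so the MATCH lookups below do not scan all nodes.
CREATE INDEX ON :Person(id);

// Hypothetical file with a header row: id,name,age,city
// LOAD CSV reads every value as a string, so age is converted explicitly.
LOAD CSV FROM "/import/people.csv" WITH HEADER AS row
CREATE (:Person {id: row.id, name: row.name, age: toInteger(row.age), city: row.city});

// Hypothetical file with a header row: from_id,to_id
LOAD CSV FROM "/import/friendships.csv" WITH HEADER AS row
MATCH (u:Person {id: row.from_id}), (v:Person {id: row.to_id})
CREATE (u)-[:IS_FRIENDS_WITH]->(v);
```

Loading nodes before relationships keeps each relationship row down to two indexed lookups, which is usually what matters most for larger files.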

Import CSV files

  • Importing files with and without headers: Memgraph supports CSV files with or without headers. Users can upload files, configure nodes by assigning labels and properties and manage indexing and uniqueness constraints. Relationships can be configured by selecting the start and end nodes and setting relevant properties.
  • Importing data from a single CSV file: Data can be imported from a single CSV file containing both nodes and relationships. Users assign labels, set properties and configure relationships accordingly.
  • Importing data from multiple CSV files: Data can be distributed across multiple CSV files, with each file containing nodes of a specific label or relationships of a specific type. Users configure each file by assigning labels, selecting properties and specifying handling rules for duplicate data.

Finalize the import

Once all files are configured, Memgraph executes the import process in four steps:

  1. Validating files to ensure consistency.
  2. Uploading to Memgraph for processing.
  3. Executing the import, including setting up indexes and constraints.
  4. Cleaning up unnecessary files after a successful import.

After completing the import, users can visualize the data using Cypher queries, such as:

MATCH p=()-[]-() RETURN p;

CYPHERL

Memgraph Lab provides a convenient way to import or export your data using the CYPHERL format, which represents data in the form of Cypher queries.

Import from CYPHERL

To import a CYPHERL file in Memgraph Lab, navigate to the Import section and select the CYPHERL file or drag and drop it into Memgraph Lab. CYPHERL files contain Cypher queries (such as CREATE and MERGE statements) that create nodes and relationships in your graph database. The benefit of using CYPHERL is that you only need a single file to import both nodes and relationships.

When importing CYPHERL files via Memgraph Lab, be aware of the following:

  • You can import up to 1 million nodes and relationships via Memgraph Lab using CYPHERL files.
  • All existing indexes are automatically dropped during the import process. To ensure optimal performance, include CREATE INDEX queries at the beginning of your CYPHERL file.
  • To speed up the import, consider creating indexes on the node labels or node properties that your queries match on.

In a CYPHERL file, each Cypher query must be written on its own line. Here’s an example of what a CYPHERL file might contain:

CREATE (:Person {id: "100", name: "Daniel", age: 30, city: "London"});
CREATE (:Person {id: "101", name: "Alex", age: 15, city: "Paris"});
MATCH (u:Person), (v:Person) WHERE u.id = "100" AND v.id = "101" CREATE (u)-[:IS_FRIENDS_WITH]->(v);

This example creates two Person nodes and establishes a relationship between them.
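The recommendation to start the file with CREATE INDEX queries can be sketched with a variant of the same data. This is only an illustration: the choice to index the id property of Person is an assumption based on the MATCH in the last line, and comments are left out because this section does not say whether the importer accepts comment lines.

```cypher
CREATE INDEX ON :Person(id);
CREATE (:Person {id: "100", name: "Daniel", age: 30, city: "London"});
CREATE (:Person {id: "101", name: "Alex", age: 15, city: "Paris"});
MATCH (u:Person {id: "100"}), (v:Person {id: "101"}) CREATE (u)-[:IS_FRIENDS_WITH]->(v);
```

With the index in place before any nodes are created, the lookup by id in the final query can use it instead of scanning every Person node.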

Export to CYPHERL

In addition to importing, Memgraph Lab also allows you to export your data in CYPHERL format. This is useful for backing up your database or transferring data between Memgraph instances.

Streams

Streaming in Memgraph Lab enables real-time data processing, allowing users to ingest and analyze data as it arrives. With built-in tools and integrations, you can manage and monitor streams directly within Memgraph Lab. Here are some of the features streams offer:

  • Data ingestion: Connect to streaming platforms like Kafka, Redpanda and Pulsar or create custom streams that suit your specific needs. Live data from various sources flows directly into your applications, enabling real-time processing with minimal delays to keep your system responsive and up to date.
  • Stream processing: Run Cypher queries on incoming data to extract valuable insights instantly. Triggers can be set up to react the moment new data arrives, enabling automated, event-driven workflows that respond in real time. As data streams in, it can be modified and enriched on the fly, ensuring it’s properly structured and ready for use throughout your system.
  • Stream management: Monitor all active streams in real time. You can start, stop, or adjust streams as needed to adapt to changing demands, while also fine-tuning performance to keep data flowing efficiently and avoid processing bottlenecks.
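The ingestion and management features above map onto Memgraph's stream queries, which can also be run by hand. The following is a minimal sketch for a Kafka source, not a complete setup: the stream name, topic name, broker address and transformation procedure (people_transform.to_cypher) are all hypothetical, and the transformation module is assumed to be already loaded in Memgraph.

```cypher
// Hypothetical names: stream "people_stream", topic "people",
// transformation procedure "people_transform.to_cypher".
// Assumes a Kafka broker reachable at localhost:9092.
CREATE KAFKA STREAM people_stream
  TOPICS people
  TRANSFORM people_transform.to_cypher
  BOOTSTRAP_SERVERS "localhost:9092";

// Begin consuming messages from the topic.
START STREAM people_stream;

// List all streams together with their current status.
SHOW STREAMS;

// Pause consumption without deleting the stream definition.
STOP STREAM people_stream;
```

The transformation procedure is what turns each incoming message into Cypher queries, so the shape of the resulting graph depends on that module and not on the stream definition.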

To explore how to set up and manage streams in Memgraph Lab, check out the full documentation for more detailed explanations: Managing streams in Memgraph Lab.

To make sure you achieve the best import speed with Memgraph, please refer to the import best practices.