Persisting Couchbase Data Across Container Restarts

Best Practices for Virtualized Platforms provide best practices for running Couchbase on a virtualized platform like Amazon Web Services and Azure. In addition, it also provide some recommendations for running it as Docker container.

One of the recommendations is to map Couchbase node specific data to a local folder. Let’s understand that in more detail.

Implicit Per-Container Storage

If a Couchbase container is started as:

This container:

  • Starts in a detached mode using -d
  • Different query, caching and administration ports are mapped using -p
  • A name is provided using --name
  • Image is couchbase/server:sandbox

By default, the data for the container is stored in a managed volume. Checking volume mounts using the docker inspect command shows:

The data for Couchbase is stored in the container filesystem defined by the value of Source attribute. This can be verified by logging into the root filesystem:

Now you can see the data directory:

A new directory is created for a new run of the container. This directory is still around when the container is stopped and removed but no longer easily accessible. Thus no data is preserved across container restarts.

The volume can be explicitly removed, along with container, using the command:

If the container terminates then the entire state of the application is lost.

Explicit Host Directory Mapping

Now, let’s start a Couchbase container with explicit volume mapping:

This container is very similar to the container started earlier. The main difference is that a directory from host ~/couchbase is mapped to a directory in the container /opt/couchbase/var.

Couchbase container persists any data in /opt/couchbase/var directory in the container filesystem. Now that directory is mapped to a directory on the host filesystem. This allows to persist state of the container outside on the host filesystem. The bypasses the union filesystem used by Docker and exposes the host filesystem to the container. This allows the state to persist across container restarts. The new container only needs to start with the exact same volume mapping.

More details about the container can be seen as:

jq is a JSON processor that needs to be installed separately. And the output is shown as:

This shows the source and destination directory. RW shows that the volume is read/write.

If the container is started using Docker for Mac, then Couchbase Web Console is accessible at http://localhost:8091. The Data Buckets tab shows the default travel-sample bucket:

docker-volume-couchbase-01

Click on Create New Data Bucket to create a new data bucket. Give it the name sample:

docker-volume-couchbase-02

The Data Buckets tab is updated with this newly created bucket:

docker-volume-couchbase-03

Now stop and remove the container:

Start the container again using the same command:

Data Buckets tab will show the same two buckets in the Couchbase Web Console.

 

In this case, if the container is started on a different host then the state would not be available. Or if the host dies then the state is lost.

An alternative and a more robust and foolproof way to manage persistence in containers is using a shared network filesystem such as Ceph, GlusterFS or Network Filesystem. Some other common approaches are to use Docker Volume Plugins like Flocker from ClusterHQ or Software Defined Storage such as PortWorx. All of these storage technique simplify how state of a container can be saved in a multi-container multi-host environment. A future blog will cover these techniques in detail.

Read more details in Managing data in containers.

couchbase.com/containers provide more details about how to run Couchbase in different container frameworks.

More information about Couchbase:

Source: blog.couchbase.com/2016/october/persisting-couchbase-data-across-container-restarts

Be Sociable, Share!

Leave a Reply

Your email address will not be published. Required fields are marked *