Monthly Archives: January 2017

Deploy Docker Compose Services to Swarm

Docker 1.13 introduced a new version of Docker Compose. The main feature of this release is that it allow services defined using Docker Compose files to be directly deployed to Docker Engine enabled with Swarm mode. This enables simplified deployment of multi-container application on multi-host.

Docker 1.13

This blog will show use a simple Docker Compose file to show how services are created and deployed in Docker 1.13.

Here is a Docker Compose v2 definition for starting a Couchbase database node:

This definition can be started on a Docker Engine without Swarm mode as:

This will start a single replica of the service define in the Compose file. This service can be scaled as:

If the ports are not exposed then this would work fine on a single host. If swarm mode is enabled on on Docker Engine, then it shows the message:

Docker Compose gives us multi-container applications but the applications are still restricted to a single host. And that is a single point of failure.

Swarm mode allows to create a cluster of Docker Engines. With 1.13, docker stack deploy command can be used to deploy a Compose file to Swarm mode.

Here is a Docker Compose v3 definition:

As you can see, the only change is the value of version attribute. There are other changes in Docker Compose v3. Also, read about different Docker Compose versions and how to upgrade from v2 to v3.

Enable swarm mode:

Other nodes can join this swarm cluster and this would easily allow to deploy the multi-container application to a multi-host as well.

Deploy the services defined in Compose file as:

A default value of Compose file here would make the command a bit shorter. #30352 should take care of that.

List of services running can be verified using docker service ls command:

The list of containers running within the service can be seen using docker service ps command:

In this case, a single container is running as part of the service. The node is listed as moby which is the default name of Docker Engine running using Docker for Mac.

The service can now be scaled as:

The list of container can then be seen again as:

Note that the containers are given the name using the format <service-name>_n. Both the containers are running on the same host.

Also note, the two containers are independent Couchbase nodes and are not configured in a cluster yet. This has already been explained at Couchbase Cluster using Docker and a refresh of the steps is coming soon.

A service will typically have multiple containers running spread across multiple hosts. Docker 1.13 introduces a new command docker service logs <service-name> to stream the log of service across all the containers on all hosts to your console. In our case, this can be seen using the command docker service logs couchbase_db and looks like:

The preamble of the log statement uses the format <container-name>.<container-id>@<host>. And then actual log message from your container shows up.

At first instance, attaching container id may seem redundant. But Docker services are self-healing. This means that if a container dies then the Docker Engine will start another container to ensure the specified number of replicas at a given time. This new container will have a new id. And thus it allows  to attach the log message from the right container.

So a quick comparison of commands:

 Docker Compose v2  Docker compose v3
 Start services docker-compose up -d docker stack deploy --compose-file=docker-compose.yml <stack-name> 
 Scale service docker-compose scale <service>=<replicas> docker service scale <service>=<replicas>
 Shutdown docker-compose down docker stack rm <stack-name>
 Multi-host No Yes

Want to get started with Couchbase? Look at Couchbase Starter Kits.

Want to learn more about running Couchbase in containers?

Source: https://blog.couchbase.com/2017/deploy-docker-compose-services-swarm

Docker 1.13 Management Commands

Docker 1.13 was released yesterday, congratulations!

Docker 1.13

A quick summary of the key features:

  • Compose file to deploy Swarm mode services
  • Improved CLI backwards compatibility
  • Clean-up commands
  • CLI restructured
  • Monitoring and Build improvements

Learn more details about these features in this video by @manomarks:

Getting Started with Docker 1.13

Use Docker for Mac or Windows to get started. Once installed, the version information looks like:

Problems with Docker CLI

Docker 1.12 CLI has about ~40 top-level solo commands. While these commands wokred very well but they had a few issues:

  1. The commands are listed in one list without any organization. That makes it difficult for newbies to get started and learn the commands. (#8756)
  2. The command, such as docker inspect, also does not provide enough context whether they are operating on image or container. This mixing of image and container commands can cause confusion. (#13509)
  3. There is no consistency of command names. For example docker images is a plural and gives the list of images where as docker ps is singular and gives the list of containers. And they of course have the naming inconsistency issue. (#8829)
  4. Some of the commands like build and run are used heavily and then some arcane ones like pause and wait not so often. It does not seem fair to keep all the commands at the same level.

Docker 1.13 fixes this problem!

Docker Management Commands

Docker 1.13 groups the commands logically into management commands.

Here are the top-level solo commands now:

Now a list of images is obtained using docker image ls command instead of docker images command. Similar docker container ls shows the list of containers instead of docker ls. This brings a lot of consistency across the commands and that would make it intuitive and easier for newbie and pros to remember the commands.

Each management command has some similar set of sub-commands where they perform the operation on the command category:

Sub-command Purpose
ls List <category> (image, container, volume, secret, etc)
rm Remove <category>
inspect Inspect <category>

And there are other sub-commands based upon the management category.

Some of the heavily used commands are still at the top level.

By default, all the top-level commands are also shown. But you can set the DOCKER_HIDE_LEGACY_COMMANDS environment variable to show only the management commands. So even though docker --help will show all the solo and management commands. But the following commands will only show the new management commands:

The old syntax is still supported but it recommended to start moving to new commands.

A new Couchbase container can be started as:

The list of images can be seen as:

Mapping Docker Solo to Management Commands

Let’s look at how the existing top-level commands match to the management commands:

1.12 1.13 Purpose
attach container attach Attach to a running container
build image build Build an image from a Dockerfile
commit container commit Create a new image from a container’s changes
cp container cp Copy files/folders between a container and the local filesystem
create container create Create a new container
diff container diff Inspect changes on a container’s filesystem
events system events Get real time events from the server
exec container exec Run a command in a running container
export container export Export a container’s filesystem as a tar archive
history image history Show the history of an image
images image ls List images
import image import Import the contents from a tarball to create a filesystem image
info system info Display system-wide information
inspect container inspect Return low-level information on a container, image or task
kill container kill Kill one or more running containers
load image load Load an image from a tar archive or STDIN
login login Log in to a Docker registry.
logout logout Log out from a Docker registry.
logs container logs Fetch the logs of a container
network network Manage Docker networks
node node Manage Docker Swarm nodes
pause container pause Pause all processes within one or more containers
port container port List port mappings or a specific mapping for the container
ps container ls List containers
pull image pull Pull an image or a repository from a registry
push image push Push an image or a repository to a registry
rename container rename Rename a container
restart container restart Restart a container
rm container rm Remove one or more containers
rmi image rm Remove one or more images
run container run Run a command in a new container
save image save Save one or more images to a tar archive (streamed to STDOUT by default)
search search Search the Docker Hub for images
service service Manage Docker services
start container start Start one or more stopped containers
stats container stats Display a live stream of container(s) resource usage statistics
stop container stop Stop one or more running containers
swarm swarm Manage Docker Swarm
tag image tag Tag an image into a repository
top container top Display the running processes of a container
unpause container unpause Unpause all processes within one or more containers
update container update Update configuration of one or more containers
version version Show the Docker version information
volume volume Manage Docker volumes
wait container wait Block until a container stops, then print its exit code

Sign up for Docker Online Meetup on 1/25 at 10am PST for more details on Docker 1.13.

Use Docker for Mac or Windows to get started with Docker 1.13.

And of course, you can learn more about how to run Couchbase on Containers.

Source: https://blog.couchbase.com/2017/docker-1-13-management-commands

Analyze Donald Trump Tweets with Couchbase and N1QL

AWS Serverless Lambda Scheduled Events to Store Tweets in Couchbase explained how to store tweets in Couchbase using AWS Serverless Lambda. Now, this Lambda Function has been running for a few days and has collected 269 tweets from @realDonaldTrump. This blog , inspired by SQL on Twitter: Analysis Made Easy Using N1QL, will show how these tweets can be analyzed using N1QL.

trump-tweets
N1QL is a SQL-like query language from Couchbase that operates on JSON documents. N1QL and SQL Differences provide differences between N1QL and SQL. Let’s use N1QL to reveal some interesting information from @realDonaldTrump‘s tweets.

Many thanks to Sitaram from N1QL team to help hack the queries.

How Many Tweets

First query is to find out how many tweets are available in the database. The query is pretty simple:

Query:

As you notice, the syntax is very similar to SQL. SELECT, COUNT and FROM clauses are what you are already familiar with from SQL syntax. tweet_count is an alias defined for the returned result. twitter is the bucket where all the JSON documents are stored.

Results:

The result is a JSON document as well.

Tweet Sample JSON Document

In order to write queries on a JSON document, you need to know the structure of the document. The next query will give you that.

Query:

The new clause introduced here is LIMIT. This allows to restrict the number of objects that are returned in a result set of SELECT.

Results:

Top 5 Tweeting Days

After the basic queries are out of the way, let’s look at some interesting data now.

What are the top 5 days on which @realDonaldTrump tweeted and the tweet count?

Query:

Usual GROUP BY and ORDER BY SQL clauses perform the same function.

N1QL Functions apply a function to values. The createdAt field is returned a number as a String. TO_NUM function converts the String to a number. MILLIS_TO_STR function converts the String to a date. Finally, SUBSTR function extracts the relevant part of the date.

Results:

Jan 17th, 2017 is the most tweeted day. Now, this result is of course restricted to the data from the JSON documents stored in the database.

Does anybody have a more comprehensive database of @realDonaldTrump tweets?

Tweet Frequency

OK, our database shows that that maximum number of tweets in a day were 13. How do I find out how many days @realDonaldTrump tweeted a certain number of times?

Query:

This is easily achieved using N1QL nested queries.

Results:

In 47 days, there is only one day with a single tweet. A sum total of tweet_count shows that there is no single day without a tweet :)

Most Common Hour In a Day To Tweet

@realDonaldTrump is known to tweet at 3am. Let’s take a look what are the most common hours for him to tweet.

Query:

Results:

Now seems like the controversial tweets come at 3am. But 39 tweets are coming at 1pm ET, likely right after lunch and while having a dessert 😉

Common Day of The Week to Tweet

Let’s find out what are the most common day of the week to tweet.

Query:

DATE_PART_STR is a new function returns date part of the date. Further day_of_week attribute is used to get day of the week.

Results:

Seems like Tuesday is the most common day to tweet. Then comes Sunday and Wednesday at the same level. The performance tends to fizzle out closer to the weekend.

Here is a nice chart that shows the same trend:

realdonaldtrump-tweets-per-day

#22417 should allow to report the weekday part in English.

Top 5 Mentions in Tweets

Query:

userMentionEntities is a nested array in the JSON document. UNNEST conceptually performs a join of the nested array with its parent object. Each resulting joined object becomes an input to the query.

Results:

Needless to say, he mentions his own name the most in tweets! And his two favorite TV stations Fox News and CNN.

Top 5 Tweets with RTs

Lambda Function wakes up every 3 hours and fetches the latest tweets. So the database is a snapshot of tweets and associated information such as RTs and Favorites. So depending upon when the tweet was archived, the RTs and Favorites may not be an accurate representation. But given this information, let’s take a look at the tweets with most RTs.

Query:

Pretty straight forward query.

Results:

Original vs RTs

How many of tweets were written vs retweeted?

Query:

Results:

Most of the tweets are original with only a few RTs.

Most Common Words in Tweet

Query:

This query uses SPLIT function that

Results:

Frequency of words “media”, “fake” and “America” in tweets

Query:

LOWER function is used to compare words independent of the case.

Result:

Lambda function will continue to store tweets in the database.

Try these queries yourself?

N1QL References

Source: https://blog.couchbase.com/2017/january/analyze-donald-trump-tweets-couchbase-n1ql

AWS Serverless Lambda Scheduled Events to Store Tweets in Couchbase

This blog has explained a few Serverless concepts with code samples:

This particular blog entry will show how to use AWS Lambda to store tweets of a tweeter in Couchbase. Here are the high level components:

 

lambda-twitter-couchbase

The key concepts are:

Complete sample code for this blog is available at github.com/arun-gupta/twitter-n1ql.

Serverless Application Model

Serverless Application Model, or SAM, defines simplified syntax for expressing serverless resources. SAM extends AWS CloudFormation to add support for API Gateway, AWS Lambda and Amazon DynamoDB. Read more details in Microservice using AWS Serverless Application Model and Couchbase.

For our application, SAM template is available at github.com/arun-gupta/twitter-n1ql/blob/master/template-example.yml and shown below:

What do we see here?

  • Function is packaged and available in a S3 bucket
  • Handler class is org.sample.twittter.TwitterRequestHandler and is at github.com/arun-gupta/twitter-n1ql/blob/master/twitter-feed/src/main/java/org/sample/twitter/TwitterRequestHandler.java. It looks like:
    By default, this class reads the twitter handle of Donald Trump. More fun on that coming in a subsequent blog.
  • COUCHBASE_HOST and COUCHBASE_BUCKET_PASSWORD are environment variables that provide EC2 host where Couchbase database is running and the password of the bucket.
  • Function can be triggered by different events. In our case, this is triggered every three hours. More details about the expression used here are at Schedule Expressions Using Rate or Cron.

Fetching Tweets using Twitter4J

Tweets are read using Twitter4J API. It is an unofficial Twitter API that provides a Java abstraction over Twitter REST API. Here is a simple example:

 

Twitter4J Docs and Javadocs are pretty comprehensive.

Twitter API allows to read only last 200 tweets. Lambda function is invoked every 3 hours. The tweet frequency of @realDonaldTrump is not 200 every 3 hours, at least yet. If it does reach that dangerous level then we can adjust the rate to trigger Lambda function more frequently.

JSON representation of each tweet is stored in Couchbase server using Couchbase Java SDK. AWS Lambda supports Node, Python and C#. And so you can use Couchbase Node SDK, Couchbase Python SDK or Couchbase .NET SDK to write these functions as well.

Twitter4J API allows to fetch tweets since the id of a particular tweet. This allows to ensure that duplicate tweets are not fetched. This requires us to sort all tweets in a particular order and then pick the id of the most recent tweet. This was solved using the simple N1QL query:

The syntax is very SQL-like. More on this in a subsequent blog.

Store Tweets in Couchbase

The final item is to store the retrieved tweets in Couchbase.

Value of COUCHABSE_HOST environment variable is used to connect to the Couchbase instance. The value of COUCHBASE_BUCKET_PASSWORD environment variable is to connect to the secure bucket where all JSON documents are stored. Its very critical that the bucket be password protected and not directly specified in the source code. More on this in a subsequent blog.

The JSON document is upserted (insert or update) in Couchbase using the Couchbase Java API:

 

This Lambda Function has been running for a few days now and has captured 258 tweets from @realDonaldTrump.

serverless-lambda-couchbase-twitter-bucket

An interesting analysis of his tweets is coming shortly!

Talk to us:

Complete sample code for this blog is available at github.com/arun-gupta/twitter-n1ql.

Source: https://blog.couchbase.com/2017/january/aws-serverless-lambda-scheduled-events-tweets-couchbase

Couchbase Customer Advisory Note – Security

In light of the recent widespread news about security vulnerabilities in MongoDB and Elasticsearch, we want to proactively remind our customers of Security Best Practices for Couchbase.

At this time there have been no known ransomware attacks on Couchbase, and no new security vulnerabilities have been identified in the product. This advisory is in the spirit of ‘forewarned is forearmed’.

Comprehensive security planning is a complex topic, but getting started with Security Basics is not. This Advisory Note is intended as a heads-up and reminder of general security best practices as well as Couchbase security capabilities available to you. First of all, let’s start with the basics. All Couchbase Server installations should ensure that:

  • Proper physical security (server access and backup storage) is maintained.
  • Couchbase Server nodes are behind a firewall so that they are not publically accessible. Here is how to configure network access to Couchbase using IP tables.
  • The server operating system is up to date with the latest security patches.
  • Delete the “default” bucket.
  • Secure in-transit data by using SSL connections for client/server and server/server communication.
  • Use a strong and unique bucket password for all data buckets.
  • Add security to your Couchbase mobile application
  • Encrypt Couchbase Lite databases

Additionally, customers should consult the following Couchbase resources in order to build a comprehensive security plan:

Documentation

Blogs

As always, please reach out to us if you have any questions.

How to contact?

Source: https://blog.couchbase.com/2017/couchbase-customer-advisory-note-security

Starting a Kubernetes 1.5.x cluster

Kubernetes

Kubernetes 1.5.0 was released just about a month ago! Key theme for the release are:

Read CHANGELOG for complete details.

Up until 1.5.0, starting up a Kubernetes cluster on Amazon Web Services was pretty straight forward.

But with 1.5.0 and 1.5.1, the command fails with the error:

What happened?

Basically, Kubernetes binaries was getting bigger than 1GB. The binary was broken into a basic install bundle and client and server binaries. The updated installation process requires to download the basic install bundle of of 4.57 MB (yes, MB instead of GB). It includes cluster scripts like kubectl, kube-up.sh and kube-down.sh, examples, docs and other scripts. This then downloads client and server binaries. Server binary is the base image that is used to start EC2 instances. But instead of automating the download of binaries, somebody decided to add a README in the server directory.

This was a big user experience change, and no links in the README bundled with the release or the release blog. Ouch!

Anyway, this was filed as #38728 and fixed promptly. But it missed the 1.5.1 release and now finally showed up in the 1.5.2 release today.

So, how do you run a Kubernetes 1.5.2 cluster on AWS?

It is more seamlessly integrated now but you need to hit Enter key a couple of times to accept the default value:

After the usual Kubernetes cluster is created, the output is shown as:

Even though your Kubernetes cluster on AWS starts up fine, but kube-up.sh script is going to be deprecated soon. The recommended way is to use Kubernetes Cluster on Amazon using Kops.

Now that your Kubernetes cluster is up, what do you do next?

Source: https://blog.couchbase.com/2017/january/starting-kubernetes-1.5.x-cluster

Microservice using AWS Serverless Application Model and Couchbase

Amazon Web Services introduced Serverless Application Model, or SAM, a couple of months ago. It defines simplified syntax for expressing serverless resources. SAM extends AWS CloudFormation to add support for API Gateway, AWS Lambda and Amazon DynamoDB. This blog will show how to create a simple microservice using SAM. Of course, we’ll use Couchbase instead of DynamoDB!

This blog will also use the basic concepts explained in Microservice using AWS API Gateway, AWS Lambda and Couchbase. SAM will show the ease with which the entire stack for microservice can be deployed and managed.

As a refresher, here are key components in the architecture:

serverless-microservice

  • Client could be curl, AWS CLI/Console, Postman client or any other tool/API that can invoke a REST endpoint.
  • AWS API Gateway is used to provision APIs. The top level resource is available at path /books. HTTP GET and POST methods are published for the resource.
  • Each API triggers a Lambda function. Two Lambda functions are created, book-list function for listing all the books available and book-create function to create a new book.
  • Couchbase is used as a persistence store in EC2. All the JSON documents are stored and retrieved from this database.

Other blogs on serverless:

Let’s get started!

Serverless Application Model (SAM) Template

An AWS CloudFormation template with serverless resources conforming to the AWS SAM model is referred to as a SAM file or template. It is deployed as a CloudFormation stack.

Let’s take a look at our SAM template:

This template is available at github.com/arun-gupta/serverless/blob/master/aws/microservice/template.yml.

SAM template Specification provide complete details about contents in the template. The key parts of the template are:

  • Defines two resources, both of Lambda Function type identified by AWS::Serverless::Function attribute. Name of the Lambda function is defined by Resources.<resource>.
  • Class for each handler is defined by the value of Resources.<resource>.Properties.Handler attribute
  • Java 8 runtime is used to run the Function defined by Resources.<resource>.Properties.Runtime attribute
  • Code for the class is uploaded to an S3 bucket, in our case to s3://serverless-microservice/microservice-http-endpoint-1.0-SNAPSHOT.jar
  • Resources.<resource>.Properties.Environment.Variables.COUCHBASE_HOST attribute value defines the host where Couchbase is running. This can be easily deployed on EC2 as explained at Setup Couchbase.
  • Each Lambda function is triggered by an API. It is deployed using AWS API Gateway. The path is defined by Events.GetResource.Properties.Path. HTTP method is defined using Events.GetResource.Properties.Method attribute.

Java Application

The Java application that contains the Lambda functions is at github.com/arun-gupta/serverless/tree/master/aws/microservice/microservice-http-endpoint.

Lambda function that is triggered by HTTP GET method is shown:

A little bit of explanation:

  • Each Lambda function needs to implement the interface com.amazonaws.services.lambda.runtime.RequestHandler.
  • API Gateway and Lambda integration require a specific input format and output format. These formats are defined as GatewayRequest and GatewayResponse classes.
  • Function logic uses Couchbase Java SDK to query the Couchbase database. N1QL query is used to query the database. The results and exception are then wrapped in GatewayRequest and GatewayResponse.

Lambda function triggered by HTTP POST method is pretty straightforward as well:

A bit of explanation:

  • Incoming request payload is retrieved from GatewayRequest
  • Document inserted in Couchbase is returned as response.
  • Like the previous method, Function logic uses Couchbase Java SDK to query the Couchbase database. The results and exception are then wrapped in GatewayRequest and GatewayResponse.

Build the Java application as:

Upload Lambda Function to S3

SAM template reads the code from an S3 bucket. Let’s create a S3 bucket:

us-west-2 region is one of the supported regions for API Gateway. S3 bucket names are globally unique but their location is region specific.

Upload the code to S3 bucket:

The code is now uploaded to S3 bucket. SAM template is ready to be deployed!

Deploy SAM Template

Deploy the SAM template:

It shows the output:

This one command deploys Lambda functions and REST Resource/APIs that trigger these Lambda functions.

Invoke the Microservice

API Gateway publishes a REST API that can be invoked by curl, wget, AWS CLI/Console, Postman or any other app that can call a REST API. This blog will use AWS Console to show the interaction.

API Gateway home at us-west-2.console.aws.amazon.com/apigateway/home?region=us-west-2#/apis shows:

AWS SAM Microservice API

Click on the API to see all the APIs in this resource:

AWS SAM Microservice API Resources

Click on POST to see the default page for POST method execution:

AWS SAM Microservice API POST

Click on Test to test the API:

AWS SAM Microservice API POST Input

Add the payload in Request Body and click on Test to invoke the API. The results are shown as below:

AWS SAM Microservice API POST Output

Now click on GET to see the default execution page:

AWS SAM Microservice API GET

Click on Test to test the API:

AWS SAM Microservice API GET Input

No request body is needed, just click on Test the invoke the API. The results are as shown:

AWS SAM Microservice API GET Output

Output from the Couchbase database is shown in the Response Body.

References

Source: blog.couchbase.com/2017/january/microservice-aws-serverless-application-model-couchbase