Blog | Tug's Site

Pagination with Couchbase

October 1, 2013 · 8 min read

If you have to deal with a large number of documents when querying a Couchbase cluster, it is important to use pagination to get the rows page by page. You can find some information in the "Pagination" chapter of the documentation, but in this article I want to go into more detail and show some sample code.

For this example I will start by creating a simple view based on the beer-sample dataset. The view is used to find breweries by country:

function (doc, meta) {
  if (doc.type == "brewery" && doc.country){
    emit(doc.country);
  }
}

This view lists all the breweries by country. The index looks like:

Doc id	Key	Value
bersaglier	Argentina	null
cervecera_jerome	Argentina	null
brouwerij_nacional_balashi	Aruba	null
australian_brewing_corporation	Australia	null
carlton_and_united_breweries	Australia	null
coopers_brewery	Australia	null
foster_s_australia_ltd	Australia	null
gold_coast_brewery	Australia	null
lion_nathan_australia_hunter_street	Australia	null
little_creatures_brewery	Australia	null
malt_shovel_brewery	Australia	null
matilda_bay_brewing	Australia	null
...	...	...
...	...	...
...	...	...
yellowstone_valley_brewing	United States	null
yuengling_son_brewing	United States	null
zea_rotisserie_and_brewery	United States	null
fosters_tien_gang	Viet Nam	null
hue_brewery	Viet Nam	null

So now you want to navigate in this index with a page size of 5 rows.

Using skip / limit Parameters

The simplest approach is to use the limit and skip parameters, for example:

Page 1 : ?limit=5&skip=0
Page 2 : ?limit=5&skip=5
...
Page x : ?limit=5&skip=(limit*(page-1))

You can obviously use any other parameters you need to do range or key queries (startkey/endkey, key, keys) and sort option (descending).

This is simple but not the most efficient way, since the query engine has to read all the rows that match the query, until the skip value is reached.

Here is some sample Python code that paginates over this view:
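The sketch below is a minimal, hedged version of such a loop. To stay runnable without a cluster, it assumes a hypothetical `query_page(limit, skip)` callable (in a real application this would wrap the view query). `SAMPLE_INDEX` and `fake_query` are stand-ins invented for illustration; they are not part of any SDK.

```python
def paginate_with_skip(query_page, rows_per_page=5):
    """Yield (page_number, rows) until a query returns no rows."""
    page = 0
    while True:
        # Page N+1 starts after the first N * rows_per_page rows.
        rows = list(query_page(limit=rows_per_page, skip=page * rows_per_page))
        if not rows:
            break
        page += 1
        yield page, rows


# In-memory stand-in for the view index: (key, doc id) pairs sorted by key.
SAMPLE_INDEX = [
    ("Argentina", "bersaglier"),
    ("Argentina", "cervecera_jerome"),
    ("Aruba", "brouwerij_nacional_balashi"),
    ("Australia", "australian_brewing_corporation"),
    ("Australia", "carlton_and_united_breweries"),
    ("Australia", "coopers_brewery"),
    ("Australia", "foster_s_australia_ltd"),
]


def fake_query(limit, skip):
    # Hypothetical replacement for ?limit=..&skip=.. on the real view.
    return SAMPLE_INDEX[skip:skip + limit]


if __name__ == "__main__":
    for page, rows in paginate_with_skip(fake_query):
        print("-- Page %s --" % page)
        for key, doc_id in rows:
            print("Country: %s \t Id: %s" % (key, doc_id))
```

The skip value grows with every page, which is exactly why this approach gets slower the further you go into the index.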

This application loops on all the pages until the end of the index.

As I said before this is not the best approach since the system must read all the values until the skip is reached. The following example shows a better way to deal with this.

Using startkey / startkey_docid parameters

To make pagination more efficient, you can take another approach that uses the startkey and startkey_docid parameters to select the proper documents.

The startkey parameter is the key at which the query should start reading (the last key of the "previous page").
Since a key, for example "Germany", may match one or more ids (documents), you also need to tell the Couchbase query engine at which document to start. This is what the startkey_docid parameter is for; the row with this id must then be skipped, since it is the last one of the previous page.

So if we look at the index, and add a row number to explain the pagination

Row num	Doc id	Key	Value
Query for page 1 `?limit=5`
1		bersaglier	Argentina	null
2		cervecera_jerome	Argentina	null
3		brouwerij_nacional_balashi	Aruba	null
4		australian_brewing_corporation	Australia	null
5		carlton_and_united_breweries	Australia	null
Query for page 2 `?limit=5&startkey="Australia"&startkey_docid=carlton_and_united_breweries&skip=1`
6		coopers_brewery	Australia	null
7		foster_s_australia_ltd	Australia	null
8		gold_coast_brewery	Australia	null
9		lion_nathan_australia_hunter_street	Australia	null
10		little_creatures_brewery	Australia	null
Query for page 3 `?limit=5&startkey="Australia"&startkey_docid=little_creatures_brewery&skip=1`
11		malt_shovel_brewery	Australia	null
12		matilda_bay_brewing	Australia	null
...	...	...
...	...	...
...	...	...
...		yellowstone_valley_brewing	United States	null
...		yuengling_son_brewing	United States	null
...		zea_rotisserie_and_brewery	United States	null
...		fosters_tien_gang	Viet Nam	null
...		hue_brewery	Viet Nam	null

As you can see in the examples above, the query uses the startkey and a document id, and just steps past that row using skip=1.

Let's now look at the application code, once again in Python:

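from couchbase import Couchbase

# Connect to the beer-sample bucket on the local cluster
cb = Couchbase.connect(bucket='beer-sample')

hasRow = True
rowPerPage = 5
page = 0
currentStartkey = ""
startDocId = ""

while hasRow:
    hasRow = False
    # The first page starts at the beginning of the index; every following
    # page starts on the last row already shown, so that one row is skipped
    skip = 0 if page == 0 else 1
    page = page + 1
    print("-- Page %s --" % page)
    rows = cb.query("test", "by_country", limit=rowPerPage, skip=skip,
                    startkey=currentStartkey, startkey_docid=startDocId)
    for row in rows:
        hasRow = True
        print("Country: \"%s\" \t Id: '%s'" % (row.key, row.docid))
        # Remember the last row of this page as the start of the next one
        currentStartkey = row.key
        startDocId = row.docid
    print(" -- -- -- -- \n")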

This application loops on all the pages until the end of the index.

Using this approach, the application starts reading the index at a specific key (startkey parameter) and only loops over the necessary entries in the index. This is more efficient than the simple skip approach.
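To make the difference with skip concrete, here is a small in-memory emulation; it is not SDK code. `emulate_query` is a hypothetical helper that mimics the startkey, startkey_docid, skip and limit parameters on a sorted list of (key, doc id) pairs, using the first twelve rows of the index shown earlier. It assumes that rows sharing a key are ordered by document id.

```python
import bisect

# First twelve rows of the index shown earlier, as (key, doc id) pairs.
INDEX = [
    ("Argentina", "bersaglier"),
    ("Argentina", "cervecera_jerome"),
    ("Aruba", "brouwerij_nacional_balashi"),
    ("Australia", "australian_brewing_corporation"),
    ("Australia", "carlton_and_united_breweries"),
    ("Australia", "coopers_brewery"),
    ("Australia", "foster_s_australia_ltd"),
    ("Australia", "gold_coast_brewery"),
    ("Australia", "lion_nathan_australia_hunter_street"),
    ("Australia", "little_creatures_brewery"),
    ("Australia", "malt_shovel_brewery"),
    ("Australia", "matilda_bay_brewing"),
]


def emulate_query(index, limit, startkey=None, startkey_docid="", skip=0):
    """Hypothetical emulation of a view query on a sorted in-memory index."""
    if startkey is None:
        start = 0
    else:
        # Seek directly to the first row >= (startkey, startkey_docid)
        # instead of walking every row from the beginning of the index.
        start = bisect.bisect_left(index, (startkey, startkey_docid))
    start += skip
    return index[start:start + limit]


def paginate_with_startkey(index, rows_per_page=5):
    """Yield one list of rows per page until the index is exhausted."""
    startkey, docid, skip = None, "", 0
    while True:
        rows = emulate_query(index, rows_per_page, startkey, docid, skip)
        if not rows:
            return
        yield rows
        # Resume on the last row of this page and skip it next time.
        startkey, docid = rows[-1]
        skip = 1


for number, rows in enumerate(paginate_with_startkey(INDEX), start=1):
    print("-- Page %d --" % number)
    for key, doc_id in rows:
        print("%s : %s" % (key, doc_id))
```

Whatever the page number, the skip value never exceeds 1: the cost of reaching a page no longer depends on how deep it is in the index.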

Views with Reduce function

When your view uses a reduce function and you want to paginate on the various keys only (with the reduce function), you need to use the skip and limit parameters.

When you use the startkey_docid parameter with a reduce function, the reduce is calculated only on the subset of document ids that are part of your query.

Couchbase Java SDK Paginator

In the previous examples, I have shown how to do pagination using the various query parameters. The Java SDK provides a Paginator object to help developers deal with pagination. The following example uses the same view with the Paginator API.

package com.couchbase.devday;

import com.couchbase.client.CouchbaseClient;
import com.couchbase.client.protocol.views.*;
import java.net.URI;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Properties;
import java.util.concurrent.TimeUnit;
import java.util.logging.ConsoleHandler;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.Logger;

public class JavaPaginatorSample {

public static void main(String[] args) {

    configure();
    System.out.println("--------------------------------------------------------------------------");
    System.out.println("\tCouchbase - Paginator");
    System.out.println("--------------------------------------------------------------------------");

    List<URI> uris = new LinkedList<URI>();
    uris.add(URI.create("http://127.0.0.1:8091/pools"));

    CouchbaseClient cb = null;
    try {
        cb = new CouchbaseClient(uris, "beer-sample", "");
        System.out.println("--------------------------------------------------------------------------");
        System.out.println("Breweries (by_name) with docs & JSON parsing");
        View view = cb.getView("test", "by_country");
        Query query = new Query();
        int docsPerPage = 5;

        Paginator paginatedQuery = cb.paginatedQuery(view, query, docsPerPage);
        int pageCount = 0;
        while(paginatedQuery.hasNext()) {
            pageCount++;
            System.out.println(" -- Page "+ pageCount +" -- ");
            ViewResponse response = paginatedQuery.next();
            for (ViewRow row : response) {
                System.out.println(row.getKey() + " : " + row.getId());
            }
            System.out.println(" -- -- -- ");
        }
        
        System.out.println("\n\n");
        cb.shutdown(10, TimeUnit.SECONDS);
    } catch (Exception e) {
        System.err.println("Error connecting to Couchbase: " + e.getMessage());
    }
}



private static void configure() {

    for(Handler h : Logger.getLogger("com.couchbase.client").getParent().getHandlers()) {
        if(h instanceof ConsoleHandler) {
            h.setLevel(Level.OFF);
        }
    }
    Properties systemProperties = System.getProperties();
    systemProperties.put("net.spy.log.LoggerImpl", "net.spy.memcached.compat.log.SunLogger");
    System.setProperties(systemProperties);

    Logger logger = Logger.getLogger("com.couchbase.client");
    logger.setLevel(Level.OFF);
    for(Handler h : logger.getParent().getHandlers()) {
        if(h instanceof ConsoleHandler){
            h.setLevel(Level.OFF);
        }
    }
}

}

As you can see, you can easily paginate over the results of a Query using the Java Paginator.

The Paginator is created by the cb.paginatedQuery(view, query, docsPerPage) call (line 37 of the listing), from the view and query objects and a page size
Then you just need to use the hasNext() and next() methods to navigate in the results.

The Java Paginator is aware of whether the query uses a reduce or not, so you can use it with all types of queries. Internally it switches between the skip/limit approach and the doc_id approach. You can see how it is done in the Paginator class.

Note that if you want to do this in a Web application across HTTP requests, you must keep the Paginator object in the user session, since the current API keeps the current page in its state.

Conclusion

In this blog post you have learned how to deal with pagination in Couchbase views. To summarize:

The pagination is based on some specific parameters that you send when executing a query.
Java developers can use the Paginator class that simplifies pagination.

I invite you to look at N1QL, the new Couchbase query language, still under development, which will give developers more options, including pagination using the LIMIT and OFFSET parameters, for example:

SELECT fname, age
FROM tutorial
WHERE age > 30
LIMIT 2
OFFSET 2
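The relation between a page number and these two parameters is the same as for skip and limit. As a small sketch, the hypothetical helper `page_clause` below (not part of any Couchbase API) computes the clause for a 1-based page number; the query above corresponds to page 2 with a page size of 2.

```python
def page_clause(page, page_size):
    """Return the LIMIT/OFFSET clause for a 1-based page number."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    return "LIMIT %d OFFSET %d" % (page_size, (page - 1) * page_size)


# Rebuild the example query for page 2 with two rows per page.
query = "SELECT fname, age FROM tutorial WHERE age > 30 " + page_clause(2, 2)
print(query)
```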

If you want to learn more about N1QL:

How to implement Document Versioning with Couchbase

July 18, 2013 · 7 min read

Introduction

Developers often ask me how to "version" documents with Couchbase 2.0. The short answer is: the clients and server do not expose such a feature, but it is quite easy to implement.

In this article I will use a basic approach, and you will be able to extend it depending on your business requirements.

Deploy your Node/Couchbase application to the cloud with Clever Cloud

July 11, 2013 · 6 min read

Introduction

Clever Cloud is the first PaaS to provide Couchbase as a service allowing developers to run applications in a fully managed environment. This article shows how to deploy an existing application to Clever Cloud.

I am using a very simple Node application that I have documented in a previous article: “Easy application development with Couchbase, Angular and Node”.

Clever Cloud provides support for various databases such as MySQL and PostgreSQL, but also, and this is most important for me, Couchbase. Not only does Clever Cloud allow you to use database services, but you can also deploy and host your application, developed in the language/technology of your choice (Java, Node, Scala, Python, PHP, …), all in a secure, scalable and managed environment.

SQL to NoSQL : Copy your data from MySQL to Couchbase

July 3, 2013 · 6 min read

TL;DR: Look at the project on Github.

Introduction

During my latest interactions with the Couchbase community, I was asked how to easily import data from an existing database into Couchbase. My answer was always the same:

Take an ETL such as Talend to do it
Just write a small program to copy the data from your RDBMS to Couchbase...

So I have written this small program that allows you to import the content of an RDBMS into Couchbase. This tool can be used as is, or you can look at the code to adapt it to your application.

The Tool: Couchbase SQL Importer

The Couchbase SQL Importer, available here, allows you to copy all, or part, of your SQL schema into Couchbase with a simple command line. Before explaining how to run this command, let's see how the data is stored in Couchbase once imported:

Each table row is imported as a single JSON document
- where each table column becomes a JSON attribute
Each document has a key made of the name of the table and a counter (increment)

The following concrete example, based on the MySQL World sample database, will help you to understand how it works. This database contains 3 tables : City, Country, CountryLanguage. The City table looks like:

+-------------+----------+------+-----+---------+----------------+
| Field       | Type     | Null | Key | Default | Extra          |
+-------------+----------+------+-----+---------+----------------+
| ID          | int(11)  | NO   | PRI | NULL    | auto_increment |
| Name        | char(35) | NO   |     |         |                |
| CountryCode | char(3)  | NO   |     |         |                |
| District    | char(20) | NO   |     |         |                |
| Population  | int(11)  | NO   |     | 0       |                |
+-------------+----------+------+-----+---------+----------------+

The JSON document that matches this table looks like the following:

city:3805
{
  "Name": "San Francisco",
  "District": "California",
  "ID": 3805,
  "Population": 776733,
  "CountryCode": "USA"
}

You see that here I am simply taking all the rows and "moving" them into Couchbase. This is a good first step to play with your dataset in Couchbase, but it is probably not the final model you want to use for your application; most of the time you will have to decide when to use embedded documents, lists of values, etc. in your JSON documents.

In addition to the JSON documents, the tool creates views based on the following logic:

a view that lists all imported documents with the name of the "table" (aka type) as key
a view for each table, keyed on the primary key columns

View: all/by_type

{
  "rows": [
  {"key": "city", "value": 4079},
  {"key": "country", "value": 239},
  {"key": "countrylanguage", "value": 984}
  ]
}

As you can see, this view allows you to get the number of documents by type with a single Couchbase query.
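To illustrate what this view computes, here is a plain Python emulation of a "count by type" index. It is only a sketch: the helper name `count_by_type` is hypothetical, the three documents are made up for the example, and it assumes each imported document carries its table name in a `type` attribute.

```python
from collections import Counter


def count_by_type(documents, type_field="type"):
    """Emulate a view keyed on the type with a count as reduce value."""
    counts = Counter(
        doc[type_field] for doc in documents.values() if type_field in doc
    )
    return {"rows": [{"key": k, "value": n} for k, n in sorted(counts.items())]}


# Illustrative documents only; the real bucket holds the imported rows.
documents = {
    "city:1": {"type": "city", "name": "Kabul"},
    "city:2": {"type": "city", "name": "Qandahar"},
    "country:1": {"type": "country", "name": "Aruba"},
}

print(count_by_type(documents))
```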

Also for each table/document type, a view is created where the key of the index is built from the table primary key. Let's for example query the "City" documents.

View: city/by_pk?reduce=false&limit=5

{
  "total_rows": 4079,
  "rows": [
  {"id": "city:1", "key": 1, "value": null},
  {"id": "city:2", "key": 2, "value": null},
  {"id": "city:3", "key": 3, "value": null},
  {"id": "city:4", "key": 4, "value": null},
  {"id": "city:5", "key": 5, "value": null}
  ]
}

The index key matches the value of the City.ID column. When the primary key is made of multiple columns the key looks like:

View: CountryLanguage/by_pk?reduce=false&limit=5

{
  "total_rows": 984,
  "rows": [
  {"id": "countrylanguage:1", "key": ["ABW", "Dutch"], "value": null},
  {"id": "countrylanguage:2", "key": ["ABW", "English"], "value": null},
  {"id": "countrylanguage:3", "key": ["ABW", "Papiamento"], "value": null},
  {"id": "countrylanguage:4", "key": ["ABW", "Spanish"], "value": null},
  {"id": "countrylanguage:5", "key": ["AFG", "Balochi"], "value": null}
  ]
}

This view is built from the CountryLanguage table primary key, made of the CountryLanguage.CountryCode and CountryLanguage.Language columns.

+-------------+---------------+------+-----+---------+-------+
| Field       | Type          | Null | Key | Default | Extra |
+-------------+---------------+------+-----+---------+-------+
| CountryCode | char(3)       | NO   | PRI |         |       |
| Language    | char(30)      | NO   | PRI |         |       |
| IsOfficial  | enum('T','F') | NO   |     | F       |       |
| Percentage  | float(4,1)    | NO   |     | 0.0     |       |
+-------------+---------------+------+-----+---------+-------+

How to use Couchbase SQL Importer tool?

The importer is a simple Java-based command line utility that is quite easy to use:

1- Download the CouchbaseSqlImporter.jar file from here. This file contains all the dependencies to work with Couchbase: the Java Couchbase Client and GSON.
2- Download the JDBC driver for the database you are using as data source. For this example I am using MySQL and I have downloaded the driver from the MySQL site.
3- Configure the import using a properties file.

## SQL Information ##
sql.connection=jdbc:mysql://192.168.99.19:3306/world
sql.username=root
sql.password=password

## Couchbase Information ##
cb.uris=http://localhost:8091/pools
cb.bucket=default
cb.password=

## Import information
import.tables=ALL
import.createViews=true
import.typefield=type
import.fieldcase=lower

This sample properties file contains three sections :

The first two sections are used to configure the connections to your SQL database and Couchbase cluster (note that the bucket must be created first)
The third section allows you to configure the import itself

4- Run the tool !

java -cp "./CouchbaseSqlImporter.jar:./mysql-connector-java-5.1.25-bin.jar" com.couchbase.util.SqlImporter import.properties

So you run the Java command with the proper classpath (-cp parameter).

And you are done: your data has been copied from your SQL database into Couchbase.

If you are interested in how it works internally, take a look at the next section.

The Code: How it works?

The main class of the tool, com.couchbase.util.SqlImporter, is really simple. The process is:

1. Connect to the SQL database
2. Connect to Couchbase
3. Get the list of tables
4. For each table, execute a "select * from table"
4.1. Analyze the ResultSetMetaData to get the list of columns
4.2. Create a Java map for each row, where the key is the name of the column and the value… is the value
4.3. Serialize this Map into a JSON document with GSON and save it into Couchbase

The code is available in the ImportTable(String table) Java method.
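The same process can be sketched in a few lines of Python. This is not the tool's code: it swaps MySQL/JDBC for the standard `sqlite3` module, uses a plain dict as a stand-in for the Couchbase bucket, and the function names `import_table` and `import_all` as well as the two sample rows are only there for illustration.

```python
import json
import sqlite3


def import_table(conn, bucket, table):
    """Copy every row of `table` into `bucket`, one JSON document per row."""
    cursor = conn.execute('SELECT * FROM "%s"' % table)
    # cursor.description plays the role of ResultSetMetaData here.
    columns = [column[0] for column in cursor.description]
    for counter, row in enumerate(cursor, start=1):
        document = dict(zip(columns, row))        # column name -> value
        key = "%s:%d" % (table.lower(), counter)  # table name + counter
        bucket[key] = json.dumps(document)


def import_all(conn, bucket):
    """Import every table of the database and return the table names."""
    names = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")
    tables = [row[0] for row in names]
    for table in tables:
        import_table(conn, bucket, table)
    return tables


# Tiny in-memory database standing in for the SQL source.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE City (ID INTEGER PRIMARY KEY, Name TEXT, CountryCode TEXT)")
conn.executemany("INSERT INTO City VALUES (?, ?, ?)",
                 [(1, "Kabul", "AFG"), (2, "Qandahar", "AFG")])

bucket = {}  # stand-in for the Couchbase bucket
print(import_all(conn, bucket))
print(bucket["city:1"])
```

With a real cluster, the only part that changes is the write: the dict assignment becomes a set operation on the bucket.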

One interesting point is that you can use and extend the code to deal with your application.

Conclusion

I have created this tool quickly to help some people in the community. If you are using it and need new features, let me know using a comment or a pull request.

Create a Couchbase cluster in less than a minute with Ansible

May 31, 2013 · 6 min read

TL;DR: Look at the Couchbase Ansible Playbook on my Github.

Introduction

When I was looking for a more effective way to create my cluster I asked some sysadmins which tools I should use to do it. The answer I got during OSDC was not Puppet, nor Chef, but was Ansible.

This article shows you how you can easily configure and create a Couchbase cluster deployed on many Linux boxes... and the only thing you need on these boxes is an SSH server!

Thanks to Jan-Piet Mens, who was one of the people who convinced me to use Ansible and answered the questions I had about it.

You can watch the demonstration below, and/or look at all the details in the next paragraph.

Ansible

Ansible is open-source software that allows administrators to configure and manage many computers over SSH.

I won't go into all the details of the installation; just follow the steps documented in the Getting Started Guide. As you can see from this guide, you just need Python and a few other libraries, and to clone the Ansible project from GitHub. So I am assuming that you have Ansible working with the various servers on which you want to deploy Couchbase.

Also, for these first scripts I am using root on my servers to do all the operations. So be sure you have registered the root SSH keys with your administration server, from where you are running the Ansible scripts.

Create a Couchbase Cluster

Before going into the details of the Ansible script, it is worth explaining how you create a Couchbase cluster. Here are the 5 steps to create and configure a cluster:

Install Couchbase on each node of the cluster, as documented here.
Take one of the nodes and "initialize" the cluster, using the cluster-init command.
Add the other nodes to the cluster, using the server-add command.
Rebalance, using the rebalance command.
Create a bucket, using the bucket-create command.

So the goal now is to create an Ansible Playbook that does these steps for you.

Ansible Playbook for Couchbase

The first thing you need is the list of hosts you want to target, so I have created a hosts file that contains all my servers organized in 2 groups:

[couchbase-main]
vm1.grallandco.com

[couchbase-nodes]
vm2.grallandco.com
vm3.grallandco.com

The [couchbase-main] group contains just one of the nodes, which will drive the installation and configuration. As you probably already know, Couchbase does not have any master: all nodes in the cluster are identical.

To ease the configuration of the cluster, I have created another file that contains all the parameters that must be sent to the various commands. This file is located in group_vars/all; see the section "Splitting Out Host and Group Specific Data" in the documentation.

# Administrator user and password
admin_user: Administrator
admin_password: password

# ram quota for the cluster
cluster_ram_quota: 1024

# bucket and replicas
bucket_name: ansible
bucket_ram_quota: 512
num_replicas: 2

Use this file to configure your cluster.

Let's describe the playbook file :

- name: Couchbase Installation
  hosts: all
  user: root

  tasks:

  - name: download Couchbase package
    get_url: url=http://packages.couchbase.com/releases/2.0.1/couchbase-server-enterprise_x86_64_2.0.1.deb dest=~/.

  - name: Install dependencies
    apt: pkg=libssl0.9.8 state=present

  - name: Install Couchbase .deb file on all machines
    shell: dpkg -i ~/couchbase-server-enterprise_x86_64_2.0.1.deb
As expected, the installation has to be done on all servers as root. Then we need to execute 3 tasks:

Download the product; the get_url command only downloads the file if it is not already present
Install the dependencies with the apt command; state=present makes the system install this package only if it is not already present
Install Couchbase with a simple shell command (here I am not checking whether Couchbase is already installed)

So we have now installed Couchbase on all the nodes. Let's now configure the first node and add the others:

- name: Initialize the cluster and add the nodes to the cluster
  hosts: couchbase-main
  user: root

  tasks:
  - name: Configure main node
    shell: /opt/couchbase/bin/couchbase-cli cluster-init -c 127.0.0.1:8091  --cluster-init-username=${admin_user} --cluster-init-password=${admin_password} --cluster-init-port=8091 --cluster-init-ramsize=${cluster_ram_quota}

  - name: Create shell script for configuring main node
    action: template src=couchbase-add-node.j2 dest=/tmp/addnodes.sh mode=750

  - name: Launch config script
    action: shell /tmp/addnodes.sh

  - name: Rebalance the cluster
    shell: /opt/couchbase/bin/couchbase-cli rebalance -c 127.0.0.1:8091 -u ${admin_user} -p ${admin_password}

  - name: create bucket ${bucket_name} with ${num_replicas} replicas
    shell: /opt/couchbase/bin/couchbase-cli bucket-create -c 127.0.0.1:8091 --bucket=${bucket_name} --bucket-type=couchbase --bucket-port=11211 --bucket-ramsize=${bucket_ram_quota}  --bucket-replica=${num_replicas} -u ${admin_user} -p ${admin_password}

Now we need to execute specific tasks on the "main" server:

Initialization of the cluster using the Couchbase CLI, on lines 06 and 07

Then the system needs to ask all the other servers to join the cluster. For this, the system needs to get the various IP addresses and, for each of them, execute the server-add command. As far as I know it is not possible to get the IP address from the main playbook YAML file, so I ask the system to generate a shell script that adds each node, and then execute the script.

This is done from line 09 to line 13.

To generate the shell script, I use an Ansible template; the template is available in the couchbase-add-node.j2 file.

{% for host in groups['couchbase-nodes'] %}
/opt/couchbase/bin/couchbase-cli server-add -c 127.0.0.1:8091 -u ${admin_user} -p ${admin_password} --server-add={{ hostvars[host]['ansible_eth0']['ipv4']['address'] }}:8091 --server-add-username=${admin_user} --server-add-password=${admin_password}
{% endfor %}

As you can see, this script loops on each server in the [couchbase-nodes] group and uses its IP address to add the node to the cluster.

Finally the playbook rebalances the cluster (line 16) and adds a new bucket (line 19).
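To summarize the sequence that ends up running on the main node, here is a Python sketch that only builds the command lines, without executing anything. The sub-commands and flags are copied from the playbook and template above (Couchbase 2.0.1 CLI); the function name `cluster_commands`, its parameters and the node IP addresses are hypothetical.

```python
CLI = "/opt/couchbase/bin/couchbase-cli"
MAIN = "127.0.0.1:8091"


def cluster_commands(node_ips, admin_user, admin_password,
                     cluster_ram_quota, bucket_name, bucket_ram_quota,
                     num_replicas):
    """Return the couchbase-cli argument lists in execution order."""
    auth = ["-u", admin_user, "-p", admin_password]
    # 1. Initialize the cluster on the main node.
    commands = [[
        CLI, "cluster-init", "-c", MAIN,
        "--cluster-init-username=%s" % admin_user,
        "--cluster-init-password=%s" % admin_password,
        "--cluster-init-port=8091",
        "--cluster-init-ramsize=%s" % cluster_ram_quota,
    ]]
    # 2. One server-add per node, like the loop in the template.
    for ip in node_ips:
        commands.append([CLI, "server-add", "-c", MAIN] + auth + [
            "--server-add=%s:8091" % ip,
            "--server-add-username=%s" % admin_user,
            "--server-add-password=%s" % admin_password,
        ])
    # 3. Rebalance, then 4. create the bucket.
    commands.append([CLI, "rebalance", "-c", MAIN] + auth)
    commands.append([
        CLI, "bucket-create", "-c", MAIN,
        "--bucket=%s" % bucket_name,
        "--bucket-type=couchbase",
        "--bucket-port=11211",
        "--bucket-ramsize=%s" % bucket_ram_quota,
        "--bucket-replica=%s" % num_replicas,
    ] + auth)
    return commands


# Example values taken from group_vars/all; the IP addresses are made up.
for command in cluster_commands(["192.168.0.2", "192.168.0.3"],
                                "Administrator", "password",
                                1024, "ansible", 512, 2):
    print(" ".join(command))
```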

You are now ready to execute the playbook using the following command :

./bin/ansible-playbook -i ./couchbase/hosts ./couchbase/couchbase.yml -vv

I am adding the -vv parameter to allow you to see more information about what's happening during the execution of the script.

This will execute all the commands described in the playbook, and after a few seconds you will have a new cluster ready to be used! You can for example open a browser, go to the Couchbase Administration Console, and check that your cluster is configured as expected.

As you can see it is really easy and fast to create a new cluster using Ansible.

I have also created a script to properly uninstall the cluster; just launch:

./bin/ansible-playbook -i ./couchbase/hosts ./couchbase/couchbase-uninstall.yml

Six months as Technical Evangelist at Couchbase

May 28, 2013 · 6 min read

Already 6 months! It has already been 6 months since I joined Couchbase as Technical Evangelist. This is a good opportunity to take some time to look back.

So first of all what is a Developer/Technical Evangelist?

Hmm, it depends on each company/product, but let me tell you what it is for me, inside Couchbase. This is one of the most exciting jobs I have ever had. And I think it is the best job you can have when you are passionate about technology, and you like to share this passion with others. So my role as Technical Evangelist is to help developers adopt NoSQL technologies in general, and as you can guess Couchbase in particular.

Let's now see in more detail what I have done during these past six months and why I am so happy about it. I have organized the different activities in three types:

Outbound activities : meet the developers
Online activities : reach even more developers
Inbound Activities : make the product better !

Outbound activities : meet the developers !

A large part of my activities for this first semester was made of conferences and meetups. All these events are great opportunities for me to talk about NoSQL and get more people to use Couchbase Server 2.0. Here is a short list of what I have done:

participated in many Couchbase Developer Days in various cities (Portland, Seattle, Vancouver, Oslo, Copenhagen, Stockholm, Munich, Amsterdam, Barcelona, Paris, ...); these are one-day workshops where I help developers get their hands dirty with Couchbase
participated in Couchconf Berlin and Couchbase [UK], our main European events, where I met many customers and key members of the community
submitted talks to conferences, adapted them to each conference, then spoke at various conferences about NoSQL and Couchbase (33Degree Warsaw, NoSQL & Big Data Israel, Devoxx France, NoSQL Matters, and many others).
met many developers during user groups and meetups. I have to say that I have been very active there, and I am quite happy to see that NoSQL is a very hot topic for developers, in all languages.
delivered Brown Bag Lunches to various technical teams in companies.

Yes! Being a Technical Evangelist means, at least for me, being on the road. It is very nice to meet developers from various countries, different cultures, languages, and… this also means tasting many different types of food!

Another interesting thing when you work on a database/infrastructure layer is the fact that it is technology agnostic; you can access Couchbase with multiple programming languages: Java, .Net, Javascript/Node, Ruby, PHP, Python, C, … and even Go. So with this job I met developers with different backgrounds and views about application development. So yes, when I am at a conference or meetup, I am supposed to "teach" something to people, but I have also learned a lot of things, and I am still learning.

Online activities : reach even more developers!

Meeting developers during conferences is great, but it is also very important to produce content to reach even more people, so I have :

written blog posts about Couchbase usage, most of them based on feedback/questions from the community
created sample code to show how it works
monitored and answered questions on various sites and mailing lists, from Couchbase discussion forums, mailing lists, Stack Overflow, Quora and others...

This task is quite interesting because it is the moment where you can reach many developers and also get feedback from users, and understand how they are using the product. I have to say that I was not as productive as I expected, mainly because I was traveling a lot during this period.

Another important thing about online activities is the "Couchbase Community" itself. Many users of Couchbase are creating content: blog posts, samples, new applications, or features. For example, I am talking with a person who is developing a Dart client for Couchbase. So as Technical Evangelist I am also working closely with the most active contributors.

Inbound Activities : make the product better !

So the ultimate goal of a Technical Evangelist at Couchbase is to "convert" developers to NoSQL/Couchbase and get them to talk about Couchbase. Meeting them online or during events is a way of achieving this; but it is also great to do it directly with the product. This means participating in the "development" of the product or its ecosystem. Here are some of the things that I have done on this topic:

talked a lot with the development team, core developers, product managers, architects, … It is quite exciting to work with so many smart people and have access to them. During these discussions I was able to comment on the roadmap and influence features, and it is always an opportunity to learn new things about Couchbase, and many other things around architecture and programming languages; take a look for example at this nice post from Damien Katz.
contributed some code; yes, remember that Couchbase is an open source project and it is quite easy to participate in the development. Obviously, based on my skills, I have only helped a little bit with the Java and the Javascript SDK. So if like me you are interested in contributing to the project, take a look at this page: "Contributing Changes"
but the biggest contributions to the product are things like doc reviews, testing and writing bug reports. This is very important and interesting, since once again it helps a lot with the adoption of the product by developers.

So what?

As you can see, the Technical Evangelist job is quite exciting, and one of the reasons I really love it is simply that it allows me to do many different things that are all related to technology. Six months is still a very short period; I still have many things to learn and to do with the team to be successful, such as being more present online (blog, sample code, technical articles, screencasts, ...), being accepted at more conferences, and coding a little more (I have to finish, for example, the Couchbase Data Provider for Hibernate OGM, and many other ideas around the application development experience).

Finally, Couchbase needs you ! This is a good opportunity to say that Couchbase is always looking for talent, especially in the Technical/Developer Evangelist team, so do not hesitate to look at the different job openings and join the team !

Screencast : Fun with Couchbase MapReduce and Twitter

April 29, 2013 · One min read

I have created this simple screencast to show how you can use Couchbase to do some real-time analysis based on a Twitter feed.

The key steps of this demonstration are:

Inject Tweets using a simple program available on my Github Couchbase-Twitter-Injector
Create views to index and query the Tweets by
- User name
- Tags
- Date

The views that I used in this demonstration are available at the bottom of this post.

Views:

Easy application development with Couchbase, Angular and Node

March 6, 2013 · 14 min read

Note : This article has been written in March 2013, since Couchbase and its drivers have a changed a lot. I am not working with/for Couchbase anymore, with no time to update the code.

A friend of mine wants to build a simple system to capture ideas, and votes. Even if you can find many online services to do that, I think it is a good opportunity to show how easy it is to develop new application using a Couchbase and Node.js.

So how to start?

Some of us will start with the UI, others with the data; in this example I am starting with the model. The basic steps are :

Model your documents
Create Views
Create Services
Create the UI
Improve your application by iteration

The sources of this sample application are available on GitHub :

https://github.com/tgrall/couchbase-node-ideas

Use the following command to clone the project locally :

git clone https://github.com/tgrall/couchbase-node-ideas.git

Note: my goal is not to provide a complete application, but to describe the key steps to develop an application.

Model your documents

For this application you need 3 types of documents :

Idea : describes the idea with an author, title and description
Vote : the author and a comment. Note that it is a deliberate choice not to put a value on the vote: in this first version, if the vote exists, it means the user likes the idea.
User : contains all the information about the user (not used in this first version of the application)

You can argue that it is possible to put the votes as a list of elements inside the idea document. In this case I prefer to use separate documents and reference the idea in the vote, since we do not know how many votes/comments an idea will have. Using separate documents is also interesting in this case for the following reasons :

No "concurrent" access: when a user wants to vote, they do not change the idea document itself, so there is no need to put optimistic locking in place.
The size of the document will be smaller and easier to cache in memory.

So documents will look like:
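As a rough illustration of the model described above, an idea and a vote that references it could be shaped as follows. This is only a sketch: apart from `type` and `user_id`, which appear in the controller code later in this article, the field names (`idea_id`, `comment`, `description`) and the `<type>:<counter>` key format are assumptions, not the project's actual schema.

```javascript
// Hypothetical sample documents; the field names and key format are
// assumptions made for illustration only.
const idea = {
  type: "idea",
  id: "idea:1",            // assumed key format: <type>:<counter>
  title: "Free coffee",
  description: "Provide free coffee in the office",
  user_id: "john"
};

const vote = {
  type: "vote",
  id: "vote:1",
  idea_id: idea.id,        // reference to the idea this vote belongs to
  user_id: "jane",
  comment: "Yes please!"
};

console.log(JSON.stringify([idea, vote], null, 2));
```

Because the vote only carries a reference to the idea, adding a vote never modifies the idea document itself.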

What I really like is the fact that I can quickly create a small dataset to validate that the model is correct and to help me design the views. The way I do it: I start my server, launch the Couchbase Administration Console, create a bucket, and finally insert documents manually and validate the model and views.

Create Views

Now that I have created some documents, I can think about the way I want to get the information out of the database. For this application I need:

The list of ideas
The votes by ideas

The list of ideas for this first version is very simple, we just need to emit the title:
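A minimal sketch of such a map function is shown here. It is not necessarily the exact code of the project; the `emit()` stub and the sample documents exist only so that the snippet can run outside of Couchbase.

```javascript
// Sketch of a map function listing ideas by title (not the project's exact code).
const ideasByTitle = function (doc, meta) {
  if (doc.type == "idea") {
    emit(doc.title);
  }
};

// Minimal stand-in for the emit() that Couchbase provides to map functions.
const rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value === undefined ? null : value });
}

// Only the idea document produces a row; the vote is ignored.
ideasByTitle({ type: "idea", title: "Free coffee" }, { id: "idea:1" });
ideasByTitle({ type: "vote", idea_id: "idea:1" }, { id: "vote:1" });
console.log(rows);
```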

For the votes by idea, I chose to create a collated view; this will give me some interesting options when I expose the results in an API/view layer. For this view I am also using the sum() reduce function to capture the number of votes.
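To make the idea of a collated view with a sum reduce more concrete, here is a hedged sketch. The key layout (`[id, 0, title]` for an idea, `[id, 1]` for a vote) and the `idea_id` field are assumptions, and `sumByKey()` is only a local stand-in for what the built-in `_sum` reduce does on the server when the view is queried with grouping.

```javascript
// Sketch of a collated map function: the idea row sorts first ([id, 0, title]),
// followed by the rows of its votes ([id, 1]). "idea_id" is an assumed field name.
const votesByIdea = function (doc, meta) {
  switch (doc.type) {
    case "idea":
      emit([meta.id, 0, doc.title], 0);
      break;
    case "vote":
      emit([doc.idea_id, 1], 1);
      break;
  }
};

// Stand-ins for the server: collect emitted rows, then sum the values per key.
const emitted = [];
function emit(key, value) { emitted.push({ key: key, value: value }); }

function sumByKey(rows) {
  const totals = new Map();
  for (const row of rows) {
    const k = JSON.stringify(row.key);
    totals.set(k, (totals.get(k) || 0) + row.value);
  }
  return Array.from(totals, ([k, value]) => ({ key: JSON.parse(k), value: value }));
}

votesByIdea({ type: "idea", title: "Free coffee" }, { id: "idea:1" });
votesByIdea({ type: "vote", idea_id: "idea:1" }, { id: "vote:1" });
votesByIdea({ type: "vote", idea_id: "idea:1" }, { id: "vote:2" });

const reduced = sumByKey(emitted);
console.log(reduced);
```

The real index is sorted by key on the server; here the documents are simply fed in an order that already matches.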

I have my documents, and I have some views that allow me to retrieve the list of ideas and count the votes by idea. So I am ready to expose all this information to the application using a simple API layer.

Create Services

Lately I have been playing a lot with Node.js, just because it is nice to learn new stuff and also because it is really easy to use with Couchbase. Think about it: Couchbase loves JSON, and the Node.js object format is JSON, which means I do not have any marshaling/unmarshaling to do.

My API layer is quite simple, I just need to create a set of REST endpoints to deal with:

CRUD operation on each type of document
List the different Documents

The code of the services is available in branch 01-simple-services:

You can run the application with simple services using the following command:

> git checkout -f 01-simple-services
> node app.js

and point your browser to http://127.0.0.1:3000

About the project

For this project I am using only 2 Node modules: Express and Couchbase. The package.json file looks like :

{
  "name": "couchbase-ideas-management",
  "version": "0.0.1",
  "private": true,
  "dependencies":
  {
    "express": "3.x",
    "couchbase": "0.0.11"
  }
}

After running the install, let's code the new API interface. As said before, I am using an iterative approach, so for now I am not dealing with security; I just want to get the basic actions to work.

I am starting with the endpoints to get and set the documents. I am creating generic endpoints that take the type as a URI parameter, allowing the user/application to do a get/post on /api/vote, /api/idea. The following code captures this:

In each case I start to test if the URI is one of the supported types (idea, vote, user) and if this is the case I call the get() or upsert() method that will do the call to Couchbase.

The get() and upsert() methods use more or less the same approach. I test if the document exists and if the type is correct, then I do the operation against Couchbase. Let's focus on the upsert() method. I call it upsert() since the same operation is used to create and update the document.

In this function I start by testing if the document contains a type and if the type is the one expected (line 3).

Then I check if the document id is present, to see if I need to create it or not. This is one of the reasons why I like to keep the id/key in the document: yes, I duplicate it, but it makes the development really easy. So if I have to create a new document, I have to generate a new id. I chose to create a counter for each type; this is why I call the incr function (line 7) and then use the returned value to create the document (line 10).

Note: as you can see, my documents contain an ID as part of the attributes. This ID has the same value as the one used to set the document (the "key"). It is not necessarily a good practice to duplicate this information, and in many cases the application only uses the document key itself. I personally like to put the ID in the document itself too, because it simplifies the development a lot.

If the ID is present, I just call the update operation to save the document. (line 15)
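The logic described above can be sketched as follows. This is not the project's code, and its line numbering does not match the references in the text: it is a simplified, synchronous version in which `createFakeBucket()` is a hypothetical in-memory stand-in for the Couchbase client (the real SDK is callback-based), and the `counter:<type>` and `<type>:<n>` key formats are assumptions.

```javascript
// Hypothetical in-memory stand-in for a Couchbase bucket, for illustration only.
function createFakeBucket() {
  const docs = new Map();
  const counters = new Map();
  return {
    incr(key) {
      const next = (counters.get(key) || 0) + 1;
      counters.set(key, next);
      return next;
    },
    set(key, doc) { docs.set(key, doc); },
    get(key) { return docs.get(key); }
  };
}

function upsert(bucket, type, doc) {
  // Reject documents without a type, or with a type other than the expected one.
  if (!doc.type || doc.type !== type) {
    throw new Error("Unexpected document type");
  }
  // No id yet: this is a creation, so generate an id from a per-type counter.
  if (doc.id === undefined) {
    doc.id = type + ":" + bucket.incr("counter:" + type);
  }
  // Creation and update both end with the same write.
  bucket.set(doc.id, doc);
  return doc;
}

const bucket = createFakeBucket();
const created = upsert(bucket, "idea", { type: "idea", title: "Free coffee" });
upsert(bucket, "idea", { type: "idea", id: created.id, title: "Free coffee for all" });
console.log(bucket.get(created.id));
```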

The delete operation is equivalent to the get, using the delete HTTP operation.

So now I can get, insert and update the documents. I still need to do some work to deal with the lists. As you can guess, here I need to call the views. I won't go into the details of the simple list of ideas. Let's focus on the view that shows the result of the votes.

For this part of the application I use a small trick to use the collated view. The /api/results/ call returns the list of ideas with their title and the total number of votes. The result looks like the following:

Note that it is also possible to select only one idea: you just need to pass its id to the call.

If you look at the function in more detail, you will see that I not only call the view but also build an array in which I put the idea id and label; then, on the next iteration of the loop, I add the number of votes. This is possible because the view is a collated view of the ideas and their votes.
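A hedged sketch of this folding step is shown here. The row shapes are assumptions about what the collated, reduced view returns (an idea row followed by an optional row holding the sum of its votes), and `buildResults()` is a hypothetical helper name rather than the function used in the project.

```javascript
// Fold the rows of a collated, reduced view into one entry per idea.
// Assumed row shapes: { key: [ideaId, 0, title], value: 0 } for an idea,
// followed by { key: [ideaId, 1], value: <sum of votes> } for its votes.
function buildResults(rows) {
  const results = [];
  for (const row of rows) {
    const [ideaId, kind, title] = row.key;
    if (kind === 0) {
      // Idea row: start a new entry with no votes yet.
      results.push({ id: ideaId, title: title, votes: 0 });
    } else if (results.length > 0 && results[results.length - 1].id === ideaId) {
      // Vote row: it directly follows its idea, so update the last entry.
      results[results.length - 1].votes = row.value;
    }
  }
  return results;
}

const sampleRows = [
  { key: ["idea:1", 0, "Free coffee"], value: 0 },
  { key: ["idea:1", 1], value: 3 },
  { key: ["idea:2", 0, "Standing desks"], value: 0 }
];
const results = buildResults(sampleRows);
console.log(results);
```

The single pass works only because the index guarantees that each idea row comes immediately before the row of its votes.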

I now have my REST services, including advanced query capabilities. It is now time to use these services and build the user interface.

Create the UI

For the view I am using AngularJS, which I am packaging in the same Node.js application for simplicity.

Simple UI without Login/Security

The code of the application without login is available in branch 02-simple-ui-no-login

You can run the application with simple services using the following command:

> git checkout -f 02-simple-ui-no-login
> node app.js

The application is based on AngularJS and Twitter Bootstrap.

I am using basic features and packaging for Angular :

/public/js/app.js contains the module declaration and all the routes to the different views/controllers
/public/js/controllers.js contains all the controllers. I will show some of them, but basically this is where I call the services that I have created above.
/views/partials/ contains the different pages/screens used by the application.

Because the application is quite simple I have not done any packaging of directives or other functions. This is true for both the AngularJS and Node.js parts.

Dummy user management

In this first version of the UI I have not yet integrated any login/security, so I fake the user login using a global scope variable, $scope.user, that you can see in the controller AppCtrl(). Since I have not yet implemented the login/security, I have added a text field at the bottom of the page where you can enter a "dummy" username to test the application. This field is inserted in the /views/index.html page.

List Views and Number of Votes

The home page of the application contains the list of ideas and number of votes.

Look at the EntriesListCtrl controller and the /views/index.html file. As you can guess, this is based on the Couchbase collated view that returns the list of ideas and the number of votes.

Create/Edit an idea

When the user clicks on the New link in the navigation, the application calls the view /views/partials/idea-form.html. This form is called using the "/#/idea/new" URL.

Just look at the IdeaFormCtrl controller to see what is happening :

function IdeaFormCtrl($rootScope, $scope, $routeParams, $http, $location) {
  $scope.idea = null;
  if ($routeParams.id) {
    $http({method: 'GET', url: '/api/idea/' + $routeParams.id}).success(function(data, status, headers, config) {
      $scope.idea = data;
    });
  }

  $scope.save = function() {
    $scope.idea.type = "idea"; // set the type
    $scope.idea.user_id = $scope.user;
    $http.post('/api/idea', $scope.idea).success(function(data) {
      $location.path('/');
    });
  };
  $scope.cancel = function() {
    $location.path('/');
  };

}
IdeaFormCtrl.$inject = ['$rootScope', '$scope', '$routeParams', '$http', '$location'];

First of all I test if the controller is called with an idea identifier in the URL ($routeParams.id, line 3). If the ID is present, I call the REST API to get the idea and set it into the $scope.idea variable.

Then on line 9, you can see the $scope.save() function that calls the REST API to save/update the idea to Couchbase. I use lines 10 and 11 to set the type of data and the user on the idea.

Note: It is interesting to look at these lines: by adding the two attributes (user & type) I modify the "schema" of my data. I am adding new fields to my document that will be stored as is in Couchbase. Once again, you see here that I drive the data type from my application. I could take another approach and force the type in the service layer. For this example I chose to put that in the application layer, which is supposed to send the proper data types.

Other Interactions

The same approach is used to create a vote associated to a user/idea as you can see in the VoteFormCtrl controller.

I won't go into all the details of all the operations; I am just inviting you to look at the code of the application, and feel free to add a comment to this blog post if I need to clarify other parts of the application.

Iterative Development : adding a value to the vote!

The code of this version is available in branch 03-vote-with-value:

You can run this version of the application using the following command:

> git checkout -f 03-vote-with-value
> node app.js

Adding the field in the form

Something that I really like about working with AngularJS, Node and Couchbase is the fact that the developer uses JSON from the database to the browser.

So let's implement a new feature where, instead of having only a comment, the user can give their vote a rating from 1 to 5. Doing that is quite easy, here are the steps:

Modify the UI : add a new field
Modify the Couchbase view to use the new field

This is it! AngularJS deals with the binding of the new field, so I just need to edit the /views/partials/idea-form.html to add this. For this I need to add the list of values in the controller and expose it in a select box in the form.

The list of values is located in the $scope.ratings variable :

Once this is done you can add a select box into your view using the following code :

To add the select box into the form, I just use AngularJS features:

the list of values described in my controller using the ng-options attribute
the binding to the vote.rating field object using ng-model attribute.

I am adding the field in my form, I bind this field to my Javascript object; and... nothing else! Since my REST API is just consuming the JSON object as it is, AngularJS will send the vote object with the new attribute.

Update the view to use the rating

Now that my database is dealing with a new attribute in the vote, I need to update my view to use this in the sum function. (I could calculate an average too, but here I want the sum of all the ratings.)

The only line that I have changed is line 7. The logic is simple: if the rating is present I emit it, if not I emit a 2, which is a medium rating for an idea.
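The fallback logic can be sketched in isolation as follows. `voteValue()` is a hypothetical helper introduced here for illustration; inside the map function, its result would be the value passed to emit() for a vote document.

```javascript
// Value emitted for a vote: its rating when present, otherwise a default of 2.
// voteValue is a hypothetical helper name used only in this sketch.
function voteValue(doc) {
  return doc.rating ? doc.rating : 2;
}

// A vote created after the new field was added keeps its rating...
const rated = voteValue({ type: "vote", rating: 5 });
// ...while an older vote without a rating falls back to the default.
const legacy = voteValue({ type: "vote" });
console.log(rated, legacy);
```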

This is a small tip that allows me to have a working view/system without having to update all the existing documents, if I have some.

I'll stop here for now, and will add new features later, such as User Authentication and User Management using, for example, Passport.

Version and Upgrade Management

If you look closely at the code of the application, you will see that the views are automatically imported from the app.js file when the application starts.

In fact I have added a small function that checks the currently installed version and updates the views with the correct version when needed.

You can look at the function initApplication() :

Load the version number from Couchbase (document with ID "app.version")
Check whether the installed version is different from the current one
Update/Create the view (I am doing it in production mode here; in a real application it would be better to use dev mode: just prefix the design document ID with "dev_")
Once the view is created update/create the "app.version" document with the new ID.
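These steps can be sketched as follows. This is a simplified, synchronous illustration and not the project's initApplication(): `store` is a hypothetical in-memory stand-in for the bucket, `publishViews` is a placeholder for the call that creates the design documents, and the version number and the shape of the "app.version" document are assumptions.

```javascript
// Sketch of a version check at startup. "store" is a hypothetical in-memory
// stand-in for the bucket; publishViews is a placeholder for the real call.
const APP_VERSION = "0.0.3"; // assumed version number, for illustration

function initApplication(store, publishViews) {
  // Load the installed version from the "app.version" document, if any.
  const installed = store.get("app.version");
  if (installed && installed.version === APP_VERSION) {
    return false; // views are already up to date, nothing to do
  }
  // Create or update the views, then record the version that is now installed.
  publishViews();
  store.set("app.version", { version: APP_VERSION });
  return true;
}

const store = new Map();
let published = 0;
const firstStart = initApplication(store, () => { published++; });
const secondStart = initApplication(store, () => { published++; });
console.log(firstStart, secondStart, published);
```

The second start finds a matching version and skips the update, so the views are published only once.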

Conclusion

In this article we have seen how you can quickly develop your application/prototype and leverage the flexibility of NoSQL for developers. The steps to do this are:

Design your document model and API (REST)
Create the UI that consumes the API
Modify your model by simply adding fields in the UI
Update the view to adapt your lists to your new model

In addition to this, I have also quickly explained how you can control the version of your application from your code and deploy new views (and other things) automatically.

I will publish another blog post in a few days to explain how you can easily integrate user management and security into your application and database.

How to get the latest document by date/time field?

February 18, 2013 · 2 min read

I read this question on Twitter, let me answer the question in this short article.

First of all you need to be sure your documents have an attribute that contains a date ;), something like :

{
  "type" : "emp",
  "id":"001",
  "name":"John Doe",
  "hiredate":"Jan 1, 2013 8:32:00 AM"
}

To get the "latest hired employee" you need to create a view and emit the hire date as the key. The important part is to check that this date is emitted in a format that sorts properly, for example an array of values using the dateToArray function, or the time as a numerical value. In the following view I am using the date as an array; that way I will also be able to do some grouping, but this is another topic. The view looks like the following:

function (doc, meta) {
  if (doc.hiredate) {
    emit( dateToArray(doc.hiredate) );
  }
}
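To illustrate why the key format matters, the following sketch compares dates sorted as plain text with dates sorted as arrays. `toDateArray()` is a local re-implementation of the idea behind dateToArray() written for this example (using UTC components is an assumption), and the sample dates are made up.

```javascript
// Local stand-in for the idea behind dateToArray(), for illustration only:
// [year, month, day, hours, minutes, seconds], using UTC components.
function toDateArray(value) {
  const d = new Date(value);
  return [d.getUTCFullYear(), d.getUTCMonth() + 1, d.getUTCDate(),
          d.getUTCHours(), d.getUTCMinutes(), d.getUTCSeconds()];
}

// Element-by-element comparison of two arrays of numbers.
function compareArrays(a, b) {
  for (let i = 0; i < Math.min(a.length, b.length); i++) {
    if (a[i] !== b[i]) return a[i] - b[i];
  }
  return a.length - b.length;
}

// As text, the dates sort alphabetically: "Jan" comes after "Feb" and "Dec".
const asText = ["Feb 1, 2013 9:00:00 AM", "Dec 15, 2012 10:30:00 AM", "Jan 1, 2013 8:32:00 AM"];
const latestText = asText.slice().sort().pop(); // wrong answer: the January date

// As arrays, the same dates sort chronologically.
const hireDates = ["2013-02-01T09:00:00Z", "2012-12-15T10:30:00Z", "2013-01-01T08:32:00Z"];
const keys = hireDates.map(toDateArray).sort(compareArrays);
const latest = keys[keys.length - 1]; // the February date, as expected

console.log(latestText, latest);
```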

Now that you have a view, you can query it using the parameters:

descending = true
limit = 1

If you use the Java SDK, the code will look like the following :

import com.couchbase.client.protocol.views.Query;
import com.couchbase.client.protocol.views.View;
import com.couchbase.client.protocol.views.ViewResponse;
import com.couchbase.client.protocol.views.ViewRow;
...
...
...

  View view = cb.getView("employees", "by_hiredate");
  Query query = new Query();
  query.setIncludeDocs(true);
  query.setLimit(1);
  query.setDescending(true);
  ViewResponse viewResponse = cb.query(view, query);
  for (ViewRow row : viewResponse) {
    String documentJson = row.getDocument();
  }

Finally, when you work with views it is important to understand how the indexes are managed by the server, so be sure you read the chapter "Index Updates and the stale Parameter".

Introduction to Collated Views with Couchbase 2.0

February 13, 2013 · 5 min read

Most applications have to deal with "master/detail" type of data:

breweries and beer
department and employees
invoices and items
...

This is necessary, for example, to create an application view like the following:

With Couchbase, as with many document-oriented databases, you have different ways to deal with this. You can:

Create a single document for each master and embed all the children in it
Create a master and child documents and link them using an attribute.

In the first case, when all the information is stored in a single document, it is quite easy to use the entire set of data and, for example, create a screen that shows all the information. But what about the second case?

In this post I explain how to use Couchbase views to deal with that and make it easy to create master/detail views.

As an ex-Oracle employee, I am using the infamous SCOTT schema with the DEPT and EMP tables, as the first example. Then at the end I will extend this to the beer sample data provided with Couchbase.

The Data

Couchbase is a schema-less database, and you can store “anything you want” into it, but for this you need to use JSON documents and create 2 types of document : “department” and “employee”.

The way we usually do that is by using a technical attribute to type the document. So the employee and department documents will look as follows :

Department

{
  "type": "dept",
  "id": 10,
  "name": "Accounting",
  "city": "New York"
}

Employee

{
  "type": "emp",
  "id": 7782,
  "name": "Blake",
  "job": "Clark",
  "manager": 7839,
  "salary": 2450,
  "dept_id": "dept__10"
}

This shows just the documents; in Couchbase you have to associate each document with a key. For this example I am using a simple pattern : type__id. For these documents the keys will look like the following:

dept__10
emp__20

You can use any pattern to create a key; for example, for the employee you could choose to use an email address.

Note the “dept_id” attribute in the employee document. This is the key of the department; you can see that as the “foreign key”. But remember, the relationship between the department and employee documents is managed entirely by the application; Couchbase Server does not enforce it.

I have created a Zip file that contains all the data, you can download it from here; and import the data into Couchbase using the cbdocloader utility. To import the data run the following command from a terminal window:

./cbdocloader -n 127.0.0.1:8091 -u Administrator -p password -b default ~/Downloads/emp-dept.zip

You can learn more about the cbdocloader tool in the documentation.

The View

Queries inside Couchbase are based on views; and views build indexes, so we have to create a view, a "collated view" to be exact.

The idea behind a collated view is to produce an index where the keys are ordered so that a parent id appears first, followed by its children. So we are generating an index that will look like:

DEPT_10, Accounting
DEPT_10, Blake
DEPT_10, Miller
DEPT_20, Research
DEPT_20, Adams
DEPT_20, Ford
...

This is in fact quite easy to do with Couchbase views. The only trick here is to control the order and be sure the master is always the first one, just before its children.

So to control this we can create a compound key that contains the department id, a "sorting" element and the name (department or employee).

So the map function of the view looks like the following:

The key is composed of:

the department id, extracted from the department document itself or from the employee document, depending on the type of document
an arbitrary number that is used to control the ordering. I put 0 for the department, 1 for the employee
the name of the department or the employee; this also allows sorting the result by name

In addition to the key, this view is used to emit some information about the salary of the employees. The value is simply the sum of the salary and the commission, when one exists. The result of the view looks like:

With this view you can now use the result to build reports for your application. It is also possible to use parameters in your query to see only a part of the data, for example a single department: use startkey=["dept__20",0]&endkey=["dept__20",2] to view only the data (department and employees) of department 20, Research.
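The whole mechanism, from the compound key to the range query, can be sketched outside of Couchbase as follows. This is an illustration under stated assumptions, not the exact view of this article: the `commission` field name is assumed, the `emit()` stub and the `compareKeys()` function are simplified stand-ins for the server-side index and its key ordering, and the sample values are made up.

```javascript
// Sketch of a collated map function for departments and employees.
// "commission" is an assumed field name; it is not in the sample documents.
const deptWithEmployees = function (doc, meta) {
  switch (doc.type) {
    case "dept":
      emit([meta.id, 0, doc.name], null);
      break;
    case "emp":
      emit([doc.dept_id, 1, doc.name], doc.salary + (doc.commission || 0));
      break;
  }
};

// Stand-in for the index built by the server.
const index = [];
function emit(key, value) { index.push({ key: key, value: value }); }

// Simplified key ordering: element by element, shorter array first on a tie.
function compareKeys(a, b) {
  for (let i = 0; i < Math.min(a.length, b.length); i++) {
    if (a[i] < b[i]) return -1;
    if (a[i] > b[i]) return 1;
  }
  return a.length - b.length;
}

// Documents are indexed in no particular order.
deptWithEmployees({ type: "emp", name: "Ford", salary: 3000, dept_id: "dept__20" }, { id: "emp__7902" });
deptWithEmployees({ type: "dept", name: "Research" }, { id: "dept__20" });
deptWithEmployees({ type: "emp", name: "Blake", salary: 2450, dept_id: "dept__10" }, { id: "emp__7782" });
deptWithEmployees({ type: "dept", name: "Accounting" }, { id: "dept__10" });
deptWithEmployees({ type: "emp", name: "Adams", salary: 1100, commission: 100, dept_id: "dept__20" }, { id: "emp__7876" });

// Sorting by key puts each department just before its employees.
index.sort((x, y) => compareKeys(x.key, y.key));

// Equivalent of startkey=["dept__20",0]&endkey=["dept__20",2]
const research = index.filter(row =>
  compareKeys(row.key, ["dept__20", 0]) >= 0 &&
  compareKeys(row.key, ["dept__20", 2]) <= 0);
console.log(research);
```

The sorting element (0 or 1) is what keeps the department row ahead of its employees, whatever their names are; the end key uses 2 so that every employee row of the department falls inside the range.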

The Beer Sample Application

You can create an equivalent view for the beer sample application where you print all the breweries and beers in the same report. The view is called "all_with_beers" in the design document "brewery". The view looks like:

Once you have published it in production you can use it in the Beer Sample application; for this example I have modified the Java sample application.

Create a servlet to handle user requests on the /all URI.

The "BreweryAndBeerServlet" calls the view using the following code :

The result of the query is set into the HttpRequest and the all.jsp page is executed. The JSP uses JSTL to print the information using the following code:

The JSP gets the items from the HTTP Request and loops over each item; then, based on the type of the item, the information is printed. The final result looks like :

This extension to the Beer Sample application is available here : https://github.com/tgrall/beersample-java/tree/BreweriesAndBeers
