Tug’s Blog

My journey in Big Data, Hadoop, NoSQL and MapR

How to Calculate the Size of a Folder in JCR (Java Content Repository)?


Today I was working with a partner in Paris and he wanted to know how to calculate the size of a specific folder in the eXo Java Content Repository (JCR).

For this specific need the goal is to calculate the size of all the documents stored inside a specific location in the content repository. This could be used, for example, to manage quotas or to estimate the size of a shared or personal storage. For this sample I will only take into consideration the size of the binary part of the documents stored in the repository; this means I will not pay attention to the various attributes and metadata that are also stored, nor to the full-text index created by Lucene, which is embedded in eXo JCR.

How are files stored in the JCR?

Files are stored in eXo JCR using the standard node types nt:file and nt:resource. So for this example I will simply list all the nt:file nodes of a folder and add up the size of each file. It is important to understand how JCR stores the binary content, and the best way to understand it is to look at it. For that I am using the output printed by CRaSH, a shell for content repositories developed by the eXo team and led by Julien Viet.

Here is the structure of a PDF document:

/Documents/jsr170-1.0.pdf
+-properties
| +-jcr:primaryType: nt:file
| +-jcr:mixinTypes: [mix:referenceable,mix:versionable]
| +-jcr:uuid: '6b89b6f0c0a8006530a8617df51bb0d7'
| +-jcr:created: 2011-02-28T10:11:50.770+01:00
+-children
| +-/Groups/spaces/intranet_v2/Documents/Technical/jsr170-1.0.pdf/jcr:content

As you can see, the ‘binary’ is not visible in this node, and there is nothing wrong with that. As written in section 6.7.22.6 (nt:file) of the specification, the binary content is a property of the child node jcr:content, which is shown below:

/Groups/spaces/intranet_v2/Documents/Technical/jsr170-1.0.pdf/jcr:content
+-properties
| +-jcr:primaryType: nt:resource
| +-jcr:mixinTypes: [exo:owneable,exo:datetime,dc:elementSet]
| +-jcr:uuid: '6b89b6f7c0a80065735b1a8853d389d0'
| +-jcr:data: <binary>
| +-jcr:encoding: 'UTF-8'
| +-jcr:lastModified: 2011-02-28T10:11:50.767+01:00
| +-jcr:mimeType: 'application/pdf'
+-children

You can now see that the jcr:content node contains some interesting properties:

  • jcr:mimeType
  • jcr:data: this is where the binary content, the PDF itself, is stored.

So, using the JCR API, you just need to get the content length with the following Java code:

node.getProperty("jcr:content/jcr:data").getLength()

This returns the size of the binary data in bytes.

So to calculate the size of a folder you just need to navigate through all the documents (nt:file or jcr:content) and add up the size of all the files. In the following code, I calculate the size of the folder “/Documents” by navigating through all the files contained in this folder and its subfolders. (I could have chosen to query the jcr:content nodes, of type nt:resource, instead of nt:file.)

// Requires: javax.jcr.Node, javax.jcr.NodeIterator, javax.jcr.Session,
//           javax.jcr.query.Query, javax.jcr.query.QueryManager
Session session = getSession(); // use RepositoryService or context to get a session
QueryManager manager = session.getWorkspace().getQueryManager();
String queryStatement = "select * from nt:file where (jcr:path LIKE '/Documents/%')";
Query query = manager.createQuery(queryStatement, Query.SQL);
NodeIterator nodeIterator = query.execute().getNodes();
long totalSizeInBytes = 0;

while (nodeIterator.hasNext()) {
  Node node = nodeIterator.nextNode(); // next() returns Object, nextNode() returns a Node
  // getLength() is in bytes; sum bytes first so small files are not truncated to 0 MB
  totalSizeInBytes += node.getProperty("jcr:content/jcr:data").getLength();
}

long totalSizeInMb = totalSizeInBytes / (1024 * 1024);

As you can guess, since this query walks the whole hierarchy under the folder, you have to be very careful when using it on a large repository. This example is just a simple code sample to show you some of the cool features provided by the JCR API.
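The byte-to-megabyte arithmetic is easy to get wrong, so here is a minimal, self-contained sketch of the aggregation step on its own. It does not touch the JCR API: the `long` values stand in for what `getLength()` would return for each file, and the class and method names (`FolderSize`, `totalBytes`, `toMegabytes`, `exceedsQuota`) are hypothetical, chosen only for this illustration.

```java
import java.util.List;

// Hypothetical helper: the long values stand in for the byte counts
// that getLength() would return for each jcr:data property.
class FolderSize {

    private static final long BYTES_PER_MB = 1024L * 1024L;

    // Sum the raw byte counts; no unit conversion happens here.
    static long totalBytes(List<Long> fileSizesInBytes) {
        long total = 0;
        for (long size : fileSizesInBytes) {
            total += size;
        }
        return total;
    }

    // Convert once, at the end, keeping the fractional part.
    static double toMegabytes(long bytes) {
        return (double) bytes / BYTES_PER_MB;
    }

    // Example use: a simple quota check expressed in bytes.
    static boolean exceedsQuota(long usedBytes, long quotaBytes) {
        return usedBytes > quotaBytes;
    }
}
```

With three files of 512 KB, 512 KB and 1 MB, summing the bytes first gives 2 MB. Dividing each file by 1024*1024 with integer arithmetic before summing would give 0 + 0 + 1 = 1, which is why the conversion is kept for the very end.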

iOS 101: How to Convert a String to a NSDate


During my vacation, I took some time to play with iOS development. I have been struggling with many small issues… This is the price to pay when learning a new technology, and it is part of the fun of doing it. I will try to document some of these issues in articles… Let’s start with a very common story: working with dates.

Objective-C and the iOS SDK provide a class to help with formatting and parsing dates (marshaling and unmarshaling): NSDateFormatter. No surprise, NSDateFormatter uses the Unicode Date Format Patterns.

A small example of creating a date from a string:

    NSDateFormatter *dateFormatter = [[NSDateFormatter alloc]init];
    [dateFormatter setDateFormat:@"yyyy-MM-dd"];
    NSDate *date = [dateFormatter dateFromString:publicationDate ];
    [dateFormatter release];
     // use your date object

The date that I have to create from a string looks like “2010-11-12”, so I do not have any time information. When I convert this string with the code above, the result is “2010-11-11 23:00:00 +0000”. As you can see, the date is calculated from my current time zone (small reminder: I am in France). So the “date” object itself is perfectly fine, but in my example I want the date independently of the time.

To manage the date without any time or time zone information, I can force the time zone that NSDateFormatter uses. I just need to call the setTimeZone: instance method.

The code now looks like this (see the setTimeZone: call on the third line):

NSDateFormatter *dateFormatter = [[NSDateFormatter alloc]init];
[dateFormatter setDateFormat:@"yyyy-MM-dd"];
[dateFormatter setTimeZone:[NSTimeZone timeZoneForSecondsFromGMT:0]];
NSDate *date = [dateFormatter dateFromString:publicationDate ];
[dateFormatter release];
 // use your date object
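
The same pitfall exists outside Objective-C. As a rough analogue only (this is Java's `SimpleDateFormat`, not the NSDateFormatter API), the sketch below parses the same string once with a French time zone and once with UTC. The class name `DateOnlyParser` is hypothetical.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

// Hypothetical helper illustrating the effect of the formatter's time zone
// when the input string carries a date but no time.
class DateOnlyParser {

    static Date parse(String text, TimeZone zone) {
        SimpleDateFormat formatter = new SimpleDateFormat("yyyy-MM-dd");
        // Midnight is interpreted in this zone, like setTimeZone: on NSDateFormatter.
        formatter.setTimeZone(zone);
        try {
            return formatter.parse(text);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a yyyy-MM-dd date: " + text, e);
        }
    }
}
```

Parsing “2010-11-12” with the `Europe/Paris` zone yields the instant 2010-11-11T23:00:00Z, while parsing it with UTC yields 2010-11-12T00:00:00Z, which mirrors the behaviour described above.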

Hope that helps!

What Apple’s Announcement Really Means to Java Developers


Hey Steve, keep the bean in the Apple!

The news from last week that grabbed the attention of many Java developers was Apple’s announcement of its intentions to deprecate Java in the latest OS X 10.6 update. One sentence stood out in particular, “Developers should not rely on the Apple-supplied Java runtime being present in future versions of Mac OS X,” and raised the question: should Java developers (many of whom, like me, develop on Macs) freak out?

I don’t think so. (Though it prompted additional speculation and follow-on news stories.)

Let’s be realistic. Most applications run on the server side, on Unix/Linux and/or Windows Server – which has nothing to do with Apple or Mac OS X. And more and more applications are running on the cloud, where the language isn’t necessarily irrelevant, but certainly less important than the services that the application exposes. And I’m sure Java will have a big role in ‘development in the cloud,’ as we can already see with Google AppEngine and the VMWare/SpringSource effort.

I think the more interesting question to ask is “Why did Apple do this?”

I believe this is related to Apple’s other big news last week: the new “Mac App Store,” which looks like an effort to have one single technology and language to develop “official” applications for Mac. In fact, for all Apple platforms running OS X and iOS, developers should use Xcode and Objective-C. That’s fine with me, as I enjoy developing small apps for my iPhone and iPad in my spare time, using these tools. But at eXo, many of our developers are using Java, often on Macs, to build our software.

We’re not talking about the same kind of applications. If, in the future, Java does not exist on Macs, it will not cause enterprise developers to abandon Java, but simply force them to move away from their Macs. Personally, I don’t want that to happen. I switched to Mac in 2001, and I’ve been a big fan of all Apple products ever since (most of my extended family are now also on Macs, and they couldn’t care less about Java).

As a Java developer, do I switch back to PC now? Unlikely. I am very confident (overconfident?) that Java will still be present on OS X. The difference is that Apple will simply stop caring about it – the same way that Microsoft doesn’t care now. I cannot believe that Apple will stop/block Java on their platform. So the future of Java in general, and now on Mac, is fully under the control of the Java community, driven by Oracle and OpenJDK. I am sure we will find many skilled “MacAddicts” to maintain and improve Java on OS X, to at least allow Java developers to run their favorite IDE and test their applications before deploying them on the servers – keeping the “Write Once, Run Anywhere” a reality (almost…). The only “bad” part is the fact that “Java Desktop” will not borrow any of the cool features of Apple Mac OS X. Not a big deal, since Java Desktop has never been that successful anyway.

So my advice to fellow Java developers is this: if you care, be vocal. Let’s make sure Apple lets the community drive the future of Java on Mac, since the future of the Java platform is still very exciting for many of us.

Original Post on eXo Blog.

VirtualBox: How to Clone a Virtual Machine?


During some testing I had to set up a cluster on my network, so I created a first virtual machine. It is not possible to simply copy the Virtual Disk Image (.vdi): VirtualBox saves a UUID in each disk image, and this UUID is also stored inside the virtual machine image. VirtualBox does not support two images with the same UUID. So to clone an image you need to use the VBoxManage clonehd command.

The clonehd command copies the VDI file and assigns a new UUID to the copy.

VBoxManage clonehd /opt/tools/vm/vm1-rhel.vdi /opt/tools/vm/vm2-rhel.vdi

Once the copy is done, you can now register this new VDI in your VirtualBox environment and create a new virtual machine.

Note: I am running VirtualBox on Mac OS X, and I needed to give the complete path to the VDI files; otherwise the command does not work.
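
If you script the cloning from Java, one way to sidestep the full-path issue is to resolve both file names to absolute paths before building the command line. This is only a sketch: the class name `CloneCommand` is hypothetical, and it merely assembles the arguments for the `VBoxManage clonehd` call shown above without running anything.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: builds the argument list for
// "VBoxManage clonehd <source> <target>" with absolute paths.
class CloneCommand {

    static List<String> build(String source, String target) {
        // Relative names are resolved against the current working directory.
        Path sourcePath = Paths.get(source).toAbsolutePath().normalize();
        Path targetPath = Paths.get(target).toAbsolutePath().normalize();
        return Arrays.asList("VBoxManage", "clonehd",
                sourcePath.toString(), targetPath.toString());
    }
}
```

The resulting list can be handed to `new ProcessBuilder(...)` to launch the command, assuming VBoxManage is on the PATH.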

Alternative approach

Initially I had an issue with the clonehd command because I was not using full paths. As an alternative, you can copy the file yourself and then assign a new UUID to the copy:

cp vm1-rhel.vdi vm2-rhel.vdi
VBoxManage internalcommands sethduuid vm2-rhel.vdi

You can now add the new VDI to your VirtualBox environment.

USI2009: The Geek and Boss French Conference


This year I was lucky enough to give a presentation at the second edition of the “Université du SI”, organized by Octo Technologies. I have to say that this conference is one of the best that I have attended, and for sure the best in France. Unfortunately I was only able to attend the first day of the conference, but even in one day, I was very happy with the content of the presentations, the keynotes, and the networking opportunities.

I won’t go into the details of all the presentations that I have seen: Google for the Enterprise, Application Server Future, Usability concerns, and the keynotes. If you want good feedback about this conference, I invite you to read, in French, the reports from Le Touilleur Express.

Let me just share the presentation that I gave with Vincent Massol from XWiki, about CMS vs Wiki.

Wiki vs CMS duel

First of all, the room was packed, so it looks like this is an interesting subject for many of you. Do not hesitate to post comments or questions on this entry. Vincent and I will be pleased to update our presentation for a new event.

The main message of the talk was:

  • For collaboration on content the wiki is king
  • For publication of content the CMS is king

Wiki vs CMS

Conference Season for eXo Platform in Paris


eXo Platform and I will be present at several conferences in the upcoming weeks:

  • Linux Solutions, March 31st - April 2nd: In addition to the demonstration pod where you can meet eXo people, I invite you to join us during the OW2 Annual Conference presentations:
    • Next generation Portals: how OpenSocial standard adds social to the mix (April 2, 01:30 - 02:00)
    • Which Portlet Bridge is made for you? (April 2, 02:00 - 02:30)

You can find the full program here.

  • Salon Intranet, May 12th-13th: Once again, eXo will be present with a demonstration pod. You can also come and meet eXo CEO Benjamin Mestrallet and myself during the “eXo Platform, the Open Source solution for your Intranet” session on May 12th from 3pm to 4pm.

JAX-WS: How to Configure the Service End Point at Runtime?


When deploying your Web Service client you often need to change the endpoint of the service that was set during code generation. This short post explains how you can change it at runtime in the client code.

You have two approaches to do that:

  • set the endpoint in the Port using the BindingProvider
  • get the endpoint URL from the WSDL itself at runtime

Use the Binding Provider to set the endpoint URL

The first approach is to change the BindingProvider.ENDPOINT_ADDRESS_PROPERTY property value of the BindingProvider (Port) using the following code:

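// Requires: import javax.xml.ws.BindingProvider;
try {
  EmployeeServiceService service = new EmployeeServiceService();
  EmployeeService port = service.getEmployeeServicePort();

  // The generated port also implements BindingProvider
  BindingProvider bp = (BindingProvider) port;
  // Override the endpoint address that was set at code-generation time
  bp.getRequestContext().put(BindingProvider.ENDPOINT_ADDRESS_PROPERTY, "http://server1.grallandco.com:8282/HumanRessources/EmployeeServiceService");

  Employee emp = port.getEmployee(123);

  System.out.println("Result = " + emp);
} catch (Exception ex) {
  ex.printStackTrace(); // placeholder: replace with your own error handling
}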
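
In practice the endpoint string is rarely hard-coded. As a minimal sketch, assuming you want to read it from a configuration source and fall back to the generated default, the lookup could look like this. The class name `EndpointConfig` and the property key `employee.service.endpoint` are hypothetical and not part of JAX-WS.

```java
import java.util.Properties;

// Hypothetical helper: picks the endpoint URL from configuration,
// falling back to a default when nothing is configured.
class EndpointConfig {

    // Hypothetical property name; choose whatever fits your deployment.
    static final String ENDPOINT_KEY = "employee.service.endpoint";

    static String resolve(Properties config, String defaultEndpoint) {
        String value = config.getProperty(ENDPOINT_KEY);
        if (value == null || value.trim().isEmpty()) {
            return defaultEndpoint; // nothing configured: keep the default
        }
        return value.trim();
    }
}
```

The returned string is what you would pass as the value of BindingProvider.ENDPOINT_ADDRESS_PROPERTY. For example, calling `EndpointConfig.resolve(System.getProperties(), defaultUrl)` lets you override the endpoint with a `-Demployee.service.endpoint=...` option when starting the JVM.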

Use the WSDL to get the endpoint URL

Another approach is to specify the WSDL location when you create the Service. The service will then use the address found in the WSDL port (the SOAP endpoint). This is simply done using the following code:

// Requires: import java.net.URL; import javax.xml.namespace.QName;
try {
  // Pass the WSDL location and the service QName explicitly
  EmployeeServiceService service = new org.demo.service.EmployeeServiceService(
      new URL("http://server1.grallandco.com:8282/HumanRessources/EmployeeServiceService?wsdl"),
      new QName("http://service.demo.org/", "EmployeeServiceService"));

  EmployeeService port = service.getEmployeeServicePort();
  Employee emp = port.getEmployee(123);

  System.out.println("Result = " + emp);
} catch (Exception ex) {
  ex.printStackTrace(); // placeholder: replace with your own error handling
}

Note that in GlassFish, as in many Web Service environments, the WSDL can dynamically generate the endpoint URL based on the URL used to fetch the WSDL. With this approach you can therefore also change the SOAP endpoint dynamically (provided this is compatible with the network configuration of the production environment).
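
In the samples above, the WSDL URL is simply the endpoint address followed by `?wsdl`. Assuming your server follows the same convention (not all do), the WSDL location can be derived from a configured endpoint, as in the sketch below. The class name `WsdlLocation` is hypothetical.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

// Hypothetical helper: derives the WSDL URL from an endpoint address,
// assuming the server publishes the WSDL at "<endpoint>?wsdl".
class WsdlLocation {

    static URL forEndpoint(String endpoint) {
        try {
            return URI.create(endpoint + "?wsdl").toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Invalid endpoint: " + endpoint, e);
        }
    }
}
```

The resulting URL can be passed as the first argument of the generated service constructor, next to the service QName.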