Leveraging DevOps Automation to Achieve Maximum Benefits!

What if… you could reduce standard infrastructure setup and configuration time from 8 hours to 15 minutes?

What if… you could achieve YoY engineering cost reduction of 37 percent?

What if… you could reduce time to market for new features by 20 percent?

Aren’t these stats interesting? Here’s how a San Francisco-based IT company leveraged DevOps automation and achieved this (Download Case Study)!

The company wanted to increase its market share by lowering subscription prices and offering more product features while maintaining profitability. Developers and architects struggled to meet this goal: they spent too much time migrating code between local, development, user acceptance testing, and production environments, and lost 20 percent of development time to fine-tuning, without much success.

The company then approached the Imaginea Cloud Services team to look into the problem and suggest solutions that could shorten the development cycle and improve asset utilization. They also wanted to cut OpEx and CapEx.

Imaginea’s solution helped developers achieve continuous integration and continuous delivery, automating and improving the software delivery process and thus accelerating the build, deploy, test, and release processes.

Imaginea helped the company automate much of its development grunt work and significantly reduced operational rework. The IT team completely redefined its perception of how much can be done in how much time.

Interested in reading more? Visit us to know more.

You can also download the complete case study here.

 

A staggering fact, but true! A Gartner survey estimates that downtime caused by incorrect manual configurations costs small and medium-sized businesses $42,000 an hour, with figures in the millions for larger enterprises.

 

Lucene Custom Scoring – Custom Score Query and Custom Score Provider

In my previous post I wrote about the different types of boosting and also provided an introduction to the concept of scoring. I had promised a series of posts on how to achieve custom scoring. There are many means to achieve custom scoring, too many in fact to cover all of them in a single blog post. In this post we will take a look at the oft-used technique of using a custom query in conjunction with a custom score provider.

Prerequisites:

1. It is expected that the reader is aware of the basic concepts of Lucene like Document, Indexing and Analyzing, tokens, terms and querying.
2. Reader should at minimum be acquainted with the use of the basic Lucene API objects like IndexReader, IndexWriter, Query, Directory etc.
Code Samples

The code for this example can be found here.

Notes to set up and run the demo program

1. Download the source code. The code uses the latest version of Lucene as of this writing: 4.6.
2. Run mvn package, which will generate the JAR boost-imaginea-demo-1.0.jar.
3. Place this JAR along with the following JARs in a folder, say “C:\Imaginea-Boost-Demo”.
         a. lucene-analyzers-common-4.6.0.jar
         b. lucene-core-4.6.0.jar
         c. lucene-queryparser-4.6.0.jar
         d. lucene-queries-4.6.0.jar
4. The program usage is as below,

Param 1: Type of scoring:
customscorequery — Custom Score Query and Custom Score Provider Demo

[Image: a white Mahindra Scorpio; copyright Wikimedia and the original uploader.]

I must admit, I am obsessed with SUVs (affording them under the Indian tax regime is another thing though) and wish to sneak them into my technical blog pursuits as well. I will reuse my previous examples of SUVs boosted on white colour and origin. You may proceed to the technical content below after you are done ogling at the white Scorpio above in all its beauty.

It sometimes becomes necessary to score documents individually at query time. We saw in the previous post how to achieve query time boosting by assigning a higher score to a specific data set in the query. But what if you have a lot of scoring logic to run on top of the data you encounter while querying? It may not be possible to specify all this logic in the query itself. This is where a custom score query comes in: in conjunction with a custom score provider, it provides a neat way to plug in our custom scoring logic. Better still, Lucene neatly hands your custom code the scores it calculated itself, which you can further manipulate to produce a final score, or pass on to the superclass so that it considers your manipulated inputs in its own calculation.

Let’s write a custom score query now, shall we? The class you write should extend CustomScoreQuery.

import org.apache.lucene.index.AtomicReaderContext;
import org.apache.lucene.queries.CustomScoreProvider;
import org.apache.lucene.queries.CustomScoreQuery;
import org.apache.lucene.search.Query;

public class ImagineaDemoCustomScoreQuery extends CustomScoreQuery {

    public ImagineaDemoCustomScoreQuery(Query subQuery) {
        super(subQuery);
    }

    @Override
    public CustomScoreProvider getCustomScoreProvider(final AtomicReaderContext atomicContext) {
        return new ImagineaDemoCustomScoreProvider(atomicContext);
    }
}

That’s it. We have just written a custom score query and overridden a method which in turn hands out a custom score provider. Now, let’s write our own custom score provider and fit the pieces together.

import java.io.IOException;

import org.apache.lucene.document.Document;
import org.apache.lucene.index.AtomicReader;
import org.apache.lucene.index.AtomicReaderContext;
import org.apache.lucene.queries.CustomScoreProvider;

public class ImagineaDemoCustomScoreProvider extends CustomScoreProvider {

    // Per-segment reader; an instance field, since one provider is created per segment
    private final AtomicReader atomicReader;

    public ImagineaDemoCustomScoreProvider(AtomicReaderContext context) {
        super(context);
        atomicReader = context.reader();
    }

    @Override
    public float customScore(int doc, float subQueryScore, float valSrcScore)
            throws IOException {
        Document docAtHand = atomicReader.document(doc);
        String[] itemOrigin = docAtHand.getValues("originOfItem");
        boolean originIndia = false;
        for (int counter = 0; counter < itemOrigin.length; counter++) {
            if (itemOrigin[counter] != null
                    && itemOrigin[counter].equalsIgnoreCase("India")) {
                originIndia = true;
                break;
            }
        }
        // Boost documents whose origin is India; leave the rest at 1.0f
        if (originIndia) {
            return 3.0f;
        }
        return 1.0f;
    }
}

The custom score provider is in place too. In the overridden customScore method, we access each individual document and check whether the pertinent SUV has its origin in India. Such documents are boosted to a score of 3.0f whilst the others are left with a score of 1.0f. Now that we have the custom score query and the custom score provider in place, let’s write the code which employs them to provide customized scoring.

 

IndexReader idxReader = DirectoryReader.open(ramDirectory);
IndexSearcher idxSearcher = new IndexSearcher(idxReader);
Query queryToSearch = new QueryParser(Version.LUCENE_46, "itemType", analyzer)
        .parse(queryToRun);

CustomScoreQuery customQuery = new ImagineaDemoCustomScoreQuery(queryToSearch);

ScoreDoc[] hitsTop = idxSearcher.search(customQuery, 10).scoreDocs;

Note that the constructor for our custom query accepts a query as a parameter. Internally, Lucene runs the query, calculates the score, and for each document encountered calls the customScore method of our custom score provider class, allowing us to manipulate the score.
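Returning a flat 3.0f, as the demo does, discards the score Lucene computed for the document. If you would rather keep Lucene’s relevance calculation and merely nudge it, you can build on the subQueryScore handed in. A minimal sketch of my own (not part of the demo code):

@Override
public float customScore(int doc, float subQueryScore, float valSrcScore)
        throws IOException {
    Document docAtHand = atomicReader.document(doc);
    // Multiply Lucene's own score instead of replacing it outright
    if ("India".equalsIgnoreCase(docAtHand.get("originOfItem"))) {
        return subQueryScore * 3.0f;
    }
    return subQueryScore;
}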

Now, let us run the example for ourselves and see some sample data of how this works.

The command to be used is as below,

C:\Imaginea-Boost-Demo>java -cp boost-imaginea-demo-1.0.jar;lucene-analyzers-common-4.6.0.jar;lucene-core-4.6.0.jar;lucene-queryparser-4.6.0.jar;lucene-queries-4.6.0.jar com.imaginea.scoring.ScoringExamples customscorequery

 

In the example code a simple query is first run without any custom scoring, and all the documents get the same score of 0.8. Using our custom score query, all the SUVs from India are boosted to the top with a score of 3.0f each. Simple, isn’t it?

Understanding Lucene Boosting – Part 1

Lucene is one of the most popular open source search tools, offering high scalability, robustness and versatility; many an entire enterprise search server/engine has been built around it, and Solr and Elasticsearch come immediately to mind. Twitter, with its humongous data volumes and scalability requirements, has a search architecture built around a customized Lucene version. Lucene offers excellent real world functionality like hit highlighting, spell checking, tokenizing and analyzing, but one of its most powerful and oft-used features is boosting. Most well designed websites today offer some degree of search functionality, ranging from searching plain text content within the site to specific content hidden inside binary documents. Lucene, in conjunction with many other plugins/tools, plays a big part in this.

 

So what is boosting anyway?

 

One of the real world features mentioned in the paragraph above is the concept of boosting; you might have inadvertently experienced it in your searches on some website out there. A good example is Google itself, where some search results are shown boosted to the top (or to a place which catches your attention) because they come from a sponsored source. In essence, Google has boosted the sponsored search result to the top to bring it to prominence. A well designed search interface provides the ability to adapt to user input and modify the search results accordingly, say when drilling down or choosing from among a first of equals. This is where boosting can play a big part, and it therefore becomes important to understand boosting in its entirety. Given the complicated inner workings of how Lucene gets boosting to work, it is better to understand this in phases. With that in mind I present to you the first of a 3 part series on what boosting is and how it works. A quick glance at what the three part series has to offer,

1. Part 1 (current article) – What is boosting? The different types of boosting and a quick look into some of the underlying concepts like scoring and norms.

2. Part 2 – A deeper look into scoring, with special focus on customizing the scoring to our needs. This part will be further broken up into individual pieces covering topics such as custom query implementations, custom score providers, scoring using expressions etc.

3. Part 3 – Lucene by default uses a combination of the TF/IDF vector space and Boolean models for scoring purposes. There are many other models apart from the default one used by Lucene, which will be looked into in this part. This part will complement and drill deeper into the areas covered in part two.

So let’s get started with part 1, but first a quick look into the prerequisites and the code that comes along with this article.

 

Prerequisites:
1. It is expected that the reader is aware of the basic concepts of Lucene like Document, Indexing and Analyzing, tokens, terms and querying.
2. Reader should at minimum be acquainted with the use of the basic Lucene API objects like IndexReader, IndexWriter, Query, Directory etc.

 

Code samples:
Presented here is the example code to be used in conjunction with this article to understand the topic at hand. The code demonstrates the two types of boosting in Lucene (index time and query time) and also prints out the various scoring information associated with the results. The code is in the form of a Maven project and uses a RAMDirectory for ease of use.

 

Notes to set up and run the demo program,
1. Download the source code. The code uses the latest version of Lucene as of this writing: 4.6.
2. Run mvn package, which will generate the JAR boost-imaginea-demo-1.0.jar.
3. Place this JAR along with the following JARs in a folder, say “C:\Imaginea-Boost-Demo”.
         a. lucene-analyzers-common-4.6.0.jar
         b. lucene-core-4.6.0.jar
         c. lucene-queryparser-4.6.0.jar
         d. lucene-queries-4.6.0.jar
4. The program usage is as below,

 

Param 1: Type of boost:

index – Index Time Boosting

query – Query Time Boosting

both – Demo both Index and Query boosting

Param 2: Print scoring info: Either true or false

 

5. Example commands are as below,
C:\Imaginea-Boost-Demo>java -cp boost-imaginea-demo-1.0.jar;lucene-analyzers-common-4.6.0.jar;lucene-core-4.6.0.jar;lucene-queryparser-4.6.0.jar com.imaginea.boost.BoostExamples index false

 

C:\Imaginea-Boost-Demo>java -cp boost-imaginea-demo-1.0.jar;lucene-analyzers-common-4.6.0.jar;lucene-core-4.6.0.jar;lucene-queryparser-4.6.0.jar com.imaginea.boost.BoostExamples query false

 

C:\Imaginea-Boost-Demo>java -cp boost-imaginea-demo-1.0.jar;lucene-analyzers-common-4.6.0.jar;lucene-core-4.6.0.jar;lucene-queryparser-4.6.0.jar com.imaginea.boost.BoostExamples both false

First up in this article we need to pay a visit to the very important concepts of scoring and information retrieval models, an understanding of which lays a good foundation for how boosting works beneath the hood.

 

Scoring:

You would most certainly have run into scoring in your routine Lucene search queries; after all, Lucene sorts the query results based on their “score” if you don’t specify any sorting criteria. Every document has a score indicating how relevant it is to the search query specified. Lucene assigns a score to every document brought up by the search after running some number crunching (more on that later in this article) and presents the results sorted on this score, highest first. The scoring process begins the moment the query has been processed and submitted to the IndexSearcher object. The first set of documents is retrieved from the search by means of a Boolean model (see information retrieval models below), which basically checks whether the document at hand contains the term/token or not. Once this basic subset of documents has been retrieved from the index, the scoring process begins, assigning a score to each document in the subset. It is by manipulating the score attached to a given document that it is possible to selectively elevate the score of a subset of documents and boost them to the top of the search results.

 

Information Retrieval Model:    

Now, to understand how the scoring process crunches numbers and assigns a score to each document, we need to bring in the concept of information retrieval models. The theoretical world of information retrieval is rife with models which deal with coming up with information relevant to a search query. When Lucene started out, only the Boolean and the vector space models were implemented in it. The vector space model is still the default Lucene model, but the first subset of documents returned by the search, before they are scored, always comes through the Boolean model, which checks for the presence of the search tokens in the documents. More recent versions of Lucene have had more information retrieval models added to them. The complete list is as below,

1. Vector Space Model

2. Probabilistic relevance models. There are many flavours of these, like DFR (Divergence From Randomness) and BM25.

3. Language Models.

As mentioned earlier, Lucene by default uses the vector space model. Lucene permits changing the model used for scoring by means of the Similarity class. We will look at changing and implementing custom scoring and information models in parts 2 and 3 respectively. For now, refer to this link to understand how Lucene implements the vector space model.
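As a small preview of part 3, switching the model is just a matter of setting a Similarity on both the writing and the searching side. A minimal sketch (my illustration, assuming the analyzer and idxReader objects used elsewhere in this article):

import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.similarities.BM25Similarity;
import org.apache.lucene.util.Version;

IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_46, analyzer);
config.setSimilarity(new BM25Similarity());      // used while writing norms at index time

IndexSearcher idxSearcher = new IndexSearcher(idxReader);
idxSearcher.setSimilarity(new BM25Similarity()); // used while scoring queries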

 

Different types of boosting:

Lucene supports two types of boosting, they are as below,

1. Index time boosting.

2. Query time boosting.

Index time boosting earlier comprised both field boosting and boosting of the document as a whole; the latter was discarded in later versions due to its irrelevance and other associated issues, so index time boosting is now possible only at the field level. Let us delve into both these types in greater depth.

 

Index Time Boosting:

You would have come across the following type in the Field.Index enum (deprecated now, starting with version 4): ANALYZED_NO_NORMS. Note the term “NORM”, which is relevant in the context of index time boosting. More on it in a short while, but first to define index time boosting: it is programmatically setting the boost of a field or fields (and thus impacting the score of the overall document) at the time of indexing. However, you are not actually setting the score here; the score is dependent on a lot of factors (for example, the tokens in the query itself add to the score), so what is being set is a number against a field which plays a part in the calculation of the score for a given query. This is where the norm comes into play. The norm, short for normalized value, is that one number stored against the field which affects the document’s score and thus its position in the search result pecking order. Norm values are added to the index, and this can potentially (again, potentially) speed up scoring at query time.
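In Lucene 4.x, the intent of the deprecated ANALYZED_NO_NORMS is expressed through a FieldType instead. A minimal sketch (my illustration, not from the demo code); note that with norms omitted, index time boosts on that field will not apply:

import org.apache.lucene.document.Field;
import org.apache.lucene.document.FieldType;

FieldType analyzedNoNorms = new FieldType();
analyzedNoNorms.setIndexed(true);    // index the field
analyzedNoNorms.setTokenized(true);  // run it through the analyzer
analyzedNoNorms.setOmitNorms(true);  // skip norms: smaller index, no field boosts
analyzedNoNorms.freeze();

Field itemTypeField = new Field("itemType", "SUV", analyzedNoNorms);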

When should I use index time boosting?

This pretty much depends on the business scenario at hand. For those scenarios where you know beforehand which subset of documents needs to be boosted, index time boosting comes in useful. Let us take a real world example: say you have a shopping site selling cars, with visitors from around the world, and it is required that the search results for cars be boosted towards the country of the currently logged-in user; say, boost all products based in India for users whose current address country is India.

Let us go ahead and add some documents to the index,

public void populateIndex() {
    try {
        System.out.println(printBoostTypeInformation());
        indexWriter = new IndexWriter(ramDirectory, config);
        boostPerType("Lada Niva", "Brown", "2000000", "Russia", "SUV");
        boostPerType("Tata Aria", "Red", "1600000", "India", "SUV");
        boostPerType("Nissan Terrano", "Blue", "2000000", "Japan", "SUV");
        boostPerType("Mahindra XUV500", "Black", "1600000", "India", "SUV");
        boostPerType("Ford Ecosport", "White", "1000000", "USA", "SUV");
        boostPerType("Mahindra Thar", "White", "1200000", "India", "SUV");
        indexWriter.close();
    } catch (IOException | NullPointerException ex) {
        System.out.println("Something went wrong in this sample code -- "
                + ex.getLocalizedMessage());
    }
}

protected void boostPerType(String itemName, String itemColour,
        String itemPrice, String originOfItem, String itemType)
        throws IOException {
    Document docToAdd = new Document();
    docToAdd.add(new StringField("itemName", itemName, Field.Store.YES));
    docToAdd.add(new StringField("itemColour", itemColour, Field.Store.YES));
    docToAdd.add(new StringField("itemPrice", itemPrice, Field.Store.YES));
    docToAdd.add(new StringField("originOfItem", originOfItem, Field.Store.YES));

    TextField itemTypeField = new TextField("itemType", itemType, Field.Store.YES);
    docToAdd.add(itemTypeField);
    // Boost items made in India
    if ("India".equalsIgnoreCase(originOfItem)) {
        itemTypeField.setBoost(2.0f);
    }
    indexWriter.addDocument(docToAdd);
}

 


The cars have been added to the index in a random order. Notice these particular lines of code in the method boostPerType,

// Boost items made in India
if ("India".equalsIgnoreCase(originOfItem)) {
    itemTypeField.setBoost(2.0f);
}

Here, the field originOfItem is matched against the text “India” and, when it matches, a boost of 2.0f is assigned to the document’s itemType field. Let us write a query which just does a term search for “suv” against the itemType field. The query would be as below,

itemType:suv

The code which performs the search is as below,

public void searchAndPrintResults() {
    try {
        IndexReader idxReader = DirectoryReader.open(ramDirectory);
        IndexSearcher idxSearcher = new IndexSearcher(idxReader);
        Query queryToSearch = new QueryParser(Version.LUCENE_46, "itemType",
                analyzer).parse(getQueryForSearch());

        System.out.println(queryToSearch);
        TopScoreDocCollector collector = TopScoreDocCollector.create(10, true);
        idxSearcher.search(queryToSearch, collector);
        ScoreDoc[] hitsTop = collector.topDocs().scoreDocs;
        System.out.println("Search produced " + hitsTop.length + " hits.");
        System.out.println("----------");
        for (int i = 0; i < hitsTop.length; ++i) {
            int docId = hitsTop[i].doc;
            Document docAtHand = idxSearcher.doc(docId);
            System.out.println(docAtHand.get("itemName") + "\t"
                    + docAtHand.get("originOfItem") + "\t"
                    + docAtHand.get("itemColour") + "\t"
                    + docAtHand.get("itemPrice") + "\t"
                    + docAtHand.get("itemType"));

            if (printExplanation) {
                Explanation explanation = idxSearcher.explain(queryToSearch,
                        hitsTop[i].doc);
                System.out.println("----------");
                System.out.println(explanation.toString());
                System.out.println("----------");
            }
        }
    } catch (IOException | ParseException ex) {
        System.out.println("Something went wrong in this sample code -- "
                + ex.getLocalizedMessage());
    } finally {
        ramDirectory.close();
    }
}

 
Let us take a look at the results; you can also try this in the demo code by running the following command,

C:\Imaginea-Boost-Demo>java -cp boost-imaginea-demo-1.0.jar;lucene-analyzers-common-4.6.0.jar;lucene-core-4.6.0.jar;lucene-queryparser-4.6.0.jar com.imaginea.boost.BoostExamples index false

The output would be as below; notice that all the documents with the country India have been boosted.

[Screenshot: index time boosting results, without scoring explanation]

We can actually take a look at how Lucene calculated the score for our query result documents: the boosted cars with origin India have a score much higher than the others. You can also try this in the demo code by running the following command,

C:\Imaginea-Boost-Demo>java -cp boost-imaginea-demo-1.0.jar;lucene-analyzers-common-4.6.0.jar;lucene-core-4.6.0.jar;lucene-queryparser-4.6.0.jar com.imaginea.boost.BoostExamples index true

 
[Screenshot: index time boosting results, with scoring explanation]

It is seen that the boosted India origin cars have a higher score than the ones not boosted: 1.69 > 0.8.

Query Time Boosting

We noted that in index time boosting, the normalized value is assigned to a field and later used in calculating the score at the time of querying. In query time boosting, the boost value is directly specified at the time of querying. You can do this using the setBoost method of the various query objects, or directly in the query itself.
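As an aside, the programmatic route looks like the below; a minimal sketch of my own (not part of the demo code), mirroring the query that follows:

import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause.Occur;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.TermQuery;

BooleanQuery queryToSearch = new BooleanQuery();
TermQuery whiteColour = new TermQuery(new Term("itemColour", "white"));
whiteColour.setBoost(2.0f); // the programmatic equivalent of "^2"
queryToSearch.add(whiteColour, Occur.SHOULD);
queryToSearch.add(new TermQuery(new Term("itemType", "suv")), Occur.SHOULD);

Let us look at an example of the query string route using the same data set of cars. There is a slight change in requirement though: it is now required that the cars of white colour are boosted to the top of the search results. The query would be as below,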

itemColour:white^2 OR itemType:suv

 


Note the text “^2” which immediately follows the term itemColour:white. Here we have asked that documents whose colour is white be assigned a higher rank and thus boosted. Let us take a look at the results; you can also try this in the demo code by running the following command,

C:\Imaginea-Boost-Demo>java -cp boost-imaginea-demo-1.0.jar;lucene-analyzers-common-4.6.0.jar;lucene-core-4.6.0.jar;lucene-queryparser-4.6.0.jar com.imaginea.boost.BoostExamples query false

 
[Screenshot: query time boosting results, without scoring explanation]

When should I use query time boosting?

When you require the search results to be driven by user input, or when you need to bring in specific boosts you could not know beforehand: for example, you look up an external service for sponsored cars and boost those in particular. You did not have this information up front and were thus unable to boost at index time.

Using the explain method of the searcher to understand what happens under the hood

In the example code above you would have noticed the following lines:

Explanation explanation = idxSearcher.explain(queryToSearch, hitsTop[i].doc);
System.out.println("----------");
System.out.println(explanation.toString());

 
The explain method of the IndexSearcher object is a powerful tool for understanding how Lucene calculated the score, and it is helpful in debugging as well.

 

—————————————————————————————————————-
Hope this part one was useful in understanding the basics of boosting. More on boosting coming up in parts 2 and 3. Please feel free to leave comments with feedback or corrections.

Interpolation Search: A search algorithm better than Binary Search

Although pros and cons go hand in hand, with some restrictions on the dataset to be searched, interpolation search gives better performance than traditional binary search.

Undoubtedly binary search is a great algorithm for searching, with an average running time complexity of O(log n). Binary search always looks at the middle of the dataset and chooses the first or the second half depending on the value at the middle and the key being looked for.

Interpolation search differs from binary search in that it resembles how humans search for a key in an ordered set, like a word in a dictionary. Suppose a person wants to find the word “Algorithm” in a dictionary; he already knows that he has to look near the beginning of the dictionary. In the same way, this algorithm keeps narrowing down the search space where the searched key might be. A constant is found using the formula

C = (x – min) / (max – min), where
x is the key being looked for,
max is the maximum value in the dataset, and
min is the minimum value in the dataset.

This constant C is used to narrow down the search space: the next probe is made a fraction C of the way into the current range. For example, searching for 70 in a uniformly distributed dataset ranging from 10 to 100 gives C = (70 – 10)/(100 – 10) ≈ 0.67, so the probe lands about two-thirds of the way in. For binary search the corresponding constant is always 1/2, since it probes the middle of the range regardless of the value of the key.

The average case running time of interpolation search is O(log log n). For this algorithm to give best results, the dataset should be ordered and uniformly distributed.

////////////// Interpolation Search /////////

#include <stdio.h>

/* Returns the index of searchItem in the sorted array items[0..size-1],
   or -1 if it is not present. */
int interpolationSearch(int items[], int size, int searchItem)
{
    int low = 0;
    int high = size - 1;
    int mid;

    while (low <= high && searchItem >= items[low] && searchItem <= items[high]) {
        if (items[low] == items[high])   /* avoid division by zero below */
            return (items[low] == searchItem) ? low : -1;

        /* Estimate the probe position from the value of the key. */
        mid = low +
              (((searchItem - items[low]) * (high - low)) /
               (items[high] - items[low]));

        if (items[mid] < searchItem)
            low = mid + 1;
        else if (items[mid] > searchItem)
            high = mid - 1;
        else
            return mid;
    }

    return -1;
}

int main(void)
{
    /* sorted, uniformly distributed array */
    int list[] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
    /* returns the index of the searched item or -1 */
    int index = interpolationSearch(list, 10, 2);

    if (index >= 0)
        printf("Searched item is: %d\n", list[index]);
    else
        printf("Item not found\n");

    return 0;
}

References:
https://www.princeton.edu/~achaney/tmve/wiki100k/docs/Interpolation_search.html
http://www.cs.technion.ac.il/~itai/publications/Algorithms/p550-perl.pdf
http://programmers.stackexchange.com/questions/119703/interpolation-search-vs-binary-search
http://www.stoimen.com/blog/2012/01/02/computer-algorithms-interpolation-search/

MVC DatePicker the easiest way

Imagine how much your user experience will improve if you provide a date picker for picking dates in your applications. As you know, there are several jQuery plugins available which provide this functionality. But as an MVC developer you have to integrate the JavaScript with your code and write both .NET code and JavaScript code. We all know how tedious that can be.

Again, imagine being able to do this in just a few simple lines of .NET code without worrying about jQuery. Now check the code below:

@(Html.Juime().DatePicker("datePicker")
    .DataMap(item => {
        item.Value = DateTime.Today;
    })
)

 Here is the output of the above code:

As you see, with a few lines of code we were able to create a DatePicker control in our application. Juime makes this possible.

Juime is a free, open source set of controls built on top of the famous jQuery UI. With Juime you can bring jQuery UI controls into your project without writing any JavaScript code. Juime also provides several customizable configuration options for this date picker and the other controls.

Check our GitHub site for more details.

Juime Controls

GitHub Site

JQuery UI MVC

With the advent of ASP.NET MVC, the web development paradigm changed. We are now doing more client side coding than ever. Luckily jQuery helps us simplify the client side coding, but there is still a big learning curve, and this learning in tandem with MVC learning is a humongous task.

Our open source product, JQuery UI MVC Extensions (aka Juime), eliminates or reduces your jQuery coding without sacrificing rich web development, so that you can focus more on your strength: MVC development.

Our goal is to provide a simple yet efficient developer experience while creating complex UI elements. That is the reason we based Juime on top of the most popular jQuery UI components. In this blog I will show you how to easily create a tab in your view using the Razor syntax you already know.

@(Html.Juime().Tab("tabId")
    .Panels(panels => {
        panels.Add(panel => {
            panel.Header = "First Tab";
            panel.Action("Action", "Controller");
        });
        panels.Add(panel => {
            panel.Header = "Second Tab";
            panel.Ajax("AjaxAction", "AjaxController");
        });
    })
)
 

The above code demonstrates simple usage of a tab control. Juime gives the developer full control by providing several complex options and even client side JavaScript methods and events.

As you see, our syntax is simpler and easier to use than that of most commercially available products. For more information go to our GitHub page:

https://github.com/Imaginea/JUIME

Beginning with Spring Data JPA

In my previous post, I briefly wrote about JPA, a Java specification for accessing, persisting, and managing data between Java objects / classes and a relational database. In this post, I will talk about Spring Data JPA with a simple example.

Why Spring Data JPA?

The data access code which uses the Java Persistence API contains a lot of unnecessary boilerplate code. Even more boilerplate has to be written if we have to write dynamic queries or implement pagination.

Spring Data JPA is another layer of abstraction supporting the persistence layer in a Spring context. The goal of the Spring Data repository abstraction is to significantly reduce the amount of boilerplate code required to implement data access layers for various persistence stores. We just have to write repository interfaces, including custom finder methods, and Spring provides the implementation automatically.

Defining repository interfaces

As a first step we define a domain-class-specific repository interface. The interface must extend Repository and be typed to the domain class and an ID type. If you want to expose CRUD methods for that domain type, extend CrudRepository instead of Repository; it provides sophisticated CRUD functionality for the entity class being managed. We can also add custom query methods to this repository interface, as illustrated after the note below.

public interface EmployeeRepository extends CrudRepository<Employee, Integer> {
}

Note: Employee entity class is defined in previous post.
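As an illustration of such custom finder methods (a hypothetical extension of the interface above, not part of the downloadable example), Spring Data derives the query from the method name, so no implementation is required:

import java.util.List;

import org.springframework.data.repository.CrudRepository;

public interface EmployeeRepository extends CrudRepository<Employee, Integer> {

    // Derived query: fetch all employees of a given department
    List<Employee> findByDept(String dept);

    // Derived query with a keyword modifier on the lastName property
    List<Employee> findByLastNameIgnoreCase(String lastName);
}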

XML configuration

Each Spring Data module includes a repositories element that allows you to simply define a base package that Spring scans for you. Spring bean configuration is used to set up the JPA vendor and the data source for creating the entity manager.

<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:jpa="http://www.springframework.org/schema/data/jpa"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
                           http://www.springframework.org/schema/beans/spring-beans.xsd
                           http://www.springframework.org/schema/data/jpa
                           http://www.springframework.org/schema/data/jpa/spring-jpa.xsd">

    <jpa:repositories base-package="com.coderevisited.spring"/>

    <bean id="dataSource" class="org.springframework.jdbc.datasource.DriverManagerDataSource">
        <property name="driverClassName" value="com.mysql.jdbc.Driver"/>
        <property name="url" value="jdbc:mysql://localhost/test"/>
        <property name="username" value="root"/>
        <property name="password" value="password"/>
    </bean>

    <bean id="jpaVendorAdapter" class="org.springframework.orm.jpa.vendor.HibernateJpaVendorAdapter">
        <property name="showSql" value="true"/>
        <property name="generateDdl" value="true"/>
        <property name="database" value="MYSQL"/>
    </bean>

    <bean id="entityManagerFactory" class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
        <property name="dataSource" ref="dataSource"/>
        <property name="jpaVendorAdapter" ref="jpaVendorAdapter"/>
        <!-- spring based scanning for entity classes-->
        <property name="packagesToScan" value="com.coderevisited.spring"/>
    </bean>

    <bean id="transactionManager" class="org.springframework.orm.jpa.JpaTransactionManager"/>

</beans>

Querying Repository

We can execute query methods from the application class by referring to the repository bean.

public class EmployeeTest {

    private static EmployeeRepository repository;

    public static void main(String[] args) {
        AbstractApplicationContext context = new ClassPathXmlApplicationContext("spring-config.xml");
        repository = context.getBean(EmployeeRepository.class);

        createEmployee(22, "Saint", "Peter", "Engineering");
        createEmployee(23, "Jack", " Dorsey", "Imaginea");
        createEmployee(24, "Sam", "Fox", "Imaginea");

        context.close();

    }

    private static void createEmployee(int id, String firstName, String lastName, String dept) {

        Employee emp = new Employee(id, firstName, lastName, dept);
        repository.save(emp);
    }

}

The above application code, when executed, inserts 3 rows into the Employee table in the test database.

The code for this example can be downloaded from here

Beginning with JPA 2.0

The Java Persistence API (JPA) is a Java specification for accessing, persisting, and managing data between Java objects/classes and a relational database. JPA allows POJOs (Plain Old Java Objects) to be easily persisted, and allows an object’s object-relational mappings to be defined through standard annotations or XML describing how the Java class maps to a relational database table.

JPA also defines a runtime EntityManager API for processing queries and transactions on the objects against the database, and an object-level query language, JPQL, to allow querying of the objects from the database.
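For a taste of JPQL, the snippet below selects all employees of one department; a minimal sketch (my illustration, assuming an open EntityManager em and the Employee entity defined later in this post):

import java.util.List;

// JPA translates this JPQL to SQL against the mapped Employee table
List<Employee> engineers = em.createQuery(
        "SELECT e FROM Employee e WHERE e.dept = :dept", Employee.class)
        .setParameter("dept", "Engineering")
        .getResultList();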

Why JPA?

It is a standard for the management of persistence and object/relational mapping in Java EE and Java SE. JPA supports large data sets, data consistency, concurrent use, and the query capabilities of JDBC. Like object-relational software and object databases, JPA allows the use of advanced object-oriented concepts such as inheritance. JPA avoids vendor lock-in by relying on a strict specification. It focuses on relational databases, and it is extremely easy to use.

Currently most persistence vendors have released implementations of JPA, confirming its adoption by the industry and users. Implementations include Hibernate, TopLink, Kodo JDO, CocoBase and JPOX.

Java Persistence consists of four areas

Entities

In the object oriented paradigm, an entity is a lightweight persistence domain object. Typically, an entity represents a table in a relational database, and each entity instance corresponds to a row in that table. Changing values on an entity instance works just like changing values on any other class instance; the difference is that you can persist those changes in the database.

The primary programming artifact of an entity is the entity class, although entities can use helper classes. The persistent state of an entity is represented through either persistent fields or persistent properties. These fields or properties use object/relational mapping annotations to map the entities and entity relationships to the relational data in the underlying data store.

Characteristics of an Entity

  1. Persistability
    • An entity is persistable since it can be saved to a persistence store, but this doesn’t happen automatically; it has to be invoked through the API. This is important because it leaves control over persistence with the application.

  2. Identity
    • The identifier is the key that uniquely identifies an entity instance and distinguishes it from all other instances of the same entity type.

  3. Transactionality
    • Changes made to the database either succeed or fail atomically, so the persistent view of an entity should indeed be transactional.

  4. Granularity
    • Entities are meant to be fine-grained objects and they are business domain objects that have specific meaning to the application that accesses them.

A typical entity class with annotation metadata

@Entity
@Table(name = "Employee")
public class Employee {

    @Id
    @Column(name = "id")
    private int id;

    @Column(name = "firstName")
    private String firstName;

    @Column(name = "lastName")
    private String lastName;

    @Column(name = "dept")
    private String dept;

    public Employee(){

    }

    public Employee(int id, String firstName, String lastName, String dept){
        this.setId(id);
        this.setFirstName(firstName);
        this.setLastName(lastName);
        this.setDept(dept);
    }

    public int getId() {
        return id;
    }

    public void setId(int id) {
        this.id = id;
    }

    public String getFirstName() {
        return firstName;
    }

    public void setFirstName(String firstName) {
        this.firstName = firstName;
    }

    public String getLastName() {
        return lastName;
    }

    public void setLastName(String lastName) {
        this.lastName = lastName;
    }

    public String getDept() {
        return dept;
    }

    public void setDept(String dept) {
        this.dept = dept;
    }
}

Managing Entities

 

Entities are managed by the entity manager. Each EntityManager instance is associated with a persistence context: a set of managed entity instances that exist in a particular data store. A persistence context defines the scope under which particular entity instances are created, persisted, and removed. The EntityManager interface defines the methods that are used to interact with the persistence context.

EntityManager is implemented by a particular persistence provider. It is the provider that supplies the backing implementation engine for the entire Java Persistence API, from the EntityManager through to the implementation of the query classes and SQL generation.

 

Illustration of Entity manager

public class EmployeeTest {

    private static EntityManager em;

    public static void main(String[] args) {

        EntityManagerFactory emf = Persistence.createEntityManagerFactory("EmployeeService");
        em = emf.createEntityManager();

        createEmployee(1, "Saint", "Peter", "Engineering");
        createEmployee(2, "Jack", " Dorsey", "Imaginea");
        createEmployee(3, "Sam", "Fox", "Imaginea");

        // Close the entity manager and factory to release resources
        em.close();
        emf.close();
    }

    private static void createEmployee(int id, String firstName, String lastName, String dept) {
        em.getTransaction().begin();
        Employee emp = new Employee(id, firstName, lastName, dept);
        em.persist(emp);
        em.getTransaction().commit();
    }
}

Persistence.xml

I have used the Hibernate provider, and added the persistence.xml file to the META-INF directory.
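A minimal persistence.xml could look like the below (a sketch on my part: the persistence unit name must match the one passed to createEntityManagerFactory, the MySQL settings mirror those in the Spring Data post, and the entity package is assumed):

<?xml version="1.0" encoding="UTF-8"?>
<persistence xmlns="http://java.sun.com/xml/ns/persistence" version="2.0">
    <persistence-unit name="EmployeeService" transaction-type="RESOURCE_LOCAL">
        <provider>org.hibernate.ejb.HibernatePersistence</provider>
        <!-- Assumed package; list your Employee entity class here -->
        <class>com.coderevisited.jpa.Employee</class>
        <properties>
            <property name="javax.persistence.jdbc.driver" value="com.mysql.jdbc.Driver"/>
            <property name="javax.persistence.jdbc.url" value="jdbc:mysql://localhost/test"/>
            <property name="javax.persistence.jdbc.user" value="root"/>
            <property name="javax.persistence.jdbc.password" value="password"/>
            <property name="hibernate.hbm2ddl.auto" value="update"/>
        </properties>
    </persistence-unit>
</persistence>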

 

Maven dependencies

 

org.eclipse.persistence:javax.persistence:2.0.0
org.hibernate:hibernate-entitymanager:4.2.8.Final
mysql:mysql-connector-java:5.1.27

 

Project Structure in IntelliJ IDEA

Asgard with jclouds

Asgard is a Grails web application developed by Netflix for managing the cloud and deploying applications. It provides additional features that the AWS dashboard doesn’t offer directly.

jclouds is an open source project for creating and controlling cloud resources with a single interface for most cloud providers, abstracting features common to all providers as compute and blobstore services while also providing cloud provider specific features. Provisioning with jclouds makes it simple to switch cloud providers easily.

Why Asgard with jclouds ?

Looking at the additional features Asgard adds on top of AWS, it would be a great cloud provisioner/manager if we could leverage the application and cluster model used in Asgard with other cloud providers, with minimal configuration.

So changing the service layer to use jclouds as the cloud provisioner adds benefits like cloud provider portability.

Netflix’s Asgard has a few assumptions which need to be changed to make it more generic for cloud provider portability.

Asgard re-engineering with jclouds

What’s  done so far…

Setting up Asgard with jclouds

Check out the code from the GitHub repository.

Build the war using the Grails war plugin and deploy it in a server.

Add the user name and API key for your cloud providers (access key and secret key for AWS and OpenStack EC2). An endpoint is also needed for OpenStack.

Challenges/Work Ahead

The jclouds abstraction is still evolving and a lot of features still need to be added to it. At present most of the other cloud providers lag behind AWS; as these providers catch up with AWS, more features may be abstracted in jclouds.

What’s TypeScript

TypeScript’s definition from Microsoft:

TypeScript is a typed superset of JavaScript that compiles to plain JavaScript and is a language for application-scale JavaScript development.

TypeScript is not a new language, but a thin layer on top of existing JavaScript, strengthening it with tooling, IDE services and refactoring. It makes it easy to embrace JavaScript while building enterprise applications using today’s technologies (see my blog on this). Thanks to Microsoft, everyone can now build scalable, large JavaScript applications targeting all browsers and platforms.

TypeScript is a superset of JavaScript which starts and ends with JavaScript. All your existing JavaScript is TypeScript. Unlike other transcompilers, like CoffeeScript, TypeScript is syntactic sugar built on top of JavaScript to support large applications and teams. By building on JavaScript, TypeScript lets your JavaScript development work at a very high level or close to the metal.

TypeScript is not a replacement for JavaScript, and it is not about better performance or optimization. It only aids in writing, maintaining and verifying code. TypeScript estimates what will happen at runtime and performs static checking while you write code, protecting you from unwanted side effects and dramatically increasing your productivity. It also incorporates design patterns and best practices, and provides a mechanism for documentation and statement completion for most of your favorite JavaScript libraries.

The interesting thing about TypeScript is that its compiler and tooling support are themselves written in TypeScript. With its local and non-intrusive code generation, TypeScript works wherever JavaScript works: browsers, Node, the cloud, Windows 8, etc.

In this blog I outline the important features and tooling supported by TypeScript. In my next blog I delve into these in detail.

TypeScript Features:

The feature set of TypeScript is similar to that of an object-oriented language such as C# or VB.NET. With this feature set, Microsoft is aiming for easy adoption by JavaScript developers.

Type System

TypeScript, as the name indicates, provides the ability to define types in JavaScript. With static types, TypeScript provides syntax highlighting and helps you identify bugs before even running the code. Static types are supported for both variables and parameters. The type system is optional and only serves to aid writing code.

TypeScript supports all the primitive types such as number, string, boolean and null. It also supports complex types such as DOM elements, custom types such as jQuery elements, and a special type called any.

Class

Similar to other object-oriented languages, TypeScript includes class syntax, making .NET and other object-oriented developers feel at home and letting them easily group related functions and variables within one container. As in other programming languages, classes help with code abstraction, inheritance, reusability and maintainability.

TypeScript classes align with the ECMAScript 6 proposal which adds classes to JavaScript.

Properties

TypeScript provides properties using get and set accessor declarations. These properties are similar to those of C# or VB.NET.

Methods

Method support is similar to that of JavaScript and other languages. TypeScript uses the prototype to wire up method calls in the generated JavaScript.

Interface

Interfaces help provide consistency across modules and teams. Interfaces also help document custom or external JavaScript libraries.

Inheritance

Similar to OO languages, TypeScript provides inheritance. The syntax is similar to Java’s, with the extends and super keywords.

Modules

Modules are similar to the namespace concept in .NET. They wrap classes in a naming container, help to organize classes and modules, and avoid naming collisions. They also provide a mechanism to expose classes or keep them internal.

Accessibility

TypeScript provides accessibility options for classes and their members. Class members’ accessibility can be set using public and private, and a class’s accessibility can be restricted or allowed using the export keyword.

Open ended

Both modules and classes are open ended, meaning you can define them in more than one place and the parts all belong to the same class or module. You can think of this as something like a partial class.

 

TypeScript Tooling:

In order to build enterprise-scale JavaScript applications, robust tooling is needed. TypeScript provides such tooling in Visual Studio 2012 and Visual Studio 2013. TypeScript tooling is not limited to Visual Studio; Eclipse, WebStorm, Sublime Text, Emacs and Vim are also supported, but Visual Studio provides the best developer experience.

Below is some of the tooling support available in Visual Studio:

Microsoft is planning to add more tooling support, such as a split screen view and generated code grouping, in their next releases.

Try Out TypeScript

Microsoft designed a playground where you can try out TypeScript and its capabilities, including some samples, before deciding to use it in your project.

http://www.typescriptlang.org/Playground

 

With so many features and tools, TypeScript is a must for application development; not having it in your armour is too expensive in time.