Wall Street Journal 2009 Technology Awards

Date Difference in MS SQL Server

When you need to quickly find the difference between two dates, you can use the following SQL query, which returns the difference in seconds:

SELECT DATEDIFF(ss, '2009-09-15 10:09:43.000', '2009-09-15 10:10:07.000')
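
For comparison, the same difference can be computed on the application side. The following is a minimal Java sketch, assuming Java 8 or later; the names DateDiffExample and secondsBetween are made up for illustration and are not part of SQL Server or any library.

```java
import java.time.Duration;
import java.time.LocalDateTime;

class DateDiffExample {

    // Returns the number of whole seconds from start to end.
    // Both arguments use the ISO-8601 form, e.g. 2009-09-15T10:09:43.
    static long secondsBetween(String start, String end) {
        LocalDateTime from = LocalDateTime.parse(start);
        LocalDateTime to = LocalDateTime.parse(end);
        return Duration.between(from, to).getSeconds();
    }

    public static void main(String[] args) {
        // Same two timestamps as in the SQL query above.
        System.out.println(secondsBetween("2009-09-15T10:09:43", "2009-09-15T10:10:07")); // prints 24
    }
}
```

For these two timestamps the query above also returns 24.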

Removing blank lines in text file using Regular Expression

This tutorial explains how to remove blank lines from a text file using a regular expression. I have used the TextPad editor for this.

For example, the file below contains a lot of blank lines. We will see how to remove them using TextPad.

image001

Select Search -> Replace or F8 (keyboard shortcut)

image003

Select the option shown below, enter “^\n” (without the quotes) in the “Find what” field, and then click the “Replace All” button.

image005

This screenshot shows the final outcome of the text file.

image007

Final Note:

What does “^\n” mean?

In a regular expression, “^” matches the start of a line and “\n” matches the newline character at the end of a line. So “^\n” matches any line that consists of nothing but a newline character, that is, an empty line, and replacing it with nothing removes the line.
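
The same pattern can be tried outside TextPad. Below is a small Java sketch, assuming the file content is already in a String; BlankLineRemover and removeBlankLines are illustrative names. Two things differ from the TextPad pattern and are my own additions: the (?m) flag, which Java needs so that “^” matches at the start of every line, and the optional “\r”, which lets the pattern handle Windows line endings as well.

```java
import java.util.regex.Pattern;

class BlankLineRemover {

    // (?m) turns on multiline mode so that ^ matches at the start of every line.
    // \r?\n matches a Unix (\n) or Windows (\r\n) line ending.
    private static final Pattern EMPTY_LINE = Pattern.compile("(?m)^\\r?\\n");

    // Removes every line that contains nothing but its line ending.
    static String removeBlankLines(String text) {
        return EMPTY_LINE.matcher(text).replaceAll("");
    }

    public static void main(String[] args) {
        String text = "first\n\n\nsecond\n\nthird\n";
        // Prints the three words on consecutive lines.
        System.out.print(removeBlankLines(text));
    }
}
```

Lines that contain only spaces or tabs are not matched by this pattern and stay in the text.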

Application Data Quality and Data Management

As an application evolves through new features, upgrades, fulfillment activities, and business process changes, one of the biggest problems is data discrepancy between its various heterogeneous data sources.

Data quality starts to deteriorate as the system evolves, and this needs to be addressed with proper data management.

Some problems:
Cleaning up a large set of data is a difficult job.
Poor data quality can increase the number of customer service calls, and the company may slip into a reactive mode of addressing customer issues.

Apache – RESTful Web services

Apache Wink provides a framework for building RESTful web services.

http://incubator.apache.org/wink/

Download from the following location:

http://incubator.apache.org/wink/downloads.html

Java API for Cloud and Cloud Providers

Java API for Cloud

http://www.cloudloop.com/
http://dasein-cloud.sourceforge.net/

Cloud Providers

Amazon S3
Nirvanix
Eucalyptus Walrus
Rackspace CloudFiles
Sun Cloud
Microsoft Azure
EMC Atmos
Aspen
Diomede

Simple Sort Program in Java

Here is a simple sort program in Java. It sorts the integers in an array in ascending order by repeatedly moving the smallest remaining item to the front (a selection sort).

package beam;

public class SimpleSort {

    /* sorts the array in place (ascending) and prints the result */
    public void sortAndPrint(int[] iArr) {
        int minItem = 0;
        int minItemPos = -1;
        int newItem = 0;
        int iter = 0; /* counts how often the inner loop runs */

        for (int i = 0; i < iArr.length; i++) {
            minItem = iArr[i];
            minItemPos = -1;
            for (int j = i; j < iArr.length; j++) {
                newItem = iArr[j];
                iter++;
                /* if the new item is smaller than the current minimum, remember it */
                if (minItem > newItem) {
                    minItem = newItem;
                    minItemPos = j;
                }
            }

            /* a smaller item was found: swap it with the item at position i */
            if (-1 != minItemPos) {
                iArr[minItemPos] = iArr[i];
                iArr[i] = minItem;
            }
        }

        System.out.println("total iterations " + iter);
        printArray("after ", iArr);
    }

    /* method to print the array. you can use whatever you want. */
    public void printArray(String str, int[] iArr) {
        System.out.println("--------------(" + str + " )-----------------");
        System.out.print("[");
        for (int itemp = 0; itemp < iArr.length; itemp++) {
            System.out.print(iArr[itemp] + ",");
        }
        System.out.print("]");
        System.out.println("\n--------------(" + str + " )-----------------");
    }

    /**
     * @param args
     */
    public static void main(String[] args) {
        SimpleSort ss = new SimpleSort();
        int[] iArr = { 4, 5, 6, 2, 1 };
        ss.printArray("before", iArr);
        ss.sortAndPrint(iArr);

        int[] iArr1 = { 2, 3, 4, 5, 6 };
        ss.printArray("before", iArr1);
        ss.sortAndPrint(iArr1);

        int[] iArr2 = { -22, 1003, 2314, 1235, 132336 };
        ss.printArray("before", iArr2);
        ss.sortAndPrint(iArr2);

        int[] iArr3 = { 1, 2, 4, 3, 2, 1, 6, 3, 2, 4, 3, 5 };
        ss.printArray("before", iArr3);
        ss.sortAndPrint(iArr3);
    }

}
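
The “total iterations” figure printed by the program depends only on the array length: the inner loop runs n times for the first position, n - 1 times for the second, and so on, which adds up to n(n + 1)/2 for n items. The sketch below repeats the same loop structure in a static helper and compares the result with java.util.Arrays.sort. The names SelectionSortCheck and sortAndCount are illustrative and not part of the program above.

```java
import java.util.Arrays;

class SelectionSortCheck {

    // Sorts the array in place and returns how many times the inner loop body ran.
    static int sortAndCount(int[] arr) {
        int iterations = 0;
        for (int i = 0; i < arr.length; i++) {
            int minPos = i;
            for (int j = i; j < arr.length; j++) {
                iterations++;
                if (arr[j] < arr[minPos]) {
                    minPos = j;
                }
            }
            // Move the smallest remaining item to position i.
            int tmp = arr[i];
            arr[i] = arr[minPos];
            arr[minPos] = tmp;
        }
        return iterations;
    }

    public static void main(String[] args) {
        int[] data = { 4, 5, 6, 2, 1 };
        int[] expected = data.clone();
        Arrays.sort(expected);

        int iterations = sortAndCount(data);
        System.out.println(Arrays.toString(data) + " matches Arrays.sort: " + Arrays.equals(data, expected));
        System.out.println("iterations: " + iterations); // 5 * 6 / 2 = 15
    }
}
```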

Blog on Scalability / Storage solutions

Articles on Scalability / Storage solutions :

http://www.hfadeel.com/Blog/

Apache Lucene – First Tutorial

Download Lucene 2.4.1 and extract it to c:\downloads

Copy the luceneweb.war to the <Tomcat Installation>/webapps folder.

Indexing files: The default index location for the Lucene demo is /opt/lucene/index. Create this folder structure manually on your Windows/Linux box.

Windows:
c:\opt\lucene\index

For convenience, create a create_index.bat file with the following contents:

set LUCENE_EXT=C:\downloads\lucene-2.4.1
set CLASSPATH=%CLASSPATH%;%LUCENE_EXT%\lucene-core-2.4.1.jar;%LUCENE_EXT%\lucene-demos-2.4.1.jar;
java org.apache.lucene.demo.IndexHTML -create  /opt/lucene/index

The output looks like this:

C:\downloads\lucene-2.4.1>java org.apache.lucene.demo.IndexHTML -create  /opt/lucene/index
Optimizing index…
313 total milliseconds

Start Tomcat.

Access the luceneweb webapp at one of the following URLs, depending on the port Tomcat listens on:

http://localhost/luceneweb
http://localhost:8080/luceneweb

Development environment for Java and Hadoop

Purpose: This document describes the first steps towards Java development on a Hadoop environment.

Approach: We use a virtual machine (VM) to set up the development environment, together with the remote development feature of Eclipse.

  1. Hadoop infrastructure:

To reduce the setup time of the Hadoop infrastructure, we use a pre-configured VM image, which can be downloaded from the Google website.

http://code.google.com/edu/parallel/tools/hadoopvm/index.html

Download the VM from the above location, and follow the instructions.

  2. Eclipse plugin for Hadoop:

The plugin jar can be downloaded from http://code.google.com/edu/parallel/tools/hadoopvm/hadoop-eclipse-plugin.jar

The instructions are available at the following location:

http://code.google.com/edu/parallel/tools/hadoopvm/index.html

  3. Making Eclipse talk to Hadoop:

In Eclipse,

Window -> Show View -> Other -> Map Reduce Tools -> Map Reduce Servers

Add the new Hadoop server in the following view.

Add the following entries in the Hadoop configuration setup, then click on validation to check the health of the Hadoop server.

Switch to the “Map Reduce” perspective to view the server files.

Create a new project.

You need to download the Hadoop API (Hadoop-0.14.0), just for compiling purposes. Use it for “Specify Hadoop Library Location”.

Use the following Java program (provided by the Apache Hadoop site):

[codesyntax]

package org.myorg;

import java.io.IOException;
import java.util.*;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapred.*;
import org.apache.hadoop.util.*;

public class WordCount {

    /* Mapper: splits each input line into words and emits (word, 1). */
    public static class Map extends MapReduceBase implements Mapper {

        private final static LongWritable one = new LongWritable(1);
        private Text word = new Text();

        public void map(WritableComparable key, Writable value,
                OutputCollector output, Reporter reporter) throws IOException {
            String line = value.toString();
            StringTokenizer tokenizer = new StringTokenizer(line);
            while (tokenizer.hasMoreTokens()) {
                word.set(tokenizer.nextToken());
                output.collect(word, one);
            }
        }
    }

    /* Reducer (also used as combiner): adds up the counts for each word. */
    public static class Reduce extends MapReduceBase implements Reducer {

        public void reduce(WritableComparable key, Iterator values,
                OutputCollector output, Reporter reporter) throws IOException {
            long sum = 0;
            while (values.hasNext()) {
                sum += ((LongWritable) values.next()).get();
            }
            output.collect(key, new LongWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(WordCount.class);
        conf.setJobName("wordcount");

        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(LongWritable.class);

        conf.setMapperClass(Map.class);
        conf.setCombinerClass(Reduce.class);
        conf.setReducerClass(Reduce.class);

        // Input and output paths are hard-coded here in place of args[0] and args[1].
        conf.setInputPath(new Path("/user/guest/wordcountin"));
        conf.setOutputPath(new Path("/user/guest/wordcountout2"));

        /*
         * FileInputFormat.setInputPath(new Path(args[0]));
         * FileOutputFormat.setOutputPath(new Path(args[1]));
         */

        JobClient.runJob(conf);
    }
}

[/codesyntax]

Note:

/user/guest/wordcountin –
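
To check what the job should produce without a running Hadoop server, the map and reduce steps of the program above can be imitated in plain Java. This is only a local sketch, assuming Java 8 or later: it uses no Hadoop classes, and the names LocalWordCount and countWords are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

class LocalWordCount {

    // Imitates the job locally: the "map" step splits each line into words,
    // and the "reduce" step adds 1 to the running total of each word.
    static Map<String, Long> countWords(List<String> lines) {
        Map<String, Long> counts = new TreeMap<>();
        for (String line : lines) {
            StringTokenizer tokenizer = new StringTokenizer(line);
            while (tokenizer.hasMoreTokens()) {
                counts.merge(tokenizer.nextToken(), 1L, Long::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("hello hadoop", "hello world");
        System.out.println(countWords(lines)); // prints {hadoop=1, hello=2, world=1}
    }
}
```

Running a few sample lines through this helper gives the word totals to compare against the files the Hadoop job writes to its output path.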
