Guess what this blog is about Sherlock! :) - Jeryl Cook

Tuesday, March 20, 2012

NullPointerException TermBuffer ...felt like a noob with this one..

java.lang.NullPointerException
 at org.apache.lucene.index.TermBuffer.set(TermBuffer.java:95)
 at org.apache.lucene.index.SegmentTermEnum.scanTo(SegmentTermEnum.java:160)
 at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:232)
 at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:179)
 at org.apache.lucene.index.SegmentReader.docFreq(SegmentReader.java:911)
 at org.apache.lucene.index.DirectoryReader.docFreq(DirectoryReader.java:644)
 at org.apache.lucene.search.IndexSearcher.docFreq(IndexSearcher.java:138)
 at org.apache.lucene.search.Similarity.idfExplain(Similarity.java:735)
 at org.apache.lucene.search.TermQuery$TermWeight.<init>(TermQuery.java:46)
 at org.apache.lucene.search.TermQuery.createWeight(TermQuery.java:171)
 at org.apache.lucene.search.BooleanQuery$BooleanWeight.<init>(BooleanQuery.java:188)
 at org.apache.lucene.search.BooleanQuery.createWeight(BooleanQuery.java:362)
 at org.apache.lucene.search.Query.weight(Query.java:101)
 at org.apache.lucene.search.Searcher.createWeight(Searcher.java:147)
 at org.apache.lucene.search.Searcher.search(Searcher.java:98)


Ever get this non-descriptive exception in Lucene? It's because you're passing a null value into the TermQuery!!
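
One way to fail fast with a friendlier message is to check the value before it ever reaches the query. Here is a minimal sketch, assuming a hypothetical helper class TermGuard (my own name, not part of Lucene) that uses only the JDK:

```java
import java.util.Objects;

/** Hypothetical helper (not part of Lucene): rejects null term values up front. */
public final class TermGuard {

    private TermGuard() {
    }

    /**
     * Returns the text unchanged, or throws a descriptive exception when the
     * field name or the text is null, so the null never reaches the index lookup.
     */
    public static String requireText(String field, String text) {
        Objects.requireNonNull(field, "field name must not be null");
        if (text == null) {
            throw new IllegalArgumentException("null value for term field '" + field + "'");
        }
        return text;
    }
}
```

You would then build the query with something like new TermQuery(new Term("title", TermGuard.requireText("title", value))), and a bad value shows up as an IllegalArgumentException naming the field instead of an NPE deep inside TermBuffer.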

Wednesday, December 29, 2010

Lucandra(Cassandra & Lucene) + Hector

Remember to call IndexReader.close() after releasing the Cassandra client back to the pool when using Lucandra!


Wednesday, October 20, 2010

Restlet, Guard, Spring Security 3.0, and HTTP Basic Authentication, oh my!

Not sure what happened to eshepelyuk and the project
restlet-spring-security.

But I needed a clean way to integrate Spring Security 3.0.x into a Restlet application I've been working on.

I updated the code (ServiceSpringSecurityGuard.java)
to work with Spring Security 3.0.x, so here it is:

package org.restlet.ext.spring.security;

import org.restlet.Guard;
import org.restlet.data.ChallengeScheme;
import org.restlet.data.Request;
import org.springframework.beans.BeansException;
import org.springframework.beans.factory.InitializingBean;
import org.springframework.context.ApplicationContext;
import org.springframework.context.ApplicationContextAware;
import org.springframework.security.authentication.AuthenticationManager;
import org.springframework.security.authentication.UsernamePasswordAuthenticationToken;
import org.springframework.security.config.BeanIds;
import org.springframework.security.core.Authentication;
import org.springframework.security.core.AuthenticationException;
import org.springframework.security.core.context.SecurityContextHolder;
import org.springframework.util.Assert;

public class ServiceSpringSecurityGuard
extends Guard  implements ApplicationContextAware, InitializingBean {

 private AuthenticationManager authentificationManager;
 private ApplicationContext applicationContext;

 public ServiceSpringSecurityGuard() {
  super(null, ChallengeScheme.HTTP_BASIC, "Spring Security");
 }

 public AuthenticationManager getAuthentificationManager() {
  return authentificationManager;
 }

 public void setAuthentificationManager(AuthenticationManager authentificationManager) {
  this.authentificationManager = authentificationManager;
 }

// private AccessDecisionManager accessDecisionManager;

// public AccessDecisionManager getAccessDecisionManager() {
//  return accessDecisionManager;
// }
//
// public void setAccessDecisionManager(AccessDecisionManager accessDecisionManager) {
//  this.accessDecisionManager = accessDecisionManager;
// }

 public void setApplicationContext(ApplicationContext applicationContext) throws BeansException {
  this.applicationContext = applicationContext;
 }

 public void afterPropertiesSet() throws Exception {
  Assert.notNull(this.applicationContext, "applicationContext is null");

//  if (null == accessDecisionManager) {
//   setAccessDecisionManager((AccessDecisionManager) applicationContext.getBean(BeanIds.ACCESS_MANAGER));
//  }

  if (null == authentificationManager) {
   setAuthentificationManager((AuthenticationManager) applicationContext.getBean(BeanIds.AUTHENTICATION_MANAGER));
  }

  Assert.notNull(this.authentificationManager, "authentificationManager should be specified");
//  Assert.notNull(this.accessDecisionManager, "accessDecisionManager should be specified");
 }

 @SuppressWarnings("unused")
 public boolean checkSecret(Request request, String identifier, char[] secret) {
  try {
   Authentication auth = authentificationManager.authenticate(new UsernamePasswordAuthenticationToken(identifier, new String(secret)));
   if (auth.isAuthenticated()) {
    SecurityContextHolder.getContext().setAuthentication(auth);
   }
   return auth.isAuthenticated();
  } catch (AuthenticationException e) {
   SecurityContextHolder.getContext().setAuthentication(null);
   return false;
  }
 }
}


Wednesday, October 13, 2010

Cassandra-0.7.0 and Hector.

*sigh*...midnight ...and I have tons of coding tomorrow at 6am!

Anyway hope this helps someone..

When connecting to Apache-cassandra-0.7.0-beta2, use Hector-0.6.0.17 and not the other connection crap using raw sockets, etc. (in the test cases, sources and samples); the latter will hang when trying to getKeyspace.

Sooo, it's something like this...

CassandraClientPool pool = CassandraClientPoolFactory.INSTANCE.get();
CassandraClient client = pool.borrowClient("localhost", 9160);
Cassandra.Iface cassandraClient = client.getCassandra();


Just to note, this only works when using Hector to connect to Cassandra...
I am working on a project that requires 'real-time' indexing/searching, so "Lucandra" was my obvious choice!
As of right now, Lucandra-0.7 works ONLY with Apache-cassandra-0.7.0-beta1 and NOT beta2... Also, stop the 'connection' pool insanity and trust me: you want to modify the source to use Hector-0.6.0.17. Just remember you have to set the keyspace after you create a Cassandra client.

cassandraClient.set_keyspace("Lucandra");


Thursday, June 10, 2010

HIERARCHY_REQUEST_ERR and Spring-WS

Stuck on an issue for longer than 1 hour!

If you are using the Spring-WS 1.5.x WebServiceTemplate
and having issues combining your content using JAXB binding, do not use

org.springframework.ws.soap.saaj.SaajSoapMessageFactory

Use the Axiom factory instead:
org.springframework.ws.soap.axiom.AxiomSoapMessageFactory

There is a bug in the SAAJ factory that will produce the annoying exception:

org.w3c.dom.DOMException: HIERARCHY_REQUEST_ERR: An attempt was made to insert a node where it is not permitted. 

Hope this helps someone!


Thursday, November 12, 2009

Type Dependency Parsing using Java

It is possible to extract meaningful terms and concepts from unstructured information with the help of typed dependency parsers.

Unstructured information can be text files, PDFs, or MS Word documents buried on hard drives, emails on an Exchange server, or even audio streams after they are converted to text.


We won't waste much time going over tree structures, etc.; let's simply dive into the good stuff.

Stanford Parser


ConceptExtractorImpl.java

import java.util.ArrayList;
import java.util.Collection;
import java.util.Iterator;
import java.util.List;

import org.apache.log4j.Logger;

import edu.stanford.nlp.parser.lexparser.LexicalizedParser;
import edu.stanford.nlp.trees.GrammaticalStructure;
import edu.stanford.nlp.trees.GrammaticalStructureFactory;
import edu.stanford.nlp.trees.PennTreebankLanguagePack;
import edu.stanford.nlp.trees.Tree;
import edu.stanford.nlp.trees.TreebankLanguagePack;
import edu.stanford.nlp.trees.TypedDependency;

public class ConceptExtractorImpl {

    Logger logger = Logger.getLogger(ConceptExtractorImpl.class);

    int maxWordCount = 50;
    LexicalizedParser lexicalizedParser = null;
    TreebankLanguagePack treebankLanguagePack = null;
    GrammaticalStructureFactory grammaticalStructureFactory = null;
    String lexicalizedParserFile = "conf/models/standford/englishPCFG.ser.gz";

    public ConceptExtractorImpl() {
        lexicalizedParser = new LexicalizedParser(lexicalizedParserFile);
        lexicalizedParser.setOptionFlags(new String[] { "-maxLength", "80",
                "-retainTmpSubcategories" });
        treebankLanguagePack = new PennTreebankLanguagePack();
        grammaticalStructureFactory = treebankLanguagePack.grammaticalStructureFactory();
    }

    // Strips the trailing "-N" word position from a node label, e.g. "lung-8" becomes "lung".
    public String removePosition(String text) {
        // Walk backwards and collect (in reverse) everything before the last '-'.
        StringBuffer newWord = new StringBuffer();
        boolean isLastPosition = false;
        for (int index = text.length() - 1; index >= 0; index--) {
            if (isLastPosition) {
                newWord.append(text.charAt(index));
            }
            if (text.charAt(index) == '-') {
                isLastPosition = true;
            }
        }
        // Reverse the collected characters back into reading order.
        StringBuffer word = new StringBuffer();
        for (int index = newWord.length() - 1; index >= 0; index--) {
            word.append(newWord.charAt(index));
        }
        return word.toString();
    }

    // Keeps only the dependency relations that tend to form useful concepts.
    private boolean shouldUse(TypedDependency typedDependency) {
        String relation = typedDependency.reln().getShortName().trim();
        return relation.equalsIgnoreCase("nn")
                || relation.equalsIgnoreCase("prep")
                || relation.equalsIgnoreCase("dep")
                || relation.equalsIgnoreCase("conj_and")
                || relation.equalsIgnoreCase("num")
                || relation.equalsIgnoreCase("amod");
    }

    public List<String> extractConcepts(String sentence) {
        List<String> concepts = new ArrayList<String>();
        Tree tree = (Tree) lexicalizedParser.apply(sentence);
        GrammaticalStructure gs = grammaticalStructureFactory.newGrammaticalStructure(tree);
        Collection tdl = gs.typedDependenciesCCprocessed(false);
        Iterator it = tdl.iterator();
        while (it.hasNext()) {
            TypedDependency typedDependency = (TypedDependency) it.next();
            // A concept is "<dependent> <governor>", each with its position suffix removed.
            String phrase = removePosition(typedDependency.dep().toString().toLowerCase())
                    + " " + removePosition(typedDependency.gov().toString().toLowerCase());
            if (shouldUse(typedDependency)) {
                concepts.add(phrase);
            }
        }
        return concepts;
    }
}
The above code will extract "meaning" from an example sentence such as:

The woman has cancer in lower left lung.


It extracts the following concepts from it:
1. lower lung.
2. left lung.
3. lung cancer.
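
The phrase-building step can also be tried without the parser on the classpath. The sketch below is my own illustration, not the code above: DependencyLabels, stripPosition and toPhrase are hypothetical names, and it assumes node labels look like "lung-8" (word, dash, position). Unlike removePosition above, it returns a label unchanged when there is no dash instead of returning an empty string.

```java
/** Illustrative sketch only; the class and method names are hypothetical. */
public final class DependencyLabels {

    private DependencyLabels() {
    }

    /** Drops the trailing "-N" position suffix, if the label has one. */
    public static String stripPosition(String label) {
        int dash = label.lastIndexOf('-');
        return dash < 0 ? label : label.substring(0, dash);
    }

    /** Builds a "<dependent> <governor>" phrase from two node labels. */
    public static String toPhrase(String dependent, String governor) {
        return stripPosition(dependent.toLowerCase()) + " " + stripPosition(governor.toLowerCase());
    }
}
```

For example, DependencyLabels.toPhrase("lower-6", "lung-8") gives "lower lung". Using lastIndexOf keeps hyphenated words such as "x-ray-3" intact up to the final dash.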

After obtaining these terms and concepts one usually maps them to a taxonomy.
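
As a rough idea of what that mapping step could look like, here is a minimal sketch that looks concepts up in an in-memory table. TaxonomyMapper and the node ids you register are entirely hypothetical; a real taxonomy would normally live in a database or a terminology service.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory lookup from extracted concepts to taxonomy node ids. */
public final class TaxonomyMapper {

    private final Map<String, String> nodesByConcept = new HashMap<String, String>();

    /** Registers a concept phrase under a taxonomy node id of your choosing. */
    public void register(String concept, String nodeId) {
        nodesByConcept.put(normalize(concept), nodeId);
    }

    /** Returns the node ids of all known concepts, in input order; unknown concepts are skipped. */
    public List<String> map(List<String> concepts) {
        List<String> nodeIds = new ArrayList<String>();
        for (String concept : concepts) {
            String nodeId = nodesByConcept.get(normalize(concept));
            if (nodeId != null) {
                nodeIds.add(nodeId);
            }
        }
        return nodeIds;
    }

    // Trims and lower-cases so lookups do not depend on spacing or case.
    private static String normalize(String concept) {
        return concept.trim().toLowerCase();
    }
}
```

After registering "lung cancer" under a made-up id such as "NODE-1", passing the extracted concepts to map() returns only the ids of the phrases the table knows about.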
