
All times are UTC - 5 hours [ DST ]



Forum locked This topic is locked, you cannot edit posts or make further replies.  [ 3 posts ] 
Author Message
 Post subject: Get found terms
Posted: Wed Mar 07, 2012 12:28 pm
Newbie

Joined: Wed Mar 07, 2012 12:21 pm
Posts: 6
Location: Zürich
Hi,

I'm looking for a way to get the terms (words) that a query actually matched. For example, if the query is tele* and telephone is in the index, I would like to get telephone back.

I've been searching and experimenting for a while but have not got an implementation to work so far. Can you give me a hint on how to get the matched terms?

Thanks, rico
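
To make the goal concrete, here is a minimal sketch of what "expanding" tele* into found terms means. It does not use the Lucene API at all: a TreeSet stands in for the index's sorted term dictionary, and the class and method names (PrefixExpansion, expandPrefix) are made up for illustration. In Lucene itself, this kind of expansion is what happens when a multi-term query is rewritten against an IndexReader via Query.rewrite.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

// Conceptual sketch only: a TreeSet plays the role of the sorted term dictionary.
class PrefixExpansion {

    // Returns every dictionary term that starts with the given prefix, in sorted order.
    static List<String> expandPrefix(NavigableSet<String> dictionary, String prefix) {
        List<String> matches = new ArrayList<String>();
        // tailSet starts at the first term >= prefix; because terms are sorted,
        // all terms sharing the prefix follow contiguously from there.
        for (String term : dictionary.tailSet(prefix, true)) {
            if (!term.startsWith(prefix)) {
                break; // left the prefix range, nothing further can match
            }
            matches.add(term);
        }
        return matches;
    }

    public static void main(String[] args) {
        NavigableSet<String> dictionary = new TreeSet<String>(
                Arrays.asList("table", "telegraph", "telephone", "television", "tennis"));
        // prints [telegraph, telephone, television]
        System.out.println(expandPrefix(dictionary, "tele"));
    }
}
```

Note that this lists every term in the dictionary that matches the pattern, not only the terms that occur in a particular hit.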


 Post subject: Re: Get found terms
Posted: Thu Mar 08, 2012 6:07 am
This is what I tried so far.

Using the highlighter:

Code:
SearchFactory searchFactory = fullTextEntityManager.getSearchFactory();
IndexReaderAccessor indexReaderAccessor = searchFactory.getIndexReaderAccessor();
IndexReader indexReader = indexReaderAccessor.open(Record.class);
IndexSearcher indexSearcher = new IndexSearcher(indexReader);

TopDocs hits = null;

try {
   hits = indexSearcher.search(luceneQuery, 10);
} catch (IOException ex) {
   Logger.getLogger(HibernateSearchManager.class.getName()).log(Level.SEVERE, null, ex);
}

// hits is still null if the search threw, so check that before reading scoreDocs
if(hits != null && hits.scoreDocs.length != 0) {
   QueryScorer scorer = new QueryScorer(luceneQuery);
   scorer.setExpandMultiTermQuery(true);

   // pass the formatter to the highlighter; it was created but never used
   Formatter formatter = new SimpleHTMLFormatter();
   Highlighter highlighter = new Highlighter(formatter, scorer);

   Analyzer analyzer2 = new StandardAnalyzer(Version.LUCENE_34);
   Document doc = null;
   try {
      doc = indexSearcher.doc(hits.scoreDocs[0].doc);
   } catch (CorruptIndexException ex) {
      Logger.getLogger(HibernateSearchManager.class.getName()).log(Level.SEVERE, null, ex);
   } catch (IOException ex) {
      Logger.getLogger(HibernateSearchManager.class.getName()).log(Level.SEVERE, null, ex);
   }

   String title = doc.get("content");


   TokenStream stream = null;
   
   try {
      stream = TokenSources.getAnyTokenStream(indexReader,
            hits.scoreDocs[0].doc,
            "content",
            doc,
            analyzer2);
   } catch (IOException ex) {
      Logger.getLogger(HibernateSearchManager.class.getName()).log(Level.SEVERE, null, ex);
   }

   String fragment = "";

   try {
      fragment = highlighter.getBestFragment(analyzer2,"content", title);
   } catch (IOException ex) {
      Logger.getLogger(HibernateSearchManager.class.getName()).log(Level.SEVERE, null, ex);
   } catch (InvalidTokenOffsetsException ex) {
      Logger.getLogger(HibernateSearchManager.class.getName()).log(Level.SEVERE, null, ex);
   }

   System.out.println("Fragment => " + fragment);
}


However, this gives me the fragment rather than the term, which is not exactly what I'm looking for.
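
One workaround is to pull the highlighted words back out of the fragment. The sketch below assumes the default SimpleHTMLFormatter markup, which wraps each match in <B> and </B>; the class name HighlightedTerms and its extract method are hypothetical helpers, not part of Lucene. It lower-cases the result on the assumption that the analyzer lower-cased the indexed terms. What comes back is the stored text of the match, which may differ from the indexed term if the analyzer also stems or otherwise transforms tokens.

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: collects the words a highlighter marked inside a fragment.
class HighlightedTerms {

    // Assumption: the default SimpleHTMLFormatter tags <B> and </B> are in use.
    private static final Pattern MARKED = Pattern.compile("<B>(.*?)</B>");

    // Returns the distinct highlighted words, lower-cased, in order of first appearance.
    static Set<String> extract(String fragment) {
        Set<String> terms = new LinkedHashSet<String>();
        if (fragment == null) {
            return terms; // getBestFragment returns null when nothing matched
        }
        Matcher matcher = MARKED.matcher(fragment);
        while (matcher.find()) {
            terms.add(matcher.group(1).toLowerCase(Locale.ROOT));
        }
        return terms;
    }

    public static void main(String[] args) {
        String fragment = "Call by <B>Telephone</B> or <B>telegraph</B>, ideally by <B>telephone</B>";
        // prints [telephone, telegraph]
        System.out.println(extract(fragment));
    }
}
```

This only sees matches inside the returned fragment, so matches elsewhere in the field are missed unless every fragment is processed.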

I tried to use the explain method as well:

Code:
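// totalHits counts every match in the index, but scoreDocs holds only the requested top hits
for (int i = 0; i < hits.scoreDocs.length; i++) {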
   ScoreDoc match = hits.scoreDocs[i];
   Explanation explanation = null;
   try {
      explanation = indexSearcher.explain(luceneQuery, match.doc);
   } catch (IOException ex) {
      Logger.getLogger(HibernateSearchManager.class.getName()).log(Level.SEVERE, null, ex);
   }
   System.out.println("----------");
   Document explainDoc = null;
   try {
      explainDoc = indexSearcher.doc(match.doc);
   } catch (CorruptIndexException ex) {
      Logger.getLogger(HibernateSearchManager.class.getName()).log(Level.SEVERE, null, ex);
   } catch (IOException ex) {
      Logger.getLogger(HibernateSearchManager.class.getName()).log(Level.SEVERE, null, ex);
   }
   System.out.println(explainDoc.get("title"));
   System.out.println(explanation.toString());
   System.out.println(explanation.getDescription());
}


Unfortunately that does not work with fuzzy and wildcard searches, and I can't figure out how to get the actual term.
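
For the wildcard case, another option is to re-check the stored text of each hit against the pattern. The following is only a sketch under assumptions: it tokenizes with a simple split on non-letter, non-digit characters and lower-cases everything, which approximates but does not reproduce an analyzer such as StandardAnalyzer, and it does not cover fuzzy queries. WildcardTermMatcher and its methods are hypothetical names.

```java
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Pattern;

// Hypothetical helper: finds the words in a text that match a * / ? wildcard pattern.
class WildcardTermMatcher {

    // Translates a wildcard pattern into a regex: * becomes .*, ? becomes ., the rest is quoted.
    static Pattern toRegex(String wildcard) {
        StringBuilder regex = new StringBuilder();
        StringBuilder literal = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*' || c == '?') {
                if (literal.length() > 0) {
                    regex.append(Pattern.quote(literal.toString()));
                    literal.setLength(0);
                }
                regex.append(c == '*' ? ".*" : ".");
            } else {
                literal.append(c);
            }
        }
        if (literal.length() > 0) {
            regex.append(Pattern.quote(literal.toString()));
        }
        return Pattern.compile(regex.toString());
    }

    // Returns the distinct lower-cased tokens of the text that match the whole pattern, sorted.
    static Set<String> matchingTerms(String wildcard, String text) {
        Pattern pattern = toRegex(wildcard.toLowerCase(Locale.ROOT));
        Set<String> found = new TreeSet<String>();
        // Assumption: splitting on non-letter/digit characters is close enough to the analyzer.
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty() && pattern.matcher(token).matches()) {
                found.add(token);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        String content = "The Telephone rang; the telegraph did not. Tell me why.";
        // prints [telegraph, telephone]
        System.out.println(matchingTerms("tele*", content));
    }
}
```

Applied to the text returned by doc.get("content") for each hit, this gives the matching words per document instead of a score explanation.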

Any help very much appreciated - rico


 Post subject: Re: Get found terms
Posted: Thu Mar 08, 2012 8:06 am
Finally, I've found some useful information here. However, it does not feel as clean as I would expect it to be.


© Copyright 2014, Red Hat Inc. All rights reserved. JBoss and Hibernate are registered trademarks and servicemarks of Red Hat, Inc.