comirva.web.crawling
Class PageCountsRetriever
java.lang.Object
java.lang.Thread
comirva.web.crawling.PageCountsRetriever
- All Implemented Interfaces:
- Runnable
public class PageCountsRetriever
- extends Thread
This class implements functions for retrieving page counts from
any search engine with a Google-like interface.
Method Summary
DataMatrix getPageCountMatrix()
- Returns the page count matrix for the co-occurrence of the search terms on web pages.
void run()
- This method is called when the thread is started.
Methods inherited from class java.lang.Thread
activeCount, checkAccess, countStackFrames, currentThread, destroy, dumpStack, enumerate, getAllStackTraces, getContextClassLoader, getDefaultUncaughtExceptionHandler, getId, getName, getPriority, getStackTrace, getState, getThreadGroup, getUncaughtExceptionHandler, holdsLock, interrupt, interrupted, isAlive, isDaemon, isInterrupted, join, join, join, resume, setContextClassLoader, setDaemon, setDefaultUncaughtExceptionHandler, setName, setPriority, setUncaughtExceptionHandler, sleep, sleep, start, stop, stop, suspend, toString, yield
PageCountsRetriever
public PageCountsRetriever(PageCountsRetrieverConfig pcrCfg,
Vector searchWords,
Vector ml,
DefaultListModel lm,
JLabel statusBar)
- Creates a PageCountsRetriever for accessing Google-like search engines
and calculating a page count matrix for the co-occurrence of the terms
in the searchWords Vector.
- Parameters:
pcrCfg - a PageCountsRetrieverConfig instance containing the configuration for the web crawls
searchWords - a Vector containing the search words whose (joint) appearance on web pages should be determined
ml - the Vector to which the name of the DataMatrix should be added after it has been determined by the web crawl
lm - the DefaultListModel used to add the name of the matrix to the UI
statusBar - the JLabel representing the status bar (for writing the current loading progress)
run
public void run()
- This method is called when the thread is started.
It creates an AnySearch instance for each query, issues the query,
and stores the retrieved page counts in a DataMatrix.
- Specified by:
run
in interface Runnable
- Overrides:
run
in class Thread
- See Also:
Runnable.run()
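The query loop described for run() can be pictured with the following minimal sketch. It is not the CoMIRVA implementation: the class PageCountSketch, the method buildMatrix, the quoted-phrase query format, and the plain long[][] used in place of DataMatrix are all assumptions made for illustration, and the search engine is replaced by a caller-supplied function so that the example runs without network access.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.ToLongFunction;

public class PageCountSketch {

    /**
     * Hypothetical helper: fills an n x n matrix with one page count per
     * pair of search terms. Diagonal cells use a single-term query,
     * off-diagonal cells a query containing both terms (an assumption;
     * the original documentation does not specify the query format).
     */
    public static long[][] buildMatrix(List<String> terms,
                                       ToLongFunction<String> pageCount) {
        int n = terms.size();
        long[][] matrix = new long[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                String query = (i == j)
                        ? "\"" + terms.get(i) + "\""
                        : "\"" + terms.get(i) + "\" \"" + terms.get(j) + "\"";
                // In the real class an AnySearch instance would issue the query;
                // here a stand-in function supplies the count.
                matrix[i][j] = pageCount.applyAsLong(query);
            }
        }
        return matrix;
    }

    public static void main(String[] args) {
        // Stub "search engine": returns the query length instead of a page count.
        List<String> terms = Arrays.asList("jazz", "blues");
        long[][] matrix = buildMatrix(terms, query -> query.length());
        System.out.println(Arrays.deepToString(matrix));
    }
}
```

Passing the count function as a parameter keeps the loop testable; a real retriever would substitute a call to the configured search engine.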
getPageCountMatrix
public DataMatrix getPageCountMatrix()
- Returns the page count matrix for the co-occurrence of the search terms on web pages.
- Returns:
- a DataMatrix with the page count for the co-occurrence of the search terms
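Because the matrix is filled inside run(), a caller presumably has to wait for the thread to finish before reading the result. The sketch below shows that start/join/read pattern with a hypothetical stand-in class, StubRetriever, which is not the CoMIRVA class: it takes no constructor arguments, uses a long[][] instead of DataMatrix, and writes a placeholder value instead of querying the web.

```java
public class RetrieverUsageSketch {

    /** Hypothetical stand-in for PageCountsRetriever; not the CoMIRVA class. */
    static class StubRetriever extends Thread {
        private long[][] matrix; // written in run(), read by the caller after join()

        @Override
        public void run() {
            long[][] result = new long[2][2];
            result[0][1] = 42; // placeholder instead of a real page count
            result[1][0] = 42;
            matrix = result;
        }

        long[][] getPageCountMatrix() {
            return matrix;
        }
    }

    /** Starts the worker, waits for it to finish, then reads the result. */
    public static long[][] retrieve() {
        StubRetriever retriever = new StubRetriever();
        retriever.start();
        try {
            retriever.join(); // join() makes the worker's writes visible to the caller
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return retriever.getPageCountMatrix();
    }

    public static void main(String[] args) {
        long[][] matrix = retrieve();
        System.out.println("co-occurrence count: " + matrix[0][1]);
    }
}
```

In a Swing application, blocking on join() from the event dispatch thread would freeze the UI, which may be why the original constructor takes a list model and a status bar for reporting completion and progress instead.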