java.lang.Object
  comirva.data.EntityTermProfile
public class EntityTermProfile extends Object implements XMLSerializable
This class implements a term profile for entities such as artist names. It is intended for text mining, and its design is geared primarily toward processing HTML files.
Field Summary | |
---|---|
(package private) String | crawlDetails |
(package private) File | dirLocal |
(package private) Hashtable<String,Integer> | documentFrequency |
(package private) String | entityName |
(package private) Vector<String> | extAudio |
(package private) Vector<String> | extImage |
(package private) Vector<String> | extVideo |
(package private) Hashtable<String,Double> | IDF |
(package private) Integer | numberDocuments |
(package private) Vector<SingleTermList> | singleTermLists |
(package private) Hashtable<String,Long> | termFrequency |
(package private) Vector<Vector<Integer>> | termOccurrenceOnDocuments |
(package private) Vector<String> | terms |
(package private) int[][] | tfDocs |
(package private) Hashtable<String,Double> | TFxIDF |
Constructor Summary | |
---|---|
EntityTermProfile() | Creates a new EntityTermProfile instance. |
EntityTermProfile(File dirLocal) | Creates a new EntityTermProfile instance. |
Method Summary | | |
---|---|---|
void | calculateOccurrences(Vector<String> termList, FileFilter documentFileFilter) | Calculates the occurrences of the terms given as a Vector<String>. |
String | getCrawlDetails() | |
File | getDirLocal() | |
Hashtable<String,Integer> | getDocumentFrequency() | |
String | getEntityName() | |
TermsWeights | getMostImportantTerms(int maxNoTerms, Hashtable termWeightings) | |
Integer | getNumberDocuments() | |
Vector<SingleTermList> | getSingleTermLists() | |
Hashtable<String,Long> | getTermFrequency() | |
Vector<Vector<Integer>> | getTermOccurrenceOnDocuments() | |
Vector<String> | getTerms() | |
Hashtable<String,Double> | getTFxIDF() | |
void | readXML(XMLStreamReader reader) | Deserializes an EntityTermProfile instance from an XML file. |
void | setCrawlDetails(String crawlDetails) | |
void | setDirLocal(File dirLocal) | |
void | setEntityName(String entityName) | |
void | setExtAudio(Vector<String> extAudio) | |
void | setExtImage(Vector<String> extImage) | |
void | setExtVideo(Vector<String> extVideo) | |
void | writeXML(XMLStreamWriter writer) | Serializes an EntityTermProfile instance as an XML file. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
Vector<SingleTermList> singleTermLists
Vector<String> terms
Hashtable<String,Long> termFrequency
Vector<Vector<Integer>> termOccurrenceOnDocuments
Hashtable<String,Integer> documentFrequency
Hashtable<String,Double> TFxIDF
Hashtable<String,Double> IDF
int[][] tfDocs
File dirLocal
String entityName
Integer numberDocuments
String crawlDetails
Vector<String> extAudio
Vector<String> extImage
Vector<String> extVideo
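The field names suggest a conventional term-weighting pipeline: raw counts per term, the number of documents containing each term, and an IDF and TFxIDF value derived from both. This page does not state the formulas the class uses, so the following is only a minimal sketch under assumptions: IDF is taken as ln(N / df), TFxIDF as tf * IDF, and matching is a case-insensitive substring search over plain strings, with no HTML stripping. The class `TermProfileSketch` and its members are hypothetical and are not part of comirva.

```java
import java.util.Hashtable;
import java.util.List;
import java.util.Vector;

// Hypothetical stand-in; NOT the comirva implementation.
class TermProfileSketch {
    final Hashtable<String, Long> termFrequency = new Hashtable<>();
    final Hashtable<String, Integer> documentFrequency = new Hashtable<>();
    final Hashtable<String, Double> idf = new Hashtable<>();
    final Hashtable<String, Double> tfXidf = new Hashtable<>();
    int numberDocuments;

    // Counts non-overlapping, case-insensitive substring matches of term in text.
    static long countOccurrences(String text, String term) {
        String haystack = text.toLowerCase();
        String needle = term.toLowerCase();
        if (needle.isEmpty()) {
            return 0;
        }
        long count = 0;
        int from = 0;
        while ((from = haystack.indexOf(needle, from)) >= 0) {
            count++;
            from += needle.length();
        }
        return count;
    }

    // Fills all tables for the given terms over the given document texts.
    void calculate(Vector<String> terms, List<String> documents) {
        numberDocuments = documents.size();
        for (String term : terms) {
            long tf = 0; // total occurrences across all documents
            int df = 0;  // number of documents containing the term
            for (String doc : documents) {
                long n = countOccurrences(doc, term);
                tf += n;
                if (n > 0) {
                    df++;
                }
            }
            termFrequency.put(term, tf);
            documentFrequency.put(term, df);
            // Assumed formula: idf = ln(N / df); 0 for terms that never occur.
            double idfValue = df == 0 ? 0.0 : Math.log((double) numberDocuments / df);
            idf.put(term, idfValue);
            tfXidf.put(term, tf * idfValue);
        }
    }
}
```

Calling `calculate` with a term list and a few document strings fills the four tables, which can then be read much like the getters documented on this page. Because the sketch matches substrings, a term such as "rock" would also be counted inside "rocket"; a real implementation would likely tokenize first.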
Constructor Detail |
---|
public EntityTermProfile(File dirLocal)
dirLocal - the directory where all documents belonging to the entity are stored
public EntityTermProfile()
Method Detail |
---|
public void calculateOccurrences(Vector<String> termList, FileFilter documentFileFilter)
termList - a Vector<String> of terms
documentFileFilter - a FileFilter for the documents that should be searched for the terms in the term list
public void writeXML(XMLStreamWriter writer)
Specified by: writeXML in interface XMLSerializable
writer - an XMLStreamWriter that points to the XML file
See also: XMLSerializable.writeXML(javax.xml.stream.XMLStreamWriter)
public void readXML(XMLStreamReader reader)
Specified by: readXML in interface XMLSerializable
reader - an XMLStreamReader that points to the XML file
See also: XMLSerializable.readXML(javax.xml.stream.XMLStreamReader)
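Both serialization methods work on StAX streams from `javax.xml.stream`. The XML layout that EntityTermProfile actually reads and writes is not documented on this page, so the sketch below only illustrates the general write/read pattern. The class `ProfileXmlSketch`, the element names `EntityTermProfile`, `EntityName` and `CrawlDetails`, and the helpers `toXml` and `fromXml` are all assumptions made for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical stand-in; element names are assumptions, not the comirva format.
class ProfileXmlSketch {
    String entityName = "";
    String crawlDetails = "";

    // Writes this object as one element with two child elements.
    void writeXML(XMLStreamWriter writer) throws XMLStreamException {
        writer.writeStartElement("EntityTermProfile");
        writer.writeStartElement("EntityName");
        writer.writeCharacters(entityName);
        writer.writeEndElement();
        writer.writeStartElement("CrawlDetails");
        writer.writeCharacters(crawlDetails);
        writer.writeEndElement();
        writer.writeEndElement();
    }

    // Reads fields until the closing EntityTermProfile tag is reached.
    void readXML(XMLStreamReader reader) throws XMLStreamException {
        while (reader.hasNext()) {
            int event = reader.next();
            if (event == XMLStreamConstants.START_ELEMENT) {
                String name = reader.getLocalName();
                if (name.equals("EntityName")) {
                    entityName = reader.getElementText();
                } else if (name.equals("CrawlDetails")) {
                    crawlDetails = reader.getElementText();
                }
            } else if (event == XMLStreamConstants.END_ELEMENT
                    && reader.getLocalName().equals("EntityTermProfile")) {
                return;
            }
        }
    }

    // Convenience helper: serializes to an in-memory XML string.
    static String toXml(ProfileXmlSketch profile) {
        try {
            StringWriter out = new StringWriter();
            XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            writer.writeStartDocument();
            profile.writeXML(writer);
            writer.writeEndDocument();
            writer.close();
            return out.toString();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    // Convenience helper: parses an XML string produced by toXml.
    static ProfileXmlSketch fromXml(String xml) {
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            ProfileXmlSketch profile = new ProfileXmlSketch();
            profile.readXML(reader);
            reader.close();
            return profile;
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Passing the result of `toXml` to `fromXml` should reproduce the original field values, including characters such as `&` and `<`, which `writeCharacters` escapes on output.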
public TermsWeights getMostImportantTerms(int maxNoTerms, Hashtable termWeightings)
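This method carries no description here, and the TermsWeights return type is not documented on this page. Judging from the signature alone, it plausibly selects up to maxNoTerms terms with the highest weights from termWeightings. The following sketch shows that kind of selection under this assumption. The class `TopTermsSketch`, the `Hashtable<String,Double>` parameter type, the list-of-entries return type, and the tie-breaking by term are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Hashtable;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in; NOT the comirva implementation.
class TopTermsSketch {
    // Returns at most maxNoTerms entries, highest weight first;
    // ties are broken alphabetically by term (an arbitrary choice).
    static List<Map.Entry<String, Double>> mostImportantTerms(
            int maxNoTerms, Hashtable<String, Double> termWeightings) {
        List<Map.Entry<String, Double>> entries = new ArrayList<>(termWeightings.entrySet());
        entries.sort(Map.Entry.<String, Double>comparingByValue().reversed()
                .thenComparing(Map.Entry.comparingByKey()));
        int n = Math.max(0, Math.min(maxNoTerms, entries.size()));
        return new ArrayList<>(entries.subList(0, n));
    }
}
```

A table such as the one returned by getTFxIDF() would be a natural input for this kind of ranking, though this page does not say which weightings the real method expects.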
public String getCrawlDetails()
public void setCrawlDetails(String crawlDetails)
crawlDetails - the crawlDetails to set
public File getDirLocal()
public void setDirLocal(File dirLocal)
dirLocal - the dirLocal to set
public String getEntityName()
public void setEntityName(String entityName)
entityName - the entityName to set
public Hashtable<String,Long> getTermFrequency()
public Hashtable<String,Integer> getDocumentFrequency()
public Vector<Vector<Integer>> getTermOccurrenceOnDocuments()
public Vector<String> getTerms()
public Integer getNumberDocuments()
public Hashtable<String,Double> getTFxIDF()
public Vector<SingleTermList> getSingleTermLists()
public void setExtAudio(Vector<String> extAudio)
extAudio - a Vector<String>
public void setExtImage(Vector<String> extImage)
extImage - a Vector<String>
public void setExtVideo(Vector<String> extVideo)
extVideo - a Vector<String>
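The three setters above take lists of strings, and calculateOccurrences accepts a `java.io.FileFilter` that selects which documents to search. This page does not say what the lists contain. Assuming they hold file extensions, a filter of the following kind could be built from such a list. The class `ExtensionFilterSketch` and the method `byExtension` are hypothetical and are not part of comirva.

```java
import java.io.File;
import java.io.FileFilter;
import java.util.Locale;
import java.util.Vector;

// Hypothetical helper; assumes the lists hold file extensions.
class ExtensionFilterSketch {
    // Builds a filter that accepts files whose name ends in one of the
    // given extensions, compared case-insensitively. A leading dot in
    // an extension ("html" vs ".html") is ignored.
    static FileFilter byExtension(Vector<String> extensions) {
        Vector<String> normalized = new Vector<>();
        for (String ext : extensions) {
            normalized.add(ext.toLowerCase(Locale.ROOT).replaceFirst("^\\.", ""));
        }
        return file -> {
            String name = file.getName().toLowerCase(Locale.ROOT);
            int dot = name.lastIndexOf('.');
            return dot >= 0 && normalized.contains(name.substring(dot + 1));
        };
    }
}
```

The filter inspects only the file name, so it does not check whether the path exists or is a directory. A filter built from "html" and "htm" could be passed as the documentFileFilter argument of calculateOccurrences to restrict the search to HTML documents.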