java.lang.Object comirva.data.SingleTermList
public class SingleTermList
This class implements a term list for a single text document. It is intended for text mining and is designed primarily for use with HTML files.
Field Summary | |
---|---|
(package private) Vector<String> | audioContent |
(package private) String | crawlDetails |
(package private) Vector<String> | extAudio |
(package private) Vector<String> | extImage |
(package private) Vector<String> | extVideo |
(package private) File | fileLocal |
(package private) Hashtable<String,Integer> | frequency |
(package private) Vector<String> | imageContent |
(package private) String | searchTerm |
(package private) String | urlSource |
(package private) Vector<String> | videoContent |
Constructor Summary | |
---|---|
SingleTermList() | Creates a new SingleTermList-instance. |
SingleTermList(File fileLocal) | Creates a new SingleTermList-instance. |
Method Summary | | |
---|---|---|
void | calculateOccurrences(Vector<String> termList) | Calculates the occurrences of the terms given as Vector |
Vector<String> | extractLinks(String htmlLine, String searchAttribute) | Tries to extract all links from the given attribute (in an arbitrary tag) that occur somewhere in the passed htmlLine. |
Vector<String> | extractLinks(String htmlLine, String searchTag, String searchAttribute) | Tries to extract all links from the given attribute in the given tag that occur somewhere in the passed htmlLine. |
Vector<String> | extractLinks(String htmlLine, Vector<String> hrefs, String searchAttr) | Tries to extract all links from the given attribute (in an arbitrary tag) that occur somewhere in the passed htmlLine. |
Vector<String> | extractLinks(String htmlLine, Vector<String> hrefs, String searchTag, String searchAttr) | Tries to extract all links from the given attribute in the given tag that occur somewhere in the passed htmlLine. |
Vector<String> | getAudioContent() | |
String | getCrawlDetails() | |
File | getFileLocal() | |
Hashtable<String,Integer> | getFrequency() | |
Vector<String> | getImageContent() | |
String | getSearchTerm() | |
String | getUrlSource() | |
Vector<String> | getVideoContent() | |
void | printTFs() | Prints a list of the term frequencies. |
void | readXML(XMLStreamReader reader) | Deserializes a SingleTermList-instance from an XML-file. |
void | setCrawlDetails(String crawlDetails) | |
void | setExtAudio(Vector<String> extAudio) | |
void | setExtImage(Vector<String> extImage) | |
void | setExtVideo(Vector<String> extVideo) | |
void | setFileLocal(File fileLocal) | |
void | setSearchTerm(String searchTerm) | |
void | setUrlSource(String urlSource) | |
void | writeXML(XMLStreamWriter writer) | Serializes a SingleTermList-instance as XML-file. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
Hashtable<String,Integer> frequency
File fileLocal
String urlSource
String searchTerm
String crawlDetails
Vector<String> audioContent
Vector<String> imageContent
Vector<String> videoContent
Vector<String> extAudio
Vector<String> extImage
Vector<String> extVideo
Constructor Detail |
---|
public SingleTermList(File fileLocal)
fileLocal
- the file for which the term list should be created/loaded
public SingleTermList()
Method Detail |
---|
public void calculateOccurrences(Vector<String> termList)
termList
- a Vector
public void writeXML(XMLStreamWriter writer)
writeXML
in interface XMLSerializable
writer
- an XMLStreamWriter that points to the XML-file.
See also: XMLSerializable.writeXML(javax.xml.stream.XMLStreamWriter)
public void readXML(XMLStreamReader reader)
readXML
in interface XMLSerializable
reader
- an XMLStreamReader that points to the XML-file.
See also: XMLSerializable.readXML(javax.xml.stream.XMLStreamReader)
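The XML layout that writeXML and readXML use is not documented on this page. As a rough illustration of serializing a term-frequency table through the StAX XMLStreamWriter and XMLStreamReader interfaces, the standalone sketch below may help. It is not the CoMIRVA implementation: the class name TermListXmlSketch, the helper methods toXml and fromXml, and the element and attribute names (TermList, Term, name, tf) are all assumptions made for the example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Hashtable;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.XMLStreamWriter;

public class TermListXmlSketch {

    // Writes one <Term name="..." tf="..."/> element per table entry.
    // The element and attribute names are hypothetical, not CoMIRVA's format.
    static void writeXML(XMLStreamWriter writer, Hashtable<String, Integer> frequency)
            throws XMLStreamException {
        writer.writeStartElement("TermList");
        for (Map.Entry<String, Integer> e : frequency.entrySet()) {
            writer.writeEmptyElement("Term");
            writer.writeAttribute("name", e.getKey());
            writer.writeAttribute("tf", Integer.toString(e.getValue()));
        }
        writer.writeEndElement();
    }

    // Reads every <Term> element back into a frequency table.
    static Hashtable<String, Integer> readXML(XMLStreamReader reader)
            throws XMLStreamException {
        Hashtable<String, Integer> frequency = new Hashtable<>();
        while (reader.hasNext()) {
            if (reader.next() == XMLStreamConstants.START_ELEMENT
                    && "Term".equals(reader.getLocalName())) {
                frequency.put(reader.getAttributeValue(null, "name"),
                        Integer.valueOf(reader.getAttributeValue(null, "tf")));
            }
        }
        return frequency;
    }

    // Convenience wrapper: serializes the table to an XML string.
    static String toXml(Hashtable<String, Integer> frequency) {
        try {
            StringWriter out = new StringWriter();
            XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            writer.writeStartDocument();
            writeXML(writer, frequency);
            writer.writeEndDocument();
            writer.close();
            return out.toString();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    // Convenience wrapper: parses a table from an XML string.
    static Hashtable<String, Integer> fromXml(String xml) {
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            try {
                return readXML(reader);
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Hashtable<String, Integer> frequency = new Hashtable<>();
        frequency.put("guitar", 2);
        frequency.put("piano", 1);
        String xml = toXml(frequency);
        System.out.println(xml);
        System.out.println(fromXml(xml).equals(frequency)); // round trip preserves the table
    }
}
```

Passing the stream writer and reader in from the caller, as the documented signatures do, lets several objects share one XML document.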
public void printTFs()
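This page does not describe how calculateOccurrences counts terms or what printTFs outputs. The following standalone sketch shows one plausible way to fill a Hashtable<String,Integer> with term frequencies and list them. It is an illustration only, not the CoMIRVA code: the names TermFrequencySketch, countOccurrences and formatTFs are hypothetical, and the case-insensitive tokenization on non-alphanumeric characters is an assumption.

```java
import java.util.Hashtable;
import java.util.Map;
import java.util.TreeMap;
import java.util.Vector;

public class TermFrequencySketch {

    // Counts how often each term of termList occurs in text.
    // Assumption: terms are single words and matching is case-insensitive.
    static Hashtable<String, Integer> countOccurrences(String text, Vector<String> termList) {
        Hashtable<String, Integer> frequency = new Hashtable<>();
        for (String term : termList) {
            frequency.put(term.toLowerCase(), 0);
        }
        // Split on anything that is not a letter or digit, so markup and
        // punctuation act as separators.
        for (String token : text.toLowerCase().split("[^\\p{L}\\p{Nd}]+")) {
            Integer count = frequency.get(token);
            if (count != null) {
                frequency.put(token, count + 1);
            }
        }
        return frequency;
    }

    // Formats the table as "term<TAB>count" lines, sorted by term.
    static String formatTFs(Hashtable<String, Integer> frequency) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Integer> e : new TreeMap<>(frequency).entrySet()) {
            sb.append(e.getKey()).append('\t').append(e.getValue()).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Vector<String> terms = new Vector<>();
        terms.add("guitar");
        terms.add("piano");
        terms.add("drums");
        String html = "<p>Guitar and piano. The guitar solo!</p>";
        // Prints drums 0, guitar 2, piano 1 (one tab-separated pair per line).
        System.out.print(formatTFs(countOccurrences(html, terms)));
    }
}
```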
public String getCrawlDetails()
public void setCrawlDetails(String crawlDetails)
crawlDetails
- The crawlDetails to set.
public File getFileLocal()
public void setFileLocal(File fileLocal)
fileLocal
- The fileLocal to set.
public String getSearchTerm()
public void setSearchTerm(String searchTerm)
searchTerm
- The searchTerm to set.
public String getUrlSource()
public void setUrlSource(String urlSource)
urlSource
- The urlSource to set.
public Hashtable<String,Integer> getFrequency()
public Vector<String> getAudioContent()
public Vector<String> getImageContent()
public Vector<String> getVideoContent()
public void setExtAudio(Vector<String> extAudio)
extAudio
- a Vector
public void setExtImage(Vector<String> extImage)
extImage
- a Vector
public void setExtVideo(Vector<String> extVideo)
extVideo
- a Vector
public Vector<String> extractLinks(String htmlLine, Vector<String> hrefs, String searchTag, String searchAttr)
htmlLine
- the HTML code to analyze
hrefs
- used only for recursive calls
searchTag
- the tag to search (e.g. "a")
searchAttr
- the attribute to search within the tag (e.g. "href")
public Vector<String> extractLinks(String htmlLine, String searchTag, String searchAttribute)
htmlLine
- the HTML code to analyze
searchTag
- the tag to search (e.g. "a")
searchAttribute
- the attribute to search within the tag (e.g. "href")
public Vector<String> extractLinks(String htmlLine, Vector<String> hrefs, String searchAttr)
htmlLine
- the HTML code to analyze
hrefs
- used only for recursive calls
searchAttr
- the attribute to search within the tag (e.g. "href")
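The hrefs parameter suggests an accumulator that is passed along through recursive calls. A minimal standalone sketch of that pattern might look as follows. It is not the CoMIRVA implementation: the class name RecursiveLinkSketch is hypothetical, and the sketch assumes double-quoted attribute values and plain substring matching, so an attribute such as data-href would also match a search for href.

```java
import java.util.Vector;

public class RecursiveLinkSketch {

    // Finds the first searchAttr="value" in htmlLine, appends the value to
    // hrefs, then recurses on the remainder of the line. Callers start the
    // recursion with an empty Vector.
    static Vector<String> extractLinks(String htmlLine, Vector<String> hrefs, String searchAttr) {
        String key = searchAttr + "=\"";
        int start = htmlLine.indexOf(key);
        if (start < 0) {
            return hrefs; // no further occurrence: recursion ends
        }
        int valueStart = start + key.length();
        int valueEnd = htmlLine.indexOf('"', valueStart);
        if (valueEnd < 0) {
            return hrefs; // unterminated attribute value: stop
        }
        hrefs.add(htmlLine.substring(valueStart, valueEnd));
        return extractLinks(htmlLine.substring(valueEnd + 1), hrefs, searchAttr);
    }

    public static void main(String[] args) {
        String line = "<a href=\"a.html\">A</a> <a href=\"b.html\">B</a>";
        System.out.println(extractLinks(line, new Vector<>(), "href")); // prints [a.html, b.html]
    }
}
```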
public Vector<String> extractLinks(String htmlLine, String searchAttribute)
htmlLine
- the HTML code to analyze
searchAttribute
- the attribute to search within the tag (e.g. "href", "src")
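To illustrate the difference between searching a specific tag and searching an arbitrary tag, here is a standalone sketch based on java.util.regex. It uses an iterative regular-expression scan rather than the recursive approach that the hrefs overloads imply, and it is not the CoMIRVA implementation. The class name LinkExtractionSketch is hypothetical, and treating a null searchTag as "any tag" is an assumption of this example.

```java
import java.util.Vector;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LinkExtractionSketch {

    // Collects the values of searchAttribute inside searchTag elements.
    // A null searchTag matches any tag (an assumption of this sketch).
    static Vector<String> extractLinks(String htmlLine, String searchTag, String searchAttribute) {
        String tag = (searchTag == null) ? "[A-Za-z][A-Za-z0-9]*" : Pattern.quote(searchTag);
        // Matches <tag ... attr="v">, <tag ... attr='v'> and <tag ... attr=v>.
        Pattern pattern = Pattern.compile(
                "<" + tag + "\\b[^>]*?\\s" + Pattern.quote(searchAttribute)
                        + "\\s*=\\s*(?:\"([^\"]*)\"|'([^']*)'|([^\\s>]+))",
                Pattern.CASE_INSENSITIVE);
        Vector<String> links = new Vector<>();
        Matcher matcher = pattern.matcher(htmlLine);
        while (matcher.find()) {
            // Exactly one of the three capture groups is set per match.
            for (int group = 1; group <= 3; group++) {
                if (matcher.group(group) != null) {
                    links.add(matcher.group(group));
                }
            }
        }
        return links;
    }

    // Arbitrary-tag variant: delegates with no tag restriction.
    static Vector<String> extractLinks(String htmlLine, String searchAttribute) {
        return extractLinks(htmlLine, null, searchAttribute);
    }

    public static void main(String[] args) {
        String line = "<a href=\"a.html\">A</a> <img src='pic.png'> "
                + "<A HREF=b.html>B</A> <link href=\"style.css\">";
        System.out.println(extractLinks(line, "a", "href")); // prints [a.html, b.html]
        System.out.println(extractLinks(line, "href"));      // prints [a.html, b.html, style.css]
        System.out.println(extractLinks(line, "src"));       // prints [pic.png]
    }
}
```

A regular expression works on single, reasonably well-formed lines. Attribute values that contain ">" or tags that span several lines would need a real HTML parser.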