WANG Yong-gui
Dept. of Software
Liaoning Technical University
Huludao, Liaoning, China
yghI2000@163.net

JIA Zhen
Dept. of Electronics and Information Engineering
Liaoning Technical University
Huludao, Liaoning, China
liazh555@126.com
Abstract—Semantic-based Web mining has been proposed by many researchers in order to improve the level of Web services and to address the lack of semantics in existing Web services. Semantic-based Web data mining is a combination of the Semantic Web and Web mining: Web mining results help to build the Semantic Web, and the knowledge of the Semantic Web not only makes Web mining easier to achieve but also improves its effectiveness. This paper first introduces the relevant concepts of the Semantic Web and Web mining, then discusses semantic-based Web mining, and finally proposes a semantic-based Web mining model built within an Agent framework.

Keywords—Web mining; Semantic Web; Ontology; Agent

...information contained in the Web: on the one hand, the desired search results are submerged by traditional keyword-based search engines; on the other, the majority of Web data is unstructured, so traditional data mining gives unsatisfactory results. To solve these problems, people have begun to use semantic information to improve the Internet's capacity to provide services for people. Machine-processable semantic information can interact effectively with intelligent software products such as Agents. Semantic-based Web mining is a combination of the Semantic Web and Web mining, and can improve the intelligence of information access.
Web content mining extracts text, images, and other information and knowledge from the content of Web pages. Which sites sell cars? Which pages are in Chinese? Which pages introduce music, or news? Search engines, intelligent agents, and recommender systems use content mining to help users find the content they need in the vast space of the Web. Web content mining has two strategies: mining the text of pages directly, and processing search engine query results further to obtain more accurate and useful information.

Web structure mining extracts the network topology, that is, the link information between pages, mining knowledge from the organization and links of the WWW. Which pages link to other pages? Which pages are pointed to by other pages? Which collection of pages constitutes an independent entity? It can rank pages and discover the important ones.

Web usage mining extracts information about how customers use the browser and follow page links, extracting interesting patterns from Web access records. For example, which pages does a client access? How long is spent on each page? What is clicked next? What are the entry and exit routes? Each WWW server retains a Web access log recording user accesses and interactions. Analyzing these data can help understand user behavior, and thus improve the structure of the site or provide users with personalized services.

B. Semantic Web

The basic idea of the Semantic Web [2] is to embed machine-readable marks, representing certain kinds of knowledge, in Web messages, so that the data on the Web is not only displayed but also understood by machines, enhancing the quality of information services and enabling a variety of new, intelligent information services. If the knowledge reflecting the links between data and applications is embedded in the various information sources in a manner transparent to the user, Web pages, databases, and programs will be able to link up and collaborate with each other through agents.

According to Berners-Lee's vision, the Semantic Web is a layered architecture of seven levels [3], as shown in Table 1. The first layer, URI and Unicode, is the basis of the entire structure: Unicode is responsible for encoding resources, and URI is responsible for resource identification, which makes precise retrieval of information possible. The second layer, XML + NS (Namespace) + XML Schema, is responsible for representing the content and structure of data, separating the presentation format from the data structure and content of network information through a standard format language. The third layer, RDF + RDF Schema, provides a semantic model used to describe information on the Web and its types. The fourth layer, the ontology vocabulary layer, is responsible for defining shared knowledge and describing the semantic relationships between the various kinds of information, revealing the semantics within and between pieces of information. The fifth layer, the logic layer, is responsible for providing axioms and inference principles, supplying the basis for intelligent services. The sixth layer, Proof, and the seventh layer, Trust, are responsible for providing authentication and trust mechanisms; digital signatures and encryption technology used to detect changes to documents are a means of enhancing Web security.

This is a hierarchical structure with progressively enhanced functionality. XML, RDF(S), and Ontology are the core of the Semantic Web architecture, and these three core technologies form the Semantic Web's technical support system. They support the semantic description of network information and knowledge, and play a central role in achieving semantic-level knowledge sharing and knowledge reuse.
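The structure-mining questions above (which pages link where, and how to rank pages and discover the important ones) can be sketched with a minimal PageRank-style iteration; the tiny link graph here is an invented example, not data from the paper:

```python
# Minimal PageRank sketch for Web structure mining: pages that many
# other pages link to accumulate higher scores over repeated
# redistribution of rank along links. The link graph is hypothetical.

def pagerank(links, damping=0.85, iterations=50):
    """links: dict mapping each page to the list of pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outlinks in links.items():
            if not outlinks:  # dangling page: spread its rank evenly
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
            else:
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
        rank = new_rank
    return rank

links = {
    "home": ["about", "products"],
    "about": ["home"],
    "products": ["home", "about"],
    "orphan": ["home"],
}
ranks = pagerank(links)
# "home" receives the most in-links, so it ends up with the top score
```

Since every page redistributes its full rank each round, the scores stay normalized, and the page with the most incoming links ("home" here) emerges as the most important one.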
TABLE I. THE SEVEN-LAYER SEMANTIC WEB ARCHITECTURE

Layer 7 (high): Trust — trust mechanisms for statements and sources
Layer 6: Proof — according to logic, verify statements in order to draw conclusions
Layer 5: Logic — axioms and inference principles
Layer 4: Ontology vocabulary — shared knowledge and semantic relationships
Layer 3: RDF + RDF Schema — semantic model for describing Web information
Layer 2: XML + NS + XML Schema — content and structure of data
Layer 1 (low): URI + Unicode — resource identification and encoding
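The triple model of the RDF layer can be illustrated with a small standard-library sketch; the resources and properties named here are invented for illustration, and a real system would use a dedicated library such as rdflib rather than raw tuples:

```python
# Minimal sketch of the RDF data model (layer 3): every statement is a
# (subject, predicate, object) triple, and simple pattern matching over
# triples yields machine-processable semantics. All URIs are invented.

triples = {
    ("ex:page1", "rdf:type", "ex:NewsPage"),
    ("ex:page2", "rdf:type", "ex:MusicPage"),
    ("ex:page1", "ex:linksTo", "ex:page2"),
    ("ex:NewsPage", "rdfs:subClassOf", "ex:WebPage"),
}

def match(s=None, p=None, o=None):
    """Return all triples matching a pattern; None acts as a wildcard."""
    return [(ts, tp, to) for ts, tp, to in triples
            if s in (None, ts) and p in (None, tp) and o in (None, to)]

# Query: which resources are declared to be news pages?
news = [s for s, _, _ in match(p="rdf:type", o="ex:NewsPage")]
```

RDF Schema terms such as rdfs:subClassOf are what let the ontology and logic layers above infer, for example, that every ex:NewsPage is also an ex:WebPage.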
VI-68 Volume 1
2010 International Conference On Computer Design And Applications (ICCDA 2010)
Ontology Learning
The first step: at the beginning, an initial ontology must be built. To build an initial ontology, the relevant set of atomic concepts is obtained first; we use a clustering algorithm to obtain it from Web documents, and then derive the concept hierarchy in a variety of ways. One way is to use knowledge acquisition methods, such as ONTEX (Ontology Exploration), which takes a group of concept sets as input and, relying on knowledge acquisition techniques for detecting properties, outputs the hierarchy over that concept collection. Another way is to use one of the many ontology models that ontology researchers have developed, which include both general-knowledge ontology models and models describing the knowledge of a specific field. The ontology model, combined with the knowledge of domain experts, builds a conceptual level (the initial ontology). This ontology level is stored in the ontology library system to support the next phase of work.

The second step: the resource acquisition module collects task-related data sets according to the task instructions received from the ontology Agent, through Web mining. This step is essential: because the data on the Web are very scattered, dynamic, and often inconsistent, the quality of the data collection has a direct impact on the results of Web mining.

The third step: the RDF clustering module applies ontology clustering learning to the data that the resource acquisition module has collected. The resource nodes with the closest characteristics are gathered together in the RDF data repository.
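The third step's idea of gathering resource nodes with the closest characteristics can be sketched as a simple cosine-similarity threshold clustering over term-frequency vectors; the sample documents and the threshold value are illustrative assumptions, not part of the paper's model:

```python
# Sketch of step 3: cluster Web resources whose term profiles are
# closest. Each document becomes a term-frequency vector; a document
# joins the first cluster whose seed it resembles above a threshold,
# otherwise it starts a new cluster. Documents/threshold are invented.
from collections import Counter
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b.get(t, 0) for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def cluster(docs, threshold=0.5):
    vectors = {name: Counter(text.lower().split())
               for name, text in docs.items()}
    clusters = []  # each cluster is a list of doc names; first is the seed
    for name, vec in vectors.items():
        for group in clusters:
            if cosine(vec, vectors[group[0]]) >= threshold:
                group.append(name)
                break
        else:
            clusters.append([name])
    return clusters

docs = {
    "d1": "semantic web ontology rdf schema",
    "d2": "ontology rdf semantic web reasoning",
    "d3": "football match score league goal",
}
groups = cluster(docs)
# the two Semantic-Web documents end up together; the sports one apart
```

A production pipeline would normally use TF-IDF weighting and a proper clustering algorithm (e.g. k-means or agglomerative clustering); the single-pass threshold scheme above is only meant to show the "closest characteristics" grouping in a few lines.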
IV. SUMMARY
REFERENCES