Based on the above considerations, a knowledge discovery schema language, KDSL, has been designed for specifying several kinds of knowledge.
It consists of specifications for the following major data mining primitives:
(1) The kind of knowledge to be discovered (Mandatory), which may include classification, association rules, clustering, etc.
(2) The set of relevant data (Mandatory), specified in a way similar to a relational query, which is used to fetch the relevant data from the database.
(3) The background knowledge (Optional), a set of concept hierarchies or generalization operators that provide the corresponding higher-level concepts.
(4) Selection Function (Optional), to select the data relevant to a data mining schema.
(5) Cleaning Function (Optional), to clean the data, for example by handling missing and noisy values.
(6) Transformation Function (Optional), to specify any transformation required on the data.
(7) Mining Algorithm (Mandatory), to specify the algorithm used to actually mine the data.
(8) Various parameters (Optional), to specify properties such as the periodicity, priority and start time of the schema. There should also be provision for knowledge-dependent parameters such as confidence, support, number of clusters, etc., if the user wishes to set them.
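
The eight primitives above can be pictured as a single record with three mandatory and five optional fields. The following is a minimal Python sketch of that record, not KDSL syntax: the class name `MiningSchema`, all field names, the example query string, the algorithm name and the parameter values are hypothetical and chosen only for illustration. It also assumes that the selection, cleaning and transformation functions are applied in the order in which they are listed, which the text does not state.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional

# Hypothetical types and names; this is not KDSL's concrete syntax.
Rows = List[Dict[str, Any]]
RowFunction = Callable[[Rows], Rows]


@dataclass
class MiningSchema:
    # Mandatory primitives: (1), (2) and (7)
    knowledge_kind: str       # e.g. "classification", "association_rules", "clustering"
    relevant_data: str        # query-like specification of the relevant data
    mining_algorithm: str     # name of the algorithm used to mine the data
    # Optional primitives: (3), (4), (5), (6) and (8)
    background_knowledge: Dict[str, str] = field(default_factory=dict)  # value -> higher-level concept
    selection: Optional[RowFunction] = None
    cleaning: Optional[RowFunction] = None
    transformation: Optional[RowFunction] = None
    parameters: Dict[str, Any] = field(default_factory=dict)  # periodicity, support, confidence, ...

    def validate(self) -> None:
        """Raise ValueError if any mandatory primitive is empty."""
        for name in ("knowledge_kind", "relevant_data", "mining_algorithm"):
            if not getattr(self, name):
                raise ValueError(f"mandatory primitive missing: {name}")

    def prepare(self, rows: Rows) -> Rows:
        """Apply the optional functions in listed order (an assumption)."""
        for step in (self.selection, self.cleaning, self.transformation):
            if step is not None:
                rows = step(rows)
        return rows


# Illustrative schema: only the mandatory primitives, one cleaning function
# and a few parameters are given; everything else keeps its default.
schema = MiningSchema(
    knowledge_kind="association_rules",
    relevant_data="SELECT item, basket_id FROM sales",
    mining_algorithm="apriori",
    cleaning=lambda rows: [r for r in rows if r.get("item") is not None],
    parameters={"support": 0.05, "confidence": 0.8, "periodicity": "weekly"},
)
schema.validate()
prepared = schema.prepare(
    [{"item": "milk", "basket_id": 1}, {"item": None, "basket_id": 2}]
)
```

In this sketch an omitted optional primitive simply leaves the data unchanged, while an empty mandatory primitive is rejected by `validate` before any mining would start.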