APPLICATION OF EXPERT SYSTEMS IN ENVIRONMENT GIS WITH DIRECTION TO LINE BUILDINGS

Jaroslav Smutný
Department of Railway Construction and Structures,
Faculty of Civil Engineering,
Technical University of Brno,
Czech Republic

Abstract

This paper presents a new approach that applies rough set theory in decision support systems for railroad reconstruction and maintenance. Rough set theory is a relatively new mathematical tool for processing data marked by vagueness and uncertainty. The approach is of fundamental importance in artificial intelligence and the cognitive sciences, particularly in machine learning, knowledge acquisition, decision analysis and expert systems, and it can also be used for analyses in railroad rehabilitation and maintenance.

Abstract (translated from Czech)

The paper presents a new approach to building expert systems for planning the maintenance of railway tracks using rough set theory. This is a relatively new mathematical method suited to processing data characterised by uncertainty and vagueness. Such data are of fundamental importance in artificial intelligence and recognition, in machine learning, knowledge engineering, decision analysis and expert systems, and they can also be used in analyses concerning the renewal and maintenance of railway tracks.

INTRODUCTION

A major objective of a railroad management system is to assist railway engineers and upper management in making consistent and cost-effective decisions on railroad maintenance and rehabilitation. An efficient expert and maintenance system requires a good, comprehensive database. The data are often stored as records of objects described by a set of multi-valued attributes (features, variables, characteristics, conditions and so on). The objects are associated with decisions (actions, opinions, classes, diagnoses) taken by an expert. Such a record is called an information system. Note that the database often contains incomplete and imprecise data and information [1].

The traditional decision techniques long used in railroad expert and maintenance systems include decision trees, linear programming and statistical techniques. A decision tree is a visual display of the structure of a decision problem; it looks like a tree with branches spreading out from nodes. Linear programming is a mathematical technique used to optimise resource allocation under side constraints that limit the range of choices. Although traditional techniques (e.g. statistical techniques) can solve some problems in railroad management systems, their application is subject to many limitations.

THEORETICAL BACKGROUND

Rough set theory is a tool for studying imprecision, vagueness and uncertainty in data analysis. It focuses on discovering patterns, rules and knowledge in data. This part of the paper gives an introductory description of rough set theory. The process of dividing a universe of objects into different categories is called classification, and rough set theory deals with the analysis of this classificatory property of sets of objects. Large data sets, acquired from measurements or from human experts, may represent vague knowledge, for instance uncertain or incomplete knowledge. Rough set theory provides the means to discern and classify objects in data sets of this type when it is not possible to divide the objects into sharply defined categories.

In rough set theory, knowledge is represented in information systems. An information system is a data set represented as a table. Each row represents an object, for instance a case or an event. Each column represents an attribute, for instance a variable, an observation or a property. Each object (row) is assigned a value for every attribute. An information system is defined as [4]

$$S = (U, A), \qquad (1)$$

where U is a non-empty finite set of objects called the universe and A is a non-empty finite set of attributes such that

$$a : U \to V_a \quad \text{for every } a \in A, \qquad (2)$$

where V_a is the value set of a, called the range of a.
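As a minimal sketch of this definition, an information system can be held as a table keyed by object, with each attribute acting as a function from objects to values. The object names, attribute names and values below are hypothetical and chosen only for illustration, and Python is used purely for convenience; it is not the language of the system described later in the paper.

```python
from typing import Dict, Hashable, Set

# Hypothetical toy information system S = (U, A):
# outer keys are objects of the universe U, inner keys are attributes in A.
Table = Dict[str, Dict[str, Hashable]]

TABLE: Table = {
    "x1": {"a1": 1, "a2": "low"},
    "x2": {"a1": 1, "a2": "high"},
    "x3": {"a1": 2, "a2": "high"},
}


def universe(table: Table) -> Set[str]:
    """Return the universe U (the set of objects)."""
    return set(table)


def attributes(table: Table) -> Set[str]:
    """Return the attribute set A (assumes every row has the same columns)."""
    first_row = next(iter(table.values()))
    return set(first_row)


def value_set(table: Table, attribute: str) -> Set[Hashable]:
    """Return the values that the given attribute actually takes on U."""
    return {row[attribute] for row in table.values()}


print(sorted(universe(TABLE)))
print(sorted(value_set(TABLE, "a2")))
```

Here `value_set` returns only the values observed in the table, which is a subset of the range V_a when the range is declared independently of the data.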

One of the most important concepts of rough set theory is indiscernibility, which is used to define equivalence classes of objects. Every subset of attributes B ⊆ A defines an equivalence relation IND_A(B), called the B-indiscernibility relation. It is defined as

$$IND_A(B) = \{(x, x') \in U^2 \mid \forall a \in B:\ a(x) = a(x')\}. \qquad (3)$$

Equation 3 states that the subset of attributes B partitions the universe into sets such that the objects within one set cannot be distinguished from each other using only the attributes in B. These sets are called equivalence classes.
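The partition induced by a subset of attributes can be sketched by grouping objects on their tuple of attribute values. The function name `equivalence_classes` and the toy table are assumptions made for this illustration and do not come from the described system.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence, Set

Table = Dict[str, Dict[str, Hashable]]


def equivalence_classes(table: Table, attrs: Sequence[str]) -> List[Set[str]]:
    """Partition the objects by their values on the attributes in `attrs`."""
    groups: Dict[tuple, Set[str]] = defaultdict(set)
    for obj, row in table.items():
        signature = tuple(row[a] for a in attrs)  # values of obj on B
        groups[signature].add(obj)
    return list(groups.values())


# Hypothetical data, for illustration only.
TABLE: Table = {
    "x1": {"a1": 1, "a2": 0},
    "x2": {"a1": 1, "a2": 1},
    "x3": {"a1": 2, "a2": 1},
}

by_a1 = equivalence_classes(TABLE, ["a1"])          # x1 and x2 fall together
by_both = equivalence_classes(TABLE, ["a1", "a2"])  # every object is separated
```

With B = {a1} the objects x1 and x2 are indiscernible, while B = {a1, a2} separates all three objects.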

If a new attribute is added to the information system and this attribute represents some classification of the objects, the system is called a decision system. We get

$$S = (U, A \cup \{d\}), \qquad (4)$$

where d ∉ A is the decision attribute.

The elements of A are called conditional attributes or conditions. The decision is not necessarily constant on the equivalence classes: two objects belonging to the same equivalence class may have different values of the decision attribute. In this case the decision system is inconsistent (non-deterministic). If a unique classification can be made for all the equivalence classes, the system is consistent (deterministic). To classify an object based only on the equivalence class to which it belongs, we need the concept of set approximation. Given an information system S = (U, A), a subset of attributes B ⊆ A and a set of objects X ⊆ U, we would like to approximate X using only the information contained in B. Writing [x]_B for the equivalence class of x under IND_A(B), we define the B-lower approximation of X as

$$\underline{B}X = \{x \in U \mid [x]_B \subseteq X\} \qquad (5)$$

and the B-upper approximation of X as

$$\overline{B}X = \{x \in U \mid [x]_B \cap X \neq \emptyset\}. \qquad (6)$$

The lower approximation contains all objects whose equivalence class is a subset of the set we would like to approximate; these objects certainly belong to X. The upper approximation contains all objects whose equivalence class has a non-empty intersection with the set we would like to approximate; these objects possibly belong to X. The B-boundary region of X is given by

$$BN_B(X) = \overline{B}X - \underline{B}X. \qquad (7)$$
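The lower approximation, the upper approximation and the boundary region can be sketched directly from equivalence classes. The helper names and the four-object table are hypothetical and serve only to make the definitions concrete.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence, Set

Table = Dict[str, Dict[str, Hashable]]


def equivalence_classes(table: Table, attrs: Sequence[str]) -> List[Set[str]]:
    """Partition the objects by their values on the attributes in `attrs`."""
    groups: Dict[tuple, Set[str]] = defaultdict(set)
    for obj, row in table.items():
        groups[tuple(row[a] for a in attrs)].add(obj)
    return list(groups.values())


def lower_approximation(table: Table, attrs: Sequence[str], target: Set[str]) -> Set[str]:
    """Union of the equivalence classes fully contained in `target`."""
    lower: Set[str] = set()
    for cls in equivalence_classes(table, attrs):
        if cls <= target:
            lower |= cls
    return lower


def upper_approximation(table: Table, attrs: Sequence[str], target: Set[str]) -> Set[str]:
    """Union of the equivalence classes that intersect `target`."""
    upper: Set[str] = set()
    for cls in equivalence_classes(table, attrs):
        if cls & target:
            upper |= cls
    return upper


# Hypothetical data: x1 and x2 are indiscernible on a1.
TABLE: Table = {
    "x1": {"a1": 1},
    "x2": {"a1": 1},
    "x3": {"a1": 2},
    "x4": {"a1": 3},
}
X = {"x1", "x3"}

lower = lower_approximation(TABLE, ["a1"], X)
upper = upper_approximation(TABLE, ["a1"], X)
boundary = upper - lower
```

In this toy case x3 certainly belongs to X, x4 certainly does not, and x1 and x2 form the boundary because they cannot be told apart although only x1 is in X.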

The boundary region contains the objects that can be classified neither as definitely inside X nor as definitely outside X. Not all of the knowledge in an information system is always necessary to divide the objects into classes. In such cases the knowledge can be reduced, and the result of the reduction is a reduct. A reduct is a minimal set of attributes B ⊆ A such that IND_A(B) = IND_A(A). In other words, a reduct [4] is a combination of attributes that discerns between objects as well as the full set of attributes does. Reducts can be computed on the basis of discernibility matrices and discernibility functions. A discernibility matrix of S is a symmetric n × n matrix with entries

$$c_{ij} = \{a \in A \mid a(x_i) \neq a(x_j)\}, \quad i, j = 1, \ldots, n. \qquad (8)$$

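Each entry c_ij thus consists of the attributes on which objects x_i and x_j differ, that is, the attributes that can be used to discern object i from object j. From the discernibility matrix we can build a discernibility function. A discernibility function f_A for an information system S is a Boolean function of m Boolean variables a_1^*, ..., a_m^* (corresponding to the attributes a_1, ..., a_m) defined as below, where c_ij^* = {a^* | a ∈ c_ij}:

$$f_A(a_1^*, \ldots, a_m^*) = \bigwedge \left\{ \bigvee c_{ij}^* \;\middle|\; 1 \le j < i \le n,\ c_{ij} \neq \emptyset \right\}. \qquad (9)$$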

The discernibility function is thus a conjunction over all non-empty entries of the discernibility matrix, where each entry contributes the disjunction of its attributes. After simplification, the prime implicants of the function (the conjunctions that remain) are the possible reducts of the information system. It is also possible to generate a discernibility function from the discernibility matrix relative to a single object of the information system.
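For small tables, the reducts can be illustrated without symbolic Boolean simplification: a set of attributes satisfies the discernibility function exactly when it shares at least one attribute with every non-empty matrix entry, and the minimal such sets are the reducts. The sketch below uses exhaustive search over attribute subsets, which is an assumption made for clarity and is practical only for a handful of attributes; the names and data are hypothetical.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Hashable, List, Sequence

Table = Dict[str, Dict[str, Hashable]]


def discernibility_entries(table: Table, attrs: Sequence[str]) -> List[FrozenSet[str]]:
    """Non-empty entries c_ij of the discernibility matrix, one per object pair."""
    objects = sorted(table)
    entries = []
    for xi, xj in combinations(objects, 2):
        entry = frozenset(a for a in attrs if table[xi][a] != table[xj][a])
        if entry:
            entries.append(entry)
    return entries


def reducts(table: Table, attrs: Sequence[str]) -> List[FrozenSet[str]]:
    """All minimal attribute sets that intersect every non-empty entry."""
    entries = discernibility_entries(table, attrs)
    found: List[FrozenSet[str]] = []
    # Candidates are visited by increasing size, so any smaller reduct
    # contained in a candidate has already been found.
    for size in range(len(attrs) + 1):
        for candidate in combinations(attrs, size):
            cand = frozenset(candidate)
            hits_all = all(cand & entry for entry in entries)
            minimal = not any(r <= cand for r in found)
            if hits_all and minimal:
                found.append(cand)
    return found


# Hypothetical data, for illustration only.
TABLE: Table = {
    "x1": {"a1": 1, "a2": 0, "a3": 0},
    "x2": {"a1": 1, "a2": 1, "a3": 1},
    "x3": {"a1": 2, "a2": 1, "a3": 1},
}

entries = discernibility_entries(TABLE, ["a1", "a2", "a3"])
all_reducts = reducts(TABLE, ["a1", "a2", "a3"])
```

For this table the discernibility function is (a2 ∨ a3) ∧ (a1 ∨ a2 ∨ a3) ∧ a1, which simplifies to (a1 ∧ a2) ∨ (a1 ∧ a3), so the two reducts are {a1, a2} and {a1, a3}.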

For a decision system S = (U, A ∪ {d}) we would like to find an approximation of the decision d. This can be done by constructing the decision-relative discernibility matrix of S, which tells us how to discern an object from the objects belonging to another decision class.

If M(S) = (c_ij) is the discernibility matrix of S, the decision-relative discernibility matrix of S is defined as M^d(S) = (c_ij^d), assuming

$$c^d_{ij} = \begin{cases} \emptyset & \text{if } d(x_i) = d(x_j), \\ c_{ij} - \{d\} & \text{otherwise.} \end{cases}$$

From the reducts computed from this discernibility matrix, we can generate decision rules for classification of the objects [2, 3].
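A minimal sketch of these two steps follows: building the decision-relative entries and reading decision rules off the table restricted to a reduct. The function names, the rule representation and the data are assumptions for illustration; the rule format of the described system is not given in the paper.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Hashable, List, Sequence, Set, Tuple

Table = Dict[str, Dict[str, Hashable]]
Condition = Tuple[Tuple[str, Hashable], ...]


def decision_relative_entries(
    table: Table, attrs: Sequence[str], decision: str
) -> List[FrozenSet[str]]:
    """Entries for object pairs whose decision values differ."""
    objects = sorted(table)
    entries = []
    for xi, xj in combinations(objects, 2):
        if table[xi][decision] == table[xj][decision]:
            continue  # same decision class: the entry is the empty set
        # An empty entry here would signal an inconsistent pair of objects.
        entries.append(frozenset(a for a in attrs if table[xi][a] != table[xj][a]))
    return entries


def decision_rules(
    table: Table, reduct: Sequence[str], decision: str
) -> Dict[Condition, Set[Hashable]]:
    """Map each combination of reduct values to the decisions observed for it."""
    rules: Dict[Condition, Set[Hashable]] = {}
    for row in table.values():
        condition = tuple((a, row[a]) for a in reduct)
        rules.setdefault(condition, set()).add(row[decision])
    return rules


# Hypothetical decision table, for illustration only.
TABLE: Table = {
    "x1": {"a1": 1, "a2": 0, "d": 0},
    "x2": {"a1": 1, "a2": 1, "d": 1},
    "x3": {"a1": 2, "a2": 1, "d": 1},
}

entries = decision_relative_entries(TABLE, ["a1", "a2"], "d")
rules = decision_rules(TABLE, ["a2"], "d")  # {a2} intersects every entry
```

Each rule reads "if a2 = 0 then d = 0" and "if a2 = 1 then d = 1". A condition mapped to more than one decision value would indicate that the table is inconsistent on the chosen attributes.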

DESCRIPTION OF THE SYSTEM

The system was designed and tested so that it can readily import data, mainly from the new diagnostic equipment used by Czech Railways to measure the railway superstructure and substructure. In 1999 the Technical Centre of Transport Ways Praha put into operation two newly developed diagnostic vehicles, the measuring car and the measuring truck. Both are intended for measuring the geometric features of the rail. The measuring car is additionally equipped to measure the cross profile of the rail and the micro-geometry of the rail surface and to evaluate the vehicle response. All measured parameters are stored in the data structures of a Geographic Information System.

The system was developed in Visual Basic within the environment of the GIS software Geomedia Professional. The whole system is shown in Figure 1. The expert system was built as follows:

1. A training decision table of the system was formed. Key attributes a₁, a₂, ... a_n describing variable parameters of the railway track (rail) were selected; these form the columns of the decision table. The rows represent the corresponding sections of the track. The relevant parameters were acquired from the information system of Czech Railways and imported into the decision table. On the basis of consultations with specialists, the last column of the table, marked d, was also filled in (it represents the decision of the expert based on the defined attributes). The track was described by 8 attributes, and data were acquired for 20 sections of 100 m each.

2. In the next step, missing attribute values were filled in. The decision table was then reduced and minimised, which removed redundant columns. Johnson's algorithm was used for this purpose. Duplicate rows were then removed from the decision table.

3. The classification was then carried out and decision rules were generated from the training decision table.

4. Duplicates were removed from the generated rules.

5. The expert algorithm was compiled from the generated rules.
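The attribute reduction in step 2 can be sketched with the greedy heuristic commonly known as Johnson's algorithm: repeatedly pick the attribute that occurs in the most remaining discernibility entries and discard the entries it covers. The original Visual Basic implementation is not shown in the paper, so the function name, the alphabetical tie-breaking and the sample entries below are assumptions.

```python
from collections import Counter
from typing import Iterable, List, Set


def johnson_reduct(entries: Iterable[Iterable[str]]) -> List[str]:
    """Greedily choose attributes until every non-empty entry is covered."""
    remaining: List[Set[str]] = [set(e) for e in entries if e]
    chosen: List[str] = []
    while remaining:
        counts = Counter(a for entry in remaining for a in entry)
        # Most frequent attribute; ties are broken alphabetically (an assumption).
        best = max(sorted(counts), key=lambda a: counts[a])
        chosen.append(best)
        remaining = [entry for entry in remaining if best not in entry]
    return chosen


# Hypothetical discernibility entries, for illustration only.
ENTRIES = [{"a2", "a3"}, {"a1", "a2", "a3"}, {"a1"}]

selected = johnson_reduct(ENTRIES)
```

The heuristic returns a single attribute set that covers every entry. It is fast, but in general the result is an approximation and is not guaranteed to be a minimal reduct.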

The evaluation of the geometric layout of the rail (GLR) is based on statistical analysis of the standard deviations (SD) of individual basic characteristics over 100 m sections of track in the Czech Railways network. For digital output, the standard deviations of the GLR are replaced by dimensionless parameters, i.e. quality marks (indexes), which convert the SD values to numeric values that have the same meaning for any class or category of track, any speed range and any rail parameter. The conversion maps to the range 1-10 (the higher the value, the higher the degree of damage). The following section evaluation marks were used in the tested system:

1. Index of packing a₁

2. Index of vehicle response a₂

3. Index of gauge a₃

4. Index of rail distortion a₄

5. Index of waviness a₅

6. Index of vertical rail wear a₆

7. Index of side rail wear a₇

8. Index of rail incline a₈

The parameter d, representing the decision, covered the following activities: 0 - no activity, 1 - grinding of rails, 2 - packing, 3 - reconstruction of superstructure, 4 - rail reconstruction.
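One possible encoding of a row of the decision table is sketched below: eight quality indexes in the range 1-10 and a decision code from 0 to 4. The class and field names are hypothetical, and the sample values are invented for illustration; they are not measured data.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Tuple


class Decision(IntEnum):
    """Decision codes d as listed in the text."""
    NO_ACTIVITY = 0
    RAIL_GRINDING = 1
    PACKING = 2
    SUPERSTRUCTURE_RECONSTRUCTION = 3
    RAIL_RECONSTRUCTION = 4


# Attributes a1..a8 in the order given in the text.
ATTRIBUTE_NAMES = (
    "packing",
    "vehicle response",
    "gauge",
    "rail distortion",
    "waviness",
    "vertical rail wear",
    "side rail wear",
    "rail incline",
)


@dataclass(frozen=True)
class SectionRecord:
    """One 100 m track section: eight indexes (1-10) and the expert decision."""
    indexes: Tuple[int, ...]
    decision: Decision

    def __post_init__(self) -> None:
        if len(self.indexes) != len(ATTRIBUTE_NAMES):
            raise ValueError("expected one index per attribute")
        if any(not 1 <= value <= 10 for value in self.indexes):
            raise ValueError("indexes must lie in the range 1-10")


# Invented example values, not taken from the paper.
example = SectionRecord(indexes=(7, 3, 2, 4, 3, 2, 2, 1), decision=Decision.PACKING)
```

Validating the range at construction time keeps out-of-range marks from reaching the reduction and rule-generation steps.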

CONCLUSION

Rough set theory is one of the non-statistical approaches to the expert analysis of data. Although it originated in pure mathematics, it has found use in various branches of science, research and practice. In many cases we deal with objects whose features are not completely described (some features of a given object are unknown). Objects with missing features can either be set aside, or the missing features can be filled in on the basis of knowledge of objects with a larger (or full) number of known features. Rough set theory also offers good means of reducing the attributes that describe the given objects, and thereby of reducing the decision rules. These facts underline the advantage of rough set theory over other mathematical methods in the construction of expert systems. It is reasonable to suppose that the method will also find use in expert and decision systems oriented towards data processing and analysis in the field of railway constructions and structures.

The presented contribution shows that it is a very interesting and efficient method. Its effectiveness can be increased further by developing and using additional mathematical procedures (genetic algorithms, neural networks etc.). An expert system for evaluating basic properties was composed and tested on a small amount of data. The system was tested at the Faculty of Civil Engineering with the active participation of undergraduate and graduate students.

Based on the results obtained, it can be said that an expert system based on rough set theory is a very interesting alternative to the classical decision methods in common use today.

REFERENCES

[1] Slowinski R.: Rough Set Approach to Decision Analysis. AI Expert, March 1995
[2] Ziarko W.: Review of Basics of Rough Sets in the Context of Data Mining, Proceedings of the Fourth International Workshop on Rough Sets, Fuzzy Sets and Machine Discovery, Tokyo Nov. 6-8 1996
[3] Bjanger M.S.: Vibration Analysis in Rotating Machinery using Rough Set Theory, Norwegian University of Science and Technology, 1999
[4] Attoh-Okine N. O.: Rough Set Application to Data Mining Principles in PMS, Journal of Computing, 5/1999

This research has been supported by the research project CEZ J22/98 No. 261100007 ("Theory, reliability and mechanism of failure statically and dynamically loaded building construction")

Reviewed by: prof. Ing. Petr Cenek, CSc. (Žilinská univerzita)

Figure No 1