With the increasing popularity of Java-based applications, development platforms are more and more in demand. We selected Eclipse for this project because it is an open-source, proven application that can be built upon.
The “Application Search” will enable developers within a group or corporation to reuse existing code and follow predefined standards.
Our design uses Case-Based Reasoning as the main AI technique.
Experienced developers normally search through the existing code, trying to visually select similar sources and gather ideas during the design of a new application. This process can be slow and ineffective because it relies almost entirely on the developer’s expertise.
Within a large corporation, this process can take time and still leave a junior developer with a poor selection of examples to start from.
A tool that automates this search and is integrated into a widely used development platform is a clear asset to any corporation. It can also come in handy for any developer with a large library of existing source code.
“Eclipse [1] is an open source community whose projects are focused on providing a vendor-neutral open development platform and application frameworks for building software.”
This section gives an overview of the Eclipse development application. We leave the details aside; the reader can consult the extensive documentation presented in [1].
Like most visual development applications, Eclipse enables the creation of workspaces and different kinds of projects. The user can organize the code and compile, debug and run Java applications.
Eclipse already has a Search tool with the following options:
• File Search: searches for files that contain a given text in their bodies. This is very similar to the search available through the operating system.
• Java Search: the developer can look for a text and specify whether the system should look for methods, packages or fields, restricting the search to what is really required.
• Plug-in Search: looks for plug-ins in the file system, also using filters such as “Enabled Plug-ins only”.
Refer to Figure 1 for more details.

Figure 1 - Eclipse interface
The “Application Search” is designed to enhance the level of reusability the developer is able to achieve. It is especially useful for projects involving kinds of applications that the developer has not mastered or has not had the opportunity to work with.
Imagine a developer who has just started a new job and is given his first development task. Every company has its own coding standards that must be followed throughout the corporation. There may also be components of the language that he is not familiar with. In addition, the company might use different business concepts.
It would be much more productive if the developer could intuitively select from a list of applications and easily see how the code was written. The Eclipse Search options are very useful if the developer knows exactly what he is looking for, but not in the scenario described above. They require the developer to know the technicalities of the language, which limits what he can search for.
Proposed Solution
By using a tool that links language and business components to search keywords, the developer works at a higher level of abstraction while designing a new application. His search no longer depends on domain-specific or language-specific expertise. For example, suppose he wants to develop a billing application for a shopping mall. He knows that he would like to reuse code that implements the GUI features and also contains snippets of billing-related code. The developer need not know the required GUI-related APIs to search for code, nor does he need prior knowledge of the business implications of designing a billing system. Thus, our tool provides help at a high level of abstraction. The tool is helpful not only for entry-level developers but also for experienced ones. If an experienced coder wants to write a new application and refer to previously written code that has already been tested for adherence to quality, performance, standards, accuracy, etc., he can make use of this search facility. In this case, the developer can use his expertise to write better search queries. Thus, the tool helps experienced developers make sure that all aspects were taken into consideration, all possibilities were covered and the best option was chosen while writing the new application.
We decided to use Case-Based Reasoning for this project because it is a widely used technique that has produced good results. With an appropriate set of attributes, we can model different kinds of applications and fulfill most of the developer’s needs.
The approach we adopted to implement the search facility is in line with the basic theme of our project, i.e. the reuse of prior knowledge. Case-based reasoning (CBR), broadly construed, is the process of solving new problems based on the solutions of similar past problems. An auto mechanic who fixes an engine by recalling another car that exhibited similar symptoms is using case-based reasoning. A lawyer who advocates a particular outcome in a trial based on legal precedents, or a judge who creates case law, is using case-based reasoning. So, too, an engineer copying working elements of nature is treating nature as a database of solutions to problems. Case-based reasoning is a prominent kind of analogy making. Of the three approaches to CBR, viz. textual, structural and conversational, we propose to use Conversational CBR (CCBR).
The approach involves four steps: Retrieve, Reuse, Revise and Retain.
CCBR is an interactive form of case-based reasoning. It uses a mixed-initiative dialog to guide users through a question-answer sequence that facilitates the case retrieval process. In the traditional CBR process, users are expected to provide a well-defined problem description, and based on that description the CBR system finds the most appropriate case. Usually, however, users cannot define their problem clearly and accurately. So, instead of letting users guess how to describe their problem, CCBR identifies the most discriminative questions automatically and incrementally and displays them to users to extract the information needed for retrieval.
The reader can find more information about Conversational Case-Based Reasoning in reference [4].
6.1) Case Base
Figure 2 - Case Base with some Examples
The case base contains two different kinds of cases:
• Java Applications
The Java Applications are the kinds of applications that can be built using the Java language, such as GUIs and applets. For this category of cases, the author needs to be a Java specialist with a deep knowledge of which components are needed for each kind of Java application.
Java Applications are stored in the case base with attribute Type = “J”.
• Business Applications
This category lists the applications related to a particular business. The case base of Figure 2 has some cases in this category. Cases 14-19, for example, list a set of keywords for a payment application. Bill, address, payment and account are references to database tables, while get_payment_info and create_payment are references to existing methods.
For this set of cases, the author is expected to be a senior developer who knows the application’s business rules. He should be able to select a set of keywords that represent the business aspects of the application.
Business Applications are stored in the case base with attribute Type = “B”.
Users within a corporation should be able to share the case base and get the most up-to-date version. This can be achieved by keeping the case base as an xml file that is loaded every time the “Application Search” is used. Initially it can contain a basic set of Java Applications cases, such as the ones listed in Figure 2. Figure 3 contains a sample xml file that can be used as a starting point when building the case base. The xml is organized in two sections: cases and types. The cases section lists all the cases in the case base, while the types section lists the attributes. Notice that the attribute “type” contains a list of valid values. No value other than “J” and “B” should be added to the case base.
<?xml version="1.0"?>
<caseBase>
  <cases>
    <case>
      <name>x1</name>
      <symptoms>
        <application>GUI</application>
        <keyword>JDialog</keyword>
        <link>http://java.sun.com/j2se/1.4.2/docs/api/javax/swing/JDialog.html</link>
        <tip></tip>
      </symptoms>
    </case>
    <case>
      <name>x2</name>
      <symptoms>
        <application>GUI</application>
        <keyword>JButton</keyword>
        <link>http://java.sun.com/j2se/1.4.2/docs/api/javax/swing/JButton.html</link>
        <tip></tip>
      </symptoms>
    </case>
  </cases>
  <types>
    <type>
      <name>application</name>
      <valueType>symbol</valueType>
    </type>
    <type>
      <name>type</name>
      <valueType>symbol</valueType>
      <value>J</value>
      <value>B</value>
    </type>
    <type>
      <name>link</name>
      <valueType>symbol</valueType>
    </type>
    <type>
      <name>tips</name>
      <valueType>symbol</valueType>
    </type>
  </types>
</caseBase>
Figure 3 - Sample xml
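As an illustration of how a file in this format could be loaded, the following is a minimal sketch that uses the DOM parser shipped with the JDK. The class name CaseBaseLoader, the nested Case type and the load method are hypothetical names chosen for this example; the design does not prescribe a parser or an in-memory representation.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

/** Hypothetical loader for the case base xml; all names are illustrative. */
class CaseBaseLoader {

    /** One case: its name plus the attribute/value pairs found under <symptoms>. */
    static final class Case {
        final String name;
        final Map<String, String> symptoms = new LinkedHashMap<>();

        Case(String name) {
            this.name = name;
        }
    }

    /** Parses every <case> element of the stream into a Case object. */
    static List<Case> load(InputStream xml) {
        Document doc;
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            doc = builder.parse(xml);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not read the case base", e);
        }
        List<Case> cases = new ArrayList<>();
        NodeList caseNodes = doc.getElementsByTagName("case");
        for (int i = 0; i < caseNodes.getLength(); i++) {
            Element caseElement = (Element) caseNodes.item(i);
            Case c = new Case(firstText(caseElement, "name"));
            Element symptoms = (Element) caseElement.getElementsByTagName("symptoms").item(0);
            if (symptoms != null) {
                NodeList children = symptoms.getChildNodes();
                for (int j = 0; j < children.getLength(); j++) {
                    Node child = children.item(j);
                    // Skip whitespace text nodes; keep only attribute elements.
                    if (child.getNodeType() == Node.ELEMENT_NODE) {
                        c.symptoms.put(child.getNodeName(), child.getTextContent().trim());
                    }
                }
            }
            cases.add(c);
        }
        return cases;
    }

    /** Text of the first descendant with the given tag, or "" if there is none. */
    private static String firstText(Element parent, String tag) {
        Node node = parent.getElementsByTagName(tag).item(0);
        return node == null ? "" : node.getTextContent().trim();
    }
}
```

Reading the attributes generically, rather than hard-coding application, keyword, link and tip, is a design choice of this sketch: it keeps the loader working if the types section later gains new attributes.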
6.2) Retrieval
A weighted Hamming distance with α parameters should be applied to order the cases according to their similarity to the given problem. Cases are retrieved according to a given threshold: the retrieved cases are those whose Hamming distance is less than or equal to the threshold, i.e. the cases that are closest to the problem.
The α parameters give different weights to the attributes. The sum of all α parameters should always be 1 (α1 + α2 + … + αv = 1).
Different weights can also be assigned to matches and mismatches through the “alpha” variable. For this project alpha should be set to 0.5, which gives equal weight to matches and mismatches.
The weighted Hamming distance is defined as follows:
• Matches: for each attribute, the equal function returns 1 if the problem matches the case and 0 otherwise. The weighted number of matches is computed using the α parameters and stored as m.
m = equal(X1,Y1) * α1 + equal(X2,Y2) * α2 + ... + equal(Xv,Yv) * αv;
equal(A,B) { if (A==B) return 1; else return 0; }
• Mismatches: for each attribute, the diff function returns 1 if the problem does not match the case and 0 otherwise. The weighted number of mismatches is computed using the α parameters and stored as x.
x = diff(X1,Y1) * α1 + diff(X2,Y2) * α2 + ... + diff(Xv,Yv) * αv;
diff(A,B) { if (A!=B) return 1; else return 0; }
• Hamming Distance
HW(X,Y) = 1 - ((alpha*m)/((alpha*m) + ((1-alpha)*x)));
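The definition above can be sketched in Java as follows. This is a minimal illustration, not the project's implementation: the class name WeightedHamming and the method signature are assumptions, attribute values are assumed to be strings, and the handling of a zero denominator is an added assumption because the text does not cover that case. Note that string values are compared with equals rather than ==.

```java
/** Minimal sketch of the weighted Hamming distance; all names are illustrative. */
class WeightedHamming {

    /**
     * @param problem attribute values of the problem (X1..Xv)
     * @param stored  attribute values of the stored case (Y1..Yv)
     * @param weights the α1..αv parameters, expected to sum to 1
     * @param alpha   balance between matches and mismatches (0.5 = equal weight)
     * @return HW(X,Y), where 0 means identical and 1 means nothing matches
     */
    static double distance(String[] problem, String[] stored, double[] weights, double alpha) {
        if (problem.length != stored.length || problem.length != weights.length) {
            throw new IllegalArgumentException("All arrays must have the same length");
        }
        double m = 0.0; // weighted matches
        double x = 0.0; // weighted mismatches
        for (int i = 0; i < problem.length; i++) {
            if (problem[i].equals(stored[i])) {
                m += weights[i];
            } else {
                x += weights[i];
            }
        }
        double denominator = alpha * m + (1 - alpha) * x;
        if (denominator == 0.0) {
            // Assumption: degenerate input (for example alpha = 1 with no matches)
            // is treated as maximally distant.
            return 1.0;
        }
        return 1.0 - (alpha * m) / denominator;
    }
}
```

With alpha = 0.5 and weights that sum to 1, the two alpha factors cancel and m + x = 1, so the distance reduces to x, the weighted share of mismatching attributes.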
The retrieval phase is split into three steps so that the user can filter the components before searching for the code.
FIRST STEP - The user selects the type of application he is looking for, Business or Java.
INPUT
Threshold: 0.4
Alpha: 0.5
Problem and α parameters:
| | Application | Type | Keywords | Link | Tips |
|---|---|---|---|---|---|
| Problem | “” | <Type selected by the user> | “” | “” | “” |
| α | 0.1 | 0.6 | 0.1 | 0.1 | 0.1 |
For the retrieved cases, populate a list that will display all the available applications.
SECOND STEP - The user selects an application from the list.
INPUT
Threshold: 0.3
Alpha: 0.5
Problem and α parameters:
| | Application | Type | Keywords | Link | Tips |
|---|---|---|---|---|---|
| Problem | <Application selected by the user> | <Type selected by the user> | “” | “” | “” |
| α | 0.35 | 0.35 | 0.1 | 0.1 | 0.1 |
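The first two steps can be sketched as two threshold-filtered retrievals over the case base. The sketch below is hypothetical: the class and method names, the representation of a case as a five-element string array (application, type, keyword, link, tip) and the small EPSILON tolerance are assumptions made for illustration. The weights, thresholds and alpha are the values given in the inputs above.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of the first two retrieval steps; all names are illustrative. */
class CaseRetrieval {
    // Assumed attribute order: application, type, keyword, link, tip.
    static final double[] STEP1_WEIGHTS = {0.1, 0.6, 0.1, 0.1, 0.1};
    static final double[] STEP2_WEIGHTS = {0.35, 0.35, 0.1, 0.1, 0.1};
    static final double ALPHA = 0.5;
    // Assumption: a tolerance so floating-point rounding does not drop
    // cases that sit exactly on the threshold.
    static final double EPSILON = 1e-9;

    /** Weighted Hamming distance with alpha fixed at 0.5. */
    static double distance(String[] problem, String[] stored, double[] weights) {
        double m = 0.0; // weighted matches
        double x = 0.0; // weighted mismatches
        for (int i = 0; i < weights.length; i++) {
            if (problem[i].equals(stored[i])) {
                m += weights[i];
            } else {
                x += weights[i];
            }
        }
        return 1.0 - (ALPHA * m) / (ALPHA * m + (1 - ALPHA) * x);
    }

    /** Returns the cases whose distance to the problem is within the threshold. */
    static List<String[]> retrieve(List<String[]> caseBase, String[] problem,
                                   double[] weights, double threshold) {
        List<String[]> hits = new ArrayList<>();
        for (String[] c : caseBase) {
            if (distance(problem, c, weights) <= threshold + EPSILON) {
                hits.add(c);
            }
        }
        return hits;
    }

    /** First step: distinct applications available for the selected type. */
    static Set<String> applicationsForType(List<String[]> caseBase, String type) {
        String[] problem = {"", type, "", "", ""};
        Set<String> applications = new LinkedHashSet<>();
        for (String[] c : retrieve(caseBase, problem, STEP1_WEIGHTS, 0.4)) {
            applications.add(c[0]);
        }
        return applications;
    }

    /** Second step: keywords (components) stored for the selected application. */
    static List<String> keywordsFor(List<String[]> caseBase, String type, String application) {
        String[] problem = {application, type, "", "", ""};
        List<String> keywords = new ArrayList<>();
        for (String[] c : retrieve(caseBase, problem, STEP2_WEIGHTS, 0.3)) {
            keywords.add(c[2]);
        }
        return keywords;
    }
}
```

With these weights, a case whose type matches in the first step has a distance of at most 0.4, and a case whose application and type both match in the second step has a distance of at most 0.3, so the thresholds admit exactly those cases. An empty problem value matches only an empty case value, which follows the definition literally.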
THIRD STEP - The user is given the option to use all the possible components (represented by all the keywords for a given application in the case base) or just some of them.
After the components are selected, the system searches for all “.java” files containing the given keywords. The search technique may be the algorithm used by Eclipse’s existing search engines or any other keyword search that meets the project’s needs.
The default home directory (the directory where the system starts the search, going through all its sub-directories as well) is the local user home directory. The user should be able to select a different starting point if he wants to look for files in another location or even on a remote server. The result is presented to the user together with links to Java documents and useful tips. The tips are well-known issues that the author decides to share with the developers to ensure higher code quality.
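The third step could be prototyped with a plain recursive file scan before integrating with Eclipse's own search engines. The following is a minimal sketch under stated assumptions: the class name KeywordSearch and its method are hypothetical, files are read as UTF-8, and a file is treated as a hit when it contains at least one of the selected keywords (the text does not say whether any or all keywords must be present).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

/** Hypothetical keyword search over .java files; all names are illustrative. */
class KeywordSearch {

    /** Returns every .java file under root that contains at least one keyword. */
    static List<Path> search(Path root, List<String> keywords) {
        List<Path> matches = new ArrayList<>();
        // Files.walk visits the starting directory and all of its sub-directories.
        try (Stream<Path> paths = Files.walk(root)) {
            paths.filter(Files::isRegularFile)
                 .filter(p -> p.getFileName().toString().endsWith(".java"))
                 .filter(p -> containsAny(p, keywords))
                 .forEach(matches::add);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return matches;
    }

    /** True if the file text contains any keyword; unreadable files are skipped. */
    private static boolean containsAny(Path file, List<String> keywords) {
        String text;
        try {
            text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        } catch (IOException e) {
            return false;
        }
        for (String keyword : keywords) {
            if (text.contains(keyword)) {
                return true;
            }
        }
        return false;
    }
}
```

To start from the default location described above, the caller could pass Paths.get(System.getProperty("user.home")) as the root. Searching a remote server is outside the scope of this sketch.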
6.3) Authoring
The authoring system is a very important tool for any knowledge-based system. For this project, the authoring tool gives the domain expert the option to add new cases to the case base by simply choosing a set of attribute-value pairs.
The author should be able to:
• Add a new case;
• Delete an existing case;
• Change an existing case.
Any change to the case base must be reflected in the shared xml file in order to propagate it to the other developers using the system.
New cases should be added to the end of the cases section of the xml file, before the </cases> tag. New cases should use the xml format described below.
| XML Entry | Comment |
|---|---|
| <case> | Beginning of the case |
| <name>x1</name> | Each case should have a unique name |
| <symptoms> | The symptoms are the set of possible attribute/value pairs |
| <application>GUI</application> | Application name/type |
| <keyword>JDialog</keyword> | Keyword relevant to the application |
| <link>http://java.sun.com/j2se/1.4.2/docs/api/javax/swing/JDialog.html</link> | Link for more information |
| <tip></tip> | Extra guidance from expert developers (author) |
| </symptoms> | End of symptoms section |
| </case> | End of the case |
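Adding a case in this format can be sketched with the JDK's DOM and Transformer APIs. This is an illustrative sketch only: the class name CaseAuthoring and the addCase signature are hypothetical, the method works on the xml text rather than on the shared file so that reading and writing the file stay with the caller, and the duplicate-name check is an assumption derived from the rule that each case should have a unique name.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

/** Hypothetical authoring helper; all names are illustrative. */
class CaseAuthoring {

    /** Returns the case base xml with a new case appended at the end of <cases>. */
    static String addCase(String caseBaseXml, String name, String application,
                          String keyword, String link, String tip) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(caseBaseXml)));
            Element cases = (Element) doc.getElementsByTagName("cases").item(0);
            if (cases == null) {
                throw new IllegalArgumentException("No <cases> section found");
            }
            // Each case should have a unique name.
            NodeList names = cases.getElementsByTagName("name");
            for (int i = 0; i < names.getLength(); i++) {
                if (names.item(i).getTextContent().trim().equals(name)) {
                    throw new IllegalArgumentException("Case name already exists: " + name);
                }
            }
            Element newCase = doc.createElement("case");
            appendText(doc, newCase, "name", name);
            Element symptoms = doc.createElement("symptoms");
            appendText(doc, symptoms, "application", application);
            appendText(doc, symptoms, "keyword", keyword);
            appendText(doc, symptoms, "link", link);
            appendText(doc, symptoms, "tip", tip);
            newCase.appendChild(symptoms);
            // appendChild places the case last, i.e. just before </cases>.
            cases.appendChild(newCase);

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.INDENT, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (ParserConfigurationException | SAXException | IOException | TransformerException e) {
            throw new IllegalStateException("Could not update the case base", e);
        }
    }

    /** Appends <tag>text</tag> to the parent element. */
    private static void appendText(Document doc, Element parent, String tag, String text) {
        Element child = doc.createElement(tag);
        child.setTextContent(text);
        parent.appendChild(child);
    }
}
```

Deleting or changing a case would follow the same pattern: locate the case element by its name, then remove it or replace the text of its child elements before serializing the document again.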
6.4) Implementation Checklist
• Choose an xml parser;
• Build the case base using the example cases;
• Implement the retrieval methods, including the Hamming distance calculation and the keyword search;
• Implement the authoring methods that update the case base;
• Build the Application Search GUI.
7.1) Main Window
![]()




7.1.2) Search For
The user chooses between Business and Java applications. The system should use this selection to retrieve all the relevant cases from the case base, extract the list of applications from those cases and display it in the Application drop-down list.
7.1.3) Application
This list contains all the applications for the given type. The user should be able to choose one application from this list and see the components stored in the case base for that application.
7.1.4) Component Selection
These two boxes contain controls that let the user choose among a set of components (keywords) for the Application Search. If no component is selected, all the keywords should be used for the search (the same behavior as when all components are selected).
7.1.5) Customize
The Customize button should open the “Customize” window (7.2).
7.2) Customization

Through this window, the user should be able to:
• Change the case base (xml file) location;
• Access the Authoring System.
7.3) Authoring
7.3.1) Password Request
Only authorized users should be able to make changes to the case base.

7.3.2) New Case
Add a new case to the case base. The types section of the xml file is used to populate the valid values for “Type” (“J” for Java and “B” for Business applications).

7.3.3) Cases
List all available cases in the case base.

7.3.4) Edit Case
The author should be able to select a case from the case list. A new window is displayed with the case details. When the “OK” button is pressed, the case is updated in the case base.

Remember that any change to the case base must be reflected in the xml file, so that whenever a developer within the corporation performs a search, the new or updated cases are used.
7.4) Results

Case-Based Reasoning is a powerful AI technique widely applied in commercial applications. By applying Case-Based Reasoning to the Eclipse search engine, it is possible to offer developers a wider range of options for improving code reuse, especially within a corporation.
Developers with different levels of expertise will benefit from such a tool during the design and implementation of new applications.
[1] www.eclipse.org
[2] http://www.cse.lehigh.edu/~munoz/CSE335/
[3] http://java.sun.com/j2se/1.4.2/docs/api/
[4] Conversational Case-Based Reasoning – David Aha, Leonard Breslow, Hector Munoz-Avila, Sept 1999
[5] Conversational Case-Based Software Reasoning in Reuse – Mingyang Gu
[6] Experience Management – Ralph Bergmann