October 2
- Noted some objectives for present and future development
(e.g., December / January).
- Objective 1: Separate the low level robotic control from the
basic GUI and simulation framework, so that others can use the
code for their own robot platforms. That is, abstract the
notion of physical robotic experimentation into a separate
component in the code (or control in the GUI).
- Objective 2: Write some fast C code (OpenGL) for the graphics
portion of the MazeFrame so we can show more than a two
dimensional overview of the robot. Envision using 3D graphics
to represent the robot and maze during simulation. May also
make this the primary frame, i.e., flatten the GUI so that we
are keeping track of statistics and camera movement in
secondary windows.
- Objective 3: Do some user task analysis for the interface,
including, but not limited to, task scenarios and evaluation
using heuristics.
- Objective 4: Most importantly, get the simulation working, at
least at a crude level. Compare the algorithms and do some
algorithmic analysis before incorporating machine learning.
This is another reason to better modularize the robot/GUI code.
August 21
- All documentation has been converted to HTML format.
- Need a simple, possibly non-GUI application that uses and tests
only the QLearning* classes (~ 2 days).
- JNI code needs to be changed for the PC platform (~ 3 - 4 days).
August 11
- ExplorerFrame and MazeFrame are basically complete.
- StatusFrame is not complete, but the source of the data
has already been incorporated, so the fix should not be too
time-consuming (~ 3 - 5 hrs).
- Four-neighbor drawing code has been fixed; it works fine now
for LineRobot. However, CamRobot drawing code is buggy
for eight neighbors (~ 2 hrs).
- The Readme dialog won't follow hyperlinks correctly (~ 3 hrs).
- About dialog is empty, but is very simple to implement
(~ 20 minutes).
- Communication in the InterfaceFrame is not finished; in
particular, we need an ExperimentManager that tells the
QLearningExperiment to do things and communicates the info
to the frames (~ 3 days).
- Open and save utilities not implemented; need a file format
and some file stream code in the listener for the open
and save buttons (~ 4 - 5 days).
- Camera.c won't compile, due to Java glue and IMA (~
forever).
Native code is implemented for the Unix system at the
UCR Visualization and Intelligent
Systems Laboratory. Namely, CamRobot.c, LineRobot.c, Camera.c, and
HandyBoard.c are for use specifically with the IMA imaging system and the MIT
Handy Board. To use the code for a different platform, simply change
the specific references to these classes in the Status, Explorer, and
Maze frames, or add your implementation as an option. See
Program Structure for details.
All files are in the ucr_reu directory (corresponding to the
ucr_reu package). The Java-specific paths must be set as follows:
CLASSPATH set to the directory containing ucr_reu, LD_LIBRARY_PATH
set to ucr_reu itself.
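For example, assuming a Bourne-style shell and a placeholder checkout path /home/you/project (substitute your own directory; under csh use setenv instead), the two variables could be set as:

```shell
# Placeholder path -- substitute the directory where you checked out the
# project. Sketch for a Bourne-style shell.
PROJECT_DIR=/home/you/project
export CLASSPATH=$PROJECT_DIR                 # directory containing ucr_reu
export LD_LIBRARY_PATH=$PROJECT_DIR/ucr_reu   # JNI shared libraries (libCamRobot.so, ...)
```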
To recompile the project, do the following:
- Make sure the CamRobot.java file static block loads the "CamRobot"
library.
- Compile the Java files using javac *.java, with the
-classpath option set to the directory containing ucr_reu.
- Create a header file for CamRobot.java using javah -jni CamRobot.
You must also reset the CLASSPATH (or use the -classpath option)
to ucr_reu so that javah can find the CamRobot class.
- Make sure your C implementation file meets the requirements of
the header file (same function declarations).
- Invoke the C compiler, creating the shared library in the process
using
cc -G -I/usr/java/include -I/usr/java/include/solaris
CamRobot.c -o libCamRobot.so, assuming your java directory is
located in /usr (as in sevenup). For shasta, use:
cc -G -I/usr/local/inst/jdk1.2.2/include
-I/usr/local/inst/jdk1.2.2/include/solaris CamRobot.c -o
libCamRobot.so
- Repeat procedure for LineRobot and Camera classes.
To run the program, use java ucr_reu/InterfaceFrame,
with -classpath set to the directory containing ucr_reu; or simply
java InterfaceFrame (unset your CLASSPATH).
Given starting coordinates and a starting orientation, the task
of the robot is to search for the goal coordinate of an arbitrary
maze possibly filled with obstacles. When the goal is found,
the robot is placed back at the starting coordinates and the next
trial (or episode) of the search takes place. The robot and the
corresponding search algorithm have no knowledge of the goal
coordinates in any trial, and no initial knowledge of the
environment. As the number of episodes increases, the
performance of the robot, as measured by the total number of moves
taken to reach the goal, should improve.
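The episodic structure described above can be sketched as follows. All names here are hypothetical, and the fixed walk stands in for a real search algorithm; with learning, the move count should fall across episodes rather than stay constant:

```java
// Minimal sketch of the episodic search loop: each episode starts the
// robot at the same coordinates and counts moves until the goal is found.
public class EpisodeSketch {
    public static void main(String[] args) {
        int startX = 0, startY = 0;   // starting coordinates
        int goalX = 4, goalY = 4;     // goal, unknown to the search algorithm
        int episodes = 3;

        for (int episode = 0; episode < episodes; episode++) {
            int x = startX, y = startY;  // robot placed back at the start
            int moves = 0;
            // A naive walk stands in for the real search; a learning
            // algorithm should reduce 'moves' over successive episodes.
            while (x != goalX || y != goalY) {
                if (x < goalX) x++;
                else if (y < goalY) y++;
                moves++;
            }
            System.out.println("Episode " + episode + ": " + moves + " moves");
        }
    }
}
```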
Although mazes and algorithms of various shapes and types can be
used for this experiment, we chose to implement only 4- and
8-neighbor Q-learning on a rectangular maze. The interface is
flexible enough, however, to allow for mazes of different types
(one only has to implement a specific maze class that extends the
abstract Maze class). Similarly, different types of algorithms
can be created: one needs to make small changes to
NewExperimentDialog, implement an option panel for the new
algorithm specifying the user input parameters, a custom Maze or
RectMaze subclass that keeps track of state information, and an
Experiment subclass which takes a subclass of Robot and a subclass
of Maze among its arguments and generates specific actions by
calling the robot and updating the maze.
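As a rough illustration of that extension pattern, a custom maze might look like the sketch below. The abstract method names and the HexMaze class are hypothetical; the real ucr_reu.Maze may declare different signatures:

```java
// Hypothetical sketch of extending an abstract Maze; the actual
// ucr_reu.Maze class declares its own (possibly different) methods.
public class MazeExtensionSketch {
    abstract static class Maze {
        protected int startX, startY, goalX, goalY;  // search endpoints
        abstract boolean placeObstacle(int x, int y);
        abstract boolean removeObstacle(int x, int y);
    }

    static class HexMaze extends Maze {
        private final boolean[][] blocked;
        HexMaze(int width, int height) { blocked = new boolean[width][height]; }
        @Override boolean placeObstacle(int x, int y) {
            if (blocked[x][y]) return false;  // error check: already occupied
            blocked[x][y] = true;
            return true;
        }
        @Override boolean removeObstacle(int x, int y) {
            if (!blocked[x][y]) return false; // error check: nothing to remove
            blocked[x][y] = false;
            return true;
        }
    }

    public static void main(String[] args) {
        HexMaze maze = new HexMaze(6, 6);
        System.out.println("place (2,3): " + maze.placeObstacle(2, 3));
        System.out.println("place (2,3) again: " + maze.placeObstacle(2, 3));
        System.out.println("remove (2,3): " + maze.removeObstacle(2, 3));
    }
}
```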
An algorithmic implementation necessitates an implementation of
the abstract Robot class. Currently, the two direct subclasses
of Robot are LineRobot and CamRobot. LineRobot controls the line-
sensing robot from the 1999 NSF UCR REU robotics project. See
last year's
documentation for more details. CamRobot controls this year's
NSF UCR REU robot. It has a mounted camera, four proximity
sensors, stepper motors, servo motors for rotating and tilting
the camera, encoders for fine position control, and bump sensors.
When a new robot is built, we need to construct a new subclass
of Robot. We also need to add a specific case for the code in
the ExplorerFrame and possibly the initialization dialogs.
This year's project focuses on the use of a mounted camera to detect and
avoid obstacles. To improve performance, the camera communicates
directly with the computer, which sends signals to the Handy Board
on the robot, telling it to check sensors, move forward, rotate,
etc. The robot also has error checking capabilities associated
with movement (encoders) and emergencies (bump sensors). We
hope to demonstrate improved performance, as measured by the
number of actions taken to reach the goal state in a maze, in a
real-time robotic system.
The only algorithms implemented so far are 4- and 8-neighbor
Q-learning.
The following is a list of classes for the project, along with
specifications, their super class, and links to their present
source code. (You can freely access the HTML source and compile
the classes directly after removing the first and last lines of
the file. Note that package statements have been omitted from
the source to allow easy access.)
In addition to these classes, the project contains the interface
InterfaceConstants,
for the Reinforcement Learning Robotic Interface package.
It contains all the programmer-specified constants for the
package, including
- interface modes
- required bounds (in pixels) of frames and graphic components
- colors, borders and other GUI specifications
- default start-up values for robots and mazes
- strings for image files and hyperlinks
Every concrete class in the package implements InterfaceConstants.
The three abstract classes: Robot, Experiment, and Maze can be used in
other projects, perhaps as extensions. They are the core classes to extend
for expanding the capabilities of the Reinforcement Robotic Interface.
Note that some classes are implemented by native code in conjunction with
the
Java Native Interface.
All robot code is written in Interactive C, for the MIT Handy Board.
HandyBoard.c is the implementation
of receiver/transmitter communication from the Handy Board's side.
Note that motor and sensor code have not been added. A possible way to
control the stepper motor using software is presented in
StepperMotor.c, which was found
in Peter Harrison's
web site.
The computer/board communication has been tested; optimizations for speed,
however, have not yet been made. We hope to increase the baud rate for
the communication for CamRobot and decrease the delay time on the computer's
side. A version of the code for the LineRobot is working. Specifically, we
can command the old line robot directly and show its status and position
in the Interface Frame.
Class |
Description |
Extends |
AboutDialog
|
A modal dialog that provides product information about the
reinforcement learning robotic interface package. It also
contains links to documentation on Java and the interface
package on the web. |
javax.swing.JDialog |
AlgorithmOptionPane
|
An opaque panel that obtains the name of the preferred algorithm
from the user via a set of radio buttons. At present, only
Q-learning is supported, but extensions for new algorithms
have been made. |
javax.swing.JPanel |
CamRobot
|
An implementation of Robot capable of moving an arbitrary number
of steps and rotating an arbitrary number of degrees. It
has a camera mounted on the robot and four proximity
sensors that return an integer describing the distance of
the nearest obstacle in four different directions.
Implementation in native code is found in
CamRobot.c and
CamRobot.h. |
Robot |
Camera
|
An Object that can grab images from the Imaging Modular Vision
system. It can also describe obstacles detected by the
camera. Implementation in native code is found in
Camera.c and
Camera.h. |
java.lang.Object |
CommandModeDialog
|
A modal dialog containing setup preferences for a robot command
mode session. It allows robot type selection and maze
characterization (width, height, start and goal
coordinates), but not obstacle initialization. |
TabbedDialog |
DirectionVector
|
An eight dimensional vector containing double values for the eight
directions in the order (North, Northeast, East, Southeast, South,
Southwest, West, Northwest). The structure can also give
the maximum of the direction values without sorting by keeping
track of a max value variable. |
java.lang.Object |
abstract
Experiment
|
An abstract class for running robotic experiments. It holds an
instance of Robot and specifies methods that control the
flow of the experiment by calling the robot (experimental
initiation should take place in the subclass's constructor). |
java.lang.Object |
ExperimentFileChooser
|
A modal file chooser for opening and saving robotic
experiment files. At present the only file filter
implemented is for ".qre" files, which record experiment
statistics for continuing Q-learning robotic experiments. |
javax.swing.JFileChooser |
ExplorerFrame
|
A resizable and movable internal frame that captures the
live view of the maze from the robot's perspective. For a
CamRobot, this is implemented by the camera mounted
directly on the robot. The frame also contains tool bars
for commanding the robot during a command mode session. |
javax.swing.JInternalFrame |
HistoryTableModel
|
A data model for the status frame that displays robot status
and experiment progress statistics. It can be modified to
display different types of information for different
interface modes. |
javax.swing.table.AbstractTableModel |
InterfaceFrame
|
The main frame for the Reinforcement Learning Robotic Interface
package. It has a menu bar for selecting new experiments,
command mode sessions, or simulations, for saving
experiments, and for displaying help information. It
contains a background image and three internal frames:
explorer frame, status frame, and maze frame. It also
contains references to Robot, RectMaze, and Experiment
objects, which are passed between the internal frames, the
initialization dialogs, and the interface frame itself. |
javax.swing.JFrame |
LineRobot
|
An implementation of Robot capable of moving in a grid by
using line sensors to detect grid lines on a rectangular maze.
It can be monitored via an overhead camera. Implementation in
native code is found in
LineRobot.c and
LineRobot.h. |
Robot |
abstract
Maze
|
An abstract class that encapsulates the notion of a
2-dimensional space with specific goal coordinates. It keeps
track of the start and goal coordinates for a search through
the abstract maze space and specifies methods for placing
and displacing obstacles. |
java.lang.Object |
MazeFrame
|
A resizable and movable internal frame that provides a
real-time, live view of the maze from an overhead
perspective. The robot, maze, start and goal coordinates,
obstacles, and search path are illustrated as the experiment
or command mode session takes place in real time. |
javax.swing.JInternalFrame |
MazeOptionPane
|
An opaque panel that obtains the maze size, start coordinates,
and goal coordinates from the user via a set of text fields.
It modifies the RectMaze passed in through the constructor
directly and provides action-based error checking. |
javax.swing.JPanel |
MazePane
|
A graphic component that illustrates the progress of the robot
in real time. In a simulation session, the obstacles are
pre-drawn on the maze; otherwise, obstacles are drawn as
they are discovered. |
javax.swing.JPanel |
NewExperimentDialog
|
A modal dialog containing setup preferences for a new robotic
experiment. It allows robot type selection, maze
characterization (width, height, start and goal coordinates),
algorithm selection, algorithm specification, but not
obstacle initialization. |
TabbedDialog |
ObstacleOptionPane
|
An opaque panel that obtains the obstacle coordinates from
the user during simulation session initialization. It
contains a graphic panel which receives input from the user
in the form of mouse clicks and displays the current
selection in a label. |
javax.swing.JPanel |
ObstacleSelectionPane
|
A graphic component that illustrates the placement and
displacement of obstacles in a simulation session initialization.
It draws obstacles as the user clicks on a grid, checking for
errors dynamically as they occur. |
javax.swing.JPanel |
QLearningExperiment
|
An implementation of Experiment for use in a 4- or 8-neighbor
Q-learning reinforcement experiment or simulation. It
controls the flow of the experiment by calling its Robot
object to perform actions and updating the q values via
its QLearningRectMaze object. Specifically, the action
taken during each robot state is randomly selected with
respect to the q values for each possible action:
P(ai | s) = (k ^ Q(s, ai)) / (sum over j (k ^ Q(s, aj)))
where the probability of taking the action ai at state s
is calculated from the user-specified exploitation factor k
greater than 0, and the q values for each action at state s.
As k increases, the equation favors actions with high q
values more and more. Note that when k = 1 or when
Q(s, ai) = 0 for all ai, the calculated probability of
taking each available action ai is the same. |
Experiment |
QLearningOptionPane
|
An opaque panel that obtains data from the user specifying the
Q-learning experiment parameters: number of neighbors,
number of trials, goal reinforcement, discount factor,
exploitation factor, etc. It performs action-based error
checking on the user input. |
javax.swing.JPanel |
QLearningRectMaze
|
A subclass of RectMaze that supports Q-learning experiments.
It keeps track of q values for all possible state actions,
previous headings and coordinates, and the number of times
a state action has occurred. It can update its state
action table and q value table. Q values are updated by
treating the experiment as a nondeterministic Markov
Decision Process, with the update rule:
Qn(s, a) = (1 - X) * Qn-1(s, a) + X(r + d * max (Qn-1(s', a')))
where the Q value at state s while taking the action a is
computed from X (the learning rate, which depends on the
number of times the specific state action has occurred), r
(the reinforcement value associated with the state), Qn-1
(the previous Q value at s), d (the discount factor between 0
and 1), and the maximum of the Q values at the new
state s'. |
RectMaze |
ReadmeDialog
|
A modal dialog that brings the interface readme into the program
by accessing its URL. It also allows the user to browse
through the javadoc documentation. |
javax.swing.JDialog |
RectMaze
|
An implementation of Maze that represents a rectangular
maze with square cells that are either obstacles or free
cells. It also keeps track of the current position of the
target within the grid and performs error checking on
attempts to modify the maze. |
Maze |
abstract
Robot
|
An abstract class that encapsulates the basic
functionalities of a movable machine that accepts commands
from the user. It keeps track of the current heading of
the robot as it traverses through a maze. |
java.lang.Object |
RobotOptionPane
|
An opaque panel that obtains the type of robot to be used
for the experiment from the user. At present only line
robot and camera robot are supported, both of which are
quite specific in their functionality, and both are
implemented via a subclass of Robot. A better
implementation would involve identifying the components of
the robot and specifying their attributes during experiment
initialization. |
javax.swing.JPanel |
StatusFrame
|
A resizable and movable internal frame that keeps track of the
status of the robot and maze characteristics in a history table.
It displays different types of information for different
interface modes (simulation, experiment, command). |
javax.swing.JInternalFrame |
TabbedDialog
|
A general purpose dialog window with a finite number of tabs and
two buttons (ok and cancel). Its subclasses are used to obtain
user input during experiment, simulation, and command mode
initialization. |
javax.swing.JDialog |
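The two formulas quoted in the QLearningExperiment and QLearningRectMaze entries can be sketched in Java as below. Helper and parameter names are hypothetical; only the arithmetic follows the formulas in the table:

```java
// Sketch of the action-selection rule
//   P(ai | s) = k^Q(s, ai) / sum over j of k^Q(s, aj)
// and the nondeterministic Q update
//   Qn(s, a) = (1 - X) * Qn-1(s, a) + X * (r + d * max Qn-1(s', a'))
public class QLearningSketch {
    // Probability of each action given its Q value and exploitation factor k.
    static double[] actionProbabilities(double[] q, double k) {
        double[] p = new double[q.length];
        double sum = 0.0;
        for (int j = 0; j < q.length; j++) sum += Math.pow(k, q[j]);
        for (int i = 0; i < q.length; i++) p[i] = Math.pow(k, q[i]) / sum;
        return p;
    }

    // One Q-value update; the learning rate X depends on how often
    // the state action has occurred (e.g., 1 / (1 + visits)).
    static double updateQ(double prevQ, double learningRate,
                          double reward, double discount, double maxNextQ) {
        return (1 - learningRate) * prevQ
             + learningRate * (reward + discount * maxNextQ);
    }

    public static void main(String[] args) {
        // With all Q values equal, every action is equally likely (p = 0.25).
        double[] p = actionProbabilities(new double[] {0, 0, 0, 0}, 2.0);
        System.out.println("uniform p[0] = " + p[0]);

        // Reaching the goal with reward 100, learning rate 0.5, discount 0.9.
        double updated = updateQ(0.0, 0.5, 100.0, 0.9, 0.0);
        System.out.println("updated Q = " + updated);
    }
}
```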
- Mitchell, Tom.
Machine Learning.
McGraw-Hill, 1997. Ch 13.
Introduces QLearning theory and heuristics such as probability algorithms
for selection of the next action and simple algorithms for balancing
exploitation and exploration in a real experiment.
- Jain, Ramesh, Rangachar Kasturi, and Brian G. Schunck.
Machine Vision.
McGraw-Hill, 1995. Ch 3.
Overview of simple thresholding algorithms for detecting obstacles in
different situations; for example, when the maze grid size is fixed.
- Jones, Joseph L., Bruce A. Seiger, and Anita M. Flynn.
Mobile Robots: Inspiration to Implementation.
2nd Ed. AK Peters, 1999. Robot Programming Chapter.
Teaches the art of programming for a mobile robot using a real robot
example. Uses and teaches the Interactive C paradigm.
- Martin, Fred.
The Handy Board Technical Reference.
Available Online.
Details on Handy Board library functions, usage of Interactive C programming
language, components of the Handy Board, etc.
- Motorola.
Motorola 68HC11 Programming Reference Guide.
Available Online.
Handy reference for the Motorola chip used in the Handy Board, including
address locations of control bits and other assembly programming necessities.
- Kernighan, Brian, and Dennis Ritchie.
The C Programming Language.
Prentice Hall, 1988. Ch 7, 8.
Describes the Unix system interface and the use of C to read and write
to the serial port on a Unix machine. Also describes files, file pointers,
and related technology for using data files to debug code.
- Campione, Mary, and Kathy Walrath.
The Java Tutorial.
2nd Ed. Addison-Wesley, 1998. GUI section.
Gives the details of Swing/JFC GUI programming. Look specifically at the
Java2D API for drawing the robot image, JNI for native interface implementations,
and Essential Java Classes for discussion of file streams and strings.
- Austin, Calvin, and Monica Pawlan.
Advanced Programming for the Java 2 Platform.
Available Online. Ch 5 - 6.
An overview of various Java essentials for advanced GUI building, Java Native
Interface, and performance issues.
The Java2 API is available here.
The API documentation for the Reinforcement Learning Robotic
Interface package is also available.
An overview of the inheritance hierarchy for the project is here.
For more information, please contact:
Pat Leang
, for questions regarding the transmitter / receiver,
motor, sensor, encoder and Handy Board-related code
Ray Luo
, for questions regarding the interface GUI, and communication
between robot, computer, and camera
Thuan Dinh
, for questions regarding the camera and the vision
processing algorithm
Steve Wong
, for questions regarding the testing code
Created by Ray Luo