
Python + UML =

CP4E? This new acronym stands for Computer Programming for Everybody, a new initiative by the creator of the Python language, Guido van Rossum. The proposal for CP4E was submitted in July 1999 to the Corporation for National Research Initiatives (CNRI). Its primary goals are to create a training curriculum and advanced, easy-to-use development tools for teaching the Python language to millions of non-technical computer users. Why?

"If we are successful, non-experts will be able to use their computers and other intelligent devices much more effectively, reducing their level of frustration and increasing their productivity and work satisfaction."

The Python language has already won acclaim as one of the most elegant and powerful object-oriented programming languages in use today. Python is open source and freely available for a wide range of platforms and programming environments. Many highly scalable applications have been implemented with Python on these platforms. See our Python Link Collection for links to Python resources and to get an idea of the range of Python's deployment.

It is our belief that UML, a widely adopted standard notation for designing object-oriented software, could play a key role in the CP4E initiative and in Python development in general, as we explain below.


It is useful to understand the scope of what Guido is proposing with CP4E. Python is many things to many different people but several of its strongest underpinnings are:

  • It is a scripting language used to rapidly accomplish a wide variety of tasks.
  • It has syntax designed for simplicity of use and the elimination of the arcane details of many programming languages.
  • It is object-oriented and consequently many of its libraries are designed for extensibility.

Given these strengths, Python is well positioned as a language for customizing all sorts of tasks for computer users. Guido's vision is that computers will be embedded in devices of many types in the post-PC era and that the users of these devices need to be empowered to tailor the way they use them to their own needs.

"In the future, non-programmers will be using a plethora of information appliances... A person may have messages arriving over many media--text, voice, video--and access them via many devices such as PDA, mobile phone, and computer screen. We expect seamless interoperability, so that for example a mobile phone can be used to follow up to an email with a voice response without having to look up the number. Small programs could be used to customize the interfaces on these devices and to filter and limit the flow of messages through them."

The nearest analogy is Microsoft's Visual Basic for Applications (VBA), which has had similarly wide-ranging ambitions set for it in the PC era: embedding VBA in desktop applications allows users to customize and extend them.

Visual Programming

While Python is very mature as a language and has a broad collection of library modules, it is far behind when it comes to the development tools used to build applications. After all, real programmers don't need debuggers, object browsers or GUI layout tools, right? Well, not quite.

To be fair, Python is open source and cross-platform. Producing a powerful, state-of-the-art IDE that works on multiple platforms requires significant resources not generally available to open source projects. What Python developers have relied on to date is limited to syntax-aware editing extensions for editors like emacs. These extensions help with code indentation and syntax completion, and they 'colorize' the source code, that is, they color-code the syntax elements of the language as a visual cue.

Guido himself has worked on the nucleus of a development tool written in portable Python called IDLE (Integrated DeveLopment Environment). IDLE supports such features as colorization, auto-indentation, call tips, and undo and is more syntax-aware than editor extensions. IDLE also has a basic object browser which displays the classes that have been loaded. When selecting a class, IDLE displays the methods for the class. When a method is selected the code editor jumps to the method's definition in the code. The browser is basically what a developer of object-oriented code would expect from an IDE.
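A class browser of this kind can be approximated with the `pyclbr` module from the Python standard library, which scans a module's source for classes and methods without importing it. The sketch below is only an illustration, not IDLE's own code; the helper name `outline` and the use of the standard `queue` module as sample input are assumptions made here.

```python
import pyclbr


def outline(module_name):
    """Return {class name: sorted method names} for one module's source."""
    # readmodule() parses the source file rather than importing the module.
    classes = pyclbr.readmodule(module_name)
    return {name: sorted(info.methods) for name, info in classes.items()}


if __name__ == "__main__":
    # Print a two-level tree: classes, then the methods of each class.
    for class_name, methods in sorted(outline("queue").items()):
        print(class_name)
        for method in methods:
            print("   ", method)
```

Each class descriptor returned by `pyclbr` also records the file and line number of the definition, which is the information an editor needs in order to jump to a selected method.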

Among IDLE's weak points, however, is its debugger, which is underdeveloped by Guido's own admission. Nevertheless, IDLE has been identified as a starting point for the easy-to-use tools proposed for the CP4E initiative. The proposal does not, however, detail how CP4E will significantly improve the Python development environment for the students and casual users it targets.

How can Python users visualize and navigate their source code better?

"Our focus on relatively inexperienced programmers requires that our tools contain excellent visualization modules, which can present the discovered design to the user without causing information overload."

Object Oriented

Strong support for object-oriented programming is a distinguishing feature of Python. The CP4E proposal does not, surprisingly, elaborate on this advantage for easing the job of the inexperienced programmer. Is object orientation too complex a topic to expose to beginning programmers?

Let's see. The success of Visual Basic for Applications has been virtually built around exposing application logic through rich object and component models. Applications which provide VBA support usually supply charts which diagram the hierarchy of components accessible to programmers who want to automate frequent tasks or extend the application. The documentation and programming tools for VBA focus on allowing a user to easily introspect and customize these components.

The research results from the CP4E project may uncover newer and better usability paradigms than those of the VBA component programming model. As a language, Python does provide some distinct advantages from the perspective of object orientation which should be exploited for the CP4E project:

  • Inheritance - a class may be inherited from, allowing a programmer to easily extend or customize existing functionality by creating a subclass. Inheritance of implementation is supported.

  • Polymorphism - a programmer may add new types to an existing type hierarchy to suit a specific need. Using polymorphism, no changes need to be made to clients of the type; the type just slips right in to the existing program.

  • Encapsulation - by providing encapsulation, classes encourage clean program design and maintainability through data hiding. Because all members of a class are public in Python, however, hiding is a matter of convention rather than enforcement, which also makes the attributes of an object easier to access.

  • Loose Typing - like Smalltalk, Python is dynamically typed: types are checked at run time rather than declared in advance. This allows more flexibility to achieve polymorphism; as long as a type supports a required message signature, it may participate in polymorphic usage by clients of that signature.
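The four points above can be seen together in a few lines of Python. This is a minimal sketch written for illustration; the class names `Greeter`, `LoudGreeter`, and `Robot` and the function `greet_all` are hypothetical and do not come from the CP4E proposal.

```python
class Greeter:
    """Base class: holds a name and knows how to greet."""

    def __init__(self, name):
        # Encapsulation by convention: the attribute is public and
        # nothing in the language prevents a client from reading it.
        self.name = name

    def greet(self):
        return "Hello, " + self.name


class LoudGreeter(Greeter):
    """Inheritance: reuse __init__ and customize a single method."""

    def greet(self):
        return super().greet().upper() + "!"


class Robot:
    """Unrelated to Greeter, but it answers the same greet() message."""

    def greet(self):
        return "BEEP"


def greet_all(things):
    # Polymorphism: the client calls greet() without checking any types.
    return [thing.greet() for thing in things]


if __name__ == "__main__":
    print(greet_all([Greeter("Ann"), LoudGreeter("Bob"), Robot()]))
    # prints ['Hello, Ann', 'HELLO, BOB!', 'BEEP']
```

`Robot` shares no base class with `Greeter`, yet `greet_all` accepts it unchanged, which is the flexibility the last bullet describes.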

These are difficult concepts to teach to programmers, especially to newcomers. It is also difficult to visualize these concepts when working within a development environment. But if these are core strengths of the Python language, how can courseware and IDEs afford not to address them?

First Mover

The UML and the modeling tools which support this notation have become a focal point of the object-oriented design community. How can the Python community leverage the UML for both teaching the language and improving the development environment?

Let's see what has been done so far. One tool, ObjectDomain from Object Domain Systems, has succeeded in getting a nice jump on the Python + UML equation. Currently ObjectDomain supports the following Python integration:

  • Reverse-engineering - Python classes may be reverse-engineered into a UML model and displayed within class diagrams.

  • Scripting - The tool provides an embedded JPython interpreter and a console window for entering Python scripts. Scripts may be loaded from files as well. An API is provided which exposes the meta-model for the object model under development. Python scripts may access the model through the API to modify the model or produce derivative artifacts.
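To give a feel for what reverse-engineering involves, the sketch below uses Python's own introspection to collect the facts a UML class box would show: the class name, its superclasses, and its methods. It is an illustration only and makes no claim about how ObjectDomain works internally; `class_summary`, `Shape`, and `Square` are hypothetical names.

```python
import inspect


def class_summary(cls):
    """Collect the facts a UML class box would show for one class."""
    return {
        "name": cls.__name__,
        # Generalization arrows: direct superclasses, ignoring 'object'.
        "bases": [b.__name__ for b in cls.__bases__ if b is not object],
        # Operations: functions defined in this class body, not inherited.
        "methods": sorted(
            name
            for name, member in vars(cls).items()
            if inspect.isfunction(member)
        ),
    }


class Shape:
    def area(self):
        raise NotImplementedError


class Square(Shape):
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side ** 2


if __name__ == "__main__":
    for cls in (Shape, Square):
        print(class_summary(cls))
```

A real tool would go further, for example by scanning `__init__` for attribute assignments, but the name, bases, and methods are already enough to draw a basic class diagram.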

This is a sizeable accomplishment for a start and gives ObjectDomain first-mover status. The tool performs these tasks reasonably well, as we will see in an example further on. It isn't necessary to describe here the benefits of modeling software using the UML; we take it as given that the reader already appreciates them.

Python + UML

Let's list some blue-sky wishes for what else we think could be done to help the equation for Python and UML tools. We view the development environment of the future as more integrated and distributed, so for now let's blur the lines that have existed between development tools and focus on functionality:

Integrated Editor - One well-known UML tool, Together/J, provides a nice integration between the UML diagram and the source code: click on a class within a class diagram, or an attribute or method within a class, and the source code for the selection appears in an adjacent window. Since real programmers 'live' inside editors, it would be nice to see the inverse paradigm: right-click on a class within the code in your favorite editor, and a popup selection list lets you display, in an adjacent window, one of the class diagrams which contains the class. This paradigm allows a developer navigating the source code to quickly understand the class within the context of a whole model. Better still, if sequence diagrams have been created, selecting a method in the code could provide a link to the sequence diagrams which use the method in their message flows.

[Note: Visual SlickEdit's tagging feature is a closely related paradigm, but for hyperlinking within source code only. The feature is very useful.]

Dynamic Modeling - Python is an excellent environment for realizing what we have described elsewhere as the auto-generation of sequence diagrams. Python supports a debug mode and a trace-back facility which may be used to trace the execution of a program. The trace could be used to selectively diagram the interaction between objects in a system by showing the sequence of messages passed between the objects under observation. This feature - providing a UML sequence diagram to enable the visualization of the dynamic behavior - could significantly enhance a programmer's understanding of the code being developed and could help in debugging.
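As a rough illustration of the idea, the standard `sys.settrace` hook can record which object sends which message to which other object while a program runs. The sketch below is an assumption-laden toy, not a description of any existing tool: `record_calls`, `Waiter`, `Kitchen`, and `scenario` are hypothetical names, and only method calls on Python-level objects are captured.

```python
import sys


def record_calls(func, *args):
    """Run func(*args); return a list of (caller, callee, method) triples."""
    messages = []

    def tracer(frame, event, arg):
        if event == "call":
            callee = frame.f_locals.get("self")
            back = frame.f_back
            caller = back.f_locals.get("self") if back else None
            if callee is not None:  # skip plain functions
                messages.append((
                    type(caller).__name__ if caller is not None else "<top>",
                    type(callee).__name__,
                    frame.f_code.co_name,
                ))
        return None  # line-by-line tracing is not needed

    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(None)  # always remove the hook
    return messages


class Kitchen:
    def cook(self, dish):
        return dish + " (cooked)"


class Waiter:
    def __init__(self, kitchen):
        self.kitchen = kitchen

    def order(self, dish):
        return self.kitchen.cook(dish)


def scenario():
    Waiter(Kitchen()).order("soup")


if __name__ == "__main__":
    for caller, callee, method in record_calls(scenario):
        print(f"{caller} -> {callee}: {method}()")
```

Each recorded triple corresponds to one arrow in a sequence diagram; a drawing tool would then only need to lay the arrows out in order along the objects' lifelines.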

XMI - XMI (XML Metadata Interchange) is an XML representation for UML models and one of the exciting new features just now appearing in modeling tools. We have written about the potential of XMI to expand the add-on market for modeling tools. One good example is in the area of producing documentation for object models: see our XMI to HTML pages. Python is being used to develop some very large object-oriented systems and frameworks. For new people coming up to speed and for code maintenance, documentation is essential. Because XMI is an XML notation, tools supporting XSL may be used to transform the XMI into HTML or PDF for publishing object models to the web. Also, the advent of XML repositories is close. This new class of database will provide sophisticated search capability for XML documents and has the potential to allow developers to locate and navigate portions of an object model when XMI is the representation. Python could benefit by leveraging this emerging infrastructure from within an IDE.
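The appeal of an XML representation is that any XML-aware tool can produce or query it. The sketch below writes a tiny class model as XML and reads it back with the standard `xml.etree.ElementTree` module. Note the assumption: the element names `Model`, `Class`, and `Operation` are simplified stand-ins chosen for this example and are not the actual XMI schema; `model_to_xml` and `class_names` are hypothetical helpers.

```python
import xml.etree.ElementTree as ET


def model_to_xml(classes):
    """Serialize {class name: [method names]} to a small XML document."""
    root = ET.Element("Model")  # simplified; real XMI uses its own schema
    for class_name, methods in classes.items():
        class_el = ET.SubElement(root, "Class", name=class_name)
        for method in methods:
            ET.SubElement(class_el, "Operation", name=method)
    return ET.tostring(root, encoding="unicode")


def class_names(xml_text):
    """Query the exported model the way a documentation tool might."""
    root = ET.fromstring(xml_text)
    return [el.get("name") for el in root.findall("Class")]


if __name__ == "__main__":
    text = model_to_xml({"Waiter": ["order"], "Kitchen": ["cook"]})
    print(text)
    print(class_names(text))
```

The same document could equally be fed to an XSL processor to produce HTML, which is the documentation route described above.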

Libraries and Frameworks - Python has an extensive and growing collection of modules in its standard library. In addition, application frameworks such as Zope are appearing which have a number of interfaces. To make these modules and frameworks easier to use, it would be very helpful if developers had a standard documentation and navigation interface within IDEs. The IDEs for Java and VB provide component introspectors. These are quite useful but don't allow developers to see the bigger context. Providing support for the display and navigation of a UML model for a module or framework from within an IDE has the potential to supply this bigger context.

Teaching Python + UML

So far we have discussed how UML has the potential to ease object-oriented development using Python. To test this premise, it is time for an example. The example must also demonstrate to some extent the potential for UML to aid in teaching Python, since this is one of the main goals of the CP4E project. The example we have chosen is to diagram the use of inheritance within the Medusa web server. We chose it not because we have any expertise in this software; on the contrary, we wanted to see whether using UML could help clarify our own, and hopefully others', understanding of it.

Please continue to our Python + UML Example.
