Internal Research Projects
This is the entry point to Blueberry's Internal Research Projects pages. For commercial reasons, access to these pages is only available to Blueberry staff and some of our research partners. Please log in below with your usual Blueberry central server username and password.
Other companies and institutions that are interested in collaborating with Blueberry are invited to contact us.
Parsing systems like "yacc" and "lex" are complex enough to deter many application programmers from using them. They are also based on deterministic grammars: the input either matches a grammar element or it doesn't. This makes such systems quite poor at error reporting, because the error message often bears little relationship to the mistake the user actually made. Debugging such grammars can also be very difficult, because it requires a detailed understanding of the parser architecture. The aim of this project is to create a completely new system for parsing input files that avoids these problems. The final result should be a fuzzy, top-down, interactive parser definition system, in which the user "teaches" the system how to recognise the input file. The project involves designing a completely new model for fuzzy, "learned" parsing, and implementing it in a GUI program for interactive parser definition.
This project is related to the Interactive Parser Definition System discussed above, but is aimed at smaller-scale parsing and integration with existing RAD systems. Current RAD systems offer impressive GUI support for a large part of the development cycle, but they offer no support for parsing input data. The objective of this project is to design an interactive way of parsing simple strings (not whole files) to extract key components. For example, the email address "mickey@disney.com" could be parsed to extract components such as the username. The project should begin with a review of existing micro-parsers, such as "sscanf" and "regex". The result should be a graphical tool that allows programmers to create micro-parsers within a RAD system environment, without writing code or grammatical rules. Instead, the tool should infer the rules from examples provided interactively by the programmer.
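For comparison, the kind of hand-written micro-parser that the proposed tool would replace can be sketched with Python's standard "re" module. This is only an illustration: the pattern, the group names "username" and "domain", and the function name parse_email are assumptions made for this example, and the pattern is deliberately simple rather than a complete e-mail address grammar.

```python
import re

# Hypothetical, deliberately simple pattern (not a full e-mail address grammar):
# one or more non-"@", non-space characters, an "@", then the same again.
EMAIL_PATTERN = re.compile(r"^(?P<username>[^@\s]+)@(?P<domain>[^@\s]+)$")


def parse_email(address):
    """Return the named components of a simple address, or None if it does not match."""
    match = EMAIL_PATTERN.match(address)
    if match is None:
        return None
    return match.groupdict()


components = parse_email("mickey@disney.com")
print(components["username"])  # the key component mentioned in the text
print(components["domain"])
```

The programmer has to write and debug the pattern by hand here, which is exactly the step the project proposes to replace with rules inferred from examples.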
Interfacing current OOP systems to SQL databases is recognised as a difficult problem. Blueberry is interested in investigating alternative, object-oriented databases. However, implementing a new database from scratch is expensive, so there are good reasons for attempting to map an object-oriented data model onto SQL. Python is an object-oriented scripting language with a very rich data model: it has built-in types for strings, tuples, lists, and dictionaries. The idea of this project is to design and implement a set of Python classes that transparently map SQL databases onto Python data structures. For instance, an SQL table may be exposed to Python as an object that resembles a list of dictionaries. So a pair of SQL tables, say "People" and "Companies", might be mapped onto a list of "person" objects, with each "person" object having a reference to a "company" object. Possibly the biggest challenge will be to design the Python classes so that the SQL databases they wrap can be queried. Is it possible to express SQL queries in Python with a simple and intuitive syntax? Ultimately, Blueberry would like to combine Python, Microsoft SQL Server, and other software technologies into a system for Rapid Development of Database Applications.
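A minimal sketch of the "table as a list of dictionaries" idea might look like the following. It makes several assumptions that are not part of the project description: it uses Python's built-in sqlite3 module as a stand-in for Microsoft SQL Server, the class name TableView and its "key" parameter are hypothetical, the view is read-only, and the sample rows are invented for the demonstration.

```python
import sqlite3
from collections.abc import Sequence


class TableView(Sequence):
    """Hypothetical read-only view of one SQL table that behaves like a list of dicts."""

    def __init__(self, connection, table, key="id"):
        # Table and key names are interpolated into SQL text, so in this
        # sketch they must come from trusted code, never from user input.
        self._conn = connection
        self._table = table
        self._key = key

    def __len__(self):
        cur = self._conn.execute(f"SELECT COUNT(*) FROM {self._table}")
        return cur.fetchone()[0]

    def __getitem__(self, index):
        if not isinstance(index, int):
            raise TypeError("only integer indexes are supported in this sketch")
        if index < 0:
            index += len(self)
        if index < 0:
            raise IndexError(index)
        cur = self._conn.execute(
            f"SELECT * FROM {self._table} ORDER BY {self._key} LIMIT 1 OFFSET ?",
            (index,),
        )
        row = cur.fetchone()
        if row is None:
            raise IndexError(index)
        columns = [col[0] for col in cur.description]
        return dict(zip(columns, row))


# Invented sample data, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Companies (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE People (id INTEGER PRIMARY KEY, name TEXT, company_id INTEGER)")
conn.execute("INSERT INTO Companies VALUES (1, 'Example Ltd')")
conn.execute("INSERT INTO People VALUES (1, 'Alice', 1)")

people = TableView(conn, "People")
companies = TableView(conn, "Companies")

# Resolve the person-to-company reference by hand; a fuller design would hide this step.
companies_by_id = {company["id"]: company for company in companies}
person = people[0]
print(person["name"], "works for", companies_by_id[person["company_id"]]["name"])
```

Each index lookup issues its own SQL query, which keeps the sketch short but would be slow on large tables. It also leaves the querying question raised in the text completely open.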
Python is a relatively new scripting language that is cross-platform and object-oriented. Python includes a C API, which allows programmers to load the Python interpreter, execute commands and access Python data objects. However, this is strictly a C API. There are third-party Python interface libraries for C++, but these seem more than a little complex. The object of this project is to review the existing Python C++ interface libraries, identify shortcomings and areas of over-complexity, and then design improved classes to resolve these problems.
Current e-mail clients offer relatively limited support for conducting conversations: users wishing to respond to multiple points in a single message often need to perform significant manual editing. This is acceptable when messages are short, but becomes more of a problem when trying to do collaborative work over e-mail. The object of this project is to create a GUI tool that allows users to conduct conversations over e-mail more efficiently. The tool should be independent of the users' e-mail clients. Here's an example of an e-mail conversation:
This conversation has a hierarchical structure, and so could also be represented as a tree:
The conversation editor could allow conversations to be manipulated in their tree representation (i.e. nodes could be deleted, moved, replied to, etc.) and then converted back into their text form, with all the chevrons (>) added in the right places.
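The conversion from tree back to text could be sketched as follows. This is an illustration under stated assumptions, not the project's design: the Node class and the function names are hypothetical, each tree level is assumed to correspond to one round of replies, and the newest round is assumed to carry no chevrons, so a node's chevron count is the depth of the tree minus the node's own level. The sample messages are invented.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical conversation node: one piece of text plus the replies to it."""
    text: str
    replies: list = field(default_factory=list)


def tree_depth(node):
    """Number of levels in the tree rooted at node."""
    return 1 + max((tree_depth(reply) for reply in node.replies), default=0)


def to_text(root):
    """Flatten the tree into quoted text; older levels get more chevrons."""
    total = tree_depth(root)
    lines = []

    def walk(node, level):
        prefix = ">" * (total - 1 - level)
        for line in node.text.splitlines():
            lines.append(f"{prefix} {line}" if prefix else line)
        # Replies follow directly beneath the text they answer.
        for reply in node.replies:
            walk(reply, level + 1)

    walk(root, 0)
    return "\n".join(lines)


# Invented sample conversation.
conversation = Node(
    "Shall we meet on Monday?",
    [Node("Tuesday suits me better.", [Node("Tuesday is fine.")])],
)
print(to_text(conversation))
```

Deleting or moving a node in the tree and calling to_text again regenerates the quoting automatically, which is the manual editing step the tool is meant to remove.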