DDI2SQL
DDI 4 to SQL
Dr. Brigitte Mathiak, Oliver Hopt
Abstract: In this paper, we propose a generic way to automatically produce Java classes with SQL and XML bindings for DDI 4. The approach has been tested on a prototype; further testing on the actual model is still needed as part of the general production process. There is no XMI export as of yet.
Introduction
It is a strategic goal of DDI 4 to cover as many platforms as possible. To cover SQL-based databases, we propose using JPA, the standard for automatic object-to-DBMS mapping. This has the benefit of providing Java POJOs, a configurable DB schema, a mapping between the two, and a choice between a number of solutions to implement the mapping. The same can be achieved for XML via JAXB. The key to being able to do this is that the UML model only has a limited number of modeling options (the example model in Fig. 1 contains all of them).
Fig. 1 Example model used for the prototype
Automatic Rendering
The goal of automatic rendering is to offer as much support to the implementers as possible with the minimum amount of hand crafting. Our approach is to envision an ideal scenario, in which everything implementers should need is automatically generated. Then, we attempt to implement this ideal scenario and observe the dead ends we run into.
Implementation of a prototype
To test our scenario, we have implemented a small prototype in the spirit of the second way described in the Appendix (from UML directly to Java, then adding annotations by code generation, then generating the XML Schema). It includes all five allowed modeling options: class, attribute, inheritance, association, and aggregation/composition.
Fig. 3 Minimal Use Case for a UML diagram containing the allowed modeling options
We started with the Java classes, as the requirements of the Java annotations would drive most of the issues for the UML to Java conversion.
@Entity
@XmlAccessorType(XmlAccessType.FIELD)
@XmlRootElement(name="Study")
public class Study extends DDIObject {...}
Inheritance is straightforward. Since we only allow inheritance by extension and no substitution groups, this is adequately covered by the Java mechanism. There are some annotations for JPA and JAXB, but, disregarding the name, they are identical for each class.
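Because the class-level annotations differ only in the class name, they lend themselves to template-based generation. The following is a minimal sketch under that assumption, not the prototype's actual generator: the class ClassHeaderSketch and its render method are hypothetical names introduced for illustration, and the output is plain source text, so neither JPA nor JAXB is needed to run it.

```java
/** Hypothetical helper: renders the class-level annotations and header as Java source text. */
final class ClassHeaderSketch {

    /**
     * @param className  name of the UML class, reused as the XML root element name
     * @param superClass name of the extended class, or null if there is none
     */
    static String render(String className, String superClass) {
        StringBuilder src = new StringBuilder();
        // These three annotations are identical for every class except for the name.
        src.append("@Entity\n");
        src.append("@XmlAccessorType(XmlAccessType.FIELD)\n");
        src.append("@XmlRootElement(name=\"").append(className).append("\")\n");
        src.append("public class ").append(className);
        if (superClass != null) {
            // Inheritance by extension maps directly to the Java mechanism.
            src.append(" extends ").append(superClass);
        }
        src.append(" {\n}\n");
        return src.toString();
    }

    public static void main(String[] args) {
        System.out.print(render("Study", "DDIObject"));
    }
}
```

Calling render("Study", "DDIObject") yields a header of the same shape as the Study fragment above, with an empty class body.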
@Column
@XmlElement
private String year;
The annotations for the attributes are always the same. @Column denotes the database column this attribute is mapped to; @XmlElement directs the transformation to an XML element. The attribute name is the default name in both systems, but it can be changed via the name parameter, as shown for @XmlRootElement above.
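The attribute rule can be sketched in the same generative style. This is an illustration only: FieldSketch and its render method are hypothetical, and using one override name for both the column and the XML element is an assumption made to keep the example short.

```java
/** Hypothetical helper: renders one annotated attribute as Java source text. */
final class FieldSketch {

    /**
     * @param type         Java type of the attribute, e.g. "String"
     * @param name         attribute name, used as the default column and element name
     * @param overrideName explicit name for both bindings, or null to keep the default
     */
    static String render(String type, String name, String overrideName) {
        // Assumption: the same override applies to the database and the XML binding.
        String suffix = overrideName == null ? "" : "(name=\"" + overrideName + "\")";
        return "@Column" + suffix + "\n"
             + "@XmlElement" + suffix + "\n"
             + "private " + type + " " + name + ";\n";
    }

    public static void main(String[] args) {
        System.out.print(render("String", "year", null));
        System.out.print(render("String", "year", "Year"));
    }
}
```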
Aggregation proves to be somewhat trickier, because it includes certain pseudo many-to-many relationships in which an object is reused. As a result, the data structure in the database differs somewhat from the XML/object data structure. The UML states a one-to-many relationship, but due to the realities of reuse, an additional relationship table is needed.
Example: Two Questionnaires include the same question.
– In a relational DB, we would need a many-to-many relationship between Questionnaire and Question to allow this, or we would need to duplicate the question.
– In flat-style XML or OO, we simply refer to the same Question ID twice.
The difference can be addressed by treating both cases differently.
@XmlElement(name="RoleName")
@XmlIDREF
@ManyToMany
private List<Variable> variable = new ArrayList<Variable>();
For the XML generation, the references with the name RoleName are inserted directly into the objects, but due to the @XmlIDREF annotation, they are written only as references, resulting in the aforementioned flat hierarchy. For the database, a many-to-many table is generated. The differing structures are not a problem, as we never go directly from XML to the database, but always have the domain classes in between.
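The two views of a reused object can be illustrated without any persistence library. The sketch below is a simplification under stated assumptions: the classes Question and Questionnaire, the element names, and the methods toFlatXml and toJoinRows are hypothetical, and the XML is assembled by hand rather than produced by JAXB. It shows only that one shared object yields an ID reference per owner in the flat view and one row per link in the relational view.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration of reuse: one Question shared by two Questionnaires. */
final class ReuseSketch {

    static final class Question {
        final String id;
        Question(String id) { this.id = id; }
    }

    static final class Questionnaire {
        final String id;
        final List<Question> questions = new ArrayList<>();
        Questionnaire(String id) { this.id = id; }
    }

    /** Flat view: the questionnaire mentions its questions by ID only. */
    static String toFlatXml(Questionnaire q) {
        StringBuilder xml = new StringBuilder("<Questionnaire id=\"" + q.id + "\">");
        for (Question question : q.questions) {
            xml.append("<Question>").append(question.id).append("</Question>");
        }
        return xml.append("</Questionnaire>").toString();
    }

    /** Relational view: one (questionnaire_id, question_id) row per link. */
    static List<String[]> toJoinRows(List<Questionnaire> all) {
        List<String[]> rows = new ArrayList<>();
        for (Questionnaire q : all) {
            for (Question question : q.questions) {
                rows.add(new String[] { q.id, question.id });
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        Question shared = new Question("q1");
        Questionnaire a = new Questionnaire("a");
        Questionnaire b = new Questionnaire("b");
        a.questions.add(shared);
        b.questions.add(shared); // reuse: the same object, not a copy
        System.out.println(toFlatXml(a));
        System.out.println(toFlatXml(b));
        for (String[] row : toJoinRows(Arrays.asList(a, b))) {
            System.out.println(row[0] + " -> " + row[1]);
        }
    }
}
```

The question itself is stored once in both views; only the links to it are repeated.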
Real many-to-many relationships, such as the connection between variables and questions, are less problematic, as they can be modeled with explicit relationship objects. This will also include grouping and packaging abilities.
Ideally, grouping relationships will always be modeled in a similar way.
Future Work
The demonstration prototype integrates the whole workflow from Java to the database and back, and likewise for XML. What still needs to be done is linking to the actual XMI export and synchronizing with the XML Schema. This work is planned to be done by Oliver Hopt over the summer of 2014. Similar work could be done for .NET, JSON, and a number of other formats. The need for this will be evaluated at the developer's meeting at IASSIST.
Appendix
Evaluation of available tools
Fig. 2 Typical layer model of an application
In the layer model (cf. Fig. 2), we start by assuming that there exists an English + UML specification. For data exchange, however, we would need a fixed language that should ideally be validatable. Obvious candidates are XML Schema and RDF. Next, we assume that such documents exist; we would therefore need an implementation to consume the data, e.g. in Java or the .NET framework. Moreover, we need to store the data in a database. Then, we need to read the data back from the database and export it to XML to close the cycle.
When looking at available tools for these tasks, those held in high esteem by the community are usually centered around the implementation layer. Since our expertise lies with Java, we have focused predominantly on that branch, but the situation appears to be similar in .NET. We have also ignored RDF so far.
The most active field is the conversion between the implementation layer and the data storage layer. In Java, this field is dominated by Hibernate, but it is also standardized via the JPA 2.0 standard. The ideal situation would therefore be to rely on JPA annotations, which cover a subset of the capabilities of Hibernate, the de facto standard with the highest support in the community. In their paper at the 4th European DDI User Conference (EDDI 2012), Bosch and Zloch have demonstrated how such annotations can be used on DDI data.
For converting between Java and XML, JAXB is the standard solution, also with a set of standardized annotations for Java domain classes. Moreover, JAXB allows the generation of fairly good XML Schemata; however, the classes generated from XML Schema tend to be quite unusable. The produced XML Schema is not needed to work with JAXB; therefore, changes can still be made should the JAXB capabilities not suffice.
There are some tools that convert XML Schema to Java (e.g. XMLBeans), but most are problematic in some way. One large problem is that the classes generated in such a way must remain fixed; otherwise the XML import might fail. This is problematic because we would like to add Hibernate annotations. In addition, changes may be required by performance considerations, and probably for the abstract base classes in the core, as they have special requirements.
Starting from UML, there are a number of tools that allow an export to XMI, Java, and various other options. Importing from XMI, and therefore exchanging UML models between products, seems to be a matter of luck. Exports to Java work surprisingly well, producing nice domain classes in our tests, with only minimal need for adjustment. However, the transformation is not flexible; therefore, it is generally not possible to add annotations based on the UML. The same problem occurs for the XML Schema, but it is much more problematic there, as the Schema has to be flat, which was not the case in any of our tests.
Our evaluation shows two ways to fulfill the required technology pipeline. The obvious way is to start from UML, transform to XML Schema, then to Java, and finally to the DB, but this is only sub-optimally supported. The lack of customization options in the step from XML Schema to Java is especially troubling. The other way also starts from UML, but transforms directly to Java, adds the annotations by code generation, and then generates the XML Schema. The second way may introduce compatibility concerns with other platforms, e.g. .NET, but since the requirements and the form of the XML are quite strict, we assume that this will not be the case. After all, XML is an exchange format.