By Erik Gfesser — Apr 15, 2010

Liquibase in the Enterprise: Part 2

In the first part of this discussion, database change management tool Liquibase was presented as one of the best software development tools I have used in recent years, and along with the practicality of this open source LGPL software comes several core benefits for the enterprise data architect: decreased cost, increased maintainability, and effective incorporation into the software build process.

Unfortunately, as with other open source LGPL software, Liquibase documentation is sparse in general, and when it does exist the level of quality is often inconsistent. As enterprise data architect over the past year, Liquibase has been an important part of my toolbox, and because it has been received so well – by software development teams as well as DBAs – I write this series to evangelize the benefits of the tool as well as share at least a peek into day-to-day in-the-trenches usage.

Before getting into the specifics of using Liquibase, it is important to understand that although Liquibase can be used in a number of different ways, one of my goals here is to extol its usage as part of an agile software development effort. Now, since you are reading this series of discussions, it is assumed that you are already aware of the benefits of agile software development in general. If not, there are plenty of other resources to get up to speed in this regard, starting with the Manifesto for Agile Software Development.

In this discussion series, although Liquibase will be periodically presented as a standalone tool, it will also be referenced in conjunction with several other tools, all of which work together: Apache Maven, Artifactory, and Subversion. Maven is a software project management and comprehension tool, Artifactory is a repository manager, and Subversion is a version control system.

While this assembly of tools is referenced here, other assemblies can be used. Along with the four essential freedoms of open source that were mentioned in the first part of this discussion, it is important to note that open source software can provide a high level of community as well as enable a high level of configuration, and this latter benefit is what can permit so many options.

From a user perspective, the heart of Liquibase is the changeSet. Each changeSet contains operations that are to be executed against the target relational database. Each changeSet is equivalent to one or more SQL statements, and so a grouping of changeSets is equivalent to a SQL script. While SQL can be used in a changeSet, the maintainability that Liquibase offers will be partly forsaken if one takes this route. In addition to decreased maintainability, software developer involvement in the process might be decreased, since many software developers are simply more accustomed to the object world.

One of the simplest SQL operations that might be used in a SQL script is INSERT. For example:

INSERT INTO PersonRoleReference (Id, Code, Description, Display) VALUES (1, 'APPL', 'Person applying for membership', 'Applicant');

Used within a changeSet, this SQL statement can be simply enclosed within <sql> tags:

<sql> INSERT INTO PersonRoleReference (Id, Code, Description, Display) VALUES (1, 'APPL', 'Person applying for membership', 'Applicant'); </sql>

Using Liquibase, however, the <insert> tag is preferred:

<insert tableName="PersonRoleReference"> <column name="Id" valueNumeric="1"/> <column name="Code" value="APPL"/> <column name="Description" value="Person applying for membership"/> <column name="Display" value="Applicant"/> </insert>

The <insert> tag is preferred because it decreases error risk and is database agnostic.

Decreased error risk. While the <insert> tag requires more text input, the software developer does not need to be concerned for example with matching up column names and column values, all of which are separated by commas. In addition, software developers are typically more accustomed to simplified XML key / value pairs.

Database agnostic. Perhaps a more important aspect of the <insert> tag is that it is database agnostic. What this means is that regardless of which of the relational database products to which Liquibase might be pointed, the same syntax can be used. While the INSERT statement might already be standardized SQL, we will see with other types of SQL statements that this quality will provide considerable benefit to the software developer.

One of the initial objections one might receive when attempting to introduce Liquibase into the enterprise is that software developers need to learn new syntax. In my experience, the syntax is typically intuitive, and the software developer can replicate use of a particular Liquibase tag once it has already been used in a script. Use of SQL is apt to be much more tedious, especially for INSERT operations like the above. And Liquibase conveniently provides the <loadData> tag when the quantity of <insert> tags is not trivial:

<loadData tablename="PersonRoleReference" file="src/liquibase/personRoleReference.csv"> <column name="Id" type="NUMERIC"> <column name="Code" type="STRING"> <column name="Description" type="STRING"> <column name="Display" type="STRING"> </loadData>

In this example, a comma-delimited CSV file containing multiple table records is referenced, and the <column> tags here indicate the names and data types of each column. The specific numeric data type is not needed, so the software developer does not need to be concerned with whether the data is of type BigInt or SmallInt, for example, and the specific text data type is not needed either, negating the need to specify VarChar or VarChar2, for example. As we will see, specifying specific data type will still be important when creating a database table as it is when using SQL, but these specifics are not needed when loading data to a database table.

While relational database products often offer load features, these are typically performed via the command line rather than SQL scripts. Liquibase conveniently provides the ability to use all tags at the script level. We will discuss the Liquibase command line in a future installation of this discussion series, but usage of the command line is typically reserved for script execution and reference rather than executing isolated commands such as <loadData>.

Now that we have gotten a taste of the Liquibase changeSet, I will introduce some additional common tags in the next installation of this discussion series.

Liquibase in the Enterprise: Part 1

Liquibase in the Enterprise: Part 3

Liquibase in the Enterprise: Part 4

Liquibase in the Enterprise: Part 5

Liquibase in the Enterprise: Part 6

Liquibase in the Enterprise: Part 7

Subscribe to Erik on Software