New Book Review: "Getting Started with Talend Open Studio for Data Integration"

New book review for Getting Started with Talend Open Studio for Data Integration, by Jonathan Bowen, Packt Publishing, 2012, reposted here:

Rating: 4 out of 5 stars

Decent introductory text on getting started with Talend Open Studio for Data Integration. After briefly discussing installation (which was trivial in my case, since I installed on Windows 7, and the Talend website offers a separately available installation guide, user guide, and components reference guide), the author walks the reader through the Eclipse-based environment, discussing general and Talend-specific terms along the way, such as "workspace", "project", "job", "repository", "palette", and "metadata". In subsequent chapters, Bowen covers file transformation between XML and CSV files (the largest discussion in the book) and working with databases, followed by processing techniques such as filtering and sorting, managing files, job orchestration, job management, variables and contexts, and a walkthrough of some more involved flows that combine elements discussed in previous chapters.

As a consultant who is just getting started with this product, has used Eclipse for many years for Java web application development, and has both used the IBM WebSphere counterpart for project work and been trained in the Informatica counterpart, I most appreciated Chapter 3 ("Transforming Files"), Chapter 4 ("Working with Databases"), Chapter 5 ("Filtering, Sorting, and Other Processing Techniques"), Chapter 6 ("Managing Files"), and Chapter 7 ("Job Orchestration"). While reading through these chapters, I also built every job that they discuss, since it always helps me personally to actually write code rather than just read about it. Several of the job descriptions that the author presents do not provide accurate or full instructions, so actually going through the process of construction forced me to understand how everything is tied together.

Unlike the first 7 chapters, however, Chapter 8 ("Managing Jobs"), Chapter 9 ("Global Variables and Contexts"), and Chapter 10 ("Worked Examples") are not well put together and leave something to be desired. Some jobs simply did not work as described, whereas several exercises in earlier chapters did work after a little extra effort and some small modifications. While I did run across some scenarios where files were missing from the Packt Publishing download provided with this book, the inability to import the sample Talend jobs covered in Appendix A is my only real complaint, since Chapter 10 relies heavily on these files and does not discuss how to build them step by step, as the rest of the book does.

It is very possible that these problems arise because the text assumes use of version 5.1 of the product and I am using the latest version (5.4), so be forewarned if you want to use this book with a different version (downloads of versions 5.2.3 and 5.3.2 are also currently available from the Talend website). Other reviewers here already point out the obvious: this book is the only recently published text currently available on the product, apart from the aforementioned guides, which are available directly from Talend for each version of the product. But this is a book review, not a product review. In general, this text is recommended for anyone completely new to Talend Open Studio for Data Integration, and it will help those familiar with other products in this space understand the basics of Talend terminology as well as how to construct integrations with files and databases (but not with components involving other third-party products).

I used this book to prepare for the "Certified Talend Open Studio for Data Integration Consultant" exam, for which I later received training from Talend and which I passed on the first attempt. If you are interested in going this route, be aware that although the official training overlaps significantly with what is covered in this text, it is helpful to take advantage of both. That said, for someone with ample experience with the product, this book is likely to provide enough of a starting point to get going. Hands-on training is currently being provided for version 5.3.1, but my familiarity with the latest version did not hinder the transfer of skills.
