Welcome Guest. | Log In | Register | Membership Benefits


Topics:   :
  • |   Email this page E-mail
  • |  Print Print
  • |   Bookmark and Share

Is It Time For NoETL?

I've been bemused by NoSQL, the movement that propounds database-management diversity. Is it now similarly time for a NoETL movement, reflecting a new world of liberated, semantically enriched, analysis-ready, mashable data?

Mar 24, 2010 | 01:45 PM | 

By Seth Grimes
Informationweek
I've been bemused by NoSQL, the movement that propounds database-management diversity with the very valid claim that a one-size-fits-all relational approach is a poor match for emerging, demanding data challenges. Didn't we all know that relational databases, based on tables and joins, aren't always best? Hadn't the issue been the lack of usable, reliable, enterprise worthy alternatives? Similarly, haven't we long understood that wiring-up extract-transform-load (ETL) is laborious -- all those adapters and rules and the need for hand-matching -- even if necessary given the perceived need to gather, cleanse, and integrate diverse BI data sources? Is that preparatory data work still essential? Or is it now time for a NoETL movement, reflecting a new world of liberated, semantically enriched, analysis-ready, mashable data?NoSQL

The "SQL" of NoSQL is Structured Query Language, which has been closely associated with relational databases since the '70s, since the RDBMS early days. With Oracle's and IBM's support, SQL vanquished superior alternatives such as Ingres's Quel. SQL is an easy target for criticism, on its own and standing in as a proxy for relational systems.

SQL is stateless, given which limitation, vendors have wrapped it in diverse, incompatible procedural languages to support multi-step data processes. SQL's set-oriented approach creates a data-handling burden for application programmers so we have cursors, a row-/record-oriented retrieval kludge. Correlated subqueries are a usability nightmare, and the check-list demand for ACID compliance -- transactional atomicity, consistency, isolation, durability -- is simply overhead overkill for analytical applications.

NoSQL is a catch-all term for a grab bag of relational alternatives. NoSQL is a New Testament that seeks to supplant the Codd of Old.

SQL's deficiencies have been known for years; nonetheless, SQL has served the database community well and supported the creation of immense business value for the many, many millions of RDBMS end users. So have the ETL technologies that feed relational (and other) databases from flat-file, spreadsheet, operational-system, and database sources -- technologies, plural. Is ETL still relevant in a world of semantic computing?

Semantic computing

Semantic computing relies on meaning-ful data. That data may be stored in RDBMS tables with an associated metadata repository. It may be modeled with a graph structure, described via RDF (the XML based Resource Description Framework), and captured in a "triple store" for query via SPARQL. It may be mapped into an ontology, a mechanism for knowledge representation. ("Knowledge" here is a network of relationships, a.k.a. facts, that link entities within a subject-matter domain.)

Semantic computing involves methods and software designed to mine meaning, relationships, and usages from sources both conventional and unconventional, from structured databases and from the chaos that is the Web. All that good stuff is inferred from whatever definitions, data profiles (i.e., information on the distributions of the values of variables), and context are available.

The payoff is that you have all the ingredients necessary to support dynamic integration, to enable as-you-like-it data mashability.

Dynamic integration: NoETL

A number of tools claim/aim to support dynamic integration, some metadata or semantics driven, so that are essentially visually programmed without reliance, for the end user or behind the scenes, on the ETL equivalent of SQL. They include companies such as Expressor, Progress Software, and JackBe, the latter an enterprise mashup vendor.

I'll credit JackBe with prompting me to think much more intently about this stuff than I would have otherwise. I wrote a short paper for them, Nimble Intelligence: Enterprise BI Mashup Best Practices, and presented on the same topic in a JackBe webinar yesterday. (I was paid for this work and for strategy consulting.) The thought is that mashups bring agility to BI, the possibility of integrating the data and application elements you need, when needed, without much or most of the overhead typically associated with conventional BI.

It's freedom baby, yeah!

NoETL is an extension of this concept, actually a sort-of retake on Enterprise Information Integration (EII), a once-promising but now neglected notion that one can successfully build and query a unified virtual schema, spanning data sources, without requiring data collection into a single data warehouse or repository. In considering NoETL, let's recognize the value of traditional ETL and of EII and use them where they fit best. Let's also understand the promise and power of semantics, and of the diversity of NoSQL-ite data representations, in seeking data integration approaches that enable truly agile BI.I've been bemused by NoSQL, the movement that propounds database-management diversity. Is it now similarly time for a NoETL movement, reflecting a new world of liberated, semantically enriched, analysis-ready, mashable data?



Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dark Reading encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dark Reading moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing/SPAM. Dark Reading further reserves the right to disable the profile of any commenter participating in said activities.

Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.
Subscribe to RSS









  1. Cookies, Social Media And FireSheep
  2. SMB Guide To Credit Card Regulations, Part 2: The Low-Hanging Fruit
  3. HP And The Scary Corporate Fifth Column Concept
  4. Taking USB Attacks To The Next Level
  5. NoSQL: Not Much, Anyway
  1. Taking Cybersecurity Lessons To The Bank
  2. Researchers See Real-Time Phishing Jump
  3. 'BlackSheep' Sniffs Out Firesheep WiFi-Hacking
  4. Slideshow: Ten Free Security Monitoring Tools
  5. A Different Spin On Sleuthing Stuxnet
  6. M&A Activity Muddles Database Security
  1. Secure Managed Web Hosting Saves 960.gs from Malicious Hackers
  2. Access Governance as a Business Service: An Integrated Strategy for Automation with ITSM
  3. Business Driven Access Management and Governance: Simplifying the Delivery and Governance of Access Throughout
 
 


 
  Ars Technica
Boing Boing
Channel 9 Forums
CRN Blogs
Dr.Dobb's Portal: Blogs
Engadget
Gizmodo
GrokLaw
  Lifehacker
Schneier on Security
Slashdot
TechCrunch
Techdirt
Techmeme
Valleywag
 
  February 2012
January 2012
December 2011
November 2011
October 2011
September 2011
August 2011
July 2011
June 2011
May 2011
April 2011
March 2011
February 2011
January 2011
December 2010
November 2010
October 2010
September 2010
August 2010
July 2010
June 2010
  May 2010
April 2010
March 2010
February 2010
January 2010
December 2009
November 2009
October 2009
September 2009
August 2009
July 2009
June 2009
May 2009
April 2009
March 2009
February 2009
January 2009
December 2008
November 2008
October 2008
September 2008