"Normalizing" MongoDB

Posted by Brody on 25-Jun-2014 22:38

Love it or hate it, MongoDB is a different kind of database.  Basic RDBMS concepts like ACID transactions, table definitions, column constraints, and joins are not (currently) supported by MongoDB.  Ironically, while MongoDB is touted for the benefits its data model provides, that same data model is also the reason many “relationally” minded database gurus fail in their initial MongoDB project(s).  I’m a strong believer that many who hate MongoDB are really just struggling to understand successful schema design, given their strong RDBMS experience and the lack of MongoDB best practices and examples to build upon.

If you’re unfamiliar with how an RDBMS data model differs from MongoDB’s, let me provide a short explanation.  In the relational world, data duplication is discouraged and avoided.  Information in tables is linked via primary key/foreign key relationships.  In MongoDB (and many other NoSQL solutions), data is nested.  Let's consider the design of a simple database containing contact information.  In an RDBMS, you’ll likely have a contact table containing the contact name, along with an ID column that uniquely identifies each row.  You may then have a separate table to hold addresses (work, home, other).  In addition to the columns that store the address (street, city, state, zip, etc.), this table would also contain a foreign key column that associates each address with a specific contact.  In the MongoDB world, the addresses could simply be implemented as an array of embedded documents that is part of the document/row containing the contact name.  It’s easiest just to show the structure of a JSON document for a contact:

{name: "Brody", address: [ {label: "home", street: "101 Main Street", city: "Raleigh", state: "NC", zip: 27613}, {label: "work", street: "3005 Carrington Mill Blvd", city: "Morrisville", state: "NC", zip: 27560} ] }
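
To make the contrast concrete, here is a minimal Python sketch that splits the nested contact document above into the two relational tables described earlier: a contact row, plus one address row per array element carrying a foreign key back to the contact.  The function name, the contact_id column, and the choice of 1 as the key are my own assumptions for illustration, not anything the database or a driver prescribes.

```python
# Illustrative sketch only: the names normalize_contact and contact_id are
# hypothetical, chosen to mirror the relational layout described in the post.

def normalize_contact(doc, contact_id):
    """Split one nested contact document into a parent row and child rows."""
    # Parent table: one row per contact, keyed by contact_id.
    contact_row = {"contact_id": contact_id, "name": doc["name"]}
    # Child table: one row per embedded address, with contact_id acting
    # as the foreign key that links the address back to its contact.
    address_rows = [
        {"contact_id": contact_id, **address}
        for address in doc.get("address", [])
    ]
    return contact_row, address_rows


doc = {
    "name": "Brody",
    "address": [
        {"label": "home", "street": "101 Main Street",
         "city": "Raleigh", "state": "NC", "zip": 27613},
        {"label": "work", "street": "3005 Carrington Mill Blvd",
         "city": "Morrisville", "state": "NC", "zip": 27560},
    ],
}

contact, addresses = normalize_contact(doc, contact_id=1)
print(contact)
for row in addresses:
    print(row)
```

The single document becomes one row in a contact table and two rows in an address table, which is the shape a relationally minded tool would expect to join on contact_id.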

Despite some people’s struggles, many are successfully using MongoDB for their projects and achieving superb performance.  As more data ends up in MongoDB, more people want and need to access it with standard reporting, analytic, and BI tools based on the ODBC and/or JDBC standards.  There are just a few major complications here:

  1. ODBC and JDBC drivers are a means to execute SQL, and MongoDB doesn’t support SQL.
  2. As discussed above, successful MongoDB schema designs have a nested structure where data is de-normalized.  ODBC and JDBC applications are accustomed to normalized, relational data.
  3. Similar to the point above, MongoDB’s schema is flexible and sparse.  Flexible means the data type for a given “column” can change from “row” to “row”.  Sparse means a “column” doesn’t need to exist in every “row”.
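
The third point is easier to see with data.  The Python sketch below scans a couple of made-up documents, records which types each field takes (flexible), and then builds fixed-width rows where absent fields become None, much like a SQL NULL (sparse).  The sample documents and the function names are hypothetical, and this is only one simple way to illustrate the problem, not a description of how any particular driver handles it.

```python
# Illustrative sketch only: sample data and function names are made up.

def infer_columns(docs):
    """Map each field name to the set of Python type names seen for it."""
    columns = {}
    for doc in docs:
        for field, value in doc.items():
            columns.setdefault(field, set()).add(type(value).__name__)
    return columns


def to_rows(docs, columns):
    """Give every document the same columns, filling gaps with None."""
    return [{field: doc.get(field) for field in columns} for doc in docs]


docs = [
    {"name": "Contact A", "zip": 27613},                     # zip stored as a number
    {"name": "Contact B", "zip": "27560", "phone": "555-0100"},  # zip as text, extra field
]

columns = infer_columns(docs)
rows = to_rows(docs, columns)

for field, types in columns.items():
    print(field, sorted(types))
for row in rows:
    print(row)
```

Here the zip field shows up as both a number and a string, and phone exists in only one document, so a tool expecting a fixed, typed column has to decide what to do in both cases.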

As you may have guessed, seen, or heard, we’ve recently released both ODBC and JDBC drivers that attempt to address these challenges.  I don’t want to downplay the effort or skills required to implement quality SQL support for MongoDB, nor do I want to downplay the importance of a high-quality, high-performance implementation of the ODBC/JDBC driver.  However, what sets us well above other solutions is our work to expose the MongoDB data model to relationally minded ODBC and JDBC applications as the normalized, relational structure they expect.  In the simple example above, our driver exposes the MongoDB data model in exactly the way I described the relational data model: a contact table and a separate address table linked by a key.

I think it’s important to have an understanding of exactly how our driver “normalizes” MongoDB.  Our user guide (installed with the product and available online) provides solid examples of how this functionality behaves.  Nonetheless, I’m sure users will have questions.  In addition to the documentation and our technical support, this user community is a great resource for getting answers, so please don't hesitate to use it!

All Replies

Posted by Sumit Sarkar on 25-Jun-2014 22:50

Great write-up, Brody!  From what I heard at MongoDB World, in sessions from Finserv shops like Citi and Goldman Sachs and even pharma like Sanofi, a lack of reliable SQL connectivity is a show stopper for widespread MongoDB adoption.  With the approach you outlined, shops are starting to leverage our normalized view of their data.  Here is my related blog article, and it's definitely not limited to Finserv:  blogs.datadirect.com/.../sql-access-mongodb-odbc-expanded-adoption-financial-services.html

Posted by Brody on 18-Jul-2014 09:09

We also recently posted a webinar that discusses our solution in more detail: forms.progress.com/.../wbr-TransformMongoDB

This thread is closed