Sunday, October 21, 2012

Not Only SQL



NoSQL, also read as "Not Only SQL", is an approach to data management and database design that is useful for very large sets of distributed data. It encompasses a wide range of technologies and architectures, and seeks to solve the scalability and big data performance issues that relational databases weren’t designed to address. NoSQL is especially useful when an enterprise needs to access and analyze massive amounts of unstructured data, or data that is stored remotely on multiple virtual servers in the cloud.

 
            The approach taken by two of today’s best-known NoSQL systems, CouchDB and MongoDB, was inspired by Lotus Notes, the collaboration and personal productivity tool that Lotus founder Mitch Kapor and Ray Ozzie (later Microsoft's chief software architect) worked together to create in the 1980s. Notes first shipped in 1989.
            Like Notes, these database systems store information not as normalized relational tables, but as documents in a rich, self-describing structure. Both use JavaScript Object Notation (JSON) or a variant of it to store these documents (MongoDB's binary variant is called BSON). JSON is somewhat like XML, but offers more compact storage and lower processing overhead. Document databases appeal to developers largely because a stored document can mirror the objects an application already works with, which normalized relational tables do not.
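To make the contrast concrete, here is a minimal sketch in Python of the same order held first as normalized rows and then as a single self-describing document. The table names, field names, and the build_order_document helper are all hypothetical, made up for illustration; they are not taken from CouchDB or MongoDB.

```python
import json

# Hypothetical order, held relationally as three normalized "tables".
customers = [{"customer_id": 1, "name": "Ada"}]
orders = [{"order_id": 10, "customer_id": 1}]
order_items = [
    {"order_id": 10, "sku": "A-100", "qty": 2},
    {"order_id": 10, "sku": "B-200", "qty": 1},
]


def build_order_document(order_id):
    """Assemble one self-describing document from the normalized rows."""
    order = next(o for o in orders if o["order_id"] == order_id)
    customer = next(
        c for c in customers if c["customer_id"] == order["customer_id"]
    )
    items = [
        {"sku": item["sku"], "qty": item["qty"]}
        for item in order_items
        if item["order_id"] == order_id
    ]
    # Everything a reader needs is nested inside the one document.
    return {"_id": order_id, "customer": {"name": customer["name"]}, "items": items}


document = build_order_document(10)
print(json.dumps(document, indent=2))
```

The relational form needs three lookups to answer "what is in order 10?", while the document form answers it with one read, at the price of repeating the customer's name in every order.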
            Contrary to misconceptions caused by its name, NoSQL does not prohibit structured query language (SQL). While it's true that some NoSQL systems are entirely non-relational, others simply avoid selected relational functionality such as fixed table schemas and join operations.
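As a rough illustration of what "no fixed table schema" means in practice, the sketch below keeps a collection as a plain Python list of documents whose fields differ from one document to the next. The products collection and the find helper are hypothetical stand-ins, not the API of any real NoSQL product.

```python
# A hypothetical in-memory "collection": documents with no fixed schema.
products = [
    {"_id": 1, "name": "laptop", "ram_gb": 8},
    {"_id": 2, "name": "novel", "author": "Jane Doe", "pages": 320},
    {"_id": 3, "name": "tablet", "ram_gb": 2, "tags": ["mobile"]},
]

_MISSING = object()  # sentinel so a missing field never equals a real value


def find(collection, **criteria):
    """Return the documents whose fields equal every given criterion.

    A document that lacks a field simply does not match; nothing fails.
    """
    return [
        doc
        for doc in collection
        if all(
            doc.get(field, _MISSING) == value
            for field, value in criteria.items()
        )
    ]


matches = find(products, ram_gb=8)
print([doc["name"] for doc in matches])
```

The book has no ram_gb field and the laptop has no author, yet both live in the same collection and the query still works; a relational table would need either nullable columns or separate tables joined together.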
One of the most popular NoSQL databases is Apache Cassandra. Cassandra, which was once Facebook’s proprietary database, was released as open source in 2008. Other technologies commonly grouped under the NoSQL label include SimpleDB, Google BigTable, Apache Hadoop and its MapReduce framework, MemcacheDB, and Voldemort. Companies that use NoSQL include Netflix, LinkedIn and Twitter.
            Meanwhile, IBM, Oracle, and Microsoft are seeing an unprecedented challenge to their SQL licensing money machines, thanks to enterprise data workloads that are getting more diverse, larger, and more complex. For those workloads, IT will trade stringent data consistency for low latency and a favorable balance of cost versus performance.
            The workloads NoSQL is ideal for also feature staggering volumes and velocities of data; we’re talking, at the low end, a few terabytes of records, with 10,000 or 20,000 concurrent inserts per second, minimum. Once you’re at that level, IT needs to focus on platforms that can serve as the infrastructure layer, capable of scaling out without costing a fortune.
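One common way to scale out is to spread records across many inexpensive machines by hashing each record's key. The sketch below shows the idea with naive modulo placement; the node names and the node_for_key helper are hypothetical, and this is not how any particular product places data.

```python
import hashlib


def node_for_key(key, nodes):
    """Pick a node for a key using a stable hash and modulo placement."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return nodes[int(digest, 16) % len(nodes)]


nodes = ["node-a", "node-b", "node-c"]  # hypothetical cluster members

# Count how many of 9,000 made-up record keys land on each node.
placement = {}
for i in range(9000):
    node = node_for_key(f"record-{i}", nodes)
    placement[node] = placement.get(node, 0) + 1

print(placement)
```

Each node ends up with roughly a third of the records, so inserts are spread across the cluster. Modulo placement remaps most keys whenever a node is added or removed, which is why production systems such as Cassandra use consistent hashing instead.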
            Under heavy volume, NoSQL can be as much as 100 times as fast as SQL for such workloads, according to tests performed by the NoSQL open source community. Oracle, reporting on the performance of its own NoSQL offering, demonstrated excellent scalability, throughput, and latency; one test showed close to 100,000 inserts per second and 3.7 milliseconds average latency for 360 client threads. As for costs, a conventional SQL setup (hardware and software included) ends up about five to 10 times as costly as a typical NoSQL setup.
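The Oracle figures can be sanity-checked with Little's law, which relates throughput to concurrency and latency. The sketch below assumes that each of the 360 client threads issues one insert at a time and waits for it to complete; the source does not state this, so treat it as an assumption.

```python
def throughput_per_second(concurrent_clients, avg_latency_seconds):
    """Little's law: throughput = requests in flight / average latency."""
    return concurrent_clients / avg_latency_seconds


# 360 client threads, 3.7 ms average latency (figures quoted above).
estimate = throughput_per_second(360, 0.0037)
print(round(estimate))  # about 97,300 inserts per second
```

Under that assumption the estimate comes to a little over 97,000 inserts per second, which fits the quoted "close to 100,000".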

