Introduction to MongoDB


An Introduction to MongoDB

MongoDB — from “humongous” — is an open source, non-relational, document-oriented database. The goal of the MongoDB project is to bridge the gap between key-value stores (which are fast and highly scalable) and traditional RDBMS systems (which provide rich queries and deep functionality).

MongoDB is a scalable, high-performance, document-oriented schemaless database. In this short demo, long-time developerWorks contributor Andrew Glover introduces MongoDB, provides a quick tour of its use, and helps you understand where it’s most applicable.

About Mike Dirolf

Mike Dirolf is a Software Engineer at 10gen, the company the started the MongoDB project. Mike maintains the MongoDB client drivers for Python and Ruby. He also takes time out to talk about MongoDB and has presented at EuroPython, Strange Loop Conf, ReR, RuPy, RubyConf, CodeMash and FLOSS Weekly as well as at various meetup groups around the world. He is currently working on writing MongoDB:

MongoDB is a NoSQL database, instead of storing data in tables, it stores structured data in JSON-like documents, also known as BSON (Binary JSON).

SQL operations versus MongoDB operations

MongoDB queries are similar in concept to SQL queries and use a lot of the same terminology. There is no special language or syntax to execute MongoDB queries; you simply assemble a JSON object. The MongoDB site has a complete set of example queries done in both SQL and MongoDB JSON docs to highlight the conceptual similarities. What follows is several small listings to compare MongoDB operations to SQL.

Insert

SQL
INSERT INTO CONTACTS (NAME, PHONE_NUMBER) VALUES('Shekhar Shete','520-555-1212')
MongoDB
db.contacts.insert({name:'Shekhar Shete',phoneNumber:'520-555-1212'})

Selects

SQL
SELECT name, phone_number FROM contacts WHERE age=30 ORDER BY name DESC
MongoDB
db.contacts.find({age:30}, {name:1,phoneNumber:1}).sort({name:-1})
SQL
SELECT name, phone_number FROM contacts WHERE age>30 ORDER BY name DESC
MongoDB
db.contacts.find({age:{$gt:33}}, {name:1,phoneNumber:1}).sort({name:-1})

Creating indexes

SQL
CREATE INDEX contact_name_idx ON contact(name DESC)
MongoDB
db.contacts.ensureIndex({name:-1})

Updates

SQL
UPDATE contacts SET phoneNumber='415-555-1212' WHERE name='Shekhar Shete'
MongoDB
db.contacts.update({name:'Shekhar Shete'}, {$set:{phoneNumber:1}}, false, true)

Terminology and Concepts

The following table presents the various SQL terminology and concepts and the corresponding MongoDB terminology and concepts.

SQL Terms/Concepts MongoDB Terms/Concepts
database database
table collection
row document or BSON document
column field
index index
table joins embedded documents and linking
primary keySpecify any unique column or column combination as primary key. primary keyIn MongoDB, the primary key is automatically set to the _idfield.

Examples

The following table presents the various SQL statements and the corresponding MongoDB statements. The examples in the table assume the following conditions:

  • The SQL examples assume a table named users.
  • The MongoDB examples assume a collection named users that contain documents of the following prototype:
    {
      _id: ObjectID("509a8fb2f3f4948bd2f983a0"),
      user_id: "abc123",
      age: 55,
      status: 'A'
    }

Create and Alter

The following table presents the various SQL statements related to table-level actions and the corresponding MongoDB statements.

SQL Schema Statements MongoDB Schema Statements Reference
CREATE TABLE users (
    id MEDIUMINT NOT NULL
        AUTO_INCREMENT,
    user_id Varchar(30),
    age Number,
    status char(1),
    PRIMARY KEY (id)
)
Implicitly created on first insert operation. The primary key _id is automatically added if _id field is not specified.

db.users.insert( {
    user_id: "abc123",
    age: 55,
    status: "A"
 } )

However, you can also explicitly create a collection:

db.createCollection("users")
See insert() andcreateCollection()for more information.
ALTER TABLE users
ADD join_date DATETIME
Collections do not describe or enforce the structure of the constituent documents. See the Schema Design wiki page for more information. See update() and$set for more information on changing the structure of documents in a collection.
ALTER TABLE users
DROP COLUMN join_date
Collections do not describe or enforce the structure of the constituent documents. See the Schema Design wiki page for more information. See update() and$set for more information on changing the structure of documents in a collection.
CREATE INDEX idx_user_id_asc
ON users(user_id)
db.users.ensureIndex( { user_id: 1 } )
See ensureIndex()and indexes for more information.
CREATE INDEX
       idx_user_id_asc_age_desc
ON users(user_id, age DESC)
db.users.ensureIndex( { user_id: 1, age: -1 } )
See ensureIndex()and indexes for more information.
DROP TABLE users
db.users.drop()
See drop() for more information.

Insert

The following table presents the various SQL statements related to inserting records into tables and the corresponding MongoDB statements.

SQL INSERT Statements MongoDB insert() Statements Reference
INSERT INTO users(user_id,
                  age,
                  status)
VALUES ("bcd001",
        45,
        "A")
db.users.insert( {
       user_id: "bcd001",
       age: 45,
       status: "A"
} )
See insert() for more information.

Select

The following table presents the various SQL statements related to reading records from tables and the corresponding MongoDB statements.

SQL SELECT Statements MongoDB find() Statements Reference
SELECT *
FROM users
db.users.find()
See find()for more information.
SELECT id, user_id, status
FROM users
db.users.find(
    { },
    { user_id: 1, status: 1 }
)
See find()for more information.
SELECT user_id, status
FROM users
db.users.find(
    { },
    { user_id: 1, status: 1, _id: 0 }
)
See find()for more information.
SELECT *
FROM users
WHERE status = "A"
db.users.find(
    { status: "A" }
)
See find()for more information.
SELECT user_id, status
FROM users
WHERE status = "A"
db.users.find(
    { status: "A" },
    { user_id: 1, status: 1, _id: 0 }
)
See find()for more information.
SELECT *
FROM users
WHERE status != "A"
db.users.find(
    { status: { $ne: "A" } }
)
See find()and $ne for more information.
SELECT *
FROM users
WHERE status = "A"
AND age = 50
db.users.find(
    { status: "A",
      age: 50 }
)
See find()and $and for more information.
SELECT *
FROM users
WHERE status = "A"
OR age = 50
db.users.find(
    { $or: [ { status: "A" } ,
             { age: 50 } ] }
)
See find()and $or for more information.
SELECT *
FROM users
WHERE age > 25
db.users.find(
    { age: { $gt: 25 } }
)
See find()and $gt for more information.
SELECT *
FROM users
WHERE age < 25
db.users.find(
   { age: { $lt: 25 } }
)
See find()and $lt for more information.
SELECT *
FROM users
WHERE age > 25
AND   age <= 50
db.users.find(
   { age: { $gt: 25, $lte: 50 } }
)
See find(),$gt, and$lte for more information.
SELECT *
FROM users
WHERE user_id like "%shek%"
db.users.find(
   { user_id: /shek/ }
)
See find()and $regexfor more information.
SELECT *
FROM users
WHERE user_id like "shek%"
db.users.find(
   { user_id: /^shek/ }
)
See find()and $regexfor more information.
SELECT *
FROM users
WHERE status = "A"
ORDER BY user_id ASC
db.users.find( { status: "A" } ).sort( { user_id: 1 } )
See find()and sort()for more information.
SELECT *
FROM users
WHERE status = "A"
ORDER BY user_id DESC
db.users.find( { status: "A" } ).sort( { user_id: -1 } )
See find()and sort()for more information.
SELECT COUNT(*)
FROM users
db.users.count()

or

db.users.find().count()
See find()and count()for more information.
SELECT COUNT(user_id)
FROM users
db.users.count( { user_id: { $exists: true } } )

or

db.users.find( { user_id: { $exists: true } } ).count()
See find(),count(), and $existsfor more information.
SELECT COUNT(*)
FROM users
WHERE age > 30
db.users.count( { age: { $gt: 30 } } )

or

db.users.find( { age: { $gt: 30 } } ).count()
See find(),count(), and $gt for more information.
SELECT DISTINCT(status)
FROM users
db.users.distinct( "status" )
See find()anddistinct()for more information.
SELECT *
FROM users
LIMIT 1
db.users.findOne()

or

db.users.find().limit(1)
See find(),findOne(), and limit()for more information.
SELECT *
FROM users
LIMIT 5
SKIP 10
db.users.find().limit(5).skip(10)
See find(),limit(), and skip()for more information.
EXPLAIN SELECT *
FROM users
WHERE status = "A"
db.users.find( { status: "A" } ).explain()
See find()andexplain()for more information.

Update Records

The following table presents the various SQL statements related to updating existing records in tables and the corresponding MongoDB statements.

SQL Update Statements MongoDB update() Statements Reference
UPDATE users
SET status = "C"
WHERE age > 25
db.users.update(
   { age: { $gt: 25 } },
   { $set: { status: "C" } },
   { multi: true }
)
See update()$gt, and $set for more information.
UPDATE users
SET age = age + 3
WHERE status = "A"
db.users.update(
   { status: "A" } ,
   { $inc: { age: 3 } },
   { multi: true }
)
See update()$inc, and $set for more information.

Delete Records

The following table presents the various SQL statements related to deleting records from tables and the corresponding MongoDB statements.

SQL Delete Statements MongoDB remove() Statements Reference
DELETE FROM users
WHERE status = "D"
db.users.remove( { status: "D" } )
See remove() for more information.
DELETE FROM users
db.users.remove( )
See remove() for more information.

SQL to Aggregation Framework Mapping Chart

The aggregation framework allows MongoDB to provide native aggregation capabilities that corresponds to many common data aggregation operations in SQL. If you’re new to MongoDB you might want to consider the Frequently Asked Questions section for a selection of common questions.

The following table provides an overview of common SQL aggregation terms, functions, and concepts and the corresponding MongoDB aggregation operators:

SQL Terms, Functions, and Concepts MongoDB Aggregation Operators
WHERE $match
GROUP BY $group
HAVING $match
SELECT $project
ORDER BY $sort
LIMIT $limit
SUM() $sum
COUNT() $sum
join No direct corresponding operator; however, the $unwind operator allows for somewhat similar functionality, but with fields embedded within the document.

Examples

The following table presents a quick reference of SQL aggregation statements and the corresponding MongoDB statements. The examples in the table assume the following conditions:

  • The SQL examples assume two tables, orders and order_lineitem that join by the order_lineitem.order_id and theorders.id columns.
  • The MongoDB examples assume one collection orders that contain documents of the following prototype:
    {
      cust_id: "abc123",
      ord_date: ISODate("2012-11-02T17:04:11.102Z"),
      status: 'A',
      price: 50,
      items: [ { sku: "xxx", qty: 25, price: 1 },
               { sku: "yyy", qty: 25, price: 1 } ]
    }
  • The MongoDB statements prefix the names of the fields from the documents in the collection orders with a $ character when they appear as operands to the aggregation operations.
SQL Example MongoDB Example Description
SELECT COUNT(*) AS count
FROM orders
db.orders.aggregate( [
   { $group: { _id: null,
               count: { $sum: 1 } } }
] )
Count all records fromorders
SELECT SUM(price) AS total
FROM orders
db.orders.aggregate( [
   { $group: { _id: null,
               total: { $sum: "$price" } } }
] )
Sum theprice field from orders
SELECT cust_id,
       SUM(price) AS total
FROM orders
GROUP BY cust_id
db.orders.aggregate( [
   { $group: { _id: "$cust_id",
               total: { $sum: "$price" } } }
] )
For each uniquecust_id, sum theprice field.
SELECT cust_id,
       SUM(price) AS total
FROM orders
GROUP BY cust_id
ORDER BY total
db.orders.aggregate( [
   { $group: { _id: "$cust_id",
               total: { $sum: "$price" } } },
   { $sort: { total: 1 } }
] )
For each uniquecust_id, sum theprice field, results sorted by sum.
SELECT cust_id,
       ord_date,
       SUM(price) AS total
FROM orders
GROUP BY cust_id, ord_date
db.orders.aggregate( [
   { $group: { _id: { cust_id: "$cust_id",
                      ord_date: "$ord_date" },
               total: { $sum: "$price" } } }
] )
For each uniquecust_id,ord_dategrouping, sum the pricefield.
SELECT cust_id, count(*)
FROM orders
GROUP BY cust_id
HAVING count(*) > 1
db.orders.aggregate( [
   { $group: { _id: "$cust_id",
               count: { $sum: 1 } } },
   { $match: { count: { $gt: 1 } } }
] )
For cust_idwith multiple records, return the cust_idand the corresponding record count.
SELECT cust_id,
       ord_date,
       SUM(price) AS total
FROM orders
GROUP BY cust_id, ord_date
HAVING total > 250
db.orders.aggregate( [
   { $group: { _id: { cust_id: "$cust_id",
                      ord_date: "$ord_date" },
               total: { $sum: "$price" } } },
   { $match: { total: { $gt: 250 } } }
] )
For each uniquecust_id,ord_dategrouping, sum the pricefield and return only where the sum is greater than 250.
SELECT cust_id,
       SUM(price) as total
FROM orders
WHERE status = 'A'
GROUP BY cust_id
db.orders.aggregate( [
   { $match: { status: 'A' } },
   { $group: { _id: "$cust_id",
               total: { $sum: "$price" } } }
] )
For each uniquecust_idwith status A, sum theprice field.
SELECT cust_id,
       SUM(price) as total
FROM orders
WHERE status = 'A'
GROUP BY cust_id
HAVING total > 250
db.orders.aggregate( [
   { $match: { status: 'A' } },
   { $group: { _id: "$cust_id",
               total: { $sum: "$price" } } },
   { $match: { total: { $gt: 250 } } }
] )
For each uniquecust_idwith status A, sum theprice field and return only where the sum is greater than 250.
SELECT cust_id,
       SUM(li.qty) as qty
FROM orders o,
     order_lineitem li
WHERE li.order_id = o.id
GROUP BY cust_id
db.orders.aggregate( [
   { $unwind: "$items" },
   { $group: { _id: "$cust_id",
               qty: { $sum: "$items.qty" } } }
] )
For each uniquecust_id, sum the corresponding line item qtyfields associated with the orders.
SELECT COUNT(*)
FROM (SELECT cust_id, ord_date
      FROM orders
      GROUP BY cust_id, ord_date) as DerivedTable
db.orders.aggregate( [
   { $group: { _id: { cust_id: "$cust_id",
                      ord_date: "$ord_date" } } },
   { $group: { _id: null, count: { $sum: 1 } } }
] )

Query, Update, Projection, and Aggregation Operators

Click Here for More

Happy Coding with MongoDB…! 🙂

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s