Introduction
Benjamin Darfler
o @bdarfler
o http://bdarfler.com
o Senior Software Engineer at Localytics
Localytics
o Real time analytics for mobile applications
o 100M+ datapoints a day
o More than 2x growth over the past 4 months
o Heavy users of Scala, MongoDB and AWS
This Talk
o Revised and updated from MongoNYC 2011
MongoDB at Localytics
Use cases
o Anonymous loyalty information
o De-duplication of incoming data
Scale today
o Hundreds of GBs of data per shard
o Thousands of ops per second per shard
History
o In production for ~8 months
o Increased load 10x in that time
o Reduced shard count by more than half
Disclaimer
Quick Poll
Who is using MongoDB in production?
Who is deployed on AWS?
Who has a sharded deployment?
o More than 2 shards?
o More than 4 shards?
o More than 8 shards?
Shorten Names
Before
{super_happy_fun_awesome_name:"yay!"}
After
{s:"yay!"}
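Field names are stored inside every BSON document, so a shorter name saves bytes on each one. A minimal sketch of renaming fields before they are written, in plain JavaScript with no driver; `FIELD_MAP`, `shortenKeys` and `nameBytesSaved` are hypothetical names, not something from the talk.

```javascript
// Hypothetical mapping from readable field names to short stored names.
const FIELD_MAP = new Map([["super_happy_fun_awesome_name", "s"]]);

// Return a copy of doc with mapped field names shortened;
// fields without a mapping keep their original name.
function shortenKeys(doc, fieldMap) {
  const out = {};
  for (const [key, value] of Object.entries(doc)) {
    out[fieldMap.has(key) ? fieldMap.get(key) : key] = value;
  }
  return out;
}

// Bytes of field-name text saved per document (names are UTF-8 in BSON).
function nameBytesSaved(doc, fieldMap) {
  let saved = 0;
  for (const key of Object.keys(doc)) {
    if (fieldMap.has(key)) {
      saved += Buffer.byteLength(key, "utf8") -
               Buffer.byteLength(fieldMap.get(key), "utf8");
    }
  }
  return saved;
}

const before = { super_happy_fun_awesome_name: "yay!" };
const after = shortenKeys(before, FIELD_MAP);
const savedPerDoc = nameBytesSaved(before, FIELD_MAP);
```

Keeping the mapping in one place lets application code keep readable names while only the stored documents use the short ones.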
After
{u:BinData(0, "...")}
Override _id
Before
{_id:ObjectId("..."), u:BinData(0, "...")}
After
{_id:BinData(0, "...")}
Pre-aggregate
Before
{u:BinData(0, "..."), k:BinData(0, "abc")}
{u:BinData(0, "..."), k:BinData(0, "abc")}
{u:BinData(0, "..."), k:BinData(0, "def")}
After
{u:BinData(0, "..."), k:BinData(0, "abc"), c:2}
{u:BinData(0, "..."), k:BinData(0, "def"), c:1}
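The Before/After pair above amounts to replacing repeated rows with one row per distinct key plus a count. A rough sketch in plain JavaScript; grouping on the (u, k) pair is an assumption about the intended schema, `preAggregate` is a hypothetical helper, and plain strings stand in for the BinData values.

```javascript
// Collapse duplicate (u, k) rows into one row per pair with a count c.
// Strings stand in for the BinData values on the slide.
function preAggregate(rows) {
  const counts = new Map();
  for (const { u, k } of rows) {
    const id = u + "|" + k;
    const entry = counts.get(id);
    if (entry) {
      entry.c += 1;
    } else {
      counts.set(id, { u, k, c: 1 });
    }
  }
  // Map preserves insertion order, so rows come back in first-seen order.
  return Array.from(counts.values());
}

const aggregated = preAggregate([
  { u: "user1", k: "abc" },
  { u: "user1", k: "abc" },
  { u: "user1", k: "def" },
]);
```

In a live deployment the count would more likely be maintained at write time, for example with a `$inc` update like the one on the Prefetch slide, rather than by a batch pass like this.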
Prefix Indexes
Before
{k:BinData(0, "...")} // indexed
After
{
  p:BinData(0, "..."), // prefix of k, indexed
  s:BinData(0, "...")  // suffix of k, not indexed
}
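Splitting the key this way keeps the index small, at the cost of comparing the suffix on each candidate document. A minimal sketch using Node's `Buffer` in place of BinData; `splitKey`, `matches` and the 4-byte prefix length are hypothetical choices, not values from the talk.

```javascript
// Split a binary key into an indexed prefix p and an unindexed suffix s.
// prefixBytes is a hypothetical tuning knob.
function splitKey(key, prefixBytes) {
  if (prefixBytes < 0 || prefixBytes > key.length) {
    throw new RangeError("prefixBytes must be between 0 and key.length");
  }
  return {
    p: key.subarray(0, prefixBytes),
    s: key.subarray(prefixBytes),
  };
}

// A lookup narrows on p (the indexed part), then compares s to discard
// documents that merely share the prefix.
function matches(doc, key, prefixBytes) {
  const { p, s } = splitKey(key, prefixBytes);
  return doc.p.equals(p) && doc.s.equals(s);
}

const key = Buffer.from("00112233445566778899aabbccddeeff", "hex");
const stored = splitKey(key, 4);
const samePrefix = Buffer.from("00112233ffffffffffffffffffffffff", "hex");
```

A shorter prefix shrinks the index further but lets more documents share a prefix, so more suffix comparisons are needed per lookup.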
Sparse Indexes
Create a sparse index
db.collection.ensureIndex({middle:1}, {sparse:true});
After
db.collection.find().limit(10);
db.collection.findOne();
After
db.collection.find({u:BinData(0, "...")}, {field:1});
Covering Indexes
Create an index
db.collection.ensureIndex({first:1, last:1});
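A query can be answered from the index alone when every field it filters on and every field it returns is part of the index. The sketch below checks that simplified rule in plain JavaScript; `isCovered` is a hypothetical helper, the value "Ben" is only illustrative, and special cases such as array fields are ignored.

```javascript
// Simplified coverage check: every queried field and every returned field
// must be in the index. _id is returned by default, so it has to be
// excluded explicitly unless it is indexed too.
function isCovered(indexSpec, query, projection) {
  const indexed = new Set(Object.keys(indexSpec));
  const included = Object.keys(projection).filter((f) => projection[f]);
  // With no included fields the whole document is returned: not covered.
  if (included.length === 0) return false;
  const idExcluded = projection._id === 0 || projection._id === false;
  const returned =
    idExcluded || included.includes("_id") ? included : [...included, "_id"];
  return [...Object.keys(query), ...returned].every((f) => indexed.has(f));
}

const index = { first: 1, last: 1 };
const covered = isCovered(index, { first: "Ben" }, { first: 1, last: 1, _id: 0 });
const notCovered = isCovered(index, { first: "Ben" }, { first: 1, last: 1 });
```

The second call is not covered only because `_id` is returned by default; excluding it in the projection is what makes the first call qualify.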
Prefetch
Before
db.collection.update({u:BinData(0, "...")}, {$inc:{c:1}});
After
db.collection.find({u:BinData(0, "...")});
db.collection.update({u:BinData(0, "...")}, {$inc:{c:1}});
Inserts
[Diagram: doc1 to doc5 stored one after another]
Deletes
[Diagram: two rows showing doc1 to doc5]
Updates
[Diagram: two rows showing doc1 to doc5, with doc3 appearing again at the end]
Reclaiming Freespace
[Diagram: doc1 to doc5, then doc1, doc2, doc6, doc4, doc5, with doc6 in the position doc3 held; brackets mark two pages]
Fragmentation
Fragmentation Mitigation
Automatic Padding
o MongoDB auto-pads records
o Manual tuning scheduled for 2.2
Manual Padding
o Pad arrays that are known to grow
o Pad with a BinData field, then remove it
Free list improvements shipped in 2.0, with more scheduled for 2.2
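The manual padding bullets can be read as a two-step pattern: insert the document with a throwaway field that reserves space, then remove that field so later growth fits in the same record. A rough sketch in plain JavaScript that only builds the two documents involved; `withPadding`, `removePaddingUpdate`, the field name `_pad` and the padding size are all hypothetical.

```javascript
// Build the document to insert, with a throwaway padding field.
// Buffer stands in for a BinData value; "_pad" is a hypothetical name.
function withPadding(doc, padBytes) {
  return { ...doc, _pad: Buffer.alloc(padBytes) };
}

// Update document that removes the padding field again after the insert.
function removePaddingUpdate() {
  return { $unset: { _pad: 1 } };
}

const toInsert = withPadding({ u: "user1", c: 0 }, 256);
const unpad = removePaddingUpdate();
```

In the shell, `toInsert` would be passed to an insert and `unpad` to a following update on the same document. How much to pad depends on how far the document is expected to grow, which this sketch does not try to estimate.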
Fragmentation Fixes
Repair
o db.repairDatabase();
with primary
Compact
o db.collection.runCommand("compact");
[Diagram: Shard 2; Chunk 1 (k: 1 to 5) and Chunk 2 (k: 6 to 9); documents with k between 3 and 8 shown out of key order]
db.adminCommand({split:"db.collection", middle:{_id:99}});
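Pre-splitting usually means issuing one split command like the one above per chunk boundary. A minimal sketch that only builds the command documents for evenly spaced boundaries over an integer key range; `preSplitCommands` and the range 0 to 400 are hypothetical, and a real shard key may need boundaries chosen from its actual distribution.

```javascript
// Build one split command per boundary so the integer range [min, max)
// is divided into `chunks` roughly equal chunks.
function preSplitCommands(ns, field, min, max, chunks) {
  const commands = [];
  const step = (max - min) / chunks;
  for (let i = 1; i < chunks; i++) {
    const middle = {};
    middle[field] = Math.floor(min + i * step);
    commands.push({ split: ns, middle });
  }
  return commands;
}

// Three boundaries (100, 200, 300) give four chunks.
const cmds = preSplitCommands("db.collection", "_id", 0, 400, 4);
```

Each element of `cmds` has the same shape as the command above and could be passed to `db.adminCommand` in the shell.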
Pre-move
[Diagram: Shard 2; Chunk 1 (k: 1 to 5) and Chunk 2 (k: 6 to 9); documents with k between 3 and 8 shown in key order]
Pathological Test
What happens when data far exceeds RAM?
o 10:1 read/write ratio
o Reads evenly distributed over entire key space
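A workload with these properties could be generated along the following lines. This is a sketch under stated assumptions rather than the harness used for the talk: `makeWorkload` is a hypothetical helper, and writes are assumed to be spread uniformly like reads, which the slide does not say.

```javascript
// Generate operations with a fixed 10:1 read/write ratio. Each operation
// targets a key drawn uniformly from [0, keySpace).
// rand is injectable so a seeded generator can be swapped in.
function makeWorkload(totalOps, keySpace, rand = Math.random) {
  const ops = [];
  for (let i = 0; i < totalOps; i++) {
    // Every 11th operation is a write; the ten before it are reads.
    const type = i % 11 === 10 ? "write" : "read";
    ops.push({ type, key: Math.floor(rand() * keySpace) });
  }
  return ops;
}

const ops = makeWorkload(110, 1000);
const reads = ops.filter((op) => op.type === "read").length;
const writes = ops.filter((op) => op.type === "write").length;
```

Because keys are uniform over the whole key space, a cache smaller than the data set gets little benefit from locality, which is what makes the test pathological.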
One Mongod
Index in RAM
Many Mongods
Index in RAM
Reminder
Questions?
@bdarfler
http://bdarfler.com