MongoDB Performance

Explain Query: https://nosqlbooster.com/
Tips: https://docs.mongodb.com/manual/tutorial/optimize-query-performance-with-indexes-and-projections/
MongoDB Profiler: https://studio3t.com/knowledge-base/articles/mongodb-query-performance/

-o-

MongoDB Connection Pool

https://stackoverflow.com/questions/41271707/java-mongodb-connection-pool

MongoDB Notes

Write Concern:
Acknowledge Write – 1 – DB Setting – False – Fast Response – Small error of missing write
Acknowledge Write – 1 – DB Setting – True – Slow Response – No Error
Unacknowledge Write-0 – Not waiting for server to respond.

In replicated environment, there are many more variables.

——-
Network Errors

use case 1: Write to DB, Due to network error, system didn’t responded. Data was written to disk
Option 1: Based on case, due to failure, try to write one more time.
Option 2: In sensitive case, read data to make sure that data was written and act acordingly.

——-
Replication

Availability
Fault Tolerance

To select Primary, we need to have odd number of servers.

—————

Types of Replica Set Nodes
1. Regular Node – Primary or Secondary
2. Arbiter Node – Only for voting purposes. No Data on it.
3. Delayed Node – It can’t become primary node. This is one hour late on updates with compare to other nodes.
4. Hidden Node – It can’t become primary node. Used for Analytics
All nodes can participate in election

——
Write Consistency

Always writes/reads goes to Primary
Application can read from secondary
Eventual consistency

During the time when failover is occurring, can writes successfully complete?
No

—————–

Sharding

mongos is router..takes care of distribution…
Sharding is used for horizontal scalability

—————–

MongoDB – Aggregation Expressions

sum
avg
min
max
push
addtoset
first
last
=================

Which of the following aggregation expressions must be used in conjunction with a sort to make any sense?
$first, $last
===================
use agg
db.products.aggregate([
{$group:
{
_id: {
“manufacturer”:”$manufacturer”
},
num_products:{$sum:”$price”}
}
}
])
================

mongoimport is independent command. Don’t run this from shell.
>mongoimport –db agg –collection zips –file zips.json

>mongo
>use agg
>db.zips.count()

use agg;
db.zips.aggregate([{“$group”:{“_id”:”$state”, “population”:{$sum:”$pop”}}}])

use agg;
db.zips.aggregate([{“$group”:{“_id”:”$state”, “average_pop”:{$avg:”$pop”}}}])

================
addToSet

use agg
db.products.aggregate([
{$group:
{
_id: {
“maker”:”$manufacturer”
},
categories:{$addToSet:”$category”}
}
}
])

{ “_id” : { “maker” : “Amazon” }, “categories” : [ “Tablets” ] }
{ “_id” : { “maker” : “Google” }, “categories” : [ “Tablets” ] }
{ “_id” : { “maker” : “Sony” }, “categories” : [ “Laptops” ] }
{ “_id” : { “maker” : “Samsung” }, “categories” : [ “Tablets”, “Cell Phones” ] }
{ “_id” : { “maker” : “Apple” }, “categories” : [ “Tablets”, “Laptops” ] }

================

Write an aggregation query that will return the postal codes that cover each city.

use agg;

db.zips.aggregate([{$group:{
“_id”: “$city”,
“postal_codes”: {“$addToSet”:”$_id”}
}}])
================
use agg
db.products.aggregate([
{$group:
{
_id: {
“maker”:”$manufacturer”
},
maxprice:{$max:”$price”}
}
}
])
================

MongoDB – Aggregation

db.stuff.insert({ “_id” : ObjectId(“50b26f9d80a78af03b5163c8”), “a” : 1, “b” : 1, “c” : 1 })
db.stuff.insert({ “_id” : ObjectId(“50b26fb480a78af03b5163c9”), “a” : 2, “b” : 2, “c” : 1 })
db.stuff.insert({ “_id” : ObjectId(“50b26fbf80a78af03b5163ca”), “a” : 3, “b” : 3, “c” : 1 })
db.stuff.insert({ “_id” : ObjectId(“50b26fcd80a78af03b5163cb”), “a” : 3, “b” : 3, “c” : 2 })
db.stuff.insert({ “_id” : ObjectId(“50b26fd380a78af03b5163cc”), “a” : 3, “b” : 5, “c” : 3 })

 

db.stuff.aggregate([{$group:{_id:'$c'}}])

{ "_id" : 3 }
{ "_id" : 2 }
{ "_id" : 1 }


---------------------

Compount Grouping

use agg
db.products.aggregate([
 {$group:
 {
 _id: {
 "manufacturer":"$manufacturer", 
 "category" : "$category"},
 num_products:{$sum:1}
 }
 }
])

use agg
db.products.aggregate([
 {$group:
 {
 _id: {
 "category" : "$category",
 "manufacturer":"$manufacturer"
 },
 num_products:{$sum:1}
 }
 }
])

---------------------

MongoDB – Aggregation

Aggregation Commands
aggregate
count
distinct
group
mapReduce

Aggregation Methods

https://docs.mongodb.com/manual/reference/operator/aggregation-pipeline/
Pipeline Aggregation Stages
$collStats
$project
$match
$redact
$limit
$skip
$unwind : Deconstructs an array field from the input documents to output a document for each element.
$group
$sample
$sort
$geoNear
$lookup
$out
$indexStats
$facet
$bucket
$bucketAuto
$sortByCount
$addFields
$replaceRoot
$count
$graphLookup

Boolean Aggregation Operators: $and, $or, $not

Set Operators: $setEquals, $setIntersection, $setUnion, $setDifference, $setIsSubset, $anyElementTrue, $allElementsTrue

Comparison Aggregation Operators: cmp, eq, gt, gte, lt, lte, ne

Arithmetic Aggregation Operators: abs, add, ceil, divide, exp, floor, ln, log, log10, mod, multiply, pow, sqrt, subtract, trunc

String Aggregation

Text Search Aggregation

Array Aggregation Operators

Aggregation Variable Operators

Aggregation Literal Operators

Date Aggregation Operators

Conditional Aggregation Operators

Group Accumulator Operators

Limitations:
By default 100 megabytes of RAM.
set the allowDiskUse option to true to overcome this issue.

MongoDB – Aggregation

use agg
db.products.drop()

db.products.insert({‘name’:’iPad 16GB Wifi’, ‘manufacturer’:”Apple”,
‘category’:’Tablets’,
‘price’:499.00})
db.products.insert({‘name’:’iPad 32GB Wifi’, ‘category’:’Tablets’,
‘manufacturer’:”Apple”,
‘price’:599.00})
db.products.insert({‘name’:’iPad 64GB Wifi’, ‘category’:’Tablets’,
‘manufacturer’:”Apple”,
‘price’:699.00})
db.products.insert({‘name’:’Galaxy S3′, ‘category’:’Cell Phones’,
‘manufacturer’:’Samsung’,
‘price’:563.99})
db.products.insert({‘name’:’Galaxy Tab 10′, ‘category’:’Tablets’,
‘manufacturer’:’Samsung’,
‘price’:450.99})
db.products.insert({‘name’:’Vaio’, ‘category’:’Laptops’,
‘manufacturer’:”Sony”,
‘price’:499.00})
db.products.insert({‘name’:’Macbook Air 13inch’, ‘category’:’Laptops’,
‘manufacturer’:”Apple”,
‘price’:499.00})
db.products.insert({‘name’:’Nexus 7′, ‘category’:’Tablets’,
‘manufacturer’:”Google”,
‘price’:199.00})
db.products.insert({‘name’:’Kindle Paper White’, ‘category’:’Tablets’,
‘manufacturer’:”Amazon”,
‘price’:129.00})
db.products.insert({‘name’:’Kindle Fire’, ‘category’:’Tablets’,
‘manufacturer’:”Amazon”,
‘price’:199.00})
————————-
use agg
db.products.aggregate([
{$group:
{
_id:”$manufacturer”,
num_products:{$sum:1}
}
}
])

{ “_id” : “Amazon”, “num_products” : 2 }
{ “_id” : “Google”, “num_products” : 1 }
{ “_id” : “Sony”, “num_products” : 1 }
{ “_id” : “Samsung”, “num_products” : 2 }
{ “_id” : “Apple”, “num_products” : 4 }
————————-
use agg
db.products.aggregate([
{$group:
{
_id:”$category”,
num_products:{$sum:1}
}
}
])

{ “_id” : “Laptops”, “num_products” : 2 }
{ “_id” : “Cell Phones”, “num_products” : 1 }
{ “_id” : “Tablets”, “num_products” : 7 }
————————-

MongoDB – Morphia

https://mongodb.github.io/morphia/

MongoDB – Schema Design

What’s the single most important factor in designing your application schema within MongoDB?
Matching the data access patterns of your application.

Few Points:
1. Rich Documents
2. Pre Join / Embed Data (Required joins should be stored along with Document)
3. No Mongo Joins (We need to do in Application level)
4. No Constraints
5. Atomic operations (Only in one document)
6. No Declared schema (All documents will follow same schema, with minor exceptions. i.e we can add few more required attributes)

——–
Normalization of Database

Normalization is used for mainly two purpose,

Eliminating redundant(useless) data.
Ensuring data dependencies make sense i.e data is logically stored.

http://www.studytonight.com/dbms/database-normalization.php

———-
Normalization, JDBC, DAO, Queries…all are making it complex to design RDBMS schema and developing application.
In today’s world, space is available at low cost.
Storing data in a denormalized format.
These things triggered Schema-less, Denormalized systems.
——–

MongoDB Transactions
No Support for Transactions
Supports ACID for single document
1. Design to fit all in one document
2. Implement in Software
3. Tolerate issues
———————-
One to One Relations

What’s a good reason you might want to keep two documents that are related to each other one-to-one in separate collections? Check all that apply.
1. To reduce working set size of application
2. Because the combined size of documents is more than 16MB

———————-
One to Many Relations

When is it recommended to represent a one to many relationship in multiple collections?
Whenever the many is large.
———————-
Many to Many relations:

Books with Authors: Both should be first class collections with reference to other object.
Students and Teachers: same as above

————————-

MultiKey Indexes

Improves efficiency in searching

Reference:
https://docs.mongodb.com/manual/core/index-multikey/
To index a field that holds an array value, MongoDB creates an index key for each element in the array. These multikey indexes support efficient queries against array fields. Multikey indexes can be constructed over arrays that hold both scalar values (e.g. strings, numbers) and nested documents.

also study .explain()
This shows how find is working smoothly with the multikey search.
———————–
Benefits of embedding
1. Improved read performance
2. One round trip to the DB.
————————
Trees:

https://docs.mongodb.com/manual/tutorial/model-tree-structures-with-ancestors-array/

————————
Denormalized Data

1:1 – Embed
1:Many – Embed
Many:Many – Link as arrays

—————————

MongoDB Search

Which of the choices below is the title of a movie from the year 2013 that is rated PG-13 and won no awards? Please query the video.movieDetails collection to find the answer.

db.movieDetails.find( { $and :[{“year”:2013},{“rated”:”PG-13″}, { awards : { $exists : true } }]})

db.movieDetails.find( { $and :[{“year”:2013},{“rated”:”PG-13″},{“awards.wins”:0}]})
——————–
Using the video.movieDetails collection, how many movies list “Sweden” second in the the list of countries.

db.movieDetails.find({“countries.1”: “Sweden”}).count()

——————–