Extract/display Cryosparc metrics data

Hi,
I am running a CryoSPARC instance on a multi-user cluster and so far it has been a successful adventure.
I would like to plot the usage or the usage growth (or other metrics) on a per-month basis. In the CryoSPARC dashboard, I can see how many jobs were submitted last week or month, but I want to know how I can query the CryoSPARC’s Mongo DB to get a per month result to prepare nice plots for a yearly report

I am categorizing it as a request for a new feature, but what I need right now is just some sample db query that I can use to get the usage data in a given time range

Best regards,
Tomek

1 Like

Hi @osinskit ,

Thanks for the post! You can certainly query the cryoSPARC MongoDB to gather aggregate information. First, enter the MongoDB shell:

cryosparcm mongo

Count of submitted/queued jobs by month:

db.jobs.aggregate([
{$match: { queued_at: { $ne: null }}},
{$group: { _id: { month: { $month: "$queued_at" }, year: { $year: "$queued_at" } }, count: {$sum: 1} }},
{$project: { _id: 0, year: '$_id.year', month: '$_id.month', count: 1 } },
{$sort: { year: 1, month: 1 }}
]);

Count of completed jobs by month:

db.jobs.aggregate([
{$match: { status: 'completed', completed_at: { $ne: null }}},
{$group: { _id: { month: { $month: "$completed_at" }, year: { $year: "$completed_at" } }, count: {$sum: 1} }},
{$project: { _id: 0, year: '$_id.year', month: '$_id.month', count: 1 } },
{$sort: { year: 1, month: 1 }}
]);

Count and runtime (seconds) of completed jobs since Jan 1, 2022 grouped by job type:

db.jobs.aggregate([
{ $match: { status: 'completed', deleted: false, completed_at: { $gte: new Date('January 1, 2022'), $lt: new Date() } } },
{ $project: { job_type: 1, running_at: 1, completed_at: 1, duration: {$divide: [{$subtract: ["$completed_at", "$running_at"]}, 1000]} } },
{ $group: { '_id': '$job_type', count: { $sum: 1 }, totalTimeSec: { $sum: '$duration' } } },
{ $sort: { totalTimeSec: -1 } }
])

Count and runtime (seconds) of completed jobs since Jan 1, 2022 grouped by date:

db.jobs.aggregate([
{ $match: { status: 'completed', deleted: false, completed_at: { $gte: new Date('January 1, 2022'), $lt: new Date() } } },
{ $project: { job_type: 1, running_at: 1, completed_at: 1, duration: {$divide: [{$subtract: ["$completed_at", "$running_at"]}, 1000]}, day: { $dateToString: { date: '$completed_at', format: '%Y-%m-%d' } } } },
{ $group: { '_id': '$day', count: { $sum: 1 }, totalTimeSec: { $sum: '$duration' } } },
{ $sort: { totalTimeSec: -1 } }
])

Hope that helps!

- Suhail

2 Likes

Exactly what I needed!
Thank you @sdawood