Speed up Python MongoDB query
I am new to MongoDB and I am trying to read data from several collections. I want to do some statistics on GHTorrent, so I am trying to write a .csv with the data I am interested in. The problem is that my query has now been running for some 30 minutes, and I'm sure my lookups are less efficient than they could be. I'm just not sure how to improve them.
First of all, I do:

    closed_issues = ghdb.issues.find(
        {"state": "closed"},  # query criteria
        {  # projection
            "id": 1,
            "number": 1,
            "created_at": 1,
            "closed_at": 1,
            "comments": 1,
            "repo": 1,
            "owner": 1,
        },
    )

Then, after opening a file and writing the header line, I do:

    for issue in closed_issues:
        countMentioned = ghdb.issue_events.find(
            {"issue_id": issue['number'], "repo": issue['repo'],
             "owner": issue['owner'], "event": "mentioned"}).count()
        countSubscribed = ghdb.issue_events.find(
            {"issue_id": issue['number'], "repo": issue['repo'],
             "owner": issue['owner'], "event": "subscribed"}).count()
        countAssigned = ghdb.issue_events.find(
            {"issue_id": issue['number'], "repo": issue['repo'],
             "owner": issue['owner'], "event": "assigned"}).count()

        time_created = parser.parse(issue['created_at'])
        time_closed = parser.parse(issue['closed_at'])
        timediff = time_closed - time_created

        f.write(str(issue['id']) + "," + str(issue['number']) + ","
                + str(issue['repo']) + "," + str(timediff.total_seconds()) + ","
                + str(issue['comments']) + "," + str(countMentioned) + ","
                + str(countSubscribed) + "," + str(countAssigned) + '\n')

As you can see, I use three of the four criteria in three different lookups per issue. What is the most efficient way to search for the combination of issue_id, repo and owner, and count each of the three different event types?
The MongoDB Aggregation Framework is a great tool for queries like this that produce aggregated data.
I would start there and experiment with it a bit. In a use case like this you can usually begin with an aggregation and then wrap some extra code around the result to export the data in the format you want.
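To make that suggestion concrete, here is a minimal sketch of how the three per-issue lookups might be collapsed into a single aggregation over issue_events, followed by the CSV export. It reuses the field names from the question (issue_id, repo, owner, event). The helper names, the CSV header, the timestamp format, and the idea of holding all counts in an in-memory dict are my own assumptions, so treat it as a starting point rather than a tuned solution.

```python
import csv
from datetime import datetime

EVENT_TYPES = ("mentioned", "subscribed", "assigned")


def build_event_count_pipeline(event_types=EVENT_TYPES):
    """Pipeline counting events per (issue_id, repo, owner, event)."""
    return [
        # Keep only the event types of interest.
        {"$match": {"event": {"$in": list(event_types)}}},
        # One output document per issue and event type.
        {"$group": {
            "_id": {
                "issue_id": "$issue_id",
                "repo": "$repo",
                "owner": "$owner",
                "event": "$event",
            },
            "count": {"$sum": 1},
        }},
    ]


def index_counts(grouped_docs):
    """Reshape aggregation output to {(issue_id, repo, owner): {event: count}}."""
    counts = {}
    for doc in grouped_docs:
        key = doc["_id"]
        issue_key = (key["issue_id"], key["repo"], key["owner"])
        counts.setdefault(issue_key, {})[key["event"]] = doc["count"]
    return counts


def parse_timestamp(value):
    # Assumption: timestamps are stored as strings like "2013-01-01T00:00:00Z".
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")


def write_csv(issues, counts, out, event_types=EVENT_TYPES):
    """Write one CSV row per issue, using 0 for event types that never occurred."""
    writer = csv.writer(out)
    writer.writerow(["id", "number", "repo", "seconds_open", "comments", *event_types])
    for issue in issues:
        open_time = parse_timestamp(issue["closed_at"]) - parse_timestamp(issue["created_at"])
        # The question matches issue_events.issue_id against issues.number.
        per_event = counts.get((issue["number"], issue["repo"], issue["owner"]), {})
        writer.writerow([
            issue["id"], issue["number"], issue["repo"],
            open_time.total_seconds(), issue["comments"],
            *(per_event.get(event, 0) for event in event_types),
        ])


def export_closed_issues(ghdb, out):
    """Run the aggregation once, then stream the closed issues into the CSV."""
    grouped = ghdb.issue_events.aggregate(build_event_count_pipeline(), allowDiskUse=True)
    counts = index_counts(grouped)
    issues = ghdb.issues.find(
        {"state": "closed"},
        {"id": 1, "number": 1, "created_at": 1, "closed_at": 1,
         "comments": 1, "repo": 1, "owner": 1},
    )
    write_csv(issues, counts, out)
```

You would call export_closed_issues(ghdb, f) with your existing database handle and an open file (opened with newline="" so the csv module controls line endings). The point of the design is that issue_events is scanned once instead of three times per issue. If the counts do not fit in memory for the full GHTorrent data set, the same pipeline could be restricted to one repo or owner at a time by adding those fields to the $match stage.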