How To
SSDEEP tasks hanging and generating a large Hangfire queue
SSDEEP tasks take much longer than usual to process (in some cases more than 24 hours), and the Hangfire page shows errors like the following:

    ProxyError: HTTPSConnectionPool(host='swimlane.corpzone.internalzone.com', port=443): Max retries exceeded with url: /api/user/authorize (Caused by ProxyError('Cannot connect to proxy.', NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x0000000003232c88>: Failed to establish a new connection: [Errno 11001] getaddrinfo failed

The usual remedies do not fix the problem:

- Dropping the Hangfire jobgraph collection with `db.hangfire.jobgraph.drop()` does not resolve the issue.
- Restarting the Swimlane Tasks service on Windows helps for only a few minutes, then the issue starts again.
- Restarting the tasks server on a Linux system with `docker stop sw_tasks` and `docker start sw_tasks` also helps for only a few minutes, then the issue starts again.

The issue lies in the SSDEEP integration task's Python script.

Workaround

Modify the following in the Python integration script:

1. Loopback time. The loopback time is used to create a record filter; a `for` loop then runs through the filtered records and processes the data. If the search filter returns thousands of records, the task can stay in that loop for hours, which builds up a huge queue. The loopback time variable is set to 24 hours (1 day) by default. Set it to a smaller value, between 1 and 12 hours.

2. Comment out the deduped email and tracking id filter lines in `fuzzymatch()`. Depending on what you see in the logs, you may need to comment out the `NOT_EQ` filter lines for the deduped email and tracking id:

        records.filter('deduped email', NOT_EQ, 'yes')
        records.filter('tracking id', NOT_EQ, trackingid)

   Mongo profiler logs indicated that the `NOT_EQ` comparison was preventing Mongo from using its indexes, which slowed the record lookups and caused a huge processing queue in Hangfire. Comment both lines out and replace them with the following checks inside the `for` loop:

        for record in records:
            if record.tracking_id == trackingid:
                continue
            if record['deduped alert'] == 'yes':
                continue

3. Sleep timer. Limiting the sleep times in the code should help overall performance. Update every sleep timer in the code so that it sleeps for less than 3 seconds. The default may be `(0, 3)`, which means sleep between 0 and 3 seconds. Reduce all of them to `(0.5, 1)`.

4. Restart the task and API pods.
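
The three script changes above can be sketched together as follows. This is a minimal illustration and not the integration's actual code: the names `LOOPBACK_HOURS`, `loopback_cutoff`, `candidates`, and `short_pause` are hypothetical, plain dictionaries stand in for Swimlane records, and the use of `random.uniform` for the sleep range is an assumption about how the `(0, 3)` pair is consumed.

```python
import random
import time
from datetime import datetime, timedelta

# Assumed value inside the 1 to 12 hour range suggested above (default was 24).
LOOPBACK_HOURS = 6


def loopback_cutoff(now, hours=LOOPBACK_HOURS):
    """Return the earliest timestamp the record search should include."""
    return now - timedelta(hours=hours)


def candidates(records, tracking_id):
    """Yield records worth comparing.

    Skips the record being processed and records already marked as
    deduped, doing in Python what the two NOT_EQ filters did in the query.
    """
    for record in records:
        if record["tracking id"] == tracking_id:
            continue
        if record["deduped alert"] == "yes":
            continue
        yield record


def short_pause(low=0.5, high=1.0):
    """Sleep for a random interval between low and high seconds; return it."""
    delay = random.uniform(low, high)
    time.sleep(delay)
    return delay


if __name__ == "__main__":
    # Hypothetical sample data standing in for the search results.
    sample = [
        {"tracking id": "A-1", "deduped alert": "no"},
        {"tracking id": "A-2", "deduped alert": "yes"},
        {"tracking id": "A-3", "deduped alert": "no"},
    ]
    print("search records newer than:", loopback_cutoff(datetime.now()))
    for rec in candidates(sample, "A-1"):
        print("comparing against", rec["tracking id"])
        short_pause()
```

Filtering in the loop trades a slightly larger result set for a query that Mongo can serve from its indexes, which is why it pairs well with the shorter loopback window.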