The datacleanup utility is a resource-intensive process. The retention days option of the datacleanup command has a large impact on purge performance: the smaller the retention days value, the more data is purged and the longer the purge takes.
Environment
Product: TIBCO MDM 8.2.x, 8.3.x, and 9.x
Operating System: All OS supported by MDM
Database: All databases supported by MDM
App server: All app servers supported by MDM
Resolution
Instead of purging everything in one run, we recommend selecting the retention days so that each run purges a small data set and therefore finishes faster. We can run the purge several times to determine which retention days value gives acceptable purge performance. Once that value is selected, we can schedule the purge to run on a regular basis.
For example, suppose we have 100 days' worth of data that needs to be purged. Running the purge with -d 1 would be quite resource intensive. Instead, we can run the purge command with -d 95 first, then -d 90, then -d 85, -d 80, and so on. This way each purge execution is less resource intensive and finishes faster.
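The stepped schedule in the example above can be sketched in Python. This is only an illustration of the arithmetic: the function name retention_steps and its parameters are hypothetical, and the script just prints the -d values rather than invoking datacleanup, because the rest of the command line depends on the installation.

```python
def retention_steps(oldest_days, target_days, step):
    """Return decreasing retention-day values, ending exactly at target_days.

    oldest_days: age in days of the oldest data to purge (100 in the example).
    target_days: retention days we ultimately want to reach (1 in the example).
    step:        how many days of data each run should remove (5 in the example).
    All three names are illustrative, not datacleanup options.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    if target_days >= oldest_days:
        return []  # nothing older than the target, so no runs are needed
    values = list(range(oldest_days - step, target_days, -step))
    values.append(target_days)  # final run lands exactly on the target
    return values


# Print the -d value to use for each successive purge run.
for days in retention_steps(100, 1, 5):
    print(f"-d {days}")
```

A smaller step gives lighter individual runs at the cost of more runs; the right step is whatever keeps a single execution within an acceptable time on the system in question.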
For history purge you can use the following query to get the records and decide on retention units.
select ID from event ev where ev.eventDate <= and ev.parentID < 1 and ev.status != 'INPROGRESS' and not exists ( select '1' from event ev2 where ev.id = ev2.parentid and ev2.status = 'INPROGRESS') and not exists ( select '1' from recordsynclog r where ev.id = r.eventid)
For force history purge you can use the following query to get the records and decide on retention units.
select ID from event ev where ev.eventDate <= and ev.parentID < 1 and not exists ( select '1' from recordsynclog r where ev.id = r.eventid)
For record versions purge you can use the following query to get the records and decide on retention units.
select productkeyID, max(modVersion) from principalKey where moddate <= and catalogID = ? group by productKeyID
The queries above are the cursor queries used in each of the purge cases. When you run them with a cutoff date (derived from the retention unit) supplied for the date comparison, you will know which data will be part of the purge. Keep the data sets small so that the purge runs fast.
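As a rough sketch of how a candidate retention value could be sized before purging, the Python below counts the rows the history purge query would select for a given cutoff. Everything outside the query's WHERE clause is an assumption made for illustration: it uses an in-memory SQLite stand-in with only the columns the query references, dates stored as ISO text, a made-up 'SUCCESS' status, and a cutoff computed as the current date minus the retention days. Against a real MDM database, the driver, date type, and bind syntax will differ.

```python
import sqlite3
from datetime import date, timedelta

# Stand-in tables holding only the columns referenced by the history purge query.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE event (id INTEGER, parentID INTEGER, status TEXT, eventDate TEXT);
CREATE TABLE recordsynclog (eventid INTEGER);
""")

# Sample data: one completed top-level event per day for the last 100 days.
today = date(2024, 4, 10)  # fixed date so the sketch is reproducible
rows = [(i, 0, "SUCCESS", (today - timedelta(days=i)).isoformat())
        for i in range(1, 101)]
conn.executemany("INSERT INTO event VALUES (?, ?, ?, ?)", rows)

# Same conditions as the history purge cursor query, wrapped in COUNT(*).
COUNT_SQL = """
SELECT COUNT(*) FROM event ev
WHERE ev.eventDate <= ?
  AND ev.parentID < 1
  AND ev.status != 'INPROGRESS'
  AND NOT EXISTS (SELECT '1' FROM event ev2
                  WHERE ev.id = ev2.parentid AND ev2.status = 'INPROGRESS')
  AND NOT EXISTS (SELECT '1' FROM recordsynclog r WHERE ev.id = r.eventid)
"""


def purge_candidates(conn, retention_days, today):
    """Count events dated on or before (today - retention_days)."""
    cutoff = (today - timedelta(days=retention_days)).isoformat()
    return conn.execute(COUNT_SQL, (cutoff,)).fetchone()[0]


# Compare how many rows each candidate retention value would select.
for days in (95, 90, 85):
    print(days, purge_candidates(conn, days, today))
```

Comparing the counts for a few candidate values shows how much data each run would touch, which helps when choosing a retention unit that keeps the set small.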
Issue/Introduction
If we specify a small retention days value when running the datacleanup utility, it tends to run slowly because a huge amount of data needs to be deleted.