[Salesforce] Unable to bulk-extract ContentVersion and ContentDocumentLink data using the Salesforce API

Use Case:
Transfer existing out-of-the-box (OOTB) Files that are linked to a couple of custom objects (CO 1 and CO 2) to another custom object (CO X). CO X has lookup fields to CO 1 and CO 2 and is associated with the OOTB Files object as a related item. Note: this is not a migration of Attachments to Files.

This is an interim solution, so we are trying to avoid custom code.

Steps done so far:
1. Extracted a bulk CSV for the ContentDocument object using Workbench (WB). As expected, ParentId is blank. We have 4100 files in total, which are associated with many COs, including CO 1, CO 2 and CO X.

2. Encountered the following error while trying to extract the ContentVersion object with VersionData using Data Loader (DL), even after reducing the batch size to 1.
    Error Message: 'Java heap error'.

I was able to query the above with WB, but only one record is shown at a time. However, I received the following error while extracting the bulk CSV: 'Batch failed: Feature not enabled: Binary field not supported when exporting related object'.

The only success was extracting ContentVersion with VersionData as XML using WB. This generated four separate XML files of several hundred MB each.

Can this be done differently and/or more efficiently?

3. Unable to extract ContentDocumentLink data using either DL or WB due to the following: 'Implementation Restriction: ContentDocumentLink requires a filter by a single id on ContentDocumentId or LinkedEntityId using the equals operator or multiple Ids included in the IN parameter'.

This is a huge roadblock, considering that we have over 4000 Files in our org and checking the LinkedEntityId for each File is cumbersome, to say the least.

Has anyone experienced a similar situation and found a solution? Please share. Thanks.

Best Answer

As I understand it, you want to transfer the files linked to Custom Object 1 and 2 to Custom Object X. If you are not going to use custom code, on what basis will you transfer the files using Data Loader, Workbench or dataloader.io?

The problem is that after getting the ContentDocumentLink records into a spreadsheet, you need to manually change the LinkedEntityId for each file. For your information, you can get the ContentDocumentLink records through Workbench as follows:

SELECT Id, ContentDocumentId, LinkedEntityId FROM ContentDocumentLink WHERE LinkedEntityId IN (SELECT Id FROM Custom_Object__c)
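
The restriction quoted in the question also allows filtering on explicit Ids with IN, so another option is to feed the ContentDocument Ids you already exported into a series of smaller queries. The following is only a minimal offline sketch in Python, not something Workbench or Data Loader provides: the function name `build_link_queries`, the chunk size of 200 and the extra ShareType column are assumptions made for illustration. It only builds query strings, which you would then paste into or run through the tool of your choice.

```python
import re
from typing import Iterable, List

# Salesforce record Ids are 15 or 18 alphanumeric characters.
ID_PATTERN = re.compile(r"^[A-Za-z0-9]{15}(?:[A-Za-z0-9]{3})?$")


def build_link_queries(document_ids: Iterable[str], chunk_size: int = 200) -> List[str]:
    """Build one ContentDocumentLink query per chunk of ContentDocument Ids."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")

    # Trim, drop blanks and duplicates, and reject anything that is not an Id.
    ids: List[str] = []
    seen = set()
    for raw in document_ids:
        doc_id = raw.strip()
        if not doc_id or doc_id in seen:
            continue
        if not ID_PATTERN.match(doc_id):
            raise ValueError(f"Not a Salesforce Id: {doc_id!r}")
        seen.add(doc_id)
        ids.append(doc_id)

    # Each query filters on ContentDocumentId with IN, as the restriction requires.
    queries: List[str] = []
    for start in range(0, len(ids), chunk_size):
        chunk = ids[start:start + chunk_size]
        id_list = ", ".join(f"'{i}'" for i in chunk)
        queries.append(
            "SELECT Id, ContentDocumentId, LinkedEntityId, ShareType "
            "FROM ContentDocumentLink "
            f"WHERE ContentDocumentId IN ({id_list})"
        )
    return queries
```

With roughly 4100 files and the assumed chunk size, this yields about 21 queries whose results can be concatenated into a single CSV.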

Also, the ParentId field on ContentDocument is not what you seem to be expecting. Here, ParentId is:

ID of the library that owns the document. Created automatically when inserting a ContentVersion via the API for the first time.

I would suggest going with Batch Apex to transfer the files to whichever custom object record you want. You will have more flexibility to control which file gets linked to which record.

  • In the start method of the batch class, query Custom Object X.
  • In the execute method, for each scope, collect the Ids of the related Custom Object 1 and Custom Object 2 records in a set.
  • Then query ContentDocumentLink where LinkedEntityId is in that set of CO 1 and CO 2 Ids for the scope.
  • Then create a new ContentDocumentLink with the LinkedEntityId chosen according to your desired logic, and delete the existing ContentDocumentLink.
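
If you would rather not deploy Apex, the same create-and-delete logic can be worked out offline from exported rows and then loaded with Data Loader. The following is only a sketch of that idea in Python, not the Batch Apex itself. The lookup field names `CO1__c` and `CO2__c`, the function name `plan_relink` and the fallback ShareType of "V" are all hypothetical, so adjust them to your org.

```python
from typing import Dict, Iterable, List, Sequence, Tuple

Row = Dict[str, str]


def plan_relink(
    cox_rows: Iterable[Row],
    links: Sequence[Row],
    lookup_fields: Sequence[str] = ("CO1__c", "CO2__c"),  # hypothetical field names
) -> Tuple[List[Row], List[str]]:
    """Return (rows to insert as ContentDocumentLink, Ids of links to delete)."""
    # Map each CO 1 / CO 2 record Id to the CO X records that look up to it.
    targets: Dict[str, List[str]] = {}
    for row in cox_rows:
        for field in lookup_fields:
            parent_id = row.get(field)
            if parent_id:
                targets.setdefault(parent_id, []).append(row["Id"])

    # Track (document, entity) pairs so the same file is not linked twice.
    existing = {(l["ContentDocumentId"], l["LinkedEntityId"]) for l in links}

    to_insert: List[Row] = []
    to_delete: List[str] = []
    for link in links:
        x_ids = targets.get(link["LinkedEntityId"])
        if not x_ids:
            continue  # not linked to a CO 1 / CO 2 record referenced by CO X
        for x_id in x_ids:
            pair = (link["ContentDocumentId"], x_id)
            if pair not in existing:
                existing.add(pair)
                to_insert.append({
                    "ContentDocumentId": link["ContentDocumentId"],
                    "LinkedEntityId": x_id,
                    # Assumption: reuse the old ShareType, else fall back to "V".
                    "ShareType": link.get("ShareType") or "V",
                })
        to_delete.append(link["Id"])  # the old CO 1 / CO 2 link
    return to_insert, to_delete
```

Insert the first list before deleting the second, so that no file is left without a link if the load fails partway.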