[SalesForce] Unable to extract bulk ContentVersion and ContentDocumentLink data using Salesforce API

Use Case:
Transfer existing (out-of-the-box) Files that are linked to a couple of custom objects (CO 1 & 2) to another custom object (CO X). The new custom object (CO X) has lookup fields for CO 1 & CO 2 and is associated with the OOTB Files object as a related item. *This is not a migration of Attachments to Files.

This is an interim solution that we intend to provide and hence, we are trying to avoid custom code.

Steps done so far:
1. Extracted a bulk CSV for the ContentDocument object using Workbench. As expected, ParentId is blank. We have 4100 files in total, which are associated with many COs incl. CO 1, 2 & X.

Encountered the following error while trying to extract the ContentVersion object with VersionData using Data Loader (DL), even after reducing the batch size to 1.
Error Message: 'Java heap error'.

I was able to query the above with Workbench (WB), but only with one record shown at a time. However, I received the following error while extracting a bulk CSV: 'Batch failed: Feature not enabled: Binary field not supported when exporting related object'.

The only success was extracting ContentVersion with VersionData as XML using WB. This generated four separate XML files of several hundred MB each.

Can this be done differently and/or more efficiently?

Unable to extract ContentDocumentLink data using either DL or WB due to the following: 'Implementation Restriction: ContentDocumentLink requires a filter by a single Id on ContentDocumentId or LinkedEntityId using the equals operator or multiple Ids included in the IN parameter'.

This is a huge roadblock considering that we have over 4000 Files in our org, and checking the LinkedEntityId for each File is cumbersome to say the least.
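
One way to work within this restriction without checking each File by hand is to take the ContentDocument Ids already exported in step 1 and split them into batches, each used in an IN filter. The Python sketch below only builds the query strings. The helper name `build_link_queries`, the batch size of 200, and the sample Ids are illustrative assumptions (not Salesforce limits), and the resulting queries would still need to be run through a tool such as Workbench.

```python
from typing import Iterable, List


def build_link_queries(document_ids: Iterable[str], batch_size: int = 200) -> List[str]:
    """Build one ContentDocumentLink query per batch of ContentDocument Ids.

    Each query filters ContentDocumentId with IN, which is the form the
    implementation restriction asks for. batch_size is an assumed value,
    chosen only to keep each query string reasonably short.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    # Drop blanks and duplicates while keeping the original order.
    unique_ids = list(dict.fromkeys(i.strip() for i in document_ids if i and i.strip()))

    queries = []
    for start in range(0, len(unique_ids), batch_size):
        batch = unique_ids[start:start + batch_size]
        id_list = ", ".join(f"'{doc_id}'" for doc_id in batch)
        queries.append(
            "SELECT Id, ContentDocumentId, LinkedEntityId "
            "FROM ContentDocumentLink "
            f"WHERE ContentDocumentId IN ({id_list})"
        )
    return queries


if __name__ == "__main__":
    # Placeholder Ids for illustration only.
    sample_ids = ["069000000000001AAA", "069000000000002AAA", "069000000000003AAA"]
    for query in build_link_queries(sample_ids, batch_size=2):
        print(query)
```

With 4100 files and the assumed batch size, this would produce 21 queries instead of one lookup per File.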

Has anyone experienced a similar situation and gotten a solution to it? Please share. Thanks.

Best Answer

As I understand it, you want to transfer the files linked to custom objects 1 and 2 to the third custom object (CO X). If you are not going to use custom code, on what basis will you transfer the files using Data Loader, Workbench or dataloader.io?

The problem is that after getting the ContentDocumentLink records into a spreadsheet, you need to manually change the LinkedEntityId for each file. For your information, you can get the ContentDocumentLink records through Workbench as follows:

SELECT Id, ContentDocumentId, LinkedEntityId FROM ContentDocumentLink WHERE LinkedEntityId IN (SELECT Id FROM Custom_Object__c)

Also, the ParentId field on ContentDocument is something different from what you are referring to. Here, ParentId is:

ID of the library that owns the document. Created automatically when inserting a ContentVersion via the API for the first time.

I would suggest you go with Batch Apex to transfer the files to whichever custom object records you want. You will have more flexibility to control which file is linked with which record.

In the start method of the Batch Apex class, you can query Custom Object X.
In the execute method, for each scope, collect the Ids of Custom Object 1 and Custom Object 2 in a set.
Then query ContentDocumentLink where LinkedEntityId is in that set of CO 1 and CO 2 Ids for the scope.
Then create a ContentDocumentLink with the LinkedEntityId set according to your desired logic, and delete the existing ContentDocumentLink.
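
The execute-step logic above can be checked against exported data before any Batch Apex is written. The Python below is a plain-data sketch, not Apex: records are modelled as dictionaries, the lookup field names `CO1__c` and `CO2__c` are hypothetical, and it assumes a file linked to a CO 1 or CO 2 record should be re-linked to every CO X record that looks up that parent. It returns the links to create and the Ids of the old links to delete.

```python
from typing import Dict, List, Optional, Set, Tuple


def plan_relink(
    co_x_records: List[Dict[str, Optional[str]]],
    existing_links: List[Dict[str, str]],
) -> Tuple[List[Dict[str, str]], List[str]]:
    """Return (links_to_create, link_ids_to_delete) for one batch scope."""
    # Map each CO 1 / CO 2 Id to the CO X records that look it up.
    targets: Dict[str, List[str]] = {}
    for record in co_x_records:
        for lookup_field in ("CO1__c", "CO2__c"):  # hypothetical lookup fields
            parent_id = record.get(lookup_field)
            if parent_id:
                targets.setdefault(parent_id, []).append(record["Id"])

    # Remember existing (file, record) pairs so no duplicate link is planned.
    already_linked: Set[Tuple[str, str]] = {
        (link["ContentDocumentId"], link["LinkedEntityId"]) for link in existing_links
    }

    to_create: List[Dict[str, str]] = []
    to_delete: List[str] = []
    for link in existing_links:
        co_x_ids = targets.get(link["LinkedEntityId"])
        if not co_x_ids:
            continue  # not linked to a CO 1 / CO 2 record in this scope
        for co_x_id in co_x_ids:
            pair = (link["ContentDocumentId"], co_x_id)
            if pair not in already_linked:
                already_linked.add(pair)
                to_create.append(
                    {"ContentDocumentId": pair[0], "LinkedEntityId": co_x_id}
                )
        to_delete.append(link["Id"])  # old CO 1 / CO 2 link is removed
    return to_create, to_delete


if __name__ == "__main__":
    # Made-up Ids for illustration only.
    scope = [{"Id": "X1", "CO1__c": "A1", "CO2__c": "B1"}]
    links = [{"Id": "L1", "ContentDocumentId": "D1", "LinkedEntityId": "A1"}]
    print(plan_relink(scope, links))
```

Keeping the decision in one function mirrors the flexibility argument above: the rule for which file goes to which record lives in a single place and can be changed without touching the query or DML steps.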

Related Solutions

[SalesForce] Problem trying to upload ContentVersion from base64 image data via REST API

I eventually ruled out using base64, because the API does not seem to support the "Content-Transfer-Encoding: base64" header.

Further to this, when I tried sending raw binary data, I had a problem where RESTed (the native Mac REST client that I was using) was escaping all of the UTF-8 characters before sending the request.

When I used curl with the --data-binary flag the image uploaded and worked with no problems. I am kicking myself now because the examples in the documentation all mentioned using curl!
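
The point that matters here is that the file bytes reach the request body unmodified, which is what `--data-binary` guarantees. As a rough Python sketch of the same idea, the function below assembles a multipart/form-data body by hand so that nothing escapes the binary part. The part names `entity_content` and `VersionData` are assumptions based on the Salesforce REST blob-insert format and should be checked against the current documentation; the function name, boundary string, and file name are purely illustrative, and the sketch does not send any request.

```python
import json


def build_content_version_body(
    file_name: str, file_bytes: bytes, boundary: str = "boundary_string"
) -> bytes:
    """Assemble a multipart body with a JSON part and an untouched binary part."""
    # JSON metadata part; Title and PathOnClient are ContentVersion fields.
    metadata = json.dumps({"Title": file_name, "PathOnClient": file_name})
    head = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="entity_content"\r\n'
        "Content-Type: application/json\r\n\r\n"
        f"{metadata}\r\n"
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="VersionData"; filename="{file_name}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    # The raw bytes are concatenated as-is: no base64, no character escaping.
    return head + file_bytes + tail


if __name__ == "__main__":
    # A few arbitrary bytes standing in for real image data.
    sample = bytes([0x89, 0x50, 0x4E, 0x47, 0x00, 0xFF])
    body = build_content_version_body("logo.png", sample)
    print(len(body), "bytes; binary part intact:", sample in body)
```

Working with `bytes` rather than `str` for the file content is the design choice that avoids the escaping problem described above.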

Thanks to everyone who tried to help!

[SalesForce] Relationship between ContentVersion object and Document object

ContentDocument: Represents a document that has been uploaded to a library in Salesforce CRM Content or Salesforce Files. The maximum number of documents that can be published is 30,000,000. You don't have to create this record yourself; it is created when you create a ContentVersion, which is the child of ContentDocument.

ContentVersion: Represents a specific version of a document in Salesforce CRM Content or Salesforce Files. In other words, this object stores document information, similar to Attachment.

The relationship between these two objects is parent to child: ContentDocument is the parent, and each ContentVersion points to it through its ContentDocumentId field.

ContentDocumentLink: This object shares files with users, records, groups, etc. You can create multiple ContentDocumentLink records to attach the same file to multiple records.

Some useful SOQL queries on ContentVersion and ContentDocument:

1. Get Content Document Id by Content Version Id

SELECT ContentDocumentId FROM ContentVersion WHERE Id = '068......'

2. Get Content Version Id by Content Document Id

SELECT Id FROM ContentVersion WHERE ContentDocumentId = '069......'

3. Get Download Number of Document by Version

SELECT COUNT(Id), ContentVersionId FROM ContentVersionHistory WHERE Field = 'contentVersionDownloaded' AND ContentVersion.ContentDocumentId = '069....' GROUP BY ContentVersionId

4. Get Total Download Number for a Document

SELECT COUNT(Id) FROM ContentVersionHistory WHERE Field = 'contentVersionDownloaded' AND ContentVersion.ContentDocumentId = '069....' GROUP BY ContentVersion.ContentDocumentId

That's all from my end. I got these from different resources. Thanks.
