[SalesForce] Query to see all Files related to specific object (Events)

The following post talks about running a query to return Files related to a specific record:

How to query all files attached to the Opportunity

What I'm trying to do is return ALL files related to an object (Events in this case). I tried the following query, but it returned an error that indicates that I need to include all of the record id's that I want it to look through (which could be millions). I just want the query to return the results based on the associated object, without having to put in individual record id's

SELECT ContentDocumentId,Id,LinkedEntityId FROM ContentDocumentLink WHERE LinkedEntityId LIKE '00U%'

Here's the error I received:

MALFORMED_QUERY: Implementation restriction: ContentDocumentLink requires a filter by a single Id on ContentDocumentId or LinkedEntityId using the equals operator or multiple Id's using the IN operator.

Best Answer

ContentDocumentLink, as you can see from that error message, has very strict requirements on how you can query it. You must supply a literal Id value or a set of Id values.

This is a major limitation. The only way I have found around it for very broad-based or global queries like the one you're interested in is a two-stage process off-platform with some Python glue to construct ContentDocumentLink queries. (If there is a better solution, I would love to hear about it!)

You may be able to do something similar in an Apex batch job depending on what you're hoping to do with these files/ContentDocumentLinks.

First, you'd use a report, or Workbench Bulk API job, to pull the full list of Ids of Events in your organization. Then you take that huge list of Ids and paste it into a Python script, as below. (Note that this requires the superb simple_salesforce module, which I use for this kind of glue-API-scripting all the time).

What this script does is iterate over the arbitrarily-large set of Ids you provide it and construct ContentDocumentLink queries by batching those Ids into groups as large as allowed by the platform. It saves the ContentDocumentLink data in CSV format on standard output.

#!/usr/bin/env python

from simple_salesforce import Salesforce 
import csv
import sys

ids = [x for x in """
PASTE_YOUR_ID_LIST_HERE
""".split('\n') if len(x) > 0]

query_start = """
SELECT Id, LinkedEntityId, ContentDocumentId, Visibility, ShareType
FROM ContentDocumentLink
WHERE ContentDocumentId IN (
"""

query_end = """
)
ORDER BY ContentDocumentId
"""

query = ''
sf = Salesforce(username='YOUR_USER_NAME', password='YOUR_PASSWORD', security_token = 'YOUR_SECURITY_TOKEN')
writer = csv.DictWriter(f = sys.stdout, fieldnames=['Id', 'LinkedEntityId', 'ContentDocumentId', 'Visibility', 'ShareType'])
writer.writeheader()

while len(ids) > 0:
    query = '\'' + ids.pop() + '\''

    # The maximum length of the WHERE clause is 4,000 characters
    while len(query) < 4000 and len(ids) > 0:
        query += ', \'' + ids.pop() + '\''

    cdls = sf.query_all(query_start + query + query_end)

    for cdl in cdls.get('records'):
        writer.writerow({key: cdl[key] for key in writer.fieldnames})

Related Solutions

[SalesForce] File SOQL query

There is not Title on ContentDocumentLink, availible fields are documented here

You could instead query the LinkedEntity.Name or the ContentDocument.Title if that hepls:

SELECT ContentDocumentID, LinkedEntity.Name, ContentDocument.Title
FROM   ContentDocumentLink 
WHERE  LinkedEntityId IN
  (SELECT ID FROM Account 
  WHERE   Type = 'Active Partner (signed contract)' 
  OR      Type = '2-Tier Partner')

[SalesForce] Unable to extract bulk ContentVersion and ContentDocumentLink data using Salesforce API

As I could understand, You want to transfer the file linked to Custome object 1 and 2 to Custom Object 3. If you are not going for Custom code, On what basis you will transfer the file using DataLoader, WorkBench or DataLoader.io.

The problem is after getting the ContentDocumentLink in a spreadsheet, You need to manually change the LinkedEntityID for each file. For your Information, You can get the ContentDocumentLink through workbench as follows:-

select id, ContentDocumentId, LinkedentityId from ContentDocumentLink where LinkedentityId in (select id from Custom_Object__c)

Also, ContentDocument having the field parentId is something different you are referring. Here ParentID is:-

ID of the library that owns the document. Created automatically when inserting a ContentVersion via the API for the first time.

As I would suggest you go with a Batch apex to transfer the file to the custom object record whatever you want. You will have more flexibility to control which file to be linked with which record.

In your start method of the batch apex, you can query on Custom Object X.
In your execute method, For each scope, you need to fetch the ids of Custom object 1 and Custom object 2 in a set.
Then query the ContentDocumentLink where LinkedEntityId equals to set of id containing the id of CO 1 and 2 for that scope.
Then you need to Create a ContentDocumentLink with the LinkedEntityID according to your desired logic and delete the ContentDocumentLink for Existing one.

Best Answer

Related Solutions

[SalesForce] File SOQL query

[SalesForce] Unable to extract bulk ContentVersion and ContentDocumentLink data using Salesforce API

Related Topic