Hello Sean Leino & Mohammad ASIF,
Thank you both for your patience, feedback, and clarifications.
I understand that it's been a serious challenge retrieving training source metadata when using the DocumentIntelligenceAdministrationClient.GetModelAsync method in the Azure.AI.DocumentIntelligence v1.0.0 SDK.
The key clarification is that the GetModelOptions class does not exist in the GA SDK release. References to it come from early preview packages or outdated documentation, and I apologize for the earlier confusion. In the current SDK, GetModelAsync accepts only the modelId plus an optional RequestContext, as documented here: GetModelAsync API Docs.
It is also important to distinguish between migrated and native V4 models. For migrated V3 → V4 models, training source metadata such as BlobSource and BlobFileListSource does not carry over. This is by design and results in null values even if you use include=source. The only way to recover this information is to pull it from your original training scripts, infrastructure-as-code definitions (ARM, Bicep, Terraform), or the Azure portal. If you require full metadata support, the best practice is to retrain a native V4 model using your original container URL and prefix. See the official Migration Guide.
For native V4 models, the REST API does support retrieving training source details using the include=source query parameter:
GET https://{endpoint}/documentintelligence/documentModels/{modelId}?include=source&api-version=2024-11-30
However, the .NET SDK currently does not surface this feature. As a workaround, you can call the REST API directly using HttpClient in C#:
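// Replace the placeholders with your resource endpoint, model ID, and key.
var endpoint = "<your-endpoint>";
var modelId = "<your-model-id>";
using var client = new HttpClient();
client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", "<your-key>");
var response = await client.GetAsync($"{endpoint}/documentintelligence/documentModels/{modelId}?include=source&api-version=2024-11-30");
response.EnsureSuccessStatusCode(); // Throw on a 4xx/5xx status instead of printing an error body.
var content = await response.Content.ReadAsStringAsync();
Console.WriteLine(content);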
This will return training source information if the service makes it available. If REST also returns null for a natively trained V4 model, that indicates a service-side limitation rather than an SDK gap.
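To tell quickly whether the REST response actually carries training source details, you can inspect the returned JSON rather than reading it by eye. The following is a minimal sketch, not an official pattern. It assumes the REST payload exposes the source under top-level properties named azureBlobSource or azureBlobFileListSource, each with a containerUrl field. Please verify those names against the JSON your resource returns. The TrainingSourceReader class and its Describe method are hypothetical helpers introduced only for illustration.

```csharp
using System;
using System.Text.Json;

public static class TrainingSourceReader
{
    // ASSUMPTION: these property names mirror the REST payload; confirm them
    // against the JSON your resource actually returns.
    private static readonly string[] SourceProperties =
        { "azureBlobSource", "azureBlobFileListSource" };

    // Returns "<propertyName>: <containerUrl>" for the first source object found,
    // or null when no source property is present or its value is JSON null.
    public static string? Describe(string modelJson)
    {
        using JsonDocument doc = JsonDocument.Parse(modelJson);
        JsonElement root = doc.RootElement;

        foreach (string name in SourceProperties)
        {
            if (root.TryGetProperty(name, out JsonElement source)
                && source.ValueKind == JsonValueKind.Object)
            {
                string? containerUrl =
                    source.TryGetProperty("containerUrl", out JsonElement url)
                    && url.ValueKind == JsonValueKind.String
                        ? url.GetString()
                        : null;
                return $"{name}: {containerUrl ?? "(no containerUrl)"}";
            }
        }

        return null; // Migrated models are expected to land here.
    }
}
```

With the content string from the HttpClient call above, TrainingSourceReader.Describe(content) returns a one-line description when a source object is present, and null when the service returned none.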
In summary, migrated models will never return training sources, native V4 models require REST API calls for now, and SDK support for include=source is not yet available. The official API reference for DocumentIntelligenceAdministrationClient.GetModelAsync shows no overload that accepts an include=source parameter or a GetModelOptions object. The most reliable path forward is to retrain in V4 for future-proofing and, in the short term, use REST API workarounds where SDK support falls short.
I hope this is helpful! Do not hesitate to let me know if you have any other questions or need further clarification.
If this resolves your question, please upvote it and accept it as the answer so the thread can be closed.