@Generated public class ServingEndpointsAPI extends Object
You can use a serving endpoint to serve models from the Databricks Model Registry or from Unity Catalog. Endpoints expose the underlying models as scalable REST API endpoints using serverless compute. This means the endpoints and associated compute resources are fully managed by Databricks and will not appear in your cloud account. A serving endpoint can consist of one or more MLflow models from the Databricks Model Registry, called served entities. A serving endpoint can have at most ten served entities. You can configure traffic settings to define how requests should be routed to your served entities behind an endpoint. Additionally, you can configure the scale of resources that should be applied to each served entity.
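The routing limits described above can be checked on the client before a request is built. The following is a minimal sketch, not SDK code: `Route` and `validate` are hypothetical stand-ins, and the rule that traffic shares are whole percentages summing to 100 is an assumption that this page does not state. Only the cap of ten served entities comes from the description.

```java
import java.util.List;

public class TrafficSplitCheck {
    // Hypothetical stand-in for one route: a served entity name and its share of traffic.
    static final class Route {
        final String servedEntityName;
        final int trafficPercentage;

        Route(String servedEntityName, int trafficPercentage) {
            this.servedEntityName = servedEntityName;
            this.trafficPercentage = trafficPercentage;
        }
    }

    // Limit stated in the class description.
    static final int MAX_SERVED_ENTITIES = 10;

    // Returns null when the split looks valid, otherwise a short reason.
    static String validate(List<Route> routes) {
        if (routes.isEmpty()) {
            return "at least one served entity is required";
        }
        if (routes.size() > MAX_SERVED_ENTITIES) {
            return "more than " + MAX_SERVED_ENTITIES + " served entities";
        }
        int total = 0;
        for (Route r : routes) {
            if (r.trafficPercentage < 0 || r.trafficPercentage > 100) {
                return "percentage out of range for " + r.servedEntityName;
            }
            total += r.trafficPercentage;
        }
        // Assumption: shares are whole percentages that must add up to exactly 100.
        return total == 100 ? null : "percentages sum to " + total + ", expected 100";
    }

    public static void main(String[] args) {
        List<Route> routes = List.of(new Route("champion", 80), new Route("challenger", 20));
        String problem = validate(routes);
        System.out.println(problem == null ? "split is valid" : problem);
    }
}
```

Returning a reason string instead of throwing keeps the check easy to surface in logs or a UI; a caller that prefers exceptions can wrap it.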
| Constructor and Description |
|---|
| `ServingEndpointsAPI(ApiClient apiClient)`<br>Regular-use constructor |
| `ServingEndpointsAPI(ServingEndpointsService mock)`<br>Constructor for mocks |
| Modifier and Type | Method and Description |
|---|---|
| `BuildLogsResponse` | `buildLogs(BuildLogsRequest request)`<br>Retrieves the build logs associated with the provided served model. |
| `BuildLogsResponse` | `buildLogs(String name, String servedModelName)` |
| `Wait<ServingEndpointDetailed,ServingEndpointDetailed>` | `create(CreateServingEndpoint request)`<br>Create a new serving endpoint. |
| `Wait<ServingEndpointDetailed,ServingEndpointDetailed>` | `createProvisionedThroughputEndpoint(CreatePtEndpointRequest request)`<br>Create a new PT serving endpoint. |
| `void` | `delete(DeleteServingEndpointRequest request)`<br>Delete a serving endpoint. |
| `void` | `delete(String name)` |
| `ExportMetricsResponse` | `exportMetrics(ExportMetricsRequest request)`<br>Retrieves the metrics associated with the provided serving endpoint in either Prometheus or OpenMetrics exposition format. |
| `ExportMetricsResponse` | `exportMetrics(String name)` |
| `ServingEndpointDetailed` | `get(GetServingEndpointRequest request)`<br>Retrieves the details for a single serving endpoint. |
| `ServingEndpointDetailed` | `get(String name)` |
| `GetOpenApiResponse` | `getOpenApi(GetOpenApiRequest request)`<br>Get the query schema of the serving endpoint in OpenAPI format. |
| `GetOpenApiResponse` | `getOpenApi(String name)` |
| `GetServingEndpointPermissionLevelsResponse` | `getPermissionLevels(GetServingEndpointPermissionLevelsRequest request)`<br>Gets the permission levels that a user can have on an object. |
| `GetServingEndpointPermissionLevelsResponse` | `getPermissionLevels(String servingEndpointId)` |
| `ServingEndpointPermissions` | `getPermissions(GetServingEndpointPermissionsRequest request)`<br>Gets the permissions of a serving endpoint. |
| `ServingEndpointPermissions` | `getPermissions(String servingEndpointId)` |
| `HttpRequestResponse` | `httpRequest(ExternalFunctionRequest request)`<br>Make external services call using the credentials stored in UC Connection. |
| `ServingEndpointsService` | `impl()` |
| `Iterable<ServingEndpoint>` | `list()`<br>Get all serving endpoints. |
| `ServerLogsResponse` | `logs(LogsRequest request)`<br>Retrieves the service logs associated with the provided served model. |
| `ServerLogsResponse` | `logs(String name, String servedModelName)` |
| `EndpointTags` | `patch(PatchServingEndpointTags request)`<br>Used to batch add and delete tags from a serving endpoint with a single API call. |
| `PutResponse` | `put(PutRequest request)`<br>Deprecated: Please use AI Gateway to manage rate limits instead. |
| `PutAiGatewayResponse` | `putAiGateway(PutAiGatewayRequest request)`<br>Used to update the AI Gateway of a serving endpoint. |
| `QueryEndpointResponse` | `query(QueryEndpointInput request)`<br>Query a serving endpoint |
| `ServingEndpointPermissions` | `setPermissions(ServingEndpointPermissionsRequest request)`<br>Sets permissions on an object, replacing existing permissions if they exist. |
| `Wait<ServingEndpointDetailed,ServingEndpointDetailed>` | `updateConfig(EndpointCoreConfigInput request)`<br>Updates any combination of the serving endpoint's served entities, the compute configuration of those served entities, and the endpoint's traffic config. |
| `UpdateInferenceEndpointNotificationsResponse` | `updateNotifications(UpdateInferenceEndpointNotifications request)`<br>Updates the email and webhook notification settings for an endpoint. |
| `ServingEndpointPermissions` | `updatePermissions(ServingEndpointPermissionsRequest request)`<br>Updates the permissions on a serving endpoint. |
| `Wait<ServingEndpointDetailed,ServingEndpointDetailed>` | `updateProvisionedThroughputEndpointConfig(UpdateProvisionedThroughputEndpointConfigRequest request)`<br>Updates any combination of the pt endpoint's served entities, the compute configuration of those served entities, and the endpoint's traffic config. |
| `ServingEndpointDetailed` | `waitGetServingEndpointNotUpdating(String name)` |
| `ServingEndpointDetailed` | `waitGetServingEndpointNotUpdating(String name, Duration timeout, Consumer<ServingEndpointDetailed> callback)` |
public ServingEndpointsAPI(ApiClient apiClient)
public ServingEndpointsAPI(ServingEndpointsService mock)
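The two constructors suggest a delegation pattern: the API class forwards calls to a service object, and tests can supply their own. The sketch below illustrates that pattern only. `EndpointService` and `EndpointsApi` are hypothetical stand-ins, not the SDK's `ServingEndpointsService` or `ServingEndpointsAPI`, and the real class may do more than forward.

```java
public class MockConstructorSketch {
    // Hypothetical stand-in for the service interface that the API class delegates to.
    interface EndpointService {
        String get(String name);
    }

    // Hypothetical stand-in for the API class: every call is forwarded to the service.
    static final class EndpointsApi {
        private final EndpointService impl;

        // Mirrors the "constructor for mocks": the caller injects the service directly.
        EndpointsApi(EndpointService mock) {
            this.impl = mock;
        }

        String get(String name) {
            return impl.get(name);
        }

        // Exposes the underlying service, as impl() does on this page.
        EndpointService impl() {
            return impl;
        }
    }

    public static void main(String[] args) {
        // A lambda is enough to stub a single-method service in a unit test.
        EndpointsApi api = new EndpointsApi(name -> "stub:" + name);
        System.out.println(api.get("demo"));
    }
}
```

With the real SDK, a mocking library would typically supply the `ServingEndpointsService` instance, since that interface has many methods.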
public ServingEndpointDetailed waitGetServingEndpointNotUpdating(String name) throws TimeoutException
public ServingEndpointDetailed waitGetServingEndpointNotUpdating(String name, Duration timeout, Consumer<ServingEndpointDetailed> callback) throws TimeoutException
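A wait method with a timeout and a callback is usually a polling loop. The sketch below shows one plausible shape for such a loop under stated assumptions; it is not the SDK's implementation. `waitUntil`, the fixed polling interval, and the string states `IN_PROGRESS` and `NOT_UPDATING` are illustrative choices made for this example.

```java
import java.time.Duration;
import java.util.concurrent.TimeoutException;
import java.util.function.Consumer;
import java.util.function.Predicate;
import java.util.function.Supplier;

public class WaitSketch {
    // Polls fetch until done accepts the result, or throws once the timeout has elapsed.
    // The optional callback sees every intermediate result, e.g. for progress logging.
    static <T> T waitUntil(Supplier<T> fetch, Predicate<T> done, Duration timeout,
                           Duration interval, Consumer<T> callback)
            throws TimeoutException, InterruptedException {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            T current = fetch.get();
            if (callback != null) {
                callback.accept(current);
            }
            if (done.test(current)) {
                return current;
            }
            // Subtraction keeps the comparison correct even if nanoTime wraps around.
            if (System.nanoTime() - deadline >= 0) {
                throw new TimeoutException("timed out after " + timeout);
            }
            Thread.sleep(interval.toMillis());
        }
    }

    public static void main(String[] args) throws Exception {
        int[] polls = {0};
        // Simulated endpoint state: reports IN_PROGRESS twice, then NOT_UPDATING.
        String state = waitUntil(
                () -> ++polls[0] < 3 ? "IN_PROGRESS" : "NOT_UPDATING",
                s -> s.equals("NOT_UPDATING"),
                Duration.ofSeconds(5),
                Duration.ofMillis(10),
                s -> System.out.println("state: " + s));
        System.out.println("finished in state " + state);
    }
}
```

A production loop would likely add backoff and treat failure states as terminal instead of waiting for the timeout.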
public BuildLogsResponse buildLogs(String name, String servedModelName)
public BuildLogsResponse buildLogs(BuildLogsRequest request)
public Wait<ServingEndpointDetailed,ServingEndpointDetailed> create(CreateServingEndpoint request)
public Wait<ServingEndpointDetailed,ServingEndpointDetailed> createProvisionedThroughputEndpoint(CreatePtEndpointRequest request)
public void delete(String name)
public void delete(DeleteServingEndpointRequest request)
public ExportMetricsResponse exportMetrics(String name)
public ExportMetricsResponse exportMetrics(ExportMetricsRequest request)
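Because the metrics are returned in a text exposition format, a caller often needs to turn the body into name/value pairs. The following is a minimal sketch that assumes the response body has already been read into a `String`; how `ExportMetricsResponse` exposes that body is not shown on this page. `parseSamples` is a hypothetical helper, the metric names in `main` are invented, and the parser ignores exemplars and other OpenMetrics extensions.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class MetricsTextSketch {
    // Maps each sample (metric name plus any label set, kept verbatim) to its value.
    static Map<String, Double> parseSamples(String body) {
        Map<String, Double> samples = new LinkedHashMap<>();
        for (String raw : body.split("\n")) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // skip blank lines and HELP/TYPE/EOF comment lines
            }
            String key;
            String rest;
            int close = line.lastIndexOf('}');
            if (close >= 0) {
                // Labeled sample: the key runs through the closing brace.
                key = line.substring(0, close + 1);
                rest = line.substring(close + 1).trim();
            } else {
                // Unlabeled sample: the key is the first whitespace-delimited token.
                String[] parts = line.split("\\s+", 2);
                if (parts.length < 2) {
                    continue;
                }
                key = parts[0];
                rest = parts[1].trim();
            }
            if (rest.isEmpty()) {
                continue; // no value on this line
            }
            // The first token is the value; an optional trailing timestamp is ignored.
            samples.put(key, parseValue(rest.split("\\s+")[0]));
        }
        return samples;
    }

    // The exposition format writes infinities as +Inf and -Inf, which parseDouble rejects.
    static double parseValue(String token) {
        switch (token) {
            case "+Inf":
            case "Inf":
                return Double.POSITIVE_INFINITY;
            case "-Inf":
                return Double.NEGATIVE_INFINITY;
            default:
                return Double.parseDouble(token); // also accepts "NaN"
        }
    }

    public static void main(String[] args) {
        String body = "# TYPE request_count_total counter\n"
                + "request_count_total{endpoint=\"demo\"} 42 1700000000000\n"
                + "cpu_usage_percentage 12.5\n";
        parseSamples(body).forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

For anything beyond quick inspection, a dedicated Prometheus or OpenMetrics parser is the safer choice, since label values can contain escaped characters that this sketch does not interpret.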
public ServingEndpointDetailed get(String name)
public ServingEndpointDetailed get(GetServingEndpointRequest request)
public GetOpenApiResponse getOpenApi(String name)
public GetOpenApiResponse getOpenApi(GetOpenApiRequest request)
public GetServingEndpointPermissionLevelsResponse getPermissionLevels(String servingEndpointId)
public GetServingEndpointPermissionLevelsResponse getPermissionLevels(GetServingEndpointPermissionLevelsRequest request)
public ServingEndpointPermissions getPermissions(String servingEndpointId)
public ServingEndpointPermissions getPermissions(GetServingEndpointPermissionsRequest request)
public HttpRequestResponse httpRequest(ExternalFunctionRequest request)
public Iterable<ServingEndpoint> list()
public ServerLogsResponse logs(String name, String servedModelName)
public ServerLogsResponse logs(LogsRequest request)
public EndpointTags patch(PatchServingEndpointTags request)
public PutResponse put(PutRequest request)
public PutAiGatewayResponse putAiGateway(PutAiGatewayRequest request)
public QueryEndpointResponse query(QueryEndpointInput request)
public ServingEndpointPermissions setPermissions(ServingEndpointPermissionsRequest request)
public Wait<ServingEndpointDetailed,ServingEndpointDetailed> updateConfig(EndpointCoreConfigInput request)
public UpdateInferenceEndpointNotificationsResponse updateNotifications(UpdateInferenceEndpointNotifications request)
public ServingEndpointPermissions updatePermissions(ServingEndpointPermissionsRequest request)
public Wait<ServingEndpointDetailed,ServingEndpointDetailed> updateProvisionedThroughputEndpointConfig(UpdateProvisionedThroughputEndpointConfigRequest request)
public ServingEndpointsService impl()
Copyright © 2026. All rights reserved.