Assets Registry
Introduction
The Assets Registry is a backend service framework designed to manage the lifecycle of digital assets through a structured upload, registration, and retrieval process. It supports bulk ingestion of asset metadata and associated binary files using ZIP-based transport, persistent storage in object stores (such as Amazon S3 or Ceph), and metadata indexing via a MongoDB-backed registry.
This system is built for environments that require scalable, auditable, and modular asset handling, including AI models, datasets, configurations, and other domain-specific artifacts.
Key functionalities include:
- Streaming ZIP Uploads: Asset bundles containing metadata and files are uploaded as ZIP archives. These archives are parsed and processed in a non-blocking manner.
- Object Storage Integration: Binary files extracted from the ZIP are uploaded to S3-compatible storage, and URLs are embedded into the asset metadata.
- Metadata Registration: Extracted metadata is submitted to the asset registry via structured API endpoints and persisted in MongoDB.
- Dynamic ZIP Downloads: Assets can be retrieved by asset ID and streamed back as ZIP archives constructed on demand from stored file references.
The system is designed with modularity in mind, enabling it to be adapted to various storage backends, validation workflows, and integration protocols. It exposes both REST and GraphQL APIs for asset registration, querying, and lifecycle management.
Architecture
The Assets Registry architecture is designed for modular, scalable ingestion and retrieval of asset metadata and associated files. It separates responsibilities across storage, processing, and indexing layers while maintaining stateless, asynchronous, and streaming-compatible interfaces.
The system is composed of several interconnected modules that coordinate to provide full asset lifecycle support through REST and GraphQL APIs, as well as ZIP-based upload and download pipelines.
System Overview
The architecture consists of the following key layers:
- API Layer
  - Exposes REST and GraphQL endpoints for asset registration, querying, upload, and download.
  - Handles validation, background job management, and request routing.
- Processing Layer
  - Manages ZIP stream parsing, file upload to S3-compatible storage, and metadata extraction.
  - Supports asynchronous background threads for long-running uploads.
- Storage Layer
  - Persists metadata in MongoDB collections.
  - Stores large files in external object storage (Amazon S3 or Ceph).
  - Tracks upload status through Redis.
- Streaming Layer
  - Uses in-memory file-like objects to stream ZIP files in and out.
  - Enables download of reconstructed ZIPs directly from stored metadata and file URLs.
Upload Architecture
Upload Flow (POST /zip/upload):
- Client uploads a ZIP archive containing asset.json and files.
- The API spawns a background thread and immediately returns an upload_id.
- The StreamingZipParser:
  - Extracts and parses asset.json.
  - Streams each file to S3UploaderPlugin, which returns a file URL.
  - Embeds the returned URLs into the metadata.
- WriteAPIClient sends the full metadata to the Assets Create API (POST /assets).
- Upload status is tracked using Redis (UPLOAD_STATUS:<id>).
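The parsing steps above can be sketched in a few lines. This is a minimal illustration, not the registry's implementation: process_asset_zip and fake_upload are hypothetical names, the uploader is a stand-in for S3UploaderPlugin, the shape of each files entry is an assumption, and the whole archive is read from memory rather than streamed.

```python
import io
import json
import zipfile
from typing import Callable, Dict


def process_asset_zip(zip_bytes: bytes, upload_file: Callable[[str, bytes], str]) -> Dict:
    """Parse asset.json, upload every other entry, and embed the returned URLs."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        metadata = json.loads(archive.read("asset.json"))
        files = metadata.setdefault("files", [])
        for info in archive.infolist():
            if info.is_dir() or info.filename == "asset.json":
                continue
            url = upload_file(info.filename, archive.read(info.filename))
            # Assumed entry shape: only the ID and URL are filled in here.
            files.append({"asset_file_id": info.filename, "asset_file_url": url})
    return metadata


def fake_upload(name: str, data: bytes) -> str:
    # Hypothetical stand-in for S3UploaderPlugin: pretends the object was stored.
    return f"s3://example-bucket/{name}"


# Build a small bundle in memory to exercise the function.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as bundle:
    bundle.writestr("asset.json", json.dumps({"asset_id": "asset-001"}))
    bundle.writestr("weights.bin", b"\x00\x01")

result = process_asset_zip(buffer.getvalue(), fake_upload)
```

The resulting dictionary is what a client such as WriteAPIClient would then submit to POST /assets.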
Download Architecture
Download Flow (GET /zip/download/<asset_id>):
- Client requests a ZIP archive for a given asset_id.
- The ReadAPIClient fetches metadata from the registry.
- The S3DownloaderPlugin streams each file from object storage.
- The StreamingZipArchiver dynamically constructs a ZIP stream including:
  - All file entries
  - A synthesized asset.json containing metadata
- The ZIP is streamed directly to the client using Flask’s Response.
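As a rough sketch of the archiving step, the snippet below rebuilds a ZIP from a metadata document and a file-fetching callback. It assumes each files entry carries asset_file_id and asset_file_url, uses asset_file_id as the archive entry name, and substitutes a dictionary for object storage. build_asset_zip is a hypothetical name, and the archive is assembled fully in memory rather than streamed chunk by chunk.

```python
import io
import json
import zipfile
from typing import Callable, Dict


def build_asset_zip(metadata: Dict, fetch_file: Callable[[str], bytes]) -> bytes:
    """Write every referenced file plus a synthesized asset.json into one ZIP."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for entry in metadata.get("files", []):
            # Assumption: the file ID doubles as the entry name inside the archive.
            archive.writestr(entry["asset_file_id"], fetch_file(entry["asset_file_url"]))
        archive.writestr("asset.json", json.dumps(metadata, indent=2))
    return buffer.getvalue()


# A dictionary stands in for the object store in this example.
object_store = {"s3://example-bucket/weights.bin": b"\x00\x01"}
metadata = {
    "asset_id": "asset-001",
    "files": [
        {"asset_file_id": "weights.bin", "asset_file_url": "s3://example-bucket/weights.bin"}
    ],
}
zip_bytes = build_asset_zip(metadata, object_store.__getitem__)
```

In a Flask handler, bytes like these could be returned with Response(zip_bytes, mimetype="application/zip"); a streaming variant would yield chunks instead of building the full archive first.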
Integration Points
Component | Role |
---|---|
Flask | REST + GraphQL web API layer |
MongoDB | Persistent metadata store |
Redis | Upload job tracking (non-blocking uploads) |
Amazon S3 / Ceph | Large binary file storage |
zipfile, io.BytesIO | ZIP stream parsing and generation |
Schema
This section defines the core data model used in the Assets Registry. The registry represents an asset as a structured document that includes metadata, policies, API specifications, files, and indexing instructions. Each entity is modeled using Python @dataclass structures and stored as part of a unified asset document in MongoDB.
Asset
from __future__ import annotations  # lets Asset reference the classes defined below

from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional

@dataclass
class Asset:
    asset_id: str
    asset_uri: str
    asset_version: str
    asset_profile_id: str
    asset_file_ids: List[str]
    asset_container_uri: Optional[str] = None
    asset_policy_ids: List[str] = field(default_factory=list)
    asset_container_registry_creds_config: Optional[Dict[str, Any]] = None
    asset_workflow_id: Optional[str] = None
    asset_api_ids: List[str] = field(default_factory=list)
    asset_brief_description: Optional[str] = None
    profiles: List[AssetProfile] = field(default_factory=list)
    policies: List[AssetPolicy] = field(default_factory=list)
    files: List[AssetFile] = field(default_factory=list)
    apis: List[AssetAPI] = field(default_factory=list)
    index_mappings: List[IndexMapping] = field(default_factory=list)
Field | Type | Description |
---|---|---|
asset_id | str | Unique identifier for the asset |
asset_uri | str | URI pointing to the asset namespace or logical location |
asset_version | str | Version string of the asset |
asset_profile_id | str | Associated profile ID |
asset_file_ids | List[str] | List of file IDs referenced within the asset |
asset_container_uri | Optional[str] | URI for the asset container or Docker image |
asset_policy_ids | List[str] | List of associated policy IDs |
asset_container_registry_creds_config | Optional[Dict] | Credentials config for pulling the container from a private registry |
asset_workflow_id | Optional[str] | Workflow ID linked to this asset |
asset_api_ids | List[str] | List of asset API definition IDs |
asset_brief_description | Optional[str] | Human-readable description of the asset |
profiles | List[AssetProfile] | Embedded profile descriptions |
policies | List[AssetPolicy] | Embedded policy definitions |
files | List[AssetFile] | Files stored in S3 and referenced in metadata |
apis | List[AssetAPI] | API interface specifications for the asset |
index_mappings | List[IndexMapping] | Page-level index mapping for document rendering |
AssetProfile
@dataclass
class AssetProfile:
    asset_profile_id: str
    asset_type: str
    asset_sub_type: str
    asset_id: str
    asset_metadata: Optional[Dict[str, Any]] = None
    asset_creator_info: Optional[Dict[str, Any]] = None
    asset_tags: Optional[List[str]] = field(default_factory=list)
    asset_description: Optional[str] = None
    asset_complete_docs_url: Optional[str] = None
    asset_man_page_url: Optional[str] = None
    asset_sample_input_json: Optional[Dict[str, Any]] = None
    asset_sample_output_json: Optional[Dict[str, Any]] = None
    asset_sample_input_data_url: Optional[str] = None
    asset_sample_output_data_url: Optional[str] = None
    asset_author_metadata: Optional[Dict[str, Any]] = None
Field | Type | Description |
---|---|---|
asset_profile_id | str | Unique ID for the profile |
asset_type | str | Asset’s top-level category (e.g., model, data, tool) |
asset_sub_type | str | Asset’s sub-type (e.g., transformer, image) |
asset_id | str | Back-reference to the parent asset |
asset_metadata | Optional[Dict] | Arbitrary metadata about the asset |
asset_creator_info | Optional[Dict] | Metadata about the asset creator |
asset_tags | Optional[List[str]] | Searchable tag list |
asset_description | Optional[str] | Detailed textual description |
asset_complete_docs_url | Optional[str] | Link to complete documentation |
asset_man_page_url | Optional[str] | Link to man-page or quickstart docs |
asset_sample_input_json | Optional[Dict] | Example input for the asset |
asset_sample_output_json | Optional[Dict] | Example output |
asset_sample_input_data_url | Optional[str] | Link to sample input data file |
asset_sample_output_data_url | Optional[str] | Link to sample output data file |
asset_author_metadata | Optional[Dict] | Information about the author(s) |
AssetPolicy
@dataclass
class AssetPolicy:
    asset_policy_id: str
    asset_id: str
    asset_policy_type: str
    asset_policy_rule_uri: Optional[str] = None
    asset_policy_rule_config: Optional[Dict[str, Any]] = None
    asset_policy_rule_params: Optional[Dict[str, Any]] = None
Field | Type | Description |
---|---|---|
asset_policy_id | str | Unique policy identifier |
asset_id | str | Parent asset ID |
asset_policy_type | str | Type of policy (e.g., access, runtime, audit) |
asset_policy_rule_uri | Optional[str] | External URI for policy code |
asset_policy_rule_config | Optional[Dict] | Policy-specific configuration |
asset_policy_rule_params | Optional[Dict] | Parameters to apply during policy evaluation |
AssetFile
@dataclass
class AssetFile:
    asset_file_id: str
    asset_file_type: str
    asset_file_mime_type: str
    asset_file_url: str
    asset_file_metadata: Optional[Dict[str, Any]] = None
Field | Type | Description |
---|---|---|
asset_file_id | str | Unique identifier for the file |
asset_file_type | str | Logical type (e.g., weights, config) |
asset_file_mime_type | str | File’s MIME type (e.g., application/zip) |
asset_file_url | str | Download URL (usually S3/HTTPS) |
asset_file_metadata | Optional[Dict] | Optional file-specific metadata |
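To show how these dataclasses relate to the JSON documents exchanged with the API, here is a small sketch that builds an AssetFile and serializes it with the standard library's dataclasses.asdict. The field values are made up for illustration, and how the registry itself serializes documents is not specified in this section.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any, Dict, Optional


@dataclass
class AssetFile:
    asset_file_id: str
    asset_file_type: str
    asset_file_mime_type: str
    asset_file_url: str
    asset_file_metadata: Optional[Dict[str, Any]] = None


# Example values only; the IDs and URL are placeholders.
weights = AssetFile(
    asset_file_id="file-001",
    asset_file_type="weights",
    asset_file_mime_type="application/octet-stream",
    asset_file_url="https://example.com/bucket/file-001",
)

# asdict converts the dataclass (recursively) into plain dicts and lists.
payload = json.dumps(asdict(weights))
```

Optional fields that are left unset serialize as null, so a consumer should treat a missing key and a null value the same way.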
AssetAPI
@dataclass
class AssetAPI:
    asset_api_id: str
    asset_id: str
    asset_api_metadata: Optional[Dict[str, Any]] = None
    asset_api_svc: Optional[str] = None
    asset_api_route: Optional[str] = None
    asset_api_protocol: Optional[str] = None
    asset_protocol_specific_config: Optional[Dict[str, Any]] = None
    asset_api_man_page: Optional[str] = None
    asset_api_swagger_doc: Optional[str] = None
    asset_api_usage_samples: Optional[List[Dict[str, Any]]] = field(default_factory=list)
Field | Type | Description |
---|---|---|
asset_api_id | str | Unique API identifier |
asset_id | str | Parent asset reference |
asset_api_metadata | Optional[Dict] | General API metadata |
asset_api_svc | Optional[str] | Backend service name |
asset_api_route | Optional[str] | API route or endpoint path |
asset_api_protocol | Optional[str] | Protocol used (e.g., HTTP, gRPC) |
asset_protocol_specific_config | Optional[Dict] | Configuration for protocol handling |
asset_api_man_page | Optional[str] | Documentation link |
asset_api_swagger_doc | Optional[str] | Swagger/OpenAPI specification link |
asset_api_usage_samples | Optional[List[Dict]] | Example requests/responses |
IndexMapping
@dataclass
class IndexMapping:
    json_doc_id: str
    mapping_field_index: int
    table_name: str
    field_name: str
    render_page_no: int
    render_order_no: int
Field | Type | Description |
---|---|---|
json_doc_id | str | ID of the document containing this field |
mapping_field_index | int | Index of the field in the JSON document |
table_name | str | Table or logical structure name for rendering |
field_name | str | Name of the specific field |
render_page_no | int | Page number to render the field on (in UI/PDF) |
render_order_no | int | Order in which to render the field on the target page |
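One way a renderer might consume these mappings is to sort them by page and then by order within the page. The sketch below assumes exactly that interpretation of render_page_no and render_order_no; layout_by_page is a hypothetical helper, not part of the registry.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class IndexMapping:
    json_doc_id: str
    mapping_field_index: int
    table_name: str
    field_name: str
    render_page_no: int
    render_order_no: int


def layout_by_page(mappings: List[IndexMapping]) -> Dict[int, List[str]]:
    """Group fields by page, ordered by render_order_no within each page."""
    pages: Dict[int, List[str]] = defaultdict(list)
    ordered = sorted(mappings, key=lambda m: (m.render_page_no, m.render_order_no))
    for mapping in ordered:
        pages[mapping.render_page_no].append(f"{mapping.table_name}.{mapping.field_name}")
    return dict(pages)


# Example mappings with made-up document and field names.
sample = [
    IndexMapping("doc-1", 2, "profile", "asset_description", 1, 2),
    IndexMapping("doc-1", 0, "profile", "asset_type", 1, 1),
    IndexMapping("doc-1", 5, "files", "asset_file_url", 2, 1),
]
layout = layout_by_page(sample)
```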
Create, Delete, and Update APIs
The Assets Registry provides RESTful endpoints for managing the lifecycle of assets. This section documents the endpoints used to create, update, and delete assets.
Each endpoint accepts and returns data in JSON format. All requests should use the Content-Type: application/json header unless otherwise specified.
POST /assets
Creates a new asset by submitting a complete asset document. The asset must include the required top-level fields and embedded metadata components (e.g., files, profiles).
Request
- Method: POST
- Path: /assets
- Body: Full Asset JSON object
Response
- 201 Created on success
- 400 Bad Request if validation fails or asset already exists
cURL Example
curl -X POST http://localhost:8080/assets \
  -H "Content-Type: application/json" \
  -d '{
    "asset_id": "asset-001",
    "asset_uri": "models://example.com/asset-001",
    "asset_version": "v1",
    "asset_profile_id": "profile-001",
    "asset_file_ids": [],
    "profiles": [{
      "asset_profile_id": "profile-001",
      "asset_type": "model",
      "asset_sub_type": "llm",
      "asset_id": "asset-001"
    }],
    "policies": [],
    "files": [],
    "apis": [],
    "index_mappings": []
  }'
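The same request can be prepared from Python using only the standard library. This is a minimal sketch: build_create_request is a hypothetical helper, the base URL mirrors the cURL example, and the request is only constructed here, not sent.

```python
import json
import urllib.request


def build_create_request(asset: dict, base_url: str = "http://localhost:8080") -> urllib.request.Request:
    """Prepare (but do not send) a POST /assets request carrying the asset document."""
    return urllib.request.Request(
        url=f"{base_url}/assets",
        data=json.dumps(asset).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_create_request({
    "asset_id": "asset-001",
    "asset_uri": "models://example.com/asset-001",
    "asset_version": "v1",
    "asset_profile_id": "profile-001",
    "asset_file_ids": [],
})

# To send it: urllib.request.urlopen(request). A 201 status indicates success;
# a 400 response surfaces as urllib.error.HTTPError.
```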
PUT /assets/<asset_id>
Updates one or more fields of an existing asset. Only the fields provided in the request body will be updated.
Request
- Method: PUT
- Path: /assets/<asset_id>
- Body: Partial or full Asset fields to update
Response
- 200 OK if update was successful
- 404 Not Found if asset ID does not exist
cURL Example
curl -X PUT http://localhost:8080/assets/asset-001 \
  -H "Content-Type: application/json" \
  -d '{
    "asset_brief_description": "Updated description"
  }'
DELETE /assets/<asset_id>
Deletes the specified asset and all associated embedded metadata from the database.
Request
- Method: DELETE
- Path: /assets/<asset_id>
Response
- 200 OK if deletion was successful
- 404 Not Found if the asset does not exist
cURL Example
curl -X DELETE http://localhost:8080/assets/asset-001
Query and GraphQL APIs
The Assets Registry supports both RESTful and GraphQL-based querying to retrieve and search asset metadata. These interfaces allow for flexible and fine-grained access to asset information by ID, type, tags, API route, or arbitrary filters.
REST Query APIs
GET /query/by-id/<asset_id>
Retrieves a single asset by its unique ID.
cURL Example:
curl http://localhost:8081/query/by-id/asset-001
GET /query/by-type/<asset_type>
Fetches all assets matching a given asset_type.
cURL Example:
curl http://localhost:8081/query/by-type/model
GET /query/by-sub-type/<sub_type>
Fetches assets with a specific asset_sub_type inside any embedded profile.
cURL Example:
curl http://localhost:8081/query/by-sub-type/llm
GET /query/by-tag/<tag>
Finds assets that are tagged with the given value.
cURL Example:
curl http://localhost:8081/query/by-tag/generative
GET /query/by-api-route/<route>
Finds assets that define an API using the given route.
cURL Example:
curl http://localhost:8081/query/by-api-route/api/v1/generate
POST /query/search
Allows custom MongoDB-style filter queries for advanced use cases.
cURL Example:
curl -X POST http://localhost:8081/query/search \
  -H "Content-Type: application/json" \
  -d '{"asset_version": "v1", "profiles.asset_type": "model"}'
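To make the dotted keys in the filter concrete, the sketch below mimics how MongoDB-style equality filters reach into embedded documents and arrays: a key such as profiles.asset_type matches when any embedded profile has that value. It is a pure-Python illustration of the semantics, limited to equality (no operators such as $gt or $in), and is not the service's query code; matches and _values_at are hypothetical names.

```python
from typing import Any, Dict, List


def _values_at(doc: Any, path: List[str]) -> List[Any]:
    """Collect every value reachable along a dotted path, fanning out over lists."""
    if not path:
        # An array field matches on the whole array or on any single element.
        return [doc, *doc] if isinstance(doc, list) else [doc]
    if isinstance(doc, list):
        found: List[Any] = []
        for item in doc:
            found.extend(_values_at(item, path))
        return found
    if isinstance(doc, dict) and path[0] in doc:
        return _values_at(doc[path[0]], path[1:])
    return []


def matches(doc: Dict[str, Any], filters: Dict[str, Any]) -> bool:
    """Return True when every filter key resolves to its expected value."""
    return all(
        expected in _values_at(doc, key.split("."))
        for key, expected in filters.items()
    )


asset = {
    "asset_id": "asset-001",
    "asset_version": "v1",
    "profiles": [{"asset_type": "model", "asset_tags": ["generative"]}],
}
is_match = matches(asset, {"asset_version": "v1", "profiles.asset_type": "model"})
```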
GraphQL Query Endpoint
URL: /graphql
A flexible GraphQL interface for querying assets using defined filters and selectors. Only read operations are supported.
Example: Get asset by ID
query {
  getAssetById(assetId: "asset-001") {
    assetId
    assetUri
    assetVersion
    assetBriefDescription
  }
}
Example: Filter by type
query {
  getAssetsByType(assetType: "model") {
    assetId
    assetUri
    profiles
  }
}
Example: Custom filter
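query ($filters: GenericScalar) {
  searchAssets(filters: $filters) {
    assetId
    assetUri
    assetVersion
  }
}
Variables (GraphQL object literals cannot use quoted or dotted keys, so the filter is passed as a variable):
{ "filters": { "profiles.asset_sub_type": "llm" } }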
GraphQL Features
Query Field | Parameters | Returns |
---|---|---|
getAssetById | assetId (String) | A single asset |
getAssetsByType | assetType (String) | List of assets |
getAssetsByTag | tag (String) | List of assets |
getAssetsBySubType | subType (String) | List of assets |
getAssetsByApiRoute | route (String) | List of assets |
searchAssets | filters (GenericScalar) | Custom filtered list of assets |
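A client would typically send these queries as a JSON body with query and variables keys, which is the common GraphQL-over-HTTP convention. The sketch below only builds such a request. It assumes the /graphql endpoint is served on the same host and port as the REST query APIs (localhost:8081) and accepts POST, neither of which is stated above; build_graphql_request is a hypothetical helper.

```python
import json
import urllib.request
from typing import Any, Dict, Optional


def build_graphql_request(
    query: str,
    variables: Optional[Dict[str, Any]] = None,
    base_url: str = "http://localhost:8081",  # assumed host and port
) -> urllib.request.Request:
    """Prepare (but do not send) a GraphQL POST request."""
    body = {"query": query, "variables": variables or {}}
    return urllib.request.Request(
        url=f"{base_url}/graphql",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


QUERY = """
query ($assetId: String) {
  getAssetById(assetId: $assetId) {
    assetId
    assetUri
  }
}
"""

graphql_request = build_graphql_request(QUERY, {"assetId": "asset-001"})
```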
ZIP Upload and Download APIs
The Assets Registry supports bulk transport of asset metadata and files using ZIP archives. These interfaces enable clients to upload or download a complete asset in a single compressed file, which makes them suitable for versioned model packaging, migration, or offline exchange.
The upload API is asynchronous and provides a polling mechanism to track the status of the operation. The download API reconstructs a ZIP on the fly from stored asset metadata and files in object storage.
POST /zip/upload
Accepts a ZIP file containing:
- An asset.json metadata file
- One or more binary files (e.g., models, configs)
The ZIP is parsed asynchronously. Files are uploaded to object storage, and the metadata is registered via the Assets Create API.
Request
- Method: POST
- Content-Type: multipart/form-data
- Form Field: file (the ZIP archive)
Response
- 202 Accepted with a unique upload_id to track progress
cURL Example
curl -X POST http://localhost:8081/zip/upload \
  -F "file=@asset_bundle.zip"
Response:
{
  "success": true,
  "upload_id": "0a5e3de1-1fcd-4a65-9c83-18c5c0b4d612"
}
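The accept-then-process pattern behind this endpoint can be sketched as follows. A plain dictionary stands in for Redis, the key format follows the UPLOAD_STATUS:<id> convention described earlier, and submit_upload is a hypothetical helper rather than the service's handler.

```python
import threading
import uuid
from typing import Callable, Dict, Tuple

# Stand-in for Redis; a real deployment would use SET/GET on a Redis client.
status_store: Dict[str, str] = {}


def submit_upload(zip_bytes: bytes, process: Callable[[bytes], None]) -> Tuple[str, threading.Thread]:
    """Register a job, start it on a background thread, and return its ID at once."""
    upload_id = str(uuid.uuid4())
    key = f"UPLOAD_STATUS:{upload_id}"
    status_store[key] = "queued"

    def worker() -> None:
        status_store[key] = "processing"
        try:
            process(zip_bytes)
            status_store[key] = "success"
        except Exception as exc:  # record the reason so clients can see it
            status_store[key] = f"failed: {exc}"

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return upload_id, thread


upload_id, worker_thread = submit_upload(b"zip bytes", lambda data: None)
worker_thread.join()  # only for this demo; an API handler would return immediately
final_status = status_store[f"UPLOAD_STATUS:{upload_id}"]
```

Keeping the status in an external store rather than in process memory lets any API worker answer a later status request.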
GET /zip/status/<upload_id>
Returns the current status of a previously submitted ZIP upload job.
Status values
- queued
- processing
- success
- failed: <reason>
cURL Example
curl http://localhost:8081/zip/status/0a5e3de1-1fcd-4a65-9c83-18c5c0b4d612
Response:
{
  "upload_id": "0a5e3de1-1fcd-4a65-9c83-18c5c0b4d612",
  "status": "success"
}
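A client-side polling loop for this endpoint might look like the sketch below. The status fetcher is injected so the loop can be exercised without a running server; wait_for_upload, the interval, and the attempt limit are illustrative choices, not values prescribed by the API.

```python
import time
from typing import Callable


def wait_for_upload(
    upload_id: str,
    fetch_status: Callable[[str], str],
    interval: float = 1.0,
    max_attempts: int = 30,
) -> str:
    """Poll until the job reaches a terminal status (success or failed: <reason>)."""
    for _ in range(max_attempts):
        status = fetch_status(upload_id)
        if status == "success" or status.startswith("failed"):
            return status
        time.sleep(interval)
    raise TimeoutError(f"upload {upload_id} did not finish after {max_attempts} checks")


# Simulated status sequence in place of GET /zip/status/<upload_id>.
_statuses = iter(["queued", "processing", "success"])
outcome = wait_for_upload("demo-id", lambda _id: next(_statuses), interval=0)
```

In real use, fetch_status would issue the GET request and return the status field of the JSON response.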
GET /zip/download/<asset_id>
Dynamically generates and streams a ZIP archive for the specified asset. The archive includes:
- Files listed in the asset metadata (files[])
- An asset.json file synthesized from the stored asset metadata
Response
- Content-Type: application/zip
- Content-Disposition: attachment; filename=<asset_id>.zip
cURL Example
curl -OJ http://localhost:8081/zip/download/asset-001
Result: A file named asset-001.zip is downloaded.
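A Python client that does not rely on curl's -OJ flags needs to read the file name from the Content-Disposition header itself. The sketch below does this with the standard library's email.message.Message parser; filename_from_disposition is a hypothetical helper, and the fallback name is an arbitrary choice.

```python
import os
from email.message import Message


def filename_from_disposition(header_value: str, fallback: str) -> str:
    """Extract the filename parameter from a Content-Disposition header value."""
    message = Message()
    message["Content-Disposition"] = header_value
    name = message.get_filename() or fallback
    # Keep only the final path component so the name cannot point outside the target directory.
    return os.path.basename(name)


saved_as = filename_from_disposition("attachment; filename=asset-001.zip", "download.zip")
```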