Version: 10.9.x

Files Service

This microservice allows you to upload files to, and download files from, a third-party storage service. Google Cloud Storage, MongoDB GridFS and Amazon S3 are currently supported.

Consequently, it needs at least one of the following: MongoDB GridFS bucket configurations, Amazon S3 bucket configurations, or Google Cloud Storage credentials.

In addition, after each upload it saves the file's information through the CRUD Service in a configurable MongoDB collection (usually files).

CRUD collection

The CRUD collection can be named as you want, but must contain the following fields:

  • name (type: String): original file name.
  • file (type: String): unique name of the file that should be used to retrieve it using this service.
  • size (type: Number): size in bytes of the uploaded file.
  • location (type: String): the URL that can be used to download the file using the same service that performed the upload.

These fields will be automatically filled during the upload of files.
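As an illustration, a record stored in the CRUD collection after an upload could look like the object below. The helper `isFileRecord` and all sample values (including the example.org URL) are hypothetical; only the four field names and their types come from the list above.

```javascript
'use strict'

// Hypothetical helper: checks that a CRUD document carries the four
// fields listed above with the documented types.
function isFileRecord(doc) {
  return (
    doc !== null &&
    typeof doc === 'object' &&
    typeof doc.name === 'string' &&
    typeof doc.file === 'string' &&
    typeof doc.size === 'number' &&
    typeof doc.location === 'string'
  )
}

// Illustrative values only: the real ones are filled in by the service on upload.
const exampleRecord = {
  name: 'report.pdf',
  file: '5f2a9c3e-report.pdf',
  size: 48213,
  location: 'https://example.org/files/5f2a9c3e-report.pdf',
}
```

A check like this can be handy in integration tests to confirm that the collection you created exposes the expected fields.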

Environment variables

  • CONFIG_FILE_PATH (required): the path of the configuration file that describes the connection to the storage bucket of the chosen service.
  • CRUD_URL (required): the CRUD URL, including the name of the files collection chosen when the CRUD collection was created (e.g. http://crud-service/files/ where files is the CRUD collection name).
  • PROJECT_HOSTNAME: the hostname that will be saved in the database as the root of the file location. Incompatible with PATH_PREFIX.
  • PATH_PREFIX: Use a relative path as file location prefix. Incompatible with PROJECT_HOSTNAME.
  • SERVICE_PREFIX: the prefix used for the path of the service endpoints.
  • HEADERS_TO_PROXY: comma-separated list of the headers to proxy (the Mia-Platform headers).
  • FILE_TYPE_INCLUDE_LIST (from v2.3.0): comma-separated list of file extensions (without the dot) accepted for upload. If the variable is not set, the service accepts all file types.
  • TRUSTED_PROXIES: the string containing the trusted proxies values.
  • ADDITIONAL_FUNCTION_CASTER_FILE_PATH: the path of the file that exports the caster function.
  • GOOGLE_APPLICATION_CREDENTIALS: the path to the Google Storage credentials file. Required for the GoogleStorage type.

Exactly one of PATH_PREFIX and PROJECT_HOSTNAME must be set: one of them is required, and they cannot be used together.
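The rules above can be sketched as a small pre-flight check. Both `checkEnv` and `parseIncludeList` are hypothetical helpers, not part of the service, and trimming whitespace around extensions is an assumption of this sketch.

```javascript
'use strict'

// Hypothetical pre-flight check mirroring the rules listed above.
// Returns a list of problems (empty when the environment looks valid).
function checkEnv(env) {
  const problems = []
  for (const name of ['CONFIG_FILE_PATH', 'CRUD_URL']) {
    if (!env[name]) problems.push(`${name} is required`)
  }
  const hasHostname = Boolean(env.PROJECT_HOSTNAME)
  const hasPrefix = Boolean(env.PATH_PREFIX)
  // Both set or both missing is invalid: exactly one must be present.
  if (hasHostname === hasPrefix) {
    problems.push('set exactly one of PROJECT_HOSTNAME and PATH_PREFIX')
  }
  return problems
}

// Parses FILE_TYPE_INCLUDE_LIST into an array of extensions.
// An unset variable yields null, meaning "accept every file type".
// Assumption: surrounding whitespace is ignored.
function parseIncludeList(value) {
  if (!value) return null
  return value.split(',').map((ext) => ext.trim()).filter(Boolean)
}
```

For example, `checkEnv(process.env)` can be run at startup of a local test harness to catch a missing or conflicting variable before the service is deployed.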

Configuration file

The Files Service can be used in multi-bucket mode (from version v2.7.0), meaning that you can configure it to manage different buckets. All buckets must belong to the same storage provider (e.g., only MongoDB).

Every multi-bucket configuration requires a scope attribute for each bucket. The scope identifies the bucket instance and separates the data domains that the service manages, so its value must be unique across the whole configuration. This lets a single Files Service instance manage different types of data: each bucket uses a separate CRUD collection named after its scope value, and the same value becomes the root of the API path for that bucket.
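The two constraints described here (one provider type, unique scopes) can be checked with a short validator. This is a minimal sketch: `validateMultiBucketConfig` is a hypothetical name, and it assumes that scope sits under options, as in the examples on this page.

```javascript
'use strict'

// Hypothetical validator for a multi-bucket configuration array.
// Assumes `scope` lives under `options`.
function validateMultiBucketConfig(buckets) {
  const errors = []

  // All buckets must use the same provider type.
  const types = new Set(buckets.map((bucket) => bucket.type))
  if (types.size > 1) {
    errors.push(`all buckets must share one type, found: ${[...types].join(', ')}`)
  }

  // Every bucket needs a scope, and scopes must not repeat.
  const seen = new Set()
  for (const bucket of buckets) {
    const scope = bucket.options && bucket.options.scope
    if (!scope) {
      errors.push('every bucket needs a scope')
    } else if (seen.has(scope)) {
      errors.push(`duplicate scope: ${scope}`)
    } else {
      seen.add(scope)
    }
  }
  return errors
}
```

An empty array means the configuration satisfies both constraints. Any other result lists what to fix.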

Below you can find examples related to each different bucket service supported.

MongoDB GridFS configuration file (single-bucket option)

You need to specify the database URL and the name of the GridFS bucket where the files will be stored. If the bucket doesn't exist, the Files Service will create it as soon as it is needed.

{
  "type": "mongodb",
  "options": {
    "url": "url-to-mongo",
    "bucketName": "my-bucket"
  }
}

Cache configuration

If the bucket in use does not provide any caching mechanism, the Files Service can provide one. To enable it, add the cache property to the configuration file. If you set the cacheControlMaxAge property, a cache-control header is added to the response of each file, with the max-age directive set to the provided number of seconds.
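The mapping from configuration to header can be pictured as follows. `cacheHeaders` is a hypothetical helper written for this example, and the exact header value emitted by the service may include other directives. The sketch only shows how cacheControlMaxAge relates to max-age.

```javascript
'use strict'

// Hypothetical helper: derives the response header from the `cache` property.
// Returns an empty object when no max age is configured.
function cacheHeaders(config) {
  const maxAge = config.cache && config.cache.cacheControlMaxAge
  if (typeof maxAge !== 'number') return {}
  return { 'cache-control': `max-age=${maxAge}` }
}

// A single-bucket configuration with caching enabled for 3000 seconds.
const config = {
  type: 'mongodb',
  options: { url: 'url-to-mongo', bucketName: 'my-bucket' },
  cache: { cacheControlMaxAge: 3000 },
}

const headers = cacheHeaders(config)
```

With this configuration, `headers` holds a single cache-control entry whose value is `max-age=3000`.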

Caster file

The following is an example of a custom caster file. It adds the tags, authorId and ownerId parameters to the CRUD collection record when they are present in the POST parameters.

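'use strict'

// Maps the additional POST parameters to the fields saved on the CRUD record.
module.exports = function caster(doc) {
  return {
    // tags arrives as a comma-separated string and is stored as an array
    tags: (doc.tags || '').split(','),
    authorId: doc.authorId || undefined,
    ownerId: doc.ownerId || undefined,
  }
}

// Types of the additional parameters handled by the caster.
module.exports.additionalPropertiesValidator = {
  tags: { type: 'string' },
  authorId: { type: 'string' },
  ownerId: { type: 'string' },
}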
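To see what the caster above returns, it can be called directly with sample POST parameters. The snippet repeats the function so that it runs on its own, and the input values are made up for illustration.

```javascript
'use strict'

// Same caster logic as above, repeated so the snippet is self-contained.
function caster(doc) {
  return {
    tags: (doc.tags || '').split(','),
    authorId: doc.authorId || undefined,
    ownerId: doc.ownerId || undefined,
  }
}

// tags is split on commas; ownerId was not sent, so it stays undefined.
const withTags = caster({ tags: 'invoice,2023', authorId: 'user-1' })

// With no tags parameter, ''.split(',') yields [''] rather than an empty array.
const withoutTags = caster({})
```

If an empty array is preferable when no tags are sent, the caster needs an explicit check before splitting.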

Passing from a single-bucket configuration to a multi-bucket

A Files Service that already runs in a single-bucket configuration can be switched to multi-bucket mode. Because multi-bucket mode is backward compatible, there is no need to set up a new Files Service instance.
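One way to picture this compatibility: a loader can accept both shapes of the configuration file by wrapping a single object into a one-element array. The name `normalizeBucketConfig` is hypothetical, and this is not necessarily how the service handles the file internally.

```javascript
'use strict'

// Hypothetical sketch: accepts either a single bucket object (single-bucket
// mode) or an array of bucket objects (multi-bucket mode) and always
// returns an array.
function normalizeBucketConfig(config) {
  return Array.isArray(config) ? config : [config]
}

const single = { type: 'mongodb', options: { url: 'url-to-mongo', bucketName: 'my-bucket' } }
const normalized = normalizeBucketConfig(single)
```

Here `normalized` is an array containing the original single-bucket object, so later code can iterate over buckets without caring which mode was configured.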

Single-bucket configuration

Let's say we have a single-bucket Files Service already configured. Its link to the CRUD Service is specified in the CRUD_URL environment variable:

CRUD_URL=http://localhost:3001/animals/

As we can see, the CRUD Service is managing the animals collection, where the metadata of the files saved in the bucket is stored.

Since this is a single-bucket configuration, only one bucket is configured (of type mongodb in this case):

{
  "type": "mongodb",
  "options": {
    "url": "mongodb://localhost:27017/animals",
    "bucketName": "filesbucket"
  },
  "cache": {
    "cacheControlMaxAge": 3000
  }
}

To manage another domain of files, say files related to fruits, we have to configure another Files Service instance, with CRUD_URL pointing to:

CRUD_URL=http://localhost:3001/fruits/

and the config for another bucket:

{
  "type": "mongodb",
  "options": {
    "url": "mongodb://localhost:27017/fruits",
    "bucketName": "filesbucket"
  },
  "cache": {
    "cacheControlMaxAge": 3000
  }
}

Multi-bucket configuration

We now want to adopt a multi-bucket configuration. In this case a single Files Service instance is enough: we only need to change the URL of the CRUD Service and update the bucket configuration file.

The CRUD_URL environment variable becomes:

CRUD_URL=http://localhost:3001/

and the configuration for the bucket will now be an array of configurations:

[
  {
    "type": "mongodb",
    "options": {
      "url": "mongodb://localhost:27017/animals",
      "bucketName": "filesbucket",
      "scope": "animals"
    },
    "cache": {
      "cacheControlMaxAge": 3000
    }
  },
  {
    "type": "mongodb",
    "options": {
      "url": "mongodb://localhost:27017/fruits",
      "bucketName": "filesbucket",
      "scope": "fruits"
    },
    "cache": {
      "cacheControlMaxAge": 3000
    }
  }
]

As we can see, we simply removed the name of the specific collection from the CRUD_URL environment variable. The Files Service itself builds the proper URL to the CRUD Service using the scope parameter.
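As a rough sketch of that URL construction, the base CRUD_URL and the scope can be joined as shown below. `crudUrlForScope` is a hypothetical function, and the real service may compose the URL differently. The example only shows how the two multi-bucket scopes lead back to the collection URLs used in the single-bucket setup.

```javascript
'use strict'

// Hypothetical sketch: joins the base CRUD_URL with a bucket scope.
// The real service may build the URL differently.
function crudUrlForScope(crudUrl, scope) {
  // Make sure the base ends with a slash so the scope is appended, not substituted.
  const base = crudUrl.endsWith('/') ? crudUrl : `${crudUrl}/`
  return new URL(`${encodeURIComponent(scope)}/`, base).toString()
}

const animalsUrl = crudUrlForScope('http://localhost:3001/', 'animals')
const fruitsUrl = crudUrlForScope('http://localhost:3001/', 'fruits')
```

With the configuration above, `animalsUrl` is `http://localhost:3001/animals/` and `fruitsUrl` is `http://localhost:3001/fruits/`, matching the two CRUD_URL values of the single-bucket instances.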