Mithril Security - BlindAI
Website | LinkedIn | Blog | Twitter | Documentation | Discord
Fast, accessible, and privacy-friendly AI deployment
BlindAI is a fast, easy-to-use, and confidential inference server that lets you deploy your model on sensitive data. Thanks to its end-to-end protection guarantees, data owners can send private data to be analyzed by AI models without fear of exposing it to anyone else.
We reconcile AI and privacy by leveraging Confidential Computing for secure inference. You can learn more about this technology here.
We currently support only Intel SGX, but we plan to cover AMD SEV and Nitro Enclaves in the future. More information about our roadmap will be provided soon.
Our solution comes in two parts:
- A secure inference solution to serve AI models with privacy guarantees.
- A client SDK to securely consume the remote AI models.
Getting started
To deploy a model on sensitive data, with end-to-end protection, we provide a Docker image to serve models with confidentiality, and a client SDK to consume this service securely.
Note
Because the server requires specific hardware (currently Intel SGX), we also provide a simulation mode. In simulation mode, any computer can serve models with our solution. However, the two key properties of secure enclaves, confidentiality of data in use and code attestation, are not available. Simulation mode is therefore only meant for testing on your local machine; it provides no real guarantees for production.
Our first article Deploy Transformers with confidentiality covers the deployment of both simulation and hardware mode.
A - Deploying the server
Deploy the inference server, for instance using one of our Docker images. To get started quickly, you can use the image with simulation, which does not require any specific hardware.
docker run -p 50051:50051 -p 50052:50052 mithrilsecuritysas/blindai-server-sim
B - Sending data from the client to the server
Our client SDK is rather simple, but a lot happens behind the scenes. If we are talking to a real enclave (simulation=False), the client verifies that we are indeed talking to an enclave with the right security properties, such as the code loaded inside the enclave and the security patches applied. Once those checks pass, data or models can be uploaded safely, with end-to-end protection through a TLS tunnel that terminates inside the enclave. Thanks to the enclave's data-in-use protection and the verification of its code, nothing sent remotely is exposed to any third party.
You can learn more about the attestation mechanism for code integrity here.
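Conceptually, code attestation boils down to comparing a measurement (a cryptographic hash) of the code loaded inside the enclave, reported in a hardware-signed quote, against an expected value the client trusts. Here is a minimal conceptual sketch of that comparison; this is not the BlindAI API, and the values are hypothetical:

```python
import hashlib

def verify_measurement(reported: bytes, expected_hex: str) -> bool:
    # The client trusts the enclave only if the measurement reported in the
    # hardware-signed quote matches the value it expects.
    return reported.hex() == expected_hex

# Hypothetical enclave binary; in practice the hardware itself measures the
# code loaded into the enclave at launch.
enclave_binary = b"enclave code"
expected = hashlib.sha256(enclave_binary).hexdigest()  # published by the author
reported = hashlib.sha256(enclave_binary).digest()     # extracted from the quote

assert verify_measurement(reported, expected)
```

In the real protocol the quote is signed by the CPU and checked against the vendor's attestation service, so a client can trust the measurement without trusting the machine's operator.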
i - Upload the model
Next, we need to load a model inside the secure inference server. First we export our model from PyTorch to ONNX; then we can upload it securely to the inference server. Uploading the model through our API keeps the model confidential, for instance when deploying it on foreign infrastructure, such as the Cloud or a client's on-premise servers.
from transformers import DistilBertTokenizer, DistilBertForSequenceClassification
import torch
from blindai.client import BlindAiClient, ModelDatumType
# Get pretrained model
model = DistilBertForSequenceClassification.from_pretrained("distilbert-base-uncased")
# Create dummy input for export
tokenizer = DistilBertTokenizer.from_pretrained("distilbert-base-uncased")
sentence = "I love AI and privacy!"
inputs = tokenizer(sentence, padding="max_length", max_length=8, return_tensors="pt")["input_ids"]
# Export the model
torch.onnx.export(
    model, inputs, "./distilbert-base-uncased.onnx",
    export_params=True, opset_version=11,
    input_names=['input'], output_names=['output'],
    dynamic_axes={'input': {0: 'batch_size'},
                  'output': {0: 'batch_size'}})
# Launch client
client = BlindAiClient()
client.connect_server(addr="localhost", simulation=True)
client.upload_model(model="./distilbert-base-uncased.onnx", shape=inputs.shape, dtype=ModelDatumType.I64)
ii - Send data and run model
Upload the data securely to the inference server.
from transformers import DistilBertTokenizer
from blindai.client import BlindAiClient
tokenizer = DistilBertTokenizer.from_pretrained("distilbert-base-uncased")
sentence = "I love AI and privacy!"
inputs = tokenizer(sentence, padding="max_length", max_length=8)["input_ids"]
# Load the client
client = BlindAiClient()
client.connect_server("localhost", simulation=True)
# Get prediction
response = client.run_model(inputs)
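run_model returns the raw model output; for a classifier like DistilBert, that means logits, which you still have to turn into probabilities yourself. A minimal post-processing sketch, assuming the response exposes the logits as a flat list (the exact field name, e.g. response.output, may vary between versions):

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for a two-class sentiment model.
logits = [-1.2, 1.5]
probs = softmax(logits)
label = ["NEGATIVE", "POSITIVE"][probs.index(max(probs))]
```

Note that this post-processing happens on the client side: only the inference itself runs inside the enclave.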
What you can do with BlindAI
- Easily deploy state-of-the-art models with confidentiality. Run models ranging from BERT for text, to ResNets for images, to WaveNet for audio.
- Provide guarantees to third parties, for instance clients or regulators, that you are indeed providing data protection, through code attestation.
- Explore different scenarios from confidential Speech-to-text, to biometric identification, through secure document analysis with our pool of examples.
What you cannot do with BlindAI
- Our solution aims to be modular, but we have yet to incorporate tools for generic pre/post-processing. Specific pipelines can be covered, but for now they require additional manual work.
- We do not cover training and federated learning yet, but if this feature interests you do not hesitate to show your interest through the roadmap or Discord channel.
- The examples we provide are simple, and do not take into account complex mechanisms such as secure storage of confidential data with sealing keys, an advanced scheduler for inference requests, or complex key-management scenarios. If your use case involves more than what we show, do not hesitate to contact us for more information.
Install
A - Server
Our inference server can easily be deployed through our Docker images. You can pull them from our Docker repository or build them yourself.
B - Client
We advise you to install our client SDK using a virtual environment. You can simply install the client using pip with:
pip install blindai
You can find more details regarding the installation in our documentation here.
Contributing
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.
Please make sure to update tests as appropriate.
Disclaimer
BlindAI is still under development and is provided as is; use it at your own risk.