
Scheduled removal of MinIO Gateway for GCS, Azure, HDFS #14331

Closed

harshavardhana opened this issue Feb 17, 2022 · 50 comments

@harshavardhana
Member

harshavardhana commented Feb 17, 2022

MinIO Gateway will be removed from the MinIO repository by June 1st, 2022:

Community Users

  • Please migrate your MinIO Gateway deployments from Azure, GCS, HDFS to MinIO Distributed Setups

  • MinIO S3 Gateway will be renamed "minio edge" and will only support MinIO backends, extending functionality such as supporting remote credentials locally, as "read-only", for authentication and policy management.

  • Newer MinIO NAS/Single drive setups will move to a single-data, 0-parity mode (re-purposing the erasure-coded backend used for distributed setups, but with 0 parity). This would allow for distributed setup features to be available for single drive deployments as well (see the sketch after this list), such as

    • Versioning
    • ILM
    • Replication and more...
  • Existing setups for NAS/Single drive setups will work as-is; nothing changes.
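
As a rough illustration of that single drive mode once it becomes available (a sketch only; the alias, bucket name, and credentials below are placeholders, and the commands follow existing minio/mc conventions):

# start a single drive server (erasure-coded backend with 0 parity)
minio server /mnt/data

# point mc at it, create a bucket, and turn on versioning
mc alias set mysingle http://localhost:9000 minioadmin minioadmin
mc mb mysingle/mybucket
mc version enable mysingle/mybucket

ILM and replication would then be configured the same way as on distributed deployments, per the list above.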

Paid Users

All existing paid customers will be supported as per their LTS support contracts. If there are bugs, they will be fixed and backported fixes will be provided. No new features will be implemented for the Gateway implementations.

@harshavardhana harshavardhana added this to the ni milestone Feb 17, 2022
@harshavardhana harshavardhana pinned this issue Feb 17, 2022
@harshavardhana harshavardhana self-assigned this Feb 17, 2022
@harshavardhana harshavardhana modified the milestones: ni, Next Release Feb 17, 2022
@harshavardhana harshavardhana changed the title Scheduled removal of MinIO Gateway for GCS, Azure, HDFS and NAS Scheduled removal of MinIO Gateway for GCS, Azure, HDFS Feb 17, 2022
@fungiboletus

Do you have any alternative to recommend for non-paid users who enjoyed MinIO Gateway? A MinIO distributed setup is not quite the same as Azure Blob Storage, for example.

@harshavardhana
Member Author

Do you have any alternative to recommend for non-paid users who enjoyed MinIO Gateway? A MinIO distributed setup is not quite the same as Azure Blob Storage, for example.

You don't have to upgrade @fungiboletus

@TestMsr

TestMsr commented Feb 17, 2022

There used to be a standalone gateway repo, but I can't find it now. Was it deleted?

@fungiboletus

You don't have to upgrade @fungiboletus

I don't want to have unmaintained software in my stacks, though. I have found s3proxy, which could be a replacement, but it's Java-based and lacking some features like caching or encryption.

@multinerd

For bullet point #3, does that mean I can run a single node setup with versioning enabled on a bucket?

@harshavardhana
Member Author

For bullet point #3, does that mean I can run a single node setup with versioning enabled on a bucket?

When it is available. It's not finished yet.

@harshavardhana
Member Author

You don't have to upgrade @fungiboletus

I don't want to have unmaintained software in my stacks, though. I have found s3proxy, which could be a replacement, but it's Java-based and lacking some features like caching or encryption.

Oh yes, you definitely shouldn't. That's why you should migrate to a more supported deployment model.

@alecmerdler

Is the team planning on sharing motivation for this decision?

@harshavardhana
Member Author

Is the team planning on sharing motivation for this decision?

Gateway was designed for migration purposes, to move data to MinIO deployments.

From docs:

MinIO Gateway adds an Amazon S3 compatibility layer to third-party NAS and cloud storage vendors. MinIO Gateway is implemented to facilitate the migration of existing data from your legacy or cloud vendors to MinIO distributed server deployments.

Gateway has been around for years now, and the time given for migration has run its course. It is time to move away and shift our focus to the following:

For Azure and GCS, we support tiering directly from MinIO.
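
Roughly, that tiering path looks like the sketch below (the tier name, bucket, and paths are placeholders, and the exact flags can differ between mc releases, so treat this as indicative rather than authoritative):

# register a GCS bucket as a remote tier on an existing MinIO deployment
mc admin tier add gcs myminio GCSTIER --bucket my-gcs-bucket --credentials-file /path/to/creds.json

# then add a lifecycle rule that transitions objects to that tier, e.g.
# mc ilm add myminio/mybucket --transition-days 90 --transition-tier GCSTIER
# (older mc releases may spell the tier flag differently; check `mc ilm add --help`)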

The HDFS gateway is not needed.

We have migration tools written at https://github.com/minio/hdfs-to-minio, and HDFS also provides a way to copy data out directly to MinIO.
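
One possible direct-copy route, sketched with Hadoop's own distcp and the s3a connector pointed at a MinIO endpoint (the endpoint, credentials, namenode address, and paths below are placeholders):

hadoop distcp \
  -Dfs.s3a.endpoint=http://minio.example.com:9000 \
  -Dfs.s3a.access.key=MINIO_ACCESS_KEY \
  -Dfs.s3a.secret.key=MINIO_SECRET_KEY \
  -Dfs.s3a.path.style.access=true \
  hdfs://namenode:8020/warehouse \
  s3a://target-bucket/warehouse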

What is not going away?

  • S3 Gateway is becoming minio edge for edge deployments with disk caching
  • NAS Gateway is becoming a 0 parity setup with the same erasure-coded style backend used for distributed setups.

@flixr

flixr commented Feb 18, 2022

I understand, and having the possibility of bucket versioning in a single node setup will be great!
However, it's a pity that the Azure gateway will go away. We wanted to use it in an edge deployment with caching (unfortunately it has to be Azure Blob Storage in that project).
Guess we'll have to look for a different solution then.

@supriyo-biswas

This would allow for distributed setup features to be available for single drive deployments as well

Here, are we referring to single filesystem mode, or something else?

@harshavardhana
Member Author

This would allow for distributed setup features to be available for single drive deployments as well

Here, are we referring to single filesystem mode, or something else?

Single drive mode, minio server /drive

@Julius112

Very interesting decision. Many on-premise, enterprise-grade Kubernetes solutions are backed by a highly available storage solution that is based on either NFS or iSCSI. I'm wondering how an HA setup of MinIO would look in such environments.

Until now, I've made use of the MinIO Gateway implementations to have multiple instances exposing the same NFS-based Kubernetes Persistent Volume (PV).

If I understand correctly, a 0 parity distributed setup would still be backed by multiple PVs (at least one per minio instance). If the parity is set to 0 and one of the nodes/instances fails, the data on its PV is not available until a new pod/instance is spawned (even though the storage backend is HA and the data could be available to the other instances).

How would I solve such a situation in the "new MinIO world"? Do I have to use the minio edge implementation?

@dboc

dboc commented Feb 21, 2022

@Julius112 Great Question!

We have the same situation here. MinIO Gateway NAS is the simplest way to provide object storage when you have Kubernetes and hardware storage that provides NFS. Another situation is old systems that output files into a folder, which we then expose with MinIO; maybe here we could use single drive mode, minio server /folder.

I think MinIO Gateway NAS is a simple and great way to start using object storage, and after its removal it will be more difficult for new clients to get started with it.

As an example, we started out using MinIO Gateway NAS, and MinIO has now become so important in our solutions that we are in the process of acquiring paid MinIO support.

@klauspost
Contributor

@Julius112 Please read the original post. edge is for s3 backends.

Note this:

Existing setups for NAS/Single drive setups will work as-is nothing changes.

Also...

Newer MinIO NAS/Single...

It sounds like you are looking for a distributed setup, but refusing to do a distributed setup.

@dboc Feel free to write sales@minio.io and describe your case.

@dboc

dboc commented Feb 21, 2022

@klauspost Our sales team has been in contact with them; independently, we still have interest.

Yes, we are now looking for a distributed setup.
The point is: MinIO Gateway NAS is simpler than a distributed setup, and this encouraged us to use it in our production environment.
Now that MinIO has become important, we decided to take a step further.

This is why I think it would be good for MinIO Gateway NAS to continue to exist.

@Julius112

Julius112 commented Feb 21, 2022

@klauspost Sorry, I misread the info about minio edge.

Please don't get me wrong - I'm not trying to fight the decision to discontinue development of the NAS Gateway. I'm just trying to figure out how the newly available components will fit into the scenario I described above. I have a few clients that are using an external, highly available, redundant storage solution (e.g. NetApp) as their storage backend for Kubernetes.

I'm looking for a highly available setup, which, as far as I understand, can only be achieved by using MinIO in distributed mode. That leads to:
a) data being redundant (parity != 0), or
b) data being exclusively available to one instance (parity = 0).
Option a) creates unnecessary redundancy, as redundancy is already handled by the storage backend. And option b) is not highly available during eviction/node maintenance events.

Maybe I'm getting this whole thing wrong, and then I'm happy to be corrected. However, the NAS gateway seems to me to be the only solution in this scenario.

@fungiboletus

My use case for MinIO Gateway is simpler: I run stuff in various cloud providers. Azure doesn't have an object storage with an S3 API, while everyone else does. I could adapt my code, but a lot of software is compatible with S3 and not Azure, for good reasons. So I use MinIO Gateway in front of their object storage, with good success so far. I don't have the capabilities, the resources, nor the motivation to install a distributed MinIO setup that is as good as a cloud object storage.

I understand that my use case doesn't correspond to what MinIO wanted the gateway to be.

@klauspost
Contributor

@fungiboletus In that case, just keep your setup as-is.

@Julius112 I don't see where Harsha states that the new setup would disallow accessing the same data from multiple servers, though I can't really see much benefit in it. One MinIO server should be able to handle everything your storage can deliver just fine. If you are using it only for failover, you won't need concurrent access anyway.

Distributed mode is truly scalable. What you are proposing is not.

duanhongyi added a commit to drycc/imagebuilder that referenced this issue Apr 25, 2022
harshavardhana pushed a commit that referenced this issue Apr 26, 2022
newer MinIO server removes "gcs" gateway support as per #14331
@harshavardhana
Member Author

I have a use case for caching S3 edge proxy with non-MinIO backends (as do many other people who have posted here).
Could this functionality be kept somehow?
Perhaps as a paid option? I'd be happier paying a license fee to you guys (the experts) than I would be paying inexperienced developers to maintain a fork.

Unfortunately no, that's not a priority for us - we have given it enough thought. This has been explained very well in https://blog.min.io/deprecation-of-the-minio-gateway/

You can keep using the older releases of MinIO; we haven't added anything new to the gateway for close to 2 years now anyway.
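
For example, a pinned pre-removal release can still run the gateway. A sketch (the image tag is a pre-removal release referenced in this thread; the credentials and cache path are placeholders, and the gateway subcommand and cache variable follow the pre-removal docs):

docker run -p 9000:9000 \
  -v /mnt/cache:/mnt/cache \
  -e MINIO_ROOT_USER=myaccesskey \
  -e MINIO_ROOT_PASSWORD=mysecretkey \
  -e MINIO_CACHE_DRIVES=/mnt/cache \
  minio/minio:RELEASE.2022-04-16T04-26-02Z.fips \
  gateway s3 https://s3.amazonaws.com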

@meirsegev

Hi,

Will I be able to use older MinIO Docker image versions which support the GCS gateway?
I mean, for example, using "docker pull minio/minio:RELEASE.2022-04-16T04-26-02Z.fips".
Or will those images be deleted as well?

Thanks !

@harshavardhana
Member Author

@meirsegev please read the issue description - this has been mentioned multiple times already.

@CosmicToast

Is there a recommended migration path for moving existing single drive setups to the new erasure-coded 0-parity mode? It's possible to spin up a second instance, mirror everything over (though that would take 2x the space temporarily), and move over all policies etc., but it would be nice to have a direct upgrade path.

@mw-0

mw-0 commented May 16, 2022

I don't see a comment about when versioning and replication will work on single drive setups, as the docs say they need a distributed setup. Is there any information on this now that the gateway has been removed?

@harshavardhana
Member Author

I don't see a comment about when versioning and replication will work on single drive setups, as the docs say they need a distributed setup. Is there any information on this now that the gateway has been removed?

Watch out for when this issue is closed; that's when it's "available".

@kudrew

kudrew commented May 20, 2022

Users of the GCS gateway functionality can look into using the GCS XML API.

For my use cases it was just a config change and the provisioning of a set of HMAC credentials for my application to switch from the MinIO GCS Gateway straight to the GCS XML API.

My application points to storage.googleapis.com on port 443 and uses the GCP service account's access key and secret instead of MinIO's access key and secret key.

https://cloud.google.com/storage/docs/xml-api/overview
https://cloud.google.com/storage/docs/authentication/hmackeys
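
Concretely, that switch can be as small as re-pointing the S3 client; for example with mc (a sketch; the HMAC key pair shown is a placeholder generated for a GCP service account):

# alias the GCS XML API endpoint like any other S3-compatible endpoint
mc alias set gcsxml https://storage.googleapis.com GOOG1EXAMPLEHMACKEY examplehmacsecret
mc ls gcsxml/my-bucket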

@harshavardhana
Member Author

harshavardhana commented May 20, 2022

Here is a list of APIs that GCS implements poorly or outright wrong; moving to GCS while using the S3 API has its limitations.

API                   AWS S3   MinIO   GCS (S3 compatibility)
ListObjectVersions    ✔️       ✔️      ✗
GetBucketEncryption   ✔️       ✔️      ✗
PutBucketLifecycle    ✔️       ✔️      ✗
SelectObjectContent   ✔️       ✔️      ✗
MultiObjectDelete     ✔️       ✔️      ✗
PutObjectTags         ✔️       ✔️      ✗

For each of these, GCS fails as shown below, while the same operation succeeds on AWS S3 and MinIO.

ListObjectVersions:
~ mc ls --versions gcs/harshavardhana
mc: <ERROR> Unable to list folder. unrecognized option:Marker

GetBucketEncryption:
~ mc encrypt info gcs/harshavardhana
mc: <ERROR> Unable to get encryption info: 503 Service Unavailable.

PutBucketLifecycle:
~ mc ilm add gcs/harshavardhana --expiry-days 365
mc: <ERROR> Unable to set new lifecycle rules. The XML you provided was not well-formed or did not validate against our published schema.

SelectObjectContent:
~ mc sql gcs/harshavardhana/hosts.csv
mc: <ERROR> Unable to run sql Invalid argument.

MultiObjectDelete:
~ mc rm gcs/harshavardhana/hosts
Removing `gcs/harshavardhana/hosts`.
mc: <ERROR> Failed to remove `gcs/harshavardhana/hosts`. Invalid argument.

PutObjectTags:
~ mc tag set gcs/harshavardhana/hosts "key1=value1&key2=value2&key3=value3"
mc: <ERROR> Failed to set tags for https://storage.googleapis.com/harshavardhana/hosts: 503 Service Unavailable.

Summary

GCS's AWS S3 API compatibility is quite limited, since a few basic APIs do not work well. Other APIs that are not mentioned here are not part of the GCS S3 compatibility layer implementation at all.

@harshavardhana
Member Author

Single drive mode is now fully on the XL backend format; pre-existing data is no longer supported with single drive mode.

Legacy FS mode continues to work for folks who have existing content, but no new deployments are allowed.
No new NAS gateway deployments are allowed either; existing setups continue to work.

This issue shall be closed now. We haven't moved MinIO Gateway S3 to minio edge yet; this will be finished subsequently, which will bring a minio edge that speaks only to MinIO servers.

@mindrunner

I am confused, sorry. We are using MinIO as a gateway to Azure. How are we supposed to migrate?

@Cave-Johnson

For anyone stuck with the removal of the gateway, I'd recommend looking at SeaweedFS in Cloud Drive mode. It can even read the contents of the current bucket and store the metadata locally. It was a very easy migration for me.

@astrolox

astrolox commented Jun 2, 2022

I am confused, sorry. We are using MinIO as a gateway to Azure. How are we supposed to migrate?

I believe that they want you to set up a new MinIO cluster, which could be running on Azure or multiple clouds, etc.
After it is running, you then have to copy your data out of your old MinIO gateway setup into the new setup. I presume they expect you to do that by writing a script which uses the S3 API.
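
One way to do that copy step without writing a custom script is mc mirror (a sketch; the aliases, endpoints, credentials, and bucket names are placeholders):

# old gateway instance and new MinIO deployment
mc alias set oldgw http://old-gateway:9000 OLD_ACCESS_KEY OLD_SECRET_KEY
mc alias set newminio http://new-minio:9000 NEW_ACCESS_KEY NEW_SECRET_KEY

# copy every object from a bucket behind the gateway into the new deployment
mc mirror oldgw/mybucket newminio/mybucket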


I'm considering forking the old minio gateway and maintaining it. However, I'm not sure if my company has enough time to dedicate to maintenance without community support.

@mindrunner

mindrunner commented Jun 2, 2022

For anyone stuck with the removal of the gateway, I'd recommend looking at SeaweedFS in Cloud Drive mode. It can even read the contents of the current bucket and store the metadata locally. It was a very easy migration for me.

@Cave-Johnson Do you use that with Azure File Storage backend? Is Seaweed fully compatible with s3 client libraries?

@mindrunner

I'm considering forking the old minio gateway and maintaining it. However, I'm not sure if my company has enough time to dedicate to maintenance without community support.

@astrolox I was thinking the same, but we definitely do not have the capacity :(

@Cave-Johnson

Cave-Johnson commented Jun 2, 2022

For anyone stuck with the removal of the gateway, I'd recommend looking at SeaweedFS in Cloud Drive mode. It can even read the contents of the current bucket and store the metadata locally. It was a very easy migration for me.

@Cave-Johnson Do you use that with Azure File Storage backend? Is Seaweed fully compatible with s3 client libraries?

I wasn't using the Azure backend, but it's fully compatible with the S3 client libraries (I have used both the minio library and boto3 to speak to it).

https://github.com/chrislusf/seaweedfs/wiki/Cloud-Drive-Architecture

@harshavardhana
Member Author

I am confused, sorry. We are using MinIO as a gateway to Azure. How are we supposed to migrate?

You should migrate to https://min.io/product/multicloud-azure-kubernetes-service

@harshavardhana
Member Author

For anyone stuck with the removal of the gateway, I'd recommend looking at SeaweedFS in Cloud Drive mode. It can even read the contents of the current bucket and store the metadata locally. It was a very easy migration for me.

@Cave-Johnson Do you use that with Azure File Storage backend? Is Seaweed fully compatible with s3 client libraries?

Seaweed uses MinIO's authN layers here and there - it looks like they have borrowed some security issues as well and never bothered to fix them.

https://github.com/chrislusf/seaweedfs/blob/master/weed/s3api/auth_signature_v4.go#L3

https://github.com/chrislusf/seaweedfs/blob/master/weed/s3api/auth_signature_v2.go#L3

https://github.com/chrislusf/seaweedfs/blob/master/weed/s3api/policy/post-policy.go#L4

https://github.com/chrislusf/seaweedfs/blob/master/weed/s3api/chunked_reader_v4.go#L6

@Cave-Johnson @mindrunner

@minio minio locked as resolved and limited conversation to collaborators Jun 2, 2022