[Serving] Image support in JSONFFIEngine #2208

Merged: 1 commit, May 6, 2024

Conversation

anibohara2000 (Contributor)

This PR adds support for image input in JSONFFIEngine, so that the LLaVa model can now be used.

The image has to be passed base64-encoded. See tests/python/json_ffi/test_json_ffi_engine_image.py for example usage.

For reading the image in C++, the header-only file stb_image.h from the nothings/stb repository is used. This repository has been added as a submodule in 3rdparty/.

The image has to be pre-processed before being passed into the LLaVa model (which uses the CLIP vision model). Currently, bilinear interpolation is used for resizing, instead of the bicubic interpolation that Hugging Face's CLIPImageProcessor uses by default.
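The bilinear resize mentioned above can be sketched as follows. This is an illustrative, self-contained version assuming an interleaved RGB uint8 source image; the function name and layout are mine, not the PR's actual preprocessing code.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Bilinear resize of an interleaved RGB uint8 image to float output.
// Illustrative sketch; not the PR's actual ClipPreprocessor implementation.
std::vector<float> BilinearResize(const std::vector<uint8_t>& src, int src_w, int src_h,
                                  int dst_w, int dst_h) {
  std::vector<float> dst(static_cast<size_t>(dst_w) * dst_h * 3);
  for (int y = 0; y < dst_h; ++y) {
    for (int x = 0; x < dst_w; ++x) {
      // Map the destination pixel center back into source coordinates.
      float sx = (x + 0.5f) * src_w / dst_w - 0.5f;
      float sy = (y + 0.5f) * src_h / dst_h - 0.5f;
      int x0 = std::max(0, static_cast<int>(std::floor(sx)));
      int y0 = std::max(0, static_cast<int>(std::floor(sy)));
      int x1 = std::min(src_w - 1, x0 + 1);
      int y1 = std::min(src_h - 1, y0 + 1);
      float fx = std::min(std::max(sx - x0, 0.0f), 1.0f);
      float fy = std::min(std::max(sy - y0, 0.0f), 1.0f);
      for (int c = 0; c < 3; ++c) {
        // Interpolate horizontally on the two rows, then vertically.
        float top = src[(y0 * src_w + x0) * 3 + c] * (1 - fx) +
                    src[(y0 * src_w + x1) * 3 + c] * fx;
        float bot = src[(y1 * src_w + x0) * 3 + c] * (1 - fx) +
                    src[(y1 * src_w + x1) * 3 + c] * fx;
        dst[(y * dst_w + x) * 3 + c] = top * (1 - fy) + bot * fy;
      }
    }
  }
  return dst;
}
```

Bicubic interpolation (CLIPImageProcessor's default) samples a 4x4 neighborhood instead of 2x2, which is why the two can produce slightly different pixel values.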


tqchen commented Apr 24, 2024

@anibohara2000 please fix the Windows build error.


} // namespace json_ffi
} // namespace llm
} // namespace mlc

Need a newline at the end of the file.

// Center crop
const int crop_x = (new_width - target_size) / 2;
const int crop_y = (new_height - target_size) / 2;
float* cropped_image_data = new float[target_size * target_size * 3]();

Avoid using new; instead use std::vector<float> cropped_image_data(size).

The storage gets released automatically when it goes out of scope.

MemoryBufferStream stream(base64_str.c_str(), base64_str.size());
tvm::support::Base64InStream base64_stream(&stream);
size_t decoded_size = Base64DecodedSize(base64_str);
unsigned char* decoded = new unsigned char[decoded_size]();

Prefer std::vector/std::string over new.
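For illustration, here is a minimal standalone base64 decoder that uses managed memory throughout. This is not TVM's Base64InStream (which the PR actually uses); it is a sketch of the std::vector-based shape the reviewer is asking for.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Decode a base64 string into bytes using std::vector instead of
// `new unsigned char[...]`; the buffer is freed automatically.
std::vector<unsigned char> Base64Decode(const std::string& in) {
  static const std::string tbl =
      "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
  std::vector<unsigned char> out;
  int val = 0;
  int bits = -8;  // bits of pending data accumulated in `val`
  for (char ch : in) {
    if (ch == '=') break;               // padding: no more data
    size_t pos = tbl.find(ch);
    if (pos == std::string::npos) continue;  // skip whitespace/newlines
    val = (val << 6) | static_cast<int>(pos);
    bits += 6;
    if (bits >= 0) {
      out.push_back(static_cast<unsigned char>((val >> bits) & 0xFF));
      bits -= 8;
    }
  }
  return out;
}
```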

}

picojson::object config = json::LoadJSONFromString(model_config_str, err).value();
if (config.find("model_config") == config.end()) {

It seems we are lazily parsing this config on every request. I think it is better to populate it once during reload.
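The "populate once during reload" pattern can be sketched as below. ParseImageSize stands in for the real picojson-based parsing; all names here are illustrative, not MLC-LLM's actual classes.

```cpp
#include <cassert>
#include <string>

// Sketch: parse the model config string exactly once, when the engine
// (re)loads, and let per-request code read the cached result.
class EngineConfigSketch {
 public:
  void Reload(const std::string& model_config_str) {
    // Parse the JSON once here...
    image_size_ = ParseImageSize(model_config_str);
  }
  // ...so per-request calls just read the cached value, no parsing.
  int ImageSizeForRequest() const { return image_size_; }

  int parse_calls = 0;  // exposed only to demonstrate the single parse

 private:
  int ParseImageSize(const std::string& config) {
    ++parse_calls;
    return static_cast<int>(config.size());  // placeholder for real parsing
  }
  int image_size_ = 0;
};
```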

const int new_width = width < height ? new_short_side : new_long_side;
const int new_height = width > height ? new_short_side : new_long_side;

float* processed_image_data = new float[new_width * new_height * 3]();

Prefer managed memory (std::vector).


using namespace tvm::runtime;

unsigned char* LoadImageFromBase64(std::string base64_str, int* width, int* height,

We generally prefer managed memory. We can consider returning an NDArray (on CPU) with {height, width} dimensions and uint8 dtype.



class JSONFFIEngine:
def __init__( # pylint: disable=too-many-arguments,too-many-locals

Let us consolidate the implementation of JSONFFIEngine into a single place:

mlc_llm/python/json_ffi/engine.py

Then import it from there, so we don't have to duplicate things between the json_ffi_engine_image and json_ffi test cases.


tqchen commented Apr 27, 2024

#2241 should help; it lifts the common JSONFFI code out into a package namespace.


std::optional<NDArray> LoadImageFromBase64(std::string base64_str, std::string* err);

NDArray ClipPreprocessor(NDArray image_data, int target_size, DLDevice device, std::string* err);

Minor nit: document all public APIs.


using namespace tvm::runtime;

std::optional<NDArray> LoadImageFromBase64(std::string base64_str, std::string* err);

Use const std::string& base64_str here to avoid copying the string.

namespace llm {
namespace json_ffi {

using namespace tvm::runtime;

Avoid using namespace in a header file; instead, write tvm::runtime::NDArray in the argument types.

*err += "vision_config should be an object";
return std::nullopt;
}
picojson::object vision_config = model_config["vision_config"].get<picojson::object>();

Create VisionConfig and ModelConfig as explicit in-memory structures (with ModelConfig having an optional VisionConfig field), so we don't have to do this parsing on every call.
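The explicit in-memory structures the reviewer asks for might look like this sketch. The field names and the 336 default are illustrative guesses, not the final MLC-LLM definitions; a FromJSON helper would populate them once at reload time.

```cpp
#include <cassert>
#include <optional>

// Explicit config structs, parsed from JSON once instead of per call.
struct VisionConfig {
  int image_size = 336;  // illustrative default (CLIP-style input resolution)
};

struct ModelConfig {
  // Text-only models simply leave this empty.
  std::optional<VisionConfig> vision_config;
};
```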

*err += "model_config should be an object";
return std::nullopt;
}
picojson::object model_config = config["model_config"].get<picojson::object>();

Use an explicit object. In this case, ideally we should write const picojson::object& model_config; otherwise it will trigger a deep copy of the object. Of course, moving to an explicit in-memory structure will avoid the issue entirely. Let us do that.
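The deep-copy pitfall can be demonstrated without picojson. In the sketch below, a copy-counting value type stands in for picojson::object inside a std::map; binding a map element to a value copies it, while a const reference does not.

```cpp
#include <cassert>
#include <map>
#include <string>

// Value type that counts how many times it is copied, standing in for a
// nested picojson::object stored inside a JSON map.
struct Counted {
  Counted() = default;
  Counted(const Counted&) { ++copies; }
  Counted& operator=(const Counted&) { ++copies; return *this; }
  static int copies;
};
int Counted::copies = 0;
```

Usage: `Counted by_value = config.at("model_config");` bumps the copy count (a deep copy for a real JSON object), while `const Counted& by_ref = config.at("model_config");` does not.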


tqchen commented May 3, 2024

We just had a major refactor land; unfortunately there is a bit of conflict. Please rebase.

@tqchen tqchen merged commit 5ae393a into mlc-ai:main May 6, 2024
2 checks passed

tqchen commented May 6, 2024

Thank you @anibohara2000 !
