Minigpt4 Inference on CPU

heyitsguay · on July 20, 2023

I know it's not the main point of this, but... so many multimodal models now that take frozen vision encoders and language decoders and weld them together with a projection layer! I wanna grab the EVA02-CLIP-E image encoder and the Llama-2 33B model and do the same, I bet that'd be fun :D

famouswaffles · on July 20, 2023

Qformer isn't necessary just to be clear. Llava is just a projection layer

GaggiX · on July 20, 2023

Not just a projection layer but also Q-former, in this case it was already trained for that specific vision encoder but if you change it you would need to train a Q-former from scratch.

famouswaffles · on July 20, 2023

Not for mini gpt-4 but it's just a projection layer for many others(like Llava). The Qformer isn't a necessary part of the equation.

pizzafeelsright · on July 20, 2023

I am not an ML expert. I want to know how to add my own documents without sending them off to a 3rd party.

extasia · on July 20, 2023

Use a local LLM like LLaMa 2.

quickthrower2 · on July 20, 2023

Not heard of minigpt4. Why that name? Is it claiming to be specifically a gpt4 competitor?

sanxiyn · on July 20, 2023

MiniGPT-4 is a multimodal model. The name (a bad one, IMO) is a reference to GPT-4's multimodal capability.

Zondartul · on July 20, 2023

They should rename it to ManyGPT.

WantonQuantum · on July 20, 2023

There's more info here: https://github.com/Vision-CAIR/MiniGPT-4 (Linked in the Readme of the repo.)

Der_Einzige · on July 20, 2023

Any data on inference speed? I’ve found that the non quantized model was much faster on GPU than the quantized versions due to lower GPU utilization.

api · on July 20, 2023

It's a RAM tradeoff. If you have enough GPU RAM to load the non-quantized model it may be faster.