It's also important to understand that using local models generally means working with a smaller context window, that is, a reduced ability to handle large chunks of text in one go, ...
Note: You may need 80 GB of GPU memory to run this script with deepseek-vl2-small, and even more for deepseek-vl2.
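To see where a figure like 80 GB comes from, you can estimate the memory taken by model weights alone from the parameter count and the bytes per parameter. The sketch below is a rough lower bound only; the parameter counts are assumptions (check the model cards), and real usage is higher because of the KV cache, activations, and the vision encoder:

```python
def estimate_weight_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Rough lower bound for model weight memory in GiB.

    bytes_per_param=2 assumes bf16/fp16 weights; this excludes the
    KV cache, activations, and framework overhead, which add on top.
    """
    return num_params * bytes_per_param / 1024**3

# Parameter counts below are illustrative assumptions, not official figures.
for name, params in [("deepseek-vl2-small", 16e9), ("deepseek-vl2", 27e9)]:
    gb = estimate_weight_memory_gb(params)
    print(f"{name}: ~{gb:.0f} GiB for bf16 weights alone")
```

Even a ~30 GiB weight footprint can balloon well past that once inference overhead is included, which is why the practical requirement can reach 80 GB.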